Allows the user to easily read/write a SAM/BAM file.
#include <SamFile.h>


Public Types | |
| enum | OpenType { READ, WRITE } |
Enum for indicating whether to open the file for read or write. | |
| enum | SortedType { UNSORTED = 0, FLAG, COORDINATE, QUERY_NAME } |
Enum for indicating the type of sort for the file. | |
Public Member Functions | |
| SamFile () | |
| Default Constructor. | |
| SamFile (ErrorHandler::HandlingType errorHandlingType) | |
| Constructor that sets the error handling type. | |
| SamFile (const char *filename, OpenType mode) | |
| Constructor that opens the specified file based on the specified mode (READ/WRITE). | |
| SamFile (const char *filename, OpenType mode, ErrorHandler::HandlingType errorHandlingType) | |
| Constructor that opens the specified file based on the specified mode (READ/WRITE) and handles errors per the specified errorHandlingType. | |
| bool | OpenForRead (const char *filename) |
| Open a sam/bam file for reading with the specified filename. | |
| bool | OpenForWrite (const char *filename) |
| Open a sam/bam file for writing with the specified filename. | |
| bool | ReadBamIndex (const char *filename) |
| Reads the specified bam index file. | |
| void | SetReference (GenomeSequence *reference) |
| Sets the reference to the specified genome sequence object. | |
| void | SetReadSequenceTranslation (SamRecord::SequenceTranslation translation) |
| Set the type of sequence translation to use when reading the sequence. | |
| void | SetWriteSequenceTranslation (SamRecord::SequenceTranslation translation) |
| Set the type of sequence translation to use when writing the sequence. | |
| void | Close () |
| Close the file if there is one open. | |
| bool | IsEOF () |
| Returns whether or not the end of the file has been reached. | |
| bool | ReadHeader (SamFileHeader &header) |
| Reads the header section from the file and stores it in the passed in header. | |
| bool | WriteHeader (SamFileHeader &header) |
| Writes the specified header into the file. | |
| bool | ReadRecord (SamFileHeader &header, SamRecord &record) |
| Reads the next record from the file & stores it in the passed in record. | |
| bool | WriteRecord (SamFileHeader &header, SamRecord &record) |
| Writes the specified record into the file. | |
| void | setSortedValidation (SortedType sortType) |
| Set the flag to validate that the file is sorted as it is read/written. | |
| uint32_t | GetCurrentRecordCount () |
| Return the number of records that have been read/written so far. | |
| SamStatus::Status | GetFailure () |
| Get the Status of the last call that sets status. | |
| SamStatus::Status | GetStatus () |
| Get the Status of the last call that sets status. | |
| const char * | GetStatusMessage () |
| Get the status message of the last call that sets status. | |
| bool | SetReadSection (int32_t refID) |
| Sets what part of the BAM file should be read. | |
| bool | SetReadSection (const char *refName) |
| Sets what part of the BAM file should be read. | |
| bool | SetReadSection (int32_t refID, int32_t start, int32_t end) |
| Sets what part of the BAM file should be read. | |
| bool | SetReadSection (const char *refName, int32_t start, int32_t end) |
| Sets what part of the BAM file should be read. | |
| uint32_t | GetNumOverlaps (SamRecord &samRecord) |
| Returns the number of bases in the passed in read that overlap the region that is currently set. | |
| void | GenerateStatistics (bool genStats) |
| Whether or not statistics should be generated for this file. | |
| void | PrintStatistics () |
Protected Member Functions | |
| void | resetFile () |
| Resets the file prepping for a new file. | |
| bool | validateSortOrder (SamRecord &record, SamFileHeader &header) |
| Validate that the record is sorted compared to the previously read record if there is one, according to the specified sort order. | |
| SortedType | getSortOrderFromHeader (SamFileHeader &header) |
| bool | readIndexedRecord (SamFileHeader &header, SamRecord &record) |
| Overrides ReadRecord to read only from the specified reference. | |
| bool | processNewSection (SamFileHeader &header) |
Protected Attributes | |
| IFILE | myFilePtr |
| GenericSamInterface * | myInterfacePtr |
| bool | myIsOpenForRead |
| Flag to indicate if a file is open for reading. | |
| bool | myIsOpenForWrite |
| Flag to indicate if a file is open for writing. | |
| bool | myHasHeader |
| Flag to indicate if a header has been read/written - required before being able to read/write a record. | |
| SortedType | mySortedType |
| int32_t | myPrevCoord |
| Previous values used for checking if the file is sorted. | |
| int32_t | myPrevRefID |
| std::string | myPrevReadName |
| uint32_t | myRecordCount |
| Keep a count of the number of records that have been read/written so far. | |
| SamStatistics * | myStatistics |
| Pointer to the statistics for this file. | |
| SamStatus | myStatus |
| The status of the last SamFile command. | |
| bool | myIsBamOpenForRead |
| Values for reading Sorted BAM files via the index. | |
| bool | myNewSection |
| int32_t | myRefID |
| int32_t | myStartPos |
| int32_t | myEndPos |
| uint64_t | myCurrentChunkEnd |
| SortedChunkList | myChunksToRead |
| BamIndex * | myBamIndex |
| GenomeSequence * | myRefPtr |
| SamRecord::SequenceTranslation | myReadTranslation |
| SamRecord::SequenceTranslation | myWriteTranslation |
| std::string | myRefName |
Allows the user to easily read/write a SAM/BAM file.
Definition at line 30 of file SamFile.h.
| enum SamFile::OpenType |
| enum SamFile::SortedType |
Enum for indicating the type of sort for the file.
| UNSORTED |
file is not sorted. |
| FLAG |
SO flag from the header indicates the sort type. |
| COORDINATE |
file is sorted by coordinate. |
| QUERY_NAME |
file is sorted by queryname. |
Definition at line 41 of file SamFile.h.
00041 {
00042     UNSORTED = 0, ///< file is not sorted.
00043     FLAG,         ///< SO flag from the header indicates the sort type.
00044     COORDINATE,   ///< file is sorted by coordinate.
00045     QUERY_NAME    ///< file is sorted by queryname.
00046 };
| SamFile::SamFile | ( | ErrorHandler::HandlingType | errorHandlingType | ) |
Constructor that sets the error handling type.
| errorHandlingType | how to handle errors. |
Definition at line 40 of file SamFile.cpp.
References resetFile().
00041 : myFilePtr(NULL), 00042 myInterfacePtr(NULL), 00043 myStatistics(NULL), 00044 myStatus(errorHandlingType), 00045 myBamIndex(NULL), 00046 myRefPtr(NULL), 00047 myReadTranslation(SamRecord::NONE), 00048 myWriteTranslation(SamRecord::NONE) 00049 { 00050 resetFile(); 00051 }
| SamFile::SamFile | ( | const char * | filename, | |
| OpenType | mode | |||
| ) |
Constructor that opens the specified file based on the specified mode (READ/WRITE).
| filename | name of the file to open. | |
| mode | mode to use for opening the file. |
Definition at line 56 of file SamFile.cpp.
References GetStatusMessage(), OpenForRead(), OpenForWrite(), READ, and resetFile().
00057 : myFilePtr(NULL), 00058 myInterfacePtr(NULL), 00059 myStatistics(NULL), 00060 myStatus(), 00061 myBamIndex(NULL), 00062 myRefPtr(NULL), 00063 myReadTranslation(SamRecord::NONE), 00064 myWriteTranslation(SamRecord::NONE) 00065 { 00066 resetFile(); 00067 00068 bool openStatus = true; 00069 if(mode == READ) 00070 { 00071 // open the file for read. 00072 openStatus = OpenForRead(filename); 00073 } 00074 else 00075 { 00076 // open the file for write. 00077 openStatus = OpenForWrite(filename); 00078 } 00079 if(!openStatus) 00080 { 00081 // Failed to open the file - print error and abort. 00082 fprintf(stderr, "%s\n", GetStatusMessage()); 00083 std::cerr << "FAILURE - EXITING!!!" << std::endl; 00084 exit(-1); 00085 } 00086 }
| SamFile::SamFile | ( | const char * | filename, | |
| OpenType | mode, | |||
| ErrorHandler::HandlingType | errorHandlingType | |||
| ) |
Constructor that opens the specified file based on the specified mode (READ/WRITE) and handles errors per the specified errorHandlingType.
| filename | name of the file to open. | |
| mode | mode to use for opening the file. | |
| errorHandlingType | how to handle errors. |
Definition at line 91 of file SamFile.cpp.
References GetStatusMessage(), OpenForRead(), OpenForWrite(), READ, and resetFile().
00093 : myFilePtr(NULL), 00094 myInterfacePtr(NULL), 00095 myStatistics(NULL), 00096 myStatus(errorHandlingType), 00097 myBamIndex(NULL), 00098 myRefPtr(NULL), 00099 myReadTranslation(SamRecord::NONE), 00100 myWriteTranslation(SamRecord::NONE) 00101 { 00102 resetFile(); 00103 00104 bool openStatus = true; 00105 if(mode == READ) 00106 { 00107 // open the file for read. 00108 openStatus = OpenForRead(filename); 00109 } 00110 else 00111 { 00112 // open the file for write. 00113 openStatus = OpenForWrite(filename); 00114 } 00115 if(!openStatus) 00116 { 00117 // Failed to open the file - print error and abort. 00118 fprintf(stderr, "%s\n", GetStatusMessage()); 00119 std::cerr << "FAILURE - EXITING!!!" << std::endl; 00120 exit(-1); 00121 } 00122 }
| void SamFile::GenerateStatistics | ( | bool | genStats | ) |
Whether or not statistics should be generated for this file.
The value is carried over between files and is not reset, but the statistics themselves are reset between files.
| genStats | set to true if statistics should be generated, false if not. |
Definition at line 674 of file SamFile.cpp.
References myStatistics.
00675 { 00676 if(genStats) 00677 { 00678 if(myStatistics == NULL) 00679 { 00680 // Want to generate statistics, but do not yet have the 00681 // structure for them, so create one. 00682 myStatistics = new SamStatistics(); 00683 } 00684 } 00685 else 00686 { 00687 // Do not generate statistics, so if myStatistics is not NULL, 00688 // delete it. 00689 if(myStatistics != NULL) 00690 { 00691 delete myStatistics; 00692 myStatistics = NULL; 00693 } 00694 } 00695 00696 }
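The toggle-and-carry-over behavior described above can be sketched in isolation. This is a minimal stand-in, not the SamFile implementation: `Stats`, `StatsOwner`, `resetForNewFile()` and `recordRead()` are hypothetical names, and the way the real class clears its counters between files is an assumption (the listing only shows allocation and deletion).

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for SamStatistics: only a record counter.
struct Stats
{
    uint64_t recordCount = 0;
};

// Hypothetical owner that mirrors the on/off logic of GenerateStatistics().
class StatsOwner
{
public:
    // Allocate the statistics object on first enable; free it on disable.
    void generateStatistics(bool genStats)
    {
        if (genStats)
        {
            if (!myStats)
            {
                myStats.reset(new Stats());
            }
        }
        else
        {
            myStats.reset();
        }
    }

    // Assumed behavior: opening a new file keeps the enabled/disabled
    // setting but clears the accumulated values.
    void resetForNewFile()
    {
        if (myStats)
        {
            *myStats = Stats();
        }
    }

    // Update the statistics only when they are enabled.
    void recordRead()
    {
        if (myStats)
        {
            ++myStats->recordCount;
        }
    }

    bool enabled() const { return static_cast<bool>(myStats); }
    uint64_t count() const { return myStats ? myStats->recordCount : 0; }

private:
    std::unique_ptr<Stats> myStats;
};
```

Enabling twice in a row keeps the existing counters, which matches the `myStatistics == NULL` guard in the listing.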
| SamStatus::Status SamFile::GetFailure | ( | ) | [inline] |
Get the Status of the last call that sets status.
Retained for backward compatibility; will be removed later. Use GetStatus() instead.
Definition at line 138 of file SamFile.h.
References GetStatus().
00139 {
00140     return(GetStatus());
00141 }
| uint32_t SamFile::GetNumOverlaps | ( | SamRecord & | samRecord | ) |
Returns the number of bases in the passed in read that overlap the region that is currently set.
| samRecord | to check for overlapping bases. |
Definition at line 660 of file SamFile.cpp.
References SamRecord::getNumOverlaps(), SamRecord::setReference(), and SamRecord::setSequenceTranslation().
00661 { 00662 if(myRefPtr != NULL) 00663 { 00664 samRecord.setReference(myRefPtr); 00665 } 00666 samRecord.setSequenceTranslation(myReadTranslation); 00667 00668 // Get the overlaps in the sam record for the region currently set 00669 // for this file. 00670 return(samRecord.getNumOverlaps(myStartPos, myEndPos)); 00671 }
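To illustrate what counting overlapping bases against a 0-based, start-inclusive, end-exclusive region means, here is a simplified sketch. It is not SamRecord::getNumOverlaps(): the function name `countOverlap` is hypothetical, the read is assumed to align without gaps (the real method would need to account for the CIGAR), and treating -1 as "no bound" is an assumption based on the -1 defaults passed by the SetReadSection overloads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of bases of an ungapped read that fall in
// the half-open region [regionStart, regionEnd). All positions are 0-based.
// Assumption: -1 for either bound means that side is unbounded.
uint32_t countOverlap(int32_t readStart, int32_t readLength,
                      int32_t regionStart, int32_t regionEnd)
{
    assert(readLength >= 0);

    // Exclusive end of the read on the reference.
    const int32_t readEnd = readStart + readLength;

    // Clip the read to the region on each side that has a bound.
    const int32_t lo =
        (regionStart == -1) ? readStart : std::max(readStart, regionStart);
    const int32_t hi =
        (regionEnd == -1) ? readEnd : std::min(readEnd, regionEnd);

    return (hi > lo) ? static_cast<uint32_t>(hi - lo) : 0u;
}
```

Because the end is exclusive, a read that starts exactly at `regionEnd` contributes no bases.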
| bool SamFile::IsEOF | ( | ) |
Returns whether or not the end of the file has been reached.
Definition at line 356 of file SamFile.cpp.
| bool SamFile::OpenForRead | ( | const char * | filename | ) |
Open a sam/bam file for reading with the specified filename.
| filename | the sam/bam file to open for reading. |
Definition at line 136 of file SamFile.cpp.
References myIsBamOpenForRead, myIsOpenForRead, myStatus, and resetFile().
Referenced by SamFile(), and SamFileReader::SamFileReader().
00137 { 00138 // Reset for any previously operated on files. 00139 resetFile(); 00140 00141 int lastchar = 0; 00142 00143 while (filename[lastchar] != 0) lastchar++; 00144 00145 // If at least one character, check for '-'. 00146 if((lastchar >= 1) && (filename[0] == '-')) 00147 { 00148 // Read from stdin - determine type of file to read. 00149 // Determine if compressed bam. 00150 if(strcmp(filename, "-.bam") == 0) 00151 { 00152 // Compressed bam - open as bgzf. 00153 // -.bam is the filename, read compressed bam from stdin 00154 filename = "-"; 00155 myFilePtr = ifopen(filename, "rb", InputFile::BGZF); 00156 00157 myInterfacePtr = new BamInterface; 00158 00159 // Read the magic string. 00160 char magic[4]; 00161 ifread(myFilePtr, magic, 4); 00162 } 00163 else if(strcmp(filename, "-.ubam") == 0) 00164 { 00165 // uncompressed BAM File. 00166 // -.ubam is the filename, read uncompressed bam from stdin 00167 filename = "-"; 00168 myFilePtr = ifopen(filename, "rb", InputFile::UNCOMPRESSED); 00169 00170 myInterfacePtr = new BamInterface; 00171 00172 // Read the magic string. 00173 char magic[4]; 00174 ifread(myFilePtr, magic, 4); 00175 } 00176 else 00177 { 00178 // SAM File. 00179 // read sam from stdin 00180 filename = "-"; 00181 myFilePtr = ifopen(filename, "rb", InputFile::UNCOMPRESSED); 00182 myInterfacePtr = new SamInterface; 00183 } 00184 } 00185 else 00186 { 00187 // Not from stdin. Read the file to determine the type. 
00188 myFilePtr = ifopen(filename, "rb"); 00189 00190 if (myFilePtr == NULL) 00191 { 00192 std::string errorMessage = "Failed to Open "; 00193 errorMessage += filename; 00194 errorMessage += " for reading"; 00195 myStatus.setStatus(SamStatus::FAIL_IO, errorMessage.c_str()); 00196 return(false); 00197 } 00198 00199 char magic[4]; 00200 ifread(myFilePtr, magic, 4); 00201 00202 if (magic[0] == 'B' && magic[1] == 'A' && magic[2] == 'M' && 00203 magic[3] == 1) 00204 { 00205 myInterfacePtr = new BamInterface; 00206 // Set that it is a bam file open for reading. This is needed to 00207 // determine if an index file can be used. 00208 myIsBamOpenForRead = true; 00209 } 00210 else 00211 { 00212 // Not a bam, so rewind to the beginning of the file so it 00213 // can be read. 00214 ifrewind(myFilePtr); 00215 myInterfacePtr = new SamInterface; 00216 } 00217 } 00218 00219 // File is open for reading. 00220 myIsOpenForRead = true; 00221 // Successfully opened the file. 00222 myStatus = SamStatus::SUCCESS; 00223 return(true); 00224 }
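The listing above picks the reader in two ways: by special name when reading from stdin, and by the four-byte magic string otherwise. The sketch below restates those two decisions as standalone functions. `InputKind`, `classifyInputName` and `looksLikeBam` are hypothetical names used only for illustration; they are not part of SamFile.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical classification of how an input name would be handled.
enum class InputKind
{
    StdinBam,             // "-.bam": BGZF-compressed BAM from stdin
    StdinUncompressedBam, // "-.ubam": uncompressed BAM from stdin
    StdinSam,             // any other name starting with '-': SAM from stdin
    FileByMagic           // regular file: type decided by its first bytes
};

// Mirrors the name checks in the listing: any name starting with '-'
// is treated as stdin.
InputKind classifyInputName(const char* filename)
{
    assert(filename != nullptr);
    if (filename[0] == '-')
    {
        if (std::strcmp(filename, "-.bam") == 0)
        {
            return InputKind::StdinBam;
        }
        if (std::strcmp(filename, "-.ubam") == 0)
        {
            return InputKind::StdinUncompressedBam;
        }
        return InputKind::StdinSam;
    }
    return InputKind::FileByMagic;
}

// True when the first four bytes are the BAM magic string 'B','A','M',1.
bool looksLikeBam(const char* bytes, std::size_t length)
{
    assert(bytes != nullptr);
    if (length < 4)
    {
        return false;
    }
    return bytes[0] == 'B' && bytes[1] == 'A' &&
           bytes[2] == 'M' && bytes[3] == 1;
}
```

When the magic check fails for a regular file, the listing rewinds the file and reads it as SAM, so the four bytes consumed by the check are not lost.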
| bool SamFile::OpenForWrite | ( | const char * | filename | ) |
Open a sam/bam file for writing with the specified filename.
Definition at line 228 of file SamFile.cpp.
References myIsOpenForWrite, myStatus, and resetFile().
Referenced by SamFile(), and SamFileWriter::SamFileWriter().
00229 { 00230 // Reset for any previously operated on files. 00231 resetFile(); 00232 00233 int lastchar = 0; 00234 while (filename[lastchar] != 0) lastchar++; 00235 if (lastchar >= 4 && 00236 filename[lastchar - 4] == 'u' && 00237 filename[lastchar - 3] == 'b' && 00238 filename[lastchar - 2] == 'a' && 00239 filename[lastchar - 1] == 'm') 00240 { 00241 // BAM File. 00242 // if -.ubam is the filename, write uncompressed bam to stdout 00243 if((lastchar == 6) && (filename[0] == '-') && (filename[1] == '.')) 00244 { 00245 filename = "-"; 00246 } 00247 myFilePtr = ifopen(filename, "wb", InputFile::UNCOMPRESSED); 00248 00249 myInterfacePtr = new BamInterface; 00250 } 00251 else if (lastchar >= 3 && 00252 filename[lastchar - 3] == 'b' && 00253 filename[lastchar - 2] == 'a' && 00254 filename[lastchar - 1] == 'm') 00255 { 00256 // BAM File. 00257 // if -.bam is the filename, write compressed bam to stdout 00258 if((lastchar == 5) && (filename[0] == '-') && (filename[1] == '.')) 00259 { 00260 filename = "-"; 00261 } 00262 myFilePtr = ifopen(filename, "wb", InputFile::BGZF); 00263 00264 myInterfacePtr = new BamInterface; 00265 } 00266 else 00267 { 00268 // SAM File 00269 // if - (followed by anything is the filename, 00270 // write uncompressed sam to stdout 00271 if((lastchar >= 1) && (filename[0] == '-')) 00272 { 00273 filename = "-"; 00274 } 00275 myFilePtr = ifopen(filename, "wb", InputFile::UNCOMPRESSED); 00276 00277 myInterfacePtr = new SamInterface; 00278 } 00279 00280 if (myFilePtr == NULL) 00281 { 00282 std::string errorMessage = "Failed to Open "; 00283 errorMessage += filename; 00284 errorMessage += " for writing"; 00285 myStatus.setStatus(SamStatus::FAIL_IO, errorMessage.c_str()); 00286 return(false); 00287 } 00288 00289 myIsOpenForWrite = true; 00290 00291 // Successfully opened the file. 00292 myStatus = SamStatus::SUCCESS; 00293 return(true); 00294 }
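For writing, the output format is chosen purely from the filename suffix, as the listing above shows. The following sketch reproduces that decision table in a standalone form. `OutputFormat`, `OutputPlan` and `planOutput` are hypothetical names, not part of SamFile. Note that, like the listing, it matches the trailing characters only, so no '.' is required before "bam".

```cpp
#include <cassert>
#include <string>

// Hypothetical description of what OpenForWrite would produce.
enum class OutputFormat { Sam, Bam, UncompressedBam };

struct OutputPlan
{
    OutputFormat format; // which writer would be used
    bool toStdout;       // whether the name maps to stdout ("-")
};

static bool endsWith(const std::string& s, const std::string& suffix)
{
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Mirrors the suffix checks in the listing, in the same order:
// "ubam" first, then "bam", otherwise SAM.
OutputPlan planOutput(const char* filename)
{
    assert(filename != nullptr);
    const std::string name(filename);
    const bool dashDot =
        name.size() >= 2 && name[0] == '-' && name[1] == '.';

    if (endsWith(name, "ubam"))
    {
        // Only the exact name "-.ubam" goes to stdout.
        return { OutputFormat::UncompressedBam, dashDot && name.size() == 6 };
    }
    if (endsWith(name, "bam"))
    {
        // Only the exact name "-.bam" goes to stdout.
        return { OutputFormat::Bam, dashDot && name.size() == 5 };
    }
    // SAM: any name starting with '-' goes to stdout.
    return { OutputFormat::Sam, !name.empty() && name[0] == '-' };
}
```

The "ubam" test has to come first because every name ending in "ubam" also ends in "bam".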
| bool SamFile::ReadBamIndex | ( | const char * | filename | ) |
Reads the specified bam index file.
It must be read prior to setting a read section, for seeking and reading portions of a bam file.
Definition at line 298 of file SamFile.cpp.
References myStatus, and BamIndex::readIndex().
00299 { 00300 // Cleanup a previously setup index. 00301 if(myBamIndex != NULL) 00302 { 00303 delete myBamIndex; 00304 myBamIndex = NULL; 00305 } 00306 00307 // Create a new bam index. 00308 myBamIndex = new BamIndex(); 00309 SamStatus::Status indexStat = myBamIndex->readIndex(bamIndexFilename); 00310 00311 if(indexStat != SamStatus::SUCCESS) 00312 { 00313 std::string errorMessage = "Failed to read the bam Index file: "; 00314 errorMessage += bamIndexFilename; 00315 myStatus.setStatus(indexStat, errorMessage.c_str()); 00316 delete myBamIndex; 00317 myBamIndex = NULL; 00318 return(false); 00319 } 00320 myStatus = SamStatus::SUCCESS; 00321 return(true); 00322 }
| bool SamFile::ReadHeader | ( | SamFileHeader & | header | ) |
Reads the header section from the file and stores it in the passed in header.
Definition at line 369 of file SamFile.cpp.
References myHasHeader, myIsOpenForRead, and myStatus.
00370 { 00371 if(myIsOpenForRead == false) 00372 { 00373 // File is not open for read 00374 myStatus.setStatus(SamStatus::FAIL_ORDER, 00375 "Cannot read header since the file is not open for reading"); 00376 return(false); 00377 } 00378 00379 if(myHasHeader == true) 00380 { 00381 // The header has already been read. 00382 myStatus.setStatus(SamStatus::FAIL_ORDER, 00383 "Cannot read header since it has already been read."); 00384 return(false); 00385 } 00386 00387 myStatus = myInterfacePtr->readHeader(myFilePtr, header); 00388 if(myStatus == SamStatus::SUCCESS) 00389 { 00390 // The header has now been successfully read. 00391 myHasHeader = true; 00392 return(true); 00393 } 00394 return(false); 00395 }
| bool SamFile::ReadRecord | ( | SamFileHeader & | header, | |
| SamRecord & | record | |||
| ) |
Reads the next record from the file & stores it in the passed in record.
Definition at line 433 of file SamFile.cpp.
References myHasHeader, myIsOpenForRead, myRecordCount, myStatistics, myStatus, readIndexedRecord(), BamIndex::REF_ID_ALL, SamRecord::setReference(), SamRecord::setSequenceTranslation(), and validateSortOrder().
00435 { 00436 myStatus = SamStatus::SUCCESS; 00437 00438 if(myIsOpenForRead == false) 00439 { 00440 // File is not open for read 00441 myStatus.setStatus(SamStatus::FAIL_ORDER, 00442 "Cannot read record since the file is not open for reading"); 00443 throw(std::runtime_error("SOFTWARE BUG: trying to read a SAM/BAM record prior to opening the file.")); 00444 return(false); 00445 } 00446 00447 if(myHasHeader == false) 00448 { 00449 // The header has not yet been read. 00450 // TODO - maybe just read the header. 00451 myStatus.setStatus(SamStatus::FAIL_ORDER, 00452 "Cannot read record since the header has not been read."); 00453 throw(std::runtime_error("SOFTWARE BUG: trying to read a SAM/BAM record prior to reading the header.")); 00454 return(false); 00455 } 00456 00457 // Check to see if a new region has been set. If so, determine the 00458 // chunks for that region. 00459 if(myNewSection) 00460 { 00461 if(!processNewSection(header)) 00462 { 00463 // Failed processing a new section. Could be an 00464 // order issue like the file not being open or the 00465 // indexed file not having been read. 00466 // processNewSection sets myStatus with the failure reason. 00467 return(false); 00468 } 00469 } 00470 00471 // Check to see if the file should be read by index. 00472 if(myRefID != BamIndex::REF_ID_ALL) 00473 { 00474 // Reference ID is set, so read by index. 00475 return(readIndexedRecord(header, record)); 00476 } 00477 00478 record.setReference(myRefPtr); 00479 record.setSequenceTranslation(myReadTranslation); 00480 00481 // File is open for reading and the header has been read, so read the next 00482 // record. 00483 myInterfacePtr->readRecord(myFilePtr, header, record, myStatus); 00484 if(myStatus == SamStatus::SUCCESS) 00485 { 00486 // A record was successfully read, so increment the record count. 00487 myRecordCount++; 00488 00489 if(myStatistics != NULL) 00490 { 00491 // Statistics should be updated. 
00492 myStatistics->updateStatistics(record); 00493 } 00494 00495 // Successfully read the record, so check the sort order. 00496 if(!validateSortOrder(record, header)) 00497 { 00498 // ValidateSortOrder sets the status on a failure. 00499 return(false); 00500 } 00501 return(true); 00502 } 00503 // Failed to read the record. 00504 return(false); 00505 }
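The call order enforced by ReadHeader and ReadRecord (open, then header, then records) can be sketched with a small in-memory stand-in. `OrderedReader` is a hypothetical class, not SamFile; it keeps only the ordering checks and replaces file I/O with a vector of strings. As in the listings, a header read out of order returns false, while a record read out of order throws `std::runtime_error`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical in-memory reader that enforces open -> header -> records.
class OrderedReader
{
public:
    // Opening resets all per-file state, including the header flag.
    void open(const std::vector<std::string>& records)
    {
        myRecords = records;
        myNext = 0;
        myIsOpen = true;
        myHasHeader = false;
    }

    // Fails if nothing is open or the header was already read.
    bool readHeader()
    {
        if (!myIsOpen || myHasHeader)
        {
            return false;
        }
        myHasHeader = true;
        return true;
    }

    // Throws on an ordering bug; returns false once the records run out.
    bool readRecord(std::string& record)
    {
        if (!myIsOpen)
        {
            throw std::runtime_error("record read before opening a file");
        }
        if (!myHasHeader)
        {
            throw std::runtime_error("record read before reading the header");
        }
        if (myNext >= myRecords.size())
        {
            return false;
        }
        record = myRecords[myNext++];
        return true;
    }

private:
    std::vector<std::string> myRecords;
    std::size_t myNext = 0;
    bool myIsOpen = false;
    bool myHasHeader = false;
};
```

A caller would typically loop with `while (reader.readRecord(rec))` after a successful `readHeader()`, treating a false return as the end of the data or a read failure.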
| bool SamFile::SetReadSection | ( | const char * | refName, | |
| int32_t | start, | |||
| int32_t | end | |||
| ) |
Sets what part of the BAM file should be read.
This version will set it to only read a specific reference name and start/end position. The records for this section will be retrieved on each ReadRecord call. When all records have been retrieved for the specified section, ReadRecord will return failure until a new read section is set. Must be called only after the file has been opened for reading.
| refName | the reference name of the records to read from the file. | |
| start | inclusive 0-based start position of records that should be read for this refID. | |
| end | exclusive 0-based end position of records that should be read for this refID. |
Definition at line 620 of file SamFile.cpp.
References myIsBamOpenForRead, myStatus, BamIndex::REF_ID_ALL, and BamIndex::REF_ID_UNMAPPED.
00621 { 00622 // If there is not a BAM file open for reading, return failure. 00623 // Opening a new file clears the read section, so it must be 00624 // set after the file is opened. 00625 if(!myIsBamOpenForRead) 00626 { 00627 // There is not a BAM file open for reading. 00628 myStatus.setStatus(SamStatus::FAIL_ORDER, 00629 "Canot set section since there is no bam file open"); 00630 return(false); 00631 } 00632 00633 myNewSection = true; 00634 myStartPos = start; 00635 myEndPos = end; 00636 if((strcmp(refName, "") == 0) || (strcmp(refName, "*") == 0)) 00637 { 00638 // No Reference name specified, so read just the "-1" entries. 00639 myRefID = BamIndex::REF_ID_UNMAPPED; 00640 } 00641 else 00642 { 00643 // save the reference name and revert the reference ID to unknown 00644 // so it will be calculated later. 00645 myRefName = refName; 00646 myRefID = BamIndex::REF_ID_ALL; 00647 } 00648 myChunksToRead.clear(); 00649 // Reset the end of the current chunk. We are resetting our read, so 00650 // we no longer have a "current chunk" that we are reading. 00651 myCurrentChunkEnd = 0; 00652 myStatus = SamStatus::SUCCESS; 00653 00654 return(true); 00655 }
| bool SamFile::SetReadSection | ( | int32_t | refID, | |
| int32_t | start, | |||
| int32_t | end | |||
| ) |
Sets what part of the BAM file should be read.
This version will set it to only read a specific reference id and start/end position. The records for this section will be retrieved on each ReadRecord call. When all records have been retrieved for the specified section, ReadRecord will return failure until a new read section is set. Must be called only after the file has been opened for reading.
| refID | the reference ID of the records to read from the file. | |
| start | inclusive 0-based start position of records that should be read for this refID. | |
| end | exclusive 0-based end position of records that should be read for this refID. |
Definition at line 591 of file SamFile.cpp.
References myIsBamOpenForRead, and myStatus.
00592 { 00593 // If there is not a BAM file open for reading, return failure. 00594 // Opening a new file clears the read section, so it must be 00595 // set after the file is opened. 00596 if(!myIsBamOpenForRead) 00597 { 00598 // There is not a BAM file open for reading. 00599 myStatus.setStatus(SamStatus::FAIL_ORDER, 00600 "Canot set section since there is no bam file open"); 00601 return(false); 00602 } 00603 00604 myNewSection = true; 00605 myStartPos = start; 00606 myEndPos = end; 00607 myRefID = refID; 00608 myRefName.clear(); 00609 myChunksToRead.clear(); 00610 // Reset the end of the current chunk. We are resetting our read, so 00611 // we no longer have a "current chunk" that we are reading. 00612 myCurrentChunkEnd = 0; 00613 myStatus = SamStatus::SUCCESS; 00614 00615 return(true); 00616 }
| bool SamFile::SetReadSection | ( | const char * | refName | ) |
Sets what part of the BAM file should be read.
This version will set it to only read a specific reference name. The records for that reference name will be retrieved on each ReadRecord call. When all records have been retrieved for the specified reference name, ReadRecord will return failure until a new read section is set. Must be called only after the file has been opened for reading.
| refName | the reference name of the records to read from the file. |
Definition at line 583 of file SamFile.cpp.
References SetReadSection().
00584 {
00585     // No start/end specified, so set back to default -1.
00586     return(SetReadSection(refName, -1, -1));
00587 }
| bool SamFile::SetReadSection | ( | int32_t | refID | ) |
Sets what part of the BAM file should be read.
This version will set it to only read a specific reference id. The records for that reference id will be retrieved on each ReadRecord call. When all records have been retrieved for the specified reference id, ReadRecord will return failure until a new read section is set. Must be called only after the file has been opened for reading.
| refID | the reference ID of the records to read from the file. |
Definition at line 574 of file SamFile.cpp.
Referenced by SetReadSection().
00575 {
00576     // No start/end specified, so set back to default -1.
00577     return(SetReadSection(refID, -1, -1));
00578 }
| void SamFile::SetReadSequenceTranslation | ( | SamRecord::SequenceTranslation | translation | ) |
Set the type of sequence translation to use when reading the sequence.
Passed down to the SamRecord when it is read. The default type (if this method is never called) is NONE (the sequence is left as-is).
| translation | type of sequence translation to use. |
Definition at line 333 of file SamFile.cpp.
| void SamFile::SetReference | ( | GenomeSequence * | reference | ) |
Sets the reference to the specified genome sequence object.
| reference | pointer to the GenomeSequence object. |
Definition at line 326 of file SamFile.cpp.
| void SamFile::setSortedValidation | ( | SortedType | sortType | ) |
Set the flag to validate that the file is sorted as it is read/written.
Must be called after the file has been opened.
Definition at line 560 of file SamFile.cpp.
| void SamFile::SetWriteSequenceTranslation | ( | SamRecord::SequenceTranslation | translation | ) |
Set the type of sequence translation to use when writing the sequence.
Passed down to the SamRecord when it is written. The default type (if this method is never called) is NONE (the sequence is left as-is).
| translation | type of sequence translation to use. |
Definition at line 340 of file SamFile.cpp.
| bool SamFile::validateSortOrder | ( | SamRecord & | record, | |
| SamFileHeader & | header | |||
| ) | [protected] |
Validate that the record is sorted compared to the previously read record if there is one, according to the specified sort order.
If the sort order is UNSORTED, true is returned.
Definition at line 752 of file SamFile.cpp.
References FLAG, SamRecord::get0BasedPosition(), SamRecord::getReadName(), SamRecord::getReferenceID(), myPrevCoord, myRecordCount, myStatus, QUERY_NAME, BamIndex::REF_ID_UNMAPPED, SamRecord::setReference(), SamRecord::setSequenceTranslation(), and UNSORTED.
Referenced by readIndexedRecord(), ReadRecord(), and WriteRecord().
00753 { 00754 if(myRefPtr != NULL) 00755 { 00756 record.setReference(myRefPtr); 00757 } 00758 record.setSequenceTranslation(myReadTranslation); 00759 00760 bool status = false; 00761 if(mySortedType == UNSORTED) 00762 { 00763 // Unsorted, so nothing to validate, just return true. 00764 status = true; 00765 } 00766 else 00767 { 00768 // Check to see if mySortedType is based on the header. 00769 if(mySortedType == FLAG) 00770 { 00771 // Determine the sorted type from what was read out of the header. 00772 mySortedType = getSortOrderFromHeader(header); 00773 } 00774 00775 if(mySortedType == QUERY_NAME) 00776 { 00777 // Validate that it is sorted by query name. 00778 // Get the query name from the record. 00779 const char* readName = record.getReadName(); 00780 if(myPrevReadName.compare(readName) > 0) 00781 { 00782 // The previous name is greater than the new record's name, so 00783 // return false. 00784 String errorMessage = "ERROR: File is not sorted at record "; 00785 errorMessage += myRecordCount; 00786 myStatus.setStatus(SamStatus::INVALID_SORT, 00787 errorMessage.c_str()); 00788 status = false; 00789 } 00790 else 00791 { 00792 myPrevReadName = readName; 00793 status = true; 00794 } 00795 } 00796 else 00797 { 00798 // Validate that it is sorted by COORDINATES. 00799 // Get the leftmost coordinate and the reference index. 00800 int32_t refID = record.getReferenceID(); 00801 int32_t coord = record.get0BasedPosition(); 00802 // The unmapped reference id is at the end of a sorted file. 00803 if(refID == BamIndex::REF_ID_UNMAPPED) 00804 { 00805 // A new reference ID that is for the unmapped reads 00806 // is always valid. 00807 status = true; 00808 myPrevRefID = refID; 00809 myPrevCoord = coord; 00810 } 00811 else if(myPrevRefID == BamIndex::REF_ID_UNMAPPED) 00812 { 00813 // Previous reference ID was for unmapped reads, but the 00814 // current one is not, so this is not sorted. 
00815 String errorMessage = "ERROR: File is not sorted at record "; 00816 errorMessage += myRecordCount; 00817 myStatus.setStatus(SamStatus::INVALID_SORT, 00818 errorMessage.c_str()); 00819 status = false; 00820 } 00821 else if(refID < myPrevRefID) 00822 { 00823 // Current reference id is less than the previous one, 00824 //meaning that it is not sorted. 00825 String errorMessage = "ERROR: File is not sorted at record "; 00826 errorMessage += myRecordCount; 00827 myStatus.setStatus(SamStatus::INVALID_SORT, 00828 errorMessage.c_str()); 00829 status = false; 00830 } 00831 else 00832 { 00833 // The reference IDs are in the correct order. 00834 if(refID > myPrevRefID) 00835 { 00836 // New reference id, so set the previous coordinate to -1 00837 myPrevCoord = -1; 00838 } 00839 00840 // Check the coordinates. 00841 if(coord < myPrevCoord) 00842 { 00843 // New Coord is less than the previous position. 00844 String errorMessage = "ERROR: File is not sorted at record "; 00845 errorMessage += myRecordCount; 00846 myStatus.setStatus(SamStatus::INVALID_SORT, 00847 errorMessage.c_str()); 00848 status = false; 00849 } 00850 else 00851 { 00852 myPrevRefID = refID; 00853 myPrevCoord = coord; 00854 status = true; 00855 } 00856 } 00857 } 00858 } 00859 00860 return(status); 00861 }
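The coordinate branch of the validation above reduces to a small state machine over the previous reference ID and position. The sketch below isolates that logic. `CoordinateOrderChecker` and `kRefIdUnmapped` are hypothetical names; the value -1 for unmapped reads follows the "-1" comment in the SetReadSection listing, and the initial previous values (reference 0, coordinate -1) are an assumption, since resetFile() is not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: unmapped reads carry reference ID -1.
const int32_t kRefIdUnmapped = -1;

// Hypothetical checker mirroring the COORDINATE branch of validateSortOrder.
class CoordinateOrderChecker
{
public:
    // Returns true if the record keeps the file coordinate-sorted.
    // State is updated only when the record is accepted, except that a
    // move to a higher reference resets the previous coordinate first.
    bool accept(int32_t refID, int32_t coord)
    {
        if (refID == kRefIdUnmapped)
        {
            // Unmapped reads belong at the end, so they are always valid.
            myPrevRefID = refID;
            myPrevCoord = coord;
            return true;
        }
        if (myPrevRefID == kRefIdUnmapped)
        {
            // A mapped read after the unmapped block is out of order.
            return false;
        }
        if (refID < myPrevRefID)
        {
            return false;
        }
        if (refID > myPrevRefID)
        {
            // New reference: any coordinate is acceptable.
            myPrevCoord = -1;
        }
        if (coord < myPrevCoord)
        {
            return false;
        }
        myPrevRefID = refID;
        myPrevCoord = coord;
        return true;
    }

private:
    int32_t myPrevRefID = 0;  // assumed initial value
    int32_t myPrevCoord = -1; // assumed initial value
};
```

Equal coordinates are accepted, because only a strictly smaller position than the previous one is treated as a sort violation.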
| bool SamFile::WriteHeader | ( | SamFileHeader & | header | ) |
Writes the specified header into the file.
Definition at line 399 of file SamFile.cpp.
References myHasHeader, myIsOpenForWrite, and myStatus.
00400 { 00401 if(myIsOpenForWrite == false) 00402 { 00403 // File is not open for write 00404 // -OR- 00405 // The header has already been written. 00406 myStatus.setStatus(SamStatus::FAIL_ORDER, 00407 "Cannot write header since the file is not open for writing"); 00408 return(false); 00409 } 00410 00411 if(myHasHeader == true) 00412 { 00413 // The header has already been written. 00414 myStatus.setStatus(SamStatus::FAIL_ORDER, 00415 "Cannot write header since it has already been written"); 00416 return(false); 00417 } 00418 00419 myStatus = myInterfacePtr->writeHeader(myFilePtr, header); 00420 if(myStatus == SamStatus::SUCCESS) 00421 { 00422 // The header has now been successfully written. 00423 myHasHeader = true; 00424 return(true); 00425 } 00426 00427 // return the status. 00428 return(false); 00429 }
| bool SamFile::WriteRecord | ( | SamFileHeader & | header, | |
| SamRecord & | record | |||
| ) |
Writes the specified record into the file.
Definition at line 510 of file SamFile.cpp.
References myHasHeader, myIsOpenForWrite, myRecordCount, myStatus, SamRecord::setReference(), and validateSortOrder().
00512 { 00513 if(myIsOpenForWrite == false) 00514 { 00515 // File is not open for writing 00516 myStatus.setStatus(SamStatus::FAIL_ORDER, 00517 "Cannot write record since the file is not open for writing"); 00518 return(false); 00519 } 00520 00521 if(myHasHeader == false) 00522 { 00523 // The header has not yet been written. 00524 myStatus.setStatus(SamStatus::FAIL_ORDER, 00525 "Cannot write record since the header has not been written"); 00526 return(false); 00527 } 00528 00529 // Before trying to write the record, validate the sort order. 00530 if(!validateSortOrder(record, header)) 00531 { 00532 // Not sorted like it is supposed to be, do not write the record 00533 myStatus.setStatus(SamStatus::INVALID_SORT, 00534 "Cannot write the record since the file is not properly sorted."); 00535 return(false); 00536 } 00537 00538 if(myRefPtr != NULL) 00539 { 00540 record.setReference(myRefPtr); 00541 } 00542 00543 // File is open for writing and the header has been written, so write the 00544 // record. 00545 myStatus = myInterfacePtr->writeRecord(myFilePtr, header, record, 00546 myWriteTranslation); 00547 00548 if(myStatus == SamStatus::SUCCESS) 00549 { 00550 // A record was successfully written, so increment the record count. 00551 myRecordCount++; 00552 return(true); 00553 } 00554 return(false); 00555 }
bool SamFile::myHasHeader [protected] |
Flag to indicate if a header has been read/written - required before being able to read/write a record.
Definition at line 240 of file SamFile.h.
Referenced by ReadHeader(), ReadRecord(), resetFile(), WriteHeader(), and WriteRecord().