SamFile Class Reference


Public Types

enum  OpenType { READ, WRITE }
enum  SortedType { UNSORTED = 0, FLAG, COORDINATE, QUERY_NAME }
 Enum for indicating the type of sort for the file.

Public Member Functions

 SamFile ()
 Default Constructor.
 SamFile (ErrorHandler::HandlingType errorHandlingType)
 Constructor that sets the error handling type.
 SamFile (const char *filename, OpenType mode)
 Constructor that opens the specified file based on the specified mode (READ/WRITE).
 SamFile (const char *filename, OpenType mode, ErrorHandler::HandlingType errorHandlingType)
 Constructor that opens the specified file based on the specified mode (READ/WRITE) and handles errors per the specified errorHandlingType.
bool OpenForRead (const char *filename)
 Open a sam/bam file for reading with the specified filename.
bool OpenForWrite (const char *filename)
 Open a sam/bam file for writing with the specified filename.
bool ReadBamIndex (const char *filename)
 Reads the specified bam index file.
void Close ()
 Close the file if there is one open.
bool IsEOF ()
 Returns whether or not the end of the file has been reached.
bool ReadHeader (SamFileHeader &header)
 Reads the header section from the file and stores it in the passed in header.
bool WriteHeader (SamFileHeader &header)
 Writes the specified header into the file.
bool ReadRecord (SamFileHeader &header, SamRecord &record)
 Reads the next record from the file & stores it in the passed in record.
bool WriteRecord (SamFileHeader &header, SamRecord &record)
 Writes the specified record into the file.
void setSortedValidation (SortedType sortType)
 Set the flag to validate that the file is sorted as it is read/written.
uint32_t GetCurrentRecordCount ()
 Return the number of records that have been read/written so far.
SamStatus::Status GetFailure ()
 Get the Status of the last call that sets status.
SamStatus::Status GetStatus ()
 Get the Status of the last call that sets status.
const char * GetStatusMessage ()
 Get the status message of the last call that sets status.
bool SetReadSection (int32_t refID)
 Sets what part of the BAM file should be read.
bool SetReadSection (const char *refName)
 Sets what part of the BAM file should be read.
bool SetReadSection (int32_t refID, int32_t start, int32_t end)
 Sets what part of the BAM file should be read.
bool SetReadSection (const char *refName, int32_t start, int32_t end)
 Sets what part of the BAM file should be read.
uint32_t GetNumOverlaps (SamRecord &samRecord)
 Returns the number of bases in the passed in read that overlap the region that is currently set.
void GenerateStatistics (bool genStats)
 Whether or not statistics should be generated for this file.
void PrintStatistics ()

Protected Member Functions

void resetFile ()
bool validateSortOrder (SamRecord &record, SamFileHeader &header)
 Validate that the record is sorted compared to the previously read record if there is one, according to the specified sort order.
SortedType getSortOrderFromHeader (SamFileHeader &header)
bool readIndexedRecord (SamFileHeader &header, SamRecord &record)
 Variant of ReadRecord that reads records from the specified reference only.
bool processNewSection (SamFileHeader &header)

Protected Attributes

IFILE myFilePtr
GenericSamInterface * myInterfacePtr
bool myIsOpenForRead
 Flag to indicate if a file is open for reading.
bool myIsOpenForWrite
 Flag to indicate if a file is open for writing.
bool myHasHeader
 Flag to indicate if a header has been read/written - required before being able to read/write a record.
SortedType mySortedType
int32_t myPrevCoord
 Previous values used for checking if the file is sorted.
int32_t myPrevRefID
std::string myPrevReadName
uint32_t myRecordCount
 Keep a count of the number of records that have been read/written so far.
SamStatistics * myStatistics
 Pointer to the statistics for this file.
SamStatus myStatus
 The status of the last SamFile command.
bool myIsBamOpenForRead
 Values for reading Sorted BAM files via the index.
bool myNewSection
int32_t myRefID
int32_t myStartPos
int32_t myEndPos
uint64_t myCurrentChunkEnd
SortedChunkList myChunksToRead
BamIndex * myBamIndex
std::string myRefName

Detailed Description

Definition at line 29 of file SamFile.h.


Member Enumeration Documentation

enum SamFile::SortedType

Enum for indicating the type of sort for the file.

Enumerator:
UNSORTED 

file is not sorted.

FLAG 

SO flag from the header indicates the sort type.

COORDINATE 

file is sorted by coordinate.

QUERY_NAME 

file is sorted by queryname.

Definition at line 35 of file SamFile.h.

00035                     {
00036         UNSORTED = 0, ///< file is not sorted.
00037         FLAG,         ///< SO flag from the header indicates the sort type.
00038         COORDINATE,   ///< file is sorted by coordinate.
00039         QUERY_NAME    ///< file is sorted by queryname.
00040     };


Constructor & Destructor Documentation

SamFile::SamFile ( ErrorHandler::HandlingType  errorHandlingType  ) 

Constructor that sets the error handling type.

Parameters:
errorHandlingType how to handle errors.

Definition at line 37 of file SamFile.cpp.

00038     : myFilePtr(NULL),
00039       myInterfacePtr(NULL),
00040       myStatistics(NULL),
00041       myStatus(errorHandlingType),
00042       myBamIndex(NULL)
00043 {
00044     resetFile();
00045 }

SamFile::SamFile ( const char *  filename, OpenType  mode )

Constructor that opens the specified file based on the specified mode (READ/WRITE).

Parameters:
filename name of the file to open.
mode mode to use for opening the file.

Definition at line 50 of file SamFile.cpp.

References GetStatusMessage(), OpenForRead(), and OpenForWrite().

00051     : myFilePtr(NULL),
00052       myInterfacePtr(NULL),
00053       myStatistics(NULL),
00054       myStatus(),
00055       myBamIndex(NULL)
00056 {
00057     resetFile();
00058 
00059     bool openStatus = true;
00060     if(mode == READ)
00061     {
00062         // open the file for read.
00063         openStatus = OpenForRead(filename);
00064     }
00065     else
00066     {
00067         // open the file for write.
00068         openStatus = OpenForWrite(filename);
00069     }
00070     if(!openStatus)
00071     {
00072         // Failed to open the file - print error and abort.
00073         fprintf(stderr, "%s\n", GetStatusMessage());
00074         std::cerr << "FAILURE - EXITING!!!" << std::endl;
00075         exit(-1);
00076     }
00077 }

SamFile::SamFile ( const char *  filename, OpenType  mode, ErrorHandler::HandlingType  errorHandlingType )

Constructor that opens the specified file based on the specified mode (READ/WRITE) and handles errors per the specified errorHandlingType.

Parameters:
filename name of the file to open.
mode mode to use for opening the file.
errorHandlingType how to handle errors.

Definition at line 82 of file SamFile.cpp.

References GetStatusMessage(), OpenForRead(), and OpenForWrite().

00084     : myFilePtr(NULL),
00085       myInterfacePtr(NULL),
00086       myStatistics(NULL),
00087       myStatus(errorHandlingType),
00088       myBamIndex(NULL)
00089 {
00090     resetFile();
00091 
00092     bool openStatus = true;
00093     if(mode == READ)
00094     {
00095         // open the file for read.
00096         openStatus = OpenForRead(filename);
00097     }
00098     else
00099     {
00100         // open the file for write.
00101         openStatus = OpenForWrite(filename);
00102     }
00103     if(!openStatus)
00104     {
00105         // Failed to open the file - print error and abort.
00106         fprintf(stderr, "%s\n", GetStatusMessage());
00107         std::cerr << "FAILURE - EXITING!!!" << std::endl;
00108         exit(-1);
00109     }
00110 }


Member Function Documentation

void SamFile::GenerateStatistics ( bool  genStats  ) 

Whether or not statistics should be generated for this file.

The value is carried over between files and is not reset, but the statistics themselves are reset between files.

Parameters:
genStats set to true if statistics should be generated, false if not.

Definition at line 627 of file SamFile.cpp.

References myStatistics.

00628 {
00629     if(genStats)
00630     {
00631         if(myStatistics == NULL)
00632         {
00633             // Want to generate statistics, but do not yet have the
00634             // structure for them, so create one.
00635             myStatistics = new SamStatistics();
00636         }
00637     }
00638     else
00639     {
00640         // Do not generate statistics, so if myStatistics is not NULL, 
00641         // delete it.
00642         if(myStatistics != NULL)
00643         {
00644             delete myStatistics;
00645             myStatistics = NULL;
00646         }
00647     }
00648 
00649 }
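
The on/off behaviour above can also be expressed with a smart pointer. The following is a minimal sketch, not the library's implementation: `DemoStatistics` and `StatsToggle` are hypothetical stand-ins, and the counter is only there to show that turning statistics on twice keeps the existing structure while turning them off discards it.

```cpp
#include <cstdint>
#include <memory>

// Hypothetical stand-in for SamStatistics; it only counts records.
struct DemoStatistics {
    std::uint64_t recordCount = 0;
};

// Hypothetical holder that mirrors the on/off logic of GenerateStatistics.
class StatsToggle {
public:
    void generateStatistics(bool genStats) {
        if (genStats) {
            // Allocate only when no structure exists yet, so an existing
            // structure (and its counts) is left untouched.
            if (!myStats) {
                myStats.reset(new DemoStatistics());
            }
        } else {
            // Turning statistics off releases the structure.
            myStats.reset();
        }
    }

    bool enabled() const { return myStats != nullptr; }

    // Called once per record; does nothing while statistics are disabled.
    void update() {
        if (myStats) {
            ++myStats->recordCount;
        }
    }

    std::uint64_t count() const { return myStats ? myStats->recordCount : 0; }

private:
    std::unique_ptr<DemoStatistics> myStats;
};
```

With `std::unique_ptr` the explicit `delete` and the reset to `NULL` collapse into a single `reset()` call.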

SamStatus::Status SamFile::GetFailure (  )  [inline]

Get the Status of the last call that sets status.

Kept for backward compatibility; will be removed later.

Definition at line 114 of file SamFile.h.

References GetStatus().

00115     {
00116         return(GetStatus());
00117     }

uint32_t SamFile::GetNumOverlaps ( SamRecord &  samRecord  )

Returns the number of bases in the passed in read that overlap the region that is currently set.

Parameters:
samRecord to check for overlapping bases.
Returns:
number of bases that overlap region that is currently set.

Definition at line 619 of file SamFile.cpp.

00620 {
00621     // Get the overlaps in the sam record for the region currently set
00622     // for this file.
00623     return(samRecord.getNumOverlaps(myStartPos, myEndPos));
00624 }
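
To illustrate what "number of overlapping bases" means for a 0-based region with an inclusive start and an exclusive end, here is a simplified sketch. It is not `SamRecord::getNumOverlaps`, whose body is not shown in this section: `countOverlap` is a hypothetical helper that assumes an ungapped read and treats -1 as "no bound on this side", following the default of -1 used when no start/end is specified.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: number of bases of an ungapped read that fall
// inside the region [regionStart, regionEnd), 0-based, end exclusive.
// readStart is the 0-based leftmost position of the read.
// Assumption: -1 for regionStart or regionEnd means "unbounded".
std::uint32_t countOverlap(std::int32_t readStart, std::int32_t readLength,
                           std::int32_t regionStart, std::int32_t regionEnd)
{
    const std::int32_t readEnd = readStart + readLength;  // exclusive
    const std::int32_t lo =
        (regionStart == -1) ? readStart : std::max(readStart, regionStart);
    const std::int32_t hi =
        (regionEnd == -1) ? readEnd : std::min(readEnd, regionEnd);
    // An empty or inverted intersection means no overlap.
    return (hi > lo) ? static_cast<std::uint32_t>(hi - lo) : 0u;
}
```

For example, a 50-base read starting at position 100 shares 10 bases with the region [120, 130) and none with [150, 200).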

bool SamFile::IsEOF (  ) 

Returns whether or not the end of the file has been reached.

Returns:
true = EOF; false = not EOF. If the file is not open, true is returned.

Definition at line 324 of file SamFile.cpp.

00325 {
00326     if (myFilePtr != NULL)
00327     {
00328         // File Pointer is set, so return if eof.
00329         return(ifeof(myFilePtr));
00330     }
00331     // File pointer is not set, so return true, eof.
00332     return true;
00333 }

bool SamFile::OpenForRead ( const char *  filename  ) 

Open a sam/bam file for reading with the specified filename.

Parameters:
filename the sam/bam file to open for reading.
Returns:
true = success; false = failure.

Definition at line 124 of file SamFile.cpp.

References myIsBamOpenForRead, myIsOpenForRead, and myStatus.

Referenced by SamFile(), and SamFileReader::SamFileReader().

00125 {
00126     // Reset for any previously operated on files.
00127     resetFile();
00128 
00129     int lastchar = 0;
00130 
00131     while (filename[lastchar] != 0) lastchar++;
00132 
00133     // If at least one character, check for '-'.
00134     if((lastchar >= 1) && (filename[0] == '-'))
00135     {
00136         // Read from stdin - determine type of file to read.
00137         // Determine if compressed bam.
00138         if(strcmp(filename, "-.bam") == 0)
00139         {
00140             // Compressed bam - open as bgzf.
00141             // -.bam is the filename, read compressed bam from stdin
00142             filename = "-";
00143             myFilePtr = ifopen(filename, "rb", InputFile::BGZF);
00144             
00145             myInterfacePtr = new BamInterface;
00146 
00147             // Read the magic string.
00148             char magic[4];
00149             ifread(myFilePtr, magic, 4);
00150         }
00151         else if(strcmp(filename, "-.ubam") == 0)
00152         {
00153             // uncompressed BAM File.
00154             // -.ubam is the filename, read uncompressed bam from stdin
00155             filename = "-";
00156             myFilePtr = ifopen(filename, "rb", InputFile::UNCOMPRESSED);
00157         
00158             myInterfacePtr = new BamInterface;
00159 
00160             // Read the magic string.
00161             char magic[4];
00162             ifread(myFilePtr, magic, 4);
00163         }
00164         else
00165         {
00166             // SAM File.
00167             // read sam from stdin
00168             filename = "-";
00169             myFilePtr = ifopen(filename, "rb", InputFile::UNCOMPRESSED);
00170             myInterfacePtr = new SamInterface;
00171         }
00172     }
00173     else
00174     {
00175         // Not from stdin.  Read the file to determine the type.
00176         myFilePtr = ifopen(filename, "rb");
00177         
00178         if (myFilePtr == NULL)
00179         {
00180             std::string errorMessage = "Failed to Open ";
00181             errorMessage += filename;
00182             errorMessage += " for reading";
00183             myStatus.setStatus(SamStatus::FAIL_IO, errorMessage.c_str());
00184             return(false);
00185         }
00186         
00187         char magic[4];
00188         ifread(myFilePtr, magic, 4);
00189         
00190         if (magic[0] == 'B' && magic[1] == 'A' && magic[2] == 'M' &&
00191             magic[3] == 1)
00192         {
00193             myInterfacePtr = new BamInterface;
00194             // Set that it is a bam file open for reading.  This is needed to
00195             // determine if an index file can be used.
00196             myIsBamOpenForRead = true;
00197         }
00198         else
00199         {
00200             // Not a bam, so rewind to the beginning of the file so it
00201             // can be read.
00202             ifrewind(myFilePtr);
00203             myInterfacePtr = new SamInterface;
00204         }
00205     }
00206 
00207     // File is open for reading.
00208     myIsOpenForRead = true;
00209     // Successfully opened the file.
00210     myStatus = SamStatus::SUCCESS;
00211     return(true);
00212 }
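
The two decisions made in the listing above, the stdin naming convention and the BAM magic check, can be isolated as small pure functions. This is a sketch for illustration only: `DemoFormat`, `classifyStdinName` and `hasBamMagic` are hypothetical names and are not part of the library.

```cpp
#include <cstring>

// Hypothetical labels for the three ways stdin can be interpreted.
enum class DemoFormat { SAM, BAM_BGZF, BAM_UNCOMPRESSED };

// Mirrors the stdin convention in the listing. The caller is expected to
// have checked that the name starts with '-', which selects stdin.
DemoFormat classifyStdinName(const char* filename)
{
    if (std::strcmp(filename, "-.bam") == 0) {
        return DemoFormat::BAM_BGZF;          // compressed BAM from stdin
    }
    if (std::strcmp(filename, "-.ubam") == 0) {
        return DemoFormat::BAM_UNCOMPRESSED;  // uncompressed BAM from stdin
    }
    return DemoFormat::SAM;                   // anything else: SAM text
}

// True when the first four bytes are the BAM magic string "BAM\1".
bool hasBamMagic(const char magic[4])
{
    return magic[0] == 'B' && magic[1] == 'A' && magic[2] == 'M' &&
           magic[3] == 1;
}
```

Note that in the listing only the on-disk BAM branch sets myIsBamOpenForRead, so, assuming resetFile() clears that flag, SetReadSection reports failure for input read from stdin.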

bool SamFile::OpenForWrite ( const char *  filename  ) 

Open a sam/bam file for writing with the specified filename.

Returns:
true = success; false = failure.

Definition at line 216 of file SamFile.cpp.

References myIsOpenForWrite, and myStatus.

Referenced by SamFile(), and SamFileWriter::SamFileWriter().

00217 {
00218     // Reset for any previously operated on files.
00219     resetFile();
00220     
00221     int lastchar = 0;
00222     while (filename[lastchar] != 0) lastchar++;   
00223     if (lastchar >= 4 && 
00224         filename[lastchar - 4] == 'u' &&
00225         filename[lastchar - 3] == 'b' &&
00226         filename[lastchar - 2] == 'a' &&
00227         filename[lastchar - 1] == 'm')
00228     {
00229         // BAM File.
00230         // if -.ubam is the filename, write uncompressed bam to stdout
00231         if((lastchar == 6) && (filename[0] == '-') && (filename[1] == '.'))
00232         {
00233             filename = "-";
00234         }
00235         myFilePtr = ifopen(filename, "wb", InputFile::UNCOMPRESSED);
00236 
00237         myInterfacePtr = new BamInterface;
00238     }
00239     else if (lastchar >= 3 && 
00240              filename[lastchar - 3] == 'b' &&
00241              filename[lastchar - 2] == 'a' &&
00242              filename[lastchar - 1] == 'm')
00243     {
00244         // BAM File.
00245         // if -.bam is the filename, write compressed bam to stdout
00246         if((lastchar == 5) && (filename[0] == '-') && (filename[1] == '.'))
00247         {
00248             filename = "-";
00249         }
00250         myFilePtr = ifopen(filename, "wb", InputFile::BGZF);
00251         
00252         myInterfacePtr = new BamInterface;
00253     }
00254     else
00255     {
00256         // SAM File
00257         // if - (followed by anything is the filename,
00258         // write uncompressed sam to stdout
00259         if((lastchar >= 1) && (filename[0] == '-'))
00260         {
00261             filename = "-";
00262         }
00263         myFilePtr = ifopen(filename, "wb", InputFile::UNCOMPRESSED);
00264    
00265         myInterfacePtr = new SamInterface;
00266     }
00267 
00268     if (myFilePtr == NULL)
00269     {
00270         std::string errorMessage = "Failed to Open ";
00271         errorMessage += filename;
00272         errorMessage += " for writing";
00273         myStatus.setStatus(SamStatus::FAIL_IO, errorMessage.c_str());
00274         return(false);
00275     }
00276    
00277     myIsOpenForWrite = true;
00278 
00279     // Successfully opened the file.
00280     myStatus = SamStatus::SUCCESS;
00281     return(true);
00282 }
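
The output format is chosen purely from the end of the filename. The sketch below restates that rule as standalone functions; `DemoWriteFormat`, `endsWith`, `classifyWriteName` and `writesToStdout` are hypothetical names introduced here, and the code performs no I/O.

```cpp
#include <string>

// Hypothetical labels for the three output formats.
enum class DemoWriteFormat { SAM, BAM_BGZF, BAM_UNCOMPRESSED };

static bool endsWith(const std::string& name, const std::string& suffix)
{
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(),
                        suffix) == 0;
}

// "ubam" is tested first because every name ending in "ubam" also ends
// in "bam". As in the listing, no '.' is required before the suffix.
DemoWriteFormat classifyWriteName(const std::string& filename)
{
    if (endsWith(filename, "ubam")) return DemoWriteFormat::BAM_UNCOMPRESSED;
    if (endsWith(filename, "bam"))  return DemoWriteFormat::BAM_BGZF;
    return DemoWriteFormat::SAM;
}

// True when the listing would replace the name with "-" (stdout).
bool writesToStdout(const std::string& filename)
{
    if (endsWith(filename, "ubam")) return filename == "-.ubam";
    if (endsWith(filename, "bam"))  return filename == "-.bam";
    // SAM: any name that starts with '-' goes to stdout.
    return !filename.empty() && filename[0] == '-';
}
```

One consequence of the rule is that a name such as "-x.bam" is written to a file of that name as compressed BAM, while "-x.sam" is written to stdout as SAM.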

bool SamFile::ReadBamIndex ( const char *  filename  ) 

Reads the specified bam index file.

It must be read prior to setting a read section, for seeking and reading portions of a bam file.

Returns:
true = success; false = failure.

Definition at line 286 of file SamFile.cpp.

References myStatus, and BamIndex::readIndex().

00287 {
00288     // Cleanup a previously setup index.
00289     if(myBamIndex != NULL)
00290     {
00291         delete myBamIndex;
00292         myBamIndex = NULL;
00293     }
00294 
00295     // Create a new bam index.
00296     myBamIndex = new BamIndex();
00297     SamStatus::Status indexStat = myBamIndex->readIndex(bamIndexFilename);
00298 
00299     if(indexStat != SamStatus::SUCCESS)
00300     {
00301         std::string errorMessage = "Failed to read the bam Index file: ";
00302         errorMessage += bamIndexFilename;
00303         myStatus.setStatus(indexStat, errorMessage.c_str());
00304         delete myBamIndex;
00305         myBamIndex = NULL;
00306         return(false);
00307     }
00308     myStatus = SamStatus::SUCCESS;
00309     return(true);
00310 }

bool SamFile::ReadHeader ( SamFileHeader &  header  )

Reads the header section from the file and stores it in the passed in header.

Returns:
true = success; false = failure.

Definition at line 337 of file SamFile.cpp.

References myHasHeader, myIsOpenForRead, and myStatus.

00338 {
00339     if(myIsOpenForRead == false)
00340     {
00341         // File is not open for read
00342         myStatus.setStatus(SamStatus::FAIL_ORDER, 
00343                            "Cannot read header since the file is not open for reading");
00344         return(false);
00345     }
00346 
00347     if(myHasHeader == true)
00348     {
00349         // The header has already been read.
00350         myStatus.setStatus(SamStatus::FAIL_ORDER, 
00351                            "Cannot read header since it has already been read.");
00352         return(false);
00353     }
00354 
00355     myStatus = myInterfacePtr->readHeader(myFilePtr, header);
00356     if(myStatus == SamStatus::SUCCESS)
00357     {
00358         // The header has now been successfully read.
00359         myHasHeader = true;
00360         return(true);
00361     }
00362     return(false);
00363 }
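
The ordering rules enforced above (the file must be open, and the header may be read only once) can be modelled without any I/O. The following is a hedged sketch: `HeaderGate` and `DemoStatus` are hypothetical, and the assumption that opening a file clears the header flag is based on the call to resetFile() in the open functions, whose body is not shown here.

```cpp
// Hypothetical status codes, named after the SamStatus values in the listing.
enum class DemoStatus { SUCCESS, FAIL_ORDER };

// Hypothetical model of the call-order checks; it performs no I/O.
class HeaderGate {
public:
    // Assumption: opening a file resets the "header seen" flag.
    void open()
    {
        myIsOpen = true;
        myHasHeader = false;
    }

    // Succeeds only on an open file whose header has not been read yet.
    bool readHeader()
    {
        if (!myIsOpen || myHasHeader) {
            myStatus = DemoStatus::FAIL_ORDER;
            return false;
        }
        myHasHeader = true;
        myStatus = DemoStatus::SUCCESS;
        return true;
    }

    // Records may be read only after the header has been read.
    bool canReadRecord() const { return myIsOpen && myHasHeader; }

    DemoStatus status() const { return myStatus; }

private:
    bool myIsOpen = false;
    bool myHasHeader = false;
    DemoStatus myStatus = DemoStatus::SUCCESS;
};
```

A caller would therefore open the file, read the header exactly once, and only then start reading records.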

bool SamFile::ReadRecord ( SamFileHeader &  header, SamRecord &  record )

Reads the next record from the file & stores it in the passed in record.

Returns:
true = record was successfully set. false = record was not successfully set.

Definition at line 401 of file SamFile.cpp.

References myHasHeader, myIsOpenForRead, myRecordCount, myStatistics, myStatus, readIndexedRecord(), BamIndex::REF_ID_ALL, and validateSortOrder().

00403 {
00404     myStatus = SamStatus::SUCCESS;
00405 
00406     if(myIsOpenForRead == false)
00407     {
00408         // File is not open for read
00409         myStatus.setStatus(SamStatus::FAIL_ORDER, 
00410                            "Cannot read record since the file is not open for reading");
00411         throw(std::runtime_error("SOFTWARE BUG: trying to read a SAM/BAM record prior to opening the file."));
00412         return(false);
00413     }
00414 
00415     if(myHasHeader == false)
00416     {
00417         // The header has not yet been read.
00418         // TODO - maybe just read the header.
00419         myStatus.setStatus(SamStatus::FAIL_ORDER, 
00420                            "Cannot read record since the header has not been read.");
00421         throw(std::runtime_error("SOFTWARE BUG: trying to read a SAM/BAM record prior to reading the header."));
00422         return(false);
00423     }
00424 
00425     // Check to see if a new region has been set.  If so, determine the
00426     // chunks for that region.
00427     if(myNewSection)
00428     {
00429         if(!processNewSection(header))
00430         {
00431             // Failed processing a new section.  Could be an 
00432             // order issue like the file not being open or the
00433             // indexed file not having been read.
00434             // processNewSection sets myStatus with the failure reason.
00435             return(false);
00436         }
00437     }
00438 
00439     // Check to see if the file should be read by index.
00440     if(myRefID != BamIndex::REF_ID_ALL)
00441     {
00442         // Reference ID is set, so read by index.
00443         return(readIndexedRecord(header, record));
00444     }
00445 
00446     // File is open for reading and the header has been read, so read the next
00447     // record.
00448     myInterfacePtr->readRecord(myFilePtr, header, record, myStatus);
00449     if(myStatus == SamStatus::SUCCESS)
00450     {
00451         // A record was successfully read, so increment the record count.
00452         myRecordCount++;
00453 
00454         if(myStatistics != NULL)
00455         {
00456             // Statistics should be updated.
00457             myStatistics->updateStatistics(record);
00458         }
00459 
00460         // Successfully read the record, so check the sort order.
00461         if(!validateSortOrder(record, header))
00462         {
00463             // ValidateSortOrder sets the status on a failure.
00464             return(false);
00465         }
00466         return(true);
00467     }
00468     // Failed to read the record.
00469     return(false);
00470 }

bool SamFile::SetReadSection ( const char *  refName, int32_t  start, int32_t  end )

Sets what part of the BAM file should be read.

This version will set it to only read a specific reference name and start/end position. The records for this section will be retrieved on each ReadRecord call. When all records have been retrieved for the specified section, ReadRecord will return failure until a new read section is set. Must be called only after the file has been opened for reading.

Parameters:
refName the reference name of the records to read from the file.
start inclusive 0-based start position of records that should be read for this reference name.
end exclusive 0-based end position of records that should be read for this reference name.
Returns:
true = success; false = failure.

Definition at line 579 of file SamFile.cpp.

References myIsBamOpenForRead, myStatus, BamIndex::REF_ID_ALL, and BamIndex::REF_ID_UNMAPPED.

00580 {
00581     // If there is not a BAM file open for reading, return failure.
00582     // Opening a new file clears the read section, so it must be
00583     // set after the file is opened.
00584     if(!myIsBamOpenForRead)
00585     {
00586         // There is not a BAM file open for reading.
00587         myStatus.setStatus(SamStatus::FAIL_ORDER, 
00588                            "Canot set section since there is no bam file open");
00589         return(false);
00590     }
00591 
00592     myNewSection = true;
00593     myStartPos = start;
00594     myEndPos = end;
00595     if((strcmp(refName, "") == 0) || (strcmp(refName, "*") == 0))
00596     {
00597         // No Reference name specified, so read just the "-1" entries.
00598         myRefID = BamIndex::REF_ID_UNMAPPED;
00599     }
00600     else
00601     {
00602         // save the reference name and revert the reference ID to unknown
00603         // so it will be calculated later.
00604         myRefName = refName;
00605         myRefID = BamIndex::REF_ID_ALL;
00606     }
00607     myChunksToRead.clear();
00608     // Reset the end of the current chunk.  We are resetting our read, so
00609     // we no longer have a "current chunk" that we are reading.
00610     myCurrentChunkEnd = 0;
00611     myStatus = SamStatus::SUCCESS;
00612     
00613     return(true);
00614 }
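
The way the reference name is interpreted in the listing above can be summarised in a few lines. This sketch is illustrative only: `DemoSelection` and `selectByName` are hypothetical names, and how a stored name is later resolved to a reference ID is not shown in this section.

```cpp
#include <cstring>

// Hypothetical labels for the two outcomes of the name check.
enum class DemoSelection { UNMAPPED_READS, NAMED_REFERENCE };

DemoSelection selectByName(const char* refName)
{
    // "" and "*" both select the reads that have no reference.
    if (std::strcmp(refName, "") == 0 || std::strcmp(refName, "*") == 0) {
        return DemoSelection::UNMAPPED_READS;
    }
    // Any other name is stored and resolved to a reference ID later.
    return DemoSelection::NAMED_REFERENCE;
}
```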

bool SamFile::SetReadSection ( int32_t  refID, int32_t  start, int32_t  end )

Sets what part of the BAM file should be read.

This version will set it to only read a specific reference id and start/end position. The records for this section will be retrieved on each ReadRecord call. When all records have been retrieved for the specified section, ReadRecord will return failure until a new read section is set. Must be called only after the file has been opened for reading.

Parameters:
refID the reference ID of the records to read from the file.
start inclusive 0-based start position of records that should be read for this refID.
end exclusive 0-based end position of records that should be read for this refID.
Returns:
true = success; false = failure.

Definition at line 550 of file SamFile.cpp.

References myIsBamOpenForRead, and myStatus.

00551 {
00552     // If there is not a BAM file open for reading, return failure.
00553     // Opening a new file clears the read section, so it must be
00554     // set after the file is opened.
00555     if(!myIsBamOpenForRead)
00556     {
00557         // There is not a BAM file open for reading.
00558         myStatus.setStatus(SamStatus::FAIL_ORDER, 
00559                            "Canot set section since there is no bam file open");
00560         return(false);
00561     }
00562 
00563     myNewSection = true;
00564     myStartPos = start;
00565     myEndPos = end;
00566     myRefID = refID;
00567     myRefName.clear();
00568     myChunksToRead.clear();
00569     // Reset the end of the current chunk.  We are resetting our read, so
00570     // we no longer have a "current chunk" that we are reading.
00571     myCurrentChunkEnd = 0;
00572     myStatus = SamStatus::SUCCESS;
00573     
00574     return(true);
00575 }
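
Because start is inclusive and end is exclusive, both 0-based, a region given in 1-based inclusive coordinates (the convention of the POS field in SAM text) has to be shifted before it is passed in. A minimal sketch of that conversion follows; `DemoRegion`, `fromOneBasedInclusive` and `contains` are hypothetical helpers and not part of the library.

```cpp
#include <cstdint>

// Hypothetical value type for a 0-based, half-open region [start, end).
struct DemoRegion {
    std::int32_t start;  // inclusive, 0-based
    std::int32_t end;    // exclusive, 0-based
};

// Convert a 1-based, inclusive range [first, last] to [start, end).
// Only the start moves: the exclusive 0-based end equals the inclusive
// 1-based end.
DemoRegion fromOneBasedInclusive(std::int32_t first, std::int32_t last)
{
    return DemoRegion{first - 1, last};
}

// True when a 0-based position lies inside the region.
bool contains(const DemoRegion& region, std::int32_t pos)
{
    return pos >= region.start && pos < region.end;
}
```

For the 1-based range 100 to 200 this gives start = 99 and end = 200, which covers 101 positions.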

bool SamFile::SetReadSection ( const char *  refName  ) 

Sets what part of the BAM file should be read.

This version will set it to only read a specific reference name. The records for that reference name will be retrieved on each ReadRecord call. When all records have been retrieved for the specified reference name, ReadRecord will return failure until a new read section is set. Must be called only after the file has been opened for reading.

Parameters:
refName the reference name of the records to read from the file.
Returns:
true = success; false = failure.

Definition at line 542 of file SamFile.cpp.

References SetReadSection().

00543 {
00544     // No start/end specified, so set back to default -1.
00545     return(SetReadSection(refName, -1, -1));
00546 }

bool SamFile::SetReadSection ( int32_t  refID  ) 

Sets what part of the BAM file should be read.

This version will set it to only read a specific reference id. The records for that reference id will be retrieved on each ReadRecord call. When all records have been retrieved for the specified reference id, ReadRecord will return failure until a new read section is set. Must be called only after the file has been opened for reading.

Parameters:
refID the reference ID of the records to read from the file.
Returns:
true = success; false = failure.

Definition at line 533 of file SamFile.cpp.

Referenced by SetReadSection().

00534 {
00535     // No start/end specified, so set back to default -1.
00536     return(SetReadSection(refID, -1, -1));
00537 }

void SamFile::setSortedValidation ( SortedType  sortType  ) 

Set the flag to validate that the file is sorted as it is read/written.

Must be called after the file has been opened.

Definition at line 519 of file SamFile.cpp.

00520 {
00521     mySortedType = sortType;
00522 }

bool SamFile::validateSortOrder ( SamRecord &  record, SamFileHeader &  header ) [protected]

Validate that the record is sorted compared to the previously read record if there is one, according to the specified sort order.

If the sort order is UNSORTED, true is returned.

Definition at line 705 of file SamFile.cpp.

References FLAG, myPrevCoord, myRecordCount, myStatus, QUERY_NAME, BamIndex::REF_ID_UNMAPPED, and UNSORTED.

Referenced by readIndexedRecord(), ReadRecord(), and WriteRecord().

00706 {
00707     bool status = false;
00708     if(mySortedType == UNSORTED)
00709     {
00710         // Unsorted, so nothing to validate, just return true.
00711         status = true;
00712     }
00713     else 
00714     {
00715         // Check to see if mySortedType is based on the header.
00716         if(mySortedType == FLAG)
00717         {
00718             // Determine the sorted type from what was read out of the header.
00719             mySortedType = getSortOrderFromHeader(header);
00720         }
00721 
00722         if(mySortedType == QUERY_NAME)
00723         {
00724             // Validate that it is sorted by query name.
00725             // Get the query name from the record.
00726             const char* readName = record.getReadName();
00727             if(myPrevReadName.compare(readName) > 0)
00728             {
00729                 // The previous name is greater than the new record's name, so
00730                 // return false.
00731                 String errorMessage = "ERROR: File is not sorted at record ";
00732                 errorMessage += myRecordCount;
00733                 myStatus.setStatus(SamStatus::INVALID_SORT, 
00734                                    errorMessage.c_str());
00735                 status = false;
00736             }
00737             else
00738             {
00739                 myPrevReadName = readName;
00740                 status = true;
00741             }
00742         }
00743         else 
00744         {
00745             // Validate that it is sorted by COORDINATES.
00746             // Get the leftmost coordinate and the reference index.
00747             int32_t refID = record.getReferenceID();
00748             int32_t coord = record.get0BasedPosition();
00749             // The unmapped reference id is at the end of a sorted file.
00750             if(refID == BamIndex::REF_ID_UNMAPPED)
00751             {
                // A new reference ID that is for the unmapped reads
                // is always valid.
                status = true;
                myPrevRefID = refID;
                myPrevCoord = coord;
            }
            else if(myPrevRefID == BamIndex::REF_ID_UNMAPPED)
            {
                // Previous reference ID was for unmapped reads, but the
                // current one is not, so this is not sorted.
                String errorMessage = "ERROR: File is not sorted at record ";
                errorMessage += myRecordCount;
                myStatus.setStatus(SamStatus::INVALID_SORT,
                                   errorMessage.c_str());
                status = false;
            }
            else if(refID < myPrevRefID)
            {
                // Current reference ID is less than the previous one,
                // meaning that it is not sorted.
                String errorMessage = "ERROR: File is not sorted at record ";
                errorMessage += myRecordCount;
                myStatus.setStatus(SamStatus::INVALID_SORT,
                                   errorMessage.c_str());
                status = false;
            }
            else
            {
                // The reference IDs are in the correct order.
                if(refID > myPrevRefID)
                {
                    // New reference ID, so set the previous coordinate to -1.
                    myPrevCoord = -1;
                }

                // Check the coordinates.
                if(coord < myPrevCoord)
                {
                    // New coordinate is less than the previous position.
                    String errorMessage = "ERROR: File is not sorted at record ";
                    errorMessage += myRecordCount;
                    myStatus.setStatus(SamStatus::INVALID_SORT,
                                       errorMessage.c_str());
                    status = false;
                }
                else
                {
                    myPrevRefID = refID;
                    myPrevCoord = coord;
                    status = true;
                }
            }
        }
    }

    return(status);
}
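
The coordinate checks in the listing above can be tried in isolation. The following is a minimal sketch, not the library's implementation: the class name CoordinateSortChecker, the constant kRefIdUnmapped (assumed to be -1, standing in for BamIndex::REF_ID_UNMAPPED), and the initial values of the "previous" fields are all assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for BamIndex::REF_ID_UNMAPPED; the value -1 is an
// assumption made for this sketch.
constexpr int32_t kRefIdUnmapped = -1;

// Hypothetical helper that mirrors the branch structure of the listing:
// unmapped reads must come last, reference IDs must not decrease, and
// coordinates must not decrease within one reference.
class CoordinateSortChecker
{
public:
    // Returns true if (refID, coord) may follow the last accepted record.
    bool accept(int32_t refID, int32_t coord)
    {
        if(refID == kRefIdUnmapped)
        {
            // Moving to (or staying in) the unmapped reads is always valid.
            myPrevRefID = refID;
            myPrevCoord = coord;
            return true;
        }
        if(myPrevRefID == kRefIdUnmapped)
        {
            // A mapped read after an unmapped read is out of order.
            return false;
        }
        if(refID < myPrevRefID)
        {
            // Reference IDs went backwards.
            return false;
        }
        if(refID > myPrevRefID)
        {
            // New reference, so any coordinate is acceptable as the first.
            myPrevCoord = -1;
        }
        if(coord < myPrevCoord)
        {
            // Coordinates went backwards within the same reference.
            return false;
        }
        myPrevRefID = refID;
        myPrevCoord = coord;
        return true;
    }

private:
    int32_t myPrevRefID = 0;   // assumed initial state
    int32_t myPrevCoord = -1;  // assumed initial state
};
```

Equal coordinates on the same reference are accepted, because the comparison is a strict less-than. A rejected record leaves the stored reference ID unchanged, so later records are still compared against the last accepted one.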

bool SamFile::WriteHeader ( SamFileHeader & header  ) 

Writes the specified header into the file.

Returns:
true = success; false = failure.

Definition at line 367 of file SamFile.cpp.

References myHasHeader, myIsOpenForWrite, and myStatus.

{
    if(myIsOpenForWrite == false)
    {
        // File is not open for writing.
        myStatus.setStatus(SamStatus::FAIL_ORDER,
                           "Cannot write header since the file is not open for writing");
        return(false);
    }

    if(myHasHeader == true)
    {
        // The header has already been written.
        myStatus.setStatus(SamStatus::FAIL_ORDER,
                           "Cannot write header since it has already been written");
        return(false);
    }

    myStatus = myInterfacePtr->writeHeader(myFilePtr, header);
    if(myStatus == SamStatus::SUCCESS)
    {
        // The header has now been successfully written.
        myHasHeader = true;
        return(true);
    }

    // The write failed; myStatus holds the failure status.
    return(false);
}

bool SamFile::WriteRecord ( SamFileHeader & header,
SamRecord & record 
)

Writes the specified record into the file.

Returns:
true = success; false = failure.

Definition at line 475 of file SamFile.cpp.

References myHasHeader, myIsOpenForWrite, myRecordCount, myStatus, and validateSortOrder().

{
    if(myIsOpenForWrite == false)
    {
        // File is not open for writing
        myStatus.setStatus(SamStatus::FAIL_ORDER,
                           "Cannot write record since the file is not open for writing");
        return(false);
    }

    if(myHasHeader == false)
    {
        // The header has not yet been written.
        myStatus.setStatus(SamStatus::FAIL_ORDER,
                           "Cannot write record since the header has not been written");
        return(false);
    }

    // Before trying to write the record, validate the sort order.
    if(!validateSortOrder(record, header))
    {
        // Not sorted like it is supposed to be, do not write the record
        myStatus.setStatus(SamStatus::INVALID_SORT,
                           "Cannot write the record since the file is not properly sorted.");
        return(false);
    }

    // File is open for writing and the header has been written, so write the
    // record.
    myStatus = myInterfacePtr->writeRecord(myFilePtr, header, record);

    if(myStatus == SamStatus::SUCCESS)
    {
        // A record was successfully written, so increment the record count.
        myRecordCount++;
        return(true);
    }
    return(false);
}
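
WriteHeader and WriteRecord together enforce a call order: the file must be open for writing, the header is written exactly once, and records follow only after the header. The sketch below reproduces just that ordering logic with an in-memory stand-in. OrderedWriter, its method names, and the Status enum are hypothetical and are not part of the SamFile API; sort validation is deliberately left out.

```cpp
#include <string>
#include <vector>

// Hypothetical status codes that mirror the SamStatus values used above.
enum class Status { SUCCESS, FAIL_ORDER };

// Hypothetical in-memory writer. It is not the SamFile class; it only
// reproduces the ordering checks: open, then header (once), then records.
class OrderedWriter
{
public:
    // Put the writer into the "open for writing" state and reset counters.
    void open()
    {
        myIsOpenForWrite = true;
        myHasHeader = false;
        myRecordCount = 0;
        myLines.clear();
    }

    bool writeHeader(const std::string& header)
    {
        if(!myIsOpenForWrite || myHasHeader)
        {
            // Not open for writing, or the header was already written.
            myStatus = Status::FAIL_ORDER;
            return false;
        }
        myLines.push_back(header);
        myHasHeader = true;
        myStatus = Status::SUCCESS;
        return true;
    }

    bool writeRecord(const std::string& record)
    {
        if(!myIsOpenForWrite || !myHasHeader)
        {
            // Not open for writing, or the header has not been written yet.
            myStatus = Status::FAIL_ORDER;
            return false;
        }
        myLines.push_back(record);
        ++myRecordCount;  // count only records that were actually written
        myStatus = Status::SUCCESS;
        return true;
    }

    Status status() const { return myStatus; }
    unsigned recordCount() const { return myRecordCount; }

private:
    bool myIsOpenForWrite = false;
    bool myHasHeader = false;
    unsigned myRecordCount = 0;
    Status myStatus = Status::SUCCESS;
    std::vector<std::string> myLines;
};
```

As in the listings above, a failed call leaves the header flag and record count untouched, so a caller can inspect the status, correct the call order, and continue.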


Member Data Documentation

bool SamFile::myHasHeader [protected]

Flag indicating whether a header has been read or written; this must happen before any record can be read or written.

Definition at line 215 of file SamFile.h.

Referenced by ReadHeader(), ReadRecord(), WriteHeader(), and WriteRecord().


The documentation for this class was generated from the following files:
SamFile.h
SamFile.cpp
Generated on Wed Nov 17 15:38:38 2010 for StatGen Software by doxygen 1.6.3