SamRecord Class Reference

Class providing an easy-to-use interface to get/set/operate on the fields in a SAM/BAM record.

#include <SamRecord.h>

Public Types

enum  SequenceTranslation { NONE, EQUAL, BASES }
 Enum containing the settings on how to translate the sequence if a reference is available.

Public Member Functions

 SamRecord ()
 Default Constructor.
 SamRecord (ErrorHandler::HandlingType errorHandlingType)
 Constructor that sets the error handling type.
 ~SamRecord ()
 Destructor.
void resetRecord ()
 Reset the fields of the record to a default value.
bool isValid (SamFileHeader &header)
 Returns whether or not the record is valid, setting the status to indicate success or failure.
void setReference (GenomeSequence *reference)
 Set the reference to the specified genome sequence object.
void setSequenceTranslation (SequenceTranslation translation)
 Set the type of sequence translation to use when getting the sequence.
const SamStatus & getStatus ()
 Returns the status associated with the last method that sets the status.
Set Alignment Data

Set methods for record fields.

All of the "set" methods set the status to indicate success or the failure reason.

bool setReadName (const char *readName)
 Set QNAME to the passed in name.
bool setFlag (uint16_t flag)
 Set the bitwise FLAG to the specified value.
bool setReferenceName (SamFileHeader &header, const char *referenceName)
 Set the reference sequence name (RNAME) to the specified name, using the header to determine the reference id.
bool set1BasedPosition (int32_t position)
 Set the leftmost position (POS) using the specified 1-based (SAM format) value.
bool set0BasedPosition (int32_t position)
 Set the leftmost position using the specified 0-based (BAM format) value.
bool setMapQuality (uint8_t mapQuality)
 Set the mapping quality (MAPQ).
bool setCigar (const char *cigar)
 Set the CIGAR to the specified SAM formatted cigar string.
bool setCigar (const Cigar &cigar)
 Set the CIGAR to the specified Cigar object.
bool setMateReferenceName (SamFileHeader &header, const char *mateReferenceName)
 Set the mate/next fragment's reference sequence name (RNEXT) to the specified name, using the header to determine the mate reference id.
bool set1BasedMatePosition (int32_t matePosition)
 Set the mate/next fragment's leftmost position (PNEXT) using the specified 1-based (SAM format) value.
bool set0BasedMatePosition (int32_t matePosition)
 Set the mate/next fragment's leftmost position using the specified 0-based (BAM format) value.
bool setInsertSize (int32_t insertSize)
 Sets the inferred insert size (ISIZE)/observed template length (TLEN).
bool setSequence (const char *seq)
 Sets the sequence (SEQ) to the specified SAM formatted sequence string.
bool setQuality (const char *quality)
 Sets the quality (QUAL) to the specified SAM formatted quality string.
bool shiftIndelsLeft ()
 Shift the indels (if any) to the left by updating the CIGAR.
SamStatus::Status setBuffer (const char *fromBuffer, uint32_t fromBufferSize, SamFileHeader &header)
 Sets the SamRecord to contain the information in the BAM formatted fromBuffer.
SamStatus::Status setBufferFromFile (IFILE filePtr, SamFileHeader &header)
 Read the BAM record from a file.
Set Tag Data

Set methods for tags.

bool addIntTag (const char *tag, int32_t value)
 Add the specified integer tag to the record.
bool addTag (const char *tag, char vtype, const char *value)
 Add the specified tag,vtype,value to the record.
void clearTags ()
 Clear the tags in this record.
bool rmTag (const char *tag, char type)
 Remove a tag.
bool rmTags (const char *tags)
 Remove tags.
Get Alignment Data

Get methods for record fields.

All of the "get" methods set the status to indicate success or the failure reason.

const void * getRecordBuffer ()
 Get a const pointer to the buffer that contains the BAM representation of the record.
const void * getRecordBuffer (SequenceTranslation translation)
 Get a const pointer to the buffer that contains the BAM representation of the record using the specified translation on the sequence.
SamStatus::Status writeRecordBuffer (IFILE filePtr)
 Write the record as a BAM into the specified already opened file.
SamStatus::Status writeRecordBuffer (IFILE filePtr, SequenceTranslation translation)
 Write the record as a BAM into the specified already opened file using the specified translation on the sequence.
int32_t getBlockSize ()
 Get the block size of the record (BAM format).
const char * getReferenceName ()
 Get the reference sequence name (RNAME) of the record.
int32_t getReferenceID ()
 Get the reference sequence id of the record (BAM format rid).
int32_t get1BasedPosition ()
 Get the 1-based (SAM) leftmost position (POS) of the record.
int32_t get0BasedPosition ()
 Get the 0-based (BAM) leftmost position of the record.
uint8_t getReadNameLength ()
 Get the length of the readname (QNAME) including the null.
uint8_t getMapQuality ()
 Get the mapping quality (MAPQ) of the record.
uint16_t getBin ()
 Get the BAM bin for the record.
uint16_t getCigarLength ()
 Get the length of the BAM formatted CIGAR.
uint16_t getFlag ()
 Get the flag (FLAG).
int32_t getReadLength ()
 Get the length of the read.
const char * getMateReferenceName ()
 Get the mate/next fragment's reference sequence name (RNEXT).
const char * getMateReferenceNameOrEqual ()
 Get the mate/next fragment's reference sequence name (RNEXT), returning "=" if it is the same as the reference name, unless they are both "*" in which case "*" is returned.
int32_t getMateReferenceID ()
 Get the mate reference id of the record (BAM format: mate_rid/next_refID).
int32_t get1BasedMatePosition ()
 Get the 1-based (SAM) leftmost mate/next fragment's position (PNEXT).
int32_t get0BasedMatePosition ()
 Get the 0-based (BAM) leftmost mate/next fragment's position.
int32_t getInsertSize ()
 Get the inferred insert size of the read pair (ISIZE) or observed template length (TLEN).
int32_t get0BasedAlignmentEnd ()
 Returns the 0-based inclusive rightmost position of the clipped sequence.
int32_t get1BasedAlignmentEnd ()
 Returns the 1-based inclusive rightmost position of the clipped sequence.
int32_t getAlignmentLength ()
 Returns the length of the clipped sequence, returning 0 if the cigar is '*'.
int32_t get0BasedUnclippedStart ()
 Returns the 0-based inclusive left-most position adjusted for clipped bases.
int32_t get1BasedUnclippedStart ()
 Returns the 1-based inclusive left-most position adjusted for clipped bases.
int32_t get0BasedUnclippedEnd ()
 Returns the 0-based inclusive right-most position adjusted for clipped bases.
int32_t get1BasedUnclippedEnd ()
 Returns the 1-based inclusive right-most position adjusted for clipped bases.
const char * getReadName ()
 Returns the SAM formatted Read Name (QNAME).
const char * getCigar ()
 Returns the SAM formatted CIGAR string.
const char * getSequence ()
 Returns the SAM formatted sequence string (SEQ), translating the base as specified by setSequenceTranslation.
const char * getSequence (SequenceTranslation translation)
 Returns the SAM formatted sequence string (SEQ) performing the specified sequence translation.
const char * getQuality ()
 Returns the SAM formatted quality string (QUAL).
char getSequence (int index)
 Get the sequence base at the specified index (0 to readLength - 1), translating the base as specified by setSequenceTranslation.
char getSequence (int index, SequenceTranslation translation)
 Get the sequence base at the specified index (0 to readLength - 1), performing the specified sequence translation.
char getQuality (int index)
 Get the quality character at the specified index (0 to readLength - 1).
Cigar * getCigarInfo ()
 Returns a pointer to the Cigar object associated with this record.
uint32_t getNumOverlaps (int32_t start, int32_t end)
 Return the number of bases in this read that overlap the passed in region.
bool getFields (bamRecordStruct &recStruct, String &readName, String &cigar, String &sequence, String &quality)
 Returns the values of all fields except the tags.
bool getFields (bamRecordStruct &recStruct, String &readName, String &cigar, String &sequence, String &quality, SequenceTranslation translation)
 Returns the values of all fields except the tags using the specified sequence translation.
GenomeSequence * getReference ()
 Returns a pointer to the genome sequence object associated with this record if it was set (NULL if it was not set).
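
The rule described for getMateReferenceNameOrEqual() can be illustrated with a small standalone sketch. The helper name `mateNameOrEqual` and its string parameters are hypothetical and are not part of SamRecord; the function only restates the documented rule and is not the library's implementation.

```cpp
#include <string>

// Hypothetical helper (not part of SamRecord): picks the RNEXT text using
// the rule documented for getMateReferenceNameOrEqual().
std::string mateNameOrEqual(const std::string& refName,
                            const std::string& mateRefName)
{
    // Both names are "*": keep "*" rather than collapsing to "=".
    if (refName == "*" && mateRefName == "*")
    {
        return "*";
    }
    // Same reference name: use the "=" shorthand.
    if (refName == mateRefName)
    {
        return "=";
    }
    // Different references: return the mate's name unchanged.
    return mateRefName;
}
```

Under these assumptions, a read and mate that are both on "chr1" yield "=", while a mate on "chr2" yields "chr2".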

Get Tag Methods

Get methods for obtaining information on tags.



uint32_t getTagLength ()
 Returns the length of the BAM formatted tags.
bool getNextSamTag (char *tag, char &vtype, void **value)
 Get the next tag from the record.
void resetTagIter ()
 Reset the tag iterator to the beginning of the tags.
bool getTagsString (const char *tags, String &returnString, char delim= '\t')
 Get the string representation of the tags from the record, formatted as TAG:TYPE:VALUE<delim>TAG:TYPE:VALUE.
const String * getStringTag (const char *tag)
 Get the string value for the specified tag.
int * getIntegerTag (const char *tag)
 Get the integer value for the specified tag.
double * getDoubleTag (const char *tag)
 Get the double value for the specified tag.
String & getString (const char *tag)
 Get the string value for the specified tag.
int & getInteger (const char *tag)
 Get the integer value for the specified tag.
double & getDouble (const char *tag)
 Get the double value for the specified tag.
bool checkString (const char *tag)
 Check if the specified tag contains a string.
bool checkInteger (const char *tag)
 Check if the specified tag contains an integer.
bool checkDouble (const char *tag)
 Check if the specified tag contains a double.
bool checkTag (const char *tag, char type)
 Check if the specified tag contains a value of the specified vtype.
static bool isIntegerType (char vtype)
 Returns whether or not the specified vtype is an integer type.
static bool isDoubleType (char vtype)
 Returns whether or not the specified vtype is a double type.
static bool isCharType (char vtype)
 Returns whether or not the specified vtype is a char type.
static bool isStringType (char vtype)
 Returns whether or not the specified vtype is a string type.
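
The TAG:TYPE:VALUE<delim>TAG:TYPE:VALUE layout described for getTagsString() can be sketched as follows. This is only an illustration of the output format: the `TagText` struct and `joinTags` function are hypothetical names introduced here and do not exist in SamRecord.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical plain-text view of one tag (not a SamRecord type).
struct TagText
{
    std::string tag;   // two-character tag, e.g. "NM"
    char type;         // vtype character, e.g. 'i' or 'Z'
    std::string value; // value already rendered as text
};

// Joins tags as TAG:TYPE:VALUE, separated by delim (tab by default,
// matching the default delim in the getTagsString signature).
std::string joinTags(const std::vector<TagText>& tags, char delim = '\t')
{
    std::string out;
    for (std::size_t i = 0; i < tags.size(); ++i)
    {
        if (i != 0)
        {
            out += delim; // delimiter goes between entries only
        }
        out += tags[i].tag;
        out += ':';
        out += tags[i].type;
        out += ':';
        out += tags[i].value;
    }
    return out;
}
```

No delimiter is written before the first entry or after the last one, which matches the format string given in the member description.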

Detailed Description

Class providing an easy to use interface to get/set/operate on the fields in a SAM/BAM record.

Definition at line 51 of file SamRecord.h.


Member Enumeration Documentation

enum SamRecord::SequenceTranslation

Enum containing the settings on how to translate the sequence if a reference is available.

If no reference is available, no translation is done.

Enumerator:
NONE 

Leave the sequence as is.

EQUAL 

Translate bases that match the reference to '='.

BASES 

Translate '=' to the actual base.

Definition at line 57 of file SamRecord.h.

00057                              { 
00058         NONE,   ///< Leave the sequence as is.
00059         EQUAL,  ///< Translate bases that match the reference to '='
00060         BASES,  ///< Translate '=' to the actual base.
00061     };
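
To make the three settings concrete, here is a minimal standalone sketch of the translation applied to a read that is already lined up base-for-base with its reference. It assumes an ungapped alignment and exact, case-sensitive comparison; the real class works through the CIGAR and a GenomeSequence object, neither of which is modeled here. The names `Translation` and `translateSequence` are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for SamRecord::SequenceTranslation.
enum class Translation { NONE, EQUAL, BASES };

// Translates `read` against `ref`, assuming read[i] aligns to ref[i].
// If the reference is missing or too short, nothing is translated.
std::string translateSequence(const std::string& read,
                              const std::string& ref,
                              Translation mode)
{
    std::string out = read;
    if (mode == Translation::NONE || ref.size() < read.size())
    {
        return out;
    }
    for (std::size_t i = 0; i < read.size(); ++i)
    {
        if (mode == Translation::EQUAL && read[i] == ref[i])
        {
            out[i] = '=';    // matching base becomes '='
        }
        else if (mode == Translation::BASES && read[i] == '=')
        {
            out[i] = ref[i]; // '=' becomes the reference base
        }
    }
    return out;
}
```

With the reference "ACCT", EQUAL turns the read "ACGT" into "==G=", and BASES turns "==G=" back into "ACGT".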


Constructor & Destructor Documentation

SamRecord::SamRecord ( ErrorHandler::HandlingType  errorHandlingType  ) 

Constructor that sets the error handling type.

Parameters:
errorHandlingType how to handle errors.

Definition at line 53 of file SamRecord.cpp.

References resetRecord().

00054     : myStatus(errorHandlingType),
00055       myRefPtr(NULL),
00056       mySequenceTranslation(NONE)
00057 {
00058     int32_t defaultAllocSize = DEFAULT_BLOCK_SIZE + sizeof(int32_t);
00059 
00060     myRecordPtr = 
00061         (bamRecordStruct *) malloc(defaultAllocSize);
00062 
00063     myCigarTempBuffer = NULL;
00064     myCigarTempBufferAllocatedSize = 0;
00065 
00066     allocatedSize = defaultAllocSize;
00067 
00068     resetRecord();
00069 }


Member Function Documentation

bool SamRecord::addIntTag ( const char *  tag,
int32_t  value 
)

Add the specified integer tag to the record.

Internal processing handles switching between SAM/BAM formats when the record is read or written, and determines the type to use for BAM format. If the tag already exists, its value is replaced when the specified value differs.

Parameters:
tag two character tag to be added to the SAM/BAM record.
value value for the specified tag.
Returns:
true if the tag was successfully added, false otherwise.

Definition at line 636 of file SamRecord.cpp.

References StatGenStatus::INVALID, StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

Referenced by addTag().

00637 {
00638     myStatus = SamStatus::SUCCESS;
00639     int key = 0;
00640     int index = 0;
00641     char bamvtype;
00642 
00643     int tagBufferSize = 0;
00644 
00645     // First check to see if the tags need to be synced to the buffer.
00646     if(myNeedToSetTagsFromBuffer)
00647     {
00648         if(!setTagsFromBuffer())
00649         {
00650             // Failed to read tags from the buffer, so cannot add new ones.
00651             return(false);
00652         }
00653     }
00654 
00655     // Ints come in as int.  But it can be represented in fewer bits.
00656     // So determine a more specific type that is in line with the
00657     // types for BAM files.
00658     // First check to see if it is a negative.
00659     if(value < 0)
00660     {
00661         // The int is negative, so it will need to use a signed type.
00662         // See if it is greater than the min value for a char.
00663         if(value > std::numeric_limits<char>::min())
00664         {
00665             // It can be stored in a signed char.
00666             bamvtype = 'c';
00667             tagBufferSize += 4;
00668         }
00669         else if(value > std::numeric_limits<short>::min())
00670         {
00671             // It fits in a signed short.
00672             bamvtype = 's';
00673             tagBufferSize += 5;
00674         }
00675         else
00676         {
00677             // Just store it as a signed int.
00678             bamvtype = 'i';
00679             tagBufferSize += 7;
00680         }
00681     }
00682     else
00683     {
00684         // It is positive, so an unsigned type can be used.
00685         if(value < std::numeric_limits<unsigned char>::max())
00686         {
00687             // It is under the max of an unsigned char.
00688             bamvtype = 'C';
00689             tagBufferSize += 4;
00690         }
00691         else if(value < std::numeric_limits<unsigned short>::max())
00692         {
00693             // It is under the max of an unsigned short.
00694             bamvtype = 'S';
00695             tagBufferSize += 5;
00696         }
00697         else
00698         {
00699             // Just store it as an unsigned int.
00700             bamvtype = 'I';
00701             tagBufferSize += 7;
00702         }
00703     }
00704 
00705     // Check to see if the tag is already there.
00706     key = MAKEKEY(tag[0], tag[1], bamvtype);
00707     unsigned int hashIndex = extras.Find(key);
00708     if(hashIndex != LH_NOTFOUND)
00709     {
00710         // Tag was already found.
00711         index = extras[hashIndex];
00712 
00713         // First check to see if the value changed.
00714         if((integers[index] == value) && (intType[index] == bamvtype))
00715         {
00716             // The value has not changed, so do nothing.
00717             return(true);
00718         }
00719         else
00720         {
00721             // Not the same value, so adjust the settings.
00722             // Subtract the size of the previous tag from tagBufferSize to get
00723             // the adjusted size.
00724             switch(intType[index])
00725             {
00726                 case 'c':
00727                 case 'C':
00728                     tagBufferSize -= 4;
00729                     break;
00730                 case 's':
00731                 case 'S':
00732                     tagBufferSize -= 5;
00733                     break;
00734                 case 'i':
00735                 case 'I':
00736                     tagBufferSize -= 7;
00737                     break;
00738                 default:
00739                 myStatus.setStatus(SamStatus::INVALID, 
00740                                    "unknown tag inttype type found.\n");
00741                 return(false);              
00742             }
00743             
00744             // Update the integer value and type.
00745             integers[index] = value;
00746             intType[index] = bamvtype;
00747         }
00748     }
00749     else
00750     {
00751         // Tag is not already there, so add it.
00752         index = integers.Length();
00753         
00754         integers.Push(value);
00755         intType.push_back(bamvtype);
00756 
00757         extras.Add(key, index);
00758     }
00759 
00760     // The buffer tags are now out of sync.
00761     myNeedToSetTagsInBuffer = true;
00762     myIsTagsBufferValid = false;
00763     myIsBufferSynced = false;
00764     myTagBufferSize += tagBufferSize;
00765 
00766     return(true);
00767 }
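
The type selection in the listing above can be condensed into a small standalone sketch. It follows the same strict comparisons, so a value equal to a type's limit falls through to the next wider type, and it reports the same byte counts (4, 5, 7) that the listing adds to the tag buffer size. One difference is an assumption of this sketch: it uses the fixed-width limits of `std::int8_t` and friends instead of `char` and `short`. The names `IntTagEncoding` and `chooseIntTagEncoding` are hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical result type: BAM vtype plus the bytes the tag occupies.
struct IntTagEncoding
{
    char bamType;
    int bytes;
};

// Picks the narrowest BAM integer vtype for value, using the same strict
// comparisons as the listing above.
IntTagEncoding chooseIntTagEncoding(std::int32_t value)
{
    if (value < 0)
    {
        // Negative values need a signed type.
        if (value > std::numeric_limits<std::int8_t>::min())
        {
            return {'c', 4};
        }
        if (value > std::numeric_limits<std::int16_t>::min())
        {
            return {'s', 5};
        }
        return {'i', 7};
    }
    // Non-negative values can use an unsigned type.
    if (value < std::numeric_limits<std::uint8_t>::max())
    {
        return {'C', 4};
    }
    if (value < std::numeric_limits<std::uint16_t>::max())
    {
        return {'S', 5};
    }
    return {'I', 7};
}
```

Because the comparisons are strict, 255 is stored as 'S' rather than 'C' and -128 as 's' rather than 'c'; this costs one or two bytes at the boundaries but never selects a type that is too narrow.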

bool SamRecord::addTag ( const char *  tag,
char  vtype,
const char *  value 
)

Add the specified tag,vtype,value to the record.

The vtype may be given in either SAM or BAM format. Internal processing handles switching between SAM/BAM formats when the record is read or written. If the tag already exists, its value is replaced when the specified value differs.

Parameters:
tag two character tag to be added to the SAM/BAM record.
vtype vtype of the specified value, either a SAM or a BAM vtype.
value value as a string for the specified tag.
Returns:
true if the tag was successfully added, false otherwise.

Definition at line 773 of file SamRecord.cpp.

References addIntTag(), StatGenStatus::FAIL_PARSE, StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

00774 {
00775     if(vtype == 'i')
00776     {
00777         // integer type.  Call addIntTag to handle it.
00778         int intVal = atoi(valuePtr);
00779         return(addIntTag(tag, intVal));
00780     }
00781 
00782     // Non-int type.
00783     myStatus = SamStatus::SUCCESS;
00784     bool status = true; // default to successful.
00785     int key = 0;
00786     int index = 0;
00787 
00788     int tagBufferSize = 0;
00789 
00790     // First check to see if the tags need to be synced to the buffer.
00791     if(myNeedToSetTagsFromBuffer)
00792     {
00793         if(!setTagsFromBuffer())
00794         {
00795             // Failed to read tags from the buffer, so cannot add new ones.
00796             return(false);
00797         }
00798     }
00799 
00800     // First check to see if the tag is already there.
00801     key = MAKEKEY(tag[0], tag[1], vtype);
00802     unsigned int hashIndex = extras.Find(key);
00803     if(hashIndex == LH_NOTFOUND)
00804     {
00805         // The key was not found.  If this is a 'B'/'Z', check for
00806         // the key with 'Z'/'B'.
00807         if(vtype == 'Z')
00808         {
00809             int tmpkey = MAKEKEY(tag[0], tag[1], 'B');
00810             hashIndex = extras.Find(tmpkey);
00811         }
00812         else if(vtype == 'B')
00813         {
00814             int tmpkey = MAKEKEY(tag[0], tag[1], 'Z');
00815             hashIndex = extras.Find(tmpkey);
00816         }
00817     }
00818     if(hashIndex != LH_NOTFOUND)
00819     {
00820         // The key was found in the hash, so get the lookup index.
00821         index = extras[hashIndex];
00822 
00823         // Adjust the currently pointed to value to the new setting.
00824         switch (vtype)
00825         {
00826             case 'A' :
00827                 // First check to see if the value changed.
00828                 if(integers[index] == (const int)*(valuePtr))
00829                 {
00830                     // The value has not changed, so do nothing.
00831                     return(true);
00832                 }
00833                 else
00834                 {
00835                     // Tag buffer size doesn't change between different 'A' entries.
00836                     integers[index] = (const int)*(valuePtr);
00837                     intType[index] = vtype;
00838                 }
00839                 break;
00840             case 'Z' :
00841             case 'B' :
00842                 // First check to see if the value changed.
00843                 if(strings[index] == valuePtr)
00844                 {
00845                     // The value has not changed, so do nothing.
00846                     return(true);
00847                 }
00848                 else
00849                 {
00850                     // Adjust the tagBufferSize by removing the size of the old string.
00851                     tagBufferSize -= strings[index].Length();
00852                     strings[index] = valuePtr;
00853                     // Adjust the tagBufferSize by adding the size of the new string.
00854                     tagBufferSize += strings[index].Length();
00855                 }
00856                 break;
00857             case 'f' :
00858                 // First check to see if the value changed.
00859                 if(doubles[index] == atof(valuePtr))
00860                 {
00861                     // The value has not changed, so do nothing.
00862                     return(true);
00863                 }
00864                 else
00865                 {
00866                     // Tag buffer size doesn't change between different 'f' entries.
00867                     doubles[index] = atof(valuePtr);
00868                 }
00869                 break;
00870             default :
00871                 fprintf(stderr,
00872                         "samRecord::addTag() - Unknown custom field of type %c\n",
00873                         vtype);
00874                 myStatus.setStatus(SamStatus::FAIL_PARSE, 
00875                                    "Unknown custom field in a tag");
00876                 status = false;
00877                 break;
00878         }
00879     }
00880     else
00881     {
00882         // The key was not found in the hash, so add it.
00883         switch (vtype)
00884         {
00885             case 'A' :
00886                 index = integers.Length();
00887                 integers.Push((const int)*(valuePtr));
00888                 intType.push_back(vtype);
00889                 tagBufferSize += 4;
00890                 break;
00891             case 'Z' :
00892             case 'B' :
00893                 index = strings.Length();
00894                 strings.Push(valuePtr);
00895                 tagBufferSize += 4 + strings.Last().Length();
00896                 break;
00897             case 'f' :
00898                 index = doubles.Length();
00899                 doubles.Push(atof(valuePtr));
00900                 tagBufferSize += 7;
00901                 break;
00902             default :
00903                 fprintf(stderr,
00904                         "samRecord::addTag() - Unknown custom field of type %c\n",
00905                         vtype);
00906                 myStatus.setStatus(SamStatus::FAIL_PARSE, 
00907                                    "Unknown custom field in a tag");
00908                 status = false;
00909                 break;
00910         }
00911         if(status)
00912         {
00913             // If successful, add the key to extras.
00914             extras.Add(key, index);
00915         }
00916     }
00917 
00918     // Only add the tag if it has so far been successfully processed.
00919     if(status)
00920     {
00921         // The buffer tags are now out of sync.
00922         myNeedToSetTagsInBuffer = true;
00923         myIsTagsBufferValid = false;
00924         myIsBufferSynced = false;
00925         myTagBufferSize += tagBufferSize;
00926     }
00927     return(status);
00928 }
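
The size bookkeeping for newly added non-integer tags in the listing above can be summarized in a standalone sketch. The byte counts are taken directly from the listing (4 for 'A', 4 plus the string length for 'Z'/'B', 7 for 'f'); reading them as two tag characters, one type character, and the payload is an interpretation of this sketch, not something the source states. The function name `nonIntTagBytes` is hypothetical.

```cpp
#include <cstring>

// Hypothetical helper: bytes added to the tag buffer when a new non-integer
// tag is inserted, per the listing above. Returns -1 for an unknown vtype,
// where the listing instead sets FAIL_PARSE and returns false.
int nonIntTagBytes(char vtype, const char* value)
{
    switch (vtype)
    {
        case 'A':
            return 4; // single character value
        case 'Z':
        case 'B':
            // Fixed overhead of 4 plus the characters of the string.
            return 4 + static_cast<int>(std::strlen(value));
        case 'f':
            return 7; // fixed-size floating point value
        default:
            return -1;
    }
}
```

When an existing 'Z'/'B' tag is replaced, the listing adjusts the size only by the difference in string length, since the fixed overhead is already counted.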

bool SamRecord::checkDouble ( const char *  tag  )  [inline]

Check if the specified tag contains a double.

Status handling is the same as for checkTag().

Parameters:
tag SAM tag to check contents of.
Returns:
true if the value associated with the tag is a double.

Definition at line 601 of file SamRecord.h.

References checkTag().

00601 { return checkTag(tag, 'f'); }

bool SamRecord::checkInteger ( const char *  tag  )  [inline]

Check if the specified tag contains an integer.

Status handling is the same as for checkTag().

Parameters:
tag SAM tag to check contents of.
Returns:
true if the value associated with the tag is an integer.

Definition at line 595 of file SamRecord.h.

References checkTag().

00595 { return checkTag(tag, 'i'); }

bool SamRecord::checkString ( const char *  tag  )  [inline]

Check if the specified tag contains a string.

Status handling is the same as for checkTag().

Parameters:
tag SAM tag to check contents of.
Returns:
true if the value associated with the tag is a string.

Definition at line 588 of file SamRecord.h.

References checkTag().

00589     { return(checkTag(tag, 'Z') || checkTag(tag, 'B')); }

bool SamRecord::checkTag ( const char *  tag,
char  type 
)

Check if the specified tag contains a value of the specified vtype.

Resets SamStatus to SUCCESS before the lookup; if the tags cannot be read from the buffer, setTagsFromBuffer sets the error.

Parameters:
tag SAM tag to check contents of.
type value type to check if the SAM tag matches.
Returns:
true if the tag is present with a value of the specified type.

Definition at line 2327 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by checkDouble(), checkInteger(), and checkString().

02328 {
02329     // Init to success.
02330     myStatus = SamStatus::SUCCESS;
02331     // Parse the buffer if necessary.
02332     if(myNeedToSetTagsFromBuffer)
02333     {
02334         if(!setTagsFromBuffer())
02335         {
02336             // Failed to read the tags from the buffer, so cannot
02337             // get tags.  setTagsFromBuffer set the error.
02338             return(false);
02339         }
02340     }
02341     
02342     int key = MAKEKEY(tag[0], tag[1], type);
02343 
02344     return (extras.Find(key) != LH_NOTFOUND);
02345 }

void SamRecord::clearTags (  ) 

Clear the tags in this record.

Does not set SamStatus.

Definition at line 931 of file SamRecord.cpp.

References resetTagIter().

Referenced by resetRecord().

00932 {
00933     if(extras.Entries() != 0)
00934     {
00935         extras.Clear();
00936     }
00937     strings.Clear();
00938     integers.Clear();
00939     intType.clear();
00940     doubles.Clear();
00941     myTagBufferSize = 0;
00942     resetTagIter();
00943 }

int32_t SamRecord::get0BasedAlignmentEnd (  ) 

Returns the 0-based inclusive rightmost position of the clipped sequence.

Returns:
0-based inclusive rightmost position

Definition at line 1417 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by get0BasedUnclippedEnd(), get1BasedAlignmentEnd(), Pileup< PILEUP_TYPE, FUNC_CLASS >::processAlignment(), Pileup< PILEUP_TYPE, FUNC_CLASS >::processAlignmentRegion(), and CigarHelper::softClipEndByRefPos().

01418 {
01419     myStatus = SamStatus::SUCCESS;
01420     if(myAlignmentLength == -1)
01421     {
01422         // Alignment end has not been set, so calculate it.
01423         parseCigar();
01424     }
01425     // If alignment length > 0, subtract 1 from it to get the end.
01426     if(myAlignmentLength == 0)
01427     {
01428         // Length is 0, just return the start position.
01429         return(myRecordPtr->myPosition);
01430     }
01431     return(myRecordPtr->myPosition + myAlignmentLength - 1);
01432 }
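
The end-position arithmetic above depends on the alignment length that parseCigar() derives from the CIGAR. A minimal standalone sketch of that calculation follows. It assumes the SAM convention that the M, D, N, = and X operations consume reference bases, and it parses a SAM-formatted CIGAR string directly; how parseCigar() itself is implemented is not shown in this section. The names `referenceSpan` and `alignmentEnd0Based` are hypothetical.

```cpp
#include <cctype>
#include <cstdint>
#include <string>

// Sums the lengths of the CIGAR operations that consume reference bases.
// A CIGAR of "*" yields 0.
std::int32_t referenceSpan(const std::string& cigar)
{
    std::int32_t span = 0;
    std::int32_t count = 0;
    for (char c : cigar)
    {
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            count = count * 10 + (c - '0'); // accumulate operation length
            continue;
        }
        if (c == 'M' || c == 'D' || c == 'N' || c == '=' || c == 'X')
        {
            span += count;
        }
        count = 0; // start the next operation
    }
    return span;
}

// Mirrors the listing: an empty span returns the start position itself,
// otherwise the inclusive end is start + span - 1.
std::int32_t alignmentEnd0Based(std::int32_t pos0, const std::string& cigar)
{
    const std::int32_t span = referenceSpan(cigar);
    return (span == 0) ? pos0 : (pos0 + span - 1);
}
```

For a read at 0-based position 100 with CIGAR 5S10M2D3M, the span is 15 reference bases, so the inclusive end is 114; soft clips and insertions do not contribute.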

int32_t SamRecord::get0BasedMatePosition (  ) 

Get the 0-based (BAM) leftmost mate/next fragment's position.

Returns:
0-based leftmost position.

Definition at line 1402 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

01403 {
01404     myStatus = SamStatus::SUCCESS;
01405     return myRecordPtr->myMatePosition;
01406 }

int32_t SamRecord::get0BasedPosition (  ) 

Get the 0-based (BAM) leftmost position of the record.

Returns:
0-based leftmost position.

int32_t SamRecord::get0BasedUnclippedEnd (  ) 

Returns the 0-based inclusive right-most position adjusted for clipped bases.

Returns:
0-based inclusive rightmost position including clips.

Definition at line 1476 of file SamRecord.cpp.

References get0BasedAlignmentEnd().

Referenced by get1BasedUnclippedEnd().

01477 {
01478     // myUnclippedEndOffset will be set by get0BasedAlignmentEnd if the 
01479     // cigar has not yet been parsed, so no need to check it here.
01480     return(get0BasedAlignmentEnd() + myUnclippedEndOffset);
01481 }

int32_t SamRecord::get0BasedUnclippedStart (  ) 

Returns the 0-based inclusive left-most position adjusted for clipped bases.

Returns:
0-based inclusive leftmost position including clips.

Definition at line 1456 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by get1BasedUnclippedStart().

01457 {
01458     myStatus = SamStatus::SUCCESS;
01459     if(myUnclippedStartOffset == -1)
01460     {
01461         // Unclipped has not yet been calculated, so parse the cigar to get it
01462         parseCigar();
01463     }
01464     return(myRecordPtr->myPosition - myUnclippedStartOffset);
01465 }
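
The unclipped start and end are the aligned positions widened by the clipped bases at each end of the CIGAR. The sketch below shows one way the two offsets could be derived from a SAM-formatted CIGAR string. It assumes that both soft clips (S) and hard clips (H) count toward the offsets; whether the library counts hard clips is not stated in this section. The function name `clipOffsets` is hypothetical.

```cpp
#include <cctype>
#include <string>
#include <utility>

// Returns {leading clip length, trailing clip length} for a CIGAR string.
// Assumption: both 'S' and 'H' operations count as clipped bases.
std::pair<int, int> clipOffsets(const std::string& cigar)
{
    int leading = 0;
    int trailing = 0;
    int count = 0;
    bool seenNonClip = false;
    for (char c : cigar)
    {
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            count = count * 10 + (c - '0');
            continue;
        }
        if (c == 'S' || c == 'H')
        {
            // Clips before any other operation are leading, the rest trailing.
            if (seenNonClip)
            {
                trailing += count;
            }
            else
            {
                leading += count;
            }
        }
        else
        {
            seenNonClip = true;
        }
        count = 0;
    }
    return std::make_pair(leading, trailing);
}
```

For a read at 0-based position 100 with CIGAR 5S10M3S, the offsets are 5 and 3, giving an unclipped start of 100 - 5 = 95 and an unclipped end of 109 + 3 = 112.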

int32_t SamRecord::get1BasedAlignmentEnd (  ) 

Returns the 1-based inclusive rightmost position of the clipped sequence.

Returns:
1-based inclusive rightmost position

Definition at line 1436 of file SamRecord.cpp.

References get0BasedAlignmentEnd().

Referenced by getBin().

01437 {
01438     return(get0BasedAlignmentEnd() + 1);
01439 }

int32_t SamRecord::get1BasedMatePosition (  ) 

Get the 1-based (SAM) leftmost mate/next fragment's position (PNEXT).

Returns:
1-based leftmost position.

Definition at line 1395 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

01396 {
01397     myStatus = SamStatus::SUCCESS;
01398     return (myRecordPtr->myMatePosition + 1);
01399 }

int32_t SamRecord::get1BasedPosition (  ) 

Get the 1-based (SAM) leftmost position (POS) of the record.

Returns:
1-based leftmost position.

Definition at line 1262 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by SamValidator::isValid().

01263 {
01264     myStatus = SamStatus::SUCCESS;
01265     return (myRecordPtr->myPosition + 1);
01266 }

int32_t SamRecord::get1BasedUnclippedEnd (  ) 

Returns the 1-based inclusive right-most position adjusted for clipped bases.

Returns:
1-based inclusive rightmost position including clips.

Definition at line 1485 of file SamRecord.cpp.

References get0BasedUnclippedEnd().

01486 {
01487     return(get0BasedUnclippedEnd() + 1);
01488 }

int32_t SamRecord::get1BasedUnclippedStart (  ) 

Returns the 1-based inclusive left-most position adjusted for clipped bases.

Returns:
1-based inclusive leftmost position including clips.

Definition at line 1469 of file SamRecord.cpp.

References get0BasedUnclippedStart().

01470 {
01471     return(get0BasedUnclippedStart() + 1);
01472 }

int32_t SamRecord::getAlignmentLength (  ) 

Returns the length of the clipped sequence, returning 0 if the cigar is '*'.

Returns:
length of the clipped sequence.

Definition at line 1443 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

01444 {
01445     myStatus = SamStatus::SUCCESS;
01446     if(myAlignmentLength == -1)
01447     {
01448         // Alignment end has not been set, so calculate it.
01449         parseCigar();
01450     }
01451     // Return the alignment length.
01452     return(myAlignmentLength);
01453 }

uint16_t SamRecord::getBin (  ) 

Get the BAM bin for the record.

Returns:
BAM bin

Definition at line 1297 of file SamRecord.cpp.

References get1BasedAlignmentEnd(), and StatGenStatus::SUCCESS.

01298 {
01299     myStatus = SamStatus::SUCCESS;
01300     if(!myIsBinValid)
01301     {
01302         // The bin that is set in the record is not valid, so
01303         // reset it.
01304         myRecordPtr->myBin = 
01305             bam_reg2bin(myRecordPtr->myPosition, get1BasedAlignmentEnd());      
01306         myIsBinValid = true;
01307     }
01308     return(myRecordPtr->myBin);
01309 }
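
The listing passes the 0-based start together with the 1-based inclusive end, which is numerically the same as a 0-based exclusive end, to bam_reg2bin. The body of bam_reg2bin is not shown in this section; the sketch below reproduces the bin calculation published in the SAM/BAM specification under a different name, `regionToBin`, as an illustration of what such a function computes rather than as the library's own code.

```cpp
#include <cstdint>

// Bin calculation from the SAM/BAM specification.
// beg is 0-based inclusive, end is 0-based exclusive.
int regionToBin(std::int32_t beg, std::int32_t end)
{
    --end; // convert to an inclusive end
    if (beg >> 14 == end >> 14) return ((1 << 15) - 1) / 7 + (beg >> 14);
    if (beg >> 17 == end >> 17) return ((1 << 12) - 1) / 7 + (beg >> 17);
    if (beg >> 20 == end >> 20) return ((1 << 9) - 1) / 7 + (beg >> 20);
    if (beg >> 23 == end >> 23) return ((1 << 6) - 1) / 7 + (beg >> 23);
    if (beg >> 26 == end >> 26) return ((1 << 3) - 1) / 7 + (beg >> 26);
    return 0; // region spans the top-level bin
}
```

The function returns the smallest bin that fully contains the region: a read that fits inside one 16 kbp window lands in the finest level (bins 4681 and up), while a read that straddles a 16 kbp boundary moves up to the 128 kbp level (bins 585 and up).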

int32_t SamRecord::getBlockSize (  ) 

Get the block size of the record (BAM format).

Returns:
BAM block size of the record.

Definition at line 1231 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

01232 {
01233     myStatus = SamStatus::SUCCESS;
01234     // If the buffer isn't synced, sync the buffer to determine the
01235     // block size.
01236     if(myIsBufferSynced == false)
01237     {
01238         // Since this just returns the block size, the translation of
01239         // the sequence does not matter, so just use the currently set
01240         // value.
01241         fixBuffer(myBufferSequenceTranslation);
01242     }
01243     return myRecordPtr->myBlockSize;
01244 }

const char * SamRecord::getCigar (  ) 

Returns the SAM formatted CIGAR string.

Returns:
cigar string.

Definition at line 1505 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by getFields(), SamValidator::isValidCigar(), CigarHelper::softClipBeginByRefPos(), and CigarHelper::softClipEndByRefPos().

01506 {
01507     myStatus = SamStatus::SUCCESS;
01508     if(myCigar.Length() == 0)
01509     {
01510         // 0 Length, means that it is in the buffer, but has not yet
01511         // been synced to the string, so do the sync.
01512         parseCigarBinary();
01513     }
01514     return myCigar.c_str();
01515 }

Cigar * SamRecord::getCigarInfo (  ) 

Returns a pointer to the Cigar object associated with this record.

The object is essentially read-only, only allowing modifications due to lazy evaluations.

Returns:
pointer to the Cigar object.

Definition at line 1786 of file SamRecord.cpp.

Referenced by PileupElementBaseQual::addEntry(), SamRecordHelper::checkSequence(), SamTags::createMDTag(), getSequence(), SamQuerySeqWithRefIter::reset(), SamFilter::softClip(), CigarHelper::softClipBeginByRefPos(), and CigarHelper::softClipEndByRefPos().

01787 {
01788     // Check to see whether or not the Cigar has already been
01789     // set - this is determined by checking if alignment length
01790     // is set since alignment length and the cigar are set
01791     // at the same time.
01792     if(myAlignmentLength == -1)
01793     {
01794         // Not been set, so calculate it.
01795         parseCigar();
01796     }
01797     return(&myCigarRoller);
01798 }
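
The lazy evaluation used here and in getAlignmentLength() relies on -1 as a "not yet parsed" sentinel. The block below is a minimal standalone sketch of that pattern. LazyCigar and parseCount() are hypothetical names, and the rule that only reference-consuming operations (M, D, N, =, X) contribute to the length is an assumption about what parseCigar() computes.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the record's lazily parsed CIGAR state.
class LazyCigar
{
public:
    explicit LazyCigar(const std::string& cigar)
        : myCigar(cigar), myAlignmentLength(-1), myParseCount(0) {}

    // Parses on first use only; later calls return the cached value.
    int32_t getAlignmentLength()
    {
        if (myAlignmentLength == -1)
        {
            parse();
        }
        return myAlignmentLength;
    }

    // Number of times the string was parsed (for illustration only).
    int parseCount() const { return myParseCount; }

private:
    void parse()
    {
        ++myParseCount;
        myAlignmentLength = 0;  // a cigar of "*" therefore yields 0
        int32_t count = 0;
        for (char c : myCigar)
        {
            if (std::isdigit(static_cast<unsigned char>(c)))
            {
                count = count * 10 + (c - '0');
                continue;
            }
            // Assumption: only reference-consuming operations are counted.
            if (c == 'M' || c == 'D' || c == 'N' || c == '=' || c == 'X')
            {
                myAlignmentLength += count;
            }
            count = 0;
        }
    }

    std::string myCigar;
    int32_t myAlignmentLength;
    int myParseCount;
};
```

Because the cache is filled inside a getter, the getter cannot be const, which matches the note above that the object is read-only apart from lazy evaluation.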

uint16_t SamRecord::getCigarLength (  ) 

Get the length of the BAM formatted CIGAR.

Returns:
length of BAM formatted cigar.

Definition at line 1312 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

01313 {
01314     myStatus = SamStatus::SUCCESS;
01315     // If the cigar buffer is valid
01316     // then get the length from there.
01317     if(myIsCigarBufferValid)
01318     {
01319         return myRecordPtr->myCigarLength;      
01320     }
01321 
01322     if(myCigarTempBufferLength == -1)
01323     {
01324         // The cigar buffer is not valid and the cigar temp buffer is not set,
01325         // so parse the string.
01326         parseCigarString();
01327     }
01328    
01329     // The temp buffer is now set, so return the size.
01330     return(myCigarTempBufferLength);
01331 }

double * SamRecord::getDoubleTag ( const char *  tag  ) 

Get the double value for the specified tag.

Parameters:
tag tag to retrieve
Returns:
pointer to the tag's double value if found, NULL if not found.

Definition at line 2198 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

02199 {
02200     // Init to success.
02201     myStatus = SamStatus::SUCCESS;
02202     // Parse the buffer if necessary.
02203     if(myNeedToSetTagsFromBuffer)
02204     {
02205         if(!setTagsFromBuffer())
02206         {
02207             // Failed to read the tags from the buffer, so cannot
02208             // get tags.  setTagsFromBuffer set the errors,
02209             // so just return null.
02210             return(NULL);
02211         }
02212     }
02213     
02214     int key = MAKEKEY(tag[0], tag[1], 'f');
02215     int offset = extras.Find(key);
02216 
02217     int value;
02218     if (offset < 0)
02219     {
02220         // Failed to find the tag.
02221         return(NULL);
02222     }
02223     else
02224         value = extras[offset];
02225 
02226     return(&(doubles[value]));
02227 }

bool SamRecord::getFields ( bamRecordStruct &  recStruct,
String &  readName,
String &  cigar,
String &  sequence,
String &  quality,
SequenceTranslation  translation 
)

Returns the values of all fields except the tags using the specified sequence translation.

Parameters:
recStruct structure containing the contents of all non-variable length fields.
readName read name from the record (return param)
cigar cigar string from the record (return param)
sequence sequence string from the record (return param)
quality quality string from the record (return param)
translation type of sequence translation to use.
Returns:
true if all fields were successfully set, false otherwise.

Definition at line 1825 of file SamRecord.cpp.

References getCigar(), getQuality(), getReadName(), getSequence(), and StatGenStatus::SUCCESS.

01828 {
01829     myStatus = SamStatus::SUCCESS;
01830     if(myIsBufferSynced == false)
01831     {
01832         if(!fixBuffer(translation))
01833         {
01834             // failed to set the buffer, return false.
01835             return(false);
01836         }
01837     }
01838     memcpy(&recStruct, myRecordPtr, sizeof(bamRecordStruct));
01839 
01840     readName = getReadName();
01841     // Check the status.
01842     if(myStatus != SamStatus::SUCCESS)
01843     {
01844         // Failed to set the fields, return false.
01845         return(false);
01846     }
01847     cigar = getCigar();
01848     // Check the status.
01849     if(myStatus != SamStatus::SUCCESS)
01850     {
01851         // Failed to set the fields, return false.
01852         return(false);
01853     }
01854     sequence = getSequence(translation);
01855     // Check the status.
01856     if(myStatus != SamStatus::SUCCESS)
01857     {
01858         // Failed to set the fields, return false.
01859         return(false);
01860     }
01861     quality = getQuality();
01862     // Check the status.
01863     if(myStatus != SamStatus::SUCCESS)
01864     {
01865         // Failed to set the fields, return false.
01866         return(false);
01867     }
01868     return(true);
01869 }

bool SamRecord::getFields ( bamRecordStruct &  recStruct,
String &  readName,
String &  cigar,
String &  sequence,
String &  quality 
)

Returns the values of all fields except the tags.

Parameters:
recStruct structure containing the contents of all non-variable length fields.
readName read name from the record (return param)
cigar cigar string from the record (return param)
sequence sequence string from the record (return param)
quality quality string from the record (return param)
Returns:
true if all fields were successfully set, false otherwise.

Definition at line 1816 of file SamRecord.cpp.

01818 {
01819     return(getFields(recStruct, readName, cigar, sequence, quality,
01820                      mySequenceTranslation));
01821 }

uint16_t SamRecord::getFlag (  ) 

Get the flag (FLAG).

Returns:
flag.

Definition at line 1334 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by SamFilter::filterRead(), SamQuerySeqWithRefIter::getNextMatchMismatch(), SamValidator::isValid(), Pileup< PILEUP_TYPE, FUNC_CLASS >::processFile(), and SamFile::ReadRecord().

01335 {
01336     myStatus = SamStatus::SUCCESS;
01337     return myRecordPtr->myFlag;
01338 }

int32_t SamRecord::getInsertSize (  ) 

Get the inferred insert size of the read pair (ISIZE) or observed template length (TLEN).

Returns:
inferred insert size or observed template length.

Definition at line 1409 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

01410 {
01411     myStatus = SamStatus::SUCCESS;
01412     return myRecordPtr->myInsertSize;
01413 }

int * SamRecord::getIntegerTag ( const char *  tag  ) 

Get the integer value for the specified tag.

Parameters:
tag tag to retrieve
Returns:
pointer to the tag's integer value if found, NULL if not found.

Definition at line 2166 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

02167 {
02168     // Init to success.
02169     myStatus = SamStatus::SUCCESS;
02170     // Parse the buffer if necessary.
02171     if(myNeedToSetTagsFromBuffer)
02172     {
02173         if(!setTagsFromBuffer())
02174         {
02175             // Failed to read the tags from the buffer, so cannot
02176             // get tags.  setTagsFromBuffer set the errors,
02177             // so just return null.
02178             return(NULL);
02179         }
02180     }
02181     
02182     int key = MAKEKEY(tag[0], tag[1], 'i');
02183     int offset = extras.Find(key);
02184 
02185     int value;
02186     if (offset < 0)
02187     {
02188         // Failed to find the tag.
02189         return(NULL);
02190     }
02191     else
02192         value = extras[offset];
02193 
02194     return(&(integers[value]));
02195 }
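
Since getIntegerTag() and getDoubleTag() return NULL when the tag is absent, callers need to test the pointer before dereferencing it. The block below sketches that calling pattern against a hypothetical TagStore class, which stands in for SamRecord so the example is self-contained. Its std::map storage is an assumption and does not reflect how SamRecord stores tags. integerTagOr is likewise an illustrative helper.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in that mimics the pointer-or-NULL return convention.
class TagStore
{
public:
    void setIntegerTag(const char* tag, int value)
    {
        myIntegers[std::string(tag, 2)] = value;
    }

    // Returns a pointer to the stored value, or NULL if the tag is absent.
    int* getIntegerTag(const char* tag)
    {
        std::map<std::string, int>::iterator it =
            myIntegers.find(std::string(tag, 2));
        if (it == myIntegers.end())
        {
            return NULL;
        }
        return &(it->second);
    }

private:
    std::map<std::string, int> myIntegers;
};

// Returns the tag's value, or fallback when the tag is not present.
inline int integerTagOr(TagStore& store, const char* tag, int fallback)
{
    int* value = store.getIntegerTag(tag);
    return (value != NULL) ? *value : fallback;
}
```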

uint8_t SamRecord::getMapQuality (  ) 

Get the mapping quality (MAPQ) of the record.

Returns:
map quality.

Definition at line 1290 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by SamValidator::isValid().

01291 {
01292     myStatus = SamStatus::SUCCESS;
01293     return myRecordPtr->myMapQuality;
01294 }

int32_t SamRecord::getMateReferenceID (  ) 

Get the mate reference id of the record (BAM format: mate_rid/next_refID).

Returns:
reference id

Definition at line 1388 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

01389 {
01390     myStatus = SamStatus::SUCCESS;
01391     return myRecordPtr->myMateReferenceID;
01392 }

const char * SamRecord::getMateReferenceName (  ) 

Get the mate/next fragment's reference sequence name (RNEXT).

If it is equal to the reference name, it still returns the reference name.

Returns:
reference sequence name

Definition at line 1360 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

01361 {
01362     myStatus = SamStatus::SUCCESS;
01363     return myMateReferenceName.c_str();
01364 }

const char * SamRecord::getMateReferenceNameOrEqual (  ) 

Get the mate/next fragment's reference sequence name (RNEXT), returning "=" if it is the same as the reference name, unless they are both "*" in which case "*" is returned.

Returns:
reference sequence name or '='

Definition at line 1370 of file SamRecord.cpp.

References getReferenceName(), and StatGenStatus::SUCCESS.

01371 {
01372     myStatus = SamStatus::SUCCESS;
01373     if(myMateReferenceName == "*")
01374     {
01375         return(myMateReferenceName);
01376     }
01377     if(myMateReferenceName == getReferenceName())
01378     {
01379         return(FIELD_ABSENT_STRING);
01380     }
01381     else
01382     {
01383         return(myMateReferenceName);
01384     }
01385 }
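
The three cases handled above can be sketched as a free function. mateNameOrEqual is a hypothetical name, and the sketch assumes FIELD_ABSENT_STRING holds "=", as the description of this method implies.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the documented RNEXT rule.
inline std::string mateNameOrEqual(const std::string& referenceName,
                                   const std::string& mateReferenceName)
{
    if (mateReferenceName == "*")
    {
        // Unknown mate reference is reported as-is, even if RNAME is also "*".
        return mateReferenceName;
    }
    if (mateReferenceName == referenceName)
    {
        // Same reference as the read: use the SAM shorthand.
        return "=";
    }
    return mateReferenceName;
}
```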

bool SamRecord::getNextSamTag ( char *  tag,
char &  vtype,
void **  value 
)

Get the next tag from the record.

Sets the Status to SUCCESS when a tag is successfully returned or when there are no more tags. Otherwise the status is set to describe why it failed (parsing, etc).

Parameters:
tag set to the tag when a tag is read.
vtype set to the vtype when a tag is read.
value pointer to the value of the tag (will need to cast to int, double, char, or string based on vtype).
Returns:
true if a tag was read, false if there are no more tags.

Definition at line 1912 of file SamRecord.cpp.

References StatGenStatus::FAIL_PARSE, StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

Referenced by SamRecordHelper::genSamTagsString().

01913 {
01914     myStatus = SamStatus::SUCCESS;
01915     if(myNeedToSetTagsFromBuffer)
01916     {
01917         if(!setTagsFromBuffer())
01918         {
01919             // Failed to read the tags from the buffer, so cannot
01920             // get tags.
01921             return(false);
01922         }
01923     }
01924 
01925     // Increment the tag index to start looking at the next tag.
01926     // At the beginning, it is set to -1.
01927     myLastTagIndex++;
01928     int maxTagIndex = extras.Capacity();
01929     if(myLastTagIndex >= maxTagIndex)
01930     {
01931         // Hit the end of the tags, return false, no more tags.
01932         // Status is still success since this is not an error, 
01933         // it is just the end of the list.
01934         return(false);
01935     }
01936 
01937     bool tagFound = false;
01938     // Loop until a tag is found or the end of extras is hit.
01939     while((tagFound == false) && (myLastTagIndex < maxTagIndex))
01940     {
01941         if(extras.SlotInUse(myLastTagIndex))
01942         {
01943             // Found a slot to use.
01944             int key = extras.GetKey(myLastTagIndex);
01945             getTag(key, tag);
01946             getTypeFromKey(key, vtype);
01947             tagFound = true;
01948             // Get the value associated with the key based on the vtype.
01949             switch (vtype)
01950             {
01951                 case 'f' :
01952                     *value = getDoublePtr(myLastTagIndex);
01953                     break;
01954                 case 'i' :
01955                     *value = getIntegerPtr(myLastTagIndex, vtype);
01956                     if(vtype != 'A')
01957                     {
01958                         // Convert all int types to 'i'
01959                         vtype = 'i';
01960                     }
01961                     break;
01962                 case 'Z' :
01963                 case 'B' :
01964                     *value = getStringPtr(myLastTagIndex);
01965                     break;
01966                 default:
01967                     myStatus.setStatus(SamStatus::FAIL_PARSE,
01968                                        "Unknown tag type");
01969                     tagFound = false;
01970                     break;
01971             }
01972         }
01973         if(!tagFound)
01974         {
01975             // Increment the index since a tag was not found.
01976             myLastTagIndex++;
01977         }
01978     }
01979     return(tagFound);
01980 }
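
A caller typically loops until getNextSamTag() returns false and casts value according to vtype. The block below sketches that loop against a hypothetical MockTagSource, which copies only the method's signature. Entry, MockTagSource, and formatTags are illustrative names, std::string replaces the library's String class, and the 3-character tag buffer (two letters plus a terminator) is an assumption. With a real SamRecord, getStatus() should be checked after the loop, since false is returned both at the end of the tags and on a parse failure.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical tag storage: one value slot per supported type.
struct Entry
{
    std::string tag;
    char vtype;
    int intValue;
    double doubleValue;
    std::string stringValue;
};

// Hypothetical stand-in with the same iteration signature as getNextSamTag.
class MockTagSource
{
public:
    MockTagSource() : myNext(0) {}
    void add(const Entry& entry) { myEntries.push_back(entry); }

    bool getNextSamTag(char* tag, char& vtype, void** value)
    {
        if (myNext >= myEntries.size()) return false;
        Entry& entry = myEntries[myNext++];
        tag[0] = entry.tag[0];
        tag[1] = entry.tag[1];
        tag[2] = '\0';
        vtype = entry.vtype;
        switch (vtype)
        {
            case 'i': *value = &entry.intValue; break;
            case 'f': *value = &entry.doubleValue; break;
            default:  *value = &entry.stringValue; break;
        }
        return true;
    }

private:
    std::vector<Entry> myEntries;
    std::size_t myNext;
};

// Walks every tag and renders it as TAG:TYPE:VALUE, tab separated.
inline std::string formatTags(MockTagSource& source)
{
    std::ostringstream out;
    char tag[3];
    char vtype = '\0';
    void* value = NULL;
    bool first = true;
    while (source.getNextSamTag(tag, vtype, &value))
    {
        if (!first) out << '\t';
        first = false;
        out << tag << ':' << vtype << ':';
        // Cast the untyped pointer according to the reported type.
        switch (vtype)
        {
            case 'i': out << *static_cast<int*>(value); break;
            case 'f': out << *static_cast<double*>(value); break;
            case 'Z': out << *static_cast<std::string*>(value); break;
            default:  break;
        }
    }
    return out.str();
}
```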

uint32_t SamRecord::getNumOverlaps ( int32_t  start,
int32_t  end 
)

Return the number of bases in this read that overlap the passed in region.

Matches & mismatches between the read and the reference are counted as overlaps, but insertions, deletions, skips, clips, and pads are not counted.

Parameters:
start inclusive 0-based start position (reference position) of the region to check for overlaps in. (-1 indicates to start at the beginning of the reference.)
end exclusive 0-based end position (reference position) of the region to check for overlaps in. (-1 indicates to go to the end of the reference.)
Returns:
number of overlapping bases

Definition at line 1803 of file SamRecord.cpp.

References get0BasedPosition(), and Cigar::getNumOverlaps().

Referenced by SamFile::GetNumOverlaps().

01804 {
01805     // Determine whether or not the cigar has been parsed, which sets up
01806     // the cigar roller.  This is determined by checking the alignment length.
01807     if(myAlignmentLength == -1)
01808     {
01809         parseCigar();
01810     }
01811     return(myCigarRoller.getNumOverlaps(start, end, get0BasedPosition()));
01812 }
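
The region semantics (inclusive 0-based start, exclusive 0-based end, -1 meaning unbounded) can be illustrated for a single ungapped block of matches and mismatches. blockOverlap is a hypothetical helper. The real method delegates to Cigar::getNumOverlaps(), which walks the whole CIGAR and skips insertions, deletions, skips, clips, and pads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: bases of one ungapped aligned block, starting at
// 0-based reference position blockStart, that fall inside [start, end).
// A start or end of -1 leaves that side of the region unbounded.
inline uint32_t blockOverlap(int32_t start, int32_t end,
                             int32_t blockStart, int32_t blockLength)
{
    int32_t blockEnd = blockStart + blockLength;  // exclusive
    int32_t lo = (start == -1) ? blockStart : std::max(start, blockStart);
    int32_t hi = (end == -1) ? blockEnd : std::min(end, blockEnd);
    return (hi > lo) ? static_cast<uint32_t>(hi - lo) : 0u;
}
```

Because end is exclusive, a region that begins exactly where the block ends contributes no overlapping bases.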

char SamRecord::getQuality ( int  index  ) 

Get the quality character at the specified index (0 to readLength - 1) into the quality string.

Throws an exception if index is out of range.

Parameters:
index index into the quality string (0 to readLength-1).
Returns:
the quality character at the specified index into the quality.

Definition at line 1732 of file SamRecord.cpp.

References getReadLength(), and BaseUtilities::UNKNOWN_QUALITY_CHAR.

01733 {
01734     // Determine the read length.
01735     int32_t readLen = getReadLength();
01736 
01737     // If the read length is 0, return ' ' whose ascii code is below
01738     // the minimum ascii code for qualities.
01739     if(readLen == 0)
01740     {
01741         return(BaseUtilities::UNKNOWN_QUALITY_CHAR);
01742     }
01743     else if((index < 0) || (index >= readLen))
01744     {
01745         // Only get here if the index was out of range, so throw an exception.
01746         String exceptionString = "SamRecord::getQuality(";
01747         exceptionString += index;
01748         exceptionString += ") is out of range. Index must be between 0 and ";
01749         exceptionString += (readLen - 1);
01750         throw std::runtime_error(exceptionString.c_str());
01751     }
01752 
01753     if(myQuality.Length() == 0) 
01754     {
01755         // Parse BAM Quality.
01756         // Know that myPackedQuality is correct since readLen != 0.
01757         return(myPackedQuality[index] + 33);
01758     }
01759     else
01760     {
01761         // Already have string.
01762         if((myQuality.Length() == 1) && (myQuality[0] == '*'))
01763         {
01764             // Return the unknown quality character.
01765             return(BaseUtilities::UNKNOWN_QUALITY_CHAR);
01766         }
01767         else if(index >= myQuality.Length())
01768         {
01769             // Only get here if the index was out of range, so throw an exception.
01770             // Technically the myQuality string is not guaranteed to be the same length
01771             // as the sequence, so this catches that error.
01772             String exceptionString = "SamRecord::getQuality(";
01773             exceptionString += index;
01774             exceptionString += ") is out of range. Index must be between 0 and ";
01775             exceptionString += (myQuality.Length() - 1);
01776             throw std::runtime_error(exceptionString.c_str());
01777         }
01778         else
01779         {
01780             return(myQuality[index]);
01781         }
01782     }
01783 }
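
When the quality is still held in BAM form, the listing above converts a stored Phred score to its SAM character by adding 33. The block below sketches that conversion on its own. qualityCharAt is a hypothetical helper, and the std::vector argument stands in for the record's packed quality buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: converts the raw Phred score at index to its
// SAM (Phred+33) character, throwing if index is out of range.
inline char qualityCharAt(const std::vector<uint8_t>& packedQuality, int index)
{
    if ((index < 0) || (index >= static_cast<int>(packedQuality.size())))
    {
        throw std::runtime_error("quality index is out of range");
    }
    return static_cast<char>(packedQuality[index] + 33);
}
```

A score of 0 maps to '!' and a score of 40 maps to 'I'.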

const char * SamRecord::getQuality (  ) 

Returns the SAM formatted quality string (QUAL).

Returns:
quality string.

Definition at line 1588 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by PileupElementBaseQual::addEntry(), getFields(), SamValidator::isValidQuality(), and SamFilter::sumMismatchQuality().

01589 {
01590     myStatus = SamStatus::SUCCESS;
01591     if(myQuality.Length() == 0)
01592     {
01593         // 0 Length, means that it is in the buffer, but has not yet
01594         // been synced to the string, so do the sync.
01595         setSequenceAndQualityFromBuffer();      
01596     }
01597     return myQuality.c_str();
01598 }

int32_t SamRecord::getReadLength (  ) 

Get the length of the read.

Returns:
read length.

Definition at line 1341 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by SamFilter::clipOnMismatchThreshold(), SamQuerySeqWithRefIter::getNextMatchMismatch(), getQuality(), getSequence(), SamValidator::isValidCigar(), SamValidator::isValidQuality(), SamQuerySeqWithRefIter::reset(), and CigarHelper::softClipEndByRefPos().

01342 {
01343     myStatus = SamStatus::SUCCESS;
01344     if(myIsSequenceBufferValid == false)
01345     {
01346         // If the sequence is "*", then return 0.
01347         if((mySequence.Length() == 1) && (mySequence[0] == '*'))
01348         {
01349             return(0);
01350         }
01351         // Do not add 1 since it is not null terminated.
01352         return(mySequence.Length());
01353     }
01354     return(myRecordPtr->myReadLength);
01355 }

const char * SamRecord::getReadName (  ) 

Returns the SAM formatted Read Name (QNAME).

Returns:
read name.

Definition at line 1492 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by getFields(), SamValidator::isValid(), and SamFile::validateSortOrder().

01493 {
01494     myStatus = SamStatus::SUCCESS;
01495     if(myReadName.Length() == 0)
01496     {
01497         // 0 Length, means that it is in the buffer, but has not yet
01498         // been synced to the string, so do the sync.
01499         myReadName = (char*)&(myRecordPtr->myData);
01500     }
01501     return myReadName.c_str();
01502 }

uint8_t SamRecord::getReadNameLength (  ) 

Get the length of the readname (QNAME) including the null.

Returns:
length of the read name (including null).

Definition at line 1276 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by SamValidator::isValid().

01277 {
01278     myStatus = SamStatus::SUCCESS;
01279     // If the buffer is valid, return the size from there, otherwise get the 
01280     // size from the string length + 1 (ending null).
01281     if(myIsReadNameBufferValid)
01282     {
01283         return(myRecordPtr->myReadNameLength);
01284     }
01285    
01286     return(myReadName.Length() + 1);
01287 }

const void * SamRecord::getRecordBuffer ( SequenceTranslation  translation  ) 

Get a const pointer to the buffer that contains the BAM representation of the record using the specified translation on the sequence.

Parameters:
translation type of sequence translation to use.
Returns:
const pointer to the buffer that contains the BAM representation of the record.

Definition at line 1161 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

01162 {
01163     myStatus = SamStatus::SUCCESS;
01164     bool status = true;
01165     // If the buffer is not synced or the sequence in the buffer is not
01166     // properly translated, fix the buffer.
01167     if((myIsBufferSynced == false) ||
01168        (myBufferSequenceTranslation != translation))
01169     {
01170         status &= fixBuffer(translation);
01171     }
01172     // If the buffer is synced, check to see if the tags need to be synced.
01173     if(myNeedToSetTagsInBuffer)
01174     {
01175         status &= setTagsInBuffer();
01176     }
01177     if(!status)
01178     {
01179         return(NULL);
01180     }
01181     return (const void *)myRecordPtr;
01182 }

const void * SamRecord::getRecordBuffer (  ) 

Get a const pointer to the buffer that contains the BAM representation of the record.

Returns:
const pointer to the buffer that contains the BAM representation of the record.

Definition at line 1154 of file SamRecord.cpp.

01155 {
01156     return(getRecordBuffer(mySequenceTranslation));
01157 }

GenomeSequence * SamRecord::getReference (  ) 

Returns a pointer to the genome sequence object associated with this record if it was set (NULL if it was not set).

Returns:
pointer to the GenomeSequence object or NULL if there isn't one.

Definition at line 1873 of file SamRecord.cpp.

Referenced by SamValidator::isValidTags().

01874 {
01875     return(myRefPtr);
01876 }

int32_t SamRecord::getReferenceID (  ) 

Get the reference sequence id of the record (BAM format rid).

Returns:
reference sequence id

Definition at line 1255 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by SamCoordOutput::add(), SamValidator::isValid(), Pileup< PILEUP_TYPE, FUNC_CLASS >::processAlignment(), Pileup< PILEUP_TYPE, FUNC_CLASS >::processAlignmentRegion(), and SamFile::validateSortOrder().

01256 {
01257     myStatus = SamStatus::SUCCESS;
01258     return myRecordPtr->myReferenceID;
01259 }

const char * SamRecord::getReferenceName (  ) 

Get the reference sequence name (RNAME) of the record.

Returns:
reference sequence name

Definition at line 1248 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by PileupElement::addEntry(), SamTags::createMDTag(), getMateReferenceNameOrEqual(), getSequence(), SamValidator::isValid(), and SamQuerySeqWithRefIter::reset().

01249 {
01250     myStatus = SamStatus::SUCCESS;
01251     return myReferenceName.c_str();
01252 }

char SamRecord::getSequence ( int  index,
SequenceTranslation  translation 
)

Get the sequence base at the specified index (0 to readLength - 1) into this sequence, performing the specified sequence translation.

Throws an exception if index is out of range.

Parameters:
index index into the sequence string (0 to readLength-1).
translation type of sequence translation to use.
Returns:
the sequence base at the specified index into the sequence.

Definition at line 1607 of file SamRecord.cpp.

References EQUAL, getCigarInfo(), getReadLength(), getReferenceName(), NONE, SamQuerySeqWithRef::seqWithEquals(), and SamQuerySeqWithRef::seqWithoutEquals().

01608 {
01609     static const char * asciiBases = "=AC.G...T......N";
01610 
01611     // Determine the read length.
01612     int32_t readLen = getReadLength();
01613 
01614     // If the read length is 0, this method should not be called.
01615     if(readLen == 0)
01616     {
01617         String exceptionString = "SamRecord::getSequence(";
01618         exceptionString += index;
01619         exceptionString += ") is not allowed since sequence = '*'";
01620         throw std::runtime_error(exceptionString.c_str());
01621     }
01622     else if((index < 0) || (index >= readLen))
01623     {
01624         // Only get here if the index was out of range, so throw an exception.
01625         String exceptionString = "SamRecord::getSequence(";
01626         exceptionString += index;
01627         exceptionString += ") is out of range. Index must be between 0 and ";
01628         exceptionString += (readLen - 1);
01629         throw std::runtime_error(exceptionString.c_str());
01630     }
01631 
01632     // Determine if translation needs to be done.
01633     if((translation == NONE) || (myRefPtr == NULL))
01634     {
01635         // No translation needs to be done.
01636         if(mySequence.Length() == 0)
01637         {
01638             // Parse BAM sequence.
01639             if(myIsSequenceBufferValid)
01640             {
01641                 return(index & 1 ?
01642                        asciiBases[myPackedSequence[index / 2] & 0xF] :
01643                        asciiBases[myPackedSequence[index / 2] >> 4]);
01644             }
01645             else
01646             {
01647                 String exceptionString = "SamRecord::getSequence(";
01648                 exceptionString += index;
01649                 exceptionString += ") called with no sequence set";
01650                 throw std::runtime_error(exceptionString.c_str());
01651             }
01652         }
01653         // Already have string.
01654         return(mySequence[index]);
01655     }
01656     else
01657     {
01658         // Need to translate the sequence either to have '=' or to not
01659         // have it.
01660         // First check to see if the sequence has been set.
01661         if(mySequence.Length() == 0)
01662         {
01663             // 0 Length, means that it is in the buffer, but has not yet
01664             // been synced to the string, so do the sync.
01665             setSequenceAndQualityFromBuffer();
01666         }
01667 
01668         // Check the type of translation.
01669         if(translation == EQUAL)
01670         {
01671             // Check whether or not the string has already been 
01672             // retrieved that has the '=' in it.
01673             if(mySeqWithEq.length() == 0)
01674             {
01675                 // The string with '=' has not yet been determined,
01676                 // so get the string.
01677                 // Check to see if the sequence is defined.
01678                 if(mySequence == "*")
01679                 {
01680                     // Sequence is undefined, so no translation necessary.
01681                     mySeqWithEq = '*';
01682                 }
01683                 else
01684                 {
01685                     // Sequence defined, so translate it.
01686                     SamQuerySeqWithRef::seqWithEquals(mySequence.c_str(), 
01687                                                       myRecordPtr->myPosition, 
01688                                                       *(getCigarInfo()),
01689                                                       getReferenceName(),
01690                                                       *myRefPtr,
01691                                                       mySeqWithEq);
01692                 }
01693             }
01694             // Sequence is set, so return it.
01695             return(mySeqWithEq[index]);
01696         }
01697         else
01698         {
01699             // translation == BASES
01700             // Check whether or not the string has already been 
01701             // retrieved that does not have the '=' in it.
01702             if(mySeqWithoutEq.length() == 0)
01703             {
01704                 // The string without '=' has not yet been determined,
01705                 // so get the string.
01706                 // Check to see if the sequence is defined.
01707                 if(mySequence == "*")
01708                 {
01709                     // Sequence is undefined, so no translation necessary.
01710                     mySeqWithoutEq = '*';
01711                 }
01712                 else
01713                 {
01714                     // Sequence defined, so translate it.
01715                     // The string without '=' has not yet been determined,
01716                     // so get the string.
01717                     SamQuerySeqWithRef::seqWithoutEquals(mySequence.c_str(), 
01718                                                          myRecordPtr->myPosition, 
01719                                                          *(getCigarInfo()),
01720                                                          getReferenceName(),
01721                                                          *myRefPtr,
01722                                                          mySeqWithoutEq);
01723                 }
01724             }
01725             // Sequence is set, so return it.
01726             return(mySeqWithoutEq[index]);
01727         }
01728     }
01729 }
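
The untranslated BAM branch above decodes a sequence packed two bases per byte, with the even index in the high nibble and the odd index in the low nibble. The block below is a minimal sketch of that decoding using the same lookup table. decodeBase and decodeSequence are hypothetical names, and the std::vector argument stands in for the record's packed sequence buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decodes one base from a 4-bit packed BAM sequence.
inline char decodeBase(const std::vector<uint8_t>& packed, int index)
{
    static const char* asciiBases = "=AC.G...T......N";
    uint8_t byte = packed[index / 2];
    // Odd indexes use the low nibble, even indexes the high nibble.
    return (index & 1) ? asciiBases[byte & 0xF] : asciiBases[byte >> 4];
}

// Hypothetical helper: decodes the first readLength bases into a string.
inline std::string decodeSequence(const std::vector<uint8_t>& packed,
                                  int readLength)
{
    std::string bases;
    for (int i = 0; i < readLength; ++i)
    {
        bases += decodeBase(packed, i);
    }
    return bases;
}
```

For an odd read length, the low nibble of the final byte is padding and is never read.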

char SamRecord::getSequence ( int  index  ) 

Get the sequence base at the specified index (0 to readLength - 1) into this sequence, translating the base as specified by setSequenceTranslation.

Throws an exception if index is out of range.

Parameters:
index index into the sequence string (0 to readLength-1).
Returns:
the sequence base at the specified index into the sequence.

Definition at line 1601 of file SamRecord.cpp.

References getSequence().

01602 {
01603     return(getSequence(index, mySequenceTranslation));
01604 }

const char * SamRecord::getSequence ( SequenceTranslation  translation  ) 

Returns the SAM formatted sequence string (SEQ) performing the specified sequence translation.

Parameters:
translation type of sequence translation to use.
Returns:
sequence string.

Definition at line 1524 of file SamRecord.cpp.

References EQUAL, getCigarInfo(), getReferenceName(), NONE, SamQuerySeqWithRef::seqWithEquals(), SamQuerySeqWithRef::seqWithoutEquals(), and StatGenStatus::SUCCESS.

01525 {
01526     myStatus = SamStatus::SUCCESS;
01527     if(mySequence.Length() == 0)
01528     {
01529         // 0 Length, means that it is in the buffer, but has not yet
01530         // been synced to the string, so do the sync.
01531         setSequenceAndQualityFromBuffer();
01532     }
01533 
01534     // Determine if translation needs to be done.
01535     if((translation == NONE) || (myRefPtr == NULL))
01536     {
01537         return mySequence.c_str();
01538     }
01539     else if(translation == EQUAL)
01540     {
01541         if(mySeqWithEq.length() == 0)
01542         {
01543             // Check to see if the sequence is defined.
01544             if(mySequence == "*")
01545             {
01546                 // Sequence is undefined, so no translation necessary.
01547                 mySeqWithEq = '*';
01548             }
01549             else
01550             {
01551                 // Sequence defined, so translate it.
01552                 SamQuerySeqWithRef::seqWithEquals(mySequence.c_str(), 
01553                                                   myRecordPtr->myPosition,
01554                                                   *(getCigarInfo()),
01555                                                   getReferenceName(),
01556                                                   *myRefPtr,
01557                                                   mySeqWithEq);
01558             }
01559         }
01560         return(mySeqWithEq.c_str());
01561     }
01562     else
01563     {
01564         // translation == BASES
01565         if(mySeqWithoutEq.length() == 0)
01566         {
01567             if(mySequence == "*")
01568             {
01569                 // Sequence is undefined, so no translation necessary.
01570                 mySeqWithoutEq = '*';
01571             }
01572             else
01573             {
01574                 // Sequence defined, so translate it.
01575                 SamQuerySeqWithRef::seqWithoutEquals(mySequence.c_str(), 
01576                                                      myRecordPtr->myPosition,
01577                                                      *(getCigarInfo()),
01578                                                      getReferenceName(),
01579                                                      *myRefPtr,
01580                                                      mySeqWithoutEq);
01581             }
01582         }
01583         return(mySeqWithoutEq.c_str());
01584     }
01585 }
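
The effect of the EQUAL and BASES translations can be illustrated with a minimal sketch. It assumes an ungapped alignment (matches and mismatches only), so the CIGAR is ignored, whereas the listing above passes the CIGAR to SamQuerySeqWithRef. The helper names toEquals and toBases and their parameters are hypothetical and are not part of SamRecord.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of SamRecord): replace each base that matches
// the reference with '='.  Assumes an ungapped alignment whose first base
// sits at 0-based offset refStart within `reference`.
std::string toEquals(const std::string& seq,
                     const std::string& reference,
                     std::size_t refStart)
{
    if(seq == "*")
    {
        // Sequence is undefined, so there is nothing to translate.
        return seq;
    }
    std::string result = seq;
    for(std::size_t i = 0; i < seq.size(); ++i)
    {
        std::size_t refPos = refStart + i;
        if((refPos < reference.size()) && (seq[i] == reference[refPos]))
        {
            result[i] = '=';
        }
    }
    return result;
}

// Hypothetical inverse: replace each '=' with the reference base it stands for.
std::string toBases(const std::string& seq,
                    const std::string& reference,
                    std::size_t refStart)
{
    if(seq == "*")
    {
        return seq;
    }
    std::string result = seq;
    for(std::size_t i = 0; i < seq.size(); ++i)
    {
        std::size_t refPos = refStart + i;
        if((result[i] == '=') && (refPos < reference.size()))
        {
            result[i] = reference[refPos];
        }
    }
    return result;
}
```

With reference "TTACGAGG" and a read "ACGT" placed at offset 2, toEquals yields "===T", and toBases turns that back into "ACGT".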

const char * SamRecord::getSequence (  ) 

Returns the SAM formatted sequence string (SEQ), translating the bases as specified by setSequenceTranslation.

Returns:
sequence string.

Definition at line 1518 of file SamRecord.cpp.

Referenced by PileupElementBaseQual::addEntry(), SamRecordHelper::checkSequence(), SamTags::createMDTag(), getFields(), SamQuerySeqWithRefIter::getNextMatchMismatch(), getSequence(), and shiftIndelsLeft().

01519 {
01520     return(getSequence(mySequenceTranslation));
01521 }

const SamStatus & SamRecord::getStatus (  ) 

Returns the status associated with the last method that sets the status.

Returns:
SamStatus of the last command that sets status.

Definition at line 2349 of file SamRecord.cpp.

02350 {
02351     return(myStatus);
02352 }

const String * SamRecord::getStringTag ( const char *  tag  ) 

Get the string value for the specified tag.

Parameters:
tag tag to retrieve
Returns:
pointer to the tag's string value if found, NULL if not found.

Definition at line 2130 of file SamRecord.cpp.

Referenced by SamTags::isMDTagCorrect(), and SamValidator::isValidTags().

02131 {
02132     // Parse the buffer if necessary.
02133     if(myNeedToSetTagsFromBuffer)
02134     {
02135         if(!setTagsFromBuffer())
02136         {
02137             // Failed to read the tags from the buffer, so cannot
02138             // get tags.  setTagsFromBuffer set the errors,
02139             // so just return null.
02140             return(NULL);
02141         }
02142     }
02143     
02144     int key = MAKEKEY(tag[0], tag[1], 'Z');
02145     int offset = extras.Find(key);
02146 
02147     int value;
02148     if (offset < 0)
02149     {
02150         // Check for 'B' tag.
02151         key = MAKEKEY(tag[0], tag[1], 'B');
02152         offset = extras.Find(key);
02153         if(offset < 0)
02154         {
02155             // Tag not found.
02156             return(NULL);
02157         }
02158     }
02159 
02160     // Offset is valid, so return the tag.
02161     value = extras[offset];
02162     return(&(strings[value]));
02163 }

uint32_t SamRecord::getTagLength (  ) 

Returns the length of the BAM formatted tags.

Returns:
length of the BAM formatted tags.

Definition at line 1879 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

01880 {
01881     myStatus = SamStatus::SUCCESS;
01882     if(myNeedToSetTagsFromBuffer)
01883     {
01884         // Tags are only set in the buffer, so the size of the tags is 
01885         // the length of the record minus the starting location of the tags.
01886         unsigned char * tagStart = 
01887             (unsigned char *)myRecordPtr->myData 
01888             + myRecordPtr->myReadNameLength 
01889             + myRecordPtr->myCigarLength * sizeof(int)
01890             + (myRecordPtr->myReadLength + 1) / 2 + myRecordPtr->myReadLength;
01891       
01892         // The non-tags take up from the start of the record to the tag start.
01893         // Do not include the block size part of the record since it is not
01894         // included in the size.
01895         uint32_t nonTagSize = 
01896             tagStart - (unsigned char*)&(myRecordPtr->myReferenceID);
01897         // Tags take up the size of the block minus the non-tag section.
01898         uint32_t tagSize = myRecordPtr->myBlockSize - nonTagSize;
01899         return(tagSize);
01900     }
01901 
01902     // Tags are stored outside the buffer, so myTagBufferSize is set.
01903     return(myTagBufferSize);
01904 }
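
The arithmetic in the buffer-only branch can be sketched as a standalone function. It assumes the in-memory record mirrors the BAM alignment layout, where the fixed-size fields from the reference ID through the insert size occupy 32 bytes and each CIGAR operation takes 4 bytes. The function name tagBytes and the constant are hypothetical.

```cpp
#include <cstdint>

// Assumption: the fixed-size fields that follow the block size in a BAM
// alignment (reference ID through insert size) occupy 32 bytes.
const uint32_t BAM_FIXED_FIELD_BYTES = 32;

// Hypothetical helper: number of bytes taken by the tags in a BAM record.
// blockSize excludes the 4-byte block size field itself.
uint32_t tagBytes(uint32_t blockSize,
                  uint32_t readNameLength,  // includes the terminating null
                  uint32_t numCigarOps,
                  uint32_t readLength)
{
    uint32_t variableBytes =
        readNameLength            // read name
        + numCigarOps * 4         // one 32-bit value per CIGAR operation
        + (readLength + 1) / 2    // sequence, packed two bases per byte
        + readLength;             // one quality byte per base
    return blockSize - (BAM_FIXED_FIELD_BYTES + variableBytes);
}
```

For a 64-byte block holding a 6-byte read name, one CIGAR operation, and a 10-base read, the non-tag portion is 32 + 6 + 4 + 5 + 10 = 57 bytes, which leaves 7 bytes of tags.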

bool SamRecord::getTagsString ( const char *  tags,
String & returnString,
char  delim = '\t' 
)

Get the string representation of the tags from the record, formatted as TAG:TYPE:VALUE<delim>TAG:TYPE:VALUE.

Sets the status to SUCCESS when the tags are successfully returned or when the tags were not found. If a different error occurred, the status is set appropriately.

Parameters:
tags the tags to retrieve, formatted as TAG:TYPE;TAG:TYPE...
returnString the String to set (this method first clears returnString) to TAG:TYPE:VALUE<delim>TAG:TYPE:VALUE...
delim delimiter to use to separate two tags, default is a tab.
Returns:
true if there were no errors, even if no tags were found.

Definition at line 2032 of file SamRecord.cpp.

References StatGenStatus::INVALID, StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

02033 {
02034     const char* currentTagPtr = tags;
02035 
02036     returnString.Clear();
02037     myStatus = SamStatus::SUCCESS;
02038     if(myNeedToSetTagsFromBuffer)
02039     {
02040         if(!setTagsFromBuffer())
02041         {
02042             // Failed to read the tags from the buffer, so cannot
02043             // get tags.
02044             return(false);
02045         }
02046     }
02047     
02048     bool returnStatus = true;
02049 
02050     while(*currentTagPtr != '\0')
02051     {
02052         // Tags are formatted as: XY:Z
02053         // Where X is [A-Za-z], Y is [A-Za-z], and
02054         // Z is A,i,f,Z,H (cCsSI are also accepted)
02055         if((currentTagPtr[0] == '\0') || (currentTagPtr[1] == '\0') ||
02056            (currentTagPtr[2] != ':') || (currentTagPtr[3] == '\0'))
02057         {
02058             myStatus.setStatus(SamStatus::INVALID, 
02059                                "getTagsString called with improperly formatted tags.\n");
02060             returnStatus = false;
02061             break;
02062         }
02063 
02064         // Construct the key.
02065         int key = MAKEKEY(currentTagPtr[0], currentTagPtr[1], 
02066                           currentTagPtr[3]);
02067         // Look to see if the key exists in the hash.
02068         int offset = extras.Find(key);
02069 
02070         if(offset >= 0)
02071         {
02072             // Offset is set, so the key was found.
02073             if(!returnString.IsEmpty())
02074             {
02075                 returnString += delim;
02076             }
02077             returnString += currentTagPtr[0];
02078             returnString += currentTagPtr[1];
02079             returnString += ':';
02080             returnString += currentTagPtr[3];
02081             returnString += ':';
02082 
02083             // First if it is an integer, determine the actual type of the int.
02084             char vtype;
02085             getTypeFromKey(key, vtype);
02086 
02087             switch(vtype)
02088             {
02089                 case 'i':
02090                     returnString += *(int*)getIntegerPtr(offset, vtype);
02091                     break;
02092                 case 'f':
02093                     returnString += *(double*)getDoublePtr(offset);
02094                     break;
02095                 case 'Z':
02096                 case 'B':
02097                     returnString += *(String*)getStringPtr(offset);
02098                     break;
02099                 default:
02100                     myStatus.setStatus(SamStatus::INVALID, 
02101                                        "rmTag called with unknown type.\n");
02102                     returnStatus = false;
02103                     break;
02104             };
02105         }
02106         // Increment to the next tag.
02107         if(currentTagPtr[4] == ';')
02108         {
02109             // Increment once more.
02110             currentTagPtr += 5;
02111         }
02112         else if(currentTagPtr[4] != '\0')
02113         {
02114             // Invalid tag format. 
02115             myStatus.setStatus(SamStatus::INVALID, 
02116                                "rmTags called with improperly formatted tags.\n");
02117             returnStatus = false;
02118             break;
02119         }
02120         else
02121         {
02122             // Last Tag.
02123             currentTagPtr += 4;
02124         }
02125     }
02126     return(returnStatus);
02127 }
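
The TAG:TYPE;TAG:TYPE... input format walked by the loop above can be sketched as a small standalone parser. This is only an illustration of the format check. The name parseTagList is hypothetical, and unlike getTagsString it does not look anything up in a record.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: split "XY:Z;XY:Z..." into (tag, type) pairs.
// Returns false if the list is improperly formatted; entries parsed before
// the error remain in `parsed`.
bool parseTagList(const std::string& tags,
                  std::vector<std::pair<std::string, char> >& parsed)
{
    parsed.clear();
    std::size_t pos = 0;
    while(pos < tags.size())
    {
        // Each entry needs 4 characters: two tag characters, ':', and a type.
        if((pos + 4 > tags.size()) || (tags[pos + 2] != ':'))
        {
            return false;
        }
        parsed.push_back(std::make_pair(tags.substr(pos, 2), tags[pos + 3]));
        pos += 4;
        if(pos == tags.size())
        {
            // Last tag.
            break;
        }
        if(tags[pos] != ';')
        {
            // Anything other than ';' between entries is invalid.
            return false;
        }
        ++pos;
    }
    return true;
}
```

Parsing "NM:i;MD:Z" produces the pairs ("NM", 'i') and ("MD", 'Z'), while "NM-i" is rejected because the third character is not a colon.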

bool SamRecord::isCharType ( char  vtype  )  [static]

Returns whether or not the specified vtype is a char type.

Does not set SamStatus.

Parameters:
vtype value type to check.
Returns:
true if the passed in vtype is a char ('A'), false otherwise.

Definition at line 2012 of file SamRecord.cpp.

Referenced by SamRecordHelper::genSamTagString().

02013 {
02014     if(vtype == 'A')
02015     {
02016         return(true);
02017     }
02018     return(false);
02019 }

bool SamRecord::isDoubleType ( char  vtype  )  [static]

Returns whether or not the specified vtype is a double type.

Does not set SamStatus.

Parameters:
vtype value type to check.
Returns:
true if the passed in vtype is a double ('f'), false otherwise.

Definition at line 2002 of file SamRecord.cpp.

Referenced by SamRecordHelper::genSamTagString().

02003 {
02004     if(vtype == 'f')
02005     {
02006         return(true);
02007     }
02008     return(false);
02009 }

bool SamRecord::isIntegerType ( char  vtype  )  [static]

Returns whether or not the specified vtype is an integer type.

Does not set SamStatus.

Parameters:
vtype value type to check.
Returns:
true if the passed in vtype is an integer ('c', 'C', 's', 'S', 'i', 'I'), false otherwise.

Definition at line 1990 of file SamRecord.cpp.

Referenced by SamRecordHelper::genSamTagString().

01991 {
01992     if((vtype == 'c') || (vtype == 'C') ||
01993        (vtype == 's') || (vtype == 'S') ||
01994        (vtype == 'i') || (vtype == 'I'))
01995     {
01996         return(true);
01997     }
01998     return(false);
01999 }

bool SamRecord::isStringType ( char  vtype  )  [static]

Returns whether or not the specified vtype is a string type.

Does not set SamStatus.

Parameters:
vtype value type to check.
Returns:
true if the passed in vtype is a string ('Z'/'B'), false otherwise.

Definition at line 2022 of file SamRecord.cpp.

Referenced by SamRecordHelper::genSamTagString().

02023 {
02024     if((vtype == 'Z') || (vtype == 'B'))
02025     {
02026         return(true);
02027     }
02028     return(false);
02029 }
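
Taken together, the four static predicates partition the tag value types. A minimal sketch of that partition as a single function follows. The enum TagKind and the function classifyTagType are hypothetical and are not part of SamRecord.

```cpp
// Hypothetical classification mirroring isCharType, isDoubleType,
// isIntegerType, and isStringType as documented above.
enum class TagKind { Char, Integer, Double, String, Unknown };

TagKind classifyTagType(char vtype)
{
    switch(vtype)
    {
        case 'A':
            return TagKind::Char;
        case 'c': case 'C':
        case 's': case 'S':
        case 'i': case 'I':
            return TagKind::Integer;
        case 'f':
            return TagKind::Double;
        case 'Z': case 'B':
            return TagKind::String;
        default:
            // None of the four predicates accepts this type.
            return TagKind::Unknown;
    }
}
```

Note that 'H' lands in Unknown here, since none of the four predicates shown above returns true for it.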

bool SamRecord::isValid ( SamFileHeader & header  ) 

Returns whether or not the record is valid, setting the status to indicate success or failure.

Parameters:
header SAM Header associated with the record. Used to perform some validation against the header.
Returns:
true if the record is valid, false if not.

Definition at line 162 of file SamRecord.cpp.

References SamValidationErrors::getErrorString(), StatGenStatus::INVALID, SamValidator::isValid(), StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

00163 {
00164     myStatus = SamStatus::SUCCESS;
00165     SamValidationErrors invalidSamErrors;
00166     if(!SamValidator::isValid(header, *this, invalidSamErrors))
00167     {
00168         // The record is not valid.
00169         std::string errorMessage = "";
00170         invalidSamErrors.getErrorString(errorMessage);
00171         myStatus.setStatus(SamStatus::INVALID, errorMessage.c_str());
00172         return(false);
00173     }
00174     // The record is valid.
00175     return(true);
00176 }

void SamRecord::resetRecord (  ) 

Reset the fields of the record to a default value.

This is not necessary when reading a SAM/BAM file, but when setting fields yourself it is a good idea to reset a record before reusing it. After a reset, every field holds its default value, so you only need to set the fields that are not empty.

Definition at line 91 of file SamRecord.cpp.

References clearTags(), NONE, and StatGenStatus::SUCCESS.

Referenced by SamRecord(), setBuffer(), setBufferFromFile(), and ~SamRecord().

00092 {
00093     myIsBufferSynced = true;
00094 
00095     myRecordPtr->myBlockSize = DEFAULT_BLOCK_SIZE;
00096     myRecordPtr->myReferenceID = -1;
00097     myRecordPtr->myPosition = -1;
00098     myRecordPtr->myReadNameLength = DEFAULT_READ_NAME_LENGTH;
00099     myRecordPtr->myMapQuality = 0;
00100     myRecordPtr->myBin = DEFAULT_BIN;
00101     myRecordPtr->myCigarLength = 0;
00102     myRecordPtr->myFlag = 0;
00103     myRecordPtr->myReadLength = 0;
00104     myRecordPtr->myMateReferenceID = -1;
00105     myRecordPtr->myMatePosition = -1;
00106     myRecordPtr->myInsertSize = 0;
00107    
00108     // Set the sam values for the variable length fields.
00109     // TODO - one way to speed this up might be to not set to "*" and just
00110     // clear them, and write out a '*' for SAM if it is empty.
00111     myReadName = DEFAULT_READ_NAME;
00112     myReferenceName = "*";
00113     myMateReferenceName = "*";
00114     myCigar = "*";
00115     mySequence = "*";
00116     mySeqWithEq.clear();
00117     mySeqWithoutEq.clear();
00118     myQuality = "*";
00119     myNeedToSetTagsFromBuffer = false;
00120     myNeedToSetTagsInBuffer = false;
00121 
00122     // Initialize the calculated alignment info to the uncalculated value.
00123     myAlignmentLength = -1;
00124     myUnclippedStartOffset = -1;
00125     myUnclippedEndOffset = -1;
00126 
00127     clearTags();
00128 
00129     // Set the bam values for the variable length fields.
00130     // Only the read name needs to be set, the others are a length of 0.
00131     // Set the read name.  The min size of myRecordPtr includes the size for
00132     // the default read name.
00133     memcpy(&(myRecordPtr->myData), myReadName.c_str(), 
00134            myRecordPtr->myReadNameLength);
00135 
00136     // Set that the variable length buffer fields are valid.
00137     myIsReadNameBufferValid = true;
00138     myIsCigarBufferValid = true;
00139     myPackedSequence = 
00140         (unsigned char *)myRecordPtr->myData + myRecordPtr->myReadNameLength +
00141         myRecordPtr->myCigarLength * sizeof(int);
00142     myIsSequenceBufferValid = true;
00143     myBufferSequenceTranslation = NONE;
00144 
00145     myPackedQuality = myPackedSequence;
00146     myIsQualityBufferValid = true;
00147     myIsTagsBufferValid = true;
00148     myIsBinValid = true;
00149 
00150     myCigarTempBufferLength = -1;
00151 
00152     myStatus = SamStatus::SUCCESS;
00153 
00154     NOT_FOUND_TAG_STRING = "";
00155     NOT_FOUND_TAG_INT = -1;
00156     NOT_FOUND_TAG_DOUBLE = -1;
00157 }

bool SamRecord::rmTag ( const char *  tag,
char  type 
)

Remove a tag.

Parameters:
tag tag to remove.
type type of the tag to be removed.
Returns:
true if the tag no longer exists in the record (including when it was not found in the record to begin with), false if it could not be removed.

Definition at line 946 of file SamRecord.cpp.

References getString(), StatGenStatus::INVALID, StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

00947 {
00948     // Check the length of tag.
00949     if(strlen(tag) != 2)
00950     {
00951         // Tag is the wrong length.
00952         myStatus.setStatus(SamStatus::INVALID, 
00953                            "rmTag called with tag that is not 2 characters\n");
00954         return(false);
00955     }
00956 
00957     myStatus = SamStatus::SUCCESS;
00958     if(myNeedToSetTagsFromBuffer)
00959     {
00960         if(!setTagsFromBuffer())
00961         {
00962             // Failed to read the tags from the buffer, so cannot
00963             // get tags.
00964             return(false);
00965         }
00966     }
00967 
00968     // Construct the key.
00969     int key = MAKEKEY(tag[0], tag[1], type);
00970     // Look to see if the key exists in the hash.
00971     int offset = extras.Find(key);
00972 
00973     if(offset < 0)
00974     {
00975         // Not found, so return true, successfully removed since
00976         // it is not in tag.
00977         return(true);
00978     }
00979 
00980     // Offset is set, so the key was found.
00981     // First if it is an integer, determine the actual type of the int.
00982     char vtype;
00983     getTypeFromKey(key, vtype);
00984     if(vtype == 'i')
00985     {
00986         vtype = getIntegerType(offset);
00987     }
00988 
00989     // Offset is set, so recalculate the buffer size without this entry.
00990     // Do NOT remove from strings, integers, or doubles because then
00991     // extras would need to be updated for all entries with the new indexes
00992     // into those variables.
00993     int rmBuffSize = 0;
00994     switch(vtype)
00995     {
00996         case 'A':
00997         case 'c':
00998         case 'C':
00999             rmBuffSize = 4;
01000             break;
01001         case 's':
01002         case 'S':
01003             rmBuffSize = 5;
01004             break;
01005         case 'i':
01006         case 'I':
01007             rmBuffSize = 7;
01008             break;
01009         case 'f':
01010             rmBuffSize = 7;
01011             break;
01012         case 'Z':
01013         case 'B':
01014             rmBuffSize = 4 + getString(offset).Length();
01015             break;
01016         default:
01017             myStatus.setStatus(SamStatus::INVALID, 
01018                                "rmTag called with unknown type.\n");
01019             return(false);
01020             break;
01021     };
01022 
01023     // The buffer tags are now out of sync.
01024     myNeedToSetTagsInBuffer = true;
01025     myIsTagsBufferValid = false;
01026     myIsBufferSynced = false;
01027     myTagBufferSize -= rmBuffSize;
01028 
01029     // Remove from the hash.
01030     extras.Delete(offset);
01031     return(true);
01032 }
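
The sizes subtracted from myTagBufferSize in the switch above follow from the BAM tag encoding: two tag characters, one type character, and then the value (strings carry a terminating null). A minimal sketch of that size table follows. The function name tagBufferBytes is hypothetical, and, like the listing, it sizes 'B' the same way as 'Z'.

```cpp
#include <cstddef>

// Hypothetical helper: bytes one tag occupies in the BAM tag buffer,
// using the same sizes as the switch in rmTag.  Returns -1 for an
// unknown type.  stringLength is only used for 'Z' and 'B'.
int tagBufferBytes(char vtype, std::size_t stringLength = 0)
{
    switch(vtype)
    {
        case 'A': case 'c': case 'C':
            return 4;   // 3 header bytes + 1 value byte
        case 's': case 'S':
            return 5;   // 3 header bytes + 2 value bytes
        case 'i': case 'I': case 'f':
            return 7;   // 3 header bytes + 4 value bytes
        case 'Z': case 'B':
            // 3 header bytes + the characters + 1 terminating null
            return 4 + static_cast<int>(stringLength);
        default:
            return -1;
    }
}
```

For example, removing a 'Z' tag whose value is 5 characters long frees 9 bytes, and removing an 'i' tag frees 7.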

bool SamRecord::rmTags ( const char *  tags  ) 

Remove tags.

Parameters:
tags tags to remove, formatted as Tag:Type;Tag:Type;Tag:Type...
Returns:
true if all tags no longer exist in the record, false if any could not be removed (Returns true if the tags were not found in the record). SamStatus is set to INVALID if the tags are incorrectly formatted.

Definition at line 1035 of file SamRecord.cpp.

References getString(), StatGenStatus::INVALID, StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

01036 {
01037     const char* currentTagPtr = tags;
01038 
01039     myStatus = SamStatus::SUCCESS;
01040     if(myNeedToSetTagsFromBuffer)
01041     {
01042         if(!setTagsFromBuffer())
01043         {
01044             // Failed to read the tags from the buffer, so cannot
01045             // get tags.
01046             return(false);
01047         }
01048     }
01049     
01050     bool returnStatus = true;
01051 
01052     int rmBuffSize = 0;
01053     while(*currentTagPtr != '\0')
01054     {
01055 
01056         // Tags are formatted as: XY:Z
01057         // Where X is [A-Za-z], Y is [A-Za-z], and
01058         // Z is A,i,f,Z,H (cCsSI are also accepted)
01059         if((currentTagPtr[0] == '\0') || (currentTagPtr[1] == '\0') ||
01060            (currentTagPtr[2] != ':') || (currentTagPtr[3] == '\0'))
01061         {
01062             myStatus.setStatus(SamStatus::INVALID, 
01063                                "rmTags called with improperly formatted tags.\n");
01064             returnStatus = false;
01065             break;
01066         }
01067 
01068         // Construct the key.
01069         int key = MAKEKEY(currentTagPtr[0], currentTagPtr[1], 
01070                           currentTagPtr[3]);
01071         // Look to see if the key exists in the hash.
01072         int offset = extras.Find(key);
01073 
01074         if(offset >= 0)
01075         {
01076             // Offset is set, so the key was found.
01077             // First if it is an integer, determine the actual type of the int.
01078             char vtype;
01079             getTypeFromKey(key, vtype);
01080             if(vtype == 'i')
01081             {
01082                 vtype = getIntegerType(offset);
01083             }
01084             
01085             // Offset is set, so recalculate the buffer size without this entry.
01086             // Do NOT remove from strings, integers, or doubles because then
01087             // extras would need to be updated for all entries with the new indexes
01088             // into those variables.
01089             switch(vtype)
01090             {
01091                 case 'A':
01092                 case 'c':
01093                 case 'C':
01094                     rmBuffSize += 4;
01095                     break;
01096                 case 's':
01097                 case 'S':
01098                     rmBuffSize += 5;
01099                     break;
01100                 case 'i':
01101                 case 'I':
01102                     rmBuffSize += 7;
01103                     break;
01104                 case 'f':
01105                     rmBuffSize += 7;
01106                     break;
01107                 case 'Z':
01108                 case 'B':
01109                     rmBuffSize += 4 + getString(offset).Length();
01110                     break;
01111                 default:
01112                     myStatus.setStatus(SamStatus::INVALID, 
01113                                        "rmTag called with unknown type.\n");
01114                     returnStatus = false;
01115                     break;
01116             };
01117             
01118             // Remove from the hash.
01119             extras.Delete(offset);
01120         }
01121         // Increment to the next tag.
01122         if(currentTagPtr[4] == ';')
01123         {
01124             // Increment once more.
01125             currentTagPtr += 5;
01126         }
01127         else if(currentTagPtr[4] != '\0')
01128         {
01129             // Invalid tag format. 
01130             myStatus.setStatus(SamStatus::INVALID, 
01131                                "rmTags called with improperly formatted tags.\n");
01132             returnStatus = false;
01133             break;
01134         }
01135         else
01136         {
01137             // Last Tag.
01138             currentTagPtr += 4;
01139         }
01140     }
01141 
01142     // The buffer tags are now out of sync.
01143     myNeedToSetTagsInBuffer = true;
01144     myIsTagsBufferValid = false;
01145     myIsBufferSynced = false;
01146     myTagBufferSize -= rmBuffSize;
01147     
01148 
01149     return(returnStatus);
01150 }

bool SamRecord::set0BasedMatePosition ( int32_t  matePosition  ) 

Set the mate/next fragment's leftmost position using the specified 0-based (BAM format) value.

Internal processing handles the switching between SAM/BAM formats when read/written.

Parameters:
matePosition 0-based start position of the mate/next fragment
Returns:
true if successfully set, false if not.

Definition at line 329 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by set1BasedMatePosition().

00330 {
00331     myStatus = SamStatus::SUCCESS;
00332     myRecordPtr->myMatePosition = matePosition;
00333     return true;
00334 }

bool SamRecord::set0BasedPosition ( int32_t  position  ) 

Set the leftmost position using the specified 0-based (BAM format) value.

Internal processing handles the switching between SAM/BAM formats when read/written.

Parameters:
position 0-based start position
Returns:
true if successfully set, false if not.

Definition at line 243 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by set1BasedPosition(), and SamFilter::softClip().

00244 {
00245     myStatus = SamStatus::SUCCESS;
00246     myRecordPtr->myPosition = position;
00247     myIsBinValid = false;
00248     return true;
00249 }

bool SamRecord::set1BasedMatePosition ( int32_t  matePosition  ) 

Set the mate/next fragment's leftmost position (PNEXT) using the specified 1-based (SAM format) value.

Internal processing handles the switching between SAM/BAM formats when read/written.

Parameters:
matePosition 1-based start position of the mate/next fragment
Returns:
true if successfully set, false if not.

Definition at line 323 of file SamRecord.cpp.

References set0BasedMatePosition().

00324 {
00325     return(set0BasedMatePosition(matePosition - 1));
00326 }

bool SamRecord::set1BasedPosition ( int32_t  position  ) 

Set the leftmost position (POS) using the specified 1-based (SAM format) value.

Internal processing handles the switching between SAM/BAM formats when read/written.

Parameters:
position 1-based start position
Returns:
true if successfully set, false if not.

Definition at line 237 of file SamRecord.cpp.

References set0BasedPosition().

00238 {
00239     return(set0BasedPosition(position - 1));
00240 }
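
The relationship between the two coordinate systems is a simple offset of one, as the listing shows. A minimal sketch with hypothetical helper names follows. It also shows that the SAM "unset" position 0 corresponds to the -1 that resetRecord stores internally.

```cpp
#include <cstdint>

// Hypothetical helpers illustrating the SAM (1-based) and BAM (0-based)
// position conventions used by set1BasedPosition and set0BasedPosition.
int32_t toZeroBased(int32_t oneBasedPosition)
{
    return oneBasedPosition - 1;
}

int32_t toOneBased(int32_t zeroBasedPosition)
{
    return zeroBasedPosition + 1;
}
```

So the first base of a reference is position 1 in SAM and position 0 in BAM, and an unset position is 0 in SAM and -1 in BAM.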

SamStatus::Status SamRecord::setBuffer ( const char *  fromBuffer,
uint32_t  fromBufferSize,
SamFileHeader & header 
)

Sets the SamRecord to contain the information in the BAM formatted fromBuffer.

Parameters:
fromBuffer buffer to read the BAM record from.
fromBufferSize size of the buffer containing the BAM record.
header BAM header for the record.
Returns:
status of reading the BAM record from the buffer.

Definition at line 526 of file SamRecord.cpp.

References StatGenStatus::FAIL_MEM, StatGenStatus::FAIL_PARSE, resetRecord(), StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

00529 {
00530     myStatus = SamStatus::SUCCESS;
00531     if((fromBuffer == NULL) || (fromBufferSize == 0))
00532     {
00533         // Buffer is empty.
00534         myStatus.setStatus(SamStatus::FAIL_PARSE,
00535                            "Cannot parse an empty file.");
00536         return(SamStatus::FAIL_PARSE);
00537     }
00538 
00539     // Clear the record.   
00540     resetRecord();
00541 
00542     // allocate space for the record size.
00543     if(!allocateRecordStructure(fromBufferSize))
00544     {
00545         // Failed to allocate space.
00546         return(SamStatus::FAIL_MEM);
00547     }
00548    
00549     memcpy(myRecordPtr, fromBuffer, fromBufferSize);
00550 
00551     setVariablesForNewBuffer(header);
00552 
00553     // Return the status of the record.
00554     return(SamStatus::SUCCESS);
00555 }

SamStatus::Status SamRecord::setBufferFromFile ( IFILE  filePtr,
SamFileHeader & header 
)

Read the BAM record from a file.

Parameters:
filePtr file to read the buffer from.
header BAM header for the record.
Returns:
status of reading the BAM record from the file.

Definition at line 559 of file SamRecord.cpp.

References StatGenStatus::FAIL_IO, StatGenStatus::FAIL_MEM, StatGenStatus::FAIL_ORDER, StatGenStatus::FAIL_PARSE, ifeof(), ifread(), InputFile::isOpen(), StatGenStatus::NO_MORE_RECS, resetRecord(), StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

00561 {
00562     myStatus = SamStatus::SUCCESS;
00563     if((filePtr == NULL) || (filePtr->isOpen() == false))
00564     {
00565         // File is not open, return failure.
00566         myStatus.setStatus(SamStatus::FAIL_ORDER, 
00567                            "Can't read from an unopened file.");
00568         return(SamStatus::FAIL_ORDER);
00569     }
00570 
00571     // Clear the record.
00572     resetRecord();
00573 
00574     // read the record size.
00575     int numBytes = 
00576         ifread(filePtr, &(myRecordPtr->myBlockSize), sizeof(int32_t));
00577 
00578     // Check to see if the end of the file was hit and no bytes were read.
00579     if(ifeof(filePtr) && (numBytes == 0))
00580     {
00581         // End of file, nothing was read, no more records.
00582         myStatus.setStatus(SamStatus::NO_MORE_RECS,
00583                            "No more records left to read.");
00584         return(SamStatus::NO_MORE_RECS);
00585     }
00586     
00587     if(numBytes != sizeof(int32_t))
00588     {
00589         // Failed to read the entire block size.  Either the end of the file
00590         // was reached early or there was an error.
00591         if(ifeof(filePtr))
00592         {
00593             // Error: end of the file reached prior to reading the rest of the
00594             // record.
00595             myStatus.setStatus(SamStatus::FAIL_PARSE, 
00596                                "EOF reached in the middle of a record.");
00597             return(SamStatus::FAIL_PARSE);
00598         }
00599         else
00600         {
00601             // Error reading.
00602             myStatus.setStatus(SamStatus::FAIL_IO, 
00603                                "Failed to read the record size.");
00604             return(SamStatus::FAIL_IO);
00605         }
00606     }
00607 
00608     // allocate space for the record size.
00609     if(!allocateRecordStructure(myRecordPtr->myBlockSize + sizeof(int32_t)))
00610     {
00611         // Failed to allocate space.
00612         // Status is set by allocateRecordStructure.
00613         return(SamStatus::FAIL_MEM);
00614     }
00615 
00616     // Read the rest of the alignment block, starting at the reference id.
00617     if(ifread(filePtr, &(myRecordPtr->myReferenceID), myRecordPtr->myBlockSize)
00618        != (unsigned int)myRecordPtr->myBlockSize)
00619     {
00620         // Error reading the record.  Reset it and return failure.
00621         resetRecord();
00622         myStatus.setStatus(SamStatus::FAIL_IO,
00623                            "Failed to read the record");
00624         return(SamStatus::FAIL_IO);
00625     }
00626 
00627     setVariablesForNewBuffer(header);
00628 
00629     // Return the status of the record.
00630     return(SamStatus::SUCCESS);
00631 }

bool SamRecord::setCigar ( const Cigar & cigar  ) 

Set the CIGAR to the specified Cigar object.

Internal processing handles the switching between SAM/BAM formats when read/written.

Parameters:
cigar object to set this record's cigar to have.
Returns:
true if successfully set, false if not.

Definition at line 279 of file SamRecord.cpp.

References Cigar::getCigarString(), and StatGenStatus::SUCCESS.

00280 {
00281     myStatus = SamStatus::SUCCESS;
00282     cigar.getCigarString(myCigar);
00283  
00284     myIsBufferSynced = false;
00285     myIsCigarBufferValid = false;
00286     myCigarTempBufferLength = -1;
00287     myIsBinValid = false;
00288 
00289     // Initialize the calculated alignment info to the uncalculated value.
00290     myAlignmentLength = -1;
00291     myUnclippedStartOffset = -1;
00292     myUnclippedEndOffset = -1;
00293 
00294     return true;
00295 }

bool SamRecord::setCigar ( const char *  cigar  ) 

Set the CIGAR to the specified SAM formatted cigar string.

Internal processing handles the switching between SAM/BAM formats when read/written.

Parameters:
cigar string containing the SAM formatted cigar.
Returns:
true if successfully set, false if not.

Definition at line 260 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by shiftIndelsLeft(), and SamFilter::softClip().

00261 {
00262     myStatus = SamStatus::SUCCESS;
00263     myCigar = cigar;
00264  
00265     myIsBufferSynced = false;
00266     myIsCigarBufferValid = false;
00267     myCigarTempBufferLength = -1;
00268     myIsBinValid = false;
00269 
00270     // Initialize the calculated alignment info to the uncalculated value.
00271     myAlignmentLength = -1;
00272     myUnclippedStartOffset = -1;
00273     myUnclippedEndOffset = -1;
00274 
00275     return true;
00276 }
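
Setting a new CIGAR resets the cached alignment values to -1 because they depend on the CIGAR and must be recalculated. As an illustration of one such derived value, the sketch below computes how many reference bases a SAM CIGAR string spans, using the SAM convention that M, D, N, = and X consume the reference. The function referenceSpan is hypothetical and is not claimed to match how SamRecord computes myAlignmentLength.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: number of reference bases spanned by a SAM CIGAR
// string, or -1 if the string is malformed.  "*" (no CIGAR) spans 0 bases.
int referenceSpan(const std::string& cigar)
{
    if(cigar == "*")
    {
        return 0;
    }
    int span = 0;
    int count = 0;
    bool haveCount = false;
    for(char c : cigar)
    {
        if(std::isdigit(static_cast<unsigned char>(c)))
        {
            // Accumulate the operation length.
            count = count * 10 + (c - '0');
            haveCount = true;
            continue;
        }
        if(!haveCount)
        {
            // An operation must be preceded by a length.
            return -1;
        }
        switch(c)
        {
            case 'M': case 'D': case 'N': case '=': case 'X':
                span += count;   // these operations consume the reference
                break;
            case 'I': case 'S': case 'H': case 'P':
                break;           // these operations do not
            default:
                return -1;
        }
        count = 0;
        haveCount = false;
    }
    // A trailing length without an operation is malformed.
    return haveCount ? -1 : span;
}
```

For "5M2I3D4M" the span is 5 + 3 + 4 = 12 reference bases, because the 2-base insertion does not consume the reference.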

bool SamRecord::setFlag ( uint16_t  flag  ) 

Set the bitwise FLAG to the specified value.

Parameters:
flag integer flag to use.
Returns:
true if successfully set, false if not.

Definition at line 216 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by SamFilter::filterRead().

00217 {
00218     myStatus = SamStatus::SUCCESS;
00219     myRecordPtr->myFlag = flag;
00220     return true;
00221 }
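
Since setFlag() replaces the whole FLAG value, callers typically compose the value bit by bit before passing it in. A minimal sketch of that composition is shown below; the bit values are those defined by the SAM specification, while the constant and helper names are hypothetical and not part of this library.

```cpp
#include <cassert>
#include <cstdint>

// FLAG bit values from the SAM specification (names are hypothetical).
const uint16_t FLAG_PAIRED        = 0x1;
const uint16_t FLAG_PROPER_PAIR   = 0x2;
const uint16_t FLAG_UNMAPPED      = 0x4;
const uint16_t FLAG_REVERSE       = 0x10;
const uint16_t FLAG_FIRST_IN_PAIR = 0x40;
const uint16_t FLAG_DUPLICATE     = 0x400;

// Return the flag with the given bits turned on.
uint16_t setBits(uint16_t flag, uint16_t bits)
{
    return static_cast<uint16_t>(flag | bits);
}

// Return the flag with the given bits turned off.
uint16_t clearBits(uint16_t flag, uint16_t bits)
{
    return static_cast<uint16_t>(flag & ~bits);
}

// Return true if all of the given bits are set in the flag.
bool hasBits(uint16_t flag, uint16_t bits)
{
    return (flag & bits) == bits;
}
```

For example, a read that should be marked unmapped could have its current flag passed through `setBits(flag, FLAG_UNMAPPED)` and `clearBits(flag, FLAG_PROPER_PAIR)`, with the result handed to setFlag().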

bool SamRecord::setInsertSize ( int32_t  insertSize  ) 

Sets the inferred insert size (ISIZE)/observed template length (TLEN).

Parameters:
insertSize inferred insert size/observed template length.
Returns:
true if successfully set, false if not.

Definition at line 337 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

00338 {
00339     myStatus = SamStatus::SUCCESS;
00340     myRecordPtr->myInsertSize = insertSize;
00341     return true;
00342 }

bool SamRecord::setMapQuality ( uint8_t  mapQuality  ) 

Set the mapping quality (MAPQ).

Parameters:
mapQuality map quality to set in the record.
Returns:
true if successfully set, false if not.

Definition at line 252 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

00253 {
00254     myStatus = SamStatus::SUCCESS;
00255     myRecordPtr->myMapQuality = mapQuality;
00256     return true;
00257 }

bool SamRecord::setMateReferenceName ( SamFileHeader &  header,
const char *  mateReferenceName 
)

Set the mate/next fragment's reference sequence name (RNEXT) to the specified name, using the header to determine the mate reference id.

Parameters:
header SAM/BAM header to use to determine the mate reference id.
mateReferenceName mate reference name to use.
Returns:
true if successfully set, false if not.

Definition at line 298 of file SamRecord.cpp.

References SamFileHeader::getReferenceID() and StatGenStatus::SUCCESS.

00300 {
00301     myStatus = SamStatus::SUCCESS;
00302     // Set the mate reference.  If it is "=", set it to be equal
00303     // to myReferenceName.  This assumes that myReferenceName has already
00304     // been set.
00305     if(strcmp(mateReferenceName, FIELD_ABSENT_STRING) == 0)
00306     {
00307         myMateReferenceName = myReferenceName;
00308     }
00309     else
00310     {
00311         myMateReferenceName = mateReferenceName;
00312     }
00313 
00314     // Set the Mate Reference ID.
00315     // If the reference ID does not already exist, add it (pass true)
00316     myRecordPtr->myMateReferenceID = 
00317         header.getReferenceID(myMateReferenceName, true);
00318 
00319     return true;
00320 }
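
The two steps in the listing above (resolving "=" to the read's own reference name, then looking up the ID and adding it if missing) can be sketched in isolation. `MiniHeader` and `resolveMateName` are hypothetical stand-ins for illustration only; the real SamFileHeader stores far more than a name table. Treating "*" as ID -1 follows the SAM specification and is an assumption about how a header lookup would behave.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical miniature header: maps reference names to integer IDs.
class MiniHeader
{
public:
    // Return the ID for the name.  Unknown names get the next free ID when
    // addIfMissing is true, otherwise -1.  "*" (no reference) is always -1.
    int32_t getReferenceID(const std::string& name, bool addIfMissing)
    {
        if(name == "*")
        {
            return -1;
        }
        std::map<std::string, int32_t>::const_iterator it = myIds.find(name);
        if(it != myIds.end())
        {
            return it->second;
        }
        if(!addIfMissing)
        {
            return -1;
        }
        int32_t id = static_cast<int32_t>(myIds.size());
        myIds[name] = id;
        return id;
    }

private:
    std::map<std::string, int32_t> myIds;
};

// Resolve "=" to the read's own reference name; pass other names through.
std::string resolveMateName(const std::string& mateName,
                            const std::string& readReferenceName)
{
    return (mateName == "=") ? readReferenceName : mateName;
}
```

The sketch also shows why the ordering matters: if the read's reference name has not been set yet, "=" resolves to whatever stale value it currently holds.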

bool SamRecord::setQuality ( const char *  quality  ) 

Sets the quality (QUAL) to the specified SAM formatted quality string.

Internal processing handles the conversion between SAM and BAM formats when the record is read or written.

Parameters:
quality SAM quality string.
Returns:
true if successfully set, false if not.

Definition at line 358 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

00359 {
00360     myStatus = SamStatus::SUCCESS;
00361     myQuality = quality;
00362     myIsBufferSynced = false;
00363     myIsQualityBufferValid = false;
00364     return true;
00365 }
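
For the quality field, the SAM/BAM conversion amounts to a fixed offset. The sketch below assumes the encodings given in the SAM specification: SAM stores each quality as the ASCII character Phred+33, BAM stores the raw Phred value, and a missing quality string ("*") becomes 0xFF bytes in BAM. The function names are hypothetical and this is not the library's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Convert a SAM quality string to BAM quality bytes.
// seqLength is used only when the quality is absent ("*").
std::vector<uint8_t> samQualityToBam(const std::string& quality,
                                     std::size_t seqLength)
{
    if(quality == "*")
    {
        return std::vector<uint8_t>(seqLength, 0xFF);
    }
    std::vector<uint8_t> bam;
    bam.reserve(quality.size());
    for(char c : quality)
    {
        // Remove the Phred+33 ASCII offset.
        bam.push_back(static_cast<uint8_t>(c - 33));
    }
    return bam;
}

// Convert BAM quality bytes back to a SAM quality string.
std::string bamQualityToSam(const std::vector<uint8_t>& bam)
{
    if(bam.empty() || bam[0] == 0xFF)
    {
        return "*";
    }
    std::string quality;
    quality.reserve(bam.size());
    for(uint8_t q : bam)
    {
        quality.push_back(static_cast<char>(q + 33));
    }
    return quality;
}
```

Under these assumptions the SAM string "I5!" corresponds to the Phred values 40, 20 and 0.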

bool SamRecord::setReadName ( const char *  readName  ) 

Set QNAME to the passed in name.

Parameters:
readName read name to set the QNAME to.
Returns:
true if successfully set, false if not.

Definition at line 194 of file SamRecord.cpp.

References StatGenStatus::INVALID, StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

00195 {
00196     myReadName = readName;
00197     myIsBufferSynced = false;
00198     myIsReadNameBufferValid = false;
00199     myStatus = SamStatus::SUCCESS;
00200 
00201     // The read name must at least have some length, otherwise this is a parsing
00202     // error.
00203     if(myReadName.Length() == 0)
00204     {
00205         // Invalid - reset the read name and return false.
00206         myReadName = DEFAULT_READ_NAME;
00207         myRecordPtr->myReadNameLength = DEFAULT_READ_NAME_LENGTH;
00208         myStatus.setStatus(SamStatus::INVALID, "0 length Query Name.");
00209         return(false);
00210     }
00211 
00212     return true;
00213 }
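
The listing shows the general contract of the set methods: the status is updated on every call, and an invalid value leaves the field at a default rather than empty. A reduced sketch of that contract follows. `MiniRecord` is a hypothetical stand-in, the placeholder name "UNKNOWN" is chosen only for this example, and treating a null pointer as an empty name is a choice made here, not documented behavior.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a record that tracks a status and a read name.
struct MiniRecord
{
    enum Status { SUCCESS, INVALID };

    Status status = SUCCESS;
    std::string statusMessage;
    std::string readName = "UNKNOWN";   // placeholder used by this sketch

    // Set the read name; an empty name is rejected and replaced by the
    // placeholder, with the status describing the failure.
    bool setReadName(const char* name)
    {
        status = SUCCESS;
        statusMessage.clear();
        readName = (name != nullptr) ? name : "";
        if(readName.empty())
        {
            readName = "UNKNOWN";
            status = INVALID;
            statusMessage = "0 length Query Name.";
            return false;
        }
        return true;
    }
};
```

A caller would check the boolean result and consult the status only on failure, which mirrors how getStatus() is meant to be used after a set method returns false.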

void SamRecord::setReference ( GenomeSequence *  reference  ) 

Set the reference to the specified genome sequence object.

Parameters:
reference pointer to the GenomeSequence object.

Definition at line 179 of file SamRecord.cpp.

Referenced by SamFile::GetNumOverlaps(), SamFile::ReadRecord(), SamFile::validateSortOrder(), and SamFile::WriteRecord().

00180 {
00181     myRefPtr = reference;
00182 }

bool SamRecord::setReferenceName ( SamFileHeader &  header,
const char *  referenceName 
)

Set the reference sequence name (RNAME) to the specified name, using the header to determine the reference id.

Parameters:
header SAM/BAM header to use to determine the reference id.
referenceName reference name to use.
Returns:
true if successfully set, false if not.

Definition at line 224 of file SamRecord.cpp.

References SamFileHeader::getReferenceID() and StatGenStatus::SUCCESS.

00226 {
00227     myStatus = SamStatus::SUCCESS;
00228 
00229     myReferenceName = referenceName;
00230     // If the reference ID does not already exist, add it (pass true)
00231     myRecordPtr->myReferenceID = header.getReferenceID(referenceName, true);
00232 
00233     return true;
00234 }

bool SamRecord::setSequence ( const char *  seq  ) 

Sets the sequence (SEQ) to the specified SAM formatted sequence string.

Internal processing handles the conversion between SAM and BAM formats when the record is read or written.

Parameters:
seq SAM sequence string. May contain '='.
Returns:
true if successfully set, false if not.

Definition at line 345 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

00346 {
00347     myStatus = SamStatus::SUCCESS;
00348     mySequence = seq;
00349     mySeqWithEq.clear();
00350     mySeqWithoutEq.clear();
00351    
00352     myIsBufferSynced = false;
00353     myIsSequenceBufferValid = false;
00354     return true;
00355 }
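
For the sequence field, the BAM side of the conversion packs two bases into each byte. The sketch below assumes the encoding from the SAM specification: each base maps to its index in "=ACMGRSVTWYHKDBN", the first base of each pair occupies the high 4 bits, and unrecognized characters are stored as N. The function name `packSequence` is hypothetical and this is not the library's implementation.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Pack a SAM sequence string into BAM 4-bit codes, two bases per byte.
// Returns an empty vector when the sequence is absent ("*").
std::vector<uint8_t> packSequence(const std::string& seq)
{
    static const std::string CODES = "=ACMGRSVTWYHKDBN";
    if(seq == "*")
    {
        return std::vector<uint8_t>();
    }
    std::vector<uint8_t> packed((seq.size() + 1) / 2, 0);
    for(std::size_t i = 0; i < seq.size(); ++i)
    {
        char base = static_cast<char>(
            std::toupper(static_cast<unsigned char>(seq[i])));
        std::string::size_type code = CODES.find(base);
        if(code == std::string::npos)
        {
            code = 15;   // Unknown characters are stored as N.
        }
        if(i % 2 == 0)
        {
            // Even index: high 4 bits of the byte.
            packed[i / 2] |= static_cast<uint8_t>(code << 4);
        }
        else
        {
            // Odd index: low 4 bits of the byte.
            packed[i / 2] |= static_cast<uint8_t>(code);
        }
    }
    return packed;
}
```

Under these assumptions "ACGT" packs to the bytes 0x12 and 0x48, and an odd-length sequence leaves the low 4 bits of the last byte as zero.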

void SamRecord::setSequenceTranslation ( SequenceTranslation  translation  ) 

Set the type of sequence translation to use when getting the sequence.

The default type (if this method is never called) is NONE (the sequence is left as-is). This setting can be overridden by using the accessors that take a SequenceTranslation parameter.

Parameters:
translation type of sequence translation to use.

Definition at line 188 of file SamRecord.cpp.

Referenced by SamFile::GetNumOverlaps(), SamFile::ReadRecord(), and SamFile::validateSortOrder().

00189 {
00190     mySequenceTranslation = translation;
00191 }
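
The effect of the three translation settings can be sketched on plain strings. This example assumes, as the enum names suggest, that EQUAL replaces bases matching the reference with '=' and that BASES replaces '=' with the reference base; it further assumes an ungapped alignment, so that read position i lines up with reference position i. The function `translate` and the `Translation` enum are hypothetical and independent of SamRecord.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Mirrors the names of SamRecord::SequenceTranslation for this sketch only.
enum Translation { NONE, EQUAL, BASES };

// Translate a read against an aligned reference of the same length.
// Assumes an ungapped alignment (no insertions, deletions or clips).
std::string translate(const std::string& read,
                      const std::string& reference,
                      Translation translation)
{
    std::string result = read;
    if(translation == NONE)
    {
        return result;
    }
    for(std::size_t i = 0; i < result.size() && i < reference.size(); ++i)
    {
        if(translation == EQUAL)
        {
            // Replace bases that match the reference with '='.
            if(std::toupper(static_cast<unsigned char>(result[i])) ==
               std::toupper(static_cast<unsigned char>(reference[i])))
            {
                result[i] = '=';
            }
        }
        else if(result[i] == '=')
        {
            // BASES: replace '=' with the reference base.
            result[i] = reference[i];
        }
    }
    return result;
}
```

With reference "ACCT", the read "ACGT" becomes "==G=" under EQUAL, and "==G=" becomes "ACGT" again under BASES.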

bool SamRecord::shiftIndelsLeft (  ) 

Shift the indels (if any) to the left by updating the CIGAR.

Returns:
true if the cigar was shifted, false if not.

Definition at line 369 of file SamRecord.cpp.

References BASES, Cigar::foundInQuery(), getSequence(), CigarRoller::IncrementCount(), Cigar::insert, Cigar::isMatchOrMismatch(), CigarRoller::Remove(), setCigar(), Cigar::size(), and CigarRoller::Update().

00370 {
00371     // Check to see whether or not the Cigar has already been
00372     // set - this is determined by checking if alignment length
00373     // is set since alignment length and the cigar are set
00374     // at the same time.
00375     if(myAlignmentLength == -1)
00376     {
00377         // Not been set, so calculate it.
00378         parseCigar();
00379     }
00380     
00381     // Track whether or not there was a shift.
00382     bool shifted = false;
00383 
00384     // Cigar is set, so now myCigarRoller can be used.
00385     // Track where in the read we are.
00386     uint32_t currentPos = 0;
00387 
00388     // Since the loop starts at 1 because the first operation can't be shifted,
00389     // increment the currentPos past the first operation.
00390     if(Cigar::foundInQuery(myCigarRoller[0]))
00391     {
00392         // This op was found in the read, increment the current position.
00393         currentPos += myCigarRoller[0].count;
00394     }
00395    
00396     int numOps = myCigarRoller.size();
00397     
00398     // Loop through the cigar operations from the 2nd operation since
00399     // the first operation is already on the end and can't shift.
00400     for(int currentOp = 1; currentOp < numOps; currentOp++)
00401     {
00402         if(myCigarRoller[currentOp].operation == Cigar::insert)
00403         {
00404             // For now, only shift a max of 1 operation.
00405             int prevOpIndex = currentOp-1;
00406             // Track the next op for seeing if it is the same as the
00407             // previous for merging reasons.
00408             int nextOpIndex = currentOp+1;
00409             if(nextOpIndex == numOps)
00410             {
00411                 // There is no next op, so set it equal to the current one.
00412                 nextOpIndex = currentOp;
00413             }
00414             // The start of the previous operation, so we know when we hit it
00415             // so we don't shift past it.
00416             uint32_t prevOpStart = 
00417                 currentPos - myCigarRoller[prevOpIndex].count;
00418 
00419             // We can only shift if the previous operation is a match/mismatch.
00420             if(!Cigar::isMatchOrMismatch(myCigarRoller[prevOpIndex]))
00421             {
00422                 // TODO - shift past pads
00423                 // An insert is in the read, so increment the position.
00424                 currentPos += myCigarRoller[currentOp].count;                 
00425                 // Not a match/mismatch, so can't shift into it.
00426                 continue;
00427             }
00428                     
00429             // It is a match or mismatch, so check to see if we can
00430             // shift into it.
00431 
00432             // The end of the insert is calculated by adding the size
00433             // of this insert minus 1 to the start of the insert.
00434             uint32_t insertEndPos = 
00435                 currentPos + myCigarRoller[currentOp].count - 1;
00436                 
00437             // The insert starts at the current position.
00438             uint32_t insertStartPos = currentPos;
00439                 
00440             // Loop as long as the position before the insert start
00441             // matches the last character in the insert. If they match,
00442             // the insert can be shifted one index left because the
00443             // implied reference will not change.  If they do not match,
00444             // we can't shift because the implied reference would change.
00445             // Stop loop when insertStartPos = prevOpStart, because we 
00446             // don't want to move past that.
00447             while((insertStartPos > prevOpStart) && 
00448                   (getSequence(insertEndPos,BASES) == 
00449                    getSequence(insertStartPos - 1, BASES)))
00450             {
00451                 // We can shift, so move the insert start & end one left.
00452                 --insertEndPos;
00453                 --insertStartPos;
00454             }
00455 
00456             // Determine if a shift has occurred.
00457             int shiftLen = currentPos - insertStartPos;
00458             if(shiftLen > 0)
00459             {
00460                 // Shift occurred, so adjust the cigar if the cigar will
00461                 // not become more operations.
00462                 // If the next operation is the same as the previous or
00463                 // if the insert and the previous operation switch positions
00464                 // then the cigar has the same number of operations.
00465                 // If the next operation is different, and the shift splits
00466                 // the previous operation in 2, then the cigar would
00467                 // become longer, so we do not want to shift.
00468                 if(myCigarRoller[nextOpIndex].operation == 
00469                    myCigarRoller[prevOpIndex].operation)
00470                 {
00471                     // The operations are the same, so merge them by adding
00472                     // the length of the shift to the next operation.
00473                     myCigarRoller.IncrementCount(nextOpIndex, shiftLen);
00474                     myCigarRoller.IncrementCount(prevOpIndex, -shiftLen);
00475 
00476                     // If the previous op length is 0, just remove that
00477                     // operation.
00478                     if(myCigarRoller[prevOpIndex].count == 0)
00479                     {
00480                         myCigarRoller.Remove(prevOpIndex);
00481                     }
00482                     shifted = true;
00483                 } 
00484                 else
00485                 {
00486                     // Can only shift if the insert shifts past the
00487                     // entire previous operation, otherwise an operation
00488                     // would need to be added.
00489                     if(insertStartPos == prevOpStart)
00490                     { 
00491                         // Swap the positions of the insert and the
00492                         // previous operation.
00493                         myCigarRoller.Update(currentOp,
00494                                              myCigarRoller[prevOpIndex].operation,
00495                                              myCigarRoller[prevOpIndex].count);
00496                         // Size of the previous op is the entire
00497                         // shift length.
00498                         myCigarRoller.Update(prevOpIndex, 
00499                                              Cigar::insert,
00500                                              shiftLen);
00501                         shifted = true;
00502                     }
00503                 }
00504             }
00505             // An insert is in the read, so increment the position.
00506             currentPos += myCigarRoller[currentOp].count;                 
00507         }
00508         else if(Cigar::foundInQuery(myCigarRoller[currentOp]))
00509         {
00510             // This op was found in the read, increment the current position.
00511             currentPos += myCigarRoller[currentOp].count;
00512         }
00513     }
00514     if(shifted)
00515     {
00516         // TODO - setCigar is currently inefficient because later the cigar
00517         // roller will be recalculated, but for now it will work.
00518         setCigar(myCigarRoller);
00519     }
00520     return(shifted);
00521 }
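
The core of the listing is the loop that decides how far an insertion can move left without changing the implied reference. That loop can be isolated as below. This is a simplified sketch operating on a plain read string; the function name `leftShiftLength` and its parameters are hypothetical, and the CIGAR bookkeeping (merging or swapping operations) is deliberately left out.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Return how many positions an insertion can be shifted left.
//   read          read bases
//   insertStart   0-based read index of the first inserted base
//   insertLength  number of inserted bases (must be at least 1)
//   prevOpStart   0-based read index where the preceding operation starts
uint32_t leftShiftLength(const std::string& read,
                         uint32_t insertStart,
                         uint32_t insertLength,
                         uint32_t prevOpStart)
{
    assert(insertLength >= 1);
    assert(insertStart + insertLength <= read.size());

    uint32_t start = insertStart;
    uint32_t end = insertStart + insertLength - 1;

    // Shift while the base just before the insert equals the last inserted
    // base, and never move past the start of the preceding operation.
    while((start > prevOpStart) && (read[end] == read[start - 1]))
    {
        --start;
        --end;
    }
    return insertStart - start;
}
```

For the read "GATCTCTCAG" with CIGAR 6M2I2M, the insert starts at index 6 and the preceding match starts at index 0, so the sketch reports a shift of 4; following the merge rule in the listing, that corresponds to the CIGAR 2M2I6M.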

SamStatus::Status SamRecord::writeRecordBuffer ( IFILE  filePtr,
SequenceTranslation  translation 
)

Write the record in BAM format to the specified, already opened file, using the specified translation on the sequence.

Parameters:
filePtr file to write the BAM record into.
translation type of sequence translation to use.
Returns:
status of the write.

Definition at line 1194 of file SamRecord.cpp.

References StatGenStatus::FAIL_IO, StatGenStatus::FAIL_ORDER, StatGenStatus::getStatus(), ifwrite(), InputFile::isOpen(), StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

01196 {
01197     myStatus = SamStatus::SUCCESS;
01198     if((filePtr == NULL) || (filePtr->isOpen() == false))
01199     {
01200         // File is not open, return failure.
01201         myStatus.setStatus(SamStatus::FAIL_ORDER,
01202                            "Can't write to an unopened file.");
01203         return(SamStatus::FAIL_ORDER);
01204     }
01205 
01206     if((myIsBufferSynced == false) ||
01207        (myBufferSequenceTranslation != translation))
01208     {
01209         if(!fixBuffer(translation))
01210         {
01211             return(myStatus.getStatus());
01212         }
01213     }
01214 
01215     // Write the record.
01216     unsigned int numBytesToWrite = myRecordPtr->myBlockSize + sizeof(int32_t);
01217     unsigned int numBytesWritten = 
01218         ifwrite(filePtr, myRecordPtr, numBytesToWrite);
01219 
01220     // Return status based on if the correct number of bytes were written.
01221     if(numBytesToWrite == numBytesWritten)
01222     {
01223         return(SamStatus::SUCCESS);
01224     }
01225     // The correct number of bytes were not written.
01226     myStatus.setStatus(SamStatus::FAIL_IO, "Failed to write the entire record.");
01227     return(SamStatus::FAIL_IO);
01228 }
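
The size calculation in the listing reflects the BAM record layout: the buffer begins with a 32-bit block size that counts only the bytes following it, so the number of bytes to write is the block size plus the 4 bytes of the size field itself. The sketch below reproduces that check with a standard output stream in place of IFILE. All names are hypothetical, and the block size is read in host byte order, which matches BAM's little-endian layout only on little-endian machines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <ostream>
#include <sstream>
#include <vector>

enum WriteStatus { WRITE_SUCCESS, WRITE_FAIL_IO };

// Total bytes occupied by a record: the size field plus the block it describes.
std::size_t recordByteCount(int32_t blockSize)
{
    return static_cast<std::size_t>(blockSize) + sizeof(int32_t);
}

// Write one record buffer to the stream and report whether it all went out.
WriteStatus writeRecord(std::ostream& out, const std::vector<char>& buffer)
{
    if(buffer.size() < sizeof(int32_t))
    {
        return WRITE_FAIL_IO;
    }
    // Read the leading block size (host byte order assumed in this sketch).
    int32_t blockSize = 0;
    std::memcpy(&blockSize, buffer.data(), sizeof(blockSize));
    if(blockSize < 0)
    {
        return WRITE_FAIL_IO;
    }
    std::size_t numBytesToWrite = recordByteCount(blockSize);
    if(numBytesToWrite > buffer.size())
    {
        // The buffer is shorter than its own size field claims.
        return WRITE_FAIL_IO;
    }
    out.write(buffer.data(), static_cast<std::streamsize>(numBytesToWrite));
    return out.good() ? WRITE_SUCCESS : WRITE_FAIL_IO;
}
```

A buffer holding a block size of 6 followed by 6 payload bytes therefore results in 10 bytes written.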

SamStatus::Status SamRecord::writeRecordBuffer ( IFILE  filePtr  ) 

Write the record in BAM format to the specified, already opened file.

Parameters:
filePtr file to write the BAM record into.
Returns:
status of the write.

Definition at line 1187 of file SamRecord.cpp.

01188 {
01189     return(writeRecordBuffer(filePtr, mySequenceTranslation));
01190 }


The documentation for this class was generated from the following files:
SamRecord.h
SamRecord.cpp
Generated on Mon Feb 11 13:45:28 2013 for libStatGen Software by  doxygen 1.6.3