libStatGen Software  1
SamRecord Class Reference

Class providing an easy to use interface to get/set/operate on the fields in a SAM/BAM record.

#include <SamRecord.h>


Public Types

enum  SequenceTranslation { NONE, EQUAL, BASES }
 Enum containing the settings on how to translate the sequence if a reference is available.

Public Member Functions

 SamRecord ()
 Default Constructor.
 SamRecord (ErrorHandler::HandlingType errorHandlingType)
 Constructor that sets the error handling type.
 ~SamRecord ()
void resetRecord ()
 Reset the fields of the record to a default value.
bool isValid (SamFileHeader &header)
 Returns whether or not the record is valid, setting the status to indicate success or failure.
void setReference (GenomeSequence *reference)
 Set the reference to the specified genome sequence object.
void setSequenceTranslation (SequenceTranslation translation)
 Set the type of sequence translation to use when getting the sequence.
const SamStatus & getStatus ()
 Returns the status associated with the last method that sets the status.
Set Alignment Data

Set methods for record fields.

All of the "set" methods set the status to indicate success or the failure reason.

bool setReadName (const char *readName)
 Set QNAME to the passed in name.
bool setFlag (uint16_t flag)
 Set the bitwise FLAG to the specified value.
bool setReferenceName (SamFileHeader &header, const char *referenceName)
 Set the reference sequence name (RNAME) to the specified name, using the header to determine the reference id.
bool set1BasedPosition (int32_t position)
 Set the leftmost position (POS) using the specified 1-based (SAM format) value.
bool set0BasedPosition (int32_t position)
 Set the leftmost position using the specified 0-based (BAM format) value.
bool setMapQuality (uint8_t mapQuality)
 Set the mapping quality (MAPQ).
bool setCigar (const char *cigar)
 Set the CIGAR to the specified SAM formatted cigar string.
bool setCigar (const Cigar &cigar)
 Set the CIGAR to the specified Cigar object.
bool setMateReferenceName (SamFileHeader &header, const char *mateReferenceName)
 Set the mate/next fragment's reference sequence name (RNEXT) to the specified name, using the header to determine the mate reference id.
bool set1BasedMatePosition (int32_t matePosition)
 Set the mate/next fragment's leftmost position (PNEXT) using the specified 1-based (SAM format) value.
bool set0BasedMatePosition (int32_t matePosition)
 Set the mate/next fragment's leftmost position using the specified 0-based (BAM format) value.
bool setInsertSize (int32_t insertSize)
 Sets the inferred insert size (ISIZE)/observed template length (TLEN).
bool setSequence (const char *seq)
 Sets the sequence (SEQ) to the specified SAM formatted sequence string.
bool setQuality (const char *quality)
 Sets the quality (QUAL) to the specified SAM formatted quality string.
bool shiftIndelsLeft ()
 Shift the indels (if any) to the left by updating the CIGAR.
SamStatus::Status setBuffer (const char *fromBuffer, uint32_t fromBufferSize, SamFileHeader &header)
 Sets the SamRecord to contain the information in the BAM formatted fromBuffer.
SamStatus::Status setBufferFromFile (IFILE filePtr, SamFileHeader &header)
 Read the BAM record from a file.
Set Tag Data

Set methods for tags.

bool addIntTag (const char *tag, int32_t value)
 Add the specified integer tag to the record.
bool addTag (const char *tag, char vtype, const char *value)
 Add the specified tag,vtype,value to the record.
void clearTags ()
 Clear the tags in this record.
bool rmTag (const char *tag, char type)
 Remove a tag.
bool rmTags (const char *tags)
 Remove tags.
Get Alignment Data

Get methods for record fields.

All of the "get" methods set the status to indicate success or the failure reason.

const void * getRecordBuffer ()
 Get a const pointer to the buffer that contains the BAM representation of the record.
const void * getRecordBuffer (SequenceTranslation translation)
 Get a const pointer to the buffer that contains the BAM representation of the record using the specified translation on the sequence.
SamStatus::Status writeRecordBuffer (IFILE filePtr)
 Write the record as a BAM into the specified already opened file.
SamStatus::Status writeRecordBuffer (IFILE filePtr, SequenceTranslation translation)
 Write the record as a BAM into the specified already opened file using the specified translation on the sequence.
int32_t getBlockSize ()
 Get the block size of the record (BAM format).
const char * getReferenceName ()
 Get the reference sequence name (RNAME) of the record.
int32_t getReferenceID ()
 Get the reference sequence id of the record (BAM format rid).
int32_t get1BasedPosition ()
 Get the 1-based(SAM) leftmost position (POS) of the record.
int32_t get0BasedPosition ()
 Get the 0-based(BAM) leftmost position of the record.
uint8_t getReadNameLength ()
 Get the length of the readname (QNAME) including the null.
uint8_t getMapQuality ()
 Get the mapping quality (MAPQ) of the record.
uint16_t getBin ()
 Get the BAM bin for the record.
uint16_t getCigarLength ()
 Get the length of the BAM formatted CIGAR.
uint16_t getFlag ()
 Get the flag (FLAG).
int32_t getReadLength ()
 Get the length of the read.
const char * getMateReferenceName ()
 Get the mate/next fragment's reference sequence name (RNEXT).
const char * getMateReferenceNameOrEqual ()
 Get the mate/next fragment's reference sequence name (RNEXT), returning "=" if it is the same as the reference name, unless they are both "*" in which case "*" is returned.
int32_t getMateReferenceID ()
 Get the mate reference id of the record (BAM format: mate_rid/next_refID).
int32_t get1BasedMatePosition ()
 Get the 1-based(SAM) leftmost mate/next fragment's position (PNEXT).
int32_t get0BasedMatePosition ()
 Get the 0-based(BAM) leftmost mate/next fragment's position.
int32_t getInsertSize ()
 Get the inferred insert size of the read pair (ISIZE) or observed template length (TLEN).
int32_t get0BasedAlignmentEnd ()
 Returns the 0-based inclusive rightmost position of the clipped sequence.
int32_t get1BasedAlignmentEnd ()
 Returns the 1-based inclusive rightmost position of the clipped sequence.
int32_t getAlignmentLength ()
 Returns the length of the clipped sequence, returning 0 if the cigar is '*'.
int32_t get0BasedUnclippedStart ()
 Returns the 0-based inclusive left-most position adjusted for clipped bases.
int32_t get1BasedUnclippedStart ()
 Returns the 1-based inclusive left-most position adjusted for clipped bases.
int32_t get0BasedUnclippedEnd ()
 Returns the 0-based inclusive right-most position adjusted for clipped bases.
int32_t get1BasedUnclippedEnd ()
 Returns the 1-based inclusive right-most position adjusted for clipped bases.
const char * getReadName ()
 Returns the SAM formatted Read Name (QNAME).
const char * getCigar ()
 Returns the SAM formatted CIGAR string.
const char * getSequence ()
 Returns the SAM formatted sequence string (SEQ), translating the base as specified by setSequenceTranslation.
const char * getSequence (SequenceTranslation translation)
 Returns the SAM formatted sequence string (SEQ) performing the specified sequence translation.
const char * getQuality ()
 Returns the SAM formatted quality string (QUAL).
char getSequence (int index)
 Get the sequence base at the specified index into this sequence 0 to readLength - 1, translating the base as specified by setSequenceTranslation.
char getSequence (int index, SequenceTranslation translation)
 Get the sequence base at the specified index into this sequence 0 to readLength - 1 performing the specified sequence translation.
char getQuality (int index)
 Get the quality character at the specified index into the quality 0 to readLength - 1.
Cigar * getCigarInfo ()
 Returns a pointer to the Cigar object associated with this record.
uint32_t getNumOverlaps (int32_t start, int32_t end)
 Return the number of bases in this read that overlap the passed in region.
bool getFields (bamRecordStruct &recStruct, String &readName, String &cigar, String &sequence, String &quality)
 Returns the values of all fields except the tags.
bool getFields (bamRecordStruct &recStruct, String &readName, String &cigar, String &sequence, String &quality, SequenceTranslation translation)
 Returns the values of all fields except the tags using the specified sequence translation.
GenomeSequence * getReference ()
 Returns a pointer to the genome sequence object associated with this record if it was set (NULL if it was not set).

Get Tag Methods

Get methods for obtaining information on tags.

uint32_t getTagLength ()
 Returns the length of the BAM formatted tags.
bool getNextSamTag (char *tag, char &vtype, void **value)
 Get the next tag from the record.
void resetTagIter ()
 Reset the tag iterator to the beginning of the tags.
bool getTagsString (const char *tags, String &returnString, char delim= '\t')
 Get the string representation of the tags from the record, formatted as TAG:TYPE:VALUE<delim>TAG:TYPE:VALUE...
const String * getStringTag (const char *tag)
 Get the string value for the specified tag.
int * getIntegerTag (const char *tag)
 Get the integer value for the specified tag, DEPRECATED, use one that returns a bool (success/failure).
bool getIntegerTag (const char *tag, int &tagVal)
 Get the integer value for the specified tag.
bool getFloatTag (const char *tag, float &tagVal)
 Get the float value for the specified tag.
const String & getString (const char *tag)
 Get the string value for the specified tag.
int & getInteger (const char *tag)
 Get the integer value for the specified tag, DEPRECATED, use getIntegerTag that returns a bool.
bool checkString (const char *tag)
 Check if the specified tag contains a string.
bool checkInteger (const char *tag)
 Check if the specified tag contains an integer.
bool checkFloat (const char *tag)
 Check if the specified tag contains a float.
bool checkTag (const char *tag, char type)
 Check if the specified tag contains a value of the specified vtype.
static bool isIntegerType (char vtype)
 Returns whether or not the specified vtype is an integer type.
static bool isFloatType (char vtype)
 Returns whether or not the specified vtype is a float type.
static bool isCharType (char vtype)
 Returns whether or not the specified vtype is a char type.
static bool isStringType (char vtype)
 Returns whether or not the specified vtype is a string type.

Detailed Description

Class providing an easy to use interface to get/set/operate on the fields in a SAM/BAM record.

Definition at line 51 of file SamRecord.h.

Member Enumeration Documentation

enum SamRecord::SequenceTranslation

Enum containing the settings on how to translate the sequence if a reference is available.

If no reference is available, no translation is done.

Enumerator:
    NONE   Leave the sequence as is.
    EQUAL  Translate bases that match the reference to '='.
    BASES  Translate '=' to the actual base.

Definition at line 57 of file SamRecord.h.

        NONE,   ///< Leave the sequence as is.
        EQUAL,  ///< Translate bases that match the reference to '='
        BASES,  ///< Translate '=' to the actual base.
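
As a rough illustration of the three settings, the sketch below applies them to a read whose bases line up one-to-one with the reference. The `Translation` enum and `translateSequence` function are hypothetical stand-ins and not part of libStatGen; the one-to-one alignment (no indels, no clips, same letter case) is a simplifying assumption.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for SamRecord::SequenceTranslation.
enum class Translation { NONE, EQUAL, BASES };

// Translate `seq` against `ref`, where ref[i] is the reference base
// aligned to seq[i].  Assumption for this sketch: the read has no
// indels or clips, so both strings have the same length.
std::string translateSequence(const std::string& seq,
                              const std::string& ref,
                              Translation mode)
{
    assert(seq.size() == ref.size());
    std::string out = seq;
    for (std::size_t i = 0; i < out.size(); ++i)
    {
        if (mode == Translation::EQUAL && out[i] == ref[i])
        {
            out[i] = '=';      // base matches the reference
        }
        else if (mode == Translation::BASES && out[i] == '=')
        {
            out[i] = ref[i];   // restore the actual base
        }
        // Translation::NONE leaves every base untouched.
    }
    return out;
}
```

With a read of `ACGT` and a reference of `ACCT`, the EQUAL setting would yield `==G=`, and applying BASES to `==G=` would give back `ACGT`.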

Constructor & Destructor Documentation

SamRecord::SamRecord ( ErrorHandler::HandlingType errorHandlingType )

Constructor that sets the error handling type.

Parameters:
    errorHandlingType  how to handle errors.

Definition at line 53 of file SamRecord.cpp.

References resetRecord().

    : myStatus(errorHandlingType),
    int32_t defaultAllocSize = DEFAULT_BLOCK_SIZE + sizeof(int32_t);

    myRecordPtr = 
        (bamRecordStruct *) malloc(defaultAllocSize);

    myCigarTempBuffer = NULL;
    myCigarTempBufferAllocatedSize = 0;

    allocatedSize = defaultAllocSize;


Member Function Documentation

bool SamRecord::addIntTag ( const char * tag, int32_t value )

Add the specified integer tag to the record.

Internal processing handles switching between SAM/BAM formats when read/written and determining the type for BAM format. If the tag is already there this code will replace it if the specified value is different.

Parameters:
    tag    two character tag to be added to the SAM/BAM record.
    value  value for the specified tag.
Returns:
    true if the tag was successfully added, false otherwise.

Definition at line 635 of file SamRecord.cpp.

References StatGenStatus::INVALID, StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

Referenced by addTag().

    myStatus = SamStatus::SUCCESS;
    int key = 0;
    int index = 0;
    char bamvtype;

    int tagBufferSize = 0;

    // First check to see if the tags need to be synced to the buffer.
            // Failed to read tags from the buffer, so cannot add new ones.

    // Ints come in as int.  But it can be represented in fewer bits.
    // So determine a more specific type that is in line with the
    // types for BAM files.
    // First check to see if it is a negative.
    if(value < 0)
        // The int is negative, so it will need to use a signed type.
        // See if it is greater than the min value for a char.
        if(value > ((std::numeric_limits<char>::min)()))
            // It can be stored in a signed char.
            bamvtype = 'c';
            tagBufferSize += 4;
        else if(value > ((std::numeric_limits<short>::min)()))
            // It fits in a signed short.
            bamvtype = 's';
            tagBufferSize += 5;
            // Just store it as a signed int.
            bamvtype = 'i';
            tagBufferSize += 7;
        // It is positive, so an unsigned type can be used.
        if(value < ((std::numeric_limits<unsigned char>::max)()))
            // It is under the max of an unsigned char.
            bamvtype = 'C';
            tagBufferSize += 4;
        else if(value < ((std::numeric_limits<unsigned short>::max)()))
            // It is under the max of an unsigned short.
            bamvtype = 'S';
            tagBufferSize += 5;
            // Just store it as an unsigned int.
            bamvtype = 'I';
            tagBufferSize += 7;

    // Check to see if the tag is already there.
    key = MAKEKEY(tag[0], tag[1], bamvtype);
    unsigned int hashIndex = extras.Find(key);
    if(hashIndex != LH_NOTFOUND)
        // Tag was already found.
        index = extras[hashIndex];
        // Since the tagBufferSize was already updated with the new value,
        // subtract the size for the previous tag (even if they are the same).
            case 'c':
            case 'C':
            case 'A':
                tagBufferSize -= 4;
            case 's':
            case 'S':
                tagBufferSize -= 5;
            case 'i':
            case 'I':
                tagBufferSize -= 7;
                                   "unknown tag inttype type found.\n");
        // Tag already existed, print message about overwriting.
        // WARN about dropping duplicate tags.
        if(myNumWarns++ < myMaxWarns)
            String newVal;
            String origVal;
            appendIntArrayValue(index, origVal);
            appendIntArrayValue(bamvtype, value, newVal);
            fprintf(stderr, "WARNING: Duplicate Tags, overwritting %c%c:%c:%s with %c%c:%c:%s\n",
                    tag[0], tag[1], intType[index], origVal.c_str(), tag[0], tag[1], bamvtype, newVal.c_str());
            if(myNumWarns == myMaxWarns)
                fprintf(stderr, "Suppressing rest of Duplicate Tag warnings.\n");

        // Update the integer value and type.
        integers[index] = value;
        intType[index] = bamvtype;
        // Tag is not already there, so add it.
        index = integers.Length();

        extras.Add(key, index);

    // The buffer tags are now out of sync.
    myNeedToSetTagsInBuffer = true;
    myIsTagsBufferValid = false;
    myIsBufferSynced = false;
    myTagBufferSize += tagBufferSize;
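
The type selection in the listing above can be sketched in isolation. `BamIntTag` and `chooseBamIntType` are hypothetical names introduced for this example; the strict comparisons mirror the listing, so a value exactly at a type's limit falls through to the next wider type. One deliberate difference: the sketch uses the fixed-width `int8_t`/`int16_t` limits rather than `char`/`short`, on the assumption that the intent is a signed 8-bit and 16-bit range.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical result type: the BAM value type character and the number
// of bytes the tag occupies (2 tag chars + 1 type char + payload).
struct BamIntTag
{
    char type;
    int  size;
};

// Pick the narrowest BAM integer type for `value`, using the same
// strict comparisons as the listing above.
BamIntTag chooseBamIntType(std::int32_t value)
{
    if (value < 0)
    {
        // Negative values need a signed type.
        if (value > std::numeric_limits<std::int8_t>::min())  return {'c', 4};
        if (value > std::numeric_limits<std::int16_t>::min()) return {'s', 5};
        return {'i', 7};
    }
    // Non-negative values can use an unsigned type.
    if (value < std::numeric_limits<std::uint8_t>::max())  return {'C', 4};
    if (value < std::numeric_limits<std::uint16_t>::max()) return {'S', 5};
    return {'I', 7};
}

// Small usage example.
void chooseBamIntTypeExamples()
{
    assert(chooseBamIntType(-5).type == 'c');
    assert(chooseBamIntType(300).type == 'S');
    assert(chooseBamIntType(70000).size == 7);
}
```

Because the comparisons are strict, 255 is stored as 'S' rather than 'C' and -128 as 's' rather than 'c'; this costs one extra byte at the boundary but never loses data.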

bool SamRecord::addTag ( const char * tag, char vtype, const char * value )

Add the specified tag,vtype,value to the record.

Vtype can be SAM/BAM format. Internal processing handles switching between SAM/BAM formats when read/written. If the tag is already there this code will replace it if the specified value is different.

Parameters:
    tag    two character tag to be added to the SAM/BAM record.
    vtype  vtype of the specified value, in either SAM or BAM format.
    value  value as a string for the specified tag.
Returns:
    true if the tag was successfully added, false otherwise.

Definition at line 779 of file SamRecord.cpp.

References addIntTag(), StatGenStatus::FAIL_PARSE, StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

    if(vtype == 'i')
        // integer type.  Call addIntTag to handle it.
        int intVal = atoi(valuePtr);
        return(addIntTag(tag, intVal));

    // Non-int type.
    myStatus = SamStatus::SUCCESS;
    bool status = true; // default to successful.
    int key = 0;
    int index = 0;

    int tagBufferSize = 0;

    // First check to see if the tags need to be synced to the buffer.
            // Failed to read tags from the buffer, so cannot add new ones.

    // First check to see if the tag is already there.
    key = MAKEKEY(tag[0], tag[1], vtype);
    unsigned int hashIndex = extras.Find(key);
    if(hashIndex != LH_NOTFOUND)
        // The key was found in the hash, so get the lookup index.
        index = extras[hashIndex];

        String origTag;
        char origType = vtype;

        // Adjust the currently pointed to value to the new setting.
        switch (vtype)
            case 'A' :
                // First check to see if the value changed.
                if((integers[index] == (const int)*(valuePtr)) &&
                   (intType[index] == vtype))
                    // The value & type has not changed, so do nothing.
                    // Tag buffer size changes if type changes, so subtract & add.
                    origType = intType[index];
                    appendIntArrayValue(index, origTag);
                    tagBufferSize -= getNumericTagTypeSize(intType[index]);
                    tagBufferSize += getNumericTagTypeSize(vtype);
                    integers[index] = (const int)*(valuePtr);
                    intType[index] = vtype;
            case 'Z' :
                // First check to see if the value changed.
                if(strings[index] == valuePtr)
                    // The value has not changed, so do nothing.
                    // Adjust the tagBufferSize by removing the size of the old string.
                    origTag = strings[index];
                    tagBufferSize -= strings[index].Length();
                    strings[index] = valuePtr;
                    // Adjust the tagBufferSize by adding the size of the new string.
                    tagBufferSize += strings[index].Length();
            case 'B' :
                // First check to see if the value changed.
                if(strings[index] == valuePtr)
                    // The value has not changed, so do nothing.
                    // Adjust the tagBufferSize by removing the size of the old field.
                    origTag = strings[index];
                    tagBufferSize -= getBtagBufferSize(strings[index]);
                    strings[index] = valuePtr;
                    // Adjust the tagBufferSize by adding the size of the new field.
                    tagBufferSize += getBtagBufferSize(strings[index]);
            case 'f' :
                // First check to see if the value changed.
                if(floats[index] == (float)atof(valuePtr))
                    // The value has not changed, so do nothing.
                    // Tag buffer size doesn't change between different 'f' entries.
                    floats[index] = (float)atof(valuePtr);
            default :
                        "samRecord::addTag() - Unknown custom field of type %c\n",
                                   "Unknown custom field in a tag");
                status = false;

        // Duplicate tag in this record.
        // Tag already existed, print message about overwriting.
        // WARN about dropping duplicate tags.
        if(myNumWarns++ < myMaxWarns)
            fprintf(stderr, "WARNING: Duplicate Tags, overwritting %c%c:%c:%s with %c%c:%c:%s\n",
                    tag[0], tag[1], origType, origTag.c_str(), tag[0], tag[1], vtype, valuePtr);
            if(myNumWarns == myMaxWarns)
                fprintf(stderr, "Suppressing rest of Duplicate Tag warnings.\n");
        // The key was not found in the hash, so add it.
        switch (vtype)
            case 'A' :
                index = integers.Length();
                integers.Push((const int)*(valuePtr));
                tagBufferSize += 4;
            case 'Z' :
                index = strings.Length();
                tagBufferSize += 4 + strings.Last().Length();
            case 'B' :
                index = strings.Length();
                tagBufferSize += 3 + getBtagBufferSize(strings[index]);
            case 'f' :
                index = floats.size();
                tagBufferSize += 7;
            default :
                        "samRecord::addTag() - Unknown custom field of type %c\n",
                                   "Unknown custom field in a tag");
                status = false;
            // If successful, add the key to extras.
            extras.Add(key, index);

    // Only add the tag if it has so far been successfully processed.
        // The buffer tags are now out of sync.
        myNeedToSetTagsInBuffer = true;
        myIsTagsBufferValid = false;
        myIsBufferSynced = false;
        myTagBufferSize += tagBufferSize;
bool SamRecord::checkFloat ( const char *  tag) [inline]

Check if the specified tag contains a float.

Does not set SamStatus.

Parameters:
    tag  SAM tag to check contents of.
Returns:
    true if the value associated with the tag is a float.

Definition at line 613 of file SamRecord.h.

References checkTag().

{ return checkTag(tag, 'f'); }
bool SamRecord::checkInteger ( const char *  tag) [inline]

Check if the specified tag contains an integer.

Does not set SamStatus.

Parameters:
    tag  SAM tag to check contents of.
Returns:
    true if the value associated with the tag is an integer.

Definition at line 607 of file SamRecord.h.

References checkTag().

{ return checkTag(tag, 'i'); }
bool SamRecord::checkString ( const char *  tag) [inline]

Check if the specified tag contains a string.

Does not set SamStatus.

Parameters:
    tag  SAM tag to check contents of.
Returns:
    true if the value associated with the tag is a string.

Definition at line 600 of file SamRecord.h.

References checkTag().

    { return(checkTag(tag, 'Z') || checkTag(tag, 'B')); }
bool SamRecord::checkTag ( const char * tag, char type )

Check if the specified tag contains a value of the specified vtype.

Does not set SamStatus.

Parameters:
    tag   SAM tag to check contents of.
    type  value type to check if the SAM tag matches.
Returns:
    true if the value associated with the tag is of the specified type.

Definition at line 2369 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by checkFloat(), checkInteger(), and checkString().

    // Init to success.
    myStatus = SamStatus::SUCCESS;
    // Parse the buffer if necessary.
            // Failed to read the tags from the buffer, so cannot
            // get tags.  setTagsFromBuffer set the error.
    int key = MAKEKEY(tag[0], tag[1], type);

    return (extras.Find(key) != LH_NOTFOUND);

void SamRecord::clearTags ( )

Clear the tags in this record.

Does not set SamStatus.

Definition at line 965 of file SamRecord.cpp.

References resetTagIter().

Referenced by resetRecord().

    if(extras.Entries() != 0)
    myTagBufferSize = 0;

int32_t SamRecord::get0BasedAlignmentEnd ( )

Returns the 0-based inclusive rightmost position of the clipped sequence.

Returns:
    0-based inclusive rightmost position.

Definition at line 1455 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by get0BasedUnclippedEnd(), get1BasedAlignmentEnd(), Pileup< PILEUP_TYPE, FUNC_CLASS >::processAlignment(), Pileup< PILEUP_TYPE, FUNC_CLASS >::processAlignmentRegion(), and CigarHelper::softClipEndByRefPos().

    myStatus = SamStatus::SUCCESS;
    if(myAlignmentLength == -1)
        // Alignment end has not been set, so calculate it.
    // If alignment length > 0, subtract 1 from it to get the end.
    if(myAlignmentLength == 0)
        // Length is 0, just return the start position.
    return(myRecordPtr->myPosition + myAlignmentLength - 1);
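
The arithmetic in the listing above can be sketched without the class. `alignmentEnd0Based` and `alignmentEnd1Based` are hypothetical free functions; the zero-length case follows the listing's comment that the start position is returned when the alignment length is 0.

```cpp
#include <cassert>
#include <cstdint>

// Inclusive 0-based end of an alignment that starts at `start0` and
// covers `alignmentLength` reference bases.
std::int32_t alignmentEnd0Based(std::int32_t start0,
                                std::int32_t alignmentLength)
{
    assert(alignmentLength >= 0);
    if (alignmentLength == 0)
    {
        // Nothing aligned: report the start position itself.
        return start0;
    }
    return start0 + alignmentLength - 1;
}

// The 1-based value is the 0-based value shifted by one.
std::int32_t alignmentEnd1Based(std::int32_t start0,
                                std::int32_t alignmentLength)
{
    return alignmentEnd0Based(start0, alignmentLength) + 1;
}
```

For a read starting at 0-based position 100 with an alignment length of 50, the inclusive end is 149 in 0-based coordinates and 150 in 1-based coordinates.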

int32_t SamRecord::get0BasedMatePosition ( )

Get the 0-based (BAM) leftmost mate/next fragment's position.

Returns:
    0-based leftmost position.

Definition at line 1440 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    return myRecordPtr->myMatePosition;

int32_t SamRecord::get0BasedUnclippedEnd ( )

Returns the 0-based inclusive right-most position adjusted for clipped bases.

Returns:
    0-based inclusive rightmost position including clips.

Definition at line 1514 of file SamRecord.cpp.

References get0BasedAlignmentEnd().

Referenced by get1BasedUnclippedEnd().

    // myUnclippedEndOffset will be set by get0BasedAlignmentEnd if the 
    // cigar has not yet been parsed, so no need to check it here.
    return(get0BasedAlignmentEnd() + myUnclippedEndOffset);

int32_t SamRecord::get0BasedUnclippedStart ( )

Returns the 0-based inclusive left-most position adjusted for clipped bases.

Returns:
    0-based inclusive leftmost position including clips.

Definition at line 1494 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by get1BasedUnclippedStart().

    myStatus = SamStatus::SUCCESS;
    if(myUnclippedStartOffset == -1)
        // Unclipped has not yet been calculated, so parse the cigar to get it
    return(myRecordPtr->myPosition - myUnclippedStartOffset);
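
To make the "adjusted for clipped bases" wording concrete, the following sketch derives an unclipped start directly from a SAM CIGAR string. `parseCigar`, `isClip` and `unclippedStart0Based` are hypothetical helpers, not libStatGen functions, and the sketch assumes that both soft clips (S) and hard clips (H) at the start of the CIGAR count toward the offset.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// One CIGAR operation: (length, operation character).
using CigarOp = std::pair<int, char>;

// Split a SAM CIGAR string such as "5S10M" into its operations.
// A CIGAR of "*" yields an empty list.
std::vector<CigarOp> parseCigar(const std::string& cigar)
{
    std::vector<CigarOp> ops;
    if (cigar == "*") return ops;
    int count = 0;
    for (char c : cigar)
    {
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            count = count * 10 + (c - '0');
        }
        else
        {
            ops.emplace_back(count, c);
            count = 0;
        }
    }
    assert(count == 0);  // a well-formed CIGAR ends with an operation
    return ops;
}

// Assumption: both soft and hard clips are treated as clipped bases.
bool isClip(char op) { return op == 'S' || op == 'H'; }

// Subtract the leading clips from the 0-based alignment start.
std::int32_t unclippedStart0Based(std::int32_t pos0, const std::string& cigar)
{
    std::int32_t offset = 0;
    for (const CigarOp& op : parseCigar(cigar))
    {
        if (!isClip(op.second)) break;  // first non-clip ends the prefix
        offset += op.first;
    }
    return pos0 - offset;
}
```

For a record at 0-based position 100 with CIGAR `5S10M`, the unclipped start would be 95; trailing clips, as in `10M5S`, leave the start at 100.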

int32_t SamRecord::get1BasedAlignmentEnd ( )

Returns the 1-based inclusive rightmost position of the clipped sequence.

Returns:
    1-based inclusive rightmost position.

Definition at line 1474 of file SamRecord.cpp.

References get0BasedAlignmentEnd().

Referenced by getBin().

    return(get0BasedAlignmentEnd() + 1);

int32_t SamRecord::get1BasedMatePosition ( )

Get the 1-based (SAM) leftmost mate/next fragment's position (PNEXT).

Returns:
    1-based leftmost position.

Definition at line 1433 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    return (myRecordPtr->myMatePosition + 1);

int32_t SamRecord::get1BasedPosition ( )

Get the 1-based (SAM) leftmost position (POS) of the record.

Returns:
    1-based leftmost position.

Definition at line 1300 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by SamValidator::isValid().

    myStatus = SamStatus::SUCCESS;
    return (myRecordPtr->myPosition + 1);

int32_t SamRecord::get1BasedUnclippedEnd ( )

Returns the 1-based inclusive right-most position adjusted for clipped bases.

Returns:
    1-based inclusive rightmost position including clips.

Definition at line 1523 of file SamRecord.cpp.

References get0BasedUnclippedEnd().

    return(get0BasedUnclippedEnd() + 1);

int32_t SamRecord::get1BasedUnclippedStart ( )

Returns the 1-based inclusive left-most position adjusted for clipped bases.

Returns:
    1-based inclusive leftmost position including clips.

Definition at line 1507 of file SamRecord.cpp.

References get0BasedUnclippedStart().

    return(get0BasedUnclippedStart() + 1);

int32_t SamRecord::getAlignmentLength ( )

Returns the length of the clipped sequence, returning 0 if the cigar is '*'.

Returns:
    length of the clipped sequence.

Definition at line 1481 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    if(myAlignmentLength == -1)
        // Alignment end has not been set, so calculate it.
    // Return the alignment length.
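
As a rough sketch of how such a length can be derived from a SAM CIGAR string, the hypothetical `referenceLength` function below sums the operations that consume reference bases (M, D, N, = and X, per the SAM specification) and returns 0 for `*`. It is an illustration under those assumptions, not the library's own calculation.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>

// Number of reference bases covered by a SAM CIGAR string.
// Returns 0 when the CIGAR is "*".
std::int32_t referenceLength(const std::string& cigar)
{
    if (cigar == "*") return 0;

    std::int32_t length = 0;
    std::int32_t count = 0;
    for (char c : cigar)
    {
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            count = count * 10 + (c - '0');
            continue;
        }
        switch (c)
        {
            case 'M': case 'D': case 'N': case '=': case 'X':
                length += count;   // consumes reference bases
                break;
            default:
                break;             // I, S, H, P do not consume reference
        }
        count = 0;
    }
    assert(count == 0);  // a well-formed CIGAR ends with an operation
    return length;
}
```

For `5S10M2D3I4M5S` this gives 10 + 2 + 4 = 16: the clips and the insertion do not advance along the reference.
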
uint16_t SamRecord::getBin ( )

Get the BAM bin for the record.

Returns:
    BAM bin.

Definition at line 1335 of file SamRecord.cpp.

References get1BasedAlignmentEnd(), and StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
        // The bin that is set in the record is not valid, so
        // reset it.
        myRecordPtr->myBin = 
            bam_reg2bin(myRecordPtr->myPosition, get1BasedAlignmentEnd());      
        myIsBinValid = true;
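
For reference, the binning scheme published in the SAM specification can be written as the standalone function below. It takes a 0-based start and a 0-based exclusive end, which is numerically the same as the 1-based inclusive end passed in the listing above. That libStatGen's `bam_reg2bin` behaves identically is an assumption of this sketch; the name `reg2bin` is the specification's, not the library's.

```cpp
#include <cassert>

// Compute the BAM bin for the region [beg, end), with beg 0-based and
// end 0-based exclusive, following the SAM specification's reg2bin.
int reg2bin(int beg, int end)
{
    assert(beg >= 0 && beg < end);
    --end;  // switch to an inclusive end
    if (beg >> 14 == end >> 14) return ((1 << 15) - 1) / 7 + (beg >> 14);
    if (beg >> 17 == end >> 17) return ((1 << 12) - 1) / 7 + (beg >> 17);
    if (beg >> 20 == end >> 20) return ((1 << 9) - 1) / 7 + (beg >> 20);
    if (beg >> 23 == end >> 23) return ((1 << 6) - 1) / 7 + (beg >> 23);
    if (beg >> 26 == end >> 26) return ((1 << 3) - 1) / 7 + (beg >> 26);
    return 0;  // region spans more than one 64 Mbp bin
}
```

A region that fits inside one 16 kbp window lands in the finest level (bins from 4681 upward); wider regions fall back to progressively coarser bins, ending at bin 0.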

int32_t SamRecord::getBlockSize ( )

Get the block size of the record (BAM format).

Returns:
    BAM block size of the record.

Definition at line 1269 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    // If the buffer isn't synced, sync the buffer to determine the
    // block size.
    if(myIsBufferSynced == false)
        // Since this just returns the block size, the translation of
        // the sequence does not matter, so just use the currently set
        // value.
    return myRecordPtr->myBlockSize;
const char * SamRecord::getCigar ( )

Returns the SAM formatted CIGAR string.

Returns:
    cigar string.

Definition at line 1543 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by getFields(), SamValidator::isValidCigar(), CigarHelper::softClipBeginByRefPos(), and CigarHelper::softClipEndByRefPos().

    myStatus = SamStatus::SUCCESS;
    if(myCigar.Length() == 0)
        // 0 Length, means that it is in the buffer, but has not yet
        // been synced to the string, so do the sync.
    return myCigar.c_str();

Cigar * SamRecord::getCigarInfo ( )

Returns a pointer to the Cigar object associated with this record.

The object is essentially read-only, only allowing modifications due to lazy evaluations.

Returns:
    pointer to the Cigar object.

Definition at line 1824 of file SamRecord.cpp.

Referenced by PileupElementBaseQual::addEntry(), SamRecordHelper::checkSequence(), SamTags::createMDTag(), getSequence(), SamQuerySeqWithRefIter::reset(), SamFilter::softClip(), CigarHelper::softClipBeginByRefPos(), and CigarHelper::softClipEndByRefPos().

    // Check to see whether or not the Cigar has already been
    // set - this is determined by checking if alignment length
    // is set since alignment length and the cigar are set
    // at the same time.
    if(myAlignmentLength == -1)
        // Not been set, so calculate it.

uint16_t SamRecord::getCigarLength ( )

Get the length of the BAM formatted CIGAR.

Returns:
    length of BAM formatted cigar.

Definition at line 1350 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    // If the cigar buffer is valid
    // then get the length from there.
        return myRecordPtr->myCigarLength;      

    if(myCigarTempBufferLength == -1)
        // The cigar buffer is not valid and the cigar temp buffer is not set,
        // so parse the string.
    // The temp buffer is now set, so return the size.
bool SamRecord::getFields ( bamRecordStruct & recStruct, String & readName, String & cigar, String & sequence, String & quality )

Returns the values of all fields except the tags.

Parameters:
    recStruct  structure containing the contents of all non-variable length fields.
    readName   read name from the record (return param).
    cigar      cigar string from the record (return param).
    sequence   sequence string from the record (return param).
    quality    quality string from the record (return param).
Returns:
    true if all fields were successfully set, false otherwise.

Definition at line 1854 of file SamRecord.cpp.

    return(getFields(recStruct, readName, cigar, sequence, quality,
bool SamRecord::getFields ( bamRecordStruct & recStruct, String & readName, String & cigar, String & sequence, String & quality, SequenceTranslation translation )

Returns the values of all fields except the tags using the specified sequence translation.

Parameters:
    recStruct    structure containing the contents of all non-variable length fields.
    readName     read name from the record (return param).
    cigar        cigar string from the record (return param).
    sequence     sequence string from the record (return param).
    quality      quality string from the record (return param).
    translation  type of sequence translation to use.
Returns:
    true if all fields were successfully set, false otherwise.

Definition at line 1863 of file SamRecord.cpp.

References getCigar(), getQuality(), getReadName(), getSequence(), and StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    if(myIsBufferSynced == false)
            // failed to set the buffer, return false.
    memcpy(&recStruct, myRecordPtr, sizeof(bamRecordStruct));

    readName = getReadName();
    // Check the status.
    if(myStatus != SamStatus::SUCCESS)
        // Failed to set the fields, return false.
    cigar = getCigar();
    // Check the status.
    if(myStatus != SamStatus::SUCCESS)
        // Failed to set the fields, return false.
    sequence = getSequence(translation);
    // Check the status.
    if(myStatus != SamStatus::SUCCESS)
        // Failed to set the fields, return false.
    quality = getQuality();
    // Check the status.
    if(myStatus != SamStatus::SUCCESS)
        // Failed to set the fields, return false.
uint16_t SamRecord::getFlag ( )

Get the flag (FLAG).


Definition at line 1372 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by SamFilter::filterRead(), SamQuerySeqWithRefIter::getNextMatchMismatch(), SamValidator::isValid(), Pileup< PILEUP_TYPE, FUNC_CLASS >::processFile(), and SamFile::ReadRecord().

    myStatus = SamStatus::SUCCESS;
    return myRecordPtr->myFlag;
bool SamRecord::getFloatTag ( const char * tag,
                              float & tagVal )

Get the float value for the specified tag.

Parameters:
    tag     tag to retrieve
    tagVal  return parameter with float value for the tag

Returns:
    true if the float tag was found and tagVal was set, false if not.

Definition at line 2269 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

    // Init to success.
    myStatus = SamStatus::SUCCESS;
    // Parse the buffer if necessary.
    if(myNeedToSetTagsFromBuffer)
    {
        if(!setTagsFromBuffer())
        {
            // Failed to read the tags from the buffer, so cannot
            // get tags.  setTagsFromBuffer set the errors,
            // so just return false.
            return(false);
        }
    }
    int key = MAKEKEY(tag[0], tag[1], 'f');
    int offset = extras.Find(key);

    int value;
    if (offset < 0)
    {
        // Failed to find the tag.
        return(false);
    }
    else
        value = extras[offset];

    tagVal = floats[value];
    return(true);

int32_t SamRecord::getInsertSize ( )

Get the inferred insert size of the read pair (ISIZE) or observed template length (TLEN).

Returns:
    inferred insert size or observed template length.

Definition at line 1447 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    return myRecordPtr->myInsertSize;
int * SamRecord::getIntegerTag ( const char *  tag)

Get the integer value for the specified tag. DEPRECATED: use the overload that returns a bool (success/failure).

Parameters:
    tag  tag to retrieve

Returns:
    pointer to the tag's integer value if found, NULL if not found.

Definition at line 2204 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

    // Init to success.
    myStatus = SamStatus::SUCCESS;
    // Parse the buffer if necessary.
    if(myNeedToSetTagsFromBuffer)
    {
        if(!setTagsFromBuffer())
        {
            // Failed to read the tags from the buffer, so cannot
            // get tags.  setTagsFromBuffer set the errors,
            // so just return NULL.
            return(NULL);
        }
    }
    int key = MAKEKEY(tag[0], tag[1], 'i');
    int offset = extras.Find(key);

    int value;
    if (offset < 0)
    {
        // Failed to find the tag.
        return(NULL);
    }
    else
        value = extras[offset];

    return &integers[value];

bool SamRecord::getIntegerTag ( const char * tag,
                                int & tagVal )

Get the integer value for the specified tag.

Parameters:
    tag     tag to retrieve
    tagVal  return parameter with integer value for the tag

Returns:
    true if the integer tag was found and tagVal was set, false if not.

Definition at line 2236 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

    // Init to success.
    myStatus = SamStatus::SUCCESS;
    // Parse the buffer if necessary.
    if(myNeedToSetTagsFromBuffer)
    {
        if(!setTagsFromBuffer())
        {
            // Failed to read the tags from the buffer, so cannot
            // get tags.  setTagsFromBuffer set the errors,
            // so just return false.
            return(false);
        }
    }
    int key = MAKEKEY(tag[0], tag[1], 'i');
    int offset = extras.Find(key);

    int value;
    if (offset < 0)
    {
        // Failed to find the tag.
        return(false);
    }
    else
        value = extras[offset];

    tagVal = integers[value];
    return(true);
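The lookup pattern in the listing above (find a key, follow it to a slot in a typed array, report success through the return value) can be illustrated in isolation. The following is a minimal sketch only: `TagStore`, `setInteger`, and `getInteger` are hypothetical names, and a `std::unordered_map` keyed by a string stands in for the library's `extras` hash and `MAKEKEY`, whose internals are not shown in this document.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the record's tag storage (not the library type).
struct TagStore
{
    // Maps "XY:i" to a slot in integers, playing the role of extras.
    std::unordered_map<std::string, std::size_t> index;
    std::vector<int> integers;

    void setInteger(const std::string& tag, int value)
    {
        const std::string key = tag + ":i";
        auto it = index.find(key);
        if (it != index.end())
        {
            // Tag already present, overwrite its slot.
            integers[it->second] = value;
            return;
        }
        index[key] = integers.size();
        integers.push_back(value);
    }

    // Same contract as the bool overload documented above:
    // returns true and sets tagVal if found, otherwise leaves tagVal alone.
    bool getInteger(const std::string& tag, int& tagVal) const
    {
        auto it = index.find(tag + ":i");
        if (it == index.end())
        {
            return false;
        }
        tagVal = integers[it->second];
        return true;
    }
};
```

Returning a bool and writing through a reference lets the caller distinguish a missing tag from a tag whose value happens to be 0 or -1, which is presumably why the pointer-returning overload is deprecated.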

uint8_t SamRecord::getMapQuality ( )

Get the mapping quality (MAPQ) of the record.

Returns:
    map quality.

Definition at line 1328 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by SamValidator::isValid().

    myStatus = SamStatus::SUCCESS;
    return myRecordPtr->myMapQuality;

int32_t SamRecord::getMateReferenceID ( )

Get the mate reference id of the record (BAM format: mate_rid/next_refID).

Returns:
    reference id

Definition at line 1426 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    return myRecordPtr->myMateReferenceID;

const char * SamRecord::getMateReferenceName ( )

Get the mate/next fragment's reference sequence name (RNEXT).

If it is the same as the reference name, the name itself is still returned (not "=").

Returns:
    reference sequence name

Definition at line 1398 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    return myMateReferenceName.c_str();

const char * SamRecord::getMateReferenceNameOrEqual ( )

Get the mate/next fragment's reference sequence name (RNEXT), returning "=" if it is the same as the reference name, unless they are both "*", in which case "*" is returned.

Returns:
    reference sequence name or '='

Definition at line 1408 of file SamRecord.cpp.

References getReferenceName(), and StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    if(myMateReferenceName == "*")
        return("*");
    if(myMateReferenceName == getReferenceName())
        return("=");
    return(myMateReferenceName.c_str());
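The RNEXT rule described above can be sketched on plain strings. This is a stand-alone illustration, not the library code; `mateRefNameOrEqual` is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the documented RNEXT rule:
//   "*" stays "*", a mate name equal to the reference name becomes "=",
//   anything else is returned unchanged.
std::string mateRefNameOrEqual(const std::string& refName,
                               const std::string& mateRefName)
{
    if (mateRefName == "*")
    {
        return "*";
    }
    if (mateRefName == refName)
    {
        return "=";
    }
    return mateRefName;
}
```

Checking for "*" first is what keeps two unmapped names from collapsing to "=".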
bool SamRecord::getNextSamTag ( char * tag,
                                char & vtype,
                                void ** value )

Get the next tag from the record.

Sets the Status to SUCCESS when a tag is successfully returned or when there are no more tags. Otherwise the status is set to describe why it failed (parsing, etc).

Parameters:
    tag    set to the tag when a tag is read.
    vtype  set to the vtype when a tag is read.
    value  pointer to the value of the tag (will need to cast to int, float, char, or string based on vtype).

Returns:
    true if a tag was read, false if there are no more tags.

Definition at line 1950 of file SamRecord.cpp.

References StatGenStatus::FAIL_PARSE, StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

Referenced by SamRecordHelper::genSamTagsString().

    myStatus = SamStatus::SUCCESS;
            // Failed to read the tags from the buffer, so cannot
            // get tags.

    // Increment the tag index to start looking at the next tag.
    // At the beginning, it is set to -1.
    int maxTagIndex = extras.Capacity();
    if(myLastTagIndex >= maxTagIndex)
        // Hit the end of the tags, return false, no more tags.
        // Status is still success since this is not an error, 
        // it is just the end of the list.

    bool tagFound = false;
    // Loop until a tag is found or the end of extras is hit.
    while((tagFound == false) && (myLastTagIndex < maxTagIndex))
            // Found a slot to use.
            int key = extras.GetKey(myLastTagIndex);
            getTag(key, tag);
            getTypeFromKey(key, vtype);
            tagFound = true;
            // Get the value associated with the key based on the vtype.
            switch (vtype)
                case 'f' :
                    *value = getFloatPtr(myLastTagIndex);
                case 'i' :
                    *value = getIntegerPtr(myLastTagIndex, vtype);
                    if(vtype != 'A')
                        // Convert all int types to 'i'
                        vtype = 'i';
                case 'Z' :
                case 'B' :
                    *value = getStringPtr(myLastTagIndex);
                                       "Unknown tag type");
                    tagFound = false;
            // Increment the index since a tag was not found.
uint32_t SamRecord::getNumOverlaps ( int32_t start,
                                     int32_t end )

Return the number of bases in this read that overlap the passed in region.

Matches & mismatches between the read and the reference are counted as overlaps, but insertions, deletions, skips, clips, and pads are not counted.

Parameters:
    start  inclusive 0-based start position (reference position) of the region to check for overlaps in. (-1 indicates to start at the beginning of the reference.)
    end    exclusive 0-based end position (reference position) of the region to check for overlaps in. (-1 indicates to go to the end of the reference.)

Returns:
    number of overlapping bases

Definition at line 1841 of file SamRecord.cpp.

References get0BasedPosition(), and Cigar::getNumOverlaps().

Referenced by SamFile::GetNumOverlaps().

    // Determine whether or not the cigar has been parsed, which sets up
    // the cigar roller.  This is determined by checking the alignment length.
    if(myAlignmentLength == -1)
    return(myCigarRoller.getNumOverlaps(start, end, get0BasedPosition()));
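The counting rule stated above (only match/mismatch bases count, and -1 means an unbounded side of the region) can be sketched without the library. This is an illustrative approximation, not `Cigar::getNumOverlaps()` itself; `CigarOp` and `countOverlaps` are hypothetical names, and the CIGAR is assumed to be supplied as already-parsed (operation, length) pairs.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// One CIGAR operation: operation character and its length.
using CigarOp = std::pair<char, int32_t>;

// Hypothetical stand-alone version of the documented overlap rule.
// start is inclusive, end is exclusive, both 0-based reference positions;
// -1 for either means that side of the region is unbounded.
uint32_t countOverlaps(const std::vector<CigarOp>& cigar,
                       int32_t readStart, int32_t start, int32_t end)
{
    uint32_t overlaps = 0;
    int32_t refPos = readStart;
    for (const CigarOp& op : cigar)
    {
        assert(op.second >= 0);
        switch (op.first)
        {
            case 'M': case '=': case 'X':
            {
                // Clamp this aligned block to the region and count what is left.
                int32_t lo = refPos;
                int32_t hi = refPos + op.second;
                if (start != -1 && lo < start) lo = start;
                if (end != -1 && hi > end) hi = end;
                if (hi > lo) overlaps += static_cast<uint32_t>(hi - lo);
                refPos += op.second;
                break;
            }
            case 'D': case 'N':
                // Deletions and skips advance the reference but are not counted.
                refPos += op.second;
                break;
            default:
                // Insertions, clips, and pads do not advance the reference.
                break;
        }
    }
    return overlaps;
}
```

For a read at 0-based position 100 with CIGAR 5M2I3D5M, the aligned blocks cover reference positions 100-104 and 108-112, so the region [103, 110) overlaps two bases from each block.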
const char * SamRecord::getQuality ( )

Returns the SAM formatted quality string (QUAL).

Returns:
    quality string.

Definition at line 1626 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by PileupElementBaseQual::addEntry(), getFields(), SamValidator::isValidQuality(), and SamFilter::sumMismatchQuality().

    myStatus = SamStatus::SUCCESS;
    if(myQuality.Length() == 0)
        // 0 Length, means that it is in the buffer, but has not yet
        // been synced to the string, so do the sync.
    return myQuality.c_str();
char SamRecord::getQuality ( int  index)

Get the quality character at the specified index (0 to readLength - 1) of the quality string.

Throws an exception if index is out of range.

Parameters:
    index  index into the quality string (0 to readLength-1).

Returns:
    the quality character at the specified index into the quality.

Definition at line 1770 of file SamRecord.cpp.

References getReadLength(), and BaseUtilities::UNKNOWN_QUALITY_CHAR.

    // Determine the read length.
    int32_t readLen = getReadLength();

    // If the read length is 0, return ' ' whose ascii code is below
    // the minimum ascii code for qualities.
    if(readLen == 0)
    {
        return(BaseUtilities::UNKNOWN_QUALITY_CHAR);
    }
    else if((index < 0) || (index >= readLen))
    {
        // Only get here if the index was out of range, so throw an exception.
        String exceptionString = "SamRecord::getQuality(";
        exceptionString += index;
        exceptionString += ") is out of range. Index must be between 0 and ";
        exceptionString += (readLen - 1);
        throw std::runtime_error(exceptionString.c_str());
    }

    if(myQuality.Length() == 0)
    {
        // Parse BAM Quality.
        // Know that myPackedQuality is correct since readLen != 0.
        return(myPackedQuality[index] + 33);
    }
    else
    {
        // Already have string.
        if((myQuality.Length() == 1) && (myQuality[0] == '*'))
        {
            // Return the unknown quality character.
            return(BaseUtilities::UNKNOWN_QUALITY_CHAR);
        }
        else if(index >= myQuality.Length())
        {
            // Only get here if the index was out of range, so throw an exception.
            // Technically the myQuality string is not guaranteed to be the same length
            // as the sequence, so this catches that error.
            String exceptionString = "SamRecord::getQuality(";
            exceptionString += index;
            exceptionString += ") is out of range. Index must be between 0 and ";
            exceptionString += (myQuality.Length() - 1);
            throw std::runtime_error(exceptionString.c_str());
        }
        else
        {
            return(myQuality[index]);
        }
    }
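The `+ 33` in the listing above is the usual conversion from a raw Phred score stored in BAM to the printable SAM quality character. A minimal stand-alone sketch, assuming the scores are held in a `std::vector<uint8_t>`; `qualityCharAt` is a hypothetical helper, not part of SamRecord.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: BAM stores raw Phred scores, SAM text stores score + 33.
// Throws std::out_of_range if index is outside 0 .. size-1.
char qualityCharAt(const std::vector<uint8_t>& packedQuality, int index)
{
    if (index < 0 || index >= static_cast<int>(packedQuality.size()))
    {
        throw std::out_of_range("quality index " + std::to_string(index) +
                                " is out of range");
    }
    return static_cast<char>(packedQuality[index] + 33);
}
```

A Phred score of 0 therefore maps to '!' and a score of 40 maps to 'I'.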

int32_t SamRecord::getReadLength ( )

Get the length of the read.

Returns:
    read length.

Definition at line 1379 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by SamFilter::clipOnMismatchThreshold(), SamQuerySeqWithRefIter::getNextMatchMismatch(), getQuality(), getSequence(), SamValidator::isValidCigar(), SamValidator::isValidQuality(), SamQuerySeqWithRefIter::reset(), and CigarHelper::softClipEndByRefPos().

    myStatus = SamStatus::SUCCESS;
    if(myIsSequenceBufferValid == false)
        // If the sequence is "*", then return 0.
        if((mySequence.Length() == 1) && (mySequence[0] == '*'))
        // Do not add 1 since it is not null terminated.
const char * SamRecord::getReadName ( )

Returns the SAM formatted Read Name (QNAME).

Returns:
    read name.

Definition at line 1530 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by getFields(), SamValidator::isValid(), and SamFile::validateSortOrder().

    myStatus = SamStatus::SUCCESS;
    if(myReadName.Length() == 0)
        // 0 Length, means that it is in the buffer, but has not yet
        // been synced to the string, so do the sync.
        myReadName = (char*)&(myRecordPtr->myData);
    return myReadName.c_str();

uint8_t SamRecord::getReadNameLength ( )

Get the length of the read name (QNAME), including the terminating null.

Returns:
    length of the read name (including null).

Definition at line 1314 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by SamValidator::isValid().

    myStatus = SamStatus::SUCCESS;
    // If the buffer is valid, return the size from there, otherwise get the 
    // size from the string length + 1 (ending null).
    return(myReadName.Length() + 1);
const void * SamRecord::getRecordBuffer ( )

Get a const pointer to the buffer that contains the BAM representation of the record.

Returns:
    const pointer to the buffer that contains the BAM representation of the record.

Definition at line 1192 of file SamRecord.cpp.

const void * SamRecord::getRecordBuffer ( SequenceTranslation  translation)

Get a const pointer to the buffer that contains the BAM representation of the record using the specified translation on the sequence.

Parameters:
    translation  type of sequence translation to use.

Returns:
    const pointer to the buffer that contains the BAM representation of the record.

Definition at line 1199 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    bool status = true;
    // If the buffer is not synced or the sequence in the buffer is not
    // properly translated, fix the buffer.
    if((myIsBufferSynced == false) ||
       (myBufferSequenceTranslation != translation))
        status &= fixBuffer(translation);
    // If the buffer is synced, check to see if the tags need to be synced.
        status &= setTagsInBuffer();
    return (const void *)myRecordPtr;

GenomeSequence * SamRecord::getReference ( )

Returns a pointer to the genome sequence object associated with this record if it was set (NULL if it was not set).

Returns:
    pointer to the GenomeSequence object or NULL if there isn't one.

Definition at line 1911 of file SamRecord.cpp.

Referenced by SamValidator::isValidTags().


int32_t SamRecord::getReferenceID ( )

Get the reference sequence id of the record (BAM format rid).

Returns:
    reference sequence id

Definition at line 1293 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by SamCoordOutput::add(), SamValidator::isValid(), Pileup< PILEUP_TYPE, FUNC_CLASS >::processAlignment(), Pileup< PILEUP_TYPE, FUNC_CLASS >::processAlignmentRegion(), and SamFile::validateSortOrder().

    myStatus = SamStatus::SUCCESS;
    return myRecordPtr->myReferenceID;
const char * SamRecord::getReferenceName ( )

Get the reference sequence name (RNAME) of the record.

Returns:
    reference sequence name

Definition at line 1286 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by PileupElement::addEntry(), SamTags::createMDTag(), getMateReferenceNameOrEqual(), getSequence(), SamValidator::isValid(), and SamQuerySeqWithRefIter::reset().

    myStatus = SamStatus::SUCCESS;
    return myReferenceName.c_str();
const char * SamRecord::getSequence ( )

Returns the SAM formatted sequence string (SEQ), translating the base as specified by setSequenceTranslation.

Returns:
    sequence string.

Definition at line 1556 of file SamRecord.cpp.

Referenced by PileupElementBaseQual::addEntry(), SamRecordHelper::checkSequence(), SamTags::createMDTag(), getFields(), SamQuerySeqWithRefIter::getNextMatchMismatch(), getSequence(), and shiftIndelsLeft().

const char * SamRecord::getSequence ( SequenceTranslation  translation)

Returns the SAM formatted sequence string (SEQ) performing the specified sequence translation.

Parameters:
    translation  type of sequence translation to use.

Returns:
    sequence string.

Definition at line 1562 of file SamRecord.cpp.

References EQUAL, getCigarInfo(), getReferenceName(), NONE, SamQuerySeqWithRef::seqWithEquals(), SamQuerySeqWithRef::seqWithoutEquals(), and StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    if(mySequence.Length() == 0)
        // 0 Length, means that it is in the buffer, but has not yet
        // been synced to the string, so do the sync.

    // Determine if translation needs to be done.
    if((translation == NONE) || (myRefPtr == NULL))
        return mySequence.c_str();
    else if(translation == EQUAL)
        if(mySeqWithEq.length() == 0)
            // Check to see if the sequence is defined.
            if(mySequence == "*")
                // Sequence is undefined, so no translation necessary.
                mySeqWithEq = '*';
                // Sequence defined, so translate it.
        // translation == BASES
        if(mySeqWithoutEq.length() == 0)
            if(mySequence == "*")
                // Sequence is undefined, so no translation necessary.
                mySeqWithoutEq = '*';
                // Sequence defined, so translate it.
char SamRecord::getSequence ( int  index)

Get the sequence base at the specified index (0 to readLength - 1) of this sequence, translating the base as specified by setSequenceTranslation.

Throws an exception if index is out of range.

Parameters:
    index  index into the sequence string (0 to readLength-1).

Returns:
    the sequence base at the specified index into the sequence.

Definition at line 1639 of file SamRecord.cpp.

References getSequence().

    return(getSequence(index, mySequenceTranslation));
char SamRecord::getSequence ( int index,
                              SequenceTranslation translation )

Get the sequence base at the specified index (0 to readLength - 1) of this sequence, performing the specified sequence translation.

Throws an exception if index is out of range.

Parameters:
    index        index into the sequence string (0 to readLength-1).
    translation  type of sequence translation to use.

Returns:
    the sequence base at the specified index into the sequence.

Definition at line 1645 of file SamRecord.cpp.

References EQUAL, getCigarInfo(), getReadLength(), getReferenceName(), NONE, SamQuerySeqWithRef::seqWithEquals(), and SamQuerySeqWithRef::seqWithoutEquals().

    static const char * asciiBases = "=AC.G...T......N";

    // Determine the read length.
    int32_t readLen = getReadLength();

    // If the read length is 0, this method should not be called.
    if(readLen == 0)
        String exceptionString = "SamRecord::getSequence(";
        exceptionString += index;
        exceptionString += ") is not allowed since sequence = '*'";
        throw std::runtime_error(exceptionString.c_str());
    else if((index < 0) || (index >= readLen))
        // Only get here if the index was out of range, so throw an exception.
        String exceptionString = "SamRecord::getSequence(";
        exceptionString += index;
        exceptionString += ") is out of range. Index must be between 0 and ";
        exceptionString += (readLen - 1);
        throw std::runtime_error(exceptionString.c_str());

    // Determine if translation needs to be done.
    if((translation == NONE) || (myRefPtr == NULL))
        // No translation needs to be done.
        if(mySequence.Length() == 0)
            // Parse BAM sequence.
                return(index & 1 ?
                       asciiBases[myPackedSequence[index / 2] & 0xF] :
                       asciiBases[myPackedSequence[index / 2] >> 4]);
                String exceptionString = "SamRecord::getSequence(";
                exceptionString += index;
                exceptionString += ") called with no sequence set";
                throw std::runtime_error(exceptionString.c_str());
        // Already have string.
        // Need to translate the sequence either to have '=' or to not
        // have it.
        // First check to see if the sequence has been set.
        if(mySequence.Length() == 0)
            // 0 Length, means that it is in the buffer, but has not yet
            // been synced to the string, so do the sync.

        // Check the type of translation.
        if(translation == EQUAL)
            // Check whether or not the string has already been 
            // retrieved that has the '=' in it.
            if(mySeqWithEq.length() == 0)
                // The string with '=' has not yet been determined,
                // so get the string.
                // Check to see if the sequence is defined.
                if(mySequence == "*")
                    // Sequence is undefined, so no translation necessary.
                    mySeqWithEq = '*';
                    // Sequence defined, so translate it.
            // Sequence is set, so return it.
            // translation == BASES
            // Check whether or not the string has already been 
            // retrieved that does not have the '=' in it.
            if(mySeqWithoutEq.length() == 0)
                // The string without '=' has not yet been determined,
                // so get the string.
                // Check to see if the sequence is defined.
                if(mySequence == "*")
                    // Sequence is undefined, so no translation necessary.
                    mySeqWithoutEq = '*';
                    // Sequence defined, so translate it.
                    // The string without '=' has not yet been determined,
                    // so get the string.
            // Sequence is set, so return it.
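The "Parse BAM sequence" branch in the listing above unpacks two bases per byte: the high nibble holds the even-indexed base and the low nibble the odd-indexed one. A minimal stand-alone sketch of that lookup, reusing the `asciiBases` table exactly as it appears in the listing; `decodePackedBase` is a hypothetical helper and the packed bytes are assumed to be in a `std::vector<uint8_t>`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper mirroring the nibble lookup shown in the listing.
char decodePackedBase(const std::vector<uint8_t>& packedSequence,
                      std::size_t index)
{
    // Same 16-entry table as the listing: 4-bit code -> base character.
    static const char* asciiBases = "=AC.G...T......N";
    assert(index / 2 < packedSequence.size());
    const uint8_t byte = packedSequence[index / 2];
    // Odd index: low nibble. Even index: high nibble.
    return (index & 1) ? asciiBases[byte & 0xF] : asciiBases[byte >> 4];
}
```

With the bytes 0x12, 0x48, 0xF0 this yields the bases A, C, G, T, N, = for indexes 0 through 5.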

const SamStatus & SamRecord::getStatus ( )

Returns the status associated with the last method that sets the status.

Returns:
    SamStatus of the last command that sets status.

Definition at line 2391 of file SamRecord.cpp.

const String * SamRecord::getStringTag ( const char *  tag)

Get the string value for the specified tag.

Parameters:
    tag  tag to retrieve

Returns:
    pointer to the tag's string value if found, NULL if not found.

Definition at line 2168 of file SamRecord.cpp.

Referenced by SamTags::isMDTagCorrect(), and SamValidator::isValidTags().

    // Parse the buffer if necessary.
            // Failed to read the tags from the buffer, so cannot
            // get tags.  setTagsFromBuffer set the errors,
            // so just return null.
    int key = MAKEKEY(tag[0], tag[1], 'Z');
    int offset = extras.Find(key);

    int value;
    if (offset < 0)
        // Check for 'B' tag.
        key = MAKEKEY(tag[0], tag[1], 'B');
        offset = extras.Find(key);
        if(offset < 0)
            // Tag not found.

    // Offset is valid, so return the tag.
    value = extras[offset];

uint32_t SamRecord::getTagLength ( )

Returns the length of the BAM formatted tags.

Returns:
    length of the BAM formatted tags.

Definition at line 1917 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
        // Tags are only set in the buffer, so the size of the tags is 
        // the length of the record minus the starting location of the tags.
        unsigned char * tagStart = 
            (unsigned char *)myRecordPtr->myData 
            + myRecordPtr->myReadNameLength 
            + myRecordPtr->myCigarLength * sizeof(int)
            + (myRecordPtr->myReadLength + 1) / 2 + myRecordPtr->myReadLength;
        // The non-tags take up from the start of the record to the tag start.
        // Do not include the block size part of the record since it is not
        // included in the size.
        uint32_t nonTagSize = 
            tagStart - (unsigned char*)&(myRecordPtr->myReferenceID);
        // Tags take up the size of the block minus the non-tag section.
        uint32_t tagSize = myRecordPtr->myBlockSize - nonTagSize;

    // Tags are stored outside the buffer, so myTagBufferSize is set.
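The pointer arithmetic in the listing above amounts to subtracting everything that precedes the tags from the block size. A minimal sketch of that arithmetic on plain integers follows; `tagOffsetInData` and `tagLength` are hypothetical helpers, and the 32-byte constant is the size of the fixed-length BAM fields after block_size (refID through tlen) as laid out in the BAM format specification.

```cpp
#include <cassert>
#include <cstdint>

// Fixed-length BAM fields that follow block_size (refID through tlen).
const uint32_t FIXED_FIELDS_SIZE = 32;

// Hypothetical helper: offset of the first tag within the variable-length data.
uint32_t tagOffsetInData(uint32_t readNameLength, uint32_t cigarLength,
                         uint32_t readLength)
{
    return readNameLength          // QNAME, including the trailing null
         + cigarLength * 4         // one 32-bit word per CIGAR operation
         + (readLength + 1) / 2    // bases packed two per byte
         + readLength;             // one quality byte per base
}

// Hypothetical helper: tag length is the block size minus the non-tag section.
uint32_t tagLength(uint32_t blockSize, uint32_t readNameLength,
                   uint32_t cigarLength, uint32_t readLength)
{
    const uint32_t nonTagSize = FIXED_FIELDS_SIZE +
        tagOffsetInData(readNameLength, cigarLength, readLength);
    assert(blockSize >= nonTagSize);
    return blockSize - nonTagSize;
}
```

For a read name of 5 bytes (with null), 2 CIGAR operations, and 7 bases, the tags start 5 + 8 + 4 + 7 = 24 bytes into the variable-length data.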
bool SamRecord::getTagsString ( const char * tags,
                                String & returnString,
                                char delim = '\t' )

Get the string representation of the tags from the record, formatted as TAG:TYPE:VALUE<delim>TAG:TYPE:VALUE...

Sets the Status to SUCCESS when the tags are successfully returned or the tags were not found. If a different error occurred, the status is set appropriately. The delimiter between the tags to retrieve is ',' or ';'. ',' was added because the original delimiter, ';', requires the string to be quoted on the command line.

Parameters:
    tags          the tags to retrieve, formatted as TAG:TYPE,TAG:TYPE...
    returnString  the String to set (this method first clears returnString) to TAG:TYPE:VALUE<delim>TAG:TYPE:VALUE...
    delim         delimiter to use to separate two tags, default is a tab.

Returns:
    true if there were not any errors even if no tags were found.

Definition at line 2070 of file SamRecord.cpp.

References StatGenStatus::INVALID, StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

    const char* currentTagPtr = tags;

    myStatus = SamStatus::SUCCESS;
            // Failed to read the tags from the buffer, so cannot
            // get tags.
    bool returnStatus = true;

    while(*currentTagPtr != '\0')
        // Tags are formatted as: XY:Z
        // Where X is [A-Za-z], Y is [A-Za-z], and
        // Z is A,i,f,Z,H (cCsSI are also accepted)
        if((currentTagPtr[0] == '\0') || (currentTagPtr[1] == '\0') ||
           (currentTagPtr[2] != ':') || (currentTagPtr[3] == '\0'))
                               "getTagsString called with improperly formatted tags.\n");
            returnStatus = false;

        // Construct the key.
        int key = MAKEKEY(currentTagPtr[0], currentTagPtr[1], 
        // Look to see if the key exists in the hash.
        int offset = extras.Find(key);

        if(offset >= 0)
            // Offset is set, so the key was found.
                returnString += delim;
            returnString += currentTagPtr[0];
            returnString += currentTagPtr[1];
            returnString += ':';
            returnString += currentTagPtr[3];
            returnString += ':';

            // First if it is an integer, determine the actual type of the int.
            char vtype;
            getTypeFromKey(key, vtype);

                case 'i':
                    returnString += *(int*)getIntegerPtr(offset, vtype);
                case 'f':
                    returnString += *(float*)getFloatPtr(offset);
                case 'Z':
                case 'B':
                    returnString += *(String*)getStringPtr(offset);
                                       "rmTag called with unknown type.\n");
                    returnStatus = false;
        // Increment to the next tag.
        if((currentTagPtr[4] == ';') || (currentTagPtr[4] == ','))
            // Increment once more.
            currentTagPtr += 5;
        else if(currentTagPtr[4] != '\0')
            // Invalid tag format. 
                               "rmTags called with improperly formatted tags.\n");
            returnStatus = false;
            // Last Tag.
            currentTagPtr += 4;
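The tag list format walked by the loop above (four characters `XY:Z` per entry, separated by ',' or ';') can be parsed on its own. The following is a simplified sketch under stated assumptions: `parseTagSpecs` is a hypothetical function, it stops at the first malformed entry rather than recording a status, and it does not validate the type character.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical parser for "XY:Z,XY:Z" or "XY:Z;XY:Z".
// Returns false on malformed input; parsed (tag, type) pairs go to out.
bool parseTagSpecs(const std::string& tags,
                   std::vector<std::pair<std::string, char>>& out)
{
    out.clear();
    std::size_t pos = 0;
    while (pos < tags.size())
    {
        // Each entry needs exactly "XY:Z" starting at pos.
        if (pos + 4 > tags.size() || tags[pos + 2] != ':')
        {
            return false;
        }
        out.emplace_back(tags.substr(pos, 2), tags[pos + 3]);
        pos += 4;
        if (pos == tags.size())
        {
            break; // Last tag.
        }
        if (tags[pos] != ',' && tags[pos] != ';')
        {
            return false; // Invalid tag format.
        }
        ++pos; // Skip the delimiter.
    }
    return true;
}
```

Accepting both delimiters in one pass means a caller can mix them, for example "NM:i,MD:Z;XA:A".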
bool SamRecord::isCharType ( char  vtype) [static]

Returns whether or not the specified vtype is a char type.

Does not set SamStatus.

Parameters:
    vtype  value type to check.

Returns:
    true if the passed in vtype is a char ('A'), false otherwise.

Definition at line 2050 of file SamRecord.cpp.

Referenced by SamRecordHelper::genSamTagString().

    if(vtype == 'A')
bool SamRecord::isFloatType ( char  vtype) [static]

Returns whether or not the specified vtype is a float type.

Does not set SamStatus.

Parameters:
    vtype  value type to check.

Returns:
    true if the passed in vtype is a float ('f'), false otherwise.

Definition at line 2040 of file SamRecord.cpp.

Referenced by SamRecordHelper::genSamTagString().

    if(vtype == 'f')
bool SamRecord::isIntegerType ( char  vtype) [static]

Returns whether or not the specified vtype is an integer type.

Does not set SamStatus.

Parameters:
    vtype  value type to check.

Returns:
    true if the passed in vtype is an integer ('c', 'C', 's', 'S', 'i', 'I'), false otherwise.

Definition at line 2028 of file SamRecord.cpp.

Referenced by SamRecordHelper::genSamTagString().

    if((vtype == 'c') || (vtype == 'C') ||
       (vtype == 's') || (vtype == 'S') ||
       (vtype == 'i') || (vtype == 'I'))
bool SamRecord::isStringType ( char  vtype) [static]

Returns whether or not the specified vtype is a string type.

Does not set SamStatus.

Parameters:
    vtype  value type to check.

Returns:
    true if the passed in vtype is a string ('Z'/'B'), false otherwise.

Definition at line 2060 of file SamRecord.cpp.

Referenced by SamRecordHelper::genSamTagString().

    if((vtype == 'Z') || (vtype == 'B'))
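Taken together, the four predicates above partition the vtype characters into char, integer, float, and string groups. A compact stand-alone sketch of that partition follows; `TagKind` and `classifyVtype` are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Hypothetical grouping of vtype characters, following the four predicates.
enum class TagKind { Char, Integer, Float, String, Unknown };

TagKind classifyVtype(char vtype)
{
    switch (vtype)
    {
        case 'A':
            return TagKind::Char;
        case 'c': case 'C': case 's': case 'S': case 'i': case 'I':
            return TagKind::Integer;
        case 'f':
            return TagKind::Float;
        case 'Z': case 'B':
            return TagKind::String;
        default:
            return TagKind::Unknown;
    }
}
```

A single switch keeps the groups mutually exclusive, which is the property code that formats tags for output relies on.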
bool SamRecord::isValid ( SamFileHeader & header )

Returns whether or not the record is valid, setting the status to indicate success or failure.

Parameters:
    header  SAM Header associated with the record. Used to perform some validation against the header.

Returns:
    true if the record is valid, false if not.

Definition at line 161 of file SamRecord.cpp.

References SamValidationErrors::getErrorString(), StatGenStatus::INVALID, SamValidator::isValid(), StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    SamValidationErrors invalidSamErrors;
    if(!SamValidator::isValid(header, *this, invalidSamErrors))
        // The record is not valid.
        std::string errorMessage = "";
        myStatus.setStatus(SamStatus::INVALID, errorMessage.c_str());
    // The record is valid.

void SamRecord::resetRecord ( )

Reset the fields of the record to a default value.

This is not necessary when you are reading a SAM/BAM file, but if you are setting fields, it is a good idea to clear a record before reusing it. Clearing the record first means you do not have to explicitly set fields that should be empty.

Definition at line 91 of file SamRecord.cpp.

References clearTags(), NONE, and StatGenStatus::SUCCESS.

Referenced by SamRecord(), setBuffer(), setBufferFromFile(), and ~SamRecord().

    myIsBufferSynced = true;

    myRecordPtr->myBlockSize = DEFAULT_BLOCK_SIZE;
    myRecordPtr->myReferenceID = -1;
    myRecordPtr->myPosition = -1;
    myRecordPtr->myReadNameLength = DEFAULT_READ_NAME_LENGTH;
    myRecordPtr->myMapQuality = 0;
    myRecordPtr->myBin = DEFAULT_BIN;
    myRecordPtr->myCigarLength = 0;
    myRecordPtr->myFlag = 0;
    myRecordPtr->myReadLength = 0;
    myRecordPtr->myMateReferenceID = -1;
    myRecordPtr->myMatePosition = -1;
    myRecordPtr->myInsertSize = 0;
    // Set the sam values for the variable length fields.
    // TODO - one way to speed this up might be to not set to "*" and just
    // clear them, and write out a '*' for SAM if it is empty.
    myReadName = DEFAULT_READ_NAME;
    myReferenceName = "*";
    myMateReferenceName = "*";
    myCigar = "*";
    mySequence = "*";
    myQuality = "*";
    myNeedToSetTagsFromBuffer = false;
    myNeedToSetTagsInBuffer = false;

    // Initialize the calculated alignment info to the uncalculated value.
    myAlignmentLength = -1;
    myUnclippedStartOffset = -1;
    myUnclippedEndOffset = -1;


    // Set the bam values for the variable length fields.
    // Only the read name needs to be set, the others are a length of 0.
    // Set the read name.  The min size of myRecordPtr includes the size for
    // the default read name.
    memcpy(&(myRecordPtr->myData), myReadName.c_str(), 

    // Set that the variable length buffer fields are valid.
    myIsReadNameBufferValid = true;
    myIsCigarBufferValid = true;
    myPackedSequence = 
        (unsigned char *)myRecordPtr->myData + myRecordPtr->myReadNameLength +
        myRecordPtr->myCigarLength * sizeof(int);
    myIsSequenceBufferValid = true;
    myBufferSequenceTranslation = NONE;

    myPackedQuality = myPackedSequence;
    myIsQualityBufferValid = true;
    myIsTagsBufferValid = true;
    myIsBinValid = true;

    myCigarTempBufferLength = -1;

    myStatus = SamStatus::SUCCESS;

    NOT_FOUND_TAG_INT = -1; // TODO - deprecate
bool SamRecord::rmTag ( const char * tag,
                        char type )

Remove a tag.

Parameters:
    tag   tag to remove.
    type  type of the tag to be removed.

Returns:
    true if the tag no longer exists in the record (including when it was not found in the record), false if it could not be removed.

Definition at line 980 of file SamRecord.cpp.

References getString(), StatGenStatus::INVALID, StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

    // Check the length of tag.
    if(strlen(tag) != 2)
        // Tag is the wrong length.
                           "rmTag called with tag that is not 2 characters\n");

    myStatus = SamStatus::SUCCESS;
            // Failed to read the tags from the buffer, so cannot
            // get tags.

    // Construct the key.
    int key = MAKEKEY(tag[0], tag[1], type);
    // Look to see if the key exists in the hash.
    int offset = extras.Find(key);

    if(offset < 0)
        // Not found, so return true, successfully removed since
        // it is not in tag.

    // Offset is set, so the key was found.
    // First if it is an integer, determine the actual type of the int.
    char vtype;
    getTypeFromKey(key, vtype);
    if(vtype == 'i')
        vtype = getIntegerType(offset);

    // Offset is set, so recalculate the buffer size without this entry.
    // Do NOT remove from strings, integers, or floats because then
    // extras would need to be updated for all entries with the new indexes
    // into those variables.
    int rmBuffSize = 0;
    switch(vtype)
    {
        case 'A':
        case 'c':
        case 'C':
            rmBuffSize = 4;
            break;
        case 's':
        case 'S':
            rmBuffSize = 5;
            break;
        case 'i':
        case 'I':
            rmBuffSize = 7;
            break;
        case 'f':
            rmBuffSize = 7;
            break;
        case 'Z':
            rmBuffSize = 4 + getString(offset).Length();
            break;
        case 'B':
            rmBuffSize = 3 + getBtagBufferSize(getString(offset));
            break;
        default:
            myStatus.setStatus(SamStatus::INVALID,
                               "rmTag called with unknown type.\n");
            return(false);
    }

    // The buffer tags are now out of sync.
    myNeedToSetTagsInBuffer = true;
    myIsTagsBufferValid = false;
    myIsBufferSynced = false;
    myTagBufferSize -= rmBuffSize;

    // Remove from the hash.
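
The sizes subtracted from the tag buffer above are consistent with each optional field being stored as a two-character tag, a one-character type, and then the value. The following is a minimal standalone sketch of that arithmetic, not the library's code: the helper name `tagBufferSize` is hypothetical, and the 'B' array type (which the excerpt sizes through getBtagBufferSize()) is deliberately left out.

```cpp
#include <cstddef>

// Hypothetical helper (not part of SamRecord): number of bytes one optional
// field occupies in the tag buffer. Each entry is 2 tag characters plus
// 1 type character plus the value itself.
// Returns -1 for types this sketch does not handle (including 'B').
int tagBufferSize(char type, std::size_t stringLength = 0)
{
    const int headerSize = 3; // 2-character tag + 1-character type
    switch(type)
    {
        case 'A': case 'c': case 'C':
            return headerSize + 1; // 1-byte value
        case 's': case 'S':
            return headerSize + 2; // 2-byte value
        case 'i': case 'I': case 'f':
            return headerSize + 4; // 4-byte value
        case 'Z':
            // String value plus its terminating null byte.
            return headerSize + static_cast<int>(stringLength) + 1;
        default:
            return -1;
    }
}
```

With this helper, a 5-character 'Z' string costs 9 bytes, which matches the `4 + Length()` expression in the excerpt.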
bool SamRecord::rmTags ( const char *  tags)

Remove tags.

The delimiter between the tags is ',' or ';'. ',' was added since the original delimiter, ';', requires the string to be quoted on the command-line.

Parameters:
  tags: tags to remove, formatted as Tag:Type,Tag:Type,Tag:Type...
Returns:
  true if all tags no longer exist in the record (including tags that were never present), false if any could not be removed. SamStatus is set to INVALID if the tags are incorrectly formatted.

Definition at line 1071 of file SamRecord.cpp.

References getString(), StatGenStatus::INVALID, StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

    const char* currentTagPtr = tags;

    myStatus = SamStatus::SUCCESS;
            // Failed to read the tags from the buffer, so cannot
            // get tags.
    bool returnStatus = true;

    int rmBuffSize = 0;
    while(*currentTagPtr != '\0')

        // Tags are formatted as: XY:Z
        // Where X is [A-Za-z], Y is [A-Za-z], and
        // Z is A,i,f,Z,H (cCsSI are also accepted)
        if((currentTagPtr[0] == '\0') || (currentTagPtr[1] == '\0') ||
           (currentTagPtr[2] != ':') || (currentTagPtr[3] == '\0'))
                               "rmTags called with improperly formatted tags.\n");
            returnStatus = false;

        // Construct the key.
        int key = MAKEKEY(currentTagPtr[0], currentTagPtr[1], 
                          currentTagPtr[3]);
        // Look to see if the key exists in the hash.
        int offset = extras.Find(key);

        if(offset >= 0)
            // Offset is set, so the key was found.
            // First if it is an integer, determine the actual type of the int.
            char vtype;
            getTypeFromKey(key, vtype);
            if(vtype == 'i')
                vtype = getIntegerType(offset);
            // Offset is set, so recalculate the buffer size without this entry.
            // Do NOT remove from strings, integers, or floats because then
            // extras would need to be updated for all entries with the new indexes
            // into those variables.
                case 'A':
                case 'c':
                case 'C':
                    rmBuffSize += 4;
                case 's':
                case 'S':
                    rmBuffSize += 5;
                case 'i':
                case 'I':
                    rmBuffSize += 7;
                case 'f':
                    rmBuffSize += 7;
                case 'Z':
                    rmBuffSize += 4 + getString(offset).Length();
                case 'B':
                    rmBuffSize += 3 + getBtagBufferSize(getString(offset));
                                       "rmTag called with unknown type.\n");
                    returnStatus = false;
            // Remove from the hash.
        // Increment to the next tag.
        if((currentTagPtr[4] == ';') || (currentTagPtr[4] == ','))
            // Increment once more.
            currentTagPtr += 5;
        else if(currentTagPtr[4] != '\0')
            // Invalid tag format. 
                               "rmTags called with improperly formatted tags.\n");
            returnStatus = false;
            // Last Tag.
            currentTagPtr += 4;

    // The buffer tags are now out of sync.
    myNeedToSetTagsInBuffer = true;
    myIsTagsBufferValid = false;
    myIsBufferSynced = false;
    myTagBufferSize -= rmBuffSize;
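
The walk over the `Tag:Type` list above can be tried in isolation. This is a minimal sketch under stated assumptions rather than the library's implementation: `TagSpec` and `parseTagList` are hypothetical names, and the sketch stops at the first malformed entry instead of setting a status.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical value type for this sketch: one "XY:Z" entry.
struct TagSpec
{
    char tag[3]; // two tag characters plus a terminating null
    char type;
};

// Parses "XY:Z,XY:Z;XY:Z" into 'out'. Returns false on the first
// malformed entry; entries parsed before that point stay in 'out'.
bool parseTagList(const std::string& tags, std::vector<TagSpec>& out)
{
    std::size_t pos = 0;
    while(pos < tags.size())
    {
        // Need at least 4 characters with ':' in the third slot.
        if((tags.size() - pos < 4) || (tags[pos + 2] != ':'))
        {
            return false;
        }
        TagSpec spec = { { tags[pos], tags[pos + 1], '\0' }, tags[pos + 3] };
        out.push_back(spec);

        if(pos + 4 == tags.size())
        {
            // Last entry, nothing follows it.
            pos += 4;
        }
        else if((tags[pos + 4] == ',') || (tags[pos + 4] == ';'))
        {
            // Skip the entry and its delimiter.
            pos += 5;
        }
        else
        {
            // Something other than a delimiter follows the type.
            return false;
        }
    }
    return true;
}
```

Both delimiters can be mixed in one list, so `"XA:i,NM:i;MD:Z"` yields three entries.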

bool SamRecord::set0BasedMatePosition ( int32_t  matePosition)

Set the mate/next fragment's leftmost position using the specified 0-based (BAM format) value.

Internal processing handles the switching between SAM/BAM formats when read/written.

Parameters:
  matePosition: 0-based start position of the mate.
Returns:
  true if successfully set, false if not.

Definition at line 328 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by set1BasedMatePosition().

    myStatus = SamStatus::SUCCESS;
    myRecordPtr->myMatePosition = matePosition;
    return true;
bool SamRecord::set0BasedPosition ( int32_t  position)

Set the leftmost position using the specified 0-based (BAM format) value.

Internal processing handles the switching between SAM/BAM formats when read/written.

Parameters:
  position: 0-based start position.
Returns:
  true if successfully set, false if not.

Definition at line 242 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by set1BasedPosition(), and SamFilter::softClip().

    myStatus = SamStatus::SUCCESS;
    myRecordPtr->myPosition = position;
    myIsBinValid = false;
    return true;
bool SamRecord::set1BasedMatePosition ( int32_t  matePosition)

Set the mate/next fragment's leftmost position (PNEXT) using the specified 1-based (SAM format) value.

Internal processing handles the switching between SAM/BAM formats when read/written.

Parameters:
  matePosition: 1-based start position of the mate.
Returns:
  true if successfully set, false if not.

Definition at line 322 of file SamRecord.cpp.

References set0BasedMatePosition().

    return(set0BasedMatePosition(matePosition - 1));
bool SamRecord::set1BasedPosition ( int32_t  position)

Set the leftmost position (POS) using the specified 1-based (SAM format) value.

Internal processing handles the switching between SAM/BAM formats when read/written.

Parameters:
  position: 1-based start position.
Returns:
  true if successfully set, false if not.

Definition at line 236 of file SamRecord.cpp.

References set0BasedPosition().

    return(set0BasedPosition(position - 1));
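
The four position setters above all funnel into a single 0-based stored value, with the 1-based variants subtracting one on the way in. A minimal standalone sketch of that pattern follows; `PositionHolder` is a hypothetical stand-in, not the SamRecord class, and the getters are included only to make the round trip visible.

```cpp
#include <cstdint>

// Hypothetical stand-in for the position handling shown above: the value
// is always stored 0-based (BAM style), and the 1-based (SAM style)
// accessors convert on the way in and out.
class PositionHolder
{
public:
    PositionHolder() : myPosition(-1) {}

    void set0BasedPosition(std::int32_t position) { myPosition = position; }
    void set1BasedPosition(std::int32_t position)
    {
        // Delegate so there is exactly one place that stores the value.
        set0BasedPosition(position - 1);
    }

    std::int32_t get0BasedPosition() const { return myPosition; }
    std::int32_t get1BasedPosition() const { return myPosition + 1; }

private:
    std::int32_t myPosition; // 0-based; -1 means "unset" in this sketch
};
```

Keeping one stored representation means a caller can mix the two conventions freely: setting the 1-based position 100 and reading the 0-based position gives 99.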
SamStatus::Status SamRecord::setBuffer ( const char *  fromBuffer, uint32_t  fromBufferSize, SamFileHeader &  header )

Sets the SamRecord to contain the information in the BAM formatted fromBuffer.

Parameters:
  fromBuffer: buffer to read the BAM record from.
  fromBufferSize: size of the buffer containing the BAM record.
  header: BAM header for the record.
Returns:
  status of reading the BAM record from the buffer.

Definition at line 525 of file SamRecord.cpp.

References StatGenStatus::FAIL_MEM, StatGenStatus::FAIL_PARSE, resetRecord(), StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    if((fromBuffer == NULL) || (fromBufferSize == 0))
        // Buffer is empty.
                           "Cannot parse an empty file.");

    // Clear the record.   

    // allocate space for the record size.
        // Failed to allocate space.
    memcpy(myRecordPtr, fromBuffer, fromBufferSize);


    // Return the status of the record.

Read the BAM record from a file.

Parameters:
  filePtr: file to read the buffer from.
  header: BAM header for the record.
Returns:
  status of reading the BAM record from the file.

Definition at line 558 of file SamRecord.cpp.

References StatGenStatus::FAIL_IO, StatGenStatus::FAIL_MEM, StatGenStatus::FAIL_ORDER, StatGenStatus::FAIL_PARSE, ifeof(), ifread(), InputFile::isOpen(), StatGenStatus::NO_MORE_RECS, resetRecord(), StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    if((filePtr == NULL) || (filePtr->isOpen() == false))
        // File is not open, return failure.
                           "Can't read from an unopened file.");

    // Clear the record.

    // read the record size.
    int numBytes = 
        ifread(filePtr, &(myRecordPtr->myBlockSize), sizeof(int32_t));

    // Check to see if the end of the file was hit and no bytes were read.
    if(ifeof(filePtr) && (numBytes == 0))
        // End of file, nothing was read, no more records.
                           "No more records left to read.");
    if(numBytes != sizeof(int32_t))
        // Failed to read the entire block size.  Either the end of the file
        // was reached early or there was an error.
            // Error: end of the file reached prior to reading the rest of the
            // record.
                               "EOF reached in the middle of a record.");
            // Error reading.
                               "Failed to read the record size.");

    // allocate space for the record size.
    if(!allocateRecordStructure(myRecordPtr->myBlockSize + sizeof(int32_t)))
        // Failed to allocate space.
        // Status is set by allocateRecordStructure.

    // Read the rest of the alignment block, starting at the reference id.
    if(ifread(filePtr, &(myRecordPtr->myReferenceID), myRecordPtr->myBlockSize)
       != (unsigned int)myRecordPtr->myBlockSize)
        // Error reading the record.  Reset it and return failure.
                           "Failed to read the record");


    // Return the status of the record.
bool SamRecord::setCigar ( const char *  cigar)

Set the CIGAR to the specified SAM formatted cigar string.

Internal processing handles the switching between SAM/BAM formats when read/written.

Parameters:
  cigar: string containing the SAM formatted cigar.
Returns:
  true if successfully set, false if not.

Definition at line 259 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by SamFilter::filterRead(), shiftIndelsLeft(), and SamFilter::softClip().

    myStatus = SamStatus::SUCCESS;
    myCigar = cigar;
    myIsBufferSynced = false;
    myIsCigarBufferValid = false;
    myCigarTempBufferLength = -1;
    myIsBinValid = false;

    // Initialize the calculated alignment info to the uncalculated value.
    myAlignmentLength = -1;
    myUnclippedStartOffset = -1;
    myUnclippedEndOffset = -1;

    return true;
bool SamRecord::setCigar ( const Cigar &  cigar)

Set the CIGAR to the specified Cigar object.

Internal processing handles the switching between SAM/BAM formats when read/written.

Parameters:
  cigar: object to set this record's cigar to have.
Returns:
  true if successfully set, false if not.

Definition at line 278 of file SamRecord.cpp.

References Cigar::getCigarString(), and StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    myIsBufferSynced = false;
    myIsCigarBufferValid = false;
    myCigarTempBufferLength = -1;
    myIsBinValid = false;

    // Initialize the calculated alignment info to the uncalculated value.
    myAlignmentLength = -1;
    myUnclippedStartOffset = -1;
    myUnclippedEndOffset = -1;

    return true;
bool SamRecord::setFlag ( uint16_t  flag)

Set the bitwise FLAG to the specified value.

Parameters:
  flag: integer flag to use.
Returns:
  true if successfully set, false if not.

Definition at line 215 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by SamFilter::filterRead().

    myStatus = SamStatus::SUCCESS;
    myRecordPtr->myFlag = flag;
    return true;
bool SamRecord::setInsertSize ( int32_t  insertSize)

Sets the inferred insert size (ISIZE)/observed template length (TLEN).

Parameters:
  insertSize: inferred insert size/observed template length.
Returns:
  true if successfully set, false if not.

Definition at line 336 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    myRecordPtr->myInsertSize = insertSize;
    return true;
bool SamRecord::setMapQuality ( uint8_t  mapQuality)

Set the mapping quality (MAPQ).

Parameters:
  mapQuality: map quality to set in the record.
Returns:
  true if successfully set, false if not.

Definition at line 251 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

Referenced by SamFilter::filterRead().

    myStatus = SamStatus::SUCCESS;
    myRecordPtr->myMapQuality = mapQuality;
    return true;
bool SamRecord::setMateReferenceName ( SamFileHeader &  header, const char *  mateReferenceName )

Set the mate/next fragment's reference sequence name (RNEXT) to the specified name, using the header to determine the mate reference id.

Parameters:
  header: SAM/BAM header to use to determine the mate reference id.
  mateReferenceName: mate reference name to use.
Returns:
  true if successfully set, false if not.

Definition at line 297 of file SamRecord.cpp.

References SamFileHeader::getReferenceID(), and StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    // Set the mate reference.  If it is "=", set it to be equal
    // to myReferenceName.  This assumes that myReferenceName has already
    // been set.
    if(strcmp(mateReferenceName, FIELD_ABSENT_STRING) == 0)
        myMateReferenceName = myReferenceName;
    else
        myMateReferenceName = mateReferenceName;

    // Set the Mate Reference ID.
    // If the reference ID does not already exist, add it (pass true)
    myRecordPtr->myMateReferenceID = 
        header.getReferenceID(myMateReferenceName, true);

    return true;
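
The "=" shorthand and the add-if-missing id lookup described above can be sketched without the library. The names `resolveMateReferenceName` and `getOrAddReferenceId` are hypothetical, the map stands in for the header, and handing out ids in insertion order is an assumption of this sketch rather than documented behavior of SamFileHeader::getReferenceID().

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the header's name-to-id lookup: returns the id
// for 'name', assigning the next free id when the name is new.
int getOrAddReferenceId(std::map<std::string, int>& ids, const std::string& name)
{
    std::map<std::string, int>::iterator it = ids.find(name);
    if(it != ids.end())
    {
        return it->second;
    }
    const int newId = static_cast<int>(ids.size());
    ids[name] = newId;
    return newId;
}

// Resolves the mate reference name: "=" means "same as the read's own
// reference name"; anything else is used as given.
std::string resolveMateReferenceName(const std::string& referenceName,
                                     const std::string& mateReferenceName)
{
    return (mateReferenceName == "=") ? referenceName : mateReferenceName;
}
```

Because "=" is resolved against the read's own reference name, the reference name has to be set before the mate reference name for the shorthand to produce the intended id.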
bool SamRecord::setQuality ( const char *  quality)

Sets the quality (QUAL) to the specified SAM formatted quality string.

Internal processing handles switching between SAM/BAM formats when read/written.

Parameters:
  quality: SAM quality string.
Returns:
  true if successfully set, false if not.

Definition at line 357 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    myQuality = quality;
    myIsBufferSynced = false;
    myIsQualityBufferValid = false;
    return true;
bool SamRecord::setReadName ( const char *  readName)

Set QNAME to the passed in name.

Parameters:
  readName: the read name to set the QNAME to.
Returns:
  true if successfully set, false if not.

Definition at line 193 of file SamRecord.cpp.

References StatGenStatus::INVALID, StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

    myReadName = readName;
    myIsBufferSynced = false;
    myIsReadNameBufferValid = false;
    myStatus = SamStatus::SUCCESS;

    // The read name must at least have some length, otherwise this is a parsing
    // error.
    if(myReadName.Length() == 0)
    {
        // Invalid - reset ReadName and return false.
        myReadName = DEFAULT_READ_NAME;
        myRecordPtr->myReadNameLength = DEFAULT_READ_NAME_LENGTH;
        myStatus.setStatus(SamStatus::INVALID, "0 length Query Name.");
        return(false);
    }

    return true;
void SamRecord::setReference ( GenomeSequence *  reference)

Set the reference to the specified genome sequence object.

Parameters:
  reference: pointer to the GenomeSequence object.

Definition at line 178 of file SamRecord.cpp.

Referenced by SamFile::GetNumOverlaps(), SamFile::ReadRecord(), SamFile::validateSortOrder(), and SamFile::WriteRecord().

    myRefPtr = reference;
bool SamRecord::setReferenceName ( SamFileHeader &  header, const char *  referenceName )

Set the reference sequence name (RNAME) to the specified name, using the header to determine the reference id.

Parameters:
  header: SAM/BAM header to use to determine the reference id.
  referenceName: reference name to use.
Returns:
  true if successfully set, false if not.

Definition at line 223 of file SamRecord.cpp.

References SamFileHeader::getReferenceID(), and StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;

    myReferenceName = referenceName;
    // If the reference ID does not already exist, add it (pass true)
    myRecordPtr->myReferenceID = header.getReferenceID(referenceName, true);

    return true;
bool SamRecord::setSequence ( const char *  seq)

Sets the sequence (SEQ) to the specified SAM formatted sequence string.

Internal processing handles switching between SAM/BAM formats when read/written.

Parameters:
  seq: SAM sequence string. May contain '='.
Returns:
  true if successfully set, false if not.

Definition at line 344 of file SamRecord.cpp.

References StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    mySequence = seq;
    myIsBufferSynced = false;
    myIsSequenceBufferValid = false;
    return true;
void SamRecord::setSequenceTranslation ( SequenceTranslation  translation)

Set the type of sequence translation to use when getting the sequence.

The default type (if this method is never called) is NONE (the sequence is left as-is). It can be overridden by using the accessors that take a SequenceTranslation parameter.

Parameters:
  translation: type of sequence translation to use.

Definition at line 187 of file SamRecord.cpp.

Referenced by SamFile::GetNumOverlaps(), SamFile::ReadRecord(), and SamFile::validateSortOrder().

    mySequenceTranslation = translation;
bool SamRecord::shiftIndelsLeft ( )

Shift the indels (if any) to the left by updating the CIGAR.

Returns:
  true if the cigar was shifted, false if not.

Definition at line 368 of file SamRecord.cpp.

References BASES, Cigar::foundInQuery(), getSequence(), CigarRoller::IncrementCount(), Cigar::insert, Cigar::isMatchOrMismatch(), CigarRoller::Remove(), setCigar(), Cigar::size(), and CigarRoller::Update().

    // Check to see whether or not the Cigar has already been
    // set - this is determined by checking if alignment length
    // is set since alignment length and the cigar are set
    // at the same time.
    if(myAlignmentLength == -1)
        // Not been set, so calculate it.
    // Track whether or not there was a shift.
    bool shifted = false;

    // Cigar is set, so now myCigarRoller can be used.
    // Track where in the read we are.
    uint32_t currentPos = 0;

    // Since the loop starts at 1 because the first operation can't be shifted,
    // increment the currentPos past the first operation.
        // This op was found in the read, increment the current position.
        currentPos += myCigarRoller[0].count;
    int numOps = myCigarRoller.size();
    // Loop through the cigar operations from the 2nd operation since
    // the first operation is already on the end and can't shift.
    for(int currentOp = 1; currentOp < numOps; currentOp++)
        if(myCigarRoller[currentOp].operation == Cigar::insert)
            // For now, only shift a max of 1 operation.
            int prevOpIndex = currentOp-1;
            // Track the next op for seeing if it is the same as the
            // previous for merging reasons.
            int nextOpIndex = currentOp+1;
            if(nextOpIndex == numOps)
                // There is no next op, so set it equal to the current one.
                nextOpIndex = currentOp;
            // The start of the previous operation, so we know when we hit it
            // so we don't shift past it.
            uint32_t prevOpStart = 
                currentPos - myCigarRoller[prevOpIndex].count;

            // We can only shift if the previous operation
                // TODO - shift past pads
                // An insert is in the read, so increment the position.
                currentPos += myCigarRoller[currentOp].count;                 
                // Not a match/mismatch, so can't shift into it.
            // It is a match or mismatch, so check to see if we can
            // shift into it.

            // The end of the insert is calculated by adding the size
            // of this insert minus 1 to the start of the insert.
            uint32_t insertEndPos = 
                currentPos + myCigarRoller[currentOp].count - 1;
            // The insert starts at the current position.
            uint32_t insertStartPos = currentPos;
            // Loop as long as the position before the insert start
            // matches the last character in the insert. If they match,
            // the insert can be shifted one index left because the
            // implied reference will not change.  If they do not match,
            // we can't shift because the implied reference would change.
            // Stop loop when insertStartPos = prevOpStart, because we 
            // don't want to move past that.
            while((insertStartPos > prevOpStart) && 
                  (getSequence(insertEndPos,BASES) == 
                   getSequence(insertStartPos - 1, BASES)))
                // We can shift, so move the insert start & end one left.

            // Determine if a shift has occurred.
            int shiftLen = currentPos - insertStartPos;
            if(shiftLen > 0)
                // Shift occurred, so adjust the cigar if the cigar will
                // not become more operations.
                // If the next operation is the same as the previous or
                // if the insert and the previous operation switch positions
                // then the cigar has the same number of operations.
                // If the next operation is different, and the shift splits
                // the previous operation in 2, then the cigar would
                // become longer, so we do not want to shift.
                if(myCigarRoller[nextOpIndex].operation == 
                    // The operations are the same, so merge them by adding
                    // the length of the shift to the next operation.
                    myCigarRoller.IncrementCount(nextOpIndex, shiftLen);
                    myCigarRoller.IncrementCount(prevOpIndex, -shiftLen);

                    // If the previous op length is 0, just remove that
                    // operation.
                    if(myCigarRoller[prevOpIndex].count == 0)
                    shifted = true;
                    // Can only shift if the insert shifts past the
                    // entire previous operation, otherwise an operation
                    // would need to be added.
                    if(insertStartPos == prevOpStart)
                        // Swap the positions of the insert and the
                        // previous operation.
                        // Size of the previous op is the entire
                        // shift length.
                        shifted = true;
            // An insert is in the read, so increment the position.
            currentPos += myCigarRoller[currentOp].count;                 
        else if(Cigar::foundInQuery(myCigarRoller[currentOp]))
            // This op was found in the read, increment the current position.
            currentPos += myCigarRoller[currentOp].count;
        // TODO - setCigar is currently inefficient because later the cigar
        // roller will be recalculated, but for now it will work.

Write the record as a BAM into the specified already opened file.

Parameters:
  filePtr: file to write the BAM record into.
Returns:
  status of the write.

Definition at line 1225 of file SamRecord.cpp.

    return(writeRecordBuffer(filePtr, mySequenceTranslation));

Write the record as a BAM into the specified already opened file using the specified translation on the sequence.

Parameters:
  filePtr: file to write the BAM record into.
  translation: type of sequence translation to use.
Returns:
  status of the write.

Definition at line 1232 of file SamRecord.cpp.

References StatGenStatus::FAIL_IO, StatGenStatus::FAIL_ORDER, StatGenStatus::getStatus(), ifwrite(), InputFile::isOpen(), StatGenStatus::setStatus(), and StatGenStatus::SUCCESS.

    myStatus = SamStatus::SUCCESS;
    if((filePtr == NULL) || (filePtr->isOpen() == false))
        // File is not open, return failure.
                           "Can't write to an unopened file.");

    if((myIsBufferSynced == false) ||
       (myBufferSequenceTranslation != translation))

    // Write the record.
    unsigned int numBytesToWrite = myRecordPtr->myBlockSize + sizeof(int32_t);
    unsigned int numBytesWritten = 
        ifwrite(filePtr, myRecordPtr, numBytesToWrite);

    // Return status based on if the correct number of bytes were written.
    if(numBytesToWrite == numBytesWritten)
    // The correct number of bytes were not written.
    myStatus.setStatus(SamStatus::FAIL_IO, "Failed to write the entire record.");

The documentation for this class was generated from the following files: