libStatGen Software  1
CigarRoller Class Reference

This class provides accessors for setting, updating, and modifying a CIGAR object. It is a child class of Cigar.

#include <CigarRoller.h>


Public Member Functions

 CigarRoller ()
 Default constructor initializes as a CIGAR with no operations.
 CigarRoller (const char *cigarString)
 Constructor that initializes the object with the specified cigarString.
CigarRoller & operator+= (CigarRoller &rhs)
 Add the contents of the specified CigarRoller to this object.
CigarRoller & operator+= (const CigarOperator &rhs)
 Append the specified operator to this object.
CigarRoller & operator= (CigarRoller &rhs)
 Set this object to be equal to the specified CigarRoller.
void Add (Operation operation, int count)
 Append the specified operation with the specified count to this object.
void Add (char operation, int count)
 Append the specified operation with the specified count to this object.
void Add (const char *cigarString)
 Append the specified cigarString to this object.
void Add (CigarRoller &rhs)
 Append the specified Cigar object to this object.
bool Remove (int index)
 Remove the operation at the specified index.
bool IncrementCount (int index, int increment)
 Increments the count for the operation at the specified index by the specified value; specify a negative value to decrement.
bool Update (int index, Operation op, int count)
 Updates the operation at the specified index to be the specified operation and have the specified count.
void Set (const char *cigarString)
 Sets this object to the specified cigarString.
void Set (const uint32_t *cigarBuffer, uint16_t bufferLen)
 Sets this object to the BAM formatted cigar found at the beginning of the specified buffer which is bufferLen long.
int getMatchPositionOffset ()
 DEPRECATED - do not use. The same information is better obtained from read lengths, reference lengths, span of the read, etc.
const char * getString ()
 Get the string representation of the Cigar operations in this object; the returned pointer refers to an internal static buffer, so the caller must not delete it.
void clear ()
 Clear this object so that it has no Cigar Operations.

Friends

std::ostream & operator<< (std::ostream &stream, const CigarRoller &roller)
 Writes all of the cigar operations contained in this roller to the passed in stream.

Detailed Description

This class provides accessors for setting, updating, and modifying a CIGAR object. It is a child class of Cigar.

Docs from Sam1.pdf:

Clipped alignment. In Smith-Waterman alignment, a sequence may not be aligned from the first residue to the last one. Subsequences at the ends may be clipped off. We introduce operation 'S' to describe (softly) clipped alignment. Here is an example. Suppose the clipped alignment is:

REF: AGCTAGCATCGTGTCGCCCGTCTAGCATACGCATGATCGACTGTCAGCTAGTCAGACTAGTCGATCGATGTG
READ: gggGTGTAACC-GACTAGgggg

where on the read sequence, bases in uppercase are matches and bases in lowercase are clipped off. The CIGAR for this alignment is: 3S8M1D6M4S.
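
To make the example concrete, the following is a minimal sketch, independent of libStatGen, that derives the read length and the reference span from an extended CIGAR string. The function name cigarLengths is hypothetical, and only the seven operations described on this page are handled.

```cpp
#include <cctype>
#include <string>
#include <utility>

// Hypothetical helper, not part of libStatGen.
// Returns {read length, reference span} for a CIGAR such as "3S8M1D6M4S".
std::pair<int, int> cigarLengths(const std::string &cigar)
{
    int readLen = 0;  // bases present in the read sequence
    int refSpan = 0;  // reference bases covered by the alignment
    int count = 0;    // length of the operation currently being parsed

    for (char c : cigar)
    {
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            count = count * 10 + (c - '0');
            continue;
        }
        switch (c)
        {
            case 'M': readLen += count; refSpan += count; break;
            case 'I':
            case 'S': readLen += count; break;   // present only in the read
            case 'D':
            case 'N': refSpan += count; break;   // present only in the reference
            default: break;                      // H and P consume neither
        }
        count = 0;
    }
    return {readLen, refSpan};
}
```

For 3S8M1D6M4S this gives a read length of 21 (3 + 8 + 6 + 4), which matches the 21 bases of the READ line, and a reference span of 15 (8 + 1 + 6).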

If the mapping position of the query is not available, RNAME and CIGAR are set to “*”.

A CIGAR string consists of a series of operation lengths, each followed by its operation. The conventional CIGAR format allows for three types of operations: M for match or mismatch, I for insertion and D for deletion. The extended CIGAR format allows four more operations, shown in the following table, to describe clipping, padding and splicing:

op  Description
--  -----------
M   Match or mismatch
I   Insertion to the reference
D   Deletion from the reference
N   Skipped region from the reference
S   Soft clip on the read (clipped sequence present in <seq>)
H   Hard clip on the read (clipped sequence NOT present in <seq>)
P   Padding (silent deletion from the padded reference sequence)

CigarRoller is an aid to correctly generating the CIGAR strings necessary to represent how a read maps to the reference.

It is called once a particular match candidate is being written out, so it is far less performance-sensitive than the Smith-Waterman code below.

Definition at line 66 of file CigarRoller.h.


Member Function Documentation

int CigarRoller::getMatchPositionOffset ( )

DEPRECATED - do not use. The same information is better obtained from read lengths, reference lengths, span of the read, etc.

Definition at line 244 of file CigarRoller.cpp.

References Cigar::del, and Cigar::insert.

{
    int offset = 0;
    std::vector<CigarOperator>::iterator i;

    for (i = cigarOperations.begin(); i != cigarOperations.end(); i++)
    {
        switch (i->operation)
        {
            case insert:
                offset += i->count;
                break;
            case del:
                offset -= i->count;
                break;
                // TODO anything for case skip:????
            default:
                break;
        }
    }
    return offset;
}
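
The arithmetic in the listing above can be tried outside the library with the sketch below. The Op enum, the OpRun struct and the matchPositionOffset function are hypothetical stand-ins for Cigar::Operation, CigarOperator and the member function; they are assumptions made for illustration, not the library's declarations.

```cpp
#include <vector>

// Hypothetical stand-ins for the library's operation type and operator record.
enum class Op { match, insert, del, skip, softClip, hardClip, pad };

struct OpRun
{
    Op op;
    int count;
};

// Insertions add to the offset and deletions subtract from it;
// every other operation leaves it unchanged, as in the listing above.
int matchPositionOffset(const std::vector<OpRun> &ops)
{
    int offset = 0;
    for (const OpRun &run : ops)
    {
        if (run.op == Op::insert)
        {
            offset += run.count;
        }
        else if (run.op == Op::del)
        {
            offset -= run.count;
        }
    }
    return offset;
}
```

For the operations of 3S8M1D6M4S the result is -1, because the only contribution comes from the single deleted base.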
const char * CigarRoller::getString ( )

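Get the string representation of the Cigar operations in this object. As the listing below shows, the result is held in a static buffer that is reused on each call, so the caller must not free or delete it and should copy the string if it needs to outlive the next call.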

Definition at line 272 of file CigarRoller.cpp.

{
    // NB: the exact size of the string is not important, it just needs to be guaranteed
    // larger than the largest number of characters we could put into it.

    // we do not explicitly manage memory usage, and we expect when program exits, the memory used here will be freed
    static char *ret = NULL;
    static unsigned int retSize = 0;

    if (ret == NULL)
    {
        retSize = cigarOperations.size() * 12 + 1;  // 12 == a magic number -> > 1 + log base 10 of MAXINT
        ret = (char*) malloc(sizeof(char) * retSize);
        assert(ret != NULL);

    }
    else
    {
        // currently, ret pointer has enough memory to use
        if (retSize > cigarOperations.size() * 12 + 1)
        {
        }
        else
        {
            retSize = cigarOperations.size() * 12 + 1;
            free(ret);
            ret = (char*) malloc(sizeof(char) * retSize);
        }
        assert(ret != NULL);
    }

    char *ptr = ret;
    char buf[12];   // > 1 + log base 10 of MAXINT

    std::vector<CigarOperator>::iterator i;

    // Progressively append the character representations of the operations to
    // the cigar string we allocated above.

    *ptr = '\0';    // clear result string
    for (i = cigarOperations.begin(); i != cigarOperations.end(); i++)
    {
        sprintf(buf, "%d%c", (*i).count, (*i).getChar());
        strcat(ptr, buf);
        while (*ptr)
        {
            ptr++;    // limit the cost of strcat above
        }
    }
    return ret;
}
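
Because the listing above hands out a pointer to a shared static buffer, a caller that needs an independent copy may prefer to build the string by value. The sketch below shows one way to do that. CountedOp and cigarToString are hypothetical names, not part of libStatGen, and the struct only mimics the count and character that the listing reads from each operation.

```cpp
#include <string>
#include <vector>

// Hypothetical record holding what the listing reads from each operation.
struct CountedOp
{
    int count;
    char op;
};

// Concatenates "<count><op>" for every operation and returns the result by value,
// so no static buffer or manual size bookkeeping is needed.
std::string cigarToString(const std::vector<CountedOp> &ops)
{
    std::string out;
    for (const CountedOp &o : ops)
    {
        out += std::to_string(o.count);
        out += o.op;
    }
    return out;
}
```

An empty operation list yields an empty string, which corresponds to the listing writing only the terminating '\0'.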
bool CigarRoller::IncrementCount ( int  index,
int  increment 
)

Increments the count for the operation at the specified index by the specified value; specify a negative value to decrement.

Returns:
true if it is successfully incremented, false if not.

Definition at line 171 of file CigarRoller.cpp.

Referenced by SamRecord::shiftIndelsLeft().

{
    if((index < 0) || ((unsigned int)index >= cigarOperations.size()))
    {
        // can't update, out of range, return false.
        return(false);
    }
    cigarOperations[index].count += increment;

    // Modifying the cigar, so the query & reference indexes are out of date,
    // so clear them.
    clearQueryAndReferenceIndexes();
    return(true);
}
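
The bounds check and the signed increment can be exercised in isolation with the sketch below. Run and incrementCount are hypothetical names that operate on a plain vector rather than on a CigarRoller. Like the listing above, the sketch does not check whether the count falls to zero or below, so that remains the caller's responsibility.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical operation record used only for this illustration.
struct Run
{
    char op;
    int count;
};

// Adds `increment` (which may be negative) to the count at `index`.
// Returns false without modifying anything when the index is out of range.
bool incrementCount(std::vector<Run> &ops, int index, int increment)
{
    if (index < 0 || static_cast<std::size_t>(index) >= ops.size())
    {
        return false;
    }
    ops[index].count += increment;
    return true;
}
```

With the operations 8M1D, calling incrementCount(ops, 0, -2) returns true and leaves 6M1D, while an index of 2 or -1 returns false.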
bool CigarRoller::Remove ( int  index)

Remove the operation at the specified index.

Returns:
true if successfully removed, false if not.

Definition at line 156 of file CigarRoller.cpp.

Referenced by SamRecord::shiftIndelsLeft(), and CigarHelper::softClipEndByRefPos().

{
    if((index < 0) || ((unsigned int)index >= cigarOperations.size()))
    {
        // can't remove, out of range, return false.
        return(false);
    }
    cigarOperations.erase(cigarOperations.begin() + index);
    // Modifying the cigar, so the query & reference indexes are out of date,
    // so clear them.
    clearQueryAndReferenceIndexes();
    return(true);
}
void CigarRoller::Set ( const uint32_t *  cigarBuffer,
uint16_t  bufferLen 
)

Sets this object to the BAM formatted cigar found at the beginning of the specified buffer which is bufferLen long.

Definition at line 211 of file CigarRoller.cpp.

References Add(), and clear().

{
    clear();

    // Parse the buffer.
    for (int i = 0; i < bufferLen; i++)
    {
        int opLen = cigarBuffer[i] >> 4;

        Add(cigarBuffer[i] & 0xF, opLen);
    }
}
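
The loop above relies on the BAM packing of each CIGAR entry: the operation length sits in the upper 28 bits and the operation code in the lower 4 bits. The sketch below decodes such a buffer into text. It assumes the operation code order MIDNSHP=X from the BAM format, and decodeBamCigar is a hypothetical name, not a libStatGen function.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Decodes BAM-packed CIGAR entries (length << 4 | opCode) into a CIGAR string.
std::string decodeBamCigar(const std::vector<std::uint32_t> &buffer)
{
    static const char opChars[] = "MIDNSHP=X";  // BAM operation codes 0 through 8
    std::string out;
    for (std::uint32_t packed : buffer)
    {
        std::uint32_t opLen = packed >> 4;    // upper 28 bits: operation length
        std::uint32_t opCode = packed & 0xF;  // lower 4 bits: operation code
        out += std::to_string(opLen);
        out += (opCode < 9) ? opChars[opCode] : '?';  // '?' marks an unknown code
    }
    return out;
}
```

For example, the values 52, 128 and 18 are (3 << 4) | 4, (8 << 4) | 0 and (1 << 4) | 2, which decode to 3S8M1D.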
bool CigarRoller::Update ( int  index,
Operation  op,
int  count 
)

Updates the operation at the specified index to be the specified operation and have the specified count.

Returns:
true if it is successfully updated, false if not.

Definition at line 187 of file CigarRoller.cpp.

Referenced by SamRecord::shiftIndelsLeft().

{
    if((index < 0) || ((unsigned int)index >= cigarOperations.size()))
    {
        // can't update, out of range, return false.
        return(false);
    }
    cigarOperations[index].operation = op;
    cigarOperations[index].count = count;

    // Modifying the cigar, so the query & reference indexes are out of date,
    // so clear them.
    clearQueryAndReferenceIndexes();
    return(true);
}

Friends And Related Function Documentation

std::ostream& operator<< ( std::ostream &  stream,
const CigarRoller &  roller 
) [friend]

Writes all of the cigar operations contained in this roller to the passed in stream.

Definition at line 167 of file CigarRoller.h.

{
    stream << roller.cigarOperations;
    return stream;
}

The documentation for this class was generated from CigarRoller.h and CigarRoller.cpp.