libStatGen Software  1
SamQuerySeqWithRef Class Reference

Contains methods for converting between the query sequence and reference. More...

#include <SamQuerySeqWithRefHelper.h>

List of all members.

Static Public Member Functions

static void seqWithEquals (const char *currentSeq, int32_t seq0BasedPos, Cigar &cigar, const char *referenceName, const GenomeSequence &refSequence, std::string &updatedSeq)
 Gets the sequence with '=' in any position where the sequence matches the reference.
static void seqWithoutEquals (const char *currentSeq, int32_t seq0BasedPos, Cigar &cigar, const char *referenceName, const GenomeSequence &refSequence, std::string &updatedSeq)
 Gets the sequence converting '=' to the appropriate base using the reference.

Detailed Description

Contains methods for converting between the query sequence and reference.

Definition at line 101 of file SamQuerySeqWithRefHelper.h.


Member Function Documentation

void SamQuerySeqWithRef::seqWithEquals ( const char *  currentSeq,
int32_t  seq0BasedPos,
Cigar cigar,
const char *  referenceName,
const GenomeSequence refSequence,
std::string &  updatedSeq 
) [static]

Gets the sequence with '=' in any position where the sequence matches the reference.

NOTE: 'N' in both the sequence and the reference is not considered a match.

Parameters:
currentSeqsequence that should be converted
seq0BasedPos0 based start position of currentSeq on the reference.
cigarcigar string for currentSeq (used for determining how the sequence aligns to the reference)
referenceNamereference name associated with this sequence
refSequencereference sequence object
updatedSeqreturn parameter that this method sets to the current sequence, replacing any matches to the reference with '='.

Definition at line 243 of file SamQuerySeqWithRefHelper.cpp.

References BaseUtilities::areEqual(), GenomeSequence::getGenomePosition(), Cigar::getRefOffset(), Cigar::INDEX_NA, and BaseUtilities::isAmbiguous().

Referenced by SamRecord::getSequence().

{
    updatedSeq = currentSeq;

    int32_t seqLength = updatedSeq.length();
    int32_t queryIndex = 0;

    uint32_t startOfReadOnRefIndex = 
        refSequence.getGenomePosition(referenceName);
    
    if(startOfReadOnRefIndex == INVALID_GENOME_INDEX)
    {
        // This reference name was not found in the reference file, so just
        // return.
        return;
    }
    startOfReadOnRefIndex += seq0BasedPos;
    
    // Loop until the entire sequence has been updated.
    while(queryIndex < seqLength)
    {
        // Still more bases, look for matches.

        // Get the reference offset for this read position.
        int32_t refOffset = cigar.getRefOffset(queryIndex);
        if(refOffset != Cigar::INDEX_NA)
        {
            // Both the reference and the read have a base, so get the bases.
            char readBase = currentSeq[queryIndex];
            char refBase = refSequence[startOfReadOnRefIndex + refOffset];

            // If neither base is unknown and they are the same, count it
            // as a match.
            if(!BaseUtilities::isAmbiguous(readBase) && 
               !BaseUtilities::isAmbiguous(refBase) && 
               (BaseUtilities::areEqual(readBase, refBase)))
            {
                // Match.
                updatedSeq[queryIndex] = '=';
            }
        }
        // Increment the query index to the next position.
        ++queryIndex;
        continue;
    }
}
void SamQuerySeqWithRef::seqWithoutEquals ( const char *  currentSeq,
int32_t  seq0BasedPos,
Cigar cigar,
const char *  referenceName,
const GenomeSequence refSequence,
std::string &  updatedSeq 
) [static]

Gets the sequence converting '=' to the appropriate base using the reference.

Parameters:
currentSeqsequence that should be converted
seq0BasedPos0 based start position of currentSeq on the reference.
cigarcigar string for currentSeq (used for determining how the sequence aligns to the reference)
referenceNamereference name associated with this sequence
refSequencereference sequence object
updatedSeqreturn parameter that this method sets to the current sequence, replacing any '=' with the base from the reference.

Definition at line 296 of file SamQuerySeqWithRefHelper.cpp.

References BaseUtilities::areEqual(), GenomeSequence::getGenomePosition(), Cigar::getRefOffset(), and Cigar::INDEX_NA.

Referenced by SamRecord::getSequence().

{
    updatedSeq = currentSeq;

    int32_t seqLength = updatedSeq.length();
    int32_t queryIndex = 0;

    uint32_t startOfReadOnRefIndex = 
        refSequence.getGenomePosition(referenceName);
    
    if(startOfReadOnRefIndex == INVALID_GENOME_INDEX)
    {
        // This reference name was not found in the reference file, so just
        // return.
        return;
    }
    startOfReadOnRefIndex += seq0BasedPos;
    
    // Loop until the entire sequence has been updated.
    while(queryIndex < seqLength)
    {
        // Still more bases, look for matches.

        // Get the reference offset for this read position.
        int32_t refOffset = cigar.getRefOffset(queryIndex);
        if(refOffset != Cigar::INDEX_NA)
        {
            // Both the reference and the read have a base, so get the bases.
            char readBase = currentSeq[queryIndex];
            char refBase = refSequence[startOfReadOnRefIndex + refOffset];
            
            // If the bases are equal, set the sequence to the reference
            // base. (Skips the check for ambiguous to catch a case where
            // ambiguous had been converted to a '=', and if both are ambiguous,
            // it will still be set to ambiguous.)
            if(BaseUtilities::areEqual(readBase, refBase))
            {
                // Match.
                updatedSeq[queryIndex] = refBase;
            }
        }

        // Increment the query index to the next position.
        ++queryIndex;
        continue;
    }
}

The documentation for this class was generated from the following files:
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends