libStatGen Software
1
|
Iterates through the query and compare with reference. More...
#include <SamQuerySeqWithRefHelper.h>
Public Member Functions | |
SamQuerySeqWithRefIter (SamRecord &record, GenomeSequence &refSequence, bool forward=true) | |
bool | reset (bool forward=true) |
Reset to start at the beginning of the record. | |
bool | getNextMatchMismatch (SamSingleBaseMatchInfo &matchMismatchInfo) |
Returns information for the next position where the query and the reference match or mismatch. |
Iterates through the query and compare with reference.
NOTE: References to the GenomeSequence and SamRecord are stored, the objects are not copied, so they must remain valid as long as this class is used.
Definition at line 58 of file SamQuerySeqWithRefHelper.h.
bool SamQuerySeqWithRefIter::getNextMatchMismatch | ( | SamSingleBaseMatchInfo & | matchMismatchInfo | ) |
Returns information for the next position where the query and the reference match or mismatch.
To be a match or mismatch, both the query and reference must have a base that is not 'N'. This means: insertions and deletions are not mismatches or matches. 'N' bases are not matches or mismatches
matchMismatchInfo | return parameter with the information about the matching/mismatching base. |
Definition at line 102 of file SamQuerySeqWithRefHelper.cpp.
References BaseUtilities::areEqual(), SamRecord::getFlag(), SamRecord::getReadLength(), Cigar::getRefOffset(), SamRecord::getSequence(), Cigar::INDEX_NA, BaseUtilities::isAmbiguous(), SamRecord::NONE, SamSingleBaseMatchInfo::setQueryIndex(), and SamSingleBaseMatchInfo::setType().
Referenced by SamFilter::clipOnMismatchThreshold(), and SamFilter::sumMismatchQuality().
{ // Check whether or not this read is mapped. // If the read is not mapped, return no matches. if(!SamFlag::isMapped(myRecord.getFlag())) { // Not mapped. return(false); } // Check that the Cigar is set. if(myCigar == NULL) { // Error. throw(std::runtime_error("Cannot determine matches/mismatches since failed to retrieve the cigar")); return(false); } // If myStartOfReadOnRefIndex is the default (unset) value, then // the reference was not found, so return false, no matches/mismatches. if(myStartOfReadOnRefIndex == INVALID_GENOME_INDEX) { // This reference name was not found in the reference file, so just // return no matches/mismatches. return(false); } // Repull the read length from the record to check just in case the // record has changed length. // Loop until a match or mismatch is found as long as query index // is still within the read (Loop is broken by a return). while((myQueryIndex < myRecord.getReadLength()) && (myQueryIndex >= 0)) { // Still more bases, look for a match/mismatch. // Get the reference offset for this read position. int32_t refOffset = myCigar->getRefOffset(myQueryIndex); if(refOffset == Cigar::INDEX_NA) { // This is either a softclip or an insertion // which do not count as a match or a mismatch, so // go to the next index. nextIndex(); continue; } // Both the reference and the read have a base, so get the bases. char readBase = myRecord.getSequence(myQueryIndex, SamRecord::NONE); char refBase = myRefSequence[myStartOfReadOnRefIndex + refOffset]; // If either the read or the reference bases are unknown, then // it does not count as a match or a mismatch. if(BaseUtilities::isAmbiguous(readBase) || BaseUtilities::isAmbiguous(refBase)) { // Either the reference base or the read base are unknown, // so skip this position. nextIndex(); continue; } // Both the read & the reference have a known base, so it is either // a match or a mismatch. matchMismatchInfo.setQueryIndex(myQueryIndex); // Check if they are equal. if(BaseUtilities::areEqual(readBase, refBase)) { // Match. matchMismatchInfo.setType(SamSingleBaseMatchInfo::MATCH); // Increment the query index to the next position. nextIndex(); return(true); } else { // Mismatch matchMismatchInfo.setType(SamSingleBaseMatchInfo::MISMATCH); // Increment the query index to the next position. nextIndex(); return(true); } } // No matches or mismatches were found, so return false. return(false); }
bool SamQuerySeqWithRefIter::reset | ( | bool | forward = true | ) |
Reset to start at the beginning of the record.
This will re-read values from SamRecord, so can be used if it has changed to contain information for a new record.
forward | true means to start from the beginning and go to the end; false means to start from the end and go to the beginning. |
Definition at line 58 of file SamQuerySeqWithRefHelper.cpp.
References SamRecord::get0BasedPosition(), SamRecord::getCigarInfo(), GenomeSequence::getGenomePosition(), SamRecord::getReadLength(), and SamRecord::getReferenceName().
{ myCigar = myRecord.getCigarInfo(); if(myCigar == NULL) { // Failed to get Cigar. return(false); } // Get where the position of where this read starts as mapped to the // reference. myStartOfReadOnRefIndex = myRefSequence.getGenomePosition(myRecord.getReferenceName()); if(myStartOfReadOnRefIndex != INVALID_GENOME_INDEX) { // This reference name was found in the reference file, so // add the start position. myStartOfReadOnRefIndex += myRecord.get0BasedPosition(); } myForward = forward; if(myForward) { myQueryIndex = 0; } else { // reverse, so start at the last entry. myQueryIndex = myRecord.getReadLength() - 1; } return(true); }