InputFile Class Reference

Class for easily reading/writing files without having to worry about file type (uncompressed, gzip, bgzf) when reading.

#include <InputFile.h>

Public Types

enum  ifileCompression { DEFAULT, UNCOMPRESSED, GZIP, BGZF }
 Compression to use when writing a file & decompression used when reading a file from stdin.

Public Member Functions

 InputFile ()
 Default constructor.
 ~InputFile ()
 Destructor.
 InputFile (const char *filename, const char *mode, InputFile::ifileCompression compressionMode=InputFile::DEFAULT)
 Constructor for opening a file.
void bufferReads (unsigned int bufferSize=DEFAULT_BUFFER_SIZE)
 Set the buffer size for reading from files so that bufferSize bytes are read at a time and stored until accessed by another read call.
void disableBuffering ()
 Disable read buffering.
int ifclose ()
 Close the file.
int ifread (void *buffer, unsigned int size)
 Read size bytes from the file into the buffer.
int ifgetc ()
 Get a character from the file.
void ifrewind ()
 Reset to the beginning of the file.
int ifeof ()
 Check to see if we have reached the EOF.
unsigned int ifwrite (const void *buffer, unsigned int size)
 Write the specified buffer into the file.
bool isOpen ()
 Returns whether or not the file was successfully opened.
long int iftell ()
 Get current position in the file.
bool ifseek (long int offset, int origin)
 Seek to the specified offset from the origin.
const char * getFileName () const
 Get the filename that is currently opened.

Protected Member Functions

bool openFile (const char *filename, const char *mode, InputFile::ifileCompression compressionMode)
int readFromFile (void *buffer, unsigned int size)

Protected Attributes

FileType * myFileTypePtr
unsigned int myAllocatedBufferSize
char * myFileBuffer
unsigned int myBufferIndex
unsigned int myCurrentBufferSize
std::string myFileName

Static Protected Attributes

static const unsigned int DEFAULT_BUFFER_SIZE = 1048576

Detailed Description

Class for easily reading/writing files without having to worry about file type (uncompressed, gzip, bgzf) when reading.

Definition at line 35 of file InputFile.h.


Member Enumeration Documentation

Compression to use when writing a file & decompression used when reading a file from stdin.

For any other read, the file itself is checked to determine how to decompress it.

Enumerator:
DEFAULT 

Check the extension: if it is ".gz", treat the file as gzip; otherwise treat it as UNCOMPRESSED.

UNCOMPRESSED 

uncompressed file.

GZIP 

gzip file.

BGZF 

bgzf file.

Definition at line 42 of file InputFile.h.

00042                           {
00043         DEFAULT,  ///< Check the extension, if it is ".gz", treat as gzip, otherwise treat it as UNCOMPRESSED.
00044         UNCOMPRESSED,  ///< uncompressed file.
00045         GZIP,  ///< gzip file.
00046         BGZF ///< bgzf file.
00047     };
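
The DEFAULT rule described above can be sketched as a small standalone function. This is an illustration only, not code taken from InputFile.cpp: the names Compression and chooseByExtension are hypothetical, and the case-sensitive match is an assumption.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for InputFile::ifileCompression. DEFAULT is left
// out because this function is what resolves it.
enum class Compression { Uncompressed, Gzip, Bgzf };

// Resolve the DEFAULT mode: a name ending in ".gz" is treated as gzip,
// anything else as uncompressed. Case-sensitive matching is an assumption.
Compression chooseByExtension(const std::string& filename)
{
    const std::string ext = ".gz";
    if (filename.size() >= ext.size() &&
        filename.compare(filename.size() - ext.size(), ext.size(), ext) == 0)
    {
        return Compression::Gzip;
    }
    return Compression::Uncompressed;
}
```

Note that BGZF is never chosen by this rule, which matches the enumerator description: a bgzf output file has to be requested explicitly.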


Constructor & Destructor Documentation

InputFile::InputFile ( const char *  filename,
const char *  mode,
InputFile::ifileCompression  compressionMode = InputFile::DEFAULT 
)

Constructor for opening a file.

Parameters:
filename file to open
mode same format as fopen: "r" for read & "w" for write.
compressionMode set the type of file to open for writing or for reading from stdin (when reading files, the compression type is determined by reading the file).

Definition at line 27 of file InputFile.cpp.

00029 {
00030     myFileTypePtr = NULL;
00031     myBufferIndex = 0;
00032     myCurrentBufferSize = 0;
00033     myAllocatedBufferSize = DEFAULT_BUFFER_SIZE;
00034     myFileBuffer = new char[myAllocatedBufferSize];
00035     myFileName.clear();
00036 
00037     openFile(filename, mode, compressionMode);
00038 }


Member Function Documentation

void InputFile::bufferReads ( unsigned int  bufferSize = DEFAULT_BUFFER_SIZE  )  [inline]

Set the buffer size for reading from files so that bufferSize bytes are read at a time and stored until accessed by another read call.

This improves performance over reading the file in small pieces. Buffering reads disables the tell call for bgzf files. Any data already in the buffer is discarded, unless bufferSize equals the current buffer size, in which case the call does nothing.

Parameters:
bufferSize number of bytes to read/buffer at a time. The default buffer size is 1048576; set bufferSize = 1 to turn off read buffering.

Definition at line 81 of file InputFile.h.

Referenced by disableBuffering().

00082     {
00083         // If the buffer size is the same, do nothing.
00084         if(bufferSize == myAllocatedBufferSize)
00085         {
00086             return;
00087         }
00088         // Delete the previous buffer.
00089         if(myFileBuffer != NULL)
00090         {
00091             delete[] myFileBuffer;
00092         }
00093         myBufferIndex = 0;
00094         myCurrentBufferSize = 0;
00095         // The buffer size must be at least 1 so one character can be
00096         // read and ifgetc can just assume reading into the buffer.
00097         if(bufferSize < 1)
00098         {
00099             bufferSize = 1;
00100         }
00101         myFileBuffer = new char[bufferSize];
00102         myAllocatedBufferSize = bufferSize;
00103 
00104         if(myFileTypePtr != NULL)
00105         {
00106             if(bufferSize == 1)
00107             {
00108                 myFileTypePtr->setBuffered(false);
00109             }
00110             else
00111             {
00112                 myFileTypePtr->setBuffered(true);
00113             }
00114         }
00115     }
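
The resizing rules in the listing above can be condensed into a minimal standalone sketch. The ReadBuffer struct and its member names are hypothetical, and the sketch simplifies one point: InputFile only updates the buffered flag when a file is attached, whereas here the flag is always updated.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature of the buffer state kept by InputFile.
struct ReadBuffer
{
    std::vector<char> storage = std::vector<char>(1048576);
    unsigned int index = 0;        // next unread byte in storage
    unsigned int currentSize = 0;  // number of valid bytes in storage
    bool buffered = true;

    void resize(unsigned int bufferSize)
    {
        // Same size as before: nothing to do, buffered data is kept.
        if (bufferSize == storage.size()) return;
        // Otherwise any bytes already buffered are discarded.
        index = 0;
        currentSize = 0;
        // At least one byte is needed so a single character can be read.
        if (bufferSize < 1) bufferSize = 1;
        storage.assign(bufferSize, 0);
        // A one-byte buffer is how buffering is switched off.
        buffered = (bufferSize != 1);
    }
};
```

A request for size 0 is clamped to 1, so it has the same effect as disabling buffering.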

const char* InputFile::getFileName (  )  const [inline]

Get the filename that is currently opened.

Returns:
filename associated with this class

Definition at line 341 of file InputFile.h.

Referenced by SamFile::ReadBamIndex().

00342     {
00343         return(myFileName.c_str());
00344     }

int InputFile::ifclose (  )  [inline]

Close the file.

Returns:
status of the close (0 is success).

Definition at line 131 of file InputFile.h.

Referenced by ifclose().

00132     {
00133         if (myFileTypePtr == NULL)
00134         {
00135             return EOF;
00136         }
00137         int result = myFileTypePtr->close();
00138         delete myFileTypePtr;
00139         myFileTypePtr = NULL;
00140         myFileName.clear();
00141         return result;
00142     }

int InputFile::ifeof (  )  [inline]

Check to see if we have reached the EOF.

Returns:
0 if not EOF, any other value means EOF.

Definition at line 257 of file InputFile.h.

Referenced by ifeof().

00258     {
00259         // Not EOF if we are not at the end of the buffer.
00260         if (myBufferIndex < myCurrentBufferSize)
00261         {
00262             // There are still available bytes in the buffer, so NOT EOF.
00263             return false;
00264         }
00265         else
00266         {
00267             if (myFileTypePtr == NULL)
00268             {
00269                 // No myFileTypePtr, so not eof (return 0).
00270                 return 0;
00271             }
00272             // exhausted our buffer, so check the file for eof.
00273             return myFileTypePtr->eof();
00274         }
00275     }

int InputFile::ifgetc (  )  [inline]

Get a character from the file.

Read a character from the internal buffer or, if the end of the buffer has been reached, refill the buffer from the file and return the character at index 0.

Returns:
character that was read or EOF.

Definition at line 221 of file InputFile.h.

Referenced by ifgetc(), and operator>>().

00222     {
00223         if (myBufferIndex >= myCurrentBufferSize)
00224         {
00225             // at the last index, read a new buffer.
00226             myCurrentBufferSize = readFromFile(myFileBuffer, myAllocatedBufferSize);
00227             myBufferIndex = 0;
00228         }
00229         // If the buffer index is still greater than or equal to the
00230         // myCurrentBufferSize, then we failed to read the file - return EOF.
00231         if (myBufferIndex >= myCurrentBufferSize)
00232         {
00233             return(EOF);
00234         }
00235         return(myFileBuffer[myBufferIndex++]);
00236     }
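
The refill-then-return pattern used by ifgetc can be tried in isolation with the sketch below. It is not the InputFile implementation: BufferedSource is a hypothetical class, and an in-memory string stands in for the file. The cast through unsigned char is a choice made in this sketch so that a 0xFF byte cannot be confused with EOF.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>   // EOF
#include <cstring>
#include <string>
#include <vector>

// Hypothetical reader: an in-memory string stands in for the file.
class BufferedSource
{
public:
    BufferedSource(const std::string& data, unsigned int bufferSize)
        : mySource(data), mySourcePos(0),
          myBuffer(bufferSize < 1 ? 1 : bufferSize),
          myBufferIndex(0), myCurrentBufferSize(0) {}

    int getChar()
    {
        if (myBufferIndex >= myCurrentBufferSize)
        {
            // Buffer exhausted: refill from the source and restart at 0.
            myCurrentBufferSize = refill();
            myBufferIndex = 0;
        }
        // Still nothing to read: the source is at its end.
        if (myBufferIndex >= myCurrentBufferSize)
        {
            return EOF;
        }
        return static_cast<unsigned char>(myBuffer[myBufferIndex++]);
    }

private:
    // Copy up to one buffer's worth of bytes from the source.
    unsigned int refill()
    {
        std::size_t n = std::min<std::size_t>(myBuffer.size(),
                                              mySource.size() - mySourcePos);
        if (n > 0) std::memcpy(myBuffer.data(), mySource.data() + mySourcePos, n);
        mySourcePos += n;
        return static_cast<unsigned int>(n);
    }

    std::string mySource;
    std::size_t mySourcePos;
    std::vector<char> myBuffer;
    unsigned int myBufferIndex;
    unsigned int myCurrentBufferSize;
};
```

With a two-byte buffer over the string "abc", the third call triggers a second refill and the fourth call returns EOF.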

int InputFile::ifread ( void *  buffer,
unsigned int  size 
) [inline]

Read size bytes from the file into the buffer.

Parameters:
buffer pointer to memory of at least size bytes into which the data is written.
size number of bytes to be read
Returns:
number of bytes read

Definition at line 149 of file InputFile.h.

Referenced by ifread().

00150     {
00151         // There are 2 cases:
00152         //  1) There are already size available bytes in buffer.
00153         //  2) There are not size bytes in buffer.
00154 
00155         // Determine the number of available bytes in the buffer.
00156         unsigned int availableBytes = myCurrentBufferSize - myBufferIndex;
00157         unsigned int returnSize = 0;
00158 
00159         // Case 1: There are already size available bytes in buffer.
00160         if (size <= availableBytes)
00161         {
00162             //   Just copy from the buffer, increment the index and return.
00163             memcpy(buffer, myFileBuffer+myBufferIndex, size);
00164             // Increment the buffer index.
00165             myBufferIndex += size;
00166             returnSize = size;
00167         }
00168         // Case 2: There are not size bytes in buffer.
00169         else
00170         {
00171             // Check to see if there are some bytes in the buffer.
00172             if (availableBytes > 0)
00173             {
00174                 // Size > availableBytes > 0
00175                 // Copy the available bytes into the buffer.
00176                 memcpy(buffer, myFileBuffer+myBufferIndex, availableBytes);
00177             }
00178             unsigned int remainingSize = size - availableBytes;
00179 
00180             // Check if the remaining size is more or less than the
00181             // max buffer size.
00182             if(remainingSize < myAllocatedBufferSize)
00183             {
00184                 // the remaining size is not the full buffer, but read
00185                 //  a full buffer worth of data anyway.
00186                 myCurrentBufferSize =
00187                     readFromFile(myFileBuffer, myAllocatedBufferSize);
00188                 
00189                 // Check to see how much was copied.
00190                 unsigned int copySize = remainingSize;
00191                 if(copySize > myCurrentBufferSize)
00192                 {
00193                     copySize = myCurrentBufferSize;
00194                 }
00195 
00196                 // Now copy the rest of the bytes into the buffer.
00197                 memcpy((char*)buffer+availableBytes, myFileBuffer, copySize);
00198 
00199                 // set the buffer index to the location after what we read.
00200                 myBufferIndex = copySize;
00201                 
00202                 returnSize = availableBytes + copySize;
00203             }
00204             else
00205             {
00206                 // More remaining to be read than the max buffer size, so just
00207                 // read directly into the output buffer.
00208                 int readSize = readFromFile((char*)buffer + availableBytes,
00209                                             remainingSize);
00210                 returnSize = readSize + availableBytes;
00211             }
00212         }
00213         return(returnSize);
00214     }
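
The two cases handled by ifread (serve from the buffer, or drain the buffer and go back to the file) are sketched below against an in-memory source. BufferedReader and readFromSource are hypothetical names. One deliberate difference is labeled in the code: after a direct read the sketch clears its buffer so that bytes already handed out are not served again, a reset that the listing above does not show.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical reader: an in-memory string stands in for the file.
class BufferedReader
{
public:
    BufferedReader(const std::string& data, unsigned int bufferSize)
        : mySource(data), mySourcePos(0),
          myBuffer(bufferSize < 1 ? 1 : bufferSize),
          myBufferIndex(0), myCurrentBufferSize(0) {}

    unsigned int read(void* out, unsigned int size)
    {
        char* dest = static_cast<char*>(out);
        unsigned int available = myCurrentBufferSize - myBufferIndex;

        // Case 1: the buffer already holds every requested byte.
        if (size <= available)
        {
            std::memcpy(dest, myBuffer.data() + myBufferIndex, size);
            myBufferIndex += size;
            return size;
        }

        // Case 2: hand over what is buffered, then go back to the source.
        if (available > 0)
        {
            std::memcpy(dest, myBuffer.data() + myBufferIndex, available);
        }
        unsigned int remaining = size - available;
        unsigned int copied = 0;
        if (remaining < myBuffer.size())
        {
            // Small remainder: refill a whole buffer, then copy from it.
            myCurrentBufferSize = readFromSource(myBuffer.data(), myBuffer.size());
            copied = std::min(remaining, myCurrentBufferSize);
            std::memcpy(dest + available, myBuffer.data(), copied);
            myBufferIndex = copied;
        }
        else
        {
            // Large remainder: bypass the buffer and read straight into dest.
            copied = readFromSource(dest + available, remaining);
            // Sketch-only choice: mark the buffer as empty after the bypass.
            myBufferIndex = 0;
            myCurrentBufferSize = 0;
        }
        return available + copied;
    }

private:
    // Copy up to n bytes from the source; returns the number copied.
    unsigned int readFromSource(char* dest, std::size_t n)
    {
        n = std::min<std::size_t>(n, mySource.size() - mySourcePos);
        if (n > 0) std::memcpy(dest, mySource.data() + mySourcePos, n);
        mySourcePos += n;
        return static_cast<unsigned int>(n);
    }

    std::string mySource;
    std::size_t mySourcePos;
    std::vector<char> myBuffer;
    unsigned int myBufferIndex;
    unsigned int myCurrentBufferSize;
};
```

A short return value (fewer bytes than requested) signals that the source ran out, which is how a caller can detect the end of the data.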

bool InputFile::ifseek ( long int  offset,
int  origin 
) [inline]

Seek to the specified offset from the origin.

Parameters:
offset offset into the file to move to (must be from a tell call)
origin can be any of the following (note: not all are valid for all file types):
SEEK_SET - Beginning of file
SEEK_CUR - Current position of the file pointer
SEEK_END - End of file
Returns:
true on successful seek and false on a failed seek.

Definition at line 326 of file InputFile.h.

Referenced by ifseek().

00327     {
00328         if (myFileTypePtr == NULL)
00329         {
00330             // No myFileTypePtr, so return false - could not seek.
00331             return false;
00332         }
00333         // Reset buffering since a seek is being done.
00334         myBufferIndex = 0;
00335         myCurrentBufferSize = 0;
00336         return myFileTypePtr->seek(offset, origin);
00337     }
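
Why ifseek resets myBufferIndex and myCurrentBufferSize can be seen in a minimal sketch: bytes buffered before the seek belong to the old position and must not be returned afterwards. SeekableSource is a hypothetical class over an in-memory string, and it models absolute offsets only (the SEEK_SET case).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>   // EOF
#include <cstring>
#include <string>
#include <vector>

// Hypothetical seekable reader: a string plays the role of the file.
class SeekableSource
{
public:
    SeekableSource(const std::string& data, unsigned int bufferSize)
        : mySource(data), mySourcePos(0),
          myBuffer(bufferSize < 1 ? 1 : bufferSize),
          myBufferIndex(0), myCurrentBufferSize(0) {}

    int getChar()
    {
        if (myBufferIndex >= myCurrentBufferSize)
        {
            // Refill the buffer from the current source position.
            std::size_t n = std::min<std::size_t>(myBuffer.size(),
                                                  mySource.size() - mySourcePos);
            if (n > 0) std::memcpy(myBuffer.data(), mySource.data() + mySourcePos, n);
            mySourcePos += n;
            myCurrentBufferSize = static_cast<unsigned int>(n);
            myBufferIndex = 0;
        }
        if (myBufferIndex >= myCurrentBufferSize) return EOF;
        return static_cast<unsigned char>(myBuffer[myBufferIndex++]);
    }

    // Absolute seek. The buffer is dropped before the seek is attempted,
    // as in the listing above.
    bool seek(std::size_t offset)
    {
        myBufferIndex = 0;
        myCurrentBufferSize = 0;
        if (offset > mySource.size()) return false;
        mySourcePos = offset;
        return true;
    }

private:
    std::string mySource;
    std::size_t mySourcePos;
    std::vector<char> myBuffer;
    unsigned int myBufferIndex;
    unsigned int myCurrentBufferSize;
};
```

Without the reset in seek, the call after seeking to offset 4 in "abcdef" would return the stale buffered 'b' instead of 'e'.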

long int InputFile::iftell (  )  [inline]

Get current position in the file.

Returns:
current position in the file, -1 indicates an error.

Definition at line 307 of file InputFile.h.

Referenced by iftell().

00308     {
00309         if (myFileTypePtr == NULL)
00310         {
00311             // No myFileTypePtr, so return -1 - could not get the position.
00312             return -1;
00313         }
00314         return myFileTypePtr->tell();
00315     }

unsigned int InputFile::ifwrite ( const void *  buffer,
unsigned int  size 
) [inline]

Write the specified buffer into the file.

Parameters:
buffer buffer containing size bytes to write to the file.
size number of bytes to write
Returns:
number of bytes written. The write call is not buffered.

Definition at line 282 of file InputFile.h.

Referenced by ifwrite().

00283     {
00284         if (myFileTypePtr == NULL)
00285         {
00286             // No myFileTypePtr, so return 0 - nothing written.
00287             return 0;
00288         }
00289         return myFileTypePtr->write(buffer, size);
00290     }

bool InputFile::isOpen (  )  [inline]

Returns whether or not the file was successfully opened.

Returns:
true if the file is open, false if not.

Definition at line 294 of file InputFile.h.

Referenced by ifopen(), GlfHeader::read(), SamRecord::setBufferFromFile(), GlfHeader::write(), and SamRecord::writeRecordBuffer().

00295     {
00296         // It is open if the myFileTypePtr is set and says it is open.
00297         if ((myFileTypePtr != NULL) && myFileTypePtr->isOpen())
00298         {
00299             return true;
00300         }
00301         // File was not successfully opened.
00302         return false;
00303     }


The documentation for this class was generated from the following files:
InputFile.h
InputFile.cpp
Generated on Tue Mar 22 22:50:22 2011 for StatGen Software by  doxygen 1.6.3