Estimating IBD probabilities
Note: This page was written a few years ago. Since then, a very useful
alternative to the prelude and finale interface for IBD estimation
has emerged in the merlin program. If your data set consists of small to
moderate size pedigrees (typically fewer than 30 individuals), you can learn
how to use merlin to estimate IBD coefficients by following
this link.
QTDT uses simwalk2
or genehunter2
to estimate IBD probabilities. prelude and finale provide
a convenient interface between the QTDT package and these programs.
prelude takes matched pedigree and data files as input,
estimates allele frequencies and generates input files for IBD
calculations. finale collects simwalk2 or genehunter results
and produces an IBD file in QTDT format.
Warning! When using simwalk2 to estimate IBD probabilities,
marker, family and individual names should be no longer than 8
characters!
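The 8-character limit can be checked before prelude is run. The following is a minimal sketch, not part of the QTDT package. It assumes whitespace-delimited input files in which the first two columns of each pedigree line are the family and individual names, and in which marker lines of the data file have the form "M name". Both layouts are assumptions about your files, and all function names are hypothetical.

```python
def names_too_long(names, limit=8):
    """Return names longer than `limit` characters, in input order, without repeats."""
    seen = set()
    too_long = []
    for name in names:
        if len(name) > limit and name not in seen:
            seen.add(name)
            too_long.append(name)
    return too_long


def pedigree_names(lines):
    """Yield family and individual names from pedigree lines.

    Assumption: column 1 is the family name and column 2 the individual name.
    """
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            yield fields[0]
            yield fields[1]


def marker_names(lines):
    """Yield marker names from data file lines.

    Assumption: marker lines look like 'M <name>'; other lines are skipped.
    """
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and fields[0].upper() == "M":
            yield fields[1]


def check_files(ped_path, dat_path, limit=8):
    """Return every over-long name found in the pedigree and data files."""
    with open(ped_path) as ped, open(dat_path) as dat:
        names = list(pedigree_names(ped)) + list(marker_names(dat))
    return names_too_long(names, limit)
```

Calling check_files("sibs.ped", "sibs.dat") returns an empty list when every name fits within the limit.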
To set up a simwalk2 run for the sibs example data set, run
prelude -d sibs.dat -p sibs.ped -aa -t 0.001. As usual,
the -d and -p options specify the input file names,
the -aa option specifies that all individuals should be
considered for estimating allele frequencies, and the -t 0.001
option specifies the default recombination fraction between markers.
This should produce the following output:
Prelude - (c) 1999-2000 Goncalo Abecasis
Preprocessor for external IBD sources
The following parameters are in effect:
Pedigree File : sibs.ped (-pname)
Data File : sibs.dat (-dname)
Theta between markers : 0.001 (-t99.999)
Allele Frequencies : ALL INDIVIDUALS (-a[a|e|f])
Preparing SimWalk2 Run Specification...
Preparing SimWalk2 Data Description...
Preparing SimWalk2 Pedigree Data...
You may now run SimWalk2 to estimate IBD
When you are finished, run finale
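If prelude is called from a script rather than typed at the prompt, the same command line can be assembled programmatically. The sketch below is one possible wrapper and assumes only that prelude is on your PATH. The function names are hypothetical, and the recombination fraction is passed as a string so that it reaches prelude exactly as written.

```python
import subprocess


def prelude_command(dat_file, ped_file, theta="0.001"):
    """Build the prelude command line described in the text.

    -d / -p : data and pedigree file names
    -aa     : use all individuals when estimating allele frequencies
    -t      : default recombination fraction between markers
    """
    return ["prelude", "-d", dat_file, "-p", ped_file, "-aa", "-t", theta]


def run_prelude(dat_file, ped_file, theta="0.001"):
    """Run prelude; raises CalledProcessError if it exits with an error status."""
    subprocess.run(prelude_command(dat_file, ped_file, theta), check=True)
```

For the example data set, run_prelude("sibs.dat", "sibs.ped") issues the same command as shown above.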
The recombination fractions between markers are listed in the
BATCH2.DAT file (which contains the simwalk2 run parameters) and
may be edited if required. Here we leave them unchanged and proceed to run simwalk2.
Simwalk2 is slow but, like QTDT, can handle almost any type of
pedigree. For this example, you may wish to run simwalk2 during
your lunch break! Simwalk2 screen output will look like this:
SimWalk2 version 2.60
Type of data analysis: Identity-By-Descent
Locus data INPUT file: LOCUS.DAT
Pedigree data INPUT file: PEDIGREE.DAT
Individual OUTPUT files: IBD-01.mmm
Here 'mmm' is from the order within the input pedigree file,
e.g., '001' for the first pedigree, etc.
Working on pedigree initialization ...
Pedigree #001 completed initialization;
Pedigree #002 completed initialization;
(... etc etc ...)
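Because a simwalk2 run can take a long time, it may be worth confirming that the input files exist before starting it. The sketch below checks for the three files named in the text and screen output (BATCH2.DAT, LOCUS.DAT and PEDIGREE.DAT) and then launches simwalk2. The executable name "simwalk2" and the assumption that it reads its inputs from the working directory are guesses about your installation, and the function names are hypothetical.

```python
import os
import subprocess

# File names as they appear in the text and in the simwalk2 screen output.
REQUIRED_INPUTS = ("BATCH2.DAT", "LOCUS.DAT", "PEDIGREE.DAT")


def missing_inputs(directory="."):
    """Return the required simwalk2 input files that are absent from `directory`."""
    present = set(os.listdir(directory))
    return [name for name in REQUIRED_INPUTS if name not in present]


def run_simwalk2(directory=".", executable="simwalk2"):
    """Run simwalk2 in `directory` after checking that its inputs exist.

    Assumption: the executable is called 'simwalk2' and is on the PATH.
    """
    missing = missing_inputs(directory)
    if missing:
        raise FileNotFoundError("run prelude first; missing: " + ", ".join(missing))
    subprocess.run([executable], cwd=directory, check=True)
```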
Simwalk2 places summary IBD statistics for each pedigree in
a file named IBD-01.[nnn], where [nnn] is the pedigree's serial
number. To collect these statistics into a single QTDT IBD file
(qtdt.ibd), run finale IBD-01.*.
Finale - (c) 1999-2000 Goncalo Abecasis
Postprocessor for external IBD sources
Typical usage: finale IBD-01.*
Preparing qtdt.ibd file...
Processed 50 Simwalk2 IBD files...
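On the command line, the shell expands IBD-01.* into the list of per-pedigree files before finale sees it. A script that starts finale without a shell has to perform that expansion itself. The sketch below shows one way to do so. It assumes finale is on your PATH and accepts the expanded file list as separate arguments, as it would after shell expansion. The function names are hypothetical.

```python
import glob
import os
import subprocess


def simwalk2_ibd_files(directory="."):
    """Return the IBD-01.* files in `directory`, sorted by name."""
    return sorted(glob.glob(os.path.join(directory, "IBD-01.*")))


def finale_command(ibd_files):
    """Build the finale command line for an explicit list of IBD files."""
    if not ibd_files:
        raise ValueError("no IBD-01.* files found; has simwalk2 finished?")
    return ["finale"] + list(ibd_files)


def collect_ibd(directory="."):
    """Run finale on all simwalk2 IBD files and return how many were passed."""
    files = [os.path.basename(path) for path in simwalk2_ibd_files(directory)]
    subprocess.run(finale_command(files), cwd=directory, check=True)
    return len(files)
```

Comparing the value returned by collect_ibd() with the number of pedigrees in the input file is a quick way to notice a simwalk2 run that stopped early.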
Genehunter is less flexible than Simwalk2 but much faster for
small pedigrees. To use genehunter in IBD calculations:
- Run prelude as above (prelude -d sibs.dat -p sibs.ped
-aa -t 0.001). The recombination fractions used by genehunter
are in the file genehunter.in, which may be edited if
required.
- Run gh < genehunter.in which instructs genehunter
to execute all the commands in the genehunter.in file
(you must have genehunter2!).
- Finally, collect IBD estimates by running finale genehunter.in.
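The three genehunter steps above can also be scripted. The following sketch lists each command together with the file, if any, that is fed to its standard input, which is what the "<" redirection does at the shell prompt. It assumes that prelude, gh and finale are all on your PATH. The function names are hypothetical.

```python
import subprocess


def genehunter_steps(dat_file="sibs.dat", ped_file="sibs.ped", theta="0.001"):
    """Return (argv, stdin_file) pairs for the three steps listed above."""
    return [
        (["prelude", "-d", dat_file, "-p", ped_file, "-aa", "-t", theta], None),
        (["gh"], "genehunter.in"),  # same as: gh < genehunter.in
        (["finale", "genehunter.in"], None),
    ]


def run_steps(steps):
    """Run each step in order, stopping at the first one that fails."""
    for argv, stdin_file in steps:
        if stdin_file is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdin_file) as handle:
                subprocess.run(argv, stdin=handle, check=True)
```

To edit the recombination fractions in genehunter.in, run the prelude step on its own, make the changes, and then pass the remaining two steps to run_steps.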
To see how qtdt can estimate heritabilities and evaluate
whether a candidate polymorphism is the disease mutation, proceed
to the next section.