QTDT - Tour

Variance Components Models of Association and Permutations for Exact p-values

Have a look at the second set of example files, sibs.dat, sibs.ped and sibs.ibd. These describe 50 nuclear families with four offspring each, but no parental phenotypes or genotypes. The pedigree file may look familiar, as it is, in fact, a pre-makeped LINKAGE format file. The file of IBD probabilities is generated automatically by prelude and finale, as described in the next section.

The data include 3 markers, 1 quantitative phenotype and 1 covariate. Missing genotypes are encoded as zeros, and missing phenotypes are encoded as -99.999. If you wish, run pedstats -d sibs.dat -p sibs.ped -x-99.999 to get a more detailed description of these files (do not place a space between -x and -99.999).

File: sibs.dat

 M  SNP_1
 M  SNP_2
 M  SNP_3
 T  Trait
 C  Covariate

File: sibs.ped

   1  1  0  0  1   0  0   0  0   0  0  -99.999  -99.999
   1  2  0  0  2   0  0   0  0   0  0  -99.999  -99.999
   1  3  1  2  1   1  1   1  1   2  2   80.690   81.722
   1  4  1  2  1   1  2   1  2   2  2   80.955   90.420
   1  5  1  2  1   1  2   1  2   2  2   73.202   90.747
   1  6  1  2  2   1  1   1  1   2  2   90.030   99.436
(... additional families follow ...)
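As a rough illustration of how the two files fit together, the sketch below reads one row of sibs.ped using the column order implied by sibs.dat: five identifier columns (family, individual, father, mother, sex), then two alleles for each of the three markers, then the trait and the covariate. The names Person and parse_ped_line are hypothetical helpers written for this example; they are not part of qtdt or pedstats.

```python
from typing import List, NamedTuple, Optional, Tuple

MISSING_PHENOTYPE = -99.999  # missing-value code used in sibs.ped


class Person(NamedTuple):
    family: str
    person: str
    father: str
    mother: str
    sex: str
    genotypes: List[Optional[Tuple[int, int]]]  # None when a genotype is missing
    trait: Optional[float]
    covariate: Optional[float]


def parse_value(token: str) -> Optional[float]:
    """Convert a phenotype token to a float, mapping the missing code to None."""
    value = float(token)
    return None if value == MISSING_PHENOTYPE else value


def parse_ped_line(line: str, n_markers: int = 3) -> Person:
    """Parse one pedigree row laid out as described by sibs.dat (3 M, 1 T, 1 C)."""
    fields = line.split()
    family, person, father, mother, sex = fields[:5]
    genotypes: List[Optional[Tuple[int, int]]] = []
    for m in range(n_markers):
        a1, a2 = fields[5 + 2 * m], fields[6 + 2 * m]
        # A zero allele marks a missing genotype.
        genotypes.append(None if "0" in (a1, a2) else (int(a1), int(a2)))
    trait_token, covariate_token = fields[5 + 2 * n_markers: 7 + 2 * n_markers]
    return Person(family, person, father, mother, sex, genotypes,
                  parse_value(trait_token), parse_value(covariate_token))


if __name__ == "__main__":
    print(parse_ped_line("1  3  1  2  1   1  1   1  1   2  2   80.690   81.722"))
```

Applied to the first two rows above (the parents), this yields no genotypes and no phenotypes, which is consistent with the description of families without parental data.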

Simple linear models do not provide valid tests of linkage disequilibrium when multiple offspring per family are considered. qtdt uses variance components to model the phenotypic similarities that are common in family data. Variance components are specified using the -w option. A typical model for the variances might include environmental (e), polygenic (g) and additive (a) components of variance.
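To make the idea of a variance model concrete, here is a minimal sketch of the covariance matrix that a standard variance components formulation would assign to a sibship. It assumes the usual textbook parameterization: the variance of each individual is Ve + Vg + Va, and the covariance between two full siblings is 0.5 * Vg (twice their kinship coefficient) plus Va weighted by the proportion of alleles they share IBD at the marker. Whether qtdt parameterizes the model in exactly this way is an assumption here, and sibship_covariance and its arguments are hypothetical names.

```python
from typing import List


def sibship_covariance(ve: float, vg: float, va: float,
                       ibd: List[List[float]]) -> List[List[float]]:
    """Build the phenotypic covariance matrix for a set of full siblings.

    ibd[i][j] is the estimated proportion of alleles that siblings i and j
    share identical by descent at the marker (a value between 0 and 1).
    """
    n = len(ibd)
    twice_kinship = 0.5  # 2 * kinship coefficient for full siblings
    omega = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Total variance: environmental + polygenic + additive
                omega[i][j] = ve + vg + va
            else:
                # Siblings share polygenic and marker-linked additive effects
                omega[i][j] = twice_kinship * vg + ibd[i][j] * va
    return omega


if __name__ == "__main__":
    # Illustrative values only: two siblings sharing half their alleles IBD
    for row in sibship_covariance(1.0, 2.0, 4.0, [[1.0, 0.5], [0.5, 1.0]]):
        print(row)
```

The environmental component appears only on the diagonal, which is why it contributes to the total variance but not to the resemblance between siblings.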

Run the following command: qtdt -d sibs.dat -p sibs.ped -i sibs.ibd -x-99.999 -wega (the command-line parameters specify the input file names, the missing-value code and the model for the variances). After the usual copyright notice, reference list, and summary of command-line parameters, you should see this model description:

The following models will be evaluated...
  NULL MODEL
     Means = Mu + Covariate + B
 Variances = Ve + Vg + Va
  FULL MODEL
     Means = Mu + Covariate + B + W
 Variances = Ve + Vg + Va

The model description now includes not only a linear model for the means (with the covariate defined in the pedigree file) but also a model for the variances. Means and variances are fitted by maximum likelihood using a numeric minimizer (alternative minimizers can be selected with the -n command-line option). The results section looks similar to the one in the previous section:

Testing trait:                          Trait
=============================================
Testing marker:                         SNP_1
---------------------------------------------
 Allele   df(0)  LnLk(0)   df(T)  LnLk(T)   ChiSq       p
    1 :     194   681.22     193   674.58   13.28  0.0003  ( 164/200 probands)
    2 :     194   681.22     193   674.58   13.28  0.0003  ( 164/200 probands)
Testing marker:                         SNP_2
---------------------------------------------
 Allele   df(0)  LnLk(0)   df(T)  LnLk(T)   ChiSq       p
    1 :     194   683.68     193   678.30   10.76  0.0010  ( 152/200 probands)
    2 :     194   683.68     193   678.30   10.76  0.0010  ( 152/200 probands)
Testing marker:                         SNP_3
---------------------------------------------
 Allele   df(0)  LnLk(0)   df(T)  LnLk(T)   ChiSq       p
    1 :     194   684.99     193   682.61    4.76  0.0291  ( 168/200 probands)
    2 :     194   684.99     193   682.61    4.76  0.0291  ( 168/200 probands)
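The ChiSq and p columns in the table above can be reproduced from the other columns, assuming that LnLk(0) and LnLk(T) are minus log-likelihoods for the null and full models and that the statistic is a likelihood ratio test with df(0) - df(T) = 1 degree of freedom. The sketch below is written for this example (likelihood_ratio_test is a hypothetical name, not a qtdt function) and uses the closed form for the upper tail of a chi-square distribution with 1 degree of freedom.

```python
import math
from typing import Tuple


def likelihood_ratio_test(lnlk_null: float, lnlk_full: float,
                          df_null: int, df_full: int) -> Tuple[float, float]:
    """Return (chi-square, p-value) for a 1-df likelihood ratio test.

    lnlk_null and lnlk_full are assumed to be minus log-likelihoods, so the
    null value is the larger of the two when the full model fits better.
    """
    if df_null - df_full != 1:
        raise ValueError("this sketch only handles tests with 1 degree of freedom")
    chisq = 2.0 * (lnlk_null - lnlk_full)
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(max(chisq, 0.0) / 2.0))
    return chisq, p_value


if __name__ == "__main__":
    rows = {
        "SNP_1": (681.22, 674.58),
        "SNP_2": (683.68, 678.30),
        "SNP_3": (684.99, 682.61),
    }
    for marker, (null, full) in rows.items():
        chisq, p = likelihood_ratio_test(null, full, 194, 193)
        print(f"{marker}: ChiSq = {chisq:.2f}, p = {p:.4f}")
```

Under these assumptions, the values computed for the three markers agree with the ChiSq and p columns reported by qtdt to the precision shown.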

SNP_1 appears to provide the strongest evidence for association. To find out what happens if you don't include the covariate in the analysis, run qtdt -d sibs.dat -p sibs.ped -i sibs.ibd -x-99.999 -wega -c-.

Variance components models can be sensitive to the phenotypic distribution, especially in small or selected samples. qtdt can calculate empirical p-values using a Monte-Carlo permutation framework. These permutations condition on the trait distribution, linkage and familiality, and provide a test for linkage disequilibrium. This can be relatively slow, but provides added confidence in your result.
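For readers unfamiliar with empirical p-values, the sketch below shows the general counting step: the observed statistic is compared with the statistics obtained from permuted data sets. It does not reproduce the permutation scheme used by qtdt, which must preserve the trait distribution, linkage and familiality. The function name and the add-one convention in the formula are assumptions chosen for illustration; qtdt may report its empirical p-values differently.

```python
from typing import Sequence


def empirical_p_value(observed: float, permuted: Sequence[float]) -> float:
    """Estimate a p-value as the share of permuted statistics >= the observed one.

    The add-one terms count the observed data set as one of the possible
    arrangements, so the estimate can never be exactly zero.
    """
    at_least_as_extreme = sum(1 for stat in permuted if stat >= observed)
    return (at_least_as_extreme + 1) / (len(permuted) + 1)


if __name__ == "__main__":
    # Illustrative statistics only, not output from qtdt
    print(empirical_p_value(13.28, [1.0, 2.0, 14.0, 0.5]))
```

With this convention, the smallest p-value obtainable from 1000 permutations is 1/1001, so more permutations are needed to resolve very small p-values.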

To try some permutations, run qtdt -d sibs.dat -p sibs.ped -i sibs.ibd -x-99.999 -wega -m1000 -1 and go have a break!

To find out how the IBD probabilities were estimated using simwalk2, proceed to the next section.

University of Michigan | School of Public Health | Abecasis Lab