University of Michigan Center for Statistical Genetics

QTDT - Tour

Input Files

qtdt requires a matched pair of pedigree and data files as input. We will start by analysing the example files trios.dat and trios.ped. These are standard text files, where columns are separated by tabs or spaces.

Start by looking at trios.dat:

 File: trios.dat
 M SNP_1
 T Trait_1
 T Trait_2

This file describes the organization of the pedigree file. Possible types of pedigree data include marker genotypes (M), traits (T) and covariates (C). Each data item is described on a separate line. In this case, the pedigree will include a single marker genotype (SNP_1) and two phenotypes for each individual (Trait_1 and Trait_2). Markers, traits and covariates can be defined in any order, but the data and pedigree files must match.
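To make the layout concrete, here is a minimal Python sketch that reads a data file into (type, name) pairs. It is not part of the QTDT package: the function name parse_dat and the TYPE_NAMES table are made up for illustration, and only the three type codes mentioned above (M, T and C) are recognised.

```python
# Hypothetical helper, not part of QTDT: read a data file into (type, name) pairs.
# Only the three item types described in the text are handled here.
TYPE_NAMES = {"M": "marker", "T": "trait", "C": "covariate"}


def parse_dat(text):
    """Return a list of (type, name) tuples, one per data-file line."""
    items = []
    for line in text.splitlines():
        fields = line.split()  # columns are separated by tabs or spaces
        if not fields:
            continue  # skip blank lines
        code, name = fields[0], fields[1]
        if code not in TYPE_NAMES:
            raise ValueError(f"unknown item type: {code!r}")
        items.append((TYPE_NAMES[code], name))
    return items


TRIOS_DAT = """\
M SNP_1
T Trait_1
T Trait_2
"""

items = parse_dat(TRIOS_DAT)
for kind, name in items:
    print(kind, name)
```

The order of the returned list is the order in which the corresponding columns must appear in the pedigree file.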

Now have a look at trios.ped:

 File: trios.ped
   1  1  x  x  1   2/ 2   98.667  103.138
   1  2  x  x  2   2/ 2   99.845   96.476
   1  3  1  2  1   2/ 2  108.057  105.977
  10  1  x  x  1   x/ x        x        x
  10  2  x  x  2   x/ x        x        x
  10  3  1  2  1   x/ x        x        x
 100  1  x  x  1   1/ 2   82.720  100.866
 100  2  x  x  2   1/ 2  104.746   96.111
 100  3  1  2  1   1/ 2   95.851   88.789

Each line describes a single individual, and includes a family identifier, a personal identifier, paternal and maternal identifiers and a sex code (1 for males, 2 for females). These are followed by the information specified in the data file, in this case a marker genotype and two quantitative traits.

Marker genotypes are encoded as two consecutive integers, and missing values can be encoded as zeros (0) or exes (x). In this case, marker genotypes have been encoded as an allele pair separated by a forward slash (/), but the slash is optional. Notice that family 10 (lines 4 to 6) hasn't been genotyped.
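The column layout can be sketched in Python as follows. This is an illustration only, not how qtdt itself reads files: parse_ped_line, parse_allele and the layout argument are hypothetical names. The sketch treats the slash as whitespace, so "2/ 2", "2/2" and "2 2" all parse the same way, which assumes that no identifier contains a slash.

```python
# Hypothetical sketch, not QTDT code: split one pedigree line into its parts.
ID_COLUMNS = ("family", "person", "father", "mother", "sex")


def parse_allele(token):
    """Return the allele as an int, or None for a missing value (0 or x)."""
    return None if token in ("0", "x") else int(token)


def parse_ped_line(line, layout):
    """layout lists the item types in data-file order, e.g. ["M", "T", "T"]."""
    # The slash between alleles is optional, so treat it as whitespace.
    tokens = line.replace("/", " ").split()
    record = dict(zip(ID_COLUMNS, tokens[:5]))
    record["genotypes"] = []
    record["values"] = []  # trait and covariate columns, kept as raw text
    pos = 5
    for kind in layout:
        if kind == "M":
            # A marker occupies two tokens, one per allele.
            pair = (parse_allele(tokens[pos]), parse_allele(tokens[pos + 1]))
            record["genotypes"].append(pair)
            pos += 2
        else:
            record["values"].append(tokens[pos])
            pos += 1
    return record


layout = ["M", "T", "T"]  # matches trios.dat
child = parse_ped_line("   1  3  1  2  1   2/ 2  108.057  105.977", layout)
untyped = parse_ped_line("  10  1  x  x  1   x/ x        x        x", layout)
print(child)
print(untyped)
```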

Phenotypes are encoded as numbers. Missing values can be encoded as exes (x) or using an unusual number (such as -99.999). You can get a summary of the contents of QTDT format pedigree and data files by running pedstats. The output provides a convenient check that your data is being interpreted correctly by QTDT. Try typing pedstats -d trios.dat -p trios.ped now.
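The two ways of marking a missing phenotype can be illustrated with a short Python sketch. The function parse_phenotype is a hypothetical name, and the default code of -99.999 is taken from the pedstats parameter summary; the real programs may compare values differently.

```python
# Hypothetical sketch, not QTDT code: decode one phenotype column.
def parse_phenotype(token, missing_code=-99.999):
    """Return the phenotype as a float, or None when it is missing."""
    if token == "x":
        return None  # explicit missing marker
    value = float(token)
    # A value equal to the missing value code also counts as missing.
    return None if value == missing_code else value


print(parse_phenotype("98.667"))
print(parse_phenotype("x"))
print(parse_phenotype("-99.999"))
```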

The output should start with a brief copyright notice and summary of available parameters. This initial output is similar for all programs in the QTDT package:

Pedigree Statistics - (c) 1999-2000 Goncalo Abecasis
The following parameters are in effect:
            QTDT Pedigree File :       trios.ped (-pname)
                QTDT Data File :       trios.dat (-dname)
                 QTDT IBD File :        qtdt.ibd (-iname)
            Missing Value Code :         -99.999 (-xname)

pedstats, for example, allows you to specify an input data file (-d), pedigree file (-p), IBD information file (-i) and missing value code (-x).

The rest of the output describes the family composition of your pedigree, and summary statistics for each trait (mean, variance, number of phenotyped individuals, number of phenotyped founders) and marker (heterozygosity, number of genotyped individuals, number of genotyped founders).

PEDIGREE STRUCTURE
==================
Families:    100
Individuals: 300 (200 founders, 100 nonfounders)
Family Size: 3 to 3
Generations: 2 to 2

QUANTITATIVE TRAIT STATISTICS
=============================
                 [Count]       [Founder]              Mean        Var
        Trait_1      297  99.0%      198  99.0%     99.570     94.635
        Trait_2      297  99.0%      198  99.0%     99.552     93.777
          Total      594  99.0%      396  99.0%

MARKER GENOTYPE STATISTICS
==========================
                 [Count]       [Founder]            Hetero        IBD
          SNP_1      297  99.0%      198  99.0%      48.8%         0/100 families
          Total      297  99.0%      198  99.0%      48.8%

As you can see, the data consists of 100 families, each with a single offspring. Let's analyse it now!
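To see where the counts in this summary come from, the following Python sketch recomputes a few of them for the nine trios.ped rows listed earlier. It is a rough illustration, not a reimplementation of pedstats: the function summarize is a hypothetical name, the column positions are hard-coded for the trios layout, and a founder is assumed to be anyone whose parental identifiers are both x, as in the sample rows.

```python
# Hypothetical sketch, not pedstats: count families, founders and typed individuals.
SAMPLE_ROWS = """\
   1  1  x  x  1   2/ 2   98.667  103.138
   1  2  x  x  2   2/ 2   99.845   96.476
   1  3  1  2  1   2/ 2  108.057  105.977
  10  1  x  x  1   x/ x        x        x
  10  2  x  x  2   x/ x        x        x
  10  3  1  2  1   x/ x        x        x
 100  1  x  x  1   1/ 2   82.720  100.866
 100  2  x  x  2   1/ 2  104.746   96.111
 100  3  1  2  1   1/ 2   95.851   88.789
"""


def summarize(text):
    families = set()
    individuals = founders = genotyped = genotyped_founders = 0
    trait1 = []
    for line in text.splitlines():
        tokens = line.replace("/", " ").split()  # the slash is optional
        if not tokens:
            continue
        family, person, father, mother, sex, a1, a2, t1, t2 = tokens
        is_founder = father == "x" and mother == "x"  # assumption, see lead-in
        families.add(family)
        individuals += 1
        founders += is_founder
        if a1 not in ("0", "x") and a2 not in ("0", "x"):
            genotyped += 1
            genotyped_founders += is_founder
        if t1 != "x":
            trait1.append(float(t1))
    return {
        "families": len(families),
        "individuals": individuals,
        "founders": founders,
        "nonfounders": individuals - founders,
        "genotyped": genotyped,
        "genotyped_founders": genotyped_founders,
        "trait1_count": len(trait1),
        "trait1_mean": sum(trait1) / len(trait1),
    }


summary = summarize(SAMPLE_ROWS)
for key, value in summary.items():
    print(f"{key:>20}: {value}")
```

For these nine rows the sketch finds 3 families and 9 individuals (6 founders, 3 nonfounders), of whom 6 are genotyped, because family 10 is untyped. The same untyped family explains the full-file figures of 297 out of 300 individuals (99.0%).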


 
 

University of Michigan | School of Public Health | Abecasis Lab