Input Files
qtdt requires a pair of matched pedigree and data files
as input. We will start by analysing the example files trios.dat
and trios.ped. These are standard text files, where columns
are separated by tabs or spaces.
Start by looking at trios.dat:
File: trios.dat
M SNP_1
T Trait_1
T Trait_2
This file describes the organization of the pedigree file.
Possible types of pedigree data include marker genotypes (M),
traits (T) and covariates (C). Each data item is
described on a separate line. In this case, the pedigree will
include a single marker genotype (SNP_1) and two phenotypes
for each individual (Trait_1 and Trait_2). Markers,
traits and covariates can be defined in any order, but the data
and pedigree files must match.
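As an illustration of the layout described above, here is a minimal Python sketch that reads a data file into an ordered list of (type, name) pairs. The function name `parse_data_file` is hypothetical and not part of the QTDT package, and the sketch only accepts the three type codes mentioned in this section (M, T and C); the real program may recognise others.

```python
from typing import List, Tuple

# Type codes described in this section; other codes are rejected in this sketch.
VALID_TYPES = {"M": "marker", "T": "trait", "C": "covariate"}


def parse_data_file(text: str) -> List[Tuple[str, str]]:
    """Return (type code, item name) pairs in the order they appear."""
    items = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()  # columns may be separated by tabs or spaces
        if not fields:
            continue  # skip blank lines
        if len(fields) < 2:
            raise ValueError(f"line {line_number}: expected a type and a name")
        code, name = fields[0].upper(), fields[1]
        if code not in VALID_TYPES:
            raise ValueError(f"line {line_number}: unknown type {code!r}")
        items.append((code, name))
    return items


# The contents of trios.dat as shown above.
example = "M SNP_1\nT Trait_1\nT Trait_2\n"
items = parse_data_file(example)
print(items)
```

The order of the returned list matters, because it determines the order in which the extra columns are expected in the pedigree file.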
Now have a look at trios.ped:
File: trios.ped
1 1 x x 1 2/ 2 98.667 103.138
1 2 x x 2 2/ 2 99.845 96.476
1 3 1 2 1 2/ 2 108.057 105.977
10 1 x x 1 x/ x x x
10 2 x x 2 x/ x x x
10 3 1 2 1 x/ x x x
100 1 x x 1 1/ 2 82.720 100.866
100 2 x x 2 1/ 2 104.746 96.111
100 3 1 2 1 1/ 2 95.851 88.789
Each line describes a single individual, and includes a family
identifier, a personal identifier, paternal and maternal identifiers
and a sex code (1 for males, 2 for females). These are followed
by the information specified in the data file, in this case a
marker genotype and two quantitative traits.
Marker genotypes are encoded as two consecutive integers and
missing values can be encoded as zeros (0) or exes (x).
In this case, marker genotypes have been encoded as an allele pair
separated by a forward slash (/), but the slash is optional.
Notice that family 10 (lines 4 to 6) hasn't been genotyped.
Phenotypes are encoded as numbers. Missing values can be encoded
as exes (x) or using an unusual number (such as -99.999).
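The encoding rules above can be sketched in Python as follows. This is only an illustration, not the parser QTDT itself uses: the helper names are hypothetical, -99.999 is assumed to be the missing-value code, a genotype with only one known allele is treated as missing, and a parent coded as 0 is treated the same as x.

```python
from typing import Dict, List, Optional, Tuple

MISSING_TOKENS = {"x", "X"}
MISSING_NUMBER = -99.999  # assumed missing-value code for phenotypes


def parse_allele(token: str) -> Optional[int]:
    """Alleles are integers; 0 or x marks a missing allele."""
    if token in MISSING_TOKENS or token == "0":
        return None
    return int(token)


def parse_number(token: str) -> Optional[float]:
    """Phenotypes are numbers; x or the missing-value code marks a gap."""
    if token in MISSING_TOKENS:
        return None
    value = float(token)
    return None if value == MISSING_NUMBER else value


def parse_parent(token: str) -> Optional[str]:
    """Assumption: a parent coded as x or 0 is unknown (a founder's parent)."""
    return None if token in MISSING_TOKENS or token == "0" else token


def parse_pedigree_line(line: str, items: List[Tuple[str, str]]) -> Dict[str, object]:
    """Parse one individual, using the (type, name) pairs from the data file."""
    # Treating the optional slash as whitespace lets "2/ 2", "2/2" and "2 2"
    # all yield two allele tokens.
    tokens = line.replace("/", " ").split()
    record: Dict[str, object] = {
        "family": tokens[0],
        "person": tokens[1],
        "father": parse_parent(tokens[2]),
        "mother": parse_parent(tokens[3]),
        "sex": tokens[4],  # kept as the raw code: 1 for males, 2 for females
    }
    pos = 5
    for code, name in items:
        if code == "M":  # a marker uses two tokens, one per allele
            alleles = (parse_allele(tokens[pos]), parse_allele(tokens[pos + 1]))
            record[name] = None if None in alleles else alleles
            pos += 2
        else:  # traits and covariates use a single numeric token
            record[name] = parse_number(tokens[pos])
            pos += 1
    return record


items = [("M", "SNP_1"), ("T", "Trait_1"), ("T", "Trait_2")]
child = parse_pedigree_line("1 3 1 2 1 2/ 2 108.057 105.977", items)
ungenotyped = parse_pedigree_line("10 1 x x 1 x/ x x x", items)
print(child)
print(ungenotyped)
```

Replacing the slash with a space before splitting is a simple way to honour the "slash is optional" rule without special-casing each spelling of a genotype.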
You can get a summary of the contents of QTDT format pedigree
and data files by running pedstats. The output provides
a convenient check that your data is being interpreted correctly
by QTDT. Try typing pedstats -d trios.dat -p trios.ped
now.
The output should start with a brief copyright notice and summary
of available parameters. This initial output is similar for all
programs in the QTDT package:
Pedigree Statistics - (c) 1999-2000 Goncalo Abecasis
The following parameters are in effect:
QTDT Pedigree File : trios.ped (-pname)
QTDT Data File : trios.dat (-dname)
QTDT IBD File : qtdt.ibd (-iname)
Missing Value Code : -99.999 (-xname)
pedstats, for example, allows you to specify an input
data file (-d), pedigree file (-p), IBD information file
(-i) and missing value code (-x).
The rest of the output describes the family composition of
your pedigree, and summary statistics for each trait (mean, variance,
number of phenotyped individuals, number of phenotyped founders)
and marker (heterozygosity, number of genotyped individuals, number
of genotyped founders).
PEDIGREE STRUCTURE
==================
Families: 100
Individuals: 300 (200 founders, 100 nonfounders)
Family Size: 3 to 3
Generations: 2 to 2
QUANTITATIVE TRAIT STATISTICS
=============================
           [Count]      [Founder]      Mean      Var
Trait_1    297  99.0%   198  99.0%   99.570   94.635
Trait_2    297  99.0%   198  99.0%   99.552   93.777
Total      594  99.0%   396  99.0%
MARKER GENOTYPE STATISTICS
==========================
           [Count]      [Founder]    Hetero   IBD
SNP_1      297  99.0%   198  99.0%   48.8%    0/100 families
Total      297  99.0%   198  99.0%   48.8%
As you can see, the data consists of 100 families each with
a single offspring. Let's analyse it now!