GAINQC Relation Checks
The genotype data can also be used to infer the relationships among the subjects in the study, which in turn
lets us verify the putative relationships. Pairs whose relationship appears to be misspecified are written to a
relation information file.
In order to infer the relationships among the individual samples, we compute the three allele sharing
Identity By Descent (IBD) coefficients for all pairs of individuals. These IBD coefficients (the probabilities of
sharing 0, 1 or 2 alleles by descent) are derived from the Identity By State (IBS, sharing by state as
opposed to descent) coefficients. The IBS coefficients are computed from the genotypes of the two individuals and
the allele frequencies of each SNP, and are updated after each SNP is processed in the second pass. A minor
allele frequency restriction can be applied to the markers used for relationship inference. More information on
these coefficients, along with typical values for certain relationship pairs and the equations to convert from
IBS to IBD probabilities, can be found in
Goncalo's lecture slides (pp 23-30).
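The conversion from IBS to IBD can be illustrated with a small sketch. It is not GAINQC's code: it uses one
common method-of-moments conversion that assumes Hardy-Weinberg proportions, and the equations in the slides
referenced above may differ. The function names, the genotype coding (minor allele counts 0/1/2, None for
missing) and the min_maf parameter are assumptions made for this example.

```python
def estimate_ibd(g1, g2, freqs, min_maf=0.0):
    """Estimate (z0, z1, z2) for one pair by a method-of-moments conversion.

    g1, g2  : minor allele counts per SNP (0, 1 or 2; None if missing)
    freqs   : minor allele frequency of each SNP
    min_maf : assumed filter; SNPs with a lower frequency are skipped
    """
    ibs = [0, 0, 0]        # observed number of SNPs with IBS = 0, 1, 2
    e00 = e10 = e11 = 0.0  # expected counts, named e<IBS><IBD>
    for a, b, p in zip(g1, g2, freqs):
        # Skip missing genotypes, filtered SNPs and monomorphic SNPs.
        if a is None or b is None or p < min_maf or p <= 0.0 or p >= 1.0:
            continue
        q = 1.0 - p
        ibs[2 - abs(a - b)] += 1             # alleles shared by state
        e00 += 2 * p * p * q * q             # P(IBS=0 | IBD=0)
        e10 += 4 * p**3 * q + 4 * p * q**3   # P(IBS=1 | IBD=0)
        e11 += 2 * p * q                     # P(IBS=1 | IBD=1)
    if e00 == 0.0:
        raise ValueError("no usable SNPs for this pair")
    # Solve for the IBD proportions and clamp them to the valid range.
    z0 = min(1.0, max(0.0, ibs[0] / e00))
    z1 = min(1.0 - z0, max(0.0, (ibs[1] - z0 * e10) / e11))
    z2 = 1.0 - z0 - z1
    return z0, z1, z2


def kinship(z0, z1, z2):
    """Kinship coefficient implied by the three IBD coefficients."""
    return z1 / 4.0 + z2 / 2.0
```

For two identical genotype vectors this yields (0.0, 0.0, 1.0) and a kinship of 0.5, the value expected for
duplicates or MZ twins.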
For the relationship checks in GAINQC, only the following relationships are inferred:
- Duplicates/MZ twins
- Parent-Offspring
- Siblings
- Half-siblings
- Unrelated
Any pair that cannot be inferred as one of these relationships is simply called a related pair.
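As an illustration of how a pair might be assigned to one of these categories, the sketch below picks the
relationship whose expected IBD coefficients are closest to the estimated ones and falls back to "Related" when
nothing is close enough. The expected values are the standard theoretical ones; the distance measure, the
max_distance cutoff and the function name are assumptions, not GAINQC's actual decision rule.

```python
import math

# Theoretical (z0, z1, z2) for each relationship inferred by the checks.
EXPECTED_IBD = {
    "Duplicates/MZ twins": (0.0, 0.0, 1.0),
    "Parent-Offspring": (0.0, 1.0, 0.0),
    "Siblings": (0.25, 0.5, 0.25),
    "Half-siblings": (0.5, 0.5, 0.0),
    "Unrelated": (1.0, 0.0, 0.0),
}


def classify_pair(ibd, max_distance=0.15):
    """Return the closest relationship, or "Related" if none is close.

    ibd          : estimated (z0, z1, z2) for the pair
    max_distance : assumed cutoff on the Euclidean distance
    """
    name, distance = min(
        ((rel, math.dist(ibd, expected)) for rel, expected in EXPECTED_IBD.items()),
        key=lambda item: item[1],
    )
    return name if distance <= max_distance else "Related"
```

A pair with coefficients near (0.75, 0.25, 0), such as first cousins, is not close to any listed relationship
and is therefore reported as "Related".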
The kinship coefficient for each pair of samples is computed. The histogram of kinship coefficients
among all sample pairs with the same putative relationship (a relation group) is then examined. Any pair whose
kinship coefficient is more similar (on the basis of z-scores) to that of some other relationship is flagged and
written to the relation information file. The mean kinship of each relation group is also compared to the
expected kinship for that relationship. If the difference is large, a warning message is displayed. If there
are no pairs with a given putative relationship, then the expected kinship for that relationship is used instead
to infer the relationships.
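The z-score comparison described above can be sketched as follows. This is a minimal illustration, not GAINQC's
implementation: the function name, the default_sd used for empty or zero-variance groups and the warn_diff
cutoff for "large" differences are all assumptions. Because Parent-Offspring and Siblings share the same
expected kinship (0.25), the sketch never flags a pair in favour of a relationship with the same expected
kinship as its putative one.

```python
from statistics import mean, pstdev

# Theoretical kinship coefficient for each relationship.
EXPECTED_KINSHIP = {
    "Duplicates/MZ twins": 0.5,
    "Parent-Offspring": 0.25,
    "Siblings": 0.25,
    "Half-siblings": 0.125,
    "Unrelated": 0.0,
}


def check_relation_groups(pairs, default_sd=0.03, warn_diff=0.05):
    """Flag pairs whose kinship fits another relation group better.

    pairs      : list of (pair_id, putative_relation, kinship); the
                 relation must be a key of EXPECTED_KINSHIP
    default_sd : assumed spread for empty or zero-variance groups
    warn_diff  : assumed cutoff for a "large" mean-vs-expected difference
    Returns (flagged, warnings).
    """
    groups = {}
    for _, rel, k in pairs:
        groups.setdefault(rel, []).append(k)

    stats, warnings = {}, []
    for rel, expected in EXPECTED_KINSHIP.items():
        values = groups.get(rel)
        if values:
            centre = mean(values)
            sd = pstdev(values) or default_sd
            if abs(centre - expected) > warn_diff:
                warnings.append(rel)
        else:
            # No pairs in this group: fall back to the expected kinship.
            centre, sd = expected, default_sd
        stats[rel] = (centre, sd)

    flagged = []
    for pair_id, rel, k in pairs:
        own_z = abs(k - stats[rel][0]) / stats[rel][1]
        for other, (centre, sd) in stats.items():
            if EXPECTED_KINSHIP[other] == EXPECTED_KINSHIP[rel]:
                continue  # kinship alone cannot separate these
            if abs(k - centre) / sd < own_z:
                flagged.append((pair_id, rel))
                break
    return flagged, warnings
```

Note that an outlier inflates the mean and spread of its own group, so a more robust centre and scale (for
example median-based) may be preferable in practice.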
NOTE: There is a known bug with relationship checks. The next release should have this bug fixed. ETA for the
next release is Jan 31st, 2008.