How are pedigree errors identified?

Each class of relative pair, as well as pairs of non-relatives, can be characterised by a distinctive distribution of allele sharing across the genome.  For instance, full siblings are expected to share a higher proportion of alleles than half siblings.  Parent-offspring pairs are expected to share the same number of alleles, on average, as sibling pairs, but the variability in their allele sharing is much lower.  Pairs of unrelated individuals should share fewer alleles on average than half siblings.  Finally, MZ twins should share both alleles at every locus (provided there is no genotyping error).

How does GRR use IBS allele sharing to identify pedigree errors?

For every pair of individuals in a sample, we compute the mean IBS (identity-by-state) allele sharing across all available markers, along with the standard deviation of IBS across those markers.  Plotting the mean against the standard deviation lets us readily distinguish full-sib from half-sib pairs, parent-offspring pairs from siblings, and unrelated individuals from relatives.  Each relative class forms a distinct cluster on these plots, and outliers from these clusters represent likely pedigree errors.  By including all pairings in a sample, not just pairings within families, we can also detect problems such as sample duplications or related individuals who have been presumed to be unrelated.
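
The calculation described above can be sketched in a few lines of Python.  This is an illustrative sketch only, not GRR's actual code: the function names (ibs, ibs_summary, all_pairs), the genotype representation (a pair of allele codes per marker, or None when a genotype is missing) and the use of the population standard deviation are all assumptions made for the example.

```python
from itertools import combinations
from statistics import mean, pstdev


def ibs(g1, g2):
    """Count alleles (0, 1 or 2) shared identical-by-state at one marker.

    Genotypes are unordered pairs of allele codes, so both possible
    pairings of the alleles are tried and the better match is kept.
    """
    (a1, a2), (b1, b2) = g1, g2
    direct = (a1 == b1) + (a2 == b2)
    crossed = (a1 == b2) + (a2 == b1)
    return max(direct, crossed)


def ibs_summary(genos_a, genos_b):
    """Return (mean, SD) of IBS over markers typed in both individuals.

    Markers where either genotype is None (assumed to mean "missing")
    are skipped.  Returns None if the pair has no markers in common.
    """
    scores = [
        ibs(x, y)
        for x, y in zip(genos_a, genos_b)
        if x is not None and y is not None
    ]
    if not scores:
        return None
    return mean(scores), pstdev(scores)


def all_pairs(sample):
    """Summarise every pairing in the sample, not just within families.

    `sample` maps a (hypothetical) individual ID to a list of genotypes,
    one per marker, in the same marker order for every individual.
    """
    return {
        (i, j): ibs_summary(sample[i], sample[j])
        for i, j in combinations(sorted(sample), 2)
    }
```

Each value returned by all_pairs is one point on the mean versus standard deviation plot.  A pair with a mean of 2 and a standard deviation of 0, for example, shares both alleles at every marker, as expected for MZ twins or a duplicated sample.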


