PEDSTATS Tutorial -- Fast Exact Hardy-Weinberg Test for SNPs

The exact Hardy-Weinberg test implemented in PEDSTATS is based on an exact calculation of the probability of observing H heterozygotes, conditional on the number of copies of the minor SNP allele in the sample.

In general, if we have N total genotypes, R copies of the minor allele, and therefore C = 2N - R copies of the common allele, there will be (2N)! / (C! R!) possible arrangements of the alleles in the sample. If we have H heterozygotes, the number of homozygous carriers of the minor allele can be written as

R_H = (R - H) / 2

while the number of homozygous carriers of the common allele is given by

C_H = (2N - R - H) / 2 = (C - H) / 2.

Both counts must be non-negative integers, so H has the same parity as R and cannot exceed R.
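
As a quick check of these two expressions, the sketch below computes the three genotype counts for a given N, R, and H. The function name `genotype_counts` and the example numbers are illustrative assumptions, not part of PEDSTATS.

```python
def genotype_counts(n, r, h):
    """Return (R_H, H, C_H): minor homozygotes, heterozygotes, common homozygotes.

    n: total genotypes, r: copies of the minor allele, h: heterozygotes.
    """
    c = 2 * n - r  # copies of the common allele
    if h < 0 or h > min(r, c) or (r - h) % 2 != 0:
        raise ValueError("h must lie in 0..min(r, c) and share the parity of r")
    r_h = (r - h) // 2  # homozygous carriers of the minor allele
    c_h = (c - h) // 2  # homozygous carriers of the common allele
    return r_h, h, c_h


# Illustrative numbers: 100 genotypes, 40 minor-allele copies, 30 heterozygotes.
print(genotype_counts(100, 40, 30))  # (5, 30, 65)
```

The three counts always add up to N, which is a convenient sanity check on the inputs.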

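Of the (2N)! / (C! R!) possible allele arrangements, 2^H N! / (H! R_H! C_H!) contain exactly H heterozygotes: there are N! / (H! R_H! C_H!) ways to assign the three genotype classes to the N individuals, and each heterozygote can carry its two alleles in either order. Therefore, the probability of observing H heterozygotes, conditional on R copies of the minor allele and N total genotypes, is given by

P(H | N, R) = [2^H N! / (H! R_H! C_H!)] * [R! C! / (2N)!]

where R_H and C_H are the two homozygote counts defined above.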
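
For small samples the formula can be evaluated directly with exact integer arithmetic, which is handy for checking results by hand. The sketch below is one straightforward way to do that in Python. The function name `hwe_prob` is hypothetical, and this direct evaluation is not the fast algorithm PEDSTATS uses.

```python
from fractions import Fraction
from math import factorial


def hwe_prob(n, r, h):
    """P(H = h | N = n, R = r), evaluated directly from the closed-form formula."""
    c = 2 * n - r  # copies of the common allele
    if h < 0 or h > min(r, c) or (r - h) % 2 != 0:
        return Fraction(0)  # impossible heterozygote count
    r_h = (r - h) // 2  # minor-allele homozygotes
    c_h = (c - h) // 2  # common-allele homozygotes
    # Arrangements with exactly h heterozygotes: 2^h N! / (h! R_H! C_H!)
    ways_with_h = 2**h * factorial(n) // (factorial(h) * factorial(r_h) * factorial(c_h))
    # All arrangements of the alleles: (2N)! / (C! R!)
    all_ways = factorial(2 * n) // (factorial(r) * factorial(c))
    return Fraction(ways_with_h, all_ways)


# Tiny example: N = 2 genotypes, R = 2 minor-allele copies.
print(hwe_prob(2, 2, 0))  # 1/3
print(hwe_prob(2, 2, 2))  # 2/3
```

With N = 2 and R = 2 there are 6 allele arrangements, of which 2 contain no heterozygotes and 4 contain two, so the probabilities sum to 1 as expected.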

PEDSTATS uses a fast algorithm to evaluate this probability for all possible values of H in order to recreate the exact probability distribution. Once this distribution is known, a p-value for the observed number of heterozygotes is calculated as:
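
One way to recreate the distribution without evaluating factorials for every H is to use the ratio of successive terms, P(H + 2 | N, R) / P(H | N, R) = 4 R_H C_H / ((H + 2)(H + 1)), which follows from the formula for P(H | N, R). The Python sketch below applies that recurrence and then computes a p-value. It rests on assumptions that are not taken from PEDSTATS itself: the function names `het_distribution` and `hwe_pvalue` are hypothetical, exact fractions are used for clarity rather than speed, and the p-value is assumed to be the sum of the probabilities of every heterozygote count that is no more likely than the observed one, a common convention for exact tests.

```python
from fractions import Fraction


def het_distribution(n, r):
    """Return {h: P(H = h | N = n, R = r)} for every feasible h.

    Uses P(h + 2) / P(h) = 4 * R_H * C_H / ((h + 2) * (h + 1)),
    so no factorials are needed.
    """
    c = 2 * n - r                # copies of the common allele
    h = r % 2                    # smallest h with the same parity as r
    weights = {h: Fraction(1)}   # unnormalised weights, rescaled at the end
    while h + 2 <= min(r, c):
        r_h = (r - h) // 2       # minor-allele homozygotes at the current h
        c_h = (c - h) // 2       # common-allele homozygotes at the current h
        weights[h + 2] = weights[h] * Fraction(4 * r_h * c_h, (h + 2) * (h + 1))
        h += 2
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def hwe_pvalue(n, r, h_obs):
    """Sum the probabilities of all counts no more likely than h_obs (assumed convention)."""
    dist = het_distribution(n, r)
    p_obs = dist[h_obs]          # raises KeyError if h_obs is not a feasible count
    return sum(p for p in dist.values() if p <= p_obs)


dist = het_distribution(2, 2)
print(dist[0], dist[2])          # 1/3 2/3
print(hwe_pvalue(2, 2, 0))       # 1/3
print(hwe_pvalue(2, 2, 2))       # 1
```

Because each term is obtained from the previous one with a single multiplication, the whole distribution costs a number of steps proportional to R. A production implementation would more likely use floating point and start the recurrence near the most probable H to limit rounding error.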

