GOLD - Available Statistics
Notation
Consider markers A and B, with r and c alleles,
respectively. As usual, define the following variables based
on observed haplotype counts:

These can be used to derive the haplotype frequencies:

The available measures of disequilibrium are defined in terms
of the above quantities.
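As an illustration of this notation, the following minimal Python sketch turns a table of observed haplotype counts into haplotype and allele frequencies. The table layout (rows for the r alleles of marker A, columns for the c alleles of marker B), the example counts, and the names `haplotype_frequencies`, `joint`, `p` and `q` are assumptions made for this sketch and are not GOLD's own symbols.

```python
# Hypothetical 2 x 3 table of observed haplotype counts:
# rows = alleles of marker A (r = 2), columns = alleles of marker B (c = 3).
counts = [
    [30, 10, 10],
    [10, 20, 20],
]

def haplotype_frequencies(counts):
    """Return (total, joint, p, q) for an r x c table of haplotype counts."""
    total = sum(sum(row) for row in counts)
    # Haplotype frequencies: each count divided by the total count.
    joint = [[n / total for n in row] for row in counts]
    p = [sum(row) for row in joint]        # allele frequencies at marker A
    q = [sum(col) for col in zip(*joint)]  # allele frequencies at marker B
    return total, joint, p, q

total, joint, p, q = haplotype_frequencies(counts)
print(total, p, q)
```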
Lewontin's Disequilibrium Coefficient D and D'
The definition of Hedrick (1987), which allows for multi-allelic
markers, is used:

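A minimal Python sketch of a multi-allelic D', assuming the commonly cited form of Hedrick's definition: each pairwise D_ij is scaled by its maximum attainable magnitude, and the absolute values are averaged with weights equal to the products of the allele frequencies. The function name and the input layout (an r x c table of haplotype frequencies summing to 1) are assumptions of this sketch, not GOLD's implementation.

```python
def lewontin_d_prime(joint):
    """Multi-allelic D' as the weighted mean of |D'_ij| (assumed form).

    `joint` is an r x c table of haplotype frequencies that sums to 1.
    """
    p = [sum(row) for row in joint]        # allele frequencies at marker A
    q = [sum(col) for col in zip(*joint)]  # allele frequencies at marker B
    d_prime = 0.0
    for i, row in enumerate(joint):
        for j, p_ij in enumerate(row):
            d_ij = p_ij - p[i] * q[j]      # pairwise disequilibrium D_ij
            if d_ij < 0:
                d_max = min(p[i] * q[j], (1 - p[i]) * (1 - q[j]))
            else:
                d_max = min(p[i] * (1 - q[j]), (1 - p[i]) * q[j])
            if d_max > 0:                  # skip alleles with no variation
                d_prime += p[i] * q[j] * abs(d_ij) / d_max
    return d_prime

print(lewontin_d_prime([[0.4, 0.1], [0.1, 0.4]]))
```

Under this form, D' is 0 when the markers are independent and 1 when every observed haplotype sits at its maximum possible disequilibrium.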
Delta-Squared Measure of Disequilibrium
This is only defined for bi-allelic markers. The conventional
definition (see for example, Devlin and Risch, 1995) is used:

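A minimal Python sketch for the bi-allelic case, assuming the conventional form in which the squared disequilibrium is divided by the product of the four allele frequencies. The function name and the 2 x 2 frequency-table input are assumptions of this sketch.

```python
def delta_squared(joint):
    """Delta-squared for a 2 x 2 table of haplotype frequencies (assumed form)."""
    if len(joint) != 2 or any(len(row) != 2 for row in joint):
        raise ValueError("delta-squared is only defined for bi-allelic markers")
    p_a = joint[0][0] + joint[0][1]  # frequency of the first allele at marker A
    p_b = joint[0][0] + joint[1][0]  # frequency of the first allele at marker B
    # Disequilibrium D written as the cross-product difference.
    d = joint[0][0] * joint[1][1] - joint[0][1] * joint[1][0]
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both markers must be polymorphic")
    return d * d / denom

print(delta_squared([[0.4, 0.1], [0.1, 0.4]]))
```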
Other Measures of Association - Chi-Squared
The usual contingency table chi-squared is calculated, and
its significance is estimated from the asymptotic chi-squared
distribution with (r-1)(c-1) degrees of freedom.
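The calculation can be sketched in Python as follows. The function name and the example counts are hypothetical; the sketch returns only the statistic and its degrees of freedom, and leaves the tail probability to a statistics library.

```python
def chi_squared(counts):
    """Return (statistic, degrees_of_freedom) for an r x c table of counts."""
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    stat = 0.0
    for i, row in enumerate(counts):
        for j, observed in enumerate(row):
            # Expected count under independence of the two markers.
            expected = row_sums[i] * col_sums[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    dof = (len(row_sums) - 1) * (len(col_sums) - 1)
    return stat, dof

stat, dof = chi_squared([[40, 10], [10, 40]])
print(stat, dof)
```

If SciPy is available, `scipy.stats.chi2.sf(stat, dof)` gives the corresponding asymptotic p-value.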
Other Measures of Association - Uncertainty Coefficient U
The definition of Press, Teukolsky, Vetterling and Flannery,
Numerical Recipes in C (2nd Edition), is used:

H(A), H(B) and H(A,B) refer to the informativeness of marker
A, marker B, and the two marker genotypes together. U measures
how much information one marker provides about the other's genotype.
It varies between 0 (independent) and 1 (completely dependent).
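A minimal Python sketch, assuming that H denotes Shannon entropy and that the symmetric form of the uncertainty coefficient is intended, U = 2 [H(A) + H(B) - H(A,B)] / [H(A) + H(B)]. The text above does not say which of the symmetric and one-directional forms is used, so this choice, the function names, and the handling of the degenerate zero-entropy case are all assumptions of the sketch.

```python
import math

def entropy(probs):
    """Shannon entropy (natural log) of a list of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def uncertainty_coefficient(joint):
    """Symmetric uncertainty coefficient for an r x c frequency table."""
    h_a = entropy([sum(row) for row in joint])        # H(A)
    h_b = entropy([sum(col) for col in zip(*joint)])  # H(B)
    h_ab = entropy([p for row in joint for p in row]) # H(A,B)
    if h_a + h_b == 0:
        return 0.0  # both markers monomorphic; a choice made for this sketch
    return 2.0 * (h_a + h_b - h_ab) / (h_a + h_b)

print(uncertainty_coefficient([[0.4, 0.1], [0.1, 0.4]]))
```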
Other Measures of Association - Cramer's V
This is a transformation of the chi-squared statistic into
the zero to one interval and is useful for comparing the relative
intensity of association between marker pairs with the same r
and c. The definition of Press, Teukolsky, Vetterling
and Flannery, Numerical Recipes in C (2nd Edition), is used:

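A minimal Python sketch, assuming the usual form V = sqrt(chi-squared / (N min(r-1, c-1))), where N is the total haplotype count. The function name and the example counts are hypothetical.

```python
import math

def cramers_v(counts):
    """Cramer's V for an r x c table of haplotype counts (assumed form)."""
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    chi2 = 0.0
    for i, row in enumerate(counts):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    # Scale by the largest value chi-squared can reach for this table size.
    k = min(len(row_sums) - 1, len(col_sums) - 1)
    if total == 0 or k == 0:
        raise ValueError("need a non-empty table with at least 2 rows and 2 columns")
    return math.sqrt(chi2 / (total * k))

print(cramers_v([[40, 10], [10, 40]]))
```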