University of Michigan Center for Statistical 


An evaluation of the replicate pool method: quick estimation of genome-wide linkage peak p-values.

Wigginton JE and Abecasis GR

Genet Epidemiol (2006) 30:320-32

The calculation of empirical p-values for genome-wide non-parametric linkage tests continues to present significant computational challenges for many complex disease mapping studies. The gold standard approach is to use gene dropping to simulate null genome scans. Unfortunately, this approach is too computationally expensive for many data sets of interest. An alternative, more efficient method for sampling null genome scans is to pre-calculate pools of family-specific statistics and then resample from these replicate pools to generate "pseudo-replicate" genome scans. In this study, we use simulations to explore properties of the replicate pool p-value estimator pRP and show that it provides an excellent approximation to the traditional gene-dropping estimator for significantly less computational effort. While the computational efficiency of the replicate pool estimator is noticeable in almost all data sets, by applying the replicate pool method to several previously characterized data sets we show that savings in computational effort can be especially significant (on the order of 10,000-fold or more) when one or more large families are analyzed. We also estimate replicate pool p-values for the schizophrenia data described by Abecasis et al. and show that pRP closely approximates gene-drop p-values for all linkage peaks reported for this study. Lastly, we expand upon Song et al.'s previous work by deriving a conservative estimator of the variance for pRP that can easily be computed in practical settings. We have implemented the replicate pool method along with our variance estimator in a new program called Pseudo, which is the first widely available automated implementation of the replicate pool method.
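The resampling idea described above can be illustrated with a minimal Python sketch. It is not the Pseudo program and does not reproduce its interface. The function names, the layout of the pools (one list of pre-computed null scans per family, each scan a list of per-position statistics), and the rule for combining families (sum of family statistics divided by the square root of the number of families) are all assumptions made for illustration. The toy pools are filled with random normal draws as a stand-in for statistics that would, in practice, come from gene dropping within each family.

```python
import random


def make_toy_pools(n_families, n_replicates, n_positions, seed=1):
    """Hypothetical stand-in for gene dropping: fill each family's pool
    with random normal 'statistics' (real pools would hold null linkage
    statistics simulated separately for each family)."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 1.0) for _ in range(n_positions)]
         for _ in range(n_replicates)]
        for _ in range(n_families)
    ]


def pseudo_replicate_peak(pools, rng):
    """Build one pseudo-replicate genome scan by drawing one pooled scan
    per family, then return its genome-wide peak statistic."""
    n_positions = len(pools[0][0])
    totals = [0.0] * n_positions
    for family_pool in pools:
        scan = rng.choice(family_pool)  # one pre-computed null scan
        for i, stat in enumerate(scan):
            totals[i] += stat
    # Assumed combining rule: scaled sum of family statistics.
    return max(totals) / len(pools) ** 0.5


def replicate_pool_p_value(pools, observed_peak, n_pseudo=2000, seed=0):
    """Estimate the genome-wide p-value as the fraction of pseudo-replicate
    scans whose peak is at least as large as the observed peak."""
    rng = random.Random(seed)
    hits = sum(
        pseudo_replicate_peak(pools, rng) >= observed_peak
        for _ in range(n_pseudo)
    )
    return hits / n_pseudo
```

For example, `replicate_pool_p_value(make_toy_pools(20, 50, 100), observed_peak=3.0)` returns the estimated fraction of null pseudo-replicates with a peak of at least 3.0. The expensive step (filling the pools) is done once per family, while each pseudo-replicate costs only one random draw per family, which is where the savings over full gene dropping would come from.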



