University of Michigan Center for Statistical 


Genotype-Based Matching to Correct for Population Stratification in
Large-Scale Case-Control Genetic Association Studies

Case-control association tests are generally more powerful than family-based association tests, but population stratification can lead to spurious disease-marker associations or mask a true association. We propose a similarity score matching approach that matches cases with controls and performs association tests conditional on the matched sets, so as to adjust for underlying population structure and potentially increase power. The genetic similarity score matching analysis consists of three steps:

      1) Similarity score calculation for each pair of case and control
      2) Optimal full matching to match cases with controls
      3) Conditional logistic regression (additive or 2 d.f. test)
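
The first two steps above can be illustrated with a minimal Python sketch. It is not the software's implementation: the allele-sharing (identity-by-state) score, the 0/1/2 genotype coding, the function names, and the toy data are all assumptions made for illustration, and the greedy rule that attaches each control to its most similar case is a simple stand-in for optimal full matching, which minimizes the total within-stratum distance and also allows strata with several cases and one control.

```python
# Illustrative sketch only: the score and the matching rule below are
# simplifying assumptions, not the method implemented in the software.

def ibs_similarity(g1, g2):
    """Mean allele sharing (0..1) between two genotype vectors coded 0/1/2.

    Markers where either genotype is None (missing) are skipped.
    """
    shared = 0
    counted = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue
        shared += 2 - abs(a - b)  # 2, 1, or 0 alleles shared
        counted += 2
    return shared / counted if counted else 0.0


def score_matrix(cases, controls):
    """Step 1: similarity score for every case-control pair."""
    return [[ibs_similarity(c, k) for k in controls] for c in cases]


def greedy_strata(scores):
    """Step 2 (stand-in): attach each control to its most similar case.

    Returns a dict mapping case index -> list of control indices.
    This greedy rule is NOT optimal full matching.
    """
    strata = {i: [] for i in range(len(scores))}
    n_controls = len(scores[0]) if scores else 0
    for j in range(n_controls):
        best_case = max(range(len(scores)), key=lambda i: scores[i][j])
        strata[best_case].append(j)
    return strata


# Hypothetical toy data: 2 cases, 3 controls, 4 markers each.
cases = [[0, 0, 1, 2], [2, 2, 1, 0]]
controls = [[0, 0, 1, 1], [2, 1, 1, 0], [0, 1, 1, 2]]

scores = score_matrix(cases, controls)
strata = greedy_strata(scores)
print(strata)
```

Each resulting stratum would then serve as a matched set in step 3, where a conditional logistic regression is fitted with the stratum as the conditioning unit. Under this greedy rule a case can end up with no controls; such a stratum carries no information for the conditional test and would have to be dropped.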

Comments and suggestions are appreciated! Please email Liming Liang and Weihua Guan.

Precompiled binaries for Windows, example data and instructions.
Mac OS
If you are using Mac OS, please read this.
Precompiled binaries for Linux (64bit system), example data and instructions.
Precompiled binaries for Linux (32bit system), example data and instructions.


Input data format, meaning of parameters and output interpretation

Update history

We have started adding new features and documenting all updates and bug fixes.


Weihua Guan*, Liming Liang*, Michael Boehnke, Gonçalo R. Abecasis (2009). Genotype-based matching to correct for population stratification in large-scale case-control genetic association studies. Genet Epidemiol DOI:10.1002/gepi.20403

* These authors contributed equally to this work.


University of Michigan | School of Public Health