1000G 2010-08 Download

Original data (generated by the Broad Institute) are available at BI autosomes and BI X chromosome.

There are 629 individuals in total, broken down as follows (PUR and MXL are each listed under two groups, so the group sizes sum to 651):

    174 AFR = 78 YRI + 67 LWK + 24 ASW + 5 PUR
    283 EUR = 90 CEU + 92 TSI + 43 GBR + 36 FIN + 17 MXL + 5 PUR
    194 ASN = 68 CHB + 25 CHS + 84 JPT + 17 MXL
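
The arithmetic in the breakdown above can be checked with a short sketch. It assumes that the PUR and MXL entries listed under two groups refer to the same individuals, which is what makes the total come out to 629; the variable names are illustrative only.

```python
# Population counts copied from the breakdown above.
groups = {
    "AFR": {"YRI": 78, "LWK": 67, "ASW": 24, "PUR": 5},
    "EUR": {"CEU": 90, "TSI": 92, "GBR": 43, "FIN": 36, "MXL": 17, "PUR": 5},
    "ASN": {"CHB": 68, "CHS": 25, "JPT": 84, "MXL": 17},
}

# Size of each continental group.
group_totals = {name: sum(pops.values()) for name, pops in groups.items()}

# Assumption: a population listed under two groups is the same set of
# individuals, so it is counted only once in the overall total.
unique_pops = {}
for pops in groups.values():
    unique_pops.update(pops)
total_unique = sum(unique_pops.values())

print(group_totals)   # {'AFR': 174, 'EUR': 283, 'ASN': 194}
print(total_unique)   # 629
```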

For each continental group (AFR, EUR, ASN), SNPs with missing genotypes are removed. For autosomes, we additionally removed SNPs not flagged as QC+ in the 4-way merged set (Broad Institute, Michigan, Boston College and NCBI). The original supporting data used to construct the 4-way consensus SNP set are available at 4-way merged autosomes. Note that this original set does not include the X chromosome.

Singletons (SNPs whose minor allele appears only once) are NOT removed.
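
As a minimal illustration of the singleton definition above, the sketch below assumes a biallelic site represented as a string of alleles, one character per haplotype. This representation and the function name `is_singleton` are hypothetical and are not the format of the downloadable files.

```python
from collections import Counter


def is_singleton(site_alleles):
    """Return True if the site is biallelic and its minor allele appears exactly once."""
    counts = Counter(site_alleles)
    return len(counts) == 2 and min(counts.values()) == 1


print(is_singleton("AAAAAG"))  # True: G appears once
print(is_singleton("AAGGAA"))  # False: minor allele appears twice
print(is_singleton("AAAAAA"))  # False: monomorphic site
```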

Download Data


The files can be fed directly to MaCH. We recommend a 2-step imputation procedure: pre-phasing with MaCH, followed by imputation with minimac. For details, please go to minimac.

Please report to Yun Li if a large number of genotyped SNPs are discarded because they are absent from this reference. You can check with the following command:
> grep "will be ignored" mach.*.log
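
To see at a glance how many such messages each log contains, the check above can be sketched in Python. This is only a rough tally: it counts lines containing the phrase "will be ignored" and assumes nothing else about the log format, so the number of lines may not equal the number of SNPs. The function name `count_ignored` is illustrative.

```python
import glob


def count_ignored(pattern="mach.*.log", phrase="will be ignored"):
    """Return {log file path: number of lines containing the phrase}."""
    counts = {}
    for path in sorted(glob.glob(pattern)):
        # errors="replace" avoids failing on unexpected bytes in a log file.
        with open(path, errors="replace") as handle:
            counts[path] = sum(1 for line in handle if phrase in line)
    return counts


if __name__ == "__main__":
    for path, n in count_ignored().items():
        print(f"{path}\t{n}")
```

A large count in any one log is the situation worth reporting.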

* $pop.chrX.hap.gz contains two identical copies of the X chromosome for each male. Please use $pop.chrX.noDup.hap.gz instead.

* Do not turn on --compact if memory is not an issue.


