1000G 2010-06 Download

The original data are the March 2010 release of phased data from the 1000 Genomes Project, downloadable from ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/pilot_data/release/2010_03/pilot1/ (README). That release was generated by merging three preliminary call sets: (1) by Jared Maguire and colleagues at the Broad Institute; (2) by Yun Li and Goncalo Abecasis at the University of Michigan; and (3) by Quang Le and Richard Durbin at the Sanger Institute. The CHB+JPT dataset contains 124 haplotypes. Singletons (SNPs whose minor allele appears only once) are NOT removed. The haplotypes were merged by Bryan Howie in June 2010.

Download Data

    2010-06.CEU.hap.tgz 
    2010-06.CEU.snps.tgz           
    2010-06.CEU.map.tgz  
    2010-06.CEU.annotation.tgz  

    2010-06.CHB+JPT.hap.tgz          
    2010-06.CHB+JPT.snps.tgz        	      
    2010-06.CHB+JPT.map.tgz        	      
    2010-06.CHB+JPT.annotation.tgz        	      

    2010-06.YRI.hap.tgz          
    2010-06.YRI.snps.tgz        	      
    2010-06.YRI.map.tgz        	      
    2010-06.YRI.annotation.tgz        	      
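
Once the archives for a panel have been downloaded, they can be unpacked in one pass. The following is a minimal sketch, not part of the original distribution: the function name extract_panel and its arguments are hypothetical, and it assumes the archives sit in a local directory under the file names listed above. It prints the number of archives it unpacked and reports any that are missing on standard error.

```shell
# Hypothetical helper (not shipped with the data): unpack the four
# 2010-06 archives (hap, snps, map, annotation) for one panel.
# Usage: extract_panel PANEL [SRC_DIR] [DEST_DIR]
#   PANEL is one of CEU, CHB+JPT, YRI.
extract_panel() {
    panel=$1
    src=${2:-.}
    dest=${3:-.}
    mkdir -p "$dest" || return 1
    found=0
    for kind in hap snps map annotation; do
        archive="$src/2010-06.$panel.$kind.tgz"
        if [ -f "$archive" ]; then
            # -x extract, -z gunzip, -f archive file, -C target directory
            tar -xzf "$archive" -C "$dest" || return 1
            found=$((found + 1))
        else
            echo "missing: $archive" >&2
        fi
    done
    # Report how many of the four archives were unpacked.
    echo "$found"
}
```

For example, `extract_panel CHB+JPT downloads reference` would unpack the CHB+JPT archives found in a directory named downloads into a directory named reference; both directory names are placeholders.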
  

 The files can be fed directly to MaCH. We recommend a two-step imputation procedure: pre-phasing with MaCH, followed by imputation with minimac. For details, please see minimac.

 Warning:
If a large number of genotyped SNPs are discarded because they are absent from this reference, please report it to Yun Li. You can check with the following command line:
> grep "will be ignored" mach.*.log
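
To judge whether the number of discarded SNPs is large, it can help to count the matching lines per log file rather than read them all. The following is a minimal sketch under stated assumptions: the function name count_ignored is hypothetical, and it relies only on the "will be ignored" phrase used in the command above, assuming each discarded SNP produces one such line.

```shell
# Hypothetical helper: print "<log file>: <count>" for each MaCH log,
# where <count> is the number of lines containing "will be ignored".
count_ignored() {
    for log in "$@"; do
        # Skip arguments that are not regular files (e.g. an unmatched glob).
        [ -f "$log" ] || continue
        # grep -c prints the count but exits non-zero when it is 0;
        # "|| true" keeps that from aborting scripts run with "set -e".
        n=$(grep -c "will be ignored" "$log" || true)
        echo "$log: $n"
    done
}

# Usage: count_ignored mach.*.log
```

A count that is a sizeable fraction of the genotyped SNPs is the situation worth reporting.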

 Notes:
Do not turn on --compact if memory is not an issue.