1000G 2010-06 Download
The original data are the March 2010 release of phased data from the 1000 Genomes Project, generated by merging three preliminary call sets: (1) by Jared Maguire and colleagues at the Broad Institute; (2) by Yun Li and Goncalo Abecasis at the University of Michigan; and (3) by Quang Le and Richard Durbin at the Sanger Institute. They are downloadable from
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/pilot_data/release/2010_03/pilot1/
(README). The CHB+JPT dataset contains 124 haplotypes.
Singletons (SNPs whose minor allele appears only once) are NOT removed. The haplotypes were merged by Bryan Howie in June 2010.
The files can be fed directly to MaCH. We recommend a two-step imputation procedure: pre-phasing with MaCH, followed by imputation with minimac.
For details, please go to the minimac page.
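The two-step procedure can be sketched roughly as below. This is a dry-run outline rather than a definitive recipe: all file names (sample.dat, sample.ped, target.hap.gz, target.snps, ref.hap.gz, ref.snps) are hypothetical placeholders, the executable names mach1 and minimac are assumed to be on your PATH, and the flags and their values are assumptions that should be checked against the MaCH and minimac documentation before use. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
# Hypothetical input names (assumptions); replace with your own files.
DAT=sample.dat            # MaCH data file for the genotyped samples
PED=sample.ped            # MaCH pedigree file for the genotyped samples
TARGET_HAPS=target.hap.gz # phased haplotypes produced by step 1 (name assumed)
TARGET_SNPS=target.snps   # SNP list matching the phased haplotypes (name assumed)
REF_HAPS=ref.hap.gz       # reference haplotypes from this download (name assumed)
REF_SNPS=ref.snps         # reference SNP list from this download (name assumed)

# Print a command; run it only when DRY_RUN=0.
run() {
    echo "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Step 1: pre-phase the genotyped samples with MaCH (flag values are assumptions).
step1_prephase() {
    run mach1 -d "$DAT" -p "$PED" --rounds 20 --states 200 --phase --prefix sample.pp
}

# Step 2: impute into the phased haplotypes with minimac (flag values are assumptions).
step2_impute() {
    run minimac --refHaps "$REF_HAPS" --refSnps "$REF_SNPS" \
        --haps "$TARGET_HAPS" --snps "$TARGET_SNPS" \
        --rounds 5 --states 200 --prefix results
}
```

Calling step1_prephase and then step2_impute prints the two commands in order, so they can be reviewed before anything is run.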
Warning:
Report to Yun Li if a large number of genotyped SNPs are discarded because they are absent from this
reference. You can check with the following command:
> grep "will be ignored" mach.*.log
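To judge whether "a large number" of SNPs were discarded, it may help to count the matching lines per log file instead of reading them. A minimal sketch, assuming each discarded SNP produces one log line containing "will be ignored" (the exact log layout is an assumption; count_ignored is a hypothetical helper name):

```shell
# Print "<log file><TAB><number of lines containing 'will be ignored'>"
# for every log file given as an argument.
count_ignored() {
    for log in "$@"; do
        # Skip arguments that are not regular files (e.g. an unmatched glob).
        [ -f "$log" ] || continue
        # grep -c prints the count of matching lines, including 0.
        n=$(grep -c "will be ignored" "$log")
        printf '%s\t%s\n' "$log" "$n"
    done
}

# Usage: count over all MaCH logs in the current directory.
count_ignored mach.*.log
```

A count that is large relative to the number of genotyped SNPs would be the situation worth reporting.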
Notes:
Do not turn on --compact if memory is not an issue.