LAMP Tutorial -- Input File Formats

LAMP Input Files

LAMP performs joint linkage and association analysis using pedigree data. Input files describe relationships between individuals in your dataset, store marker genotypes and disease status information, and provide information on marker locations.

LAMP supports input files in MERLIN format. LAMP will consider information on twin status, disease status and marker genotypes encoded in these input files. Covariate and quantitative trait information will be ignored. If you are not familiar with the MERLIN format pedigree and data files, please check the MERLIN tutorial. As usual, the PEDSTATS program provides a convenient way to check the content of pedigree and data files.

Although the map file formats used by MERLIN and LAMP are similar, the two programs use map files in slightly different ways, so this section describes how LAMP interprets them.

Genetic Maps

At a minimum, LAMP requires a candidate map that lists the markers and other locations on the genetic map that you wish to evaluate. Each marker or location of interest should be listed on a separate line, which should include a chromosome number, marker name and position.

If you want to test three SNPs for association, your candidate map file might look like this:

<contents of candidate.map>
24           cSNP1    12.5
24           cSNP2    12.6
24           cSNP3    12.7
<end of candidate.map>

Positions for each SNP should be listed in cM. If genetic map positions are not available, one good approximation is to assume that 1 cM = 1,000,000 bp: look up the physical position of each marker (in bp) on the current genome build and divide that value by 1 million.
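The approximation above can be sketched in a few lines of Python. This is only an illustration: the function names and the base-pair position used in the example are hypothetical, and the column widths are chosen simply to resemble the sample file shown earlier.

```python
def bp_to_cm(position_bp, bp_per_cm=1_000_000):
    """Approximate a genetic position (cM) from a physical position (bp),
    using the rule of thumb that 1 cM corresponds to 1,000,000 bp."""
    return position_bp / bp_per_cm


def candidate_map_line(chromosome, marker, position_bp):
    """Format one candidate map line: chromosome, marker name, position in cM."""
    return f"{chromosome:<12} {marker:<8} {bp_to_cm(position_bp):.4f}"


if __name__ == "__main__":
    # Hypothetical physical position, for illustration only.
    print(candidate_map_line(24, "cSNP1", 12_500_000))
```

Keep in mind that recombination rates vary along the genome, so positions obtained this way are rough estimates.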

In addition to SNP and multi-allelic markers, the candidate map file can also include candidate positions that have no genotype data. Although association can't be modeled without genotype data, linkage can be evaluated on the basis of flanking marker genotypes, and LAMP will evaluate the evidence for linkage at these locations. For each candidate location, the second column (which typically holds the marker name) should instead hold the cM position. For example, to evaluate evidence for linkage at equally spaced 1 cM positions:

<contents of candidate.map>
24           0.0       0.0
24           1.0       1.0
24           2.0       2.0
24           3.0       3.0
24           4.0       4.0
24           5.0       5.0
<end of candidate.map>
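A grid of candidate positions like the one above is easy to generate with a short script. The following Python sketch is one possible way to do it; the function name and its parameters are hypothetical and are not part of LAMP.

```python
def grid_candidate_lines(chromosome, start_cm, end_cm, step_cm=1.0, decimals=1):
    """Build candidate map lines for equally spaced positions.

    The cM position is written in both the second and third columns,
    as in the example candidate map for locations without genotypes.
    """
    n_steps = int(round((end_cm - start_cm) / step_cm))
    lines = []
    for i in range(n_steps + 1):
        position = start_cm + i * step_cm
        label = f"{position:.{decimals}f}"
        lines.append(f"{chromosome:<12} {label:<9} {label}")
    return lines


if __name__ == "__main__":
    for line in grid_candidate_lines(24, 0.0, 5.0):
        print(line)
```

If you use a step that is not a multiple of 0.1 cM, increase the decimals argument so that distinct positions do not round to the same label.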

Optionally, a map listing flanking markers can be provided. These flanking markers will increase power when evaluating family samples, especially if some individuals are not genotyped for the candidate SNPs or if the candidate SNPs are only indirectly associated with disease. Each line in the list of flanking markers should simply list chromosome, marker name and position. In the current version of LAMP, a marker cannot be listed in both the framework (flanking) map and the candidate map, but we hope to remove this restriction soon.

<contents of frame.map>
24           flanking_marker1  10.0
24           flanking_marker2  15.0
<end of frame.map>
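Because a marker may not appear in both maps, it can be worth checking for overlaps before running an analysis. The Python sketch below shows one way to do this, assuming whitespace-delimited map files with the three columns described above. The function names are hypothetical; lines whose third column is not a number (a header line, for instance) are skipped.

```python
def map_marker_names(map_text):
    """Return the marker names (second column) from the text of a map file."""
    names = []
    for line in map_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank or incomplete line
        try:
            float(fields[2])  # position must be numeric
        except ValueError:
            continue  # probably a header line
        names.append(fields[1])
    return names


def shared_markers(candidate_text, frame_text):
    """List the names that appear in both the candidate and flanking maps."""
    shared = set(map_marker_names(candidate_text)) & set(map_marker_names(frame_text))
    return sorted(shared)


if __name__ == "__main__":
    with open("candidate.map") as c, open("frame.map") as f:
        for name in shared_markers(c.read(), f.read()):
            print("Listed in both maps:", name)
```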

The data file and map file can include different sets of markers, but markers that are absent from both map files will be ignored by LAMP.
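To see in advance which markers LAMP would ignore, you can compare the data file against the two map files. The Python sketch below assumes a MERLIN format data file in which each line holds a type code followed by a name, with M marking a marker; please check the MERLIN tutorial if your files differ. The function names are hypothetical.

```python
def data_file_markers(dat_text):
    """Return marker names from the text of a MERLIN format data file.

    Assumes each line is '<type code> <name>' and that code M denotes a marker.
    """
    names = []
    for line in dat_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].upper() == "M":
            names.append(fields[1])
    return names


def mapped_names(map_text):
    """Return the set of names found in the second column of a map file."""
    return {f[1] for f in (line.split() for line in map_text.splitlines()) if len(f) >= 3}


def unmapped_markers(dat_text, map_texts):
    """List data file markers that are absent from every supplied map."""
    known = set()
    for text in map_texts:
        known |= mapped_names(text)
    return [name for name in data_file_markers(dat_text) if name not in known]
```

For example, passing the contents of the data file together with the contents of candidate.map and frame.map returns the markers that appear in neither map, in data file order.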

Using separate data and map files makes for a very simple file structure that will hopefully make it easy for you to prepare your input files!

At this point, you can proceed to learn about carrying out MOD score linkage analyses, simple association analyses, or more advanced association analysis.

University of Michigan | School of Public Health | Abecasis Lab