MERLIN Reference -- Specifying a Disease Model
Linkage analysis tests for co-segregation of a chromosomal region
and a trait locus of interest. In parametric linkage analysis, a specific
disease model is used to describe segregation of the trait locus.
This page describes how to specify a parametric disease model when using MERLIN.
Basic Parametric Analysis
In addition to the standard Merlin input files, parametric linkage analyses
require disease locus parameters to be specified in a separate text file. This text
file has one row for each of the disease models to be evaluated, and
can include as many different models as available memory allows. The file name
can be specified with the --model filename command line option.
In general, the file should be tab or space delimited, with 4 fields per line:
an affection status label (matching the data file), a disease allele frequency,
the probabilities of being affected for individuals with 0, 1 and 2 copies of
the disease allele (penetrances, given as three comma-separated values), and finally
a label for the analysis model. A header can be included for readability, but is not required.
Here is an example:
Example of merlin.model file
DISEASE ALLELE_FREQ PENETRANCES LABEL
CF 0.02 0.001,0.001,1.000 RECESSIVE_DISEASE_MODEL
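As a sanity check on a model line like the one above, it can help to compute the population prevalence that the allele frequency and penetrances imply. The following Python sketch is only an illustration and is not part of MERLIN: the function names are hypothetical, and it assumes Hardy-Weinberg genotype proportions at the disease locus.

```python
def parse_model_line(line):
    """Split one basic model line into its four whitespace-delimited fields."""
    trait, freq, pens, label = line.split()
    penetrances = tuple(float(x) for x in pens.split(","))
    if len(penetrances) != 3:
        raise ValueError("expected three comma-separated penetrances")
    return trait, float(freq), penetrances, label


def genotype_frequencies(q):
    """Frequencies of 0, 1 and 2 copies of the disease allele.

    Assumption: Hardy-Weinberg proportions with disease allele frequency q.
    """
    p = 1.0 - q
    return (p * p, 2.0 * p * q, q * q)


def prevalence(q, penetrances):
    """Implied prevalence: sum over genotypes of frequency times penetrance."""
    return sum(f * pen for f, pen in zip(genotype_frequencies(q), penetrances))


trait, q, pens, label = parse_model_line(
    "CF 0.02 0.001,0.001,1.000 RECESSIVE_DISEASE_MODEL"
)
print(trait, label, prevalence(q, pens))
```

If the implied prevalence is far from what is known about the trait, the allele frequency or penetrances in the model file probably need revisiting.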
Analysis with Liability Classes
In addition to the basic analysis, where the penetrance function is the same
for all individuals, it is possible to specify penetrance functions that depend
on the value of a particular covariate. These can be useful when
analyzing traits whose prevalence varies with age or sex, for example.
To specify a penetrance function that varies according to individual-specific
covariates or sex, set the baseline penetrance to * (an asterisk) in the
line that includes the disease allele frequency. Then, include a series of lines,
each describing one penetrance class
(using a condition such as AGE < 45, SEX = MALE or LIABILITY = 1).
Here is an example:
Example of merlin.model file where penetrances vary by age and sex
DISEASE ALLELE_FREQ PENETRANCES LABEL
PROSTATE_CANCER 0.001 * HYPOTHETICAL_RECESSIVE_MODEL
SEX = FEMALE 0.000,0.000,0.000
AGE < 50 0.001,0.050,0.100
AGE < 70 0.002,0.200,0.400
OTHERWISE 0.004,0.500,0.800
In the example above, the model describes a hypothetical susceptibility allele
for prostate cancer. The first liability class includes all females, and specifies
that they never develop prostate cancer. The next row specifies that males under
the age of 50 have about a 5% chance of developing cancer if they are heterozygotes
for this allele and a 10% chance if they are homozygotes. These probabilities increase
for males aged between 50 and 70. Finally, the last row specifies the penetrances for
all other individuals (in this case, males aged 70 or over).
To apply these penetrance models, MERLIN proceeds in the following manner. First, MERLIN lists what covariates
are required for the model to be evaluated. In the example, AGE and SEX are required and all individuals without age
or sex information are treated as having a missing phenotype. Next, each condition is checked in turn until a
matching one is found. For example, all females will match the first condition and that set of penetrances will be
used. Males with AGE < 50 will match the second condition and that set of penetrances will be used. Males with AGE <
70 (but with AGE >= 50) will match the next line, and so on. A line labeled OTHERWISE or ELSE must be present to
indicate the last liability class.
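The first-match procedure described above can be sketched in a few lines of Python. This is an illustration of the described behavior only, not MERLIN's actual implementation; the data structures and the function name are hypothetical, and the classes are hard-coded from the prostate cancer example.

```python
# Liability classes from the example, in file order. Each entry pairs a
# condition with penetrances for 0, 1 and 2 copies of the disease allele.
LIABILITY_CLASSES = [
    (lambda ind: ind["SEX"] == "FEMALE", (0.000, 0.000, 0.000)),
    (lambda ind: ind["AGE"] < 50, (0.001, 0.050, 0.100)),
    (lambda ind: ind["AGE"] < 70, (0.002, 0.200, 0.400)),
    (lambda ind: True, (0.004, 0.500, 0.800)),  # OTHERWISE
]

# Covariates that the conditions above refer to.
REQUIRED_COVARIATES = ("SEX", "AGE")


def penetrances_for(individual):
    """Return the penetrances of the first matching liability class.

    Returns None when a required covariate is missing, mirroring the rule
    that such individuals are treated as having a missing phenotype.
    """
    if any(individual.get(name) is None for name in REQUIRED_COVARIATES):
        return None
    for condition, penetrances in LIABILITY_CLASSES:
        if condition(individual):
            return penetrances
    raise ValueError("no OTHERWISE class present")


print(penetrances_for({"SEX": "FEMALE", "AGE": 62}))  # first class
print(penetrances_for({"SEX": "MALE", "AGE": 62}))    # AGE < 70 class
print(penetrances_for({"SEX": "MALE", "AGE": None}))  # missing covariate
```

Because conditions are checked in order, the AGE < 70 line never needs to state AGE >= 50 explicitly: males under 50 have already matched the earlier line.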
Analysis of X-linked traits
One of the programs in the Merlin package, MINX, can carry out parametric linkage analysis
for X-linked traits. The process for specifying the trait model for these analyses is similar
to that for autosomal traits; the most important step is to understand how penetrances for males
and females are handled (since males carry a single X chromosome and, thus, can never be
heterozygotes). When calculating the probability of disease given trait locus genotypes, MINX's default
behavior is to assume that the phenotype distribution for males that carry a particular allele
will be the same as for females that carry two copies of that allele. For example, consider
the following model file:
Example of merlin.model file for an X-linked trait
DISEASE ALLELE_FREQ PENETRANCES LABEL
DMD 0.0001 0.001,0.003,0.95 RECESSIVE_MODEL
The above model specifies a disease allele with frequency 0.0001. For females, we expect
that nearly all individuals will be homozygous for the wild-type allele and the probability
that these will be affected will be 0.001. Heterozygous females, who will account for
about 0.0002 of females, will have a slightly higher risk of disease at 0.003. Finally, homozygous
females, who should be extremely rare, will be at high risk of disease, with a probability of 0.95
of being labelled affected. For males, MINX will use only the first and last penetrances: the first
penetrance of 0.001 will be used for males who carry the wildtype allele on their X chromosome, and the
last penetrance of 0.95 will be used for males who carry the mutant allele.
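The default rule, where a male carrying an allele is treated like a female carrying two copies of it, can be illustrated with a short Python sketch. The function name and argument conventions are hypothetical and are not taken from MINX itself.

```python
def x_linked_penetrance(sex, copies, penetrances):
    """Pick the penetrance for an individual at an X-linked trait locus.

    penetrances holds the three values from the model file, for 0, 1 and 2
    copies of the disease allele. Males carry a single X chromosome, so a
    male with the disease allele is treated like a female with two copies.
    """
    if sex == "MALE":
        if copies not in (0, 1):
            raise ValueError("males carry at most one copy of an X-linked allele")
        return penetrances[0] if copies == 0 else penetrances[2]
    if copies not in (0, 1, 2):
        raise ValueError("females carry 0, 1 or 2 copies")
    return penetrances[copies]


DMD = (0.001, 0.003, 0.95)  # penetrances from the example model above

print(x_linked_penetrance("MALE", 0, DMD))    # wildtype male
print(x_linked_penetrance("MALE", 1, DMD))    # male carrying the mutant allele
print(x_linked_penetrance("FEMALE", 1, DMD))  # heterozygous female
```

Note that the middle penetrance is never returned for a male, whatever value it has in the model file.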
Of course, just as with autosomal traits, it is possible to specify sex-specific penetrances
so that males who carry a particular allele have a different risk of disease than females
homozygous for that allele. Here is one example:
Example of refined merlin.model file for X-linked analysis
DISEASE ALLELE_FREQ PENETRANCES LABEL
BALDNESS 0.05 * HYPOTHETICAL_RECESSIVE_MODEL
SEX = FEMALE 0.01,0.05,0.20
OTHERWISE 0.01,0.00,0.80
The above model describes a relatively common allele (frequency of 0.05) that increases
the risk of baldness in both males and females. For females, the penetrances are set at
0.01, 0.05 and 0.20 for wildtype homozygotes, heterozygous carriers and risk allele homozygotes,
respectively. For males, the penetrances are 0.01 or 0.80 depending on whether the wildtype or risk
allele is present. The middle penetrance of 0.00 is ignored, since there can be no male heterozygotes
for an X-linked locus.