The complex genetics of multiple sclerosis – pitfalls and
Author: Stephen Sawcer.
Affiliation: University of Cambridge, Department of Clinical Neuroscience,
Addenbrooke’s Hospital, Hills Road, Cambridge, CB2 2QQ, UK
Corresponding Author: Stephen Sawcer
University of Cambridge, Department of Clinical Neuroscience, Addenbrooke’s
Hospital, Hills Road, Cambridge, CB2 2QQ, UK
Phone: +44 1223 217091
FAX: +44 1223 336941
Section 1 - The Carter Effect
Under Falconer’s (1981) liability threshold model it is assumed that the liability to a
dichotomous non-Mendelian trait, such as multiple sclerosis, results from the additive
effects of many factors (genetic and environmental) and is therefore normally distributed
in the population, with the disease developing in those who exceed some threshold of
liability (see Fig 1.1).
Figure 1.1. Liability frequency distribution.
Some people have very little liability, others a lot and most have an intermediate
level. Only those with more than the threshold level actually develop the disease.
According to this model, in a disease whose prevalence differs by gender, affected
individuals of the gender least often affected will on average carry a greater
genetic load and should therefore confer the greatest risk to their offspring. This
phenomenon is known as the Carter effect (see Fig 1.2).
Figure 1.2. Gender specific liability frequency distribution.
In a disease like multiple sclerosis, where females are at greater risk than males, i.e.
where gender is a risk factor, the liability curve for females is shifted to the right
compared to males and thus more women exceed the threshold and are affected by
the disease. On the other hand the load of other risk factors is on average higher in
affected males than in affected females.
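The gender-specific threshold argument can be illustrated numerically. The following is a minimal sketch, assuming that liability from all factors other than gender is standard normal and that gender simply moves the threshold; the two prevalence values are hypothetical, chosen only to give a female excess of about 3:1.

```python
from statistics import NormalDist

# Assumption: liability from all factors other than gender, scaled to mean 0, sd 1.
STD_NORMAL = NormalDist()


def threshold_for_prevalence(prevalence: float) -> float:
    """Liability threshold exceeded by a fraction `prevalence` of the group."""
    return STD_NORMAL.inv_cdf(1.0 - prevalence)


def mean_liability_of_affected(prevalence: float) -> float:
    """Mean liability of those above the threshold (mean of a truncated normal)."""
    t = threshold_for_prevalence(prevalence)
    return STD_NORMAL.pdf(t) / prevalence


# Hypothetical prevalences, not taken from the text.
male_prevalence = 0.0005
female_prevalence = 0.0015

male_load = mean_liability_of_affected(male_prevalence)
female_load = mean_liability_of_affected(female_prevalence)
print(f"mean load of other risk factors, affected males:   {male_load:.2f} sd")
print(f"mean load of other risk factors, affected females: {female_load:.2f} sd")
```

Under these assumptions the rarer group has to clear a higher threshold, so its affected members carry a higher average load of the remaining risk factors.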
In multiple sclerosis we might thus anticipate a paternal parent of origin effect, since
males are least often affected by the disease and affected males should therefore on average carry a
greater genetic load. In fact the evidence regarding parent of origin effects in multiple
sclerosis is contradictory, with some researchers finding evidence for the Carter effect
(Kantarci et al., 2006) while others have found a maternal parent of origin effect (Ebers et
al., 2004) and still others have found no evidence for any parent of origin effect (Herrera
et al., 2007; Hupperts et al., 2001). This apparent inconsistency is not unexpected in a
As we have practically no information concerning the number or nature of environmental
risk factors of relevance in multiple sclerosis it is hard to estimate the form of the overall
liability distribution. However, given that there are probably around 100 genes that
influence susceptibility (see main text) we can at least approximate the genetic
component of the distribution. If we consider a rather simplistic model in which risk
alleles are additive, independent, exert the same scale of effect and have a frequency of
50% then liability would follow a simple symmetrical binomial distribution in the
population. Under this model liability is simply determined by the number of risk alleles
inherited (N). In theory an individual might carry no risk alleles or as many as 200;
however, in practice most will carry around 100, and few will carry fewer than 60 or more
than 140 (see figure 1.3).
Figure 1.3. Hypothetical genetic liability frequency distribution for multiple sclerosis,
assuming 100 susceptibility genes.
Liability is distributed as a simple binomial such that very few people carry fewer
than 60 risk alleles and very few carry more than 140.
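The shape of this hypothetical distribution can be checked directly. The sketch below assumes the simplistic model described in the text (200 independent allele copies, each a risk allele with probability 50%); the function and variable names are illustrative only.

```python
from math import comb

# Model assumption from the text: 100 biallelic genes, two copies each.
N_ALLELES = 200


def pmf(k: int, n: int = N_ALLELES) -> float:
    """Probability of carrying exactly k risk alleles when each has frequency 50%."""
    return comb(n, k) / 2 ** n


# Fraction of the population carrying between 60 and 140 risk alleles inclusive.
central_mass = sum(pmf(k) for k in range(60, 141))
# Most likely number of risk alleles.
mode = max(range(N_ALLELES + 1), key=pmf)

print(f"most likely allele count: {mode}")
print(f"fraction carrying 60 to 140 risk alleles: {central_mass:.10f}")
```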
Since we have no information regarding the number or nature of environmental risk
factors involved in multiple sclerosis it is impossible to establish a definitive disease
causing threshold on this genetic liability distribution; even individuals who inherit no
risk alleles could still develop the disease if they are exposed to sufficient environmental
risk. On the other hand as multiple sclerosis affects only a relatively small proportion of
the population (0.1%) we can expect that the majority of affected individuals will carry
substantially more than 100 risk alleles. If genes were the only influence on risk (which is
certainly not the case) then since 0.1% of the population carries 120 or more risk alleles
the liability threshold would be 120 alleles. Since females have a risk of developing
multiple sclerosis that is 2-3 times greater than the risk seen in males, affected males will
carry an excess number of risk alleles in comparison with affected females. If each susceptibility
allele increases risk by a factor of 1.2 then on average affected males would carry 5-10 extra risk
alleles. Thus although it is true that affected men carry a greater genetic load, this is only in excess
by perhaps 5-10%.
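For readers who wish to explore how the position of a purely genetic threshold relates to prevalence under this crude model, the upper tail of the binomial distribution can be computed exactly. This is only a sketch under the same assumptions (200 allele copies at 50% frequency, genes as the sole influence on risk); exact tail fractions depend on precisely where the cut is placed, so round figures such as those quoted above should be read as approximations.

```python
from math import comb

N_ALLELES = 200  # model assumption: 100 genes, two copies each, allele frequency 50%


def tail_probability(threshold: int, n: int = N_ALLELES) -> float:
    """Fraction of the population carrying `threshold` or more risk alleles."""
    return sum(comb(n, k) for k in range(threshold, n + 1)) / 2 ** n


def threshold_for_prevalence(prevalence: float, n: int = N_ALLELES) -> int:
    """Smallest allele count whose upper tail does not exceed `prevalence`."""
    for t in range(n + 1):
        if tail_probability(t, n) <= prevalence:
            return t
    raise ValueError("prevalence is smaller than the chance of carrying all n risk alleles")


for t in (115, 120, 125):
    print(f"P(N >= {t}) = {tail_probability(t):.5f}")
print("allele count threshold for 0.1% prevalence:", threshold_for_prevalence(0.001))
```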
Although this is only a crude model and in reality we can expect that the frequency and
risk attributable to relevant alleles will vary considerably, it does at least illustrate that
under a model where susceptibility to multiple sclerosis is determined by multiple
common modest effect risk alleles it can be anticipated that the Carter effect will be small
and difficult to detect. In light of these considerations it is easy to see why the available
data appear contradictory - none of the published studies have anything more than
minimal power to address this question. These data are certainly not evidence against a
polygenic model in multiple sclerosis. Others have pointed out that there are also biologic
effects which can confound the Carter effect. For example mitochondrial, imprinting and
intra-uterine effects could all potentially produce a maternal parent of origin effect which
would be further expected to confound efforts to demonstrate the Carter effect.
This simple model provides further insight. Since there are more than 10^57 different ways
of selecting 120 risk alleles from amongst a total of 200 we can anticipate huge variation
in the genetic make up of affected individuals. It is thus unsurprising that conditions like
multiple sclerosis show variation in clinical features such as course and severity; indeed it
would be remarkable if such variation did not occur.
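The number of ways of selecting 120 risk alleles from 200 is a binomial coefficient and can be evaluated exactly. A short check, with no assumptions beyond the counts given in the text:

```python
from math import comb, log10

# Number of distinct sets of 120 risk alleles drawn from 200 possible positions.
ways = comb(200, 120)
print(f"C(200, 120) is about 10^{log10(ways):.1f}")
```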
Section 2 - Linkage Disequilibrium
Alleles at loci contained on the same chromosome are inherited together at meiosis unless
they are separated by a recombinant event occurring between the loci. Since the
frequency of these recombinant events is determined by the genetic distance between the
loci there is a tendency for alleles from loci close to each other to remain together as they
are inherited through the generations. However, given a sufficient number of generations
an equilibrium will eventually be reached when the rate at which a particular haplotype is
disrupted by recombination is matched by the rate at which it is reformed by
recombination between other haplotypes; at this point the frequency of the various
haplotypes will no longer change with time.
Consider two loci, the first with alleles A and a and the second with alleles B and b; if
these loci lie on the same chromosome then there are four possible haplotypes
(chromosomal arrangements), as shown in figure 2.1.
Figure 2.1. Possible haplotypes for two biallelic loci
For two biallelic linked loci there are 4 possible haplotypes as illustrated in the
lower half of the figure.
In a large randomly mating population recombination would eventually whittle away any
skew in the frequency of haplotypes until the assortment of alleles across the various
chromosomes is essentially random. In this equilibrium state the frequency of any
haplotype such as the haplotype AB (P_AB) would just be determined by the frequency of
alleles making up that haplotype.
P_AB = P_A * P_B
Where P_A is the frequency of allele A and P_B is the frequency of allele B.
This situation is referred to as linkage equilibrium: all four haplotypes are found in the
population and the frequency of each is just determined by the frequency of the alleles
making up the haplotype. So for example at linkage equilibrium if allele A has a
frequency of 50% and allele B a frequency of 10% then 5% of the chromosomes would
carry both A and B. At linkage equilibrium no haplotype is over- or underrepresented
with respect to what would be predicted on the basis of the allele frequencies and random
assortment. For widely separated loci equilibrium will be reached within a few
Until this equilibrium is achieved the tendency for haplotypes to be transmitted intact
means that the frequency of a particular haplotype in a population may be very different
to that which would be predicted at equilibrium. When this situation exists there is said to
be linkage disequilibrium (LD) between the loci. For example consider a population in
which only allele “a” exists at locus 1. If a mutation occurs and allele A now appears on a
chromosome which happens to be carrying an allele B at the second locus then only three
of the four possible haplotypes will exist in the population (AB, aB and ab) even though
all four alleles exist. The same situation would arise if a migrant carrying the AB
haplotype arrived in the population. If over the next few generations the frequency of the
AB haplotype increases (either by selection or through random genetic drift) then until a
recombinant event occurs all chromosomes carrying A will also carry B. There will thus
be a correlation between A and B and the frequency of the various haplotypes would be
different from that which would be predicted by random assortment (equilibrium). If the
haplotype aB were lost then only two of the four possible haplotypes would exist and
there would be complete correlation between A and B: whenever an individual inherited
an A they would also inherit a B, until recombination separates these alleles.
The extent of linkage disequilibrium between alleles at different loci is generally
measured in terms of parameters called D’ and r^2: D’ reflects the extent to which a
haplotype has escaped the erosive effects of recombination between loci, while r^2 is
quite literally the square of the correlation coefficient between the alleles across all chromosomes.
Both of these measures vary between 0 and 1, with 0 indicating no LD (linkage
equilibrium) and values increasing as the extent of LD increases. In the situation where
only two of the four possible haplotypes exist in a population there is total correlation
between the alleles and both D’ and r^2 equal 1 (note this two-haplotype situation is only
possible when the minor allele frequency is identical at both loci). In the situation where
only three of the four haplotypes exist D’ will equal 1 while r^2 will be < 1; this indicates that
no recombinant event has yet occurred but the correlation between alleles is incomplete
as the minor allele frequencies are not equal. As recombination events occur all four
haplotypes will be present in the population but the frequency of each will deviate
from random assortment and both D’ and r^2 will be < 1 but > 0 until equilibrium is
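The behaviour of the two measures in the situations described in this section can be reproduced with a few lines of code. The sketch below uses the commonly quoted definitions (D as the difference between the observed haplotype frequency and the product of allele frequencies, D’ as |D| scaled by its maximum attainable value, r^2 as D squared over the product of the four allele frequencies); the haplotype frequencies in the examples are hypothetical.

```python
def ld_measures(p_AB: float, p_Ab: float, p_aB: float, p_ab: float) -> tuple[float, float, float]:
    """Return (D, D', r^2) for two biallelic loci given the four haplotype frequencies."""
    if abs(p_AB + p_Ab + p_aB + p_ab - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p_A = p_AB + p_Ab  # frequency of allele A at locus 1
    p_B = p_AB + p_aB  # frequency of allele B at locus 2
    d = p_AB - p_A * p_B  # deviation from the equilibrium expectation
    # Largest |D| attainable given the allele frequencies and the sign of D.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared


# Hypothetical haplotype frequencies, in the order AB, Ab, aB, ab.
examples = {
    "linkage equilibrium (A 50%, B 10%)": (0.05, 0.45, 0.05, 0.45),
    "three haplotypes (AB, aB, ab)": (0.10, 0.00, 0.30, 0.60),
    "two haplotypes (AB, ab)": (0.30, 0.00, 0.00, 0.70),
}
for label, freqs in examples.items():
    d, d_prime, r_squared = ld_measures(*freqs)
    print(f"{label}: D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r_squared:.3f}")
```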
Section 3 - Linkage
In complex diseases like multiple sclerosis where the mode of inheritance cannot be
inferred from segregation analysis the search for linkage has to rely on “non-parametric”
methods which don’t require any knowledge of parameters such as penetrance, risk allele
frequency or the number of loci involved. These methods don’t require large extended
families, which are uncommon in multiple sclerosis, but instead identify linkage by
testing large numbers of smaller families (most typically affected sibling pairs) and
searching for regions of the genome co-inherited by affected individuals within these
families more often than would be expected by chance.
If any two individuals are genotyped for a marker then their genotypes can be directly
compared and we can easily establish whether they have any alleles in common. This
type of allele sharing is referred to as identity by state (IBS). So for example if one
individual has a genotype AC and a second has a genotype AD then we can see that they
have one allele IBS (allele A), that is they share one allele in common. If the two
individuals considered are siblings then we can also ask whether the allele they have in
common has been inherited from the same parent. This issue is illustrated in figure 3.1
which shows three affected sib pair families.
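Counting alleles shared identical by state requires nothing more than the two genotypes. A minimal sketch, assuming genotypes are written as two-letter strings such as "AC" (a representation chosen here purely for illustration):

```python
from collections import Counter


def ibs_sharing(genotype_1: str, genotype_2: str) -> int:
    """Number of alleles (0, 1 or 2) that two genotypes share identical by state."""
    # Multiset intersection keeps each allele as many times as it occurs in both genotypes.
    shared = Counter(genotype_1) & Counter(genotype_2)
    return sum(shared.values())


print(ibs_sharing("AC", "AD"))  # the example from the text: allele A is shared
```

Note that this says nothing about identity by descent, which depends on the parental genotypes.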
In each case the sibs have the same genotypes with one allele (allele A) identical by state
(IBS). By considering the parental genotypes we can see that in the first family the shared
allele A has been inherited from the same parent (the mother) and is therefore also
identical by descent (IBD). In the second family, however, it is clear that although both
sibs carry an allele A the first sib inherited this from the father while the second inherited
this from the mother, thus although they both carry allele A this is not the same A by
descent. In the third family it is not possible to know whether the same A has been
inherited, so the IBD status cannot be determined unambiguously from these data. Note
however that if flanking markers were typed and these turned out to be heterozygous in
the mother then we would be able to infer the IBD sharing of the mother’s allele with
much more confidence (this is how multipoint linkage analysis boosts the power to detect
Figure 3.1. Three families with the same IBS sharing in the sibs but differing parental
In each affected sib pair family allele A is shared IBS by the sibs. However,
inspection of the parental genotypes shows that the A allele is IBD in the first
family but not in the second. In the third family it is not possible to unambiguously
infer the IBD status of the A allele from these data.
So we can see that the IBS status of a related pair of individuals does not always predict
the IBD status and that we need to consider the genotypes of the family members to
establish whether an allele that is IBS is in fact also IBD.
Mendel’s laws allow us to predict IBD sharing for any pair of related individuals. For
example consider a pair of siblings. Given the genotype of the first sib, we can see
that there is a 50:50 chance that the second sib will inherit the same maternal allele and a
50:50 chance that the second sib will inherit the same paternal allele. There is thus a 25%
chance that they will share 2 alleles IBD, a 25% chance that they will share 0 alleles IBD
and a 50% chance that they will share 1 allele IBD (50% of the time this will be the
maternal allele and 50% the paternal), see figure 3.2.
Figure 3.2. IBD sharing in the absence of linkage.
This figure shows the four possible genotypes for a second sibling given the
genotype of the first sibling. Each of these four outcomes has an equal probability
under Mendel’s first law of random segregation and thus the expected IBD sharing
for a pair of sibs can be determined. In this example the parental genotypes are
maximally informative (i.e. the parents carry four different alleles). In this special
situation the IBD status can be inferred directly from the IBS status. The expected
IBD proportions would not change if other parental genotype patterns were
considered but of course it would not always be possible to infer the IBD status
with certainty. For example if both parents were homozygous for allele A then the
IBD status for the offspring would still show the expected proportions but in all
situations the offspring would be two sharing IBS since they would both always
have the genotype AA. This illustrates the value of marker heterozygosity. The
greater the heterozygosity the more likely it is that parents will be informative with
regard to allele sharing.
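The expected IBD proportions can be obtained by enumerating the equally likely transmissions. The sketch below assumes maximally informative parents with hypothetical allele labels (mother A/B, father C/D) and enumerates both sibs rather than conditioning on the first; the resulting proportions are the same.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Hypothetical, maximally informative parental genotypes.
MATERNAL = ("A", "B")
PATERNAL = ("C", "D")


def ibd_distribution() -> dict[int, Fraction]:
    """Probability that a sib pair shares 0, 1 or 2 alleles IBD in the absence of linkage."""
    counts: Counter = Counter()
    # Each sib receives one maternal and one paternal allele, all outcomes equally likely.
    sib_genotypes = list(product(MATERNAL, PATERNAL))
    for sib_1, sib_2 in product(sib_genotypes, repeat=2):
        shared = (sib_1[0] == sib_2[0]) + (sib_1[1] == sib_2[1])
        counts[shared] += 1
    total = sum(counts.values())
    return {k: Fraction(v, total) for k, v in sorted(counts.items())}


for n_shared, probability in ibd_distribution().items():
    print(f"{n_shared} alleles IBD: {probability}")
```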
In principle, performing a non-parametric linkage analysis is then simply a matter of
collecting a set of affected multiplex families (e.g. sib pairs), typing a marker to establish
the actual extent of IBD sharing and comparing this with the expected amount of IBD
sharing as predicted by Mendel’s laws. For example if we typed a marker in 100 sib pair
families then on average we would expect 25 families to be 0 sharing, 50 to be 1 sharing
and 25 to be 2 sharing IBD. If in fact we found 90 families to be 2 sharing and 10 to be 1
sharing this would be substantially more than would be expected and would be evidence
that the tested marker was in linkage, i.e. was close to a relevant gene. In reality of course
we would not be able to infer the IBD status exactly in every family since the marker
would not always be informative, or parents might be missing in some cases or fail to
genotype. In some situations we might be able to infer the genotype of missing parents,
especially if we typed other siblings but this would not always be possible. In many
families it won’t be possible to infer the IBD sharing exactly and this will need to be
As in standard linkage analysis the statistical significance of the evidence for linkage, the
extent of excess allele sharing, is usually expressed in terms of a lod score - the base 10 logarithm
of the ratio between the likelihood of the observed data if there were linkage and
the likelihood of the observed data if there were no linkage. So for example if a tested
marker showed an excess of IBD allele sharing with a lod score of 2.0 this would indicate
that the observed sharing was 100 times more likely if there were a linked gene in this
region than if there were no relevant gene in this region. Of course since inheritance at
meiosis is a random process we expect to see an excess of IBD sharing in some parts of
the genome just by chance even in the absence of any linkage. Thus when a whole
genome screen for linkage is performed we can expect to see some regions of the genome
showing interesting lod scores just by chance. It has been shown that on average a typical
genome screen will include perhaps 10 regions with a lod score of greater than 1.0 and on
average one region with a lod score greater than 2.0. Indeed one in twenty screens (i.e.
5%) can be expected to include a lod score of 4.0 just by chance alone. It is important to
keep these genome-wide thresholds in mind when considering the results of genome-
wide linkage screens. Only a lod score of > 4.0 has a genome-wide significance of 5%.
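The link between a lod score and a likelihood ratio, and the way excess sharing translates into a lod score, can be sketched as follows. The second function is a deliberately simplified allele-counting statistic (a one-sided binomial likelihood ratio on fully informative alleles, with 50% sharing as the null); it is an illustrative assumption, not the statistic used by any particular linkage program.

```python
from math import log10


def likelihood_ratio(lod: float) -> float:
    """Likelihood ratio (linkage versus no linkage) implied by a lod score."""
    return 10.0 ** lod


def sharing_lod(shared: int, informative: int) -> float:
    """Simplified lod for excess sharing: `shared` of `informative` alleles are IBD."""
    if not 0 <= shared <= informative:
        raise ValueError("shared must lie between 0 and informative")
    if informative == 0:
        return 0.0
    p_hat = shared / informative  # maximum-likelihood estimate of the sharing proportion
    if p_hat <= 0.5:
        return 0.0  # no excess sharing, so no evidence for linkage
    lod = shared * log10(p_hat / 0.5)
    not_shared = informative - shared
    if not_shared:
        lod += not_shared * log10((1.0 - p_hat) / 0.5)
    return lod


print(likelihood_ratio(2.0))
# The extreme example from the text: 90 families sharing 2 and 10 sharing 1 allele IBD,
# i.e. 190 of 200 alleles shared, against an expectation of 100.
print(f"{sharing_lod(190, 200):.1f}")
```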
In a seminal paper Risch and Merikangas (1996) demonstrated the limited power of
linkage based studies in the context of modest genetic effects. In their modelling they
considered a biallelic susceptibility locus and summarised the effects of the risk allele in
terms of its frequency and the Genotypic Relative Risk (GRR or γ), defined as the
relative increase in the risk of disease in heterozygous individuals as compared to
individuals homozygous for the wild type allele (Risch and Merikangas, 1996). To
simplify their calculations they assumed a multiplicative model such that the risk in
individuals homozygous for the risk allele was increased by a factor of γ^2. Employing the
standard formulae considered by Risch and Merikangas, figure 3.3 shows the essentially
inverse relationship between minor allele frequency and risk (GRR) for any given level
of allele sharing. From this figure it is clear that rare alleles must convey a large relative
risk (high GRR) if they are to generate the same amount of allele sharing as is produced
by a common risk allele exerting a modest effect (low GRR). Put the other way round,
any given level of allele sharing observed in a linkage study may be the result of a
common risk allele exerting a modest effect or a rare risk allele exerting a larger effect; a
non-parametric linkage study does not differentiate between these alternatives.
Figure 3.3. The relationship between GRR and risk allele frequency for a
given level of sib pair allele sharing.
The figure shows the relationship for four levels of sib pair allele sharing: 60% (red
line), 57% (green line), 54% (blue line) and 51% (purple line). Calculated using the
standard formulae employed by Risch and Merikangas (1996).
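The trade-off between allele frequency and GRR at a fixed level of sharing can be sketched by inverting the affected sib pair sharing formula. The code below assumes the multiplicative-model expression Y = (1 + w) / (2 + w) with w = pq(γ - 1)^2 / (pγ + q)^2, which is our reading of the Risch and Merikangas (1996) formula; the function name and the example frequencies are illustrative.

```python
from math import sqrt


def required_grr(risk_allele_freq: float, sharing: float) -> float:
    """GRR needed for a risk allele of the given frequency to produce `sharing` in sib pairs."""
    p = risk_allele_freq
    q = 1.0 - p
    if not 0.0 < p < 1.0 or not 0.5 <= sharing < 1.0:
        raise ValueError("need 0 < frequency < 1 and 0.5 <= sharing < 1")
    w = (2.0 * sharing - 1.0) / (1.0 - sharing)  # invert Y = (1 + w) / (2 + w)
    s = sqrt(w / (p * q))  # s equals (GRR - 1) / (p * GRR + q)
    if s * p >= 1.0:
        raise ValueError("this level of sharing cannot be reached at this allele frequency")
    return (1.0 + s * q) / (1.0 - s * p)


for freq in (0.01, 0.05, 0.2):
    print(f"frequency {freq:.2f}: GRR of about {required_grr(freq, 0.54):.2f} gives 54% sharing")
```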
Using additional formulae employed by Risch and Merikangas (1996) it is
straightforward to calculate the number of sib pair families needed to ensure adequate
power (80%) to identify statistically significant linkage (lod score ≥ 4.0) in relation to the
allele sharing resulting from a susceptibility locus. The results from these calculations are
shown in figure 3.4.
Figure 3.4. Number of sib pair families required to provide 80% power to
identify statistically significant linkage (lod score ≥ 4.0) in relation to the
expected allele sharing generated by a susceptibility locus.
Calculated using the standard formulae employed by Risch and Merikangas (1996).
From this figure (3.4) it is clear that the number of affected sib pair families necessary to
map loci by linkage based methods rapidly becomes impractical once the expected allele
sharing falls much below 54%.
Given that the DRB1*1501 risk allele has a frequency of 15% and a GRR of around 3.0 it
is expected to generate approximately 57% allele sharing in affected sib pairs. Since this
level of sharing requires 400 to 600 sib pairs to ensure adequate power it is easy to see
why the original microsatellite based screens failed to identify linkage to the MHC, and
why the few studies reaching this threshold sample size by combining resources from
several populations succeeded. In contrast to the microsatellite based studies (Sawcer et
al., 1996; Haines et al., 1996; Ebers et al., 1996), which individually only had the power
to identify loci generating more than 60% allele sharing, the high density SNP based
linkage screen (IMSGC, 2005) had the power to identify loci generating as little as 54%
allele sharing. The fact that no significant linkage was demonstrated in this study outside
the MHC indicates that the remaining susceptibility loci are unlikely to be identifiable by
linkage since the number of families required would be impractical. The infeasibility of
further linkage analysis in multiple sclerosis is well illustrated by considering the sample
size required to demonstrate significant linkage to the newly identified susceptibility
locus IL7R. At this locus the risk allele has a frequency of 72% and a GRR of 1.2
(Gregory et al., 2007); it is therefore expected to generate approximately 50.2% allele
sharing in affected sib pairs. A linkage based study would need to involve more than a
million sib pair families to ensure adequate power to demonstrate significant linkage to
IL7R. Even demonstrating nominally significant linkage (lod score = 0.7) would require
more than 290,000 sib pairs.
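The expected sharing figures quoted for DRB1*1501 and IL7R can be reproduced from the allele frequency and GRR. This is a sketch assuming the multiplicative-model formula Y = (1 + w) / (2 + w), w = pq(γ - 1)^2 / (pγ + q)^2, as we understand it from Risch and Merikangas (1996); the function name is illustrative.

```python
def asp_allele_sharing(risk_allele_freq: float, grr: float) -> float:
    """Expected proportion of alleles shared by affected sib pairs at a susceptibility locus."""
    p = risk_allele_freq
    q = 1.0 - p
    w = p * q * (grr - 1.0) ** 2 / (p * grr + q) ** 2
    return (1.0 + w) / (2.0 + w)


# Frequencies and GRR values as quoted in the text.
for locus, freq, grr in (("DRB1*1501", 0.15, 3.0), ("IL7R", 0.72, 1.2)):
    print(f"{locus}: expected sharing {100 * asp_allele_sharing(freq, grr):.1f}%")
```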
Section 4 - Some basic association statistics
In an association test we aim to compare the genetic make up of those with (cases) and
without (controls) a particular disease in order to see if there is any systematic difference
between these two groups. Of course since we necessarily only consider a sample of
cases and a sample of controls there is always the possibility that any apparent difference
is just due to chance. We therefore need statistics to help us decide how likely it is that
any observed difference reflects a real population difference rather than just a chance
observation. In practice the need to correct for various potentially confounding influences
tends to make this statistical analysis complex and confusing, although at its heart it is a
rather straightforward process. For any tested marker there are only two possibilities:
either the marker is associated at the population level or it is not. We call these two
possibilities the test and the null hypothesis respectively (note at this level there is no
distinction between genuinely causative variants and those in LD with such variants since
both these classes of marker show a difference between cases and controls at the
population level). The main aim of a statistical analysis is to provide some measure of
how confident we can be that a tested marker is indeed associated at the population level,
i.e. that the test hypothesis is true for this marker. For many years researchers have used
the extent to which observed data allows the null hypothesis to be rejected as the primary
evidence that the test hypothesis must be true. Unfortunately although this logic has an
intuitive appeal it is erroneous and has been highly misleading. The extent to which
observed data can be considered to have rejected the null hypothesis is expressed in terms
of α, the significance of the data, which is defined as the probability of seeing the
observed case-control difference or greater assuming the null hypothesis is true. The
significance is NOT the probability that the null hypothesis is true.
The evidence favouring the test over the null hypothesis is better presented in terms of
the odds that the test hypothesis is true rather than the null hypothesis being true; that is
the ratio between the likelihood that the test hypothesis is true and the likelihood that the
null hypothesis is true. The huge amount of variation in the human population means that
for any randomly selected marker before any analysis is done the odds in favour of the
test hypothesis are very low, approximately 100,000:1 against; see the main text and
similar logic presented by the Wellcome Trust Case Control Consortium (WTCCC,
2007). After a marker is typed in a sample of cases and controls the odds that the test
hypothesis is true can be recalculated allowing for how likely the observed data would be
under each of the two hypotheses. The likelihood of the observed data under the test
hypothesis is determined by the power of the study, which is defined as the probability of
seeing the observed case-control difference or greater assuming the test hypothesis is
true. The expectation is that the odds calculated after typing the marker in a set of cases
and controls (the posterior odds) should be more informative than the odds calculated
before the typing (the prior odds). If the test hypothesis is true then the odds in favour of
this hypothesis should have improved after the typing. Bayes’ theorem provides a simple
algebraic relationship between these four parameters such that
Posterior Odds = (power / significance) * Prior Odds
Note the ratio between power and significance is known as the likelihood ratio and
indicates how much more likely the observed data is under the test hypothesis than under
the null. Of course since we don’t know the size of effects attributable to relevant loci we
cannot know the power of a given study ahead of time. However, we can calculate the
anticipated power for effects of a given size. Restricting our attention to common
variation and making the simplifying assumption that the power to detect each relevant
locus is consistent across the various relevant loci (as suggested by the WTCCC) then we
can use the above logic, together with our estimate for the prior odds, to calculate the
posterior odds associated with the significance of any observed result.
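The calculation of posterior odds from power, significance and prior odds can be sketched in a few lines. The prior odds of 100,000:1 against are taken from the text; the power of 50% and the significance levels shown are hypothetical values chosen only for illustration.

```python
def posterior_odds(power: float, significance: float, prior_odds: float) -> float:
    """Posterior odds = (power / significance) * prior odds."""
    if significance <= 0.0:
        raise ValueError("significance must be positive")
    likelihood_ratio = power / significance
    return likelihood_ratio * prior_odds


PRIOR_ODDS = 1.0 / 100_000  # approximately 100,000:1 against, as in the text
ASSUMED_POWER = 0.5  # hypothetical power for an effect of a given size

for alpha in (0.05, 1e-4, 5e-7):
    odds = posterior_odds(ASSUMED_POWER, alpha, PRIOR_ODDS)
    print(f"significance {alpha:g}: posterior odds in favour of association = {odds:g}")
```

Under these assumed values a nominally significant result (0.05) leaves the odds heavily against genuine association, whereas a much more stringent significance is needed before the odds become favourable.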
Note in all of the above it is assumed that typing is complete, error free and unbiased -
these prerequisites will rarely be fully met and generally tend to reduce the likelihood that
any observed association is genuine, see main text.
References
Ebers GC, Kukay K, Bulman DE, Sadovnick AD, Rice G, Anderson C, et al. A full
genome search in multiple sclerosis. Nat Genet 1996; 13: 472-6.
Ebers GC, Sadovnick AD, Dyment DA, Yee IM, Willer CJ, Risch N. Parent-of-origin
effect in multiple sclerosis: observations in half-siblings. Lancet 2004; 363: 1773-4.
Falconer DS. Introduction to Quantitative Genetics. London: Longman, 1981.
Gregory SG, Schmidt S, Seth P, Oksenberg JR, Hart J, Prokop A, et al. Interleukin 7
receptor alpha chain (IL7R) shows allelic and functional association with multiple
sclerosis. Nat Genet 2007; 39: 1083-91.
Haines JL, Ter-Minassian M, Bazyk A, Gusella JF, Kim DJ, Terwedow H, et al. A
complete genomic screen for multiple sclerosis underscores a role for the major
histocompatability complex. The Multiple Sclerosis Genetics Group. Nat Genet 1996; 13:
Herrera BM, Ramagopalan SV, Orton S, Chao MJ, Yee IM, Sadovnick AD, et al.
Parental transmission of MS in a population-based Canadian cohort. Neurology 2007; 69:
Hupperts R, Broadley S, Mander A, Clayton D, Compston DA, Robertson NP. Patterns
of disease in concordant parent-child pairs with multiple sclerosis. Neurology 2001; 57:
International Multiple Sclerosis Genetics Consortium (IMSGC). A high-density screen
for linkage in multiple sclerosis. Am J Hum Genet 2005; 77: 454-67.
Kantarci OH, Barcellos LF, Atkinson EJ, Ramsay PP, Lincoln R, Achenbach SJ, et al.
Men transmit MS more often to their children vs women: the Carter effect. Neurology
2006; 67: 305-10.
Risch N, Merikangas K. The future of genetic studies of complex human diseases.
Science 1996; 273: 1516-7.
Sawcer S, Jones HB, Feakes R, Gray J, Smaldon N, Chataway J, et al. A genome screen
in multiple sclerosis reveals susceptibility loci on chromosome 6p21 and 17q22. Nat
Genet 1996; 13: 464-8.
Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000
cases of seven common diseases and 3,000 shared controls. Nature 2007; 447: 661-78.