The American Journal of Human Genetics (AJHG) Vol. 90, No. 4, 2012
This Month in The Journal
Sara B. Cullinan1
Genomic Privacy in GWAS?
Im et al., page 591
Recent technological advances have made it possible to
interrogate human phenotypes at a previously unimagin-
able scale. But, as with any collection of personal data, it
is important to ensure individual privacy. Indeed, previous
investigations into the ability to discern an individual’s
participation in genetic studies have led to the withdrawal
of allele frequencies from publicly available results. In this
issue, Im et al. probe deeper, questioning how much
private information can be extracted from typically re-
ported statistics, such as regression coefﬁcients or p values.
Through a series of analyses, the authors determine that
regression coefﬁcients can, in some cases, provide just as
much information as allele frequencies, thus creating a
situation in which even statistics that were thought to be
‘‘safe’’ can in fact identify participants and their medical
history. The possibility of membership detection is espe-
cially high in cases in which multiple phenotypes are
being reported, e.g., in multiple-omics data sets. With
exome- and whole-genome sequencing (and the large
data sets that they generate) becoming more common, it
is clear that many additional discussions between scien-
tists, clinicians, and ethicists are needed to ensure that
privacy can be maintained without sacriﬁcing the dissem-
ination of research ﬁndings.
A Major mtDNA Shake-Up
Behar et al., page 675
In 1981, the revised Cambridge Reference Sequence was
published. It immediately became the standard against
which human mtDNA is compared and phylogenies are
derived. Indeed, its publication enabled a tremendous
amount of research aimed at better understanding human
history. However, the realization that this sequence belongs
to a recently coalescing European haplogroup creates
several concerns about inconsistencies and misinterpreta-
tion. To address these concerns, Behar et al. set out to reas-
sess and reﬁne the human mtDNA phylogeny, and in so
doing, they constructed a new reference mtDNA sequence,
termed the Reconstructed Sapiens Reference Sequence
(RSRS). Generated through the assessment of over 18,000
human mtDNA sequences, as well as those of Homo
neanderthalensis, the RSRS performs well in molecular clock
analyses and lays the groundwork for a new way of ana-
lyzing mtDNA. Although this change will require a large
amount of rethinking, the authors put forth a coherent
plan to make this feasible, including tools to transform
previously generated data and analyses. With the amount
of deep-sequencing data that should become available in
the coming years, the RSRS presents a ‘‘next-generation’’
approach to understanding human matrilineal diversity.
First Steps toward Understanding Birth Weight
Ishida et al., page 715
Babies come in many different sizes, but being too small
is a major health concern. Indeed, intrauterine growth
restriction (IUGR) serves as a risk factor for several adult
diseases, including obesity and type 2 diabetes. Although
maternal health plays a large role in directing fetal growth,
the genetic factors that contribute to the variability in fetal
size remain poorly understood. Of interest, however, are
those genes that undergo imprinting, a process by which
the parent of origin determines monoallelic expression.
Evolutionary theory posits that expression of alleles in-
herited from the father promote in utero growth, whereas
those inherited from the mother inhibit growth. But what
happens if the maternally inherited allele exhibits an
altered expression pattern? Might the balance be tipped?
In this issue, Ishida et al. explored the possibility that
variants in PHLDA2, which is only expressed from the
maternal allele, might inﬂuence birth weight. Their studies
identiﬁed a variant in the PHLDA2 promoter region that
eliminates several consensus transcription factor binding
sites and should therefore lead to decreased expression.
Then, through a cross-sectional study of normal births,
they showed that inheritance of this variant (from the
mother), as well as maternal homozygosity, correlated
with increased birth weight. Future studies, focused specif-
ically on IUGR, should help to elucidate how variation in
PHLDA2, and potentially in other imprinted genes, con-
tributes to the regulation of birth weight and related
Deputy Editor, AJHG
DOI 10.1016/j.ajhg.2012.03.008. © 2012 by The American Society of Human Genetics. All rights reserved.
The American Journal of Human Genetics 90, 575–576, April 6, 2012 575
Evolutionary History of AD Risk Alleles
Raj et al., page 720
Alzheimer’s disease (AD) is the most common neurodegenerative
disease, and as of yet, there are no effective
treatments, let alone a cure. Therefore, there is great
interest in better understanding the causes of the disease
from both biochemical and genetic standpoints. The
best-characterized genetic risk factor is the ε4 haplotype
of APOE, which, interestingly, shows evidence of having
undergone positive selection, most likely because of an
effect on an unrelated phenotype. With this in mind, Raj
et al. set out to identify other possible indications of selec-
tion in loci shown to associate with AD susceptibility. They
found such evidence, all in East Asian populations, for
three loci, suggesting that the same selective pressure
might have acted on each. Given that AD is unlikely to
serve in such a role, the authors posited that pathogen
exposure might have been the driving force. Indeed,
many signatures of selection in the human genome are
attributed to interactions with pathogens. Interestingly,
the protein products generated at these loci appear to
belong to the same interaction network. This ﬁnding
suggests that additional clues about AD risk might be
found by interrogating other branches of this network.
Although much remains to be learned about the variants
that contribute to AD risk, the study of their evolution,
and possible coevolution, will no doubt yield insights
into the underlying biology of the disease.
X Marks the Spot in Breast Cancer Research
Park et al., page 734
The ubiquitous pink ribbons serve as a reminder that
many women (and some men) are affected by breast
cancer. Although well known, BRCA1 and BRCA2 muta-
tions account for a minority of hereditary cancers. There-
fore, a better understanding of the biology of breast
cancer, along with better screening tests, is sought by
many families. To help achieve these goals, Park et al.
used exome sequencing and identiﬁed rare mutations in
XRCC2 that serve as susceptibility factors for familial
breast cancer. XRCC2 is a RAD51 paralog that is required
for efﬁcient homologous recombination (HR); its loss
leads to marked genome instability and aneuploidy.
Future studies aimed at delineating the exact role of
XRCC2 mutations, as well as mutations that lie within
the same pathway, in disease onset and/or progression
should aid in the discovery of new treatment options.
This ﬁnding adds to the list of genes whose protein prod-
ucts perform crucial roles in HR and whose mutations can
inﬂuence breast cancer risk. It also provides support for
those who seek to better understand common diseases
through sequencing studies.
This Month in Genetics
Kathryn B. Garber1,*
Big Gene, Big Heart
Although the cardiomyopathies have a substantial genetic
etiology, genetic testing for this class of heart disorders has
been notoriously difﬁcult. Indeed, the causative mutation
is found in only 20%–30% of patients with dilated cardio-
myopathy. Titin is a candidate gene for cardiomyopathy
that has been examined for mutations to a limited extent
due to its massive coding sequence, which is ~100 kb
in size. Herman et al. recently published data showing
that the sequence hurdle for this gene is worth the effort.
Through next-generation sequencing, they identiﬁed
a truncating TTN mutation in ~25% of familial cases of
idiopathic dilated cardiomyopathy, moving TTN to the
forefront of genes involved in this form of the disease.
Although these mutations had very high penetrance after
age 40 in familial cases, there is also a signiﬁcant amount
of TTN variation whose clinical signiﬁcance is difﬁcult to
interpret at this time. This includes missense variation,
which was not analyzed in this current paper, so its role
in cardiomyopathy is unclear. Even with truncating muta-
tions in TTN, interpretation is not always simple; these
mutations were identiﬁed, albeit at lower frequency, in
control individuals and in individuals with hypertrophic
cardiomyopathy who also had a pathogenic mutation in
a known disease gene.
Herman et al. (2012) NEJM 366, 619–628.
A Complex Balance
Perhaps it is not surprising that the more closely you look
at something, the more you see. Certainly, the advent of
whole-genome comparative genomic hybridization
(CGH) arrays taught us that many people with normal
G-banded karyotypes have cytogenetic aberrations when
we look more closely. Even high-resolution CGH arrays
don’t give us a complete picture of chromosomes, as
recently illustrated by Chiang et al. These investigators
took a set of individuals who had apparently balanced
chromosome translocations—at least based on G-banding
and whole-genome CGH arrays—and they analyzed the
breakpoints at the nucleotide level. What they found was
an unexpectedly high level of complexity to the break-
points. In almost 20% of cases, three or more breakpoints
were involved, but in some cases, a shockingly complex
interweaving of segments occurred, akin to what was
recently described in cancer cells as ‘‘chromothripsis,’’ or
chromosome shattering and reorganization. The cases
analyzed by Chiang et al. involved upward of ten break-
points with inverted segments interspersed among seg-
ments of the expected orientation. This phenomenon is
not limited to spontaneous rearrangements in humans;
analysis of transgene insertions in mice and in sheep
revealed that the sites of integration can be similarly complex.
Chiang et al. (2012) Nat. Genet. Published online March 4, 2012.
Good News for Men
The Y chromosome is just a degenerate of its former auto-
somal self that is on its way to extinction, or so some have
proposed. If you compare the Y to the X chromosome, for
instance, the Y has lost many of the genes that the
chromosomes once shared, and without a companion
chromosome with which to fully pair itself during meiosis,
some think this sex-speciﬁc chromosome is doomed.
David Page argues otherwise. His group does species
comparisons of the Y chromosome in order to understand
its evolution and to better predict the future fate of the Y.
Page’s group previously compared the human to the chim-
panzee Y chromosome, which diverged about six million
years ago, but, in order to look at a much longer evolu-
tionary window, his group recently compared the human
and rhesus macaque Y chromosomes, which diverged
25 million years ago. This comparison yielded a surprising
level of evolutionary stability on the Y. In the majority of
the male-speciﬁc regions of the Y chromosome, rhesus
macaques and humans share the same ancestral genes,
arguing for Y chromosome stability over the long haul.
In only a very restricted segment of the Y has gene loss
occurred in humans since the split from the Old World
monkeys. Their data ﬁt a model in which rapid degenera-
tion of segments on Y was followed by marked slowing
of this decay and chromosome stabilization. Don’t count
the Y out just yet; it looks like it may stick around a while.
Hughes et al. (2012) Nature 483, 82–86.
Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA
DOI 10.1016/j.ajhg.2012.03.009. © 2012 by The American Society of Human Genetics. All rights reserved.
The American Journal of Human Genetics 90, 577–578, April 6, 2012 577
Enhancers Acting as Promoters
Just as we learn to group letters into words and bin words
into different parts of speech in order to extract meaning
from sentences, we try to interpret genome sequences by
picking out the nucleotide sets that comprise genes and
attempting to recognize the regulatory elements from
strings of As, Cs, Gs, and Ts. But although we might think
we understand what a particular type of genetic element
does, recognition of one of its roles in gene expression
sometimes doesn’t tell the whole story. Take enhancers,
for instance. These are well-studied cis elements that
have a simple job: they bind transcription factors and
enhance expression from gene promoters, hence their
name. Kowalczyk et al. wondered whether that’s all
enhancers do, and they ended up with evidence that intra-
genic enhancers can also act as alternative tissue-speciﬁc
promoters. The resulting mRNAs are spliced and polyade-
nylated but do not appear to be translated into protein.
Because enhancers are much more common than classic
promoters and because about half of enhancers are intra-
genic, this promoter-like activity could contribute substan-
tially to the complexity of the mammalian transcriptome.
The next step is to ﬁgure out how these untranslated tran-
scripts are used.
Kowalczyk et al. (2012) Mol. Cell 45, 447–458.
A Common Turn-On
While we’re on the subject of surprising roles for noncod-
ing elements, a recent paper uncovered the coordinated
regulation of two neighboring, but nonparalogous, genes
that both tie into an identical phenotype. Joe Gleeson’s
group focuses on ciliopathies, and they recently identiﬁed
mutations in TMEM216 at the JBTS2 locus that cause Jou-
bert syndrome. Of the ten JBTS2-linked families, however,
only about half of them had a TMEM216 mutation, despite
an identical phenotype to the mutation-containing
families. When they resequenced the JBTS2 locus, they
found mutations in a neighboring gene, TMEM138, that
is not related to TMEM216, although it also encodes a
transmembrane protein. Although your ﬁrst thought
might be that TMEM138 simply contains a regulatory
element for TMEM216, this is not the case. Rather, both
genes are coordinately expressed via the action of an inter-
genic element, and they both encode proteins involved
in the same process, ciliogenesis. Knockdown of either
protein leads to defective ciliogenesis, which ultimately
is central to the Joubert syndrome phenotype. Thus,
despite the fact that the genes are very different, they
have evolved a system of coordinated regulation and function.
Lee et al. (2012) Science 335, 966–969.
This Month in Our Sister Journal
Yeast System for Characterization of Cystathionine-
Although we know that individuals with deﬁciency of
cystathionine-beta-synthase (CBS) tend to have intellec-
tual disability, a marfanoid habitus, ectopia lentis, and
increased risk of thromboembolism, there is variable
expressivity for this disorder, and it is difﬁcult to predict
outcome from genotype. Dietary protein and methionine
restriction is the central approach to management, and
supplementation with vitamin B6, a cofactor of CBS, can
lead to further reductions in homocystine levels in some
affected individuals, who tend to have milder disease. To
address the challenge of genotype-phenotype correlations
in CBS deﬁciency, Mayﬁeld et al. used a yeast system to
characterize the function of all 84 CBS missense alleles
that had been documented as of 2010. This system, in
which the yeast ortholog of CBS is replaced by human
alleles, allows them to assess the general level of function,
as well as the responsiveness of each allele to vitamin B6
and to another cofactor, heme. The authors also propose
that glutathione deﬁciency should be further explored in
the context of CBS deﬁciency, because they noted reduced
glutathione production in their system when CBS function
Mayfield et al. (2012) Genetics. Published online January 20, 2012.
Fragile X and X-Linked Intellectual Disability:
Four Decades of Discovery
Herbert A. Lubs,1 Roger E. Stevenson,1,* and Charles E. Schwartz1
X-Linked intellectual disability (XLID) accounts for 5%–10% of
intellectual disability in males. Over 150 syndromes, the most
common of which is the fragile X syndrome, have been described.
A large number of families with nonsyndromal XLID, 95 of which
have been regionally mapped, have been described as well. Muta-
tions in 102 X-linked genes have been associated with 81 of these
XLID syndromes and with 35 of the regionally mapped families
with nonsyndromal XLID. Identiﬁcation of these genes has
enabled considerable reclassiﬁcation and better understanding of
the biological basis of XLID. At the same time, it has improved
the clinical diagnosis of XLID and allowed for carrier detection
and prevention strategies through gamete donation, prenatal
diagnosis, and genetic counseling. Progress in delineating XLID
has far outpaced the efforts to understand the genetic basis for
autosomal intellectual disability. In large measure, this has been
because of the relative ease of identifying families with XLID
and ﬁnding the responsible mutations, as well as the determined
and interactive efforts of a small group of researchers worldwide.
Mutations resulting in X-linked intellectual disability
(XLID) have been described in 102 genes (Table S1, available
online). This work was accomplished over a 40 year
period during which the term X-linked mental retardation
was widely used; however, we will use intellectual
disability (ID), which is emerging as the preferred termi-
nology. Mutations in these 102 genes are responsible for
81 of the known 160 XLID syndromes and over 50 families
with nonsyndromal XLID (Table S1 and Figures 1 and 2).
An additional 30 XLID syndromes and 48 families with
nonsyndromal XLID have been regionally mapped (Table
1 and Figures 2 and 3), but the genes have not yet been identified.
Forty-four XLID syndromes, which remain unmapped,
have also been described (Table S2). Fewer than 400 auto-
somal genes in which mutations resulted in ID have
been identiﬁed. Of 1,640 references to ID in OMIM (as of
March 2010), 316 are entities on the X chromosome. Three
comparably sized chromosomes (6, 7, and 8) show 50, 58,
and 60 references, respectively. Several authors have
recently discussed the possibility that these striking differ-
ences might result from a relative concentration of genes
that inﬂuence intelligence on the X chromosome.2,3
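As a rough illustration of the size of this difference, the counts quoted above can be compared directly. The short Python sketch below uses only the figures given in the text; the variable names are illustrative, and the comparison is simple arithmetic that makes no adjustment for ascertainment or for differences in gene content between chromosomes.

```python
# Counts quoted in the text (references to ID in OMIM, as of March 2010).
total_id_refs = 1640
refs_by_chromosome = {"X": 316, "6": 50, "7": 58, "8": 60}

# Share of all ID references that map to the X chromosome.
x_share = refs_by_chromosome["X"] / total_id_refs

# Average count for the three comparably sized autosomes.
autosome_counts = [refs_by_chromosome[c] for c in ("6", "7", "8")]
autosome_mean = sum(autosome_counts) / len(autosome_counts)

# How many times larger the X count is than that autosomal average.
x_fold = refs_by_chromosome["X"] / autosome_mean

print(f"X share of ID references: {x_share:.1%}")
print(f"Mean for chromosomes 6, 7, and 8: {autosome_mean:.0f}")
print(f"X relative to that mean: {x_fold:.1f}-fold")
```

On these numbers, the X chromosome accounts for roughly one-fifth of the OMIM references to ID and carries between five and six times as many as the average of chromosomes 6, 7, and 8.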
Identiﬁcation of the mutations in 102 genes that cause
XLID has been accomplished primarily through long-
term, planned and coordinated studies from the United
States, Europe, and Australia. These studies took advantage
of the power of pedigrees of relatively large families to
assign putative genes to the X chromosome, linkage anal-
ysis to achieve regional localizations, accumulation and
sharing of large data banks of clinical details and speci-
mens, registries of pertinent X chromosomal transloca-
tions and abnormalities, stored samples from a variety of
populations around the world with ID, and effective
communication between numerous investigators. In this
setting, the continuously developing technologies were
applied and reapplied to the available clinical and spec-
imen banks effectively and rapidly. A comparable system-
atic approach to autosomal ID has not been carried out.
Publication of the first family with the marker X,4
renamed the fragile X (MIM 300624),5 gave an important
impetus to the ﬁeld by providing a laboratory tool
which clearly identiﬁed the most prevalent XLID syn-
drome. A series of biennial international meetings on
fragile X syndrome and XLID, beginning in 1983, involved
about 100 investigators and provided a sense of unity and
progress to the ﬁeld. Papers and abstracts from these meet-
ings and from other research were published (usually bien-
nially) as conference reports, special issues or updates on
XLID from 1984 to 2008.6–16
The focus of this review will be the discovery process
rather than the details of the clinical or molecular ﬁndings
in the individual XLID entities. Readers are referred to the
recently updated excellent review of the fragile X in OMIM
(MIM 300624) and OMIM entries on other XLID disorders
as detailed in Tables S1 and S2. Other reviews of different
aspects of XLID include the periodic XLID updates from
1984 to 2008, an Atlas of XLID Syndromes,1 and a number
of commentaries by individual investigators.3,17–22
Greenwood Genetic Center, JC Self Research Institute of Human Genetics, 113 Gregor Mendel Circle, Greenwood, SC 29646, USA
DOI 10.1016/j.ajhg.2012.02.018. © 2012 by The American Society of Human Genetics. All rights reserved.
The American Journal of Human Genetics 90, 579–590, April 6, 2012 579
XLID before Fragile X
The prelude to the current cytogenetic and molecular era
covered a century (1868–1968). It encompassed descriptions
of a number of clinically defined entities (Pelizaeus-Merzbacher
disease [MIM 312080], Duchenne muscular
dystrophy [MIM 310200], incontinentia pigmenti [MIM
308300], Goltz focal dermal hypoplasia [MIM 305600],
Lenz microphthalmia syndrome [MIM 309800]), inborn
errors of metabolism (Hunter syndrome [MIM 309900],
Lowe syndrome [MIM 309000], Lesch-Nyhan syndrome
[MIM 300322]), and large pedigrees in which ID segregated
with an X-linked pattern.23–28 During the same period, the
excess of males among persons with ID was observed in
census surveys and other population studies.29–31 The
magnitude of the male excess varied from study to study
but averaged about 30 percent and was found in nearly
These two observations—the excess of males among
persons with ID and clinical syndromes or families with
ID that segregated with an X-linked pattern—provided
compelling evidence that genes on the X chromosome
were important contributors to the overall causation of
ID and, hence, of individual, familial, and societal signiﬁ-
cance. By virtue of having but a single X chromosome,
the male’s genome was uniquely vulnerable and that
vulnerability extended to brain development and function
as well as to other systems.
Further insights during this early period of time were
that XLID comprised syndromal entities (ID plus somatic,
metabolic, or neuromuscular manifestations) and nonsyn-
dromal entities (ID alone or with inconsistent abnormali-
ties). It also became clear that some females in XLID
pedigrees had intellectual limitations, albeit with neither
the consistency nor the severity of males. Technological
limitations (lack of tools for linkage analysis and gene
isolation) precluded a more precise genetic characteriza-
tion of XLID disorders and delayed the clinical delineation.
The Setting of the Initial Observation of the Marker X
In 1966, when a one-year-old boy and his brother were
referred to the Yale chromosome laboratory for study
because of delayed development, medical cytogenetics
was in a period of transition. The major trisomies as well
as translocations and large deletions had been deﬁned by
nonspeciﬁc orcein or Giemsa staining. Prenatal cytoge-
netic diagnosis had begun and in order to provide more
predictive developmental information to families, there
was a need for both better, less biased clinical information
about X and Y aneuploidy and the several types of smaller
variations in the short arms of the acrocentric chromo-
somes and variant heterochromatic regions on 1, 9, 16,
and Y. The Yale laboratory had selected a minimal media
(199) for both routine diagnostic studies and for a year-
long study of 4,500 consecutive cord blood and 500
maternal samples. Special attention was given to breaks,
gaps, and chromosome variants in the year-long study.
The study also sought to identify cytogenetic markers
that might correlate directly with clinical conditions.32
Figure 1. Genes with Identified Mutations that Cause Syndromal XLID with Chromosomal Band Location
Thus, the initial observation that the two brothers referred
to the laboratory because of ID had a consistent chromatid
break or constriction in the distal long arm of a large C
group chromosome was very pertinent to the research
goals of the laboratory. Further study revealed that their
normal mother and two maternal relatives with ID (an
uncle and great uncle of the boys) had the same marker X.
The pedigree was, of course, consistent with X-linked ID.
Studies with H3 thymidine showed that the late replicating,
large C group chromosome was the same as the
chromosome with the apparent breaks and secondary
constrictions. The data led to the conclusion that ‘‘either
the secondary constriction itself or a closely linked
recessive gene may account for the pattern of X-linked
This was, in fact, probably the ﬁrst precise
localization of a gene associated with human disease. The
fragile X locus was subsequently deﬁned as an uncoiled
region (secondary constriction) by electron microscopy.33
Studies from a number of laboratories would provide a
more precise conﬁrmation and molecular characterization
Figure 2. Location of Genes with Mutations that Cause Nonsyndromal XLID
Twenty-two genes shown on the left of the chromosome with
solid arrows cause nonsyndromal XLID only. Numbers in parentheses
adjacent to the gene symbols are assigned MRX numbers.
Seventeen genes shown on the right of the chromosome with
open arrows cause both syndromal and nonsyndromal XLID.
*MRX64 is due to a dup MECP2
**MRX17 and MRX31 are due to dup HUWE1 and 2 adjacent genes
Table 1. Nonsyndromal XLID families (MRX1 – MRX95) with linkage or gene identification (a)
MRX  Gene or linkage       MRX  Gene or linkage       MRX  Gene or linkage
1    IQSEC2                33   ARX                   65   Xp11.3-q21.33
2    Xp22.1-p22.3          34   del IL1RAPL1          66   Xq21.33-q23
3    HCFC1                 35   Xq21.3-q26            67   Xq13.1-q21.31
4    Xp11.22-q21.31        36   ARX                   68   ACSL4
5    Xp21.1-q21.3          37   Xp22.31-p22.32        69   Xp11.21-q22.1
6    Xq27                  38   ARX                   70   Xq23-q25
7    Xp11.23-q12           39   Xp11                  71   Xq24-q27.1
8    DLG3                  40   Xq21                  72   RAB39B
9    FTSJ1                 41   GDI1                  73   Xp22-p21
10   Xp11.4-p21.3          42   Xp11.3-q13.1; Xq26    74   Xp11.3-p11.4
11   Xp11.22-p21.3         43   ARX                   75   Xq24-q26
12   THOC2                 44   FTSJ1                 76   ARX
13   Xp22.3-q22            45   ZNF81                 77   Xq12-q21.33
14   Xp21.2-q13            46   ARHGEF6               78   Xp11.4-p11.23
15   Xp22.1-q12            47   PAK3                  79   MECP2
16   MECP2                 48   GDI1                  80   Xq22-q24
17   dup HUWE1             49   CLCN4                 81   Xp11.2-q12
18   IQSEC2                50   Xp11.3-p11.21         82   Xq24-q25
19   RPSKA3                51   Xp11.23-p11.3         83   Not published
20   Xp21.1-q23            52   Xp11.21-q21.32        84   Xp11.3-q22.3
21   IL1RAPL1              53   Xq22.2-q26            85   Xp21.3-p21.1
22   Xp21.1-q21.31         54   ARX                   86   Not published
23   Xq23-q24              55   PQBP1                 87   ARX
24   Xp22.2-p22.3          56   Xp21.1-p11.21         88   AGTR2
25   Xq27.3                57   Xq24-q25              89   ZNF41
26   Xp11.4-q23            58   TM4SF2                90   DLG3
27   Xq24-q27.1            59   AP1S2                 91   ZDHHC15
28   Xq27.3-qter           60   OPHN1                 92   ZNF674
29   ARX                   61   Xq13.1-q25            93   BRWD3
30   PAK3                  62   UPF3B                 94   GRIA3
31   dup HUWE1             63   ACSL4                 95   MAGT1/OSTb
32   ARX                   64   dup MECP2
(a) Mutations in NLGN4, CDKL5, KDM5C, FGD1, SLC16A2, ATRX, AFF2 and SLC6A8
have been found in other families with nonsyndromal XLID.
of the location in the ensuing decade34–36
tion of the gene itself in 1991.37–40
In addition, the juxtaposition and timing of the family
study and the population survey permitted us to look for
the marker X in 5,000 individuals and over 30,000 cells
and to conclude tentatively that it was not a common
marker or variant because not even one marker X cell
was observed. Another family with a similar chromosomal
appearance at distal 16q was also ascertained in this same
interval. This was inherited in an autosomal-dominant
manner and not associated with a disease. We were, there-
fore, able to make the preliminary conclusion that such
markers did not necessarily indicate disease but that the
marker X was a signiﬁcant clinical marker for a Mendelian
disease and hence a new and useful tool.
Observations in the 1970s and 1980s
More complex and folic-acid-enriched media became
popular during the 1970s and presumably made detection
of the fragile X increasingly difﬁcult. Most early studies
gave variable results and were not published. The initial
report was conﬁrmed by Giraud et al.34
These articles and the report by Sutherland36 established
that folic acid in the culture media prevented the
expression and detection of the fragile X.
During the 1980s it became clear that a majority of XLID
families did not have fragile X, and the identiﬁcation and
study of large non-fragile X XLID families with linkage
analysis began in earnest. Large scale studies began across
the globe at this time. The results summarized in Table 1,
Tables S1 and S2, and Figures 1, 2, and 3 are, therefore,
based on about 20 years of clinical and molecular studies.
Methodologies Quicken the Pace of Gene Discovery
Besides the cytogenetic methods used in the diagnosis
and conﬁrmation of fragile X, a number of strategies
have been utilized to identify XLID genes (Table S1 and
Figures 1 and 2). Prior to 1990, these were limited to the
pursuit of genes in cases where the gene products (enzymes
in all cases: HPRT [MIM 308000], PGK1 [MIM 311800],
OTC [MIM 311250], and PDHA1 [MIM 300582]) were
known, the molecular pathway was known (PLP [MIM
300401]) or a chromosome aberration had localized the
candidate region (DMD [MIM 300377]). Over the next
decade and a half, exploitation of chromosome rearrange-
ments and linkage coupled with candidate gene testing
dominated the ﬁeld. In the past several years, X chromo-
some sequencing, microarrays (expression and genomic),
and exploration of molecular pathways have added to
the range of technologies available for XLID gene identiﬁ-
cation. Five of the ﬁrst seven gene identiﬁcations were
accomplished with a combination of known metabolic
pathways and tissue culture studies in families with inborn
errors of metabolism (Figure 4). The ﬁrst identiﬁcation,
Lesch-Nyhan syndrome due to mutations in HPRT, was reported
in 1983,41 and the most recent was the creatine
transporter syndrome (MIM 300352) due to mutations in
SLC6A8 [MIM 300036].42
Mutations in seven genes were
identified by this methodology.
Figure 3. Approximate Linkage Limits for XLID Syndromes for which the Genes Have Not Been Identified
Figure labels: CMT, Ionasescu variant; Goldblatt spastic paraplegia; XLID spastic paraplegia, type 7; CMT, Cowchock variant; 27 XLID-coarse facies; Vitale: aphasia-coarse facies; Hereditary bullous dystrophy
582 The American Journal of Human Genetics 90, 579–590, April 6, 2012
Two workhorse approaches have been responsible for
the great majority of subsequent gene identiﬁcations.
The ﬁrst of these, based on the ascertainment of a patient
with both ID and a chromosomal rearrangement involving
the X chromosome, was used successfully in identifying
the gene associated with Duchenne muscular dystrophy
in 1987. A total of 31 genes (Table S1 and Figure 4) had
been identiﬁed by the middle of 2011 with this approach.
The second and most productive ‘‘workhorse’’ approach,
linkage study of XLID families followed by molecular
analysis of appropriate candidate genes, was employed
initially by a number of investigators in detecting and
characterizing FMR1 (MIM 309550). Subsequently, its use
has resulted in the identiﬁcation of 43 mutant X genes.
With increasing ease of sequencing, the pace of gene iden-
tiﬁcation by this route accelerated after 2003, as shown in
Table S1 and Figure 4.
The availability of brute force sequencing capability after
completion of the Human Genome Project has brought an
additional effective method of gene identification, and 21
identifications have been reported since 2006 (Table S1 and Figure 4).
Whether sequencing of large series of sporadic males,
male siblings, or families with clear XLID will prove to be
the most effective use of this resource remains to be deter-
mined. The selection of pedigree-based subjects for
sequencing, however, has the advantage that segregation
of gene alterations can be tested. Because this approach often
permits a relatively straightforward path to gene identification,
continued collection of both clinical data and
blood samples remains important. Exploitation of a speciﬁc
molecular ﬁnding has accounted for four gene identiﬁca-
tions (FANCB [MIM 300515], PORCN [MIM 300651],
SMC1A/SM1L1 [MIM 300040], NDUFA1 [MIM 300078]).
Two other new technologies, expression array and array-comparative
genomic hybridization (array-CGH), have, surprisingly,
been applied successfully in only two instances and one instance,
respectively. Expression array was used in combination
with two other methods to discover the role of GRIA3
(MIM 305915) and PTCHD1 (MIM 300828) in ID. Array-
CGH was used in the isolation of the mutant gene in one
nonsyndromal family (HUWE1 [MIM 300697]).43
Potentially valuable combinations of array technologies
for screening followed by brute force sequencing can
be envisioned. Detection of a consistent up- or downregulation
or other abnormality in two or more XLID family
members can likewise be envisioned as a fruitful approach
to the selection of subjects for partial or complete X
sequencing. Two or more approaches were used in combination
in six instances among the 102 gene identifications
shown in Table S1 and Figure 1 (FMR1, MID1 [MIM
602148], SOX3 [MIM 313430], HUWE1, CASK [MIM
300172], and GRIA3). The application of CGH and related
methods in conjunction with a variety of molecular
technologies has increasingly been used to detect duplications
and deletions of genes associated with XLID.
Figure 4. The Year and Methodology Used to Identify Genes Associated with XLID
The following abbreviations are used: Exp-Arr = expression microarray. MCGH = genomic microarray. X-seq = gene sequencing.
Mol-Fu = follow up of a known molecular pathway. L-can = candidate gene testing within a linkage interval. Chr-rea = positional
cloning based on a chromosome rearrangement. Met-Fu = follow up of a known metabolic pathway.
In spite of the identiﬁcation of mutations in 102 genes
that result in XLID, the fragile X syndrome continues to
be by far the most frequent XLID syndrome. Whether
the gradual but continuous expansion of the number of
triplet repeats in the large bank of premutation carriers
(whose frequency varies from 1/113 in Israel to 1/313–382 in the
United States) plays a role in maintaining its relatively high gene
frequency is unknown.57
Lumping, Splitting, and Reclassiﬁcation Based on
Gene Discovery: A Model for Future Research
Given the variability and imprecision with which clinical
evaluations are carried out, it is inevitable that some indi-
viduals with X-linked ID will be incorrectly included in
existing diagnostic categories, whereas others will be incor-
rectly excluded. The extent to which individuals and
families can be evaluated is dependent on the setting,
access to historical information, availability and ages of
affected and nonaffected family members, and the ex-
perience and expertise of the observers. Differences in
phenotype can result from mutations in different domains
of a gene and from contributions by the rest of the
genome. The identiﬁcation of mutations in many genes
associated with XLID has provided the opportunity to
compensate for some of these variables, resulting in the
lumping of entities previously considered to be separate
and the splitting of other entities previously considered
the same. In addition, the phenotypic limits of some
XLID entities were established with some degree of
Several XLID entities have been most instructive. Dis-
covery that mutations in ATRX (MIM 300032) (Xq21.1)
cause alpha-thalassemia ID allowed testing of a large
number of males with hypotonic facies, ID, and other
Currently, as shown in Table S1, four other
named XLID syndromes (Carpenter-Waziri, Holmes-
Gang, XLID-Hypotonia-Arch Fingerprints, and Chudley-
Lowry syndromes [MIM 309580]) have been found to be
allelic variants of alpha-thalassemia ID as have certain
families with spastic paraplegia and nonsyndromal
One family clinically diagnosed as Juberg-
Marsidi syndrome was found to have an ATRX mutation.
This is now known to be based on misdiagnosis
of Juberg-Marsidi syndrome (MIM 300612); indeed, the
original family with this syndrome has a mutation in
HUWE1 at Xp11.22 (Friez et al., 2011, 15th International
Workshop on Fragile X and Other Early-Onset Cognitive
Disorders). One family clinically diagnosed as Smith-
Fineman-Myers syndrome was also found to harbor an
ATRX mutation, but the gene has not been analyzed in
the original family.68–70
A clinically similar condition,
Cofﬁn-Lowry syndrome (MIM 303600), was found to be
separate from alpha-thalassemia ID and due to mutations
in RPS6KA3 (MIM 300075), which encodes a serine-threonine
kinase.
Kalscheuer et al.72 found mutations in PQBP1 (MIM
300463) (Xp11.2) in two named XLID syndromes (Sutherland-Haan
syndrome [MIM 309470] and Hamel cerebropalatocardiac
syndrome [MIM 309500]), in MRX55, and in
two other families with microcephaly and other findings.
Lenski et al.,73 Stevenson et al.,74 and Lubs et al.75 added
Renpenning, Porteous, and Golabi-Ito-Hall syndromes to
the list of XLID syndromes caused by mutations in PQBP1.
The six phenotypes attributed to mutations
in PQBP1 are now summarized in the allelic variants
of OMIM 300463. As with the ATRX phenotypes, a wide
variety of phenotypic expressions result from different
mutations in PQBP1 and we remain challenged to better
understand the molecular and developmental mecha-
nisms leading to these differences.
Mutations in ARX (MIM 300382) (Xp22.2) were also
found to be an important cause of XLID encompassing
multiple phenotypes. Alterations, most commonly a 24 bp
expansion of a polyalanine tract, were found in a number
of families with nonsyndromal XLID (MRX29, 32, 33, 36,
38, 43, 54, and 76), an X-linked dystonia (Partington
syndrome [MIM 309510]), X-linked infantile spasms
(MIM 308350) (West syndrome), X-linked lissencephaly
with abnormal genitalia (MIM 300215), hydranencephaly
and abnormal genitalia (MIM 300215), and Proud
syndrome (MIM 300215).76–83
Figure 5. Location of Segmental Duplications Associated with
Syndromal or Nonsyndromal XLID43–56
Figure labels: Wagenstaller et al.54, Horn et al.50; Gijsbers et al.49; Whibley et al.55; Froyen et al.44; Froyen et al.45; Bedeschi et al.48; Koolen et al.46; Mimault et al.51, Woodward et al.56; Koolen et al.46; Koolen et al.46; Solomon et al.53; Van Esch et al.47, Friez et al.43; Rio et al.52
Perhaps the most prominent example of syndrome split-
ting is FG syndrome (MIM 305450). This syndrome,
initially described in 1974 by Opitz and Kaveggia,84
is manifested by macrocephaly (or relative macrocephaly),
downslanting palpebral ﬁssures, imperforate anus or
severe constipation, broad and ﬂat thumbs and great
toes, hypotonia, and ID. In the ensuing years, the manifes-
tations attributed to FG syndrome have become protean,
but none was pathognomonic or required for the diagnosis.
As a result, a number of different localizations
on the X chromosome were proposed for FG syndrome.
In 2007, Risheg et al.96
found a recurring mutation,
c.2881C>T (p.Arg961Trp), in MED12 (MIM 300188) in
six families with the FG phenotype, including the original
family reported by Opitz and Kaveggia.84
In addition to the
above noted manifestations, two other ﬁndings, small ears
and friendly behavior, were consistently noted.
Although most individuals who have carried the FG
diagnosis have one or more ﬁndings that overlap with
those in FG syndrome, they do not have MED12 mutations.
Some have been found to have mutations in
other X-linked genes (FMR1, FLNA [MIM 300017], ATRX,
CASK, and MECP2 [MIM 300005]), whereas others have
duplications or deletions of the autosomes.97
So great is
the currently existing heterogeneity within FG syndrome
that the vast majority of individuals so designated should
best be considered to have ID of undetermined cause.
In a number of instances, certain gene mutations have
been associated with nonsyndromal XLID, whereas other
mutations within the same genes have caused syndromal
XLID. Mutations in 17 genes that may cause either type
of XLID, depending on the mutation, have been identiﬁed
(Figure 2). In some cases (e.g., those with OPHN1 [MIM
300127] and ARX mutations) re-examination has found
syndromal manifestations in families previously consid-
ered to have nonsyndromal XLID.79,99,100
The frequency with which the process of lumping and
splitting in this limited ﬁeld of investigation has occurred
has been extremely instructive to both clinical and molec-
ular investigators. Moreover, the process of reclassifying
and reﬁning the XLID syndromes in light of the gene iden-
tiﬁcations may be one of the most important contributions
by medical genetics to clinical medicine. The underlying
mechanisms or pathways by which mutations in different
genes result in similar phenotypes and different mutations
in a single gene result in disparate phenotypes, however,
remain to be fully elucidated.
Improved Understanding of Disease Mechanisms
in XLID Disorders
Analysis of the presently known 102 genes associated with
XLID lends some insight into the numerous molecular
functions in which disruption can lead to cognitive
impairment and impaired brain development.17
Three major functions are almost equally represented in proteins
encoded by this panel of 102 genes: 22% are involved in
regulation of transcription, 19% in signal transduction,
and 15% in metabolism. Additionally, 15% are compo-
nents of membrane-associated functions. The remainder
are equally distributed (~3%–5%) in seven other cellular
functions: cytoskeleton, RNA processing, DNA metabo-
lism, protein synthesis, ubiquitinization, cell cycle, and
cell adhesion. Regarding their localization within a cell,
the proteins encoded by genes associated with XLID are
almost equally distributed among the four major subcel-
lular fractions: 30% in the nucleus, 28% in the cytoplasm,
18% in the membranes, and 16% in cellular organelles.17
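As a quick arithmetic check on the percentages quoted above, the following minimal Python sketch recomputes the share left over for the seven smaller functional categories and the share of proteins falling outside the four named subcellular fractions. The variable names and category labels are illustrative choices, and the inputs are only the rounded percentages given in the text, not raw counts from Table S1.

```python
# Rounded percentages as quoted in the text (not raw counts from Table S1).
major_functions = {
    "regulation of transcription": 22,
    "signal transduction": 19,
    "metabolism": 15,
    "membrane-associated functions": 15,
}
# Cytoskeleton, RNA processing, DNA metabolism, protein synthesis,
# ubiquitinization, cell cycle, and cell adhesion.
minor_function_count = 7

subcellular_fractions = {
    "nucleus": 30,
    "cytoplasm": 28,
    "membranes": 18,
    "cellular organelles": 16,
}

# Share remaining for the seven smaller functional categories.
remainder = 100 - sum(major_functions.values())
mean_minor_share = remainder / minor_function_count

# Share of proteins not assigned to one of the four named fractions.
unassigned = 100 - sum(subcellular_fractions.values())

print(f"Remainder for minor functions: {remainder}%")
print(f"Mean share per minor function: {mean_minor_share:.1f}%")
print(f"Outside the four subcellular fractions: {unassigned}%")
```

The mean of roughly 4% per smaller category is consistent with the 3%–5% range stated above. The share not covered by the four subcellular fractions is simply what the rounded figures leave over.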
The XLID disorders offer many opportunities for under-
standing the functions of speciﬁc genes and their interac-
tions with other genes in producing disease. Studies
involving control of gene expression will necessarily be
especially complex. These have just begun, in part because
of their complexity and the rapid development of new tech-
niques. Only recently, for example, has a preliminary ex-
pression microarray analysis been carried out in two affected
fragile X males.101
The study identiﬁed over 90 genes with
a greater than 1.5-fold change in expression. Overrepre-
sented genes were involved in signaling (both under-
and overexpression), morphogenesis (underexpression),
and neurodevelopment and function (overexpression).
Although not addressed in this study, the possibility that
a hallmark ﬁnding in the fragile X syndrome, enlargement
of the testes, might result from altered control of tubular
growth by a speciﬁc target gene is intriguing. One of the
90 genes identiﬁed, NUT (nuclear protein in testis [MIM
608963]), which is normally only expressed in the testis,
should be a candidate gene in future studies because the
BRD4-NUT fusion oncogenes are critical growth promoters
in certain aggressive carcinomas.102
Alternatively, a more
general growth-controlling gene might also explain the
prognathism, macrocephaly, and large hands that occur
in some individuals with the fragile X syndrome.
Studies directed at understanding the mechanisms
underlying recurring clinical problems in XLID disorders
such as short stature, microcephaly or macrocephaly,
autistic behavior, and structural CNS abnormalities103
are also particularly appealing because they provide an
opportunity to understand simultaneously critical
pathways (such as those in dendrite development and the
development of XLID structural abnormalities), gene expression,
and phenotype. The association of autism spectrum disorder
with mutations in at least eight of the 102 genes
listed in Table S1 is of particular current interest. This has
been reported most frequently in the fragile X syndrome
and Rett syndrome but also in disorders resulting from
mutations in NLGN3 (MIM 300336), NLGN4 (MIM
300427), RPL10 (MIM 312173), RAB39B (MIM 300774),
PTCHD1, and MED12. These genes, however, affect a wide
range of functions (Table S1), and the cause of the clinical
overlap is not clear. In nonsyndromal XLID, for example,
mutations have been identiﬁed in ﬁve genes involved in
the RhoGTPase cycle that affect dendritic outgrowth
(OPHN1, PAK3 [MIM 300142], ARHGEF6 [MIM 300267],
TM4SF2 [MIM 300096], and GDI1 [MIM 300104]) and are
central to the development of the nonsyndromal phenotype.
The limited imaging and direct studies of macrocephaly,
microcephaly, and cerebellar hypoplasia have recently
but more extensive application of
anatomical and functional brain imaging and spectros-
copy techniques that can identify variations in speciﬁc
brain regions for each disorder, in conjunction with both
clinical observations and psychometric studies, is critically
Detection of Possible Advantageous Cognitive
and Behavioral Genes
The identification of 102 X-linked genes affecting intelligence
has raised the possibility that X chromosomal genes
(including XLID genes) might play a particularly impor-
tant role in brain structure and function as well as a speciﬁc
role in intelligence and certain cognitive abilities. Clearly,
as discussed at the beginning of this paper, the research
planned and carried out to identify XLID genes and
syndromes over the last several decades might account
for part or even all of this relative excess compared to auto-
somal loci. A number of papers, however, have addressed
the issue of active selection during evolution for X chro-
mosomal localization of important brain and cognitive
The ﬁnding that human and mouse X chro-
mosome genes are hyperexpressed in the CNS compared to
autosomal genes provided additional important confirmatory
data for the hypothesis of positive evolutionary selection.
These studies showed not only that there was a
doubling of X chromosome expression (compared to autosomes)
early in development (leading to dosage compensation),
but also that overexpression in human CNS tissue and in
mouse CNS tissue increased by 2.8× and 2.5×, respectively,
compared to expression in somatic tissues. These
observations also support the general idea that X genes
are particularly important for brain development and
function. Mutations signiﬁcantly improving intellectual,
creative, perceptive, and leadership qualities would be
fully expressed in males and reasonably could have been
positively selected for in a relatively short period of time
in contrast to the negative selection for XLID mutations.
In essence, the XY male may have been the
experimental animal and the XX female the storage
facility for both advantageous and deleterious mutations.
Medical investigations generally focus on adverse effects
and no organized searches for X-linked pedigrees with
particularly high intellectual or special cognitive talents
have been reported. Thus, the same approach that has
been effective in identifying XLID syndrome genes, investi-
gating families with an X-linked pattern of intellectual
outliers, might also prove rewarding for studies at the other
end of the intellectual spectrum. What if we selected for
families with an X-linked pattern of high intellectual
accomplishment; special talents in art or music; unique
types of cognitive behavior involving memory, problem
solving, or, indeed, any type of special intellectual accom-
plishment such as Nobel awards in Economics or Physics?
Such families will certainly be uncommon but so are most
XLID disorders. Yet families might be identiﬁed if academi-
cians asked the pertinent family history questions during
lunch with colleagues, a dedicated, interactive home page
was available, or notices were placed in journals asking for
information about possible families. The same group of
laboratories that contributed to the data in Table S1 would
be logical sources for referral and molecular studies because
the necessary cognitive and molecular studies are already
in place. A positive result might even be more important
to society than XLID disease description and provide
important insight into human evolution.
Although there is a wide array of pertinent cognitive
tests, these were not designed to detect speciﬁc familial
talents. The coapplication of a pedigree analysis with pertinent
laboratory tests should provide a sufficiently precise
initial diagnosis of the affected to carry out linkage and
array or other screening tests successfully. One family
with four to ﬁve outstanding individuals over several
generations could provide sufﬁcient data to warrant testing
other families (or even other species) and to begin an iden-
tiﬁcation process similar to that described in this paper
that has proven successful for XLID. Imagine the prospects
for investigating speciﬁc gene-environmental interactions
during learning and development!
Why, other than not having looked seriously, have
we not stumbled upon such families? Perhaps we have.
In the Inaugural Book of the new National Museum
of the American Indian, Native Universe: Voices of Indian America,
in which tribal leaders, writers, scholars, and
story tellers describe Indian traditions and heritages, the
following is recounted:
‘‘Story tells us that a group split from the Lenni Lenape,
perhaps a thousand years ago or more. The people then
settled on the Eastern Shore of the Chesapeake, and were
one and the same as the Nanticoke. Then, for some reason,
the ﬁrst Tayac, Uttapoingassenum, led his people to the
other side of the bay. Upon their arrival, they encountered
peoples who had been living on the land for more than
8,000 years, according to various archeological estimates.
For thirteen generations prior to English settlement, as
told to Jesuit and Moravian missionaries, the Tayac’s inher-
itance passed from brother to brother and then to the
sister’s sons. Each led the people until his death.’’
The possibility that the Nanticoke had intuitively recog-
nized and employed a quality of leadership that followed
an X-linked pattern of inheritance is intriguing to consider.
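To make the genetic reasoning behind this observation concrete, the sketch below computes, under simple Mendelian assumptions, the chance that various relatives of a man carry the same X-linked allele he does. It assumes a single locus, a heterozygous mother as the source of the allele, no new mutation, and spouses who do not carry the allele. The function and label names are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def shared_x_allele_probabilities():
    """Chance that each relative carries the index male's X-linked allele.

    Illustrative assumptions: one locus, the allele came from a
    heterozygous mother, no new mutations, and spouses do not carry it.
    """
    sister = HALF            # sister received the same maternal X allele
    daughter = Fraction(1)   # a father passes his only X to every daughter
    return {
        "brother": HALF,                    # same maternal X allele
        "sister": sister,
        "sister's son": sister * HALF,      # carrier sister transmits it half the time
        "own son": Fraction(0),             # sons receive the father's Y, not his X
        "own daughter": daughter,
        "daughter's son": daughter * HALF,  # carrier daughter transmits it half the time
    }


probabilities = shared_x_allele_probabilities()
for relative, probability in probabilities.items():
    print(f"{relative}: {probability}")
```

Under these assumptions a man's own sons never inherit his X chromosome, whereas his brothers and his sisters' sons can share it, so succession through brothers and then sisters' sons is at least compatible with an X-linked trait.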
Although much progress has been made during the past
four decades, the clinical and molecular delineation of
XLID is far from complete. Perhaps little more than half
of the genes in which mutations will result in XLID have
been identiﬁed. The molecular pathways are incompletely
understood, the mechanisms by which brain structure and
function are deranged have not been identiﬁed, and with
few exceptions the neurobehavioral proﬁles and natural
history of the XLID entities have received insufﬁcient
attention. These deﬁciencies notwithstanding, consider-
able beneﬁts have been gained for individuals with XLID
and their families. Speciﬁc molecular tests, including mul-
tigene panels, are now available to more efﬁciently reach
a diagnosis. Carrier testing, donor eggs, prenatal diagnosis,
and preimplantation genetic testing may be used to
prevent recurrence when a speciﬁc gene mutation is found.
Through these measures, reproductive conﬁdence may be
restored for families in which XLID has occurred.114
Supplemental Data include two tables and can be found with this
article online at http://www.cell.com/AJHG/.
The URLs for data presented herein are as follows:
Greenwood Genetic Center, XLID Update, http://www.ggc.org/
Online Mendelian Inheritance in Man (OMIM), http://www.
1. Stevenson, R.E., Schwartz, C.E., and Rogers, R.C. (2012).
Atlas of X-Linked Intellectual Disability Syndromes (New
York: Oxford University Press).
2. Skuse, D.H. (2005). X-linked genes and mental functioning.
Hum. Mol. Genet. 14 (Spec No 1), R27–R32.
3. Gécz, J., Shoubridge, C., and Corbett, M. (2009). The genetic
landscape of intellectual disability arising from chromo-
some X. Trends Genet. 25, 308–316.
4. Lubs, H.A. (1969). A marker X chromosome. Am. J. Hum.
Genet. 21, 231–244.
5. Kaiser-McCaw, B., Hecht, F., Cadien, J.D., and Moore, B.C.
(1980). Fragile X-linked mental retardation. Am. J. Med.
Genet. 7, 503–505.
6. Opitz, J.M., and Sutherland, G.R. (1984). Conference report:
International workshop on the fragile X and X-linked intel-
lectual disability. Am. J. Med. Genet. 17, 5–94.
7. Turner, G., Opitz, J.M., Brown, W.T., Davies, K.E., Jacobs,
P.A., Jenkins, E.C., Mikkelson, M., Partington, M.W., and
Sutherland, G.R. (1986). Conference report: Second interna-
tional workshop on the fragile X and on X-linked mental
retardation. Am. J. Med. Genet. 23, 11–67.
8. Neri, G., Opitz, J.M., Mikkelson, M., Jacobs, P.A., Davies, K.,
and Turner, G. (1988). Conference report: Third interna-
tional workshop on the fragile X and X-linked mental
retardation. Am. J. Med. Genet. 30, 1–29.
9. Neri, G., Gurrieri, F., Gal, A., and Lubs, H.A. (1991). XLMR
genes: Update 1990. Am. J. Med. Genet. 38, 186–189.
XLMR genes: Update 1992. Am. J. Med. Genet. 43, 373–382.
11. Neri, G., Chiurazzi, P., Arena, J.F., and Lubs, H.A. (1994).
XLMR genes: Update 1994. Am. J. Med. Genet. 51, 542–549.
12. Brown, W.T., Jenkins, E., Neri, G., Lubs, H., Shapiro, L.R.,
Davies, K.E., Sherman, S., Hagerman, R., and Laird, C.
(1991). Conference report: Fourth international workshop
on the fragile X and X-linked mental retardation. Am. J.
Med. Genet. 38, 158–172.
13. Lubs, H.A., Chiurazzi, P., Arena, J.F., Schwartz, C., Traneb-
jaerg, L., and Neri, G. (1996). XLMR genes: update 1996.
Am. J. Med. Genet. 64, 147–157.
14. Lubs, H., Chiurazzi, P., Arena, J., Schwartz, C., Tranebjaerg,
L., and Neri, G. (1999). XLMR genes: Update 1998. Am. J.
Med. Genet. 83, 237–247.
15. Chiurazzi, P., Hamel, B.C., and Neri, G. (2001). XLMR genes:
Update 2000. Eur. J. Hum. Genet. 9, 71–81.
16. Chiurazzi, P., Schwartz, C.E., Gecz, J., and Neri, G. (2008).
XLMR genes: Update 2007. Eur. J. Hum. Genet. 16, 422–434.
17. Ropers, H.H. (2008). Genetics of intellectual disability. Curr.
Opin. Genet. Dev. 18, 241–250.
18. Chelly, J., Khelfaoui, M., Francis, F., Chérif, B., and Bienvenu,
T. (2006). Genetics and pathophysiology of mental retarda-
tion. Eur. J. Hum. Genet. 14, 701–713.
19. Ropers, H.H., and Hamel, B.C. (2005). X-linked mental
retardation. Nat. Rev. Genet. 6, 46–57.
20. Kleefstra, T., and Hamel, B.C. (2006). X-linked mental retar-
dation: Further lumping, splitting and emerging pheno-
types. Clin. Genet. 67, 451–467.
21. Stevenson, R.E., and Schwartz, C.E. (2002). Clinical and
molecular contributions to the understanding of X-linked
mental retardation. Cytogenet. Genome Res. 99, 265–275.
22. Neri, G., and Opitz, J.M. (2000). Sixty years of X-linked
mental retardation: A historical footnote. Am. J. Med. Genet.
23. Martin, J.P., and Bell, J. (1943). A pedigree of mental defect
showing sex-linkage. J. Neurol. Psychiatry 6, 154–157.
24. Allan, W., Herndon, C.N., and Dudley, F.C. (1944). Some
examples of the inheritance of mental deﬁciency: Apparently
sex-linked idiocy and microcephaly. Am. J. Ment. Deﬁc. 48,
25. Bickers, D.S., and Adams, R.D. (1949). Hereditary stenosis of
the aqueduct of Sylvius as a cause of congenital hydroceph-
alus. Brain 72, 246–262.
26. Losowsky, M.S. (1961). Hereditary mental defect showing the
pattern of sex inﬂuence. J. Ment. Deﬁc. Res. 5, 60–62.
27. Renpenning, H., Gerrard, J.W., Zaleski, W.A., and Tabata, T.
(1962). Familial sex-linked mental retardation. Can. Med.
Assoc. J. 87, 954–956.
28. Dunn, H.G., Renpenning, H., Gerrard, H.W., Miller, J.R.,
Tabata, T., and Federoff, S. (1963). Mental retardation as
a sex-linked defect. Am. J. Ment. Deﬁc. 67, 827–848.
29. Penrose, L.S. (1938). A clinical and genetic study of 1280 cases
of mental defect. Special Report Series, Medical Research
Council, No. 229 (London: His Majesty’s Stationery Ofﬁce).
30. Lehrke, R.G. (1974). X-linked mental retardation and verbal
disability. Birth Defects Orig. Artic. Ser. 10, 1–100.
31. Herbst, D.S., and Miller, J.R. (1980). Nonspeciﬁc X-linked
mental retardation II: The frequency in British Columbia.
Am. J. Med. Genet. 7, 461–469.
32. Lubs, H.A., and Ruddle, F.H. (1970). Chromosomal abnor-
malities in the human population: estimation of rates based
on New Haven newborn study. Science 169, 495–497.
33. Harrison, C.J., Jack, E.M., Allen, T.D., and Harris, R. (1983).
The fragile X: A scanning electron microscope study. J.
Med. Genet. 20, 280–285.
34. Giraud, F., Ayme, S., Mattei, J.F., and Mattei, M.G. (1976).
Constitutional chromosomal breakage. Hum. Genet. 34,
35. Harvey, J., Judge, C., and Wiener, S. (1977). Familial X-linked
mental retardation with an X chromosome abnormality. J.
Med. Genet. 14, 46–50.
36. Sutherland, G.R. (1977). Fragile sites on human chromo-
somes: Demonstration of their dependence on the type of
tissue culture medium. Science 197, 265–266.
37. Oberle, I., Rousseau, F., Heitz, D., Kretz, C., Kevys, D., Hana-
uer, A., Boue, J., Bertheas, M.F., and Mandel, J.L. (1991).
Instability of a 550-base pair DNA segment and abnormal
methylation in fragile X syndrome. Science 252, 1097–1102.
38. Bell, M.V., Hirst, M.C., Nakahori, Y., MacKinnon, R.N.,
Roche, A., Flint, T.J., Jacobs, P.A., Tommerup, N., Tranebjaerg,
L., Froster-Iskenius, U., et al. (1991). Physical mapping across
the fragile X: hypermethylation and clinical expression of
the fragile X syndrome. Cell 64, 861–866.
39. Yu, S., Pritchard, M., Kremer, E., Lynch, M., Nancarrow, J.,
Baker, E., Holman, K., Mulley, J., Warren, S., Schlessinger,
D., et al. (1991). Fragile X genotype characterized by an
unstable region of DNA. Science 252, 1179–1181.
40. Verkerk, A.J., Pieretti, M., Sutcliffe, J.S., Fu, Y.H., Kuhl, D.P.,
Pizzuti, A., Reiner, O., Richards, S., Victoria, M.F., Zhang,
F.P., et al. (1991). Identiﬁcation of a gene (FMR-1) containing
a CGG repeat coincident with a breakpoint cluster region
exhibiting length variation in fragile X syndrome. Cell 65,
41. Jolly, D.J., Okayama, H., Berg, P., Esty, A.C., Filpula, D.,
Bohlen, P., Johnson, G.G., Shively, J.E., Hunkapillar, T., and
Friedmann, T. (1983). Isolation and characterization of
a full-length expressible cDNA for human hypoxanthine
phosphoribosyl transferase. Proc. Natl. Acad. Sci. USA 80,
42. Salomons, G.S., van Dooren, S.J., Verhoeven, N.M., Cecil,
K.M., Ball, W.S., Degrauw, T.J., and Jakobs, C. (2001).
X-linked creatine-transporter gene (SLC6A8) defect: A new
creatine-deﬁciency syndrome. Am. J. Hum. Genet. 68,
43. Friez, M.J., Jones, J.R., Clarkson, K., Lubs, H., Abuelo, D., Bier,
J.A., Pai, S., Simensen, R., Williams, C., Giampietro, P.F., et al.
(2006). Recurrent infections, hypotonia, and mental retarda-
tion caused by duplication of MECP2 and adjacent region in
Xq28. Pediatrics 118, e1687–e1695.
44. Froyen, G., Van Esch, H., Bauters, M., Hollanders, K., Frints,
S.G., Vermeesch, J.R., Devriendt, K., Fryns, J.P., and Marynen,
P. (2007). Detection of genomic copy number changes in
patients with idiopathic mental retardation by high-resolu-
tion X-array-CGH: Important role for increased gene dosage
of XLMR genes. Hum. Mutat. 28, 1034–1042.
45. Froyen, G., Corbett, M., Vandewalle, J., Jarvela, I., Lawrence,
O., Meldrum, C., Bauters, M., Govaerts, K., Vandeleur, L., Van
Esch, H., et al. (2008). Submicroscopic duplications of the
hydroxysteroid dehydrogenase HSD17B10 and the E3 ubiq-
uitin ligase HUWE1 are associated with mental retardation.
Am. J. Hum. Genet. 82, 432–443.
46. Koolen, D.A., Pfundt, R., de Leeuw, N., Hehir-Kwa, J.Y., Nille-
sen, W.M., Neefs, I., Scheltinga, I., Sistermans, E., Smeets, D.,
Brunner, H.G., et al. (2009). Genomic microarrays in mental
retardation: A practical workﬂow for diagnostic applications.
Hum. Mutat. 30, 283–292.
47. Van Esch, H., Bauters, M., Ignatius, J., Jansen, M., Raynaud, M.,
Hollanders, K., Lugtenberg, D., Bienvenu, T., Jensen, L.R., Gecz,
J., et al. (2005). Duplication of the MECP2 region is a frequent
cause of severe mental retardation and progressive neurological
symptoms in males. Am. J. Hum. Genet. 77, 442–453.
48. Bedeschi, M.F., Novelli, A., Bernardini, L., Parazzini, C.,
Bianchi, V., Torres, B., Natacci, F., Giuffrida, M.G., Ficarazzi,
P., Dallapiccola, B., and Lalatta, F. (2008). Association of syn-
dromic mental retardation with an Xq12q13.1 duplication
encompassing the oligophrenin 1 gene. Am. J. Med. Genet.
A. 146A, 1718–1724.
49. Gijsbers, A.C., den Hollander, N.S., Helderman-van de Enden,
A.T., Schuurs-Hoeijmakers, J.H., Vijfhuizen, L., Bijlsma, E.K.,
van Haeringen, A., Hansson, K.B., Bakker, E., Breuning,
M.H., and Ruivenkamp, C.A. (2011). X-chromosome duplica-
tions in males with mental retardation: Pathogenic or benign
variants? Clin. Genet. 79, 71–78.
50. Horn, D., Spranger, S., Kruger, G., Wagenstaller, J., Weschke,
B., Ropers, H.H., Mundlos, S., Ullmann, R., Strom, T.M., and
Kiopocki, E. (2007). Microdeletions and microduplications
affecting the STS gene at Xp22.31 are associated with a
distinct phenotypic spectrum. Medizinische Genetik 19, 62.
51. Mimault, C., Giraud, G., Courtois, V., Cailloux, F., Boire, J.Y.,
Dastugue, B., and Boespﬂug-Tanguy, O.; The Clinical Euro-
pean Network on Brain Dysmyelinating Disease. (1999).
Proteolipoprotein gene analysis in 82 patients with sporadic
Pelizaeus-Merzbacher Disease: Duplications, the major cause
of the disease, originate more frequently in male germ cells,
but point mutations do not. Am. J. Hum. Genet. 65,
52. Rio, M., Malan, V., Boissel, S., Toutain, A., Royer, G., Gobin,
S., Morichon-Delvallez, N., Turleau, C., Bonnefont, J.P.,
Munnich, A., et al. (2010). Familial interstitial Xq27.3q28
duplication encompassing the FMR1 gene but not the
MECP2 gene causes a new syndromic mental retardation
condition. Eur. J. Hum. Genet. 18, 285–290.
53. Solomon, N.M., Ross, S.A., Morgan, T., Belsky, J.L., Hol, F.A.,
Karnes, P.S., Hopwood, N.J., Myers, S.E., Tan, A.S., Warne,
G.L., et al. (2004). Array comparative genomic hybridisation
analysis of boys with X linked hypopituitarism identiﬁes
a 3.9 Mb duplicated critical region at Xq27 containing
SOX3. J. Med. Genet. 41, 669–678.
54. Wagenstaller, J., Spranger, S., Lorenz-Depiereux, B., Kaz-
mierczak, B., Nathrath, M., Wahl, D., Heye, B., Glaser, D.,
Liebscher, V., Meitinger, T., and Strom, T.M. (2007).
Copy-number variations measured by single-nucleotide-
polymorphism oligonucleotide arrays in patients with
mental retardation. Am. J. Hum. Genet. 81, 768–779.
55. Whibley, A.C., Plagnol, V., Tarpey, P.S., Abidi, F., Fullston, T.,
Choma, M.K., Boucher, C.A., Shepherd, L., Willatt, L.,
Parkin, G., et al. (2010). Fine-scale survey of X chromosome
copy number variants and indels underlying intellectual
disability. Am. J. Hum. Genet. 87, 173–188.
56. Woodward, K., Palmer, R., Rao, K., and Malcolm, S. (1999).
Prenatal diagnosis by FISH in a family with Pelizaeus-
Merzbacher disease caused by duplication of PLP gene.
Prenat. Diagn. 19, 266–268.
588 The American Journal of Human Genetics 90, 579–590, April 6, 2012
57. Hantash, F.M., Goos, D.G.,Tsao, D., Quan, F., Buller-Burckle,A.,
Peng, M., Jarvis, M., Sun, W., and Strom, C.M. (2010). Qualita-
intermediate, premutation, full mutation, and mosaic carriers
in both sexes: Implications for fragile X syndrome carrier and
newborn screening. Genet. Med. 12, 162–173.
58. Gibbons, R.J., Brueton, L., Buckle, V.J., Burn, J., Clayton-
Smith, J., Davison, B.C., Gardner, R.J., Homfray, T., Kearney,
L., Kingston, H.M., et al. (1995a). Clinical and hematologic
aspects of the X-linked alpha-thalassemia/mental retardation
syndrome (ATR-X). Am. J. Med. Genet. 55, 288–299.
59. Gibbons, R.J., Picketts, D.J., Villard, L., and Higgs, D.R.
(1995b). Mutations in a putative global transcriptional
regulator cause X-linked mental retardation with alpha-
thalassemia (ATR-X syndrome). Cell 80, 837–845.
A.M., Seaver, L., Bonnefont, J.P., Romano, C., Fichera, M., et al.
(1999). Evaluation of a mutation screening strategy for sporadic
cases of ATR-X syndrome. J. Med. Genet. 36, 183–186.
61. Abidi, F., Schwartz, C.E., Carpenter, N.J., Villard, L., Fonte´s,
M., and Curtis, M. (1999). Carpenter-Waziri syndrome results
from a mutation in XNP. Am. J. Med. Genet. 85, 249–251.
62. Lossi, A.M., Milla´n, J.M., Villard, L., Orellana, C., Cardoso,
C., Prieto, F., Fonte´s, M., and Martı´nez, F. (1999). Mutation
of the XNP/ATR-X gene in a family with severe mental
retardation, spastic paraplegia and skewed pattern of X inac-
tivation: Demonstration that the mutation is involved in the
inactivation bias. Am. J. Hum. Genet. 65, 558–562.
63. Abidi, F.E., Cardoso, C., Lossi, A.M., Lowry, R.B., Depetris, D.,
Matte´i, M.G., Lubs, H.A., Stevenson, R.E., Fontes, M.,
Chudley, A.E., and Schwartz, C.E. (2005). Mutation in the
alternatively spliced region of the XNP/ATR-X gene causes
Chudley-Lowry syndrome. Eur. J. Hum. Genet. 13, 176–183.
64. Guerrini, R., Shanahan, J.L., Carrozzo, R., Bonanni, P., Higgs,
D.R., and Gibbons, R.J. (2000). A nonsense mutation of the
ATRX gene causing mild mental retardation and epilepsy.
Ann. Neurol. 47, 117–121.
65. Yntema, H.G., Poppelaars, F.A., Derksen, E., Oudakker, A.R.,
van Roosmalen, T., Jacobs, A., Obbema, H., Brunner, H.G.,
Hamel, B.C., and van Bokhoven, H. (2002). Expanding
phenotype of XNP mutations: Mild to moderate mental
retardation. Am. J. Med. Genet. 110, 243–247.
66. Mattei, J.F., Collignon, P., Ayme, S., and Giraud, F. (1983).
X-linked mental retardation, growth retardation, deafness
and microgenitalism. A second familial report. Clin. Genet.
67. Villard, L., Gecz, J., Matte´i, J.F., Fonte´s, M., Saugier-Veber, P.,
Munnich, A., and Lyonnet, S. (1996). XNP mutation in a
large family with Juberg-Marsidi syndrome. Nat. Genet. 12,
68. Smith, R.D., Fineman, R.M., and Myers, G.G. (1980). Short
stature, psychomotor retardation, and unusual facial appear-
ance in two brothers. Am. J. Med. Genet. 7, 5–9.
69. Ade`s, L.C., Kerr, B., Turner, G., and Wise, G. (1991). Smith-
Fineman-Myers syndrome in two brothers. Am. J. Med.
Genet. 40, 467–470.
70. Villard, L., Fonte`s, M., Ade`s, L.C., and Gecz, J. (2000). Identi-
ﬁcation of a mutation in the XNP/ATR-X gene in a family
reported as Smith-Fineman-Myers syndrome. Am. J. Med.
Genet. 91, 83–85.
71. Trivier, E., De Cesare, D., Jacquot, S., Pannetier, S., Zackai, E.,
Young, I., Mandel, J.L., Sassone-Corsi, P., and Hanauer, A.
(1996). Mutations in the kinase Rsk-2 associated with
Cofﬁn-Lowry syndrome. Nature 384, 567–570.
72. Kalscheuer, V.M., Freude, K., Musante, L., Jensen, L.R.,
Yntema, H.G., Ge´cz, J., Seﬁani, A., Hoffmann, K., Moser, B.,
Haas, S., et al. (2004). Mutations in the polyglutamine
binding protein 1 gene cause X-linked mental retardation.
Nat. Genet. 35, 313–315.
73. Lenski, C., Abidi, F., Meindl, A., Gibson, A., Platzer, M., Frank
Kooy, R., Lubs, H.A., Stevenson, R.E., Ramser, J., and
Schwartz, C.E. (2004). Novel truncating mutations in the
polyglutamine tract binding protein 1 gene (PQBP1) cause
Renpenning syndrome and X-linked mental retardation in
another family with microcephaly. Am. J. Hum. Genet. 74,
74. Stevenson, R.E., Bennett, C.W., Abidi, F., Kleefstra, T.,
Porteous, M., Simensen, R.J., Lubs, H.A., Hamel, B.C., and
Schwartz, C.E. (2005). Renpenning syndrome comes into
focus. Am. J. Med. Genet. A. 134, 415–421.
75. Lubs, H., Abidi, F.E., Echeverri, R., Holloway, L., Meindl, A.,
Stevenson, R.E., and Schwartz, C.E. (2006). Golabi-Ito-Hall
syndrome results from a missense mutation in the WW
domain of the PQBP1 gene. J. Med. Genet. 43, e30.
76. Strømme, P., Mangelsdorf, M.E., Scheffer, I.E., and Ge´cz, J.
(2002). Infantile spasms, dystonia, and other X-linked
phenotypes caused by mutations in Aristaless related
homeobox gene, ARX. Brain Dev. 24, 266–268.
77. Strømme, P., Mangelsdorf, M.E., Shaw, M.A., Lower, K.M.,
Lewis, S.M., Bruyere, H., Lu¨tcherath, V., Gedeon, A.K.,
Wallace, R.H., Scheffer, I.E., et al. (2002). Mutations in the
human ortholog of Aristaless cause X-linked mental retarda-
tion and epilepsy. Nat. Genet. 30, 441–445.
78. Bienvenu, T., Poirier, K., Friocourt, G., Bahi, N., Beaumont,
D., Fauchereau, F., Ben Jeema, L., Zemni, R., Vinet, M.C.,
Francis, F., et al. (2002). ARX, a novel Prd-class-homeobox
gene highly expressed in the telencephalon, is mutated in
X-linked mental retardation. Hum. Mol. Genet. 11, 981–991.
79. Frints, S.G., Froyen, G., Marynen, P., Willekens, D., Legius, E.,
and Fryns, J.P. (2002). Re-evaluation of MRX36 family after
discovery of an ARX gene mutation reveals mild neurological
features of Partington syndrome. Am. J. Med. Genet. 112,
80. Kitamura, K., Yanazawa, M., Sugiyama, N., Miura, H., Iizuka-
Kogo, A., Kusaka, M., Omichi, K., Suzuki, R., Kato-Fukui, Y.,
Kamiirisa, K., et al. (2002). Mutation of ARX causes abnormal
development of forebrain and testes in mice and X-linked lis-
sencephaly with abnormal genitalia in humans. Nat. Genet.
81. Uyanik, G., Aigner, L., Martin, P., Gross, C., Neumann, D.,
Marschner-Scha¨fer, H., Hehr, U., and Winkler, J. (2003).
ARX mutations in X-linked lissencephaly with abnormal
genitalia. Neurology 61, 232–235.
82. Kato, M., Das, S., Petras, K., Kitamura, K., Morohashi, K.,
Abuelo, D.N., Barr, M., Bonneau, D., Brady, A.F., Carpenter,
N.J., et al. (2004). Mutations of ARX are associated with
striking pleiotropy and consistent genotype-phenotype
correlation. Hum. Mutat. 23, 147–159.
83. Stepp, M.L., Cason, A.L., Finnis, M., Mangelsdorf, M., Holin-
ski-Feder, E., Macgregor, D., MacMillan, A., Holden, J.J., Gecz,
J., Stevenson, R.E., and Schwartz, C.E. (2005). XLMR in MRX
families 29, 32, 33 and 38 results from the dup24 mutation in
the ARX (Aristaless related homeobox) gene. BMC Med.
Genet. 6, 16.
The American Journal of Human Genetics 90, 579–590, April 6, 2012 589
84. Opitz, J.M., and Kaveggia, E.G. (1974). Studies of malforma-
tion syndromes of man 33: the FG syndrome. An X-linked
recessive syndrome of multiple congenital anomalies and
mental retardation. Z. Kinderheilkd. 117, 1–18.
85. Opitz, J.M., Richieri-da Costa, A., Aase, J.M., and Benke, P.J.
(1988). FG syndrome update 1988: note of 5 new patients
and bibliography. Am. J. Med. Genet. 30, 309–328.
86. Romano, C., Baraitser, M., and Thompson, E. (1994). A clin-
ical follow-up of British patients with FG syndrome. Clin.
Dysmorphol. 3, 104–114.
87. Ozonoff, S., Williams, B.J., Rauch, A.M., and Opitz, J.O.
(2000). Behavior phenotype of FG syndrome: cognition,
personality, and behavior in eleven affected boys. Am. J.
Med. Genet. 97, 112–118.
88. Battaglia, A., Chines, C., and Carey, J.C. (2006). The FG
syndrome: report of a large Italian series. Am. J. Med. Genet.
A. 140, 2075–2079.
89. Briault, S., Hill, R., Shrimpton, A., Zhu, D., Till, M., Ronce, N.,
Margaritte-Jeannin, P., Baraitser, M., Middleton-Price, H.,
Malcolm, S., et al. (1997). A gene for FG syndrome maps in
the Xq12-q21.31 region. Am. J. Med. Genet. 73, 87–90.
90. Briault, S., Villard, L., Rogner, U., Coy, J., Odent, S., Lucas, J.,
Passage, E., Zhu, D., Shrimpton, A., Pembrey, M., et al.
(2000). Mapping of X chromosome inversion breakpoints
[inv(X)(q11q28)] associated with FG syndrome: A second
FG locus [FGS2]? Am. J. Med. Genet. 95, 178–181.
91. Piluso, G., Carella, M., D’Avanzo, M., Santinelli, R., Carrano,
E.M., D’Avanzo, A., D’Adamo, A.P., Gasparini, P., and Nigro,
V. (2003). Genetic heterogeneity of FG syndrome: a fourth
locus (FGS4) maps to Xp11.4-p11.3 in an Italian family.
Hum. Genet. 112, 124–130.
92. Dessay, S., Moizard, M.P., Gilardi, J.L., Opitz, J.M., Middle-
ton-Price, H., Pembrey, M., Moraine, C., and Briault, S.
(2002). FG syndrome: linkage analysis in two families sup-
porting a new gene localization at Xp22.3 [FGS3]. Am. J.
Med. Genet. 112, 6–11.
93. Jehee, F.S., Rosenberg, C., Krepischi-Santos, A.C., Kok, F.,
Knijnenburg, J., Froyen, G., Vianna-Morgante, A.M., Opitz,
J.M., and Passos-Bueno, M.R. (2005). An Xq22.3 duplication
detected by comparative genomic hybridization microarray
(Array-CGH) deﬁnes a new locus (FGS5) for FG syndrome.
Am. J. Med. Genet. A. 139, 221–226.
94. Tarpey, P.S., Raymond, F.L., Nguyen, L.S., Rodriguez, J.,
Hackett, A., Vandeleur, L., Smith, R., Shoubridge, C., Edkins,
S., Stevens, C., et al. (2007). Mutations in UPF3B, a member
of the nonsense-mediated mRNA decay complex, cause syn-
dromic and nonsyndromic mental retardation. Nat. Genet.
95. Unger, S., Mainberger, A., Spitz, C., Ba¨hr, A., Zeschnigk, C.,
Zabel, B., Superti-Furga, A., and Morris-Rosendahl, D.J.
(2007). Filamin A mutation is one cause of FG syndrome.
Am. J. Med. Genet. A. 143A, 1876–1879.
96. Risheg, H., Graham, J.M., Jr., Clark, R.D., Rogers, R.C., Opitz,
J.M., Moeschler, J.B., Peiffer, A.P., May, M., Joseph, S.M.,
Jones, J.R., et al. (2007). A recurrent mutation in MED12
leading to R961W causes Opitz-Kaveggia syndrome. Nat.
Genet. 39, 451–453.
97. Lyons, M.J., Graham, J.M., Jr., Neri, G., Hunter, A.G.W.,
Clark, R.D., Rogers, R.C., Moscarda, M., Boccuto, L., Simen-
sen, R., Dodd, J., et al. (2009). Clinical experience in the
evaluation of 30 patients with a prior diagnosis of FG
syndrome. J. Med. Genet. 46, 9–13.
98. Clark, R.D., Graham, J.M., Jr., Friez, M.J., Hoo, J.J., Jones, K.L.,
McKeown, C., Moeschler, J.B., Raymond, F.L., Rogers, R.C.,
Schwartz, C.E., et al. (2009). FG syndrome, an X-linked
multiple congenital anomaly syndrome: the clinical pheno-
type and an algorithm for diagnostic testing. Genet. Med.
99. Bergmann, C., Zerres, K., Senderek, J., Rudnik-Schoneborn,
S., Eggermann, T., Ha¨usler, M., Mull, M., and Ramaekers,
V.T. (2003). Oligophrenin 1 (OPHN1) gene mutation causes
syndromic X-linked mental retardation with epilepsy, rostral
ventricular enlargement and cerebellar hypoplasia. Brain
100. Philip, N., Chabrol, B., Lossi, A.M., Cardoso, C., Guerrini, R.,
Dobyns, W.B., Raybaud, C., and Villard, L. (2003). Mutations
in the oligophrenin-1 gene (OPHN1) cause X linked congen-
ital cerebellar hypoplasia. J. Med. Genet. 40, 441–446.
101. Bittel, D.C., Kibiryeva, N., and Butler, M.G. (2007). Whole
genome microarray analysis of gene expression in subjects
with fragile X syndrome. Genet. Med. 9, 464–472.
102. French, C.A., Miyoshi, I., Kubonishi, I., Grier, H.E., Perez-
Atayde, A.R., and Fletcher, J.A. (2003). BRD4-NUT fusion
oncogene: A novel mechanism in aggressive carcinoma.
Cancer Res. 63, 304–307.
103. Stevenson, R.E., and Schwartz, C.E. (2009). X-linked intellec-
tual disability: Unique vulnerability of the male genome.
Dev. Disabil. Res. Rev. 15, 361–368.
104. Renieri, A., Pescucci, C., Longo, I., Ariani, F., Mari, F., and
Meloni, I. (2005). Non-syndromic X-linked mental retarda-
tion: From a molecular to a clinical point of view. J. Cell.
Physiol. 204, 8–20.
105. Zechner, U., Wilda, M., Kehrer-Sawatzki, H., Vogel, W.,
Fundele, R., and Hameister, H. (2001). A high density of
X-linked genes for general cognitive ability: A run-away
106. Graves, J.A., Ge´cz, J., and Hameister, H. (2002). Evolution of
the human X—a smart and sexy chromosome that controls
speciation and development. Cytogenet. Genome Res. 99,
107. Nguyen, D.K., and Disteche, C.M. (2006). Dosage compensa-
tion of the active X chromosome in mammals. Nat. Genet.
108. Turner, G., and Partington, M.W. (1991). Genes for intelli-
gence on the X chromosome. J. Med. Genet. 28, 429.
109. Turner, G. (1996). Finding genes on the X chromosome by
which homo may have become sapiens. Am. J. Hum. Genet.
110. Turner, G. (1996). Intelligence and the X chromosome.
Lancet 347, 1814–1815.
111. Hedges, L.V., and Nowell, A. (1995). Sex differences in
mental test scores, variability, and numbers of high-scoring
individuals. Science 269, 41–45.
112. Lubs, H.A. (1999). The other side of the coin: a hypothesis
concerning the importance of genes for high intelligence
and evolution of the X chromosome. Am. J. Med. Genet.
113. McMaster, G., and Trafzer, C. (2004). Native Universe, Voices
of Indian America (Washington, DC: Smithsonian and
114. Turner, G., Boyle, J., Partington, M.W., Kerr, B., Raymond,
F.L., and Ge´cz, J. (2008). Restoring reproductive conﬁdence
in families with X-linked mental retardation by ﬁnding the
causal mutation. Clin. Genet. 73, 188–190.
590 The American Journal of Human Genetics 90, 579–590, April 6, 2012
On Sharing Quantitative Trait GWAS Results
in an Era of Multiple-omics Data and the Limits
of Genomic Privacy
Hae Kyung Im,1,* Eric R. Gamazon,2 Dan L. Nicolae,2,3,4 and Nancy J. Cox2,3,*
Recent advances in genome-scale, system-level measurements of quantitative phenotypes (transcriptome, metabolome, and proteome)
promise to yield unprecedented biological insights. In this environment, broad dissemination of results from genome-wide association
studies (GWASs) or deep-sequencing efforts is highly desirable. However, summary results from case-control studies (allele frequencies)
have been withdrawn from public access because it has been shown that they can be used for inferring participation in a study if the
individual’s genotype is available. A natural question that follows is how much private information is contained in summary results
from quantitative trait GWAS such as regression coefﬁcients or p values. We show that regression coefﬁcients for many SNPs can reveal
the person’s participation and for participants his or her phenotype with high accuracy. Our power calculations show that regression
coefﬁcients contain as much information on individuals as allele frequencies do, if the person’s phenotype is rather extreme or if
multiple phenotypes are available as has been increasingly facilitated by the use of multiple-omics data sets. These ﬁndings emphasize
the need to devise a mechanism that allows data sharing that will facilitate scientiﬁc progress without sacriﬁcing privacy protection.
Homer et al.1 showed that it is possible to detect an individual’s
presence in a complex genomic DNA mixture even
when the mixture contains only trace quantities of his or
her DNA. The study considered the implications of its ﬁnd-
ings, motivated originally as an application to forensic
science, in the context of genome-wide association studies
(GWASs) from which aggregate allele frequencies for a large
number of markers were being made publicly available.
Shortly after this publication, a reduction in open access
to aggregate GWAS results was implemented. Jacobs et al.2
presented an improved method using a likelihood
approach and showed that disease status could be inferred
for participants of the study. Visscher et al.3 and Sankararaman
et al.4 calculated power estimates to understand the
limits of individual detection from sample allele frequen-
cies. They showed that the power to detect membership is
determined by the ratio between the number of markers
and the number of participants in the study.
We present a method that can infer an individual’s partic-
ipation in a study when regression coefﬁcients from
quantitative phenotypes are available. This problem is
especially relevant now that genome-wide system-level
measurements of quantitative phenotypes (transcriptome,
proteome, and metabolome) are being widely collected
and analyzed. Undoubtedly, disseminating results from
quantitative GWAS and deep-sequencing efforts could be
of enormous beneﬁt to research groups working on related
traits. We explore several statistics that can discriminate
study participants from nonparticipants. Notably, we ﬁnd
that the use of only the direction of effects (signs of the
coefﬁcients) enables membership inference with good
accuracy. We show the results from applying the statistics
to the Genetics of Kidneys in Diabetes (GoKinD) data
to illustrate the level of information contained in
aggregate data. We also provide quantiﬁcation of the infor-
mation content by computing the power of the method.
Furthermore, we discuss a general framework that can be
used for integrating our ﬁndings and earlier studies of
genomic privacy based on sample allele frequencies. With
the increasing use of high-throughput technologies to inte-
grate multiple-omics data sets, these various statistics result
in a more powerful approach to the identiﬁcation problem
than with the use of a single phenotype.
Material and Methods
Let us assume that we have the estimated regression coefﬁcients
for M independent SNPs, that we use data on n individuals in a
GWAS (test sample), and that we also have the allelic dosage for
individuals from a reference population such as HapMap7,8 or the
1000 Genomes Project.9
Membership Inference Method
We deﬁne a statistic (a function of available data) that has a
different distribution depending on the membership status and
use this difference to infer membership. We compute this statistic
for the individual of interest, I, and for all individuals in the refer-
ence population. If the statistic falls well within the reference
distribution we will conclude that the individual is not likely to
have participated in the study, and if the statistic falls in the
extremes of the distribution, we will conclude that the individual
did participate in the study.
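As an illustration only, the decision rule described above can be sketched in a few lines of Python. The function names, the cutoff `alpha`, and the example values are hypothetical and are not taken from the original analysis; the sketch assumes that "extreme" means a large absolute value relative to the reference distribution.

```python
def empirical_tail_fraction(stat_individual, reference_stats):
    """Fraction of reference individuals whose |statistic| is at least as extreme."""
    if not reference_stats:
        raise ValueError("reference_stats must not be empty")
    target = abs(stat_individual)
    extreme = sum(1 for s in reference_stats if abs(s) >= target)
    return extreme / len(reference_stats)


def infer_membership(stat_individual, reference_stats, alpha=0.05):
    """True when the statistic lies in the extreme alpha tail of the reference."""
    return empirical_tail_fraction(stat_individual, reference_stats) <= alpha


# Hypothetical reference statistics clustered around 0
reference = [-0.12, 0.05, 0.08, -0.03, 0.10, -0.07, 0.02, 0.11, -0.09, 0.04,
             0.06, -0.01, 0.03, -0.05, 0.09, -0.10, 0.07, -0.02, 0.01, -0.06]
print(infer_membership(0.03, reference))  # value well inside the reference range
print(infer_membership(1.40, reference))  # value far outside the reference range
```

With only 20 reference individuals the smallest non-zero tail fraction is 0.05, so the resolution of such a rule is limited by the size of the reference group.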
1Department of Health Studies, University of Chicago, Chicago, IL 60637, USA; 2Department of Medicine, University of Chicago, Chicago, IL 60637, USA; 3Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA; 4Department of Statistics, University of Chicago, Chicago, IL 60637, USA
*Correspondence: email@example.com (H.K.I.), firstname.lastname@example.org (N.J.C.)
DOI 10.1016/j.ajhg.2012.02.008. ©2012 by The American Society of Human Genetics. All rights reserved.
The American Journal of Human Genetics 90, 591–598, April 6, 2012 591
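Let $\hat{Y}$ be defined as

$$\hat{Y}_I = \frac{n}{M}\sum_{j=1}^{M}\hat{\beta}_j\,\bigl(X_{I,j}-\hat{X}_j\bigr), \qquad \text{(Equation 1)}$$

where $X_{I,j}$ is the allelic dosage of individual $I$ at SNP $j$, $\hat{\beta}_j$ is the
estimated coefficient from fitting the model $Y_i = \alpha_j + \beta_j X_{i,j} + \epsilon_i$,
and $\hat{X}_j$ is the estimated mean of allelic dosage (twice the allele
frequency) for SNP $j$ computed with the reference group.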
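The behavior of this statistic can be checked with a small simulation. The sketch below is not the authors' code: it assumes the statistic has the form (n/M) Σ_j β̂_j (X_{I,j} − X̂_j), uses independent SNPs in Hardy-Weinberg equilibrium, and draws a phenotype with no genetic effect at all. Sample sizes, allele-frequency range, and the seed are arbitrary choices made for illustration.

```python
import random


def slope(x, y):
    """OLS slope of y on x (simple regression with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0.0:  # SNP is monomorphic in this sample: carries no information
        return 0.0
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx


def y_hat(dosages, betas, ref_means, n):
    """(n / M) * sum_j beta_j * (X_Ij - Xhat_j) for one individual."""
    m = len(betas)
    return n / m * sum(b * (x - r) for b, x, r in zip(betas, dosages, ref_means))


def simulate(n=50, n_ref=50, m=4000, seed=1):
    rng = random.Random(seed)
    freqs = [rng.uniform(0.2, 0.5) for _ in range(m)]

    def person():
        # Dosage = number of copies of the allele (0, 1, or 2) at each SNP
        return [(rng.random() < p) + (rng.random() < p) for p in freqs]

    sample = [person() for _ in range(n)]
    reference = [person() for _ in range(n_ref)]
    pheno = [rng.gauss(0.0, 1.0) for _ in range(n)]  # no true genetic effect
    betas = [slope([ind[j] for ind in sample], pheno) for j in range(m)]
    ref_means = [sum(ind[j] for ind in reference) / n_ref for j in range(m)]
    in_stats = [y_hat(ind, betas, ref_means, n) for ind in sample]
    out_stats = [y_hat(ind, betas, ref_means, n) for ind in reference]
    return pheno, in_stats, out_stats


pheno, in_stats, out_stats = simulate()
print("participant 0: phenotype %.2f, statistic %.2f" % (pheno[0], in_stats[0]))
print("mean |statistic| in reference: %.2f"
      % (sum(abs(v) for v in out_stats) / len(out_stats)))
```

Under these assumptions the statistic for participants tracks their demeaned phenotype, while for reference individuals it stays near zero.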
Conditional Mean and Variance of $\hat{Y}$
The expected value and the variance of the statistic $\hat{Y}_I$ conditional
on the individual’s genotype $X_I$ and demeaned phenotype $Y_I - \mu$
and membership status (in or out) are as follows:

$$\begin{aligned}
E[\hat{Y} \mid X_I, Y_I, \text{in}] &\approx (Y_I - \mu) \\
E[\hat{Y} \mid X_I, Y_I, \text{out}] &\approx 0 \\
\operatorname{Var}[\hat{Y} \mid X_I, Y_I, \text{in}] &\approx \sigma^2\,\frac{n}{M} \\
\operatorname{Var}[\hat{Y} \mid X_I, Y_I, \text{out}] &\approx \sigma^2\,\frac{n}{M},
\end{aligned} \qquad \text{(Equation 2)}$$

where $\sigma^2$ is the variance of the phenotype, and $\mu$ is the population
mean of the phenotype $Y$. Note that for the method to work we do
not need to make use of these expressions nor do we need to know
$\sigma^2$ and $\mu$ because we rely on the empirical distribution from the
reference population to determine membership. These expressions
will serve to estimate the power of the method.
Unconditional on $Y_I$, the variance of the statistic $\hat{Y}$ is given by

$$\operatorname{Var}[\hat{Y} \mid X_I, \text{in}] \approx \sigma^2\left(1 + \frac{n}{M}\right).$$
In computing these quantities we assume that the number of
markers is much larger than the number of individuals in the
test sample and the number of individuals in the reference group:
$M \gg n \gg 1$ and $M \gg n^* \gg 1$. Hardy-Weinberg equilibrium is
assumed. To derive these expressions, we used standard Taylor
expansions and the law of iterated expectations. We tested the
validity of these for finite samples ($n$ between 100 and 1,000 and
$M/n$ between 1,000 and 50,000) by fitting linear regressions
with simulated genotypes and phenotypes and computing the
sample mean and variances of the $\hat{Y}$ statistic. See Supplemental
Data, available online, to find plots of the validation.
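The step from the conditional to the unconditional variance can be sketched with the law of total variance. The lines below are a reconstruction rather than the authors' derivation; they assume the conditional approximations $E[\hat{Y} \mid X_I, Y_I, \text{in}] \approx Y_I - \mu$ and $\operatorname{Var}[\hat{Y} \mid X_I, Y_I, \text{in}] \approx \sigma^2 n/M$, and that $Y_I$ has variance $\sigma^2$ and is approximately independent of $X_I$.

```latex
\begin{aligned}
\operatorname{Var}[\hat{Y} \mid X_I, \text{in}]
  &= E\bigl[\operatorname{Var}[\hat{Y} \mid X_I, Y_I, \text{in}]\bigr]
   + \operatorname{Var}\bigl[E[\hat{Y} \mid X_I, Y_I, \text{in}]\bigr] \\
  &\approx \sigma^2\,\frac{n}{M} + \operatorname{Var}[Y_I - \mu]
   = \sigma^2\left(1 + \frac{n}{M}\right).
\end{aligned}
```

When $M \gg n$ the first term is negligible, so for participants the spread of the statistic is dominated by the spread of the phenotype itself.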
Power of the Method
To compute power, we deﬁne the null and alternative hypothesis.
Under the null hypothesis the individual did not participate
in the study (nor did any relatives of the individual), whereas under
the alternative hypothesis, the individual did participate. Using the
mean and variance under the null hypothesis and the corresponding
mean and variance under the alternative hypothesis computed
in Equation 2 and assuming $M \gg n \gg 1$, $M \gg n^* \gg 1$,
normality of the statistic $\hat{Y}$, and the sign of $Y_I - \mu$ to be known,
the power will be approximately given by

$$\Phi\left(\sqrt{\frac{M}{n}}\,\frac{|Y_I - \mu|}{\sigma} - z_\alpha\right), \qquad \text{(Equation 3)}$$

where $\alpha$ is the type I error, $z_x = \Phi^{-1}(1 - x)$ is the $(1 - x)$-quantile of
the normal distribution, and $\Phi$ is the normal cumulative distribution
function. If the sign of $Y_I - \mu$ is not known, a two-sided test
will be used in the derivation and the power will be given by

$$\Phi\left(\sqrt{\frac{M}{n}}\,\frac{|Y_I - \mu|}{\sigma} - z_{\alpha/2}\right). \qquad \text{(Equation 4)}$$

See derivation in Appendix A. Because $\Phi$ is a strictly increasing
function the power
• increases when $M$, the number of SNPs, increases
• decreases when $n$, the study’s sample size, increases
• increases when the individual’s phenotype deviates more
from the mean (scaled by the standard deviation)
• increases when $\alpha$, the type I error, increases
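The qualitative statements in the list above can be explored numerically. The following sketch assumes the power has the form Φ(√(M/n)·|Y_I − μ|/σ − z), with z = z_α when the sign of Y_I − μ is known and z = z_{α/2} otherwise; the function and parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def z(x):
    """Upper quantile z_x = Phi^{-1}(1 - x) of the standard normal."""
    return _STD_NORMAL.inv_cdf(1.0 - x)


def detection_power(m, n, deviation_sd, alpha=0.05, sign_known=True):
    """Approximate power Phi(sqrt(M/n) * |Y_I - mu| / sigma - z).

    deviation_sd is |Y_I - mu| / sigma. The cutoff z is z_alpha for the
    one-sided test (sign known) and z_{alpha/2} for the two-sided test.
    """
    cutoff = z(alpha) if sign_known else z(alpha / 2.0)
    return _STD_NORMAL.cdf(sqrt(m / n) * abs(deviation_sd) - cutoff)


# Power for a phenotype one standard deviation from the mean, n = 1,000
for m in (1_000, 10_000, 50_000):
    print(m, round(detection_power(m, 1_000, 1.0), 3))
```

Setting `deviation_sd` to 0 returns the type I error in the one-sided case, which is a convenient sanity check on the formula.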
To facilitate comparison with Visscher et al.3 and Sankararaman et al.,4
let us express the one-sided power Equation 3 with the
following (equivalent) implicit formula

$$(z_\alpha + z_\beta)^2 = \frac{M}{n}\left(\frac{Y_I - \mu}{\sigma}\right)^2, \qquad \text{(Equation 5)}$$

where $1 - \beta$ is the power (note that in Sankararaman et al.4 $\beta$ is
defined as the power). Recall that in Visscher et al.3 and Sankararaman
et al.4 power was given implicitly by

$$(z_\alpha + z_\beta)^2 = \frac{M}{n}. \qquad \text{(Equation 6)}$$

Thus, the only difference between Equations 5 and 6 is the factor
$((Y_I - \mu)/\sigma)^2$. If the phenotype of the person deviates more than
one standard deviation away from the mean, i.e., $|Y_I - \mu| > \sigma$,
and the sign of $Y_I - \mu$ is known, the power when regression
coefficients are used is larger than it is when allele frequencies
are used. If the person’s phenotype is close to the mean, then
the power will be much diminished. Although expectations are
computed conditional on $Y_I - \mu$, we do not need to know its
magnitude in order to achieve this power. However, we do need
to know the sign of $Y_I - \mu$ in order to keep the test one-sided.
If the sign is not used, $|Y_I - \mu|$ would need to be
$1 + (z_{\alpha/2} - z_\alpha)/\sqrt{M/n}$ times greater than the standard deviation in order
to achieve greater power than the allele frequency case. As an
example, if $\alpha = 0.05$ and $M/n = 100$, $|Y_I - \mu|$ would need to be
greater than 1.031 times $\sigma$.
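The worked example can be reproduced with a short calculation. This sketch assumes that the two-sided power replaces z_α with z_{α/2}, so that equating z_{α/2} + z_β = √(M/n)·|Y_I − μ|/σ with the allele-frequency relation z_α + z_β = √(M/n) gives a break-even deviation of 1 + (z_{α/2} − z_α)/√(M/n). The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_break_even(alpha, markers_per_participant):
    """Smallest |Y_I - mu| / sigma at which the two-sided regression test
    matches the allele-frequency power: 1 + (z_{alpha/2} - z_alpha) / sqrt(M/n)."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1.0 - alpha)
    z_half_alpha = nd.inv_cdf(1.0 - alpha / 2.0)
    return 1.0 + (z_half_alpha - z_alpha) / sqrt(markers_per_participant)


# alpha = 0.05 and M/n = 100, as in the example in the text
print(two_sided_break_even(0.05, 100))
```

For these inputs the result is close to 1.03, in line with the figure quoted in the text; the penalty for not knowing the sign shrinks as M/n grows.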
Individual Contribution to the Regression Coefﬁcient
In order to get an intuitive understanding of the contribution
of each individual from the sample, we can decompose the estimated
regression coefficient into roughly the sum of individual
contributions:

$$\hat{\beta}_j \approx \tilde{\beta}_{I,j} + \sum_{i \neq I} \tilde{\beta}_{i,j}, \qquad \text{(Equation 7)}$$

defining $\tilde{\beta}_{i,j} = (1/n\sigma_j^2)\,\tilde{X}_{i,j}\tilde{Y}_i$ as the individual contribution to the
regression coefficient and $\sigma_j^2$ as the variance of the allelic dosage
(under the Hardy-Weinberg assumption $\sigma_j^2 = 2p_j(1 - p_j)$, where $p_j$ is
the minor allele frequency of SNP $j$). We use the tilde $\tilde{X}$ for the
demeaned variable that uses the mean from the sample. It is worth
comparing with the decomposition for the case when minor allele
frequencies for the sample are available:
$\hat{p}_j \approx (p_{I,j}/n) + \sum_{i \neq I} p_{i,j}/n$,
where $\hat{p}_j$ is the sample minor allele frequency and $p_{i,j}$ is the allelic
dosage divided by 2 of individual $i$ for SNP $j$. This similarity gives
an intuitive understanding of the corresponding similarity in the
dependence of power on the ratio of the number of SNPs and
sample size of the study.
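The decomposition can be verified numerically on a toy example. In the sketch below, which is illustrative only, the variance of the dosage is estimated from the sample (divisor n) rather than taken as 2p_j(1 − p_j); under that assumption the per-individual terms add up to the ordinary least-squares slope exactly, and the approximation sign in the text reflects the use of the population variance. The dosage and phenotype values are made up.

```python
def individual_contributions(x, y):
    """Per-individual terms X~_i * Y~_i / (n * s2), where X~ and Y~ are
    sample-demeaned and s2 is the sample variance of the dosage (divisor n)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    s2 = sum((a - mx) ** 2 for a in x) / n
    return [(a - mx) * (b - my) / (n * s2) for a, b in zip(x, y)]


def ols_slope(x, y):
    """Textbook OLS slope, computed independently from raw sums."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return (n * sxy - sx * sy) / (n * sxx - sx * sx)


dosage = [0, 1, 2, 1, 0, 2, 1, 1]                        # hypothetical allelic dosages
phenotype = [0.3, -1.2, 2.1, 0.4, -0.7, 1.5, 0.0, -0.4]  # hypothetical phenotypes
parts = individual_contributions(dosage, phenotype)
print(sum(parts), ols_slope(dosage, phenotype))
```

Each entry of `parts` is the share of the published coefficient that is attributable to one participant, which is what the membership statistics exploit.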
Combining Multiple Phenotypes
If results from multiple phenotypes such as eQTL (or other omics
data) results are available, we can combine the information
regarding the individual’s membership by using a Fisher type of
method (the sum of logarithms of p values).10
For each phenotype $k$, we can compute an empirical p value, $p_k$,
defined as the proportion of reference individuals with magnitude
of the $|\hat{Y}|$ greater than the individual’s $|\hat{Y}_I|$. We can combine
p values across different phenotypes by computing

$$-2\sum_{k=1}^{n_{\text{pheno}}} \log(p_k),$$

where $n_{\text{pheno}}$ is the number of phenotypes to be combined. In
addition to accumulating evidence across phenotypes, this
method avoids the problem of lack of power due to one particular
phenotype being close to the population mean.
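A minimal sketch of such a Fisher-type combination follows. It assumes the usual form of Fisher's statistic, −2 Σ log p_k, and converts it to a combined p value with the chi-square distribution on 2·n_pheno degrees of freedom, which is only valid if the per-phenotype p values are independent; correlated phenotypes would need a different reference distribution. Empirical p values equal to zero are rejected here, so in practice they would need a floor such as 1/(number of reference individuals + 1), which is an assumption of this sketch and not something stated in the text.

```python
from math import exp, log


def fisher_statistic(p_values):
    """Fisher's combined statistic: -2 * sum of log p values."""
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p values must lie in (0, 1]")
    return -2.0 * sum(log(p) for p in p_values)


def chi2_sf_even_df(x, k):
    """Survival function of a chi-square with 2k degrees of freedom,
    using the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return exp(-half) * total


def combine(p_values):
    """Combined p value, assuming independent per-phenotype p values."""
    return chi2_sf_even_df(fisher_statistic(p_values), len(p_values))


# Three hypothetical per-phenotype empirical p values
print(combine([0.20, 0.04, 0.10]))
```

Several moderately small p values can combine into stronger evidence than any single phenotype provides, which is the effect the text describes for multiple-omics data.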
Usually other covariates such as age, sex, etc. are adjusted for
when performing GWASs. If the allelic dosage is independent of
the covariates (as will likely be the case for most SNPs) $\hat{Y}$ will
converge to the covariate-adjusted phenotype instead of the actual
phenotype. The standard deviation might change if the covariates
explain a substantial portion of the phenotypic variability.
However, the method will still work because under no participation
$\hat{Y}$ will still be around 0, whereas if the individual participated
in the study, $\hat{Y}$ will converge to the covariate-adjusted phenotype.
The method does not require knowing the actual phenotype and it
will work relative to this adjusted phenotype. For the purpose of
re-identiﬁcation using our method, the presence of covariates is
only a nuisance and no additional power is achieved when they
Sample Correlation Statistic
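Equation 7 suggests that the sample correlation between the estimated
beta and the individual’s genotype might be useful because
we would expect the correlation to be 0 if the individual was not in
the sample and different from 0 if the individual was part of the
sample. Let $\hat{C}$ be defined as

$$\hat{C} = \frac{\sum_{j=1}^{M}\bigl(\hat{\beta}_j - \overline{\hat{\beta}}\bigr)\bigl(X_{I,j} - \hat{X}_j - \overline{X_I - \hat{X}}\bigr)}{\sqrt{\sum_{j=1}^{M}\bigl(\hat{\beta}_j - \overline{\hat{\beta}}\bigr)^2}\,\sqrt{\sum_{j=1}^{M}\bigl(X_{I,j} - \hat{X}_j - \overline{X_I - \hat{X}}\bigr)^2}},$$

where the long bar above an expression means the sample mean of
that expression over the $M$ SNPs.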
Equation 7 also shows that the sign of the regression coefficient
will be slightly more likely to match the sign of the demeaned
allelic dosage if the person participated in the study than otherwise.
Let $\hat{S}$ be defined as:

$$\hat{S} = \sum_{j=1}^{M} \operatorname{sign}(\hat{\beta}_j)\,\operatorname{sign}\bigl(X_{I,j} - \hat{X}_j\bigr).$$

We expect that strictly more than 50% of the time the product
$\operatorname{sign}(\hat{\beta}_j)\,\operatorname{sign}(X_{I,j} - \hat{X}_j)$ will be positive (or negative) if the individual
participated in the study and his or her phenotype is above
(or below) average. By looking at the absolute value of the sign
statistic we expect to gain information on whether the individual
was part of the study or not.
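Both statistics are simple to compute once the estimated coefficients and the individual's demeaned dosages are available. The sketch below is an illustration under stated assumptions: the correlation statistic is taken to be the Pearson correlation across SNPs, the sign statistic is taken to be an unnormalized sum of sign products, and the input vectors are made up.

```python
from math import sqrt


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def correlation_statistic(betas, dosage_dev):
    """Pearson correlation across SNPs between beta_j and X_Ij - Xhat_j."""
    m = len(betas)
    mb, md = sum(betas) / m, sum(dosage_dev) / m
    num = sum((b - mb) * (d - md) for b, d in zip(betas, dosage_dev))
    den = sqrt(sum((b - mb) ** 2 for b in betas) *
               sum((d - md) ** 2 for d in dosage_dev))
    return num / den


def sign_statistic(betas, dosage_dev):
    """Sum over SNPs of sign(beta_j) * sign(X_Ij - Xhat_j)."""
    return sum(sign(b) * sign(d) for b, d in zip(betas, dosage_dev))


betas = [0.8, -0.3, 0.5, -0.6, 0.1]   # hypothetical estimated coefficients
dev = [0.9, -0.4, 0.2, -1.1, -0.3]    # hypothetical X_Ij - Xhat_j values
print(correlation_statistic(betas, dev), sign_statistic(betas, dev))
```

The sign statistic uses only the direction of each effect, so it can still be computed when effect sizes are withheld and only signs are reported.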
We used the PLINK software11 and filtered out SNP markers that
were not in Hardy-Weinberg equilibrium (p < 0.001) and those
that had minor allele frequencies less than 5%. Receiver operating
characteristic (ROC) curves were generated by using the absolute
value of the statistic as the predicting variable and membership
in the sample as the labels by using the ROCR12 package for the
R statistical package.13 We used only individuals who self-reported
as white both for sample and reference.
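The ROC analysis itself was carried out with ROCR in R. As a language-neutral illustration of the quantity involved, the sketch below computes the area under the ROC curve directly from the absolute value of a statistic, using the Mann-Whitney formulation; it is a stand-in written for this example and does not reproduce ROCR's interface. The input values are hypothetical.

```python
def auc_from_abs_statistic(member_stats, reference_stats):
    """Area under the ROC curve when |statistic| is the score and sample
    membership is the label (Mann-Whitney form; ties count one half)."""
    wins = 0.0
    for s in member_stats:
        for r in reference_stats:
            if abs(s) > abs(r):
                wins += 1.0
            elif abs(s) == abs(r):
                wins += 0.5
    return wins / (len(member_stats) * len(reference_stats))


members = [1.2, -0.9, 0.1, -1.6]      # hypothetical statistics for participants
nonmembers = [0.05, -0.2, 0.15, 0.3]  # hypothetical statistics for the reference group
print(auc_from_abs_statistic(members, nonmembers))
```

An area of 0.5 corresponds to no discrimination between participants and the reference group, and 1.0 to perfect separation.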
We show the performance of the statistics defined in Material
and Methods ($\hat{Y}$, $\hat{S}$, $\hat{C}$) by using data from the GoKinD
(Genetics of Kidneys in Diabetes) study.5,6 The data set was
downloaded from dbGaP14 and consisted of more than
1,800 probands with long-standing type 1 diabetes, over
300 dichotomous and quantitative phenotypes, and geno-
type from Affymetrix Genome-Wide Human SNP Array 5.0
platform. We used a subset of 1,644 individuals reported to be white.
We show results for two of the phenotypes: cholesterol
level and body mass index (BMI). We also tested the
method on a third simulated phenotype and found at least
as good performance. The latter demonstrates that the
method does not depend on any real effect of genotype on phenotype.
We randomly sampled 100, 500, and 1,000 individuals
from each study’s cohort and performed a GWAS including
only individuals from each random sample. The remaining
individuals were used as the reference group. The statistics
($\hat{Y}$, $\hat{S}$, $\hat{C}$) were computed for both sample and reference
Identiﬁability Statistic and Phenotype Reconstruction
Figure 1 shows $\hat{Y}$ versus the actual phenotype (rank
normalized cholesterol levels). The blue dots correspond
to individuals in the sample and the black dots correspond
to individuals in the reference group. For individuals in the
sample, $\hat{Y}$ lies close to the one-to-one line (perfect prediction
line), whereas the individuals in the reference population
lie close to a flat line around 0 (consistent with our
calculations of mean and variances). The sample size was
$n = 1{,}000$ and the number of SNPs was $M = 300{,}000$.
The number of reference individuals was 644.
This demonstrates that for individuals who participated
in a study, their phenotype can be reconstructed with high
accuracy using the $\hat{Y}$ statistic, whereas for nonparticipants
what we get is mostly noise.
Distribution of Statistic by Membership Status
and ROC Analysis
The left panel in Figure 2 shows the distribution of the
absolute value of $\hat{Y}$ by membership status. As in Figure 1