The American Journal of Human Genetics (AJHG) Vol. 90 No. 4, 2012

Issue of The American Journal of Human Genetics, Volume 90, No. 4, 2012


  1. 1. EDITORS’ CORNER

This Month in The Journal
Sara B. Cullinan1

1 Deputy Editor, AJHG
DOI 10.1016/j.ajhg.2012.03.008. ©2012 by The American Society of Human Genetics. All rights reserved.
The American Journal of Human Genetics 90, 575–576, April 6, 2012

Genomic Privacy in GWAS?
Im et al., page 591

Recent technological advances have made it possible to interrogate human phenotypes at a previously unimaginable scale. But, as with any collection of personal data, it is important to ensure individual privacy. Indeed, previous investigations into the ability to discern an individual’s participation in genetic studies have led to the withdrawal of allele frequencies from publicly available results. In this issue, Im et al. probe deeper, questioning how much private information can be extracted from typically reported statistics, such as regression coefficients or p values. Through a series of analyses, the authors determine that regression coefficients can, in some cases, provide just as much information as allele frequencies, thus creating a situation in which even statistics that were thought to be “safe” can in fact identify participants and their medical history. The possibility of membership detection is especially high in cases in which multiple phenotypes are being reported, e.g., in multiple-omics data sets. With exome- and whole-genome sequencing (and the large data sets that they generate) becoming more common, it is clear that many additional discussions between scientists, clinicians, and ethicists are needed to ensure that privacy can be maintained without sacrificing the dissemination of research findings.

A Major mtDNA Shake-Up
Behar et al., page 675

In 1981, the revised Cambridge Reference Sequence was published. It immediately became the standard against which human mtDNA is compared and phylogenies are derived. Indeed, its publication enabled a tremendous amount of research aimed at better understanding human history. However, the realization that this sequence belongs to a recently coalescing European haplogroup creates several concerns about inconsistencies and misinterpretation. To address these concerns, Behar et al. set out to reassess and refine the human mtDNA phylogeny, and in so doing, they constructed a new reference mtDNA sequence, termed the Reconstructed Sapiens Reference Sequence (RSRS). Generated through the assessment of over 18,000 human mtDNA sequences, as well as those of Homo neanderthalensis, the RSRS performs well in molecular clock analyses and lays the groundwork for a new way of analyzing mtDNA. Although this change will require a large amount of rethinking, the authors put forth a coherent plan to make this feasible, including tools to transform previously generated data and analyses. With the amount of deep-sequencing data that should become available in the coming years, the RSRS presents a “next-generation” approach to understanding human matrilineal diversity.

First Steps toward Understanding Birth Weight
Ishida et al., page 715

Babies come in many different sizes, but being too small is a major health concern. Indeed, intrauterine growth restriction (IUGR) serves as a risk factor for several adult diseases, including obesity and type 2 diabetes. Although maternal health plays a large role in directing fetal growth, the genetic factors that contribute to the variability in fetal size remain poorly understood. Of interest, however, are those genes that undergo imprinting, a process by which the parent of origin determines monoallelic expression. Evolutionary theory posits that expression of alleles inherited from the father promotes in utero growth, whereas expression of those inherited from the mother inhibits growth. But what happens if the maternally inherited allele exhibits an altered expression pattern? Might the balance be tipped? In this issue, Ishida et al. explored the possibility that variants in PHLDA2, which is only expressed from the maternal allele, might influence birth weight. Their studies identified a variant in the PHLDA2 promoter region that eliminates several consensus transcription factor binding sites and should therefore lead to decreased expression. Then, through a cross-sectional study of normal births, they showed that inheritance of this variant (from the mother), as well as maternal homozygosity, correlated with increased birth weight. Future studies, focused specifically on IUGR, should help to elucidate how variation in PHLDA2, and potentially in other imprinted genes, contributes to the regulation of birth weight and related complications.

Evolutionary History of AD Risk Alleles
Raj et al., page 720

Alzheimer’s disease (AD) is the most common neurodegenerative disease, and as of yet, there are no effective
  2. 2. treatments, let alone a cure. Therefore, there is great interest in better understanding the causes of the disease from both biochemical and genetic standpoints. The best-characterized genetic risk factor is the ε4 haplotype of APOE, which, interestingly, shows evidence of having undergone positive selection, most likely because of an effect on an unrelated phenotype. With this in mind, Raj et al. set out to identify other possible indications of selection in loci shown to associate with AD susceptibility. They found such evidence, all in East Asian populations, for three loci, suggesting that the same selective pressure might have acted on each. Given that AD is unlikely to serve in such a role, the authors posited that pathogen exposure might have been the driving force. Indeed, many signatures of selection in the human genome are attributed to interactions with pathogens. Interestingly, the protein products generated at these loci appear to belong to the same interaction network. This finding suggests that additional clues about AD risk might be found by interrogating other branches of this network. Although much remains to be learned about the variants that contribute to AD risk, the study of their evolution, and possible coevolution, will no doubt yield insights into the underlying biology of the disease.

X Marks the Spot in Breast Cancer Research
Park et al., page 734

The ubiquitous pink ribbons serve as a reminder that many women (and some men) are affected by breast cancer. Although well known, BRCA1 and BRCA2 mutations account for a minority of hereditary cancers. Therefore, a better understanding of the biology of breast cancer, along with better screening tests, is sought by many families. To help achieve these goals, Park et al. used exome sequencing and identified rare mutations in XRCC2 that serve as susceptibility factors for familial breast cancer. XRCC2 is a RAD51 paralog that is required for efficient homologous recombination (HR); its loss leads to marked genome instability and aneuploidy. Future studies aimed at delineating the exact role of XRCC2 mutations, as well as mutations that lie within the same pathway, in disease onset and/or progression should aid in the discovery of new treatment options. This finding adds to the list of genes whose protein products perform crucial roles in HR and whose mutations can influence breast cancer risk. It also provides support for those who seek to better understand common diseases through sequencing studies.
  3. 3. EDITORS’ CORNER

This Month in Genetics
Kathryn B. Garber1,*

1 Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA
*Correspondence: kgarber@genetics.emory.edu
DOI 10.1016/j.ajhg.2012.03.009. ©2012 by The American Society of Human Genetics. All rights reserved.
The American Journal of Human Genetics 90, 577–578, April 6, 2012

Big Gene, Big Heart

Although the cardiomyopathies have a substantial genetic etiology, genetic testing for this class of heart disorders has been notoriously difficult. Indeed, the causative mutation is found in only 20%–30% of patients with dilated cardiomyopathy. Titin is a candidate gene for cardiomyopathy that has been examined for mutations to a limited extent due to its massive coding sequence, which is ~100 kb in size. Herman et al. recently published data showing that the sequence hurdle for this gene is worth the effort. Through next-generation sequencing, they identified a truncating TTN mutation in ~25% of familial cases of idiopathic dilated cardiomyopathy, moving TTN to the forefront of genes involved in this form of the disease. Although these mutations had very high penetrance after age 40 in familial cases, there is also a significant amount of TTN variation whose clinical significance is difficult to interpret at this time. This includes missense variation, which was not analyzed in this current paper, so its role in cardiomyopathy is unclear. Even with truncating mutations in TTN, interpretation is not always simple; these mutations were identified, albeit at lower frequency, in control individuals and in individuals with hypertrophic cardiomyopathy who also had a pathogenic mutation in a known disease gene.

Herman et al. (2012) NEJM 366, 619–628.

A Complex Balance

Perhaps it is not surprising that the more closely you look at something, the more you see. Certainly, the advent of whole-genome comparative genomic hybridization (CGH) arrays taught us that many people with normal G-banded karyotypes have cytogenetic aberrations when we look more closely. Even high-resolution CGH arrays don’t give us a complete picture of chromosomes, as recently illustrated by Chiang et al. These investigators took a set of individuals who had apparently balanced chromosome translocations (at least based on G-banding and whole-genome CGH arrays), and they analyzed the breakpoints at the nucleotide level. What they found was an unexpectedly high level of complexity to the breakpoints. In almost 20% of cases, three or more breakpoints were involved, but in some cases, a shockingly complex interweaving of segments occurred, akin to what was recently described in cancer cells as “chromothripsis,” or chromosome shattering and reorganization. The cases analyzed by Chiang et al. involved upward of ten breakpoints with inverted segments interspersed among segments of the expected orientation. This phenomenon is not limited to spontaneous rearrangements in humans; analysis of transgene insertions in mice and in sheep revealed that the sites of integration can be similarly complex.

Chiang et al. (2012) Nat. Genet. Published online March 4, 2012. 10.1038/ng.2202.

Good News for Men

The Y chromosome is just a degenerate of its former autosomal self that is on its way to extinction, or so some have proposed. If you compare the Y to the X chromosome, for instance, the Y has lost many of the genes that the chromosomes once shared, and without a companion chromosome with which to fully pair itself during meiosis, some think this sex-specific chromosome is doomed. David Page argues otherwise. His group does species comparisons of the Y chromosome in order to understand its evolution and to better predict the future fate of the Y. Page’s group previously compared the human to the chimpanzee Y chromosome, which diverged about six million years ago, but, in order to look at a much longer evolutionary window, his group recently compared the human and rhesus macaque Y chromosomes, which diverged 25 million years ago. This comparison yielded a surprising level of evolutionary stability on the Y. In the majority of the male-specific regions of the Y chromosome, rhesus macaques and humans share the same ancestral genes, arguing for Y chromosome stability over the long haul. In only a very restricted segment of the Y has gene loss occurred in humans since the split from the Old World monkeys. Their data fit a model in which rapid degeneration of segments on Y was followed by marked slowing of this decay and chromosome stabilization. Don’t count the Y out just yet; it looks like it may stick around a while.

Hughes et al. (2012) Nature 483, 82–86.

Enhancers Acting as Promoters

Just as we learn to group letters into words and bin words into different parts of speech in order to extract meaning from sentences, we try to interpret genome sequences by picking out the nucleotide sets that comprise genes and attempting to recognize the regulatory elements from strings of As, Cs, Gs, and Ts. But although we might think
  4. 4. we understand what a particular type of genetic element does, recognition of one of its roles in gene expression sometimes doesn’t tell the whole story. Take enhancers, for instance. These are well-studied cis elements that have a simple job: they bind transcription factors and enhance expression from gene promoters, hence their name. Kowalczyk et al. wondered whether that’s all enhancers do, and they ended up with evidence that intragenic enhancers can also act as alternative tissue-specific promoters. The resulting mRNAs are spliced and polyadenylated but do not appear to be translated into protein. Because enhancers are much more common than classic promoters and because about half of enhancers are intragenic, this promoter-like activity could contribute substantially to the complexity of the mammalian transcriptome. The next step is to figure out how these untranslated transcripts are used.

Kowalczyk et al. (2012) Mol. Cell 45, 447–458.

A Common Turn-On

While we’re on the subject of surprising roles for noncoding elements, a recent paper uncovered the coordinated regulation of two neighboring, but nonparalogous, genes that both tie into an identical phenotype. Joe Gleeson’s group focuses on ciliopathies, and they recently identified mutations in TMEM216 at the JBTS2 locus that cause Joubert syndrome. Of the ten JBTS2-linked families, however, only about half had a TMEM216 mutation, even though their phenotype was identical to that of the mutation-containing families. When they resequenced the JBTS2 locus, they found mutations in a neighboring gene, TMEM138, that is not related to TMEM216, although it also encodes a transmembrane protein. Although your first thought might be that TMEM138 simply contains a regulatory element for TMEM216, this is not the case. Rather, both genes are coordinately expressed via the action of an intergenic element, and they both encode proteins involved in the same process, ciliogenesis. Knockdown of either protein leads to defective ciliogenesis, which ultimately is central to the Joubert syndrome phenotype. Thus, despite the fact that the genes are very different, they have evolved a system of coordinated regulation and functional relatedness.

Lee et al. (2012) Science 335, 966–969.

This Month in Our Sister Journal

Yeast System for Characterization of Cystathionine-Beta-Synthase Mutations

Although we know that individuals with deficiency of cystathionine-beta-synthase (CBS) tend to have intellectual disability, a marfanoid habitus, ectopia lentis, and increased risk of thromboembolism, there is variable expressivity for this disorder, and it is difficult to predict outcome from genotype. Dietary protein and methionine restriction is the central approach to management, and supplementation with vitamin B6, a cofactor of CBS, can lead to further reductions in homocystine levels in some affected individuals, who tend to have milder disease. To address the challenge of genotype-phenotype correlations in CBS deficiency, Mayfield et al. used a yeast system to characterize the function of all 84 CBS missense alleles that had been documented as of 2010. This system, in which the yeast ortholog of CBS is replaced by human alleles, allows them to assess the general level of function, as well as the responsiveness of each allele to vitamin B6 and to another cofactor, heme. The authors also propose that glutathione deficiency should be further explored in the context of CBS deficiency, because they noted reduced glutathione production in their system when CBS function was disabled.

Mayfield et al. (2012) Genetics. Published online January 20, 2012. 10.1534/genetics.111.137471.
  5. 5. REVIEW

Fragile X and X-Linked Intellectual Disability: Four Decades of Discovery
Herbert A. Lubs,1 Roger E. Stevenson,1,* and Charles E. Schwartz1

X-linked intellectual disability (XLID) accounts for 5%–10% of intellectual disability in males. Over 150 syndromes, the most common of which is the fragile X syndrome, have been described. A large number of families with nonsyndromal XLID, 95 of which have been regionally mapped, have been described as well. Mutations in 102 X-linked genes have been associated with 81 of these XLID syndromes and with 35 of the regionally mapped families with nonsyndromal XLID. Identification of these genes has enabled considerable reclassification and better understanding of the biological basis of XLID. At the same time, it has improved the clinical diagnosis of XLID and allowed for carrier detection and prevention strategies through gamete donation, prenatal diagnosis, and genetic counseling. Progress in delineating XLID has far outpaced the efforts to understand the genetic basis for autosomal intellectual disability. In large measure, this has been because of the relative ease of identifying families with XLID and finding the responsible mutations, as well as the determined and interactive efforts of a small group of researchers worldwide.

Introduction

Mutations resulting in X-linked intellectual disability (XLID) have been described in 102 genes (Table S1, available online).[1] This work was accomplished over a 40 year period during which the term X-linked mental retardation was widely used; however, we will use intellectual disability (ID), which is emerging as the preferred terminology. Mutations in these 102 genes are responsible for 81 of the known 160 XLID syndromes and over 50 families with nonsyndromal XLID (Table S1 and Figures 1 and 2). An additional 30 XLID syndromes and 48 families with nonsyndromal XLID have been regionally mapped (Table 1 and Figures 2 and 3), but the genes have not yet been identified.
Forty-four XLID syndromes, which remain unmapped, have also been described (Table S2). Fewer than 400 autosomal genes in which mutations resulted in ID have been identified. Of 1,640 references to ID in OMIM (as of March 2010), 316 are entities on the X chromosome. Three comparably sized chromosomes (6, 7, and 8) show 50, 58, and 60 references, respectively. Several authors have recently discussed the possibility that these striking differences might result from a relative concentration of genes that influence intelligence on the X chromosome.[2,3]

Identification of the mutations in 102 genes that cause XLID has been accomplished primarily through long-term, planned and coordinated studies from the United States, Europe, and Australia. These studies took advantage of the power of pedigrees of relatively large families to assign putative genes to the X chromosome, linkage analysis to achieve regional localizations, accumulation and sharing of large data banks of clinical details and specimens, registries of pertinent X chromosomal translocations and abnormalities, stored samples from a variety of populations around the world with ID, and effective communication between numerous investigators. In this setting, the continuously developing technologies were applied and reapplied to the available clinical and specimen banks effectively and rapidly. A comparable systematic approach to autosomal ID has not been carried out.

Publication of the first family with the marker X,[4] later renamed the fragile X (MIM 300624),[5] gave an important impetus to the field by providing a laboratory tool which clearly identified the most prevalent XLID syndrome. A series of biennial international meetings on fragile X syndrome and XLID, beginning in 1983, involved about 100 investigators and provided a sense of unity and progress to the field.
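The OMIM counts quoted in this section can be put on a common scale with a few lines of arithmetic. The sketch below is illustrative only: it uses just the four counts given in the text, and the variable and function names are hypothetical rather than taken from the review.

```python
# Counts quoted in the text: OMIM references to ID as of March 2010.
total_id_refs = 1640
refs_by_chromosome = {"X": 316, "6": 50, "7": 58, "8": 60}


def share_of_total(count, total=total_id_refs):
    """Return `count` as a percentage of all ID references."""
    return 100.0 * count / total


# Mean of the three comparably sized autosomes named in the text.
autosome_counts = [refs_by_chromosome[c] for c in ("6", "7", "8")]
autosome_mean = sum(autosome_counts) / len(autosome_counts)

# How many times more ID references the X carries than that mean.
x_fold_excess = refs_by_chromosome["X"] / autosome_mean

for chrom, count in refs_by_chromosome.items():
    print(f"chromosome {chrom}: {count} references ({share_of_total(count):.1f}%)")
print(f"X versus mean of chromosomes 6, 7, 8: {x_fold_excess:.1f}-fold")
```

On these figures the X chromosome accounts for roughly 19% of the ID references, between five and six times the mean of the three autosomes. The comparison says nothing about ascertainment, which the review itself names as a reason XLID has been easier to study.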
Papers and abstracts from these meetings and from other research were published (usually biennially) as conference reports, special issues or updates on XLID from 1984 to 2008.[6–16]

The focus of this review will be the discovery process rather than the details of the clinical or molecular findings in the individual XLID entities. Readers are referred to the recently updated excellent review of the fragile X in OMIM (MIM 300624) and OMIM entries on other XLID disorders as detailed in Tables S1 and S2. Other reviews of different aspects of XLID include the periodic XLID updates from 1984 to 2008, an Atlas of XLID Syndromes,[1] and a number of commentaries by individual investigators.[3,17–22]

1 Greenwood Genetic Center, JC Self Research Institute of Human Genetics, 113 Gregor Mendel Circle, Greenwood, SC 29646, USA
*Correspondence: res@ggc.org
DOI 10.1016/j.ajhg.2012.02.018. ©2012 by The American Society of Human Genetics. All rights reserved.
The American Journal of Human Genetics 90, 579–590, April 6, 2012

XLID before Fragile X

The prelude to the current cytogenetic and molecular era covered a century (1868–1968). It encompassed descriptions of a number of clinically defined entities (Pelizaeus-Merzbacher disease [MIM 312080], Duchenne muscular dystrophy [MIM 310200], incontinentia pigmenti [MIM 308300], Goltz focal dermal hypoplasia [MIM 305600], Lenz microphthalmia syndrome [MIM 309800]), inborn errors of metabolism (Hunter syndrome [MIM 309900], Lowe syndrome [MIM 309000], Lesch-Nyhan syndrome [MIM 300322]), and large pedigrees in which ID segregated with an X-linked pattern.[23–28] During the same period, the excess of males among persons with ID was observed in
  6. 6. census surveys and other population studies.[29–31] The magnitude of the male excess varied from study to study but averaged about 30 percent and was found in nearly all studies. These two observations (the excess of males among persons with ID and clinical syndromes or families with ID that segregated with an X-linked pattern) provided compelling evidence that genes on the X chromosome were important contributors to the overall causation of ID and, hence, of individual, familial, and societal significance. By virtue of having but a single X chromosome, the male’s genome was uniquely vulnerable and that vulnerability extended to brain development and function as well as to other systems.

Further insights during this early period of time were that XLID comprised syndromal entities (ID plus somatic, metabolic, or neuromuscular manifestations) and nonsyndromal entities (ID alone or with inconsistent abnormalities). It also became clear that some females in XLID pedigrees had intellectual limitations, albeit with neither the consistency nor the severity of males. Technological limitations (lack of tools for linkage analysis and gene isolation) precluded a more precise genetic characterization of XLID disorders and delayed the clinical delineation.

[Figure 1. Genes with Identified Mutations that Cause Syndromal XLID with Chromosomal Band Location]

The Setting of the Initial Observation of the Marker X

In 1966, when a one-year-old boy and his brother were referred to the Yale chromosome laboratory for study because of delayed development, medical cytogenetics was in a period of transition. The major trisomies as well as translocations and large deletions had been defined by nonspecific orcein or Giemsa staining. Prenatal cytogenetic diagnosis had begun and in order to provide more predictive developmental information to families, there was a need for both better, less biased clinical information about X and Y aneuploidy and the several types of smaller variations in the short arms of the acrocentric chromosomes and variant heterochromatic regions on 1, 9, 16, and Y. The Yale laboratory had selected a minimal medium (199) for both routine diagnostic studies and for a year-long study of 4,500 consecutive cord blood and 500 maternal samples. Special attention was given to breaks, gaps, and chromosome variants in the year-long study. The study also sought to identify cytogenetic markers
  7. 7. that might correlate directly with clinical conditions.[32] Thus, the initial observation that the two brothers referred to the laboratory because of ID had a consistent chromatid break or constriction in the distal long arm of a large C group chromosome was very pertinent to the research goals of the laboratory. Further study revealed that their normal mother and two maternal relatives with ID (an uncle and great uncle of the boys) had the same marker X chromosome. The pedigree was, of course, consistent with X-linked ID. Studies with H3 thymidine showed that the late replicating, large C group chromosome was the same as the chromosome with the apparent breaks and secondary constrictions. The data led to the conclusion that “either the secondary constriction itself or a closely linked recessive gene may account for the pattern of X-linked inheritance”.[4] This was, in fact, probably the first precise localization of a gene associated with human disease. The fragile X locus was subsequently defined as an uncoiled region (secondary constriction) by electron microscopy.[33]

[Figure 2. Location of Genes with Mutations that Cause Nonsyndromal XLID. Twenty-two genes shown on the left of the chromosome with solid arrows cause nonsyndromal XLID only. Numbers in parentheses adjacent to the gene symbols are assigned MRX numbers. Seventeen genes shown on the right of the chromosome with open arrows cause both syndromal and nonsyndromal XLID. *MRX64 is due to a dup MECP2. **MRX17 and MRX31 are due to dup HUWE1 and 2 adjacent genes.]

Table 1. Nonsyndromal XLID Families (MRX1–MRX95) with Linkage or Gene Identification (a)

MRX  Gene or linkage      MRX  Gene or linkage      MRX  Gene or linkage
1    IQSEC2               33   ARX                  65   Xp11.3-q21.33
2    Xp22.1-p22.3         34   del IL1RAPL1         66   Xq21.33-q23
3    HCFC1                35   Xq21.3-q26           67   Xq13.1-q21.31
4    Xp11.22-q21.31       36   ARX                  68   ACSL4
5    Xp21.1-q21.3         37   Xp22.31-p22.32       69   Xp11.21-q22.1
6    Xq27                 38   ARX                  70   Xq23-q25
7    Xp11.23-q12          39   Xp11                 71   Xq24-q27.1
8    DLG3                 40   Xq21                 72   RAB39B
9    FTSJ1                41   GDI1                 73   Xp22-p21
10   Xp11.4-p21.3         42   Xp11.3-q13.1; Xq26   74   Xp11.3-p11.4
11   Xp11.22-p21.3        43   ARX                  75   Xq24-q26
12   THOC2                44   FTSJ1                76   ARX
13   Xp22.3-q22           45   ZNF81                77   Xq12-q21.33
14   Xp21.2-q13           46   ARHGEF6              78   Xp11.4-p11.23
15   Xp22.1-q12           47   PAK3                 79   MECP2
16   MECP2                48   GDI1                 80   Xq22-q24
17   dup HUWE1            49   CLCN4                81   Xp11.2-q12
18   IQSEC2               50   Xp11.3-p11.21        82   Xq24-q25
19   RPSKA3               51   Xp11.23-p11.3        83   Not published
20   Xp21.1-q23           52   Xp11.21-q21.32       84   Xp11.3-q22.3
21   IL1RAPL1             53   Xq22.2-q26           85   Xp21.3-p21.1
22   Xp21.1-q21.31        54   ARX                  86   Not published
23   Xq23-q24             55   PQBP1                87   ARX
24   Xp22.2-p22.3         56   Xp21.1-p11.21        88   AGTR2
25   Xq27.3               57   Xq24-q25             89   ZNF41
26   Xp11.4-q23           58   TM4SF2               90   DLG3
27   Xq24-q27.1           59   AP1S2                91   ZDHHC15
28   Xq27.3-qter          60   OPHN1                92   ZNF674
29   ARX                  61   Xq13.1-q25           93   BRWD3
30   PAK3                 62   UPF3B                94   GRIA3
31   dup HUWE1            63   ACSL4                95   MAGT1/OSTb
32   ARX                  64   dup MECP2

(a) Mutations in NLGN4, CDKL5, KDM5C, FGD1, SLC16A2, ATRX, AFF2 and SLC6A8 have been found in other families with nonsyndromal XLID.

Studies from a number of laboratories would provide a more precise confirmation and molecular characterization
  8. 8. of the location in the ensuing decade34–36 and identifica- tion of the gene itself in 1991.37–40 In addition, the juxtaposition and timing of the family study and the population survey permitted us to look for the marker X in 5,000 individuals and over 30,000 cells and to conclude tentatively that it was not a common marker or variant because not even one marker X cell was observed. Another family with a similar chromosomal appearance at distal 16q was also ascertained in this same interval. This was inherited in an autosomal-dominant manner and not associated with a disease. We were, there- fore, able to make the preliminary conclusion that such markers did not necessarily indicate disease but that the marker X was a significant clinical marker for a Mendelian disease and hence a new and useful tool. Observations in the 1970s and 1980s More complex and folic-acid-enriched media become popular during the 1970s and presumably made detection of the fragile X increasingly difficult. Most early studies gave variable results and were not published. The initial report was confirmed by Giraud et al.34 and Harvey et al.35 These articles and the report by Sutherland36 estab- lished that folic acid in the culture media prevented the expression and detection of the fragile X. During the 1980s it became clear that a majority of XLID families did not have fragile X, and the identification and study of large non-fragile X XLID families with linkage analysis began in earnest. Large scale studies began across the globe at this time. The results summarized in Table 1, Tables S1 and S2, and Figures 1, 2, and 3 are, therefore, based on about 20 years of clinical and molecular studies. Methodologies Quicken the Pace of Gene Discovery Besides the cytogenetic methods used in the diagnosing and confirmation of fragile X, a number of strategies have been utilized to identify XLID genes (Table S1 and Figures 1 and 2). 
Figure 3. Approximate Linkage Limits for XLID Syndromes for which the Genes Have Not Been Identified
Syndromes labeled in the figure: Aicardi; Bertini; Dessay; CMT, Ionasescu variant; Prieto; XLID-blindness-seizures-spasticity; Wieacker-Wolff; Miles-Carpenter; Goldblatt spastic paraplegia; XLID spastic paraplegia, type 7; XLID-macrocephaly-macroorchidism; Abidi; Shrimpton; XLID-telecanthus-deafness; XLID-hypogammaglobulinemia; Ahmad; MRXS7; CMT, Cowchock variant; XLID-panhypopituitarism; Christian; XLID-coarse facies; Vitale: aphasia-coarse facies; Gustavson; Craniofacioskeletal; Hypoparathyroidism, X-linked; Armfield; Waisman-Laxova; Hereditary bullous dystrophy; XLID-microcephaly-testicular failure.

Prior to 1990, these were limited to the pursuit of genes in cases where the gene products (enzymes in all cases: HPRT [MIM 308000], PGK1 [MIM 311800], OTC [MIM 311250], and PDHA1 [MIM 300582]) were known, the molecular pathway was known (PLP [MIM 300401]), or a chromosome aberration had localized the candidate region (DMD [MIM 300377]). Over the next decade and a half, exploitation of chromosome rearrangements and linkage coupled with candidate gene testing dominated the field. In the past several years, X chromosome sequencing, microarrays (expression and genomic), and exploration of molecular pathways have added to the range of technologies available for XLID gene identification. Five of the first seven gene identifications were accomplished with a combination of known metabolic pathways and tissue culture studies in families with inborn errors of metabolism (Figure 4). The first identification, Lesch-Nyhan syndrome due to mutations in HPRT, was reported in 1983 (ref. 41) and the most recent was the creatine
  9. 9. transporter syndrome (MIM 300352) due to mutations in SLC6A8 [MIM 300036].42 Mutations in seven genes were identified by this methodology. Two workhorse approaches have been responsible for the great majority of subsequent gene identifications. The first of these, based on the ascertainment of a patient with both ID and a chromosomal rearrangement involving the X chromosome, was used successfully in identifying the gene associated with Duchenne muscular dystrophy in 1987. A total of 31 genes (Table S1 and Figure 4) had been identified by the middle of 2011 with this approach. The second and most productive ‘‘workhorse’’ approach, linkage study of XLID families followed by molecular analysis of appropriate candidate genes, was employed initially by a number of investigators in detecting and characterizing FMR1 (MIM 309550). Subsequently, its use has resulted in the identification of 43 mutant X genes. With increasing ease of sequencing, the pace of gene iden- tification by this route accelerated after 2003, as shown in Table S1 and Figure 4. The availability of brute force sequencing capability after completion of the Human Genome Project has brought an additional effective method of gene identification, and 21 have been reported since 2006 (Table S1 and Figure 4). Whether sequencing of large series of sporadic males, male siblings, or families with clear XLID will prove to be the most effective use of this resource remains to be deter- mined. The selection of pedigree-based subjects for sequencing, however, has the advantage that segregation of gene alterations can be tested. Since this approach often permits a relatively straight-forward path to gene identifi- cation, continued collection of both clinical data and blood samples remains important. Exploitation of a specific molecular finding has accounted for four gene identifica- tions (FANCB [MIM 300515], PORCN [MIM 300651], SMC1A/SM1L1 [MIM 300040], NDUFA1 [MIM 300078]). 
Figure 4. The Year and Methodology Used to Identify Genes Associated with XLID
The following abbreviations are used: Exp-Arr = expression microarray; MCGH = genomic microarray; X-seq = gene sequencing; Mol-Fu = follow up of a known molecular pathway; L-can = candidate gene testing within a linkage interval; Chr-rea = positional cloning based on a chromosome rearrangement; Met-Fu = follow up of a known metabolic pathway.

Two other new technologies, expression array and array-comparative genomic hybridization, have, surprisingly, been applied successfully in only two and one instance, respectively. Expression array was used in combination with two other methods to discover the role of GRIA3 (MIM 305915) and PTCHD1 (MIM 300828) in ID. Array-CGH was used in the isolation of the mutant gene in one nonsyndromal family (HUWE1 [MIM 300697]).43 Many potentially valuable combinations of array technologies for screening followed with brute force sequencing can
  10. 10. be envisioned. Detection of a consistent up- or downregulation or other abnormality in two or more XLID family members could, for example, be a fruitful approach to the selection of subjects for partial or complete X sequencing. Two or more approaches were used in combination in six instances among the 102 gene identifications shown in Table S1 and Figure 1 (FMR1, MID1 [MIM 602148], SOX3 [MIM 313430], HUWE1, CASK [MIM 300172], and GRIA3). The application of CGH and related methods in conjunction with a variety of molecular technologies has increasingly been used to detect duplications and deletions of genes associated with XLID (Figure 5).1,43–56

In spite of the identification of mutations in 102 genes that result in XLID, the fragile X syndrome continues to be by far the most frequent XLID syndrome. Whether the gradual but continuous expansion of the number of triplet repeats in the large bank of premutation carriers (whose frequency varies from 1/113 in Israel to 1/313–382 in the United States) plays a role in maintaining its relatively high gene frequency is unknown.57

Lumping, Splitting, and Reclassification Based on Gene Discovery: A Model for Future Research

Given the variability and imprecision with which clinical evaluations are carried out, it is inevitable that some individuals with X-linked ID will be incorrectly included in existing diagnostic categories, whereas others will be incorrectly excluded. The extent to which individuals and families can be evaluated is dependent on the setting, access to historical information, availability and ages of affected and nonaffected family members, and the experience and expertise of the observers. Differences in phenotype can result from mutations in different domains of a gene and from contributions of the remainder of the genome.
The identification of mutations in many genes associated with XLID has provided the opportunity to compensate for some of these variables, resulting in the lumping of entities previously considered to be separate and the splitting of other entities previously considered the same. In addition, the phenotypic limits of some XLID entities were established with some degree of objectivity. Several XLID entities have been most instructive. Dis- covery that mutations in ATRX (MIM 300032) (Xq21.1) cause alpha-thalassemia ID allowed testing of large number of males with hypotonic facies, ID, and other features.58–60 Currently, as shown in Table S1, four other named XLID syndromes (Carpenter-Waziri, Holmes- Gang, XLID-Hypotonia-Arch Fingerprints, and Chudley- Lowry syndromes [MIM 309580]) have been found to be allelic variants of alpha-thalassemia ID as have certain families with spastic paraplegia and nonsyndromal XLID.1,61–65 One family clinically diagnosed as Juberg- Marsidi syndrome was found to have an ATRX muta- tion.66,67 This is now known to be based on misdiagnosis of Juberg-Marsidi syndrome (MIM 300612); indeed, the original family with this syndrome has a mutation in HUWE1 at Xp11.22 (Friez et al., 2011, 15th International Workshop on Fragile X and Other Early-Onset Cognitive Disorders). One family clinically diagnosed as Smith- Fineman-Myers syndrome was also found to harbor an ATRX mutation, but the gene has not been analyzed in the original family.68–70 A clinically similar condition, Coffin-Lowry syndrome (MIM 303600), was found to be separate from alpha-thalassemia ID and due to mutations in RPS6KA3 (MIM 300075), which encodes a serine-threo- nine kinase.71 Kalscheuer et al.72 found mutations in PQBP1 (MIM 300463) (Xp11.2) in two named XLID syndromes – Suther- land-Haan syndrome (MIM 309470) and Hamel cerebropa- latocardiac syndrome (MIM 309500)—in MRX55 and two other families with microcephaly and other findings. 
Figure 5. Location of Segmental Duplications Associated with Syndromal or Nonsyndromal XLID43–56
Duplications labeled in the figure are those reported by Wagenstaller et al.,54 Horn et al.,50 Gijsbers et al.,49 Whibley et al.,55 Froyen et al.,44 Froyen et al.,45 Bedeschi et al.,48 Koolen et al.,46 Mimault et al.,51 Woodward et al.,56 Solomon et al.,53 Van Esch et al.,47 Friez et al.,43 and Rio et al.52

Lenski et al.,73 Stevenson et al.,74 and Lubs et al.75 added Renpenning, Porteous, and Golabi-Ito-Hall syndromes to the list of XLID syndromes caused by mutations in PQBP1.73–75 The six phenotypes attributed to mutations in PQBP1 are now summarized in the allelic variants of OMIM 300463. As with the ATRX phenotypes, a wide variety of phenotypic expressions results from different mutations in PQBP1, and we remain challenged to better understand the molecular and developmental mechanisms leading to these differences.

Mutations in ARX (MIM 300382) (Xp22.2) were also found to be an important cause of XLID encompassing
  11. 11. multiple phenotypes. Alterations, most commonly a 24 bp expansion of a polyalanine tract, were found in a number of families with nonsyndromal XLID (MRX29, 32, 33, 36, 38, 43, 54, and 76), an X-linked dystonia (Partington syndrome [MIM 309510]), X-linked infantile spasms (MIM 308350) (West syndrome), X-linked lissencephaly with abnormal genitalia (MIM 300215), hydranencephaly and abnormal genitalia (MIM 300215), and Proud syndrome (MIM 300215).76–83 Perhaps the most prominent example of syndrome split- ting is FG syndrome (MIM 305450). This syndrome, initially described in 1974 by Opitz and Kaveggia,84 is manifest by macrocephaly (or relative macrocephaly), downslanting palpebral fissures, imperforate anus or severe constipation, broad and flat thumbs and great toes, hypotonia, and ID. In the ensuing years, the manifes- tations attributed to FG syndrome have become protean, but none was pathognomonic or required for the diagnosis.85–88 As a result, a number of different localiza- tions on the X chromosome were proposed for FG syndrome.89–95 In 2007, Risheg et al.96 found a recurring mutation, c.2881C>T (p.Arg961Trp), in MED12 (MIM 300188) in six families with the FG phenotype, including the original family reported by Opitz and Kaveggia.84 In addition to the above noted manifestations, two other findings, small ears and friendly behavior, were consistently noted. Although most individuals who have carried the FG diagnosis have one or more findings that overlap with those in FG syndrome, they do not have MED12 muta- tions.97,98 Some have been found to have mutations in other X-linked genes (FMR1, FLNA [MIM 300017], ATRX, CASK, and MECP2 [MIM 300005]), whereas others have duplications or deletions of the autosomes.97 So great is the currently existing heterogeneity within FG syndrome that the vast majority of individuals so designated should best be considered to have ID of undetermined cause. 
In a number of instances, certain gene mutations have been associated with nonsyndromal XLID, whereas other mutations within the same genes have caused syndromal XLID. Mutations in 17 genes that may cause either type of XLID, depending on the mutation, have been identified (Figure 2). In some cases (e.g., those with OPHN1 [MIM 300127] and ARX mutations) re-examination has found syndromal manifestations in families previously consid- ered to have nonsyndromal XLID.79,99,100 The frequency with which the process of lumping and splitting in this limited field of investigation has occurred has been extremely instructive to both clinical and molec- ular investigators. Moreover, the process of reclassifying and refining the XLID syndromes in light of the gene iden- tifications may be one of the most important contributions by medical genetics to clinical medicine. The underlying mechanisms or pathways by which mutations in different genes result in similar phenotypes and different mutations in a single gene result in disparate phenotypes, however, remain to be fully elucidated. Improved Understanding of Disease Mechanisms in XLID Disorders Analysis of the presently known 102 genes associated with XLID lends some insight into the numerous molecular functions in which disruption can lead to cognitive impairment and impaired brain development.17 Three major functions are almost equally represented in proteins encoded by this panel of 102 genes: 22% are involved in regulation of transcription, 19% in signal transduction, and 15% in metabolism. Additionally, 15% are compo- nents of membrane-associated functions. The remainder are equally distributed (~3%–5%) in seven other cellular functions: cytoskeleton, RNA processing, DNA metabo- lism, protein synthesis, ubiquitinization, cell cycle, and cell adhesion. 
Regarding their localization within a cell, the proteins encoded by genes associated with XLID are almost equally distributed among the four major subcellular fractions: 30% in the nucleus, 28% in the cytoplasm, 18% in the membranes, and 16% in cellular organelles.17

The XLID disorders offer many opportunities for understanding the functions of specific genes and their interactions with other genes in producing disease. Studies involving control of gene expression will necessarily be especially complex. These have just begun, in part because of their complexity and the rapid development of new techniques. Only recently, for example, has a preliminary expression microarray analysis been carried out in two affected fragile X males.101 The study identified over 90 genes with a greater than 1.5-fold change in expression. Overrepresented genes were involved in signaling (both under- and overexpression), morphogenesis (underexpression), and neurodevelopment and function (overexpression). Although not addressed in this study, the possibility that a hallmark finding in the fragile X syndrome, enlargement of the testes, might result from altered control of tubular growth by a specific target gene is intriguing. One of the 90 genes identified, NUT (nuclear protein in testis [MIM 608963]), which is normally expressed only in the testis, should be a candidate gene in future studies because the BRD4-NUT fusion oncogenes are critical growth promoters in certain aggressive carcinomas.102 Alternatively, a more general growth-controlling gene might also explain the prognathism, macrocephaly, and large hands that occur in some individuals with the fragile X syndrome.
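The 1.5-fold criterion described above can be illustrated with a minimal sketch. It assumes that fold change is the simple ratio of mean expression in affected versus control samples and that underexpression is treated symmetrically (ratio below 1/1.5); the cited study's exact criteria are not given here, so both choices are assumptions. The gene labels and values are invented placeholders, not data from that study.

```python
from typing import Dict, Tuple


def classify_fold_change(control: float, affected: float,
                         threshold: float = 1.5) -> str:
    """Label a gene by its affected/control expression ratio.

    A ratio above `threshold` is called overexpressed; a ratio below
    1/threshold is called underexpressed (assumed symmetric criterion).
    """
    if control <= 0 or affected <= 0:
        raise ValueError("expression values must be positive")
    ratio = affected / control
    if ratio > threshold:
        return "overexpressed"
    if ratio < 1.0 / threshold:
        return "underexpressed"
    return "unchanged"


# Hypothetical (control, affected) mean expression values; placeholders only.
example_genes: Dict[str, Tuple[float, float]] = {
    "GENE_A": (100.0, 180.0),  # ratio 1.8
    "GENE_B": (100.0, 60.0),   # ratio 0.6
    "GENE_C": (100.0, 120.0),  # ratio 1.2
}

calls = {name: classify_fold_change(c, a)
         for name, (c, a) in example_genes.items()}
for name, call in calls.items():
    print(name, call)
```

In practice, microarray analyses usually work with log-transformed intensities and add a statistical test on top of a fold-change cutoff; the ratio form is used here only because it maps directly onto the phrase "greater than 1.5-fold".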
Studies directed at understanding the mechanisms underlying recurring clinical problems in XLID disorders such as short stature, microcephaly or macrocephaly, autistic behavior, and structural CNS abnormalities103 are also particularly appealing because they provide an opportunity both to simultaneously understand critical pathways, such as in dendrite development and the devel- opment of XLID structural abnormalities, gene expression, and phenotype. The association of autism spectrum dis- order with mutations in at least eight of the 102 genes listed in Table S1 is of particular current interest. This has been reported most frequently in the fragile X syndrome and Rett syndrome but also in disorders resulting from The American Journal of Human Genetics 90, 579–590, April 6, 2012 585
  12. 12. mutations in NLGN3 (MIM 300336), NLGN4 (MIM 300427), RPL10 (MIM 312173), RAB39B (MIM 300774), PTCHD1, and MED12. These genes, however, affect a wide range of functions (Table S1), and the cause of the clinical overlap is not clear. In nonsyndromal XLID, for example, mutations have been identified in five genes involved in the RhoGTPase cycle that affect dendritic outgrowth (OPHN1, PAK3 [MIM 300142], ARHGEF6 [MIM 300267], TM4SF2 [MIM 300096], and GDI1 [MIM 300104]) and are central to the development of the nonsyndromal pheno- type.1,17,104 The limited imaging and direct studies of macrocephaly, microcephaly, and cerebellar hypoplasia have recently been summarized,104 but more extensive application of anatomical and functional brain imaging and spectros- copy techniques that can identify variations in specific brain regions for each disorder, in conjunction with both clinical observations and psychometric studies, is critically needed. Detection of Possible Advantageous Cognitive and Behavioral Genes The identification of 102 X-linked genes affecting intelli- gence has raised the probability that X chromosomal genes (including XLID genes) might play a particularly impor- tant role in brain structure and function as well as a specific role in intelligence and certain cognitive abilities. Clearly, as discussed at the beginning of this paper, the research planned and carried out to identify XLID genes and syndromes over the last several decades might account for part or even all of this relative excess compared to auto- somal loci. 
A number of papers, however, have addressed the issue of active selection during evolution for X chromosomal localization of important brain and cognitive genes.2,105,106 The finding that human and mouse X chromosome genes are hyperexpressed in the CNS compared to autosomal genes provided additional important confirmatory data for the hypothesis of positive evolutionary selection.107 These studies showed not only that there was a doubling of X chromosome expression (compared to autosomes) early in development (leading to dosage compensation), but also that overexpression in human CNS tissue and in mouse CNS tissue increased by 2.8× and 2.5×, respectively, compared to expression in somatic tissues. These observations also support the general idea that X genes are particularly important for brain development and function. Mutations significantly improving intellectual, creative, perceptive, and leadership qualities would be fully expressed in males and reasonably could have been positively selected for in a relatively short period of time, in contrast to the negative selection for XLID mutations.108–112 In essence, the XY males may have been the experimental animal and the XX female, the storage facility for both advantageous and deleterious mutations.

Medical investigations generally focus on adverse effects, and no organized searches for X-linked pedigrees with particularly high intellectual or special cognitive talents have been reported. Thus, the same approach that has been effective in identifying XLID syndrome genes, investigating families with an X-linked pattern of intellectual outliers, might also prove rewarding for studies at the other end of the intellectual spectrum.
What if we selected for families with an X-linked pattern of high intellectual accomplishment; special talents in art or music; unique types of cognitive behavior involving memory, problem solving, or, indeed, any type of special intellectual accomplishment such as Nobel awards in Economics or Physics? Such families will certainly be uncommon, but so are most XLID disorders. Yet families might be identified if academicians asked the pertinent family history questions during lunch with colleagues, a dedicated, interactive home page was available, or notices were placed in journals asking for information about possible families. The same group of laboratories that contributed to the data in Table S1 would be logical sources for referral and molecular studies because the necessary cognitive and molecular studies are already in place. A positive result might even be more important to society than XLID disease description and provide important insight into human evolution.

Although there is a wide array of pertinent cognitive tests, these were not designed to detect specific familial talents. The coapplication of a pedigree analysis with pertinent laboratory tests should provide sufficiently precise initial diagnosis of the affected to carry out linkage and array or other screening tests successfully. One family with four to five outstanding individuals over several generations could provide sufficient data to warrant testing other families (or even other species) and to begin an identification process similar to that described in this paper that has proven successful for XLID. Imagine the prospects for investigating specific gene-environmental interactions during learning and development!

Why, other than not having looked seriously, have we not stumbled upon such families? Perhaps we have.
In the Inaugural Book of the new National Museum of the American Indian, Native Universe, Voices of Indian America,113 in which tribal leaders, writers, scholars, and story tellers describe Indian traditions and heritages, the following is recounted:

“Story tells us that a group split from the Lenni Lenape, perhaps a thousand years ago or more. The people then settled on the Eastern Shore of the Chesapeake, and were one and the same as the Nanticoke. Then, for some reason, the first Tayac, Uttapoingassenum, led his people to the other side of the bay. Upon their arrival, they encountered peoples who had been living on the land for more than 8,000 years, according to various archeological estimates. For thirteen generations prior to English settlement, as told to Jesuit and Moravian missionaries, the Tayac’s inheritance passed from brother to brother and then to the sister’s sons. Each led the people until his death.”

The possibility that the Nanticoke had intuitively recognized and employed a quality of leadership that followed an X-linked pattern of inheritance is intriguing to consider.
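The succession pattern in this account can be compared with single-locus X-linked transmission in a short sketch. The Python example below computes the chance that each male relative carries the same X-linked variant as a male leader. The simplifying assumptions are illustrative and are not part of the account: the leader's variant came from a heterozygous mother, fathers and spouses do not carry it, and there are no new mutations.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def carrier_probabilities() -> dict:
    """Chance that each relative of a male leader carries his X-linked variant.

    Assumes a heterozygous mother, non-carrier fathers and spouses,
    and no new mutations (illustrative assumptions only).
    """
    brother = HALF                 # mother passes one of her two X chromosomes
    sister = HALF                  # same maternal transmission chance
    sisters_son = sister * HALF    # sister must carry it, then pass it on
    own_son = Fraction(0)          # sons receive the father's Y, not his X
    own_daughter = Fraction(1)     # daughters always receive the father's X
    daughters_son = own_daughter * HALF
    return {
        "brother": brother,
        "sister's son": sisters_son,
        "own son": own_son,
        "daughter's son": daughters_son,
    }


probs = carrier_probabilities()
for relative, p in probs.items():
    print(f"{relative}: {p}")
```

Under these assumptions a man's own sons can never inherit his X-linked variant, whereas his brothers and his sister's sons can, which is why a brother-to-brother and then sister's-son succession is at least compatible with an X-linked trait. The sketch also shows that a daughter's son would have the same chance as a brother, so the pattern alone does not establish X-linkage.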
  13. 13. Although much progress has been made during the past four decades, the clinical and molecular delineation of XLID is far from complete. Perhaps little more than half of the genes in which mutations will result in XLID have been identified. The molecular pathways are incompletely understood, the mechanisms by which brain structure and function are deranged have not been identified, and with few exceptions the neurobehavioral profiles and natural history of the XLID entities have received insufficient attention. These deficiencies notwithstanding, considerable benefits have been gained for individuals with XLID and their families. Specific molecular tests, including multigene panels, are now available to more efficiently reach a diagnosis. Carrier testing, donor eggs, prenatal diagnosis, and preimplantation genetic testing may be used to prevent recurrence when a specific gene mutation is found. Through these measures, reproductive confidence may be restored for families in which XLID has occurred.114

Supplemental Data

Supplemental Data include two tables and can be found with this article online at http://www.cell.com/AJHG/.

Web Resources

The URLs for data presented herein are as follows:
Greenwood Genetic Center, XLID Update, http://www.ggc.org/research/molecular-studies/xlid.html
Online Mendelian Inheritance in Man (OMIM), http://www.omim.org/

References

1. Stevenson, R.E., Schwartz, C.E., and Rogers, R.C. (2012). Atlas of X-Linked Intellectual Disability Syndromes (New York: Oxford University Press). 2. Skuse, D.H. (2005). X-linked genes and mental functioning. Hum. Mol. Genet. 14 (Spec No 1), R27–R32. 3. Gécz, J., Shoubridge, C., and Corbett, M. (2009). The genetic landscape of intellectual disability arising from chromosome X. Trends Genet. 25, 308–316. 4. Lubs, H.A. (1969). A marker X chromosome. Am. J. Hum. Genet. 21, 231–244. 5. Kaiser-McCaw, B., Hecht, F., Cadien, J.D., and Moore, B.C. (1980). Fragile X-linked mental retardation. Am. J.
Med. Genet. 7, 503–505. 6. Opitz, J.M., and Sutherland, G.R. (1984). Conference report: International workshop on the fragile X and X-linked intellectual disability. Am. J. Med. Genet. 17, 5–94. 7. Turner, G., Opitz, J.M., Brown, W.T., Davies, K.E., Jacobs, P.A., Jenkins, E.C., Mikkelson, M., Partington, M.W., and Sutherland, G.R. (1986). Conference report: Second international workshop on the fragile X and on X-linked mental retardation. Am. J. Med. Genet. 23, 11–67. 8. Neri, G., Opitz, J.M., Mikkelson, M., Jacobs, P.A., Davies, K., and Turner, G. (1988). Conference report: Third international workshop on the fragile X and X-linked mental retardation. Am. J. Med. Genet. 30, 1–29. 9. Neri, G., Gurrieri, F., Gal, A., and Lubs, H.A. (1991). XLMR genes: Update 1990. Am. J. Med. Genet. 38, 186–189. 10. Neri, G., Chiurazzi, P., Arena, F., Lubs, H.A., and Glass, I.A. (1992). XLMR genes: Update 1992. Am. J. Med. Genet. 43, 373–382. 11. Neri, G., Chiurazzi, P., Arena, J.F., and Lubs, H.A. (1994). XLMR genes: Update 1994. Am. J. Med. Genet. 51, 542–549. 12. Brown, W.T., Jenkins, E., Neri, G., Lubs, H., Shapiro, L.R., Davies, K.E., Sherman, S., Hagerman, R., and Laird, C. (1991). Conference report: Fourth international workshop on the fragile X and X-linked mental retardation. Am. J. Med. Genet. 38, 158–172. 13. Lubs, H.A., Chiurazzi, P., Arena, J.F., Schwartz, C., Tranebjaerg, L., and Neri, G. (1996). XLMR genes: Update 1996. Am. J. Med. Genet. 64, 147–157. 14. Lubs, H., Chiurazzi, P., Arena, J., Schwartz, C., Tranebjaerg, L., and Neri, G. (1999). XLMR genes: Update 1998. Am. J. Med. Genet. 83, 237–247. 15. Chiurazzi, P., Hamel, B.C., and Neri, G. (2001). XLMR genes: Update 2000. Eur. J. Hum. Genet. 9, 71–81. 16. Chiurazzi, P., Schwartz, C.E., Gecz, J., and Neri, G. (2008). XLMR genes: Update 2007. Eur. J. Hum. Genet. 16, 422–434. 17. Ropers, H.H. (2008). Genetics of intellectual disability. Curr. Opin. Genet. Dev. 18, 241–250. 18.
Chelly, J., Khelfaoui, M., Francis, F., Che´rif, B., and Bienvenu, T. (2006). Genetics and pathophysiology of mental retarda- tion. Eur. J. Hum. Genet. 14, 701–713. 19. Ropers, H.H., and Hamel, B.C. (2005). X-linked mental retardation. Nat. Rev. Genet. 6, 46–57. 20. Kleefstra, T., and Hamel, B.C. (2006). X-linked mental retar- dation: Further lumping, splitting and emerging pheno- types. Clin. Genet. 67, 451–467. 21. Stevenson, R.E., and Schwartz, C.E. (2002). Clinical and molecular contributions to the understanding of X-linked mental retardation. Cytogenet. Genome Res. 99, 265–275. 22. Neri, G., and Opitz, J.M. (2000). Sixty years of X-linked mental retardation: A historical footnote. Am. J. Med. Genet. 97, 228–233. 23. Martin, J.P., and Bell, J. (1943). A pedigree of mental defect showing sex-linkage. J. Neurol. Psychiatry 6, 154–157. 24. Allan, W., Herndon, C.N., and Dudley, F.C. (1944). Some examples of the inheritance of mental deficiency: Apparently sex-linked idiocy and microcephaly. Am. J. Ment. Defic. 48, 325–334. 25. Bickers, D.S., and Adams, R.D. (1949). Hereditary stenosis of the aqueduct of Sylvius as a cause of congenital hydroceph- alus. Brain 72, 246–262. 26. Losowsky, M.S. (1961). Hereditary mental defect showing the pattern of sex influence. J. Ment. Defic. Res. 5, 60–62. 27. Renpenning, H., Gerrard, J.W., Zaleski, W.A., and Tabata, T. (1962). Familial sex-linked mental retardation. Can. Med. Assoc. J. 87, 954–956. 28. Dunn, H.G., Renpenning, H., Gerrard, H.W., Miller, J.R., Tabata, T., and Federoff, S. (1963). Mental retardation as a sex-linked defect. Am. J. Ment. Defic. 67, 827–848. 29. Penrose, L.S. (1938). A clinical and genetic study of 1280 cases of mental defect. Special Report Series, Medical Research Council, No. 229 (London: His Majesty’s Stationery Office). 30. Lehrke, R.G. (1974). X-linked mental retardation and verbal disability. Birth Defects Orig. Artic. Ser. 10, 1–100. 31. Herbst, D.S., and Miller, J.R. (1980). 
Nonspecific X-linked mental retardation II: The frequency in British Columbia. Am. J. Med. Genet. 7, 461–469. The American Journal of Human Genetics 90, 579–590, April 6, 2012 587
  14. 14. 32. Lubs, H.A., and Ruddle, F.H. (1970). Chromosomal abnor- malities in the human population: estimation of rates based on New Haven newborn study. Science 169, 495–497. 33. Harrison, C.J., Jack, E.M., Allen, T.D., and Harris, R. (1983). The fragile X: A scanning electron microscope study. J. Med. Genet. 20, 280–285. 34. Giraud, F., Ayme, S., Mattei, J.F., and Mattei, M.G. (1976). Constitutional chromosomal breakage. Hum. Genet. 34, 125–136. 35. Harvey, J., Judge, C., and Wiener, S. (1977). Familial X-linked mental retardation with an X chromosome abnormality. J. Med. Genet. 14, 46–50. 36. Sutherland, G.R. (1977). Fragile sites on human chromo- somes: Demonstration of their dependence on the type of tissue culture medium. Science 197, 265–266. 37. Oberle, I., Rousseau, F., Heitz, D., Kretz, C., Kevys, D., Hana- uer, A., Boue, J., Bertheas, M.F., and Mandel, J.L. (1991). Instability of a 550-base pair DNA segment and abnormal methylation in fragile X syndrome. Science 252, 1097–1102. 38. Bell, M.V., Hirst, M.C., Nakahori, Y., MacKinnon, R.N., Roche, A., Flint, T.J., Jacobs, P.A., Tommerup, N., Tranebjaerg, L., Froster-Iskenius, U., et al. (1991). Physical mapping across the fragile X: hypermethylation and clinical expression of the fragile X syndrome. Cell 64, 861–866. 39. Yu, S., Pritchard, M., Kremer, E., Lynch, M., Nancarrow, J., Baker, E., Holman, K., Mulley, J., Warren, S., Schlessinger, D., et al. (1991). Fragile X genotype characterized by an unstable region of DNA. Science 252, 1179–1181. 40. Verkerk, A.J., Pieretti, M., Sutcliffe, J.S., Fu, Y.H., Kuhl, D.P., Pizzuti, A., Reiner, O., Richards, S., Victoria, M.F., Zhang, F.P., et al. (1991). Identification of a gene (FMR-1) containing a CGG repeat coincident with a breakpoint cluster region exhibiting length variation in fragile X syndrome. Cell 65, 905–914. 41. Jolly, D.J., Okayama, H., Berg, P., Esty, A.C., Filpula, D., Bohlen, P., Johnson, G.G., Shively, J.E., Hunkapillar, T., and Friedmann, T. 
(1983). Isolation and characterization of a full-length expressible cDNA for human hypoxanthine phosphoribosyl transferase. Proc. Natl. Acad. Sci. USA 80, 477–481. 42. Salomons, G.S., van Dooren, S.J., Verhoeven, N.M., Cecil, K.M., Ball, W.S., Degrauw, T.J., and Jakobs, C. (2001). X-linked creatine-transporter gene (SLC6A8) defect: A new creatine-deficiency syndrome. Am. J. Hum. Genet. 68, 1497–1500. 43. Friez, M.J., Jones, J.R., Clarkson, K., Lubs, H., Abuelo, D., Bier, J.A., Pai, S., Simensen, R., Williams, C., Giampietro, P.F., et al. (2006). Recurrent infections, hypotonia, and mental retardation caused by duplication of MECP2 and adjacent region in Xq28. Pediatrics 118, e1687–e1695. 44. Froyen, G., Van Esch, H., Bauters, M., Hollanders, K., Frints, S.G., Vermeesch, J.R., Devriendt, K., Fryns, J.P., and Marynen, P. (2007). Detection of genomic copy number changes in patients with idiopathic mental retardation by high-resolution X-array-CGH: Important role for increased gene dosage of XLMR genes. Hum. Mutat. 28, 1034–1042. 45. Froyen, G., Corbett, M., Vandewalle, J., Jarvela, I., Lawrence, O., Meldrum, C., Bauters, M., Govaerts, K., Vandeleur, L., Van Esch, H., et al. (2008). Submicroscopic duplications of the hydroxysteroid dehydrogenase HSD17B10 and the E3 ubiquitin ligase HUWE1 are associated with mental retardation. Am. J. Hum. Genet. 82, 432–443. 46. Koolen, D.A., Pfundt, R., de Leeuw, N., Hehir-Kwa, J.Y., Nillesen, W.M., Neefs, I., Scheltinga, I., Sistermans, E., Smeets, D., Brunner, H.G., et al. (2009). Genomic microarrays in mental retardation: A practical workflow for diagnostic applications. Hum. Mutat. 30, 283–292. 47. Van Esch, H., Bauters, M., Ignatius, J., Jansen, M., Raynaud, M., Hollanders, K., Lugtenberg, D., Bienvenu, T., Jensen, L.R., Gecz, J., et al. (2005). Duplication of the MECP2 region is a frequent cause of severe mental retardation and progressive neurological symptoms in males. Am. J. Hum. Genet. 77, 442–453. 48.
Bedeschi, M.F., Novelli, A., Bernardini, L., Parazzini, C., Bianchi, V., Torres, B., Natacci, F., Giuffrida, M.G., Ficarazzi, P., Dallapiccola, B., and Lalatta, F. (2008). Association of syn- dromic mental retardation with an Xq12q13.1 duplication encompassing the oligophrenin 1 gene. Am. J. Med. Genet. A. 146A, 1718–1724. 49. Gijsbers, A.C., den Hollander, N.S., Helderman-van de Enden, A.T., Schuurs-Hoeijmakers, J.H., Vijfhuizen, L., Bijlsma, E.K., van Haeringen, A., Hansson, K.B., Bakker, E., Breuning, M.H., and Ruivenkamp, C.A. (2011). X-chromosome duplica- tions in males with mental retardation: Pathogenic or benign variants? Clin. Genet. 79, 71–78. 50. Horn, D., Spranger, S., Kruger, G., Wagenstaller, J., Weschke, B., Ropers, H.H., Mundlos, S., Ullmann, R., Strom, T.M., and Kiopocki, E. (2007). Microdeletions and microduplications affecting the STS gene at Xp22.31 are associated with a distinct phenotypic spectrum. Medizinische Genetik 19, 62. 51. Mimault, C., Giraud, G., Courtois, V., Cailloux, F., Boire, J.Y., Dastugue, B., and Boespflug-Tanguy, O.; The Clinical Euro- pean Network on Brain Dysmyelinating Disease. (1999). Proteolipoprotein gene analysis in 82 patients with sporadic Pelizaeus-Merzbacher Disease: Duplications, the major cause of the disease, originate more frequently in male germ cells, but point mutations do not. Am. J. Hum. Genet. 65, 360–369. 52. Rio, M., Malan, V., Boissel, S., Toutain, A., Royer, G., Gobin, S., Morichon-Delvallez, N., Turleau, C., Bonnefont, J.P., Munnich, A., et al. (2010). Familial interstitial Xq27.3q28 duplication encompassing the FMR1 gene but not the MECP2 gene causes a new syndromic mental retardation condition. Eur. J. Hum. Genet. 18, 285–290. 53. Solomon, N.M., Ross, S.A., Morgan, T., Belsky, J.L., Hol, F.A., Karnes, P.S., Hopwood, N.J., Myers, S.E., Tan, A.S., Warne, G.L., et al. (2004). 
Array comparative genomic hybridisation analysis of boys with X linked hypopituitarism identifies a 3.9 Mb duplicated critical region at Xq27 containing SOX3. J. Med. Genet. 41, 669–678. 54. Wagenstaller, J., Spranger, S., Lorenz-Depiereux, B., Kaz- mierczak, B., Nathrath, M., Wahl, D., Heye, B., Glaser, D., Liebscher, V., Meitinger, T., and Strom, T.M. (2007). Copy-number variations measured by single-nucleotide- polymorphism oligonucleotide arrays in patients with mental retardation. Am. J. Hum. Genet. 81, 768–779. 55. Whibley, A.C., Plagnol, V., Tarpey, P.S., Abidi, F., Fullston, T., Choma, M.K., Boucher, C.A., Shepherd, L., Willatt, L., Parkin, G., et al. (2010). Fine-scale survey of X chromosome copy number variants and indels underlying intellectual disability. Am. J. Hum. Genet. 87, 173–188. 56. Woodward, K., Palmer, R., Rao, K., and Malcolm, S. (1999). Prenatal diagnosis by FISH in a family with Pelizaeus- Merzbacher disease caused by duplication of PLP gene. Prenat. Diagn. 19, 266–268. 588 The American Journal of Human Genetics 90, 579–590, April 6, 2012
  15. 15. 57. Hantash, F.M., Goos, D.G.,Tsao, D., Quan, F., Buller-Burckle,A., Peng, M., Jarvis, M., Sun, W., and Strom, C.M. (2010). Qualita- tiveassessmentofFMR1(CGG)ntripletrepeatstatusinnormal, intermediate, premutation, full mutation, and mosaic carriers in both sexes: Implications for fragile X syndrome carrier and newborn screening. Genet. Med. 12, 162–173. 58. Gibbons, R.J., Brueton, L., Buckle, V.J., Burn, J., Clayton- Smith, J., Davison, B.C., Gardner, R.J., Homfray, T., Kearney, L., Kingston, H.M., et al. (1995a). Clinical and hematologic aspects of the X-linked alpha-thalassemia/mental retardation syndrome (ATR-X). Am. J. Med. Genet. 55, 288–299. 59. Gibbons, R.J., Picketts, D.J., Villard, L., and Higgs, D.R. (1995b). Mutations in a putative global transcriptional regulator cause X-linked mental retardation with alpha- thalassemia (ATR-X syndrome). Cell 80, 837–845. 60. Villard,L.,Bonino,M.C.,Abidi,F.,Ragusa,A.,Belougne,J.,Lossi, A.M., Seaver, L., Bonnefont, J.P., Romano, C., Fichera, M., et al. (1999). Evaluation of a mutation screening strategy for sporadic cases of ATR-X syndrome. J. Med. Genet. 36, 183–186. 61. Abidi, F., Schwartz, C.E., Carpenter, N.J., Villard, L., Fonte´s, M., and Curtis, M. (1999). Carpenter-Waziri syndrome results from a mutation in XNP. Am. J. Med. Genet. 85, 249–251. 62. Lossi, A.M., Milla´n, J.M., Villard, L., Orellana, C., Cardoso, C., Prieto, F., Fonte´s, M., and Martı´nez, F. (1999). Mutation of the XNP/ATR-X gene in a family with severe mental retardation, spastic paraplegia and skewed pattern of X inac- tivation: Demonstration that the mutation is involved in the inactivation bias. Am. J. Hum. Genet. 65, 558–562. 63. Abidi, F.E., Cardoso, C., Lossi, A.M., Lowry, R.B., Depetris, D., Matte´i, M.G., Lubs, H.A., Stevenson, R.E., Fontes, M., Chudley, A.E., and Schwartz, C.E. (2005). Mutation in the 50 alternatively spliced region of the XNP/ATR-X gene causes Chudley-Lowry syndrome. Eur. J. Hum. Genet. 13, 176–183. 64. 
Guerrini, R., Shanahan, J.L., Carrozzo, R., Bonanni, P., Higgs, D.R., and Gibbons, R.J. (2000). A nonsense mutation of the ATRX gene causing mild mental retardation and epilepsy. Ann. Neurol. 47, 117–121. 65. Yntema, H.G., Poppelaars, F.A., Derksen, E., Oudakker, A.R., van Roosmalen, T., Jacobs, A., Obbema, H., Brunner, H.G., Hamel, B.C., and van Bokhoven, H. (2002). Expanding phenotype of XNP mutations: Mild to moderate mental retardation. Am. J. Med. Genet. 110, 243–247. 66. Mattei, J.F., Collignon, P., Ayme, S., and Giraud, F. (1983). X-linked mental retardation, growth retardation, deafness and microgenitalism. A second familial report. Clin. Genet. 23, 70–74. 67. Villard, L., Gecz, J., Matte´i, J.F., Fonte´s, M., Saugier-Veber, P., Munnich, A., and Lyonnet, S. (1996). XNP mutation in a large family with Juberg-Marsidi syndrome. Nat. Genet. 12, 359–360. 68. Smith, R.D., Fineman, R.M., and Myers, G.G. (1980). Short stature, psychomotor retardation, and unusual facial appear- ance in two brothers. Am. J. Med. Genet. 7, 5–9. 69. Ade`s, L.C., Kerr, B., Turner, G., and Wise, G. (1991). Smith- Fineman-Myers syndrome in two brothers. Am. J. Med. Genet. 40, 467–470. 70. Villard, L., Fonte`s, M., Ade`s, L.C., and Gecz, J. (2000). Identi- fication of a mutation in the XNP/ATR-X gene in a family reported as Smith-Fineman-Myers syndrome. Am. J. Med. Genet. 91, 83–85. 71. Trivier, E., De Cesare, D., Jacquot, S., Pannetier, S., Zackai, E., Young, I., Mandel, J.L., Sassone-Corsi, P., and Hanauer, A. (1996). Mutations in the kinase Rsk-2 associated with Coffin-Lowry syndrome. Nature 384, 567–570. 72. Kalscheuer, V.M., Freude, K., Musante, L., Jensen, L.R., Yntema, H.G., Ge´cz, J., Sefiani, A., Hoffmann, K., Moser, B., Haas, S., et al. (2004). Mutations in the polyglutamine binding protein 1 gene cause X-linked mental retardation. Nat. Genet. 35, 313–315. 73. 
Lenski, C., Abidi, F., Meindl, A., Gibson, A., Platzer, M., Frank Kooy, R., Lubs, H.A., Stevenson, R.E., Ramser, J., and Schwartz, C.E. (2004). Novel truncating mutations in the polyglutamine tract binding protein 1 gene (PQBP1) cause Renpenning syndrome and X-linked mental retardation in another family with microcephaly. Am. J. Hum. Genet. 74, 777–780. 74. Stevenson, R.E., Bennett, C.W., Abidi, F., Kleefstra, T., Porteous, M., Simensen, R.J., Lubs, H.A., Hamel, B.C., and Schwartz, C.E. (2005). Renpenning syndrome comes into focus. Am. J. Med. Genet. A. 134, 415–421. 75. Lubs, H., Abidi, F.E., Echeverri, R., Holloway, L., Meindl, A., Stevenson, R.E., and Schwartz, C.E. (2006). Golabi-Ito-Hall syndrome results from a missense mutation in the WW domain of the PQBP1 gene. J. Med. Genet. 43, e30. 76. Strømme, P., Mangelsdorf, M.E., Scheffer, I.E., and Ge´cz, J. (2002). Infantile spasms, dystonia, and other X-linked phenotypes caused by mutations in Aristaless related homeobox gene, ARX. Brain Dev. 24, 266–268. 77. Strømme, P., Mangelsdorf, M.E., Shaw, M.A., Lower, K.M., Lewis, S.M., Bruyere, H., Lu¨tcherath, V., Gedeon, A.K., Wallace, R.H., Scheffer, I.E., et al. (2002). Mutations in the human ortholog of Aristaless cause X-linked mental retarda- tion and epilepsy. Nat. Genet. 30, 441–445. 78. Bienvenu, T., Poirier, K., Friocourt, G., Bahi, N., Beaumont, D., Fauchereau, F., Ben Jeema, L., Zemni, R., Vinet, M.C., Francis, F., et al. (2002). ARX, a novel Prd-class-homeobox gene highly expressed in the telencephalon, is mutated in X-linked mental retardation. Hum. Mol. Genet. 11, 981–991. 79. Frints, S.G., Froyen, G., Marynen, P., Willekens, D., Legius, E., and Fryns, J.P. (2002). Re-evaluation of MRX36 family after discovery of an ARX gene mutation reveals mild neurological features of Partington syndrome. Am. J. Med. Genet. 112, 427–428. 80. 
Kitamura, K., Yanazawa, M., Sugiyama, N., Miura, H., Iizuka- Kogo, A., Kusaka, M., Omichi, K., Suzuki, R., Kato-Fukui, Y., Kamiirisa, K., et al. (2002). Mutation of ARX causes abnormal development of forebrain and testes in mice and X-linked lis- sencephaly with abnormal genitalia in humans. Nat. Genet. 32, 359–369. 81. Uyanik, G., Aigner, L., Martin, P., Gross, C., Neumann, D., Marschner-Scha¨fer, H., Hehr, U., and Winkler, J. (2003). ARX mutations in X-linked lissencephaly with abnormal genitalia. Neurology 61, 232–235. 82. Kato, M., Das, S., Petras, K., Kitamura, K., Morohashi, K., Abuelo, D.N., Barr, M., Bonneau, D., Brady, A.F., Carpenter, N.J., et al. (2004). Mutations of ARX are associated with striking pleiotropy and consistent genotype-phenotype correlation. Hum. Mutat. 23, 147–159. 83. Stepp, M.L., Cason, A.L., Finnis, M., Mangelsdorf, M., Holin- ski-Feder, E., Macgregor, D., MacMillan, A., Holden, J.J., Gecz, J., Stevenson, R.E., and Schwartz, C.E. (2005). XLMR in MRX families 29, 32, 33 and 38 results from the dup24 mutation in the ARX (Aristaless related homeobox) gene. BMC Med. Genet. 6, 16. The American Journal of Human Genetics 90, 579–590, April 6, 2012 589
  16. 16. 84. Opitz, J.M., and Kaveggia, E.G. (1974). Studies of malforma- tion syndromes of man 33: the FG syndrome. An X-linked recessive syndrome of multiple congenital anomalies and mental retardation. Z. Kinderheilkd. 117, 1–18. 85. Opitz, J.M., Richieri-da Costa, A., Aase, J.M., and Benke, P.J. (1988). FG syndrome update 1988: note of 5 new patients and bibliography. Am. J. Med. Genet. 30, 309–328. 86. Romano, C., Baraitser, M., and Thompson, E. (1994). A clin- ical follow-up of British patients with FG syndrome. Clin. Dysmorphol. 3, 104–114. 87. Ozonoff, S., Williams, B.J., Rauch, A.M., and Opitz, J.O. (2000). Behavior phenotype of FG syndrome: cognition, personality, and behavior in eleven affected boys. Am. J. Med. Genet. 97, 112–118. 88. Battaglia, A., Chines, C., and Carey, J.C. (2006). The FG syndrome: report of a large Italian series. Am. J. Med. Genet. A. 140, 2075–2079. 89. Briault, S., Hill, R., Shrimpton, A., Zhu, D., Till, M., Ronce, N., Margaritte-Jeannin, P., Baraitser, M., Middleton-Price, H., Malcolm, S., et al. (1997). A gene for FG syndrome maps in the Xq12-q21.31 region. Am. J. Med. Genet. 73, 87–90. 90. Briault, S., Villard, L., Rogner, U., Coy, J., Odent, S., Lucas, J., Passage, E., Zhu, D., Shrimpton, A., Pembrey, M., et al. (2000). Mapping of X chromosome inversion breakpoints [inv(X)(q11q28)] associated with FG syndrome: A second FG locus [FGS2]? Am. J. Med. Genet. 95, 178–181. 91. Piluso, G., Carella, M., D’Avanzo, M., Santinelli, R., Carrano, E.M., D’Avanzo, A., D’Adamo, A.P., Gasparini, P., and Nigro, V. (2003). Genetic heterogeneity of FG syndrome: a fourth locus (FGS4) maps to Xp11.4-p11.3 in an Italian family. Hum. Genet. 112, 124–130. 92. Dessay, S., Moizard, M.P., Gilardi, J.L., Opitz, J.M., Middle- ton-Price, H., Pembrey, M., Moraine, C., and Briault, S. (2002). FG syndrome: linkage analysis in two families sup- porting a new gene localization at Xp22.3 [FGS3]. Am. J. Med. Genet. 112, 6–11. 93. 
Jehee, F.S., Rosenberg, C., Krepischi-Santos, A.C., Kok, F., Knijnenburg, J., Froyen, G., Vianna-Morgante, A.M., Opitz, J.M., and Passos-Bueno, M.R. (2005). An Xq22.3 duplication detected by comparative genomic hybridization microarray (Array-CGH) defines a new locus (FGS5) for FG syndrome. Am. J. Med. Genet. A. 139, 221–226. 94. Tarpey, P.S., Raymond, F.L., Nguyen, L.S., Rodriguez, J., Hackett, A., Vandeleur, L., Smith, R., Shoubridge, C., Edkins, S., Stevens, C., et al. (2007). Mutations in UPF3B, a member of the nonsense-mediated mRNA decay complex, cause syn- dromic and nonsyndromic mental retardation. Nat. Genet. 39, 1127–1133. 95. Unger, S., Mainberger, A., Spitz, C., Ba¨hr, A., Zeschnigk, C., Zabel, B., Superti-Furga, A., and Morris-Rosendahl, D.J. (2007). Filamin A mutation is one cause of FG syndrome. Am. J. Med. Genet. A. 143A, 1876–1879. 96. Risheg, H., Graham, J.M., Jr., Clark, R.D., Rogers, R.C., Opitz, J.M., Moeschler, J.B., Peiffer, A.P., May, M., Joseph, S.M., Jones, J.R., et al. (2007). A recurrent mutation in MED12 leading to R961W causes Opitz-Kaveggia syndrome. Nat. Genet. 39, 451–453. 97. Lyons, M.J., Graham, J.M., Jr., Neri, G., Hunter, A.G.W., Clark, R.D., Rogers, R.C., Moscarda, M., Boccuto, L., Simen- sen, R., Dodd, J., et al. (2009). Clinical experience in the evaluation of 30 patients with a prior diagnosis of FG syndrome. J. Med. Genet. 46, 9–13. 98. Clark, R.D., Graham, J.M., Jr., Friez, M.J., Hoo, J.J., Jones, K.L., McKeown, C., Moeschler, J.B., Raymond, F.L., Rogers, R.C., Schwartz, C.E., et al. (2009). FG syndrome, an X-linked multiple congenital anomaly syndrome: the clinical pheno- type and an algorithm for diagnostic testing. Genet. Med. 11, 769–775. 99. Bergmann, C., Zerres, K., Senderek, J., Rudnik-Schoneborn, S., Eggermann, T., Ha¨usler, M., Mull, M., and Ramaekers, V.T. (2003). 
Oligophrenin 1 (OPHN1) gene mutation causes syndromic X-linked mental retardation with epilepsy, rostral ventricular enlargement and cerebellar hypoplasia. Brain 126, 1537–1544. 100. Philip, N., Chabrol, B., Lossi, A.M., Cardoso, C., Guerrini, R., Dobyns, W.B., Raybaud, C., and Villard, L. (2003). Mutations in the oligophrenin-1 gene (OPHN1) cause X linked congen- ital cerebellar hypoplasia. J. Med. Genet. 40, 441–446. 101. Bittel, D.C., Kibiryeva, N., and Butler, M.G. (2007). Whole genome microarray analysis of gene expression in subjects with fragile X syndrome. Genet. Med. 9, 464–472. 102. French, C.A., Miyoshi, I., Kubonishi, I., Grier, H.E., Perez- Atayde, A.R., and Fletcher, J.A. (2003). BRD4-NUT fusion oncogene: A novel mechanism in aggressive carcinoma. Cancer Res. 63, 304–307. 103. Stevenson, R.E., and Schwartz, C.E. (2009). X-linked intellec- tual disability: Unique vulnerability of the male genome. Dev. Disabil. Res. Rev. 15, 361–368. 104. Renieri, A., Pescucci, C., Longo, I., Ariani, F., Mari, F., and Meloni, I. (2005). Non-syndromic X-linked mental retarda- tion: From a molecular to a clinical point of view. J. Cell. Physiol. 204, 8–20. 105. Zechner, U., Wilda, M., Kehrer-Sawatzki, H., Vogel, W., Fundele, R., and Hameister, H. (2001). A high density of X-linked genes for general cognitive ability: A run-away processshapinghumanevolution?TrendsGenet.17,697–701. 106. Graves, J.A., Ge´cz, J., and Hameister, H. (2002). Evolution of the human X—a smart and sexy chromosome that controls speciation and development. Cytogenet. Genome Res. 99, 141–145. 107. Nguyen, D.K., and Disteche, C.M. (2006). Dosage compensa- tion of the active X chromosome in mammals. Nat. Genet. 38, 47–53. 108. Turner, G., and Partington, M.W. (1991). Genes for intelli- gence on the X chromosome. J. Med. Genet. 28, 429. 109. Turner, G. (1996). Finding genes on the X chromosome by which homo may have become sapiens. Am. J. Hum. Genet. 58, 1109–1110. 110. Turner, G. (1996). 
Intelligence and the X chromosome. Lancet 347, 1814–1815. 111. Hedges, L.V., and Nowell, A. (1995). Sex differences in mental test scores, variability, and numbers of high-scoring individuals. Science 269, 41–45. 112. Lubs, H.A. (1999). The other side of the coin: a hypothesis concerning the importance of genes for high intelligence and evolution of the X chromosome. Am. J. Med. Genet. 85, 206–208. 113. McMaster, G., and Trafzer, C. (2004). Native Universe, Voices of Indian America (Washington, DC: Smithsonian and National Geographic). 114. Turner, G., Boyle, J., Partington, M.W., Kerr, B., Raymond, F.L., and Ge´cz, J. (2008). Restoring reproductive confidence in families with X-linked mental retardation by finding the causal mutation. Clin. Genet. 73, 188–190. 590 The American Journal of Human Genetics 90, 579–590, April 6, 2012
  17. 17. ARTICLE

On Sharing Quantitative Trait GWAS Results in an Era of Multiple-omics Data and the Limits of Genomic Privacy

Hae Kyung Im,1,* Eric R. Gamazon,2 Dan L. Nicolae,2,3,4 and Nancy J. Cox2,3,*

Recent advances in genome-scale, system-level measurements of quantitative phenotypes (transcriptome, metabolome, and proteome) promise to yield unprecedented biological insights. In this environment, broad dissemination of results from genome-wide association studies (GWASs) or deep-sequencing efforts is highly desirable. However, summary results from case-control studies (allele frequencies) have been withdrawn from public access because it has been shown that they can be used for inferring participation in a study if the individual’s genotype is available. A natural question that follows is how much private information is contained in summary results from quantitative trait GWAS such as regression coefficients or p values. We show that regression coefficients for many SNPs can reveal the person’s participation and, for participants, his or her phenotype with high accuracy. Our power calculations show that regression coefficients contain as much information on individuals as allele frequencies do, if the person’s phenotype is rather extreme or if multiple phenotypes are available, as has been increasingly facilitated by the use of multiple-omics data sets. These findings emphasize the need to devise a mechanism that allows data sharing that will facilitate scientific progress without sacrificing privacy protection.

Introduction

Homer et al.1 showed that it is possible to detect an individual’s presence in a complex genomic DNA mixture even when the mixture contains only trace quantities of his or her DNA.
The study considered the implications of its findings, motivated originally as an application to forensic science, in the context of genome-wide association studies (GWASs) from which aggregate allele frequencies for a large number of markers were being made publicly available. Shortly after this publication, a reduction in open access to aggregate GWAS results was implemented. Jacobs et al.2 presented an improved method using a likelihood approach and showed that disease status could be inferred for participants of the study. Visscher et al.3 and Sankararaman et al.4 calculated power estimates to understand the limits of individual detection from sample allele frequencies. They showed that the power to detect membership is determined by the ratio between the number of markers and the number of participants in the study.

We present a method that can infer an individual’s participation in a study when regression coefficients from quantitative phenotypes are available. This problem is especially relevant now that genome-wide system-level measurements of quantitative phenotypes (transcriptome, proteome, and metabolome) are being widely collected and analyzed. Undoubtedly, disseminating results from quantitative GWAS and deep-sequencing efforts could be of enormous benefit to research groups working on related traits.

We explore several statistics that can discriminate study participants from nonparticipants. Notably, we find that the use of only the direction of effects (signs of the coefficients) enables membership inference with good accuracy. We show the results from applying the statistics to the Genetics of Kidneys in Diabetes (GoKinD) data set5,6 to illustrate the level of information contained in aggregate data. We also provide quantification of the information content by computing the power of the method.
Furthermore, we discuss a general framework that can be used for integrating our findings and earlier studies of genomic privacy based on sample allele frequencies. With the increasing use of high-throughput technologies to integrate multiple-omics data sets, these various statistics result in a more powerful approach to the identification problem than with the use of a single phenotype.

Material and Methods

Let us assume that we have the estimated regression coefficients for $M$ independent SNPs, that we use data on $n$ individuals in a GWAS (test sample), and that we also have the allelic dosage for $n^*$ individuals from a reference population such as HapMap7,8 or the 1000 Genomes Project.9

Membership Inference Method

We define a statistic (a function of available data) that has a different distribution depending on the membership status and use this difference to infer membership. We compute this statistic for the individual of interest, $I$, and for all individuals in the reference population. If the statistic falls well within the reference distribution, we will conclude that the individual is not likely to have participated in the study, and if the statistic falls in the extremes of the distribution, we will conclude that the individual did participate in the study.

1 Department of Health Studies, University of Chicago, Chicago, IL, 60637, USA; 2 Department of Medicine, University of Chicago, Chicago, IL, 60637, USA; 3 Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA; 4 Department of Statistics, University of Chicago, Chicago, IL, 60637, USA
*Correspondence: haky@uchicago.edu (H.K.I.), ncox@bsd.uchicago.edu (N.J.C.)
DOI 10.1016/j.ajhg.2012.02.008. ©2012 by The American Society of Human Genetics. All rights reserved.
The American Journal of Human Genetics 90, 591–598, April 6, 2012
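The decision rule described under Membership Inference Method can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the use of the absolute value to measure extremeness (which assumes the reference statistics are centered near 0), the "at least as extreme" comparison, and the 0.05 threshold are all assumptions made for the example.

```python
def empirical_p_value(stat_individual, stats_reference):
    """Share of reference statistics at least as extreme (in absolute value)
    as the statistic of the individual of interest.

    Assumption for this sketch: extremeness is measured by |statistic|.
    """
    if not stats_reference:
        raise ValueError("reference sample must not be empty")
    extreme = sum(1 for s in stats_reference if abs(s) >= abs(stat_individual))
    return extreme / len(stats_reference)


def infer_membership(stat_individual, stats_reference, alpha=0.05):
    """Return True when the individual's statistic falls in the extremes of
    the reference distribution (hypothetical threshold alpha)."""
    return empirical_p_value(stat_individual, stats_reference) <= alpha


# Hypothetical reference statistics scattered around 0.
reference = [-1.0, -0.5, 0.0, 0.5, 1.0]
print(infer_membership(3.0, reference))  # far outside the reference range
print(infer_membership(0.2, reference))  # well inside the reference range
```

Because the rule relies only on the empirical reference distribution, it needs no knowledge of the phenotype's mean or variance.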
  18. 18. Let $\hat{Y}_I$ be defined as

$$\hat{Y}_I = \frac{n}{M} \sum_{j=1}^{M} \hat{\beta}_j \left( X_{I,j} - \hat{X}_j \right), \qquad \text{(Equation 1)}$$

where $X_{I,j}$ is the allelic dosage of individual $I$ at SNP $j$, $\hat{\beta}_j$ is the estimated coefficient from fitting the model $Y_i = \alpha_j + \beta_j X_{i,j} + \epsilon_i$, and $\hat{X}_j$ is the estimated mean of allelic dosage (twice the allele frequency) for SNP $j$ computed with the reference group.

Conditional Mean and Variance of $\hat{Y}$

The expected value and the variance of the statistic $\hat{Y}_I$, conditional on the individual’s genotype $X_I$, demeaned phenotype $Y_I - \mu$, and membership status (in or out), are as follows:

$$\begin{aligned}
E[\hat{Y} \mid X_I, Y_I, \text{in}] &\approx Y_I - \mu \\
E[\hat{Y} \mid X_I, Y_I, \text{out}] &\approx 0 \\
\operatorname{Var}[\hat{Y} \mid X_I, Y_I, \text{in}] &\approx \sigma^2 \frac{n}{M} \\
\operatorname{Var}[\hat{Y} \mid X_I, Y_I, \text{out}] &\approx \sigma^2 \frac{n}{M},
\end{aligned} \qquad \text{(Equation 2)}$$

where $\sigma^2$ is the variance of the phenotype and $\mu$ is the population mean of the phenotype $Y$. Note that for the method to work we do not need to make use of these expressions, nor do we need to know $\sigma^2$ and $\mu$, because we rely on the empirical distribution from the reference population to determine membership. These expressions will serve to estimate the power of the method. Unconditional on $Y_I$, the variance of the statistic $\hat{Y}$ is given by

$$\operatorname{Var}(\hat{Y} \mid X_I, \text{in}) \approx \sigma^2.$$

In computing these quantities we assume that the number of markers is much larger than the number of individuals in the test sample and the number of individuals in the reference group: $M \gg n \gg 1$ and $M \gg n^* \gg 1$. Hardy-Weinberg equilibrium is assumed. To derive these expressions, we used standard Taylor expansions and the law of iterated expectations. We tested the validity of these for finite samples ($n$ between 100 and 1,000 and $M/n$ between 1,000 and 50,000) by fitting linear regressions with simulated genotypes and phenotypes and computing the sample mean and variances of the $\hat{Y}$ statistic. See Supplemental Data, available online, to find plots of the validation.

Power of the Method

To compute power, we define the null and alternative hypotheses.
Under the null hypothesis the individual did not participate in the study (nor did any relatives of the individual), whereas under the alternative hypothesis, the individual did participate. Using the mean and variance under the null hypothesis and the corresponding mean and variance under the alternative hypothesis computed in Equation 2 and assuming $M \gg n \gg 1$, $M \gg n^* \gg 1$, normality of the statistic $\hat{Y}$, and the sign of $Y_I - \mu$ to be known, the power will be approximately given by

$$\text{power} \approx \Phi\left( \frac{|Y_I - \mu|}{\sigma} \sqrt{\frac{M}{n}} - z_{\alpha} \right), \qquad \text{(Equation 3)}$$

where $\alpha$ is the type I error, $z_x = \Phi^{-1}(1 - x)$ is the $(1 - x)$-quantile of the normal distribution, and $\Phi$ is the normal cumulative distribution function. If the sign of $Y_I - \mu$ is not known, a two-sided test will be used in the derivation and the power will be given by

$$\text{power} \approx \Phi\left( \frac{|Y_I - \mu|}{\sigma} \sqrt{\frac{M}{n}} - z_{\alpha/2} \right). \qquad \text{(Equation 4)}$$

See derivation in Appendix A. Because $\Phi$ is a strictly increasing function, the power

• increases when $M$, the number of SNPs, increases
• decreases when $n$, the study’s sample size, increases
• increases when the individual’s phenotype deviates more from the mean (scaled by the standard deviation)
• increases when $\alpha$, the type I error, increases

To facilitate comparison with Visscher et al.3 and Sankararaman et al.,4 let us express the one-sided power of Equation 3 with the following (equivalent) implicit formula

$$(z_{\alpha} + z_{\beta})^2 \approx \left( \frac{Y_I - \mu}{\sigma} \right)^2 \frac{M}{n}, \qquad \text{(Equation 5)}$$

where $1 - \beta$ is the power (note that in Sankararaman et al.4 $\beta$ is defined as the power). Recall that in Visscher et al.3 and Sankararaman et al.4 power was given implicitly by

$$(z_{\alpha} + z_{\beta})^2 \approx \frac{M}{n}. \qquad \text{(Equation 6)}$$

Thus, the only difference between Equations 5 and 6 is the factor $((Y_I - \mu)/\sigma)^2$. If the phenotype of the person deviates more than one standard deviation away from the mean, i.e., $|Y_I - \mu| > \sigma$, and the sign of $Y_I - \mu$ is known, the power when regression coefficients are used is larger than it is when allele frequencies are used.
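The power approximations of Equations 3, 4, and 6 can be evaluated numerically with the standard normal distribution. The sketch below is illustrative only: the function names and default arguments are assumptions made for the example, and Equation 6 is solved for the power by taking the positive root, which gives $\Phi(\sqrt{M/n} - z_{\alpha})$.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_quantile(x):
    """z_x = Phi^{-1}(1 - x), the (1 - x)-quantile of the standard normal."""
    return _STD_NORMAL.inv_cdf(1.0 - x)


def power_regression(deviation_in_sd, n_snps, n_sample, alpha=0.05, one_sided=True):
    """Approximate power from Equation 3 (one-sided) or Equation 4 (two-sided).

    deviation_in_sd is |Y_I - mu| / sigma, n_snps is M, n_sample is n.
    """
    level = alpha if one_sided else alpha / 2.0
    shift = abs(deviation_in_sd) * sqrt(n_snps / n_sample)
    return _STD_NORMAL.cdf(shift - z_quantile(level))


def power_allele_frequency(n_snps, n_sample, alpha=0.05):
    """Power implied by Equation 6, solved for 1 - beta (positive root)."""
    return _STD_NORMAL.cdf(sqrt(n_snps / n_sample) - z_quantile(alpha))


# Hypothetical setting: M / n = 10, phenotype two standard deviations from the mean.
print(power_regression(2.0, n_snps=1000, n_sample=100))
print(power_allele_frequency(n_snps=1000, n_sample=100))
```

With a deviation of exactly one standard deviation and a known sign, the two functions return the same value, which mirrors the statement that Equations 5 and 6 differ only by the factor $((Y_I - \mu)/\sigma)^2$.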
If the person’s phenotype is close to the mean, then the power will be much diminished. Although expectations are computed conditional on $Y_I - \mu$, we do not need to know its magnitude in order to achieve this power. However, we do need to know the sign of $Y_I - \mu$ in order to keep the test one-sided. If the sign is not used, $|Y_I - \mu|$ would need to be $1 + (z_{\alpha/2} - z_{\alpha})/\sqrt{M/n}$ times greater than the standard deviation in order to achieve greater power than the allele frequency case. As an example, if $\alpha = 0.05$ and $M/n = 100$, $|Y_I - \mu|$ would need to be greater than 1.031 times $\sigma$.

Individual Contribution to the Regression Coefficient

In order to get an intuitive understanding of the contribution of each individual from the sample, we can decompose the estimated regression coefficient into roughly the sum of individual contributions:

$$\begin{aligned}
\hat{\beta}_j &= \left( \tilde{X}_j' \tilde{X}_j \right)^{-1} \tilde{X}_j' \tilde{Y} \\
\hat{\beta}_j &\approx \frac{1}{n \sigma_j^2} \tilde{X}_{I,j} \tilde{Y}_I + \frac{1}{n \sigma_j^2} \sum_{i \neq I} \tilde{X}_{i,j} \tilde{Y}_i \\
\hat{\beta}_j &\approx \tilde{\beta}_{I,j} + \sum_{i \neq I} \tilde{\beta}_{i,j},
\end{aligned} \qquad \text{(Equation 7)}$$

defining $\tilde{\beta}_{i,j} = (1 / n \sigma_j^2) \tilde{X}_{i,j} \tilde{Y}_i$ as the individual contribution to the regression coefficient and $\sigma_j^2$ as the variance of the allelic dosage (under the Hardy-Weinberg assumption, $\sigma_j^2 = 2 p_j (1 - p_j)$, where $p_j$ is the minor allele frequency of SNP $j$). We use the tilde, as in $\tilde{X}$, for the demeaned variable that uses the mean from the sample.

It is worth comparing with the decomposition for the case when minor allele frequencies for the sample are available: $\hat{p}_j \approx (p_{I,j}/n) + \sum_{i \neq I} (p_{i,j}/n)$, where $\hat{p}_j$ is the sample minor allele frequency and $p_{i,j}$ is the allelic dosage divided by 2 of individual $i$ for SNP $j$. This similarity gives an intuitive understanding of the corresponding similarity in the dependence of power on the ratio of the number of SNPs and sample size of the study.
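The behavior of the statistic in Equation 1 can be checked with a small simulation in the spirit of the finite-sample validation described above. This is a rough sketch under stated assumptions, not the authors' code: the function name, the sample sizes, the allele frequency range (0.2 to 0.5), and the seed are all hypothetical, the phenotype is simulated with no genetic effect, and reference individuals are scored against a reference mean that includes their own genotype.

```python
import random
from statistics import fmean


def _dosage(rng, p):
    """Allelic dosage (0, 1, or 2) under Hardy-Weinberg with allele frequency p."""
    return (rng.random() < p) + (rng.random() < p)


def simulate_membership(n=50, n_ref=50, n_snps=2000, seed=1):
    """Return (demeaned phenotypes, Y-hat for participants, Y-hat for reference)."""
    rng = random.Random(seed)
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]  # phenotype, no genetic effect
    y_mean = fmean(y)
    y_tilde = [v - y_mean for v in y]

    sum_in = [0.0] * n       # running sums of beta_j * (X_Ij - Xhat_j), participants
    sum_out = [0.0] * n_ref  # same running sums for reference individuals
    used = 0                 # number of SNPs that actually entered the sums
    for _ in range(n_snps):
        p = rng.uniform(0.2, 0.5)
        x_in = [_dosage(rng, p) for _ in range(n)]
        x_ref = [_dosage(rng, p) for _ in range(n_ref)]
        x_bar = fmean(x_in)
        sxx = sum((x - x_bar) ** 2 for x in x_in)
        if sxx == 0.0:
            continue  # monomorphic in the sample: the slope is undefined, skip
        # Least-squares slope of the phenotype on the dosage for this SNP.
        beta = sum((x - x_bar) * yt for x, yt in zip(x_in, y_tilde)) / sxx
        x_hat = fmean(x_ref)  # mean dosage estimated from the reference group
        for i, x in enumerate(x_in):
            sum_in[i] += beta * (x - x_hat)
        for i, x in enumerate(x_ref):
            sum_out[i] += beta * (x - x_hat)
        used += 1

    scale = n / used  # the factor n / M of Equation 1, with M = SNPs used
    return y_tilde, [scale * s for s in sum_in], [scale * s for s in sum_out]


y_tilde, y_hat_in, y_hat_out = simulate_membership()
print(fmean(abs(v) for v in y_hat_in), fmean(abs(v) for v in y_hat_out))
```

Under these assumptions the participants' values should track their demeaned phenotypes while the reference values should scatter around 0 with variance near $\sigma^2 n / M$, in line with Equation 2.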
  19. 19. Combining Multiple Phenotypes

If results from multiple phenotypes such as eQTL (or other omics data) results are available, we can combine the information regarding the individual’s membership by using a Fisher type of method (the sum of logarithms of p values).10 For each phenotype $k$, we can compute an empirical p value, $p_k$, defined as the proportion of reference individuals with magnitude of $|\hat{Y}|$ greater than the individual’s $|\hat{Y}_I|$. We can combine p values across different phenotypes by computing

$$-2 \sum_{k=1}^{n_{\text{pheno}}} \log_{10} p_k,$$

where $n_{\text{pheno}}$ is the number of phenotypes to be combined. In addition to accumulating evidence across phenotypes, this method avoids the problem of lack of power due to one particular phenotype being close to the population mean.

Covariate Adjustment

Usually other covariates such as age, sex, etc. are adjusted for when performing GWASs. If the allelic dosage is independent of the covariates (as will likely be the case for most SNPs), $\hat{Y}$ will converge to the covariate-adjusted phenotype instead of the actual phenotype. The standard deviation might change if the covariates explain a substantial portion of the phenotypic variability. However, the method will still work because under no participation $\hat{Y}$ will still be around 0, whereas if the individual participated in the study, $\hat{Y}$ will converge to the covariate-adjusted phenotype. The method does not require knowing the actual phenotype, and it will work relative to this adjusted phenotype. For the purpose of re-identification using our method, the presence of covariates is only a nuisance and no additional power is achieved when they are present.

Sample Correlation Statistic

Equation 7 suggests that the sample correlation between the estimated beta and the individual’s genotype might be useful because we would expect the correlation to be 0 if the individual was not in the sample and different from 0 if the individual was part of the study.
$$\hat{C} = \frac{\sum_{j=1}^{M} \left( \hat{\beta}_j - \overline{\hat{\beta}} \right) \left( X_{I,j} - \hat{X}_j - \overline{X_I - \hat{X}} \right)}{\sqrt{\sum_{j} \left( \hat{\beta}_j - \overline{\hat{\beta}} \right)^2 \sum_{j} \left( X_{I,j} - \hat{X}_j - \overline{X_I - \hat{X}} \right)^2}},$$

where the long bar above an expression means the sample mean of the expression.

Sign Statistic

Equation 7 also shows that the sign of the correlation coefficient will be slightly more likely to match the sign of the demeaned allelic dosage if the person participated in the study than otherwise. Let $\hat{S}$ be defined as:

$$\hat{S} = \sum_{j=1}^{M} \operatorname{sign}\left( \hat{\beta}_j \right) \operatorname{sign}\left( X_{I,j} - \hat{X}_j \right)$$

We expect that strictly more than 50% of the time the product $\operatorname{sign}(\hat{\beta}_j) \operatorname{sign}(X_{I,j} - \hat{X}_j)$ will be positive (or negative) if the individual participated in the study and his or her phenotype is above (or below) average. By looking at the absolute value of the sign statistic we expect to gain information on whether the individual was part of the study or not.

Analysis Details

We used the PLINK software11 and filtered out SNP markers that were not in Hardy-Weinberg equilibrium (p < 0.001) and those that had minor allele frequencies less than 5%. Receiver operating characteristic (ROC) curves were generated by using the absolute value of the statistic as the predicting variable and membership in the sample as the labels by using the ROCR12 package for the R statistical package.13 We used only individuals who self-reported as white both for sample and reference.

Results

We show the performance of the statistics defined in Material and Methods ($\hat{Y}$, $\hat{S}$, $\hat{C}$) by using data from the GoKinD (Genetics of Kidneys in Diabetes) study.5,6 The data set was downloaded from dbGaP14 and consisted of more than 1,800 probands with long-standing type 1 diabetes, over 300 dichotomous and quantitative phenotypes, and genotype from Affymetrix Genome-Wide Human SNP Array 5.0 platform.
We used a subset of 1,644 individuals reported to be Caucasian. We show results for two of the phenotypes: cholesterol level and body mass index (BMI). We also tested the method on a third simulated phenotype and found at least as good performance. The latter demonstrates that the method does not depend on any real effect of genotype on phenotype.

We randomly sampled 100, 500, and 1,000 individuals from each study’s cohort and performed a GWAS including only individuals from each random sample. The remaining individuals were used as reference group. The statistics ($\hat{Y}$, $\hat{S}$, $\hat{C}$) were computed for both sample and reference individuals.

Identifiability Statistic and Phenotype Reconstruction

Figure 1 shows $\hat{Y}$ versus the actual phenotype (rank normalized cholesterol levels). The blue dots correspond to individuals in the sample and the black dots correspond to individuals in the reference group. For individuals in the sample, $\hat{Y}$ lies close to the one-to-one line (perfect prediction line), whereas the individuals in the reference population lie close to a flat line around 0 (consistent with our calculations of mean and variances). The sample size was $n = 1{,}000$ and the number of SNPs was $M = 300{,}000$. The number of reference individuals was 644. This demonstrates that for individuals who participated in a study, their phenotype can be reconstructed with high accuracy using the $\hat{Y}$ statistic, whereas for nonparticipants what we get is mostly noise.

Distribution of Statistic by Membership Status and ROC Analysis

The left panel in Figure 2 shows the distribution of the absolute value of $\hat{Y}$ by membership status. As in Figure 1
