Giovanni Coppola, MD
Semel Institute for Neuroscience and Human Behavior
David Geffen School of Medicine
UCLA
Methods and approaches in identifying 
genes critical for brain development and
developmental disorders
2015 Summer Institute in Cognitive Neuroscience
Outline
1. Key Concepts
2. Mendelian Genes
3. Risk Genes: Common Variants
4. Risk Genes: Rare Variants
5. Genomic Approaches
Steps in Conducting a Genetic Study of a Trait
1. Define a phenotype
2. Quantify degree of genetic effect
3. Collect families/cohorts for study
4. Measure genetic variation
5. Assess its statistical contribution to
the trait
1. Phenotype
(1) The form taken by some trait (or group of
traits) in a specific individual.
(2) The detectable outward manifestations of
a specific genotype.
• Qualitative (disease)
• Quantitative
Sullivan et al, Nat Rev Genet 2012
AD: Alzheimer's disease
ADHD: attention-deficit hyperactivity disorder
ALC: alcohol dependence
AN: anorexia nervosa
ASD: autism spectrum disorder
BIP: bipolar disorder
BRCA: breast cancer
CD: Crohn's disease
MDD: major depressive disorder
NIC: nicotine usage (cigarettes per day)
SCZ: schizophrenia
T2DM: type 2 diabetes mellitus
2. Heritability in Complex Diseases
Gender Bias (M:F)
• Developmental delay: 1.4:1
• ASD 4:1
• Asperger 6:1
Genetic Risk and ASD
[Bar chart, axis 0 to 100%]
Estimated Heritability (80%)
3. Genetic Architecture of Complex Diseases
Manolio et al 2009
Genome
Sequencing
High-throughput
genotyping
Genome
Sequencing
Outline
1. Key Concepts
2. Mendelian Genes
3. Risk Genes: Common Variants
4. Risk Genes: Rare Variants
5. Genomic Approaches
1. CACATAGATCGATCGATTGGCGATGAATGAT
2. CACATAGATCGATCGATTGGCGATGAATGAT
3. CACATAGATCGATCGATTGGCGATGAATGAT
4. CACATAGATCGATCGATTGGCGATGAATGAT
5. CACATAGATCGATCGATTGGCGATGAATGAT
6. CACATAGATCGATCGATTGGCGATGAATGAT
7. CACATAGATCGATCGATTGGCGATGAATGAT
8. CACATAGATCGATCGATTGGCGATGAATGAT
9. CACATAGATCGATCGATTGGCGATGAATGAT
10. CACATAGATCGATCGATTGGCGATGAATGAT
11. CACATAGATCGATCGATTGGCGATGAATGAT
12. CACATAGATCGATCGATTGGCGATGAATGAT
13. CACATAGATCGATCTATTGGCGATGAATGAT
14. CACATAGATCGATCGATTGGCGATGAATGAT
15. CACATAGATCGATCGATTGGCGATGAATGAT
16. CACATAGATCGATCGATTGGCGATGAATGAT
17. CACATAGATCGATCGATTGGCGATGAATGAT
18. CACATAGATCGATCGATTGGCGATGAATGAT
19. CACATAGATCGATCGATTGGCGATGAATGAT
20. CACATAGATCGATCGATTGGCGATGAATGAT
21. CACATAGATCGATCGATTGGCGATGAATGAT
22. CACATAGATCGATCGATTGGCGATGAATGAT
23. CACATAGATCGATCGATTGGCGATGAATGAT
24. CACATAGATCGATCGATTGGCGATGAATGAT
25. CACATAGATCGATCGATTGGCGATGAATGAT
...
100. CACATAGATCGATCGATTGGCGATGAATGAT
Patients Controls
1. CACATAGATCGATCGATTGGCGATGAATGAT
2. CACATAGATCGATCGATTGGCGATGAATGAT
3. CACATAGATCGATCGATTGGCGATGAATGAT
4. CACATAGATCGATCGATTGGCGATGAATGAT
5. CACATAGATCGATCGATTGGCGATGAATGAT
6. CACATAGATCGATCGATTGGCGATGAATGAT
7. CACATAGATCGATCGATTGGCGATGAATGAT
8. CACATAGATCGATCGATTGGCGATGAATGAT
9. CACATAGATCGATCGATTGGCGATGAATGAT
10. CACATAGATCGATCGATTGGCGATGAATGAT
11. CACATAGATCGATCGATTGGCGATGAATGAT
12. CACATAGATCGATCGATTGGCGATGAATGAT
13. CACATAGATCGATCGATTGGCGATGAATGAT
14. CACATAGATCGATCGATTGGCGATGAATGAT
15. CACATAGATCGATCGATTGGCGATGAATGAT
16. CACATAGATCGATCGATTGGCGATGAATGAT
17. CACATAGATCGATCGATTGGCGATGAATGAT
18. CACATAGATCGATCGATTGGCGATGAATGAT
19. CACATAGATCGATCGATTGGCGATGAATGAT
20. CACATAGATCGATCGATTGGCGATGAATGAT
21. CACATAGATCGATCGATTGGCGATGAATGAT
22. CACATAGATCGATCGATTGGCGATGAATGAT
23. CACATAGATCGATCGATTGGCGATGAATGAT
24. CACATAGATCGATCGATTGGCGATGAATGAT
25. CACATAGATCGATCGATTGGCGATGAATGAT
...
100. CACATAGATCGATCGATTGGCGATGAATGAT
Pathogenic Mutation
Genetic Architecture of Complex Diseases
Manolio et al 2009
Mendelian mutations
Hagerman et al, Pediatrics 2009
FMR1-related disorders
• fragile X syndrome
• FMR1-related premature ovarian failure (POF)
• fragile X-associated tremor/ataxia syndrome (FXTAS)
FXS
• craniofacial abnormalities
• delayed attainment of motor milestones and speech
• abnormal temperament
• abnormal behavior: shyness, gaze aversion
• macro-orchidism
• cardiac: mitral valve prolapse
• dermatologic: usually soft and smooth skin
FMR1-related disorders
Hagerman et al, Pediatrics 2009
FMR1-related disorders
Genetic Architecture of Complex Diseases
Manolio et al 2009
• FMR1 (Fragile X)
• MECP2 (Rett)
• TSC1/TSC2 (Tuberous Sclerosis)
• CACNA1C (Timothy)
• Dup15q
• 22q11.2 DS
Genetic Risk and ASD
[Bar chart, axis 0 to 100%]
Mendelian forms (10%)
Estimated Heritability (80%)
Outline
1. Key Concepts
2. Mendelian Genes
3. Risk Genes: Common Variants
4. Risk Genes: Rare Variants
5. Genomic Approaches
1. CACATAGATCGATCGATTGGCGATGAATGAT
2. CACATAGATCGATCTATTGGCGATGAATGAT
3. CACATAGATCGATCGATTGGCGATGAATGAT
4. CACATAGATCGATCGATTGGCGATGAATGAT
5. CACATAGATCGATCGATTGGCGATGAATGAT
6. CACATAGATCGATCGATTGGCGATGAATGAT
7. CACATAGATCGATCTATTGGCGATGAATGAT
8. CACATAGATCGATCGATTGGCGATGAATGAT
9. CACATAGATCGATCGATTGGCGATGAATGAT
10. CACATAGATCGATCTATTGGCGATGAATGAT
11. CACATAGATCGATCGATTGGCGATGAATGAT
12. CACATAGATCGATCGATTGGCGATGAATGAT
13. CACATAGATCGATCTATTGGCGATGAATGAT
14. CACATAGATCGATCGATTGGCGATGAATGAT
15. CACATAGATCGATCGATTGGCGATGAATGAT
16. CACATAGATCGATCTATTGGCGATGAATGAT
17. CACATAGATCGATCGATTGGCGATGAATGAT
18. CACATAGATCGATCGATTGGCGATGAATGAT
19. CACATAGATCGATCTATTGGCGATGAATGAT
20. CACATAGATCGATCGATTGGCGATGAATGAT
21. CACATAGATCGATCGATTGGCGATGAATGAT
22. CACATAGATCGATCTATTGGCGATGAATGAT
23. CACATAGATCGATCGATTGGCGATGAATGAT
24. CACATAGATCGATCTATTGGCGATGAATGAT
25. CACATAGATCGATCGATTGGCGATGAATGAT
...
100. CACATAGATCGATCGATTGGCGATGAATGAT
Patients Controls
1. CACATAGATCGATCGATTGGCGATGAATGAT
2. CACATAGATCGATCGATTGGCGATGAATGAT
3. CACATAGATCGATCTATTGGCGATGAATGAT
4. CACATAGATCGATCGATTGGCGATGAATGAT
5. CACATAGATCGATCGATTGGCGATGAATGAT
6. CACATAGATCGATCGATTGGCGATGAATGAT
7. CACATAGATCGATCGATTGGCGATGAATGAT
8. CACATAGATCGATCGATTGGCGATGAATGAT
9. CACATAGATCGATCGATTGGCGATGAATGAT
10. CACATAGATCGATCGATTGGCGATGAATGAT
11. CACATAGATCGATCTATTGGCGATGAATGAT
12. CACATAGATCGATCGATTGGCGATGAATGAT
13. CACATAGATCGATCGATTGGCGATGAATGAT
14. CACATAGATCGATCTATTGGCGATGAATGAT
15. CACATAGATCGATCGATTGGCGATGAATGAT
16. CACATAGATCGATCGATTGGCGATGAATGAT
17. CACATAGATCGATCGATTGGCGATGAATGAT
18. CACATAGATCGATCGATTGGCGATGAATGAT
19. CACATAGATCGATCTATTGGCGATGAATGAT
20. CACATAGATCGATCGATTGGCGATGAATGAT
21. CACATAGATCGATCGATTGGCGATGAATGAT
22. CACATAGATCGATCGATTGGCGATGAATGAT
23. CACATAGATCGATCGATTGGCGATGAATGAT
24. CACATAGATCGATCTATTGGCGATGAATGAT
25. CACATAGATCGATCGATTGGCGATGAATGAT
...
100. CACATAGATCGATCGATTGGCGATGAATGAT
Disease-Associated Sequence Variant
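The patient/control comparison above can be sketched as a simple carrier-count test. This is a minimal illustration, not the analysis used in any cited study: it assumes the 25 sequences listed per group are the whole sample (8 of 25 patients and 5 of 25 controls carry the T variant), and the function name `allelic_association` is hypothetical.

```python
import math

def allelic_association(case_carriers, case_total, control_carriers, control_total):
    """Chi-square test (1 degree of freedom) on a 2x2 carrier-by-status table."""
    a, b = case_carriers, case_total - case_carriers            # patients: variant / reference
    c, d = control_carriers, control_total - control_carriers   # controls: variant / reference
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-square tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)  # undefined if b or c is zero
    return odds_ratio, chi2, p_value

# Assumption: counts read off the first 25 sequences shown for each group.
odds_ratio, chi2, p_value = allelic_association(8, 25, 5, 25)
print(f"OR={odds_ratio:.2f}, chi2={chi2:.2f}, p={p_value:.2f}")
```

With so few sequences the difference is not statistically significant (p is well above 0.05), even though the variant is more frequent in patients. This is one way to see why association studies of modest-effect variants need large samples.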
• Assumption
• Principle
• Technology
Genome-Wide Association Studies (GWAS)
Genetic component
Linkage disequilibrium
Microarrays
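Linkage disequilibrium, the principle named above, is commonly summarized by r² between two variants. The sketch below computes it from phased haplotypes; the function name `ld_r_squared` and the toy haplotypes are illustrative assumptions, not data from the slides.

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic SNPs; each haplotype is a 2-character string."""
    n = len(haplotypes)
    # Use the alleles on the first haplotype as the (arbitrary) reference alleles.
    a_allele, b_allele = haplotypes[0][0], haplotypes[0][1]
    p_a = sum(h[0] == a_allele for h in haplotypes) / n
    p_b = sum(h[1] == b_allele for h in haplotypes) / n
    p_ab = sum(h[0] == a_allele and h[1] == b_allele for h in haplotypes) / n
    d = p_ab - p_a * p_b  # deviation from independence between the two sites
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Hypothetical haplotypes: the two SNPs always travel together (complete LD) ...
perfect = ["GT", "GT", "AC", "AC"]
# ... or are inherited independently (no LD).
independent = ["GT", "GC", "AT", "AC"]
print(ld_r_squared(perfect), ld_r_squared(independent))
```

When r² is high, genotyping one variant on a microarray also captures information about its neighbor, which is what lets GWAS tag untyped causal variants. Both SNPs must be polymorphic in the sample, otherwise the denominator is zero.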
From Lichten Nature 2008;454:421
GWAS - rationale
meiotic recombination
Cardon & Bell Nat Rev Genet 2001;2:91
GWAS
Kruglyak Nat Rev Genet 2008;9:314
GWAS
Genotyping using Microarrays
www.affymetrix.com
Corvin et al 2010
GWAS
analysis steps
Pearson & Manolio, JAMA 2008;299:1335
Manhattan Plot
https://www.genome.gov/26525384
NHGRI GWA Catalog
www.genome.gov/GWAStudies
www.ebi.ac.uk/fgpt/gwas/
Published Genome-Wide Associations through 12/2013
Published GWA at p ≤ 5×10⁻⁸ for 17 trait categories
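The 5×10⁻⁸ threshold is usually motivated as a Bonferroni correction. The sketch below assumes roughly one million independent common-variant tests across the genome, which is a conventional approximation and not a figure stated on the slide; `is_genome_wide_significant` is a hypothetical helper.

```python
ALPHA = 0.05                    # desired family-wise error rate
INDEPENDENT_TESTS = 1_000_000   # assumption: ~1 million independent common variants

threshold = ALPHA / INDEPENDENT_TESTS  # Bonferroni-corrected per-variant threshold

def is_genome_wide_significant(p_value, cutoff=threshold):
    """Return True if a single-variant p-value passes the corrected threshold."""
    return p_value <= cutoff

print(threshold)
print(is_genome_wide_significant(1e-9), is_genome_wide_significant(1e-5))
```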
GWAS in ASD
Weiss et al, Nature 2009
Wang et al, Nature 2009
GWAS in ASD
Wang et al, Nature 2009
Cardon & Bell, Nat Rev Genet 2001;2:91
GWAS
GWAS confounders - population stratification
Novembre et al, Nature 2008;456:98
Population structure
GWAS confounders: population stratification
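One common diagnostic for stratification, not shown on the slides, is the genomic inflation factor λ: the median observed association statistic divided by the median expected under the null. The sketch below assumes 1-degree-of-freedom chi-square statistics; the function name and the toy values are hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(chi2_stats):
    """Lambda: ratio of the observed median statistic to the null median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN

# Hypothetical statistics: the second set is uniformly inflated, as might
# happen when cases and controls differ in ancestry.
well_matched = [0.10, 0.30, 0.4549364, 0.90, 2.00]
stratified = [2 * x for x in well_matched]
print(genomic_inflation(well_matched), genomic_inflation(stratified))
```

Values of λ near 1 suggest little inflation. Values well above 1 suggest that confounding such as population structure may be inflating test statistics across the genome.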
Genetic Architecture of Complex Diseases
Manolio et al 2009
• FMR1 (Fragile X)
• MECP2 (Rett)
• TSC1/TSC2 (Tuberous Sclerosis)
• CACNA1C (Timothy)
• 15q duplication
• 22q11 deletion
• CDH9 and CDH10
• 5p15 (SEMA5A?)
• MACROD2
• CNTNAP2
Genetic Risk and ASD
[Bar chart, axis 0 to 100%]
Common variation (1%)
Estimated Heritability (80%)
Mendelian forms (10%)
?
Nature 2008
Genetic Architecture of Complex Diseases
Manolio et al 2009
hundreds of common variants with small effect sizes
GWAS in Psychiatric Disease
Sullivan et al, Nat Rev Genet 2012
Why Do We Need So Many Samples?
Altshuler et al, 2008
Franke et al Nat Genet 2010
What to Expect: Insights from Other Complex Traits
Cumulative fraction of genetic
variance explained by 71
Crohn's disease risk loci.
Allen et al Nature 2010
What to Expect: Insights from Other Complex Traits
Genetic Risk and ASD
[Bar chart, axis 0 to 100%]
Common variation (20%??)
Estimated Heritability (80%)
Mendelian forms (10%)
estimated
Genetic Architecture of Complex Diseases
Manolio et al 2009
hundreds of common variants with small effect sizes
hundreds of rare variants with moderate effect size
Outline
1. Key Concepts
2. Mendelian Genes
3. Risk Genes: Common Variants
4. Risk Genes: Rare Variants
5. Genomic Approaches
Rare Copy Number Variants (CNVs)
rare CNVs
Cooper et al, Nat Genet 2011
Genetic Architecture of Complex Diseases
Manolio et al 2009
• FMR1 (Fragile X)
• MECP2 (Rett)
• TSC1/TSC2 (Tuberous Sclerosis)
• CACNA1C (Timothy)
• 15q duplication
• 22q11 deletion
• CDH9 and CDH10
• 5p15 (SEMA5A?)
• MACROD2
• CNTNAP2
rare CNVs
Genetic Risk and ASD
[Bar chart, axis 0 to 100%]
Common variation (20%??)
Estimated Heritability (80%)
Mendelian forms (10%)
rare CNVs (7%)
estimated
Genetic Architecture of Complex Diseases
Manolio et al 2009
• FMR1 (Fragile X)
• MECP2 (Rett)
• TSC1/TSC2 (Tuberous Sclerosis)
• CACNA1C (Timothy)
• 15q duplication
• 22q11 deletion
• CDH9 and CDH10
• 5p15 (SEMA5A?)
• MACROD2
• CNTNAP2
rare CNVs
rare sequence
variants?
June 26, 2000
Bamshad et al, Nat Rev Genet 2011;12:745
Exome Sequencing
@6:6:1355:6985:Y
GCTGTTTCTGCAGACAGGACCTCAATAGTTCTGGTGAGCTGCTCACTGGGCAAGTAACTACCATCCTGAGGGGGCA
+6:6:1355:6985:Y
?<>B><2BB@BBB/@>BBBBB?BB@@B6B@@@@@0B-@BBBBBB@B>BBAB3@@A>><@>,@B7BBBB;@9:<9=?
@6:6:1356:4867:Y
CATTTCATGGAGTATCTAGGACCTTACCCAGCGAGGCCACAAGTGCGAAGTTGTCTAGCATCACGCGGCGGTACAG
+6:6:1356:4867:Y
IIIIIIIIIIIIIIII=B=BBBBB@6::??BBBBBBBBAB@<6>>B@B@@@B=@B@/<@@@@@@@B@#########
@6:6:1357:2232:Y
ACCGCAGTGGATGCGGTGCAACACGGGTTTCGTACCATCGTCGTGCGCGAATGCGTCGGCGAACGCCACCCGGCGG
+6:6:1357:2232:Y
############################################################################
NGS Data
sanger-fastq format
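Each record in the sanger-fastq format shown above has four lines: an `@` header, the read sequence, a `+` separator, and one quality character per base. In this encoding a character's Phred score is its ASCII code minus 33. The parser below is a minimal sketch; `parse_fastq` and the short example record are hypothetical and stand in for real sequencer output.

```python
from typing import Iterator, List, Tuple

def parse_fastq(text: str) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (read_id, sequence, phred_scores) from Sanger-encoded FASTQ text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        # Sanger encoding: Phred quality = ASCII code - 33.
        yield header[1:], seq, [ord(ch) - 33 for ch in qual]

# Hypothetical four-base read, for illustration only.
example = "@read1\nACGT\n+read1\nII#5\n"
records = list(parse_fastq(example))
print(records)
```

Under this encoding `I` decodes to Q40 and `#` to Q2, so a read whose quality string consists entirely of `#` has the lowest-confidence call at every base.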
NGS Data
Alignment to Reference Genome
NextGen Sequencing: Main Platforms
Roche 454
Illumina HiSeq2500
Gartner Inc.
NextGen Sequencing: the Hype Cycle
2005 2014
Genome Length 3 billion
Positions Called 2.8 billion (~93%)
Average Depth of Coverage 61
Number of Heterozygotes 2.4 million (0.09%)
Variants in Coding Regions 20,696
Predicted deleterious 800-1,800
HGMD 726
never seen in EVS 1,724 (~8%)
Genome Sequencing
some numbers
EVS: Exome Variant Server (evs.gs.washington.edu/EVS/)
HGMD: Human Gene Mutation Database (www.hgmd.org/)
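The percentages on this slide can be reproduced from the raw counts. The sketch assumes the denominators are the genome length, the positions called, and the coding-variant count respectively; the slide does not state them explicitly.

```python
genome_length = 3_000_000_000
positions_called = 2_800_000_000
heterozygotes = 2_400_000
coding_variants = 20_696
never_seen_in_evs = 1_724

called_fraction = positions_called / genome_length     # share of genome with a call
het_rate = heterozygotes / positions_called            # assumption: per called position
novel_fraction = never_seen_in_evs / coding_variants   # assumption: per coding variant

print(f"{called_fraction:.0%} called, {het_rate:.2%} heterozygous, {novel_fraction:.0%} novel")
```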
PFBC: CAUSAL MUTATIONS
Exome Sequencing
www.my46.org
www.1000genomes.org
Whole-Genome Sequencing
http://evs.gs.washington.edu/EVS/
Exome Variant Server
ExAC Database
http://exac.broadinstitute.org
Simons Simplex Collection
http://sfari.org/resources/simons-simplex-collection
The Simons Simplex Collection (SSC) is a core project and resource of the
Simons Foundation Autism Research Initiative (SFARI). The SSC achieved its
primary goal to establish a permanent repository of genetic samples from 2,600
simplex families, each of which has one child affected with an autism spectrum
disorder, and unaffected parents and siblings.
Exome Sequencing
O’Roak et al, Nature 2012
Iossifov et al, Neuron 2012
Sanders et al, Nature 2012
~200 ASD genes
Excess of de novo events from older fathers
Chen et al 2015
ASD genes
gene.sfari.org
CHD8
Bernier et al, Cell 2014
Genetic Risk and ASD
[Bar chart, axis 0 to 100%]
Common variation (20%??)
Estimated Heritability (80%)
Mendelian forms (10%)
rare CNVs (7%)
rare de novo events (10%?)
estimated
Genetic Risk and ASD
[Bar chart, axis 0 to 100%]
Common variation (20%??)
Estimated Heritability (80%)
Mendelian forms (10%)
rare CNVs (7%)
rare de novo events (10%?)
estimated
explained or estimated: 47%
missing: 33%
non-genetic: 20%
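The three summary percentages follow from the bar-chart values by simple arithmetic, sketched below. The variable names are illustrative. The component estimates are the ones on the slide, several of which the slide itself marks as uncertain.

```python
heritability = 80  # estimated heritability of ASD, in percent

# Component estimates from the slide (percent of risk).
components = {
    "Mendelian forms": 10,
    "common variation": 20,
    "rare CNVs": 7,
    "rare de novo events": 10,
}

explained = sum(components.values())   # explained or estimated
missing = heritability - explained     # heritable but not yet accounted for
non_genetic = 100 - heritability       # remainder outside the heritable share

print(explained, missing, non_genetic)
```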
Outline
1. Key Concepts
2. Mendelian Genes
3. Risk Genes: Common Variants
4. Risk Genes: Rare Variants
5. Genomic Approaches
DNA
RNA
Protein
transcription
translation
High-throughput
genotyping
Genome
Sequencing
Gene
Expression
Epigenetics
Proteomics
Microarrays
NextGen Sequencing
Genome Sequencing
Projects
Green et al, Nature 2011;470:204
• Understand disease pathogenesis at the
global level
• Characterize susceptibility to complex
diseases
• Characterize and understand drug
response (personalized treatment)
Promise of Genomic Medicine
Green et al Nature 2011;470:204
Green et al Nature 2011;470:204
Promise of Genomic Medicine
Towards a Personalized Genetic Risk Map
[Figure labels: non-genetic factors, common variants, rare variants, rare CNVs, imaging features, gene expression, epigenetics]
-OMICs Studies - Conventional Approach
genomics
genomics: transcription outliers
Voineagu et al, Mol Psychiatry 2012
Mike Oldham
Steve Horvath
Differential Expression vs. Differential Co-Expression
Weighted Gene Coexpression Network Analysis (WGCNA)
Steve Horvath, PhD
**Slide courtesy of A Barabasi
Flight connections and hub airports
The nodes with the largest number of links
(connections) are most important!
Steve Horvath
Construct a network
Rationale: make use of interaction patterns between genes
Identify modules
Rationale: module (pathway) based analysis
Relate modules to external information
Array Information: Clinical data, SNPs, proteomics
Gene Information: gene ontology, EASE, IPA
Rationale: find biologically interesting modules
Find the key drivers in interesting modules
Tools: intramodular connectivity, causality testing
Rationale: experimental validation, therapeutics, biomarkers
Study Module Preservation across different data
Rationale:
• Same data: to check robustness of module definition
• Different data: to find interesting modules.
Steve Horvath
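The first and fourth steps above (construct a network, find highly connected genes) can be sketched in a few lines. This is a simplified illustration rather than the WGCNA package itself: it assumes an unsigned network with adjacency |cor|^β and β = 6, and it skips topological overlap and module detection by clustering. Gene names and expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def weighted_adjacency(expression, beta=6):
    """Soft-thresholded adjacency: |correlation| raised to the power beta."""
    genes = list(expression)
    return {
        g: {h: abs(pearson(expression[g], expression[h])) ** beta
            for h in genes if h != g}
        for g in genes
    }

def connectivity(adjacency):
    """Sum of connection strengths per gene; hubs have the largest values."""
    return {g: sum(neighbors.values()) for g, neighbors in adjacency.items()}

# Hypothetical expression profiles across four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [1.1, 1.9, 3.2, 3.9],
    "geneD": [4.0, 1.0, 3.0, 2.0],
}
k = connectivity(weighted_adjacency(expression))
hub = max(k, key=k.get)
print(k, hub)
```

Raising correlations to a power keeps the network weighted while suppressing weak correlations. In this toy data geneA, geneB, and geneC form a tightly co-expressed group, and geneD is nearly disconnected.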
Transcriptional networks in ASD brain
genomics: WGCNA
Parikshak et al, Cell 2013
genomics: WGCNA
Parikshak et al, Cell 2013
genomics: WGCNA
Parikshak et al, Cell 2013
genomics: WGCNA
Parikshak et al, Cell 2013
Integrative Functional Genomic
Analyses Implicate Specific Molecular
Pathways and Circuits in Autism
Neelroop N. Parikshak,1,2 Rui Luo,3,4 Alice Zhang,2 Hyejung Won,1 Jennifer K. Lowe,1,4 Vijayendran Chandran,5
Steve Horvath,3,6 and Daniel H. Geschwind1,2,3,4,5,*
1Program in Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles,
CA 90095, USA
2Interdepartmental Program in Neuroscience, University of California, Los Angeles, Los Angeles, CA 90095, USA
3Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
4Center for Autism Treatment and Research, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles,
Los Angeles, CA 90095, USA
5Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles,
CA 90095, USA
6Department of Biostatistics, University of California, Los Angeles, Los Angeles, CA 90095, USA
*Correspondence: dhg@ucla.edu
http://dx.doi.org/10.1016/j.cell.2013.10.031
SUMMARY
Genetic studies have identified dozens of autism spectrum disorder (ASD) susceptibility genes, raising two critical questions: (1) do these genetic loci converge on specific biological processes, and (2) where does the phenotypic specificity of ASD arise, given its genetic overlap with intellectual disability (ID)? To address this, we mapped ASD and ID risk genes onto coexpression networks representing developmental trajectories and transcriptional profiles representing fetal and adult cortical laminae. ASD genes tightly coalesce in modules that implicate distinct biological functions during human cortical development, including early transcriptional regulation and synaptic development. Bioinformatic analyses suggest that translational regulation by FMRP and transcriptional coregulation by common transcription factors connect these processes. At a circuit level, ASD genes are enriched in superficial cortical layers and glutamatergic projection neurons. Furthermore, we show that the patterns of ASD and ID risk genes are distinct, providing a biological framework for further investigating the pathophysiology of ASD.
INTRODUCTION
Autism spectrum disorder (ASD) is a heterogeneous neurodevelopmental disorder in which hundreds of genes have been implicated (Berg and Geschwind, 2012; Geschwind and Levitt, 2007). Analysis of copy number variation (CNV) and exome sequencing have identified rare variants that alter dozens of protein-coding genes in ASD, none of which account for more than 1% of ASD cases (Devlin and Scherer, 2012). This and the fact that a significant fraction (40%–60%) of ASD is explained by common variation (Klei et al., 2012) point to a heterogeneous genetic architecture.
These findings raise several issues. Based on the background human mutation rate (MacArthur et al., 2012), most genes affected by only one observed rare variant to date are likely false positives that do not increase risk for ASD (Gratten et al., 2013). It is therefore essential to develop approaches that prioritize singleton variants, especially missense mutations. Furthermore, given the heterogeneity of ASD, it would be valuable to identify common pathways, cell types, or circuits disrupted within ASD itself. Recent studies combining gene expression, protein-protein interactions (PPIs), and other systematic gene annotation resources suggest some molecular convergence in subsets of ASD risk genes (Ben-David and Shifman, 2013; Gilman et al., 2011; Sakai et al., 2011; Voineagu et al., 2011). Yet, it remains unclear how the large number of genes implicated through different methods may converge to affect human brain development, which is critical to a mechanistic understanding of ASD (Berg and Geschwind, 2012). Additionally, ASD has considerable overlap with ID at the genetic level, so identifying molecular pathways and circuits that confer the phenotypic specificity of ASD would be of considerable utility (Geschwind, 2011; Matson and Shoemaker, 2009).
Here, we took a stepwise approach to determine whether genes implicated in ASD affect convergent pathways during in vivo human neural development and whether they are enriched in specific cells or circuits (Figure 1A). First, we constructed transcriptional networks representing genome-wide functional relationships during fetal and early postnatal brain development and mapped genes from multiple ASD and ID resources to these networks. We then assessed shared neurobiological function among these genes, including coregulatory relationships and enrichment in layer-specific patterns from
http://geschwindlab.neurology.ucla.edu/sites/all/files/networkplot/ParikshakDevelopmentalCortexNetwork.html
genomics: WGCNA
Genetics
Gene Expression
Imaging
Clinical Phenotype
Epigenetics
Genotyping
Sequencing
Transcriptome
Proteome
Methylome
Histone Modifications
• Structural
• Functional
Neuropathology
OMICs Approaches to
Human CNS Disease
• Binary
• Quantitative
Genetics
Gene Expression
Imaging
Clinical Phenotype
Epigenetics
Genotyping
Sequencing
Transcriptome
Proteome
Methylome
...
Structural
Functional
Neuropathology
OMICs Approaches to
Human CNS Disease
unidimensional approach
systems biology approach
Geschwind and Konopka, Nature 2010
Network Edge Orienting (NEO)
[Figure: network with nodes Genetics, meth 1, meth 2, meth 3, and trait; edge scores shown: 0.6 and 3.5]
Steve Horvath
Aten et al, BMC Systems Biol 2008
Conclusions
1. The genetic map for ASD and ID and the role of
common and rare variation are increasingly
characterized
2. Two technological advances (microarrays and
sequencing) have facilitated progress over the
past 10 years
3. Whole-genome sequencing will clarify the role
of non-coding variation
4. Replication and functional validation pose
significant challenges
Thank you
gcoppola@ucla.edu

UCSB Summer Institute in Cognitive Neuroscience, June 29 2015

  • 1.
    Giovanni Coppola, MD SemelInstitute for Neuroscience and Human Behavior David Geffen School of Medicine UCLA Methods and approaches in identifying  genes critical for brain development and developmental disorders 2015 Summer Institute in Cognitive Neuroscience
  • 2.
    Outline 1. Key Concepts 2.Mendelian Genes 3. Risk Genes: Common Variants 4. Risk Genes: Rare Variants 5. Genomic Approaches
  • 3.
    Outline 1. Key Concepts 2.Mendelian Genes 3. Risk Genes: Common Variants 4. Risk Genes: Rare Variants 5. Genomic Approaches
  • 4.
    Steps in Conductinga Genetic Study of a Trait 1. Define a phenotype 2. Quantify degree of genetic effect 3. Collect families/cohorts for study 4. Measure genetic variation 5. Assess its statistical contribution to the trait
  • 5.
    1. Phenotype (1) Theform taken by some trait (or group of traits) in a specific individual. (2) The detectable outward manifestations of a specific genotype. • Qualitative (disease) • Quantitative
  • 6.
    Sullivan et al,Nat Rev Genet 2012 AD: Alzheimer's disease ADHD: attention-deficit hyperactivity disorder ALC: alcohol dependence AN: anorexia nervosa ASD: autism spectrum disorder BIP: bipolar disorder BRCA: breast cancer CD: Crohn's disease MDD: major depressive disorder NIC: nicotine usage (cigarettes per day) SCZ: schizophrenia T2DM: type 2 diabetes mellitus 2. Heritability in Complex Diseases Gender Bias (M:F) • Developmental delay: 1.4:1 • ASD 4:1 • Asperger 6:1
  • 7.
    Genetic Risk andASD 0 25 50 75 100 Estimated Heritability (80%)
  • 8.
    3. Genetic Architectureof Complex Diseases Manolio et al 2009 Genome Sequencing High-throughput genotyping Genome Sequencing
  • 9.
    Outline 1. Key Concepts 2.Mendelian Genes 3. Risk Genes: Common Variants 4. Risk Genes: Rare Variants 5. Genomic Approaches
  • 10.
    1. CACATAGATCGATCGATTGGCGATGAATGAT 2. CACATAGATCGATCGATTGGCGATGAATGAT 3.CACATAGATCGATCGATTGGCGATGAATGAT 4. CACATAGATCGATCGATTGGCGATGAATGAT 5. CACATAGATCGATCGATTGGCGATGAATGAT 6. CACATAGATCGATCGATTGGCGATGAATGAT 7. CACATAGATCGATCGATTGGCGATGAATGAT 8. CACATAGATCGATCGATTGGCGATGAATGAT 9. CACATAGATCGATCGATTGGCGATGAATGAT 10. CACATAGATCGATCGATTGGCGATGAATGAT 11. CACATAGATCGATCGATTGGCGATGAATGAT 12. CACATAGATCGATCGATTGGCGATGAATGAT 13. CACATAGATCGATCTATTGGCGATGAATGAT 14. CACATAGATCGATCGATTGGCGATGAATGAT 15. CACATAGATCGATCGATTGGCGATGAATGAT 16. CACATAGATCGATCGATTGGCGATGAATGAT 17. CACATAGATCGATCGATTGGCGATGAATGAT 18. CACATAGATCGATCGATTGGCGATGAATGAT 19. CACATAGATCGATCGATTGGCGATGAATGAT 20. CACATAGATCGATCGATTGGCGATGAATGAT 21. CACATAGATCGATCGATTGGCGATGAATGAT 22. CACATAGATCGATCGATTGGCGATGAATGAT 23. CACATAGATCGATCGATTGGCGATGAATGAT 24. CACATAGATCGATCGATTGGCGATGAATGAT 25. CACATAGATCGATCGATTGGCGATGAATGAT ... 100. CACATAGATCGATCGATTGGCGATGAATGAT Patients Controls 1. CACATAGATCGATCGATTGGCGATGAATGAT 2. CACATAGATCGATCGATTGGCGATGAATGAT 3. CACATAGATCGATCGATTGGCGATGAATGAT 4. CACATAGATCGATCGATTGGCGATGAATGAT 5. CACATAGATCGATCGATTGGCGATGAATGAT 6. CACATAGATCGATCGATTGGCGATGAATGAT 7. CACATAGATCGATCGATTGGCGATGAATGAT 8. CACATAGATCGATCGATTGGCGATGAATGAT 9. CACATAGATCGATCGATTGGCGATGAATGAT 10. CACATAGATCGATCGATTGGCGATGAATGAT 11. CACATAGATCGATCGATTGGCGATGAATGAT 12. CACATAGATCGATCGATTGGCGATGAATGAT 13. CACATAGATCGATCGATTGGCGATGAATGAT 14. CACATAGATCGATCGATTGGCGATGAATGAT 15. CACATAGATCGATCGATTGGCGATGAATGAT 16. CACATAGATCGATCGATTGGCGATGAATGAT 17. CACATAGATCGATCGATTGGCGATGAATGAT 18. CACATAGATCGATCGATTGGCGATGAATGAT 19. CACATAGATCGATCGATTGGCGATGAATGAT 20. CACATAGATCGATCGATTGGCGATGAATGAT 21. CACATAGATCGATCGATTGGCGATGAATGAT 22. CACATAGATCGATCGATTGGCGATGAATGAT 23. CACATAGATCGATCGATTGGCGATGAATGAT 24. CACATAGATCGATCGATTGGCGATGAATGAT 25. CACATAGATCGATCGATTGGCGATGAATGAT ... 100. CACATAGATCGATCGATTGGCGATGAATGAT Pathogenic Mutation
  • 11.
    Genetic Architecture ofComplex Diseases Manolio et al 2009 Mendelian mutations
  • 12.
    Hagerman et al,Pediatrics 2009 FMR1-related disorders
  • 13.
    • fragile Xsyndrome • FMR1-related premature ovarian failure (POF) • fragile X-associated tremor/ataxia syndrome (FXTAS) FXS • craniofacial abnormalities • delayed attainment of motor milestones and speech • abnormal temperament • abnormal behavior: shyness, gaze aversion • macro-orchidism • cardiac: mitral valve prolapse • dermatologic: usually soft and smooth skin FMR1-related disorders
  • 14.
    Hagerman et al,Pediatrics 2009 FMR1-related disorders
  • 15.
    Genetic Architecture ofComplex Diseases Manolio et al 2009 • FMR1 (Fragile X) • MECP2 (Rett) • TSC1/TSC2 (Tuberous Sclerosis) • CACNA1C (Timothy) • Dup15q • 22q11.2 DS
  • 16.
    Genetic Risk andASD 0 25 50 75 100 Mendelian forms (10%) Estimated Heritability (80%)
  • 17.
    Outline 1. Key Concepts 2.Mendelian Genes 3. Risk Genes: Common Variants 4. Risk Genes: Rare Variants 5. Genomic Approaches
  • 18.
    1. CACATAGATCGATCGATTGGCGATGAATGAT 2. CACATAGATCGATCTATTGGCGATGAATGAT 3.CACATAGATCGATCGATTGGCGATGAATGAT 4. CACATAGATCGATCGATTGGCGATGAATGAT 5. CACATAGATCGATCGATTGGCGATGAATGAT 6. CACATAGATCGATCGATTGGCGATGAATGAT 7. CACATAGATCGATCTATTGGCGATGAATGAT 8. CACATAGATCGATCGATTGGCGATGAATGAT 9. CACATAGATCGATCGATTGGCGATGAATGAT 10. CACATAGATCGATCTATTGGCGATGAATGAT 11. CACATAGATCGATCGATTGGCGATGAATGAT 12. CACATAGATCGATCGATTGGCGATGAATGAT 13. CACATAGATCGATCTATTGGCGATGAATGAT 14. CACATAGATCGATCGATTGGCGATGAATGAT 15. CACATAGATCGATCGATTGGCGATGAATGAT 16. CACATAGATCGATCTATTGGCGATGAATGAT 17. CACATAGATCGATCGATTGGCGATGAATGAT 18. CACATAGATCGATCGATTGGCGATGAATGAT 19. CACATAGATCGATCTATTGGCGATGAATGAT 20. CACATAGATCGATCGATTGGCGATGAATGAT 21. CACATAGATCGATCGATTGGCGATGAATGAT 22. CACATAGATCGATCTATTGGCGATGAATGAT 23. CACATAGATCGATCGATTGGCGATGAATGAT 24. CACATAGATCGATCTATTGGCGATGAATGAT 25. CACATAGATCGATCGATTGGCGATGAATGAT ... 100. CACATAGATCGATCGATTGGCGATGAATGAT Patients Controls 1. CACATAGATCGATCGATTGGCGATGAATGAT 2. CACATAGATCGATCGATTGGCGATGAATGAT 3. CACATAGATCGATCTATTGGCGATGAATGAT 4. CACATAGATCGATCGATTGGCGATGAATGAT 5. CACATAGATCGATCGATTGGCGATGAATGAT 6. CACATAGATCGATCGATTGGCGATGAATGAT 7. CACATAGATCGATCGATTGGCGATGAATGAT 8. CACATAGATCGATCGATTGGCGATGAATGAT 9. CACATAGATCGATCGATTGGCGATGAATGAT 10. CACATAGATCGATCGATTGGCGATGAATGAT 11. CACATAGATCGATCTATTGGCGATGAATGAT 12. CACATAGATCGATCGATTGGCGATGAATGAT 13. CACATAGATCGATCGATTGGCGATGAATGAT 14. CACATAGATCGATCTATTGGCGATGAATGAT 15. CACATAGATCGATCGATTGGCGATGAATGAT 16. CACATAGATCGATCGATTGGCGATGAATGAT 17. CACATAGATCGATCGATTGGCGATGAATGAT 18. CACATAGATCGATCGATTGGCGATGAATGAT 19. CACATAGATCGATCTATTGGCGATGAATGAT 20. CACATAGATCGATCGATTGGCGATGAATGAT 21. CACATAGATCGATCGATTGGCGATGAATGAT 22. CACATAGATCGATCGATTGGCGATGAATGAT 23. CACATAGATCGATCGATTGGCGATGAATGAT 24. CACATAGATCGATCTATTGGCGATGAATGAT 25. CACATAGATCGATCGATTGGCGATGAATGAT ... 100. CACATAGATCGATCGATTGGCGATGAATGAT Disease-Associated Sequence Variant
  • 19.
    • Assumption • Principle •Technology Genome-Wide Association Studies (GWAS) Genetic component Linkage disequilibrium Microarrays
  • 20.
    From Lichten Nature2008;454:421 GWAS - rationale meiotic recombination
  • 21.
    Cardon & BellNat Rev Genet 2001;2:91 GWAS
  • 22.
    Kruglyak Nat RevGenet 2008;9:314 GWAS
  • 23.
  • 24.
    Corvin et al2010 GWAS analysis steps
  • 25.
    Pearson & Manolio,JAMA 2008;299:1335 Manhattan Plot
  • 26.
  • 27.
  • 28.
    GWAS in ASD Weisset al, Nature 2009 Wang et al, Nature 2009
  • 29.
    GWAS in ASD Wanget al, Nature 2009
  • 30.
    Cardon & Bell,Nat Rev Genet 2001;2:91 GWAS
  • 31.
    GWAS confounders -population stratification Novembre et al, Nature 2008;456:98
  • 32.
  • 33.
    Genetic Architecture ofComplex Diseases Manolio et al 2009 • FMR1 (Fragile X) • MECP2 (Rett) • TSC1/TSC2 (Tuberous Sclerosis) • CACNA1C (Timothy) • 15q duplication • 22q11 deletion • CDH9 and CDH10 • 5p15 (SEMA5A?) • MACROD2 • CNTNAP2
  • 34.
    Genetic Risk andASD 0 25 50 75 100 Common variation (1%) Estimated Heritability (80%) Mendelian forms (10%) ?
  • 35.
  • 36.
    Genetic Architecture ofComplex Diseases Manolio et al 2009 hundreds of common variants with small effect s
  • 37.
    GWAS in PsychiatricDisease Sullivan et al, Nat Rev Genet 2012
  • 38.
    Why Do WeNeed So Many Samples? Altshuler et al, 2008
  • 39.
    Franke et alNat Genet 2010 What to Expect: Insights from Other Complex Traits Cumulative fraction of genetic variance explained by 71 Crohn's disease risk loci.
  • 40.
    Allen et alNature 2010 What to Expect: Insights from Other Complex Traits
  • 41.
    Genetic Risk andASD 0 25 50 75 100 Common variation (20%??) Estimated Heritability (80%) Mendelian forms (10%) estimated
  • 42.
    Genetic Architecture ofComplex Diseases Manolio et al 2009 hundreds of common variants with small effect s [hundreds of rare variants with moderate effect size
  • 43.
    Outline 1. Key Concepts 2.Mendelian Genes 3. Risk Genes: Common Variants 4. Risk Genes: Rare Variants 5. Genomic Approaches
  • 44.
    Rare Copy NumberVariants (CNVs)
  • 45.
    rare CNVs Cooper etal, Nat Genet 2011
  • 46.
    Genetic Architecture ofComplex Diseases Manolio et al 2009 • FMR1 (Fragile X) • MECP2 (Rett) • TSC1/TSC2 (Tuberous Sclerosis) • CACNA1C (Timothy) • 15q duplication • 22q11 deletion • CDH9 and CDH10 • 5p15 (SEMA5A?) • MACROD2 • CNTNAP2 rare CNVs
  • 47.
    Genetic Risk andASD 0 25 50 75 100 Common variation (20%??) Estimated Heritability (80%) Mendelian forms (10%) rare CNVs (7%) estimated
  • 48.
    Genetic Architecture ofComplex Diseases Manolio et al 2009 • FMR1 (Fragile X) • MECP2 (Rett) • TSC1/TSC2 (Tuberous Sclerosis) • CACNA1C (Timothy) • 15q duplication • 22q11 deletion • CDH9 and CDH10 • 5p15 (SEMA5A?) • MACROD2 • CNTNAP2 rare CNVs rare sequence variants?
  • 49.
  • 50.
    Bamshad et al,Nat Rev Genet 2011;12:745 Exome Sequencing
  • 51.
  • 52.
    NGS Data Alignment toReference Genome
  • 53.
    NextGen Sequencing: MainPlatforms Roche 454 Illumina HiSeq2500
  • 54.
    Gartner Inc. NextGen Sequencing:the Hype Cycle 2005 2014
  • 55.
    Genome Length 3billion Positions Called 2.8 billion (~93%) Average Depth of Coverage 61 Number of Heterozygotes 2.4 million (0.09%) Variants in Coding Regions 20,696 Predicted deleterious 800-1,800 HGMD 726 never seen in EVS 1,724 (~8%) Genome Sequencing some numbers EVS: Exome Variant Server (evs.gs.washington.edu/EVS/) HGMD: Human Gene Mutation Database (www.hgmd.org/)
  • 56.
    PFBC: CAUSAL MUTATIONS ExomeSequencing www.my46.org
  • 57.
  • 58.
  • 59.
  • 60.
    Simons Simplex Collection http://sfari.org/resources/simons-simplex-collection TheSimons Simplex Collection (SSC) is a core project and resource of the Simons Foundation Autism Research Initiative (SFARI). The SSC achieved its primary goal to establish a permanent repository of genetic samples from 2,600 simplex families, each of which has one child affected with an autism spectrum disorder, and unaffected parents and siblings.
  • 61.
    Exome Sequencing O’Roak etal, Nature 2012 Iossifov et al, Neuron 2012 Sanders et al, Nature 2012
  • 62.
    ~200 ASD genes Excessof de novo events from older fathers Chen et al 2015
  • 63.
  • 64.
  • 65.
    Genetic Risk andASD 0 25 50 75 100 Common variation (20%??) Estimated Heritability (80%) Mendelian forms (10%) rare CNVs (7%) rare de novo events (10%?) estimated
  • 66.
    Genetic Risk andASD 0 25 50 75 100 Common variation (20%??) Estimated Heritability (80%) Mendelian forms (10%) rare CNVs (7%) rare de novo events (10%?) estimated 20% 33% 47% explained or estimated missing non-genetic
  • 67.
    Outline 1. Key Concepts 2.Mendelian Genes 3. Risk Genes: Common Variants 4. Risk Genes: Rare Variants 5. Genomic Approaches
  • 68.
  • 69.
    Green et al,Nature 2011;470:204
  • 70.
    • Understand diseasepathogenesis at the global level • Characterize susceptibility to complex diseases • Characterize and understand drug response (personalized treatment) Promise of Genomic Medicine Green et al Nature 2011;470:204
  • 71.
    Green et alNature 2011;470:204 Promise of Genomic Medicine
  • 72.
    Non-Genetic Factors common variant commonvariant Imaging features common variant Rare CNV Rare variant Rare variantRare variant CNV CNV common variant Gene Expression Epigenetics Towards a Personalized Genetic Risk Map common variant common variant common variant
  • 73.
    -OMICs Studies - Conventional Approach
  • 74.
  • 75.
  • 76.
    Differential Expression vs. Differential Co-Expression
    Mike Oldham, Steve Horvath
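    The slide only names the contrast. One common reading is that differential expression compares a gene's average level between groups, while differential co-expression compares how strongly two genes track each other in each group. The toy sketch below illustrates that reading with invented values: neither gene changes its mean between groups, yet their correlation flips sign.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Invented expression values for two genes in five samples per group.
control = {"gene1": [1, 2, 3, 4, 5], "gene2": [2, 4, 6, 8, 10]}
case = {"gene1": [1, 2, 3, 4, 5], "gene2": [10, 8, 6, 4, 2]}

# Differential expression: change in each gene's mean level between groups.
diff_expression = {g: mean(case[g]) - mean(control[g]) for g in control}

# Differential co-expression: change in the gene1-gene2 correlation between groups.
cor_control = pearson(control["gene1"], control["gene2"])
cor_case = pearson(case["gene1"], case["gene2"])
diff_coexpression = cor_case - cor_control

print(diff_expression)
print(round(cor_control, 3), round(cor_case, 3), round(diff_coexpression, 3))
```

    Here a mean-based comparison would report no change for either gene, while the correlation-based comparison reports a large one.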
  • 77.
    Weighted Gene Coexpression Network Analysis (WGCNA)
    Steve Horvath, PhD
  • 78.
    Flight connections and hub airports
    The nodes with the largest number of links (connections) are most important!
    **Slide courtesy of A Barabasi
    Steve Horvath
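    The hub idea can be made concrete by counting links per node (the node's degree). The sketch below uses a made-up set of flight connections; the airport codes and routes are purely illustrative and are not taken from the slide.

```python
from collections import Counter

# Hypothetical, undirected flight connections (illustrative only).
flights = [
    ("LAX", "JFK"),
    ("LAX", "ORD"),
    ("LAX", "SFO"),
    ("LAX", "SEA"),
    ("JFK", "ORD"),
    ("SFO", "SEA"),
]


def degrees(edges):
    """Count how many links touch each node."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts


degree = degrees(flights)
hub, hub_degree = degree.most_common(1)[0]  # node with the most links
print(hub, hub_degree)
```

    In a gene network the same count, applied to co-expression links instead of flights, gives each gene's connectivity.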
  • 79.
    1. Construct a network
       Rationale: make use of interaction patterns between genes
    2. Identify modules
       Rationale: module (pathway) based analysis
    3. Relate modules to external information
       Array Information: Clinical data, SNPs, proteomics
       Gene Information: gene ontology, EASE, IPA
       Rationale: find biologically interesting modules
    4. Find the key drivers in interesting modules
       Tools: intramodular connectivity, causality testing
       Rationale: experimental validation, therapeutics, biomarkers
    5. Study Module Preservation across different data
       Rationale:
       • Same data: to check robustness of module definition
       • Different data: to find interesting modules.
    Steve Horvath
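    The first, second, and fourth steps above can be sketched on toy data. This is a simplified illustration, not the WGCNA implementation: the expression values, gene names, and the cutoff are invented, and module detection is replaced by a plain threshold-and-connected-components stand-in (WGCNA itself clusters genes hierarchically on a topological-overlap measure). The soft-threshold adjacency |cor|^beta is the unsigned form used in weighted co-expression networks; beta = 6 is assumed here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Invented expression profiles (one value per sample).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [1.2, 1.9, 3.1, 4.2, 4.8],
    "geneD": [5.0, 1.0, 4.0, 2.0, 3.0],
    "geneE": [4.8, 1.3, 4.1, 1.8, 3.2],
}
BETA = 6  # assumed soft-thresholding power
genes = list(expression)

# Step 1: weighted network, adjacency = |correlation| ** BETA.
adjacency = {
    g: {h: abs(pearson(expression[g], expression[h])) ** BETA
        for h in genes if h != g}
    for g in genes
}


def find_modules(adjacency, cutoff=0.5):
    """Stand-in for module detection: connected components of strong links."""
    unassigned = set(adjacency)
    found = []
    while unassigned:
        stack, module = [min(unassigned)], set()
        while stack:
            g = stack.pop()
            if g in module:
                continue
            module.add(g)
            stack.extend(h for h, a in adjacency[g].items()
                         if a > cutoff and h not in module)
        unassigned -= module
        found.append(sorted(module))
    return found


# Step 2: identify modules.
modules = find_modules(adjacency)

# Step 4: intramodular connectivity = summed adjacency to genes in the same module.
intramodular = {
    g: sum(adjacency[g][h] for h in module if h != g)
    for module in modules for g in module
}

print(modules)
print({g: round(k, 3) for g, k in intramodular.items()})
```

    Genes with the highest intramodular connectivity would be the candidate key drivers that the fourth step refers to.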
  • 80.
  • 81.
  • 82.
  • 83.
  • 84.
  • 87.
    Integrative Functional Genomic Analyses Implicate Specific Molecular Pathways and Circuits in Autism
    Neelroop N. Parikshak,1,2 Rui Luo,3,4 Alice Zhang,2 Hyejung Won,1 Jennifer K. Lowe,1,4 Vijayendran Chandran,5 Steve Horvath,3,6 and Daniel H. Geschwind1,2,3,4,5,*
    1Program in Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
    2Interdepartmental Program in Neuroscience, University of California, Los Angeles, Los Angeles, CA 90095, USA
    3Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
    4Center for Autism Treatment and Research, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
    5Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
    6Department of Biostatistics, University of California, Los Angeles, Los Angeles, CA 90095, USA
    *Correspondence: dhg@ucla.edu
    http://dx.doi.org/10.1016/j.cell.2013.10.031
    SUMMARY
    Genetic studies have identified dozens of autism spectrum disorder (ASD) susceptibility genes, raising two critical questions: (1) do these genetic loci converge on specific biological processes, and (2) where does the phenotypic specificity of ASD arise, given its genetic overlap with intellectual disability (ID)? To address this, we mapped ASD and ID risk genes onto coexpression networks representing developmental trajectories and transcriptional profiles representing fetal and adult cortical laminae. ASD genes tightly coalesce in modules that implicate distinct biological functions during human cortical development, including early transcriptional regulation and synaptic development. Bioinformatic analyses suggest that translational regulation by FMRP and transcriptional coregulation by common transcription factors connect these processes. At a circuit level, ASD genes are enriched in superficial cortical layers and glutamatergic projection neurons. Furthermore, we show that the patterns of ASD and ID risk genes are distinct, providing a biological framework for further investigating the pathophysiology of ASD.
    INTRODUCTION
    Autism spectrum disorder (ASD) is a heterogeneous neurodevelopmental disorder in which hundreds of genes have been implicated (Berg and Geschwind, 2012; Geschwind and Levitt, 2007). Analysis of copy number variation (CNV) and exome sequencing have identified rare variants that alter dozens of protein-coding genes in ASD, none of which account for more than 1% of ASD cases (Devlin and Scherer, 2012). This and the fact that a significant fraction (40%–60%) of ASD is explained by common variation (Klei et al., 2012) point to a heterogeneous genetic architecture.
    These findings raise several issues. Based on the background human mutation rate (MacArthur et al., 2012), most genes affected by only one observed rare variant to date are likely false positives that do not increase risk for ASD (Gratten et al., 2013). It is therefore essential to develop approaches that prioritize singleton variants, especially missense mutations. Furthermore, given the heterogeneity of ASD, it would be valuable to identify common pathways, cell types, or circuits disrupted within ASD itself. Recent studies combining gene expression, protein-protein interactions (PPIs), and other systematic gene annotation resources suggest some molecular convergence in subsets of ASD risk genes (Ben-David and Shifman, 2013; Gilman et al., 2011; Sakai et al., 2011; Voineagu et al., 2011). Yet, it remains unclear how the large number of genes implicated through different methods may converge to affect human brain development, which is critical to a mechanistic understanding of ASD (Berg and Geschwind, 2012). Additionally, ASD has considerable overlap with ID at the genetic level, so identifying molecular pathways and circuits that confer the phenotypic specificity of ASD would be of considerable utility (Geschwind, 2011; Matson and Shoemaker, 2009).
    Here, we took a stepwise approach to determine whether genes implicated in ASD affect convergent pathways during in vivo human neural development and whether they are enriched in specific cells or circuits (Figure 1A). First, we constructed transcriptional networks representing genome-wide functional relationships during fetal and early postnatal brain development and mapped genes from multiple ASD and ID resources to these networks. We then assessed shared neurobiological function among these genes, including coregulatory relationships and enrichment in layer-specific patterns from
    http://geschwindlab.neurology.ucla.edu/sites/all/files/networkplot/ParikshakDevelopmentalCortexNetwork.html
    genomics: WGCNA
  • 88.
    OMICs Approaches to Human CNS Disease
    Genetics: Genotyping, Sequencing
    Gene Expression: Transcriptome, Proteome
    Epigenetics: Methylome, Histone Modifications
    Imaging: Structural, Functional, Neuropathology
    Clinical Phenotype: Binary, Quantitative
  • 89.
    OMICs Approaches to Human CNS Disease
    Genetics: Genotyping, Sequencing
    Gene Expression: Transcriptome, Proteome
    Epigenetics: Methylome, ...
    Imaging: Structural, Functional, Neuropathology
    Clinical Phenotype
    unidimensional approach vs. systems biology approach
    Geschwind and Konopka, Nature 2010
  • 90.
    Network Edge Orienting (NEO)
    [diagram: Genetics, meth 1, meth 2, meth 3, and three trait nodes joined by oriented edges; edge scores shown: 0.6, 3.5]
    Steve Horvath
    Aten et al, BMC Systems Biol 2008
  • 91.
    Conclusions
    1. The genetic map for ASD and ID and the role of common and rare variation are increasingly characterized
    2. Two technological advances (microarrays and sequencing) have facilitated progress over the past 10 years
    3. Whole-genome sequencing will clarify the role of non-coding variation
    4. Replication and functional validation pose significant challenges
  • 92.