Variation in crop genomes and heterosis
Nathan Springer
Plant breeding relies upon variation
What are the molecular variants that underlie phenotypic diversity?
• SNPs
• InDels – CNV/PAV
• Transposons
• Epigenetics
• Expression levels
How prevalent are these types of variation?
How do they behave in breeding?
Variation: heterosis and transgressive segregation
• Transgressive variation is the basis for much of classical breeding efforts
• Apparent phenotypic similarity does not indicate similar genetic mechanisms
[Figure: phenotype distributions (proportion of population) for B73, Mo17, the F1, the F2, and short, intermediate, and tall RILs]
Outline
• Molecular variation in crop genomes
– Expression variation
– Structural variation
– Epigenetic variation
• Heterosis
[Figure: bar charts of expression for Gene Q, Gene S, Gene T and for Gene A, Gene B, Gene C across inbreds B14, B37, B73, B84, Mo17, Oh43, W22, and Wf9]
Transcriptome differences among parents
• Comparisons of different individuals of the same species reveal a surprising number of gene expression differences
% Differentially expressed genes (from 12,327 expressed genes)
        B37     B73     B84     Mo17    Oh43    W22     Wf9
B14A    11.8%   8.5%    6.5%    16.2%   16.6%   12.0%   11.7%
B37             11.0%   9.5%    14.2%   12.7%   12.1%   11.8%
B73                     4.3%    15.6%   14.7%   13.1%   12.0%
B84                             15.7%   15.6%   13.5%   11.3%
Mo17                                    15.0%   15.0%   14.2%
Oh43                                            14.2%   12.1%
W22                                                     11.0%
• These differentially expressed genes include many examples of genes that are only expressed in some genotypes
History of structural variation studies in maize
Kato et al., 2004 PNAS
Brunner et al. 2005 Plant Cell
Sequencing of multiple haplotypes: Dooner, Rafalski,
Morgante, Schnable, Messing
Gain/loss in Hp301, Tx303 and teosinte
• Many gain/loss sequences in Hp301, Tx303 and teosinte
• Significant amount of reference genome is missing in each line
– Hp301: 24 Mbp
– Tx303: 29 Mbp
– Mo17: 25 Mbp
[Figure: sequence gain and loss relative to the reference genome in Mo17, Hp301, Tx303, and teosinte (Teo)]
Kai Ying; ISU
Gene-centric array-based analysis of structural variation in diverse maize
• 24 diverse maize lines (4 SS / 6 NSS / 5 tropical / 6 PVP / 3 mixed)
• 14 teosinte genotypes (4 TIL / 10 wild individuals)
Swanson-Wagner et al., Genome Res. 2010
– Not significant: n=28,675
– PAV / DownCNV: n=3,334
– UpCNV: n=402
– Both: n=76
[Figure: number of genotypes with each variant plotted against physical position (Mb) for Chr9 and Chr10]
Functional implications of structural variants
• Many, but certainly not all, genes with structural variation are “Unclassified”
• “Classical” maize genes (Schnable, Freeling)
– 24 / 420 tested (4 CNV and 20 PAV)
• Transcription factors (GRASSIUS)
– 98 / 1,723 tested (7 CNV and 91 PAV)
Structural variation contributes significantly to quantitative trait variation in maize (Chia et al., 2012 Nat Genetics)
Examples of CNV affecting important traits in plants
Cold tolerance in barley (Knox et al TAG 2010)
SCN resistance in soybean (Cook et al. Science 2012)
Flowering time in wheat (Diaz et al 2012 PLoS One)
Herbicide resistance in weeds (Gaines et al., 2010 PNAS)
Potential causes of CNV
Potential sources of dispersed duplicates:
1. Transposition
2. On-going fractionation of syntenic regions (Schnable et al., 2011 PNAS)
CoGePedia
Reference genome sequence
Pan-genome?
Epigenetics - definitions
• Epigenetics: Heritable information not solely due to DNA sequence
• Mitotic memory: Development; response to environment
• Meiotic / trans-generational memory: Silencing of TEs; heritable variation
• Chromatin modifications (DNA methylation / histone modifications) are often a mechanism of epigenetic memory but are not necessarily epigenetic
Epigenetics can contribute to natural variation
• Tip of the iceberg or rare form of variation?
• Most examples of trans-generational epigenetic
regulation are variable within the species
– Arabidopsis
• SUP, PAI, BAL
– Maize
• B, P, C, Pl, R
• Epigenetically silenced alleles may represent
genes on the path to genomic removal via
genetic mechanisms
Morgan et al., 1999
Cubas et al., 1999 Chandler and Stam 2004
DMRs in diverse maize genotypes
DNA methylation diversity in maize
[Figure: hierarchical clustering of 1,754 rare DMRs and 1,966 common DMRs across maize, landrace, and teosinte genotypes, colored by hypomethylation or hypermethylation]
• Rare “loss” of DNA methylation more common than rare “gain”
• Diversity of epigenome mirrors genomic diversity
Functional Consequences of DMRs
[Figure: NAM inbred 5mC profiles compared with NAM inbred RNA-seq]
~40M RNA-seq reads (tissue matched) for each NAM parental inbred
Compare transcript abundance and DNA methylation variation
• Identified the nearest genes to each DMR (2,375 genes within 10 kb of a DMR) and assessed correlation with transcript abundance
• 277 (of 2,375 tested) had a significant (q<0.01) negative correlation with expression [53 genes exhibited a positive correlation]
• No significant GO enrichments; many genes lack syntenic orthologs in other species (TEs or novel genes)
• Expression of ~0.7% of all genes is associated with nearby DNA methylation variation
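The correlation step can be sketched in a few lines. This is only an illustration under stated assumptions, not the analysis behind the slide: the gene names, the per-inbred methylation and expression values, and the function names are hypothetical, and the significance testing (the q<0.01 threshold) is left out.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists (both must vary)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: one value per inbred line for each DMR-linked gene.
methylation = {
    "geneA": [0.90, 0.80, 0.10, 0.20, 0.85],
    "geneB": [0.10, 0.50, 0.90, 0.30, 0.70],
}
expression = {
    "geneA": [2.0, 5.0, 80.0, 60.0, 3.0],
    "geneB": [10.0, 30.0, 55.0, 20.0, 40.0],
}

# Correlate methylation of the nearby DMR with transcript abundance per gene.
correlations = {
    gene: pearson_r(methylation[gene], expression[gene]) for gene in methylation
}
negative = [g for g, r in correlations.items() if r < 0]
positive = [g for g, r in correlations.items() if r > 0]
print(correlations, negative, positive)
```

In this toy data geneA follows the common pattern (more methylation, less expression), while geneB shows the rarer positive relationship.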
[Figure: example genes showing a qualitative association and a quantitative association between DNA methylation and expression]
Outline
• Molecular variation in crop genomes
– Expression variation
– Structural variation
– Epigenetic variation
• Heterosis
What is heterosis?
Heterosis refers to the phenomenon in which hybrid
offspring exhibit characteristics that lie outside the range
of the parents
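Heterosis is commonly expressed as a percentage relative to the parents. The sketch below shows two usual measures, mid-parent heterosis and best-parent heterosis (BPH). The function names and the plant heights are hypothetical, and "best" is assumed to mean the higher parental value, which does not hold for every trait (for example, days to flowering).

```python
def mid_parent_heterosis(parent1, parent2, f1):
    """Percent deviation of the F1 from the mean of the two parents."""
    mid_parent = (parent1 + parent2) / 2
    return 100 * (f1 - mid_parent) / mid_parent

def best_parent_heterosis(parent1, parent2, f1):
    """Percent deviation of the F1 from the higher-valued parent (assumed 'best')."""
    best_parent = max(parent1, parent2)
    return 100 * (f1 - best_parent) / best_parent

# Hypothetical plant heights in cm.
b73, mo17, hybrid = 200.0, 180.0, 250.0
mph = mid_parent_heterosis(b73, mo17, hybrid)
bph = best_parent_heterosis(b73, mo17, hybrid)
print(f"MPH = {mph:.1f}%, BPH = {bph:.1f}%")
```

A positive BPH means the hybrid lies outside the parental range, which matches the definition given here.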
[Figure: Mo17, B73, and their F1 hybrid]
Two major goals for research into mechanisms of heterosis
• Goal 1: Improve prediction of ideal hybrid genotypes. Testing hybrid combinations involves major cost/effort, and improved prediction could make this process more efficient.
• Goal 2: Develop inbred lines, or approaches, that “capture” phenotypic gains of heterosis.
Observation and quantification of heterosis
• Heterosis is most readily observed and quantified when two pure-breeding homozygous lines are crossed
• Heterosis is distinct from segregation and transgressive variation
[Figure: height (cm) distributions (proportion of population) for Parent 1, Parent 2, and the F1, with an example F2 and an example RIL population (short, intermediate, and tall RILs); B73 and Mo17 shown as parents]
Heterosis use pros/cons
• Heterosis can generate high levels of uniform production that
can be re-generated each generation and allow for strong
selection in parents
• Heterosis results in complications in seed production and seed
value
• Choosing to use heterosis likely limits breeding progress
Many traits exhibit heterosis
• Measurements of different plant traits in over 400 maize hybrids provide evidence for prevalent heterosis for many traits
Flint-Garcia SA, Buckler ES, Tiffin P, Ersoz E,
Springer NM. 2009. Heterosis is prevalent for
multiple traits in diverse maize germplasm.
PLoS ONE 4:e7433
Phenotypic observations about heterosis
• Heterosis is generally due to contributions from many loci (QTL mapping studies)
Stuber CW, Lincoln SE, Wolff DW, Helentjaris T, Lander ES. 1992. Identification of genetic factors contributing to heterosis in a hybrid from two elite maize inbred lines using molecular markers. Genetics 132:823-39
• There are some examples of heterosis due to the effects of a single locus
Krieger U, Lippman ZB, Zamir D. 2010. The flowering gene SINGLE FLOWER TRUSS drives heterosis for yield in tomato. Nat. Genet. 42:459-63
Phenotypic observations about heterosis
• Heterosis cannot be summarized as a single value at the organismal level (it varies from trait to trait)
[Figure: B73 Best Parent Heterosis. Ranking (0 to 14) of hybrids H100xB73, B84xB73, F2xB73, B14axB73, Mo17xB73, B77xB73, H99xB73, W64axB73, W22xB73, Wf9xB73, B37xB73, Oh43xB73, and A188xB73 for final height, stalk diameter, days to flower, number of chutes, 50 seed weight, kernel rows, week 3 height, average biomass (*), and greenhouse height (*)]
Phenotypic observations about heterosis
• Different genes likely control heterosis for different traits (lack of correlation for heterosis for different traits)
• 115 diverse inbreds each crossed to B73 and Mo17
[Figure: scatter plots comparing BPH for plant height, yield, and cob weight; trait labels: DistB73, DTT, PlantYield, TSLLEN, TSLBCHCNT, TSLANG, PLTHT, UPLFANG, LEAFWDT, LEAFLEN, RPR, STLKWDT, 10KWeight, CobDiameter, KernelHeight, EarLength, CobWeight, TotKWt]
Flint-Garcia SA, Buckler ES, Tiffin P, Ersoz E, Springer NM. 2009. Heterosis is prevalent for multiple traits in diverse maize germplasm. PLoS ONE 4:e7433
Phenotypic observations about heterosis
•Heterosis is only partially correlated with genetic diversity
[Figure: plant height BPH and plant yield BPH plotted against genetic diversity (from Hamblin 2008)]
Attempting to understand heterosis
• Genetic basis; dominance, over-dominance, etc
• Molecular basis; dosage, allele-preference, etc
• Many possible answers each with some evidence
– little evidence for a common answer
Dominance and over-dominance
• The dominance theory of heterosis posits that inbred lines have
mildly deleterious alleles and heterosis is the result of
complementation of these defects
• The over-dominance theory of heterosis suggests that
heterozygosity per se results in heterosis
• Associated concepts
• Pseudo-over-dominance
• Epistasis
• Birchler and others have encouraged moving past dominance /
over-dominance debate to think in more quantitative or systems
approaches
Dominance
• Evidence for substantial genetic load (deleterious alleles)
from inbreeding depression and from genomic analyses
• Dominance contribution to heterosis must be through MILD
deleterious alleles and likely to be highly multi-genic.
• Also consider capture of “beneficial” alleles - adaptedness
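The dominance explanation can be illustrated with a toy model. This is a sketch under strong assumptions, not a description of real maize loci: each locus is treated as completely dominant, every locus contributes equally, and the two haplotypes are made up.

```python
def line_value(genotype):
    """Count loci carrying at least one favorable '+' allele (complete dominance assumed)."""
    return sum(1 for alleles in genotype if "+" in alleles)

def inbred(haplotype):
    """Homozygous line: both copies of every locus are identical."""
    return [(allele, allele) for allele in haplotype]

def f1_hybrid(haplotype1, haplotype2):
    """F1 hybrid: one haplotype inherited from each inbred parent."""
    return list(zip(haplotype1, haplotype2))

# Hypothetical haplotypes: '+' favorable allele, '-' mildly deleterious allele.
parent1_hap = "+++--+++"
parent2_hap = "--+++-++"

parent1_value = line_value(inbred(parent1_hap))
parent2_value = line_value(inbred(parent2_hap))
hybrid_value = line_value(f1_hybrid(parent1_hap, parent2_hap))
print(parent1_value, parent2_value, hybrid_value)
```

Because the deleterious alleles sit at different loci in the two parents, every locus in the hybrid is complemented and the hybrid outscores both inbreds.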
[Figure: Parent 1, Parent 2, and hybrid F1]
Arguments against pure-dominance
• Over-dominant action for some loci
– Response: Potential pseudo-overdominance
• Lack of ability to “capture” positive alleles and purge deleterious alleles
– Response: many genes involved, each with small effects, which may limit the ability to purge deleterious alleles
[Figure: Parent 1 and Parent 2 chromosomes carrying interspersed favorable (+) and deleterious (-) alleles, and the hybrid F1; ideal recombinant chromosomes may require too many cross-overs]
The case for over-dominance
• Loci with over-dominant contribution to phenotype
have been identified (SFT, Erecta, QTL studies)
• Observations of heterosis and inbreeding
depression in polyploids suggest mechanisms
beyond dominance
• Lack of progress in “removing” heterosis and limited
expectations for genetic load
Molecular basis of heterosis
[Figure: Mo17 and B73 parents and their F1 (?)]
A. What molecular variation exists between parents?
B. What is unique about the hybrid?
No heterosis without variation among parents:
Understanding variation and how it combines is
important for heterosis
What tissue to survey?
What is unique about the hybrid?
Transcriptional levels?
[Figure: potential hybrid expression levels (A to E, scale 0 to 6) relative to Parent 1 and Parent 2: above high parent, high parent-like, mid-parent, low parent-like, and below low parent]
Differentially expressed genes
• Since many traits have values outside the parental range it
was expected that many genes would also be expressed
outside the parental range
• Most genes are expressed at levels within parental range
What is unique about the hybrid?
How might mid-parent expression levels be beneficial?
•Many mid-parent (additive) expression patterns
•Potential “Goldilocks” effect of gene expression on phenotype
• The mode of gene action for an expression level (e.g., additive) need not equal the mode of gene action for the resulting phenotype
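A toy model shows how additive (mid-parent) expression could still produce a non-additive phenotype. Everything here is hypothetical: the expression values, the optimum for each gene, the tolerance, and the linear penalty outside the optimal range. The optima are deliberately placed between the parents, so this illustrates the idea rather than supporting it.

```python
def penalty(expression, optimum, tolerance):
    """Hypothetical cost: zero inside the optimal range, growing with distance outside it."""
    return max(0.0, abs(expression - optimum) - tolerance)

# gene: (inbred 1 expression, inbred 2 expression, assumed optimum)
genes = {
    "Gene A": (2.0, 6.0, 4.0),
    "Gene B": (7.0, 3.0, 5.0),
    "Gene C": (1.0, 5.0, 3.0),
}
TOLERANCE = 0.5  # half-width of the assumed optimal expression range

inbred1_total = sum(penalty(i1, opt, TOLERANCE) for i1, _, opt in genes.values())
inbred2_total = sum(penalty(i2, opt, TOLERANCE) for _, i2, opt in genes.values())
# The hybrid is assumed to express every gene at the mid-parent (additive) level.
hybrid_total = sum(
    penalty((i1 + i2) / 2, opt, TOLERANCE) for i1, i2, opt in genes.values()
)
print(inbred1_total, inbred2_total, hybrid_total)
```

Each inbred is too high for some genes and too low for others, so the hybrid, which is strictly additive for expression, ends up with a lower total penalty than either parent.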
[Figure: expression level of Gene A, Gene B, and Gene C in Inbred 1, Inbred 2, and the hybrid relative to an optimal expression range, with increasingly detrimental over-expression above it and increasingly detrimental under-expression below it]
What is unique about the hybrid?
Unique genome / transcriptome content
• Hybrids encode more genes and express more genes than
either parent
• Basically a dominance explanation
• How might these genes contribute to heterosis?
[Figure: gene content of Parent 1 and Parent 2]
• Most genes present/expressed in both parents
• Small number of genes unique to each parent
• All genes present / expressed in hybrid
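The complementation of gene content can be written as simple set operations. The gene identifiers below are hypothetical placeholders, and the sketch assumes that presence in either parent implies presence in the hybrid.

```python
# Hypothetical gene content of two inbred parents.
parent1_genes = {"g1", "g2", "g3", "g4", "p1_only"}
parent2_genes = {"g1", "g2", "g3", "g4", "p2_only_a", "p2_only_b"}

shared = parent1_genes & parent2_genes        # present in both parents
unique_to_p1 = parent1_genes - parent2_genes  # PAV: missing in parent 2
unique_to_p2 = parent2_genes - parent1_genes  # PAV: missing in parent 1
hybrid_genes = parent1_genes | parent2_genes  # F1 carries one copy of each genome

print(len(parent1_genes), len(parent2_genes), len(hybrid_genes))
```

The hybrid set is the union, so it is at least as large as either parental set and strictly larger whenever each parent has genes the other lacks.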
Improved interactions may lead to improved transition precision
• Genome content variation often affects members of gene families and therefore may lead to subtle perturbations of proper interactions
• Birchler and Veitia have proposed the dosage balance hypothesis
• Proposal: having correct interactions in complexes may be critical to achieving proper developmental transitions and stress responses
• As co-evolved gene family members are re-united in hybrids, they are more efficient at precise transitions in development or in response to stress
• Important to remember that selection has been strong to move from teosinte to maize and to filter out major deleterious alleles
[Figure: gene family members A1, A2, B1, and B2 in Inbred 1, Inbred 2, and the hybrid]
The loss of genes (from genome or transcriptome) may be tolerated due to partial redundancy of paralogs or orthologs
This allows survival of inbred lines lacking genes, but they may “break down” and provide sub-optimal performance, especially during transitions and stress
[Photo caption: after bear damage, “repaired” for the trip home using duct tape]
Heterosis Summary
• Heterosis varies among traits and tissues
• Search for unifying principles among traits and species may not
be successful
• Distinct mechanisms and causes of molecular variation (genome, transcriptome, epigenome) and of their action to produce phenotypic heterosis
• Selective pressures and genetic load (history) matter
• Modern day lines represent significant selection upon natural
genetic materials
• Limited utility of heterotic groups
Compare / contrast maize-switchgrass heterosis
• Both are allopolyploid outcrossers with large effective population size, so likely abundant genetic load and on-going fractionation
• Breeding style limitations
• Differences in “domesticated vs wild” are distinct in the two species
• Peter Hermanson
• Steve Eichten
• Amanda Waters
• Qing Li
• Ruth Swanson-Wagner
• Matthew Vaughn (TACC )
• Jawon Song (TACC)
• Irina Makarevitch (Hamline)
• Damon Lisch (Berkeley)
Iowa State U
-Patrick Schnable
-Eddy Yeh
NimbleGen
-Jeffrey Jeddeloh
U Georgia
-Kelly Dawe
-Xiaoyu Zhang
-Jonathan Gent
-Nathaniel Ellis
U of Minnesota
-Bob Stupar
-Chad Myers
-Roman Briskine
-Rob Schaefer
-Peter Tiffin
-Lin Li
-Gary Muehlbauer
U of Wisconsin
-Shawn Kaeppler
-Scott Stelpflug
NSF DBI# 0922095
NSF IOS# 1237931
Modeling of heterosis phenotypes
• Use parental phenotype, genetic distance between parents
and environment to model hybrid performance
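Agreement between predicted and actual hybrid values is usually summarized with R2. The sketch below computes the coefficient of determination for made-up plant heights. The slides do not say how R2 was calculated, so the formula (1 minus residual over total sum of squares) is an assumption, as are the data and the function name.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical hybrid plant heights (cm): observed vs. model output.
actual = [210.0, 225.0, 240.0, 198.0, 232.0]
predicted = [205.0, 230.0, 236.0, 204.0, 228.0]

r2 = r_squared(actual, predicted)
print(f"R2 = {r2:.2f}")
```

A value near 1 means the model explains most of the variation among hybrids; values around 0.5 to 0.7 leave a large share unexplained.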
[Figure: scatter plots of predicted vs. actual values. A. Cob diameter; B. Cob weight; C. Plant height; D. Total kernel weight]
Population 1 (R2 = 0.70)
Population 2 – B73 OC (R2 = 0.73)
Population 2 – Mo17 OC (R2 = 0.70)
Population 1 (R2 = 0.91)
Population 2 – B73 OC (R2 = 0.69)
Population 2 – Mo17 OC (R2 = 0.56)
Population 1 (R2 = 0.76)
Population 2 – B73 OC (R2 = 0.53)
Population 2 – Mo17 OC (R2 = 0.54)
Population 1 (R2 = 0.74)
Population 2 – B73 OC (R2 = 0.65)
Population 2 – Mo17 OC (R2 = 0.55)
“Adaptedness” concept from Troyer 2006
Flint-Garcia et al. PLoS ONE 2009
Many plant species exhibit heterosis
• Heterosis is also prevalent in many other plant species
although the magnitude and prevalence of heterosis varies
• Note: Actual genetic architecture of heterosis may vary
depending on past selection pressures and natural history
Groszmann M, Greaves IK, Albertyn ZI, Scofield GN, Peacock WJ, Dennis ES. 2011. Changes in 24-nt siRNA levels in Arabidopsis hybrids suggest an epigenetic contribution to hybrid vigor. Proc. Natl. Acad. Sci. USA 108:2617-22
[Figure panels: Rice and Arabidopsis; credit: Qifa Zhang]
Unique expression in hybrids?
• Limited evidence for unique expression levels in hybrids
[Figure: potential hybrid expression levels (A to E, scale 0 to 6) relative to Parent 1 and Parent 2, with the high-parent, mid-parent, and low-parent levels marked; non-additive classes are non-additive between parents, high-parent, low-parent, above high-parent, and below low-parent]
                        B84xB73      B37xB73      Oh43xB73     Oh43xMo17    Mo17xB73     B73xMo17
# DE genes              290          655          1071         885          1064         1055
# Non-additive          88 (30.3%)   159 (24.3%)  296 (27.6%)  233 (26.3%)  247 (23.2%)  266 (25.2%)
# NA between parents    83           126          232          184          201          209
# HP or LP              5            32           58           47           44           55
# AHP or BLP            0            3            6            2            2            2
(NA = non-additive; HP / LP = high-parent / low-parent; AHP / BLP = above high-parent / below low-parent)
Similar results in Guo et al., 2006; Stupar and Springer 2006; Swanson-Wagner 2006
Contrasting results in Auger et al., 2005; Meyer et al., 2007; Uzarowska et al., 2007
Does epigenotype have information beyond
genotype for predicting phenotype?
• Epigenotype is more costly to determine than genotype
– Is there novel information in epigenotype for predicting
phenotype?
• Remember: Epigenotype will predominantly act through
alteration of expression levels
[Diagram: genotype (SNPs / TEs), epigenotype, and environment feed into gene product variation, either quantitative (altered levels of gene product) or qualitative (altered quality of gene product), which in turn determines phenotype; “?” marks uncertain links]
Distribution of structural polymorphism in B73/Mo17
Springer et al., PLoS Genetics 2009
Belo et al., TAG 2010
Both shared and unique structural variants
[Figure: 21,000 probes in a 20 Mb region of chromosome 4 for Mo17, Hp301, and Tx303, with examples of sequence missing in all 3 genotypes, missing in Mo17 and Tx303, missing in Tx303 only, and copy gain in Hp301 and Tx303]
Novel Hybridization Patterns
Apparent “de novo” CNV in RILs
Segregation of Non-Allelic Gene Copies Generates PAVs/CNVs and Novel Phenotypes
• Changes in gene complement among RILs
• Strong statistical support for association between gene loss and yield component traits in IBM RILs
Liu et al., Plant J. 2012
Frequent unlinked Mo17 copy gains
• 4,994 probes detect Mo17-specific sequence duplications
• Could be local or unlinked copy gains in Mo17
60% unlinked (trans)
10% linked (cis)
30% unassigned
Most Mo17 copy number gains occur at
unlinked genomic positions
[Figure: scatter plots of probe signal by NIL type and genotype at locus (B or M) in B73 and Mo17 for unlinked duplications and linked duplications; examples labeled AC186656, AC194260, AC191373, and AC198648]
Eichten et al., Plant Phys. 2011
Non-Mendelian gene expression variation in maize
• RNA-seq analysis of expression in ~100 RILs
• Most genes have expected patterns (normal or bi-modal distribution)
• ~150 examples of paramutation-like patterns
• ~200 genes with unexpected patterns of presence-absence for transcripts
Lin Li, Gary Muehlbauer : Li et al., PLoS Genetics 2013
[Figure: proportion of genes in each d/a bin, from < -2.0 to > 2.0; d/a = -1.0 marks the low-parent level, 0 the mid-parent level, and 1.0 the high-parent level]
• The majority of genes exhibit hybrid expression levels within
the parental range (94%)
• Similar distributions of additive and non-additive expression for
different hybrids
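The d/a ratio places hybrid expression on a scale where -1 is the low parent, 0 the mid-parent, and +1 the high parent. The sketch below computes it for a gene that differs between the parents. The class boundaries (a tolerance of 0.25 around -1, 0, and +1) and the function names are hypothetical choices for illustration, not thresholds taken from the slides.

```python
def d_over_a(parent1, parent2, hybrid):
    """d/a: hybrid deviation from the mid-parent, scaled by half the parental difference."""
    mid_parent = (parent1 + parent2) / 2
    a = abs(parent1 - parent2) / 2  # additive effect; parents must differ
    d = hybrid - mid_parent         # dominance deviation
    return d / a

def classify(ratio, tol=0.25):
    """Assign an expression class; tol is an assumed tolerance, not a published cutoff."""
    if ratio > 1 + tol:
        return "above high-parent"
    if ratio < -1 - tol:
        return "below low-parent"
    if abs(ratio) <= tol:
        return "mid-parent (additive)"
    if ratio >= 1 - tol:
        return "high-parent"
    if ratio <= -1 + tol:
        return "low-parent"
    return "non-additive between parents"

# Hypothetical expression levels: parents at 2 and 6, several possible hybrids.
for hybrid_level in (4.0, 5.0, 6.0, 7.0, 1.0):
    ratio = d_over_a(2.0, 6.0, hybrid_level)
    print(hybrid_level, ratio, classify(ratio))
```

Only ratios beyond roughly -1 or +1 correspond to expression outside the parental range, the class that the slides report as uncommon.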
[Figure legend: hybrids B84xB73, B37xB73, Oh43xB73, Oh43xMo17, Mo17xB73, and B73xMo17; inset shows potential hybrid expression levels A to E (scale 0 to 6) relative to Parent 1 and Parent 2]
Heterosis and genome content variation
• Content variation may be a potential contributor to heterosis
– Hybrids contain more genes and express more genes than either parent
[Figure: groups labeled NSS, PVP, and SS]
How do B73 and Mo17 genomes vary?
• SNPs (coding and non-coding)
• InDels (including transposons)
• Copy number variation (and PAV)
• Epigenetic information
What happens in hybrids?
• Majority of B73 vs Mo17 DMRs show mid-parent methylation levels in hybrid F1s
• 5-10 DMRs show high-parent methylation state
[Figure: methylation in B73, Mo17, and reciprocal hybrids (BxM and MxB), on a scale from more Mo17-like to more B73-like]
Genome-wide Assessment of DNA methylation
• 1.1 million experimental probes placed every 200 bp
– single-copy
– corrected for CGH effects
• meDIP-chip (5mC) and ChIP-chip for H3K9me2 and H3K27me3
– Antibody pulldown of methylated DNA (not context-specific) contrasted against control gDNA
• Assess relative methylation enrichment across low-copy space of maize genome
[Figure: analysis tracks showing B73 methylation, Mo17 methylation, genes, and repeats]
Matt Vaughn, TACC
DNA Methylation variation is prevalent between
genotypes, but not between tissues
DNA methylation
Eichten et al., Plant Genome 2012
H3K27me3
Makarevitch et al., Plant Cell 2013
Maize epigenomic profiling
Genome wide distribution
• 5mC and H3K9me2 are largely overlapping and enriched in pericentromeric regions
• H3K9me2 is rarely found within genes
• H3K27me3 is enriched in chromosome arms and often in genes
[Figure: genome-wide distribution tracks; scale bar ~100 kb]
DNA methylation differences following domestication
[Figure: 172 maize-teosinte DMRs across maize, landrace, and teosinte genotypes (hypomethylation vs. hypermethylation); diagram labels: 3,720 rare & common DMRs; 149 teosinte-specific DMRs; 23]
• Some DNA methylation differences between maize
and teosinte
• Few are fixed differences in maize / teosinte
• Small number of maize-teosinte DMRs overlap with
domestication regions or maize-teosinte DE genes
How does heritable information vary among
individuals of a species?
• Expected to occur primarily through SNPs and
small InDels that result in:
– Qualitative variation (different proteins)
– Quantitative variation (different amount of mRNA
or protein)
• But other types of variation exist as well
Genome content summary
• High levels of variation for genome content
– Some association with heterosis
– Potential on-going fractionation
• Implications for genome structure and plant
breeding
– Hybrids have more genes than inbreds
– Extra gene fragments segregate in
populations
– Non-colinearity within a species
– May require pan-genome sequencing
strategies to capture species gene content
Potential transcriptome complementation in hybrids
• Numerous genes expressed in some inbreds but not others
• Some exhibit tissue-specific absence and others are absent in all tissues tested
• Results in higher numbers of genes being expressed in hybrid
• Some are due to differences in expression, others due to genome content differences
[Figure: scatter plot of B73 expression level vs. Mo17 expression level; bar chart of normalized average signal (0 to 1.8; 70-mer and Affy platforms) for AF520911 in B14, B37, B73, B84, Mo17a, Oh43, W22, and Wf9]
Analogous to Fu and Dooner (2002) suggestion about genomic differences
Many additional PA transcriptome patterns documented in Hansey et al., PLoS One 2012
Other differences among parents
• Epigenetic changes
• Allelic preferred translation and transcription
Goff and Zhang; COPB 2013

Variation in crop genomes and heterosis

  • 1.
    Variation in cropgenomes and heterosis Nathan Springer
  • 2.
    Plant breeding reliesupon variation What are the molecular variants that underlie phenotypic diversity? • SNPs • InDels – CNV/PAV • Transposons • Epigenetics • Expression levels How prevalent are these types of variation? How do they behave in breeding?
  • 3.
    Variation: heterosis andtransgressive segregation • Transgressive variation is basis for much of classical breeding efforts • Apparent phenotypic similarity does not indicate similar genetic mechanisms F1 B73 Mo17 Short RILs Intermediate RILs Tall RILs RIL B73 Mo17 F1 F2 Proportionofpopulation
  • 4.
    Outline • Molecular variationin crop genomes – Expression variation – Structural variation – Epigenetic variation • Heterosis B73 Mo17
  • 5.
    0 200 400 600 800 1000 1200 1400 Gene Q GeneS Gene T Expression 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 Gene A Gene B Gene C Expression B14 B37 B73 B84 Mo17 Oh43 W22 Wf9 Transcriptome differences among parents • Comparisons of different individuals of the same species reveal a surprising number of gene expression differences B37 B73 B84 Mo17 Oh43 W22 Wf9 B14A 11.8% 8.5% 6.5% 16.2% 16.6% 12.0% 11.7% B37 11.0% 9.5% 14.2% 12.7% 12.1% 11.8% B73 4.3% 15.6% 14.7% 13.1% 12.0% B84 15.7% 15.6% 13.5% 11.3% Mo17 15.0% 15.0% 14.2% Oh43 14.2% 12.1% W22 11.0% % Differentially expressed genes (from 12,327 expressed genes) • These differentially expressed genes include many examples of genes that are only expressed in some genotypes
  • 6.
    History of structuralvariation studies in maize Kato et al., 2004 PNAS Brunner et al. 2005 Plant Cell Sequencing of multiple haplotypes: Dooner, Rafalski, Morgante, Schnable, Messing
  • 7.
    Gain/loss in Hp301,Tx303 and teosinte Mo17 Hp301 Tx303 Teo • Many gain/loss sequences in Hp301, Tx303 and teosinte • Significant amount of reference genome is missing in each line – Hp301: 24Mbp – Tx303: 29Mbp – Mo17: 25Mbp Gain Loss Kai Ying; ISU
  • 8.
    Gene-centric array basedanalysis of structural variation in diverse maize • 24 diverse maize lines (4 SS / 6 NSS / 5 tropical / 6 PVP / 3 mixed) • 14 teosinte genotypes (4 TIL / 10 wild individuals) Swanson-Wagner et al., Genome Res. 2010 Not Sig n=28,675 PAV / DownCNV n=3,334 UpCNV n=402 Both n=76 Physical position (Mb) #genotypeswithvariant Chr9 Chr10
  • 9.
    Functional implications ofstructural variants • Many – but certainly not all - genes with structural variation are “Unclassified” • “Classical” maize genes (Schnable, Freeling) – 24 / 420 tested (4 CNV and 20 PAV) • Transcription factors (GRASSIUS) – 98 / 1,723 tested (7 CNV and 91 PAV) Structural variation contributes significantly to quantitative trait variation in maize (Chia et al., 2012 Nat Genetics) Examples of CNV affecting important traits in plants Cold tolerance in barley (Knox et al TAG 2010) SCN resistance in soybean (Cook et al. Science 2012) Flowering time in wheat (Diaz et al 2012 PLoS One) Herbicide resistance in weeds (Gaines et al., 2010 PNAS)
  • 10.
    Potential causes ofCNV Potential sources of dispersed duplicates: 1. Transposition 2. On-going fractionation of syntenic regions (Schnable et al., 2011 PNAS) CoGePedia
  • 11.
  • 12.
    Epigenetics - definitions •Epigenetics: Heritable information not solely due to DNA sequence •Mitotic memory: Development; response to environment •Meiotic / trans-generational memory: Silencing of TEs; heritable variation • Chromatin modifications (DNA methylation / histone modifications) are often a mechanism of epigenetic memory but are not necessarily epigenetic
  • 13.
    Epigenetics can contributeto natural variation • Tip of the iceberg or rare form of variation? • Most examples of trans-generational epigenetic regulation are variable within the species – Arabidopsis • SUP, PAI, BAL – Maize • B, P, C, Pl, R • Epigenetically silenced alleles may represent genes on the path to genomic removal via genetic mechanisms Morgan et al., 1999 Cubas et al., 1999 Chandler and Stam 2004
  • 14.
    DMRs in diversemaize genotypes
  • 15.
    DNA methylation diversityin maize Maize Landrace Teosinte Hypomethylation Hypermethylation 1,754 Rare DMRsHierarchical Clustering Hierarchical Clustering1,966 Common DMRs • Rare “loss” of DNA methylation more common than rare “gain” • Diversity of epigenome mirrors genomic diversity
  • 16.
    Functional Consequences ofDMRs NAMinbred5mcNAMinbredRNA-seq ~40M RNA-seq reads (tissue matched) for each NAM parental inbred Compare transcript abundance and DNA methylation variation
  • 17.
    • Identified nearestgenes to each DMR (2,375 genes within 10kb of DMR) and assessed correlation with transcript abundance •277 (of 2,375 tested) had a significant (q<0.01) negative correlation with expression [53 genes exhibited a positively correlation] •No significant GO enrichments; many genes lack syntenic orthologs in other species (TEs or novel genes) • ~0.7% of all genes expression associated with nearby DNA methylation variation Functional Consequences of DMRs Qualitative association Quantitative association
  • 18.
    Outline • Molecular variationin crop genomes – Expression variation – Structural variation – Epigenetic variation • Heterosis
  • 19.
    What is heterosis? Heterosisrefers to the phenomenon in which hybrid offspring exhibit characteristics that lie outside the range of the parents Mo17 B73F1 F1 Mo17 B73
  • 20.
    Two major goalsfor research into mechanisms of heterosis • Goal 1: Improve prediction of ideal hybrid genotypes. Testing hybrid combinations involves major cost/effort and improved prediction could make this process more efficient. • Goal 2. Develop inbred lines, or approaches, that “capture” phenotypic gains of heterosis.
  • 21.
    Observation and quantificationof heterosis • Heterosis is most readily observed and quantified when two pure-breeding homozygous lines are crossed • Heterosis is distinct from segregation and transgressive variation Height (cM) RIL Parent 1 Parent 2 F1 F2 Proportionofpopulation F1 B73 Mo17 Short RILs Intermediate RILs Tall RILs Parent 1 Parent 2 Example F2 Example RIL
  • 22.
    Heterosis use pros/cons •Heterosis can generate high levels of uniform production that can be re-generated each generation and allow for strong selection in parents • Heterosis results in complications in seed production and seed value • Choosing to use heterosis likely limits breeding progress
  • 23.
    Many traits exhibitheterosis • Measurements of different plant traits in over 400 maize hybrids provides evidence for prevalent heterosis for many traits Flint-Garcia SA, Buckler ES, Tiffin P, Ersoz E, Springer NM. 2009. Heterosis is prevalent for multiple traits in diverse maize germplasm. PLoS ONE 4:e7433
  • 24.
    Phenotypic observations aboutheterosis Stuber CW, Lincoln SE, Wolff DW, Helentjaris T, Lander ES. 1992. Identification of genetic factors contributing to heterosis in a hybrid from two elite maize inbred lines using molecular markers. Genetics 132:823--39 There are some examples of heterosis due to the effects of a single locus •Heterosis is generally due to contributions from many loci (QTL mapping studies) Krieger U, Lippman ZB, Zamir D. 2010. The flowering gene SINGLE FLOWER TRUSS drives heterosis for yield in tomato. Nat. Genet. 42:459--63
  • 25.
    B73 Best ParentHeterosis 0 2 4 6 8 10 12 14 H100xB73 B84xB73 F2xB73 B14axB73 Mo17xB73 B77xB73 H99xB73 W64axB73 W22xB73 Wf9xB73 B37xB73 Oh43xB73 A188xB73 Ranking Final Height Stalk Diameter Days to Flow er Number of Chutes 50 Seed Weight Kernel Row s Week 3 Height Biomass Avg * Greenhouse height * Phenotypic observations about heterosis •Heterosis is not quantifiable at the organismal level (trait to trait variation)
  • 26.
    Phenotypic observations aboutheterosis •Different genes likely control heterosis for heterosis for different traits (lack of correlation for heterosis for different traits) PlantheightBPH YieldBPH Yield BPH Cob weight BPH 115 diverse inbreds each crossed to B73 and Mo17 DistB73 DTT PlantYield TSLLEN TSLBCHCNT TSLANG PLTHT UPLFANG LEAFWDT LEAFLEN RPR STLKWDT 10KWeight CobDiameter KernelHeight EarLength CobWeight TotKWt Flint-Garcia SA, Buckler ES, Tiffin P, Ersoz E, Springer NM. 2009. Heterosis is prevalent for multiple traits in diverse maize germplasm. PLoS ONE 4:e7433
  • 27.
    Phenotypic observations aboutheterosis •Heterosis is only partially correlated with genetic diversity PlantheightBPH PlantyieldBPH Genetic diversity (from Hamblin 2008)
  • 28.
    Attempting to understandheterosis • Genetic basis; dominance, over-dominance, etc • Molecular basis; dosage, allele-preference, etc • Many possible answers each with some evidence – little evidence for a common answer
  • 29.
    Dominance and over-dominance •The dominance theory of heterosis posits that inbred lines have mildly deleterious alleles and heterosis is the result of complementation of these defects • The over-dominance theory of heterosis suggests that heterozygosity per se results in heterosis • Associated concepts • Pseudo-over-dominance • Epistasis • Birchler and others have encouraged moving past dominance / over-dominance debate to think in more quantitative or systems approaches
  • 30.
    Dominance • Evidence forsubstantial genetic load (deleterious alleles) from inbreeding depression and from genomic analyses • Dominance contribution to heterosis must be through MILD deleterious alleles and likely to be highly multi-genic. • Also consider capture of “beneficial” alleles - adaptedness Parent 1 Parent 2 Hybrid F1
  • 31.
    Arguments against pure-dominance Parent1 Parent 2 + + + - - + + + - - - - - - - + + + - - - + + + Hybrid F1 – ideal recombinant chromosomes may require too many cross-overs • Over-dominant action for some loci • Response: Potential pseudo-overdominance • Lack of ability to “capture” positive alleles and purge deleterious alleles • Response: many genes involved, each with small effects which may limit ability to purge deleterious alleles Hybrid F1 + + + - - - - - - + + +
  • 32.
    The case forover-dominance • Loci with over-dominant contribution to phenotype have been identified (SFT, Erecta, QTL studies) • Observations of heterosis and inbreeding depression in polyploids suggest mechanisms beyond dominance • Lack of progress in “removing” heterosis and limited expectations for genetic load
  • 33.
    Molecular basis of heterosis
    [Figure: Mo17, B73 and F1]
    A. What molecular variation exists between parents?
    B. What is unique about the hybrid?
    • No heterosis without variation among parents: understanding variation and how it combines is important for heterosis
    • What tissue to survey?
  • 34.
    What is unique about the hybrid? Transcriptional levels?
    [Figure: expression level (0–6) of Parent 1, Parent 2 and potential hybrid expression levels A–E for differentially expressed genes (mid-parent, high parent-like, above high parent, below low parent, low parent-like)]
    • Since many traits have values outside the parental range, it was expected that many genes would also be expressed outside the parental range
    • Most genes are expressed at levels within the parental range
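The hybrid expression classes named in the figure can be sketched as a simple rule. This is only an illustration, assuming expression values on a common scale and a hypothetical tolerance `tol` for calling a hybrid value "like" a reference level; published analyses use statistical tests rather than a fixed tolerance.

```python
def classify_hybrid_expression(p1, p2, f1, tol=0.1):
    """Place a hybrid expression value relative to its two parents.

    p1, p2: parental expression levels (assumed to differ).
    f1: hybrid expression level.
    tol: hypothetical tolerance for matching a reference level.
    """
    low, high = min(p1, p2), max(p1, p2)
    mid = (low + high) / 2.0
    if f1 > high + tol:
        return "above high-parent"
    if f1 < low - tol:
        return "below low-parent"
    if abs(f1 - high) <= tol:
        return "high-parent-like"
    if abs(f1 - low) <= tol:
        return "low-parent-like"
    if abs(f1 - mid) <= tol:
        return "mid-parent"
    # Inside the parental range but matching neither parent nor the mid-point.
    return "within parental range (non-additive)"


# Hypothetical values: parents at 2 and 4, hybrids at several levels.
for f1 in (1.0, 2.0, 3.0, 3.5, 4.0, 5.0):
    print(f1, classify_hybrid_expression(2.0, 4.0, f1))
```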
  • 35.
    What is unique about the hybrid? How might mid-parent expression levels be beneficial?
    • Many mid-parent (additive) expression patterns
    • Potential “Goldilocks” effect of gene expression on phenotype
    • Genetic action of the gene expression phenotype does not equal genetic action of the phenotype
    [Figure: expression level of Gene A, Gene B and Gene C in Inbred 1, Inbred 2 and Hybrid relative to an optimal expression range, with increasingly detrimental over-expression above it and increasingly detrimental under-expression below it]
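A toy illustration of the “Goldilocks” idea: if performance peaks at an intermediate expression level, an additive (mid-parent) expression level in the hybrid can give a phenotype better than both parents. The quadratic penalty, the optimum and the expression values below are all hypothetical assumptions chosen for illustration.

```python
def performance(expression, optimum=3.0):
    """Hypothetical phenotype score: 0 at the optimum, more negative further away."""
    return -((expression - optimum) ** 2)


# Hypothetical expression levels: one inbred under-expresses the gene,
# the other over-expresses it.
inbred1 = 1.0
inbred2 = 4.0
hybrid = (inbred1 + inbred2) / 2.0  # additive (mid-parent) expression

scores = {
    "Inbred 1": performance(inbred1),
    "Inbred 2": performance(inbred2),
    "Hybrid": performance(hybrid),
}
for name, score in scores.items():
    print(f"{name}: score = {score:.2f}")
```

In this sketch expression is strictly additive, yet the hybrid score is higher than either inbred score, so the gene action seen at the expression level differs from the gene action seen at the phenotype level.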
  • 36.
    What is unique about the hybrid? Unique genome / transcriptome content
    • Hybrids encode more genes and express more genes than either parent
    • Basically a dominance explanation
    • How might these genes contribute to heterosis?
    [Figure: gene content of Parent 1 and Parent 2]
    • Most genes present / expressed in both parents
    • Small number of genes unique to each parent
    • All genes present / expressed in hybrid
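The complementation argument is a statement about set union: the hybrid carries every gene found in either parent. A minimal sketch with hypothetical gene identifiers (real presence/absence calls would come from sequence or expression data):

```python
# Hypothetical gene content of two inbred parents.
parent1_genes = {"geneA", "geneB", "geneC", "geneD"}
parent2_genes = {"geneA", "geneB", "geneC", "geneE"}

shared = parent1_genes & parent2_genes    # present in both parents
only_p1 = parent1_genes - parent2_genes   # unique to parent 1
only_p2 = parent2_genes - parent1_genes   # unique to parent 2
hybrid_genes = parent1_genes | parent2_genes  # the F1 inherits one genome copy from each parent

print("shared:", sorted(shared))
print("unique to parent 1:", sorted(only_p1))
print("unique to parent 2:", sorted(only_p2))
print("hybrid gene count:", len(hybrid_genes))
```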
  • 37.
    Improved interactions may lead to improved transition precision
    • Genome content variation often affects members of gene families and therefore may lead to subtle perturbations of proper interactions
    • Birchler and Veitia have proposed the dosage balance hypothesis
    • Proposal: having correct interactions in complexes may be critical to achieving proper developmental transitions and stress responses
    • As co-evolved gene family members are re-united in hybrids, they are more efficient at precise transitions in development or in response to stress
    • Important to remember that selection has been strong in moving from teosinte to maize and in filtering out major deleterious alleles
    [Figure: gene family members A1, A2, B1 and B2 in Inbred 1, Inbred 2 and Hybrid]
  • 38.
    The loss of genes (from genome or transcriptome) may be tolerated due to partial redundancy of paralogs or orthologs
    • Allows survival of inbred lines lacking genes, but these lines may “break down” and provide sub-optimal performance, especially during transitions and stress
    [Photo captions: After bear damage; “repaired” for trip home using duct tape]
  • 39.
    Heterosis Summary
    • Heterosis varies among traits and tissues
    • Search for unifying principles among traits and species may not be successful
    • Distinct mechanisms / causes of molecular variation (genome, transcriptome, epigenome) and of action to produce phenotypic heterosis
    • Selective pressures and genetic load (history) matter
    • Modern-day lines represent significant selection upon natural genetic materials
    • Limited utility of heterotic groups
  • 40.
    Compare / contrast maize-switchgrass heterosis
    • Both are allopolyploid outcrossers with large effective population size
    – Likely abundant genetic load and on-going fractionation
    • Breeding style limitations
    • Differences in “domesticated vs wild” are distinct in the two species
  • 41.
    • Peter Hermanson
    • Steve Eichten
    • Amanda Waters
    • Qing Li
    • Ruth Swanson-Wagner
    • Matthew Vaughn (TACC)
    • Jawon Song (TACC)
    • Irina Makarevitch (Hamline)
    • Damon Lisch (Berkeley)
    Iowa State U: Patrick Schnable, Eddy Yeh
    NimbleGen: Jeffrey Jeddeloh
    U Georgia: Kelly Dawe, Xiaoyu Zhang, Jonathan Gent, Nathaniel Ellis
    U of Minnesota: Bob Stupar, Chad Myers, Roman Briskine, Rob Schaefer, Peter Tiffin, Lin Li, Gary Muehlbauer
    U of Wisconsin: Shawn Kaeppler, Scott Stelpflug
    NSF DBI# 0922095
    NSF IOS# 1237931
  • 42.
    Modeling of heterosis phenotypes
    • Use parental phenotype, genetic distance between parents and environment to model hybrid performance
    [Figure: predicted vs. actual scatter plots for A. cob diameter, B. cob weight, C. plant height, D. total kernel weight]
    R2 values for the four panels, in the order they appear on the slide:
    Population 1    Population 2 – B73 OC    Population 2 – Mo17 OC
    0.70            0.73                     0.70
    0.91            0.69                     0.56
    0.76            0.53                     0.54
    0.74            0.65                     0.55
    “Adaptedness” concept from Troyer 2006
    Flint-Garcia et al. PLoS One 2009
  • 43.
    Many plant species exhibit heterosis
    • Heterosis is also prevalent in many other plant species, although the magnitude and prevalence of heterosis vary
    • Note: actual genetic architecture of heterosis may vary depending on past selection pressures and natural history
    [Figure: Rice and Arabidopsis; credit on slide: Qifa Zhang]
    Groszmann M, Greaves IK, Albertyn ZI, Scofield GN, Peacock WJ, Dennis ES. 2011. Changes in 24-nt siRNA levels in Arabidopsis hybrids suggest an epigenetic contribution to hybrid vigor. Proc. Natl. Acad. Sci. USA 108:2617-22
  • 44.
    Unique expression in hybrids?
    • Limited evidence for unique expression levels in hybrids
    [Figure: expression level (0–6) of Parent 1, Parent 2 and potential hybrid expression levels A–E, marking the high-parent, mid-parent and low-parent levels and the non-additive classes: non-additive between parents, high-parent, low-parent, above high-parent, below low-parent]
                           B84xB73      B37xB73       Oh43xB73      Oh43xMo17     Mo17xB73      B73xMo17
    # DE genes             290          655           1071          885           1064          1055
    # Non-additive         88 (30.3%)   159 (24.3%)   296 (27.6%)   233 (26.3%)   247 (23.2%)   266 (25.2%)
    # NA between parents   83           126           232           184           201           209
    # HP or LP             5            32            58            47            44            55
    # AHP or BLP           0            3             6             2             2             2
    (NA = non-additive; HP / LP = high-parent / low-parent; AHP / BLP = above high-parent / below low-parent)
    Similar results in Guo et al., 2006; Stupar and Springer 2006; Swanson-Wagner 2006
    Contrasting results in Auger et al., 2005; Meyer et al., 2007; Uzarowska et al., 2007
  • 45.
    Does epigenotype have information beyond genotype for predicting phenotype?
    • Epigenotype is more costly to determine than genotype
    – Is there novel information in epigenotype for predicting phenotype?
    • Remember: epigenotype will predominantly act through alteration of expression levels
    [Diagram relating Genotype (SNPs / TEs), Epigenotype and Environment to Phenotype through gene product variation: quantitative variation (altered levels of gene product) and qualitative variation (altered quality of gene product); two links are marked with “?”]
  • 46.
    Distribution of structural polymorphism in B73/Mo17
    Springer et al., PLoS Genetics 2009
    Belo et al., TAG 2010
  • 47.
    Both shared and unique structural variants
    [Figure: Mo17, Hp301 and Tx303; 21,000 probes in a 20 Mb region of chromosome 4. Annotated variants: missing in all 3 genotypes; missing in Mo17 and Tx303; missing in Tx303 only; copy gain in Hp301 and Tx303 (two instances)]
  • 48.
    Novel Hybridization Patterns
    • Apparent “de novo” CNV in RILs
  • 49.
    Segregation of Non-Allelic Gene Copies Generates PAVs/CNVs and Novel Phenotypes
    • Changes in gene complement among RILs
    • Strong statistical support for association between gene loss and yield component traits in IBM RILs
    Liu et al., Plant J. 2012
  • 50.
    Frequent unlinked Mo17 copy gains
    • 4,994 probes detect Mo17-specific sequence duplications
    • Could be local or unlinked copy gains in Mo17
    – 60% unlinked (trans)
    – 10% linked (cis)
    – 30% unassigned
    • Most Mo17 copy number gains occur at unlinked genomic positions
    [Figure: scatter plots by NIL type and genotype at locus (B73, Mo17, B, M) for unlinked and linked duplications; panels AC186656, AC194260, AC191373 and AC198648]
    Eichten et al., Plant Phys. 2011
  • 51.
    Non-Mendelian gene expression variation in maize
    • RNAseq analysis of expression in ~100 RILs
    • Most genes have expected patterns (normal or bi-modal distribution)
    • ~150 examples of paramutation-like patterns
    • ~200 genes with unexpected patterns of presence-absence for transcripts
    Lin Li, Gary Muehlbauer: Li et al., PLoS Genetics 2013
  • 52.
    [Figure: proportion of genes in each d/a bin (d/a ratio from < -2.0 to > 2.0, with low-parent, mid-parent and high-parent levels marked) for B84xB73, B37xB73, Oh43xB73, Oh43xMo17, Mo17xB73 and B73xMo17; inset: expression level (0–6) of Parent 1, Parent 2 and hybrid classes A–E]
    • The majority of genes exhibit hybrid expression levels within the parental range (94%)
    • Similar distributions of additive and non-additive expression for different hybrids
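The d/a ratio on the figure axis can be sketched as follows, assuming the common convention in which d is the deviation of the hybrid from the mid-parent value and a is half the difference between the high and low parent. Under that assumption d/a is 0 at the mid-parent level, +1 at the high-parent level and -1 at the low-parent level; the function name and example values are hypothetical.

```python
def d_over_a(p1, p2, f1):
    """Return the d/a ratio for one gene.

    d = hybrid value minus mid-parent value
    a = half the difference between the high and low parent
    """
    low, high = min(p1, p2), max(p1, p2)
    a = (high - low) / 2.0
    if a == 0:
        raise ValueError("parents must differ for d/a to be defined")
    d = f1 - (high + low) / 2.0
    return d / a


# Hypothetical parental levels of 2 and 4 with several hybrid levels.
for f1 in (1.0, 2.0, 3.0, 4.0, 5.0):
    print(f1, d_over_a(2.0, 4.0, f1))
```

With this convention, hybrid values within the parental range are those with d/a between -1 and +1.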
  • 53.
    Heterosis and genome content variation
    • Content variation may be a potential contributor to heterosis
    – Hybrids contain more genes and express more genes than either parent
    [Figure labels: NSS, PVP, SS]
  • 54.
    How do B73 and Mo17 genomes vary?
    • SNPs (coding and non-coding)
    • InDels (including transposons)
    • Copy number variation (and PAV)
    • Epigenetic information
  • 55.
    What happens in hybrids?
    • Majority of B73 vs Mo17 DMRs show mid-parent methylation levels in hybrid F1s
    • 5-10 DMRs show high-parent methylation state
    [Figure: methylation in B73, Mo17, BxM and MxB, on a scale from more Mo17-like to more B73-like]
  • 56.
    Genome-wide Assessment of DNA methylation
    • 1.1 million experimental probes placed every 200 bp
    – Single-copy
    – Corrected for CGH effects
    • meDIP-chip (5mC) and ChIP-chip for H3K9me2 and H3K27me3
    – Antibody pulldown of methylated DNA (not context-specific) contrasted against control gDNA
    • Assess relative methylation enrichment across low-copy space of maize genome
    [Figure: analysis of B73 methylation and Mo17 methylation across genes and repeats]
    Matt Vaughn, TACC
  • 57.
    DNA methylation variation is prevalent between genotypes, but not between tissues
    DNA methylation: Eichten et al., Plant Genome 2012
    H3K27me3: Makarevitch et al., Plant Cell 2013
  • 58.
    Maize epigenomic profiling: genome-wide distribution
    • 5mC and H3K9me2 largely overlapping and enriched in pericentromeric regions
    • H3K9me2 rarely found within genes
    • H3K27me3 enriched in chromosome arms and often in genes
    [Figure scale: ~100 kb]
  • 59.
    DNA methylation differences following domestication
    [Figure: Maize, Landrace and Teosinte; hypomethylation and hypermethylation; labels and counts in slide order: 172, Maize-teosinte DMRs, 3720, Rare & Common DMRs, 149, Teosinte-specific DMRs, 23]
    • Some DNA methylation differences between maize and teosinte
    • Few are fixed differences in maize / teosinte
    • Small number of maize-teosinte DMRs overlap with domestication regions or maize-teosinte DE genes
  • 60.
    How does heritable information vary among individuals of a species?
    • Expected to occur primarily through SNPs and small InDels that result in:
    – Qualitative variation (different proteins)
    – Quantitative variation (different amounts of mRNA or protein)
    • But other types of variation exist as well
  • 61.
    Genome content summary
    • High levels of variation for genome content
    – Some association with heterosis
    – Potential on-going fractionation
    • Implications for genome structure and plant breeding
    – Hybrids have more genes than inbreds
    – Extra gene fragments segregate in populations
    – Non-colinearity within a species
    – May require pan-genome sequencing strategies to capture species gene content
  • 62.
    Potential transcriptome complementation in hybrids
    • Numerous genes expressed in some inbreds but not others
    • Some exhibit tissue-specific absence and others are absent in all tissues tested
    • Results in higher numbers of genes being expressed in the hybrid
    • Some are due to differences in expression, others due to genome content differences
    [Figure: B73 expression level vs. Mo17 expression level; normalized average signal (70mer and Affy) for AF520911 in B14, B37, B73, B84, Mo17a, Oh43, W22 and Wf9]
    Analogous to Fu and Dooner (2002) suggestion about genomic differences
    Many additional PA transcriptome patterns documented in Hansey et al., PLoS One 2012
  • 63.
    Other differences among parents
    • Epigenetic changes
    • Allelic preferred translation and transcription
    Goff and Zhang; COPB 2013