6. Domestication, Polyploidy and
Genomics of Crops
• Most species domesticated 10,000 years ago
(cereals, legumes/pulses, brassicas, fruits,
cows/sheep/pigs, silkworm/bees)
• Few species more recently (rabbits, fish, trees,
biofuel crops)
• A few dropped out of production
• First steps: productive, reproduce easily, disease-
free, edible/tasty, harvestable …
• With critical technology of people: not obvious
Heslop-Harrison & Schwarzacher Domestication genomics in Arie Altman
www.tinyurl.com/domest and review of rabbits www.tinyurl.com/rabdom
13. Domestication, polyploidy and
genomics of crops and weeds
• CROPS: where one species controls the growth and
reproduction of another
• WEEDS
• Many animals collect food to see them through the
winter, build nests in anticipation of reproduction
• A few plants kill off all others nearby
• Ants (Formicidae) farm plants, animals and fungi
• Humans only for 20% of their history – and still
exploiting environment unsustainably!
14. DNA sequence components of the nuclear genome
After Biscotti et al. Chromosome Research 2015
[Figure: classification diagram of the nuclear genome; labels listed below]
• Genes, regulatory and non-coding low-copy sequences
• Repetitive DNA sequences
– Dispersed repeats: transposable elements (autonomous/non-autonomous; DNA transposons, retrotransposons); dispersed repeats that we don’t know about, except that each is a significant proportion of the genome
– Tandem repeats: satellite sequences; blocks of tandem repeats at discrete chromosomal loci; simple sequence repeats or microsatellites
– Repeated genes: 45S and 5S rRNA genes; other genes
– Structural components of chromosomes: centromeric repeats; telomeric repeats; subtelomeric repeats
• Organelle sequences from chloroplasts or mitochondria
• Sequences from viruses
• Transgenes introduced with molecular biology methods
16. Domestication, polyploidy and
genomics of crops and weeds
• Genome size
• Critical parameter for genome studies – first
sequenced genomes chosen to be small ...
Large genomes only tackled 25 years on
• But is it critical for species …
• No: you can’t ‘look’ at a species and make any
suggestion about its genome size …
18. Domestication, polyploidy and
genomics of crops and weeds
• Polyploidy is also a critical part of genomes …
• No: you can’t ‘look’ at a species and make any
suggestion about its ploidy …
21. Domestication, polyploidy and
genomics of crops and weeds
• Ancient polyploidy (detected by sequencing)
• Modern polyploidy (detected by cytogenetics)
• Advantages: more control, genes free to
mutate, ?larger cells/organs
• Disadvantages: meiosis challenging, buffering
of changes, more DNA to replicate
22. Repetitive DNA in dandelion
3 microspecies 22, 12 & 12 Gb
2n=3x=24 apomictic
Rubar Salih & Lubos Majesky
23. k-mer analysis
For a 16-mer length, there are about 2 billion canonical
16-mers (4^16/2), and the average 16-mer occurs
10 times in the 22 Gb of sequence data.
The overall distribution of these informs us about
how repetitive the genome is, and the frequency
of different repetitive elements.
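The arithmetic on this slide, and the idea of counting canonical k-mers, can be sketched in a few lines of Python. This is a minimal illustration only, not the pipeline used for the dandelion data. The function names are hypothetical. The 4^k/2 figure is the slide's approximation, which ignores the small number of reverse-complement palindromes, and the mean occurrence treats every sequenced base as the start of a 16-mer.

```python
from collections import Counter

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Return the alphabetically smaller of a k-mer and its reverse complement."""
    rc = reverse_complement(kmer)
    return kmer if kmer <= rc else rc


def count_canonical_kmers(reads, k=16):
    """Count canonical k-mers over an iterable of read strings."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Skip windows containing N or other ambiguity codes.
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def approx_canonical_space(k):
    """Approximate number of distinct canonical k-mers, 4^k / 2."""
    return 4 ** k // 2


def mean_occurrence(total_bases, k):
    """Rough mean count per canonical k-mer for a given amount of sequence."""
    return total_bases / approx_canonical_space(k)


if __name__ == "__main__":
    print(approx_canonical_space(16))   # size of the canonical 16-mer space
    print(mean_occurrence(22e9, 16))    # rough mean occurrence in 22 Gb
```

For real read sets a dedicated k-mer counter would normally be used; the dictionary approach here is only practical for small inputs.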
24. k-mer analysis
The most abundant 16-mers in the 150 bp genome reads:
the 7 bp telomere sequence (TTTAGGG/CCCTAAA), added to the ends of each
chromosome,
occurs a total of 7M times, much higher than the
expectation of 140.
[Figure labels: From 128-mer; GT10kb; Coverage Depth = 7]
[Figure: chromosome image AF(11)_S983_009. Blue: DAPI fluorescence. Green: telomere primer HC_89bp. Red: 5S rDNA]
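As a hedged sketch of how the telomere count above could be obtained, the following Python counts exact occurrences of the motif and its reverse complement (TTTAGGG/CCCTAAA) across a set of reads. The function names and the toy reads are illustrative assumptions, not the analysis behind the 7M figure. Only exact matches in one phase of the repeat are counted, so degenerate telomere copies would be missed.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_overlapping(text, motif):
    """Count occurrences of motif in text, allowing overlaps."""
    count = 0
    start = text.find(motif)
    while start != -1:
        count += 1
        start = text.find(motif, start + 1)
    return count


def count_motif_both_strands(reads, motif="TTTAGGG"):
    """Total exact matches of a motif and its reverse complement in reads."""
    motif = motif.upper()
    rc = reverse_complement(motif)
    total = 0
    for read in reads:
        read = read.upper()
        total += count_overlapping(read, motif)
        # Avoid double counting if the motif is its own reverse complement.
        if rc != motif:
            total += count_overlapping(read, rc)
    return total


if __name__ == "__main__":
    # Toy reads (hypothetical): two telomere-like reads and one unrelated read.
    toy_reads = ["TTTAGGGTTTAGGG", "CCCTAAACCCTAAA", "ACGTACGT"]
    print(count_motif_both_strands(toy_reads))
```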
25. In asexual dandelion microspecies
Rubar M. Salih
Genome evolution and biodiversity
• Actively evolving repetitive sequences in
the genome
• Differences seen between microspecies in
repeats
• Structural and mobile components of
genome identified
• Chloroplast sequence gives phylogeny and robust
markers for diversity (PLoS One in press Dec 2016)
26. So questions are
1) where is this sequence located in the genome? and
2) are there any differences between the microspecies in its abundance?
We can see this is a Ty1-Copia element
because the retroelement's coding
domains are in the order
RNaseH
Reverse Transcriptase
Integrase
LTRs divergent
More (solo LTRs)
RepeatExplorer: Graph-based clustering of related sequences, program/approach by
Novák P, Neumann P, Pech J, Steinhaisl J, Macas J. RepeatExplorer: a Galaxy-based web server
for genome-wide characterization of eukaryotic repetitive elements from next-generation
sequence reads. Bioinformatics. 2013 Mar 15;29(6):792-3.
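The domain-order reasoning used on this slide can be sketched as a small lookup. In Ty1-Copia elements the integrase lies upstream of reverse transcriptase and RNaseH, whereas in Ty3-Gypsy elements it lies downstream. The slide lists the domains as RNaseH, reverse transcriptase, integrase, which matches the Ty1-Copia order if the element is read on the opposite strand; that orientation is an assumption here, since the slide does not state it. The function and table names are hypothetical, and this is not part of RepeatExplorer.

```python
# Domain orders along the coding strand (5' to 3') for the two LTR superfamilies.
_SUPERFAMILY_ORDERS = {
    ("INT", "RT", "RH"): "Ty1-Copia",
    ("RT", "RH", "INT"): "Ty3-Gypsy",
}

# Accepted spellings for each domain label (illustrative, not exhaustive).
_ALIASES = {
    "integrase": "INT", "int": "INT",
    "reverse transcriptase": "RT", "rt": "RT",
    "rnaseh": "RH", "rnase h": "RH", "rh": "RH",
}


def classify_ltr_element(domains):
    """Classify an LTR retrotransposon from its ordered domain labels.

    Returns (superfamily, strand), where strand is "+" if the order matches
    as given and "-" if it matches after reversing the list.
    Unrecognised labels are ignored.
    """
    order = tuple(
        _ALIASES[d.strip().lower()]
        for d in domains
        if d.strip().lower() in _ALIASES
    )
    if order in _SUPERFAMILY_ORDERS:
        return _SUPERFAMILY_ORDERS[order], "+"
    reversed_order = order[::-1]
    if reversed_order in _SUPERFAMILY_ORDERS:
        return _SUPERFAMILY_ORDERS[reversed_order], "-"
    return "unclassified", None


if __name__ == "__main__":
    # Domain order as listed on the slide.
    print(classify_ltr_element(["RNaseH", "Reverse Transcriptase", "Integrase"]))
```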
30. Dispersed on chromosomes in all
microspecies: but differences
[Figure: chromosome images AA1_AK07_171D_45S B_010; AC1_O996_171 D_KsHC B_003; AC11_S933_171 D_KsHC B_004]
Low complexity: 0.075%
Assembled to genome of: A: ; S: ; O:
31. Sequence CL80 double-dots on 14 chromosomes (not 16 -
not 2 genomes worth) - is it a centromeric repeat?
LTR.Copia (2 hits, 0.103%)
Low complexity (5 hits, 0.0895%)
Genome proportion = 0.2480%
Assembled to genome: A = ; S = ; O =
[Figure: chromosome images AE (3)_A978_A dig_pta794_001; AE (4)_O976_A dig_pta794_002; AE (2)_S3_A dig_Pta794 bio_002]
32. Retroelements and tandem repeats in Petunia
[Figure: genome proportion (%) plotted against cluster number (1 to 351). Cluster classes: Telomere; Tandem Repeat; rRNA; Simple Repeat; LTR Caulimovirus; LINES; DNA Transposons; LTR Copia; LTR Gypsy; LTR Degenerate; Mixed Repeat; Low Complexity; Unknown or Chloroplast]
Supplementary Ms 2. Bombarely et al.
Petunia genome sequence
Nature Plants 2: article number 16074.
33. DNA sequence components of the nuclear genome
After Biscotti et al. Chromosome Research 2015
34. Japanese knotweed – invasive in watercourses in Europe
Fallopia (and Fallopia x Muehlenbeckia hybrids)
35. RepeatExplorer analysis of raw reads of F. japonica and M. australis. Top clusters
represented 50% of the reads in F. japonica and 39.5% of reads in M. australis.
F. japonica has a higher proportion of dispersed repeats than M. australis.
38. Fallopia x Muehlenbeckia hybrid : Differential probes identified by k-mer and RepeatExplorer
Green is Fallopia-specific; Red is equal in both genomes
Desjardins, Bailey, Wang, Schwarzacher, Heslop-Harrison. 2017 in prep
41. Panicum sensu stricto c. 100 species; x=9
Evolution of Panicum miliaceum Proso millet
P. miliaceum
2n=4x=36
P. capillare
2n=2x=18
P. repens
2n=4x=36
also 2n=18 to 54
P. sumatrense
2n=2x=18 or 4x=36
Global North-temperate
Low genetic diversity
Weedy forms
P. virgatum
2n=4x=36 or 2x=18
? ? ? ? ??
• Hunt, HH et al. 2014. Reticulate evolution in Panicum (Poaceae): the
origin of tetraploid broomcorn millet, P. miliaceum. J Exp Bot.
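The chromosome formulas on this slide (for example 2n=4x=36 with x=9) follow a fixed pattern: the chromosome number divided by the ploidy level gives the base number x. A minimal Python sketch of that check is below. The function name and the returned dictionary layout are hypothetical, and only the simple "2n=Nx=M" form is handled, not ranges such as "2n=18 to 54".

```python
import re

# Matches formulas of the form "2n=4x=36", allowing optional spaces.
_FORMULA = re.compile(r"2n\s*=\s*(\d+)\s*x\s*=\s*(\d+)")


def parse_chromosome_formula(formula):
    """Parse '2n=Nx=M' into ploidy level, chromosome number and base number x."""
    match = _FORMULA.fullmatch(formula.strip())
    if match is None:
        raise ValueError(f"unrecognised chromosome formula: {formula!r}")
    ploidy = int(match.group(1))
    chromosomes = int(match.group(2))
    if ploidy == 0 or chromosomes % ploidy != 0:
        raise ValueError(f"chromosome number is not a multiple of ploidy: {formula!r}")
    return {
        "ploidy": ploidy,
        "chromosomes": chromosomes,
        "base_number": chromosomes // ploidy,
    }


if __name__ == "__main__":
    # Formulas taken from the slides: P. miliaceum, P. capillare, dandelion, wheat.
    for formula in ["2n=4x=36", "2n=2x=18", "2n=3x=24", "2n=6x=42"]:
        print(formula, parse_chromosome_formula(formula))
```

Both Panicum formulas give the same base number, consistent with the x=9 stated for Panicum sensu stricto.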
44. Nicotiana hybrid
4x + 4x cell fusions
Each of 4 chromosome sets has distinctive repetitive DNA when probed with genomic DNA
Patel et al Ann Bot 2011
Cell fusion hybrid of two 4x tetraploid tobacco species
Four genomes differentially labelled
Patel, Badakshi, HH, Davey et al 2011 Annals Botany
46. Centromere dynamics and timing of chromosome synapsis (6x wheat)
Adel Sepsi, Higgins, Heslop‐Harrison, Schwarzacher. CENH3 morphogenesis reveals dynamic centromere
associations during synaptonemal complex formation and the progression through male meiosis in
hexaploid wheat. Plant Journal. 2016 Sep 1.
Sepsi et al. Plant Journal 2016
47. (a) Centromere, telomere and chromosome arm dynamics in meiotic prophase I.
[Figure stages: Interphase; Leptotene; Early Zygotene; Zygotene; Late Zygotene. Labels: telomere bouquet; homologue chromosome pairs; centromeres; ZYP1]
(b) Centromere depolarisation and SC formation during Zygotene
[Figure steps: 1 Subtelomeric synapsis; 2 Interstitial alignment; 3 Interstitial elongation]
Sepsi et al. Plant Journal 2016
48. How do genomes evolve?
–Gene mutation very rarely
• (human: 10^-8/site/generation)
–Chromosome evolution
–Polyploidy and genome duplication
–Repetitive sequences: mobility & copy number
• (10^-4/generation in microsatellites)
–Recombination
–Epigenetic aspects: centromeres & expression
49. Repetitive sequences
• Many families and various types
• Abundant
• Rapidly evolving … or conserved
– Copy number and sequence
• May be near-genome specific, even
chromosome-specific
• Various genome/chromosomal locations
50. DNA sequence components of the nuclear genome
After Biscotti et al. Chromosome Research 2015
51. Dispersed repeats that we don’t know about, except that each is a significant proportion of the genome
[Figure: diagram of DNA sequence components of the nuclear genome, annotated with the questions below]
Real? Passively Amplified DNA sequences: PADs
Or: Transposable element derivatives (LTRs etc)?
52. Domestication, polyploidy and
genomics of crops (and weeds)
Pat Heslop-Harrison &
Trude Schwarzacher
and collaborators
Leicester, UK
phh@molcyt.com
www.molcyt.com
www.molcyt.org
Twitter Pathh1 .
53. From Chromosome to Nucleus
Pat Heslop-Harrison phh4@le.ac.uk www.molcyt.com
54. • About half of all higher plant species are recognizable as polyploids, a major
feature of genome architecture where there are more than two sets of
chromosomes. Advantages include multiple copies of each gene with different
regulation, so essentially fixing heterosis; larger cell size; and the opportunity for
mutation without lethality. Disadvantages include twice as much DNA to replicate;
incorrect control of multiple gene copies in interacting genomes; chromosome
instability at mitosis; and the challenges of ensuring chromosome pairing and
regular meiotic segregation in seed crops, in breeding hybrid materials, or else
combining sterility with parthenocarpy in fruit crops. Given these substantial
contrasts, it is perhaps surprising that the top three cereal crops are wheat (a
modern hexaploid 2n=6x=42), rice (diploid, 2n=2x=24), and maize
(palaeotetraploid, 2n=2x or 4x=20), suggesting neither advantages nor
disadvantages are overwhelming. I will consider the balance of positives and
negatives over evolutionary and crop-breeding timescales. In the second part of
my talk, I will consider how knowledge of polyploid behaviour and knowledge of
ancestors can be exploited, discussing our work with polyploids, both well-known
(wheat, Brassica, banana) and less known (proso millet, ornamentals and saffron
crocus). Further details and references will be at www.molcyt.com. Email
phh(a)molcyt.com