Ewan Mollison WP4 April 2018
1. Phyto-Threats
Work package 4
Predicting risk via analysis of Phytophthora genome
evolution
Ewan Mollison, Paul Sharp, Leighton Pritchard, David Cooke,
Sarah Green
2. Introduction
• What can drive evolution of a pathogen?
• “Intrinsic” factors: duplication, rearrangement, insertion, deletion of DNA
regions
• “Extrinsic” factors: hybridisation between species, transfer of genes between
species
• Allow pathogens to
• Adapt to evolving host defences
• Expand host range
• Increase virulence
3. Aims
• Compare genes from available sequenced Phytophthora genomes
• Identify a core set of Phytophthora genes, common to all species
• Identify species-specific genes or variation
• Sequence genomes from three less damaging species, which are
closely related to highly damaging species
• Can we use this to help understand key genes involved in virulence?
• Study target genes / gene families known to be important for
virulence
• How do variations in these influence the pathogen, e.g. host range, damage
caused, etc.?
4. Overall sequencing/assembly strategy for P. austrocedri
• Purified DNA prepared for sequencing
• Generate low-coverage PacBio long reads (~18x genome coverage) and assemble genome: 113.76 Mbp across 2,226 fragments (contigs)
• Generate high-coverage Illumina short read pairs: ~132x
• De-duplicate read pairs to remove redundancy: no change, still ~132x
• Additional stringent quality control at 99.9% confidence: coverage reduced to ~92x
• Error-correct PacBio contigs with cleaned Illumina reads: 113.83 Mbp across 2,226 contigs
• Use pairing of short reads to link PacBio contigs together into longer scaffolds: 114.39 Mbp across 1,977 contigs
• Identify repetitive elements with RepeatModeler & RepeatMasker: 48.96% repetitive DNA content
• Predict genes using software “trained” with sample gene structures: 31,326 predicted genes
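The size and coverage figures quoted for each assembly step are simple arithmetic: the number of sequences and their summed length, and total read bases divided by genome size. A minimal sketch of that arithmetic follows. It assumes a plain FASTA input; the function names and the toy sequences are invented for illustration and are not the tools used in the project.

```python
from io import StringIO


def contig_lengths(fasta_handle):
    """Return the length of every sequence in a FASTA file-like object."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    for line in fasta_handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def fold_coverage(total_read_bases, genome_size_bp):
    """Average number of times each genome position is covered by reads."""
    return total_read_bases / genome_size_bp


# Toy FASTA text, invented for illustration only.
toy_fasta = StringIO(">contig_1\nACGTAC\nGT\n>contig_2\nGGGCC\n")
lengths = contig_lengths(toy_fasta)
summary = {"contigs": len(lengths), "total_bp": sum(lengths)}

# Fraction of Illumina coverage kept after quality control
# (~132x reduced to ~92x, as quoted in the strategy above).
retained = 92 / 132
```

On these numbers roughly 70% of the short-read coverage survives the stringent quality control step. Any real coverage estimate also depends on the genome size assumed.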
5. Available genome assemblies
• Genome assemblies for 26 Phytophthora species now available
• Varying states of “finished-ness”
• Most genomes are released along with predicted genes and protein
sequences
• 10 genomes released purely as scaffolded assemblies – gene
prediction needed to be carried out on these using Augustus trained
with appropriate models generated by Peter Thorpe
6. • Number of predicted genes ranges from 10,000 (P. kernoviae) to over 75,000 (P. cambivora)
• Large numbers are an artefact of “greedy” gene prediction tools over-predicting genes
7. • Repeat-rich regions can evolve rapidly – very useful for overcoming host resistances
• Larger genomes are often a result of expansion of repetitive regions
8. Completeness of coverage
• Not all genes may be captured in an assembly
• Use a set of 234 ubiquitous genes expected to be present in all species
• 22/26 assemblies estimated over 90% “complete”
• 3/26 over 70%
• P. alni only 37% complete
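The completeness estimate is the share of the 234 expected genes that can be found in an assembly. A minimal sketch of that calculation, with the same bands as the summary above, is given here. The function names and the per-assembly counts are hypothetical; only the gene set size of 234 comes from the slide.

```python
EXPECTED_GENES = 234  # size of the ubiquitous gene set quoted on the slide


def completeness_percent(genes_found, expected=EXPECTED_GENES):
    """Percentage of the expected gene set recovered in an assembly."""
    if not 0 <= genes_found <= expected:
        raise ValueError("genes_found must be between 0 and the expected total")
    return 100.0 * genes_found / expected


def completeness_band(percent):
    """Assign an assembly to one of the bands used in the summary."""
    if percent > 90:
        return "over 90%"
    if percent > 70:
        return "over 70%"
    return "70% or below"


# Hypothetical counts of recovered genes, not taken from the presentation.
toy_counts = {"assembly_A": 225, "assembly_B": 180, "assembly_C": 90}
report = {
    name: completeness_band(completeness_percent(found))
    for name, found in toy_counts.items()
}
```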
9. Orthologous genes common to all 26 species
• Use OrthoFinder to perform all-by-all comparison of peptides over 30 aa in length between all 26 species and generate clusters of orthologues and paralogues
• Using these common clusters, OrthoFinder also generates a phylogenetic tree for all 26 species
No. clusters identified: 55,134
Max. peptides/cluster: 1,956
Min. peptides/cluster: 1
Clusters containing only one protein: 29,442
Clusters with at least 20 species represented: 4,156
Clusters with all 26 species represented: 2,107
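Summary statistics of this kind can be derived from any table of cluster membership. The sketch below assumes each cluster is available as a list of (species, peptide) pairs. That data structure, the function name and the toy clusters are assumptions made for illustration, and this is not OrthoFinder's own output format or code.

```python
def summarise_clusters(clusters, n_species, near_core_threshold):
    """Summarise cluster sizes and species representation.

    clusters: dict mapping cluster id -> list of (species, peptide_id) pairs.
    """
    sizes = [len(members) for members in clusters.values()]
    species_counts = [
        len({species for species, _ in members}) for members in clusters.values()
    ]
    return {
        "clusters": len(clusters),
        "max_peptides": max(sizes),
        "min_peptides": min(sizes),
        "singletons": sum(1 for size in sizes if size == 1),
        "near_core": sum(1 for n in species_counts if n >= near_core_threshold),
        "core": sum(1 for n in species_counts if n == n_species),
    }


# Toy clusters over three made-up species (A, B, C), for illustration only.
toy_clusters = {
    "OG0": [("A", "a1"), ("B", "b1"), ("C", "c1"), ("A", "a2")],
    "OG1": [("A", "a3"), ("B", "b2")],
    "OG2": [("C", "c2")],
}
stats = summarise_clusters(toy_clusters, n_species=3, near_core_threshold=2)
```

For the real data set the equivalent call would use `n_species=26` and `near_core_threshold=20`.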
11. Sample gene family: Xylanases
• Class of cell wall degrading enzymes which break down hemicellulose by degrading β-1,4-xylan into xylose
• Hemicellulose is a major constituent of the
plant cell wall
• Xylanase enzymes play a major role in the ability of
micro-organisms to degrade plant material
• Help the pathogen enter host tissues by breaking
down the cell wall
12. Phytophthora xylanases
• Four major xylanases identified in Phytophthora: xyn1, xyn2, xyn3, xyn4
• Average length
• xyn1: 386 aa
• xyn2: 477 aa
• xyn3: 359 aa
• xyn4: 355 aa
• xyn1 and xyn2 thought to be most important family members for
virulence in P. parasitica
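Average lengths like those listed for each family member are the mean peptide length across the sequences assigned to it. A minimal sketch, assuming the sequences are already grouped by family; the function name, family names and sequences below are invented and are not real xylanase data.

```python
from statistics import mean


def mean_lengths(families):
    """Mean peptide length (in aa, rounded) for each family.

    families: dict mapping family name -> list of amino acid sequences.
    """
    return {
        name: round(mean([len(seq) for seq in seqs]))
        for name, seqs in families.items()
    }


# Toy peptide sequences, for illustration only.
toy_families = {"famA": ["MKTA", "MKTAYI"], "famB": ["MK"]}
averages = mean_lengths(toy_families)
```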
14. Xylanase tree overview
[Figure: xylanase tree with clades labelled xyn1, xyn2, xyn3 and xyn4]
• Clearly defined structure for xyn1 and xyn2 with two distinct clades
• Clades for xyn3 and xyn4 less clearly defined
• P. kernoviae sequences at a much greater distance
17. Presence/absence of xyn genes by species
• All four present in most species
• Sequences from Phytophthora clades 7 and 8 fall
outside expected xyn3/xyn4 groupings
• P. ramorum possesses one additional sequence,
which groups with xyn3 clade
• P. infestans xyn1/xyn2 both fall into xyn1 clade
• P. kernoviae possesses two xyn genes, one
associated with xyn1/xyn2 clade and the other
associated with xyn3/xyn4 clade
• P. alni only appears to have two – would expect more as it is a hybrid. Incomplete genome assembly?
Clade Species xyn1 xyn2 xyn3 xyn4
1 P. cactorum yes yes yes yes
1 P. infestans yes ? yes yes
1 P. parasitica yes yes yes yes
2 P. capsici yes yes yes yes
2 P. colocasiae yes yes yes yes
2 P. multivora yes yes yes yes
2 P. plurivora yes yes yes yes
3 P. pluvialis yes yes yes yes
3 P. taxon totara yes yes yes ?
4 P. litchii yes yes ? yes
4 P. megakarya yes yes yes yes
4 P. palmivora yes yes ? yes
5 P. agathidicida yes yes yes yes
6 P. pinifolia yes yes yes yes
7 P. alni yes no yes no
7 P. cambivora yes yes yes ?
7 P. cinnamomi yes yes yes ?
7 P. fragariae yes yes yes ?
7 P. pisi yes yes yes ?
7 P. rubi yes yes yes ?
7 P. sojae yes yes yes ?
8 P. austrocedri no yes ? no
8 P. cryptogea yes yes ? no
8 P. lateralis yes yes ? no
8 P. ramorum yes yes ? ?
10 P. kernoviae ? ? ? ?
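A table in this layout can be tallied programmatically, for example to list the species with all four genes confirmed or with uncertain calls. The sketch below parses a few rows copied from the table above. The parsing approach and function name are assumptions for illustration, not part of the original analysis.

```python
GENES = ("xyn1", "xyn2", "xyn3", "xyn4")


def parse_row(row):
    """Split a table row into (clade, species, {gene: call})."""
    tokens = row.split()
    clade = int(tokens[0])
    calls = tokens[-len(GENES):]                  # last four tokens are the calls
    species = " ".join(tokens[1:-len(GENES)])     # species names may contain spaces
    return clade, species, dict(zip(GENES, calls))


# A subset of rows copied from the table above.
rows = [
    "1 P. cactorum yes yes yes yes",
    "1 P. infestans yes ? yes yes",
    "3 P. taxon totara yes yes yes ?",
    "7 P. alni yes no yes no",
    "10 P. kernoviae ? ? ? ?",
]
parsed = [parse_row(row) for row in rows]

# Species where every gene is a confirmed "yes".
all_four = [sp for _, sp, calls in parsed if all(v == "yes" for v in calls.values())]
# Species with at least one uncertain ("?") call.
uncertain = [sp for _, sp, calls in parsed if "?" in calls.values()]
```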
19. • Should sequences from P. kernoviae (clade 10) and P. taxon totara (clade 3) be this similar?
• Source database linked to P. kernoviae genome download instead of P. taxon totara!
[Figure: xylanase tree with clades labelled xyn1, xyn2, xyn3 and xyn4]
20. Further work
• Sequencing the other three Phytophthora species, P. europaea, P. obscura and P. foliorum, is problematic
• Difficult to obtain sufficient quantities of high-quality DNA for long-read
PacBio sequencing
• As new genome assemblies become available, bring these into the
analyses
• Expand xylanase gene family analysis to identify additional members
• Investigate other gene families of interest, e.g. RXLR effector proteins