Phyto-Threats
Work package 4
Predicting risk via analysis of Phytophthora genome evolution
Ewan Mollison, Paul Sharp, Leighton Pritchard, David Cooke, Sarah Green
Introduction
• What can drive evolution of a pathogen?
• “Intrinsic” factors: duplication, rearrangement, insertion, deletion of DNA
regions
• “Extrinsic” factors: hybridisation between species, transfer of genes between
species
• Allow pathogens to
• Adapt to evolving host defences
• Expand host range
• Increase virulence
Aims
• Compare genes from available sequenced Phytophthora genomes
• Identify a core set of Phytophthora genes, common to all species
• Identify species-specific genes or variation
• Sequence genomes from three less damaging species, which are
closely related to highly damaging species
• Can we use this to help understand key genes involved in virulence?
• Study target genes / gene families known to be important for
virulence
• How do variations in these influence the pathogen, e.g. host range, damage
caused, etc.?
Overall sequencing/assembly strategy for P. austrocedri
• Purified DNA prepared for sequencing
• Generate low coverage PacBio long reads (~18x genome coverage) and assemble genome: 113.76 Mbp across 2,226 fragments (contigs)
• Generate high coverage Illumina short read pairs: ~132x
• De-duplicate read pairs to remove redundancy: no change, still ~132x
• Additional stringent quality control at 99.9% confidence: reduced coverage to ~92x
• Error correct PacBio contigs with cleaned Illumina reads: 113.83 Mbp across 2,226 contigs
• Use pairing of short reads to link PacBio contigs together into longer scaffolds: 114.39 Mbp across 1,977 contigs
• Identify repetitive elements with RepeatModeler & RepeatMasker: 48.96% repetitive DNA content
• Predict genes using software “trained” with sample gene structures: 31,326 predicted genes
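The coverage figures in this strategy follow the usual relation: fold coverage equals total sequenced bases divided by genome size. The Python sketch below is illustrative only. It assumes the final 114.39 Mbp assembly size as the genome size (the slides do not say which size was used), and the function names are hypothetical.

```python
def fold_coverage(total_read_bases: float, genome_size_bp: float) -> float:
    """Average fold coverage: total sequenced bases divided by genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_read_bases / genome_size_bp


def bases_for_coverage(coverage: float, genome_size_bp: float) -> float:
    """Inverse relation: bases needed to reach a given fold coverage."""
    return coverage * genome_size_bp


# Assumption: the final assembly size from the slide stands in for genome size.
GENOME_SIZE_BP = 114.39e6

pacbio_bases = bases_for_coverage(18, GENOME_SIZE_BP)    # ~18x long reads
illumina_raw = bases_for_coverage(132, GENOME_SIZE_BP)   # ~132x short reads
illumina_clean = bases_for_coverage(92, GENOME_SIZE_BP)  # ~92x after QC

# Fraction of Illumina bases retained by the stringent quality control step.
retained = illumina_clean / illumina_raw

print(f"PacBio bases for 18x: {pacbio_bases / 1e9:.2f} Gbp")
print(f"Illumina bases retained after QC: {retained:.1%}")
```

Whatever genome size is assumed, the drop from ~132x to ~92x means roughly 70% of the short-read bases survived quality control.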
Available genome assemblies
• Genome assemblies for 26 Phytophthora species now available
• Varying states of “finished-ness”
• Most genomes are released along with predicted genes and protein
sequences
• 10 genomes released purely as scaffolded assemblies – gene
prediction needed to be carried out on these using Augustus trained
with appropriate models generated by Peter Thorpe
• Predicted gene counts range from 10,000 (P. kernoviae) to over 75,000 (P. cambivora)
• Large numbers are an artefact of “greedy” gene prediction tools over-predicting genes
• Repeat rich regions can evolve rapidly – very useful for overcoming host resistances
• Larger genomes often a result of expansion of repetitive regions
Completeness of coverage
• Not all genes may be captured in an assembly
• Use a set of 234 ubiquitous genes expected to be present in all species
• 22/26 assemblies estimated over 90% “complete”
• 3/26 over 70%
• P. alni only 37% complete
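Completeness here can be read as the share of the 234 expected genes that are found in an assembly. The Python sketch below shows that arithmetic. The function names are hypothetical, and the gene count used for P. alni is back-calculated from the reported 37%, not taken from the slides.

```python
import math

EXPECTED_GENES = 234  # size of the ubiquitous gene set quoted on the slide


def completeness(found: int, expected: int = EXPECTED_GENES) -> float:
    """Percentage of the expected gene set recovered in an assembly."""
    if not 0 <= found <= expected:
        raise ValueError("found must be between 0 and expected")
    return 100.0 * found / expected


def min_genes_for(threshold_pct: float, expected: int = EXPECTED_GENES) -> int:
    """Smallest number of recovered genes that reaches threshold_pct."""
    return math.ceil(threshold_pct * expected / 100)


# Gene counts needed to reach the thresholds mentioned on the slide.
genes_for_90 = min_genes_for(90)
genes_for_70 = min_genes_for(70)

# Assumption: about 87 recovered genes corresponds to the ~37% reported for P. alni.
alni_pct = completeness(87)

print(genes_for_90, genes_for_70, round(alni_pct, 1))
```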
Orthologous genes common to all 26 species
• Use OrthoFinder to perform all-by-all comparison of peptides over 30 aa in length between all 26 species and generate clusters of orthologues and paralogues
• Using these common clusters, OrthoFinder also generates a phylogenetic tree for all 26 species
No. clusters identified: 55,134
Max. peptides/cluster: 1,956
Min. peptides/cluster: 1
Clusters containing only one protein: 29,442
Clusters with at least 20 species represented: 4,156
Clusters with all 26 species represented: 2,107
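Summary statistics of this kind can be derived from any table of cluster membership. The Python sketch below works on a toy, hypothetical mapping of cluster names to (species, peptide) pairs. It does not use OrthoFinder's own output format or API, and the names `summarise` and `toy` are illustrative.

```python
from typing import Dict, List, Tuple

Cluster = List[Tuple[str, str]]  # (species, peptide_id) pairs


def summarise(clusters: Dict[str, Cluster], n_species: int, near_core: int) -> Dict[str, int]:
    """Count clusters by size and by how many species they contain."""
    sizes = [len(members) for members in clusters.values()]
    species_counts = [len({sp for sp, _ in members}) for members in clusters.values()]
    return {
        "clusters": len(clusters),
        "max_peptides": max(sizes),
        "min_peptides": min(sizes),
        "singletons": sum(1 for s in sizes if s == 1),
        "near_core": sum(1 for c in species_counts if c >= near_core),
        "core": sum(1 for c in species_counts if c == n_species),
    }


# Toy data: three made-up species (A, B, C), not real Phytophthora clusters.
toy = {
    "OG1": [("A", "a1"), ("A", "a2"), ("B", "b1"), ("C", "c1")],  # all species, one paralogue
    "OG2": [("A", "a3"), ("B", "b2")],                            # two of three species
    "OG3": [("C", "c2")],                                         # single-protein cluster
}

stats = summarise(toy, n_species=3, near_core=2)
print(stats)
```

In the slide's figures, 29,442 of 55,134 clusters (about 53%) contain a single protein, and 2,107 (about 3.8%) contain all 26 species.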
Phylogeny inferred from OrthoFinder analysis
[Figure: phylogenetic tree of the 26 species, labelled by clade: Clades 1 to 8 and Clade 10]
Sample gene family: Xylanases
• Class of cell wall degrading enzymes which break down hemicellulose by degrading β-1,4-xylan into xylose
• Hemicellulose is a major constituent of the
plant cell wall
• Xylanase enzymes play a major role in the ability of
micro-organisms to degrade plant material
• Help the pathogen enter host tissues by breaking
down the cell wall
Phytophthora xylanases
• Four major xylanases identified in Phytophthora: xyn1, xyn2, xyn3, xyn4
• Average length
• xyn1: 386 aa
• xyn2: 477 aa
• xyn3: 359 aa
• xyn4: 355 aa
• xyn1 and xyn2 thought to be most important family members for
virulence in P. parasitica
Alignment of xylanases from 26 Phytophthora species
Xylanase tree overview
[Figure: xylanase tree with clades labelled xyn1, xyn2, xyn3 and xyn4]
• Clearly defined structure for xyn1 and xyn2 with two distinct clades
• Clades for xyn3 and xyn4 less clearly defined
• P. kernoviae sequences at a much greater distance
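xyn1 and xyn2 clades
[Figure: detail of the xyn1 and xyn2 clades of the xylanase tree]
xyn3 and xyn4 clades
[Figure: detail of the xyn3 and xyn4 clades of the xylanase tree]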
Presence/absence of xyn genes by species
• All four present in most species
• Sequences from Phytophthora clades 7 and 8 fall
outside expected xyn3/xyn4 groupings
• P. ramorum possesses one additional sequence,
which groups with xyn3 clade
• P. infestans xyn1/xyn2 both fall into xyn1 clade
• P. kernoviae possesses two xyn genes, one
associated with xyn1/xyn2 clade and the other
associated with xyn3/xyn4 clade
• P. alni only appears to have two – would expect more as it is a hybrid. Incomplete genome assembly?
Clade Species xyn1 xyn2 xyn3 xyn4
1 P. cactorum yes yes yes yes
1 P. infestans yes ? yes yes
1 P. parasitica yes yes yes yes
2 P. capsici yes yes yes yes
2 P. colocasiae yes yes yes yes
2 P. multivora yes yes yes yes
2 P. plurivora yes yes yes yes
3 P. pluvialis yes yes yes yes
3 P. taxon totara yes yes yes ?
4 P. litchii yes yes ? yes
4 P. megakarya yes yes yes yes
4 P. palmivora yes yes ? yes
5 P. agathidicida yes yes yes yes
6 P. pinifolia yes yes yes yes
7 P. alni yes no yes no
7 P. cambivora yes yes yes ?
7 P. cinnamomi yes yes yes ?
7 P. fragariae yes yes yes ?
7 P. pisi yes yes yes ?
7 P. rubi yes yes yes ?
7 P. sojae yes yes yes ?
8 P. austrocedri no yes ? no
8 P. cryptogea yes yes ? no
8 P. lateralis yes yes ? no
8 P. ramorum yes yes ? ?
10 P. kernoviae ? ? ? ?
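The presence/absence table can be tallied per gene with a short script. The Python sketch below copies the rows as they appear on the slide and counts "yes", "no" and "?" calls for each xyn gene. The parsing approach and function names are illustrative assumptions, not the authors' method.

```python
from collections import Counter

GENES = ("xyn1", "xyn2", "xyn3", "xyn4")

# Rows copied from the slide: clade, species, then one call per gene.
TABLE = """\
1 P. cactorum yes yes yes yes
1 P. infestans yes ? yes yes
1 P. parasitica yes yes yes yes
2 P. capsici yes yes yes yes
2 P. colocasiae yes yes yes yes
2 P. multivora yes yes yes yes
2 P. plurivora yes yes yes yes
3 P. pluvialis yes yes yes yes
3 P. taxon totara yes yes yes ?
4 P. litchii yes yes ? yes
4 P. megakarya yes yes yes yes
4 P. palmivora yes yes ? yes
5 P. agathidicida yes yes yes yes
6 P. pinifolia yes yes yes yes
7 P. alni yes no yes no
7 P. cambivora yes yes yes ?
7 P. cinnamomi yes yes yes ?
7 P. fragariae yes yes yes ?
7 P. pisi yes yes yes ?
7 P. rubi yes yes yes ?
7 P. sojae yes yes yes ?
8 P. austrocedri no yes ? no
8 P. cryptogea yes yes ? no
8 P. lateralis yes yes ? no
8 P. ramorum yes yes ? ?
10 P. kernoviae ? ? ? ?
"""


def parse(table: str):
    """Split each row into (clade, species, {gene: call})."""
    rows = []
    for line in table.strip().splitlines():
        # The last four tokens are the gene calls; species names may contain spaces.
        head, *calls = line.rsplit(maxsplit=4)
        clade, species = head.split(maxsplit=1)
        rows.append((int(clade), species, dict(zip(GENES, calls))))
    return rows


def tally(rows):
    """Count yes / no / ? calls for each gene across all species."""
    return {g: Counter(calls[g] for _, _, calls in rows) for g in GENES}


rows = parse(TABLE)
counts = tally(rows)
all_four = [sp for _, sp, calls in rows if all(v == "yes" for v in calls.values())]

for gene in GENES:
    print(gene, dict(counts[gene]))
print("Unambiguous 'yes' for all four genes:", len(all_four), "species")
```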
Regarding data sources…
• Should sequences from P. kernoviae (clade 10) and P. taxon totara (clade 3) be this similar?
• Source database linked to P. kernoviae genome download instead of P. taxon totara!
[Figure: xylanase tree with clades labelled xyn1, xyn2, xyn3 and xyn4]
Further work
• Sequencing the other three Phytophthora species, P. europaea, P.
obscurum and P. foliorum, is problematic
• Difficult to obtain sufficient quantities of high-quality DNA for long-read
PacBio sequencing
• As new genome assemblies become available, bring these into the
analyses
• Expand xylanase gene family analysis to identify additional members
• Investigate other gene families of interest, e.g. RXLR effector proteins
Acknowledgements
University of Edinburgh
Prof. Paul Sharp
Forest Research
Dr. Sarah Green
James Hutton Institute
Dr. Peter Thorpe
Dr. Leighton Pritchard
Dr. David Cooke

Ewan mollison wp4 april 2018
