Field-based, real-time metagenomics and phylogenomics for responsive pathogen detection: lessons from nanopore analyses of Acute Oak Decline (AOD) sites in the UK.
This document summarizes a study that used nanopore sequencing to analyze plant and microbial DNA from oak trees affected by Acute Oak Decline (AOD) at multiple sites in the UK. Key findings include: (1) Potential bacterial "vectors" of AOD were detected at nearly all sites in both symptomatic and asymptomatic trees; (2) Differences in bacterial communities between bark and leaf samples were unclear; (3) Non-native oak species were also affected; (4) Challenges remained in data analysis and establishing microbial baselines. The study demonstrated the feasibility of field-based metagenomic sequencing to identify pathogens.
1. Field-based, real-time metagenomics and phylogenomics for responsive pathogen detection: lessons from nanopore analyses of Acute Oak Decline (AOD) sites in the UK
Joe Parker
Early-career Research Fellow, RBG Kew
@lonelyjoeparker:
joe.parker@kew.org
3. Sites / samples
• Three sites: Kew, Richmond, Malvern Wood
• Symptomatic and asymptomatic
• Several species – not just native (Kew)
• Incl. N America, Near East, China/SE Asia
• Tissues: bark, leaf, exudate
• In silica and also straight to -20°C
L-R: Q. petraea; Q. frainetto; Q. nigra
6. Field-sequencing for real
Conditions
100% humidity; 6-13°C
Essential kit
800 W generator
3x laptops
Centrifuge
Waterbath
Polystyrene boxes (lots)
Kettle(…!)
Yield
>400Mbp data in three days;
A. thaliana ~2.01x coverage
10. Partial queries: encounter curves?
Subsample MinION output
Repeat ID pipeline, record mean ID stats, bias
Replicates: N = 30
Simulate from 10^0 to 10^4 reads
(≈instant → hours)
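The subsampling experiment on this slide can be sketched roughly as follows. This is a minimal illustration, not the pipeline used in the talk: it assumes each read has already been reduced to a single numeric ID statistic, and the function name `encounter_curve`, the toy data, and the choice of depths from 1 to 10,000 reads are all hypothetical. For each depth it draws 30 random subsamples and records the mean statistic, its bias against the full-data mean, and the spread between replicates.

```python
import random
import statistics


def encounter_curve(read_stats, depths, replicates=30, seed=1):
    """Subsample per-read ID statistics at each depth and summarize them.

    read_stats: list of per-read numbers (hypothetical ID statistic).
    depths:     subsample sizes to try, in reads.
    Returns one dict per depth with the mean over replicates, the bias
    relative to the full-data mean, and the SD between replicates.
    """
    rng = random.Random(seed)
    full_mean = statistics.fmean(read_stats)
    curve = []
    for depth in depths:
        n = min(depth, len(read_stats))  # cannot draw more reads than exist
        means = [
            statistics.fmean(rng.sample(read_stats, n))  # without replacement
            for _ in range(replicates)
        ]
        grand_mean = statistics.fmean(means)
        curve.append({
            "depth": n,
            "mean": grand_mean,
            "bias": grand_mean - full_mean,
            "sd": statistics.pstdev(means),
        })
    return curve


# Toy stand-in for per-read ID statistics (NOT real MinION data).
toy_rng = random.Random(42)
toy_stats = [toy_rng.gauss(50.0, 20.0) for _ in range(10_000)]

# Assumed depths: 1, 10, 100, 1,000 and 10,000 reads.
curve = encounter_curve(toy_stats, depths=[10 ** k for k in range(5)])
for point in curve:
    print(point)
```

Plotting `sd` or `bias` against `depth` gives the kind of encounter curve the slide title asks about: how quickly the ID statistic settles as more reads arrive.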
11. Partial references?
Take reference genome at high contiguity
Fragment randomly to target (low) contiguity
Repeat read identification using fragmented DB
Simulate N50 ≈ 1,000 bp to N50 ≈ 10 Mbp
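One way to picture the fragmentation step is the sketch below. It is an assumption-laden illustration rather than the method used in the study: fragment lengths are drawn from an exponential distribution (the talk does not say which distribution was used), the helper names `n50` and `fragment` are hypothetical, and the "genome" is a random toy string. The mean fragment length is the knob that would be turned to move the resulting N50 towards a target value.

```python
import random


def n50(lengths):
    """Length L such that fragments of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def fragment(sequence, mean_len, rng):
    """Cut `sequence` into consecutive pieces with random lengths.

    Piece lengths are drawn from an exponential distribution with the
    given mean (an assumption made for this sketch); every piece has at
    least one base, and the pieces concatenate back to the input.
    """
    pieces = []
    pos = 0
    while pos < len(sequence):
        step = max(1, round(rng.expovariate(1.0 / mean_len)))
        pieces.append(sequence[pos:pos + step])
        pos += step
    return pieces


# Toy "reference genome" (random bases, NOT a real assembly).
rng = random.Random(7)
genome = "".join(rng.choice("ACGT") for _ in range(200_000))

pieces = fragment(genome, mean_len=1_000, rng=rng)
print(len(pieces), "fragments; N50 =", n50([len(p) for p in pieces]))
```

The fragmented pieces would then be used to build the degraded database against which read identification is repeated.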
12. Long-read matching vs K-mers
Entire chloroplast genome (~150 kbp)
Plastid coding loci
Individual field-sequenced MinION reads
13. Gene prediction – phylogenomics and functional descriptions?
• Filtered reads → gene models (SNAP)
• Annotation: TAIR10 CDS, 1:1 reciprocal BLAST
• Multiple sequence alignments (MUSCLE, trimAl)
• Gene trees → consensus tree (RAxML, *BEAST, TreeAnnotator)
• Cumulative counts: unique genes, all genes
(‘Lab’ being
transported!)
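The 1:1 reciprocal BLAST step in this pipeline amounts to keeping only gene model / reference pairs that are each other's best hit. A minimal sketch follows, assuming the BLAST results have already been parsed into `(query, subject, bitscore)` tuples; the function names and the identifiers `m1`, `tA` and so on are hypothetical placeholders, not real gene models or TAIR10 accessions.

```python
def best_hits(hits):
    """Map each query to its single highest-scoring subject.

    hits: iterable of (query, subject, bitscore) tuples.
    On equal scores the first hit seen is kept.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward, reverse):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    fwd = best_hits(forward)  # gene model -> reference CDS
    rev = best_hits(reverse)  # reference CDS -> gene model
    return sorted((q, s) for q, s in fwd.items() if rev.get(s) == q)


# Toy hit tables (hypothetical IDs and scores).
forward = [("m1", "tA", 200), ("m1", "tB", 150), ("m2", "tB", 180), ("m3", "tB", 90)]
reverse = [("tA", "m1", 195), ("tB", "m2", 170), ("tB", "m3", 80)]

pairs = reciprocal_best_hits(forward, reverse)
print(pairs)
```

Here `m3` is dropped because its best reference hit, `tB`, points back to `m2` instead; only the surviving 1:1 pairs would go forward to alignment and tree building.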
14. (Early) AOD conclusions
• ‘Vector’ sequences almost ubiquitous
• Bark / leaf differences unclear
• Non-native species affected
• ‘Polysyndromic’ bacteria?
• Relative abundance?
• Microbiome not understood – baseline data
• Temperature/climate effect?
15. Field-based ID systems
• Data is decent
• Need good long-read-aware + noise correction
• More data to establish baselines
• Mainly expertise/know-how limited, tech is fine now.
17. Keeping it simple: Kew Science Festival
Six species: whole genome-skim samples with MinION in preparation
Build BLAST DBs from skimmed data
Select ‘unknown’ (blinded) sample, extract DNA and resequence in real-time
Compare to partial DBs in six-way BLAST competition
Live ID?
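The six-way BLAST competition can be thought of as a vote: each read from the blinded sample is scored against every species database and backs whichever one it matches best. The sketch below is one plausible way to tally that, not the scoring actually used at the festival; the function name `tally_votes`, the rule of skipping tied reads, and the species labels and scores are all assumptions for illustration.

```python
from collections import Counter


def tally_votes(read_scores):
    """Count, per database, the reads for which it is the unique top scorer.

    read_scores: {read_id: {db_name: score}} with one score per database
    the read hit. Reads with no hits, or with a tie for first place, are
    skipped (an assumption made for this sketch).
    """
    votes = Counter()
    for scores in read_scores.values():
        if not scores:
            continue  # read hit nothing
        ranked = sorted(scores.values(), reverse=True)
        if len(ranked) > 1 and ranked[0] == ranked[1]:
            continue  # ambiguous read: top two databases tie
        votes[max(scores, key=scores.get)] += 1
    return votes


# Toy scores for five reads (hypothetical species labels and values).
reads = {
    "r1": {"species_A": 120, "species_B": 80},
    "r2": {"species_A": 60, "species_B": 60},
    "r3": {"species_B": 300},
    "r4": {},
    "r5": {"species_A": 90, "species_C": 40},
}

votes = tally_votes(reads)
print(votes.most_common())
```

Because the tally can be updated read by read, the leading species can be reported while sequencing is still running, which is what a live ID would need.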
18. Thanks, funders, contacts and questions
Oxford Nanopore Technologies Ltd.
Dan Turner, Richard Ronan, Gerrard Coyne
U Bangor:
Alexander S.T. Papadopulos (@metallophyte)
RBG Kew:
Postdocs: Andrew Helmstetter (@ajhelmstetter); Tim Coker
Thanks: Dion Devey, Robyn Cowan, Tim Wilkinson, Stephen Dodsworth, Pepijn Kooij, Felix
Forest, Bill Baker, Jan T. Kim, Jenny Williams, Abigail Barker, Mark Lee, Jim Clarkson, Mike
Chester, Ester Gaya, Lisa Pokorny, Laszlo Csiba, Paul Wilkin, Richard Buggs, Mike Fay, Mark
Chase, Ilia Leitch
QMUL
Laura Kelly, Kalina Davies, Steve Rossiter
Oxford
Aris Katzourakis, Oli Pybus, Jayna Raghwani
Others
Forest Research: Daegan Inward, Katy Reed
Dstl: Claire Lonstale, James Taylor
Birmingham: Nick Loman, Josh Quick
U. Utah: Bryn Dentinger
Imperial: James Rosindell
This research was conducted in the Sackler Phylogenomics Laboratory and was supported by the Calleva Foundation Phylogenomic Research Programme and the Sackler Trust
@lonelyjoeparker:
joe.parker@kew.org
Editor's Notes
FUNDED tailor made for health research/application
need to mention it somewhere because of:
strategic links
Building the ‘momentum narrative’
Other related stuff; VIPs etc
Plant health and emerging threats
A connected world means new diseases can spread globally, fast.
Lay out the problem, e.g. opportunities – look! Health! Ascertainment bias! Field-portable! etc
Funding: yet another pot, this one also bigger.
Software etc to improve UI (ahem)
Portable sequencing: also long reads and real-time
Portable
Real-time
Long
easy
Data in terrible conditions but anyone can do it
Social media reach The Atlantic, Economist
EXPLAIN AXES: precision improves rapidly
EXPLAIN AXES: a partial REFERENCE would work, too
Single reads match whole genes – meat & drink
EXPLAIN AXES postdoc-years PAPER ACCEPTED
HPCs to apps: Exponential data, linear understanding.
Pause – to recap
This is important because it’s where we tie it together and show my contribution:
Portable sequencers, easier to use
More places
More experimenters
More data
More noise
Efficient comparison?
Dynamic computation?
Clever hashing
Portable, mass sequencing is really here
Massive potential for de novo genomics; phylogenomics
But while we’re accumulating information at an exponential rate, we’re integrating it linearly, in essence
… where are we going?
MORE FUNDING. SO simple a kid could do it? Yes
The challenge I set myself: OK, it’s a simple experiment. Can I build a test simple enough that a child can understand it?
SOCIAL MEDIA
Funding: NANOPORE