Microbial Agrogenomics
Where can it lead us?
Leighton Pritchard
Information and Computational Sciences
The James Hutton Institute
Acceptable Use Policy
Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions
Recording of this talk, taking photos, discussing the content using
email, Twitter, blogs, etc. is permitted (and encouraged),
provided that distraction to others during the presentation is minimised.
These slides will be available on SlideShare.
Table of Contents
Introduction
Why Genomics?
2003-Now
Implications
Where Next?
Conclusions
The James Hutton Institute
Centres of Expertise
http://www.hutton.ac.uk
• Dundee Effector Consortium (DEC, with University of Dundee) [link]
• Centre for Research on Potato and Other Solanaceous Plants (CRPS) [link]
• Centre for Human and Animal Pathogens in the Environment (HAP-E) [link]
Plant-Pathogen Interactions
Pathogens of barley (e.g. Rhynchosporium commune), and soft fruit
(e.g. Raspberry Leaf Blotch Virus (RLBV))
Plant-Pathogen Interactions
Potato pathogens, pests, and vectors.
• soft-rot bacteria (Dickeya, Pectobacterium, Erwinia)
• blight (Phytophthora infestans)
• Potato Cyst Nematode (PCN) (Globodera)
• aphids (Myzus persicae)
Issue 1: Food Security
• Economic cost and burden of crop disease
• P. infestans: €1bn Europe; $4bn global
• Societal impact (human health, commodity prices; farming)
• Emerging pathogens (JIT supply chain; climate change)
• Plant-associated human pathogens
• Food fraud
Issue 2: Environmental Sustainability
• Pesticide minimisation and withdrawal
• Durable resistance, soil-beneficial microbes, plant
growth/nutritional enhancement
• Traditional breeding, GM, or engineering?
• Soils: rhizosphere interactions/soil diversity
• Farming practices (water run-off, rotation, equipment-cleaning
- EU sulphuric acid ban)
Table of Contents
Introduction
Why Genomics?
2003-Now
Implications
Where Next?
Conclusions
What Have Genomes Ever Done For Us?
• Catalogue features (genes, regulatory elements, etc.) in an
organism.
Plant-microbe interactions
Gene products at the host-microbe interface
Dodds & Rathjen (2010) Nat. Rev. Genet. 11:539-548 doi:10.1038/nrg2812
Plant-Nematode Interactions
RNA-seq identification of 27 putative nematode effectors:
Small proteins, expressed in gland cells during feeding stage only.
Cotton et al. (2014) Genome Biol. 15:R43 doi:10.1186/gb-2014-15-3-r43
Plant Defence
Prediction of NB-LRR genes (sequence capture).
Jupe et al. (2013) Plant J. 76:530-544 doi:10.1111/tpj.12307
Jupe et al. (2012) BMC Genomics 13:75 doi:10.1186/1471-2164-13-75
What Have Genomes Ever Done For Us?
• Catalogue features (genes, regulatory elements, etc.) in an
organism.
• If we have multiple genomes. . .
• What common features associate with phenotype or
environment?
Plant-microbe interactions
GWAS/QTLs/genotyping for plant breeding
http://ics.hutton.ac.uk/flapjack/
Milne et al. (2010) Bioinformatics 26:3133-3134 doi:10.1093/bioinformatics/btq580
Plant-microbe interactions
Structural changes to genomes: repeat-driven expansion
duplication, mutation, recombination, epigenetic control of effectors . . .
Haas et al. (2009) Nature 461:393-398 doi:10.1038/nature08358
Plant-microbe interactions
Structural changes to genomes: genome reductions
Buchnera, Serratia symbiotica - aphid symbionts, ‘random’ inactivation
Gil et al. (2002) Proc. Natl. Acad. Sci. USA 99:4454-4458 doi:10.1073/pnas.062067299
Burke & Moran (2011) Genome Biol. Evol. 3:195-208 doi:10.1093/gbe/evr002
Plant-microbe interactions
Lateral gene transfer (virulence-associated genes)
Bell et al. (2004) Proc. Natl. Acad. Sci. 101:11105-11110 doi:10.1073/pnas.0402424101
Plant-microbe interactions
Closely-related bacteria, different host/environmental preference.
Pectobacterium atrosepticum
Holden et al. (2009) FEMS Micro. Rev. 33:689-703 doi:10.1111/j.1574-6976.2008.00153.x
Toth et al. (2006) Annu. Rev. Phytopath. 44:305-336 doi:10.1146/annurev.phyto.44.070505.143444
What Have Genomes Ever Done For Us?
• Catalogue features (genes, regulatory elements, etc.) in an
organism.
• If we have multiple genomes. . .
• What common features associate with phenotype or
environment?
• Epidemiology: spread and transmission
Historical Origins
Retracing 19th-century P. infestans pandemics
Yoshida et al. (2014) PLoS Pathog. 10:e1004028 doi:10.1371/journal.ppat.1004028
International Emergence
Distribution of Dickeya spp. in Europe
• D. dianthicola; ◦ D. solani; Dickeya spp. on potato
Toth et al. (2011) Plant Pathol. 60:385-399 doi:10.1111/j.1365-3059.2011.02427.x
Host Jumps
Movement of Dickeya from ornamental to crop plants
Parkinson et al. (2015) Eur. J. Plant Pathol. 141:63-70 doi:10.1007/s10658-014-0523-5
Diagnostic Tools
Quarantine and legislation require precise identification.
Genomes enable rapid, robust RT-PCR diagnostics.
[Figure: diagnostic primer targets (I-V) plotted against genomes (I-V)]
https://github.com/widdowquinn/find_differential_primers
Pritchard et al. (2013) Plant Pathol. 62:587-596 doi:10.1111/j.1365-3059.2012.02678.x
Pritchard et al. (2012) PLoS One 7:e34498 doi:10.1371/journal.pone.0034498
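A minimal sketch of the idea behind differential diagnostics: look for short sequences present in every target genome and absent from all others. This is not the find_differential_primers pipeline; the function names and toy sequences are hypothetical, and real primer design also has to account for both strands, melting temperature and amplicon size.

```python
def kmers(sequence, k):
    """Return the set of all substrings of length k in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def diagnostic_kmers(target_genomes, other_genomes, k):
    """k-mers found in every target genome and in no non-target genome."""
    shared = set.intersection(*(kmers(g, k) for g in target_genomes))
    excluded = set().union(*(kmers(g, k) for g in other_genomes))
    return shared - excluded


# Toy 'genomes' (hypothetical, far shorter than anything real)
targets = ["ACGTTGCA", "TTACGTTG"]
others = ["ACGTAAAA"]

result = diagnostic_kmers(targets, others, k=4)
print(sorted(result))
```

Here ACGT is shared by both targets but also occurs in the non-target sequence, so it is discarded as a candidate.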
Table of Contents
Introduction
Why Genomics?
2003-Now
Implications
Where Next?
Conclusions
2003: E. carotovora subsp. atroseptica
• £250k collaboration between SCRI, University of Cambridge,
WT Sanger Institute
• Single isolate: E. carotovora subsp. atroseptica SCRI1043
• First sequenced enterobacterial plant pathogen (32 authors!)
• Annotation: 6 people, for 6 months ≈ three person-years
• Result: single, complete 5Mbp circular chromosome (10.2X)
Bell et al. (2004) Proc. Natl. Acad. Sci. USA 101:11105-11110 doi:10.1073/pnas.0402424101
2003: E. carotovora subsp. atroseptica
Compared against all 142 then-available bacterial genomes
Bell et al. (2004) Proc. Natl. Acad. Sci. USA 101:11105-11110 doi:10.1073/pnas.0402424101
2013: Dickeya spp.
Sequenced and annotated 25 isolates of Dickeya over two years
• Multiple sequencing methods: 454, Illumina (SE, PE)
• Automated annotation, limited manual correction
• Results: 12-237 fragments: 4.2-5.1Mbp/genome (6-84X)
Pritchard et al. (2013) Genome Announc. 1 (4) doi:10.1128/genomeA.00087-12
Pritchard et al. (2013) Genome Announc. 1 (6) doi:10.1128/genomeA.00978-13
2013: Dickeya spp.
Whole genome-based species definitions: sp. nov. D. solani
van der Wolf et al. (2014) Int. J. Syst. Evol. Micr. 64:768-774 doi:10.1099/ijs.0.052944-0
2013: Dickeya spp.
Differences in metabolic capacity (but ≈ 20% orphan EC activities)
2014: E. coli
Sequenced and annotated ≈ 190 isolates of E. coli
All bacteria environmental, sampled from lysimeters
• Illumina PE sequencing, cost ≈£11k
• Automated annotation: PROKKA
(w/ Fiona Brennan, Florence Abram, NUI Galway)
2014: E. coli
Whole genome-based subspecies classification
[Figure: clustered ANIm heatmap of pairwise whole-genome comparisons. Rows and columns are the assembled isolates (Lys*, AW*, X5*/5* and DSM* contigs, plus Brunei20070942, Muenster20063091 and Senftenberg20070885). Colour key and histogram: ANIm value 0.90-0.98 against count. Group labels: A, B1, B2, C, D, E, F, U, X.]
(w/ Fiona Brennan, Florence Abram, NUI Galway)
2014: Campylobacter spp.
≈1034 clinical, animal, food-associated Campylobacter isolates
• Illumina PE sequencing, cost ≈£60k
• Automated annotation: PRODIGAL
(w/ Ken Forbes, Norval Strachan, University of Aberdeen)
2014: Campylobacter spp.
• 15554 ‘gene families’ in 1034 isolates.
• Calculation: 4e12 pairwise protein comparisons!
(w/ Ken Forbes, Norval Strachan, University of Aberdeen)
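The slide does not say how the 4e12 figure was reached. As a rough check, and assuming it is an all-vs-all count (every protein against every protein, in both directions), it corresponds to about two million protein sequences, or a little under 2000 per isolate. The helper names below are illustrative only.

```python
import math


def all_vs_all(n_sequences):
    """Ordered comparisons: every sequence against every sequence."""
    return n_sequences * n_sequences


def unique_pairs(n_sequences):
    """Unordered comparisons, excluding self-hits."""
    return n_sequences * (n_sequences - 1) // 2


comparisons = 4 * 10**12   # figure quoted on the slide
n_isolates = 1034          # isolates quoted on the slide

# Assumption: the quoted figure is an all-vs-all (n squared) count
n_proteins = math.isqrt(comparisons)
proteins_per_isolate = n_proteins / n_isolates

print(n_proteins)
print(round(proteins_per_isolate))
print(unique_pairs(n_proteins))
```

Counting each unordered pair once would roughly halve the workload, which is still far too many comparisons to run naively.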
Table of Contents
Introduction
Why Genomics?
2003-Now
Implications
Where Next?
Conclusions
So what’s changed?
Everything.
• Cost: £250k → £60 per genome.
Now cheaper to sequence than analyse a genome!
Offload work from people to software.
• Location: sequencing centre, to benchtop (Nanopore!)
• Speed: sequencing run time can be less than a day
• Data: massive volume increase
Predicting the future is hard. . .
Su et al. attempted to do it, though:
10,000 prokaryotes in 2015 was an underestimate.
http://sulab.org/2013/06/sequenced-genomes-per-year/
So what’s changed?
Everything.
• Cost: £250k → £60 per genome.
• Location: sequencing centre, to benchtop (Nanopore!)
• Speed: sequencing run time can be less than a day
• Data: massive volume increase
More data ≈ better, but also more challenging.
• Software: more (≠ better. . .) software for more things
• New experiments: genomes, exomes, variant calling,
methylated sequences, STARR-seq, . . .
• New applications: diagnostics, epidemic tracking,
metagenomics, . . .
Sequence first. . . ask questions, later
• “Why?” has sometimes been replaced by “What?”
http://dilbert.com/strip/2000-01-03
“The thesis is not hypothesis driven. Add a hypothesis and refer to it in subsequent
chapters.”
More isn’t always better
Deeper sequencing (more reads) ≠ more information or better
assembly.
60-80X coverage the ‘sweet spot’ for bacterial genomes.
More reps more reads!
Conway & Bromage (2011) Bioinformatics 27:479-486 doi:10.1093/bioinformatics/btq697
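Expected mean coverage follows from the standard relation C = N × L / G (number of reads × read length / genome size). A small sketch of what the 60-80X range means in reads, assuming a 5 Mbp genome (as for SCRI1043) and a hypothetical 150 bp read length:

```python
import math


def mean_coverage(n_reads, read_length, genome_size):
    """Expected mean coverage C = N * L / G."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of reads giving at least the target mean coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


GENOME_SIZE = 5_000_000   # bp
READ_LENGTH = 150         # bp; hypothetical value for illustration

low = reads_needed(60, READ_LENGTH, GENOME_SIZE)
high = reads_needed(80, READ_LENGTH, GENOME_SIZE)
print(low, high)
print(mean_coverage(low, READ_LENGTH, GENOME_SIZE))
```

Mean coverage says nothing about how evenly reads are distributed, so it is a planning figure rather than a guarantee of assembly quality.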
Are database annotations reliable?
Automated annotation is essential
The Critical Assessment of Function Annotation (CAFA) project.
Radivojac et al. (2013) Nat. Meth. 10:221-227 doi:10.1038/nmeth.2340
Do biased database annotations matter?
Experimental annotations of proteins are incomplete. Is that
important?
Tested by simulation, and following databases for three years.
• Yes. It matters.
• Current large scale annotations are meaningful and almost surprisingly reliable.
• The nature and level of data incompleteness, and type of classification model
have an effect.
• “Low precision, high recall” (i.e. less discriminating) tools most significantly
affected.
Molecular function prediction is usually more reliable than
biological process prediction
Jiang et al. (2014) Bioinformatics 30:i609-i616 doi:10.1093/bioinformatics/btu472
Cozzetto et al. (2013) BMC Bioinf. 14:S3-S1 doi:10.1186/1471-2105-14-S3-S1
CAFA results
The Critical Assessment of Function Annotation (CAFA) 2013
results. (F-measure combines precision and recall)
• You can do better than
BLAST.
• Best-performing methods do
comparably well.
• Best methods used
evolutionary relationships,
structure, and expression
data.
• Machine Learning methods
work best.
Radivojac et al. (2013) Nat. Meth. 10:221-227 doi:10.1038/nmeth.2340
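For reference, the F-measure is the (weighted) harmonic mean of precision and recall. The sketch below computes it for one operating point and takes the maximum over several thresholds, a common way to summarise a predictor with a single number. The precision/recall pairs are made up for illustration and are not CAFA results.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def f_max(operating_points):
    """Best F1 over (precision, recall) pairs, one pair per score threshold."""
    return max(f_measure(p, r) for p, r in operating_points)


# Hypothetical (precision, recall) pairs at successively looser thresholds
points = [(0.9, 0.2), (0.7, 0.5), (0.5, 0.9), (0.3, 0.95)]

for p, r in points:
    print(f"precision={p:.2f} recall={r:.2f} F1={f_measure(p, r):.3f}")
print(f"F-max = {f_max(points):.3f}")
```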
More Isn’t Always Better
Statistical inference on large datasets requires extra care.
Hypothesis tests may incorrectly reject null hypotheses; correct for multiple testing (e.g. Benjamini-Hochberg, B-H)
More Isn’t Always Better
• More tests → random effect seems ‘real’
• May be considering a large set of inferences simultaneously
(and yet not notice!): “p-hacking”, “Researcher Degrees of
Freedom”
“good scientists are skilled at looking hard enough and subsequently coming up
with good stories (plausible even to themselves, as well as to their colleagues
and peer reviewers) to back up any statistically-significant comparisons they
happen to come up with.” Gelman & Loken (2013) “The Garden of Forking Paths”
(“Data-dredging”)
True for all large data analyses: genomics, metabolomics,
proteomics, health screening, finding terrorists, etc.
Xia et al. (2012) Metabolomics 9:280-299 doi:10.1007/s11306-012-0482-9
Broadhurst & Kell (2006) Metabolomics 2:171-196 doi:10.1007/s11306-006-0037-z
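A minimal sketch of one standard remedy, the Benjamini-Hochberg (B-H) step-up procedure, which controls the false discovery rate across many tests. The p-values are invented for illustration; a real analysis would normally rely on an established statistics library rather than this hand-rolled version.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans: True where the null is rejected at FDR alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            max_k = rank
    # Reject the nulls for the k smallest p-values
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from ten tests
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

naive = [p <= 0.05 for p in pvals]
corrected = benjamini_hochberg(pvals, alpha=0.05)
print("significant without correction:", sum(naive))
print("significant after B-H:", sum(corrected))
```

With these numbers, five tests pass the uncorrected 0.05 threshold but only two survive correction.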
Genome-Scale Predictions
• Imagine a paper describing a predictor for protein functional
class (e.g. pathogen effector)
• The paper reports sensitivity = 0.95, FPR = 0.01
• We run the predictor on 20,000 proteins in an organism
• It predicts 130 members of the class. How many of them are
likely to be true positives?
• We need a baseline level of that class (f_X) in the genome to
determine this.
• Estimate ≈ 200 in gene complement, so f_X = 0.01
• f_X = 0.01 ⟹ P(class|+ve) = 0.490 ≈ 0.5: 65 TP
Pritchard & Broadhurst (2014) Meth. Mol. Biol. 1127:53-64 doi:10.1007/978-1-62703-986-4_4
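The calculation behind that answer is Bayes' theorem applied to the reported sensitivity, false positive rate and the assumed base rate. A short sketch using the numbers from the slide (the function name is illustrative):

```python
def positive_predictive_value(sensitivity, fpr, base_rate):
    """P(class | positive prediction) via Bayes' theorem."""
    true_pos = sensitivity * base_rate
    false_pos = fpr * (1.0 - base_rate)
    return true_pos / (true_pos + false_pos)


ppv = positive_predictive_value(sensitivity=0.95, fpr=0.01, base_rate=0.01)
predicted = 130
expected_true_positives = ppv * predicted

print(f"P(class|+ve) = {ppv:.3f}")
print(f"Expected true positives among {predicted} predictions: "
      f"{expected_true_positives:.0f}")
```

Using the unrounded probability gives about 64 expected true positives; rounding to 0.5 gives the 65 quoted above. Either way, roughly half of the predictions are expected to be wrong despite the impressive headline figures.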
Baserate Fallacy
http://bit.ly/1EFbzCI
http://armchairbiology.blogspot.co.uk/2014/07/the-baserate-fallacy-revisited.html
A Literature Example
Reported sensitivity ≈ 0.71, FPR ≈ 0.15
Arnold et al. (2009) PLoS Pathog. 5:e1000376 doi:10.1371/journal.ppat.1000376
Big Data: New Problems
• Lots of high throughput experiments, and large datasets
(but even more small datasets)
• Historically ill-formed data (sequences in Word documents,
BLAST results pasted into notebooks).
• How do we connect all this data in a productive way?
This section influenced heavily by C. Titus Brown and Philip Bourne
Big Data: New Problems
• Data management. Too often:
“Goodbye to the student is goodbye to the data”
• Persistence of data resources (link rot, database entropy)
http://www.phdcomics.com/comics/archive.php?comicid=382
Big Data: New Problems
• How reproducible are computational results?
• Differences in software/data versions prevent exact reproduction:
approximately 280h to reproduce one paper, in the same lab!
Garijo et al. (2013) PLoS One 8:e80278 doi:10.1371/journal.pone.0080278
http://www.slideshare.net/pebourne/sib0114
Big Data: New Problems
Maybe we can get away with all of this in a traditional model of
science publishing. . .
http://www.slideshare.net/c.titus.brown/2015-baltiandbioinformatics
Big Data: New Problems
. . .but lots of biological data doesn’t make sense except in the light
of other biological data.
http://www.slideshare.net/c.titus.brown/2015-baltiandbioinformatics
Big Data: New Solutions
Everyone could be better off with collaboration and data sharing.
What is winning: career progression, or feeding people?
(still competing, but on analysis and insight, not on who holds what data. . .)
http://www.slideshare.net/c.titus.brown/2015-baltiandbioinformatics
Big Data: New Solutions
Data quality ≈ data trust:
• Sustainable: storage, archiving, maintenance
• Findable: “where is the dataset?”, “is it available?”
• Queryable: “is X in the dataset?”
• Analysable: metadata, annotation
http://www.slideshare.net/pebourne/sib0114
Big Data: New Solutions
Interoperable digital assets: datasets, software, lab books, etc.
• Uniquely identified (DOI, PMID, etc.)
• Provenance (version and access control)
• Open standards - what data to keep, how to organise it:
MINSEQE (sequencing), MIAME (microarray), MIASE (simulation), MIAPE
(proteomics), MIARE (RNAi), SBML, GFF3, SAM/BAM/CRAM, etc.
• Sustainable infrastructure for biological information
(ELIXIR, “The Commons” [US], RDF, Open Data)
http://www.slideshare.net/pebourne/sib0114
https://pebourne.wordpress.com/2014/10/07/the-commons/
Big Data: New Solutions
Too much software is difficult to use for experts, or unusable for
non-experts.
Veretnik et al. (2008) PLoS Comp. Biol. doi:10.1371/journal.pcbi.1000136
Big Data: New Solutions
Workflows, pipelines, and service integrative frameworks
Cock et al. (2014) Methods Mol. Biol. 1127:3-15 doi:10.1007/978-1-62703-986-4_1
Cock et al. (2013) PeerJ 1:e167 doi:10.7717/peerj.167
http://galaxy-community.org.uk/
Big Data: New Solutions
Sometimes new software is needed.
Writing good software is difficult, and expensive.
http://www.theregister.co.uk/2015/01/22/us_military_finds_f35_software_is_a_buggy_mess/
Big Data: New Solutions
Not enough software engineers to go round: train what we have.
Programming literacy, computational thinking: versioned, readable,
maintainable code.
http://www.software.ac.uk/
http://software-carpentry.org/
http://datacarpentry.org/
Table of Contents
Introduction
Why Genomics?
2003-Now
Implications
Where Next?
Conclusions
Cheap Sequencing In The Field
Diagnostics and epidemic tracking by sequencing
Global Microbial Identifier (GMI) http://www.globalmicrobialidentifier.org:
Global system of databases for microbial/disease identification and diagnostics.
Quick et al. (2014) BMJ Open 4:e006278 doi:10.1136/bmjopen-2014-006278
Sequencing In The Field
Live prediction for epidemiology?
(Peter Skelsey, JHI)
Sequence Isn’t Everything
Organisms are dynamic, and multi-scale
• Context: epigenetics, tissue differentiation, mesoscale systems,
symbiosis, etc.
• Phenotypic plasticity: responses to environment - stress,
temperature, etc.
The Phytobiome
Phytobiome: the plant, and its associated microbial community
• American Phytopathological Society “Phytobiomes Initiative”
• “a complete systems approach that spans foundational to applied
science focused on downstream application”
• We are not at war with all microbes. . .
https://www.apsnet.org/members/outreach/ppb/phytobiomes
Genomes Are Parts Lists
We know (some of) the bits that make up the machinery. . .
Flux Balance Analysis
Flux Balance Analysis: constraint-based static representation of
metabolism (RNA/ChIP-seq adds dynamics to models)
• Set upper, lower bounds to reaction rate, define objective phenotype
(biomass, target flux profile)
• in silico knockouts; viable states; nutrient usage
• A basis for synthetic biology and engineering
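A toy illustration of the FBA recipe above: impose steady state on a stoichiometric matrix, bound each flux, and maximise a biomass flux by linear programming. The four-reaction network and its bounds are hypothetical and bear no relation to any real organism; `scipy.optimize.linprog` is used here simply as a generic LP solver, whereas genome-scale work would normally use a dedicated constraint-based modelling package.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical network:
#   R1: uptake -> A      R2: A -> B      R3: A -> C      R4: B + C -> biomass
# Rows are metabolites (A, B, C); columns are reactions (R1..R4).
S = np.array([
    [1, -1, -1,  0],   # A: produced by R1, consumed by R2 and R3
    [0,  1,  0, -1],   # B: produced by R2, consumed by R4
    [0,  0,  1, -1],   # C: produced by R3, consumed by R4
])

# Lower and upper bounds on each reaction flux (arbitrary units)
wild_type_bounds = [(0, 10), (0, 1000), (0, 4), (0, 1000)]


def optimal_fluxes(bounds):
    """Maximise biomass flux (R4) subject to S.v = 0 and the flux bounds."""
    objective = [0, 0, 0, -1]   # linprog minimises, so negate the biomass flux
    result = linprog(objective, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
    return result.x


wild_type = optimal_fluxes(wild_type_bounds)

# In silico knockout: force the flux through R3 to zero
knockout_bounds = list(wild_type_bounds)
knockout_bounds[2] = (0, 0)
knockout = optimal_fluxes(knockout_bounds)

print("wild-type biomass flux:", round(float(wild_type[3]), 3))
print("R3 knockout biomass flux:", round(float(knockout[3]), 3))
```

In this toy network biomass needs both B and C, so growth is limited by the R3 capacity and knocking out R3 abolishes it.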
Flux Balance Analysis
Dickeya: 29 × FBA, host range ≈ nutrient-dependent growth
also transposon mutant libraries
(w/ Sonia Humphris, Ian Toth, JHI)
Plant-Microbe Interactions Are Systems
Components, interactions, dynamics etc. = systems biology
Interaction creates a third system from host and microbe
[Figure: model of the host-microbe interaction. Species: microbe (bulk, local), PAMP, PRR/PRR*, effector (external, internalised), R protein/R protein*, callose, cell wall. Processes: microbe approaches cell; microbe leaves cell/is destroyed; microbe produces PAMP; microbe produces effector; PAMP binding activates PRR; effector translocation; effector binding activates R protein; callose production; loss of callose, effector and PAMP. Modifiers: callose and PRR* (PTI), R protein* (ETI), effector action.]
[Figure: Pathogen, Callose timecourses by host type. Panels: No Response, PTI, PTI+ETS, PTI+ETS+ETI. x-axis: Time (0-200); y-axis: arbitrary units (0-1); variables: Callose, Pathogen.]
Pritchard & Birch (2014) Mol. Plant. Pathol. 15:865-870 doi:10.1111/mpp.12210
Pritchard & Birch (2011) Plant Sci. 180:584-603 doi:10.1016/j.plantsci.2010.12.008
Integrate Models and Data
Integration of models and datasets still a challenge
• Models at different scales
• Kinetic, metabolomic, proteomic, transcriptomic, genomic
datasets
Hartmann & Schreiber (2014) Front. Bioeng. Biotechnol. 8:226-244 doi:10.3389/fbioe.2014.00091
Types of Model
• Combining data: models at different scales.
• Information required/produced depends on model type.
• Size/detail trade-off
Hartmann & Schreiber (2014) Front. Bioeng. Biotechnol. 8:226-244 doi:10.3389/fbioe.2014.00091
Synthetic Biology
Engineering new response modes into crops.
Gurr & Rushton (2005) Trends Biotech. 23:283-290 doi:10.1016/j.tibtech.2005.04.009
Genome Editing
TALENs and CRISPR/Cas9s
http://www.lifetechnologies.com/
http://www.umassmed.edu/xuelab
Trait Stacking
For resistance and other beneficial traits (yield, nutrients, biofuels)
Vanholme et al. (2010) Trends Biotechnol. 28:543-547 doi:10.1016/j.tibtech.2010.07.008
Engineering Soil-Beneficial Microbes
Refactoring of Klebsiella nitrogen fixation:
Temme et al. (2012) Proc. Natl. Acad. Sci. USA 109:7085-7090 doi:10.1073/pnas.1120788109
Engineering New Biology
dCas9 logic circuits, integrating with host regulation
Nielsen & Voigt (2014) Mol. Syst. Biol. 10:763 doi:10.15252/msb.20145735
Table of Contents
Introduction
Why Genomics?
2003-Now
Implications
Where Next?
Conclusions
Data
Sequencing is ever cheaper and more productive:
• Very large datasets
• More information (with good planning)
• Challenges for data storage and sharing
• Challenges for analysis (“why” vs. “what”)
• Challenges for software, accessibility (workflows,
multidisciplinary training)
• Interdisciplinary collaboration and data integration will
be essential
Systems/Synthetics
A parts list only gets us so far:
• Cells are dynamic biophysical systems
• Organisms are dynamic cellular systems
• ‘Real’ plant systems include the phytobiome
• Systems biology essential to understand plant-microbe
interactions
• Synthetic biology promises to be a powerful tool to improve
plant health, nutrition, etc.
• BUT: ethical issues around deployment of synthetic systems
Conclusions
Acknowledgements
James Hutton Institute: Paul Birch, Emma Campbell, Peter Cock, Ingo Hein, Nicola Holden, Sonia Humphris, Florian Jupe, Ian Toth
NUI Galway: Florence Abram, Fiona Brennan
University of Aberdeen: Ken Forbes, Norval Strachan
University of Alberta: David Broadhurst
SASA: Vincent Mulholland, Gerry Saddler
Fera: Valerie Bertrand, John Elphinstone, Rachel Glover, Neil Parkinson
University of Münster: Martina Bielaszewska, Helge Karch
University of Salford: Natalie Ferry, Ryan Joynson
And many others!

Microbial Agrogenomics 4/2/2015, UK-MX Workshop

  • 1.
    Microbial Agrogenomics Where canit lead us? Leighton Pritchard Information and Computational Sciences The James Hutton Institute
  • 2.
    Acceptable Use Policy IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions Recording of this talk, taking photos, discussing the content using email, Twitter, blogs, etc. is permitted (and encouraged), providing distraction to others during the presentation is minimised. These slides will be available on SlideShare.
  • 3.
    Table of Contents IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions
  • 4.
    The James HuttonInstitute Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions
  • 5.
    Centres of Expertise IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions http://www.hutton.ac.uk • Dundee Effector Consortium (DEC, with University of Dundee) [link] • Centre for Research on Potato and Other Solanaceous Plants (CRPS) [link] • Centre for Human and Animal Pathogens in the Environment (HAP-E) [link]
  • 6.
    Plant-Pathogen Interactions Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions Pathogens of barley (e.g. Rhynchosporium commune), and soft fruit (e.g. Raspberry Leaf Blotch Virus (RLBV))
  • 7.
    Plant-Pathogen Interactions Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions Potato pathogens, pests, and vectors. • soft-rot bacteria (Dickeya, Pectobacterium, Erwinia) • blight (Phytophthora infestans) • Potato Cyst Nematode (PCN) (Globodera) • aphids (Myzus persicae)
  • 8.
    Issue 1: FoodSecurity Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions • Economic cost and burden of crop disease • P. infestans: e1bn Europe; $4bn global • Societal impact (human health, commodity prices; farming) • Emerging pathogens (JIT supply chain; climate change) • Plant-associated human pathogens • Food fraud
  • 9.
    Issue 2: EnvironmentalSustainability Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions • Pesticide minimisation and withdrawal • Durable resistance, soil-beneficial microbes, plant growth/nutritional enhancement • Traditional breeding, GM, or engineering? • Soils: rhizosphere interactions/soil diversity • Farming practices (water run-off, rotation, equipment-cleaning - EU sulphuric acid ban)
  • 10.
    Table of Contents IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions
  • 11.
    What Have GenomesEver Done For Us? Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions • Catalogue features (genes, regulatory elements, etc.) in an organism.
  • 12.
    Plant-microbe interactions Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions Gene products at the host-microbe interface Dodds & Rathjen (2010) Nat. Rev. Genet. 11:539-548 doi:10.1038/nrg2812
  • 13.
    Plant-Nematode Interactions Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions RNA-seq identification of 27 putative nematode effectors: Small proteins, expressed in gland cells during feeding stage only. Cotton et al. (2014) Genome Biol. 15:R43 doi:10.1186/gb-2014-15-3-r43
  • 14.
    Plant Defence Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions Prediction of NB-LRR genes (sequence capture). Jupe et al. (2013) Plant J. 76:530-544 doi:10.1111/tpj.12307 Jupe et al. (2012) BMC Genomics 13:75 doi:10.1186/1471-2164-13-75
  • 15.
    What Have GenomesEver Done For Us? Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions • Catalogue features (genes, regulatory elements, etc.) in an organism. • If we have multiple genomes. . . • What common features associate with phenotype or environment?
  • 16.
    Plant-microbe interactions Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions GWAS/QTLs/genotyping for plant breeding http://ics.hutton.ac.uk/flapjack/ Milne et al. (2010) Bioinformatics 26:3133-3134 doi:10.1093/bioinformatics/btq580
  • 17.
    Plant-microbe interactions Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions Structural changes to genomes: repeat-driven expansion duplication, mutation, recombination, epigenetic control of effectors . . . Haas et al. (2009) Nature 461:393-398 doi:10.1038/nature08358
  • 18.
    Plant-microbe interactions Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions Structural changes to genomes: genome reductions Buchnera, Serratia symbiotica - aphid symbionts, ‘random’ inactivation Gil et al. (2002) Proc. Natl. Acad. Sci. USA 99:4454-4458 doi:10.1073/pnas.062067299 Burke & Moran (2011) Genome Biol. Evol. 99:4454-4458 doi:10.1093/gbe/evr002
  • 19.
    Plant-microbe interactions Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions Lateral gene transfer (virulence-associated genes) Bell et al. (2004) Proc. Natl. Acad. Sci. 101:11105-11110 doi:10.1073/pnas.0402424101
  • 20.
    Plant-microbe interactions Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions Closely-related bacteria, different host/environmental preference. Pectobacterium atrosepticum Holden et al. (2009) FEMS Micro. Rev. 33:689-703 doi:10.1111/j.1574-6976.2008.00153.x Toth et al. (2006) Annu. Rev. Phytopath. 44:305-336 doi:10.1146/annurev.phyto.44.070505.143444
  • 21.
    What Have GenomesEver Done For Us? Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions • Catalogue features (genes, regulatory elements, etc.) in an organism. • If we have multiple genomes. . . • What common features associate with phenotype or environment? • Epidemiology: spread and transmission
  • 22.
    Historical Origins Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions Retracing 19th-century P.infestans pandemics Yoshida et al. (2014) PLoS Pathog. 10:e1004028 doi:10.1371/journal.ppat.1004028
  • 23.
    International Emergence Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions Distribution of Dickeya spp. in Europe • D.dianthicola; ◦ D.solani; Dickeya spp. on potato Toth et al. (2011) Plant Pathol. 60:385-399 doi:10.1111/j.1365-3059.2011.02427.x
  • 24.
    Host Jumps Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions Movement of Dickeya from ornamental to crop plants Parkinson et al. (2015) Eur. J. Plant Pathol. 141:63-70 doi:10.1007/s10658-014-0523-5
  • 25.
    Diagnostic Tools Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions Quarantine and legislation require precise identification. Genomes enable rapid, robust RT-PCR diagnostics. targets V IV III II I genomes I II III IV V https://github.com/widdowquinn/find differential primers Pritchard et al. (2013) Plant Pathol. 62:587-596 doi:10.1111/j.1365-3059.2012.02678.x Pritchard et al. (2012) PLoS One 7:e34498 doi:10.1371/journal.pone.0034498
  • 26.
    Table of Contents IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions
  • 27.
    2003: E. carotovorasubsp. atroseptica Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions • £250k collaboration between SCRI, University of Cambridge, WT Sanger Institute • Single isolate: E. carotovora subsp. atroseptica SCRI1043 • First sequenced enterobacterial plant pathogen (32 authors!) • Annotation: 6 people, for 6 months ≈ three person-years • Result: single, complete 5Mbp circular chromosome (10.2X) Bell et al. (2004) Proc. Natl. Acad. Sci. USA 101: 30:11105-11110. doi:10.1073/pnas.0402424101
  • 28.
    2003: E. carotovorasubsp. atroseptica Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Compared against all 142 then-available bacterial genomes Bell et al. (2004) Proc. Natl. Acad. Sci. USA 101: 30:11105-11110. doi:10.1073/pnas.0402424101
  • 29.
    2013: Dickeya spp. IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions Sequenced and annotated 25 isolates of Dickeya over two years • Multiple sequencing methods: 454, Illumina (SE, PE) • Automated annotation, limited manual correction • Results: 12-237 fragments: 4.2-5.1Mbp/genome (6-84X) Pritchard et al. (2013) Genome Ann. 1 (4) doi:10.1128/genomeA.00087-12 Pritchard et al. (2013) Genome Ann. 1 (6) doi:10.1128/genomeA.00978-13
  • 30.
    2013: Dickeya spp. IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions Whole genome-based species definitions: sp. nov. D. solani van der Wolf et al. (2014) Int. J. Syst. Evol. Micr. 64:768-774 doi:10.1099/ijs.0.052944-0
  • 31.
    2013: Dickeya spp. IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions Differences in metabolic capacity (but ≈ 20% orphan EC activities)
  • 32.
    2014: E. coli IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions Sequenced and annotated ≈ 190 isolates of E. coli All bacteria environmental, sampled from lysimeters • Illumina PE sequencing, cost ≈£11k • Automated annotation: PROKKA (w/ Fiona Brennan, Florence Abram, NUI Galway)
  • 33.
    2014: E. coli IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions Whole genome-based subspecies classification Brunei20070942_contigs Muenster20063091_contigs Senftenberg20070885_contigs Lys142_contigs Lys175_contigs Lys130_contigs Lys170_contigs Lys126_contigs Lys167_contigs Lys176_contigs Lys169_contigs Lys50_contigs X5038_contigs Lys131_contigs Lys171_contigs Lys111_contigs Lys107_contigs Lys114_contigs Lys16_contigs Lys22_contigs Lys65_contigs Lys56_contigs Lys113_contigs Lys109_contigs Lys77_contigs Lys102_contigs Lys100_contigs Lys92_contigs Lys94_contigs Lys80_contigs Lys64_contigs Lys82_contigs AW3_contigs X5008_contigs AW4_contigs AW1_contigs Lys118_contigs Lys138_contigs Lys121_contigs Lys122_contigs Lys177_contigs Lys155_contigs Lys165_contigs Lys163_contigs Lys160_contigs Lys161_contigs Lys172_contigs Lys144_contigs Lys135_contigs Lys146_contigs Lys123_contigs Lys124_contigs Lys150_contigs Lys140_contigs Lys157_contigs Lys173_contigs Lys156_contigs Lys158_contigs Lys159_contigs Lys162_contigs Lys5_contigs X5084_contigs X5042_contigs Lys110_contigs Lys136_contigs Lys54_contigs Lys1_contigs Lys6_contigs Lys112_contigs X5012_contigs Lys30_contigs Lys25_contigs Lys43_contigs Lys37_contigs Lys40_contigs Lys151_contigs Lys31_contigs Lys27_contigs Lys42_contigs Lys51_contigs Lys33_contigs Lys46_contigs Lys38_contigs Lys89_contigs Lys23_contigs Lys115_contigs Lys108_contigs Lys104_contigs DSM10973_contigs Lys125_contigs Lys105_contigs Lys17_contigs Lys128_contigs Lys66_contigs Lys73_contigs Lys15_contigs Lys91_contigs DSM8698_contigs DSM8695_contigs Lys74_contigs Lys61_contigs Lys9_contigs Lys153_contigs Lys84_contigs Lys93_contigs Lys72_contigs Lys62_contigs Lys21_contigs Lys59_contigs Lys63_contigs Lys83_contigs Lys19_contigs Lys4_contigs AW13_contigs Lys45_contigs Lys28_contigs Lys53_contigs Lys52_contigs Lys34_contigs Lys36_contigs Lys24_contigs Lys35_contigs Lys68_contigs Lys106_contigs Lys88_contigs Lys97_contigs Lys76_contigs 
Lys134_contigs Lys58_contigs Lys71_contigs Lys81_contigs Lys129_contigs Lys120_contigs Lys145_contigs Lys137_contigs Lys127_contigs Lys152_contigs Lys101_contigs Lys98_contigs Lys70_contigs Lys133_contigs Lys47_contigs Lys75_contigs Lys48_contigs Lys148_contigs Lys139_contigs Lys141_contigs Lys164_contigs Lys149_contigs Lys147_contigs Lys60_contigs Lys79_contigs Lys168_contigs Lys18_contigs Lys87_contigs Lys96_contigs Lys7_contigs Lys154_contigs Lys117_contigs Lys119_contigs Lys178_contigs Lys116_contigs Lys86_contigs Lys90_contigs Lys41_contigs Lys13_contigs Lys85_contigs X5002_contigs Lys12_contigs Lys39_contigs Lys14_contigs Lys55_contigs Lys29_contigs Lys99_contigs X5035_contigs Lys8_contigs Lys3_contigs X5034_contigs X5088_contigs Lys20_contigs Lys78_contigs Lys11_contigs Brunei20070942_contigs Muenster20063091_contigs Senftenberg20070885_contigs Lys142_contigs Lys175_contigs Lys130_contigs Lys170_contigs Lys126_contigs Lys167_contigs Lys176_contigs Lys169_contigs Lys50_contigs 5038_contigs Lys131_contigs Lys171_contigs Lys111_contigs Lys107_contigs Lys114_contigs Lys16_contigs Lys22_contigs Lys65_contigs Lys56_contigs Lys113_contigs Lys109_contigs Lys77_contigs Lys102_contigs Lys100_contigs Lys92_contigs Lys94_contigs Lys80_contigs Lys64_contigs Lys82_contigs AW3_contigs 5008_contigs AW4_contigs AW1_contigs Lys118_contigs Lys138_contigs Lys121_contigs Lys122_contigs Lys177_contigs Lys155_contigs Lys165_contigs Lys163_contigs Lys160_contigs Lys161_contigs Lys172_contigs Lys144_contigs Lys135_contigs Lys146_contigs Lys123_contigs Lys124_contigs Lys150_contigs Lys140_contigs Lys157_contigs Lys173_contigs Lys156_contigs Lys158_contigs Lys159_contigs Lys162_contigs Lys5_contigs 5084_contigs 5042_contigs Lys110_contigs Lys136_contigs Lys54_contigs Lys1_contigs Lys6_contigs Lys112_contigs 5012_contigs Lys30_contigs Lys25_contigs Lys43_contigs Lys37_contigs Lys40_contigs Lys151_contigs Lys31_contigs Lys27_contigs Lys42_contigs Lys51_contigs Lys33_contigs 
Lys46_contigs Lys38_contigs Lys89_contigs Lys23_contigs Lys115_contigs Lys108_contigs Lys104_contigs DSM10973_contigs Lys125_contigs Lys105_contigs Lys17_contigs Lys128_contigs Lys66_contigs Lys73_contigs Lys15_contigs Lys91_contigs DSM8698_contigs DSM8695_contigs Lys74_contigs Lys61_contigs Lys9_contigs Lys153_contigs Lys84_contigs Lys93_contigs Lys72_contigs Lys62_contigs Lys21_contigs Lys59_contigs Lys63_contigs Lys83_contigs Lys19_contigs Lys4_contigs AW13_contigs Lys45_contigs Lys28_contigs Lys53_contigs Lys52_contigs Lys34_contigs Lys36_contigs Lys24_contigs Lys35_contigs Lys68_contigs Lys106_contigs Lys88_contigs Lys97_contigs Lys76_contigs Lys134_contigs Lys58_contigs Lys71_contigs Lys81_contigs Lys129_contigs Lys120_contigs Lys145_contigs Lys137_contigs Lys127_contigs Lys152_contigs Lys101_contigs Lys98_contigs Lys70_contigs Lys133_contigs Lys47_contigs Lys75_contigs Lys48_contigs Lys148_contigs Lys139_contigs Lys141_contigs Lys164_contigs Lys149_contigs Lys147_contigs Lys60_contigs Lys79_contigs Lys168_contigs Lys18_contigs Lys87_contigs Lys96_contigs Lys7_contigs Lys154_contigs Lys117_contigs Lys119_contigs Lys178_contigs Lys116_contigs Lys86_contigs Lys90_contigs Lys41_contigs Lys13_contigs Lys85_contigs 5002_contigs Lys12_contigs Lys39_contigs Lys14_contigs Lys55_contigs Lys29_contigs Lys99_contigs 5035_contigs Lys8_contigs Lys3_contigs 5034_contigs 5088_contigs Lys20_contigs Lys78_contigs Lys11_contigs ANIm 0.9 0.92 0.94 0.96 0.98 Value 0100020003000400050006000 Color Key and Histogram Count A B1 B2 C D E F U X (w/ Fiona Brennan, Florence Abram, NUI Galway)
  • 34.
    2014: Campylobacter spp. IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions ≈1034 clinical, animal, food-associated Campylobacter isolates • Illumina PE sequencing, cost ≈£60k • Automated annotation: PRODIGAL (w/ Ken Forbes, Norval Strachan, University of Aberdeen)
  • 35.
    2014: Campylobacter spp. IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions • 15554 ‘gene families’ in 1034 isolates. • Calculation: 4e12 pairwise protein comparisons! (w/ Ken Forbes, Norval Strachan, University of Aberdeen)
  • 36.
    Table of Contents IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions
  • 37.
    So what’s changed? IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions Everything. • Cost: £250k → £60 per genome. Now cheaper to sequence than analyse a genome! Offload work from people to software. • Location: sequencing centre, to benchtop (Nanopore!) • Speed: sequencing run time can be less than a day • Data: massive volume increase
  • 38.
    Predicting the futureis hard. . . Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Su et al. attempted to do it, though: 10,000 prokaryotes in 2015 was an underestimate. http://sulab.org/2013/06/sequenced-genomes-per-year/
  • 39.
    So what’s changed? IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions Everything. • Cost: £250k → £60 per genome. • Location: sequencing centre, to benchtop (Nanopore!) • Speed: sequencing run time can be less than a day • Data: massive volume increase More data ≈ better, but also more challenging. • Software: more (= better. . .) software for more things • New experiments: genomes, exomes, variant calling, methylated sequences, STARR-seq, . . . • New applications: diagnostics, epidemic tracking, metagenomics, . . .
  • 40.
    Sequence first. .. ask questions, later Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions • “Why?” has sometimes been replaced by “What?” http://dilbert.com/strip/2000-01-03 “The thesis is not hypothesis driven. Add a hypothesis and refer to it in subsequent chapters.”
  • 41.
    More isn’t alwaysbetter Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Deeper sequencing (more reads) = more information or better assembly. 60-80X coverage the ‘sweet spot’ for bacterial genomes. More reps more reads! Conway & Bromage (2011) Bioinformatics 27:479-486 doi:10.1093/bioinformatics/btq697
  • 42.
    Are database annotationsreliable? Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Automated annotation is essential The Critical Assessment of Function Annotation (CAFA) project. Radivojac et al. (2013) Nat. Meth. 10:221-227 doi:10.1038/nmeth.2340
  • 43.
    Do biased databaseannotations matter? Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Experimental annotations of proteins are incomplete. Is that important? Tested by simulation, and following databases for three years. • Yes. It matters. • Current large scale annotations are meaningful and almost surprisingly reliable. • The nature and level of data incompleteness, and type of classification model have an effect. • “Low precision, high recall” (i.e. less discriminating) tools most significantly affected. Molecular function prediction is usually more reliable than biological process prediction Jiang et al. (2014) Bioinformatics 30:i609-i616 doi:10.1093/bioinformatics/btu472 Cozzetto et al. (2013) BMC Bioinf. 14:S3-S1 doi:10.1186/1471-2105-14-S3-S1
  • 44.
    CAFA results Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions The Critical Assessment of Function Annotation (CAFA) 2013 results. (F-measure combines precision and recall) • You can do better than BLAST. • Best-performing methods do comparably well. • Best methods used evolutionary relationships, structure, and expression data. • Machine Learning methods work best. Radivojac et al. (2013) Nat. Meth. 10:221-227 doi:10.1038/nmeth.2340
  • 45.
    More Isn’t AlwaysBetter Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Statistical inference on large datasets requires extra care. Hypothesis tests may incorrectly reject null hypotheses (B-H)
  • 46.
    More Isn’t AlwaysBetter Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions • More tests → random effect seems ’real’ • May be considering a large set of inferences simultaneously (and yet not notice!):“p-hacking”, “Researcher Degrees of Freedom” “good scientists are skilled at looking hard enough and subsequently coming up with good stories (plausible even to themselves, as well as to their colleagues and peer reviewers) to back up any statistically-significant comparisons they happen to come up with.” Gelman & Loken (2013) ”The Garden of Forking Paths” (“Data-dredging”) True for all large data analyses: genomics, metabolomics, proteomics, health screening, finding terrorists, etc. Xia et al. (2012) Metabolomics 9:280-299 doi:10.1007/s11306-012-0482-9 Broadhurst & Kell (2006) Metabolomics 2:171-196 doi:10.1007/s11306-006-0037-z
  • 47.
    Genome-Scale Predictions Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions • Imagine a paper describing a predictor for protein functional class (e.g. pathogen effector) • The paper reports sensitivity = 0.95, FPR = 0.01 • We run the predictor on 20,000 proteins in an organism • It predicts 130 members of the class. How many of them are likely to be true positives? Pritchard & Broadhurst (2014) Meth. Mol. Biol. 9:280-299 doi:10.1007/978-1-62703-986-4 4
  • 48.
    Genome-Scale Predictions Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions • Imagine a paper describing a predictor for protein functional class (e.g. pathogen effector) • The paper reports sensitivity = 0.95, FPR = 0.01 • We run the predictor on 20,000 proteins in an organism • It predicts 130 members of the class. How many of them are likely to be true positives? • We need a baseline level of that class (fX ) in the genome to determine this. • Estimate ≈ 200 in gene complement, so fX = 0.01 • fX = 0.01 =⇒ P(class|+ve) = 0.490 ≈ 0.5: 65 TP Pritchard & Broadhurst (2014) Meth. Mol. Biol. 9:280-299 doi:10.1007/978-1-62703-986-4 4
  • 49.
    Baserate Fallacy Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions http://bit.ly/1EFbzCI http://armchairbiology.blogspot.co.uk/2014/07/the-baserate-fallacy-revisited.html
  • 50.
    A Literature Example IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions Reported sensitivity ≈ 0.71, FPR ≈ 0.15 Arnold et al. (2009) PLoS Pathog. 5:e1000376 doi:10.1371/journal.ppat.1000376
  • 51.
    Big Data: NewProblems Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions • Lots of high throughput experiments, and large datasets (but even more small datasets) • Historically ill-formed data (sequences in Word documents, BLAST results pasted into notebooks). • How do we connect all this data in a productive way? This section influenced heavily by C. Titus Brown and Philip Bourne
  • 52.
    Big Data: NewProblems Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions • Data management. Too often: “Goodbye to the student is goodbye to the data” • Persistence of data resources (link rot, database entropy) http://www.phdcomics.com/comics/archive.php?comicid=382
  • 53.
    Big Data: NewProblems Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions • How reproducible are computational results? • Software/data versions prevent exact reproduction: 280h to reproduce one paper approximately - in the same lab! Garijo et al. (2013) PLoS One doi:10.1371/journal.pone.0080278 http://www.slideshare.net/pebourne/sib0114
  • 54.
    Big Data: NewProblems Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Maybe we can get away with all of this in a traditional model of science publishing. . . http://www.slideshare.net/c.titus.brown/2015-baltiandbioinformatics
  • 55.
    Big Data: NewProblems Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions . . .but lots of biological data doesn’t make sense except in the light of other biological data. http://www.slideshare.net/c.titus.brown/2015-baltiandbioinformatics
  • 56.
    Big Data: NewSolutions Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Everyone could be better off with collaboration and data sharing. What is winning: career progression, or feeding people? (still competing, but on analysis and insight, not on who holds what data. . .) http://www.slideshare.net/c.titus.brown/2015-baltiandbioinformatics
  • 57.
    Big Data: NewSolutions Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Data quality ≈ data trust: • Sustainable: storage, archiving, maintenance • Findable: “where is the dataset?”, “is it available?” • Queryable: “is X in the dataset?” • Analysable: metadata, annotation http://www.slideshare.net/pebourne/sib0114
  • 58.
    Big Data: NewSolutions Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Interoperable digital assets: datasets, software, lab books, etc. • Uniquely identified (DOI, PMID, etc.) • Provenance (version and access control) • Open standards - what data to keep, how to organise it: MINSEQE (sequencing), MIAME (microarray), MIASE (simulation), MIAPE (proteomics), MIARE (RNAi), SBML, GFF3, SAM/BAM/CRAM, etc. • Sustainable infrastructure for biological information (ELIXIR, “The Commons” [US], RDF, Open Data) http://www.slideshare.net/pebourne/sib0114 https://pebourne.wordpress.com/2014/10/07/the-commons/
  • 59.
    Big Data: NewSolutions Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Too much software is difficult to use for experts, or unusable for non-experts. Veretnik et al. (2008) PLoS Comp. Biol. doi:10.1371/journal.pcbi.1000136
  • 60.
    Big Data: NewSolutions Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Workflows, pipelines, and service integrative frameworks Cock et al. (2014) Methods Mol. Biol. 1127:3-15 doi:10.1007/978-1-62703-986-4 1 Cock et al. (2013) PeerJ 1:e167 doi:10.7717/peerj.167 http://galaxy-community.org.uk/
  • 61.
    Big Data: NewSolutions Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Sometimes new software is needed. Writing good software is difficult, and expensive. http://www.theregister.co.uk/2015/01/22/us military finds f35 software is a buggy mess/
  • 62.
    Big Data: NewSolutions Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Not enough software engineers to go round: train what we have. Programming literacy, computational thinking: versioned, readable, maintainable code. http://www.software.ac.uk/ http://software-carpentry.org/ http://datacarpentry.org/
  • 63.
    Table of Contents IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions
  • 64.
    Cheap Sequencing InThe Field Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Diagnostics and epidemic tracking by sequencing Global Microbial Identifier (GMI) http://www.globalmicrobialidentifier.org: Global system of databases for microbial/disease identification and diagnostics. Quick et al. (2014) BMJ Open 11:e006278 doi:10.1136/bmjopen-2014-006278
  • 65.
    Sequencing In TheField Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Live prediction for epidemiology? (Peter Skelsey, JHI)
  • 66.
    Sequence Isn’t Everything IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions Organisms are dynamic, and multi-scale • Context: epigenetics, tissue differentiation, mesoscale systems, symbiosis, etc. • Phenotypic plasticity: responses to environment - stress, temperature, etc.
  • 67.
    The Phytobiome Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions Phytobiome: the plant, and its associated microbial community • American Phytopathological Society “Phytobiomes Intitative” • “a complete systems approach that spans foundational to applied science focused on downstream application” • We are not at war with all microbes. . . https://www.apsnet.org/members/outreach/ppb/phytobiomes
  • 68.
    Genomes Are PartsLists Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions We know (some of) the bits that make up the machinery. . .
  • 69.
    Flux Balance Analysis IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions Flux Balance Analysis: constraint-based static representation of metabolism (RNA/ChIP-seq adds dynamics to models) • Set upper, lower bounds to reaction rate, define objective phenotype (biomass, target flux profile) • in silico knockouts; viable states; nutrient usage • A basis for synthetic biology and engineering
  • 70.
    Flux Balance Analysis IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions Dickeya: 29 × FBA, host range ≈ nutrient-dependent growth also transposon mutant libraries (w/ Sonia Humphris, Ian Toth, JHI)
  • 71.
    Plant-Microbe Interactions AreSystems Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Components, interactions, dynamics etc. = systems biology Interaction creates a third system from host and microbe Pritchard & Birch (2014) Mol. Plant. Pathol. 15:865-870 doi:10.1111/mpp.12210 Pritchard & Birch (2011) Plant Sci. 180:584-603 doi:10.1016/j.plantsci.2010.12.008
  • 72.
    Plant-Microbe Interactions AreSystems Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Components, interactions, dynamics etc. = systems biology Interaction creates a third system from host and microbe microbe (bulk) microbe (local) PRR PRR* R protein R protein* ø øø effector translocation effector (internalised) PAMP ø ø cell wall microbe approaches cell microbe leaves cell/ is destroyed microbe produces PAMP microbe produces effector PAMP binding activates PRR effector binding activates R protein callose production callose loss effector loss effector loss PAMP loss enhanced by callose (PTI) and R protein* (ETI) enhanced by PRR* (PTI) slowed by callose (PTI) callose effector (external) enhanced by effector action No Response PTI PTI+ETS PTI+ETS+ETI 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 0 50 100 150 200 0 50 100 150 200 Time Arbitraryunits variable Callose Pathogen Pathogen, Callose timecourses by host type Pritchard & Birch (2014) Mol. Plant. Pathol. 15:865-870 doi:10.1111/mpp.12210 Pritchard & Birch (2011) Plant Sci. 180:584-603 doi:10.1016/j.plantsci.2010.12.008
  • 73.
    Integrate Models andData Introduction Why Genomics? 2003-Now Implications Where Next? Conclusions Integration of models and datasets still a challenge • Models at different scales • Kinetic, metabolomic, proteomic, transcriptomic, genomic datasets Hartmann & Schreiber (2014) Front. Bioeng. Biotechnol. 8:226-244 doi:10.3389/fbioe.2014.00091
  • 74.
    Types of Model IntroductionWhy Genomics? 2003-Now Implications Where Next? Conclusions • Combining data: models at different scales. • Information required/produced depends on model type. • Size/detail trade-off Hartmann & Schreiber (2014) Front. Bioeng. Biotechnol. 8:226-244 doi:10.3389/fbioe.2014.00091/abstract
  • 75.
    Synthetic Biology Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions Engineering new response modes into crops. Gurr & Rushton (2005) Trends Biotech. 23:283-290 doi:10.1016/j.tibtech.2005.04.009
  • 76.
    Genome Editing Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions TALENs and CRISPR/Cas9s http://www.lifetechnologies.com/ http://www.umassmed.edu/xuelab
  • 77.
    Trait Stacking Introduction WhyGenomics? 2003-Now Implications Where Next? Conclusions For resistance and other beneficial traits (yield, nutrients, biofuels) Vanholme et al. (2010) Trends Biotechnol. 28:543-547 doi:10.1016/j.tibtech.2010.07.008
  • 78.
    Engineering Soil-Beneficial Microbes
      Refactoring of Klebsiella nitrogen fixation:
      Temme et al. (2012) Proc. Natl. Acad. Sci. USA 109:7085-7090 doi:10.1073/pnas.1120788109
  • 79.
    Engineering New Biology
      dCas9 logic circuits, integrating with host regulation
      Nielsen & Voigt (2014) Mol. Syst. Biol. 10:763 doi:10.15252/msb.20145735
  • 80.
    Table of Contents
      Introduction
      Why Genomics?
      2003-Now
      Implications
      Where Next?
      Conclusions
  • 81.
    Data
      Sequencing is ever cheaper and more productive:
      • Very large datasets
      • More information (with good planning)
      • Challenges for data storage and sharing
      • Challenges for analysis (“why” vs. “what”)
      • Challenges for software, accessibility (workflows, multidisciplinary training)
      • Interdisciplinary collaboration and data integration will be essential
  • 82.
    Systems/Synthetics
      A parts list only gets us so far:
      • Cells are dynamic biophysical systems
      • Organisms are dynamic cellular systems
      • ‘Real’ plant systems include the phytobiome
      • Systems biology essential to understand plant-microbe interactions
      • Synthetic biology promises to be a powerful tool to improve plant health, nutrition, etc.
      • BUT: ethical issues around deployment of synthetic systems
  • 83.
    Conclusions
  • 92.
    Acknowledgements
      James Hutton Institute: Paul Birch, Emma Campbell, Peter Cock, Ingo Hein, Nicola Holden, Sonia Humphris, Florian Jupe, Ian Toth
      NUI Galway: Florence Abram, Fiona Brennan
      University of Aberdeen: Ken Forbes, Norval Strachan
      University of Alberta: David Broadhurst
      SASA: Vincent Mulholland, Gerry Saddler
      Fera: Valerie Bertrand, John Elphinstone, Rachel Glover, Neil Parkinson
      University of Münster: Martina Bielaszewska, Helge Karch
      University of Salford: Natalie Ferry, Ryan Joynson
      And many others!