@kcranstn
http://slideshare.net/kcranstn
Enabling science
with the tree of life
Karen Cranston
National Evolutionary
Synthesis Center (NESCent)
The tree of life provides
a means for organizing
and explaining
biodiversity data
Wiegmann et al. PNAS, 2011
What do we want from a Tree of Life?
❖ complete = contains all of biodiversity
❖ dynamic = continuously updated with new data
❖ available digitally = browse, query, download
Image: http://evolution.berkeley.edu
❖ Create a complete tree of life by synthesizing published phylogenetic data
❖ Provide tools for managing, synthesizing & sharing phylogenetic data
http://opentreeoflife.org
Synthetic science
❖ Novel methods & analysis tools
❖ Big data from existing data
Biodiversity Synthesis Center /
Encyclopedia of Life
National Evolutionary Synthesis Center
Challenges
❖ Incongruence: How do we detect and use conflict between trees?
❖ Availability: What data do we have to construct a tree of life?
❖ Synthesis: How do we combine data across the tree of life?
What can we learn from conflict
between trees?
aactgtcgcatgttgacg...
aattgtcg-atgttgacg...
aac-gtcgcatgtcgacg...
aac-gtcgcatgtcgacg...
aac-gtcgcatgtcgacg...
aactgtcgcatgtcgacg...
aactgtcgcatgtcgacg...
aactgtcgcatgtcgacg...
Phylogenetic
inference
Many
likely trees
Gene tree uncertainty
Single gene
alignment
Bayesian phylogenetic inference
Input: sequence data + evolutionary model
Output: list of sampled phylogenies
[Figure: bar chart of probability (y-axis, 0 to 0.18) for sampled trees 1 to 49 (x-axis)]
Number of times sampled ∝ probability
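The relationship between sampling frequency and probability can be sketched in a few lines. This is a minimal illustration, assuming each sampled phylogeny has already been reduced to a comparable topology string; the sample list and the function name `topology_posteriors` are hypothetical and not tied to any particular inference program.

```python
from collections import Counter

def topology_posteriors(sampled_topologies):
    """Estimate each topology's posterior probability as its sample frequency."""
    counts = Counter(sampled_topologies)
    total = sum(counts.values())
    # most_common() orders topologies from most to least frequently sampled
    return [(topo, n / total) for topo, n in counts.most_common()]

# Hypothetical MCMC output: each entry is the topology of one sampled tree.
samples = ["((A,B),C,D)"] * 6 + ["((A,C),B,D)"] * 3 + ["((A,D),B,C)"] * 1

for topo, prob in topology_posteriors(samples):
    print(f"{topo}\t{prob:.2f}")
```

In practice the sample holds thousands of trees, and two trees must be compared as topologies (ignoring branch lengths and child order) before counting.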
Is there a stable backbone among the trees?
What taxa have unstable placement?
Summarize with agreement subtrees
[Figure: two examples, each with four sampled trees on taxa 1–5 with probabilities 0.40, 0.25, 0.20 and 0.15. In the first, the subtrees on taxa 1, 3, 4, 5 all agree once taxon 2 is removed (Pr=1.00). In the second, one subtree places taxon 4 differently (Pr=0.85).]
Cranston, K.A. and B.H. Rannala. Summarizing a posterior distribution of phylogenies using
agreement subtrees. Systematic Biology 2007: 56(4), pp. 578-590.
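The idea behind the figure can be sketched as follows: remove an unstable taxon from every sampled tree, then add up the probabilities of the trees that now agree. This is a simplified illustration, not the method of the cited paper. It prunes one taxon chosen by the user rather than searching for the best agreement subtree, and it treats trees as rooted nested tuples. The tree shapes are hypothetical; only the four probabilities come from the slide.

```python
from collections import defaultdict

def prune(tree, taxon):
    """Drop `taxon` from a nested-tuple tree and collapse single-child nodes."""
    if not isinstance(tree, tuple):
        return None if tree == taxon else tree
    kept = [c for c in (prune(child, taxon) for child in tree) if c is not None]
    if not kept:
        return None
    return kept[0] if len(kept) == 1 else tuple(kept)

def canonical(tree):
    """Order-independent form, so (1, (3, 4)) and ((4, 3), 1) compare equal."""
    if not isinstance(tree, tuple):
        return tree
    return frozenset(canonical(child) for child in tree)

def pruned_support(samples, taxon):
    """Largest total probability of trees that agree once `taxon` is removed."""
    totals = defaultdict(float)
    for tree, prob in samples:
        totals[canonical(prune(tree, taxon))] += prob
    return max(totals.values())

# Hypothetical rooted trees on taxa 1-5; only the probabilities come from the slide.
stable = [
    (((1, 2), (3, (4, 5))), 0.40),
    ((1, (2, (3, (4, 5)))), 0.25),
    ((2, (1, (3, (4, 5)))), 0.20),
    ((1, ((2, 3), (4, 5))), 0.15),
]
# Same first three trees, but the last tree also moves taxon 4.
unstable = stable[:3] + [((1, ((2, 4), (3, 5))), 0.15)]

print(round(pruned_support(stable, 2), 2))    # all four trees agree without taxon 2
print(round(pruned_support(unstable, 2), 2))  # the 0.15 tree still disagrees
```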
Multiple sequence
alignments
Concatenate
Supermatrix
Species tree
Supertrees
Gene
duplication
Coalescent
Gene trees
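The "Concatenate" step that turns multiple sequence alignments into a supermatrix can be sketched as below. This is a minimal illustration under stated assumptions: alignments are held as plain dictionaries, taxa missing from a gene are padded with `?`, and the gene names, taxon names and sequences are hypothetical.

```python
def concatenate(alignments, missing="?"):
    """Join per-gene alignments into one supermatrix, padding absent taxa."""
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    matrix = {taxon: [] for taxon in taxa}
    for gene in sorted(alignments):
        aln = alignments[gene]
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences in {gene} are not aligned to equal length")
        width = lengths.pop()
        for taxon in taxa:
            # A taxon without this gene gets a run of missing-data symbols
            matrix[taxon].append(aln.get(taxon, missing * width))
    return {taxon: "".join(parts) for taxon, parts in matrix.items()}

# Hypothetical toy alignments: gene name -> {taxon: aligned sequence}.
genes = {
    "gene1": {"sp_A": "aactg", "sp_B": "aattg", "sp_C": "aac-g"},
    "gene2": {"sp_A": "tcgc", "sp_C": "tcga"},  # sp_B was not sequenced
}

for taxon, seq in concatenate(genes).items():
    print(taxon, seq)
```

Concatenation assumes all genes share one history. The gene tree routes in the diagram (supertrees, gene duplication, coalescent) instead analyze each alignment separately.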
Phylogenomics of rice (Oryza)
820,000 BAC-end
sequences for 9 diploid
Oryza species
1720 gene fragments
2.4 million nucleotides
Cranston, K.A., B. Hurwitz, M.J. Sanderson, D. Ware, R.A. Wing, L. Stein. Phylogenomic
analysis from deep BAC-end sequence libraries of rice. Systematic Botany, 35:3, 2010
What are the biological causes of gene tree incongruence in rice?
Do we need full genomes to answer these questions?
Phylogenomics of rice (Oryza)
Concatenated analysis
Gene trees in Oryza
❖ Gene tree methods: recover every possible topology
❖ Species tree methods: many clades not statistically significant
Cranston, K.A., B. Hurwitz, D. Ware, L. Stein, R.A. Wing. Species trees from highly
incongruent gene trees in rice. Systematic Biology. 2009: doi: 10.1093/sysbio/syp054
Supermatrix topology
❖ Suggest incomplete lineage sorting and hybridization /
introgression in evolutionary history of rice
What data do we have for creating a
complete tree of life?
Gene tree signal in GenBank
How many trees can we build using all of
the data in GenBank and how are those
trees distributed across the tree of life?
All-vs-all BLAST at each NCBI taxonomy node
Sanderson, M.J., D.T. Boss, D. Chen, K.A. Cranston, and A. Wehe. The PhyLoTA Browser:
Processing GenBank for molecular phylogenetics research. Systematic Biology 2008: 57(3).
Arachis hypogaea
Arachis hypogaea
subsp. fastigiata
Arachis hypogaea
subsp. hypogaea
Arachis glabrata
subtree
clusters
Arachis
All possible clusters, alignments and trees
aactgtcgcatgttgacg...
aattgtcg-atgttgacg...
aac-gtcgcatgtcgacg...
aac-gtcgcatgtcgacg...
aac-gtcgcatgtcgacg...
aactgtcgcatgtcgacg...
aactgtcgcatgtcgacg...
aactgtcgcatgtcgacg...
❖ ~90000 clusters, alignments, trees available for download
❖ data availability matrix at each NCBI node
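The step from all-vs-all BLAST hits to clusters and a data availability matrix can be sketched as below. This is an illustration only. It assumes single-linkage clustering (any significant hit joins two sequences), which may differ from what the PhyLoTA pipeline actually does. The sequence ids and hit pairs are hypothetical, and the two Arachis names are borrowed from the slide.

```python
def cluster_sequences(sequence_ids, hits):
    """Single-linkage clustering: sequences joined by any hit share a cluster."""
    parent = {s: s for s in sequence_ids}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path halving keeps lookups short
            s = parent[s]
        return s

    for a, b in hits:
        parent[find(a)] = find(b)

    clusters = {}
    for s in sequence_ids:
        clusters.setdefault(find(s), []).append(s)
    return sorted(clusters.values())

def availability_matrix(clusters, taxon_of):
    """Rows are taxa, columns are clusters; 1 means the taxon has a sequence."""
    taxa = sorted(set(taxon_of.values()))
    return {
        taxon: [int(any(taxon_of[s] == taxon for s in cluster)) for cluster in clusters]
        for taxon in taxa
    }

# Hypothetical sequence ids, taxon labels and significant BLAST hit pairs.
taxon_of = {
    "s1": "Arachis hypogaea", "s2": "Arachis glabrata",
    "s3": "Arachis hypogaea", "s4": "Arachis hypogaea",
    "s5": "Arachis glabrata",
}
hits = [("s1", "s2"), ("s3", "s4")]

clusters = cluster_sequences(sorted(taxon_of), hits)
matrix = availability_matrix(clusters, taxon_of)
for taxon, row in matrix.items():
    print(taxon, row)
```

Each column with sequences from enough taxa is a candidate alignment and gene tree; the rows show how unevenly the data are spread across taxa.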
❖ complete = contains all of biodiversity
❖ dynamic = continuously updated with new data
❖ available digitally = browse, query, download
http://opentreeoflife.org
Gordon Burleigh
Keith Crandall
Karl Gude
David Hibbett
Mark Holder
Laura Katz
Rick Ree
Stephen Smith
Doug Soltis
Tiffani Williams
Computer science
Systematics
Evolutionary theory
Computational biology
Bioinformatics
Journalism
Even if there were phylogenies for all sequence
clusters in GenBank, they would represent only a
small fraction of biodiversity
Two types of inputs
Phylogeny: highly resolved, computationally derived, limited coverage
Taxonomy: poorly resolved, manually curated, much more complete
~7000 trees from ~2600 studies
Phylografter: Rick Ree, Field Museum of Natural History
Fig. 1. Combined molecular phylogenetic tree for Diptera. Partitioned ML analysis of combined taxon sets of tier 1 and tier 2 FLYTREE data samples (−lnL = 344155.6169) calculated in RAxML. Circles indicate bootstrap support >80% (black/bp = 95–100%, gray/bp = 88–94%, white/bp = 80–88%). Nodes with improved bootstrap values resulting from postanalysis pruning of unstable taxa are marked by stars (black/bp = 95–100%, gray/bp = 88–94%, white/bp = 80–88%). Colored squares on terminal branches indicate the presence, in at least one species of a family, of ecological traits as shown to lower left. The number
~ 4% of all published
phylogenetic trees
Stoltzfus et al 2012
Trees generally
published as pictures
in PDFs
OpenTree Reference Taxonomy
[Figure: several input sources, shown as logos, joined by "+"]
+ patch files for manual edits
3,133,028 nodes and 2,559,835 ‘species’
Jonathan Rees, NESCent
How do we combine data to build
and use a tree of life?
Novel datastore for synthesis
Treemachine: Stephen Smith, Cody Hinchliff, Joseph Brown, U Michigan
Jim Allman, NESCent
Manual synthesis based on all data
Automated synthesis based on limited data
Inputs:
Published phylogenies
Taxonomies
• filter / weight input trees
• re-synthesize
• process feedback
• input new trees
synthetic tree of life
Improving the synthetic tree
❖ Branch lengths & divergence times
❖ Better synthesis using tree metadata
❖ Community engagement
❖ data deposition & curation
❖ feedback & annotation
Moving beyond a single tree
❖ Detecting conflict and coverage
❖ Visualization
❖ Enabling custom synthesis
❖ Building out to other tools & resources
What can we do with a tree of life?
aactgtcgcatgttgacg...
aattgtcg-atgttgacg...
aac-gtcgcatgtcgacg...
aac-gtcgcatgtcgacg...
aac-gtcgcatgtcgacg...
aactgtcgcatgtcgacg...
aactgtcgcatgtcgacg...
aactgtcgcatgtcgacg...
+ =
Image: Zephyris at the English language Wikipedia
10 million years
24 million years
Acer macrophyllum
Betula lutea
Aesculus glabra
Tilia americana
Ulmus rubra
Leaf patterns image from Walls RL: American Journal
of Botany 2011, 98(2):244-253.
Acer macrophyllum
Betula alleghaniensis
Aesculus glabra
Tilia americana
Ulmus rubra
Stoltzfus, A., Lapp, H., Matasci, N., … Cranston, K.A., ... & Jordan, G. (2013). Phylotastic! Making
tree-of-life knowledge accessible, reusable and convenient. BMC bioinformatics, 14(1), 158.
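The workflow on the preceding slides (resolve the submitted names, then extract the matching subtree from a large tree) can be sketched as below. This is a hedged illustration: the function names, the one-entry synonym table and the reference tree are all hypothetical. The topology is made up for the example and is not the Open Tree synthetic tree, and the two extra species exist only so that there is something to prune.

```python
def resolve(name, synonyms):
    """Map a submitted name to its accepted name, if a synonym is known."""
    return synonyms.get(name, name)

def induced_subtree(tree, wanted):
    """Keep only the leaves in `wanted`, collapsing single-child nodes."""
    if isinstance(tree, str):
        return tree if tree in wanted else None
    kept = [c for c in (induced_subtree(child, wanted) for child in tree) if c is not None]
    if not kept:
        return None
    return kept[0] if len(kept) == 1 else tuple(kept)

def to_newick(tree):
    """Serialize a nested-tuple tree as a Newick string (no trailing ';')."""
    if isinstance(tree, str):
        return tree.replace(" ", "_")
    return "(" + ",".join(to_newick(child) for child in tree) + ")"

# Hypothetical synonym table and reference tree, for illustration only.
synonyms = {"Betula lutea": "Betula alleghaniensis"}
reference = (
    (("Acer macrophyllum", "Acer rubrum"), "Aesculus glabra"),
    ("Tilia americana", ("Ulmus rubra", ("Betula alleghaniensis", "Quercus alba"))),
)

query = ["Acer macrophyllum", "Betula lutea", "Aesculus glabra",
         "Tilia americana", "Ulmus rubra"]
accepted = {resolve(name, synonyms) for name in query}
print(to_newick(induced_subtree(reference, accepted)) + ";")
```

Resolving names first matters: without the synonym lookup, "Betula lutea" would match no leaf and would silently drop out of the result.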
Collaborative data collection
Validation of datasets
Search & download across datasets
Get tree
Get tree
What can we do with a tree of life?
University of Alberta: Bruce Rannala
University of Arizona: Michael Sanderson
NESCent: Jonathan Rees, Jim Allman

Carleton Biology talk : March 2014

  • 1. Enabling science with the tree of life. Karen Cranston, National Evolutionary Synthesis Center (NESCent). @kcranstn, http://slideshare.net/kcranstn
  • 2. The tree of life provides a means for organizing and explaining biodiversity data. Wiegmann et al. PNAS, 2011
  • 3. What do we want from a Tree of Life? ❖ complete = contains all of biodiversity ❖ dynamic = continuously updated with new data ❖ available digitally = browse, query, download. Image: http://evolution.berkeley.edu
  • 4. ❖ Create a complete tree of life by synthesizing published phylogenetic data ❖ Provide tools for managing, synthesizing & sharing phylogenetic data. http://opentreeoflife.org
  • 5. Synthetic science ❖ Novel methods & analysis tools ❖ Big data from existing data. Biodiversity Synthesis Center / Encyclopedia of Life; National Evolutionary Synthesis Center
  • 6. Challenges ❖ Incongruence: How do we detect and use conflict between trees? ❖ Availability: What data do we have to construct a tree of life? ❖ Synthesis: How do we combine data across the tree of life?
  • 7. What can we learn from conflict between trees?
  • 9. Bayesian phylogenetic inference Input: sequence data + evolutionary model Output = list of sampled phylogenies
  • 10. [Figure: bar chart of probability (0 to 0.18) for each of about 50 sampled trees] Number of times sampled ∝ probability. Is there a stable backbone among the trees? What taxa have unstable placement?
  • 11. Summarize with agreement subtrees. [Figure: four sampled trees on taxa 1 to 5 with Pr = 0.40, 0.25, 0.20 and 0.15; with taxon 2 removed, all four share the same subtree on taxa 1, 3, 4, 5, Pr = 1.00]
  • 12. [Figure: four sampled trees on taxa 1 to 5 with Pr = 0.40, 0.25, 0.20 and 0.15; with taxon 2 removed, one tree differs in the order of taxa 3 and 4, so the shared subtree on taxa 1, 3, 4, 5 has Pr = 0.85] Cranston, K.A. and B.H. Rannala. Summarizing a posterior distribution of phylogenies using agreement subtrees. Systematic Biology 2007: 56(4), pp. 578-590.
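The idea on slides 9 to 12 (sample frequency approximates posterior probability, and removing an unstable taxon can concentrate that probability on one subtree) can be illustrated with a small sketch. This is not the agreement-subtree algorithm from Cranston and Rannala (2007); it only prunes one named taxon from a toy sample and recounts topologies. The nested-tuple tree encoding, the function names, and the four toy trees are assumptions made for illustration.

```python
from collections import Counter

def prune(tree, taxon):
    """Remove one leaf from a nested-tuple tree, collapsing single-child nodes."""
    if not isinstance(tree, tuple):
        return None if tree == taxon else tree
    kids = [k for k in (prune(child, taxon) for child in tree) if k is not None]
    if not kids:
        return None
    return kids[0] if len(kids) == 1 else tuple(kids)

def clades(tree):
    """Return (leaf set, set of internal clades) for a rooted nested-tuple tree."""
    if not isinstance(tree, tuple):
        return frozenset([tree]), set()
    leaves, found = frozenset(), set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found

def topology_probs(sample):
    """Estimate each topology's probability as its frequency in the sample."""
    counts = Counter(frozenset(clades(t)[1]) for t in sample)
    total = sum(counts.values())
    return {topo: n / total for topo, n in counts.items()}

# Hypothetical toy sample: taxon "2" moves around a fixed backbone.
t1 = ((("1", "2"), "3"), ("4", "5"))
t2 = (("1", ("2", "3")), ("4", "5"))
t3 = (("1", "3"), (("2", "4"), "5"))
t4 = (("1", "3"), ("4", ("2", "5")))
sample = [t1] * 8 + [t2] * 5 + [t3] * 4 + [t4] * 3

full = topology_probs(sample)
pruned = topology_probs([prune(t, "2") for t in sample])

print("topologies before pruning:", len(full), "best:", max(full.values()))
print("topologies after pruning:", len(pruned), "best:", max(pruned.values()))
```

In this toy sample no single full topology is sampled in more than 40% of draws, yet every tree agrees once taxon "2" is removed, which mirrors the situation on slide 11.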
  • 14. Phylogenomics of rice (Oryza): 820,000 BAC-end sequences for 9 diploid Oryza species; 1720 gene fragments; 2.4 million nucleotides. Cranston, K.A., B. Hurwitz, M.J. Sanderson, D. Ware, R.A. Wing, L. Stein. Phylogenomic analysis from deep BAC-end sequence libraries of rice. Systematic Botany, 35:3, 2010. What are the biological causes of gene tree incongruence in rice? Do we need full genomes to answer these questions?
  • 15. Phylogenomics of rice (Oryza) Cranston, K.A., B. Hurwitz, M.J. Sanderson, D. Ware, R.A. Wing, L. Stein. Phylogenomic analysis from deep BAC-end sequence libraries of rice. Systematic Botany, 35:3, 2010 Concatenated analysis
  • 16. Gene trees in Oryza ❖ Gene tree methods: recover every possible topology ❖ Species tree methods: many clades not statistically significant ❖ Results suggest incomplete lineage sorting and hybridization / introgression in the evolutionary history of rice. [Figure: supermatrix topology] Cranston, K.A., B. Hurwitz, D. Ware, L. Stein, R.A. Wing. Species trees from highly incongruent gene trees in rice. Systematic Biology. 2009: doi: 10.1093/sysbio/syp054
  • 17. What data do we have for creating a complete tree of life?
  • 18. Gene tree signal in GenBank How many trees can we build using all of the data in GenBank and how are those trees distributed across the tree of life?
  • 19. All-vs-all BLAST at each NCBI taxonomy node. Sanderson, M.J., D.T. Boss, D. Chen, K.A. Cranston, and A. Wehe. The PhyLoTA Browser: Processing GenBank for molecular phylogenetics research. Systematic Biology 2008: 57(3). [Figure: subtree clusters for the node Arachis, with child taxa Arachis hypogaea, Arachis hypogaea subsp. fastigiata, Arachis hypogaea subsp. hypogaea and Arachis glabrata]
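One way to turn all-vs-all hits into clusters is single-linkage grouping: any two sequences joined by a chain of hits end up in the same cluster. The slide does not say how PhyLoTA forms its clusters, so the sketch below is only an illustration under that assumption. The function name, the sequence IDs, and the idea that `hits` already contains only pairs passing some similarity cutoff are all hypothetical.

```python
def cluster_hits(seq_ids, hits):
    """Group sequences into single-linkage clusters from pairwise hits.

    seq_ids: iterable of sequence identifiers under one taxonomy node.
    hits: iterable of (id_a, id_b) pairs assumed to pass a similarity cutoff.
    """
    parent = {s: s for s in seq_ids}

    def find(x):
        # Follow parent links to the cluster representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    groups = {}
    for s in parent:
        groups.setdefault(find(s), []).append(s)
    # Largest clusters first, members sorted for stable output.
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))

# Hypothetical sequences from taxa under one node, with made-up hits.
seqs = ["hyp_fast_1", "hyp_hyp_1", "glab_1", "glab_2", "hyp_fast_2"]
hits = [("hyp_fast_1", "hyp_hyp_1"),
        ("hyp_hyp_1", "glab_1"),
        ("glab_2", "hyp_fast_2")]

clusters = cluster_hits(seqs, hits)
for members in clusters:
    print(len(members), members)
```

Repeating the same grouping at each taxonomy node, using only the sequences below that node, would give one set of clusters per node.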
  • 20. All possible clusters, alignments and trees. [Figure: example sequence alignment] ❖ ~90000 clusters, alignments, trees available for download ❖ data availability matrix at each NCBI node
  • 22. ❖ complete = contains all of biodiversity ❖ dynamic = continuously updated with new data ❖ available digitally = browse, query, download. http://opentreeoflife.org
  • 23. Gordon Burleigh, Keith Crandall, Karl Gude, David Hibbett, Mark Holder, Laura Katz, Rick Ree, Stephen Smith, Doug Soltis, Tiffani Williams. Computer science, systematics, evolutionary theory, computational biology, bioinformatics, journalism
  • 25. Even if there were phylogenies for all sequence clusters in GenBank, they would represent only a small fraction of biodiversity
  • 26. Two types of inputs. Phylogeny: highly resolved, computationally derived, limited coverage. Taxonomy: poorly resolved, manually curated, much more complete
  • 27. ~7000 trees from ~2600 studies Phylografter: Rick Ree, Field Museum of Natural History
  • 28. Fig. 1. Combined molecular phylogenetic tree for Diptera. Partitioned ML analysis of combined taxon sets of tier 1 and tier 2 FLYTREE data samples (−lnL = 344155.6169) calculated in RAxML. Circles indicate bootstrap support >80% (black/bp = 95–100%, gray/bp = 88–94%, white/bp = 80–88%). Nodes with improved bootstrap values resulting from postanalysis pruning of unstable taxa are marked by stars (black/bp = 95–100%, gray/bp = 88–94%, white/bp = 80–88%). Colored squares on terminal branches indicate the presence, in at least one species of a family, of ecological traits as shown to lower left. The number ... ~4% of all published phylogenetic trees (Stoltzfus et al. 2012). Trees generally published as pictures in PDFs
  • 29. OpenTree Reference Taxonomy: [logos of source taxonomies] + patch files for manual edits. 3,133,028 nodes and 2,559,835 ‘species’. Jonathan Rees, NESCent
  • 30. How do we combine data to build and use a tree of life?
  • 31. Novel datastore for synthesis Treemachine: Stephen Smith, Cody Hinchliff, Joseph Brown, U Michigan
  • 34. Manual synthesis based on all data Automated synthesis based on limited data
  • 35. Inputs: published phylogenies, taxonomies. Cycle: filter / weight input trees • re-synthesize • process feedback • input new trees. Result: synthetic tree of life
  • 36. Improving the synthetic tree ❖ Branch lengths & divergence times ❖ Better synthesis using tree metadata ❖ Community engagement ❖ data deposition & curation ❖ feedback & annotation
  • 37. Moving beyond a single tree ❖ Detecting conflict and coverage ❖ Visualization ❖ Enabling custom synthesis ❖ Building out to other tools & resources
  • 38. What can we do with a tree of life?
  • 41. Acer macrophyllum, Betula lutea, Aesculus glabra, Tilia americana, Ulmus rubra. Leaf patterns image from Walls RL: American Journal of Botany 2011, 98(2):244-253. Acer macrophyllum, Betula alleghaniensis, Aesculus glabra, Tilia americana, Ulmus rubra
  • 42. Stoltzfus, A., Lapp, H., Matasci, N., … Cranston, K.A., ... & Jordan, G. (2013). Phylotastic! Making tree-of-life knowledge accessible, reusable and convenient. BMC bioinformatics, 14(1), 158.
  • 44. Collaborative data collection, validation of datasets, search & download across datasets
  • 46. What can we do with a tree of life?
  • 47. University of Alberta: Bruce Rannala. University of Arizona: Michael Sanderson. NESCent: Jonathan Rees, Jim Allman