Inference and informatics in a 'sequenced' world

Joe Parker, Early Career Research Fellow at Royal Botanic Gardens, Kew

Short lecture relating my recent work on real-time phylogenomics, implications for bioinformatics research and future directions of genomic/phylogenetic modelling to explicitly account for phylogeny, synteny and identity through coloured graphs. University of Reading, 2nd August 2017

Inference and informatics in a 'sequenced' world

  • 1. Informatics and inference in a sequenced world. Dr. Joe Parker, Early Career Research Fellow (Phylogenomics), Royal Botanic Gardens, Kew. @lonelyjoeparker
  • 2. Joe Parker - background. [Figure: VL 1 length, VL 4 length and Average, split by Neut - and Neut +]
  • 3. Incredible times for bioscience. Images: Wikimedia Commons, CC BY-SA (clockwise from top left: Jeroen Rouwkema, @aGastya, author’s own, @RE73)
  • 4. Step back: molecular evolution
      - “Horizontal gene transfer occurs x more frequently in these lineages, because of this biology”
      - “Convergent evolution is rare in most genes, in most organisms, but y times greater in these gene families …because of this biology”
      - “New chromosomes are created & destroyed at z, q, rates in this reproductive strategy …because of this biology”
  • 5. Field-based DNA sequencing
  • 6. Snowdonia, HelloWorld & ‘tent-seq’
      - Arabidopsis thaliana and Arabidopsis lyrata: congeneric species; reference genomes available
      - Field-sequenced (MinION) & lab-sequenced (Illumina™)
      - Orthogonal BLAST: 4 sample*sequencer combinations
      - Compare TRUE & FALSE rates for varying ID statistic cutoffs
  • 7. Tasty pics
      - Conditions: 100% humidity; 6-13ºC
      - Essential kit: 800 W generator, 3x laptops, centrifuge, waterbath, polystyrene boxes (lots), kettle(…!)
      - Yield: >400 Mbp data in three days; A. thaliana ~2.01x coverage
  • 8. Field- vs. lab-sequenced sample ID
      - Match individual reads to each reference with BLAST
      - Compare match lengths in TRUE and FALSE cases
      - ‘Length bias’ ID stat: length_TRUE - length_FALSE
      - Compare TRUE & FALSE rates as length bias cutoff varies
      - Panels: MiSeq (lab), MinION (field)
  • 9. Bitty data (1) partial queries
      - Subsample MinION output
      - Repeat ID pipeline, record mean ID stat s_bias
      - Replicates: N = 30
      - Simulate from 10^0 to 10^4 reads (≈instant → hours)
  • 10. Bitty data (2) partial references
      - Take reference genome at high contiguity
      - Fragment randomly to target (low) contiguity
      - Repeat read identification using fragmented DB
      - Simulate N50 ≈ 1,000 bp to N50 ≈ 10 Mbp
  • 11. Keeping it simple: Kew Science Festival
      - Six species: whole-genome skim samples with MinION in preparation
      - Build BLAST DBs from skimmed data
      - Select ‘unknown’ (blinded) sample, extract DNA and resequence in real-time
      - Compare to partial DBs in six-way BLAST competition
      - Live ID?
  • 12. de novo genome assembly

      Data                                          MiSeq only    MiSeq + MinION
      Assembler                                     ABySS         hybridSPAdes
      Illumina reads, 300bp paired-end              8,033,488     8,033,488
      Illumina data (yield)                         2,418 Mbp     2,418 Mbp
      MinION reads, R7.3 + R9 kits, N50 ~4,410bp    -             96,845
      MinION data (yield)                           -             240 Mbp
      Approx. coverage                              19.49x        19.49x + 2.01x
      Assembly key statistics:
        # contigs                                   24,999        10,644
        Longest contig                              90 Kbp        414 Kbp
        N50 contiguity                              7,853 bp      48,730 bp
        Fraction of reference genome (%)            82            88
      Errors, per 100 kbp:
        # N's                                       1.7           5.4
        # mismatches                                518           588
        # indels                                    120           130
      Largest alignment                             76,935 bp     264,039 bp
      CEGMA gene completeness estimate:
        # genes                                     219 of 248    245 of 248
        % genes                                     88%           99%
  • 13. Wait – genes?
      - Entire chloroplast genome (~150 kbp)
      - Plastid coding loci
      - Individual field-sequenced MinION reads
  • 14. Real-time phylogenomics
      - Filtered reads
      - Gene models; TAIR10 CDS code
      - Annotation: SNAP
      - 1:1 reciprocal BLAST
      - Multiple sequence alignments: MUSCLE, trimAl
      - Gene trees → Consensus tree: *BEAST, RAxML, TreeAnnotator
      - Cumulative counts: unique genes, all genes (‘Lab’ being transported!)
  • 15. Emerging health threats & globalisation
      Acute oak decline: a syndrome-type oak disease
      - Unknown cause, no treatment
      - ca. 200 million oaks in GB …amenity & timber value: ~£500/tree
      - Emerged ca. 2004, spreading rapidly
      - Significant morbidity and mortality
      Defra ‘Futureproofing Plant Health’ initiative
      - Test field-based methods
      - Balanced survey of microbial community composition (healthy & affected individuals)
      - Overcome ascertainment bias
      - Pilot training of non-experts
      - Draw conclusions relevant to rapid-response plant health monitoring in the UK
      © 2016 Katy Reed / Forest Research
  • 16. Recap. From lab-based… to ‘app store’ genomics
  • 17. Problems with phylogeny… and comparative genomics
      - Suh (2016) Zool. Scripta. doi:10.1111/zsc.12213
      - Zapata et al. (2016) PNAS 113:E4052-E4060. ©2016 National Academy of Sciences
  • 18. Three-colour graphs: phylogeny, synteny & identity
      Key:
      - Extant node
      - Inferred node
      - Synteny edge (physical connection)
      - Phylogeny edge (evolutionary connection)
      - Identity edge (organismal connection)
      [Figure: example graph with nodes a, b, c, d, e, x, y, z]
  • 19. Three-colour graphs: phylogeny, synteny & identity. [Figure: four example graphs. Panels: Duplication; Tetraploid hybrid formed, then Diploidization; Inversion; HGT.] Key: extant node; inferred node; synteny edge (physical connection); phylogeny edge (evolutionary connection); identity edge (organismal connection).
  • 20. Final thoughts. Portable sequencing, by anyone, means really Big Data. Informatics connecting this data through explicit models is inference. Scalable, reproducible, sustainable research: bionode.js, bioboxes.org, Singularity.
  • 21. Thanks, funders, contacts and questions. Oxford Nanopore Technologies Ltd.: Dan Turner, Richard Ronan, Gerrard Coyne. RBG Kew: Alexander S.T. Papadopulos (@metallophyte), Andrew Helmstetter (@ajhelmstetter), Dion Devey, Robyn Cowan, Tim Wilkinson, Stephen Dodsworth, Pepijn Kooij, Felix Forest, Bill Baker, Jan T. Kim, Jenny Williams, Abigail Barker, Mark Lee, Jim Clarkson, Mike Chester, Ester Gaya, Lisa Pokorny, Laszlo Csiba, Paul Wilkin, Richard Buggs, Mike Fay, Mark Chase, Ilia Leitch. QMUL: Laura Kelly, Kalina Davies, Steve Rossiter. Oxford: Aris Katzourakis, Oli Pybus, Jayna Raghwani. Others: Forest Research: Daegan Inward, Katy Reed; Dstl: Claire Lonstale, James Taylor; Birmingham: Nick Loman, Josh Quick; U. Utah: Bryn Dentinger; Imperial: James Rosindell. This research was conducted in the Sackler Phylogenomics Laboratory and was supported by the Calleva Foundation Phylogenomic Research Programme and the Sackler Trust. @lonelyjoeparker: joe.parker@kew.org
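The "N50 contiguity" and "Approx. coverage" rows in the slide 12 assembly table follow standard definitions, which can be sketched in a few lines. This is a minimal illustration, not the pipeline used for the talk: the function names `n50` and `approx_coverage` and the toy contig lengths are hypothetical, and the slides do not state which tool reported these statistics.

```python
def n50(lengths):
    """Return the N50 of a set of contig (or read) lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def approx_coverage(yield_bp, genome_size_bp):
    """Approximate depth of coverage: sequenced bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return yield_bp / genome_size_bp


# Toy values for illustration only (not data from the slides).
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))
print(approx_coverage(2_400, 120))
```

A higher N50 for the same total length means the assembly is split into fewer, longer pieces, which is the sense in which the hybrid assembly on slide 12 is more contiguous.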

Editor's Notes

  1. Welcome, thanks, menu. Formal introduction and thanks; lay out the menu / journey. I’ll mainly be talking about work in the last 2.5 yrs since taking up my ECRF at Kew.
  2. Wide range of taxa, techniques and questions. Enough to set my scene without taking ages, confusing/losing audience, or giving the impression I’m just a tools-bot.
  3. Start of… Incredible times. Traditional to start bioinformatics talks with a slide about Moore’s law, sequencing costs, and the data deluge. Actually this is a fantastic age to be living in, with ever bigger analyses – and I’ll talk a lot about “real-time” phylogenomics. But why? What are we attempting to discover?
  4. We need enough data to turn observations into empirical comparisons, and those into models and laws. We know a lot about evolutionary mechanisms, and a lot about a handful of genomes. What we know tells us “it’s complicated”: most genes don’t have simple orthologues, etc., horizontal transfer, etc. But we don’t, really, have an empirical understanding of how they fit together, e.g.:
     - “horizontal gene transfer occurs x more frequently in these lineages, because of this biology”
     - “adaptive molecular convergence is rare in most genes, in most organisms, but y times greater in these gene families, because of this biology”
     - “new chromosomes are created (by duplication, endogenisation, polyploidy) and destroyed (by diploidization) at z rates in this reproductive strategy, because of this biology”
  5. Portable sequencing: also long reads and real-time
  6. Direct, explicit, orthogonal test – and can it work? Picture of experimental design. Outline of the study, in terms of bioinformatics questions. Funding: a first pot and timeline…
  7. Data in terrible conditions, but anyone can do it. Social media reach: The Atlantic, Economist.
  8. We compare match lengths, and MinION allows long matches
  9. EXPLAIN AXES: precision improves rapidly
  10. EXPLAIN AXES: a partial REFERENCE would work, too
  11. MORE FUNDING. So simple a kid could do it? Yes. The challenge I set myself: OK, it’s a simple experiment. Can I build a test simple enough that a child can understand it? SOCIAL MEDIA. Funding: NANOPORE.
  12. Data from one time and place can and should be useful elsewhere. Flash a bit of proper genomics.
  13. Single reads match whole genes – meat & drink
  14. EXPLAIN AXES: postdoc-years. PAPER ACCEPTED.
  15. FUNDED. Tailor-made for health research/application; need to mention it somewhere because of strategic links. Building the ‘momentum narrative’. Other related stuff; VIPs etc. Plant health and emerging threats: a connected world means new diseases can spread globally, fast. Lay out the problem, e.g. opportunities – look! Health! Ascertainment bias! Field-portable! etc. Funding: yet another pot, this one also bigger. Software etc. to improve UI (ahem).
  16. HPCs to apps: exponential data, linear understanding. Pause – to recap. This is important because it’s where we tie it together and show my contribution: portable, mass sequencing is really here. Massive potential for de novo genomics and phylogenomics. But while we’re accumulating information at an exponential rate, we’re integrating it linearly, in essence… where are we going?
  17. Nature is cruel: more data only muddies the water. Bifurcating phylogenies are decreasingly useful, and complicated to get. ‘Comparative’ genomics actually uses relatively few datapoints (e.g. Encode…), in part because most phylogenetic methods require variations on the homology assumption.
  18. Here’s a common framework for all these studies. How to infer – sounds like a nightmare. Many of the edges in this network are really there already. Shifting paradigms, making linking easier. Explicitly model phylogeny, synteny and identity. Edge support reflects evidence; deviations from neutrality reflect hypotheses/models/phenomena. Any nodes connecting to an identity edge are considered completely connected. Maximum # edges ~n(2n-1)/2. Digraphs ~n!!. Possible ancestors from one locus on n taxa: essentially an inverse function of when they coalesce (can have m generations of n ancestors until an event where n(m)<n(t)).
  19. EXAMPLES. Gene duplication, e.g. paralogue in animal. Tetraploid formed then secondary diploidization, e.g. plant. Inversion in a genome. Unlinked loci (e.g. bacterial plasmids) and HGT. How to infer – sounds like a nightmare. Many of the edges in this network are really there already. Shifting paradigms, making linking easier. Explicitly model phylogeny, synteny and identity. Edge support reflects evidence; deviations from neutrality reflect hypotheses/models/phenomena.
  20. Formally linking datasets and models is inferring the network of life. Shifts the job for bioinformatics from something it’s good at – sophisticated, incremental analysis – to something computers in general are great at: linking elements. In this case informatics doesn’t enable research, it is the process of inference. It’s relatively easy to write a new standalone app to do x, or analyse some big dataset. Reproducibility and scaling-up science mean we must work harder on the links. Informatics as inference. The lonely astronomers.
  21. Funders. Thanks. Reach out.
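The three-colour graph described in the notes for slides 18 and 19 (extant and inferred nodes, joined by synteny, phylogeny and identity edges) can be sketched as a small data structure. This is only one possible reading of the slides, not the author's implementation: the class `ThreeColourGraph`, its methods, the node names and the toy duplication example are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# The three edge "colours" named in the slide key.
EDGE_KINDS = ("synteny", "phylogeny", "identity")


@dataclass
class ThreeColourGraph:
    # node name -> "extant" or "inferred"
    nodes: dict = field(default_factory=dict)
    # edge kind -> set of unordered node pairs
    edges: dict = field(default_factory=lambda: defaultdict(set))

    def add_node(self, name, status="extant"):
        if status not in ("extant", "inferred"):
            raise ValueError(f"unknown node status: {status}")
        self.nodes[name] = status

    def add_edge(self, u, v, kind):
        if kind not in EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {kind}")
        if u not in self.nodes or v not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        if u == v:
            raise ValueError("self-loops are not allowed in this sketch")
        self.edges[kind].add(frozenset((u, v)))

    def neighbours(self, name, kind):
        """Nodes joined to `name` by an edge of the given kind."""
        return sorted(
            next(iter(pair - {name}))
            for pair in self.edges[kind]
            if name in pair
        )

    def edge_counts(self):
        return {kind: len(self.edges[kind]) for kind in EDGE_KINDS}


def duplication_example():
    """Toy gene duplication: ancestral locus b gives rise to b1 and b1_dup."""
    g = ThreeColourGraph()
    g.add_node("a_anc", "inferred")
    g.add_node("b_anc", "inferred")
    for name in ("a1", "b1", "b1_dup"):
        g.add_node(name)
    # Phylogeny edges: evolutionary descent from inferred ancestral loci.
    g.add_edge("a_anc", "a1", "phylogeny")
    g.add_edge("b_anc", "b1", "phylogeny")
    g.add_edge("b_anc", "b1_dup", "phylogeny")
    # Synteny edges: physical linkage on the same chromosome.
    g.add_edge("a_anc", "b_anc", "synteny")
    g.add_edge("a1", "b1", "synteny")
    g.add_edge("b1", "b1_dup", "synteny")
    # Identity edge: loci observed in the same organism.
    g.add_edge("a1", "b1_dup", "identity")
    return g


g = duplication_example()
print(g.edge_counts())
print(g.neighbours("b_anc", "phylogeny"))
```

Keeping each edge colour in its own set makes it straightforward to query one kind of relationship at a time, or to attach per-edge support values later, in line with the note that edge support reflects evidence.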