João André Carriço, PhD
Microbiology Institute/Institute for Molecular Medicine
Faculty of Medicine, University of Lisbon
Portugal
http://im.fm.ul.pt
http://imm.fm.ul.pt
http://www.joaocarrico.info
WORKSHOP 24:
NGS FOR MICROBIAL GENOMIC
SURVEILLANCE AND MORE - ONE
TECHNOLOGY FITS ALL
Nothing to disclose
• This presentation is not intended to cover all available software or databases (that would take several weeks or months).
• I'll present what I use or intend to use in the near future.
• I gladly accept suggestions on what to include in similar presentations in the future.
• It is supposed to be interactive, so ask away during the presentation.
• What is in the reads (FASTQ files)
• Available databases
  - Virulence factor and AMR databases
  - Sequence-based typing databases: PubMLST.org / Enterobase
• High-throughput sequencing data analysis (freeware)
  - Prokka
  - Roary
  - Nullarbor
  - Microreact.org
  - PHYLOViZ
• Commercial solutions
  - Bionumerics 7.5
  - CLC Genomics Workbench (CLC bio)
  - Ridom SeqSphere+
[Diagram: the sequenced reads contain the isolate genome* plus, potentially, other isolates in the sequencing run and contamination. Slide source: Nick Loman]
* Chromosome + plasmids + phages
Virulence Factor Databases
• VFDB (http://www.mgc.ac.cn/VFs/main.htm)
• Pathosystems Resource Integration Center (PATRIC) VF (https://www.patricbrc.org/)
• Victors (http://www.phidias.us/victors/)
• PHI-base (http://www.phi-base.org/)
• MvirDB (http://mvirdb.llnl.gov/)
To know more:
- Presentation in the session "Controversies in interpreting whole genome sequence data":
http://eccmidlive.org/#resources/how-can-we-design-actionable-virulome-databases
• Comprehensive Antibiotic Resistance Database (CARD) (https://card.mcmaster.ca/)
• Repository of Antibiotic resistance Cassettes (RAC) (http://rac.aihi.mq.edu.au/rac/)
• INTEGRALL: The Integron Database (http://integrall.bio.ua.pt/)
(…)
http://www.pubmlst.org
http://bigsdb.web.pasteur.fr/
slide by @happy_khan
Martin Sergeant
Mark Achtman
Nabil-Fareed Alikhan
Zhemin Zhou
To know more:
http://www.slideshare.net/nickloman/eccmid-2015-so-i-have-sequenced-my-genome-what-now
[Workflow diagram labels]
Reads (FASTQ files)
De novo assembler
Contigs (FASTA files)
Prokka
Annotated contigs (GBK/GFF files)
Roary: pan-genome analysis
Enterobase
BIGSdb
Nullarbor
PHYLOViZ: tree + metadata visualization
Microreact.org: tree + metadata + visualization
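As a rough illustration of the first file type in this workflow, the sketch below reads FASTQ records (four lines each: header, sequence, separator, quality string) and reports the number of reads and the mean read length. The function names and the two example reads are invented for illustration; real data would normally go through an established library or QC tool.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (or a blank line)
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        yield header[1:], seq, qual


def summarize(handle: TextIO) -> Tuple[int, float]:
    """Return (number of reads, mean read length)."""
    lengths = [len(seq) for _, seq, _ in read_fastq(handle)]
    mean = sum(lengths) / len(lengths) if lengths else 0.0
    return len(lengths), mean


# Two made-up reads, only to show the record layout.
EXAMPLE = "@read1\nACGTAC\n+\nIIIIII\n@read2\nGGCA\n+\nIIII\n"
n_reads, mean_len = summarize(io.StringIO(EXAMPLE))
print(n_reads, mean_len)
```

The summary says nothing about where the reads come from; reads from other isolates or contaminants look exactly like any other record at this level.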
• Prokka: genome annotation made easy, by Torsten Seemann (slides by Torsten)
• Genome annotation: adding biological information to the sequence by describing features
To know more:
http://www.slideshare.net/torstenseemann/prokka-rapid-bacterial-genome-annotation-abphm-2013
Available at: https://github.com/tseemann/prokka
• Roary: pan-genome analysis, by Andrew Page
• Available at: https://sangerpathogens.github.io/Roary/
[Diagram: pan-genome = core genome + accessory genome]
• Inputs: annotated de novo assemblies (GFF files)
  - Typically from the annotation pipeline
• Outputs:
  - Spreadsheet with presence and absence of genes
  - Multi-FASTA alignment of core genes, so you can build a tree without a reference
  - Multi-FASTA alignments for each gene
  - Plots for the open/closed genome and unique genes
  - Integrates with Phandango so you can visualise all structural variation
  - QC report from Kraken to help identify suspect samples
(Slide by Andrew Page)
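To make the presence/absence spreadsheet more concrete, here is a minimal sketch that loads such a table and lists the strains carrying each gene. It assumes a simplified layout (one gene column followed by one column per strain, with an empty cell meaning "absent"); the real Roary output carries additional annotation columns, so the column handling here is an assumption, and the gene and strain names are made up.

```python
import csv
import io
from typing import Dict, Set


def load_presence(handle, gene_column: str = "Gene") -> Dict[str, Set[str]]:
    """Map each gene to the set of strains in which it is present.

    Assumes a simplified layout: one gene column, then one column per strain,
    with an empty cell meaning the gene is absent from that strain.
    """
    presence: Dict[str, Set[str]] = {}
    for row in csv.DictReader(handle):
        gene = row.pop(gene_column)
        presence[gene] = {
            strain for strain, value in row.items() if (value or "").strip()
        }
    return presence


# Hypothetical table with three strains and three genes.
TABLE = (
    "Gene,strainA,strainB,strainC\n"
    "gyrA,A_001,B_001,C_001\n"
    "blaTEM,A_002,,\n"
    "fimH,A_003,,C_003\n"
)
presence = load_presence(io.StringIO(TABLE))
for gene, strains in presence.items():
    print(gene, len(strains), sorted(strains))
```

Counting the strains per gene in this way is the starting point for splitting genes into core and accessory sets.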
Core (n or n-1 strains)
Soft-Core (n-2 or n-3 strains)
Shell (8(?) to n-3 strains)
Cloud (<8(?) strains)
Core genome: Core + Soft-Core
Accessory genome: Shell + Cloud
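The count-based categories above can be sketched as a small classifier. Two choices in this sketch are assumptions rather than part of the slide: the shell lower bound defaults to the tentative "8(?)", and a gene present in exactly n-3 strains (which the slide lists under both soft-core and shell) is counted as soft-core. The function names are hypothetical.

```python
def classify_gene(n_present: int, n_strains: int, shell_min: int = 8) -> str:
    """Assign a gene to core / soft-core / shell / cloud by strain count.

    Follows the count-based cut-offs on the slide. Assumptions: shell_min
    defaults to the slide's tentative "8(?)", and a gene present in exactly
    n-3 strains is treated as soft-core.
    """
    if not 0 <= n_present <= n_strains:
        raise ValueError("n_present must be between 0 and n_strains")
    if n_present >= n_strains - 1:
        return "core"
    if n_present >= n_strains - 3:
        return "soft-core"
    if n_present >= shell_min:
        return "shell"
    return "cloud"


def genome_compartment(category: str) -> str:
    """Core + soft-core form the core genome; shell + cloud the accessory genome."""
    return "core genome" if category in ("core", "soft-core") else "accessory genome"


# Example with a hypothetical collection of 100 strains.
for count in (100, 99, 97, 50, 3):
    category = classify_gene(count, 100)
    print(count, category, genome_compartment(category))
```

For small collections the fixed shell threshold quickly stops making sense, which is one reason percentage-based cut-offs are often preferred in practice.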
iCANDY output of presence and absence of genes in the accessory genome: S. Weltevreden and public S. enterica genomes.
(Slide by Andrew Page)
• Nullarbor: complete pipeline from reads to reports, by Torsten Seemann
• The objective is to automate analysis for everyday use in public health labs and research settings
• Uses and distills the outputs of many software tools
• Available at: https://github.com/tseemann/nullarbor
Slide by Torsten Seemann
From: https://github.com/tseemann/nullarbor
Slides by Torsten Seemann
www.phyloviz.net
Inputs:
- Tab-separated text (profiles)
- FASTA files
- Automatic database retrieval (MLST)
Outputs:
• goeBURST and goeBURST MST
• Link quality assessment
• High-quality images
Can be easily applied to:
- MLST / cgMLST / wgMLST
- MLVA
- SNP data*
- Gene presence/absence
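To show how tab-separated allelic profiles relate to a tree, the sketch below loads a small profile table, computes the number of differing loci between profiles, and builds a minimum spanning tree with plain Prim's algorithm. This is only an illustration: it does not reproduce the goeBURST tie-break rules used by PHYLOViZ, and the profile values, column names and function names are made up.

```python
import csv
import io
from typing import Dict, List, Tuple

Profile = Tuple[str, ...]


def load_profiles(handle) -> Dict[str, Profile]:
    """Read tab-separated profiles: first column is the ST, the rest are loci."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header row
    return {row[0]: tuple(row[1:]) for row in reader if row}


def hamming(a: Profile, b: Profile) -> int:
    """Number of loci at which two profiles carry different alleles."""
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(profiles: Dict[str, Profile]) -> List[Tuple[str, str, int]]:
    """Prim's algorithm on the complete graph of pairwise Hamming distances."""
    names = sorted(profiles)
    if not names:
        return []
    in_tree = {names[0]}
    edges: List[Tuple[str, str, int]] = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge leaving the tree; ties are broken by name order.
        dist, u, v = min(
            (hamming(profiles[u], profiles[v]), u, v)
            for u in sorted(in_tree)
            for v in names
            if v not in in_tree
        )
        in_tree.add(v)
        edges.append((u, v, dist))
    return edges


# Made-up 7-locus MLST profiles.
TEXT = (
    "ST\tl1\tl2\tl3\tl4\tl5\tl6\tl7\n"
    "1\t1\t1\t1\t1\t1\t1\t1\n"
    "2\t1\t1\t1\t1\t1\t1\t2\n"
    "3\t1\t1\t1\t1\t1\t3\t2\n"
    "4\t5\t6\t7\t1\t1\t1\t1\n"
)
tree = minimum_spanning_tree(load_profiles(io.StringIO(TEXT)))
for u, v, d in tree:
    print(u, v, d)
```

The same distance function applies unchanged to cgMLST/wgMLST profiles or to gene presence/absence vectors, since each is just a tuple of categorical values.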
New features:
• Hierarchical clustering
• Neighbor-Joining
• Project Saving
• Available at http://online.phyloviz.net
• Web-based version of PHYLOViZ
• Allows users to create their own datasets, save them, and share their data (privately or publicly)
• REST API available
• Scalable to thousands of nodes
• Tree analysis tools:
  - Interactive distance matrix
  - NLV graph
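A minimal sketch of the two tree analysis tools, assuming that an NLV (N locus variant) graph links every pair of profiles that differ at N loci or fewer. It is not PHYLOViZ Online's implementation and does not use its REST API; the profiles and function names are invented for illustration.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Profile = Tuple[int, ...]


def distance_matrix(profiles: Dict[str, Profile]) -> Dict[Tuple[str, str], int]:
    """Pairwise count of differing loci, keyed by sorted (name, name) pairs."""
    return {
        (a, b): sum(x != y for x, y in zip(profiles[a], profiles[b]))
        for a, b in combinations(sorted(profiles), 2)
    }


def nlv_edges(profiles: Dict[str, Profile], n: int) -> List[Tuple[str, str, int]]:
    """All links between profiles that differ at n loci or fewer."""
    return [(a, b, d) for (a, b), d in distance_matrix(profiles).items() if d <= n]


# Made-up profiles over seven loci.
PROFILES = {
    "ST1": (1, 1, 1, 1, 1, 1, 1),
    "ST2": (1, 1, 1, 1, 1, 1, 2),
    "ST3": (1, 1, 1, 1, 1, 3, 2),
    "ST4": (5, 6, 7, 1, 1, 1, 1),
}
print(nlv_edges(PROFILES, 1))  # single-locus variants only
print(nlv_edges(PROFILES, 2))  # adds double-locus variants
```

Unlike a spanning tree, this graph keeps every link under the threshold, so alternative connections between closely related profiles stay visible.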
Slide by @happy_khan
[Screenshot labels: NLV graph, tree cut-off, full MST, create selections, change tree options]
• Available at http://microreact.org/
• Presentation in the session "Harnessing whole genome sequence data for public health applications": "Novel open access tools for WGS-based pathogen surveillance and the identification of high-risk clones"
• http://eccmidlive.org/#resources/novel-open-access-tools-for-wgs-based-pathogen-surveillance-and-the-identification-of-high-risk-clones
• Ridom SeqSphere+: http://www.ridom.de/seqsphere/
• Applied Maths BioNumerics 7.6: http://www.applied-maths.com/bionumerics
• CLC bio Genomics Workbench: http://www.clcbio.com/blog/clc-genomics-workbench-7-5/
• Huge variety of software and database solutions
• There is no single one-size-fits-all solution (job security for bioinformaticians)
• Different questions require different approaches
• Always question the results and data provenance
• ECCMID 2015 Meet-the-expert session on "What bioinformatic tools should I use for analysis of high-throughput sequencing data for molecular diagnostics?"
  - Nick Loman: http://www.slideshare.net/nickloman/eccmid-2015-meettheexpert-bioinformatics-tools
  - João André Carriço: http://www.slideshare.net/joaoandrecarrico/eccmid-meet-theexpert2015
• UMMI members
  - Bruno Gonçalves
  - Mário Ramirez
  - José Melo-Cristino
• INESC-ID
  - Alexandre Francisco
  - Cátia Vaz
  - Marta Nascimento
• EFSA INNUENDO Project (https://sites.google.com/site/innuendocon/)
  - Mirko Rossi
• FP7 PathoNGenTrace (http://www.patho-ngen-trace.eu/):
  - Dag Harmsen (Univ. Muenster)
  - Stefan Niemann (Research Center Borstel)
  - Keith Jolley, James Bray and Martin Maiden (Univ. Oxford)
  - Joerg Rothganger (RIDOM)
  - Hannes Pouseele (Applied Maths)
• Genome Canada IRIDA (Integrated Rapid Infectious Disease Analysis) project (www.irida.ca)
  - Franklin Bristow, Thomas Matthews, Aaron Petkau, Morag Graham and Gary Van Domselaar (NML, PHAC)
  - Ed Taboada and Peter Kruczkiewicz (Lab for Foodborne Zoonoses, PHAC)
  - Fiona Brinkman (SFU)
  - William Hsiao (BCCDC)

Making Use of NGS Data: From Reads to Trees and Annotations
