Presentation for 2nd Conference Rapid Microbial NGS and Bioinformatics: Translation Into Practice
Hamburg/Germany, June 9-11, 2016
http://rami-ngs.org/
Common languages in genomic epidemiology: from ontologies to algorithms
1. João André Carriço, Mário Ramirez
Microbiology Institute and Instituto de Medicina Molecular,
Faculty of Medicine, University of Lisbon
jcarrico@fm.ul.pt twitter: @jacarrico
RAMI-NGS, Hamburg, Germany, 9-11 June 2016
2. Moving from Typing into High-Throughput Sequencing (HTS) Genomics:
Increase in discrimination
Extra information to be extracted from the genome (resistance profiles, virulence factors, genome organization)
Global Outbreak detection / Surveillance
Direct application in public health
Source attribution -> intervention
5. Read mapping algorithms
Bowtie2
BWA
SOAP2
Saruman
mr/mrsFAST
… (and a lot more)
Algorithms
Hatem M et al. BMC Bioinformatics 2013;14:184
DOI: 10.1186/1471-2105-14-184
+ a plethora of parameters for each of them
+ a (proper) choice of reference
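A toy sketch of why parameters matter in read mapping. This is a naive sliding-window comparison, not how Bowtie2, BWA or the other listed tools work internally; the function name `map_read` and the `max_mismatches` parameter are hypothetical and chosen only for illustration.

```python
def map_read(read, reference, max_mismatches=0):
    """Return 0-based start positions where `read` matches `reference`
    with at most `max_mismatches` substitutions (no indels considered)."""
    hits = []
    for start in range(len(reference) - len(read) + 1):
        window = reference[start:start + len(read)]
        # Count positions where the read and the reference window differ.
        mismatches = sum(1 for a, b in zip(read, window) if a != b)
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits


if __name__ == "__main__":
    reference = "ACGTACGTTAGC"  # made-up reference sequence
    read = "ACGA"               # made-up read
    print(map_read(read, reference, max_mismatches=0))
    print(map_read(read, reference, max_mismatches=1))
```

With the stricter setting the read does not map at all; allowing one mismatch makes it map to two positions, so the same read can be unmapped, uniquely mapped or ambiguous depending on the parameters and on the reference chosen.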
6. Gene-by-gene approach allele call algorithms:
BIGSdb (Jolley, K.A. & Maiden, M.C.J. BMC Bioinf 11, 595 (2010))
Enterobase (https://enterobase.warwick.ac.uk/)
GEP (Genome Profiler) (JCM. 2015 May;53(5):1765-7)
Ridom Seqsphere
Bionumerics (Applied Maths)
Mostly assembly based (yes, it is a lot of work…)
Assembly algorithms have some parameters (mostly k-mer sizes)
Lots of heuristics for allele definition
7. Gene by gene approaches:
What is a locus?
What is an allele?
It depends on the
algorithm(s) used!
However the results are
largely congruent!
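A minimal sketch of one possible allele definition, assuming an allele is an exact sequence match against a per-locus database. This is not the method of BIGSdb, Enterobase or any other tool listed above; the sequences, the `KNOWN_ALLELES` table and both function names are hypothetical.

```python
import hashlib

# Hypothetical allele database for a single locus: sequence -> allele number.
KNOWN_ALLELES = {
    "ATGAAACGT": 1,
    "ATGAAGCGT": 2,
}


def call_allele(sequence, known=KNOWN_ALLELES):
    """Return (allele_id, is_novel) for a locus sequence using exact matching."""
    seq = sequence.upper()
    if seq in known:
        return known[seq], False
    # Unseen sequence: label it with a short hash so the call is reproducible.
    digest = hashlib.sha1(seq.encode("ascii")).hexdigest()[:8]
    return "new-" + digest, True


def profile_distance(profile_a, profile_b):
    """Count loci with different allele calls, ignoring loci missing in either profile."""
    shared = set(profile_a) & set(profile_b)
    return sum(1 for locus in shared if profile_a[locus] != profile_b[locus])
```

A different rule (for example a similarity threshold, or requiring a complete coding sequence) would change which sequences count as the same allele, which is why the answer depends on the algorithm used.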
9. “Formal representation of knowledge as a set of concepts within a domain, and the relationships between those concepts” (Wikipedia)
Domain modeling: represents all the concepts involved in microbial typing by sequence-based methods
Provides a shared vocabulary, where the concepts should be unambiguous
Enables a machine-readable format that software and algorithms can use to interact automatically with multiple databases
Ontologies
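A minimal sketch of "concepts plus relationships" in machine-readable form, using plain (subject, predicate, object) triples. The concept names and the `subclasses_of` helper are hypothetical and are not taken from TypON, GenEpiO or any published ontology; real ontologies are normally written in OWL/RDF and queried with dedicated tools.

```python
# Hypothetical mini-ontology as (subject, predicate, object) triples.
TRIPLES = [
    ("MLST", "is_a", "SequenceBasedTypingMethod"),
    ("wgMLST", "is_a", "SequenceBasedTypingMethod"),
    ("SequenceBasedTypingMethod", "is_a", "TypingMethod"),
    ("Allele", "variant_of", "Locus"),
]


def subclasses_of(concept, triples=TRIPLES):
    """Return every concept linked to `concept` through a chain of is_a relations."""
    found = set()
    frontier = [concept]
    while frontier:
        parent = frontier.pop()
        for subject, predicate, obj in triples:
            if predicate == "is_a" and obj == parent and subject not in found:
                found.add(subject)
                frontier.append(subject)
    return found


if __name__ == "__main__":
    print(sorted(subclasses_of("TypingMethod")))
```

Because the relationships are explicit, a program can answer "which methods are typing methods?" without that knowledge being hard-coded into each database client.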
11. GenEpiO: Combining Different Epi, Lab, Genomics and Clinical Data Fields.
Lab Analytics
Genomics, PFGE
Serotyping, Phage typing
MLST, AMR
Clinical Data
Patient demographics, Medical History, Comorbidities, Symptoms, Health Status
Reporting
Case/Investigation Status
GenEpiO (Genomic Epidemiology Application Ontology)
See draft version at https://github.com/Public-Health-Bioinformatics/IRIDA_ontology
Original slide from Emma Griffiths
12. Public Health Surveillance
Case Cluster Analysis
Result Reporting
Infectious Disease Epidemiology (from case to Intervention)
Lab Surveillance (from sample to strain typing results)
Evidence Collection & Outbreak Investigation
Sample Collection & Processing
Sequence Data Generation & Processing
Bioinformatics Analysis
Result Reporting
Whole Genome Sequencing (SO, ERO, OBI etc)
Quality Control (OBI, ERO)
Anatomy (FMA)
Environment (ENVO)
Food (FoodOn)
Clinical Sampling (OBI)
Custom LIMS
Quality Control (OBI, ERO)
AMR (ARO)
Virulence (PATO)
Phylogenetic Clustering (EDAM)
Mobile Elements (MobiO)
Quality Control (OBI, ERO)
AMR (ARO) LOINC
Surveillance (SurvO)
Demographics (SIO)
Patient History (SIO)
Symptoms (SYMP)
Exposures (ExO)
Source Attribution (IDO)
Travel (IDO)
Transmission (TRANS)
Food (FoodOn)
Geography (OMRSE)
Outbreak Protocols
Surveillance (SurvO)
Food (FoodOn)
Surveillance (SurvO)
Mobile Elements (MobiO)
Infectious Disease (IDO)
Typing (TypON)
Nomenclature & Taxonomy (NCBItaxon)
Original slide from Emma Griffiths /IRIDA
http://foodontology.github.io/foodon/
(pipeline) NGSOnto
17. Transparency of analytical methods
Better definition of concepts (Clinical/Lab/Analysis)
Better tool/database interoperability
• Reproducibility of results
• Added value of analysis
• Custom interfaces for non-bioinformatics specialists
19. UMMI Members
Bruno Gonçalves
Mickael Silva
Miguel Machado
Mário Ramirez
José Melo-Cristino
INESC-ID
Alexandre Francisco
Cátia Vaz
Marta Nascimento
EFSA INNUENDO Project (https://sites.google.com/site/innuendocon/)
Mirko Rossi
FP7 PathoNGenTrace (http://www.patho-ngen-trace.eu/):
Dag Harmsen (Univ. Muenster)
Stefan Niemann (Research Center Borstel)
Keith Jolley, James Bray and Martin Maiden (Univ. Oxford)
Joerg Rothganger (RIDOM)
Hannes Pouseele (Applied Maths)
Genome Canada IRIDA project (www.irida.ca)
Franklin Bristow, Thomas Matthews, Aaron Petkau, Morag Graham and Gary Van Domselaar (NML, PHAC)
Ed Taboada and Peter Kruczkiewicz (Lab Foodborne Zoonoses, PHAC)
Fiona Brinkman (SFU)
William Hsiao (BCCDC)
INTEGRATED RAPID INFECTIOUS DISEASE ANALYSIS