Envisioning a world where everyone helps solve disease
Keynote presented at the Semantic Web for Life Sciences conference in Cambridge, UK, December 9th, 2015
http://www.swat4ls.org/

The talk focuses on the use of ontologies for data integration to support rare disease diagnostics, and how so very many people unbeknownst to the patient or even to the researchers creating the data are involved in a diagnosis.

  1. 1. @monarchinit @ontowonka “Not everyone can become a great artist, but a great artist can come from anywhere” Anton Ego, Ratatouille, 2007, Disney/Pixar Envisioning a world where everyone helps solve disease Melissa Haendel SWAT4LS 2015 Cambridge, England
  2. 2. Faith-based research “I believe that my work on some obscure cell type in some obscure organism will matter to mankind one day” Well, it can, and it does.
  3. 3. Four things it takes to solve an undiagnosed disease 1. Deep phenotyping the human organism 2. Crossing the language barrier 3. A lot of data from a lot of places 4. Very many people (who have faith)
  4. 4. 1. DEEP PHENOTYPING THE HUMAN ORGANISM
  5. 5. Patient Genome /Exome Filter **** ** ***** **** Genomic data Diagnosis, treatment ATCTTAGCACGTTAC ATCTTAGCACGTGAC ATCTTATCACGTTAC ATCTTAGCACGTTAC
  6. 6. What do all those variations do? We only know the phenotypic consequences of mutation of <20% of the human coding genome
  7. 7. Patient Genome /Exome Diagnosis, treatment Filter **** ** ***** **** Genomic data Phenotype Gene-Phenotype Data Environment
  8. 8. We have a common language for sequence data…. ATCTTAGCACGTTAC… ….not so much for phenotypes
  9. 9. CC2.0 European Southern Observatory https://www.flickr.com/photos/esoastronomy/6923443595
  10. 10. Can we help machines understand phenotypes? “Palmoplantar hyperkeratosis” Human phenotype I have absolutely no idea what that means Image credits: "HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG Marcin Wichary [CC BY 2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons
  11. 11. A disease is a collection of phenotypes Patient Disease X Differential diagnosis with similar but non-matching phenotypes is difficult Flat back of head Hypotonia Abnormal skull morphology Decreased muscle mass
  12. 12. Do we *really* need yet another clinical vocabulary? Winnenburg and Bodenreider, ISMB PhenoDay, 2014 UMLS SNOMED CT CHV MedDRA MeSH NCIT ICD10-C ICD9-CM ICD-10 OMIM MedlinePlus Existing clinical vocabularies don’t adequately cover phenotype descriptions
  13. 13. Disease-phenotype associations using an ontology Hyposmia Abnormality of globe location eyeball of camera-type eye sensory perception of smell Abnormal eye morphology Motor neuron atrophy Deeply set eyes motor neuron CL 34571 annotations in 22 species 157534 phenotype annotations 2150 phenotype annotations
  14. 14. Once OMIM is rendered computable, are we done yet? Free text -> HPO enables phenotype semantic similarity matching
  15. 15. Mendelian disease integration Merges sources together using: • equivalence and subclass axioms derived from xrefs • string matching • manual efforts to fill gaps based on phenotypes and anatomical axioms Parkinson’s disease subtypes Different colors = different disease sources https://github.com/monarch-initiative/monarch-disease-ontology
  16. 16. Why we need all the organisms Model data can provide up to 80% phenotypic coverage of the human coding genome
  17. 17. We learn different things from different organisms
  18. 18. 2. CROSSING THE LANGUAGE BARRIER
  19. 19. Ulcerated paws Palmoplantar hyperkeratosis Thick hand skin Image credits: "HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG http://www.guinealynx.info/pododermatitis.html
  20. 20. Challenge: Each database uses their own vocabulary/ontology MP HP MGI HPOA Image credits: "HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG http://www.guinealynx.info/pododermatitis.html
  21. 21. Challenge: Each database uses their own vocabulary/ontology ZFA MP DPO WPO HP OMIA VT FYPO APO SNO MED … … … WB PB FB OMIA MGI RGD ZFIN SGD HPOA IMPC OMIM ICD QTLdb EHR Image credits: "HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG http://www.guinealynx.info/pododermatitis.html
  22. 22. Decomposition of complex concepts allows interoperability Mungall, C. J., Gkoutos, G., Smith, C., Haendel, M., Lewis, S., & Ashburner, M. (2010). Integrating phenotype ontologies across multiple species. Genome Biology, 11(1), R2. doi:10.1186/gb-2010-11-1-r2 “Palmoplantar hyperkeratosis” increased Stratum corneum layer of skin = Human phenotype PATO Uberon Species neutral ontologies, homologous concepts Autopod keratinization GO
  23. 23. Cross-species ontology integration
  24. 24. 3. A LOT OF DATA FROM A LOT OF PLACES
  25. 25. Graph Views Diverse G2P/D source data Source Ontologies Owl Loader Graph Views Monarch App Faceted Browsing Phenotype Matching .ttl .ttl Input Output Pipeline Putting it Together: Data + Ontologies https://github.com/SciGraph/SciGraph
  26. 26. Data Integrated in SciGraph >25 sources >100 species 51M triples 4M curated associations 2.2M G-P / G-D associations
  27. 27. Genotype-phenotype integration One source Two sources 3 or more 9% 91% of our 2.2 Million G2P associations required integrating 2 or more data sources (this number does not even include orthology (Panther)) 91%
  28. 28. Ontology-based phenotype matching www.owlsim.org
  29. 29. Combining genotype and phenotype data for variant prioritization Whole exome Remove off-target and common variants Variant score from allele freq and pathogenicity Phenotype score from phenotypic similarity PHIVE score to give final candidates Mendelian filters https://www.sanger.ac.uk/resources/software/exomiser/
  30. 30. York platelet syndrome and STIM1 Markello T et al. Molecular Genetics and Metabolism 2015, 114: 474 Grosse J, J Clin Invest 2007 117: 3540-50 Impaired platelet aggregation (HP:0003540) Thrombocytopenia (HP:0001873) Abnormal platelet activation (MP:0006298) Thrombocytopenia (MP:0003179) UDP_2542 Stim1Sax/Sax http://www.nature.com/gim/journal/vaop/ncurrent/full/gim2015137a.html
  31. 31. 4. VERY MANY PEOPLE (WHO HAVE FAITH)
  32. 32. Who helped solve the STIM1 UDP_2542 case?
  33. 33. Credit extends beyond the publication • Johannes creates stim1 mouse • Melissa annotates patient UDP_2542 with HPO • Will performs analysis of UDP_2542 that includes stim1 mouse to generate a dataset of prioritized variants • Tom writes publication pmid:25577287 about the STIM1 diagnosis • Tom explicitly credits Will as an author but not Melissa.
  34. 34. Credit is connected Credit to Will is asserted, but credit to Melissa can be inferred
  35. 35. Who is in the graph? Melissa Haendel Peter Robinson Chris Mungall Sebastian Kohler Cindy Smith Nicole Vasilevsky Sandra Dolken Johannes Grosse Attila Braun David Varga-Szabo Niklas Beyersdorf Boris Schneider Lutz Zeitlmann Petra Hanke Patricia Schropp Silke Mühlstedt Carolin Zorn Michael Huber Carolin Schmittwolf Wolfgang Jagla Philipp Yu Thomas Kerkau Harald Schulze Michael Nehls Bernhard Nieswandt Thomas Markello Dong Chen Justin Y. Kwan Iren Horkayne-Szakaly Alan Morrison Olga Simakova Irina Maric Jay Lozier Andrew R. Cullinane Tatjana Kilo Lynn Meister Kourosh Pakzad Sanjay Chainani Roxanne Fischer Camilo Toro James G. White David Adams Cornelius Boerkoel William A. Gahl Cynthia J. Tifft Meral Gunay-Aygun Melissa Haendel David Adams David Draper Bailey Gallinger Joie Davis Nicole Vasilevsky Heather Trang Rena Godfrey Gretchen Golas Catherine Groden Michele Nehrebecky Ariane Soldatos Elise Valkanas Colleen Wahl Lynne Wolfe Elizabeth Lee Amanda Links Will Bone Murat Sincan Damian Smedley Jules Jacobson Nicole Washington Elise Flynn Sebastian Kohler Orion Buske Marta Girdea Michael Brudno Jeremy Band Hans Goeble Karen Balbach Nadine Pfeifer Sandra Werner Christian Linden Clinical/care Pathology Ontologist CS/informatics Curator Basic research
  36. 36. Tracking Evidence and Provenance of G2P Associations Evidence is a collection of information that is used to support a scientific claim or association Provenance is a history of what processes led to the claim being made, what entities participated in these processes Value of Evidence and Provenance Metadata • context to evaluate credibility/confidence • support filtering and analysis of data • detailed history for attribution
  37. 37. Evidence and Provenance for a Variant-Phenotype Association
  38. 38. Who is missing? http://haluzz.deviantart.com/art/Waldo-at-the-hipster-party-273602450
  39. 39. What about patients? Can they help too? HP:0000252 Pref Label: Microcephaly Synonyms: Decreased Head Circumference; Reduced Head Circumference; Small head circumference Suggested Synonyms: Small Head; Little Head; Small Skull; Little Skull; Small Cranium… Small head Microcephaly https://commons.wikimedia.org/wiki/File:Microcephaly.png#/media/File:Microcephaly.png
  40. 40. Job opening https://goo.gl/MlcnR5 Focusing on building ontologies and semantic web technologies to represent research, attribution, provenance, and scholarly communication @ontowonka haendel@ohsu.edu
  41. 41. Funding: NIH Office of Director: 1R24OD011883; NIH-UDP: HHSN268201300036C, HHSN268201400093P; NCINCI/Leidos #15X143, BD2K U54HG007990-S2 (Haussler) & BD2K PA-15-144-U01 (Kesselman) PIs: Chris Mungall, Peter Robinson, Damian Smedley, Tudor Groza, Harry Hochheiser www.monarchinitiative.org/page/team
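
Slides 14 and 28 refer to phenotype semantic similarity matching over an ontology. A minimal sketch of one simple way to do this follows, assuming a toy is-a hierarchy and a Jaccard score over ancestor closures. The term labels, the `PARENTS` table, and the function names are hypothetical illustrations; this is not claimed to be the measure used by the tool at www.owlsim.org.

```python
from typing import Dict, Iterable, List, Set

# Toy is-a hierarchy (child -> parents). Labels are illustrative only,
# not real HPO or MP identifiers.
PARENTS: Dict[str, List[str]] = {
    "phenotypic abnormality": [],
    "abnormal platelet": ["phenotypic abnormality"],
    "thrombocytopenia": ["abnormal platelet"],
    "impaired platelet aggregation": ["abnormal platelet"],
    "abnormal eye": ["phenotypic abnormality"],
    "deeply set eyes": ["abnormal eye"],
}


def ancestors(term: str) -> Set[str]:
    """Return the term together with every term reachable via is-a edges."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def profile_closure(terms: Iterable[str]) -> Set[str]:
    """Union of the ancestor closures of all terms in a phenotype profile."""
    closure: Set[str] = set()
    for term in terms:
        closure |= ancestors(term)
    return closure


def jaccard_similarity(profile_a: Iterable[str], profile_b: Iterable[str]) -> float:
    """Shared ancestors divided by all ancestors of the two profiles."""
    closure_a = profile_closure(profile_a)
    closure_b = profile_closure(profile_b)
    union = closure_a | closure_b
    return len(closure_a & closure_b) / len(union) if union else 0.0


patient = ["thrombocytopenia", "impaired platelet aggregation"]
platelet_model = ["thrombocytopenia"]
eye_model = ["deeply set eyes"]

sim_platelet = jaccard_similarity(patient, platelet_model)
sim_eye = jaccard_similarity(patient, eye_model)
```

Because both profiles are expanded to their ancestors before comparison, two profiles can score above zero without sharing any exact term, which is the property that lets a patient profile match a model organism annotated at a different level of detail.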
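
Slide 29 lists the prioritization steps: remove off-target and common variants, score each variant from allele frequency and pathogenicity, score each gene from phenotypic similarity, then combine. The sketch below walks through those steps under loud assumptions: the frequency cutoff, the rarity formula, the equal weighting, and the gene names other than STIM1 are all hypothetical, and this is not the scoring implemented by Exomiser.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

MAX_FREQ = 0.01  # hypothetical cutoff for calling a variant "common"


@dataclass
class Variant:
    gene: str
    allele_freq: float    # population allele frequency, 0..1
    pathogenicity: float  # predicted pathogenicity, 0..1
    on_target: bool       # falls inside the captured exome regions


def variant_score(variant: Variant) -> float:
    """Assumed formula: rarer and more pathogenic variants score higher."""
    rarity = 1.0 - min(variant.allele_freq / MAX_FREQ, 1.0)
    return rarity * variant.pathogenicity


def prioritize(
    variants: List[Variant], phenotype_scores: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Filter, then rank genes by an equally weighted combined score."""
    kept = [v for v in variants if v.on_target and v.allele_freq <= MAX_FREQ]
    scored = [
        (v.gene, 0.5 * variant_score(v) + 0.5 * phenotype_scores.get(v.gene, 0.0))
        for v in kept
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


variants = [
    Variant("STIM1", 0.0, 0.90, True),
    Variant("GENE_B", 0.0, 0.95, True),   # hypothetical gene
    Variant("GENE_C", 0.2, 0.99, True),   # common, so filtered out
    Variant("GENE_D", 0.0, 0.90, False),  # off-target, so filtered out
]
# Hypothetical phenotype similarity between the patient and each gene's models.
phenotype_scores = {"STIM1": 0.8, "GENE_B": 0.1}

ranked = prioritize(variants, phenotype_scores)
```

In this toy input GENE_B has the slightly higher variant score, but STIM1 ranks first once the phenotype score is added, which illustrates why the slide combines both kinds of evidence.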
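
Slides 33 and 34 distinguish asserted credit (authorship) from credit that can be inferred by following how artifacts were derived from one another. A minimal sketch of that inference follows, assuming a toy graph built from the statements on slide 33. The dictionary names and artifact labels are hypothetical, and only the facts stated on the slide are encoded.

```python
from typing import Dict, List, Set

# Who made each artifact (from slide 33).
CREATED_BY: Dict[str, List[str]] = {
    "stim1 mouse": ["Johannes"],
    "HPO annotation of UDP_2542": ["Melissa"],
    "prioritized variant dataset": ["Will"],
    "pmid:25577287": ["Tom"],
}

# Which artifacts each artifact was built from (from slide 33).
DERIVED_FROM: Dict[str, List[str]] = {
    "prioritized variant dataset": ["stim1 mouse", "HPO annotation of UDP_2542"],
    "pmid:25577287": ["prioritized variant dataset"],
}

# Credit stated explicitly on the publication.
ASSERTED: Dict[str, Set[str]] = {"pmid:25577287": {"Tom", "Will"}}


def contributors(artifact: str) -> Set[str]:
    """Everyone who created the artifact or anything it was derived from."""
    people: Set[str] = set()
    seen: Set[str] = set()
    stack = [artifact]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        people.update(CREATED_BY.get(current, []))
        stack.extend(DERIVED_FROM.get(current, []))
    return people


def inferred_credit(artifact: str) -> Set[str]:
    """Contributors reachable through the graph but not explicitly asserted."""
    return contributors(artifact) - ASSERTED.get(artifact, set())


inferred = inferred_credit("pmid:25577287")
```

In this toy model Johannes also surfaces as inferred, simply because no asserted credit is recorded for him here; the slide itself only makes the claim about Will and Melissa.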