
The Application of the Human Phenotype Ontology

mhaendel

Presented at the II International Summer School for Rare Disease and Orphan Drug Registries, September 15-19, 2014, organized by the National Centre for Rare Diseases, Istituto Superiore di Sanità (ISS), Rome, Italy. Note the extensive contributions by the many consortium members and partners listed in the acknowledgements slide.

Melissa Haendel
Sept 19th, 2014
THE APPLICATION OF THE HUMAN PHENOTYPE ONTOLOGY
II International Summer School
RARE DISEASE AND ORPHAN DRUG REGISTRIES
OUTLINE
• Why phenotyping is hard
• About Ontologies
• Diagnosing known diseases
• Getting the phenotype data
• How much phenotyping is enough?
• Model organism data for undiagnosed diseases
PHENOTYPIC BEGINNINGS 
http://rundangerously.blogspot.com/2010/06/inconvenient-summer-head-cold.html
http://www.vetnext.com/images
http://commons.wikimedia.org/wiki/File:CleftLip1.png
Phenotyping SEEMS like a simple task, but there are shades of grey and nuances that are difficult to convey.
http://anthro.palomar.edu/abnormal/abnormal_4.htm 
http://www.pyroenergen.com/articles07/downs-syndrome.htm 
http://www.theguardian.com/commentisfree/2009/oct/27/downs-syndrome-increase-terminations
THE CONSTELLATION OF PHENOTYPES SIGNIFIES THE DISEASE – A ‘PROFILE’
http://www.learn.ppdictionary.com/prenatal_development_2.htm 
http://www.pyroenergen.com/articles07/downs-syndrome.htm
http://www.theguardian.com/commentisfree/2009/oct/27/downs-syndrome-increase-terminations 
http://anthro.palomar.edu/abnormal/abnormal_4.htm
CLINICAL PHENOTYPING 
Often free text or checkboxes 
Dysmorphic features 
• df 
• dysmorphic 
• dysmorphic faces 
• dysmorphic features 
Congenital malformation/anomaly: 
• congenital anomaly 
• congenital malformation 
• congenital anamoly 
• congenital anomly 
• congential anomaly 
• congentital anomaly 
• cong. m. 
• cong. Mal 
• cong. malfor 
• congenital malform 
• congenital m. 
• multiple congenital anomalies 
• multiple congenital abormalities 
• multiple congenital abnormalities 
Examples of lists: 
* dd. cong. malfor. behav. pro. 
* dd. mental retardation 
* df< delayed puberty 
* df&lt 
* dd df mr 
* mental retar.short stature
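
The variant spellings listed above show why free-text phenotype entries are hard to compute on. As a minimal sketch only, and not the method used in the talk, the following Python snippet maps such entries onto one canonical label each. The lookup table, the function names, and the 0.85 similarity cutoff are all assumptions made for illustration. The canonical labels are plain strings, not real ontology identifiers.

```python
import difflib
import re

# Hypothetical lookup table: canonical label -> variants copied from the slide.
# The misspelled variants are left out on purpose so the fuzzy step has work to do.
VARIANTS = {
    "dysmorphic features": [
        "df", "dysmorphic", "dysmorphic faces", "dysmorphic features",
    ],
    "congenital anomaly": [
        "congenital anomaly", "congenital malformation", "congenital malform",
        "congenital m.", "cong. m.", "cong. mal", "cong. malfor",
    ],
}


def clean(text):
    """Lowercase and collapse runs of periods and whitespace into single spaces."""
    return re.sub(r"[.\s]+", " ", text.lower()).strip()


# Index every cleaned variant by its canonical label for exact lookups.
INDEX = {clean(v): label for label, vs in VARIANTS.items() for v in vs}


def normalize(entry, cutoff=0.85):
    """Return the canonical label for a free-text entry, or None if unrecognized."""
    key = clean(entry)
    if key in INDEX:
        return INDEX[key]
    # Fall back to approximate matching to absorb typos such as "congential".
    # The cutoff is an assumed value, not one taken from the source.
    close = difflib.get_close_matches(key, list(INDEX), n=1, cutoff=cutoff)
    return INDEX[close[0]] if close else None


for raw in ["DF", "cong. Mal", "congential anomaly", "short stature"]:
    print(f"{raw!r} -> {normalize(raw)}")
```

A hand-maintained table like this covers only the spellings someone has already seen, and the fuzzy fallback can mis-assign short abbreviations. Both limits are reasons to capture phenotypes with a controlled vocabulary at entry time rather than cleaning free text afterwards.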

Recently uploaded

Volatile Oils-Introduction for pharmacy students and graduates
Volatile Oils-Introduction for pharmacy students and graduatesVolatile Oils-Introduction for pharmacy students and graduates
Volatile Oils-Introduction for pharmacy students and graduatesAhmed Metwaly
 
Salesforce Starter Package Presentation.
Salesforce Starter Package Presentation.Salesforce Starter Package Presentation.

The Application of the Human Phenotype Ontology

  • 1. Melissa Haendel Sept 19th, 2014 THE APPLICATION OF THE HUMAN PHENOTYPE ONTOLOGY II International Summer School RARE DISEASE AND ORPHAN DRUG REGISTRIES
  • 2. OUTLINE
      - Why phenotyping is hard
      - About Ontologies
      - Diagnosing known diseases
      - Getting the phenotype data
      - How much phenotyping is enough?
      - Model organism data for undiagnosed diseases
  • 3. PHENOTYPIC BEGINNINGS http://rundangerously.blogspot.com/2010/06/inconvenient-summer-head-cold.html http://www.vetnext.com/images http://commons.wikimedia.org/wiki/File:CleftLip1.png Phenotyping SEEMS like a simple task, but there are shades of grey and nuances that are difficult to convey.
  • 5. THE CONSTELLATION OF PHENOTYPES SIGNIFIES THE DISEASE – A ‘PROFILE’ http://www.learn.ppdictionary.com/prenatal_development_2.htm http://www.pyroenergen.com/articles07/downs-syndrome.htm http://www.theguardian.com/commentisfree/2009/oct/27/downs-syndrome-increase-terminations http://anthro.palomar.edu/abnormal/abnormal_4.htm
  • 6. CLINICAL PHENOTYPING
      Often free text or checkboxes
      Dysmorphic features: df; dysmorphic; dysmorphic faces; dysmorphic features
      Congenital malformation/anomaly: congenital anomaly; congenital malformation; congenital anamoly; congenital anomly; congential anomaly; congentital anomaly; cong. m.; cong. Mal; cong. malfor; congenital malform; congenital m.; multiple congenital anomalies; multiple congenital abormalities; multiple congenital abnormalities
      Examples of lists:
        * dd. cong. malfor. behav. pro.
        * dd. mental retardation
        * df< delayed puberty
        * df&lt
        * dd df mr
        * mental retar.short stature
  • 7. SEARCHING FOR PHENOTYPES USING TEXT ALONE IS INSUFFICIENT
      OMIM Query | # Records
      “large bone” | 785
      “enlarged bone” | 156
      “big bone” | 16
      “huge bones” | 4
      “massive bones” | 28
      “hyperplastic bones” | 12
      “hyperplastic bone” | 40
      “bone hyperplasia” | 134
      “increased bone growth” | 612
  • 8. TERMS SHOULD BE WELL DEFINED SO THEY GET USED PROPERLY We need to capture synonyms and use unique labels
  • 9. SO WHAT IS THE PROBLEM?
      - Obviously similar phenotype descriptions mean the same thing to you, but not to a computer:
          - generalized amyotrophy
          - generalized muscle, atrophy
          - muscular atrophy, generalized
      - Many publications have little information about the actual phenotypic features seen in patients with particular mutations
      - Databases cannot talk to one another about phenotypes
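The synonym problem on slide 9 can be illustrated with a minimal Python sketch. It assumes a hand-written synonym table; the identifier `EX:0000001` is a placeholder invented for this example, not a real HPO term ID, and the normalization rule (lowercase, strip punctuation) is only one possible choice.

```python
import re

# Hypothetical synonym table. "EX:0000001" is a placeholder identifier,
# not a real HPO term ID.
SYNONYMS = {
    "generalized amyotrophy": "EX:0000001",
    "generalized muscle atrophy": "EX:0000001",
    "muscular atrophy generalized": "EX:0000001",
}


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(text.split())


def to_term_id(description: str):
    """Return the shared identifier for a free-text description, or None."""
    return SYNONYMS.get(normalize(description))


for phrase in ("generalized amyotrophy",
               "generalized muscle, atrophy",
               "muscular atrophy, generalized"):
    print(phrase, "->", to_term_id(phrase))
```

All three spellings resolve to the same identifier, which is what lets a computer treat them as one phenotype. An unlisted spelling returns `None`, which is why curated synonym lists matter.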
  • 10. OUTLINE
      - Why phenotyping is hard
      - About Ontologies
      - Diagnosing known diseases
      - Getting the phenotype data
      - How much phenotyping is enough?
      - Model organism data for undiagnosed diseases
  • 11. ONTOLOGIES CAN HELP. A controlled vocabulary of logically defined, inter-related terms used to annotate data
      - Use of common or logically related terms across databases enables integration
      - Relationships between terms allow annotations to be grouped in scientifically meaningful ways
      - Reasoning software enables computation of inferred knowledge
      - Some well-known ontologies are SNOMED-CT, Foundational Model of Anatomy, Gene Ontology, Linnaean Taxonomy of species
  • 12. OTHER COMMON USES OF ONTOLOGIES
  • 13. HUMAN PHENOTYPE ONTOLOGY
      Used to annotate: Patients; Disorders; Genotypes; Genes; Sequence variants
      Example terms: Abnormality of pancreatic islet cells; Reduced pancreatic beta cells; Abnormality of endocrine pancreas physiology; Pancreatic islet cell adenoma; Abnormality of exocrine pancreas physiology; Insulinoma; Multiple pancreatic beta-cell adenomas
      Mappings to SNOMED-CT, UMLS, MeSH, ICD, etc.
      Köhler et al. The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data. Nucleic Acids Res. 2014 Jan 1;42(1):D966-74.
  • 14. USING A CONTROLLED VOCABULARY TO LINK PHENOTYPES TO DISEASES Failure to Thrive Chromosome 21 Trisomy Flat Head Abnormal Ears Umbilical Hernia Broad Hands
  • 15. SURVEY OF ANNOTATIONS IN DISEASE CORPUS 7000+ diseases OMIM + Orphanet + Decipher (ClinVar coming soon) 111,000+ annotations Phenotype annotations are unevenly distributed across different anatomical systems
  • 16. HOW DOES HPO RELATE TO OTHER CLINICAL VOCABULARIES? Winnenburg and Bodenreider, ISMB PhenoDay, 2014
  • 17. LOGICAL TERM DEFINITION Definitions are of the following Genus-Differentia form: X = a Y which has one or more differentiating characteristics, where Y is the is_a parent of X. Definition of a cylinder: Surface formed by the set of lines perpendicular to a plane, which pass through a given circle in that plane. Definition: Blue cylinder = Cylinder that has color blue (Blue cylinder is_a Cylinder). Definition: Red cylinder = Cylinder that has color red (Red cylinder is_a Cylinder).
  • 18. ABOUT REASONERS A piece of software able to infer logical consequences from a set of asserted facts or axioms. They are used to check the logical consistency of the ontologies and to extend the ontologies with "inferred" facts or axioms For example, a reasoner would infer: Major premise: All mortals die. Minor premise: Some men are mortals. Conclusion: Some men die.
  • 19. PHENOTYPES CAN BE CLASSIFIED IN MULTIPLE WAYS
      Terms shown: Abnormality of the eye; Abnormal eye morphology; Vitreous hemorrhage; Abnormality of the cardiovascular system; Abnormal eye physiology; Hemorrhage of the eye; Internal hemorrhage; Abnormality of the globe; Abnormality of blood circulation
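A small Python sketch of what "classified in multiple ways" means in practice: one term can have more than one is_a parent, so walking upward reaches several high-level branches. The edges below are an assumption made for illustration using the term names on slide 19; the transcript lists the terms but not the arrows between them.

```python
# Assumed is_a edges (child -> set of direct parents). The term names come
# from slide 19; the edges themselves are illustrative guesses.
PARENTS = {
    "Vitreous hemorrhage": {"Hemorrhage of the eye", "Internal hemorrhage"},
    "Hemorrhage of the eye": {"Abnormal eye physiology"},
    "Abnormal eye physiology": {"Abnormality of the eye"},
    "Internal hemorrhage": {"Abnormality of blood circulation"},
    "Abnormality of blood circulation": {"Abnormality of the cardiovascular system"},
}


def ancestors(term, parents):
    """Return every term reachable from `term` by following is_a edges."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


for name in sorted(ancestors("Vitreous hemorrhage", PARENTS)):
    print(name)
```

Under these assumed edges, a patient annotated with "Vitreous hemorrhage" is found by a query for eye abnormalities and by a query for cardiovascular abnormalities, without either being recorded explicitly.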
  • 20. PHENOTYPE MATCHING
      Patient phenotype: Resting tremors; REM disorder; Unstable posture; Myopia; Neuronal loss in Substantia Nigra
      Phenotype of known variant: Resting tremors; REM disorder; Unstable posture; Nystagmus; Neuronal loss in Substantia Nigra
  • 21. Legend: Syndrome term; Query term; Overlap between query and disease
      a) Noonan Syndrome: abn. of the eye; abn. of the ocular region; abn. of the eyelid; abn. of globe localization or size; abn. of the palpebral fissures; hypertelorism; downward slanting palpebral fissures
      b) Opitz Syndrome: abn. of the eye; abn. of the ocular region; abn. of the eyelid; abn. of globe localization or size; abn. of the palpebral fissures; downward slanting palpebral fissures; telecanthus; hypertelorism
      c) Query (Q): hypertelorism; downward slanting palpebral fissures
         Noonan Syndrome: downward slanting palpebral fissures (3.78); hypertelorism (3.05)
         Opitz Syndrome: hypertelorism (3.05); telecanthus, matched to downward slanting palpebral fissures at 2.45 (IC of abn. of the eyelid)
         $\mathrm{sim}(Q, \text{Noonan}) = \frac{3.78 + 3.05}{2} = 3.42$
         $\mathrm{sim}(Q, \text{Opitz}) = \frac{2.45 + 3.05}{2} = 2.75$
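The arithmetic on slide 21 can be reproduced with a short Python sketch: for each query term, take the information content (IC) of the most informative ancestor it shares with any term in the disease profile, then average over the query terms. The IC values are the three shown on the slide; the trimmed ancestor sets and all function names are assumptions made for this illustration, and this is not presented as the exact algorithm used by any particular tool.

```python
from statistics import mean

# IC values as shown on slide 21. Terms without a value on the slide are
# left out and treated as IC 0.0 in this sketch.
IC = {
    "downward slanting palpebral fissures": 3.78,
    "hypertelorism": 3.05,
    "abn. of the eyelid": 2.45,
}

# Assumed ancestor sets (each term includes itself), trimmed to the nodes
# needed to reproduce the slide's numbers.
ANCESTORS = {
    "downward slanting palpebral fissures": {
        "downward slanting palpebral fissures", "abn. of the eyelid"},
    "telecanthus": {"telecanthus", "abn. of the eyelid"},
    "hypertelorism": {"hypertelorism"},
}

QUERY = ["hypertelorism", "downward slanting palpebral fissures"]
NOONAN = ["hypertelorism", "downward slanting palpebral fissures"]
OPITZ = ["hypertelorism", "telecanthus"]


def best_match_ic(query_term, profile):
    """IC of the most informative ancestor shared with any profile term."""
    best = 0.0
    for term in profile:
        for shared in ANCESTORS[query_term] & ANCESTORS[term]:
            best = max(best, IC.get(shared, 0.0))
    return best


def similarity(query, profile):
    """Average best-match IC over the query terms."""
    return mean(best_match_ic(q, profile) for q in query)


print("sim(Q, Noonan) =", similarity(QUERY, NOONAN))
print("sim(Q, Opitz)  =", similarity(QUERY, OPITZ))
```

Noonan scores higher because both query terms match exactly, while for Opitz one query term only matches at the more general, less informative "abn. of the eyelid" level.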
  • 23. OUTLINE
      - Why phenotyping is hard
      - About Ontologies
      - Diagnosing known diseases
      - Getting the phenotype data
      - How much phenotyping is enough?
      - Model organism data for undiagnosed diseases
  • 24. THE YET-TO-BE DIAGNOSED PATIENT
      - Known disorders not recognized during prior evaluations?
      - Atypical presentation of known disorders?
      - Combinations of several disorders?
      - Novel, unreported disorder?
  • 25. EXOME ANALYSIS Remove off-target, common variants, and variants not in known disease causing genes Recessive, de novo filters http://compbio.charite.de/PhenIX/ Target panel of 2741 known Mendelian disease genes Compare phenotype profiles using data from: HGMD, ClinVar, OMIM, Orphanet Zemojtel et al. Sci Transl Med 3 September 2014: Vol. 6, Issue 252, p.252ra123
  • 26. PHENIX PERFORMANCE TESTING Figure removed due to restrictions. Please see the paper: http://stm.sciencemag.org/content/6/252/252ra123.full Simulated datasets created by spiking DAG panel generated VCF file with the causative mutation removed
  • 27. CONTROL PATIENTS WITH KNOWN MUTATIONS
      Inheritance | Gene | Average Rank
      AD | ACVR1, ATL1, BRCA1, BRCA2, CHD7 (4), CLCN7, COL1A1, COL2A1, EXT1, FGFR2 (2), FGFR3, GDF5, KCNQ1, MLH1 (2), MLL2/KMT2D, MSH2, MSH6, MYBPC3, NF1 (6), P63, PTCH1, PTH1R (2), PTPN11 (2), SCN1A, SOS1, TRPS1, TSC1, WNT10A | 1.7
      AR | ATM, ATP6V0A2, CLCN1 (2), LRP5, PYCR1, SLC39A4 | 5
      X | EFNB1, MECP2 (2), DMD, PHF6 | 1.8
  • 28. WORKFLOW FOR CLINICAL EXOME
      Flowchart steps: Suspected genetic disease; DRG sequencing; Deep phenotyping; ANALYSIS; Top ranked candidates; Clinical rounds; Choose best candidate; Sanger validation; Cosegregation studies
      Flowchart outcomes: Passes / Fails; Consistent / Inconsistent; Diagnosis; Exclude candidate gene; Reconsider short list
      Variant Analysis: Annotation sources: HGMD, MAF (dbSNP, ESP), ClinVar. Prediction criteria: Predicted pathogenicity; Variant class; Location in DRG target region
      Computational Phenotype Analysis: Ontology: HPO, MP. Annotation sources: OMIM, Orphanet, MGI... Prediction criteria: Semantic similarity; Mode of inheritance
  • 29. PHENIX HELPED DIAGNOSE 11/40 PATIENTS
      global developmental delay (HP:0001263); delayed speech and language development (HP:0000750); motor delay (HP:0001270); proportionate short stature (HP:0003508); microcephaly (HP:0000252); feeding difficulties (HP:0011968); congenital megaloureter (HP:0008676); cone-shaped epiphysis of the phalanges of the hand (HP:0010230); sacral dimple (HP:0000960); hyperpigmented/hypopigmented macules (HP:0007441); hypertelorism (HP:0000316); abnormality of the midface (HP:0000309); flat nose (HP:0000457); thick lower lip vermilion (HP:0000179); thick upper lip vermilion (HP:0000215); full cheeks (HP:0000293); short neck (HP:0000470)
  • 30. University of Queensland, University of Sydney
  • 31. SKELETOME PATIENT ARCHIVE
      - Integration with the HPO, Orphanet, and Monarch Initiative
      - Automated phenotyping from clinical summaries
      - Collaborative diagnosis
      University of Queensland, University of Sydney
  • 32. THE FACES OF RARE DISEASES
      - Screening and diagnosis
      - Treatment monitoring
      - Surgical planning and audit
      - Genotype-phenotype correlation
      - Cross-species comparisons
      - Face to text conversion for text mining
      Non-invasive, non-irradiating deeply precise 3D facial analysis
      University of Western Australia
  • 33. OUTLINE
      - Why phenotyping is hard
      - About Ontologies
      - Diagnosing known diseases
      - Getting the phenotype data
      - How much phenotyping is enough?
      - Model organism data for undiagnosed diseases
  • 35. USING ONTOLOGIES IN THE CLINIC
      - Ontologies are large (HPO has > 10,000 terms) and difficult to navigate
      - Mapping data to an ontology post-visit is time consuming and prone to error
      - Best time to phenotype using ontologies is during the patient visit
      - Goals of PhenoTips
          - Make deep phenotyping simple
          - Make it “faster than paper”
  • 36. PhenomeCentral is a Matchmaker
      - Lets you know about other similar patients
      - Lets you easily connect with other users
      Each Patient Record can be:
      - Public: Anyone can see the record
      - Private: Only specified users/consortia can see the record
      - Matchable: The record cannot be seen, but can be “discovered” by users who submit similar patients
  • 37. STEP 1: ADD PATIENT
      - Can use the interface built into PhenomeCentral
      - Can export data directly from a local PhenoTips instance
      - Add a vcf file (or list of genes)
      - Set each record as Private, Public or Matchable
  • 38. STEP 2: SEE PATIENTS SIMILAR TO YOURS
  • 39. STEP 3: CONTACT THE SUBMITTER OF THE OTHER DATASET
  • 40. OUTLINE
      - Why phenotyping is hard
      - About Ontologies
      - Diagnosing known diseases
      - Getting the phenotype data
      - How much phenotyping is enough?
      - Model organism data for undiagnosed diseases
  • 41. HOW MUCH PHENOTYPING IS ENOUGH?
      - How many annotations…?
      - How many different categories?
      - How many within each?
  • 42. Not everything that counts can be counted and not everything that can be counted counts -Albert Einstein
  • 43. METHOD: DERIVE BY CATEGORY REMOVAL
      - Remove annotations that are subclasses of a single high-level node
      - Repeat for each 1° subclass
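The category-removal method on slide 43 could be sketched in Python as follows. The sketch assumes each annotation has already been mapped to its high-level categories (a term may belong to more than one); the toy profile and its category assignments are hypothetical, chosen only to show the mechanics.

```python
# Hypothetical mapping from phenotype term to its high-level categories.
CATEGORIES_OF = {
    "short neck": {"head and neck"},
    "hypertelorism": {"eye", "head and neck"},
    "proportionate short stature": {"growth"},
    "motor delay": {"nervous system"},
}

PROFILE = set(CATEGORIES_OF)


def derive_by_category_removal(profile, categories_of):
    """Build one derived profile per category by dropping that category's terms."""
    all_categories = set()
    for term in profile:
        all_categories |= categories_of[term]
    return {
        category: {t for t in profile if category not in categories_of[t]}
        for category in sorted(all_categories)
    }


for category, derived in derive_by_category_removal(PROFILE, CATEGORIES_OF).items():
    print(f"without {category}: {sorted(derived)}")
```

Each derived profile can then be scored against the original disease to see how much that single category contributes to recognizability.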
  • 44. Example: Schwartz-Jampel Syndrome, Type I to test influence of a single phenotypic category
  • 45. Example: Schwartz-Jampel Syndrome derivations to test influence of a single phenotypic category
  • 47. SEMANTIC SIMILARITY ALGORITHMS ARE ROBUST IN THE FACE OF MISSING INFORMATION
      Plot axes: Similarity of Derived Disease to Original; Derived Disease Profile Rank
      - (avg) 92% of derived diseases are most-similar to original disease
      - Severity of impact follows proportion of phenotype
  • 48. METHOD: DERIVE BY LIFTING
      - Iteratively map each class to their direct superclass(es)
      - Keep only leaf nodes
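One way to read the lifting method on slide 48 is sketched below in Python: replace every term by its direct superclass(es), then discard any term that is an ancestor of another term in the result, so only the most specific (leaf) terms remain. The parent edges are assumptions based on the eye-related terms named on slide 21, and the function names are invented for this example.

```python
# Assumed is_a edges (child -> set of direct parents), for illustration only.
PARENTS = {
    "downward slanting palpebral fissures": {"abn. of the palpebral fissures"},
    "abn. of the palpebral fissures": {"abn. of the eyelid"},
    "abn. of the eyelid": {"abn. of the ocular region"},
    "hypertelorism": {"abn. of globe localization or size"},
    "abn. of globe localization or size": {"abn. of the ocular region"},
    "abn. of the ocular region": {"abn. of the eye"},
}


def ancestors(term, parents):
    """Return all strict ancestors of `term`."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def lift_once(profile, parents):
    """Map each term to its direct parents, then keep only leaf terms."""
    lifted = set()
    for term in profile:
        # A root term has no parents and is kept as-is.
        lifted |= parents.get(term) or {term}
    redundant = set()
    for term in lifted:
        redundant |= ancestors(term, parents)
    return lifted - redundant


profile = {"downward slanting palpebral fissures", "hypertelorism"}
for step in range(1, 4):
    profile = lift_once(profile, PARENTS)
    print(f"after lift {step}: {sorted(profile)}")
```

Each lift makes the profile more general, which is the kind of derived, less specific profile the experiment compares against the original disease.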
  • 49. SEMANTIC SIMILARITY ALGORITHMS ARE SENSITIVE TO SPECIFICITY OF INFORMATION
      Plot axes: Similarity of Derived Disease to Original; Derived Disease Profile Rank
      - Severity of impact increases with more-general phenotypes
  • 50. ANNOTATION SUFFICIENCY SCORE http://monarchinitiative.org/page/services
  • 51. OUTLINE
      - Why phenotyping is hard
      - About Ontologies
      - Diagnosing known diseases
      - Getting the phenotype data
      - How much phenotyping is enough?
      - Model organism data for undiagnosed diseases
  • 52. WHAT TO DO WHEN WE CAN’T DIAGNOSE WITH A KNOWN DISEASE?
  • 53. MODELS RECAPITULATE VARIOUS PHENOTYPIC ASPECTS OF DISEASE B6.Cg-Alms1foz/foz/J increased weight, adipose tissue volume, glucose homeostasis altered ALMS1(NM_015120.4) [c.10775delC] + [-] GENOTYPE PHENOTYPE obesity, diabetes mellitus, insulin resistance increased food intake, hyperglycemia, insulin resistance kcnj11c14/c14; insrt143/+(AB) ?
  • 54. HOW MUCH PHENOTYPE DATA?
      - Human genes have poor phenotype coverage
      GWAS + ClinVar + OMIM
  • 55. HOW MUCH PHENOTYPE DATA?
      - Human genes have poor phenotype coverage
      What else can we leverage?
      GWAS + ClinVar + OMIM
  • 56. HOW MUCH PHENOTYPE DATA?
      - Human genes have poor phenotype coverage
      What else can we leverage? …animal models
      Orthology via PANTHER v9
  • 57. COMBINED, HUMAN AND MODEL PHENOTYPES CAN BE LINKED TO >75% HUMAN GENES Orthology via PANTHER v9
  • 58. EACH MODEL CONTRIBUTES DIFFERENT PHENOTYPES Data from MGI, ZFIN, & HPO, reasoned over with cross-species phenotype ontology https://code.google.com/p/phenotype-ontologies/
  • 59. PROBLEM: CLINICAL AND MODEL PHENOTYPES ARE DESCRIBED DIFFERENTLY
  • 60. SOLUTION: BRIDGING SEMANTICS anatomical structure endoderm of foregut lung bud respiration organ lung organ foregut is_a (SubClassOf) part_of develops_from capable_of alveolus organ part alveolus of lung FMA:lung MA:lung endoderm GO: respiratory gaseous exchange MA:lung alveolus FMA: pulmonary alveolus is_a (taxon equivalent) NCBITaxon: Mammalia EHDAA: lung bud only_in_taxon pulmonary acinus alveolar sac lung primordium swim bladder respiratory primordium NCBITaxon: Actinopterygii Mungall, C. J., Torniai, C., Gkoutos, G. V., Lewis, S. E., & Haendel, M. A. (2012). Uberon, an integrative multi-species anatomy ontology. Genome Biology, 13(1), R5. doi:10.1186/gb-2012-13-1-r5
  • 61. PHENOTYPE REPRESENTATION REQUIRES MORE THAN “PHENOTYPE ONTOLOGIES”
      Gene Ontology: glucose metabolism (GO:0006006); Gene/protein function data
      Chemical: glucose (CHEBI:17234), pyruvate (CHEBI:15361); Metabolomics, toxicogenomics data
      Disease: type II diabetes mellitus (DOID:9352); Disease & phenotype data
      Cell: pancreatic beta cell (CL:0000169); transcriptomic data
  • 62. MODELS BASED ON PHENOTYPIC SIMILARITY Washington, Haendel, et al. (2009). Linking Human Diseases to Animal Models Using Ontology-Based Phenotype Annotation. PLoS Biol, 7(11). doi:10.1371/journal.pbio.1000247
  • 63. OWLSIM: PHENOTYPE SIMILARITY ACROSS PATIENTS OR ORGANISMS
      Resting tremors; REM disorder; Shuffling gait; Unstable posture; Neuronal loss in Substantia Nigra; Constipation; Hyposmia
      stereotypic behavior; abnormal EEG; decreased stride length; poor rotarod performance; axon degeneration; decreased gut peristalsis; failure to find food
      abnormal motor function; sleep disturbance; abnormal locomotion; abnormal coordination; CNS neuron degeneration; abnormal digestive physiology; abnormal olfaction
      https://code.google.com/p/owltools/wiki/OwlSim
  • 64. MONARCH PHENOTYPE DATA
      Columns: Species; Data source; Genes; Genotypes; Variants; Phenotype annotations; Diseases
      mouse, MGI: 13,433; 59,087; 34,895; 271,621
      fish, ZFIN: 7,612; 25,588; 17,244; 81,406
      fly, Flybase: 27,951; 91,096; 108,348; 267,900
      worm, Wormbase: 23,379; 15,796; 10,944; 543,874
      human, HPOA: 112,602; 7,401
      human, OMIM: 2,970; 4,437; 3,651
      human, ClinVar: 3,215; 100,523; 445,241; 4,056
      human, KEGG: 2,509; 3,927; 1,159
      human, ORPHANET: 3,113; 5,690; 3,064
      human, CTD: 7,414; 23,320; 4,912
      Also in the system: Rat; IMPC; GO annotations; Coriell cell lines; OMIA; MPD; Yeast; CTD; GWAS; Panther, Homologene orthologs; BioGrid interactions; Drugbank; AutDB; Allen Brain …157 sources to date
      Coming soon: Animal QTLs for pig, cattle, chicken, sheep, trout, dog, horse
  • 65. EXOMISER METHOD Cohort VCF file Homo rec De novo dom Compound het X-linked Exomiser Filters: Mendelian Frequency Candidates HP https://www.sanger.ac.uk/resources/databases/exomiser/query/exomiser2
  • 66. EXOMISER RESULTS ON NIH UNDIAGNOSED DISEASE PROGRAM PATIENTS 9 previously diagnosed families Identified causative variants with a rank of at least 7/408 potential variants 21 families without identified disorders We have now prioritized variants in STIM1, ATP13A2, PANK2, and CSF1R in 5 different families (2 STIM1 families) Bone et al., submitted
  • 67. UDP_2731 Patient phenotypes Sh3kbp1 tm1Ivdi -/- Gait apraxia Spasticity Thyroid stimulating hormone excess Behavioural/Psychiatric Abnormality hyperactivity hyperactivity increased dopamine level increased exploration in new environment abnormal locomotor behavior Abnormal voluntary movement Abnormality of the endocrine system Behavioral abnormality
  • 68. WALKING THE INTERACTOME Microcephaly Myoclonus Myoclonus YARS Microcephaly Akinesia Choreoathetosis Microcephaly musculoskeletal movement phenotype Involuntary movements IL41L IARS IARS2 MARS AARS Abnormal stereopsis Visual impairment abnormal visual perception Patient phenotypes Combined Oxidative Phosphorylation Deficiency 14 FARS2 WARS2 ? AIMP1 UDP_1166
  • 69. FINDING COLLABORATORS FOR FUNCTIONAL VALIDATION Patient Phenotype profile Phenotyping experts
  • 70. PHENOVIZ: INTEGRATE ALL HUMAN, MOUSE, AND FISH DATA TO UNDERSTAND CNVS
      Desktop application for differential diagnostics in CNVs
      - Explain manifestations of CNV diseases based on genes contained in CNV. E.g., supravalvular aortic stenosis in Williams syndrome can be explained by haploinsufficiency for elastin
      - Double the number of explanations using model data
      Doelken, Köhler, et al. (2013) Dis Model Mech 6:358-72
  • 71. A LOOK AT THE HPO
  • 72. WHO USES THE HPO?
      - Bayés, Àlex, et al. Nature neuroscience 2011
      - Castellano, Sergi, et al. PNAS 2014
      - Corpas, Manuel, et al. Current Protocols in Human Genetics 2012
      - Sifrim, Alejandro, et al. Nature methods 2013
      - Lappalainen, Ilkka, et al. Nucleic acids research 2013
      - Firth, Helen V., and Caroline F. Wright. Developmental Medicine & Child Neurology 2011
      - Many more…
  • 73. ADVANTAGES OF HPO
      - Widely used, flexible, freely available, and community supported resource
      - Prioritization of candidate variants through tools such as PhenIX, Exomiser, and others
      - Extensive links to model organism ontologies, allowing selection of optimal models for wet-lab validation and research, and collaborators
      - Intuitive clinical interfaces built into tools such as PhenoTips, Certagenia, and others
      - Ability to easily share data with key international projects (Decipher/DDD, RD-Connect, PhenomeCentral, Matchmaker Exchange, etc.)
  • 74. LIMITATIONS
      - Quantitative vs. qualitative: Much of clinical data is quantitative lab data with reference standards. It is possible to convert based on ±3 SD, but no way to record the reference measure/population yet.
      - Temporal presentation: ontologies can support temporal ordering, but data capture tools don’t yet capture this and the comparison algorithms don’t yet take it into account
      - Severity: semantic encoding is available, but simple in comparison to phenotype-specific measures
      - Emerging ontology: some areas have poor coverage, such as nervous system, behavior, and imaging results. Need to represent the assays in these contexts.
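The ±3 SD conversion mentioned under "Quantitative vs. qualitative" might look like the Python sketch below. The function name, the category labels, and the example numbers are all assumptions for illustration. Note that the reference mean and SD have to be supplied by the caller, which mirrors the limitation on the slide: the qualitative result alone does not record which reference population was used.

```python
def classify_measurement(value, ref_mean, ref_sd, threshold=3.0):
    """Convert a quantitative value to a qualitative call using a z-score.

    The reference mean and standard deviation must come from an appropriate
    reference population; they are not stored with the returned label.
    """
    if ref_sd <= 0:
        raise ValueError("ref_sd must be positive")
    z = (value - ref_mean) / ref_sd
    if z > threshold:
        return "increased"
    if z < -threshold:
        return "decreased"
    return "within reference range"


# Hypothetical example: a measurement of 40.0 against a reference population
# with mean 52.0 and SD 3.0 gives z = -4.0.
print(classify_measurement(40.0, ref_mean=52.0, ref_sd=3.0))
```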
  • 75. ACKNOWLEDGMENTS
      NIH-UDP: William Bone, Murat Sincan, David Adams, Amanda Links, David Draper, Joie Davis, Neal Boerkoel, Cyndi Tifft, Bill Gahl
      OHSU: Nicole Vasilesky, Matt Brush, Bryan Laraway, Shahim Essaid
      Lawrence Berkeley: Nicole Washington, Suzanna Lewis, Chris Mungall
      UCSD: Amarnath Gupta, Jeff Grethe, Anita Bandrowski, Maryann Martone
      U of Pitt: Chuck Boromeo, Jeremy Espino, Becky Boes, Harry Hochheiser
      Sanger: Anika Oehlrich, Jules Jacobson, Damian Smedley
      Toronto: Marta Girdea, Sergiu Dumitriu, Heather Trang, Mike Brudno
      JAX: Cynthia Smith
      Charité: Sebastian Kohler, Sandra Doelken, Sebastian Bauer, Peter Robinson
      Funding: NIH Office of Director: 1R24OD011883; NIH-UDP: HHSN268201300036C, HHSN268201400093P
  • 76. WHERE TO GET HPO, AND HOW TO REQUEST NEW CONTENT We need you! Browse in the following places: http://www.human-phenotype-ontology.org/ http://purl.bioontology.org/ontology/HP Get the file: http://purl.obolibrary.org/obo/hp.owl Request content: https://sourceforge.net/p/obo/human-phenotype-requests/new/ More Documentation: https://code.google.com/p/phenotype-ontologies/

Editor's Notes

  1. A good example of this can be seen here. The average person has had enough experience with Down’s Syndrome that they are likely able to notice that all three of these patients have it. However, if you asked them to describe this phenotype in short phrases that can be agreed upon by the majority of people, can be used to identify the disorder, and ideally can be easily used by a computer, they are going to have a difficult time. It is a collection of phenotypes together that defines a disease, and it is difficult for someone who does not have the proper training to parse this information out. Down’s Syndrome is a good example: the average person can identify Down’s, but can’t list out the phenotypes for a computer. Unless clinically TRAINED, it is tough to outline a phenotype.
  2. But, through hard work, and by being very specific with our description of phenotype, we can begin to make this information manageable and use it to identify disease. We can make a COMPUTABLE list of phenotypes with hard work.
  3. How can we characterize the diseases? Well, a distribution of the phenotypes across all diseases reveals that most have phenotypes affecting the nervous system, while the least have connective tissue phenotypes. 7401 diseases 99,045 annotations There are 20 categories in all. Note that they are not additive, as some phenotypes might belong to two categories. For example, some eye and ear phenotypes also belong to the head and neck. MESSAGE: Phenotype annotations unevenly distributed in different anatomical systems.
  4. As you might expect, we use this phenotype ontology to search for known variants that have a similar phenotype to our patient. Using just the HPO and the diseases annotated with HPO terms, we can compare to known human diseases, but, as we are all painfully aware, we do not know the consequences of mutations in every gene in the human genome. We use the patient phenotype to search for known variants with a similar phenotype. We can do this with human data, but we don’t know all human diseases.
  5. PhenIX uses human data and predicted deleteriousness: HGMD, ClinVar, OMIM, Orphanet.
  6. HGMD mutations were inserted into variant files from DAG panels from which the causative mutations had been removed and phenotypic annotations of the corresponding diseases were extracted from the HPO database. The genes were ranked with PhenIX. Results were simulated either on the entire disease set (All) or by filtering for known autosomal dominant (AD) or autosomal recessive (AR) diseases (fig. S2). A total of 8504 (All), 3471 (AD), and 5006 (AR) simulations were performed.
  7. See final figure in paper: http://stm.sciencemag.org/content/6/252/252ra123.full
  8. A community-driven knowledge curation platform for skeletal dysplasias Editorial process for curating phenotype and genotype knowledge on skeletal dysplasias Integration with HPO
  9. 1: Diagnostic Facial Signature via a vector diagram: shows directional differences versus a normative data set. It is a 2D representation of a 3D image. The ends of the coloured lines represent the patient’s (or averaged group of patients’) image(s). The grey scale represents the normative data (normal equivalent). In this case, if the tips of the arrows are seen sticking out from the grey surface, then the patient (or group of patients) is characterised by a protrusion of that region. E.g., here the patient has a prominent nasal tip and prominent lips. If you were to rotate the 3D image you would see the lines in the cheek region pointing inwards, i.e. the patient has a flat cheek region compared to the reference range. 2: Monitoring treatment in a multisystem disorder. Progressive reduction in facial dysmorphology (anomaly), top to bottom, over treatment for a rare metabolic condition (Mucopolysaccharidosis type 1). 3: Text mining: converting face to text, specifically, elements of morphology terms.
  10. When we have a patient with an undiagnosed/rare disease, we want to be able to search the whole landscape of knowledge. How can we effectively utilize phenotype information collected about the patients? It all boils down to the question, how much is enough to be useful? What does that information need to be like in order to be useful? We have partnered with the UDP to try to help figure out how much information might be necessary to collect about the patients with rare diseases.
  11. To illustrate the method, let’s take Schwartz-jampel Syndrome. It has ~100 phenotype annotations, distributed to 17 phenotypic categories. The majority of which are in the skeletal system. Problem in Hspg2, a proteoglycan that binds to and cross-links many extracellular matrix components and cell-surface molecules. For this example, I’ve taken a subset of the phenotypes, and colored them by category.
  12. We can test the roll of category by creating a derived disease that removes all the phenotypes for that category as our “case”…
  13. We can test the role of category by creating a derived disease that removes all the phenotypes for that category as our “case”… And then as a control, remove an equal amount of “information” from other categories. In the case of Schwartz-Jampel Syndrome, removing only skeletal phenotypes (which comprises 40% of phenotype profile) it significantly reduces its similarity, dropping it to only 86% similar, whereas removing the same amount of information from the controls gives an average of 91% similarity. In this case, there were 73 controls to compare to. We have performed these types of experiments for every disease profile in our corpus (approx 8K). The experiment therefore is: Create a variety of “derived” less-specific diseases Assess the change in similarity: Is the derived disease still considered similar to the original disease? …or more similar to a different disease? Is it distinguishable beyond random? Using the results, we can create a metric to define when a phenotype profile is unique enough to be useful in comparisons to other diseases and models systems.
  14. Using the results, we can create a metric to define when a phenotype profile is unique enough to be useful in comparisons to other diseases and model systems. This has been implemented at UDP so that clinicians can provide quality human data to be able to compare to animal models. It has also been implemented on Monarch genotype pages, where one can see how well annotated any given model is. This functionality is also available as a service call. We can also generate different types of reports based on these analyses to determine which diseases have the least coverage, which models are the least well annotated, etc.
  15. Our ability to link genotype to phenotype is constrained by our knowledgebase. To date, <40% of human genes have been directly linked to diseases or traits by ClinVar, OMIM, and GWAS. How are we going to discover the disease-gene links if the phenotype coverage is so poor? (Of course, there may be many more links for GWAS, once we figure out what to do with all the variants that lie outside of gene boundaries.)
  16. So, <40% of human genes have phenotypes. And when we look at the orthologs for each of the standard multi-cellular model organisms, there doesn’t appear to be any more than about 50% coverage for any given model.
  17. But when put together, they bring the phenotypic coverage of human genes (either directly or inferred via orthology) up to nearly 80%. That is A LOT of coverage. How can we better tap that?
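The coverage argument is simple set arithmetic, sketched here with made-up gene identifiers. The gene names and set memberships are hypothetical and chosen only to mirror the shape of the numbers above; they are not real data.

```python
# Hypothetical universe of ten human genes.
human_genes = {f"GENE{i}" for i in range(1, 11)}

# Human genes with a directly annotated phenotype (toy data).
direct = {"GENE1", "GENE2", "GENE3", "GENE4"}

# Human genes whose ortholog has a phenotype in each model organism (toy data).
via_ortholog = {
    "mouse": {"GENE1", "GENE2", "GENE5", "GENE6", "GENE7"},
    "zebrafish": {"GENE2", "GENE3", "GENE6", "GENE8"},
    "fly": {"GENE1", "GENE7"},
}


def coverage(annotated):
    """Fraction of human genes with at least one phenotype annotation."""
    return len(human_genes & annotated) / len(human_genes)


combined = direct.union(*via_ortholog.values())

print(f"human only: {coverage(direct):.0%}")
for organism, genes in via_ortholog.items():
    print(f"{organism}: {coverage(genes):.0%}")
print(f"combined: {coverage(combined):.0%}")
```

No single source covers more than half of the toy gene set, but the union does much better because each source contributes genes the others miss.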
  18. Since any computational method relies heavily on the data, what does the available data look like? The distribution of phenotype information per model genotype is different compared to human disease annotations. For mouse, there’s a much higher representation of metabolic, cardiovascular, blood, and endocrine phenotypes available to compare; for fish, there’s increased representation of nervous, skeletal, head and neck, cardiovascular, and connective tissue phenotypes. (Note that these do not include “normal” phenotypes for either diseases or genotypes.) Each model brings something different to the phenotype landscape.
  19. Different terminology is used to describe clinical manifestations than is used to describe model system biological features.
  20. Things like finding models of sirenomelia due to disruption of the lateral plate mesoderm. Helping to find models and gene candidates based on the relationships in the development
  21. Without additional knowledge and linking, computers can’t make the connections. These links take us from the molecular to the protein, to the cellular and anatomical, to the disease level of phenotypes.
  22. OWLsim computes semantic similarity between sets of phenotypes within and across species using the bridging semantics. Phenotypes in common from the bridging ontologies relate human clinical phenotypes with model organism phenotypes. Examples include motor systems, olfaction, and digestion. In this case, data encoded using the Human Phenotype Ontology has been made interoperable with mouse, zebrafish and other model system ontologies. This also enables the use of more complex algorithms to detect similarity that are not based solely on mapping or string matching; e.g. constipation and decreased gut peristalsis are both subtypes of abnormal digestive system physiology.
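The constipation example can be illustrated with a tiny hand-built hierarchy. This is a minimal sketch, assuming a simplified parent map in which both terms sit directly under the shared class; the structure and helper names are hypothetical, and this is not OWLsim's API or algorithm.

```python
# Hypothetical, simplified is-a hierarchy: term -> list of parent terms.
PARENTS = {
    "constipation": ["abnormal digestive system physiology"],
    "decreased gut peristalsis": ["abnormal digestive system physiology"],
    "abnormal digestive system physiology": ["phenotypic abnormality"],
    "anosmia": ["abnormal olfaction"],
    "abnormal olfaction": ["phenotypic abnormality"],
    "phenotypic abnormality": [],
}


def ancestors(term):
    """Return the term itself plus everything reachable through its parents."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def depth(term):
    """Length of the longest path from the term up to the root."""
    parents = PARENTS[term]
    return 0 if not parents else 1 + max(depth(p) for p in parents)


def most_specific_common_subsumer(a, b):
    """Deepest class that both terms are subtypes of."""
    return max(ancestors(a) & ancestors(b), key=depth)


print(most_specific_common_subsumer("constipation", "decreased gut peristalsis"))
print(most_specific_common_subsumer("constipation", "anosmia"))
```

The two digestive terms share no words, so string matching would miss them, but they meet at a specific shared class; the digestive and olfactory terms meet only at the root, which signals little similarity.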
  23. Run through pipeline: Exomiser, Exomiser LOT, and pheno only. Exomiser LOT is a version of Exomiser that is less restrictive as far as what transcripts it recognizes (not as worried about off-target reads because of the Mendelian filters and the ability to look at the BAMs).
  24. Exomiser is an exome analysis tool that leverages the uberpheno cross-species phenotype comparisons and standard exome filtering (pathogenicity, frequency, off targets, etc.), in combination with Mendelian filtering and interactome walking. The original method paper is published: http://genome.cshlp.org/content/early/2013/10/25/gr.160325.113.abstract This work is unpublished but has been (just) submitted. Happy to share the manuscript if of interest.
  25. Each model organism has a different suite of phenotypes that are examined, because different models are used to explore different types of biological function and malfunction. By using a diversity of model systems, we have the potential to identify candidates based on partial overlaps with the patient phenotype profile by looking at different models with mutations in potential candidates or related via interactions, co-expression, genomic regulatory region, etc.
  26. Phenoviz is a new graphviz plugin that can be used as a standalone app for Windows, Mac, or Linux. The user uploads a list of CNVs detected by array CGH (SNP chips or even genome sequence data would also work as a starting point, but the program expects a simple list). You also enter a list of the HPO terms observed in the patient. The application then tries to find “matches” based on the single-gene disorders (human, HPO annotations), the mouse models (mainly knockouts, MP annotations from MGI), or fish models (ZFIN E/Q annotations). This is being used in the Charité array CGH diagnostics service to help with interpretation of CNVs. Subjectively, the tool helps you to quickly find good candidates in order to write reports. The program also picks out the best-matching CNV in case the user enters several (a typical array CGH finding in our lab has up to 50 CNVs, of which 2-5 are not found in databases of common variants like DGV).
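The idea of picking the best-matching CNV can be sketched as follows. This is not Phenoviz's implementation: the CNV names, gene names, and annotations are hypothetical, and plain term overlap (Jaccard) stands in for the cross-species semantic matching the tool performs.

```python
def overlap_score(patient_terms, gene_terms):
    """Jaccard overlap between patient terms and a gene's annotated terms."""
    patient_terms, gene_terms = set(patient_terms), set(gene_terms)
    union = patient_terms | gene_terms
    return len(patient_terms & gene_terms) / len(union) if union else 0.0


def rank_cnvs(cnv_genes, gene_annotations, patient_terms):
    """Score each CNV by its best-matching gene and sort best first."""
    ranked = []
    for cnv, genes in cnv_genes.items():
        scored = [
            (overlap_score(patient_terms, gene_annotations.get(gene, ())), gene)
            for gene in genes
        ]
        best_score, best_gene = max(scored, default=(0.0, None))
        ranked.append((cnv, best_gene, best_score))
    return sorted(ranked, key=lambda row: row[2], reverse=True)


# Hypothetical inputs: phenotype labels stand in for ontology term IDs.
patient = {"seizures", "microcephaly", "hypotonia"}
gene_annotations = {
    "GENE_X": {"seizures", "microcephaly"},
    "GENE_Y": {"cataract"},
    "GENE_Z": {"hypotonia", "cataract", "short stature"},
}
cnv_genes = {
    "cnv_A": ["GENE_Y"],
    "cnv_B": ["GENE_X", "GENE_Y"],
    "cnv_C": ["GENE_Z"],
}

ranking = rank_cnvs(cnv_genes, gene_annotations, patient)
for cnv, gene, score in ranking:
    print(f"{cnv}: best gene {gene}, score {score:.2f}")
```

Scoring a CNV by its single best gene reflects the assumption that one gene in the interval explains the phenotype; other aggregation choices are possible.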
  27. There are a lot of people who have contributed to this work over many years. 