Building a repository of biomedical
ontologies with Neo4j
Simon Jupp
Samples, Phenotypes and Ontologies Team
European Bioinformatics Institute
Cambridge, UK.
The challenge - thousands of data
attributes…
• European Archive for molecular data
• ENA, EVA, EGA, BioSample, ArrayExpress
• How do we make sense of the data?
• SPOT team builds tools to support the mapping of this data to ontologies
and other standards
Why we need terminology standards (or
ontologies)
Dyschromatopsia
Search PubMed for “color blindness”
Search PubMed for “Dyschromatopsia”
Search PubMed for "abnormality of the eye"
The ontology of color blindness
HP:0011518 (Dichromacy)
HP:0011518 (Eye)
HP:0000551 (Abnormality of color vision )
HP:0007641 (Dyschromatopsia)
Is-a
Is-a
Disease-location
Ontology powered applications
Query expansion in the Gene Expression Atlas – searching “eye disease” finds
genes expressed in “Turner syndrome”
https://www.ebi.ac.uk/gxa/home
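Query expansion of this kind can be sketched in a few lines of Python. This is only an illustration of the idea, not the Gene Expression Atlas implementation: the is-a edges below are a toy hierarchy assumed for the example, and `expand_query` is a hypothetical helper name.

```python
from collections import defaultdict

# Toy is-a edges (child -> parent), assumed for illustration only.
IS_A = [
    ("dichromacy", "dyschromatopsia"),
    ("dyschromatopsia", "abnormality of color vision"),
    ("abnormality of color vision", "abnormality of the eye"),
]


def build_children(edges):
    """Index the hierarchy as parent -> set of direct children."""
    children = defaultdict(set)
    for child, parent in edges:
        children[parent].add(child)
    return children


def expand_query(term, edges):
    """Return the search term plus every descendant reachable via is-a."""
    children = build_children(edges)
    expanded, stack = {term}, [term]
    while stack:
        current = stack.pop()
        for child in children.get(current, ()):
            if child not in expanded:
                expanded.add(child)
                stack.append(child)
    return expanded


expanded_terms = expand_query("abnormality of color vision", IS_A)
print(sorted(expanded_terms))
```

A search for the broader term is then run against all expanded terms, so data annotated with a more specific term is still found.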
Ontology powered applications
Visualising Gene-Disease associations in Open Targets
https://www.opentargets.org
Ontology powered applications
SNP – trait
associations in the
GWAS catalog
All traits mapped to
disease, phenotype
and measurements
in EFO
https://www.ebi.ac.uk/gwas/
Cardiovascular disease traits
Ontologies for life sciences
Genotype, Phenotype
Categories: Sequence; Proteins; Gene products; Transcript; Pathways; Cell type; BRENDA tissue / enzyme source; Development; Anatomy; Phenotype; Plasmodium life cycle
- Sequence types and features
- Genetic Context
- Molecule role
- Molecular Function
- Biological process
- Cellular component
- Protein covalent bond
- Protein domain
- UniProt taxonomy
- Pathway ontology
- Event (INOH pathway ontology)
- Systems Biology
- Protein-protein interaction
- Arabidopsis development
- Cereal plant development
- Plant growth and developmental stage
- C. elegans development
- Drosophila development (FBdv)
- Human developmental anatomy, abstract version
- Human developmental anatomy, timed version
- Mosquito gross anatomy
- Mouse adult gross anatomy
- Mouse gross anatomy and development
- C. elegans gross anatomy
- Arabidopsis gross anatomy
- Cereal plant gross anatomy
- Drosophila gross anatomy
- Dictyostelium discoideum anatomy
- Fungal gross anatomy (FAO)
- Plant structure
- Maize gross anatomy
- Medaka fish anatomy and development
- Zebrafish anatomy and development
- NCI Thesaurus
- Mouse pathology
- Human disease
- Cereal plant trait
- PATO (attribute and value)
- Mammalian phenotype
- Human phenotype
- Habronattus courtship
- Loggerhead nesting
- Animal natural history and life history
- eVOC (Expressed Sequence Annotation for Humans)
Ontology Lookup Service
• Ontology search engine
• Ontology term history tracking
• Ontology visualisation
• Powerful RESTful API
Repository of over 160 pre-selected biomedical ontologies (4.5 million terms, 11
million relationships)
http://www.ebi.ac.uk/ols
• Provides unified mechanism to access
multiple ontologies
• Large community of users (~5,000 users per month, hundreds
of millions of hits per month)
• Open source and dockerised
Ontology visualisation tools
Build process
Nightly crawl of
all registered
ontologies
Multiple indexes created
with standalone Spring Boot
applications
API and website
run with Spring Data
https://ebispot.github.io
Open Source Software
Loading ontologies into Neo4j
• Ontologies usually published in W3C
OWL format
• RDF based (so already a graph)
• …but not a very friendly graph for our
use-cases (more on this this afternoon)
• Primary OLS use-cases for a graph
• Term hierarchy (parent/child)
• Simple view over other relationships
• Part of, develops from
• Extracting subgraphs/subsets
• e.g. taxon specific subsets
OWL to Neo4j schema
Every term is a node with a label for each ontology
Each relationship and subset relation is labeled (is-a, part-of, develops-from etc.)
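The shape of this schema can be sketched in plain Python. This is a rough in-memory model, not the OLS loader: the `Class` label and the `iri` and `ontology_name` properties follow the query shown on the slides, while `TermNode`, `TypedEdge`, `to_graph` and the `ex:` identifiers are hypothetical names made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermNode:
    iri: str
    label: str
    ontology_name: str

    @property
    def labels(self):
        # A generic "Class" label plus one label naming the source ontology.
        return ("Class", self.ontology_name)


@dataclass(frozen=True)
class TypedEdge:
    source_iri: str
    rel_type: str  # e.g. "is-a", "part-of", "develops-from"
    target_iri: str


def to_graph(terms, relations):
    """Turn (iri, label, ontology) and (source, type, target) tuples into nodes and edges."""
    nodes = {iri: TermNode(iri, label, onto) for iri, label, onto in terms}
    # Keep only relations whose endpoints were both loaded as nodes.
    edges = [
        TypedEdge(s, rel, t)
        for s, rel, t in relations
        if s in nodes and t in nodes
    ]
    return nodes, edges


# Placeholder identifiers, assumed for illustration only.
terms = [
    ("ex:heart", "heart", "uberon"),
    ("ex:organ", "organ", "uberon"),
    ("ex:cardiovascular_system", "cardiovascular system", "uberon"),
]
relations = [
    ("ex:heart", "is-a", "ex:organ"),
    ("ex:heart", "part-of", "ex:cardiovascular_system"),
    ("ex:heart", "part-of", "ex:not_loaded"),
]

nodes, edges = to_graph(terms, relations)
print(nodes["ex:heart"].labels, len(edges))
```

Keeping the relation type on the edge is what lets later queries choose which relations to follow.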
Powerful yet simple queries
• Get the transitive closure for “heart” following parent and
partonomy relations from the UBERON anatomy ontology
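// {0} = ontology name, {1} = term IRI (legacy parameter syntax; Neo4j 4+ uses $-prefixed parameters)
MATCH path = (n:Class)-[r:SUBCLASSOF|RelatedTree*]
->(parent)<-[r2:SUBCLASSOF|RelatedTree]-(sibling:Class)
WHERE n.ontology_name = {0} AND n.iri = {1}
RETURN path // assumed: the slide omits the RETURN clause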
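For readers without a Neo4j instance to hand, the transitive closure itself can be sketched in plain Python. This is a simplified stand-in for the graph query, not a translation of it: the edge list is a toy example assumed for illustration, and `ancestors` is a hypothetical helper.

```python
def ancestors(start, edges, follow=("SUBCLASSOF", "part_of")):
    """Collect every term reachable from `start` over the relation types in `follow`."""
    parents = {}
    for source, rel_type, target in edges:
        if rel_type in follow:
            parents.setdefault(source, set()).add(target)

    seen, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                frontier.append(parent)
    return seen


# Toy edges (source, relation, target), assumed for illustration only.
EDGES = [
    ("heart", "SUBCLASSOF", "organ"),
    ("heart", "part_of", "cardiovascular system"),
    ("cardiovascular system", "SUBCLASSOF", "anatomical system"),
    ("heart", "develops_from", "heart primordium"),
]

heart_closure = ancestors("heart", EDGES)
print(sorted(heart_closure))
```

Passing a different `follow` tuple changes which relations count as "parent" links, which is the same choice the relationship types in the graph query express.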
Ontology Mappings
• We now have too many ontologies, with overlapping scope
• Millions of mappings exist to interlink the ontologies
Figure: Datasource 1 (Human Phenotype Ontology) and Datasource 2 (SNOMED-CT), linked by mappings (xrefs)
Ontology Mapping Service (OxO)
• New database of mappings built with Neo4j
• Crawls OLS ontologies and UMLS for mappings and provides UI and
API to access all known mappings
http://www.ebi.ac.uk/spot/oxo (went live March 2017)
Exploring the Xref graph
• We build a graph in Neo4j of known xrefs
• Direct mappings to NCIt “Retinoblastoma” from Disease
Ontology (DO) and EFO
Discover new mappings
• If we traverse 1 hop in the graph we can infer more
mappings
1 hop
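The one-hop inference can be sketched as a neighbours-of-neighbours lookup. This is a minimal Python illustration, not the OxO implementation: the identifiers are placeholders rather than real accession numbers, and `build_xref_graph` and `inferred_mappings` are hypothetical helper names.

```python
from collections import defaultdict


def build_xref_graph(xrefs):
    """Treat each xref as an undirected edge between two term identifiers."""
    graph = defaultdict(set)
    for a, b in xrefs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def inferred_mappings(term, graph):
    """Terms one hop beyond the direct xrefs of `term`."""
    direct = graph.get(term, set())
    candidates = set()
    for neighbour in direct:
        candidates |= graph.get(neighbour, set())
    # Drop the term itself and anything that is already a direct mapping.
    return candidates - direct - {term}


# Placeholder identifiers, assumed for illustration only.
XREFS = [
    ("DO:retinoblastoma", "NCIt:retinoblastoma"),
    ("EFO:retinoblastoma", "NCIt:retinoblastoma"),
]

xref_graph = build_xref_graph(XREFS)
print(inferred_mappings("DO:retinoblastoma", xref_graph))
```

Here the DO and EFO terms both map directly to the NCIt term, so each becomes a candidate mapping for the other. Such candidates still need checking, since an inferred mapping is only as reliable as the xrefs it passes through.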
Problems with mappings
• But this exposes inconsistencies in public mappings
• Use this as basis for fixing and confirming mappings
Conclusion
• Neo4j is being adopted in multiple projects across this
institute
• Liked because it provides a simple and effective solution to some of
our data modelling challenges
• Neo4j is a good fit for working with ontologies and
taxonomic data
• Excellent developer integration for building applications
e.g. Spring-data-neo4j
Ontology team
Helen Parkinson
Tony Burdett
Sira Sarntivijai
Olga Vrousgou
Thomas Liener
Funding
• EMBL
• CORBEL This project receives funding from the
European Union’s Horizon 2020 research and
innovation programme under grant agreement No
654248.
• EXCELERATE ELIXIR-EXCELERATE is funded by
the European Commission within the Research
Infrastructures programme of Horizon 2020, grant
agreement number 676559.
Predicting annotation
• We do a lot of data curation with ontologies
• Need better support for mapping prediction
• E.g. samples like these are usually annotated with these
terms
• Need species specificity e.g. only mapping plant samples
with plant ontology terms
Input from submission | Ontology class
2’-deoxy-5-azacytidine | 5-aza-2’-deoxycytidine
Ovarian Cancer | ovarian carcinoma
Anterior tibialis | tibialis anterior
Endothelium, Vascula | cardiovascular system endothelium
Tagging with ontologies
• We have built a large corpus of known mappings
between “data values” and ontology terms
• Piloting building a recommendation engine for our
curation tools with Neo4j
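The idea behind such a recommender can be sketched with simple frequency counts. This is a plain-Python stand-in rather than the Neo4j-based pilot: the corpus entries are illustrative (the third is invented for the example), and `normalise`, `build_index` and `recommend` are hypothetical names.

```python
from collections import Counter, defaultdict


def normalise(value):
    """Lower-case and collapse whitespace so trivially different values match."""
    return " ".join(value.lower().split())


def build_index(known_mappings):
    """Count how often each normalised data value was mapped to each ontology term."""
    index = defaultdict(Counter)
    for value, term in known_mappings:
        index[normalise(value)][term] += 1
    return index


def recommend(value, index, top_n=3):
    """Suggest the ontology terms most often used for this data value."""
    counts = index.get(normalise(value))
    if not counts:
        return []
    return [term for term, _ in counts.most_common(top_n)]


# Illustrative corpus of (data value, ontology term) pairs.
CORPUS = [
    ("Ovarian Cancer", "ovarian carcinoma"),
    ("ovarian cancer", "ovarian carcinoma"),
    ("ovarian  cancer", "ovarian neoplasm"),  # hypothetical competing mapping
]

mapping_index = build_index(CORPUS)
suggestions = recommend("OVARIAN CANCER", mapping_index)
print(suggestions)
```

Species specificity could be layered on by filtering the suggestions to terms from ontologies that apply to the sample's taxon.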

Building a Biomedical Ontology Repository with Neo4j

  • 1. Building a repository of biomedical ontologies with Neo4j Simon Jupp Samples, Phenotypes and Ontologies Team European Bioinformatics Institute Cambridge, UK.
  • 2. The challenge - thousands of data attributes… • European Archive for molecular data • ENA, EVA, EGA, BioSample, ArrayExpress • How do we make sense of the data? • SPOT team builds tools to support the mapping of this data to ontologies and other standards
  • 3. Why we need terminology standards (or ontologies) Dyschromatopsia
  • 4. Search PubMed for “color blindness”
  • 5. Search PubMed for “Dyschromatopsia”
  • 6. Search PubMed for "abnormality of the eye"
  • 7. The ontology of color blindness HP:0011518 (Dichromacy) HP:0011518 (Eye) HP:0000551 (Abnormality of color vision) HP:0007641 (Dyschromatopsia) Is-a Is-a Disease-location
  • 8. Ontology powered applications Query expansion in the Gene Expression Atlas – searching “eye disease” finds genes expressed in “Turner syndrome” https://www.ebi.ac.uk/gxa/home
  • 9. Ontology powered applications Visualising Gene-Disease associations in Open Targets https://www.opentargets.org
  • 10. Ontology powered applications SNP – trait associations in the GWAS catalog All traits mapped to disease, phenotype and measurements in EFO https://www.ebi.ac.uk/gwas/ Cardiovascular disease traits
  • 11. Ontologies for life sciences (figure spanning genotype to phenotype). Categories: Sequence, Proteins, Gene products, Transcript, Pathways, Cell type, BRENDA tissue / enzyme source, Development, Anatomy, Phenotype, Plasmodium life cycle. Ontologies shown: Sequence types and features; Genetic Context; Molecule role; Molecular Function; Biological process; Cellular component; Protein covalent bond; Protein domain; UniProt taxonomy; Pathway ontology; Event (INOH pathway ontology); Systems Biology; Protein-protein interaction; Arabidopsis development; Cereal plant development; Plant growth and developmental stage; C. elegans development; Drosophila development (FBdv); Human developmental anatomy, abstract version; Human developmental anatomy, timed version; Mosquito gross anatomy; Mouse adult gross anatomy; Mouse gross anatomy and development; C. elegans gross anatomy; Arabidopsis gross anatomy; Cereal plant gross anatomy; Drosophila gross anatomy; Dictyostelium discoideum anatomy; Fungal gross anatomy (FAO); Plant structure; Maize gross anatomy; Medaka fish anatomy and development; Zebrafish anatomy and development; NCI Thesaurus; Mouse pathology; Human disease; Cereal plant trait; PATO (attribute and value); Mammalian phenotype; Human phenotype; Habronattus courtship; Loggerhead nesting; Animal natural history and life history; eVOC (Expressed Sequence Annotation for Humans)
  • 12. Ontology Lookup Service • Ontology search engine • Ontology term history tracking • Ontology visualisation • Powerful RESTful API Repository of over 160 pre-selected biomedical ontologies (4.5 million terms, 11 million relationships) http://www.ebi.ac.uk/ols • Provides unified mechanism to access multiple ontologies • Large community of users (~5000 p/m, 100s of millions of hits p/m) • Open source and dockerised
  • 14. Build process • Nightly crawl of all registered ontologies • Multiple indexes created with standalone Spring Boot applications • API and website run with Spring Data • Open Source Software: https://ebispot.github.io
  • 15. Loading ontologies into Neo4j • Ontologies usually published in W3C OWL format • RDF based (so already a graph) • …but not a very friendly graph for our use-cases (more on this this afternoon) • Primary OLS use-cases for a graph • Term hierarchy (parent/child) • Simple view over other relationships • Part of, develops from • Extracting subgraphs/subsets • e.g. taxon specific subsets
  • 16. OWL to Neo4j schema • Every term is a node with a label for each ontology • Each relationship and subset relation is labeled (is-a, part-of, develops-from, etc.)
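The schema on slide 16 can be sketched as a pair of helpers that turn ontology terms and relations into parameterised Cypher statements. This is a minimal illustration, not the OLS loader: the function names, the `Uberon` ontology label and the `label` property are assumptions, while the `Class` label, the `iri` and `ontology_name` properties and the `SUBCLASSOF` relationship type are taken from the query on slide 17. Cypher cannot parameterise labels or relationship types, so those are validated and interpolated; values are passed as `$` parameters.

```python
import re

# Labels and relationship types are interpolated into the statement,
# so accept only plain identifiers.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _check_identifier(name):
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe Cypher identifier: {name!r}")
    return name


def term_to_cypher(iri, label, ontology):
    """Return (statement, params) for one term node.

    The node gets the generic Class label plus one label for its ontology
    (hypothetical naming, e.g. "Uberon").
    """
    onto_label = _check_identifier(ontology)
    statement = (
        f"MERGE (t:Class:{onto_label} {{iri: $iri}}) "
        "SET t.label = $label, t.ontology_name = $ontology_name"
    )
    params = {"iri": iri, "label": label, "ontology_name": ontology.lower()}
    return statement, params


def relation_to_cypher(child_iri, rel_type, parent_iri):
    """Return (statement, params) for one typed relationship between two terms."""
    rel = _check_identifier(rel_type)
    statement = (
        "MATCH (c:Class {iri: $child}), (p:Class {iri: $parent}) "
        f"MERGE (c)-[:{rel}]->(p)"
    )
    return statement, {"child": child_iri, "parent": parent_iri}
```

Each returned pair could then be handed to whichever Neo4j client is in use; keeping statement generation separate from execution makes the mapping easy to inspect without a running database.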
  • 17. Powerful yet simple queries • Get the transitive closure for “heart” following parent and partonomy relations from the UBERON anatomy ontology: MATCH path = (n:Class)-[r:SUBCLASSOF|RelatedTree*]->(parent)<-[r2:SUBCLASSOF|RelatedTree]-(sibling:Class) WHERE n.ontology_name = {0} AND n.iri = {1}
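To make the idea behind slide 17 concrete, the following sketch computes the same kind of transitive closure in plain Python over a tiny in-memory edge list. The edges are made-up toy data rather than UBERON content, the relation name `part_of` and the function name `ancestors` are assumptions, and unlike the Cypher query on the slide this version only collects ancestors and does not fetch siblings.

```python
from collections import deque

# Toy (child, relation, parent) edges; hypothetical, for illustration only.
EDGES = [
    ("heart", "SUBCLASSOF", "organ"),
    ("heart", "part_of", "circulatory system"),
    ("circulatory system", "part_of", "organism"),
    ("organ", "SUBCLASSOF", "anatomical structure"),
    ("heart", "develops_from", "heart primordium"),
]


def ancestors(start, edges, follow=("SUBCLASSOF", "part_of")):
    """Return every node reachable from start via the followed relation types."""
    # Index parents by child, keeping only the relation types of interest.
    parents = {}
    for child, relation, parent in edges:
        if relation in follow:
            parents.setdefault(child, []).append(parent)

    # Breadth-first walk; the seen set also guards against cycles.
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen
```

Calling `ancestors("heart", EDGES)` follows both subclass and partonomy edges but ignores `develops_from`, which mirrors how the variable-length pattern on the slide restricts the traversal to chosen relationship types.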
  • 18. Ontology Mappings • We now have too many ontologies with overlapping scope! • Millions of mappings exist to interlink the ontologies Datasource 1 Datasource 2 Human Phenotype Ontology SNOMED-CT Mappings Xref
  • 19. Ontology Mapping Service (OxO) • New database of mappings built with Neo4j • Crawls OLS ontologies and UMLS for mappings and provides UI and API to access all known mappings • Went live March 2017 • http://www.ebi.ac.uk/spot/oxo
  • 20. Exploring the Xref graph • We build a graph in Neo4j of known xrefs • Direct mappings to NCIt “Retinoblastoma” from Disease Ontology (DO) and EFO
  • 21. Discover new mappings • If we traverse 1 hop in the graph we can infer more mappings 1 hop
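The one-hop inference described on slides 20 and 21 can be sketched without a database. The example below treats xrefs as undirected edges (an assumption), uses placeholder identifiers rather than real ontology IDs, and the function names are hypothetical. It returns terms that are not directly mapped to the query term but share a directly mapped neighbour with it.

```python
from collections import defaultdict

# Placeholder xref pairs; the identifiers are invented for illustration.
XREFS = [
    ("DO:retinoblastoma", "NCIT:retinoblastoma"),
    ("EFO:retinoblastoma", "NCIT:retinoblastoma"),
    ("NCIT:retinoblastoma", "UMLS:retinoblastoma"),
    ("EFO:other", "MESH:other"),
]


def build_index(xrefs):
    """Index each term's direct xref neighbours, treating xrefs as symmetric."""
    neighbours = defaultdict(set)
    for left, right in xrefs:
        neighbours[left].add(right)
        neighbours[right].add(left)
    return neighbours


def inferred_mappings(term, xrefs):
    """Return terms exactly one extra hop beyond the direct mappings of term."""
    neighbours = build_index(xrefs)
    direct = neighbours.get(term, set())
    candidates = set()
    for intermediate in direct:
        candidates |= neighbours[intermediate]
    # Drop the term itself and anything already mapped directly.
    return candidates - direct - {term}
```

Here `inferred_mappings("DO:retinoblastoma", XREFS)` yields the EFO and UMLS placeholders via the shared NCIt node. As slide 22 notes, mappings inferred this way can expose inconsistencies, so they are better treated as candidates for review than as confirmed equivalences.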
  • 22. Problems with mappings • But this exposes inconsistencies in public mappings • Use this as a basis for fixing and confirming mappings
  • 23. Conclusion • Neo4j is being adopted in multiple projects across this institute • Liked because it provides a simple and effective solution to some of our data modelling challenges • Neo4j is a good fit for working with ontologies and taxonomic data • Excellent developer integration for building applications, e.g. Spring-data-neo4j
  • 24. Ontology team: Helen Parkinson, Tony Burdett, Sira Sarntivijai, Olga Vrousgou, Thomas Liener • Funding • EMBL • CORBEL: This project receives funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 654248. • EXCELERATE: ELIXIR-EXCELERATE is funded by the European Commission within the Research Infrastructures programme of Horizon 2020, grant agreement number 676559.
  • 25. Predicting annotation • We do a lot of data curation with ontologies • Need better support for mapping prediction • E.g. samples like these are usually annotated with these terms • Need species specificity, e.g. only mapping plant samples with plant ontology terms • Input from submission → Ontology class: 2’-deoxy-5-azacytidine → 5-aza-2’-deoxycytidine; Ovarian Cancer → ovarian carcinoma; Anterior tibialis → tibialis anterior; Endothelium, Vascula → cardiovascular system endothelium
  • 26. Tagging with ontologies • We have built a large corpus of known mappings between “data values” and ontology terms • Piloting building a recommendation engine for our curation tools with Neo4j
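As a rough illustration of how a corpus of known mappings (slide 26) could drive suggestions, the sketch below ranks ontology terms by how often curators previously paired them with a given data value. This is a simple frequency-based stand-in, not the Neo4j recommendation engine being piloted; the corpus entries, the term "ovarian neoplasm" and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical curated pairs of (submitted data value, ontology term label).
KNOWN_MAPPINGS = [
    ("Ovarian Cancer", "ovarian carcinoma"),
    ("ovarian cancer", "ovarian carcinoma"),
    ("ovarian cancer", "ovarian neoplasm"),
    ("Anterior tibialis", "tibialis anterior"),
]


def normalise(value):
    """Lower-case and trim a submitted value so trivial variants collapse."""
    return value.strip().lower()


def build_model(corpus):
    """Count how often each normalised value was mapped to each term."""
    model = defaultdict(Counter)
    for value, term in corpus:
        model[normalise(value)][term] += 1
    return model


def recommend(model, value, top_n=3):
    """Return up to top_n terms for value, most frequently used first."""
    counts = model.get(normalise(value))
    if not counts:
        return []
    return [term for term, _ in counts.most_common(top_n)]
```

With `model = build_model(KNOWN_MAPPINGS)`, a call such as `recommend(model, " Ovarian Cancer ")` ranks "ovarian carcinoma" ahead of "ovarian neoplasm". The species restriction raised on slide 25 could be added by filtering candidate terms by ontology before ranking.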