Simon Jupp
Samples, Phenotypes and Ontologies
EMBL-EBI
Semantic services for data
interoperability
Elixir all hands meeting
Interoperability workshop
March 2017
Ontology services as building blocks for
FAIR
• You need standards (ontologies and controlled
vocabularies) to make data interpretable
• Interpretable data is more readily interoperable
• We can use interoperable data to build integrated
systems that make the data more findable by users
• The data become reusable when we use common
standards
• But,
• There are a lot of standards
• Doing this at scale for different domains is hard
Improving Findability by greater
Interoperability
Smarter searching
Data analysis
Data integration
Data visualisation
BioSamples case study
• description of material of biological
interest
• may be linked to assay data
• sequencing, microarray,
• proteomics
• also imaging, etc
• We’ve been making this data
FAIR for many years
The challenge - thousands of data
attributes…
• BioSamples is an example of real world experimental metadata
• We see all the variability – warts and all
• Good playground for building tooling to clean up and add value to this
data
• If we can build tooling that works for BioSamples, it will work anywhere!
What are the disease attributes?
diseaseState
hostDisease
clinicallyAffectedStatus
diagnosis
Infection
diseaseStatus
healthState
disease
clinicalInformation
hostHealthState
affectedBy
causeOfDeath
NOT:
diseaseStage: info about the stage of a disease, e.g. "48 hai", "stage", "terminal"
diseasestage
tumorStatus: "non-tumor",120, "Tumor",100, "CSL +/+ Xenograft Tumor 1st"
healthStatus: "normal", "Allergic", "stressed", "NA(Not immunized)"
Makes finding the right data hard
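The attribute names listed above can be folded into one canonical key before searching. The following is a minimal sketch, assuming a hand-maintained synonym set copied from the slide; the function names and the canonical key `disease` are illustrative choices, not a description of how BioSamples does it.

```python
# Attribute names from the slide that all describe a disease (compared case-insensitively).
DISEASE_ATTRIBUTES = {
    "diseasestate", "hostdisease", "clinicallyaffectedstatus", "diagnosis",
    "infection", "diseasestatus", "healthstate", "disease",
    "clinicalinformation", "hosthealthstate", "affectedby", "causeofdeath",
}

# Names the slide lists under "NOT": similar-looking, but not disease attributes.
EXCLUDED_ATTRIBUTES = {"diseasestage", "tumorstatus", "healthstatus"}


def canonical_attribute(name: str) -> str:
    """Return 'disease' for known disease-attribute variants, else the name unchanged."""
    key = name.strip().lower()
    if key in EXCLUDED_ATTRIBUTES:
        return name
    if key in DISEASE_ATTRIBUTES:
        return "disease"
    return name


def normalise_sample(sample: dict) -> dict:
    """Rename attributes; values are collected in lists so colliding keys lose nothing."""
    normalised = {}
    for name, value in sample.items():
        normalised.setdefault(canonical_attribute(name), []).append(value)
    return normalised
```

With this in place, a query for `disease` would reach samples that were originally described with `hostDisease` or `diagnosis`, while `diseaseStage` stays separate.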
Normalising sample descriptions through
annotation with ontologies
CL:CL_0000071
(blood vessel
endothelial cell)
obo:CHEBI_39867
(valproic acid)
NCBITaxon:NCBITaxon_9606
(Homo sapiens)
Curation
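As a rough illustration of what an annotated sample could look like, the sketch below keeps the free text next to an ontology term IRI. It assumes the standard OBO PURL pattern (`http://purl.obolibrary.org/obo/PREFIX_ID`); the attribute names `cell type`, `compound` and `organism` are hypothetical, and only the three terms come from the slide.

```python
OBO_BASE = "http://purl.obolibrary.org/obo/"


def obo_iri(curie: str) -> str:
    """Expand a CURIE such as 'CL:0000071' into its OBO PURL form."""
    prefix, local_id = curie.split(":", 1)
    return f"{OBO_BASE}{prefix}_{local_id}"


# Hypothetical sample record: the original text is kept, the term IRI is added alongside.
annotated_sample = {
    "cell type": {"text": "blood vessel endothelial cell", "term": obo_iri("CL:0000071")},
    "compound": {"text": "valproic acid", "term": obo_iri("CHEBI:39867")},
    "organism": {"text": "Homo sapiens", "term": obo_iri("NCBITaxon:9606")},
}
```

Keeping the original text next to the term lets a curator review the mapping later without losing what the submitter wrote.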
Ontology challenges
• How do I access ontologies?
• How do I map data to ontologies?
• Which ontologies should I use?
• What about data that doesn’t map?
• How can I translate from one ontology to another?
• How can I extend an ontology?
• How do I build “ontology aware” search applications?
• How do I publish this data?
SPOT team - Adding value with ontologies
• Data exploration and cleanup
• Data structuring
• Ontology annotation
• Data cleaning and mapping
• Ontology building
• FAIRified data
Data Enrichment Services
• Building an interoperability
toolkit for Europe (Elixir)
• Integrated (linked) APIs
• Plumbing for data curation
systems and workflows
• Lowering the barrier of entry to
ontologies for data stewards
New ontology lookup service!
The Ontology Toolkit
Search/Visualise ontologies
Annotate data
Ontology cross mapping
Create new ontology content
Webulous
Ontology Lookup Service
• Ontology search engine
• Ontology term history tracking
• Ontology visualisation
• Powerful RESTful API
Repository of over 160 pre-selected biomedical ontologies (4.5 million terms)
http://www.ebi.ac.uk/ols
• Provides unified mechanism to access
multiple ontologies
• Large community of users, 10s of millions of
hits per month
• Open source and dockerised
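A term search against the RESTful API might be put together as follows. This is a sketch under stated assumptions: the `/api/search` path, the `q`, `ontology` and `rows` parameters, and the `response.docs` shape of the reply should all be checked against the current OLS API documentation. No network call is made here; the functions only build the URL and read an already-decoded JSON payload.

```python
from urllib.parse import urlencode

# Base URL from the slide; the "/api/search" path is an assumption.
OLS_SEARCH = "http://www.ebi.ac.uk/ols/api/search"


def ols_search_url(query, ontology=None, rows=5):
    """Build a search URL, optionally restricted to one ontology."""
    params = {"q": query, "rows": rows}
    if ontology:
        params["ontology"] = ontology
    return f"{OLS_SEARCH}?{urlencode(params)}"


def extract_terms(payload):
    """Pull (id, label) pairs out of a decoded reply (assumed Solr-style shape)."""
    docs = payload.get("response", {}).get("docs", [])
    return [(doc.get("obo_id"), doc.get("label")) for doc in docs]


# Hand-written stand-in for a reply, for illustration only.
sample_payload = {"response": {"docs": [{"obo_id": "UBERON:0000948", "label": "heart"}]}}
terms = extract_terms(sample_payload)
```

The URL can be fetched with any HTTP client; separating URL construction from parsing keeps the sketch testable offline.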
Zooma
• Optimal mappings based on data we have seen previously
• Favours precision over recall
• Captures annotations + context (context is very important)
• Currently contains over 92,000 annotations from 7 resources
• ClinVar, Cellular Phenotype Database, ExpressionAtlas, UniProt, GWAS, EBiSC, OpenTargets
• Used to improve and share their mappings across resources
Repository of curated ontology mappings
http://www.ebi.ac.uk/spot/zooma
“Heart”
UBERON:0000948
A Zooma Mapping
+ Context
(where, when, why?)
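The idea of a mapping plus context, with precision favoured over recall, can be sketched with a small in-memory store. This is not Zooma's API or algorithm: the `Annotation` fields, the `lookup` function and the sample record (including its source name) are hypothetical. The lookup answers only when all previously seen annotations agree on a single term.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    property_type: str   # context: the kind of attribute the value appeared under
    property_value: str  # the free text that was annotated
    term: str            # the ontology term it was mapped to
    source: str          # context: which resource asserted the mapping


# Made-up curated record; only the heart / UBERON:0000948 pairing is from the slide.
CURATED = [
    Annotation("organism part", "heart", "UBERON:0000948", "example-source"),
]


def lookup(value, property_type=None, annotations=CURATED):
    """Return a term only if the matching annotations agree on exactly one."""
    wanted = value.strip().lower()
    hits = [a for a in annotations if a.property_value.lower() == wanted]
    if property_type is not None:
        hits = [a for a in hits if a.property_type.lower() == property_type.lower()]
    terms = {a.term for a in hits}
    # Precision over recall: no answer is better than a doubtful one.
    return terms.pop() if len(terms) == 1 else None
```

Passing a `property_type` narrows the match to the same context, so "heart" seen under an unrelated attribute is left unmapped for a curator to review.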
New for 2017 – Ontology Cross Mapping
• Cross-references are a powerful tool for integrating data
• A lot of curator effort goes into building ontology cross-references
• It is currently hard to find and explore the ontology mapping space
Datasource 1 Datasource 2
Human Phenotype Ontology
SNOMED-CT
Mappings
Ontology Mapping Service (OxO)
• UI and API to expose known mappings from OBO, UMLS and
manually curated mapping sets (e.g. GWAS, OpenTargets)
• Normalised CURIE prefixes using identifiers.org
• SNOMED-CT: / SNOMEDCT: / SNOMED: / SNOMEDCT_
• Provides a “silver standard” to support predictive mapping algorithms
* Going live March 2017
http://www.ebi.ac.uk/spot/oxo *
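Prefix normalisation of the kind listed above can be sketched as a small alias table. The canonical spelling `SNOMEDCT` is an arbitrary choice for this example, not a statement of what identifiers.org or OxO uses, and the local identifier `12345` in the usage lines is a placeholder.

```python
import re

# Alias table: every spelling seen in the wild maps to one canonical prefix (assumed).
PREFIX_ALIASES = {
    "SNOMED-CT": "SNOMEDCT",
    "SNOMEDCT": "SNOMEDCT",
    "SNOMED": "SNOMEDCT",
}

# A prefix (letters, digits, hyphens), then ':' or '_', then the local identifier.
_CURIE = re.compile(r"^([A-Za-z][A-Za-z0-9-]*?)[:_](.+)$")


def normalise_curie(identifier: str) -> str:
    """Rewrite 'SNOMED-CT:12345' or 'SNOMEDCT_12345' into one 'PREFIX:ID' form."""
    match = _CURIE.match(identifier.strip())
    if not match:
        return identifier  # not CURIE-like; leave untouched
    prefix, local_id = match.groups()
    canonical = PREFIX_ALIASES.get(prefix.upper(), prefix)
    return f"{canonical}:{local_id}"


variants = ["SNOMED-CT:12345", "SNOMEDCT:12345", "SNOMED:12345", "SNOMEDCT_12345"]
normalised = {normalise_curie(v) for v in variants}
```

Once every source uses the same prefix, mappings from different resources can be joined on the identifier directly.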
Common questions
• How do I access ontologies?
• How do I map data to ontologies?
• Which ontologies should I use?
• What about data that doesn’t map?
• How can I translate from one ontology to another?
• How can I extend an ontology?
• How do I build “ontology aware” search applications?
• How do I publish this data?
Data annotation workflow
(Flowchart; the arrows and Yes/No branches are not recoverable from the transcript.)
Decision points:
• Is the data annotated to ontologies?
• Is there unmapped data?
• Can you find terms in OLS?
• Is it the ontology you want?
Steps:
• Search Zooma
• Search OLS
• Search OxO
• Create a new term (Webulous, OBO Foundry)
• Add mappings back to Zooma
• Get the application ontology from OLS
• Build a search index with BioSolr
• Publish structured data as RDF
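One plausible reading of the data annotation workflow is sketched below. The order of the steps is an assumption, since the transcript does not preserve the flowchart's arrows. The three services are passed in as plain callables (hypothetical stand-ins, not real client libraries), and the identifiers in the usage lines are placeholders.

```python
def annotate(value, zooma, ols, oxo, wanted_prefix):
    """Return (term, step) for a free-text value.

    zooma(value) and ols(value) return a CURIE or None;
    oxo(term, prefix) returns a cross-mapped CURIE or None.
    """
    term = zooma(value)                 # 1. reuse a previously curated mapping
    if term is not None:
        return term, "zooma"
    term = ols(value)                   # 2. otherwise search the ontologies
    if term is None:
        return None, "new-term-request"  # nothing found: propose a new term
    if term.split(":", 1)[0] == wanted_prefix:
        return term, "ols"              # found, and in the ontology we want
    mapped = oxo(term, wanted_prefix)   # 3. otherwise try a cross mapping
    if mapped is not None:
        return mapped, "oxo"
    return term, "ols-other-ontology"


# Placeholder lookups standing in for the three services.
zooma = {"heart": "UBERON:0000948"}.get
ols = {"sample text": "SRC:1", "other text": "TGT:7"}.get


def oxo(term, prefix):
    return {("SRC:1", "TGT"): "TGT:9"}.get((term, prefix))


result = annotate("sample text", zooma, ols, oxo, "TGT")
```

Any term accepted by a curator at steps 2 or 3 would then be added back to the curated mappings, so the next lookup stops at step 1.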
Summary
• Part of the FAIR process will be alignment with standards
• Already many standards and ontologies in use
• We build tools and services that help get you there
• You will have to do some curation
• But our tooling can capture that so we can share the burden
• How FAIR is FAIR enough?
• We’ll never FAIRify all of BioSamples
• Decide what your application is and optimise for that
Ontology team
Helen Parkinson
Tony Burdett
Sira Sarntivijai
Olga Vrousgou
Thomas Liener
Funding
• EMBL
• CORBEL: This project receives funding from the
European Union’s Horizon 2020 research and
innovation programme under grant agreement No
654248.
• EXCELERATE: ELIXIR-EXCELERATE is funded by
the European Commission within the Research
Infrastructures programme of Horizon 2020, grant
agreement number 676559.

Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
 
DMARDs Pharmacolgy Pharm D 5th Semester.pdf
DMARDs Pharmacolgy Pharm D 5th Semester.pdfDMARDs Pharmacolgy Pharm D 5th Semester.pdf
DMARDs Pharmacolgy Pharm D 5th Semester.pdf
 
bordetella pertussis.................................ppt
bordetella pertussis.................................pptbordetella pertussis.................................ppt
bordetella pertussis.................................ppt
 
Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
 

Semantics as a service at EMBL-EBI

  • 1. Simon Jupp Samples, Phenotypes and Ontologies EMBL-EBI Semantic services for data interoperability Elixir all hands meeting Interoperability workshop March 2017
  • 2. Ontology services as building blocks for FAIR • You need standards (ontologies and controlled vocabularies) to make data interpretable • Interpretable data is more readily interoperable • We can use interoperable data to build integrated systems that make the data more findable by users • The data become reusable when we use common standards • But, • There are a lot of standards • Doing this at scale for different domains is hard
  • 3. Improving Findability by greater Interoperability • Smarter searching • Data analysis • Data integration • Data visualisation
  • 4. BioSamples case study • description of material of biological interest • may be linked to assay data • sequencing, microarray, • proteomics • also imaging, etc • We’ve been making this data FAIR for many years
  • 5. The challenge - thousands of data attributes… • BioSamples is an example of real world experimental metadata • We see all the variability – warts and all • Good playground for building tooling to clean up and add value to this data • If we can build tooling that works for BioSamples – it’ll work anywhere!
  • 6. What are the disease attributes? diseaseState, hostDisease, clinicallyAffectedStatus, diagnosis, Infection, diseaseStatus, healthState, disease, clinicalInformation, hostHealthState, affectedBy, causeOfDeath. NOT: diseaseStage (info about the stage of a disease, e.g. "48 hai", "stage", "terminal"); diseasestage; tumorStatus: "non-tumor", 120, "Tumor", 100, "CSL +/+ Xenograft Tumor 1st"; healthStatus: "normal", "Allergic", "stressed", "NA(Not immunized)"
  • 7. Makes finding the right data hard
  • 8. Normalising sample descriptions through annotation with ontologies • CL:CL_0000071 (blood vessel endothelial cell) • obo:CHEBI_39867 (valproic acid) • NCBITaxon:NCBITaxon_9606 (Homo sapiens) • Curation
  • 9. Ontology challenges • How do I access ontologies? • How do I map data to ontologies? • Which ontologies should I use? • What about data that doesn’t map? • How can I translate from one ontology to another? • How can I extend an ontology? • How do I build “ontology aware” search applications? • How do I publish this data?
  • 10. SPOT team - Adding value with ontologies • Data Exploration and Cleanup • Data structuring • Ontology Annotation • Data cleaning and mapping • Ontology building • FAIRified data
  • 11. Data Enrichment Services • Building an interoperability toolkit for Europe (Elixir) • Integrated (linked) APIs • Plumbing for data curation systems and workflows • Lowering the barrier of entry to ontologies for data stewards New ontology lookup service!
  • 12. The Ontology Toolkit Search/Visualise ontologies Annotate data Ontology cross mapping Create new ontology content Webulous
  • 13. Ontology Lookup Service • Ontology search engine • Ontology term history tracking • Ontology visualisation • Powerful RESTful API Repository of over 160 pre-selected biomedical ontologies (4.5 million terms) http://www.ebi.ac.uk/ols • Provides unified mechanism to access multiple ontologies • Large community of users, 10s of millions of hits per month • Open source and dockerised
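Slide 13 mentions a RESTful API for searching across ontologies. A minimal sketch of how a client might build a search request and read the hits is given below. The `/api/search` path, the `q`, `ontology` and `rows` parameters, and the `response.docs` layout with `label`, `obo_id` and `ontology_name` fields are assumptions to check against the current OLS documentation; the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the slide only gives http://www.ebi.ac.uk/ols.
# The /api/search path and the response layout are assumptions.
OLS_SEARCH_URL = "https://www.ebi.ac.uk/ols/api/search"


def build_search_url(query, ontology=None, rows=5):
    """Build a search URL for a free-text query, optionally limited to one ontology."""
    params = {"q": query, "rows": rows}
    if ontology:
        params["ontology"] = ontology
    return OLS_SEARCH_URL + "?" + urlencode(params)


def extract_terms(payload):
    """Pull (label, identifier, ontology) triples out of a parsed search response."""
    docs = payload.get("response", {}).get("docs", [])
    return [
        (doc.get("label"), doc.get("obo_id"), doc.get("ontology_name"))
        for doc in docs
    ]


def search_ols(query, ontology=None, rows=5):
    """Run the search over HTTP (needs network access) and return the triples."""
    with urlopen(build_search_url(query, ontology, rows)) as response:
        return extract_terms(json.load(response))
```

Keeping URL construction and response parsing separate from the HTTP call means both can be checked offline, for example `extract_terms` against a saved response.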
  • 14. Zooma • Optimal mappings based on data we have seen previously • Favours precision over recall • Captures annotations + context – context is very important • Currently contains over 92,000 annotations from 7 resources • ClinVar, Cellular Phenotype Database, ExpressionAtlas, UniProt, GWAS, EBiSC, OpenTargets • Used to improve and share their mappings across resources • Repository of curated ontology mappings http://www.ebi.ac.uk/spot/zooma • Example: “Heart” → UBERON:0000948, a Zooma mapping + context (where, when, why?)
  • 15. New for 2017 – Ontology Cross Mapping • Cross-references are a powerful tool for integrating data • A lot of curator effort goes into building ontology cross-references • Currently hard to find/explore • Ontology mapping space: Datasource 1 (Human Phenotype Ontology) ↔ Mappings ↔ Datasource 2 (SNOMED-CT)
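The idea on slide 14, reusing previously curated annotations and preferring precision over recall, can be illustrated with a toy in-memory store. This is not Zooma's algorithm or API: the class, its method names and the `organism part` property type in the usage note are hypothetical, and the only rule applied is "return a term only when every earlier annotation of the same property/value pair agrees".

```python
from collections import defaultdict


class CuratedMappings:
    """Toy store of previously curated (property type, value) -> term annotations."""

    def __init__(self):
        # key -> list of (term, source) records; the source is the "context"
        self._seen = defaultdict(list)

    @staticmethod
    def _key(property_type, value):
        # Case- and whitespace-insensitive key (an assumption for this sketch)
        return (property_type.strip().lower(), value.strip().lower())

    def add(self, property_type, value, term, source):
        """Record that `source` annotated this property/value pair with `term`."""
        self._seen[self._key(property_type, value)].append((term, source))

    def lookup(self, property_type, value):
        """Return (term, sources) only if all earlier annotations agree, else None."""
        records = self._seen.get(self._key(property_type, value), [])
        terms = {term for term, _ in records}
        if len(terms) != 1:
            return None  # unseen or ambiguous: leave it for a curator
        sources = sorted({source for _, source in records})
        return terms.pop(), sources
```

For example, after adding `("organism part", "Heart", "UBERON:0000948", "ExpressionAtlas")`, a lookup of `("organism part", "heart")` returns the term with its source, while the same value under a different property type returns `None`. Refusing to answer in ambiguous cases is what trades recall for precision here.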
  • 16. Ontology Mapping Service (OxO) • UI and API to expose known mappings from OBO, UMLS and manually curated mapping sets (e.g. GWAS, OpenTargets) • Normalised CURIE prefixes using identifiers.org • SNOMED-CT: / SNOMEDCT: / SNOMED: / SNOMEDCT_ • Provides a “silver standard” to support predictive mapping algorithms • http://www.ebi.ac.uk/spot/oxo (going live March 2017)
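Slide 16 lists several spellings of the same SNOMED prefix. A minimal sketch of prefix normalisation follows, assuming a hand-written alias table rather than a lookup against identifiers.org. The canonical spelling `SNOMEDCT` and the function name are assumptions made for illustration.

```python
# Illustrative alias table. The SNOMED variants are the ones listed on the slide;
# choosing "SNOMEDCT" as the canonical form is an assumption.
PREFIX_ALIASES = {
    "SNOMED-CT": "SNOMEDCT",
    "SNOMEDCT": "SNOMEDCT",
    "SNOMED": "SNOMEDCT",
}


def normalise_curie(identifier):
    """Rewrite 'PREFIX:id' or 'PREFIX_id' as 'CANONICAL:id'."""
    text = identifier.strip()
    # Prefer ':' as the separator; fall back to the first '_' (e.g. CHEBI_39867)
    separator = ":" if ":" in text else "_"
    prefix, found, local_id = text.partition(separator)
    if not found or not local_id:
        raise ValueError(f"not a prefixed identifier: {identifier!r}")
    # Unknown prefixes pass through unchanged
    canonical = PREFIX_ALIASES.get(prefix.upper(), prefix)
    return f"{canonical}:{local_id}"
```

With this table, `SNOMED-CT:12345`, `snomed:12345` and `SNOMEDCT_12345` (a placeholder local identifier) all normalise to `SNOMEDCT:12345`, so mappings from different sources can be joined on a single key.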
  • 17. Common questions • How do I access ontologies? • How do I map data to ontologies? • Which ontologies should I use? • What about data that doesn’t map? • How can I translate from one ontology to another? • How can I extend an ontology? • How do I build “ontology aware” search applications? • How do I publish this data?
  • 18. Data annotation workflow (flowchart) • Start: Data • Decision points (each Yes/No): Is the data annotated to ontologies? Is there unmapped data? Can you find terms in OLS? Is it the ontology you want? • Steps: Search Zooma; Search OLS; Search OxO; Get the application ontology from OLS; Create a new term (Webulous, OBO foundry); Add mappings back to Zooma; Building a search index with BioSolr; Publishing structured data as RDF
  • 19. Summary • Part of the FAIR process will be alignment with standards • Already many standards and ontologies in use • We build tools and services that help get you there • You will have to do some curation • But our tooling can capture that so we can share the burden • How FAIR is FAIR enough? • We’ll never FAIRify all of BioSamples • Decide what your application is and optimise for that
  • 20. Ontology team • Helen Parkinson • Tony Burdett • Sira Sarntivijai • Olga Vrousgou • Thomas Liener • Funding • EMBL • CORBEL: This project receives funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 654248. • EXCELERATE: ELIXIR-EXCELERATE is funded by the European Commission within the Research Infrastructures programme of Horizon 2020, grant agreement number 676559.
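The arrows of the slide 18 flowchart did not survive in this transcript, so the ordering below is one assumed reading: try previously curated mappings first, then search for a term, then translate it into the wanted ontology, and fall back to requesting a new term. The three lookups are passed in as plain functions; their signatures and the returned route labels are hypothetical.

```python
def annotate_value(value, wanted_ontology, search_zooma, search_ols, search_oxo):
    """Return (term, route) for a free-text value, or (None, 'create new term').

    search_zooma(value) and search_ols(value) return a CURIE or None;
    search_oxo(term, wanted_ontology) returns a mapped CURIE or None.
    """
    # 1. Reuse a curated mapping if one exists
    term = search_zooma(value)
    if term is not None:
        return term, "zooma"

    # 2. Otherwise look for a matching ontology term
    term = search_ols(value)
    if term is None:
        return None, "create new term"

    # 3. Keep the term if it already comes from the wanted ontology
    if term.split(":")[0] == wanted_ontology:
        return term, "ols"

    # 4. Otherwise try to translate it through known cross-mappings
    mapped = search_oxo(term, wanted_ontology)
    if mapped is not None:
        return mapped, "oxo"
    return None, "create new term"
```

Any term accepted via the `ols` or `oxo` routes would then be a candidate for adding back to the curated mappings, matching the "Add mappings back to Zooma" step on the slide.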