Challenges for ontology repositories and applications to biomedicine and agronomy

Senior Researcher at INRAE (MISTEA) and University of Montpellier (LIRMM)
Clement Jonquet - jonquet@lirmm.fr
SIMBig’17 – Sept 6th 2017 – Lima, Peru
• LIRMM
• Clement Jonquet
• Vincent Emonet
• Anne Toulet
• Andon Tchechmedjiev
• Amine Abdaoui
• Zohra Bellahsene
• Amina Annane (ESI Algeria)
• Mathieu Roche (CIRAD)
• Sandra Bringay
• Juan Antonio Lossio Ventura
• Few MSc students / year
• Collaborators
• Pierre Larmande (IRD / IBC)
• Maguelonne Teisseire (IRSTEA)
• Mark Musen (BMIR)
• John Graybeal (NCBO)
• Stefan Darmoni (CISMEF)
• Sebastien Harispe (LGI2P)
• Adrien Coulet (LORIA)
• Elizabeth Arnaud (CGIAR)
• Cyril Pommier (INRA)
• Esther Dzalé-Yeumo (INRA)
C.Jonquet-SIMBig2017-Lima,Peru
▪ Ontologies & Ontology repositories
▪ 2 collaborative projects with ontology repositories and ontology-based services
▪ Challenges for ontology repositories & propositions/results
▪ Conclusion
▪ To provide canonical representation of scientific knowledge
▪ To annotate experimental data to enable interpretation, comparison, and discovery across databases
▪ To facilitate knowledge-based applications for
▪ Decision support
▪ Natural language processing
▪ Data integration
▪ But ontologies are: spread out, in different formats, of different sizes, with different structures
▪ Ontology libraries defined as
▪ “a library system that offers various functions for managing, adapting and standardizing groups of ontologies. It should fulfill the needs for re-use of ontologies. In this sense, an ontology library system should be easily accessible and offer efficient support for re-using existing relevant ontologies and standardizing them based on upper-level ontologies and ontology representation languages.” [Ding & Fensel, 2001]
▪ Ontology repositories defined as
▪ “a structured collection of ontologies (…) by using an Ontology Metadata Vocabulary. References and relations between ontologies and their modules build the semantic model of an ontology repository. Access to resources is realized through semantically-enabled interfaces applicable for humans and machines. Therefore a repository provides a formal query language” [Hartmann, Palma, Gomez-Perez, 2009]
▪ You’ve built an ontology, how do you let the world know?
▪ You need an ontology, where do you go to get it?
▪ How do you know whether an ontology is any good?
▪ How do you find data resources that are relevant to the domain of the ontology (or to specific terms)?
▪ How could you leverage your ontology to enable new science?
▪ How could you use ontologies without managing them?
▪ Open Ontology Repository initiative (late 2000s)
▪ 2010 ORES workshop
▪ Ontology Repositories and Editors for the Semantic Web
▪ Review of ontology repositories
▪ [Where to publish and find ontologies? D’Aquin & Noy, 2012]
▪ A bunch of papers on ontology recommendation & selection
▪ News
▪ New platform in 2015: AberOWL
▪ OLS 3.0, AgroPortal releases
▪ Ontology repositories / portals
▪ NCBO BioPortal
▪ Ontobee
▪ AberOWL
▪ EBI Ontology Lookup Service
▪ OKFN Linked Open Vocabularies
▪ ONKI Ontology Library Service
▪ MMI Ontology Registry and Repository
▪ ESIPportal
▪ AgroPortal
▪ SIFR BioPortal
▪ CISMEF HeTOP
▪ OntoHub
▪ Web indexes
▪ Watson, Swoogle, Sindice, Falcons
▪ Ontology libraries / listings (more or less updated)
▪ OBO Foundry
▪ WebProtégé
▪ Romulus
▪ DAML ontology library
▪ Colore
▪ FAO VEST Registry
▪ BioSharing
▪ DERI Vocabularies, OntologyDesignPatterns, Semanticweb.org, W3C Good ontologies
▪ Platform technology
▪ Mondeca ITM, LexEVS
▪ Abandoned projects
▪ Cupboard, Knoodl, Schemapedia, SchemaWeb, OntoSelect, OntoSearch, TONES
▪ Web repository for biomedical ontologies
▪ Make ontologies accessible and usable: abstraction on format, locations, structure, etc.
▪ Users can publish, download, browse, search, comment, align ontologies and use them for annotations both online and via a web services API.
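As an illustration of the web services API mentioned above, here is a minimal sketch that only builds a term-search request URL and sends nothing over the network. The base URL is the one given on the slides; the `/search` path, the `q`, `ontologies` and `apikey` parameter names, and the helper name `build_search_url` are assumptions to check against the BioPortal API documentation.

```python
from urllib.parse import urlencode

# Base URL of the BioPortal REST API, as given on the slides.
BIOPORTAL_API = "http://data.bioontology.org"


def build_search_url(term, ontologies=(), api_key="YOUR_API_KEY"):
    """Return a term-search URL; nothing is sent over the network."""
    params = {"q": term, "apikey": api_key}
    if ontologies:
        # Assumed parameter: comma-separated ontology acronyms.
        params["ontologies"] = ",".join(ontologies)
    return f"{BIOPORTAL_API}/search?{urlencode(params)}"


url = build_search_url("melanoma", ontologies=["MESH", "NCIT"])
print(url)
```

The same URL could then be fetched with any HTTP client once a real API key is supplied.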
• Online support for ontology
• Peer review & notes
• Versioning
• Mapping
• Search
• Resources
• Annotation
• Open source technology
• Packaged in a “virtual appliance”
• Set up your own “bioportal” in a few hours
http://bioportal.bioontology.org
[Figure: services exposed at http://data.bioontology.org]
• Ontology Services: Search, Traverse, Comment, Download
• Widgets: Tree-view, Auto-complete, Graph-view
• Annotation: Term recognition
• Data Access: Search “data” annotated with a given term
• Mapping Services: Create, Upload, Download
▪ NCI term browser (https://nciterms.nci.nih.gov)
▪ BioPortal first, then LexEVS
▪ Open Ontology Repository (OOR) Initiative (http://www.oor.net)
▪ Marine Metadata Interoperability Ontology Registry and Repository (http://mmisw.org)
▪ ESIPPortal (Earth Science Information Partners - http://semanticportal.esipfed.org )
▪ AgroPortal (http://agroportal.lirmm.fr)
▪ SIFR/French BioPortal (http://bioportal.lirmm.fr)
▪ And a few hospitals, research labs, with private data and specific needs (often in-house annotation)
http://www.lirmm.fr/sifr
▪ Ontology-based services to index, mine and retrieve French biomedical data
▪ In France, there is already a reference repository for medical terminologies but nothing public for annotation
▪ Crucial need for tools & services for French biomedical data
C. Jonquet, A. Annane, K. Bouarech, V. Emonet & S. Melzi. SIFR BioPortal: French biomedical ontologies and terminologies available for semantic annotation, In 16th Journées Francophones d'Informatique Médicale, JFIM'16. Genève, Suisse, July 2016.
http://bioportal.lirmm.fr
25 monolingual ontologies/terminologies
• From the UMLS or EHTOP
• Cleaned and checked for the Annotator purpose
French/SIFR Annotator
http://bioportal.lirmm.fr/annotator
http://agroportal.lirmm.fr
▪ Develop and support a reference ontology repository
▪ Primary focus on agronomy & closely related domains (plant sciences, nutrition and biodiversity)
▪ Reusing the NCBO BioPortal technology
▪ Avoid re-implementing what has already been done, facilitate interoperability
▪ Reusing the scientific outcomes, experience & methods of the biomedical domain
▪ Enable straightforward use of agronomy-related ontologies
▪ Respect the requirements & specificities of the agronomic community
▪ Fully semantic web compliant infrastructure
▪ Enable new science
C. Jonquet, A. Toulet, E. Arnaud, S. Aubin, E. Dzalé-Yeumo, V. Emonet, J. Graybeal, M. A. Musen, C. Pommier & P. Larmande. Reusing the NCBO BioPortal technology for agronomy to build AgroPortal, In 7th International Conference on Biomedical Ontologies, ICBO'16, Demo Session. Corvallis, Oregon, USA, August 2016.
➢ IBC Rice Genomics & AgroLD project
➢ Data integration and knowledge management related to rice (P. Larmande)
➢ RDA Wheat Data Interoperability working group
➢ Common framework for publishing wheat data (E. Dzalé-Yeumo)
➢ LovInra: INRA Linked Open Vocabularies
➢ Vocabularies produced by INRA scientists (S. Aubin)
➢ Crop Ontology project
➢ Ontologies for describing crop germplasm & traits (E. Arnaud)
➢ GODAN global map of agri-food data standards
➢ VEST/AgroPortal MAP of standards (V. Pesce)
Better ontology identification & selection (via ontology metadata)
Multilingualism
Ontology alignment (creation & use)
Catching up with relevant data: annotations and linked data
Generalized ontology-based services (keep quality while enabling horizontal studies)
Scale to multiple domains and to the number/variety of ontologies
Better ontology identification & selection (via ontology metadata)
▪ First role of an ontology repository is to handle ontology metadata (model, extract, edit, valorize)
▪ Everything about an ontology
▪ Intrinsic properties e.g., name, URI, creation date
▪ Relation to other ontologies e.g., imports, is mapped to, disagrees with
▪ Community contributions e.g., notes, projects using it, endorsements
▪ Content-based services e.g., SPARQL endpoint, bulk RDF download, search
▪ omv:usedOntologyEngineeringTool example
▪ What does it say about your community?
▪ Pick up properties and relations from 23 existing vocabularies
▪ Existing properties in ontology repositories (especially BioPortal)
▪ Non-specific properties that may “return to the ontology”
346 relevant properties that could be used to describe ontologies
127 used to build a new metadata model inside AgroPortal
[Figure: sources of properties]
▪ Ontology repositories metadata
▪ Other interesting vocabularies (e.g., IDOT, PAV, SD, DOAP, …)
▪ Standards & relevant (e.g., DC, DCAT, SKOS, OWL, PROV, OMV, VOID, VOAF, MOD, …)
Describe ontologies with semantic metadata
• Display “per ontology”
• Ontology specific properties => viewable and editable within the ontology specific page
• Everything you need to know about an ontology
• URIs used in the backend to store the information
• e.g., CC-BY => https://creativecommons.org/licenses/by-nd/4.0/
• “Get my metadata back” buttons
▪ Lets users search, order and select ontologies using a faceted search approach based on the metadata
▪ 4 additional ways to filter ontologies in the list
▪ 2 new options to sort this list (name, release date).
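The faceted selection described above can be sketched as a filter over metadata records followed by a sort. This is a minimal illustration, not AgroPortal's implementation: the `OntologyRecord` fields, the acronyms and the dates are all made up for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class OntologyRecord:
    # Hypothetical metadata fields chosen for illustration.
    acronym: str
    name: str
    natural_language: str
    syntax: str
    released: date


def facet_filter(records, **facets):
    """Keep records whose metadata equals every requested facet value."""
    return [r for r in records
            if all(getattr(r, field) == value for field, value in facets.items())]


# Made-up catalogue entries.
CATALOG = [
    OntologyRecord("ONT_A", "Wheat traits", "en", "OWL", date(2016, 5, 1)),
    OntologyRecord("ONT_B", "Thesaurus agricole", "fr", "SKOS", date(2017, 2, 10)),
    OntologyRecord("ONT_C", "Rice anatomy", "en", "OWL", date(2017, 6, 20)),
]

# Facet on language and syntax, then sort by release date, newest first.
hits = sorted(facet_filter(CATALOG, natural_language="en", syntax="OWL"),
              key=lambda r: r.released, reverse=True)
print([r.acronym for r in hits])
```

Each extra facet simply narrows the list further, which is why richer metadata directly translates into finer selection.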
Display “per property”
▪ Global presentation of the properties
▪ Synthesis diagrams & listing
▪ Lets users explore the agronomical ontology landscape by automatically aggregating the metadata fields of each ontology into explicit visualizations (charts, term clouds and graphs).
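The "per property" aggregation can be sketched as a count of the values that one metadata property takes across all ontologies, which is the data behind a chart or term cloud. The records below are made up; only the property name omv:usedOntologyEngineeringTool comes from the slides.

```python
from collections import Counter

# Made-up metadata records; the key mimics the
# omv:usedOntologyEngineeringTool example from the slides.
METADATA = [
    {"acronym": "ONT_A", "omv:usedOntologyEngineeringTool": "Protégé"},
    {"acronym": "ONT_B", "omv:usedOntologyEngineeringTool": "OBO-Edit"},
    {"acronym": "ONT_C", "omv:usedOntologyEngineeringTool": "Protégé"},
    {"acronym": "ONT_D"},  # property not filled in
]


def aggregate(records, prop):
    """Count how many ontologies declare each value of one property."""
    return Counter(r.get(prop, "(unspecified)") for r in records)


counts = aggregate(METADATA, "omv:usedOntologyEngineeringTool")
for value, n in counts.most_common():
    print(f"{value}: {n}")
```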
C. Jonquet, A. Toulet, V. Emonet. Two years after: a review of vocabularies and ontologies in AgroPortal, In International Workshop on sources and data integration in agriculture, food and environment using ontologies, IN-OVIVE'17. Montpellier, France, July 2017. pp. 13. EFITA.
▪ Develop a new standard ontology metadata model
▪ Harmonize ontology repositories
▪ MOD project: Metadata for Ontology Description and Publication Ontology
▪ https://github.com/sifrproject/MOD-Ontology
▪ Add features within ontology editors
Multilingualism
▪ Interface internationalization = displaying static elements of the user interface (e.g., menu names, help, etc.) in different languages
▪ Content internationalization = displaying BioPortal content (e.g., ontology labels, mappings, etc.) in different languages
▪ Multilingual = internationalization (display) + enabling full use of the functionalities and services of BioPortal for multilingual or monolingual ontologies
▪ completely and properly addressed (languages, translations, multilingual mappings, etc.)
▪ rich semantic description
▪ Being able to parse multilingual content in ontologies (from xml:lang to Lemon)
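To make the simplest case of "parsing multilingual content" concrete, the sketch below reads language-tagged labels (xml:lang) from a small RDF/XML snippet and groups them per language, using only the Python standard library. The snippet and its example.org URIs are hypothetical; a real portal would use a full RDF parser and would also handle richer models such as Lemon.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Tiny hand-written RDF/XML document; the example.org URIs are hypothetical.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <skos:Concept rdf:about="http://example.org/onto#disease">
    <skos:prefLabel xml:lang="en">disease</skos:prefLabel>
    <skos:prefLabel xml:lang="fr">maladie</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/onto#melanoma">
    <skos:prefLabel xml:lang="en">melanoma</skos:prefLabel>
    <skos:prefLabel xml:lang="fr">mélanome</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>"""


def labels_by_language(rdf_xml):
    """Map language tag -> {concept URI: preferred label}."""
    views = defaultdict(dict)
    root = ET.fromstring(rdf_xml)
    for concept in root.iter(f"{{{SKOS}}}Concept"):
        uri = concept.get(f"{{{RDF}}}about")
        for label in concept.findall(f"{{{SKOS}}}prefLabel"):
            # Labels without a language tag are filed under "und".
            views[label.get(XML_LANG, "und")][uri] = label.text
    return views


views = labels_by_language(DOC)
print(views["fr"])
```

Each per-language view can then feed a language-specific search or annotation index.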
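[Figure: one multilingual ontology (en:disease / fr:maladie, ..., en:cancer / fr:cancer, en:spindle cell sarcoma / fr:sarcome à cellules fusiformes, en:melanoma / fr:mélanome) compared with two monolingual ontologies: an English one (disease, ..., cancer, spindle cell sarcoma, melanoma) and a French one (maladie, ..., cancer, sarcome à cellules fusiformes, mélanome)]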
• Set of propositions
• Representation of natural language property for an ontology
• Representation of relation between ontologies
• Representation of multilingual translation mappings
• Then parse multilingual ontologies in the portal
• Then translate the user interface
C. Jonquet, V. Emonet & M. A. Musen. Roadmap for a multilingual BioPortal, In 4th Workshop on the Multilingual Semantic Web, MSW4'15. Portoroz, Slovenia, June 2015.
[Figure: the English ontology (disease, ..., cancer, spindle cell sarcoma, melanoma) and the French ontology (maladie, ..., cancer, sarcome à cellules fusiformes, mélanome) connected by gold:freeTranslation and gold:literalTranslation mappings, shown next to the multilingual ontology with en:/fr: labels, the Search index and the Annotator index]
▪ There is no formal representation of the translation links between translated ontologies and the original ones, and those mappings are not always formally available
▪ Reconciled more than 228K mappings between ten English ontologies hosted on NCBO BioPortal and their French translations
▪ We stored those mappings in the SIFR BioPortal, making them available in JSON-LD and connecting the 2 portals
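One way to picture such a translation mapping is as a small JSON-LD node that links a French concept to its English counterpart with both a SKOS relation and a GOLD translation property. This is a sketch under assumptions: the concept URIs are hypothetical, the GOLD namespace is assumed, and the node layout is not necessarily what the SIFR BioPortal returns.

```python
import json

SKOS = "http://www.w3.org/2004/02/skos/core#"
# Assumed namespace for the GOLD translation properties.
GOLD = "http://purl.org/linguistics/gold/"


def translation_mapping(french_uri, english_uri, skos_rel, gold_rel):
    """Describe one French -> English mapping as a JSON-LD node."""
    return {
        "@context": {"skos": SKOS, "gold": GOLD},
        "@id": french_uri,
        f"skos:{skos_rel}": {"@id": english_uri},
        f"gold:{gold_rel}": {"@id": english_uri},
    }


mapping = translation_mapping(
    "http://example.org/fr/melanome",   # hypothetical French concept URI
    "http://example.org/en/melanoma",   # hypothetical English concept URI
    "exactMatch",
    "freeTranslation",
)
print(json.dumps(mapping, indent=2))
```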
A. Annane, V. Emonet, F. Azouaou & C. Jonquet. Multilingual Mapping Reconciliation between English-French Biomedical Ontologies, In 6th International Conference on Web Intelligence, Mining and Semantics, WIMS'16. Nimes, France, June 2016. ACM.
Mappings between French ontologies (SIFR BioPortal) and English ontologies (NCBO BioPortal)
French ontology | Concepts | English ontology | Concepts | Mapped concepts | Mapping % | Mappings number | Properties (skos; gold)
STY | 133 | STY | 133 | 133 | 100% | 133 | exactMatch; freeTranslation
MDRFRE | 66378 | MEDDRA | 66378 | 66378 | 100% | 66378 | exactMatch; freeTranslation
CIF | 1495 | ICF | 1495 | 1495 | 100% | 1495 | exactMatch; freeTranslation
MTHMSTFRE | 1700 | MSTDE | 1699 | 1700 | 100% | 1700 | exactMatch; freeTranslation
MSHFRE | 26142 | MeSH | 252242 | 26220 | 99.79% | 26220 | exactMatch; freeTranslation
WHO-ARTFRE | 3482 | WHO | 1724 | 3482 | 100% | 3482 | broadMatch; translation
CISP2 | 745 | ICPC2P | 7537 | 665 | 70% | 5063 | narrowMatch; translation
MEDLINEPLUS | 795 | MEDLINEPLUS | 2113 | 771 | 97% | 1520 | closeMatch; translation
CIM-10 | 19853 | ICD10 | 12318 | 19813 | 99% | 19813 | exactMatch; freeTranslation (62%) and broadMatch; translation (37%)
SNMIFRE | 106266 | SNMI | 109150 | 102093 | 96% | 102093 | exactMatch; freeTranslation
Ontology alignment (creation & use)
▪ Ontologies, vocabularies, and terminologies will inevitably overlap in coverage
▪ Mappings do not always belong to an ontology
▪ The community needs a place to store and retrieve them
▪ Scalability
▪ That’s the role of the ontology repository
▪ Mappings need to be semantically described with plenty of provenance information
▪ Created mappings in BioPortal include:
▪ ID based mappings (CUI & URI)
▪ Lexical mappings (LOOM)
▪ “Responsible” for the content
Faria D. et al. Towards annotating potential incoherences in BioPortal mappings. ISWC 2014.
▪ Uploaded mappings are added by a user using the REST API (or the UI)
▪ Use of external tools
▪ Explicit metadata to make the distinction when using/retrieving mappings
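To illustrate what a lexical mapping with explicit provenance might look like, the sketch below pairs classes whose normalized labels are identical and tags each result with the process that produced it. It is written in the spirit of lexical matching and is not the LOOM algorithm itself; the `Mapping` fields, the `skos:closeMatch` choice and the sample labels are assumptions.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    source_class: str
    target_class: str
    relation: str
    process: str  # provenance: which procedure produced the mapping


def normalize(label):
    """Lowercase and drop non-alphanumerics (ASCII-only simplification)."""
    return re.sub(r"[^a-z0-9]", "", label.lower())


def lexical_mappings(source_labels, target_labels, min_length=3):
    """Pair classes whose normalized labels are identical."""
    index = {}
    for uri, label in target_labels.items():
        index.setdefault(normalize(label), []).append(uri)
    found = []
    for uri, label in source_labels.items():
        key = normalize(label)
        if len(key) < min_length:
            continue  # very short labels match too easily
        for target in index.get(key, []):
            found.append(Mapping(uri, target, "skos:closeMatch", "lexical"))
    return found


# Made-up class identifiers and labels.
A = {"a:1": "Spindle-cell sarcoma", "a:2": "Melanoma"}
B = {"b:7": "spindle cell sarcoma", "b:8": "Carcinoma"}
print(lexical_mappings(A, B))
```

Keeping the `process` field on every mapping is what lets a consumer later separate automatically created mappings from user-uploaded ones.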
Select ontologies to align (from BioPortal or not)
Align ontologies, for instance with YAM++
Automatically export the results to BioPortal
Reuse mappings for annotation, indexing and future mapping generation
Catching up with relevant data: annotations and linked data
▪ Data deluge
▪ Not necessarily connected to relevant ontologies
▪ Annotate data with ontology concepts
▪ Horizontal approach
[Figure: ONTOLOGIES and RESOURCES]
C. Jonquet, P. LePendu, S. Falconer, A. Coulet, N. F. Noy, M. A. Musen & N. H. Shah. NCBO Resource Index: Ontology-Based Search and Mining of Biomedical Resources, Web Semantics. September 2011. Vol. 9 (3), pp. 316-324. Elsevier.
▪ Ontologies and data change every day
▪ Need to be able to handle the “deltas” only
▪ Work on terminology and knowledge extraction from text
▪ BioTex (http://tubo.lirmm.fr/biotex)
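The "deltas only" requirement above can be sketched as a comparison between two versions of an ontology, so that only the classes that changed need their annotations recomputed. This is an illustrative sketch with made-up identifiers, not the Resource Index implementation, and it only looks at labels (a real diff would also consider synonyms and hierarchy).

```python
def ontology_delta(old_terms, new_terms):
    """Compare two versions given as {class URI: label} dictionaries."""
    old_ids, new_ids = set(old_terms), set(new_terms)
    return {
        "added": new_ids - old_ids,
        "removed": old_ids - new_ids,
        "relabelled": {i for i in old_ids & new_ids
                       if old_terms[i] != new_terms[i]},
    }


# Made-up versions of a tiny ontology.
V1 = {"x:1": "melanoma", "x:2": "sarcoma", "x:3": "tumour"}
V2 = {"x:1": "melanoma", "x:2": "soft tissue sarcoma", "x:4": "carcinoma"}

delta = ontology_delta(V1, V2)
# Only these classes need their annotations recomputed or dropped.
to_reprocess = delta["added"] | delta["removed"] | delta["relabelled"]
print(sorted(to_reprocess))
```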
J.A. Lossio-Ventura, C. Jonquet, M. Roche & M. Teisseire. Biomedical term extraction: overview and a new methodology, Information Retrieval, Special issue on Medical Information Retrieval. August 2015. Vol. 19 (1), pp. 59-99. Springer.
▪ We built the NCBO Resource Index as a searchable database of around 50 biomedical resources, semantically indexed with annotations
▪ Since then, linked open data has become the approach in the semantic web
▪ In agronomy: build a database of resources described in RDF and annotated with ontologies: the AgroLD project
[Figure: NCBO BioPortal data as of 2013]
www.agrold.org
[Figure: AgroLD ontologies]
Generalized ontology-based services (keep quality while enabling horizontal studies)
▪ The role of the portal is to offer services for ontologies
▪ Focus here is on the use of ontologies for annotation purposes
▪ How can a repository facilitate the use of ontologies for annotation?
▪ Text mining challenge (disambiguation, context, negation, modality, time)
▪ Electronic Health Records
▪ Improve the NCBO Annotator results by ranking the annotations according to their relevance
▪ While not changing the service implementation
▪ Take into account their frequencies (as originally proposed in 2009, then removed)
▪ Add a term extraction measure, called C-Value, used to positively discriminate annotations generated from matches with multi-word terms.
▪ Mostly improves annotations done with multi-word terms
▪ 2 new scoring methods that score and rank annotations by their importance in the given input data
▪ Interesting results validated against PubMed manual annotations
S. Melzi & C. Jonquet. Scoring semantic annotations returned by the NCBO Annotator, In 7th International Semantic Web Applications and Tools for Life Sciences, SWAT4LS'14. Berlin, Germany, Dec. 2014.
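For readers unfamiliar with C-Value, here is a minimal sketch of the measure in its commonly cited textbook form: a term's frequency, discounted by the occurrences explained by longer terms that contain it, and weighted by the log of its length in words. The exact variant and the way it is combined with annotation frequencies in the cited paper may differ; the frequencies below are made up.

```python
import math


def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous word sequence in `longer`."""
    a, b = longer.split(), shorter.split()
    return len(b) < len(a) and any(
        a[i:i + len(b)] == b for i in range(len(a) - len(b) + 1))


def c_values(freq):
    """C-value for every candidate term in a {term: frequency} dict."""
    scores = {}
    for term, f in freq.items():
        longer = [t for t in freq if contains(t, term)]
        if longer:
            # Discount occurrences that are explained by longer terms.
            f = f - sum(freq[t] for t in longer) / len(longer)
        scores[term] = math.log2(len(term.split())) * f
    return scores


# Made-up frequencies for two nested candidate terms.
FREQ = {"spindle cell sarcoma": 3, "cell sarcoma": 5}
scores = c_values(FREQ)
print(scores)
```

With this definition a single-word term always scores 0 (log2 of 1), which is one way to see why the measure favours multi-word terms; some variants use log2(length + 1) to avoid that.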
▪ Project SIFR & PratikPharma
▪ Detecting Negation, Temporality and Experiencer
▪ Implementation using NegEx/ConText
▪ Inclusion in the French/SIFR Annotator
▪ Proxy architecture to plug this into the NCBO Annotator
▪ Very good performance results
▪ e.g., negation F1 between 0.8 and 0.9
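The idea behind trigger-based negation detection can be sketched in a few lines: a concept mention is flagged as negated when a trigger phrase appears within a short window of tokens before it. This is a toy simplification in the spirit of NegEx, not the NegEx/ConText implementation used in the project; the trigger list, the window size and the English examples are assumptions.

```python
import re

# A handful of English trigger phrases for illustration only;
# real NegEx/ConText rule sets are far larger and language specific.
NEGATION_TRIGGERS = ("no", "not", "without", "denies", "no evidence of")


def is_negated(sentence, concept, window=5):
    """True if a trigger occurs within `window` tokens before the concept."""
    tokens = re.findall(r"\w+", sentence.lower())
    target = re.findall(r"\w+", concept.lower())
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            # Only the first mention of the concept is examined.
            scope = " ".join(tokens[max(0, i - window):i])
            return any(re.search(rf"\b{re.escape(t)}\b", scope)
                       for t in NEGATION_TRIGGERS)
    return False  # concept not mentioned at all


print(is_negated("The patient denies chest pain.", "chest pain"))
print(is_negated("The patient reports chest pain.", "chest pain"))
```

Temporality and experiencer detection follow the same pattern with different trigger lists and scopes.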
Scale to multiple domains and to the number/variety of ontologies
▪ There are 596 ontologies and 110+ ontology views in BioPortal right now
▪ Mostly biology and medicine
▪ Overlaps with other domains
▪ Lots of upper level ontologies
▪ Lots of vocabularies
▪ Swoogle in 2007: “Search over 10,000 ontologies”. How many now?
▪ No repository (except the web itself) will handle them all, while keeping the level of features (and curation?)
▪ Will each domain build its own technology?
▪ Sharing the technology is the best way to guarantee long term support and future development
▪ Developers all around the world
▪ Different funders & support
▪ Sharing the technology is the best way to make ontology repositories interoperable
▪ UI does not really matter
▪ We should be able to make a new portal for another community in minutes
▪ Avoid duplicating ontologies
▪ Connect portals to one another
▪ Through mappings as we did with translation mappings
▪ The annotator proxy feature
▪ Implement and discuss standards
▪ SKOS handling in BioPortal
▪ Ontology metadata description
▪ Most of our new features are developed within a proxy
▪ E.g., we can call either the AgroPortal, SIFR BioPortal or even the NCBO BioPortal Annotator and use the same code to score annotations
▪ Used this to set up an enhanced version of the NCBO Annotator
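The proxy idea can be sketched as one function that takes the name of a portal, calls that portal's annotator, and applies the same post-processing whatever the backend. Everything specific here is an assumption: only the NCBO host appears on the slides, the other two REST hosts, the `/annotator` path, the `text` and `apikey` parameters and the response shape are guesses, and the frequency ranking is a placeholder for the real scoring methods.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

# Only the NCBO host appears on the slides; the other two REST hosts
# are assumptions made for this sketch.
PORTALS = {
    "ncbo": "http://data.bioontology.org",
    "sifr": "http://data.bioportal.lirmm.fr",
    "agroportal": "http://data.agroportal.lirmm.fr",
}


def http_get_json(url):
    """Default transport: fetch a URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def annotate_and_score(portal, text, api_key, fetch=http_get_json):
    """Call one portal's annotator, then rank classes by match frequency."""
    query = urlencode({"text": text, "apikey": api_key})
    results = fetch(f"{PORTALS[portal]}/annotator?{query}")
    # Assumed response shape: a list of items, each with an
    # "annotatedClass" and the list of its matches in "annotations".
    freq = Counter()
    for item in results:
        freq[item["annotatedClass"]["@id"]] += len(item["annotations"])
    return freq.most_common()


def fake_fetch(url):
    """Stands in for the network so the example runs offline."""
    return [
        {"annotatedClass": {"@id": "ex:melanoma"}, "annotations": [{}, {}]},
        {"annotatedClass": {"@id": "ex:disease"}, "annotations": [{}]},
    ]


ranking = annotate_and_score("sifr", "mélanome ...", "KEY", fetch=fake_fetch)
print(ranking)
```

Because the transport is injected, the scoring code is identical for every portal, which is the point of developing features in a proxy rather than inside each backend.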
▪ A remote BioPortal UI which actually talks to the main BioPortal REST API
▪ Interesting for future interoperable BioPortal instances
▪ We discussed the importance of ontology repositories and the span of ontology-based services they can (should) offer
▪ Reviewed some of the challenges in that domain of research, in light of our 2 projects (SIFR & AgroPortal)
▪ Reviewed some of the results obtained & propositions made
▪ Some are work in progress
www.lirmm.fr/sifr
"Running students' code in isolation. The hard way", Yurii Holiuk Fwdays
11 views34 slides
Case Study Copenhagen Energy and Business Central.pdf by
Case Study Copenhagen Energy and Business Central.pdfCase Study Copenhagen Energy and Business Central.pdf
Case Study Copenhagen Energy and Business Central.pdfAitana
16 views3 slides
Scaling Knowledge Graph Architectures with AI by
Scaling Knowledge Graph Architectures with AIScaling Knowledge Graph Architectures with AI
Scaling Knowledge Graph Architectures with AIEnterprise Knowledge
30 views15 slides
Voice Logger - Telephony Integration Solution at Aegis by
Voice Logger - Telephony Integration Solution at AegisVoice Logger - Telephony Integration Solution at Aegis
Voice Logger - Telephony Integration Solution at AegisNirmal Sharma
39 views1 slide
Piloting & Scaling Successfully With Microsoft Viva by
Piloting & Scaling Successfully With Microsoft VivaPiloting & Scaling Successfully With Microsoft Viva
Piloting & Scaling Successfully With Microsoft VivaRichard Harbridge
12 views160 slides
Evolving the Network Automation Journey from Python to Platforms by
Evolving the Network Automation Journey from Python to PlatformsEvolving the Network Automation Journey from Python to Platforms
Evolving the Network Automation Journey from Python to PlatformsNetwork Automation Forum
13 views21 slides

Recently uploaded(20)

"Running students' code in isolation. The hard way", Yurii Holiuk by Fwdays
"Running students' code in isolation. The hard way", Yurii Holiuk "Running students' code in isolation. The hard way", Yurii Holiuk
"Running students' code in isolation. The hard way", Yurii Holiuk
Fwdays11 views
Case Study Copenhagen Energy and Business Central.pdf by Aitana
Case Study Copenhagen Energy and Business Central.pdfCase Study Copenhagen Energy and Business Central.pdf
Case Study Copenhagen Energy and Business Central.pdf
Aitana16 views
Voice Logger - Telephony Integration Solution at Aegis by Nirmal Sharma
Voice Logger - Telephony Integration Solution at AegisVoice Logger - Telephony Integration Solution at Aegis
Voice Logger - Telephony Integration Solution at Aegis
Nirmal Sharma39 views
Piloting & Scaling Successfully With Microsoft Viva by Richard Harbridge
Piloting & Scaling Successfully With Microsoft VivaPiloting & Scaling Successfully With Microsoft Viva
Piloting & Scaling Successfully With Microsoft Viva
HTTP headers that make your website go faster - devs.gent November 2023 by Thijs Feryn
HTTP headers that make your website go faster - devs.gent November 2023HTTP headers that make your website go faster - devs.gent November 2023
HTTP headers that make your website go faster - devs.gent November 2023
Thijs Feryn22 views
Unit 1_Lecture 2_Physical Design of IoT.pdf by StephenTec
Unit 1_Lecture 2_Physical Design of IoT.pdfUnit 1_Lecture 2_Physical Design of IoT.pdf
Unit 1_Lecture 2_Physical Design of IoT.pdf
StephenTec12 views
iSAQB Software Architecture Gathering 2023: How Process Orchestration Increas... by Bernd Ruecker
iSAQB Software Architecture Gathering 2023: How Process Orchestration Increas...iSAQB Software Architecture Gathering 2023: How Process Orchestration Increas...
iSAQB Software Architecture Gathering 2023: How Process Orchestration Increas...
Bernd Ruecker37 views
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ... by Jasper Oosterveld
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...
TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f... by TrustArc
TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f...TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f...
TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f...
TrustArc10 views
Igniting Next Level Productivity with AI-Infused Data Integration Workflows by Safe Software
Igniting Next Level Productivity with AI-Infused Data Integration Workflows Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Safe Software263 views
Special_edition_innovator_2023.pdf by WillDavies22
Special_edition_innovator_2023.pdfSpecial_edition_innovator_2023.pdf
Special_edition_innovator_2023.pdf
WillDavies2217 views
【USB韌體設計課程】精選講義節錄-USB的列舉過程_艾鍗學院 by IttrainingIttraining
【USB韌體設計課程】精選講義節錄-USB的列舉過程_艾鍗學院【USB韌體設計課程】精選講義節錄-USB的列舉過程_艾鍗學院
【USB韌體設計課程】精選講義節錄-USB的列舉過程_艾鍗學院
Powerful Google developer tools for immediate impact! (2023-24) by wesley chun
Powerful Google developer tools for immediate impact! (2023-24)Powerful Google developer tools for immediate impact! (2023-24)
Powerful Google developer tools for immediate impact! (2023-24)
wesley chun10 views

Challenges for ontology repositories and applications to biomedicine and agronomy

  • 1. Clement Jonquet - jonquet@lirmm.fr SIMBig’17 – Sept 6th 2017 – Lima, Peru
  • 2.
      ▪ LIRMM: Clement Jonquet, Vincent Emonet, Anne Toulet, Andon Tchechmedjiev, Amine Abdaoui, Zohra Bellahsene, Amina Annane (ESI Algeria), Mathieu Roche (CIRAD), Sandra Bringay, Juan Antonio Lossio Ventura, and a few MSc students per year
      ▪ Collaborators: Pierre Larmande (IRD / IBC), Maguelonne Teisseire (IRSTEA), Mark Musen (BMIR), John Graybeal (NCBO), Stefan Darmoni (CISMEF), Sebastien Harispe (LGI2P), Adrien Coulet (LORIA), Elizabeth Arnaud (CGIAR), Cyril Pommier (INRA), Esther Dzalé-Yeumo (INRA)
  • 3.
      ▪ Ontologies & ontology repositories
      ▪ 2 collaborative projects with ontology repositories and ontology-based services
      ▪ Challenges for ontology repositories & propositions/results
      ▪ Conclusion
  • 5.
      ▪ To provide canonical representation of scientific knowledge
      ▪ To annotate experimental data to enable interpretation, comparison, and discovery across databases
      ▪ To facilitate knowledge-based applications for
          ▪ Decision support
          ▪ Natural language processing
          ▪ Data integration
      ▪ But ontologies are spread out, in different formats, of different sizes, with different structures
  • 6.
      ▪ Ontology libraries defined as
          ▪ “a library system that offers various functions for managing, adapting and standardizing groups of ontologies. It should fulfill the needs for re-use of ontologies. In this sense, an ontology library system should be easily accessible and offer efficient support for re-using existing relevant ontologies and standardizing them based on upper-level ontologies and ontology representation languages.” [Ding & Fensel, 2001]
      ▪ Ontology repositories defined as
          ▪ “a structured collection of ontologies (…) by using an Ontology Metadata Vocabulary. References and relations between ontologies and their modules build the semantic model of an ontology repository. Access to resources is realized through semantically-enabled interfaces applicable for humans and machines. Therefore a repository provides a formal query language” [Hartmann, Palma, Gomez-Perez, 2009]
  • 7.
      ▪ You’ve built an ontology, how do you let the world know?
      ▪ You need an ontology, where do you go to get it?
      ▪ How do you know whether an ontology is any good?
      ▪ How do you find data resources that are relevant to the domain of the ontology (or to specific terms)?
      ▪ How could you leverage your ontology to enable new science?
      ▪ How could you use ontologies without managing them?
  • 8.
      ▪ Open Ontology Repository initiative (late 2000s)
      ▪ 2010 ORES workshop: Ontology Repositories and Editors for the Semantic Web
      ▪ Review of ontology repositories: [Where to publish and find ontologies? D’Aquin & Noy, 2012]
      ▪ Several papers on ontology recommendation & selection
      ▪ News
          ▪ New platform in 2015: Aber-OWL
          ▪ OLS 3.0, AgroPortal releases
  • 9.
      ▪ Ontology repositories / portals: NCBO BioPortal, Ontobee, AberOWL, EBI Ontology Lookup Service, OKFN Linked Open Vocabularies, ONKI Ontology Library Service, MMI Ontology Registry and Repository, ESIPportal, AgroPortal, SIFR BioPortal, CISMEF HeTOP, OntoHub
      ▪ Web indexes: Watson, Swoogle, Sindice, Falcons
      ▪ Ontology libraries / listings (more or less updated): OBO Foundry, WebProtégé, Romulus, DAML ontology library, Colore, FAO VEST Registry, BioSharing, DERI Vocabularies, OntologyDesignPatterns, Semanticweb.org, W3C Good ontologies
      ▪ Platform technology: Mondeca ITM, LexEVS
      ▪ Abandoned projects: Cupboard, Knoodl, Schemapedia, SchemaWeb, OntoSelect, OntoSearch, TONES
  • 10.
      ▪ Web repository for biomedical ontologies
      ▪ Make ontologies accessible and usable: abstraction on format, locations, structure, etc.
      ▪ Users can publish, download, browse, search, comment, align ontologies and use them for annotations both online and via a web services API.
  • 11.
      ▪ Online support for ontology: peer review & notes, versioning, mapping, search, resources, annotation
      ▪ Open source technology
      ▪ Packaged in a “virtual appliance”
      ▪ Set up your own “bioportal” in a few hours
  • 12. BioPortal services (web UI: http://bioportal.bioontology.org; REST API: http://data.bioontology.org)
      ▪ Ontology services: search, traverse, comment, download
      ▪ Widgets: tree-view, auto-complete, graph-view
      ▪ Mapping services: create, upload, download
      ▪ Annotation: term recognition
      ▪ Data access: search “data” annotated with a given term
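As a rough illustration of how the REST API at http://data.bioontology.org might be called, the sketch below only builds request URLs and sends nothing over the network. The base URL comes from the slide; the `search` and `annotator` endpoint paths, the `q`, `text`, `ontologies` and `apikey` parameter names, and the `build_url` helper are assumptions to check against the API documentation.

```python
from urllib.parse import urlencode

# Base URL taken from the slide; everything below it is an assumption.
BASE_URL = "http://data.bioontology.org"


def build_url(endpoint, api_key, **params):
    """Return a GET URL for one service, appending the API key last."""
    query = urlencode({**params, "apikey": api_key})
    return f"{BASE_URL}/{endpoint}?{query}"


# Ontology search for a term (placeholder API key).
search_url = build_url("search", "YOUR_API_KEY", q="melanoma")

# Term recognition on a short text, restricted to one ontology acronym.
annotator_url = build_url(
    "annotator", "YOUR_API_KEY", text="spindle cell sarcoma", ontologies="MESH"
)
```

Keeping URL construction separate from the HTTP call makes it easy to point the same code at another portal by swapping `BASE_URL`.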
  • 13.
      ▪ NCI term browser (https://nciterms.nci.nih.gov): BioPortal first, then LexEVS
      ▪ Open Ontology Repository (OOR) Initiative (http://www.oor.net)
      ▪ Marine Metadata Interoperability Ontology Registry and Repository (http://mmisw.org)
      ▪ ESIPPortal (Earth Science Information Partners, http://semanticportal.esipfed.org)
      ▪ AgroPortal (http://agroportal.lirmm.fr)
      ▪ SIFR/French BioPortal (http://bioportal.lirmm.fr)
      ▪ And a few hospitals and research labs with private data and specific needs (often in-house annotation)
  • 16. http://www.lirmm.fr/sifr
      ▪ Ontology-based services to index, mine and retrieve French biomedical data
      ▪ In France, there is already a reference repository for medical terminologies but nothing public for annotation
      ▪ Crucial need for tools & services for French biomedical data
  • 17. http://bioportal.lirmm.fr
      ▪ 25 monolingual ontologies/terminologies
          ▪ From the UMLS or EHTOP
          ▪ Cleaned and checked for the Annotator purpose
      ▪ C. Jonquet, A. Annane, K. Bouarech, V. Emonet & S. Melzi. SIFR BioPortal: French biomedical ontologies and terminologies available for semantic annotation, In 16th Journées Francophones d'Informatique Médicale JFIM'16. Genève, Suisse, July 2016.
  • 19. http://agroportal.lirmm.fr
      ▪ Develop and support a reference ontology repository
          ▪ Primary focus on agronomy & closely related domains (plant sciences, nutrition and biodiversity)
      ▪ Reusing the NCBO BioPortal technology
          ▪ Avoid re-implementing what has already been done, facilitate interoperability
          ▪ Reusing the scientific outcomes, experience & methods of the biomedical domain
      ▪ Enable straightforward use of agronomy-related ontologies
          ▪ Respect the requirements & specificities of the agronomic community
          ▪ Fully semantic web compliant infrastructure
      ▪ Enable new science
      ▪ C. Jonquet, A. Toulet, E. Arnaud, S. Aubin, E. Dzalé-Yeumo, V. Emonet, J. Graybeal, M. A. Musen, C. Pommier & P. Larmande. Reusing the NCBO BioPortal technology for agronomy to build AgroPortal, In 7th International Conference on Biomedical Ontologies, ICBO'16, Demo Session. Corvallis, Oregon, USA, August 2016.
  • 21.
      ➢ IBC Rice Genomics & AgroLD project: data integration and knowledge management related to rice (P. Larmande)
      ➢ RDA Wheat Data Interoperability working group: common framework for publishing wheat data (E. Dzalé-Yeumo)
      ➢ LovInra, INRA Linked Open Vocabularies: vocabularies produced by INRA scientists (S. Aubin)
      ➢ Crop Ontology project: ontologies for describing crop germplasm & traits (E. Arnaud)
      ➢ GODAN global map of agri-food data standards: VEST/AgroPortal map of standards (V. Pesce)
  • 23.
      ▪ Better ontology identification & selection (via ontology metadata)
      ▪ Multilingualism
      ▪ Ontology alignment (creation & use)
      ▪ Catching up with relevant data: annotations and linked data
      ▪ Generalized ontology-based services (keep quality while enabling horizontal studies)
      ▪ Scale to multiple domains and to the number/variety of ontologies
  • 25.
      ▪ The first role of an ontology repository is to handle ontology metadata (model, extract, edit, valorize)
      ▪ Everything about an ontology
          ▪ Intrinsic properties, e.g., name, URI, creation date
          ▪ Relations to other ontologies, e.g., imports, is mapped to, disagrees with
          ▪ Community contributions, e.g., notes, projects using it, endorsements
          ▪ Content-based services, e.g., SPARQL endpoint, bulk RDF download, search
      ▪ omv:usedOntologyEngineeringTool example
          ▪ What does it say about your community?
  • 26.
      ▪ Pick up properties and relations from 23 existing vocabularies
          ▪ Standard and relevant vocabularies (e.g., DC, DCAT, SKOS, OWL, PROV, OMV, VOID, VOAF, MOD, …)
          ▪ Other interesting vocabularies (e.g., IDOT, PAV, SD, DOAP, …)
      ▪ Existing metadata properties in ontology repositories (especially BioPortal)
      ▪ Non-specific properties that may “return to the ontology”
      ▪ 346 relevant properties that could be used to describe ontologies
      ▪ 127 used to build a new metadata model inside AgroPortal
  • 27. Describe ontologies with semantic metadata
      ▪ Display “per ontology”
          ▪ Ontology-specific properties are viewable and editable within the ontology-specific page
          ▪ Everything you need to know about an ontology
      ▪ URIs used in the backend to store the information
          ▪ e.g., CC-BY => https://creativecommons.org/licenses/by-nd/4.0/
      ▪ “Get my metadata back” buttons
  • 28.
      ▪ Allows users to search, order and select ontologies using a faceted search approach, based on the metadata
      ▪ 4 additional ways to filter ontologies in the list
      ▪ 2 new options to sort this list (name, release date)
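A minimal sketch of faceted selection over ontology metadata, assuming each ontology is described by a flat record. The records, the field names (`format`, `language`, `released`) and the helper functions are hypothetical and do not reflect the AgroPortal data model.

```python
# Hypothetical metadata records; acronyms, fields and values are illustrative only.
ontologies = [
    {"acronym": "ONTO_A", "name": "Wheat traits", "format": "OWL",
     "language": "en", "released": "2017-03-01"},
    {"acronym": "ONTO_B", "name": "Crop thesaurus", "format": "SKOS",
     "language": "fr", "released": "2016-11-15"},
    {"acronym": "ONTO_C", "name": "Plant anatomy", "format": "OWL",
     "language": "en", "released": "2016-05-20"},
]


def facet_filter(records, **facets):
    """Keep the records whose metadata matches every requested facet value."""
    return [r for r in records if all(r.get(k) == v for k, v in facets.items())]


def sort_by(records, key):
    """Order records by one metadata field (ISO dates sort correctly as strings)."""
    return sorted(records, key=lambda r: r[key])


owl_en = facet_filter(ontologies, format="OWL", language="en")
by_release = sort_by(owl_en, "released")
by_name = sort_by(owl_en, "name")
```

Each facet is just an equality test on a metadata field, so the quality of the selection depends directly on how completely the metadata is filled in.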
  • 29. Display “per property”
      ▪ Global presentation of the properties
      ▪ Synthesis diagrams & listing
      ▪ Allows users to explore the agronomical ontology landscape by automatically aggregating the metadata fields of each ontology in explicit visualizations (charts, term clouds and graphs)
      ▪ C. Jonquet, A. Toulet, V. Emonet. Two years after: a review of vocabularies and ontologies in AgroPortal, In International Workshop on sources and data integration in agriculture, food and environment using ontologies, IN-OVIVE'17. Montpellier, France, July 2017. pp. 13. EFITA.
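The “per property” view amounts to counting the values of one metadata field across all ontologies. The sketch below does this for omv:usedOntologyEngineeringTool; the ontology acronyms and the tool values are made up for illustration, and the `aggregate` helper is hypothetical.

```python
from collections import Counter

# Hypothetical values of omv:usedOntologyEngineeringTool, one per ontology.
# None stands for an ontology whose metadata does not fill in the property.
tools = {
    "ONTO_A": "Protégé",
    "ONTO_B": "VocBench",
    "ONTO_C": "Protégé",
    "ONTO_D": None,
}


def aggregate(values):
    """Count how often each value occurs, grouping missing values as 'unknown'."""
    return Counter(v if v is not None else "unknown" for v in values)


tool_counts = aggregate(tools.values())
most_used = tool_counts.most_common(1)[0]
```

The resulting counts are the kind of data a chart or term cloud would be drawn from.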
  • 33.
      ▪ Develop a new standard ontology metadata model
      ▪ Harmonize ontology repositories
      ▪ MOD project: Metadata for Ontology Description and Publication Ontology
          ▪ https://github.com/sifrproject/MOD-Ontology
      ▪ Add features within ontology editors
  • 35.
      ▪ Interface internationalization = displaying static elements of the user interface (e.g., menu names, help, etc.) in different languages
      ▪ Content internationalization = displaying BioPortal content (e.g., ontology labels, mappings, etc.) in different languages
      ▪ Multilingual = internationalization (display) + enabling full use of the functionalities and services of BioPortal for multilingual or monolingual ontologies
          ▪ completely and properly addressed (languages, translations, multilingual mappings, etc.)
          ▪ rich semantic description
      ▪ Being able to parse multilingual content in ontologies (from xml:lang to Lemon)
  • 36. Diagram: one multilingual ontology and the two monolingual ontologies it corresponds to
      ▪ Multilingual (language-tagged labels): en:disease / fr:maladie, ..., en:cancer / fr:cancer, en:spindle cell sarcoma / fr:sarcome à cellules fusiformes, en:melanoma / fr:mélanome
      ▪ English: disease, ..., cancer, spindle cell sarcoma, melanoma
      ▪ French: maladie, ..., cancer, sarcome à cellules fusiformes, mélanome
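The relation between a multilingual ontology and its monolingual views can be sketched as a projection of language-tagged labels. The concept identifiers (`C1` to `C4`) and the helper below are hypothetical; the labels are those of the slide.

```python
# Language-tagged labels per concept; the concept identifiers are hypothetical.
labels = {
    "C1": {"en": "disease", "fr": "maladie"},
    "C2": {"en": "cancer", "fr": "cancer"},
    "C3": {"en": "spindle cell sarcoma", "fr": "sarcome à cellules fusiformes"},
    "C4": {"en": "melanoma", "fr": "mélanome"},
}


def monolingual_view(multilingual, lang):
    """Project the labels onto one language, skipping concepts without that language."""
    return {
        concept: by_lang[lang]
        for concept, by_lang in multilingual.items()
        if lang in by_lang
    }


english = monolingual_view(labels, "en")
french = monolingual_view(labels, "fr")
```

Going the other way, from two separate monolingual ontologies to aligned concepts, requires explicit translation mappings, since nothing in the labels alone says that two concepts correspond.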
  • 37.
      ▪ Set of propositions
          ▪ Representation of the natural language property for an ontology
          ▪ Representation of relations between ontologies
          ▪ Representation of multilingual translation mappings
      ▪ Then parse multilingual ontologies in the portal
      ▪ Then translate the user interface
      ▪ Diagram: the English and French monolingual ontologies are linked by gold:freeTranslation and gold:literalTranslation mappings; together with the multilingual ontology they feed the search index and the Annotator index
      ▪ C. Jonquet, V. Emonet & M. A. Musen. Roadmap for a multilingual BioPortal, In 4th Workshop on the Multilingual Semantic Web, MSW4'15. Portoroz, Slovenia, June 2015.
  • 38.
      ▪ There is no formal representation of the translation links between translated ontologies and the original ones, and those mappings are not always formally available
      ▪ Reconciled more than 228K mappings between ten English ontologies hosted on NCBO BioPortal and their French translations
      ▪ We stored those mappings in the SIFR BioPortal, making them all available in JSON-LD and connecting the 2 portals (NCBO BioPortal and SIFR BioPortal)
      ▪ A. Annane, V. Emonet, F. Azouaou & C. Jonquet. Multilingual Mapping Reconciliation between English-French Biomedical Ontologies, In 6th International Conference on Web Intelligence, Mining and Semantics, WIMS'16. Nimes, France, June 2016. ACM.
  • 39.
      French ontology | Concepts | English ontology | Concepts | Mapped concepts | Mapping % | Number of mappings | Properties (skos; gold)
      STY             | 133      | STY              | 133      | 133             | 100%      | 133                | exactMatch; freeTranslation
      MDRFRE          | 66378    | MEDDRA           | 66378    | 66378           | 100%      | 66378              | exactMatch; freeTranslation
      CIF             | 1495     | ICF              | 1495     | 1495            | 100%      | 1495               | exactMatch; freeTranslation
      MTHMSTFRE       | 1700     | MSTDE            | 1699     | 1700            | 100%      | 1700               | exactMatch; freeTranslation
      MSHFRE          | 26142    | MeSH             | 252242   | 26220           | 99.79%    | 26220              | exactMatch; freeTranslation
      WHO-ARTFRE      | 3482     | WHO              | 1724     | 3482            | 100%      | 3482               | broadMatch; translation
      CISP2           | 745      | ICPC2P           | 7537     | 665             | 70%       | 5063               | narrowMatch; translation
      MEDLINEPLUS     | 795      | MEDLINEPLUS      | 2113     | 771             | 97%       | 1520               | closeMatch; translation
      CIM-10          | 19853    | ICD10            | 12318    | 19813           | 99%       | 19813              | exactMatch; freeTranslation (62%) and broadMatch; translation (37%)
      SNMIFRE         | 106266   | SNMI             | 109150   | 102093          | 96%       | 102093             | exactMatch; freeTranslation
  • 42.
      ▪ Ontologies, vocabularies, and terminologies will inevitably overlap in coverage
      ▪ Mappings do not always belong to an ontology
      ▪ The community needs a place to store and retrieve them
          ▪ Scalability
      ▪ That’s the role of the ontology repository
      ▪ Mappings need to be semantically described with plenty of provenance information
  • 43.
      ▪ Created mappings in BioPortal include:
          ▪ ID-based mappings (CUI & URI)
          ▪ Lexical mappings (LOOM)
          ▪ The portal is “responsible” for the content. Faria D. et al. Towards annotating potential incoherences in BioPortal mappings. ISWC 2014.
      ▪ Uploaded mappings are added by a user using the REST API (or the UI)
          ▪ Use of external tools
      ▪ Explicit metadata to make the distinction when using/retrieving mappings
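To give an idea of what a lexical mapping is, the sketch below maps two concepts whenever their labels are equal after a crude normalization. This is a simplified stand-in and not LOOM's actual rules; the concept identifiers, the labels and the `normalize` function are assumptions.

```python
import re


def normalize(label):
    """Lowercase a label and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", label.lower())


def lexical_mappings(source, target):
    """Return (source_id, target_id) pairs whose normalized labels are equal."""
    index = {}
    for concept_id, label in target.items():
        index.setdefault(normalize(label), []).append(concept_id)
    return [
        (s_id, t_id)
        for s_id, label in source.items()
        for t_id in index.get(normalize(label), [])
    ]


# Hypothetical concepts from two ontologies A and B.
source = {"A:1": "Spindle-cell sarcoma", "A:2": "Melanoma"}
target = {"B:9": "spindle cell sarcoma", "B:7": "Carcinoma"}
mappings = lexical_mappings(source, target)
```

Mappings produced this way are cheap to compute at repository scale, which is also why they need provenance metadata: users must be able to tell them apart from curated or uploaded mappings.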
  • 44.
      ▪ Select ontologies to align (from BioPortal or not)
      ▪ Align ontologies, for instance with YAM++
      ▪ Automatically export the results to BioPortal
      ▪ Reuse mappings for annotation, indexing and future mapping generation
  • 48.
      ▪ Data deluge
      ▪ Not necessarily connected to relevant ontologies
      ▪ Annotate data with ontology concepts
          ▪ Horizontal approach (ontologies × resources)
      ▪ C. Jonquet, P. LePendu, S. Falconer, A. Coulet, N. F. Noy, M. A. Musen & N. H. Shah. NCBO Resource Index: Ontology-Based Search and Mining of Biomedical Resources, Web Semantics. September 2011. Vol. 9 (3), pp. 316-324. Elsevier.
  • 49.
      ▪ Ontologies and data change every day
          ▪ Need to be able to handle the “deltas” only
      ▪ Work on terminology and knowledge extraction from text
          ▪ BioTex (http://tubo.lirmm.fr/biotex)
      ▪ J.A. Lossio-Ventura, C. Jonquet, M. Roche & M. Teisseire. Biomedical term extraction: overview and a new methodology, Information Retrieval, Special issue on Medical Information Retrieval. August 2015. Vol. 19 (1), pp. 59-99. Springer.
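Handling the “deltas” only means re-processing just what changed between two versions. A minimal sketch, assuming each ontology version is reduced to a mapping from concept identifier to preferred label (a hypothetical simplification, as are the identifiers and the `compute_delta` helper):

```python
def compute_delta(old, new):
    """Compare two versions given as {concept_id: label} dictionaries."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = set(old.keys() - new.keys())
    changed = {k: new[k] for k in new.keys() & old.keys() if new[k] != old[k]}
    return added, removed, changed


# Hypothetical consecutive versions of the same ontology.
version_1 = {"C1": "disease", "C2": "cancer"}
version_2 = {"C1": "disease", "C2": "malignant tumor", "C3": "melanoma"}

added, removed, changed = compute_delta(version_1, version_2)
```

Only the concepts in `added` and `changed` would need to be re-indexed, and annotations made with concepts in `removed` would need to be retired.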
  • 50.
      ▪ We built the NCBO Resource Index as a searchable database of around 50 biomedical resources, semantically indexed with annotations
      ▪ Since then, linked open data has become the approach in the semantic web
      ▪ In agronomy: build a database of resources described in RDF and annotated with ontologies (the AgroLD project)
  • 55.
      ▪ The role of the portal is to offer services for ontologies
      ▪ The focus here is on the use of ontologies for annotation purposes
          ▪ How can a repository facilitate the use of ontologies for annotation?
      ▪ Text mining challenge (disambiguation, context, negation, modality, time)
          ▪ Electronic Health Records
  • 56.
      ▪ Improve the NCBO Annotator results by ranking the annotations according to their relevance
          ▪ While not changing the service implementation
      ▪ Take into account their frequencies (as originally proposed in 2009 and removed)
      ▪ Add a term extraction measure, called C-Value, used to positively discriminate annotations generated from matches with multi-word terms
          ▪ Mostly improves annotations done with multi-word terms
      ▪ 2 new scoring methods that score and rank annotations by their importance in the given input data
      ▪ Interesting results validated against PubMed manual annotations
      ▪ S. Melzi & C. Jonquet. Scoring semantic annotations returned by the NCBO Annotator, In 7th International Semantic Web Applications and Tools for Life Sciences, SWAT4LS'14. Berlin, Germany, Dec. 2014.
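For reference, the sketch below computes the C-value term extraction measure in its classic formulation: log2 of the term length times its frequency, reduced by the average frequency of the longer candidate terms that contain it. How the scoring methods cited above combine this measure with annotation frequencies is not shown here; the candidate terms and frequencies are made up.

```python
import math


def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous word sequence inside `longer`."""
    n = len(shorter)
    return len(longer) > n and any(
        longer[i:i + n] == shorter for i in range(len(longer) - n + 1)
    )


def c_values(freq):
    """Map each candidate term (a tuple of words) to its C-value.

    `freq` maps candidate terms to their frequency in the input text.
    """
    scores = {}
    for term, f in freq.items():
        nesting = [t for t in freq if contains(t, term)]
        if nesting:
            # Discount occurrences explained by longer terms containing this one.
            f = f - sum(freq[t] for t in nesting) / len(nesting)
        scores[term] = math.log2(len(term)) * f
    return scores


# Hypothetical candidate terms and frequencies.
frequencies = {
    ("cell", "sarcoma"): 5,
    ("spindle", "cell", "sarcoma"): 3,
    ("melanoma",): 4,
}
scores = c_values(frequencies)
```

In this formulation a single-word term always scores 0 because log2(1) = 0, which illustrates why the measure mainly benefits multi-word terms.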
  • 57.
      ▪ SIFR & PratikPharma projects
      ▪ Detecting negation, temporality and experiencer
      ▪ Implementation using NegEx/ConText
          ▪ Inclusion in the French/SIFR Annotator
          ▪ Proxy architecture to plug this into the NCBO Annotator
      ▪ Very good performance results
          ▪ e.g., negation F1 between 0.8 and 0.9
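The idea behind NegEx-style negation detection can be shown with a toy rule: an annotated term is treated as negated when a trigger word appears shortly before it. This is a deliberately simplified illustration and not the NegEx/ConText implementation used in the projects; the trigger list, the window size and the function name are assumptions.

```python
# Hypothetical English trigger list; real trigger lexicons are much larger
# and distinguish several trigger types.
NEGATION_TRIGGERS = ("no", "not", "without", "denies")
WINDOW = 5  # number of tokens inspected before the term


def is_negated(sentence, term):
    """Check the first occurrence of `term` for a preceding negation trigger."""
    tokens = sentence.lower().replace(",", " ").replace(".", " ").split()
    term_tokens = term.lower().split()
    n = len(term_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == term_tokens:
            scope = tokens[max(0, i - WINDOW):i]
            return any(t in NEGATION_TRIGGERS for t in scope)
    return False  # term not found in the sentence


negated = is_negated("The patient denies chest pain.", "chest pain")
affirmed = is_negated("The patient reports chest pain.", "chest pain")
```

A flag like this can be attached to each annotation after the fact, which is what makes a proxy architecture sufficient: the underlying annotation service does not need to change.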
  • 60.
      ▪ There are 596 ontologies and 110+ ontology views in BioPortal right now
          ▪ Mostly biology and medicine
          ▪ Overlaps with other domains
          ▪ Lots of upper-level ontologies
          ▪ Lots of vocabularies
      ▪ Swoogle in 2007: “Search over 10,000 ontologies”. How many now?
  • 61.
      ▪ No repository (except the web itself) will handle them all while keeping the level of features (and curation?)
      ▪ Will each domain build its own technology?
      ▪ Sharing the technology is the best way to guarantee long-term support and future development
          ▪ Developers all around the world
          ▪ Different funders & support
      ▪ Sharing the technology is the best way to make ontology repositories interoperable
  • 62.
      ▪ UI does not really matter
          ▪ We should be able to make a new portal for another community in minutes
      ▪ Avoid duplicating ontologies
      ▪ Connect portals to one another
          ▪ Through mappings, as we did with translation mappings
          ▪ The annotator proxy feature
      ▪ Implement and discuss standards
          ▪ SKOS handling in BioPortal
          ▪ Ontology metadata description
  • 63.
      ▪ Most of our new features are developed within a proxy
          ▪ E.g., we can call either the AgroPortal, SIFR BioPortal or even the NCBO BioPortal Annotator and use the same code to score annotations
      ▪ Used this to set up an enhanced version of the NCBO Annotator
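The proxy idea can be sketched as one scoring function applied to whatever annotator sits behind it. In the sketch below the endpoint URLs, the response shape (a list of records with a `class_id` field) and the injected `fetch` callable are all assumptions; a stub replaces the network call so the example runs offline.

```python
from collections import Counter

# Assumed annotator endpoints for the three portals (not verified here).
PORTALS = {
    "ncbo": "http://data.bioontology.org/annotator",
    "sifr": "http://data.bioportal.lirmm.fr/annotator",
    "agroportal": "http://data.agroportal.lirmm.fr/annotator",
}


def score_by_frequency(annotations):
    """Rank annotated classes by how often they were matched in the text."""
    counts = Counter(a["class_id"] for a in annotations)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def annotate_and_score(portal, text, fetch):
    """Call one portal's annotator through `fetch`, then apply the shared scoring."""
    annotations = fetch(PORTALS[portal], text)
    return score_by_frequency(annotations)


def fake_fetch(url, text):
    """Offline stand-in for the HTTP call, returning a hypothetical response."""
    return [
        {"class_id": "X:melanoma"},
        {"class_id": "X:cancer"},
        {"class_id": "X:melanoma"},
    ]


ranked = annotate_and_score("sifr", "mélanome et cancer", fake_fetch)
```

Because the scoring only sees the list of annotations, switching portals is a matter of changing the `portal` argument, which is the property the proxy relies on.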
  • 64.
      ▪ A remote BioPortal UI which actually talks to the main BioPortal REST API
      ▪ Interesting for future interoperable BioPortal instances
  • 66.
      ▪ We discussed the importance of ontology repositories and the span of ontology-based services they can (and should) offer
      ▪ We reviewed some of the challenges in that domain of research, in light of our 2 projects (SIFR & AgroPortal)
      ▪ We reviewed some of the results obtained & propositions made
          ▪ Some are work in progress
  • 67. www.lirmm.fr/sifr