5. ▪ To provide a canonical representation of scientific knowledge
▪ To annotate experimental data to enable interpretation,
comparison, and discovery across databases
▪ To facilitate knowledge-based applications for
▪ Decision support
▪ Natural language-processing
▪ Data integration
▪ But ontologies are spread out, in different formats, of different sizes, with different structures
C. Jonquet - SIMBig 2017 - Lima, Peru
6. ▪ Ontology libraries defined as
▪ “a library system that offers various functions for managing, adapting
and standardizing groups of ontologies. It should fulfill the needs for re-
use of ontologies. In this sense, an ontology library system should be
easily accessible and offer efficient support for re-using existing relevant
ontologies and standardizing them based on upper-level ontologies and
ontology representation languages.” [Ding & Fensel, 2001]
▪ Ontology repositories defined as
▪ “a structured collection of ontologies (…) by using an Ontology Metadata
Vocabulary. References and relations between ontologies and their
modules build the semantic model of an ontology repository. Access to
resources is realized through semantically-enabled interfaces applicable
for humans and machines. Therefore a repository provides a formal query
language” [Hartmann, Palma, Gomez-Perez, 2009]
7. ▪ You’ve built an ontology, how do you let the world know?
▪ You need an ontology, where do you go to get it?
▪ How do you know whether an ontology is any good?
▪ How do you find data resources that are relevant to the domain of the ontology
(or to specific terms)?
▪ How could you leverage your ontology to enable new science?
▪ How could you use ontologies without managing them?
8. ▪ Open Ontology Repository initiative (late 2000s)
▪ 2010 ORES workshop
▪ Ontology Repositories and Editors for the Semantic Web
▪ Review of ontology repositories
▪ [Where to publish and find ontologies? D’Aquin & Noy, 2012]
▪ Numerous papers on ontology recommendation & selection
▪ News
▪ New platform in 2015: Aber-OWL
▪ OLS 3.0, AgroPortal releases
10. ▪ Web repository for biomedical ontologies
▪ Make ontologies accessible and usable: abstraction over formats, locations, structure, etc.
▪ Users can publish, download, browse, search, comment on and align ontologies, and use them for annotation, both online and via a web services API.
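As a rough illustration of access "via a web services API", the sketch below builds a term-search request for the REST endpoint at http://data.bioontology.org. The `/search` path and the `q`, `ontologies` and `apikey` parameters reflect the public BioPortal API as commonly documented; the helper name `build_search_url` and the placeholder key are assumptions made for this example.

```python
from urllib.parse import urlencode

# Base URL of the BioPortal REST API.
BIOPORTAL_API = "http://data.bioontology.org"


def build_search_url(query, api_key, ontologies=None, base=BIOPORTAL_API):
    """Build a term-search URL for a BioPortal-style REST API."""
    params = {"q": query, "apikey": api_key}
    if ontologies:
        # Restrict the search to a comma-separated list of ontology acronyms.
        params["ontologies"] = ",".join(ontologies)
    return f"{base}/search?{urlencode(params)}"


# "YOUR_API_KEY" is a placeholder; a real key comes from a portal account.
url = build_search_url("melanoma", "YOUR_API_KEY", ontologies=["MESH", "NCIT"])
print(url)
```

The resulting URL can be fetched with any HTTP client. Because `base` is a parameter, the same helper could target another portal built on the same technology.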
11. • Online support for ontology
• Peer review & notes
• Versioning
• Mapping
• Search
• Resources
• Annotation
• Open source technology
• Packaged in a “virtual
appliance”
• Set up your own
“bioportal” in a few
hours
12. http://bioportal.bioontology.org
▪ Ontology Services: Search, Traverse, Comment, Download
▪ Widgets: Tree-view, Auto-complete, Graph-view
▪ Mapping Services: Create, Upload, Download
▪ Annotation: Term recognition
▪ Data Access: Search “data” annotated with a given term
http://data.bioontology.org
13. ▪ NCI term browser (https://nciterms.nci.nih.gov)
▪ BioPortal first, then LexEVS
▪ Open Ontology Repository (OOR) Initiative (http://www.oor.net)
▪ Marine Metadata Interoperability Ontology Registry and Repository (http://mmisw.org)
▪ ESIPPortal (Earth Science Information Partners, http://semanticportal.esipfed.org)
▪ AgroPortal (http://agroportal.lirmm.fr)
▪ SIFR/French BioPortal (http://bioportal.lirmm.fr)
▪ And a few hospitals, research labs, with private data and specific needs (often in-house annotation)
16. http://www.lirmm.fr/sifr
▪ Ontology-based services to index, mine and retrieve French biomedical data
▪ In France, there is already a reference repository for medical terminologies but nothing public for annotation
▪ Crucial need for tools & services for French biomedical data
17. http://bioportal.lirmm.fr
25 monolingual ontologies/terminologies
• From the UMLS or EHTOP
• Cleaned and checked for use with the Annotator
C. Jonquet, A. Annane, K. Bouarech, V. Emonet & S. Melzi. SIFR BioPortal: French biomedical ontologies and terminologies available for semantic annotation, In 16th Journées Francophones d'Informatique Médicale, JFIM'16. Genève, Suisse, July 2016.
19. http://agroportal.lirmm.fr
▪ Develop and support a reference ontology repository
▪ Primary focus on agronomy & closely related domains (plant sciences, nutrition and biodiversity)
▪ Reusing the NCBO BioPortal technology
▪ Avoid re-implementing what has already been done; facilitate interoperability
▪ Reusing the scientific outcomes, experience & methods of the biomedical domain
▪ Enable straightforward use of agronomy-related ontologies
▪ Respect the requirements & specificities of the agronomic community
▪ Fully semantic web compliant infrastructure
▪ Enable new science
C. Jonquet, A. Toulet, E. Arnaud, S. Aubin, E. Dzalé-Yeumo, V. Emonet, J. Graybeal, M. A.
Musen, C. Pommier & P. Larmande. Reusing the NCBO BioPortal technology for agronomy to
build AgroPortal, In 7th International Conference on Biomedical Ontologies, ICBO'16, Demo
Session. Corvallis, Oregon, USA, August 2016.
21. ➢ IBC Rice Genomics & AgroLD project
➢ Data integration and knowledge management related to rice (P. Larmande)
➢ RDA Wheat Data Interoperability working group
➢ Common framework for publishing wheat data (E. Dzalé-Yeumo)
➢ LovInra: INRA Linked Open Vocabularies
➢ Vocabularies produced by INRA scientists (S. Aubin)
➢ Crop Ontology project
➢ Ontologies for describing crop germplasm & traits (E. Arnaud)
➢ GODAN global map of agri-food data standards
➢ VEST/AgroPortal MAP of standards (V. Pesce)
23. Better ontology identification & selection
(via ontology metadata)
Multilingualism
Ontology alignment (creation & use)
Catching up with relevant data:
annotations and linked data
Generalized ontology-based services
(keep quality while enabling horizontal studies)
Scale
to multiple domains and to the number/variety of ontologies
24. Better ontology identification & selection
(via ontology metadata)
25. ▪ The first role of an ontology repository is to handle ontology metadata (model, extract, edit, exploit)
▪ Everything about an ontology
▪ Intrinsic properties, e.g., name, URI, creation date
▪ Relations to other ontologies, e.g., imports, is mapped to, disagrees with
▪ Community contributions, e.g., notes, projects using it, endorsements
▪ Content-based services, e.g., SPARQL endpoint, bulk RDF download, search
▪ omv:usedOntologyEngineeringTool example
▪ What does it say about your community?
26. ▪ Pick up properties and relations from 23 existing vocabularies
▪ Ontology repositories' metadata: existing properties in ontology repositories (especially BioPortal)
▪ Standards & relevant vocabularies (e.g., DC, DCAT, SKOS, OWL, PROV, OMV, VOID, VOAF, MOD, …)
▪ Other interesting vocabularies (e.g., IDOT, PAV, SD, DOAP, …)
▪ Non-specific properties that may “return to the ontology”
▪ 346 relevant properties that could be used to describe ontologies
▪ 127 used to build a new metadata model inside AgroPortal
27. Describe ontologies with
semantic metadata
• Display “per ontology”
• Ontology-specific properties => viewable and editable within the ontology-specific page
• Everything you need to know about an ontology
• URIs used in the backend to store the information
• e.g., CC BY-ND => https://creativecommons.org/licenses/by-nd/4.0/
• “Get my metadata back” buttons
28. select
▪ Lets users search, order and select ontologies using a faceted search approach based on the metadata
▪ 4 additional ways to filter ontologies in the list
▪ 2 new options to sort this list (name, release date)
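A minimal sketch of metadata-driven faceted selection, assuming each ontology is described by a small record. The field names (`natural_language`, `syntax`, `released`) and the three sample entries are illustrative only; they are not AgroPortal's actual schema or content.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class OntologyRecord:
    # Illustrative metadata fields, not the portal's real model.
    acronym: str
    name: str
    released: date
    natural_language: str
    syntax: str


def facet_filter(records, **facets):
    """Keep records whose attributes equal every requested facet value."""
    return [r for r in records
            if all(getattr(r, key) == value for key, value in facets.items())]


catalog = [
    OntologyRecord("ONTO_A", "Example crop ontology", date(2016, 5, 1), "en", "OWL"),
    OntologyRecord("ONTO_B", "Exemple de thésaurus", date(2017, 2, 10), "fr", "SKOS"),
    OntologyRecord("ONTO_C", "Another OWL ontology", date(2015, 9, 30), "en", "OWL"),
]

# Combine two facets, then apply the two sort options (release date, name).
english_owl = facet_filter(catalog, natural_language="en", syntax="OWL")
by_release = sorted(english_owl, key=lambda r: r.released, reverse=True)
by_name = sorted(english_owl, key=lambda r: r.name.lower())
```

Each facet simply narrows the list, so filters compose freely and sorting is applied last.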
29. Display “per property”
▪ Global presentation of the properties
▪ Synthesis diagrams & listing
▪ Lets users explore the agronomical ontology landscape by automatically aggregating the metadata fields of each ontology into explicit visualizations (charts, term clouds and graphs)
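The aggregation behind a "per property" view can be sketched as a simple count of one metadata field across all ontologies. The records and tool names below are made-up sample values; only the property name `omv:usedOntologyEngineeringTool` comes from the slides.

```python
from collections import Counter

# Hypothetical metadata records; values are samples, not real portal data.
ontologies = [
    {"acronym": "ONTO_A", "omv:usedOntologyEngineeringTool": "Protégé"},
    {"acronym": "ONTO_B", "omv:usedOntologyEngineeringTool": "OBO-Edit"},
    {"acronym": "ONTO_C", "omv:usedOntologyEngineeringTool": "Protégé"},
    {"acronym": "ONTO_D"},  # property not filled in
]


def aggregate(records, prop):
    """Count the values of one metadata property across all records."""
    return Counter(r.get(prop, "(unspecified)") for r in records)


tool_usage = aggregate(ontologies, "omv:usedOntologyEngineeringTool")
for tool, count in tool_usage.most_common():
    print(f"{tool}: {count}")
```

The resulting counts are what a chart or term cloud would display, and they also show how often a property is left unspecified.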
C. Jonquet, A. Toulet, V. Emonet. Two years after: a review of vocabularies
and ontologies in AgroPortal, In International Workshop on sources and data
integration in agriculture, food and environment using ontologies, IN-OVIVE'17.
Montpellier, France, July 2017. pp. 13. EFITA.
33. ▪ Develop a new standard ontology metadata model
▪ Harmonize ontology repositories
▪ MOD project: Metadata for Ontology Description and Publication Ontology
▪ https://github.com/sifrproject/MOD-Ontology
▪ Add features within ontology editors
34. Multilingualism
35. ▪ Interface internationalization = displaying static elements of the user interface (e.g., menu
names, help, etc.) in different languages
▪ Content internationalization = displaying BioPortal content (e.g., ontology labels,
mappings, etc.) in different languages
▪ Multilingual = internationalization (display) + enabling full use of the functionalities and services of BioPortal for multilingual or monolingual ontologies
▪ completely and properly addressed (languages, translations, multilingual mappings, etc.)
▪ rich semantic description
▪ Being able to parse multilingual content in ontologies (from xml:lang to Lemon)
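At its simplest, content internationalization means choosing among language-tagged labels. The sketch below assumes the labels have already been parsed into `(text, language_tag)` pairs, as `xml:lang` attributes on labels would yield; the function name and the fallback policy are assumptions for illustration.

```python
def preferred_label(labels, lang, fallback="en"):
    """Pick a label for `lang` from (text, language_tag) pairs."""
    by_lang = {}
    for text, tag in labels:
        # Keep the first label seen for each language tag.
        by_lang.setdefault(tag, text)
    if lang in by_lang:
        return by_lang[lang]
    # Fall back to a default language; returns None if that is missing too.
    return by_lang.get(fallback)


labels = [("melanoma", "en"), ("mélanome", "fr")]
print(preferred_label(labels, "fr"))
print(preferred_label(labels, "es"))
```

Richer models such as Lemon describe lexical entries and senses rather than plain tagged strings, which this sketch does not attempt to cover.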
37. • Set of propositions
• Representation of natural language
property for an ontology
• Representation of relation between
ontologies
• Representation of multilingual translation
mappings
• Then parse multilingual ontologies in the
portal
• Then translate the user interface
C. Jonquet, V. Emonet & M. A. Musen. Roadmap for a multilingual BioPortal, In
4th Workshop on the Multilingual Semantic Web, MSW4'15. Portoroz, Slovenia,
June 2015.
[Figure: parallel English and French term hierarchies (disease / maladie, cancer / cancer, spindle cell sarcoma / sarcome à cellules fusiformes, melanoma / mélanome) linked by gold:freeTranslation and gold:literalTranslation mappings, with language-tagged entries (en:…, fr:…) feeding the Search index and the Annotator index]
38. ▪ Translation links between translated ontologies and their originals have no formal representation, and the corresponding mappings are not always formally available
▪ Reconciled more than 228K mappings between ten English ontologies hosted on NCBO BioPortal and their French translations
▪ We stored those mappings in the SIFR BioPortal, making them all available in JSON-LD and connecting the two portals
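To give an idea of what a translation mapping serialized as JSON-LD might look like, here is a hedged sketch. The class URIs and the namespace bound to the `gold` prefix are placeholders, and the layout is not the actual format returned by the SIFR BioPortal; only the `gold:freeTranslation` relation name is taken from the slides.

```python
import json


def translation_mapping(source_uri, target_uri, relation="gold:freeTranslation"):
    """Describe one English-French translation link as a JSON-LD-style dict."""
    return {
        "@context": {
            # Placeholder namespace: the real prefix URI is not given in the slides.
            "gold": "http://example.org/gold#",
        },
        "@id": source_uri,
        relation: {"@id": target_uri},
    }


mapping = translation_mapping(
    "http://example.org/en/melanoma",  # hypothetical English class URI
    "http://example.org/fr/melanome",  # hypothetical French class URI
)
document = json.dumps(mapping, indent=2, ensure_ascii=False)
print(document)
```

Keeping the relation as a parameter lets the same helper express a literal translation instead of a free one.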
A. Annane, V. Emonet, F. Azouaou & C. Jonquet. Multilingual Mapping Reconciliation between English-French
Biomedical Ontologies, In 6th International Conference on Web Intelligence, Mining and Semantics, WIMS'16. Nimes,
France, June 2016. ACM.
41. Ontology alignment (creation & use)
42. ▪ Ontologies, vocabularies, and terminologies will inevitably overlap in
coverage
▪ Mappings do not always belong to an ontology
▪ The community needs a place to store and retrieve them
▪ Scalability
▪ That’s the role of the ontology repository
▪ Mappings need to be semantically described with plenty of provenance information
43. ▪ Created mappings in BioPortal include:
▪ ID based mappings (CUI & URI)
▪ Lexical mappings (LOOM)
▪ “Responsible” for the content
Faria D. et al. Towards annotating potential incoherences in BioPortal
mappings. ISWC 2014.
▪ Uploaded mappings are added by a user using the REST API (or the UI)
▪ Use of external tools
▪ Explicit metadata to make the distinction when using/retrieving mappings
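The idea of a lexical mapping with explicit provenance can be sketched as follows. This is a deliberately simplified matcher on normalized labels, not the actual LOOM algorithm; the record keys `process` and `matched_on` are hypothetical stand-ins for the metadata that distinguishes generated mappings from uploaded ones.

```python
import re
from collections import defaultdict


def normalize(label):
    """Lowercase a label and keep only ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", label.lower())


def lexical_mappings(source, target, min_length=3):
    """Map classes of two ontologies whose normalized labels are identical.

    `source` and `target` are dicts of class URI -> list of labels.
    """
    index = defaultdict(set)
    for uri, labels in target.items():
        for label in labels:
            index[normalize(label)].add(uri)
    mappings = []
    for uri, labels in source.items():
        for label in labels:
            key = normalize(label)
            if len(key) < min_length:
                continue  # very short labels match too easily
            for other in sorted(index.get(key, ())):
                mappings.append({"source": uri, "target": other,
                                 "process": "lexical", "matched_on": label})
    return mappings


onto_a = {"a:1": ["Melanoma"], "a:2": ["Spindle cell sarcoma"]}
onto_b = {"b:7": ["melanoma", "malignant melanoma"], "b:9": ["Spindle-cell sarcoma"]}
found = lexical_mappings(onto_a, onto_b)
```

Because every record carries its generating process, a consumer can keep or discard automatically created mappings when retrieving them.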
44. Select ontologies to align (from BioPortal or not)
Align ontologies, for instance with YAM++
Automatically export the results to BioPortal
Reuse mappings for annotation, indexing and future mapping generation
47. Catching up with relevant data:
annotations and linked data
48. ▪ Data deluge
▪ Not necessarily connected to relevant ontologies
▪ Annotate data with ontology concepts
▪ Horizontal approach
[Figure labels: ONTOLOGIES, RESOURCES]
C. Jonquet, P. LePendu, S. Falconer, A. Coulet, N. F. Noy, M. A. Musen &
N. H. Shah. NCBO Resource Index: Ontology-Based Search and Mining
of Biomedical Resources, Web Semantics. September 2011. Vol. 9 (3),
pp. 316-324. Elsevier.
49. ▪ Ontologies and data change every day
▪ Need to be able to process only the “deltas”
▪ Work on terminology and knowledge
extraction from text
▪ BioTex (http://tubo.lirmm.fr/biotex)
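Handling only the "deltas" between two ontology versions can be sketched as a set comparison. The representation used here, a dict of term ID to preferred label, and the sample versions are assumptions for illustration; a real index would also track hierarchy and synonym changes.

```python
def ontology_delta(old_terms, new_terms):
    """Compare two versions given as dicts of term ID -> preferred label."""
    old_ids, new_ids = set(old_terms), set(new_terms)
    return {
        "added": sorted(new_ids - old_ids),
        "removed": sorted(old_ids - new_ids),
        # Terms present in both versions whose label changed.
        "changed": sorted(t for t in old_ids & new_ids
                          if old_terms[t] != new_terms[t]),
    }


v1 = {"T1": "cancer", "T2": "melanoma", "T3": "sarcoma"}
v2 = {"T1": "cancer", "T2": "malignant melanoma", "T4": "carcinoma"}
delta = ontology_delta(v1, v2)
print(delta)
```

Only annotations involving the added, removed or changed terms would then need to be recomputed, rather than re-indexing every resource.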
J.A. Lossio-Ventura, C. Jonquet, M. Roche & M. Teisseire. Biomedical
term extraction: overview and a new methodology, Information
Retrieval, Special issue on Medical Information Retrieval. August 2015.
Vol. 19 (1), pp. 59-99. Springer.
50. ▪ We built the NCBO Resource Index as a searchable database of around 50 biomedical resources, semantically indexed with annotations
▪ Since then, linked open data has become the prevailing approach in the semantic web
▪ In agronomy: build a database of resources described in RDF and annotated with ontologies: the AgroLD project
54. Generalized ontology-based services
(keep quality while enabling horizontal studies)
55. ▪ The role of the portal is to offer services for ontologies
▪ The focus here is on using ontologies for annotation
▪ How can a repository facilitate the use of ontologies for annotation?
▪ Text mining challenges (disambiguation, context, negation, modality, time)
▪ Electronic Health Records
56. ▪ Improve the NCBO Annotator results by ranking the annotations according to their relevance
▪ Without changing the service implementation
▪ Take into account term frequencies (as originally proposed in 2009, then removed)
▪ Add a term extraction measure, C-value, to favor annotations generated from matches with multi-word terms
▪ Mostly improves annotations made with multi-word terms
▪ Two new scoring methods to score and rank annotations by their importance in the given input data
▪ Interesting results validated against PubMed manual annotations
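For readers unfamiliar with C-value, the sketch below implements the classic formulation from the term extraction literature: log2 of the term length times its frequency, reduced by the average frequency of the longer candidate terms that contain it. It is not necessarily the exact variant used in the cited work, and the frequencies are invented.

```python
import math


def c_value(term, frequencies):
    """Classic C-value of `term` given a dict of candidate term -> frequency.

    Terms are tuples of words. A term is "nested" when it appears as a
    contiguous word sequence inside a longer candidate term.
    """
    def contains(longer, shorter):
        n = len(shorter)
        return any(longer[i:i + n] == shorter
                   for i in range(len(longer) - n + 1))

    containers = [t for t in frequencies
                  if len(t) > len(term) and contains(t, term)]
    weight = math.log2(len(term))
    if not containers:
        return weight * frequencies[term]
    # Average frequency of the longer terms in which `term` is nested.
    nested_share = sum(frequencies[t] for t in containers) / len(containers)
    return weight * (frequencies[term] - nested_share)


freqs = {
    ("cell", "sarcoma"): 5,
    ("spindle", "cell", "sarcoma"): 3,
}
for term in freqs:
    print(" ".join(term), round(c_value(term, freqs), 3))
```

With this weight, single-word terms always score zero since log2(1) = 0; some variants use log2(length + 1) so that they can be ranked as well.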
S. Melzi & C. Jonquet. Scoring semantic annotations returned by the NCBO Annotator, In 7th International
Semantic Web Applications and Tools for Life Sciences, SWAT4LS'14. Berlin, Germany, Dec. 2014.
57. ▪ Projects SIFR & PratikPharma
▪ Detecting negation, temporality and experiencer
▪ Implementation using NegEx/ConText
▪ Inclusion in the French/SIFR Annotator
▪ Proxy architecture to plug this into the NCBO Annotator
▪ Very good performance results
▪ e.g., negation F1 between 0.8 and 0.9
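The trigger-and-scope idea behind NegEx-style negation detection can be illustrated with a toy sketch. The trigger list, the fixed five-word scope and the English examples are simplifications chosen for this example; the real NegEx/ConText lexicons and rules (including French ones) are considerably richer.

```python
import re

# Tiny illustrative trigger list; real NegEx/ConText lexicons are far larger.
NEGATION_TRIGGERS = ["no", "not", "without", "denies", "no evidence of"]
SCOPE_WORDS = 5  # how many words after a trigger are considered negated


def negated_mentions(sentence, concepts):
    """Return the concepts that appear within the scope of a negation trigger."""
    words = re.findall(r"\w+", sentence.lower())
    text = " ".join(words)
    negated = set()
    for trigger in NEGATION_TRIGGERS:
        for match in re.finditer(rf"\b{re.escape(trigger)}\b", text):
            # Scope = the next few words following the trigger.
            scope = " ".join(text[match.end():].split()[:SCOPE_WORDS])
            for concept in concepts:
                if re.search(rf"\b{re.escape(concept.lower())}\b", scope):
                    negated.add(concept)
    return negated


sentence = "The patient denies fever but reports a persistent cough."
print(negated_mentions(sentence, ["fever", "cough"]))
```

An annotation falling inside a negated scope would be flagged rather than dropped, so that downstream users can decide how to treat it.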
59. Scale
to multiple domains and to the number/variety of ontologies
60. ▪ BioPortal currently hosts 596 ontologies and more than 110 ontology views
▪ Mostly biology and medicine
▪ Overlaps with other domains
▪ Lots of upper-level ontologies
▪ Lots of vocabularies
▪ Swoogle in 2007: “Search over 10,000 ontologies”. How many now?
61. ▪ No repository (except the web itself)
will handle them all, while keeping
the level of features (and curation?)
▪ Will each domain build its own technology?
▪ Sharing the technology is the best way to guarantee long-term support and future development
▪ Developers all around the world
▪ Different funders & support
▪ Sharing the technology is the best way to make ontology repositories
interoperable
62. ▪ UI does not really matter
▪ We should be able to make a new
portal for another community in minutes
▪ Avoid duplicating ontologies
▪ Connect portals to one another
▪ Through mappings as we did with translation
mappings
▪ The annotator proxy feature
▪ Implement and discuss standards
▪ SKOS handling in BioPortal
▪ Ontology metadata description
63. ▪ Most of our new features are
developed within a proxy
▪ E.g., we can call either the
AgroPortal, SIFR BioPortal or even
the NCBO BioPortal Annotator and
use the same code to score
annotations
▪ Used this to set up an enhanced
version of the NCBO Annotator
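A minimal sketch of the proxy idea: one scoring routine applied to whichever portal's Annotator is called. The two LIRMM host names, the injected `fetch` callable and the simplified response layout (`annotatedClass` and `annotations` keys) are assumptions for illustration, and the frequency-based ranking is only a stand-in for the actual scoring methods.

```python
from collections import Counter

# Annotator endpoints; the two LIRMM hosts are assumptions to be checked
# against each portal's documentation.
ANNOTATORS = {
    "ncbo": "http://data.bioontology.org/annotator",
    "agroportal": "http://data.agroportal.lirmm.fr/annotator",
    "sifr": "http://data.bioportal.lirmm.fr/annotator",
}


def score_annotations(results):
    """Rank annotated classes by how many text matches each one received."""
    counts = Counter()
    for item in results:
        counts[item["annotatedClass"]["@id"]] += len(item.get("annotations", []))
    return counts.most_common()


def annotate_via_proxy(portal, text, fetch):
    """Call one portal's Annotator through `fetch`, then apply shared scoring.

    `fetch(url, params)` is injected so the same code runs against any
    portal, or against canned data as below.
    """
    results = fetch(ANNOTATORS[portal], {"text": text})
    return score_annotations(results)


def fake_fetch(url, params):
    # Canned response imitating a simplified Annotator JSON layout.
    return [
        {"annotatedClass": {"@id": "http://example.org/onto#Melanoma"},
         "annotations": [{"from": 1, "to": 8}, {"from": 30, "to": 37}]},
        {"annotatedClass": {"@id": "http://example.org/onto#Skin"},
         "annotations": [{"from": 12, "to": 15}]},
    ]


ranking = annotate_via_proxy("sifr", "melanoma of the skin ... melanoma", fake_fetch)
print(ranking)
```

Replacing `fake_fetch` with a function that performs a real HTTP request (and supplies an API key) would route the same scoring code to any of the three portals.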
64. ▪ A remote BioPortal UI which actually talks to the main
BioPortal REST API
▪ Interesting for future interoperable BioPortal
instances
66. ▪ We discussed the importance of ontology repositories and the span of ontology-based services they can (should) offer
▪ Reviewed some of the challenges in that domain of research, in light of our two projects (SIFR & AgroPortal)
▪ Reviewed some of the results obtained & propositions made
▪ Some are work in progress