A few contributions of the SIFR (Semantic Indexing of French biomedical Resources project) and how we reuse NCBO technology

Senior Researcher at INRAE (MISTEA) and University of Montpellier (LIRMM)
Atelier Recherche d’Information Sémantique (Semantic Information Retrieval Workshop), RISE’15
30 June 2015, Rennes
Clement Jonquet – jonquet@lirmm.fr
How is this relevant to RISE?
 Semantic information retrieval models
 Information extraction
 Semantic annotation
 Semantic indexing
 Ontology alignment and mappings for information retrieval
 Knowledge representation languages for information retrieval
 Use of semantic distances for information retrieval
A few introductory words
Biologists have adopted ontologies
 To provide canonical representation of scientific
knowledge
 To annotate experimental data to enable
interpretation, comparison, and discovery across
databases
 To facilitate knowledge-based applications for
 Decision support
 Natural language processing
 Data integration
 But ontologies are: spread out, in different formats, of
different size, with different structures
Working with terminologies &
ontologies – a portal please!
 You’ve built an ontology, how do you let the world know?
 You need an ontology, where do you go to get it?
 How do you know whether an ontology is any good?
 How do you find resources that are relevant to the
domain of the ontology (or to specific terms)?
 How could you leverage your ontology to enable new
science?
 How could you use ontologies without managing them?
 Comparison of the
approaches
[IWBBIO'14]
Annotation challenge
 Explosion of biomedical data: diverse,
distributed, unstructured… not linked to
ontologies
 Hard for biomedical researchers to find the
data they need
 Data integration problem
 Translational discoveries are prevented
 Good examples
 GO annotations
 PubMed (biomedical literature) indexed with
MeSH headings
 Annotate data with ontology concepts
 Horizontal approach
ONTOLOGIES
RESOURCES
Good use of the semantics (1/2)
 Simple keyword-based searches miss results
Good use of the semantics (2/2)
A few words about the SIFR project
Semantic Indexing of
French Biomedical Data
Resources project
… in collaboration with…
People
 Young researchers
 Clement Jonquet
 Mathieu Roche
 Sandra Bringay
 Advisors
 Stefano A. Cerri
 Maguelonne Teisseire
 Pascal Poncelet
 Staff
 Vincent Emonet
 Students
 Juan Antonio Lossio Ventura
 Guillaume Surroca
 ~3 MSc students / year
 Close collaborators
 Philippe Lemoisson (TETIS)
 Pierre Larmande (IRD / IBC)
 Mark Musen (BMIR)
 Stefan Darmoni (CISMEF)
 Sebastien Harispe (LGI2P)
Increasing amount of biomedical data + multilingualism
 Limits of keyword-based indexing
 Biomedical community has turned to ontologies to describe their
data and turn them into structured and formalized knowledge
 Ontologies are put to use by creating semantic annotations
 Crucial need for tools & services for French biomedical data
 Biomedical data integration challenge
 Potential new scientific discoveries hidden in data
 Translational research
Use ontologies for indexing, mining
and searching (French) biomedical
data
 Obj1: Design, development and deployment
of the French Annotator.
 Obj2: Obtain new research results to exploit
and enhance ontology-based indexing
services.
 semantic distances
 ontology alignment
 ontology enrichment and disambiguation
 Obj3: Promotion and exploitation (valorization) of indexing services
A French biomedical Annotator
Use of biomedical ontology-based
annotations in end-user applications
Reuse of the NCBO
technology
BioPortal: a “one-stop shop”
for biomedical ontologies
 Web repository for biomedical ontologies
 Make ontologies accessible and usable – abstraction on
format, locations, structure, etc.
 Users can publish, download, browse, search, comment,
align ontologies and use them for annotations both online
and via a web services API.
 Online support for ontology
 Peer review
 Notes (comments and discussion)
 Versioning
 Mapping
 Search
 Resources
http://bioportal.bioontology.org
BioPortal Ontology Repository
http://data.bioontology.org
Ontology
Services
• Search
• Traverse
• Comment
• Download
Widgets
• Tree-view
• Auto-complete
• Graph-view
Annotation
Data Access
Mapping
Services
• Create
• Upload
• Download
Term recognition
Search “data”
annotated with a
given term
http://bioportal.bioontology.org
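To make the "web services API" idea concrete, here is a minimal sketch of how a client could build a request to the Annotator (term recognition) service. The host comes from the slide. The `/annotator` path and the `text`, `ontologies` and `apikey` parameters reflect the public NCBO REST API as far as we know and should be checked against its documentation. The helper name `build_annotator_url`, the placeholder key and the example text are assumptions made for illustration. No network call is made.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Host taken from the slide; a local instance would use its own host.
BASE_URL = "http://data.bioontology.org"


def build_annotator_url(text, ontologies=(), apikey="YOUR_API_KEY", base_url=BASE_URL):
    """Build a GET URL for the Annotator service (hypothetical helper, no request is sent)."""
    params = {"text": text, "apikey": apikey}
    if ontologies:
        # Restrict term recognition to the given ontology acronyms.
        params["ontologies"] = ",".join(ontologies)
    return f"{base_url}/annotator?{urlencode(params)}"


# Example text and ontology acronym are illustrative only.
url = build_annotator_url("melanoma is a malignant tumor", ontologies=["MESH"])
query = parse_qs(urlsplit(url).query)
print(url)
```

The same URL pattern would apply to a locally deployed instance by passing a different `base_url`.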
Current axes of research
SIFR axes of research (1/8):
Design of the SIFR (French)
Annotator service
 Deployment of a local instance of BioPortal at LIRMM
 16 French terminologies imported from UMLS, EHTOP & BioPortal
 UTF-8-compliant Mgrep concept recognizer (Univ. of Michigan)
 http://bioportal.lirmm.fr/annotator
 New improvements to the annotation workflow
 Automatic term extraction measures (C-value, LIDF-value, etc.)
 Scoring of annotations & representation in RDF using the AO
[SWAT4LS 2014]
Improving the Annotator(s) –
example with scoring
 Objective: improve the Annotator(s) results by ranking
the annotations according to their relevance
 Without changing the service implementation
 Take annotation frequencies into account (as originally
proposed in 2009 and later removed)
 Add a term extraction measure, called C-value, used to
favor annotations generated from matches
with multi-word terms.
 Two new scoring methods to score and rank
annotations by their importance in the given input data
 Interesting results validated against PubMed manual
annotations
 [SWAT4LS 2014]
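The frequency idea can be sketched as a post-processing step that leaves the annotation service untouched. This is only an illustration, not the scoring method of [SWAT4LS 2014]: the function name, the concept identifiers and the sample annotations are all hypothetical, and the score here is a plain occurrence count.

```python
from collections import Counter


def rank_by_frequency(annotations):
    """Rank concepts by how many times they were matched in the input text."""
    counts = Counter(concept for concept, _matched_term in annotations)
    # most_common() returns (concept, count) pairs by decreasing count.
    return counts.most_common()


# Hypothetical annotator output: (concept identifier, matched term).
annotations = [
    ("C:melanoma", "melanoma"),
    ("C:cancer", "cancer"),
    ("C:melanoma", "melanomas"),
    ("C:melanoma", "melanoma"),
    ("C:cancer", "cancer"),
    ("C:skin", "skin"),
]
ranking = rank_by_frequency(annotations)
print(ranking)
```

A real scoring method would also weight each match (for instance by a term extraction measure such as C-value) before summing.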
SIFR axes of research (2/8):
Dealing with multilingualism within
BioPortal
 Status of multilingualism in BioPortal – quite negative
 Set of proposals [MSW 2014]
 Representation of natural language property for an ontology
 Representation of the distinction between ontologies
 Representation of relation between ontologies
 Representation of multilingual translation mappings
 Reconciliation of multilingual mappings (possible PhD collaboration with
ESI)
 Currently being tested/implemented within our local instance
What does being multilingual mean?
 Interface internationalization = displaying static elements of
the user interface (e.g., menu names, help, etc.) in
different languages
 Content internationalization = displaying BioPortal content
(e.g., ontology labels, mappings, etc.) in different languages
 Multilingual = internationalization (display) + enabling
full use of the functionalities and services of BioPortal
for multilingual or monolingual ontologies
 completely and properly addressed (languages, translations,
multilingual mappings, etc.)
 rich semantic description
 Being able to parse multilingual content in ontologies (from
xml:lang to Lemon)
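As an illustration of the simplest case (language tags on labels), the sketch below groups `rdfs:label` values by their `xml:lang` attribute using only the Python standard library. The RDF/XML snippet, the `example.org` URI and the function name are made up for the example. This is not how BioPortal parses ontologies, and Lemon-style lexicons would need a richer model.

```python
import xml.etree.ElementTree as ET

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical ontology fragment with one concept labelled in two languages.
SNIPPET = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://example.org/onto#Melanoma">
    <rdfs:label xml:lang="en">melanoma</rdfs:label>
    <rdfs:label xml:lang="fr">mélanome</rdfs:label>
  </rdf:Description>
</rdf:RDF>"""


def labels_by_language(rdf_xml):
    """Return {language tag: [labels]}; untagged labels go under the empty string."""
    root = ET.fromstring(rdf_xml)
    labels = {}
    for element in root.iter(f"{{{RDFS}}}label"):
        lang = element.get(XML_LANG, "")
        labels.setdefault(lang, []).append(element.text)
    return labels


labels = labels_by_language(SNIPPET)
print(labels)
```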
Multilingual ontology
en:disease, fr:maladie
...
en:cancer, fr:cancer
en:spindle cell sarcoma, fr:sarcome à cellules fusiformes
en:melanoma, fr:mélanome
Language-specific (monolingual) ontologies
disease, ..., cancer, spindle cell sarcoma, melanoma
maladie, ..., cancer, sarcome à cellules fusiformes, mélanome
SIFR axes of research (3/8):
Automatic extraction of biomedical
terminology from text
 Context of the PhD of Juan Antonio Lossio
[LBM 2013][TALN 2014][PolTAL 2014]
 BioTex software:
http://tubo.lirmm.fr/biotex [ISWC 2014]
 Works in French, English and Spanish
 Motivations for automatic terminology
extraction
 Experiment and validate approaches for
French data
 Contribute to the ontology enrichment
process
 Acquire some NLP expertise for the
annotation workflow
Statistical methods
C-value: improves the extraction of the longest terms
Example: “soft contact” is nested in “soft contact lens”
Frantzi, K., Ananiadou, S., & Mima, H. (2000). Automatic recognition of multi-word terms: the
C-value/NC-value method. International Journal on Digital Libraries, 3(2), 115-130.
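For readers unfamiliar with the measure, here is a simplified sketch of the C-value formula from Frantzi et al. (2000). A candidate term that is never nested in a longer candidate scores log2(length) times its frequency. A nested candidate has the average frequency of the longer candidates containing it subtracted first. The frequencies below are invented, the function names are ours, and this is not the BioTex implementation. It also skips the linguistic filtering and the incremental counting of the full algorithm.

```python
import math


def is_nested(short, long):
    """True if the words of `short` occur contiguously inside the longer term `long`."""
    s, l = short.split(), long.split()
    if len(s) >= len(l):
        return False
    return any(l[i:i + len(s)] == s for i in range(len(l) - len(s) + 1))


def c_value(freqs):
    """Compute a simplified C-value for each candidate term in {term: frequency}."""
    scores = {}
    for term, freq in freqs.items():
        # Frequencies of the longer candidate terms that contain this term.
        longer = [f for other, f in freqs.items() if is_nested(term, other)]
        length_weight = math.log2(len(term.split()))
        if longer:
            scores[term] = length_weight * (freq - sum(longer) / len(longer))
        else:
            scores[term] = length_weight * freq
    return scores


# Hypothetical corpus frequencies for the slide's example terms.
freqs = {"soft contact lens": 5, "soft contact": 6, "contact lens": 8}
scores = c_value(freqs)
print(scores)
```

With these made-up counts, "soft contact" is penalized because most of its occurrences are explained by the longer term "soft contact lens".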
Include BioTex into BioPortal
 Use BioPortal dictionary for validation
 New ontology enrichment service… give a corpus of data and
see which terms are not yet covered
SIFR axes of research (4/8):
Semantic distance framework
 Automatically compute existing (Rada, Wu&Palmer, Resnik)
semantic similarity measures over BioPortal ontologies
 For a given concept, get all semantically close concepts
 Get the semantic distance between 2 concepts
 Collaboration with LGI2P to reuse Semantic Measure Library
(SML) within BioPortal
 1st prototype: http://tubo.lirmm.fr/BioMedicalSemantic/web/app_dev.php
 Goal: include SML within the BioPortal backend to bring semantic
distance services to the ontologies and annotated data
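Two of the structure-based measures named above can be sketched on a toy is-a hierarchy. Rada distance is the length of the shortest path between two concepts. Wu & Palmer similarity is 2 × depth(LCA) / (depth(a) + depth(b)), where LCA is the lowest common ancestor. The taxonomy below is an assumption loosely based on the example concepts in these slides, root depth is taken as 1, and the code is unrelated to the Semantic Measures Library API.

```python
# Toy is-a taxonomy (child -> parent); an assumption made for illustration.
PARENT = {
    "cancer": "disease",
    "melanoma": "cancer",
    "spindle cell sarcoma": "cancer",
}


def ancestors(concept):
    """Return the path from `concept` up to the root, concept first."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def depth(concept):
    """Depth counted in nodes, so the root has depth 1."""
    return len(ancestors(concept))


def lowest_common_ancestor(a, b):
    """First concept on a's path to the root that also lies on b's path."""
    on_path_b = set(ancestors(b))
    return next(c for c in ancestors(a) if c in on_path_b)


def rada_distance(a, b):
    """Number of is-a edges on the shortest path between a and b (tree case)."""
    lca = lowest_common_ancestor(a, b)
    return (depth(a) - depth(lca)) + (depth(b) - depth(lca))


def wu_palmer(a, b):
    """Wu & Palmer similarity in (0, 1]; 1 means identical concepts."""
    lca = lowest_common_ancestor(a, b)
    return 2 * depth(lca) / (depth(a) + depth(b))


print(rada_distance("melanoma", "spindle cell sarcoma"))
print(wu_palmer("melanoma", "spindle cell sarcoma"))
```

Resnik similarity is left out because it needs concept probabilities estimated from a corpus, which this toy example does not have.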
SIFR axes of research (5/8):
Informal patient data analysis
 Dealing with public patient data on blogs, forums and
tweets (Sandra Bringay)
 Detection of emotion [EGC 2014][eTELEMED 2014]
 Patient vocabulary (crabe vs. cancer)
 Project “Parlons de nous” (www.lirmm.fr/patient-mind)
 MSH-M
 A patient vocabulary currently being constructed [IC 2015]
 Hosted and available in our local instance of BioPortal
 Used for annotations, indexing, information retrieval
SIFR axes of research (6/8):
Viewpoint: a subjective knowledge
representation formalism
 Collaboration with P. Lemoisson (CIRAD) & PhD of G. Surroca
 Graph-based knowledge representation formalism
 Linked data from the semantic Web and user contributions
from the social Web.
 Unified topological approach
 First prototype for semantic search over HAL-LIRMM
publications [IC2014]
 Capture the phenomenon of Serendipity
(i.e., incidental learning) [IC 2015]
SIFR axes of research (7/8):
Pharmacogenomics use case
 PGx studies how individual gene variations cause variability in
drug responses
 Validation of pharmacogenomics state-of-the-art knowledge on
the basis of practice-based evidence
 Compare pharmacogenomics literature (in English) and electronic
health records (in French)
 EHRs from Paris (HEGP) & St Etienne hospitals
 Upcoming improvements to the Annotator(s) to handle clinical data:
negation, disambiguation, modularity, temporality
 Project submitted to ANR generic call 2015 (April 27th)
 Collaborative action led by Adrien Coulet (LORIA)
 Stanford is in the loop (Russ, Mark, Michel, Nigam)
SIFR axes of research (8/8):
Application to agronomy & plants
 Within the Institute of Computational Biology of
Montpellier
 Design of a semantic annotation workflow for plant data -
collaboration with IBC project [CO-PDI 2014]
 AgroLD: to build an RDF knowledge base to house plant data
resources: SouthGreen, Gramene, OryGeneDB… [RDA 2014]
 AgroPortal: reference ontology repository for the agronomic
domain [IN-OVIVE 2015]
 Experiment with NCBO technologies for the plant community
 4 driving agronomic use cases
Objectives of AgroPortal project
 Develop and support a reference ontology repository for the
agronomic domain
 One-stop-shop for plant/agronomic related ontologies
 Primary focus on the agronomic & plant domain
 Reusing the NCBO BioPortal technology
 Avoid re-implementing what has already been done
 Facilitate interoperability
 Reusing the scientific outcomes, experience & methods of the
biomedical domain
 Enable straightforward use of agronomic related ontologies
 Respect the requirements of the agronomic community
 Fully semantic web compliant infrastructure
AgroPortal
 50 ontologies relevant to agronomic and plant
A few conclusions
Near future
 Continue to move different prototypes into production
 Release of the French Annotator
 Find more use cases
 Collaboration with the plant/agro community
 Continue reusing and contributing to NCBO technology
Online resources
 Web page: www.lirmm.fr/sifr
 https://www.researchgate.net/projects
 Code repository: https://github.com/sifrproject
 13 developpers
 10 repositories
 Publications: http://bit.ly/194ImnR
 Direct link to HAL-LIRMM platform
with advanced search features
 Portals & services:
 http://bioportal.lirmm.fr
 http://agroportal.lirmm.fr
Questions & Remarks?
A few contributions of the SIFR (Semantic Indexing of French biomedical Resources project) and how we reuse NCBO technology

  • 1. Atelier Recherche d’Information Sémantique, RISE’15 30 juin 2015 – Rennes Clement Jonquet – jonquet@lirmm.fr A few contributions of the SIFR (Semantic Indexing of French biomedical Resources project) and how we reuse NCBO technology
  • 2. How is this relevant to RISE?  Modèles de Recherche d'Information Sémantique  Extraction d'Information  Annotation Sémantique  Indexation Sémantique  Alignement d'ontologies et correspondances pour la Recherche d'Information  Langages de Représentation des connaissances pour la Recherche d'Information  Utilisation des distances Sémantiques pour la Recherche d'Information Atelier RISE 2015 30 juin 2015, Rennes
  • 3. A few introduction words Atelier RISE 2015 30 juin 2015, Rennes
  • 4. Biologist have adopted ontologies  To provide canonical representation of scientific knowledge  To annotate experimental data to enable interpretation, comparison, and discovery across databases  To facilitate knowledge-based applications for  Decision support  Natural language-processing  Data integration  But ontologies are: spread out, in different formats, of different size, with different structures Atelier RISE 2015 30 juin 2015, Rennes
  • 5. Working with terminologies & ontologies – a portal please!  You’ve built an ontology, how do you let the world know?  You need an ontology, where do you go o get it?  How do you know whether an ontology is any good?  How do you find resources that are relevant to the domain of the ontology (or to specific terms)?  How could you leverage your ontology to enable new science?  How could you use ontologies without managing them ? Atelier RISE 2015 30 juin 2015, Rennes
  • 6. Atelier RISE 2015 30 juin 2015, Rennes  Comparison of the approaches [IWBBIO'14]
  • 7. Annotation challenge  Explosion of biomedical data: diverse, distributed, unstructured… not linked to ontologies  Hard for biomedical researchers to find the data they need  Data integration problem  Translational discoveries are prevented  Good examples  GO annotations  PubMed (biomedical literature) indexed with Mesh headings  Annotate data with ontology concepts  Horizontal approach ONTOLOGIES RESOURCES Atelier RISE 2015 30 juin 2015, Rennes
  • 8. Good use of the semantics (1/2)  Simple keywords based search miss results Atelier RISE 2015 30 juin 2015, Rennes
  • 9. Good use of the semantics (2/2) Atelier RISE 2015 30 juin 2015, Rennes
  • 10. A few words about SIFR project Atelier RISE 2015 30 juin 2015, Rennes
  • 11. Semantic Indexing of French Biomedical Data Resources project … in collaboration with…
  • 12. People  Young researchers  Clement Jonquet  Mathieu Roche  Sandra Bringay  Advisors  Stefano A. Cerri  Maguelonne Teisseire  Pascal Poncelet  Staff  Vincent Emonet  Students  Juan Antonio Lossio Ventura  Guillaume Surroca  ~3 MSc students / year  Close collaborators  Philippe Lemoisson (TETIS)  Pierre Larmande (IRD / IBC)  Mark Musen (BMIR)  Stefan Darmoni (CISMEF)  Sebastien Harispe (LGI2P) Atelier RISE 2015 30 juin 2015, Rennes
  • 13. Increasing number of biomedical data + multilingualism  Limits of keyword-based indexing  Biomedical community has turned to ontologies to describe their data and turn them into structured and formalized knowledge  Using ontologies is by means of creating semantic annotations  Crucial need for tools & services for French biomedical data  Biomedical data integration challenge  New potential sceintific discoveries hidden in data  Translational research Atelier RISE 2015 30 juin 2015, Rennes
  • 14. Use ontologies for indexing, mining and searching (French) biomedical data  Obj1: Design, development and deployment of the French Annotator.  Obj2: Obtain new research results to exploit and enhance ontology-based indexing services.  semantic distances  ontology alignment  ontology enrichment and disambiguation  Obj3: Valorization of indexing services Atelier RISE 2015 30 juin 2015, Rennes
  • 15. A French biomedical Annotator
  • 16. Use biomedical ontology-based annotations in end-user applications
  • 17. Reuse of the NCBO technology
  • 18. BioPortal: a “one-stop shop” for biomedical ontologies
      • Web repository for biomedical ontologies
      • Makes ontologies accessible and usable by abstracting over formats, locations, structure, etc.
      • Users can publish, download, browse, search, comment on and align ontologies, and use them for annotations, both online and via a web services API
      • Online support for ontology: peer review, notes (comments and discussion), versioning, mapping, search, resources
  • 20. http://data.bioontology.org
      • Ontology Services: Search, Traverse, Comment, Download
      • Widgets: Tree-view, Auto-complete, Graph-view
      • Annotation: term recognition
      • Data Access: search “data” annotated with a given term
      • Mapping Services: Create, Upload, Download
      • http://bioportal.bioontology.org
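As a rough illustration of the "via a web services API" point, the sketch below only builds a request URL for the annotation service; it does not send it. It assumes the service is exposed at the `/annotator` path of http://data.bioontology.org and accepts `text`, `ontologies` and `apikey` query parameters; the helper name `annotator_url`, the ontology acronyms and the API key value are placeholders chosen for this example.

```python
from urllib.parse import urlencode


def annotator_url(text, ontologies, apikey, base="http://data.bioontology.org"):
    """Build (but do not send) an annotation request URL.

    Assumption: the REST endpoint is `<base>/annotator` and takes the
    `text`, `ontologies` (comma-separated acronyms) and `apikey` parameters.
    """
    params = {
        "text": text,
        "ontologies": ",".join(ontologies),
        "apikey": apikey,
    }
    return f"{base}/annotator?{urlencode(params)}"


# Hypothetical usage: the acronyms and the key are placeholders.
example_url = annotator_url("melanoma treatment", ["MESH", "NCIT"], "YOUR-API-KEY")
print(example_url)
```

Pointing `base` at a local instance would, under the same assumptions, target a locally deployed portal instead of the NCBO one.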
  • 21. Current axes of research
  • 22. SIFR axes of research (1/8): Design of the SIFR (French) Annotator service
      • Deployment of a local instance of BioPortal at LIRMM
      • 16 French terminologies imported from UMLS, EHTOP & BioPortal
      • UTF-8-compliant Mgrep concept recognizer (Univ. of Michigan)
      • http://bioportal.lirmm.fr/annotator
      • New improvements to the annotation workflow: automatic term extraction measures (C-value, LIDF-value, etc.); scoring of annotations & representation in RDF using the AO [SWAT4LS 2014]
  • 23. Improving the Annotator(s): example with scoring
      • Objective: improve the Annotator(s) results by ranking annotations according to their relevance, without changing the service implementation
      • Take annotation frequencies into account (as originally proposed in 2009, then removed)
      • Add a term extraction measure, C-value, to favor annotations generated from matches with multi-word terms
      • Two new scoring methods that score and rank annotations by their importance in the given input data
      • Interesting results, validated against PubMed manual annotations [SWAT4LS 2014]
  • 24. SIFR axes of research (2/8): Dealing with multilingualism within BioPortal
      • Status of multilingualism in BioPortal: quite negative
      • Set of proposals [MSW 2014]:
          • Representation of the natural language property of an ontology
          • Representation of the distinction between ontologies
          • Representation of relations between ontologies
          • Representation of multilingual translation mappings
          • Reconciliation of multilingual mappings (possible PhD collaboration with ESI)
      • Currently being tested/implemented within our local instance
  • 25. What does being multilingual mean?
      • Interface internationalization = displaying static elements of the user interface (e.g., menu names, help, etc.) in different languages
      • Content internationalization = displaying BioPortal content (e.g., ontology labels, mappings, etc.) in different languages
      • Multilingual = internationalization (display) + enabling full use of BioPortal functionalities and services for multilingual or monolingual ontologies
          • completely and properly addressed (languages, translations, multilingual mappings, etc.)
          • rich semantic description
      • Being able to parse multilingual content in ontologies (from xml:lang to Lemon)
  • 26. Multilingual ontology
      • Multilingual ontology: en:disease / fr:maladie; ...; en:cancer / fr:cancer; en:spindle cell sarcoma / fr:sarcome à cellules fusiformes; en:melanoma / fr:mélanome
      • Language-specific (monolingual) ontology, English: disease, ..., cancer, spindle cell sarcoma, melanoma
      • Language-specific (monolingual) ontology, French: maladie, ..., cancer, sarcome à cellules fusiformes, mélanome
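The distinction on this slide can be sketched as a small data structure: one multilingual ontology holding language-tagged labels per concept, from which language-specific (monolingual) views are derived. This is only an illustration; the concept identifiers `C1`..`C4` and the function name `monolingual_view` are invented for the example and are not BioPortal identifiers or API calls.

```python
# Toy multilingual ontology: one entry per concept, labels keyed by language tag.
# The concept identifiers are hypothetical.
MULTILINGUAL = {
    "C1": {"en": "disease", "fr": "maladie"},
    "C2": {"en": "cancer", "fr": "cancer"},
    "C3": {"en": "spindle cell sarcoma", "fr": "sarcome à cellules fusiformes"},
    "C4": {"en": "melanoma", "fr": "mélanome"},
}


def monolingual_view(ontology, lang):
    """Project a multilingual ontology onto a single language.

    Concepts without a label in `lang` are left out of the view.
    """
    return {cid: labels[lang] for cid, labels in ontology.items() if lang in labels}


french = monolingual_view(MULTILINGUAL, "fr")
english = monolingual_view(MULTILINGUAL, "en")
print(french["C3"], "/", english["C3"])
```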
  • 27. SIFR axes of research (3/8): Automatic extraction of biomedical terminology from text
      • Context: PhD of Juan Antonio Lossio [LBM 2013][TALN 2014][PolTAL 2014]
      • BioTex software: http://tubo.lirmm.fr/biotex [ISWC 2014]
      • Works in French, English and Spanish
      • Motivations for automatic terminology extraction:
          • Experiment with and validate approaches for French data
          • Contribute to the ontology enrichment process
          • Acquire some NLP expertise for the annotation workflow
  • 29. Statistical methods
      • C-value: improves the extraction of the longest terms (example: “soft contact” vs. “soft contact lens”)
      • Frantzi, K., Ananiadou, S., & Mima, H. (2000). Automatic recognition of multi-word terms: the C-value/NC-value method. International Journal on Digital Libraries, 3(2), 115-130.
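To make the C-value idea concrete, here is a minimal sketch of the core formula from Frantzi et al. (2000): a candidate term is scored by log2 of its length in words times its frequency, and a term nested inside longer candidates has the average frequency of those longer candidates subtracted first. This is a simplified illustration, not the BioTex implementation; the frequencies are made up, and the incremental bookkeeping of the original algorithm is replaced by a direct scan over all candidates.

```python
import math


def contains(longer, shorter):
    """True if the word sequence `shorter` occurs inside the strictly longer `longer`."""
    n, m = len(longer), len(shorter)
    return n > m and any(longer[i:i + m] == shorter for i in range(n - m + 1))


def c_value(freqs):
    """Simplified C-value for multi-word candidate terms.

    `freqs` maps each candidate term to its corpus frequency.
    Non-nested term:  log2(|a|) * f(a)
    Nested term:      log2(|a|) * (f(a) - mean frequency of the longer terms containing a)
    """
    words = {term: tuple(term.split()) for term in freqs}
    scores = {}
    for term, seq in words.items():
        nesting = [other for other, oseq in words.items() if contains(oseq, seq)]
        f = freqs[term]
        if nesting:
            f -= sum(freqs[o] for o in nesting) / len(nesting)
        scores[term] = math.log2(len(seq)) * f
    return scores


# Made-up frequencies, reusing the example terms of the slide.
scores = c_value({"soft contact lens": 5, "soft contact": 6, "contact lens": 9})
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {score:.2f}")
```

With these toy counts, "soft contact" is penalized because five of its six occurrences are explained by the longer term "soft contact lens", which is the behavior the slide refers to.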
  • 32. Include BioTex into BioPortal
      • Use the BioPortal dictionary for validation
      • New ontology enrichment service: given a corpus of data, see which terms are not yet covered
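The enrichment idea (compare terms extracted from a corpus against the labels already known to the portal) can be sketched as a simple filter. This is an assumption-laden toy: the function name `uncovered_terms` is invented, the matching is a plain case-insensitive string comparison, and the inputs are hand-written lists rather than real BioTex output or a real BioPortal dictionary.

```python
def uncovered_terms(candidate_terms, dictionary_labels):
    """Return the candidate terms whose label is not in the dictionary.

    Matching is case-insensitive exact string equality, a deliberate
    simplification (no lemmatization, no synonym handling).
    """
    known = {label.casefold() for label in dictionary_labels}
    return [term for term in candidate_terms if term.casefold() not in known]


# Hypothetical inputs: extracted candidates vs. labels already in the dictionary.
extracted = ["Mélanome", "sarcome à cellules fusiformes", "crabe"]
dictionary = ["mélanome", "Sarcome à cellules fusiformes"]
missing = uncovered_terms(extracted, dictionary)
print(missing)
```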
  • 33. SIFR axes of research (4/8): Semantic distance framework
      • Automatically compute existing semantic similarity measures (Rada, Wu & Palmer, Resnik) over BioPortal ontologies
      • For a given concept, get all semantically close concepts
      • Get the semantic distance between two concepts
      • Collaboration with LGI2P to reuse the Semantic Measure Library (SML) within BioPortal
      • First prototype: http://tubo.lirmm.fr/BioMedicalSemantic/web/app_dev.php
      • Goal: include SML within the BioPortal backend to bring semantic distance services to the ontologies and the annotated data
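Two of the measures named above can be illustrated on a toy is-a hierarchy. The sketch below computes the Rada distance (number of is-a edges on the path between two concepts) and the Wu & Palmer similarity (twice the depth of the deepest common ancestor, divided by the sum of the two depths). It assumes a tree with a single parent per concept and counts the root at depth 1; it is not the Semantic Measure Library API, and the tiny hierarchy is invented for the example.

```python
# Toy is-a hierarchy (child -> parent); the root has no parent.
PARENT = {
    "disease": None,
    "cancer": "disease",
    "melanoma": "cancer",
    "spindle cell sarcoma": "cancer",
}


def ancestors(concept):
    """Path from the concept up to the root, both included."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Number of nodes from the root down to the concept (root has depth 1)."""
    return len(ancestors(concept))


def deepest_common_ancestor(a, b):
    """First ancestor of b (walking upward) that is also an ancestor of a."""
    anc_a = set(ancestors(a))
    for concept in ancestors(b):
        if concept in anc_a:
            return concept
    raise ValueError("concepts do not share a root")


def rada_distance(a, b):
    """Edge count of the is-a path between a and b (valid for a tree)."""
    lca = deepest_common_ancestor(a, b)
    return depth(a) + depth(b) - 2 * depth(lca)


def wu_palmer(a, b):
    """Wu & Palmer similarity: 2 * depth(lca) / (depth(a) + depth(b))."""
    lca = deepest_common_ancestor(a, b)
    return 2 * depth(lca) / (depth(a) + depth(b))


print(rada_distance("melanoma", "spindle cell sarcoma"))
print(wu_palmer("melanoma", "spindle cell sarcoma"))
```

Real biomedical ontologies are usually DAGs with multiple parents, so a production service would need shortest-path and most-informative-ancestor computations rather than this single-parent shortcut.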
  • 34. SIFR axes of research (5/8): Informal patient data analysis
      • Dealing with public patient data from blogs, forums and tweets (Sandra Bringay)
      • Detection of emotion [EGC 2014][eTELEMED 2014]
      • Patient vocabulary (crabe vs. cancer)
      • Project “Parlons de nous” (www.lirmm.fr/patient-mind)
      • MSH-M
      • A patient vocabulary is currently being constructed [IC 2015]
      • Hosted and available in our local instance of BioPortal
      • Used for annotation, indexing and information retrieval
  • 35. SIFR axes of research (6/8): Viewpoint, a subjective knowledge representation formalism
      • Collaboration with P. Lemoisson (CIRAD) & PhD of G. Surroca
      • Graph-based knowledge representation formalism
      • Combines linked data from the semantic Web and user contributions from the social Web
      • Unified topological approach
      • First prototype for semantic search over HAL-LIRMM publications [IC 2014]
      • Captures the phenomenon of serendipity (i.e., incidental learning) [IC 2015]
  • 36. SIFR axes of research (7/8): Pharmacogenomics use case
      • PGx studies how individual gene variations cause variability in drug responses
      • Validation of state-of-the-art pharmacogenomics knowledge on the basis of practice-based evidence
      • Compare the pharmacogenomics literature (in English) with electronic health records (in French)
      • EHRs from Paris (HEGP) & St Etienne hospitals
      • Upcoming improvements of the Annotator(s) to handle clinical data: negation, disambiguation, modularity, temporality
      • Project submitted to the ANR generic call 2015 (April 27th)
      • Collaborative action led by Adrien Coulet (LORIA)
      • Stanford is in the loop (Russ, Mark, Michel, Nigam)
  • 37. SIFR axes of research (8/8): Application to agronomy & plants
      • Within the Institute of Computational Biology of Montpellier
      • Design of a semantic annotation workflow for plant data, in collaboration with the IBC project [CO-PDI 2014]
      • AgroLD: build an RDF knowledge base to house plant data resources: SouthGreen, Gramene, OryGeneDB… [RDA 2014]
      • AgroPortal: reference ontology repository for the agronomic domain [IN-OVIVE 2015]
      • Experiment with NCBO technologies for the plant community
      • 4 driving agronomic use cases
  • 38. Objectives of the AgroPortal project
      • Develop and support a reference ontology repository for the agronomic domain
          • One-stop shop for plant/agronomy-related ontologies
          • Primary focus on the agronomic & plant domain
      • Reuse the NCBO BioPortal technology
          • Avoid re-implementing what has already been done
          • Facilitate interoperability
      • Reuse the scientific outcomes, experience & methods of the biomedical domain
      • Enable straightforward use of agronomy-related ontologies
          • Respect the requirements of the agronomic community
          • Fully semantic-web-compliant infrastructure
  • 39. AgroPortal
      • 50 ontologies relevant to the agronomic and plant domains
  • 40. A few conclusions
  • 41. Near future
      • Continue to move the different prototypes into production
      • Release of the French Annotator
      • Find more use cases
      • Collaboration with the plant/agro community
      • Continue reusing and contributing to NCBO technology
  • 42. Online resources
      • Web page: www.lirmm.fr/sifr
      • https://www.researchgate.net/projects
      • Code repository: https://github.com/sifrproject (13 developers, 10 repositories)
      • Publications: http://bit.ly/194ImnR (direct link to the HAL-LIRMM platform with advanced search features)
      • Portals & services: http://bioportal.lirmm.fr, http://agroportal.lirmm.fr
  • 43. Questions & remarks?