SeMAntic RepresenTation for
Protocols
(SMART Protocols)
Olga Ximena Giraldo Pasmin
Supervisor: Oscar Corcho, PhD
Universidad Politécnica de Madrid
Instrument Mortar and pestle
Reagent Triton X-100
Sample
DNA
Cell disruption: Grind the leaf tissue using a mortar and pestle.
Precipitation reaction: Precipitate the DNA with 0.6 mL of 2-propanol.
How to accurately document, share and retrieve meaningful information from
experimental protocols
Goal
PhD Thesis: SeMAntic RepresenTation for Experimental Protocols 2
What is an experimental protocol
•  Experimental protocols are like cooking recipes
✓  They have ingredients: reagents and sample,
✓  They have appliances: equipment,
✓  They have a list of instructions,
✓  They have a total time,
✓  They have critical steps…
Laboratory Protocols
•  A protocol is written in natural language
•  A protocol is a depiction of a sequence
of operations that includes an input and
an output. In this sense a protocol is a
type of workflow.
“The protocols should have complete
information that allows anybody to
recreate an experiment” [1]
[1] Giraldo, O., García, A., López, F., & Corcho, O. (2017). Using semantics for representing experimental protocols. Journal of biomedical semantics, 8(1), 52. doi:10.1186/s13326-017-0160-y
Poorly documented protocols
• Incubate the centrifuge tubes
in a water bath.
• Incubate the samples for 5 min
with gentle shaking.
• Rinse DNA briefly in 1-2 ml of
wash.
• Incubate at -20°C overnight.
“Mix thoroughly at room
temperature”
Protocol
Instead of this…
I obtain this!
Poor reproducibility
“An experiment is reproducible until another laboratory tries to repeat it”
- Alexander Kohn*
*Virologist who, in 1955, cofounded The Journal of Irreproducible Results with physicist Harry J. Lipkin.
Source: https://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970
Improving the reproducibility
Data repositories make data available.
Few efforts focus on representing and standardizing experimental protocols.
For reproducibility purposes, if the data must be available, so must the experimental protocol detailing the methodology followed to derive the data.
Ontobee
Open
Science
Open access
Open source
Open data
Citizen science
Repositories of protocols
Features
•  Some of them are not open
access.
•  The protocol is available mostly
in PDF and HTML.
•  Information is organized into
sections, but the content within
each section is unstructured.
•  The search is limited to author,
title, keywords or publication
date.
Main research question
How to formalize the information from
laboratory protocols as a knowledge base?
Specific research questions
•  How to design a guideline that formally represents bibliographic
(e.g. title, author, version), and rhetorical components (e.g.
purpose, materials, and procedure) for experimental protocols in
life science?
•  What is the ontological structure that allows the formal
representation of an experimental protocol as a document and as
an executable element?
•  How to facilitate the manual annotation based on common data
elements in experimental protocols?
•  How to facilitate automatic entity recognition by using semantics
and NLP techniques?
•  How to facilitate the generation of semantic documents for
experimental protocols?
Materials
•  9 instructions for authors,
•  530 experimental protocols,
•  Minimum information standards
•  Bio-ontologies
Methodology used for building our
guidelines…
A guideline for reporting experimental protocols in life sciences
We analyzed:
Instructions for authors from journals publishing protocols:
•  Identification of bibliographic and rhetorical data elements.
Experimental protocols:
•  Were published protocols following the instructions for authors?
•  What was missing?
•  What additional information was included?
•  What information is reported in unpublished protocols?
MI standards and ontologies:
•  Identification and description of data elements in protocols (e.g. whole organisms, anatomical parts, chemical compounds, instruments, primers, software, etc.)
First draft (Checklist Version 0.1):
•  Extraction of a first set of bibliographic and rhetorical elements from the resources analyzed.
Second draft (Checklist Version 0.2):
•  Enriched checklist.
Evaluation 1 (workshops): The 1st group of participants (19 domain experts from CIAT) focused on “what information is necessary and sufficient for reporting a protocol?”
Evaluation 2 (online survey): The 2nd group of 23 participants was asked to indicate whether a particular data element in checklist V. 0.2 was relevant or not.
Evaluation 3 (final meeting): Participants from the workshops and the survey. The discussion focused on “should the checklist include infrequent data elements?”
Resulting checklist: Checklist Version 1.0
Results…
Data elements for reporting protocols
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
Bibliographic data elements
Discursive data elements
Data elements for reporting protocols
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
Data elements for reporting protocols
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
Summarizing
•  Here, we described 17 data elements that can be
used to improve the reporting structure of protocols.
•  The guideline can be adapted to address the needs
of specific communities.
•  Improving reporting structures requires collective
efforts from authors, peer reviewers, editors, and
funding bodies.
•  The improvement will be incremental; as guidelines are
presented, they will be evaluated, adapted, and re-deployed.
Methodology used for building our ontology model
Development of scenarios
and competency questions
Draft ontology
Iterative building ontology
models and validation
Ontology evaluation and
evolution
Ontology features
Ontology development process
•  Methodology
•  NeOn methodology,
•  OBO Foundry
•  Ontology editor
•  Protégé versions 4.X and 5.0,
•  Visualization – OWLViz
Ontologies reused
•  BFO,
•  Relation Ontology (RO),
•  OBI Minimal metadata
Ontology terms reused
•  Chemical Entities of Biological Interest (ChEBI),
•  NCBI taxonomy,
•  The Ontology for Biomedical Investigations (OBI),
•  The BioAssay Ontology (BAO),
•  The Experimental Factor Ontology (EFO),
•  eagle-i Resource Ontology (ERO),
•  Cell Line Ontology (CLO),
•  EXACT,
•  Information Artifact Ontology (IAO)
Document module
Source: Giraldo, O., García, A., López, F., & Corcho, O. (2017). Using semantics for representing experimental protocols. Journal of biomedical semantics, 8(1), 52. doi:10.1186/s13326-017-0160-y
Workflow module
Source: Giraldo, O., García, A., López, F., & Corcho, O. (2017). Using semantics for representing experimental protocols. Journal of biomedical semantics, 8(1), 52. doi:10.1186/s13326-017-0160-y
Ontology evaluation
•  Syntax - The OntOlogy Pitfall Scanner (OOPS!),
•  Capability of SMART Protocols Ontology to answer competency questions specified
by domain experts
•  https://smartprotocols.github.io/queries/
Retrieve the protocols using mouse as a sample
PREFIX sp: <http://purl.org/net/SMARTprotocol#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ro: <http://www.obofoundry.org/ro/ro.owl#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?title ?externalUri
WHERE {
  # Title of the protocol
  ?protocol sp:hasTitle ?title_uri .
  ?title_uri rdf:value ?title .
  # Materials section and its list of specimens
  ?protocol ro:hasPart ?materials .
  ?materials a sp:MaterialsSection .
  ?materials ro:hasPart ?reagents .
  ?reagents a sp:SpecimenList .
  # Each specimen, its external identifier and its name
  ?reagents ro:hasPart ?reagent .
  ?reagent owl:sameAs ?externalUri .
  ?reagent sp:hasName ?nameUri .
  ?nameUri rdf:value "mouse" .
}
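The graph pattern in the query above can be read as a chain of joins from a protocol down to the name of one of its specimens. The following is a minimal Python sketch of that traversal over an in-memory list of triples. It is an illustration only: the node identifiers, the titles and the example.org URIs are hypothetical and do not come from the SMART Protocols dataset, and a real deployment would run the SPARQL query against a triple store instead.

```python
# Toy triple store: (subject, predicate, object). All identifiers and the
# example.org URIs are hypothetical, not taken from the SMART Protocols data.
TRIPLES = [
    ("protocol1", "sp:hasTitle", "title1"),
    ("title1", "rdf:value", "Toy protocol A"),
    ("protocol1", "ro:hasPart", "materials1"),
    ("materials1", "rdf:type", "sp:MaterialsSection"),
    ("materials1", "ro:hasPart", "specimens1"),
    ("specimens1", "rdf:type", "sp:SpecimenList"),
    ("specimens1", "ro:hasPart", "specimen1"),
    ("specimen1", "owl:sameAs", "http://example.org/taxon/mouse"),
    ("specimen1", "sp:hasName", "name1"),
    ("name1", "rdf:value", "mouse"),
    ("protocol2", "sp:hasTitle", "title2"),
    ("title2", "rdf:value", "Toy protocol B"),
    ("protocol2", "ro:hasPart", "materials2"),
    ("materials2", "rdf:type", "sp:MaterialsSection"),
    ("materials2", "ro:hasPart", "specimens2"),
    ("specimens2", "rdf:type", "sp:SpecimenList"),
    ("specimens2", "ro:hasPart", "specimen2"),
    ("specimen2", "owl:sameAs", "http://example.org/taxon/rice"),
    ("specimen2", "sp:hasName", "name2"),
    ("name2", "rdf:value", "rice"),
]


def objects(subject, predicate):
    """Return every object of the triples matching (subject, predicate, ?)."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def has_type(node, rdf_class):
    """True when the node is declared as an instance of rdf_class."""
    return rdf_class in objects(node, "rdf:type")


def protocols_using_sample(sample_name):
    """Walk protocol -> materials section -> specimen list -> specimen -> name
    and return (title, external URI) pairs, like the SELECT clause above."""
    results = []
    for protocol, predicate, title_node in TRIPLES:
        if predicate != "sp:hasTitle":
            continue
        for materials in objects(protocol, "ro:hasPart"):
            if not has_type(materials, "sp:MaterialsSection"):
                continue
            for specimen_list in objects(materials, "ro:hasPart"):
                if not has_type(specimen_list, "sp:SpecimenList"):
                    continue
                for specimen in objects(specimen_list, "ro:hasPart"):
                    names = [value
                             for name_node in objects(specimen, "sp:hasName")
                             for value in objects(name_node, "rdf:value")]
                    if sample_name not in names:
                        continue
                    for title in objects(title_node, "rdf:value"):
                        for uri in objects(specimen, "owl:sameAs"):
                            results.append((title, uri))
    return results


print(protocols_using_sample("mouse"))
```

Each nested loop corresponds to one triple pattern of the WHERE clause, which is why the query returns nothing when any link in the chain (for example the owl:sameAs identifier) is missing.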
Ontology evaluation
Retrieve reagent names, their manufacturers and homepages
PREFIX sp: <http://purl.org/net/SMARTprotocol#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX obi: <http://purl.obolibrary.org/obo/OBI_>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX CHEBI: <http://purl.obolibrary.org/obo/CHEBI_>
SELECT ?name
       (group_concat(?manufacturerName; separator=" , ") as ?manufacturers)
       (group_concat(?homepage; separator=" , ") as ?whereTobuy)
WHERE {
  # Reagents and their names
  ?reagent a CHEBI:33893 .
  ?reagent sp:hasName ?nameUri .
  ?nameUri rdf:value ?name .
  # Manufacturer of each reagent and the manufacturer's name
  ?reagent obi:0000304 ?manufacturer .
  ?manufacturer sp:hasName ?manufacturerNameUri .
  ?manufacturerNameUri rdf:value ?manufacturerName .
  # Homepage where the reagent can be found
  ?reagent foaf:homepage ?homepage .
} GROUP BY ?name
Advantages and limitations
Limitations
•  Linking reagents and instruments to external resources
•  Manufacturers don’t always offer Application Programming Interfaces (APIs) that make it
possible to resolve these entities against their websites
•  Solution – web scraping
•  Manufacturers don’t always use controlled vocabularies or common identifiers
•  Solution - PubChem
SMART Protocols ontology follows the FAIR principles
•  Findable – it is registered in BioPortal, GitHub and vocab.linkeddata.es
•  Reusable – classes and object properties are documented with annotation properties (e.g.
“preferred terms”, “alternative terms”, “definitions”, “example of usage”) from the OBI Minimal
metadata, which convey the context and suitability of each ontology term.
•  Interoperable and accessible
•  ontology language – OWL
•  License – Creative Commons Attribution 4.0
Adoption of guideline elements
represented in the SMART Protocols
ontology…
Laboratory Protocols in Bioschemas.org
•  Community initiative built on top of
schema.org
•  Aim
•  Improve data discoverability and
interoperability in Life Sciences
•  How
•  Adding Life Science types to
schema.org
•  Providing usage guidelines, examples
and tools
Source: http://www.igst.it/nettab/2018/files/2018/10/NETTAB2018_Garcia.pdf
LabProtocol specification
http://bio.sdo-bioschemas-227516.appspot.com/LabProtocol
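As an illustration of what marking up a protocol with the LabProtocol type could look like, here is a small Python sketch that builds a JSON-LD description. The property names used below ("purpose", "bioSample", "reagent", "labEquipment") are assumptions made for this example and should be checked against the LabProtocol specification; the function name and the sample values are hypothetical as well.

```python
import json


def lab_protocol_jsonld(name, purpose, samples, reagents, equipment):
    """Build a JSON-LD dictionary describing a laboratory protocol.

    The property names are assumptions for illustration; verify them
    against the LabProtocol specification before relying on them.
    """
    return {
        "@context": "https://schema.org",
        "@type": "LabProtocol",
        "name": name,
        "purpose": purpose,
        "bioSample": list(samples),
        "reagent": list(reagents),
        "labEquipment": list(equipment),
    }


doc = lab_protocol_jsonld(
    name="DNA extraction from leaf tissue (toy example)",
    purpose="Extract DNA from plant leaves",
    samples=["leaf tissue"],
    reagents=["Triton X-100", "2-propanol"],
    equipment=["mortar and pestle"],
)

# The serialized string is what would be embedded in the HTML page.
print(json.dumps(doc, indent=2))
```

Keeping the description as a plain dictionary makes it easy to serialize it to JSON-LD for embedding in a web page and to validate it separately from the page template.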
Two main steps
A. Identification of common data elements in protocols
B. Development of a Gold Standard Corpus (GSC)
Identifying common data elements
Competency questions:
•  Retrieve the protocols using mouse as a sample.
•  Retrieve reagent names, their manufacturers and homepages.
•  Retrieve the purpose of the protocol titled “Isolation of Lung Infiltrating Cell in Mice”.
•  Retrieve all the protocols that use the software “ImageJ” and the corresponding homepage.
The SIRO model
Sample/Specimen
(whole organism, anatomical
part, bodily fluids, etc.)
Instruments
(equipment, devices,
consumables, software)
Reagents
(chemical compounds,
mixtures)
Objective
(purpose)
The SIRO model
supports search,
retrieval and
classification of
experimental protocols
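To make the role of the four SIRO elements in search and retrieval concrete, here is a minimal Python sketch. The class name SiroRecord, the function find_protocols and the two toy records are hypothetical; they only illustrate how Sample, Instrument, Reagent and Objective can act as search facets.

```python
from dataclasses import dataclass, field


@dataclass
class SiroRecord:
    """Minimal description of a protocol along the four SIRO elements."""
    title: str
    objective: str = ""
    samples: set = field(default_factory=set)
    instruments: set = field(default_factory=set)
    reagents: set = field(default_factory=set)


def find_protocols(records, sample=None, instrument=None, reagent=None):
    """Return the titles of the records that match every facet provided."""
    hits = []
    for record in records:
        if sample is not None and sample not in record.samples:
            continue
        if instrument is not None and instrument not in record.instruments:
            continue
        if reagent is not None and reagent not in record.reagents:
            continue
        hits.append(record.title)
    return hits


# Toy records, invented for this example.
records = [
    SiroRecord(
        title="Toy protocol A",
        objective="Extract DNA from leaves",
        samples={"leaf tissue"},
        instruments={"mortar and pestle"},
        reagents={"Triton X-100", "2-propanol"},
    ),
    SiroRecord(
        title="Toy protocol B",
        objective="Isolate cells from lung",
        samples={"mouse"},
        instruments={"centrifuge"},
        reagents={"ethanol"},
    ),
]

print(find_protocols(records, sample="mouse"))
print(find_protocols(records, reagent="2-propanol", instrument="mortar and pestle"))
```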
Materials used for building our Gold
Standard Corpus…
Subset of protocols
Journal                              No. of protocols
Bio-Protocols                        19
BioTechniques                        4
Cold Spring Harbor Protocols         7
Current Protocols                    3
Genetic and Molecular Research       3
Journal of Biological Methods        4
Journal of Visualized Experiments    11
MethodsX                             6
Nature Protocols                     1
Total                                58
Domain expert annotators
Institution: No. of annotators
Centro de Bioinformática y Biología Computacional de Colombia: 8
Universidad del Valle, Colombia: 11
Database Center for Life Science (DBCLS), Robotic Biology Institute (RBI), Spiber, Yachie-Lab, University of Tokyo, Japan: 14
Universidad Santiago de Cali, Colombia: 1
Total: 34
BioH: Annotation tool
BioH is based on:
•  Open source software,
•  Non-profit organization.
Created to:
•  Collaborate, create, discover, share and
re-use knowledge,
•  Export annotations in a standardized
format,
•  Make annotations available as linked data
over a SPARQL endpoint.
Training documentation
•  Guidelines about what and how to annotate.
Source: Olga Giraldo. (2018, December 10). Guidelines to annotate experimental protocols (Version 1.0.0). Zenodo. http://doi.org/10.5281/zenodo.2171281
•  User guide of the BioH annotation tool.
Source: Olga Giraldo. (2019, April). Guidelines to use BioH annotation tool (Version 1.0.0). Zenodo. http://doi.org/10.5281/zenodo.2639704
Methodology for developing a Gold Standard Corpus
Training session: usage of the annotation tool, annotation guidelines.
Assignment of protocols to annotators: 3 annotators per protocol.
Annotation phase I
Review of annotations
Annotation phase II
Resolve inconsistencies and produce a final output
Tag distribution
Results
•  Overall objective was missing in 12 protocols
Results
•  High inter-annotator agreement was observed
Source: Olga Giraldo. (2018). Fleiss Kappa of protocols (Version V 1.1.0en) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.1489112
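For readers unfamiliar with the statistic named in the data set above, the following Python sketch computes Fleiss' kappa from a table of category counts. It is a generic textbook implementation, not the script used in the thesis, and the two small rating tables are invented for illustration.

```python
def fleiss_kappa(ratings):
    """Compute Fleiss' kappa.

    ratings: list of rows, one per annotated item; each row holds how many
    annotators assigned the item to each category. Every row must sum to
    the same number of annotators.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_categories = len(ratings[0])

    # Proportion of all assignments that went to each category.
    p_category = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Extent of agreement among annotators for each item.
    p_item = [
        (sum(count * count for count in row) - n_raters)
        / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    observed = sum(p_item) / n_items
    expected = sum(p * p for p in p_category)
    if expected == 1:
        # Every annotation fell into a single category.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented example: 3 annotators, categories (Sample, Instrument, Reagent).
full_agreement = [[3, 0, 0], [0, 3, 0], [0, 0, 3]]
split_votes = [[2, 1], [2, 1]]

print(fleiss_kappa(full_agreement))
print(fleiss_kappa(split_votes))
```

A value of 1 means all annotators chose the same category for every item, while values at or below 0 indicate agreement no better than chance.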
Annotated terms mapped to ontology terms
1769 concepts related to Samples, Instruments and Reagents
Resource           Entity type
NCBI Taxon         organisms
UBERON             anatomical parts
ERO                reagents
ChEBI              chemical compounds
PubChem            chemical compounds
SMART Protocols    reagents
EFO                instruments
OBI                instruments
BAO                instruments
SMART Protocols    instruments
Semantic gazetteers features
JAPE (Java Annotation Patterns Engine) rules features
Evaluating automatic annotation
Source: Olga Giraldo. (2018). Precision, Recall and F1 score (Version V1.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.1753520
Entity F1-measure (>0.70)
Sample 45%
Instrument 60%
Reagent 59%
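The measures named in the data set above can be computed by comparing the automatic annotations with the gold standard. The sketch below is a generic Python illustration that assumes exact-match scoring over (start, end, tag) spans; the spans are invented, and the thesis may have used a different matching criterion.

```python
def precision_recall_f1(gold, predicted):
    """Compare two collections of annotations using exact matching.

    Each annotation is a hashable item, here a (start, end, tag) tuple.
    Returns (precision, recall, f1); a measure is 0.0 when undefined.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented spans: (start offset, end offset, tag).
gold = {(0, 6, "Instrument"), (10, 20, "Reagent"),
        (30, 35, "Sample"), (40, 44, "Sample")}
predicted = {(0, 6, "Instrument"), (10, 20, "Reagent"), (50, 55, "Sample")}

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```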
Gazetteers failures
Words with typos
•  e.g., centrifuge vs centifuge
Words with different meaning
•  e.g., the word “cat” is a term from NCBI Taxonomy used to represent the
common name of “Felis catus”, but cat (or cat., Cat, CAT) also
represents the short word for “catalog”
Cases where multiple samples, reagents or instruments
were mentioned in the same statement.
•  e.g., “eppendorf or falcon tubes” instead of “eppendorf tubes or falcon
tubes”.
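Two of the failures listed above, typos and coordinated mentions, can be mitigated with simple preprocessing. The Python sketch below is an illustration only: it is not the GATE gazetteer or the JAPE rules used in the thesis, the three-entry gazetteer is a toy, and the 0.85 similarity cutoff is an arbitrary assumption. It does not address ambiguous words such as “cat”.

```python
import difflib
import re

# Toy gazetteer mapping a term to its entity type.
GAZETTEER = {
    "centrifuge": "Instrument",
    "eppendorf tubes": "Instrument",
    "falcon tubes": "Instrument",
}


def lookup(term, cutoff=0.85):
    """Return (canonical term, entity type), tolerating small typos.

    Falls back to the closest gazetteer entry whose similarity ratio is
    at least `cutoff`; returns None when nothing is close enough.
    """
    term = term.lower().strip()
    if term in GAZETTEER:
        return term, GAZETTEER[term]
    close = difflib.get_close_matches(term, list(GAZETTEER), n=1, cutoff=cutoff)
    if close:
        return close[0], GAZETTEER[close[0]]
    return None


def expand_coordination(phrase):
    """Expand 'X or Y head' into ['X head', 'Y head'].

    Only handles the simple three-word pattern shown on the slide.
    """
    match = re.fullmatch(r"(\w+) (?:or|and) (\w+) (\w+)", phrase.strip())
    if not match:
        return [phrase]
    first, second, head = match.groups()
    return [f"{first} {head}", f"{second} {head}"]


print(lookup("centifuge"))
print([lookup(term) for term in expand_coordination("eppendorf or falcon tubes")])
```

Fuzzy matching trades precision for recall, so the cutoff would need tuning on real data before being used in an annotation pipeline.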
Summarizing
•  58 fully annotated documents,
•  High inter-annotator agreement (0.72 - 1.0),
•  Good practices for the development of an annotated corpus of
documents were put into action:
✓ Clear annotation tasks (what and how to annotate),
✓ Low ambiguity of the data,
✓ The participation of expert annotators in life sciences,
✓ Three annotators per document,
✓ Two annotation phases.
•  This annotated corpus could serve as a gold standard
for biological natural language processing (NLP) tasks.
•  How to design a guideline that formally represents bibliographic
(e.g. title, author, version), and rhetorical components (e.g.
purpose, materials, and procedure) from experimental protocols
in life science?
•  What is the ontological structure that allows the formal
representation of an experimental protocol as a document and as
an executable element?
•  How to facilitate the annotation and finding specific protocols
based on common data elements in experimental protocols?
•  How to facilitate automatic entity recognition by using semantics
and NLP techniques?
•  How to facilitate the generation of semantic documents for
experimental protocols?
Why do we need to formalize and extract information
from lab protocols?
•  Because we want a recommendation system that
matches protocols according to my situation, for
instance
✓  samples I have,
✓  availability of equipment, reagents, lab conditions,
✓  expertise
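A recommendation of this kind can be sketched as ranking protocols by how many required materials are missing from the laboratory. The Python example below is a hypothetical illustration, not the recommendation system of the thesis; the function names, protocol records and inventory are invented, and lab conditions and expertise are left out.

```python
def missing_items(protocol, available):
    """Return the samples, instruments and reagents the lab does not have."""
    missing = set()
    for kind in ("samples", "instruments", "reagents"):
        missing |= set(protocol.get(kind, ())) - set(available.get(kind, ()))
    return missing


def rank_protocols(protocols, available):
    """Order protocol titles from fewest to most missing items."""
    scored = [(len(missing_items(p, available)), p["title"]) for p in protocols]
    return [title for _, title in sorted(scored)]


# Invented protocols and inventory.
protocols = [
    {"title": "Toy protocol B", "samples": ["leaf tissue"],
     "instruments": ["bead mill"], "reagents": ["phenol", "chloroform"]},
    {"title": "Toy protocol A", "samples": ["leaf tissue"],
     "instruments": ["mortar and pestle"],
     "reagents": ["Triton X-100", "2-propanol"]},
]
available = {
    "samples": {"leaf tissue"},
    "instruments": {"mortar and pestle"},
    "reagents": {"2-propanol"},
}

print(rank_protocols(protocols, available))
print(sorted(missing_items(protocols[1], available)))
```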
SMART Protocols Publication Platform
Available at: https://smart-protocols.firebaseapp.com/login
What is semantic publishing
•  Semantic publishing is the workflow that aggregates
detailed, well-characterized, semantically interoperable
assertions so that they are intelligible to
humans and processable by machines.
•  The assertions are created by domain experts.
•  The aggregation of assertions does not end at the time of
publication; it assumes that the assertions in the published
object will continue to evolve.
Features
reagent_name_resolver
organism_name_resolver
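The slides do not show how these resolvers are implemented. As a hedged illustration of resolving a reagent name against PubChem, the Python sketch below builds a PubChem PUG REST lookup URL for a compound name. The function name pubchem_cid_lookup_url is hypothetical, and the sketch only constructs the URL; it does not perform the request or parse the response.

```python
from urllib.parse import quote

# Base address of the PubChem PUG REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pubchem_cid_lookup_url(reagent_name):
    """Build the URL that asks PubChem for the compound identifiers (CIDs)
    matching a compound name. The name is percent-encoded so that spaces
    and slashes do not break the path."""
    encoded = quote(reagent_name, safe="")
    return f"{PUG_REST}/compound/name/{encoded}/cids/JSON"


print(pubchem_cid_lookup_url("Triton X-100"))
```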
Open
Science
Open access
Open source
Open data
Citizen science
Open educational
resources
Conclusions and outcomes
Research questions
1.  How to design a guideline that formally represents bibliographic (e.g. title, author,
version), and rhetorical components (e.g. purpose, materials, and procedure) from
experimental protocols in life science?
2.  What is the ontological structure that allows the formal representation of an
experimental protocol as a document and as an executable element?
Conclusions
•  Identification of data elements for
reporting experimental protocols,
•  Data elements are resolvable
against resources in the web of
data,
•  Ontologies are the connective
tissue,
•  The protocols are available in
RDF, JSON, HTML, etc.
•  In order to preserve
the digital continuum
(data-experimental
protocol), several
layers of semantics
are needed in a well
coordinated
metadata modeling
effort.
Conclusions and outcomes
Publications
1.  Giraldo, O., García, A., & Corcho, O. (2018), A guideline for reporting experimental protocols in
life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795.
2.  Giraldo, O., García, A., López, F., & Corcho, O. (2017). Using semantics for representing
experimental protocols. Journal of biomedical semantics, 8(1), 52. doi:10.1186/
s13326-017-0160-y.
International internships
Center: International Center for Tropical Agriculture (Centro Internacional de Agricultura Tropical, CIAT)
Location: Cali, Colombia
Dates: 20/01/2014 - 30/03/2014
Duration (weeks): 10
Topic: Analysis of laboratory protocols. Phase I: standardization of laboratory protocols using semantic technologies
Center: The Ontology Development Group in the Oregon Health & Science University
Location: Portland, Oregon, United States
Dates: 05/10/2014 - 31/01/2015
Duration (weeks): 17
Topic: Develop an ontological model to represent experimental protocols
included in Elsevier's research data management systems
Conclusions and outcomes
Research questions
3.  How to facilitate the manual annotation based on common data elements in
experimental protocols?
4.  How to facilitate automatic entity recognition by using semantics and NLP
techniques?
•  The identification of the key elements from the SIRO model helps to focus the
manual and automatic annotation of protocols,
•  The automatic annotation depends on the knowledge encoded in the ontologies.
There is the need for better and more exhaustive ontologies describing samples,
instruments and reagents.
•  Reagents and instruments should come directly from suppliers.
Conclusions
Conclusions and outcomes
Publications
1.  Giraldo, O., García, A., Ohta T., Lopez F. (2017). Annotating the SIRO model and discovering
experimental protocols. Biomedical Linked Annotation Hackathon 3 (BLAH3), Tokyo, Japan;
January 16 – 20. 2017. http://blah3.linkedannotation.org/program
2.  Giraldo, O., García, A., Figueredo J., Corcho O. (2015), Using Semantics and NLP in
Experimental Protocols. 8th Semantic Web Applications and Tools for Life Sciences
International Conference (SWAT4LS), Cambridge, UK; December 7-10, 2015. ISSN 1613-0073.
http://ceur-ws.org/Vol-1546/
Research question
5.  How to facilitate the generation of semantic documents for experimental
protocols?
Conclusions
The publication process presented here is a novel publication paradigm that delivers
semantics at birth.
•  The SMART Protocols ontology is used to guide the data capturing process,
•  PubChem and UniProt are used to enrich information about chemicals and
organisms.
•  The HTML is marked with the LabProtocol Bioschemas profile.
•  The publication paradigm presented in this thesis could also be applicable to other
types of documents.
•  This publication paradigm is aligned with open science.
Conclusions and outcomes
Awards
Best Idea award (2016). First phase in the first edition of the actúaLoop competition for
innovation in social networks.
Best poster at the International Conference on Biomedical Ontology 2015 (ICBO 2015)
Authors: Olga Giraldo, Alexander García and Oscar Corcho
Title: Using Semantics and NLP in the SMART Protocols Repository
Publication: http://icbo2015.fc.ul.pt/ICBO2015Proceedings.pdf
Venue: Lisbon, Portugal. Dates: July 27 - 30, 2015
Conclusions and outcomes
Future work
•  Improve the gazetteer and rule based system.
•  Investigate more agile methods for end user engagement
in the process of maintaining terminological resources.
•  From the SIRO model, structuring the objective remains a
challenge. The identification and semantic
characterization of the objective of a protocol should be
investigated.
•  There is the need to have a representation for workflows
executed by robots. Is one single representation for the
protocols reasonable? How to manage these workflows?
SeMAntic RepresenTation for
Protocols
(SMART Protocols)
Olga Ximena Giraldo Pasmin
ogiraldo@fi.upm.es
Supervisor: Oscar Corcho, PhD
Universidad Politécnica de Madrid
All materials are available at
Zenodo (https://zenodo.org/deposit?page=1&size=20) and
Github (https://smartprotocols.github.io/)
Supporting material
Instructions for authors
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life
sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
Corpus of protocols
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
Minimum information standards
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
Ontologies
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI
10.7717/peerj.4795
Developing the LabProtocol profile
Use cases Mapping
Specification
Adoption
Testing
http://bioschemas.org/useCases/LabProtocols/
https://smart-protocols.firebaseapp.com/login
http://bioschemas.org/types/LabProtocol/
Poor data availability
2011… 2017…
[1] Alsheikh-Ali AA, Qureshi W, Al-Mallah MH, Ioannidis JPA (2011) Public Availability of Published Research Data in High-Impact Journals. PLoS ONE 6(9):
e24357. doi:10.1371/journal.pone.0024357
[2] Vasilevsky et al. (2017), Reproducible and reusable research: are journal data sharing policies meeting the mark? PeerJ 5:e3208; DOI 10.7717/peerj.3208
Annotating with BioH
a.  Select the text by using the mouse, then…
b.  Click on the annotate button to highlight the text.
c.  Add a tag, as was indicated in step 7, then…
d.  Add a comment, and…
e.  Save.
Bibliographic data elements
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
Discursive data elements
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
Data elements describing materials
Source: Giraldo et al. (2018), A guideline for reporting
experimental protocols in life sciences. PeerJ 6:e4795; DOI
10.7717/peerj.4795
Data elements describing materials
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
Data elements describing materials
Source: Giraldo et al. (2018), A
guideline for reporting experimental
protocols in life sciences. PeerJ
6:e4795; DOI 10.7717/peerj.4795
Data elements for the procedure
Data elements for the procedure
SMART Protocols Publication Platform is FAIR
•  Identity metadata of the protocol and their digital
objects greatly improves the “findability” of the
data.
•  Access is improved by using the tool to publish
protocols to archives, e.g. ZENODO.
•  Interoperability is improved by using standard
metadata definitions and representations.
•  Reuse is improved by providing suitable information
about the methodology followed to derive the data.

Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...Syed Ahmad Chan Bukhari, PhD
 
FOOPS!: An Ontology Pitfall Scanner for the FAIR principles
FOOPS!: An Ontology Pitfall Scanner for the FAIR principlesFOOPS!: An Ontology Pitfall Scanner for the FAIR principles
FOOPS!: An Ontology Pitfall Scanner for the FAIR principlesdgarijo
 
The beauty of workflows and models
The beauty of workflows and modelsThe beauty of workflows and models
The beauty of workflows and modelsmyGrid team
 
Towards Reproducible Science: a few building blocks from my personal experience
Towards Reproducible Science: a few building blocks from my personal experienceTowards Reproducible Science: a few building blocks from my personal experience
Towards Reproducible Science: a few building blocks from my personal experienceOscar Corcho
 

Similar to Phd tesis olga giraldo 10mayo (20)

SMART Protocols in LISC-2014
SMART Protocols in LISC-2014 SMART Protocols in LISC-2014
SMART Protocols in LISC-2014
 
SMART Protocols
SMART ProtocolsSMART Protocols
SMART Protocols
 
Research Objects for improved sharing and reproducibility
Research Objects for improved sharing and reproducibilityResearch Objects for improved sharing and reproducibility
Research Objects for improved sharing and reproducibility
 
Ontology at Manchester
Ontology at ManchesterOntology at Manchester
Ontology at Manchester
 
Open innovation contributions from RSC resulting from the Open Phacts project
Open innovation contributions from RSC resulting from the Open Phacts projectOpen innovation contributions from RSC resulting from the Open Phacts project
Open innovation contributions from RSC resulting from the Open Phacts project
 
Open innovation contributions from RSC resulting from the Open Phacts project
Open innovation contributions from RSC resulting from the Open Phacts projectOpen innovation contributions from RSC resulting from the Open Phacts project
Open innovation contributions from RSC resulting from the Open Phacts project
 
FAIRer Research
FAIRer ResearchFAIRer Research
FAIRer Research
 
Credible workshop
Credible workshopCredible workshop
Credible workshop
 
On chemical structures, substances, nanomaterials and measurements
On chemical structures, substances, nanomaterials and measurementsOn chemical structures, substances, nanomaterials and measurements
On chemical structures, substances, nanomaterials and measurements
 
OOPS!: on-line ontology diagnosis by Maria Poveda
OOPS!: on-line ontology diagnosis by Maria PovedaOOPS!: on-line ontology diagnosis by Maria Poveda
OOPS!: on-line ontology diagnosis by Maria Poveda
 
Semantics for integrated laboratory analytical processes - The Allotrope Pers...
Semantics for integrated laboratory analytical processes - The Allotrope Pers...Semantics for integrated laboratory analytical processes - The Allotrope Pers...
Semantics for integrated laboratory analytical processes - The Allotrope Pers...
 
The Logical Model Designer - Binding Information Models to Terminology
The Logical Model Designer - Binding Information Models to TerminologyThe Logical Model Designer - Binding Information Models to Terminology
The Logical Model Designer - Binding Information Models to Terminology
 
Mass spectrometry resources at the EBI
Mass spectrometry resources at the EBIMass spectrometry resources at the EBI
Mass spectrometry resources at the EBI
 
Standardization of the HIPC Data Templates
Standardization of the HIPC Data TemplatesStandardization of the HIPC Data Templates
Standardization of the HIPC Data Templates
 
Standardization of the HIPC Data Templates: The Story So Far
Standardization of the HIPC Data Templates: The Story So FarStandardization of the HIPC Data Templates: The Story So Far
Standardization of the HIPC Data Templates: The Story So Far
 
EKAW 2016 - TechMiner: Extracting Technologies from Academic Publications
EKAW 2016 - TechMiner: Extracting Technologies from Academic PublicationsEKAW 2016 - TechMiner: Extracting Technologies from Academic Publications
EKAW 2016 - TechMiner: Extracting Technologies from Academic Publications
 
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
 Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ... Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
 
FOOPS!: An Ontology Pitfall Scanner for the FAIR principles
FOOPS!: An Ontology Pitfall Scanner for the FAIR principlesFOOPS!: An Ontology Pitfall Scanner for the FAIR principles
FOOPS!: An Ontology Pitfall Scanner for the FAIR principles
 
The beauty of workflows and models
The beauty of workflows and modelsThe beauty of workflows and models
The beauty of workflows and models
 
Towards Reproducible Science: a few building blocks from my personal experience
Towards Reproducible Science: a few building blocks from my personal experienceTowards Reproducible Science: a few building blocks from my personal experience
Towards Reproducible Science: a few building blocks from my personal experience
 

Recently uploaded

Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfUmakantAnnand
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfakmcokerachita
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppCeline George
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 

Recently uploaded (20)

Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.Compdf
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdf
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 

Phd tesis olga giraldo 10mayo

  • 1. SeMAntic RepresenTation for Protocols (SMART Protocols). Olga Ximena Giraldo Pasmin. Supervisor: Oscar Corcho, PhD. Universidad Politécnica de Madrid
  • 2. Goal: How to accurately document, share and retrieve meaningful information from experimental protocols. Annotated example: Instrument: mortar and pestle. Reagent: Triton X-100. Sample: DNA. Cell disruption: "Grind the leaf tissue using a mortar and pestle". Precipitation reaction: "Precipitate the DNA with 0.6 mL of 2-propanol."
  • 3. What is an experimental protocol? Experimental protocols are like cooking recipes: they have ingredients (reagents and sample), appliances (equipment), a list of instructions, a total time, and critical steps.
  • 4. Laboratory Protocols. A protocol is written in natural language. A protocol is a depiction of a sequence of operations that includes an input and an output; in this sense, a protocol is a type of workflow. “The protocols should have complete information that allows anybody to recreate an experiment” [1]. [1] Giraldo, O., García, A., López, F., & Corcho, O. (2017). Using semantics for representing experimental protocols. Journal of Biomedical Semantics, 8(1), 52. doi:10.1186/s13326-017-0160-y
  • 5. Poorly documented protocols. Protocol: "Incubate the centrifuge tubes in a water bath." "Incubate the samples for 5 min with gentle shaking." "Rinse DNA briefly in 1-2 ml of wash." "Incubate at -20°C overnight." “Mix thoroughly at room temperature”. Instead of this… I obtain this!
  • 6. Poor reproducibility. “An experiment is reproducible until another laboratory tries to repeat it” (Alexander Kohn*). *Virologist who, in 1955, cofounded The Journal of Irreproducible Results with physicist Harry J. Lipkin. Source: https://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970
  • 7. Improving the reproducibility. Data repositories make data available, but few efforts focus on representing and standardizing experimental protocols. For reproducibility purposes, if the data must be available, so must the experimental protocol detailing the methodology followed to derive the data. Slide labels: Ontobee; Open Science (open access, open source, open data, citizen science).
  • 8. Repositories of protocols. Features: some of them are not open access; protocols are available mostly in PDF and HTML; information is organized into sections, but the content within each section is unstructured; search is limited to author, title, keywords or publication date.
  • 9. Main research question: How to formalize the information from laboratory protocols as a knowledge base?
  • 10. Specific research questions
      •  How to design a guideline that formally represents bibliographic (e.g. title, author, version), and rhetorical components (e.g. purpose, materials, and procedure) from experimental protocols in life science?
      •  What is the ontological structure that allows the formal representation of an experimental protocol as a document and as an executable element?
      •  How to facilitate the manual annotation based on common data elements in experimental protocols?
      •  How to facilitate automatic entity recognition by using semantics and NLP techniques?
      •  How to facilitate the generation of semantic documents for experimental protocols?
  • 11. Specific research questions
      •  How to design a guideline that formally represents bibliographic (e.g. title, author, version), and rhetorical components (e.g. purpose, materials, and procedure) for experimental protocols in life science?
      •  What is the ontological structure that allows the formal representation of an experimental protocol as a document and as an executable element?
      •  How to facilitate the manual annotation based on common data elements in experimental protocols?
      •  How to facilitate automatic entity recognition by using semantics and NLP techniques?
      •  How to facilitate the generation of semantic documents for experimental protocols?
  • 12. Materials: 9 instructions for authors; 530 experimental protocols; minimum information standards; bio-ontologies.
  • 13. Methodology used for building our guidelines…
  • 14. A guideline for reporting experimental protocols in life sciences
      We analyzed:
      •  Instructions for authors from journals publishing protocols: identification of bibliographic and rhetorical data elements.
      •  Experimental protocols: Were published protocols following the instructions for authors? What was missing? What additional information was included? What information is reported in unpublished protocols?
      •  MI standards and ontologies: identification and description of data elements in protocols (e.g. whole organisms, anatomical parts, chemical compounds, instruments, primers, software, etc.)
      First draft: Checklist Version 0.1. Extraction of a first set of bibliographic and rhetorical elements from the resources analyzed.
      Second draft: Checklist Version 0.2. Enriched checklist.
      Evaluation 1: Workshops. The first group of participants (19 domain experts from CIAT) focused on “what information is necessary and sufficient for reporting a protocol?”
      Evaluation 2: Online survey. The second group of 23 participants was asked to indicate whether a particular data element was relevant or not in checklist V. 0.2.
      Evaluation 3: Final meeting. Participants from the workshops and the survey. The discussion focused on “should the checklist include infrequent data elements?”
      Resulting checklist: Checklist Version 1.0
  • 15. Results…
  • 16. Data elements for reporting protocols: bibliographic data elements; discursive data elements. Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
  • 17. Data elements for reporting protocols. Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
  • 18. Data elements for reporting protocols. Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
  • 19. Summarizing
      •  Here, we described 17 data elements that can be used to improve the reporting structure of protocols.
      •  The guideline can be adapted to address the needs of specific communities.
      •  Improving reporting structures requires collective efforts from authors, peer reviewers, editors, and funding bodies.
      •  The improvement will be incremental; as guidelines are presented, they will be evaluated, adapted, and re-deployed.
  • 20. Specific research questions
      •  How to design a guideline that formally represents bibliographic (e.g. title, author, version), and rhetorical components (e.g. purpose, materials, and procedure) from experimental protocols in life science?
      •  What is the ontological structure that allows the formal representation of an experimental protocol as a document and as an executable element?
      •  How to facilitate the manual annotation based on common data elements in experimental protocols?
      •  How to facilitate automatic entity recognition by using semantics and NLP techniques?
      •  How to facilitate the generation of semantic documents for experimental protocols?
  • 21. Methodology used for building our ontology model: development of scenarios and competency questions; draft ontology; iterative building of ontology models and validation; ontology evaluation and evolution.
  • 22. Ontology features
      Ontology development process
      •  Methodology: NeOn methodology, OBO Foundry
      •  Ontology editor: Protégé versions 4.X and 5.0; visualization with OWLViz
      Ontologies reused
      •  BFO
      •  Ontology of relations (RO)
      •  OBI Minimal metadata
      Ontology terms reused
      •  Chemical Entities of Biological Interest (ChEBI)
      •  NCBI taxonomy
      •  The Ontology for Biomedical Investigations (OBI)
      •  The BioAssay Ontology (BAO)
      •  The Experimental Factor Ontology (EFO)
      •  Eagle-I resource ontology (ERO)
      •  Cell Line Ontology (CLO)
      •  EXACT
      •  Information Artifact Ontology (IAO)
  • 23. Document module. Source: Giraldo, O., García, A., López, F., & Corcho, O. (2017). Using semantics for representing experimental protocols. Journal of Biomedical Semantics, 8(1), 52. doi:10.1186/s13326-017-0160-y
  • 24. Workflow module. Source: Giraldo, O., García, A., López, F., & Corcho, O. (2017). Using semantics for representing experimental protocols. Journal of Biomedical Semantics, 8(1), 52. doi:10.1186/s13326-017-0160-y
  • 25. Ontology evaluation
      •  Syntax: checked with the OntOlogy Pitfall Scanner (OOPS!)
      •  Capability of the SMART Protocols Ontology to answer competency questions specified by domain experts
      •  https://smartprotocols.github.io/queries/
      Competency question: retrieve the protocols using mouse as a sample
      PREFIX sp: <http://purl.org/net/SMARTprotocol#>
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      PREFIX ro: <http://http://www.obofoundry.org/ro/ro.owl#>
      PREFIX foaf: <http://xmlns.com/foaf/0.1/>
      PREFIX owl: <http://www.w3.org/2002/07/owl#>
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      PREFIX dbo: <http://dbpedia.org/ontology/>
      SELECT ?title ?externalUri
      WHERE {
        ?protocol sp:hasTitle ?title_uri .
        ?title_uri rdf:value ?title .
        ?protocol ro:hasPart ?materials .
        ?materials a sp:MaterialsSection .
        ?materials ro:hasPart ?reagents .
        ?reagents a sp:SpecimenList .
        ?reagents ro:hasPart ?reagent .
        ?reagent owl:sameAs ?externalUri .
        ?reagent sp:hasName ?nameUri .
        ?nameUri rdf:value "mouse" .
      }
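To make the graph pattern in the mouse query easier to follow, here is a minimal sketch in plain Python that walks the same chain of triples over a tiny hand-made graph. It is not a SPARQL engine and it does not query the real SMART Protocols data: every `ex:` identifier is invented for illustration, and only the property and class names (`sp:hasTitle`, `ro:hasPart`, `sp:SpecimenList`, and so on) are taken from the query on the slide.

```python
# Toy triples (subject, predicate, object). All "ex:" identifiers are
# hypothetical; they only mimic the shape the query expects.
TRIPLES = {
    ("ex:protocol1", "sp:hasTitle", "ex:title1"),
    ("ex:title1", "rdf:value", "Isolation of Lung Infiltrating Cell in Mice"),
    ("ex:protocol1", "ro:hasPart", "ex:materials1"),
    ("ex:materials1", "rdf:type", "sp:MaterialsSection"),
    ("ex:materials1", "ro:hasPart", "ex:specimens1"),
    ("ex:specimens1", "rdf:type", "sp:SpecimenList"),
    ("ex:specimens1", "ro:hasPart", "ex:specimen1"),
    ("ex:specimen1", "owl:sameAs", "ex:external-mouse-term"),
    ("ex:specimen1", "sp:hasName", "ex:name1"),
    ("ex:name1", "rdf:value", "mouse"),
    # A second protocol that uses a different specimen.
    ("ex:protocol2", "sp:hasTitle", "ex:title2"),
    ("ex:title2", "rdf:value", "A made-up rat protocol"),
    ("ex:protocol2", "ro:hasPart", "ex:materials2"),
    ("ex:materials2", "rdf:type", "sp:MaterialsSection"),
    ("ex:materials2", "ro:hasPart", "ex:specimens2"),
    ("ex:specimens2", "rdf:type", "sp:SpecimenList"),
    ("ex:specimens2", "ro:hasPart", "ex:specimen2"),
    ("ex:specimen2", "owl:sameAs", "ex:external-rat-term"),
    ("ex:specimen2", "sp:hasName", "ex:name2"),
    ("ex:name2", "rdf:value", "rat"),
}


def objects_of(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def subjects_of(predicate, obj):
    """All subjects s such that (s, predicate, obj) is in the graph."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def has_type(node, rdf_class):
    return rdf_class in objects_of(node, "rdf:type")


def protocols_using_sample(sample_name):
    """Follow the query's pattern backwards, from the name literal to the protocol."""
    results = set()
    for name_node in subjects_of("rdf:value", sample_name):
        for specimen in subjects_of("sp:hasName", name_node):
            for specimen_list in subjects_of("ro:hasPart", specimen):
                if not has_type(specimen_list, "sp:SpecimenList"):
                    continue
                for materials in subjects_of("ro:hasPart", specimen_list):
                    if not has_type(materials, "sp:MaterialsSection"):
                        continue
                    for protocol in subjects_of("ro:hasPart", materials):
                        for title_node in objects_of(protocol, "sp:hasTitle"):
                            for title in objects_of(title_node, "rdf:value"):
                                for uri in objects_of(specimen, "owl:sameAs"):
                                    results.add((title, uri))
    return sorted(results)


if __name__ == "__main__":
    for title, uri in protocols_using_sample("mouse"):
        print(title, uri)
```

Each nested loop corresponds to one triple pattern in the `WHERE` clause, so a protocol is returned only when the whole chain from title to specimen name is present.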
  • 26. Ontology evaluation
      Competency question: retrieve reagent names, their manufacturers and homepages
      PREFIX sp: <http://purl.org/net/SMARTprotocol#>
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      PREFIX owl: <http://www.w3.org/2002/07/owl#>
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      PREFIX obi: <http://purl.obolibrary.org/obo/OBI_>
      PREFIX foaf: <http://xmlns.com/foaf/0.1/>
      PREFIX CHEBI: <http://purl.obolibrary.org/obo/CHEBI_>
      SELECT ?name
             (group_concat(?manufacturerName; separator=" , ") as ?manufacturers)
             (group_concat(?homepage; separator=" , ") as ?whereTobuy)
      WHERE {
        ?reagent a CHEBI:33893 .
        ?reagent sp:hasName ?nameUri .
        ?nameUri rdf:value ?name .
        ?reagent obi:0000304 ?manufacturer .
        ?manufacturer sp:hasName ?manufacturerNameUri .
        ?manufacturerNameUri rdf:value ?manufacturerName .
        ?reagent foaf:homepage ?homepage .
      }
      GROUP BY ?name
  • 27. Advantages and limitations
      The SMART Protocols ontology follows the FAIR principles:
      •  Findable: it is registered in BioPortal, GitHub and vocab.linkeddata.es.
      •  Reusable: classes and object properties are documented with annotation properties (e.g. “preferred terms”, “alternative terms”, “definitions”, “example of usage”) from the OBI Minimal metadata, so that users know the context and suitability of each ontology term.
      •  Interoperable and accessible: ontology language, OWL; license, Creative Commons Attribution 4.0.
      Limitations:
      •  Linking reagents and instruments to external resources.
      •  Manufacturers don’t always offer Application Programming Interfaces (APIs) that make it possible to resolve these entities against their websites. Solution: web scraping.
      •  Manufacturers don’t always use controlled vocabularies or common identifiers. Solution: PubChem.
  • 28. Adoption of guideline elements represented in the SMART Protocols ontology…
  • 29. Laboratory Protocols in Bioschemas.org
      •  Community initiative built on top of schema.org
      •  Aim: improve data discoverability and interoperability in Life Sciences
      •  How: adding Life Science types to schema.org; providing usage guidelines, examples and tools
      Source: http://www.igst.it/nettab/2018/files/2018/10/NETTAB2018_Garcia.pdf
  • 31. Specific research questions
      •  How to design a guideline that formally represents bibliographic (e.g. title, author, version), and rhetorical components (e.g. purpose, materials, and procedure) from experimental protocols in life science?
      •  What is the ontological structure that allows the formal representation of an experimental protocol as a document and as an executable element?
      •  How to facilitate the manual annotation based on common data elements in experimental protocols?
      •  How to facilitate automatic entity recognition by using semantics and NLP techniques?
      •  How to facilitate the generation of semantic documents for experimental protocols?
  • 32. Two main steps: (A) identification of common data elements in protocols; (B) development of a Gold Standard Corpus (GSC).
  • 33. Identifying common data elements. Competency questions:
      •  Retrieve the protocols using mouse as a sample.
      •  Retrieve reagent names, their manufacturers and homepages.
      •  Retrieve the purpose of the protocol titled “Isolation of Lung Infiltrating Cell in Mice”.
      •  Retrieve all the protocols that use the software “ImageJ” and the corresponding homepage.
  • 34. The SIRO model
      •  Sample/Specimen (whole organism, anatomical part, bodily fluids, etc.)
      •  Instruments (equipment, devices, consumables, software)
      •  Reagents (chemical compounds, mixtures)
      •  Objective (purpose)
      The SIRO model supports search, retrieval and classification of experimental protocols.
  • 35. Two main steps: (A) identification of common data elements in protocols; (B) development of a Gold Standard Corpus (GSC).
  • 36. Materials used for building our Gold Standard Corpus…
  • 37. Subset of protocols (journal: number of protocols)
      Bio-Protocols: 19
      Biotechniques: 4
      Cold Spring Harbor Protocols: 7
      Current Protocols: 3
      Genetic and Molecular Research: 3
      Journal of Biological Methods: 4
      Journal of Visualized Experiments: 11
      MethodsX: 6
      Nature Protocols: 1
      Total: 58
  • 38. Domain expert annotators (institution: number of annotators)
      Centro de Bioinformática y Biología Computacional de Colombia: 8
      Universidad del Valle, Colombia: 11
      Database Center for Life Science (DBCLS), Robotic Biology Institute (RBI), Spiber, Yachie-Lab, Universidad de Tokyo, Japan: 14
      Universidad Santiago de Cali, Colombia: 1
      Total: 34
  • 39. BioH: Annotation tool 39PhD Thesis: SeMAntic RepresenTation for Experimental Protocols Bio based on •  Open source software, •  Non-profit organization. •  Collaborate, create, discover, share and re-use knowledge, •  exports annotations in a standardized format, •  annotations are available as linked data over a SPARQL endpoint. created to
  • 40. Training documentation. • Guidelines about what and how to annotate. Source: Olga Giraldo. (2018, December 10). Guidelines to annotate experimental protocols (Version 1.0.0). Zenodo. http://doi.org/10.5281/zenodo.2171281 • User guide of the BioH annotation tool. Source: Olga Giraldo. (2019, April). Guidelines to use BioH annotation tool (Version 1.0.0). Zenodo. http://doi.org/10.5281/zenodo.2639704
  • 41. Methodology for developing a Gold Standard Corpus. Training session (usage of the annotation tool, annotation guidelines) → Assignment of protocols to annotators, 3 annotators per protocol → Annotation phase I → Review of annotations → Annotation phase II → Resolve inconsistencies and produce a final output → Tag distribution.
  • 42. Results: the overall objective was missing in 12 protocols.
  • 43. Results: high inter-annotator agreement was observed. Source: Olga Giraldo. (2018). Fleiss Kappa of protocols (Version V 1.1.0en) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.1489112
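Inter-annotator agreement is reported here as Fleiss' kappa, which compares the observed agreement among a fixed number of annotators with the agreement expected by chance. The sketch below is a plain-Python version of the standard formula, not the script used to produce the Zenodo data set; the function name `fleiss_kappa` and the toy count matrix are assumptions made for illustration, with three annotators per item as in the methodology.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    annotators who assigned item i to category j. Every item must be
    rated by the same number of annotators (at least two)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Share of all assignments that went to each category.
    p_cat = [sum(row[j] for row in counts) / (n_items * n_raters)
             for j in range(n_categories)]

    # Observed agreement per item, then averaged over the items.
    p_item = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in counts]
    p_observed = sum(p_item) / n_items

    # Agreement expected by chance.
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1 - p_expected)


# Toy example: 4 text spans, 3 annotators, categories (Sample, Reagent).
toy_counts = [
    [3, 0],  # all three annotators chose Sample
    [0, 3],  # all three chose Reagent
    [2, 1],  # two chose Sample, one chose Reagent
    [1, 2],
]
print(round(fleiss_kappa(toy_counts), 3))
```

Values close to 1 indicate near-perfect agreement, and values near 0 indicate agreement no better than chance.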
  • 44. Annotated terms mapped to ontology terms: 1769 concepts related to Samples, Instruments and Reagents. Source (type of term): NCBI Taxon (organisms); UBERON (anatomical parts); ERO (reagents); ChEBI (chemical compounds); PubChem (chemical compounds); SMART Protocols (reagents); EFO (instruments); OBI (instruments); BAO (instruments); SMART Protocols (instruments).
  • 45. Semantic gazetteers features
  • 46. JAPE (Java Annotation Patterns Engine) rules features
  • 47. Evaluating automatic annotation. Entity: F1-measure (>0.70). Sample: 45%; Instrument: 60%; Reagent: 59%. Source: Olga Giraldo. (2018). Precision, Recall and F1 score (Version V1.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.1753520
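Precision, recall and F1 for an automatic annotator are usually computed by comparing its output with the gold standard annotations. The following sketch shows the standard definitions under exact matching; the function name `precision_recall_f1`, the (start, end, tag) tuple layout, and the example annotations are assumptions for illustration and do not reproduce the evaluation published on Zenodo.

```python
def precision_recall_f1(gold, predicted):
    """Compare two collections of annotations using exact matching.
    Returns (precision, recall, f1); a metric with an empty denominator is 0.0."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations as (start offset, end offset, tag).
gold = {(0, 5, "Sample"), (10, 20, "Reagent"),
        (25, 35, "Instrument"), (40, 48, "Reagent")}
predicted = {(0, 5, "Sample"), (10, 20, "Reagent"), (50, 55, "Instrument")}

p, r, f1 = precision_recall_f1(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Exact matching is the strictest choice; an evaluation that also accepts partially overlapping spans would give higher scores for the same output.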
  • 48. Gazetteer failures. • Words with typos, e.g., “centrifuge” vs “centifuge”. • Words with more than one meaning, e.g., “cat” is a term from the NCBI Taxonomy that represents the common name of “Felis catus”, but cat (or cat., Cat, CAT) is also the abbreviation of “catalog”. • Cases where multiple samples, reagents or instruments are mentioned in the same statement, e.g., “eppendorf or falcon tubes” instead of “eppendorf tubes or falcon tubes”.
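Two of these failure types, typos and coordinated mentions, can be mitigated before the gazetteer lookup. The sketch below is one possible approach using only the Python standard library, not the GATE/JAPE pipeline used in the thesis: the tiny `GAZETTEER` dictionary, the functions `lookup` and `expand_coordination`, and the 0.85 similarity cutoff are all assumptions for illustration. The ambiguity problem ("cat" as organism vs "catalog") needs context and is not addressed here.

```python
import difflib
import re

# Hypothetical gazetteer: surface form -> entity type.
GAZETTEER = {
    "centrifuge": "Instrument",
    "eppendorf tubes": "Instrument",
    "falcon tubes": "Instrument",
}


def lookup(term, cutoff=0.85):
    """Return (canonical term, type), tolerating small typos, or None."""
    key = term.lower()
    if key in GAZETTEER:
        return key, GAZETTEER[key]
    close = difflib.get_close_matches(key, list(GAZETTEER), n=1, cutoff=cutoff)
    if close:
        return close[0], GAZETTEER[close[0]]
    return None


def expand_coordination(phrase):
    """Expand 'X or Y head' into ['X head', 'Y head'].
    Only handles the simple three-word pattern; anything else is returned as is."""
    match = re.match(r"(\w+)\s+(?:or|and)\s+(\w+)\s+(\w+)$", phrase.strip(),
                     flags=re.IGNORECASE)
    if not match:
        return [phrase]
    first, second, head = match.groups()
    return [f"{first} {head}", f"{second} {head}"]


print(lookup("centifuge"))  # typo resolved to the gazetteer entry
for mention in expand_coordination("eppendorf or falcon tubes"):
    print(mention, "->", lookup(mention))
```

Fuzzy matching trades precision for recall: a lower cutoff catches more typos but also links more unrelated words, so the threshold would need tuning against the gold standard.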
  • 49. Summarizing • 58 fully annotated documents, • High inter-annotator agreement (0.72 - 1.0), • Good practices for the development of an annotated corpus of documents were put into action: ✓ Clear annotation tasks (what and how to annotate), ✓ Low ambiguity of the data, ✓ Participation of expert annotators in life sciences, ✓ Three annotators per document, ✓ Two annotation phases. • This annotated corpus could serve as a gold standard for biological natural language processing (NLP) tasks.
  • 50. • How to design a guideline that formally represents bibliographic (e.g. title, author, version) and rhetorical components (e.g. purpose, materials, and procedure) from experimental protocols in life science? • What is the ontological structure that allows the formal representation of an experimental protocol as a document and as an executable element? • How to facilitate the annotation and finding of specific protocols based on common data elements in experimental protocols? • How to facilitate automatic entity recognition by using semantics and NLP techniques? • How to facilitate the generation of semantic documents for experimental protocols?
  • 51. Why do we need to formalize and extract information from lab protocols? • Because we want a recommendation system that matches protocols to the user's situation, for instance: ✓ the samples available, ✓ the availability of equipment, reagents and lab conditions, ✓ expertise.
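Once protocols are formalized, the matching idea can be sketched very simply: score each protocol by the share of its required materials that the lab already has. The example below is only an illustration of that idea and not a description of any existing recommender; the functions `coverage` and `rank_protocols`, the protocol names, and the material lists are hypothetical.

```python
def coverage(required, available):
    """Fraction of the required materials that are available (case-insensitive)."""
    required = {item.lower() for item in required}
    if not required:
        return 1.0
    available = {item.lower() for item in available}
    return len(required & available) / len(required)


def rank_protocols(protocols, available):
    """Return (score, title) pairs, best covered first, ties broken by title."""
    scored = [(coverage(needs, available), title)
              for title, needs in protocols.items()]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical protocols and lab inventory.
protocols = {
    "Protocol A": {"mortar and pestle", "Triton X-100", "2-propanol"},
    "Protocol B": {"centrifuge", "water bath"},
}
inventory = {"Mortar and pestle", "2-propanol", "centrifuge", "water bath"}

for score, title in rank_protocols(protocols, inventory):
    print(f"{title}: {score:.2f}")
```

A fuller recommender would also weigh samples, lab conditions and expertise, as the slide lists; this sketch covers materials only.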
  • 52. SMART Protocols Publication Platform. Available at: https://smart-protocols.firebaseapp.com/login
  • 53. What is semantic publishing? • Semantic publishing is the workflow that aggregates detailed, well-characterized and semantically interoperable assertions so that they are intelligible to humans and processable by machines. • The assertions are created by domain experts. • The aggregation of assertions does not end at the time of publication; it assumes that the assertions in the published object will continue to evolve.
  • 54. Features
  • 55. Features
  • 56. Features: reagent_name_resolver, organism_name_resolver. Open Science: open access, open source, open data, citizen science, open educational resources.
  • 57. Conclusions and outcomes. Research questions: 1. How to design a guideline that formally represents bibliographic (e.g. title, author, version) and rhetorical components (e.g. purpose, materials, and procedure) from experimental protocols in life science? 2. What is the ontological structure that allows the formal representation of an experimental protocol as a document and as an executable element? • Identification of data elements for reporting experimental protocols, • Data elements are resolvable against resources in the web of data, • Ontologies are the connective tissue, • The protocols are available in RDF, JSON, HTML, etc. Conclusions: • In order to preserve the digital continuum (data-experimental protocol), several layers of semantics are needed in a well-coordinated metadata modeling effort.
  • 58. Conclusions and outcomes. Publications: 1. Giraldo, O., García, A., & Corcho, O. (2018). A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795. doi:10.7717/peerj.4795 2. Giraldo, O., García, A., López, F., & Corcho, O. (2017). Using semantics for representing experimental protocols. Journal of Biomedical Semantics, 8(1), 52. doi:10.1186/s13326-017-0160-y International internships: (1) Center: Centro Internacional de Agricultura Tropical (CIAT). Location: Cali, Colombia. Dates: 20/01/2014 - 30/03/2014. Duration (weeks): 10. Topic: analysis of laboratory protocols; phase I, standardization of laboratory protocols using semantic technologies. (2) Center: The Ontology Development Group in the Oregon Health & Science University. Location: Portland, Oregon, United States. Dates: 05/10/2014 - 31/01/2015. Duration (weeks): 17. Topic: developing an ontological model to represent experimental protocols included in Elsevier's research data management systems.
  • 59. Conclusions and outcomes. Research questions: 3. How to facilitate the manual annotation based on common data elements in experimental protocols? 4. How to facilitate automatic entity recognition by using semantics and NLP techniques? Conclusions: • The identification of the key elements from the SIRO model helps to focus the manual and automatic annotation of protocols, • The automatic annotation depends on the knowledge encoded in the ontologies. There is a need for better and more exhaustive ontologies describing samples, instruments and reagents. • Reagents and instruments should come directly from suppliers.
  • 60. Conclusions and outcomes. Publications: 1. Giraldo, O., García, A., Ohta, T., & López, F. (2017). Annotating the SIRO model and discovering experimental protocols. Biomedical Linked Annotation Hackathon 3 (BLAH3), Tokyo, Japan; January 16-20, 2017. http://blah3.linkedannotation.org/program 2. Giraldo, O., García, A., Figueredo, J., & Corcho, O. (2015). Using Semantics and NLP in Experimental Protocols. 8th Semantic Web Applications and Tools for Life Sciences International Conference (SWAT4LS), Cambridge, UK; December 7-10, 2015. ISSN 1613-0073. http://ceur-ws.org/Vol-1546/
  • 61. Conclusions and outcomes. Research question: 5. How to facilitate the generation of semantic documents for experimental protocols? Conclusions: The publication process presented here is a novel publication paradigm that delivers semantics at birth. • The SMART Protocols ontology is used to guide the data capturing process, • PubChem and UniProt are used to enrich information about chemicals and organisms. • The HTML is marked up with the LabProtocol Bioschemas profile. • The publication paradigm presented in this thesis could also be applicable to other types of documents. • This publication paradigm is aligned with open science.
  • 62. Conclusions and outcomes. Awards: Best Idea award (2016), first phase in the first edition of the actúaLoop competition for innovation in social networks. Best poster at the International Conference on Biomedical Ontology 2015 (ICBO 2015). Authors: Olga Giraldo, Alexander García and Oscar Corcho. Title: Using Semantics and NLP in the SMART Protocols Repository. Publication: http://icbo2015.fc.ul.pt/ICBO2015Proceedings.pdf Venue: Lisbon, Portugal. Date: July 27-30, 2015.
  • 63. Future work • Improve the gazetteer and rule-based system. • Investigate more agile methods for end-user engagement in the process of maintaining terminological resources. • From the SIRO model, structuring the objective remains a challenge. The identification and semantic characterization of the objective of a protocol should be investigated. • There is a need for a representation of workflows executed by robots. Is one single representation for the protocols reasonable? How to manage these workflows?
  • 64. SeMAntic RepresenTation for Protocols (SMART Protocols) Olga Ximena Giraldo Pasmin ogiraldo@fi.upm.es Supervisor: Oscar Corcho, PhD Universidad Politécnica de Madrid All materials are available at Zenodo (https://zenodo.org/deposit?page=1&size=20) and GitHub (https://smartprotocols.github.io/)
  • 65. Supporting material
  • 66. Instructions for authors. Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
  • 67. Corpus of protocols. Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
  • 68. Minimum information standards. Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
  • 69. Ontologies. Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
  • 70. Developing the LabProtocol profile. Stages: Use cases, Mapping, Specification, Adoption, Testing. Links: http://bioschemas.org/useCases/LabProtocols/ https://smart-protocols.firebaseapp.com/login http://bioschemas.org/types/LabProtocol/
  • 71. Poor data availability: 2011… [1], 2017… [2]. [1] Alsheikh-Ali AA, Qureshi W, Al-Mallah MH, Ioannidis JPA (2011) Public Availability of Published Research Data in High-Impact Journals. PLoS ONE 6(9): e24357. doi:10.1371/journal.pone.0024357 [2] Vasilevsky et al. (2017), Reproducible and reusable research: are journal data sharing policies meeting the mark? PeerJ 5:e3208; DOI 10.7717/peerj.3208
  • 72. Annotating with BioH. a. Select the text by using the mouse, then… b. click on the annotate button to highlight the text, c. add a tag, as indicated in step 7, then… d. add a comment, and… e. save.
  • 73. Bibliographic data elements. Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
  • 74. Discursive data elements. Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
  • 75. Data elements describing materials. Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
  • 76. Data elements describing materials. Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
  • 77. Data elements describing materials. Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
  • 78. Data elements for the procedure
  • 79. Data elements for the procedure
  • 80. SMART Protocols Publication Platform is FAIR • Findability: identity metadata of the protocol and its digital objects greatly improves the “findability” of the data. • Accessibility: access is improved by using the tool to publish protocols to archives, e.g. Zenodo. • Interoperability is improved by using standard metadata definitions and representations. • Reuse is improved by providing suitable information about the methodology followed to derive the data.