2. Goal
How to accurately document, share and retrieve meaningful information from experimental protocols
Instrument: Mortar and pestle
Reagent: Triton X-100
Sample: DNA
Cell disruption: Grind the leaf tissue using a mortar and pestle.
Precipitation reaction: Precipitate the DNA with 0.6 mL of 2-propanol.
PhD Thesis: SeMAntic RepresenTation for Experimental Protocols
3. What is an experimental protocol?
•  Experimental protocols are like cooking recipes
   ✓ They have ingredients: reagents and sample,
   ✓ They have appliances: equipment,
   ✓ They have a list of instructions,
   ✓ They have a total time,
   ✓ They have critical steps…
4. Laboratory Protocols
•  A protocol is written in natural language.
•  A protocol is a depiction of a sequence of operations that includes an input and an output. In this sense a protocol is a type of workflow.
“The protocols should have complete information that allows anybody to recreate an experiment” [1]
[1] Giraldo, O., García, A., López, F., & Corcho, O. (2017). Using semantics for representing experimental protocols. Journal of biomedical semantics, 8(1), 52. doi:10.1186/s13326-017-0160-y
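The workflow view of a protocol can be illustrated with a minimal Python sketch. It is not the thesis model: the class names `Step` and `Protocol`, the method `is_connected`, and the intermediate material names ("cell lysate", "DNA pellet") are assumptions introduced for illustration. Only the two instructions come from the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    """One operation in a protocol: it consumes an input and yields an output."""
    name: str
    instruction: str
    input: str
    output: str


@dataclass
class Protocol:
    """A protocol seen as a workflow: an ordered sequence of steps."""
    title: str
    steps: List[Step] = field(default_factory=list)

    def is_connected(self) -> bool:
        """True when the output of each step is the input of the next one."""
        return all(a.output == b.input
                   for a, b in zip(self.steps, self.steps[1:]))


# Illustrative data; the intermediate material names are assumptions.
dna_extraction = Protocol(
    title="DNA extraction from leaf tissue",
    steps=[
        Step("Cell disruption",
             "Grind the leaf tissue using a mortar and pestle.",
             input="leaf tissue", output="cell lysate"),
        Step("Precipitation reaction",
             "Precipitate the DNA with 0.6 mL of 2-propanol.",
             input="cell lysate", output="DNA pellet"),
    ],
)

print(dna_extraction.is_connected())
```

A check such as `is_connected` is one way to flag a protocol whose steps do not chain from input to output, which is a symptom of incomplete reporting.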
5. Poorly documented protocols
•  Incubate the centrifuge tubes in a water bath.
•  Incubate the samples for 5 min with gentle shaking.
•  Rinse DNA briefly in 1-2 ml of wash.
•  Incubate at -20°C overnight.
“Mix thoroughly at room temperature”
Protocol
Instead of this…
I obtain this!
6. Poor reproducibility
“An experiment is reproducible until another laboratory tries to repeat it”
Alexander Kohn*
*Virologist who, in 1955, cofounded The Journal of Irreproducible Results with physicist Harry J. Lipkin.
Source: https://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970
7. Improving the reproducibility
•  Data repositories for making data available (e.g. Ontobee)
•  Few efforts focus on representing and standardizing experimental protocols.
•  For reproducibility purposes, if the data must be available, so must the experimental protocol detailing the methodology followed to derive the data.
Open Science: Open access, Open source, Open data, Citizen science
8. Repositories of protocols
Features
•  Some of them are not open access.
•  The protocols are available mostly in PDF and HTML.
•  Information is organized into sections, and the content within each section is unstructured.
•  The search is limited to author, title, keywords or publication date.
9. Main research question
How to formalize the information from laboratory protocols as a knowledge base?
10. Specific research questions
•  How to design a guideline that formally represents bibliographic (e.g. title, author, version), and rhetorical components (e.g. purpose, materials, and procedure) from experimental protocols in life science?
•  What is the ontological structure that allows the formal representation of an experimental protocol as a document and as an executable element?
•  How to facilitate the manual annotation based on common data elements in experimental protocols?
•  How to facilitate automatic entity recognition by using semantics and NLP techniques?
•  How to facilitate the generation of semantic documents for experimental protocols?
12. Materials
•  9 instructions for authors
•  530 experimental protocols
•  Minimum information standards
•  Bio-ontologies
13. Methodology used for building our guidelines…
14. A guideline for reporting experimental protocols in life sciences
We analyzed
Instructions for authors from journals publishing protocols:
•  Identification of bibliographic and rhetorical data elements.
Experimental protocols:
•  Were published protocols following the instructions for authors?
•  What was missing?
•  What additional information was included?
•  What information is reported in unpublished protocols?
Minimum information (MI) standards and ontologies:
•  Identification and description of data elements in protocols (e.g. whole organisms, anatomical parts, chemical compounds, instruments, primers, software, etc.)
First draft: Checklist Version 0.1
•  Extraction of a first set of bibliographic and rhetorical elements from the resources analyzed.
Second draft: Checklist Version 0.2
•  Enriched checklist.
Evaluation 1
•  Workshops: The 1st group of participants (19 domain experts from CIAT) focused on “what information is necessary and sufficient for reporting a protocol?”
Evaluation 2
•  Online survey: The 2nd group of 23 participants. They were asked to indicate whether a particular data element in the checklist V. 0.2 was relevant or not.
Evaluation 3
•  Final meeting: Participants from workshops and survey. The discussion focused on “should the checklist include infrequent data elements?”
Resulting checklist: Checklist Version 1.0
16. Data elements for reporting protocols
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
Bibliographic data elements
Discursive data elements
17. Data elements for reporting protocols
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
18. Data elements for reporting protocols
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
19. Summarizing
•  Here, we described 17 data elements that can be used to improve the reporting structure of protocols.
•  The guideline can be adapted to address the needs of specific communities.
•  Improving reporting structures requires collective efforts from authors, peer reviewers, editors, and funding bodies.
•  The improvement will be incremental; as guidelines are presented, they will be evaluated, adapted, and re-deployed.
20. Specific research questions
•  How to design a guideline that formally represents bibliographic
(e.g. title, author, version), and rhetorical components (e.g.
purpose, materials, and procedure) from experimental protocols
in life science?
•  What is the ontological structure that allows the formal
representation of an experimental protocol as a document and as
an executable element?
•  How to facilitate the manual annotation based on common data
elements in experimental protocols?
•  How to facilitate automatic entity recognition by using semantics
and NLP techniques?
•  How to facilitate the generation of semantic documents for
experimental protocols?
21. Methodology used for building our ontology model
•  Development of scenarios and competency questions
•  Draft ontology
•  Iterative building of ontology models and validation
•  Ontology evaluation and evolution
23. Document module
Source: Giraldo, O., García, A., López, F., & Corcho, O. (2017). Using semantics for representing experimental protocols. Journal of biomedical semantics, 8(1), 52. doi:10.1186/s13326-017-0160-y
24. Workflow module
Source: Giraldo, O., García, A., López, F., & Corcho, O. (2017). Using semantics for representing experimental protocols. Journal of biomedical semantics, 8(1), 52. doi:10.1186/s13326-017-0160-y
25. Ontology evaluation
•  Syntax: The OntOlogy Pitfall Scanner (OOPS!)
•  Capability of the SMART Protocols Ontology to answer competency questions specified by domain experts
•  https://smartprotocols.github.io/queries/
# Retrieve the protocols using mouse as a sample
PREFIX sp: <http://purl.org/net/SMARTprotocol#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ro: <http://www.obofoundry.org/ro/ro.owl#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?title ?externalUri
WHERE {
  # Title of the protocol
  ?protocol sp:hasTitle ?title_uri .
  ?title_uri rdf:value ?title .
  # Materials section of the protocol and its specimen list
  ?protocol ro:hasPart ?materials .
  ?materials a sp:MaterialsSection .
  ?materials ro:hasPart ?reagents .
  ?reagents a sp:SpecimenList .
  # Items of the specimen list, with their external identifier
  ?reagents ro:hasPart ?reagent .
  ?reagent owl:sameAs ?externalUri .
  # Keep only the items named "mouse"
  ?reagent sp:hasName ?nameUri .
  ?nameUri rdf:value "mouse" .
}
26. Ontology evaluation
# Retrieve reagent names, their manufacturers and homepages
PREFIX sp: <http://purl.org/net/SMARTprotocol#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX obi: <http://purl.obolibrary.org/obo/OBI_>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX CHEBI: <http://purl.obolibrary.org/obo/CHEBI_>
SELECT ?name
       (group_concat(?manufacturerName; separator=" , ") as ?manufacturers)
       (group_concat(?homepage; separator=" , ") as ?whereTobuy)
WHERE {
  # Reagents (CHEBI:33893) and their names
  ?reagent a CHEBI:33893 .
  ?reagent sp:hasName ?nameUri .
  ?nameUri rdf:value ?name .
  # Manufacturer of each reagent (obi:0000304) and its name
  ?reagent obi:0000304 ?manufacturer .
  ?manufacturer sp:hasName ?manufacturerNameUri .
  ?manufacturerNameUri rdf:value ?manufacturerName .
  # Homepage where the reagent can be found
  ?reagent foaf:homepage ?homepage .
} GROUP BY ?name
27. Advantages and limitations
Limitations
•  Linking reagents and instruments to external resources
   •  Manufacturers don’t always offer Application Programming Interfaces (APIs) that make it possible to resolve these entities against their websites
      •  Solution: web scraping
   •  Manufacturers don’t always use controlled vocabularies or common identifiers
      •  Solution: PubChem
SMART Protocols ontology follows the FAIR principles
•  Findable: it is registered in BioPortal, GitHub and vocab.linkeddata.es
•  Reusable: classes and object properties are documented with annotation properties (e.g. “preferred terms”, “alternative terms”, “definitions”, “example of usage”) from the OBI Minimal metadata, so that users know the context and suitability of each ontology term.
•  Interoperable and accessible
   •  Ontology language: OWL
   •  License: Creative Commons Attribution 4.0
28. Adoption of guideline elements represented in the SMART Protocols ontology…
29. Laboratory Protocols in Bioschemas.org
•  Community initiative built on top of schema.org
•  Aim
   •  Improve data discoverability and interoperability in Life Sciences
•  How
   •  Adding Life Science types to schema.org
   •  Providing usage guidelines, examples and tools
Source: http://www.igst.it/nettab/2018/files/2018/10/NETTAB2018_Garcia.pdf
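As a rough illustration of what adding a Life Science type to schema.org markup can look like, the Python sketch below builds a JSON-LD description of a protocol and wraps it in the script element commonly used to embed JSON-LD in HTML. The function name `lab_protocol_markup` is hypothetical, and the property names other than `name` (`protocolPurpose`, `bioSample`, `reagent`, `labEquipment`) are assumptions that should be checked against the current LabProtocol profile before use.

```python
import json


def lab_protocol_markup(name, purpose, samples, reagents, equipment):
    """Build a JSON-LD description of a protocol, wrapped for embedding in HTML."""
    description = {
        "@context": "https://schema.org",
        "@type": "LabProtocol",
        "name": name,
        # Assumed property names; verify them against the LabProtocol profile.
        "protocolPurpose": purpose,
        "bioSample": samples,
        "reagent": reagents,
        "labEquipment": equipment,
    }
    payload = json.dumps(description, indent=2)
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


# Illustrative values only.
print(lab_protocol_markup(
    name="DNA extraction from leaf tissue",
    purpose="Extract genomic DNA from leaf tissue.",
    samples=["leaf tissue"],
    reagents=["Triton X-100", "2-propanol"],
    equipment=["mortar and pestle"],
))
```

Keeping the description as a plain dictionary until the last step makes it easy to validate or extend the markup before it is written into a page.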
31. Specific research questions
•  How to design a guideline that formally represents bibliographic
(e.g. title, author, version), and rhetorical components (e.g.
purpose, materials, and procedure) from experimental protocols
in life science?
•  What is the ontological structure that allows the formal
representation of an experimental protocol as a document and as
an executable element?
•  How to facilitate the manual annotation based on common data
elements in experimental protocols?
•  How to facilitate automatic entity recognition by using semantics
and NLP techniques?
•  How to facilitate the generation of semantic documents for
experimental protocols?
32. Two main steps
A. Identification of common data elements in protocols
B. Development of a Gold Standard Corpus (GSC)
33. Identifying common data elements
Competency questions:
•  Retrieve the protocols using mouse as a sample.
•  Retrieve reagent names, their manufacturers and homepages.
•  Retrieve the purpose of the protocol titled “Isolation of Lung Infiltrating Cell in Mice”.
•  Retrieve all the protocols that use the software “ImageJ” and the corresponding homepage.
•  Sample/Specimen (whole organism, anatomical part, bodily fluids, etc.)
•  Instruments (equipment, devices, consumables, software)
•  Reagents (chemical compounds, mixtures)
•  Objective (purpose)
The SIRO model supports search, retrieval and classification of experimental protocols
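How a SIRO record could support search and retrieval can be sketched in a few lines of Python. This is an illustration, not the implementation used in the thesis: the class `SiroRecord`, the function `search`, and all record contents are hypothetical, except the protocol title and the software name "ImageJ", which appear in the competency questions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class SiroRecord:
    """SIRO view of one protocol: Sample, Instrument, Reagent, Objective."""
    title: str
    samples: FrozenSet[str]
    instruments: FrozenSet[str]
    reagents: FrozenSet[str]
    objective: str


def search(records: Iterable[SiroRecord],
           sample: Optional[str] = None,
           instrument: Optional[str] = None,
           reagent: Optional[str] = None) -> List[str]:
    """Return the titles of records matching every criterion that was given."""
    def has(values: FrozenSet[str], wanted: Optional[str]) -> bool:
        # A criterion left as None does not restrict the search.
        return wanted is None or wanted.lower() in {v.lower() for v in values}

    return [r.title for r in records
            if has(r.samples, sample)
            and has(r.instruments, instrument)
            and has(r.reagents, reagent)]


# Hypothetical records used only to exercise the search.
records = [
    SiroRecord(
        title="Isolation of Lung Infiltrating Cell in Mice",
        samples=frozenset({"mouse"}),
        instruments=frozenset({"centrifuge"}),
        reagents=frozenset({"collagenase"}),
        objective="Isolate infiltrating cells from mouse lung tissue.",
    ),
    SiroRecord(
        title="Cell counting in micrographs",
        samples=frozenset({"cell culture"}),
        instruments=frozenset({"microscope", "ImageJ"}),
        reagents=frozenset({"trypan blue"}),
        objective="Count cells in microscopy images.",
    ),
]

print(search(records, sample="mouse"))
print(search(records, instrument="imagej"))
```

Matching is case-insensitive and exact on whole names; a real system would resolve names against ontology identifiers rather than compare strings.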
35. Two main steps
A. Identification of common data elements in protocols
B. Development of a Gold Standard Corpus (GSC)
36. Materials used for building our Gold Standard Corpus…
37. Subset of protocols
Journal                              No. of protocols
Bio-Protocols                        19
Biotechniques                        4
Cold Spring Harbor Protocols         7
Current Protocols                    3
Genetic and Molecular Research       3
Journal of Biological Methods        4
Journal of Visualized Experiments    11
MethodsX                             6
Nature Protocols                     1
Total                                58
38. Domain expert annotators
Institution: No. of annotators
•  Centro de Bioinformática y Biología Computacional de Colombia: 8
•  Universidad del Valle, Colombia: 11
•  Database Center for Life Science (DBCLS), Robotic Biology Institute (RBI), Spiber, Yachie-Lab, University of Tokyo, Japan: 14
•  Universidad Santiago de Cali, Colombia: 1
•  Total: 34
39. BioH: Annotation tool
based on •  Open source software,
•  Non-profit organization.
•  Collaborate, create, discover, share and
re-use knowledge,
•  exports annotations in a standardized
format,
•  annotations are available as linked data
over a SPARQL endpoint.
created
to
40. Training documentation
•  Guidelines about what and how to annotate.
   Source: Olga Giraldo. (2018, December 10). Guidelines to annotate experimental protocols (Version 1.0.0). Zenodo. http://doi.org/10.5281/zenodo.2171281
•  User guide of the BioH annotation tool.
   Source: Olga Giraldo. (2019, April). Guidelines to use BioH annotation tool (Version 1.0.0). Zenodo. http://doi.org/10.5281/zenodo.2639704
41. Methodology for developing a Gold Standard Corpus
•  Training session: usage of the annotation tool, annotation guidelines.
•  Assignment of protocols to annotators: 3 annotators per protocol.
•  Annotation phase I
•  Review of annotations
•  Annotation phase II
•  Resolve inconsistencies and produce a final output
Tag distribution
42. Results
•  Overall objective was missing in 12 protocols
43. Results
•  High inter-annotator agreement was observed
Source: Olga Giraldo. (2018). Fleiss Kappa of protocols (Version V 1.1.0en) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.1489112
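For readers unfamiliar with the statistic named in the source above, the following Python sketch computes Fleiss' kappa from a table of rating counts. It is a generic textbook implementation, not the script used to produce the published data set; the function name `fleiss_kappa` and the small example table (three annotators, categories Sample, Instrument and Reagent) are assumptions made for illustration.

```python
from typing import Sequence


def fleiss_kappa(ratings: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where ratings[i][j] is the number of
    annotators who assigned item i to category j."""
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    if n_items == 0 or n_raters < 2:
        raise ValueError("need at least one item and two annotators")
    if any(sum(row) != n_raters for row in ratings):
        raise ValueError("every item must be rated by the same number of annotators")

    # Observed agreement: mean over items of the share of agreeing annotator pairs.
    per_item = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
                for row in ratings]
    p_observed = sum(per_item) / n_items

    # Agreement expected by chance, from the overall category proportions.
    n_categories = len(ratings[0])
    proportions = [sum(row[j] for row in ratings) / (n_items * n_raters)
                   for j in range(n_categories)]
    p_expected = sum(p * p for p in proportions)

    if p_expected == 1.0:
        return 1.0  # every rating falls in a single category
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical table: 4 text spans, 3 annotators,
# columns are (Sample, Instrument, Reagent).
example = [
    [3, 0, 0],
    [0, 3, 0],
    [0, 2, 1],
    [0, 0, 3],
]
print(round(fleiss_kappa(example), 2))
```

A value of 1 means that all annotators chose the same category for every item, while values near 0 mean the agreement is no better than chance.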
44. Annotated terms mapped to ontology terms
1769 concepts related to Samples, Instruments and Reagents
•  NCBI Taxon: organisms
•  UBERON: anatomical parts
•  ERO: reagents
•  ChEBI: chemical compounds
•  PubChem: chemical compounds
•  SMART Protocols: reagents
•  EFO: instruments
•  OBI: instruments
•  BAO: instruments
•  SMART Protocols: instruments
45. Semantic gazetteers features
46. JAPE (Java Annotation Patterns Engine) rules features
47. Evaluating automatic annotation
Source: Olga Giraldo. (2018). Precision, Recall and F1 score (Version V1.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.1753520
Entity        F1-measure (>0.70)
Sample        45%
Instrument    60%
Reagent       59%
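The measures named in the source above can be computed as in the Python sketch below. It assumes an exact-match criterion in which an annotation is a `(start, end, label)` tuple and counts as correct only if the same tuple appears in the gold standard; the evaluation in the thesis may have used a different matching criterion. The function name and the example annotations are hypothetical.

```python
from typing import Set, Tuple

Annotation = Tuple[int, int, str]  # (start offset, end offset, label)


def precision_recall_f1(gold: Set[Annotation],
                        predicted: Set[Annotation]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1 of predicted against gold annotations."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations over one sentence.
gold = {(0, 5, "Sample"), (10, 20, "Instrument"),
        (25, 35, "Reagent"), (40, 48, "Reagent")}
predicted = {(0, 5, "Sample"), (10, 20, "Instrument"),
             (25, 35, "Reagent"), (50, 55, "Sample")}

print(precision_recall_f1(gold, predicted))
```

Computing the three values per entity type (Sample, Instrument, Reagent) only requires filtering both sets by label before calling the function.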
48. Gazetteers failures
Words with typos
•  e.g., centrifuge vs centifuge
Words with different meanings
•  e.g., the word “cat” is a term from NCBI Taxonomy used to represent the common name of “Felis catus”, but cat (or cat., Cat, CAT) is also the short form of “catalog”
Cases where multiple samples, reagents or instruments were mentioned in the same statement
•  e.g., eppendorf or falcon tubes instead of eppendorf tubes or falcon tubes.
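The three failure modes can be reproduced with a toy dictionary lookup. The Python sketch below is a simplified stand-in for a gazetteer, not the GATE-based system evaluated in the thesis; the term list, the labels and both function names are assumptions. The fuzzy lookup built on `difflib` is one possible mitigation for typos, introduced here for illustration and not a technique described in the slides.

```python
import difflib
import re
from typing import List, Optional, Tuple

# Toy gazetteer: term -> entity label (illustrative entries only).
GAZETTEER = {
    "centrifuge": "Instrument",
    "cat": "Sample",
    "eppendorf tubes": "Instrument",
    "falcon tubes": "Instrument",
}


def exact_lookup(text: str) -> List[Tuple[str, str]]:
    """Case-insensitive whole-word matching of gazetteer terms in the text."""
    lowered = text.lower()
    return [(term, label) for term, label in GAZETTEER.items()
            if re.search(r"\b" + re.escape(term) + r"\b", lowered)]


def fuzzy_lookup(token: str, cutoff: float = 0.85) -> Optional[str]:
    """Closest gazetteer term for a possibly misspelled token, if similar enough."""
    matches = difflib.get_close_matches(token.lower(), list(GAZETTEER),
                                        n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(exact_lookup("Spin the tubes in a centifuge"))   # typo: term is missed
print(fuzzy_lookup("centifuge"))                        # fuzzy match recovers it
print(exact_lookup("Cat. No. 12345"))                   # "cat" is a false positive
print(exact_lookup("Use eppendorf or falcon tubes"))    # only one of two terms found
```

In this toy setting the misspelled token is missed by exact matching but recovered by the fuzzy lookup, the catalog abbreviation is wrongly tagged as a sample, and the coordinated phrase yields only "falcon tubes".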
49. Summarizing
•  58 fully annotated documents,
•  High inter-annotator agreement (0.72 - 1.0),
•  Good practices for the development of an annotated corpus of documents were put into action:
   ✓ Clear annotation tasks (what and how to annotate),
   ✓ Low ambiguity of the data,
   ✓ The participation of annotators who are experts in life sciences,
   ✓ Three annotators per document,
   ✓ Two annotation phases.
•  This annotated corpus could serve as a gold standard for biological natural language processing (NLP) tasks.
50. Specific research questions
•  How to design a guideline that formally represents bibliographic
(e.g. title, author, version), and rhetorical components (e.g.
purpose, materials, and procedure) from experimental protocols
in life science?
•  What is the ontological structure that allows the formal
representation of an experimental protocol as a document and as
an executable element?
•  How to facilitate the annotation and finding specific protocols
based on common data elements in experimental protocols?
•  How to facilitate automatic entity recognition by using semantics
and NLP techniques?
•  How to facilitate the generation of semantic documents for
experimental protocols?
51. Why do we need to formalize and extract information from lab protocols?
•  Because we want a recommendation system that matches protocols according to my situation, for instance:
   ✓ samples I have,
   ✓ availability of equipment, reagents, lab conditions,
   ✓ expertise.
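One simple way such a recommendation could work is sketched below in Python: keep the protocols that use a sample the lab has, then rank them by the share of required instruments and reagents that are available. This is an assumption about how matching might be done, not the system described in the thesis; the class `Requirements`, the function `recommend`, the scoring rule and the example data are all hypothetical, and lab conditions and expertise are left out for brevity.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Requirements:
    """What a protocol needs, expressed with the SIRO categories."""
    title: str
    samples: FrozenSet[str]
    instruments: FrozenSet[str]
    reagents: FrozenSet[str]


def recommend(protocols: Iterable[Requirements],
              my_samples: Set[str],
              my_instruments: Set[str],
              my_reagents: Set[str]) -> List[Tuple[str, float]]:
    """Rank protocols applicable to my samples by how much of the
    required equipment and reagents I already have (1.0 = everything)."""
    ranked = []
    for protocol in protocols:
        if not (protocol.samples & my_samples):
            continue  # the protocol does not apply to any sample I have
        needed = protocol.instruments | protocol.reagents
        available = needed & (my_instruments | my_reagents)
        score = len(available) / len(needed) if needed else 1.0
        ranked.append((protocol.title, score))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical catalogue and lab inventory.
catalogue = [
    Requirements("Protocol A", frozenset({"leaf tissue"}),
                 frozenset({"mortar and pestle"}),
                 frozenset({"2-propanol", "Triton X-100"})),
    Requirements("Protocol B", frozenset({"leaf tissue"}),
                 frozenset({"centrifuge"}),
                 frozenset({"2-propanol"})),
    Requirements("Protocol C", frozenset({"mouse"}),
                 frozenset({"centrifuge"}),
                 frozenset({"collagenase"})),
]

for title, score in recommend(catalogue,
                              my_samples={"leaf tissue"},
                              my_instruments={"mortar and pestle"},
                              my_reagents={"2-propanol"}):
    print(title, round(score, 2))
```

With this inventory, Protocol C is filtered out because no matching sample is available, and Protocol A ranks above Protocol B because a larger share of its requirements is already covered.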
52. SMART Protocols Publication Platform
Available at: https://smart-protocols.firebaseapp.com/login
53. What is semantic publishing?
•  Semantic publishing is the workflow that aggregates detailed, well-characterized and semantically interoperable assertions so that they are intelligible to humans and processable by machines.
•  The assertions are created by domain experts.
•  The aggregation of assertions does not end at the time of publication; it assumes that the assertions in the published object will continue to evolve.
56. Features
•  reagent_name_resolver
•  organism_name_resolver
Open Science: Open access, Open source, Open data, Citizen science, Open educational resources
57. Conclusions and outcomes
Research questions
1.  How to design a guideline that formally represents bibliographic (e.g. title, author, version), and rhetorical components (e.g. purpose, materials, and procedure) from experimental protocols in life science?
2.  What is the ontological structure that allows the formal representation of an experimental protocol as a document and as an executable element?
Conclusions
•  Identification of data elements for reporting experimental protocols,
•  Data elements are resolvable against resources in the web of data,
•  Ontologies are the connective tissue,
•  The protocols are available in RDF, JSON, HTML, etc.
•  In order to preserve the digital continuum (data-experimental protocol), several layers of semantics are needed in a well coordinated metadata modeling effort.
58. Conclusions and outcomes
Publications
1.  Giraldo, O., García, A., & Corcho, O. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795.
2.  Giraldo, O., García, A., López, F., & Corcho, O. (2017). Using semantics for representing experimental protocols. Journal of biomedical semantics, 8(1), 52. doi:10.1186/s13326-017-0160-y.
International internships
Institution: Centro Internacional de Agricultura Tropical (CIAT)
Location: Cali, Colombia
Dates: 20/01/2014 - 30/03/2014
Duration (weeks): 10
Topic: Analysis of laboratory protocols. Phase I: standardization of laboratory protocols using semantic technologies
Institution: The Ontology Development Group in the Oregon Health & Science University
Location: Portland, Oregon, United States
Dates: 05/10/2014 - 31/01/2015
Duration (weeks): 17
Topic: developing an ontological model to represent experimental protocols included in Elsevier's research data management systems
59. Conclusions and outcomes
Research questions
3.  How to facilitate the manual annotation based on common data elements in experimental protocols?
4.  How to facilitate automatic entity recognition by using semantics and NLP techniques?
Conclusions
•  The identification of the key elements from the SIRO model helps to focus the manual and automatic annotation of protocols,
•  The automatic annotation depends on the knowledge encoded in the ontologies. There is the need for better and more exhaustive ontologies describing samples, instruments and reagents.
•  Reagents and instruments should come directly from suppliers.
60. Conclusions and outcomes
Publications
1.  Giraldo, O., García, A., Ohta T., Lopez F. (2017). Annotating the SIRO model and discovering experimental protocols. Biomedical Linked Annotation Hackathon 3 (BLAH3), Tokyo, Japan; January 16-20, 2017. http://blah3.linkedannotation.org/program
2.  Giraldo, O., García, A., Figueredo J., Corcho O. (2015), Using Semantics and NLP in Experimental Protocols. 8th Semantic Web Applications and Tools for Life Sciences International Conference (SWAT4LS), Cambridge, UK; December 7-10, 2015. ISSN 1613-0073. http://ceur-ws.org/Vol-1546/
61. Conclusions and outcomes
Research question
5.  How to facilitate the generation of semantic documents for experimental protocols?
Conclusions
The publication process presented here is a novel publication paradigm that delivers semantics at birth.
•  The SMART Protocols ontology is used to guide the data capturing process,
•  PubChem and UniProt are used to enrich information about chemicals and organisms.
•  The HTML is marked with the LabProtocol Bioschemas profile.
•  The publication paradigm presented in this thesis could also be applicable to other types of documents.
•  This publication paradigm is aligned with open science.
62. Conclusions and outcomes
Awards
Best Idea award (2016). First phase in the first edition of the actúaLoop competition for innovation in social networks.
Best poster at the International Conference on Biomedical Ontology 2015 (ICBO 2015)
Authors: Olga Giraldo, Alexander García and Oscar Corcho
Title: Using Semantics and NLP in the SMART Protocols Repository
Publication: http://icbo2015.fc.ul.pt/ICBO2015Proceedings.pdf
Venue: Lisbon, Portugal. Date: July 27-30, 2015
63. Future work
•  Improve the gazetteer and rule based system.
•  Investigate more agile methods for end user engagement in the process of maintaining terminological resources.
•  From the SIRO model, structuring the objective remains a challenge. The identification and semantic characterization of the objective of a protocol should be investigated.
•  There is the need to have a representation for workflows executed by robots. Is one single representation for the protocols reasonable? How to manage these workflows?
66. Instructions for authors
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
67. Corpus of protocols
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
68. Minimum information standards
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
69. Ontologies
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
70. Developing the LabProtocol profile
Use cases Mapping
Specification
Adoption
Testing
http://bioschemas.org/useCases/LabProtocols/
https://smart-protocols.firebaseapp.com/login
http://bioschemas.org/types/LabProtocol/
71. Poor data availability
2011… 2017…
[1] Alsheikh-Ali AA, Qureshi W, Al-Mallah MH, Ioannidis JPA (2011) Public Availability of Published Research Data in High-Impact Journals. PLoS ONE 6(9): e24357. doi:10.1371/journal.pone.0024357
[2] Vasilevsky et al. (2017), Reproducible and reusable research: are journal data sharing policies meeting the mark? PeerJ 5:e3208; DOI 10.7717/peerj.3208
72. Annotating with BioH
a.  Select the text by using the mouse, then…
b.  Click on the annotate button to highlight the text.
c.  Add a tag, as was indicated in step 7, then…
d.  Add a comment, and…
e.  Save.
73. Bibliographic data elements
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
74. Discursive data elements
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
75. Data elements describing materials
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
76. Data elements describing materials
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
77. Data elements describing materials
Source: Giraldo et al. (2018), A guideline for reporting experimental protocols in life sciences. PeerJ 6:e4795; DOI 10.7717/peerj.4795
78. Data elements for the procedure
79. Data elements for the procedure
80. SMART Protocols Publication Platform is FAIR
•  Identity metadata for the protocol and its digital objects greatly improves the “findability” of the data.
•  Access is improved by using the tool to publish protocols to archives, e.g. Zenodo.
•  Interoperability is improved by using standard metadata definitions and representations.
•  Reuse is improved by providing suitable information about the methodology followed to derive the data.