SMART Protocols

  1. SMART Protocols: SeMAntic RepresenTation for Experimental Protocols
     Olga Giraldo, ogiraldo@fi.upm.es
     Ontology Engineering Group (OEG), Universidad Politécnica de Madrid
  2. Agenda
     • What is a lab protocol
     • Motivation
     • Our general research question
     • Our assumption
     • Our proposal
     • Preliminary results
     • Future work
  3. What is a lab protocol
     • Laboratory protocols are like cooking recipes
     • They have ingredients: reagents and samples
     • They have appliances: equipment
     • They have a total time
     • They have a list of instructions
     • They have critical steps
     • Laboratory protocols are the "how to do" of an experiment
  4. Some problems in lab protocols
     • Some of them present insufficient granularity.
     • The instructions can be imprecise or ambiguous because they are written in natural language. Examples:
       • Incubate the centrifuge tubes in a water bath.
       • Incubate the samples for 5 min with gentle shaking.
       • Rinse DNA briefly in 1-2 ml of wash.
       • Incubate at -20C overnight.
  5. Motivation: why do we need to formalize and extract information from lab protocols?
     • Because we want a recommendation system that matches protocols to my situation, for instance:
       • samples I have,
       • availability of equipment, reagents, lab conditions,
       • expertise.
     • We also want content-based information retrieval: meaningful sentences, sample used, purpose of the protocol, applicability, critical steps, etc.
     • Also, identification of instructions. Example query: find all protocols for DNA extraction that have been used in Oryza sativa and that are suitable for processing a large number of samples with a low execution time.
  6. Currently…
     • Semi-structured information
     • Unstructured information
     • How to formalize the information from laboratory protocols as a knowledge base? Ontologies + NLP tools
  7. Our assumption
     "Experimental protocols are fundamental information structures that should support the description of the processes by means of which results are generated in experimental research"
  8. Our proposal
  9. Methods to represent and extract information
     • Ontology model representing lab protocols
     • Gazetteer-based method: use existing lists of named entities (lists of proper nouns, which refer to real-life entities)
     • Rule-based approaches: write manual extraction rules
     • Combination of the above
  10. Ontology development
  11. SMART Protocols - document
      [Diagram: the protocol as a document]
      Classes shown: sp:application of the protocol, sp:advantage of the protocol, sp:limitation of the protocol, sp:provenance of the protocol, sp:purpose of the protocol, sp:introduction section, sp:buffer list, sp:equipment and supplies list, sp:kit list, sp:primer list, sp:reagent list, sp:software list, sp:solution list, sp:materials section, exact:caution, exact:alert message, sp:critical step, sp:hint, sp:pause point, sp:storage condition, sp:timing, sp:troubleshooting, sp:methods section, sp:experimental protocol, iao:document, iao:document part, iao:textual entity, iao:data set
      Properties shown: owl:subClassOf, ro:hasPart, ro:partOf
      • Rhetorical and structural components (e.g. introduction, materials, and methods)
      • Information such as the application of the protocol, advantages and limitations, list of reagents, critical steps
  12. SMART Protocols - wf
      [Diagram: basic steps of DNA extraction]
      Classes shown: sp:basic step of DNA extraction, p-plan:Step, p-plan:Variable, sp:cell disruption, sp:digestion reaction, sp:DNA purification, sp:plant tissue, sp:powdered tissue, sp:digested contaminant, obi:DNA extract
      Properties shown: p-plan:hasInputVariable, p-plan:hasOutputVariable, owl:subClassOf, bfo:isPrecededBy
      Representation of the workflow aspects in protocols
      • Implicit order in the instructions, following the input/output structure
  13. SMART Protocols documentation
      • The SMART Protocols ontology is available here: http://vocab.linkeddata.es/SMARTProtocols/
      • Paper accepted at Linked Science 2014 (LISC2014)
        • Authors: Giraldo O, Garcia A, Corcho O
        • Title: SMART Protocols: SeMAntic RepresenTation for Experimental Protocols
        • Co-located with the 13th International Semantic Web Conference (ISWC2014)
        • http://xurl.es/smart_protocols
  14. Linguistic pattern recognition
  15. Classification of the protocols
      • Our corpus of protocols was classified according to purpose:
        • DNA / RNA extraction
        • DNA amplification
        • Electrophoresis or sequencing of nucleic acids
        • Genetic transformation
      • Identification of instructions common to a type of protocol
        • Key steps in DNA extraction: cell disruption, digestion reaction and DNA purification
        • ≈ pasta recipe. Key steps of a pasta recipe: boil water, cook the pasta, and mix the sauce with the pasta.
  16. Creation of gazetteer lists
      • Lists of keywords, used to find occurrences of those keywords in the text.
  17. Creation of grammar rules (JAPE rules)
      • To find occurrences of basic steps from lab protocols.

      Rule: secondstep1
      (
        ({Token.category == "CD"})
        (({Token})[0,3])
        ({Token.category == "NN"})
        (({Token})[0,3])
        {digest_reaction_reagent.majorType == "reagents"}
        (({Token})*)
        ({Token.string == "."})
      ):digestionreaction1
      -->
      :digestionreaction1.Digestionreaction1 = {type = "digestionreaction1"}
  18. SMART Protocols in action
      sp = SMART Protocols, ro = Relation Ontology
      [Diagram: an annotated DNA extraction protocol]
      Classes shown: sp:experimental protocol, sp:DNA extraction protocol, sp:advantages, sp:sample, sp:title of the protocol, sp:author entry, sp:application of the protocol
      Properties shown: owl:subClassOf, rdf:type, sp:hasAuthor, sp:hasTitle, ro:partOf
  19. SMART Protocols in action
  20. Future work
  21. Continue…
      • Analysis of the protocols, focusing on the identification of keywords and/or constructs in English (e.g. instructions, actions).
      • Writing rules.
      • Executing, testing and debugging the rules.
  22. Goal of the internship
      To take advantage of previous experience in the formalization of lab protocols and apply it in a new OHSU-Elsevier project focused on research data management systems.
  23. Special thanks…
      • Supervisors and OEG colleagues: Oscar Corcho, Alexander Garcia, Daniel Garijo, María Poveda, Pablo Calleja
      • Thank you!!!
      Olga Giraldo, ogiraldo@fi.upm.es, oxgiraldo@gmail.com
      Ontology Engineering Group (OEG), Universidad Politécnica de Madrid
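
Slide 12 says the order of the instructions is implicit in the input/output structure of the steps. As a rough illustration only, the Python sketch below recovers that order from step inputs and outputs and then emits bfo:isPrecededBy statements. The pairing of each step with its input and output is an assumption read off the flattened diagram, and the names STEPS, order_steps and preceded_by are hypothetical; none of this is part of the SMART Protocols ontology itself.

```python
# Step -> (input variable, output variable).
# ASSUMPTION: this pairing is guessed from the labels on slide 12.
STEPS = {
    "sp:DNA purification": ("sp:digested contaminant", "obi:DNA extract"),
    "sp:cell disruption": ("sp:plant tissue", "sp:powdered tissue"),
    "sp:digestion reaction": ("sp:powdered tissue", "sp:digested contaminant"),
}


def order_steps(steps):
    """Chain steps so that each step consumes the previous step's output."""
    outputs = {out for _, out in steps.values()}
    by_input = {inp: name for name, (inp, _) in steps.items()}
    # The first step consumes something that no other step produces.
    start_inputs = [inp for inp in by_input if inp not in outputs]
    if len(start_inputs) != 1:
        raise ValueError("expected exactly one starting input")
    ordered = []
    current = start_inputs[0]
    # The length guard stops the loop if the data accidentally contains a cycle.
    while current in by_input and len(ordered) < len(steps):
        name = by_input[current]
        ordered.append(name)
        current = steps[name][1]
    return ordered


def preceded_by(ordered):
    """Express the recovered order as (later, bfo:isPrecededBy, earlier) triples."""
    return [
        (later, "bfo:isPrecededBy", earlier)
        for earlier, later in zip(ordered, ordered[1:])
    ]


if __name__ == "__main__":
    for triple in preceded_by(order_steps(STEPS)):
        print(triple)
```

The sketch assumes a strictly linear workflow with one input and one output per step; a real protocol with branching steps would need a proper topological sort.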

Editor's Notes

  • As I mentioned before, an experimental protocol describes how to do an experiment. For this reason, our assumption is that experimental protocols are…
  • What do we propose?
  • This set of methods to represent and extract information from laboratory protocols: the first one is an ontology model…
    The gazetteer-based method, which uses lists of entities or objects from lab protocols that we would like to retrieve.
    The manual creation of rules.
    And a combination of all these methods.
  • What results have we obtained?
  • The development of two ontology modules: one represents the metadata needed to report a laboratory protocol, and the other represents the protocol as an executable element.
  • The ontologies are available here, and a paper describing the ontology design was recently accepted at the Linked Science 2014 workshop.
    So far, we have covered how to formally report a lab protocol.
  • Now we describe the methods used to extract linguistic patterns from the lab protocols.
  • The first step in this stage was the classification of lab protocols according to their purpose. In our corpus we identified four types of protocols: protocols for nucleic acid extraction, for DNA amplification, for electrophoresis or sequencing, and for genetic transformation.
  • Rules were then created to find occurrences of basic steps in lab protocols. Here is an example: this rule was created to annotate an instruction associated with the digestion reaction. The rule describes a quantity of a reagent that participates only in the digestion reaction stage. The annotation continues until the nearest period.
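
To make the rule described in the last note more concrete, here is a minimal Python sketch of the same pattern: a number, then a noun within a few tokens, then a digestion reagent from a gazetteer within a few tokens, with the annotation running to the nearest period. It is not the GATE/JAPE implementation. The gazetteer entries are hypothetical placeholders, and because no part-of-speech tagger is used, "CD" is approximated by a numeric token and "NN" by any alphabetic token; both are assumptions.

```python
import re

# HYPOTHETICAL gazetteer entries; the slides do not list the real ones.
DIGESTION_REAGENTS = {"proteinase", "rnase", "lysozyme"}

# Numbers (with optional decimals), words, or single punctuation marks.
TOKEN_RE = re.compile(r"\d+(?:\.\d+)?|[A-Za-z]+|[^\sA-Za-z\d]")


def tokenize(text):
    return TOKEN_RE.findall(text)


def is_number(token):
    # ASSUMPTION: stands in for the POS tag "CD".
    return token[0].isdigit()


def is_word(token):
    # ASSUMPTION: crude stand-in for the POS tag "NN".
    return token.isalpha()


def match_at(tokens, start):
    """Return the exclusive end index of a match starting at `start`, or None."""
    if not is_number(tokens[start]):
        return None
    # Noun after the number, with 0 to 3 tokens in between.
    for n in range(start + 1, min(start + 5, len(tokens))):
        if not is_word(tokens[n]):
            continue
        # Gazetteer reagent after the noun, with 0 to 3 tokens in between.
        for r in range(n + 1, min(n + 5, len(tokens))):
            if tokens[r].lower() in DIGESTION_REAGENTS:
                # Extend the annotation to the nearest period.
                for p in range(r + 1, len(tokens)):
                    if tokens[p] == ".":
                        return p + 1
                return None
    return None


def find_digestion_instructions(text):
    """Return the token spans that match the digestion-reaction pattern."""
    tokens = tokenize(text)
    matches = []
    i = 0
    while i < len(tokens):
        end = match_at(tokens, i)
        if end is None:
            i += 1
        else:
            matches.append(" ".join(tokens[i:end]))
            i = end
    return matches
```

For example, calling find_digestion_instructions on "Add 20 ul of proteinase K and incubate for 1 h." yields one match starting at "20", while "Incubate the samples for 5 min with gentle shaking." yields none because no gazetteer reagent follows the quantity.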
