247th ACS Meeting: Experiment Markup Language (ExptML)

To integrate science into the semantic web it is important to capture the context of research as it is done. ExptML is designed to store information and workflows from the scientific process.

Transcript

  • 1. Experiment Markup Language: A Combined Markup Language and Ontology to Represent Science
       Stuart J. Chalk, Department of Chemistry, University of North Florida (schalk@unf.edu)
       2014 Spring ACS Meeting – CINF Paper 19
  • 2. Outline
       • Digital Representation of Science
       • Electronic Notebooks
       • The Eureka Research Workbench
       • Experiment Markup Language
       • ExptML Schema and Files
       • Semantic Data and Ontologies
       • File Storage
       • Eureka Interface
       • Web Interface
       • Conclusion
  • 3. Digital Representation of Science
       • Most research on digital science is focused on the data
       • Standards exist for the digital representation of
           • Data -> individual measurements, time series, spectra
           • Molecules
           • Chemical Reactions
       • Context is important!
           • Context can be added ad-hoc
           • Needs to be added systematically - to be searchable
       • We need a digital representation of the scientific process
  • 4. Eureka Research Workbench
       • Conceptualized in 2006
       • Need a way to store
           • Research activities
           • Laboratory resources
           • Data
       • Need to capture the workflow of scientists – not define it
       • Writing in a lab notebook is equivalent to blogging…
       • …but the context of the entries is important and varies
       • Many data types, so how to capture information?
       • Experiment Markup Language (ExptML)
  • 5. Experiment Markup Language (ExptML)
       • A specification (written in XML) that describes different types of information recorded during the scientific process (http://exptml.sourceforge.net)
       • Types: Annotation, Api, Calculation, Chemical, Citation, Customer, Data, Dataset, Definition, Element, Equipment, Event, Experiment, Group, Message, Project, Protocol, Quote, Report, Result, Sample, Solution, Space, Specimen, Substance, Task, Template, Timeline, User, Vendor
  • 6. ExptML Chemical Schema
  • 7. ExptML Chemical Schema
  • 8. ExptML Chemical (Instance)
  • 9. Linking ExptML Files
       • To allow ExptML to capture a scientific workflow, an ontology is needed to represent the structure
       • Needs to be
           • Flexible – able to be used in a wide variety of areas
           • Logical – the links make sense in the context of science
           • Searchable – so we can find research done in a similar way
           • Comprehensive! This is the BIG problem
       • Many existing ontologies
  • 10. ExptML Ontology
       • In computer science, an ontology “formally represents knowledge as a set of concepts within a domain, and the relationships between those concepts. It can be used to model a domain and support reasoning about concepts.”*
       • In essence, an ontology allows us to define relationships between concepts and assertions about them
       • For samples represented in ExptML we define
           • isSample (assertion)
           • hasSample (relationship)
           • isSampleOf (relationship)
       *https://en.wikipedia.org/wiki/Ontology_(information_science)
  • 11. ExptML Ontology
  • 12. Developments in ExptML
       • XML is nice for storage, archiving and transmitting information…
       • …but it is not so easy to use in software
       • There are many XML readers, but each has its own syntax
       • XML can be cumbersome to deal with in software because of
           • File size (XML is verbose)
           • Namespaces
           • Data types (e.g. string, decimal, etc.)
       • So the solution is…
  • 13. JavaScript Object Notation (JSON)
       • JSONize it!
       • Compact string representation of arrays of data
       • Used in AJAX requests in web browsers

       {
         "exptmlid": "exptml:ann1",
         "anntype": "comment",
         "text": "Had to wait for the biochemistry lab to finish using the spectrophotometer before the I could get on it. The standards sat around for 1 hr 30 minutes before I could run them.",
         "date": "2011-11-25T11:05:17-04:00"
       }

       <annotation id="exptml_ann1"
                   xmlns="urn:exptml:schema:draft:0.4"
                   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xsi:schemaLocation="urn:exptml:schema:draft:0.4 http://exptml.sourceforge.net/files/schema/exptml_annotation.xsd"
                   version="0.4">
         <anntype>comment</anntype>
         <text>Had to wait for the biochemistry lab to finish using the spectrophotometer before the I could get on it. The standards sat around for 1 hr 30 minutes before I could run them.</text>
         <date>2011-11-25T11:05:17-04:00</date>
       </annotation>
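The XML-to-JSON step on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the converter used in Eureka: the helper name `annotation_to_dict` is hypothetical, the annotation text is shortened, and the sketch handles only flat, single-level elements like the annotation shown. It copies the `id` attribute verbatim, so it yields `exptml_ann1` rather than the `exptml:ann1` form used in the slide's JSON.

```python
import json
import xml.etree.ElementTree as ET

# Shortened copy of the slide's annotation (schemaLocation omitted for brevity).
ANNOTATION_XML = """<annotation id="exptml_ann1" xmlns="urn:exptml:schema:draft:0.4" version="0.4">
  <anntype>comment</anntype>
  <text>The standards sat around for 1 hr 30 minutes before I could run them.</text>
  <date>2011-11-25T11:05:17-04:00</date>
</annotation>"""


def annotation_to_dict(xml_text):
    """Flatten a single-level annotation element into a dict (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    record = {"exptmlid": root.get("id")}
    for child in root:
        # ElementTree reports tags as "{namespace}local"; keep only the local name.
        local_name = child.tag.split("}", 1)[-1]
        record[local_name] = (child.text or "").strip()
    return record


annotation = annotation_to_dict(ANNOTATION_XML)
annotation_json = json.dumps(annotation, indent=2)
print(annotation_json)
```

Dropping the namespace and schema location is what makes the JSON form compact, and it is also what loses the link to a formal specification, which is the gap the JSON-LD slides address next.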
  • 14. JSON-LD
       • JSON-based Serialization for Linked Data
       • Current W3C recommendation*
       • Allows us to define a specification for the JSON data
       • "@context" is equivalent to an XML Schema
       *http://www.w3.org/TR/json-ld

       {
         "@context": {
           "exptmlid": "http://www.w3.org/2001/XMLSchema#string",
           "anntype": "http://www.w3.org/2001/XMLSchema#string",
           "text": "http://www.w3.org/2001/XMLSchema#string",
           "date": "http://www.w3.org/2001/XMLSchema#dateTime"
         }
       }
  • 15. JSON-LD

       {
         "@context": {
           "exptmlid": "http://www.w3.org/2001/XMLSchema#string",
           "anntype": "http://www.w3.org/2001/XMLSchema#string",
           "text": "http://www.w3.org/2001/XMLSchema#string",
           "date": "http://www.w3.org/2001/XMLSchema#dateTime"
         },
         "exptmlid": "exptml:ann1",
         "anntype": "comment",
         "text": "Had to wait for the biochemistry lab to finish using the spectrophotometer before the I could get on it. The standards sat around for 1 hr 30 minutes before I could run them.",
         "date": "2011-11-25T11:05:17-04:00"
       }
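If the context is read as a schema-like specification, as these slides suggest, a consumer could check each value against the datatype declared for its key. The Python sketch below does that by hand for the two XSD types that appear in the example. It is an assumption-laden illustration: `find_type_errors` and `CHECKS` are hypothetical names, the annotation text is shortened, `datetime.fromisoformat` only approximates the `xsd:dateTime` lexical rules, and a JSON-LD processor does not perform this kind of validation itself.

```python
from datetime import datetime

XSD = "http://www.w3.org/2001/XMLSchema#"

# Context and data combined in one document, as on the slide (text shortened).
document = {
    "@context": {
        "exptmlid": XSD + "string",
        "anntype": XSD + "string",
        "text": XSD + "string",
        "date": XSD + "dateTime",
    },
    "exptmlid": "exptml:ann1",
    "anntype": "comment",
    "text": "The standards sat around for 1 hr 30 minutes before I could run them.",
    "date": "2011-11-25T11:05:17-04:00",
}


def is_datetime(value):
    """Approximate xsd:dateTime with Python's ISO 8601 parser."""
    try:
        datetime.fromisoformat(value)
    except (TypeError, ValueError):
        return False
    return True


# One check per datatype IRI used in the context (hypothetical table).
CHECKS = {
    XSD + "string": lambda v: isinstance(v, str),
    XSD + "dateTime": lambda v: isinstance(v, str) and is_datetime(v),
}


def find_type_errors(doc):
    """Return keys that are missing, have an unknown datatype, or fail their check."""
    errors = []
    for key, datatype in doc["@context"].items():
        check = CHECKS.get(datatype)
        if key not in doc or check is None or not check(doc[key]):
            errors.append(key)
    return errors


print(find_type_errors(document))
```

Calling `find_type_errors` on a copy of the document whose `date` is replaced by a non-ISO string would report `date` as the failing key.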
  • 16. JSON-LD
       • @id represents an Internationalized Resource Identifier (IRI)
       • The IRI identifies a node and allows this data to be linked

       {
         "@context": "http://exptld.org/annotation.jsonld",
         "@id": "https://eureka.coas.unf.edu/exptml:ann1",
         "anntype": "comment",
         "text": "Had to wait for the biochemistry lab to finish using the spectrophotometer before the I could get on it. The standards sat around for 1 hr 30 minutes before I could run them.",
         "date": "2011-11-25T11:05:17-04:00"
       }
  • 17. Developments in the Ontology
       • Currently the ontology defines generic relationships
       • Should be expanded to provide additional context

       <rdf:Property rdf:ID="http://exptml.sourceforge.net/exptml_ontology.owl#hasSolution">
         <rdfs:label>has solution</rdfs:label>
         <rdfs:comment>Indicates that an experiment makes use of a particular solution</rdfs:comment>
         <rdfs:subPropertyOf rdf:resource="http://exptml.sourceforge.net/exptml_ontology.owl#rels"/>
       </rdf:Property>
       <rdf:Property rdf:ID="http://exptml.sourceforge.net/exptml_ontology.owl#hasBuffer">
         <rdfs:label>has buffer</rdfs:label>
         <rdfs:comment>Indicates that an experiment makes use of a buffer (solution)</rdfs:comment>
         <rdfs:subPropertyOf rdf:resource="http://exptml.sourceforge.net/exptml_ontology.owl#hasSolution"/>
       </rdf:Property>
       <rdf:Property rdf:ID="http://exptml.sourceforge.net/exptml_ontology.owl#hasReagent">
         <rdfs:label>has reagent</rdfs:label>
         <rdfs:comment>Indicates that an experiment makes use of a reagent (solution)</rdfs:comment>
         <rdfs:subPropertyOf rdf:resource="http://exptml.sourceforge.net/exptml_ontology.owl#hasSolution"/>
       </rdf:Property>
       <rdf:Property rdf:ID="http://exptml.sourceforge.net/exptml_ontology.owl#hasCalibrationStandard">
         <rdfs:label>has calibration standard</rdfs:label>
         <rdfs:comment>Indicates that an experiment makes use of a calibration standard</rdfs:comment>
         <rdfs:subPropertyOf rdf:resource="http://exptml.sourceforge.net/exptml_ontology.owl#hasSolution"/>
       </rdf:Property>
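The practical value of these `rdfs:subPropertyOf` statements is that a search for the generic relationship also finds the specific ones. A minimal Python sketch of that inference follows, with the parent links copied from the properties on this slide. The helper names and the experiment links (`exp1`, `sol3`, `sol7`, `usr1`) are hypothetical, and a real system would more likely delegate this to an RDF store or reasoner.

```python
# Parent links taken from the rdfs:subPropertyOf statements on the slide.
SUB_PROPERTY_OF = {
    "hasSolution": "rels",
    "hasBuffer": "hasSolution",
    "hasReagent": "hasSolution",
    "hasCalibrationStandard": "hasSolution",
}


def super_properties(prop):
    """Walk the subPropertyOf chain upward, stopping on a cycle or at the root."""
    chain = []
    seen = {prop}
    parent = SUB_PROPERTY_OF.get(prop)
    while parent is not None and parent not in seen:
        chain.append(parent)
        seen.add(parent)
        parent = SUB_PROPERTY_OF.get(parent)
    return chain


def matches(stated, queried):
    """True if a link stated with one property also counts as the queried property."""
    return stated == queried or queried in super_properties(stated)


# Hypothetical links recorded for one experiment.
links = [
    ("exp1", "hasBuffer", "sol3"),
    ("exp1", "hasReagent", "sol7"),
    ("exp1", "hasUser", "usr1"),
]

# Querying the generic property picks up the buffer and the reagent.
solutions = [obj for _, pred, obj in links if matches(pred, "hasSolution")]
print(solutions)
```

The inference runs in one direction only: a link stated as `hasSolution` is not assumed to be a buffer, which is why the more specific sub-properties add context.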
  • 18. Expand the Ontology
       • BIG Problem!
       • Context is specific to the science and the scientist
       • How many sub-properties of "hasSolution" are needed?
       • Additional context is domain specific so…
       • … we need to integrate other related ontologies
           • Map "hasSolution" to predicates in other ontologies
           • Use VIVO to choose the 'best' domain specific ontology
           • Aggregate science ontologies? – requires software/time
       • Evaluate ElasticSearch (http://www.elasticsearch.org)
  • 19. Combine ML and Ontology?
       • JSON-LD is a concrete RDF syntax!*
       • JSON-LD can be converted to triples
       *http://www.w3.org/TR/json-ld/#relationship-to-rdf

       {
         "@context": "http://exptld.org/annotation.jsonld",
         "@id": "https://eureka.coas.unf.edu/exptml:ann1",
         "anntype": "comment",
         "text": "Had to wait for the biochemistry lab to finish using the spectrophotometer before the I could get on it. The standards sat around for 1 hr 30 minutes before I could run them.",
         "date": "2011-11-25T11:05:17-04:00",
         "hasUser": [
           { "@id": "https://eureka.coas.unf.edu/exptml:usr1" },
           { "@id": "https://eureka.coas.unf.edu/exptml:usr11" }
         ],
         "hasExperiment": { "@id": "https://eureka.coas.unf.edu/exptml:exp1" }
       }
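To make the conversion to triples concrete, the Python sketch below walks a document like the one on this slide and emits one (subject, predicate, object) statement per value. It is a deliberately simplified stand-in for the JSON-LD to RDF algorithm, not an implementation of it: it does not fetch or expand the remote context, it assumes every key maps into a single vocabulary (here the ontology IRI from the earlier slide, which is an assumption about what the exptld.org context would define), and the annotation text is shortened. A real application would use a JSON-LD library.

```python
# Assumed vocabulary prefix; the actual term mappings live in the remote context.
VOCAB = "http://exptml.sourceforge.net/exptml_ontology.owl#"

ANN = "https://eureka.coas.unf.edu/exptml:ann1"

document = {
    "@context": "http://exptld.org/annotation.jsonld",
    "@id": ANN,
    "anntype": "comment",
    "text": "The standards sat around for 1 hr 30 minutes before I could run them.",
    "date": "2011-11-25T11:05:17-04:00",
    "hasUser": [
        {"@id": "https://eureka.coas.unf.edu/exptml:usr1"},
        {"@id": "https://eureka.coas.unf.edu/exptml:usr11"},
    ],
    "hasExperiment": {"@id": "https://eureka.coas.unf.edu/exptml:exp1"},
}


def to_triples(node):
    """Return (subject, predicate, object, object_is_iri) tuples for one node."""
    subject = node["@id"]
    triples = []
    for key, value in node.items():
        if key.startswith("@"):
            continue  # keywords such as @context and @id are not statements
        predicate = VOCAB + key
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict) and "@id" in item:
                # A node reference: the object is another resource.
                triples.append((subject, predicate, item["@id"], True))
            else:
                # Anything else is treated as a plain literal.
                triples.append((subject, predicate, str(item), False))
    return triples


triples = to_triples(document)
for triple in triples:
    print(triple)
```

The two `hasUser` entries become two separate statements with the same subject and predicate, which is how the array on the slide links one annotation to several users.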
  • 20. Conclusion
       • Nice start - allows for conceptual evaluation of the approach
       • Needs work – “science cannot be described by one alone”
       • TODO
           • Integrate and aggregate existing ontologies
           • Work with ELN developers e.g. LabTrove and elnItemManifest*
           • Encourage ontology development in areas where gaps exist e.g. Chemical Analysis
           • Contribute to standards development e.g. Research Data Alliance (RDA) – http://rd-alliance.org
       * “First steps towards semantic descriptions of electronic laboratory notebook records”, S J Coles, J G Frey, C L Bird, R J Whitby and A E Day, J. Cheminformatics, 2013, 5:52 http://dx.doi.org/10.1186/1758-2946-5-52
  • 21. References
       • Eureka – http://sourceforge.net/projects/eureka
       • Fedora-Commons – http://fedora-commons.org
       • XML – http://www.w3.org/standards/xml
       • ExptML – http://exptml.sourceforge.net/
       • JSON-LD – http://www.w3.org/TR/json-ld
       • UnitsML – http://unitsml.nist.gov/
       • RDF – http://www.w3.org/RDF/
       • CIR – http://cactus.nci.nih.gov/chemical/structure
       • RDA (Research Data Alliance) – http://rd-alliance.org
       • http://www.nytimes.com/2013/08/13/science/how-to-share-scientific-data.html