Liberating Laboratory Data - Eureka


Presentation on the use of the Eureka Research Workbench to store data and scientific workflow information. Presented online as part of the Dial-a-molecule 'Liberating Laboratory Data' event.

Published in: Education, Technology

  1. Eureka Research Workbench: Semantic Capture of the Scientific Process
       • Stuart J. Chalk, Department of Chemistry, University of North Florida, Jacksonville, FL, USA
       • Liberating Laboratory Data – Day 2
  2. Capturing Science Data
       • Data is a fundamental output of science, but…
       • Data is not useful if it does not have context
       • Big data analytics needs detailed, well-structured metadata and relationships to assemble aggregated datasets for useful interpretation
       • Options: LabArchives, eCAT, LabTrove, Dryad data publishing, or …
  3. Eureka Research Workbench
       • Started in 2006 as an offshoot of getting involved in the Analytical Information Markup Language (AnIML) project
       • No way to store all research notes in a digital format
       • No way to capture the workflow of scientists
       • Realized writing in a lab notebook is equivalent to “multitype” blogging in the digital world
       • How to capture information? Many datatypes -> ExptML
       • How to store files and make them available through a web interface? Fedora-Commons
       • How to link data together? RDF (in Fedora-Commons)
  4. Experiment Markup Language (ExptML)
       • A specification (written in XML) that describes different types of information recorded during the scientific process
       • Many datatypes (will expand…): Annotation, Api, Calculation, Chemical, Citation, Communication, Customer, Data, Dataset, Definition, Element, Equipment, Event, Experiment, Group, Project, Protocol, Quote, Report, Result, Sample, Solution, Space, Specimen, Substance, Task, Template, Timeline, User, Vendor
  5. ExptML Chemical Schema
  6. ExptML Chemical Schema
  7. ExptML Chemical Instance
  8. Related Data - ExptML Ontology
       • In computer science, an ontology “formally represents knowledge as a set of concepts within a domain, and the relationships between those concepts. It can be used to model a domain and support reasoning about concepts.”*
       • In essence, an ontology allows us to define the relationships and assertions about concepts
       • For substances represented in ExptML we define: isSubstance (assertion), hasSubstance, isSubstanceOf
       *
  9. ExptML Ontology
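The relationship names on slide 8 can be illustrated with a small in-memory triple set. This is only a sketch: it assumes that hasSubstance and isSubstanceOf are inverses of each other, which the slides suggest but do not state, and the identifiers `exptml:sol1` and `exptml:chm2` are hypothetical (only `exptml:chm1` appears in the deck). It is not how Eureka or Fedora-Commons stores RDF.

```python
# Assumption: the two predicates from slide 8 are inverses of each other.
INVERSE = {"hasSubstance": "isSubstanceOf", "isSubstanceOf": "hasSubstance"}


def with_inverses(triples):
    """Return the triples plus the inverse of each triple whose predicate has one."""
    result = set(triples)
    for subject, predicate, obj in triples:
        inverse = INVERSE.get(predicate)
        if inverse is not None:
            result.add((obj, inverse, subject))
    return result


def objects_of(triples, subject, predicate):
    """List every object linked to `subject` by `predicate`, sorted for stable output."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


# Hypothetical data: a solution that contains two chemicals.
asserted = {
    ("exptml:sol1", "hasSubstance", "exptml:chm1"),
    ("exptml:sol1", "hasSubstance", "exptml:chm2"),
}
graph = with_inverses(asserted)

# Which records is chm1 a substance of?
print(objects_of(graph, "exptml:chm1", "isSubstanceOf"))  # ['exptml:sol1']
```

Storing only one direction and deriving the other keeps the asserted data small while still letting a search start from either end of the relationship.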
  10. Fedora Commons
       • Digital repository software for creating and managing online digital libraries
       • Stores the ExptML files
       • Stores any other files (PDFs, images, Word documents, etc.)
       • Stores relationships as RDF
       • Version control
       • Checksumming
       • Built-in search of content and relationships
  11. File Storage
       • Fedora-Commons treats each ExptML file as an object
       • In the definition of a Fedora object, the file is just one stream of many
       • By default each object also has a “DC” stream of metadata and a “RELS-EXT” stream of relationships
       • Each Fedora object can have any number of additional streams for paper PDFs, product/sample pictures, original file formats (if a conversion has been done), video, audio, anything
       • You can export individual streams or the whole Fedora object with streams binary encoded (sharing/archiving)
  12. File Storage
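The object-with-streams idea from slide 11 can be modelled in a few lines. The sketch below is a plain in-memory model, not the Fedora-Commons API: the class names, the `EXPTML` and `PHOTO` stream IDs, the placeholder contents, and the choice of SHA-256 for checksums are all assumptions made for illustration. Only the `DC` and `RELS-EXT` stream names and the `exptml:chm1` identifier come from the slides.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Datastream:
    """One stream attached to a repository object (hypothetical model)."""
    stream_id: str
    mime_type: str
    content: bytes

    def checksum(self) -> str:
        # SHA-256 is an assumption; the slides only say "checksumming".
        return hashlib.sha256(self.content).hexdigest()


@dataclass
class RepositoryObject:
    """An object that holds any number of streams, keyed by stream ID."""
    pid: str
    streams: dict = field(default_factory=dict)

    def add_stream(self, stream: Datastream) -> None:
        self.streams[stream.stream_id] = stream

    def manifest(self) -> dict:
        """Map each stream ID to its checksum, e.g. for a later integrity check."""
        return {sid: s.checksum() for sid, s in self.streams.items()}


obj = RepositoryObject(pid="exptml:chm1")
obj.add_stream(Datastream("DC", "text/xml", b"<dc/>"))                    # default metadata stream
obj.add_stream(Datastream("RELS-EXT", "application/rdf+xml", b"<rdf/>"))  # default relationships stream
obj.add_stream(Datastream("EXPTML", "text/xml", b"<placeholder/>"))       # hypothetical stream ID
obj.add_stream(Datastream("PHOTO", "image/png", b"\x89PNG placeholder"))  # hypothetical stream ID

print(sorted(obj.manifest()))  # ['DC', 'EXPTML', 'PHOTO', 'RELS-EXT']
```

Recomputing the manifest after an export and comparing it with the stored one is one simple way to confirm that no stream was altered in transit.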
  13. Eureka Interface
       • So, finally to the Eureka Research Workbench!
       • Web interface written in PHP using the CakePHP Framework
       • Communicates with Fedora-Commons API to create, retrieve, update and delete (CRUD) ExptML and other files
       • Representational State Transfer (REST) format for URLs, e.g. http://web.server/chemicals/view/exptml:chm1
       • Allows for searching of all files in Fedora
       • Can also search based on relationships
       • Can extract data out of XML files
       • Can gather data from other websites (via API controller) and add it to ExptML files
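The example URL on slide 13 appears to follow the common `/controller/action/id` routing layout used by CakePHP. The sketch below splits a URL of that shape into its parts. It assumes exactly three path segments, and the function name `parse_eureka_url` is hypothetical; Eureka's actual routing rules are not shown in the slides.

```python
from urllib.parse import urlparse


def parse_eureka_url(url):
    """Split a /controller/action/id style URL into its three parts."""
    path = urlparse(url).path
    parts = [p for p in path.split("/") if p]
    if len(parts) != 3:
        raise ValueError(f"expected /controller/action/id, got {path!r}")
    controller, action, identifier = parts
    return {"controller": controller, "action": action, "id": identifier}


route = parse_eureka_url("http://web.server/chemicals/view/exptml:chm1")
print(route)  # {'controller': 'chemicals', 'action': 'view', 'id': 'exptml:chm1'}
```

Read this way, the URL names a resource type (`chemicals`), an operation (`view`), and the identifier of the stored object (`exptml:chm1`).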
  14. Typical things we record in our notebook
       • Eureka Website – Notebook
  15. Conclusion
       • Eureka uses ExptML for representing science data
       • Reliable storage system for ExptML files (Fedora)
       • Method for storage of relationships (RDF in Fedora)
       • Web application to create ExptML files (Eureka)
       • TODO
           • Provide web functionality to process data
           • Provide mechanism for sharing of data (authenticated)
           • Integration into the RDA model for sharing research data
           • Integrate with many other websites, e.g. ChemSpider
           • Support enlItemManifest and future RDA specifications
  16. References
       • Eureka, Fedora-Commons, XML, ExptML, JSON, UnitsML, RDF, CIR, RDA