Liberating Laboratory Data - Eureka

Presentation on the use of the Eureka Research Workbench to store data and scientific workflow information. Presented online as part of the Dial-a-molecule 'Liberating Laboratory Data' event (http://www.dial-a-molecule.org/wp/events-listing/liberating-laboratory-data/)

Published in: Education, Technology

  1. Eureka Research Workbench: Semantic Capture of the Scientific Process
     Stuart J. Chalk, Department of Chemistry, University of North Florida, Jacksonville, FL, USA (schalk@unf.edu)
     Liberating Laboratory Data – Day 2
  2. Capturing Science Data
       • Data is a fundamental output of science, but…
       • Data is not useful if it does not have context
       • Big data analytics needs detailed, well-structured metadata and relationships to assemble aggregated datasets for useful interpretation
       • Options
           • LabArchives: http://www.labarchives.com
           • eCAT: http://www.researchspace.com/electronic-lab-notebook/
           • LabTrove: http://www.labtrove.org/
           • Dryad data publishing: http://datadryad.org/
           • or …
  3. Eureka Research Workbench
       • Started in 2006 as an offshoot of getting involved in the Analytical Information Markup Language (AnIML) project
       • No way to store all research notes in a digital format
       • No way to capture the workflow of scientists
       • Realized writing in a lab notebook is equivalent to “multitype” blogging in the digital world
       • How to capture information? Many datatypes -> ExptML
       • How to store files and make them available through a web interface? (Fedora-Commons)
       • How to link data together? RDF (in Fedora-Commons)
  4. Experiment Markup Language (ExptML)
       • A specification (written in XML) that describes different types of information recorded during the scientific process (http://exptml.sourceforge.net)
       • Many datatypes (will expand…): Annotation, Api, Calculation, Chemical, Citation, Communication, Customer, Data, Dataset, Definition, Element, Equipment, Event, Experiment, Group, Project, Protocol, Quote, Report, Result, Sample, Solution, Space, Specimen, Substance, Task, Template, Timeline, User, Vendor
  5. ExptML Chemical Schema
  6. ExptML Chemical Schema
  7. ExptML Chemical Instance
  8. Related Data - ExptML Ontology
       • In computer science, an ontology “formally represents knowledge as a set of concepts within a domain, and the relationships between those concepts. It can be used to model a domain and support reasoning about concepts.”*
       • In essence, an ontology allows us to define the relationships and assertions about concepts
       • For substances represented in ExptML we define
           • isSubstance (assertion)
           • hasSubstance
           • isSubstanceOf
       * https://en.wikipedia.org/wiki/Ontology_(information_science)
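The three substance relationships on slide 8 can be pictured as subject–predicate–object statements. The following is a minimal Python sketch, not the ExptML ontology itself: the identifiers `exptml:sub1` and `exptml:sol1` are hypothetical, the `"true"` object used for the `isSubstance` assertion is an illustrative encoding, and treating `hasSubstance` and `isSubstanceOf` as inverses of each other is an assumption suggested by their names.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Assumption: the two relationship names are inverses of each other.
INVERSE = {"hasSubstance": "isSubstanceOf", "isSubstanceOf": "hasSubstance"}


def with_inverses(triples):
    """Return the given triples plus the inverse of each invertible statement."""
    result = set(triples)
    for t in triples:
        inverse = INVERSE.get(t.predicate)
        if inverse is not None:
            result.add(Triple(t.obj, inverse, t.subject))
    return result


def objects(triples, subject, predicate):
    """List every object linked to `subject` through `predicate`."""
    return sorted(
        t.obj for t in triples if t.subject == subject and t.predicate == predicate
    )


# Hypothetical identifiers: a substance and a solution that contains it.
triples = [
    Triple("exptml:sub1", "isSubstance", "true"),          # assertion
    Triple("exptml:sol1", "hasSubstance", "exptml:sub1"),  # relationship
]
graph = with_inverses(triples)
print(objects(graph, "exptml:sub1", "isSubstanceOf"))
```

Storing only one direction and deriving the other keeps the recorded statements small while still letting a search start from either the substance or the thing that contains it.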
  9. ExptML Ontology
  10. Fedora Commons
       • Digital repository software for creating and managing online digital libraries
       • Stores the ExptML files
       • Stores any other files (PDFs, images, Word etc.)
       • Stores relationships as RDF
       • Version control
       • Checksumming
       • Built-in search of content and relationships
  11. File Storage
       • Fedora-Commons treats each ExptML file as an object
       • In the definition of a Fedora object the file is just one stream of many. By default each object also has a “DC” stream of metadata and a “RELS-EXT” stream of relationships
       • Each Fedora object can have any number of additional streams for
           • Paper PDFs, product/sample pictures, original file formats (if a conversion has been done)
           • Video, audio, anything
       • You can export individual streams or the whole Fedora object with streams binary encoded (sharing/archiving)
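The object-with-streams idea on slide 11 can be sketched as a small data structure. This is a toy Python model under stated assumptions, not the Fedora-Commons API: the class names, the `EXPTML` stream id, the MIME types, and the placeholder XML content are all hypothetical, and SHA-256 is chosen here only to illustrate the checksumming mentioned on slide 10.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Datastream:
    stream_id: str
    mime_type: str
    content: bytes

    @property
    def checksum(self) -> str:
        """SHA-256 digest of the content (illustrative choice of algorithm)."""
        return hashlib.sha256(self.content).hexdigest()


@dataclass
class RepositoryObject:
    pid: str
    streams: dict = field(default_factory=dict)

    def __post_init__(self):
        # Every object starts with the two default streams named on the slide.
        defaults = (("DC", "text/xml"), ("RELS-EXT", "application/rdf+xml"))
        for stream_id, mime_type in defaults:
            self.streams.setdefault(stream_id, Datastream(stream_id, mime_type, b""))

    def add_stream(self, stream_id: str, mime_type: str, content: bytes) -> Datastream:
        """Attach one more stream (an ExptML file, a PDF, a picture, ...)."""
        stream = Datastream(stream_id, mime_type, content)
        self.streams[stream_id] = stream
        return stream


# Hypothetical object: the ExptML file is just one stream among several.
obj = RepositoryObject("exptml:chm1")
exptml = obj.add_stream("EXPTML", "text/xml", b"<chemical/>")  # placeholder content
print(sorted(obj.streams))
print(exptml.checksum)
```

Keeping the checksum as a derived property means it always reflects the stored bytes, which is the point of checksumming: a later mismatch signals that a stream has changed or been corrupted.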
  12. File Storage
  13. Eureka Interface
       • So, finally to the Eureka Research Workbench!
       • Web interface written in PHP using the CakePHP Framework
       • Communicates with Fedora-Commons API to create, retrieve, update and delete (CRUD) ExptML and other files
       • Representational State Transfer (REST) format for URLs, e.g. http://web.server/chemicals/view/exptml:chm1
       • Allows for searching of all files in Fedora
       • Can also search based on relationships
       • Can extract data out of XML files
       • Can gather data from other websites (via API controller) and add it to ExptML files
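The example URL on slide 13 follows the usual CakePHP pattern of `/controller/action/parameter`. The Python sketch below shows how such a URL breaks down; it is an illustration of the URL shape only, not Eureka's routing code, and the function name `parse_route` and the returned dictionary keys are hypothetical.

```python
from urllib.parse import urlparse


def parse_route(url: str) -> dict:
    """Split a /controller/action/id style URL into its three parts."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) != 3:
        raise ValueError(f"expected /controller/action/id, got {url!r}")
    controller, action, object_id = parts
    return {"controller": controller, "action": action, "id": object_id}


route = parse_route("http://web.server/chemicals/view/exptml:chm1")
print(route)
```

Here `chemicals` names the kind of record, `view` is the operation, and `exptml:chm1` is the identifier of the object to fetch from the repository.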
  14. Typical things we record in our notebook
       • Eureka Website – Notebook
  15. Conclusion
       • Eureka uses ExptML for representing science data
       • Reliable storage system for ExptML files (Fedora)
       • Method for storage of relationships (RDF in Fedora)
       • Web application to create ExptML files (Eureka)
       • TODO
           • Provide web functionality to process data
           • Provide mechanism for sharing of data (authenticated)
           • Integration into the RDA model for sharing research data
           • Integrate with many other websites, e.g. ChemSpider
           • Support enlItemManifest and future RDA specifications
  16. References
       • Eureka – http://sourceforge.net/projects/eureka
       • Fedora-Commons – http://fedora-commons.org
       • XML – http://www.w3.org/standards/xml
       • ExptML – http://exptml.sourceforge.net/
       • JSON – http://www.json.org/
       • UnitsML – http://unitsml.nist.gov/
       • RDF – http://www.w3.org/RDF/
       • CIR – http://cactus.nci.nih.gov/chemical/structure
       • RDA – http://rd-alliance.org