Liberating Laboratory Data - Eureka

Presentation on the use of the Eureka Research Workbench to store data and scientific workflow information. Presented online as part of the Dial-a-molecule 'Liberating Laboratory Data' event (http://www.dial-a-molecule.org/wp/events-listing/liberating-laboratory-data/)


Transcript

  • 1. Eureka Research Workbench: Semantic Capture of the Scientific Process Stuart J. Chalk Department of Chemistry University of North Florida Jacksonville, FL USA schalk@unf.edu Liberating Laboratory Data – Day 2
  • 2. Capturing Science Data. Data is a fundamental output of science, but data is not useful if it does not have context. Big data analytics needs detailed, well-structured metadata and relationships to assemble aggregated datasets for useful interpretation. Options: LabArchives (http://www.labarchives.com), eCAT (http://www.researchspace.com/electronic-lab-notebook/), LabTrove (http://www.labtrove.org/), Dryad data publishing (http://datadryad.org/), or …
  • 3. Eureka Research Workbench. Started in 2006 as an offshoot of getting involved in the Analytical Information Markup Language (AnIML) project. There was no way to store all research notes in a digital format and no way to capture the workflow of scientists. Realized that writing in a lab notebook is equivalent to “multitype” blogging in the digital world. How to capture information? Many datatypes -> ExptML. How to store files and make them available through a web interface? Fedora-Commons. How to link data together? RDF (in Fedora-Commons).
  • 4. Experiment Markup Language (ExptML). A specification (written in XML) that describes different types of information recorded during the scientific process (http://exptml.sourceforge.net). Many datatypes (will expand…): Annotation, Api, Calculation, Chemical, Citation, Communication, Customer, Data, Dataset, Definition, Element, Equipment, Event, Experiment, Group, Project, Protocol, Quote, Report, Result, Sample, Solution, Space, Specimen, Substance, Task, Template, Timeline, User, Vendor.
  • 5. ExptML Chemical Schema
  • 6. ExptML Chemical Schema
  • 7. ExptML Chemical Instance
  • 8. Related Data - ExptML Ontology. In computer science, an ontology “formally represents knowledge as a set of concepts within a domain, and the relationships between those concepts. It can be used to model a domain and support reasoning about concepts.”* In essence, an ontology allows us to define the relationships and assertions about concepts. For substances represented in ExptML we define: isSubstance (assertion), hasSubstance, isSubstanceOf. *https://en.wikipedia.org/wiki/Ontology_(information_science)
  • 9. ExptML Ontology
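The substance relationships named on slide 8 can be illustrated with a minimal Python sketch. It assumes that hasSubstance and isSubstanceOf are inverses of each other, which the slides do not state, and it uses plain tuples rather than the RDF store in Fedora-Commons. The identifiers exptml:sol1, exptml:sub1 and exptml:sub2 are hypothetical.

```python
from typing import Iterable, List, NamedTuple, Set


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Assumption: the two predicates from the slide are inverses of each other.
INVERSE = {"hasSubstance": "isSubstanceOf", "isSubstanceOf": "hasSubstance"}


def with_inverses(triples: Iterable[Triple]) -> Set[Triple]:
    """Return the given triples plus the inverse of each invertible one."""
    result = set(triples)
    for t in list(result):
        inverse = INVERSE.get(t.predicate)
        if inverse is not None:
            result.add(Triple(t.obj, inverse, t.subject))
    return result


def objects(triples: Iterable[Triple], subject: str, predicate: str) -> List[str]:
    """List every object linked to `subject` by `predicate`, sorted."""
    return sorted(
        t.obj for t in triples if t.subject == subject and t.predicate == predicate
    )


# Hypothetical identifiers: one solution that contains two substances.
graph = with_inverses(
    [
        Triple("exptml:sol1", "hasSubstance", "exptml:sub1"),
        Triple("exptml:sol1", "hasSubstance", "exptml:sub2"),
    ]
)

print(objects(graph, "exptml:sub1", "isSubstanceOf"))  # ['exptml:sol1']
```

Storing both directions makes it possible to ask "which solutions contain this substance?" as directly as "which substances are in this solution?".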
  • 10. Fedora Commons. Digital repository software for creating and managing online digital libraries. Stores the ExptML files. Stores any other files (PDFs, images, Word documents, etc.). Stores relationships as RDF. Version control. Checksumming. Built-in search of content and relationships.
  • 11. File Storage. Fedora-Commons treats each ExptML file as an object. In the definition of a Fedora object, the file is just one stream of many. By default each object also has a “DC” stream of metadata and a “RELS-EXT” stream of relationships. Each Fedora object can have any number of additional streams for paper PDFs, product/sample pictures, original file formats (if a conversion has been done), video, audio, anything. You can export individual streams or the whole Fedora object with streams binary encoded (for sharing/archiving).
  • 12. File Storage
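The object-and-streams model described on slide 11 can be sketched as a small data structure. This is only an illustration of the idea, not the Fedora-Commons API: the names RepositoryObject, Datastream, add_stream and export are hypothetical, the MIME types are assumed, and SHA-256 with base64 encoding is an assumed stand-in for the checksumming and binary encoding the slides mention.

```python
import base64
import hashlib
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Datastream:
    dsid: str
    mime_type: str
    content: bytes

    @property
    def checksum(self) -> str:
        # Assumption: SHA-256 stands in for the repository's checksum.
        return hashlib.sha256(self.content).hexdigest()


@dataclass
class RepositoryObject:
    pid: str
    streams: Dict[str, Datastream] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Every object starts with a metadata and a relationship stream.
        for dsid, mime in (("DC", "text/xml"), ("RELS-EXT", "application/rdf+xml")):
            self.streams.setdefault(dsid, Datastream(dsid, mime, b""))

    def add_stream(self, dsid: str, mime_type: str, content: bytes) -> Datastream:
        """Attach (or overwrite) one stream on this object."""
        self.streams[dsid] = Datastream(dsid, mime_type, content)
        return self.streams[dsid]

    def export(self) -> dict:
        """Export the whole object with stream contents base64 encoded."""
        return {
            "pid": self.pid,
            "streams": {
                dsid: {
                    "mime_type": ds.mime_type,
                    "checksum": ds.checksum,
                    "content": base64.b64encode(ds.content).decode("ascii"),
                }
                for dsid, ds in self.streams.items()
            },
        }


# Hypothetical usage: an ExptML file plus a sample picture on one object.
obj = RepositoryObject("exptml:chm1")
obj.add_stream("EXPTML", "text/xml", b"<chemical/>")
obj.add_stream("PHOTO", "image/png", b"\x89PNG...")
exported = obj.export()
```

Keeping the checksum next to each stream means an exported copy can be verified after it has been shared or archived.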
  • 13. Eureka Interface. So, finally, to the Eureka Research Workbench! Web interface written in PHP using the CakePHP framework. Communicates with the Fedora-Commons API to create, retrieve, update and delete (CRUD) ExptML and other files. Representational State Transfer (REST) format for URLs (e.g. http://web.server/chemicals/view/exptml:chm1). Allows for searching of all files in Fedora. Can also search based on relationships. Can extract data out of XML files. Can gather data from other websites (via API controller) and add it to ExptML files.
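The example URL on slide 13 follows the controller/action/parameter layout that CakePHP uses by default. A short Python sketch of how such a URL breaks down is given here for illustration. The function name parse_route is hypothetical, and this is not how Eureka (which is written in PHP) does its routing.

```python
from urllib.parse import unquote, urlsplit


def parse_route(url: str) -> dict:
    """Split a /controller/action/param... URL into its parts."""
    path = urlsplit(url).path
    segments = [unquote(s) for s in path.split("/") if s]
    if len(segments) < 2:
        raise ValueError(f"expected /controller/action[/params], got {path!r}")
    controller, action, *params = segments
    return {"controller": controller, "action": action, "params": params}


route = parse_route("http://web.server/chemicals/view/exptml:chm1")
print(route)
# {'controller': 'chemicals', 'action': 'view', 'params': ['exptml:chm1']}
```

Read this way, the URL asks the "chemicals" controller to "view" the object whose identifier is exptml:chm1.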
  • 14. Typical things we record in our notebook. Eureka Website – Notebook
  • 15. Conclusion. Eureka uses ExptML for representing science data. Reliable storage system for ExptML files (Fedora). Method for storage of relationships (RDF in Fedora). Web application to create ExptML files (Eureka). TODO: provide web functionality to process data; provide a mechanism for sharing of data (authenticated); integrate into the RDA model for sharing research data; integrate with many other websites, e.g. ChemSpider; support enlItemManifest and future RDA specifications.
  • 16. References. Eureka – http://sourceforge.net/projects/eureka; Fedora-Commons – http://fedora-commons.org; XML – http://www.w3.org/standards/xml; ExptML – http://exptml.sourceforge.net/; JSON – http://www.json.org/; UnitsML – http://unitsml.nist.gov/; RDF – http://www.w3.org/RDF/; CIR – http://cactus.nci.nih.gov/chemical/structure; RDA – http://rd-alliance.org