Curation-Friendly Tools for the Scientific Researcher


Presentation for the Online Northwest Conference in Corvallis, Oregon, February 10, 2012.

Highlights electronic lab notebooks (ELNs) and OMERO (from the Open Microscopy Environment) as two tools that help researchers better manage their research data.

  1. 1. Brian Westra University of
  2. 2. Data services needs assessment: 2009-2010. Interviewed 25 faculty:
      o Biology
      o Center for Advanced Materials Characterization at Oregon
      o Chemistry
      o Computer & Information Science
      o Geological Sciences
      o Human Physiology
      o Institute for a Sustainable Environment
      o Museum of Natural and Cultural History
      o Physics
      o Psychology
  3. 3.
      o Connecting data sources to data viewing and usage
      o Data organization
      o Metadata/annotation of files
      o Recording workflow, procedures, provenance
      Preservation, archiving, and publishing data were farther down the list.
  4. 4.
      o Clearly articulated need and opportunity; also a tie-in to data management plan implementations
      o Logical extension of the role for libraries beyond traditional services
      o Support for e-Science is a goal
      o Working in the data lifecycle/ecosystem is more robust than 'just' archiving/preservation
  5. 5. Maintaining, preserving, and adding value to digital research data throughout its lifecycle.
  6. 6.
      o File management tools: e.g., SharePoint
      o Best practices: naming conventions, version control software
      o Are there other solutions or services?
  7. 7. Going beyond file management systems to embedded, more holistic tools/systems:
      o Electronic Lab Notebooks
      o Content/format-specific data management software
  8. 8.
      o "…how a laboratory tracks and manages its information resources, particularly the data that represents the laboratory's product." (Avery, McGee, & Falk, 2000)
      o "a data and sample management system that is designed to improve the management of laboratory workflow" ("Clinical LIMS," 2011)
      o Most basic function: sample handling and reporting.
  9. 9. Data (create, store, share, organize, analyze) + information (notes)
      o May include: sample handling, storeroom inventory, signatures, collaboration, protocols and SOPs, embedded workflows, data analysis and visualization
      o LIMS and ELN functions and features often overlap
  10. 10. Many of them! UWisconsin-Madison RFI responses included these vendors:
      o Accelrys
      o Agilent
      o Amphora
      o Axiope
      o Contur
      o IDBS
      o Kinematik
      o Labtrack
      o Notebookmaker
      o Rescentris
      o Waters
  11. 11. Continuously changing field of vendors and products
      o Nature article
      o Other options: open source, or a mix of basic tools, often used in open science
  12. 12. Some UO considerations:
      o Academic audience (vs. FDA compliance)
      o Cost: software, hardware, sys-admin, training
      o Interface and ease of use
      o Account management
      o Platform
      o Research domain integration*
      o Metadata support*
      o Data file management*
      * curation characteristics
  13. 13.
      o Research domain
          o Workflow integration with analytical tools, methods
          o Data capture from typical hardware/sources
          o Ontologies
      o Metadata
          o Capture/extraction
          o Representation, standards
          o Export with files
      o Data file management
          o File format standards, transformations
          o Export options
          o Metadata
          o Provenance, version control
          o Archiving raw and derivatives
  14. 14. Wisconsin-Madison RFI
      o Some highlights from an excellent list of considerations
      o Good process
      o Plan to field test with 60 participants
  15. 15. What might be your "make or break" issues? How would you assign weights or ranking to the metrics?
      1. Costs
      2. Platform
      3. Product lock-in
      4. etc.
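The weighting question on slide 15 can be sketched as a small weighted-scoring matrix. This is only an illustration, not the method used at UO or UW-Madison: the product names, the 1-5 ratings, the weights, and the minimum thresholds below are all hypothetical, and the "make or break" rule is modeled as a simple floor that removes a product before scoring.

```python
"""Weighted scoring sketch for comparing ELN candidates (all data hypothetical)."""

# Hypothetical weights: a larger number means the metric matters more to the lab.
WEIGHTS = {"cost": 3, "platform": 2, "lock_in": 2}

# Hypothetical "make or break" floors: a product rated below a floor is dropped.
MINIMUMS = {"lock_in": 3}

# Hypothetical ratings on a 1-5 scale where 5 is best (5 for cost = cheapest).
PRODUCTS = {
    "ELN A": {"cost": 4, "platform": 3, "lock_in": 2},
    "ELN B": {"cost": 3, "platform": 5, "lock_in": 4},
    "ELN C": {"cost": 5, "platform": 2, "lock_in": 3},
}


def weighted_score(ratings, weights):
    """Return the weighted mean of the ratings for the weighted metrics."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[m] * ratings[m] for m in weights) / total_weight


def passes_minimums(ratings, minimums):
    """Return True if every make-or-break metric meets its floor."""
    return all(ratings[m] >= floor for m, floor in minimums.items())


def rank_products(products, weights, minimums):
    """Drop products that fail a floor, then sort the rest by weighted score."""
    eligible = {n: r for n, r in products.items() if passes_minimums(r, minimums)}
    scored = [(n, weighted_score(r, weights)) for n, r in eligible.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_products(PRODUCTS, WEIGHTS, MINIMUMS):
        print(f"{name}: {score:.2f}")
```

With these made-up numbers, "ELN A" is excluded by the lock-in floor regardless of its other ratings, and the remaining two products are ordered by weighted mean. Treating make-or-break issues as floors rather than as very large weights keeps a strong rating elsewhere from masking a disqualifying weakness.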
  16. 16. 'Ground truth' the metrics and values/comparators: the satellite or high-altitude view (pre-pilot) might not conform to what is found on the ground (during the pilot).
  17. 17. Have realistic team workload and timeline expectations. It's progress! It may be difficult to apply measures of curation capacity to an ELN:
      o Archiving and preservation capacity
      o Exportable relational (semantic) representation
      o Publication of data
  18. 18. It may be more realistic to ask:
      o Will this help you (the PI) find and understand the data and notes this week/next year/after the student is gone?
      o Can this improve your ability to do data management (and write a better plan for the next grant proposal)?
      o Is it simple enough that it will become part of the routine?
      i.e., folklore: info everyone knows but no one records
  19. 19. Example: publish direct to ChemSpider
      o ChemSpider record
      o ELN data exchange project: Dial-a-molecule
  20. 20.
      o A compelling reason for faculty to participate
      o Collaboration and coordination with stakeholders (Office of Research, IT, Libraries, research faculty, Tech Transfer)
      o Champion(s): these are usually not easy or inexpensive to implement, in the lab or with limited budgets
  21. 21. What is the scope of a "pilot case"?
      o Duration
      o Number of participants
      o Hardware capacity
      o Level of training and support
      o Evaluation criteria and roles
      o Exit strategy, and dealing with success
      Who's going to pay for this (right now)? Might anticipate who is going to pay for this (if it works well and goes to production).
  22. 22. "Data you enter in the ELN software will be stored in a secure location, however; at the end of the pilot period, the data will be removed and we cannot guarantee that it can be recovered fully from the ELN. Therefore, we very strongly encourage you to keep an additional copy of all data and notebook entries in electronic and/or hard copy format during the pilot as a backup measure and as a means of keeping a complete and continuous record of your work during the pilot period."
  23. 23. Many biology labs produce a lot of still images and video. Cresko lab - UO
  24. 24. Open Microscopy Environment (OME)-developed system for image file management
  25. 25. Embeds/supports curation:
      o Uses a metadata standard for description (OME-XML)
      o Employs file format standards (import to TIFF)
      o Can archive raw and derivative files
      o Provides intuitive organizational schema
      o Annotation and description support on multiple levels
      o Export of files with metadata
  26. 26. video
  27. 27.
      o It's open source: what is the level of support/installation base? Longevity/stability?
      o How well does it fit into the workflow of the lab?
      o Can it support the proprietary formats generated in the labs?
      o What are the IT/systems requirements?
  28. 28.
      o Finding a host and participants
      o Establishing realistic expectations
          o Host obligations
          o Project scope
  29. 29. DCXL: Digital Curation for Excel. Discussion: what other options are you exploring?
  30. 30.
      o Avery, G., McGee, C., & Falk, S. (2000). Product Review: Implementing LIMS: A "how-to" guide. Analytical Chemistry, 72(1), 57 A-62 A. American Chemical Society. doi:10.1021/ac0027082
      o CIO Office, U. of W.-M. (n.d.). Charter 6.7: eLab Notebooks | CIO Office | UW-Madison. Retrieved February 9, 2012.
      o Clinical LIMS. (2011).
      o […], J. (2012). Going paperless: The digital lab. Nature, 481(7382), 430-1. doi:10.1038/481430a
      o PerkinElmer. (n.d.). PerkinElmer Informatics. Retrieved February 9, 2012.
      o […] (n.d.). Rescentris | CERF Software. Retrieved February 9, 2012.
      o University of Dundee & Open Microscopy Environment. (n.d.). About OMERO - OME. Retrieved February 9, 2012.
      o University of Wisconsin-Madison. (2012). Informed Consent for Electronic Lab Notebook Pilot | Technology Solutions for Teaching and Research. Retrieved February 9, 2012.
      o University of Wisconsin-Madison. (n.d.-a). Electronic Lab Notebooks | Technology Solutions for Teaching and Research. Retrieved February 9, 2012.
      o University of Wisconsin-Madison. (n.d.-b). Electronic Lab Notebook Request for Information - University of Wisconsin-Madison. Retrieved February 9, 2012.