Brian Westra
 University of Oregon
bwestra@uoregon.edu
Data services needs assessment: 2009-2010

Interviewed 25 faculty:

Biology
Center for Advanced Materials Characterization at Oregon
Chemistry
Computer & Information Science
Geological Sciences
Human Physiology
Institute for a Sustainable Environment
Museum of Natural and Cultural History
Physics
Psychology
o   Connecting data sources to data viewing and
    usage
o   Data organization
o   Metadata/annotation of files
o   Recording workflow, procedures, provenance

Preservation, archiving and publishing data
were farther down the list
Clearly articulated need and opportunity;
also tie-in to data management plan
implementations

Logical extension of the role for libraries
beyond traditional services

Support for e-Science is a goal

Working in the data lifecycle/ecosystem is more
robust than 'just' archiving/preservation
Maintaining, preserving and adding value to
digital research data throughout its lifecycle.




http://www.dcc.ac.uk/digital-curation/what-digital-curation
File management tools: e.g., SharePoint

Best practices: naming conventions, version
control software

Are there other solutions or services?
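
As an illustration of the naming-convention practice mentioned above, here is a minimal sketch in Python. The pattern (project, ISO date, short description, zero-padded version) is one common convention, not one prescribed by these slides, and the function name `build_filename` and the sample values are hypothetical.

```python
import re
from datetime import date


def build_filename(project, description, when, version, ext):
    """Build a sortable, predictable file name.

    Assumed pattern (illustrative only):
    <project>_<YYYY-MM-DD>_<description>_v<NN>.<ext>
    """
    def slug(text):
        # Lowercase, then collapse every run of non-alphanumeric
        # characters into a single hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    extension = ext.lstrip(".").lower()
    return (
        f"{slug(project)}_{when.isoformat()}_"
        f"{slug(description)}_v{version:02d}.{extension}"
    )


print(build_filename("Stickleback", "Gel image", date(2012, 2, 9), 3, "TIF"))
# prints: stickleback_2012-02-09_gel-image_v03.tif
```

Putting the ISO date and a zero-padded version in the name keeps files in chronological order under a plain alphabetical sort, which helps in any tool that lists files by name.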
Going beyond file management systems to
embedded, more holistic tools/systems:

o   Electronic Lab Notebooks

o   Content/format-specific data management
    software
"…how a laboratory tracks and manages its
information resources, particularly the data
that represents the laboratory's product."
(Avery, McGee, & Falk, 2000)


"a data and sample management system that is
designed to improve the management of
laboratory workflow" ("Clinical LIMS," 2011)

Most basic function: sample handling and
reporting.
Data (create, store, share, organize, analyze)
+ information (notes)

May include: sample handling, storeroom inventory,
signatures, collaboration, protocols and SOPs,
embedded workflows, data analysis and
visualization

LIMS and ELN functions and features often overlap
Many of them! University of Wisconsin-Madison RFI
responses included these vendors:

 o Accelrys
 o Agilent
 o Amphora
 o Axiope
 o Contur
 o IDBS
 o Kinematik
 o Labtrack
 o Notebookmaker
 o Rescentris
 o Waters
Continuously changing field of vendors and
products

 o Nature article (Giles, 2012)

 o Other options: open source, or a mix of basic tools,
   often used in open science
Some UO considerations:

o   Academic audience (vs. FDA compliance)
o   Cost – S/W, hardware, sys-admin, training
o   Interface and ease of use
o   Account management
o   Platform
o   Research domain integration*
o   Metadata support*
o   Data file management*

*curation characteristics
o   Research domain
    o Workflow integration with analytical tools, methods
    o Data capture from typical hardware/sources
    o Ontologies

o   Metadata
    o Capture/extraction
    o Representation, standards
    o Export with files

o   Data file management
    o   File format standards, transformations
    o   Export options
    o   Metadata
    o   Provenance, version control
    o   Archiving raw and derivatives
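
To make the "export with files" and provenance items above concrete, here is a minimal sketch that writes a JSON "sidecar" record next to a data file, with a SHA-256 checksum so raw files and their derivatives can be verified later. The function name `write_sidecar`, the sidecar naming scheme and the metadata fields are assumptions for illustration; they do not describe how any particular ELN or LIMS product works.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def write_sidecar(data_file, metadata):
    """Write <data_file>.json holding descriptive metadata plus a checksum.

    The field names are hypothetical; a real project would follow a
    community metadata standard where one exists.
    """
    data_file = Path(data_file)
    record = {
        "file": data_file.name,
        "size_bytes": data_file.stat().st_size,
        "sha256": hashlib.sha256(data_file.read_bytes()).hexdigest(),
        **metadata,
    }
    sidecar = data_file.with_name(data_file.name + ".json")
    sidecar.write_text(json.dumps(record, indent=2, sort_keys=True))
    return sidecar


# Usage with a throwaway file standing in for an instrument output.
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "scan_001.tif"
    raw.write_bytes(b"example bytes")
    sidecar = write_sidecar(
        raw,
        {"instrument": "hypothetical microscope", "derived_from": None},
    )
    record = json.loads(sidecar.read_text())

print(record["file"], record["size_bytes"])
```

Because the metadata travels as a plain file beside the data, it survives an export from whatever system produced it, which is the property the checklist above asks about.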
Wisconsin-Madison RFI

o   Some highlights from an excellent list of
    considerations

o   Good process

o   Plan to field test with 60 participants
What might be your "make or break" issues?

How would you assign weights or ranking to
the metrics?
1. Costs
2. Platform
3. Product lock-in
4. etc.
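
One way to approach the weighting question above is a simple weighted scoring matrix with a "make or break" filter applied first. The sketch below is illustrative only: the weights, scores, product names and the `export_with_metadata` feature are made-up placeholders, not results from any evaluation.

```python
def weighted_score(weights, scores):
    """Weighted average of per-metric scores (higher is better)."""
    total_weight = sum(weights.values())
    return sum(weights[m] * scores[m] for m in weights) / total_weight


def rank_products(weights, products, must_have=()):
    """Drop products missing any make-or-break feature, then rank the rest."""
    eligible = {
        name: info
        for name, info in products.items()
        if all(info["features"].get(feature, False) for feature in must_have)
    }
    return sorted(
        eligible,
        key=lambda name: weighted_score(weights, eligible[name]["scores"]),
        reverse=True,
    )


# Hypothetical weights (1-5) and scores (1-5); replace with your own.
weights = {"cost": 5, "platform": 3, "lock_in": 4}
products = {
    "Product A": {
        "scores": {"cost": 2, "platform": 5, "lock_in": 3},
        "features": {"export_with_metadata": True},
    },
    "Product B": {
        "scores": {"cost": 4, "platform": 3, "lock_in": 4},
        "features": {"export_with_metadata": True},
    },
    "Product C": {
        "scores": {"cost": 5, "platform": 5, "lock_in": 5},
        "features": {"export_with_metadata": False},
    },
}

ranking = rank_products(weights, products, must_have=["export_with_metadata"])
print(ranking)  # prints: ['Product B', 'Product A']
```

Product C scores highest on every metric but is excluded because it fails the make-or-break requirement, which shows why such issues are better treated as filters than as heavily weighted metrics.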
'Ground truth' the
metrics and
values/comparators

The satellite or high-altitude view
(pre-pilot) might not
match conditions on the ground
(during the pilot)

http://www.seawead.org/index.php?option=com_content&view=article&id=29:ground-truthing&catid=9&Itemid=9
Have realistic team work load and timeline
expectations

It's progress! It may be difficult to apply
measures of curation capacity to an ELN

 o Archiving and preservation capacity
 o Exportable relational (semantic) representation
 o Publication of data
It may be more realistic to ask:

o   Will this help you (the PI) find and understand the
    data and notes this week/ next year/after the
    student is gone?

o   Can this improve your ability to do data
    management (and write a better plan for the next
    grant proposal)?

o   Is it simple enough that it will become part of the
    routine?
    i.e., folklore: info everyone knows but no one
    records
Example: publish direct to ChemSpider

ChemSpider record

ELN data exchange project: Dial-a-molecule
A compelling reason for faculty to participate

Collaboration and coordination with
stakeholders (Office of Research, IT,
Libraries, research faculty, Tech Transfer)

Champion(s) – these are usually not easy or
inexpensive to implement, in the lab or with
limited budgets
What is the scope of a "pilot case"?
o Duration
o Number of participants
o Hardware capacity
o Level of training and support
o Evaluation criteria and roles
o Exit strategy – and dealing with success


Who's going to pay for this (right now)?

Might anticipate who is going to pay for this (if it
works well and goes to production)
"Data you enter in the ELN software will be stored in a secure
location, however; at the end of the pilot period, the data will
be removed and we cannot guarantee that it can be recovered
fully from the ELN. Therefore, we very strongly encourage you
to keep an additional copy of all data and notebook entries in
electronic and/or hard copy format during the pilot as a backup
measure and as a means of keeping a complete and continuous
record of your work during the pilot period."


https://academictech.doit.wisc.edu/informed-consent-electronic-lab-notebook-pilot
Many biology labs produce a lot of still images
and video




                 Cresko lab - UO
Open Microscopy Environment (OME)-developed
system for image file management
Embeds/supports curation:

o   Uses a metadata standard for description
    (OME-XML)
o   Employs file format standards (import to TIFF)
o   Can archive raw and derivative files
o   Provides intuitive organizational schema
o   Annotation and description support on multiple
    levels
o   Export of files with metadata
video
It's open source – what is the level of
support/installation base? Longevity/stability?

How well does it fit into the workflow of the lab?

Can it support the proprietary formats generated
in the labs?

What are the IT/systems requirements?
Finding a host and participants

Establishing realistic expectations
o Host obligations
o Project scope
DCXL: Digital Curation for Excel

Discussion: what other options are you
exploring?
Avery, G., McGee, C., & Falk, S. (2000). Product Review: Implementing LIMS: A "how-to" guide. Analytical
Chemistry, 72(1), 57A-62A. American Chemical Society. doi:10.1021/ac0027082

CIO Office, U. of W.-M. (n.d.). Charter 6.7: eLab Notebooks | CIO Office | UW-Madison. Retrieved February 9, 2012, from
http://www.cio.wisc.edu/plan-docs-Charter6-7.aspx

Clinical LIMS. (2011). Retrieved from
http://www.scientificcomputing.com/product-IN-Clinical-LIMS-072811.aspx?terms=LIMS

Giles, J. (2012). Going paperless: The digital lab. Nature, 481(7382), 430-1. doi:10.1038/481430a

PerkinElmer. (n.d.). PerkinElmer Informatics. Retrieved February 9, 2012, from http://www.cambridgesoft.com/?l=en

Rescentris. (n.d.). Rescentris | CERF Software. Retrieved February 9, 2012, from http://rescentris.com/cerf-software/

University of Dundee & Open Microscopy Environment. (n.d.). About OMERO | OME. Retrieved February 9, 2012, from
http://www.openmicroscopy.org/site/products/omero

University of Wisconsin-Madison. (2012). Informed Consent for Electronic Lab Notebook Pilot | Technology Solutions for
Teaching and Research. Retrieved February 9, 2012, from
https://academictech.doit.wisc.edu/informed-consent-electronic-lab-notebook-pilot

University of Wisconsin-Madison. (n.d.-a). Electronic Lab Notebooks | Technology Solutions for Teaching and Research.
Retrieved February 9, 2012, from http://academictech.doit.wisc.edu/ideas/electronic-lab-notebooks

University of Wisconsin-Madison. (n.d.-b). Electronic Lab Notebook Request for Information - University of
Wisconsin-Madison. Retrieved February 9, 2012, from https://academictech.doit.wisc.edu/files/115349rfi.pdf

More Related Content

What's hot

Workshop finding and accessing data - fiona nadia charlotte - cambridge apr...
Workshop   finding and accessing data - fiona nadia charlotte - cambridge apr...Workshop   finding and accessing data - fiona nadia charlotte - cambridge apr...
Workshop finding and accessing data - fiona nadia charlotte - cambridge apr...Fiona Nielsen
 
Reproducibility of model-based results: standards, infrastructure, and recogn...
Reproducibility of model-based results: standards, infrastructure, and recogn...Reproducibility of model-based results: standards, infrastructure, and recogn...
Reproducibility of model-based results: standards, infrastructure, and recogn...FAIRDOM
 
Why i left my job in genomics R&D - Lunteren - april 18 - 2016
Why i left my job in genomics R&D - Lunteren - april 18 - 2016Why i left my job in genomics R&D - Lunteren - april 18 - 2016
Why i left my job in genomics R&D - Lunteren - april 18 - 2016Fiona Nielsen
 
What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)Carole Goble
 
Improving the Management of Computational Models -- Invited talk at the EBI
Improving the Management of Computational Models -- Invited talk at the EBIImproving the Management of Computational Models -- Invited talk at the EBI
Improving the Management of Computational Models -- Invited talk at the EBIMartin Scharm
 
Data challenges for researchers
Data challenges for researchersData challenges for researchers
Data challenges for researchersMichael Hoffman
 
Being Reproducible: SSBSS Summer School 2017
Being Reproducible: SSBSS Summer School 2017Being Reproducible: SSBSS Summer School 2017
Being Reproducible: SSBSS Summer School 2017Carole Goble
 
Workshop finding and accessing data - fiona - lunteren april 18 2016
Workshop   finding and accessing data - fiona - lunteren april 18 2016Workshop   finding and accessing data - fiona - lunteren april 18 2016
Workshop finding and accessing data - fiona - lunteren april 18 2016Fiona Nielsen
 
Semantic enrichment and similarity approximation for biomedical sequence images
Semantic enrichment and similarity approximation for biomedical sequence imagesSemantic enrichment and similarity approximation for biomedical sequence images
Semantic enrichment and similarity approximation for biomedical sequence imagesSyed Ahmad Chan Bukhari, PhD
 
Using electronic laboratory notebooks in the academic life sciences: a group ...
Using electronic laboratory notebooks in the academic life sciences: a group ...Using electronic laboratory notebooks in the academic life sciences: a group ...
Using electronic laboratory notebooks in the academic life sciences: a group ...SC CTSI at USC and CHLA
 
Ethics reproducibility and data stewardship
Ethics reproducibility and data stewardshipEthics reproducibility and data stewardship
Ethics reproducibility and data stewardshipRussell Jarvis
 
Data-Science-Specialization
Data-Science-SpecializationData-Science-Specialization
Data-Science-SpecializationAsmi Ariv
 
Data Science Coursera 8N8VM4AGNDL7
Data Science Coursera 8N8VM4AGNDL7Data Science Coursera 8N8VM4AGNDL7
Data Science Coursera 8N8VM4AGNDL7Mei Chiao Lin
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer SchoolCarole Goble
 
Embedded with the Scientists: The UCLA Experience
Embedded with the Scientists: The UCLA ExperienceEmbedded with the Scientists: The UCLA Experience
Embedded with the Scientists: The UCLA Experiencelmfederer
 
What's mine is yours (and vice versa) Data sharing in vibrational spectroscopy
What's mine is yours (and vice versa) Data sharing in vibrational spectroscopyWhat's mine is yours (and vice versa) Data sharing in vibrational spectroscopy
What's mine is yours (and vice versa) Data sharing in vibrational spectroscopyAlex Henderson
 

What's hot (19)

Workshop finding and accessing data - fiona nadia charlotte - cambridge apr...
Workshop   finding and accessing data - fiona nadia charlotte - cambridge apr...Workshop   finding and accessing data - fiona nadia charlotte - cambridge apr...
Workshop finding and accessing data - fiona nadia charlotte - cambridge apr...
 
Reproducibility of model-based results: standards, infrastructure, and recogn...
Reproducibility of model-based results: standards, infrastructure, and recogn...Reproducibility of model-based results: standards, infrastructure, and recogn...
Reproducibility of model-based results: standards, infrastructure, and recogn...
 
Why i left my job in genomics R&D - Lunteren - april 18 - 2016
Why i left my job in genomics R&D - Lunteren - april 18 - 2016Why i left my job in genomics R&D - Lunteren - april 18 - 2016
Why i left my job in genomics R&D - Lunteren - april 18 - 2016
 
What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)
 
Improving the Management of Computational Models -- Invited talk at the EBI
Improving the Management of Computational Models -- Invited talk at the EBIImproving the Management of Computational Models -- Invited talk at the EBI
Improving the Management of Computational Models -- Invited talk at the EBI
 
Data challenges for researchers
Data challenges for researchersData challenges for researchers
Data challenges for researchers
 
Being Reproducible: SSBSS Summer School 2017
Being Reproducible: SSBSS Summer School 2017Being Reproducible: SSBSS Summer School 2017
Being Reproducible: SSBSS Summer School 2017
 
Workshop finding and accessing data - fiona - lunteren april 18 2016
Workshop   finding and accessing data - fiona - lunteren april 18 2016Workshop   finding and accessing data - fiona - lunteren april 18 2016
Workshop finding and accessing data - fiona - lunteren april 18 2016
 
Semantic enrichment and similarity approximation for biomedical sequence images
Semantic enrichment and similarity approximation for biomedical sequence imagesSemantic enrichment and similarity approximation for biomedical sequence images
Semantic enrichment and similarity approximation for biomedical sequence images
 
Safe Assign
Safe AssignSafe Assign
Safe Assign
 
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
 
Using electronic laboratory notebooks in the academic life sciences: a group ...
Using electronic laboratory notebooks in the academic life sciences: a group ...Using electronic laboratory notebooks in the academic life sciences: a group ...
Using electronic laboratory notebooks in the academic life sciences: a group ...
 
Ethics reproducibility and data stewardship
Ethics reproducibility and data stewardshipEthics reproducibility and data stewardship
Ethics reproducibility and data stewardship
 
Data-Science-Specialization
Data-Science-SpecializationData-Science-Specialization
Data-Science-Specialization
 
Data Science Coursera 8N8VM4AGNDL7
Data Science Coursera 8N8VM4AGNDL7Data Science Coursera 8N8VM4AGNDL7
Data Science Coursera 8N8VM4AGNDL7
 
Open Helix
Open HelixOpen Helix
Open Helix
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
 
Embedded with the Scientists: The UCLA Experience
Embedded with the Scientists: The UCLA ExperienceEmbedded with the Scientists: The UCLA Experience
Embedded with the Scientists: The UCLA Experience
 
What's mine is yours (and vice versa) Data sharing in vibrational spectroscopy
What's mine is yours (and vice versa) Data sharing in vibrational spectroscopyWhat's mine is yours (and vice versa) Data sharing in vibrational spectroscopy
What's mine is yours (and vice versa) Data sharing in vibrational spectroscopy
 

Viewers also liked

Web-based Tools for Today's Researcher | October 2014
Web-based Tools for Today's Researcher | October 2014Web-based Tools for Today's Researcher | October 2014
Web-based Tools for Today's Researcher | October 2014Mike Pascoe
 
Twitter Search Architecture
Twitter Search Architecture Twitter Search Architecture
Twitter Search Architecture Ramez Al-Fayez
 
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Lucidworks
 
Learn BEM: CSS Naming Convention
Learn BEM: CSS Naming ConventionLearn BEM: CSS Naming Convention
Learn BEM: CSS Naming ConventionIn a Rocket
 
How to Build a Dynamic Social Media Plan
How to Build a Dynamic Social Media PlanHow to Build a Dynamic Social Media Plan
How to Build a Dynamic Social Media PlanPost Planner
 
SEO: Getting Personal
SEO: Getting PersonalSEO: Getting Personal
SEO: Getting PersonalKirsty Hulse
 
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika AldabaLightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldabaux singapore
 

Viewers also liked (8)

Web-based Tools for Today's Researcher | October 2014
Web-based Tools for Today's Researcher | October 2014Web-based Tools for Today's Researcher | October 2014
Web-based Tools for Today's Researcher | October 2014
 
Twitter Search Architecture
Twitter Search Architecture Twitter Search Architecture
Twitter Search Architecture
 
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
 
Learn BEM: CSS Naming Convention
Learn BEM: CSS Naming ConventionLearn BEM: CSS Naming Convention
Learn BEM: CSS Naming Convention
 
How to Build a Dynamic Social Media Plan
How to Build a Dynamic Social Media PlanHow to Build a Dynamic Social Media Plan
How to Build a Dynamic Social Media Plan
 
SEO: Getting Personal
SEO: Getting PersonalSEO: Getting Personal
SEO: Getting Personal
 
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika AldabaLightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
 
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job? Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
 

Similar to Curation-Friendly Tools for the Scientific Researcher

UCL’s research IT management systems architecture review aligned with Open Sc...
UCL’s research IT management systems architecture review aligned with Open Sc...UCL’s research IT management systems architecture review aligned with Open Sc...
UCL’s research IT management systems architecture review aligned with Open Sc...Jisc
 
Revolutionizing Laboratory Instrument Data for the Pharmaceutical Industry:...
Revolutionizing Laboratory  Instrument Data for the  Pharmaceutical Industry:...Revolutionizing Laboratory  Instrument Data for the  Pharmaceutical Industry:...
Revolutionizing Laboratory Instrument Data for the Pharmaceutical Industry:...OSTHUS
 
FAIR BioData Management
FAIR BioData ManagementFAIR BioData Management
FAIR BioData ManagementUlrike Wittig
 
The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...Projeto RCAAP
 
Reproducible research: theory
Reproducible research: theoryReproducible research: theory
Reproducible research: theoryC. Tobin Magle
 
How Logilab ELN helps Organizations in Research Data Management
How Logilab ELN helps Organizations in Research Data ManagementHow Logilab ELN helps Organizations in Research Data Management
How Logilab ELN helps Organizations in Research Data ManagementAgaram Technologies
 
OAIS and It's Applicability for Libraries, Archives, and Digital Repositories...
OAIS and It's Applicability for Libraries, Archives, and Digital Repositories...OAIS and It's Applicability for Libraries, Archives, and Digital Repositories...
OAIS and It's Applicability for Libraries, Archives, and Digital Repositories...faflrt
 
UK Digital Curation Centre: enabling research data management at the coalface
UK Digital Curation Centre: enabling research data management at the coalfaceUK Digital Curation Centre: enabling research data management at the coalface
UK Digital Curation Centre: enabling research data management at the coalfaceLizLyon
 
Data Management Planning for researchers
Data Management Planning for researchersData Management Planning for researchers
Data Management Planning for researchersSarah Jones
 
Data management for TA's
Data management for TA'sData management for TA's
Data management for TA'saaroncollie
 
A Big Picture in Research Data Management
A Big Picture in Research Data ManagementA Big Picture in Research Data Management
A Big Picture in Research Data ManagementCarole Goble
 
Workflows, provenance and reporting: a lifecycle perspective at BIH 2013, Rome
Workflows, provenance and reporting: a lifecycle perspective at BIH 2013, RomeWorkflows, provenance and reporting: a lifecycle perspective at BIH 2013, Rome
Workflows, provenance and reporting: a lifecycle perspective at BIH 2013, RomeCarole Goble
 
Curation and Preservation of Crystallography Data
Curation and Preservation of Crystallography DataCuration and Preservation of Crystallography Data
Curation and Preservation of Crystallography DataManjulaPatel
 
Research Data Management Fundamentals for MSU Engineering Students
Research Data Management Fundamentals for MSU Engineering StudentsResearch Data Management Fundamentals for MSU Engineering Students
Research Data Management Fundamentals for MSU Engineering StudentsAaron Collie
 
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumElsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumAnita de Waard
 

Similar to Curation-Friendly Tools for the Scientific Researcher (20)

UCL’s research IT management systems architecture review aligned with Open Sc...
UCL’s research IT management systems architecture review aligned with Open Sc...UCL’s research IT management systems architecture review aligned with Open Sc...
UCL’s research IT management systems architecture review aligned with Open Sc...
 
Revolutionizing Laboratory Instrument Data for the Pharmaceutical Industry:...
Revolutionizing Laboratory  Instrument Data for the  Pharmaceutical Industry:...Revolutionizing Laboratory  Instrument Data for the  Pharmaceutical Industry:...
Revolutionizing Laboratory Instrument Data for the Pharmaceutical Industry:...
 
FAIR BioData Management
FAIR BioData ManagementFAIR BioData Management
FAIR BioData Management
 
The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...
 
Reproducible research: theory
Reproducible research: theoryReproducible research: theory
Reproducible research: theory
 
How Logilab ELN helps Organizations in Research Data Management
How Logilab ELN helps Organizations in Research Data ManagementHow Logilab ELN helps Organizations in Research Data Management
How Logilab ELN helps Organizations in Research Data Management
 
OAIS and It's Applicability for Libraries, Archives, and Digital Repositories...
OAIS and It's Applicability for Libraries, Archives, and Digital Repositories...OAIS and It's Applicability for Libraries, Archives, and Digital Repositories...
OAIS and It's Applicability for Libraries, Archives, and Digital Repositories...
 
UK Digital Curation Centre: enabling research data management at the coalface
UK Digital Curation Centre: enabling research data management at the coalfaceUK Digital Curation Centre: enabling research data management at the coalface
UK Digital Curation Centre: enabling research data management at the coalface
 
Data Management Planning for researchers
Data Management Planning for researchersData Management Planning for researchers
Data Management Planning for researchers
 
Data management for TA's
Data management for TA'sData management for TA's
Data management for TA's
 
A Big Picture in Research Data Management
A Big Picture in Research Data ManagementA Big Picture in Research Data Management
A Big Picture in Research Data Management
 
Workflows, provenance and reporting: a lifecycle perspective at BIH 2013, Rome
Workflows, provenance and reporting: a lifecycle perspective at BIH 2013, RomeWorkflows, provenance and reporting: a lifecycle perspective at BIH 2013, Rome
Workflows, provenance and reporting: a lifecycle perspective at BIH 2013, Rome
 
Curation and Preservation of Crystallography Data
Curation and Preservation of Crystallography DataCuration and Preservation of Crystallography Data
Curation and Preservation of Crystallography Data
 
Research data life cycle
Research data life cycleResearch data life cycle
Research data life cycle
 
Pine education-platform
Pine education-platformPine education-platform
Pine education-platform
 
Digital Destiny
Digital DestinyDigital Destiny
Digital Destiny
 
Research Data Management Fundamentals for MSU Engineering Students
Research Data Management Fundamentals for MSU Engineering StudentsResearch Data Management Fundamentals for MSU Engineering Students
Research Data Management Fundamentals for MSU Engineering Students
 
FAIR: standards and services
FAIR: standards and servicesFAIR: standards and services
FAIR: standards and services
 
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumElsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
 
INCLUSION OF DATA ARCHIVES IN DATA MANAGEMENT PLAN
INCLUSION OF DATA ARCHIVES IN DATA MANAGEMENT PLANINCLUSION OF DATA ARCHIVES IN DATA MANAGEMENT PLAN
INCLUSION OF DATA ARCHIVES IN DATA MANAGEMENT PLAN
 

Recently uploaded

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 

Curation-Friendly Tools for the Scientific Researcher

  • 1. Brian Westra University of Oregon bwestra@uoregon.edu
  • 2. Data services needs assessment: 2009-2010. Interviewed 25 faculty: Biology; Center for Advanced Materials Characterization at Oregon; Chemistry; Computer & Information Science; Geological Sciences; Human Physiology; Institute for a Sustainable Environment; Museum of Natural and Cultural History; Physics; Psychology.
  • 3. Connecting data sources to data viewing and usage; data organization; metadata/annotation of files; recording workflow, procedures, provenance. Preservation, archiving and publishing data were farther down the list.
  • 4. Clearly articulated need and opportunity; also tie-in to data management plan implementations. Logical extension of the role for libraries beyond traditional services. Support for e-Science is a goal. Working in the data lifecycle/ecosystem is more robust than 'just' archiving/preservation.
  • 5. Maintaining, preserving and adding value to digital research data throughout its lifecycle. http://www.dcc.ac.uk/digital-curation/what-digital-curation
  • 6. File management tools: e.g., SharePoint. Best practices: naming conventions, version control software. Are there other solutions or services?
  • 7. Going beyond file management systems to embedded, more holistic tools/systems: Electronic Lab Notebooks; content/format-specific data management software.
  • 8. "…how a laboratory tracks and manages its information resources, particularly the data that represents the laboratory's product." (Avery, McGee, & Falk, 2000) "a data and sample management system that is designed to improve the management of laboratory workflow" ("Clinical LIMS," 2011) Most basic function: sample handling and reporting.
  • 9. Data (create, store, share, organize, analyze) + information (notes). May include: sample handling, storeroom inventory, signatures, collaboration, protocols and SOPs, embedded workflows, data analysis and visualization. LIMS and ELN functions and features often overlap.
  • 10. Many of them! UWisconsin-Madison RFI responses included these vendors: Accelrys, Agilent, Amphora, Axiope, Contur, IDBS, Kinematik, Labtrack, Notebookmaker, Rescentris, Waters.
  • 11. Continuously changing field of vendors and products: Nature article; other options: open source, or a mix of basic tools, often used in open science.
  • 12. Some UO considerations: academic audience (vs. FDA compliance); cost (S/W, hardware, sys-admin, training); interface and ease of use; account management; platform; research domain integration*; metadata support*; data file management*. (*curation characteristics)
  • 13. Research domain: workflow integration with analytical tools, methods; data capture from typical hardware/sources; ontologies. Metadata: capture/extraction; representation, standards; export with files. Data file management: file format standards, transformations; export options; metadata; provenance, version control; archiving raw and derivatives.
  • 14. Wisconsin-Madison RFI: some highlights from an excellent list of considerations; good process; plan to field test with 60 participants.
  • 15. What might be your "make or break" issues? How would you assign weights or ranking to the metrics? 1. Costs 2. Platform 3. Product lock-in 4. etc.
  • 16. 'Ground truth' the metrics and values/comparators. The satellite or high-altitude view (pre-pilot) might not match what is found on the ground (during the pilot). http://www.seawead.org/index.php?option=com_content&view=article&id=29:ground-truthing&catid=9&Itemid=9
  • 17. Have realistic team work load and timeline expectations. It's progress! It may be difficult to apply measures of curation capacity to an ELN: archiving and preservation capacity; exportable relational (semantic) representation; publication of data.
  • 18. It may be more realistic to ask: Will this help you (the PI) find and understand the data and notes this week/next year/after the student is gone? Can this improve your ability to do data management (and write a better plan for the next grant proposal)? Is it simple enough that it will become part of the routine? i.e., folklore: info everyone knows but no one records.
  • 19. Example: publish direct to ChemSpider. ChemSpider record. ELN data exchange project: Dial-a-molecule.
  • 20. A compelling reason for faculty to participate. Collaboration and coordination with stakeholders (Office of Research, IT, Libraries, research faculty, Tech Transfer). Champion(s) – these are usually not easy or inexpensive to implement, in the lab or with limited budgets.
  • 21. What is the scope of a "pilot case"? Duration; number of participants; hardware capacity; level of training and support; evaluation criteria and roles; exit strategy – and dealing with success. Who's going to pay for this (right now)? Might anticipate who is going to pay for this (if it works well and goes to production).
  • 22. "Data you enter in the ELN software will be stored in a secure location, however; at the end of the pilot period, the data will be removed and we cannot guarantee that it can be recovered fully from the ELN. Therefore, we very strongly encourage you to keep an additional copy of all data and notebook entries in electronic and/or hard copy format during the pilot as a backup measure and as a means of keeping a complete and continuous record of your work during the pilot period." https://academictech.doit.wisc.edu/informed-consent-electronic-lab-notebook-pilot
  • 23. Many biology labs produce a lot of still images and video. Cresko lab, UO.
  • 24. Open Microscopy Environment (OME)-developed system for image file management
  • 25. Embeds/supports curation: uses a metadata standard for description (OME XML); employs file format standards (import to TIFF); can archive raw and derivative files; provides intuitive organizational schema; annotation and description support on multiple levels; export of files with metadata.
  • 26. video
  • 27. It's open source – what is the level of support/installation base? Longevity/stability? How well does it fit into the workflow of the lab? Can it support the proprietary formats generated in the labs? What are the IT/systems requirements?
  • 28. Finding a host and participants. Establishing realistic expectations: host obligations; project scope.
  • 29. DCXL: Digital Curation for Excel. Discussion: what other options are you exploring?
  • 30.
  • 31. References
      Avery, G., McGee, C., & Falk, S. (2000). Product Review: Implementing LIMS: A "how-to" guide. Analytical Chemistry, 72(1), 57 A-62 A. American Chemical Society. doi:10.1021/ac0027082
      CIO Office, U. of W.-M. (n.d.). Charter 6.7: eLab Notebooks | CIO Office | UW-Madison. Retrieved February 9, 2012, from http://www.cio.wisc.edu/plan-docs-Charter6-7.aspx
      Clinical LIMS. (2011). Retrieved from http://www.scientificcomputing.com/product-IN-Clinical-LIMS-072811.aspx?terms=LIMS
      Giles, J. (2012). Going paperless: The digital lab. Nature, 481(7382), 430-1. doi:10.1038/481430a
      PerkinElmer. (n.d.). PerkinElmer Informatics. Retrieved February 9, 2012, from http://www.cambridgesoft.com/?l=en
      Rescentris. (n.d.). Rescentris | CERF Software. Retrieved February 9, 2012, from http://rescentris.com/cerf-software/
      University of Dundee & Open Microscopy Environment. (n.d.). About OMERO — OME. Retrieved February 9, 2012, from http://www.openmicroscopy.org/site/products/omero
      University of Wisconsin-Madison. (2012). Informed Consent for Electronic Lab Notebook Pilot | Technology Solutions for Teaching and Research. Retrieved February 9, 2012, from https://academictech.doit.wisc.edu/informed-consent-electronic-lab-notebook-pilot
      University of Wisconsin-Madison. (n.d.-a). Electronic Lab Notebooks | Technology Solutions for Teaching and Research. Retrieved February 9, 2012, from http://academictech.doit.wisc.edu/ideas/electronic-lab-notebooks
      University of Wisconsin-Madison. (n.d.-b). Electronic Lab Notebook Request for Information - University of Wisconsin-Madison. Retrieved February 9, 2012, from https://academictech.doit.wisc.edu/files/115349rfi.pdf
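
Slide 15 asks how one might weight the evaluation metrics and which issues are "make or break". One possible way to work through that question is a simple weighted scoring matrix, sketched below. Everything in it is an assumption made for illustration: the weights, the 1-5 scores, the product names, and the helper names `weighted_score` and `rank_products` are hypothetical and do not come from the slides or from the UW-Madison RFI.

```python
# Hypothetical weights for the three metrics named on slide 15.
# Larger weight = more important; the numbers are illustrative only.
weights = {"cost": 5, "platform": 3, "lock_in": 2}

# Hypothetical scores on a 1-5 scale (5 = best) for two made-up products.
scores = {
    "Product A": {"cost": 4, "platform": 3, "lock_in": 2},
    "Product B": {"cost": 3, "platform": 4, "lock_in": 5},
}


def weighted_score(product_scores, weights):
    """Return the weighted average of one product's metric scores."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[m] * product_scores[m] for m in weights)
    return weighted_sum / total_weight


def rank_products(scores, weights, deal_breakers=None):
    """Rank products by weighted score, best first.

    deal_breakers maps a metric to the minimum acceptable score; any
    product scoring below a minimum is dropped before ranking. This is
    one way to model a "make or break" issue.
    """
    deal_breakers = deal_breakers or {}
    ranked = []
    for name, product_scores in scores.items():
        fails = any(product_scores[m] < minimum
                    for m, minimum in deal_breakers.items())
        if not fails:
            ranked.append((name, weighted_score(product_scores, weights)))
    # Sort by score, highest first.
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_products(scores, weights):
        print(f"{name}: {score:.2f}")
    # Treat product lock-in as make-or-break: require a score of at least 3.
    print(rank_products(scores, weights, deal_breakers={"lock_in": 3}))
```

Separating the deal-breaker filter from the weighted average keeps a strong overall score from masking a failure on a single critical metric. As slide 16 suggests, any such weights would need to be checked against what the pilot actually shows.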