SlideShare a Scribd company logo
1 of 57
Paving the way to open and interoperable
research data service workflows
Progress from 3 perspectives
Angus Whyte, Digital Curation Centre
Rory Macneil, Research Space
Stuart Lewis, University of Edinburgh
Repository Fringe, Edinburgh, 2nd August 2016
Paving the way to open and interoperable
research data service workflows
Progress from 3 perspectives
Angus Whyte, Digital Curation Centre
Research data service models, new DCC guidance and some draft ‘design
principles’ for institutions integrating research data service workflows
Rory Macneill, Research Space
Integrating the RSpace ELN with University of Edinburgh’s DataShare and
Harvard’s Dataverse repositories
Stuart Lewis, University of Edinburgh
DataVault –Jisc Research Data Spring prototype for packaging data to be archived
Some working definitions
Research data service – “a means of delivering value to the producers and
users of digital objects by facilitating outcomes they want to achieve without
the ownership of specific costs or risks” (derived from ITIL definition of a service)
Sub-types -
Active data management “services used to create or transform digital
objects for the purposes of research”
Preservation: “services offering to ensure digital objects meet a defined
level of FAIRness - findability, accessibility, interoperability, and reusability -
for a designated community and period of time”
Publication: “services offering to enhance digital objects FAIRness by
reviewing their quality on specified criteria, or connecting them to
additional metadata”
Guidance: “services offering practical guidance on choosing or using the
above services”
Draft design principles for integration of research data
service workflows
1. Active data management services should use open
standards to express and expose the objects and
metadata they offer to downstream services,
including their access and reuse terms
Draft design principles for integration of research data
service workflows
2. Preservation and publication services should publish
policies stating what digital object types they accept,
for what communities, and on what terms and
conditions
Draft design principles for integration of research data
service workflows
3. Preservation and publication services should make
openly available sufficient metadata to enable reuse
of their outputs, including all terms and conditions
for third-party access and reuse
Draft design principles for integration of research data
service workflows
4. Active data management, preservation and
publication services should make sufficient detail of
their workflows available to support the
reproducibility of research that produced the digital
objects they act upon
Draft design principles for integration of research data
service workflows
5. Guidance services should support users of other
services to make an informed choice of downstream
service capabilities, informed by best practices for
reuse and reproducibility
Draft design principles for integration of research data
service workflows
6. Guidelines 1-5 should be implemented using
machine-actionable content
Guidance from Digital Curation Centre
Aims to effectively support organisations with-
• Providing effective research data services
• Promoting reusability of their research data
• and reproducibility of their research
Models that reflect and help shape reality
Higgins, S. (2008) The DCC Curation Lifecycle Model. Int. Journal Digital Curation 2008, Vol. 3, No. 1, pp. 134-140 doi:10.2218/ijdc.v3i1.48
Key element of
DCC guidance
e.g. Curation
Lifecycle Model
(2008)
DCC research data service model (2016)
RDM policy & strategy
Business plans &
sustainability
Training Advisory services
Data Management
Planning
Active data
management
Appraisal & risk
assessment
Preservation
Access &
publishing
Discovery
Ref. DCC How-to Guide on Evaluating Data Repository and Catalogue Platforms (forthcoming, 2016)
Data Management
Planning
Active data
management
Appraisal & risk
assessment
Preservation
Access &
publishing
Discovery
But real solutions are made up of many more parts
tools and services at researchers disposal are increasing
‘lifecycles’
are often
non-linear
Integration
use cases
do not
follow neat
sequences
One size does not fit all
What about Functional
Models? OAIS foundation for
Trustworthy Digital Repository
standards
Ingest
Data
management
AIP AIP
Preservation
PlanningContext Context
Description Description
RDM
SIP DIP
UsersProducer
Archival storage
Access
Administration
RDM platforms can be grafted
on, to relate OAIS functions
to context info
sources
Ingest
Data
management
AIP AIP
CRIS
Open access
Repository
Preservation
PlanningContext Context
Description Description
RDM
SIP DIP
UsersProducer
Data
Catalogue
Archival storage
Collection
Appraisal
Access
Data
Preservation
Active Data
Management
Data Mgmt
Planning
Data
Publication
Administration
But what best practice models apply ‘upstream’?
Q. How can institutions ensure researchers have informed choice of
trustworthy services before they need a repository
• “Whole lifecycle” service models are one solution
• But how should they be governed?
– Commons?
– Public-private partnerships?
– Commercial services?
But what best practice models apply ‘upstream’?
Q. How can institutions ensure researchers have informed choice of
trustworthy services before they need a repository
• “Whole lifecycle” service models are one solution
• But how should they be governed?
– Commercial services?
– Commons?
– Public-private partnerships?
Commercial services model
e.g. Elsevier
“On Wednesday 1 June, Elsevier acquired Hivebench to
help further streamline the workflow of researchers –
putting research data management at their fingertips.
The added value of the integration lies in linking
Hivebench with Elsevier’s existing Research Data
Management portfolio for products and services.
The research data that researchers have stored in the
Hivebench notebook are linked to the Mendeley Data
repository, which will be linked to Pure. This ….adds
instant value to the datasets because they become far
more suitable for reuse.”
Commons approach
e.g. Principles for Open Scholarly Infrastructures
“…What should a shared infrastructure look like?
Infrastructure at its best is invisible. We tend to only
notice it when it fails. If successful, it is stable and
sustainable. Above all, it is trusted and relied on by the
broad community it serves. Trust must run strongly
across each of the following areas: running the
infrastructure (governance), funding it (sustainability),
and preserving community ownership of it
(insurance).” (emphasis added)
Cameron Neylon, Science in the Open, 25 Feb 2015
Commons approach
e.g. Open Science Framework
But
workflows
in effect
governed
by a ‘mixed
economy’
of platforms
So what
principles
should
apply?
Mathew Spitzer An Open Science Framework for Solving Institutional Challenges: Supporting the Institutional
Research Method Research Data Access and Preservation Summit, 2016 Atlanta, GA May 4-7, 2016
Background - RDA Working Group on Data Publishing
Workflows
Reviewed 25 examples of repository data publishing workflows,
including some integrating with ‘downstream’ services e.g. data
journals for peer review
Austin, Claire C et al.. (2015). Key components of data publishing: Using current best practices to develop a reference model for data publishing. Zenodo.
http://dx.doi.org/10.5281/zenodo.34542
Drafted a reference model and made best practice
recommendations-
1. Start small, building modular, open source and shareable
components
2. Follow standards that facilitate interoperability and permit
extensions
3. Facilitate data citation, e.g. through use of digital object PIDs,
data/article linkages, researcher PIDs
4. Document roles, workflows and services
Background - RDA Working Group on Data Publishing
Workflows
• Follow up call (Dec 15) for examples of repositories
connecting with upstream research workflows e.g. to gather
metadata earlier
• Aiming to identify whether recommendations apply, and how
the intention to publish data is changing research practices.
• Collected 12 cases - mix of small concrete examples,
prototypes, conceptual models
Analysis of upstream workflows
Are the services components underpinning research
workflows loosely coupled?
– Modular design of workflow components? yes, plenty
evidence
– Standard vocabularies and protocols to describe
components some evidence
– Significant investment in building trust-based
relationships among participants some evidence
– Standardized ways of specifying capabilities and
performance requirements limited (e.g. WDS-DSA
Catalogue of Requirements)
* ‘The Joy of Flex’ Hagel and Seely-Brown, 2005 see e.g. http://www.cio.com.au/article/29148/joy_flex
So do we need more …
E.g. DCC How-to describe research data service workflows (forthcoming)
• Definitions – to describe services ?
• Design principles – to articulate best practice?
• Capability models - to articulate mutual expectations
of service owners?
• Case studies of repository workflow integration?
• Understanding of how tools are actually being
integrated, and effect on data publication practices
Draft design principles for integration of research data
service workflows
1. Active data management services should use open standards to express and
expose the objects and metadata they offer to downstream services, including
their access and reuse terms
2. Preservation and publication services should publish policies stating what digital
object types they accept, for what communities, and on what terms and
conditions
3. Preservation and publication services should make openly available sufficient
metadata to enable reuse of their outputs, including all terms and conditions for
third-party access and reuse
4. Active data management, preservation and publication services should make
sufficient detail of their workflows available to support the reproducibility of
research that produced the digital objects they act upon
5. Guidance services should support users of other services to make an informed
choice of downstream service capabilities, informed by best practices for reuse
and reproducibility
6. Guidelines 1-5 should be implemented using machine-actionable content
Thanks for comments to Suenje Dalmeier Tiessen and Amy Nurnberger, co-chairs of the RDA Working
Group on Data Publishing Workflows
Maintaining trust across the research cycle
Q. How do these principles apply in real cases?
Case Study 1 – Rory Macneill, Research Space
Integrating the RSpace ELN with University of Edinburgh’s DataShare
and Harvard’s Dataverse repositories
Export data and metadata
Archive or Repository
Current Repository Paradigm
The reality is (a lot) more complex
Data
Files
Links
Repositories
Archives
ELNs
File systems
Databases
Capture and structure data
Generate metadata
Export data and metadata
Establish links to file(s) and databases
Track file locations
ActionsVehiclesResearch units
Issues
Data versus file links
Forms of data export
Data from intermediary vehicles
File location and integrity of links
Post deposit access - permissions
Post deposit access - capabilities
Pre-deposit impact on post-deposit capabilities
Repositories need the ability to easily ingest diverse data types and formats,
and links to files,
in a structured manner,
directly and from other tools
The Dataverse – Starfish – RSpace
project @Harvard Medical School
Towards a new paradigm
Data, files and research Capture and structure data
Generate metadata
Track file locations Make data and files available for
public access and query
Access hyperlinks √
Track file locations √
Export data and metadata
In various formats
Using open standards
PDF √
Word √
HTML √
XML √
On premises or
Cloud file
system
Database
Capture and structure data √
Generate metadata √
Track file locations √
Draft design principles for integration of research data
service workflows
1. Active data management services should use open standards to express and
expose the objects and metadata they offer to downstream services, including
their access and reuse terms
2. Preservation and publication services should publish policies stating what digital
object types they accept, for what communities, and on what terms and
conditions
3. Preservation and publication services should make openly available sufficient
metadata to enable reuse of their outputs, including all terms and conditions for
third-party access and reuse
4. Active data management, preservation and publication services should make
sufficient detail of their workflows available to support the reproducibility of
research that produced the digital objects they act upon
5. Guidance services should support users of other services to make an informed
choice of downstream service capabilities, informed by best practices for reuse
and reproducibility
6. Guidelines 1-5 should be implemented using machine-actionable content
Case study 2
Stuart Lewis, University of Edinburgh
DataVault –Jisc Research Data Spring prototype for packaging data to
be archived
Stuart Lewis
Deputy Director, Library &
University Collections
The University of Edinburgh
What is a Data Vault?
@JiscDataVault
Research Data Management Services
Data Management Support
Data
Management
Planning
Active Data
Infrastructure
Data
Stewardship
Data Stewardship
• DataVault
– Long term archival storage
– First envisaged a few years ago…
What is the DataVault - Analogies
https://www.flickr.com/photos/brookward/8457736952
What is the DataVault - Analogies
https://www.flickr.com/photos/timshelyn/412548076
7
Where does it sit?
Active Storage
Lab Equipment
Other Media
Archival Storage
Where does it sit?
Active Storage
Lab Equipment
Other Media
Archival Storage
PURE / CRIS / Data Catalogue
Where does could it sit?
Active Storage
Lab Equipment
Other Media
Archival Storage
PURE / CRIS / Data Catalogue
Open Data Repository
Phase 1 (3 months)
Phase 2 (4 months)
Phase 3 (6 months)
Make available online
Metadata already captured from CRIS, plus files from the Vault
Stuart Lewis
stuart.lewis@ed.ac.uk
@stuartlewis
What is a Data Vault?
Draft design principles for integration of research data
service workflows
1. Active data management services should use open standards to express and
expose the objects and metadata they offer to downstream services, including
their access and reuse terms
2. Preservation and publication services should publish policies stating what digital
object types they accept, for what communities, and on what terms and
conditions
3. Preservation and publication services should make openly available sufficient
metadata to enable reuse of their outputs, including all terms and conditions for
third-party access and reuse
4. Active data management, preservation and publication services should make
sufficient detail of their workflows available to support the reproducibility of
research that produced the digital objects they act upon
5. Guidance services should support users of other services to make an informed
choice of downstream service capabilities, informed by best practices for reuse
and reproducibility
6. Guidelines 1-5 should be implemented using machine-actionable content

More Related Content

What's hot

Research Data Shared Service
Research Data Shared ServiceResearch Data Shared Service
Research Data Shared Service
Jisc
 

What's hot (20)

BD2K and the Commons : ELIXR All Hands
BD2K and the Commons : ELIXR All Hands BD2K and the Commons : ELIXR All Hands
BD2K and the Commons : ELIXR All Hands
 
NIH Data Summit - The NIH Data Commons
NIH Data Summit - The NIH Data CommonsNIH Data Summit - The NIH Data Commons
NIH Data Summit - The NIH Data Commons
 
Turning FAIR into Reality - Role for Libraries
Turning FAIR into Reality - Role for Libraries Turning FAIR into Reality - Role for Libraries
Turning FAIR into Reality - Role for Libraries
 
OAIS: What is it and Where is it Going? - Don Sawyer (2002)
OAIS: What is it and Where is it Going? - Don Sawyer (2002)OAIS: What is it and Where is it Going? - Don Sawyer (2002)
OAIS: What is it and Where is it Going? - Don Sawyer (2002)
 
Current Research Information Systems
Current Research Information SystemsCurrent Research Information Systems
Current Research Information Systems
 
Research Data Shared Service
Research Data Shared ServiceResearch Data Shared Service
Research Data Shared Service
 
RDA Presentation to the International Federation of Library Associations
RDA Presentation to the International Federation of Library AssociationsRDA Presentation to the International Federation of Library Associations
RDA Presentation to the International Federation of Library Associations
 
Big Data As a service - Sethuonline.com | Sathyabama University Chennai
Big Data As a service - Sethuonline.com | Sathyabama University ChennaiBig Data As a service - Sethuonline.com | Sathyabama University Chennai
Big Data As a service - Sethuonline.com | Sathyabama University Chennai
 
Komatsoulis internet2 global forum 2015
Komatsoulis internet2 global forum 2015Komatsoulis internet2 global forum 2015
Komatsoulis internet2 global forum 2015
 
A Framework for Geospatial Web Services for Public Health by Dr. Leslie Lenert
A Framework for Geospatial Web Services for Public Health by Dr. Leslie LenertA Framework for Geospatial Web Services for Public Health by Dr. Leslie Lenert
A Framework for Geospatial Web Services for Public Health by Dr. Leslie Lenert
 
RD shared services and research data spring
RD shared services and research data springRD shared services and research data spring
RD shared services and research data spring
 
December 9, 2015 NISO Webinar: Two-Part Webinar: Emerging Resource Types - Pa...
December 9, 2015 NISO Webinar: Two-Part Webinar: Emerging Resource Types - Pa...December 9, 2015 NISO Webinar: Two-Part Webinar: Emerging Resource Types - Pa...
December 9, 2015 NISO Webinar: Two-Part Webinar: Emerging Resource Types - Pa...
 
UKSG webinar - Current Research Information Systems (CRIS): What are they and...
UKSG webinar - Current Research Information Systems (CRIS): What are they and...UKSG webinar - Current Research Information Systems (CRIS): What are they and...
UKSG webinar - Current Research Information Systems (CRIS): What are they and...
 
Imaging dearry ncrdc 11062017
Imaging dearry ncrdc  11062017Imaging dearry ncrdc  11062017
Imaging dearry ncrdc 11062017
 
Data Commons Garvan - 2016
Data Commons Garvan -  2016 Data Commons Garvan -  2016
Data Commons Garvan - 2016
 
OAIS and It's Applicability for Libraries, Archives, and Digital Repositories...
OAIS and It's Applicability for Libraries, Archives, and Digital Repositories...OAIS and It's Applicability for Libraries, Archives, and Digital Repositories...
OAIS and It's Applicability for Libraries, Archives, and Digital Repositories...
 
SMART Seminar Series: SMART Data Management
SMART Seminar Series: SMART Data ManagementSMART Seminar Series: SMART Data Management
SMART Seminar Series: SMART Data Management
 
Trusted Digital Repositories: OAIS and Certification - Robin Dale (2002)
Trusted Digital Repositories: OAIS and Certification - Robin Dale (2002)Trusted Digital Repositories: OAIS and Certification - Robin Dale (2002)
Trusted Digital Repositories: OAIS and Certification - Robin Dale (2002)
 
AHM 2014: Enterprise Architecture for Transformative Research and Collaborati...
AHM 2014: Enterprise Architecture for Transformative Research and Collaborati...AHM 2014: Enterprise Architecture for Transformative Research and Collaborati...
AHM 2014: Enterprise Architecture for Transformative Research and Collaborati...
 
AgriVIVO: A Global Ontology-Driven RDF Store Based on a Distributed Architect...
AgriVIVO: A Global Ontology-Driven RDF Store Based on a Distributed Architect...AgriVIVO: A Global Ontology-Driven RDF Store Based on a Distributed Architect...
AgriVIVO: A Global Ontology-Driven RDF Store Based on a Distributed Architect...
 

Viewers also liked

Diseño de presentaciones visuales
Diseño de presentaciones visualesDiseño de presentaciones visuales
Diseño de presentaciones visuales
Linda Castañeda
 
Avionics Systems Instruments
Avionics Systems InstrumentsAvionics Systems Instruments
Avionics Systems Instruments
Michael Bseliss
 

Viewers also liked (9)

Where Little Data meets Big Data in Biomedical Research
Where Little Data meets Big Data in Biomedical ResearchWhere Little Data meets Big Data in Biomedical Research
Where Little Data meets Big Data in Biomedical Research
 
Finding the sweet spot - blending the best of native and web
Finding the sweet spot - blending the best of native and webFinding the sweet spot - blending the best of native and web
Finding the sweet spot - blending the best of native and web
 
Cómo crear una buena tarea
Cómo crear una buena tareaCómo crear una buena tarea
Cómo crear una buena tarea
 
Diseño de presentaciones visuales
Diseño de presentaciones visualesDiseño de presentaciones visuales
Diseño de presentaciones visuales
 
Avionics Systems Instruments
Avionics Systems InstrumentsAvionics Systems Instruments
Avionics Systems Instruments
 
9. aircraft electrical systems
9. aircraft electrical systems9. aircraft electrical systems
9. aircraft electrical systems
 
Numbers simulation - less is more!
Numbers simulation - less is more!Numbers simulation - less is more!
Numbers simulation - less is more!
 
Artificial Intelligence: Predictions for 2017
Artificial Intelligence: Predictions for 2017Artificial Intelligence: Predictions for 2017
Artificial Intelligence: Predictions for 2017
 
Pass the pennies - Lean game simulation
Pass the pennies - Lean game simulationPass the pennies - Lean game simulation
Pass the pennies - Lean game simulation
 

Similar to Paving the way to open and interoperable research data service workflows Progress from 3 perspectives

Jisc Research Data Shared Service - Spring Update
Jisc Research Data Shared Service - Spring UpdateJisc Research Data Shared Service - Spring Update
Jisc Research Data Shared Service - Spring Update
Jisc RDM
 
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Geoffrey Fox
 
Summary of data citation synthesis activity & Review
Summary of data citation synthesis activity & ReviewSummary of data citation synthesis activity & Review
Summary of data citation synthesis activity & Review
Micah Altman
 
Data citationworkshop idcc_2014 Altman
Data citationworkshop idcc_2014 AltmanData citationworkshop idcc_2014 Altman
Data citationworkshop idcc_2014 Altman
Micah Altman
 

Similar to Paving the way to open and interoperable research data service workflows Progress from 3 perspectives (20)

Recognising data sharing
Recognising data sharingRecognising data sharing
Recognising data sharing
 
Towards Semantic APIs for Research Data Services (Invited Talk)
Towards Semantic APIs for Research Data Services (Invited Talk)Towards Semantic APIs for Research Data Services (Invited Talk)
Towards Semantic APIs for Research Data Services (Invited Talk)
 
Jisc Research Data Shared Service - Spring Update
Jisc Research Data Shared Service - Spring UpdateJisc Research Data Shared Service - Spring Update
Jisc Research Data Shared Service - Spring Update
 
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
 
Summary of data citation synthesis activity & Review
Summary of data citation synthesis activity & ReviewSummary of data citation synthesis activity & Review
Summary of data citation synthesis activity & Review
 
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific DataNIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
 
12.10.14 Slides, “Roadmap to the Future of SHARE”
12.10.14 Slides, “Roadmap to the Future of SHARE”12.10.14 Slides, “Roadmap to the Future of SHARE”
12.10.14 Slides, “Roadmap to the Future of SHARE”
 
McGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and ScalingMcGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and Scaling
 
Jisc Research data shared service overview and update - May 2016
Jisc Research data shared service overview and update - May 2016Jisc Research data shared service overview and update - May 2016
Jisc Research data shared service overview and update - May 2016
 
What infrastructure is necessary for successful research data management (RDM...
What infrastructure is necessary for successful research data management (RDM...What infrastructure is necessary for successful research data management (RDM...
What infrastructure is necessary for successful research data management (RDM...
 
UKRDDS Project Overview - Feb 2016
UKRDDS Project Overview - Feb 2016UKRDDS Project Overview - Feb 2016
UKRDDS Project Overview - Feb 2016
 
Data, Data Everywhere: What's A Publisher to Do?
Data, Data Everywhere: What's  A Publisher to Do?Data, Data Everywhere: What's  A Publisher to Do?
Data, Data Everywhere: What's A Publisher to Do?
 
Turning FAIR into Reality: Briefing on the EC’s report on FAIR data
Turning FAIR into Reality: Briefing on the EC’s report on FAIR dataTurning FAIR into Reality: Briefing on the EC’s report on FAIR data
Turning FAIR into Reality: Briefing on the EC’s report on FAIR data
 
Digital Curation 101 - Taster
Digital Curation 101 - TasterDigital Curation 101 - Taster
Digital Curation 101 - Taster
 
RDM shared services at IDCC
RDM shared services at IDCCRDM shared services at IDCC
RDM shared services at IDCC
 
Why Data Citation Currently Misses the Point
Why Data Citation Currently Misses the PointWhy Data Citation Currently Misses the Point
Why Data Citation Currently Misses the Point
 
Hughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication RepositoriesHughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication Repositories
 
Jisc research data shared service overview IDCC 2016
Jisc research data shared service overview IDCC 2016Jisc research data shared service overview IDCC 2016
Jisc research data shared service overview IDCC 2016
 
Supporting Research Data Management in UK Universities: the Jisc Managing Res...
Supporting Research Data Management in UK Universities: the Jisc Managing Res...Supporting Research Data Management in UK Universities: the Jisc Managing Res...
Supporting Research Data Management in UK Universities: the Jisc Managing Res...
 
Data citationworkshop idcc_2014 Altman
Data citationworkshop idcc_2014 AltmanData citationworkshop idcc_2014 Altman
Data citationworkshop idcc_2014 Altman
 

Recently uploaded

Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Sérgio Sacani
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
levieagacer
 
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
Scintica Instrumentation
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
NazaninKarimi6
 

Recently uploaded (20)

Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICEPATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
 
Concept of gene and Complementation test.pdf
Concept of gene and Complementation test.pdfConcept of gene and Complementation test.pdf
Concept of gene and Complementation test.pdf
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIACURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
 
Genome sequencing,shotgun sequencing.pptx
Genome sequencing,shotgun sequencing.pptxGenome sequencing,shotgun sequencing.pptx
Genome sequencing,shotgun sequencing.pptx
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
GBSN - Microbiology (Unit 3)Defense Mechanism of the body
GBSN - Microbiology (Unit 3)Defense Mechanism of the body GBSN - Microbiology (Unit 3)Defense Mechanism of the body
GBSN - Microbiology (Unit 3)Defense Mechanism of the body
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
Early Development of Mammals (Mouse and Human).pdf
Early Development of Mammals (Mouse and Human).pdfEarly Development of Mammals (Mouse and Human).pdf
Early Development of Mammals (Mouse and Human).pdf
 
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxClimate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
Cot curve, melting temperature, unique and repetitive DNA
Cot curve, melting temperature, unique and repetitive DNACot curve, melting temperature, unique and repetitive DNA
Cot curve, melting temperature, unique and repetitive DNA
 
Cyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptxCyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptx
 
FS P2 COMBO MSTA LAST PUSH past exam papers.
FS P2 COMBO MSTA LAST PUSH past exam papers.FS P2 COMBO MSTA LAST PUSH past exam papers.
FS P2 COMBO MSTA LAST PUSH past exam papers.
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
 
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
 

Paving the way to open and interoperable research data service workflows Progress from 3 perspectives

  • 1. Paving the way to open and interoperable research data service workflows Progress from 3 perspectives Angus Whyte, Digital Curation Centre Rory Macneil, Research Space Stuart Lewis, University of Edinburgh Repository Fringe, Edinburgh, 2nd August 2016
  • 2. Paving the way to open and interoperable research data service workflows Progress from 3 perspectives Angus Whyte, Digital Curation Centre Research data service models, new DCC guidance and some draft ‘design principles’ for institutions integrating research data service workflows Rory Macneill, Research Space Integrating the RSpace ELN with University of Edinburgh’s DataShare and Harvard’s Dataverse repositories Stuart Lewis, University of Edinburgh DataVault –Jisc Research Data Spring prototype for packaging data to be archived
  • 3. Some working definitions Research data service – “a means of delivering value to the producers and users of digital objects by facilitating outcomes they want to achieve without the ownership of specific costs or risks” (derived from ITIL definition of a service) Sub-types - Active data management “services used to create or transform digital objects for the purposes of research” Preservation: “services offering to ensure digital objects meet a defined level of FAIRness - findability, accessibility, interoperability, and reusability - for a designated community and period of time” Publication: “services offering to enhance digital objects FAIRness by reviewing their quality on specified criteria, or connecting them to additional metadata” Guidance: “services offering practical guidance on choosing or using the above services”
  • 4. Draft design principles for integration of research data service workflows 1. Active data management services should use open standards to express and expose the objects and metadata they offer to downstream services, including their access and reuse terms
  • 5. Draft design principles for integration of research data service workflows 2. Preservation and publication services should publish policies stating what digital object types they accept, for what communities, and on what terms and conditions
  • 6. Draft design principles for integration of research data service workflows 3. Preservation and publication services should make openly available sufficient metadata to enable reuse of their outputs, including all terms and conditions for third-party access and reuse
  • 7. Draft design principles for integration of research data service workflows 4. Active data management, preservation and publication services should make sufficient detail of their workflows available to support the reproducibility of research that produced the digital objects they act upon
  • 8. Draft design principles for integration of research data service workflows 5. Guidance services should support users of other services to make an informed choice of downstream service capabilities, informed by best practices for reuse and reproducibility
  • 9. Draft design principles for integration of research data service workflows 6. Guidelines 1-5 should be implemented using machine-actionable content
  • 10. Guidance from Digital Curation Centre Aims to effectively support organisations with- • Providing effective research data services • Promoting reusability of their research data • and reproducibility of their research
  • 11. Models that reflect and help shape reality Higgins, S. (2008) The DCC Curation Lifecycle Model. Int. Journal Digital Curation 2008, Vol. 3, No. 1, pp. 134-140 doi:10.2218/ijdc.v3i1.48 Key element of DCC guidance e.g. Curation Lifecycle Model (2008)
  • 12. DCC research data service model (2016) RDM policy & strategy Business plans & sustainability Training Advisory services Data Management Planning Active data management Appraisal & risk assessment Preservation Access & publishing Discovery Ref. DCC How-to Guide on Evaluating Data Repository and Catalogue Platforms (forthcoming, 2016)
  • 13. Data Management Planning Active data management Appraisal & risk assessment Preservation Access & publishing Discovery But real solutions are made up of many more parts tools and services at researchers disposal are increasing
  • 14. ‘lifecycles’ are often non-linear Integration use cases do not follow neat sequences One size does not fit all
  • 15. What about Functional Models? OAIS foundation for Trustworthy Digital Repository standards Ingest Data management AIP AIP Preservation PlanningContext Context Description Description RDM SIP DIP UsersProducer Archival storage Access Administration
  • 16. RDM platforms can be grafted on, to relate OAIS functions to context info sources Ingest Data management AIP AIP CRIS Open access Repository Preservation PlanningContext Context Description Description RDM SIP DIP UsersProducer Data Catalogue Archival storage Collection Appraisal Access Data Preservation Active Data Management Data Mgmt Planning Data Publication Administration
  • 17. But what best practice models apply ‘upstream’? Q. How can institutions ensure researchers have informed choice of trustworthy services before they need a repository • “Whole lifecycle” service models are one solution • But how should they be governed? – Commons? – Public-private partnerships? – Commercial services?
  • 18. But what best practice models apply ‘upstream’? Q. How can institutions ensure researchers have informed choice of trustworthy services before they need a repository • “Whole lifecycle” service models are one solution • But how should they be governed? – Commercial services? – Commons? – Public-private partnerships?
  • 19. Commercial services model e.g. Elsevier “On Wednesday 1 June, Elsevier acquired Hivebench to help further streamline the workflow of researchers – putting research data management at their fingertips. The added value of the integration lies in linking Hivebench with Elsevier’s existing Research Data Management portfolio for products and services. The research data that researchers have stored in the Hivebench notebook are linked to the Mendeley Data repository, which will be linked to Pure. This ….adds instant value to the datasets because they become far more suitable for reuse.”
  • 20. Commons approach e.g. Principles for Open Scholarly Infrastructures “…What should a shared infrastructure look like? Infrastructure at its best is invisible. We tend to only notice it when it fails. If successful, it is stable and sustainable. Above all, it is trusted and relied on by the broad community it serves. Trust must run strongly across each of the following areas: running the infrastructure (governance), funding it (sustainability), and preserving community ownership of it (insurance).” (emphasis added) Cameron Neylon, Science in the Open, 25 Feb 2015
  • 21. Commons approach e.g. Open Science Framework But workflows in effect governed by a ‘mixed economy’ of platforms So what principles should apply? Mathew Spitzer An Open Science Framework for Solving Institutional Challenges: Supporting the Institutional Research Method Research Data Access and Preservation Summit, 2016 Atlanta, GA May 4-7, 2016
  • 22. Background - RDA Working Group on Data Publishing Workflows Reviewed 25 examples of repository data publishing workflows, including some integrating with ‘downstream’ services e.g. data journals for peer review Austin, Claire C et al.. (2015). Key components of data publishing: Using current best practices to develop a reference model for data publishing. Zenodo. http://dx.doi.org/10.5281/zenodo.34542 Drafted a reference model and made best practice recommendations- 1. Start small, building modular, open source and shareable components 2. Follow standards that facilitate interoperability and permit extensions 3. Facilitate data citation, e.g. through use of digital object PIDs, data/article linkages, researcher PIDs 4. Document roles, workflows and services
  • 23. Background - RDA Working Group on Data Publishing Workflows • Follow up call (Dec 15) for examples of repositories connecting with upstream research workflows e.g. to gather metadata earlier • Aiming to identify whether recommendations apply, and how the intention to publish data is changing research practices. • Collected 12 cases - mix of small concrete examples, prototypes, conceptual models
  • 24. Analysis of upstream workflows Are the services components underpinning research workflows loosely coupled? – Modular design of workflow components? yes, plenty evidence – Standard vocabularies and protocols to describe components some evidence – Significant investment in building trust-based relationships among participants some evidence – Standardized ways of specifying capabilities and performance requirements limited (e.g. WDS-DSA Catalogue of Requirements) * ‘The Joy of Flex’ Hagel and Seely-Brown, 2005 see e.g. http://www.cio.com.au/article/29148/joy_flex
  • 25. So do we need more … E.g. DCC How-to describe research data service workflows (forthcoming) • Definitions – to describe services ? • Design principles – to articulate best practice? • Capability models - to articulate mutual expectations of service owners? • Case studies of repository workflow integration? • Understanding of how tools are actually being integrated, and effect on data publication practices
  • 26. Draft design principles for integration of research data service workflows 1. Active data management services should use open standards to express and expose the objects and metadata they offer to downstream services, including their access and reuse terms 2. Preservation and publication services should publish policies stating what digital object types they accept, for what communities, and on what terms and conditions 3. Preservation and publication services should make openly available sufficient metadata to enable reuse of their outputs, including all terms and conditions for third-party access and reuse 4. Active data management, preservation and publication services should make sufficient detail of their workflows available to support the reproducibility of research that produced the digital objects they act upon 5. Guidance services should support users of other services to make an informed choice of downstream service capabilities, informed by best practices for reuse and reproducibility 6. Guidelines 1-5 should be implemented using machine-actionable content Thanks for comments to Suenje Dalmeier Tiessen and Amy Nurnberger, co-chairs of the RDA Working Group on Data Publishing Workflows
  • 27. Maintaining trust across the research cycle Q. How do these principles apply in real cases? Case Study 1 – Rory Macneill, Research Space Integrating the RSpace ELN with University of Edinburgh’s DataShare and Harvard’s Dataverse repositories
  • 28. Export data and metadata Archive or Repository Current Repository Paradigm
  • 29. The reality is (a lot) more complex Data Files Links Repositories Archives ELNs File systems Databases Capture and structure data Generate metadata Export data and metadata Establish links to file(s) and databases Track file locations ActionsVehiclesResearch units Issues Data versus file links Forms of data export Data from intermediary vehicles File location and integrity of links Post deposit access - permissions Post deposit access - capabilities Pre-deposit impact on post-deposit capabilities Repositories need the ability to easily ingest diverse data types and formats, and links to files, in a structured manner, directly and from other tools
  • 30. The Dataverse – Starfish – RSpace project @Harvard Medical School Towards a new paradigm Data, files and research Capture and structure data Generate metadata Track file locations Make data and files available for public access and query
  • 31. Access hyperlinks √ Track file locations √ Export data and metadata In various formats Using open standards PDF √ Word √ HTML √ XML √ On premises or Cloud file system Database Capture and structure data √ Generate metadata √ Track file locations √
  • 32. Draft design principles for integration of research data service workflows 1. Active data management services should use open standards to express and expose the objects and metadata they offer to downstream services, including their access and reuse terms 2. Preservation and publication services should publish policies stating what digital object types they accept, for what communities, and on what terms and conditions 3. Preservation and publication services should make openly available sufficient metadata to enable reuse of their outputs, including all terms and conditions for third-party access and reuse 4. Active data management, preservation and publication services should make sufficient detail of their workflows available to support the reproducibility of research that produced the digital objects they act upon 5. Guidance services should support users of other services to make an informed choice of downstream service capabilities, informed by best practices for reuse and reproducibility 6. Guidelines 1-5 should be implemented using machine-actionable content
  • 33. Case study 2 Stuart Lewis, University of Edinburgh DataVault –Jisc Research Data Spring prototype for packaging data to be archived
  • 34. Stuart Lewis Deputy Director, Library & University Collections The University of Edinburgh What is a Data Vault? @JiscDataVault
  • 35. Research Data Management Services Data Management Support Data Management Planning Active Data Infrastructure Data Stewardship
  • 36. Data Stewardship • DataVault – Long term archival storage – First envisaged a few years ago…
  • 37. What is the DataVault - Analogies https://www.flickr.com/photos/brookward/8457736952
  • 38. What is the DataVault - Analogies https://www.flickr.com/photos/timshelyn/412548076 7
  • 39. Where does it sit? Active Storage Lab Equipment Other Media Archival Storage
  • 40. Where does it sit? Active Storage Lab Equipment Other Media Archival Storage PURE / CRIS / Data Catalogue
  • 41. Where does could it sit? Active Storage Lab Equipment Other Media Archival Storage PURE / CRIS / Data Catalogue Open Data Repository
  • 42. Phase 1 (3 months) Phase 2 (4 months) Phase 3 (6 months)
  • 43.
  • 44.
  • 45.
  • 46.
  • 47.
  • 48.
  • 49.
  • 50.
  • 51.
  • 52.
  • 53.
  • 54. Make available online Metadata already captured from CRIS, plus files from the Vault
  • 55.
  • 57. Draft design principles for integration of research data service workflows 1. Active data management services should use open standards to express and expose the objects and metadata they offer to downstream services, including their access and reuse terms 2. Preservation and publication services should publish policies stating what digital object types they accept, for what communities, and on what terms and conditions 3. Preservation and publication services should make openly available sufficient metadata to enable reuse of their outputs, including all terms and conditions for third-party access and reuse 4. Active data management, preservation and publication services should make sufficient detail of their workflows available to support the reproducibility of research that produced the digital objects they act upon 5. Guidance services should support users of other services to make an informed choice of downstream service capabilities, informed by best practices for reuse and reproducibility 6. Guidelines 1-5 should be implemented using machine-actionable content