This document summarizes a presentation on research data management (RDM). It discusses the key activities, roles, requirements and drivers of RDM. Stakeholders include researchers, institutions, funding bodies and data repositories. Institutions are developing RDM policies, services and infrastructure to meet growing funder requirements for data sharing and preservation. Data management planning is important for addressing issues such as data documentation, sharing, storage and long-term preservation.
This presentation was given by Joy Davidson from the Digital Curation Centre at the KAPTUR training event held on Monday 19th November and supported by DCC through the Institutional Engagement project.
Supporting Libraries in Leading the Way in Research Data Management by Marieke Guy
Marieke Guy, Institutional Support Officer, Digital Curation Centre, UKOLN, University of Bath, UK presents on Supporting Libraries in Leading the Way in Research Data Management at Online Information, London, 20th-21st November 2012
This slide deck provides an overview and resources to respond to the OSTP memo with the subject: Increasing Access to the Results of Federally Funded Scientific Research, issued by John P. Holdren in February 2013. It provides resources and information that agencies, foundations, and research projects can use to achieve public access to scientific data in digital formats.
(July 2011) One Less "To-Do:" Perceptions on the Role of Archives and Librari... by Carolyn Hank
Event:
Archival Educators Research Institute (AERI)
July 12, 2011, Boston, MA
Abstract:
The neologisms, bloggership and blogademia, have emerged in recent years, reflecting the adoption of blogs as channels for scholarly communication; the former in reference to legal scholarship blogs, or blawgs, and the latter to blogs across disciplines. This presentation reports select findings from a descriptive study of scholars who blog in the areas of history, economics, law, biology, chemistry and physics. The study examined scholars’ attitudes and perceptions of their blogs in relation to the system of scholarly communication, their preferences for digital preservation, and their respective blog publishing behaviors and blog characteristics influencing preservation action. Drawing from 153 questionnaires, 24 interviews, and content analysis of 93 blogs, this presentation will provide a focused analysis of findings related to preservation preferences. Results from the questionnaire portion of the study show that scholars who blog are generally interested in blog preservation with a strong sense of personal responsibility. Most feel their blogs should be preserved for both personal and public access and use into the indefinite, rather than short-term, future. Respondents identify themselves as most responsible for blog preservation. Concerning capability, they perceive blog service providers, hosts, and networks as most capable. National and institutional-based libraries and archives, as well as institutional IT departments, are perceived as least responsible and least capable for preservation of scholars’ respective blogs. During the subsequent interview portion of the study, participants did not dismiss the value of these organizations. If anything, for some, it is exactly this value that contributes to perceptions of libraries and archives’ low responsibility and capability. This presentation will conclude by offering implications from these findings on the potential role, or lack of role, for archives and libraries in the preservation of scholars’ blogs.
Event:
Digital Curation Institute Symposium
November 22, 2011
4:30-6:30pm
iSchool, University Of Toronto
Abstract:
This presentation reports select findings from two descriptive studies of blogs and bloggers in the areas of history, economics, law, biology, chemistry and physics. The first study focused on scholar bloggersʼ preferences for digital preservation, as well as their publishing behaviors and blog characteristics that influence preservation action. Findings are drawn from 153 questionnaires, 24 interviews, and content analysis of 93 blogs. Briefly, questionnaire respondents are generally interested in blog preservation with a strong sense of personal responsibility. Most feel their blogs should be preserved for both personal and public access and use into the indefinite, rather than short-term, future. Over half of questionnaire respondents report saving their blog content, in whole or in part, and many interviewees expressed a sophisticated understanding of issues of digital preservation. However, the findings also indicate that bloggers exhibit behaviors and preferences complicating preservation action, including issues related to rights and use, co-producer dependencies, and content integrity.
The second study, currently on-going, looks toward the public availability of scholar blogs over-time, with findings drawn from a sample of 644 blogs. Content analysis is currently underway on inactive blogs, characterized as available, but with no new posts published within three months of coding. Initial analysis of the most recent post published to these inactive blogs shows that some bloggers did provide indicators of their respective blogʼs declining activity or, in some cases, blog stoppage. However, such indicators are only present in a clear minority of publicly available, yet inactive blogs. These preliminary findings offer implications for both personal and programmatic preservation approaches, including, notably, issues related to selection and appraisal.
(Jan 2011) Digital Curation (Guest Lecture) by Carolyn Hank
Event: Guest lecture on introduction to digital curation for Prof. Elaine Menard's GLIS 639: Introduction to Museology class, School of Information Studies, McGill University (January 28, 2011)
Presentation given on October 10, 2012 at the School of Information Management, Faculty of Management at Dalhousie University.
Abstract: Ensuring persistent access to digital content is a challenge confronting contemporary institutions of all types and sizes, regardless of professional, disciplinary or organizational context. Introduced in 2002, the term digital curation describes an array of principles, strategies and technical approaches for enabling the use and re-use of reliable and trusted digital content into the indefinite future. Trusted digital repositories have emerged as one strategy in response to today's digital curatorial challenges. Successful digital repository development and deployment necessitates coordination and collaboration among an array of actors, resources, and diverse, potentially divergent requirements. The literature contains an assortment of digital repository planning and best practice recommendations and resources, though reports on actual, as opposed to perceived or potential, roadblocks and obstacles are less reported. Drawing from a first-hand account of an extensive, multi-year digital curation and repository project at a major research university, this presentation provides an overview of what was done, including what worked and what didn’t, and resulting recommendations for advancing digital repository planning, implementation, and research.
(Feb 2011) Scholars in the Blogosphere: Blogs, the Scholarly Record, and Impl... by Carolyn Hank
Event: Guest lecture in Ross Harvey's LIS 531W: Digital Stewardship, Graduate School of Library and Information Science, Simmons College, February 24, 2011.
Introduction to research data management by Michael Day
Slides from a presentation given at the JIBS User Group / RLUK joint event "Demystifying research data: don't be scared, be prepared" held at the SOAS Brunei Gallery, London, 17 July 2012.
Introduction to Research Data Management by Michael Day (UKOLN). Presentation at Demystifying Research Data: don’t be scared, be prepared: a joint JIBS/RLUK event, Tuesday 17th July 2012, Brunei Gallery at SOAS (School of Oriental and African Studies), London.
Overview of the UKRDDS pilot project at the University of Edinburgh, which employed PhD interns to validate metadata about research data created by University of Edinburgh researchers and held in local RDM services. This was presented at IASSIST in June 2016, Bergen, Norway.
Building Sustainability: Preserving research data without breaking the bank by Gareth Knight
An overview of methods for establishing buy-in into digital preservation activities within a university, accompanied by practical examples of how this approach is being performed at the London School of Hygiene & Tropical Medicine
In order to be reused, research data must be discoverable.
The EPSRC Research Data Expectations require research organisations to maintain a data catalogue to record metadata about research data generated by EPSRC-funded research projects.
Universities are increasingly making research data assets available through repositories or other data portals.
The requirement for a UK research data discovery service has grown as universities become more involved in RDM and capacity develops.
A presentation given at the RECODE workshop on 25th September 2014. It covers what is happening in terms of opening up access to research data at the University of Glasgow and via the Digital Curation Centre. The RECODE project is developing policy recommendations for open access to research data in Europe - http://recodeproject.eu
Models for integrating institutional repositories and research information ma... by Michael Day
Slides from a presentation given by Michael Day of UKOLN at the CNR/euroCRIS Workshop on CRIS, CERIF and Institutional Repositories, CNR, Rome, 10-11 May 2010
Exercise associated with a lecture on digital preservation given at the University of the West of England (UWE) as part of the MSc in Library and Library Management, University of the West of England, Frenchay Campus, Bristol, March 10, 2010
Brief Introduction to Digital Preservation by Michael Day
Presentation slides from a lecture given at the University of the West of England (UWE) as part of the MSc in Library and Library Management, University of the West of England, Frenchay Campus, Bristol, March 10, 2010
Supplementary presentation slides from a lecture on digital preservation given at the University of the West of England (UWE) as part of the MSc in Library and Library Management, University of the West of England, Frenchay Campus, Bristol, March 10, 2010
Presentation slides from a talk given at RSP 'Goes back to' School 2009, Matfen Hall, Nr. Hexham, Northumberland, 14-16 September 2009. The actual presentation on the 15 September only covered the content up to Slide 33. The remainder includes a more detailed reflection on the curation of research data, left in to provide additional context for those using the full presentation.
1. … because good research needs good data
Digital Curation 101
University of Glamorgan
21 January 2013
Michael Day
Digital Curation Centre
UKOLN, University of Bath
m.day@ukoln.ac.uk
http://www.dcc.ac.uk/
Funded by:
DCC 101, University of Glamorgan, 21 January 2013
2. … because good research needs good data
Agenda
• Part 1. Introduction to research data management:
activities, roles and requirements
• Exercise: Data management quiz
• Part 2. Developing data policies and services
• Exercise: Developing a roadmap
• Part 3: DMP Online tool and guidance
• With thanks to Joy Davidson, Sarah Jones and Kerry Miller
(DCC)
3. … because good research needs good data
Introduction to Research Data
Management: activities, roles and
requirements
Michael Day and Kerry Miller
Digital Curation Centre
UKOLN, University of Bath
m.day@ukoln.ac.uk
http://www.dcc.ac.uk/
4. … because good research needs good data
A Quick Introduction
• What is research data management?
• Who is involved and how?
• What skills and support are needed?
5. … because good research needs good data
What is Research Data Management?
• Caring for,
• Facilitating access to,
• Preserving and
• Adding value to digital
research data throughout its
lifecycle.
6. … because good research needs good data
Typical Activities
• Creation and sharing of data
• File naming and description
• Dealing appropriately with
sensitive data
• Data storage
• Appraisal, selection and
disposal
• Data licensing
• Data management planning
7. … because good research needs good data
What are the main drivers?
• National and international policy development
• The Organisation for Economic Co-operation and Development
describes data as a public good that should be made available
• Research Councils UK in its Code of Good Research Conduct says
data should be preserved and accessible for 10 years +
• The data management policies of funding bodies are increasingly
demanding of institutional commitment and provisions ...
• The needs of
• Researchers
• Institutions
8. … because good research needs good data
Benefits to researchers
• Scholarly communication/access to data
• Re-purposing and re-use of data
• Stimulating new networks/collaborations & new research
• Knowledge transfer to industry
• Verification of research/research integrity
• Re-purposing data for new audiences
• Secure storage for data intensive research
• Availability of data underpinning journal articles
• Increased visibility/citation
Keeping Research Data Safe Factsheet
http://www.beagrie.com/KRDS_Factsheet_0910.pdf
9. … because good research needs good data
The researcher perspective
• Managing and sharing data is simply part of good
research:
• Adhering to disciplinary and/or institutional codes of practice
and policies
• Has been practiced since the advent of modern science, but
not always consistently; data intensive research makes it
even more critical
• Meeting the specific requirements of funding bodies
• Reputational risks if data management is not handled
properly
10. … because good research needs good data
Institutional drivers
• Safeguarding research integrity
• Increasing number of FOI requests for data
• Adhering to existing codes of research practice and ethics
• Developing new institution-wide strategies, policies and services
for data storage and management
• Increased institutional focus on research management (e.g., in
response to REF)
• Benchmarking – self-assessing infrastructure and planning for
improvement
• More demands but fewer resources to work with
11. … because good research needs good data
Research codes of practice (1)
• UK Research Integrity Office Code of Practice for
Research (2009)
Data management planning is an essential part of research
design
Organisations should have in place procedures, resources
(including physical space) and administrative support to
assist researchers in the accurate and efficient collection of
data and its storage in a secure and accessible form [3.12.5]
12. … because good research needs good data
Research codes of practice (2)
• RCUK Code of Conduct on the Governance of Good
Research Conduct (2011)
Primary data and research evidence [should be made]
accessible to others for reasonable periods after the
completion of the research: data should normally be
preserved and accessible for 10 yrs (in some cases 20 yrs or
longer)
Responsibility for proper management and preservation of
data and primary materials is shared between the researcher
and the research organisation [although deposit within
national collections is endorsed]
13. … because good research needs good data
Research funding bodies
• UK Research Councils
• Help fund some data archives, e.g.:
• Archaeology Data Service, European Bioinformatics
Institute, the NERC data centres, UK Data Archive
• Support for JISC (and DCC)
• RCUK Common Principles on Data Policy
• Recognises that data are a critical output of the research
process
http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
14. … because good research needs good data
RCUK Principles (in a nutshell)
• Publicly funded research data should be made openly available
• Data with acknowledged long-term value should be preserved and
remain accessible and usable for future research
• Sufficient metadata should be recorded to enable other researchers to
find and understand the research to enable re-use; published results
should always include information on how to access the supporting data
• Recognition that there may be legal, ethical and commercial constraints
• Recognition that researchers may need privileged use of data for a
limited period
• All users of research data should acknowledge their sources
• Appropriate to use public funds to support the management of research data
15. … because good research needs good data
Funder expectations
• Institutions need to inform themselves about main
funder policies (mandates) with respect to research
data management
• There is an explicit link between research income and
appropriate data management infrastructures
16. … because good research needs good data
Funder policies
http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-poli
17. … because good research needs good data
EPSRC expectations (1)
• EPSRC policy (2011) expected all institutions
receiving grant funding:
• To develop a roadmap aligning their policies and processes
with EPSRC’s expectations by 1st May 2012
• To be fully compliant with these expectations by 1st May
2015
18. … because good research needs good data
EPSRC expectations (2)
• Appropriate metadata (including unique IDs) to be made
freely available on the Internet within 12 months of data
generation
• Data not generated in digital format should be stored in a
manner to facilitate it being shared
• Data should be securely preserved for a minimum of 10
years after privileged access expires or the last date access
was requested by a third party
• Adequate resources from existing funding streams
• EPSRC will monitor progress and compliance, and reserves
the right to impose appropriate sanctions
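The two time limits on this slide can be worked through as simple date arithmetic. The sketch below is illustrative only: the function names are hypothetical, "12 months" is treated as one calendar year, and the retention clock is assumed to start from the later of the two dates named on the slide (privileged access expiry or the last third-party access request). Institutions should check the EPSRC policy text itself before relying on any such calculation.

```python
from datetime import date
from typing import Optional

RETENTION_YEARS = 10  # minimum retention period stated on the slide


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later; 29 Feb falls back to 28 Feb."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Only reachable for 29 Feb when the target year is not a leap year.
        return d.replace(year=d.year + years, day=28)


def metadata_deadline(generated_on: date) -> date:
    """Latest date for publishing metadata: 12 months after data generation."""
    return add_years(generated_on, 1)


def earliest_disposal_date(
    privileged_access_expires: date,
    last_third_party_request: Optional[date] = None,
) -> date:
    """Earliest date the data could be disposed of.

    Assumption: the 10-year clock starts at whichever of the two dates is later.
    """
    start = privileged_access_expires
    if last_third_party_request is not None and last_third_party_request > start:
        start = last_third_party_request
    return add_years(start, RETENTION_YEARS)
```

Under this reading, each new third-party request after privileged access has expired pushes the earliest disposal date further out, so retention periods cannot be fixed once at deposit time.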
19. … because good research needs good data
Implications for researchers
• Increasing number of research councils and funding bodies with data
management and sharing requirements
• Potential loss of research income if these mandates are not met
• Need to determine the costs associated with short and longer-term
management and curation and to request funds as part of grant
• Responsibility for infrastructure shifting more to HEIs and less to
centralised data archives, but institutional infrastructures and services
are still emerging
• Need guidance - some good external support
• But also need more local support; often fragmented (need to draw upon
existing channels within your institution wherever possible)
20. … because good research needs good data
Activities, roles, requirements (1)
• Requirements gathering
• Identifying researchers’ data requirements
• Developing a shared understanding of what needs to be
done (e.g., identifying where data exist, its form and scale,
any existing retention requirements)
• Identifying good practice within the institution (and the
opposite)
• Methods: surveys, focus groups, case studies, joint R&D
projects, assessment tools (e.g. DAF)
21. … because good research needs good data
Activities, roles, requirements (2)
• Identifying motivations and benefits
• For researchers, support services, the institution
• Identifying risks
• Data loss (institution, research group, individual)
• Increased costs (lack of planning, service inefficiency, data
loss)
• Legal compliance (research funder, H&S, ethics, FoI)
• Reputation (institution, unit, individual)
• Identifying costs
• Keeping Research Data Safe (KRDS) toolkit
22. … because good research needs good data
Activities, roles, requirements (3)
• Assessing institutional preparedness
• Identifying institutional stakeholders, existing data support services,
gaps
• Benchmarking and planning for the future
• Skills audit
• DCC CARDIO tool
• Policy development
• Policies – approval by senior management is just the start; policies
need to be embedded in research practice and responsive to
changing requirements
• Data management planning
• DMP online, DCC How-to Develop a Data Management Plan guide
23. … because good research needs good data
Activities, roles, requirements (4)
• Implementation and service development
• Integrating where possible with existing services, e.g. IR,
CRIS, VRE, HPC, cloud services, social media, etc.
• Appraisal, deciding what needs to be kept and for how long
• Storage choices – no one-size-fits-all solution, e.g. Bristol’s
BluePeta petascale storage facility, Bath’s X-Drive approach,
cloud approaches
• Data documentation and metadata – layered approaches:
top-level discovery (core metadata, collection/experiment-
level?), role of standards like DCMI, CERIF, DDI, etc.
24. … because good research needs good data
Activities, roles, requirements (5)
• Data issues:
• Appraisal: selection criteria, retention periods (who decides?)
• DCC How to appraise and select research data for
curation guide
• Documentation: metadata, schema, semantics
• Formats: proprietary formats, community standards, etc.
• Provenance and authenticity
• Citation (assignment of persistent IDs?)
• Access (embargo policies?)
• Licensing
• DCC How to license research data guide
25. … because good research needs good data
Who is involved?
• Funding bodies
• Archives / long-term data repositories
• At institutions:
• Senior management
• Researcher(s)
• Research support officers / project staff
• Lab technicians
• Librarians / Data Centre staff
• Faculty ethics committees
• Institutional legal / IP advisors
• FOI officer / DPA officer / records manager
• Computing support
• Institutional compliance officers
26. … because good research needs good data
Approaching the Issue
• What data exist and are being created?
• Where are the greatest returns on investment available?
• Training?
• Storage?
• Policy development
• What are the requirements?
• Who needs to be involved?
27. … because good research needs good data
Making the most of what we’ve got
• Local expertise more
widespread than you
think
• Ethics committees
• Data protection office
• IT Services
• Repository Service
• If you need help, ask!
From University of Glasgow’s Data Management micro-site
28. … because good research needs good data
Data management planning
• A plan to address critical data management issues:
• What data will be created (format, types) and how?
• How will the data be documented and described?
• How will ethics and intellectual property considerations be
addressed?
• What are the plans for data sharing and access?
• What is the strategy for long-term preservation?
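The five questions above could be treated as a simple checklist. The following is a minimal sketch, not the structure used by DMP Online or any funder template: the class name, field names and helper function are all hypothetical, chosen only to mirror the questions on the slide.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class DataManagementPlan:
    """One free-text answer per question on the slide (hypothetical field names)."""
    data_created: str = ""        # What data will be created (format, types) and how?
    documentation: str = ""       # How will the data be documented and described?
    ethics_and_ip: str = ""       # How will ethics and IP considerations be addressed?
    sharing_and_access: str = ""  # What are the plans for data sharing and access?
    preservation: str = ""        # What is the strategy for long-term preservation?


def unanswered_sections(plan: DataManagementPlan) -> List[str]:
    """Return the names of sections that are still empty or whitespace-only."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]
```

A draft plan that has answered only the first and last questions would report `documentation`, `ethics_and_ip` and `sharing_and_access` as outstanding, which gives a quick way to see which parts of a plan still need attention.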
29. … because good research needs good data
Integrating is a tricky business
• Make a sound case for investing in data management training
• Draw upon existing policies and mandates wherever you can
• Spend some time identifying current data holdings, researchers’ practice and future training needs
• Make sure you are putting your effort where it will count
• Don’t reinvent the wheel – augment or adapt existing training and support materials with data management aspects
30. … because good research needs good data
What the DCC can help with
Needs assessment
• CARDIO Tool – collaborative assessment & benchmarking of RDM strengths/weaknesses
• Data Asset Framework – interviews to scope current RDM practice and recommend improvements
• Workflow assessment – methodology for analysing current RDM workflows
Delivering support
• Customised Data Management Plans – templates / guidance to be added to DMP Online
• Training – institutional/disciplinary tailored courses, online resources
• Incremental – repackaging existing support to raise awareness and make guidance more meaningful to researchers
Developing strategic institutional RDM framework
• Strategy development – getting key people together to discuss/plan for RDM
• Policy development – scoping, defining, embedding research data policies
• Costing – assist with the development of costing and pricing for RDM services
• Risk management – identify risks in RDM practice and recommend mitigations
• Institutional data catalogues – recommend options for exposing metadata about your research data via CRIS systems, repositories, or a mix of these
31. … because good research needs good data
Exercise: How are you performing?
• Individually, complete the quick data management quiz (5 mins)
• Compare results, try to learn from those with confidence in those areas in which you consider yourself to be weaker (10 mins)
• Based on your group’s discussions...
• Write down one practical thing you can do at work in order to edge towards an A.
32. … because good research needs good data
Part 2:
Developing data policies and
services
Based on a presentation prepared by Sarah Jones
(Digital Curation Centre)
sarah.jones@glasgow.ac.uk
33. … because good research needs good data
Outline
• Who is responsible for RDM?
• What are the components of a data service?
• Learning lessons from other HEIs
• Developing roadmaps
34. … because good research needs good data
Who is responsible for RDM?
[Diagram: stakeholders with a role in RDM]
• Funders
• Advisory bodies
• Data centres
• Research Organisations
• Support services
• Publishers
• Researchers
35. … because good research needs good data
Components of a research data service?
[Diagram: building blocks of a research data service]
• Tools
• Support staff & services
• Metadata and documentation
• Research environment & systems
• Storage (back-up, access)
• Archive (preserve, share)
• RDM policies
• Advocacy (senior mgmt & researcher)
36. … because good research needs good data
Data storage – Bristol example
• £2m funding to date
• Petascale facility – expandable
• 3 machine rooms – resilience
(tape archive 2012)
• Available to all researchers for
research data
Blue Peta at Bristol
1st 5TB free per Data Steward then
£400 per TB p.a. for disk storage;
tape backup £40 per TB
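The charging model quoted above lends itself to a quick estimate when writing storage costs into a proposal. The following is a minimal sketch, not Bristol's actual calculator: the function name and parameter names are hypothetical, and it assumes the tape figure is an annual per-TB charge with no free allowance, which the slide does not state either way.

```python
def annual_storage_cost_gbp(disk_tb, tape_tb=0.0, free_tb=5.0,
                            disk_rate_gbp=400.0, tape_rate_gbp=40.0):
    """Estimate a yearly storage charge from the figures quoted on the slide.

    Assumptions (not confirmed by the source): the free allowance applies
    to disk only, and the tape rate is charged per TB per year.
    """
    if disk_tb < 0 or tape_tb < 0:
        raise ValueError("storage volumes must be non-negative")
    # Only disk storage beyond the free allowance is charged.
    chargeable_disk_tb = max(disk_tb - free_tb, 0.0)
    return chargeable_disk_tb * disk_rate_gbp + tape_tb * tape_rate_gbp


if __name__ == "__main__":
    # 12 TB on disk with a 12 TB tape backup: 7 chargeable TB of disk.
    print(annual_storage_cost_gbp(disk_tb=12, tape_tb=12))
```

Keeping the rates as parameters makes it easy to substitute the figures published by another institution.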
http://data.bris.ac.uk
37. … because good research needs good data
Tools – an ‘academic dropbox’
Piloted at Lincoln & Edinburgh
www.dataflow.ox.ac.uk http://tiny.cc/owncloud-pilot
National level negotiation via Janet brokerage?
38. … because good research needs good data
Archiving – institutional data repositories
Not intended to replace national, subject or other established data collections
Acknowledgment of hybrid environment
http://datashare.is.ed.ac.uk
www.dspace.cam.ac.uk/
https://databank.ora.ox.ac.uk
Essex-RDR and DataPool at Southampton
39. … because good research needs good data
Archiving – external data centres
Research funders’ data centres…
Structured databases
Disciplinary & community initiatives
List of data centres: http://databib.org
40. … because good research needs good data
Data catalogues (metadata)
• DataFinder at Oxford
• DDI metadata by ResearchData@Essex
Develop a research data extension to the CERIF standard
http://cerif4datasets.wordpress.com
JISC & DCC planning national coordination
Can we learn lessons from overseas?
41. … because good research needs good data
Guidance and training
Collate guidance
www.gla.ac.uk/datamanagement
Online training
http://datalib.edina.ac.uk/mantra
Embed into curriculum via
Doctoral Training Centres
e.g. Research360@Bath
http://blogs.bath.ac.uk/research360
42. … because good research needs good data
Disciplinary training (RDMTrain)
www.dcc.ac.uk/training/train-trainer/disciplinary-rdm-training
43. … because good research needs good data
Early research data policies
“Statement of commitment”; infrastructure policy
Legal compliance style; a section in uni DM policy; useful guide as appendix
“10 commandments”; mutual promises; aspirational
Based on Edin. with a few additions
Baseline of RCUK Code + procedures & support
www.dcc.ac.uk/resources/policy-and-legal/institutional-data-policies
44. … because good research needs good data
How are others developing policies?
Theme from MRD workshop in Leeds:
High level policy (ratified)
+
User guides, practical support
+
RDM Infrastructure
http://tiny.cc/MRD-policy-workshop
Developing data policies: a trend for 2012
http://tiny.cc/PolicyNews (news post from Dec 2011)
45. … because good research needs good data
Policy development
“EPSRC expects all those it funds to have developed a clear
roadmap to align their policies and processes with EPSRC’s
expectations by 1st May 2012, and to be fully compliant with
these expectations by 1st May 2015.”
www.epsrc.ac.uk/about/standards/researchdata/Pages/impact.aspx
46. … because good research needs good data
What is the EPSRC looking for?
• Know what you hold – publish metadata
• Link publications and data
• Share data wherever possible
• Curate and preserve valuable data
http://tiny.cc/EPSRC-data-policy
The same as other funders (i.e. good research practice), so think broadly when you develop your strategy
47. … because good research needs good data
Exercise: Developing a roadmap for RDM
Think about the potential components of an RDM service
Based on the strengths/weaknesses you identified in the quiz:
• Draft a list of actions needed at your institution
• Attempt to prioritise your list and pencil in timeframes (consider quick wins!)
• Decide who needs to be involved to make this happen
48. … because good research needs good data
Part 3
DMP Online tool and guidance
Based on a presentation prepared by Sarah Jones
and Joy Davidson (DCC)
sarah.jones@glasgow.ac.uk
49. … because good research needs good data
Funders have DMP requirements
http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
50. … because good research needs good data
Funding body requirements
• Typically a short (c.1-2 pp) statement, covering:
• What data will be created (format, types, volume, avoidance
of duplication)
• Standards and methodologies to be used (including
metadata)
• How ethics and Intellectual Property will be addressed
• Plans for data sharing and access
• Strategy for long-term preservation
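The five topics listed above can be treated as a checklist when reviewing a draft statement. The sketch below is one possible way to do that; the dictionary keys and the `missing_topics` helper are hypothetical names chosen for illustration and are not part of DMP Online or any funder's template.

```python
# Topics a funder statement typically covers, as listed on the slide.
# The short keys are illustrative labels, not an official vocabulary.
REQUIRED_TOPICS = {
    "data_created": "What data will be created (format, types, volume, avoidance of duplication)",
    "standards_and_metadata": "Standards and methodologies to be used (including metadata)",
    "ethics_and_ip": "How ethics and Intellectual Property will be addressed",
    "sharing_and_access": "Plans for data sharing and access",
    "preservation": "Strategy for long-term preservation",
}


def missing_topics(plan):
    """Return the keys of required topics that the draft leaves empty.

    `plan` maps topic keys to the free-text answer written so far.
    """
    return [key for key in REQUIRED_TOPICS if not plan.get(key, "").strip()]


if __name__ == "__main__":
    draft = {
        "data_created": "Interview transcripts (text) and survey responses (CSV)",
        "sharing_and_access": "",
    }
    for key in missing_topics(draft):
        print("Still to write:", REQUIRED_TOPICS[key])
```

A check like this only confirms that each heading has some text; whether the answer is adequate still needs a human reviewer.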
51. … because good research needs good data
DCC support
• Guidance
• Examples
• Tools
52. … because good research needs good data
What is DMP Online?
• A web-based tool to help researchers write plans
• It features:
• Templates based on different requirements
• Tailored guidance (disciplinary, funder etc)
• Customised exports to a variety of formats
• Ability to share DMPs with others
• https://dmponline.dcc.ac.uk
53. … because good research needs good data
Start a plan
Pick relevant funder template
Get a list of their specific questions
54. … because good research needs good data
Create a plan at the bid stage
...answer the questions based on initial research ideas
55. … because good research needs good data
Once funded, flesh the plan out (roles, etc)
...answer the questions based on detailed workplan
56. … because good research needs good data
When project is finished
...answer the questions based on the outputs that are being kept
57. … because good research needs good data
Institutional customisation
Add your logo, URL, colours
Profile local support, boilerplate text
Select desired questions
http://www.dcc.ac.uk/blog/tailoring-dmp-online-for-your-institution
58. … because good research needs good data
Links to specific examples
Thinks about why the questions are being asked – what are funders looking for?
Gives examples, local if possible
http://www.icpsr.umich.edu/icpsrweb/content/datamanagement/dmp/framewo
59. … because good research needs good data
Top tips
• Encourage researchers to start early - not wait until the last minute!
• The plan will - and should - change over life of project.
• Get other support staff involved - ethics, IT, library, RM, DP/FoI
• Update the plan with project updates
• Use plan as a communication tool - with partners, funding bodies and yourself!
60. … because good research needs good data
Thank you!
Any questions?
Michael Day,
Digital Curation Centre
UKOLN, University of Bath
m.day@ukoln.ac.uk
http://www.dcc.ac.uk/
Editor's Notes
Given the audience I’ll reflect on two pieces of DCC work: DAF tool, which has been used primarily by service providers or intermediaries to investigate what’s happening in terms of data management at the coalface and explore service gaps to see what support researchers need, and; Research funders policies, specifically in terms of data management and sharing plan requirements, as this is directly relevant to researchers
This talk pulls together the lessons from the DCC roadshow to consider how to develop policies and services for Research Data Management (RDM)
We’ll cover who is responsible for RDM and what the potential components of a research data service are. The main part of the talk will focus on how other universities are addressing certain aspects, to see where you can learn lessons. At the end we’ll touch on developing roadmaps in light of the EPSRC policy requirement and do an exercise on this.
There are lots of stakeholders with varied roles, both within organisations and external to them. Requirements and support can be external (e.g. from funders, publishers, data centres) but in terms of developing infrastructure, research organisations are taking a central role. Ensuring clarity of responsibility across stakeholders and bringing people together is key.
*Animated slide – components come in separately* This isn’t definitive. It’s just an idea of the building blocks involved and how they might be put together.
- Storage is often thought of first. It should be properly backed up, with appropriate access controls and the ability to access it from anywhere.
- We also need an appropriate environment for research (instruments, hardware, software, VREs), plus tools and systems, e.g. for grants.
- Aside from current work environments, we also need to consider facilities for archiving to preserve and share data.
- There’s an inherent need to access/share data, so we need standards, tools and approaches for metadata across the lifecycle.
- We have the basics of a system, but none of this works without people to keep things running and provide guidance and training.
- We also need policies to provide overarching governance.
- And to ensure uptake and maintenance you need buy-in across the board, incentives and financial backing.
We’ll now consider how different institutions are addressing certain aspects of this.
The data.bris team gave a case study at the DCC Roadshow in Cardiff in December 2011. The details here are abstracted from that talk. They are building research data services around their High Performance Computing facility to provide all researchers with adequate storage for their research data. The key thing to note is the cost model: they provide a clear, up-front cost so additional storage can be written into proposals. Other universities (Oxford, Leicester) have produced similar figures.
A few institutions already run data repositories e.g. Edinburgh and Cambridge (both DSpace) Others are piloting them e.g. Essex and Southampton (doing extensions to existing ePrints repositories as part of JISC MRD02 programme) and Databank at Oxford. Key thing is that none of these services intend to replace established data services. Where there are more appropriate disciplinary data centres, for example, the data should be submitted there.
There are many external services – dedicated data centres supported by research funders and various structured databases and community initiatives. The list of data centres provided by DataCite is a useful reference for institutions and researchers to identify the most appropriate place of deposit.
This area is the aspect most in its infancy. No institutions appear to have a handle on exactly what research data they hold in order to systematically register & manage data, and expose appropriate metadata to facilitate sharing. However, several UK institutions have flagged a desire to develop institutional data catalogues so models are likely to emerge. EDINA at the University of Edinburgh started to investigate approaches in the RADAR project. A pertinent project to look at is C4D, which is developing an extension to the cerif standard to record information on research data. Research Data Australia – a discovery service for research data from Australian universities supported by ANDS – is a model the DCC is looking at to see how a similar service could be provided in the UK.
There are many examples of guidance and training, and most are Creative Commons licensed so you can repurpose them. At the University of Glasgow, the Incremental project pulled together details of existing support to raise awareness of services that tended to be missed or misunderstood. Mantra provided excellent online training modules, as did other JISC RDMTrain projects. A current trend is to embed RDM into existing curricula, e.g. core PhD skills courses. The Research360 project is collaborating with a Doctoral Training Centre and reflects on this in its blog.
Lots of training materials have been created on the JISC MRD programme. The outputs from the 5 disciplinary training projects are all freely available to reuse and are deposited in JORUM. We have mapped the modules & materials to the DCC lifecycle model to help people find relevant resources.
There are five institutional RDM policies at present (April 2012). These differ in approach:
- Oxford University doesn’t have a policy per se. They collaborated with the University of Melbourne on the EIDCSR project (c.2009) and realised that implementation is a stumbling block, so first introduced a Statement of Commitment until infrastructure was developed. A proper policy is being developed on the DaMaRO project.
- The University of Edinburgh’s policy is exemplary and seems to be the biggest influence on policy development at other institutions. It was written by an external consultant (Chris Rusbridge) and is described as aspirational, as they know there’s some way to go to make it a reality.
- The University of Hertfordshire has RDM requirements as part of a wider data management policy. The language/style is more legal; however, an appendix provides much more practical guidance on data management.
- The University of Northampton reiterates the RCUK Code as its guiding principle and usefully provides guidance on procedures and support to explain how the policy should be implemented.
- The University of East London has taken the Edinburgh policy as a model and made minor adjustments and additions: rewording, adding data review dates etc.
Other universities are sharing lessons about how they are developing policy. We pulled together examples of how policies were being developed in December. The news post has links to blogs and draft policy texts. There was a JISC MRD workshop on policy development in Leeds in March 2012. The suggestion there was to have a high-level policy (fairly generic) with accompanying user guides & support (which won’t need to go through the whole ratification process each time they’re changed). Detailed guidance for implementation may be better at a departmental / group level.
Uppermost on many minds at the moment is the requirement to develop a roadmap in response to the EPSRC. So what is a roadmap and where do you start? The key thing isn’t this outcome (i.e. the plan) rather the process of getting there – taking stock of your current position and realising what you need to do to be in a position to comply with the EPSRC policy in 3 years so you can plan for that activity.
The EPSRC policy is more specific than others in terms of what institutions should be doing e.g. register data, put metadata online within 12 months of creation, access = longer period of preservation... Looking for data to be shared (linked to publications) and curated/preserved to ensure ongoing access. These requirements are essentially the same as other funders, so don’t be too blinkered by what the EPSRC is looking for specifically.
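The "metadata online within 12 months of creation" expectation mentioned above can be tracked with simple date arithmetic. This is a rough sketch under stated assumptions: the helper names are hypothetical, "12 months" is read as twelve calendar months from the creation date, and a creation date of 29 February is clamped to 28 February in a non-leap year. The EPSRC policy text remains the authority on how the period is actually counted.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return `start` shifted forward by whole calendar months.

    The day is clamped to the last day of the target month, so
    31 January plus one month gives the end of February.
    """
    month_index = start.year * 12 + (start.month - 1) + months
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def metadata_deadline(created):
    """Date by which metadata should be published (assumed 12 calendar months)."""
    return add_months(created, 12)


if __name__ == "__main__":
    created = date(2013, 1, 21)
    print("Publish metadata by:", metadata_deadline(created).isoformat())
```

An institutional data register could store this date alongside each dataset record to flag entries whose metadata is overdue.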
In the exercise, please consider the potential components of a RDM service which we’ve covered here and the strengths and weaknesses you identified earlier in the CARDIO quiz to decide what you need to do, when and how.
I’ll give a quick overview of DCC’s main tools. DMP Online is a tool to help researchers write plans. It pulls together the various requirements and relevant support to make the process easier.
I recommend this ICPSR resource. It explains the importance of the different questions as a pointer to how to answer them. Examples are given. This is the most frequent request we get at DCC: examples help researchers think of what to write for their context.