This presentation sets out some of the challenges around citing and identifying datasets and introduces DataCite, the international data citation initiative. DataCite was founded on 1 December 2009 to support researchers by providing methods for them to locate, identify, and cite research datasets with confidence.
This presentation was given by Adam Farquhar at the STM Publishers Association Innovation Conference on 4 December 2009.
Developing research data management policy & services (Sarah Jones)
Slides updated for presentation at DCC Northeast roadshow in Newcastle, April 2012.
Session ends with an exercise on developing a roadmap for research data management.
Presentation initially given by Sarah Jones at the DCC roadshow in Loughborough, February 2012.
See event details at: http://www.dcc.ac.uk/events/data-management-roadshows/dcc-roadshow-loughborough
This presentation introduced participants to the DC 101 course and was given at the Digital Curation and Preservation Outreach and Capacity Building Workshop in Belfast on September 14-15 2009.
http://www.dcc.ac.uk/events/workshops/digital-curation-and-preservation-outreach-and-capacity-building-workshop
Developing an institutional research management plan: guidelines (sheila1)
Research data cycle; what is a data management plan; benefits of an RDM plan; the two best known international RDM plans; examples of university RDM plans; guidelines
A talk outlining the virtues and processes of Research Data Management for PhD students in the geosciences. Given by Stuart Macdonald at the Introduction to RDM Workshop, School of Geosciences, University of Edinburgh, on 2 November 2015
Long-term storage – will it fill up with the good stuff, or the big, bad, an... (DCC-info)
Presentation by Angus Whyte at DCC-Arkivum event 'Data Storage & Preservation Strategies for Research Data Management' at University of Edinburgh 27 October 2014
These are Robert H. McDonald's slides for the Future Trends Panel Presentation at the Inter-institutional Approaches to Supporting Scholarly Communication Symposium, held on August 16, 2012 at the Georgia Institute of Technology.
If Big Data is data that exceeds the processing capacity of conventional systems, thereby necessitating alternative processing measures, we are looking at an essentially technological challenge that IT managers are best equipped to address.
The DCC is currently working with 18 HEIs to support and develop their capabilities in the management of research data and, whilst the aforementioned challenge is not usually core to their expressed concerns, are there particular issues of curation inherent to Big Data that might force a different perspective?
We have some understanding of Big Data from our contacts in the Astronomy and High Energy Physics domains, and the scale and speed of development in Genomics data generation is well known, but the inability to provide sufficient processing capacity is not one of their more frequent complaints.
That’s not to say that Big Science and its Big Data are free of challenges in data curation; only that they are shared with their lesser cousins, where one might say that the real challenge is less one of size than diversity and complexity.
This brief presentation explores those aspects of data curation that go beyond the challenges of processing power but which may lend a broader perspective to the technology selection process.
This presentation was given by Jon Wheeler and Karl Benedict of the University of New Mexico during the joint NISO-NFAIS Virtual Conference held on December 7, 2016
Building Sustainability: Preserving research data without breaking the bank (GarethKnight)
An overview of methods for establishing buy-in for digital preservation activities within a university, accompanied by practical examples of how this approach is being applied at the London School of Hygiene & Tropical Medicine
Supporting Libraries in Leading the Way in Research Data Management (Marieke Guy)
Marieke Guy, Institutional Support Officer, Digital Curation Centre, UKOLN, University of Bath, UK, presents on Supporting Libraries in Leading the Way in Research Data Management at Online Information, London, 20-21 November 2012
Presentation given at the Indiana University School of Medicine's Ruth Lilly Medical Library. Contains information and resources specific to Indiana University Purdue University Indianapolis (IUPUI). For full class materials, see LYD17_IUPUIWorkshop folder here: https://osf.io/r8tht/.
Research Data Management Storage Requirements: University of Leeds (Research Data Leeds)
Research Data Management Storage Requirements Workshop, Mon 25 February, organised by Jisc, Janet and DCC. Presentation covers a research data survey, the RoaDMaP project, research data characteristics and potential storage requirements at the University of Leeds.
Data Lake Acceleration vs. Data Virtualization - What’s the difference? (Denodo)
Watch full webinar here: https://bit.ly/3hgOSwm
Data Lake technologies have been in constant evolution in recent years, with each iteration promising to fix what previous ones failed to accomplish. Several data lake engines are hitting the market with better ingestion, governance, and acceleration capabilities that aim to create the ultimate data repository. But isn't that the promise of a logical architecture with data virtualization too? So, what’s the difference between the two technologies? Are they friends or foes? This session will explore the details.
Results from Digital Curation Centre's 2015 survey of UK universities and Higher Education Institutions on development of RDM (research data management) support services
Outline of some of the opportunities and risks driving universities in the UK to develop services aiming to help researchers manage their data. Gives an overview of funder policies and service components, and suggests where these are heading in the short term.
Presentation from "Institutional Repositories Dealing with Data" OR2013 Workshop, 8th July 2013, Prince Edward Island. Outlines UK programmes to help Higher Education Institutions develop Research Data Management Services. Gives background on the Digital Curation Centre and the DCC role in developing services. Outlines emerging RDM services based on this experience, projects in the JISC Managing Research Data programmes, and two recent surveys on library plans & priorities. Then outlines examples in ‘new’ universities of how repository managers are enabling new roles for subject librarians to take shape in their institutions.
Presentation to IASSIST 2013, in the session Expanding Scholarship: Research Journals and Data Linkages. Describes PREPARDE workshop on repository accreditation for data publication and invites comments on guidelines.
Presented to "Managing the Material: Tackling Visual Arts as Research Data" workshop, organised by Visual Arts Data Service (VADS) in conjunction with the Digital Curation Centre (DCC), through the JISC-funded KAPTUR project. London, 14 September 2012
Reasons to select research data and where to start
Data Selection & Triage
1. Data Selection & Triage
JISC/DCC Progress Workshop: Managing Research Data & Institutional Engagement
Nottingham, 25 October 2012
This work is licensed under a Creative Commons Attribution 2.5 UK: Scotland License
2. Introduction
How can researchers and support staff effectively decide what data is worth holding on to, agree what to do with it, and arrange for its handover?
What challenges does this represent?
How to address them?
3. Outline
• What guidelines are there and why do we need more? - Angus Whyte, DCC, and Marie Therese Gramstadt, KAPTUR
• UK Data Archive's Data Review Process - Veerle van Eynden, UKDA
• Applying NERC's Data Value Checklist - Sam Pepler, British Atmospheric Data Centre
• Discussion
4. Guidelines clarify expectations
…adapted by:
• Archaeology Data Service
• NERC
• KAPTUR
• University of Leicester
What criteria will be used to judge what’s handed over?
5. Basic model
1. Define a policy, i.e. criteria and range of decisions
2. Archive manager applies criteria, involving researchers
3. Select the significant, dispose of the rest
[Diagram labels: All data; 10%; 90%]
For records, yes, but research data?
6. Characterising research data…
• Research process more uncertain and open-ended than admin processes
• Research data purpose may change before complete
• More effort to make reusable - complex inter-relationships, and richer contexts to document
• Originators should be engaged but may not have capacity, e.g. if project funding has ceased
• Others may need to be involved with broader view of potential in other disciplines
• More than keep/dispose choice – need to prioritise attention and effort to make data fit for reuse
7. Triage analogy
First characterise research data
Criteria:
• Duty of care
• Reuse value
• Quality and condition
• Discoverability
• Accessibility
• Costs associated
• Potential to automate?
Prioritise:
• High reuse value + needs attention, affordable
• Other permutations
• More permutations
• Low reuse value, unaffordable
Deposit location:
• Institutional Data Repository
• Data Centre
• Subject Repository etc.
Tiered approach to deploying resources:
• Access management
• Storage performance
• Preservation actions
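The triage step on this slide can be sketched as a small rule-based function. This is only an illustration under stated assumptions: the 0 to 5 scoring scales, the thresholds, the budget parameter and the tier labels are all hypothetical and do not come from the slides or from any DCC tool.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    reuse_value: int      # hypothetical scale: 0 (low) to 5 (high)
    condition: int        # hypothetical scale: 0 (poor) to 5 (fit for reuse)
    annual_cost: float    # estimated yearly cost of retention
    duty_of_care: bool    # e.g. a funder or legal obligation to retain


def triage(ds: Dataset, budget: float) -> str:
    """Assign one dataset to a priority tier (illustrative rules only)."""
    affordable = ds.annual_cost <= budget
    high_value = ds.reuse_value >= 4   # assumed threshold
    low_value = ds.reuse_value <= 1    # assumed threshold

    if ds.duty_of_care:
        return "retain: duty of care"
    if high_value and affordable:
        # Effort goes first to valuable data that is not yet fit for reuse.
        if ds.condition < 3:
            return "priority: needs attention"
        return "retain: ready for reuse"
    if low_value and not affordable:
        return "low priority: candidate for disposal"
    # Every other combination is left for human review.
    return "review: other permutation"


if __name__ == "__main__":
    examples = [
        Dataset("survey responses", 5, 1, 100.0, False),
        Dataset("scratch outputs", 1, 2, 900.0, False),
        Dataset("clinical records", 2, 4, 700.0, True),
    ]
    for ds in examples:
        print(ds.name, "->", triage(ds, budget=500.0))
```

The rules only separate the two clear-cut corners named on the slide; the "other permutations" are deliberately returned for review rather than decided automatically.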
9. e.g. Data Centre Collection Policies
“The ADS expects to collect all of the following archaeological data types…”
http://archaeologydataservice.ac.uk/advice/collectionsPolicy
10. Costs should persuade us
IDC Digital Universe Study: increasing volumes outpace declining storage hardware costs
Source: John Gantz and David Reinsel (2011), Extracting Value from Chaos
http://www.emc.com/digital_universe
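The arithmetic behind "volumes outpace declining costs" can be sketched as follows. The growth and price-decline rates are hypothetical inputs chosen for illustration, not figures from the IDC study: spend rises whenever (1 + volume growth) × (1 − price decline) is greater than 1.

```python
def relative_storage_cost(years: int, volume_growth: float, price_decline: float) -> float:
    """Cost of storing one year's data output, relative to year 0.

    volume_growth: fractional yearly growth in data volume (0.5 means +50%).
    price_decline: fractional yearly fall in cost per unit stored (0.25 means -25%).
    """
    yearly_factor = (1 + volume_growth) * (1 - price_decline)
    return yearly_factor ** years


if __name__ == "__main__":
    # Hypothetical rates: volume grows 50% a year, unit price falls 25% a year.
    print(round(relative_storage_cost(5, 0.5, 0.25), 2))  # 1.8
    # Hypothetical rates where falling prices win: +20% volume, -25% price.
    print(round(relative_storage_cost(5, 0.2, 0.25), 2))  # 0.59
```

With the first pair of rates the yearly factor is 1.125, so spend keeps rising even though each unit of storage gets cheaper.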
11. We can’t afford it all
“Keeping 2018’s data in S3 would
cost the entire global GDP”
http://blog.dshr.org/2012/05/lets-just-keep-everything-forever-in.html
12. Selection presumes description
• You can’t value what you don’t know about!
• Researchers can’t afford NOT to spend effort
on minimal metadata description and
organisation, because costs of retention will
be much higher if they don’t
• Description makes data affordable – is citation
potential a concrete enough reward?
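A check for "minimal metadata description" might look like the sketch below. The field list is a hypothetical minimal set chosen for illustration; the slides do not prescribe one, and a real service would follow its own repository's or funder's schema.

```python
# Hypothetical minimal description; not a standard schema.
REQUIRED_FIELDS = ("title", "creator", "date", "description", "location")


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or blank in a dataset record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


if __name__ == "__main__":
    record = {"title": "Core samples", "creator": "A. Researcher", "date": "2012"}
    print(missing_fields(record))  # ['description', 'location']
```

A dataset with an empty result is at least known about, which is the precondition for valuing it at selection time.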
13. Challenges
• Identify what datasets are created and where they are
• Differentiate those that are of high value from those with the most uncertainty or the least reusability
• Be able to justify ‘natural’ wastage of low priority data as much as deliberate selection of high value
14. Questions
• What has worked / is working?
• What lessons have you learned and how generalisable are they?
• What challenges remain?
• How may they be approached and what do you intend to do?
• What DCC / MRD activity do you think may help make the challenge more tractable?