The Evolving World of 
Research Data Management 
Options and Opportunities 
@MarkHahnel 
@figshare
“But taxpayers who are paying for that 
research will want to see something 
back. Directly – through open access 
to results and data. And indirectly – 
through making science work better 
for all of us. 
That’s why we will require open access 
to all publications stemming from EU-funded 
research. That’s why we will 
progressively open access to the 
research data, too. And why we’re 
asking national funding bodies to do 
the same.” 
Neelie Kroes. 
Vice President of the European Commission
“The Obama Administration is committed to the proposition that citizens deserve 
easy access to the results of scientific research their tax dollars have paid for. 
That’s why, in a policy memorandum released today, OSTP Director John 
Holdren has directed Federal agencies with more than $100M in R&D 
expenditures to develop plans to make the published results of federally funded 
research freely available to the public within one year of publication and 
requiring researchers to better account for and manage the digital data 
resulting from federally funded scientific research.” 
February 22nd 2013
“Investigators are expected to share with other researchers, at no more than 
incremental cost and within a reasonable time, the primary data, samples, physical 
collections and other supporting materials created or gathered in the course of 
work under NSF grants” 
http://www.nsf.gov/pubs/policydocs/pappguide/nsf11001/aag_6.jsp#VID4 
“NIH expects the timely release and sharing of data to be no later than the 
acceptance for publication of the main findings from the final dataset” 
http://grants.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm#time 
“NEH is committed to timely and rapid data distribution” 
http://www.neh.gov/files/grants/data_management_plans_2012.pdf
“Products of research are not just publications.” 
NSF senior policy specialist Beth Strausser. 
Chapter II.C.2.f(i)(c), Biographical Sketch(es), has been revised to rename the “Publications” 
section to “Products” and amend terminology and instructions accordingly. 
13 January 2013: “National Science Foundation’s Merit Review Criteria: Review and Revisions”
The Open Academic Tidal Wave
1. Recommended open access to scholarly papers of publicly funded research 
2. Recommended open access to all digital outputs of publicly funded research 
3. Mandated open access to scholarly papers of publicly funded research 
4. Mandated open access to all digital outputs of publicly funded research 
5. Enforced, mandated open access to scholarly papers of publicly funded research 
6. Enforced, mandated open access to all digital outputs of publicly funded research 
What is figshare? 
A cloud-based research data management system for academics and administrators: 
• Manage their research outputs privately and securely, with controlled collaborative spaces 
• Public repository of all research outputs from an institution, with impact and usage metrics
Promoting 
Sharing 
Managing 
Open Data 
Making it discoverable 
Storing it properly
Editing an item on figshare
Confidential item on figshare
Linked item on figshare
There are 109 metrics! 
‘Greater effort than expected: over 500 person hours’ 
‘A full audit would cost us 10,000 to 25,000 euro’s, a midterm review 5,000 to 10,000 euro’s. 
Every year such an effort would not be feasible and too costly’ 
‘The formulation of the metrics is a bit idealistic (“down to the bit level”)… since no archive 
is perfect, what will be the ‘less than perfect’ level (or levels for the different metrics), which 
is acceptable and deserves certification?’ 
Feedback from test audits 
http://www.alliancepermanentaccess.org/wp-content/uploads/downloads/2012/04/APARSEN-REP-D33_1B-01-1_0.pdf
4 Key Modules 
1. Research Data Management: private, controlled storage and collaborative spaces for every academic at the institution. 
2. Public Digital Research Repository: a customisable public portal with all digital files made public at an institutional, departmental and group level. 
3. Administrative Workflow Portal: a portal where administrators can manage curation of files to be made public, storage space allocation and user rights. 
4. Reporting Dashboard: impact and usage reporting.
Institutional API 
The figshare API allows you to push data to figshare, or pull data out. 
This allows you to build applications on top of your academics’ research.
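As a rough illustration of "push" and "pull", the sketch below prepares one request of each kind without sending it. The base URL, the `/articles` and `/account/articles` paths, and the `token` authorization header are assumptions modelled on figshare's public v2 API, not something stated on the slide; check the documentation at api.figshare.com before relying on them. The function names are hypothetical.

```python
import json
import urllib.request

# Assumption: base URL and paths follow figshare's public v2 API.
# Confirm them against the documentation at api.figshare.com.
BASE_URL = "https://api.figshare.com/v2"


def build_pull_request(page_size=10):
    """Prepare (but do not send) a GET request that lists public items."""
    url = f"{BASE_URL}/articles?page_size={page_size}"
    return urllib.request.Request(url, method="GET")


def build_push_request(title, token):
    """Prepare (but do not send) a POST request that creates a private item."""
    body = json.dumps({"title": title}).encode("utf-8")
    headers = {
        "Authorization": f"token {token}",  # assumed personal-token scheme
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        f"{BASE_URL}/account/articles", data=body, headers=headers, method="POST"
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from `send` means an institution's application can inspect or log exactly what it is about to push before any data leaves its own systems.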
• Incentivising compliance 
• Facilitating international collaboration 
• Integration into user workflows 
• Quantifying impact 
• Administrative curation layer 
• Embargo support 
• Open data principles 
• Citable – with DOIs 
• Increases impact of research 
• Trusted Repository 
• Persistent links 
• Heavyweight infrastructure
Persistent identifiers are essential 
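One way to see why a persistent identifier matters: a publisher URL can change, while the DOI inside it can always be resolved through doi.org. The sketch below, using hypothetical helper names, pulls the DOI out of a PLOS-style article URL (of the kind listed under "Publisher examples") and builds the resolver link. It assumes the URL path contains an `info:doi/` segment, which is a convention of those particular URLs rather than a general rule.

```python
from urllib.parse import unquote, urlsplit

DOI_RESOLVER = "https://doi.org/"


def doi_from_plos_url(url):
    """Extract the DOI from a URL whose path contains 'info:doi/<doi>'."""
    # Decode percent-escapes such as %3A (':') and %2F ('/') in the path.
    path = unquote(urlsplit(url).path)
    marker = "info:doi/"
    if marker not in path:
        raise ValueError("no DOI found in URL path")
    return path.split(marker, 1)[1]


def persistent_link(doi):
    """Build a resolver link that stays valid if the publisher URL changes."""
    return DOI_RESOLVER + doi


url = "http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1003094#s5"
print(persistent_link(doi_from_plos_url(url)))
# prints https://doi.org/10.1371/journal.pgen.1003094
```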
APIs are essential
Open Access is essential
Advocacy is essential
Institutions 
Generating the world’s knowledge
Thanks for your time. 
@markhahnel 
@figshare 
figshare.com 
api.figshare.com 
institutions.figshare.com 
mark@figshare.com
Publisher examples 
http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1003094#s5 
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0059671#s4 
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0059503#s5 
http://f1000research.com/articles/2-5/v1 
http://f1000research.com/articles/1-47/v1
Figshare’s positioning: the only player to support institutions all the way to the top of the hierarchy: ‘Active Data’ 
Figshare, Mendeley, Arkivum, ResearchGate, Dryad, EPrints, Fedora + Front End, Zenodo, LabArchives 
✓ ✓ no ✓ 
have the 
community 
✓ 
Needs 
developers. 
Files all stored 
as individual 
objects 
Can but don’t 
have a 
community of 
eyes on the 
system. 
Example of 
Missouri 
✓ ✓ 
✓ 
no no no no 
Can track use 
at level of 
article. 
No - needs 
manual 
intervention 
no no 
✓ ✓ no ✓ ✓ ✓ ✓ ✓ ✓ 
✓ No – focused 
on papers. 
None of the 
permanence 
✓ no 
✓ 
but not an 
institutional 
offer 
✓ 
Own servers 
so yes 
✓ 
because it’s on 
the institution’s 
servers 
No – as only a 
5 (2?) year 
funding plan 
Active Data 
Promoting 
Sharing 
Managing 
Open Data 
Making it discoverable 
• Advocacy – driving uptake of tools 
• Training for researchers 
• Incentives? 
• Facilitating international collaboration 
• Knowing the numbers: how many papers, how many citations, also for data 
• Allocation of space around the institution – e.g. 30 GB per user. User management 
• Having a rights system for access approval: CC0, CC BY, CC BY-NC, etc. 
• Configurable workflow? 
• Open data principles 
• Having data stored somewhere where – technically – it’s discoverable, i.e. not on hard drives 
• Ensuring metadata is attached within 12 months 
• Raw storage capacity 
• Security and backup 
• Persistent links 
• Storage for 10 years from last use (which must therefore be known) 
• Archiving for posterity 
Storing it properly no
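The requirement "storage for 10 years from last use (which must therefore be known)" implies that the system records a last-use date for every item. A minimal sketch of that arithmetic follows. The function names are hypothetical, and treating "last use" as a single recorded date is an assumption, not a description of how any listed product works.

```python
from datetime import date

RETENTION_YEARS = 10  # from the slide: "Storage for 10 years from last use"


def retention_deadline(last_used, years=RETENTION_YEARS):
    """Return the date until which an item must be kept."""
    try:
        return last_used.replace(year=last_used.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        return last_used.replace(year=last_used.year + years, day=28)


def may_be_deleted(last_used, today):
    """True only once the retention period has fully elapsed."""
    return today > retention_deadline(last_used)
```

Every new download or view would update `last_used` and push the deadline forward, which is why the usage tracking has to exist before the retention rule can be enforced.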
BLC & Digital Science: Mark Hahnel, Figshare
