Open Science:
Research Data Management
April 2016
Wouter Haak
VP Research Data Management Solutions
ICSU/IAP/TWAS/ISSC: It is widely recognized that ‘data
re-use’ is not just a technology challenge, or something
that the funding bodies can just change by themselves
“Open Data in a Big Data world” – January 2016
Four major organisations representing global science, the International
Council for Science (ICSU), the InterAcademy Partnership (IAP),
The World Academy of Sciences (TWAS) and the International Social
Science Council (ISSC)
Backup
When you leave your institution…what happens with your data?
„Forschende und ihre Daten. Ergebnisse einer österreichweiten Befragung (eBook)“
E-infrastructures Austria
Bauer, B. (Bruno) et al.
Oct 2015
https://phaidra.univie.ac.at/detail_object/o:407736
Stays at institution
Take it with me
Don’t know
Data is lost
Other
Is your research data useful for others?
Frequently
Yes
No
„Forschende und ihre Daten. Ergebnisse einer österreichweiten Befragung (eBook)“
E-infrastructures Austria
Bauer, B. (Bruno) et al.
Oct 2015
https://phaidra.univie.ac.at/detail_object/o:407736
The 10 components for effective research data
10. Integrate upstream and downstream: make metadata to serve use.
Save
Share
Use
9. Re-usable (allow tools to run on it)
8. Reproducible
7. Trusted (e.g. reviewed)
6. Comprehensible (description / method is available)
5. Citable
4. Discoverable (data is indexed or data is linked from article)
3. Accessible
2. Preserved (long-term & format-independent)
1. Stored (existing in some form)
10. Integrate upstream and downstream: make metadata to serve use.
Funder mandates
9. Re-usable (allow tools to run on it)
8. Reproducible
7. Trusted (e.g. reviewed)
6. Comprehensible (description / method is available)
5. Citable
4. Discoverable (data is indexed or data is linked from article)
3. Accessible
2. Preserved (long-term & format-independent)
1. Stored (existing in some form)
Mandates are changing behavior – but only focused on the bottom of
the data pyramid
1. Research Data linking – linking articles to external datasets
http://www.sciencedirect.com/science/article/pii/S022352341500272X
2. Gold OA reviewed data journals, software journals, method
journals (collectively called “Research Elements”)
http://www.journals.elsevier.com/data-in-brief/
Direct submission (50%): Just publish and get credit for your data
Co-submission (50%): On top of the main article, attract attention for your data
Increasing from 100 journals to 250+ journals in November 2015
e.g.: MethodsX, Data in Brief, SoftwareX
2. Gold OA journal data / research elements articles growth
• Usage is in top 25% of open
access articles on ScienceDirect
[Figure: research elements articles per month, Jan to Dec, 2014 vs. 2015. Panels: Submitted; Accepted; Non-Elsevier competition*]
* predominantly: F1000Research, Nature Scientific Data, Codata, BMC Research Notes, GigaScience
3. Development Partnership (France) – Lab Data Tool: structure in the lab
www.hivebench.com
4. Manage, Store: Mendeley Data launched Dec 2015
https://data.mendeley.com/datasets/xz6gv65m6d/6
Linked to published papers (or not)
Versioning and provenance
Data citation (DataCite)
Researcher in control (embargo / visible)
5. Prototype: Research Data Search
http://demo-rdm-datasearch-1436039625.eu-west-1.elb.amazonaws.com/indexed#/
(prototype is protected by username and password; access upon request)
search for “rare geochemical ionic liquid” or “mantle calcium variation”
6. Datasets in Pure ("Research Data Sets") – Compliance!
• Register datasets and their related metadata in Pure
• Upload binary dataset files directly into Pure
• Link datasets to related projects, publications, awards and more for enhanced reporting capabilities
Provide transparency and comply with funders' requirements through the new 'Datasets' content type
The open RDM ecosystem: where do we stand now
[Diagram; labels as recovered from the slide]
Legend: Elsevier RDM Solutions; Non-Elsevier / open; Integrated
Status labels: M&A process; Prototype
Connector labels: Post; Index; Link
Data in the lab / ELN
Mendeley data repo (researcher)
Regular Elsevier journals
Other repositories
Review & curate data: data journals
Discoverability: Data/article linking program
Discoverability: Data search
The open RDM ecosystem: where do we go
=> Focus and complexity is in the integration & coverage (and less in the individual pieces)
[Diagram; labels as recovered from the slide]
Legend: Elsevier RDM Solutions; Non-Elsevier / open; Integrated
Connector labels: Post; Publish; Post data; Post data & supp material; Index; Link; Embed
Data in the lab / ELN
Other lab management tools
Mendeley data repo (researcher)
Journal data repository
Regular Elsevier journals
Other publishers/journals
Other repositories
Institutional data repository
Local institutional data repository
Review & curate data: data journals
Discoverability: Data/article linking program
Discoverability: Data search
Repository data search
Institutional data search
Domain data search
Compliance (e.g. Pure)
Measure outcomes (data use)
altmetrics, SciVal, Scopus
Scopus citations
Powered by ROS Social (profiles, awareness, dashboards)
Questions?
10. Integrate upstream and downstream: make metadata to serve use.
Save
Share
Use
9. Re-usable (allow tools to run on it)
8. Reproducible
7. Trusted (e.g. reviewed)
6. Comprehensible (description / method is available)
5. Citable
4. Discoverable (data is indexed or data is linked from article)
3. Accessible
2. Preserved (long-term & format-independent)
1. Stored (existing in some form)
