Where is the opportunity for libraries in the collaborative data infrastructure?

Susan Reilly
Project Manager
LIBER

susan.reilly@kb.nl
@skreilly

Contents
• About LIBER
• Some context
• What is the collaborative data infrastructure?
• Introducing the researcher to the CDI
• Introducing the CDI to the researcher
• Now and next?

LIBER: reinventing the library of the future

• Largest network of European research libraries: 450 in over 40 countries

Mission:

To provide an information infrastructure to enable research
in LIBER institutions to be world class

Key performance areas

• Scholarly communication and research infrastructures
• Reshaping the research library
• Advocacy

LIBER Projects

• Reshaping the research library
• Scholarly Communication & Research Infrastructure
• Advocacy

So why am I here?

• Reshaping the research library
• Scholarly Communication & Research Infrastructure
• Advocacy
• Collaborative data infrastructure

What is the collaborative data infrastructure
(scientific data infrastructure)?

…it’s about data

Not just the 20+ petabytes that the LHC at CERN
produces every year

Libraries in the data deluge

• Increasing amount of digitised and born digital content in libraries
• Increasing emphasis on open access publications and data: mandates, institutional repositories
• Demand for data management support

What is the collaborative data infrastructure?

“a broad, conceptual framework for how different
companies, institutes, universities, governments and
individuals would interact with the system – what types of
data, privileges, authentication or performance metrics
should be planned. This framework would ensure the
trustworthiness of data, provide for its curation, and
permit an easy interchange among the generators and
users of data”

Now and Next

• Authentication & authorisation
• New skills

Introducing the researcher to the CDI

• Current situation
• ODE & linking data to publications
• Demand for data management support
• Advocacy

Opportunities for data exchange (ODE)

• Identify, collate, interpret and deliver evidence of emerging best practices in sharing, re-using, preserving and citing data, the drivers for these changes and barriers impeding progress, in forms suited to each audience
• Audiences: policy makers, funders, infrastructure operators, data centres, data providers and users, libraries and publishers

Steps to creating the conditions for data sharing
• Understand data sharing today
• Collection of “success stories”, “near misses” and “honourable failures” in data sharing, re-use and preservation
• Data & scholarly communications
• Integrating data and publications
• Best practice in data citation
• New roles
• Identify drivers and barriers
• Interviews with stakeholders to seek consensus

Photo: "Bell", Noordewierweg 116, Amersfoort.

Hypotheses

“Without the infrastructure
that helps scientists manage
their data in a convenient
and efficient way, no
culture of data sharing will
evolve.”

Stefan Winkler-Nees
(German Research Foundation, DFG)

Hypotheses by Category

4. Attitudes
6. Policies
8. Infrastructure
10. DMPs, Citability
11. Dependency on discipline

The Data Publication Pyramid
(1) Data contained and explained within the article
(2) Further data explanations in any kind of supplementary files to articles
(3) Data referenced from the article and held in data centers and repositories
(4) Data publications, describing available datasets
(5) Data in drawers and on disks at the institute

The Pyramid’s likely short term reality:
(1) Top of the pyramid is stable but small
(2) Risk that supplements to articles turn into Data Dumping places
(3) Too many disciplines lack a community endorsed data archive
(4) Estimates are that at least 75% of research data is never made openly available

The Ideal Pyramid
(1) More integration of text and data, viewers and seamless links to interactive datasets
(2) Only if data cannot be integrated in article, and only relevant extra explanations
(3) Seamless links (bi-directional) between publications and data, interactive viewers within the articles
(4) More Data Journals that describe datasets, data management plans and data methods

Issues for researchers
• Researchers need somewhere to put data and make it safe for reuse
• Researchers need to control its sharing and access
• Researchers need the ability to integrate data and publication
• Researchers need to get credit for data as a first class research object
• Researchers need someone to pay for the costs of data availability and re-use

Library support for the researcher

Libraries and data centres must support…
• Availability: data as first class research object (publishing, persistent identification/citation of datasets)
• Findability: data description, metadata, standards documentation and retrieval
• Interpretability: proper documentation of data
• Re-usability: long-term data archiving including data curation and preservation

Implications for libraries

Level of integration: implication for library

• Data contained within the article: prepare for adequate preservation strategies
• Data published in supplementary files to articles: presentation and preservation mechanisms; persistent link
• Datasets referenced from the articles: citability of dataset; persistent link; perpetual access to dataset
• Data published independently from written publications (“data publication”): support publication process; curation of datasets; metadata and documentation
• Data in drawers and on disks at the institute: engage in data management planning

Demand for data management support

Advocacy

“Many researchers do not appear to see the value and
benefits of data citation. There is a gap, which could be
filled by libraries, in advocacy for data sharing, the use of
subject specific repositories, and best practice in data
citation. These, if filled, would increase the number of
researchers sharing and reusing data.”

http://www.alliancepermanentaccess.org/wp-content/plugins/download-monitor/downlo

Introducing the CDI to the researcher

• Scoping the researcher’s requirements
• Collaboration & policy development

The AAA Study: a research passport

“evaluate the feasibility of delivering an integrated
Authentication and Authorisation Infrastructure, AAI, to
help the emergence of a robust platform for access to and
preservation of scientific information within a Scientific
Data Infrastructure (SDI)”

Collaboration

“Networked science is on the rise, the researcher is no
longer working alone in his office, he is working virtually
with other researchers from around the world. For them it
is important that they can use the same software and
share and reuse the same content related objects, in a
trusted environment.”
Heike Neuroth, Head of Innovation, Goettingen State &
University Library

Use Cases

1. Creating Data
2. Processing Data
3. Sharing Data
4. Preserving Data
5. Multi-disciplinary Data Services
6. Analysing Data
7. Accessing Data
8. Accessing Experiments and Data

Requirements…
• Tracking of provenance, authenticity, integrity of the material
• Integration of researcher ID with institutional credentials
• Researchers’ self registration
• Securely linking researcher and data identifiers for tracking provenance
• Delegation of identity management to home institute
• Attribute provisioning for users participating in specific research projects managed by the specific research groups (VOs)
• Attribute aggregation
• Unification and homogenisation of identity federations’ attributes and agreed levels of assurance in order to facilitate authorisation
• Accreditation of trusted Identity Providers (IdPs), based on international standards, depending on the required level of assurance
• Entitlement management to minimise the occurrence of events where licence fees are paid twice unnecessarily (e.g., for access to scientific journals)

Legal Recommendations
Need to protect the user

Collaboration & policy development

• Policies for data sharing
• Values & Ecosystems
• Infrastructure & Technology
• Legal & Ethical
• Institutional Support

http://recodeproject.eu/

Now & next

What should our priorities be?

LIBER ten recommendations:
http://www.libereurope.eu/news/ten-recommendations-for-libraries-to-get-started-with-research-data

2. Collaborate

• Alliance for Permanent Access to the Record of Science in Europe Network (APARSEN)
  ◦ Look across the excellent work in digital preservation which is carried out in Europe and try to bring it together under a common vision

Trust! Sustainability! Usability! Access!

http://www.alliancepermanentaccess.org/

Thank you!

• Any questions?
