Data management experiences in
the European projects context:
which lessons for us?
Claudio Cacciari, c.cacciari@cineca.it (Cineca)
FAIR data management
RDA National Event in Italy
Florence, Italy, 14-15 November 2016
This work is licensed under the Creative Commons Attribution 4.0 International License.
Cineca consortium
• Cineca is a not-for-profit Consortium of Universities.
Founded in 1969, today it is made up of 70 Universities,
six national Research Institutions, and the Italian
Ministry of Education, Universities and
Research (MIUR).
• It is the most powerful supercomputing center in Italy
devoted to scientific and industrial research, and one
of the most important worldwide. Cineca supports
the international scientific research community with
HPC and Big Data analysis, and it develops information
systems for university administration offices, for the
MIUR, and for companies, health care, and public
administration.
http://www.cineca.it
Cineca’s users
http://www.hpc.cineca.it/content/statistics-cineca-hardware-utilization-september-2016
Cineca’s storage resources
May 2015
Online storage will reach 20 Pbytes in 2017
Cineca’s scenario
• Projects: European projects, Italian projects
• FAIR principles
• Services & resources: Cloud, HPC, Big Data analytics,
long-term archiving, tape library, high-performance
data transfer
[Diagram: Cloud, HPC, plain fs, long-term archive]
Lowering the barrier
• In many cases the tools that support FAIR
principles are too complex for scientific
researchers.
• The Cloud infrastructure makes it possible to lower the
barrier by deploying more user-friendly tools
close to the data and by sharing the effort of
implementing and maintaining them.
EUHIT Portal
EuHIT
EuHIT is a consortium that aims at integrating cutting-edge European
facilities for turbulence research across national boundaries.
Data repository
EUDAT
A truly pan-European Infrastructure
EUDAT offers common data services,
supporting multiple research communities as
well as individuals, through a geographically
distributed, resilient network of 35 European
organisations
Our vision is to enable European
researchers and practitioners
from any research discipline to
preserve, find, access, and
process data in a trusted
environment, as part of a
Collaborative Data
Infrastructure
EUDAT Service Suite
http://www.eudat.eu/services
Conclusions 1
• There is immense heterogeneity in data
and metadata specifications across the different
disciplines, and often within each discipline too.
• If the communities of a scientific domain are not
able to converge, at least partially, towards
common standards, the data centers offering
Data and Computing services/resources cannot
fill the gap on their behalf.
Conclusions 2
• The data centers should offer services and
enforce policies which support FAIR principles
in a flexible way.
• Some scientific communities can/want to
comply with some recommendations, but not
others. The data services should allow the
community to improve its compliance
progressively.
Conclusions 3
• We see interest in making the data accessible
and findable.
• We see much less interest in making them
interoperable and re-usable.
Thanks!
