Eudat presentation nov2013


EUDAT general presentation, November 2013

Published in: Technology, Education
  • Project partners represent the data scientists in these consortia. EPOS: data and observatories for earthquakes, volcanoes and tectonics, based on sensor data. CLARIN: making language resources and technology usable. ENES: simulations of the climate system using HPC. LifeWatch: biodiversity research. VPH: biomedical modelling and simulation of the human body.

    1. EUDAT: A cross-disciplinary data infrastructure in Horizon 2020. Damien Lecarpentier, EUDAT Project Manager, CSC – IT Center for Science Ltd
    2. Data “Deluge”: exponential growth (gigabytes, terabytes, petabytes, exabytes, zettabytes) with increasing complexity and variety
        • Where to store it?
        • How to find it?
        • How to make the most of it?
    3. Synergies: If there are hundreds of Research Infrastructures, how many different data management systems can we sustain?
    4. Riding the Wave: Collaborative Data Infrastructure. A framework for the future? (Diagram labels: Trust, Data Curation, Data Generators, Users, Community Support Services, Common Data Services)
    6. Consortium
    7. Seven Research Communities on Board
        • EPOS: European Plate Observing System
        • CLARIN: Common Language Resources and Technology Infrastructure
        • ENES: Service for Climate Modelling in Europe
        • LifeWatch: Biodiversity Data and Observatories
        • VPH: The Virtual Physiological Human
        • INCF: International Neuroinformatics Coordinating Facility
        • DRIHM: Distributed Research Infrastructure for Hydrometeorology
    8. User Forums: + 25 communities. 1st User Forum: 7-8 March 2012, Barcelona
    9. Service Building Process: Takes time! Infrastructure coordination (resources, security, etc.). Reusing existing technologies and expertise rather than reinventing everything!
    10. Selected Services
        • Metadata Catalogue: aggregated EUDAT metadata domain; data inventory
        • PID: identity, integrity, authenticity, locations
        • Data Staging: dynamic replication to HPC workspace for processing
        • Safe Replication: data curation and access optimization
        • Simple Store: researcher data store (simple upload, share and access)
        • New services to come
          – EUDAT Box: Dropbox-like service; easy sharing, local syncing
          – Semantic Anno: checking and referencing
          – AAI: network of trust among authentication and authorization actors
          – Dynamic Data: immediate handling
    11. Safe Replication Service
        • Robust, safe and highly available data replication service for small- and medium-sized repositories
          – To guard against data loss in long-term archiving and preservation
          – To optimize access for users from different regions
          – To bring data closer to powerful computers for compute-intensive analysis
        • Diagram labels: PIDs, Policy rules, EUDAT CDI, Domain of registered data
    12. Data Staging Service
        • Support researchers in transferring large data collections from EUDAT storage to HPC facilities
        • Reliable, efficient, and easy-to-use tools to manage data transfers
        • Provide the means to re-ingest computational results back into the EUDAT infrastructure
        • Diagram labels: PRACE, HPC, EUDAT CDI, Domain of registered data
    13. Simple Store Service
        • Allow registered users to upload “long tail” data into the EUDAT store
        • Enable sharing objects and collections with other researchers
        • Utilise other EUDAT services to provide reliability and data retention
        • Diagram labels: Simple upload, Simple metadata, PID registration, EUDAT CDI, Domain of registered data
    16. Metadata Service
        • Easily find collections of scientific data, generated either by various communities or via EUDAT services
        • Access those data collections through the given references in the metadata to the relevant data stores
        • Europeana of scientific data
        • Diagram labels: EUDAT CDI, Domain of registered data
    18. Towards Horizon 2020: User-driven services, Sustainability, Trust, Synergy, Joint e-infrastructure roadmaps, Global collaboration
    19. A Network of Trusted Centers
        • Strong and sustainable generic data centers with existing trusted relationships
        • Each having a specific relationship with research communities
        • EUDAT is about providing solutions in a federated environment
        • Diagram labels: Generic data centres, Community data sites
    20. Bridging National and European solutions
        • Strong requirement from researchers and funders → Path to Sustainability
    21. EUDAT Priorities in H2020
        • Consolidation of Core Services
          – Increased performance, new functionalities, AAI, etc.
          – Develop tools and policies to facilitate usage: data management plans, licensing, training, etc.
          – Development of new services
        • Financial Sustainability
          – Cost and funding models
          – Framework and mechanisms for sharing resources across sites and across communities (juste retour, etc.)
        • Interoperability
          – E-Infrastructures → a joint roadmap?
          – National initiatives → service portfolios
          – RDA → EUDAT as a driver and implementer