www.eudat.eu
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065
EGI - EUDAT interoperability efforts
DI4R Conference - Krakow
Michaela Barth caela@kth.se
Peter Gille petergil@kth.se
26/09/16
24 countries
CERN, EMBL-EBI
1 coordinating organization – EGI.eu
Slide thanks to Matthew Viljoen
Open Solutions
Slide thanks to Matthew Viljoen
Computing and Data Integration
EGI Open Data Platform enhances EGI/EUDAT services through developments in:
• FedCloud: EGI AppDB, Cloud Management Frameworks, hybrid clouds
• Data: federated data -> sharing across VMs, discoverability via DataHub
EGI AAI now based on OpenID Connect
Slide thanks to Matthew Viljoen
EUDAT in a Nutshell
● EUDAT offers a complete set of research data services, expertise and technology solutions to all European scientists and researchers. These shared services and storage resources are distributed across 15 European nations, and data is stored alongside some of Europe’s most powerful supercomputers.
● One of the main ambitions of EUDAT is to bridge the gap between research infrastructures and e-infrastructures through an active engagement strategy, using the communities that are in the consortium as EUDAT beacons and integrating others through innovative partnerships.
Supporting multiple research communities as well as individuals, through a geographically distributed, resilient network of 35 European organisations
B2 Service Suite
B2ACCESS
B2HANDLE
Interoperability
• Joint Access to Data and HPC Services
• Collaboration with Commercial Stakeholders
• Joint Access to Data, HTC and Cloud Computing Resources
Overall WP 7 objectives
● Ensure the interoperability of EUDAT with other public and private e-Infrastructures, lowering technical and policy barriers by piloting concrete use cases with user communities
● Provide European researchers and industries with seamless access to data and computing resources for cross-utilization use cases
● Pave the way towards the interoperability of e-Infrastructure tools and services beyond H2020
WP7 Task 7.2: Joint Access to Data, HTC and Cloud Computing Resources
• Connects data stored in the CDI to high-throughput and cloud computing resources provided through EGI
• Starting with concrete community pilots
• Aiming at a production cross-infrastructure service (integrating storage resources managed by EUDAT and computing resources available at EGI)
• EGI-EUDAT collaboration started in March 2016.
WP7 Task 7.2: Joint Access to Data, HTC and Cloud Computing Resources
Two subtasks:
• 7.2.1 Service Interfaces and Access Policies
  • focus on user authentication, standard transfer protocols and tools, execution of computational workflows and harmonization of access policies
• 7.2.2 Community Pilots and Coordination of Joint Calls for Proposals
  • coordinating the implementation of selected community pilots
  • initially intended to start with four community pilots (ICOS, ELIXIR, EISCAT-3D, BBMRI), now only two pilots (ICOS and EPOS)
  • organizing a Joint Call for Proposals to expand the pilot activity to further communities
Main objectives:
• provide end users with seamless access to an integrated infrastructure offering both EGI and EUDAT services
• pair data and high-throughput computing resources together
Expected Benefits
• Federated services through EGI paired with the existing EUDAT common data services:
  • computation on the EGI Federated Cloud and HTC
  • EUDAT services for transfer, syncing, sharing, staging and preservation of data
• Additional pooling effect: augmentation of services from existing communities using EGI/EUDAT
Harmonization on all levels
• Technical
  • Interoperability (e.g. workflow execution, data discoverability and provenance)
  • Authentication, Authorization and Identity (AAI) management
  • Combination of the respective service catalogues
• Policies
  • Access policy
  • Long-term perspective
  • Operational policies
• Operational
  • Operational tools, technologies, best practices
  • Security
  • SLAs
EGI/EUDAT AAI Interoperability
• Transparent access to EGI and EUDAT services:
  • Once a user has authenticated with EGI and EUDAT, (s)he should see the EGI and EUDAT services as if they were offered by a single infrastructure
• Breaking this down into smaller steps:
  • Allowing users to access EGI and EUDAT web services with the same credential
  • Allowing users to access EGI and EUDAT non-web services with the same credential
  • Attribute harmonisation
  • Enabling EGI services to delegate a user’s credential to EUDAT services and vice versa
  • Data privacy issues and policy harmonisation
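The attribute harmonisation step can be illustrated with a minimal Python sketch. It assumes that each infrastructure releases user attributes under its own names and that both sets are mapped onto one shared vocabulary. All attribute names below are hypothetical and are not taken from the EGI or EUDAT AAI specifications.

```python
# Hypothetical attribute names, for illustration only; the real EGI and
# EUDAT attribute profiles are not described in these slides.
EGI_TO_COMMON = {
    "egi_user_id": "user_id",
    "egi_mail": "email",
    "egi_groups": "groups",
}
EUDAT_TO_COMMON = {
    "eudat_uid": "user_id",
    "eudat_email": "email",
    "eudat_memberships": "groups",
}


def harmonise(attributes, mapping):
    """Rename the attributes of one infrastructure to the shared names.

    Attributes without an entry in the mapping are dropped, so a service
    only ever sees the agreed common set.
    """
    return {mapping[name]: value
            for name, value in attributes.items()
            if name in mapping}


egi_user = {"egi_user_id": "abc123", "egi_mail": "user@example.org",
            "egi_groups": ["icos"], "egi_internal_flag": True}
eudat_user = {"eudat_uid": "abc123", "eudat_email": "user@example.org",
              "eudat_memberships": ["icos"]}

# Both views of the same user collapse to one common record.
common_from_egi = harmonise(egi_user, EGI_TO_COMMON)
common_from_eudat = harmonise(eudat_user, EUDAT_TO_COMMON)
```

With such a mapping in place, a service on either side can make authorisation decisions on the common record without knowing which infrastructure authenticated the user.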
EGI/EUDAT AAI Interoperability
• Work done so far:
  • Gained an understanding of each other’s AAI layers
  • Broke the task down into smaller steps
  • Started a draft document
  • Enabled accounts for some feasibility tests
• Next step:
  • Complete the roadmap
User community pilots
• Work done so far:
  • EGI and EUDAT selected a set of relevant user communities (those already collaborating with both infrastructures)
  • Started collecting requirements from the user communities, together with their prioritization of those requirements (“Definition of the universal use case”) → the integration activity has been driven by the end users from the start!
  • The identified user communities are prominent European research infrastructures in the fields of Earth Science (EPOS and ICOS), Bioinformatics (BBMRI and ELIXIR) and Space Physics (EISCAT-3D).
Definition of the universal use case
• Covers the user needs with respect to the integration of the two infrastructures
• Permits a user of either e-infrastructure to instantiate a VM on the EGI Cloud Federation for the execution of a computational job consuming data preserved on EUDAT resources. The results of the analysis can be staged back to EUDAT storage and, if needed, assigned persistent identifiers (PIDs) for future use.
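The steps of the universal use case can be sketched as a small Python workflow. This is a sketch under stated assumptions: the functions below are hypothetical in-memory stand-ins, not EGI or EUDAT client APIs, and the identifier is a random placeholder rather than a real PID.

```python
import uuid

# In-memory stand-ins for the two infrastructures (illustrative only).
eudat_storage = {"icos/obs.csv": "1.0\n2.0\n3.0\n"}
pid_registry = {}


def start_vm(site):
    """Pretend to instantiate a VM on the EGI cloud."""
    return {"site": site, "local_files": {}}


def stage_in(vm, path):
    """Copy a data object from EUDAT storage to the VM's local storage."""
    vm["local_files"][path] = eudat_storage[path]


def run_job(vm, path):
    """Example computation: the mean of the values in the staged file."""
    values = [float(line) for line in vm["local_files"][path].split()]
    return sum(values) / len(values)


def stage_out(path, result, assign_pid=False):
    """Write the result back to EUDAT storage, optionally with a PID."""
    eudat_storage[path] = str(result)
    if assign_pid:
        # Placeholder identifier; a real PID would come from a PID service.
        pid = "hypothetical-prefix/" + uuid.uuid4().hex
        pid_registry[pid] = path
        return pid
    return None


vm = start_vm("example-site")
stage_in(vm, "icos/obs.csv")
mean = run_job(vm, "icos/obs.csv")
pid = stage_out("icos/obs_mean.txt", mean, assign_pid=True)
```

The four calls at the end mirror the order described in the use case: start a VM, stage the input in, compute, then stage the result out and register an identifier for it.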
Definition of the universal use case
• Demo at EGI Community Forum 2015 by Diego Scardaci
Definition of the universal use case
• Implementing all the steps of this use case requires two integration activities between the two infrastructures: (1) harmonisation of the authentication and authorisation models, (2) definition and implementation of the interfaces between the involved EGI and EUDAT services.
• The first phase of the implementation of this use case was demonstrated at the EGI Community Forum 2015 (Bari, IT). In addition, two pilot use cases (EPOS and ICOS) have been selected to drive the implementation and validate the results.
ICOS Carbon Portal use case
Slide thanks to Margareta Hellström
Background: ICOS wants to set up "on demand computing" facilities that allow users to perform calculations based on ICOS observational data.
ICOS Carbon Portal use case status
• Virtual machine with attached block storage instantiated in the EGI Federated Cloud
• Docker container hosting the web service and model computations running in the VM, with all input data copied to the local storage of the VM
• Data transfer between the VM and the B2STAGE instance (at KTH Stockholm) tested using the contextualization script provided by EGI
• Storing of ICOS data tested on the B2SAFE system at KTH
Next steps:
• ICOS data storage in B2SAFE (at KTH) and access via the B2STAGE service
• Access to common storage for several VMs (e.g. via the EGI DataHub)
• Robot certificates to allow for further automation of the workflow
• Load balancing to distribute computations and user requests across several VMs
• Prepare “focused” documentation, with a clear user perspective!
Slide thanks to Margareta Hellström
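The load balancing listed under the next steps could, in its simplest form, be a round-robin dispatcher. The Python sketch below is an assumption about one possible approach, not the ICOS design; the VM and job names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand each incoming request to the next VM in a fixed rotation."""

    def __init__(self, vms):
        if not vms:
            raise ValueError("at least one VM is required")
        self._rotation = cycle(vms)

    def assign(self, request):
        """Return a (vm, request) pair for the next VM in the rotation."""
        return next(self._rotation), request


balancer = RoundRobinBalancer(["vm-a", "vm-b", "vm-c"])
assignments = [balancer.assign(f"job-{i}") for i in range(5)]
```

Round robin ignores how long each computation runs; a production setup would more likely track the load per VM, which is beyond this sketch.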
Challenges encountered so far
When implementing prototypes of the EPOS and ICOS use cases:
• Scaling up from proof of concept to multiple users
• AAI interoperation
• Managing co-existing support systems and channels
• More user-friendly documentation of all services in their current state (not the final vision) is desirable
• Steep learning curve for the user communities in using e-infrastructure services
• 3rd party dependencies
• On the plus side: personal contacts highly appreciated
Next steps
• Prolong the tasks until March 2018
• Deliverable “12 months report”
• Continued implementation of use cases with result evaluation
• Joint call for proposals to expand the pilot activity
• Final report including recommendations for service development and access policy harmonization
For more info:
https://b2drop.eudat.eu
https://eudat.eu/services/userdoc/b2drop
https://b2share.eudat.eu
https://eudat.eu/services/userdoc/b2share
https://eudat.eu/services/userdoc/b2safe
https://eudat.eu/services/userdoc/b2stage
http://b2find.eudat.eu
https://eudat.eu/services/userdoc/b2find
http://b2access.eudat.eu
More on EUDAT
● EUDAT is a collaborative pan-European infrastructure that also provides training and consultancy for researchers, research communities, research infrastructures and data centres.
● EUDAT’s vision is to enable European researchers and practitioners from any research discipline to preserve, find, access, and process data in a trusted environment, as part of a Collaborative Data Infrastructure (CDI) conceived as a network of collaborating, cooperating centres, combining the richness of numerous community-specific data repositories with the permanence and persistence of some of Europe’s largest scientific data centres.
