DEEP-Hybrid-DataCloud has received funding from the European Union’s Horizon 2020
research and innovation programme under grant agreement No 777435.
DEEP general presentation
Brief project overview
Jesús Marco de Lucas, Álvaro López García
{marco,aloga}@ifca.unican.es
Spanish National Research Council
EOSC-HUB Week
Malaga
16-20 April 2018
DEEP-HybridDataCloud: context
• H2020 project, EINFRA-21 call
• Topic: Platform-driven e-Infrastructure towards the European Open Science Cloud
• Scope: Computing e-infrastructure with extremely large datasets
• DEEP-HybridDataCloud: Designing and Enabling E-Infrastructures for intensive data Processing in a Hybrid DataCloud
• Started as a spin-off project (together with XDC) from INDIGO-DataCloud technologies
• Submitted in March 2017
• Started on November 1st, 2017
• Grant agreement number 777435
• Global objective: Promote the use of intensive computing services by different research communities and areas, and their support by the corresponding e-Infrastructure providers and open source projects.
DEEP consortium
• Balanced set of partners
• Strong technological background on development, implementation, deployment and operation of federated e-Infrastructures
• 9 academic partners
    • CSIC, LIP, INFN, PSNC, KIT, UPV, CESNET, IISAS, HMGU
• 1 industrial partner
    • Atos
• 6 countries
    • Spain, Italy, Poland, Germany, Czech Republic, Slovakia
DEEP project objectives
• Focus on intensive computing techniques for the analysis of very large datasets, considering demanding use cases
• Evolve intensive computing services that exploit specialized hardware up to production level
• Integrate intensive computing services under a hybrid cloud approach
• Define a “DEEP as a Service” solution to offer an adequate integration path to developers of final applications
• Analyse the complementarity with other ongoing projects targeting added-value services for the cloud
DEEP pilot use cases
• Deep learning
    • Pilot cases: stem cells, biodiversity applications, medical imaging
    • Provide a general, distributed architecture and pipeline to train deep learning (and other) models
• Post-processing
    • Pilot cases: post-processing of HPC simulations
    • Flexible pipeline for the analysis of simulation data generated at HPC resources
• On-line analysis of data streams
    • Pilot case: intrusion detection systems
    • Provide an architecture able to analyze massive on-line data streams together with historical records
INDIGO Components and evolution (I)
• INDIGO Orchestrator
    • Hybrid support on multiple sites
    • Support for specialized computing hardware
• Infrastructure Manager
    • Hybrid cloud support involving specialized computing hardware
• uDocker
    • Support for GPUs and specialized hardware to be further developed
• Cloud Information System
    • Missing information about accelerators or specialized hardware at a provider
    • React faster to changes in the infrastructure (faster publication and propagation of information)
INDIGO Components and evolution (II)
• OpenStack/OpenNebula: extensions needed to properly support accelerators: improving scheduling strategies, easier configuration and improved documentation
• PaaS layer: support for specialized computing hardware
• Docker: container technology for applications
• LXC: alternative hypervisor
• Ansible: contextualization and configuration tool, further development of modules
• INDIGO Virtual Router: improvements to reach production level
DEEP work programme
• Plan and requirements (Nov 2017 – Jan 2018)
• Initial design (Feb – Apr 2018)
• First prototype (May – Oct 2018)
    • In-situ integration meeting to conclude the integration of the first testbed prototype, supporting at least two initial pilot applications
• Second prototype
    • Improvement of design and proposed solutions (first quarter of 2019)
    • Integration towards a “second prototype” (mid 2019)
• Full Pilot testbed
    • Integration of all the Pilot applications and their tuning for high performance
• Promotion and exploitation (2020)
    • Improve the support and final quality of the solutions
    • Promote the exploitation in the EOSC framework, following the integration activities
Contacts
Web page:
https://deep-hybrid-datacloud.eu
Email:
info@deep-hybrid-datacloud.eu
Twitter:
https://twitter.com/DEEP_eu
Thank you
Any Questions?
