OpenNebulaConf 2016 - Provisioning Flexible and High Available Climate Data Services by Marco Mancini, CMCC


The Euro-Mediterranean Center on Climate Change (CMCC) Foundation is a non-profit research institution that manages and promotes scientific and applied activities in the field of international climate change research. This talk presents CLIMA, the climate information management platform recently developed at the CMCC Supercomputing Center. The platform is based on iRODS, open-source data management software that provides features such as data discovery, automated data workflows, secure collaboration, and data virtualization. The main goal of CLIMA is to provide climate data services, such as data portals, data delivery, and data analytics, that are provisioned through an OpenNebula private cloud using features such as OneFlow and OneGate. Moreover, CLIMA can provision highly available climate data services by using a hybrid cloud approach based on the federation of OpenNebula and iRODS zones defined on-premise and on Amazon AWS.
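
As a rough illustration of how a multi-tier data service might be described for OneFlow, the sketch below builds a service template as a Python dictionary and derives the order in which roles would start under the `straight` deployment strategy, where a role is deployed only after its parent roles. The role names (`irods`, `solr`, `thredds`, `portal`), the service name, and the VM template IDs are assumptions chosen for illustration; they are not the templates used in CLIMA.

```python
import json

# Hypothetical service template: role names and vm_template IDs are
# illustrative assumptions, not CLIMA's actual configuration.
service_template = {
    "name": "climate-data-service",
    "deployment": "straight",  # roles start only after their parents
    "roles": [
        {"name": "irods", "cardinality": 1, "vm_template": 10},
        {"name": "solr", "cardinality": 1, "vm_template": 11,
         "parents": ["irods"]},
        {"name": "thredds", "cardinality": 2, "vm_template": 12,
         "parents": ["irods"]},
        {"name": "portal", "cardinality": 1, "vm_template": 13,
         "parents": ["solr", "thredds"]},
    ],
}


def deployment_order(template):
    """Return role names in an order where every role follows its parents."""
    roles = {r["name"]: r.get("parents", []) for r in template["roles"]}
    ordered, done = [], set()
    while len(ordered) < len(roles):
        # Roles whose parents have all been placed already.
        ready = [name for name, parents in roles.items()
                 if name not in done and all(p in done for p in parents)]
        if not ready:
            raise ValueError("cyclic or unknown parent in role definitions")
        for name in ready:
            ordered.append(name)
            done.add(name)
    return ordered


if __name__ == "__main__":
    print(json.dumps(service_template, indent=2))
    print(deployment_order(service_template))
```

The JSON printed by the script is the kind of document that could be registered as a service template; the ordering helper only mirrors the parent-before-child rule locally and does not talk to OpenNebula.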

Published in: Technology

  1. 1. Marco Mancini, Ph.D. Senior Scientist - Advanced Scientific Computing Division CTO – Supercomputing Center http:// @marcomancini72 https:// Provisioning Flexible and High Available Climate Data Services
  2. 2. About CMCC •  CMCC is a non-profit research institution (a Foundation since 10 December 2015) •  Established in 2005, with the financial support of the Ministry of Education, University and Research (MIUR), the Ministry of the Environment and Protection of Land and Sea (MATT), the Ministry of Agricultural and Forestry Policies (MIPAF) and the Ministry of Finance (MEF) •  CMCC's mission is to investigate and model our climate system and its interactions with society and the environment, to guarantee reliable, rigorous, and timely scientific results that stimulate sustainable growth, protect the environment, and support science-driven adaptation and mitigation policies in a changing climate. •  6 Consortium Members: National Institute of Geophysics and Volcanology (INGV); University of Salento; Italian Aerospace Research Center (CIRA S.c.p.a); Ca' Foscari University of Venice; University of Tuscia; University of Sassari. •  8 Research Divisions: ASC, CSP, ECIP, IAFES, ODA, OPA, RAAS, REMHI •  1 Supercomputing Center with HPC and Storage facilities
  3. 3. The big challenge is to model this complex system •  Several complex processes to be simulated •  Several interacting processes •  Great range of time scales to be analyzed •  Great range of spatial scales to be considered •  Need interdisciplinary sciences (physics, chemistry, biology, geology, …) •  Inherently non-linear governing equations •  Need sophisticated numerics •  Need huge computational resources •  …and large volumes of data can be produced. Warren M. Washington – NCAR. Scientific Grand Challenges Workshop Series: Challenges in Climate Change Science and the Role of Computing at the Extreme Scale, DOE Workshop (ASCR-BER), November 6-7, 2008
  4. 4. CLIMA: CMCC information LIfecycle Management plAtform. High Performance Computing; Analysis and Visualization; Sharing and Publication; Archiving and Retrieval. Objectives •  Enforcing Data Policies •  Optimizing Storage Cost •  Improving Data High Availability •  Robust Implementation of Operational Chains •  Easing Search & Discovery, Data Sharing and Collaboration. Federation of Data Services
  5. 5. CLIMA Data Service: Ingestion; Operational Chains; Data Access; Portal Gateway; Search & Discovery; Data Management. iRODS is open-source data management software: •  Virtualization •  Data Discovery •  Workflow Automation •  Data Sharing. Solr is an open-source enterprise search server that provides faceted navigation, clustering, grouping, and other search features. THREDDS is a data access server that provides bulk file transfer, remote access, subsetting, and web map services
  6. 6. [Architecture diagram] Physical Resources: Servers, Disks, Networking (VLAN). Storage Service; Compute & Networking Service: Storage, Networking, Virtualization, Authentication. Multi-tier Infrastructure Orchestration (VMs) with ONEFLOW; Multi-tier Service Orchestration (Containers); Multi-tier Application: Provisioning - Scaling - Self Healing. Data service components: Portal Gateway, Workflow Automation, Operational Chains, Data Access, Search & Discovery, Data Management, Ingestion. CLIMA Rest Engine; CLIMA Backend
  7. 7. Create Data Service (ONEFLOW, CLIMA Backend): Create Environment; Create API Key; Create Registration Token; Create OneFlow Service Template; Instantiate OneFlow Service Template; Create S3 Bucket; Instantiate Rancher Stack; Create Container Volumes
  8. 8. Data Service OneFlow Template Data Service Rancher Stack
  9. 9. High Available Data Services [Diagram] Amazon EC2; Amazon S3; ONEFLOW (two instances); VPN; VMs; Federation + File Replication; Cross Data Center Replication; Federation; Slave Zone; Master Zone
  10. 10. Current/Future Works ONEFLOW
  11. 11. Thank you.
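
The "Create Data Service" steps on slide 7 can be sketched as a simple sequential workflow. This is only an illustration under stated assumptions: the step order follows the order in which the labels appear in the transcript, which may differ from the order on the original diagram, and the handlers are hypothetical stubs that call no OpenNebula, Rancher, or AWS API.

```python
from typing import Callable, Dict, List

# Step labels taken from slide 7, in transcript order (the real order may differ).
STEPS: List[str] = [
    "Create Environment",
    "Create API Key",
    "Create Registration Token",
    "Create OneFlow Service Template",
    "Instantiate OneFlow Service Template",
    "Create S3 Bucket",
    "Instantiate Rancher Stack",
    "Create Container Volumes",
]


def run_workflow(handlers: Dict[str, Callable[[dict], None]],
                 context: dict) -> List[str]:
    """Run each step in order; stop at the first failure.

    Returns the list of completed step names. A failing step raises
    RuntimeError naming the step and what had completed before it.
    """
    completed: List[str] = []
    for step in STEPS:
        handler = handlers.get(step)
        if handler is None:
            raise KeyError(f"no handler registered for step: {step}")
        try:
            handler(context)
        except Exception as exc:
            raise RuntimeError(
                f"step '{step}' failed after {completed}") from exc
        completed.append(step)
    return completed


def make_stub(step: str) -> Callable[[dict], None]:
    """Hypothetical placeholder handler: it only records that the step ran."""
    def stub(context: dict) -> None:
        context.setdefault("log", []).append(step)
    return stub


stub_handlers = {step: make_stub(step) for step in STEPS}

if __name__ == "__main__":
    ctx: dict = {}
    for name in run_workflow(stub_handlers, ctx):
        print("done:", name)
```

Keeping the steps in one ordered list and passing a shared `context` dictionary lets a later step (for example, instantiating a stack) read values that an earlier step stored, such as a key or token.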