At the Euro-Mediterranean Center on Climate Change (CMCC), the Advanced Scientific Computing (ASC) Division carries out R&D activities on computational science applied to climate change. Activities focus on the management of large volumes of scientific data with open-source solutions addressing efficient access, analysis and mining, and on the optimization and parallelization of numerical models for climate change simulations on HPC architectures. In this talk, we will share our experience with OpenNebula by describing cloud-based deployment scenarios for the parallel data analytics service based on the CMCC Ophidia platform; two use cases, from the EU FP7 EUBrazil Cloud Connect project and the INTERREG OFIDIA project, will be presented. Moreover, we will show a system prototype based on OpenNebula for the orchestration of HPC Virtual Clusters, with some early results obtained on CMCC HPC resources.
Author Biography
Marco Mancini (male), Ph.D., is a Research Scientist at the Euro-Mediterranean Center on Climate Change. In 2013 he joined the Advanced Scientific Computing Division, where his main research activities concern the application of virtualization, cloud computing and software-defined storage technologies to the management of large scientific data volumes on distributed and HPC architectures. At CMCC, he has been involved in national and international projects such as TESSA, EUBRAZILCC and OFIDIA. In 2001 he received his Ph.D. in Computer Science and System Engineering from the University of Calabria. Afterwards, he was a Visiting Researcher at Argonne National Laboratory (Chicago, US) and at the University of Aachen (Germany), and an adjunct professor and research fellow at the Faculty of Engineering of the University of Calabria. He has also been a freelance ICT consultant for government and telecommunication companies. He is the author of several scientific papers in the fields of computational science, combinatorial optimization and high-performance computing. He is an IEEE and IEEE Computer Society member.
3. Super Computing Center (SCC) & Me!
Leading Research and Experimentation of Innovative Technologies (Virtualization, Cloud Computing, IoT) for HPC, Data Management & Operational Services
- G. Aloisio: Director
- M. Mancini: Chief Technology Officer
- O. Marra: Chief Operations Officer
- G. Calò: Senior Research Associate
- S. Raolil: Senior Research Associate
- M. Tridici: Senior Research Associate
- F. Martano: Junior Research Associate
- A. Pezzuto: Junior Research Associate
4. Outline!
- HPC Infrastructure and Architecture
- Private Cloud Infrastructure Deployment
- Cloud-based High Performance Computing & Services
5. IBM System X iDataPlex DX360M4
- 7712 cores, Intel Xeon Sandy Bridge 2.6 GHz
- 30.1 TB RAM (4 GB per core)
- 160 TFlops
- Infiniband 40 Gb/s
- Batch scheduler: IBM LSF 8
- 840 TB storage (raw space), DDN SFA10K
- 3 GPFS file systems: Home/Scratch/Archive
HSM & Backup Service (diagram: two servers with a shared resource group, i.e. a shared file system containing the Tivoli software, DB and logs, interconnected through redundant CISCO MDS 9500 Fibre Channel switches at 4 Gbps):
- 2 x IBM p520 servers in HACMP
- 1 x IBM DS4700 storage system
- 2 x Cisco MDS 9500 FC 4 Gbps switches
- 4 x TSM clients
- IBM TotalStorage TS3584 tape library: 7 frames, 3737 storage slots, 18 LTO4 drives, > 3 PB of capacity
- Tivoli Storage Manager v5.5 software
6. CMCC Private Cloud!
KVM Service Cluster: 3 x BTO System servers
- CPU: 2 x Intel Xeon E5-2630L v3, 1.8 GHz, 8 cores
- RAM: 64 GB DDR4-2133, ECC Reg.
- HD: 4 x Seagate 2.5", 600 GB, SAS 6 Gb/s, 64 MB cache, SED
- NIC: 1 x dual-port 10Gb, 1 x quad-port 1Gb

KVM/CEPH Big Data Cluster: 6 x IBM x3650M4 BigData servers
- CPU: 2 x Intel Xeon E5-2660 v2, 2.20 GHz, 10c/20t
- RAM: 256 GB DDR3-1600
- HD: 1 x 200 GB SSD, 2 x 1 TB SATA, 4 x 3 TB SATA
- NIC: 1 x dual-port 10Gb, 1 x quad-port 1Gb
- Ceph: 3 monitors, 24 OSDs (72 TB raw)

KVM SR-IOV HPC Cluster: 10 x HPC IBM System X iDataPlex dx360M4 servers
- CPU: 2 x Intel Xeon E5-2670, 2.60 GHz
- RAM: 64 GB DDR3-1600
- HD: 1 x 300 GB SAS
- NIC: 1 x dual-port IB 40Gb, 1 x dual-port 10Gb, 1 x dual-port 1Gb

Front-end: OpenNebula 4.12.1

Networks: management network; 10Gb network; 10Gb storage network; 40Gb Infiniband interconnection network; 40Gb Infiniband storage network.
Datastores: NFS image datastore; GPFS shared system datastore.
8. HPC Virtual Cluster!
• Infiniband virtualization (SR-IOV)
• OpenNebula KVM-SRIOV addon driver
• HPC enabled -> vCPU pinning and hugepages
Customized and isolated environments for operational chains in the framework of oceanography and meteorology.
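To make the pinning and hugepage point concrete, the sketch below shows one way such an HPC VM could be started programmatically. It is a minimal illustration only: the XML-RPC endpoint, credentials, template ID and pinning layout are placeholders, and it passes libvirt's <memoryBacking>/<cputune> elements through OpenNebula's RAW attribute rather than reproducing the site's actual driver configuration.

```python
# Minimal sketch (not the production setup): instantiate an HPC VM from an
# OpenNebula template via the XML-RPC API, merging an extra RAW/KVM section
# that requests hugepage backing and pins vCPUs to host cores.
# Endpoint, credentials, template ID and the pinning layout are illustrative.
import xmlrpc.client

ONE_ENDPOINT = "http://frontend.example.org:2633/RPC2"  # assumed front-end URL
SESSION = "oneadmin:opennebula"                          # "user:password" auth string
TEMPLATE_ID = 42                                         # hypothetical HPC VC template

# Libvirt fragment appended to the domain XML through RAW/DATA:
# <memoryBacking> enables hugepages, <cputune> pins vCPUs 0-3 to host cores 0-3.
raw_data = (
    "<memoryBacking><hugepages/></memoryBacking>"
    "<cputune>"
    + "".join(f"<vcpupin vcpu='{v}' cpuset='{v}'/>" for v in range(4))
    + "</cputune>"
)
extra_template = f'RAW = [ TYPE = "kvm", DATA = "{raw_data}" ]'

server = xmlrpc.client.ServerProxy(ONE_ENDPOINT)
# one.template.instantiate(session, template_id, vm_name, hold, extra_template)
rc = server.one.template.instantiate(
    SESSION, TEMPLATE_ID, "hpc-vc-node-0", False, extra_template)
print("XML-RPC reply:", rc)  # [success_flag, vm_id_or_error, error_code]
```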
9. Service Orchestration & Provisioning Tool!
Infrastructure Manager (IM) - http://www.grycap.upv.es/im/index.php - developed by the Universidad Politecnica de Valencia:
- automates VMI selection, deployment, configuration, software installation, monitoring and update of Virtual Appliances
- provides its functionality through XML-RPC and REST APIs
- uses the Resource and Application Description Language (RADL) to specify the requirements of the resources:
  - hardware specification (CPU, RAM, network)
  - software requirements (applications, software libraries, database systems)
  - configuration of OS and applications
- integrates Ansible as the contextualization system to enable the installation and configuration of all the user-required applications
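As a concrete illustration of the REST interface and RADL, the sketch below submits a small RADL description to an IM server to create an infrastructure. The IM endpoint, the OpenNebula credential line, the image URL and the header details are assumptions to be checked against the IM documentation, not the configuration actually used at CMCC.

```python
# Minimal sketch of driving IM's REST API from Python: describe a small VM in
# RADL and ask IM to create the infrastructure. Endpoint, auth line and image
# URL are placeholders; verify header and RADL fields against the IM docs.
import requests

IM_URL = "http://im.example.org:8800"  # assumed IM REST endpoint
AUTH = ("id = one; type = OpenNebula; host = frontend.example.org:2633; "
        "username = oneadmin; password = secret")  # illustrative auth line

RADL = """
network net (outbound = 'yes')
system node (
  cpu.count >= 2 and
  memory.size >= 4g and
  net_interface.0.connection = 'net' and
  disk.0.os.name = 'linux' and
  disk.0.image.url = 'one://frontend.example.org/7'
)
deploy node 1
"""

# POST the RADL to create a new infrastructure; IM returns its URI/ID.
resp = requests.post(f"{IM_URL}/infrastructures",
                     headers={"Authorization": AUTH,
                              "Content-Type": "text/plain"},
                     data=RADL)
resp.raise_for_status()
print("New infrastructure:", resp.text)
```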
10. EUBrazil Cloud Connect: a project funded under the second EU-Brazil coordinated call, providing a user-centric environment for the European and Brazilian research communities to test the execution of scientific applications on heterogeneous e-Infrastructures.
OFIDIA: build a cross-border operational fire danger prevention infrastructure and improve the ability of regional stakeholders to detect and fight forest wildfires. Research project co-funded by the European Territorial Cooperation Programme Greece-Italy 2007-2013 (JTS).
11. Big Data Analytics Cloud Deployment Scenarios!
• Deployment A: simple scenarios with low workload
• Deployment B: trade-off among scalability, performance and deployment simplicity
• Deployment C: the most performant and scalable scenario, suitable for big data analysis
• Ansible roles have been defined for the contextualization of the Ophidia components
• The Ophidia terminal (shell) has been extended with commands to deploy a data analytics cluster in the cloud environment by integrating the IM REST API (illustrated below)
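Building on the previous IM sketch, the snippet below suggests what such a terminal command could do behind the scenes: add extra analytics worker nodes to an already created infrastructure by posting additional RADL to it. It is a hypothetical helper, not Ophidia's actual code; the endpoint, infrastructure ID, system name and image URL are placeholders.

```python
# Minimal sketch of a "deploy/scale cluster" command wrapping the IM REST API:
# add N worker VMs to an existing infrastructure by POSTing extra RADL to it.
# All identifiers below are illustrative, not the real Ophidia integration.
import requests

IM_URL = "http://im.example.org:8800"
AUTH = ("id = one; type = OpenNebula; host = frontend.example.org:2633; "
        "username = oneadmin; password = secret")

def add_workers(inf_id: str, count: int) -> str:
    """Ask IM to deploy `count` extra worker VMs in infrastructure `inf_id`."""
    radl = f"""
system worker (
  cpu.count >= 4 and
  memory.size >= 8g and
  disk.0.os.name = 'linux' and
  disk.0.image.url = 'one://frontend.example.org/7'
)
deploy worker {count}
"""
    resp = requests.post(f"{IM_URL}/infrastructures/{inf_id}",
                         headers={"Authorization": AUTH,
                                  "Content-Type": "text/plain"},
                         data=radl)
    resp.raise_for_status()
    return resp.text  # URIs of the newly created VMs

# e.g. add_workers("a1b2c3", 3) from the shell command handler
```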
12. OFIDIA in a snapshot!
Wireless sensor networks:
- 6 wooded areas in the Apulia Region
- 2 weather stations of the Apulia Civil Protection in the Bari Province
- 10 wooded areas in the Epirus Region
Data analytics platform, visualization system, CMCC cloud infrastructure.
Deployed at the CMCC Private Cloud:
- Data Analytics Platform -> Ophidia Cluster
- WRF -> HPC Virtual Cluster
14. DU in Cloud!
Architecture diagram showing: proxies (including a CMCC proxy) with HA DNS for business continuity, backup & disaster recovery; download/upload staging areas; file services kept in sync; a SAS module; RadosGW object storage; a main DU service and a public backup DU service.
15. Work in progress & Future Activities!
- Provisioning of services as OneFlow services + Ansible configuration
- Elasticity by using OneGate hooks for OneFlow services
- HPC in Cloud +
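As a sketch of how OneGate-based elasticity could work, the snippet below shows a VM inside a OneFlow service publishing a custom metric through OneGate so that the service's elasticity rules can react to it. The environment variable names, the metric name and the scaling policy are assumptions, not the configuration planned at CMCC.

```python
# Minimal sketch of the OneGate side of elasticity (not the actual CMCC hooks):
# a VM in a OneFlow service pushes a custom load metric to OneGate, which the
# service's elasticity rules can then use to scale the role up or down.
# ONEGATE_ENDPOINT, ONEGATE_TOKEN and VMID are assumed to be exported by the
# VM contextualization; the attribute name APP_LOAD is illustrative.
import os
import requests

endpoint = os.environ["ONEGATE_ENDPOINT"]  # e.g. http://frontend:5030
token = os.environ["ONEGATE_TOKEN"]
vmid = os.environ["VMID"]

def report_load(load: float) -> None:
    """Publish the current application load as a VM user attribute."""
    resp = requests.put(f"{endpoint}/vm",
                        headers={"X-ONEGATE-TOKEN": token,
                                 "X-ONEGATE-VMID": vmid},
                        data=f"APP_LOAD = {load}")
    resp.raise_for_status()

# A cron job or monitoring hook inside the VM could call, for example:
# report_load(queued_jobs / running_workers)
```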