Lxcloud
1. PES CERN's Cloud Computing Infrastructure
Tony Cass, Sebastien Goasguen, Belmiro Moreira, Ewan Roche,
Ulrich Schwickerath, Romain Wartel
Cloudview conference, Porto, 2010
See also related presentations:
HEPIX spring and autumn meeting 2009, 2010
Virtualization vision, Grid Deployment Board (GDB) 9/9/2009
Batch virtualization at CERN, EGEE09 conference, Barcelona
CERN IT Department
CH-1211 Genève 23
Switzerland
www.cern.ch/it
2. PES Outline and disclaimer
An introduction to CERN
Why virtualization and cloud computing?
Virtualization of batch resources at CERN
Building blocks and current status
Image management systems: ISF and ONE
Status of the project and first numbers
Disclaimer: We are still in the testing and evaluation phase. No final decision
has been taken yet on what we are going to use in the future.
All given numbers and figures are preliminary
3. PES Introduction to CERN
European Organization for Nuclear Research
The world’s largest particle physics laboratory
Located on the Swiss/French border
Founded in 1954; funded/staffed by 20 member states
With many contributors in the USA
Birthplace of the World Wide Web
Made popular by the movie “Angels and Demons”
Flagship accelerator: the LHC
http://www.cern.ch
4. PES Introduction to CERN: LHC and the experiments
Experiments: ATLAS, CMS, ALICE, LHCb, LHCf, TOTEM
Circumference of the LHC: 26 659 m
Magnets: 9300
Temperature: -271.3°C (1.9 K)
Cooling: ~60 t of liquid He
Max. beam energy: 7 TeV
Current beam energy: 3.5 TeV
5. PES Introduction to CERN
Data signal/noise ratio: 10⁻⁹
Data volume:
High rate × large number of channels × 4 experiments
→ 15 PetaBytes of new data each year
Compute power:
Event complexity × number of events × thousands of users
→ 100k CPUs (cores)
Worldwide analysis & funding:
Computing funded locally in major regions & countries
Efficient analysis everywhere
→ GRID technology
6. PES LCG computing GRID
Required computing capacity:
~100 000 processors
Number of sites:
T0: 1 (CERN), 20%
T1: 11 round the world
T2: ~160
http://lcg.web.cern.ch/lcg/public/
7. PES The CERN Computer Center
Disk and tape:
1500 disk servers
5PB disk space
16PB tape storage
Computing facilities:
>20,000 CPU cores (batch only)
Up to ~10,000 concurrent jobs
Job throughput: ~200,000 jobs/day
http://it-dep.web.cern.ch/it-dep/communications/it_facts__figures.htm
8. PES Why virtualization and cloud computing?
Service consolidation:
Improve resource usage by squeezing mostly unused machines
onto single big hypervisors
Decouple hardware life cycle from applications running on the box
Ease management by supporting live migration
Virtualization of batch resources:
Decouple jobs and physical resources
Ease management of the batch farm resources
Enable the computer center for new computing models
This presentation is about virtualization of batch resources only
9. PES Batch virtualization
CERN batch farm lxbatch:
~3000 physical hosts
~20000 CPU cores
>70 queues
Type 1:
Run my jobs in your VM
Type 2:
Run my jobs in my VM
10. PES Towards cloud computing
Type 3:
Give me my infrastructure
i.e. a VM or a batch of VMs
11. PES Philosophy
(SLC = Scientific Linux CERN)
Near future: the batch farm mixes physical SLC4 and SLC5 worker nodes with virtual SLC4 and SLC5 worker nodes running on a hypervisor cluster.
(Far) future? An internal cloud hosting batch, T0, development and other/cloud applications.
12. PES Visions beyond the current plan
Reusing/sharing images between different sites (phase 2)
HEPIX working group founded in autumn 2009 to define rules and
boundary conditions (https://www.hepix.org/)
Experiment specific images (phase 3)
Use of images which are customized for specific experiments
Use of resources in a cloud-like operation mode (phase 4)
Images directly join experiment-controlled scheduling systems (phase 5)
Controlled by the experiment
Spread across sites
13. PES Virtual batch: basic ideas
Virtual batch worker nodes:
Clones of real worker nodes, same setup
Mix with physical resources
Dynamically join the batch farm as normal worker nodes
Limited lifetime: stop accepting jobs after 24h (see the sketch below)
Destroy when empty
Only one user job per VM at a time
Note:
The limited lifetime allows for a fully automated system which dynamically adapts
to the current needs, and automatically deploys intrusive updates.
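A minimal sketch of this lifetime policy, assuming a small agent running inside each VM worker node; the LSF calls (badmin hclose, bjobs), the host naming and the self-shutdown are illustrative only, not the actual implementation:

```python
#!/usr/bin/env python
"""Sketch of the VM worker-node lifetime policy: accept jobs for at most
24 hours, run one user job at a time (enforced by the batch system), and
destroy the VM once it has drained. Illustrative only."""

import socket
import subprocess
import time

HOST = socket.gethostname()
ACCEPT_WINDOW = 24 * 3600   # stop accepting jobs after 24 h
POLL_INTERVAL = 60          # seconds between checks


def close_host():
    # Ask LSF to stop dispatching new jobs to this node.
    subprocess.call(["badmin", "hclose", HOST])


def running_jobs():
    # Count jobs still running on this node (output parsing is simplified).
    out = subprocess.Popen(["bjobs", "-r", "-u", "all", "-m", HOST],
                           stdout=subprocess.PIPE).communicate()[0]
    lines = out.decode("ascii", "replace").strip().splitlines()
    return max(len(lines) - 1, 0)   # drop the header line, if present


def destroy_vm():
    # The empty VM removes itself; the management system reclaims the slot.
    subprocess.call(["shutdown", "-h", "now"])


start = time.time()
closed = False
while True:
    if not closed and time.time() - start > ACCEPT_WINDOW:
        close_host()
        closed = True
    if closed and running_jobs() == 0:
        destroy_vm()
        break
    time.sleep(POLL_INTERVAL)
```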
14. PES Virtual batch: basic ideas, technical
Images:
Staged on the hypervisors
Master images; instances use LVM snapshots (see the sketch below)
Start with few different flavors only
Image creation:
Derived from a centrally managed “golden node”
Regularly updated and distributed to get updates “in”
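As a rough illustration of the snapshot step mentioned above (the volume group, image and instance names and the snapshot size are made up, not CERN's actual configuration), creating one VM instance disk from the staged master image amounts to a single copy-on-write snapshot:

```python
import subprocess

def snapshot_master_image(vg="vg_images", master="slc5_batch_master",
                          instance="vm042", cow_size="5G"):
    """Create a copy-on-write LVM snapshot of the staged master image for
    one VM instance. All names and the snapshot size are illustrative."""
    subprocess.check_call([
        "lvcreate", "--snapshot",
        "--name", instance,
        "--size", cow_size,          # space reserved for copy-on-write changes
        "/dev/%s/%s" % (vg, master),
    ])
    return "/dev/%s/%s" % (vg, instance)   # block device handed to the VM
```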
15. PES Virtual batch: basic ideas
Image distribution:
The only shared file system available is AFS
Prefer peer-to-peer methods (more on that later):
SCP wave
rtorrent
16. PES Virtual batch: basic ideas
VM placement and management system:
Use existing solutions
Testing both a free and a commercial solution:
OpenNebula (ONE)
Platform's Infrastructure Sharing Facility (ISF)
17. PES Batch virtualization: architecture
Architecture overview (grey: physical resources; coloured: different VMs):
Centrally managed golden nodes provide the images served by the VM kiosk
Job submission through CE/interactive nodes to the batch system management
VM worker nodes with limited lifetime run on the hypervisors / HW resources
A VM management system places the VMs on the hypervisors
18. PES Status of building blocks (test system)
Initial deployment: submission and batch management OK; hypervisor cluster OK; VM kiosk and image distribution OK; VM management system OK
Central management: submission and batch management OK; hypervisor cluster OK; VM kiosk and image distribution mostly implemented; VM management system: ISF OK, ONE missing
Monitoring and alarming: submission and batch management OK; hypervisor cluster switched off for tests; VM kiosk and image distribution missing; VM management system missing
19. PES Image distribution: SCP wave versus rtorrent
(Preliminary plot comparing SCP wave and BitTorrent/rtorrent distribution times; a few slow nodes are under investigation. BT = BitTorrent.)
20. PES VM placement and management
OpenNebula (ONE):
Basic model:
Single ONE master
Communication with hypervisors via ssh only
(Currently) no special tools on the hypervisors
Some scalability issues at the beginning (at around 50 VMs)
Addressing issues as they turn up
Close collaboration with developers, ideas for improvements
Managed to start more than 7,500 VMs
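For illustration only, a request for one such VM can be expressed as a small ONE template and submitted from the master with onevm create; the attribute values, disk path and network name below are invented examples, not the actual CERN setup:

```python
import subprocess
import tempfile

# Minimal OpenNebula VM template for one virtual batch worker node.
# All values (name, sizes, disk source, network) are invented examples.
TEMPLATE = """
NAME   = lxbatch-vm
CPU    = 1
MEMORY = 2048
DISK   = [ source = "/dev/vg_images/slc5_batch_master", target = "xvda" ]
NIC    = [ network = "batch" ]
"""

def inject_vm_requests(count):
    """Write the template to a file and ask the ONE master to instantiate
    'count' virtual machines from it."""
    tmpl = tempfile.NamedTemporaryFile(mode="w", suffix=".one", delete=False)
    tmpl.write(TEMPLATE)
    tmpl.close()
    for _ in range(count):
        subprocess.check_call(["onevm", "create", tmpl.name])

if __name__ == "__main__":
    inject_vm_requests(5)   # e.g. inject five VM requests
```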
21. PES Scalability tests: some first numbers
“One shot” test with OpenNebula:
Inject virtual machine requests
And let them die
Record the number of alive machines seen by LSF every 30s
(Plot: number of alive machines versus time; time axis in units of 1h05min.)
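The bookkeeping for such a test is straightforward; below is a sketch of the 30-second sampling loop, where the VM host naming convention and the exact bhosts output format are assumptions:

```python
import subprocess
import time

def alive_vm_count(prefix="vm"):
    """Count batch hosts whose name starts with the VM prefix and which LSF
    reports as alive ('ok' or 'closed'). Host naming and the exact bhosts
    output format are assumptions."""
    out = subprocess.Popen(["bhosts", "-w"],
                           stdout=subprocess.PIPE).communicate()[0]
    count = 0
    for line in out.decode("ascii", "replace").splitlines()[1:]:  # skip header
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith(prefix) \
                and fields[1].startswith(("ok", "closed")):
            count += 1
    return count

# Sample every 30 seconds, as in the "one shot" test described above.
while True:
    print("%s %d" % (time.strftime("%H:%M:%S"), alive_vm_count()))
    time.sleep(30)
```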
22. PES VM placement and management
Platform's Infrastructure Sharing Facility (ISF)
One active ISF master node, plus fail-over candidates
Hypervisors run an agent which talks to Xen
Needed to be packaged for CERN
Resource management layer similar to LSF
Scalability expected to be good but needs verification
Tested with 2 racks (96 machines) so far, ramping up
Filled with ~2,000 VMs so far (which is the maximum)
See: http://www.platform.com
23. PES Screen shots: ISF
VM status:
Status: 1 of 2 racks available
Note: one out of 2 racks enabled for demonstration purpose
24. PES Screen shots: ISF
25. PES Summary
Virtualization efforts at CERN are proceeding.
Still some work to be done.
Main challenges:
Scalability considerations, both of the provisioning systems and of the batch system
No decision yet on which provisioning system will be used
Reliability and speed of image distribution
General readiness for production (hardening)
Seamless integration into the existing infrastructure
26. PES Outlook
What's next?
Continue testing of ONE and ISF in parallel
Solve remaining (known) issues
Release first VMs for testing by our users soon
27. PES Questions?
28. PES Philosophy
Diagram: VM provisioning system(s) for high-level VM management, sitting on top of the hypervisor cluster (physical resources):
OpenNebula (ONE)
Platform's Infrastructure Sharing Facility (ISF)
pVMO
29. PES Details: Some additional explanations ...
“Golden node”:
A centrally managed (i.e. Quattor-controlled) standard worker node which:
Is a virtual machine
Does not accept jobs
Receives regular updates
Purpose: creation of VM images
“Virtual machine worker node”:
A virtual machine derived from a golden node
Not updated during its lifetime
Dynamically adds itself to the batch farm
Accepts jobs for only 24h
Runs only one user job at a time
Destroys itself when empty
30. PES VM kiosk and Image distribution
Boundary conditions at CERN
Only available shared file system is AFS
Network infrastructure with a single 1GE connection
No dedicated fast network (e.g. 10GE, IB or similar) that could be used for transfers
Tested options:
Scp wave:
Developed at Clemson University
Based on simple scp
rtorrent:
Infrastructure developed at CERN
Each node starts serving blocks it already hosts
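The SCP wave idea is a tree-shaped fan-out: every node that already holds the image becomes a source for the next round, so the number of simultaneous copies roughly doubles with each wave. A minimal sketch of that logic, with invented host names and paths and no error handling:

```python
import subprocess

IMAGE = "/var/lib/images/slc5_batch_master.img.gz"   # example path

def scp_wave(seed, targets, image=IMAGE):
    """Distribute 'image' from 'seed' to all 'targets' in waves: every host
    that has received the file becomes a source for the next wave, so the
    number of parallel copies roughly doubles each round."""
    sources = [seed]
    remaining = list(targets)
    while remaining:
        wave = []
        # Pair each available source with one pending target.
        for src in sources:
            if not remaining:
                break
            dst = remaining.pop(0)
            cmd = ["ssh", src, "scp", image, "%s:%s" % (dst, image)]
            wave.append((dst, subprocess.Popen(cmd)))
        # Wait for this wave; successful targets join the pool of sources.
        for dst, proc in wave:
            if proc.wait() == 0:
                sources.append(dst)

# Example (invented host names):
# scp_wave("hv001", ["hv%03d" % i for i in range(2, 20)])
```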
31. PES Details: VM kiosk and Image distribution
Local image distribution model at CERN
Virtual machines are instantiated as LVM snapshots of a base image. This process is very fast.
Replacing a production image:
Approved images are moved to a central image repository (the “kiosk”)
Hypervisors check regularly for new images on the kiosk node
The new image is transferred to a temporary area on the hypervisors
When the transfer is finished, a new LV is created
The new image is unpacked into the new LV
The current production image is renamed (via lvrename)
The new image is renamed to become the production image
Note: this process may need a sophisticated locking strategy.
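A condensed sketch of the swap step on one hypervisor is shown below; the volume group, image and lock-file names are examples, and the single flock is only a stand-in for the more sophisticated locking strategy mentioned in the note:

```python
import fcntl
import subprocess

VG = "vg_images"                       # example volume group name
PROD = "slc5_batch_master"             # production master image LV (example)
LOCKFILE = "/var/lock/vm-image-swap"   # hypothetical lock file

def swap_in_new_image(new_lv):
    """Replace the production master image with a freshly unpacked one:
    rename the current image aside, then rename the new LV to the
    production name (both via plain 'lvrename')."""
    lock = open(LOCKFILE, "w")
    fcntl.flock(lock, fcntl.LOCK_EX)    # serialise swaps on this hypervisor
    try:
        subprocess.check_call(["lvrename", VG, PROD, PROD + ".old"])
        subprocess.check_call(["lvrename", VG, new_lv, PROD])
    finally:
        fcntl.flock(lock, fcntl.LOCK_UN)
        lock.close()

# Example: swap_in_new_image("slc5_batch_master.new")
```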
32. PES Image distribution with torrent: transfer speed
(Preliminary plot: 7 GB compressed file, 452 target nodes; 90% of the transfers finished after 25 min; a few slow nodes are under investigation.)
33. PES Image distribution: total distribution speed
(Preliminary plot: 7 GB compressed file, 452 target nodes; all nodes done after 1.5 h.)
→ Unpacking still needs some tuning!