The CERN OpenStack Cloud
Compute Resource Provisioning for the
Large Hadron Collider
Arne Wiebalck
for the CERN Cloud Team
OpenStack Days Germany
Cologne, DE
June 21, 2016
CERN: home.cern
• European Organization for Nuclear
Research (Conseil Européen pour la Recherche Nucléaire)
- Founded in 1954, today 21 member states
- World’s largest particle physics laboratory
- Located at Franco-Swiss border near Geneva
- ~2’300 staff members, >12’500 users
- Budget: ~1000 MCHF (2016)
• CERN’s mission
- Answer fundamental questions of the universe
- Advance the technology frontiers
- Train the scientists and engineers of tomorrow
- Bring nations together
The Large Hadron Collider (LHC)
Largest machine on earth: 27km circumference
LHC: 9’600 Magnets for Beam Control
1232 superconducting dipoles for bending: 14m, 35t, 8.3T, 12kA
LHC: Coldest Temperature
World’s largest cryogenic system: colder than outer space (1.9 K vs. 2.7 K), 120 t of helium
LHC: Highest Vacuum
Vacuum system: 104 km of pipes, 10^-10 to 10^-11 mbar (comparable to the pressure on the Moon)
LHC: Detectors
Four main experiments to study the fundamental properties of the universe
LHC: Data & Compute Growth
[Charts: projected growth across Run 1 to Run 4 — compute need per experiment plus GRID (in 10^9 HS06·sec/month) and data volume per experiment (ALICE, ATLAS, CMS, LHCb, in PB/year), both compared against what we can afford]
Collisions produce ~1 PB/s
LHC: World-wide Computing Grid
TIER-1:
permanent storage,
re-processing,
analysis
TIER-0 (CERN):
data recording,
reconstruction and
distribution
TIER-2:
simulation,
end-user analysis
> 2 million jobs/day
~350’000 cores
500 PB of storage
nearly 170 sites,
40 countries
10-100 Gb links
OpenStack at CERN
• In production: 4 clouds, >200K cores, >8,000 hypervisors
• 90% of CERN’s compute resources are now delivered on top of OpenStack
Agenda
• Service Overview
• Operations
• Performance
• Outlook
Cloud Service Context
• CERN IT’s mission is to enable the laboratory to fulfill its mission
- Main data center on the Geneva site
- Wigner data center in Budapest, 23 ms away
- Connected via two dedicated 100 Gbps links
• The CERN Cloud Service is one of the three major components of IT’s Agile Infrastructure (AI) project
- Policy: servers in CERN IT shall be virtual
• Based on OpenStack
- Production service since July 2013
- Performed (almost) 4 rolling upgrades since
- Currently in transition from Kilo to Liberty
- Nova, Glance, Keystone, Horizon, Cinder, Ceilometer, Heat, Neutron, Magnum
http://goo.gl/maps/K5SoG
CERN Cloud Architecture (1)
• Deployment spans our two
data centers
- 1 region (to have 1 API), ~40 cells
- Cells map use cases: hardware, hypervisor type, location, users, …
• Top cell on physical and virtual
nodes in HA
- Clustered RabbitMQ with mirrored queues
- API servers are VMs in various child cells
• Child cell controllers are OpenStack VMs
- One controller per cell
- Tradeoff between complexity and failure impact
CERN Cloud Architecture (2)
[Deployment layout]
• Top cell controller: nova-cells, rabbitmq
• API servers: nova-api
• Child cell controller (one per cell): nova-cells, nova-api, nova-scheduler, nova-conductor, nova-network, rabbitmq
• Compute nodes: nova-compute
• Shared DB infrastructure
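To make the top/child cell split concrete, the sketch below shows a minimal cells v1 configuration of the kind this layout implies; all names, hostnames and credentials are invented for illustration and this is not CERN’s actual configuration.

    # nova.conf on the top (API) cell controller -- hypothetical example
    [DEFAULT]
    compute_api_class = nova.compute.cells_api.ComputeCellsAPI
    [cells]
    enable = true
    name = top
    cell_type = api

    # nova.conf on a child cell controller -- hypothetical example
    [cells]
    enable = true
    name = child01
    cell_type = compute

    # Register the child cell with the top cell (RabbitMQ transport details assumed):
    # nova-manage cell create --name=child01 --cell_type=child \
    #   --username=guest --password=guest --hostname=child01-rabbit \
    #   --port=5672 --virtual_host=/ --woffset=1.0 --wscale=1.0

Each child cell keeps its own scheduler, conductor and message queue, so a failure there stays local to that cell, which is the complexity/failure-impact trade-off mentioned above.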
CERN Cloud in Numbers (1)
• ~5’800 hypervisors in production
- Majority qemu/kvm now on CERN CentOS 7 (~150 Hyper-V hosts)
- ~2’100 HVs at Wigner in Hungary (batch, compute, services)
- 370 HVs on critical power
• 155k Cores
• ~350 TB RAM
• ~18’000 VMs
• To be increased in 2016!
- +57k cores in spring
- +400kHS06 in autumn
CERN Cloud in Numbers (2)
• 2’700 images/snapshots
- Glance on Ceph
• 2’300 volumes
- Cinder on Ceph (& NetApp) in GVA & Wigner
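For orientation, “Glance and Cinder on Ceph” usually boils down to an RBD backend configuration like the following sketch; the pool and user names are assumptions, not CERN’s settings.

    # glance-api.conf -- hypothetical RBD image store
    [glance_store]
    stores = rbd
    default_store = rbd
    rbd_store_pool = images
    rbd_store_user = glance
    rbd_store_ceph_conf = /etc/ceph/ceph.conf

    # cinder.conf -- hypothetical RBD volume backend
    [DEFAULT]
    enabled_backends = ceph
    [ceph]
    volume_driver = cinder.volume.drivers.rbd.RBDDriver
    rbd_pool = volumes
    rbd_user = cinder
    rbd_ceph_conf = /etc/ceph/ceph.conf
    rbd_secret_uuid = <libvirt secret UUID>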
Only issue during 2 years in production: Ceph issue 6480 (Rally down)
Every 10 seconds a VM gets created or deleted in our cloud!
Agenda
• Cloud Service Overview
• Operations
• Performance
• Outlook
Software Deployment
• Deployment based on RDO
- Upstream, only patched where necessary (e.g. nova/neutron for CERN networks)
- A few customizations
- Works well for us
• Puppet for config management
- Introduced with the adoption of the Agile Infrastructure paradigm
• We submit upstream whenever possible
- openstack, openstack-puppet, RDO, …
• Updates done service-by-service over several months
- Running services on dedicated (virtual) servers helps
(Exception: ceilometer and nova on compute nodes)
• Upgrade testing done with packstack and devstack
- Depends on service: from simple DB upgrades to full shadow installations
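For the packstack part of the upgrade testing, a throw-away all-in-one installation is typically enough; a minimal sketch (the RDO repository package name and the exact re-run procedure are assumptions):

    # Deploy a disposable all-in-one cloud from RDO on a test node
    yum install -y centos-release-openstack-liberty openstack-packstack
    packstack --gen-answer-file=/root/answers.txt   # adjust the generated answers as needed
    packstack --answer-file=/root/answers.txt
    # then switch the repository to the next release and re-run the
    # installation/DB syncs to exercise the upgrade path before production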
‘dev’ environment
• Simulate the full CERN cloud environment
on your laptop, even offline
• Docker containers in a Kubernetes cluster
- Clone of central Puppet configuration
- Mock-up of
- our two Ceph clusters
- our secret store
- our network DB & DHCP
- Central DB & Rabbit instances
- One POD per service
• Change & test in seconds
• Full upgrade testing
Cloud Service Release Evolution
((*) = pilot)
• Upstream releases and the services run at CERN on each:
- ESSEX (5 April 2012): Nova (*), Swift, Glance (*), Horizon (*), Keystone (*)
- FOLSOM (27 September 2012): Nova (*), Swift, Glance (*), Horizon (*), Keystone (*), Quantum, Cinder
- GRIZZLY (4 April 2013): Nova, Swift, Glance, Horizon, Keystone, Quantum, Cinder, Ceilometer (*)
- HAVANA (17 October 2013): Nova, Swift, Glance, Horizon, Keystone, Neutron, Cinder, Ceilometer (*), Heat
- ICEHOUSE (17 April 2014): Nova, Swift, Glance, Horizon, Keystone, Neutron, Cinder, Ceilometer, Heat
- JUNO (16 October 2014): Nova, Swift, Glance, Horizon, Keystone, Neutron, Cinder, Ceilometer, Heat (*), Rally (*)
- KILO (30 April 2015): Nova, Swift, Glance, Horizon, Keystone, Neutron, Cinder, Ceilometer, Heat, Rally
- LIBERTY (15 October 2015): Nova, Swift, Glance, Horizon, Keystone, Neutron (*), Cinder, Ceilometer, Heat, Rally, Magnum (*), Barbican (*)
• CERN milestones:
- July 2013: CERN OpenStack Production Service
- February 2014: CERN OpenStack Havana Release
- October 2014: CERN OpenStack Icehouse Release
- March 2015: CERN OpenStack Juno Release
- September 2015: CERN OpenStack Kilo Release
- May 2016: CERN OpenStack Liberty upgrade ongoing
Rich Usage Spectrum …
• Batch service
- Physics data analysis
• IT Services
- Sometimes built on top of other
virtualised services
• Experiment services
- E.g. build machines
• Engineering services
- E.g. micro-electronics/chip design
• Infrastructure services
- E.g. hostel booking, car rental, …
• Personal VMs
- Development
… rich requirement spectrum!
(some flexibility is required)
Wide Hardware Spectrum
• The ~5’800 hypervisors differ in …
- Processor architectures: AMD vs. Intel (av. features, NUMA, …)
- Core-to-RAM ratio (1:2, 1:4, 1:1.5, ...)
- Core-to-disk ratio (going down with SSDs!)
- Disk layout (2 HDDs, 3 HDDs, 2 HDDs + 1 SSD, 2 SSDs, …)
- Network (1GbE vs 10 GbE)
- Critical vs physics power
- Physical location (GVA vs. Wigner)
- Network domain (LCG vs. GPN vs. TN)
- SLC6, CERN CentOS 7, RHEL7, Windows
- …
• Variety reflected/accessible via instance types, cells, projects … variety not necessarily visible to users!
- We try to keep things simple and hide some of the complexity
- We can react to (most of the) requests with special needs
Basic Building Blocks: Instance Types
Name         vCPUs   RAM [GB]   Ephemeral [GB]
m2.small         1      1.875               10
m2.medium        2      3.750               20
m2.large         4      7.5                 40
m2.xlarge        8     15                   80
m2.2xlarge      16     30                  160
m2.3xlarge      32     60                  320

~80 flavors in total (swap, additional/large ephemerals, core-to-RAM, …)
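For reference, a flavor like m2.medium can be recreated with the standard client as sketched below; whether the 20 GB counts as root disk or as extra ephemeral space is an assumption here.

    # Hypothetical recreation of the m2.medium flavor (RAM is given in MB)
    openstack flavor create m2.medium --vcpus 2 --ram 3840 --disk 20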
Basic Building Blocks: Volume Types
Name        IOPS   b/w [MB/s]   Feature            Location   Backend
standard     100           80                      GVA
io1          500          120   fast               GVA
cp1          100           80   critical           GVA
cpio1        500          120   critical/fast      GVA
cp2         n.a.          120   critical/Windows   GVA
wig-cp1      100           80   critical           WIG
wig-cpio1    500          120   critical/fast      WIG

m2.* flavor family plus volumes as basic building blocks for services
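The IOPS and bandwidth limits behind such volume types are typically expressed as a Cinder QoS spec associated with the type; a sketch for the “standard” type follows (the front-end consumer and the exact keys are assumptions, not CERN’s definitions).

    # Create the volume type and attach a QoS spec with the limits from the table
    cinder type-create standard
    cinder qos-create standard-qos consumer=front-end \
        total_iops_sec=100 total_bytes_sec=83886080     # 80 MB/s
    cinder qos-associate <qos-spec-id> <volume-type-id>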
Automate provisioning with
Automate routine procedures
- Common place for workflows
- Clean web interface
- Scheduled jobs, cron-style
- Traceability and auditing
- Fine-grained access control
- …
Procedures for
- OpenStack project creation
- OpenStack quota changes
- Notifications of VM owners
- Usage and health reports
- …
[Workflow example: compute node intervention]
• Disable compute node: disable nova-service, switch alarms OFF, update Service-Now ticket
• Notifications: send e-mail to VM owners
• Other tasks: post new message broker, add remote AT job, save intervention details, send calendar invitation
Operations: Retirement Campaign
• About 1’600 nodes to retire from the service by 3Q16
- ~1’200 from compute, ~400 with services (several thousand)
• We have gained quite some experience with (manual)
migration
- Live where possible and cold where necessary
- Works reliably (where it can)
• We have developed a tool that can be instructed to drain a hypervisor (or simply migrate given VMs)
- Main tasks are VM classification and progress monitoring
- The nova scheduler will pick the target (nova patch)
• We will use the “IP service bridging” to handle
CERN network specifics (VPLS)
- See this talk for more details
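The drain tool itself is internal, but the operations underneath are standard Nova ones; draining a hypervisor by hand looks roughly like the sketch below (host and server names are placeholders).

    # Stop scheduling new VMs onto the host, then migrate what is running there
    nova service-disable --reason "retirement campaign" retiring-host.cern.ch nova-compute
    nova live-migration <server-id>      # live where possible; the scheduler picks the target
    nova migrate <server-id>             # cold migration where live is not possible
    nova resize-confirm <server-id>      # confirm the cold migration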
Operations: Network Migration
• We’ll need to replace nova-network
- It’s going to be deprecated (really really really this time)
• New potential features (Tenant networks, LBaaS,
Floating IPs, …)
• We have a number of patches to adapt to the CERN network
constraints
- We patched nova for each release …
- … neutron allows for out-of-tree plugins!
Operations: Neutron Status
• We have two working Neutron cells …
- Neutron control plane in Liberty (fully HA)
- Bridge agent in Kilo (nova)
• … but there is still quite some work to do.
- IPv6, custom DHCP, network naming, …
• “As mentioned, there is currently no way to
cleanly migrate from nova-network to neutron.”
- All efforts to establish a general migration path failed so far
- Should be OK for us, various options (incl. in-place, w/ migration, …)
Operations: Containers
• Magnum: OpenStack project to treat Container
Orchestration Engines (COEs) as 1st class
resources
• Pre-production service available
- Supporting Docker Swarm and Kubernetes for now
- Adding Mesos
• Many users interested, usage ramping up
- GitLab CI, Jupyter/Swan, FTS, …
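As an illustration of the workflow, the sketch below uses the later OpenStackClient syntax for Magnum (the Liberty-era client used “baymodel”/“bay” commands instead); image, keypair, network and flavor names are assumptions.

    # Define a cluster template and create a small Docker Swarm cluster from it
    openstack coe cluster template create swarm-template \
        --image fedora-atomic --keypair mykey \
        --external-network ext-net --flavor m2.small --coe swarm
    openstack coe cluster create swarm-cluster \
        --cluster-template swarm-template --node-count 2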
Operations: Federated Clouds
[Diagram: clouds federated with the CERN private cloud]
• CERN Private Cloud: 160K cores
• ATLAS Trigger: 28K cores; ALICE Trigger: 9K cores; CMS Trigger: 13K cores
• Public clouds such as Rackspace or IBM
• INFN (Italy), Brookhaven National Labs, NecTAR (Australia)
• Many others on their way
• Access to EduGAIN users via Horizon
- Allow (limited) access given appropriate membership
Agenda
• Cloud Service Overview
• Operations
• Performance
• Outlook
IO Performance Issues
• Symptoms
- High IOwait
- Erratic response times
• User expectation
- Ephemeral ≈ HDD
• Mirrored spinning drives on the host
- 100-200 IOPS shared among guests
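The 100-200 IOPS shared among guests are easy to verify from inside a VM with a small random-read test; a sketch using fio (file path, size and run time are arbitrary):

    # 4k random reads against a file on the ephemeral disk, 30 s, queue depth 16
    fio --name=randread --filename=/var/tmp/fio.test --size=1G \
        --rw=randread --bs=4k --ioengine=libaio --direct=1 \
        --iodepth=16 --runtime=30 --time_based --group_reporting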
IO Performance: Scheduling
[Chart: IOwait over time while an IO benchmark runs, before and after switching the IO scheduler from ‘deadline’ to ‘CFQ’]
• ‘deadline’ is the default (on Red Hat’s virtual-guest tuned profile)
- prefers reads (under heavy load)
- can block writes for up to 5 seconds (e.g. the sssd DB update at login time)
• ‘CFQ’ is better suited for our use case
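Switching the elevator can be done per device on the fly, or persistently via a custom tuned profile derived from ‘virtual-guest’; a sketch (device name and profile layout assumed):

    # Check and change the IO scheduler for one disk at runtime
    cat /sys/block/sda/queue/scheduler        # e.g. "noop deadline [cfq]"
    echo cfq > /sys/block/sda/queue/scheduler

    # Persistent variant: /etc/tuned/virtual-guest-cfq/tuned.conf
    #   [main]
    #   include=virtual-guest
    #   [disk]
    #   elevator=cfq
    tuned-adm profile virtual-guest-cfq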
IO: Cinder Volumes at Work
[Chart: ATLAS TDAQ monitoring application; Y-axis: CPU % spent in IOwait. Blue: CVI VM (h/w RAID-10 with cache), Yellow: OpenStack VM. Marker: IOPS QoS changed from 100 to 500 IOPS]
Going to 1000 IOPS doesn’t help.
If the application is not designed to keep enough requests in-flight, more
IOPS do not reduce the time spent in IOwait.
We need to reduce the latency as well …
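Raising the ceiling from 100 to 500 IOPS amounts to editing the QoS spec associated with the volume type; a sketch (the spec ID is a placeholder, and the new limit typically only takes effect once the volume is re-attached):

    # Bump the front-end IOPS limit on the existing QoS spec
    cinder qos-list
    cinder qos-key <qos-spec-id> set total_iops_sec=500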
IO: Block-level Caching
• Single SSD to cache in front of the HDDs
- Candidates: flashcache, bcache, dm-cache, …
• We chose bcache:
- Easy to set up
- Strong error handling
• On a 4-VM hypervisor: from ~25-50 IOPS/VM to ~1k-5k IOPS/VM
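Putting the instance store on a bcache device is a short procedure; a sketch assuming one SSD caching one HDD-backed device (device names are examples, and udev is assumed to register the devices):

    # /dev/sdb = backing device (HDD RAID), /dev/sdc = caching SSD
    make-bcache -B /dev/sdb
    make-bcache -C /dev/sdc
    # Attach the cache set (UUID from 'bcache-super-show /dev/sdc') and enable writeback
    echo <cache-set-uuid> > /sys/block/bcache0/bcache/attach
    echo writeback > /sys/block/bcache0/bcache/cache_mode
    # Put the Nova instance store on the new device
    mkfs.xfs /dev/bcache0
    mount /dev/bcache0 /var/lib/nova/instances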
IO: bcache and Latency
[Chart: IOwait latency comparison. Blue: CVI VM (h/w RAID-10 with cache), Yellow: OpenStack VM, Red: OpenStack VM on a bcache hypervisor]
• SSD block-level caching is sufficient for the IOPS and latency demands → use a VM on a bcache hypervisor
IO: ZFS ZIL/l2arc
• Meanwhile: SSD-only hypervisors
- Fast, but (too) small for some use cases
• Re-use cache idea leveraging ZFS features
- Have partitions on the VM (on the SSDs!) which ZFS can use for the l2arc cache and the ZFS Intent Log (ZIL)
- Use a fast volume as the backing store
• We’re testing this setup at the moment …
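The ZIL/l2arc idea maps directly onto zpool’s ‘log’ and ‘cache’ vdev classes; a sketch from inside such a guest, with the attached Cinder volume as the data vdev and two partitions on the local SSD as log and cache (device names are examples):

    # vdb = attached Cinder volume (backing store); vda2/vda3 = partitions on the local SSD
    zpool create data /dev/vdb log /dev/vda2 cache /dev/vda3
    zfs create data/work
    zpool status data     # shows the log and cache devices in place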
CPU Performance Issues
• The benchmark results on full-node VMs were about 20% lower than those of the underlying host
- Smaller VMs did much better
• Investigated various tuning options
- KSM*, EPT**, PAE, Pinning, … +hardware type dependencies
- Discrepancy down to ~10% between virtual and physical
• Comparison with Hyper-V: no general issue
- Loss w/o tuning ~3% (full-node), <1% for small VMs
- … NUMA-awareness!
*KSM on/off: beware of memory reclaim! **EPT on/off: beware of expensive page table walks!
CPU: NUMA
• NUMA-awareness identified as most
efficient setting
- Full node VMs have ~3% overhead in HS06
• “EPT-off” side-effect
- Small number of hosts, but very
visible there
• Use 2MB Huge Pages
- Keep the “EPT off” performance gain
with “EPT on”
• More details in this talk
Operations: NUMA/THP Roll-out
• Rolled out on ~2’000 batch hypervisors (~6’000 VMs)
- Huge page allocation as boot parameter → reboot
- VM NUMA awareness as flavor metadata → delete/recreate (see sketch below)
• Cell-by-cell (~200 hosts):
- Queue-reshuffle to minimize resource impact
- Draining & deletion of batch VMs
- Hypervisor reconfiguration (Puppet) & reboot
- Recreation of batch VMs
• Whole update took about 8 weeks
- Organized between batch and cloud teams
- No performance issue observed since
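The two knobs translate into a kernel boot parameter on the hypervisor and extra specs on the flavor; a sketch (the huge-page count and flavor name are examples, not CERN’s values):

    # Hypervisor: reserve 2 MB huge pages at boot, then reboot
    #   GRUB_CMDLINE_LINUX="... hugepagesz=2M hugepages=28672"
    grub2-mkconfig -o /boot/grub2/grub.cfg

    # Flavor: request a NUMA topology and huge-page backing for the guests
    openstack flavor set m2.3xlarge --property hw:numa_nodes=2 \
                                    --property hw:mem_page_size=2048
    # existing VMs have to be deleted and recreated to pick this up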
Agenda
• Cloud Service Overview
• Operations
• Performance
• Outlook
Future Plans
• Investigate Ironic (Bare metal provisioning)
- OpenStack as one interface for compute resource provisioning
- Allow for complete accounting
- Use physical machines for containers
• Replace Hyper-V by qemu/kvm?
- Successfully done at other sites
- Remove dependency on Windows expertise
- Reduce complexity in service setup
Summary
• OpenStack at CERN has been in production for 3 years
- We’re working closely with the various communities
• Cloud service continues to grow and mature
- Good experience with cells for scaling (even though they are still marked experimental)
- Experience gained helps with general resource provisioning
- New features added (containers, identity federation)
- Expansion planned (bare metal provisioning)
• Major operational challenges ahead
- Transparent retirement of service hosts
- Replacement of network layer
• http://openstack-in-production.blogspot.com
(read about our recent 2M req/s Magnum & Kubernetes!)

Editor's Notes

  1. This is a map of the complex of main accelerators of CERN. The dotted line is the French-Swiss border. The two main CERN sites are Meyrin and Prévessin (as big, but not as built nor populated). The PS is the tiny 600m ring on the Meyrin site, the SPS goes accross the border and the LHC takes all the space between Geneva airport and the Jura (dark green on the left). The 4 red dots represent where the 4 detectors are located.
  2. The LHC is built underground for financial reasons (it would have cost too much to buy all the land), to re-use the LEP tunnel, for natural radiation shielding (for the population, but also shielding the 4 detectors from cosmic rays), and for mechanical stability. You can see on the picture that the 2 counter-rotating beams are hosted in different pipes, making collisions impossible there. The LHC is mainly hosted in a 3m diameter tunnel containing mostly superconducting magnets and some accelerating devices. 8.3T is around 200’000 times stronger than the earth’s magnetic field. 12 kA compares to the 10-20 A in the wires at home.
  3. 120t == 700’000 litres. 1.9K == -271°C. Cables made of Niobium Titanium (NbTi) are «superconducting», it means they don’t heat up anymore as they have no electric resistance.
  4. Beam pipes need to be as empty as possible to avoid collisions with air molecules.
  5. ATLAS is a multi-purpose experiment looking for the Higgs Boson (which it found in 2012), dark matter candidates etc. It gathers 3000 physicists from 174 universities and institutes in 38 countries. Like ATLAS, CMS is also a multi-purpose experiment. It also discovered the Higgs Boson in 2012. The C in CMS stands for Compact. Even though it does not look compact to most of you, this experiment is half the size of ATLAS but twice as heavy: 2 times the Eiffel Tower (14’000 tonnes)! It gathers 2600 physicists from 182 universities and institutes in 42 countries. The ALICE experiment looks more particularly at the creation of the «Quark Gluon Plasma» state of matter which existed moments after the Big Bang. For that it waits for the beams of lead nuclei, which occur for about 10% of the LHC’s beam time. It gathers 1500 physicists from 154 universities and institutes in 37 countries. And the final experiment on the LHC is LHCb. It is named after the «b» quark which it studies. In fact LHCb studies the behaviour difference between the b quark and the anti-b quark. It tries to find differences in the symmetry which may account for the lack of antimatter in the Universe. It gathers 800 physicists from 69 universities and institutes in 17 countries.
  6. Run 2 is now; Run 4 is about 10 years from now.
  7. The trigger farms are the servers nearest the accelerator, which are not needed while the accelerator is shut down until 2015. Public clouds are interesting for burst load (such as in the run-up to a conference) or when prices drop, such as on the spot market. Private clouds allow universities and other research labs to collaborate in processing the LHC data. EduGAIN is a global interfederation service, connecting identity federations technically and legally.