OpenStack at CERN
Barcelona Summit 2016
CMS
ALICE
ATLAS
LHCb
Largest
machine
on Earth
CMS
1 Billion
Collisions/s
25/10/2016 Tim Bell - OpenStack Summit Barcelona 2016
• ~160 PB stored on tape at CERN
• June-Aug 2016 recorded >0.5 PB / day
2016 : A lot of physics
• Outlook is for 60x compute increase by
2023 but the budget outlook for servers
and people is flat
• >190K cores in
production under
OpenStack
• >90% of CERN
compute
resources are
virtualised
• >5,000 VMs
migrated from old
hardware in 2016
• >100K cores to be
added in the next
6 months
Containers on Clouds
For the user
• Interactive
• Dynamic
For IT - Magnum
• Secure
• Managed
• Integrated
• Scalable
There will be many
disruptive technologies
within this framework
Hybrid Clouds for Science
• LHC workloads on public clouds?
Thanks for your help!
For Further Information?
CERN home page is at http://home.cern/
Technical details on the CERN cloud at http://openstack-in-production.blogspot.fr
Custom CERN code is at https://github.com/cernops
Scientific Working Group at
https://wiki.openstack.org/wiki/Scientific_working_group
CERN OpenStack story from OpenStack Paris 2014 at
https://www.youtube.com/watch?v=7k3VnWXOjP4
Helix Nebula details at http://www.helix-nebula.eu/


Editor's Notes

  • #5 CMS is the Compact Muon Solenoid experiment. Compact is an interesting adjective for something that weighs 14,000 tonnes (double the Eiffel Tower), 21 meters long and 15 meters across.
  • #6 Bunches of 100 billion protons each collide around 1 billion times per second. The computing requirements increase significantly as there are multiple simultaneous collisions to separate out.
  • #7 Not just the LHC: I have an Antimatter Factory just outside my office, where anti-protons and positrons are slowed down and anti-hydrogen is created. Its properties can then be studied, such as whether antimatter goes up or down in a gravitational field. nTOF is a time-of-flight experiment that studies the properties of neutrons, with spin-offs in areas like hadron therapy for cancer treatment.
  • #8 The accelerator has done really well in 2016; the beam has been very stable. Looking forward, however, we’re seeing the need for a 60x increase in compute resources by 2023, with Moore’s law getting us 10x in the best case, and even that is unlikely given the recent technology evolution.
  • #9 Having started in 2012, we went into production in 2013 on Grizzly and have since done online upgrades across 6 releases. We are currently running around 50 cells distributed across 2 data centres with around 7,500 hypervisors. Automation is key: a single framework for managing compute resources with limited staff, many of whom are only at CERN for a few years before returning to their home countries.
  • #10 Managed means: project structure, a shared free resource pool, authorisation/identity, audit, resource lifecycle and a familiar end-user interface. We’ve been running tests at scale, with 1,000 nodes delivering 7 million requests/second for applications using OpenStack Magnum; a router firmware issue stopped us scaling further. We’ll work with the Magnum and Heat teams to make this even easier.
  • #11 With technologies such as federation, we can consider using public cloud resources for science. The Helix Nebula organisation has a number of end users and suppliers working together on how to use public clouds for science. CERN has been involved in testing 10 clouds; 8 of these are OpenStack based, which means our on-premises tooling can be used for external clouds via tools like Terraform. Running the technical workload has been achieved, but integrating procurement, billing management, networking costs, federated identity and support models is the focus for upcoming work.
  • #12 Thanks for your contributions to all of the open source projects. We upstream all of our improvements, working with the community through teams like the large deployments team and scientific working group to make sure other labs and OpenStack users benefit from the extreme computing challenges of the LHC.
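As a hedged sketch of the Terraform approach mentioned in note #11: because most of the tested clouds are OpenStack based, one declarative definition can in principle target any of them by changing only the provider endpoint and credentials. All names below (endpoint, image, flavor, network) are hypothetical placeholders, not CERN's actual configuration.

```hcl
# Hypothetical sketch using the Terraform OpenStack provider.
# The auth_url and resource names are placeholder assumptions.
provider "openstack" {
  auth_url = "https://cloud.example.org:5000/v3"  # assumed Keystone endpoint
}

resource "openstack_compute_instance_v2" "worker" {
  count       = 4                           # small illustrative worker pool
  name        = "lhc-worker-${count.index}"
  image_name  = "cc7-base"                  # placeholder image
  flavor_name = "m1.large"                  # placeholder flavor

  network {
    name = "lhc-net"                        # placeholder network
  }
}
```

Pointing the same definition at a different OpenStack-based public cloud then amounts to swapping the provider configuration, which is the portability argument made in the note.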
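The compute-gap arithmetic in note #8 can be made concrete with a short sketch. This is a rough illustration only: the 2-year Moore's-law doubling period is an assumption, while the 60x figure and the 2016–2023 horizon come from the talk.

```python
# Rough illustration of the compute gap described in note #8.
# Assumption: Moore's law as a doubling of capacity every ~2 years.
def moores_law_growth(years: float, doubling_period: float = 2.0) -> float:
    """Capacity multiplier from technology scaling alone."""
    return 2.0 ** (years / doubling_period)

required = 60.0                               # 60x increase needed by 2023
available = moores_law_growth(2023 - 2016)    # best-case scaling over 7 years
gap = required / available                    # shortfall that flat budgets must absorb
print(f"Moore's law gives ~{available:.1f}x; shortfall ~{gap:.1f}x")
```

Even under this optimistic assumption, scaling alone covers only about a fifth of the requirement, which is why the talk emphasises efficiency, automation and external resources.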