Towards An Agile Infrastructure at CERN
Tim Bell (Tim.Bell@cern.ch)
OpenStack Conference, 6th October 2011
What is CERN?
Conseil Européen pour la Recherche Nucléaire, also known as the European Laboratory for Particle Physics
Located between Geneva and the Jura mountains, straddling the Swiss-French border
Founded in 1954 by an international treaty
Our business is fundamental physics and how our universe works
Answering fundamental questions…
How do particles acquire mass? We have theories, but we need experimental evidence.
What is 96% of the universe made of? We can only see 4% of its estimated mass!
Why isn’t there anti-matter in the universe? Nature should be symmetric…
What was the state of matter just after the “Big Bang”? Travelling back to the earliest instants of the universe would help…
Community collaboration on an international scale
The Large Hadron Collider
LHC construction
The Large Hadron Collider (LHC) tunnel
Accumulating events in 2009-2011
Heavy Ion Collisions
Tier-0 (CERN): data recording, initial data reconstruction, data distribution
Tier-1 (11 centres): permanent storage, re-processing, analysis
Tier-2 (~200 centres): simulation, end-user analysis
Data is recorded at CERN and the Tier-1s and analysed in the Worldwide LHC Computing Grid
In a normal day, the grid provides 100,000 CPU days executing 1 million jobs

Data Centre by Numbers
Hardware installation & retirement: ~7,000 hardware movements/year; ~1,800 disk failures/year
Our Environment
Our users: the experiments build on top of our infrastructure and services to deliver application frameworks for the 10,000 physicists
Our custom user applications split into:
- Raw data processing from the accelerator and export to the Worldwide LHC Computing Grid
- Analysis of physics data
- Simulation
We also have the standard large-organisation applications: payroll, web, mail, HR, …
Our Infrastructure
Hardware is generally based on commodity, white-box servers
Open tendering process based on SpecInt/CHF, CHF/Watt and GB/CHF
Compute nodes are typically dual-processor with 2GB of memory per core
Bulk storage uses 24 x 2TB disk storage-in-a-box servers with a RAID card
The vast majority of servers run Scientific Linux, developed by Fermilab and CERN and based on Red Hat Enterprise Linux
Focus is on stability, in view of the number of centres on the WLCG
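To make the tender metrics concrete, here is a minimal sketch (with entirely invented vendors and figures) of how offers could be compared on the three price-performance ratios named above; it illustrates the metrics only and is not CERN's actual evaluation procedure.

```python
# Illustration of the price-performance metrics used in tendering.
# The offers and figures below are invented for the example.
offers = [
    # name, SPECint rate, price (CHF), power (W), disk capacity (GB)
    ('vendor-a', 350, 2800.0, 280, 2000),
    ('vendor-b', 330, 2500.0, 250, 2000),
]

for name, specint, chf, watt, gb in offers:
    print('%-9s SpecInt/CHF=%.3f  CHF/Watt=%.2f  GB/CHF=%.2f'
          % (name, specint / chf, chf / watt, gb / chf))

# Rank compute offers by performance per franc (higher is better)
best = max(offers, key=lambda o: o[1] / o[2])
print('best SpecInt/CHF offer: %s' % best[0])
```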
Our Challenges – Compute
Optimise CPU resources
- Maximise the production lifetime of servers
- Schedule interventions such as hardware repairs and OS patching
- Match memory and core requirements per job
- Reduce CPUs waiting idle for I/O
Conflicting software requirements
- Different experiments want different libraries
- Maintenance of old programs needs old OSes
Our Challenges – variable demand
Our Challenges – Data Storage
25PB/year to record
>20 years retention
6GB/s average
25GB/s peaks
Our Challenges – ‘minor’ other issues
Power: living within a fixed envelope of 2.9MW available for the computer centre
Cooling: only 6kW/m2 without using water-cooled racks (and no spare power)
Space: new capacity replaces old servers in the same racks (as density is low)
Staff: CERN staff headcount is fixed
Budget: the CERN IT budget reflects the member states’ contributions
Server Consolidation
Batch Virtualisation
Infrastructure as a Service Studies
CERN has been using virtualisation on a small scale since 2007
- Server consolidation with Microsoft System Center Virtual Machine Manager and Hyper-V
- Virtual batch compute farm using OpenNebula and Platform ISF on KVM
We are investigating a move to a cloud service provider model for infrastructure at CERN
- Virtualisation consolidation across multiple sites
- Bulk storage / Dropbox / …
- Self-service
Aims
- Improve efficiency
- Reduce operations effort
- Ease remote data centre support
- Enable cloud APIs
OpenStack Infrastructure as a Service Studies
Current focus
- Converge the current virtualisation services into a single IaaS
- Test Swift for bulk storage, compatibility with S3 tools, and resilience on commodity hardware
- Integrate OpenStack with CERN’s infrastructure, such as LDAP and the network databases
Status
- The Swift testbed (480TB) is being migrated to Diablo and expanded to 1PB with 10GbE networking
- 48 hypervisors running RHEL/KVM/Nova are under test
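As an illustration of the kind of compatibility and resilience check mentioned above, below is a minimal round-trip sketch against a Swift endpoint using the python-swiftclient library; the auth URL, credentials and container name are placeholders rather than the actual testbed configuration.

```python
# Minimal Swift round-trip check: upload an object, read it back, compare.
# The auth URL, credentials and names are illustrative placeholders.
from swiftclient import client as swift_client

conn = swift_client.Connection(
    authurl='https://swift-testbed.example.cern.ch/auth/v1.0',  # placeholder
    user='test:tester',                                         # placeholder
    key='testing',                                              # placeholder
)

container, obj, payload = 'resilience-test', 'sample.dat', b'x' * (1024 * 1024)

conn.put_container(container)                       # create the container if needed
conn.put_object(container, obj, contents=payload)   # upload 1MB of test data

headers, body = conn.get_object(container, obj)     # read it back
assert body == payload, 'stored and retrieved data differ'
print('round trip OK, etag=%s' % headers.get('etag'))
```

A similar round trip driven by an S3 client would cover the S3-compatibility part of the test, assuming the S3-compatibility middleware is enabled on the Swift proxies.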
Areas where we struggled
Networking configuration with Cactus
- Trying out the new Network-as-a-Service (Quantum) functions in Diablo
Red Hat distribution base
- RPMs are not yet in EPEL, but the Grid Dynamics RPMs helped
- Puppet manifests needed adapting, drawing on multiple sources from OpenStack and Puppet Labs
Currently only testing with KVM
- We’ll try Hyper-V once the Diablo Hyper-V support is fully in place
OpenStack investigations: next steps
Homogeneous servers for both storage and batch?
OpenStack investigations: next steps
Scale testing with CERN’s toolchains to install and schedule 16,000 VMs
Previous test results were obtained with OpenNebula
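For this kind of scale test, VM instantiation can be driven through the Nova API; the sketch below boots a batch of instances with python-novaclient. The credentials, endpoint, image and flavour names are assumptions for illustration, not CERN's actual toolchain.

```python
# Sketch of bulk VM instantiation through the Nova API for scale testing.
# Credentials, endpoint, image and flavour names are illustrative placeholders.
from novaclient.v1_1 import client

nova = client.Client('scaletest',                                   # username (placeholder)
                     'secret',                                      # password (placeholder)
                     'scaletest-project',                           # tenant (placeholder)
                     'http://keystone.example.cern.ch:5000/v2.0/')  # auth URL (placeholder)

image = nova.images.find(name='slc5-batch')    # hypothetical base image name
flavor = nova.flavors.find(name='m1.small')

BATCH = 100                                    # boot requests per batch
for i in range(BATCH):
    nova.servers.create(name='scaletest-%04d' % i, image=image, flavor=flavor)

# A real test would poll until each batch reaches ACTIVE and record the timings;
# here we just report progress once.
active = [s for s in nova.servers.list() if s.status == 'ACTIVE']
print('%d instances active so far' % len(active))
```

Driving batches like this from several clients at once is one way to see whether the bottleneck lies in the orchestrator itself or in the surrounding toolchain, as the earlier OpenNebula tests suggested for LDAP and the batch system.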
OpenStack investigations: next steps
Investigate commodity solutions for external volume storage
- Ceph
- Sheepdog
- Gluster
- ...
Focus is on
- Reducing the performance impact of I/O with virtualisation
- Enabling widespread use of live migration (see the sketch below)
- Understanding the future storage classes and service definitions
- Supporting remote data centre use cases
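To make the live-migration point concrete, here is a minimal sketch using the libvirt Python bindings to move a running KVM guest between two hypervisors; the host names and guest name are placeholders, and shared or network-backed volume storage is assumed so that only the memory state has to move.

```python
# Minimal live-migration sketch with the libvirt Python bindings.
# Host names and the guest name are illustrative placeholders; the guest's
# disks are assumed to live on shared or network-backed volume storage.
import libvirt

src = libvirt.open('qemu+ssh://hv-source.example.cern.ch/system')
dst = libvirt.open('qemu+ssh://hv-target.example.cern.ch/system')

dom = src.lookupByName('batch-vm-0001')        # running guest to move

# Live migration: the guest keeps running while its memory pages are copied.
dom.migrate(dst, libvirt.VIR_MIGRATE_LIVE, None, None, 0)

print('guest is now running on %s' % dst.getHostname())
```

In an OpenStack deployment the same operation would normally be requested through Nova rather than directly against libvirt; the sketch only shows the underlying mechanism.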
Areas of interest looking forward
Nova and Glance
- Scheduling VMs near to the data they need
- Managing the queue of requests when there is “no credit card” and no free resources
- Orchestration of bare-metal servers within OpenStack
Swift
- High-performance transfers through the proxies without encryption
- Long-term archiving to low-power disks or tape
General
- Filling in the missing functions such as billing, availability and performance monitoring
Final Thoughts
A small project to share documents at CERN in the ‘90s created the massive phenomenon that is today’s World Wide Web
Open Source
Transparent governance
Basis for innovation and competition

Editor's Notes

  • #3 Established by an international treaty at the end of the Second World War as a place where scientists could work together on fundamental research. Nuclear is part of the name, but our world is particle physics.
  • #4 Our current understanding of the universe is incomplete. A theory, called the Standard Model, proposes particles and forces, many of which have been experimentally observed. However, there are open questions: Why do some particles have mass and others not? The Higgs boson is a theory, but we need experimental evidence. Our theory of forces does not explain how gravity works. Cosmologists can only find 4% of the matter in the universe; we have lost the other 96%. We should have 50% matter and 50% anti-matter… why is there an asymmetry (although it is a good thing that there is, since the two annihilate each other)? When we go back through time 13 billion years towards the Big Bang, we move back through planets, stars, atoms, protons/electrons towards a soup-like quark-gluon plasma. What were the properties of this?
  • #5 Biggest international scientific collaboration in the world, with over 10,000 scientists from 100 countries. Annual budget around 1.1 billion USD. Funding for CERN, the laboratory itself, comes from the 20 member states, in proportion to gross domestic product… other countries contribute to the experiments, including a substantial US contribution towards the LHC experiments.
  • #6 The LHC is CERN’s largest accelerator: a 17-mile ring 100 metres underground where two beams of particles are sent in opposite directions and collided at the 4 experiments, ATLAS, CMS, LHCb and ALICE. Lake Geneva and the airport are visible at the top of the picture to give a sense of scale.
  • #7 CERN is more than just the LHC: CNGS neutrinos to Gran Sasso, faster than the speed of light? CLOUD demonstrating the impact of cosmic rays on weather patterns. Anti-hydrogen atoms contained for minutes in a magnetic vessel. However, for those of you who have read Dan Brown’s Angels and Demons or seen the film, there are no maniacal monks with pounds of anti-matter running around the campus.
  • #8 The LHC was conceived in the 1980s and construction started in 2002 within the tunnel of a previous accelerator called LEP. 6,000 magnets, weighing up to 35 tons each, were lowered down 100m shafts.
  • #9 The ring consists of two beam pipes, with a vacuum pressure 10 times lower than on the moon, which contain the beams of protons accelerated to just below the speed of light. These go round 11,000 times per second, bent by superconducting magnets cooled to 2K (-450F) by liquid helium, colder than outer space. The beams themselves have a total energy similar to a high-speed train, so care needs to be taken to make sure they turn the corners correctly and don’t bump into the walls of the pipe.
  • #10 - At 4 points around the ring, the beams are made to cross at points where detectors, the size of cathedrals and weighing up to 12,500 tonnes, surround the pipe. These are like digital cameras, but they take 100-megapixel photos 40 million times a second. This produces up to 1 petabyte/s.
  • #11 - Collisions can be visualised by the tracks left in the various parts of the detectors. With many collisions, the statistics allow identification of particle properties such as mass and charge. This is a simple one…
  • #12 To improve the statistics, we send round beams of multiple bunches; as they cross, there are multiple collisions as the 100 billion protons per bunch pass through each other. Software close by the detector, and later offline in the computer centre, then has to examine the tracks to understand the particles involved.
  • #13 To get quark-gluon plasma, the material closest to the Big Bang, we also collide lead ions, which is much more intense… the temperatures reach 100,000 times that in the sun.
  • #14 - We cannot record 1PB/s so there are hardware filters to remove uninteresting collisions such as those whose physics we understand already. The data is then sent to the CERN computer centre for recording via 10Gbit optical connections.
  • #15 The Worldwide LHC Computing Grid is used to record and analyse this data. The grid currently runs around 1 million jobs/day; less than 10% of the work is done at CERN. There is an agreed set of protocols for running jobs, data distribution and accounting between all the sites, which co-operate in order to support the physicists across the globe.
  • #16 So, to the Tier-0 computer centre at CERN… we are unusual in that we are public about our environment, as there is no competitive advantage for us. We have thousands of visitors a year coming for tours and education, and the computer centre is a popular visit. The data centre has around 2.9MW of usable power looking after 12,000 servers. In comparison, the accelerator uses 120MW, like a small town. With 64,000 disks, we have around 1,800 failing each year… this is much higher than the manufacturers’ MTBFs, which is consistent with results from Google. Servers mainly have Intel processors, some AMD, with dual-core Xeon being the most common configuration.
  • #17 CERN has around 10,000 physicist programmers. Applications split into data recording, analysis and simulation. It is high-throughput computing, not high-performance computing… no parallel programs are required, as each collision is independent and can be farmed out using commodity networking. The majority of servers run Scientific Linux, with some RHEL for Oracle databases.
  • #18 We purchase on an annual cycle, replacing around ¼ of the servers. This purchasing is based on performance metrics such as cost per SpecInt or cost per GB. Generally, we are seeing dual-core compute servers with Intel or AMD processors and bulk storage servers with 24 or 36 2TB disks. The operating system is a Red Hat based Linux distribution called Scientific Linux. We share the development and maintenance with Fermilab in Chicago. The choice of a Red Hat based distribution comes from the need for stability across the grid, which means keeping the 200 centres running compatible Linux distributions.
  • #19 Get servers burnt in quickly, into production, and retired late. Short vs long programs can vary by up to 1 week.
  • #20 Generally running 30,000 jobs in the Tier-0, with up to 110,000 waiting to run, especially as conferences approach and physicists prepare their last-minute analyses.
  • #21 Our data storage system has to record and preserve 25PB/year with an expected lifetime of 20 years. Keeping the old data is required to get the maximum statistics for discoveries. At times, physicists will want to skim this data looking for new physics. Data rates are around 6GB/s average, with peaks of 25GB/s.
  • #22 Around 60,000 tape mounts per week, so the robots are kept busy.
  • #24 Our service consolidation environment is intended to allow rapid machine requests, from development servers through to full servers with live migration for production. Currently based on Hyper-V and using SCVMM, we have around 1,600 guests running a mixture of Linux and Windows.
  • #25 Provides virtual machines to run physics jobs such that the users do not see any difference between a physical machine and a virtual one. Currently based on OpenNebula, providing EC2 APIs for experiments to investigate using clouds.
  • #29 Can we find a model where Compute and Mass Storage reside on the same server?
  • #30 Previous tests performed with OpenNebula. Bottlenecks were identified within CERN’s toolchain (LDAP and the batch system) rather than with the orchestrator.
  • #32 These are items which we foresee as being potentially interesting in a few months’ time, and which we would like to discuss with other users of OpenStack to understand potential solutions.
  • #33 Infrastructure as a Service with a vibrant open source implementation such as OpenStack can offer efficiency and agility to IT services, both private and public. As more users and companies move towards production usage, we need to balance the rapid evolution with the need for stability. As demonstrated by the World Wide Web’s evolution from a CERN project to a global presence, a set of core standards allows innovation and competition. Let’s not forget, in our enthusiasm to enhance OpenStack, that there will be more and more sites facing the classic issues of production stability and maintenance. With good information sharing amongst the community, such as these conferences, these can be addressed.
  • #43 Peaks of up to 25GB/s to handle, with an average of 6GB/s over the year.