Established by an international treaty after the end of the Second World War as a place where scientists could work together on fundamental research. Nuclear is part of the name, but our world is particle physics.
Our current understanding of the universe is incomplete. A theory called the Standard Model proposes particles and forces, many of which have been experimentally observed. However, there are open questions:
- Why do some particles have mass and others not? The Higgs boson is the theoretical explanation, but we need experimental evidence.
- Our theory of forces does not explain how gravity works.
- Cosmologists can only find 4% of the matter in the universe; we have lost the other 96%.
- We should have 50% matter and 50% anti-matter, so why is there an asymmetry? (It is a good thing that there is, since the two annihilate each other.)
- When we go back through time 13 billion years towards the big bang, we move back through planets, stars, atoms and protons/electrons towards a soup-like quark gluon plasma. What were the properties of this?
Biggest international scientific collaboration in the world, with over 11,000 scientists from 100 countries. Annual budget around 1.1 billion USD. Funding for CERN, the laboratory itself, comes from the 20 member states in proportion to their gross domestic product. Other countries contribute to experiments, including a substantial US contribution towards the LHC experiments.
The LHC is CERN’s largest accelerator: a 17 mile ring 100 meters underground where two beams of particles are sent in opposite directions and collided at the 4 experiments, ATLAS, CMS, LHCb and ALICE. Lake Geneva and the airport are visible at the top to give a sense of scale.
The ring consists of two beam pipes, with a vacuum pressure 10 times lower than on the moon, which contain the beams of protons accelerated to just below the speed of light. These go round 11,000 times per second, bent by superconducting magnets cooled to 2K (about -456F) by liquid helium, colder than outer space. The beams themselves have a total energy similar to a high speed train, so care needs to be taken to make sure they turn the corners correctly and don’t bump into the walls of the pipe.
- At 4 points around the ring, the beams are made to cross where detectors the size of cathedrals, weighing up to 12,500 tonnes, surround the pipe. These are like digital cameras, but they take 100 megapixel photos 40 million times a second. This produces up to 1 petabyte/s.
- Collisions can be visualised by the tracks left in the various parts of the detectors. With many collisions, the statistics allow particle properties such as mass and charge to be identified. This is a simple one…
To improve the statistics, we send round beams of multiple bunches. As they cross, there are multiple collisions as the 100 billion protons in each bunch pass through each other. Software close to the detector, and later offline in the computer centre, then has to examine the tracks to understand the particles involved.
To get quark gluon plasma, the state of matter closest to the big bang, we also collide lead ions, which is much more intensive… the temperatures reach 100,000 times that in the sun.
- We cannot record 1PB/s so there are hardware filters to remove uninteresting collisions such as those whose physics we understand already. The data is then sent to the CERN computer centre for recording via 10Gbit optical connections.
The Worldwide LHC Computing Grid is used to record and analyse this data. The grid currently runs over 2 million jobs/day, and less than 10% of the work is done at CERN. There is an agreed set of protocols for running jobs, data distribution and accounting between all the sites, which co-operate in order to support the physicists across the globe.
So, to the Tier-0 computer centre at CERN… we are unusual in that we are public about our environment, as there is no competitive advantage for us. We have thousands of visitors a year coming for tours and education, and the computer centre is a popular visit. The data centre has around 2.9MW of usable power looking after 12,000 servers. In comparison, the accelerator uses 120MW, like a small town. With 64,000 disks, we have around 1,800 failing each year… this is a much higher failure rate than the manufacturers’ MTBFs imply, which is consistent with results from Google. Servers mainly use Intel processors, with some AMD; dual-processor Xeon is the most common configuration.
Upstairs in the computer centre, a high roof was the fashion in the 1980s for mainframes but is now very difficult to cool efficiently.
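The comparison with manufacturers' MTBFs can be made concrete with a little arithmetic. The sketch below uses only the two figures quoted in these notes (64,000 disks, about 1,800 failures per year); it assumes all disks run around the clock, which is a simplification rather than a measured fact.

```python
# Figures quoted in the talk notes.
DISKS = 64_000
FAILURES_PER_YEAR = 1_800
HOURS_PER_YEAR = 24 * 365  # assumption: disks are powered on all year


def annualised_failure_rate(failures: int, population: int) -> float:
    """Fraction of the disk population that fails in one year."""
    return failures / population


def implied_mtbf_hours(failures: int, population: int) -> float:
    """Total disk-hours accumulated in a year divided by failures in that year."""
    return population * HOURS_PER_YEAR / failures


afr = annualised_failure_rate(FAILURES_PER_YEAR, DISKS)
mtbf = implied_mtbf_hours(FAILURES_PER_YEAR, DISKS)
print(f"Annualised failure rate: {afr:.1%}")
print(f"Implied MTBF: {mtbf:,.0f} hours")
```

With these inputs the failure rate is roughly 2.8% per year, which corresponds to an observed MTBF of a little over 300,000 hours.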
Our data storage system has to record and preserve 35PB/year with an expected lifetime of 20 years. Keeping the old data is required to get the maximum statistics for discoveries. At times, physicists will want to skim this data looking for new physics. Data rates are around 6GB/s average, with peaks of 25GB/s.
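As a rough sanity check on these storage figures, the sketch below converts 35PB/year into a mean recording rate and into the archive size after 20 years. It assumes decimal units and a constant recording rate over the whole period, neither of which is stated in the talk.

```python
PB = 10**15  # assumption: decimal petabytes
GB = 10**9   # assumption: decimal gigabytes
SECONDS_PER_YEAR = 365 * 24 * 3600

recorded_per_year = 35 * PB  # from the talk
retention_years = 20         # from the talk

# Mean rate needed just to record 35PB spread evenly over a year.
mean_ingest_gb_s = recorded_per_year / SECONDS_PER_YEAR / GB

# Archive size if the yearly volume stayed constant for the whole retention period.
archive_after_retention_pb = recorded_per_year * retention_years / PB

print(f"Mean recording rate: {mean_ingest_gb_s:.2f} GB/s")
print(f"Archive after {retention_years} years: {archive_after_retention_pb:.0f} PB")
```

The mean recording rate comes out at about 1.1GB/s, so the quoted 6GB/s average presumably covers more than first-time recording, for example reads and copies. That interpretation is an assumption, not something the talk states.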
Tape robots from IBM and Oracle. Around 60,000 tape mounts/week, so the robots are kept busy. Data is copied every two years to keep up with the latest media densities.
Asked member states for offers. 200Gbit/s links connecting the centres. Expect to double computing capacity compared to today by 2015.
Double the capacity, same manpower. Need to rethink how to solve the problem… look at how others approach it. We had our own tools in 2002 and, as they became more sophisticated, it was not possible to take advantage of developments elsewhere without a major break. Engineers are doing this alongside their ‘day’ jobs, so it reinforces the approach of taking what we can from the community.
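To give a feel for how busy the robots are, this short calculation turns the weekly mount count into a time between mounts and a load per drive. The 160 tape drives figure is taken from the "Data Centre by Numbers" slide; spreading mounts evenly across drives and across the week is an assumption made for illustration.

```python
MOUNTS_PER_WEEK = 60_000  # from the talk notes
TAPE_DRIVES = 160         # from the "Data Centre by Numbers" slide
SECONDS_PER_WEEK = 7 * 24 * 3600

# Average gap between two mounts somewhere in the tape libraries.
seconds_between_mounts = SECONDS_PER_WEEK / MOUNTS_PER_WEEK

# Average number of mounts each drive handles per week (even spread assumed).
mounts_per_drive_per_week = MOUNTS_PER_WEEK / TAPE_DRIVES

print(f"One mount every {seconds_between_mounts:.1f} s on average")
print(f"{mounts_per_drive_per_week:.0f} mounts per drive per week")
```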
Model based on the Google toolchain; Puppet is key for many operations. We’ve only had to write one significant new custom CERN software component, which is in the certificate authority. Other parts, such as Lemon for monitoring, are from our previous implementation, as we did not want to change everything at once and they scale.
We’ve been very pleased with our choices. Along with the obvious benefits of the functionality, there are soft benefits from the community model.
Many staff at CERN are on short term contracts… it is a good benefit for those staff to leave with skills in demand.
Standardise hardware… buy in bulk, pile it up, then work out what to use it for. Interventions on memory, motherboards, cables or disks. Users waiting for I/O means wasted cycles. Build machines are used at night and unused during the day; interactive machines are used mainly during the day. Move to cloud APIs… need to support them but also maintain our existing applications. Details later on reception and testing.
The concept of pets and cattle came from Cloudscaling. Puppet applies well to the cattle model, but we’re also using it to handle the pet cases that can’t yet move over due to software limitations. So, they get cloud provisioning but flexible configuration management.
Communities integrating… when a new option is being used at CERN in OpenStack, we contribute the changes, such as certificate handling, back to the Puppet Forge. Even looking at Hyper-V/Windows OpenStack configuration…
The project’s success comes down to community. A vibrant community has momentum of its own. As the WWW showed, many contributors can change how we see the world. Looking forward, as we help improve Puppet, remember that you will also be helping achieve a clearer understanding of the universe and how it works.
CERN is more than just the LHC:
- CNGS neutrinos to Gran Sasso
- CLOUD demonstrating impacts of cosmic rays on weather patterns
- Anti-hydrogen atoms contained for minutes in a magnetic vessel
However, for those of you who have read Dan Brown’s Angels and Demons or seen the film, there are no maniacal monks with pounds of anti-matter running around the campus.
We purchase on an annual cycle, replacing around ¼ of the servers. This purchasing is based on performance metrics such as cost per SpecInt or cost/GB. Generally, we are seeing dual-processor compute servers with Intel or AMD processors and bulk storage servers with 24 or 36 2TB disks. The operating system is a Red Hat based Linux distribution called Scientific Linux. We share the development and maintenance with Fermilab near Chicago. The choice of a Red Hat based distribution comes from the need for stability across the grid, where the 200 centres have to keep running compatible Linux distributions.
LHC@Home is not an instruction on how to build your own accelerator but a magnet simulation tool to test multiple passes around the ring. We wanted to use it as a stress test tool and in ½ day, it was running on 1000 VMs.
Accelerating Science with OpenStack
The CERN User Story
Tim Bell
Tim.Bell@cern.ch
@noggin143
EMEA OpenStack Day, London
5th December 2012
What is CERN?
• Conseil Européen pour la Recherche Nucléaire – aka European Laboratory for Particle Physics
• Between Geneva and the Jura mountains, straddling the Swiss-French border
• Founded in 1954 with an international treaty
• Our business is fundamental physics: what is the universe made of and how does it work
OpenStack London December 2012 Tim Bell, CERN 2
Answering fundamental questions…
• How to explain particles have mass? We have theories and accumulating experimental evidence. Getting close…
• What is 96% of the universe made of? We can only see 4% of its estimated mass!
• Why isn’t there anti-matter in the universe? Nature should be symmetric…
• What was the state of matter just after the « Big Bang »? Travelling back to the earliest instants of the universe would help…
Community collaboration on an international scale
The Large Hadron Collider
The Large Hadron Collider (LHC) tunnel
Accumulating events in 2009-2011
Heavy Ion Collisions
Data Acquisition and Trigger Farms
Tier-0 (CERN):
• Data recording
• Initial data reconstruction
• Data distribution
Tier-1 (11 centres):
• Permanent storage
• Re-processing
• Analysis
Tier-2 (~200 centres):
• Simulation
• End-user analysis
• Data is recorded at CERN and Tier-1s and analysed in the Worldwide LHC Computing Grid
• In a normal day, the grid provides 100,000 CPU days executing over 2 million jobs
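The two daily figures for the grid can be combined to estimate the size of a typical job. This is a back-of-the-envelope sketch using only the numbers on the slide; it treats "over 2 million" as exactly 2 million, so the per-job figure is an upper estimate.

```python
CPU_DAYS_PER_DAY = 100_000  # from the slide
JOBS_PER_DAY = 2_000_000    # from the slide ("over 2 million")

# 100,000 CPU days delivered per calendar day is equivalent to this many
# CPUs being busy around the clock.
average_busy_cpus = CPU_DAYS_PER_DAY

# Average CPU time consumed by a single job, in hours.
cpu_hours_per_job = CPU_DAYS_PER_DAY * 24 / JOBS_PER_DAY

print(f"About {average_busy_cpus:,} CPUs busy on average")
print(f"About {cpu_hours_per_job:.1f} CPU hours per job")
```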
• Data Centre by Numbers
  – Hardware installation & retirement
    • ~7,000 hardware movements/year; ~1,800 disk failures/year

Racks           828        Disks                    64,109    Tape Drives           160
Servers         11,728     Raw disk capacity (TiB)  63,289    Tape Cartridges       45,000
Processors      15,694     Memory modules           56,014    Tape slots            56,000
Cores           64,238     Memory capacity (TiB)    158       Tape Capacity (TiB)   73,000
HEPSpec06       482,507    RAID controllers         3,749

High Speed Routers (640 Mbps → 2.4 Tbps)   24
Ethernet Switches                          350
10 Gbps ports                              2,000
Switching Capacity                         4.8 Tbps
1 Gbps ports                               16,939
10 Gbps ports                              558

IT Power Consumption                       2,456 KW
Total Power Consumption                    3,890 KW

[Pie charts: processor models (Xeon 3GHz, 5150, 5160, E5335, E5345, E5405, E5410, L5420, L5520, other) and disk vendors (Fujitsu, Hitachi, HP, Maxtor, Seagate, Western Digital); the percentage for each slice is not recoverable from the extracted text]
Our Challenges - Data storage
• >20 years retention
• 6GB/s average
• 25GB/s peaks
• 35PB/year recorded
45,000 tapes holding 80PB of physics data
New data centre to expand capacity
• Data centre in Geneva at the limit of electrical capacity at 3.5MW
• New centre chosen in Budapest, Hungary
• Additional 2.7MW of usable power
• Hands off facility
• Deploying from 2013 with 200Gbit/s network to CERN
Time to change strategy
• Rationale
  – Need to manage twice the servers as today
  – No increase in staff numbers
  – Tools becoming increasingly brittle and will not scale as-is
• Approach
  – CERN is no longer a special case for compute
  – Adopt an open source tool chain model
  – Our engineers rapidly iterate
    • Evaluate solutions in the problem domain
    • Identify functional gaps and challenge them
    • Select first choice but be prepared to change in future
  – Contribute new function back to the community
Building Blocks
[Diagram of toolchain components: mcollective, yum; Bamboo; Puppet; AIMS/PXE; Foreman; JIRA; OpenStack Nova; git; Koji, Mock; Yum repo / Pulp; Active Directory / LDAP; Lemon / Hadoop; Hardware database; Puppet-DB]
Training and Support
• Buy the book rather than guru mentoring
• Follow the mailing lists to learn
• Newcomers are rapidly productive (and often know more than us)
• Community and Enterprise support means we’re not on our own
Staff Motivation
• Skills valuable outside of CERN when an engineer’s contract ends
Prepare the move to the clouds
• Improve operational efficiency
  – Machine ordering, reception and testing
  – Hardware interventions with long running programs
  – Multiple operating system demand
• Improve resource efficiency
  – Exploit idle resources, especially waiting for disk and tape I/O
  – Highly variable load such as interactive or build machines
• Enable cloud architectures
  – Gradual migration to cloud interfaces and workflows
  – Autoscaling and scheduling
• Improve responsiveness
  – Self-Service with coffee break response time
Public Procurement Purchase Model

Step                                  Time (Days)                       Elapsed (Days)
User expresses requirement                                              0
Market Survey prepared                15                                15
Market Survey for possible vendors    30                                45
Specifications prepared               15                                60
Vendor responses                      30                                90
Test systems evaluated                30                                120
Offers adjudicated                    10                                130
Finance committee                     30                                160
Hardware delivered                    90                                250
Burn in and acceptance                30 days typical, 380 worst case   280
Total                                                                   280+ Days
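The elapsed column of the procurement table is simply a running total of the step durations. The short sketch below reproduces it from the per-step times on the slide, using the typical 30 day burn-in; the variable names are illustrative only.

```python
from itertools import accumulate

# (step, duration in days) as listed on the slide, using the typical burn-in time.
STEPS = [
    ("Market Survey prepared", 15),
    ("Market Survey for possible vendors", 30),
    ("Specifications prepared", 15),
    ("Vendor responses", 30),
    ("Test systems evaluated", 30),
    ("Offers adjudicated", 10),
    ("Finance committee", 30),
    ("Hardware delivered", 90),
    ("Burn in and acceptance", 30),
]

# Running total of days since the user expressed the requirement.
elapsed = list(accumulate(days for _, days in STEPS))

for (step, days), total in zip(STEPS, elapsed):
    print(f"{step:<36} {days:>3} {total:>4}")
```

The final running total is 280 days, matching the "280+ Days" on the slide: more than nine months pass between a user asking for capacity and the hardware being usable.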
Service Model
• Pets are given names like pussinboots.cern.ch
• They are unique, lovingly hand raised and cared for
• When they get ill, you nurse them back to health
• Cattle are given numbers like vm0042.cern.ch
• They are almost identical to other cattle
• When they get ill, you get another one
• Future application architectures should use Cattle but Pets with strong configuration management are viable and still needed
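The difference between the two service models shows up in what happens on failure. The following is a minimal sketch, not CERN's tooling: the `Server` class, the helper functions and the numbering scheme are all hypothetical and exist only to illustrate "nurse back to health" versus "get another one".

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Server:
    name: str
    healthy: bool = True


# Hypothetical numbering scheme for cattle, in the style of vm0042.cern.ch.
_cattle_ids = count(1)


def new_cattle() -> Server:
    """Create an anonymous, numbered instance."""
    return Server(name=f"vm{next(_cattle_ids):04d}.cern.ch")


def handle_failure_pet(server: Server) -> Server:
    """Pets are repaired in place and keep their identity."""
    server.healthy = True
    return server


def handle_failure_cattle(server: Server) -> Server:
    """Cattle are discarded and replaced by a fresh instance."""
    return new_cattle()


pet = Server("pussinboots.cern.ch", healthy=False)
cow = new_cattle()
cow.healthy = False

fixed = handle_failure_pet(pet)
replacement = handle_failure_cattle(cow)
print(fixed.name, replacement.name)
```

The pet comes back under the same name, while the failed cattle instance is simply superseded by a new number.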
Current Status of OpenStack at CERN
• Working with Essex OpenStack release
  – Excellent experience with Fedora/RedHat team using EPEL packages
  – Started Folsom test environment in November
• Focusing on the compute side to start with
  – Nova compute with KVM and Hyper-V
  – Keystone identity now integrated with Active Directory
  – Replaced network layer with CERN code for our legacy network management system
• Current pre-production installation
  – 170 Hypervisors
  – 2,700 VMs
  – 3 DevOps part time running and enhancing the service
  – Running production ‘cattle’ workloads for stress testing
When communities combine…
• OpenStack’s many components and options make configuration complex out of the box
• Puppet forge module from PuppetLabs does our configuration
• The Foreman adds OpenStack provisioning for user kiosk to a configured machine in 15 minutes
Foreman to manage Puppetized VM
Opportunistic Clouds in online experiment farms
• The CERN experiments have farms of 1000s of Linux servers close to the detectors to filter the 1PByte/s down to 6GByte/s to be recorded to tape
• When the accelerator is not running, these machines are currently idle
  – Accelerator has regular maintenance slots of several days
  – Long Shutdown due from March 2013-November 2014
• One of the experiments has deployed OpenStack on its farm
  – Simulation (low I/O, high CPU)
  – Analysis (high I/O, high CPU, high network)
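The scale of the filtering done by these farms is easy to underestimate. This small calculation uses the two rates on the slide, assuming decimal units, to show what fraction of the detector output survives to tape.

```python
PB = 10**15  # assumption: decimal petabytes
GB = 10**9   # assumption: decimal gigabytes

detector_rate = 1 * PB  # bytes/s produced by the detectors (from the slide)
recorded_rate = 6 * GB  # bytes/s written to tape (from the slide)

# How many bytes are discarded for each byte that is kept.
reduction_factor = detector_rate / recorded_rate
kept_fraction = recorded_rate / detector_rate

print(f"Reduction factor: about {reduction_factor:,.0f} to 1")
print(f"Fraction recorded: {kept_fraction:.6%}")
```

In other words, only about one byte in 170,000 is kept.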
Federated European Clouds
• Two significant European projects around Federated Clouds
  – European Grid Initiative Federated Cloud as a federation of grid sites providing IaaS
  – HELiX Nebula European Union funded project to create a scientific cloud based on commercial providers
EGI Federated Cloud Sites: CESGA, CESNET, INFN, SARA, Cyfronet, FZ Jülich, SZTAKI, IPHC, GRIF, GRNET, KTH, Oxford, GWDG, IGI, TCD, IN2P3, STFC
Ongoing Projects
• Production Preparation
  – Following High Availability recommendations from the community
  – Integrate monitoring with CERN frameworks
• Quota management with Boson
  – CERN does not have infinite compute resources or budget
  – Experiments are allocated a quota to split between their projects
  – Collaborating with the community to develop a distributed quota manager
• Block Storage Evaluation
  – Investigating Gluster, NetApp and Ceph integration for EBS functionality
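The quota requirement, an experiment-level allocation that is split between projects, can be sketched as a two-level check. The code below is purely illustrative: it is not the Boson API or design, and the class, method and project names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ExperimentQuota:
    """Hypothetical two-level quota: an experiment total split between projects."""
    total_cores: int
    allocated: Dict[str, int] = field(default_factory=dict)  # project -> cores granted
    used: Dict[str, int] = field(default_factory=dict)       # project -> cores in use

    def allocate(self, project: str, cores: int) -> None:
        """Grant a project a share, refusing to exceed the experiment total."""
        others = sum(self.allocated.values()) - self.allocated.get(project, 0)
        if others + cores > self.total_cores:
            raise ValueError(f"experiment quota of {self.total_cores} cores exceeded")
        self.allocated[project] = cores

    def start(self, project: str, cores: int) -> bool:
        """Admit a request only if it fits within the project's share."""
        in_use = self.used.get(project, 0)
        if in_use + cores > self.allocated.get(project, 0):
            return False
        self.used[project] = in_use + cores
        return True


quota = ExperimentQuota(total_cores=1000)
quota.allocate("reconstruction", 600)
quota.allocate("analysis", 400)

first = quota.start("reconstruction", 500)   # fits within the 600 core share
second = quota.start("reconstruction", 200)  # would take the project to 700
print(first, second)
```

A distributed quota manager has to make the same decisions consistently across sites, which is the hard part this single-process sketch leaves out.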
Going further forward
• Deployment to production
  – Planned for Q1 2013 based on Folsom
  – No legacy tools in the second data centre
• Exploit new functionality
  – Ceilometer for metering
  – Bare metal for non-virtualised use cases such as high I/O servers
  – X.509 user certificate authentication
  – Load balancing as a service
  – Cells for scalability
Ramping to 15,000 hypervisors with 100,000 to 300,000 VMs by 2015
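The 2015 target can be put next to the pre-production installation (170 hypervisors, 2,700 VMs, from the status slide) to see what it implies per hypervisor. This is simple arithmetic on the quoted figures, not a capacity plan.

```python
# Pre-production installation, from the "Current Status" slide.
current_hypervisors = 170
current_vms = 2_700

# 2015 target, from this slide.
target_hypervisors = 15_000
target_vms_low, target_vms_high = 100_000, 300_000

current_density = current_vms / current_hypervisors
target_density_low = target_vms_low / target_hypervisors
target_density_high = target_vms_high / target_hypervisors
hypervisor_growth = target_hypervisors / current_hypervisors

print(f"Today: {current_density:.1f} VMs per hypervisor")
print(f"2015: {target_density_low:.1f} to {target_density_high:.1f} VMs per hypervisor")
print(f"Hypervisor count grows about {hypervisor_growth:.0f}x")
```

The VM density per hypervisor stays in the same range as today; the challenge is the roughly 90-fold growth in the number of hypervisors to manage.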
Final Thoughts
• A small project to share documents at CERN in the ‘90s created the massive phenomenon that is today’s world wide web
• Open Source
• Vibrant community and eco-system
• Working with the Puppet and OpenStack communities has shown the power of collaboration
• We have built a toolchain in one year with part time resources
• Sharing with other organisations to achieve scale is the only economically feasible path
• CERN contributes and benefits from contributions of others
Questions?
References
CERN: http://public.web.cern.ch/public/
Scientific Linux: http://www.scientificlinux.org/
Worldwide LHC Computing Grid: http://wlcg.web.cern.ch/
Jobs: http://cern.ch/jobs
Detailed Report on Agile Infrastructure: http://cern.ch/go/N8wp
HELiX Nebula: http://helix-nebula.eu/
EGI Cloud Taskforce: https://wiki.egi.eu/wiki/Fedcloud-tf
Backup Slides
CERN’s tools
• The world’s most powerful accelerator: LHC
  – A 27 km long tunnel filled with high-tech instruments
  – Equipped with thousands of superconducting magnets
  – Accelerates particles to energies never before obtained
  – Produces particle collisions creating microscopic “big bangs”
• Very large sophisticated detectors
  – Four experiments each the size of a cathedral
  – Hundred million measurement channels each
  – Data acquisition systems treating Petabytes per second
• Top level computing to distribute and analyse the data
  – A Computing Grid linking ~200 computer centres around the globe
  – Sufficient computing power and storage to handle 25 Petabytes per year, making them available to thousands of physicists for analysis
Our Infrastructure
• Hardware is generally based on commodity, white-box servers
  – Open tendering process based on SpecInt/CHF, CHF/Watt and GB/CHF
  – Compute nodes typically dual processor, 2GB per core
  – Bulk storage on 24x2TB disk storage-in-a-box with a RAID card
• Vast majority of servers run Scientific Linux, developed by Fermilab and CERN, based on Redhat Enterprise
  – Focus is on stability in view of the number of centres on the WLCG
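Tendering on a metric such as SpecInt/CHF amounts to ranking offers by performance per unit cost. The sketch below illustrates that one metric only. The vendors, prices and benchmark scores are invented for the example, and how CERN weighs the different metrics against each other is not described in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    vendor: str       # hypothetical vendor label
    price_chf: float  # hypothetical price per server
    specint: float    # hypothetical benchmark score per server

    @property
    def specint_per_chf(self) -> float:
        """Benchmark performance bought per Swiss franc."""
        return self.specint / self.price_chf


# Invented offers, for illustration only.
offers = [
    Offer("vendor-a", price_chf=4000.0, specint=200.0),
    Offer("vendor-b", price_chf=3600.0, specint=190.0),
    Offer("vendor-c", price_chf=4500.0, specint=210.0),
]

# Best value first: highest performance per franc.
ranked = sorted(offers, key=lambda offer: offer.specint_per_chf, reverse=True)

for offer in ranked:
    print(f"{offer.vendor}: {offer.specint_per_chf:.4f} SpecInt/CHF")
```

Note that the fastest machine (vendor-c in this made-up data) is not the winner; the metric rewards throughput per franc rather than raw speed.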
New architecture data flows
Virtualisation on SCVMM/Hyper-V
[Chart: monthly series for Linux and Windows, Mar-10 to Oct-12; vertical axis 0 to 3,500]
Scaling up with Puppet and OpenStack
• Use LHC@Home based on BOINC for simulating magnetics guiding particles around the LHC
• Naturally, there is a puppet module puppet-boinc
• 1000 VMs spun up to stress test the hypervisors with Puppet, Foreman and OpenStack
Federated Cloud Commonalities
• Basic building blocks
  – Each site gives an IaaS endpoint with an API and common security policy
    • OCCI? CDMI? Libcloud? Jclouds?
  – Image stores available across the sites
  – Federated identity management based on X.509 certificates
  – Consolidation of accounting information to validate pledges and usage
• Multiple cloud technologies
  – OpenStack
  – OpenNebula
  – Proprietary
Supporting the Pets with OpenStack
• Network
  – CERN specific component to interface to our legacy network environment
• Configuration
  – Using Puppet with Puppetlabs modules for rapid deployment
• External Block Storage
  – Currently using nova-volume with Gluster backing store
• Live migration to maximise availability
  – KVM live migration using Gluster
  – KVM and Hyper-V block migration
Active Directory Integration
• CERN’s Active Directory
  – Unified identity management across the site
  – 44,000 users
  – 29,000 groups
  – 200 arrivals/departures per month
• Full integration with Active Directory via LDAP
  – Uses the OpenLDAP backend with some particular configuration settings
  – Aim for minimal changes to Active Directory
  – 7 patches submitted around hard coded values and additional filtering
• Now in use in our pre-production instance
  – Map project roles (admins, members) to groups
  – Documentation in the OpenStack wiki
Welcome Back Hyper-V!
• We currently use Hyper-V/System Centre for our server consolidation activities
  – But need to scale to 100x current installation size
• Choice of hypervisors should be tactical
  – Performance
  – Compatibility/Support with integration components
  – Image migration from legacy environments
• CERN is working closely with the Hyper-V OpenStack team and Microsoft to arrive at parity
  – Puppet to configure hypervisors on Windows
  – Most functions work well but further work on Console, Ceilometer, …
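Mapping project roles to directory groups means a user's roles in a project follow from the groups they belong to. The sketch below shows that idea in plain Python with no LDAP calls. The project name, the group naming scheme and the `roles_for` helper are hypothetical, since the talk does not describe how CERN names its groups or how Keystone is configured.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical (project, role) -> directory group mapping.
ROLE_GROUPS: Dict[Tuple[str, str], str] = {
    ("atlas-sim", "admin"): "cloud-atlas-sim-admins",
    ("atlas-sim", "member"): "cloud-atlas-sim-members",
}


def roles_for(user_groups: Set[str], project: str) -> List[str]:
    """Return the roles a user holds in a project, derived from group membership."""
    return sorted(
        role
        for (mapped_project, role), group in ROLE_GROUPS.items()
        if mapped_project == project and group in user_groups
    )


member_roles = roles_for({"cloud-atlas-sim-members"}, "atlas-sim")
admin_roles = roles_for(
    {"cloud-atlas-sim-admins", "cloud-atlas-sim-members"}, "atlas-sim"
)
print(member_roles, admin_roles)
```

With 200 arrivals and departures per month, deriving roles from existing group membership avoids maintaining a second list of users inside the cloud service.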