Tim Bell
tim.bell@cern.ch
29/05/2013 CERN OpenStack CEE Day 2
CERN was founded in 1954: 12 European States
“Science for Peace”
Today: 20 Member States
Member States: Austria, Belgium, Bulgaria, the Czech Republic, Denmark,
Finland, France, Germany, Greece, Hungary, Italy, the Netherlands, Norway,
Poland, Portugal, Slovakia, Spain, Sweden, Switzerland and the United Kingdom
Candidate for Accession: Romania
Associate Members in Pre-Stage to Membership: Israel, Serbia
Applicant States for Membership or Associate Membership:
Brazil, Cyprus (awaiting ratification), Pakistan, Russia, Slovenia, Turkey, Ukraine
Observers to Council: India, Japan, Russia, Turkey, United States of America;
European Commission and UNESCO
~ 2,300 staff
~ 1,000 other paid personnel
> 11,000 users
Budget (2013) ~1,000 MCHF
What are the Origins of Mass?
Matter/Antimatter Symmetric?
Where is 95% of the Universe?
Collisions
July 4, 2012
Tim Bell, CERN (OpenStack London, December 2012)
Tier-0 (CERN):
• Data recording
• Initial data reconstruction
• Data distribution
Tier-1 (11 centres):
• Permanent storage
• Re-processing
• Analysis
Tier-2 (~200 centres):
• Simulation
• End-user analysis
• Data is recorded at CERN and Tier-1s and analysed in the Worldwide LHC Computing Grid
• On a normal day, the grid provides 100,000 CPU days executing over 2 million jobs
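The grid figures above imply a rough average job size. The sketch below is only back-of-envelope arithmetic on the two numbers quoted on the slide; the function names are illustrative, and because the slide says "over 2 million jobs" the result is an upper bound on the average.

```python
# Figures quoted on the slide (per normal day).
CPU_DAYS_PER_DAY = 100_000
JOBS_PER_DAY = 2_000_000  # "over 2 million", so the average below is an upper bound


def average_cpu_hours_per_job(cpu_days, jobs):
    """Average CPU time consumed per job, in hours."""
    return cpu_days * 24 / jobs


def equivalent_busy_cores(cpu_days_per_day):
    """Delivering N CPU-days every day is equivalent to N cores busy around the clock."""
    return cpu_days_per_day


print(average_cpu_hours_per_job(CPU_DAYS_PER_DAY, JOBS_PER_DAY))  # 1.2
print(equivalent_busy_cores(CPU_DAYS_PER_DAY))                    # 100000
```

So a typical job uses at most a little over an hour of CPU time, and the grid as a whole behaves like roughly 100,000 continuously busy cores.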
The CERN Meyrin Data Centre
A Big Data Challenge
• Nearly 100PB storage today
• LHC produces up to 35PB/year today
• Peaks of 25GB/second
• Data rates double in 2015
• Experiments will run for 20 years and data needs to be preserved
• Exabytes of storage to maintain…
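To see how these figures lead to "exabytes", here is a rough projection. It is a sketch under loudly stated assumptions that are not on the slide: decimal units (1 PB = 10^6 GB), a 365-day year, the rate doubling once after two years (2013 and 2014 at today's rate) and then staying flat, and nothing ever being deleted.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year


def average_rate_gb_per_s(pb_per_year):
    """Sustained average rate implied by a yearly volume (1 PB = 1e6 GB assumed)."""
    return pb_per_year * 1e6 / SECONDS_PER_YEAR


def projected_storage_pb(stored_pb, pb_per_year, years, doubling_after_years):
    """Cumulative storage if the yearly rate doubles once and then stays flat.

    This growth model is an illustrative assumption, not CERN's forecast.
    """
    total = stored_pb
    for year in range(years):
        rate = pb_per_year * (2 if year >= doubling_after_years else 1)
        total += rate
    return total


print(round(average_rate_gb_per_s(35), 2))   # 1.11
print(projected_storage_pb(100, 35, 20, 2))  # 1430
```

Under these assumptions the average rate is about 1.1GB/second, so the 25GB/second peaks are more than 20 times the average, and 20 years of running accumulates roughly 1.4 exabytes.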
New Data Centre Approved
• Data centre in Geneva at the limit of electrical and cooling capacity at 3.5MW
• New centre chosen in Budapest, Hungary
• Additional 2.7MW of usable power
• Local on-site support for hardware maintenance and installations
Good News, Bad News
• Additional data centre in Budapest now online
• Growing number of users of CERN’s facilities and higher computing requirements as data rates increase
• Staff numbers are fixed, no more people
• Materials budget decreasing, no more money
• Legacy tools are high maintenance and brittle
How do we maximise our computing resources within these constraints?
Approach
• Remodel IT services on Cloud layered models
• IaaS, PaaS, SaaS
• Move to commonly used open source tools
• Focus on strong communities and momentum
• Implement clouds at scale
• Aim for 90% infrastructure virtualised
• Exploit ecosystem solutions rather than writing from scratch
[Diagram] Bamboo; Koji, Mock; AIMS/PXE; Foreman; Yum repo; Pulp; Puppet-DB; mcollective, yum; JIRA; Lemon / Hadoop / LogStash / Kibana; git; OpenStack Nova; Hardware database; Puppet; Active Directory / LDAP
Training for Newcomers
Buy the book rather than guru mentoring
Job Opportunities
Service Models
• Pets are given names like pussinboots.cern.ch
• They are unique, lovingly hand raised and cared for
• When they get ill, you nurse them back to health
• Cattle are given numbers like vm0042.cern.ch
• They are almost identical to other cattle
• When they get ill, you get another one
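The cattle model can be sketched in a few lines. This is a hypothetical illustration only: the `CattlePool` class and its methods are invented for this example and are not part of OpenStack or of CERN's tooling. It shows the key property that a failed machine is discarded and a fresh, numbered one takes its place, with no attempt at repair.

```python
from itertools import count


class CattlePool:
    """A pool of interchangeable, numbered VMs (hypothetical illustration)."""

    def __init__(self, size, domain="cern.ch"):
        self._ids = count(1)  # next free serial number
        self.domain = domain
        self.members = [self._new_name() for _ in range(size)]

    def _new_name(self):
        # Cattle get numbers, not names: vm0001, vm0002, ...
        return f"vm{next(self._ids):04d}.{self.domain}"

    def replace(self, sick):
        """Discard a failed VM and return the name of its replacement."""
        self.members.remove(sick)
        fresh = self._new_name()
        self.members.append(fresh)
        return fresh


pool = CattlePool(size=3)
replacement = pool.replace("vm0002.cern.ch")
print(replacement)   # vm0004.cern.ch
print(pool.members)  # ['vm0001.cern.ch', 'vm0003.cern.ch', 'vm0004.cern.ch']
```

A pet, by contrast, would keep its name and be repaired in place, which is why it needs individual attention rather than a generic replacement step.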
CERN Status
• IT OpenStack Cloud
• Running Folsom on around 500 hypervisors with KVM and Hyper-V
• High availability components using load balancing
• OpenStack configuration managed entirely with Puppet
• Over 100 users creating up to 300 new VMs/day
• Cattle service level only
• LHC experiment farms
• CMS currently running 1,300 hypervisors with 50,000 cores
• ATLAS starting to ramp up to a similar size
• Other science grid sites moving to private cloud on OpenStack
• Brookhaven, IN2P3, FutureGrid, NeCTAR, IHEP, …
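As a quick sanity check on the CMS farm figures, the average number of cores per hypervisor follows directly from the two numbers on the slide. The snippet is plain arithmetic with illustrative names; it says nothing about how the cores are actually distributed across machines.

```python
CMS_CORES = 50_000
CMS_HYPERVISORS = 1_300


def cores_per_hypervisor(total_cores, hypervisors):
    """Average core count per hypervisor."""
    return total_cores / hypervisors


print(round(cores_per_hypervisor(CMS_CORES, CMS_HYPERVISORS), 1))  # 38.5
```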
[Diagram] Microsoft Active Directory; CERN DB on Demand; CERN Network Database; Account mgmt system; Horizon; Keystone; Network; Compute; Glance; Scheduler; Cinder; Nova; Block Storage Provider
http://www.eucalyptus.com/blog/2013/04/02/cy13-q1-community-analysis-%E2%80%94-openstack-vs-opennebula-vs-eucalyptus-vs-cloudstack
Preproduction Service
Outlook
• Track stable Grizzly releases in RedHat EPEL
• Up to date but not too close to the leading edge
• Scaling
• Expect 15,000 hypervisors, 150,000 VMs by 2015
• Manageability
• Metering, Orchestration with Heat, Bare Metal
• Functionality
• Load Balancing, High Availability Storage and Pets
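The scaling target can be put in perspective with two ratios. The sketch below uses the "around 500 hypervisors" figure from the CERN Status slide as the starting point, which is an assumption about what the 2015 target should be compared against; the function names are illustrative.

```python
CURRENT_HYPERVISORS = 500    # "around 500" from the CERN Status slide
TARGET_HYPERVISORS = 15_000  # expected by 2015
TARGET_VMS = 150_000         # expected by 2015


def growth_factor(current, target):
    """How many times larger the target is than the current value."""
    return target / current


def vms_per_hypervisor(vms, hypervisors):
    """Average number of VMs hosted on each hypervisor."""
    return vms / hypervisors


print(growth_factor(CURRENT_HYPERVISORS, TARGET_HYPERVISORS))  # 30.0
print(vms_per_hypervisor(TARGET_VMS, TARGET_HYPERVISORS))      # 10.0
```

That is roughly a 30-fold increase in hypervisors over about two years, at an average of 10 VMs per hypervisor.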
High Energy Physics Clouds
Long-term preservation of software and data of HEP experiments
Utilize special computing resources attached to the detectors
Simplify the management of heterogeneous in-house resources
Use commercial clouds for exceptional computing demands
Distributed cloud computing using HEP and non-HEP clouds
What have we learnt?
• Automate everything from the beginning
• Stackforge community projects are a great help
• Distributions and appliances make getting started much easier
• Constant rate of change requires a different approach
• Focus on core technologies and keep up to date
• Track new projects but don’t adopt too early unless strategic
• Many of our users are cloud aware
• Culture changes for legacy application coding and IT services
• Communities are major motivators
• Staff approach needs to change to adapt rather than re-invent
Conclusions
• CERN IT is re-engineering to deliver additional capacity to 11,000 physicists within fixed resources
• Cloud models can simplify current large-scale computing infrastructure
• OpenStack and its ecosystem allow us to meet this challenge and help others through open source
Questions?
Linux Security Incident
New operational procedures being developed with cloud flexibility

20130529 openstack cee_day_v6

Editor's Notes

  • #9 Over 1,600 magnets were lowered down shafts and cooled to -271 °C to become superconducting. There are two beam pipes, held at a vacuum with pressure 10 times lower than on the Moon.
  • #14 The Worldwide LHC Computing Grid is used to record and analyse this data. The grid currently runs over 2 million jobs/day, and less than 10% of the work is done at CERN. There is an agreed set of protocols for running jobs, data distribution and accounting between all the sites, which co-operate in order to support physicists across the globe.