
OpenStack at CERN : A 5 year perspective



OpenStack Days Budapest 2018



  1. OpenStack at CERN : A 5 year perspective. Tim Bell, tim.bell@cern.ch, @noggin143. OpenStack Days Budapest 2018
  2. About Me - @noggin143
      • Responsible for Compute and Monitoring at CERN
      • Elected member of the OpenStack Foundation board
      • Member of the OpenStack user committee from 2013-2015
  3. CERN, a worldwide collaboration. CERN’s primary mission: SCIENCE. Fundamental research on particle physics, pushing the boundaries of knowledge and technology.
  4. CERN: World’s largest particle physics laboratory. (Image credit: CERN)
  5. Evolution of the Universe. Test the Standard Model? What’s matter made of? What holds it together? Anti-matter? (Gravity?)
  6. The Large Hadron Collider (LHC): a 27 km ring with 1232 dipole magnets, each 15 metres long and 35 t. (Image credit: CERN)
  7. LHC: World’s largest cryogenic system (1.9 K), colder than outer space, using 120 t of helium. (Image credit: CERN)
  8. LHC: Highest vacuum: 104 km of pipes at 10^-11 bar (comparable to the Moon). (Image credit: CERN)
  9. The experiments ATLAS, CMS, ALICE and LHCb: heavier than the Eiffel Tower. (Image credit: CERN)
  10. 40 million pictures per second: 1 PB/s. (Image credit: CERN)
  11. Data flow to storage and processing (Run 2, CERN DC): ALICE 4 GB/s, ATLAS 1 GB/s, CMS 600 MB/s, LHCb 750 MB/s.
  12. CERN Data Centre: primary copy of LHC data. 15k servers, 90k disks, > 200 PB on tape. Data Centre on Google Street View. (Image credit: CERN)
  13. WLCG: the LHC Computing Grid. 170 sites worldwide, > 10,000 users. (Image credit: CERN)
      About WLCG:
      • A community of 10,000 physicists
      • ~250,000 jobs running concurrently
      • 600,000 processing cores
      • 700 PB of storage available worldwide
      • 20-40 Gbit/s connecting CERN to the Tier-1s
      Tier-0 (CERN): initial data reconstruction, data recording & archiving, data distribution to the rest of the world.
      Tier-1s (14 centres worldwide): permanent storage, re-processing, Monte Carlo simulation, end-user analysis.
      Tier-2s (> 150 centres worldwide): Monte Carlo simulation, end-user analysis.
  14. CERN in 2017: 230 PB on tape, 550 million files, 55 PB produced during the year.
  15. CERN Data Centre: private OpenStack cloud. More than 300,000 cores, more than 500,000 physics jobs per day.
  16. Infrastructure in 2011
      • Data centre managed by a home-grown toolset (Quattor, Lemon, …); initial development funded by EU projects
      • Development environment based on CVS; 100K or so lines of Perl
      • At the limit for power and cooling in Geneva; no simple expansion options
  17. Wigner Data Centre: project started in 2011, with inauguration in June 2013.
  18. Getting resources in 2011
  19. OpenStack London, July 2011
  20. 2011: First OpenStack Summit talk. https://www.slideshare.net/noggin143/cern-user-story
  21. The Agile Infrastructure Project. 2012, a turning point for CERN IT:
      - LHC computing and data requirements were increasing … Moore’s law would help, but not enough
      - EU-funded projects for the fabric management toolset had ended
      - Staff numbers were fixed, but resources had to grow
      - LS1 (2013) ahead, next window only in 2019!
      - Other deployments had surpassed CERN’s
      Three core areas: centralized monitoring, configuration management, and IaaS based on OpenStack. “All servers shall be virtual!”
  22. CERN Tool Chain
  23. (image-only slide)
  24. And block storage… February 2013
  25. Sharing with Central Europe, May 2013. https://www.slideshare.net/noggin143/20130529-openstack-ceedayv6
  26. Production in Summer 2013
  27. (image-only slide)
  28. CERN Ceph Clusters (+5 PB in the pipeline)
      Cluster                                 Size     Version
      OpenStack Cinder/Glance Production      5.5 PB   jewel
      Satellite data centre (1000 km away)    0.4 PB   luminous
      CephFS (HPC+Manila) Production          0.8 PB   luminous
      Manila testing cluster                  0.4 PB   luminous
      Hyperconverged HPC                      0.4 PB   luminous
      CASTOR/XRootD Production                4.2 PB   luminous
      CERN Tape Archive                       0.8 PB   luminous
      S3+SWIFT Production                     0.9 PB   luminous
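      The Cinder/Glance clusters above back the cloud’s volumes and images. As a minimal sketch of the user-facing side, assuming a Ceph-backed volume type is exposed (the type, volume and server names below are placeholders, not CERN’s actual configuration):
      # Placeholder names: the "ceph-standard" type and "myvm" server are assumptions
      $ openstack volume create --size 100 --type ceph-standard data-vol
      $ openstack server add volume myvm data-vol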
  29. Bigbang Scale Tests
      • Bigbang scale tests mutually benefit CERN & the Ceph project
      • Bigbang I: 30 PB, 7200 OSDs, Ceph hammer. Hit several osdmap limitations
      • Bigbang II: similar size, Ceph jewel. Scalability limited by OSD/MON messaging; motivated ceph-mgr
      • Bigbang III: 65 PB, 10800 OSDs
      https://ceph.com/community/new-luminous-scalability/
  30. OpenStack Magnum: an OpenStack API service that allows creation of container clusters
      • Use your Keystone credentials
      • You choose your cluster type
      • Multi-tenancy
      • Quickly create new clusters with advanced features such as multi-master
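      Slide 31 below creates a cluster from a template named “kubernetes”. A sketch of how such a template might be registered is shown here; the image, network, keypair and flavor names are placeholder values, not CERN’s actual settings.
      # Sketch only: all names below are placeholders
      $ openstack coe cluster template create \
            --coe kubernetes \
            --image fedora-atomic-27 \
            --external-network public \
            --keypair mykey \
            --master-flavor m1.medium \
            --flavor m1.medium \
            --network-driver flannel \
            kubernetes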
  31. OpenStack Magnum: single command cluster creation
      $ openstack coe cluster create --cluster-template kubernetes --node-count 100 … mycluster
      $ openstack coe cluster list
      +------+----------------+------------+--------------+-----------------+
      | uuid | name           | node_count | master_count | status          |
      +------+----------------+------------+--------------+-----------------+
      | .... | mycluster      | 100        | 1            | CREATE_COMPLETE |
      +------+----------------+------------+--------------+-----------------+
      $ $(magnum cluster-config mycluster --dir mycluster)
      $ kubectl get pod
      $ openstack coe cluster update mycluster replace node_count=200
  32. Why bare-metal provisioning?
      • VMs are not sensible/suitable for all of our use cases: storage and database nodes, HPC clusters, bootstrapping, critical network equipment or specialised network setups, precise/repeatable benchmarking for s/w frameworks, …
      • Complete our service offerings: physical nodes (in addition to VMs and containers), with the OpenStack UI as the single pane of glass
      • Simplify hardware provisioning workflows: for users, openstack server create/delete (see the sketch below); for the procurement & h/w provisioning team, initial on-boarding and server re-assignments
      • Consolidate accounting & bookkeeping: resource accounting input will come from fewer sources, and machine re-assignments will be easier to track
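      As a sketch of that user-facing flow (not CERN’s exact flavors or images), provisioning a physical node through Nova/Ironic looks the same as creating a VM, just with a bare-metal flavor:
      # Placeholder flavor/image/network names; assumes the flavor maps to an Ironic resource class
      $ openstack server create \
            --flavor baremetal-general \
            --image centos7-baremetal \
            --network physics-net \
            --key-name mykey \
            bm-worker-001
      $ openstack server delete bm-worker-001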
  33. Compute-intensive workloads on VMs
      • Up to 20% loss on very large VMs!
      • “Tuning” (KSM*, EPT**, pinning, …) brought this to ~10%
      • Compare with Hyper-V: no issue
      • NUMA-aware flavors & node pinning: < 3%! (see the flavor sketch below)
      • Cross-over: patches from Telecom
      VM layout      Before   After
      4 x 8 cores    7.8%
      2 x 16 cores   16%
      1 x 24 cores   20%      5%
      1 x 32 cores   20%      3%
      (*) Kernel Shared Memory  (**) Extended Page Tables
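      The NUMA awareness and pinning above are normally expressed as Nova flavor extra specs; a minimal sketch with a hypothetical flavor name:
      # Hypothetical flavor; hw:cpu_policy and hw:numa_nodes are standard Nova extra specs
      $ openstack flavor create --vcpus 32 --ram 122880 --disk 160 m1.xxlarge.pinned
      $ openstack flavor set m1.xxlarge.pinned \
            --property hw:cpu_policy=dedicated \
            --property hw:numa_nodes=2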
  34. A new use case: containers on bare metal
      • OpenStack already manages containers and bare metal, so put them together
      • General service offer: managed clusters. Users get only the K8s credentials; the cloud team manages the cluster and the underlying infrastructure
      • The batch farm runs in VMs as well
      • Evaluating federated Kubernetes for hybrid cloud integration: a federation of 7 clouds was demonstrated at KubeCon, with OpenStack and non-OpenStack clusters managed transparently
      • Integration: seamless! (based on a specific template)
      • Monitoring (metrics/logs)? A pod in the cluster; logs via fluentd + ES, metrics via cadvisor + InfluxDB
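      A sketch of the user side of such a managed cluster, assuming a cluster named “k8s-bm” (a placeholder): fetch the credentials Magnum generates, then look at the monitoring pods mentioned above.
      # “k8s-bm” and the grep pattern are placeholders for illustration
      $ $(openstack coe cluster config k8s-bm --dir ~/k8s-bm)
      $ kubectl get nodes
      $ kubectl get pods --all-namespaces | grep -E 'fluentd|cadvisor'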
  35. Hardware Burn-in in the CERN Data Centre (1)
      • Hardware purchases follow a formal procedure compliant with public procurement: a market survey identifies potential bidders, a tender spec is sent out to ask for offers, and larger deliveries arrive 1-2 times per year
      • “Burn-in” before acceptance: check compliance with the technical spec (e.g. performance), find failed components (e.g. broken RAM), find systematic errors (e.g. bad firmware), and provoke early failures through stress (“bathtub curve”)
      • The whole process can take weeks!
  36. Hardware Burn-in in the CERN Data Centre (2)
      • Initial checks: serial asset tag and BIOS settings; the purchase order ID and unique serial number are set in the BMC (and form the node name!)
      • “Burn-in” tests:
        - CPU: burnK7, burnP6, burnMMX (cooling)
        - RAM: memtest; disk: badblocks
        - Network: iperf(3) between pairs of nodes, with automatic node pairing
        - Benchmarking: HEPSpec06 (& fio), a derivative of SPEC06; we buy total compute capacity (not the newest processors)
      $ ipmitool fru print 0 | tail -2
      Product Serial        : 245410-1
      Product Asset Tag     : CD5792984
      $ openstack baremetal node show CD5792984-245410-1
      “Double peak” structure in the HEPSpec06 results due to slower hardware threads (see the OpenAccess paper)
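      A rough sketch of one round of such checks on a pair of nodes; hostnames, device names and durations are placeholders, and the HEPSpec06 benchmarking is driven by CERN tooling not shown here.
      # Placeholder hostnames, device names and durations
      $ badblocks -sv /dev/sdb                 # read-only surface scan of a data disk
      $ iperf3 -s                              # on the automatically paired node
      $ iperf3 -c bm-worker-002 -t 600         # 10-minute network soak from this node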
  37. Hardware lifecycle workflow (diagram): allocation and re-allocation managed through Foreman, with the burn-in step recently added.
  38. Network Migration
      • Phase 1: Nova Network with Linux Bridge (still used in 2018)
      • Phase 2: Neutron with Linux Bridge (already running)
      • Phase 3: SDN with Tungsten Fabric (testing)
      New region coming in 2018.
  39. Spectre / Meltdown
      • In January, a security vulnerability was disclosed: a new kernel was needed everywhere
      • Campaign over two weeks from 15th January: 7 reboot days, 7 tidy-up days, done by availability zone
      • Benefits: automation now in place to reboot the cloud if needed (33,000 VMs on 9,000 hypervisors); latest QEMU and RBD user code on all VMs
      • Downside: discovered a kernel bug in XFS which may mean we have to do it again soon
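      A sketch of the per-hypervisor steps such reboot automation wraps; the hypervisor name is a placeholder, and the availability-zone scheduling and tidy-up logic are omitted.
      # Placeholder hypervisor name; the real campaign iterated per availability zone
      $ openstack compute service set --disable \
            --disable-reason "Meltdown/Spectre kernel reboot" hv001.cern.ch nova-compute
      $ openstack server list --all-projects --host hv001.cern.ch   # VMs affected by the reboot
      $ ssh root@hv001.cern.ch reboot                               # boot the patched kernel
      $ openstack compute service set --enable hv001.cern.ch nova-compute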
  40. Community Experience
      • Open source collaboration sets a model for in-house teams
      • External recognition by the community is highly rewarding for contributors
      • Reviewing and being reviewed is a constant learning experience
      • Productive for the job market for staff
      • Working groups, like the Scientific and Large Deployment teams, discuss a wide range of topics
      • Effective knowledge transfer mechanisms consistent with the CERN mission
      • 110 outreach talks since 2011
      • Dojos at CERN bring good attendance (Ceph, CentOS, Elastic, OpenStack CH, …)
  41. HL-LHC: more collisions!
      • Increased complexity due to much higher pile-up and higher trigger rates will bring several challenges to reconstruction algorithms
      • CMS already had to cope with monster pile-up: the 8b4e bunch structure gave a pile-up of ~60 events/x-ing (vs. ~20 events/x-ing)
      • CMS: event from 2017 with 78 reconstructed vertices
      • ATLAS: simulation for HL-LHC with 200 vertices
  42. LHC schedule 2009-2030 (timeline): first run, LS1, second run, LS2, third run, LS3, HL-LHC Run 4 from ~2026
      • Raw data volume increases significantly for the High Luminosity LHC
      • A significant part of the cost comes from global operations
      • Even with a technology increase of ~15%/year, we still have a big gap if we keep trying to do things with our current compute models
  43. Commercial Clouds
  44. Development areas going forward
      • Spot market
      • Cells V2
      • Neutron scaling
      • Magnum rolling upgrades
      • Block storage performance
      • Federated Kubernetes
      • Collaborations with industry and SKA
  45. Summary
      • OpenStack has provided flexible infrastructure at CERN since 2013
      • The open infrastructure toolchain has been stable at scale
      • Clouds are part, but not all, of the solution
      • Open source collaborations have been fruitful for CERN, industry and the communities
      • Further efforts will be needed to ensure that physics is not limited by the computing resources available
  46. Thanks for all your help… Some links:
      • CERN OpenStack blog: http://openstack-in-production.blogspot.com
      • Recent CERN OpenStack talks at the Vancouver summit: https://www.openstack.org/videos/search?search=cern
      • CERN tools: https://github.com/cernops
  47. Backup Material
  48. Hardware Evolution
      • Looking at new hardware platforms to reduce the upcoming resource gap
      • Explorations have been made into low-cost, low-power ARM processors
      • Interesting R&D in high-performance hardware: GPUs for deep learning network training and fast simulation, FPGAs for neural network inference and data transformations
      • Significant algorithm changes are needed to benefit from the potential
