Worldwide LHC Computing Grid - Ian Bird - HNSciCloud Prototype Phase Kick-off Meeting

Ian Bird's presentation at the HNSciCloud Prototype Phase Kick-off Meeting, 3 April 2017, CERN


  1. Worldwide Distributed Computing for the LHC. Dr. Ian Bird, CERN LHC Computing Project Leader, 3 April 2017.
  2. The Large Hadron Collider (LHC): a new frontier in energy and data volumes. The LHC experiments generate 50 PB/year. (Slide diagram: per-experiment data rates of ~700 MB/s, ~10 GB/s, >1 GB/s and >1 GB/s.)
  3. The Worldwide LHC Computing Grid (WLCG): an international collaboration to distribute and analyse LHC data, integrating computer centres worldwide that provide computing and storage resources into a single infrastructure accessible by all LHC physicists. Tier-0 (CERN and Hungary): data recording, reconstruction and distribution. Tier-1: permanent storage, re-processing, analysis. Tier-2: simulation, end-user analysis. Scale: >2 million jobs/day, ~750k CPU cores, 600 PB of storage, ~170 sites in 42 countries, 10-100 Gb links. (A minimal data-model sketch of the tier roles follows the slide list.)
  4. WLCG MoU signatures in 2017: 63 MoUs, 167 sites, 42 countries.
  5. Networks: the Optical Private Network supports T0-T1 transfers and T1-T1 traffic and is managed by the LHC Tier-0 and Tier-1 sites; up to 340 Gbps of transatlantic capacity.
  6. LHCONE: an overlay network that allows NRENs to manage HEP traffic on the general-purpose network; managed by the NREN collaboration. (Slide shows a world map spanning Asia, North America, South America and Europe, plus a monthly traffic chart.)
  7. Data in 2016: 49.4 PB of LHC data, 58 PB from all experiments, 73 PB in total, with a peak of 11 PB in July. By experiment: ALICE 7.6 PB, ATLAS 17.4 PB, CMS 16.0 PB, LHCb 8.5 PB. 180 PB on tape, 800 M files.
  8. Data distribution: global transfer rates increased to 30-40 GB/s (more than twice Run 1). Increased performance everywhere: data acquisition above 10 PB/month and data transfer rates above 35 GB/s globally. Regular transfers of >80 PB/month, with ~100 PB/month during July-October (many billions of files). Several Tier-1s increased OPN network bandwidth to CERN to manage the new data rates, and GEANT has deployed additional capacity for the LHC. (Chart: monthly traffic growth on LHCONE. A quick arithmetic check on these rates follows the slide list.)
  9. Worldwide computing: CPU delivered per month (HS06-days) for ALICE, ATLAS, CMS and LHCb, January 2010 to January 2017 (chart). Peak delivery: 18M core-days/month, corresponding to ~580k cores in continuous use. (A one-line check of that figure follows the slide list.)
  10. CERN facilities today. 2017: 225k cores growing to 325k; 150 PB of raw storage growing to 250 PB. In 2017-18/19: upgrade the internal networking capacity and refresh the tape infrastructure.
  11. Provisioning services: moving towards an elastic hybrid IaaS model, with in-house resources at full occupation, elastic use of commercial and public clouds, and an assumption of spot-market style pricing. (Slide diagram: OpenStack resource provisioning across more than one physical data centre; HTCondor; public cloud VMs, containers, bare metal and HPC (LSF); volunteer computing; IT and experiment services; end users via CI/CD, APIs, CLIs and GUIs; experiment pilot factories. A hypothetical sketch of the burst-scheduling idea follows the slide list.)
  12. CERN cloud procurements: since ~2014, a series of short CERN procurement projects of increasing scale and complexity.
      • 1st cloud procurement (2 Mar. to 31 Mar. 2015): ATLAS simulation jobs; single-core VMs; up to 3k VMs for 45 days.
      • Sponsored account (23 Mar. to 30 Nov. 2015): "evaluation of Azure as an IaaS"; any VO, any workload; targeting multiple DCs: Iowa, Dublin and Amsterdam.
      • 2nd cloud procurement (6 Nov. to 18 Dec. 2015): target all VOs, simulation jobs; 4-core VMs, O(1000) instances.
      • Agreement between IBM and CERN (20 Nov. 2015 to 13 May 2016): CERN PoC to evaluate resource provisioning, network configurations and compute performance; transparent extension of CERN's T0.
      • 3rd cloud procurement (1 Aug. to 30 Nov. 2016): provided by OTC IaaS; 4-core VMs, O(1000) instances; 500 TB of central storage (DPM); 1k public IPs through GÉANT.
  13. Commercial Clouds.
  14. Future challenges. The raw data volume from the LHC increases exponentially, and with it the processing and analysis load. Technology improving at ~20%/year will bring a factor of 6-10 in 10-11 years, but estimates of the resource needs at HL-LHC are a factor of ~10 above what is realistic to expect from technology at reasonably constant cost. CPU: x60 relative to 2016. Data: raw, 50 PB in 2016 growing to 600 PB in 2027; derived (1 copy), 80 PB in 2016 growing to 900 PB in 2027. (Slide charts: data estimates for the 1st year of HL-LHC in PB and CPU needs for the 1st year of HL-LHC in kHS06, per experiment, plus an LHC timeline running from the first run through LS1, the second run, LS2, the third run, LS3 and HL-LHC towards a possible FCC around 2030. A short sketch of the growth arithmetic follows the slide list.)
  15. HL-LHC computing cost parameters:
      • Parameters (business of the experiments): amount of raw data, thresholds; detector design has long-term computing cost implications.
      • Core algorithms (business of the experiments): reconstruction and simulation algorithms.
      • Software performance: performance, architectures, memory etc.; tools to support automated build/validation; collaboration with externals via the HSF.
      • Infrastructure: new grid/cloud models; optimize CPU/disk/network; economies of scale via clouds, joint procurements etc.
  16. A possible model for a future HEP computing infrastructure: a HEP data cloud providing storage and compute across a few data centres interconnected at 1-10 Tb/s, with attached compute and simulation resources, and cloud users running analysis. (Slide diagram.)
  17. Conclusions. WLCG has been very successful in providing the global computing environment for physics at the LHC. The engagement and contributions of the worldwide community have been essential for that. LHC upgrades over the coming decade will bring new challenges and opportunities, and technology will change our computing models.
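
The tier roles on slide 3 can be summarised as a small data model. Below is a minimal Python sketch using only the role descriptions from the slide; the class and variable names are illustrative and not part of any WLCG software.

```python
# A minimal data model of the WLCG tier roles described on slide 3.
# Role descriptions come from the slide; names are illustrative only.

from dataclasses import dataclass

@dataclass(frozen=True)
class Tier:
    name: str
    roles: tuple[str, ...]

WLCG_TIERS = (
    Tier("Tier-0 (CERN and Hungary)",
         ("data recording", "reconstruction", "distribution")),
    Tier("Tier-1",
         ("permanent storage", "re-processing", "analysis")),
    Tier("Tier-2",
         ("simulation", "end-user analysis")),
)

if __name__ == "__main__":
    for tier in WLCG_TIERS:
        print(f"{tier.name}: {', '.join(tier.roles)}")
```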
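The transfer figures on slide 8 are mutually consistent: a sustained 30-40 GB/s corresponds to roughly 80-100 PB per month. A short sketch of that conversion, assuming decimal units (1 PB = 10^6 GB) and a 30-day month:

```python
# Convert a sustained global transfer rate into a monthly data volume.
# Assumes decimal units (1 PB = 1e6 GB) and a 30-day month.

def monthly_volume_pb(rate_gb_per_s: float, days: int = 30) -> float:
    """Petabytes moved at a constant rate over `days` days."""
    return rate_gb_per_s * days * 24 * 3600 / 1e6

if __name__ == "__main__":
    for rate in (30.0, 35.0, 40.0):  # GB/s, the range quoted on the slide
        print(f"{rate:.0f} GB/s sustained -> {monthly_volume_pb(rate):.0f} PB/month")
    # 30-40 GB/s gives roughly 78-104 PB/month, matching the slide's
    # "regular transfers of >80 PB/month" and "~100 PB/month" peaks.
```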
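Similarly, slide 9's peak of ~18 million core-days per month corresponds to roughly 580-600k cores in continuous use, depending on the month length:

```python
# Cores needed to deliver ~18 million core-days in a single month.
peak_core_days = 18e6
for days_in_month in (30, 31):
    cores = peak_core_days / days_in_month
    print(f"{days_in_month}-day month: {cores:,.0f} cores running continuously")
# Roughly 580,000-600,000 cores, matching the slide's "~580k cores permanently".
```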
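Slide 11 describes an elastic hybrid IaaS model: keep in-house resources fully occupied and burst the overflow to commercial or public clouds, assuming spot-market style pricing. The sketch below illustrates that scheduling idea only; the function, the provider names, the prices and the price ceiling are hypothetical and are not CERN's actual provisioning logic (which the slide attributes to OpenStack and HTCondor).

```python
# Hypothetical burst-scheduling decision for an elastic hybrid IaaS model:
# fill in-house capacity first, then overflow to the cheapest public-cloud
# offer below a price ceiling (spot-market style pricing).

from dataclasses import dataclass

@dataclass
class CloudOffer:
    provider: str
    price_per_core_hour: float  # illustrative spot price
    available_cores: int

def plan_allocation(demand_cores: int,
                    in_house_free: int,
                    offers: list[CloudOffer],
                    price_ceiling: float) -> dict[str, int]:
    """Split a core demand between in-house capacity and public-cloud offers."""
    plan = {"in-house": min(demand_cores, in_house_free)}
    remaining = demand_cores - plan["in-house"]
    # Burst the overflow to the cheapest acceptable offers first.
    for offer in sorted(offers, key=lambda o: o.price_per_core_hour):
        if remaining <= 0 or offer.price_per_core_hour > price_ceiling:
            break
        take = min(remaining, offer.available_cores)
        plan[offer.provider] = take
        remaining -= take
    plan["unmet"] = remaining  # would stay queued if no capacity fits
    return plan

if __name__ == "__main__":
    offers = [CloudOffer("cloud-A", 0.012, 20_000),
              CloudOffer("cloud-B", 0.009, 8_000)]
    print(plan_allocation(demand_cores=40_000, in_house_free=25_000,
                          offers=offers, price_ceiling=0.015))
```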
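The resource gap quoted on slide 14 follows from compound growth: ~20%/year of price/performance improvement at flat cost yields only a single-digit factor over 10-11 years, against an estimated x60 growth in CPU needs. A short sketch of that arithmetic; only the ~20%/year and x60 figures come from the slide, and the 25%/year case is added purely to show the sensitivity of the estimate.

```python
# Compound technology growth at (roughly) flat cost vs. projected HL-LHC CPU needs.
# The ~20%/year growth figure and the x60 CPU-need factor are taken from the slide;
# the 25%/year case is added only to show the sensitivity of the estimate.

needed_cpu_factor = 60.0  # x60 CPU relative to 2016, quoted for HL-LHC

for rate in (0.20, 0.25):
    for years in (10, 11):
        tech_factor = (1 + rate) ** years
        shortfall = needed_cpu_factor / tech_factor
        print(f"{rate:.0%}/year over {years} years: x{tech_factor:.1f} "
              f"from technology, leaving ~x{shortfall:.0f} to close otherwise")
# 20-25%/year over 10-11 years gives roughly x6-12, so the x60 CPU need sits
# about an order of magnitude above what flat-cost technology growth provides.
```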
