The World Wide Distributed Computing Architecture of the LHC Datagrid

  1. Distributed Data Management for LHC. Dirk Duellmann, CERN, Geneva. Accelerating Science and Innovation.
  2. The Status of the Higgs Search, J. Incandela for the CMS Collaboration, 4 July 2012: H → γγ candidate. ATLAS: Status of SM Higgs searches, 4/7/2012: evolution of the excess with time.
  3. CERN, founded in 1954: "Science for Peace". 20 member states: Austria, Belgium, Bulgaria, the Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Italy, the Netherlands, Norway, Poland, Portugal, Slovakia, Spain, Sweden, Switzerland and the United Kingdom. Candidate for accession: Romania. Associate members in the pre-stage to membership: Israel, Serbia. Applicant states: Cyprus, Slovenia, Turkey. Observers to Council: India, Japan, the Russian Federation, the United States of America, Turkey, the European Commission and UNESCO. ~2,300 staff, ~1,050 other paid personnel, ~11,000 users. Budget (2012): ~1,000 MCHF.
  4. Global science: 11,000 scientists.
  5. Dirk Düllmann, CERN/IT.
  6. Stars and planets account for only a small percentage of the universe!
  7. CERN / May 2011.
  8. 8. Ø 27 kilometre circle Ø proton collisions at 7+7 TeV Ø 10.000 magnets Ø 8000 km super-conducting cables Ø 120 t of liquid Helium The Large Hadron Collider The largest super conducting installation in the word
  9. Precision: the 27 km ring is sensitive to changes of less than 1 mm, caused by tides, stray currents and rainfall.
  10. The ATLAS cavern: 140,000 m³ of rock removed; 53,000 m³ of concrete; 6,000 tons of steel reinforcement; 55 metres long, 30 metres wide, 53 metres high (a 10-storey building).
  11. A collision at the LHC, 26 June 2009.
  12. The data acquisition system for one experiment.
  13. Tier 0 at CERN: acquisition, first reconstruction, storage & distribution. 2011 data rates: 400-500 MB/s recorded (1.25 GB/s for heavy ions) and 4-6 GB/s distributed.
  14. The LHC computing challenge. Signal/noise: 10⁻¹³ (10⁻⁹ offline). Data volume: high rate × large number of channels × 4 experiments → ~15 PB of new data each year (~30 PB in 2012). Compute power: event complexity × number of events × thousands of users → 200k CPUs (now 300k) and 45 PB of disk storage (now 170 PB). Worldwide analysis & funding: computing is funded locally in major regions and countries, and efficient analysis must be possible everywhere → grid technology.
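As a rough cross-check (my own arithmetic, not from the slides), the yearly volume can be turned back into an average rate and compared with the Tier-0 figures on the previous slide:

```python
# Rough check: ~15 PB/year of new data vs. the 400-500 MB/s Tier-0 rate.
SECONDS_PER_YEAR = 365.25 * 24 * 3600           # ~3.16e7 s

new_data_per_year_pb = 15                       # PB/year, from the slide
new_data_bytes = new_data_per_year_pb * 1e15    # using decimal petabytes

avg_rate_mb_s = new_data_bytes / SECONDS_PER_YEAR / 1e6
print(f"Average rate: ~{avg_rate_mb_s:.0f} MB/s")   # ~475 MB/s
```

Averaged over a calendar year this comes out near 475 MB/s, the same order as the 400-500 MB/s recording rate quoted for 2011; the instantaneous rate during data taking is of course higher, since the machine does not run all year.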
  15. The CERN computer centre: built in the 70s on the CERN site; ~3,000 m² over three machine rooms; 3.5 MW for equipment. A recent extension is located at Wigner (Budapest, Hungary): ~1,000 m², 2.7 MW for equipment, connected to CERN with 2×100 Gb/s links.
  16. World Wide Grid: what and why? A distributed computing infrastructure that provides the production and analysis environments for the LHC experiments, managed and operated by a worldwide collaboration between the experiments and the participating computer centres. The resources are distributed, for funding and sociological reasons; our task was to make use of the resources available to us, no matter where they are located. Tier-0 (CERN): data recording, initial data reconstruction, data distribution. Tier-1 (11 centres): permanent storage, re-processing, analysis. Tier-2 (~130 centres): simulation, end-user analysis.
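The role split on this slide can be summarised in a few lines of code. This is purely an illustration under my own naming, not how WLCG workload management actually works:

```python
# Illustration only: the Tier-0/1/2 role split described on the slide,
# expressed as a small lookup. Names are invented for this sketch.
TIER_ROLES = {
    "Tier-0": {"data recording", "initial reconstruction", "data distribution"},
    "Tier-1": {"permanent storage", "re-processing", "analysis"},
    "Tier-2": {"simulation", "end-user analysis"},
}

def tiers_for(activity: str) -> list:
    """Return the tiers that, per the slide, take on a given activity."""
    return [tier for tier, roles in TIER_ROLES.items() if activity in roles]

print(tiers_for("simulation"))        # ['Tier-2']
print(tiers_for("permanent storage")) # ['Tier-1']
```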
  17. CPU around the tiers. The grid really works; all sites, large and small, can contribute, and their contributions are needed! Charts: CPU delivered, January 2011, by Tier-1 centre (CERN, BNL, CNAF, KIT, NL-LHC/Tier-1, RAL, FNAL, CC-IN2P3, ASGC, PIC, NDGF, TRIUMF, Tier-2); Tier-2 CPU delivered by country, January 2011 (USA, UK, France, Germany, Italy, the Russian Federation, Spain, Canada, Poland, Switzerland, Slovenia, the Czech Republic, China, Portugal, Japan, Sweden, Israel, Romania, Belgium, Austria, Hungary, Taipei, Australia, the Republic of Korea, Norway, Turkey, Ukraine, Finland, India, Pakistan, Estonia, Brazil, Greece).
  18. Evolution of capacity: CERN & WLCG. Charts: WLCG CPU growth and WLCG disk growth, 2008-2013, split by Tier-2, Tier-1 and CERN; CERN computing capacity, 2005-2013, compared with what we thought was needed at LHC start and what we actually used at LHC start. 2013/14: modest increases to process "parked data". 2015: budget limited? Experiments will push trigger rates, while flat budgets give only ~20%/year growth.
  19. LHC networking relies on the OPN, GEANT and US-LHCNet, plus NRENs and other national and international providers.
  20. Computing model evolution: the computing models are evolving from a strict hierarchy to a mesh.
  21. Physics storage at CERN: CASTOR and EOS, storage systems developed at CERN. Both use the same commodity disk servers: CASTOR with RAID-1 (2 copies in the mirror); EOS with JBOD and RAIN, with replicas spread over different disk servers and tunable redundancy.
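To make "tunable redundancy" concrete, here is a rough sketch (my own illustration, not CASTOR or EOS code) of how usable capacity compares between 2-way mirroring and a RAIN-style layout with d data and p parity stripes:

```python
# Rough illustration of storage overhead; not actual CASTOR/EOS logic.

def usable_fraction_mirror(copies: int = 2) -> float:
    """RAID-1 style mirroring: usable space is 1/copies of the raw space."""
    return 1 / copies

def usable_fraction_rain(data_stripes: int, parity_stripes: int) -> float:
    """RAIN/erasure-style layout: d data + p parity stripes on different servers."""
    return data_stripes / (data_stripes + parity_stripes)

raw_pb = 1.0  # one petabyte of raw disk, for comparison
print(f"2-way mirror : {raw_pb * usable_fraction_mirror(2):.2f} PB usable")
print(f"RAIN (4+2)   : {raw_pb * usable_fraction_rain(4, 2):.2f} PB usable")
print(f"RAIN (10+2)  : {raw_pb * usable_fraction_rain(10, 2):.2f} PB usable")
```

The tunable part is the choice of stripe counts (or replica count), traded against how many simultaneous disk-server failures the layout must survive.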
  22. CASTOR: the physics data archive. Data: ~90 PB of data on tape in 250 million files; up to 4.5 PB of new data per month; peaks of over 10 GB/s (read + write). Infrastructure: ~52,000 tapes (1 TB, 4 TB and 5 TB); 9 robotic libraries (IBM and Oracle); 80 production and 30 legacy tape drives.
  23. EOS usage at CERN today: 44.8 PB of disk space, 32.1 PB used, 136 (279) million files, ~20.7k disks.
  24. Availability and performance: archival & data distribution, and user analysis. Usage peaks during pp running in 2012 and pA running in 2013.
  25. CERN openlab in a nutshell: a science-industry partnership to drive R&D and innovation, with over a decade of success. Evaluate state-of-the-art technologies in a challenging environment and improve them; test in a research environment today what will be used in many business sectors tomorrow; train the next generation of engineers and employees; disseminate results and reach out to new audiences.
  26. Ongoing R&D, e.g. cloud storage: a CERN openlab joint project since January 2012, testing scaling and TCO gains with prototype applications on a Huawei S3 storage appliance (0.8 PB), with logical replication and fail-in-place.
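Because the appliance speaks the S3 protocol, prototype applications can use any standard S3 client against it. A minimal sketch, assuming Python with boto3; the endpoint, credentials and bucket name are placeholders, and the actual openlab tests may have used different tooling:

```python
# Minimal S3 smoke test against an S3-compatible appliance.
# Endpoint, credentials and bucket below are placeholders, not real values.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3-appliance.example.cern.ch",  # placeholder endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

bucket, key = "openlab-tests", "hello.txt"
s3.create_bucket(Bucket=bucket)                  # idempotent on most S3 stores
s3.put_object(Bucket=bucket, Key=key, Body=b"hello from the prototype")
obj = s3.get_object(Bucket=bucket, Key=key)
print(obj["Body"].read())                        # b'hello from the prototype'
```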
  27. Thanks for your attention! More at http://cern.ch. Accelerating Science and Innovation.
