The size and complexity of the CERN network


  1. 1. The CERN Network Openlab Summer 2012 CERN, 6th August 2012 edoardo.martelli@cern.ch CERN IT Department, CH-1211 Genève 23, Switzerland www.cern.ch/it 1
  2. 2. Summary - IT-CS - CERN networks - LHC Data Challenge - WLCG - LHCOPN and LHCONE - Openlab - Conclusions 2
  3. 3. IT-CS Communication systems 3
  4. 4. IT-CS The IT-CS group is responsible for all communication services in use at CERN for data, voice and video http://it-cs.web.cern.ch/it-cs/ 4
  5. 5. IT-CS organization 5
  6. 6. Networks at CERN 6
  7. 7. CERN accelerator complex 7
  8. 8. High Energy Physics over IP Most of the CERN infrastructure is controlled and managed over a pervasive IP network 8
  9. 9. Cryogenics 27 km of pipes kept at -271.11 °C by means of 700,000 litres of helium: controlled over IP Source: http://te-dep-crg-oa.web.cern.ch/te-dep-crg-oa/te-crg-oa_fichiers/cryolhc/LHC%20Cryo_BEOP_lectures2009.pdf 9
  10. 10. Access control Safety and security: done over IP Source: https://edms.cern.ch/file/931641/1/LASS-LACS_IHM.pdf 10
  11. 11. Remote inspections Remote inspection of dangerous areas: robots controlled and giving feedback over WiFi and GSM IP networks 11
  12. 12. DAQ: Data Acquisition A constant stream of data from the four detectors to disk storage Source: http://aliceinfo.cern.ch/Public/Objects/Chapter2/DetectorComponents/daq_architecture.pdf 12
  13. 13. CCC: CERN Control Centre The nerve centre of the particle accelerator: run over IP 13
  14. 14. CERN data network - 150 routers - 2,200 switches - 50,000 connected devices - 5,000 km of optical fibres 14
  15. 15. Network Provisioning and Management System - 250 database tables - 100,000 registered devices - 50,000 hits/day on the web user interface - 1,000,000 lines of code - 11 years of development 15
  16. 16. Monitoring and Operations The whole network is monitored and operated by the CERN NOC (Network Operation Centre) 16
  17. 17. IPv6 IPv6 dual-stack network deployment ongoing: (almost) ready in 2013 Already available: dual-stack testbed More information: http://cern.ch/ipv6 17
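To illustrate what dual-stack readiness means for applications, here is a minimal sketch, assuming a generic client, of resolving a host over both address families and preferring IPv6 with an IPv4 fallback; the host name in the usage comment is a placeholder, not a real CERN service.

```python
import socket

def connect_dual_stack(host, port):
    """Try IPv6 first, then fall back to IPv4: typical dual-stack client behaviour."""
    # getaddrinfo returns both AAAA and A results on a dual-stack resolver
    infos = socket.getaddrinfo(host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
    # Prefer IPv6 addresses, keep IPv4 ones as fallback
    infos.sort(key=lambda info: 0 if info[0] == socket.AF_INET6 else 1)
    last_error = None
    for family, socktype, proto, _, sockaddr in infos:
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(5)
            sock.connect(sockaddr)
            return sock
        except OSError as err:
            last_error = err
    raise last_error or OSError("no usable address")

# Example with a placeholder host: connect_dual_stack("www.example.org", 80)
```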
  18. 18. LHC Data Challenge 18
  19. 19. Collisions in the LHC 19
  20. 20. Comparing theory... Simulated production of a Higgs event in ATLAS 20
  21. 21. .. to real events Higgs event in CMS 21
  22. 22. Data flow 3 PBytes/s out of the 4 experiments; filter and first selection reduce this to 2 GBytes/s sent to the CERN computer centre, where data are stored on disk and tape (4 GBytes/s) and sub-samples are created and copies exported (10 GBytes/s); world-wide analysis, distributed + local, at roughly 1 TByte/s (?), leads to the physics: the explanation of nature. The slide also shows the Z-resonance cross section: \sigma_{f\bar{f}} \approx \sigma^0_{f\bar{f}} \, \frac{s\,\Gamma_Z^2}{(s - m_Z^2)^2 + s^2 \Gamma_Z^2 / m_Z^2}, with \sigma^0_{f\bar{f}} = \frac{12\pi}{m_Z^2}\,\frac{\Gamma_{ee}\Gamma_{f\bar{f}}}{\Gamma_Z^2} and \Gamma_{f\bar{f}} = \frac{G_F m_Z^3}{6\pi\sqrt{2}}\,(v_f^2 + a_f^2)\,N_{col} 22
  23. 23. Data Challenge - 40 million collisions per second - After filtering, 100 collisions of interest per second - 10^10 collisions recorded each year = 15 Petabytes/year of data 23
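A quick back-of-envelope check of the volume quoted above; the ~1.5 MB average event size is an assumption chosen for illustration, not a figure from the slides.

```python
# Rough consistency check of the Data Challenge numbers.
collisions_recorded_per_year = 1e10   # figure from the slide
assumed_event_size_bytes = 1.5e6      # assumption: ~1.5 MB per recorded event

total_bytes = collisions_recorded_per_year * assumed_event_size_bytes
print(f"{total_bytes / 1e15:.0f} PB/year")  # -> 15 PB/year, matching the slide
```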
  24. 24. Computing model 24
  25. 25. Last months' data transfers 25
  26. 26. WLCG Worldwide LHC Computing Grid 26
  27. 27. WLCG Distributed Computing Infrastructure for LHC experiments Collaborative effort of the HEP community 27
  28. 28. WLCG resources WLCG sites: - 1 Tier0 (CERN) - 11 Tier1s - ~140 Tier2s - >300 Tier3s worldwide - ~250,000 CPUs - ~150 PB of disk space 28
  29. 29. CERN Tier0 resources (March 2012): Servers 11,000 - Disks 64,000 - Processors 15,000 - Raw disk capacity 63,000 TB - Cores 64,000 - Memory modules 56,000 - HEPspec06 480,000 - RAID controllers 3,750 - Tape drives 160 - High-speed routers 23 - Tape cartridges 45,000 - Ethernet switches 500 - Tape slots 56,000 - 10 Gbps ports 3,000 - Tape capacity 34,000 TB - 100 Gbps ports 48 29
  30. 30. CERN Tier0 LCG network Border routers face the Tier1s, Tier2/3s and the CERN campus, with 170G of aggregated bandwidth towards the LHC experiments; 40G links connect them to the core routers, 100G links to the distribution routers, 10G or 40G links to the LCG access switches (up to 892), and 1G or 10G links to the servers 30
  31. 31. Trends - Virtualization and mobility (Software Defined Networks) - Commodity servers with 10G NICs - High-end servers with 40G NICs - 40G and 100G interfaces on switches and routers 31
  32. 32. LHCOPN LHC Optical Private Network 32
  33. 33. Tier0-Tier1s network 33
  34. 34. A collaborative effort Designed, built and operated by the Tier0-Tier1s community Links provided by the Research and Education network providers: Geant, USLHCnet, Esnet, Canarie, ASnet, Nordunet, Surfnet, GARR, Renater, JANET.UK, Rediris, DFN, SWITCH 34
  35. 35. Technology - Single and bundled long-distance 10G Ethernet links - Multiple redundant paths, star + partial-mesh topology - BGP routing: communities for traffic engineering, load balancing - Security: only declared IP prefixes can exchange traffic 35
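As a rough illustration of the "only declared prefixes" rule, the sketch below models such a filter; the prefixes are RFC 5737 documentation ranges chosen for the example, not real LHCOPN prefixes.

```python
import ipaddress

# Hypothetical declared prefixes (documentation ranges, not real LHCOPN ones)
DECLARED_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),     # e.g. a Tier1 storage subnet
    ipaddress.ip_network("198.51.100.0/24"),  # e.g. the Tier0 export subnet
]

def is_allowed(src_ip: str, dst_ip: str) -> bool:
    """Accept traffic only if both endpoints fall inside declared prefixes."""
    src, dst = ipaddress.ip_address(src_ip), ipaddress.ip_address(dst_ip)
    in_declared = lambda ip: any(ip in net for net in DECLARED_PREFIXES)
    return in_declared(src) and in_declared(dst)

print(is_allowed("192.0.2.10", "198.51.100.20"))  # True: both prefixes declared
print(is_allowed("192.0.2.10", "203.0.113.5"))    # False: destination not declared
```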
  36. 36. Traffic to the Tier1s 36
  37. 37. Monitoring 37
  38. 38. LHCONELHC Open Network Environment 38
  39. 39. Driving the change “The Network infrastructure is the most reliable service we have” “Network Bandwidth (rather than disk) will need to scale more with users and data volume” “Data placement will be driven by demand for analysis and not pre- placement” Ian Bird, WLCG project leader 39
  40. 40. Change of computing model (ATLAS) 40
  41. 41. New computing model - Better and more dynamic use of storage - Reduce the load on the Tier1s for data serving - Increase the speed to populate analysis facilities This creates the need for a faster, predictable, pervasive network connecting Tier1s and Tier2s 41
  42. 42. Requirements from the Experiments - Connecting any pair of sites, regardless of the continent they reside on - Bandwidth ranging from 1Gbps (Minimal) and 5Gbps (Nominal) to 10G and above (Leadership) - Scalability: sites are expected to grow - Flexibility: sites may join and leave at any time - Predictable cost: well defined and not too high 42
  43. 43. The need for a better network - more bandwidth by federating (existing) resources - shared cost of expensive resources - accessible to any TierX site = LHC Open Network Environment 43
  44. 44. LHCONE concepts - Serves any LHC site according to its needs and allows it to grow - A collaborative effort among Research & Education Network Providers - Based on Open Exchange Points: easy to join, neutral - Multiple services: one cannot fit all - Traffic separation: no clash with other data transfers; resources allocated for and funded by the HEP community 44
  45. 45. LHCONE architecture 45
  46. 46. LHCONE building blocks - Single node Exchange Points - Continental/regional Distributed Exchange Points - Interconnect circuits between Exchange Points These exchange points and the links in between collectively provide LHCONE services and operate under a common LHCONE policy 46
  47. 47. The underlying infrastructure 47
  48. 48. LHCONE services - Layer3 VPN - Point-to-Point links - Monitoring 48
  49. 49. Openlab and IT-CS 49
  50. 50. Openlab project: CINBAD 50
  51. 51. CINBAD CERN Investigation of Network Behaviour and Anomaly Detection Project Goals: Understand the behaviour of large computer networks (10’000+ nodes) in High Performance Computing or large Campus installations to be able to: ● detect traffic anomalies in the system ● perform trend analysis ● automatically take counter measures ● provide post-mortem analysis facilities Resources: - In collaboration with HP Networking - Two Engineers in IT-CS 51
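To make the "detect traffic anomalies" goal concrete, here is a minimal sketch of one common approach (a rolling z-score over per-interval traffic counters); it illustrates the general idea only and is not CINBAD's actual algorithm.

```python
from collections import deque
from statistics import mean, stdev

def detect_anomalies(samples, window=60, threshold=4.0):
    """Flag intervals whose traffic deviates strongly from the recent baseline.

    samples: iterable of per-interval counters (e.g. packets/s from sFlow).
    Yields (index, value) for intervals more than `threshold` standard
    deviations away from the rolling mean. Illustrative only.
    """
    history = deque(maxlen=window)
    for i, value in enumerate(samples):
        if len(history) >= 10:                 # need some baseline first
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
        history.append(value)

# Example with synthetic data: a sudden traffic spike at index 50
traffic = [1000 + (i % 7) for i in range(100)]
traffic[50] = 50000
print(list(detect_anomalies(traffic)))   # -> [(50, 50000)]
```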
  52. 52. Results Project completed in 2010 For CERN: Designed and deployed a complete framework (hardware and software) to detect anomalies in the Campus Network (GPN) For HP: Intellectual property on new technologies used in commercial products 52
  53. 53. CINBAD architecture: data sources → collectors → storage → analysis 53
  54. 54. Openlab project: WIND 54
  55. 55. WIND Wireless Infrastructure Network Deployment Project Goals: - Analyze the problems of large-scale wireless deployments and understand the constraints - Simulate the behaviour of WLANs - Develop new optimisation algorithms Resources: - In collaboration with HP Networking - Two Engineers in IT-CS - Started in 2010 55
  56. 56. Needs Wireless LAN (WLAN) deployments are problematic: ● Radio propagation is very difficult to predict ● Interference is an ever-present danger ● WLANs are difficult to deploy properly ● Monitoring was not an issue when the first standards were developed ● When administrators are struggling just to operate the WLAN, performance optimisation is often forgotten 56
  57. 57. Example: radio interference Maximum data rate in building 0031-S when the APs work on the same channel, compared with the maximum data rate when the APs work on 3 independent channels 57
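A toy illustration of why independent channels matter: assigning the non-overlapping 2.4 GHz channels (1, 6, 11) to neighbouring APs can be treated as greedy graph colouring. The AP names and adjacency below are invented for the example; this is not WIND's optimisation algorithm.

```python
# Greedy assignment of non-overlapping 2.4 GHz channels to neighbouring APs.
CHANNELS = [1, 6, 11]

# Which APs can hear each other (hypothetical building layout)
neighbours = {
    "AP1": ["AP2", "AP3"],
    "AP2": ["AP1", "AP3"],
    "AP3": ["AP1", "AP2", "AP4"],
    "AP4": ["AP3"],
}

def assign_channels(neighbours):
    assignment = {}
    # Handle the most constrained (highest-degree) APs first
    for ap in sorted(neighbours, key=lambda a: -len(neighbours[a])):
        used = {assignment.get(n) for n in neighbours[ap]}
        free = [c for c in CHANNELS if c not in used]
        # Fall back to any channel if all three are already taken nearby
        assignment[ap] = free[0] if free else CHANNELS[0]
    return assignment

print(assign_channels(neighbours))  # -> {'AP3': 1, 'AP1': 6, 'AP2': 11, 'AP4': 6}
```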
  58. 58. Expected results - Extend monitoring and analysis tools - Act on the network: smart load balancing, isolating misbehaving clients, intelligent minimum data rates - More accurate troubleshooting - Streamlined WLAN design 58
  59. 59. Openlab project: ViSION 59
  60. 60. ViSION Project Goals: - Develop an SDN traffic orchestrator using OpenFlow Resources: - In collaboration with HP Networking - Two Engineers in IT-CS - Started in 2012 60
  61. 61. Goals SDN traffic orchestrator using OpenFlow: ● distribute traffic over a set of network resources ● perform classification (different types of applications and resources) ● perform load sharing (similar resources) Benefits: ● improved scalability and control compared with traditional networking technologies 61
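As a sketch of the load-sharing idea (spreading flows of one class over several equivalent resources), flows can be mapped to resources by hashing their identifying fields so that each flow sticks to one path. The field choice and resource names are assumptions for illustration, not ViSION's design.

```python
import hashlib

# Hypothetical pool of equivalent resources (e.g. parallel links or servers)
RESOURCES = ["path-A", "path-B", "path-C"]

def pick_resource(src_ip, dst_ip, src_port, dst_port, proto="tcp"):
    """Hash the 5-tuple so all packets of one flow use the same resource."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha1(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(RESOURCES)
    return RESOURCES[index]

print(pick_resource("192.0.2.10", "198.51.100.20", 49152, 443))
print(pick_resource("192.0.2.11", "198.51.100.20", 49153, 443))
```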
  62. 62. From traditional networks... Closed boxes running fully distributed protocols: each device bundles its own apps, its own operating system and specialized packet-forwarding hardware 62
  63. 63. ...to Software Defined Networks (SDN): 1. an open interface to the hardware (OpenFlow) on simple packet-forwarding hardware; 2. at least one good network operating system (extensible, possibly open-source); 3. a well-defined open API for the apps 63
  64. 64. OpenFlow example A controller PC runs a software layer with an OpenFlow client; the hardware forwarding table of the switch is remotely controlled through flow-table entries that match MAC src/dst, IP src/dst and TCP sport/dport and carry an action, e.g. the entry (*, *, *, 5.6.7.8, *, *) forwards to port 1 64
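The behaviour of such a flow table can be sketched in a few lines; the single rule below mirrors the example entry on the slide, and the wildcard handling is simplified compared with real OpenFlow (no priorities, masks or extra fields).

```python
# Minimal model of wildcard flow-table matching, loosely mirroring the slide.
WILDCARD = "*"

flow_table = [
    # (mac_src, mac_dst, ip_src, ip_dst, tcp_sport, tcp_dport) -> action
    ((WILDCARD, WILDCARD, WILDCARD, "5.6.7.8", WILDCARD, WILDCARD), "port 1"),
]

def lookup(packet, table):
    """Return the action of the first rule whose non-wildcard fields all match."""
    for match, action in table:
        if all(m == WILDCARD or m == p for m, p in zip(match, packet)):
            return action
    return "send to controller"   # table miss: ask the OpenFlow controller

pkt = ("aa:bb:cc:dd:ee:ff", "11:22:33:44:55:66", "1.2.3.4", "5.6.7.8", 49152, 80)
print(lookup(pkt, flow_table))                               # -> "port 1"
print(lookup(pkt[:3] + ("9.9.9.9",) + pkt[4:], flow_table))  # -> "send to controller"
```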
  65. 65. Conclusions 65
  66. 66. Conclusions - The Data Network is an essential component of the LHC instrument - The Data Network is a key part of the LHC data processing and will become even more important - More and more security and design challenges to come 66
  67. 67. Credits Artur Barczyk (LHCONE) Dan Savu (ViSION) Milosz Hulboj (WIND and CINBAD) Ryszard Jurga (CINBAD) Sebastien Ceuterickx (WIND) Stefan Stancu (ViSION) Vlad Lapadatescu (WIND) 67
  68. 68. What's next SWAN: Space Wide Area Network :-) 68
