  1. Networking. Shawn McKee, University of Michigan. PCAP Review, October 30, 2001
  2. Why Networking?
     • Since the early 1980s, physicists have depended upon leading-edge networks to enable ever-larger international collaborations.
     • Major HEP collaborations, such as ATLAS, require rapid access to event samples from massive data stores, not all of which can be stored locally at each computational site.
     • Evolving integrated applications, i.e. Data Grids, rely on seamless, transparent operation of the underlying LANs and WANs.
     • Networks are among the most basic Grid building blocks.
  3. Hierarchical Computing Model
     [Diagram of the LHC tiered computing model: the Online System feeds the Offline Farm / CERN Computer Centre (Tier 0+1, ~25 TIPS, physics data cache, ~100 MBytes/sec and ~PByte/sec links) and connects at ~2.5 Gbps to Tier 1 centers (BNL Center ~0.25 TIPS; France, Italy, UK); Tier 1s connect at ~2.5 Gbps to Tier 2 centers, with HPSS mass storage at Tier 1 and Tier 2 sites; institutes (Tier 3) attach at 100-1000 Mbits/sec, with physicist workstations as Tier 4. CERN/Outside resource ratio ~1:2; Tier0 : (Σ Tier1) : (Σ Tier2) ~ 1:1:1. Physicists work on analysis "channels"; each institute has ~10 physicists working on one or more channels.]
  4. MONARC Simulations
     • MONARC (Models of Networked Analysis at Regional Centres) has simulated Tier 0 / Tier 1 / Tier 2 data processing for ATLAS.
     • Networking implications: Tier 1 centers require ~140 Mbytes/sec to Tier 0 and ~200 Mbytes/sec to other Tier 1s, based upon 1/3 of the ESD being stored at each Tier 1.
  5. TCP WAN Performance
     • Mathis et al. (Computer Communications Review, v27 n3, July 1997) demonstrated the dependence of achievable TCP bandwidth on network parameters:

           BW ≤ (MSS / RTT) × (C / √PkLoss),  with C a constant of order 1

       BW = bandwidth, MSS = maximum segment size, RTT = round-trip time, PkLoss = packet loss rate.
     • If you want to get 90 Mbps via TCP/IP on a WAN link from LBL to IU (~70 ms RTT), you need a packet loss rate below ~1.8e-6!!
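A quick sanity check of the loss-rate claim above, as a sketch: it assumes MSS = 1460 bytes and C = 1, neither of which is stated on the slide (the constant varies with ACK strategy and loss pattern), so the result is an order-of-magnitude figure rather than the slide's exact 1.8e-6.

```python
import math

def mathis_bw(mss_bytes, rtt_s, loss, c=1.0):
    """Mathis et al. throughput bound in bits/s: BW <= (MSS/RTT) * C/sqrt(p)."""
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss))

def required_loss(target_bps, mss_bytes, rtt_s, c=1.0):
    """Invert the bound: the maximum loss rate p that still permits target_bps."""
    return (mss_bytes * 8 * c / (rtt_s * target_bps)) ** 2

# 90 Mbps over the ~70 ms LBL-to-IU path from the slide:
p = required_loss(90e6, 1460, 0.070)
print(f"max loss for 90 Mbps at 70 ms RTT: {p:.2e}")  # a few parts per million
```

With these assumptions the bound comes out at a few times 1e-6, consistent in magnitude with the slide's figure; a larger constant or smaller MSS tightens it toward 1.8e-6.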
  6. Network Monitoring: Iperf
     • We have set up testbed network monitoring using Iperf v1.2 (S. McKee (UMich), D. Yu (BNL)).
     • We test both UDP (90 Mbps sending rate) and TCP between all combinations of our 8 testbed sites.
     • Globus is used to initiate both the client and server Iperf processes.
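A full-mesh test like the one described above can be enumerated as every ordered (client, server) pair of the 8 sites. The sketch below builds that schedule; the hostnames are placeholders (the site abbreviations from the testbed slide, not real endpoints), and the iperf 1.x flags (-c client, -u UDP, -b target rate, -w TCP window, -t duration) are assumed rather than taken from the deck.

```python
from itertools import permutations

# Testbed site abbreviations from the deck; used here as placeholder hostnames.
SITES = ["ANL", "BNL", "BU", "IU", "LBL", "OU", "UM", "UTA"]

def iperf_commands(client, server):
    """Client-side iperf 1.x command lines for one site pair (flags assumed)."""
    tcp = f"iperf -c {server} -w 2M -t 30"        # TCP with a 2 MB window
    udp = f"iperf -c {server} -u -b 90M -t 30"    # UDP at a 90 Mbps sending rate
    return tcp, udp

# Full mesh: every ordered (client, server) combination, as in the testbed.
pairs = list(permutations(SITES, 2))
print(len(pairs), "ordered site pairs")  # 8 * 7 = 56
```

In the actual testbed the client and server processes were launched remotely via Globus rather than run by hand; this only shows the shape of the test matrix.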
  7. USATLAS Grid Testbed
     [Map of testbed sites: UC Berkeley, LBNL-NERSC, Brookhaven National Laboratory, Indiana University, Boston University, Argonne National Laboratory, U Michigan, University of Texas at Arlington, University of Oklahoma; prototype Tier 2 and HPSS sites are indicated. Connecting networks include CalREN, ESnet, Abilene, NTON, MREN and NPACI.]
  8. Testbed Network Measurements

     | Site | UDP (Mbps)  | TCP (Mbps)  | PkLoss (%)*  | Jitter (ms) | TCP Window, Bottleneck (Mbps) |
     |------|-------------|-------------|--------------|-------------|-------------------------------|
     | ANL  | 65.4 / 81.3 | 17.7 / 20.9 | 0.24 / 0.03  | 1.1 / 0.1   | 2 M, 100                      |
     | BNL  | 66.4 / 83.5 | 10.5 / 13.6 | 0.51 / 0.19  | 1.7 / 0.5   | 4 M, 100                      |
     | BU   | 63.4 / 78.6 | 10.8 / 13.4 | 0.70 / 0.25  | 2.4 / 1.27  | 128 K, 100                    |
     | IU   | 35.8 / 40.3 | 26.7 / 35.0 | 0.31 / 0.048 | 0.9 / 0.55  | 2 M, 45                       |
     | LBL  | 70.4 / 88.4 | 15.7 / 20.8 | 0.16 / 0.014 | 1.6 / 0.7   | 2 M, 100                      |
     | OU   | 72.1 / 90.8 | 21.5 / 27.8 | 0.89 / 0.020 | 1.7 / 0.4   | 2 M, 100                      |
     | UM   | 69.7 / 87.3 | 27.5 / 36.0 | 0.26 / 0.018 | 1.8 / 0.6   | 2 M, 100                      |
     | UTA  | 9.5         | 3.8         | 0.57         | 1.3         | 128 K, 10                     |
  9. Baseline BW for the US-CERN Link: TAN-WG
     • From the Transatlantic Networking Committee (TAN) report.
  10. Networking Requirements
     • There is more to this than a simple requirement of adequate network bandwidth for USATLAS. We need:
        • A set of local, regional, national and international networks able to interoperate transparently, without bottlenecks.
        • Application software that works together with the network to provide high throughput and bandwidth management.
        • A suite of high-level collaborative tools that will enable effective data analysis between internationally distributed collaborators.
     • The ability of USATLAS to participate effectively at the LHC is closely tied to our underlying networking infrastructure!
  11. Network Coupling to Software
     • Our software and computing model will evolve as our network evolves; the two are coupled.
     • Very different computing models result from different assumptions about the capabilities of the underlying network (distributed vs. local).
     • We must be careful to keep our software "network aware" while we work to ensure our networks will meet the needs of the computing model.
  12. Local Networking Infrastructure
     • LANs used to lead WANs in performance, capabilities and stability, but this is no longer true.
     • WANs are deploying 10 Gigabit technology, compared with 1 Gigabit on leading-edge LANs.
     • New protocols and services are appearing on backbones (DiffServ, IPv6, multicast) (ESnet, I2).
     • Ensuring our ATLAS institutions have the required LOCAL level of networking infrastructure to participate effectively in ATLAS is a major challenge.
  13. Estimating Site Costs
     From "Network Planning for US ATLAS Tier 2 Facilities", R. Gardner, G. Bernbom (IU)

     | Cost item                | OC3 (155 Mbps)       | OC12 (622 Mbps)    | OC48 (2.4 Gbps)    |
     |--------------------------|----------------------|--------------------|--------------------|
     | Fiber/campus backbone    | I2 req. (Sup. Gig)   | I2 req. (Sup. Gig) | I2 req. (Sup. Gig) |
     | Network interface        | $100/NIC (Fast Eth.) | $1K/NIC (Gigabit)  | $1K/NIC (Gigabit)  |
     | Routers                  | $15-30K              | $40-80K            | $60-120K           |
     | Telecom service provider | Variable (~$12K/y)   | Variable (~$20K/y) | Variable (~$50K/y) |
     | Network connection fee   | $110K                | $270K              | $430K              |
  14. Achieving High Performance Networking
     • Server and client CPU, I/O and NIC throughput must be sufficient.
        • Must consider firmware, hard disk interfaces, bus type/capacity.
        • Maintain a knowledge base of hardware: performance, tuning issues, examples.
     • TCP/IP stack configuration and tuning is absolutely required.
        • Large windows, multiple streams.
     • No local infrastructure bottlenecks.
        • Gigabit Ethernet "clear path" between selected host pairs.
        • To 10 Gbps Ethernet by ~2003.
     • Careful router/switch configuration and monitoring.
     • Enough router "horsepower" (CPUs, buffer size, backplane BW).
     • Packet loss must be ~zero (well below 0.1%).
        • i.e. no "commodity" networks.
     • End-to-end monitoring and tracking of performance.
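The "large windows" point above comes from the bandwidth-delay product: the TCP window must cover bandwidth × RTT or the pipe runs partly empty. A minimal sketch, using an illustrative 1 Gbps / 70 ms path rather than measured testbed values:

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: the TCP window needed to keep the pipe full."""
    return bandwidth_bps * rtt_s / 8

# Illustrative example: a 1 Gbps path at 70 ms RTT.
window = bdp_bytes(1e9, 0.070)
print(f"required window: {window / 1e6:.1f} MB")  # 8.8 MB

# Conversely, a default 64 KB window caps throughput on the same path:
max_bps = 64 * 1024 * 8 / 0.070
print(f"64 KB window limit: {max_bps / 1e6:.1f} Mbps")  # ~7.5 Mbps
```

This is why the testbed runs in the measurements slide used multi-megabyte TCP windows: without window scaling, a high-RTT path is limited to a few Mbps no matter how fast the link is.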
  15. USATLAS Networking Plans
     • A draft document dealing with the US ATLAS Facilities Plan is in preparation.
     • Network systems play a key role in the planning for USATLAS infrastructure.
     • Major networking issues:
        • Network requirements
        • Network services
        • End-to-end performance
        • Local infrastructure
        • Operations & liaison
        • Network R&D activities
        • Network cost evaluation
  16. Networking Plan of Attack
     • Refine our requirements for the network.
     • Survey existing work and standards.
     • Estimate likely developments in networking and their timescales.
     • Focus on gaps between expectations and needs.
     • Provide clear, compelling cases to funding agencies about the critical importance of the network.
  17. Support for Networking?
     • Traditionally, DOE and NSF have provided university networking support indirectly, through the overhead charged to grant recipients.
     • National labs have network infrastructure provided by DOE, but not at the level we are finding we require.
     • Unlike networking, computing for HEP has never been considered simply infrastructure.
     • The Grid is blurring the boundaries of computing, and the network is taking on a much more significant, fundamental role in HEP computing.
     • It will be necessary for funding agencies to recognize the fundamental role the network plays in our computing model and to support it directly.
  18. Networking as a Common Project
     • A new Internet2 working group, HENP (High Energy/Nuclear Physics), has formed from the LHC Common Projects initiative, co-chaired by Harvey Newman (CMS) and Shawn McKee (ATLAS).
     • Initial meeting hosted by IU in June; kick-off meeting in Ann Arbor on October 26th.
     • The issues this group is focusing on are the same ones USATLAS networking needs to address.
     • USATLAS gains the advantage of a greater resource pool dedicated to solving network problems, a "louder" voice in standards setting, and a better chance to realize necessary networking changes.
  19. What Can We Conclude?
     • Networks will be vital to the success of our USATLAS efforts.
     • Network technologies and services are evolving, requiring us to test and develop with current networks while planning for the future.
     • We must raise and maintain awareness of networking issues among our collaborators, network providers and funding agencies.
     • We must clearly present network issues to the funding agencies to get the required support.
     • We need to determine what gaps exist in network infrastructure, services and support, and work to ensure those gaps are closed before they adversely impact our program.
  20. References
     • US ATLAS Facilities Plan
     • MONARC
     • HENP Working Group
     • Iperf monitoring page