Slide Notes

  • This slide shows a breakdown of traffic flows between ESnet sites and other sectors of the Internet for Jan 2003. It shows that 72% of incoming (or accepted) traffic came from ESnet sites, while the remaining 28% came from the various sectors shown above. Similarly, 53% of the outgoing (or delivered) traffic went to sites, while the remaining 47% went to the three external sectors. This indicates that DOE is a net exporter of data – i.e., more data is sent than received. The data flowing between sites and to the R&E and International sectors can clearly be considered scientific activity. Data flowing to the Commercial sector is a mix of direct scientific activity and activity in support of science, such as making research data available to the general public. It is important to note that external sector traffic can only flow to or from ESnet sites; traffic between external sectors cannot flow over ESnet. This is one major distinction between ESnet and a commercial ISP. The fact that ESnet does not need to provide bandwidth for transit traffic between external sectors is one factor in its cost effectiveness. A second factor is that we do not pay for traffic to/from the external sectors, except for the costs to connect to the peering points.
  • Physical Topology documents devices and their connections, including interface names and addresses. The Backbone Map shows our connections to Qwest. The SecureNet map shows the PVPs in use between SecureNet sites and encapsulation points. The OSPF map shows how we have manually set OSPF metrics to optimize routing. The IBGP map shows where we are using full meshing and where we are using route reflection. The LANWAN system is another interface into all the site diagrams, showing equipment and interconnections at each site.

Transcript

  • 1. The Intersection of Grids and Networks: Where the Rubber Hits the Road William E. Johnston ESnet Manager and Senior Scientist Lawrence Berkeley National Laboratory
  • 2. Objectives of this Talk
    • How a production R&E network works
    • Why some types of services needed by Grids / widely distributed computing environments are hard
  • 3. Outline
    • How do Networks Work?
    • Role of the R&E Core Network
    • ESnet as a Core Network
      • ESnet Has Experienced Exponential Growth Since 1992
      • ESnet is Monitored in Many Ways
      • How Are Problems Detected and Resolved?
    • Operating Science Mission Critical Infrastructure
      • Disaster Recovery and Stability
      • Recovery from Physical Attack / Failure
      • Maintaining Science Mission Critical Infrastructure in the Face of Cyberattack
    • Services that Grids need from the Network
      • Public Key Infrastructure example
  • 4. How Do Networks Work?
    • Accessing a service, Grid or otherwise, such as a Web server, FTP server, etc., from a client computer and client application (e.g., a Web browser) involves:
      • Target host names
      • Host addresses
      • Service identification
      • Routing
  • 5. How Do Networks Work?
    • When one types “google.com” into a Web browser to use the search engine, the following takes place
      • The name “google.com” is resolved to an Internet address by the Domain Name System (DNS) – a hierarchical directory service
      • The address is attached to a network packet (which carries the data – a google search request in this case) which is then sent out of the computer into the network
      • The first place that the packet reaches is a router, which must decide how to get that packet to its destination (google.com)
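
    A minimal Python sketch (not from the original slides) of the name-resolution step just described – the hostname is handed to the system resolver, which queries DNS and returns the address that gets attached to outgoing packets:

        import socket

        # Resolve a hostname to its network address(es) via the system resolver,
        # which in turn consults the hierarchical DNS directory service.
        for family, _, _, _, sockaddr in socket.getaddrinfo(
                "google.com", 80, proto=socket.IPPROTO_TCP):
            print(family.name, sockaddr[0])   # e.g. "AF_INET 142.250.72.46"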
  • 6. How Do Networks Work?
      • In the Internet, routing is done “hot potato”
        • Routers are in your site LANs and at your ISP, and each router typically communicates directly with several other routers
        • The first router to receive your packet takes a quick look at the address and says, “if I send this packet to router B, that will probably take it closer to its destination.” So it sends it to B without further ado.
        • Router B does the same thing, and so forth, until the packet reaches google.com
      • What makes this work is routing protocols that exchange reachability information between all directly connected routers – “BGP” is the most common such protocol in WANs
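
    The per-hop decision described above is, concretely, a longest-prefix match against a forwarding table built by the routing protocols. A toy Python illustration (the prefixes and next-hop names are invented):

        import ipaddress

        # Toy forwarding table: prefix -> next hop (illustrative values only).
        table = {
            ipaddress.ip_network("0.0.0.0/0"):       "router-B",  # default route
            ipaddress.ip_network("134.55.0.0/16"):   "router-C",
            ipaddress.ip_network("134.55.209.0/24"): "router-D",  # more specific
        }

        def next_hop(dst):
            """Forward via the matching prefix with the longest mask (most specific)."""
            addr = ipaddress.ip_address(dst)
            matches = [net for net in table if addr in net]
            return table[max(matches, key=lambda net: net.prefixlen)]

        print(next_hop("134.55.209.5"))  # -> router-D (most specific match)
        print(next_hop("8.8.8.8"))       # -> router-B (falls through to the default)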
  • 7. How Do Networks Work?
    • Once the packet reaches its destination (the computer called google.com) it must be delivered to the google search engine, as opposed to the google mail server that may be running on the same machine.
      • This is accomplished with a service identifier that is put on the packet by the browser (the client side application)
        • The service identifier says that this packet is to be delivered to the Web server on the destination system – on each system, every server/service has a unique identifier called a “port number”
      • So when someone says that the Blaster/Lovsan worm is attacking port 135 on the system called google.com, they mean that a worm program somewhere in the Internet is trying to gain access to the service at port 135 on google.com (usually to exploit a vulnerability).
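
    To make the port-number idea concrete, a small Python sketch: the port given on the connect call is the service identifier, so 80 selects the Web server while 25 would select a mail server on the same host (google.com is used just as in the slide):

        import socket

        # Connect to the Web service (port 80) on the destination host and send
        # a minimal HTTP request; the port number selects which server gets it.
        with socket.create_connection(("google.com", 80), timeout=5) as s:
            s.sendall(b"HEAD / HTTP/1.0\r\nHost: google.com\r\n\r\n")
            print(s.recv(200).decode(errors="replace"))  # start of the reply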
  • 8. Role of the R&E Core Network: Transit (Deliver Every Packet)
    (Diagram: a path from LBNL through its gateway router, across the ESnet core network – border, core, and peering routers – to a big ISP (e.g., SprintLink) and on to Google, Inc.)
    • border/gateway routers
      • implement separate site and network provider policy (including site firewall policy)
    • peering routers
      • implement/enforce routing policy for each provider
      • provide cyberdefense
    • core routers
      • focus on high-speed packet forwarding
  • 9. Outline
    • How do Networks Work?
    • Role of the R&E Core Network
    • ESnet as a Core Network
      • ESnet Has Experienced Exponential Growth Since 1992
      • ESnet is Monitored in Many Ways
      • How Are Problems Detected and Resolved?
    • Operating Science Mission Critical Infrastructure
      • Disaster Recovery and Stability
      • Recovery from Physical Attack / Failure
      • Maintaining Science Mission Critical Infrastructure in the Face of Cyberattack
    • Services that Grids need from the Network
      • Public Key Infrastructure example
  • 10. What is ESnet
    • ESnet is a large-scale, very high bandwidth network providing connectivity between DOE Science Labs and their science partners in the US, Europe, and Japan
    • Essentially all of the national data traffic supporting US open science is carried by two networks – ESnet and Internet-2 / Abilene (which plays a similar role for the university community)
    • ESnet is very different from commercial ISPs (Internet Service Providers) like Earthlink, AOL, etc.
      • Most big ISPs provide small amounts of bandwidth to a large number of sites
      • ESnet supplies very high bandwidth to a small number of sites
  • 11. ESnet Connects DOE Facilities and Collaborators
    (Map: the ESnet core ring – packet-over-SONET optical ring and hubs at SEA, SNV, ELP, CHI, NYC, ATL, DC, and ALB – serving 42 end user sites: 22 Office of Science sponsored, 12 NNSA sponsored, 3 joint sponsored, 6 laboratory sponsored, and other sponsored sites (NSF LIGO, NOAA). Peering points include MAE-E, MAE-W, FIX-W, PAIX-E/W, the Chicago and NY NAPs, Starlight, and Equinix; Abilene peers at several hubs. International connections include CA*net4, CERN, GEANT (Germany, France, Italy, UK, etc.), Sinet and KDDI (Japan), Japan–Russia (BINP), MREN, the Netherlands, Russia, StarTap, Taiwan (ASCC, TANet2), Singaren, France, Switzerland, and Australia. Link speeds range from T1 (1.5 Mb/s) through T3 (45 Mb/s), OC3 (155 Mb/s), OC12 (622 Mb/s), Gigabit Ethernet (1 Gb/s), and OC48 (2.5 Gb/s) up to OC192 (10 Gb/s).)
  • 12. Current Architecture 10GE 10GE optical fiber ring
    • Wave division multiplexing
    • today typically 64 x 10 Gb/s optical channels per fiber
    • channels (referred to as “lambdas”) are usually used in bi-directional pairs
    • Lambda channels are converted to electrical channels
    • usually SONET data framing or Ethernet data framing
    • can be clear digital channels (no framing – e.g. for digital HDTV)
    (Diagram: the ESnet core ring of hub routers, an ESnet IP router at the hub, and a site IP router and site LAN behind the site – ESnet network policy demarcation (“DMZ”).)
    • A ring topology network is inherently reliable – all single point failures are mitigated by routing traffic in the other direction around the ring.
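
    A toy sketch of why the ring mitigates any single failure: between any two hubs there are two disjoint paths, so when a link on the normal path breaks, traffic goes the other way around. The hub names are from the slides; the ring order is assumed for illustration:

        # Hubs around the core ring (order assumed for this illustration).
        RING = ["SNV", "CHI", "NYC", "DC", "ATL", "ELP"]

        def path(src, dst, broken_link=None):
            """Walk the ring one way; if the broken link is on that path, reverse."""
            n = len(RING)
            i, j = RING.index(src), RING.index(dst)
            forward = [RING[(i + k) % n] for k in range((j - i) % n + 1)]
            links = set(zip(forward, forward[1:]))
            broken = tuple(broken_link or ())
            if broken in links or tuple(reversed(broken)) in links:
                # every single link failure leaves the opposite direction intact
                return [RING[(i - k) % n] for k in range((i - j) % n + 1)]
            return forward

        print(path("SNV", "NYC"))                              # ['SNV', 'CHI', 'NYC']
        print(path("SNV", "NYC", broken_link=("CHI", "NYC")))  # the long way around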
  • 13. Peering – ESnet’s Logical Infrastructure Connects the DOE Community With its Collaborators
    • ESnet provides complete access to the Internet by managing the full complement of Global Internet routes (about 150,000) at 10 general/commercial peering points, plus high-speed peerings with Abilene and the international networks.
    (Diagram: ESnet’s university, commercial, and international peerings – from 1 to 39 peers at each exchange point and hub, including MAE-E/W, FIX-W, PAIX-E/W, the NY and Chicago NAPs, STARLIGHT, EQX-ASH and EQX-SJ, MAX GPOP, PNW-GPOP, CENIC CalREN2, the distributed 6TAP, and peerings with Abilene, CA*net4, CERN, GEANT, SInet (Japan), KDDI (Japan), France, Taiwan (ASCC, TANet2), Singaren, Australia, Japan–Russia (BINP), MREN, the Netherlands, Russia, StarTap, LANL TECHnet, and others.)
  • 14. What is Peering?
    • Peering points exchange routing information that says, in effect, “these are the destinations I can get packets closer to”
    • ESnet daily peering report (top 20 of about 100)
    • This is a lot of work
    ESnet’s top 20 peers by routes carried (of about 100):

    peer            AS     routes
    SPRINTLINK      1239   63384
    UUNET-ALTERNET   701   51685
    QWEST            209   47063
    LEVEL3          3356   41440
    CABLE-WIRELESS  3561   35980
    ATT-WORLDNET    7018   28728
    VERIO           2914   19723
    GLOBALCENTER    3549   17369
    OPENTRANSIT     5511    8190
    COGENTCO         174    5492
    ABOVENET        6461    5032
    SINGTEL         7473    4429
    CAIS            3491    3529
    ABILENE        11537    3327
    BT              5400    3321
    TWTELECOM       4323    2774
    ALERON          4200    2475
    BROADWING       6395    2408
    XO              2828    2383
    SBC             7132    1961

    (Annotation on the slide: peering with a given outfit is not random – it carries routes that ESnet needs, e.g. CAIS carries routes to the Russian Backbone Net, as the next slide shows.)
  • 15. What is Peering?
    • Why so many routes? So that when I want to get to someplace out of the ordinary, I can get there. For example: http://www-sbras.nsc.ru/eng/sbras/copan/microel_main.html (Technological Design Institute of Applied Microelectronics of SB RAS, 630090, Novosibirsk, Russia)
    (Route trace shown on the slide, start to finish: snv-lbl-oc48.es.net (134.55.209.5) and snvrt1-ge0-snvcr1.es.net (134.55.209.90) on the ESnet core; through ESnet’s peering router at Sunnyvale into CAIS Internet (AS 3491) – the pccwbtn.net routers at San Jose, Chicago, Vienna VA, New York, and London; across the AS3491 → AS5568 peering point onto the Russian Backbone Network – the MSK-M9-RBNet routers in Moscow and the NSK-RBNet and Novosibirsk-NSC routers in Novosibirsk; then RBN to AS 5387 (NSCNET-2), finishing at 194.226.160.10.)
  • 16. ESnet is Engineered to Move a Lot of Data
    (Chart: ESnet Monthly Accepted Traffic, in TBytes/month.) Annual growth over the past five years has increased from 1.7x to just over 2.0x. ESnet is currently transporting about 250 terabytes/month.
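
    To make the growth rate concrete, a small sketch projecting monthly traffic under the ~2.0x annual growth named on the slide, starting from the ~250 TB/month figure (both numbers from the slide):

        # Project monthly accepted traffic under 2.0x annual growth.
        traffic_tb = 250.0                     # terabytes/month today
        for year in range(1, 6):
            traffic_tb *= 2.0                  # doubling every year
            print(f"year +{year}: {traffic_tb:,.0f} TB/month")
        # After five doublings: 250 * 2**5 = 8,000 TB/month (8 PB/month).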
  • 17. Who Generates Traffic, and Where Does it Go?
    • ESnet Inter-Sector Traffic Summary, Jan 2003 / Feb 2004 (1.7x overall traffic increase, 1.9x OSC increase). Traffic coming into ESnet = green; traffic leaving ESnet = blue; % = share of total ingress or egress traffic.
    • Note that more than 90% of the ESnet traffic is OSC traffic.
    • ESnet Appropriate Use Policy (AUP): all ESnet traffic must originate and/or terminate at an ESnet site (no transit traffic is allowed).
    • 72/68% of incoming traffic is DOE collaborator traffic (including data) from DOE sites, and 53/49% of outgoing traffic goes to DOE sites; the remainder flows to and from the commercial, R&E (mostly universities), and international sectors via the peering points (per-sector splits of 4–26% are shown on the diagram).
    • The international traffic is increasing due to BABAR at SLAC and the LHC tier 1 centers at FNAL and BNL.
    • DOE is a net supplier of data because DOE facilities are used by universities and commercial entities, as well as by DOE researchers.
  • 18. ESnet Top 20 Data Flows, 24 hrs., 2004-04-20 (the largest flows are on the order of 1 terabyte/day): Fermilab (US) → CERN; SLAC (US) → IN2P3 (FR); SLAC (US) → INFN Padova (IT); Fermilab (US) → U. Chicago (US); CEBAF (US) → IN2P3 (FR); INFN Padova (IT) → SLAC (US); U. Toronto (CA) → Fermilab (US); DFN-WiN (DE) → SLAC (US); several DOE Lab → DOE Lab flows; SLAC (US) → JANET (UK); Fermilab (US) → JANET (UK); Argonne (US) → Level3 (US); Argonne → SURFnet (NL); IN2P3 (FR) → SLAC (US); Fermilab (US) → INFN Padova (IT). A small number of science users account for a significant fraction of all ESnet traffic.
  • 19. Top 50 Traffic Flows Monitoring – 24 hr – 1 Int’l Peering Point: 10 flows > 100 GBy/day; more than 50 flows > 10 GBy/day
  • 20. Scalable Operation is Essential
    • R&E networks typically operate with a small staff
    • The key to everything that the network provides is scalability
      • How do you manage a huge infrastructure with a small number of people?
      • This issue dominates all others when looking at whether to support new services (e.g. Grid middleware)
        • Can the service be structured so that its operational aspects do not scale as a function of the use population?
        • If not, then it cannot be offered as a service
  • 21. Scalable Operation is Essential
    • The entire ESnet network is operated by fewer than 15 people
    • 7x24 Operations Desk (2-4 FTE)
    • 7x24 On-Call Engineers (7 FTE)
    • Core Engineering Group (5 FTE)
    • Infrastructure (6 FTE)
    • Management, resource management, circuit accounting, group leads (4 FTE)
    • Science Services (middleware and collaboration tools) (5 FTE)
  • 22.
    • Automated, real-time monitoring of traffic levels and operating state of some 4400 network entities is the primary network operational and diagnosis tool
    (Monitoring categories shown: network configuration, hardware configuration, performance, SecureNet, OSPF metrics (internal routing and connectivity), and the IBGP mesh (WAN routing and connectivity).)
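
    A schematic sketch of the polling loop such monitoring implies; the device names, fetch_metric helper, and thresholds are hypothetical placeholders, not ESnet’s actual tooling:

        import random
        import time

        DEVICES = ["aoa-cr1", "chi-cr1", "snv-cr1"]             # placeholder names
        THRESHOLDS = {"inlet_temp_c": 45, "link_util_pct": 90}  # placeholder limits

        def fetch_metric(device, metric):
            """Stand-in for an SNMP query to the device (simulated here)."""
            return random.uniform(20, 100)

        def poll_once():
            for device in DEVICES:
                for metric, limit in THRESHOLDS.items():
                    value = fetch_metric(device, metric)
                    if value > limit:
                        # in a real NOC this would notify the 24x7 operator
                        print(f"ALARM {device}: {metric}={value:.0f} exceeds {limit}")

        poll_once()
        # while True: poll_once(); time.sleep(60)   # poll every entity periodically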
  • 23. How Are Problems Detected and Resolved?
    (The slide repeats the ESnet site/hub map from slide 11.) When a hardware alarm goes off anywhere on the network, the 24x7 operator is notified.
  • 24. ESnet is Monitored in Many Ways (SecureNet, ESnet configuration, OSPF metrics, performance, IBGP mesh, hardware configuration)
  • 25. Drill Down into the Configuration DB to the Operating Characteristics of Every Device – e.g., cooling air temperature for the router chassis air inlet, hot-point, and air exhaust for the ESnet gateway router at PNNL
  • 26. Problem Resolution
    • Let’s say that the diagnostics have pinpointed a bad module in a router rack in the ESnet hub in NYC
    • Almost all high-end routers, and other equipment that ESnet uses, have multiple, redundant modules for all critical functions
    • Failure of a module (e.g. a power supply or a control computer) can be corrected on-the-fly, without turning off the power or impacting the continued operation of the router
    • Failed modules are typically replaced by a “smart hands” service at the hubs or sites
      • One of the many essential scalability mechanisms
  • 27. ESnet is Monitored in Many Ways (SecureNet, ESnet configuration, OSPF metrics, performance, IBGP mesh, hardware configuration)
  • 28. Drill Down into the Hardware Configuration DB for Every Wire Connection
    (Photo: equipment rack detail at the AOA, NYC Hub – one of the 10 Gb/s core optical ring sites.)
  • 29. The Hub Configuration Database
    • Equipment wiring detail for two modules at the AOA, NYC Hub
    • This allows “smart hands” – e.g., Qwest personnel at the NYC site – to replace modules for ESnet
  • 30. What Does this Equipment Actually Look Like?
    (Photo: equipment rack detail at the NYC Hub, 32 Avenue of the Americas – one of the 10 Gb/s core optical ring sites.)
  • 31. Typical Equipment of an ESnet Core Network Hub – ESnet core equipment @ Qwest 32 AofA HUB, NYC, NY (~$1.8M, list):
    • Juniper T320 AOA-CR1 (core router) ($1,133,000 list)
    • Juniper M20 AOA-PR1 (peering router) ($353,000 list)
    • Cisco 7206 AOA-AR1 (low speed links to MIT & PPPL) ($38,150 list)
    • Juniper OC192 optical ring interface (the AOA end of the OC192 to CHI) ($195,000 list)
    • Juniper OC48 optical ring interface (the AOA end of the OC48 to DC-HUB) ($65,000 list)
    • AOA performance tester ($4,800 list)
    • Lightwave secure terminal server ($4,800 list)
    • Sentry power 48V 30/60 amp panel ($3,900 list)
    • Sentry power 48V 10/25 amp panel ($3,350 list)
    • DC/AC converter ($2,200 list)
    • Qwest DS3 DCX
  • 32. Outline
    • How do Networks Work?
    • Role of the R&E Core Network
    • ESnet as a Core Network
      • ESnet Has Experienced Exponential Growth Since 1992
      • ESnet is Monitored in Many Ways
      • How Are Problems Detected and Resolved?
    • Operating Science Mission Critical Infrastructure
      • Disaster Recovery and Stability
      • Recovery from Physical Attack / Failure
      • Maintaining Science Mission Critical Infrastructure in the Face of Cyberattack
    • Services that Grids need from the Network
      • Public Key Infrastructure example
  • 33. Operating Science Mission Critical Infrastructure
    • ESnet is a visible and critical piece of DOE science infrastructure
      • if ESnet fails, tens of thousands of DOE and university users know it within minutes if not seconds
    • Requires high reliability and high operational security in the systems that are integral to the operation and management of the network
      • Secure and redundant mail and Web systems are central to the operation and security of ESnet
        • trouble tickets are by email
        • engineering communication by email
        • engineering database interfaces are via Web
      • Secure network access to Hub routers
      • Backup secure telephone modem access to Hub equipment
      • 24x7 help desk and 24x7 on-call network engineer
    • [email_address] (end-to-end problem resolution)
  • 34. Disaster Recovery and Stability
    • Reliable operation of the network involves
      • remote Network Operations Centers (3)
      • replicated support infrastructure
      • generator-backed UPS power at all critical network and infrastructure locations
      • high physical security for all equipment
      • a non-interruptible core – the ESnet core operated without interruption through
        • the N. Calif. power blackout of 2000
        • the 9/11/2001 attacks, and
        • the Sept. 2003 NE States power blackout
    • Engineers, the 24x7 Network Operations Center, and generator-backed power support the primary infrastructure:
      • Spectrum (net mgmt system)
      • DNS (name – IP address translation)
      • engineering database
      • load database
      • configuration database
      • public and private Web
      • e-mail (server and archive)
      • PKI certificate repository and revocation lists
      • collaboratory authorization service
    • Remote engineers with partial duplicate infrastructure (including DNS) are maintained at geographically separate sites, and full replication of the NOC databases and servers and Science Services databases is currently being deployed in the NYC Qwest carrier hub
    • The network must be kept available even if, e.g., the West Coast is disabled by a massive earthquake
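
    A sketch of the failover pattern that this replication enables: a client tries each replica in turn, so losing one site (or one coast) does not take the service down. The server names are hypothetical:

        import socket

        # Hypothetical replicas of one service at geographically diverse sites.
        REPLICAS = [("service-west.example.net", 443),
                    ("service-east.example.net", 443)]

        def connect_any(replicas, timeout=3):
            """Return a connection to the first replica that answers."""
            last_err = None
            for host, port in replicas:
                try:
                    return socket.create_connection((host, port), timeout=timeout)
                except OSError as err:       # replica unreachable: try the next one
                    last_err = err
            raise last_err

        # conn = connect_any(REPLICAS)   # survives the loss of any single site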
  • 35. Recovery from Physical Attack / Core Ring Failure
    (Diagram: the ESnet backbone – an optical fiber ring with hubs (backbone routers and local loop connection points) at New York (AOA), Chicago (CHI), Sunnyvale (SNV), Atlanta (ATL), Washington, DC (DC), and El Paso (ELP); each site attaches via a local loop from the hub to the site’s ESnet border router, DMZ, site gateway router, and site LAN.)
    • The hubs have lots of connections (42 in all)
    • We can route traffic either way around the ring, so any single failure (break) in the ring is transparent to ESnet users – normal traffic flow simply reverses direction around the ring
    • The local loops are still single points of failure
  • 36. Maintaining Science Mission Critical Infrastructure in the Face of Cyberattack
    • A phased security architecture is being implemented to protect the network and the ESnet sites
    • The phased response ranges from blocking certain site traffic to complete isolation of the network, which allows the sites to continue communicating among themselves in the face of the most virulent attacks
      • Separate ESnet core routing functionality from external Internet connections by means of a “peering” router that can have a policy different from the core routers
      • Provide a rate-limited path to the external Internet that will ensure site-to-site communication during an external denial of service attack
      • Provide “lifeline” connectivity for downloading patches, exchanging e-mail, and viewing web pages (i.e., e-mail, DNS, HTTP, HTTPS, SSH, etc.) with the external Internet prior to full isolation of the network
  • 37. Cyberattack Defense
    • Lab first response – filter incoming traffic at their ESnet gateway router
    • ESnet first response – filters to assist a site
    • ESnet second response – filter traffic from outside of ESnet at the peering routers
    • ESnet third response – shut down the main peering paths and provide only limited bandwidth paths for specific “lifeline” services
    • The Sapphire/Slammer worm infection created a Gb/s of traffic on the ESnet core until filters were put in place (both into and out of sites) to damp it out – a minimal sketch of such a filter follows
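
    As an illustration (not ESnet’s actual router configuration), a minimal packet-filter sketch that damps Slammer-style traffic – UDP to port 1434 – while counting what it discards:

        # Toy access-control list applied to (protocol, dst_port, payload) packets.
        BLOCKED = {("udp", 1434)}     # Sapphire/Slammer targeted UDP port 1434

        def filter_packets(packets):
            """Yield only packets that pass the ACL; count the ones damped out."""
            dropped = 0
            for proto, dst_port, payload in packets:
                if (proto, dst_port) in BLOCKED:
                    dropped += 1              # discard worm traffic
                    continue
                yield proto, dst_port, payload
            print(f"dropped {dropped} packet(s) matching the filter")

        packets = [("tcp", 80, b"GET / HTTP/1.0\r\n\r\n"),
                   ("udp", 1434, b"\x04" + b"A" * 375)]   # toy worm-like datagram
        passed = list(filter_packets(packets))            # prints: dropped 1 ...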
  • 38. ESnet WAN Security and Cybersecurity
    • Cybersecurity is a new dimension of ESnet security
      • Security is now inherently a global problem
      • As the entity with a global view of the network, ESnet has an important role in overall security
    30 minutes after the Sapphire/Slammer worm was released (Jan. 2003), 75,000 hosts running Microsoft's SQL Server (port 1434) were infected. (“The Spread of the Sapphire/Slammer Worm,” David Moore (CAIDA & UCSD CSE), Vern Paxson (ICIR & LBNL), Stefan Savage (UCSD CSE), Colleen Shannon (CAIDA), Stuart Staniford (Silicon Defense), Nicholas Weaver (Silicon Defense & UC Berkeley EECS), http://www.cs.berkeley.edu/~nweaver/sapphire)
  • 39. ESnet and Cybersecurity Sapphire/Slammer worm infection hits creating almost a full Gb/s (1000 megabit/sec.) traffic spike on the ESnet backbone
  • 40. Outline
    • Role of the R&E Transit Network
    • ESnet is Driven by the Requirements of DOE Science
    • Terminology – How Do Networks Work?
    • How Does it Work? – ESnet as a Backbone Network
      • ESnet Has Experienced Exponential Growth Since 1992
      • ESnet is Monitored in Many Ways
      • How Are Problems Detected and Resolved?
    • Operating Science Mission Critical Infrastructure
      • Disaster Recovery and Stability
      • Recovery from Physical Attack / Failure
      • Maintaining Science Mission Critical Infrastructure in the Face of Cyberattack
    • Services that Grids need from the Network
      • Public Key Infrastructure example
  • 41. Network and Middleware Needs of DOE Science (workshop held August 13-15, 2002)
    • Organized by the Office of Science; Mary Anne Scott, chair, with Dave Bader, Steve Eckstrand, Marvin Frazier, Dale Koelling, and Vicky White
    • Workshop panel chairs: Ray Bair and Deb Agarwal; Bill Johnston and Mike Wilde; Rick Stevens; Ian Foster and Dennis Gannon; Linda Winkler and Brian Tierney; Sandy Merola and Charlie Catlett
    • Focused on science requirements that drive
      • Advanced Network Infrastructure
      • Middleware Research
      • Network Research
      • Network Governance Model
    • The requirements for DOE science were developed by the OSC science community representing major DOE science disciplines
      • Climate
      • Spallation Neutron Source
      • Macromolecular Crystallography
      • High Energy Physics
      • Magnetic Fusion Energy Sciences
      • Chemical Sciences
      • Bioinformatics
      • Available at www.es.net/#research
  • 42. Grid Middleware Requirements (DOE Workshop)
    • A DOE workshop examined science driven requirements for network and middleware and identified twelve high priority middleware services (see www.es.net/#research)
    • Some of these services have a central management component and some do not
    • Most of the services that have central management fit the criteria for ESnet support. These include, for example
      • Production, federated RADIUS authentication service
      • PKI federation services
      • Virtual Organization Management services to manage organization membership, member attributes and privileges
      • Long-term PKI key and proxy credential management
      • End-to-end monitoring for Grid / distributed application debugging and tuning
      • Some form of authorization service (e.g. based on RADIUS)
      • Knowledge management services that have the characteristics of an ESnet service are also likely to be important (future)
  • 43. Grid Middleware Services
    • ESnet provides several “science services” – services that support the practice of science
    • A number of such services have an organization like ESnet as the natural provider
      • ESnet is trusted, persistent, and has a large (almost comprehensive within DOE) user base
      • ESnet has the facilities to provide reliable access and high availability through assured network access to replicated services at geographically diverse locations
      • However, a service must be scalable in the sense that as its user base grows, ESnet’s interaction with the users does not grow (otherwise it is not practical for a small organization like ESnet to operate)
  • 44. Science Services: PKI Support for Grids
    • Public Key Infrastructure supports cross-site, cross-organization, and international trust relationships that permit sharing computing and data resources and other Grid services
    • The DOEGrids Certification Authority service, which provides X.509 identity certificates to support Grid authentication, is an example of this model
      • The service requires a highly trusted provider, and requires a high degree of availability
      • The service provider is a centralized agent for negotiating trust relationships, e.g. with European CAs
      • The service scales by adding site based or Virtual Organization based Registration Agents that interact directly with the users
      • See DOEGrids CA (www.doegrids.org)
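
    To make the identity-certificate model concrete, a minimal sketch using the Python cryptography package, with a toy CA and subject names (not DOEGrids’ actual procedure): the CA self-signs a root certificate, issues an identity certificate for a user, and any relying party verifies the CA’s signature on it:

        import datetime
        from cryptography import x509
        from cryptography.x509.oid import NameOID
        from cryptography.hazmat.primitives import hashes
        from cryptography.hazmat.primitives.asymmetric import padding, rsa

        def name(common_name):
            return x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, common_name)])

        now = datetime.datetime.now(datetime.timezone.utc)

        # 1. The CA generates a key pair and self-signs its root certificate.
        ca_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
        ca_cert = (
            x509.CertificateBuilder()
            .subject_name(name("Toy Grid CA"))        # toy name, not DOEGrids
            .issuer_name(name("Toy Grid CA"))
            .public_key(ca_key.public_key())
            .serial_number(x509.random_serial_number())
            .not_valid_before(now)
            .not_valid_after(now + datetime.timedelta(days=3650))
            .add_extension(x509.BasicConstraints(ca=True, path_length=None),
                           critical=True)
            .sign(ca_key, hashes.SHA256())
        )

        # 2. After a Registration Agent has verified the user's identity, the CA
        #    issues a certificate binding the user's name to the user's public key.
        user_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
        user_cert = (
            x509.CertificateBuilder()
            .subject_name(name("Jane Physicist"))     # hypothetical user
            .issuer_name(ca_cert.subject)
            .public_key(user_key.public_key())
            .serial_number(x509.random_serial_number())
            .not_valid_before(now)
            .not_valid_after(now + datetime.timedelta(days=365))
            .sign(ca_key, hashes.SHA256())            # signed with the CA's key
        )

        # 3. A relying party that trusts the CA verifies the signature on the cert.
        ca_cert.public_key().verify(
            user_cert.signature,
            user_cert.tbs_certificate_bytes,
            padding.PKCS1v15(),
            user_cert.signature_hash_algorithm,
        )
        print("user certificate verifies against the CA")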
  • 45. Science Services: Public Key Infrastructure
    • DOEGrids CA policies are tailored to science Grids
      • Digital identity certificates for people, hosts and services
      • Provides formal and verified trust management – an essential service for widely distributed heterogeneous collaboration, e.g. in the International High Energy Physics community
    • This service was the basis of the first routine sharing of HEP computing resources between US and Europe
    • Have recently added a second CA with a policy that supports secondary issuers that need to do bulk issuing of certificates with central private key management
      • NERSC will auto issue certs when accounts are set up – this constitutes an acceptable identity verification
      • A variant of this will also be set up to support security domain gateways, such as Kerberos-to-X.509 (e.g., KX509) at FNAL
  • 46. Science Services: Public Key Infrastructure
    • The rapidly expanding customer base of this service will soon make it ESnet’s largest collaboration service by customer count
    Registration Authorities: ANL, LBNL, ORNL, DOESG (DOE Science Grid), ESG (Climate), FNAL, PPDG (HEP), Fusion Grid, iVDGL (NSF-DOE HEP collab.), NERSC, PNNL
  • 47. Grid Network Services Requirements (GGF, GHPN)
    • Grid High Performance Networking Research Group, “Networking Issues of Grid Infrastructures” (draft-ggf-ghpn-netissues-3) – what networks should provide to Grids
      • High performance transport for bulk data transfer (over 1Gb/s per flow)
      • Performance controllability to provide ad hoc quality of service and traffic isolation.
      • Dynamic Network resource allocation and reservation
      • High availability when expensive computing or visualization resources have been reserved
      • Security controllability to provide a trusted and efficient communication environment when required
      • Multicast to efficiently distribute data to a group of resources
      • Integration of wireless networks and sensor networks into the Grid environment
  • 48. Transport Services
    • network tools available to build services
      • queue management
        • provide forwarding priorities different from best effort
        • e.g.
          • scavenger (discard if anything behind in the queue)
          • expedited forwarding (elevated priority queuing)
          • low latency forwarding (highest priority – ahead of all other traffic)
      • path management
        • tagged traffic can be managed separately from regular traffic
      • policing
        • limit the bandwidth of an incoming stream
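
    As an illustration of the policing tool above – limiting the bandwidth of an incoming stream – a minimal token-bucket sketch (the rate and burst values are arbitrary):

        import time

        class TokenBucket:
            """Police a stream to a target rate; out-of-profile packets are dropped."""
            def __init__(self, rate_bps, burst_bits):
                self.rate = rate_bps            # sustained rate, bits/second
                self.capacity = burst_bits      # burst allowance, bits
                self.tokens = burst_bits
                self.stamp = time.monotonic()

            def allow(self, packet_bits):
                now = time.monotonic()
                # refill tokens for the elapsed interval, up to the burst cap
                self.tokens = min(self.capacity,
                                  self.tokens + (now - self.stamp) * self.rate)
                self.stamp = now
                if packet_bits <= self.tokens:
                    self.tokens -= packet_bits
                    return True                 # within profile: forward
                return False                    # out of profile: drop (or re-mark)

        bucket = TokenBucket(rate_bps=10_000_000, burst_bits=10 * 1500 * 8)  # 10 Mb/s
        print(bucket.allow(1500 * 8))           # one 1500-byte packet fits the burst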
  • 49. Priority Service: Guaranteed Bandwidth
    (Diagram: user system1 at site A and user system2 at site B, connected through border routers over a network pipe in which bandwidth is reserved for production, best-effort traffic; a bandwidth broker provides the bandwidth management model for elevated priority traffic, and traffic from user system1 is flagged for expedited forwarding.)
  • 50. Priority Service: Guaranteed Bandwidth
    • What is wrong with this? (almost everything)
      • there may be several users that want all of the premium bandwidth at the same time
      • the user may send data into the high priority stream at a high enough bandwidth that it interferes with production traffic (and not even know it)
      • this is at least three independent networks, and probably more
      • a user that was a priority at site A may not be at site B
  • 51. Priority Service: Guaranteed Bandwidth
    (Diagram: user system1 at site A and user system2 at site B, with resource managers in each network, an allocation manager and bandwidth broker, and authorization, policing, and shaping functions along the path.)
    • To address all of the issues is complex
  • 52. Priority Service
    • So, practically, what can be done?
    • With available tools, we can provide a small number of provisioned circuits
      • secure and end-to-end (system to system)
      • various qualities of service possible, including minimum latency
      • a certain amount of route reliability (if redundant paths exist in the network)
      • end systems can manage these circuits as single high-bandwidth paths or as multiple lower-bandwidth paths (with application-level shapers)
      • non-interfering with production traffic, so aggressive protocols may be used
  • 53. Priority Service: Guaranteed Bandwidth
    • there will probably be service level agreements among transit networks allowing for a fixed amount of priority traffic – so the resource manager does minimal checking and no authorization
    • it will do policing, but only at the full bandwidth of the service agreement (for self protection)
    • allocation will probably be relatively static and ad hoc
    (Diagram: user system1 at site A to user system2 at site B, through resource managers, a policer, and the bandwidth broker.)
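
    A sketch of the minimal admission check such a resource manager might make under a static service level agreement (the capacity and circuit names are invented):

        # Toy resource manager: admit priority circuits against a fixed SLA amount.
        class ResourceManager:
            def __init__(self, sla_capacity_mbps):
                self.capacity = sla_capacity_mbps
                self.reservations = {}          # circuit id -> reserved Mb/s

            def request(self, circuit_id, mbps):
                """Minimal checking, as the slide suggests - capacity only."""
                if sum(self.reservations.values()) + mbps > self.capacity:
                    return False                # would exceed the SLA: refuse
                self.reservations[circuit_id] = mbps
                return True

            def release(self, circuit_id):
                self.reservations.pop(circuit_id, None)

        rm = ResourceManager(sla_capacity_mbps=1000)   # invented SLA amount
        print(rm.request("circuit-1", 400))  # True: fits within the agreement
        print(rm.request("circuit-2", 700))  # False: 400 + 700 exceeds 1000 Mb/s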
  • 54. Grid Network Services Requirements (GGF, GHPN)
    • Grid High Performance Networking Research Group, “Networking Issues of Grid Infrastructures” (draft-ggf-ghpn-netissues-3) – what networks should provide to Grids
      • High performance transport for bulk data transfer (over 1Gb/s per flow)
      • Performance controllability to provide ad hoc quality of service and traffic isolation.
      • Dynamic Network resource allocation and reservation
      • High availability when expensive computing or visualization resources have been reserved
      • Security controllability to provide a trusted and efficient communication environment when required
      • Multicast to efficiently distribute data to a group of resources
      • Integration of wireless networks and sensor networks into the Grid environment
  • 55. High Throughput
    • Requirements: 1) high average throughput; 2) advanced protocol capabilities available and usable at the end-systems; 3) lack of use of QoS parameters
    • Current issues: 1) low average throughput; 2) semantic gap between the socket buffer interface and the protocol capabilities of TCP
    • Analyzed reasons: 1a) end system bottleneck; 1b) protocol misconfigured; 1c) inefficient protocol; 1d) mixing of congestion control and error recovery; 2a) TCP connection set up: blocking operations vs. asynchronous; 2b) window scale option not accessible through the API
    • Available solutions: 1a) multiple TCP sessions; 1b) larger MTU; 1c) ECN
    • Proposed alternatives: 1) alternatives to TCP (see the DT-RG survey document); 2) OS bypass and protocol off-loading; 3) overlays; 4) end-to-end optical paths
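
    One of the analyzed reasons above is that TCP’s window scaling is hard to exploit through the socket API. A sketch of the standard remedy – sizing the socket buffers to the path’s bandwidth-delay product (the path numbers are illustrative):

        import socket

        # A 1 Gb/s path with an 80 ms round-trip time keeps
        # 1e9 * 0.080 / 8 = 10 MB in flight, so buffers must be at least that big.
        BDP_BYTES = int(1e9 * 0.080 / 8)

        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, BDP_BYTES)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, BDP_BYTES)
        # The OS may clamp these to its configured maximum; check what we got:
        print("send buffer:", s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
        print("recv buffer:", s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))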
  • 56. A New Architecture
    • The essential requirements cannot be met with the current telecom-provided, hub-and-spoke architecture of ESnet
    • The core ring has good capacity and resiliency against single point failures, but the point-to-point tail circuits are neither reliable nor scalable to the required bandwidth
    (Diagram: the ESnet core/backbone ring – New York (AOA), Chicago (CHI), Sunnyvale (SNV), Atlanta (ATL), Washington, DC (DC), El Paso (ELP) – with point-to-point tail circuits out to the DOE sites.)
  • 57. A New Architecture
    • A second backbone ring will multiply connect the MAN rings to protect against hub failure
    • All OSC Labs will be able to participate in some variation of this new architecture in order to gain highly reliable and high capacity network access
    (Diagram: the same core/backbone ring with a second ring multiply connecting MAN rings, plus connections to Europe and Asia-Pacific and to the DOE sites.)
  • 58. Conclusions
    • ESnet is an infrastructure that is critical to DOE’s science mission and that serves all of DOE
    • Focused on the Office of Science Labs
    • ESnet is working on providing the DOE mission science networking requirements with several new initiatives and a new architecture
    • QoS is hard – but we have enough experience to do pilot studies (which ESnet is just about to start)
    • Middleware services for large numbers of users are hard – but they can be provided if careful attention is paid to scaling