A presentation of the SDN solution (project Espresso - https://blog.google/topics/google-cloud/making-google-cloud-faster-more-available-and-cost-effective-extending-sdn-public-internet-espresso/) for Google's edge network. It describes the distributed architecture of the control plane and the packet forwarding plane and the mapping system, and discusses the operational experience gained while supporting the system in production.
PLNOG19 - Piotr Marecki - Espresso: Scalable and Programmable Peering Edge
1. Confidential + Proprietary
Taking the Edge off with Espresso
Scale, Reliability and Programmability for Global Internet Peering
KK Yap, Murtaza Motiwala, Jeremy Rahe, Steve Padgett, Matthew Holliman, Gary Baldus, Marcus Hines,
Taeeun Kim, Ashok Narayanan, Ankur Jain, Victor Lin, Colin Rice, Brian Rogan, Arjun Singh, Bert Tanaka,
Manish Verma, Puneet Sood, Mukarram Tariq, Matt Tierney, Dzevad Trumic, Vytautas Valancius, Calvin Ying,
Mahesh Kallahalla, Bikash Koley, Amin Vahdat and many others.
Presented by: Piotr Marecki (bubu@google.com)
2. Confidential + Proprietary
Problem Statement
Egress Terabits/sec of traffic to our Internet peers
● High-def video, cloud traffic, etc.
3. Confidential + Proprietary
Problem Statement
Egress Terabits/sec of traffic to our Internet peers
● High-def video, cloud traffic, etc.
1. Optimize traffic per-customer and per-application
● e.g., optimal video quality, or differentiated service for cloud
● Problem: Constrained by BGP shortest path and lack of application awareness
[Figure: Google; alternate path with better user experience?]
4. Confidential + Proprietary
Problem Statement
Egress Terabits/sec of traffic to our Internet peers
● High-def video, cloud traffic, etc.
2. Deliver new features quickly
● Problem: router-vendor feature cycles and qualification take many years
[Figure: vendor feature cycle, e.g. for a novel L2 VPN: Request to vendor → Commit to feature → Implement → Vendor testing → Integration testing @ Google → Deploy]
5. Confidential + Proprietary
Espresso: Google’s SDN Peering Edge
Our previous experience with SDN
● B4 [SIGCOMM 2013] and Jupiter [SIGCOMM 2015]
● Enable flexible traffic engineering
● Increase feature velocity
SDN is only suited for walled gardens?
Peering edge requires interoperability with heterogeneous peers.
13. Confidential + Proprietary
Espresso’s Design Principles
1. Hierarchical control plane
○ Global optimization, while the local control plane provides fast reaction.
2. Fail static
○ Local control plane continues to function despite global controller failure.
3. Software programmability
○ Externalize features into software to exploit commodity servers for scale.
4. Testability
○ Loosely coupled control plane, automated testing and release process.
5. Manageability
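The "fail static" principle can be illustrated with a minimal sketch. It assumes a hypothetical `LocalController` class that holds a prefix-to-egress map pushed by the global controller; the class, its fields and the rule format are illustrative assumptions, not Espresso's actual interfaces.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class LocalController:
    """Hypothetical local controller holding a metro's forwarding rules."""
    active_rules: Dict[str, str] = field(default_factory=dict)  # prefix -> egress port
    version: int = 0

    def apply_update(self, update: Optional[Dict[str, str]], version: int) -> bool:
        """Install rules from the global controller.

        `update` is None when the global controller is unreachable.
        Returns True only if new state was installed.
        """
        if update is None or version <= self.version:
            # Fail static: keep serving from the last known-good state.
            return False
        self.active_rules = dict(update)
        self.version = version
        return True

    def egress_for(self, prefix: str) -> Optional[str]:
        """Return the currently programmed egress for a prefix, if any."""
        return self.active_rules.get(prefix)


lc = LocalController()
lc.apply_update({"203.0.113.0/24": "peer-a:port1"}, version=1)
lc.apply_update(None, version=2)  # global controller unreachable
print(lc.egress_for("203.0.113.0/24"))  # still served from version 1
```

The point of the sketch is only that a missing or stale update never clears the installed state; how the real system detects controller failure is not covered by the slides.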
14. Confidential + Proprietary
Architecture: Externalizing BGP
[Figure: traditional peering router vs. Espresso]
Traditional Peering Router: eBGP peering with External Peer; Internet-size routing/forwarding table; large ACL; Host Servers in Metro
Espresso: External Peer; Peering Fabric (label-switched fabric); BGP speaker; Host Servers in Metro
Principles illustrated: Hierarchical control plane, Fail static, Software programmability
15. Confidential + Proprietary
Architecture: Reliability and Scale of BGP
[Figure: traditional peering router vs. Espresso]
Traditional Peering Router: eBGP peering with External Peer; Internet-size RIB/FIB; large TCAM; Host Servers in Metro
Espresso: External Peer; Peering Fabric (label-switched fabric); multiple BGP speakers; Host Servers in Metro
Principles illustrated: Hierarchical control plane, Fail static, Software programmability
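One way to read the split between hosts and the label-switched fabric is sketched below, under stated assumptions: the host keeps the Internet-size table and picks a label by longest-prefix match, while the fabric only needs a small label-to-port table. The prefixes, labels, port names and function names are all made up for illustration; the slides do not give the actual encapsulation or table formats.

```python
import ipaddress
from typing import Optional, Tuple

# Hypothetical host-side map: destination prefix -> (peering fabric, label).
HOST_MAP = {
    ipaddress.ip_network("198.51.100.0/24"): ("pf-1", 100),
    ipaddress.ip_network("198.51.0.0/16"): ("pf-1", 200),
}

# Hypothetical fabric-side table: label -> egress port toward an external peer.
FABRIC_LABELS = {100: "peer-a:port1", 200: "peer-b:port7"}


def host_lookup(dst: str) -> Optional[Tuple[str, int]]:
    """Longest-prefix match on the host; returns (fabric, label) or None."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in HOST_MAP if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return HOST_MAP[best]


def fabric_forward(label: int) -> Optional[str]:
    """The fabric forwards on the label alone, with no IP route lookup."""
    return FABRIC_LABELS.get(label)


choice = host_lookup("198.51.100.7")
if choice is not None:
    fabric, label = choice
    print(fabric, fabric_forward(label))
```

In this reading the fabric's table grows with the number of peer ports rather than with the number of Internet routes, which is why the large routing table and TCAM of the traditional router are not needed there.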
20. Confidential + Proprietary
Using User’s Best Path, not BGP’s
● Serves 13% more traffic than BGP best path, in an application-aware manner.
● Helps capacity-constrained ISPs by overflowing demand to alternate paths within the local metro and also via remote metros.
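The overflow behaviour described above can be sketched as a simple greedy assignment. This is an illustrative assumption, not Espresso's actual optimizer: each client is given an ordered list of paths (best user experience first), and demand that does not fit on the preferred path spills to the next one. All names and numbers are hypothetical.

```python
from typing import Dict, List


def assign_egress(
    demands: Dict[str, float],
    preferences: Dict[str, List[str]],
    capacity: Dict[str, float],
) -> Dict[str, Dict[str, float]]:
    """Greedily place each client's demand (Mbps) on its preferred paths.

    preferences[client] lists paths best-first; capacity maps path -> Mbps.
    Returns client -> {path: Mbps placed}.
    """
    remaining = dict(capacity)
    plan: Dict[str, Dict[str, float]] = {}
    for client, demand in demands.items():
        left = demand
        placed: Dict[str, float] = {}
        for path in preferences[client]:
            if left <= 0:
                break
            take = min(left, remaining.get(path, 0.0))
            if take > 0:
                placed[path] = take
                remaining[path] -= take
                left -= take
        plan[client] = placed
    return plan


plan = assign_egress(
    demands={"isp-a": 80.0, "isp-b": 50.0},
    preferences={
        "isp-a": ["local-peer", "remote-metro"],
        "isp-b": ["local-peer", "remote-metro"],
    },
    capacity={"local-peer": 100.0, "remote-metro": 100.0},
)
print(plan)
```

Here `isp-b` fills the 20 Mbps left on the local peering link and overflows the remaining 30 Mbps to the remote metro, instead of being pinned to a single BGP best path.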
21. Confidential + Proprietary
Improvements in End User Experience
Client ISP   Change in mean time between rebuffers (MTBR)   Change in mean goodput
A            10 → 20 min                                    2.25 → 4.5 Mbps
B            4.6 → 12.5 min                                 2.75 → 4.9 Mbps
C            14 → 19 min                                    3.2 → 4.2 Mbps
Provide significant improvements to end-user experience.
22. Confidential + Proprietary
Release Velocity
Component                   Average velocity (days)
Local Controller            11.2
BGP speaker                 12.6
Peering Fabric Controller   15.8
> 50× more frequently than with traditional peering routers.
Novel L2 VPN delivered 6× faster via incremental rollout.
23. Confidential + Proprietary
Manageability
● Espresso supports fully automated configuration and upgrade through an intent-driven configuration and management stack
○ To change configuration, an operator or a system changes the intent
○ Committing the change triggers the management system to generate, version and statically verify the configuration before pushing it to all relevant software components and devices
● Change canarying
● “Impact radius”
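The generate, version, verify and push steps, together with canarying, can be sketched as a small pipeline. This is a minimal sketch under assumptions: the intent schema (`devices`, `peer_asn`, `max_prefixes`), the function names and the `push`/`healthy` callbacks are hypothetical and do not describe Google's management stack.

```python
import hashlib
import json
from typing import Callable, Dict, List, Tuple


def generate(intent: Dict) -> Dict[str, Dict]:
    """Expand a single intent into per-device configuration."""
    return {
        dev: {"peer_asn": intent["peer_asn"], "max_prefixes": intent["max_prefixes"]}
        for dev in intent["devices"]
    }


def version_of(configs: Dict[str, Dict]) -> str:
    """Derive a content-based version so identical configs get identical IDs."""
    blob = json.dumps(configs, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()[:12]


def verify(configs: Dict[str, Dict]) -> List[str]:
    """Static checks run before anything is pushed."""
    errors = []
    for dev, cfg in configs.items():
        if not 1 <= cfg["peer_asn"] <= 4294967295:
            errors.append(f"{dev}: invalid ASN {cfg['peer_asn']}")
        if cfg["max_prefixes"] <= 0:
            errors.append(f"{dev}: max_prefixes must be positive")
    return errors


def rollout(
    intent: Dict,
    push: Callable[[str, Dict], None],
    healthy: Callable[[str], bool],
    canary_count: int = 1,
) -> Tuple[str, List[str]]:
    """Push to canaries first; continue only if they stay healthy."""
    configs = generate(intent)
    errors = verify(configs)
    if errors:
        raise ValueError("; ".join(errors))
    version = version_of(configs)
    devices = sorted(configs)
    canaries, rest = devices[:canary_count], devices[canary_count:]
    for dev in canaries:
        push(dev, configs[dev])
    if not all(healthy(dev) for dev in canaries):
        return version, canaries  # impact radius limited to the canaries
    for dev in rest:
        push(dev, configs[dev])
    return version, devices
```

A caller would supply its own `push` and `healthy` functions; stopping after an unhealthy canary is what keeps the "impact radius" of a bad change small.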
24. Confidential + Proprietary
Operational aspects
● Project development model - DevOps
○ Developers and the operations team work together as one team with a common goal
○ Ops not only provide requirements but actively participate in design, development and deployment
○ Ops actively develop software tools for debugging and monitoring
○ Developers participate in operational activities and procedures, effectively reducing “abstraction bias”
○ The entire team shares “oncall” duties
25. Confidential + Proprietary
Operational aspects - teams involved
● Traditional operations - distinct system, network and other ops groups that usually use different methods and tools and develop different “work cultures”
● Without proper training and participation in the DevOps model, supporting a distributed SDN on the network edge can cause confusion
○ Where do I change BGP policy?
○ The device is connected to a remote peer but BGP is down - how do I troubleshoot?
● Most importantly - establish who is responsible for different parts of the system and engage early
26. Confidential + Proprietary
Operational aspects - configuration and deployment
● An SDN edge system is no longer the sum of “Network” and “Host” configuration and provisioning that can be run by separate teams
○ The deployment procedure must be a coherent process that efficiently combines different teams and different provisioning systems
○ Not only Ops and Dev but also Deployment teams must be “in the loop”
● Intent-driven configuration is a hard requirement
○ Different config systems may complicate the process (synchronisation, intent consistency)
27. Confidential + Proprietary
Operational aspects - monitoring
● Data and control planes are fully distributed
○ Eventual consistency
○ Fail open (data, control and management planes)
● Measure the state of the system
○ Data plane is no longer confined to a single device
○ Streaming telemetry (OpenConfig)
○ Black-box approach
○ Control plane pipeline monitoring and anomaly detection
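As an illustration of anomaly detection on a control plane pipeline, the sketch below flags a propagation-delay sample that deviates strongly from recent history, using the median and the median absolute deviation (MAD). The metric, the threshold and the function name are assumptions chosen for the example; the slides do not say which detector is used in production.

```python
from statistics import median
from typing import Sequence


def is_anomalous(history: Sequence[float], sample: float, threshold: float = 5.0) -> bool:
    """Return True if `sample` is more than `threshold` MADs from the median.

    `history` holds recent measurements, e.g. seconds for a routing change
    to propagate through the pipeline (a hypothetical metric).
    """
    med = median(history)
    mad = median([abs(x - med) for x in history])
    if mad == 0:
        # All history values are identical; any deviation counts as anomalous.
        return sample != med
    return abs(sample - med) / mad > threshold


recent = [2.0, 2.1, 1.9, 2.2, 2.0]
print(is_anomalous(recent, 2.3))   # small deviation
print(is_anomalous(recent, 30.0))  # large deviation
```

A median-based detector is used here because a single earlier outlier in the history does not shift the baseline much, which suits noisy pipeline measurements.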
28. Confidential + Proprietary
Operational aspects - lessons learned
● Interesting failures
○ Configuration state reporting library bug - thread exhaustion caused ALL Espresso control element jobs to lock (integration testing failure)
○ Erroneous configuration push on GC draining all PF nodes
○ Slow propagation of new routing changes from GC to hosts (inspired development of a local BGP-derived forwarding map)
○ Ingress traffic blackholing
29. Confidential + Proprietary
Conclusion
SDN is only suited for walled gardens?
Espresso demonstrates that
● traditional peering architecture can evolve to exploit SDN (incremental changes while maintaining full interoperability)
● SDN’s value is in flexibility and feature velocity (cost savings secondary)
30. Confidential + Proprietary
Conclusion
Cloud 1.0: Router-Centric Protocols   Espresso: SDN Peering
Local view                            Global view
Connectivity-based optimization       Application signals-based optimization
Slow evolution                        Rapid deploy-and-iterate
Costly                                75% cheaper