Software Innovations and Control Plane Evolution in the new SDN Transport Arc..., Cisco Canada
Loukas Paraschis, Technology Solution Architecture at Cisco presents software innovation and control plane evolution in the new SDN transport at Cisco Connect Toronto 2015.
A Platform for Large-Scale Grid Data Service on Dynamic High-Performance Netw..., Tal Lavian Ph.D.
Dynamic High-Performance Networks :
Support data-intensive Grid applications
Gives adequate and uncontested bandwidth to an application’s burst
Employs circuit-switching of large flows of data to avoid overheads in breaking flows into small packets and delays routing
Is capable of automatic end-to-end path provisioning
Is capable of automatic wavelength switching
Provides a set of protocols for managing dynamically provisioned wavelengths
DWDM-RAM :
Encapsulates “optical network resources” into a service framework to support dynamically provisioned and advanced data-intensive transport services
Offers network resources as Grid services for Grid computing
Allows cooperation of distributed resources
Provides a generalized framework for high performance applications over next generation networks, not necessarily optical end-to-end
Yields good overall utilization of network resources
NetBrain Consultant Edition (CE) is designed to make a Consultant’s job easier by providing instant network discovery, document automation, and visual troubleshooting. NetBrain enables consultants to:
1. Carry out deep discovery of the customer network
2. Automate documentation for network assessments
3. Analyze network design visually
4. Automatically troubleshoot and collect data without custom scripts
In short, NetBrain’s visual workbench allows consultants to complete network assessment tasks much faster and with much more accuracy.
Enabling Active Flow Manipulation (AFM) in Silicon-based Network Forwarding E..., Tal Lavian Ph.D.
Programmable Internet:
Enhance internetworking functions.
Move computations into the network for value added services.
Manage the network more capably than possible with SNMP.
More quickly introduce DiffServ or IntServ to support new multimedia applications
Implement traffic control algorithms to support QoS.
ODL is one of the most reliable and secure controllers in Software Defined Networks. This presentation takes you through the journey of load-balanced switching.
Implementation and analysis of QUIC for MQTT, Puneet Kumar
Transport and security protocols are essential to ensure reliable and secure communication between two parties. For IoT applications, these protocols must be lightweight, since IoT devices are usually resource constrained. Unfortunately, the existing transport and security protocols – namely TCP/TLS and UDP/DTLS – fall short in terms of connection overhead, latency, and connection migration when used in IoT applications. In this paper, after studying the root causes of these shortcomings, we show how utilizing QUIC in IoT scenarios results in higher performance. Based on these observations, and given the popularity of MQTT as an IoT application layer protocol, we integrate MQTT with QUIC. By presenting the main APIs and functions developed, we explain how connection establishment and message exchange functionalities work. We evaluate the performance of MQTTw/QUIC versus MQTTw/TCP using wired, wireless, and long-distance testbeds. Our results show that MQTTw/QUIC reduces connection overhead in terms of the number of packets exchanged with the broker by up to 56%. In addition, by eliminating half-open connections, MQTTw/QUIC reduces processor and memory usage by up to 83% and 50%, respectively. Furthermore, by removing the head-of-line blocking problem, delivery latency is reduced by up to 55%. We also show that the throughput drops experienced by MQTTw/QUIC when a connection migration happens are considerably lower than those of MQTTw/TCP.
#http3 #QUIC #IoT #IoT_Application #Transport_in_IoT #TCP
A 12-page paper describing the challenges of diagramming enterprise networks and the benefits of Dynamic Mapping. The paper draws analogies between Dynamic Maps and Google Maps.
Dynamic Classification in a Silicon-Based Forwarding Engine, Tal Lavian Ph.D.
Implement flow performance enhancement mechanisms without introducing software into data forwarding path
Service defined packet processing in a silicon-based forwarding engine
Policy-based Dynamic packet classifier
Create OPEN platform for introduction of new services
Specify OPEN interfaces for Java applications to control a generic, platform-neutral forwarding plane
Enable downloading of services to network node
Allow object sharing and inter-service communication
The Carrier DevOps Trend (Presented to Okinawa Open Days Conference), Alex Henthorn-Iwane
Telecom carriers are adopting DevOps practices to complement new SDN and NFV network architectures. This presentation to the Okinawa Open Days 2014 conference talks about why this is so, how carriers are going about it, and some best practices.
Presentation given at Isocore's MPLS/SDN 2014 conference in Washington DC, on devtest orchestration to support the SDN/NFV transition and DevOps transformation at carriers and mobile operators.
DDoS detection needs to get scalable and mitigation-neutral so you can devise a protection strategy that is agile and long-lasting. This presentation delves into how big data analytics is useful for DDoS detection. You can see the live version at: http://www.kentik.com/webinars
The goal of swarm-sec is to assess and highlight the security issues in a native swarm cluster setup and to help a developer or administrator deploy a secure cluster environment. swarm-sec is an extension to docker-bench-security.
Currently swarm-sec can:
- Assess possible issues in swarm daemon configuration
- Detail node-specific security issues in the cluster
- Report remote connection issues between the node and swarm manager
If your business is heavily dependent on the Internet, you may be facing an unprecedented level of network traffic analytics data. How to make the most of that data is the challenge. This presentation from Kentik VP Product and former EMA analyst Jim Frey explores the evolving need, the architecture and key use cases for BGP and NetFlow analysis based on scale-out cloud computing and Big Data technologies.
Introduction to Docker Networking options. We give an in-depth description of the different options with single-host examples. See our other presentations for multi-host, IPv6, and CoreOS Flannel descriptions.
Edge Computing Platforms and Protocols - Ph.D. thesis, Nitinder Mohan
Introductory presentation for Ph.D. thesis of Nitinder Mohan titled "Edge Computing Platforms and Protocols". The defense took place at the University of Helsinki, Finland on 8th November 2019.
The video of the presentation is available at https://youtu.be/dDVZozTwreE
The thesis can be found on https://helda.helsinki.fi/handle/10138/306041
Radisys/Wind River: The Telcom Cloud - Deployment Strategies: SDN/NFV and Vir..., Radisys Corporation
Radisys and Wind River present on the evolution to the Telecom Cloud and how cloud technology and network virtualization will provide both big opportunities and challenges for operators. Important details and insights are shared on Network Function Virtualization (NFV), Software Defined Network (SDN) and Virtualization.
Mobile IoT Middleware Interoperability & QoS Analysis - Eclipse IoT Day Paris..., Nikolaos Georgantas
Research results by the Inria Paris MiMove Team on Mobile IoT Middleware Interoperability & QoS Analysis. Presentation at Eclipse IoT Day Paris Saclay 2019
Presentation of the paper "Building and Operating Distributed SDN-Cloud Testbed with Hyper-convergent SmartX Boxes" at the EAI Cloud Computing Conference in Daejeon, Seoul, Korea.
Cloud computing and Software defined networking, saigandham1
This is my graduate defense presentation. I am interested in topics such as cloud computing and software defined networking. These slides survey the work of various researchers on cloud computing and SDN, presented as my comprehensive exam.
ENHANCING AND MEASURING THE PERFORMANCE IN SOFTWARE DEFINED NETWORKING, IJCNCJournal
Software Defined Networking (SDN) is a challenging chapter in today’s networking era. It is a network design approach that allows the network to be controlled, or 'altered', intelligently and centrally using software applications. SDN is a significant advancement that promises a better strategy than the Quality of Service (QoS) approaches in present communication systems. SDN dynamically changes the behavior and functionality of network devices using a single high-level program. It separates the network control and forwarding functions, enabling the network control to become directly programmable. It provides more functionality and more flexibility than traditional networks. A network administrator can easily shape the traffic without touching any individual switches and services which are needed in a network. The main technology for implementing SDN is the separation of the data plane and control plane, and network virtualization through programmability. Response time is the total time taken to respond to a user request. Throughput is how fast a network can send data. In this paper, we have designed a network on which we measured response time and throughput, comparing with Real-time Online Interactive Applications (ROIA), Multiple Packet Scheduler, and NOX.
1. Telemetry in Large Data Center Networks
Sriram Natarajan
Google Networking
on behalf of Google Technical Infrastructure
2. Google Technical Infrastructure
❯ Rich and deep history of building world-class infrastructure
❯ Common infrastructure across Google apps and Google Cloud Platform
❯ This infrastructure is available to you via GCP
3. Over the past 15+ years Google has built one of the fastest, most capable network infrastructures on the planet.
4. [Timeline figure, 2002 to 2014. Milestones shown: GFS, MapReduce, Bigtable, Colossus, Megastore, Dremel, Flume, Spanner, Dataflow, Kubernetes, Containers, Software Defined Networking, Efficient Server Power, On-Board UPS, Warehouse Scale Datacenter, Containerized Datacenter, Carbon Neutral, First Public PUE Reporting]
6. Google Network Infrastructure
• GGC: edge presence
• B4: WAN interconnect
• Jupiter: building-scale data center network
• Freedome: campus-level interconnect
• Andromeda: virtualize the physical network
7. What’s different about Data Center Networks
❯ Tiny round trip times
❯ Massive multi-path
❯ More plentiful and more uniform bandwidth
❯ Little buffering
❯ Latency, tail latency as important as bandwidth
❯ Homogeneity, protocol modification much easier
❯ Aggregate bandwidth of entire Internet in one building
8. Five Generations of Data Center Fabrics
Year   Datacenter Generation   Fabric Speed   Host Speed   Aggr BW
2004   Four-Post CRs           10G            1G           2T
2005   Firehose 1.0            10G            1G           10T
2006   Firehose 1.1            10G            1G           10T
2008   Watchtower              10G            1G           82T
2009   Saturn                  10G            10G          207T
2012   Jupiter                 10/40G         10G/40G      1.3P
[Figure: Aggregate traffic generated by servers in our data centers, Jul ‘08 to Nov ‘14; vertical axis from 1x to 50x]
9. So, why do we need Telemetry & Analytics in DC Fabrics?
• 5+ fabric architectures in production
• 10+ kinds of switches as part of fabrics in production
• 20+ consumers per production fabric
Need Sensors, Software and Systems that help
• perform network design and modeling
• perform topology, configuration and routing verification
• perform smart analytics for root cause isolation
10. Data Center Fabrics are Complex
[Diagram: n-stage fabric with host stacks (virtualization, transport, application), switch stack, and controller software]
1. State is distributed across various elements inside and out of the fabric
2. Complex interaction between the states
3. Multiple uncoordinated writers of state under various control loops
4. Large: Impossible to humanly observe state and react to ambiguity or faults
Challenges are similar for SDN-centric or traditional protocol-based networks
11. Life of a typical Data Center Fabric
OPERATE (Apps and SLAs)
• Define application SLAs
• Measure SLAs and traffic characteristics
• Feed stats to TE, enforcers, PCR schedulers
CONNECT (Routing, reachability)
• Design connectivity policies
• Create the intended configuration
• Push configuration to devices
BUILD (Initial Topology)
• Design the network topology
• Model the intended topology
• Populate (deploy, wire-up) DC floor
12. Safety, Correctness and Visibility in a DC Fabric
OPERATE (Apps & SLA)
• Define application SLAs
• Measure SLAs and traffic characteristics
• Feed stats to TE, enforcers, PCR schedulers
CONNECT (routing, reachability)
• Create connectivity policies & config
• Push configuration to devices
• Verify routing consistency
BUILD (Initial Topology)
• Design & model the network topology
• Populate (deploy, wire-up) DC floor
• Verify deployed topology against intent
13. Compounded by Scale
• 10,000+ switches per fabric**
• 0.25 Million+ links per fabric**
• 10 Million+ routing rules per fabric**
• 30,000 burst updates within 1 min**
• 3.5 Billion searches per day
• 300 hours of video uploaded every minute
BUILD & topology | CONNECTIVITY & routing | SLA & app visibility
**Typical numbers seen in large data center fabrics
14. Lots of sensors and data sources
fabric data plane | application plane | control & management plane
• TCP, UDP / IP stats
• virtual switch state
• RPC sensors
• application sensors
• host enforcers
• stats | counters
• queue depths
• switch probes
• host probes
• routing state
• forwarding state
• topology models
• configuration models
Challenge: To use these effectively to simplify operational complexities in Data Center Fabrics
15. Systems to Enable Safety, Correctness & Visibility
01. Topology Verification (BUILD & topology): To continually verify that what is deployed is what was intended
02. Route Consistency (CONNECTIVITY & routing): To verify routing state consistency between controllers and data plane
03. Traffic Characteristics** (SLA & app visibility): To verify host-granular reachability and measure traffic characteristics
**The analytics system described here focuses primarily on the host-level reachability and packet-loss characterization of app2app communication
16. 01. Topology Verification at Scale
How do we verify that what has been deployed and wired up matches the intended topology in a 10,000+ node / 250,000+ link fabric?
17. 01. Topology Verification at Scale
[Diagram: n-stage fabric]
• Read in the intended model of the fabric; generate a topology to verify against
• Generate probe traffic from hosts; this is not switched like production traffic
• Don’t rely on just destination-based routing; source routing ensures targeted full coverage
• Analytics app: takes all the data generated and localizes connectivity problems
Simultaneous detection of topological faults within a minute of occurrence
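The probe-and-compare idea on this slide can be illustrated with a small sketch. It is not the system described in the talk: the function names, the tuple format for probe results, and the rule that a link is cleared once any successful probe has crossed it are all assumptions made for illustration.

```python
"""Sketch: compare an intended topology model with links confirmed by probes."""


def normalize(link):
    # Treat links as undirected: (a, b) and (b, a) are the same cable.
    a, b = link
    return (a, b) if a <= b else (b, a)


def verify_topology(intended_links, probe_results):
    """Classify the links of the intended model (hypothetical helper).

    intended_links: iterable of (node, node) pairs from the intended model.
    probe_results: iterable of (path, ok), where path is the list of nodes a
        source-routed probe was told to traverse and ok says whether it arrived.
    Returns (confirmed, faulty_candidates, untested).
    """
    intended = {normalize(link) for link in intended_links}
    confirmed = set()
    suspect = {}
    for path, ok in probe_results:
        hops = [normalize(hop) for hop in zip(path, path[1:])]
        if ok:
            confirmed.update(hops)
        else:
            for hop in hops:
                suspect[hop] = suspect.get(hop, 0) + 1
    # A link on a failed path is cleared if some successful probe crossed it.
    faulty_candidates = {h: n for h, n in suspect.items() if h not in confirmed}
    untested = intended - confirmed - set(faulty_candidates)
    return confirmed, faulty_candidates, untested


if __name__ == "__main__":
    intended = [("h1", "s1"), ("s1", "s2"), ("s1", "s3"), ("s2", "h2"), ("s3", "h2")]
    probes = [
        (["h1", "s1", "s2", "h2"], True),   # path through s2 works
        (["h1", "s1", "s3", "h2"], False),  # path through s3 fails
    ]
    confirmed, faulty, untested = verify_topology(intended, probes)
    print("confirmed:", sorted(confirmed))
    print("fault candidates:", sorted(faulty))
    print("untested:", sorted(untested))
```

Source routing matters here because each probe names its exact path, so a failure narrows the fault to the hops of that path minus any hop another probe has already confirmed.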
18. 02. Routing Consistency Verification at Scale
How do we verify consistency between configured policy, routing state and forwarding state in a fabric with 10M+ rules, and detect & isolate loops & black holes quickly?
19. 02. Libra: Routing Consistency Verification at Scale
• Generate snapshot of the routing state by recording route change events
• Map: Create a network slice by destination subnet by picking rules relevant to that subnet against a shard of the full set
• Reduce: Construct a directed forwarding graph and verify properties such as loop freedom and reachability
• At every subsequent routing update, only analyze incremental updates
[Diagram: SDN Controller → Route Snapshot → Rule Shard 1 … Rule Shard n → mappers (M) → reducers (R) → Report Subnet 1 … Report Subnet m]
Detection of Loops & Holes within 1ms of occurrence in a 10K Node Network
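The map and reduce steps above can be sketched in a few lines of single-process Python. This is only an illustration of the divide-and-conquer idea, not Libra itself: the rule format `(switch, prefix, next_hop)`, the `DELIVER` marker, and the restriction to one next hop per switch (no ECMP) are assumptions.

```python
"""Sketch: map/reduce-style loop and black-hole check over sharded rules."""
import ipaddress
from collections import defaultdict

DELIVER = "DELIVER"  # hypothetical marker: the switch delivers to the subnet


def map_rules(shard, subnets):
    """Map: for each rule in one shard, emit it under every subnet it covers."""
    for switch, prefix, next_hop in shard:
        net = ipaddress.ip_network(prefix)
        for subnet in subnets:
            if ipaddress.ip_network(subnet).subnet_of(net):
                yield subnet, (switch, next_hop, net.prefixlen)


def reduce_subnet(entries, switches):
    """Reduce: build the forwarding graph for one subnet and walk it."""
    best = {}
    for switch, next_hop, plen in entries:
        # Longest-prefix match decides which rule a switch actually uses.
        if switch not in best or plen > best[switch][1]:
            best[switch] = (next_hop, plen)
    graph = {sw: hop for sw, (hop, _) in best.items()}
    loops, black_holes = set(), set()
    for start in switches:
        seen, node = set(), start
        while node != DELIVER:
            if node in seen:
                loops.add(start)
                break
            if node not in graph:
                black_holes.add(start)
                break
            seen.add(node)
            node = graph[node]
    return sorted(loops), sorted(black_holes)


def verify(shards, subnets, switches):
    grouped = defaultdict(list)
    for shard in shards:  # each shard could be handled by a separate mapper
        for subnet, entry in map_rules(shard, subnets):
            grouped[subnet].append(entry)
    return {s: reduce_subnet(grouped[s], switches) for s in subnets}


if __name__ == "__main__":
    shards = [
        [("s1", "10.0.0.0/16", "s2")],
        [("s2", "10.0.1.0/24", "s1"), ("s3", "10.0.1.0/24", DELIVER)],
    ]
    print(verify(shards, ["10.0.1.0/24"], ["s1", "s2", "s3"]))
```

Slicing by destination subnet keeps each reducer's graph small, which is what allows the check to be spread across many workers.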
20. 03. App-level SLA measurements at Scale
How do we measure host-level reachability and app-level traffic characteristics comprehensively across all host-pairs and all traffic classes with varying SLA & TE needs?
21. 03. App-level SLA measurements at Scale
• Randomly pick a subset of hosts and generate probes to and from the hosts
• Exercise src2dest and dest2src paths through entire host-fabric-host SW stack
• Structured rotation of probes across all hosts & traffic queues for full coverage
• Correlate probe loss & latency in fwd & rev directions across a number of probe sets to localize issues
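The correlation step can be illustrated with a toy example. The record layout `(src, dst, queue, fwd_ok, rev_ok)`, the thresholds, and the simple loss-ratio rule are assumptions chosen for the sketch; the talk does not describe the actual algorithm.

```python
"""Sketch: localize a faulty host by correlating loss across probe sets."""
from collections import defaultdict


def localize(probes, min_probes=3, loss_threshold=0.5):
    """Return {(host, queue): loss_ratio} for endpoints with high loss.

    probes: iterable of (src, dst, queue, fwd_ok, rev_ok) records, where
        fwd_ok / rev_ok say whether the forward and reverse probe arrived.
    A host is suspected only when it appears in at least min_probes probes
    and at least loss_threshold of them lost traffic in either direction.
    """
    sent = defaultdict(int)
    lost = defaultdict(int)
    for src, dst, queue, fwd_ok, rev_ok in probes:
        for host in (src, dst):
            sent[(host, queue)] += 1
            if not (fwd_ok and rev_ok):
                lost[(host, queue)] += 1
    suspects = {}
    for key, count in sent.items():
        if count >= min_probes:
            ratio = lost[key] / count
            if ratio >= loss_threshold:
                suspects[key] = ratio
    return suspects


if __name__ == "__main__":
    probes = [
        ("h1", "h2", "q0", False, True),
        ("h1", "h3", "q0", True, False),
        ("h1", "h4", "q0", False, False),
        ("h2", "h3", "q0", True, True),
        ("h2", "h4", "q0", True, True),
        ("h3", "h4", "q0", True, True),
    ]
    print(localize(probes))
```

A single lost probe is ambiguous between its two endpoints; looking across many probe pairs is what singles out the host that all the failing pairs have in common.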
22. That IS a lot of data...
Now that your network infrastructure is richly instrumented...how do you extract this information?
We use an RPC framework optimized for encrypted, streaming, and multiplexed connections.
...and so can you.
http://grpc.io
23. Network equipment
● We’ve discussed a lot of external software telemetry sources...what about data from network devices themselves?
● Today, SNMP is often the de facto network telemetry protocol. Time to upgrade!
○ legacy implementations -- designed for limited processing and bandwidth
○ expensive discoverability -- re-walk MIBs to discover new elements
○ no capability advertisement -- test OIDs to determine support
○ rigid structure -- limited extensibility to add new data
○ proprietary data -- require vendor-specific mappings and multiple requests to reassemble data
○ protocol stagnation -- no absorption of current data modeling and transmission techniques
24. Next generation network telemetry:
● network elements stream data to collectors (push model)
● data populated based on vendor-neutral models whenever possible
● utilize a publish/subscribe API to select desired data
● scale for next 10 years of density growth with high data freshness
○ other protocols distribute load to hardware, so should telemetry
● utilize modern transport mechanisms with active development communities
○ e.g., gRPC (HTTP/2), Thrift, protobuf over UDP
www.openconfig.net
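The push and publish/subscribe points above can be made concrete with a minimal in-process sketch. It does not use gRPC or any OpenConfig API; the `TelemetryBroker` class, its methods, and the path strings are hypothetical and only show how a collector might select data by path prefix while the device pushes updates.

```python
"""Sketch: push-model telemetry with prefix-based subscriptions."""
from collections import defaultdict


class TelemetryBroker:
    """Hypothetical in-process stand-in for a streaming telemetry service."""

    def __init__(self):
        self._subs = defaultdict(list)  # path prefix -> list of callbacks

    def subscribe(self, path_prefix, callback):
        # A collector registers interest once instead of polling repeatedly.
        self._subs[path_prefix.rstrip("/")].append(callback)

    def publish(self, path, value, timestamp):
        # The network element pushes an update; return how many callbacks ran.
        delivered = 0
        for prefix, callbacks in self._subs.items():
            if path == prefix or path.startswith(prefix + "/"):
                for callback in callbacks:
                    callback(path, value, timestamp)
                    delivered += 1
        return delivered


if __name__ == "__main__":
    broker = TelemetryBroker()
    broker.subscribe("/interfaces/eth0", lambda p, v, t: print(t, p, v))
    broker.publish("/interfaces/eth0/counters/in-octets", 1200, 1)
    broker.publish("/interfaces/eth1/counters/in-octets", 800, 2)  # no subscriber
```

In contrast to SNMP polling, the subscriber states what it wants once and the producer decides when to send, which is the property the slide calls the push model.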
25. Building Safe, Correct and Transparent DC Fabrics
OPERATE (Apps & SLA)
• Detect host reachability issues and SLA violations within minutes of these events
CONNECT (routing, reachability)
• Verify routing consistency and detect black holes, loops within a minute of a routing or configuration change
BUILD (init topology)
• Explore Topology Design Space & Synthesize
• Verify deployed topology against intent within minutes of turn-up
26. Key Take-Aways
❯ Think outside the BOX
❯ Sensors and DATA ANALYTICS are key for building data center networks
❯ HUMANS are (almost) useless
27. Data Center Networking @ Google
• An SDN centric DevOps organization
• Responsible for the architecture, design, and build of Google’s data center networks
• Responsible for development of software & systems for modeling, automation, telemetry, analytics
Network Engineers, Network Systems Engineers, Technical Program Managers, SRE, Architects
Join Data Center Networking @ Google
28. Sriram Natarajan
snr@google.com
THANK YOU
References
1. “Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network,” SIGCOMM 2015.
2. “Network Detective: Finding network blackholes,” Israel Networking Day 2014.
3. “Libra: Divide and Conquer to Verify Forwarding Tables in Huge Networks,” NSDI 2014.
4. “B4: Experience With a Globally-Deployed Software Defined WAN,” SIGCOMM 2013.
5. “Bandwidth Enforcer: Flexible, Hierarchical Bandwidth Allocation for WAN Distributed Computing,” SIGCOMM 2015.