Windows Azure:
Scaling SDN in the Public Cloud
Albert Greenberg
Director of Development
Windows Azure Networking
albert@microsoft.com
• Microsoft’s big bet on public cloud
• Companies move their IT infrastructure to the cloud
• Elastic scaling and less expensive than an on-premises DC
• Runs major Microsoft properties (Office 365, OneDrive, Skype, Bing, Xbox)
Summary
• Scenario: BYO Virtual Network to the Cloud
• Per customer, with capabilities equivalent to its on-premises counterpart
• Challenge: How do we scale virtual networks across millions of servers?
• Solution: Host SDN solves it: scale, flexibility, timely feature rollout, debuggability
• Virtual networks, software load balancing, …
• How: Scaling flow processing to millions of nodes
• Flow tables on the host, with on-demand rule dissemination
• RDMA to storage
• Demo: ExpressRoute to the Cloud (Bing it!)
Infrastructure as a Service:
Develop, test, run your apps
Easy VM portability
If it runs on Hyper-V, it runs in Windows Azure:
Windows, Linux, … (Ubuntu, redis, mongodb, …)
Deploy VMs anywhere with no lock-in
What Does IaaS Mean for Networking?
Scenario: BYO Network
Windows Azure Virtual Networks
• Goal: BYO Address Space + Policy
• Azure is just another branch office of your enterprise, via VPN
• Communication between tenants of your Azure deployment should be efficient and scalable
10.1/16 <-> Secure Tunnel <-> 10.1/16
Public Cloud Scale
(Growth charts, 2010 vs. 2014: Compute Instances, Azure Storage, Azure DC Network Capacity)
Windows Azure momentum
How do we support 50k+ virtual networks, spread over a single 100k+ server deployment in a DC?
Start by finding the right abstractions
SDN: Building the right abstractions for Scale
Abstract by separating management,
control, and data planes
Azure Frontend
Controller
Switch
Management Plane
Control Plane
Example: ACLs
Management plane: Create a tenant
Control plane: Plumb these tenant ACLs to these switches
Data plane: Apply these ACLs to these flows
• Data plane needs to apply per-flow policy to millions of VMs
• How do we apply billions of flow policy actions to packets?
Solution: Host Networking
• If every host performs all packet actions for its own VMs, scale is much more tractable
• Use a tiny bit of the distributed computing power of millions of servers to solve the SDN problem
• If millions of hosts work to implement billions of flows, each host only needs thousands
• Build the controller abstraction to push all SDN to the host
VNets on the Host
• A VNet is essentially a set of mappings
from a customer defined address space
(CAs) to provider addresses (PAs) of hosts
where VMs are located
• Separate the interface to specify a VNet
from the interface to plumb mappings to
switches via a Network Controller
• All CA<-> PA mappings for a local VM
reside on the VM’s host, and are applied
there
Azure Frontend
Controller
Customer Config
VNet Description (CAs)
L3 Forwarding Policy (CAs <-> PAs)
VMSwitch  VMSwitch
Blue VMs
CA Space
Green VMs
CA Space
Northbound API
Southbound API
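
To make the CA/PA split concrete, here is a minimal sketch (illustrative only, with assumed names, not Azure's implementation): the controller plumbs CA -> PA mappings only to hosts that need them, and the host VMSwitch encapsulates outbound packets toward the provider address of the destination host.

# Hypothetical sketch of per-host CA -> PA mapping for a VNet (illustrative only).
from dataclasses import dataclass

@dataclass(frozen=True)
class Mapping:
    pa: str        # provider address of the host where the CA lives
    vnet_key: str  # tenant key carried in the encap header (e.g. a GRE key)

class HostVNetTable:
    """CA -> PA mappings relevant to VMs on *this* node, pushed by the controller."""
    def __init__(self):
        self._mappings = {}  # (vnet, ca) -> Mapping

    def plumb(self, vnet, ca, pa):
        self._mappings[(vnet, ca)] = Mapping(pa=pa, vnet_key=vnet)

    def encap(self, vnet, src_ca, dst_ca, local_pa):
        m = self._mappings[(vnet, dst_ca)]
        # Outer header targets the destination host; the inner packet is untouched.
        return {"outer_src": local_pa, "outer_dst": m.pa,
                "key": m.vnet_key, "inner": (src_ca, dst_ca)}

# Example: in the Green VNet, VM 10.1.1.2 on host 10.1.1.5 talks to 10.1.1.3 on host 10.1.1.6.
table = HostVNetTable()
table.plumb("Green", "10.1.1.3", "10.1.1.6")
print(table.encap("Green", "10.1.1.2", "10.1.1.3", local_pa="10.1.1.5"))
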
VNet Controller
Azure Frontend
Controller
Node1: 10.1.1.5
Blue VM1
10.1.1.2
Green VM1
10.1.1.2
Azure VMSwitch
Node2: 10.1.1.6
Red VM1
10.1.1.2
Green VM2
10.1.1.3
Azure VMSwitch
Node3: 10.1.1.7
Green S2S GW
10.1.2.1
Azure VMSwitch
Green Enterprise
Network
10.2/16
VPN GW
Customer Config
VNet Description
L3 Forwarding Policy
Secondary
Controllers
Consensus
Protocol
Forwarding Policy: Traffic to on-prem
Node1: 10.1.1.5
Blue VM1
10.1.1.2
Green VM1
10.1.1.2
Azure VMSwitch
Src:10.1.1.2 Dst:10.2.0.9
Src:10.1.1.2 Dst:10.2.0.9
Policy lookup: 10.2/16 routes to GW on host with PA 10.1.1.7
Controller
Src:10.1.1.5 Dst:10.1.1.7 GRE:Green Src:10.1.1.2 Dst:10.2.0.9
L3 Forwarding Policy
Node3: 10.1.1.7
Green S2S GW
10.1.2.1
Azure VMSwitch
Green Enterprise
Network
10.2/16
VPN GW
Src:10.1.1.2 Dst:10.2.0.9
L3VPN PPP
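
A toy version of the forwarding decision shown above (assumed data shapes, not production code): traffic from a Green VM to the on-prem 10.2/16 prefix matches a route whose next hop is the PA of the gateway host, so the VMSwitch GRE-encapsulates to 10.1.1.7 with the Green tenant key.

# Illustrative longest-prefix match for the on-prem route shown above.
import ipaddress

# Green VNet routing policy on Node1 (10.1.1.5), as pushed by the controller (assumed shape).
GREEN_ROUTES = [
    (ipaddress.ip_network("10.2.0.0/16"), "10.1.1.7"),  # on-prem prefix -> PA of the S2S gateway host
]

def forward(dst_ca, local_pa="10.1.1.5", vnet_key="Green"):
    dst = ipaddress.ip_address(dst_ca)
    # Longest-prefix match over the tenant's routes.
    matches = [(net, pa) for net, pa in GREEN_ROUTES if dst in net]
    if not matches:
        return ("no-route", dst_ca)
    _net, gw_pa = max(matches, key=lambda m: m[0].prefixlen)
    # Encapsulate: outer header to the gateway host, GRE key identifies the tenant.
    return {"outer_src": local_pa, "outer_dst": gw_pa, "gre_key": vnet_key, "inner_dst": dst_ca}

print(forward("10.2.0.9"))  # routes to the GW on the host with PA 10.1.1.7
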
IaaS VM
Cloud Load Balancing
• All infrastructure runs behind an LB to enable high availability and application scale
• How do we make application load balancing scale to the cloud?
• Challenges:
• Load balancing the load balancers
• Hardware LBs are expensive, and cannot support the rapid creation/deletion of LB endpoints required in the cloud
• Support 10s of Gbps per cluster
• Support a simple provisioning model
LB
Web Server
VM
Web Server
VM
SQL
Service
IaaS VM
SQL
Service
NAT
All-Software Load Balancer:
Scale using the Hosts
LB VM
VM DIP
10.1.1.2
VM DIP
10.1.1.3
Azure VMSwitch
Stateless
Tunnel
Edge Routers
Client
VIP
VIP
DIP
DIP
Direct
Return:
VIP
VIP
LB VM
VM DIP
10.1.1.4
VM DIP
10.1.1.5
Azure VMSwitch
NAT
Controller
Tenant Definition:
VIPs, # DIPs
Mappings
• Goal of an LB: Map a Virtual IP (VIP) to a Dynamic IP (DIP) set of a cloud service
• Two steps: Load Balance (select a DIP) and NAT (translate VIP -> DIP and ports)
• Pushing the NAT to the vswitch makes the LBs stateless (ECMP) and enables direct return (see the sketch below)
• SDN controller abstracts out LB/vswitch interactions
NAT
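
The split between DIP selection and NAT can be pictured with this sketch (hypothetical names and hashing scheme): a stateless MUX consistently hashes the flow to pick a DIP, and the destination host's vswitch performs the VIP <-> DIP NAT, which is what enables direct return.

# Toy model of a stateless software load balancer with NAT at the host vswitch.
import hashlib

VIP = "79.3.1.2"
DIPS = ["10.1.1.2", "10.1.1.3", "10.1.1.4", "10.1.1.5"]  # the tenant's DIP set

def mux_select_dip(src_ip, src_port, dst_port):
    """Stateless MUX: a consistent hash of the flow picks a DIP; no per-flow state."""
    key = f"{src_ip}:{src_port}->{VIP}:{dst_port}".encode()
    idx = int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % len(DIPS)
    return DIPS[idx]

def host_nat_inbound(pkt, dip):
    """At the DIP's host, the vswitch rewrites VIP -> DIP (DNAT)."""
    return {**pkt, "dst": dip, "orig_dst": VIP}

def host_nat_return(pkt):
    """Direct return: the vswitch rewrites DIP -> VIP and sends straight to the client."""
    return {**pkt, "src": VIP}

pkt = {"src": "203.0.113.7", "src_port": 51512, "dst": VIP, "dst_port": 443}
dip = mux_select_dip(pkt["src"], pkt["src_port"], pkt["dst_port"])
print(dip, host_nat_inbound(pkt, dip), host_nat_return({"src": dip, "dst": pkt["src"]}))
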
How We Scaled Host SDN
Flow Tables are the right abstraction
Node: 10.4.1.5
Azure VMSwitch
Blue VM1
10.1.1.2
NIC
Controller
Tenant Description
VNet Description
Flow Action
VNet Routing Policy
ACLs
NAT Endpoints
Flow Action
TO: 10.2/16 Encap to GW
TO: 10.1.1.5 Encap to 10.5.1.7
TO: !10/8 NAT out of VNET
Flow Action
TO: 79.3.1.2 DNAT to 10.1.1.2
TO: !10/8 SNAT to 79.3.1.2
Flow Action
TO: 10.1.1/24 Allow
10.4/16 Block
TO: !10/8 Allow
• VMSwitch exposes a typed Match-Action-Table API to the controller
• One table per policy
• Key insight: Let the controller tell the switch exactly what to do with which packets (e.g. encap/decap), rather than trying to use existing abstractions (tunnels, …) – sketched below
VNET LB NAT ACLS
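
A minimal sketch of layered, typed match-action tables like the ones above (an assumed structure, not the actual VMSwitch API): one table per policy, with a packet's destination evaluated against the VNET, NAT, and ACL tables in turn.

# Illustrative layered match-action tables: one typed table per policy, applied in order.
import ipaddress

net = ipaddress.ip_network  # shorthand

# Each table maps a destination prefix to a typed action (action name, argument).
VNET = [(net("10.2.0.0/16"), ("encap", "GW")),
        (net("10.1.1.5/32"), ("encap", "10.5.1.7"))]
NAT  = [(net("79.3.1.2/32"), ("dnat", "10.1.1.2"))]
ACLS = [(net("10.1.1.0/24"), ("allow", None)),
        (net("10.4.0.0/16"), ("block", None))]

def lookup(table, dst):
    """Longest-prefix match in a single table; None if the table has no opinion."""
    hits = [(n, act) for n, act in table if dst in n]
    return max(hits, key=lambda h: h[0].prefixlen)[1] if hits else None

def process(dst_ip):
    """Walk the tables in policy order and collect the actions to apply."""
    dst = ipaddress.ip_address(dst_ip)
    actions = [(name, lookup(tbl, dst)) for name, tbl in
               (("VNET", VNET), ("NAT", NAT), ("ACLS", ACLS))]
    return [(name, act) for name, act in actions if act is not None]

print(process("10.2.0.9"))  # VNET: encap to GW
print(process("10.4.3.1"))  # ACLS: block
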
1. Table typing and flow caching are critical to
Dataplane Performance
Node: 10.4.1.5
Azure VMSwitch
Blue VM1
10.1.1.2
NIC
Flow Action
TO: 10.2/16 Encap to GW
TO: 10.1.1.5 Encap to 10.5.1.7
TO: !10/8 NAT out of VNET
Flow Action
TO: 79.3.1.2 DNAT to 10.1.1.2
TO: !10/8 SNAT to 79.3.1.2
Flow Action
TO: 10.1.1/24 Allow
10.4/16 Block
TO: !10/8 Allow
VNET LB NAT ACLS
• COGS in the cloud is driven by VM density – 40GbE is here
• NIC offloads are critical to achieving density
• Requires significant design work in the VMSwitch to scale overlay / NAT / ACL policy to line speed
• First-packet actions can be complex, but established-flow matches need to be typed, predictable, and simple (see the sketch below)
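
A rough model of the first-packet / established-flow split (purely illustrative): the first packet of a flow walks the full policy tables, and the compiled result is cached so that established flows hit a single, simple entry that is predictable enough to offload.

# Toy flow cache: the slow path compiles a per-flow action list, the fast path replays it.
from typing import Callable

FlowKey = tuple  # (src_ip, dst_ip, proto, src_port, dst_port)

class FlowCache:
    def __init__(self, slow_path: Callable[[FlowKey], list]):
        self._slow_path = slow_path  # walks VNET/NAT/ACL tables (expensive)
        self._cache = {}             # established flows -> compiled actions

    def process(self, key: FlowKey) -> list:
        actions = self._cache.get(key)
        if actions is None:               # first packet: full policy evaluation
            actions = self._slow_path(key)
            self._cache[key] = actions    # cache the typed, predictable result
        return actions                    # established flow: single lookup

def slow_path(key: FlowKey) -> list:
    # Stand-in for walking every policy table on the first packet of the flow.
    return [("encap", "10.1.1.7"), ("allow", None)]

cache = FlowCache(slow_path)
flow = ("10.1.1.2", "10.2.0.9", "tcp", 51512, 443)
print(cache.process(flow))  # slow path, result cached
print(cache.process(flow))  # fast path hit
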
2. Separate Controllers By Application
Node: 10.4.1.5
Azure VMSwitch
Blue VM1
10.1.1.2
NIC
LB Controller
Tenant Description
VNet Description
Flow Action
VNet Routing Policy
ACLs
NAT Endpoints
Flow Action
TO: 10.2/16 Encap to GW
TO: 10.1.1.5 Encap to 10.5.1.7
TO: !10/8 NAT out of VNET
Flow Action
TO: 79.3.1.2 DNAT to 10.1.1.2
TO: !10/8 SNAT to 79.3.1.2
Flow Action
TO: 10.1.1/24 Allow
10.4/16 Block
TO: !10/8 Allow
VNET LB NAT ACLS
Network
Controller
VNet Controller
LB
VIP
Endpoints
Northbound API
3. Eventing: Agents are also per-Application
• Attempting to give each VMSwitch a synchronously consistent view of the entire network is not scalable
• Separate rapidly changing policy (location mappings of VMs in a VNet) from static provisioning policy
• VMSwitches should request needed mappings on-demand via eventing
• We need a smart host agent to handle eventing and look up mappings (see the sketch below)
Azure VMSwitch
Blue VM1
10.1.1.2
NIC
Flow Action
TO: 10.2/16 Encap to GW
TO: 10.1.1.5 Encap to 10.5.1.7
TO: !10/8 NAT out of VNET
VNET
VNet Agent
VNet Controller
Mapping Service
Mapping Service
Mapping Service
Policy (once)
Policy
Mapping Request Event
(No policy found for packet)
Mapping Request
Mappings
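
A hedged sketch of that eventing loop, with assumed class and method names: a packet with no matching policy raises a miss event, the host agent resolves the mapping from the mapping service, programs the VMSwitch, and every later packet of the flow is handled locally.

# Illustrative on-demand mapping resolution driven by switch miss events.
class MappingService:
    """Stand-in for the (possibly hierarchical) mapping service."""
    def __init__(self, mappings):
        self._mappings = mappings            # (vnet, ca) -> pa
    def lookup(self, vnet, ca):
        return self._mappings[(vnet, ca)]

class VMSwitch:
    def __init__(self):
        self.flow_table = {}                 # (vnet, dst_ca) -> action
        self.miss_handler = None             # set by the host agent
    def send(self, vnet, dst_ca):
        if (vnet, dst_ca) not in self.flow_table:
            self.miss_handler(vnet, dst_ca)  # event: no policy found for packet
        return self.flow_table[(vnet, dst_ca)]

class VNetAgent:
    def __init__(self, switch, mapping_service):
        self._switch, self._svc = switch, mapping_service
        switch.miss_handler = self.on_miss
    def on_miss(self, vnet, dst_ca):
        pa = self._svc.lookup(vnet, dst_ca)                       # mapping request
        self._switch.flow_table[(vnet, dst_ca)] = ("encap", pa)   # program the switch

svc = MappingService({("Green", "10.1.1.3"): "10.1.1.6"})
switch = VMSwitch()
VNetAgent(switch, svc)
print(switch.send("Green", "10.1.1.3"))  # miss -> lookup -> encap to 10.1.1.6
print(switch.send("Green", "10.1.1.3"))  # now served from the local flow table
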
Eventing: The Real API is on the Host
• The wire protocols between the controller, agent, and related services are now application specific (rather than generic SDN APIs)
• The real southbound API (which is implemented by VNet, LB, ACLs, etc.) is now between the agents and the VMSwitch
• High performance OS-level API rather than a wire protocol (sketched below)
• We have found that eventing is a requirement of any nontrivial SDN application
Azure VMSwitch
Blue VM1
10.1.1.2
NIC
Flow Action
TO: 10.2/16 Encap to GW
TO: 10.1.1.5 Encap to 10.5.1.7
TO: !10/8 NAT out of VNET
VNET
VNet Agent
VNet Controller
Mapping Service
Mapping Service
Mapping Service
Policy (once)
Mapping Request Event
(No policy found for packet)
Mapping Request
Southbound API
VNet Application
Mappings
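
One way to picture that host-local southbound API (an assumed shape, not the real interface): agents call a typed, in-process API on the VMSwitch instead of speaking a wire protocol.

# Hypothetical shape of an OS-level southbound API between agents and the VMSwitch.
from typing import Protocol, Callable

class SouthboundApi(Protocol):
    def create_table(self, name: str, policy_type: str) -> None: ...
    def add_rule(self, table: str, match: dict, action: tuple) -> None: ...
    def remove_rule(self, table: str, match: dict) -> None: ...
    def on_miss(self, table: str, callback: Callable[[dict], None]) -> None: ...

def plumb_vnet(api: SouthboundApi) -> None:
    # An agent programs its own application's tables; other agents (LB, ACL) do the same.
    api.create_table("VNET", policy_type="routing")
    api.add_rule("VNET", {"dst": "10.2.0.0/16"}, ("encap", "GW"))
    api.on_miss("VNET", lambda pkt: print("request mapping for", pkt["dst"]))

class InMemorySwitch:
    """Minimal fake implementation so the sketch runs end to end."""
    def __init__(self):
        self.tables, self.miss_cbs = {}, {}
    def create_table(self, name, policy_type):
        self.tables[name] = {"type": policy_type, "rules": []}
    def add_rule(self, table, match, action):
        self.tables[table]["rules"].append((match, action))
    def remove_rule(self, table, match):
        self.tables[table]["rules"] = [r for r in self.tables[table]["rules"] if r[0] != match]
    def on_miss(self, table, callback):
        self.miss_cbs[table] = callback

sw = InMemorySwitch()
plumb_vnet(sw)
print(sw.tables)
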
4. Separate Regional and Local Controllers
• VNet scope is a region – 100k+ nodes. One controller can’t manage them all!
• Solution: Regional controller defines the VNet, local controller programs end hosts
• Make the Mapping Service hierarchical, enabling DNS-style recursive lookup (see the sketch after this slide)
VNET Agent
Local
Controller
Local
Mappings
Policy Mapping Request
Mappings
Flow Action
TO: 10.2/16 Encap to GW
TO: 10.1.1.5 Encap to 10.5.1.7
TO: !10/8 NAT out of VNET
VNET
Agent
Local
Controller
Local
Mappings
Policy Mapping Request
Mappings
Flow Action
TO: 10.2/16 Encap to GW
TO: 10.1.1.5 Encap to 10.5.1.7
TO: !10/8 NAT out of VNET
Regional
Controller
Regional
Controller
Regional
Controller
Regional
Controller
Regional
Controller
Regional
Mappings
Mapping Request
VNet Description
Policy
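
A DNS-style recursive lookup sketch matching the hierarchy above (names assumed): the agent asks its local controller, which answers from local mappings when it can and otherwise recurses to a regional controller and caches the result.

# Illustrative recursive mapping lookup: host agent -> local controller -> regional controller.
class RegionalController:
    def __init__(self, regional_mappings):
        self._mappings = regional_mappings           # authoritative for the whole region
    def resolve(self, vnet, ca):
        return self._mappings[(vnet, ca)]

class LocalController:
    def __init__(self, regional, local_mappings=None):
        self._regional = regional
        self._local = dict(local_mappings or {})     # cache of mappings for local hosts
    def resolve(self, vnet, ca):
        key = (vnet, ca)
        if key not in self._local:                   # miss: recurse upward, then cache
            self._local[key] = self._regional.resolve(vnet, ca)
        return self._local[key]

regional = RegionalController({("Green", "10.1.1.3"): "10.1.1.6"})
local = LocalController(regional)
print(local.resolve("Green", "10.1.1.3"))  # recursive lookup, answer cached locally
print(local.resolve("Green", "10.1.1.3"))  # served from local mappings
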
A complete virtual network needs
storage as well as compute!
How do we make Azure Storage scale?
Storage is Software Defined, Too
• Erasure Coding provides the durability of 3-copy writes with small (<1.5x) overhead by distributing coded blocks over many servers
• Lots of network I/O for each storage I/O (see the sketch below)
…
Write
Commit
Erasure Code
• We want to make storage clusters scale cheaply on commodity servers
To make storage cheaper, we use lots more network!
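
A back-of-the-envelope model of that trade-off (the 12+4 code parameters are an assumption chosen only to illustrate the <1.5x figure): erasure coding cuts storage overhead well below 3-way replication, but each write now touches many more servers over the network.

# Rough cost model: 3-way replication vs. an assumed (k=12, m=4) erasure code.
def replication(copies=3):
    return {"storage_overhead": copies, "servers_touched_per_write": copies}

def erasure_code(k=12, m=4):
    # k data fragments + m coded fragments, spread over k+m servers.
    return {"storage_overhead": (k + m) / k,        # 16/12 ≈ 1.33x (< 1.5x)
            "servers_touched_per_write": k + m}     # far more network I/O per storage I/O

print(replication())   # 3x storage, 3 servers per write
print(erasure_code())  # ~1.33x storage, 16 servers per write
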
RDMA – High Performance Transport for Storage
• Remote DMA primitives (e.g. Read address, Write address) implemented on-NIC – see the toy model below
• Zero Copy (NIC handles all transfers via DMA)
• Zero CPU Utilization at 40Gbps (NIC handles all packetization)
• <2μs E2E latency
• RoCE enables Infiniband RDMA transport over IP/Ethernet network (all L3)
• Enabled at 40GbE for Windows Azure Storage, achieving massive COGS savings by eliminating many CPUs in the rack
All the logic is in the host:
Software Defined Storage now scales with the Software Defined Network
NIC
Application
NIC
Application
Memory
Buffer A
Memory
Buffer B
Write local buffer at
Address A to remote
buffer at Address B
Buffer B is filled
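
Not a real verbs program, just a toy model of the one-sided semantics pictured above (all names assumed): the initiator posts a write of a local buffer to a remote address, and the remote NIC places the data into Buffer B without any remote CPU involvement.

# Toy model of a one-sided RDMA Write: remote memory is filled with no remote CPU work.
class RemoteNic:
    def __init__(self, memory: bytearray):
        self._memory = memory                      # registered, DMA-able region
    def dma_write(self, offset: int, data: bytes):
        # The NIC places bytes directly into application memory (no CPU copy, no interrupt).
        self._memory[offset:offset + len(data)] = data

class LocalNic:
    def rdma_write(self, local_buf: bytes, remote: RemoteNic, remote_offset: int):
        # Packetization and transfer happen on the NIC; the host just posts the request.
        remote.dma_write(remote_offset, local_buf)

buffer_b = bytearray(16)                           # remote Buffer B
LocalNic().rdma_write(b"block-0-payload", RemoteNic(buffer_b), remote_offset=0)
print(bytes(buffer_b))                             # Buffer B is filled
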
Just so we’re clear…
40Gbps of I/O with 0% CPU
Hybrid Cloud:
How do we Onboard Enterprise?
Public
internet
Public
internet
ExpressRoute: Direct Connection to Your
VNet
• All VNET policy to tunnel to/from customer circuit implemented on hosts
• Predictable low latency, high throughput to the cloud
ExpressRoute: Now live in MSIT!
Host
Customer
Router
ExpressRoute: Entirely Automated SDN Solution
Edge
Router
VMSwitch
Gateway
VM
BGP RIB
VNET Agent
Gateway
Controller
VNET
Controller
SLB
Mapping Service
DEMO: ExpressRoute
Result: We made SDN Scale
• VNET, SLB, ACLs, Metering, and more scale to millions of servers
• Tens of Thousands of VNETs
• Tens of Thousands of Gateways
• Hundreds of Thousands of VIPs
• 10s of Tbps of LB’d traffic
• Billions of Flows…
all in the host!
(Chart: bandwidth served by SLB to a storage cluster over a week, 20–40 Gbps)
Host Networking makes Physical Network
Fast and Scalable
• Massive, distributed 40GbE network built on commodity hardware
• No Hardware per-tenant ACLs
• No Hardware NAT
• No Hardware VPN / overlay
• No vendor-specific control, management, or data plane
• All policy is in software – and everything’s a VM!
• Network services deployed like all other services
• Battle-tested solutions in Windows Azure are coming to private cloud
10G Servers
We bet our infrastructure on Host SDN, and it paid off
• The incremental cost of deploying a new tenant, new VNet, or new load balancer is tiny – everything is in software
• Using scale, we are cheaper and faster than any tenant deployment an admin could do on-prem
• Public cloud is the future! Join us!
