Anurag Palsule
anurag@avinetworks.com
Ludicrous Scaling of SSL Traffic
Increase Application Capacity, Reliability, and Scale
Why do we need ludicrous scale?
- Cashless transactions have grown from 5% to 25% in a matter of a
few months!
- IRCTC bookings have grown from 29 tickets per day to 13 lakh (1.3
million) per day!
- IHS forecasts that the IoT market will grow from an installed
base of 15.4 billion devices in 2015 to 30.7 billion devices in
2020 and 75.4 billion in 2025 => huge scalability requirements
for IoT applications
Load Balancer Scalability – New Considerations
• SSL/TLS traffic seeing explosive growth
• Performance myth: ultra-expensive, inflexible
hardware appliances are the only solution
• Moore’s law: advances in Intel x86 servers –
processors, memory, and networking
• Crypto advances: RSA 2K vs. ECC encryption keys
• Software-defined architectural advances enable
significant elasticity
Architectural Approaches to Scale Load Balancers
- Hierarchical load balancers
- DNS + Proxy load balancers
- Route injection/Anycast load balancers
Hierarchical Load Balancers
Concept:
• Chaining of load balancing services
• Tier 1 – Layer 4 (TCP/UDP) load balancing
• Tier 2 – Layer 7 load balancing
Pros:
• Simplest approach
• May suffice for small scale environments
Cons:
• Limited by performance of Tier-1 LB
Diagram: Users → Tier 1 Load Balancer (L4) → Tier 2 Load Balancer (L7) → Application Instances
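The two tiers can be sketched in a few lines: tier 1 hashes the TCP 4-tuple so a flow sticks to one L7 proxy, and tier 2 routes on request content. All names and pools here are illustrative, not from any real product:

```python
import hashlib

# Hypothetical tier-2 proxies and backend pools (illustrative only).
TIER2_PROXIES = ["l7-proxy-1", "l7-proxy-2", "l7-proxy-3"]
POOLS = {"/api": ["app-1", "app-2"], "/static": ["cdn-1"]}

_rr = 0  # round-robin counter for tier 2

def tier1_pick(src_ip, src_port, dst_ip, dst_port):
    """L4: hash the connection 4-tuple so every packet of a flow
    reaches the same tier-2 proxy."""
    key = f"{src_ip}:{src_port}:{dst_ip}:{dst_port}".encode()
    digest = int(hashlib.sha256(key).hexdigest(), 16)
    return TIER2_PROXIES[digest % len(TIER2_PROXIES)]

def tier2_pick(path):
    """L7: choose a backend pool from the request path,
    then round-robin within the pool."""
    global _rr
    pool = POOLS["/api"] if path.startswith("/api") else POOLS["/static"]
    _rr += 1
    return pool[_rr % len(pool)]

proxy = tier1_pick("203.0.113.7", 51000, "198.51.100.1", 443)
backend = tier2_pick("/api/orders")
```

Note how the con above falls out of this structure: every flow traverses tier 1, so the whole chain is capped by tier 1's throughput.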
DNS + Proxy Load Balancers
Concept:
• DNS redirections with server mirroring
• Dynamic mapping of hostname to IP
addresses
Pros:
• Easy to configure
• Scales well
Cons:
• DNS caches can become stale
Diagram: Users → DNS → Load Balancers (IP1–IP4) → Application Instances
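Round-robin DNS can be sketched as a record set that rotates on each lookup, spreading clients across the load balancers' IPs. The hostname and addresses are made up for illustration:

```python
# Toy round-robin DNS resolver (illustrative; real resolvers and
# caches add TTL handling, which is where staleness comes from).
RECORDS = {"app.example.com": ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]}

def resolve(hostname, _state={}):
    """Return the record set rotated by one position per lookup,
    so successive clients see a different first IP."""
    ips = RECORDS[hostname]
    start = _state.get(hostname, 0)
    _state[hostname] = (start + 1) % len(ips)
    return ips[start:] + ips[:start]
```

The "stale cache" con shows up when a resolver keeps serving a cached answer after an IP is removed from the record set; clients then keep connecting to a dead load balancer until the TTL expires.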
Route injection/Anycast Load Balancers
Concept:
• DNS resolves to single IP
• Upstream router holds IP address
• Router performs flow-based ECMP to
next hop load balancers
Pros:
• Can scale significantly – most routers
support at least 64 next hops
Cons:
• Access to an upstream router is needed
Diagram: Users → Router → Load Balancers → Application Instances
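Flow-based ECMP boils down to hashing the 5-tuple and indexing into the next-hop list, so all packets of one flow land on the same load balancer. A minimal sketch, with made-up next-hop names:

```python
import hashlib

# 64 next hops, matching the "most routers support at least 64" note.
NEXT_HOPS = [f"lb-{i}" for i in range(64)]

def ecmp_next_hop(src_ip, dst_ip, proto, src_port, dst_port):
    """Flow-based ECMP: a stable hash of the 5-tuple keeps each
    flow pinned to one next-hop load balancer."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = int(hashlib.md5(key).hexdigest(), 16)
    return NEXT_HOPS[digest % len(NEXT_HOPS)]
```

Real routers use hardware hash functions rather than MD5, but the property that matters is the same: determinism per flow, so TCP connections are not split across load balancers.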
Legacy ’90s Architecture: Box Approach
• Proprietary Hardware
• Manage Each Device
• No Automation
• No Telemetry
• Static Capacity
The State of Load Balancing/Application Delivery
WebScale computing is here, but load balancing is still a bottleneck!
Takeaways from
AWS/FB/Microsoft
• Commodity x86
• Manage As One
• Highly Automated
• Built-In Telemetry
• Elastic
Rigid (Legacy ADC/LBs) → Flexible, Fluid Capacity (Web-Scale Load Balancers)
Modern Distributed Architecture
Separate Control and Data Plane
Manage as one, not many devices
Diagram: Controller (Management Plane: UI/CLI) → Load Balancers (Data Plane) on Virtualized | Containers | Public Cloud compute
Modern Distributed Architecture
Separate Control and Data Plane
Manage as one, not many devices
Diagram: Controller (REST API) with Management & Orchestration (e.g., Mesos); Load Balancers on Bare Metal | Virtualized | Containers | Public Cloud
• Multi-Cloud Fabric: single solution, any environment
• Automation: highly programmable, plug-and-play
• Built-In Visibility & Analytics: actionable insights are key to automation
• Innovation
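"Highly programmable" here means driving the controller through its REST API instead of per-device configuration. The endpoint URL and payload shape below are invented for illustration and are not Avi's actual API:

```python
import json
from urllib import request

def build_scale_out_request(controller, se_group, count):
    """Build (but do not send) a hypothetical POST asking the
    controller to scale a service-engine group to `count` instances.
    The /api/scaleout path and field names are assumptions."""
    payload = json.dumps({"se_group": se_group, "desired_count": count}).encode()
    return request.Request(
        f"https://{controller}/api/scaleout",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_scale_out_request("controller.example.com", "default", 40)
```

The point is the shape of the workflow: one API call against one controller, rather than logging in to forty appliances.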
1 Million TPS on Google Compute Engine - Setup
Avi Networks – Elastic Application Services Fabric
Diagram: 320x test clients (ab on n1-highcpu-16 instances) → GCP Router → 40x Avi Service Engines (load balancers) + Controller → Application Instances
Key Stats
- Total cost for the setup in Google Compute: < $50
- SSL TPS: 0 to 1 million TPS in a few seconds
- Data plane: 40 VM instances with 32 hyperthreaded cores each
- Traffic generators: 320 VM instances with 16 hyperthreaded cores each
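A quick back-of-envelope check on the slide's numbers (arithmetic only, using the figures above):

```python
# Figures from the test setup.
total_tps = 1_000_000
service_engines = 40
clients = 320

per_se_tps = total_tps / service_engines  # SSL TPS each load balancer terminates
per_client_tps = total_tps / clients      # TPS each traffic generator must sustain
```

So each 32-core service engine terminates 25,000 SSL TPS, and each 16-core client generates 3,125 TPS.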
• Setup in Google Compute
• Bootstrap instance - 1 g1-small instance
• Avi Controller - 1 n1-standard-4 instance
• Avi Service Engines (load balancers) - 40 n1-highcpu-32 instances
• Pool server - 1 g1-small instance
• Test clients (load/traffic generators) - 320 n1-highcpu-16 instances
• Running the test
• https://github.com/avinetworks/avi-test-scripts : This public repo has all the scripts
required for anyone to perform the scalability test
Test setup and methodology
Avi Networks Proprietary and Confidential 2017
Scale Performance Up and Out
Managed as One Elastic Load Balancer Fabric
SCALE-UP (more cores & IO): LB performance scales with CPUs (Moore’s Law) and IO (40 Gbps NICs)
• 1 LB, 1 core – 5 Gbps, 2,500 SSL TPS
• 1 LB, 2 cores – 10 Gbps, 5,000 SSL TPS
• 1 LB, 24 cores (2 sockets) – 20 Gbps (10 Gbps NICs), 60,000 SSL TPS
SCALE-OUT (more LBs): fabric performance scales horizontally with LBs
• 2 LBs, 1 core each – 10 Gbps, 5,000 SSL TPS
• Single app perf – 640 Gbps, 1.9M SSL TPS
• Scale to 200 LBs – 4 Tbps, 12M SSL TPS
Centralized API, Management, Monitoring
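The scale-out figures above are consistent with simple multiplication of per-LB capacity, which is the point of horizontal scaling:

```python
# Per-LB capacity at the top of the scale-up curve.
per_lb_gbps = 20
per_lb_tps = 60_000
lbs = 200

fabric_gbps = per_lb_gbps * lbs   # 4,000 Gbps = 4 Tbps
fabric_tps = per_lb_tps * lbs     # 12,000,000 SSL TPS

# The 640 Gbps single-app figure implies roughly 32 LBs behind one VIP.
lbs_for_single_app = 640 / per_lb_gbps
```

Linear scaling like this assumes the ECMP/DNS distribution layer in front of the fabric is not itself the bottleneck.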
Beyond Google Compute; Any Data Center or Public Cloud
Diagram: Clients → Router → Load Balancers (with Controller) → Application Instances
DEMO
Real-time Insights for Elastic Application Services
The New Rules of Elastic, Cost-effective Load Balancing
1 Take advantage of WebScale architectures
2 Use analytics-driven decisions for on-demand elasticity
3 Automate L4–L7 services with APIs
4 Leverage load balancers for application intelligence
5 Eliminate hardware overprovisioning
Anurag Palsule
anurag@AviNetworks.com
Thank You!
Avi Networks (India) Pvt Ltd.
JB House, 110, 4th Cross,
5th Block, Koramangala Industrial Layout,
Bangalore 560 095, Karnataka.

Editor's Notes

  • #3 Customer: I have hundreds of servers in my data center. With Chef, Puppet, or Ansible, I can turn a couple of racks into web servers within 5 minutes, and another two racks into app servers.
  • #10 Load balancers sit in the network today, in front of every business-critical application in your environment. These are largely standard x86 servers in a proprietary box, with each box containing its own management plane and data plane – so your teams manage each one individually as an independent appliance. Avi’s architecture consolidates the management plane into a centralized controller, which allows you to manage the data plane – what we call a service engine – as an elastic fabric that can grow and shrink based on capacity needs, without increasing the number of management points.
  • #11 As your infrastructure goes from bare metal to virtual to containers and public cloud, you are now able to spin up the service engines as bare metal appliances on standard x86 servers, or as VMs, containers, or public cloud instances, depending on the application needs you are trying to meet. The bare metal deployments offer an easy transition from an existing hardware-appliance-based environment to a software-defined environment, while ensuring that a future transition to virtual, container, or public cloud environments is smooth. Across all of these environments, the controller offers a single point of management and monitoring.
  • #12 As your infrastructure goes from bare metal to virtual to containers and public cloud, you are now able to spin up the service engines as bare metal appliances on standard x86 servers, or as VMs, containers, or public cloud instances, depending on the application needs you are trying to meet. The bare metal deployments offer an easy transition from an existing hardware-appliance-based environment to a software-defined environment, while ensuring that a future transition to virtual, container, or public cloud environments is smooth. Across all of these environments, the controller offers a single point of management and monitoring. The controller is pre-integrated with management and orchestration platforms like vCenter, SDN controllers, container cluster managers like Mesos and Kubernetes, as well as public clouds like AWS. This allows a completely automated experience where service engines can be spun up or down and connected to networks automatically as needed. Finally, given the strategic location of load balancers, they are best positioned to provide visibility into application usage and performance. So we built hundreds of virtual probes into the service engines, which send real-time telemetry on app performance back to the controller. The controller has a real-time analytics engine that processes billions of data points to provide insights on application performance, usage, end-user experience, security posture, DDoS, and more. Now operations teams can track this data in real time for any application without needing monitoring fabrics, taps, or external network-based app performance monitoring solutions.