#lspe: Dynamic Scaling

Shock Absorbers and APIs

Steve Shah
Sr. Director, Product Management
February 21, 2013

Disclaimer

• I’m going to talk about a product.
ᵒIt’s kind of necessary in order to make this talk useful.
ᵒBut a lot of you have this product or know someone that does!
ᵒThe product is pretty cool…
ᵒIt can also sing and dance.
ᵒMaking coffee is on the roadmap.
• Sorry.
ᵒYes, I am marketing scum.
ᵒNo, I will not do a hard sell.
• My Competition
ᵒGoogle it. No really… It’s not hard to find them.
ᵒTheir product has various approaches too. I encourage you to ask them.

What is NetScaler?

Availability Performance Offload Security

NetScaler powers some of the world’s largest infrastructures.

1998 to 2012: From Load Balancing to Virtual Networking

1998 1999 2002 2003 2005 2006 2008 2009 2011

L4 SLB L7 SLB SSL SSLVPN AppFW ICA XML VPX SDX
GSLB CMP RHI SIP IPv6 nCore EdgeSight AppFlow
MUX DNS AAA-TM DataStream

Secret Decoder Ring:
SLB = Server Load Balancing
GSLB = Global Server Load Balancing
MUX = HTTP Multiplexing
SSL = SSL Acceleration
CMP = HTTP Compression
DNS = DNS Load Balancing / Proxy
RHI = Route Health Injection
ICA = App Proxy for ICA
IPv6 = IPv6 Routing, Switching, LB
XML = XML Security, Routing
VPX = Virtual NetScaler
nCore = multi-core scaling
SDX = Multi-tenant NetScaler

Agenda

• Things That Impact Scalability
• Shock Absorbers
• Out Scaling
• Your ADC has an API!

Things That Impact Scalability
Touching on a bit of theory…

Load is Not Linear

• There are startup costs for enabling features in an ADC (memory and CPU)
• However, each incremental request takes a small fraction of resources
• As load increases, some global functions can take resources as well
ᵒE.g., flushing unused IP fragments, running timers, management overhead, etc.

Data Structures and Big O

• I/O, Data structures, and String processing are big factors

• The two that get you are data structures and string processing
ᵒACLs, VLANs, connection table, connection state, persistence table, etc.
ᵒHTTP request processing and policy execution

• Know your Big O – understand their impact
ᵒBig O notation is how programmers describe efficiency of algorithms
ᵒE.g., O(n) vs. O(log n) vs. O(1)
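
To make the three classes concrete, here is a minimal Python sketch. It assumes a hypothetical connection table keyed by integer connection IDs and counts comparisons rather than measuring time. The names and sizes are illustrative only and say nothing about how any particular ADC is built.

```python
# Illustrative only: three ways to look up a connection ID, counting the
# comparisons each one needs. The table layout is an assumption.

def linear_lookup(keys, target):
    """O(n): scan a list front to back; returns (found, comparisons)."""
    steps = 0
    for key in keys:
        steps += 1
        if key == target:
            return True, steps
    return False, steps


def binary_lookup(sorted_keys, target):
    """O(log n): binary search on a sorted list; returns (found, comparisons)."""
    lo, hi, steps = 0, len(sorted_keys) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        steps += 1
        if sorted_keys[mid] == target:
            return True, steps
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, steps


def hash_lookup(table, target):
    """O(1) on average: a single probe into a dict (hash table)."""
    return target in table


if __name__ == "__main__":
    n = 100_000
    keys = list(range(n))
    table = dict.fromkeys(keys)
    worst = n - 1  # last element: worst case for the linear scan
    print("linear comparisons:", linear_lookup(keys, worst)[1])
    print("binary comparisons:", binary_lookup(keys, worst)[1])
    print("hash lookup found: ", hash_lookup(table, worst))
```

The linear scan's cost grows with the table, the binary search's cost grows with the number of times the table can be halved, and the hash probe stays roughly flat. That is the gap that matters once a connection table holds millions of entries.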

Shock Absorbers
Coping with Load

Launching v8: The Role of Data Structures

• Story time… launching a major service and what we learned
• Major new roll-out – expected to double the number of servers to handle
• Early testing revealed that large numbers of slow connections are meh
• Invest in your data structures! Clean up on several core structures
• Average connection lookup time driven to near constant time: O(1)
• Stir in a team that dreams in assembly language and can see cache misalignment by glancing at code and shave another 20% off connection lookup times (absolute times)
• Lesson: drive your apps to good data structures. Drive your vendors to do better.

MaxConns and SurgeQ

• Typical server performance curve under incoming load
• Peak perf – we want to stay there

MaxConns and SurgeQ

• Queue incoming requests in the ADC
• Set max conns here (at the peak of the performance curve)
• Server stays operating at maximum throughput
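
The idea on this slide can be sketched in a few lines of Python. This is a toy model, not NetScaler code: the class name, the FIFO ordering of the queue, and the method names are all assumptions made for illustration.

```python
from collections import deque


class SurgeProtectedServer:
    """Toy model: cap concurrent requests at max_conns, queue the overflow."""

    def __init__(self, max_conns):
        self.max_conns = max_conns
        self.active = set()          # requests currently on the server
        self.surge_queue = deque()   # requests held back in front of the server

    def arrive(self, request_id):
        """Forward the request if there is room, otherwise queue it."""
        if len(self.active) < self.max_conns:
            self.active.add(request_id)
            return "forwarded"
        self.surge_queue.append(request_id)
        return "queued"

    def complete(self, request_id):
        """Finish a request and promote the oldest queued one, if any."""
        self.active.remove(request_id)
        if self.surge_queue:
            promoted = self.surge_queue.popleft()
            self.active.add(promoted)
            return promoted
        return None
```

With `max_conns=2`, a third arrival is queued rather than forwarded, and it is released to the server as soon as one of the first two completes. The server never sees more than two concurrent requests, so it stays near its peak instead of sliding down the far side of the curve.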

Story time:
When 4 Hurricanes Hit

The SR-71 Approach: Go Faster
Treat a collection of NS devices like a grand unified “big” device
• Single System
ᵒconfigured and managed as a single logical system
• Scalable
ᵒscales with number of devices (distributes work)
• Fault Tolerant
ᵒHandles device failure, addition…
• Dynamic

The Sheet-metal Test
Steps:
• Take a cluster of NS, and an L2 switch.
• Configure the devices to your liking.
• Wrap the whole thing with sheet-metal, such that only the network ports remain exposed.
Test:
Must be able to configure and use this contraption as if it were just another NS box.
• connect wires into any visible port(s), create LAGs at will, enable L2 mode, MBF …
• point GUI to Cluster’s IP and configure away

Clustering

• Create a single system image out of a collection of instances
ᵒInstances = virtual machines, physical instances, or instances on multi-tenant boxes
• True shared management + data plane (the sheet metal test)
• Shared state for key data structures (persistence, health check, etc.)
• Linear scale by adding instances (up to 32)
• Ability to manage faults with proportional degradation

Real-time Analytics
• Bandwidth
• Connections
• Top ‘N Requests
• Response Time
• Frequency

Policy Based Actions
• Compress
• Cache
• Log
• Drop
• Respond

Traffic Selection, Policy Based Decision, Feedback loop

Scaling Globally

Active Site, Mirror Site

Global Server Load Balancing (GSLB)
NetScaler uses DNS to send users to the closest site based on administrator defined metrics (geography, topology, site performance, availability).

Route Health Injection (RHI)
NetScaler dynamically updates routing tables to direct clients to the active site based on real-time health monitoring of backend infrastructure.
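
As a rough illustration of the GSLB decision, the sketch below picks the closest site that is healthy and not overloaded. The `Site` fields, the `max_load` cutoff, and the use of distance as the only tie-breaker are assumptions for the example. A real deployment combines the administrator-defined metrics in its own way and returns the answer as a DNS response.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    healthy: bool
    distance_km: float  # hypothetical stand-in for "geography"
    load: float         # hypothetical stand-in for "site performance", 0.0 to 1.0


def pick_site(sites, max_load=0.9):
    """Return the closest healthy site below max_load, or None if none qualify."""
    candidates = [s for s in sites if s.healthy and s.load < max_load]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.distance_km)
```

If the nearest site fails its health check, the next-nearest healthy site is returned instead, which is the behavior the slide describes for users being steered away from a site that is down.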

API in a Nutshell: Your ADC Has This

API
• Interfaces: SOAP, RESTful
• Client Toolkits
ᵒScripting: Perl/PHP/Python/PowerShell
ᵒOOP: Java/C#/ASP/.NET
• Policy: Reverse JSON/XML based Call-Out
• Statistics: Bulk Reporting, Granular Reporting

More RESTful - HTTP Status Code

Success Case:
REQUEST: GET http://<nsip>/nitro/v1/config/lbvserver/lbv1
RESPONSE: HTTP 200 OK

Failure Case:
REQUEST:
POST http://<nsip>/nitro/v1/config/lbvserver
Content-Type:application/vnd.com.citrix.netscaler.lbvserver+json
{"lbvserver": {"name":"lbv111", "servicetype":"HTTP"}}
RESPONSE:
HTTP/1.0 409 Conflict
{
"errorcode": 273,
"message": "Resource already exists",
"severity": "ERROR"
}
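
A script can drive the same exchange with nothing but the Python standard library. The sketch below is a hedged example: the URL path, content type, and JSON bodies are taken from this slide, while the function names are made up for illustration and authentication is left out entirely. Pass your own management address where `nsip` is expected.

```python
import json
import urllib.error
import urllib.request


def build_add_lbvserver(nsip, name, servicetype):
    """Build (but do not send) the POST request shown on the slide."""
    body = json.dumps({"lbvserver": {"name": name, "servicetype": servicetype}})
    return urllib.request.Request(
        url=f"http://{nsip}/nitro/v1/config/lbvserver",
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/vnd.com.citrix.netscaler.lbvserver+json"
        },
        method="POST",
    )


def describe_result(status, body_text):
    """Turn an HTTP status code and response body into a one-line summary."""
    if 200 <= status < 300:
        return "ok"
    error = json.loads(body_text) if body_text else {}
    message = error.get("message", "unknown error")
    return f"HTTP {status}: {message} (errorcode {error.get('errorcode')})"


def send(request):
    """Send the request; error statuses such as 409 arrive as HTTPError."""
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return describe_result(response.status, response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        return describe_result(err.code, err.read().decode("utf-8"))
```

The point of the slide carries over directly: the script branches on the HTTP status code first and only parses the JSON body when it needs the error details.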


Example: Using Java
• Indicate we want “rollback on failure” in this session
• Prepare 3 lbvservers to be added in one bulk operation
• Print results
• Output: no attempt to add “lb3” because of rollback behavior
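
The Java listing from this slide is not reproduced here, so the following is only a Python simulation of the behavior the annotations describe, not the NITRO SDK. It assumes that the failure is a name that already exists and that "rollback" means undoing the adds that succeeded earlier in the same bulk call. The function name and the set used as a stand-in for device configuration are hypothetical.

```python
def bulk_add_with_rollback(existing, names):
    """Add names in order; on the first failure undo earlier adds and stop.

    `existing` is a set standing in for the device configuration.
    Returns a list of (name, outcome) pairs, one per requested name.
    """
    added, results = [], []
    for index, name in enumerate(names):
        if name in existing:
            results.append((name, "failed: resource already exists"))
            # Undo everything this call added so far, newest first.
            for done in reversed(added):
                existing.remove(done)
            results[:len(added)] = [(n, "rolled back") for n in added]
            # Anything after the failure is never attempted.
            results.extend((n, "not attempted") for n in names[index + 1:])
            return results
        existing.add(name)
        added.append(name)
        results.append((name, "added"))
    return results
```

Calling it with `existing={"lb2"}` and `["lb1", "lb2", "lb3"]` leaves the configuration unchanged: "lb1" is rolled back, "lb2" fails, and "lb3" is never attempted, which matches the note on the slide.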

AutoSense and AutoScale
• NetScaler monitors servers: CPU, Memory, Latency, Throughput …
• NetScaler monitoring engine auto-detects abnormal behavior in servers
• NetScaler triggers AutoScale capability on CloudStack
• CloudStack “auto-provisions” new server instances based on AutoScale policy
• CloudStack provides NetScaler new service resources and descriptions
• On successful AutoScale, NetScaler is auto-provisioned with bindings for the newly added services
• Traffic is automatically scaled to the new services by NetScaler
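
The monitor, trigger, provision, bind loop can be sketched as below. Everything here is a hypothetical stand-in: `Orchestrator` plays the provisioning role that CloudStack has on the slide, `LoadBalancer` plays the ADC, and the trigger is a simple average-CPU threshold chosen for the example. None of these names or rules come from either product's API.

```python
from dataclasses import dataclass, field


@dataclass
class Orchestrator:
    """Hypothetical provisioning side: hands out new server names on request."""
    next_id: int = 1

    def provision_server(self):
        name = f"server-{self.next_id}"
        self.next_id += 1
        return name


@dataclass
class LoadBalancer:
    """Hypothetical ADC side: tracks bound servers and a CPU threshold."""
    cpu_threshold: float
    servers: list = field(default_factory=list)

    def needs_scale_out(self, cpu_by_server):
        """True when average CPU across bound servers exceeds the threshold."""
        samples = [cpu_by_server[s] for s in self.servers if s in cpu_by_server]
        return bool(samples) and sum(samples) / len(samples) > self.cpu_threshold

    def autoscale_step(self, cpu_by_server, orchestrator):
        """One pass: detect, provision, bind. Returns the new server or None."""
        if not self.needs_scale_out(cpu_by_server):
            return None
        new_server = orchestrator.provision_server()
        self.servers.append(new_server)  # bind so it starts receiving traffic
        return new_server
```

Running `autoscale_step` on each monitoring interval gives the feedback loop: quiet intervals return `None`, and an interval where the average crosses the threshold adds exactly one server to the pool.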

