Featured Speaker: Mike Coward, Co-Founder and CTO, Continuous Computing
When: Thursday, June 3, 2010
Time: 12:00-12:30 p.m.
Track: Deep Packet Inspection
Topic: DPI at 40G and 100G: Moving Beyond Appliances
ISS World Prague 2010 - DPI at 40G and 100G: Moving Beyond Appliances
1. 40G and 100G DPI Platforms
Moving Beyond Appliances
Mike Coward
CTO & co-founder - Continuous Computing
www.ccpu.com Confidential and Proprietary
2. Agenda
Company Overview
DPI Market Trends – High Scale Requirements
DPI System Architecture Options
Bladed DPI Development Options
100G System Design Example
3. Quick Corporate Facts
Founded Feb ’98
Private, VC-financed
>300 employees globally
Headquarters in San Diego
Engineering in Bangalore, Shenzhen
Acquired Trillium® from Intel (Feb ’03)
95,000 ft² facilities
ISO 9001 certified
RoHS compliant
Sell to equipment manufacturers, not operators
CMMI Level 3
4. Integrated Systems & Services
Protocols & HA Middleware
Integrated DPI Platforms
AdvancedTCA & CompactPCI Hardware
Professional Services
5. Two Categories of DPI Applications
Optimize / Protect OpEx:
IDS/IPS/Firewall
Security gateway (UTM)
Subscriber traffic shaping
Peer-to-Peer blocking
Peer-to-Peer caching / redirection
Lawful intercept
Drive Revenue:
Tiered subscriber bandwidth / applications
Managed Security Svcs
Market intelligence gathering
Mobile ad insertion
Server virtualization
Carrier demand for new revenue drives investment into large scale DPI
Lawful Intercept gets higher capacity systems with minimal investment
6. Movement towards 40G and 100G: Core and Edge
Core
Cisco CRS-3: >1000 ports of 100G Ethernet
Juniper T1600
Access Equipment: DSLAM, G-EPON, 10G-EPON, CMTS
Adding 10G interfaces, and starting to add 40G and 100G interfaces
CMTS: channel bonding yields 200 Mb/s
Google fiber broadband at 1Gb/s
Recent informal survey: 19 of 21 operators planning to skip
40G and move directly to 100G interfaces
7. Appliances
Appliances: Good enough for
multi-gigabit, even to 10G
Typically carefully tuned x86 with
customized accelerated packet capture cards
Biggest Problem: Application creep
Applications and inspection criteria tend to
expand over time: can’t guarantee line rate
performance with new feature additions
Other issues: Redundancy, Scalability
8. 40G Appliances? 100G Appliances?
Don’t bet against Moore’s Law
But not a silver bullet – still have to wait
2013: 40G Appliance
2016: 100G Appliance
9. The Solution Today: Bladed Systems
Bladed systems used in Telecom central office
deployments for 20 years
Provide scalability, reliability, upgradeability
Proven technology in DPI / LI
80 Gbps traffic shaping system deployed since 2007
10 Gbps network security platform deployed since 2008
40 Gbps network security platform deployed since 2009
10 Gbps LI inspection system deployed since 2008
15. Bladed DPI Components
4 components needed for Bladed DPI
x86 Compute Blades
Nothing better for complex flow analysis
Packet Processing Blades
Required for line rate encryption/decryption, IP header
manipulation, flow duplication, DPI
Ethernet Switch Fabric
Must support load balancing, complex filtering
Chassis
Multiple sizes needed to support range of capacities
16. Introduction to AdvancedTCA™ (ATCA)
Open blade specification created for telecom market
Designed for Central Office Requirements
Reliability – Highly Available, Easily Serviceable
Capacity – Large boards – high power processors
Bandwidth – Up to 40Gbps/slot – 500Gbps/chassis
200+ PICMG members, dozens of blade vendors
Globally $100M+ of ATCA-focused R&D
Every permutation of CPU (x86, AMD, PowerPC), Packet Processor
(RMI, Cavium, IXP), Ethernet switch fabric (Broadcom, Fulcrum), and
I/O (1G/10G Ethernet, ATM, SS7, serial)
Blades are designed to interoperate – no vendor lock in
Good economies of scale: ATCA is $1B market
17. Typical High-End Next Generation ATCA DPI
FM80 ATCA 40G Fabric Switch
40G Switching to every slot in the backplane
16k TCAM-based rules for pattern matching & ingress
processing
Integrated Load Balancing
XE80 Dual 6-core x86 CPU
Dual Westmere 6-Core CPU
Up to 64GB DDR3 memory
Dual 10GE Fabric high performance accelerated NIC
(TOE, RDMA, iSCSI)
PP80/CV80 – Dual 40G Packet Processors
Flexible Ethernet-based architecture
32GB memory for millions of flow entries
18. ATCA Comparison with Bladed Systems
Bladed Systems optimized for Enterprise compute
applications
Lots of x86 processors, fairly simple Ethernet switching
ATCA optimized for Telco applications – same reqmts as DPI
Broad array of silicon architectures: x86, PowerPC, Packet Processors,
Network Processors, FPGAs, DSPs
High Capacity Flexible I/O
Support for 10G, 40G, 100G Uplinks
Advanced Switch Features
Switch Load Balancing
Switch-based packet pre-filtering and routing
Globally, >$100M R&D focused on ATCA platforms
19. Packet Processors vs x86 Blades – Which is better?
Lots of debate in the market: Should DPI be done with packet
processors or x86?
x86 vendors say:
Intel roadmap moves faster than anyone else, commodity: very good pricing
Packet Processors vendors say:
Integrated NICs and architecture allow very high packet rates
Integrated encryption/decryption allows line rate security processing
Our answer: Use Both!
x86 blades are best for complex flow analysis, deep correlation, database
and reporting
Packet Processors: Packet reception, reassembly, regular expression
scanning, application identification
Our experience is that the highest performance systems on the market
include both
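The division of labor described on this slide can be illustrated with a small two-stage sketch. It is not CCPU's implementation: the signature set, the `identify` and `report` helpers, and the sample payloads are all hypothetical. Stage 1 stands in for the packet processor role (regular expression scanning and application identification), and stage 2 stands in for the x86 role (correlation and reporting, reduced here to bytes per application).

```python
import re
from collections import Counter

# Stage 1 (packet-processor role): identify the application from the
# first payload bytes. These three signatures are illustrative only.
SIGNATURES = {
    "http": re.compile(rb"^(GET|POST|HEAD|PUT|DELETE) \S+ HTTP/1\.[01]"),
    "bittorrent": re.compile(rb"^\x13BitTorrent protocol"),
    "tls": re.compile(rb"^\x16\x03[\x00-\x04]"),
}

def identify(payload: bytes) -> str:
    """Return the first matching application name, or 'unknown'."""
    for app, pattern in SIGNATURES.items():
        if pattern.search(payload):
            return app
    return "unknown"

# Stage 2 (x86 role): correlate and report. Here: total bytes per application.
def report(packets):
    totals = Counter()
    for payload in packets:
        totals[identify(payload)] += len(payload)
    return dict(totals)

sample = [
    b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n",
    b"\x13BitTorrent protocol" + b"\x00" * 8,
    b"\x16\x03\x01\x00\x05hello",
]
print(report(sample))
```

In a real system stage 1 would run on dedicated hardware at line rate and pass only identified or filtered traffic to stage 2; the sketch only shows where the boundary between the two roles sits.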
20. System Design Example
Requirements
Capable of monitoring 100G Tapped Interface
Want to monitor up to 1,000 subscribers, with up to 1,000 flows
per subscriber: 1M flows total
Want aggregate application decode rate of 10 Gbps
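The requirement figures can be checked with a few lines of arithmetic. The subscriber count, flows per subscriber, and decode rate come from the slide; the per-subscriber average is a derived number that the slide does not state, shown only to give a feel for the budget.

```python
# Back-of-envelope check of the stated requirements.
SUBSCRIBERS = 1_000
FLOWS_PER_SUBSCRIBER = 1_000
DECODE_RATE_GBPS = 10

# Size of the flow table the system must track.
total_flows = SUBSCRIBERS * FLOWS_PER_SUBSCRIBER

# Derived (not on the slide): average decode budget per monitored
# subscriber in Mb/s, using 1 Gb/s = 1000 Mb/s.
decode_mbps_per_subscriber = DECODE_RATE_GBPS * 1000 / SUBSCRIBERS

print(total_flows)
print(decode_mbps_per_subscriber)
```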
21. System Design Example
[Diagram labels: 100G input, Load Balancing, Switching, Load Balanced, Filtered, Flow Identification, Packet Identification, Packet Processing, Flow Processing, Flow Analysis, Compute]
100G Tapped interface means 200G into system (100G in each direction of the full-duplex link)
Need 200G of load balancing capability on switch
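The slides do not say how the switch distributes traffic. One common approach, assumed here purely for illustration, is to hash each flow's 5-tuple so that every packet of a flow reaches the same card. Because a tapped link delivers both directions, the hash in this sketch is made symmetric. The `pick_card` function, the CRC32 hash, and the card count of 5 are all hypothetical choices, not details of the FM80 switch.

```python
import zlib

def pick_card(src_ip, src_port, dst_ip, dst_port, proto, num_cards):
    """Map a flow to a card index; both directions map to the same card."""
    # Sort the two endpoints so (A -> B) and (B -> A) build the same key.
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{a[0]}:{a[1]}|{b[0]}:{b[1]}|{proto}".encode()
    # CRC32 is a stand-in for whatever hash the switch hardware applies.
    return zlib.crc32(key) % num_cards

# Forward and reverse directions of one TCP flow (protocol number 6).
fwd = pick_card("10.0.0.1", 40000, "192.0.2.7", 80, 6, 5)
rev = pick_card("192.0.2.7", 80, "10.0.0.1", 40000, 6, 5)
print(fwd == rev)
```

Keeping both directions on one card lets that card reassemble and track the flow without exchanging state with its neighbors.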
22. System Design Example
[Diagram labels: 40/100G input, Load Balancing, Switching, Packet Processing, Compute]
Each Flow Identification card can handle 40G traffic
200G ingress traffic means 5 packet processing cards
Add 1 card as a spare for N+1 redundancy
23. System Design Example
[Diagram labels: 40/100G input, Load Balancing, Switching, Packet Processing, Compute, Flow Analysis]
Each Flow Analysis/Decode card can handle 2-5G traffic
Aggregate decode rate of 10G means 5 cards (sized at the 2G worst case)
Add 1 card as a spare for N+1 redundancy
Local Storage: 1TB per card, or use external storage
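The card counts in this design example follow from one rule: divide the load by the per-card capacity, round up, and add a spare. A minimal sketch, with `cards_needed` as a hypothetical helper and the decode tier sized at the low end (2G) of the stated 2-5G range:

```python
import math

def cards_needed(load_gbps, per_card_gbps, spares=1):
    """Return (active, total): cards to carry the load, plus N+1 spares."""
    active = math.ceil(load_gbps / per_card_gbps)
    return active, active + spares

# Packet processing tier: 200G ingress at 40G per card.
pp_active, pp_total = cards_needed(200, 40)

# Flow analysis/decode tier: 10G aggregate at 2G per card (worst case).
fa_active, fa_total = cards_needed(10, 2)

print(pp_active, pp_total)
print(fa_active, fa_total)
```

Both tiers come out at 5 active cards plus 1 spare, matching the figures on the slides. Sizing the decode tier at 5G per card instead would need only 2 active cards, so the per-card decode rate is the assumption that most affects the count.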
25. Ways to deploy DPI on ATCA
1. Adopt 100% off-the-shelf ATCA hardware, and focus on DPI application
2. Adopt ATCA system and develop custom mezzanine or module to implement hardware “secret sauce”
3. Use ATCA switch, x86 cards, but develop full ATCA blade with specific DPI implementation
4. Use ATCA spec but develop all blades
[Diagram labels: Better Time to Market; More R&D Investment; Theory: More Differentiation]
26. CCPU Packet Inspection Platform Capabilities
Packet Inspection at 200+ Gbps per shelf
Mix of dedicated packet processing cards and x86 compute
offers ideal blend of quick time-to-deployment and high capacity
Dedicated security engines allow real-time packet
decryption at line rate
Supports tapped, inline bump-in-wire or
terminated modes
27. Summary
ATCA emerging as the preferred architecture for
next generation DPI deployments at 40G and 100G
Cost effective, scalable, future-proof
28. Thank You
Mike Coward
www.ccpu.com
mike.coward@ccpu.com