  • 1. The Impact of Optimized Packet Processing Software on Multicore Platforms for DPI and Network Security
  • 2. Agenda. Optimizing the Hardware: Paul Stevens, Advantech (paul.stevens@advantech.com, www.advantech.com/nc). Optimizing the Software: Eric Carmes, 6WIND (eric.carmes@6wind.com, www.6wind.com).
  • 3. Multicore Network Platform Design Goals
  • 4. Meeting OEM Requirements
      • Need a clear path to sustainable business growth through differentiated products and services: preserve existing investments while meeting new performance requirements; reduce time to revenue to beat the competition.
      • Need to deploy a flexible architecture and a scalable technology: develop a range of products with a limited number of technologies; ensure hardware independence.
      • Need to meet dynamic market requirements: manage performance growth; reduce cost and power consumption.
      • Must ship a working product on time: integrate and validate new and complex technologies faster.
  • 5. Anatomy of a Network Appliance (today)
      • [Diagram: three appliance classes by throughput: SMB (1-10 Gbps), Enterprise (10-80 Gbps) and Data Center (>80 Gbps). Labeled building blocks: IA chipset (e.g., Intel® Core™ i7 processor or Intel® Atom), Intel® Xeon® processor 5600 series + I/O Hub, NPU, switch, GbE and 10 GbE ports, PCIe x1 and PCIe x8 links, XAUI links, Control Plane Processing and Data Plane Processing.]
      • Variety of security and encryption coprocessor options.
  • 6. Translating to a Scalable Blade Topology (today)
      • Switch connect: IA packet processing and load balancing to the IA Node payloads. [Diagram labels: 40G switch, IA Nodes, 2x10G links, Dual Star 10G.]
      • NPU + Switch connect: the NPU does front-end packet processing and load balancing to the IA Node payloads. [Diagram labels: NPU + 40G switch hub, IA Nodes, 2x10G links, Dual Star 10G.]
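The slide does not say how the front end chooses an IA Node for each packet. One common approach, shown here only as a minimal sketch, is to hash the flow's 5-tuple so that every packet of a flow, in both directions, reaches the same node. The function name `pick_node`, the node labels and the use of CRC32 are illustrative assumptions, not part of the presented products.

```python
import zlib
from typing import Sequence


def pick_node(src: str, sport: int, dst: str, dport: int, proto: int,
              nodes: Sequence[str]) -> str:
    """Map a flow to one node; both directions of the flow map to the same node.

    Hypothetical helper: hash-based selection is an assumed technique.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    # Sort the two endpoints so that A->B and B->A produce the same key.
    first, second = sorted([(src, sport), (dst, dport)])
    key = f"{first[0]}:{first[1]}|{second[0]}:{second[1]}|{proto}".encode()
    # CRC32 is deterministic across runs, unlike Python's built-in hash().
    return nodes[zlib.crc32(key) % len(nodes)]


if __name__ == "__main__":
    ia_nodes = ["ia-node-0", "ia-node-1", "ia-node-2"]  # assumed node names
    print(pick_node("10.0.0.1", 40000, "192.0.2.7", 80, 6, ia_nodes))
```

Keeping both directions of a flow on one node matters for DPI because the inspecting application usually needs to see the request and the response together. A plain modulo mapping reshuffles most flows when a node is added or removed, so a production design would likely use something more stable.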
  • 7. Performance Scaling to full 40G Interconnects
      • Fast path packet processing and load balancing to the IA Node payloads.
      • [Diagram labels: 100G+ switches, NPUs, IA Nodes, 2x 20G links (Dual Star 20G), 2x 40G links (Dual Star 40G).]
  • 8. High-end DPI Example
      • [Diagram labels: primary and secondary hub blades (THUB2 40GE hub with switch and management, LMP); general purpose CPU blade (high-level flow pro. and DPI; ATCA-7410, next-gen CPU, dual Xeon); NPU blade (low-level flow processing; dual NPU); GbE, xGE and 40GE switches and MACs; ShMC.]
      • 40GE switching with rule-based load balancing; additional switching capacity using dual dual star.
      • 6WINDGate slow path / fast path partitioning across IA/NPU.
      • GbE: used as Base Interface for management and control plane. Dual star topology.
      • Primary 40GE: used as fabric interface for data and user plane. Dual star topology.
      • Secondary 40GE: fabric used as fabric interface for data and user plane. Dual star topology.
      • IPMB: low-level management interface based on 2 redundant IPMB buses. Bussed or radial (star) topology.
  • 9. Creating a Virtuous Cycle with Multicore for cost-optimized DPI. [Cycle diagram: More Cores; Higher Throughput & Capacity; New Technology Introduction.]
  • 10. 80G Packetarium™: “ATCA rewrapped”. Shrink & cost-down for non-HA DPI
      • Packetarium is a cost-optimized, modular system architecture for multicore packet processing. Scalable and upgradable to meet bandwidth demand, it’s also a cost-effective alternative to ATCA.
      • 1, 2 or 4 x 10GbE (XAUI) per board; 8 boards per system.
      • Trade-off on availability (system level).
      • QorIQ up to 128 cores; MIPS64 up to 256 cores; TI DSP up to 480 cores; x86 in design.
      • The all-IP design simplifies customization, and the identical system management design preserves ATCA S/W investment.
      • The mainboard’s topology is similar to ATCA: backplane + switch with transition modules + chassis management modules.
      • Each network processing board connects to the mainboard’s switch via 2 or 4 x 10GE (XAUI).
  • 11. Scalable Hardware Platforms for DPI
      • Processor-independent.
      • Main architectures supported today.
      • More to follow.
      • [Diagram labels: x1, x2, x8; 32 cores, 64 cores, 256 cores, >256 cores.]
  • 12. Challenges for DPI Software
      • Unprecedented performance stress on network equipment (cloud and mobile infrastructure): 40G throughput now, with 100G on the horizon.
      • Complex networking protocols.
      • Accurate user packet identification and QoS classification.
      • Efficient packet steering decisions for optimized application-level processing.
      • Advanced content inspection functions: application-aware firewall, video compression.
  • 13. Introducing the 6WIND Solution
      • High-performance packet processing engine.
      • Optimized for DPI acceleration and protocol termination.
      • Includes a comprehensive set of networking protocols with High Availability support.
      • Fast path architecture maximizes system throughput.
      • Used by tier-1 OEMs worldwide.
      • [Diagram labels, top to bottom: DPI Application Processing; Linux; …; Multicore Processor; Advantech Platform.]
  • 14. Packet Detection Challenges
      • Wire-speed performance.
      • Packets may be fragmented and need reconstruction.
      • Packets are always hidden by a combination of encapsulation techniques: VLAN, GTP, IP in IP, GRE, L2TP, MPLS…
      • Packets are often encrypted (IPsec).
      • Integrated firewall required.
      • Latency for each packet has to be minimized.
      • Solution requires high-performance packet processing for packet identification, classification, steering and termination.
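To make the encapsulation point on the Packet Detection Challenges slide concrete, the following is a minimal sketch of peeling outer headers to locate the innermost IPv4 header. It is an illustration under stated assumptions, not the 6WINDGate implementation: the function name `find_inner_ipv4` is hypothetical, only VLAN tags, IPv4-in-IPv4 and GRE carrying IPv4 are handled, and fragment reassembly, GTP, L2TP, MPLS and IPsec are left out.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_VLAN = {0x8100, 0x88A8}   # 802.1Q and 802.1ad tags
IPPROTO_IPIP = 4                    # IPv4-in-IPv4
IPPROTO_GRE = 47


def find_inner_ipv4(frame: bytes):
    """Return (offset, layers) for the innermost IPv4 header, or None.

    Hypothetical helper: returns None for non-IPv4 or truncated frames.
    """
    if len(frame) < 14:
        return None
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    offset, layers = 14, ["eth"]

    # Skip any stacked VLAN tags (4 bytes each).
    while ethertype in ETHERTYPE_VLAN:
        if len(frame) < offset + 4:
            return None
        (ethertype,) = struct.unpack_from("!H", frame, offset + 2)
        offset += 4
        layers.append("vlan")
    if ethertype != ETHERTYPE_IPV4:
        return None

    while True:
        if len(frame) < offset + 20 or frame[offset] >> 4 != 4:
            return None
        ihl = (frame[offset] & 0x0F) * 4          # header length in bytes
        if ihl < 20 or len(frame) < offset + ihl:
            return None
        layers.append("ipv4")
        proto = frame[offset + 9]
        nxt = offset + ihl
        if proto == IPPROTO_GRE:
            if len(frame) < nxt + 4:
                return offset, layers
            flags, gre_proto = struct.unpack_from("!HH", frame, nxt)
            if gre_proto != ETHERTYPE_IPV4:
                return offset, layers
            # Optional checksum, key and sequence fields are 4 bytes each.
            optional = sum(bool(flags & bit) for bit in (0x8000, 0x2000, 0x1000))
            nxt += 4 + 4 * optional
            layers.append("gre")
        elif proto != IPPROTO_IPIP:
            return offset, layers                 # innermost header reached
        offset = nxt
```

Called on an Ethernet frame, the function returns the byte offset at which flow identification can read the user's addresses, plus the list of layers it crossed. A wire-speed fast path would do this on preallocated buffers in C rather than on Python `bytes`, so the sketch only shows the control flow.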
  • 15. Flexible Mapping to Cores, Processors and Blades
      • Dynamically allocate functions across processor cores.
      • Transparent scaling across homogeneous or heterogeneous blades.
      • [Diagram labels: Linux Cores; Application Processing; DPI; Packet Processing; Control Plane; Networking Stack; Data Plane; Fast Path; Fast Path Cores.]
  • 16. Includes a Full Set of Networking Protocols
      • Control Plane Modules
          • Routing protocols: static, RIP (IPv4, IPv6), RIPng, OSPFv2, OSPFv3, BGP-4, BGP-4+, ECMP (IPv4, IPv6), VRRP, PIMv4-SM, PIMv6-SM, IGMP/MLD snooping & proxy, static route monitoring & BFD.
          • Security: IKE, IKEv2, EAP, VPN.
          • Connectivity: PPP, Multi-link PPP, PPPoE, CHDLC, VLAN, GRE, 6in6, 4in4, L2TP, DHCPv4/v6, DNS proxy, RADIUS client.
          • Layer 2 switching: RSTP, LACP.
          • Mobility: home agent, FMIP, corresponding node, mobile node, IPsec integration, NEMO, proxy MIP.
          • Virtual Routing (VRF): routing protocols, IKE and PPP / L2TP.
          • High Availability: monitoring system, synchronization daemons for ARP-NDP, routing and IPsec.
      • Networking Stack: optimized stack for multicore, including:
          • All Linux networking features (TCP/IP, filtering, NAT, IPsec…).
          • Optimized SMP, 2K VR for forwarding, firewalling, NAT and IPsec.
          • Integrated crypto engine management for IPsec and SSL.
          • VNB framework for fast Layer 2 through Layer 4 protocol integration.
          • Network system calls optimization (UDP, SCTP, RAW).
          • Graceful Restart extensions for High Availability.
      • Fast Path Modules: IPv4-v6 forwarding; IPsec, IPsec SVTI; ROHC; VLAN, GRE; flow inspection monitoring; link aggregation; QoS; multicast; IPv4-v6 reassembly; GTP-u encapsulation; SCTP; TCP termination; IPv4-v6 filtering; NAT; MPLS encapsulation; IPv6 tunneling and transition; Extended Fast Path.
  • 17. 6WINDGate in DPI
      • DPI Application Processing: application flow identification and analysis; policy enforcement, video compression, security etc.
      • [Diagram: 40G / 100G traffic passes through Packet Processing: decryption, flow identification against the flow table, apply policy (QoS), encryption. Unknown flows or flows to be monitored, and flows to be processed by the application, go up through protocol termination (TCP, HTTP etc.) and the 6WINDGate APIs to DPI Application Processing, which updates the flow and flow events. Flows with no application processing stay in Packet Processing.]
      • Architecture optimized for managing very large flow tables (millions of flows).
      • Efficient APIs maximize system throughput (packet cloning, zero-copy architecture etc.).
      • Scalable architecture for simultaneous support of multiple application instances.
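The split described on the 6WINDGate in DPI slide, where unknown flows go up to the DPI application and flows with a known policy are handled without it, can be sketched with a small in-memory flow table. All names here (`FlowKey`, `Action`, `FlowTable`, the `classify` callback) are hypothetical and chosen for illustration; they are not the 6WINDGate APIs, and a real flow table holding millions of flows would need a very different data structure.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, NamedTuple, Optional


class FlowKey(NamedTuple):
    src: str
    dst: str
    sport: int
    dport: int
    proto: int


class Action(Enum):
    INSPECT = "inspect"   # packet is handed to the DPI application
    FORWARD = "forward"   # no application processing needed
    DROP = "drop"


@dataclass
class FlowEntry:
    action: Action = Action.INSPECT
    packets: int = 0


class FlowTable:
    """Toy flow table: the application sees a flow only until it decides a policy."""

    def __init__(self, classify: Callable[[FlowKey, bytes], Optional[Action]]):
        self.flows: Dict[FlowKey, FlowEntry] = {}
        self._classify = classify   # stands in for the DPI application

    def handle(self, key: FlowKey, payload: bytes) -> Action:
        """Return how this packet was treated."""
        entry = self.flows.setdefault(key, FlowEntry())
        entry.packets += 1
        if entry.action is not Action.INSPECT:
            return entry.action                 # known flow: stay in packet processing
        verdict = self._classify(key, payload)  # unknown flow: ask the application
        if verdict is not None:
            entry.action = verdict              # update the flow for later packets
        return Action.INSPECT
```

In this sketch the application is consulted only while a flow is undecided; once it returns a verdict, later packets of that flow never leave the packet-processing path, which is the property the slide's "No application processing" branch describes.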
  • 18. Example: Mobile Video Compression
      • Detection of flows that could include video.
      • Detection of events to locate video in flow.
      • [Diagram: 40G / 100G traffic passes through Packet Processing: decryption, GTP flow identification against the flow table, apply policy (QoS), encryption. Unknown flows or flow events, and flows with video, go up through HTTP termination and the 6WINDGate APIs to the DPI video compression application, which updates the flow and flow events and returns the compressed video flow. Flows without video stay in Packet Processing.]
  • 19. Example: Application-Aware Firewall + UTM
      • Detection of flows that could contain viruses.
      • [Diagram: 40G / 100G traffic passes through Packet Processing: decryption, L2 / L3 flow identification against the flow table, firewall, apply policy (QoS), encryption. Unknown flows or flows to be monitored, and flows to be scanned, go up through a transparent proxy (TCP, UDP etc.) and the 6WINDGate APIs to the DPI anti-virus UTM application, which updates the flow and flow events and returns the scanned flow.]
  • 20. Summary 6WIND-Advantech solution addresses critical requirements for DPI equipment:  Wire-speed DPI DPI Application Processing  Comprehensive protocol support for advanced services  Fast path environment optimized for acceleration of DPI and application Linux processing.  Zero downtime reliability via integrated High Availability support …….  Portable solution available on industry- Multicore Processor leading processor platforms  Deployed today in cloud infrastructure and mobile networks. Advantech Platform