
FD.io Vector Packet Processing (VPP)

My presentation on Kernel TLV Meetup,
27.11.2016

  1. FD.IO VECTOR PACKET PROCESSING
     Overview. Kirill Tsym, Next Generation Enforcement team.
     ©2015 Check Point Software Technologies Ltd.
  2. CHECK POINT SOFTWARE TECHNOLOGIES
     • The largest pure-play security vendor in the world
     • Protecting more than 100,000 companies with millions of users worldwide
     • $1.63B annual revenue in 2015
     • Over 4,300 employees
     • Partners in over 95 countries
  3. Lecture agenda
     • Linux networking stack vs. user-space networking initiatives
       – Why user-space networking? Why so many projects around it?
     • Introduction to FD.io and VPP
       – Architecture, vectors, graph, etc.
     • VPP data path
       – Typical graphs
       – Examples of supported topologies
     • VPP threads and scheduling
     • Single- and multi-core support
     • Supported topologies
  4. LINUX KERNEL STACK
  5. Linux kernel data path
     [Diagram: user-space applications on sockets (L5-L7) over the kernel TCP/IP stack (L3-L4), drivers (L1-L2) and NICs, with forwarding, to-application and pass-through application paths]
     • Design goals, or why the stack is in the kernel
       – Linux is designed as an Internet host (RFC 1122), an "end-system" OS
       – Needs to service multiple applications
       – Separates user applications from sensitive kernel code
       – Keeps applications as simple as possible
       – Direct access to HW drivers
     • Cost
       – Not optimized for forwarding
       – Every change requires a new kernel version
       – Code is too generic
       – The networking stack today is a huge part of the kernel
     Reference: Kernel Data Path
  6. Linux stack: the whole picture
     Reference: Network_data_flow_through_kernel
  7. Linux stack packet processing
     • Packets are processed in the kernel one by one
       – A lot of code is involved in processing each packet
       – The processing path is monolithic; it is impossible to change it or load new stack modules
       – Instruction-cache optimization is impossible in this model
       – There are techniques to hijack kernel routines or define hooks, but no simple, standard way to replace tcp_input(), for example
     • skb processing is not cache-optimized
       – The sk_buff struct includes too much information
       – Ideally, all needed sk_buffs would be loaded into cache before processing
       – But an skb neither fits in a cache line nor is laid out in a chain
       – As a result there is no data-cache optimization, and usually a lot of cache misses
     • Every change requires a new kernel version
       – Upstreaming a new protocol takes a very long time
       – Standardization moves much faster than implementation
  8. USER SPACE NETWORKING PROJECTS
  9. Netmap
     [Diagram: application on the netmap API; netmap rings mapped over the NIC rings, bypassing the Linux networking stack]
     • Pros
       – BSD, Linux and Windows ports
       – Good scalability
       – Data path is detached from the host stack
       – Widely adopted
     • Cons
       – No networking stack
       – Routing is done in the host stack, which slows down initial processing
     • Performance (packet forwarding):
         FreeBSD bridging    0.690 Mpps
         netmap + libpcap    7.5   Mpps
         netmap             14.88  Mpps
     Reference: netmap - the fast packet I/O framework
  10. DPDK / forwarding engine
     [Diagram: DPDK fast path in user space between NIC1 and NIC2, with a slow path through the Kernel NIC Interface to the Linux stack for routing decisions]
     • Pros
       – Kernel-independent
       – All packet processing is done in user space
       – The DPDK fast path is cache-optimized and uses a minimum of instructions
     • Cons
       – No networking stack
       – No routing stack: packets must be sent to the kernel for routing decisions
       – Doesn't perform well in scaling tests
       – No external API, no integration with management
       – Out-of-tree drivers
  11. OpenFastPath
     • A BSD networking stack on top of DPDK and ODP
     • OpenDataPlane (ODP) is a cross-platform, open-source data-plane API for networking SoCs
     • Supported by Nokia, ARM, Cavium and ENEA
     • Includes optimized IP, UDP and TCP stacks
     • Routes and MACs are kept in sync with Linux through Netlink
  12. Other projects
     • OpenSwitch
       – An OS whose main component is a DPDK-based Open vSwitch
       – Various management and CLI daemons
       – Routing decisions made by the Linux kernel (ouch!)
       – REST API
       – Good for inter-VM communication
     • OpenOnload
       – A user-level network stack from Solarflare
       – Depends on Solarflare NICs (ouch!)
     • IO Visor
       – XDP, or eXpress Data Path
       – Not user-space networking!
       – Tries to bring performance into the existing kernel with BPF
       – No need for third-party code
       – Allows optional busy polling
       – No need to allocate huge pages or dedicate CPUs
  13. FD.IO
  14. FD.io project overview
     • FD.io is a Linux Foundation project
       – A collection of several projects based on the Data Plane Development Kit (DPDK)
       – Distributed under the Apache license
       – The key project, Vector Packet Processing (VPP), was donated by Cisco
       – A proprietary version of VPP runs in the Cisco CRS-1 router
       – The open-sourced VPP version ships no toolchain, OS, etc.
       – VPP is about 300K lines of code
       – Major contributor: Cisco Chief Technology and Architecture Office team
     • Three main components
       – Management agent
       – Packet processing
       – I/O
     • VPP roadmap
       – The first release, June 16, includes 14 Mpps single-core L3 performance
       – The 16.09 release includes integration with containers and orchestration
       – The 17.01 release will include DPDK 16.11, DPDK CryptoDev, enhanced NAT, etc.
  15. VPP ideas
     • CPU cycle budget: a cache miss is unacceptable
       – 14 Mpps on a 3.5 GHz CPU = a budget of 250 cycles per packet
       – A memory access is ~67 ns, the cost of fetching one cache line (64 bytes), or ~134 CPU cycles
     • Solution
       – Perform all processing with a minimum of code
       – Process more than one packet at a time
       – Grab all available packets from the Rx ring on every cycle
       – Perform each atomic task in a dedicated node
     • VPP optimization techniques
       – Branch-prediction hints
       – Vector instructions (SSE, AVX)
       – Prefetching: do not prefetch too much, to keep the cache warm
       – Speculation about the packet destination instead of a full lookup
       – Dual loops
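The budget arithmetic and the dual-loop idea (do real work on packet i while starting the fetch of packet i+1) can be sketched roughly as follows; this is illustrative Python, not VPP code, and the function names are made up for the sketch:

```python
def cycles_per_packet(clock_hz: float, pps: float) -> float:
    """Cycle budget per packet at a given clock rate and packet rate."""
    return clock_hz / pps

# 14 Mpps on a 3.5 GHz core leaves 250 cycles per packet, so a single
# ~134-cycle cache miss burns over half the budget.
assert cycles_per_packet(3.5e9, 14e6) == 250.0

def process_vector(packets, prefetch, work):
    """Dual-loop shape: kick off the prefetch of packet i+1 while doing
    the actual processing of packet i, hiding memory latency."""
    results = []
    for i, pkt in enumerate(packets):
        if i + 1 < len(packets):
            prefetch(packets[i + 1])   # warm the cache for the next packet
        results.append(work(pkt))      # do the real work on this packet
    return results
```

In real VPP nodes the same shape is written in C with `__builtin_prefetch` and unrolled two (or four) packets per iteration, which is where the name "dual loop" comes from.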
  16. VPP architecture
     [Diagram: VPP with its IP stack and plugins running in user space on top of DPDK, between NIC1/NIC2 and the I/O, L2/L3 and user-defined tasks]
     • Pros
       – Kernel-independent; all packet processing is done in user space
       – DPDK-based (or netmap, virtio, host, etc.)
       – Includes a full-scale L2/L3 networking stack
       – Routing decisions made by VPP; also includes a bridge implementation
       – Good plugin framework
       – Integrated with external management: Honeycomb
     • Cons
       – Young project; first stable release ~06/16
       – Many open areas: OpenStack/Neutron integration, lack of transport-layer integration, control-plane API and stack
     • But what about L4/L7? The TLDK project
  17. Performance
     • VPP data-plane throughput is not impacted by a large IPv4 FIB
     • OVS-DPDK data-plane throughput is heavily impacted by IPv4 FIB size
     • VPP and OVS-DPDK were tested on a Haswell x86 platform with an E5-2698v3 2x16C 2.3 GHz (Ubuntu 14.04 Trusty)
     Reference: FD.io intro
  18. TLDK
     [Diagram: the TLDK application layer over VPP and DPDK, with purpose-built, BSD-socket and LD_PRELOAD applications in user space]
     • TLDK application layer: uses the TLDK library to process TCP and UDP packets
     • Purpose-built application
       – Uses the TLDK API directly (a VPP node)
       – Provides the highest performance
     • BSD socket layer
       – A standard BSD socket layer for applications that use sockets by design
       – Lower performance, but good compatibility
     • LD_PRELOAD socket layer
       – Allows a native binary Linux application to be ported into the system
       – Existing applications work without any change
  19. VPP nodes and graph
     [Diagram: six nodes connected into a graph, with a vector of packets flowing between them]
     • Processing is divided per node
     • A node works on a vector of packets
     • Nodes are connected into a graph
     • The graph can be changed dynamically
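As a rough illustration of nodes passing a vector along a graph, a sketch might look like this; the dispatcher and node bodies here are invented for the example and are not the vlib API:

```python
# Each node takes a vector of packets and returns (next_node_name, vector).
def ethernet_input(vector):
    # strip a pretend L2 header and hand the vector on
    return "ipv4-input", [pkt["payload"] for pkt in vector]

def ipv4_input(vector):
    return "ipv4-lookup", vector

def ipv4_lookup(vector):
    return None, vector          # end of this sketch graph

GRAPH = {
    "ethernet-input": ethernet_input,
    "ipv4-input": ipv4_input,
    "ipv4-lookup": ipv4_lookup,
}

def dispatch(start, vector):
    """Walk the graph: each node processes the whole vector at once,
    which is what keeps the instruction cache warm per node."""
    node = start
    while node is not None:
        node, vector = GRAPH[node](vector)
    return vector
```

The point of the shape: one node's code runs over hundreds of packets before the next node's code is touched, instead of the whole stack running once per packet.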
  20. DATA PATH
  21. Data path: ping
     [Diagram: dpdk-input → ethernet-input → ipv4-input → ipv4-local → ipv4-icmp-input → ipv4-icmp-echo-request → ipv4-rewrite-local → GigabitEthernet-output → GigabitEthernet-tx, all on DPDK core 0]
     • Fully zero-copy: packet data always resides in huge-page memory, where the NIC places it
     • The vector of packet pointers is created by the input node and passed from graph node to node during processing
  22. Vector processing: split example
     [Diagram: input-device → ethernet-input, which splits the input vector into output vector A (ipv4-input) and output vector B (ipv6-input); both paths later reach GigabitEthernet-output/tx]
     • The next node is called twice by the thread scheduler
     • In the transmit queue, packets are reordered
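The split, where ethernet-input sorts one input vector into per-protocol output vectors, can be sketched as follows; the ethertype constants are real, the rest is illustrative:

```python
ETH_P_IP = 0x0800    # IPv4 ethertype
ETH_P_IPV6 = 0x86DD  # IPv6 ethertype

def split_by_ethertype(vector):
    """Classify one input vector into per-next-node output vectors,
    so ipv4-input and ipv6-input each get one batched call."""
    out = {"ipv4-input": [], "ipv6-input": []}
    for pkt in vector:
        if pkt["ethertype"] == ETH_P_IP:
            out["ipv4-input"].append(pkt)
        elif pkt["ethertype"] == ETH_P_IPV6:
            out["ipv6-input"].append(pkt)
    return out
```

Each output vector is then dispatched as its own frame, which is why the slide notes the next node being called twice and packets being reordered relative to arrival.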
  23. Vector processing: cloning example
     [Diagram: dpdk-input → ethernet-input → ipv4-input → ipv4-frag, which roughly doubles the packet count, → GigabitEthernet-output/tx]
     • The maximum vector size is 256
     • If fragmentation makes the output vector overflow, two vectors are created
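The 256-entry cap can be sketched as a simple re-framing step; 256 matches the frame size stated on the slide, while the function itself is an illustration, not vlib code:

```python
MAX_VECTOR = 256  # maximum packets per vector/frame

def reframe(packets):
    """Split an oversized output vector (e.g. after fragmentation
    doubled the packet count) into vectors of at most MAX_VECTOR."""
    return [packets[i:i + MAX_VECTOR]
            for i in range(0, len(packets), MAX_VECTOR)]
```

So a fragmentation node that turns 150 input packets into 300 output packets hands two frames, of 256 and 44 packets, to the next node.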
  24. Rx features example: IPsec flow
     [Diagram: dpdk-input → ethernet-input → ipsec-if-input → esp-decrypt → ipv4-input → ipv4-local on input; ipv4-rewrite-local → esp-encrypt → ipsec-if-output → GigabitEthernet-output/tx on output]
     • The ipsec-if node is dynamically registered to receive IPsec traffic via Rx features when the interface comes up
     • On output this is done through the rewrite adjacency
  25. THREADS AND SCHEDULING
  26. Thread scheduling
     One VPP scheduling cycle:
     • PRE-INPUT: Linux input and system control (e.g. unix_epoll_input, dhcp-client, management stack interface)
     • INPUT: packet input (e.g. dpdk_io_input, dpdk_input, tuntap_rx)
     • INTERRUPTS: run suspended processes (e.g. expired timers)
     • PENDING NODES DISPATCH: process all vectors that need additional processing after changes (e.g. worker thread main)
     • INTERNAL NODES DISPATCH: process all pending vectors on the VPP graph; the main work of L2/L3 stack processing and Tx (e.g. worker thread main)
  27. Threads zoom-in
     vpp# show run
     Time 9.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
       vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
     Name                            State           Calls  Vectors  Suspends  Clocks  Vectors/Call
     admin-up-down-process           event wait          0        0         1  6.52e3          0.00
     api-rx-from-ring                active              0        0         6  1.04e5          0.00
     cdp-process                     any wait            0        0         1  1.10e5          0.00
     cnat-db-scanner                 any wait            0        0         1  5.34e3          0.00
     dhcp-client-process             any wait            0        0         1  6.58e3          0.00
     dpdk-process                    any wait            0        0         3  2.73e6          0.00
     flow-report-process             any wait            0        0         1  6.19e3          0.00
     gmon-process                    time wait           0        0         2  5.36e8          0.00
     ip6-icmp-neighbor-discovery-ev  any wait            0        0        10  1.81e4          0.00
     startup-config-process          done                1        0         1  2.64e5          0.00
     unix-cli-stdin                  event wait          0        0         1  3.05e9          0.00
     unix-epoll-input                polling      24811921        0         0  9.48e2          0.00
     vhost-user-process              any wait            0        0         1  3.24e4          0.00
     vpe-link-state-process          event wait          0        0         1  7.10e3          0.00
     vpe-oam-process                 any wait            0        0         5  1.37e4          0.00
     vpe-route-resolver-process      any wait            0        0         1  9.52e3          0.00
     vpp# exit
     # ps -elf | grep vpp
     4 R root 20566     1 92 80 0 - 535432 -      16:10 ?      00:00:27 vpp -c /etc/vpp/startup.conf
     0 S root 20582  1960  0 80 0 -   4293 pipe_w 16:10 pts/34 00:00:00 grep --color=auto vpp
     • Note: on this idle system only unix-epoll-input is actively polling; everything else is event-, time- or any-waiting, yet the vpp process still shows 92% CPU in ps because of the poll loop
  28. SINGLE AND MULTICORE MODES
  29. VPP threading modes
     [Diagram: core layouts for each mode, showing where control, I/O and worker threads sit relative to Rx and Tx]
     • Single-threaded
       – Both control and the forwarding engine run on a single thread
     • Multi-threaded with workers only
       – Control runs on the main thread (API, CLI)
       – Forwarding is performed by one or more worker threads
     • Multi-threaded with I/O and workers
       – Control on the main thread (API, CLI)
       – An I/O thread handles input and dispatches to worker threads
       – Worker threads do the actual work, including interface Tx
       – RSS is in use
     • Multi-threaded with main and I/O on a single thread
       – Workers separated by core
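The I/O-to-worker dispatch in the third mode spreads flows across workers RSS-style. A minimal sketch, assuming a simple flow-tuple hash (the real RSS hash is a Toeplitz hash computed by the NIC, and these field names are invented):

```python
def worker_for_flow(src, dst, sport, dport, n_workers):
    """Pick a worker by hashing the flow tuple, so every packet of one
    flow lands on the same worker and stays in order."""
    return hash((src, dst, sport, dport)) % n_workers

def dispatch_to_workers(vector, n_workers):
    """Split one input vector into per-worker vectors."""
    queues = [[] for _ in range(n_workers)]
    for pkt in vector:
        w = worker_for_flow(pkt["src"], pkt["dst"],
                            pkt["sport"], pkt["dport"], n_workers)
        queues[w].append(pkt)
    return queues
```

Keeping a flow pinned to one worker is what makes the mode scale without per-packet locking or cross-core reordering within a flow.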
  30. SUPPORTED TOPOLOGIES
  31. Router and switch for namespaces
     Reference
  32. QUESTIONS?
  33. VPP capabilities
     • Why VPP?
       – The Linux kernel is good, but moves too slowly because of backward compatibility
       – Standardization today moves faster than implementations
       – The main reason for VPP's speed is optimal use of the I-cache: it does not trash the cache with packet-by-packet processing like the standard IP stack
       – Separation of data plane and control plane; VPP is a pure data plane
     • Main ideas
       – Separation of data plane and control plane
       – API generation, with bindings available for Java, C and Python
       – OpenStack integration: a Neutron ML2 driver
       – OPNFV / ODL-GBP / ODL-SFC (service chaining, e.g. firewalls, NAT, QoS)
     • Containers
       – VPP can run in the host, connecting containers
       – Or VPP can run inside containers, which then talk to each other
  34. Connections between the various layers
     [Diagram: dpdk-input → ethernet-input → ip-input → udp-local, with a plugin node attached via callbacks and data]
     • ip4_register_protocol(): registers a node for an IP protocol number, e.g. UDP
     • ethernet_register_input_type(): registers a node for an ethertype, e.g. IPv4
     • vnet_hw_interface_rx_redirect_to_node(): defined in plugin code, redirects an interface's Rx to a node
     • The next node is hardcoded in dpdk-input/handoff-dispatch
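The registration idea, a protocol number mapped to the node that should receive it, as ip4_register_protocol() does for UDP, can be sketched like this; the table and function bodies are illustrative, not the vnet API:

```python
IP_PROTO_UDP = 17  # IANA protocol number for UDP

# protocol number -> name of the node that should receive those packets
ip4_protocol_nodes = {}

def ip4_register_protocol(protocol, node_name):
    """Let a node (e.g. one added by a plugin) claim an IP protocol."""
    ip4_protocol_nodes[protocol] = node_name

def ip4_local_next(packet):
    """After ipv4-local, pick the next node from the registry;
    unclaimed protocols fall through to a drop node."""
    return ip4_protocol_nodes.get(packet["protocol"], "ip4-drop")

ip4_register_protocol(IP_PROTO_UDP, "udp-local")
```

This is what lets plugins splice themselves into the graph at run time instead of patching a monolithic path, the kernel-stack limitation called out earlier in the talk.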
  35. Output attachment point
     [Diagram: ipv4-input → ipv4-lookup → ipv4-rewrite-transit, with L3 nodes, various L4 nodes and various post-routing nodes attached]
     • VPP adjacency: a mechanism to add and rewrite the next node dynamically after the routing lookup
       – Available nodes: miss, drop, punt, local, rewrite, classify, map, map_t, sixrd, hop_by_hop
       – A possible place for a POSTROUTING hook
     • VPP Rx features: a mechanism to add and rewrite the next node dynamically after ipv4-input
       – Available nodes: input acl (a PREROUTING analogue), source check rx, source check any, ipsec, vpath, lookup
       – Currently impossible to do from plugins
