In-Network Acceleration with FPGA (MEMO)

05 Feb, 2013
SAKURA Internet Research Center
Senior Researcher / Naoto MATSUMOTO

Published in: Technology

In-Network Acceleration with FPGA (MEMO)

  1. 18 Feb, 2013. SAKURA Internet Research Center, Senior Researcher / Naoto MATSUMOTO
  2. Hardware Acceleration Overview. Source: (c) 2012 Enyx
  3. In-Network Acceleration Model. Source: (c) 2012 Enyx
  4. Hardware Acceleration Limitation: supports up to 16 TCP sessions (ONLY). Source: © Copyright 2012 PLDA All Rights Reserved
  5. Altera 5SGXEA7N2F45C2N / Arista 7124FX
     Altera 5SGXEA7N2F45C2N:
     - 622,000 Logic Elements
     - 234,750 Adaptive Logic Modules (ALMs)
     - 642,000 Registers
     - 57 Mb internal memory
     - 512 18×18 hardware multipliers
     - 256 27×27 Digital Signal Processing blocks
     - 28 PLLs for digital clock synthesis
     - 2 x 4 GB DDR3-1333 ECC memory @ 600 MHz
     - 3 x 72 Mbit QDR II+ SRAM memory @ 333 MHz
     - 16+8 SFP+ ports for 1 Gb/s and 10 Gb/s Ethernet fiber or copper applications
     - PCI Express Generation 2.0 x4 interface
     Copyright © 2013 Arista Networks, Inc. All rights reserved. Source: (c) 2012 Enyx
  6. Comparison of FPGA SPECs
     Arista 7124FX / Enyx FPB1 (NIC) / SolarFlare AOE (NIC): Altera Stratix V 5SGXEA7N2F45C2N
     - 622,000 Logic Elements
     - 234,750 Adaptive Logic Modules (ALMs)
     - 642,000 Registers
     - 57 Mb internal memory
     - 512 18×18 hardware multipliers
     - 256 27×27 Digital Signal Processing blocks
     - 28 PLLs for digital clock synthesis
     Accelize XpressGX4LP (NIC): Altera EP4SGX530KF40C2N
     - 531,200 Logic Elements
     - 212,480 Adaptive Logic Modules (ALMs)
     - 424,960 Registers
     - 2.6 Mb internal memory
     - 1024 18×18 hardware multipliers
     Source: (c) 2012 Enyx
  7. Enabling ToDo
     1) Use the pre-installed kernel modules for the Mellanox 40GbE-NIC (mlx4_core, mlx4_en)
     2) Load the 40GbE-NIC kernel module at boot via /etc/modules
     $ show version
     Version: VC6.5R1
     Description: Vyatta Core 6.5 R1
     $ sudo vi /etc/modules
     mlx4_en
     $ sync; sync; sync; reboot
     © 2013 Mellanox Technologies. All Rights Reserved.
  8. 40GbE-NIC Status Check
     $ show interfaces ethernet eth1 physical
     Settings for eth1:
             Supported ports: [ TP ]
             :
             Speed: 40000Mb/s
             Duplex: Full
             Port: Twisted Pair
             :
             Link detected: yes
     driver: mlx4_en
     version: 2.0 (Dec 2011)
     firmware-version: 2.10.800
     bus-info: 0000:01:00.0
  9. HighGig DATA Transfer Benchmark
     Results: 0.87 Gbit/sec*, 5.58 Gbit/sec*, 8.00 Gbit/sec*, 13.68 Gbit/sec*, 18.23 Gbit/sec**
     [System: Intel® Core™ i7-3930K CPU @ 3.20GHz / 32GB DDR3-DIMM / Linux 3.7-rc7 / Mellanox ConnectX3 40GbE-NIC]
     [Benchmark Tool: wget+thttpd+tmpfs*, rcopy+tmpfs**]
     SOURCE: SAKURA Internet Research Center. 12/2012 rev2 Project THORN.
  10. Application Bottleneck in OS (BAD KnowHow)
      Results: 5.28 Gbit/sec*, 5.17 Gbit/sec*, 5.37 Gbit/sec*, 18.24 Gbit/sec**
      Chart label: Application Bottleneck
      [System: Intel® Core™ i7-3930K CPU @ 3.20GHz / 32GB DDR3-DIMM / Linux 3.7-rc7 / Mellanox ConnectX3 40GbE-NIC]
      [Benchmark Tool: nc+dd+tmpfs*, rcopy+tmpfs**]
      SOURCE: SAKURA Internet Research Center. 12/2012 rev2 Project THORN.
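The gap between the nc+dd and rcopy results on slide 10 can be put in numbers. The following is a minimal sketch, not part of the original deck: it uses only the figures printed on that slide and assumes the 40000Mb/s link speed reported on slide 8. All names in the code are illustrative.

```python
# Throughput figures as printed on slide 10 (Gbit/sec).
NC_DD_GBPS = [5.28, 5.17, 5.37]   # nc+dd+tmpfs runs (marked * on the slide)
RCOPY_GBPS = 18.24                # rcopy+tmpfs run (marked ** on the slide)
LINK_GBPS = 40.0                  # assumption: link speed reported on slide 8


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def utilization(throughput_gbps, link_gbps=LINK_GBPS):
    """Fraction of the link capacity that a given throughput represents."""
    return throughput_gbps / link_gbps


nc_dd_mean = mean(NC_DD_GBPS)
speedup = RCOPY_GBPS / nc_dd_mean

if __name__ == "__main__":
    print(f"nc+dd mean: {nc_dd_mean:.2f} Gbit/sec "
          f"({utilization(nc_dd_mean):.1%} of the link)")
    print(f"rcopy:      {RCOPY_GBPS:.2f} Gbit/sec "
          f"({utilization(RCOPY_GBPS):.1%} of the link)")
    print(f"rcopy is {speedup:.2f}x the nc+dd mean")
```

Neither tool comes close to filling the 40GbE link on this system, which is consistent with the slide's point that the bottleneck sits in the application path rather than in the NIC.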
  11. DPDK TESTING Overview
      1) The Intel® DPDK source code for Linux was released at the end of 2012.
         http://www.intel.com/p/en_US/embedded/hwsw/technology/packet-processing
      Running Intel® DPDK Applications in a Linux Environment:
      To run an Intel® DPDK application, some customization must be done on the target machine. Running an Intel® DPDK application requires some kernel configuration customization (done at build time) and some dynamic kernel tweaks (modules, procfs).
      Required:
      • glibc >= 2.7 (for features related to cpuset)
      ..etc
      Hardware: Intel® 10Gbps Dual-port Network Adapter
      Linux DPDK Layer3 Router is Evolutionary Network Technology.
      Source: SAKURA Internet Research Center. 11/2012: Project THORN
  12. DPDK Layer 3 Fwd Benchmark
      [Layer 3 Forwarder with Intel® DPDK]
      Intel® Core™ i7-3960X CPU @ 3.30GHz
      Intel 82599EB 10GbE-NIC / PCI Express 3.0
      Linux 2.6.32-220.23.1.el6.x86_64
      # ./build/l3fwd -c 0x3 -n 2 -- -p 0x3 --config="(0,0,0),(1,0,1)"
      :
      done: Port 0 Link Up - speed 10000 Mbps - full-duplex
      done: Port 1 Link Up - speed 10000 Mbps - full-duplex
      L3FWD: entering main loop on lcore 1
      L3FWD: -- lcoreid=1 portid=1 rxqueueid=0
      :
      Diagram label: VXLAN Network. Traffic: 64-byte short packets (MTU64Byte Short Pkt.)
      [Traffic Generator]
      Intel® Core™ i7-3930K CPU @ 3.20GHz
      Intel 82599EB 10GbE-NIC / PCI Express 2.0
      10.0.0.11 / 00:0C:BD:00:E8:1B
      # pkt-gen -i ix1 -f tx -l 64 -d 10.0.0.22
      main [1042] map size is 207712 Kb
      main [1064] mmapping 207712 Kbytes
      main [1119] Ready...
      sender_body [607] start
      sender_body [644] drop copy
      main [1231] 14115785 pps
      main [1231] 14118009 pps
      : [14.1Mpps]
      [Packet Receiver]
      AMD E-350 1.76GHz / DDR3 8GB
      Intel 82599EB 10GbE-NIC / PCI Express 2.0
      10.0.0.22 / 90:E2:BA:23:02:9D
      # pkt-gen -i ix1 -f rx
      main [1071] map size is 207712 Kb
      main [1093] mmapping 207712 Kbytes
      main [1146] Wait 2 secs for phy reset
      main [1148] Ready...
      main [1257] 1206448 pps
      main [1257] 13602560 pps
      main [1257] 13573141 pps
      : [13.5Mpps]
      Source: SAKURA Internet Research Center. 11/2012: Project THORN
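To judge the 14.1 Mpps and 13.5 Mpps readings on slide 12, it helps to compare them with the theoretical packet rate of a 10GbE link. The sketch below is an illustration rather than part of the original deck. It relies on the standard Ethernet per-frame overhead (8 bytes of preamble and start-of-frame delimiter plus a 12-byte inter-frame gap). Whether the 64 bytes passed to pkt-gen include the 4-byte CRC is an assumption the slide does not settle, so both cases are computed. Function and variable names are hypothetical.

```python
# Ethernet per-frame overhead on the wire, in bytes.
PREAMBLE_SFD = 8      # 7-byte preamble + 1-byte start-of-frame delimiter
INTERFRAME_GAP = 12   # minimum idle time between frames, expressed in bytes
CRC = 4               # frame check sequence

LINK_BPS = 10_000_000_000   # 10GbE, the link speed reported by l3fwd on the slide

# Last readings printed on the slide (packets per second).
MEASURED_TX_PPS = 14_118_009   # traffic generator
MEASURED_RX_PPS = 13_573_141   # packet receiver


def line_rate_pps(link_bps, frame_bytes):
    """Theoretical maximum frames per second for a frame size that includes the CRC."""
    bits_on_wire = (frame_bytes + PREAMBLE_SFD + INTERFRAME_GAP) * 8
    return link_bps / bits_on_wire


# Case A (assumption): 64 bytes is the whole frame, CRC included.
ceiling_crc_included = line_rate_pps(LINK_BPS, 64)
# Case B (assumption): 64 bytes excludes the CRC, so the wire frame is 68 bytes.
ceiling_crc_excluded = line_rate_pps(LINK_BPS, 64 + CRC)

if __name__ == "__main__":
    for label, ceiling in (("CRC included", ceiling_crc_included),
                           ("CRC excluded", ceiling_crc_excluded)):
        print(f"{label}: ceiling {ceiling / 1e6:.2f} Mpps, "
              f"tx {MEASURED_TX_PPS / ceiling:.1%}, rx {MEASURED_RX_PPS / ceiling:.1%}")
```

Under either assumption the forwarder delivers more than 90% of the theoretical packet rate to the receiver, so the slide's figures are close to line rate for minimum-size packets.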
  13. DPDK Layer 3 Fwd perf stat
      [Layer 3 Forwarder with Intel® DPDK]
      Intel(R) Core(TM) i7-3960X CPU @ 3.30GHz
      Intel 82599EB 10GbE-NIC / PCI Express 3.0
      Linux 2.6.32-220.23.1.el6.x86_64
      # perf stat ./build/l3fwd -c 0x3 -n 2 -- -p 0x3 --config="(0,0,0),(1,0,1)"
      :
      Performance counter stats for ./build/l3fwd -c 0x3 -n 2 -- -p 0x3 --config=(0,0,0),(1,0,1):
           92805.936402 task-clock               #    1.853 CPUs utilized
                    133 context-switches         #    0.000 M/sec
                     13 CPU-migrations           #    0.000 M/sec
                  1,958 page-faults              #    0.000 M/sec
        370,566,087,852 cycles                   #    3.993 GHz                      [83.33%]
        102,860,504,930 stalled-cycles-frontend  #   27.76% frontend cycles idle     [83.33%]
         32,572,874,185 stalled-cycles-backend   #    8.79% backend cycles idle      [66.67%]
        663,418,320,041 instructions             #    1.79  insns per cycle
                                                 #    0.16  stalled cycles per insn  [83.33%]
        106,088,555,938 branches                 # 1143.123 M/sec                    [83.33%]
             63,608,468 branch-misses            #    0.06% of all branches          [83.33%]
           50.077399637 seconds time elapsed
      Diagram labels: VXLAN Network, [Traffic Generator], 64-byte short packets (MTU64Byte Short Pkt.), [Packet Receiver]
      Source: SAKURA Internet Research Center. 11/2012: Project THORN
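The right-hand column of the perf stat output on slide 13 is derived from the raw counters, and recomputing it shows how each figure is obtained. This is a minimal sketch using only the numbers printed on the slide. The formula for "stalled cycles per insn" (the larger of the two stall counters divided by instructions) is an assumption about how perf derives that column, and all names are illustrative.

```python
# Raw counters as printed by perf stat on slide 13.
TASK_CLOCK_MS = 92805.936402
CYCLES = 370_566_087_852
STALLED_FRONTEND = 102_860_504_930
STALLED_BACKEND = 32_572_874_185
INSTRUCTIONS = 663_418_320_041
BRANCHES = 106_088_555_938
BRANCH_MISSES = 63_608_468
ELAPSED_S = 50.077399637


def derived_metrics():
    """Recompute the derived columns of the perf stat report from the raw counters."""
    task_clock_s = TASK_CLOCK_MS / 1000
    return {
        "cpus_utilized": task_clock_s / ELAPSED_S,
        "ghz": CYCLES / task_clock_s / 1e9,
        "ipc": INSTRUCTIONS / CYCLES,
        "frontend_idle": STALLED_FRONTEND / CYCLES,
        "backend_idle": STALLED_BACKEND / CYCLES,
        # Assumption: perf reports the larger stall counter per instruction.
        "stalled_per_insn": max(STALLED_FRONTEND, STALLED_BACKEND) / INSTRUCTIONS,
        "branch_miss_rate": BRANCH_MISSES / BRANCHES,
        "branches_m_per_sec": BRANCHES / task_clock_s / 1e6,
    }


if __name__ == "__main__":
    for name, value in derived_metrics().items():
        print(f"{name:>20}: {value:.4f}")
```

The recomputed values agree with the slide to the printed precision. A CPU utilization close to 2 fits the `-c 0x3` core mask, which enables two lcores that poll continuously.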
  14. Thanks for your interest. SAKURA Internet Research Center.
