Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Kernel Recipes 2017 - EBPF and XDP - Eric Leblond

2,001 views

Published on


Berkeley Packet Filter is an old friend for most people that deal with network under Linux. But its extended version eBPF is completely redefining the scope of usage and interaction with the kernel. It can indeed be used to instrument most parts of the kernel. This goes from network tracing to process or I/O monitoring.

This talk will provide an overview of eBPF, from concept to tools like BCC. It will then focus on XDP for eXtreme Data Path and the possible applications in term of networking provided by this new framework.

Eric Leblond, Stamus Network

Published in: Technology
  • Be the first to comment

Kernel Recipes 2017 - EBPF and XDP - Eric Leblond

  1. 1. eBPF and XDP seen from the eyes of a meerkat É. Leblond Stamus Networks September 29, 2017 É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 1 / 42
  2. 2. About me Eric Leblond a.k.a Regit Network security expert Netfilter core team: Maintainer of ulogd2: Netfilter logging daemon Suricata developer: In charge of packet acquisition co-founder of Stamus Networks, a company providing Suricata based network probe appliances. É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 2 / 42
  3. 3. About me Eric Leblond a.k.a Regit Network security expert Netfilter core team: Maintainer of ulogd2: Netfilter logging daemon Suricata developer: In charge of packet acquisition co-founder of Stamus Networks, a company providing Suricata based network probe appliances. Not a Kernel developer 42 + 1 patches in Linux Mainly hacks É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 2 / 42
  4. 4. About me Eric Leblond a.k.a Regit Network security expert Netfilter core team: Maintainer of ulogd2: Netfilter logging daemon Suricata developer: In charge of packet acquisition co-founder of Stamus Networks, a company providing Suricata based network probe appliances. Not a Kernel developer 42 + 1 patches in Linux Mainly hacks No kernel harmed during the making of this talk É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 2 / 42
  5. 5. What is Suricata IDS and IPS engine Get it here: http://www.suricata-ids.org Open Source (GPLv2) Initially publicly funded, now funded by consortium members Run by Open Information Security Foundation (OISF) More information about OISF at http://www.openinfosecfoundation.org/ É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 3 / 42
  6. 6. Suricata key points Suricata Protocol analysis Port inde- pendant detection Metadata extraction Protocols HTTP SMTP TLS DNS SSH Intrusion detection Signatures Protocol keywords Multi step attacks Lua for advance analysis ArchitectureOutput JSON Redis Syslog Prelude IDSpcap afpacket netmap IPS Netfilter afpacket ipfw É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 4 / 42
  7. 7. Suricata Ecosystem É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 5 / 42
  8. 8. Suricata application layer analysis É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 6 / 42
  9. 9. Suricata EVE JSON event É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 7 / 42
  10. 10. Suricata Dashboard É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 8 / 42
  11. 11. 1 Suricata meets eBPF 2 AF_PACKET bypass via eBPF 3 XDP É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 8 / 42
  12. 12. AF_PACKET Linux raw socket Raw packet capture method Socket based or mmap based É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 9 / 42
  13. 13. AF_PACKET Linux raw socket Raw packet capture method Socket based or mmap based Fanout mode Load balancing over multiple sockets Multiple load balancing functions Flow based CPU based RSS based É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 9 / 42
  14. 14. Suricata workers mode É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 10 / 42
  15. 15. Load balancing and hash symmetry Stream reconstruction Using packets sniffed from network to reconstruct TCP stream as seen by remote application Non symmetrical hash break Out of order packets Effect of non symmetrical hash É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 11 / 42
  16. 16. Broken symmetry History T. Herbert introduce asymmetrical hash function in flow Kernel 4.2 Users did start to complain And our quest did begin Fixed in 4.6 and pushed to stable by David S. Miller É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 12 / 42
  17. 17. Broken symmetry History T. Herbert introduce asymmetrical hash function in flow Kernel 4.2 Users did start to complain And our quest did begin Fixed in 4.6 and pushed to stable by David S. Miller Intel NIC RSS hash XL510 hash is not symmetrical XL710 could be symmetrical Hardware is capable Driver does not allow it Patch proposed by Victor Julien É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 12 / 42
  18. 18. eBPF cluster Userspace to the rescue Program your own hash function in userspace Available since Linux 4.3 Developed by Willem de Bruijn Using eBPF infrastructure by Alexei Storovoitov eBPF cluster: ippair IP pair load balancing Perfect for xbit ebpf-lb-file variable in af-packet iface configuration É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 13 / 42
  19. 19. eBPF code for ippair s t a t i c __always_inline i n t ipv4_hash ( s t r u c t __sk_buff ∗skb ) { uint32_t nhoff ; uint32_t src , dst ; nhoff = skb−>cb [ 0 ] ; src = load_word ( skb , nhoff + o f f s e t o f ( s t r u c t iphdr , saddr ) ) ; dst = load_word ( skb , nhoff + o f f s e t o f ( s t r u c t iphdr , daddr ) ) ; return src + dst ; } i n t __section ( " loadbalancer " ) lb ( s t r u c t __sk_buff ∗skb ) { __u32 nhoff = BPF_LL_OFF + ETH_HLEN; skb−>cb [ 0 ] = nhoff ; switch ( skb−>protocol ) { case __constant_htons (ETH_P_IP ) : return ipv4_hash ( skb ) ; case __constant_htons (ETH_P_IPV6 ) : return ipv6_hash ( skb ) ; default : break ; } return skb−>protocol ; /∗ hash on proto by default ∗/ } É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 14 / 42
  20. 20. eBPF cluster: prospective Custom tunnelled traffic Tunneling protocol not known by kernel and card L2TP GTP: 4G protocol Shared by different flows Result in poor load balancing eBPF solution Strip tunnel headers Load balance on inner packets Get fair balancing É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 15 / 42
  21. 21. 1 Suricata meets eBPF 2 AF_PACKET bypass via eBPF 3 XDP É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 15 / 42
  22. 22. The big flow problem: load balancing É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 16 / 42
  23. 23. The big flow problem: load balancing É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 16 / 42
  24. 24. The big flow problem: load balancing É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 16 / 42
  25. 25. The big flow problem: unfair balancing É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 17 / 42
  26. 26. The big flow problem: unfair balancing É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 17 / 42
  27. 27. The big flow problem: elephant flow É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 18 / 42
  28. 28. The big flow problem: elephant flow É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 18 / 42
  29. 29. The big flow problem: elephant flow É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 18 / 42
  30. 30. The big flow problem Ring buffer overrun Limited sized ring buffer Overrun cause packets loss that cause streaming malfunction Ring size increase Work around Use memory Fail for non burst Dequeue at N Queue at speed N+M É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 19 / 42
  31. 31. Introducing bypass Stop packet handling as soon as possible Tag flow as bypassed Maintain table of bypassed flows Discard packet if part of a bypassed flow Bypass method Local bypass: Suricata discard packet after decoding Capture bypass: capture method maintain flow table and discard packets of bypassed flows É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 20 / 42
  32. 32. Bypassing big flow: local bypass É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 21 / 42
  33. 33. Bypassing big flow: capture bypass É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 22 / 42
  34. 34. Bypassing big flow: capture bypass É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 22 / 42
  35. 35. Stream depth bypass Attacks characteristic In most cases attack is done at start of TCP session Generation of requests prior to attack is not common Multiple requests are often not even possible on same TCP session Stream reassembly depth Reassembly is done till stream.reassembly.depth bytes. Stream is not analyzed once limit is reached Activating stream depth bypass Set stream.bypass to yes in YAML É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 23 / 42
  36. 36. Selective bypass Ignore some traffic Ignore intensive traffic like Netflix Can be done independently of stream depth Can be done using generic or custom signatures É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 24 / 42
  37. 37. Selective bypass Ignore some traffic Ignore intensive traffic like Netflix Can be done independently of stream depth Can be done using generic or custom signatures The bypass keyword A new bypass signature keyword Trigger bypass when signature match Example of signature pass http any any −> any any ( content : " suricata . io " ; http_host ; bypass ; sid :6666; rev : 1 ; ) É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 24 / 42
  38. 38. Implementation Suricata update Add callback function Capture method register itself and provide a callback Suricata calls callback when it wants to offload É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 25 / 42
  39. 39. Implementation Suricata update Add callback function Capture method register itself and provide a callback Suricata calls callback when it wants to offload NFQ bypass in Suricata 3.2 Update capture register function Written callback function Set a mark with respect to a mask on packet Mark is set on packet when issuing the verdict É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 25 / 42
  40. 40. And now AF_PACKET What’s needed Suricata to tell kernel to ignore flows Kernel system able to Maintain a list of flow entries Discard packets belonging to flows in the list Update from userspace nftables is too late even in ingress É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 26 / 42
  41. 41. And now AF_PACKET What’s needed Suricata to tell kernel to ignore flows Kernel system able to Maintain a list of flow entries Discard packets belonging to flows in the list Update from userspace nftables is too late even in ingress eBPF filter using maps eBPF introduce maps Different data structures Hash, array, . . . Update and fetch from userspace Looks good! É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 26 / 42
  42. 42. Using libbpf Library Linux source in tools/lib/bpf directory Provide high level function to load eBPF elf file Create maps for user Do the relocation Sample usage s t r u c t bpf_object ∗ bpfobj = bpf_object__open ( path ) ; bpf_object__load ( bpfobj ) ; pfd = bpf_program__fd ( bpfprog ) ; /∗ store the map in our array ∗/ bpf_map__for_each (map, bpfobj ) { map_array [ l a s t ] . fd = bpf_map__fd (map ) ; map_array [ l a s t ] . name = strdup ( bpf_map__name (map ) ) ; l a s t ++; } É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 27 / 42
  43. 43. Kernel code and exchange structure s t r u c t pair { uint64_t time ; uint64_t packets ; uint64_t bytes ; } ; s t r u c t bpf_map_def SEC( "maps" ) flow_table_v4 = { . type = BPF_MAP_TYPE_HASH, . key_size = sizeof ( s t r u c t flowv4_keys ) , . value_size = sizeof ( s t r u c t pair ) , . max_entries = 32768, } ; value = bpf_map_lookup_elem(& flow_table_v4 , &tuple ) ; i f ( value ) { __sync_fetch_and_add(& value−>packets , 1 ) ; __sync_fetch_and_add(& value−>bytes , skb−>len ) ; value−>time = bpf_ktime_get_ns ( ) ; return 0; } return −1; É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 28 / 42
  44. 44. Sharing data Data is updated with stats Getting last flow activity time allow Suricata to handle timeout É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 29 / 42
  45. 45. Userspace code s t r u c t flowv4_keys { __be32 src ; __be32 dst ; union { __be32 ports ; __be16 port16 [ 2 ] ; } ; __u32 ip_proto ; } ; while ( bpf_map__get_next_key ( mapfd , &key , &next_key ) == 0) { bpf_map__lookup_elem ( mapfd , &key , &value ) ; clock_gettime (CLOCK_MONOTONIC, &curtime ) ; i f ( curtime −>tv_sec ∗ 1000000000 − value . time > BYPASSED_FLOW_TIMEOUT) { flowstats −>count ++; flowstats −>packets += value . packets ; flowstats −>bytes += value . bytes ; bpf_map__delete_elem ( fd , key ) ; } key = next_key ; } É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 30 / 42
  46. 46. Test methodology Test setup Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Live traffic: Around 1Gbps to 2Gbps Real users so not reproducible Tests One hour long run Different stream depth values Collected Suricata statistics counters (JSON export) Graphs done via Timelion (https://www.elastic.co/blog/timelion-timeline) É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 31 / 42
  47. 47. Results: stream bypass at 1mb É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 32 / 42
  48. 48. Results: stream bypass at 512kb É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 33 / 42
  49. 49. A few words on graphics Tests at 1mb Mark show some really high rate bypass Potentialy a big high speed flow Tests at 512kb We have on big flow that kill the bandwidth Capture get almost null Even number of closed bypassed flows is low É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 34 / 42
  50. 50. AF_PACKET bypass and your CPU is peaceful É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 35 / 42
  51. 51. 1 Suricata meets eBPF 2 AF_PACKET bypass via eBPF 3 XDP É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 35 / 42
  52. 52. eXtreme Data Path raw packet-page inside driver Before allocating SKBs Inside device drivers RX function Operate directly on RX DMA packet-pages Run eBPF program at hook point eBPF decision XDP_PASS: pass to normal network stack (can be modified) XDP_DROP: drop packet XDP_TX: bounce trafic XDP_REDIRECT: redirect packet to another interface using port map Talks by Jesper Dangaard Brouer http://people.netfilter.org/hawk/presentations/LLC2017/XDP_DDoS_protecting_LLC2017.pdf https://people.netfilter.org/hawk/presentations/NetConf2017/xdp_work_ahead_NetConf_April_ 2017.pdf É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 36 / 42
  53. 53. XDP Support and Performance Need modified drivers Supported drivers Mellanox: mlx4 + mlx5 Netronome: nfp Cavium/Qlogic: qede virtio-net Broadcom: bnxt_en Intel: ixgbe, i40e Kernel 4.12 introduce generic drivers Performances Single CPU on Mellanox 40Gbit/s NICs (mlx4) 28 Mpps – Filter drop all (actually read/touch data) 12 Mpps – TX-bounce forward (TX bulking) 10 Mpps – TX-bounce with udp+mac rewrite É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 37 / 42
  54. 54. Suricata XDP bypass Convert eBPF code No more access to skb Direct access to data means parsing From filter socket to device Filter is attached to device Code to attach needs to be run And added to libbpf (?) per CPU structure for flow table É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 38 / 42
  55. 55. Suricata XDP capture Use perf event Use perf event system Memory mapped ring buffer Per CPU structure Architecture constraints Load balancing per CPU Done by the network card Symetrical hash needed CPU pinning É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 39 / 42
  56. 56. Implementation i n t SEC( " xdp " ) xdp_hashfilter ( s t r u c t xdp_md ∗ ctx ) { void ∗data_end = ( void ∗ ) ( long ) ctx −>data_end ; void ∗data = ( void ∗ ) ( long ) ctx −>data ; s t r u c t ethhdr ∗eth = data ; i n t rc = XDP_PASS; uint16_t h_proto ; uint64_t nh_off ; nh_off = sizeof (∗ eth ) ; i f ( data + nh_off > data_end ) return rc ; h_proto = eth−>h_proto ; i f ( h_proto == __constant_htons (ETH_P_8021Q) | | h_proto == __constant_htons (ETH_P_8021AD ) s t r u c t vlan_hdr ∗vhdr ; vhdr = data + nh_off ; nh_off += sizeof ( s t r u c t vlan_hdr ) ; i f ( data + nh_off > data_end ) return rc ; h_proto = vhdr−>h_vlan_encapsulated_proto ;É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 40 / 42
  57. 57. Conclusion Suricata, eBPF and XDP A fresh but interesting method Feedback welcome More information Stamus Networks: https://www.stamus-networks.com/ Suricata eBPF code: https://github.com/regit/suricata/tree/ebpf-3.16 Libbpf update: https://github.com/regit/linux/tree/libbpf-xdp É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 41 / 42
  58. 58. Questions ? Thanks to Alexei Storovoitov Daniel Borkmann Contact me Mail: eleblond@stamus-networks.com Twitter: @regiteric More information Suricata eBPF and XDP code: https://github.com/regit/ suricata/tree/ebpf-3.16 É. Leblond (Stamus Networks) eBPF and XDP seen from the eyes of a meerkat September 29, 2017 42 / 42

×