Building Network Functions with eBPF & BCC

eBPF (Extended Berkeley Packet Filter) is an in-kernel virtual machine that runs user-supplied, sandboxed programs inside the kernel. It is especially well suited to networking: programs can filter traffic, classify traffic and perform high-performance custom packet processing.

BCC (BPF Compiler Collection) is a toolkit for creating efficient kernel tracing and manipulation programs. It makes use of eBPF.
BCC provides an end-to-end workflow for developing eBPF programs and supplies Python bindings, making eBPF programs much easier to write.

Together, eBPF and BCC allow you to develop and deploy network functions safely and easily, focusing on your application logic (instead of kernel datapath integration).

In this session, we will introduce eBPF and BCC, explain how to implement a network function using BCC, discuss some real-life use cases and show a live demonstration of the technology.

About the speaker

Shmulik Ladkani, Chief Technology Officer at Meta Networks.
Long-time network veteran and kernel geek.

Shmulik started his career at Jungo (acquired by NDS/Cisco) implementing residential gateway software, focusing on embedded Linux, Linux kernel, networking and hardware/software integration.

Some billions of forwarded packets later, Shmulik left his position as Jungo's lead architect and joined Ravello Systems (acquired by Oracle) as tech lead, developing a virtual data center as a cloud-based service, with a focus on virtualization systems, network virtualization and SDN.

Recently he co-founded Meta Networks where he's been busy architecting secure, multi-tenant, large-scale network infrastructure as a cloud-based service.

Published in: Software

  1. 1. Building Network Functions with eBPF & BCC
     Shmulik Ladkani, 2018
     This work is licensed under a Creative Commons Attribution 4.0 International License.
  2. 2. Agenda ● Intro ● Theory ○ Classical BPF ○ eBPF ○ BCC ● Practice ○ Examples and demo
  3. 3. Berkeley Packet Filter
  4. 4. Berkeley Packet Filter New Architecture for User-level Packet Capture ● McCanne/Jacobson 1993 ● Standardized API ● Performant
  5. 5. Berkeley Packet Filter ● Allows user program to attach a filter onto a socket ● Available on most *nix systems
  6. 6. Design
     ● Abstract-machine architecture
       ○ Registers, memory, addressing modes…
       ○ Instruction set (load, store, branch, ALU…)
     ● In-kernel interpreter
     Example program: assembly / machine instructions
       (000) ldh [12]                { 0x28, 0, 0, 0x0000000c },
       (001) jeq #0x800 jt 2 jf 5    { 0x15, 0, 3, 0x00000800 },
       (002) ldb [23]                { 0x30, 0, 0, 0x00000017 },
       (003) jeq #0x6 jt 4 jf 5      { 0x15, 0, 1, 0x00000006 },
       (004) ret #262144             { 0x6, 0, 0, 0x00040000 },
       (005) ret #0                  { 0x6, 0, 0, 0x00000000 },
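To make the abstract machine concrete, here is a minimal user-space sketch of an interpreter that understands only the four opcodes used by the example program above. The names `struct insn`, `ip_tcp_filter` and `run_filter` are hypothetical, chosen for illustration; the in-kernel interpreter supports many more instructions and validates programs before running them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field layout as a classic BPF instruction:
 * opcode, jump-if-true offset, jump-if-false offset, constant. */
struct insn {
    uint16_t code;
    uint8_t  jt;
    uint8_t  jf;
    uint32_t k;
};

/* The example program from the slide: accept IPv4/TCP, drop the rest. */
static const struct insn ip_tcp_filter[] = {
    { 0x28, 0, 0, 0x0000000c }, /* ldh [12]    : A = EtherType          */
    { 0x15, 0, 3, 0x00000800 }, /* jeq #0x800  : IPv4? otherwise goto 5 */
    { 0x30, 0, 0, 0x00000017 }, /* ldb [23]    : A = IP protocol        */
    { 0x15, 0, 1, 0x00000006 }, /* jeq #0x6    : TCP? otherwise goto 5  */
    { 0x06, 0, 0, 0x00040000 }, /* ret #262144 : accept                 */
    { 0x06, 0, 0, 0x00000000 }, /* ret #0      : drop                   */
};

/* Returns the number of bytes to accept, or 0 to drop the packet. */
static uint32_t run_filter(const struct insn *prog, size_t len,
                           const uint8_t *pkt, size_t pkt_len)
{
    uint32_t a = 0;  /* accumulator register */
    size_t pc = 0;   /* program counter */

    assert(prog != NULL && pkt != NULL);
    while (pc < len) {
        const struct insn *i = &prog[pc++];

        switch (i->code) {
        case 0x28: /* ldh: load big-endian halfword at absolute offset k */
            if ((size_t)i->k + 2 > pkt_len)
                return 0; /* out-of-bounds load drops the packet */
            a = ((uint32_t)pkt[i->k] << 8) | pkt[i->k + 1];
            break;
        case 0x30: /* ldb: load byte at absolute offset k */
            if (i->k >= pkt_len)
                return 0;
            a = pkt[i->k];
            break;
        case 0x15: /* jeq: relative jump, offsets count from the next insn */
            pc += (a == i->k) ? i->jt : i->jf;
            break;
        case 0x06: /* ret: finish with constant k */
            return i->k;
        default:   /* opcode not covered by this sketch */
            return 0;
        }
    }
    return 0;
}
```

Calling `run_filter(ip_tcp_filter, 6, frame, frame_len)` on an Ethernet frame yields 262144 for IPv4/TCP and 0 otherwise, matching the jump targets in the assembly listing.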
  7. 7. Modus Operandi
       struct sock_filter code[] = { /* ... machine instructions ... */ };
       struct sock_fprog bpf = {
           .filter = code,
           .len = ARRAY_SIZE(code),
       };
       sock = socket(...);
       setsockopt(sock, SOL_SOCKET, SO_ATTACH_FILTER, &bpf, sizeof(bpf));
  8. 8. Applications ● Libpcap ○ Tcpdump, Wireshark, Nmap... ● DHCP stacks ● WPA 802.1x stacks ● Android 464XLAT ● android.net.NetworkUtils ● Custom user-space protocol stacks
  9. 9. Linux Enhancements: Packet Metadata Access
       Extension     Description
       len           skb->len
       proto         skb->protocol
       type          skb->pkt_type
       ifidx         skb->dev->ifindex
       hatype        skb->dev->type
       mark          skb->mark
       rxhash        skb->hash
       vlan_tci      skb_vlan_tag_get(skb)
       vlan_avail    skb_vlan_tag_present(skb)
       vlan_tpid     skb->vlan_proto
       nla           Netlink attribute of type X with offset A
       nlan          Nested Netlink attribute of type X with offset A
  10. 10. Linux Enhancements: Just-In-Time Compiler ● Converts BPF instructions directly into native code ● As of v3.0 (x86_64) ○ SPARC, PowerPC, ARM, ARM64, MIPS, s390 followed
  11. 11. Linux Enhancements: Hooking Points
      ● IPTables xt_bpf
        ○ Competitive with traditional u32 match
        ○ As of v3.9
        ○ iptables -A OUTPUT -m bpf --bytecode '4,48 0 0 9,21 0 1 6,6 0 0 1,6 0 0 0' -j ACCEPT
      ● TC cls_bpf
        ○ Alternative to ematch / u32 classification
        ○ As of v3.13
        ○ tc filter add dev em1 parent 1: bpf bytecode '1,6 0 0 4294967295,' flowid 1:1
          tc filter add dev em1 parent 1: bpf bytecode-file /var/bpf/tcp-syn flowid 1:1
  12. 12. Linux Enhancements: Seccomp BPF
      ● Filters system calls using a BPF filter
        ○ Operates on syscall number and syscall arguments
        ○ As of v3.5
      ● Used by Chrome, Firefox, OpenSSH, Android…
        static struct sock_filter filter[] = {
            /* ... */
            // load syscall number
            BPF_STMT(BPF_LD+BPF_W+BPF_ABS, offsetof(struct seccomp_data, nr)),
            // only allow 'read': fall through to ALLOW on match, skip to KILL otherwise
            BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, SYS_read, 0, 1),
            BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
            BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_KILL),
        };
        /* ... */
        prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &filterprog);
  13. 13. Summary ● Fixed filter program ● Few injection points ● Two domains ○ Packet filtering ○ Syscall filtering ● Functional, stateless ● Kernel data is immutable ● No kernel interaction User-program injected into kernel to control behavior
  14. 14. Extended BPF
  15. 15. eBPF ● Abstract-machine engine running injected user programs ● On steroids ○ New domain (tracing/profiling) ○ Numerous hooking points ○ LLVM backend ○ Actions (mutates data) ○ Data-structures (“maps”) ○ Kernel callable helper functions
  16. 16. Applications (network) ● Network Security (DDoS, IDS, IPS …) ● Load Balancers ● Custom Statistics ● Monitoring ● Container Networking ● Custom Forwarding Stacks ● Network Functions
  17. 17. Modus Operandi ● Write ○ Restricted C ● Compile ○ clang & llc ● Load ○ bpf(BPF_PROG_LOAD, ...) ● Attach ○ Subsystem dependent
  18. 18. samples/bpf/sockex1_kern.c
        struct bpf_map_def SEC("maps") my_map = {
            .type = BPF_MAP_TYPE_ARRAY,
            .key_size = sizeof(u32),
            .value_size = sizeof(long),
            .max_entries = 256,
        };

        SEC("socket1")
        int bpf_prog1(struct __sk_buff *skb)
        {
            int index = load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol));
            long *value;

            if (skb->pkt_type != PACKET_OUTGOING)
                return 0;

            value = bpf_map_lookup_elem(&my_map, &index);
            if (value)
                __sync_fetch_and_add(value, skb->len);

            return 0;
        }
  19. 19. samples/bpf/sockex1_user.c
        load_bpf_file(filename); // assigns prog_fd, map_fd
        sock = open_raw_sock("lo");
        setsockopt(sock, SOL_SOCKET, SO_ATTACH_BPF, prog_fd, sizeof(prog_fd[0]));

        f = popen("ping -c5 localhost", "r");

        for (i = 0; i < 5; i++) {
            long long tcp_cnt, udp_cnt, icmp_cnt;

            key = IPPROTO_TCP;
            bpf_map_lookup_elem(map_fd[0], &key, &tcp_cnt);
            key = IPPROTO_UDP;
            bpf_map_lookup_elem(map_fd[0], &key, &udp_cnt);
            key = IPPROTO_ICMP;
            bpf_map_lookup_elem(map_fd[0], &key, &icmp_cnt);

            printf("TCP %lld UDP %lld ICMP %lld bytes\n", tcp_cnt, udp_cnt, icmp_cnt);
            sleep(1);
        }
  20. 20. eBPF Maps ● Key-value store ○ Keeps program state ○ Accessible from the eBPF program ○ Accessible from userspace ● Allows context aware behavior ● Numerous data structures BPF_MAP_TYPE_HASH BPF_MAP_TYPE_ARRAY BPF_MAP_TYPE_LRU_HASH BPF_MAP_TYPE_LPM_TRIE more ...
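As an illustration of what one of these data structures provides, the sketch below shows the lookup semantics of a longest-prefix-match map such as BPF_MAP_TYPE_LPM_TRIE, in plain user-space C. It is an assumption-laden stand-in: it uses a linear scan rather than a trie, and `struct route` and `lpm_lookup` are hypothetical names, not part of any kernel or BCC API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One table entry: an IPv4 prefix (host byte order), its length, and a value. */
struct route {
    uint32_t prefix;
    uint8_t  prefix_len; /* 0..32 */
    int      value;
};

/* Return the value of the most specific prefix covering addr, or -1 if none. */
static int lpm_lookup(const struct route *tbl, size_t n, uint32_t addr)
{
    int best_len = -1;
    int best_val = -1;

    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        uint8_t len = tbl[i].prefix_len;
        uint32_t mask;

        if (len > 32)
            continue; /* ignore malformed entries */
        /* A zero-length prefix matches everything; avoid shifting by 32. */
        mask = (len == 0) ? 0 : (~(uint32_t)0 << (32 - len));
        if ((addr & mask) == (tbl[i].prefix & mask) && (int)len > best_len) {
            best_len = len;
            best_val = tbl[i].value;
        }
    }
    return best_val;
}
```

With entries for 0.0.0.0/0, 10.0.0.0/8 and 10.50.0.0/16, an address such as 10.50.1.9 resolves to the /16 entry because it is the longest match.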
  21. 21. Program Types
      Determines: context, whence, access rights
        BPF_PROG_TYPE_SOCKET_FILTER    packet filter
        BPF_PROG_TYPE_SCHED_CLS        tc classifier
        BPF_PROG_TYPE_SCHED_ACT        tc action
        BPF_PROG_TYPE_LWT_*            lightweight tunnel filter
        BPF_PROG_TYPE_KPROBE           kprobe filter
        BPF_PROG_TYPE_TRACEPOINT       tracepoint filter
        BPF_PROG_TYPE_PERF_EVENT       perf event filter
        BPF_PROG_TYPE_XDP              packet filter from XDP
        BPF_PROG_TYPE_CGROUP_SKB       packet filter for control groups
        BPF_PROG_TYPE_CGROUP_SOCK      same, allowed to modify socket options
  22. 22. Helper Functions
      ● eBPF program may call a predefined set of functions
      ● Differs by program type
      ● Examples:
        BPF_FUNC_skb_load_bytes
        BPF_FUNC_csum_diff
        BPF_FUNC_skb_get_tunnel_key
        BPF_FUNC_get_hash_recalc
        BPF_FUNC_skb_store_bytes
        BPF_FUNC_skb_pull_data
        BPF_FUNC_l3_csum_replace
        BPF_FUNC_l4_csum_replace
        BPF_FUNC_redirect
        BPF_FUNC_clone_redirect
        BPF_FUNC_skb_vlan_push
        BPF_FUNC_skb_vlan_pop
        BPF_FUNC_skb_change_proto
        BPF_FUNC_skb_set_tunnel_key
        ...
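Helpers such as BPF_FUNC_l3_csum_replace exist because a program that rewrites a header field must also patch the checksum. The sketch below illustrates the underlying arithmetic in user space, using the incremental update formula from RFC 1624, HC' = ~(~HC + ~m + m'). It is not the kernel's implementation; `fold`, `csum_full` and `csum_replace16` are hypothetical names, and byte order is ignored by treating the header as an array of 16-bit values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Full Internet checksum over 16-bit words; len is in bytes and must be even. */
static uint16_t csum_full(const uint16_t *words, size_t len)
{
    uint32_t sum = 0;

    assert(words != NULL && len % 2 == 0);
    for (size_t i = 0; i < len / 2; i++)
        sum += words[i];
    return (uint16_t)~fold(sum);
}

/* Incremental update after one 16-bit field changes from old_val to new_val. */
static uint16_t csum_replace16(uint16_t check, uint16_t old_val, uint16_t new_val)
{
    uint32_t sum = (uint16_t)~check; /* ~HC */

    sum += (uint16_t)~old_val;       /* + ~m */
    sum += new_val;                  /* + m' */
    return (uint16_t)~fold(sum);     /* HC' */
}
```

The incremental form touches three values instead of the whole header, which is why rewriting a TTL or an address in the datapath does not require re-reading the packet.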
  23. 23. BCC
  24. 24. BPF Compiler Collection ● Toolkit for creating and using eBPF ● Makes eBPF programs easier to write ○ Kernel instrumentation in C ○ Frontends in Python and Lua ● Numerous examples ● Documentation and tutorials
  25. 25. Example #1 Custom Statistics Histogram of packets by their size
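The slide does not show the code for this example, so the following is only a sketch of the bucketing logic, written as plain user-space C. It assumes power-of-two size buckets, which is a common choice for such histograms but an assumption here; `NUM_SLOTS`, `size_slot` and `record_packet` are hypothetical names. In an eBPF program the array would live in a map and be read from user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SLOTS 16 /* assumed bucket count: slot i covers [2^i, 2^(i+1)) */

/* Bucket index for a packet length: floor(log2(len)), with 0 for len 0 or 1. */
static unsigned int size_slot(uint32_t len)
{
    unsigned int slot = 0;

    while (len > 1) {
        len >>= 1;
        slot++;
    }
    return slot;
}

/* Count one packet of the given length in the histogram. */
static void record_packet(uint64_t hist[NUM_SLOTS], uint32_t len)
{
    unsigned int slot = size_slot(len);

    assert(hist != NULL);
    if (slot >= NUM_SLOTS)
        slot = NUM_SLOTS - 1; /* clamp oversized packets into the last slot */
    hist[slot]++;
}
```

A 64-byte packet lands in slot 6 and a 1500-byte packet in slot 10, so small control packets and full-size data packets end up clearly separated.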
  26. 26. Example #2 Custom Filtering Drop egress ARP Requests for specific Target Addresses
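The filtering decision for this example can be sketched in user-space C as below. The code from the talk is not in the transcript, so this only illustrates the parsing steps under stated assumptions: an untagged Ethernet frame carrying IPv4-over-Ethernet ARP, and a hypothetical `should_drop` function with a hypothetical block list passed as host-order addresses. An eBPF version would perform the same checks on the packet buffer and return the hook's drop verdict.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
    ETH_HDR_LEN     = 14,     /* dst MAC, src MAC, EtherType */
    ARP_PKT_LEN     = 28,     /* ARP for Ethernet + IPv4 */
    ETH_TYPE_ARP    = 0x0806,
    ARP_OP_REQ      = 1,
    ARP_OFF_OPCODE  = 6,      /* offsets within the ARP payload */
    ARP_OFF_TARGET  = 24      /* target protocol (IPv4) address */
};

/* True if the frame is an ARP request whose target address is in blocked[]. */
static bool should_drop(const uint8_t *frame, size_t len,
                        const uint32_t *blocked, size_t n_blocked)
{
    const uint8_t *arp;
    uint32_t target;

    assert(frame != NULL);
    if (len < ETH_HDR_LEN + ARP_PKT_LEN)
        return false;
    if (((frame[12] << 8) | frame[13]) != ETH_TYPE_ARP)
        return false;

    arp = frame + ETH_HDR_LEN;
    if (((arp[ARP_OFF_OPCODE] << 8) | arp[ARP_OFF_OPCODE + 1]) != ARP_OP_REQ)
        return false;

    /* Assemble the target IPv4 address in host byte order. */
    target = ((uint32_t)arp[ARP_OFF_TARGET] << 24) |
             ((uint32_t)arp[ARP_OFF_TARGET + 1] << 16) |
             ((uint32_t)arp[ARP_OFF_TARGET + 2] << 8) |
             (uint32_t)arp[ARP_OFF_TARGET + 3];

    for (size_t i = 0; i < n_blocked; i++)
        if (blocked[i] == target)
            return true;
    return false;
}
```

Keeping the block list as data rather than hard-coded comparisons mirrors how an eBPF program would consult a map that user space updates at run time.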
  27. 27. Example #3 Custom Network Function Network Load Balancer
  28. 28. Example #3 - Topology
      [Diagram]
      ● Test Machine: 10.33.33.10, 10.33.33.11, 10.33.33.12, 10.33.33.13, 10.33.33.14
      ● Load Balancer: 192.0.2.50, dev multigre0; sets GRE tunnel destination by flow hash
      ● Server1: VIP 192.0.2.50, 10.50.1.9
      ● Server2: VIP 192.0.2.50, 10.50.2.9
      ● Packet labels: Src: 10.33.33.10 Dst: 192.0.2.50 (client side); Src: 10.50.1.1 Dst: 10.50.1.9 carrying Src: 10.33.33.10 Dst: 192.0.2.50 (tunnel side)
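The selection step in this topology, choosing a GRE tunnel destination by flow hash, can be sketched in user-space C as follows. This is an illustration under assumptions: FNV-1a is swapped in as the hash purely for self-containment, and `struct flow`, `flow_hash` and `pick_backend` are hypothetical names. An in-kernel version could instead rely on helpers from the list earlier in the deck, such as BPF_FUNC_get_hash_recalc and BPF_FUNC_skb_set_tunnel_key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flow identity (addresses in host byte order). */
struct flow {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t  proto;
};

/* Mix the low nbytes bytes of value into an FNV-1a hash state. */
static uint32_t fnv1a_mix(uint32_t h, uint32_t value, unsigned int nbytes)
{
    for (unsigned int i = 0; i < nbytes; i++) {
        h ^= (value >> (8 * i)) & 0xff;
        h *= 16777619u; /* FNV prime */
    }
    return h;
}

/* Hash each field separately so struct padding never enters the hash. */
static uint32_t flow_hash(const struct flow *f)
{
    uint32_t h = 2166136261u; /* FNV offset basis */

    h = fnv1a_mix(h, f->src_ip, 4);
    h = fnv1a_mix(h, f->dst_ip, 4);
    h = fnv1a_mix(h, f->src_port, 2);
    h = fnv1a_mix(h, f->dst_port, 2);
    h = fnv1a_mix(h, f->proto, 1);
    return h;
}

/* Pick the tunnel destination for a flow; the same flow always maps to the same backend. */
static uint32_t pick_backend(const struct flow *f,
                             const uint32_t *backends, size_t n)
{
    assert(f != NULL && backends != NULL && n > 0);
    return backends[flow_hash(f) % n];
}
```

With the two servers from the topology (10.50.1.9 and 10.50.2.9) as `backends`, every packet of a given connection goes to the same server, which is what keeps TCP sessions intact behind the shared VIP. Plain modulo selection reshuffles flows when the backend count changes; that trade-off is left out of this sketch.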
  29. 29. Further Topics ● bpfilter ● Open vSwitch eBPF datapath ● XDP ● Hardware Offloads ● Tracing / Profiling
  30. 30. Thank You!
