How eBPF boosts
Kubernetes service
networking performance
Joseph Muli
Rajesh Dutta
Presenters
7 years' experience in IT
Started as a Python developer
Now a Platform engineer & Architecture enthusiast
Joseph Muli
13 years' experience in IT
Started as a performance engineer
Now a Platform engineer & Kubernetes enthusiast
Rajesh Dutta
Agenda
• Quick refresher
• How does service networking work in Kubernetes?
• Deep diving into iptables
• Introducing eBPF
• How does eBPF improve service networking in Kubernetes?
• Use case: Cilium
• Impact on latency
• When to use and when not to use?
Quick Refresher
Service (SVC)
[Diagram: a Service with ClusterIP 10.96.12.11 fronting three Pods, 172.16.1.10, 172.16.1.11, and 172.16.1.12, via one endpoint each.]
Kube-Proxy
• Runs as a DaemonSet (1 per node)
• Two modes: iptables and IPVS
• Watches the API server for new services
and endpoints or changes in existing ones.
How does service
networking work
in Kubernetes?
Service networking using iptables
[Diagram build-up: traffic arrives on the host's eth0 (192.168.0.10) in the host network namespace. kube-proxy, running in user space, programs iptables rules into the kernel. A pod in its own network namespace, with its own eth0 (172.16.1.10), runs the Hello World app on port 8080, exposed as ClusterIP Service 10.96.12.11 with endpoint 172.16.1.10:8080. Every network packet traverses the iptables rules before being delivered via the lxc0 veth into the pod.]
Let's deep-dive
into iptables
iptables
Firewalls work by defining rules that govern which
traffic is allowed, and which is blocked.
The utility firewall developed for Linux systems
is iptables. It was chosen because of its stability and
wide adoption on Linux.
[Diagram build-up: iptables is organised into Tables (e.g. Filter); each Table contains Chains (e.g. Input); each Chain holds an ordered list of Rules.]
Inside iptables
[Screenshots: stepping through the iptables rule listing generated on a Kubernetes node, chain after chain of rules.]
Umm, that's a hell of a lot of rules!
Limitations using iptables 1/4
A multitude of rule processing for each
and every service
Limitations using iptables 2/4
Sequential search, no indexing!
Full table scans are expensive!
Limitations using iptables 3/4
Incremental changes are not supported.
If you need to make a change to an
existing rule, you must remove the old
rule and add a new rule with the updated
configuration.
Limitations using iptables 4/4
One size doesn't fit all!
https://twitter.com/jpetazzo/status/614851069508595712?ref_src=twsrc%5Etfw
How much latency are we talking about?
Latency: Adding Virtual IP

Number of Services       1       5,000     20,000
Number of Rules          8       40,000    160,000
Latency using iptables   2 ms    11 min    5 hours

[Charts: latency overhead grows non-linearly with the number of iptables rules]
Let’s move on from iptables to IPVS
Inside IPVS
IPVS allows for more efficient
and scalable load balancing of
traffic using hash tables.
This mode optionally supports
choosing among load-balancing
algorithms.
And hash tables are fast!
Latency: Adding Virtual IP

Number of Services       1       5,000     20,000
Number of Rules          8       40,000    160,000
Latency using iptables   2 ms    11 min    5 hours
Latency using IPVS       2 ms    2 ms      2 ms
That's a significant improvement with IPVS!
Is that it?
Let's compare

Feature           IPVS                            eBPF
Programmability   ❌                              ✅ (flexible)
Security          Limited (not out-of-the-box)    ✅ (n/w attacks, custom security policies)
Observability     Limited (not out-of-the-box)    ✅ (packet inspection, system performance)
Load balancing    ✅                              ✅ (faster)
IPVS vs eBPF
"eBPF (XDP) offers a performance gain of 4.3x over IPVS" for load
balancing
"The eXpress Data Path: Fast Programmable Packet Processing in the Operating System Kernel", published
in 2018
Enter
Introduction to eBPF
eBPF is a technology that allows running sandboxed programs in kernel
space without changing kernel source code or restarting the kernel.
Journey
• 1993: Inception as the BSD Packet Filter in a paper by Lawrence Berkeley National Laboratory's Steven McCanne and Van Jacobson: a pseudo-machine that can run filters, i.e. programs written to determine whether to accept or reject a network packet
• 1997: BPF (Berkeley Packet Filter) released to support the tcpdump utility
• 2012: seccomp-bpf released, enabling BPF programs to allow or deny system calls made by user-space applications
• 2014: BPF evolved into extended BPF (eBPF), with support for maps, the bpf() system call, and a safety verifier
• 2015: eBPF backend merged into the LLVM compiler suite
• 2016: eBPF attached to XDP (eXpress Data Path)
• Programmable LSM via eBPF upstreamed by Google
Where does eBPF run?
eBPF Maps
The common meeting point between
user space and kernel
space: maps store state and
share information between the two.
So what difference
does eBPF make?
Service networking using Kube-Proxy
[Diagram recap: a packet arriving on the host's eth0 (192.168.0.10) traverses the iptables rules programmed by kube-proxy before being delivered via lxc0 to the pod's eth0 (172.16.1.10), where the Hello World app on port 8080 backs ClusterIP Service 10.96.12.11 (endpoint 172.16.1.10:8080).]
Service networking using eBPF
[Diagram build-up: in user space, eBPF code is written and compiled, then handed to the eBPF loader; in the kernel it passes through the eBPF verifier and the JIT compiler, and exchanges state with eBPF maps. The packet path now runs from the host's eth0 (192.168.0.10) via lxc0 straight to the pod's eth0 (172.16.1.10), reaching the Hello World app on port 8080 behind ClusterIP Service 10.96.12.11 (endpoint 172.16.1.10:8080) without traversing kube-proxy's iptables rules.]
Service networking using eBPF
© Xebia IT Architects 2023
[Final diagram: the full pipeline, eBPF code is compiled, passed through the eBPF loader, checked by the eBPF verifier, JIT-compiled in the kernel, and wired to eBPF maps, with the example program below attached at the network device.]

// eBPF code (an example)
SEC("to_netdev")
int handle(struct sk_buff *skb) {
    ...
    if (tcp->dport == 8080)
        redirect(lxc0);
    return DROP_PACKET;
}

SRE NL MeetUp - eBPF.pdf
