eBPF: Divulging The Hidden Superpower
If you are a performance engineer/network engineer or even security engineer, the
chance of you encountering eBPF technology in the future is very high. eBPF now
has a huge community of users, including big players like Meta, Google, Cloudflare,
and Netflix all using this tech in their daily operations.
By CoffeeBeans Consulting
Prelude
Let me start the blog with a real story. A year back, one of my friends called me
to discuss tech, which is a very common thing between us. We share the technical
challenges that we or our peers face at work, and these discussions lead to
informative and creative knowledge-sharing sessions. In one such discussion, he
described a specific challenge faced by his cousin, who works for a giant cloud
provider. The challenge was to dynamically block certain IPs that pose threats
such as DoS (denial-of-service) attacks. The application developer’s brain in me
impetuously replied that this should be handled at the firewall level, or that a
middleware could check the origin of each packet, maintain a blacklist of
malicious senders, and ignore their requests (yes, I come from a NodeJS and Go
background, so the first solution that struck me was a middleware). My friend
patiently explained the scale and performance at which this needed to run, which
was way beyond my comprehension. After a noob’s doubt-clearing session, we agreed
that the scale he wanted could only be achieved at the kernel level. I
sarcastically wished him luck writing a kernel patch and raising a PR, hoping the
OS maintainers would include it in an upcoming kernel release so that he could
use the feature once it shipped. In reply to my sarcasm, he shared a link to an
article about something called “eBPF” (extended Berkeley Packet Filter). I
skimmed through the article, and my ignorant mind came to realize that there are
amazing inventions in the tech world that I am unaware of.
With eBPF, you can inject code directly into the kernel without writing a
patch and waiting for the OS maintainers to approve it.
“RUN YOUR CUSTOM CODE DIRECTLY IN OS KERNEL” - LIZ RICE
“SUPERPOWERS FOR LINUX” - BRENDAN GREGG
History
eBPF came to life in 2014, introduced in Linux kernel 3.18, thereby unlocking
the God mode of the Linux kernel. The natural doubt anyone reading this blog would
have is about the name. If this is an “extended Berkeley Packet Filter”, then there
should be a BPF, a “Berkeley Packet Filter”. Well, you are right. The BSD Packet
Filter is not a new concept; it dates from the 90’s. This gem has been hiding under
the radar for years; the Xennials were true innovators. BPF was very basic, and its
only job was to filter packets at the kernel level, hence the name.
eBPF has come a long way from BPF: from a simple packet-filtering utility to
something people liken to a microservices architecture for the kernel, or a
microkernel-like way of extending it. Many of the top tech companies that work at
scale now use eBPF on a daily basis. The CNCF community nowadays lives and breathes
eBPF; if you are a DevOps engineer or sysadmin, you will have heard of Cilium and
Falco, both popular among Kubernetes users and both production tools built on top of
eBPF. In 2018, Linux kernel developers proposed bpfilter, which would replace the
kernel’s iptables implementation with an eBPF-based one (well, replacing iptables
with any solution would be better). The drawbacks of iptables are out of the scope
of this article; please go to the reference section for a well-written article
about them. Kubernetes has mostly used iptables for the following use cases:
kube-proxy, the component that implements Services and load balancing through DNAT iptables rules
Most CNI plugins, which use iptables for Network Policies
Cilium makes this more efficient by replacing iptables, whose performance
degrades as the number of rules grows. You can refer to the details here.
Program Execution Bozo’s Guide
To explain the importance of eBPF, I first need to explain how programs are
executed in Linux. I will try to explain it from a 1000 ft view for everyone.
NB: Windows user? Well, why are you even reading this article? You guys do not
have all these cool features.
Linux memory is divided into two regions:
1. Kernel space
2. User space
[Figure: kernel space vs. user space. Credits: slideshare]
The image itself explains the difference between these two. Every program that
you write ultimately reaches the hardware and OS resources through syscalls, which
are the kernel’s API. Take the example of opening a file in your favorite
programming language: it simply translates into an open syscall in the kernel.
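As a rough illustration of the file-opening example, the sketch below reads a file through Python's `os` module, whose functions are thin wrappers around the corresponding syscalls. The helper name `read_whole_file` and the temporary file are assumptions made up for this example, and the exact syscall used underneath (for instance `openat` rather than `open`) depends on the platform and C library.

```python
import os
import tempfile


def read_whole_file(path: str) -> bytes:
    """Read a file using the low-level os wrappers around syscalls."""
    fd = os.open(path, os.O_RDONLY)       # wraps the open/openat syscall
    try:
        chunks = []
        while True:
            chunk = os.read(fd, 4096)     # wraps the read syscall
            if not chunk:                 # empty result means end of file
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)                      # wraps the close syscall


# Create a small temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello from user space\n")
    demo_path = tmp.name

demo_data = read_whole_file(demo_path)
os.unlink(demo_path)
print(demo_data)
```

Running a script like this under a tracer such as `strace` on Linux is one way to see the calls that actually reach the kernel.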
When your application asks the kernel for something, a chunk of data in kernel
space is frequently copied into user space. This is necessary because operating
systems strictly partition the memory regions used by the kernel, making it
impossible to simply hand a user space program a pointer to some region of kernel
memory. This is known as “crossing the user/kernel boundary”. Because of the copy,
operations like these can have significant performance consequences.
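One hedged way to get a feel for the boundary crossings is to count how many read calls a program makes for the same file with different buffer sizes. The sketch below is only an illustration: `count_reads` is a hypothetical helper, the 64 KiB file and the buffer sizes are arbitrary choices, and it counts calls rather than measuring time.

```python
import os
import tempfile


def count_reads(path: str, buffer_size: int) -> int:
    """Return how many os.read() calls are needed to consume the file."""
    fd = os.open(path, os.O_RDONLY)
    calls = 0
    try:
        while True:
            chunk = os.read(fd, buffer_size)  # each call enters the kernel
            calls += 1
            if not chunk:                     # empty result means end of file
                break
    finally:
        os.close(fd)
    return calls


# Build a 64 KiB temporary file to read back.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 65536)
    demo_path = tmp.name

small_buffer_calls = count_reads(demo_path, 64)      # many small copies
large_buffer_calls = count_reads(demo_path, 65536)   # a few large copies
os.unlink(demo_path)
print(small_buffer_calls, large_buffer_calls)
```

The same bytes end up in user space either way, but the small buffer needs far more trips across the boundary, which is the kind of overhead that in-kernel processing avoids.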
While syscalls cover almost all cases, there are situations where they are not
sufficient, such as when we need kernel-level performance or want to write a new
driver. Depending on the OS maintainers to make patches for all these small use
cases is slow and impractical. This is where eBPF comes into the picture.
eBPF lets you write programs in user space that get packaged and injected
directly into the kernel. These programs run in a VM inside the kernel with a
limited instruction set, thus extending the capabilities of the base kernel.
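To connect this with the IP-blocking problem from the Prelude, the sketch below models the usual split between a user space side that updates a shared table and a kernel-side hook that consults it for every packet. This is plain Python, not eBPF: `blocked`, `block_ip`, `unblock_ip`, and `xdp_filter` are hypothetical names, a dict stands in for an eBPF map, and only the verdict values mirror the kernel's `XDP_DROP` and `XDP_PASS` action codes.

```python
import ipaddress

# Verdict values mirroring the kernel's XDP action codes.
XDP_DROP = 1
XDP_PASS = 2

# Stand-in for an eBPF hash map shared between user space and the kernel:
# key = IPv4 source address as an integer, value = packets dropped so far.
blocked = {}


def _key(ip: str) -> int:
    """Convert a dotted IPv4 string into the integer used as the map key."""
    return int(ipaddress.IPv4Address(ip))


def block_ip(ip: str) -> None:
    """User space side: add an address to the blocklist at runtime."""
    blocked.setdefault(_key(ip), 0)


def unblock_ip(ip: str) -> None:
    """User space side: remove an address from the blocklist."""
    blocked.pop(_key(ip), None)


def xdp_filter(src_ip: str) -> int:
    """Kernel-side hook (simulated): decide the fate of one packet."""
    key = _key(src_ip)
    if key in blocked:
        blocked[key] += 1      # count the drop, as a per-IP statistic
        return XDP_DROP
    return XDP_PASS


block_ip("203.0.113.7")
print(xdp_filter("203.0.113.7") == XDP_DROP)
print(xdp_filter("198.51.100.1") == XDP_PASS)
unblock_ip("203.0.113.7")
print(xdp_filter("203.0.113.7") == XDP_PASS)
```

The point of the design is that the blocklist can change while the filter keeps running, with no kernel patch or reboot involved.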
eBPF Dissected
eBPF is the provision to run custom code in the kernel for various
purposes, such as
Observability (tracing)
Debugging
Firewalling
Load Balancing
Network related activity
Anyone who has worked on tracing programs in the kernel knows how difficult it
is. The half-baked utilities available on Linux systems are not enough to
profile complex systems or even to extend the perf tooling.
eBPF is event-driven, which means it gets triggered in the following scenarios:
A system call
Function entry/exit
When a packet enters or leaves
kprobes or uprobes
The programs are written in a language called restricted C, which is a limited
subset of C. A toolchain such as BCC (the BPF Compiler Collection), which uses
LLVM/Clang under the hood, converts this into bytecode, which is loaded into the
kernel for execution. A verifier runs when the bytecode is loaded, before it
executes, to ensure there are no infinite loops or other never-ending operations
that could crash or hang the kernel.
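The idea of rejecting programs that might never terminate can be sketched with a toy checker. The code below is an assumption-laden model, not the kernel's verifier: the instruction format, the opcode names, the `MAX_INSNS` limit, and the `verify` function are all invented for illustration. It accepts a program only if every jump goes forward and the last instruction is an exit, which guarantees the program counter always advances. The real verifier does much more, and newer kernels also accept loops they can prove are bounded.

```python
from typing import List, Tuple

Instruction = Tuple[str, int]   # (opcode, operand); toy format, not real eBPF
MAX_INSNS = 4096                # toy size limit chosen for this sketch


def verify(program: List[Instruction]) -> Tuple[bool, str]:
    """Return (accepted, reason) for a toy program."""
    if not program:
        return False, "empty program"
    if len(program) > MAX_INSNS:
        return False, "program too long"
    if program[-1][0] != "exit":
        return False, "last instruction must be exit"
    for pc, (op, arg) in enumerate(program):
        if op in ("jmp", "jeq"):
            target = pc + 1 + arg   # offset is relative to the next instruction
            if target <= pc:
                return False, f"back-edge at instruction {pc}"
            if target >= len(program):
                return False, f"jump out of range at instruction {pc}"
    return True, "ok"


straight_line = [("mov", 0), ("jeq", 1), ("mov", 1), ("exit", 0)]
endless_loop = [("mov", 0), ("jmp", -2), ("exit", 0)]

print(verify(straight_line))
print(verify(endless_loop))
```

Here `straight_line` only jumps forward and is accepted, while `endless_loop` jumps back to its first instruction and is rejected before it could ever run.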
Additional Trick Up Your Sleeve
eBPF is indeed a powerful tool to have up your sleeve. When working on
high-performance projects, tweaking packets or extending the tracing
functionality helps give you better observability of what’s happening in the
system. Even though the chance of an application developer encountering eBPF at
present is slim, if you are a performance engineer, network engineer, or even
security engineer, the chance of you encountering eBPF in the future is going to
shoot up to the sky.
There are some considerations when writing eBPF programs. There have been several
privilege escalation attacks that leverage eBPF, since it runs with kernel
privileges. eBPF programs can also be a powerful aid when exploiting kernel memory
vulnerabilities. Qualys published a detailed writeup of exploiting such a
vulnerability, which you can refer to from here.
Conclusion
As the Spider-Man movies put it, “With great power comes great responsibility”.
When you unlock the God mode of Linux, you are on your own: the guards that
protected your program from corrupting the whole system are not available now.
eBPF suits specific use cases; it is not the Swiss Army knife for all your
performance issues. The community is pretty huge now, including big players like
Meta, Google, Cloudflare, and Netflix, all using the tech daily. The tech has
loads of potential to grow, and recent years have seen dedicated conferences for
eBPF enthusiasts.
To know more visit our remaining pages:-
Website:- https://coffeebeans.io/
Blogs:- https://coffeebeans.io/blogs