SlideShare a Scribd company logo
1 of 24
Download to read offline
Rethinking the
Linux kernel
Thomas Graf
Cilium Project, Co-Founder & CTO, Isovalent
2Cameron Askin: Cameron’s World
Remember
GeoCities?
3
Markup Only (HTML)
What enabled this evolution?
Programmable Platform
Programmability Essentials
4
Untrusted code runs
in the browser of the
user.
→ Sandboxing
Allow evolution of
logic without requiring
to constantly ship new
browser versions.
→ Deploy anytime with
seamless upgrades
Programmability must
be provided with
minimal overhead.
→ Native Execution
(JIT compiler)
Safety
Continuous
Delivery
Performance
Kernel Architecture
5
TCP/IPVFS
Linux
Kernel
Network DeviceBlock Device
AdminProcess Process
Network
Hardware
Storage
Hardware
Configuration
(sysfs,netlink,procfs,...)
Sockets
recvmsg()sendmsg()
Syscall
read()
File Descriptor
write()
Syscall
User
Space
HW
Cons:
● You likely need to ship a different
module for each kernel version
● Might crash your kernel
● Change kernel source code
● Expose configuration API
● Wait 5 years for your users
to upgrade
6
Kernel Development 101
● Write kernel module
● Every kernel release will break it
Cons:
Option 1
Native Support
Option 2
Kernel Module
How about we add
JavaScript-like capabilities
to the Linux Kernel?
7
8
9
Process
Scheduler
execve()
Linux
Kernel
Syscall
eBPF Runtime
10
Controller
Sockets
bpf()
Linux
Kernel
TCP/IP
Network Device
recvmsg()sendmsg()
Process
Syscall
Verifier
JIT Compiler
BPF
Program
BPF
Program
BPF
Program
approved
x86_64
Syscall
Safety & Security
The verifier will reject any
unsafe program and
provides a sandbox.
Continuous Delivery
Programs can be exchanged
without disrupting workloads.
Performance
The JIT compiler ensures
native execution
performance.
bytecode
eBPF Hooks
11
Process
Storage
Hardware
Sockets
TCP/IP
Network Device
read()
File Descriptor
VFS
Block Device
write()
Linux
Kernel
Network
Hardware
Process
SyscallSyscall
Where can you hook? kernel functions (kprobes), userspace functions (uprobes), system calls,
fentry/fexit, tracepoints, network devices (tc/xdp), network routes, TCP congestion algorithms,
sockets (data level)
recvmsg()sendmsg()
eBPF Maps
12
Controller
Sockets
Linux
Kernel
TCP/IP
Network Device
Process
Syscall Syscall
Admin
BPF
Map
Syscall
Map Types:
- Hash tables, Arrays
- LRU (Least Recently Used)
- Ring Buffer
- Stack Trace
- LPM (Longest Prefix match)
What are Maps used for?
● Program state
● Program configuration
● Share data between programs
● Share state, metrics, and
statistics with user space
recvmsg()sendmsg()
eBPF Helpers
13
Sockets
Linux
Kernel
TCP/IP
Network Device
Process
Syscall
What helpers exist?
● Random numbers
● Get current time
● Map access
● Get process/cgroup context
● Manipulate network packets and
forwarding
● Access socket data
● Perform tail call
● Access process stack
● Access syscall arguments
● ...
[...]
num = bpf_get_prandom_u32();
[...]
recvmsg()sendmsg()
eBPF Tail and Function Calls
14
Linux
Kernel
What are Tail Calls used for?
● Chain programs together
● Split programs into independent
logical components
● Make BPF programs composable
What are Functions Calls used for?
● Reuse functionality inside of a
program
● Reduce program size (avoid
inlining)
15
Community
287 contributors:
(Jan 2016 to Jan 2020)
● 466 Daniel Borkmann (Cilium; maintainer)
● 290 Andrii Nakryiko (Facebook)
● 279 Alexei Starovoitov (Facebook; maintainer)
● 217 Jakub Kicinski (Facebook)
● 173 Yonghong Song (Facebook)
● 168 Martin KaFai Lau (Facebook)
● 159 Stanislav Fomichev (Google)
● 148 Quentin Monnet (Cilium)
● 148 John Fastabend (Cilium)
● 118 Jesper Dangaard Brouer (Red Hat)
● [...]
16
eBPF Projects
Katran
High-performance L4
Loadbalancer
facebookincubator/katran
Android & Security
kernel runtime security
instrumentation (KRSI),
Android BPF loader,
eBPF traffic monitor
bcc, bpftrace
Performance
troubleshooting &
profiling
iovisor/bcc
Traffic Optimization
DDoS mitigation, QoS,
traffic optimization,
load balancer
cloudflare/bpftools
Falco
Container runtime
security, behavior
analysis
falcosecurity/falco
Cilium
Networking, security and
load-balancing for k8s
cilium/cilium
et al.
Tracing & Profiling with
17
Sockets
Linux
Kernel
TCP/IP
Process
Syscall
Verifier
JIT Compiler
Syscall
BPF
Program
Python
BCC
BPF
Maps
BCC:
github.com/iovisor/bcc
recvmsg()sendmsg()
# tcptop
Tracing... Output every 1 secs. Hit Ctrl-C to end
<screen clears>
19:46:24 loadavg: 1.86 2.67 2.91 3/362 16681
PID COMM LADDR RADDR RX_KB TX_KB
16648 16648 100.66.3.172:22 100.127.69.165:6684 1 0
16647 sshd 100.66.3.172:22 100.127.69.165:6684 0 2149
14374 sshd 100.66.3.172:22 100.127.69.165:25219 0 0
14458 sshd 100.66.3.172:22 100.127.69.165:7165 0 0
bpftrace
bpftrace - DTrace for Linux
18
File Descriptors
Linux
Kernel
VFS
Process
Syscall
Verifier
JIT Compiler
Syscall
bpftrace
Program
BPF
Maps
bpftrace:
github.com/iovisor/bpftrace
# bpftrace -e 'kprobe:do_sys_open { printf("%s: %sn", comm, str(arg1)) }'
Attaching 1 probe...
git: .git/objects/da
git: .git/objects/pack
git: /etc/localtime
systemd-journal: /var/log/journal/72d0774c88dc4943ae3d34ac356125dd
DNS Res~ver #15: /etc/hosts
^C
open()
Networking, load-balancing
and security for Kubernetes
19
Sockets
Linux
Kernel
TCP/IP
Container
Syscall
Verifier
JIT Compiler
Syscall
Clium
BPF
Maps
Network Device
Sockets
Container
Syscall
Network Device
Network
Hardware
TCP/IP
Kubernetes
20
Container Networking
● Highly efficient and flexible networking
● Routing, Overlay, Cloud-provider native
● IPv4, IPv6, NAT46
● Multi cluster routing
Service Load balancing:
● Highly scalable L3-L4 load balancing
● Kubernetes services (replaces
kube-proxy)
● Multi-cluster
● Service affinity (prefer zones)
Container Security
● Identity-based network security
● API-aware security (HTTP, gRPC, Kafka,
Cassandra, memcached, ..)
● DNS-aware policies
● Encryption
● SSL data visibility via kTLS
Visibility
● Service topology map & live visualization
● Advanced network metrics & alerting
Servicemesh:
● Minimize overhead when injecting
servicemesh sidecar proxies
● Istio integration
21
Hubble: eBPF Visibility for Kubernetes
# hubble observe --since=1m -t l7 -j 
| jq 'select(.l7.dns.rcode==3) | .destination.namespace + "/" + .destination.pod_name' 
| sort | uniq -c | sort -r
42 "starwars/jar-jar-binks-6f5847c97c-qmggv"
Development
Program Maps
Runtime
Go Development Toolchain
22
clang -target bpf
Sockets
Linux
Kernel
TCP/IP
recvmsg()sendmsg()
Process
Verifier
JIT Compiler
Syscall
BPF
Program
C source
BPF
Program
bytecode
BPF
Map
Syscall
Go Library
Go Library: https://github.com/cilium/ebpf
23
Outlook: Future of
is turning the Linux
kernel into a microkernel.
● An increasing amount of new kernel
functionality is implemented with eBPF.
● 100% modular and composable.
● New additions can evolve at a rapid pace.
Much quicker than normal kernel
development.
Example: The linux kernel is not aware of
containers and microservices (it only knows
about namespaces). Cilium is making the
Linux kernel container and Kubernetes
aware.
could enable the Linux kernel
hotpatching we always dreamed about.
Problem:
● Linux kernel vulnerability requires to
patch kernel.
● Rebooting 20’000 servers takes a very
long time without risking extensive
downtime.
Function
Function
Function
Hotfix
Linux
Kernel
Thank You
eBPF Maintainers
Daniel Borkmann, Alexei Starovoitov
Cilium Team
André Martins, Jarno Rajahalme, Joe Stringer,
John Fastabend, Maciej Kwiek, Martynas
Pumputis, Paul Chaignon, Quentin Monnet,
Ray Bejjani, Tobias Klauser
Facebook Team
Andrii Nakryiko, Andrey Ignatov, Jakub
Kicinski, Martin KaFai Lau, Roman Gushchin,
Song Liu, Yonghong Song
Google Team
Chenbo Feng, KP Singh, Lorenzo Colitti,
Maciej Żenczykowski, Stanislav Fomichev,
BCC & bpftrace
Alastair Robertson, Brendan Gregg, Brenden
Blanco
Kernel Team
Björn Töpel, David S. Miller, Edward Cree,
Jesper Brouer, Toke Høiland-Jørgensen
24
● BPF Getting Started Guide
BPF and XDP Reference Guide
● Cilium
github.com/cilium/cilium
● Twitter
@ciliumproject
● Contact the speaker
@tgraf__
All images: Pixabay

More Related Content

What's hot

BPF: Tracing and more
BPF: Tracing and moreBPF: Tracing and more
BPF: Tracing and more
Brendan Gregg
 
Staring into the eBPF Abyss
Staring into the eBPF AbyssStaring into the eBPF Abyss
Staring into the eBPF Abyss
Sasha Goldshtein
 

What's hot (20)

Introduction to eBPF
Introduction to eBPFIntroduction to eBPF
Introduction to eBPF
 
Cilium - Container Networking with BPF & XDP
Cilium - Container Networking with BPF & XDPCilium - Container Networking with BPF & XDP
Cilium - Container Networking with BPF & XDP
 
Meet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingMeet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracing
 
Using eBPF for High-Performance Networking in Cilium
Using eBPF for High-Performance Networking in CiliumUsing eBPF for High-Performance Networking in Cilium
Using eBPF for High-Performance Networking in Cilium
 
Replacing iptables with eBPF in Kubernetes with Cilium
Replacing iptables with eBPF in Kubernetes with CiliumReplacing iptables with eBPF in Kubernetes with Cilium
Replacing iptables with eBPF in Kubernetes with Cilium
 
UM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of Software
 
Accelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux KernelAccelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux Kernel
 
EBPF and Linux Networking
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux Networking
 
eBPF Workshop
eBPF WorkshopeBPF Workshop
eBPF Workshop
 
eBPF maps 101
eBPF maps 101eBPF maps 101
eBPF maps 101
 
BPF / XDP 8월 세미나 KossLab
BPF / XDP 8월 세미나 KossLabBPF / XDP 8월 세미나 KossLab
BPF / XDP 8월 세미나 KossLab
 
BPF: Tracing and more
BPF: Tracing and moreBPF: Tracing and more
BPF: Tracing and more
 
BPF - in-kernel virtual machine
BPF - in-kernel virtual machineBPF - in-kernel virtual machine
BPF - in-kernel virtual machine
 
Xdp and ebpf_maps
Xdp and ebpf_mapsXdp and ebpf_maps
Xdp and ebpf_maps
 
Staring into the eBPF Abyss
Staring into the eBPF AbyssStaring into the eBPF Abyss
Staring into the eBPF Abyss
 
Kernel Recipes 2017 - EBPF and XDP - Eric Leblond
Kernel Recipes 2017 - EBPF and XDP - Eric LeblondKernel Recipes 2017 - EBPF and XDP - Eric Leblond
Kernel Recipes 2017 - EBPF and XDP - Eric Leblond
 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking Explained
 
eBPF Perf Tools 2019
eBPF Perf Tools 2019eBPF Perf Tools 2019
eBPF Perf Tools 2019
 
Kubernetes Networking with Cilium - Deep Dive
Kubernetes Networking with Cilium - Deep DiveKubernetes Networking with Cilium - Deep Dive
Kubernetes Networking with Cilium - Deep Dive
 
Cloud Native Networking & Security with Cilium & eBPF
Cloud Native Networking & Security with Cilium & eBPFCloud Native Networking & Security with Cilium & eBPF
Cloud Native Networking & Security with Cilium & eBPF
 

Similar to eBPF - Rethinking the Linux Kernel

20141111_SOS3_Gallo
20141111_SOS3_Gallo20141111_SOS3_Gallo
20141111_SOS3_Gallo
Andrea Gallo
 

Similar to eBPF - Rethinking the Linux Kernel (20)

ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!
 
Intro to open source telemetry linux con 2016
Intro to open source telemetry   linux con 2016Intro to open source telemetry   linux con 2016
Intro to open source telemetry linux con 2016
 
Comprehensive XDP Off‌load-handling the Edge Cases
Comprehensive XDP Off‌load-handling the Edge CasesComprehensive XDP Off‌load-handling the Edge Cases
Comprehensive XDP Off‌load-handling the Edge Cases
 
20141111_SOS3_Gallo
20141111_SOS3_Gallo20141111_SOS3_Gallo
20141111_SOS3_Gallo
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
 
Security Monitoring with eBPF
Security Monitoring with eBPFSecurity Monitoring with eBPF
Security Monitoring with eBPF
 
Using eBPF to Measure the k8s Cluster Health
Using eBPF to Measure the k8s Cluster HealthUsing eBPF to Measure the k8s Cluster Health
Using eBPF to Measure the k8s Cluster Health
 
Kernel bug hunting
Kernel bug huntingKernel bug hunting
Kernel bug hunting
 
Coscup2018 itri android-in-cloud
Coscup2018 itri android-in-cloudCoscup2018 itri android-in-cloud
Coscup2018 itri android-in-cloud
 
Dataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and toolsDataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and tools
 
Seminar Accelerating Business Using Microservices Architecture in Digital Age...
Seminar Accelerating Business Using Microservices Architecture in Digital Age...Seminar Accelerating Business Using Microservices Architecture in Digital Age...
Seminar Accelerating Business Using Microservices Architecture in Digital Age...
 
Performance Optimization of SPH Algorithms for Multi/Many-Core Architectures
Performance Optimization of SPH Algorithms for Multi/Many-Core ArchitecturesPerformance Optimization of SPH Algorithms for Multi/Many-Core Architectures
Performance Optimization of SPH Algorithms for Multi/Many-Core Architectures
 
Feedback on Big Compute & HPC on Windows Azure
Feedback on Big Compute & HPC on Windows AzureFeedback on Big Compute & HPC on Windows Azure
Feedback on Big Compute & HPC on Windows Azure
 
Environment management in a continuous delivery world (3)
Environment management in a continuous delivery world (3)Environment management in a continuous delivery world (3)
Environment management in a continuous delivery world (3)
 
Container based android
Container based androidContainer based android
Container based android
 
Leveraging the Power of containerd Events - Evan Hazlett
Leveraging the Power of containerd Events - Evan HazlettLeveraging the Power of containerd Events - Evan Hazlett
Leveraging the Power of containerd Events - Evan Hazlett
 
Practical virtual network functions with Snabb (SDN Barcelona VI)
Practical virtual network functions with Snabb (SDN Barcelona VI)Practical virtual network functions with Snabb (SDN Barcelona VI)
Practical virtual network functions with Snabb (SDN Barcelona VI)
 
Cilium - Fast IPv6 Container Networking with BPF and XDP
Cilium - Fast IPv6 Container Networking with BPF and XDPCilium - Fast IPv6 Container Networking with BPF and XDP
Cilium - Fast IPv6 Container Networking with BPF and XDP
 
Porting Android
Porting AndroidPorting Android
Porting Android
 
Using VPP and SRIO-V with Clear Containers
Using VPP and SRIO-V with Clear ContainersUsing VPP and SRIO-V with Clear Containers
Using VPP and SRIO-V with Clear Containers
 

More from Thomas Graf

SDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center NetworkingSDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center Networking
Thomas Graf
 

More from Thomas Graf (14)

Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and SecurityCilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
 
Cilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPFCilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPF
 
Cilium - Network security for microservices
Cilium - Network security for microservicesCilium - Network security for microservices
Cilium - Network security for microservices
 
Linux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network SecurityLinux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network Security
 
BPF: Next Generation of Programmable Datapath
BPF: Next Generation of Programmable DatapathBPF: Next Generation of Programmable Datapath
BPF: Next Generation of Programmable Datapath
 
Cilium - BPF & XDP for containers
Cilium - BPF & XDP for containersCilium - BPF & XDP for containers
Cilium - BPF & XDP for containers
 
LinuxCon 2015 Stateful NAT with OVS
LinuxCon 2015 Stateful NAT with OVSLinuxCon 2015 Stateful NAT with OVS
LinuxCon 2015 Stateful NAT with OVS
 
LinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking Walkthrough
 
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
 
2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful Services2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful Services
 
Open vSwitch - Stateful Connection Tracking & Stateful NAT
Open vSwitch - Stateful Connection Tracking & Stateful NATOpen vSwitch - Stateful Connection Tracking & Stateful NAT
Open vSwitch - Stateful Connection Tracking & Stateful NAT
 
The Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RCThe Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RC
 
SDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center NetworkingSDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center Networking
 
DevConf 2014 Kernel Networking Walkthrough
DevConf 2014   Kernel Networking WalkthroughDevConf 2014   Kernel Networking Walkthrough
DevConf 2014 Kernel Networking Walkthrough
 

Recently uploaded

TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
mohitmore19
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
VishalKumarJha10
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
VictorSzoltysek
 

Recently uploaded (20)

VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 

eBPF - Rethinking the Linux Kernel

  • 1. Rethinking the Linux kernel Thomas Graf Cilium Project, Co-Founder & CTO, Isovalent
  • 2. 2Cameron Askin: Cameron’s World Remember GeoCities?
  • 3. 3 Markup Only (HTML) What enabled this evolution? Programmable Platform
  • 4. Programmability Essentials 4 Untrusted code runs in the browser of the user. → Sandboxing Allow evolution of logic without requiring to constantly ship new browser versions. → Deploy anytime with seamless upgrades Programmability must be provided with minimal overhead. → Native Execution (JIT compiler) Safety Continuous Delivery Performance
  • 5. Kernel Architecture 5 TCP/IPVFS Linux Kernel Network DeviceBlock Device AdminProcess Process Network Hardware Storage Hardware Configuration (sysfs,netlink,procfs,...) Sockets recvmsg()sendmsg() Syscall read() File Descriptor write() Syscall User Space HW
  • 6. Cons: ● You likely need to ship a different module for each kernel version ● Might crash your kernel ● Change kernel source code ● Expose configuration API ● Wait 5 years for your users to upgrade 6 Kernel Development 101 ● Write kernel module ● Every kernel release will break it Cons: Option 1 Native Support Option 2 Kernel Module
  • 7. How about we add JavaScript-like capabilities to the Linux Kernel? 7
  • 8. 8
  • 10. eBPF Runtime 10 Controller Sockets bpf() Linux Kernel TCP/IP Network Device recvmsg()sendmsg() Process Syscall Verifier JIT Compiler BPF Program BPF Program BPF Program approved x86_64 Syscall Safety & Security The verifier will reject any unsafe program and provides a sandbox. Continuous Delivery Programs can be exchanged without disrupting workloads. Performance The JIT compiler ensures native execution performance. bytecode
  • 11. eBPF Hooks 11 Process Storage Hardware Sockets TCP/IP Network Device read() File Descriptor VFS Block Device write() Linux Kernel Network Hardware Process SyscallSyscall Where can you hook? kernel functions (kprobes), userspace functions (uprobes), system calls, fentry/fexit, tracepoints, network devices (tc/xdp), network routes, TCP congestion algorithms, sockets (data level) recvmsg()sendmsg()
  • 12. eBPF Maps 12 Controller Sockets Linux Kernel TCP/IP Network Device Process Syscall Syscall Admin BPF Map Syscall Map Types: - Hash tables, Arrays - LRU (Least Recently Used) - Ring Buffer - Stack Trace - LPM (Longest Prefix match) What are Maps used for? ● Program state ● Program configuration ● Share data between programs ● Share state, metrics, and statistics with user space recvmsg()sendmsg()
  • 13. eBPF Helpers 13 Sockets Linux Kernel TCP/IP Network Device Process Syscall What helpers exist? ● Random numbers ● Get current time ● Map access ● Get process/cgroup context ● Manipulate network packets and forwarding ● Access socket data ● Perform tail call ● Access process stack ● Access syscall arguments ● ... [...] num = bpf_get_prandom_u32(); [...] recvmsg()sendmsg()
  • 14. eBPF Tail and Function Calls 14 Linux Kernel What are Tail Calls used for? ● Chain programs together ● Split programs into independent logical components ● Make BPF programs composable What are Functions Calls used for? ● Reuse functionality inside of a program ● Reduce program size (avoid inlining)
  • 15. 15 Community 287 contributors: (Jan 2016 to Jan 2020) ● 466 Daniel Borkmann (Cilium; maintainer) ● 290 Andrii Nakryiko (Facebook) ● 279 Alexei Starovoitov (Facebook; maintainer) ● 217 Jakub Kicinski (Facebook) ● 173 Yonghong Song (Facebook) ● 168 Martin KaFai Lau (Facebook) ● 159 Stanislav Fomichev (Google) ● 148 Quentin Monnet (Cilium) ● 148 John Fastabend (Cilium) ● 118 Jesper Dangaard Brouer (Red Hat) ● [...]
  • 16. 16 eBPF Projects Katran High-performance L4 Loadbalancer facebookincubator/katran Android & Security kernel runtime security instrumentation (KRSI), Android BPF loader, eBPF traffic monitor bcc, bpftrace Performance troubleshooting & profiling iovisor/bcc Traffic Optimization DDoS mitigation, QoS, traffic optimization, load balancer cloudflare/bpftools Falco Container runtime security, behavior analysis falcosecurity/falco Cilium Networking, security and load-balancing for k8s cilium/cilium et al.
  • 17. Tracing & Profiling with 17 Sockets Linux Kernel TCP/IP Process Syscall Verifier JIT Compiler Syscall BPF Program Python BCC BPF Maps BCC: github.com/iovisor/bcc recvmsg()sendmsg() # tcptop Tracing... Output every 1 secs. Hit Ctrl-C to end <screen clears> 19:46:24 loadavg: 1.86 2.67 2.91 3/362 16681 PID COMM LADDR RADDR RX_KB TX_KB 16648 16648 100.66.3.172:22 100.127.69.165:6684 1 0 16647 sshd 100.66.3.172:22 100.127.69.165:6684 0 2149 14374 sshd 100.66.3.172:22 100.127.69.165:25219 0 0 14458 sshd 100.66.3.172:22 100.127.69.165:7165 0 0
  • 18. bpftrace bpftrace - DTrace for Linux 18 File Descriptors Linux Kernel VFS Process Syscall Verifier JIT Compiler Syscall bpftrace Program BPF Maps bpftrace: github.com/iovisor/bpftrace # bpftrace -e 'kprobe:do_sys_open { printf("%s: %sn", comm, str(arg1)) }' Attaching 1 probe... git: .git/objects/da git: .git/objects/pack git: /etc/localtime systemd-journal: /var/log/journal/72d0774c88dc4943ae3d34ac356125dd DNS Res~ver #15: /etc/hosts ^C open()
  • 19. Networking, load-balancing and security for Kubernetes 19 Sockets Linux Kernel TCP/IP Container Syscall Verifier JIT Compiler Syscall Clium BPF Maps Network Device Sockets Container Syscall Network Device Network Hardware TCP/IP Kubernetes
  • 20. 20 Container Networking ● Highly efficient and flexible networking ● Routing, Overlay, Cloud-provider native ● IPv4, IPv6, NAT46 ● Multi cluster routing Service Load balancing: ● Highly scalable L3-L4 load balancing ● Kubernetes services (replaces kube-proxy) ● Multi-cluster ● Service affinity (prefer zones) Container Security ● Identity-based network security ● API-aware security (HTTP, gRPC, Kafka, Cassandra, memcached, ..) ● DNS-aware policies ● Encryption ● SSL data visibility via kTLS Visibility ● Service topology map & live visualization ● Advanced network metrics & alerting Servicemesh: ● Minimize overhead when injecting servicemesh sidecar proxies ● Istio integration
  • 21. 21 Hubble: eBPF Visibility for Kubernetes # hubble observe --since=1m -t l7 -j | jq 'select(.l7.dns.rcode==3) | .destination.namespace + "/" + .destination.pod_name' | sort | uniq -c | sort -r 42 "starwars/jar-jar-binks-6f5847c97c-qmggv"
  • 22. Development Program Maps Runtime Go Development Toolchain 22 clang -target bpf Sockets Linux Kernel TCP/IP recvmsg()sendmsg() Process Verifier JIT Compiler Syscall BPF Program C source BPF Program bytecode BPF Map Syscall Go Library Go Library: https://github.com/cilium/ebpf
  • 23. 23 Outlook: Future of is turning the Linux kernel into a microkernel. ● An increasing amount of new kernel functionality is implemented with eBPF. ● 100% modular and composable. ● New additions can evolve at a rapid pace. Much quicker than normal kernel development. Example: The linux kernel is not aware of containers and microservices (it only knows about namespaces). Cilium is making the Linux kernel container and Kubernetes aware. could enable the Linux kernel hotpatching we always dreamed about. Problem: ● Linux kernel vulnerability requires to patch kernel. ● Rebooting 20’000 servers takes a very long time without risking extensive downtime. Function Function Function Hotfix Linux Kernel
  • 24. Thank You eBPF Maintainers Daniel Borkmann, Alexei Starovoitov Cilium Team André Martins, Jarno Rajahalme, Joe Stringer, John Fastabend, Maciej Kwiek, Martynas Pumputis, Paul Chaignon, Quentin Monnet, Ray Bejjani, Tobias Klauser Facebook Team Andrii Nakryiko, Andrey Ignatov, Jakub Kicinski, Martin KaFai Lau, Roman Gushchin, Song Liu, Yonghong Song Google Team Chenbo Feng, KP Singh, Lorenzo Colitti, Maciej Żenczykowski, Stanislav Fomichev, BCC & bpftrace Alastair Robertson, Brendan Gregg, Brenden Blanco Kernel Team Björn Töpel, David S. Miller, Edward Cree, Jesper Brouer, Toke Høiland-Jørgensen 24 ● BPF Getting Started Guide BPF and XDP Reference Guide ● Cilium github.com/cilium/cilium ● Twitter @ciliumproject ● Contact the speaker @tgraf__ All images: Pixabay