SlideShare a Scribd company logo
1 of 42
Download to read offline
The story of BPF
A practical guide to land patches
• What BPF stands for?
• Does it matter ?
• The name given to an instruction set 30 years ago by Steven McCanne and
Van Jacobson.
• Little they knew that in 2011 a startup decides to revolutionize
Software Defined Networking.
• Physical -> Virtual
• Servers -> VMs
• Networking gear -> virtual routers, switches, firewalls
• Virtual Machine
• Technology: hypervisor
• KVM, QEMU
• Virtual firewall, Virtual Router, Virtual Switch
• Technology: iovisor
One physical server:
• 5 VMs
• 1 router
• 2 switches
• 5 firewalls
Traditional approach
• VM -> kvm.ko
• Virtual router -> vrouter.ko
• Virtual switch -> vswitch.ko
• Virtual firewall -> vfirewall.ko
PLUMgrid’s solution v1
• iovisor.ko
• switch, router, firewall – binary blobs of x86 code
• pushed to a host by a remote controller
• Including 3rd party NAT, packet captures, etc
• What can go wrong?
• After 4Gbyte of networking traffic the kernel would crash
• 32-bit overflow ?
• Race condition ?
• What can go wrong?
arch/x86/Makefile: KBUILD_CFLAGS += -mno-red-zone
PLUMgrid’s solution v2
• Verify x86 code
• The verifier was born.
• Verification pain points with x86 asm
• Lots of ways to compute an address.
• Lots of memory access instructions.
• Solution: reduced x86 instruction set.
• Hack GCC x86 backend.
• The first iovisor.ko had the verifier and no JIT.
PLUMgrid’s solution v3
• New instruction set (x86 like)
• GCC backend that emits binary code
• iovisor.ko
• The verifier for this instruction set
• JIT to x86
• No interpreter
How to upstream iovisor.ko ?
• Talk to key people when possible
• New instruction set is scary to compiler folks
• Even scarier to kernel maintainers
• Solution: make it look familiar
Make it look familiar
• Is there an instruction set in the kernel with similar properties?
• BPF, iptables, netfilter tables, inet_diag
• Make new instruction set look as close as possible to BPF
• Reuse opcode encoding and 8-byte size of insn
• Call it ‘extended’ BPF
Next steps
• Read netdev@vger mailing list for 6 month
• Understand the land
• Identify key people
• And post the jumbo patch? No.
Build reputation
• Find lockdep report in your area of interest.
My 1st kernel patch:
Move module_free() of x86 JITed memory into a worker.
Keep building reputation…
my kernel commit #5
Finally post eBPF patchset
Did it work?
Finally post eBPF patchset
Nope. It was rejected.
What is the biggest maintainer’s concern?
UAPI !
Need a plan B for eBPF
Add eBPF without exposing it in UAPI
How?
Need a plan B for eBPF
Add eBPF without exposing it in UAPI
Answer: Make existing code faster
Rewrite existing BPF interpreter
Thankfully it was easy to make it 2 times faster.
10% of the speedup came from eBPF instruction set itself.
90% of the speedup from jump-threaded implementation.
That’s how ‘internal BPF’ was created.
Need to disambiguate two BPFs.
Daniel Borkmann came up with a name ‘classic BPF’.
The state of BPF in May 2014:
• cBPF converter to iBPF (internal BPF)
• Interpreter that runs iBPF
• x86, sparc, arm JIT compilers from iBPF to native code
eBPF doesn’t exist yet. There is no verifier either.
Where to apply iBPF ‘engine’ ?
The concepts of the verifier, maps, helpers were proposed.
Programs suppose to run from netif_receive_skb.
The networking use case still struggles.
Arguments against:
- [ei]BPF instruction set is not extensible. Should be using TLV ?
- u8 opcode looks small. eBPF 2.0 will be coming ?
- The verifier is not supported by static analysis theory.
- It bypasses networking stack.
If the mountain will not come to Mohammed…
Strategy: Compromise on networking, pivot eBPF into tracing.
Strategy: Make it look familiar.
F - filter.
Proposal to ‘filter’ perf events.
Reuse verifier, maps, helpers concepts, but instead of network stack
execute programs from perf events and kprobes.
Strategy: Make existing code faster.
Demonstrate that BPF tracing ‘filter’ is faster than predicate tree walker.
Demonstrate that BPF TC ‘classifier’ is faster than TC u32 classifier.
Sad trade-off: clean design vs upstreamability.
Finally on September 26, 2014
eBPF is learning to walk.
89aa075832b0 (net: sock: allow eBPF programs to be attached to sockets, 2014-12-01)
e2e9b6541dd4 (cls_bpf: add initial eBPF support for programmable classifiers, 2015-03-01)
2541517c32be (tracing, perf: Implement BPF programs attached to kprobes, 2015-03-25)
Are we done?
Are we done?
Kernel was just the beginning.
Landing new backend in LLVM was just as difficult.
LLVM community
• Most developers have direct write access
• Anyone can revert anyone else’s commit
• s/MAINTAINERS/CODE_OWNERS.TXT/
• Back then LLVM was using SVN
• Phabricator for diffs
• C++ in CamelStyle
LLVM community
• No UAPI concerns
• Compiler internals are changing a lot
• Backward incompatible backend changes is not a concern
• Kernel UAPI doesn’t justify or restrict LLVM choices
• Continuous integration and testing is mandatory
• Build bots run tests right after diff lands
• Backends have to contribute build bots
• Many operating systems
• Approved diffs might get reverted and re-landed many times
• Monthly meetup at Tied House, Mountain View, CA
LLVM BPF backend
Differential Revision: http://reviews.llvm.org/D6494
llvm-svn: 227008
llvm/CODE_OWNERS.TXT | 4 +
llvm/include/llvm/ADT/Triple.h | 1 +
llvm/include/llvm/IR/Intrinsics.td | 1 +
llvm/include/llvm/IR/IntrinsicsBPF.td | 22 +++++
llvm/lib/Support/Triple.cpp | 8 ++
llvm/lib/Target/BPF/BPF.h | 22 +++++
llvm/lib/Target/BPF/BPF.td | 31 ++++++
llvm/lib/Target/BPF/BPFAsmPrinter.cpp | 87 +++++++++++++++++
llvm/lib/Target/BPF/BPFCallingConv.td | 29 ++++++
llvm/lib/Target/BPF/BPFFrameLowering.cpp | 39 ++++++++
llvm/lib/Target/BPF/BPFFrameLowering.h | 41 ++++++++
llvm/lib/Target/BPF/BPFISelDAGToDAG.cpp | 159 ++++++++++++++++++++++++++++++
llvm/lib/Target/BPF/BPFISelLowering.cpp | 642 +++++++++++++++++++++++++++++++++++++++++
...
llvm/lib/Target/LLVMBuild.txt | 2 +-
69 files changed, 4644 insertions(+), 1 deletion(-)
Proposed in Dec 2014
http://reviews.llvm.org/D6494
took 2 month to land in Jan 2015 as
experimental backend.
To graduate BPF backend from experimental status
• It has to have users
• It needs more than one developer
• Developers must help with tree wide refactoring
• Build bot
BPF backend in GCC
• Emits BPF byte code directly. Upstream blocker.
• Unlike LLVM GCC doesn’t have integrated assembler. GCC has to emit plain text
• Would have to make libbfd/gas/ld work
• Being lazy as an upstream strategy sometimes works too
• In 2019 Oracle GCC folks implemented everything
Steps that did NOT help to land patches
• Present at the conferences
• Describe amazing future
Summary: Strategies to land patches
• Learn the community
• Understand maintainer’s concerns
• Build the reputation
• Make new ideas look familiar
• Make existing code faster
• Split big ideas into small building blocks
• Be prepared to compromise
Thank you
What questions do you have?
Slide 42

More Related Content

Similar to story_of_bpf-1.pdf

eBPF Debugging Infrastructure - Current Techniques
eBPF Debugging Infrastructure - Current TechniqueseBPF Debugging Infrastructure - Current Techniques
eBPF Debugging Infrastructure - Current TechniquesNetronome
 
BPF & Cilium - Turning Linux into a Microservices-aware Operating System
BPF  & Cilium - Turning Linux into a Microservices-aware Operating SystemBPF  & Cilium - Turning Linux into a Microservices-aware Operating System
BPF & Cilium - Turning Linux into a Microservices-aware Operating SystemThomas Graf
 
Composing services with Kubernetes
Composing services with KubernetesComposing services with Kubernetes
Composing services with KubernetesBart Spaans
 
Systems@Scale 2021 BPF Performance Getting Started
Systems@Scale 2021 BPF Performance Getting StartedSystems@Scale 2021 BPF Performance Getting Started
Systems@Scale 2021 BPF Performance Getting StartedBrendan Gregg
 
DEF CON 27 - JEFF DILEO - evil e bpf in depth
DEF CON 27 - JEFF DILEO - evil e bpf in depthDEF CON 27 - JEFF DILEO - evil e bpf in depth
DEF CON 27 - JEFF DILEO - evil e bpf in depthFelipe Prado
 
Dataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and toolsDataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and toolsStefano Salsano
 
Kubernetes deep dive - - Huawei 2015-10
Kubernetes deep dive - - Huawei 2015-10Kubernetes deep dive - - Huawei 2015-10
Kubernetes deep dive - - Huawei 2015-10Vishnu Kannan
 
Using eBPF to Measure the k8s Cluster Health
Using eBPF to Measure the k8s Cluster HealthUsing eBPF to Measure the k8s Cluster Health
Using eBPF to Measure the k8s Cluster HealthScyllaDB
 
ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!Affan Syed
 
It’s 2021. Why are we -still- rebooting for patches? A look at Live Patching.
It’s 2021. Why are we -still- rebooting for patches? A look at Live Patching.It’s 2021. Why are we -still- rebooting for patches? A look at Live Patching.
It’s 2021. Why are we -still- rebooting for patches? A look at Live Patching.All Things Open
 
eBPF - Observability In Deep
eBPF - Observability In DeepeBPF - Observability In Deep
eBPF - Observability In DeepMydbops
 
Not breaking userspace: the evolving Linux ABI
Not breaking userspace: the evolving Linux ABINot breaking userspace: the evolving Linux ABI
Not breaking userspace: the evolving Linux ABIAlison Chaiken
 
iptables 101- bottom-up
iptables 101- bottom-upiptables 101- bottom-up
iptables 101- bottom-upHungWei Chiu
 
SFO15-110: Toolchain Collaboration
SFO15-110: Toolchain CollaborationSFO15-110: Toolchain Collaboration
SFO15-110: Toolchain CollaborationLinaro
 
Introduction to Civil Infrastructure Platform
Introduction to Civil Infrastructure PlatformIntroduction to Civil Infrastructure Platform
Introduction to Civil Infrastructure PlatformSZ Lin
 
Calico-eBPF-Dataplane-CNCF-Webinar-Slides.pdf
Calico-eBPF-Dataplane-CNCF-Webinar-Slides.pdfCalico-eBPF-Dataplane-CNCF-Webinar-Slides.pdf
Calico-eBPF-Dataplane-CNCF-Webinar-Slides.pdfyingxinwang4
 
RISC-V NOEL-V - A new high performance RISC-V Processor Family
RISC-V NOEL-V - A new high performance RISC-V Processor FamilyRISC-V NOEL-V - A new high performance RISC-V Processor Family
RISC-V NOEL-V - A new high performance RISC-V Processor FamilyRISC-V International
 
Os Grossupdated
Os GrossupdatedOs Grossupdated
Os Grossupdatedoscon2007
 

Similar to story_of_bpf-1.pdf (20)

eBPF Debugging Infrastructure - Current Techniques
eBPF Debugging Infrastructure - Current TechniqueseBPF Debugging Infrastructure - Current Techniques
eBPF Debugging Infrastructure - Current Techniques
 
BPF & Cilium - Turning Linux into a Microservices-aware Operating System
BPF  & Cilium - Turning Linux into a Microservices-aware Operating SystemBPF  & Cilium - Turning Linux into a Microservices-aware Operating System
BPF & Cilium - Turning Linux into a Microservices-aware Operating System
 
Composing services with Kubernetes
Composing services with KubernetesComposing services with Kubernetes
Composing services with Kubernetes
 
Systems@Scale 2021 BPF Performance Getting Started
Systems@Scale 2021 BPF Performance Getting StartedSystems@Scale 2021 BPF Performance Getting Started
Systems@Scale 2021 BPF Performance Getting Started
 
DEF CON 27 - JEFF DILEO - evil e bpf in depth
DEF CON 27 - JEFF DILEO - evil e bpf in depthDEF CON 27 - JEFF DILEO - evil e bpf in depth
DEF CON 27 - JEFF DILEO - evil e bpf in depth
 
Dataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and toolsDataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and tools
 
Kubernetes deep dive - - Huawei 2015-10
Kubernetes deep dive - - Huawei 2015-10Kubernetes deep dive - - Huawei 2015-10
Kubernetes deep dive - - Huawei 2015-10
 
Using eBPF to Measure the k8s Cluster Health
Using eBPF to Measure the k8s Cluster HealthUsing eBPF to Measure the k8s Cluster Health
Using eBPF to Measure the k8s Cluster Health
 
ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!
 
It’s 2021. Why are we -still- rebooting for patches? A look at Live Patching.
It’s 2021. Why are we -still- rebooting for patches? A look at Live Patching.It’s 2021. Why are we -still- rebooting for patches? A look at Live Patching.
It’s 2021. Why are we -still- rebooting for patches? A look at Live Patching.
 
eBPF - Observability In Deep
eBPF - Observability In DeepeBPF - Observability In Deep
eBPF - Observability In Deep
 
Not breaking userspace: the evolving Linux ABI
Not breaking userspace: the evolving Linux ABINot breaking userspace: the evolving Linux ABI
Not breaking userspace: the evolving Linux ABI
 
iptables 101- bottom-up
iptables 101- bottom-upiptables 101- bottom-up
iptables 101- bottom-up
 
SFO15-110: Toolchain Collaboration
SFO15-110: Toolchain CollaborationSFO15-110: Toolchain Collaboration
SFO15-110: Toolchain Collaboration
 
spack_hpc.pptx
spack_hpc.pptxspack_hpc.pptx
spack_hpc.pptx
 
Introduction to Civil Infrastructure Platform
Introduction to Civil Infrastructure PlatformIntroduction to Civil Infrastructure Platform
Introduction to Civil Infrastructure Platform
 
Calico-eBPF-Dataplane-CNCF-Webinar-Slides.pdf
Calico-eBPF-Dataplane-CNCF-Webinar-Slides.pdfCalico-eBPF-Dataplane-CNCF-Webinar-Slides.pdf
Calico-eBPF-Dataplane-CNCF-Webinar-Slides.pdf
 
NFV features in kubernetes
NFV features in kubernetesNFV features in kubernetes
NFV features in kubernetes
 
RISC-V NOEL-V - A new high performance RISC-V Processor Family
RISC-V NOEL-V - A new high performance RISC-V Processor FamilyRISC-V NOEL-V - A new high performance RISC-V Processor Family
RISC-V NOEL-V - A new high performance RISC-V Processor Family
 
Os Grossupdated
Os GrossupdatedOs Grossupdated
Os Grossupdated
 

Recently uploaded

SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2RajaP95
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZTE
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learningmisbanausheenparvam
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxPoojaBan
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 

Recently uploaded (20)

SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learning
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 

story_of_bpf-1.pdf

  • 1. The story of BPF A practical guide to land patches
  • 2. • What BPF stands for? • Does it matter ? • The name given to an instruction set 30 years ago by Steven McCanne and Van Jacobson.
  • 3. • Little they knew that in 2011 a startup decides to revolutionize Software Defined Networking.
  • 4. • Physical -> Virtual • Servers -> VMs • Networking gear -> virtual routers, switches, firewalls • Virtual Machine • Technology: hypervisor • KVM, QEMU • Virtual firewall, Virtual Router, Virtual Switch • Technology: iovisor
  • 5. One physical server: • 5 VMs • 1 router • 2 switches • 5 firewalls
  • 6. Traditional approach • VM -> kvm.ko • Virtual router -> vrouter.ko • Virtual switch -> vswitch.ko • Virtual firewall -> vfirewall.ko
  • 7. PLUMgrid’s solution v1 • iovisor.ko • switch, router, firewall – binary blobs of x86 code • pushed to a host by a remote controller • Including 3rd party NAT, packet captures, etc
  • 8. • What can go wrong? • After 4Gbyte of networking traffic the kernel would crash • 32-bit overflow ? • Race condition ?
  • 9. • What can go wrong? arch/x86/Makefile: KBUILD_CFLAGS += -mno-red-zone
  • 10. PLUMgrid’s solution v2 • Verify x86 code • The verifier was born.
  • 11. • Verification pain points with x86 asm • Lots of ways to compute an address. • Lots of memory access instructions. • Solution: reduced x86 instruction set. • Hack GCC x86 backend. • The first iovisor.ko had the verifier and no JIT.
  • 12. PLUMgrid’s solution v3 • New instruction set (x86 like) • GCC backend that emits binary code • iovisor.ko • The verifier for this instruction set • JIT to x86 • No interpreter
  • 13. How to upstream iovisor.ko ? • Talk to key people when possible • New instruction set is scary to compiler folks • Even scarier to kernel maintainers • Solution: make it look familiar
  • 14. Make it look familiar • Is there an instruction set in the kernel with similar properties? • BPF, iptables, netfilter tables, inet_diag • Make new instruction set look as close as possible to BPF • Reuse opcode encoding and 8-byte size of insn • Call it ‘extended’ BPF
  • 15. Next steps • Read netdev@vger mailing list for 6 month • Understand the land • Identify key people • And post the jumbo patch? No.
  • 16. Build reputation • Find lockdep report in your area of interest.
  • 17. My 1st kernel patch: Move module_free() of x86 JITed memory into a worker.
  • 18. Keep building reputation… my kernel commit #5
  • 19. Finally post eBPF patchset Did it work?
  • 20. Finally post eBPF patchset Nope. It was rejected.
  • 21. What is the biggest maintainer’s concern? UAPI !
  • 22. Need a plan B for eBPF Add eBPF without exposing it in UAPI How?
  • 23. Need a plan B for eBPF Add eBPF without exposing it in UAPI Answer: Make existing code faster
  • 24. Rewrite existing BPF interpreter Thankfully it was easy to make it 2 times faster. 10% of the speedup came from eBPF instruction set itself. 90% of the speedup from jump-threaded implementation. That’s how ‘internal BPF’ was created.
  • 25. Need to disambiguate two BPFs. Daniel Borkmann came up with a name ‘classic BPF’. The state of BPF in May 2014: • cBPF converter to iBPF (internal BPF) • Interpreter that runs iBPF • x86, sparc, arm JIT compilers from iBPF to native code eBPF doesn’t exist yet. There is no verifier either.
  • 26. Where to apply iBPF ‘engine’ ? The concepts of the verifier, maps, helpers were proposed. Programs suppose to run from netif_receive_skb. The networking use case still struggles. Arguments against: - [ei]BPF instruction set is not extensible. Should be using TLV ? - u8 opcode looks small. eBPF 2.0 will be coming ? - The verifier is not supported by static analysis theory. - It bypasses networking stack.
  • 27. If the mountain will not come to Mohammed… Strategy: Compromise on networking, pivot eBPF into tracing. Strategy: Make it look familiar. F - filter. Proposal to ‘filter’ perf events. Reuse verifier, maps, helpers concepts, but instead of network stack execute programs from perf events and kprobes.
  • 28. Strategy: Make existing code faster. Demonstrate that BPF tracing ‘filter’ is faster than predicate tree walker. Demonstrate that BPF TC ‘classifier’ is faster than TC u32 classifier. Sad trade-off: clean design vs upstreamability.
  • 30. eBPF is learning to walk. 89aa075832b0 (net: sock: allow eBPF programs to be attached to sockets, 2014-12-01) e2e9b6541dd4 (cls_bpf: add initial eBPF support for programmable classifiers, 2015-03-01) 2541517c32be (tracing, perf: Implement BPF programs attached to kprobes, 2015-03-25)
  • 32. Are we done? Kernel was just the beginning. Landing new backend in LLVM was just as difficult.
  • 33. LLVM community • Most developers have direct write access • Anyone can revert anyone else’s commit • s/MAINTAINERS/CODE_OWNERS.TXT/ • Back then LLVM was using SVN • Phabricator for diffs • C++ in CamelStyle
  • 34. LLVM community • No UAPI concerns • Compiler internals are changing a lot • Backward incompatible backend changes is not a concern • Kernel UAPI doesn’t justify or restrict LLVM choices • Continuous integration and testing is mandatory • Build bots run tests right after diff lands • Backends have to contribute build bots • Many operating systems • Approved diffs might get reverted and re-landed many times • Monthly meetup at Tied House, Mountain View, CA
  • 35. LLVM BPF backend Differential Revision: http://reviews.llvm.org/D6494 llvm-svn: 227008 llvm/CODE_OWNERS.TXT | 4 + llvm/include/llvm/ADT/Triple.h | 1 + llvm/include/llvm/IR/Intrinsics.td | 1 + llvm/include/llvm/IR/IntrinsicsBPF.td | 22 +++++ llvm/lib/Support/Triple.cpp | 8 ++ llvm/lib/Target/BPF/BPF.h | 22 +++++ llvm/lib/Target/BPF/BPF.td | 31 ++++++ llvm/lib/Target/BPF/BPFAsmPrinter.cpp | 87 +++++++++++++++++ llvm/lib/Target/BPF/BPFCallingConv.td | 29 ++++++ llvm/lib/Target/BPF/BPFFrameLowering.cpp | 39 ++++++++ llvm/lib/Target/BPF/BPFFrameLowering.h | 41 ++++++++ llvm/lib/Target/BPF/BPFISelDAGToDAG.cpp | 159 ++++++++++++++++++++++++++++++ llvm/lib/Target/BPF/BPFISelLowering.cpp | 642 +++++++++++++++++++++++++++++++++++++++++ ... llvm/lib/Target/LLVMBuild.txt | 2 +- 69 files changed, 4644 insertions(+), 1 deletion(-) Proposed in Dec 2014
  • 36. http://reviews.llvm.org/D6494 took 2 month to land in Jan 2015 as experimental backend.
  • 37. To graduate BPF backend from experimental status • It has to have users • It needs more than one developer • Developers must help with tree wide refactoring • Build bot
  • 38. BPF backend in GCC • Emits BPF byte code directly. Upstream blocker. • Unlike LLVM GCC doesn’t have integrated assembler. GCC has to emit plain text • Would have to make libbfd/gas/ld work • Being lazy as an upstream strategy sometimes works too • In 2019 Oracle GCC folks implemented everything
  • 39. Steps that did NOT help to land patches • Present at the conferences • Describe amazing future
  • 40. Summary: Strategies to land patches • Learn the community • Understand maintainer’s concerns • Build the reputation • Make new ideas look familiar • Make existing code faster • Split big ideas into small building blocks • Be prepared to compromise
  • 41. Thank you What questions do you have?