SlideShare a Scribd company logo
Cilium:
Fast IPv6 Container Networking with
BPF and XDP
LinuxCon 2016, Toronto
Thomas Graf (@tgraf__)
Kernel, Cilium & Open vSwitch Team
Noiro Networks (Cisco)
The Cilium Experiment
Scale
– Addressing: IPv6?
– Policy: Linear lists don’t scale. Alternative?
Extensibility
– Can we be as extensible as userspace networking
in the kernel?
Simplicity
– What is an appropriate abstraction away from
traditional networking?
Performance
– Do we sacrifice performance in the process?
Scaling Addressing
Solution:
– IPv6 addresses with host scope allocator
Pros:
– Everything is globally addressable
– No NAT
– Path to ILA for mobility of tasks
Cons:
– Legacy IPv4 only endpoints/applications
→ Optional IPv4 addressing (+ NAT)
→ NAT46: Provide IPv6 only applications to IPv4
only clients
IPv6 Status in Kubernetes/Docker
● Kubernetes (CNI): Almost there
– Pods are IPv6-only capable as of k8s 1.3.6
(PR23317, PR26438, PR26439, PR26441)
– Kubeproxy (services) not done yet
● Docker (libnetwork): Working on it
– PR826 - “Make IPv6 Great Again”
Not merged yet
Scaling Policy
LB Frontend Backend
Scaling Policy
LB BEFE
LB FE
FE BE
LB
LB Frontend Backend
Policy:
NetworkPolicy Kubernetes policy spec
as discussed and standardized in the
Networking SIG
https://github.com/kubernetes/kubernetes/blo
b/master/docs/proposals/network-policy.md
Scaling Policy
LB QA BE QAFE QA
LB Prod BE ProdFE Prod
LB FE
FE BE
LB
LB Frontend Backend
QA
Prod
Policy:
Scaling Policy
LB QA BE QAFE QA
LB Prod BE ProdFE Prod
LB FE
FE
QA
Prod
BE
LB QA
Prod
requires
requires
LB Frontend Backend
QA
Prod
Policy:
Cilium extension
Not yet part of
Kubernetes spec
QA
Scaling Policy Enforcement
LB FE
FE
QA
Prod
BE
LB QA
Prod
requires
requires
LB QA
FE QA
LB Prod10
11
12
13
Policy enforcement cost becomes a single hashtable
lookup regardless of number of containers or policy
complexity.
BE QA
FE Prod 14
BE Prod 15
Distributed Label ID Table:Policy:
QA
This ID is carried in packet as
metadata to provide security
context at destination host
Extensibility
Kernel
Userspace
Source
Code
Byte
Code
LLVM/clang
Sockets
netdevice
Network
StackTC
Ingress
TC
Egress
netdevice
Verifier
+ JIT
add eax,edx
shl eax,2
add eax,edx
shl eax,2
BPF – Berkley Packet Filter
Kernel
Userspace
BPF
Program
Userspace
Process
BPF Maps & Perf Ring Buffer
BPF Map
Hashtable
BPF Map
Array
Userspace
Process
BPF
Program
Per Ring
Buffer
Data DataTail Call
BPF Features
(As of Aug 2016)
● Efficient data sharing via maps
– Per-CPU/global arrays & hashtables
● Rewrite packet content
● Extend/trim packet size
● Redirect to other net_device
● Attachment of tunnel metadata
● Cgroups integration
● Access to high performance perf ring buffer
● …
Kernel
Userspace
XDP – Express Data Path
Source
Code
Byte
Code
LLVM/clang
Sockets
Netdevice
Network
Stack
Verifier
+ JIT
add eax,edx
shl eax,2
Driver
Access to
DMA buffer
Kernel
Cilium Layer
Orchestration
systems
eth0
BPF
Program
Cilium
Daemon
Cilium
Monitor
Cilium
CLI
BPF Program
Conntrack Policy
Bytecode injection
Events
BPF Program
Conntrack Policy
Code
Generation
Plugins
Policy
Repository
Cilium Architecture
Why is this awesome?
On the fly BPF program generation means:
● Extensibility of userspace networking in the kernel
● MAC, IP, port number, … all become constants
→ compiler can optimize heavily!
● BPF programs can be recompiled and replaced without
interrupting the container and its connections
– Features can be compiled in/out at runtime with
container granularity
● Access to fast BPF maps and perf ring buffer to interact
with userspace.
– Drop monitor in n*Mpps context
– Use notifications for policy learning, IDS, logging, ...
Available Building Blocks
● L3 forwarding (IPv6 & IPV4)
● Host connectivity
● Encapsulation
(VXLAN/Geneve/GRE)
● ICMPv6 generation
● NDisc & ARP responder
● Access Control
Currently working on:
● Fragmentation handling
● Mobility
● Port Mapping (TCP/UDP)
● Connection tracking
● L3/L4 Load Balancer
● Statistics
● Events (perf ring buffer)
● Debugging framework
● NAT46
● End to end encryption
Networking should be invisible,
it is not.
Simplicity
Simplicity
● L3 only (Calico gets this right)
– No L2 scaling issues, no broadcast domains, no L2
vulnerabilities
● No “Networks”
– No need for containers to join multiple networks
to access multiple isolation domains. No need for
multiple addresses.
● Policy definition independent of addressing
– As specified in Kubernetes Networking SIG
– All policies based on container labels
Performance
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
0
100
200
300
400
500
600
Container to container on local node
# Cores
Gbit
netperf -t TCP_SENDFILE -H beef::aa0:18:ee5e
1 TCP flow per core, 10’000 policies
Intel Xeon 3.5Ghz Sandy Bridge, 24 cores
Performance
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
Container to container over 10GiB NICs
64
128
256
512
1024
64000
# Cores
MBit
netperf -t TCP_SENDFILE -H beef::aa0:18:ee5e
1 TCP flow per core, 10’000 policies
Intel Xeon 3.5Ghz Sandy Bridge, 24 cores
<Insert Cool Demo Here>
Q&A
Image Sources:
● Cover (Toronto)
Rick Harris (https://www.flickr.com/photos/rickharris/)
● The Invisible Man
Dr. Azzacov (https://www.flickr.com/photos/drazzacov/)
Start hacking with BPF for containers:
http://github.com/cilium/cilium
Contact:
Slack: cilium.slack.com
Twitter: @tgraf__ Mail: tgraf@tgraf.ch
Team:
● André Martins
● Daniel Borkmann
● Madhu Challa
● Thomas Graf

More Related Content

What's hot

Cilium - Container Networking with BPF & XDP
Cilium - Container Networking with BPF & XDPCilium - Container Networking with BPF & XDP
Cilium - Container Networking with BPF & XDP
Thomas Graf
 
EBPF and Linux Networking
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux Networking
PLUMgrid
 
Accelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux KernelAccelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux Kernel
Thomas Graf
 
Using eBPF for High-Performance Networking in Cilium
Using eBPF for High-Performance Networking in CiliumUsing eBPF for High-Performance Networking in Cilium
Using eBPF for High-Performance Networking in Cilium
ScyllaDB
 
eBPF Workshop
eBPF WorkshopeBPF Workshop
eBPF Workshop
Michael Kehoe
 
Kubernetes networking
Kubernetes networkingKubernetes networking
Kubernetes networking
Sim Janghoon
 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking Explained
Thomas Graf
 
Cilium - BPF & XDP for containers
 Cilium - BPF & XDP for containers Cilium - BPF & XDP for containers
Cilium - BPF & XDP for containers
Docker, Inc.
 
Linux Network Stack
Linux Network StackLinux Network Stack
Linux Network Stack
Adrien Mahieux
 
How VXLAN works on Linux
How VXLAN works on LinuxHow VXLAN works on Linux
How VXLAN works on LinuxEtsuji Nakai
 
eBPF - Rethinking the Linux Kernel
eBPF - Rethinking the Linux KerneleBPF - Rethinking the Linux Kernel
eBPF - Rethinking the Linux Kernel
Thomas Graf
 
BGP Unnumbered で遊んでみた
BGP Unnumbered で遊んでみたBGP Unnumbered で遊んでみた
BGP Unnumbered で遊んでみた
akira6592
 
Vxlan control plane and routing
Vxlan control plane and routingVxlan control plane and routing
Vxlan control plane and routing
Wilfredzeng
 
Deep dive into Kubernetes Networking
Deep dive into Kubernetes NetworkingDeep dive into Kubernetes Networking
Deep dive into Kubernetes Networking
Sreenivas Makam
 
Neutron packet logging framework
Neutron packet logging frameworkNeutron packet logging framework
Neutron packet logging framework
Vietnam Open Infrastructure User Group
 
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and SecurityCilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Thomas Graf
 
eBPF Basics
eBPF BasicseBPF Basics
eBPF Basics
Michael Kehoe
 
Linux Linux Traffic Control
Linux Linux Traffic ControlLinux Linux Traffic Control
Linux Linux Traffic Control
SUSE Labs Taipei
 
VXLAN and FRRouting
VXLAN and FRRoutingVXLAN and FRRouting
VXLAN and FRRouting
Faisal Reza
 
BPF: Tracing and more
BPF: Tracing and moreBPF: Tracing and more
BPF: Tracing and more
Brendan Gregg
 

What's hot (20)

Cilium - Container Networking with BPF & XDP
Cilium - Container Networking with BPF & XDPCilium - Container Networking with BPF & XDP
Cilium - Container Networking with BPF & XDP
 
EBPF and Linux Networking
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux Networking
 
Accelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux KernelAccelerating Envoy and Istio with Cilium and the Linux Kernel
Accelerating Envoy and Istio with Cilium and the Linux Kernel
 
Using eBPF for High-Performance Networking in Cilium
Using eBPF for High-Performance Networking in CiliumUsing eBPF for High-Performance Networking in Cilium
Using eBPF for High-Performance Networking in Cilium
 
eBPF Workshop
eBPF WorkshopeBPF Workshop
eBPF Workshop
 
Kubernetes networking
Kubernetes networkingKubernetes networking
Kubernetes networking
 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking Explained
 
Cilium - BPF & XDP for containers
 Cilium - BPF & XDP for containers Cilium - BPF & XDP for containers
Cilium - BPF & XDP for containers
 
Linux Network Stack
Linux Network StackLinux Network Stack
Linux Network Stack
 
How VXLAN works on Linux
How VXLAN works on LinuxHow VXLAN works on Linux
How VXLAN works on Linux
 
eBPF - Rethinking the Linux Kernel
eBPF - Rethinking the Linux KerneleBPF - Rethinking the Linux Kernel
eBPF - Rethinking the Linux Kernel
 
BGP Unnumbered で遊んでみた
BGP Unnumbered で遊んでみたBGP Unnumbered で遊んでみた
BGP Unnumbered で遊んでみた
 
Vxlan control plane and routing
Vxlan control plane and routingVxlan control plane and routing
Vxlan control plane and routing
 
Deep dive into Kubernetes Networking
Deep dive into Kubernetes NetworkingDeep dive into Kubernetes Networking
Deep dive into Kubernetes Networking
 
Neutron packet logging framework
Neutron packet logging frameworkNeutron packet logging framework
Neutron packet logging framework
 
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and SecurityCilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
 
eBPF Basics
eBPF BasicseBPF Basics
eBPF Basics
 
Linux Linux Traffic Control
Linux Linux Traffic ControlLinux Linux Traffic Control
Linux Linux Traffic Control
 
VXLAN and FRRouting
VXLAN and FRRoutingVXLAN and FRRouting
VXLAN and FRRouting
 
BPF: Tracing and more
BPF: Tracing and moreBPF: Tracing and more
BPF: Tracing and more
 

Similar to Cilium - Fast IPv6 Container Networking with BPF and XDP

FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet Processing
Kernel TLV
 
FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)
Kirill Tsym
 
FD.io - The Universal Dataplane
FD.io - The Universal DataplaneFD.io - The Universal Dataplane
FD.io - The Universal Dataplane
Open Networking Summit
 
VNIX-NOG 2021: IPv6 Deployment Update
VNIX-NOG 2021: IPv6 Deployment UpdateVNIX-NOG 2021: IPv6 Deployment Update
VNIX-NOG 2021: IPv6 Deployment Update
APNIC
 
Ocpeu14
Ocpeu14Ocpeu14
Ocpeu14
KALRAY
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
Kernel TLV
 
Making our networking stack truly extensible
Making our networking stack truly extensible Making our networking stack truly extensible
Making our networking stack truly extensible
Olivier Bonaventure
 
Osnug meetup-tungsten fabric - overview.pptx
Osnug meetup-tungsten fabric - overview.pptxOsnug meetup-tungsten fabric - overview.pptx
Osnug meetup-tungsten fabric - overview.pptx
M.Qasim Arham
 
PLNOG 17 - Nicolai van der Smagt - Building and connecting the eBay Classifie...
PLNOG 17 - Nicolai van der Smagt - Building and connecting the eBay Classifie...PLNOG 17 - Nicolai van der Smagt - Building and connecting the eBay Classifie...
PLNOG 17 - Nicolai van der Smagt - Building and connecting the eBay Classifie...
PROIDEA
 
DPDK summit 2015: It's kind of fun to do the impossible with DPDK
DPDK summit 2015: It's kind of fun  to do the impossible with DPDKDPDK summit 2015: It's kind of fun  to do the impossible with DPDK
DPDK summit 2015: It's kind of fun to do the impossible with DPDK
Lagopus SDN/OpenFlow switch
 
DPDK Summit 2015 - NTT - Yoshihiro Nakajima
DPDK Summit 2015 - NTT - Yoshihiro NakajimaDPDK Summit 2015 - NTT - Yoshihiro Nakajima
DPDK Summit 2015 - NTT - Yoshihiro Nakajima
Jim St. Leger
 
100 M pps on PC.
100 M pps on PC.100 M pps on PC.
100 M pps on PC.
Redge Technologies
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kevin Lynch
 
Lustre, RoCE, and MAN
Lustre, RoCE, and MANLustre, RoCE, and MAN
Lustre, RoCE, and MAN
inside-BigData.com
 
DPDK Summit - 08 Sept 2014 - 6WIND - High Perf Networking Leveraging the DPDK...
DPDK Summit - 08 Sept 2014 - 6WIND - High Perf Networking Leveraging the DPDK...DPDK Summit - 08 Sept 2014 - 6WIND - High Perf Networking Leveraging the DPDK...
DPDK Summit - 08 Sept 2014 - 6WIND - High Perf Networking Leveraging the DPDK...
Jim St. Leger
 
Data Plane Evolution: Towards Openness and Flexibility
Data Plane Evolution: Towards Openness and FlexibilityData Plane Evolution: Towards Openness and Flexibility
Data Plane Evolution: Towards Openness and Flexibility
APNIC
 
Scaling the Container Dataplane
Scaling the Container Dataplane Scaling the Container Dataplane
Scaling the Container Dataplane
Michelle Holley
 
Adding IEEE 802.15.4 and 6LoWPAN to an Embedded Linux Device
Adding IEEE 802.15.4 and 6LoWPAN to an Embedded Linux DeviceAdding IEEE 802.15.4 and 6LoWPAN to an Embedded Linux Device
Adding IEEE 802.15.4 and 6LoWPAN to an Embedded Linux Device
Samsung Open Source Group
 
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
Haidee McMahon
 
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...
Igalia
 

Similar to Cilium - Fast IPv6 Container Networking with BPF and XDP (20)

FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet Processing
 
FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)
 
FD.io - The Universal Dataplane
FD.io - The Universal DataplaneFD.io - The Universal Dataplane
FD.io - The Universal Dataplane
 
VNIX-NOG 2021: IPv6 Deployment Update
VNIX-NOG 2021: IPv6 Deployment UpdateVNIX-NOG 2021: IPv6 Deployment Update
VNIX-NOG 2021: IPv6 Deployment Update
 
Ocpeu14
Ocpeu14Ocpeu14
Ocpeu14
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
 
Making our networking stack truly extensible
Making our networking stack truly extensible Making our networking stack truly extensible
Making our networking stack truly extensible
 
Osnug meetup-tungsten fabric - overview.pptx
Osnug meetup-tungsten fabric - overview.pptxOsnug meetup-tungsten fabric - overview.pptx
Osnug meetup-tungsten fabric - overview.pptx
 
PLNOG 17 - Nicolai van der Smagt - Building and connecting the eBay Classifie...
PLNOG 17 - Nicolai van der Smagt - Building and connecting the eBay Classifie...PLNOG 17 - Nicolai van der Smagt - Building and connecting the eBay Classifie...
PLNOG 17 - Nicolai van der Smagt - Building and connecting the eBay Classifie...
 
DPDK summit 2015: It's kind of fun to do the impossible with DPDK
DPDK summit 2015: It's kind of fun  to do the impossible with DPDKDPDK summit 2015: It's kind of fun  to do the impossible with DPDK
DPDK summit 2015: It's kind of fun to do the impossible with DPDK
 
DPDK Summit 2015 - NTT - Yoshihiro Nakajima
DPDK Summit 2015 - NTT - Yoshihiro NakajimaDPDK Summit 2015 - NTT - Yoshihiro Nakajima
DPDK Summit 2015 - NTT - Yoshihiro Nakajima
 
100 M pps on PC.
100 M pps on PC.100 M pps on PC.
100 M pps on PC.
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
 
Lustre, RoCE, and MAN
Lustre, RoCE, and MANLustre, RoCE, and MAN
Lustre, RoCE, and MAN
 
DPDK Summit - 08 Sept 2014 - 6WIND - High Perf Networking Leveraging the DPDK...
DPDK Summit - 08 Sept 2014 - 6WIND - High Perf Networking Leveraging the DPDK...DPDK Summit - 08 Sept 2014 - 6WIND - High Perf Networking Leveraging the DPDK...
DPDK Summit - 08 Sept 2014 - 6WIND - High Perf Networking Leveraging the DPDK...
 
Data Plane Evolution: Towards Openness and Flexibility
Data Plane Evolution: Towards Openness and FlexibilityData Plane Evolution: Towards Openness and Flexibility
Data Plane Evolution: Towards Openness and Flexibility
 
Scaling the Container Dataplane
Scaling the Container Dataplane Scaling the Container Dataplane
Scaling the Container Dataplane
 
Adding IEEE 802.15.4 and 6LoWPAN to an Embedded Linux Device
Adding IEEE 802.15.4 and 6LoWPAN to an Embedded Linux DeviceAdding IEEE 802.15.4 and 6LoWPAN to an Embedded Linux Device
Adding IEEE 802.15.4 and 6LoWPAN to an Embedded Linux Device
 
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
 
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...
 

More from Thomas Graf

Cilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPFCilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPF
Thomas Graf
 
Cilium - Network security for microservices
Cilium - Network security for microservicesCilium - Network security for microservices
Cilium - Network security for microservices
Thomas Graf
 
Linux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network SecurityLinux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network Security
Thomas Graf
 
BPF: Next Generation of Programmable Datapath
BPF: Next Generation of Programmable DatapathBPF: Next Generation of Programmable Datapath
BPF: Next Generation of Programmable Datapath
Thomas Graf
 
Cilium - BPF & XDP for containers
Cilium - BPF & XDP for containersCilium - BPF & XDP for containers
Cilium - BPF & XDP for containers
Thomas Graf
 
LinuxCon 2015 Stateful NAT with OVS
LinuxCon 2015 Stateful NAT with OVSLinuxCon 2015 Stateful NAT with OVS
LinuxCon 2015 Stateful NAT with OVS
Thomas Graf
 
LinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking Walkthrough
Thomas Graf
 
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Thomas Graf
 
2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful Services2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful Services
Thomas Graf
 
Open vSwitch - Stateful Connection Tracking & Stateful NAT
Open vSwitch - Stateful Connection Tracking & Stateful NATOpen vSwitch - Stateful Connection Tracking & Stateful NAT
Open vSwitch - Stateful Connection Tracking & Stateful NAT
Thomas Graf
 
The Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RCThe Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RC
Thomas Graf
 
SDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center NetworkingSDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center NetworkingThomas Graf
 
DevConf 2014 Kernel Networking Walkthrough
DevConf 2014   Kernel Networking WalkthroughDevConf 2014   Kernel Networking Walkthrough
DevConf 2014 Kernel Networking Walkthrough
Thomas Graf
 

More from Thomas Graf (13)

Cilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPFCilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPF
 
Cilium - Network security for microservices
Cilium - Network security for microservicesCilium - Network security for microservices
Cilium - Network security for microservices
 
Linux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network SecurityLinux Native, HTTP Aware Network Security
Linux Native, HTTP Aware Network Security
 
BPF: Next Generation of Programmable Datapath
BPF: Next Generation of Programmable DatapathBPF: Next Generation of Programmable Datapath
BPF: Next Generation of Programmable Datapath
 
Cilium - BPF & XDP for containers
Cilium - BPF & XDP for containersCilium - BPF & XDP for containers
Cilium - BPF & XDP for containers
 
LinuxCon 2015 Stateful NAT with OVS
LinuxCon 2015 Stateful NAT with OVSLinuxCon 2015 Stateful NAT with OVS
LinuxCon 2015 Stateful NAT with OVS
 
LinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking Walkthrough
 
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
 
2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful Services2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful Services
 
Open vSwitch - Stateful Connection Tracking & Stateful NAT
Open vSwitch - Stateful Connection Tracking & Stateful NATOpen vSwitch - Stateful Connection Tracking & Stateful NAT
Open vSwitch - Stateful Connection Tracking & Stateful NAT
 
The Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RCThe Next Generation Firewall for Red Hat Enterprise Linux 7 RC
The Next Generation Firewall for Red Hat Enterprise Linux 7 RC
 
SDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center NetworkingSDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center Networking
 
DevConf 2014 Kernel Networking Walkthrough
DevConf 2014   Kernel Networking WalkthroughDevConf 2014   Kernel Networking Walkthrough
DevConf 2014 Kernel Networking Walkthrough
 

Recently uploaded

TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
Tier1 app
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Globus
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
XfilesPro
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
Globus
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
AMB-Review
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
wottaspaceseo
 
Why React Native as a Strategic Advantage for Startup Innovation.pdf
Why React Native as a Strategic Advantage for Startup Innovation.pdfWhy React Native as a Strategic Advantage for Startup Innovation.pdf
Why React Native as a Strategic Advantage for Startup Innovation.pdf
ayushiqss
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Globus
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
Globus
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Natan Silnitsky
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Globus
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
Max Andersen
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
Matt Welsh
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
vrstrong314
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Globus
 

Recently uploaded (20)

TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
 
Why React Native as a Strategic Advantage for Startup Innovation.pdf
Why React Native as a Strategic Advantage for Startup Innovation.pdfWhy React Native as a Strategic Advantage for Startup Innovation.pdf
Why React Native as a Strategic Advantage for Startup Innovation.pdf
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
 

Cilium - Fast IPv6 Container Networking with BPF and XDP

  • 1. Cilium: Fast IPv6 Container Networking with BPF and XDP LinuxCon 2016, Toronto Thomas Graf (@tgraf__) Kernel, Cilium & Open vSwitch Team Noiro Networks (Cisco)
  • 2. The Cilium Experiment Scale – Addressing: IPv6? – Policy: Linear lists don’t scale. Alternative? Extensibility – Can we be as extensible as userspace networking in the kernel? Simplicity – What is an appropriate abstraction away from traditional networking? Performance – Do we sacrifice performance in the process?
  • 3. Scaling Addressing Solution: – IPv6 addresses with host scope allocator Pros: – Everything is globally addressable – No NAT – Path to ILA for mobility of tasks Cons: – Legacy IPv4 only endpoints/applications → Optional IPv4 addressing (+ NAT) → NAT46: Provide IPv6 only applications to IPv4 only clients
  • 4. IPv6 Status in Kubernetes/Docker ● Kubernetes (CNI): Almost there – Pods are IPv6-only capable as of k8s 1.3.6 (PR23317, PR26438, PR26439, PR26441) – Kubeproxy (services) not done yet ● Docker (libnetwork): Working on it – PR826 - “Make IPv6 Great Again” Not merged yet
  • 6. Scaling Policy LB BEFE LB FE FE BE LB LB Frontend Backend Policy: NetworkPolicy Kubernetes policy spec as discussed and standardized in the Networking SIG https://github.com/kubernetes/kubernetes/blo b/master/docs/proposals/network-policy.md
  • 7. Scaling Policy LB QA BE QAFE QA LB Prod BE ProdFE Prod LB FE FE BE LB LB Frontend Backend QA Prod Policy:
  • 8. Scaling Policy LB QA BE QAFE QA LB Prod BE ProdFE Prod LB FE FE QA Prod BE LB QA Prod requires requires LB Frontend Backend QA Prod Policy: Cilium extension Not yet part of Kubernetes spec QA
  • 9. Scaling Policy Enforcement LB FE FE QA Prod BE LB QA Prod requires requires LB QA FE QA LB Prod10 11 12 13 Policy enforcement cost becomes a single hashtable lookup regardless of number of containers or policy complexity. BE QA FE Prod 14 BE Prod 15 Distributed Label ID Table:Policy: QA This ID is carried in packet as metadata to provide security context at destination host
  • 12. Kernel Userspace BPF Program Userspace Process BPF Maps & Perf Ring Buffer BPF Map Hashtable BPF Map Array Userspace Process BPF Program Per Ring Buffer Data DataTail Call
  • 13. BPF Features (As of Aug 2016) ● Efficient data sharing via maps – Per-CPU/global arrays & hashtables ● Rewrite packet content ● Extend/trim packet size ● Redirect to other net_device ● Attachment of tunnel metadata ● Cgroups integration ● Access to high performance perf ring buffer ● …
  • 14. Kernel Userspace XDP – Express Data Path Source Code Byte Code LLVM/clang Sockets Netdevice Network Stack Verifier + JIT add eax,edx shl eax,2 Driver Access to DMA buffer
  • 15. Kernel Cilium Layer Orchestration systems eth0 BPF Program Cilium Daemon Cilium Monitor Cilium CLI BPF Program Conntrack Policy Bytecode injection Events BPF Program Conntrack Policy Code Generation Plugins Policy Repository Cilium Architecture
  • 16. Why is this awesome? On the fly BPF program generation means: ● Extensibility of userspace networking in the kernel ● MAC, IP, port number, … all become constants → compiler can optimize heavily! ● BPF programs can be recompiled and replaced without interrupting the container and its connections – Features can be compiled in/out at runtime with container granularity ● Access to fast BPF maps and perf ring buffer to interact with userspace. – Drop monitor in n*Mpps context – Use notifications for policy learning, IDS, logging, ...
  • 17. Available Building Blocks ● L3 forwarding (IPv6 & IPV4) ● Host connectivity ● Encapsulation (VXLAN/Geneve/GRE) ● ICMPv6 generation ● NDisc & ARP responder ● Access Control Currently working on: ● Fragmentation handling ● Mobility ● Port Mapping (TCP/UDP) ● Connection tracking ● L3/L4 Load Balancer ● Statistics ● Events (perf ring buffer) ● Debugging framework ● NAT46 ● End to end encryption
  • 18. Networking should be invisible, it is not. Simplicity
  • 19. Simplicity ● L3 only (Calico gets this right) – No L2 scaling issues, no broadcast domains, no L2 vulnerabilities ● No “Networks” – No need for containers to join multiple networks to access multiple isolation domains. No need for multiple addresses. ● Policy definition independent of addressing – As specified in Kubernetes Networking SIG – All policies based on container labels
  • 20. Performance 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 0 100 200 300 400 500 600 Container to container on local node # Cores Gbit netperf -t TCP_SENDFILE -H beef::aa0:18:ee5e 1 TCP flow per core, 10’000 policies Intel Xeon 3.5Ghz Sandy Bridge, 24 cores
  • 21. Performance 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 Container to container over 10GiB NICs 64 128 256 512 1024 64000 # Cores MBit netperf -t TCP_SENDFILE -H beef::aa0:18:ee5e 1 TCP flow per core, 10’000 policies Intel Xeon 3.5Ghz Sandy Bridge, 24 cores
  • 23. Q&A Image Sources: ● Cover (Toronto) Rick Harris (https://www.flickr.com/photos/rickharris/) ● The Invisible Man Dr. Azzacov (https://www.flickr.com/photos/drazzacov/) Start hacking with BPF for containers: http://github.com/cilium/cilium Contact: Slack: cilium.slack.com Twitter: @tgraf__ Mail: tgraf@tgraf.ch Team: ● André Martins ● Daniel Borkmann ● Madhu Challa ● Thomas Graf