Enabling Fast, Dynamic Network
Processing with ClickOS
Joao Martins*, Mohamed Ahmed*, Costin Raiciu§, Roberto Bifulco*,
Vladimir Olteanu§, Michio Honda*, Felipe Huici*
* NEC Labs Europe, Heidelberg, Germany
§ University Politehnica of Bucharest
firstname.lastname@neclab.eu, firstname.lastname@cs.pub.ro
The Idealized Network

[Figure: two end hosts run the full Application / Transport / Network / Datalink / Physical stack; the routers between them implement only the Network, Datalink, and Physical layers.]
A Middlebox World

[Figure: real networks are full of middleboxes: ad insertion, WAN accelerator, BRAS, carrier-grade NAT, transcoder, IDS, session border controller, load balancer, DDoS protection, firewall, QoE monitor, DPI.]
Hardware Middleboxes - Drawbacks
▐ Middleboxes are useful, but…
 – Expensive
 – Difficult to add new features; vendor lock-in
 – Difficult to manage
 – Cannot be scaled with demand
 – Cannot be shared among different tenants
 – Hard for new players to enter the market

▐ Clearly, shifting middlebox processing to a software-based, multi-tenant platform would address these issues
 – But can it be built on commodity hardware while still achieving high performance?

▐ ClickOS: a tiny Xen-based virtual machine that runs Click
Click Runtime
▐ Modular architecture for network processing
▐ Based around the concept of “elements”
▐ Elements are connected in a configuration file
▐ A configuration is installed via a command-line executable (e.g., click-install router.click)

▐ An element
  Can be configured with parameters (e.g., Queue::length)
  Can expose read and write variables, available via sockets or the /proc filesystem under Linux (e.g., Counter::reset, Counter::count)
▐ 262 of Click's 300 elements compile on ClickOS
▐ Programmers can write new elements to extend the Click runtime
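The element-and-connection model above can be mimicked in a few lines. This is an illustrative Python sketch of the idea only; the `Element` and `Counter` classes here are hypothetical stand-ins, not Click's actual C++ API:

```python
# Toy model of Click's element pipeline (illustrative; not Click's real API).
class Element:
    def __init__(self):
        self.next = None

    def connect(self, nxt):
        # Analogous to "a -> b" in a .click configuration file.
        self.next = nxt
        return nxt

    def push(self, pkt):
        if self.next:
            self.next.push(pkt)

class Counter(Element):
    # Exposes a readable variable, like Counter::count in Click.
    def __init__(self):
        super().__init__()
        self.count = 0

    def push(self, pkt):
        self.count += 1
        super().push(pkt)

class Discard(Element):
    def push(self, pkt):
        pass  # drop the packet

src, ctr = Element(), Counter()
src.connect(ctr).connect(Discard())   # src -> ctr -> Discard
for p in (b"a", b"b", b"c"):
    src.push(p)
print(ctr.count)   # 3
```

The point of the model is that each element does one job and configurations are just wiring; Click's real elements are C++ classes composed the same way.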
A simple (Click-based) firewall example

in  :: FromNetFront(DEVMAC 00:11:22:33:44:55, BURST 1024);
out :: ToNetFront(DEVMAC 00:11:22:33:44:55, BURST 1);
filter :: IPFilter(
    allow src host 10.0.0.1 && dst host 10.1.0.1 && udp,
    drop all);

in -> CheckIPHeader(14) -> filter;
filter[0] -> Print("allow") -> out;
filter[1] -> Print("drop") -> Discard();
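The decision the IPFilter element makes above can be modeled as a plain predicate. A minimal Python sketch of the rule's logic (illustrative only; the function name is made up):

```python
# The IPFilter rule from the example, as a Python predicate:
#   allow src host 10.0.0.1 && dst host 10.1.0.1 && udp, drop all
def firewall_allows(src, dst, proto):
    return src == "10.0.0.1" and dst == "10.1.0.1" and proto == "udp"

print(firewall_allows("10.0.0.1", "10.1.0.1", "udp"))   # True  -> output 0 (allow)
print(firewall_allows("10.0.0.1", "10.1.0.1", "tcp"))   # False -> output 1 (drop)
```

Matching packets leave on the filter's output 0 (printed "allow" and forwarded), everything else on output 1 (printed "drop" and discarded).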
What's ClickOS?

[Figure: a standard Xen domU runs apps on a full paravirtualized guest OS; a ClickOS domain runs Click directly on a paravirtualized MiniOS.]

▐ Work consisted of:
  Build system to create ClickOS images (5 MB in size)
  Emulating a Click control plane over MiniOS/Xen
  Reducing boot times (roughly 30 milliseconds)
  Optimizations to the data plane (10 Gb/s for almost all packet sizes)
Performance analysis

pkt size (bytes)    10 Gb/s line rate
64                  14.8 Mp/s
128                  8.4 Mp/s
256                  4.5 Mp/s
512                  2.3 Mp/s
1024                 1.2 Mp/s
1500                 810 Kp/s

[Figure: baseline packet path. The driver domain (or Dom0) runs the NW driver, a Linux/OVS bridge, a vif, and netback; the ClickOS domain runs netfront and Click (FromNetfront/ToNetfront). Data crosses the Xen ring API, with the Xen bus/store and an event channel for control. Measured rates at points along the path: 350 Kp/s, 300 Kp/s (maximum-sized packets), and 225 Kp/s into the ClickOS domain.]
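The line-rate column follows directly from frame size plus the fixed per-frame wire overhead of 10 Gb/s Ethernet. A quick sketch, assuming standard Ethernet framing:

```python
# 10 Gb/s Ethernet line rate vs. frame size. Each frame carries 20 bytes of
# extra wire overhead (7 B preamble + 1 B SFD + 12 B inter-frame gap) on top
# of the frame itself (which already includes the 4 B FCS).
def line_rate_mpps(frame_bytes, link_bps=10e9, overhead=20):
    return link_bps / ((frame_bytes + overhead) * 8) / 1e6

for size in (64, 128, 256, 512, 1024):
    print(size, round(line_rate_mpps(size), 2))
```

This reproduces the table to rounding (64 B gives 14.88 Mp/s, 1024 B gives 1.2 Mp/s). The 1500-byte row presumably refers to an MTU-sized payload, i.e., a 1518-byte frame with the 14 B Ethernet header and FCS, which gives roughly 0.81 Mp/s.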
Main issues
▐ The backend switch (bridge / Open vSwitch) is slow
▐ Copying pages between domains (grant copy) greatly affects packet I/O
 – Copies are batched, but still expensive
▐ Packet metadata (skb or mbuf) allocations
▐ The MiniOS netfront is not as good as Linux's
 – 225 Kpps vs. 430 Kpps Tx
 – only 8 Kpps Rx
Optimizing Network I/O – Backend Switch

[Figure: the Linux/OVS bridge is replaced by a VALE switch port in the driver domain; the NW driver runs in netmap mode, and netback/netfront still exchange data over the Xen ring API, with the Xen bus/store and an event channel for control.]

▐ Introduce VALE as the backend switch
 – The NIC switches to netmap mode
▐ Slight modifications to the netback driver only
▐ Batch more I/O requests through multi-page rings
▐ Removed packet metadata manipulation
▐ Result: 625 Kpps (1500-byte packets, 2.7x improvement) and 1.2 Mpps (64-byte packets, 4.2x improvement)
Background - Netmap
▐ Fast packet I/O framework
 – 14.88 Mpps on 1 core at 900 MHz
▐ Available in FreeBSD 9+
 – Also runs on Linux
▐ Minimal device driver modifications
 – Critical resources (NIC registers, physical buffer addresses, and descriptors) are not exposed to the user
 – The NIC works in a special mode, bypassing the host stack
▐ Amortizes syscall cost by using large batches
▐ Preallocated packet buffers, memory-mapped to userspace

netmap: a novel framework for fast packet I/O
http://info.iet.unipi.it/~luigi/netmap/
Luigi Rizzo, Università di Pisa
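The batching bullet can be made concrete with a toy cost model. The numbers below are made up for illustration; the point is only that a fixed per-syscall cost, amortized over a batch of B packets, quickly becomes negligible next to per-packet work:

```python
# Toy model of syscall amortization via batching (illustrative numbers only).
# total per-packet cost = (fixed syscall cost / batch size) + per-packet work
def per_packet_cost_ns(batch, syscall_ns=1000, per_pkt_ns=20):
    return syscall_ns / batch + per_pkt_ns

print(per_packet_cost_ns(1))     # 1020.0  -> one syscall per packet dominates
print(per_packet_cost_ns(256))   # 23.90625 -> close to the bare per-packet cost
```

This is why netmap (and later the ClickOS datapath) moves many packets per poll/ioctl rather than one.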
Background - VALE Software Switch
▐ High-performance software switch based on the netmap API (18 Mpps between virtual ports, one CPU core)
▐ Packet processing is “modular”
 – Defaults to a learning bridge
 – Processing modules are independent kernel modules
▐ Applications use the netmap API

VALE, a Virtual Local Ethernet
http://info.iet.unipi.it/~luigi/vale/
Luigi Rizzo, Giuseppe Lettieri, Università di Pisa
Optimizing Network I/O

[Figure: netfront in the ClickOS domain now maps the VALE switch's netmap rings directly; data moves over the netmap API, with TX/RX event channels and the Xen bus/store for control.]

▐ No longer need the extra copy between domains
▐ Netmap rings (in the VALE switch) are mapped all the way into the guest
▐ An I/O request doesn't require a response before the guest consumes it
▐ Event channels are used to proxy netmap operations between the guest and VALE
▐ Breaks other (non-MiniOS) guests :(
 – But we have implemented a netmap-based Linux netfront driver
Optimizing Network I/O – Initialization and Memory Usage
▐ Netmap buffers are contiguous pages in guest memory

slots (per ring)    memory (KB)    # grants (per ring)
64                  135            33
128                 266            65
256                 528            130
512                 1056           259
1024                2117           516
2048                4231           1033

▐ Buffers are 2 KB in size, so each page fits 2 buffers
▐ The ring itself fits in 1 page for 64 and 128 slots (2+ pages for 256+ slots)

[Figure: initialization sequence between the driver domain (VALE, netback) and the MiniOS guest (netfront, app):
 1. netback opens the netmap device
 2. netback registers a VALE port
 3. ring/buffer pages are granted to the guest
 4. netfront reads the ring grant refs from the xenstore; buffer refs are read from the mapped ring slots]
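The grant counts in the table fall out of simple page arithmetic. A sketch, where the ring-page counts per slot size are inferred from the table and the slide's own bullets (the exact KB figures include small bookkeeping overheads not modeled here):

```python
# Grants needed to share one netmap ring with a guest (sketch).
PAGE = 4096   # Xen grants are per 4 KB page
BUF = 2048    # netmap buffers are 2 KB, so 2 buffers per page

def grants_per_ring(slots, ring_pages):
    # one buffer per slot, packed two to a page, plus the ring page(s)
    buf_pages = (slots * BUF + PAGE - 1) // PAGE
    return buf_pages + ring_pages

print(grants_per_ring(64, 1))     # 33   (32 buffer pages + 1 ring page)
print(grants_per_ring(128, 1))    # 65
print(grants_per_ring(256, 2))    # 130
print(grants_per_ring(1024, 4))   # 516
```

These match the table's grant column, which is why larger rings get expensive: grant count grows linearly with slot count.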
Optimizing Network I/O – Synchronization
▐ As in any netmap application, operation is done in the sender's context
▐ The backend/frontend private copy of ring state is not included in the shared ring page(s)
▐ Event channels are used for synchronization

[Figure: the guest (MiniOS) netfront fills buffer slots with packets to transmit and kicks netback over the TX event channel; netback in Domain-0 forwards the slots to the mapped VALE port and signals "backend finished" back to the guest.]
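The TX handshake described above can be sketched with ordinary threads. This is an analogy only: real Xen event channels are hypervisor-managed, interrupt-like notifications, not Python `Event` objects, and the ring here is just a list:

```python
# Toy model of the netfront/netback TX event-channel handshake (illustrative).
import threading

ring = []                      # stands in for the shared netmap ring slots
tx_event = threading.Event()   # stands in for the TX event channel
done = threading.Event()       # stands in for the "backend finished" signal

def backend():                 # netback: wait for a kick, then drain to VALE
    tx_event.wait()
    while ring:
        ring.pop(0)            # "transmit" the slot
    done.set()

t = threading.Thread(target=backend)
t.start()
ring.extend([b"pkt1", b"pkt2"])  # frontend fills slots with packets...
tx_event.set()                   # ...then kicks the backend
done.wait()
t.join()
print(len(ring))               # 0: all slots consumed
```

The key property mirrored here is that the data path (the ring) and the notification path (the event channel) are separate, so one kick can cover a whole batch of slots.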
EVALUATION

ClickOS Base Performance
[Figure: RX and TX throughput. Testbed: Intel Xeon E1220 4-core 3.2 GHz, 16 GB RAM, dual-port Intel x520 10 Gb/s NIC; one CPU core assigned to the VM, the rest to dom0.]

Scaling out – Multiple NICs/VMs
[Figure: throughput with multiple NICs and VMs. Testbed: Intel Xeon E1650 6-core 3.2 GHz, 16 GB RAM, dual-port Intel x520 10 Gb/s NIC; 3 cores assigned to VMs, 3 cores to dom0.]

Linux Guest Performance
ClickOS (virtualized) Middlebox Performance
ClickOS Delay vs. Other Systems
Conclusions
▐ Presented ClickOS:
  Tiny (5 MB) Xen VM tailored to network processing
  Can be booted (on demand) in 30 milliseconds
  Can achieve 10 Gb/s throughput using only a single core
  Can run a wide range of middleboxes with high throughput
▐ Future work:
  Improving performance on NUMA systems
  High consolidation of ClickOS VMs (thousands)
  Service chaining
MiniOS (pkt-gen) Performance
[Figure: RX and TX throughput with the netmap pkt-gen tool in MiniOS.]

Scaling Out – Multiple VMs TX
[Figure: aggregate TX throughput as the number of VMs grows.]

ClickOS VM and Middlebox Boot Time
[Figure: boot time breakdown, 220 milliseconds vs. 30 milliseconds.]

XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

  • 1. Enabling Fast, Dynamic Network Processing with ClickOS Joao Martins*, Mohamed Ahmed*, Costin Raiciu§, Roberto Bifulco*, Vladimir Olteanu§, Michio Honda*, Felipe Huici* * NEC Labs Europe, Heidelberg, Germany § University Politehnica of Bucharest firstname.lastname@neclab.eu, firstname.lastname@cs.pub.ro
  • 3. A Middlebox World ad insertion WAN accelerator BRAS carrier-grade NAT transcoder IDS session border controller load balancer DDoS protection firewall QoE monitor Page 3 DPI
  • 4. Hardware Middleboxes - Drawbacks ▐ Middleboxes are useful, but… Expensive Difficult to add new features, lock-in Difficult to manage Cannot be scaled with demand Cannot share a device among different tenants Hard for new players to enter market ▐ Clearly shifting middlebox processing to a software-based, multi-tenant platform would address these issues But can it be built using commodity hardware while still achieving high performance? ▐ ClickOS: tiny Xen-based virtual machine that runs Click Page 4
  • 5. Click Runtime ▐ Modular architecture for network processing ▐ Based around the concept of “elements” ▐ Elements are connected in a configuration file ▐ A configuration is installed via a command line executable  (e.g., click-install router.click) ▐ An element  Can be configured with parameters (e.g., Queue::length)  Can expose read and write variables available via sockets or the /proc system under Linux (e.g., Counter::reset, Counter::count)  Compiled 262/300 elements  Programmers can write new ones to extend Click runtime Page 5
  • 6. A simple (click-based) firewall example in :: FromNetFront(DEVMAC 00:11:22:33:44:55, BURST 1024); out :: ToNetFront(DEVMAC 00:11:22:33:44:55, BURST 1); filter :: IPFilter( allow src host 10.0.0.1 && dst host 10.1.0.1 && udp, drop all); in -> CheckIPHeader(14) -> filter filter[0] -> Print(“allow”) -> out; filter[1] -> Print(“drop”) -> Discard(); Page 6
  • 7. What's ClickOS ? domU ClickOS apps Click guest OS mini OS paravirt paravirt ▐ Work consisted of:  Build system to create ClickOS images (5 MB in size)  Emulating a Click control plane over MiniOS/Xen  Reducing boot times (roughly 30 miliseconds)  Optimizations to the data plane (10 Gb/s for almost all pkt sizes) Page 7
  • 8. Performance analysis pkt size (bytes) 10Gb rate 64 14.8 Mp/s 128 8.4 Mp/s 256 4.5 Mp/s 512 2.3 Mp/s 1024 1.2 Mp/s Driver Domain (or Dom 0) netback NW driver Linux/OVS bridge 1500 vif Xen bus/store 810 Kp/s ClickOS Domain netfront Event channel Click FromNetfront ToNetfront Xen ring API (data) 300* Kp/s 350 Kp/s 225 Kp/s * - maximum-sized packets Page 8
  • 9. Main issues
▐ The backend switches (bridge/openvswitch) are slow
▐ Copying pages between domains (grant copy) greatly affects packet I/O
– These copies are done in batches, but are still expensive
▐ Packet metadata (skb or mbuf) allocations
▐ The MiniOS netfront is not as good as Linux's
– 225 Kpps vs. 430 Kpps Tx
– only 8 Kpps Rx
Page 9
© NEC Corporation 2009
  • 10. Optimizing Network I/O – Backend Switch

[Diagram: the NW driver now runs in netmap mode and attaches to a VALE port; netback connects the VALE switch to the ClickOS domain's netfront over the Xen ring API, with the Xen bus/store and event channel alongside]

▐ Introduce VALE as the backend switch
– The NIC switches to netmap mode
▐ Slight modifications to the netback driver only
▐ Batch more I/O requests through multi-page rings
▐ Removed packet metadata manipulation
▐ Result: 625 Kpps for 1500-byte packets (2.7x improvement) and 1.2 Mpps for 64-byte packets (4.2x improvement)
Page 10
  • 11. Background – Netmap
▐ Fast packet I/O framework
– 14.88 Mpps on 1 core at 900 MHz
▐ Available in FreeBSD 9+
– Also runs on Linux
▐ Minimal device driver modifications
– Critical resources (NIC registers, physical buffer addresses, and descriptors) are not exposed to the user
– The NIC works in a special mode, bypassing the host stack
▐ Amortizes syscall costs by using large batches
▐ Packet buffers are preallocated and memory-mapped into userspace

Netmap – a novel framework for fast packet I/O
Luigi Rizzo, Universita di Pisa
http://info.iet.unipi.it/~luigi/netmap/
Page 11
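The batching point can be made concrete with a toy cost model: one syscall moves a whole ring's worth of packets, so its fixed cost is spread across the batch. The costs below are illustrative assumptions, not netmap measurements.

```python
# Toy model of syscall amortization, the effect netmap's batched
# poll()/ioctl() interface exploits. Both constants are assumptions
# chosen only to show the shape of the curve.
SYSCALL_NS = 1000   # assumed fixed cost of one syscall
PER_PKT_NS = 20     # assumed per-packet processing cost

def ns_per_packet(batch_size):
    """Average per-packet cost when one syscall moves batch_size packets."""
    return SYSCALL_NS / batch_size + PER_PKT_NS

for b in (1, 8, 64, 512):
    print(f"batch={b:4d}: {ns_per_packet(b):7.1f} ns/pkt")
```

With a batch of one, the syscall dominates; with a few hundred packets per batch, the per-packet cost approaches the pure processing cost.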
  • 12. Background – VALE Software Switch
▐ High-performance switch based on the netmap API (18 Mpps between virtual ports on one CPU core)
▐ Packet processing is "modular"
– Defaults to a learning bridge
– Modules are independent kernel modules
▐ Applications use the netmap API

VALE, a Virtual Local Ethernet
Luigi Rizzo, Giuseppe Lettieri, Universita di Pisa
http://info.iet.unipi.it/~luigi/vale/
Page 12
  • 13. Optimizing Network I/O

[Diagram: netfront now speaks the netmap API directly to the VALE switch; netback only handles setup via the Xen bus/store, and TX/RX event channels connect the domains]

▐ No longer need the extra copy between domains
▐ Netmap rings (in the VALE switch) are mapped all the way into the guest
▐ An I/O request doesn't require a response before being consumed by the guest
▐ Event channels are used to proxy netmap operations between the guest and VALE
▐ Breaks other (non-MiniOS) guests :(
– But we have implemented a netmap-based Linux netfront driver
Page 13
  • 14. Optimizing Network I/O – Initialization and Memory Usage
▐ Netmap buffers are contiguous pages in guest memory
▐ Buffers are 2 KB in size; each 4 KB page fits 2 buffers
▐ The ring fits in 1 page for 64 and 128 slots (2+ pages for 256+ slots)

slots (per ring)    KB    # grants (per ring)
64                 135     33
128                266     65
256                528    130
512               1056    259
1024              2117    516
2048              4231   1033

[Diagram: VALE in the driver domain, the netmap buffer pool (buf slots 0, 1, 2, …), and the MiniOS guest's netfront/app]

Initialization:
1. The netback opens the netmap device
2. and registers a VALE port
3. Ring/buffer pages are granted to the guest
4. The netfront reads the ring grant refs from the Xenstore; buffer refs are read from the mapped ring slots
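The grant counts in the table above can be approximated from the layout described on this slide: two 2 KB buffers per 4 KB page, plus the pages holding the ring itself. The slot-descriptor and ring-header sizes below are assumptions, and the model matches the table to within one page.

```python
# Rough grant-count estimate for mapping one netmap ring into a guest.
# SLOT and RING_HDR are assumed sizes, not values from the slides.
import math

PAGE = 4096
BUF = 2048       # buffer size from the slide: two buffers per page
SLOT = 16        # assumed bytes per netmap slot descriptor
RING_HDR = 256   # assumed ring header size

def grants(slots):
    """Pages (grants) needed for a ring's buffers plus the ring itself."""
    buf_pages = slots * BUF // PAGE                       # 2 KB buffers
    ring_pages = math.ceil((RING_HDR + slots * SLOT) / PAGE)
    return buf_pages + ring_pages

for slots in (64, 128, 256, 512, 1024, 2048):
    print(f"{slots:5d} slots -> {grants(slots)} grants")
```

For example, 64 slots need 32 buffer pages plus 1 ring page, i.e. 33 grants, matching the first table row.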
  • 15. Optimizing Network I/O – Synchronization
▐ As in any netmap application, operation is done in the sender's context
▐ The backend's/frontend's private copies are not included in the shared ring page(s)
▐ Event channels are used for synchronization

[Diagram: the guest's buffer slots, mapped into VALE in Domain-0; the netfront queues packets to transmit, and the netback signals "backend finished TX" over the event channel]
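The handshake on this slide can be sketched with ordinary threads, using `threading.Event` in place of a Xen event channel. This is purely illustrative: the names and structure are stand-ins, not the MiniOS/netback implementation.

```python
# Sketch of the netfront/netback synchronization: the frontend fills
# ring slots and kicks an event channel; the backend drains the ring
# and signals TX completion back. threading.Event stands in for the
# Xen event channel; a deque stands in for the shared ring pages.
import threading
from collections import deque

ring = deque()                    # stands in for the shared ring
ring_lock = threading.Lock()
tx_event = threading.Event()      # frontend -> backend: "work available"
done_event = threading.Event()    # backend -> frontend: "TX finished"

def netback():
    tx_event.wait()               # woken via the "event channel"
    with ring_lock:
        while ring:
            pkt = ring.popleft()  # backend consumes, then hands to VALE
    done_event.set()              # notify the frontend

backend = threading.Thread(target=netback)
backend.start()

with ring_lock:                   # frontend enqueues a batch of packets
    for i in range(3):
        ring.append(f"pkt-{i}")
tx_event.set()                    # kick the backend

done_event.wait()                 # frontend waits for TX completion
backend.join()
print("ring drained:", len(ring) == 0)
```

The point of the design is that only the kick and the completion notification cross the domain boundary; the packet data itself moves through the already-mapped ring.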
  • 17. ClickOS Base Performance
[Graphs: TX and RX throughput]
Intel Xeon E1220 4-core 3.2 GHz, 16 GB RAM, dual-port Intel x520 10 Gb/s NIC. One CPU core assigned to the VM, the rest to dom0.
  • 18. Scaling out – Multiple NICs/VMs
Intel Xeon E1650 6-core 3.2 GHz, 16 GB RAM, dual-port Intel x520 10 Gb/s NIC. 3 cores assigned to VMs, 3 cores to dom0.
  • 21. ClickOS Delay vs. Other Systems
  • 22. Conclusions
Presented ClickOS:
 A tiny (5 MB) Xen VM tailored to network processing
 Can be booted (on demand) in 30 milliseconds
 Can achieve 10 Gb/s throughput using only a single core
 Can run a wide range of middleboxes with high throughput
Future work:
 Improving performance on NUMA systems
 High consolidation of ClickOS VMs (thousands)
 Service chaining
Page 22
  • 25. Scaling Out – Multiple VMs
[Graph: TX throughput]
  • 26. ClickOS VM and middlebox boot time
[Graph: boot time breakdown – 220 milliseconds and 30 milliseconds]