SlideShare a Scribd company logo
Network Stack in 
Userspace (NUSE) 
! 
! 
Hajime Tazaki 
高速PCルーター研究会 2014/9/29
Today’s talk 
• Userspace version of (Linux) network 
stack 
• not intended for high-speed something 
• but useful for high-speed network I/O 
2
I have a new Layer-3/4 
protocol! Yey! 
• I have new, great Layer-3/4 protocol ! It will 
change the WORLD ! 
• network stack って、入れかえたいですか? 
• No: your code will destroy my life ?! 
(experimental ? not tested ?) 
• Yes: I wanna be your slave. 
• VM cloud = OK, no much users/services interfere 
• multi-user server, PC, phone = Nightmare, my life 
will have trouble… 
3
I have a new Layer-3/4 
protocol! Yey! (cont’d) 
• Kernel programming sucks 
• LKM ? can cause panic anyway.. 
• Click ? only router/middlebox, not for 
end-hosts 
• Slow evolution 
• VM ? Hmm, I’m a lazy guy.. 
4
costin.raiciu@cs.pub.ro, j.araujo@ucl.ac.uk, rizzo@iet.unipi.it 
Internet paths 
that it is still 
despite the 
the blame 
extensions taking 
placed on end 
moving protocols 
deployment 
optimizations. 
support for user-level 
commodity 
number of 
host stack, 
s. 
our mux/de-mux 
line rate (up 
Slow evolution of network stack 
Honda et al., Rekindling Network Protocol Innovation with User-Level Stacks, ACM 
SIGCOMM CCR, Vol.44, Num. 2, April 2014 
cores, and 
over a basic 
same server 
1.00 
0.75 
0.50 
0.25 
0.00 
2007 2008 2009 2010 2011 2012 
Date 
Ratio of flows 
Option 
SACK 
Timestamp 
Windowscale 
Direction 
Inbound 
Outbound 
Figure 1: TCP options deployment over time. 
pen infrequently not only because of slow release cycles, but 
also due to their cost and potential disruption to existing 
setups. If protocol stacks were embedded into applications, 
they could be updated on a case-by-case basis, and deploy-ment 
would be a lot more timely. 
For example, Mac OS, Windows XP and FreeBSD still 
use a traditional Additive Increase Multiplicative Decrease 
(AIMD) algorithm for TCP congestion control, while Linux
Virtual Machine ? 
Poll: “When you download and run software, how often do you use a virtual machine (to reduce 
security risks)?” 
Jon Howell, Galen Hunt, David Molnar, and Donald E. Porter, Living Dangerously: A Survey of Software Download 
Practices, no. MSR-TR-2010-51, May 2010 
6
Meanwhile in 
Filesystem world.. 
• There is, 
• Filesystem in Userspace 
(FUSE) 
• Userspace code can host 
new filesystem (sshfs, 
GmailFS, etc) 
• Performance is bad, 
but doesn’t matter 
• Flexibility and 
functionality do matter 
7 
http://fuse.sourceforge.net/
Problem Statements 
• Slow evolution of network stack 
• Interfere to host OS (which is 
untouchable) 
• Too heavy workload of VM 
8
What’s NUSE ? 
• Network stack in Userspace 
• Userspace as much as possible 
• like Fuse (Filesystem in Userspace) 
• Library version of network stack (of 
monolithic kernel) 
• kernel bypassed 
• (UNIX) Process-based virtualization 
9
What can do with NUSE ? 
• Host operating system 
• Linux (for the moment) 
• Guest operating systems 
• Linux (3.17-rc1 based) 
• FreeBSD (ongoing) 
• Suitable with kernel-bypass technologies 
• DPDK/netmap with (full) network stack + (existing) applications 
• Applications 
• ping, iperf, nginx (partially worked) 
10
FUSE vs NUSE 
11 
nuse example 
kernel bypassed 
TCP/IP 
ARP/ 
ndisc 
libnuse 
glibc 
NIC 
userspace 
kernel 
raw sock 
netmap 
DPDK (etc) 
libfuse 
glibc glibc 
VFS 
FUSE 
...... 
NFS 
ext3 
ls -l 
/tmp/fuse 
example 
/tmp/fuse 
userspace 
kernel
Design Goals 
• No modification to userspace apps 
• No mod to kernel space as well 
• Transparent 
• LD_PRELOADable 
• x1 performance of native OS 
12
Application 
POSIX glue 
TCP UDP DCCP SCTP 
ICMP ARP 
IPv6 IPv4 
Qdisc 
Netfilter Bridging 
Netlink 
IPSec Tunneling 
Kernel layer 
NUSE core 
bottom halves/ 
rcu/timer/ 
interrupt 
struct 
net_device 
RAW DPDK netmap ... 
NIC 
Recipe 
petit-scheduler 
1. (monolithic) kernel 
source 
2. petit-scheduler 
3. POSIX glue 
• redirect system calls (at 
libc-level) 
4. network I/O 
• raw socket, DPDK, netmap, 
etc.. 
13
1) kernel build 
Application 
POSIX glue 
TCP UDP DCCP SCTP 
ICMP ARP 
IPv6 IPv4 
Qdisc 
Netfilter Bridging 
Netlink 
IPSec Tunneling 
Kernel layer 
NUSE core 
bottom halves/ 
rcu/timer/ 
interrupt 
struct 
net_device 
RAW DPDK netmap ... 
NIC 
petit-scheduler 
• patch to kernel tree 
• with new (hw independent) 
arch (arch/sim) 
• robust to (frequent) 
mainstream changes 
• build kernel source tree 
w/ the patch 
• make menuconfig ARCH=sim 
• make library ARCH=sim 
• ➔ libnuse-linux-3.17-rc1.so 
14
2) petit scheduler 
• offer alternate context 
primitives 
Application 
POSIX glue 
TCP UDP DCCP SCTP 
ICMP ARP 
IPv6 IPv4 
Qdisc 
Netfilter Bridging 
Netlink 
IPSec Tunneling 
Kernel layer 
NUSE core 
bottom halves/ 
rcu/timer/ 
interrupt 
struct 
net_device 
RAW DPDK netmap ... 
NIC 
petit-scheduler 
• interrupts, timer, thread, 
bottom halves (tasklet, 
workqueue, waiter, etc) 
! 
• Implemented with POSIX 
thread 
• easily debuggable 
• ucontext fiber for low 
overhead (not yet) 
15
3) POSIX glue code 
Application 
POSIX glue 
TCP UDP DCCP SCTP 
ICMP ARP 
IPv6 IPv4 
Qdisc 
Netfilter Bridging 
Netlink 
IPSec Tunneling 
Kernel layer 
NUSE core 
bottom halves/ 
rcu/timer/ 
interrupt 
struct 
net_device 
RAW DPDK netmap ... 
NIC 
petit-scheduler 
• Hijack function calls 
• socket => nuse_socket 
• read => nuse_read 
• libc level hijack 
• apps not aware of 
• LD_PRELOAD=libnuse.so .. 
• can’t catch int 0x80 
16
extern int sim_sock_socket (int,int,int, struct socket **); 
int socket (int family, int type, int proto) 
{ 
sim_update_jiffies (); 
struct socket *kernel_socket = 
sim_malloc (sizeof (struct socket)); 
memset (kernel_socket, 0, sizeof (struct socket)); 
int ret = sim_sock_socket (family, type, proto, &kernel_socket); 
g_fd_table[curfd++] = kernel_socket; 
sim_softirq_wakeup (); 
return curfd - 1; 
} 
https://github.com/thehajime/net-next-nuse/blob/nuse/arch/sim/nuse-glue.c
4) network I/O 
Application 
POSIX glue 
TCP UDP DCCP SCTP 
ICMP ARP 
IPv6 IPv4 
Qdisc 
Netfilter Bridging 
Netlink 
IPSec Tunneling 
Kernel layer 
NUSE core 
bottom halves/ 
rcu/timer/ 
interrupt 
struct 
net_device 
RAW DPDK netmap ... 
NIC 
petit-scheduler 
• connect NUSE to NIC 
• options 
• raw socket (general) 
• DPDK (if available) 
• netmap (if available) 
• Tap ? 
18
tatic netdev_tx_t 
kernel_dev_xmit(struct sk_buff *skb, 
struct net_device *dev) 
{ 
netif_stop_queue(dev); 
sim_dev_xmit ((struct SimDevice *)dev, skb->data, skb->len); 
dev_kfree_skb(skb); 
netif_wake_queue(dev); 
return 0; 
} 
static const struct net_device_ops sim_dev_ops = { 
.ndo_start_xmit = kernel_dev_xmit, 
.ndo_set_mac_address = eth_mac_addr, 
}; 
void sim_dev_rx (struct SimDevice *device, struct SimDevicePacket 
packet) 
{ 
struct sk_buff *skb = packet.token; 
struct net_device *dev = &device->dev; 
skb->protocol = eth_type_trans(skb, dev); 
skb->ip_summed = CHECKSUM_PARTIAL; // Do the TCP checksum (FIXME: 
should be configurable) 
! 
netif_rx (skb); 
} 
https://github.com/thehajime/net-next-nuse/blob/nuse/arch/sim/sim-device.c
How to use NUSE ? 
• download 
• git clone git://github.com/thehajime/net-next-nuse 
• compile 
• make library ARCH=sim NETMAP=yes 
• execute 
• sudo ./nuse (application) 
• success ? : lucky guy ! 
• fail: add hijack calls 
20
Alternatives 
• Container (LXC, OpenVZ, vimage) 
• share kernel with host operating system (no flexibility) 
• virtual machine (KVM,Xen,UML) 
• flexible/functional, but heavy bootstrap 
• Library OS 
• full scratch: mtcp, Mirage, lwIP 
• Porting: OSv, Sandstorm, libuinet (FreeBSD), Arrakis 
(lwIP), OpenOnload (lwIP?) 
• Glue-layer: LKL (Linux-2.6), Rump (NetBSD) 
21
Alternatives (cont’d) 
Rumpkernel 
• https://github.com/rumpkernel/wiki/wiki 
• One binary runs on everywhere 
• Linux,xBSD,Soralis,cygwin Host 
• Xen Dom-U 
• Bare metal (hardware, KVM, Virtualbox) 
• Well-defined API (hypercall) 
! 
• Only NetBSD network stack is available 
22
Evaluation 
• Performance ? 
• not good so far.. 
• Generality 
• Run all applications ? up to POSIX 
coverage 
23
next time..
Ongoings 
• (efficient) thread scheduling 
• batch Tx/Rx 
• fork(2)/exec(2) 
• multi-processes 
! 
• => migrate to rumpkernel ? 
25
Summary 
• Network Stack in Userspace (NUSE) 
• network stack library 
• light virtualization 
• fast evolution, easy deployments 
https://github.com/thehajime/net-next-nuse 
26
GASPP: A GPU-Accelerated Stateful 
Packet Processing Framework 
Giorgos Vasiliadis, Lazaros Koromilas, Michalis Polychronakis, and Sotiris Ioannidis, GASPP: A GPU-Accelerated Stateful Packet 
Processing Framework, USENIX ATC 2014, June, 2014 
28

More Related Content

What's hot

NUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osioNUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osio
Hajime Tazaki
 
Kernelvm 201312-dlmopen
Kernelvm 201312-dlmopenKernelvm 201312-dlmopen
Kernelvm 201312-dlmopenHajime Tazaki
 
Recent advance in netmap/VALE(mSwitch)
Recent advance in netmap/VALE(mSwitch)Recent advance in netmap/VALE(mSwitch)
Recent advance in netmap/VALE(mSwitch)micchie
 
How to Speak Intel DPDK KNI for Web Services.
How to Speak Intel DPDK KNI for Web Services.How to Speak Intel DPDK KNI for Web Services.
How to Speak Intel DPDK KNI for Web Services.
Naoto MATSUMOTO
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
Kernel TLV
 
Userspace networking
Userspace networkingUserspace networking
Userspace networking
Stephen Hemminger
 
Netmap presentation
Netmap presentationNetmap presentation
Netmap presentation
Amir Razmjou
 
VLANs in the Linux Kernel
VLANs in the Linux KernelVLANs in the Linux Kernel
VLANs in the Linux Kernel
Kernel TLV
 
Understanding DPDK
Understanding DPDKUnderstanding DPDK
Understanding DPDK
Denys Haryachyy
 
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce RichardsonThe 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
harryvanhaaren
 
DPDK KNI interface
DPDK KNI interfaceDPDK KNI interface
DPDK KNI interface
Denys Haryachyy
 
Introduction to eBPF and XDP
Introduction to eBPF and XDPIntroduction to eBPF and XDP
Introduction to eBPF and XDP
lcplcp1
 
FreeBSD and Drivers
FreeBSD and DriversFreeBSD and Drivers
FreeBSD and Drivers
Kernel TLV
 
mSwitch: A Highly-Scalable, Modular Software Switch
mSwitch: A Highly-Scalable, Modular Software SwitchmSwitch: A Highly-Scalable, Modular Software Switch
mSwitch: A Highly-Scalable, Modular Software Switch
micchie
 
Intel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructions
Hisaki Ohara
 
Dpdk applications
Dpdk applicationsDpdk applications
Dpdk applications
Vipin Varghese
 
Achieving Performance Isolation with Lightweight Co-Kernels
Achieving Performance Isolation with Lightweight Co-KernelsAchieving Performance Isolation with Lightweight Co-Kernels
Achieving Performance Isolation with Lightweight Co-Kernels
Jiannan Ouyang, PhD
 
CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016] CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016]
IO Visor Project
 
Modern Linux Tracing Landscape
Modern Linux Tracing LandscapeModern Linux Tracing Landscape
Modern Linux Tracing Landscape
Kernel TLV
 

What's hot (20)

mTCP使ってみた
mTCP使ってみたmTCP使ってみた
mTCP使ってみた
 
NUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osioNUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osio
 
Kernelvm 201312-dlmopen
Kernelvm 201312-dlmopenKernelvm 201312-dlmopen
Kernelvm 201312-dlmopen
 
Recent advance in netmap/VALE(mSwitch)
Recent advance in netmap/VALE(mSwitch)Recent advance in netmap/VALE(mSwitch)
Recent advance in netmap/VALE(mSwitch)
 
How to Speak Intel DPDK KNI for Web Services.
How to Speak Intel DPDK KNI for Web Services.How to Speak Intel DPDK KNI for Web Services.
How to Speak Intel DPDK KNI for Web Services.
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
 
Userspace networking
Userspace networkingUserspace networking
Userspace networking
 
Netmap presentation
Netmap presentationNetmap presentation
Netmap presentation
 
VLANs in the Linux Kernel
VLANs in the Linux KernelVLANs in the Linux Kernel
VLANs in the Linux Kernel
 
Understanding DPDK
Understanding DPDKUnderstanding DPDK
Understanding DPDK
 
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce RichardsonThe 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
The 7 Deadly Sins of Packet Processing - Venky Venkatesan and Bruce Richardson
 
DPDK KNI interface
DPDK KNI interfaceDPDK KNI interface
DPDK KNI interface
 
Introduction to eBPF and XDP
Introduction to eBPF and XDPIntroduction to eBPF and XDP
Introduction to eBPF and XDP
 
FreeBSD and Drivers
FreeBSD and DriversFreeBSD and Drivers
FreeBSD and Drivers
 
mSwitch: A Highly-Scalable, Modular Software Switch
mSwitch: A Highly-Scalable, Modular Software SwitchmSwitch: A Highly-Scalable, Modular Software Switch
mSwitch: A Highly-Scalable, Modular Software Switch
 
Intel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructions
 
Dpdk applications
Dpdk applicationsDpdk applications
Dpdk applications
 
Achieving Performance Isolation with Lightweight Co-Kernels
Achieving Performance Isolation with Lightweight Co-KernelsAchieving Performance Isolation with Lightweight Co-Kernels
Achieving Performance Isolation with Lightweight Co-Kernels
 
CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016] CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016]
 
Modern Linux Tracing Landscape
Modern Linux Tracing LandscapeModern Linux Tracing Landscape
Modern Linux Tracing Landscape
 

Viewers also liked

XPDS13: Dual-Android on Nexus 10 - Lovene Bhatia, Samsung
XPDS13: Dual-Android on Nexus 10 - Lovene Bhatia, SamsungXPDS13: Dual-Android on Nexus 10 - Lovene Bhatia, Samsung
XPDS13: Dual-Android on Nexus 10 - Lovene Bhatia, Samsung
The Linux Foundation
 
Ethernet and TCP optimizations
Ethernet and TCP optimizationsEthernet and TCP optimizations
Ethernet and TCP optimizations
Jeff Squyres
 
Cinder Intro@Open Stack China Tour Beijing
Cinder Intro@Open Stack China Tour BeijingCinder Intro@Open Stack China Tour Beijing
Cinder Intro@Open Stack China Tour Beijing
OpenCity Community
 
Livestock, Land and the Changing Economy of Pastoralism
Livestock, Land and the Changing Economy of PastoralismLivestock, Land and the Changing Economy of Pastoralism
Livestock, Land and the Changing Economy of Pastoralism
futureagricultures
 
Big Data: Movement, Warehousing, & Virtualization
Big Data: Movement, Warehousing, & VirtualizationBig Data: Movement, Warehousing, & Virtualization
Big Data: Movement, Warehousing, & Virtualization
tervela
 
Атомная энергетика в Казахстане
Атомная энергетика в КазахстанеАтомная энергетика в Казахстане
Атомная энергетика в Казахстане
АО "Самрук-Казына"
 
World2016_T5_S5_SQLServerFunctionalOverview
World2016_T5_S5_SQLServerFunctionalOverviewWorld2016_T5_S5_SQLServerFunctionalOverview
World2016_T5_S5_SQLServerFunctionalOverviewFarah Omer
 
Java swing tips
Java swing tipsJava swing tips
Java swing tipsTuan Ngo
 
Danhmuccaccoso kcbkcb2011
Danhmuccaccoso kcbkcb2011Danhmuccaccoso kcbkcb2011
Danhmuccaccoso kcbkcb2011foxhuong
 
Third index
Third indexThird index
Third index
ezaz123
 
Exceedence PItch
Exceedence PItchExceedence PItch
Exceedence PItch
rayalco
 
SOP OC COMM
SOP OC COMMSOP OC COMM
SOP OC COMM
antiik
 
68 avenue Gurgaon 7428424386
68 avenue Gurgaon 742842438668 avenue Gurgaon 7428424386
68 avenue Gurgaon 7428424386
Adore Global Pvt. Ltd
 
Udział w badaniach w celu walki z miaŝdŝycą
Udział w badaniach w celu walki z miaŝdŝycąUdział w badaniach w celu walki z miaŝdŝycą
Udział w badaniach w celu walki z miaŝdŝycą
Xplore Health
 
Edtc 6340-66 copyright crash course alberto tudon 6th ed
Edtc 6340-66 copyright crash course  alberto tudon 6th edEdtc 6340-66 copyright crash course  alberto tudon 6th ed
Edtc 6340-66 copyright crash course alberto tudon 6th edalbertotudon
 

Viewers also liked (17)

XPDS13: Dual-Android on Nexus 10 - Lovene Bhatia, Samsung
XPDS13: Dual-Android on Nexus 10 - Lovene Bhatia, SamsungXPDS13: Dual-Android on Nexus 10 - Lovene Bhatia, Samsung
XPDS13: Dual-Android on Nexus 10 - Lovene Bhatia, Samsung
 
Ethernet and TCP optimizations
Ethernet and TCP optimizationsEthernet and TCP optimizations
Ethernet and TCP optimizations
 
Cinder Intro@Open Stack China Tour Beijing
Cinder Intro@Open Stack China Tour BeijingCinder Intro@Open Stack China Tour Beijing
Cinder Intro@Open Stack China Tour Beijing
 
Livestock, Land and the Changing Economy of Pastoralism
Livestock, Land and the Changing Economy of PastoralismLivestock, Land and the Changing Economy of Pastoralism
Livestock, Land and the Changing Economy of Pastoralism
 
Big Data: Movement, Warehousing, & Virtualization
Big Data: Movement, Warehousing, & VirtualizationBig Data: Movement, Warehousing, & Virtualization
Big Data: Movement, Warehousing, & Virtualization
 
Атомная энергетика в Казахстане
Атомная энергетика в КазахстанеАтомная энергетика в Казахстане
Атомная энергетика в Казахстане
 
Articles of Faith
Articles of FaithArticles of Faith
Articles of Faith
 
World2016_T5_S5_SQLServerFunctionalOverview
World2016_T5_S5_SQLServerFunctionalOverviewWorld2016_T5_S5_SQLServerFunctionalOverview
World2016_T5_S5_SQLServerFunctionalOverview
 
Java swing tips
Java swing tipsJava swing tips
Java swing tips
 
Danhmuccaccoso kcbkcb2011
Danhmuccaccoso kcbkcb2011Danhmuccaccoso kcbkcb2011
Danhmuccaccoso kcbkcb2011
 
Third index
Third indexThird index
Third index
 
Exceedence PItch
Exceedence PItchExceedence PItch
Exceedence PItch
 
SOP OC COMM
SOP OC COMMSOP OC COMM
SOP OC COMM
 
Our parents
Our parentsOur parents
Our parents
 
68 avenue Gurgaon 7428424386
68 avenue Gurgaon 742842438668 avenue Gurgaon 7428424386
68 avenue Gurgaon 7428424386
 
Udział w badaniach w celu walki z miaŝdŝycą
Udział w badaniach w celu walki z miaŝdŝycąUdział w badaniach w celu walki z miaŝdŝycą
Udział w badaniach w celu walki z miaŝdŝycą
 
Edtc 6340-66 copyright crash course alberto tudon 6th ed
Edtc 6340-66 copyright crash course  alberto tudon 6th edEdtc 6340-66 copyright crash course  alberto tudon 6th ed
Edtc 6340-66 copyright crash course alberto tudon 6th ed
 

Similar to Network Stack in Userspace (NUSE)

Running Applications on the NetBSD Rump Kernel by Justin Cormack
Running Applications on the NetBSD Rump Kernel by Justin Cormack Running Applications on the NetBSD Rump Kernel by Justin Cormack
Running Applications on the NetBSD Rump Kernel by Justin Cormack
eurobsdcon
 
Ceph in the GRNET cloud stack
Ceph in the GRNET cloud stackCeph in the GRNET cloud stack
Ceph in the GRNET cloud stack
Nikos Kormpakis
 
Include os @ flossuk 2018
Include os @ flossuk 2018Include os @ flossuk 2018
Include os @ flossuk 2018
Per Buer
 
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructure
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructureDevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructure
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructure
Angelo Failla
 
High performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User GroupHigh performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User Group
HungWei Chiu
 
[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...
[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...
[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...
OpenStack Korea Community
 
Linux Network Stack
Linux Network StackLinux Network Stack
Linux Network Stack
Adrien Mahieux
 
FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet Processing
Kernel TLV
 
FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)
Kirill Tsym
 
What Linux can learn from Solaris performance and vice-versa
What Linux can learn from Solaris performance and vice-versaWhat Linux can learn from Solaris performance and vice-versa
What Linux can learn from Solaris performance and vice-versa
Brendan Gregg
 
Introduction to Docker (and a bit more) at LSPE meetup Sunnyvale
Introduction to Docker (and a bit more) at LSPE meetup SunnyvaleIntroduction to Docker (and a bit more) at LSPE meetup Sunnyvale
Introduction to Docker (and a bit more) at LSPE meetup Sunnyvale
Jérôme Petazzoni
 
The State of Rootless Containers
The State of Rootless ContainersThe State of Rootless Containers
The State of Rootless Containers
Akihiro Suda
 
Bit_Bucket_x31_Final
Bit_Bucket_x31_FinalBit_Bucket_x31_Final
Bit_Bucket_x31_FinalSam Knutson
 
Rmll Virtualization As Is Tool 20090707 V1.0
Rmll Virtualization As Is Tool 20090707 V1.0Rmll Virtualization As Is Tool 20090707 V1.0
Rmll Virtualization As Is Tool 20090707 V1.0guest72e8c1
 
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Michelle Holley
 
Dataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and toolsDataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and tools
Stefano Salsano
 
Nodejs a-practical-introduction-oredev
Nodejs a-practical-introduction-oredevNodejs a-practical-introduction-oredev
Nodejs a-practical-introduction-oredev
Felix Geisendörfer
 
the NML project
the NML projectthe NML project
the NML projectLei Yang
 
ClickOS_EE80777777777777777777777777777.pptx
ClickOS_EE80777777777777777777777777777.pptxClickOS_EE80777777777777777777777777777.pptx
ClickOS_EE80777777777777777777777777777.pptx
BiHongPhc
 

Similar to Network Stack in Userspace (NUSE) (20)

Running Applications on the NetBSD Rump Kernel by Justin Cormack
Running Applications on the NetBSD Rump Kernel by Justin Cormack Running Applications on the NetBSD Rump Kernel by Justin Cormack
Running Applications on the NetBSD Rump Kernel by Justin Cormack
 
Ceph in the GRNET cloud stack
Ceph in the GRNET cloud stackCeph in the GRNET cloud stack
Ceph in the GRNET cloud stack
 
Include os @ flossuk 2018
Include os @ flossuk 2018Include os @ flossuk 2018
Include os @ flossuk 2018
 
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructure
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructureDevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructure
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructure
 
High performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User GroupHigh performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User Group
 
[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...
[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...
[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...
 
Linux Network Stack
Linux Network StackLinux Network Stack
Linux Network Stack
 
FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet Processing
 
FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)
 
What Linux can learn from Solaris performance and vice-versa
What Linux can learn from Solaris performance and vice-versaWhat Linux can learn from Solaris performance and vice-versa
What Linux can learn from Solaris performance and vice-versa
 
Introduction to Docker (and a bit more) at LSPE meetup Sunnyvale
Introduction to Docker (and a bit more) at LSPE meetup SunnyvaleIntroduction to Docker (and a bit more) at LSPE meetup Sunnyvale
Introduction to Docker (and a bit more) at LSPE meetup Sunnyvale
 
The State of Rootless Containers
The State of Rootless ContainersThe State of Rootless Containers
The State of Rootless Containers
 
Bit_Bucket_x31_Final
Bit_Bucket_x31_FinalBit_Bucket_x31_Final
Bit_Bucket_x31_Final
 
RMLL / LSM 2009
RMLL / LSM 2009RMLL / LSM 2009
RMLL / LSM 2009
 
Rmll Virtualization As Is Tool 20090707 V1.0
Rmll Virtualization As Is Tool 20090707 V1.0Rmll Virtualization As Is Tool 20090707 V1.0
Rmll Virtualization As Is Tool 20090707 V1.0
 
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
 
Dataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and toolsDataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and tools
 
Nodejs a-practical-introduction-oredev
Nodejs a-practical-introduction-oredevNodejs a-practical-introduction-oredev
Nodejs a-practical-introduction-oredev
 
the NML project
the NML projectthe NML project
the NML project
 
ClickOS_EE80777777777777777777777777777.pptx
ClickOS_EE80777777777777777777777777777.pptxClickOS_EE80777777777777777777777777777.pptx
ClickOS_EE80777777777777777777777777777.pptx
 

Recently uploaded

Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
Peter Spielvogel
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 

Recently uploaded (20)

Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 

Network Stack in Userspace (NUSE)

  • 1. Network Stack in Userspace (NUSE) ! ! Hajime Tazaki 高速PCルーター研究会 2014/9/29
  • 2. Today’s talk • Userspace version of (Linux) network stack • not intended for high-speed something • but useful for high-speed network I/O 2
  • 3. I have a new Layer-3/4 protocol! Yey! • I have new, great Layer-3/4 protocol ! It will change the WORLD ! • network stack って、入れかえたいですか? • No: your code will destroy my life ?! (experimental ? not tested ?) • Yes: I wanna be your slave. • VM cloud = OK, no much users/services interfere • multi-user server, PC, phone = Nightmare, my life will have trouble… 3
  • 4. I have a new Layer-3/4 protocol! Yey! (cont’d) • Kernel programming sucks • LKM ? can cause panic anyway.. • Click ? only router/middlebox, not for end-hosts • Slow evolution • VM ? Hmm, I’m a lazy guy.. 4
  • 5. costin.raiciu@cs.pub.ro, j.araujo@ucl.ac.uk, rizzo@iet.unipi.it Internet paths that it is still despite the the blame extensions taking placed on end moving protocols deployment optimizations. support for user-level commodity number of host stack, s. our mux/de-mux line rate (up Slow evolution of network stack Honda et al., Rekindling Network Protocol Innovation with User-Level Stacks, ACM SIGCOMM CCR, Vol.44, Num. 2, April 2014 cores, and over a basic same server 1.00 0.75 0.50 0.25 0.00 2007 2008 2009 2010 2011 2012 Date Ratio of flows Option SACK Timestamp Windowscale Direction Inbound Outbound Figure 1: TCP options deployment over time. pen infrequently not only because of slow release cycles, but also due to their cost and potential disruption to existing setups. If protocol stacks were embedded into applications, they could be updated on a case-by-case basis, and deploy-ment would be a lot more timely. For example, Mac OS, Windows XP and FreeBSD still use a traditional Additive Increase Multiplicative Decrease (AIMD) algorithm for TCP congestion control, while Linux
  • 6. Virtual Machine ? Poll: “When you download and run software, how often do you use a virtual machine (to reduce security risks)?” Jon Howell, Galen Hunt, David Molnar, and Donald E. Porter, Living Dangerously: A Survey of Software Download Practices, no. MSR-TR-2010-51, May 2010 6
  • 7. Meanwhile in Filesystem world.. • There is, • Filesystem in Userspace (FUSE) • Userspace code can host new filesystem (sshfs, GmailFS, etc) • Performance is bad, but doesn’t matter • Flexibility and functionality do matter 7 http://fuse.sourceforge.net/
  • 8. Problem Statements • Slow evolution of network stack • Interfere to host OS (which is untouchable) • Too heavy workload of VM 8
  • 9. What’s NUSE ? • Network stack in Userspace • Userspace as much as possible • like Fuse (Filesystem in Userspace) • Library version of network stack (of monolithic kernel) • kernel bypassed • (UNIX) Process-based virtualization 9
  • 10. What can do with NUSE ? • Host operating system • Linux (for the moment) • Guest operating systems • Linux (3.17-rc1 based) • FreeBSD (ongoing) • Suitable with kernel-bypass technologies • DPDK/netmap with (full) network stack + (existing) applications • Applications • ping, iperf, nginx (partially worked) 10
  • 11. FUSE vs NUSE 11 nuse example kernel bypassed TCP/IP ARP/ ndisc libnuse glibc NIC userspace kernel raw sock netmap DPDK (etc) libfuse glibc glibc VFS FUSE ...... NFS ext3 ls -l /tmp/fuse example /tmp/fuse userspace kernel
  • 12. Design Goals • No modification to userspace apps • No mod to kernel space as well • Transparent • LD_PRELOADable • x1 performance of native OS 12
  • 13. Application POSIX glue TCP UDP DCCP SCTP ICMP ARP IPv6 IPv4 Qdisc Netfilter Bridging Netlink IPSec Tunneling Kernel layer NUSE core bottom halves/ rcu/timer/ interrupt struct net_device RAW DPDK netmap ... NIC Recipe petit-scheduler 1. (monolithic) kernel source 2. petit-scheduler 3. POSIX glue • redirect system calls (at libc-level) 4. network I/O • raw socket, DPDK, netmap, etc.. 13
  • 14. 1) kernel build Application POSIX glue TCP UDP DCCP SCTP ICMP ARP IPv6 IPv4 Qdisc Netfilter Bridging Netlink IPSec Tunneling Kernel layer NUSE core bottom halves/ rcu/timer/ interrupt struct net_device RAW DPDK netmap ... NIC petit-scheduler • patch to kernel tree • with new (hw independent) arch (arch/sim) • robust to (frequent) mainstream changes • build kernel source tree w/ the patch • make menuconfig ARCH=sim • make library ARCH=sim • ➔ libnuse-linux-3.17-rc1.so 14
  • 15. 2) petit scheduler • offer alternate context primitives Application POSIX glue TCP UDP DCCP SCTP ICMP ARP IPv6 IPv4 Qdisc Netfilter Bridging Netlink IPSec Tunneling Kernel layer NUSE core bottom halves/ rcu/timer/ interrupt struct net_device RAW DPDK netmap ... NIC petit-scheduler • interrupts, timer, thread, bottom halves (tasklet, workqueue, waiter, etc) ! • Implemented with POSIX thread • easily debuggable • ucontext fiber for low overhead (not yet) 15
  • 16. 3) POSIX glue code Application POSIX glue TCP UDP DCCP SCTP ICMP ARP IPv6 IPv4 Qdisc Netfilter Bridging Netlink IPSec Tunneling Kernel layer NUSE core bottom halves/ rcu/timer/ interrupt struct net_device RAW DPDK netmap ... NIC petit-scheduler • Hijack function calls • socket => nuse_socket • read => nuse_read • libc level hijack • apps not aware of • LD_PRELOAD=libnuse.so .. • can’t catch int 0x80 16
  • 17. extern int sim_sock_socket (int,int,int, struct socket **); int socket (int family, int type, int proto) { sim_update_jiffies (); struct socket *kernel_socket = sim_malloc (sizeof (struct socket)); memset (kernel_socket, 0, sizeof (struct socket)); int ret = sim_sock_socket (family, type, proto, &kernel_socket); g_fd_table[curfd++] = kernel_socket; sim_softirq_wakeup (); return curfd - 1; } https://github.com/thehajime/net-next-nuse/blob/nuse/arch/sim/nuse-glue.c
  • 18. 4) network I/O Application POSIX glue TCP UDP DCCP SCTP ICMP ARP IPv6 IPv4 Qdisc Netfilter Bridging Netlink IPSec Tunneling Kernel layer NUSE core bottom halves/ rcu/timer/ interrupt struct net_device RAW DPDK netmap ... NIC petit-scheduler • connect NUSE to NIC • options • raw socket (general) • DPDK (if available) • netmap (if available) • Tap ? 18
  • 19. tatic netdev_tx_t kernel_dev_xmit(struct sk_buff *skb, struct net_device *dev) { netif_stop_queue(dev); sim_dev_xmit ((struct SimDevice *)dev, skb->data, skb->len); dev_kfree_skb(skb); netif_wake_queue(dev); return 0; } static const struct net_device_ops sim_dev_ops = { .ndo_start_xmit = kernel_dev_xmit, .ndo_set_mac_address = eth_mac_addr, }; void sim_dev_rx (struct SimDevice *device, struct SimDevicePacket packet) { struct sk_buff *skb = packet.token; struct net_device *dev = &device->dev; skb->protocol = eth_type_trans(skb, dev); skb->ip_summed = CHECKSUM_PARTIAL; // Do the TCP checksum (FIXME: should be configurable) ! netif_rx (skb); } https://github.com/thehajime/net-next-nuse/blob/nuse/arch/sim/sim-device.c
  • 20. How to use NUSE ? • download • git clone git://github.com/thehajime/net-next-nuse • compile • make library ARCH=sim NETMAP=yes • execute • sudo ./nuse (application) • success ? : lucky guy ! • fail: add hijack calls 20
  • 21. Alternatives • Container (LXC, OpenVZ, vimage) • share kernel with host operating system (no flexibility) • virtual machine (KVM,Xen,UML) • flexible/functional, but heavy bootstrap • Library OS • full scratch: mtcp, Mirage, lwIP • Porting: OSv, Sandstorm, libuinet (FreeBSD), Arrakis (lwIP), OpenOnload (lwIP?) • Glue-layer: LKL (Linux-2.6), Rump (NetBSD) 21
  • 22. Alternatives (cont’d) Rumpkernel • https://github.com/rumpkernel/wiki/wiki • One binary runs on everywhere • Linux,xBSD,Soralis,cygwin Host • Xen Dom-U • Bare metal (hardware, KVM, Virtualbox) • Well-defined API (hypercall) ! • Only NetBSD network stack is available 22
  • 23. Evaluation • Performance ? • not good so far.. • Generality • Run all applications ? up to POSIX coverage 23
  • 25. Ongoings • (efficient) thread scheduling • batch Tx/Rx • fork(2)/exec(2) • multi-processes ! • => migrate to rumpkernel ? 25
  • 26. Summary • Network Stack in Userspace (NUSE) • network stack library • light virtualization • fast evolution, easy deployments https://github.com/thehajime/net-next-nuse 26
  • 27. GASPP: A GPU-Accelerated Stateful Packet Processing Framework Giorgos Vasiliadis, Lazaros Koromilas, Michalis Polychronakis, and Sotiris Ioannidis, GASPP: A GPU-Accelerated Stateful Packet Processing Framework, USENIX ATC 2014, June, 2014 28