Direct Code Execution
Dive Into the Internals of
Kernel Network Stack with DCE
Hajime Tazaki
University of Tokyo
LinuxCon Japan 2014
Who am I?
a lecturer/researcher at a university
studying/implementing/hacking
network protocols
network measurement
2
this talk is about...
a testing framework for the network
stack is really needed
a userspace version of the network
stack helps a lot
3
Development of the network stack
newly introduced protocols (mptcp,
6lowpan)
refactoring brings regressions (bugs)
Can we really test the network stack?
to keep the pace of development
to keep the quality of the software
Does the network stack still need
new ideas?
4
Issues (Testing)
5
OSPF (>100 routers)
in an ISP network
How to set it up?
Hard to configure
each node
Heavy load (of VMs)
Limitations of network topology
Test suites of Linux Test Project (LTP)
Okay, LXC / UML / OSv promise a
handy way to test complex networks,
but is it reproducible?
Issues (Testing)
6
A bunch of VMs
gdb w/ 100 nodes?
How do we reproduce a bug under
heavy load?
Issues (debugging)
7
Code exercise
Large codebase (~600K LoC net/)
How can we get close to 100% test
coverage?
Issues (code coverage)
8
% cloc net-next/net
-------------------------------------------------------------------------------
Language files blank comment code
-------------------------------------------------------------------------------
C 1186 121213 104596 572814
C/C++ Header 175 4408 7149 21972
make 71 246 252 943
awk 1 11 22 126
-------------------------------------------------------------------------------
SUM: 1433 125878 112019 595855
-------------------------------------------------------------------------------
Reproducibility is important
to ensure regression tests are
meaningful
to ensure the (ideal) performance
Issues (regression)
9
Destination option header (for
Mobile IPv6) handling (fixed in 3.7)
anycast address configuration via
sockopt (3.14, still exists)
Regressions we've seen in
net-next tree
10
11
http://patchwork.ozlabs.org/patch/209684/
Light-weight virtualization
Userspace network stack
What do we already have?
(alternatives)
12
Alternative: VM
LXC, UML, OpenVZ....
Light-weight virtualization
run many instances
bunch of emulation features
High load with large numbers of VMs
Behavior (of tests) is not deterministic
13
Alternative: OSv
Minimal Guest OS (for
Cloud)
no system call
no user/kernel space
1 process per VM
Very lightweight
with controllability
what about timing
reproducibility?
14
http://www.slideshare.net/dmarti1111/o-sv-linux-collaboration-summit
Alternative: Userspace net stack
Rump Kernels
NetBSD kernel on
userspace, Xen
Automated testing
synchronizing multiple
processes makes it
complex to debug
15
http://blog.netbsd.org/tnf/entry/revolutionizing_kernel_development_testing_with
nfsim
a netfilter simulation environment
automated test platform for NAT,
conntrack, etc
LD_PRELOAD=fakesockopt.so
/sbin/iptables -L ...
Only a single (kernel) instance
16
Rusty Russell and Jeremy Kerr. nfsim: Untested code is buggy code. In
Proceedings of the Ottawa Linux Symposium (OLS’05), 2005.
Summary (Alternatives)
17
LXC OSv nfsim DCE
Code Generality ✓ ✓ ✓
Controllability ✓ ✓ ✓
Deterministic
Clock ✓ ✓
Flexible
Configurations ✓
Direct Code Execution (DCE)
DCE is
a userspace kernel network stack
with an asm-generic based architecture
with multiple hosts via dlmopen
in a single userspace process
Our solution is ...
18
Direct Code Execution (cont’d)
DCE provides
a reproducible testing platform
with fine-grained parameter tuning
(via the ns-3 network simulator)
a development framework for
network protocols
19
[Diagram: DCE runs multiple nodes (node#1 ... node#N), each a process with its own network stack and applications, on the simulation core on top of the host operating system and hardware]
Features
20
Functional Realism
Run real code
POSIX apps, kernel
network stacks
Timing Realism
ns-3 integration (virtual
clock)
Debuggability
all in userspace
single-process
virtualization
DCE architecture
21
[Architecture diagram, bottom to top: ns-3 (network simulation core); 1) virtualization core layer (heap/stack memory, bottom halves/rcu/timer/interrupt, struct net_device); 2) kernel layer (ARP, qdisc, TCP/UDP/DCCP/SCTP, ICMP, IPv4/IPv6, Netlink, bridging, netfilter, IPsec, tunneling); 3) POSIX layer with applications (ip, iptables, quagga)]
1) Virtualization core layer
22
Single process model
Run multiple nodes
in a single (host)
process
dlmopen(3) etc.
Simulated Process
isolation of global
symbols
management of
stacks/heaps of
simulated processes
1) Virtualization core layer
load the shared-library build of the
Linux kernel at a different base
address per node (isolation)
applications (e.g., iproute2) can be
built as PIE
time/NIC-related glue functions are
redirected to the ns-3 core
23
2) Kernel layer (library operating system)
24
Similar to a library OS
shared library (e.g.,
liblinux.so)
replaceable (e.g.,
libfreebsd.so)
Mapping via glue code
struct net_device <=>
ns3::NetDevice
jiffies <=> simulated clock
glue code in arch/sim
minimizes modifications to
the original code
[Diagram: jiffies/gettimeofday() synchronized to the simulated clock; struct net_device mapped to ns3::NetDevice]
https://github.com/direct-code-execution/net-next-sim
2) Kernel layer (library operating system)
networking glue code
timers glue code
25
static const struct net_device_ops sim_dev_ops = {
.ndo_start_xmit = kernel_dev_xmit, // go to ns-3 side
};
void do_gettimeofday(struct timeval *tv) {
u64 ns = sim_current_ns (); // get simulated clock
*tv = ns_to_timeval (ns);
}
2) Kernel layer (library operating system)
Build
make menuconfig ARCH=sim
make library ARCH=sim
26
3) POSIX layer
27
Our POSIX
implementation
1. pass-through host library
calls
e.g., strcpy(3) => (reused)
2. system calls => hijacked
redirected to our kernel library
e.g., socket(2) =>
dce_socket()
POSIX API Coverage
28
[Graph: number of supported POSIX functions vs. date, from 2009-09-04 to 2014-05-16, growing to roughly 500 functions]
Supported Codes
29
iproute2
quagga (RIP/OSPF/BGP/v6RA)
umip (Mobile IPv6)
bind9, unbound (DNS/DNSSEC)
iperf, ping, ping6
Linux net-next (TCP, IPv6/4, SCTP/
DCCP)
versions 2.6.36 to 3.14
mptcp (UC Louvain)
What does it look like? (ns-3
script interface)
30
How to use it?
31
Recompile
Userspace as a Position-Independent
Executable (PIE)
Kernelspace as a shared library
Run within ns-3
Debug with gdb, valgrind!
Hello World.
(1) create 100 nodes
(2) connect via ethernet
links
(3) choose network stack
library
(4) application ‘ospfd’ set to
run at 5.0 seconds
(5) execution stops at 1000
seconds
32
#!/usr/bin/python
from ns.core import *
from ns.network import *
from ns.csma import *
from ns.dce import *
nodes = NodeContainer()
nodes.Create (100)                  # (1)
csma = CsmaHelper()
csma.Install (nodes)                # (2)
dce = DceManagerHelper()
dce.SetNetworkStack ("liblinux.so") # (3)
dce.Install (nodes)
app = DceApplicationHelper()
app.SetBinary ("ospfd")             # (4)
app.Start (Seconds (5.0))           # (4)
app.Install (nodes)
Simulator.Stop (Seconds (1000.0))   # (5)
Simulator.Run ()
ns-3 scripting
C++, python (bindings)
Use cases
33
Code Coverage (gcov)
34
Settings
mptcp_v0.86
DCE-ed test programs
(<1K LoC)
Configuration of test
programs
simple 2 paths (ipv4
iperf)
dual-stack 2 paths
(v6only, v4/v6)
10 different packet loss
rates
Lines Funcs Branches
mptcp_ctrl.c 76.3% 86.7% 59.9%
mptcp_input.c 66.9% 85.0% 57.9%
mptcp_ipv4.c 68.0% 93.3% 43.8%
mptcp_ipv6.c 57.4% 85.0% 45.2%
mptcp_ofo_queue.c 91.2% 100.0% 89.2%
mptcp_output.c 71.2% 91.9% 58.6%
mptcp_pm.c 54.2% 71.4% 40.5%
Total 68.0% 85.9% 54.8%
make library ARCH=sim COV=yes
Debuggability (gdb)
35
Inspect code during
testing
among distributed
nodes
in a single process
using gdb
conditional
breakpoint with
node id (in a
simulated network)
fully reproducible (to
easily catch a bug)
(gdb) b mip6_mh_filter if dce_debug_nodeid()==0
Breakpoint 1 at 0x7ffff287c569: file net/ipv6/mip6.c, line 88.
<continue>
(gdb) bt 4
#0  mip6_mh_filter
(sk=0x7ffff7f69e10, skb=0x7ffff7cde8b0)
at net/ipv6/mip6.c:109
#1  0x00007ffff2831418 in ipv6_raw_deliver
(skb=0x7ffff7cde8b0, nexthdr=135)
at net/ipv6/raw.c:199
#2  0x00007ffff2831697 in raw6_local_deliver
(skb=0x7ffff7cde8b0, nexthdr=135)
at net/ipv6/raw.c:232
#3  0x00007ffff27e6068 in ip6_input_finish
(skb=0x7ffff7cde8b0)
at net/ipv6/ip6_input.c:197
[Test topology: a mobile node ping6-ing a correspondent node through a Home Agent, handing off between Wi-Fi AP1 and AP2]
Debuggability (valgrind)
36
Memory error
detection
among distributed
nodes
in a single process
using Valgrind
==5864== Memcheck, a memory error detector
==5864== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et
al.
==5864== Using Valgrind-3.6.0.SVN and LibVEX; rerun with -h for
copyright info
==5864== Command: ../build/bin/ns3test-dce-vdl --verbose
==5864==
==5864== Conditional jump or move depends on uninitialised
value(s)
==5864== at 0x7D5AE32: tcp_parse_options (tcp_input.c:3782)
==5864== by 0x7D65DCB: tcp_check_req (tcp_minisocks.c:532)
==5864== by 0x7D63B09: tcp_v4_hnd_req (tcp_ipv4.c:1496)
==5864== by 0x7D63CB4: tcp_v4_do_rcv (tcp_ipv4.c:1576)
==5864== by 0x7D6439C: tcp_v4_rcv (tcp_ipv4.c:1696)
==5864== by 0x7D447CC: ip_local_deliver_finish (ip_input.c:226)
==5864== by 0x7D442E4: ip_rcv_finish (dst.h:318)
==5864== by 0x7D2313F: process_backlog (dev.c:3368)
==5864== by 0x7D23455: net_rx_action (dev.c:3526)
==5864== by 0x7CF2477: do_softirq (softirq.c:65)
==5864== by 0x7CF2544: softirq_task_function (softirq.c:21)
==5864== by 0x4FA2BE1: ns3::TaskManager::Trampoline(void*) (task-
manager.cc:261)
==5864== Uninitialised value was created by a stack allocation
==5864== at 0x7D65B30: tcp_check_req (tcp_minisocks.c:522)
==5864==
http://valgrind.org/
Automated Testing
37
Automated testing
among multiple
nodes
code coverage
regression tests
w/ deterministic
clock
Jenkins CI
Linux kernel testing
w/ Userspace
applications too
http://ns-3-dce.cloud.wide.ad.jp/jenkins/job/daily-net-next-sim/
Conclusions
38
Direct Code Execution
reproducible testing framework
controllable with distributed nodes
debugging facilities
% cd linux/
% make test ARCH=sim
G+ (ns-3-dce community)
@ns-3-dce
http://bit.ly/ns-3-dce
https://github.com/direct-code-execution
How can you reach us?
39
ありがとうございました
40
• Tazaki et al., Direct code execution: revisiting library OS architecture for
reproducible network experiments. ACM CoNEXT '13
• Mathieu Lacage. Experimentation Tools for Networking Research. Université de
Nice-Sophia Antipolis, 2010.
Acknowledgement
Mathieu Lacage (Alcméon, France, Initial Design/Implementation)
Diana/Planete team, INRIA, Sophia Antipolis, France
Backup
41
How does it work?
42
rump
(netbsd)
43
(gdb) bt
#0 rumpcomp_sockin_sendmsg (s=7, msg=0x703010, flags=0, snd=0x7ffffffed178) at buildrump.sh/src/sys/
libsockin/rumpcomp_user.c:426
#1 0x00007ffff7df8526 in sockin_usrreq (so=so@entry=0x6fedb0, req=req@entry=9, m=0x6cce00,
nam=nam@entry=0x0, control=control@entry=0x0, l=<optimized out>) at buildrump.sh/src/sys/rump/ne
sockin.c:510
#2 0x00007ffff7be4e79 in sosend (so=0x6fedb0, addr=0x0, uio=0x7ffffffed500, top=0x6cce00, control=0x0
l=0x700800)
at /home/tazaki/gitworks/buildrump.sh/src/lib/librumpnet/../../sys/rump/../kern/uipc_socket.c:1048
#3 0x00007ffff7be7b4c in soo_write (fp=<optimized out>, offset=<optimized out>, uio=0x7ffffffed500, cre
out>,
flags=<optimized out>) at /home/tazaki/gitworks/buildrump.sh/src/lib/librumpnet/../../sys/rump/../kern/sy
116
#4 0x00007ffff788f620 in dofilewrite (fd=fd@entry=3, fp=0x6f8e80, buf=0x400e88, nbyte=37, offset=0x6f8
flags=flags@entry=1,
retval=retval@entry=0x7ffffffed5e0) at /home/tazaki/gitworks/buildrump.sh/src/lib/librump/../../sys/rump/
sys_generic.c:355
#5 0x00007ffff788f72f in sys_write (l=<optimized out>, uap=0x7ffffffed5f0, retval=0x7ffffffed5e0) at /home
gitworks/buildrump.sh/src/lib/librump/../../sys/rump/../kern/sys_generic.c:323
#6 0x00007ffff78de3cd in sy_call (rval=0x7ffffffed5e0, uap=0x7ffffffed5f0, l=0x700800, sy=<optimized out>)
tazaki/gitworks/buildrump.sh/src/lib/librump/../../sys/rump/../sys/syscallvar.h:61
#7 rump_syscall (num=num@entry=4, data=data@entry=0x7ffffffed5f0, dlen=dlen@entry=24,
retval=retval@entry=0x7ffffffed5e0) at /home/tazaki/gitworks/buildrump.sh/src/lib/librump/../../sys/rump/lib
rumpkern/rump.c:1024
#8 0x00007ffff78d573b in rump___sysimpl_write (fd=<optimized out>, buf=<optimized out>, nbyte=<opt
at /home/tazaki/gitworks/buildrump.sh/src/lib/librump/../../sys/rump/librump/rumpkern/rump_syscalls.c:121
#9 0x0000000000400d08 in main () at webbrowser.c:86
(gdb)
BSD
Stack
glue
apps
44
(gdb) bt
#0 if_transmit (ifp=0xffffc0003fdfa800, m=0xffffc00005bfe100) at ../../bsd/sys/net/if.c:3082
#1 0x0000000000252a57 in ether_output_frame (ifp=0xffffc0003fdfa800, m=0xffffc00005bfe100) at ../../bsd/s
if_ethersubr.c:387
#2 0x0000000000252a0a in ether_output (ifp=0xffffc0003fdfa800, m=0xffffc00005bfe100, dst=0xffffc0003e9e
ro=0x2000059102a0) at ../../bsd/sys/net/if_ethersubr.c:356
#3 0x0000000000277982 in ip_output (m=0xffffc00005bfe100, opt=0x0, ro=0x2000059102a0, flags=0, imo=0
inp=0xffffc00009ea6400) at ../../bsd/sys/netinet/ip_output.c:612
#4 0x000000000028cb49 in tcp_output (tp=0xffffc00009eafc00) at ../../bsd/sys/netinet/tcp_output.c:1219
#5 0x0000000000296276 in tcp_output_connect (so=0xffffc0000a5a0800, nam=0xffffc00005a8e140) at ../../b
netinet/tcp_offload.h:270
#6 0x0000000000296b25 in tcp_usr_connect (so=0xffffc0000a5a0800, nam=0xffffc00005a8e140, td=0x0) at
netinet/tcp_usrreq.c:453
#7 0x000000000023503e in soconnect (so=0xffffc0000a5a0800, nam=0xffffc00005a8e140, td=0x0) at ../../bsd
uipc_socket.c:744
#8 0x000000000023ad0e in kern_connect (fd=46, sa=0xffffc00005a8e140) at ../../bsd/sys/kern/uipc_syscalls.c
#9 0x00000000002511fa in linux_connect (s=46, name=0x200005910660, namelen=16) at ../../bsd/sys/compa
linux_socket.c:712
#10 0x000000000023c088 in connect (fd=46, addr=0x200005910660, len=16) at ../../bsd/sys/kern/uipc_syscall
104
#11 0x000010000220c65a in NET_Connect ()
#12 0x000010000220d0fa in Java_java_net_PlainSocketImpl_socketConnect ()
#13 0x000020000021cd8e in ?? ()
#14 0x00002000059106d8 in ?? ()
(snip)
(gdb)
BSD
Stack
glue
apps
(java)
OSv
apps
45
(dce:node0) bt
#0 sim_dev_xmit (dev=0x7ffff5587020, data=0x7ffff3e0688a "", len=105) at arch/sim/sim.c:349
#1 kernel_dev_xmit (skb=0x7ffff5ccaa68, dev=0x7ffff5587020) at arch/sim/sim-device.c:20
#2 dev_hard_start_xmit (skb=0x7ffff5ccaa68, dev=0x7ffff5587020, txq=0x7ffff5571a90) at net/core/dev.c:25
#3 dev_queue_xmit (skb=0x7ffff5ccaa68) at net/core/dev.c:2830
#4 neigh_hh_output (skb=0x7ffff5ccaa68, hh=0x7ffff5ce8850) at include/net/neighbour.h:357
#5 dst_neigh_output (skb=0x7ffff5ccaa68, n=0x7ffff5ce8790, dst=0x7ffff3e045d0) at include/net/dst.h:409
#6 ip_finish_output2 (skb=0x7ffff5ccaa68) at net/ipv4/ip_output.c:201
#7 ip_finish_output (skb=0x7ffff5ccaa68) at net/ipv4/ip_output.c:234
#8 ip_output (skb=0x7ffff5ccaa68) at net/ipv4/ip_output.c:307
#9 dst_output (skb=0x7ffff5ccaa68) at include/net/dst.h:448
#10 ip_local_out (skb=0x7ffff5ccaa68) at net/ipv4/ip_output.c:110
#11 ip_queue_xmit (skb=0x7ffff5ccaa68, fl=0x7ffff3e04e78) at net/ipv4/ip_output.c:403
#12 tcp_transmit_skb (sk=0x7ffff3e04bd0, skb=0x7ffff5ccaa68, clone_it=1, gfp_mask=32) at net/ipv4/tcp_ou
#13 mptcp_write_xmit (meta_sk=0x7ffff3e053d0, mss_now=1428, nonagle=0, push_one=0, gfp=32) at net/m
mptcp_output.c:1182
#14 tcp_write_xmit (sk=0x7ffff3e053d0, mss_now=516, nonagle=0, push_one=0, gfp=32) at net/ipv4/tcp_ou
#15 __tcp_push_pending_frames (sk=0x7ffff3e053d0, cur_mss=516, nonagle=0) at net/ipv4/tcp_output.c:21
#16 tcp_push_pending_frames (sk=0x7ffff3e053d0) at include/net/tcp.h:1610
#17 do_tcp_setsockopt (sk=0x7ffff3e053d0, level=6, optname=3, optval=0x7ffff439cc78 "", optlen=4) at net/
2625
#18 tcp_setsockopt (sk=0x7ffff3e053d0, level=6, optname=3, optval=0x7ffff439cc78 "", optlen=4) at net/ipv4
#19 sock_common_setsockopt (sock=0x7ffff3e03850, level=6, optname=3, optval=0x7ffff439cc78 "", optlen=
core/sock.c:2455
#20 sim_sock_setsockopt (socket=0x7ffff3e03850, level=6, optname=3, optval=0x7ffff439cc78, optlen=4) at
socket.c:167
#21 sim_sock_setsockopt_forwarder (v0=0x7ffff3e03850, v1=6, v2=3, v3=0x7ffff439cc78, v4=4) at arch/sim/
#22 ns3::LinuxSocketFdFactory::Setsockopt (this=0x64f000, socket=0x7ffff3e03850, level=6, optname=3,
optval=0x7ffff439cc78, optlen=4) at ../model/linux-socket-fd-factory.cc:947
#23 ns3::LinuxSocketFd::Setsockopt (this=0x815f20, level=6, optname=3, optval=0x7ffff439cc78, optlen=4) a
linux-socket-fd.cc:89
#24 dce_setsockopt (fd=11, level=6, optname=3, optval=0x7ffff439cc78, optlen=4) at ../model/dce-fd.cc:529
#25 setsockopt () at ../model/libc-ns3.h:179
#26 sockopt_cork (sock=11, onoff=0) at sockunion.c:534
#27 bgp_write (thread=0x7ffff439ce10) at bgp_packet.c:691
#28 thread_call (thread=0x7ffff439ce10) at thread.c:1177
#29 main (argc=5, argv=0x658100) at bgp_main.c:455
#30 ns3::DceManager::DoStartProcess (context=0x6fa970) at ../model/dce-manager.cc:281
#31 ns3::TaskManager::Trampoline (context=0x6fab50) at ../model/task-manager.cc:274
#32 ns3::UcontextFiberManager::Trampoline (a0=32767, a1=-139668064, a2=0, a3=7318352) at ../model/uco
Linux
Stack
glue
glue(POSIX)
glue(linux)
DCE
Conventional Virtualization
46
HW
Host OS
Guest
syscalls
applications
(Guest OS)
KVM/Xen/LXC/UML
(Guest OS)
Guest
syscalls
applications
Code generality
(pros)
Applications and
network stacks
(operating systems)
are not aware of
virtualization
Limitations of DCE
virtual clock vs. real world
cannot interact with the real world
can use wall-clock time, but lose
reproducibility
low code generality
requires API-specific glue code
(POSIX/kernel)
48
Usage
49
git clone 
git://github.com/direct-code-execution/
net-next-sim.git
cd net-next-sim
make defconfig ARCH=sim
make library ARCH=sim
make testbin -C arch/sim/test
make test ARCH=sim

Direct Code Execution - LinuxCon Japan 2014

  • 1.
    Direct Code Execution DiveInto the Internals of Kernel Network Stack with DCE Hajime Tazaki University of Tokyo LinuxCon Japan 2014
  • 2.
    Who am I? a lecturer/researcher of a university studying/implementing/hacking network protocols networks measurement 2
  • 3.
    this talk isabout... a testing framework for network stack is really needed userspace version of network stack helps a lot 3
  • 4.
    Development of networkstack newly introduced protocols (mptcp, 6lowpan) refactoring, brings regression (bugs) Can we really test network stack ? to keep the pace of development to keep the quality of software Network Stack, still needs new idea ? 4
  • 5.
    Issues (Testing) 5 OSPF (>100routers) in an ISP network How to setup ? Hard to configure each node Heavy load (of VMs)
  • 6.
    Limitation of networktopology Test suites of Linux Test Project (LTP) Okay, LXC / UML / OSv promise a handy way to test complex network reproducible ? Issues (Testing) 6
  • 7.
    A bunch ofVMs gdb w/ 100 nodes ? How to reproduce a bug in a heavy load situation ? Issues (debugging) 7
  • 8.
    Code exercise Large codebase(~600K LoC net/) How can we close to 100% test coverage ? Issues (code coverage) 8 % cloc net-next/net ------------------------------------------------------------------------------- Language files blank comment code ------------------------------------------------------------------------------- C 1186 121213 104596 572814 C/C++ Header 175 4408 7149 21972 make 71 246 252 943 awk 1 11 22 126 ------------------------------------------------------------------------------- SUM: 1433 125878 112019 595855 -------------------------------------------------------------------------------
  • 9.
    Reproducibility is important toensure regression tests are meaningful to ensure the (ideal) performance Issues (regression) 9
  • 10.
    Destination option header(for mobile ipv6) handling (3.7 fixed) anycast address configuration via sockopt (3.14, still exists) Regressions we've seen in net-next tree 10
  • 11.
  • 12.
    Light-weight virtualization Userspace networkstack What we already have ? (alternatives) 12
  • 13.
    Alternative: VM LXC, UML,OpenVZ.... Light-weight virtualization run many instances bunch of emulation features High load with large numbers of VM Behavior (of test) is not deterministic 13
  • 14.
    Alternative: OSv Minimal GuestOS (for Cloud) no system call no user/kernel space 1 process / a VM Very lightweight with controllability timing reproducibility is ? 14 http://www.slideshare.net/dmarti1111/o-sv-linux-collaboration-summit
  • 15.
    Alternative: Userspace netstack Rump Kernels NetBSD kernel on userspace, Xen Automated testing synchronizing multiple processes makes complex to debug 15 http://blog.netbsd.org/tnf/entry/revolutionizing_kernel_development_testing_with
  • 16.
    nfsim a netfilter simulationenvironment automated test platform for NAT, conntrack, etc LD_PRELOAD=fakesockopt.so / sbin/iptables -L ... Only a single (kernel) instance 16 Rusty Russel and Jeremy Kerr. nfsim: Untested code is buggy code. In Proceedings of the Ottawa Linux Symposium (OLS’05), 2005.
  • 17.
    Summary (Alternatives) 17 LXC OSvnfsim DCE Code Generality ✓ ✓ ✓ Controllability ✓ ✓ ✓ Deterministic Clock ✓ ✓ Flexible Configurations ✓
  • 18.
    Summary (Alternatives) 17 LXC OSvnfsim DCE Code Generality ✓ ✓ ✓ Controllability ✓ ✓ ✓ Deterministic Clock ✓ ✓ Flexible Configurations ✓
  • 19.
    Direct Code Execution(DCE) DCE is a userspace kernel network stack with asm-generic based architecture with multiple hosts by dlmopen in a single userspace process Our solution is ... 18
  • 20.
    Direct Code Execution(cont’d) DCE makes reproducible testing platform with fine-grained parameter tuning (by ns-3 network simulator) providing development framework for network protocols 19
  • 21.
    DCE Hardware Simulation Core Host operating system Process Network stack Applications Network stack Applications node#1node#N Features 20 Functional Realism Run real code POSIX apps, kernel network stacks Timing Realism ns-3 integration (virtual clock) Debuggability all in userspace single-process virtualization
  • 22.
    DCE architecture 21 ARP Qdisc TCP UDPDCCP SCTP ICMP IPv4IPv6 Netlink BridgingNetfilter IPSec Tunneling Kernel layer Heap Stack memory Virtualization Core layer ns-3 (network simulation core) POSIX layer Application (ip, iptables, quagga) bottom halves/rcu/ timer/interrupt struct net_device DCE ns-3 applicati on ns-3 TCP/IP stack 1) Core Layer 2) Kernel Layer 3) POSIX Layer
  • 23.
    1) Virtualization corelayer 22 Single process model Run multiple nodes on a single (host) process dlmopen(3) etc. Simulated Process isolation of global symbols management of stacks/heaps of simulated processes ARP Qdisc TCP UDP DCCP SCTP ICMP IPv4IPv6 Netlink BridgingNetfilter IPSec Tunneling Kernel layer Heap Stack memory Virtualization Core layer ns-3 (network simulation core) POSIX layer Application (ip, iptables, quagga) bottom halves/rcu/ timer/interrupt struct net_device DCE ns-3 applicati on ns-3 TCP/IP stack
  • 24.
    1) Virtualization corelayer load shlib version of Linux kernel at different base address (isolation) application (iproute2) can be w/ PIE glue time/NIC related function redirected to ns-3 core 23 ARP Qdisc TCP UDP DCCP SCTP ICMP IPv4IPv6 Netlink BridgingNetfilter IPSec Tunneling Kernel layer Heap Stack memory Virtualization Core layer ns-3 (network simulation core) POSIX layer Application (ip, iptables, quagga) bottom halves/rcu/ timer/interrupt struct net_device DCE ns-3 applicati on ns-3 TCP/IP stack
  • 25.
    2) Kernel layer(library operating system) 24 Similar to Library OS shared library (e.g., liblinux.so) replaceable (e.g., libfreebsd.so) Mapping via glue code struct net_device <=> ns3:NetDevice jiffies <=> simulated clock glue code in arch/sim minimize original code modifications jiffies/ gettimeofday() Simulated Clock Synchronize struct net_device ns3::NetDevice ARP Qdisc TCP UDP DCCP SCTP ICMP IPv4IPv6 Netlink BridgingNetfilter IPSec Tunneling Kernel layer Heap Stack memory Virtualization Core layer network simulation core POSIX layer Application (ip, iptables, quagga) bottom halves/rcu/ timer/interrupt struct net_device DCE https://github.com/direct-code-execution/net-next-sim
  • 26.
    2) Kernel layer(library operating system) networking glue code timers glue code 25 static const struct net_device_ops sim_dev_ops = { .ndo_start_xmit = kernel_dev_xmit, // go to ns-3 side }; void do_gettimeofday(struct timeval *tv) { u64 ns = sim_current_ns (); // get simulated clock *tv = ns_to_timeval (ns); }
  • 27.
    2) Kernel layer(library operating system) Build make menuconfig ARCH=sim make library ARCH=sim 26
  • 28.
    3) POSIX layer 27 OurPOSIX implementation 1. pass-through host library calls e.g., strcpy(3) => (reuse) 2. system call => hijacking redirect to our kernel module e.g., socket(2) => dce_socket() ARP Qdisc TCP UDP DCCP SCTP ICMP IPv4IPv6 Netlink BridgingNetfilter IPSec Tunneling Kernel layer Heap Stack memory Virtualization Core layer ns-3 (network simulation core) POSIX layer Application (ip, iptables, quagga) bottom halves/rcu/ timer/interrupt struct net_device DCE ns-3 applicati on ns-3 TCP/IP stack
  • 29.
    POSIX API Coverage 28 0 125 250 375 500 2009-09-042010-03-10 2011-05-20 2012-01-05 2013-04-09 2014-05-16 #offunctions Date
  • 30.
    Supported Codes 29 iproute2 quagga (RIP/OSPF/BGP/v6RA) umip(Mobile IPv6) bind9, unbound (DNS/DNSSEC) iperf, ping, ping6 Linux net-next (TCP, IPv6/4, SCTP/ DCCP) version 2.6.36 to 3.14 mptcp (UC Louvain)
  • 31.
    How it lookslike ? (ns-3 script interface) 30
  • 32.
    How to useit ? 31 Recompile Userspace as Position Independent Executable Kernelspace as shared library Run within ns-3 Debug with gdb, valgrind !
  • 33.
    Hello World. (1) create100 nodes (2) connect via ethernet links (3) choose network stack library (4) application ‘ospfd’ set to run at 5.0 second (5) execution stop at 1000 second 32 #!/usr/bin/python from ns.dce import * from ns.core import * nodes = NodeContainer() nodes.Create (100) (1) csma = csma.CsmaHelper() csma.Install (nodes) (2) dce = DceManagerHelper() dce.SetNetworkStack ("liblinux.so"); (3) dce.Install (nodes); app = DceApplicationHelper() app.SetBinary ("ospfd") (4) app.Start (Seconds (5.0)) (4) app.Install (nodes) Simulator.Stop (Seconds(1000.0)) (5) Simulator.Run () ns-3 scripting C++, python (bindings)
  • 34.
  • 35.
    Code Coverage (gcov) 34 Settings mptcp_v0.86 DCE-edtest programs (<1K LoC) Configuration of test programs simple 2 paths (ipv4 iperf) dual-stack 2 paths (v6only, v4/v6) 10 different packet loss rates Lines Funcs Branches mptcp_ctrl.c 76.3% 86.7% 59.9% mptcp_input.c 66.9% 85.0% 57.9% mptcp_ipv4.c 68.0% 93.3% 43.8% mptcp_ipv6.c 57.4% 85.0% 45.2% mptcp_ofo_queue.c 91.2% 100.0% 89.2% mptcp_output.c 71.2% 91.9% 58.6% mptcp_pm.c 54.2% 71.4% 40.5% Total 68.0% 85.9% 54.8% make library ARCH=sim COV=yes
  • 36.
    Code Coverage (gcov) 34 Settings mptcp_v0.86 DCE-edtest programs (<1K LoC) Configuration of test programs simple 2 paths (ipv4 iperf) dual-stack 2 paths (v6only, v4/v6) 10 different packet loss rates Lines Funcs Branches mptcp_ctrl.c 76.3% 86.7% 59.9% mptcp_input.c 66.9% 85.0% 57.9% mptcp_ipv4.c 68.0% 93.3% 43.8% mptcp_ipv6.c 57.4% 85.0% 45.2% mptcp_ofo_queue.c 91.2% 100.0% 89.2% mptcp_output.c 71.2% 91.9% 58.6% mptcp_pm.c 54.2% 71.4% 40.5% Total 68.0% 85.9% 54.8% make library ARCH=sim COV=yes
  • 37.
    Code Coverage (gcov) 34 Settings mptcp_v0.86 DCE-edtest programs (<1K LoC) Configuration of test programs simple 2 paths (ipv4 iperf) dual-stack 2 paths (v6only, v4/v6) 10 different packet loss rates Lines Funcs Branches mptcp_ctrl.c 76.3% 86.7% 59.9% mptcp_input.c 66.9% 85.0% 57.9% mptcp_ipv4.c 68.0% 93.3% 43.8% mptcp_ipv6.c 57.4% 85.0% 45.2% mptcp_ofo_queue.c 91.2% 100.0% 89.2% mptcp_output.c 71.2% 91.9% 58.6% mptcp_pm.c 54.2% 71.4% 40.5% Total 68.0% 85.9% 54.8% make library ARCH=sim COV=yes
  • 38.
    Debuggability (gdb) 35 Inspect codesduring testing among distributed nodes in a single process using gdb conditional breakpoint with node id (in a simulated network) fully reproducible (to easily catch a bug) (gdb) b mip6_mh_filter if dce_debug_nodeid()==0 Breakpoint 1 at 0x7ffff287c569: file net/ipv6/mip6.c, line 88. <continue> (gdb) bt 4 #0  mip6_mh_filter (sk=0x7ffff7f69e10, skb=0x7ffff7cde8b0) at net/ipv6/mip6.c:109 #1  0x00007ffff2831418 in ipv6_raw_deliver (skb=0x7ffff7cde8b0, nexthdr=135) at net/ipv6/raw.c:199 #2  0x00007ffff2831697 in raw6_local_deliver (skb=0x7ffff7cde8b0, nexthdr=135) at net/ipv6/raw.c:232 #3  0x00007ffff27e6068 in ip6_input_finish (skb=0x7ffff7cde8b0) at net/ipv6/ip6_input.c:197 Wi-Fi Wi-Fi Home Agent AP1 AP2 handoff ping6 mobile node correspondent node
  • 39.
    Debuggability (valgrind) 36 Memory error detection amongdistributed nodes in a single process using Valgrind ==5864== Memcheck, a memory error detector ==5864== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al. ==5864== UsingValgrind-3.6.0.SVN and LibVEX; rerun with -h for copyright info ==5864== Command: ../build/bin/ns3test-dce-vdl --verbose ==5864== ==5864== Conditional jump or move depends on uninitialised value(s) ==5864== at 0x7D5AE32: tcp_parse_options (tcp_input.c:3782) ==5864== by 0x7D65DCB: tcp_check_req (tcp_minisocks.c:532) ==5864== by 0x7D63B09: tcp_v4_hnd_req (tcp_ipv4.c:1496) ==5864== by 0x7D63CB4: tcp_v4_do_rcv (tcp_ipv4.c:1576) ==5864== by 0x7D6439C: tcp_v4_rcv (tcp_ipv4.c:1696) ==5864== by 0x7D447CC: ip_local_deliver_finish (ip_input.c:226) ==5864== by 0x7D442E4: ip_rcv_finish (dst.h:318) ==5864== by 0x7D2313F: process_backlog (dev.c:3368) ==5864== by 0x7D23455: net_rx_action (dev.c:3526) ==5864== by 0x7CF2477: do_softirq (softirq.c:65) ==5864== by 0x7CF2544: softirq_task_function (softirq.c:21) ==5864== by 0x4FA2BE1: ns3::TaskManager::Trampoline(void*) (task- manager.cc:261) ==5864== Uninitialised value was created by a stack allocation ==5864== at 0x7D65B30: tcp_check_req (tcp_minisocks.c:522) ==5864== http://valgrind.org/
Automated Testing 37
Automated testing among multiple nodes
code coverage
regression tests w/ deterministic clock
Jenkins CI
Linux kernel testing w/ userspace applications too
http://ns-3-dce.cloud.wide.ad.jp/jenkins/job/daily-net-next-sim/
Conclusions 38
Direct Code Execution
reproducible testing framework
controllable with distributed nodes
debugging facilities

% cd linux/
% make test ARCH=sim
Thank you (ありがとうございました) 40
• Tazaki et al., Direct Code Execution: Revisiting Library OS Architecture for Reproducible Network Experiments. ACM CoNEXT '13
• Mathieu Lacage. Experimentation Tools for Networking Research. Université de Nice-Sophia Antipolis, 2010.

Acknowledgement
Mathieu Lacage (Alcméon, France; initial design/implementation)
Diana/Planete team, INRIA, Sophia Antipolis, France
rump (netbsd) 43
(gdb) bt
#0  rumpcomp_sockin_sendmsg (s=7, msg=0x703010, flags=0, snd=0x7ffffffed178) at buildrump.sh/src/sys/libsockin/rumpcomp_user.c:426
#1  0x00007ffff7df8526 in sockin_usrreq (so=so@entry=0x6fedb0, req=req@entry=9, m=0x6cce00, nam=nam@entry=0x0, control=control@entry=0x0, l=<optimized out>) at buildrump.sh/src/sys/rump/ne sockin.c:510
#2  0x00007ffff7be4e79 in sosend (so=0x6fedb0, addr=0x0, uio=0x7ffffffed500, top=0x6cce00, control=0x0 l=0x700800) at /home/tazaki/gitworks/buildrump.sh/src/lib/librumpnet/../../sys/rump/../kern/uipc_socket.c:1048
#3  0x00007ffff7be7b4c in soo_write (fp=<optimized out>, offset=<optimized out>, uio=0x7ffffffed500, cre out>, flags=<optimized out>) at /home/tazaki/gitworks/buildrump.sh/src/lib/librumpnet/../../sys/rump/../kern/sy 116
#4  0x00007ffff788f620 in dofilewrite (fd=fd@entry=3, fp=0x6f8e80, buf=0x400e88, nbyte=37, offset=0x6f8 flags=flags@entry=1, retval=retval@entry=0x7ffffffed5e0) at /home/tazaki/gitworks/buildrump.sh/src/lib/librump/../../sys/rump/ sys_generic.c:355
#5  0x00007ffff788f72f in sys_write (l=<optimized out>, uap=0x7ffffffed5f0, retval=0x7ffffffed5e0) at /home gitworks/buildrump.sh/src/lib/librump/../../sys/rump/../kern/sys_generic.c:323
#6  0x00007ffff78de3cd in sy_call (rval=0x7ffffffed5e0, uap=0x7ffffffed5f0, l=0x700800, sy=<optimized out>) tazaki/gitworks/buildrump.sh/src/lib/librump/../../sys/rump/../sys/syscallvar.h:61
#7  rump_syscall (num=num@entry=4, data=data@entry=0x7ffffffed5f0, dlen=dlen@entry=24, retval=retval@entry=0x7ffffffed5e0) at /home/tazaki/gitworks/buildrump.sh/src/lib/librump/../../sys/rump/lib rumpkern/rump.c:1024
#8  0x00007ffff78d573b in rump___sysimpl_write (fd=<optimized out>, buf=<optimized out>, nbyte=<opt at /home/tazaki/gitworks/buildrump.sh/src/lib/librump/../../sys/rump/librump/rumpkern/rump_syscalls.c:121
#9  0x0000000000400d08 in main () at webbrowser.c:86
(gdb)

(diagram: apps / glue / BSD stack)
OSv 44
(gdb) bt
#0  if_transmit (ifp=0xffffc0003fdfa800, m=0xffffc00005bfe100) at ../../bsd/sys/net/if.c:3082
#1  0x0000000000252a57 in ether_output_frame (ifp=0xffffc0003fdfa800, m=0xffffc00005bfe100) at ../../bsd/s if_ethersubr.c:387
#2  0x0000000000252a0a in ether_output (ifp=0xffffc0003fdfa800, m=0xffffc00005bfe100, dst=0xffffc0003e9e ro=0x2000059102a0) at ../../bsd/sys/net/if_ethersubr.c:356
#3  0x0000000000277982 in ip_output (m=0xffffc00005bfe100, opt=0x0, ro=0x2000059102a0, flags=0, imo=0 inp=0xffffc00009ea6400) at ../../bsd/sys/netinet/ip_output.c:612
#4  0x000000000028cb49 in tcp_output (tp=0xffffc00009eafc00) at ../../bsd/sys/netinet/tcp_output.c:1219
#5  0x0000000000296276 in tcp_output_connect (so=0xffffc0000a5a0800, nam=0xffffc00005a8e140) at ../../b netinet/tcp_offload.h:270
#6  0x0000000000296b25 in tcp_usr_connect (so=0xffffc0000a5a0800, nam=0xffffc00005a8e140, td=0x0) at netinet/tcp_usrreq.c:453
#7  0x000000000023503e in soconnect (so=0xffffc0000a5a0800, nam=0xffffc00005a8e140, td=0x0) at ../../bsd uipc_socket.c:744
#8  0x000000000023ad0e in kern_connect (fd=46, sa=0xffffc00005a8e140) at ../../bsd/sys/kern/uipc_syscalls.c
#9  0x00000000002511fa in linux_connect (s=46, name=0x200005910660, namelen=16) at ../../bsd/sys/compa linux_socket.c:712
#10 0x000000000023c088 in connect (fd=46, addr=0x200005910660, len=16) at ../../bsd/sys/kern/uipc_syscall 104
#11 0x000010000220c65a in NET_Connect ()
#12 0x000010000220d0fa in Java_java_net_PlainSocketImpl_socketConnect ()
#13 0x000020000021cd8e in ?? ()
#14 0x00002000059106d8 in ?? ()
(snip)
(gdb)

(diagram: apps (java) / glue / BSD stack)
DCE 45
(dce:node0) bt
#0  sim_dev_xmit (dev=0x7ffff5587020, data=0x7ffff3e0688a "", len=105) at arch/sim/sim.c:349
#1  kernel_dev_xmit (skb=0x7ffff5ccaa68, dev=0x7ffff5587020) at arch/sim/sim-device.c:20
#2  dev_hard_start_xmit (skb=0x7ffff5ccaa68, dev=0x7ffff5587020, txq=0x7ffff5571a90) at net/core/dev.c:25
#3  dev_queue_xmit (skb=0x7ffff5ccaa68) at net/core/dev.c:2830
#4  neigh_hh_output (skb=0x7ffff5ccaa68, hh=0x7ffff5ce8850) at include/net/neighbour.h:357
#5  dst_neigh_output (skb=0x7ffff5ccaa68, n=0x7ffff5ce8790, dst=0x7ffff3e045d0) at include/net/dst.h:409
#6  ip_finish_output2 (skb=0x7ffff5ccaa68) at net/ipv4/ip_output.c:201
#7  ip_finish_output (skb=0x7ffff5ccaa68) at net/ipv4/ip_output.c:234
#8  ip_output (skb=0x7ffff5ccaa68) at net/ipv4/ip_output.c:307
#9  dst_output (skb=0x7ffff5ccaa68) at include/net/dst.h:448
#10 ip_local_out (skb=0x7ffff5ccaa68) at net/ipv4/ip_output.c:110
#11 ip_queue_xmit (skb=0x7ffff5ccaa68, fl=0x7ffff3e04e78) at net/ipv4/ip_output.c:403
#12 tcp_transmit_skb (sk=0x7ffff3e04bd0, skb=0x7ffff5ccaa68, clone_it=1, gfp_mask=32) at net/ipv4/tcp_ou
#13 mptcp_write_xmit (meta_sk=0x7ffff3e053d0, mss_now=1428, nonagle=0, push_one=0, gfp=32) at net/m mptcp_output.c:1182
#14 tcp_write_xmit (sk=0x7ffff3e053d0, mss_now=516, nonagle=0, push_one=0, gfp=32) at net/ipv4/tcp_ou
#15 __tcp_push_pending_frames (sk=0x7ffff3e053d0, cur_mss=516, nonagle=0) at net/ipv4/tcp_output.c:21
#16 tcp_push_pending_frames (sk=0x7ffff3e053d0) at include/net/tcp.h:1610
#17 do_tcp_setsockopt (sk=0x7ffff3e053d0, level=6, optname=3, optval=0x7ffff439cc78 "", optlen=4) at net/ 2625
#18 tcp_setsockopt (sk=0x7ffff3e053d0, level=6, optname=3, optval=0x7ffff439cc78 "", optlen=4) at net/ipv4
#19 sock_common_setsockopt (sock=0x7ffff3e03850, level=6, optname=3, optval=0x7ffff439cc78 "", optlen= core/sock.c:2455
#20 sim_sock_setsockopt (socket=0x7ffff3e03850, level=6, optname=3, optval=0x7ffff439cc78, optlen=4) at socket.c:167
#21 sim_sock_setsockopt_forwarder (v0=0x7ffff3e03850, v1=6, v2=3, v3=0x7ffff439cc78, v4=4) at arch/sim/
#22 ns3::LinuxSocketFdFactory::Setsockopt (this=0x64f000, socket=0x7ffff3e03850, level=6, optname=3, optval=0x7ffff439cc78, optlen=4) at ../model/linux-socket-fd-factory.cc:947
#23 ns3::LinuxSocketFd::Setsockopt (this=0x815f20, level=6, optname=3, optval=0x7ffff439cc78, optlen=4) a linux-socket-fd.cc:89
#24 dce_setsockopt (fd=11, level=6, optname=3, optval=0x7ffff439cc78, optlen=4) at ../model/dce-fd.cc:529
#25 setsockopt () at ../model/libc-ns3.h:179
#26 sockopt_cork (sock=11, onoff=0) at sockunion.c:534
#27 bgp_write (thread=0x7ffff439ce10) at bgp_packet.c:691
#28 thread_call (thread=0x7ffff439ce10) at thread.c:1177
#29 main (argc=5, argv=0x658100) at bgp_main.c:455
#30 ns3::DceManager::DoStartProcess (context=0x6fa970) at ../model/dce-manager.cc:281
#31 ns3::TaskManager::Trampoline (context=0x6fab50) at ../model/task-manager.cc:274
#32 ns3::UcontextFiberManager::Trampoline (a0=32767, a1=-139668064, a2=0, a3=7318352) at ../model/uco

(diagram: apps / glue (POSIX) / glue (linux) / Linux stack)
Conventional Virtualization 46
KVM/Xen/LXC/UML
Code generality (pros): applications and network stacks (operating systems) are not aware of virtualization

(figure: HW → Host OS → two Guest OSes, each running applications over guest syscalls)
Limitations of DCE 48
virtual clock vs real world
cannot interact with the real world
can use wall-clock, but lose reproducibility
low code generality
requires API-specific glue code (POSIX/kernel)
Usage 49
git clone git://github.com/direct-code-execution/net-next-sim.git
cd net-next-sim
make defconfig ARCH=sim
make library ARCH=sim
make testbin -C arch/sim/test
make test ARCH=sim