Direct Code Execution @ CoNEXT 2013

The presentation at CoNEXT 2013, Santa Barbara, CA.

  1. Direct Code Execution: Revisiting Library OS Architecture for Reproducible Network Experiments. Hajime Tazaki (University of Tokyo, Japan), Frederic Urbani (INRIA, France), Emilio Mancini (INRIA, France), Mathieu Lacage (Alcmeon, France), Daniel Camara (INRIA, France), Thierry Turletti (INRIA, France), Walid Dabbous (INRIA, France). ACM CoNEXT 2013.
  2. Our target: experimentation reproducibility. Ideally, one should be able to easily verify published results (same scenario) and test and debug with other scenarios. This requires functional realism, timing realism, and debuggability. Diagram labels: prove the idea is good under the same conditions; try to replicate; extend with an idea.
  3. Related work: real-time emulation. Container-Based Emulation provides lightweight virtualization. Mininet-HiFi, proposed at CoNEXT'12, ensures fidelity of experiments, but timing realism is still limited by hardware resources and there is no debugging support.
  4. Related work: virtual time. Time Dilation [NSDI'06]: clock adjustment between different systems; constant time dilation factor (x0.5). Slice Time [NSDI'12]: uses a synchronizer to adjust speeds between VMs and the underlying emulated network. TTVM [ATC'05]: supports debugging with backward/forward navigation.
  5. Related work: network simulators. Pros: more debuggability; no real-time constraint. Cons: lack of functional realism.
  6. Motivation: improve the functional realism of simulators while keeping timing realism and debuggability.
                            Simulators   Emulators   Ours
      Functional realism    --           ++          +
      Timing realism        ++           -/+         ++
      Debuggability         +            -           +
  8. Our approach: Direct Code Execution (DCE). Functional realism: run real code (POSIX apps, kernel network stacks). Timing realism: ns-3 integration (virtual clock). Debuggability: all in userspace, single-process virtualization. Diagram labels: node#1 to node#N, each with applications and a network stack, on top of the DCE simulation core, in one process, on the host operating system and hardware.
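The virtual clock that ns-3 integration provides can be illustrated with a toy discrete-event loop. The sketch below is not ns-3 or DCE code; `VirtualClockSimulator` and its methods are hypothetical names chosen for illustration. It shows why simulated time does not depend on host speed: the clock jumps from one event timestamp to the next instead of following the wall clock.

```python
import heapq
import itertools


class VirtualClockSimulator:
    """Toy discrete-event scheduler (hypothetical; not the ns-3 API)."""

    def __init__(self):
        self.now = 0.0                  # current virtual time in seconds
        self._queue = []                # heap of (time, seq, callback)
        self._seq = itertools.count()   # tie-breaker keeps ordering deterministic

    def schedule(self, delay, callback):
        """Run callback after `delay` seconds of virtual time."""
        heapq.heappush(self._queue, (self.now + delay, next(self._seq), callback))

    def run(self):
        """Process events in timestamp order; the clock jumps between events."""
        while self._queue:
            self.now, _, callback = heapq.heappop(self._queue)
            callback()


sim = VirtualClockSimulator()
trace = []
# Events are scheduled out of order but executed in virtual-time order.
sim.schedule(2.0, lambda: trace.append(("rx", sim.now)))
sim.schedule(1.0, lambda: trace.append(("tx", sim.now)))
sim.run()
print(trace)
```

However slow the host is, the recorded timestamps stay the same, which is the property the slide calls timing realism.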
  9. DCE architecture. Diagram components: applications (ip, iptables, quagga) and ns-3 applications; DCE POSIX layer; DCE kernel layer (TCP, UDP, ICMP, SCTP, DCCP, IPv4, IPv6, ARP, Netfilter, Netlink, Qdisc, IPSec, bridging, tunneling, struct net_device, bottom halves/RCU/timer/interrupt); DCE virtualization core layer (heap, stack, memory); ns-3 TCP/IP stack; ns-3 (network simulation core).
  10. 1) Virtualization core layer. Runs multiple nodes in a single (host) process, using dlmopen(3) etc. Provides isolation of global symbols for each simulated process and management of the stacks/heaps of simulated processes. Keeps ns-3 features: timing realism, debuggability.
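The isolation of global symbols can be pictured with a Python analogy. The slide names dlmopen(3) as one of the mechanisms; the sketch below only imitates the effect by executing the same source in separate module objects, so that each simulated process gets a private copy of the globals. The helper `load_private_copy` and the toy `APP_SOURCE` are hypothetical, and this is an analogy rather than how DCE is implemented.

```python
import types

# Toy "application" with a global variable, standing in for a C program's globals.
APP_SOURCE = """
counter = 0

def bump():
    global counter
    counter += 1
    return counter
"""


def load_private_copy(name):
    """Execute APP_SOURCE in a fresh module so its globals are private (hypothetical helper)."""
    module = types.ModuleType(name)
    exec(APP_SOURCE, module.__dict__)
    return module


# Two simulated processes run the same code inside one host process.
node0_app = load_private_copy("app_node0")
node1_app = load_private_copy("app_node1")

node0_app.bump()
node0_app.bump()
node1_app.bump()

# Each copy kept its own counter: node 0 was bumped twice, node 1 once.
print(node0_app.counter, node1_app.counter)
```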
  11. 2) Kernel layer (library operating system). Functional realism. Similar to a library OS: a shared library (e.g., liblinux.so) that is replaceable (e.g., libfreebsd.so). Mapping via glue code: struct net_device <=> ns3::NetDevice; jiffies/gettimeofday() synchronized with the simulated clock. Architecture-independent code; minimizes modifications to the original code.
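Synchronizing jiffies with the simulated clock amounts to deriving the tick counter from virtual time rather than from a hardware timer. A minimal sketch of that arithmetic follows; the tick rate `HZ = 100` and the function name are assumptions for illustration, not values taken from the DCE glue code.

```python
HZ = 100  # assumed kernel tick rate (ticks per second); the real value is a kernel config choice

NANOSECONDS_PER_SECOND = 1_000_000_000


def jiffies_from_sim_time(sim_time_ns, hz=HZ):
    """Whole ticks elapsed at the given simulated time (hypothetical helper)."""
    return (sim_time_ns * hz) // NANOSECONDS_PER_SECOND


# 2.5 simulated seconds at 100 ticks per second
print(jiffies_from_sim_time(2_500_000_000))
```

Integer arithmetic is used on purpose so that the tick count is exact and identical on every run.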
  12. 3) POSIX layer. Functional realism. POSIX reimplementation: 1. pass-through to the host library call, e.g., strcpy(3) => (reuse); 2. reimplementation, if a function call involves a kernel resource (i.e., system calls), redirected to our kernel module, e.g., socket(2) => dce_socket().
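The two-way split of the POSIX layer can be sketched as a dispatch table. This is a toy model in Python, not DCE's C implementation: only the names `strcpy`, `socket`, and `dce_socket` come from the slide, while `host_strcpy`, the stub bodies, and `posix_call` are hypothetical.

```python
def host_strcpy(src):
    """Stands in for the host libc function, which is reused unchanged."""
    return str(src)


def dce_socket(node_id, domain, sock_type):
    """Stands in for the reimplemented call that goes to the simulated kernel."""
    return {"node": node_id, "domain": domain, "type": sock_type}


# Functions that touch no kernel resource pass through; system calls are redirected.
PASS_THROUGH = {"strcpy": host_strcpy}
REIMPLEMENTED = {"socket": dce_socket}


def posix_call(node_id, name, *args):
    """Route a call made by a simulated process on node `node_id` (hypothetical helper)."""
    if name in PASS_THROUGH:
        return PASS_THROUGH[name](*args)
    if name in REIMPLEMENTED:
        return REIMPLEMENTED[name](node_id, *args)
    raise NotImplementedError(name)


print(posix_call(0, "strcpy", "hello"))
print(posix_call(3, "socket", "AF_INET", "SOCK_STREAM"))
```

The redirected call receives the node id because a kernel resource such as a socket belongs to one simulated node, whereas a pure library function needs no such context.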
  13. Use cases
  14. Use cases. Reproducibility of an experiment (functional realism). How easy is it to debug a distributed protocol? (debuggability)
  15. Reproducibility. Replicating the MPTCP NSDI'12 experiment from the literature with DCE + ns-3 (LTE/Wi-Fi), using Linux MPTCP (same software) and iperf. Goodput measurement of TCP (3G), TCP (Wi-Fi), and MPTCP (both). Topology labels: iperf (client), LTE, eNodeB, PGW, Wi-Fi AP, iperf (server), Rx, Tx. Plot (caption: MPTCP used over real 3G and Wi-Fi): average goodput (Mbps, 0 to 4) versus receive/send buffer size (0.05, 0.1, 0.2, 0.5 Mbytes) for MPTCP, TCP over Wi-Fi, and TCP over 3G.
  16. Reproducibility (cont'd). Functional realism: fully reproducible. Two plots of average goodput (Mbps, 0 to 4) versus receive/send buffer size (0.05, 0.1, 0.2, 0.5 Mbytes): Replicate (w/ DCE), with MPTCP, TCP over Wi-Fi, and TCP over LTE; Original (NSDI'12), with MPTCP, TCP over Wi-Fi, and TCP over 3G. Differences: 1) no significant goodput improvement with buffer size for single-path TCP in DCE; 2) maximum goodput range: 2.2 - 2.9 Mbps (DCE) versus 2.0 - 3.2 Mbps (NSDI).
  17. Debuggability. Memory error detection among distributed nodes in a single process using Valgrind (http://valgrind.org/):
      ==5864== Memcheck, a memory error detector
      ==5864== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
      ==5864== Using Valgrind-3.6.0.SVN and LibVEX; rerun with -h for copyright info
      ==5864== Command: ../build/bin/ns3test-dce-vdl --verbose
      ==5864==
      ==5864== Conditional jump or move depends on uninitialised value(s)
      ==5864==    at 0x7D5AE32: tcp_parse_options (tcp_input.c:3782)
      ==5864==    by 0x7D65DCB: tcp_check_req (tcp_minisocks.c:532)
      ==5864==    by 0x7D63B09: tcp_v4_hnd_req (tcp_ipv4.c:1496)
      ==5864==    by 0x7D63CB4: tcp_v4_do_rcv (tcp_ipv4.c:1576)
      ==5864==    by 0x7D6439C: tcp_v4_rcv (tcp_ipv4.c:1696)
      ==5864==    by 0x7D447CC: ip_local_deliver_finish (ip_input.c:226)
      ==5864==    by 0x7D442E4: ip_rcv_finish (dst.h:318)
      ==5864==    by 0x7D2313F: process_backlog (dev.c:3368)
      ==5864==    by 0x7D23455: net_rx_action (dev.c:3526)
      ==5864==    by 0x7CF2477: do_softirq (softirq.c:65)
      ==5864==    by 0x7CF2544: softirq_task_function (softirq.c:21)
      ==5864==    by 0x4FA2BE1: ns3::TaskManager::Trampoline(void*) (taskmanager.cc:261)
      ==5864== Uninitialised value was created by a stack allocation
      ==5864==    at 0x7D65B30: tcp_check_req (tcp_minisocks.c:522)
      ==5864==
  18. Debuggability. Inspect code during experiments among distributed nodes in a single process using gdb: conditional breakpoints with a node id (in a simulated network); fully reproducible (to easily catch a bug). Diagram labels: home agent, correspondent node, ping6, AP1, AP2, Wi-Fi, handoff, mobile node.
      (gdb) b mip6_mh_filter if dce_debug_nodeid()==0
      Breakpoint 1 at 0x7ffff287c569: file net/ipv6/mip6.c, line 88
      <continue>
      (gdb) bt 4
      #0  mip6_mh_filter (sk=0x7ffff7f69e10, skb=0x7ffff7cde8b0) at net/ipv6/mip6.c:109
      #1  0x00007ffff2831418 in ipv6_raw_deliver (skb=0x7ffff7cde8b0, nexthdr=135) at net/ipv6/raw.c:199
      #2  0x00007ffff2831697 in raw6_local_deliver (skb=0x7ffff7cde8b0, nexthdr=135) at net/ipv6/raw.c:232
      #3  0x00007ffff27e6068 in ip6_input_finish (skb=0x7ffff7cde8b0) at net/ipv6/ip6_input.c:197
  19. Continuous Integration (CI). Automated testing among multiple nodes: code coverage; regression tests with a deterministic clock. Jenkins CI: Linux kernel testing, userspace applications.
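A deterministic clock is what makes regression tests over multiple simulated nodes repeatable: two runs with the same inputs should produce identical traces. The sketch below illustrates the idea with a toy lossy-link simulation; `run_experiment`, its parameters, and the loss model are hypothetical and unrelated to the project's actual Jenkins jobs.

```python
import random


def run_experiment(seed, packets=20, loss_rate=0.2, interval=0.01):
    """Send packets over a toy lossy link; return (virtual_time, seq) for each delivery."""
    rng = random.Random(seed)   # private generator: no dependence on global state
    trace = []
    for seq in range(packets):
        virtual_time = round(seq * interval, 6)   # virtual clock, not the wall clock
        if rng.random() >= loss_rate:
            trace.append((virtual_time, seq))
    return trace


first = run_experiment(seed=1)
second = run_experiment(seed=1)
assert first == second   # same seed and virtual clock: identical trace
print(len(first), "of 20 packets delivered in both runs")
```

A regression test can then compare a fresh trace against a stored one and fail on any difference, which would be unreliable if timestamps came from the wall clock.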
  20. Conclusions. DCE allows increased realism (functional/timing), full reproducibility (through determinism), and debuggability of protocol implementations. It enables reproducible network experiments.
                            Simulators   Emulators   DCE
      Functional realism    --           ++          +
      Timing realism        ++           -/+         ++
      Debuggability         +            -           +
  21. Thank you. http://bit.ly/ns-3-dce https://github.com/direct-code-execution
  22. Backup Slides
  23. Direct Code Execution: insert real network code in simulators. Easy replication. Functional realism (real code); timing realism (time dilation); reproducibility (full control); scalability (slower execution, accurate results); debuggability (single process).
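  24. How to use DCE? Prepare binaries: liblinux.so (from the Linux tree + patch) and iperf (built as a PIE binary). Then write a simulation script:
      #!/usr/bin/python
      from ns.dce import *
      from ns.core import *
      from ns.network import *  # NodeContainer is defined in ns.network

      # Create 100 simulated nodes
      nodes = NodeContainer()
      nodes.Create(100)

      # Install DCE on every node, with the Linux kernel as the network stack
      dce = DceManagerHelper()
      dce.SetNetworkStack("liblinux.so")
      dce.Install(nodes)

      # Run the iperf binary on every node
      app = DceApplicationHelper()
      app.SetBinary("iperf")
      app.Install(nodes)

      # Run the simulation for 1000 seconds of virtual time
      Simulator.Stop(Seconds(1000.0))
      Simulator.Run()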
  25. Limitations of DCE. Virtual clock vs. real world: cannot interact with the real world; can use the wall clock, but loses reproducibility. Low code generality: requires API-specific glue code (POSIX/kernel).
  26. Micro-benchmarks: DCE vs. Mininet-HiFi. Settings: Xeon 2.8 GHz / 8 GB RAM; UDP socket program; linear topology from udp-perf (client) at node 0 through nodes 1 ... n-1 to udp-perf (server) at node n; 1470 bytes / 100 Mbps. Measured: 1) speed of packet processing; 2) scalability needed to ensure realistic results.
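The offered load implied by these settings follows from simple arithmetic, assuming that 1470 bytes is the UDP payload per packet and that the 100 Mbps rate counts payload bits only (header overhead is ignored; the slide does not say which is meant).

```python
PAYLOAD_BYTES = 1470       # from the benchmark settings
RATE_BPS = 100_000_000     # 100 Mbps, assumed to count payload bits only

bits_per_packet = PAYLOAD_BYTES * 8
packets_per_second = RATE_BPS / bits_per_packet

print(f"{packets_per_second:.0f} packets per second")
```

Under these assumptions the sender emits roughly 8,500 packets per second of virtual time.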
  27. Micro-benchmarks (results). Topology: udp-perf (client) at node 0 through nodes 1 ... n-1 to udp-perf (server) at node n. Plot 1: received packets per wall-clock second (pps, 0 to 16000) versus number of hops (0 to 64), for Mininet-HiFi and DCE. Plot 2: number of sent/received packets (0 to 450000) versus number of hops (0 to 64), with series Sent, Mininet Recv, and DCE Recv, and a "Packet Loss" annotation. DCE achieves timing realism.
  28. Flexibility: code coverage as a metric of flexibility.
      File                 Lines    Funcs     Branches
      mptcp_ctrl.c         76.3%    86.7%     59.9%
      mptcp_input.c        66.9%    85.0%     57.9%
      mptcp_ipv4.c         68.0%    93.3%     43.8%
      mptcp_ipv6.c         57.4%    85.0%     45.2%
      mptcp_ofo_queue.c    91.2%    100.0%    89.2%
      mptcp_output.c       71.2%    91.9%     58.6%
      mptcp_pm.c           54.2%    71.4%     40.5%
      Total                68.0%    85.9%     54.8%
      Settings: mptcp_v0.86; DCE-ed test programs (<1LoC). Configuration of test programs: simple 2 paths (ipv4 iperf); dual-stack 2 paths (v6only, v4/v6); 10 different packet loss rates.
  29. POSIX API coverage. Plot: coverage (axis 0 to 500) at 2009-09-04, 2010-03-10, 2011-05-20, 2012-01-05, and 2013-04-09.
