Mirage: extreme specialisation of virtual appliances
Upcoming SlideShare
Loading in...5
×
 

Mirage: extreme specialisation of virtual appliances

on

  • 5,885 views

Infrastructure-as-a-Service compute clouds provide a flexible hardware platform on which customers host applications as a set of appliances, e.g., web servers or databases. Each appliance is a VM ...

Infrastructure-as-a-Service compute clouds provide a flexible hardware platform on which customers host applications as a set of appliances, e.g., web servers or databases. Each appliance is a VM image containing an OS kernel and userspace processes, within which applications access resources via traditional APIs such as POSIX. However, the flexibility provided by the hypervisor comes at a cost: the addition of another layer in the already complex software stack which impacts runtime performance, and increases the size of the trusted computing base.

Given that modern software is generally written in high-level languages that abstract the underlying OS, we revisit how these appliances are constructed with our Mirage operating system. Mirage supports the progressive specialisation of source code, and gradually replaces traditional OS components with customisable libraries, ultimately resulting in "unikernel" VMs: sealed, fixed-purpose VMs that run directly on the hypervisor.

Developers no longer need to become sysadmins, expert in the configuration of all manner of system components, to use cloud resources. At the same time, they can develop their code using their usual tools, only making the final push to the cloud once they are satisfied their code works. As they explicitly link in components that would normally be provided by the host OS, the resulting unikernels are also highly compact: facilities that are not used are simply not included in the resulting microkernel binary.

This talk will describe the architecture of Mirage, and show a quick demonstration of how to build a web-server that runs as a unikernel on a standard Xen installation.

Statistics

Views

Total Views
5,885
Views on SlideShare
2,065
Embed Views
3,820

Actions

Likes
1
Downloads
32
Comments
0

15 Embeds 3,820

http://www.xen.org 2420
http://xen.org 1015
http://www.xenproject.org 205
http://www-archive.xenproject.org 115
http://xen.xensource.com 15
http://staging.xen.org 13
http://xenorg.cloudaccess.net 11
http://irq.tumblr.com 9
http://xenproject.org 6
http://translate.googleusercontent.com 5
http://safe.txmblr.com 2
http://dynamotech.com 1
http://moderation.local 1
http://www.xen.org. 1
http://10.1.0.56 1
More...

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

Mirage: extreme specialisation of virtual appliances Mirage: extreme specialisation of virtual appliances Presentation Transcript

  • Mirage: OCaml Appliances for Xen Clouds Anil Madhavapeddy, University of Cambridge with a merry crew Haris Rotsos, Balraj Singh, Steven Smith Jon Crowcroft, Steve Hand (University of Cambridge) Richard Mortier (University of Nottingham) Thomas Gazagnaire (OCamlPro) Dave Scott (Citrix Systems R&D) openmirage.org @avsmMonday, 27 August 12
  • Modern Stacks are Too Large Application • Millions of lines of code & configuration • Why build for clouds as for desktops? Threads • We can simplify! Language Processes OS Kernel Hypervisor Hardware openmirage.orgMonday, 27 August 12
  • Modern Stacks are Too Large Application • Millions of lines of code & configuration • Why build for clouds as for desktops? Threads • We can simplify! Language Processes Application Application OS Kernel Language Language Hypervisor Xen POSIX Hardware Hardware Hardware openmirage.orgMonday, 27 August 12
  • Modern Stacks are Too Large Application • Millions of lines of code & configuration • Why build for clouds as for desktops? Threads • We can simplify! Language Processes Application Application t t en en OS Kernel Language Language m ym op lo Hypervisor Xen POSIX el ep ev D D Hardware Hardware Hardware openmirage.orgMonday, 27 August 12
  • How Large is Large? 600 OCaml 500 C/C++ Lines of code (x 10 ) 3 ASM 400 300 200 100 0 Li gl Bi ht O O N M O nu ib pe pe ira tp nd X- c d x nS n ge 9. 2. za 2. 3. vS 9. SH 15 4. 2. ku w 0 2 2 itc 6. h 0p openmirage.org 1. 1 4. 0Monday, 27 August 12
  • Why Do We Care? • Critical memory bugs still occur regularly in mature software • CVE-2012-1182 – Samba – RPC code generator overflow – Variable containing buffer length checked independently of variable used to allocate memory for buffer – Leads to root exploit • CVE-2012-2110 – OpenSSL – Integer conversion bug mixed with realloc wrappers – Unsigned cast to signed, but realloc’d buffer not clamped – Leads to heap corruption openmirage.orgMonday, 27 August 12
  • The Cloud Threat Model • Attack from without, not within: trust hypervisor, but not I/O traffic or other VMs. • VMs are no longer multi-user, but primarily single-purpose deployments • …and they are in a multi-tenant datacenter • …and they are always network connected openmirage.orgMonday, 27 August 12
  • Sealed Appliances Configuration Files Mirage Compiler application source code Application Binary configuration files Language Runtime hardware architecture whole-system optimisation Parallel Threads User Processes OS Kernel Application Code Mirage Runtime } specialised unikernel Hypervisor Hypervisor Hardware Hardware openmirage.orgMonday, 27 August 12
  • Key Features of Mirage • Static typing – Pragmatic functional language (OCaml) – Large set of libraries provided, some used in XCP already Net.Manager.bind (fun mgr dev ! let src = ’any_addr, 53 in Dns.Server.listen dev src zones ) openmirage.orgMonday, 27 August 12
  • Key Features of Mirage • Static typing – Pragmatic functional language (OCaml) – Large set of libraries provided, some used in XCP already • Cooperative concurrency – Wrapped up in Lwt syntax extensions – Threads encapsulated and hidden within typed modules let main () = lwt zones = read key "zones" "zone.db" in Net.Manager.bind (fun mgr dev ! let src = ’any_addr, 53 in Dns.Server.listen dev src zones) openmirage.orgMonday, 27 August 12
  • Key Features of Mirage • Static typing – Pragmatic functional language (OCaml) – Large set of libraries provided, some used in XCP already • Cooperative concurrency – Wrapped up in Lwt syntax extensions – Threads encapsulated and hidden within typed modules • Library-based operating system – Fully re-entrant functional libraries • No dynamic loading – Configuration evaluated at compile-time, and sealed – Recompile and redeploy to reconfigure openmirage.orgMonday, 27 August 12
  • Progressive Specialization Develop Test Deploy ubuild posix-socket ubuild posix-direct ubuild xen-direct kernel sockets tuntap+safe I/O stack safe device drivers bytecode VM x86_64 native code link time optimisation ELF REPL ELF ELF ELF μ μ μ μ μ μ Linux Linux FreeBSD Xen openmirage.orgMonday, 27 August 12
  • Progressive Specialization • Development under UNIX: Develop Test Deploy – full debugging environment ubuild posix-socket (bytecode, debugger, xen-direct ubuild posix-direct ubuild gdb) kernel sockets tuntap+safe I/O stack safe device drivers – can use kernel sockets or tuntap bytecode VM x86_64 native code and IO to optimisation networking link time isolate bugs ELF REPL ELF interactive REPLμfor editor μ μ – ELF ELF μ μ μ integration and other niceties. Linux Linux FreeBSD Xen openmirage.orgMonday, 27 August 12
  • Progressive Specialization • Testing: Develop Test Deploy – Simulation ubuild posix-socket ubuild posix-direct ubuild xen-direct backend using NS3/MPI kernel sockets tuntap+safe I/O stack safe device drivers bytecode VM – Emulate your x86_64 native code link time optimisation code at scale before REPL ELF ELF ELF ELF μ μ μ μ μ μ deploying it Linux Linux FreeBSD Xen openmirage.orgMonday, 27 August 12
  • Progressive Specialization • Xen output kernel is specialised: Develop Test – “operating system” is a set of Deploy libraries (e.g Netfront, Blkfront, ubuild posix-socket ubuild posix-direct ubuild xen-direct TCP) linked with small C runtime. – Configuration files are evaluated at kernel sockets tuntap+safe I/O stack safe device drivers this stage, and image must be bytecode VM x86_64 native code recompiled to reconfigure. link time optimisation – Data files can also be linked in relatively small. ELF ELF directly, ifREPL ELF ELF μ μ μ μ μ μ – Dead-code elimination is across Linux Linux FreeBSD Xen whole OS image. openmirage.orgMonday, 27 August 12
  • Simplified Memory Management text$and$ IP$header 4kB 120TB data TCP$header foreign tx$data 64Dbit$virtual$address$space$ grants IP$header 4kB TCP$header rx$data reserved by$Xen 8×512kB 4kB sectors OCaml minor$heap 128TB OCaml major$heap openmirage.orgMonday, 27 August 12
  • Optional VM Seal Hypercall • Single address-space and no dynamic loading – W^X address space – Address offsets are randomized at compile-time • Dropping page table privileges: – Added freeze hypercall called just before app starts – Subsequent page table updates are rejected by Xen. – Exception for I/O mappings if they are non-exec and do not modify any existing mappings. • Very easy in unikernels due to focus on compile-time specialisation instead of run-time complexity. openmirage.orgMonday, 27 August 12
  • Event Driven co-threads Cumulative frequency (%) 100 80 60 40 xen-direct 20 linux-native linux-pv 0 0 0.05 0.1 0.15 0.2 Jitter (ms) The microkernel is event-driven, with no pre-emption at all. Graph shows CDF of thread wakeup latency for a Mirage VM running directly on Xen, vs native or PV Linux userspace. openmirage.orgMonday, 27 August 12
  • Single-instance Thread Scaling 4 linux-pv Execution time (s) linux-native 3 xen-direct (malloc) xen-direct (extent) 2 1 0 0 5 10 15 20 Number of threads (millions) Threads are heap allocated values, so benefit from the faster garbage collection cycle in the Mirage Xen version, and the scheduler can be overridden by application-specific needs. openmirage.orgMonday, 27 August 12
  • Appliance Image Size Appliance Standard Build Dead Code Elimination DNS 0.449 MB 0.184 MB Web Server 0.674 MB 0.172 MB Openflow learning switch 0.393 MB 0.164 MB Openflow controller 0.392 MB 0.168 MB All configuration and data compiled into the image by the toolchain, so no separate VBD required beyond the PV kernel. Live migration is easy and fun :-) openmirage.orgMonday, 27 August 12
  • Microbenchmarks: Boot Time linux-pv minimal-linux xen-direct 3 Time (s) 2 1 0 8 16 32 64 12 25 51 10 20 30 8 6 2 24 48 72 Memory size (MiB) • Minimal Linux is a custom initrd which directly calls ioctls to send a UDP packet. The Linux-PV is more realistic. • The Xen-Direct is a standard Mirage VM, and is limited by dom0 toolstack latency (XCP in this case). openmirage.orgMonday, 27 August 12
  • Zero-Copy IO Buffer Management free IO Page Pool recycle get_page make_writebuf channel output_buf reset_buf uninitialised reference Grant Table fixed data mutable data Ring response request openmirage.orgMonday, 27 August 12
  • Microbenchmarks: TCP 1000 Throughput (reqs/s x 10 ) 3 linux-pv, tx & rx 800 linux-pv tx, xen-direct rx xen-direct tx, linux-pv rx 600 • Simple throughput test with all 400 offload turned off to stress CPU • Performance bug in TX path 200 – …being fixed (ACKs being sent out of order) 0 10 flo 1 flo si w pa ws ng ra le lle l openmirage.orgMonday, 27 August 12
  • Microbenchmarks: Block Storage Fast PCI-E SSD 1600 Throughput (MiB/s) 1400 1200 xen-direct 1000 linux-pv, direct I/O 800 linux-pv, buffered I/O 600 400 200 0 1 2 4 8 16 32 64 12 25 51 10 20 40 8 6 2 24 48 96 Block size (KiB) openmirage.orgMonday, 27 August 12
  • Microbenchmarks: Block Storage • Same Ring API as Net – Unlike POSIX (libaio and so on) – Caching is controlled via libraries, not API • No reordering in the front-end – How do we know what the backend storage is? 70 60 Single spindle SATA Linux domU Throughput (MiB/s) 50 direct/buffered 40 30 20 Mirage 10 0 1 2 4 8 16 32 64 12 25 51 10 20 openmirage.org 40 8 6 2 24 48 96Monday, 27 August 12 Block size (KiB)
  • Microbenchmarks: Summary • Mirage is comparable to or exceeds Linux PV performance – I.e., the functional programming language overheads are offset via single-address space specialization and careful buffer management. • But what about in some more “realistic” scenarios? – DNS server appliance – HTTP server scaling across 6 vCPUs – OpenFlow Controller and Switch implementations openmirage.orgMonday, 27 August 12
  • DNS Server Performance 80 Throughput (reqs/s x 10 ) 3 70 60 Bind9, Linux 50 NSD, Linux NSD, MiniOS -O 40 NSD, MiniOS -O3 30 Mirage (no memo) 20 Mirage (memo) 10 0 100 1000 10000 Zone size (entries) openmirage.orgMonday, 27 August 12
  • DNS Server Performance let main () = lwt zones = read key "zones" "zone.db" in Net.Manager.bind (fun mgr dev → let src = ’any_addr, 53 in Dns.Server.listen dev src zones ) openmirage.orgMonday, 27 August 12
  • Scaling via Multiple Instances 2500 Throughput (conns/s) linux-pv (1 host, 6 vcpus) 2000 linux-pv (2 hosts, 3 vcpus) linux-pv (6 hosts, 1 vcpu) 1500 xen-direct (6 unikernels) 1000 500 • Apache/Linux vs. Mirage appliance 0 • Serving single static page openmirage.orgMonday, 27 August 12
  • Openflow Controller performance 180 160 maestro Requests/s (x 10 ) 3 140 nox-fast 120 xen-direct 100 80 60 40 20 0 Batch Single • Openflow controller is competitive with Nox (C++), but much more high-level. Applications can link directly against the switch to route their data. openmirage.orgMonday, 27 August 12
  • Design Discussion • Service domains should be built this way – Debugging and moving from dom0/domU is a lot easier. – Dependencies are explicitly managed. – Resource control can be fine-grained. – Very useful for unit testing Xen features! • Which language? – Several viable alternatives: HalVM (Haskell), GuestVM (Java), ErlangOnXen. More code sharing worthwhile? – Protocol implementations take time to mature. – Working on Mirage/XCP integration via “project Windsor” • Stub domains? – Less practical for stub domains, due to hardware devices. Turn Linux into a library OS? openmirage.orgMonday, 27 August 12
  • Mirage Roadmap • Developer Release for Q3 2012 – Monolithic repository at http://github.com/avsm/mirage – Adding pkg mgmt: http://github.com/OCamlPro/opam $ opam init git://github.com/mirage/opam-repository $ opam remote -add dev git://github.com/mirage/opam-repo-dev $ opam switch mirage-3.12.1-xen $ opam install mirage-www openmirage.orgMonday, 27 August 12
  • Mirage Roadmap • Developer Release for Q3 2012 – Monolithic repository at http://github.com/avsm/mirage – Adding pkg mgmt: http://github.com/OCamlPro/opam $ opam init git://github.com/mirage/opam-repository $ opam remote -add dev git://github.com/mirage/opam-repo-dev $ opam switch mirage-3.12.1-xen $ opam install mirage-www • New architectures ongoing: – FreeBSD kernel: http://github.com/pgj/mirage-kfreebsd – Javascript: http://ocsigen.org/js_of_ocaml – ARM/MIPs64/rPI: via FreeBSD kmod – Simulation: NS3/OpenMPI functional simulation openmirage.orgMonday, 27 August 12
  • Mirage Roadmap • Developer Release for Q3 2012 – First release will have pure-OCaml implementations of: • Device drivers (netfront/blkfront/xenstore) • TCP/IPv4 and DHCPv4 • HTTP • DNS(SEC) • SSH • OpenFlow (controller/switch) • vchan IPC • 9P :-) • NFS • FAT32 • Distributed k/v store: arakoon.org openmirage.orgMonday, 27 August 12
  • Mirage Roadmap • Online resources: – http://www.openmirage.org – http://tutorial.openmirage.org – https://lists.cam.ac.uk/mailman/listinfo/cl-mirage – (draft in Q4 2012) http://realworldocaml.org • Offline resources: – Find me at the hotel pool openmirage.orgMonday, 27 August 12