OSV
Avi Kivity, Don Marti
Cloudius Systems
@CloudiusSystems
Typical Cloud Stack
Hardware
Hypervisor
OS
JVM
App Server
Your App
OS
JVM
App Server
Your App
OS
JVM
App Server
Your App
Our software stack
congealed into existence.
Too Many Layers, Too Little Value
Property/Component
Hardware abstraction
Isolation
Resource virtualization
Backward compatibility
Security
Memory management
I/O stack
Configuration
VMM OS runtime
Duplication
Transformed the
enterprise from
physical2virtual
Virtualization
Virtualization 1.0 Virtualization 2.0
Compute node
virtual server
Virtualization 2.0, Massive Scale
Scalability
Virtualization 2.0
vServer OS 1.0Architecture
● No Hardware
● No Users
● One app/guest
● Yes Complexity
The new Cloud Stack - OSv
Hardware
Hypervisor
Core
App
Server
Your App
Single
Process
Kernel
space only
Linked to
existing
JVMs
App sees
no change
Core
Your App
The new Cloud Stack - OSv
Memory Huge pages, Heap vs Sys
I/O Zero copy, full aio, batching
Scheduling Lock free, low latency
Tuning Out of the box, auto
CPU
Low cost ctx, Direct
signals,..
Alpha Release
3/2014
Optimized ZFS cache
Dirty page writeback
JVM page table
HV support:
GCE, VMW, VBox
Milestones
Git init
osv,
12/2012
Java hello world,
01/2013
support for
64 vcpu
02/2013
UDP,
03/2013
ZFS support
> 1Gbps netperf,
6/2013
Cassandra
outperforms
Linux,
8/2013
Native REST API
Memcached gain >
70%
1/2014
Cli and web
interface
9/2013
Tomcat,
HAProxy
modules
10/2013
Net channels
JVM ballooning
>47Gbps netperf
2/2014
Image Repository
JVM Read barrier
elimination
DPDK support
More ...
Memcached benchmark
Requests/s (higher is better)
Integrating the JVM into the kernel
Core
JVM
Application
Server
Your AppDynamic
Heap
Memory
TCP in the
JVM + App
context Fast inter
thread
wakeup
JVM Garbage Collection and the
kernel
Application thread (“mutator”)
Garbage collection thread
JVM Object
JVM Object
ref
JVM Garbage Collection
Application thread (“mutator”)
Garbage collection thread
JVM Object
JVM Object
JVM Object
ref
change
JVM Garbage Collection
Application thread (“mutator”)
Garbage collection thread
JVM Object
JVM Object
JVM Object
ref
scan
change
JVM Garbage Collection
Application thread (“mutator”)
Garbage collection thread
JVM Object
JVM Object
JVM Object
ref
scan
write
change
card table
read
JVM Garbage Collection
Application thread (“mutator”)
Garbage collection thread
JVM Object
JVM Object
JVM Object
ref
scan
change
page table
read
hardware update
Van Jacobson == TCP/IP
Common kernel network stack
Leads to servo-loop:
Van Jacobson == TCP/IP
Net Channel design:
Dynamic heap, sharing is good
JVM Memory System
memory
Lend
memory
OS that doesn’t get in the way
4 VMs per sys
admin ratio
http://www.computerworld.com.au/article/352635/there_best_practice_server_system_administrator_ratio_/
NO Tuning
NO State
NO Patching
Virtualization 2.0: Stateless servers
REST API definition use Swagger
REST api for *any* action
Performance and tracing
Porting a JVM application to OSV
1. Done*
* well, unless the application fork()s
Porting a C application to OSV
1. Must be a single-process application
2. May not fork() or exec()
3. Rebuild as a shared object (.so)
4. Other API limitations apply*
* Post to the list, and missing items are usually added quickly.
● JVM-based application
○ No effort
● Traditional C/C++ application
○ More effort
● Virtio-app
○ NFV - highest gain
Different customization levels
● First OS to be written in C++11
● Interrupt handler using lambda function:
_msi.easy_register({
{ 0, [&] { _rxq.vqueue->disable_interrupts(); },
poll_task},
{ 1, [&] { _txq.vqueue->disable_interrupts(); }, nullptr }
});
OSv == Cool
So what about
containers?
Capstan demo
● Docker is awesome! Low overhead!
○ Small artifacts
○ Fast deployment
○ Fast startup (container start, not OS boot)
● Docker is awesome! Easy builds!
○ Simple configuration file
○ Minimal additional work on top of application build
Docker is awesome
Containers under the hood
● Hypervisors are awesome too!
○ Live migration (load balance, hardware maint)
○ Multiple kernel versions available
● Running VMs everywhere is awesome!
○ Public cloud, private cloud, existing hypervisor...
● Security is awesome!
○ Container attack surface is full kernel interface.
○ Hypervisor attack surface is small.
● Sometimes a hypervisor is already there
(private and public clouds)
Hypervisors are awesome
Everything is awesome
What if we could have advantages of Docker
without limitations of containers?
● Fast boot, fast provisioning
● High performance
● Tiny footprint
● Virtualized
○ Live migration
○ Elasticity
○ MMU access
Capstan
● Check out a base image
● Build application
● Create complete run-anywhere VM
○ No Capstan needed in production
● 12-20MB and 3 seconds of overhead
● All driven by one simple config file
● Low overhead of containers with
deployment flexibility of a VM image.
Let’s Collaborate
http://osv.io
https://github.com/cloudius-systems/osv
@CloudiusSystems
osv-dev@googlegroups.com
OSv
@Cloudius
Backup slides
Virtualization 2.0, Dev/Ops
Virtualization 2.0, agility!
Rolling upgrade
within seconds and
a fall back option
Cloudius Systems, OS Comparison
Feature/Property
Good for:
Typical workload
kernel vs app
API, compatibility
# Config files
Tuning
Upgrade/state
OSv
Machete:
Cloud/Virtualization
Single app * VMs
Cooperation
JVM, POSIX
0
Auto
Stateless, just boots
JVM support
Lines of code
License
Tailored GC/STW
solution
Fewer
BSD
Traditional OS
Swiss knife: anything
goes
Multiple apps/users,
utilities, anything
distrust
Any, but
versions/releases..
1000
Manual, requires
certifications
Complex, needs
snapshots, hope..
Yet another app
Gazillion
GPL / proprietary
Be the best OS
powering virtual machines
in the cloud
Mission statement
Hardware
Hypervisor
OSv
Your App
Hardware
Hypervisor
OSv
+ Jazz JVM
Your App
Hardware
Hypervisor
OSv
+ Jazz JVM
Hardware
Hypervisor
OSv
Hardware
Hypervisor
OSv
+ Jazz JVM
Your App
Application upload
● Going idle is much more expensive on
virtual machines
● So are inter-processor interrupts - IPIs
● Combine the two:
○ Before going idle, announce it via shared memory
○ Delay going idle
○ In the meanwhile, poll for wakeup requests from
other processors
● Result: wakeups are faster, both for the
processor waking, and for the wakee
Idle-time polling
OSv
applicable for
Virtualization
Application server
Virtual appliance
PaaS
Cloud
Vendors
Plan
1. Open Source Launch - Done
2. Harden Virtual Appliances - Doing
3. Q4 2014 GA by Cloud vendors -
Doing
4. OEM (cloud/appliance), Enterprise

OSv presentation from Linux Foundation Collaboration Summit

  • 1.
    OSV Avi Kivity, DonMarti Cloudius Systems @CloudiusSystems
  • 2.
    Typical Cloud Stack Hardware Hypervisor OS JVM AppServer Your App OS JVM App Server Your App OS JVM App Server Your App
  • 3.
  • 4.
    Too Many Layers,Too Little Value Property/Component Hardware abstraction Isolation Resource virtualization Backward compatibility Security Memory management I/O stack Configuration VMM OS runtime Duplication
  • 6.
    Transformed the enterprise from physical2virtual Virtualization Virtualization1.0 Virtualization 2.0 Compute node virtual server Virtualization 2.0, Massive Scale Scalability
  • 7.
    Virtualization 2.0 vServer OS1.0Architecture ● No Hardware ● No Users ● One app/guest ● Yes Complexity
  • 9.
    The new CloudStack - OSv Hardware Hypervisor Core App Server Your App Single Process Kernel space only Linked to existing JVMs App sees no change Core Your App
  • 10.
    The new CloudStack - OSv Memory Huge pages, Heap vs Sys I/O Zero copy, full aio, batching Scheduling Lock free, low latency Tuning Out of the box, auto CPU Low cost ctx, Direct signals,..
  • 11.
    Alpha Release 3/2014 Optimized ZFScache Dirty page writeback JVM page table HV support: GCE, VMW, VBox Milestones Git init osv, 12/2012 Java hello world, 01/2013 support for 64 vcpu 02/2013 UDP, 03/2013 ZFS support > 1Gbps netperf, 6/2013 Cassandra outperforms Linux, 8/2013 Native REST API Memcached gain > 70% 1/2014 Cli and web interface 9/2013 Tomcat, HAProxy modules 10/2013 Net channels JVM ballooning >47Gbps netperf 2/2014 Image Repository JVM Read barrier elimination DPDK support More ...
  • 12.
  • 13.
    Integrating the JVMinto the kernel Core JVM Application Server Your AppDynamic Heap Memory TCP in the JVM + App context Fast inter thread wakeup
  • 14.
    JVM Garbage Collectionand the kernel Application thread (“mutator”) Garbage collection thread JVM Object JVM Object ref
  • 15.
    JVM Garbage Collection Applicationthread (“mutator”) Garbage collection thread JVM Object JVM Object JVM Object ref change
  • 16.
    JVM Garbage Collection Applicationthread (“mutator”) Garbage collection thread JVM Object JVM Object JVM Object ref scan change
  • 17.
    JVM Garbage Collection Applicationthread (“mutator”) Garbage collection thread JVM Object JVM Object JVM Object ref scan write change card table read
  • 18.
    JVM Garbage Collection Applicationthread (“mutator”) Garbage collection thread JVM Object JVM Object JVM Object ref scan change page table read hardware update
  • 19.
    Van Jacobson ==TCP/IP Common kernel network stack Leads to servo-loop:
  • 20.
    Van Jacobson ==TCP/IP Net Channel design:
  • 21.
    Dynamic heap, sharingis good JVM Memory System memory Lend memory
  • 22.
    OS that doesn’tget in the way 4 VMs per sys admin ratio http://www.computerworld.com.au/article/352635/there_best_practice_server_system_administrator_ratio_/ NO Tuning NO State NO Patching
  • 23.
  • 24.
    REST API definitionuse Swagger REST api for *any* action
  • 25.
  • 26.
    Porting a JVMapplication to OSV 1. Done* * well, unless the application fork()s
  • 27.
    Porting a Capplication to OSV 1. Must be a single-process application 2. May not fork() or exec() 3. Rebuild as a shared object (.so) 4. Other API limitations apply* * Post to the list, and missing items are usually added quickly.
  • 28.
    ● JVM-based application ○No effort ● Traditional C/C++ application ○ More effort ● Virtio-app ○ NFV - highest gain Different customization levels
  • 29.
    ● First OSto be written in C++11 ● Interrupt handler using lambda function: _msi.easy_register({ { 0, [&] { _rxq.vqueue->disable_interrupts(); }, poll_task}, { 1, [&] { _txq.vqueue->disable_interrupts(); }, nullptr } }); OSv == Cool
  • 30.
  • 31.
  • 32.
    ● Docker isawesome! Low overhead! ○ Small artifacts ○ Fast deployment ○ Fast startup (container start, not OS boot) ● Docker is awesome! Easy builds! ○ Simple configuration file ○ Minimal additional work on top of application build Docker is awesome
  • 33.
    Containers under thehood ● Hypervisors are awesome too! ○ Live migration (load balance, hardware maint) ○ Multiple kernel versions available ● Running VMs everywhere is awesome! ○ Public cloud, private cloud, existing hypervisor... ● Security is awesome! ○ Container attack surface is full kernel interface. ○ Hypervisor attack surface is small. ● Sometimes a hypervisor is already there (private and public clouds) Hypervisors are awesome
  • 34.
    Everything is awesome Whatif we could have advantages of Docker without limitations of containers? ● Fast boot, fast provisioning ● High performance ● Tiny footprint ● Virtualized ○ Live migration ○ Elasticity ○ MMU access
  • 35.
    Capstan ● Check outa base image ● Build application ● Create complete run-anywhere VM ○ No Capstan needed in production ● 12-20MB and 3 seconds of overhead ● All driven by one simple config file ● Low overhead of containers with deployment flexibility of a VM image.
  • 36.
  • 37.
  • 38.
  • 39.
  • 40.
    Virtualization 2.0, agility! Rollingupgrade within seconds and a fall back option
  • 41.
    Cloudius Systems, OSComparison Feature/Property Good for: Typical workload kernel vs app API, compatibility # Config files Tuning Upgrade/state OSv Machete: Cloud/Virtualization Single app * VMs Cooperation JVM, POSIX 0 Auto Stateless, just boots JVM support Lines of code License Tailored GC/STW solution Fewer BSD Traditional OS Swiss knife: anything goes Multiple apps/users, utilities, anything distrust Any, but versions/releases.. 1000 Manual, requires certifications Complex, needs snapshots, hope.. Yet another app Gazillion GPL / proprietary
  • 42.
    Be the bestOS powering virtual machines in the cloud Mission statement Hardware Hypervisor OSv Your App Hardware Hypervisor OSv + Jazz JVM Your App Hardware Hypervisor OSv + Jazz JVM Hardware Hypervisor OSv Hardware Hypervisor OSv + Jazz JVM Your App
  • 43.
  • 44.
    ● Going idleis much more expensive on virtual machines ● So are inter-processor interrupts - IPIs ● Combine the two: ○ Before going idle, announce it via shared memory ○ Delay going idle ○ In the meanwhile, poll for wakeup requests from other processors ● Result: wakeups are faster, both for the processor waking, and for the wakee Idle-time polling
  • 45.
  • 46.
    Plan 1. Open SourceLaunch - Done 2. Harden Virtual Appliances - Doing 3. Q4 2014 GA by Cloud vendors - Doing 4. OEM (cloud/appliance), Enterprise