PlanetLab Operating System Support* (*a work in progress)
What is it? A distributed set of machines that must be shared efficiently, where “efficient” can mean many different things.
Goals
A PlanetLab account, together with its associated resources, should span multiple nodes (a SLICE).
Distributed virtualization
Unbundled management: infrastructure services (running a platform as opposed to running an application) run over a SLICE, allowing a variety of services to provide the same functionality.
4 main areas
VM Abstraction - Linux vserver
Resource Allocation + Isolation - SCOUT
Full virtualization like VMware - performance overhead, and each machine image consumes a lot of memory
Paravirtualization like Xen - more efficient, a promising solution (but still has memory constraints)
Virtualization at the system call level like Linux vservers, UML - supports a large number of slices with reasonable isolation
“Node Virtualization”
OS for each VM?
Linux vservers - Linux inside Linux
Each vserver is a directory in a chroot jail.
Each virtual server,
has its own packages,
has its own services,
has a weaker form of root that provides a local superuser,
has its own users, i.e. its own UID/GID namespace,
is confined to using only certain IP addresses, and
is confined to some area(s) of the file system.
Communication among ‘vservers’
Not local sockets or IPC
but via IP
Simplifies resource management and isolation
Interaction is independent of their locations
Reduced resource usage
Copy-on-write memory segments across unrelated vservers
Unification (Disk space)
Share files across contexts
Hard-linked files that are immutable but can still be unlinked (removed)
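The disk-space unification idea above can be sketched in user space. This is a minimal illustration only, assuming two directory trees on the same file system; the function name `unify` and the temporary-file suffix are hypothetical, and the sketch does not set the immutable attribute that the real vserver mechanism relies on to keep shared files safe.

```python
import filecmp
import os


def unify(reference_root, vserver_root):
    """Replace files under vserver_root that are byte-identical to the file
    at the same relative path under reference_root with hard links to it.
    Returns the number of files unified. (Hypothetical helper.)"""
    unified = 0
    for dirpath, _dirnames, filenames in os.walk(reference_root):
        rel = os.path.relpath(dirpath, reference_root)
        for name in filenames:
            ref_file = os.path.join(dirpath, name)
            other = os.path.join(vserver_root, rel, name)
            # Only regular files are candidates; skip symlinks on both sides.
            if os.path.islink(ref_file) or os.path.islink(other):
                continue
            if not os.path.isfile(other):
                continue
            if os.path.samefile(ref_file, other):
                continue  # already share one inode
            if filecmp.cmp(ref_file, other, shallow=False):
                tmp = other + ".unify-tmp"
                os.link(ref_file, tmp)   # new name for the shared inode
                os.replace(tmp, other)   # atomically swap it into place
                unified += 1
    return unified
```

After unification both paths refer to the same inode, so the content is stored once. Without an immutability guarantee, a write through either path would be visible in both trees, which is why the slides pair hard links with immutable files.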
Required modifications for vserver
Notion of context
Isolate group of processes,
Each vserver is a separate context,
Add context id to all inodes,
Context specific capabilities were added,
Context limits can be specified,
Easy accounting for each context.
Create a mirror of the reference root file system
Create two identical login accounts
Switching from default shell (modified shell)
Switch to the Slice's vserver security context
Chroot to vserver’s root file system
Relinquish subset of true super user privileges
Redirect into other account in that vserver
“Isolation & Resource Allocation”
KeyKOS - strict resource accounting
Processor Capacity Reserves
Scout - scheduling along data paths (SILK)
Central infrastructure services (PlanetLab Central)
Central database of principals, slices, resource allocations, and policies
Creation and deletion of slices through an exported interface
Obtains resource information from the central server
Binds resources to a local VM that belongs to a slice
Rcap <- acquire(Rspecs)
Bind(slice_id, Rcap)
** Every resource access goes through the node manager as a system call and is validated using the Rcap
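The acquire/bind pair above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class `NodeManager`, the `RSpec` fields, the random-token form of the Rcap, and the rule that an Rcap is consumed when bound are all hypothetical choices, not the PlanetLab node manager's actual interface.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class RSpec:
    """Hypothetical resource specification; field names are assumptions."""
    cpu_share: int
    disk_mb: int


class NodeManager:
    """Toy node manager: hands out capabilities and binds them to slices."""

    def __init__(self):
        self._rcaps = {}     # rcap token -> RSpec not yet bound
        self._bindings = {}  # slice_id -> list of bound RSpecs

    def acquire(self, rspec):
        # The rcap is an unguessable token standing for the right to rspec.
        rcap = secrets.token_hex(16)
        self._rcaps[rcap] = rspec
        return rcap

    def bind(self, slice_id, rcap):
        # Assumption: an rcap is single-use and is consumed when bound.
        rspec = self._rcaps.pop(rcap, None)
        if rspec is None:
            raise PermissionError("unknown or already bound rcap")
        self._bindings.setdefault(slice_id, []).append(rspec)

    def resources(self, slice_id):
        return list(self._bindings.get(slice_id, []))
```

A caller would first obtain `rcap = nm.acquire(RSpec(cpu_share=10, disk_mb=100))` and then call `nm.bind("slice_a", rcap)`; presenting a token the manager never issued fails validation.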
Non renewable resources
Disk space, memory pages, file descriptors
Appropriate system calls are wrapped to check against per-slice resource limits and increment usage.
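The wrap-check-increment pattern for non-renewable resources might look like the following in user space. It is a sketch, not kernel code: `SliceAccount`, `limited`, and the resource name used in the usage note are hypothetical, and the choice to refund the charge when the wrapped call fails is an assumption.

```python
import functools


class SliceLimitExceeded(Exception):
    pass


class SliceAccount:
    """Per-slice limits and current usage for non-renewable resources."""

    def __init__(self, limits):
        self.limits = dict(limits)
        self.usage = {name: 0 for name in limits}

    def charge(self, resource, amount=1):
        # Check against the limit first, then increment usage.
        if self.usage[resource] + amount > self.limits[resource]:
            raise SliceLimitExceeded(resource)
        self.usage[resource] += amount

    def release(self, resource, amount=1):
        self.usage[resource] = max(0, self.usage[resource] - amount)


def limited(account, resource, amount=1):
    """Wrap a call so it is charged to the slice before it runs."""
    def wrap(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            account.charge(resource, amount)
            try:
                return fn(*args, **kwargs)
            except Exception:
                # Assumption: a failed call does not consume the resource.
                account.release(resource, amount)
                raise
        return wrapper
    return wrap
```

For example, wrapping an open-like function with `limited(account, "file_descriptors")` makes the call fail with `SliceLimitExceeded` once the slice reaches its descriptor limit.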
Fairness and guarantees
Hierarchical token bucket queuing discipline
Cap per-vserver total outgoing bandwidth
SILK for CPU scheduling
Proportional share scheduling using resource containers
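The per-vserver outgoing bandwidth cap above rests on the token bucket idea. The sketch below shows a single token bucket only; the hierarchical token bucket queuing discipline additionally arranges such buckets in a tree and lets classes borrow unused bandwidth, which is not modeled here. The class name and parameters are hypothetical, and time is passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Single token bucket: tokens are bytes, refilled at a fixed rate."""

    def __init__(self, rate_bytes_per_s, burst_bytes, now=0.0):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.last = now

    def allow(self, nbytes, now):
        """Return True and consume tokens if nbytes may be sent at time now."""
        elapsed = max(0.0, now - self.last)
        # Refill for the elapsed time, never beyond the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False
```

A vserver with `TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)` can send one 1500-byte packet immediately, then must wait for tokens to accumulate before sending again.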
Filters on network send and receive - like Exokernel and Nemesis.
Sharing and partitioning a single network address space - by using a safe version of raw sockets.
Alternative approach (similar to Xen) - assign a different IP address to each VM, with each VM using the entire port space and managing its own routing table. The problem is that not enough IPv4 addresses are available: on the order of 1000 would be needed per node.
Safe raw sockets
The Scout module manages all TCP and UDP ports and ICMP IDs to ensure that there are no collisions between safe raw sockets and TCP/UDP/ICMP sockets.
For each IP address, all ports are either free or "owned" by a slice.
Two slices may split ownership of a port by binding it to different IP addresses.
Only two IP addresses per node as of now: the external IP and the loopback address
A SLICE can reserve a port like any other resource (exclusive)
A SLICE can open 3 sockets on a port:
Error socket, consumer socket, sniffer socket
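The port ownership rules above (each port on each IP address is either free or owned by one slice, and two slices can split a port across different addresses) can be modeled with a small table. This is a sketch of the bookkeeping only: `PortTable` and its methods are hypothetical names, the addresses in the usage note are documentation examples, and the real checks are performed inside the Scout module rather than in user code.

```python
class PortTable:
    """Tracks which slice owns each (IP address, port) pair on a node."""

    def __init__(self, addresses):
        self._addresses = set(addresses)
        self._owner = {}  # (ip, port) -> slice_id

    def reserve(self, slice_id, ip, port):
        """Reserve a port exclusively; return False if another slice owns it."""
        if ip not in self._addresses:
            raise ValueError("address not configured on this node")
        if not 0 < port <= 65535:
            raise ValueError("port out of range")
        current = self._owner.get((ip, port))
        if current is not None and current != slice_id:
            return False
        self._owner[(ip, port)] = slice_id
        return True

    def release(self, slice_id, ip, port):
        # Only the owning slice can free the port.
        if self._owner.get((ip, port)) == slice_id:
            del self._owner[(ip, port)]

    def owner(self, ip, port):
        return self._owner.get((ip, port))
```

With a node configured as `PortTable(["192.0.2.10", "127.0.0.1"])`, one slice can own port 8080 on the external address while another owns port 8080 on the loopback address.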
The HTTP sensor server collects data from the sensor interface on each node.