PlanetLab and OneLab Presentation at the GRID 5000 School

Speaker notes

  • MA = management authority; NM = node manager (creates virtual machines (VMs) and controls the resources allocated to them); VMM = virtual machine monitor (execution environment for VMs); SA = slice authority; SCS = slice creation service; VM = virtual machine (currently linux-vserver; could be a xen domain, or other)
  • sliver = resources allocated to one user on one node; slice = collection of slivers, spanning many or all PlanetLab nodes; SliverMgr = sliver manager; Proper = service for accessing privileged operations; PlanetFlow = auditing service for all traffic generated on PlanetLab nodes; SliceStat = slice-level resource consumption information; pl_scs = PlanetLab slice creation service; pl_mom = makes sure that swap space doesn’t get all used up (resets the biggest memory hogs, reboots the machine if necessary)
  • VMM = virtual machine monitor
  • Need to better understand the following point: also supports virtual devices; standard PF_PACKET behavior used to connect to a “virtual ISP”
  • Brokerage services (resource allocation): Sirius allows one to sign up to receive increased CPU priority for one’s slices; Bellagio allows bids using a virtual currency; Tycoon also uses a virtual currency and can also be used for Grid services (not just PlanetLab). Environment services: Stork (software installation); Application Manager (also for installing software). Monitoring/discovery services: CoMon monitors nodes and slices (nodes up/down, heavily/lightly loaded, etc.); PsEPR is a notification service for a planetary-scale system. No longer active, but previously mentioned on this slide: SWORD (Berkeley), IrisLog (Intel).
  • A level playing field then makes it possible to keep the VMM and CP (control plane) orthogonal (self-reinforcing); least privilege has its own value, but the value we consider is its impact on evolution. Have to virtualize features to make the playing field level.
  • YUM is a package installer and updater for Linux
  • PLC = PlanetLab Central MA = management authority

Presentation Transcript

  • PlanetLab and OneLab Presentation at the GRID 5000 School, 9 March 2006. Timur Friedman, Université Pierre et Marie Curie, Laboratoire LIP6-CNRS. PlanetLab slides based on slides provided courtesy of Larry Peterson.
  • PlanetLab
    • An open platform for:
      • testing overlays,
      • deploying experimental services,
      • deploying commercial services,
      • developing the next generation of internet technologies.
    • A set of virtual machines
      • distributed virtualization
      • each of 350+ network services runs in its own slice
  • PlanetLab nodes
    • 637 machines spanning 302 sites and 35 countries
      • nodes within a LAN-hop of > 2M users
  • Slices (three figure slides)
  • User Opt-in (figure: Server, NAT, Client)
  • Per-Node View (figure): the Virtual Machine Monitor (VMM) hosting a Node Mgr, a Local Admin VM, and VM 1 … VM n
  • Architecture (1)
    • Node Operating System
      • isolate slices
      • audit behavior
    • PlanetLab Central (PLC)
      • remotely manage nodes
      • bootstrap service to instantiate and control slices
    • Third-party Infrastructure Services
      • monitor slice/node health
      • discover available resources
      • create and configure a slice
      • resource allocation
  • Architecture (2) (figure): Owner 1 … Owner N, PlanetLab Nodes, a Management Authority, a Slice Authority, and Users / Service Developers; labeled interactions include “Software updates”, “Auditing data”, “Create slices”, “Request a slice”, “New slice ID”, “Access slice”, “Identify slice users (resolve abuse)”, and “Learn about nodes”
  • Architecture (3) (figure): a Node running the NM + VMM, an Owner VM, the SCS, and slice VMs; the MA with its node database; the SA with its slice database; the Node Owner and the Service Developer
  • Long-Running Services
    • Content Distribution
      • CoDeeN: Princeton
      • Coral: NYU
      • Cobweb: Cornell
    • Internet Measurement
      • ScriptRoute: Washington, Maryland
    • Anomaly Detection & Fault Diagnosis
      • PIER: Berkeley, Intel
      • PlanetSeer: Princeton
    • DHT
      • Bamboo (OpenDHT): Berkeley, Intel
      • Chord (DHash): MIT
  • Services (cont)
    • Routing
      • i3: Berkeley
      • Virtual ISP: Princeton
    • DNS
      • CoDNS: Princeton
      • CoDoNs: Cornell
    • Storage & Large File Transfer
      • LOCI: Tennessee
      • CoBlitz: Princeton
      • Shark: NYU
    • Multicast
      • End System Multicast: CMU
      • Tmesh: Michigan
  • Usage Stats
    • Slices: 350 - 425
    • AS peers: 6000
    • Users: 1028
    • Bytes-per-day: 2 - 4 TB
      • Coral CDN represents about half of this
    • IP-flows-per-day: 190M
    • Unique IP-addrs-per-day: 1M
  • OneLab
    • A potential project, currently under negotiation with the European Commission
      • Project leader: UPMC/LIP6-CNRS
      • Technical direction: INRIA Sophia-Antipolis
      • Other partners:
        • Intel Research Cambridge, Universidad Carlos III de Madrid, Université Catholique de Louvain, Università di Napoli, France Telecom (Lannion), Università di Pisa, Alcatel Italia, Telekomunikacja Polska
    • Goals:
      • Extend PlanetLab into new environments, beyond the traditional wired internet.
      • Deepen PlanetLab’s monitoring capabilities.
      • Provide a European administration for PlanetLab nodes in Europe.
  • Goal: New Environments
    • Problem: PlanetLab nodes are connected to the traditional wired internet.
      • They are mostly connected to high-performance networks such as Abilene, DANTE, NRENs.
      • These are not representative of the internet as a whole.
      • PlanetLab does not provide access to emerging environments.
    • OneLab will place nodes in new environments:
      • Wireless: WiMAX, UMTS, and wireless ad hoc networks.
      • Wired: multihomed nodes.
      • Emulated: for new and experimental technologies.
  • Goal: Deepen Monitoring
    • Problem: PlanetLab provides limited facilities to make applications aware of the underlying network
    • OneLab’s monitoring components
      • Passive monitoring: Track packets at the routers
      • Topology monitoring: Provide a view of the route structure
  • PlanetLab Before OneLab
  • PlanetLab After OneLab
  • New Environments
  • Monitoring Capabilities
  • Goal: European Administration
    • Problem: Changes to PlanetLab must come through the administration at Princeton.
      • PlanetLab in the US is necessarily less responsive to European research priorities.
    • OneLab will create a PlanetLab Europe.
      • It will federate with PlanetLab in the US, Japan, and elsewhere.
      • The federated structure will allow:
        • PlanetLab Europe to set policy in accordance with European research priorities,
        • PlanetLab Europe to customize the platform, so long as a common interface is preserved.
  • PlanetLab and GRID 5000
    • Some goals in common
    • Some differences in architecture
    • Possibilities for cooperation between OneLab and GRID 5000
  • Common goals
    • Test at the scale of the internet, with internet conditions
    • Test new architectures and services
      • Even radical departures from the current internet
        • PlanetLab a precursor to the GENI initiative
  • Internet
    • GRID 5000 is at the scale of the internet
      • Reserved fibre to interconnect clusters
      • Cross-traffic can be injected, if wished
      • Ability to control and replay experiments
    • PlanetLab works over the internet
      • Connections between nodes pass via the public internet
      • Cross-traffic comes from the internet itself
        • Test services in a real setting
        • A challenged environment is interesting
      • PlanetLab provides services to internet users
  • Clusters
    • GRID 5000 consists of clusters of many machines
      • To participate in GRID 5000, one signs up with one of the cluster administrators
      • Access is to university and state-sponsored research labs
    • There are typically two PlanetLab nodes to a site
      • To participate in PlanetLab, one provides two nodes
      • University and state-sponsored research labs pay no fees
      • For-profit organizations pay fees of $25K/yr.+
      • Available for use by industry
  • Virtualization
    • GRID 5000 designed for the installation of any OS on top of the hardware of any node
      • One OS per machine
    • PlanetLab designed for the installation of multiple virtual machines on top of a VMM (virtual machine monitor) on any node
      • VMM currently linux-vserver
        • Could be xen-domain, or other
      • Virtual machines currently only Linux
        • Eventually could be any suitably adapted OS
    • GRID 5000’s OS could be a VMM, if desired
  • Reservations
    • Users reserve GRID 5000 nodes
      • No two users have the same node at the same time
      • Users have access to completely unloaded machines
    • Users share PlanetLab nodes
      • The load affects the performance
      • Problems arise close to major conference deadlines
      • Services allow one to select a subset of nodes based on their load characteristics (see the sketch after this slide)
      • Services allow one to make a reservation for higher priority on certain machines
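As a rough illustration of node selection by load, the sketch below filters a list of per-node statistics before deployment. The get_node_stats() helper, the record fields, and the thresholds are hypothetical stand-ins for what a monitoring service such as CoMon provides, not its actual interface.

```python
# Hypothetical per-node statistics; a monitoring service such as CoMon
# exposes similar information (liveness, load, free memory) for every node.
def get_node_stats():
    return [
        {"hostname": "planetlab1.example.edu", "alive": True,  "load5": 3.2, "free_mem_mb": 410},
        {"hostname": "planetlab2.example.edu", "alive": True,  "load5": 9.7, "free_mem_mb": 55},
        {"hostname": "planetlab3.example.edu", "alive": False, "load5": 0.0, "free_mem_mb": 0},
    ]

def pick_nodes(max_load=5.0, min_free_mem_mb=128):
    """Keep only live, lightly loaded nodes with enough free memory."""
    return [n["hostname"] for n in get_node_stats()
            if n["alive"] and n["load5"] <= max_load and n["free_mem_mb"] >= min_free_mem_mb]

print(pick_nodes())  # -> ['planetlab1.example.edu']
```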
  • Cooperation
    • Test PlanetLab architectures on GRID 5000
      • OneLab topology monitoring component will be tested on GRID 5000
        • Joint work: Pierre Sens and Timur Friedman
    • Test GRID 5000 architectures on PlanetLab?
    • The European Commission invites such forms of cooperation
  • Fin
  • More About PlanetLab
  • PlanetLab Architecture
    • What is the PlanetLab architecture?
      • more a question of synthesis than cleverness
    • Why is this the right architecture?
      • non-technical requirements
      • technical decisions that influenced adoption
    • What is a system architecture anyway?
      • how does it accommodate change (evolution)
  • Requirements
    • Global platform that supports both short-term experiments and long-running services.
      • services must be isolated from each other
        • performance isolation
        • name space isolation
      • multiple services must run concurrently
    • Distributed Virtualization
      • each service runs in its own slice : a set of VMs
  • Requirements
    • It must be available now, even though no one knows for sure what “it” is.
      • deploy what we have today, and evolve over time
      • make the system as familiar as possible (e.g., Linux)
    • Unbundled Management
      • independent mgmt services run in their own slice
      • evolve independently; best services survive
      • no single service gets to be “root”, but some services require additional privilege
  • Requirements
    • Must convince sites to host nodes running code written by unknown researchers.
      • protect the Internet from PlanetLab
    • Chain of Responsibility
      • explicit notion of responsibility
      • trace network activity to responsible party
  • Requirements
    • Sustaining growth depends on support for autonomy and decentralized control.
      • sites have the final say about the nodes they host
      • sites want to provide “private PlanetLabs”
      • regional autonomy is important
    • Federation
      • universal agreement on minimal core (narrow waist)
      • allow independent pieces to evolve independently
      • identify principals and trust relationships among them
  • Requirements
    • Must scale to support many users with minimal resources available.
      • expect under-provisioned state to be the norm
      • shortage of logical resources too (e.g., IP addresses)
    • Decouple slice creation from resource allocation
    • Overbook with recovery
      • support both guarantees and best effort
      • recover from wedged states under heavy load
  • Tension Among Requirements
    • Distributed Virtualization / Unbundled Management
      • isolation vs one slice managing another
    • Federation / Chain of Responsibility
      • autonomy vs trusted authority
    • Under-provisioned / Distributed Virtualization
      • efficient sharing vs isolation
    • Other tensions
      • support users vs evolve the architecture
      • evolution vs clean slate
  • Synergy Among Requirements
    • Unbundled Management
      • third party management software
    • Federation
      • independent evolution of components
      • support for autonomous control of resources
  • Architecture (1)
    • Node Operating System
      • isolate slices
      • audit behavior
    • PlanetLab Central (PLC)
      • remotely manage nodes
      • bootstrap service to instantiate and control slices
    • Third-party Infrastructure Services
      • monitor slice/node health
      • discover available resources
      • create and configure a slice
      • resource allocation
  • Trust Relationships (figure)
    • Sites: Princeton, Berkeley, Washington, MIT, Brown, CMU, NYU, ETH, Harvard, HP Labs, Intel, NEC Labs, Purdue, UCSD, SICS, Cambridge, Cornell
    • Slices: princeton_codeen, nyu_d, cornell_beehive, att_mcash, cmu_esm, harvard_ice, hplabs_donutlab, idsl_psepr, irb_phi, paris6_landmarks, mit_dht, mcgill_card, huji_ender, arizona_stork, ucb_bamboo, ucsd_share, umd_scriptroute
    • N x N trust relationships between sites and slices, mediated by a trusted intermediary (PLC)
  • Trust Relationships (cont) (figure): Node Owner, PLC, and Service Developer (User). (1) PLC expresses trust in a user by issuing it credentials to access a slice; (2) users trust PLC to create slices on their behalf and to inspect credentials; (3) the owner trusts PLC to vet users and map network activity to the right user; (4) PLC trusts the owner to keep nodes physically secure.
  • Trust Relationships (cont) (figure): Node Owner, Mgmt Authority, Slice Authority, and Service Developer (User). (1) PLC expresses trust in a user by issuing credentials to access a slice; (2) users trust PLC to create slices on their behalf and to inspect credentials; (3) the owner trusts PLC to vet users and map network activity to the right user; (4) PLC trusts the owner to keep nodes physically secure; (5) the MA trusts the SA to reliably map slices to users; (6) the SA trusts the MA to provide working VMs.
  • Architecture (2) (figure): Owner 1 … Owner N, PlanetLab Nodes, a Management Authority, a Slice Authority, and Users / Service Developers; labeled interactions include “Software updates”, “Auditing data”, “Create slices”, “Request a slice”, “New slice ID”, “Access slice”, “Identify slice users (resolve abuse)”, and “Learn about nodes”
  • Architecture (3) (figure): a Node running the NM + VMM, an Owner VM, the SCS, and slice VMs; the MA with its node database; the SA with its slice database; the Node Owner and the Service Developer
  • Per-Node Mechanisms (figure): Virtual Machine Monitor (VMM) = Linux kernel (Fedora Core) + Vservers (namespace isolation) + schedulers (performance isolation) + VNET (network virtualization), hosting the Node Mgr, an Owner VM, and VM 1 … VM n; per-node services include SliverMgr, Proper, PlanetFlow, SliceStat, pl_scs, and pl_mom
  • VMM
    • Linux
      • significant mind-share
    • Vserver
      • scales to hundreds of VMs per node (12 MB each)
    • Scheduling (resource limits sketched after this slide)
      • CPU
        • fair share per slice (guarantees possible)
      • link bandwidth
        • fair share per slice
        • average rate limit: 1.5Mbps (24-hour bucket size)
        • peak rate limit: set by each site (100Mbps default)
      • disk
        • 5 GB quota per slice (limit run-away log files)
      • memory
        • no limit
        • pl_mom resets biggest user at 90% utilization
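A minimal sketch of the per-slice limits listed above, expressed as a Python structure. The field names are illustrative rather than the node manager's actual resource specification; the last two lines work out the daily byte budget implied by a 1.5 Mbps average rate over a 24-hour bucket.

```python
# Illustrative per-slice resource limits; the field names are hypothetical,
# not the real resource specification used by the node manager.
SLICE_LIMITS = {
    "cpu":           "fair share per slice (guarantees possible)",
    "avg_rate_bps":  1_500_000,    # 1.5 Mbps average rate, 24-hour bucket
    "peak_rate_bps": 100_000_000,  # 100 Mbps default peak, settable per site
    "disk_quota_gb": 5,            # limits run-away log files
    "memory_limit":  None,         # no hard limit; pl_mom resets the biggest
                                   # user at 90% utilization
}

# Daily byte budget implied by the average rate limit:
bytes_per_day = SLICE_LIMITS["avg_rate_bps"] / 8 * 86_400
print(f"{bytes_per_day / 1e9:.1f} GB per day")  # ~16.2 GB
```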
  • VMM (cont)
    • VNET
      • socket programs “just work”
        • including raw sockets
      • slices should be able to send only…
        • well-formed IP packets
        • to non-blacklisted hosts
      • slices should be able to receive only…
        • packets related to connections that they initiated (e.g., replies)
        • packets destined for bound ports (e.g., server requests)
      • essentially a switching firewall for sockets (conceptual sketch after this slide)
        • leverages Linux’s built-in connection tracking modules
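The sketch below restates the VNET send/receive rules above as plain Python checks. It is a conceptual model only; the real mechanism lives in the kernel on top of Linux's connection-tracking modules, and the types here are invented for illustration.

```python
from dataclasses import dataclass, field

# Conceptual model of VNET's per-slice filtering rules.  The real mechanism
# is implemented in the kernel on top of Linux's connection-tracking modules;
# these types exist only to make the rules concrete.

@dataclass
class Packet:
    well_formed_ip: bool
    dst_addr: str
    dst_port: int
    flow_id: tuple = ()

@dataclass
class SliceState:
    initiated_flows: set = field(default_factory=set)  # flows the slice opened
    bound_ports: set = field(default_factory=set)       # ports the slice serves on

def may_send(pkt: Packet, blacklist: set) -> bool:
    """Slices may send only well-formed IP packets to non-blacklisted hosts."""
    return pkt.well_formed_ip and pkt.dst_addr not in blacklist

def may_receive(pkt: Packet, s: SliceState) -> bool:
    """Slices may receive only replies to flows they initiated, or packets
    destined for ports they have bound."""
    return pkt.flow_id in s.initiated_flows or pkt.dst_port in s.bound_ports
```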
  • Node Manager
    • SliverMgr
      • creates VM and sets resource allocations
      • interacts with…
        • bootstrap slice creation service (pl_scs)
        • third-party slice creation & brokerage services (using tickets)
    • Proper: PRivileged OPERations
      • grants unprivileged slices access to privileged info
      • effectively “pokes holes” in the namespace isolation (usage sketch after this slide)
      • examples
        • files: open, get/set flags
        • directories: mount/unmount
        • sockets: create/bind
        • processes: fork/wait/kill
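Purely as an illustration of the idea, the sketch below shows how an unprivileged slice might ask a node-local privileged-operation service to act on its behalf. The socket path, wire format, and operation names are assumptions, not Proper's real protocol.

```python
import json
import socket

# Hypothetical client for a node-local privileged-operation service in the
# spirit of Proper; the socket path and message format are assumptions.
PROPER_SOCKET = "/var/run/proper.sock"

def request_privileged_op(op, **args):
    """Send one operation request to the daemon and return its reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(PROPER_SOCKET)
        s.sendall(json.dumps({"op": op, "args": args}).encode())
        return json.loads(s.recv(65536).decode())

# Example (names are illustrative): read a file outside the slice's own
# namespace, if node policy allows it.
# reply = request_privileged_op("open_file", path="/var/log/messages", mode="r")
```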
  • Auditing & Monitoring
    • PlanetFlow
      • logs every outbound IP flow on every node
        • accesses ulogd via Proper
        • retrieves packet headers, timestamps, context ids (batched)
      • used to audit traffic (flow-aggregation sketch after this slide)
      • aggregated and archived at PLC
    • SliceStat
      • has access to kernel-level / system-wide information
        • accesses /proc via Proper
      • used by global monitoring services
      • used for performance debugging of services
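To make the auditing idea concrete, the sketch below totals outbound bytes per slice context from flow records, roughly the kind of question PlanetFlow's logs answer when abuse must be traced to a slice. The record layout is assumed for illustration, not the actual ulogd/PlanetFlow schema.

```python
from collections import defaultdict

# Hypothetical flow records: (context_id, src, dst, dst_port, bytes).
# PlanetFlow keeps packet headers, timestamps, and context ids per flow;
# the exact layout here is assumed for illustration.
FLOWS = [
    ("slice_a", "10.0.0.1", "192.0.2.7",    80,  120_000),
    ("slice_b", "10.0.0.1", "198.51.100.3", 53,      900),
    ("slice_a", "10.0.0.1", "203.0.113.9", 443,  480_000),
]

def bytes_by_slice(flows):
    """Total outbound bytes per slice context, e.g. to answer an abuse report."""
    totals = defaultdict(int)
    for ctx, _src, _dst, _port, nbytes in flows:
        totals[ctx] += nbytes
    return dict(totals)

print(bytes_by_slice(FLOWS))  # {'slice_a': 600000, 'slice_b': 900}
```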
  • Infrastructure Services
    • Brokerage Services
      • Sirius: Georgia
      • Bellagio: UCSD, Harvard, Intel
      • Tycoon: HP
    • Environment Services
      • Stork: Arizona
      • Application Manager: Berkeley (UC + Intel)
    • Monitoring/Discovery Services
      • CoMon: Princeton
      • PsEPR: Intel
  • Evolution vs Intelligent Design
    • Favor evolution over clean slate
    • Favor design principles over a fixed architecture
    • Specifically…
      • leverage existing software and interfaces
      • keep VMM and control plane orthogonal
      • exploit virtualization
        • vertical: management services run in slices
        • horizontal: stacks of VMs
      • give no one root (least privilege + level playing field)
      • support federation (decentralized control)
  • Other Lessons
    • Inferior tracks lead to superior locomotives
    • Empower the user: yum
    • Build it and they (research papers) will come
    • Overlays are not networks
    • PlanetLab: We debug your network
    • From universal connectivity to gated communities
    • If you don’t talk to your university’s general counsel, you aren’t doing network research
    • Work fast, before anyone cares
  • Fin
  • Available CPU Capacity, Feb 1-8, 2005 (the week before the SIGCOMM deadline) (figure)
  • Node Boot/Install (figure: Node, Boot Server, PLC (MA), Boot Manager)
    1. Node boots from BootCD (Linux loaded)
    2. Hardware initialized
    3. Network config read from floppy
    4. Node contacts PLC (MA)
    5. Boot manager sent to the node
    6. Node executes the boot manager
    7. Node key read into memory from floppy
    8. Boot manager invokes the Boot API
    9. Node key verified; current node state sent to the node
    10. If state = “install”, run the installer
    11. Node state updated via the Boot API
    12. Node key verified; state changed to “boot”
    13. Chain-boot the node (no restart)
    14. Node booted
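A compressed sketch of the boot-manager decision in steps 8-13 above. The Boot API method names, node states, and helper functions are assumptions for illustration, not the real interface.

```python
# Illustrative boot-manager logic for steps 8-13 above; the Boot API method
# names, node states, and helpers are assumptions, not the real interface.

def run_installer():
    """Placeholder for re-installing the node software (step 10)."""

def chain_boot():
    """Placeholder for chain-booting into the installed system, without a
    restart (step 13)."""

def boot_manager(boot_api, node_key):
    # Steps 8-9: invoke the Boot API; PLC verifies the node key and returns
    # the node's current state.
    state = boot_api.verify_key_and_get_state(node_key)
    if state == "install":
        run_installer()                                   # step 10
        boot_api.update_node_state(node_key, "boot")      # steps 11-12
    chain_boot()                                          # step 13
```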
  • Chain of Responsibility (figure): network activity → slice → responsible users & PI
    • Join Request: the PI submits Consortium paperwork and requests to join
    • PI Activated: PLC verifies the PI, activates the account, and enables the site (logged)
    • User Activated: users create accounts with keys; the PI activates the accounts (logged)
    • Slice Created: the PI creates a slice and assigns users to it (logged)
    • Nodes Added to Slices: users add nodes to their slice (logged)
    • Slice Traffic Logged: experiments generate traffic (logged by PlanetFlow)
    • Traffic Logs Centrally Stored: PLC periodically pulls traffic logs from the nodes
  • Slice Creation (figure): PLC (SA), node VMM + NM, and VMs
    • The PI calls SliceCreate( ) and SliceUsersAdd( ) at PLC (SA)
    • The user calls SliceAttributeSet( ) and SliceGetTicket( )
    • The ticket is distributed to the slice creation service (pl_scs)
    • pl_scs redeems the ticket at each node’s NM via SliverCreate(rspec)
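Read as a sequence, the calls above might look like the sketch below. Only the operation names (SliceCreate, SliceUsersAdd, SliceAttributeSet, SliceGetTicket, SliverCreate) come from the slide; the client objects, arguments, and return values are assumptions.

```python
# Schematic sequence of the slice-creation calls named above.  The plc and
# nm_for_node objects, the arguments, and the return shapes are assumptions
# for illustration; only the operation names come from the slide.

def create_slice(plc, nm_for_node, pi, users, nodes, rspec):
    slice_id = plc.SliceCreate(pi.auth, "example_slice")            # PI creates the slice
    plc.SliceUsersAdd(pi.auth, slice_id, [u.email for u in users])  # PI assigns users
    user = users[0]
    plc.SliceAttributeSet(user.auth, slice_id, "nodes", nodes)      # user configures the slice
    ticket = plc.SliceGetTicket(user.auth, slice_id)                # signed ticket from the SA
    for node in nodes:                                              # pl_scs redeems the ticket
        nm_for_node(node).SliverCreate(ticket, rspec)               # per-node sliver created
    return slice_id
```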
  • Brokerage Service (figure): PLC (SA), node VMM + NM, VMs, and a broker
    • SliceAttributeSet( ) and SliceGetTicket( ) at PLC (SA)
    • The ticket is distributed to the brokerage service
    • The broker obtains a resource capability from the node’s NM: rcap = PoolCreate(rspec)
  • Brokerage Service (cont) (figure)
    • The user calls BuyResources( ) at the broker
    • The broker contacts the relevant nodes
    • The broker assigns resources to the user’s slice at each node’s NM: PoolSplit(rcap, slice, rspec)
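Similarly, a sketch of the brokerage flow across the two slides above: the broker first acquires resource capabilities (rcaps) from node managers, then splits part of each pool to a user's slice after a purchase. Only PoolCreate, PoolSplit, and BuyResources come from the slides; everything else is assumed.

```python
# Schematic brokerage flow based on the calls named in the two slides above;
# the objects and argument lists are assumptions for illustration.

def broker_acquire(nm_for_node, nodes, rspec):
    """The broker obtains a resource capability (rcap) on each node it manages."""
    return {node: nm_for_node(node).PoolCreate(rspec) for node in nodes}

def broker_sell(nm_for_node, rcaps, buyer_slice, rspec):
    """After a user's BuyResources() request, the broker contacts the relevant
    nodes and splits part of each resource pool out to the buyer's slice."""
    for node, rcap in rcaps.items():
        nm_for_node(node).PoolSplit(rcap, buyer_slice, rspec)
```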