1. PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure
   Larry Peterson, Princeton University
2. Case for PlanetLab
   [Figure: research maturity vs. time, from foundational research through simulation, research prototypes, and small-scale testbeds to a deployed Future Internet; the chasm between small-scale testbeds and deployment is a major barrier to realizing the Future Internet.]
3. PlanetLab
   • 637 machines spanning 302 sites and 35 countries
     – nodes within a LAN-hop of > 2M users
   • Supports distributed virtualization
     – each of 350+ network services runs in its own slice
4. Slices
5. Slices
6. Slices
7. User Opt-in
   [Figure: a client and a server connected through PlanetLab nodes; a NAT sits on the path.]
8. Per-Node View
   Node Mgr | Local Admin | VM1 | VM2 | … | VMn
   Virtual Machine Monitor (VMM)
9. Long-Running Services
   • Content Distribution
     – CoDeeN: Princeton
     – Coral: NYU
     – Cobweb: Cornell
   • Internet Measurement
     – ScriptRoute: Washington, Maryland
   • Anomaly Detection & Fault Diagnosis
     – PIER: Berkeley, Intel
     – PlanetSeer: Princeton
   • DHT
     – Bamboo (OpenDHT): Berkeley, Intel
     – Chord (DHash): MIT
10. Services (cont)
    • Routing
      – i3: Berkeley
      – Virtual ISP: Princeton
    • DNS
      – CoDNS: Princeton
      – CoDoNs: Cornell
    • Storage & Large File Transfer
      – LOCI: Tennessee
      – CoBlitz: Princeton
      – Shark: NYU
    • Multicast
      – End System Multicast: CMU
      – Tmesh: Michigan
11. Usage Stats
    • Slices: 350 - 425
    • AS peers: 6000
    • Users: 1028
    • Bytes-per-day: 2 - 4 TB
    • IP-flows-per-day: 190M
    • Unique IP-addrs-per-day: 1M
12. Architectural Questions
    • What is the PlanetLab architecture?
      – more a question of synthesis than cleverness
    • Why is this the right architecture?
      – non-technical requirements
      – technical decisions that influenced adoption
    • What is a system architecture anyway?
      – how does it accommodate change (evolution)?
13. Requirements
    1) Global platform that supports both short-term experiments and long-running services.
       – services must be isolated from each other
         • performance isolation
         • name space isolation
       – multiple services must run concurrently
    Distributed Virtualization
       – each service runs in its own slice: a set of VMs
14. Requirements
    2) It must be available now, even though no one knows for sure what “it” is.
       – deploy what we have today, and evolve over time
       – make the system as familiar as possible (e.g., Linux)
    Unbundled Management
       – independent mgmt services run in their own slices
       – evolve independently; the best services survive
       – no single service gets to be “root”, but some services require additional privilege
15. Requirements
    3) Must convince sites to host nodes running code written by unknown researchers.
       – protect the Internet from PlanetLab
    Chain of Responsibility
       – explicit notion of responsibility
       – trace network activity to the responsible party
16. Requirements
    4) Sustaining growth depends on support for autonomy and decentralized control.
       – sites have the final say about the nodes they host
       – sites want to provide “private PlanetLabs”
       – regional autonomy is important
    Federation
       – universal agreement on a minimal core (narrow waist)
       – allow independent pieces to evolve independently
       – identify principals and the trust relationships among them
17. Requirements
    5) Must scale to support many users with minimal resources available.
       – expect under-provisioned state to be the norm
       – shortage of logical resources too (e.g., IP addresses)
    Decouple slice creation from resource allocation; overbook with recovery
       – support both guarantees and best effort
       – recover from wedged states under heavy load
18. Tension Among Requirements
    • Distributed Virtualization / Unbundled Management
      – isolation vs one slice managing another
    • Federation / Chain of Responsibility
      – autonomy vs trusted authority
    • Under-provisioning / Distributed Virtualization
      – efficient sharing vs isolation
    • Other tensions
      – supporting users vs evolving the architecture
      – evolution vs clean slate
19. Synergy Among Requirements
    • Unbundled Management
      – third-party management software
    • Federation
      – independent evolution of components
      – support for autonomous control of resources
20. Architecture (1)
    • Node Operating System
      – isolates slices
      – audits behavior
    • PlanetLab Central (PLC)
      – remotely manages nodes
      – bootstrap service to instantiate and control slices
    • Third-party Infrastructure Services
      – monitor slice/node health
      – discover available resources
      – create and configure a slice
      – resource allocation
21. Trust Relationships
    [Figure: N sites (Princeton, Berkeley, Washington, MIT, Brown, CMU, NYU, ETH, Harvard, HP Labs, Intel, NEC Labs, Purdue, UCSD, SICS, Cambridge, Cornell, …) and N slices (princeton_codeen, nyu_d, cornell_beehive, att_mcash, cmu_esm, harvard_ice, hplabs_donutlab, idsl_psepr, irb_phi, paris6_landmarks, mit_dht, mcgill_card, huji_ender, arizona_stork, ucb_bamboo, ucsd_share, umd_scriptroute, …), with PLC as a trusted intermediary replacing NxN pairwise trust.]
22. Trust Relationships (cont)
    [Figure: Node Owner ← 3, 4 → PLC ← 1, 2 → Service Developer (User)]
    1) PLC expresses trust in a user by issuing it credentials to access a slice
    2) Users trust PLC to create slices on their behalf and to inspect credentials
    3) Owner trusts PLC to vet users and map network activity to the responsible user
    4) PLC trusts owner to keep nodes physically secure
23. Trust Relationships (cont)
    [Figure: Node Owner ← 3, 4 → Mgmt Authority ← 5, 6 → Slice Authority ← 1, 2 → Service Developer (User)]
    1) PLC expresses trust in a user by issuing credentials to access a slice
    2) Users trust PLC to create slices on their behalf and to inspect credentials
    3) Owner trusts PLC to vet users and map network activity to the responsible user
    4) PLC trusts owner to keep nodes physically secure
    5) MA trusts SA to reliably map slices to users
    6) SA trusts MA to provide working VMs
24. Architecture (2)
    [Figure: service developers request slices from the Slice Authority, which creates slices and returns new slice IDs; users access their slices on PlanetLab nodes at owners 1…N; the Management Authority pushes software updates to nodes and collects auditing data from them; owners learn about slice users from the SA to resolve abuse.]
25. Architecture (3)
    [Figure: the MA maintains a node database and communicates with each node's NM and owner VM, which run atop the VMM; the SA maintains a slice database and drives the slice creation service (SCS) on each node on behalf of service developers.]
26. Per-Node Mechanisms
    SliverMgr, Proper, PlanetFlow, SliceStat (pl_scs, pl_mom) | Node Mgr | Owner VM | VM1 | VM2 | … | VMn
    Virtual Machine Monitor (VMM):
      Linux kernel (Fedora Core)
      + Vservers (name space isolation)
      + Schedulers (performance isolation)
      + VNET (network virtualization)
27. VMM
    • Linux
      – significant mind-share
    • Vserver
      – scales to hundreds of VMs per node (12MB each)
    • Scheduling
      – CPU
        • fair share per slice (guarantees possible)
      – link bandwidth
        • fair share per slice
        • average rate limit: 1.5Mbps (24-hour bucket size)
        • peak rate limit: set by each site (100Mbps default)
      – disk
        • 5GB quota per slice (limits run-away log files)
      – memory
        • no limit
        • pl_mom resets the biggest user at 90% utilization
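The per-slice average rate limit above (1.5 Mbps with a 24-hour bucket) is a classic token bucket. A minimal sketch of that mechanism, with hypothetical names, not PlanetLab's actual scheduler code:

```python
class TokenBucket:
    """Token bucket: tokens accrue at `rate` bytes/sec up to `capacity` bytes.
    A send of n bytes is allowed only if n tokens are available."""
    def __init__(self, rate_bps, capacity_bytes):
        self.rate = rate_bps / 8          # bytes per second
        self.capacity = capacity_bytes
        self.tokens = capacity_bytes      # start full
        self.last = 0.0

    def allow(self, nbytes, now):
        # Refill for elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False

# 1.5 Mbps average with a 24-hour bucket: a slice may burst up to a day's
# worth of accumulated tokens, but its long-run average stays at 1.5 Mbps.
bucket = TokenBucket(rate_bps=1_500_000, capacity_bytes=(1_500_000 // 8) * 86_400)
```

The large bucket is what lets bursty experiments coexist with the long-run cap: short bursts draw down saved tokens rather than being dropped.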
28. VMM (cont)
    • VNET
      – socket programs “just work”
        • including raw sockets
      – slices should be able to send only…
        • well-formed IP packets
        • to non-blacklisted hosts
      – slices should be able to receive only…
        • packets related to connections that they initiated (e.g., replies)
        • packets destined for bound ports (e.g., server requests)
      – essentially a switching firewall for sockets
        • leverages Linux's built-in connection-tracking modules
      – also supports virtual devices
        • standard PF_PACKET behavior
        • used to connect to a “virtual ISP”
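The two receive-side rules above amount to a per-slice table of initiated connections and bound ports. A toy sketch of that decision logic (the data structures are hypothetical; the real VNET rides on Linux connection tracking in the kernel):

```python
def vnet_accept(packet, slice_state):
    """Decide whether a slice may receive `packet`.
    slice_state holds connections the slice initiated and ports it bound,
    mirroring the two receive rules on the slide."""
    # Rule 1: packets related to connections the slice initiated (e.g., replies).
    # A reply's (src, sport, dst, dport) is the reverse of the initiated tuple.
    reverse = (packet["dst"], packet["dport"], packet["src"], packet["sport"])
    if reverse in slice_state["initiated"]:
        return True
    # Rule 2: packets destined for ports the slice has bound (e.g., server requests).
    if packet["dport"] in slice_state["bound_ports"]:
        return True
    return False

state = {"initiated": {("10.0.0.1", 4000, "1.2.3.4", 80)},  # slice opened this
         "bound_ports": {8080}}
reply = {"src": "1.2.3.4", "sport": 80, "dst": "10.0.0.1", "dport": 4000}
```

Anything matching neither rule is silently dropped, which is what makes VNET a “switching firewall”: each packet is switched to at most the one slice entitled to see it.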
29. Node Manager
    • SliverMgr
      – creates VMs and sets resource allocations
      – interacts with…
        • the bootstrap slice creation service (pl_scs)
        • third-party slice creation & brokerage services (using tickets)
    • Proper: PRivileged OPERations
      – grants unprivileged slices access to privileged info
      – effectively “pokes holes” in the name space isolation
      – examples
        • files: open, get/set flags
        • directories: mount/unmount
        • sockets: create/bind
        • processes: fork/wait/kill
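Proper's “poke holes” model can be pictured as a privileged broker that checks a per-slice whitelist before performing an operation on the slice's behalf. A hypothetical sketch of that pattern, not Proper's real interface:

```python
class ProperBroker:
    """Toy privileged-operation broker: an unprivileged slice asks the
    broker (which runs with privilege) to perform a whitelisted operation,
    so no slice ever needs to be root itself."""
    def __init__(self, acl):
        self.acl = acl  # slice name -> set of allowed operations

    def request(self, slice_name, op, *args):
        if op not in self.acl.get(slice_name, set()):
            raise PermissionError(f"{slice_name} may not {op}")
        return self._perform(op, *args)

    def _perform(self, op, *args):
        # Stand-in for the real privileged action (open, mount, bind, kill, ...).
        return (op, args)

broker = ProperBroker({"princeton_codeen": {"file_open", "socket_bind"}})
```

The design point is least privilege: each hole is enumerated per slice, so unbundled management services get exactly the extra authority they need and nothing more.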
30. Auditing & Monitoring
    • PlanetFlow
      – logs every outbound IP flow on every node
        • accesses ulogd via Proper
        • retrieves packet headers, timestamps, context ids (batched)
      – used to audit traffic
      – aggregated and archived at PLC
    • SliceStat
      – has access to kernel-level / system-wide information
        • accesses /proc via Proper
      – used by global monitoring services
      – used to performance-debug services
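Conceptually, the audit path joins a logged flow's context id to a slice and that slice to its responsible users. A sketch of that join (field and user names are hypothetical, and the real PlanetFlow works on batched packet headers):

```python
def attribute_flows(flow_log, xid_to_slice, slice_to_users):
    """Map each logged outbound flow to a slice (via its context id, `xid`)
    and then to the users responsible for that slice, as PlanetFlow's
    audit trail does conceptually."""
    report = []
    for flow in flow_log:
        slice_name = xid_to_slice.get(flow["xid"], "unknown")
        report.append({"flow": (flow["src"], flow["dst"]),
                       "slice": slice_name,
                       "users": slice_to_users.get(slice_name, [])})
    return report

log = [{"xid": 512, "src": "128.112.0.1", "dst": "1.2.3.4"}]
out = attribute_flows(log, {512: "princeton_codeen"},
                      {"princeton_codeen": ["alice"]})
```

This join is what makes the chain of responsibility actionable: an abuse complaint quoting an IP flow can be resolved to a slice and then to accountable people.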
31. Infrastructure Services
    • Brokerage Services
      – Sirius: Georgia
      – Bellagio: UCSD, Harvard, Intel
      – Tycoon: HP
    • Environment Services
      – Stork: Arizona
      – AppMgr: MIT
    • Monitoring/Discovery Services
      – CoMon: Princeton
      – PsEPR: Intel
      – SWORD: Berkeley
      – IrisLog: Intel
32. Evolution vs Intelligent Design
    • Favor evolution over clean slate
    • Favor design principles over a fixed architecture
    • Specifically…
      – leverage existing software and interfaces
      – keep the VMM and control plane orthogonal
      – exploit virtualization
        • vertical: mgmt services run in slices
        • horizontal: stacks of VMs
      – give no one root (least privilege + level playing field)
      – support federation (decentralized control)
33. Other Lessons
    • Inferior tracks lead to superior locomotives
    • Empower the user: yum
    • Build it and they (research papers) will come
    • Overlays are not networks
    • PlanetLab: We debug your network
    • From universal connectivity to gated communities
    • If you don't talk to your university's general counsel, you aren't doing network research
    • Work fast, before anyone cares
34. Collaborators
    • Andy Bavier, Marc Fiuczynski, Mark Huang, Scott Karlin, Aaron Klingaman, Martin Makowiecki, Reid Moran, Steve Muir, Stephen Soltesz, Mike Wawrzoniak
    • David Culler, Berkeley; Tom Anderson, UW; Timothy Roscoe, Intel; Mic Bowman, Intel; John Hartman, Arizona; David Lowenthal, UGA; Vivek Pai, Princeton; David Parkes, Harvard; Amin Vahdat, UCSD; Rick McGeer, HP Labs
35. Available CPU Capacity
    [Figure: Feb 1-8, 2005 (week before the SIGCOMM deadline); distribution of available CPU across 360 nodes; x-axis: percent of CPU available (10-80), y-axis: percent of nodes (0-120).]
36. Node Boot/Install (Node ↔ Boot Manager ↔ PLC (MA) Boot Server)
    1. Node boots from BootCD (Linux loaded)
    2. Hardware initialized
    3. Network config read from floppy
    4. Node contacts PLC (MA)
    5. PLC sends the boot manager
    6. Node executes the boot manager
    7. Node key read into memory from floppy
    8. Boot manager invokes the Boot API
    9. PLC verifies the node key, sends the current node state
    10. State = “install”: run the installer
    11. Update node state via the Boot API
    12. PLC verifies the node key, changes state to “boot”
    13. Chain-boot the node (no restart)
    14. Node booted
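The key point in steps 8-13 is that the node is stateless about its own fate: PLC's per-node state record tells the boot manager what to do next. A minimal sketch of that dispatch (state names beyond “install” and “boot” are assumptions):

```python
# Per-node state kept at PLC drives what the boot manager does next
# after the node key is verified (steps 8-13 of the sequence above).
TRANSITIONS = {
    # current state -> (action the boot manager runs, resulting state)
    "install": ("run installer, then chain-boot", "boot"),
    "boot":    ("chain-boot installed system", "boot"),
    "debug":   ("drop into debug shell", "debug"),   # assumed extra state
}

def boot_step(node_state):
    """One iteration of the boot-manager loop: act on the state PLC returned."""
    action, next_state = TRANSITIONS[node_state]
    return action, next_state

action, state = boot_step("install")
```

Keeping the authoritative state at PLC is what lets operators remotely reinstall or quarantine a node just by flipping a database field, with no hands on the machine.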
37. Chain of Responsibility
    Join Request                  – PI submits Consortium paperwork and requests to join
    PI Activated                  – PLC verifies the PI, activates the account, enables the site (logged)
    User Activated                – Users create accounts with keys, PI activates the accounts (logged)
    Slice Created                 – PI creates a slice and assigns users to it (logged)
    Nodes Added to Slices         – Users add nodes to their slice (logged)
    Slice Traffic Logged          – Experiments generate traffic (logged by PlanetFlow)
    Traffic Logs Centrally Stored – PLC periodically pulls traffic logs from nodes
    ⇒ Network activity is traceable to a slice, and a slice to its responsible users and PI
38. Slice Creation
    PI → PLC (SA): SliceCreate( ), SliceUsersAdd( )
    User → PLC (SA): SliceAttributeSet( ), SliceGetTicket( )
    PLC (SA) distributes the ticket to the slice creation service (pl_scs) on each node
    NM: SliverCreate(rspec) instantiates the VM on the VMM
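The ticket in this flow is a credential the SA signs and a node's NM later verifies before creating the sliver. A sketch of that issue/redeem handshake, using HMAC as a stand-in for the real credential format (key, field names, and rspec contents are all hypothetical):

```python
import hashlib
import hmac
import json

SA_KEY = b"slice-authority-secret"  # hypothetical shared signing key

def sa_issue_ticket(slice_name, rspec):
    """Slice Authority: sign a (slice, rspec) pair into a ticket
    (the result of SliceGetTicket, conceptually)."""
    body = json.dumps({"slice": slice_name, "rspec": rspec}, sort_keys=True)
    sig = hmac.new(SA_KEY, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}

def nm_redeem(ticket):
    """Node Manager: verify the ticket before SliverCreate; a tampered
    ticket must be rejected, since the NM only trusts the SA's signature."""
    expect = hmac.new(SA_KEY, ticket["body"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expect, ticket["sig"]):
        raise ValueError("bad ticket")
    return json.loads(ticket["body"])

ticket = sa_issue_ticket("princeton_codeen", {"cpu_share": 1, "disk_gb": 5})
sliver = nm_redeem(ticket)
```

Because the ticket carries its own proof of origin, slice creation decouples cleanly from PLC: any third-party service holding a valid ticket can drive sliver creation without itself being trusted.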
39. Brokerage Service
    Broker → PLC (SA): rcap = PoolCreate(rspec), SliceAttributeSet( ), SliceGetTicket( )
    PLC (SA) distributes the ticket to the brokerage service, which presents it to each node's NM
40. Brokerage Service (cont)
    User → Broker: BuyResources( )
    Broker → NM on the relevant nodes: PoolSplit(rcap, slice, rspec)
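The pool/split model above, where a broker holds a capability (rcap) for a pool of resources and carves per-slice pieces out of it, can be shown in miniature. A hypothetical sketch, not the real NM interface:

```python
import uuid

POOLS = {}  # rcap -> remaining resources in the pool a broker holds

def pool_create(rspec):
    """Conceptual PoolCreate: set up a resource pool on a node and
    return an unguessable capability (rcap) for it."""
    rcap = str(uuid.uuid4())
    POOLS[rcap] = dict(rspec)
    return rcap

def pool_split(rcap, slice_name, rspec):
    """Conceptual PoolSplit: carve `rspec` out of the broker's pool
    and bind it to a slice, failing if the pool cannot cover it."""
    pool = POOLS[rcap]
    if any(pool.get(k, 0) < v for k, v in rspec.items()):
        raise ValueError("pool exhausted")
    for k, v in rspec.items():
        pool[k] -= v
    return {"slice": slice_name, **rspec}

rcap = pool_create({"cpu_share": 10})
allocation = pool_split(rcap, "ucb_bamboo", {"cpu_share": 4})
```

This is the decoupling from requirement 5 in action: PLC creates slices, while a market-style broker, holding only an rcap, decides who gets how much of the pool.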