Vitus Masters Defense

 Final defense of my masters thesis at UTEP, 2006

  • Briefly explain how the name came about, and that Eric has already changed it five times. What does "instrumentation" mean?

Vitus Masters Defense Presentation Transcript

  • PlanetenWachHundNetz: Instrumentation Infrastructure for PlanetLab (Vitus Lorenz-Meyer)
  • Peer-to-Peer
    • Distributed on open internet
    • All participants both provide services to and receive services from others
    • Not centrally administered
    • Membership changes over time (churn)
    • Example: file sharing (Napster, Gnutella, …)
      • Any node can publish a named file
      • Any node can obtain a file from another node that has it
      • Range of strategies to find nodes containing desired content
    Vitus Lorenz-Meyer: Thesis defense University of Texas @ El Paso Problem
  • The Problem
    • P2P systems are hard to tune; tuning requires understanding complex behavior
      • Requires instrumentation & analysis
    • Many P2P systems constructed without scalable instrumentation infrastructure
      • Frequently done in ad-hoc manner
        • Data transmitted to single collection & analysis node
        • Inadequate for understanding the behavior of large systems with many nodes (hundreds to millions)
    • My work: development of a flexible tool to enable scalable instrumentation
      • algorithms, data structures
  • Related work (1 of 2: cousins)
    • Distributed Database Management Systems
      • Select data at sources
      • Optimize joins (run near to sources…)
      • Commercially used in non-p2p configurations
      • P2P (research): PIER, Sophia
    • Sensor Networks
      • Unmanaged radio-connected nodes
        • Provide a “network” of surveillance
      • SQL; Compiled into a 3-step process
      • Software communicates through same mechanism
      • IrisNet, TAG
  • Related work (2 of 2: siblings)
    • Aggregation Overlays
      • Information collection subsystem
      • Nodes provide information tuples
        • Internal aggregation language
        • Computed using parallel prefix of pre-defined associative/commutative operators
      • Astrolabe, SDIMS, SOMO
    • Google’s MapReduce
      • Data selection & aggregation in distributed system
        • User provides “map” and “reduce” program
      • Not fully p2p (resource mgmt. overlay)
  • High-level Approach
    • User specifiable programs like MapReduce
    • Split data collection into 3 ‘phases’
      • Generate values on all nodes
      • Pairwise aggregation throughout system
      • Evaluate results
    • Example (average):
      • Generate: emit measured values (val, num=1)
      • Aggregate: (val1+val2, num1+num2)
      • Evaluate (avg): val / num
    • Easy to use: user provides 3 programs (scripts)
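
The three phases above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PWHN code: the names `generate`, `aggregate`, `evaluate`, and `tree_reduce`, and the sample values, are hypothetical. The pairwise merge is applied level by level, the way a binary aggregation tree would combine partial results.

```python
from typing import List, Tuple

Partial = Tuple[float, int]  # (sum of values, number of values)

def generate(measured: float) -> Partial:
    """Phase 1: each node emits its measured value with a count of 1."""
    return (measured, 1)

def aggregate(a: Partial, b: Partial) -> Partial:
    """Phase 2: pairwise merge; associative and commutative."""
    return (a[0] + b[0], a[1] + b[1])

def evaluate(total: Partial) -> float:
    """Phase 3: turn the final partial result into the average."""
    val, num = total
    return val / num

def tree_reduce(partials: List[Partial]) -> Partial:
    """Merge partial results pairwise, one tree level at a time."""
    level = list(partials)
    while len(level) > 1:
        merged = [aggregate(level[i], level[i + 1])
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # an odd node out is carried up unchanged
            merged.append(level[-1])
        level = merged
    return level[0]

# Hypothetical measurements from five nodes.
samples = [2.0, 4.0, 6.0, 8.0, 10.0]
average = evaluate(tree_reduce([generate(s) for s in samples]))
```

Because `aggregate` is associative and commutative, the result does not depend on which nodes are paired at each level, so the tree shape is free to follow the overlay.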
  • Illustration of Binary Aggregation
  • Why is this hard in P2P?
    • Problem: membership churn
      • Nodes continuously enter & leave system
    • Nobody in charge (p2p)
      • Nobody knows membership list!
    • Exposes following challenges
    • Finding all participating nodes
    • Constructing an (approximately) balanced tree
  • Building Structure Upon Anarchy: Key Based Routing
    • [Figure: circular key space from 0 to 2^160, with marked points 2^158, 2^159, and 2^159 + 2^158]
  • “Chord” Routing
  • Chord lookup
  • Building a tree upon KBR
    • [Figure: tree built over KBR nodes labeled a, b, d, e, f, g, h, i]
  • Building a tree: FTT & KBT
    • FTT: finger-based tree
    • Operation associated with a “target” node
    • Systems send data to finger closest to target
    • Ambiguous
        • Depends on all nodes’ finger tables
    • Tree useful only for aggregation
    • KBT: Maps tree on key-space
      • Operation associated w/ target node
      • System/tree-node mapping:
        • Tree node assigned to the system node w/ nearest key
      • Non-ambiguous
      • Tree useful for both dissemination & aggregation
      • Single, global tree
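
One possible reading of the KBT mapping can be sketched as below. The slides do not spell out the construction, so everything here is an assumption made for illustration: the tree is taken to be a binary tree whose nodes sit at the midpoints of recursively halved key intervals, the key space is a toy 4-bit ring, and the names `ring_distance`, `nearest_node`, and `kbt_mapping` are hypothetical. The point it shows is that the mapping depends only on the set of live node keys, not on any node's finger table.

```python
from typing import Dict, List, Optional, Tuple

KEY_BITS = 4                # toy key space of 16 keys (assumption)
KEY_SPACE = 1 << KEY_BITS

def ring_distance(a: int, b: int) -> int:
    """Shortest distance between two keys on the circular key space."""
    d = abs(a - b)
    return min(d, KEY_SPACE - d)

def nearest_node(key: int, nodes: List[int]) -> int:
    """Live node whose key is nearest to `key`; ties go to the lower key."""
    return min(nodes, key=lambda n: (ring_distance(n, key), n))

def kbt_mapping(nodes: List[int], lo: int = 0, hi: int = KEY_SPACE,
                depth: int = 0,
                out: Optional[Dict[Tuple[int, int], int]] = None
                ) -> Dict[Tuple[int, int], int]:
    """Map every tree node (depth, midpoint key) to a live system node."""
    if out is None:
        out = {}
    mid = (lo + hi) // 2
    out[(depth, mid)] = nearest_node(mid, nodes)
    if hi - lo > 2:  # keep halving until intervals hold two keys
        kbt_mapping(nodes, lo, mid, depth + 1, out)
        kbt_mapping(nodes, mid, hi, depth + 1, out)
    return out

# Hypothetical set of live node keys.
live = [1, 6, 11, 14]
mapping = kbt_mapping(live)
root_owner = mapping[(0, KEY_SPACE // 2)]
```

Under these assumptions a single system node usually plays several tree nodes, which is expected when the tree has more positions than there are live nodes.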
  • Our Structure
    • [Figure: tree over key prefixes 000… through 111…]
    • KMR: Subset of KBT, rooted at specific node
    • One tree / root
      • Better load-balancing
    • Tree fully determined by set of active nodes and root
  • Implementation details
    • PWHN-Server layered on FreePastry
    • PWHN-Client connects to PWHN-Server and makes query
    • The contacted server (callee) builds the tree, making itself the root
    • [Figure: eight PWHN servers (S) and one client (C)]
  • Our Goal
    • Develop toolkit for data collection/aggregation in P2P networks
      • Useful for PlanetLab-community
    • Extend MR’s model to P2P
      • K.I.S.S.
        • Users provide programs for gen/agg/eval
    • Use techniques from P2P
      • Construct aggregation tree upon key-based-routing
  • Example (1)
    • First implementation:
      • Script version, flat, to test approach
    • Example 1: Overall average system load
        • Gen emits (1, <1-min load>, <5-min load>, <15-min load>) for each server
        • Agg adds all numbers
        • Eval divides last 3 numbers by first to get average
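
The load example above could look roughly like the following. This is a sketch, not the original scripts: the function names and the load figures are hypothetical, and the three load values are assumed to be the usual 1-, 5-, and 15-minute load averages.

```python
from functools import reduce
from typing import Tuple

Row = Tuple[float, ...]  # (server count, load1, load5, load15)

def gen(load1: float, load5: float, load15: float) -> Row:
    """Gen: one row per server, with a leading count of 1."""
    return (1, load1, load5, load15)

def agg(a: Row, b: Row) -> Row:
    """Agg: add the rows element by element."""
    return tuple(x + y for x, y in zip(a, b))

def evaluate(total: Row) -> Row:
    """Eval: divide the three load sums by the server count."""
    count = total[0]
    return tuple(x / count for x in total[1:])

# Hypothetical load readings from two servers.
servers = [(0.5, 1.0, 1.5), (1.5, 2.0, 2.5)]
total = reduce(agg, (gen(*s) for s in servers))
averages = evaluate(total)
```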
  • Example (2)
    • PWHN client (Java)
      • Can start and stop server
      • Used for specifying all programs and parameters (servers, username for the flat version, method)
      • Front-end for connecting to servers and making query
      • Allows saving and graphically representing result
  • Example (3)
    • Graphing of queried results
      • Bar chart
      • Color bubbles on world map
  • Example (4)
    • Graphing of tree
      • Graphing of paths of the query
  • Evaluation
    • Minimize disruption
      • Minimize incoming bytes to client
    • More efficient
      • Lower average fan-in of aggregation tree
  • Evaluation: Fern
    • Global Update latency histogram
      • [Figure panels: 10 clients, 701 clients]
  • Summary
    • PWHN - Instrumentation toolkit
      • Extends MR’s model to P2P
      • Uses P2P techniques (DHTs)
      • Combines FTT and KBT to be more efficient
      • Conclusion: a useful tool that is more efficient than building the infrastructure into the software itself
      • What did I do?
        • Survey of systems that provide aggregation in dynamic networks
        • Classification and naming of aggregation trees upon DHTs
        • Design and implementation of my own tool (KMR/PWHN)
  • Questions
  • Related: MR
    • Google’s MapReduce was designed for static networks
      • Allows arbitrary programs for aggregation
    • We observe that MR’s approach is practical, but was not designed for P2P
    • Example: Count words in website for index
      • “Map” for each word: emit (<word>,1)
      • “Reduce” for [(<word>,1)…]: add 1’s and emit (<word>,<count>)
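
The word-count example above can be written out as a small single-process Python sketch. It is not Google's implementation: the function names are hypothetical, and the grouping step inside `word_count` stands in for the shuffle that a MapReduce runtime would perform between the two phases.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple

def map_words(text: str) -> Iterator[Tuple[str, int]]:
    """Map: emit (<word>, 1) for each word in the text."""
    for word in text.lower().split():
        yield (word, 1)

def reduce_counts(word: str, ones: List[int]) -> Tuple[str, int]:
    """Reduce: add the 1's and emit (<word>, <count>)."""
    return (word, sum(ones))

def word_count(pages: List[str]) -> Dict[str, int]:
    """Run map on every page, group by word, then reduce each group."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for page in pages:
        for word, one in map_words(page):
            grouped[word].append(one)
    return dict(reduce_counts(w, ones) for w, ones in grouped.items())

# Hypothetical page contents.
counts = word_count(["to be or not to be", "to do"])
```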
  • Example: Coral
    • Load-balancing P2P CDN implemented as an HTTP web proxy
    • Content democratizing for small-scale servers that can’t afford Akamai
      • “Slashdot effect”
    • Design did not include monitoring
      • Monitoring was later retrofitted onto Coral
    • Killer-App. on PlanetLab, but centralized approach did not scale
  • Implementation: KMR
    • Key-based MapReduce
      • Physical root node
      • Bit of parent is negated for each level
      • ‘Left-tree’
  • Impl.: KMR usage
    • “Down”: internal nodes send one message to sibling
    • “Up”: only one message to parent
    • Non-existent nodes
      • Messages end up at closest nodes
        • That node knows to take over the role of the parent
  • Wakeup Comic
  • Problem Detected (last night)
    • FreePastry doesn’t have expected semantics
      • Finds node with numerically closest key
        • Rather than most clockwise node less than key
      • Range-based algorithm inappropriate
    • FreePastry finger tables contain nodes with differing length common prefixes
      • Useful for finding nodes with longer common prefix with requested destination than destination node
      • Permits use of alternate (preferred) algorithm
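
The difference between the two lookup semantics named above can be illustrated with a small sketch. It does not use the FreePastry API: the toy 8-bit ring, the node keys, and the function names are all assumptions, and "most clockwise node less than key" is read here as the live node at or immediately before the key, wrapping around the ring.

```python
from bisect import bisect_right
from typing import List

KEY_SPACE = 1 << 8  # toy ring of 256 keys (assumption)

def numerically_closest(key: int, nodes: List[int]) -> int:
    """Node with the smallest circular distance to the key (either side)."""
    def dist(n: int) -> int:
        d = abs(n - key)
        return min(d, KEY_SPACE - d)
    return min(nodes, key=lambda n: (dist(n), n))

def clockwise_predecessor(key: int, nodes: List[int]) -> int:
    """Last node at or before the key, wrapping to the highest key."""
    ordered = sorted(nodes)
    i = bisect_right(ordered, key)
    return ordered[i - 1]  # i == 0 selects ordered[-1], i.e. wraps around

# Hypothetical live node keys.
nodes = [10, 100, 200]
closest = numerically_closest(90, nodes)
predecessor = clockwise_predecessor(90, nodes)
```

For key 90 the two rules pick different nodes, which is why an algorithm that assumes each node owns the key range following its own key would assign work to the wrong node under closest-key routing.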