• Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content
Poster for NSDI

Poster for NSDI



My Poster for the poster session at NSDI 2006

My Poster for the poster session at NSDI 2006



Total Views
Views on SlideShare
Embed Views



0 Embeds 0

No embeds



Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
Post Comment
Edit your comment

    Poster for NSDI Poster for NSDI Presentation Transcript

    • PlanetenWachHundNetz: Locality-Aware MapReduce for Peer-to-Peer Robust Systems Group Vitus Lorenz-Meyer, Eric Freudenthal University of Texas at El Paso Other Approaches From distributed databases Key-based Tree Routing based on node keys • Children of node n with key k are nodes with keys closest to k with n-1 prefix bits in common with k • Node n is its own child! Aggregation tree Stochastically balanced Motivation Prefix tree • Efficient reduction of data from self- organized (P2P) sources Problem • Dynamic membership • Static tree (for parallel prefix) Non-existent node 001 (A) inappropriate •Not locality aware! •Only 1 global tree. Problems w/ Known Techniques Finger-table based tree • Do not localize network traffic 000 takes the role • Tree edges selected from nodes • Branching factor not controlled of its parent 001 finger tables (towards query root). • State lost when parent nodes fail/leave Paths to 111 Non-optimized • Details on right side of figure fan-out! Message arrives Our Goals at closest node 000 (B) • Exploit locality, minimize remote communication Resulting Aggre- gation tree • Efficient parallelism: Control of Message ends up at self: branching factor No more nodes -> discard • Minimal state-loss when nodes enter/leave • Self-healing continuous queries Implementation •Unique tree for every query •Locality awareness from DHT • Left-tree •but does not force locality • Node role determined by key •Challenge: what to do if nodes • Locality/clustering: Coral or Oasis enter/leave Goal: Efficient Aggregation Tree Our Solution: Key-based MapReduce (KMR) Overview Finding nodes Locality-aware Clustering MapReduce (Google) • Associate aggregation tree with a operation-specific root key. n = key of some node Characteristics t = root-id of aggregation tree Generalized Parallel Prefix Coral (locality-aware DsHT) π = n’s position among leaves (from left) • Rooted at host whose key is closest to root key • Associative & communitive • π=n⊕t Coral partitions its members in • One tree per query, distributes load over all nodes for multiple operations mapped to an key of n’s parent at height h: π / 2h ⊕ t clusters based on connection queries at a time aggregation tree latency • Nodes that are roots of sub-trees are their own left-children all the To find parent, n searches for smallest height such way down to their appropriate place among leaves. that parent is in the tree. • Google‘s map-reduce framework Aggregation tree at each • Each node is leaf and root of the sub-tree at depth d if it has the d’th  Data is mapped to low-dimensional Collecting data bit of root-ID flipped; node may also substitute for missing parents. cluster tuples • Set of nodes and MR-specific root-key fully defines a unique tree (complete • Each node reduces messages from children, sends result(s) to parent •  Tuples are recursively combined Build cluster aggregation trees knowledge not required: lookup parent, sibling, children) • using an associative reduction A single member of a cluster-tree Challenges Approach algorithm that emits a (summary) serves as aggregation tuple representative • Find children DHT lookup • Find parent DHT lookup Example: Challenges • Establishing new tree at exit/departure DHT lookup (key-space determines topology) • Counting occurrence of a particular • Choosing sub-cluster word in a document: Building the Tree representatives  Provide map algorithm that • Constructing aggregation tree for emits the tuple ‘(1)’ for every Send build-message to all siblings down the tree sub-cluster Continuous Queries occurrence Each node forwards build-message to all its siblings all the way down • Avoiding inefficient setup broadcast  Provide a reduce algorithm • Recall that node is its own child: siblings include its children! Maintaining tree consistency under churn that sums these tuples Non-existing parent • Each node periodically checks roles of self and immediate relatives • Closest node fulfills its role • Churn (membership change) must result in tree-node role change • Therefore, search for parent will discover this node • Change propagated by DHT finger tables Non-existing child • If child not in DHT, then node is a leaf Challenges Approach Finding children • Use DHT • What roles to check when periodically check (lookup) self, children, parent only when its assuming role parent → child, old → new • Who notifies who