• Save
REEF: Towards a Big Data Stdlib
Upcoming SlideShare
Loading in...5
×
 

REEF: Towards a Big Data Stdlib

on

  • 294 views

 

Statistics

Views

Total Views
294
Views on SlideShare
294
Embed Views
0

Actions

Likes
0
Downloads
0
Comments
0

0 Embeds 0

No embeds

Accessibility

Categories

Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

REEF: Towards a Big Data Stdlib REEF: Towards a Big Data Stdlib Presentation Transcript

  • Resource Managers
  • Resource Managers
  • Resource Managers True multi-tenancy… Many workloads: Streaming, Batch, Interactive, … Many users: Production Jobs, Ad-Hoc Jobs, Experiments, … …but, only for sophisticated apps
  • Resource Managers True multi-tenancy… Many workloads: Streaming, Batch, Interactive, … Many users: Production Jobs, Ad-Hoc Jobs, Experiments, … …but, only for sophisticated apps Fault tolerance Pre-emption Elasticity
  • Example 1: SQL / MapReduce Fault tolerance Elasticity σ π ⋈ ⋈⋈ ⋈⋈
  • Example 2: Machine learning Fault tolerance Elasticity Iterative computations
  • Example 3: Graph processing Fault tolerance Elasticity Iterative computations Low latency communication
  • Silos Silos are hard to build Each duplicates the same mechanisms under the hood In practice, silos form pipelines In each step: Read from and write to HDFS Synchronize on complete data between steps  Slow σ π ⋈
  • REEF Breadth Mechanism over Policy Avoid silos Recognize the need for different models But: allow them to be composed. Bridge the JVM/CLR/… divide Different parts of the computation can be in either of them. Resource Manager and DFS := Cluster OS REEF := stdlib
  • REEF Control Flow Yarn ( ) handles resource management (security, quotas, priorities) Per-job Drivers ( ) request resources, coordinate computations, and handle events: faults, preemption, etc… REEF Evaluators ( ) hold hardware resources, allowing multiple Tasks ( , , , , , , etc…) to use the same cached state. πσ + 3 σ σ σ
  • Retaining Evaluators Handover of pre-partitioned and parsed data between frameworks Iterative computation Interactive queries $…
  • Wake: Events + I/O Event based programming and remoting API: A static subset of Rx → static checking of event flows → aggressive JVM event inlining Implementation: “SEDA++” → Global thread pool → Thread sharing where possible
  • Tang Configuration is hard Errors often show up at runtime only State of receiving process is unknown to the configuring process Our approach: Configuration as Dependency Injection Configuration here is pure data  Early static and dynamic checks Command = ‘ls’ Error: Unknown parameter “Command” Missing required parameter “cmd” cmd = ‘ls’ ShellTask Evaluator Error: Required instanceof Evaluator Got ShellTask Task YarnEvaluator Evaluator Error: container-4872364523847-02.stderr: NullPointerException at: java…eval():1234 ShellTask.helper():546 ShellTask.onNext():789 YarnEvaluator.onNext():12
  • REEF Data Plane Fault-tolerant communication Group communication / shuffle Low-latency communication Storage & Checkpointing
  • REEF Data Plane Fault-tolerant communication Group communication / shuffle Low-latency communication Storage & Checkpointing
  • Objective Function
  • Start with a random 𝑤0 Until convergence: Step 1: Compute the gradient 𝜕 𝑤 = 𝑥,𝑦 𝜖 𝑋 2 𝑤, 𝑥 − 𝑦 Apply the gradient to the model 𝑤𝑡+1 = 𝑤𝑡 − 𝜕 𝑤 Data parallel in X  Reduce Needed by Partitions  Broadcast
  • On REEF Driver requests Evaluators
  • On REEF Driver requests Evaluators Driver sends Tasks to load & parse data
  • On REEF Driver requests Evaluators Driver sends Tasks to load & parse data Driver sends ComputeGradient and master Tasks
  • On REEF Driver requests Evaluators Driver sends Tasks to load & parse data Driver sends ComputeGradient and master Tasks Computation commences in sequence of Broadcast and Reduce
  • On REEF Driver requests Evaluators Driver sends Tasks to load & parse data Driver sends ComputeGradient and master Tasks Computation commences in sequence of Broadcast and Reduce Failure: Node takes part in computation atomically
  • On REEF Driver requests Evaluators Driver sends Tasks to load & parse data Driver sends ComputeGradient and master Tasks Computation commences in sequence of Broadcast and Reduce Failure: Node takes part in computation atomically Added node takes part in the next call
  • https://github.com/Microsoft-CISL
  • bgchun@gmail.com http://www.reef-project.org/surf/ sears.russell@gmail.com http://www.reef-project.org/grouper/
  • mweimer@microsoft.com