REEF: Towards a Big Data Stdlib

  • 215 views
Uploaded on

 

More in: Technology , Education
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
    Be the first to like this
No Downloads

Views

Total Views
215
On Slideshare
0
From Embeds
0
Number of Embeds
0

Actions

Shares
Downloads
0
Comments
0
Likes
0

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. Resource Managers
  • 2. Resource Managers
  • 3. Resource Managers True multi-tenancy… Many workloads: Streaming, Batch, Interactive, … Many users: Production Jobs, Ad-Hoc Jobs, Experiments, … …but, only for sophisticated apps
  • 4. Resource Managers True multi-tenancy… Many workloads: Streaming, Batch, Interactive, … Many users: Production Jobs, Ad-Hoc Jobs, Experiments, … …but, only for sophisticated apps Fault tolerance Pre-emption Elasticity
  • 5. Example 1: SQL / MapReduce Fault tolerance Elasticity σ π ⋈ ⋈⋈ ⋈⋈
  • 6. Example 2: Machine learning Fault tolerance Elasticity Iterative computations
  • 7. Example 3: Graph processing Fault tolerance Elasticity Iterative computations Low latency communication
  • 8. Silos Silos are hard to build Each duplicates the same mechanisms under the hood In practice, silos form pipelines In each step: Read from and write to HDFS Synchronize on complete data between steps  Slow σ π ⋈
  • 9. REEF Breadth Mechanism over Policy Avoid silos Recognize the need for different models But: allow them to be composed. Bridge the JVM/CLR/… divide Different parts of the computation can be in either of them. Resource Manager and DFS := Cluster OS REEF := stdlib
  • 10. REEF Control Flow Yarn ( ) handles resource management (security, quotas, priorities) Per-job Drivers ( ) request resources, coordinate computations, and handle events: faults, preemption, etc… REEF Evaluators ( ) hold hardware resources, allowing multiple Tasks ( , , , , , , etc…) to use the same cached state. πσ + 3 σ σ σ
  • 11. Retaining Evaluators Handover of pre-partitioned and parsed data between frameworks Iterative computation Interactive queries $…
  • 12. Wake: Events + I/O Event based programming and remoting API: A static subset of Rx → static checking of event flows → aggressive JVM event inlining Implementation: “SEDA++” → Global thread pool → Thread sharing where possible
  • 13. Tang Configuration is hard Errors often show up at runtime only State of receiving process is unknown to the configuring process Our approach: Configuration as Dependency Injection Configuration here is pure data  Early static and dynamic checks Command = ‘ls’ Error: Unknown parameter “Command” Missing required parameter “cmd” cmd = ‘ls’ ShellTask Evaluator Error: Required instanceof Evaluator Got ShellTask Task YarnEvaluator Evaluator Error: container-4872364523847-02.stderr: NullPointerException at: java…eval():1234 ShellTask.helper():546 ShellTask.onNext():789 YarnEvaluator.onNext():12
  • 14. REEF Data Plane Fault-tolerant communication Group communication / shuffle Low-latency communication Storage & Checkpointing
  • 15. REEF Data Plane Fault-tolerant communication Group communication / shuffle Low-latency communication Storage & Checkpointing
  • 16. Objective Function
  • 17. Start with a random 𝑤0 Until convergence: Step 1: Compute the gradient 𝜕 𝑤 = 𝑥,𝑦 𝜖 𝑋 2 𝑤, 𝑥 − 𝑦 Apply the gradient to the model 𝑤𝑡+1 = 𝑤𝑡 − 𝜕 𝑤 Data parallel in X  Reduce Needed by Partitions  Broadcast
  • 18. On REEF Driver requests Evaluators
  • 19. On REEF Driver requests Evaluators Driver sends Tasks to load & parse data
  • 20. On REEF Driver requests Evaluators Driver sends Tasks to load & parse data Driver sends ComputeGradient and master Tasks
  • 21. On REEF Driver requests Evaluators Driver sends Tasks to load & parse data Driver sends ComputeGradient and master Tasks Computation commences in sequence of Broadcast and Reduce
  • 22. On REEF Driver requests Evaluators Driver sends Tasks to load & parse data Driver sends ComputeGradient and master Tasks Computation commences in sequence of Broadcast and Reduce Failure: Node takes part in computation atomically
  • 23. On REEF Driver requests Evaluators Driver sends Tasks to load & parse data Driver sends ComputeGradient and master Tasks Computation commences in sequence of Broadcast and Reduce Failure: Node takes part in computation atomically Added node takes part in the next call
  • 24. https://github.com/Microsoft-CISL
  • 25. bgchun@gmail.com http://www.reef-project.org/surf/ sears.russell@gmail.com http://www.reef-project.org/grouper/
  • 26. mweimer@microsoft.com