Scheduling in distributed systems - Andrii Vozniuk

My EPFL candidacy exam presentation: http://wiki.epfl.ch/edicpublic/documents/Candidacy%20exam/vozniuk_andrii_candidacy_writeup.pdf

Here I present how schedulers work in three distributed data processing systems and their possible optimizations. I consider Gamma - a parallel database, MapReduce - a data-intensive system and Condor - a compute-intensive system.

This talk is based on the following papers:
1) Batch Scheduling in Parallel Database Systems by Manish Mehta, Valery Soloviev and David J. DeWitt
2) Improving MapReduce performance in heterogeneous environments by Matei Zaharia, Andy Konwinski, Anthony D. Joseph, Randy Katz and Ion Stoica
3) Matchmaking: Distributed Resource Management for High Throughput Computing by Rajesh Raman, Miron Livny and Marvin Solomon

  1. Scheduling In Distributed Systems. Candidacy exam. Andrii Vozniuk, EPFL, July 4, 2012.
  2. Big Data. Data explosion; processing gets more complicated. [figure labels: Generates: 25 TB/day; Generates: 40 TB/day; Stores: 10 PB/year; Stores: 20 PB/year] Resources of many computers should be used.
  3. Typical Data Processing Pipeline. [diagram labels: log data; sensor data; ETL-like batch processing; clean data; analyze data; using resources of many organizations; "Particle found!"; efficient query execution; query data; user model] No one-size-fits-all system currently exists.
  4. Outline. Gamma (Ɣ) - parallel database; MapReduce - data-intensive system; Condor - compute-intensive system; Conclusions; Future Research.
  5. Scheduling In Distributed Systems. Scheduling: policy (setting an ordering of tasks); assigning resources to tasks. How to match resources and tasks? Scheduling is challenging in distributed systems.
  6. Matching Tasks With Resources. Perspectives: data model; execution model.
     System     | Data model    | Execution model
     Gamma      | Relational    | Multioperator
     MapReduce  | Unconstrained | MapReduce
     Condor     | Unconstrained | Unconstrained
     How is scheduling influenced by the data and execution models?
  7. Gamma. Pioneering parallel database. Data model: constrained (relational data model; relations are horizontally partitioned). Execution model: constrained (multioperator queries; operators employ hash-based algorithms).
  8. Gamma: Scheduler. [diagram labels: query SELECT r FROM R WHERE r < 'k'; host machine; query manager (optimizes query, compiles plan); catalog; Gamma database; scheduler process (schedules operators); operator processes on Node 1 (a-m) and Node 2 (n-z); execution on relevant nodes] Scheduling is done at the operator level.
  9. Gamma: Batch Scheduling. Exploit sharing by scheduling in a batch. Example of selection sharing: selections σ1 and σ2 over relation A use a shared scan. Reads of A can be shared by applying the predicates in turn; the shared relation A is scanned only once. Batch scheduling trades latency for throughput.
  10. Gamma: Batch Scheduling Joins. Several hash-joins in a batch of queries; the hash table for the same relation can be shared. The example assumes 100% selectivity of σ. [diagram labels: joins (⋈) over selections (σ) on relations A, B and C; shared hash table for A] Sharing reduces I/O and memory usage. Sharing among joins reduces total execution time.
  11. Limitations Of Gamma. Gamma offers: efficient query execution; sharing in a batch of queries. Gamma operates on structured data. Gamma is not suitable for: unstructured data processing; ETL type of workload; running on large scale. A different system for ETL processing is needed.
  12. MapReduce. System for data-intensive applications. Execution model: constrained (a job is a set of map and reduce tasks; tasks are independent). Data model: unconstrained (arbitrary data format; files are partitioned into chunks; each chunk is replicated several times).
  13. MapReduce: Scheduling. Example: a MapReduce job with 4 map tasks and 2 reduce tasks. [diagram labels: Map 1 to Map 4; two Reduce tasks; Chunk1 to Chunk4; Temp1 to Temp4; Result1 and Result2] Tasks are scheduled close to data. Execution is scalable and fault-tolerant. Execution is elastic. Fine-grained scheduling improves fault tolerance and elasticity.
  14. MapReduce: Speculative Execution. Nodes may become slow. Speculative execution minimizes the job's response time. Launch a backup task if progress is 20% less than average. [diagram labels: normal node; temporarily slow node (straggler); backup] Speculative execution works well in a homogeneous environment.
  15. Emerging Heterogeneous Infrastructures. Replacement of failed components. Extending an existing cluster with new machines. Virtualized data centers of cloud providers: CPU and RAM are isolated; contention for disk and network I/O. [chart: performance per VM (MB/s) versus number of VMs on the physical host, 1 to 7] In many real-life cases the infrastructure is heterogeneous.
  16. MapReduce: Heterogeneous Cluster. [diagram labels: fast node; slow node] Performance degrades on a heterogeneous cluster: slow nodes are wasted; backup tasks run on slow nodes; all straggling tasks are treated equally; thrashing due to excessive speculative execution. Speculative execution should be improved for heterogeneous clusters.
  17. MapReduce: LATE Scheduler. Idea: back up the task with the largest estimated finish time (Longest Approximate Time to End). progress rate = progress score / execution time; estimated time left = (1 - progress score) / progress rate. Thresholds: limit the number of backup tasks; launch backup tasks on fast nodes; back up only sufficiently slow tasks. LATE looks forward to prioritize tasks to speculate.
  18. MapReduce: LATE Example. Back up the task with the Longest Approximate Time to End. [diagram, time axis in minutes, snapshot at 2 min: node 1 runs 1 task/min; node 2 is 3x slower, progress = 66%, estimated time left: (1-0.66) / (1/3) = 1; node 3 is 1.9x slower, progress = 5.3%, estimated time left: (1-0.05) / (1/1.9) = 1.8] LATE correctly identifies the task which hurts the response time the most.
  19. Limitations Of MapReduce. MapReduce offers: high scalability; good fault tolerance; handling of unstructured data. MapReduce is not suitable for: running on multi-organization infrastructure; harvesting idle resources in an organization. A different system for multi-organization infrastructure is needed.
  20. Condor. Compute-intensive system harvesting idle resources. Data model: arbitrary. Execution model: arbitrary. How to increase utilization and respect the owners? Increase resource utilization by scheduling jobs on idle machines.
  21. Condor Scheduler: Centralized? [diagram labels: one scheduler; jobs] Efficient but not reliable, possible bottleneck.
  22. Condor Scheduler: Distributed? [diagram labels: several schedulers; jobs] Reliable but inefficient.
  23. Condor Scheduler: Hybrid! [diagram labels: information about tasks; matchmaker; information about nodes; schedulers; jobs; steps numbered 1 to 4] Hybrid approach has the best of both worlds.
  24. ClassAds: Describing Jobs and Resources.
      Job Description:
        [MyType = "Job"
         TargetType = "Machine"
         Department = "CompSci"
         Requirements = (other.OpSys == "LINUX" && other.Disk > 10000000)
         Rank = Memory]
      Machine Description:
        [MyType = "Machine"
         TargetType = "Job"
         Machine = "nostos.cs.wisc.edu"
         OpSys = "LINUX"
         Disk = 3076077
         Requirements = (LoadAvg <= 0.3) && (KeyboardIdle > (15*60))
         Rank = other.Department == self.Department]
      Requirements should be satisfied. The candidate with the highest rank is returned. The matchmaker is suitable for heterogeneous shared clusters.
  25. Conclusions. Scheduling is done at different levels: Gamma: operator-level scheduling enables sharing; MapReduce and Condor: arbitrary code => sharing is hard; Condor: matchmaking gives control over job placement. Hybrid approaches are promising for big data processing. Scheduling in heterogeneous deployments is challenging.
  26. Thank you for your attention! Feedback & Questions? Andrii.Vozniuk@epfl.ch
  27. References. Matchmaking: Distributed Resource Management for High Throughput Computing, by Rajesh Raman, Miron Livny and Marvin Solomon. Batch Scheduling in Parallel Database Systems, by Manish Mehta, Valery Soloviev and David J. DeWitt. Improving MapReduce Performance in Heterogeneous Environments, by Matei Zaharia, Andy Konwinski, Anthony D. Joseph, Randy Katz and Ion Stoica. Slides 14 and 18 exploit presentation ideas from the LATE slides for OSDI 2008 by Matei Zaharia.
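
The selection sharing described on the Gamma batch scheduling slide (one scan of relation A serving several selections) can be illustrated with a minimal Python sketch. This is not Gamma's implementation: the function name `shared_scan`, the row layout, and the two predicates are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Iterable, List

Row = dict
Predicate = Callable[[Row], bool]


def shared_scan(relation: Iterable[Row],
                selections: Dict[str, Predicate]) -> Dict[str, List[Row]]:
    """Evaluate every selection of a batch during one pass over the relation."""
    results: Dict[str, List[Row]] = {name: [] for name in selections}
    for row in relation:                       # the relation is read only once
        for name, predicate in selections.items():
            if predicate(row):                 # predicates are applied in turn
                results[name].append(row)
    return results


# Hypothetical relation A and two batched selections (sigma1, sigma2).
A = [{"r": "a"}, {"r": "k"}, {"r": "m"}, {"r": "z"}]
batch = {
    "sigma1": lambda row: row["r"] < "k",
    "sigma2": lambda row: row["r"] >= "m",
}
answers = shared_scan(A, batch)
print(answers)
```

Because the function accepts any iterable, it also works on a one-shot stream of rows, which is the point of the technique: no query in the batch needs its own pass over A. The cost is that the first query cannot finish before the scan serving the whole batch does, which is the latency-for-throughput trade the slide mentions.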
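
The LATE heuristic from the two LATE slides can be sketched in Python as follows. The two formulas come from the slides; everything else is an assumption made for illustration: the names `RunningTask` and `pick_task_to_back_up`, the `slow_rate_threshold` parameter, and the elapsed times in the example (2 min and 0.1 min, chosen so that the progress values match the 66% and 5.3% shown on the example slide). The "launch backup tasks on fast nodes" threshold is not modeled here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunningTask:
    name: str
    progress_score: float   # fraction of work done, in [0, 1]
    execution_time: float   # minutes the task has been running


def progress_rate(task: RunningTask) -> float:
    # progress rate = progress score / execution time
    return task.progress_score / task.execution_time


def estimated_time_left(task: RunningTask) -> float:
    # estimated time left = (1 - progress score) / progress rate
    return (1.0 - task.progress_score) / progress_rate(task)


def pick_task_to_back_up(tasks: List[RunningTask],
                         active_backups: int,
                         max_backups: int,
                         slow_rate_threshold: float) -> Optional[RunningTask]:
    """Return the slow task with the longest estimated time left, or None."""
    if active_backups >= max_backups:          # limit the number of backups
        return None
    slow = [t for t in tasks
            if t.execution_time > 0 and t.progress_score > 0
            and progress_rate(t) < slow_rate_threshold]  # only slow tasks
    if not slow:
        return None
    return max(slow, key=estimated_time_left)


# Assumed reading of the example slide: snapshot taken at the 2 minute mark.
on_node_2 = RunningTask("task on node 2", progress_score=0.66, execution_time=2.0)
on_node_3 = RunningTask("task on node 3", progress_score=0.053, execution_time=0.1)

chosen = pick_task_to_back_up([on_node_2, on_node_3],
                              active_backups=0, max_backups=1,
                              slow_rate_threshold=1.0)
print(chosen.name, round(estimated_time_left(chosen), 2))
```

Under these assumptions the task on node 3 is chosen even though it has the higher progress rate, because it has far more work remaining. A rule based only on progress so far would not rank tasks by time to completion.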
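
The matchmaking rule on the ClassAds slide (requirements of both sides must be satisfied, then the candidate with the highest rank is returned) can be sketched with plain Python dictionaries. This is not the Condor ClassAd language or API: expressions are modeled as Python callables taking `(self_ad, other_ad)`, only the job's `Rank` is used to order candidates, and the machine names and attribute values are hypothetical apart from those quoted from the slide.

```python
from typing import Any, Dict, List, Optional

Ad = Dict[str, Any]


def matchmake(job: Ad, machines: List[Ad]) -> Optional[Ad]:
    """Return the machine ad that satisfies both sides and ranks highest."""
    candidates = [
        m for m in machines
        if job["Requirements"](job, m) and m["Requirements"](m, job)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda m: job["Rank"](job, m))


def machine_ad(name: str, disk: int, memory: int,
               load_avg: float, keyboard_idle: int) -> Ad:
    # Hypothetical helper that builds a machine ad shaped like the slide's.
    return {
        "MyType": "Machine", "Machine": name, "OpSys": "LINUX",
        "Disk": disk, "Memory": memory,
        "LoadAvg": load_avg, "KeyboardIdle": keyboard_idle,
        # Owner policy from the slide: lightly loaded and idle for 15 minutes.
        "Requirements": lambda self, other:
            self["LoadAvg"] <= 0.3 and self["KeyboardIdle"] > 15 * 60,
    }


job_ad: Ad = {
    "MyType": "Job", "Department": "CompSci",
    "Requirements": lambda self, other:
        other["OpSys"] == "LINUX" and other["Disk"] > 10_000_000,
    "Rank": lambda self, other: other["Memory"],   # prefer more memory
}

machines = [
    machine_ad("machine-a", 20_000_000, 4096, 0.1, 3600),
    machine_ad("machine-b", 20_000_000, 8192, 0.9, 3600),   # too loaded
    machine_ad("machine-c", 3_076_077, 16384, 0.0, 3600),   # disk too small
    machine_ad("machine-d", 50_000_000, 6144, 0.2, 1000),
]

best = matchmake(job_ad, machines)
print(best["Machine"])
```

Keeping the requirements inside each ad lets machine owners and job submitters state their own policies, while the matchmaker only evaluates them pairwise.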