
Low Latency Execution For Apache Spark


1. Low Latency Execution for Apache Spark. Shivaram Venkataraman, Aurojit Panda, Kay Ousterhout
2. Who am I? PhD candidate, AMPLab, UC Berkeley. Dissertation: system design for large-scale machine learning. Apache Spark PMC member; contributions to Spark core, MLlib, SparkR.
3. Low latency: Spark Streaming
4. Low latency: Spark Streaming
5. Low latency: Spark Streaming
6. Low latency: execution. The scheduler must make decisions within a few milliseconds on large clusters.
7. This talk: a low-latency execution engine for iterative workloads (machine learning, stream processing, state).
8. Execution Models
9. Execution models: batch processing (e.g., Microsoft Dryad). Centralized task scheduling; lineage-based, parallel recovery. The driver coordinates shuffles and data sharing. [Diagram: driver, tasks, control messages, and network transfers across the shuffle and broadcast steps of an iteration]
10. Execution models: parallel operators (e.g., Naiad). Long-lived operators; checkpointing-based fault tolerance; low latency. [Diagram: driver, tasks, control messages, and network transfers across two shuffles in an iteration]
11. Scaling behavior. Median per-task time breakdown: network wait, compute, task fetch, scheduler delay. Cluster: 4-core r3.xlarge machines. Workload: sum of 10k numbers per core. [Plot: time in ms vs. 4 to 128 machines]
12. Can we achieve low latency with Apache Spark?
13. Design insight: fine-grained execution with coarse-grained scheduling.
14. Drizzle: batch scheduling, pre-scheduling shuffles, distributed shared variables. [Diagram: shuffle and broadcast steps within an iteration]
15. DAG scheduling. The driver assigns tasks to hosts using (a) locality preferences, (b) straggler mitigation, (c) fair sharing, etc., then serializes and launches them with host metadata. Many iterations share the same DAG structure, so scheduling decisions can be reused.
16. Batch scheduling. Schedule a batch of iterations at once; fault tolerance and scheduling happen at batch boundaries. [Diagram: one stage per iteration, batch size b = 2]
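The benefit of batch scheduling is that the driver's fixed per-decision cost is amortized across the b iterations in a batch. A toy model of that amortization (all numbers are illustrative, not measurements from the talk, and this is not the Drizzle implementation):

```python
# Toy model of batch scheduling: the driver pays a fixed scheduling
# overhead once per scheduling decision, so scheduling b iterations
# at a time divides that overhead across the whole batch.

def time_per_iteration(sched_overhead_ms, compute_ms, b):
    """Average per-iteration time when b iterations share one decision."""
    return compute_ms + sched_overhead_ms / b

# One scheduling decision per iteration (plain centralized scheduling):
per_iter_b1 = time_per_iteration(sched_overhead_ms=100, compute_ms=5, b=1)

# One decision per 100 iterations (batch scheduling):
per_iter_b100 = time_per_iteration(sched_overhead_ms=100, compute_ms=5, b=100)

print(per_iter_b1)    # 105.0 ms: dominated by scheduling overhead
print(per_iter_b100)  # 6.0 ms: overhead amortized away
```

As b grows, per-iteration time approaches pure compute time; the trade-off, discussed later in the deck, is a larger window for fault tolerance.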
17. How much does this help? Workload: sum of 10k numbers per core; single-stage job, 100 iterations, varying Drizzle batch size. [Plot: time/iteration in ms vs. 4 to 128 machines, for Apache Spark and Drizzle with batch sizes 10, 50, 100]
18. Drizzle (recap): batch scheduling, pre-scheduling shuffles, distributed shared variables.
19. Coordinating shuffles in Apache Spark: the driver sends shuffle metadata, and reduce tasks pull the intermediate data. [Diagram: driver, control and data messages, intermediate data]
20. Coordinating shuffles with pre-scheduling: pre-schedule downstream tasks on executors and trigger them once their dependencies are met. [Diagram: driver, pre-scheduled tasks, control and data messages, intermediate data]
21. Push metadata, pull data: coalesce metadata across cores and transfer the data during the downstream stage. [Diagram: metadata messages, data messages, intermediate data]
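The saving from coalescing can be sketched by counting messages: one metadata message per map task versus one per host. A minimal sketch with hypothetical task and host names (not Drizzle's actual data structures):

```python
# Sketch of "push metadata, pull data": instead of one metadata message
# per map task, the sender coalesces metadata from all cores on a host
# into a single message per (map host, reduce host) pair; reduce tasks
# then pull the actual data during the downstream stage.
from collections import defaultdict

def per_task_messages(map_tasks, reduce_hosts):
    # One metadata message from each map task to each reduce host.
    return [(t["host"], r, [t["task_id"]]) for t in map_tasks for r in reduce_hosts]

def coalesced_messages(map_tasks, reduce_hosts):
    # Group map outputs by host, then send one message per host pair.
    by_host = defaultdict(list)
    for t in map_tasks:
        by_host[t["host"]].append(t["task_id"])
    return [(h, r, ids) for h, ids in by_host.items() for r in reduce_hosts]

tasks = [{"host": f"h{i % 2}", "task_id": i} for i in range(8)]  # 8 tasks on 2 hosts
print(len(per_task_messages(tasks, ["r0", "r1"])))   # 16 messages
print(len(coalesced_messages(tasks, ["r0", "r1"])))  # 4 messages
```

The message count drops from (tasks × reduce hosts) to (map hosts × reduce hosts), which matters most on many-core machines.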
22. Micro-benchmark: 2 stages. Workload: sum of 10k numbers per core, 100 iterations. [Plot: time/iteration in ms vs. 4 to 128 machines, Apache Spark vs. Drizzle with pull-based shuffles]
23. Drizzle (recap): batch scheduling, pre-scheduling shuffles, distributed shared variables.
24. Shared variables: fine-grained updates to shared data. Existing approaches to distributed shared variables: MPI (MPI_AllReduce); Bloom and CRDTs; parameter servers and key-value stores; reduce followed by broadcast.
25. Drizzle: replicated shared variables with custom merge strategies for commutative updates (e.g., vector addition). [Diagram: each task updates a local copy of the shared variable; the copies are then merged]
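Because the updates are commutative, replicas can be merged in any order and still converge to the same value. A minimal sketch of such a merge strategy (hypothetical code, not the Drizzle API):

```python
# Commutative merge of a replicated shared variable: each task produces
# a local update (a vector), and the merge is element-wise addition, so
# the order in which replicas are merged cannot change the result.
from functools import reduce

def merge(a, b):
    return [x + y for x, y in zip(a, b)]

task_updates = [[1, 0], [0, 2], [3, 3], [1, 1]]

# Merge left-to-right and right-to-left: same final value either way.
forward = reduce(merge, task_updates)
backward = reduce(merge, reversed(task_updates))
print(forward)   # [5, 6]
print(backward)  # [5, 6]
```

Non-commutative updates would need a different strategy (or ordering), which is why the slide emphasizes custom merge functions.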
26. Enabling async updates: asynchronous semantics within a batch (tasks may read stale versions), with synchronous semantics enforced by a merge at batch boundaries. [Diagram: tasks, shared-variable versions, merge]
27. Evaluation. Micro-benchmarks: single stage, multiple stages, shared variables. End-to-end experiments: streaming benchmarks, logistic regression. Implemented on Apache Spark 1.6.1, with integrations for Spark Streaming and MLlib.
28. Using the Drizzle API. The internal interface in DAGScheduler.scala, def runJob(rdd: RDD[T], func: Iterator[T] => U), gains a batched counterpart: def runBatchJob(rdds: Seq[RDD[T]], funcs: Seq[Iterator[T] => U]).
29. Using Drizzle from Spark Streaming. The StreamingContext.scala constructor def this(sc: SparkContext, batchDuration: Duration) gains a variant def this(sc: SparkContext, batchDuration: Duration, jobsPerBatch: Int).
30. Choosing batch size. b = 1 → batch processing: higher overhead, smaller window for fault tolerance. b = N → parallel operators: lower overhead, larger window for fault tolerance. In practice, choose a batch large enough that overhead falls below a fixed threshold; e.g., for 128 machines a batch of a few seconds is enough.
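One reading of this rule is to find the smallest batch whose amortized scheduling cost falls below the threshold: larger batches would also qualify, but they widen the fault-tolerance window. A sketch under that reading (illustrative numbers, not the tuning logic from the talk):

```python
# Sketch of batch-size selection: a scheduling cost of sched_ms is
# amortized over b iterations, so per-iteration overhead is sched_ms / b.
# We grow b until that overhead drops below a fixed fraction of the
# per-iteration compute time. All numbers are illustrative.

def pick_batch_size(sched_ms, compute_ms, max_overhead_frac):
    b = 1
    while (sched_ms / b) > max_overhead_frac * compute_ms:
        b *= 2
    return b

# Expensive scheduling relative to compute -> large batch needed.
print(pick_batch_size(sched_ms=100, compute_ms=10, max_overhead_frac=0.1))  # 128

# Cheap scheduling -> a batch of one iteration already qualifies.
print(pick_batch_size(sched_ms=1, compute_ms=10, max_overhead_frac=0.1))    # 1
```

The follow-up slide on automatic batch-size tuning suggests the real system aims to make this choice adaptively rather than by hand.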
31. Evaluation (recap). Micro-benchmarks: single stage, multiple stages, shared variables. End-to-end experiments: streaming benchmarks, logistic regression. Implemented on Apache Spark 1.6.1, with integrations for Spark Streaming and MLlib.
32. Streaming benchmark. Yahoo Streaming Benchmark: 1M JSON ad events per second on 64 machines. Event latency (difference between window end and processing end): Drizzle 90 ms vs. Spark Streaming 560 ms. [Plot: CDF of event latency in ms]
33. MLlib: logistic regression. Sparse logistic regression on RCV1 (677K examples, 47K features). [Plot: time/iteration in ms vs. 4 to 128 machines, Apache Spark vs. Drizzle]
34. Work in progress: automatic batch-size tuning; open-source release; an Apache Spark JIRA to discuss a potential contribution.
35. Conclusion. Apache Spark incurs overheads for streaming and ML workloads. Drizzle is a low-latency execution engine that decouples execution from centralized scheduling and achieves millisecond latency for iterative workloads. Workloads, questions, contributions: Shivaram Venkataraman, shivaram@cs.berkeley.edu
