Sparklint
a Tool for Identifying and Tuning Inefficient Spark Jobs Across Your Cluster
Simon Whitear
Principal Engineer
Why Sparklint?
• A successful Spark cluster grows rapidly
• Capacity and capability mismatches arise
• Leads to resource contention
• Tuning process is non-trivial
• The current Spark UI is operational in focus
We wanted to understand application efficiency
Sparklint provides:
• Live view of batch & streaming application stats, or
• Event-by-event analysis of historical event logs
• Stats and graphs for:
– Idle time
– Core usage
– Task locality
Sparklint Listener:
Sparklint Server:
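To give a feel for the wiring, a hedged sketch of attaching the listener to a live application; the listener class name follows the project README of the era, and any artifact version coordinates you add alongside it are assumptions:

    import org.apache.spark.SparkConf

    // Register Sparklint as an extra Spark listener so it can accumulate
    // stats from the live event stream (class name per the project README).
    val conf = new SparkConf()
      .setAppName("my-app")
      .set("spark.extraListeners", "com.groupon.sparklint.SparklintListener")

The server mode instead replays historical event log files event by event.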
Demo…
• Simulated workload analyzing site access logs (sketched below):
– read a text file as JSON
– convert each line to Record(ip, verb, status, time)
– countByIp, countByStatus, countByVerb
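A minimal Scala sketch of that workload, assuming a json4s-style parser and an illustrative input path (the demo's actual source is ReduceByKey.scala in the Sparklint repo):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.json4s.DefaultFormats
    import org.json4s.jackson.JsonMethods.parse

    case class Record(ip: String, verb: String, status: Int, time: Long)

    object AccessLogDemo {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("sparklint-demo"))
        implicit val formats: DefaultFormats.type = DefaultFormats

        // Map stage: read the log as text and parse each JSON line into a Record
        val records = sc.textFile("hdfs:///logs/access.json") // path is illustrative
          .map(line => parse(line).extract[Record])

        // Three independent jobs, each with one map stage (parse) and one
        // reduce stage (count); the demo repeats these 10 times
        val byIp     = records.map(_.ip).countByValue()
        val byStatus = records.map(_.status).countByValue()
        val byVerb   = records.map(_.verb).countByValue()
      }
    }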
Job took 10m7s to finish. Already a pretty good distribution: low idle time indicates good worker usage and minimal driver-node interaction in the job. But overall utilization is low, which is reflected in the common occurrence of the IDLE state (unused cores).
Job took 15m14s to finish. Core usage increased and the job is more efficient; execution time increased, but the app is not CPU bound.
Job took 9m24s to finish. Core utilization decreased proportionally, trading efficiency for execution time. Lots of IDLE state shows we are over-allocating resources.
Job took 11m34s to finish. Core utilization remains low; the config settings are not right for this workload. Dynamic allocation is only effective at app start due to the long executorIdleTimeout setting.
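A sketch of the dynamic-allocation settings this run exercises, per the speaker notes (default 1-minute executorIdleTimeout, one core per executor); the conf keys are standard Spark, the values are the demo's choices:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.shuffle.service.enabled", "true")  // external shuffle service is required for dynamic allocation
      .set("spark.executor.cores", "1")              // the notes mention trying 1 core per executor
      .set("spark.dynamicAllocation.executorIdleTimeout", "60s") // the default; the next run drops this to 10s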
Job took 33m5s to finish. Core utilization is up, but execution time is up dramatically due to reclaiming resources before each short-running task. IDLE state is reduced to a minimum, which looks efficient, but execution is much slower due to dynamic-allocation overhead.
Job took 7m34s to finish. Core utilization is way up, with lower execution time. Parallel execution is clearly visible in overlapping stages; flat tops show we are becoming CPU bound.
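Per the speaker notes, this run submits the three independent count jobs concurrently under FAIR scheduling. A minimal sketch, reusing the records RDD from the earlier sketch and assuming spark.scheduler.mode=FAIR is set in the conf:

    import scala.concurrent.{Await, Future}
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.duration.Duration

    // With spark.scheduler.mode=FAIR, jobs submitted from separate threads
    // share the executors instead of queueing up FIFO.
    val counts = Future.sequence(Seq(
      Future(records.map(_.ip).countByValue()),
      Future(records.map(_.status).countByValue()),
      Future(records.map(_.verb).countByValue())
    ))
    Await.result(counts, Duration.Inf)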
Job took 5m6s to finish. Core utilization decreases, trading efficiency for execution time again here.
Thanks to dynamic allocation, utilization is high despite this being a bi-modal application. Data loading and mapping require a large core count to get throughput; aggregation and IO of the results are optimized for the end file size and therefore require fewer cores.
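A hedged illustration of that second mode (reusing the records RDD from the earlier sketch; the coalesce target and output path are assumptions): the aggregated result is small, so the write is shrunk to hit a target file size and needs far fewer cores than the load/parse phase:

    // The aggregation output is small, so reduce the partition count to get
    // sensibly sized output files on HDFS; this phase needs few cores.
    records.map(r => (r.ip, 1L)).reduceByKey(_ + _)
      .coalesce(8) // illustrative target
      .saveAsTextFile("hdfs:///out/ip-counts")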
Future Features:
• History Server event sources
• Inline recommendations
• Auto-tuning
• Streaming stage parameter delegation
• Replay capable listener
Credits:
• Lead developer is Robert Xue
• https://github.com/roboxue
• SDE @ Groupon
Contribute!
Sparklint is OSS:
https://github.com/groupon/sparklint
Q+A
Editor's Notes
  1. Spark cluster success: the platform rolls out with a maximum supported load. Early projects ramp up and usage is fine; early successes feed back into recommendations to use the platform, and new users start loading it up just as the initial successes are being scaled. Platform limits are hit, scaling requirements only now begin to be understood and planned for, and rough times follow while platform operations learn to lead the application usage. The Spark UI provides masses of info, but by default only for recent jobs/stages/tasks while the job is alive. When serving the Spark UI from the History Server there is still little summary information for debugging job config: have I used the right magic numbers (locality wait, cores, numPartitions, job scheduling mode, etc.)? It is also difficult to compare different executions of the same job, because this level of summary is missing; execution time is almost the only metric to compare.
  2. A mechanism to listen to the Spark event log stream and accumulate lifetime stats without losing (too many) details, using constant memory in live mode thanks to the gauges we use; the mechanism also provides convenient replay when serving from a file. A set of stats and graphs describes job performance uniformly: (1) idle time (periods when all computation happens on the driver node; something to avoid); (2) max core usage and core usage percentage (should be neither too high nor too low; we are thinking about supplementing it with the average number of tasks in wait); (3) task execution time for a given stage by locality, which honestly describes the opportunity cost of a lower locality and indicates the ideal locality-wait config.
  3. Using ReduceByKey.scala in the repo as a sample, we demo a series of attempts at optimizing a Spark application; the logs are included as well, and the highlights of each run are annotated in the screenshots. The application reads a text file, parses each line as JSON into Record(ip: String, verb: String, status: Int, time: Long), then does countByIp, countByStatus, countByVerb, repeated 10 times. These are three independent map-reduce jobs, each with one map stage (parsing) and one reduce stage (the countByXXX). Algorithm-level optimization is out of scope here: the app needs a constant number of CPU seconds and a floating but bounded amount of network I/O time (determined by job locality) to finish.
  4. We use 16 cores as the baseline. The job takes 10 min to finish. The annotations in the picture describe what we are running and how to read the Sparklint graph. After reading the chart, we decided to decrease the core count and see whether execution time doubles, to figure out whether we are CPU bound.
  5. With 8 cores the job took 15 min to finish, shorter than the 20 min expectation, proving that we are not CPU bound. This sawtooth pattern readily indicates a job that is not CPU bound and can serve as a classic example; a CPU-bound application can be seen in the last demo slide. This leads to another angle of optimization: job-scheduling tweaks.
  6. With 32 cores the job took 9 min to finish, proving again that throwing more cores at it doesn't give commensurate performance gains. The graph is a classic example of over-allocating resources. We can assume we need no more than 24 cores to do the work effectively, so we can now look into other ways of tuning the job: dynamic allocation and increased parallelism.
  7. We try to optimize the resource requirement with dynamic allocation, initially using the default executorIdleTimeout of 1 min; this also led us to try 1 core per executor. Since we rarely have any task longer than 1 minute, this showed that dynamic allocation is not the key to optimizing this kind of short-task app.
  8. We reduced executorIdleTimeout to 10s, which decreased the resource footprint and increased utilization. However, this is a false saving for this job: throughput drops because of the low core supply and the overhead of acquiring executors. This proves again that dynamic allocation doesn't solve the optimization challenge when tasks are short. So let's try parallelism inside the job using FAIR scheduling.
  9. With 16 cores and the FAIR scheduler, this simple tweak cut execution time from 10 min to 7.5 min, and our job now becomes CPU bound (see annotation). Running the three count stages in parallel under FAIR scheduling increases efficiency and reduces runtime, allowing us to become CPU bound.
  10. With 32 cores and the FAIR scheduler, execution time becomes 5 min (compared to 9 min in pic 3 with the same resources). We reduce efficiency in order to gain execution time; this is a call for the team to make: if there is a hard SLA to hit, it may be worth running at lower utilization. We can now call the job-scheduling optimization done.
  11. Demos the correct scenario for dynamic allocation, and shows that throwing more CPU at a job helps when it is CPU bound (the flat tops in the usage graph are the clear proof). In this case the partition count is chosen to optimize file size on HDFS, so the team are comfortable with the runtime.
  12. Sparklint can easily distinguish CPU-bound and job-scheduling-bound applications (we are working on automating this judgment using the average number of pending tasks). It is really easy to spot when a job is bounded not by CPU but by job scheduling (which leads to low core usage) or by driver-node operations (which lead to idle time). In theory your app will be 2x faster if you throw 2x cores at it, but this is not always true. The point of Spark-level optimization is to make your job CPU bound, at which point you can decide freely between the $ gained from a faster application and the $ spent on providing more cores. If your job is CPU bound, simply add cores. If your job has a lot of idle time, try to decrease it by removing unwanted/unintended driver-node operations (this can be as simple as a forgotten map over a large local array instead of an RDD; see the sketch below). If your job is job-scheduling bound, you can both reduce waste by using dynamic allocation (which in turn gives you high throughput when needed) and submit independent jobs in parallel using Futures and the FAIR scheduler: http://spark.apache.org/docs/latest/configuration.html#scheduling
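A minimal sketch of that driver-side anti-pattern and its fix (parseLine and path are hypothetical stand-ins):

    // Anti-pattern: collect() pulls the data to the driver, so the map below
    // runs on one machine while the executors sit idle.
    val lines: Array[String] = sc.textFile(path).collect()
    val parsed = lines.map(parseLine)

    // Fix: keep the map on the RDD so the work stays on the executors.
    val parsedRdd = sc.textFile(path).map(parseLine)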