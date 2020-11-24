

Query or Not to Query? Using Apache Spark Metrics to Highlight Potentially Problematic Queries

John submits a query and expects it to run smoothly. Based on his prior experience, he expects the query to finish in 20 minutes.
Scenario 1: John's query finishes in the expected time frame and does not affect any other concurrent query in the workload.
Scenario 2: John's query takes twice the expected time and also slows down several other concurrent queries. John now wonders, "Should I have submitted this query?"
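
The two scenarios can be made concrete with a small check. The sketch below is illustrative only and is not taken from the talk or from Unravel's product: the function name `classify_run`, the `slowdown_factor` parameter, and the sample durations are all hypothetical. It assumes a list of past run durations (in minutes) is available and labels a run as problematic when it takes at least twice the historical median.

```python
from statistics import median


def classify_run(history_minutes, observed_minutes, slowdown_factor=2.0):
    """Label a finished run by comparing it with the median of past runs.

    All names and the 2x threshold are assumptions made for illustration.
    """
    if not history_minutes:
        return "unknown"  # no history, so there is no expectation to compare against
    expected = median(history_minutes)
    if observed_minutes >= slowdown_factor * expected:
        return "problematic"
    return "as expected"


# Hypothetical history: John's query usually takes about 20 minutes.
history = [19, 20, 21, 20, 22]
print(classify_run(history, 21))  # Scenario 1
print(classify_run(history, 40))  # Scenario 2
```

A check like this only looks at the query's own duration; the impact on concurrent queries described in Scenario 2 would need workload-level metrics as well.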

Published in: Data & Analytics

  1. Query or Not to Query: Ask Unravel. Prajakta Kalmegh, Principal Engineer; Yusaku Sako, Head of Data Science
  2. Data Science @ Unravel. Prajakta Kalmegh ▪ pkalmegh@unraveldata.com ▪ https://www.linkedin.com/in/pkalmegh/ Yusaku Sako ▪ ysako@unraveldata.com ▪ https://www.linkedin.com/in/yusaku-sako/
  3. Our Pedigree: Experienced Team; Strong Market Validation; A Microsoft M12 Company; Broad Technology and Cloud Platform Coverage
  4. Radically Simplify DataOps
  5. CONTINUOUS INTEGRATION, CONTINUOUS OPTIMIZATION: Select right tech for the app, infrastructure, environment and cluster; Debug code and pipelines, predict issues and assist in optimizing apps; Check for app correctness and resource efficiency; Container sizing, scheduling and cluster selection; Tuning apps and eliminating rogue apps; Proactive and automated actions to maintain SLAs
  6. AGENDA: CONTINUOUS INTEGRATION, CONTINUOUS OPTIMIZATION
  7. Unravel API: Speed Up Development; Operational Insights; Optimize Deployment
  8. Speed Up Development
  9. Meet John: 8:00 am, 11:00 am, 11:30 am, 4:30 pm. Still not working?
  10. Why is this hard? What if I tweak my query? Did my resource requirements change? Is my data skewed on xxx? Is the cluster bottlenecked?
  11. Unravel brings data-driven insights as you code
  12. The notebook demo: 2Q || !2Q
  13. What does Unravel exploit? ▪ Users often issue similar queries ▪ Same challenges faced ▪ Same mistakes repeated ▪ By the same user, by other users. Holistic view of what worked and what did not
  14. Optimize Deployment
  15. When is a good time to schedule? These are my resource requirements; should I schedule now or later? This report needs to be ready before Monday morning; my start time is flexible. I have a new workload to schedule; what is a good time to start it?
  16. Finding the missing piece: Is the cluster slow(er)? Have the query's needs changed over time? Is it both?
  17. Unravel uses predictive analytics to time it right
  18. The timeit demo: while (!best) { # find better }
  19. What does Unravel exploit? ▪ Cluster utilization time series ▪ Query execution variability data ▪ Similar resource profiles and consumption history. Holistic view of when it worked and when it didn't
  20. Operational Actionable Insights
  21. DELAYED. Delayed: Increase in Input Data Size Detected
  22. Delayed: Increase in Input Data Size Detected. DELAYED
  23. Architecture: Unravel Daemons; Unravel API; Query Historical Executions; Cluster State Indicators; Query Quality Predictor; Best Slot Finder; AskUnravel
  24. Use Cases: Already Scheduled; Scheduled Query; User Adhoc Query; New Schedule. Predict query issues; Predict cluster issues; Predict delays. Submit Query ✓ SLA-bound ✓ Responsive. Wants to submit/track: askunravel. Submit Query; Better Slots are xxx; On Track; Delayed by xxx; Hold and submit later. askunravel: Better Slots are xxx. Operational Insights; Optimize Deployment; Speed Up Development
  25. Recap: Detect issues as early as possible. 8:00 am, 8:30 am. Enjoy your day!
  26. Thank you for watching! Sign up for a free trial today: https://bit.ly/3mo2ira
  27. Feedback: Your feedback is important to us. Don't forget to rate and review the sessions.
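
The scheduling question on slides 15 to 19 ("should I schedule now or later?", `while (!best) { # find better }`) can be sketched as a search over a cluster utilization forecast. The slides do not describe how the Best Slot Finder works, so the code below is only one plausible reading, not Unravel's implementation: the function `find_best_slot`, its parameters, and the hourly forecast values are hypothetical. It assumes an hourly utilization forecast (fractions between 0 and 1), a job that needs a fixed number of hours, and a deadline hour, and it picks the start hour whose window has the lowest mean utilization.

```python
def find_best_slot(utilization, duration_hours, deadline_hour):
    """Return (start_hour, mean_utilization) for the least busy window
    that still finishes by the deadline, or None if no window fits.

    Hypothetical sketch: names and the scoring rule are assumptions.
    """
    best = None
    # Latest start that still lets the job finish by the deadline.
    last_start = min(deadline_hour, len(utilization)) - duration_hours
    for start in range(0, last_start + 1):
        window = utilization[start:start + duration_hours]
        load = sum(window) / duration_hours
        # Keep the candidate only if it is strictly better than the best so far.
        if best is None or load < best[1]:
            best = (start, load)
    return best


# Hypothetical forecast of cluster utilization for the next 8 hours.
forecast = [0.9, 0.8, 0.4, 0.3, 0.5, 0.95, 0.9, 0.6]
slot = find_best_slot(forecast, duration_hours=2, deadline_hour=6)
print(slot)
```

With this forecast, the quietest two-hour window that ends by hour 6 starts at hour 2. A real slot finder would presumably also weigh the query's own execution variability and resource profile, which slide 19 lists as inputs.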
