Intuitive & Scalable Hyperparameter Tuning with Apache Spark + Fugue

Hyperparameter tuning is critical in model development, and its general form, parameter tuning against an objective function, is widely used in industry. Meanwhile, Apache Spark can handle massive parallelism, and Apache Spark ML is a solid machine learning solution.

But we have not yet seen a general and intuitive distributed parameter tuning solution built on Apache Spark. Why?

Not every tuning problem involves Apache Spark ML models. How can Apache Spark handle general models?
Not every tuning problem is a parallelizable grid or random search. Bayesian optimization is sequential; how can Apache Spark help in this case?
Not every tuning problem is single-epoch; deep learning is not. How do algorithms such as Hyperband and ASHA fit into Apache Spark?
Not every tuning problem is a machine learning problem; for example, simulation plus tuning is also common. How do we generalize?

In this talk, we show how using Fugue-Tune and Apache Spark together can eliminate these pain points.

Fugue-Tune, like Fugue, is a “super framework”: an abstraction layer unifying existing solutions such as Hyperopt and Optuna.
It first models general tuning problems, independently of machine learning.
It is designed for both small- and large-scale problems, and it can always fully parallelize the distributable parts of a tuning problem.
It works for both classical and deep learning models. With Fugue, running Hyperband and ASHA becomes possible on Apache Spark.
In the demo, you will see how to do any type of tuning in a consistent, intuitive, scalable, and minimal way, and you will see a live demo of the performance.

1. Intuitive & Scalable HPO With Spark+Fugue
Han Wang
2. Agenda
● Introduction
● Non-Iterative HPO
● Demo
● Iterative HPO Demo
3. pip install tune
https://github.com/fugue-project/tune
pip install fugue
https://github.com/fugue-project/fugue
4. Introduction
5. Questions
● Is parameter tuning a machine learning problem?
● Are there common ways to tune both classical models and deep learning models?
● Why is it so hard to do distributed parameter tuning?
6. Tuning Problems in General
(Diagram: general parameter tuning contains hyperparameter tuning for machine learning; non-iterative problems cover some classical models, while iterative problems cover deep learning models and some classical models.)
7. Distributed Parameter Tuning
● Not everything can be parallelized
● The tuning logic is always complex and tedious
● Popular tuning frameworks are not friendly to distributed environments
● Spark is not suitable for iterative tuning problems
8. Distributed Parameter Tuning
(Diagram: Tune, SQL, Validation.)
9. Our Goals
● For non-iterative problems:
  ○ Unify grid and random search; make others pluggable
● For iterative problems:
  ○ Generalize SOTA algorithms such as Hyperband and ASHA
● For both:
  ○ Tune both locally and in a distributed environment without code changes
  ○ Make tuning development iterable and testable
  ○ Minimize moving parts
  ○ Minimize interfaces
10. Non-Iterative Problems
11. Grid Search
Search Space:
a: Grid(0, 1)
b: Grid("a", "b")
c: 3.14
Candidates:
a:0, b:"a", c:3.14
a:0, b:"b", c:3.14
a:1, b:"a", c:3.14
a:1, b:"b", c:3.14
Pros: determinism, even coverage, interpretable
Cons: complexity can increase exponentially
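
For reference, this is a minimal, framework-free sketch of how a grid space like the one above expands into candidates; it uses plain Python and itertools and is illustrative only, not the tune API:

    from itertools import product

    # Grid dimensions cross-multiply; fixed values stay constant in every candidate.
    space = {"a": [0, 1], "b": ["a", "b"], "c": [3.14]}

    keys = list(space)
    candidates = [dict(zip(keys, values)) for values in product(*space.values())]
    for cand in candidates:
        print(cand)
    # {'a': 0, 'b': 'a', 'c': 3.14}
    # {'a': 0, 'b': 'b', 'c': 3.14}
    # {'a': 1, 'b': 'a', 'c': 3.14}
    # {'a': 1, 'b': 'b', 'c': 3.14}
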
12. Random Search
Search Space:
a: Rand(0, 1)
b: Choice("a", "b")
c: 3.14
Candidates:
a:0.12, b:"a", c:3.14
a:0.66, b:"a", c:3.14
a:0.32, b:"b", c:3.14
a:0.94, b:"a", c:3.14
Pros: complexity and distribution are controlled; good for continuous variables
Cons: non-deterministic, relies on luck, and a large number of samples is normally needed
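
The corresponding sketch for random search, again in plain Python rather than the tune API: every candidate is an independent draw from the declared distributions.

    import random

    def sample(seed=None):
        # One candidate: a ~ Rand(0, 1), b ~ Choice("a", "b"), c fixed.
        rng = random.Random(seed)
        return {"a": rng.uniform(0, 1), "b": rng.choice(["a", "b"]), "c": 3.14}

    candidates = [sample(seed=i) for i in range(4)]
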
13. Bayesian Optimization
Objective: a^2
Search Space:
a: Rand(-1, 1)
Candidates:
-0.66 -> 0.76 -> -0.18 -> 0.75 -> 0.90 -> 0.07 -> 0.00 -> 0.41 -> 0.12 -> 0.66
Pros: needs less compute to find near-optimal parameters
Cons: the sequential operations may require more wall-clock time
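
Since the abstract names Hyperopt as one of the solutions tune can wrap, here is a standalone Hyperopt version of this toy problem (minimize a^2 with a drawn from Rand(-1, 1)); the numbers on the slide come from the live demo, not from this sketch:

    from hyperopt import fmin, hp, tpe

    # TPE sequentially proposes values of `a`, guided by earlier observations.
    best = fmin(
        fn=lambda a: a ** 2,            # objective to minimize
        space=hp.uniform("a", -1, 1),   # search space: Rand(-1, 1)
        algo=tpe.suggest,
        max_evals=20,
    )
    print(best)  # something close to {'a': 0.0}
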
14. Hybrid Search Space / Distributed Hybrid Search
(Diagram: Model 1 and Model 2, each combining Grid, Random, and Bayesian search.)
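
The hybrid idea can be pictured as a union of per-model sub-spaces, each mixing grid dimensions (fully expandable and parallelizable) with random or Bayesian dimensions (handled by a sampler or optimizer). The sketch below uses plain dictionaries and made-up parameter names purely for illustration; it is not tune's Space API:

    # Hypothetical illustration: the grid-level choices (which model, which
    # grid values) can be expanded and distributed, while the "rand"
    # dimensions are resolved per partition by a local random/Bayesian search.
    model1_space = {
        "model": "model1",
        "n_estimators": ("grid", [100, 200]),
        "learning_rate": ("rand", (0.01, 0.3)),
    }
    model2_space = {
        "model": "model2",
        "C": ("grid", [0.1, 1.0, 10.0]),
        "gamma": ("rand", (1e-4, 1e-1)),
    }
    hybrid_space = [model1_space, model2_space]  # union of the two sub-spaces
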
15. Live Demo: Space Concept & Scikit-Learn Tuning
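
To give a flavor of what the non-iterative demo covers, here is a hedged sketch of scoring expanded candidates with scikit-learn cross-validation; the actual demo uses tune's API, which hides this loop and can distribute it on Spark:

    from itertools import product

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)

    # A small grid space expanded to candidates, as on the Grid Search slide.
    space = {"n_estimators": [50, 100], "max_depth": [3, 5]}
    keys = list(space)
    candidates = [dict(zip(keys, v)) for v in product(*space.values())]

    # Score every candidate; this is the loop tune/Fugue would distribute.
    results = [
        (cross_val_score(RandomForestClassifier(**p), X, y, cv=3).mean(), p)
        for p in candidates
    ]
    best_score, best_params = max(results, key=lambda r: r[0])
    print(best_score, best_params)
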
16. Iterative Problems
17. Challenges
● Realtime asynchronous communication
● The overhead of checkpointing iterations can be significant
● A single iterative problem can’t be parallelized
● A lot of boilerplate code
18. Successive Halving (SHA)
(Diagram: candidate trials advance through Rung 1, Rung 2, Rung 3, and Rung 4, with only the best performers promoted to each successive rung.)
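
Conceptually, SHA gives a small budget to many configurations, keeps the best fraction at each rung, and repeats with a larger budget. A minimal sketch, assuming a user-supplied train(config, budget) callable that returns a score to maximize (this is not tune's implementation):

    def successive_halving(configs, train, budget=1, eta=2, rungs=4):
        """At each rung, train all survivors for `budget` more units,
        then keep only the top 1/eta of them and multiply the budget."""
        survivors = list(configs)
        for _ in range(rungs):
            scored = sorted(
                ((train(cfg, budget), cfg) for cfg in survivors),
                key=lambda s: s[0],
                reverse=True,
            )
            survivors = [cfg for _, cfg in scored[: max(1, len(scored) // eta)]]
            budget *= eta
        return survivors[0]
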
19. Fully Customized Successive Halving
8, [(4,6), (2,2), (6,1)]
20. Hyperband
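
Hyperband runs several SHA brackets that trade off the number of configurations against the per-configuration budget. A sketch of the standard bracket computation from Li et al. (2018), assuming a maximum resource R and reduction factor eta:

    import math

    def hyperband_brackets(R=81, eta=3):
        """Return (n_configs, initial_budget) for each Hyperband bracket."""
        s_max = 0
        while eta ** (s_max + 1) <= R:   # s_max = floor(log_eta(R))
            s_max += 1
        brackets = []
        for s in range(s_max, -1, -1):
            n = math.ceil((s_max + 1) * eta ** s / (s + 1))  # configs in bracket
            r = R / eta ** s                                 # initial budget each
            brackets.append((n, r))
        return brackets

    print(hyperband_brackets())  # [(81, 1.0), (34, 3.0), (15, 9.0), (8, 27.0), (5, 81.0)]
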
21. Asynchronous Successive Halving (ASHA)
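
ASHA removes SHA's synchronization barrier: whenever a worker frees up, it promotes any trial that already ranks in the top 1/eta of its rung, otherwise it starts a new configuration at the bottom rung. A small sketch of that promotion check, illustrative only and not tune's scheduler:

    def promotable(rung_scores, trial_score, eta=2):
        """True if `trial_score` ranks within the top 1/eta of all scores
        observed so far in its rung (higher is better)."""
        scores = sorted(rung_scores + [trial_score], reverse=True)
        top_k = max(1, len(scores) // eta)
        return trial_score >= scores[top_k - 1]
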
22. Live Demo: Keras Model Tuning
23. Summary
● Space, Monitoring, Dataset, Distributed Execution Abstraction
● Non-Iterative: Random, Grid, BO
● Iterative: SHA, HB, ASHA, PBT, ...
● Specialization: Scikit-Learn
● Specialization: Keras, TF, PyTorch
24. Let’s Collaborate!
● Create specialized higher-level APIs for major tuning cases so users can do tuning with minimal code and without learning distributed systems
● Enable advanced users to create fully customized, platform-agnostic and scale-agnostic tuning pipelines with tune’s lower-level APIs
25. pip install tune
https://github.com/fugue-project/tune
pip install fugue
https://github.com/fugue-project/fugue
26. Feedback
Your feedback is important to us.
Don’t forget to rate and review the sessions.
