Towards Automatic Composition of
MultiComponent Predictive Systems
Manuel Martin Salvador, Marcin Budka, Bogdan Gabrys
Data Science Institute, Bournemouth University, UK
April 18th, 2016
Seville, Spain
Predictive modelling
Labelled
Data
Supervised
Learning
Algorithm
Predictive
Model
Data is imperfect
Missing
Values
Noise
High
dimensionality
Outliers
Question Mark: http://commons.wikimedia.org/wiki/File:Question_mark_road_sign,_Australia.jpg
Noise: http://www.flickr.com/photos/benleto/3223155821/
Outliers: http://commons.wikimedia.org/wiki/File:Diagrama_de_caixa_com_outliers_and_whisker.png
3D plot: http://salsahpc.indiana.edu/plotviz/
MultiComponent Predictive Systems
Data → Preprocessing → Predictive Model → Postprocessing → Predictions
MultiComponent Predictive Systems
Data → [multiple Preprocessing components] → [multiple Predictive Models] → Postprocessing → Predictions
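The flow above can be sketched as plain functions composed into a pipeline. The sketch below is illustrative Python (not the WEKA components used in the talk): mean imputation stands in for preprocessing, a toy linear scorer for the predictive model, and label mapping for postprocessing.

```python
# Illustrative sketch of a multicomponent predictive system (MCPS):
# Data -> Preprocessing -> Predictive Model -> Postprocessing -> Predictions.
# All three components are hypothetical stand-ins, not WEKA methods.

def impute_mean(rows):
    """Preprocessing: replace None values with the column mean."""
    cols = list(zip(*rows))
    filled = []
    for col in cols:
        known = [v for v in col if v is not None]
        mean = sum(known) / len(known)
        filled.append([mean if v is None else v for v in col])
    return [list(r) for r in zip(*filled)]

def predict(weights, row):
    """Predictive model: a toy linear scorer thresholded at zero."""
    return 1 if sum(w * x for w, x in zip(weights, row)) > 0 else 0

def postprocess(label):
    """Postprocessing: map the raw class label to a readable prediction."""
    return "positive" if label == 1 else "negative"

def mcps(rows, weights):
    """Compose the three components into one predictive system."""
    return [postprocess(predict(weights, r)) for r in impute_mean(rows)]

print(mcps([[1.0, None], [-1.0, 0.5]], [1.0, 1.0]))
```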
Algorithm Selection
What are the best
algorithms to
process my data?
Hyperparameter Optimisation
How to tune the
hyperparameters to get
the best performance?
CASH problem
Combined Algorithm Selection and Hyperparameter optimisation problem:
find the algorithms and hyperparameters that minimise an objective function
(e.g. classification error), estimated by k-fold cross validation over
training and validation datasets.
Thornton, C., Hutter, F., Hoos, H.H., Leyton-Brown, K.: Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms.
In: Proc. of the 19th ACM SIGKDD. (2013) 847–855
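In code, the CASH objective amounts to: for every candidate (algorithm, hyperparameter setting) pair, estimate the objective by k-fold cross validation, and keep the configuration with the lowest validation error. A minimal pure-Python sketch; the two candidate "algorithms" (a threshold classifier and a constant classifier) are toy stand-ins, not WEKA methods.

```python
# Minimal sketch of the CASH objective: pick the (algorithm, hyperparameters)
# pair minimising k-fold cross-validation error.

def cv_error(fit, data, k=3):
    """k-fold cross-validation error of a fitting function on (x, y) pairs."""
    folds = [data[i::k] for i in range(k)]
    errors = []
    for i in range(k):
        valid = folds[i]
        train = [p for j, f in enumerate(folds) if j != i for p in f]
        model = fit(train)
        errors.append(sum(model(x) != y for x, y in valid) / len(valid))
    return sum(errors) / k

def make_threshold(t):
    """Toy algorithm 1: classify by comparing x against threshold t."""
    return lambda train: (lambda x: int(x > t))

def make_constant(c):
    """Toy algorithm 2: always predict the constant class c."""
    return lambda train: (lambda x: c)

# Search space: algorithms paired with hyperparameter settings.
search_space = ([("threshold t=%.1f" % t, make_threshold(t)) for t in (0.0, 0.5, 1.0)]
                + [("constant c=%d" % c, make_constant(c)) for c in (0, 1)])

data = [(x / 10, int(x / 10 > 0.5)) for x in range(10)]
best_name, best_fit = min(search_space, key=lambda cand: cv_error(cand[1], data))
print(best_name, cv_error(best_fit, data))
```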
Auto-WEKA
WEKA methods as search space
One-click black box
Data + Time Budget → MCPS
Our contribution
Recursive extension of complex
hyperparameters in the search space.
Code available at https://github.com/dsibournemouth/autoweka
Search space
Number of hyperparameters: PREV 756 → NEW 1186
Optimisation strategies
● Grid search: exhaustive exploration of the whole search space. Not feasible in high-dimensional spaces.
● Random search: explores the search space randomly during a given time budget.
● Bayesian optimisation: assumes a functional relationship between the hyperparameters and the objective, and tries to explore the most promising parts of the search space.
Hutter, F., Hoos, H.H., Leyton-Brown, K.: Sequential Model-Based Optimization for General Algorithm Configuration. Learning and Intelligent Optimization, LNCS 6683 (2011) 507–523.
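Random search, the simplest non-trivial baseline above, can be sketched in a few lines. The objective below is a hypothetical smooth function standing in for cross-validated error; in the talk the budget is wall-clock time rather than a fixed number of iterations.

```python
import random

def objective(cfg):
    """Hypothetical stand-in for CV error, minimised near lr=0.1, depth=4."""
    lr, depth = cfg
    return (lr - 0.1) ** 2 + (depth - 4) ** 2 / 100

def random_search(budget, seed=0):
    """Evaluate `budget` random configurations; return the best one found."""
    rng = random.Random(seed)
    best_cfg, best_err = None, float("inf")
    for _ in range(budget):
        cfg = (rng.uniform(0.001, 1.0), rng.randint(1, 10))
        err = objective(cfg)
        if err < best_err:
            best_cfg, best_err = cfg, err
    return best_cfg, best_err

cfg, err = random_search(200)
print(cfg, err)
```

With a fixed seed, a larger budget can only improve on a smaller one, since the first draws are shared.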
Evaluated strategies
1. WEKA-Def: All the predictors and meta-predictors are run using WEKA’s
default hyperparameter values.
2. Random search: The search space is randomly explored.
3. SMAC: Sequential Model-based Algorithm Configuration incrementally
builds a Random Forest as its inner model.
4. TPE: Tree-structured Parzen Estimator incrementally builds kernel
density estimators of good and bad configurations as its inner model.
Hutter, F., Hoos, H.H., Leyton-Brown, K.: Sequential Model-Based Optimization for General Algorithm Configuration. Learning and Intelligent Optimization, LNCS 6683 (2011) 507–523.
Bergstra, J., Bardenet, R., Bengio, Y., Kégl, B.: Algorithms for Hyper-Parameter Optimization. In: Advances in NIPS 24 (2011) 1–9.
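The model-based strategies share one loop: fit a surrogate model to the configurations evaluated so far, use it to pick a promising candidate, evaluate that candidate, and refit. The sketch below is neither SMAC (whose surrogate is a random forest) nor TPE; a 1-nearest-neighbour surrogate stands in, and the objective is again a hypothetical stand-in for CV error.

```python
import random

def objective(x):
    """Hypothetical stand-in for cross-validated error, minimised at x=0.3."""
    return (x - 0.3) ** 2

def surrogate(history, x):
    """Predict the error at x from the nearest evaluated configuration
    (a toy stand-in for SMAC's random-forest surrogate)."""
    return min(history, key=lambda h: abs(h[0] - x))[1]

def model_based_search(n_iter, seed=0):
    """Sequential model-based optimisation loop: propose random candidates,
    evaluate the one the surrogate predicts to be best, and refit."""
    rng = random.Random(seed)
    x0 = rng.uniform(0, 1)
    history = [(x0, objective(x0))]
    for _ in range(n_iter):
        candidates = [rng.uniform(0, 1) for _ in range(20)]
        x = min(candidates, key=lambda c: surrogate(history, c))
        history.append((x, objective(x)))
    return min(history, key=lambda h: h[1])

best_x, best_err = model_based_search(30)
print(best_x, best_err)
```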
Experiments
21 datasets (classification problems)
Budget: 30 CPU-hours (per run)
25 runs with different seeds
Timeout: 30 minutes
Memout: 3GB RAM
Results
Classification error on test set (datasets, out of 21, where each strategy performed best):
● WEKA-Def (best): 1/21
● Random search (mean): 4/21
● SMAC (mean): 10/21
● TPE (mean): 6/21
Search spaces
● NEW > PREV: 52/63
Best MCPSs found
Conclusion and future work
Automation of composition and optimisation of MCPSs is feasible
Extending the search space has helped to find better solutions
Bayesian optimisation strategies have performed better than random search in
most cases
Future work:
● There is still room for improvement in Bayesian optimisation strategies.
● Multi-objective optimisation (e.g. time and error).
● Adaptive optimisation in changing environments.
Thank you!
msalvador@bournemouth.ac.uk
Paper available at https://dx.doi.org/10.1007/978-3-319-32034-2_3
Slides available at http://slideshare.net/draxus
