Automatic Machine Learning
By: Himadri Mishra, 13074014
Overview: What is Machine Learning?
● Subfield of computer science
● Evolved from the study of pattern recognition and
computational learning theory in artificial intelligence
● Gives computers the ability to learn without being
explicitly programmed
● Explores the study and construction of algorithms that
can learn from and make predictions on data
Basic Flow of Machine Learning
Overview: Why Machine Learning?
● Some tasks are difficult to define algorithmically.
Example: Learning to recognize objects.
● High-value predictions that can guide better decisions
and smart actions in real time without human intervention
● Machine learning is a technology that helps analyze
large chunks of big data
What is Automatic Machine Learning?
● Research area that targets progressive automation of
machine learning
● Also known as AutoML
● Focuses on end users without expert knowledge
● Offers new tools to Machine Learning experts:
○ Perform architecture search over deep representations
○ Analyse the importance of hyperparameters
○ Develop flexible software packages that can be instantiated
automatically in a data-driven way
● Follows the paradigm of Programming by Optimization (PbO)
Examples of AutoML
● AutoWEKA: Approach for the simultaneous selection of a machine learning
algorithm and its hyperparameters
● Deep Neural Networks: notoriously dependent on their hyperparameters, and
modern optimizers have achieved better results in setting them than humans
(Bergstra et al, Snoek et al).
● Making a science of model search: a complex computer vision architecture
could automatically be instantiated to yield state-of-the-art results on 3
different tasks: face matching, face identification, and object
recognition.
Methods of AutoML
● Bayesian optimization
● Regression models for structured data and big data
● Meta learning
● Transfer learning
● Combinatorial optimization.
An AutoML Framework
Modules of AutoML Framework, unraveled
● Data Pre-Processing
● Problem Identification and Data Splitting
● Feature Engineering
● Feature Stacking
● Application of various models to data
● Decomposition
● Feature Selection
● Model selection and HyperParameter tuning
● Evaluation of Model
Data Pre-Processing
● Tabular data is the most common way of representing data in
machine learning and data mining
● Data must be converted to a tabular form
Problem Identification and Data Splitting
Types of Labels
● Single column, binary values (Binary Classification)
● Single column, real values (Regression problem)
● Multiple column, binary values (Multi-Class
Classification)
● Multiple column, real values (Multiple target Regression
problem)
● Multilabel Classification
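The label layouts listed for this step can be told apart from the shape and values of the label columns. The following is a minimal sketch of that idea, not the framework's actual code: the function name `identify_problem` and the heuristics are assumptions made for illustration. In particular, a single column of integer class ids with more than two classes would be reported as regression here.

```python
from typing import List, Sequence, Union

Number = Union[int, float]


def identify_problem(labels: Sequence[Union[Number, Sequence[Number]]]) -> str:
    """Guess the problem type from the label column(s). Hypothetical helper."""
    # Normalise to a list of rows so one column and many columns look alike
    rows: List[List[Number]] = [
        list(r) if isinstance(r, (list, tuple)) else [r] for r in labels
    ]
    n_cols = len(rows[0])
    values = {v for row in rows for v in row}
    is_binary = values <= {0, 1}

    if n_cols == 1:
        return "binary classification" if is_binary else "regression"
    if not is_binary:
        return "multiple target regression"
    # Several binary columns: exactly one active column per row means each
    # sample belongs to one class; otherwise a sample can carry several labels.
    if all(sum(row) == 1 for row in rows):
        return "multi-class classification"
    return "multilabel classification"
```

For example, `identify_problem([[1, 0, 0], [0, 1, 0]])` returns `"multi-class classification"`, while a row such as `[1, 1, 0]` switches the result to `"multilabel classification"`.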
● Stratified KFold splitting for Classification
● Normal KFold split for regression
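The difference between the two splitting strategies can be sketched in a few lines. This is an illustrative, unshuffled version written with the standard library only; the function names are assumptions, and a real pipeline would normally use a library implementation with shuffling.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def stratified_kfold(labels: Sequence[Hashable], k: int) -> List[List[int]]:
    """Return k lists of sample indices with similar class proportions."""
    by_class: Dict[Hashable, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds: List[List[int]] = [[] for _ in range(k)]
    position = 0
    for indices in by_class.values():
        # Deal each class's samples out round-robin so every fold
        # receives roughly the same share of that class
        for index in indices:
            folds[position % k].append(index)
            position += 1
    return folds


def plain_kfold(n_samples: int, k: int) -> List[List[int]]:
    """Return k lists of sample indices, ignoring the target values."""
    return [list(range(start, n_samples, k)) for start in range(k)]
```

Each returned list serves as the validation fold once, with the remaining folds used for training. Stratification matters for classification because a fold without any sample of a rare class cannot be scored meaningfully for that class.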
Feature Engineering
Types of Variables
● Numerical Variables
○ No Processing Required
● Categorical Variables
○ Label Encoders
○ One Hot Encoders
● Text Variables
○ Count Vectorize
○ TF-IDF vectorize
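The encoders named for categorical and text variables can be illustrated with small standard-library versions. These are sketches under simplifying assumptions (whitespace tokenisation, lowercase text, and the textbook weighting idf = log(N / df)); library implementations often use smoothed variants and sparse output, and all function names here are made up for the example.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def label_encode(values: Sequence[str]) -> List[int]:
    """Map each category to an integer id (ids follow sorted order)."""
    mapping = {value: i for i, value in enumerate(sorted(set(values)))}
    return [mapping[value] for value in values]


def one_hot_encode(values: Sequence[str]) -> List[List[int]]:
    """One binary column per category, exactly one 1 per row."""
    categories = sorted(set(values))
    return [[int(value == c) for c in categories] for value in values]


def count_vectorize(docs: Sequence[str]) -> Tuple[List[str], List[List[int]]]:
    """Return the vocabulary and a document-by-token count matrix."""
    vocab = sorted({token for doc in docs for token in doc.lower().split()})
    rows = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        rows.append([counts[token] for token in vocab])
    return vocab, rows


def tfidf_vectorize(docs: Sequence[str]) -> Tuple[List[str], List[List[float]]]:
    """Weight raw counts by log(N / document frequency)."""
    vocab, counts = count_vectorize(docs)
    n_docs = len(docs)
    doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / df) for df in doc_freq]
    return vocab, [[c * w for c, w in zip(row, idf)] for row in counts]
```

With this weighting, a token that appears in every document gets weight 0, which is the point of TF-IDF: common tokens carry little information about any single document.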
Feature Stacking
● Two Kinds of Stacking
○ Model Stacking
■ An Ensemble Approach
■ Combines the power of diverse models into a single model
○ Feature Stacking
■ Different feature sets are combined after processing
● Our Stacker Module is a feature stacker
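Feature stacking in this sense amounts to placing the processed feature blocks side by side, row by row. A minimal sketch follows; the name `stack_features` is hypothetical and dense Python lists stand in for whatever matrix type the Stacker Module actually uses.

```python
from typing import List

Matrix = List[List[float]]


def stack_features(*blocks: Matrix) -> Matrix:
    """Concatenate feature blocks column-wise; all blocks share the same rows."""
    n_rows = len(blocks[0])
    if any(len(block) != n_rows for block in blocks):
        raise ValueError("all feature blocks must have the same number of rows")
    # Row i of the result is row i of every block, joined left to right
    return [[value for block in blocks for value in block[i]] for i in range(n_rows)]


numeric = [[1.5], [2.5]]          # numerical variable, used as-is
one_hot = [[0, 1], [1, 0]]        # encoded categorical variable
stacked = stack_features(numeric, one_hot)
```

Here `stacked` has one row per sample and three columns: the numerical value followed by the two one-hot columns.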
Application of models and Decomposition
● Start with ensemble tree-based models:
○ Random Forest Regressor/Classifier
○ Extra Trees Regressor/Classifier
○ Gradient Boosting Machine Regressor/Classifier
● Linear models should not be applied without normalization
○ For dense features: Standard Scaler normalization
○ For sparse features: scale to unit variance only, without
centering on the mean
● If the above steps give a “good” model, we can move on to the
hyperparameter optimization module; otherwise, continue
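The two normalization variants differ only in whether the mean is subtracted. A minimal sketch on a single feature column, assuming population variance; the function name and the `with_mean` flag are choices made for this example.

```python
import math
from typing import List


def standardize(column: List[float], with_mean: bool = True) -> List[float]:
    """Scale a column to unit variance, optionally centering it first."""
    mean = sum(column) / len(column)
    variance = sum((x - mean) ** 2 for x in column) / len(column)
    std = math.sqrt(variance) or 1.0   # constant columns are left unscaled
    shift = mean if with_mean else 0.0
    return [(x - shift) / std for x in column]


dense_scaled = standardize([1.0, 2.0, 3.0])                     # centered and scaled
sparse_scaled = standardize([0.0, 0.0, 4.0], with_mean=False)   # scaled only
```

Skipping the centering step keeps zeros at zero, so a sparse matrix stays sparse after scaling. Subtracting the mean would turn almost every entry into a non-zero value.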
● For high-dimensional data, PCA is used for decomposition
● For images, start with 10-15 components and increase the
number as long as results improve
● For other kinds of data, start with 50-60 components
● For text data, we use Singular Value Decomposition after
converting the text to a sparse matrix
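The "increase as long as results improve" rule can be written as a simple loop. The sketch below assumes a caller-supplied function `score_for(k)` that fits the decomposition with `k` components, trains a model on the result, and returns a validation score where higher is better. That function, the step size, and the upper limit are all hypothetical.

```python
from typing import Callable, Tuple


def grow_components(
    score_for: Callable[[int], float],
    start: int,
    step: int,
    max_components: int,
) -> Tuple[int, float]:
    """Increase the component count until the score stops improving."""
    best_k, best_score = start, score_for(start)
    k = start + step
    while k <= max_components:
        score = score_for(k)
        if score <= best_score:
            break                      # no improvement: keep the previous count
        best_k, best_score = k, score
        k += step
    return best_k, best_score
```

For image data the loop would be started at around 10 to 15 components, and for other data at around 50 to 60, as suggested above.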
Feature Selection
● Greedy Forward Selection
○ Selecting best features iteratively
○ Selecting features based on coefficients of model
● Greedy backward elimination
● For feature evaluation, use GBM on dense features and
Random Forest on sparse features
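Greedy forward selection can be sketched as follows. The `score` callable is an assumption: it stands for training and validating a model on the given feature subset and returning a number where higher is better.

```python
from typing import Callable, List, Sequence


def greedy_forward_selection(
    features: Sequence[str],
    score: Callable[[List[str]], float],
) -> List[str]:
    """Add one feature at a time, keeping it only if the score improves."""
    selected: List[str] = []
    remaining = list(features)
    best_score = float("-inf")
    while remaining:
        # Try every remaining feature on top of the current selection
        top_score, top_feature = max(
            (score(selected + [feature]), feature) for feature in remaining
        )
        if top_score <= best_score:
            break                      # no candidate helps: stop
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected
```

Greedy backward elimination is the mirror image: start from all features and repeatedly drop the one whose removal improves the score most.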
Model selection and HyperParameter tuning
● Most important and fundamental process of Machine
Learning
Choice of Model and Hyperparameters
● Classification:
○ Random Forest
○ GBM
○ Logistic Regression
○ Naive Bayes
○ Support Vector Machines
○ k-Nearest Neighbors
● Regression:
○ Random Forest
○ GBM
○ Linear Regression
○ Ridge
○ Lasso
○ SVR
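Choosing a model and its hyperparameters together can be treated as one search problem. The sketch below uses plain random search as a simple stand-in for the Bayesian optimization listed under the methods of AutoML. The search space values are illustrative and do not come from the slides, and `evaluate` is a hypothetical callable that trains the named model with the given parameters and returns a validation score.

```python
import random
from typing import Any, Callable, Dict, List, Tuple

SearchSpace = Dict[str, Dict[str, List[Any]]]

# Illustrative space: model name -> hyperparameter name -> candidate values
EXAMPLE_SPACE: SearchSpace = {
    "random_forest": {"n_estimators": [120, 300, 500], "max_depth": [5, 10, 25]},
    "logistic_regression": {"C": [0.01, 0.1, 1.0, 10.0]},
}


def random_search(
    space: SearchSpace,
    evaluate: Callable[[str, Dict[str, Any]], float],
    n_trials: int,
    seed: int = 0,
) -> Tuple[float, str, Dict[str, Any]]:
    """Sample (model, hyperparameters) pairs and keep the best-scoring one."""
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    rng = random.Random(seed)
    best: Tuple[float, str, Dict[str, Any]] = (float("-inf"), "", {})
    for _ in range(n_trials):
        model = rng.choice(sorted(space))
        params = {name: rng.choice(choices) for name, choices in space[model].items()}
        score = evaluate(model, params)
        if score > best[0]:
            best = (score, model, params)
    return best
```

Bayesian optimization replaces the uniform sampling with a model of past results that proposes promising configurations, which usually needs fewer evaluations for the same quality.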
Evaluation of Model
Saving all Transformations on Train Data for reuse
Re-Use of saved transformations for Evaluation on validation set
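The idea of fitting transformations on the training data, saving them, and reusing them unchanged on the validation set can be sketched with a single scaler. Storing the fitted parameters as JSON is a choice made for this example; the slides do not say how the framework persists its transformations, and the function names are hypothetical.

```python
import json
from typing import Dict, List


def fit_scaler(column: List[float]) -> Dict[str, float]:
    """Learn mean and standard deviation from the training column only."""
    mean = sum(column) / len(column)
    variance = sum((x - mean) ** 2 for x in column) / len(column)
    return {"mean": mean, "std": variance ** 0.5 or 1.0}


def apply_scaler(params: Dict[str, float], column: List[float]) -> List[float]:
    """Apply previously fitted parameters without refitting."""
    return [(x - params["mean"]) / params["std"] for x in column]


train = [1.0, 2.0, 3.0]
validation = [2.0, 4.0]

saved = json.dumps(fit_scaler(train))      # in practice, written to a file
restored = json.loads(saved)               # later, loaded for evaluation
validation_scaled = apply_scaler(restored, validation)
```

The validation data never influences the fitted mean and standard deviation, so the validation score reflects how the pipeline would behave on unseen data.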
Current Research
Automatic Architecture selection for Neural Network
Automatically Tuned Neural Network
● Auto-Net is a system that automatically configures neural networks
● Achieved the best performance on two datasets in the human expert track of
the recent ChaLearn AutoML Challenge
● Works by tuning:
○ layer-independent network hyperparameters
○ per-layer hyperparameters
● The Auto-Net submission reached an AUC score of 90%, while the best human
competitor (Ideal Intel Analytics) only reached 80%
● First time an automatically constructed neural network won a competition
dataset
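The split between layer-independent and per-layer hyperparameters can be pictured as a nested configuration. The sketch below samples one such configuration at random. Every name, range, and choice in it is an assumption made for illustration and should not be read as Auto-Net's actual search space.

```python
import random
from typing import Any, Dict


def sample_network_config(rng: random.Random, max_layers: int = 4) -> Dict[str, Any]:
    """Draw one hypothetical network configuration."""
    n_layers = rng.randint(1, max_layers)
    return {
        # Layer-independent hyperparameters: apply to the whole network
        "learning_rate": 10 ** rng.uniform(-5, -1),   # log-uniform sample
        "batch_size": rng.choice([32, 64, 128, 256]),
        "n_layers": n_layers,
        # Per-layer hyperparameters: one entry for each layer
        "layers": [
            {
                "units": rng.choice([64, 128, 256, 512]),
                "dropout": rng.uniform(0.0, 0.5),
                "activation": rng.choice(["relu", "tanh", "sigmoid"]),
            }
            for _ in range(n_layers)
        ],
    }


config = sample_network_config(random.Random(0))
```

A tuner would evaluate many such configurations and use the results to decide which region of the space to explore next.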
Conclusion
● Machine learning (ML) has achieved considerable successes
in recent years and an ever-growing number of disciplines
rely on it.
● However, its success crucially relies on human machine
learning experts to perform various tasks manually
● The rapid growth of machine learning applications has
created a demand for off-the-shelf machine learning
methods that can be used easily and without expert
knowledge
● AutoML is an open research topic and is expected to soon
challenge state-of-the-art results in various
domains
Thank You
