Genetic Algorithms
AN APPLICATION TO HYPERPARAMETER TUNING IN
PREDICTIVE MODELS
Dr. Jyoti Obia
Hyperparameter Tuning in Predictive Models
Genetic Algorithm (GA)
 Heuristic / metaheuristic approach
 Optimization strategy that mimics natural selection
 Generates multiple candidate solutions to a problem by applying the principle of ‘survival of the fittest’
 Population-based search
 Flexible and robust
 Not problem-specific
Commonly used techniques (e.g., grid search)
 Need to know your hyperparameters well
 Limited to a fixed set of parameter combinations
 Decent accuracy
 Very exhaustive
 Often impractical for large search spaces
Terms of Genetic Algorithm
 Chromosome (string) : Solution – (CH)
 Gene (bit) : Part of a solution
 Allele : Value of a gene
 Phenotype : Decoded solution
 Genotype : Encoded solution
[Diagram: a population of 30 chromosomes (CH1 … CH30); within each chromosome, one block of genes encodes Parameter 1 and another encodes Parameter 2. Gene = string position; allele = parameter value. Decoding the genotype (the encoded chromosome) yields the phenotype.]
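The encoding terms above can be illustrated with a short sketch. Everything here is an illustrative assumption (the bit counts and parameter ranges are not from the slides): a 16-bit genotype whose two 8-bit gene blocks decode into a (C, gamma) phenotype.

```python
import random

# Assumed search ranges, for illustration only.
C_RANGE = (0.01, 1000.0)
GAMMA_RANGE = (0.0001, 10.0)
BITS = 8  # genes per parameter

def random_chromosome():
    """Genotype: a bit string; each position is a gene, its 0/1 value an allele."""
    return [random.randint(0, 1) for _ in range(2 * BITS)]

def decode(chromosome):
    """Phenotype: map the two 8-bit gene blocks onto (C, gamma)."""
    def to_value(bits, lo, hi):
        integer = int("".join(map(str, bits)), 2)        # 0 .. 2^BITS - 1
        return lo + (hi - lo) * integer / (2**BITS - 1)  # scale into [lo, hi]
    c = to_value(chromosome[:BITS], *C_RANGE)
    gamma = to_value(chromosome[BITS:], *GAMMA_RANGE)
    return c, gamma

ch = random_chromosome()
print(decode(ch))  # a (C, gamma) pair inside the assumed ranges
```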
Terminology:
 Fitness Value
Value of the objective function associated with an organism; it determines how fit the solution is.
 Crossover : two-point crossover
Other types:
• Single-point crossover
• Multi-point crossover
The probability of crossover is typically between 0.6 and 1.
 Mutation
• Applied to each offspring individually after crossover
• Bit-flip
• Helps explore more of the solution space
• The probability of mutation is usually small (0.001 to 0.1)
 Elitism
The organism with the best fitness value gets to live on to the next generation.
 Selection
Process of identifying the parents to be used for creating the next generation.
Types of selection:
Tournament Selection
- Randomly select K individuals
- Choose the champion with the highest fitness value
Roulette Wheel
- Assign selection probabilities proportional to fitness value
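The three operators above can be sketched in Python. This is a minimal illustration, not the author's code; the default probabilities (pc=0.8, pm=0.05) are example values within the ranges quoted above.

```python
import random

def two_point_crossover(p1, p2, pc=0.8):
    """Two-point crossover: with probability pc, swap the segment between
    two random cut points (pc is typically between 0.6 and 1)."""
    if random.random() > pc:
        return p1[:], p2[:]
    i, j = sorted(random.sample(range(1, len(p1)), 2))
    return (p1[:i] + p2[i:j] + p1[j:],
            p2[:i] + p1[i:j] + p2[j:])

def bitflip_mutation(chromosome, pm=0.05):
    """Bit-flip mutation: each gene flips independently with small probability pm,
    applied to each offspring after crossover."""
    return [1 - g if random.random() < pm else g for g in chromosome]

def tournament_selection(population, fitnesses, k=3):
    """Tournament selection: pick K individuals at random, return the champion."""
    contenders = random.sample(range(len(population)), k)
    champion = max(contenders, key=lambda i: fitnesses[i])
    return population[champion]
```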
Algorithm Methodology:
Generation 1: for m = 1 to N/2, select two parents, apply crossover and then mutation to create children. When m = N/2, Generation 1 is over, and the new generation of mutated children becomes the population for Generation 2.
Repeat the selection → crossover → mutation process until the final generation (m = M); the mutated children of the very last generation form the final population.
Save the best solution from each generation, then pick the best solution overall.
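The generation loop above can be sketched end to end. This is a minimal, self-contained sketch with a toy "OneMax" fitness (count of 1-bits) standing in for a real objective; the population size, operator choices (tournament selection, one-point crossover for brevity), and probabilities are illustrative assumptions.

```python
import random

def run_ga(fitness, n_genes=16, pop_size=30, M=20, pc=0.8, pm=0.02):
    """Minimal generational GA: selection -> crossover -> mutation for M
    generations, saving the best solution seen and returning the best overall."""
    pop = [[random.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(M):
        fits = [fitness(ch) for ch in pop]
        def select():  # tournament of 3
            i = max(random.sample(range(pop_size), 3), key=lambda j: fits[j])
            return pop[i]
        children = []
        while len(children) < pop_size:
            a, b = select()[:], select()[:]
            if random.random() < pc:  # one-point crossover for brevity
                cut = random.randrange(1, n_genes)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            children.append([1 - g if random.random() < pm else g for g in a])
            children.append([1 - g if random.random() < pm else g for g in b])
        pop = children[:pop_size]  # mutated children become the new population
        best = max(pop + [best], key=fitness)  # save the best from each generation
    return best

# Toy fitness: number of 1-bits; the GA should approach the all-ones string.
best = run_ga(fitness=sum)
print(sum(best))
```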
Hyperparameter tuning
using Genetic Algorithm
Approach & Methodology
[Flowchart] Dataset → scaling → split into training set and testing set. The GA engine proposes a (C, gamma) pair; an SVM classifier is built on the training set; the trained SVM classifier undergoes fitness evaluation; if the termination criteria are met (yes), output the optimized (C, gamma); otherwise (no), the GA operators generate the next (C, gamma) pair.
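The flowchart above can be sketched with scikit-learn. This is a minimal sketch, not the author's exact code: a small synthetic dataset stands in for the Spambase data, the GA works directly on real-valued (C, gamma) pairs (geometric-mean crossover plus log-normal mutation) rather than the bit-string encoding, selection is by simple truncation, and the termination criterion is a fixed generation count.

```python
import random
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

random.seed(0)

# Synthetic stand-in for the spam data (the slides use the UCI Spambase set).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_train)  # "Scaling" step from the flowchart
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

def fitness(c, gamma):
    """Build and train an SVM for one (C, gamma) pair; fitness = test accuracy."""
    return SVC(C=c, gamma=gamma).fit(X_train, y_train).score(X_test, y_test)

# Real-valued GA over (C, gamma) on a log scale.
pop = [(10 ** random.uniform(-2, 3), 10 ** random.uniform(-4, 1)) for _ in range(10)]
for generation in range(5):  # termination criterion: fixed generation count
    ranked = sorted(pop, key=lambda p: fitness(*p), reverse=True)
    parents = ranked[:5]     # truncation selection for brevity
    children = [parents[0]]  # elitism: best pair survives unchanged
    while len(children) < len(pop):
        (c1, g1), (c2, g2) = random.sample(parents, 2)
        c = (c1 * c2) ** 0.5 * 10 ** random.gauss(0, 0.2)  # crossover + mutation
        g = (g1 * g2) ** 0.5 * 10 ** random.gauss(0, 0.2)
        children.append((c, g))
    pop = children

best_c, best_gamma = max(pop, key=lambda p: fitness(*p))
print("optimized (C, gamma):", (best_c, best_gamma))
```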
Comparative Results for SVM Hyperparameter Tuning
(default parameters, grid search, and genetic algorithm)

SVM classification results:
                  GRID SEARCH   GENETIC ALGORITHM                  DEFAULT
Models            60 models     800 models                         1 model
Computation time  1.55 min      4 s with HP (45 min for training)  4 s
Accuracy          91%           93%                                82%
C (selected)      250           0.03                               --
gamma (selected)  1000          2.87                               --

Note: these metrics will change from system to system; they are meant only to give a broad idea of the comparison.
APPENDIX
References:
https://www.researchgate.net/publication/220835507_Optimizing_Hyperparameters_of_Support_Vector_Machines_by_Genetic_Algorithms
https://www.researchgate.net/publication/312203449_SVM_Parameter_Optimization_using_Grid_Search_and_Genetic_Algorithm_to_Improve_Classification_Performance
https://www.tutorialspoint.com/genetic_algorithms/genetic_algorithms_introduction.htm
Spam Email Dataset Description
Application: determine whether a given email is spam or not.
Relevant information: our collection of spam e-mails came from our postmaster and individuals who had filed spam.
Number of instances: 4601 (1813 spam = 39.4%)
Number of attributes: 58 (57 continuous, 1 nominal class label)
Attribute information:
 1 nominal {0,1} class attribute: denotes whether the e-mail was considered spam (1) or not (0)
 48 continuous attributes: percentage of words in the e-mail that match WORD
 6 continuous attributes: percentage of characters in the e-mail that match CHAR
 1 continuous real [1,...] attribute: average length of uninterrupted sequences of capital letters
 1 continuous integer [1,...] attribute: length of the longest uninterrupted sequence of capital letters
 1 continuous integer [1,...] attribute: total number of capital letters in the e-mail
Missing attribute values: none
Class distribution:
Spam 1813 (39.4%)
Non-spam 2788 (60.6%)
Source: this dataset is available at the UCI Machine Learning Repository:
https://archive.ics.uci.edu/ml/machine-learning-databases/spambase/
SVM Hyperparameters:
gamma
gamma is a parameter for non-linear kernels (e.g., RBF). The higher the gamma value, the more closely the model tries to fit the training data set, at the risk of overfitting.
C
C is the penalty parameter of the error term. It controls the trade-off between a smooth decision boundary and classifying the training points correctly.
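The described effect of gamma can be demonstrated directly; a minimal sketch using scikit-learn's SVC on a toy two-moons dataset (the dataset and gamma values are illustrative, not from the slides):

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Noisy toy dataset so that a perfect fit means memorizing noise.
X, y = make_moons(n_samples=200, noise=0.3, random_state=0)

# Higher gamma fits the training set ever more tightly (toward overfitting);
# C trades off a smooth boundary against classifying training points correctly.
for gamma in (0.1, 1.0, 100.0):
    acc = SVC(C=1.0, gamma=gamma).fit(X, y).score(X, y)  # training accuracy
    print(f"gamma={gamma:>6}: training accuracy {acc:.2f}")
```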

Editor's Notes

  • #3 Patterns that individuals within a population exhibit: variation in appearance and behavior occurs in pretty much all species (height, eye color, hair color, behavioral traits, and so on). Individuals with the traits most fitted to their environment survive to reproduce, and those survivors pass their traits down from generation to generation. Each generation includes mutation, which offers more variation in the future; those variations may make the offspring either more successful or less successful.