Analysis of Feature Selection Algorithms
Branch and Bound | Beam Search Algorithm
Parinda Rajapaksha, UCSC
ROAD MAP 
• Motivation 
• Introduction 
• Analysis 
– Algorithm 
– Pseudo Code 
– Illustration of examples 
• Applications 
• Observations and Recommendations 
• Comparison between two algorithms 
• References 
SECTION 1
Branch and Bound Algorithm
MOTIVATION
• Optimal feature selection (subset selection) is difficult because of its computational complexity
• All subsets of a given cardinality have to be evaluated to find the optimal set of features among a large set of measurements
• Exhaustive search is impractical even for relatively small problems
– Finding 2 features from a 10-feature set already generates 45 possible combinations.
• Over the years, this challenge has motivated work on speeding up the search process in the arena of feature selection
INTRODUCTION
• As a solution, the Branch and Bound (B&B) algorithm was developed by Narendra and Fukunaga in 1977
• It introduced heuristic measures that help identify parts of the search space that can be left unexplored without missing the optimal solution
• Guaranteed to find the optimal feature subset without evaluating all possible subsets
• B&B is an exponential search method
• Assumes that the feature selection criterion is monotonic
INTRODUCTION Monotonicity Property
• For two given feature subsets (X, Y) and a feature selection criterion function (J):
X ⊂ Y => J(X) < J(Y)     Ex: X = {2,4} ⊂ Y = {2,4,5}
• It ensures that the values of the leaf nodes of a branch cannot be better than the current bound
• Allows creating short cuts in the search tree representing the feature set optimization process
• Reduces the number of nodes and branches of the search tree that have to be explored
INTRODUCTION Monotonicity Property
Feature set => {x1, x2, x3, x4, … xn}
For the nested chain {x1} ⊂ {x1, x2} ⊂ … ⊂ {x1, x2, x3, … xn}, the criterion values satisfy:
J(x1) < J(x1, x2) < J(x1, x2, x3) < … < J(x1, x2, x3, … xn)
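To make the property concrete, here is a minimal Python sketch (the per-feature scores are assumed for illustration, not taken from the slides): if J simply sums non-negative per-feature scores, adding a feature can never decrease J, so the criterion is monotonic.

```python
# Hypothetical per-feature scores (assumed for illustration only).
scores = {1: 0.9, 2: 0.4, 3: 0.8, 4: 0.5, 5: 0.7}

def J(subset):
    """A monotonic criterion: the sum of non-negative per-feature scores."""
    return sum(scores[f] for f in subset)

X = {2, 4}
Y = {2, 4, 5}
assert X < Y        # X is a proper subset of Y
assert J(X) < J(Y)  # monotonicity: J(X) = 0.9 < J(Y) = 1.6
```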
ANALYSIS
• Start from the full set of features and remove features using a depth-first strategy
• The monotonicity property must be satisfied to apply the algorithm
• Branching is the process of constructing the tree
• At each tree level, a limited number of sub-trees is generated by deleting one feature from the parent node's feature set
• Bounding is the process of finding the optimal feature set by traversing the constructed tree
ANALYSIS Algorithm
1. Construct an ordered tree satisfying the monotonicity property
Let Xj be the set of features obtained by removing j features y1, y2, … yj from the set Y of all features:
Xj = Y \ {y1, y2, … yj}
The monotonicity condition assumes that, for the nested feature subsets
Xj ⊂ … ⊂ X2 ⊂ X1
the criterion function J fulfills
J(Xj) < … < J(X2) < J(X1)
ANALYSIS Algorithm
2. Traverse the tree from right to left in a depth-first search pattern
• If the criterion value at a given node is less than the current bound (the value of the most recent best subset),
all its successors will also have values below the bound
3. Pruning
• Whenever the criterion value J(Xm) at some internal node is found to be lower than the current bound, the monotonicity condition implies the whole sub-tree may be cut off, and many computations may be omitted
• B&B builds a tree containing all possible r-element subsets of the n-feature set, but searches only some of them
ANALYSIS Pseudo Code
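The pseudo-code slide was an image and did not survive extraction. As a stand-in, here is a minimal Python sketch of the search just described: right-to-left depth-first traversal, removal of features in increasing order, and monotonicity-based pruning. It assumes a criterion function J that maps a frozenset of feature indices to a value; it is an illustrative reconstruction, not the exact pseudo code of Narendra and Fukunaga.

```python
from math import inf

def branch_and_bound(n, r, J):
    """Find the best r-feature subset of features {1..n}.

    Assumes J is monotonic: J(X) <= J(Y) whenever X is a subset of Y.
    Returns (best_value, best_subset).
    """
    best_value, best_subset = -inf, None

    def search(subset, last_removed):
        nonlocal best_value, best_subset
        value = J(subset)
        if len(subset) == r:            # leaf: a candidate target subset
            if value > best_value:
                best_value, best_subset = value, subset
            return
        if value <= best_value:         # bound test: by monotonicity, no
            return                      # descendant can beat the bound
        removals_left = len(subset) - 1 - r
        # Remove features in increasing order (avoids duplicate subtrees);
        # iterating in reverse explores the rightmost branch first, which
        # sets an initial bound cheaply. Skip removals that would leave too
        # few larger features for the branch to stay complete.
        for f in sorted((x for x in subset if x > last_removed), reverse=True):
            if sum(1 for x in subset if x > f) >= removals_left:
                search(subset - {f}, f)

    search(frozenset(range(1, n + 1)), 0)
    return best_value, best_subset
```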
ANALYSIS Tree Properties
• Root of the tree represents the set of all features (n) and leaves represent target subsets (r) of features
• At each tree level, a limited number of sub-trees is generated by deleting one feature from the parent node's feature set

Ex: root { X1,X2,X3 } (all features, n); removing X1, X2, or X3 along the edges gives the leaves { X2,X3 }, { X1,X3 }, { X1,X2 } (target subsets, r)
ANALYSIS Tree Properties
• In practice, features are only allowed to be removed in increasing order. This removes unnecessary repetitions in the calculation; therefore the tree is not symmetrical.

Ex: from root { X1,X2,X3,X4 }, removing X1 and then X2 reaches { X3,X4 }. Removing X2 and then X1 would repeat the same subset { X3,X4 }, but since X2, X1 is not in increasing order, that branch is never generated.
ANALYSIS Tree Properties
• Number of leaf nodes in the tree = nCr
• Number of levels = n – r
• Ex: 3 features reduced to 2 features (root { X1,X2,X3 }, leaves { X2,X3 }, { X1,X3 }, { X1,X2 }):
No of leaf nodes = 3C2 = 3
No of levels = 3 – 2 = 1
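Both formulas can be checked directly with Python's standard library; a quick sketch covering this example and the two worked examples that follow:

```python
from math import comb

def tree_shape(n, r):
    """Leaf count (nCr) and level count (n - r) of the B&B search tree."""
    return comb(n, r), n - r

print(tree_shape(3, 2))   # (3, 1)   -> 3 leaves, 1 level
print(tree_shape(5, 2))   # (10, 3)  -> the 5-to-2 example below
print(tree_shape(10, 6))  # (210, 4) -> Example 2
```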
EXAMPLE
How to reduce 5 features to 2 features using the B&B algorithm?
{1, 2, 3, 4, 5} → { ?, ? }
Finding the best 2 features from the full set of features
EXAMPLE Branching Step 1
• Identify the tree properties:
– No of levels = 5 – 2 = 3 (5 → 4 → 3 → 2)
– No of leaf nodes = 5C2 = 10
– Choose a criterion function J(X)
EXAMPLE Branching Step 2
Level 0: {1,2,3,4,5}
Level 1: removing feature 1, 2, or 3 gives {2,3,4,5}, {1,3,4,5}, {1,2,4,5}
Note: If feature 4 or 5 were removed from the initial state, the tree would not become a complete tree; there would be no features left to remove (in increasing order) at the next levels.
EXAMPLE Branching Step 3
Level 2:
from {2,3,4,5}, removing 2, 3, or 4 gives {3,4,5}, {2,4,5}, {2,3,5}
from {1,3,4,5}, removing 3 or 4 gives {1,4,5}, {1,3,5}
from {1,2,4,5}, removing 4 gives {1,2,5}
EXAMPLE Branching Step 4
Level 3 (leaves):
from {3,4,5}, removing 3, 4, or 5 gives {4,5}, {3,5}, {3,4}
from {2,4,5}, removing 4 or 5 gives {2,5}, {2,4}
from {2,3,5}, removing 5 gives {2,3}
from {1,4,5}, removing 4 or 5 gives {1,5}, {1,4}
from {1,3,5}, removing 5 gives {1,3}
from {1,2,5}, removing 5 gives {1,2}
EXAMPLE Criterion Values
• Assume the criterion function J(X) gives the following results, which satisfy the monotonicity property:
Root: J({1,2,3,4,5}) = 15
Level 1: {2,3,4,5} = 10, {1,3,4,5} = 12, {1,2,4,5} = 11
Level 2: {3,4,5} = 6, {2,4,5} = 7, {2,3,5} = 8, {1,4,5} = 8, {1,3,5} = 10, {1,2,5} = 9
Level 3: {4,5} = 3, {3,5} = 4, {3,4} = 5, {2,5} = 5, {2,4} = 6, {2,3} = 7, {1,5} = 6, {1,4} = 7, {1,3} = 9, {1,2} = 8
EXAMPLE Backtracking
• Calculate the criterion values using the J(X) function (values are assumed)
• Set the rightmost leaf value as the bound; this branch has the minimum number of child nodes and edges: 15 → 11 → 9 → 8, reaching the leaf {1,2}
Current V = 8, Bound = 8 (Set Bound)
EXAMPLE Backtracking
• Backtrack along the tree (depth-first) while Current Node Value ≥ Bound
• Update the bound when backtracking reaches a leaf node: the next explored leaf {1,3} has value 9 ≥ 8
Current V = 9, Bound = 9 (Update Bound)
EXAMPLE Backtracking
• If Current Node Value ≤ Bound, discard the branches below (prune)
• The bound does not change: the internal node {1,4,5} with value 8 ≤ 9 is cut off
Current V = 8, Bound = 9
EXAMPLE Backtracking
• Repeat the previous steps: the internal node {2,3,5} with value 8 ≤ 9 is also pruned
Current V = 8, Bound = 9
EXAMPLE Backtracking
• The remaining internal nodes {2,4,5} (7) and {3,4,5} (6) fall below the bound and are pruned as well (Current V = 6, Bound = 9)
• Maximum bound in leaf nodes = 9
• Optimal feature subset = {1,3}
• Note that some subsets in L3 can be omitted without calculating their criterion values
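The whole worked example can be replayed with the branch_and_bound sketch from the Analysis section by encoding the assumed criterion values in a lookup table (a sketch; the dictionary below simply transcribes the numbers assumed above):

```python
# Assumed criterion values from the example, keyed by feature subset.
J_VALUES = {
    frozenset({1, 2, 3, 4, 5}): 15,
    frozenset({2, 3, 4, 5}): 10, frozenset({1, 3, 4, 5}): 12, frozenset({1, 2, 4, 5}): 11,
    frozenset({3, 4, 5}): 6, frozenset({2, 4, 5}): 7, frozenset({2, 3, 5}): 8,
    frozenset({1, 4, 5}): 8, frozenset({1, 3, 5}): 10, frozenset({1, 2, 5}): 9,
    frozenset({4, 5}): 3, frozenset({3, 5}): 4, frozenset({3, 4}): 5,
    frozenset({2, 5}): 5, frozenset({2, 4}): 6, frozenset({2, 3}): 7,
    frozenset({1, 5}): 6, frozenset({1, 4}): 7, frozenset({1, 3}): 9, frozenset({1, 2}): 8,
}

value, subset = branch_and_bound(n=5, r=2, J=J_VALUES.__getitem__)
print(value, sorted(subset))  # -> 9 [1, 3], matching the worked example
```

Note that the pruned internal nodes ({1,4,5}, {2,3,5}, {2,4,5}, {3,4,5}) are still evaluated once each for the bound test; what is saved is the evaluation of the leaves beneath them.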
EXAMPLE 2
Reduce 10 features to 6 features (n = 10, r = 6):
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10} → { ?, ?, ?, ?, ?, ? }
No of levels = 10 – 6 = 4
No of leaf nodes = 10C6 = 210
EXAMPLE 2 Reduce 10 features to 6
[Tree diagram: 4 levels of single-feature removals, ending in 210 leaf nodes]
APPLICATIONS B & B
• Evaluation of Feature Selection Techniques for Analysis of Functional MRI and EEG
– This paper compares the performance of classical sequential methods and the B&B algorithm when applied to functional Magnetic Resonance Images (MRI) and intracranial EEG to classify pathological events
– They used 12 features for the MRI data and 14 features for the EEG data
– The results of this work contradict the claim, made in several sources, that the B&B algorithm is an optimal search algorithm for feature selection
APPLICATIONS B & B
– The algorithm fails to create subsets with better classification accuracy in this application
[Figure: Classification accuracy as a function of subset size for the MRI data]
OBSERVATIONS & RECOMMENDATIONS
• Every B&B algorithm requires additional computations
– Not only the target subsets of r features, but also their supersets (of up to n features) have to be evaluated
• It does not guarantee that enough sub-trees will be cut off to keep the total number of criterion computations lower than in exhaustive search
• In the worst case, the criterion function is computed in every tree node
– The same as exhaustive search
OBSERVATIONS & RECOMMENDATIONS
• Criterion value computation is usually slower near the root
– The evaluated feature subsets are larger: J(X1, X2, … Xn)
• Sub-tree cut-offs are less frequent near the root
– The higher criterion values arising from larger subsets are compared to the bound, which is updated at the leaves
• The B&B algorithm usually spends most of its time on tedious, less promising evaluation of tree nodes in the levels closer to the root
• This effect is to be expected, especially for r << n
SECTION 2
Beam Search Algorithm
INTRODUCTION
• Beam search is a heuristic method for solving combinatorial optimization problems
• It is similar to breadth-first search, as it progresses level by level
• Only the most promising nodes at each level of the search tree are selected for further branching, while the remaining nodes are pruned off permanently
• Beam search was first used in the artificial intelligence community for speech recognition and image understanding problems
• The running time is polynomial in the problem size
ANALYSIS Algorithm 
a) Compute the classifier performance using each of the n features 
individually (n 1-tuples) 
b) Select the best K (beam-width) features based on a pre-defined 
selection criterion among these 1-tuples 
c) Add a new feature to each of these K features, forming K(n−1) 
2-tuples of features. The tuple-size t is equal to 2 at this stage 
d) Evaluate the performance of each of these t-tuples. Of these, 
select the best K, based on classification performance. 
ANALYSIS Algorithm
e) Form all possible (t + 1)-tuples by appending these K t-tuples with other features (not already in that t-tuple)
f) Repeat steps d) to e) until the stopping criterion is met; the tuple size at this stage is m
g) The best K m-tuples are the result of beam search
ANALYSIS Pseudo Code
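This pseudo-code slide was also an image lost in extraction. Here is a minimal Python sketch of steps a) to g), assuming an evaluate function that scores a feature subset (e.g., cross-validated classifier accuracy); unlike the tree on the slides, the candidate set below removes duplicate subsets automatically:

```python
def beam_search(n, m, K, evaluate):
    """Greedy beam search over feature subsets of {1..n}.

    Grows tuples one feature at a time, keeping only the best K subsets
    at each level, until tuples of size m are reached.
    evaluate maps a frozenset of feature indices to a score (higher is better).
    Returns the best K m-tuples as (score, sorted_features) pairs, best first.
    """
    features = range(1, n + 1)
    # Steps a)-b): score all n 1-tuples and keep the best K.
    beam = sorted((frozenset({f}) for f in features),
                  key=evaluate, reverse=True)[:K]
    # Steps c)-f): extend every kept tuple by one unused feature per level.
    for _ in range(m - 1):
        candidates = {s | {f} for s in beam for f in features if f not in s}
        beam = sorted(candidates, key=evaluate, reverse=True)[:K]
    # Step g): the best K m-tuples.
    return [(evaluate(s), sorted(s)) for s in beam]
```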
ANALYSIS Easy 5 Steps
1. Start with the empty set (no features) and evaluate the value of each feature individually
– Values can be calculated using a criterion function or by evaluating classifier performance
2. Choose a value for the beam width (K)
– K defines the number of subsets to be carried to the next level
3. Carry the best K subsets to the next level
– A cut-off value can be checked before the selection of the best subsets
4. Add a new feature (previously not used) to each of these selected feature subsets
5. Repeat the process until the tree reaches the target subset size
– Or a stopping criterion can be defined to terminate the process
EXAMPLE
How to reduce 5 features to 3 features using the Beam Search algorithm?
{1, 2, 3, 4, 5} → { ?, ?, ? }
Finding the best 3 features from the full set of features
EXAMPLE Step 1
• Start with the empty subset { } and evaluate each feature individually (assume the values as follows):
{1} = 30, {2} = 14, {3} = 28, {4} = 16, {5} = 25
EXAMPLE Step 2
• Select the best K (beam width) features based on a pre-defined selection criterion (assume K = 3)
• Best features: { 1 } (30), { 3 } (28), { 5 } (25)
EXAMPLE Step 3
• Add a new feature to each of these selected features. Order is not important; duplications cannot be avoided.
From { 1 }: adding 2, 3, 4, 5 gives values 31, 35, 60, 39
From { 3 }: adding 1, 2, 4, 5 gives values 30, 55, 40, 50
From { 5 }: adding 1, 2, 3, 4 gives values 34, 35, 34, 48
EXAMPLE Step 4
• Choose the best K performing subsets among the new feature sets
• Best features: { 1,4 } (60), { 3,2 } (55), { 3,5 } (50)
EXAMPLE Step 5
• Carry the best K performing subsets to the next level by adding the rest of the available features:
From { 1,4 }: adding 2, 3, 5 gives values 45, 40, 70
From { 3,2 }: adding 1, 4, 5 gives values 56, 58, 88
From { 3,5 }: adding 1, 2, 4 gives values 67, 62, 75
EXAMPLE Step 6
• The tree has reached 3 features, which is the target subset size; the maximum value gives the best feature set
• Best features: { 1,4,5 } (70), { 3,2,5 } (88), { 3,5,4 } (75)
• The maximum, { 3,2,5 } with value 88, is the best feature subset
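Replaying this example exactly with the beam_search sketch above is awkward, because the slide's illustrative scores give duplicated subsets different values (e.g., {1,3} is scored 35 under { 1 } but 30 under { 3 }), which a set-valued criterion cannot reproduce. Instead, here is a hypothetical criterion (invented for illustration; only the five singleton scores come from Step 1, and the bonus of 14 is chosen so that {1,4} scores 60 as above) showing the calling pattern:

```python
# Singleton scores from Step 1; the interaction bonus is invented.
BASE = {1: 30, 2: 14, 3: 28, 4: 16, 5: 25}

def evaluate(subset):
    score = sum(BASE[f] for f in subset)
    if {1, 4} <= subset:  # assumed synergy between features 1 and 4
        score += 14
    return score

print(beam_search(n=5, m=3, K=3, evaluate=evaluate))
# -> [(88, [1, 3, 4]), (85, [1, 4, 5]), (83, [1, 3, 5])]
```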
APPLICATIONS Beam Search
• Beam Search for Feature Selection in Automatic SVM Defect Classification
– In this paper, beam search is implemented with a support vector machine (SVM) classifier to select the candidate subsets
– Improvements have been proposed to the beam search algorithm for feature selection, and the modified version is called Smart Beam Search (SBS)
– Each defect in the data set is described by a high-dimensional feature vector consisting of about 100 features
APPLICATIONS Beam Search
– The data set comprises about 3000 images with 13 defect classes; results are presented for beam widths K = 2 and K = 5
– The SBS feature selection approach reduced the dimensionality of the feature space and increased the classifier performance
[Figure: Overall accuracy using features selected by SBS]
OBSERVATIONS & RECOMMENDATIONS
• There is no backtracking, since the intent of this technique is to search quickly
• Therefore, beam search methods are not guaranteed to find an optimal solution and cannot recover from wrong decisions
• Duplications cannot be avoided in the tree
• If a node leading to the optimal solution is discarded during the search process, there is no way to reach that optimal solution afterwards
• The beam width parameter K is fixed before the search starts
• A wider beam allows greater safety, but it increases the computational cost
COMPARISON

Branch and Bound | Beam Search
Follows a depth-first strategy | Similar to breadth-first search
Guaranteed to find the optimal feature subset | Not guaranteed to find the optimal feature subset
An exponential search | Polynomial running time in the problem size
Backtracking needed to prune unnecessary subsets | No backtracking process needed
Needs additional computations to backtrack after constructing the tree | No additional computations needed after constructing the tree
Must fulfill the monotonicity property | No need to consider the monotonicity property
Duplicate subsets are omitted | Duplications cannot be avoided
REFERENCES
1. Narendra, P. M., & Fukunaga, K. (1977). A branch and bound algorithm for feature subset selection. IEEE Transactions on Computers, C-26(9), 917-922.
2. Somol, P., Pudil, P., & Kittler, J. (2004). Fast branch & bound algorithms for optimal feature selection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(7), 900-912.
3. Burrell, L., Smart, O., Georgoulas, G. K., Marsh, E., & Vachtsevanos, G. J. (2007, June). Evaluation of feature selection techniques for analysis of functional MRI and EEG. In DMIN (pp. 256-262).
4. Gupta, P., Doermann, D., & DeMenthon, D. (2002). Beam search for feature selection in automatic SVM defect classification. In Proceedings of the 16th International Conference on Pattern Recognition (Vol. 2, pp. 212-215). IEEE.
5. Dashti, M. T., & Wijs, A. J. (2007). Pruning state spaces with extended beam search. In Automated Technology for Verification and Analysis (pp. 543-552). Springer Berlin Heidelberg.
6. Valente, J., & Alves, R. A. (2005). Filtered and recovering beam search algorithms for the early/tardy scheduling problem with no idle time. Computers & Industrial Engineering, 48(2), 363-375.