Branch and Bound Feature Selection for Hyperspectral Image Classification

Feature selection (FS) is a classical combinatorial problem in pattern recognition and data mining. It is of major importance in classification and regression scenarios. In this paper, a hybrid approach that combines branch-and-bound (BB) search with Bhattacharyya distance based feature selection is presented for classifying hyperspectral data using Support Vector Machine (SVM) classifiers. The performance of this hybrid approach is compared to another hybrid approach that uses genetic algorithm (GA) based feature selection in place of BB, and to baseline SVMs with no feature reduction. Experimental results using hyperspectral data show that under small-sample-size conditions, the BB approach outperforms both the GA approach and the SVM with no feature selection.

  1. BRANCH AND BOUND BASED FEATURE ELIMINATION FOR SUPPORT VECTOR MACHINE BASED CLASSIFICATION OF HYPERSPECTRAL IMAGES
     Sathishkumar Samiappan, Saurabh Prasad, Lori M. Bruce & Eric Hansen
     Mississippi State University
  2. INTRODUCTION
     • Hyperspectral images (HSI) are widely used for ground-cover classification problems.
     • The problem is challenging because of 1) the high-dimensional feature space and 2) the high correlation between successive features (bands).
     • In the last decade, Support Vector Machines (SVMs) have been shown to perform well on this problem.
     • Traditional view of SVMs: they can handle high dimensionality, so feature selection (FS) is not required.
     • Recently, Waske et al. showed that FS can improve the classification performance of SVMs.
  3. MOTIVATION
     Feature selection algorithms fall into two families:
     • FS based on metrics such as Bhattacharyya distance, mutual information and correlation
       – Easy to compute, but often sub-optimal
     • FS based on search: exhaustive search, or search guided by intelligence
     Can the two be married? A HYBRID approach?
  4. SELECTION OF ALGORITHMS
     • Rank-based approach: feature selection based on Bhattacharyya distance (BD) and correlation.
     • Features are ranked in descending order of their BDs and correlations, and the first m features are selected.
     • BDs are widely used for selecting features in hyperspectral image classification.
     • Bhattacharyya distance between two Gaussian classes, where µi and µj are the class means and ∑i and ∑j their covariances:
       B(i,j) = (1/8)(µi − µj)ᵀ[(∑i + ∑j)/2]⁻¹(µi − µj) + (1/2) ln( |(∑i + ∑j)/2| / √(|∑i||∑j|) )
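A minimal sketch of this rank-based step, assuming per-class Gaussian statistics estimated from samples; the function names and synthetic usage below are illustrative, not the authors' code:

```python
import numpy as np

def bhattacharyya_distance(mu_i, mu_j, cov_i, cov_j):
    """Bhattacharyya distance between two Gaussian class models."""
    cov = (cov_i + cov_j) / 2.0
    diff = mu_i - mu_j
    # First term: Mahalanobis-like separation of the class means
    term1 = 0.125 * diff @ np.linalg.solve(cov, diff)
    # Second term: penalty for differing covariance structure
    _, logdet = np.linalg.slogdet(cov)
    _, logdet_i = np.linalg.slogdet(cov_i)
    _, logdet_j = np.linalg.slogdet(cov_j)
    term2 = 0.5 * (logdet - 0.5 * (logdet_i + logdet_j))
    return term1 + term2

def rank_features_by_bd(X_i, X_j):
    """Rank individual features (bands) by their per-band Bhattacharyya
    distance between two classes, most discriminative first."""
    scores = []
    for f in range(X_i.shape[1]):
        mu_i, mu_j = X_i[:, f].mean(), X_j[:, f].mean()
        var_i, var_j = X_i[:, f].var(), X_j[:, f].var()
        v = (var_i + var_j) / 2.0
        bd = (0.125 * (mu_i - mu_j) ** 2 / v
              + 0.5 * np.log(v / np.sqrt(var_i * var_j)))
        scores.append(bd)
    return np.argsort(scores)[::-1]  # descending order of BD
```

The first m entries of the returned ranking correspond to the "select the first m features" step on the slide.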
  5. SELECTION OF ALGORITHMS
     • Search approach: Branch and Bound (B&B) search and Genetic Algorithms (GA).
     • Branch and Bound is a modification of simple tree search with backtracking.
     • With a good estimate of the bound, B&B reaches the same solution as exhaustive search at a fraction of the cost.
     • The Genetic Algorithm is a very popular optimization procedure inspired by natural evolution.
     • A GA is essentially a guided random search; it converges toward good solutions, though finite-time optimality is not guaranteed.
  6. PROPOSED HYBRID APPROACH
  7. OBJECTIVE
     • Remove a subset of features such that the remaining features achieve the best performance during SVM classification.
     • Bhattacharyya distance and correlation: rank the features by their usefulness in discriminating between classes.
     • Branch and Bound or Genetic Algorithms: select a subset of lower-ranked features to remove from the feature set.
     • Build an elimination strategy over different combinations of the lower-ranked features.
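The hybrid strategy can be sketched as: keep the top-ranked features outright, then search only over the m lowest-ranked features for the removals that maximize classifier performance. This sketch uses an exhaustive search over the tail for clarity; in the proposed approach B&B or a GA replaces that loop. The `evaluate` callable stands in for SVM classification accuracy, and all names here are illustrative:

```python
from itertools import combinations

import numpy as np

def hybrid_eliminate(scores, m, evaluate):
    """Keep the highest-ranked features, then search subsets of the m
    lowest-ranked features for the removal that maximizes evaluate(keep)."""
    order = np.argsort(scores)[::-1]          # descending rank
    head, tail = list(order[:-m]), list(order[-m:])
    best_keep, best_val = head + tail, -np.inf
    for r in range(m + 1):                    # try removing 0..m tail features
        for drop in combinations(tail, r):
            keep = head + [f for f in tail if f not in drop]
            val = evaluate(keep)
            if val > best_val:
                best_val, best_keep = val, keep
    return best_keep
```

Because the search space is only the low-ranked tail (2^m subsets rather than 2^q), the hybrid is far cheaper than searching all features.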
  8. BRANCH AND BOUND
     • B&B is a general algorithmic strategy used to solve optimization problems.
     • It divides the problem to be solved into a number of sub-problems.
     • Instead of solving all the sub-problems, B&B first finds one viable solution and records its value as the bound.
     • Any subsequent partial solution is abandoned as soon as its cost reaches the bound.
     • If a better solution is found, the bound is updated.
     • In this way many sub-problems can safely be left unsolved.
  9. EXAMPLE: BRANCH AND BOUND
     Total number of features q = 6; features to be removed m = 4; features selected = (q − m) = 2.
     Step 1: Rank the features in descending order of importance and set the initial bound B = B0.
     Step 2: Compute the criterion B1 at node 1. If B1 < B, the entire subtree below that node can be discarded: backtrack and evaluate the next node (B2). Otherwise, continue down the branch.
  10. EXAMPLE: BRANCH AND BOUND
      Maximum depth of the tree = m. When the search reaches depth m, compute the new bound for that path; if the new bound is better, update the bound B, otherwise backtrack.
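The worked example above can be sketched generically. This is an illustrative B&B subset search over an additive per-feature score (a simple monotone stand-in for the Bhattacharyya criterion), not the authors' implementation; it prunes any branch whose optimistic estimate cannot beat the current bound:

```python
def bb_select(scores, keep):
    """Branch-and-bound search for the best `keep`-subset of features,
    maximizing the summed per-feature score. Returns the chosen indices."""
    q = len(scores)
    best = [-float("inf"), None]  # [current bound B, best subset found]

    def recurse(start, chosen, value):
        # Optimistic estimate: current value plus the best scores still
        # available. If it cannot exceed the bound, discard this subtree.
        need = keep - len(chosen)
        optimistic = value + sum(sorted(scores[start:], reverse=True)[:need])
        if optimistic <= best[0]:
            return  # prune (backtrack)
        if len(chosen) == keep:
            best[0], best[1] = value, list(chosen)  # depth m reached: update B
            return
        if q - start < need:
            return  # not enough features left to complete a subset
        for f in range(start, q):
            chosen.append(f)
            recurse(f + 1, chosen, value + scores[f])
            chosen.pop()

    recurse(0, [], 0.0)
    return best[1]
```

With a good initial bound, the pruning test discards most subtrees while still returning the same subset an exhaustive search would.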
  11. ALGORITHM
  12. GENETIC ALGORITHMS
      Parameters used for the GA:
      • Fitness function: multiclass SVM (Spider implementation) with RBF kernel, sigma = 0.5
      • Number of generations = 20
      • Chromosome length = 50
      • Population size = 30
      • Crossover probability = 0.6
      • Mutation probability = 0.003
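A minimal bitstring GA using the run parameters listed above can be sketched as follows. The `fitness` callable stands in for the SVM cross-validation accuracy used in the talk; all identifiers are illustrative assumptions, not the authors' code:

```python
import random

def ga_select(fitness, n_features, pop_size=30, generations=20,
              p_cross=0.6, p_mut=0.003, seed=0):
    """Bitstring GA for feature selection: each chromosome is a boolean
    mask over the features (chromosome length = n_features; the talk
    used length 50). Returns the best mask found."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)[:]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        if fitness(scored[0]) > fitness(best):
            best = scored[0][:]
        parents = scored[: pop_size // 2]  # truncation selection
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            if rng.random() < p_cross:     # single-point crossover
                cut = rng.randrange(1, n_features)
                child = a[:cut] + b[cut:]
            else:
                child = a[:]
            # bit-flip mutation with probability p_mut per gene
            child = [(not g) if rng.random() < p_mut else g
                     for g in child]
            children.append(child)
        pop = children
    return max(pop + [best], key=fitness)
```

Unlike B&B, the GA re-evaluates the fitness (an SVM run in the talk) for every chromosome in every generation, which is what makes it the more expensive of the two searches.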
  13. RESULTS
      • Dataset used: AVIRIS Indian Pines, with 220 spectral features (bands).
      • Seven-class data with 200 training samples from each class.
      • Classes: corn no-till, corn min-till, grass pasture, hay windrowed, soybeans no-till, soybeans clean, and woods.
  14. DISCUSSION
      • Pros
        – A compromise between rank-based FS and exhaustive search
        – Computationally efficient compared to GA and other search techniques
        – Potential to significantly improve SVM performance
        – Robust with small sample sizes (few training samples)
      • Cons
        – Potential for overtraining
  15. Thank you. Queries, comments, suggestions:
      Sathishkumar Samiappan, sathish@gri.msstate.edu
