Embedded Feature Selection
of Hyperspectral Bands with
Boosted Decision Trees
Sildomar Monteiro and Richard Murphy

The University of Sydney
Rio Tinto Centre for Mine Automation
    • Totally Autonomous Mine in 10 years:
        – Brings together all elements of systems, perception,
          machine learning, data fusion and more
        – A grand challenge for Field Robotics
    • Driven by safety, predictability and efficiency




     Dr Sildomar Monteiro                   IGARSS 2011          2
Goal: Mine Picture Compilation
• Provide a complete and accurate model of the mine
    – Mine planning and better prediction outcomes
• Maintain and update a multi-scale probabilistic
  representation
    –   Geology
    –   Geometry
    –   Equipment
    –   And other properties of interest for the mining process




Today
    [Figure: geology feedback to batch, with panels showing cone logging and floor mapping using ripped trench sections]
Geology (ground-truth)




Mine Face Scanning




    [Nieto, Viejo and Monteiro, 2010]
Hyperspectral sensing for mining
• Geology classification (material identification) still has
  many challenges
• Environmental conditions
    – Illumination, temperature, dust
• Timely data acquisition and processing are needed
    – Algorithms and calibration
• High spectral similarity between (ore-bearing) rock
  types
    – Few, if any, distinctive spectral features




Outline

• Hyperspectral classification using Boosting

• Embedded band selection

• Experiments using iron ore data




Hyper-spectral Sensors




    [Figure: multispectral vs hyperspectral image cubes (Band 1 to Band n); sensor ranges: VisNIR 400-970 nm, SWIR 970-2500 nm]
Example of Classification and Spectra




    [Figure: classification example with reflectance spectra in panels a to d; axes: Wavelength (nm), 500 to 2250, vs Reflectance]
Hyperspectral Band Selection
• Feature Selection (vs Dimensionality Reduction)
    – Remove correlated inputs
    – Physical interpretation (band wavelengths)
• Faster data processing
• Possibly faster data acquisition
• Can be tailored to the application
• Indicates candidate multispectral bands




Boosting
• Sound theoretical foundation
    – Additive Logistic Regression [Friedman, 2000]
• Empirical studies show that boosting
    – Yields small classification error rates
    – Is very resilient to overfitting
• State-of-the-art results in many applications, e.g. face
  recognition in computer vision
• The idea of Boosting is to train many “weak” learners
  on various distributions (or sets of weights) of the input
  data and then combine the resulting classifiers into a
  single “committee”
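
The committee idea in the last bullet can be sketched in a few lines. The example below is only an illustration using discrete AdaBoost with one-split decision stumps. The slides cite additive logistic regression [Friedman, 2000] but do not say which boosting variant was used, so the choice of AdaBoost and the names `fit_stump`, `adaboost` and `predict` are assumptions.

```python
import numpy as np

def fit_stump(X, y, w):
    """Return (weighted error, feature, threshold, sign) of the best stump.

    A stump predicts `sign` where X[:, feature] > threshold and `-sign` elsewhere.
    """
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            pred = np.where(X[:, j] > thr, 1, -1)
            for sign in (1, -1):
                err = np.sum(w[sign * pred != y])
                if err < best[0]:
                    best = (err, j, thr, sign)
    return best

def adaboost(X, y, n_rounds=10):
    """Discrete AdaBoost (assumed variant); labels in y must be -1 or +1."""
    n = len(y)
    w = np.full(n, 1.0 / n)              # start from uniform sample weights
    committee = []
    for _ in range(n_rounds):
        err, j, thr, sign = fit_stump(X, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)   # avoid log(0) on perfect stumps
        alpha = 0.5 * np.log((1 - err) / err)  # vote of this weak learner
        pred = sign * np.where(X[:, j] > thr, 1, -1)
        w *= np.exp(-alpha * y * pred)         # up-weight misclassified samples
        w /= w.sum()
        committee.append((alpha, j, thr, sign))
    return committee

def predict(committee, X):
    """Weighted vote of all stumps in the committee."""
    score = sum(a * s * np.where(X[:, j] > t, 1, -1) for a, j, t, s in committee)
    return np.sign(score)
```

Each round re-weights the training samples so that the next stump concentrates on the samples the current committee gets wrong.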


Decision Trees
• Advantages:
    – Robustness and interpretability
• Disadvantages
    – Low accuracy and high variance
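• Binary decision trees (split variable \phi, threshold \theta, leaf outputs b and a + b)
     $$f(x; \phi, \theta, a, b) = a\,\delta(x_\phi > \theta) + b$$
• Boosted trees
    – Accurate, robust and interpretable
     $$G(x) = \operatorname{sign}\left( \sum_{m=1}^{M} \alpha_m f_m(x) \right)$$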
Embedded Feature Selection
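• Relative importance of input variables
     $$I_j = \left( E_x\!\left[ \left( \frac{\partial \hat{F}(x)}{\partial x_j} \right)^{2} \right] \cdot \mathrm{var}_x[x_j] \right)^{1/2}$$

• Approximation for decision trees (heuristic)
  [Friedman, 1999]
     $$\hat{I}_j^2(T) = \sum_{t=1}^{J-1} \hat{i}_t^2 \, 1(\phi(t) = j)$$
  where J is the number of leaves of tree T and \phi(t) is the split variable of internal node t

• Least-squares improvement criterion
     $$i^2(R_l, R_r) = \frac{w_l w_r}{w_l + w_r} \left( \bar{y}_l - \bar{y}_r \right)^2$$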
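
As a rough illustration of the least-squares improvement criterion and the per-tree sum above, the following sketch computes the improvement of a single split and accumulates it per split variable. The function names and the `(feature_index, improvement)` representation of a tree are assumptions made for this example, not part of the slides.

```python
import numpy as np

def split_improvement(y, w, go_left):
    """i^2(R_l, R_r) = w_l * w_r / (w_l + w_r) * (ybar_l - ybar_r)^2.

    y: responses, w: sample weights, go_left: boolean mask of the left region.
    """
    w_l, w_r = w[go_left].sum(), w[~go_left].sum()
    if w_l == 0 or w_r == 0:
        return 0.0                       # degenerate split, no improvement
    ybar_l = np.average(y[go_left], weights=w[go_left])
    ybar_r = np.average(y[~go_left], weights=w[~go_left])
    return w_l * w_r / (w_l + w_r) * (ybar_l - ybar_r) ** 2

def tree_importance(splits, n_features):
    """Squared importance per feature for one tree.

    splits: list of (feature_index, improvement) pairs, one per internal
    node (hypothetical representation of a fitted tree).
    """
    imp = np.zeros(n_features)
    for j, i2 in splits:
        imp[j] += i2                     # only nodes that split on j contribute
    return imp
```

A feature that is never used in a split receives zero importance, which is what makes the ranking usable for band selection.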

Embedded Feature Selection (cont.)
• Boosted Decision Trees
     $$\hat{I}_j^2 = \frac{1}{M} \sum_{m=1}^{M} \hat{I}_j^2(T_m)$$

• The Multi-class case
     $$\hat{I}_j = \frac{1}{K} \sum_{k=1}^{K} \hat{I}_{jk}$$
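
The two averages can be combined in one small function. This sketch assumes one boosted ensemble of M trees per class (K ensembles in total), which the slides do not state explicitly, and that the per-tree squared importances are already available as an array. The function name is hypothetical.

```python
import numpy as np

def multiclass_importance(sq_importances):
    """Average importance over trees, then over classes.

    sq_importances: array of shape (K, M, n_features) holding the squared
    importance of each feature in tree m of the ensemble for class k
    (assumed layout).
    """
    sq = np.asarray(sq_importances, dtype=float)
    per_class = np.sqrt(sq.mean(axis=1))   # importance per class, shape (K, n_features)
    return per_class.mean(axis=0)          # average over the K classes
```

The square root is taken per class before averaging because the multi-class formula averages importances, not squared importances.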




Experiments
• Hyperspectral data acquired using a field
  spectrometer (ASD)
    – 429 bands (same as hyperspectral camera)
    – Wavelengths from 350 nm to 2500 nm
• Samples of ore-bearing rocks
    – Martite, goethite, kaolinite, etc. (9 classes in total)
    – Different illumination and physical conditions (direct sunlight,
      shadow and viewing angles)
• Methodology of experiments
    – Metrics: accuracy, precision, recall, F-score, Kappa, AUC
    – 4-fold cross-validation
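
To make the evaluation protocol concrete, the sketch below splits sample indices into 4 folds and computes accuracy and Cohen's kappa from a confusion matrix. It is a generic illustration: the shuffling, the seed and all function names are assumptions, and the classifier itself is left out.

```python
import numpy as np

def kfold_indices(n_samples, k=4, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def accuracy_and_kappa(cm):
    """Observed agreement p_o and Cohen's kappa = (p_o - p_e) / (1 - p_e)."""
    n = cm.sum()
    p_o = np.trace(cm) / n
    # chance agreement from the row and column marginals
    p_e = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2
    return p_o, (p_o - p_e) / (1 - p_e)
```

In each of the 4 rounds one fold serves as the test set and the other three as the training set. The metrics are then averaged over the rounds.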




Hyperspectral data set




Information in spectra

    [Figure: reflectance (0.0 to 0.8) vs wavelength (500 to 2500 nm) for samples_644-17_1_00000.asd.ref and samples_644-17_1_00035.asd.ref, with the VisNIR and SWIR regions marked]
Experimental Results: 9 rock types
• Relative importance of features
    [Plot: relative importance (%) vs wavelength (nm), 400 to 2400 nm]
• Normalized count of features
    [Plot: normalized feature count (%) vs wavelength (nm), 400 to 2400 nm]
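
The count-based ranking can be sketched as follows. The slides do not define the normalization, so scaling the most frequently used band to 100% is an assumption, as are the function names. The same top-k selection can be applied to relative-importance scores.

```python
import numpy as np

def normalized_count(split_features, n_features):
    """Percentage score per band from how often it is used as a split variable.

    split_features: band index of every split in every tree of the ensemble.
    Assumes at least one split, so that the maximum count is non-zero.
    """
    counts = np.bincount(split_features, minlength=n_features).astype(float)
    return 100.0 * counts / counts.max()

def select_top_bands(scores, k):
    """Indices of the k highest-scoring bands, best first."""
    return np.argsort(scores)[::-1][:k]
```

Counting treats every split equally, whereas relative importance weights each split by how much it improves the fit.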
Experimental Results: 9 rock types
• Classification performance of selected features
    [Bar chart: Accuracy, F-score, Kappa and AUC (scale 0.1 to 0.9) for the Relative Importance and Normalized Count selections]
Experimental Results
• All 9 classes




• Martite




Summary
• Boosting increases the performance of decision trees
  while keeping model interpretability
• We presented two approaches to perform feature
  selection using boosted decision trees
• Calculating the relative importance of features was
  more efficient than counting features
• The reduced feature set predicts the classes
  accurately, and more efficiently than using all features




Conclusions
• The standard learning procedure of boosted decision
  trees can perform feature selection automatically
• The feature selection is embedded in the internal
  structure of the model, so there is no need for extra
  parameters or separate selection algorithms
• Instability of the models can be an issue
• Future work: how to determine the optimal number of
  features (using statistical tests)




When Things Don’t Work...




