IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results
Published in Education
Transcript

  • 1. Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results. Albert Orriols-Puig (1,2), Kumara Sastry (2), David E. Goldberg (2), Ester Bernadó-Mansilla (1). (1) Research Group in Intelligent Systems, Enginyeria i Arquitectura La Salle, Ramon Llull University. (2) Illinois Genetic Algorithms Laboratory, Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign
  • 2. Motivation. Nearly decomposable functions in engineering systems (Simon, 69; Gibson, 79; Goldberg, 02). Design decomposition principle in: – Genetic Algorithms (GAs) (Goldberg, 02; Pelikan, 05; Pelikan, Sastry & Cantú-Paz, 06) – Genetic Programming (GP) (Sastry & Goldberg, 03) – Learning Classifier Systems (LCS) (Butz et al., 06; Llorà et al., 06). Enginyeria i Arquitectura la Salle Slide 2 GRSI
  • 3. Aim. Our approach: build surrogates for classification. – Probabilistic models: induce the form of a surrogate – Regression: estimate the coefficients of the surrogate – Use the surrogate to classify new examples. Surrogates used for efficiency enhancement of: – GAs: (Sastry, Pelikan & Goldberg, 2004); (Pelikan & Sastry, 2004); (Sastry, Lima & Goldberg, 2006) – LCSs: (Llorà, Sastry, Yu & Goldberg, 2007)
  • 4. Outline. 1. Methodology 2. Implementation 3. Test Problems 4. Results 5. Discussion 6. Conclusions
  • 5. Overview of the Methodology. Three layers: – Structural Model Layer: extract dependencies between attributes, expressed as linkage groups, matrices, or Bayesian networks (example: x-or) – Surrogate Model Layer: infer the structural form and the coefficients based on the structural model (Sastry, Lima & Goldberg, 2006) – Classification Model Layer: use the surrogate function to predict the class of new examples
  • 6. Surrogate Model Layer. Input: database of labeled examples with nominal attributes. Substructure identified by the structural model layer – Example: [0 2] [1]. Surrogate form induced from the structural model: $f(x_0, x_1, x_2) = c_I + c_0 x_0 + c_1 x_1 + c_2 x_2 + c_{0,2} x_0 x_2$. Coefficients of the surrogate: linear regression methods
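A minimal sketch of how the surrogate form on slide 6 could be derived from the linkage groups. It assumes that each group contributes one monomial per non-empty subset of its variables; the slide only shows this for a two-variable group, so the behaviour for larger groups is an assumption. The names `surrogate_terms` and `format_term` are hypothetical.

```python
from itertools import combinations

def surrogate_terms(linkage_groups):
    """List the monomials of the surrogate as tuples of variable indices.

    Assumption: every non-empty subset of a linkage group yields one term.
    The empty tuple stands for the constant term c_I.
    """
    terms = [()]
    for group in linkage_groups:
        for size in range(1, len(group) + 1):
            terms.extend(combinations(sorted(group), size))
    return terms

def format_term(term):
    """Render one monomial together with its coefficient name."""
    if not term:
        return "c_I"
    index = ",".join(str(i) for i in term)
    variables = "*".join(f"x{i}" for i in term)
    return f"c_{index}*{variables}"

# Structural model from slide 6: [0 2] [1]
terms = surrogate_terms([[0, 2], [1]])
print(" + ".join(format_term(t) for t in terms))
```

For [0 2] [1] this lists the same five terms as the slide (constant, x0, x1, x2, and x0·x2), only in a different order.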
  • 7. Surrogate Model Layer. Map input examples to a schema-index matrix – Consider all the schemas: {0*0, 0*1, 1*0, 1*1, *0*, *1*} for [0 2] [1] – In general, the number of schemas to be considered is $m_{sch} = \sum_{i=1}^{m} \prod_{j=1}^{k_i} \chi_{i,j}$, where $m$ is the number of linkage groups, $k_i$ the number of variables in the ith linkage group, and $\chi_{i,j}$ the cardinality of the jth variable of the ith linkage group
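The schema set on slide 7 can be enumerated mechanically. The sketch below is one possible way to do it, assuming attribute values are coded as integers 0..cardinality-1 and printed as single characters (so cardinalities up to 10); `enumerate_schemas` is a hypothetical helper name.

```python
from itertools import product

def enumerate_schemas(linkage_groups, cardinalities):
    """Return every schema of every linkage group as a string with '*' wildcards."""
    n_vars = len(cardinalities)
    schemas = []
    for group in linkage_groups:
        value_ranges = [range(cardinalities[var]) for var in group]
        # One schema per combination of values of the group's variables.
        for values in product(*value_ranges):
            symbols = ["*"] * n_vars
            for var, val in zip(group, values):
                symbols[var] = str(val)
            schemas.append("".join(symbols))
    return schemas

schemas = enumerate_schemas([[0, 2], [1]], cardinalities=[2, 2, 2])
print(schemas)       # ['0*0', '0*1', '1*0', '1*1', '*0*', '*1*']
print(len(schemas))  # 6
```

The count is the sum over groups of the product of the cardinalities inside each group, here 2·2 + 2 = 6.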
  • 8. Surrogate Model Layer. Map each example to a binary vector of size m_sch – For each building block, the entry corresponding to the schema contained in the example is one; all the others are zero. Example for [0 2] [1]:
    example  0*0  0*1  1*0  1*1  *0*  *1*
    010       1    0    0    0    0    1
    100       0    0    1    0    1    0
    111       0    0    0    1    0    1
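The mapping on slide 8 can be written as a small encoding function. This is a sketch under the assumption that the schemas of a group are ordered with the group's first variable as the most significant digit, which matches the column order 0*0, 0*1, 1*0, 1*1 on the slide; `encode` is a hypothetical name.

```python
def encode(example, linkage_groups, cardinalities):
    """Return the binary schema-indicator vector of one example."""
    vector = []
    for group in linkage_groups:
        index = 0  # mixed-radix index of the schema matched inside this group
        size = 1   # number of schemas of this group
        for var in group:
            index = index * cardinalities[var] + example[var]
            size *= cardinalities[var]
        block = [0] * size
        block[index] = 1  # exactly one schema per group matches the example
        vector.extend(block)
    return vector

groups = [[0, 2], [1]]
cards = [2, 2, 2]
for example in [(0, 1, 0), (1, 0, 0), (1, 1, 1)]:
    print(example, encode(example, groups, cards))
```

The three printed vectors reproduce the rows of the table for the examples 010, 100, and 111.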
  • 9. Surrogate Model Layer. This results in a matrix with one row for each of the n input examples (the matrix itself is not reproduced in this transcript). Each class is mapped to an integer
  • 10. Surrogate Model Layer. Solve the following linear system to estimate the coefficients of the surrogate model: $A x = y$, where $A$ holds the input instances mapped to the schemas matrix, $x$ the coefficients of the surrogate model, and $y$ the class or label of each input instance. To solve it, we use a multi-dimensional least squares fitting approach: – Minimize: $\|A x - y\|^2$
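As an illustration of the fitting step on slide 10, the sketch below solves the least squares problem through the normal equations in plain Python. Two things are assumptions rather than the authors' method: the tiny ridge term (added because the indicator columns of each building block sum to one, which makes the schema matrix rank deficient) and the hand-written Gaussian elimination, which stands in for whatever numerical library was actually used. The rows of `A` are the three encoded examples of slide 8; the labels in `y` are made up.

```python
def solve_linear(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - tail) / aug[r][r]
    return z

def fit_least_squares(A, y, ridge=1e-8):
    """Minimize ||A x - y||^2 + ridge * ||x||^2 via the normal equations.

    The ridge term is an assumption; it keeps the system solvable when
    columns of A are linearly dependent or a schema never occurs.
    """
    d = len(A[0])
    AtA = [[sum(row[i] * row[j] for row in A) + (ridge if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    Aty = [sum(row[i] * label for row, label in zip(A, y)) for i in range(d)]
    return solve_linear(AtA, Aty)

A = [[1, 0, 0, 0, 0, 1],   # example 010
     [0, 0, 1, 0, 1, 0],   # example 100
     [0, 0, 0, 1, 0, 1]]   # example 111
y = [0, 1, 1]              # hypothetical integer class labels
coefficients = fit_least_squares(A, y)
fitted = [sum(c * a for c, a in zip(coefficients, row)) for row in A]
print([round(v, 6) for v in fitted])
```

With only three examples the system is underdetermined, so the fitted values match the labels up to the small ridge bias.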
  • 12. Classification Model Layer. Use the surrogate to predict the class of a new input instance: map the new input example to a vector e of length m_sch according to the schemas, then compute Output = round(x·e), where x holds the coefficients of the surrogate model. Example with schemas 0*0 0*1 1*0 1*1 *0* *1*: input 001, e = 0 0 1 0 1 0, Output = x·(001010)'
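The prediction rule Output = round(x·e) can be sketched as follows. The coefficient vector is hypothetical (it encodes an x-or of variables 0 and 2 on the group [0 2]), and the example input 100 is the row from slide 8 whose encoding is 001010. Note that Python's built-in `round` resolves exact .5 ties to the nearest even integer, which may differ from the rounding used by the authors.

```python
def encode(example, linkage_groups, cardinalities):
    """Binary schema-indicator vector of one example (first variable most significant)."""
    vector = []
    for group in linkage_groups:
        index, size = 0, 1
        for var in group:
            index = index * cardinalities[var] + example[var]
            size *= cardinalities[var]
        block = [0] * size
        block[index] = 1
        vector.extend(block)
    return vector

def classify(example, coefficients, linkage_groups, cardinalities):
    """Predict the integer class as round(x . e)."""
    e = encode(example, linkage_groups, cardinalities)
    score = sum(c * bit for c, bit in zip(coefficients, e))
    return round(score)

groups = [[0, 2], [1]]
cards = [2, 2, 2]
# Hypothetical coefficients for the schemas 0*0 0*1 1*0 1*1 *0* *1*
coefficients = [0.0, 1.0, 1.0, 0.0, 0.0, 0.0]

print(classify((1, 0, 0), coefficients, groups, cards))  # 1
print(classify((1, 1, 1), coefficients, groups, cards))  # 0
```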
  • 14. How do we obtain the structural model? Greedy extraction of the structural model for classification (gESMC): greedy search à la eCGA (Harik, 98)
    1. Start considering all variables independent: sModel = [1] [2] … [n].
    2. Divide the dataset in ten-fold cross-validation sets: train and test.
    3. Build the surrogate with the train set and evaluate with the test set.
    4. While there is improvement:
       1. Create all combinations of two-subset merges, e.g. { [1,2] .. [n]; [1,n] .. [2]; [1] .. [2,n] }.
       2. Build the surrogate for all candidates. Evaluate the quality with the test set.
       3. Select the candidate with minimum error.
       4. If the quality of the candidate is better than the parent, accept the model (update sModel) and go to 4.
       5. Otherwise, stop. sModel is optimal.
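The greedy loop on slide 14 can be sketched in plain Python as below. Several choices are assumptions and not taken from the slides: the quality measure is the mean squared error of the unrounded surrogate output on the test folds, example i is assigned to fold i % folds, a merge is accepted only if it improves the error by more than `tol`, and the fit uses ridge-regularized normal equations. Attributes are assumed binary, and all function names are hypothetical.

```python
from itertools import combinations, product

def encode(example, groups):
    """Binary schema-indicator vector (binary attributes assumed)."""
    vec = []
    for group in groups:
        idx = 0
        for var in group:
            idx = 2 * idx + example[var]
        block = [0] * (2 ** len(group))
        block[idx] = 1
        vec.extend(block)
    return vec

def solve(M, b):
    """Gaussian elimination with partial pivoting for a square system."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - tail) / aug[r][r]
    return z

def fit(A, y, ridge=1e-8):
    """Least squares via normal equations; the small ridge term is an assumption."""
    d = len(A[0])
    AtA = [[sum(r[i] * r[j] for r in A) + (ridge if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    Aty = [sum(r[i] * t for r, t in zip(A, y)) for i in range(d)]
    return solve(AtA, Aty)

def cv_error(groups, data, labels, folds):
    """Mean squared test error over the folds (example i is in fold i % folds)."""
    total = 0.0
    for f in range(folds):
        train = [i for i in range(len(data)) if i % folds != f]
        test = [i for i in range(len(data)) if i % folds == f]
        coef = fit([encode(data[i], groups) for i in train],
                   [labels[i] for i in train])
        for i in test:
            pred = sum(c * e for c, e in zip(coef, encode(data[i], groups)))
            total += (pred - labels[i]) ** 2
    return total / len(data)

def gesmc(data, labels, folds=10, tol=1e-9):
    model = [[v] for v in range(len(data[0]))]   # step 1: all variables independent
    best = cv_error(model, data, labels, folds)  # steps 2-3: evaluate the parent
    while len(model) > 1:                        # step 4: try all pairwise merges
        candidates = []
        for a, b in combinations(range(len(model)), 2):
            rest = [g for k, g in enumerate(model) if k not in (a, b)]
            cand = sorted(rest + [sorted(model[a] + model[b])])
            candidates.append((cv_error(cand, data, labels, folds), cand))
        err, cand = min(candidates, key=lambda t: t[0])
        if best - err <= tol:                    # no real improvement: stop
            break
        model, best = cand, err
    return model, best

# Toy problem: the class is the x-or of variables 0 and 1; variables 2-4 are irrelevant.
data = list(product((0, 1), repeat=5))
labels = [x[0] ^ x[1] for x in data]
model, err = gesmc(data, labels, folds=4)
print(model, err)
```

On this toy problem the search merges variables 0 and 1 in the first iteration and then stops, because no further merge lowers the test error by more than the tolerance. The toy run uses four folds only to keep every training fold large enough for a 32-example dataset.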
  • 15. Test Problems. Two-level hierarchical problems (Butz, Pelikan, Llorà & Goldberg, 2006)
  • 16. Test Problems. Problems in the lower level: – Position problem: for a condition of length l, the output is the position of the left-most one-valued bit. Example: 000110 : 3 – Parity problem: for a condition of length l, the output is the number of ones mod 2. Example: 01001010 : 1
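The two lower-level problems are easy to state in code. The sketch below assumes that the position is counted from the right end of the condition starting at 1, with 0 when no bit is set; this convention is inferred from the examples in the slides (000110 gives 3) and is not stated explicitly.

```python
def position(bits):
    """Position of the left-most 1, counted from the right starting at 1 (0 if none)."""
    for i, bit in enumerate(bits):
        if bit == "1":
            return len(bits) - i
    return 0

def parity(bits):
    """Number of ones modulo 2."""
    return bits.count("1") % 2

print(position("000110"))  # 3
print(parity("01001010"))  # 1
```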
  • 17. Test Problems. Problems in the higher level: – Decoder problem: for a condition of length l, the output is the integer value of the input. Example: 000110 : 5 – Count-ones problem: for a condition of length l, the output is the number of ones of the input. Example: 000110 : 2. Combinations of both levels: – Parity + decoder (hParDec) – Parity + count-ones (hParCount) – Position + decoder (hPosDec) – Position + count-ones (hPosCount)
  • 18. Results with 2-bit lower level blocks. First experiment: – hParDec, hPosDec, hParCount, hPosCount – Binary inputs of length 17 – The first 6 bits are relevant; the others are irrelevant – 3 blocks of two-bit low-order problems – E.g. hPosDec: input 10 01 00 00101010101, with interacting variables [x0 x1] [x2 x3] [x4 x5] followed by the irrelevant bits; the lower level outputs 2 1 0, so Output = 21
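The hPosDec example can be reproduced with a short sketch. Two conventions are assumptions inferred from the worked examples in the slides: the position is counted from the right starting at 1, and the decoder reads the lower-level outputs as digits in base (block size + 1), which turns 2 1 0 into 21. The function names are hypothetical.

```python
def position(bits):
    """Position of the left-most 1, counted from the right starting at 1 (0 if none)."""
    for i, bit in enumerate(bits):
        if bit == "1":
            return len(bits) - i
    return 0

def h_pos_dec(bits, block_size, n_relevant=6):
    """Two-level problem: position on each block, then decoder on the block outputs."""
    relevant = bits[:n_relevant]  # the remaining bits are irrelevant
    blocks = [relevant[i:i + block_size]
              for i in range(0, n_relevant, block_size)]
    digits = [position(block) for block in blocks]
    base = block_size + 1         # assumption: one digit value per position 0..block_size
    value = 0
    for digit in digits:
        value = value * base + digit
    return digits, value

print(h_pos_dec("10010000101010101", block_size=2))  # ([2, 1, 0], 21)
print(h_pos_dec("00010000101010101", block_size=3))  # ([0, 3], 3)
```

The second call uses the input from the 3-bit experiment later in the deck, where the lower level outputs 0 3 and the final output is 3.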
  • 19. Results with 2-bit lower level blocks. Performance of gESMC compared to SMO and C4.5 (the results are not reproduced in this transcript)
  • 20. Results with 2-bit lower level blocks. Form and coefficients of the surrogate models – Interacting variables [x0 x1] [x2 x3] [x4 x5] – Linkages between variables are detected – Easy visualization of the salient variable interactions – All models consist only of the six relevant variables
  • 21. Results with 2-bit lower level blocks. C4.5 created trees: – HPosDec and HPosCount: trees of 53 nodes and 27 leaves on average; only the six relevant variables in the decision nodes – HParDec: trees of 270 nodes and 136 leaves on average; irrelevant attributes in the decision nodes – HParCount: trees of 283 nodes and 142 leaves on average; irrelevant attributes in the decision nodes. SMO built support vector machines: – 351, 6, 6, and 28 machines with 16 weights each were created for HPosDec, HPosCount, HParDec, and HParCount
  • 22. Results with 3-bit lower level blocks. Second experiment: – hParDec, hPosDec, hParCount, hPosCount – Binary inputs of length 17 – The first 6 bits are relevant; the others are irrelevant – 2 blocks of three-bit low-order problems – E.g. hPosDec: input 000 100 00101010101, with interacting variables [x0 x1 x2] [x3 x4 x5] followed by the irrelevant bits; the lower level outputs 0 3, so Output = 3
  • 23. Results with 3-bit lower level blocks. Second experiment: – 2 blocks of three-bit low-order problems. M1: [0][1][2][3][4][5]; M2: [0 1][2][3][4][5]; M3: [0 1 2][3][4][5]. Interesting observations: • As expected, the method stops the search when no improvement is found • In HParDec, half of the times the system gets a model that represents the salient interacting variables due to the stochasticity of the 10-fold CV estimation. E.g. [0 1 2 8] [3 4 9 5 6]
  • 24. Discussion. Lack of guidance from low-order substructures – Increase the order of substructural merges – Select randomly one of the new structural models • Follow schemes such as simulated annealing (Korst & Aarts, 1997)
  • 25. Discussion. Creating substructural models with overlapping structures – Problems with overlapping linkages – E.g. multiplexer problems – gESMC on the 6-bit mux results in: [0 1 2 3 4 5] – What we would like to obtain is: [0 1 2] [0 1 3] [0 1 4] [0 1 5] – Enhance the methodology so that it can represent overlapping linkages, using methods such as the design structure matrix genetic algorithm (DSMGA) (Yu, 2006)
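To see why the 6-bit multiplexer calls for overlapping linkage groups, the sketch below checks, for each address, which data bit can change the output. It assumes the usual layout with bits 0 and 1 as the address (bit 0 most significant) and bits 2 to 5 as the data; the slides do not spell out the bit order, and the function names are hypothetical.

```python
from itertools import product

def mux6(bits):
    """6-bit multiplexer: bits 0-1 form the address, bits 2-5 are the data."""
    address = 2 * bits[0] + bits[1]
    return bits[2 + address]

def relevant_variables(address_bits):
    """Address bits plus every data bit whose flip can change the output."""
    a0, a1 = address_bits
    relevant = [0, 1]  # the address bits always take part in the interaction
    for var in range(2, 6):
        for data in product((0, 1), repeat=4):
            x = [a0, a1, *data]
            flipped = list(x)
            flipped[var] = 1 - flipped[var]
            if mux6(x) != mux6(flipped):
                relevant.append(var)
                break
    return relevant

groups = [relevant_variables(address) for address in product((0, 1), repeat=2)]
print(groups)  # [[0, 1, 2], [0, 1, 3], [0, 1, 4], [0, 1, 5]]
```

Every group shares the two address bits, so a partition of the variables into disjoint linkage groups can only express this structure by merging all six variables.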
  • 26. Conclusions. Methodology to learn from decomposable problems – Structural model – Surrogate model – Classification model. First implementation approach – Greedy search of the best structural model. Main advantage of gESMC over other learners – It provides the structural model jointly with the classification model. Some limitations of gESMC: – It depends on the guidance from the low-order structure – Several approaches outlined as further work