Lecture8 - From CBR to IBk

Published in: Technology, Education

  1. 1. Introduction to Machine Learning. Lecture 8: Instance Based Learning and Case-Based Reasoning. Albert Orriols i Puig, aorriols@salle.url.edu. Artificial Intelligence – Machine Learning, Enginyeria i Arquitectura La Salle, Universitat Ramon Llull.
  2. 2. Recap of Lecture 7. kNN: 15-NN versus 1-NN. Key aspects: value of k; distance functions.
  3. 3. Recap of Lecture 7. Where is the learning in kNN? It is a retrieval system: no global model, no generalization … no learning! But still, it is able to create accurate classification models.
  4. 4. Today's Agenda. Formalizing the framework: from kNN to CBR. Incorporating learning in different phases: learn prototypes; organize the memory in clusters; learn the best distance function; provide explanations.
  5. 5. From kNN to CBR. kNN provides a retrieval system. Much work has been done on different phases of kNN: prototype selection, distance function selection, … CBR provides a general framework based on kNN.
  6. 6. Schema of CBR. The CBR cycle (Aamodt & Plaza, 1994): Problem, Retrieve, Reuse, Revise, Solution, with Retain feeding the case memory. Diagram labels: similarity function (Retrieve); select a solution (Reuse); revise the solution (Revise); retain the new knowledge (Retain); coherence and relevance of the attributes; structure and grouping of the cases (case memory).
  7. 7. Phases of CBR. Five key phases. Preprocess: prepare the training instances so that they meet the requirements of the system. Retrieve: use kNN with the selected distance function. Reuse: vote-based scheme. Revise: adapt the solution if necessary. Retain: remove examples from or add examples to the case memory.
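The retrieve, reuse, revise, and retain phases can be sketched in a few lines of Python. This is only an illustration, not the lecture's implementation: the class name `CaseBasedReasoner`, the `revise` callback, and the policy of retaining a case only when revision changed the answer are all assumptions, and the preprocessing phase is omitted.

```python
import math
from collections import Counter


class CaseBasedReasoner:
    """Toy CBR cycle: retrieve with kNN, reuse by voting, revise, retain."""

    def __init__(self, k=3):
        self.k = k
        self.case_memory = []  # list of (features, label) pairs

    def retrieve(self, problem):
        # Retrieve: the k stored cases closest to the problem (Euclidean distance).
        ranked = sorted(self.case_memory,
                        key=lambda case: math.dist(case[0], problem))
        return ranked[: self.k]

    def reuse(self, retrieved):
        # Reuse: vote-based scheme, the most frequent label wins.
        votes = Counter(label for _, label in retrieved)
        return votes.most_common(1)[0][0]

    def retain(self, problem, label):
        # Retain: add the solved case to the case memory.
        self.case_memory.append((tuple(problem), label))

    def solve(self, problem, revise=None):
        proposed = self.reuse(self.retrieve(problem))
        # Revise: an optional, hypothetical callback may correct the proposal.
        final = revise(problem, proposed) if revise else proposed
        # Assumed retain policy: store the case only if revision changed the answer.
        if final != proposed:
            self.retain(problem, final)
        return final
```

Calling `solve` without a `revise` callback reduces to plain kNN classification; passing one (for example, a function that looks up the true label) lets the case memory grow only where the current memory was wrong.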
  8. 8. Challenges in CBR. Hot areas. Reduce the cost of matching: reduce the total number of examples in the case memory; organize the case memory in clusters and only consult examples of some clusters. Automatically create distance functions that are suited to your problem. Extraction of explanations: CBR does not extract legible models (actually, it does not learn any model).
  9. 9. Prototype Selection. Training data sets contain a large number of instances: this increases the prediction time, and the data may contain noisy instances. Prototype selection: select the representative examples to form the case base and remove all the other examples. How? Learn which examples are the ones that maximize CBR accuracy.
  10. 10. Prototype Selection. Split the training data set into a training set and a validation set. From the training set, generate possible sets of prototypes (Sel. Proto 1, Sel. Proto 2, Sel. Proto 3, …). How do we know which is the best selection of prototypes? Diagram elements: training data set, candidate selections, validation set, kNN, test data set. Does it sound familiar to you? Problem: search for the best SP. It's just an optimization problem. For robustness, use cross-validation or similar validation procedures.
  11. 11. Prototype Selection. Optimization methods used so far: genetic algorithms (Holland, 75); genetic programming (Koza et al., 1989); grammatical evolution (Ryan & O'Neill, 1998).
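Prototype selection as an optimization problem can be sketched as a search over subsets of the training set, scored by kNN accuracy on a validation set. The sketch below swaps in a simple bit-flip hill climber for the evolutionary methods the slides cite, and uses 1-NN for brevity; the function names, the toy data, and the tie-breaking rule that prefers smaller subsets are assumptions made for illustration.

```python
import math
import random


def nearest_label(prototypes, x):
    # 1-NN: label of the closest prototype.
    return min(prototypes, key=lambda p: math.dist(p[0], x))[1]


def accuracy(prototypes, validation):
    hits = sum(nearest_label(prototypes, x) == y for x, y in validation)
    return hits / len(validation)


def select_prototypes(train, validation, iterations=200, seed=0):
    rng = random.Random(seed)
    mask = [True] * len(train)  # start with every training example selected

    def fitness(m):
        subset = [case for case, keep in zip(train, m) if keep]
        # Higher accuracy first; on ties, prefer fewer prototypes.
        return (accuracy(subset, validation), -len(subset))

    best = fitness(mask)
    for _ in range(iterations):
        candidate = mask[:]
        i = rng.randrange(len(train))
        candidate[i] = not candidate[i]  # flip one example in or out
        if not any(candidate):
            continue  # never allow an empty case base
        score = fitness(candidate)
        if score >= best:
            mask, best = candidate, score
    return [case for case, keep in zip(train, mask) if keep]


# Hypothetical toy data: two well-separated classes.
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
validation = [((0.5, 0.5), "a"), ((5.5, 5.5), "b")]
selected = select_prototypes(train, validation)
```

Because a flip is accepted only when the fitness does not get worse, validation accuracy never drops below that of the full training set. A single validation split can overfit, which is why the slides recommend cross-validation.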
  12. 12. Case-Based Memory Clustering. Training data sets contain a large number of instances. Clustering: place instances in different clusters. Only retrieve from the same cluster or from clusters that are close to you.
  13. 13. Case-Based Memory Clustering. Retrieve phase: (1) compare with all the prototypes; (2) compare only with the examples of the closest cluster. Reuse phase: propose a solution with the retrieved cases. Revise phase: check whether the solution is potentially valid. Retain phase: update the organization; it may imply the update of the clusters.
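The two-step retrieval over a clustered case memory can be sketched as follows. The sketch assumes the clusters are already given (for instance, by any clustering algorithm run beforehand) and uses each cluster's centroid as its prototype; the class name `ClusteredMemory` and its methods are hypothetical.

```python
import math
from collections import Counter


def centroid(points):
    # Component-wise mean of a list of feature tuples.
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


class ClusteredMemory:
    def __init__(self, clusters):
        # clusters: list of lists of (features, label) pairs, assumed precomputed.
        self.clusters = [list(c) for c in clusters]
        self.prototypes = [centroid([f for f, _ in c]) for c in self.clusters]

    def closest_cluster(self, x):
        # Step 1: compare the query with all the prototypes.
        return min(range(len(self.clusters)),
                   key=lambda i: math.dist(self.prototypes[i], x))

    def retrieve(self, x, k=3):
        # Step 2: compare only with the examples of the closest cluster.
        cases = self.clusters[self.closest_cluster(x)]
        return sorted(cases, key=lambda c: math.dist(c[0], x))[:k]

    def classify(self, x, k=3):
        # Reuse: majority vote among the retrieved cases.
        votes = Counter(label for _, label in self.retrieve(x, k))
        return votes.most_common(1)[0][0]

    def retain(self, x, label):
        # Retain: add the case and update the prototype of its cluster.
        i = self.closest_cluster(x)
        self.clusters[i].append((tuple(x), label))
        self.prototypes[i] = centroid([f for f, _ in self.clusters[i]])
```

The cost of a query drops from one comparison per stored case to one per prototype plus one per case in a single cluster, at the risk of missing a true nearest neighbor that sits just across a cluster boundary.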
  14. 14. Generation of Distance Functions. How does the distance function influence learning? It may be the difference between success and failure!
  15. 15. Generation of Distance Functions. Can I find a distance function that makes kNN perform the best in all cases? No way. Actually, the No Free Lunch (NFL) theorem says so (Wolpert, 1992). Different distances are suited to different domains. May I try to create a new distance function for each specific problem? Of course. Again, an optimization problem.
  16. 16. Generation of Distance Functions. Split the training data set into a training set' and a validation set. Optimization problem: assume a parametric form and optimize the parameters of the underlying function. Being more ambitious? Do not assume any parametric form; optimize both the function structure and the parameters. Diagram: kNN is run on training set' with each candidate distance function (function 1, function 2, …, function n) and evaluated on the validation set, yielding error 1, error 2, …, error n. Examples: (Fornells et al., 2005), (Camps et al., 2003).
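The parametric variant can be illustrated with a weighted Euclidean distance whose per-attribute weights are chosen by validation error. This is a minimal sketch under stated assumptions: the weighted Euclidean form, the exhaustive grid search (a stand-in for whatever optimizer is actually used), the 1-NN classifier, and the toy data are all illustrative choices, not taken from the cited works.

```python
import itertools
import math


def weighted_distance(weights):
    # Parametric form: Euclidean distance with one weight per attribute.
    def dist(a, b):
        return math.sqrt(sum(w * (x - y) ** 2
                             for w, x, y in zip(weights, a, b)))
    return dist


def knn_error(train, validation, dist):
    # Validation error of 1-NN under the given distance function.
    errors = 0
    for x, y in validation:
        nearest = min(train, key=lambda case: dist(case[0], x))
        errors += nearest[1] != y
    return errors / len(validation)


def best_weights(train, validation, n_features, grid=(0.0, 0.5, 1.0)):
    # Try every weight vector on the grid, skipping the all-zero vector.
    candidates = (w for w in itertools.product(grid, repeat=n_features)
                  if any(w))
    return min(candidates,
               key=lambda w: knn_error(train, validation, weighted_distance(w)))


# Hypothetical toy data: attribute 0 is informative, attribute 1 is misleading.
train = [((0.0, 0.0), "a"), ((5.0, 10.0), "b")]
validation = [((1.0, 9.0), "a"), ((4.0, 1.0), "b")]

weights = best_weights(train, validation, n_features=2)
uniform_error = knn_error(train, validation, weighted_distance((1.0, 1.0)))
tuned_error = knn_error(train, validation, weighted_distance(weights))
```

On this toy data the unweighted distance misclassifies both validation points, while a weight vector that ignores the second attribute classifies both correctly. A grid search scales exponentially with the number of attributes, so real problems call for a proper optimizer.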
  17. 17. Extraction of Explanations. One of the main drawbacks of CBR is that it does not provide any explanation: the prediction is based on nearest neighbors. New techniques to provide explanations: based on the instances used; building of partial models. Not studied in more detail here.
  18. 18. Next Class. Probabilistic-based learning.
