Introduction to Machine Learning
Lecture 7
Instance Based Learning

Albert Orriols i Puig
aorriols@salle.url.edu

Artificial Intelligence – Machine Learning
Enginyeria i Arquitectura La Salle
Universitat Ramon Llull
Recap of Lecture 6

                          LET’S START WITH DATA
                             CLASSIFICATION




Recap of Lecture 6

                  Data Set            Classification Model             How?




       We are going to deal with:
               • Data described by nominal and continuous attributes
               • Data that may have instances with missing values



Recap of Lecture 6
        We want to build decision trees
        How can I automatically generate these types of trees?
                Decide which attribute we should put in each node
                Decide a split point
                Rely on information theory
                We also saw many other improvements



Today’s Agenda


        Classification without building a model
        K-Nearest Neighbor (kNN)
        Effect of K
        Distance functions
        Variants of K-NN
        Strengths and weaknesses



Classification without Building a Model

        Forget about a global model!
                Simply store all the training examples
                Build a local model for each new test instance
                Referred to as lazy learners


        Some approaches to IBL
                Nearest neighbors
                Locally weighted regression
                Case-based reasoning




k-Nearest Neighbors
        Algorithm
                Store all the training data
                Given a new test instance
                          Recover the k nearest neighbors of the test instance
                          Predict the majority class among the neighbors

                Voronoi cells: the feature space is decomposed into several cells, e.g. for k=1
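
The algorithm above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's implementation: the names `euclidean`, `knn_predict`, and the toy `train` list are assumptions made for the example, and the Euclidean distance is just one possible choice of distance function.

```python
from collections import Counter
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def knn_predict(train, query, k=1, distance=euclidean):
    """Return the majority class among the k stored examples nearest to query.

    train is a list of (features, label) pairs; storing it is all the
    "training" there is (hypothetical data layout chosen for this sketch).
    """
    # Sort the stored examples by distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda ex: distance(ex[0], query))[:k]
    # Majority vote among the k neighbors.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy training set (made up for illustration): two well-separated classes.
train = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.8), "a"),
    ((4.0, 4.2), "b"),
    ((4.1, 3.9), "b"),
]

print(knn_predict(train, (1.1, 1.0), k=1))  # a
print(knn_predict(train, (3.8, 4.0), k=3))  # b
```

With k=1 the prediction is simply the label of the closest stored point, which is what produces the Voronoi-cell partition of the feature space.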




k-Nearest Neighbors
        But, where is the learning process?
                Is selecting the k neighbors and returning the majority class learning?
                No, that’s just retrieving


        But still, some important issues
                Which k should I use?
                Which distance functions should I use?
                Should I maintain all instances of the training data set?




Which k Should I Use?
        The effect of k
                [Figure: decision regions obtained with 15-NN and with 1-NN]




                Do you remember the discussion about overfitting in C4.5?
                          Apply the same concepts here!

Which k Should I Use?
        Some experimental results on the use of different k
                [Figure: error as a function of the number of neighbors (label: 7-NN)]

                Notice that the test error decreases as k increases, but at k ≈ 5-7 it starts increasing again
                Rule of thumb: k=3, k=5, and k=7 seem to work ok in the majority of problems
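
One way to pick k for a given data set, rather than relying only on the rule of thumb, is to estimate the error of each candidate k on held-out data. The sketch below uses leave-one-out estimation, which is an assumption of this example and not something the slides prescribe; `loo_error`, `choose_k`, and the one-dimensional toy `data` (with one deliberately mislabeled point) are hypothetical.

```python
from collections import Counter
import math


def knn_predict(train, query, k):
    """Majority class among the k training examples closest to query."""
    nearest = sorted(train, key=lambda ex: math.dist(ex[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def loo_error(data, k):
    """Leave-one-out error rate of kNN on data for a given k."""
    mistakes = 0
    for i, (x, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # hold out example i
        if knn_predict(rest, x, k) != label:
            mistakes += 1
    return mistakes / len(data)


def choose_k(data, candidates=(1, 3, 5, 7)):
    """Return the candidate k with the lowest leave-one-out error."""
    return min(candidates, key=lambda k: loo_error(data, k))


# Toy 1-D data: class "a" near 0..3, class "b" near 10..13,
# plus one noisy "b" at 1.4 inside the "a" region.
data = [
    ((0.0,), "a"), ((1.0,), "a"), ((2.0,), "a"), ((3.0,), "a"),
    ((1.4,), "b"),
    ((10.0,), "b"), ((11.0,), "b"), ((12.0,), "b"), ((13.0,), "b"),
]

for k in (1, 3, 5, 7):
    print(k, loo_error(data, k))
print("chosen k:", choose_k(data))
```

On this toy set the noisy point hurts 1-NN, a moderate k smooths it out, and a k that is too large for the class sizes makes the error rise again, which mirrors the shape of the curve described above.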
Distance Functions
        Distance functions must be able to handle
                Nominal attributes
                Continuous attributes
                Missing values
        The key
                They must return a low value for similar objects and a high value for different objects
                Seems obvious, right? But still, it is domain dependent
        There are many of them. Let’s see some of the most used



Distance Functions
        Distance between two points in the same space
                d(x, y)

        Some properties expected to be satisfied in general
                d(x, y) ≥ 0 and d(x, x) = 0
                d(x, y) = d(y, x)
                d(x, y) + d(y, z) ≥ d(x, z)




Distances for Continuous Variables
          Given x = (x_1, …, x_n)' and y = (y_1, …, y_n)'

                  Euclidean
                          d_E(x, y) = \left[ \sum_{i=1}^{n} (x_i - y_i)^2 \right]^{1/2}

                  Minkowski
                          d_M(x, y) = \left[ \sum_{i=1}^{n} |x_i - y_i|^q \right]^{1/q}

                  Absolute value distance
                          d_{ABS}(x, y) = \sum_{i=1}^{n} |x_i - y_i|
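
The three distances are closely related: the Euclidean and absolute-value distances are the Minkowski distance with q=2 and q=1. A minimal sketch follows; the function names are chosen for this example only.

```python
def minkowski(x, y, q=2):
    """Minkowski distance of order q between two equal-length sequences."""
    return sum(abs(xi - yi) ** q for xi, yi in zip(x, y)) ** (1.0 / q)


def euclidean(x, y):
    """Euclidean distance: Minkowski with q = 2."""
    return minkowski(x, y, q=2)


def absolute(x, y):
    """Absolute-value distance: Minkowski with q = 1."""
    return minkowski(x, y, q=1)


x, y, z = (0.0, 0.0), (3.0, 4.0), (6.0, 0.0)
print(euclidean(x, y))  # 5.0
print(absolute(x, y))   # 7.0
# Triangle inequality, one of the expected properties of a distance.
print(euclidean(x, y) + euclidean(y, z) >= euclidean(x, z))  # True
```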




Distances for Continuous Variables
          What if attributes are measured over different scales?
                  Attribute 1 ranging in [0, 1]
                  Attribute 2 ranging in [0, 1000]
                  Can you detect any potential problem in the aforementioned distance functions?

                  [Figure: two panels, "x in [0,1], y in [0,1000]" and "x in [0,1000], y in [0,1000]"]

Distances for Continuous Variables
        The larger the scale, the larger the influence of the attribute in the distance function
        Solution: Normalize each attribute
        How:
                Normalization by means of the range

                          d_a^{norm}(ex_1^a, ex_2^a) = \frac{d(ex_1^a, ex_2^a)}{\max_a - \min_a}

                Normalization by means of the standard deviation

                          d_a^{norm}(ex_1^a, ex_2^a) = \frac{d(ex_1^a, ex_2^a)}{4\sigma_a}
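
Both normalizations can be sketched for a single continuous attribute as follows. The example assumes that the per-attribute distance d is the absolute difference and that σ is the population standard deviation of the attribute over the training set; neither choice is stated on the slide, and the function names and sample `values` are hypothetical.

```python
import statistics


def range_normalized_diff(v1, v2, values):
    """|v1 - v2| divided by the attribute's range (max - min) in the training values."""
    lo, hi = min(values), max(values)
    return abs(v1 - v2) / (hi - lo) if hi > lo else 0.0


def std_normalized_diff(v1, v2, values):
    """|v1 - v2| divided by four standard deviations of the attribute."""
    sigma = statistics.pstdev(values)  # population std. dev. (assumption)
    return abs(v1 - v2) / (4 * sigma) if sigma > 0 else 0.0


# Values observed for an attribute ranging in [0, 1000] (made-up sample).
values = [0.0, 250.0, 500.0, 750.0, 1000.0]

print(range_normalized_diff(250.0, 750.0, values))  # 0.5
print(std_normalized_diff(250.0, 750.0, values))
```

After normalization a difference of 500 on the [0, 1000] attribute contributes on the same order as a difference of 0.5 on a [0, 1] attribute, so neither dominates the distance merely because of its scale.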
Distances for Nominal Attributes
        Several metrics to deal with nominal attributes
                Overlap distance function




                          Idea: Two nominal attributes are equal only if they have the same
                          value
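
A minimal sketch of the overlap idea, assuming the usual definition in which the distance between two nominal values is 0 when they are equal and 1 otherwise, and the distance between two instances is the number of attributes on which they differ. The function names are chosen for this example.

```python
def overlap(a, b):
    """Overlap distance between two nominal values: 0 if equal, 1 otherwise."""
    return 0 if a == b else 1


def overlap_distance(x, y):
    """Number of nominal attributes on which instances x and y differ."""
    return sum(overlap(a, b) for a, b in zip(x, y))


print(overlap("red", "red"))                                   # 0
print(overlap_distance(("red", "small"), ("red", "large")))    # 1
print(overlap_distance(("red", "small"), ("blue", "large")))   # 2
```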




Distances for Nominal Attributes
        Several metrics to deal with nominal attributes
                Value difference metric (VDM)




                                                                 C = number of classes
                                                                 P(a, ex_i^a, c) = conditional probability
                                                                 that the output class is c given that
                                                                 attribute a has the value ex_i^a.




                          Idea: Two nominal values are similar if they have more similar
                          correlations with the output classes
                See (Wilson & Martinez) for more distance functions
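
A sketch of VDM for a single nominal attribute, assuming the form given by Wilson & Martinez, vdm_a(x, y) = Σ_c |P(a, x, c) − P(a, y, c)|^q, with the conditional probabilities estimated as relative frequencies in the training data. The exponent q=1, the function names, and the toy colour/temperature data are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def fit_vdm(values, labels):
    """Estimate P(class | attribute value) for one nominal attribute."""
    counts = defaultdict(Counter)
    for v, c in zip(values, labels):
        counts[v][c] += 1
    classes = sorted(set(labels))
    probs = {
        v: {c: cnt[c] / sum(cnt.values()) for c in classes}
        for v, cnt in counts.items()
    }
    return probs, classes


def vdm(probs, classes, v1, v2, q=1):
    """Value difference between two values of the same nominal attribute."""
    return sum(abs(probs[v1][c] - probs[v2][c]) ** q for c in classes)


# Toy data: attribute "colour" and output class "hot" / "cold".
values = ["red", "red", "orange", "orange", "blue", "blue"]
labels = ["hot", "hot", "hot", "cold", "cold", "cold"]
probs, classes = fit_vdm(values, labels)

print(vdm(probs, classes, "red", "orange"))  # 1.0
print(vdm(probs, classes, "red", "blue"))    # 2.0
```

Here "red" ends up closer to "orange" than to "blue" because their class distributions are more alike, even though all three values are simply "different" under the overlap distance.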
Distances for Heterogeneous Attributes


        What if my data set is described by both nominal and
        continuous attributes?
                Apply the same distance function
                Use nominal distance functions for nominal attributes
                Use continuous distance functions for continuous attributes
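
One common way to combine the two kinds of attribute in a single function is the Heterogeneous Euclidean-Overlap Metric (HEOM) described by Wilson & Martinez. The sketch below is an illustration of that metric, not necessarily the combination intended in the lecture: it uses the overlap distance for nominal attributes, the range-normalized difference for continuous ones, and the maximal per-attribute distance of 1 when a value is missing (represented here by `None`). The parameter names `nominal` and `ranges` are hypothetical.

```python
import math


def heom(x, y, nominal, ranges):
    """Heterogeneous Euclidean-Overlap distance between instances x and y.

    nominal: set of indices of nominal attributes.
    ranges:  dict mapping each continuous attribute index to (min, max).
    A value of None marks a missing attribute value.
    """
    total = 0.0
    for a, (xa, ya) in enumerate(zip(x, y)):
        if xa is None or ya is None:
            d = 1.0                          # missing value: maximal distance
        elif a in nominal:
            d = 0.0 if xa == ya else 1.0     # overlap distance
        else:
            lo, hi = ranges[a]
            d = abs(xa - ya) / (hi - lo) if hi > lo else 0.0
        total += d * d
    return math.sqrt(total)


nominal = {0}
ranges = {1: (0.0, 1.0), 2: (0.0, 1000.0)}

x = ("red", 0.2, 100.0)
y = ("blue", 0.4, 600.0)
print(heom(x, y, nominal, ranges))                          # sqrt(1 + 0.04 + 0.25)
print(heom(("red", None, 100.0), ("red", 0.4, 100.0), nominal, ranges))  # 1.0
```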




Variants of kNN


        Different variants of kNN
                Distance-weighted kNN
                Attribute-weighted kNN




Distance-Weighted kNN
         Inference of original kNN
                 The k nearest neighbors vote for the class
         Shouldn’t the closest examples have a higher influence in the decision process?
                 Weight the contribution of each of the k neighbors w.r.t. their distance
                 E.g., for classification
                          \hat{f}(x_q) = \arg\max_{v \in V} \sum_{i=1}^{k} w_i \, \delta(v, f(x_i))
                 and for real-valued targets
                          \hat{f}(x_q) = \frac{\sum_{i=1}^{k} w_i f(x_i)}{\sum_{i=1}^{k} w_i}
                 where
                          w_i = \frac{1}{d(x_q, x_i)^2}




                 More robust to noisy instances and outliers

         E.g.: Shepard’s method (Shepard,1968)
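
Both weighted rules can be sketched directly from the formulas, with w_i = 1 / d(x_q, x_i)^2. The small constant `eps` that guards against division by zero when the query coincides with a stored instance is an assumption of this sketch, as are the function names and toy data.

```python
from collections import defaultdict
import math


def k_nearest(train, query, k):
    """Return the k (distance, target) pairs closest to query."""
    pairs = ((math.dist(x, query), target) for x, target in train)
    return sorted(pairs, key=lambda p: p[0])[:k]


def weighted_knn_classify(train, query, k, eps=1e-12):
    """Class with the largest sum of inverse-squared-distance weights."""
    scores = defaultdict(float)
    for d, label in k_nearest(train, query, k):
        scores[label] += 1.0 / (d * d + eps)
    return max(scores, key=scores.get)


def weighted_knn_regress(train, query, k, eps=1e-12):
    """Weighted average of the k nearest targets."""
    nearest = k_nearest(train, query, k)
    weights = [1.0 / (d * d + eps) for d, _ in nearest]
    return sum(w * t for w, (_, t) in zip(weights, nearest)) / sum(weights)


# Classification: two distant "b" points outvote one close "a" point in
# plain 3-NN, but not once the votes are weighted by distance.
train_c = [((0.0,), "a"), ((3.0,), "b"), ((4.0,), "b")]
print(weighted_knn_classify(train_c, (1.0,), k=3))  # a

# Regression: the query sits halfway between two stored points.
train_r = [((0.0,), 0.0), ((2.0,), 10.0)]
print(weighted_knn_regress(train_r, (1.0,), k=2))
```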


Attribute-weighted kNN
        What if some attributes are irrelevant or misleading?
                If irrelevant: cost increases, but accuracy is not affected
                If misleading: cost increases and accuracy may decrease

        Weight attributes:
                d_w(x, y) = \sum_{i=1}^{n} w_i (x_i - y_i)^2

        How to determine the weights?
                Option 1: The expert provides us with the weights
                Option 2: Use a machine learning approach
                More will be said in the next lecture!
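
The weighted distance is straightforward to sketch. The example below uses made-up points and hand-picked weights (in the spirit of Option 1) to show how setting the weight of a misleading attribute to zero can change which stored instance is nearest; the function name and the data are assumptions for illustration.

```python
def weighted_sq_distance(x, y, w):
    """Attribute-weighted squared distance: sum_i w_i * (x_i - y_i)^2."""
    return sum(wi * (xi - yi) ** 2 for wi, xi, yi in zip(w, x, y))


query = (1.0, 0.0)
a = (1.1, 90.0)   # very close on attribute 0, far on attribute 1
b = (3.0, 1.0)    # farther on attribute 0, close on attribute 1

uniform = (1.0, 1.0)      # all attributes count equally
ignore_second = (1.0, 0.0)  # attribute 1 judged misleading

# With uniform weights, attribute 1 dominates and b looks nearer.
print(weighted_sq_distance(query, a, uniform) > weighted_sq_distance(query, b, uniform))  # True
# With attribute 1 switched off, a becomes the nearest instance.
print(weighted_sq_distance(query, a, ignore_second) < weighted_sq_distance(query, b, ignore_second))  # True
```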

Strengths and Weaknesses
  Strengths of kNN
           Builds a new local model for each test instance
           Learning has no cost
           Empirical results show that the method is highly accurate compared with other
           machine learning techniques
  Weaknesses
           It is a retrieval approach; it does not learn
           No global model. The knowledge is not legible
           Test cost increases linearly with the number of stored training instances
           No generalization
           Curse of dimensionality: What happens if we have many attributes?
           Noise and outliers may have a very negative effect
Next Class

        From instance-based to case-based reasoning
        A little bit more on learning
                Distance functions
                Prototype selection





Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptx
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptx
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
21st_Century_Skills_Framework_Final_Presentation_2.pptx
21st_Century_Skills_Framework_Final_Presentation_2.pptx21st_Century_Skills_Framework_Final_Presentation_2.pptx
21st_Century_Skills_Framework_Final_Presentation_2.pptx
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 

Lecture7 - IBk

  • 1. Introduction to Machine Learning. Lecture 7: Instance Based Learning. Albert Orriols i Puig, aorriols@salle.url.edu. Artificial Intelligence – Machine Learning, Enginyeria i Arquitectura La Salle, Universitat Ramon Llull.
  • 2. Recap of Lecture 6: Let's start with data classification.
  • 3. Recap of Lecture 6: Data set → classification model. How? We are going to deal with: data described by nominal and continuous attributes; data that may have instances with missing values.
  • 4. Recap of Lecture 6: We want to build decision trees. How can I automatically generate these types of trees? Decide which attribute we should put in each node; decide a split point; rely on information theory. We also saw many other improvements.
  • 5. Today's Agenda: Classification without building a model; k-nearest neighbor (kNN); effect of k; distance functions; variants of kNN; strengths and weaknesses.
  • 6. Classification without Building a Model: Forget about a global model! Simply store all the training examples and build a local model for each new test instance. These methods are referred to as lazy learners. Some approaches to instance-based learning (IBL): nearest neighbors; locally weighted regression; case-based reasoning.
  • 7. k-Nearest Neighbors. Algorithm: store all the training data; given a new test instance, recover the k neighbors of the test instance and predict the majority class among the neighbors. Voronoi cells: the feature space is decomposed into several cells, e.g. for k=1.
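The algorithm on this slide can be sketched in a few lines. This is a minimal illustration, not the lecture's implementation: the function names (`euclidean`, `knn_predict`), the toy data, and the choice of Euclidean distance are assumptions made for the example.

```python
import math
from collections import Counter

def euclidean(x, y):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    """Return the majority class among the k stored instances closest to query."""
    # No model is built: every stored instance is ranked by distance to the query.
    ranked = sorted(zip(train_X, train_y), key=lambda pair: distance(pair[0], query))
    # Majority vote among the k nearest; on a tie, the class counted first wins.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated groups in the plane.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_X, train_y, (1.1, 1.0), k=3))
print(knn_predict(train_X, train_y, (5.1, 5.0), k=3))
```

All the work happens at prediction time, which is why the cost of classifying one instance grows with the number of stored instances.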
  • 8. k-Nearest Neighbors: But where is the learning process? Is selecting the k neighbors and returning the majority class learning? No, that's just retrieving. Still, some important issues remain: Which k should I use? Which distance functions should I use? Should I maintain all instances of the training data set?
  • 9. Which k Should I Use? The effect of k. [Figure: 15-NN vs. 1-NN.] Do you remember the discussion about overfitting in C4.5? Apply the same concepts here!
  • 10. Which k Should I Use? Some experimental results on the use of different k. [Figure: results plotted against the number of neighbors; 7-NN marked.] Notice that the test error decreases as k increases, but at k ≈ 5-7 it starts increasing again. Rule of thumb: k=3, k=5, and k=7 seem to work OK in the majority of problems.
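One way to compare candidate values of k on a given data set is to estimate the error for each of them. The sketch below uses leave-one-out estimation, which is an assumption of this example (the slide does not say how its results were obtained); the data set is a hypothetical one-dimensional toy with a single mislabeled point, and the helper names are illustrative.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k):
    """Majority class among the k training instances nearest to query."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

def leave_one_out_error(X, y, k):
    """Fraction of instances misclassified when each is held out in turn."""
    errors = 0
    for i in range(len(X)):
        # Classify instance i using every other instance as the training set.
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        if knn_predict(rest_X, rest_y, X[i], k) != y[i]:
            errors += 1
    return errors / len(X)

# Hypothetical data: class "a" near 0..4, class "b" near 10..14,
# plus one noisy instance at 2.5 labeled "b".
X = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (2.5,),
     (10.0,), (11.0,), (12.0,), (13.0,), (14.0,)]
y = ["a"] * 5 + ["b"] + ["b"] * 5

for k in (1, 3, 5):
    print(k, leave_one_out_error(X, y, k))
```

On this toy set the noisy point misleads its two closest neighbors when k=1, while a larger neighborhood outvotes it. Which k works best on real data has to be checked on that data.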
  • 11. Distance Functions: Distance functions must be able to handle nominal attributes, continuous attributes, and missing values. The key: they must return a low value for similar objects and a high value for different objects. Seems obvious, right? But still, it is domain dependent. There are many of them. Let's see some of the most used.
  • 12. Distance Functions: Distance between two points in the same space, $d(x, y)$. Some properties expected to be satisfied in general: $d(x, y) \ge 0$ and $d(x, x) = 0$; $d(x, y) = d(y, x)$; $d(x, y) + d(y, z) \ge d(x, z)$.
  • 13. Distances for Continuous Variables: Given $x = (x_1, \ldots, x_n)'$ and $y = (y_1, \ldots, y_n)'$. Euclidean: $d_E(x, y) = \left[\sum_{i=1}^{n} (x_i - y_i)^2\right]^{1/2}$. Minkowski: $d_M(x, y) = \left[\sum_{i=1}^{n} |x_i - y_i|^q\right]^{1/q}$. Absolute value: $d_{ABS}(x, y) = \sum_{i=1}^{n} |x_i - y_i|$.
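The three distances on this slide are one family: the Minkowski distance reduces to the absolute-value distance for q=1 and to the Euclidean distance for q=2. A small sketch, with the function name `minkowski` chosen for illustration and absolute differences taken so that odd values of q are well defined:

```python
def minkowski(x, y, q=2):
    """Minkowski distance of order q between two equal-length numeric sequences."""
    # q=1 gives the absolute-value (Manhattan) distance, q=2 the Euclidean distance.
    return sum(abs(xi - yi) ** q for xi, yi in zip(x, y)) ** (1.0 / q)

x, y = (0.0, 0.0), (3.0, 4.0)
print(minkowski(x, y, q=1))  # sum of absolute differences
print(minkowski(x, y, q=2))  # straight-line distance
```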
  • 14. Distances for Continuous Variables: What if attributes are measured over different scales, e.g. attribute 1 ranging in [0, 1] and attribute 2 ranging in [0, 1000]? Can you detect any potential problem in the aforementioned distance functions? [Figure: two panels, x in [0, 1] with y in [0, 1000], and x in [0, 1000] with y in [0, 1000].]
  • 15. Distances for Continuous Variables: The larger the scale, the larger the influence of the attribute in the distance function. Solution: normalize each attribute. How: normalization by means of the range, $d_a^{norm}(ex_1^a, ex_2^a) = \frac{d(ex_1^a, ex_2^a)}{\max_a - \min_a}$; normalization by means of the standard deviation, $d_a^{norm}(ex_1^a, ex_2^a) = \frac{d(ex_1^a, ex_2^a)}{4\sigma_a}$.
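The range normalization can be sketched as follows. The helper names, the toy data, and the handling of constant attributes (contribution set to 0) are assumptions of this example; dividing by $4\sigma_a$ instead of the range would follow the same pattern.

```python
import math

def attribute_ranges(data):
    """Per-attribute (min, max) pairs computed from the training instances."""
    return [(min(col), max(col)) for col in zip(*data)]

def range_normalized_distance(x, y, ranges):
    """Euclidean distance after dividing each attribute difference by its range."""
    total = 0.0
    for xi, yi, (lo, hi) in zip(x, y, ranges):
        span = hi - lo
        # Assumption: a constant attribute (span 0) contributes nothing.
        diff = abs(xi - yi) / span if span > 0 else 0.0
        total += diff ** 2
    return math.sqrt(total)

# Hypothetical data: attribute 1 in [0, 1], attribute 2 in [0, 1000].
data = [(0.0, 0.0), (1.0, 1000.0), (0.5, 500.0)]
ranges = attribute_ranges(data)

# After normalization, a full-range move along either attribute counts the same.
print(range_normalized_distance((0.0, 0.0), (1.0, 0.0), ranges))
print(range_normalized_distance((0.0, 0.0), (0.0, 1000.0), ranges))
```

Without the normalization, the second attribute would dominate the distance by a factor of 1000 in this example.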
  • 16. Distances for Nominal Attributes: Several metrics to deal with nominal attributes. Overlap distance function. Idea: two nominal attributes are equal only if they have the same value.
  • 17. Distances for Nominal Attributes: Several metrics to deal with nominal attributes. Value difference metric (VDM): $C$ = number of classes; $P(a, ex_i^a, c)$ = conditional probability that the output class is $c$ given that attribute $a$ has the value $ex_i^a$. Idea: two nominal values are similar if they have more similar correlations with the output classes. See (Wilson & Martinez) for more distance functions.
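The VDM formula itself did not survive in this transcript. The sketch below uses the form commonly given for VDM, $\sum_{c=1}^{C} |P(a, v_1, c) - P(a, v_2, c)|^q$, so the exact expression and the exponent q should be read as assumptions rather than as the slide's definition. Function names and data are hypothetical.

```python
from collections import Counter, defaultdict

def fit_vdm(values, labels):
    """Estimate P(class | value) for one nominal attribute from training data."""
    counts = defaultdict(Counter)
    for value, label in zip(values, labels):
        counts[value][label] += 1
    classes = sorted(set(labels))
    probs = {
        value: {c: per_class[c] / sum(per_class.values()) for c in classes}
        for value, per_class in counts.items()
    }
    return probs, classes

def vdm(probs, classes, v1, v2, q=2):
    """Sum over classes of |P(c | v1) - P(c | v2)|^q (assumed form of VDM)."""
    return sum(abs(probs[v1][c] - probs[v2][c]) ** q for c in classes)

# Hypothetical attribute "color" with a yes/no output class.
values = ["red", "red", "red", "blue", "blue", "green", "green"]
labels = ["yes", "yes", "no", "yes", "yes", "no", "no"]
probs, classes = fit_vdm(values, labels)

# "red" and "blue" mostly predict "yes", so they end up closer than "blue" and "green".
print(vdm(probs, classes, "red", "blue", q=1))
print(vdm(probs, classes, "blue", "green", q=1))
```

Unlike the overlap distance, two different values can be close under this metric if they are associated with the classes in a similar way.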
  • 18. Distances for Heterogeneous Attributes: What if my data set is described by both nominal and continuous attributes? Apply the same distance function: use nominal distance functions for nominal attributes and continuous distance functions for continuous attributes.
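One possible way to combine the two kinds of per-attribute distance is sketched below, in the spirit of the heterogeneous distance functions discussed by Wilson & Martinez. The specific choices are assumptions of this example: overlap distance for nominal attributes, range-normalized absolute difference for continuous ones, maximal distance (1) when a value is missing (represented as `None`), and a Euclidean-style combination.

```python
import math

def heterogeneous_distance(x, y, nominal, ranges):
    """Combine per-attribute distances for mixed nominal/continuous instances.

    nominal: set of attribute indices that are nominal.
    ranges:  dict mapping each continuous attribute index to (min, max).
    """
    total = 0.0
    for i, (xi, yi) in enumerate(zip(x, y)):
        if xi is None or yi is None:
            d = 1.0  # assumption: a missing value counts as maximally distant
        elif i in nominal:
            d = 0.0 if xi == yi else 1.0  # overlap distance
        else:
            lo, hi = ranges[i]
            d = abs(xi - yi) / (hi - lo) if hi > lo else 0.0
        total += d * d
    return math.sqrt(total)

# Hypothetical instances: (color, age, height); the height of y is missing.
x = ("red", 20.0, 1.70)
y = ("blue", 40.0, None)
ranges = {1: (0.0, 100.0), 2: (1.0, 2.0)}

print(heterogeneous_distance(x, y, nominal={0}, ranges=ranges))
```

Because every per-attribute distance lies in [0, 1], no single attribute dominates the total.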
  • 19. Variants of kNN: Different variants of kNN: distance-weighted kNN; attribute-weighted kNN.
  • 20. Distance-Weighted kNN: Inference in the original kNN: the k nearest neighbors vote for the class. Shouldn't the closest examples have a higher influence in the decision process? Weight the contribution of each of the k neighbors with respect to their distance. E.g., for classification $\hat{f}(x_q) = \arg\max_{v \in V} \sum_{i=1}^{k} w_i \, \delta(v, f(x_i))$, and for real-valued targets $\hat{f}(x_q) = \frac{\sum_{i=1}^{k} w_i f(x_i)}{\sum_{i=1}^{k} w_i}$, where $w_i = \frac{1}{d(x_q, x_i)^2}$. More robust to noisy instances and outliers. E.g., Shepard's method (Shepard, 1968).
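Both weighted predictions can be sketched with the inverse squared distance as the weight. The function names and toy data are illustrative, and returning the stored label or target when the query coincides with a training instance (distance 0) is an assumption made to avoid a division by zero.

```python
import math
from collections import defaultdict

def nearest(train_X, train_y, query, k):
    """The k (distance, target) pairs closest to query, nearest first."""
    pairs = [(math.dist(x, query), t) for x, t in zip(train_X, train_y)]
    return sorted(pairs, key=lambda pair: pair[0])[:k]

def weighted_knn_classify(train_X, train_y, query, k=3):
    """Class with the largest sum of weights w_i = 1 / d(x_q, x_i)^2."""
    neighbors = nearest(train_X, train_y, query, k)
    if neighbors[0][0] == 0.0:
        return neighbors[0][1]  # assumption: an exact match decides directly
    scores = defaultdict(float)
    for d, label in neighbors:
        scores[label] += 1.0 / d ** 2
    return max(scores, key=scores.get)

def weighted_knn_regress(train_X, train_y, query, k=3):
    """Weighted average of the neighbors' targets with w_i = 1 / d(x_q, x_i)^2."""
    neighbors = nearest(train_X, train_y, query, k)
    if neighbors[0][0] == 0.0:
        return neighbors[0][1]
    weights = [1.0 / d ** 2 for d, _ in neighbors]
    return sum(w * t for w, (_, t) in zip(weights, neighbors)) / sum(weights)

# Hypothetical data: one close "a" and two farther "b" instances.
train_X = [(0.0,), (3.0,), (3.5,)]
print(weighted_knn_classify(train_X, ["a", "b", "b"], (1.0,), k=3))
print(weighted_knn_regress(train_X, [0.0, 3.0, 3.5], (1.0,), k=3))
```

In this example a plain majority vote with k=3 would answer "b", while the weighted vote follows the single much closer "a" instance.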
  • 21. Attribute-Weighted kNN: What if some attributes are irrelevant or misleading? If irrelevant, cost increases but accuracy is not affected; if misleading, cost increases and accuracy may decrease. Weight attributes: $d_w(x, y) = \sum_{i=1}^{n} w_i (x_i - y_i)^2$. How to determine the weights? Option 1: the expert provides us with the weights. Option 2: use a machine learning approach. More will be said in the next lecture!
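The weighted distance translates directly into code. In this sketch the weights are simply given by hand (the "expert" option); the function name and the numbers are hypothetical.

```python
def attribute_weighted_distance(x, y, weights):
    """Weighted sum of squared differences: sum_i w_i * (x_i - y_i)^2."""
    return sum(w * (xi - yi) ** 2 for w, xi, yi in zip(weights, x, y))

# Hypothetical case: the second attribute is considered misleading, so its weight is 0.
x = (1.0, 100.0)
y = (1.5, -50.0)

print(attribute_weighted_distance(x, y, weights=(1.0, 1.0)))  # both attributes count
print(attribute_weighted_distance(x, y, weights=(1.0, 0.0)))  # second attribute ignored
```

As in the slide's formula, no square root is taken; this does not change which neighbors are nearest, because the square root is monotonic.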
  • 22. Strengths and Weaknesses. Strengths of kNN: building of a new local model for each test instance; learning has no cost; empirical results show that the method is highly accurate w.r.t. other machine learning techniques. Weaknesses: retrieving approach, but does not learn; no global model, so the knowledge is not legible; test cost increases linearly with the input instances; no generalization; curse of dimensionality (what happens if we have many attributes?); noise and outliers may have a very negative effect.
  • 23. Next Class: From instance-based to case-based reasoning. A little bit more on learning: distance functions; prototype selection.