My name is David Thompson, and this presentation is Local Methods for Pattern Recognition.
It's the first of several modules on dimensionality reduction in the context of this Caltech Big Data Summer School.
This first lecture is intended to provide a common reference, a basic skill set that we'll refer to later.
I'm going to review some basic pattern recognition strategies, such as classification and regression, that you will have been exposed to previously, but I think it's worth talking about them again, because they really are foundational for some of the things that we'll describe later as far as dimensionality reduction is concerned.
So it's an important background to have.
Okay, so this first talk will, as I said, review basic pattern recognition problems, classification and regression, in particular in the context of local pattern recognition strategies.
These are non-parametric pattern recognition strategies, as opposed to, say, the parametric methods that we have seen earlier.
A lot of these fall into the general
category of nearest neighbor methods.
So I'll describe the nearest-neighbor methodology, and why that might be a good thing, and then a couple of variants of it, including local linear regression, which is a flexible regression strategy based on a local non-parametric approach.
And then kernel density estimation, which is a probabilistic method for pattern recognition.
Okay, so let's start off with a simple
pattern recognition task.
Here I've got a data set with a couple of attributes, and as pattern recognition folks there's something we want to do with it.
I've just plotted these points, giving each attribute its own coordinate.
So this is a two-dimensional attribute space, where every data point may have associated with it some class value, or some real-valued ordinate that we're regressing against, some value that we'd want to predict.
Or we might just be interested in intrinsic properties of the data set, how the data is distributed, to try to infer something about the process that generated that data; that is more akin to, say, a density estimation task, where we are estimating probability density in this two-dimensional attribute space.
So there are lots of different kinds of pattern recognition questions that we can ask of this data set, even this very simple one here.
I've plotted on this slide a red question mark to indicate some query point where we'd like to describe or predict the behavior of this process.
Again, this could be a prediction of the real-valued function at that location in the input space, or it could be that we have a new point there whose class we're trying to infer.
Regardless, we've got a query point and we want to be able to infer something about what the process is doing at that location.
All right, a little bit of notation
background.
So I'm going to treat these data points as column vectors: we'll say x sub i is our data point, and it has attributes one through d.
Again, this simple example only has two, but you can imagine having arbitrarily many.
And because this is dimensionality reduction, we're eventually going to be talking about data points that have hundreds or even thousands of attributes.
And we can represent the entire data set as a matrix where we stack these column vectors together as rows, so we have an n by d matrix, where n is the number of data points and d is the number of attributes.
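To make that notation concrete, here is a minimal sketch in Python with NumPy; the particular numbers are made up purely for illustration.

```python
import numpy as np

# Hypothetical toy data set: n = 5 points, each with d = 2 real-valued attributes.
# Conceptually each x_i is a column vector; stacking them as rows gives the
# n-by-d data matrix described above.
points = [np.array([0.1, 1.2]),
          np.array([0.4, 0.9]),
          np.array([1.5, 1.1]),
          np.array([0.8, 0.3]),
          np.array([1.0, 1.7])]

X = np.vstack(points)   # shape (n, d) = (5, 2)
n, d = X.shape
```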
For now, without much loss of generality, I'm just going to assume that all of these attributes are real-valued, continuous attributes.
So I'm not going to deal much with, for example, categorical attributes, but many of the same principles that I'll discuss apply to categorical attributes as well.
And you may be familiar with encoding strategies that you can use to turn categorical attributes into continuous ones.
So really we'll just focus on the continuous case here.
Okay, so one reasonable way to do pattern recognition on this data set is to just look at the local behavior in the vicinity of the query point that we're interested in inferring.
This goes under the assumption that whatever process generated the data is locally smooth in the input space.
So maybe we can disregard a bunch of the points that are really far away from our query, right?
We don't have to look at the function values at the extreme lower left side of the plot.
We can just look at the values on the upper right, near the red question mark.
Right?
So, maybe we look at its closest neighbors to see what the function is doing in that vicinity.
And you can do this for each
of the pattern recognition strategies that
I mentioned.
All right.
So the canonical example is, of course,
nearest neighbor classification.
To infer the class of the query, just look
to its nearest neighbor in our
data set, and assume that it has the same
class as that one neighbor point.
So, in this case I've indicated, with a red arrow, the nearest neighbor of the query, and we just assume that that is the class of our query.
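Just to pin down that rule, here is a minimal sketch of one-nearest-neighbor classification in Python; the data, labels, and query point are hypothetical.

```python
import numpy as np

def nearest_neighbor_classify(X, y, query):
    """1-NN rule: assign the query the label of its single closest
    training point, using Euclidean distance in the attribute space."""
    dists = np.linalg.norm(X - query, axis=1)   # distance to every data point
    return y[np.argmin(dists)]

# Hypothetical labeled data: two attributes, two classes.
X = np.array([[0.2, 0.1], [0.5, 0.4], [1.8, 1.9], [2.0, 1.7]])
y = np.array(["red", "red", "blue", "blue"])
print(nearest_neighbor_classify(X, y, np.array([1.9, 1.8])))   # -> "blue"
```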
Now this amounts to a Voronoi partitioning of the input space: here in two dimensions I've partitioned the space, drawing the area for which each input point is responsible, right?
So you can see that each nearest-neighbor point is actually responsible for a polygon within this input space, and any query within that polygon is going to get its class.
Now this has a kind of nice property: where we have denser data, right in the center, where we have more information, the Voronoi partitions are smaller, right?
And we permit the function to vary a little bit more there.
Far from the data cloud these partitions get a lot larger and, again, we don't have as much to say about that corner of the input space.
So our inference there is going to be smoother.
Right, so this is a basic example of nearest neighbor classification, and the decision boundary looks like this.
This is the resulting decision boundary.
I've applied some labels to the data set, red and blue.
And you can see that the decision boundary here, drawn in black, is sort of a wiggly shape through this input space that bends around all of these inputs.
Now the query's actually been paired with a blue point.
This is kind of interesting, though, because if you look carefully you'll note that the query is actually also fairly close to a bunch of red points as well.
Right?
To its immediate left there are clusters of red points, and it's almost as close to those; it's just sort of happenstance that it got paired with a blue point.
So another way of saying this is that the one-nearest-neighbor approach to classification is rather sensitive to noise.
It has high variance, in the language of pattern recognition.
In terms of the bias-variance tradeoff, it's definitely more variance than bias in this case.
This also manifests as a fairly wiggly decision boundary.
If you look in the center of the cloud there, maybe our decision boundary wiggles a little bit more than is appropriate given the amount of data that we have.
So what this really needs is some way to smooth that decision boundary, or regularize it, so we're less sensitive to those single-point outliers.
And the way that's typically done is by introducing more nearest neighbors.
So here I'm looking at the five nearest neighbors and taking a majority vote of which class to call the query.
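As a rough sketch of that voting rule (again with the distances computed the same way as in the 1-NN example above):

```python
import numpy as np
from collections import Counter

def knn_classify(X, y, query, k=5):
    """k-NN rule: take a majority vote among the k closest training points.
    Larger k smooths the decision boundary; k = 1 recovers the 1-NN rule."""
    dists = np.linalg.norm(X - query, axis=1)
    neighbor_labels = y[np.argsort(dists)[:k]]
    return Counter(neighbor_labels).most_common(1)[0][0]
```

With k = 5, a single isolated point of the other class can no longer flip the prediction on its own, which is exactly the smoothing effect described here.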
And you can see the decision boundary straighten out quite a bit as a result.
This is the same regularization effect that we saw before, both with the increase in the number of [INAUDIBLE], and we can do exactly the same thing here; note that the blue curve might even be a bit over-smoothed.
It looks right, so we're doing a pretty good job of modeling the underlying Gaussian function, but we've started to truncate its peak a little bit, right?
So maybe a width of point three is a
little bit too much.
One thing to note about this expression: it's normalized to provide a true probability density estimate, so it's a valid PDF.
The normalization factor Z normalizes our kernel function so that each kernel evaluation has unit volume, or unit area in two dimensions.
Cross validation is the typical way that one would estimate these parameters.
But fortunately there's really only one parameter to estimate, which in this case is the kernel width.
So we get a wide range of very
flexible functions out of very few input
parameters.
We're letting the data speak for itself, unlike a parametric method.
And this is the real power of these local methods for pattern recognition.
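As an illustration of the kind of estimator being described, here is a minimal sketch of a Gaussian kernel density estimate; the isotropic Gaussian kernel and the single width parameter are assumptions on my part, and the lecture's exact expression may differ in detail.

```python
import numpy as np

def kde(X, query, width=0.3):
    """Gaussian kernel density estimate at a single query point.

    Each kernel is divided by the normalization factor Z so that it
    integrates to one, making the average over data points a valid PDF."""
    d = X.shape[1]
    Z = (2.0 * np.pi * width ** 2) ** (d / 2.0)        # Gaussian normalization
    sq_dists = np.sum((X - query) ** 2, axis=1)
    return np.mean(np.exp(-sq_dists / (2.0 * width ** 2)) / Z)
```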
Okay.
So, in summary, I've described some local non-parametric methods for pattern recognition.
This is in contrast to parametric methods, which assume some global functional form for the process that generates the data.
Here we're just looking at local patches of data to infer the values of the underlying function or process at some query point.
And there are three different examples of this local pattern recognition that I demonstrated: k-nearest neighbor for classification; kernel smoothing and local linear regression for regression problems; and then kernel density estimation for probability density estimation tasks.
In each case, the main parameter controls the amount of regularization, and you can set it using a cross-validation strategy.
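To make that last point concrete, here is one common way to set the kernel width by cross validation: a leave-one-out likelihood score, sketched using the hypothetical kde function above; the candidate grid of widths is made up for illustration.

```python
import numpy as np

def loo_log_likelihood(X, width):
    """Leave-one-out log-likelihood of the data under the KDE sketched above;
    higher is better. A small constant guards against log(0)."""
    total = 0.0
    for i in range(len(X)):
        X_rest = np.delete(X, i, axis=0)
        total += np.log(kde(X_rest, X[i], width) + 1e-12)
    return total

# Pick the width that maximizes the held-out likelihood over a candidate grid.
# widths = [0.05, 0.1, 0.2, 0.3, 0.5]
# best_width = max(widths, key=lambda w: loo_log_likelihood(X, w))
```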
