This document provides an introduction and overview of k-nearest neighbors (kNN) classification and density estimation techniques. It discusses:
1) How kNN density estimation works by growing a volume around an estimation point until it encloses k data points;
2) Examples demonstrating the behavior and limitations of kNN density estimation; and
3) How kNN classification assigns an unlabeled example to the most frequent class among its k nearest neighbors.
Introduction to Pattern Analysis
Ricardo Gutierrez-Osuna
Texas A&M University

LECTURE 8: Nearest Neighbors
- Nearest Neighbors density estimation
- The k Nearest Neighbors classification rule
- kNN as a lazy learner
- Characteristics of the kNN classifier
- Optimizing the kNN classifier
Non-parametric density estimation: review
- Recall from the previous lecture that the general expression for non-parametric density estimation is

      p(x) ≅ k / (N·V)

  where V is the volume surrounding x, N is the total number of examples, and k is the number of examples inside V
- At that time, we mentioned that this estimate could be computed by
  - Fixing the volume V and determining the number k of data points inside V
    - This is the approach used in Kernel Density Estimation
  - Fixing the value of k and determining the minimum volume V that encompasses k points in the dataset
    - This gives rise to the k Nearest Neighbor (kNN) approach, which is the subject of this lecture
kNN Density Estimation (1)
- In the kNN method we grow the volume surrounding the estimation point x until it encloses a total of k data points
- The density estimate then becomes

      P(x) ≅ k / (N·V) = k / (N·cD·Rk(x)^D)

  - Rk(x) is the distance between the estimation point x and its k-th closest neighbor
  - cD is the volume of the unit sphere in D dimensions, which is equal to

      cD = π^(D/2) / Γ(D/2 + 1) = π^(D/2) / (D/2)!

    - Thus c1=2, c2=π, c3=4π/3 and so on
- [Figure: in two dimensions, a circle of radius R around x has Vol = πR², so P(x) = k / (N·πR²)]
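As an illustration, the estimate above can be sketched in a few lines of Python (a minimal sketch, not part of the original slides; the one-dimensional sample data are made up for the example):

```python
import math

def knn_density(x, data, k):
    """kNN density estimate: p(x) ~ k / (N * c_D * R_k(x)^D) in D dimensions."""
    D = len(x)
    N = len(data)
    # R_k(x): distance from x to its k-th closest neighbor
    dists = sorted(math.dist(x, p) for p in data)
    r_k = dists[k - 1]
    # c_D: volume of the unit sphere in D dimensions
    c_d = math.pi ** (D / 2) / math.gamma(D / 2 + 1)
    return k / (N * c_d * r_k ** D)

# 1-D check: c_1 = 2, so with N = 5 points and k = 3 the estimate is 3/(N*2*R_3)
data = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
print(knn_density((2.0,), data, k=3))  # R_3 = 1.0, so 3/(5*2*1) = 0.3
```

Note that, as the slides warn, this estimate inherits the discontinuities of Rk(x) and does not integrate to one.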
kNN Density Estimation (2)
- In general, the estimates that can be obtained with the kNN method are not very satisfactory
  - The estimates are prone to local noise
  - The method produces estimates with very heavy tails
  - Since the function Rk(x) is not differentiable, the density estimate will have discontinuities
  - The resulting density is not a true probability density, since its integral over the sample space diverges
- These properties are illustrated in the next few slides
kNN Density Estimation, example 1
- To illustrate the behavior of kNN, we generated several density estimates for a univariate mixture of two Gaussians, P(x) = ½N(0,1) + ½N(10,4), with several values of N and k
kNN Density Estimation, example 2 (a)
- The performance of the kNN density estimation technique in two dimensions is illustrated in these figures
  - The top figure shows the true density, a mixture of two bivariate Gaussians

      p(x) = ½N(µ1,Σ1) + ½N(µ2,Σ2)
      with µ1 = [0 5]ᵀ, Σ1 = [1 1; 1 2] and µ2 = [5 0]ᵀ, Σ2 = [1 −1; −1 4]

  - The bottom figure shows the density estimate for k=10 neighbors and N=200 examples
  - In the next slide we show the contours of the two distributions overlapped with the training data used to generate the estimate
kNN Density Estimation, example 2 (b)
[Figure: true density contours (left) and kNN density estimate contours (right)]
kNN Density Estimation as a Bayesian classifier
- The main advantage of the kNN method is that it leads to a very simple approximation of the (optimal) Bayes classifier
  - Assume that we have a dataset with N examples, Ni from class ωi, and that we are interested in classifying an unknown sample xu
    - We draw a hyper-sphere of volume V around xu. Assume this volume contains a total of k examples, ki from class ωi
  - We can then approximate the likelihood functions using the kNN method by

      P(x|ωi) = ki / (Ni·V)

  - Similarly, the unconditional density is estimated by

      P(x) = k / (N·V)

  - And the priors are approximated by

      P(ωi) = Ni / N

  - Putting everything together, the Bayes classifier becomes

      P(ωi|x) = P(x|ωi)·P(ωi) / P(x) = [ki/(Ni·V)]·[Ni/N] / [k/(N·V)] = ki / k

From [Bishop, 1995]
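The resulting decision rule, P(ωi|x) ≈ ki/k, can be sketched as follows (a hypothetical illustration; the toy data set and class labels are invented for the example):

```python
import math
from collections import Counter

def knn_posteriors(x_u, data, labels, k):
    """Approximate P(w_i | x) by k_i / k: the fraction of the k nearest
    neighbors of x_u that belong to class w_i."""
    # Order training examples by distance to the query point
    order = sorted(range(len(data)), key=lambda i: math.dist(x_u, data[i]))
    k_counts = Counter(labels[i] for i in order[:k])
    return {cls: k_i / k for cls, k_i in k_counts.items()}

data = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
labels = ['w1', 'w1', 'w1', 'w2', 'w2']
print(knn_posteriors((0.5, 0.5), data, labels, k=3))  # -> {'w1': 1.0}
```

The Bayes decision then reduces to picking the class with the largest ki, i.e. the majority class among the k neighbors.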
The k Nearest Neighbor classification rule
- The k Nearest Neighbor Rule (kNN) is a very intuitive method that classifies unlabeled examples based on their similarity to examples in the training set
  - For a given unlabeled example xu∈ℜD, find the k "closest" labeled examples in the training data set and assign xu to the class that appears most frequently within the k-subset
- The kNN rule only requires
  - An integer k
  - A set of labeled examples (training data)
  - A metric to measure "closeness"
- Example
  - In the example below we have three classes, and the goal is to find a class label for the unknown example xu
  - In this case we use the Euclidean distance and a value of k=5 neighbors
  - Of the 5 closest neighbors, 4 belong to ω1 and 1 belongs to ω3, so xu is assigned to ω1, the predominant class
[Figure: the unknown example xu surrounded by labeled examples from classes ω1, ω2 and ω3]
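The classification rule can be sketched in Python (a minimal sketch; the training set below is invented so that, as in the example on this slide, 4 of the 5 nearest neighbors belong to ω1 and 1 to ω3):

```python
import math
from collections import Counter

def knn_classify(x_u, examples, k=5, metric=math.dist):
    """Assign x_u to the class appearing most frequently among its
    k closest labeled examples, under the given distance metric."""
    neighbors = sorted(examples, key=lambda ex: metric(x_u, ex[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical training set echoing the slide's situation: of the 5
# nearest neighbors of x_u = (0, 0), four belong to w1 and one to w3.
examples = [((0, 1), 'w1'), ((1, 0), 'w1'), ((1, 1), 'w1'), ((0, -1), 'w1'),
            ((2, 0), 'w3'), ((9, 9), 'w2'), ((10, 9), 'w2')]
print(knn_classify((0, 0), examples, k=5))  # -> 'w1'
```

The `metric` parameter makes the third requirement explicit: swapping in a different distance function changes the notion of "closeness" without touching the voting rule.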
kNN in action: example 1
- We have generated data for a 2-dimensional 3-class problem, where the class-conditional densities are multi-modal and non-linearly separable, as illustrated in the figure
- We used the kNN rule with
  - k = 5
  - The Euclidean distance as a metric
- The resulting decision boundaries and decision regions are shown below
kNN in action: example 2
- We have generated data for a 2-dimensional 3-class problem, where the class-conditional densities are unimodal and are distributed in rings around a common mean. These classes are also non-linearly separable, as illustrated in the figure
- We used the kNN rule with
  - k = 5
  - The Euclidean distance as a metric
- The resulting decision boundaries and decision regions are shown below
kNN as a lazy (machine learning) algorithm
- kNN is considered a lazy learning algorithm because it
  - Defers data processing until it receives a request to classify an unlabeled example
  - Replies to a request for information by combining its stored training data
  - Discards the constructed answer and any intermediate results
- Other names for lazy algorithms
  - Memory-based, instance-based, exemplar-based, case-based, experience-based
- This strategy is opposed to an eager learning algorithm, which
  - Compiles its data into a compressed description or model
    - A density estimate or density parameters (statistical PR)
    - A graph structure and associated weights (neural PR)
  - Discards the training data after compilation of the model
  - Classifies incoming patterns using the induced model, which is retained for future requests
- Tradeoffs
  - Lazy algorithms have lower computational costs than eager algorithms during training
  - Lazy algorithms have greater storage requirements and higher computational costs on recall
From [Aha, 1997]
Characteristics of the kNN classifier
- Advantages
  - Analytically tractable
  - Simple implementation
  - Nearly optimal in the large-sample limit (N→∞)
    - P[error]Bayes < P[error]1NN < 2·P[error]Bayes
  - Uses local information, which can yield highly adaptive behavior
  - Lends itself very easily to parallel implementations
- Disadvantages
  - Large storage requirements
  - Computationally intensive recall
  - Highly susceptible to the curse of dimensionality
- 1NN versus kNN
  - The use of large values of k has two main advantages
    - It yields smoother decision regions
    - It provides probabilistic information: the ratio of examples for each class gives information about the ambiguity of the decision
  - However, too large a value of k is detrimental
    - It destroys the locality of the estimation, since farther examples are taken into account
    - In addition, it increases the computational burden
kNN versus 1NN
[Figure: decision regions for 1-NN, 5-NN and 20-NN]
Optimizing storage requirements
- The basic kNN algorithm stores all the examples in the training set, creating high storage requirements (and computational cost)
  - However, the entire training set need not be stored, since the examples may contain information that is highly redundant
    - A degenerate case is the earlier example with the multimodal classes: each of the clusters could be replaced by its mean vector, and the decision boundaries would be practically identical
  - In addition, almost all of the information that is relevant for classification purposes is located around the decision boundaries
- A number of methods, called edited kNN, have been derived to take advantage of this information redundancy
  - One alternative [Wilson 72] is to classify all the examples in the training set and remove those that are misclassified, in an attempt to separate classification regions by removing ambiguous points
  - The opposite alternative [Ritter 75] is to remove training examples that are classified correctly, in an attempt to define the boundaries between classes by eliminating points in the interior of the regions
- A different alternative is to reduce the training examples to a set of prototypes that are representative of the underlying data
  - The issue of selecting prototypes will be the subject of the lectures on clustering
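Wilson's editing scheme can be sketched as follows (an illustrative sketch assuming a simple majority vote with k=3; the toy data set, with one mislabeled point inside a cluster, is invented):

```python
import math
from collections import Counter

def wilson_edit(examples, k=3):
    """Edited kNN [Wilson 72]: classify each training example with its
    k nearest neighbors among the *other* examples, and drop it if the
    majority vote disagrees with its own label."""
    def predict(i):
        others = [ex for j, ex in enumerate(examples) if j != i]
        neighbors = sorted(others,
                           key=lambda ex: math.dist(examples[i][0], ex[0]))[:k]
        return Counter(label for _, label in neighbors).most_common(1)[0][0]
    return [ex for i, ex in enumerate(examples) if predict(i) == ex[1]]

# The ambiguous point ((1, 1), 'w2') sits inside the w1 cluster and is removed
examples = [((0, 0), 'w1'), ((0, 1), 'w1'), ((1, 0), 'w1'), ((1, 1), 'w2'),
            ((8, 8), 'w2'), ((8, 9), 'w2'), ((9, 8), 'w2')]
print(len(wilson_edit(examples, k=3)))  # -> 6
```

Ritter's opposite scheme would keep only the examples this sketch removes plus those near the boundaries, discarding the interior points instead.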
kNN and the problem of feature weighting
- The basic kNN rule computes the similarity measure based on the Euclidean distance
  - This metric makes the kNN rule very sensitive to noisy features
  - As an example, we have created a data set with three classes and two dimensions
    - The first axis contains all the discriminatory information; in fact, class separability is excellent
    - The second axis is white noise and, thus, does not contain classification information
- In the first example, both axes are scaled properly
  - The kNN (k=5) finds decision boundaries fairly close to the optimal
- In the second example, the magnitude of the second axis has been increased by two orders of magnitude (see axes tick marks)
  - The kNN is biased by the large values of the second axis, and its performance is very poor
Feature weighting
- The previous example illustrated the Achilles' heel of the kNN classifier: its sensitivity to noisy axes
  - A possible solution would be to normalize each feature to N(0,1)
  - However, normalization does not resolve the curse of dimensionality. A close look at the Euclidean distance

      d(xu, x) = √( Σk=1..D (xu[k] − x[k])² )

    shows that this metric can become very noisy for high-dimensional problems if only a few of the features carry the classification information
- The solution to this problem is to modify the Euclidean metric by a set of weights that represent the information content or "goodness" of each feature

      dw(xu, x) = √( Σk=1..D w[k]·(xu[k] − x[k])² )

  - Note that this procedure is identical to performing a linear transformation where the transformation matrix is diagonal, with the weights placed in the diagonal elements
    - From this perspective, feature weighting can be thought of as a special case of feature extraction where the different features are not allowed to interact (null off-diagonal elements in the transformation matrix)
    - Feature subset selection, which will be covered later in the course, can be viewed as a special case of feature weighting where the weights can only take binary {0,1} values
  - Do not confuse feature-weighting with distance-weighting, a kNN variant that weights the contribution of each of the k nearest neighbors according to their distance to the unlabeled example
    - Distance-weighting distorts the kNN estimate of P(ωi|x) and is NOT recommended
    - Studies have shown that distance-weighting DOES NOT improve kNN classification performance
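The weighted metric can be sketched directly (a minimal sketch; the weight vectors below are arbitrary examples):

```python
import math

def weighted_dist(x_u, x, w):
    """Weighted Euclidean metric dw(xu, x) = sqrt(sum_k w[k]*(xu[k]-x[k])^2).
    Equivalent to a linear transformation with a diagonal matrix of weights."""
    return math.sqrt(sum(wk * (a - b) ** 2 for wk, a, b in zip(w, x_u, x)))

# With w = [1, 1] this reduces to the ordinary Euclidean distance;
# with w = [1, 0] the noisy second axis is ignored entirely (a binary
# weight vector, i.e. feature subset selection as a special case).
print(weighted_dist((0, 0), (3, 4), [1, 1]))  # -> 5.0
print(weighted_dist((0, 0), (3, 4), [1, 0]))  # -> 3.0
```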
Feature weighting methods
- Feature weighting methods are divided into two groups
  - Performance bias methods
  - Preset bias methods
- Performance bias methods
  - These methods find a set of weights through an iterative procedure that uses the performance of the classifier as guidance to select a new set of weights
  - They normally give good solutions, since they can incorporate the classifier's feedback into the selection of weights
- Preset bias methods
  - These methods obtain the values of the weights using a pre-determined function that measures the information content of each feature (e.g., the mutual information or the correlation between each feature and the class label)
  - They have the advantage of executing very fast
- The issue of performance bias versus preset bias will be revisited when we cover feature subset selection (FSS)
  - In FSS, performance bias methods are called wrappers and preset bias methods are called filters
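A preset bias scheme can be sketched by weighting each feature with its absolute correlation to a numerically encoded class label (an illustrative sketch of one of many possible preset functions; the toy data are invented, and no classifier feedback is used):

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def preset_weights(data, labels):
    """Preset bias: weight feature d by |corr(feature_d, label)|,
    computed once up front, with no classifier in the loop."""
    D = len(data[0])
    return [abs(pearson([x[d] for x in data], labels)) for d in range(D)]

# The first feature tracks the label perfectly; the second is unrelated
data = [(0, 5), (1, 3), (0, 4), (1, 6)]
labels = [0, 1, 0, 1]
print(preset_weights(data, labels))  # -> [1.0, 0.0]
```

A performance bias (wrapper) method would instead re-run the kNN classifier with candidate weight vectors and keep whichever scores best.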
Improving the nearest neighbor search procedure
- The problem of nearest neighbor search can be stated as follows
  - Given a set of N points in D-dimensional space and an unlabeled example xu∈ℜD, find the point that minimizes the distance to xu
  - The naïve approach of computing a set of N distances and finding the (k) smallest becomes impractical for large values of N and D
- There are two classical algorithms that speed up the nearest neighbor search
  - Bucketing (a.k.a. Elias's algorithm) [Welch 1971]
  - k-d trees [Bentley, 1975; Friedman et al., 1977]
- Bucketing
  - In the bucketing algorithm, the space is divided into identical cells, and for each cell the data points inside it are stored in a list
  - The cells are examined in order of increasing distance from the query point, and for each cell the distance is computed between its internal data points and the query point
  - The search terminates when the distance from the query point to the cell exceeds the distance to the closest point already visited
- k-d trees
  - A k-d tree is a generalization of a binary search tree in high dimensions
    - Each internal node in a k-d tree is associated with a hyper-rectangle and a hyper-plane orthogonal to one of the coordinate axes
    - The hyper-plane splits the hyper-rectangle into two parts, which are associated with the child nodes
    - The partitioning process goes on until the number of data points in the hyper-rectangle falls below some given threshold
  - The effect of a k-d tree is to partition the (multi-dimensional) sample space according to the underlying distribution of the data, the partitioning being finer in regions where the density of data points is higher
    - For a given query point, the algorithm works by first descending the tree to find the data points lying in the cell that contains the query point
    - Then it examines surrounding cells if they overlap the ball centered at the query point and the closest data point found so far
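A k-d tree nearest neighbor search along these lines can be sketched in Python (a minimal sketch: median splits cycling through the axes, with the ball-overlap test described above; the sample points are arbitrary):

```python
import math

def build_kdtree(points, depth=0):
    """Recursively split the point set at the median of one axis per level."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {'point': points[mid], 'axis': axis,
            'left': build_kdtree(points[:mid], depth + 1),
            'right': build_kdtree(points[mid + 1:], depth + 1)}

def nearest(node, query, best=None):
    """Descend to the cell containing the query, then back up, exploring a
    far subtree only if its splitting plane is closer than the best point."""
    if node is None:
        return best
    if best is None or math.dist(query, node['point']) < math.dist(query, best):
        best = node['point']
    diff = query[node['axis']] - node['point'][node['axis']]
    near, far = ((node['left'], node['right']) if diff < 0
                 else (node['right'], node['left']))
    best = nearest(near, query, best)
    if abs(diff) < math.dist(query, best):  # ball overlaps the far half-space
        best = nearest(far, query, best)
    return best

tree = build_kdtree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
print(nearest(tree, (9, 2)))  # -> (8, 1)
```

For k > 1 neighbors the same backtracking scheme applies, with `best` replaced by a bounded priority queue of the k closest points found so far.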
k-d tree example
[Figure: k-d tree data structure (3D case) and the induced partitioning (2D case)]