



Nearest Neighbor and Men’s Suits




Jayda Dagdelen
Nishani Siriwardane
Daniel Yehuda

Introduction

       The goal of this paper is to examine the effectiveness of machine learning/prediction technologies in making a simple daily decision. One of the first decisions we make every day is choosing what to wear. In this paper, we evaluate the effectiveness of using the well-known nearest neighbor algorithm to aid humans in making this decision. We designed an MS Windows application which takes as input an outfit descriptor in the form of garment colors, and gives as output one of three possible ratings: bad, mediocre, or good. We focused on men's suits, as there is a more or less strict rule-set that governs them, which we thought could perhaps be modeled by a computer. Women's clothing tends to be less conservative in this regard.



Machine Learning

       The problem of data classification/prediction has been one of the important elements in the growing field of Artificial Intelligence (AI) and machine learning. Everything from intelligent robots to email spam detection algorithms uses some form or another of data classification to aid in a decision-making process. The problem is set up as follows: given a set of inputs that represent some data point, suggest an output (or classification) based on some knowledge-set. For example, the robot mentioned above may take some form of its current visual data as input for a learning algorithm and base its next move (i.e., left/right turn) on the output of the algorithm. Spam detection is a good example of more straightforward classification: given an email (a set of words which act as inputs), classify it on an integer scale between 0 (most probably legitimate) and 10 (most probably spam). In both cases, some algorithm with a predefined knowledge-set returns a prediction based on its input (and this knowledge).

        Perhaps the most important feature of any classification algorithm that falls under the realm of machine learning is the ability to build a knowledge-base from some training data set, or in other words, to learn. In terms of the spam example, a training set might encompass thousands of emails which are pre-classified (by a human) into the different score-groups. Using a method known as supervised learning, the algorithm parses all of these input/output pairs and attempts to "learn" the function that appropriately maps the input vectors onto their corresponding outputs. Whichever learning method is used, the system builds some knowledge that is based on the training set and can later be applied to other unclassified data (vectors). Though other types of learning are possible, including transduction, which evaluates its previous experiences to learn its own bias, we will concentrate on the simple supervised learning method outlined above.1



The Nearest Neighbor Algorithm

        In the realm of supervised learning algorithms, there are many options. Neural Network and Support Vector Machine (SVM) systems are some of the more complicated and advanced ones which have been successfully implemented and enjoy widespread use in industry. However, another popular and often very effective classification system is the simple Nearest-Neighbor algorithm. Though quite memory-intensive, as it maintains a list of all previously-trained vectors and their classifications, it performs just as well as the others in many of its applications. Because of its simplicity, its often comparable performance, and our relatively tiny data-set, we chose to use it as our classifier. We were also interested to see just how well a simple algorithm would approximate the human "taste function."

1 Machine learning, Wikipedia. <http://en.wikipedia.org/wiki/Machine_learning>

        Although the nearest neighbor algorithm has both geometric2 and classification applications, we will concentrate on the latter. A good example of its usage in classification is the prediction of individuals' political party-affiliations. With input data such as age, education level, income level, and gender (all grouped together to form a d-dimensional vector), the algorithm can be used to predict the party of the person represented by the inputs. A party-affiliated point in d-dimensional space represents each person in the data set. The classifier determines the party affiliation of a new person by assigning it the affiliation of its nearest neighbor.3 The following process is employed to do this: the geometric distance from the new data point to each element of the set of classified points is calculated. The shortest distance indicates the nearest neighbor of the new data point, and the class (in this case party-affiliation) of that nearest neighbor is assigned to the new data point.4
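
As a rough illustration of this prediction step, a brute-force nearest-neighbor classifier might look like the following sketch (the attribute encoding, names, and data here are invented for illustration and are not taken from any real survey):

#include <vector>
#include <cmath>
#include <limits>
#include <string>
#include <iostream>

// One labeled example: a d-dimensional attribute vector plus a class label.
struct Example {
    std::vector<double> attrs;  // e.g., {age, years of education, income, gender}
    std::string label;          // e.g., "D" or "R"
};

// Squared Euclidean distance; sufficient for comparisons, no sqrt needed.
double sqDist(const std::vector<double>& a, const std::vector<double>& b) {
    double s = 0.0;
    for (size_t i = 0; i < a.size(); ++i) {
        double t = a[i] - b[i];
        s += t * t;
    }
    return s;
}

// Return the label of the training example closest to the query point.
std::string nearestNeighbor(const std::vector<Example>& train,
                            const std::vector<double>& query) {
    double best = std::numeric_limits<double>::max();
    std::string bestLabel;
    for (const Example& ex : train) {
        double d = sqDist(ex.attrs, query);
        if (d < best) { best = d; bestLabel = ex.label; }
    }
    return bestLabel;
}

int main() {
    // Tiny, made-up training set: {age, years of education, income (k$), gender}.
    std::vector<Example> train = {
        {{34, 16, 60, 0}, "D"},
        {{52, 12, 45, 1}, "R"},
        {{29, 18, 70, 0}, "D"},
        {{61, 14, 80, 1}, "R"},
    };
    std::vector<double> newVoter = {40, 16, 65, 0};
    std::cout << nearestNeighbor(train, newVoter) << "\n";  // label of the closest point
}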

        Again, it is important to note that the knowledge-base of the nearest neighbor

algorithm is no more intricate than the entire set of points that have already been

classified during some previous phase. The classification of these points constitutes the

training, or supervised-learning phase of the algorithm, whereas we will refer to the

prediction of new points simply as the prediction phase.

2 A classic example of its usage in geometry would be emergency dispatchers. Given the location of the fire, the emergency dispatcher finds the closest firehouse from a map, and dispatches the vehicles from there.
3 Nearest Neighbor Search <http://www2.toki.or.id/book/AlgDesignManual/BOOK/BOOK4/NODE188.HTM>
4 Nearest neighbor (pattern recognition), Wikipedia. <http://en.wikipedia.org/wiki/Neares_neighbor_%28pattern_recognition%29>

        The prediction phase of the algorithm is the part in which the actual knowledge (database of classified vectors) that the system already has is used to make statistically educated guesses as to the appropriate classification of new vectors. The nature of the algorithm, however, makes this part very time consuming: in the brute-force implementation, some constant amount of computing time, C, must be spent comparing the new input vector to each of the n vectors in the database. The result is an algorithm whose query time is linearly proportional to the size of the database. To combat this problem, various optimizations, such as specialized trees that organize the pre-classified data, have been developed; these drastically reduce the number of distances that need to be computed. Such methods partition the geometric space and compute only the distances within specified limits.5



Alternative Approaches

        In the realm of nearest neighbor, there are a variety of other approaches/options

which deserve some attention. Firstly, it is important to note that in practice, a common

variant that is often employed is known as k-nearest neighbor, in which in which k

number of data points are used to estimate the output of the new input data point. To

highlight its effectiveness, we will examine the following example which maps

1-dimensional vectors to their classifications:

Input :         0.0     1.0      1.7      2.5    3.0     3.5      4.0      5.0   6.0   7.0

Output:         D       D        D        R      R       D        R        R     R     R



5 Nearest neighbor (pattern recognition), Wikipedia. <http://en.wikipedia.org/wiki/Neares_neighbor_%28pattern_recognition%29>

        An input such as 0.6 would be classified as D with the simple nearest neighbor algorithm. When the k-nearest neighbor algorithm is applied with k = 2 or 3, it would still be classified as D. However, determining the output of an input such as 3.7 with the k-nearest neighbor algorithm is more difficult. With the simple nearest neighbor algorithm, the output would be D. When k = 2, the two closest neighbors are D and R, and do not belong to the same class. When k = 3, two out of three nearest neighbors are R; therefore the classification is R. When k = 10, all the neighbors in the set are taken into account, and the classification is R.6
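
The following sketch reproduces this vote in code (illustrative only; note that the k = 2 tie described above is not given any special handling here and simply falls to one side):

#include <algorithm>
#include <cmath>
#include <iostream>
#include <vector>

// The ten labeled 1-D points from the example above.
const std::vector<double> xs = {0.0, 1.0, 1.7, 2.5, 3.0, 3.5, 4.0, 5.0, 6.0, 7.0};
const std::vector<char>   ys = {'D', 'D', 'D', 'R', 'R', 'D', 'R', 'R', 'R', 'R'};

// Classify a query point by majority vote among its k nearest neighbors.
char knnClassify(double query, int k) {
    // Pair up (distance, label) and sort by distance.
    std::vector<std::pair<double, char>> byDist;
    for (size_t i = 0; i < xs.size(); ++i)
        byDist.push_back({std::fabs(xs[i] - query), ys[i]});
    std::sort(byDist.begin(), byDist.end());

    int votesD = 0, votesR = 0;
    for (int i = 0; i < k && i < (int)byDist.size(); ++i)
        (byDist[i].second == 'D') ? ++votesD : ++votesR;
    return votesD > votesR ? 'D' : 'R';   // ties fall to 'R' here, arbitrarily
}

int main() {
    std::cout << knnClassify(0.6, 1) << "\n";  // D
    std::cout << knnClassify(3.7, 1) << "\n";  // D (nearest point is 3.5)
    std::cout << knnClassify(3.7, 3) << "\n";  // R (two of three neighbors are R)
    std::cout << knnClassify(3.7, 10) << "\n"; // R (six of ten points are R)
}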

        Unlike in the simple nearest neighbor method, in the k-nearest neighbor method the calculation of errors becomes important as well. The value of k should be chosen such that the prediction error is minimized. Calculating the prediction error requires a loss function. Loss functions take the truth and the prediction as input and produce 0 when the two match, or large values when the truth and the prediction are far apart.7 Though more complicated, the use of k-nearest neighbor in our implementation might have proved more effective.
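
To make this concrete in our own words (the notation here is ours, not taken from the sources above): the simplest such loss is the 0-1 loss, L(truth, prediction) = 0 if the two agree and 1 otherwise. A common way to choose k is then leave-one-out cross-validation over the training set: for each candidate k, predict every training point from the remaining points, average the 0-1 losses to obtain an estimated error rate, and keep the k with the smallest estimate.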

        Another option we considered was using a slightly less conventional model, in which the system would be trained only on good outfits. In such a model, the final rating of an outfit would be some decreasing function of the measured distance from its nearest neighbor, rather than simply that neighbor's classification. Yet another option is to use the traditional model, but with different weights on each of the seven items of the suit; a sketch of such a weighted distance follows below. The purpose of this would be to avoid some of the problems previously outlined.
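
A sketch of what such a weighted distance might look like (the weights shown are invented placeholders, not values we actually estimated):

#include <vector>

// Squared weighted Euclidean distance between two 7-element outfits,
// one value per garment (jacket, pants, shirt, tie, belt, socks, shoes).
// Larger weights make disagreement on that garment "cost" more.
double weightedSqDist(const std::vector<double>& a,
                      const std::vector<double>& b,
                      const std::vector<double>& w) {
    double s = 0.0;
    for (size_t i = 0; i < a.size(); ++i) {
        double t = a[i] - b[i];
        s += w[i] * t * t;
    }
    return s;
}

// Hypothetical weights: jacket and pants matter most, tie color the least.
const std::vector<double> kWeights = {3.0, 3.0, 2.0, 0.5, 1.0, 1.0, 1.5};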


6 Kth Nearest Neighbor Classification: Introduction <http://stat-www.berkeley.edu/users/nolan/stat133/Fall04/lectures/KNN.pdf>
7 Cross Validation <http://stat-www.berkeley.edu/users/nolan/stat133/Fall04/lectures/CV.pdf>

       There is also the alternative of using a completely different algorithm, perhaps not even under the umbrella of machine learning. One such algorithm could rate suits based on a knowledge-set which simply describes the weights of, and required correlations between, different elements of the suit. Such an algorithm would, for example, assign a value rating the matchability between different elements of the suit (jacket/pants, shirt/tie, etc.) and then use these values in determining the score.
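
A rough sketch of this rule-based idea follows (the pair list and scores are invented for illustration; a real rule-set would have to be written out by hand):

#include <map>
#include <string>
#include <utility>

// Hand-written "matchability" scores in [0, 1] for a few garment-color pairs.
// Anything not listed falls back to a neutral 0.5.
const std::map<std::pair<std::string, std::string>, double> kPairScore = {
    {{"navy jacket",  "grey pants"},  0.9},
    {{"navy jacket",  "brown shoes"}, 0.4},
    {{"white shirt",  "red tie"},     0.8},
    {{"black shoes",  "brown belt"},  0.1},
};

double pairScore(const std::string& a, const std::string& b) {
    auto it = kPairScore.find({a, b});
    return it != kPairScore.end() ? it->second : 0.5;
}

// Overall rating: average the scores of the pairings we care about.
double rateOutfit(const std::string& jacket, const std::string& pants,
                  const std::string& shirt, const std::string& tie,
                  const std::string& shoes, const std::string& belt) {
    double total = pairScore(jacket, pants) + pairScore(shirt, tie) +
                   pairScore(jacket, shoes) + pairScore(shoes, belt);
    return total / 4.0;   // could be thresholded into bad/mediocre/good
}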

       There are of course many other alternatives both within and outside the realm of

machine learning. We hoped only to scratch the surface of what we thought could be an

interesting way to help humans with a simple every-day decision.



Implementation

       As previously mentioned, we chose to limit the scope of our project to that of men's suits. Women's outfits have numerous varieties in terms of shape, style, color, cut and cloth - elements that would make a program that evaluates women's outfits too complicated for a project of this size. Men's suits are more standard in terms of shape and style, consisting only of pants, socks, shoes, shirts, jackets, ties and belts. We assumed that the main criterion for the evaluation of men's outfits is garment color; it is the most important element used by humans in determining whether a set of suit elements is a good "match." These decisions allow almost any men's suit to be represented simply as a list of its seven garment/accessory colors.

       With the above in mind, the problem of assessing an outfit as bad, mediocre, or good can essentially be thought of as one of prediction. In terms of machine learning, some system could be trained on various sets of seven-color combinations, each associated with some rating (bad, mediocre, good), and then queried with new color combinations for a prediction response. As for nearest neighbor, the same methodology applies. There is, however, an important aspect which needed consideration – how exactly to represent each color in terms that the algorithm can understand.

       Although the nearest neighbor algorithm can be implemented to work with discrete data (as in the party-affiliation problem discussed earlier, where an input such as gender is discrete), using distance functions suited to such data, color is anything but discrete. Color is a continuous spectrum on which humans can often measure some type of distance. In other words, given three colors, we can usually group two of them as being "closest" to each other. It is this very measure of distance that the nearest neighbor algorithm relies on to match certain color groupings with others. A natural way of approaching this problem is to map each of the possible colors (~16.7 million on most computers today) to a number and then use the standard Euclidean distance function as a measure of closeness. However, who is to say what colors should be close to each other on the number line? A somewhat artificial but more logical approach is to break down each color into some other representation. In our case, we chose to represent each color as the intensity levels of the three primary colors red, green and blue (each primary color can take 256 intensity levels, so adjusting them appropriately yields 256^3 ≈ 16.7 million colors). The result is a system which maps 21-dimensional input vectors (3 primary-color intensities for each of the seven garment colors) to one of three rating categories (bad, mediocre, good). Though this decision triples our vector size, it organizes the colors in an ordering in which, at least at some level, the distance between colors can be measured via a geometric Euclidean function.
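
As a sketch of this encoding (the garment ordering and the sample colors in the comment are illustrative, not the exact values used internally by our application):

#include <array>
#include <vector>

// A color is three 0-255 intensities: red, green, blue.
struct Rgb { int r, g, b; };

// Flatten the seven garment colors into one 21-dimensional input vector,
// in a fixed order: jacket, pants, shirt, tie, belt, socks, shoes.
std::vector<int> outfitToVector(const std::array<Rgb, 7>& garments) {
    std::vector<int> v;
    v.reserve(21);
    for (const Rgb& c : garments) {
        v.push_back(c.r);
        v.push_back(c.g);
        v.push_back(c.b);
    }
    return v;
}

// Example: a navy jacket, grey pants, white shirt, dark red tie, etc.
// std::array<Rgb, 7> outfit = {{ {0,0,128}, {128,128,128}, {255,255,255},
//                                {139,0,0}, {0,0,0}, {64,64,64}, {30,30,30} }};
// std::vector<int> input = outfitToVector(outfit);  // 21 numbers, ready for NN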

       The following is a screenshot of the developed application:

[screenshot of the application window not reproduced in this text version]
       The window allows for the selection of a dataset (pre-classified knowledgebase)

and the setting of any of the seven garment/accessory colors. The “Predict!” button runs

the nearest neighbor algorithm on the 21-dimensional input vector corresponding to the

chosen colors and outputs the response in the text-field (in this example, according to the

knowledgebase, the given outfit is predicted to be a “good” one). The “Add Datapoint”

button is used to add a combination and rating to the currently open dataset – the slider

above it can be set to any of the three ratings (bad being leftmost).

Methodology

       To train the program, the first step involved designing sets of color combinations for the suit and rating each as good, mediocre, or bad. We chose 35 outfits for each classification. The good outfits were chosen by browsing online men's advertisements and finding the latest fashions. The mediocre outfits were created by using our own tastes to modify the good outfits into merely acceptable ones. Finally, the bad outfits were created by randomly choosing ridiculous color combinations that we thought would be tasteless.

       To test the success of the nearest neighbor algorithm in suit matching, it was necessary to create a testing data set consisting of outfits already rated by a human, and then compare how the program rated them. The test data set consisted of thirty different outfits, of which one third were bad, one third good, and the other third mediocre. These outfits were chosen by a member of our group who was not involved in training the program, so that the results would not be too biased. The thirty test outfits were input into the program and the category the program assigned to each outfit was recorded. The success of the program was measured by assigning a score of 1 if the program rated a test outfit in the same category as the human, and a score of 0 if the program rated it differently than the humanly assigned category. Note that no weight was placed on how "wrong" the program was in rating the outfit. For example, if the human assigned the outfit to the good category but the computer assigned it to either the mediocre or the bad one, the result would receive the same score of 0 in both cases, even though a computer rating of mediocre is closer to being "correct."
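
Expressed as code, the scoring described above amounts to the following sketch (illustrative; the names are ours):

#include <vector>

// Ratings encoded as 0 = bad, 1 = mediocre, 2 = good.
// Returns the fraction of test outfits the program got "wrong",
// counting every mismatch the same regardless of how far off it is.
double errorRate(const std::vector<int>& humanRatings,
                 const std::vector<int>& programRatings) {
    int wrong = 0;
    for (size_t i = 0; i < humanRatings.size(); ++i)
        if (humanRatings[i] != programRatings[i])
            ++wrong;
    return static_cast<double>(wrong) / humanRatings.size();
}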

        We designed two experiments to determine some factors that affected the results of the nearest neighbor algorithm. Our first experiment involved testing the hypothesis that the larger the training data set, the more accurate the algorithm would be in predicting a "correctly" matched outfit. We trained the program with two different data sets: one consisting of 135 outfits and the other consisting of only 68. The 68 outfits were chosen by including only every other outfit from the larger training set. We then ran the 30 test outfits against the two different-size training data sets and compared the scores (see the Results section).

        The second experiment was, in essence, a repeat of the above experiment with one important change. As mentioned before, the decision to use the RGB color representation scheme was somewhat arbitrary – this scheme is simply the most common one used by computers and offers at least some level of color-difference "measurability." There is another common scheme which some might say more closely models the human color perception continuum – HSL. With HSL, each color is also broken down into three numerical descriptors: Hue, Saturation, and Lightness, each measured as some percentage of a maximum value. Hue and saturation describe qualitative differences between colors, while lightness describes the quantitative difference in their brightness.8 In the second experiment, the same two training sets (of sizes 135 and 68 respectively) were converted into HSL representation. The same was done with the 30 test outfits, and the training/prediction was repeated. The results are below.




8 Color <http://encarta.msn.com/text_761577547__1/Color.html>
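
For reference, a standard RGB-to-HSL conversion looks roughly like the sketch below; this is the common textbook formula, and we do not claim it is identical, value for value, to the conversion our application performed:

#include <algorithm>
#include <cmath>

struct Hsl { double h, s, l; };   // h in degrees [0, 360), s and l in [0, 1]

// Convert 0-255 RGB intensities to HSL using the common textbook formula.
Hsl rgbToHsl(int r8, int g8, int b8) {
    double r = r8 / 255.0, g = g8 / 255.0, b = b8 / 255.0;
    double mx = std::max({r, g, b});
    double mn = std::min({r, g, b});
    double l = (mx + mn) / 2.0;

    if (mx == mn)                     // a grey: hue and saturation are zero
        return {0.0, 0.0, l};

    double d = mx - mn;
    double s = (l > 0.5) ? d / (2.0 - mx - mn) : d / (mx + mn);

    double h;
    if (mx == r)      h = std::fmod((g - b) / d, 6.0);
    else if (mx == g) h = (b - r) / d + 2.0;
    else              h = (r - g) / d + 4.0;
    h *= 60.0;
    if (h < 0.0) h += 360.0;
    return {h, s, l};
}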

Results

Effects of the size of Training Data

        When the program was trained with 135 different outfits, it incorrectly categorized 36.7% of the 30 human-rated test outfits. (Incorrect denotes that the computer did not place the outfit in the same category as the human.) Statistically, with 95% confidence, this implies that with 135 different outfits in its knowledgebase, the algorithm will incorrectly categorize outfits 19.4%-53.9% of the time. When the program was trained on only 68 different outfits, it incorrectly categorized the outfits 50% of the time. With 95% confidence, with only 68 outfits in its knowledgebase, the program incorrectly categorizes outfits 32.1%-67.9% of the time. As we hypothesized, when there is less training data, the nearest neighbor algorithm is less accurate in its predictions. The more data points in its knowledgebase, the higher the chance that some new input will have a nearest neighbor that is "closer." With too few data points, a new vector's closest neighbor may be quite far away on the color continuum and thus too different to trust as a member of the same classification group.
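
For reference, these intervals are consistent with the usual normal-approximation (Wald) interval for a proportion: with n = 30 test outfits and an observed error proportion p of 11/30 ≈ 0.367, the interval p ± 1.96·sqrt(p(1 − p)/n) works out to 0.367 ± 0.173, or roughly 19.4%-53.9%; with p = 15/30 = 0.5, the same formula gives 0.5 ± 0.179, or 32.1%-67.9%.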



Effects of a different color representation

        Replicating the same experiments with the HSL color representation scheme, we found only negligible performance differences. For the trial with the 135-outfit training set, the error rate was again 36.7%, with a 95% confidence interval of 19.4%-53.9%. The results were identical for the trial with only 68 outfits.

Conclusion of Results

       The results outlined above suggest that, with enough data points, the nearest neighbor learning algorithm we implemented does decently in terms of agreeing with another human's classification of outfits. Though the difference in error rates between our two trials (in RGB) may not be statistically significant, the literature on nearest neighbor, and on machine learning in general, does support this conjecture. It is important to note that in our trials, the mean error rate was significantly less than 66.7%, the expected error rate of a random classifier. Furthermore, the upper bound of the confidence interval of the RGB trial with the larger dataset is still less than this number.

       As for the results from the HSL representation, we see no improvement on the trial with the larger dataset and only a slight improvement on the trial with the smaller dataset. We can therefore draw no statistically significant conclusions about the most appropriate color representation model for use in such an algorithm. It is quite possible, however, that with a larger dataset and more than a single trial, some conclusions could be reached on this question.



Conclusion: Discussion

       We have shown that the simple nearest-neighbor algorithm performs relatively

well in rating what we will call the “matchability” of outfits, based solely on color. We

have also demonstrated the use of an alternate color representation scheme and its effect

on the algorithm’s performance. However, the question of where and under what

conditions the algorithm fails still remains.

       In developing and testing the algorithm, we came to understand its true limitations in terms of real-world application. These limitations stem mainly from the fact that the algorithm, in and of itself, simply does what it says it does – it finds the nearest neighbor and assigns that neighbor's classification to the new outfit. It follows that when a new outfit matches an existing one perfectly in all dimensions except, for example, jacket color, the algorithm will almost surely rate the outfit according to its near-perfect match. But herein lies the problem – a "perfectly" matching outfit immediately goes from good to quite bad the moment the color of a major piece of the outfit, for example the jacket, is changed to a ridiculous color. The nearest neighbor algorithm inherently cannot understand this and so often fails in evaluating such outfits. With more appropriately-chosen training points in that region, it might perform better.

       Another major problem with the algorithm is its lack of a true understanding of how humans tend to rate an outfit. Namely, it fails to weight and correlate different elements of the suit. For example, while the matching of the jacket and pants is essential, there is often much more leeway with tie color. The basic algorithm, however, gives these two dimensions the same weight/importance in computing distances and so fails on this front. Nearest neighbor's failure regarding correlation is best illustrated by the rating given to a very well-matched (at least by our standards) but rather colorful suit. Because our training set consists only of more conservative/traditional suits, the algorithm ends up classifying such a suit as bad.

                                   APPENDIX

Nearest Neighbor Core Functions:

#include "stdafx.h"
#include "nearestneighbor.h"
#include <cmath>
#include <vector>

using namespace std;

NearestNeighbor::NearestNeighbor(DataSet * d_local, int k_local) :
      d(d_local), k(k_local)
{
      standardize();
}

NearestNeighbor::~NearestNeighbor(void)
{
}

/* Returns the squared Euclidean distance between two vectors x and y,
   which are assumed to be standardized already and to have the same
   dimension. The square root is skipped for speed: only comparisons
   are needed, and squaring preserves the ordering. */
double NearestNeighbor::distance(vector<double> &x, vector<double> &y) {
      double dist = 0;
      for (size_t i = 0; i < x.size(); i++) {
            double t = x[i] - y[i];
            dist += t*t;
      }
      return dist;
}

/* Scales every attribute of every training example by that attribute's
   standard deviation over the training set, so that no single attribute
   dominates the distance computation. (Note: values are divided by the
   standard deviation only; the mean is not subtracted.) */
void NearestNeighbor::standardize() {
      vector<vector<int> > & input = d->trainEx;   // raw training examples
      int numAttrs = d->numAttrs;
      int numExs = d->numTrainExs;

      // compute per-attribute means
      vector<double> mean(numAttrs);
      for (int i = 0; i < numExs; i++) {
            for (int j = 0; j < numAttrs; j++)
                  mean[j] += (double)input[i][j];
      }
      for (int i = 0; i < numAttrs; i++)
            mean[i] /= (double)numExs;

      // compute per-attribute standard deviations
      stdev.resize(numAttrs);
      for (int i = 0; i < numExs; i++) {
            for (int j = 0; j < numAttrs; j++) {
                  double t = (double)input[i][j] - mean[j];
                  stdev[j] += t*t;
            }
      }
      for (int i = 0; i < numAttrs; i++) {
            stdev[i] /= (double)numExs;
            stdev[i] = sqrt(stdev[i]);
      }

      // scale the training examples by the standard deviations
      data.resize(numExs);
      for (int i = 0; i < numExs; i++) {
            data[i].resize(numAttrs);
            for (int j = 0; j < numAttrs; j++)
                  if (stdev[j] != 0)
                        data[i][j] = (double)input[i][j] / stdev[j];
      }
}

/* Scales the query example with the training standard deviations, then
   does a brute-force scan over all training examples and returns the
   label of the closest one. */
int NearestNeighbor::predict(vector<int> &ex) {
      int numAttrs = d->numAttrs;

      vector<double> dex(numAttrs);
      for (int i = 0; i < numAttrs; i++)
            if (stdev[i] != 0)
                  dex[i] = (double)ex[i] / stdev[i];

      double bestDist = distance(data[0], dex);
      int bestIndex = 0;
      for (int i = 1; i < d->numTrainExs; i++) {
            double dist = distance(data[i], dex);
            if (dist < bestDist) {
                  bestDist = dist;
                  bestIndex = i;
            }
      }
      return d->trainLabel[bestIndex];
}

                                  BIBLIOGRAPHY

Cover, T. M. and P. E. Hart. “Nearest Neighbor Pattern Classification,” IEEE
Transactions on Information Theory, Vol. IT-13, No.1, January 1967.

Gooda, Abdel-Hamid. “Application of The Techniques of Data Compression and Nearest
Neighbor Classification to Information Retrieval,” 2002.

Nayar, Shree K. and Sameer A. Nene. “A Simple Algorithm for Nearest Neighbor Search
in High Dimensions,” IEEE Transactions on Pattern Analysis and Machine Intelligence,
Vol. 19, No.9, September 1997.

Pace, R. Kelley and Dongya Zou. “Closed-Form Maximum Likelihood Estimates of
Nearest Neighbor Spatial Dependence,” Geographical Analysis, Volume 32, Number 2,
April 2000.

Yau, Hung-Chun and Michael T. Manry. “Iterative Improvement of a Nearest Neighbor
Classifier.”

Color
<http://encarta.msn.com/text_761577547__1/Color.html>

Cross Validation
<http://stat-www.berkeley.edu/users/nolan/stat133/Fall04/lectures/CV.pdf>

Kth Nearest Neighbor Classification: Introduction.
<http://stat-www.berkeley.edu/users/nolan/stat133/Fall04/lectures/KNN.pdf>

Machine learning, Wikipedia.
<http://en.wikipedia.org/wiki/Machine_learning>

Nearest Neighbor Search
<http://www2.toki.or.id/book/AlgDesignManual/BOOK/BOOK4/NODE188.HTM>

Nearest neighbor (pattern recognition), Wikipedia.
<http://en.wikipedia.org/wiki/Neares_neighbor_%28pattern_recognition%29>

Nearest Neighbor Search
<http://www.cs.sunysb.edu/~algorith/files/nearest-neighbor.shtml>

More Related Content

What's hot

Lecture 2 Basic Concepts in Machine Learning for Language Technology
Lecture 2 Basic Concepts in Machine Learning for Language TechnologyLecture 2 Basic Concepts in Machine Learning for Language Technology
Lecture 2 Basic Concepts in Machine Learning for Language TechnologyMarina Santini
 
Explainable AI - making ML and DL models more interpretable
Explainable AI - making ML and DL models more interpretableExplainable AI - making ML and DL models more interpretable
Explainable AI - making ML and DL models more interpretableAditya Bhattacharya
 
Presentation on supervised learning
Presentation on supervised learningPresentation on supervised learning
Presentation on supervised learningTonmoy Bhagawati
 
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...Madhav Mishra
 
Machine Learning Unit 4 Semester 3 MSc IT Part 2 Mumbai University
Machine Learning Unit 4 Semester 3  MSc IT Part 2 Mumbai UniversityMachine Learning Unit 4 Semester 3  MSc IT Part 2 Mumbai University
Machine Learning Unit 4 Semester 3 MSc IT Part 2 Mumbai UniversityMadhav Mishra
 
Machine learning
Machine learningMachine learning
Machine learningRohit Kumar
 
Lecture #1: Introduction to machine learning (ML)
Lecture #1: Introduction to machine learning (ML)Lecture #1: Introduction to machine learning (ML)
Lecture #1: Introduction to machine learning (ML)butest
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningShahar Cohen
 
Machine Learning
Machine LearningMachine Learning
Machine LearningRahul Kumar
 
Learning Methods in a Neural Network
Learning Methods in a Neural NetworkLearning Methods in a Neural Network
Learning Methods in a Neural NetworkSaransh Choudhary
 
Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?Marina Santini
 
LearningAG.ppt
LearningAG.pptLearningAG.ppt
LearningAG.pptbutest
 

What's hot (17)

Lecture 2 Basic Concepts in Machine Learning for Language Technology
Lecture 2 Basic Concepts in Machine Learning for Language TechnologyLecture 2 Basic Concepts in Machine Learning for Language Technology
Lecture 2 Basic Concepts in Machine Learning for Language Technology
 
Explainable AI - making ML and DL models more interpretable
Explainable AI - making ML and DL models more interpretableExplainable AI - making ML and DL models more interpretable
Explainable AI - making ML and DL models more interpretable
 
Terminology Machine Learning
Terminology Machine LearningTerminology Machine Learning
Terminology Machine Learning
 
Presentation on supervised learning
Presentation on supervised learningPresentation on supervised learning
Presentation on supervised learning
 
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
 
Machine Learning Unit 4 Semester 3 MSc IT Part 2 Mumbai University
Machine Learning Unit 4 Semester 3  MSc IT Part 2 Mumbai UniversityMachine Learning Unit 4 Semester 3  MSc IT Part 2 Mumbai University
Machine Learning Unit 4 Semester 3 MSc IT Part 2 Mumbai University
 
Machine learning
Machine learningMachine learning
Machine learning
 
Statistical learning intro
Statistical learning introStatistical learning intro
Statistical learning intro
 
Lecture #1: Introduction to machine learning (ML)
Lecture #1: Introduction to machine learning (ML)Lecture #1: Introduction to machine learning (ML)
Lecture #1: Introduction to machine learning (ML)
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Learning in AI
Learning in AILearning in AI
Learning in AI
 
Chaptr 7 (final)
Chaptr 7 (final)Chaptr 7 (final)
Chaptr 7 (final)
 
Learning
LearningLearning
Learning
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Learning Methods in a Neural Network
Learning Methods in a Neural NetworkLearning Methods in a Neural Network
Learning Methods in a Neural Network
 
Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?
 
LearningAG.ppt
LearningAG.pptLearningAG.ppt
LearningAG.ppt
 

Viewers also liked

Abstract
AbstractAbstract
Abstractbutest
 
amta-decision-trees.doc Word document
amta-decision-trees.doc Word documentamta-decision-trees.doc Word document
amta-decision-trees.doc Word documentbutest
 
web design
web designweb design
web designbutest
 
CP2083 Introduction to Artificial Intelligence
CP2083 Introduction to Artificial IntelligenceCP2083 Introduction to Artificial Intelligence
CP2083 Introduction to Artificial Intelligencebutest
 
Tearn Up pitch deck.pdf
Tearn Up pitch deck.pdfTearn Up pitch deck.pdf
Tearn Up pitch deck.pdfasenju
 
1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同butest
 
EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEbutest
 

Viewers also liked (9)

Abstract
AbstractAbstract
Abstract
 
amta-decision-trees.doc Word document
amta-decision-trees.doc Word documentamta-decision-trees.doc Word document
amta-decision-trees.doc Word document
 
ppt
pptppt
ppt
 
web design
web designweb design
web design
 
CP2083 Introduction to Artificial Intelligence
CP2083 Introduction to Artificial IntelligenceCP2083 Introduction to Artificial Intelligence
CP2083 Introduction to Artificial Intelligence
 
Prezentare Industrial
Prezentare IndustrialPrezentare Industrial
Prezentare Industrial
 
Tearn Up pitch deck.pdf
Tearn Up pitch deck.pdfTearn Up pitch deck.pdf
Tearn Up pitch deck.pdf
 
1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同
 
EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBE
 

Similar to Machine Learning and Men's Suit Choices

Survey on classification algorithms for data mining (comparison and evaluation)
Survey on classification algorithms for data mining (comparison and evaluation)Survey on classification algorithms for data mining (comparison and evaluation)
Survey on classification algorithms for data mining (comparison and evaluation)Alexander Decker
 
A tour of the top 10 algorithms for machine learning newbies
A tour of the top 10 algorithms for machine learning newbiesA tour of the top 10 algorithms for machine learning newbies
A tour of the top 10 algorithms for machine learning newbiesVimal Gupta
 
Performance Analysis of Different Clustering Algorithm
Performance Analysis of Different Clustering AlgorithmPerformance Analysis of Different Clustering Algorithm
Performance Analysis of Different Clustering AlgorithmIOSR Journals
 
Performance Comparision of Machine Learning Algorithms
Performance Comparision of Machine Learning AlgorithmsPerformance Comparision of Machine Learning Algorithms
Performance Comparision of Machine Learning AlgorithmsDinusha Dilanka
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learningAmAn Singh
 
Classifiers
ClassifiersClassifiers
ClassifiersAyurdata
 
IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST...
IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST...IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST...
IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST...cscpconf
 
NEAREST NEIGHBOUR CLUSTER ANALYSIS.pptx
NEAREST NEIGHBOUR CLUSTER ANALYSIS.pptxNEAREST NEIGHBOUR CLUSTER ANALYSIS.pptx
NEAREST NEIGHBOUR CLUSTER ANALYSIS.pptxagniva pradhan
 
5. Machine Learning.pptx
5.  Machine Learning.pptx5.  Machine Learning.pptx
5. Machine Learning.pptxssuser6654de1
 
IRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification AlgorithmsIRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification AlgorithmsIRJET Journal
 
IRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification AlgorithmsIRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification AlgorithmsIRJET Journal
 
IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST...
IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST...IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST...
IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST...csandit
 
Bra a bidirectional routing abstraction for asymmetric mobile ad hoc networks...
Bra a bidirectional routing abstraction for asymmetric mobile ad hoc networks...Bra a bidirectional routing abstraction for asymmetric mobile ad hoc networks...
Bra a bidirectional routing abstraction for asymmetric mobile ad hoc networks...Mumbai Academisc
 
Experimental study of Data clustering using k- Means and modified algorithms
Experimental study of Data clustering using k- Means and modified algorithmsExperimental study of Data clustering using k- Means and modified algorithms
Experimental study of Data clustering using k- Means and modified algorithmsIJDKP
 
Data mining projects topics for java and dot net
Data mining projects topics for java and dot netData mining projects topics for java and dot net
Data mining projects topics for java and dot netredpel dot com
 
Density Based Clustering Approach for Solving the Software Component Restruct...
Density Based Clustering Approach for Solving the Software Component Restruct...Density Based Clustering Approach for Solving the Software Component Restruct...
Density Based Clustering Approach for Solving the Software Component Restruct...IRJET Journal
 
Paper-Allstate-Claim-Severity
Paper-Allstate-Claim-SeverityPaper-Allstate-Claim-Severity
Paper-Allstate-Claim-SeverityGon-soo Moon
 

Similar to Machine Learning and Men's Suit Choices (20)

Survey on classification algorithms for data mining (comparison and evaluation)
Survey on classification algorithms for data mining (comparison and evaluation)Survey on classification algorithms for data mining (comparison and evaluation)
Survey on classification algorithms for data mining (comparison and evaluation)
 
A tour of the top 10 algorithms for machine learning newbies
A tour of the top 10 algorithms for machine learning newbiesA tour of the top 10 algorithms for machine learning newbies
A tour of the top 10 algorithms for machine learning newbies
 
Performance Analysis of Different Clustering Algorithm
Performance Analysis of Different Clustering AlgorithmPerformance Analysis of Different Clustering Algorithm
Performance Analysis of Different Clustering Algorithm
 
F017132529
F017132529F017132529
F017132529
 
Performance Comparision of Machine Learning Algorithms
Performance Comparision of Machine Learning AlgorithmsPerformance Comparision of Machine Learning Algorithms
Performance Comparision of Machine Learning Algorithms
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learning
 
Classifiers
ClassifiersClassifiers
Classifiers
 
Machine Learning.pptx
Machine Learning.pptxMachine Learning.pptx
Machine Learning.pptx
 
IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST...
IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST...IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST...
IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST...
 
NEAREST NEIGHBOUR CLUSTER ANALYSIS.pptx
NEAREST NEIGHBOUR CLUSTER ANALYSIS.pptxNEAREST NEIGHBOUR CLUSTER ANALYSIS.pptx
NEAREST NEIGHBOUR CLUSTER ANALYSIS.pptx
 
5. Machine Learning.pptx
5.  Machine Learning.pptx5.  Machine Learning.pptx
5. Machine Learning.pptx
 
IRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification AlgorithmsIRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification Algorithms
 
IRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification AlgorithmsIRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification Algorithms
 
Cerdit card
Cerdit cardCerdit card
Cerdit card
 
IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST...
IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST...IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST...
IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST...
 
Bra a bidirectional routing abstraction for asymmetric mobile ad hoc networks...
Bra a bidirectional routing abstraction for asymmetric mobile ad hoc networks...Bra a bidirectional routing abstraction for asymmetric mobile ad hoc networks...
Bra a bidirectional routing abstraction for asymmetric mobile ad hoc networks...
 
Experimental study of Data clustering using k- Means and modified algorithms
Experimental study of Data clustering using k- Means and modified algorithmsExperimental study of Data clustering using k- Means and modified algorithms
Experimental study of Data clustering using k- Means and modified algorithms
 
Data mining projects topics for java and dot net
Data mining projects topics for java and dot netData mining projects topics for java and dot net
Data mining projects topics for java and dot net
 
Density Based Clustering Approach for Solving the Software Component Restruct...
Density Based Clustering Approach for Solving the Software Component Restruct...Density Based Clustering Approach for Solving the Software Component Restruct...
Density Based Clustering Approach for Solving the Software Component Restruct...
 
Paper-Allstate-Claim-Severity
Paper-Allstate-Claim-SeverityPaper-Allstate-Claim-Severity
Paper-Allstate-Claim-Severity
 

More from butest

LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jacksonbutest
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer IIbutest
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazzbutest
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.docbutest
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1butest
 
Facebook
Facebook Facebook
Facebook butest
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...butest
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...butest
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTbutest
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docbutest
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docbutest
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.docbutest
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!butest
 
Download
DownloadDownload
Downloadbutest
 
resume.doc
resume.docresume.doc
resume.docbutest
 

More from butest (20)

LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jackson
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer II
 
PPT
PPTPPT
PPT
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.doc
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1
 
Facebook
Facebook Facebook
Facebook
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENT
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.doc
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.doc
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.doc
 
hier
hierhier
hier
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!
 
Download
DownloadDownload
Download
 
resume.doc
resume.docresume.doc
resume.doc
 

Machine Learning and Men's Suit Choices

  • 1. 1 Nearest Neighbor and Men’s Suits Jayda Dagdelen Nishani Siriwardane Daniel Yehuda
  • 2. 2 Introduction The goal of this paper is to examine the effectiveness of machine learning/prediction technologies in making a simple daily decision. One of the first decisions we make everyday is choosing what to wear. In this paper, we evaluate the effectiveness of using the well-known nearest neighbor algorithm in aiding humans to make this decision. We designed a MS Windows application which takes as input an outfit descriptor in the form of garment colors, and gives as output one of three possible ratings: bad, mediocre, or good. We focused on men’s suits, as there is more or less a strict rule-set that governs them, which we thought could perhaps be modeled by a computer. Women’s clothing tends to be less conservative in this regard. Machine Learning The problem of data classification/prediction has been one of the important elements in the growing field of Artificial Intelligence (AI) and machine learning. Everything from intelligent robots to email spam detection algorithms use some form or another of data classification, to aid in a decision making process. The problem is set up as follows: given a set of inputs that represent some data point, suggest an output (or classification) based on some knowledge-set. For example, the robot mentioned above may take some form of its current visual data as input for a learning algorithm and base its next move (i.e., left/right turn) on the output of the algorithm. Spam detection is a good example of more straightforward classification; given an email (a set of words which act as inputs), classify it on an integer scale between 0 (most probably legitimate)
  • 3. 3 and 10 (most probably spam). In both cases, some algorithm with a predefined knowledge-set returns a prediction based on its input (and this knowledge). Perhaps the most important feature of any classification algorithm that falls under the realm of machine-learning is the ability to build a knowledge-base based on some training data set, or in other words, learn. In terms of the spam example, a training set would encompass perhaps thousands of emails which are pre-classified (by a human) into the different score-groups. Using a method known as supervised learning, the algorithm parses all of these input/output pairs and attempts to “learn” the function that appropriately maps the input vectors onto their corresponding outputs. Using whichever learning method, it builds some knowledge that is based on the training set and can later be applied to other unclassified datasets (vectors). Though other types of learning are possible, including transduction, which evaluates its previous experiences to learn its own bias, we will concentrate on the simple supervised learning method outlined above.1 The Nearest Neighbor Algorithm In the realm of supervised learning algorithms, there are many options. Neural Network and State Vector Machine (SVM) systems are some of the more complicated and advanced ones which have been successfully implemented and enjoy widespread use in the industry. However, another popular and often very effective classification system is the simple Nearest-Neighbor algorithm. Though quite memory-intensive, as it maintains a list of all previously-trained vectors and their classifications, it performs just as well as the others in many of its applications. Because of its simplicity, its often comparable 1 Machine learning, Wikipedia. < http://en.wikipedia.org/wiki/Machine_learning>
  • 4. 4 performance, and our relatively tiny data-set, we chose to use it as our classifier. We were also interested to see just how well a simple algorithm would approximate the human “taste function.” Although the nearest neighbor algorithm has both geometric2 and classification applications, we will be concentrating on the latter one. A good example of its usage in classification is the prediction of individuals’ political party-affiliations. With input data such as age, education level, income level, and gender (all grouped together to form a d- dimensional vector) the algorithm can be used to predict the party of the person represented by the inputs. A party-affiliated point in d-dimensional space represents each person in the data set. The classifier determines the party affiliation of a person by assigning it the affiliation of its nearest neighbor.3 The following process is employed to do this: the geometric distance from a new data point input to each element of the set of classified points is calculated. The shortest distance to the new data point indicates the nearest neighbor of the data point, and the class (in this case party-affiliation) of the nearest neighbor is assigned to the new data point.4 Again, it is important to note that the knowledge-base of the nearest neighbor algorithm is no more intricate than the entire set of points that have already been classified during some previous phase. The classification of these points constitutes the training, or supervised-learning phase of the algorithm, whereas we will refer to the prediction of new points simply as the prediction phase. 2 A classic example of its usage in geometry would be emergency dispatchers. Given the location of the fire, the emergency dispatcher finds the closest firehouse from a map, and dispatches the vehicles from there. 3 Nearest Neighbor Search <http://www2.toki.or.id/book/AlgDesignManual/BOOK/BOOK4/NODE188.HTM> 4 Nearest neighbor (pattern recognition), Wikipedia. <http://en.wikipedia.org/wiki/Neares_neighbor_%28pattern_recognition%29>
  • 5. 5 The prediction phase of the algorithm is the part in which the actual knowledge (database of classified vectors) that the system already has is used to make educated statistically educated guesses as to the appropriate classification of new vectors. The nature of the algorithm, however, makes this part very time consuming, as, in the brute- force implementation, some constant amount of computing time, C, must be done in comparing the new data input vector to all n vectors in the database. The result is an algorithm which requires a time that is linearly proportional to the size of the database. To combat this problem, various optimizations such as specialized trees to organize pre- classified data have been developed; these drastically reduce the number of distances needed to be computed. Such methods partition the geometric space by computing only the distances within specified limits.5 Alternative Approaches In the realm of nearest neighbor, there are a variety of other approaches/options which deserve some attention. Firstly, it is important to note that in practice, a common variant that is often employed is known as k-nearest neighbor, in which in which k number of data points are used to estimate the output of the new input data point. To highlight its effectiveness, we will examine the following example which maps 1-dimensional vectors to their classifications: Input : 0.0 1.0 1.7 2.5 3.0 3.5 4.0 5.0 6.0 7.0 Output: D D D R R D R R R R 5 Nearest neighbor (pattern recognition), Wikipedia. <http://en.wikipedia.org/wiki/Neares_neighbor_%28pattern_recognition%29>
  • 6. 6 An input such as 0.6 would be classified as D, with the simple nearest neighbor algorithm. When k-nearest neighbor algorithm is applied with k = 2 or 3, it would still be classified as D. However, determining the output of an input such as 3.7 with the k- nearest neighbor algorithm is more difficult. With the simple nearest neighbor algorithm, the output would be D. When k = 2, the two closest neighbors are D and R, and do not belong to the same class. When k = 3, two out of three nearest neighbors are R, therefore the classification is R. When k = 10, all the neighbors in the set are taken into account, and the classification is R.6 Unlike in the simple nearest neighbor method, in the k-nearest neighbor method the calculation of errors becomes important as well. The value of k should be chosen such that the prediction error should be minimized. For the calculation of the prediction error, a loss function is necessary. Loss functions take the truth and the prediction as input and produce 0 when the two match, or produce large values when the truth and the prediction are far from each other.7 Though more complicated, the use of k-nearest neighbor in our implementation may have proved more effective. Another option we considered was using a slightly less conventional model, in which the system would be trained only on good outfits. In such a model, the final rating of an outfit would be some decreasing function of the measured distance from the nearest neighbor, rather than simply its classification. Yet another option is to use the traditional model, but with different weights on each of the seven items of the suit. The purpose of this would be to avoid some the problems previously outlined. 6 Kth Nearest Neighbor Classification: Introduction <http://stat-www.berkeley.edu/users/nolan/stat133/Fall04/lectures/KNN.pdf> 7 Cross Validation < http://stat-www.berkeley.edu/users/nolan/stat133/Fall04/lectures/CV.pdf>
  • 7. 7 There is also the alternative of using a completely different algorithm, perhaps not even under the umbrella of machine learning. One such algorithm could rate suits based on a knowledge-set which simply described the weights of and required correlations between different elements of the suit. Such an algorithm would, for example assign a value rating the matchability between different elements of the suits (jacket/pants, shirt/tie, etc.) and then use these values in determining the score. There are of course many other alternatives both within and outside the realm of machine learning. We hoped only to scratch the surface of what we thought could be an interesting way to help humans with a simple every-day decision. Implementation As previously mentioned, we chose to limit the scope of our project to that of men’s suits. Women’s outfits have numerous varieties in terms of shape, style, color, cut and cloth - elements that would make a program that evaluates women’s outfits too complicated for a project of this size. Men’s suits are more standard in terms of shape and style consisting only of pants, socks, shoes, shirts, jackets, ties and belts. We assumed that the main criteria for the evaluation of men’s outfits is garment color; it is the most important element used by humans in determining if a set of suit elements are a good “match.” These decisions allow for the representation of almost any men’s suit simply in terms of a list of its seven garment/accessory colors. With the above in mind, the problem of assessing an outfit as bad, mediocre, or good can essentially be thought of as one of prediction. In terms of machine learning, some system could be trained on various sets of seven-color combinations, each
8

associated with some rating (bad, mediocre, good), and then queried with new color combinations for a prediction response. For nearest neighbor, the same methodology applies. There is, however, an important aspect that needed consideration - how exactly to represent each color in terms the algorithm can understand.

Although the nearest neighbor algorithm can be implemented to work with discrete data (as in the party-affiliation problem discussed earlier, in which one of the inputs is race) and with other distance functions suited to such data, color is anything but discrete. Color is a continuous spectrum on which humans can often measure some kind of distance. In other words, given three colors, we can usually group two of them as being "closest" to each other. It is this very measure of distance that the nearest neighbor algorithm relies on to match certain color groupings with others.

A natural way of combating this problem is to map each of the possible colors (roughly 16.7 million on most computers today) to a number and then use the standard Euclidean distance function as a measure of closeness. However, again, who is to say which colors should be close to each other on the number line? A somewhat artificial but more logical approach is to break each color down into some other representation. In our case, we chose to represent each color by the intensity levels of the three primary colors red, green, and blue (each primary color can take 256 intensities, so by adjusting them appropriately it is possible to produce 256^3 ≈ 16.7 million colors). The result is a system that maps 21-dimensional input vectors (3 primary-color intensities for each of the seven garment colors) to one of three rating categories (bad, mediocre, good). Though this decision triples our vector size, it organizes the colors in an ordering in which, at least at
9

some level, the distance between colors can be identified via a geometric Euclidean function.

The following is a screenshot of the developed application:

[Screenshot: the application window]

The window allows for the selection of a dataset (a pre-classified knowledgebase) and the setting of any of the seven garment/accessory colors. The "Predict!" button runs the nearest neighbor algorithm on the 21-dimensional input vector corresponding to the chosen colors and outputs the response in the text field (in this example, according to the knowledgebase, the given outfit is predicted to be a "good" one). The "Add Datapoint" button adds a combination and rating to the currently open dataset; the slider above it can be set to any of the three ratings (bad being leftmost).
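For concreteness, the following is a minimal sketch of the encoding described above: seven garment colors, each split into red/green/blue intensities, flattened into one 21-dimensional vector. The type and function names are illustrative and are not taken from the application's source.

// Illustrative encoding of an outfit as a 21-dimensional RGB vector
// (names are ours; the application's own GUI code is not reproduced in this paper).
#include <vector>

struct Color {
    int r, g, b;            // each intensity in the range 0..255
};

struct Outfit {
    // pants, socks, shoes, shirt, jacket, tie, belt
    Color garment[7];
};

// Flatten the seven garment colors into the 21-dimensional integer vector
// consumed by the nearest neighbor code in the appendix.
std::vector<int> toInputVector(const Outfit &o)
{
    std::vector<int> v;
    v.reserve(21);
    for (int i = 0; i < 7; i++) {
        v.push_back(o.garment[i].r);
        v.push_back(o.garment[i].g);
        v.push_back(o.garment[i].b);
    }
    return v;
}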
10

Methodology

To train the program, the first step was to design sets of color combinations for the suit and rate each as good, mediocre, or bad. We chose 35 outfits for each classification. The good outfits were chosen by browsing online men's clothing advertisements and finding the latest fashions. The mediocre outfits were created by using our own taste to modify good outfits into merely acceptable ones. Finally, the bad outfits were created by choosing random color combinations that we thought would be tasteless.

To test the success of the nearest neighbor algorithm in suit matching, it was necessary to create a testing data set consisting of outfits already rated by a human, and then to compare how the program rated them. The test data set consisted of thirty different outfits, of which one third were bad, one third good, and one third mediocre. These outfits were chosen by a member of our group who was not involved in training the program, so that the results would not be too biased. The thirty test outfits were input into the program, and the category the program assigned to each outfit was recorded.

The success of the program was measured by assigning a score of 1 if the program rated an outfit from the test data in the same category as the human had, and a score of 0 if the program rated it differently. Note that no weight was placed on how "wrong" the program was in rating the outfit. For example, if the human assigned the outfit to the good category but the computer assigned it to either the mediocre or the bad category, the result would receive the same score of 0, even though a computer rating of mediocre is closer to being "correct."

We designed two experiments to determine some of the factors that affect the results of the nearest neighbor algorithm. Our first experiment involved testing the hypothesis
11

that the larger the training data set, the more accurate the algorithm would be in predicting a "correctly" matched outfit. We trained the program with two different data sets: one consisting of 135 outfits and the other consisting of only 68. The 68 outfits were chosen by including only every other outfit from the larger training set. We then input the 30 test outfits under the two different-sized training data sets and compared the scores. See the results section below.

The second experiment was, in essence, a repeat of the above experiment with one important change. As mentioned before, the decision to use the RGB color representation scheme was somewhat arbitrary - this scheme is simply the most common one used by computers, and it offers at least some level of color-difference "measurability." There is another common scheme which some might say more closely models the human color perception continuum - HSL. With HSL, each color is also broken down into three numerical descriptors: hue, saturation, and luminescence, each measured as some percentage of a maximum value. Hue and saturation describe qualitative differences between colors, while luminescence describes the quantitative difference in their brightness.8 In the second experiment, the same two training sets (of 135 and 68 outfits respectively) were converted into the HSL representation. The same was done with the 30 test outfits, and the training/prediction was repeated. The results follow.

8 Color <http://encarta.msn.com/text_761577547__1/Color.html>
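For reference, one common way to derive hue, saturation, and lightness from 0-255 RGB intensities is sketched below. This is the standard textbook conversion, not code taken from our application, and scaling all three components to percentages is simply one choice consistent with the description above.

// Illustrative RGB -> HSL conversion (standard formula; not from the application's
// source). Inputs are 0..255 intensities; outputs are percentages of their maxima.
#include <algorithm>
#include <cmath>

struct HSL {
    double h, s, l;   // each expressed as a percentage of its maximum value
};

HSL rgbToHsl(int r8, int g8, int b8)
{
    double r = r8 / 255.0, g = g8 / 255.0, b = b8 / 255.0;
    double cmax = std::max(r, std::max(g, b));
    double cmin = std::min(r, std::min(g, b));
    double delta = cmax - cmin;

    HSL out;
    out.l = (cmax + cmin) / 2.0;                 // lightness in [0,1]

    if (delta == 0) {                            // a pure gray: no hue or saturation
        out.h = 0;
        out.s = 0;
    } else {
        out.s = delta / (1.0 - std::fabs(2.0 * out.l - 1.0));

        double hue;                              // hue in degrees, [0,360)
        if (cmax == r)
            hue = 60.0 * std::fmod((g - b) / delta, 6.0);
        else if (cmax == g)
            hue = 60.0 * ((b - r) / delta + 2.0);
        else
            hue = 60.0 * ((r - g) / delta + 4.0);
        if (hue < 0)
            hue += 360.0;
        out.h = hue / 360.0;                     // normalize hue to [0,1]
    }

    // express all three components as percentages
    out.h *= 100.0;
    out.s *= 100.0;
    out.l *= 100.0;
    return out;
}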
12

Results

Effects of the size of the training data

When the program was trained with 135 different outfits, it incorrectly categorized 36.7% of the 30 human-categorized test outfits. (Incorrect here means that the computer did not place the outfit in the same category as the human.) Statistically, with 95% confidence, this implies that with 135 different outfits in its knowledgebase, the algorithm will incorrectly categorize outfits 19.4%-53.9% of the time. When the program was trained on only 68 different outfits, it incorrectly categorized the outfits 50% of the time. With 95% confidence, with only 68 outfits in its knowledgebase, the program incorrectly categorizes outfits 32.1%-67.9% of the time.

As we hypothesized, when there is less training data, the nearest neighbor algorithm is less accurate in its predictions. The more data points in its knowledgebase, the higher the chance that some new input will have a nearest neighbor that is truly "close." With too few data points, a new vector's closest neighbor may be quite far away on the color continuum and thus too different to trust as a member of the same classification group.

Effects of a different color representation

Replicating the same experiments with the HSL color representation scheme, we found only negligible performance differences. For the trial with the 135-outfit training set, the error rate was 36.7%, with a 95% confidence interval of 19.4%-53.9%. The results were identical for the trial with only 68 outfits.
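For reference, the intervals quoted above are consistent with the standard normal-approximation (Wald) confidence interval for a proportion, p ± 1.96·sqrt(p(1 − p)/n) with n = 30. The following quick check is our own verification sketch, not part of the application.

// Quick check of the 95% confidence intervals quoted above, using the
// normal-approximation (Wald) interval for a proportion.
#include <cmath>
#include <cstdio>

int main()
{
    const double z = 1.96;                               // 95% two-sided normal quantile
    const int n = 30;                                    // number of test outfits
    const double rates[] = {11.0 / 30.0, 15.0 / 30.0};   // 36.7% (11/30) and 50% (15/30)

    for (int i = 0; i < 2; i++) {
        double p = rates[i];
        double half = z * std::sqrt(p * (1.0 - p) / n);
        std::printf("error %.1f%%: 95%% CI %.1f%% - %.1f%%\n",
                    100.0 * p, 100.0 * (p - half), 100.0 * (p + half));
    }
    return 0;   // prints roughly 19.4%-53.9% and 32.1%-67.9%
}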
13

Conclusion of Results

The results outlined above suggest that, with enough data points, the nearest neighbor learning algorithm we implemented does reasonably well in terms of agreeing with a human's classification of outfits. Though the difference in error rates between our two trials (in RGB) may not be statistically significant, the literature on nearest neighbor, and on machine learning in general, does support this conjecture. It is important to note that in our trials the mean error rate was significantly less than 66.7%, the expected error rate of a classifier that guesses uniformly at random among the three categories. Furthermore, the upper bound of the confidence interval for the RGB trial with the larger dataset is still below this number.

As for the results from the HSL representation, we see no improvement on the trial with the larger dataset and only a slight improvement on the trial with the smaller dataset. We can therefore draw no statistically significant conclusions about the most appropriate color representation model for such an algorithm. It is quite possible, however, that with a larger dataset and more than a single trial, some conclusions could be reached on this question.

Conclusion: Discussion

We have shown that the simple nearest-neighbor algorithm performs relatively well in rating what we will call the "matchability" of outfits, based solely on color. We have also demonstrated the use of an alternate color representation scheme and its effect on the algorithm's performance. However, the question of where and under what conditions the algorithm fails still remains.
14

In developing and testing the algorithm, we came to understand its true limitations in terms of real-world application. These limitations stem mainly from the fact that the algorithm, in and of itself, simply does what it says it does - it finds the nearest neighbor and assigns that neighbor's classification to the new outfit. It follows that when a new outfit matches an existing one perfectly in all dimensions except, for example, jacket color, the algorithm will almost surely rate the outfit according to its nearly perfect match. Herein lies the problem: a "perfectly" matching outfit immediately goes from good to quite bad the moment the color of a major piece of the outfit, such as the jacket, is changed to a ridiculous color. The nearest neighbor algorithm inherently cannot understand this and so often fails in evaluating such outfits. With more appropriately chosen training points in that region, it might perform better.

Another major problem with the algorithm is its lack of any true understanding of how humans tend to rate an outfit. In particular, it fails to weight and correlate the different elements of the suit. For example, while the matching of the jacket and pants is essential, there is often much more leeway with tie color. The basic algorithm, however, gives these two dimensions the same weight in computing distances and so fails on this front. Nearest neighbor's failure to capture correlations is best illustrated by the rating given to a very well-matched (at least by our standards) but rather colorful suit: because our training set consists only of more conservative, traditional suits, the algorithm ends up classifying such a suit as bad.
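The weighting problem described above could in principle be addressed with a per-garment weighted distance, along the lines of the "different weights on each of the seven items" option mentioned earlier. The sketch below is illustrative only; the weights shown are placeholders we neither tuned nor used in the reported experiments.

// Illustrative weighted squared distance over the 21-dimensional vector:
// each garment's three RGB components share one weight. The weights are
// placeholders for discussion only; they were not used in our experiments.
#include <vector>

double weightedSqDist(const std::vector<double> &x, const std::vector<double> &y)
{
    // order: pants, socks, shoes, shirt, jacket, tie, belt
    static const double garmentWeight[7] = {1.0, 0.5, 0.75, 1.0, 1.5, 0.5, 0.5};

    double dist = 0;
    for (int i = 0; i < 21; i++) {
        double t = x[i] - y[i];
        dist += garmentWeight[i / 3] * t * t;   // three components per garment
    }
    return dist;
}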
15

APPENDIX

Nearest Neighbor Core Functions:

#include "stdafx.h"
#include "nearestneighbor.h"
#include <math.h>
#include <queue>
#include <string>
#include <sstream>
using namespace std;

NearestNeighbor::NearestNeighbor(DataSet * d_local, int k_local) : d(d_local), k(k_local)
{
    standardize();
}

NearestNeighbor::~NearestNeighbor(void)
{
}

/* returns the Euclidean square distance between two vectors, x and y, which are
   assumed to be standardized already and to both have the dimension of vector x */
double NearestNeighbor::distance(vector<double> &x, vector<double> &y)
{
    double dist = 0;
    for (size_t i = 0; i < x.size(); i++) {
        double t = x[i] - y[i];
        dist += t * t;
    }
    return dist; // return square dist. for speed (no need for real dist. - comparison only)
}

/* scales each attribute of the training data by its standard deviation so that no
   single attribute dominates the Euclidean distance; the mean is used only to
   compute the standard deviations (centering would not change distances) */
void NearestNeighbor::standardize()
{
    vector<vector<int> > & input = d->trainEx;
    int numAttrs = d->numAttrs;
    int numExs = d->numTrainExs;

    // record means
    vector<double> mean(numAttrs);
    for (int i = 0; i < numExs; i++) {
        for (int j = 0; j < numAttrs; j++)
            mean[j] += (double)input[i][j];
    }
    for (int i = 0; i < numAttrs; i++)
        mean[i] /= (double)numExs;
    // end record means

    // record standard deviations
    stdev.resize(numAttrs);
    for (int i = 0; i < numExs; i++) {
        for (int j = 0; j < numAttrs; j++) {
            double t = (double)input[i][j] - mean[j];
            stdev[j] += t * t;
        }
    }
    for (int i = 0; i < numAttrs; i++) {
        stdev[i] /= (double)numExs;
        stdev[i] = sqrt(stdev[i]);
    }
    // end record standard deviations

    // standardize: store the scaled training vectors in "data"
    data.resize(numExs);
    for (int i = 0; i < numExs; i++) {
        data[i].resize(numAttrs);
        for (int j = 0; j < numAttrs; j++)
            if (stdev[j] != 0)
                data[i][j] = (double)input[i][j] / stdev[j];
    }
    // end standardize
}

/* returns the label of the single training example closest to "ex" */
int NearestNeighbor::predict(vector<int> &ex)
{
    int numAttrs = d->numAttrs;
    int numExs = d->numTrainExs;

    // standardize the query vector using the training standard deviations
    vector<double> dex(numAttrs);
    for (int i = 0; i < numAttrs; i++)
        if (stdev[i] != 0)
            dex[i] = (double)ex[i] / stdev[i];

    // linear scan for the closest standardized training vector
    double bestDist = distance(data[0], dex);
    int bestIndex = 0;
    for (int i = 1; i < numExs; i++) {
        double dist = distance(data[i], dex);
        if (dist < bestDist) {
            bestDist = dist;
            bestIndex = i;
        }
    }
    return d->trainLabel[bestIndex];
}
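The class above depends on a DataSet type whose definition is not reproduced in this appendix. Based only on the fields referenced (trainEx, trainLabel, numAttrs, numTrainExs), a hypothetical usage might look like the sketch below; the field layout and the 0/1/2 label encoding are assumptions for illustration, not the application's actual definitions.

// Hypothetical shape of the DataSet consumed by NearestNeighbor (an assumption;
// the real definition is not shown in this paper).
#include <vector>
using namespace std;

struct DataSet {
    int numAttrs;                    // 21 for the RGB encoding described earlier
    int numTrainExs;                 // number of pre-classified outfits
    vector<vector<int> > trainEx;    // one 21-dimensional color vector per outfit
    vector<int> trainLabel;          // assumed encoding: 0 = bad, 1 = mediocre, 2 = good
};

// Usage sketch: fill d.trainEx / d.trainLabel from the open dataset, then:
//
//     NearestNeighbor nn(&d, 1);          // k is stored, but predict() uses only the
//                                         // single nearest neighbor
//     vector<int> outfit(21);             // the colors chosen in the GUI
//     int rating = nn.predict(outfit);    // 0, 1, or 2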
17

BIBLIOGRAPHY

Cover, T. M. and P. E. Hart. "Nearest Neighbor Pattern Classification," IEEE Transactions on Information Theory, Vol. IT-13, No. 1, January 1967.

Gooda, Abdel-Hamid. "Application of the Techniques of Data Compression and Nearest Neighbor Classification to Information Retrieval," 2002.

Nayar, Shree K. and Sameer A. Nene. "A Simple Algorithm for Nearest Neighbor Search in High Dimensions," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 9, September 1997.

Pace, R. Kelley and Dongya Zou. "Closed-Form Maximum Likelihood Estimates of Nearest Neighbor Spatial Dependence," Geographical Analysis, Vol. 32, No. 2, April 2000.

Yau, Hung-Chun and Michael T. Manry. "Iterative Improvement of a Nearest Neighbor Classifier."

Color <http://encarta.msn.com/text_761577547__1/Color.html>

Cross Validation <http://stat-www.berkeley.edu/users/nolan/stat133/Fall04/lectures/CV.pdf>

Kth Nearest Neighbor Classification: Introduction <http://stat-www.berkeley.edu/users/nolan/stat133/Fall04/lectures/KNN.pdf>

Machine learning, Wikipedia <http://en.wikipedia.org/wiki/Machine_learning>

Nearest Neighbor Search <http://www2.toki.or.id/book/AlgDesignManual/BOOK/BOOK4/NODE188.HTM>

Nearest neighbor (pattern recognition), Wikipedia <http://en.wikipedia.org/wiki/Neares_neighbor_%28pattern_recognition%29>

Nearest Neighbor Search <http://www.cs.sunysb.edu/~algorith/files/nearest-neighbor.shtml>