Project Report
                     Version 1.1
                    May 6, 2010



WKU Job Applicant’s Profile Evaluator using KNN

                  Vijayeandra Parthepan
                  Mohnish Thallavajhula
              Professor: Dr. Huanjing Wang




              Submitted in partial fulfillment
                  Of the requirements of
                   CS565 Data Mining




          Western Kentucky University
Project Report
                                                                                                                                                 12/05/11



                                                                 Table of Contents

1.0. Introduction
2.0. Motivation
3.0. Dataset Description
4.0. Approaches
5.0. Future Work
6.0. Results
7.0. Conclusion
8.0. References






1.0.   INTRODUCTION:

                 K Nearest Neighbor (KNN) is a supervised pattern recognition algorithm used in data mining. It
       classifies an object based on the closest training examples, and it is among the simplest of all machine
       learning algorithms. An object is classified by a majority vote of its k nearest neighbors, where k is a
       small positive integer that is usually set in advance.
                 The WKU Job Applicant’s Profile Evaluator uses KNN to analyze a job applicant’s details and
       classify the applicant into the group of jobs for which the applicant can apply.




2.0.   MOTIVATION:

                 Potential employees who wish to find jobs at the university are often not sure which jobs
       they are most likely to get, so they may end up applying to jobs that do not suit their profile. To make
       their job search more accurate, we compare their profile with those of existing employees and provide
       job suggestions. We analyze the status of WKU employees using KNN. The KNN algorithm classifies the
       new applicant into a particular class based on the existing records: the k nearest records among the
       existing job assignments are considered, and the applicant is assigned to the group indicated by
       those records.




3.0.    DATASET DESCRIPTION:

   Training data consists of the existing job assignments.


   Sample Training Data:
   A G  3.0 CS  2
   B UG 2.5 ANY 3
   C G  3.0 MPH 5


   Test data consists of the details of the job applicant.


   Sample Test Data:
   G 3.5 CS 5




   Training Data Description:
   Training data has:
   Class Name               in 1st column
   Qualification            in 2nd column
   GPA                      in 3rd column
   Department               in 4th column
   Years of experience      in 5th column


   Test Data Description:
   Test data has:
   Qualification            in 1st column
   GPA                      in 2nd column
   Department               in 3rd column
   Years of experience      in 4th column
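
   The two record layouts can be read with one small parser. The following is a minimal Python sketch, not the
   project's C# code: the names Record and parse_record are hypothetical, and it assumes that fields are
   separated by whitespace, that GPA is a decimal number, and that years of experience is a whole number.

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    """One row of the dataset (hypothetical structure for illustration)."""
    label: Optional[str]  # class name such as "A"; None for a test record
    qualification: str    # e.g. "G" or "UG"
    gpa: float
    department: str       # e.g. "CS", "MPH", "ANY"
    years: int            # years of experience


def parse_record(line: str) -> Record:
    """Parse a whitespace-separated line in the training or test layout."""
    fields = line.split()
    if len(fields) == 5:
        # Training layout: the class name comes first.
        label, rest = fields[0], fields[1:]
    elif len(fields) == 4:
        # Test layout: no class name.
        label, rest = None, fields
    else:
        raise ValueError(f"expected 4 or 5 fields, got {len(fields)}: {line!r}")
    qualification, gpa, department, years = rest
    return Record(label, qualification, float(gpa), department, int(years))


training_row = parse_record("A G 3.0 CS 2")
applicant = parse_record("G 3.5 CS 5")
```

   A record without a class name is treated as test data, so the same function serves both files.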




4.0.   APPROACHES:


                   After the group to which the job applicant belongs is determined, the list of jobs for which
       the applicant can apply is displayed.


                   The k-nearest neighbor algorithm that we apply in our project is as follows:
                   1. Calculate the “distance” from the test record to each training record.
                   2. Find the k nearest training records.
                   3. Check the majority class among the k nearest training records.
                   4. Predict the class label of the test record as the class with the majority of votes (or the
                   greatest weight) among the k nearest training records.
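
       The steps above can be sketched in a few lines of Python. This is an illustration under stated
       assumptions, not the project's C# implementation. The report does not define the distance measure, so
       the one used here is assumed: a 0/1 mismatch penalty for qualification and department, plus the absolute
       differences in GPA and years of experience. Treating a training department of ANY as matching every
       department is also an assumption. The function names are hypothetical, and votes are unweighted.

```python
from collections import Counter

# Sample training data from Section 3.0:
# (class, qualification, GPA, department, years of experience)
TRAINING = [
    ("A", "G", 3.0, "CS", 2),
    ("B", "UG", 2.5, "ANY", 3),
    ("C", "G", 3.0, "MPH", 5),
]


def distance(test, train):
    """Assumed mixed-attribute distance; the report does not specify one."""
    t_qual, t_gpa, t_dept, t_years = test
    _, qual, gpa, dept, years = train
    d = 0.0 if t_qual == qual else 1.0        # categorical: 0/1 mismatch
    d += abs(t_gpa - gpa)                     # numeric: absolute difference
    d += 0.0 if dept in (t_dept, "ANY") else 1.0  # "ANY" matches all (assumption)
    d += abs(t_years - years)                 # numeric: absolute difference
    return d


def classify(test, training, k):
    """Return the majority class among the k nearest training records."""
    if not 1 <= k <= len(training):
        raise ValueError("k must be between 1 and the number of training records")
    # Step 1: distance from the test record to every training record.
    scored = [(distance(test, row), row[0]) for row in training]
    # Step 2: keep the k nearest (smallest distance first).
    nearest = sorted(scored)[:k]
    # Steps 3 and 4: unweighted majority vote over their class labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


predicted = classify(("G", 3.5, "CS", 5), TRAINING, k=1)
```

       With the sample test record and k = 1, the nearest training record under this assumed distance is the
       class C row. The attributes are not rescaled, so years of experience can dominate the distance;
       normalizing each attribute would be a reasonable refinement.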


       We classify job applicants, based on their details, into the following groups of jobs:


                   Group A: {Graduate Assistant, Research Assistant}
                   Group B: {Lab Assistant, Desk Clerk, Night Clerk}
                   Group C: {Shuttle driver, Receptionist}
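
       Once a group has been predicted, displaying the job list is a simple lookup. The following Python
       sketch uses the three groups listed above; the names JOB_GROUPS and jobs_for are hypothetical and are
       not taken from the project's C# code.

```python
# Job groups exactly as listed in this section.
JOB_GROUPS = {
    "A": ["Graduate Assistant", "Research Assistant"],
    "B": ["Lab Assistant", "Desk Clerk", "Night Clerk"],
    "C": ["Shuttle driver", "Receptionist"],
}


def jobs_for(group):
    """Return the jobs an applicant classified into `group` can apply for."""
    try:
        return JOB_GROUPS[group]
    except KeyError:
        raise ValueError(f"unknown job group: {group!r}") from None


suggested = jobs_for("C")
```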


       The application has been developed using C# .NET.






5.0.   FUTURE WORK:


                 Convert the Windows implementation into a web application.
                 Provide a direct application process for the jobs, using the applicant’s details.




6.0.   RESULTS:


   Screen shot of the help menu:





    Screen shot of the main menu:




7.0.   CONCLUSION:


       With KNN, the applicant is classified into a particular group of jobs, which simplifies the job
   application process. Because we chose KNN, the implementation is much simpler than it would be with its
   counterparts, e.g. Decision Trees, Naïve Bayes, or Support Vector Machines.


8.0.    REFERENCES:


   http://en.wikipedia.org/wiki/K-nearest_neighbor_algorithm



