2. “Numbers have an important story to tell. They
rely on you to give them a voice.”
– Stephen Few
3. Introduction
• To identify which factors help a candidate get
hired at Flipkart.
• The dataset covers Flipkart's recruitment
process, in which candidates are hired or
rejected at various stages of application
screening.
4. Project motivation/Background
• Considered multiple data sets
• Zeroed in on using live data for analysis
• Firsthand dataset from the Indian e-commerce
firm Flipkart Internet Pvt. Ltd.
5. Objectives
• Do some predictors influence the hiring process more than
others?
• Have we considered all the important independent variables
that contribute to getting a candidate hired at Flipkart?
• Can we predict if a person will be hired or rejected based on
predictors?
• What business strategies can we implement to increase a
person’s chances of getting hired?
6. Tools/Techniques Used
• We used SAS Enterprise Miner 9.4 and Excel for
analysis.
• Applied data mining techniques: decision
tree, regression and neural networks.
• Created a confusion matrix from the results of
each technique and used it to determine which
model performs better.
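As a minimal sketch of the confusion-matrix step, the Python below counts (actual, predicted) label pairs and derives accuracy from them. The labels and data are hypothetical and for illustration only; the analysis itself was done in SAS Enterprise Miner, not with this code.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))

def accuracy(matrix):
    """Share of cases where the predicted label equals the actual label."""
    total = sum(matrix.values())
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    return correct / total

# Hypothetical labels, not taken from the Flipkart data.
actual    = ["Hired", "Rejected", "Rejected", "Hired", "Rejected"]
predicted = ["Hired", "Rejected", "Hired",    "Hired", "Rejected"]

matrix = confusion_matrix(actual, predicted)
print(matrix[("Rejected", "Hired")])  # rejected candidates predicted as hired
print(accuracy(matrix))
```

Each model's matrix can be summarized this way, and the accuracy figures compared side by side.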
7. Data Set
• The dataset was shared by the company's HR team on
request, for analysis of the recruitment process.
• Because it includes the application details of candidates who
were rejected, it is not primary data on Flipkart
employees.
• The data file consists of 47,798 rows and 24 columns.
• For analysis, we took a sample of 4,782 rows.
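The sample is roughly 10% of the file. A minimal sketch of drawing such a sample follows, assuming simple random sampling without replacement; the slides do not state which sampling method was used, and the seed value is arbitrary.

```python
import random

TOTAL_ROWS = 47798   # rows in the full data file (from the slide)
SAMPLE_ROWS = 4782   # rows kept for analysis (from the slide)

def sample_row_indices(total, k, seed=42):
    """Pick k distinct row indices uniformly at random, without replacement."""
    rng = random.Random(seed)  # fixed, arbitrary seed so the draw is repeatable
    return sorted(rng.sample(range(total), k))

indices = sample_row_indices(TOTAL_ROWS, SAMPLE_ROWS)
print(len(indices), f"{SAMPLE_ROWS / TOTAL_ROWS:.1%}")
```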
9. Exploring the Data Set
• Most of our data was categorical
• Preprocessed the data and added the following variables:
1. LastCoKnown - Whether the candidate has work experience (last company known) or not.
2. HasSM - Whether the candidate has social media, for example LinkedIn.
3. Referral/Non Referral - Whether a candidate is referred or
not referred.
4. Hema/notHema - Hema accounts for 38% of the dataset,
so we introduced a field Hema/notHema.
5. Sunil/not Sunil - Sunil accounts for 14% of the dataset,
so we introduced a field Sunil/not Sunil.
6. Sunil/Hema - The TAMs with the highest number of recruits.
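The derived variables above could be built along these lines. This is a sketch only: the raw field names (`last_company`, `linkedin_url`, `source`, `tam`) are hypothetical, since the slides do not list the original 24 columns, and the sample record is invented.

```python
def add_flags(row):
    """Return a copy of one applicant record with the derived indicator fields."""
    flagged = dict(row)
    flagged["LastCoKnown"] = bool(row.get("last_company"))  # work experience known
    flagged["HasSM"] = bool(row.get("linkedin_url"))        # has social media
    flagged["Referral"] = row.get("source") == "Referral"   # referred or not
    flagged["Hema"] = row.get("tam") == "Hema"              # Hema/notHema
    flagged["Sunil"] = row.get("tam") == "Sunil"            # Sunil/not Sunil
    return flagged

# Hypothetical applicant record, not taken from the Flipkart data.
applicant = {"last_company": "", "linkedin_url": "https://example.com/in/x",
             "source": "Referral", "tam": "Hema"}
print(add_flags(applicant))
```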
10. What we selected and why?
• Predictors that had a significant impact on the
output.
• Goal: a model that accurately classifies new
applicants based on their predictor
information.
11. Preprocessing the data
• Data Redundancy
• Used the Sample node.
• Used the Impute node to treat missing values.
• Interpretation/evaluation
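To illustrate what treating missing values can look like for categorical data, here is a minimal sketch that fills gaps with the most frequent observed value. This is an assumption: the slides do not say which imputation method the Impute node was configured to use, and the sample values are invented.

```python
from collections import Counter

def impute_mode(values, missing=(None, "")):
    """Replace missing entries with the most frequent observed value."""
    observed = [v for v in values if v not in missing]
    if not observed:
        return list(values)  # nothing to learn a mode from
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v in missing else v for v in values]

# Hypothetical categorical column, not taken from the Flipkart data.
source = ["Referral", None, "Job board", "Referral", ""]
print(impute_mode(source))
```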
12. Methods for Analysis
Predictive Analytics:
• Logistic Regression
• Decision Tree
• Interactive Decision Tree
• Neural Network
• Neural Network with Regression
13. Logistic Regression
• We chose the stepwise selection method, with
validation misclassification as the selection criterion.
• For this model we get an accuracy of
approximately 95%.
14. Interactive Decision Tree
• This model considers the number of candidates whose last
company is known, who have a professional social media profile and who
have been referred by Flipkart employees.
• For this model (using the confusion matrix) we get an
accuracy of approximately 77%.
15. Interactive Decision Tree
• This model considers the number of candidates who come under the TAMs “Hema”
and “Sunil”. Hema has 38% of applicants and Sunil 14%.
• For this model (using the confusion matrix) we get an
accuracy of approximately 77%.
21. Model Comparison
• The decision tree is the best model, with the lowest
misclassification rate of 8.6%.
• The target variable is nominal, with 3 possible
values.
• Misclassification rate is the best way to compare the
models because, for a nominal response,
misclassification rates are commonly used to
assess the performance of a
classifier.
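The comparison rule above (pick the model with the lowest misclassification rate) can be sketched as follows. The model names, the three target labels and the predictions are hypothetical, because the slides do not list the target values or the per-model predictions.

```python
def misclassification_rate(actual, predicted):
    """Fraction of cases where the predicted class differs from the actual class."""
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)

def best_model(actual, predictions_by_model):
    """Return the name of the model with the lowest rate, plus all rates."""
    rates = {name: misclassification_rate(actual, preds)
             for name, preds in predictions_by_model.items()}
    return min(rates, key=rates.get), rates

# Hypothetical 3-class target and predictions, not taken from the Flipkart data.
actual = ["Hired", "Rejected", "On hold", "Rejected"]
predictions = {
    "tree":       ["Hired", "Rejected", "On hold", "Hired"],
    "regression": ["Hired", "Hired", "Rejected", "Hired"],
}
winner, rates = best_model(actual, predictions)
print(winner, rates)
```

Note that misclassification rate is simply one minus accuracy, so the two measures rank models identically when computed on the same data.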
23. Business Strategy
• The company should advertise job openings on job boards that
show a higher % of hiring (Jobs on GitHub, Glassdoor, etc.).
• Referrals play a very important role in hiring,
as is evident from the decision tree (91%).
• Flipkart can introduce incentives for the successful referral of
candidates, which in turn will encourage other employees to refer
known, skilled and qualified candidates for a particular job
opening.
• Flipkart can save time and money on the application
screening process. Flipkart will spend less on candidates
who need to be called on-site for interviews, so money
can be saved on travel and dearness allowances.
Because the self-splitting decision tree was difficult to interpret, we used an interactive decision tree.
Observation: This model predicts that candidates who are referred by Flipkart employees and whose last company/work experience is known get hired more often than those who have no work experience or whose last company is not known. Approximately 91% of referrals by Flipkart employees are for candidates with relevant work experience rather than for those with none.
Business Perspective Inference
Looking at the tree and leaf statistics, it can be seen that at least 91% of referrals by Flipkart employees are for candidates whose previous company is known, as compared to those whose previous company is unknown. Of those whose previous company is known, a significant number are getting hired. Hence, Flipkart should offer more incentives and bonuses to employees who help candidates get hired, so that capital and time are not wasted on screening candidates who do not meet the expected criteria. This could in turn imply that an employee who refers a candidate knows that candidate well and has a good idea of the requisite skills.
Observation: Sunil has a smaller department but a greater demand for employees, as he heads the AD’s Group, which is the backbone for any e-commerce firm to reach its customer base.
Business Perspective Inference
The number of applications to any department is not proportional to the size of the department; hence, the department a candidate applies to should not be a rejection criterion.