Human Resources Analytics: providing useful insights for employee resignation prediction
Final project for the course Big Data with NoSQL

Stockholm University, DSV dept.

Academic year 2017/18

Presented by:

• Giacomo Bartoli

• Giorgos Ntymenos
Introduction
Why do people leave this company?
Now we have to hire and train new employees.
All the knowledge they gained will benefit other companies!
I would like to know beforehand.
Data
• Level of satisfaction
• Grade of last evaluation
• Number of projects
• Average monthly hours at work
• Number of years spent at the company
• Whether the employee had accidents at work
• Whether the employee was promoted in the last 5 years
• Department
• Level of salary
• Left (whether the employee left the company)
Vs of Big Data:
- Volume: data coming from different companies
- Variety: data might have different formats, or even different attributes. Even the same attribute, for example the evaluation grade, could be computed differently from company to company, so preprocessing is required. Data of different types are also possible.
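The preprocessing step mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that two companies grade evaluations on different scales (1 to 5 and 0 to 100) and that min-max scaling is an acceptable way to make them comparable; the function name and the sample values are hypothetical.

```python
def min_max_scale(scores, low, high):
    """Map scores from the range [low, high] onto [0, 1]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return [(s - low) / (high - low) for s in scores]


# Hypothetical data: company A grades from 1 to 5, company B from 0 to 100.
company_a = min_max_scale([1, 3, 5], low=1, high=5)
company_b = min_max_scale([0, 50, 100], low=0, high=100)

# After scaling, both lists use the same 0-1 range and can be merged.
print(company_a)
print(company_b)
```

Scaling by the known bounds of each company's grading scheme, rather than by the observed minimum and maximum, keeps the mapping stable when new records arrive.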
Method
Storage
HDFS is our choice because:
- Volume: it can handle massive amounts of data
- Variety: it can accept data in almost any format
Analysis
Our aim is to predict whether an employee will leave, which is a classification task.
We chose decision trees because:
- they are simple
- they are not time consuming to train
- they are easy to scale
- overfitting can be avoided using pre- and post-pruning
- they are white-box models: the learned rules can be inspected
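To make the decision tree idea concrete, here is a minimal sketch of how a single split could be chosen. It assumes Gini impurity as the split criterion and a minimum leaf size as the pre-pruning rule; the slides do not say which criterion or pruning settings were used, and the sample data is invented.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_threshold(values, labels, min_leaf=1):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'.

    min_leaf acts as a pre-pruning rule: a split that leaves fewer than
    min_leaf samples on either side is rejected.
    """
    best = (None, gini(labels))  # no split: impurity of the whole node
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


# Invented sample: satisfaction level and whether the employee left (1) or not (0).
satisfaction = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
left_company = [1, 1, 1, 0, 0, 0]

threshold, impurity = best_threshold(satisfaction, left_company, min_leaf=2)
print(threshold, impurity)
```

A full tree repeats this search on each resulting subset; the returned threshold is also what makes the model a white box, since it reads directly as a rule such as "satisfaction <= 0.3".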
Method
Evaluation
We evaluate with AUC (area under the ROC curve) because it is more reliable than accuracy, for example when the classes are imbalanced.
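A small sketch can show why accuracy alone may mislead. It assumes an invented, imbalanced sample in which 2 of 10 employees left, and computes AUC as the probability that a randomly chosen leaver is scored higher than a randomly chosen stayer; none of these numbers come from the project.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win. Equivalent to the area under the ROC curve.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented sample: 8 employees stayed (0), 2 left (1).
y_true = [0] * 8 + [1] * 2

# A useless model that always predicts "stays" with the same score.
constant_pred = [0] * 10
constant_scores = [0.0] * 10

acc = accuracy(y_true, constant_pred)
auc_value = auc(y_true, constant_scores)
print(acc, auc_value)
```

The constant model looks good on accuracy because most employees stay, while its AUC is no better than chance.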
Results
Employees resign when:
• They perform really well and, although generally satisfied, feel undervalued, or they find better job opportunities thanks to their skills.
• They are not satisfied with the company at all.
• They are not satisfied and not very effective either.
Discussion
Scaling
Using data from many companies will lead to more accurate results, but preprocessing for integration is required.
Problem: possibly sparse data
Solution: dimensionality reduction (PCA)
Replicate data across different servers for partition tolerance.
Discussion
Value of the method and analysis results
We have clean data without missing values or many outliers, so with a decision tree we can get both speed and high performance without worrying about overfitting.
We might have very frequent writes and updates to our data.
Example: newly inserted records can be classified using the tree from the last training run. Retraining can take place as often as the IT department considers necessary.
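Classifying a record at insert time with an already trained tree could look like the following sketch. The tree structure, feature names, and thresholds are hypothetical and chosen only for illustration; they are not the model trained in this project.

```python
# Hypothetical tree kept from the last training run.
# Internal nodes test "feature <= threshold"; leaves carry a label
# (1 = predicted to leave, 0 = predicted to stay).
TREE = {
    "feature": "satisfaction",
    "threshold": 0.4,
    "left": {"label": 1},
    "right": {
        "feature": "last_evaluation",
        "threshold": 0.8,
        "left": {"label": 0},
        "right": {"label": 1},
    },
}


def classify(record, node=TREE):
    """Walk the tree from the root to a leaf and return the leaf label."""
    while "label" not in node:
        go_left = record[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]


# A newly inserted record is labelled without retraining.
new_employee = {"satisfaction": 0.2, "last_evaluation": 0.5}
print(classify(new_employee))
```

Because prediction is only a walk from root to leaf, it stays cheap even with frequent writes; the expensive step, retraining, can run on whatever schedule is chosen.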
Potential extensions:
• Using structured data from different organizations
• Also considering unstructured data as sources
