This is a detailed presentation on predicting employee attrition with various machine learning models. It walks through the process of statistical model building in Python.
Employee Attrition Analysis
A leading organization would like to know why its best and most experienced employees are leaving early. Based on the previous data, classification was done to predict the employees who could leave early.
IBM HR Analytics Employee Attrition & Performance (Shivangi Krishna)
- Help companies prepare for future employee loss
- Evaluate possible trends and reasons for employee attrition, in order to prevent valuable employees from leaving.
- We analyzed the numeric and categorical data with machine learning models to identify the main variables contributing to employee attrition.
- This project was carried out by three DSAI students: Angelin Grace Wijaya, Agarwala Pratham, and Krishna Shivangi.
Machine Learning Approach for Employee Attrition Analysis (ijtsrd)
"Talent management involves many managerial decisions to place the right people, with the right skills, in the appropriate location at the appropriate time. The authors report a machine learning solution for Human Resource (HR) attrition analysis and forecasting. The data for this investigation was retrieved from Kaggle, a data science and machine learning platform. The study estimates the performance of various classification algorithms and compares their classification accuracy. Model performance is evaluated in terms of the error matrix and a pseudo R-squared estimate of the error rate. The accuracy results revealed that the Random Forest model can be used effectively for classification. The analysis concludes that employee attrition depends more on employees' satisfaction level than on other attributes. Dr. R. S. Kamath | Dr. S. S. Jamsandekar | Dr. P. G. Naik, "Machine Learning Approach for Employee Attrition Analysis", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Special Issue | Fostering Innovation, Integration and Inclusion Through Interdisciplinary Practices in Management, March 2019, URL: https://www.ijtsrd.com/papers/ijtsrd23065.pdf
Paper URL: https://www.ijtsrd.com/computer-science/artificial-intelligence/23065/machine-learning-approach-for-employee-attrition-analysis/dr-r-s-kamath
The main goal of this slide deck is to leverage the power of data science to analyze existing employee data, surface interesting trends that may exist in the data set, identify the top factors that contribute to turnover, and build models to classify attrition and predict monthly income for the company Alnylam Pharmaceuticals.
* How high is your annual employee turnover?
* How much of your employee turnover consists of regretted loss?
* Do you know which employees will be the most likely to leave your company within a year?
Find the answer with HR analytics: Human Resource analytics (HR analytics) is about analyzing an organization's people problems.
This presentation introduces big data and explains how to generate actionable insights using analytics techniques. The deck explains general steps involved in a typical analytics project and provides a brief overview of the most commonly used predictive analytics methods and their business applications.
Vijay Adamapure is a Data Science Enthusiast with extensive experience in the field of data mining, predictive modeling and machine learning. He has worked on numerous analytics projects ranging from healthcare, business analytics, renewable energy to IoT.
Vijay presented these slides during the Internet of Everything Meetup event 'Predictive Analytics - An Overview' that took place on Jan. 9, 2015 in Mumbai. To join the Meetup group, register here: http://bit.ly/1A7T0A1
Today, during a Management Development Program at Radisson Hotel, Noida.
Participants came from PSUs such as NTPC and GAIL, along with corporate HR personnel with more than 20 years of experience.
A grand teaching-learning experience.
Explore how data science can be used to predict employee churn in this project presentation, allowing organizations to proactively address retention issues. This student presentation from the Boston Institute of Analytics showcases the methodology, insights, and implications of predicting employee turnover. Visit https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/ for more data science insights.
3. 1.1 OBJECTIVE AND SCOPE OF THE STUDY
The objective of this project is to predict the attrition rate for each employee, i.e., to find out who is more likely to leave the organization.
It will help organizations find ways to prevent attrition, or to plan the hiring of new candidates in advance.
Attrition is a costly and time-consuming problem for the organization, and it also leads to loss of productivity.
The scope of the project extends to companies in all industries.
4. 1.2 ANALYTICS APPROACH
Check for missing values in the data and, if there are any, process the data accordingly.
Understand how the features are related to our target variable, attrition.
Convert the target variable into numeric form.
Apply feature selection and feature engineering to make the data model-ready.
Apply various algorithms to check which one is the most suitable.
Draw recommendations based on our analysis.
5. 1.3 DATA SOURCES
For this project, an HR dataset named 'IBM HR Analytics Employee Attrition & Performance' has been picked, which is available on the IBM website.
The data contains records of 1,470 employees.
It has information about each employee's current employment status, the total number of companies worked for in the past, the total number of years at the current company and in the current role, education level, distance from home, monthly income, etc.
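As a minimal sketch of inspecting such a dataset with pandas, the five-row frame below is a made-up stand-in for the real 1,470-record file (in practice it would be loaded with pd.read_csv on the downloaded CSV; the column names match the IBM dataset, the values are illustrative):

```python
import pandas as pd

# Toy stand-in for the IBM HR dataset (the real file has 1,470 rows);
# in practice this frame would come from pd.read_csv() on the CSV.
attrition_df = pd.DataFrame({
    "Age": [41, 49, 37, 33, 27],
    "Attrition": ["Yes", "No", "Yes", "No", "No"],
    "MonthlyIncome": [5993, 5130, 2090, 2909, 3468],
    "DistanceFromHome": [1, 8, 2, 3, 2],
    "NumCompaniesWorked": [8, 1, 6, 1, 9],
})

print(attrition_df.shape)                  # (rows, columns)
print(attrition_df.dtypes)                 # numeric vs. categorical columns
print(attrition_df.isnull().sum().sum())   # total count of missing values
```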
6. 1.4 TOOLS AND TECHNIQUES
We have selected Python as our analytics tool.
Python includes many packages such as Pandas, NumPy, Matplotlib, and Seaborn.
Algorithms such as Logistic Regression, Random Forest, Support Vector Machine, and XGBoost have been used for prediction.
10. 2.2 EXPLORATORY DATA ANALYSIS
Refers to the process of performing initial investigations on the data so as to discover patterns, spot inconsistencies, test hypotheses, and check assumptions with the help of graphical representations.
Displaying the first 5 rows
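For example (with a small toy frame standing in for the real data), the first rows, summary statistics, and the balance of the target variable can be displayed as:

```python
import pandas as pd

# Toy frame standing in for the IBM HR data
df = pd.DataFrame({
    "Age": [41, 49, 37, 33, 27, 32],
    "Attrition": ["Yes", "No", "Yes", "No", "No", "No"],
    "MonthlyIncome": [5993, 5130, 2090, 2909, 3468, 3068],
})

print(df.head())                        # first 5 rows of the data
print(df.describe())                    # summary statistics, numeric columns
print(df["Attrition"].value_counts())   # class balance of the target
```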
27. Data Pre-Processing
Steps involved:
Taking care of missing data and dropping non-relevant features
Feature extraction
Converting categorical features into numeric form
Binarization of the converted categorical features
Feature scaling
Understanding the correlation of features with each other
Splitting the data into training and test data sets
Refers to a data mining technique that transforms raw data into an understandable format.
Useful in making the data ready for analysis.
28. 3.1 FEATURE SELECTION
The process wherein those features are selected which contribute most to the prediction variable or output.
Benefits of feature selection:
Improves performance
Improves accuracy
Provides a better understanding of the data
29. Dropping non-relevant variables
#dropping all fixed and non-relevant variables
attrition_df.drop(['DailyRate','EmployeeCount','EmployeeNumber','HourlyRate',
    'MonthlyRate','Over18','PerformanceRating','StandardHours',
    'StockOptionLevel','TrainingTimesLastYear'], axis=1, inplace=True)
Check the number of rows and columns (attrition_df.shape)
31. Label Encoding
Label encoding refers to converting the categorical variables into numeric form, so as to make them machine-readable.
It is an important pre-processing step for structured datasets in supervised learning.
Fit and transform the required columns of the data, then replace the existing text data with the new encoded data.
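A minimal sketch with scikit-learn's LabelEncoder on toy columns (classes are assigned integer codes in sorted order, so 'No' becomes 0 and 'Yes' becomes 1):

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.DataFrame({"Attrition": ["Yes", "No", "No", "Yes"],
                   "OverTime": ["No", "No", "Yes", "Yes"]})

le = LabelEncoder()
# fit_transform learns the classes and replaces the text with integer codes
df["Attrition"] = le.fit_transform(df["Attrition"])

print(df["Attrition"].tolist())  # [1, 0, 0, 1]
print(list(le.classes_))         # ['No', 'Yes']
```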
33. One-Hot Encoder
It is used to perform "binarization" of the categorical features and include them as features to train the model.
It takes a column of categorical data that has been label encoded, and then splits the column into multiple columns.
The numbers are replaced by 1s and 0s, depending on which column has what value.
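The same binarization can be sketched with pandas' get_dummies on a toy MaritalStatus column (scikit-learn's OneHotEncoder is the equivalent estimator API; the category values here mirror the IBM dataset):

```python
import pandas as pd

df = pd.DataFrame({"MaritalStatus": ["Single", "Married", "Divorced", "Single"]})

# get_dummies splits the categorical column into one 0/1 column per category
encoded = pd.get_dummies(df, columns=["MaritalStatus"])

print(encoded.columns.tolist())
# ['MaritalStatus_Divorced', 'MaritalStatus_Married', 'MaritalStatus_Single']
```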
35. Feature Scaling
Feature scaling is a method used to standardize the range of independent variables or features of data.
It is also known as data normalization.
It is used to scale the features to a range centred around zero, so that the variances of the features are in the same range.
The two most popular methods of feature scaling are standardization and normalization.
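A minimal standardization sketch with scikit-learn's StandardScaler on two toy features with very different ranges (e.g. income and years):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two toy features on very different scales
X = np.array([[1000.0, 1.0],
              [2000.0, 5.0],
              [3000.0, 9.0]])

# Standardization: subtract the column mean and divide by the column std,
# so every feature is centred at zero with unit variance
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

print(X_scaled.mean(axis=0))  # ~[0. 0.]
print(X_scaled.std(axis=0))   # ~[1. 1.]
```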
37. Correlation Matrix
• Correlation is a statistical technique which determines how one variable moves/changes in relation to another variable.
• It is a bivariate measure which describes the association between different variables.
Usefulness of the correlation matrix:
If two variables are closely correlated, we can predict one variable from the other.
Correlation plays a vital role in locating the important variables on which other variables depend.
It is used as the foundation for various modeling techniques.
Proper correlation analysis leads to a better understanding of the data.
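As a small illustration (toy columns standing in for the real dataset), pandas computes the full pairwise correlation matrix with DataFrame.corr():

```python
import pandas as pd

# Toy columns; income is a perfect linear function of working years here
df = pd.DataFrame({
    "MonthlyIncome": [2000, 3500, 5000, 6500, 8000],
    "TotalWorkingYears": [1, 4, 7, 10, 13],
    "DistanceFromHome": [10, 2, 7, 1, 5],
})

corr = df.corr()  # pairwise Pearson correlations between all numeric columns
print(corr.round(2))
# income and working years are perfectly linearly related here, so r = 1.0
```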
42. Model Building
The process of modeling means training a machine learning algorithm to predict the labels from the features, tuning it for the business need, and validating it on holdout data.
Models used for employee attrition:
Logistic Regression
Random Forest
Support Vector Machine
XGBoost
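The train/validate pattern above can be sketched as follows; a synthetic dataset from make_classification stands in for the prepared attrition features, and two of the four models are shown (SVM and XGBoost follow the same fit/score pattern):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the prepared attrition features and labels
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hold out 30% of the rows for validation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

accuracies = {}
for model in (LogisticRegression(max_iter=1000),
              RandomForestClassifier(random_state=0)):
    model.fit(X_train, y_train)                        # train on the features
    accuracies[type(model).__name__] = model.score(X_test, y_test)

print(accuracies)  # accuracy of each model on the holdout data
```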
43. 4.1 LOGISTIC REGRESSION
Logistic Regression is one of the most basic and widely used machine learning algorithms for solving classification problems.
It is a method used to predict a dependent variable (Y), given an independent variable (X), where the dependent variable is categorical.
44. Linear Regression Equation
Y = β0 + β1X + ε
Y stands for the dependent variable that needs to be predicted.
β0 is the Y-intercept, which is basically the point at which the line touches the y-axis.
β1 is the slope of the line (the slope can be negative or positive, depending on the relationship between the dependent variable and the independent variable).
X represents the independent variable that is used to predict our resultant dependent value.
ε denotes the error in the computation.
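Logistic regression passes this linear combination through the sigmoid function, so the output is a probability between 0 and 1. A minimal sketch with scikit-learn, using a single made-up feature and a toy binary label:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# One toy feature and a binary label (e.g. attrition yes/no)
X = np.array([[0], [1], [2], [3], [8], [9], [10], [11]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = LogisticRegression().fit(X, y)

# The model output is sigmoid(beta0 + beta1 * x), a probability in (0, 1)
p = clf.predict_proba([[1], [10]])[:, 1]
print(p.round(3))                 # low probability at x=1, high at x=10
print(clf.predict([[1], [10]]))   # [0 1]
```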
48. Confusion Matrix
The confusion matrix is the most common metric used to evaluate classification models.
It avoids "confusion" by laying out the actual and predicted values in a tabular format.
In the table, the positive class = 1 and the negative class = 0.
Standard table of the confusion matrix:
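A small worked example with scikit-learn's confusion_matrix on toy labels (rows are actual classes, columns are predicted classes):

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Layout for binary labels: [[TN, FP],
#                            [FN, TP]]
cm = confusion_matrix(y_true, y_pred)
print(cm)  # [[3 1], [1 3]]: 3 TN, 1 FP, 1 FN, 3 TP
```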
50. Receiver Operating Characteristic (ROC)
The ROC curve shows the accuracy of a classification model across user-defined threshold values.
The model's accuracy is summarized using the Area Under the Curve (AUC).
The area under the curve (AUC), also referred to as the index of accuracy (A) or concordance index, represents the performance of the ROC curve: the higher the area, the better the model.
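A short sketch with toy predicted probabilities: roc_curve traces the curve over thresholds, and roc_auc_score summarizes it as a single number:

```python
from sklearn.metrics import roc_auc_score, roc_curve

y_true  = [0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.10, 0.30, 0.35, 0.60, 0.40, 0.70, 0.80, 0.90]  # predicted probs

# AUC = fraction of (negative, positive) pairs ranked correctly
auc = roc_auc_score(y_true, y_score)
fpr, tpr, thresholds = roc_curve(y_true, y_score)

print(round(auc, 4))  # 0.9375 (15 of the 16 pairs are ranked correctly)
```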
52. ROC Curve for Logistic Regression
Using the Logistic Regression algorithm, we got an accuracy score of 79% and a roc_auc score of 0.77.
53. 4.2 RANDOM FOREST
• Random Forest is a supervised learning algorithm.
• It creates a forest of classification trees and makes it random based on the bagging technique, aggregating the trees' predictions.
• In Random Forest, only a random subset of the features is considered by the algorithm when splitting a node.
57. Using the Random Forest algorithm, we got an accuracy score of 79% and a roc_auc score of 0.76.
ROC Curve for Random Forest
58. 4.3 SUPPORT VECTOR MACHINE
SVM is a supervised machine learning algorithm used for both regression and classification problems.
The objective is to find a hyperplane in an N-dimensional space.
Hyperplanes
Hyperplanes are decision boundaries that help segregate the data points.
The dimension of the hyperplane depends upon the number of features.
59. Support Vectors
These are the data points that are closest to the hyperplane and influence its position and orientation.
They are used to maximize the margin of the classifier.
They are considered critical elements of a dataset.
60. Kernel Technique
Used when non-linear hyperplanes are needed.
The decision boundary is no longer a line; it must now be a plane (or higher-dimensional surface).
Since we have a non-linear classification problem, the kernel used here is the Radial Basis Function (RBF).
It helps in segregating data that are not linearly separable.
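A minimal sketch of the RBF kernel on XOR-style toy data, which no straight line can separate (the gamma and C values here are illustrative, not the deck's tuned parameters):

```python
import numpy as np
from sklearn.svm import SVC

# XOR-style data: opposite corners share a class, so no line separates them
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 1, 1, 0])

# The RBF kernel maps the points into a space where they become separable,
# letting the SVM draw a non-linear decision boundary
clf = SVC(kernel="rbf", gamma=2.0, C=10.0).fit(X, y)

print(clf.predict(X))  # [0 1 1 0]
```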
64. Using the SVM algorithm, we got an accuracy score of 79% and a roc_auc score of 0.77.
ROC Curve for SVM
65. 4.4 XGBOOST
XGBoost is a decision-tree-based ensemble machine learning algorithm that uses a gradient boosting framework.
XGBoost belongs to a family of boosting algorithms that convert weak learners into strong learners.
It is a sequential process: trees are grown one after the other using information from previously grown trees, and the errors of the previous model are iteratively corrected by the next predictor.
Advantages of XGBoost:
Regularization
Parallel processing
High flexibility
Handling missing values
Tree pruning
Built-in cross-validation
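To illustrate the sequential boosting idea without requiring the xgboost package, here is a sketch using scikit-learn's GradientBoostingClassifier on synthetic stand-in data; xgboost.XGBClassifier exposes an analogous fit/score interface:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the prepared attrition features and labels
X, y = make_classification(n_samples=400, n_features=12, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Each of the 100 shallow trees is fit to correct the errors of the
# ensemble built so far (the sequential process described above)
clf = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                 max_depth=3, random_state=0)
clf.fit(X_train, y_train)

acc = clf.score(X_test, y_test)  # accuracy on holdout data
print(round(acc, 3))
```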
69. Using the XGBoost algorithm, we got an accuracy score of 82% and a roc_auc score of 0.81.
ROC Curve for the XGBoost Model
70. 4.5 COMPARISON OF MODELS
It can be observed from the table that XGBoost outperforms all the other models.
Hence, based on these results, we can conclude that XGBoost will be the best model to predict future employee attrition for this company.
72. KEY FINDINGS
The dataset does not contain any missing values or redundant features.
The strongest positive correlations with the target feature are: distance from home, job satisfaction, marital status, overtime, and business travel.
The strongest negative correlations with the target feature are: performance rating and training times last year.
74. RECOMMENDATIONS
Transportation should be provided to employees living in the same area, or else a transportation allowance should be provided.
Plan and allocate projects in such a way as to avoid the use of overtime.
Employees who hit their two-year anniversary should be identified as potentially having a higher risk of leaving.
Gather information on industry benchmarks to determine whether the company is providing competitive wages.