Predicting Long-Term Unemployment for Workforce System Clients
Insights from Machine Learning
Jessica Smith Stockham
Objective
• Inform how the U.S. Department of Labor (DOL) workforce system can prioritize limited follow-up resources for those who are most likely to have the most trouble finding or keeping a job
• Gain insights from machine learning on the characteristics of workforce system clients who are most likely to be unemployed 1 year after exiting workforce system services
Method
[Diagram: 107 predictors → Tree Algorithm → Unemployed / Employed]
• Fit 2 different “tree-based” machine learning algorithms
  - Decision Trees
  - Random Forests
• Decision trees are relatively simple and quick but can suffer from over-fitting (i.e., they predict the training dataset too well and do not generalize to future data)
• Random Forests are computationally intensive and more complicated, but generalize better to future data
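To make the idea of a tree-based algorithm concrete, here is a minimal sketch of how a single tree node might pick a split. The slides do not say which split criterion was used; Gini impurity is assumed here because it is a common choice, and the toy ages and outcomes are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the group is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best 'value <= threshold' split."""
    best = (None, gini(labels))  # start from the impurity of the unsplit node
    n = len(labels)
    # Skip the largest value: splitting there would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical toy data: age in years and outcome one year after exit.
ages = [26, 29, 34, 41, 47, 53, 58, 63]
outcome = ["employed", "employed", "employed", "employed",
           "employed", "unemployed", "unemployed", "unemployed"]

threshold, impurity = best_split(ages, outcome)
print(threshold, impurity)
```

A full decision tree repeats this search within each resulting group. A random forest, broadly, averages many such trees grown on resampled data with random subsets of features, which is why it tends to over-fit less.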
Data
1. DOL Performance Records
  - Administrative data with characteristics of individuals served, workforce services provided, and employment outcomes for 4 quarters after exit
  - PY 2018 Q2 WIOA Performance Records Public Use Data File
2. O*Net Skills Dataset
  - Mapping of occupation codes to 35 skills (e.g., reading comprehension, active listening)
  - Every occupation is rated on each skill on a 0-5 scale
  - E.g., Chief Executives have a score of 4.88 on Active Listening and a score of 0 on Equipment Maintenance.
1 Active Learning
2 Active Listening
3 Complex Problem Solving
4 Coordination
5 Critical Thinking
6 Equipment Maintenance
7 Equipment Selection
8 Installation
9 Instructing
10 Judgment and Decision Making
11 Learning Strategies
12 Management of Financial Resources
13 Management of Material Resources
14 Management of Personnel Resources
15 Mathematics
16 Monitoring
17 Negotiation
18 Operation Monitoring
19 Operation and Control
20 Operations Analysis
21 Persuasion
22 Programming
23 Quality Control Analysis
24 Reading Comprehension
25 Repairing
26 Science
27 Service Orientation
28 Social Perceptiveness
29 Speaking
30 Systems Analysis
31 Systems Evaluation
32 Technology Design
33 Time Management
34 Troubleshooting
35 Writing
Data Scope and Limitations
• Data Scope
  - ~1 million workforce system clients
  - Client’s most recent spell of service at the workforce center
  - Adults age 25 - 65 served by the workforce system from July 1, 2016 to December 31, 2018
  - 50 states (excludes territories)
  - Employment outcomes available
• Limitation: Lost about half the raw data file when I merged in O*Net skills ratings (due to missingness on the most recent occupation variable)
  - Results may not generalize to the broader workforce system population
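The merge behind this limitation can be sketched as an inner join on occupation code. This is only an illustration: the field names, the client records, and the unmatched code "99-9999" are hypothetical. The two Chief Executives ratings are the ones quoted on the Data slide, and "11-1011" is used as an illustrative code for that occupation.

```python
# Hypothetical O*NET-style lookup: occupation code -> skill ratings on the 0-5 scale.
skills_by_occupation = {
    "11-1011": {"active_listening": 4.88, "equipment_maintenance": 0.0},
}

# Hypothetical client records; only some carry a usable occupation code.
clients = [
    {"client_id": 1, "recent_occupation": "11-1011"},
    {"client_id": 2, "recent_occupation": None},       # occupation not recorded
    {"client_id": 3, "recent_occupation": "99-9999"},  # code with no match in the lookup
    {"client_id": 4, "recent_occupation": "11-1011"},
]

def merge_skills(clients, lookup):
    """Inner join: keep only clients whose occupation code has skill ratings."""
    merged, dropped = [], []
    for client in clients:
        ratings = lookup.get(client["recent_occupation"])
        if ratings is None:
            dropped.append(client["client_id"])
        else:
            merged.append({**client, **ratings})
    return merged, dropped

merged, dropped = merge_skills(clients, skills_by_occupation)
print(f"kept {len(merged)} of {len(clients)}; dropped ids {dropped}")
```

Keeping the list of dropped IDs, rather than discarding those rows silently, is one way to later examine who is missing skill ratings.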
Participant Trends
• OUTCOME:
  - 33% of participants are unemployed 1 yr later
• SELECT FEATURES
  - 53% have only a high school diploma/GED or less
  - 43% are White
  - 28% are Black
  - Age range is diverse
[Bar chart: Age. 25-30: 20%; 31-40: 27%; 41-50: 24%; 51-65: 29%]
Decision Tree: 15 Most Important Features to predict unemployment
• Age 51-65 (vs Age 25-30)
• Education Level Less than HS (vs having a BA or higher)
• Duration (in days) of workforce system service receipt
• Living in LA, FL, MA, CT, NH, IN (vs California)
• Veteran status
• Being long-term unemployed prior to receiving workforce system services
• Being Black or providing “no response” to the race/ethnicity data element (vs being White)
• The following job skills: programming, monitoring
Random Forest: 15 Most Important Features to predict unemployment
• Age 51-65 (vs Age 25-30)
• Education Level Less than HS (vs having a BA or higher)
• Duration (in days) of workforce system service receipt
• Living in LA or FL (vs CA)
• Providing “no response” to the race/ethnicity data element (vs being White)
• The following job skills: programming, monitoring, reading comprehension, math, writing, operation monitoring
• Being long-term unemployed prior to receiving workforce system services
• Veteran status
Prediction Results
Model | Best Parameters | Accuracy on Validation Dataset (Share of Correctly Classified Cases) | Accuracy on Test Dataset (Share of Correctly Classified Cases)
Decision Tree | max # of leaf nodes = 100 | 67% | 67%
Random Forest | max depth of tree = 10; max features = 14; N estimators = 100 | 67% | 67%
Takeaways
• Decision Tree and Random Forest models have the same level of accuracy on the validation and test data.
• However, they vary somewhat in the most important predictive features
• 67% accuracy is not that much better than a coin flip
• Future research: Improve predictive power
  - Modify features (add interaction and higher order terms)
  - Restrict the data scope to a more homogeneous subset of the workforce system clients, such as low-income adults age 30-40 in California.
  - Diagnose who is missing data on O*Net skill ratings
  - Try additional machine learning algorithms
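One way to put the 67% figure in context is to compare it with a rule that ignores every predictor. The sketch below uses only the 33% unemployment share reported on the Participant Trends slide and assumes the test data has the same outcome split, which the slides do not state. The 100-client sample is hypothetical.

```python
def accuracy(y_true, y_pred):
    """Share of correctly classified cases."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical sample of 100 clients matching the reported outcome split:
# 33 unemployed (1) and 67 employed (0) one year after exit.
y_true = [1] * 33 + [0] * 67

# Baseline that always predicts "employed", whatever the client's characteristics.
always_employed = [0] * len(y_true)

baseline = accuracy(y_true, always_employed)
print(baseline)
```

Under that assumption the always-employed rule would be right for about 67% of clients, so a majority-class baseline may be a more demanding reference point than a coin flip when judging future models.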
Project website: https://jhsmith22.github.io/workforce_ml/
Extra: Fitting a Machine Learning Model
1. Engineer the features: clean data and recode values as needed. Convert categorical features into binary dummies.
2. Split data into “training” and “test” datasets. Hold the “test” data in reserve until Step #5.
3. Try out a range of model parameters on the training dataset, leveraging 5-fold cross-validation to create a more robust fit.
4. Pick the best model parameters based on prediction accuracy (How well does the model trained on the “training” dataset predict the outcomes in the “validation” dataset?)
5. Assess how well my model generalizes to unseen data. Evaluate how accurately the model predicts the outcomes in the “test” dataset.
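Steps 2 through 5 can be sketched with the standard library alone. To keep the example short, a k-nearest-neighbors vote on one feature stands in for the tree models, so the tuned parameter is `k` rather than tree depth or leaf count. The data, the 80/20 split, and the parameter grid are all hypothetical.

```python
import random

def knn_predict(train, x, k):
    """Majority vote among the k training rows whose feature is closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def accuracy(train, evaluate, k):
    """Share of rows in `evaluate` classified correctly by a model built on `train`."""
    hits = sum(knn_predict(train, x, k) == y for x, y in evaluate)
    return hits / len(evaluate)

def k_fold(rows, n_folds):
    """Yield (training_part, validation_part) pairs, one per fold."""
    for i in range(n_folds):
        validation = rows[i::n_folds]
        training = [r for j, r in enumerate(rows) if j % n_folds != i]
        yield training, validation

# Hypothetical data: (service_days, unemployed_flag) with noisy labels.
rng = random.Random(0)
rows = []
for _ in range(200):
    days = rng.uniform(0, 365)
    label = 1 if days + rng.gauss(0, 80) > 180 else 0
    rows.append((days, label))

# Step 2: hold out 20% of the rows as the test set.
rng.shuffle(rows)
test, train = rows[:40], rows[40:]

# Steps 3-4: 5-fold cross-validation over a small parameter grid,
# keeping the setting with the best mean validation accuracy.
cv_scores = {}
for k in (1, 5, 15):
    fold_scores = [accuracy(tr, va, k) for tr, va in k_fold(train, 5)]
    cv_scores[k] = sum(fold_scores) / len(fold_scores)
best_k = max(cv_scores, key=cv_scores.get)

# Step 5: score the chosen setting once on the held-out test set.
test_accuracy = accuracy(train, test, best_k)
print(best_k, round(cv_scores[best_k], 3), round(test_accuracy, 3))
```

The test rows are touched only in the last step, so the reported test accuracy is not influenced by the parameter search.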
Demographic & Socioeconomic Features
# | Predictor | Coding | Description
1 | Age | Continuous | Age in years at program entry
2 | Sex | Categorical | Male, Female, neither. Omitted male for interpretability.
3 | Race/Ethnicity | Categorical | Hispanic, Asian (Not Hispanic), Black (Not Hispanic), Native Hawaiian/Pacific Islander/American Indian/Alaska Native (Not Hispanic), White (Not Hispanic), Multiple Race (Not Hispanic). Omitted “White” for interpretability.
4 | Education Level | Categorical | Less than HS, HS diploma or GED, some post-secondary, postsecondary technical or vocational certificate, Associate’s degree, Bachelor’s Degree or higher. Omitted “Bachelor’s Degree or higher” for interpretability.
5 | Veteran Status | Binary | Flag for veteran
6 | Low-income Status | Binary | Flag for low-income at entry
7 | English as a Second Language | Binary | Flag for English as a Second Language at entry
8 | Single Parent | Binary | Flag for single parent at entry
9 | Criminal History | Categorical | Yes, no, or refused to answer. Omitted “no” for interpretability.
10 | Long-term unemployed | Binary | Flag for being unemployed for 27 or more consecutive weeks
11 | Public Assistance Status | Binary | Flag for receipt of TANF, SNAP, SSI, or other reported assistance
12 | State | Categorical | State that submitted the participant data. Omitted CA for interpretability.
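The “omitted for interpretability” entries describe dummy coding with a reference category. A minimal sketch, using the Education Level row as the example; the shortened category labels and the `is_` column names are hypothetical.

```python
def to_dummies(value, categories, reference):
    """One-hot encode `value`, leaving out the reference category."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return {f"is_{c}": int(value == c) for c in categories if c != reference}

# Labels shortened from the Education Level row of the table above.
education_levels = ["less_than_hs", "hs_or_ged", "some_postsecondary",
                    "certificate", "associates", "ba_or_higher"]

low = to_dummies("less_than_hs", education_levels, reference="ba_or_higher")
ref = to_dummies("ba_or_higher", education_levels, reference="ba_or_higher")
print(low)
print(ref)
```

A client in the reference category gets zeros in every column, so each remaining dummy reads as a contrast with that category, for example “Less than HS (vs having a BA or higher)”.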
Workforce System Experience Features
# | Name | Coding | Description
13 | 2017 Exit Year | Binary | 2017 rather than 2016 Exit Year
14 | Service Duration | Continuous | Cumulative days of service
15 | Number of Spells of Service Receipt | Continuous | Count of the number of cycles of “entry” and “exit” into workforce system services
Recent Occupation Skills Features
# | Name | Coding | Description
16 | Skill Rating (for each of the 35 skills) for the client’s most recent occupation | Continuous | Rating between 0 - 5
