Predicting the Contractual Full-Time Equivalent Percentage using XGBoost
10/18/2019
Stine Bakke and Knut Håkon Grini
Agenda
• A-ordningen
• Agreed working hours – and the problem
• Preparing the data
• XGBoost (Extreme Gradient Boosted Decision Trees)
• Results
• Passing judgement/final thoughts
A-ordningen – 24/7 reporting – monthly data
• Coordinated digital collection of information from employers about jobs, earnings and taxes for three public agencies
• Contractual full-time equivalent
• Paid hours (only hourly paid)
• Hours per week full time in position
• Substandard reporting: 7.5 %
Check 1: Hourly paid
• Ratio model, used only to identify outliers
• 1.5 % extremes
Check 2: Boundaries on earnings
• A lower wage threshold is established for the FTE wage, and lower and upper limits are set for hourly paid employees
• 2.7 % disqualified
Check 3: Relationship between earnings and FTE
• Iterative linear regression model that checks for outliers
• 4.7 % disqualified
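
The slide does not say how the iterative regression in Check 3 is specified. The following is only a minimal sketch of one common way to do it, assuming a simple straight-line fit of earnings on FTE and a residual cut-off of `k` standard deviations. The function names, the cut-off `k=2.5` and the round limit are all hypothetical and are not taken from the presentation.

```python
from statistics import mean, pstdev

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Assumes at least two distinct x values.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def iterative_outliers(fte, earnings, k=2.5, max_rounds=10):
    """Return the sorted indices of records flagged as outliers.

    Each round refits the line on the records still kept, then drops the
    records whose residual exceeds k residual standard deviations.
    Both k and max_rounds are illustrative values, not the authors' settings.
    """
    keep = list(range(len(fte)))
    flagged = []
    for _ in range(max_rounds):
        if len(keep) < 3:
            break  # too few records left to judge outliers
        a, b = fit_line([fte[i] for i in keep], [earnings[i] for i in keep])
        resid = {i: earnings[i] - (a + b * fte[i]) for i in keep}
        sd = pstdev(list(resid.values()))
        new = [i for i in keep if sd > 0 and abs(resid[i]) > k * sd]
        if not new:
            break  # the fit is stable: nothing more to remove
        flagged.extend(new)
        keep = [i for i in keep if i not in new]
    return sorted(flagged)
```

Refitting after each removal matters because a gross outlier pulls the first line toward itself and inflates the residual spread, which can hide smaller errors until it is gone.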
eXtreme Gradient Boosting
• Uses «gradient boosted decision trees»
• Every tree provides a set of predicted values
• Trees are «grown» based on modified versions of the data
• Observations that are predicted poorly are weighted more
• Observations that are predicted well are weighted less
• Improved prediction for each new tree
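
To make the boosting idea concrete, here is a small sketch in plain Python. It is not the XGBoost library and not the authors' model: it assumes squared-error loss, one input variable and one-split trees ("stumps"). With that loss, each new tree is fitted to the residuals of the current ensemble, so the observations that are still predicted poorly dominate the next fit. The names `fit_stump` and `boost`, the learning rate and the number of trees are illustrative choices.

```python
def fit_stump(xs, residuals):
    """Find the single split on x that minimises squared error on the residuals.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def boost(xs, ys, n_trees=20, learning_rate=0.5):
    """Return (predict function, training MSE recorded after each tree)."""
    base = sum(ys) / len(ys)          # start from the overall mean
    trees = []
    preds = [base] * len(xs)
    history = []
    for _ in range(n_trees):
        # Residuals are largest where the current ensemble predicts worst.
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + learning_rate * tree(x) for p, x in zip(preds, xs)]
        history.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))

    def predict(x):
        return base + learning_rate * sum(t(x) for t in trees)

    return predict, history
```

The `history` list shows the last bullet in action: on the training data the error does not increase as trees are added. Whether that improvement carries over to new data has to be checked separately.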
Input variables
• Fixed earnings (log)
• Reported or calculated FTE %
• Age and age squared
• Number of employees in local unit (ten groups)
• Education (first digit ISCED 2011)
• Industry (NACE 2007)
• Occupation (two-digit ISCO 2008)
• Apprenticeship
• Earning category (fixed monthly, hourly paid, other)
• Gender
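
As an illustration of how the variables listed above could be derived from a raw record, here is a hedged sketch. The field names, the string coding of the classification codes and, in particular, the size-group boundaries are assumptions made for the example. The slide only states that there are ten size groups, not where the cuts are.

```python
import math
from bisect import bisect_right

# Hypothetical cut points giving ten groups (0..9); the real grouping is not given.
SIZE_BOUNDS = [1, 5, 10, 20, 50, 100, 250, 500, 1000]

def size_group(n_employees):
    """Map a head count to one of ten groups using the assumed boundaries."""
    return bisect_right(SIZE_BOUNDS, n_employees)

def build_features(record):
    """Turn one raw job record (hypothetical field names) into model inputs.

    Assumes fixed earnings are strictly positive and that classification
    codes are stored as strings.
    """
    age = record["age"]
    return {
        "log_fixed_earnings": math.log(record["fixed_earnings"]),
        "reported_fte": record["reported_fte"],
        "age": age,
        "age_squared": age ** 2,
        "size_group": size_group(record["employees_in_local_unit"]),
        "education": record["isced_2011"][:1],    # first digit of ISCED 2011
        "industry": record["nace_2007"],
        "occupation": record["isco_2008"][:2],    # first two digits of ISCO 2008
        "apprentice": int(record["apprentice"]),
        "earning_category": record["earning_category"],
        "gender": record["gender"],
    }
```

The categorical fields are left as codes here. How they are encoded for the model (for example as indicator columns) is a separate choice that the slides do not describe.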
[Figure: chart comparing two series, "Training improved" and "Training faulty". One axis runs from 0.1 to 1.1, the other from 0 to 120 000.]
Lessons learned, passing judgement and final thoughts
• Good learning data
• Careful variable specification
• Distribution of results – is it realistic or biased?
• We are very optimistic but not quite finished
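
One simple way to look at the "realistic or biased" question is to compare how records spread over FTE values before and after prediction. The sketch below assumes FTE is expressed as a fraction and groups values by rounding to one decimal. The function names and the idea of reporting the single largest shift are illustrative and not taken from the presentation.

```python
from collections import Counter

def fte_shares(values):
    """Share of records per FTE value, after rounding to one decimal."""
    counts = Counter(round(v, 1) for v in values)
    total = len(values)
    return {b: counts[b] / total for b in sorted(counts)}

def largest_shift(reference, predicted):
    """Return (bin, change in share) for the bin that differs most.

    A positive change means the predictions put more records in that bin
    than the reference data does. Ties go to the higher bin.
    """
    ref, pred = fte_shares(reference), fte_shares(predicted)
    bins = set(ref) | set(pred)
    return max(
        ((b, pred.get(b, 0.0) - ref.get(b, 0.0)) for b in bins),
        key=lambda t: (abs(t[1]), t[0]),
    )
```

A large shift toward one value, for example toward 1.0, would be a reason to look closer before trusting the predictions.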
Thank you!

Predicting the Contractual Full-Time Equivalent Percentage using XGBoost, Stine Bakke and Knut Håkon Grini, Statistics Norway
