Integration of administrative sources and survey data through HMM for the production of labour statistics
Danila Filipponi, Ugo Guarnera, Roberta Varriale
ISTAT
Istat - 19 November 2018
outline
1 the problem
2 non supervised approach
3 Application to employment statistics
4 results and conclusions
the problem
motivating example
Build a micro-data file to support the estimates on employment (including the Permanent Census)
Available information:
- weekly administrative data (AD) on employment status and other individual characteristics
- quarterly labour force survey (LFS) data
- yearly Master Sample (MS) data (from 2018)
Table 1: Cross-classification of the employment status measured by LFS data and AD (percentages). Year 2015, Italy

LFS \ AD        Out     In      Total
Not Employed    59.9    2.9     62.8
Employed        3.2     34.0    37.2
Total           63.1    36.9    100.0
The two measures disagree on about 6% of the surveyed units (2.9% + 3.2% = 6.1%).
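The disagreement rate can be checked directly from the off-diagonal cells of Table 1. The following is a minimal sketch; the array layout and variable names are illustrative and do not come from the slides.

```python
import numpy as np

# Cell percentages from Table 1.
# Rows: LFS (not employed, employed); columns: AD (out, in).
table = np.array([[59.9, 2.9],
                  [3.2, 34.0]])

agree = np.trace(table)          # both sources report the same status
disagree = table.sum() - agree   # off-diagonal cells: 2.9 + 3.2

print(f"agreement:    {agree:.1f}%")
print(f"disagreement: {disagree:.1f}%")
```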
[Figure: Overcoverage of AD according to LFS data, by type of source. Year 2015, Italy]
non supervised approach
the approach
Given the informative context and the expected output, a suitable approach should:
- model data at unit level
- be unsupervised, since none of the sources can reasonably be considered error-free (all the sources are treated as imperfect measures of the true phenomenon)
- take into account the longitudinal structure of the data
Non supervised approach: the variables
- L: variables representing the “true” (latent) target phenomenon
  - L are the variables that we would observe if the data were error-free
  - L are considered latent because they are not directly observed
- Yg (g = 1, ..., G): variables representing imperfect (observed) measures of the target phenomenon
- Q1, Q2: covariates associated with the latent variables L and with the measures Yg, respectively
Non supervised approach: the model
The statistical model is composed of two components specified via
the conditional probability distributions:
P(L | Q1) : latent model
P(Y1, . . . , YG | L, Q2) : measurement model
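Written out, the two components multiply to give the joint distribution, and summing over the latent variable gives the distribution of the observed measures. This is the standard latent-variable factorisation; it assumes, as the notation suggests, that L depends on the covariates only through Q1 and that the measures depend on them only through L and Q2.

```latex
P(Y_1, \dots, Y_G, L \mid Q_1, Q_2)
  = P(L \mid Q_1)\, P(Y_1, \dots, Y_G \mid L, Q_2)

P(Y_1, \dots, Y_G \mid Q_1, Q_2)
  = \sum_{l} P(L = l \mid Q_1)\, P(Y_1, \dots, Y_G \mid L = l, Q_2)
```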
Non supervised approach: estimates
- Estimates of the model parameters can be obtained by maximum likelihood (MLE), based on the observed-data likelihood
- Using the parameter estimates, Bayes' formula gives the posterior distribution of the target variable conditional on all the available information:
  P(L | Y1, ..., YG, Q1, Q2)
- Predictions of the latent variable for each unit can be obtained by taking expectations with respect to this posterior distribution
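For a single unit and a single time point, the Bayes step can be sketched as below. This is an illustration rather than the authors' implementation: the function name `posterior_latent`, the assumption that the measures are conditionally independent given L, and all numeric values are hypothetical, and the covariates are taken as already conditioned on.

```python
import numpy as np

def posterior_latent(prior, meas_probs, observed):
    """Posterior P(L | Y_1, ..., Y_G) for one unit via Bayes' formula.

    prior:      shape (S,), prior[s] = P(L = s)
    meas_probs: list of G arrays of shape (S, C),
                meas_probs[g][s, c] = P(Y_g = c | L = s)
    observed:   list of G observed category indices
    Assumes the measures are conditionally independent given L.
    """
    joint = np.asarray(prior, dtype=float).copy()
    for probs, y in zip(meas_probs, observed):
        joint *= np.asarray(probs, dtype=float)[:, y]
    return joint / joint.sum()

# Hypothetical parameter values (index 0 = employed, 1 = not employed).
prior = [0.4, 0.6]
lfs = [[0.95, 0.05], [0.03, 0.97]]   # P(Y1 = c | L = s)
ad = [[0.90, 0.10], [0.02, 0.98]]    # P(Y2 = c | L = s)

# LFS reports "employed", AD reports "not employed".
post = posterior_latent(prior, [lfs, ad], observed=[0, 1])
print(post)  # post[0] is the predicted probability of being employed
```

Because the status is binary, the expectation of the indicator "employed" under the posterior is simply `post[0]`.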
Application to employment statistics
modelling employment data
Goal: estimation of the monthly employment status
- two categories: 1 = employed, 2 = not employed
- Lt: true employment status (latent), Lt ∈ {1, 2}, t ∈ {1, ..., 12}
- Y1: employment status according to the LFS, Y1t ∈ {1, 2}
- Y2: employment status according to the AD, Y2t ∈ {1, 2}
Covariates:
- Q: retirement status, student, income, age, sex
- SOURCE: type of administrative source
After the comments and suggestions from the Advisory board, we
define the following model:
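[Diagram: graph of the model, with nodes L1, L2, L3, L4 (latent states), Y11 and Y14 (LFS measures), Y21, Y22, Y23, Y24 (AD measures), SOURCE and Q (covariates), and X]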
Latent model
- initial probability: $p_{l_1} \doteq P(L_1 = j)$
- transition matrix: $p^{(t)}_{i|j} \doteq P(L_t = i \mid L_{t-1} = j)$
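The latent model is a two-state Markov chain, so the marginal distribution of the true status in each month follows from the initial probabilities and the transition matrix. In the sketch below the numeric values are hypothetical and the transition matrix is held constant over time for brevity, whereas the superscript (t) in the slides allows it to change by month.

```python
import numpy as np

# Hypothetical values; index 0 = employed, 1 = not employed.
init = np.array([0.37, 0.63])      # init[j] = P(L_1 = j)

# Column-stochastic, matching the slide's indexing:
# trans[i, j] = P(L_t = i | L_{t-1} = j), so every column sums to 1.
trans = np.array([[0.97, 0.02],
                  [0.03, 0.98]])

def marginals(init, trans, n_steps):
    """Return an (n_steps, S) array whose row t-1 holds P(L_t = i)."""
    out = [init]
    for _ in range(n_steps - 1):
        out.append(trans @ out[-1])   # P(L_t) = trans @ P(L_{t-1})
    return np.vstack(out)

m = marginals(init, trans, 12)
print(m.round(4))
```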
Measurement model
Y1: employment status according to the LFS
Y2: employment status according to the AD
- local independence
- serial independence
- measurement errors: $\psi^{g}_{j|i} \doteq P(Y_{gt} = y_{gj} \mid L_t = i)$
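Given the latent and measurement parameters, the posterior probability of the true status in each month can be computed with the standard forward-backward recursions for hidden Markov models. The sketch below is not the authors' code: it reads local independence as "the sources are independent given Lt" and serial independence as "errors at different times are independent given the latent states", it omits the covariates and the mixture variable X, and every numeric value is hypothetical. Missing observations are coded as -1, since the node labels Y11 and Y14 suggest the LFS measure is not available at every time point.

```python
import numpy as np

def smooth(init, trans, emis, obs):
    """Forward-backward smoothing for one unit.

    init:  (S,)   init[i] = P(L_1 = i)
    trans: (S, S) trans[i, j] = P(L_t = i | L_{t-1} = j)
    emis:  list of G arrays (S, C), emis[g][i, c] = P(Y_gt = c | L_t = i)
    obs:   (G, T) integer array of observed categories, -1 where missing
    Returns a (T, S) array with P(L_t = i | all observations).
    """
    init = np.asarray(init, dtype=float)
    trans = np.asarray(trans, dtype=float)
    obs = np.asarray(obs)
    G, T = obs.shape
    S = init.shape[0]

    # Local independence: the likelihood at time t is the product over
    # sources; a missing observation contributes a factor of 1.
    lik = np.ones((T, S))
    for g in range(G):
        e = np.asarray(emis[g], dtype=float)
        for t in range(T):
            if obs[g, t] >= 0:
                lik[t] *= e[:, obs[g, t]]

    # Forward pass (rescaled at each step for numerical stability).
    alpha = np.zeros((T, S))
    a = init * lik[0]
    alpha[0] = a / a.sum()
    for t in range(1, T):
        a = (trans @ alpha[t - 1]) * lik[t]
        alpha[t] = a / a.sum()

    # Backward pass.
    beta = np.ones((T, S))
    for t in range(T - 2, -1, -1):
        b = trans.T @ (lik[t + 1] * beta[t + 1])
        beta[t] = b / b.sum()

    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)

# Hypothetical parameters; index 0 = employed, 1 = not employed.
init = [0.4, 0.6]
trans = [[0.97, 0.02], [0.03, 0.98]]
lfs = [[0.95, 0.05], [0.03, 0.97]]
ad = [[0.90, 0.10], [0.02, 0.98]]

# Four time points: LFS observed at t = 1 and t = 4, AD at every t.
obs = [[0, -1, -1, 1],
       [0, 0, 0, 0]]
post = smooth(init, trans, [lfs, ad], obs)
print(post.round(3))  # column 0: posterior probability of being employed
```

Rescaling the forward and backward vectors at each step does not change the final posterior, because each row is normalised at the end.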
X: mixture, 3 classes
Captures heterogeneity in the population:
1 = Never employed
2 = Stayers
3 = Movers
Covariates
Q: retirement status, student, income, age, sex
SOURCE: source type
1 = No source
2 = Employees
3 = Self-employed (time)
4 = Self-employed (no time)
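Complete model
[Diagram: the full graph, combining the latent states L1-L4, the measures Y11, Y14 and Y21-Y24, the mixture variable X and the covariates SOURCE and Q]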
some results
We focus on the following outcomes from the model:
- estimates of the measurement error parameters $\psi^{g}_{j|i}$
- estimates of employment (by domain D) using the expectation of the posterior distribution:
  $\frac{\sum_{t=1}^{12} \sum_{k \in D} \hat{l}_{k,t}}{12}$
  where
  $\hat{l}_{k,t} \doteq P(l_t = 1 \mid Y_1 = y_{1,k}, Y_2 = y_{2,k}, Q = q_k)$,
  index k refers to the k-th unit, t to the month, and D is some domain of interest.
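Once the unit-level posterior probabilities are available, the domain estimate is a sum over the units in D averaged over the 12 months. A minimal sketch, in which the function name, the array layout and the toy values are illustrative assumptions:

```python
import numpy as np

def domain_employment(post_employed, in_domain):
    """Average monthly employment in a domain D.

    post_employed: (K, 12) array, entry [k, t] is the posterior
                   probability that unit k is employed in month t
    in_domain:     (K,) boolean mask selecting the units in D
    Returns the sum over t and over k in D of the probabilities,
    divided by the number of months.
    """
    post_employed = np.asarray(post_employed, dtype=float)
    in_domain = np.asarray(in_domain, dtype=bool)
    n_months = post_employed.shape[1]
    return post_employed[in_domain].sum() / n_months

# Toy example with three units: the first two belong to D.
post_employed = np.vstack([
    np.full(12, 1.0),   # employed with certainty all year
    np.full(12, 0.5),   # employed with probability 0.5 each month
    np.full(12, 0.0),   # outside D, ignored
])
estimate = domain_employment(post_employed, [True, True, False])
print(estimate)  # 1.5
```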
[Figure: Monthly and quarterly estimates of employment in Italy. Year 2015]
[Figure: Employment by region. Year 2015]
[Figure: Bootstrap 95% prediction interval for monthly employment in Umbria. Year 2015]
conclusions and future work
- Non-supervised models seem to be a promising approach in a multi-source context.
- By combining administrative and survey data, we can obtain a prediction of the employment status that takes into account misclassification errors from both sources.
Future work:
- Evaluate how to use an additional measure of employment (Continuous Census)
- Evaluate the model in small area prediction
Thanks
