This document discusses challenges and opportunities in combining data from the European Company Survey (ECS) and the European Working Conditions Survey (EWCS). While statistical matching is not possible due to different target populations, the paper explores integrating aggregate-level estimates from one survey into a micro-dataset from the other. Key variables like sector, size, and country allow combined analysis after ensuring sufficiently populated cells. An example analysis finds relationships between employee representation and earnings at establishment, sector and country levels. Further research is suggested to better integrate survey design and estimation techniques.
This document provides an introduction to business statistics for a 4th semester BBA course. It defines statistics as the collection, analysis, and interpretation of numerical data. Descriptive statistics are used to summarize data through measures of central tendency, dispersion, graphs and tables. Inferential statistics allow generalization from samples to populations through estimation of parameters and hypothesis testing. The key terms of population, sample, parameter, and statistic are defined. Variables are characteristics that can take on different values and are classified as qualitative or quantitative. Quantitative variables are further divided into discrete and continuous types. Descriptive statistics simply describe data while inferential statistics make inferences about unknown population characteristics based on samples.
Statistics is a critical tool for robustness analysis, measurement system error analysis, test data analysis, probabilistic risk assessment, and many other fields in the engineering world. These applications map directly onto common engineering problems, helping engineers reach better solutions and work more efficiently on real-life engineering problems.
This document provides an introduction to statistics, covering key topics such as what statistics is, its functions, applications in business, and subject matter. Statistics is defined as both a set of numerical data and a set of techniques for collecting, organizing, analyzing, and interpreting quantitative data. It serves functions like simplifying complex facts, providing comparisons, and forecasting. Statistics is used widely in business decision making across areas like marketing, finance, and operations. The subject matter of statistics has two parts - descriptive statistics, which summarizes data, and inferential statistics, which makes conclusions about large groups by studying samples.
Definition, functions, scope, limitations of statistics; diagrams and graphs; basic definitions and rules for probability, conditional probability and independence of events.
Students' academic performance using clustering techniques (aniacorreya)
This document summarizes a study analyzing students' academic performance data. The study collected internal and external marks for 45 students over 5 semesters. It cleaned the data, transforming the marks into sums, and used k-means clustering to group students into 4 categories (excellent, good, fair, poor) for each semester based on their internal and external marks. The analysis found the clusters followed the same performance pattern each semester, with students scoring higher internally also scoring higher externally, indicating a direct relationship between internal and external marks. The study concluded a student's university exam performance can generally be predicted from their internal marks.
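The k-means grouping described above can be sketched in a few lines. This is a minimal illustration, not the study's code: the `marks` data (hypothetical internal/external mark totals) and the choice of plain Euclidean k-means are assumptions for demonstration.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means on 2-D points such as (internal_total, external_total)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assignment = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        for i, (x, y) in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return assignment, centroids

# Hypothetical (internal_total, external_total) sums for eight students.
marks = [(92, 88), (85, 90), (70, 65), (68, 72), (45, 50), (40, 42), (20, 25), (15, 18)]
labels, centres = kmeans(marks, k=4)
```

With k=4 the resulting clusters would then be labelled excellent/good/fair/poor by inspecting the centroid positions, as the study does per semester.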
Nature, Scope, Functions and Limitations of Statistics (Asha Dhilip)
This document defines statistics and discusses its uses and limitations. Statistics is defined as the collection, organization, analysis, and interpretation of numerical data in a systematic and accurate manner to draw valid inferences. It is used in business and management for marketing, production, finance, banking, investment, purchasing, accounting, and control. While statistics is useful for simplifying complex data and facilitating comparison, it has limitations in that it only examines quantitative aspects on average, not individuals, and statistical results may not be exact.
This document provides an overview of statistics for civil engineers. It discusses key concepts like data types, measures of central tendency and variation, probability, random variables, populations and samples, methods of collecting data, mechanistic and empirical models, and the role of statistics and probability in engineering problem solving and decision making. The objective of the course is to help students analyze data, make inferences, and test hypotheses to solve engineering problems involving variability and uncertainty. It will cover topics like probability distributions, descriptive statistics, hypothesis testing, and statistical inference through multiple chapters and a midterm and final exam.
Predicting students' performance in final examination (Rashid Ansari)
The document discusses predicting student performance in final examinations. It examines using linear regression and multilayer perceptron algorithms on attributes of student postings in discussion forums and attendance scores. The case study involved 50 students, and the multilayer perceptron model produced slightly more accurate results based on correlation coefficients and error rates. Specifically, the multilayer perceptron model had a higher correlation coefficient of 0.84 compared to 0.82 for linear regression, and lower mean absolute and root mean squared errors.
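The comparison above rests on two metrics: the Pearson correlation coefficient and the mean absolute error between predicted and actual scores. A minimal sketch of how both are computed, using made-up actual/predicted exam scores (the numbers are illustrative, not from the study):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def mae(actual, predicted):
    """Mean absolute error, one of the error rates used in the comparison."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical actual vs predicted final-exam scores for five students.
actual = [55, 60, 72, 80, 90]
predicted = [58, 59, 70, 83, 88]
r = pearson(actual, predicted)
err = mae(actual, predicted)
```

A model with a higher `r` and lower `err` tracks the true scores more closely, which is exactly the basis on which the multilayer perceptron (r = 0.84) edged out linear regression (r = 0.82) in the study.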
machine learning based predictive analytics of student academic performance i... (CloudTechnologies)
The document presents research on using machine learning algorithms to predict student performance in courses. It tested eight algorithms on data from Bradley University and evaluated their predictive accuracy. Based on the results, it makes recommendations on selecting and using ML algorithms for predictive analytics in STEM education. It also summarizes student feedback from surveys on using ML-based predictive analytics. The proposed system aims to address the lack of a system in Malaysia to analyze student data and monitor progress. It reviews literature on predicting student performance with machine learning techniques and identifies the most important attributes to improve student achievement and success. The system architecture requires a computer with at least 4GB RAM, 100GB disk, and the ability to run Python programs in an IDE like PyCharm.
1. The document analyzes kitchenware product sales data using exploratory data analysis (EDA) techniques like histograms and Monte Carlo simulations.
2. EDA was used to identify patterns in the profit data based on variables like sales volume, costs, and prices. This revealed a wide possible profit range from £-120,000 to £330,000, indicating high business risk.
3. Monte Carlo simulation of 1,000 random data points allowed comparison of mean, median, and mode profit values to assess average returns versus risk levels. The analysis provided insight into likely profit zones under different input conditions.
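The Monte Carlo step described above can be sketched as follows. The input ranges, fixed cost, and profit formula here are assumptions for illustration only; the study's actual kitchenware figures are not reproduced.

```python
import random
import statistics

def simulate_profit(n=1000, seed=42):
    """Monte Carlo sketch: draw sales volume, price and unit cost at random
    and compute profit = volume * (price - unit_cost) - fixed_cost per draw."""
    rng = random.Random(seed)
    profits = []
    for _ in range(n):
        volume = rng.uniform(5_000, 30_000)   # units sold (assumed range)
        price = rng.uniform(12.0, 25.0)       # selling price per unit (assumed)
        unit_cost = rng.uniform(8.0, 15.0)    # variable cost per unit (assumed)
        fixed_cost = 50_000                   # assumed fixed overhead
        profits.append(volume * (price - unit_cost) - fixed_cost)
    return profits

profits = simulate_profit()
mean_p = statistics.mean(profits)
median_p = statistics.median(profits)
```

Comparing `mean_p` and `median_p` against the spread of `profits` is what lets the analysis weigh average returns against risk: a wide range straddling zero, as in the study, signals high business risk even when the mean is positive.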
IEEE paper study on Influence Flower of Academic Entities (Abhiloki)
Introduction: We are three members in our group. We studied this IEEE paper and prepared our own Influence Flower for @Giuseppe Santucci. You can look into the presentation.
Youtube link:
https://youtu.be/riLBp6dZ3Xk
What is the paper about:
We present the Influence Flower, a new visual metaphor for the influence profile of academic entities, including people, projects, institutions, conferences, and journals. While many tools quantify influence, we aim to expose the flow of influence between entities. The Influence Flower is an ego-centric graph, with a query entity placed in the centre. The petals are styled to reflect the strength of influence to and from other entities of the same or different type. For example, one can break down the incoming and outgoing influences of a research lab by research topics. The Influence Flower uses a recent snapshot of Microsoft Academic Graph, consisting of 212 million authors, their 176 million publications, and 1.2 billion citations. An interactive web app, Influence Map, is constructed around this central metaphor for searching and curating visualisations. We also propose a visual comparison method that highlights change in influence patterns over time. We demonstrate through several case studies that the Influence Flower supports data-driven inquiries about the following: researchers' careers over time; paper(s) and projects, including those with delayed recognition; the interdisciplinary profile of a research institution; and the shifting topical trends in conferences. We also use this tool on influence data beyond academic citations, by contrasting the academic and Twitter activities of a researcher.
Hello, friends. I'm Humaira Jahan, and this is my presentation on statistics. You can find an overall concept of statistics in this presentation, and I hope it will help you to know enough about statistics.
In this study, the effect of combining variables from different data sources for student academic performance prediction was examined using three state-of-the-art classifiers: Decision Tree (DT), Artificial Neural Network (ANN) and Support Vector Machine (SVM). The study examined the use of heterogeneous multi-model ensemble techniques to predict student academic performance based on the combination of these classifiers and three different data sources. A quantitative approach was used to develop the various base classifier models, while the ensemble models were developed using the stacked generalisation ensemble method in order to overcome the individual weaknesses of the different models. Variables were extracted from the institution's Student Record System and Learning Management System (Moodle) and from a structured student questionnaire. At present, negligible work has been done using this integrated approach and ensemble techniques, especially with aggregated learner data, in performance prediction in HE. The empirical results obtained show that the ensemble models...
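Stacked generalisation, reduced to its core, means training a meta-learner on the base models' held-out predictions. A minimal sketch with two base models and a closed-form least-squares meta-learner; the prediction scores and outcomes below are made-up, and real stacking would use cross-validated predictions and the study's DT/ANN/SVM bases rather than this toy blend.

```python
def stack_two_models(p1, p2, y):
    """Least-squares meta-weights (w1, w2) minimising sum (w1*p1 + w2*p2 - y)^2,
    solved in closed form via the 2x2 normal equations."""
    a = sum(x * x for x in p1)
    b = sum(x * z for x, z in zip(p1, p2))
    c = sum(z * z for z in p2)
    e = sum(x * t for x, t in zip(p1, y))
    f = sum(z * t for z, t in zip(p2, y))
    det = a * c - b * b
    return (e * c - f * b) / det, (a * f - b * e) / det

# Hypothetical held-out scores from two base classifiers and true outcomes.
p1 = [0.9, 0.2, 0.8, 0.3]
p2 = [0.7, 0.4, 0.9, 0.1]
y = [1.0, 0.0, 1.0, 0.0]
w1, w2 = stack_two_models(p1, p2, y)
blend = [w1 * a + w2 * b for a, b in zip(p1, p2)]
```

Because the least-squares blend can always reproduce either base model alone (weights (1, 0) or (0, 1)), its training error is never worse than the better base model's, which is the sense in which stacking "overcomes the individual weaknesses" of its components.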
Statistics is the science of collecting, analyzing, and interpreting numerical data. It has evolved from early uses by governments to understand populations for taxation and military purposes. Modern statistics developed in the 18th-19th centuries and saw rapid growth in the 20th century with advances in computing. Statistics has two main branches - descriptive statistics which involves data presentation and inference statistics which uses data analysis to make estimates and test hypotheses. Statistics is widely used across many fields including business, economics, mathematics, and banking to facilitate decision making.
This document summarizes a study that evaluated using text mining to enhance credit scoring models. Specifically, it compared models built using only structured data, only text data extracted from comments, and a hybrid approach. The best-performing model was a hybrid model that incorporated both structured and text data, improving prediction accuracy over a model using only structured data. Even a model built solely from text data achieved reasonably good accuracy, demonstrating the potential value of text variables for credit scoring. The study thus provides evidence that credit scoring models can benefit from incorporating textual information extracted through text mining.
This document provides an outline and introduction to the key concepts in descriptive statistics. It defines important statistical terminology like population, sample, observations, and variables. The chapter will cover topics such as frequency distributions, graphical presentations of data, numerical methods for summarizing data, and describing grouped data. It establishes the necessary foundations for understanding descriptive statistics before delving into more advanced statistical analysis techniques in subsequent chapters.
IRJET- Price Prediction Model by Hedonic Concept (IRJET Journal)
This document describes a study that uses hedonic regression analysis to examine the impact of open spaces and environmental services on residential property values. The study aims to identify which open spaces influence property values, and to determine how neighborhood characteristics, structural characteristics, and distance to open spaces affect market prices. Housing and neighborhood data will be collected within 2-5 km of selected open spaces. Hedonic regression analysis will statistically determine the relationship between property values and characteristics, including distance to open spaces, to estimate the value of environmental services. The conclusions will indicate whether people are willing to pay more to live in proximity to open spaces and environmental benefits.
1) This document proposes an Adapt-then-Combine (ATC) diffusion strategy for distributed estimation over cooperative networks with missing data.
2) Each agent senses data according to a linear regression model, but the regression vectors are only partially known due to missing entries.
3) The strategy is modified through regularization to eliminate bias introduced by the missing data, and its stability and performance are examined through simulations.
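The baseline Adapt-then-Combine diffusion LMS loop (without the paper's missing-data regularization) can be sketched as below: each agent takes a gradient step on its own noisy linear measurement, then averages its neighbourhood's intermediate estimates. The network topology, step size, and noise levels are illustrative assumptions.

```python
import random

def atc_diffusion(w_true, agents, neighbors, mu=0.05, iters=300, seed=1):
    """Adapt-then-Combine diffusion LMS sketch (fully observed regressors):
    adapt = per-agent LMS step, combine = uniform average over the neighbourhood."""
    rng = random.Random(seed)
    dim = len(w_true)
    w = [[0.0] * dim for _ in range(agents)]
    for _ in range(iters):
        psi = []
        for k in range(agents):
            x = [rng.gauss(0, 1) for _ in range(dim)]              # regressor
            d = sum(a * b for a, b in zip(x, w_true)) + rng.gauss(0, 0.1)
            err = d - sum(a * b for a, b in zip(x, w[k]))
            psi.append([wi + mu * err * xi for wi, xi in zip(w[k], x)])  # adapt
        for k in range(agents):                                    # combine
            nb = neighbors[k]
            w[k] = [sum(psi[j][i] for j in nb) / len(nb) for i in range(dim)]
    return w

# Hypothetical 4-agent ring network jointly estimating a 2-D parameter vector.
ring = {0: [0, 1, 3], 1: [1, 0, 2], 2: [2, 1, 3], 3: [3, 2, 0]}
estimates = atc_diffusion([1.0, -2.0], agents=4, neighbors=ring)
```

The paper's contribution then modifies the adapt step: when regressor entries are missing, the plain LMS update above becomes biased, and a regularization term is added to remove that bias.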
APPLICATION OF STATISTICS IN BUSINESS with Graphs | Business Statistics (Hassan Shaheer)
APPLICATION OF STATISTICS IN BUSINESS
WHAT IS STATISTICS?
Meaning
Significance of STATISTICS
ROLE OF STATISTICS IN ACCOUNTING, FINANCE, MARKETING, PRODUCTION & ECONOMICS
Quantitative Data Graphs, Pie Charts, Dot Plots & Pareto Charts
This document discusses the concept, meaning, and uses of statistics in physical education and sports. It defines statistics as the numerical collection, presentation, analysis and interpretation of data. Statistics is the study of organizing and drawing conclusions from quantitative information. In physical education and sports, statistics can track player progression, compare performances over time, motivate improvement, collect and organize data, and help draw general conclusions.
The document presents the portfolio theory of information retrieval. It draws an analogy between ranking documents and selecting a portfolio of stocks, where the relevance scores of documents are uncertain and correlated. The portfolio theory models a ranked list as having an expected relevance and variance, and aims to optimize this by maximizing expected relevance while minimizing variance. Experiments show the portfolio theory approach outperforms probability ranking and diversity-based reranking on standard evaluation metrics.
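The mean-variance trade-off described above can be sketched as a greedy ranking rule: pick, at each rank, the document maximising expected relevance minus a risk penalty that grows with the document's variance and its correlation with documents already ranked. The numbers below are hypothetical, and the paper's actual objective and weighting differ in detail.

```python
def rank_portfolio(means, variances, corr, b=0.5):
    """Greedy mean-variance ranking: correlated (redundant) documents are
    penalised, so near-duplicates of already-ranked documents drop down."""
    ranked, remaining = [], list(range(len(means)))
    while remaining:
        def score(d):
            redundancy = sum(corr[d][s] for s in ranked)
            return means[d] - b * (variances[d] + 2 * redundancy)
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked

# Hypothetical relevance estimates for three documents: docs 0 and 1 are
# near-duplicates (highly correlated); doc 2 is independent but less relevant.
means = [0.9, 0.88, 0.7]
variances = [0.1, 0.1, 0.1]
corr = [[0.0, 0.8, 0.0], [0.8, 0.0, 0.0], [0.0, 0.0, 0.0]]
order = rank_portfolio(means, variances, corr)
```

Note how the correlation penalty demotes document 1 below the less relevant but independent document 2, which is the diversification effect the stock-portfolio analogy is after.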
STUDENT PERFORMANCE ANALYSIS USING DECISION TREE (Akshay Jain)
This document describes a student performance analysis project that uses decision trees. It introduces decision trees and their use for classification problems. The project aims to use decision tree methodology to analyze student performance data, including attendance, test scores, seminar marks, and assignment marks to predict exam performance. It discusses the existing manual system and proposes a computerized system using decision tree induction. The key modules described are the calling class for data insertion, binary nodes to represent attribute values, and the decision tree module to build the tree from training data and classify new data.
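Decision tree induction of the kind described above picks each split by information gain: the reduction in label entropy achieved by thresholding an attribute. A minimal sketch of that split criterion, using made-up internal test scores and pass/fail outcomes (the project's attributes and data are not reproduced here):

```python
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * log2(p) for p in probs)

def information_gain(values, labels, threshold):
    """Entropy reduction from splitting a numeric attribute at `threshold`."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - remainder

# Hypothetical internal test scores and exam outcomes for six students.
scores = [35, 40, 55, 60, 75, 80]
outcome = ["fail", "fail", "fail", "pass", "pass", "pass"]
gain = information_gain(scores, outcome, threshold=57)
```

Tree building then recurses: choose the attribute/threshold with the highest gain at each node, split the training data accordingly, and stop when a node's labels are pure, after which new student records are classified by walking the tree.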
This document provides an overview of operational research (OR) and its application in health management. It defines OR as the scientific study of operations to improve decision making. The document outlines the main features of OR, including taking a total systems approach and using tools from various disciplines. It discusses several quantitative techniques used in OR, such as linear programming, simulation, and inventory control. The document explains how these techniques can help optimize resource allocation and improve efficiency in health systems.
Predictive Analytics and Strategic Enrolment Management (sboyle69)
The School of Continuing Studies has grown rapidly to match the steady rise in demand for learning opportunities. Annual enrollments have increased by 78% over the past five years and are approaching 30,000. The School’s success presents both opportunities and challenges as we seek to expand our programming, improve access, and deepen our impact. The ability to understand enrolment trends and develop accurate forecasting models will strengthen our efforts to expand access to a diverse learner community. These techniques will also strengthen our ability to anticipate growth, improve planning, and provide indicators useful in evaluating School operations. This presentation explores the accuracy and validity of various statistical techniques used in understanding and predicting enrolment and how these are used in developing and planning strategic enrolment management activities.
QUALITY RESEARCH PAPER WRITING JUNE27 2022 MNNIT A.pptx (VivekKasar5)
This document outlines a research paper on quality research writing. It provides an outline of typical research paper chapters and then discusses various aspects of developing a research paper such as formulating objectives, reviewing literature, developing hypotheses and a conceptual framework, choosing a research methodology, analyzing results, and discussing implications. It proposes examining the relationship between variables related to employee commitment in manufacturing organizations in India. Survey data would be collected and analyzed using techniques like ANOVA and regression to understand the impact of dimensions like workplace spirituality and psychological contracts on commitment.
The document discusses the process of data analysis undertaken by a team to improve employee retention. The analysts first defined goals and gathered data through employee surveys. They then cleaned the data, analyzed it to find key indicators of satisfaction like hiring and feedback processes, and carefully shared results. Recommendations like standardizing processes improved retention rates, showing the analysis was successful.
This document provides an overview of statistics for civil engineers. It discusses key concepts like data types, measures of central tendency and variation, probability, random variables, populations and samples, methods of collecting data, mechanistic and empirical models, and the role of statistics and probability in engineering problem solving and decision making. The objective of the course is to help students analyze data, make inferences, and test hypotheses to solve engineering problems involving variability and uncertainty. It will cover topics like probability distributions, descriptive statistics, hypothesis testing, and statistical inference through multiple chapters and a midterm and final exam.
Predicting students performance in final examinationRashid Ansari
The document discusses predicting student performance in final examinations. It examines using linear regression and multilayer perceptron algorithms on attributes of student postings in discussion forums and attendance scores. The case study involved 50 students, and the multilayer perceptron model produced slightly more accurate results based on correlation coefficients and error rates. Specifically, the multilayer perceptron model had a higher correlation coefficient of 0.84 compared to 0.82 for linear regression, and lower mean absolute and root mean squared errors.
machine learning based predictive analytics of student academic performance i...CloudTechnologies
The document presents research on using machine learning algorithms to predict student performance in courses. It tested eight algorithms on data from Bradley University and evaluated their predictive accuracy. Based on the results, it makes recommendations on selecting and using ML algorithms for predictive analytics in STEM education. It also summarizes student feedback from surveys on using ML-based predictive analytics. The proposed system aims to address the lack of a system in Malaysia to analyze student data and monitor progress. It reviews literature on predicting student performance with machine learning techniques and identifies the most important attributes to improve student achievement and success. The system architecture requires a computer with at least 4GB RAM, 100GB disk, and the ability to run Python programs in an IDE like PyCharm.
1. The document analyzes kitchenware product sales data using exploratory data analysis (EDA) techniques like histograms and Monte Carlo simulations.
2. EDA was used to identify patterns in the profit data based on variables like sales volume, costs, and prices. This revealed a wide possible profit range from £-120,000 to £330,000, indicating high business risk.
3. Monte Carlo simulation of 1,000 random data points allowed comparison of mean, median, and mode profit values to assess average returns versus risk levels. The analysis provided insight into likely profit zones under different input conditions.
IEEE paper study on Influence Flower of Academic Entities.Abhiloki
Introduction : We are there members in our group. we study on this IEEE paper. we prepare our own flower on @Giuseppe Santucci. you can look into the presentation.
Youtube link:
https://youtu.be/riLBp6dZ3Xk
What is paper about :
We present the Influence Flower, a new visual metaphor for the influence profile of academic entities, including people, projects, institutions, conferences, and journals. While many tools quantify influence, we aim to expose the flow of influence between entities. The Influence Flower is an ego-centric graph, with a query entity placed in the centre. The petals are styled to reflect the strength of influence to and from other entities of the same or different type. For example, one can break down the incoming and outgoing influ- ences of a research lab by research topics. The Influence Flower uses a recent snapshot of Microsoft Academic Graph, consisting of 212 million authors, their 176 million publications, and 1.2 billion citations. An interactive web app, Influence Map, is constructed around this central metaphor for searching and curating visualisa- tions. We also propose a visual comparison method that highlights change in influence patterns over time. We demonstrate through several case studies that the Influence Flower supports data-driven inquiries about the following: researchers’ careers over time; pa- per(s) and projects, including those with delayed recognition; the interdisciplinary profile of a research institution; and the shifting topical trends in conferences. We also use this tool on influence data beyond academic citations, by contrasting the academic and Twitter activities of a researcher.
hello, friends. i'm humaira jahan. and this is my presentation on statistics. you can find a overall concept of statistics in this presentation. and i hope it will help you enough to know about statistics.
In this study, the effect of combining variables from the different data sources for student academic performance prediction was examined using three state-of-the–art classifiers: Decision Tree (DT), Artificial Neural Network (ANN) and Support Vector Machine (SVM). The study examined the use of heterogeneous multi-model ensemble techniques to predict student academic performance based on the combination of these classifiers and three different data sources. A quantitative approach was used to develop the various base classifier models while the ensemble models were developed using stacked generalisation ensemble method in order to overcome the individual weaknesses of the different models. Variables were extracted from the institution’s Student Record System and Learning Management System (Moodle) and from a structured student questionnaire. At present, negligible work has been done using this integrated approach and ensemble techniques especially with aggregated learner data in performance prediction in HE. The empirical results obtained show that the ensemble models.........................
Statistics is the science of collecting, analyzing, and interpreting numerical data. It has evolved from early uses by governments to understand populations for taxation and military purposes. Modern statistics developed in the 18th-19th centuries and saw rapid growth in the 20th century with advances in computing. Statistics has two main branches - descriptive statistics which involves data presentation and inference statistics which uses data analysis to make estimates and test hypotheses. Statistics is widely used across many fields including business, economics, mathematics, and banking to facilitate decision making.
This document summarizes a study that evaluated using text mining to enhance credit scoring models. Specifically, it compared models built using only structured data, only text data extracted from comments, and a hybrid approach. The best-performing model was a hybrid model that incorporated both structured and text data, improving prediction accuracy over a model using only structured data. Even a model built solely from text data achieved reasonably good accuracy, demonstrating the potential value of text variables for credit scoring. The study thus provides evidence that credit scoring models can benefit from incorporating textual information extracted through text mining.
This document provides an outline and introduction to the key concepts in descriptive statistics. It defines important statistical terminology like population, sample, observations, and variables. The chapter will cover topics such as frequency distributions, graphical presentations of data, numerical methods for summarizing data, and describing grouped data. It establishes the necessary foundations for understanding descriptive statistics before delving into more advanced statistical analysis techniques in subsequent chapters.
IRJET- Price Prediction Model by Hedonic ConceptIRJET Journal
This document describes a study that uses hedonic regression analysis to examine the impact of open spaces and environmental services on residential property values. The study aims to identify which open spaces influence property values, determine how neighborhood characteristics, structural characteristics, and distance to open spaces affect market prices. Housing and neighborhood data will be collected within 2-5 km of selected open spaces. Hedonic regression analysis will statistically determine the relationship between property values and characteristics, including distance to open spaces, to estimate the value of environmental services. The conclusions will indicate whether people are willing to pay more to live in proximity to open spaces and environmental benefits.
1) This document proposes an Adapt-then-Combine (ATC) diffusion strategy for distributed estimation over cooperative networks with missing data.
2) Each agent senses data according to a linear regression model, but the regression vectors are only partially known due to missing entries.
3) The strategy is modified through regularization to eliminate bias introduced by the missing data, and its stability and performance are examined through simulations.
APPLICATION OF STATISTICS IN BUSINESS with Graphs | Business StatisticsHassan Shaheer
APPLICATION OF STATISTICS IN BUSINESS
WHAT IS STATISTICS ?
Meaning
Significance of STATISTICS
ROLE OF STATISTICS IN ACCOUNTING, FINANCE, MARKETING, PRODUCTION & ECONOMICS
Quantative Data Graphs, Pie Charts, Dot Plots & Pareto Charts
This document discusses the concept, meaning, and uses of statistics in physical education and sports. It defines statistics as the numerical collection, presentation, analysis and interpretation of data. Statistics is the study of organizing and drawing conclusions from quantitative information. In physical education and sports, statistics can track player progression, compare performances over time, motivate improvement, collect and organize data, and help draw general conclusions.
The document presents the portfolio theory of information retrieval. It draws an analogy between ranking documents and selecting a portfolio of stocks, where the relevance scores of documents are uncertain and correlated. The portfolio theory models a ranked list as having an expected relevance and variance, and aims to optimize this by maximizing expected relevance while minimizing variance. Experiments show the portfolio theory approach outperforms probability ranking and diversity-based reranking on standard evaluation metrics.
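A minimal sketch of the mean-variance objective: greedily select the document that maximizes expected relevance minus a risk penalty growing with its variance and its covariance with already-ranked documents. The scores, covariance matrix, and risk weight below are invented for illustration, not taken from the paper:

```python
import numpy as np

def portfolio_rank(mean, cov, b=5.0, k=3):
    """Greedy mean-variance ranking: pick documents one at a time,
    penalizing variance and correlation with already-selected docs."""
    selected, remaining = [], list(range(len(mean)))
    for _ in range(k):
        def score(i):
            risk = cov[i, i] + 2 * sum(cov[i, j] for j in selected)
            return mean[i] - b * risk
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

mean = np.array([0.9, 0.85, 0.5, 0.4])      # expected relevance scores
cov = np.array([[0.05, 0.04, 0.00, 0.00],   # docs 0 and 1 highly correlated
                [0.04, 0.05, 0.00, 0.00],
                [0.00, 0.00, 0.05, 0.00],
                [0.00, 0.00, 0.00, 0.05]])
# Diversification demotes doc 1 below the uncorrelated doc 2
print(portfolio_rank(mean, cov))  # [0, 2, 1]
```

The correlated near-duplicate (doc 1) falls below a less relevant but independent document, which is the diversification effect the experiments measure.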
STUDENT PERFORMANCE ANALYSIS USING DECISION TREE (Akshay Jain)
This document describes a student performance analysis project that uses decision trees. It introduces decision trees and their use for classification problems. The project aims to use decision tree methodology to analyze student performance data, including attendance, test scores, seminar marks, and assignment marks to predict exam performance. It discusses the existing manual system and proposes a computerized system using decision tree induction. The key modules described are the calling class for data insertion, binary nodes to represent attribute values, and the decision tree module to build the tree from training data and classify new data.
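The core of decision tree induction, finding the split that minimizes child-node entropy, can be sketched as follows. The attendance/score data are invented for illustration; a real project would typically use a library implementation such as scikit-learn's DecisionTreeClassifier:

```python
from math import log2

def entropy(labels):
    """Shannon entropy of a non-empty label list."""
    n = len(labels)
    return -sum((c / n) * log2(c / n)
                for c in (labels.count(l) for l in set(labels)) if c)

def best_split(rows, labels, feature):
    """Try each midpoint between sorted feature values; return the
    (weighted child entropy, threshold) pair with the lowest entropy."""
    values = sorted(set(r[feature] for r in rows))
    best = (float("inf"), None)
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [l for r, l in zip(rows, labels) if r[feature] <= t]
        right = [l for r, l in zip(rows, labels) if r[feature] > t]
        score = (len(left) * entropy(left)
                 + len(right) * entropy(right)) / len(rows)
        if score < best[0]:
            best = (score, t)
    return best

# rows: (attendance %, test score); labels: exam outcome
rows = [(95, 80), (90, 75), (60, 40), (55, 35), (85, 30), (50, 85)]
labels = ["pass", "pass", "fail", "fail", "fail", "pass"]
print(best_split(rows, labels, feature=1))  # split on test score
```

Here the test-score threshold 57.5 separates the classes perfectly (zero entropy), so the induction step would split on it first.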
This document provides an overview of operational research (OR) and its application in health management. It defines OR as the scientific study of operations to improve decision making. The document outlines the main features of OR, including taking a total systems approach and using tools from various disciplines. It discusses several quantitative techniques used in OR, such as linear programming, simulation, and inventory control. The document explains how these techniques can help optimize resource allocation and improve efficiency in health systems.
Predictive Analytics and Strategic Enrolment Management (sboyle69)
The School of Continuing Studies has grown rapidly to match the steady rise in demand for learning opportunities. Annual enrollments have increased by 78% over the past five years and are approaching 30,000. The School’s success presents both opportunities and challenges as we seek to expand our programming, improve access, and deepen our impact. The ability to understand enrolment trends and develop accurate forecasting models will strengthen our efforts to expand access to a diverse learner community. These techniques will also strengthen our ability to anticipate growth, improve planning, and provide indicators useful in evaluating School operations. This presentation explores the accuracy and validity of various statistical techniques used in understanding and predicting enrolment and how these are used in developing and planning strategic enrolment management activities.
QUALITY RESEARCH PAPER WRITING JUNE27 2022 MNNIT A.pptx (VivekKasar5)
This document outlines a research paper on quality research writing. It provides an outline of typical research paper chapters and then discusses various aspects of developing a research paper such as formulating objectives, reviewing literature, developing hypotheses and a conceptual framework, choosing a research methodology, analyzing results, and discussing implications. It proposes examining the relationship between variables related to employee commitment in manufacturing organizations in India. Survey data would be collected and analyzed using techniques like ANOVA and regression to understand the impact of dimensions like workplace spirituality and psychological contracts on commitment.
The document discusses the process of data analysis undertaken by a team to improve employee retention. The analysts first defined goals and gathered data through employee surveys. They then cleaned the data, analyzed it to find key indicators of satisfaction like hiring and feedback processes, and carefully shared results. Recommendations like standardizing processes improved retention rates, showing the analysis was successful.
This document discusses various multivariate analysis techniques. It provides an overview of multidimensional scaling (MDS), which maps distances between observations in a high-dimensional space to a lower-dimensional space. It also discusses data envelopment analysis (DEA), which uses linear programming to evaluate the efficiency of decision making units relative to an efficient frontier. Finally, it notes some conditions and considerations for implementing DEA, such as having homogeneous decision making units and a sufficient sample size.
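Classical MDS can be sketched in a few lines: double-centre the squared distance matrix, then embed the points using the top eigenvectors of the resulting Gram matrix. The three-point distance matrix below is a made-up example of points on a line:

```python
import numpy as np

def classical_mds(D, dim=2):
    """Embed an n x n distance matrix into `dim` dimensions."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centring matrix
    B = -0.5 * J @ (D ** 2) @ J               # Gram matrix of centred points
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:dim]        # largest eigenvalues first
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

# Pairwise distances of points at 0, 3, 7 on a line
D = np.array([[0., 3., 7.],
              [3., 0., 4.],
              [7., 4., 0.]])
X = classical_mds(D, dim=1)
# Recovered pairwise distances match the input (up to sign and shift)
print(np.abs(X[0] - X[1]))
```

For genuinely Euclidean distances the embedding is exact; for dissimilarities that are only approximately Euclidean, the discarded eigenvalues measure the distortion.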
How to make a project report for schools, colleges, universities, researchers... (Payaamvohra1)
This ppt gives you an idea of a frequently made project report. Do check out my other ppts on research proposals, review papers, internship reports, etc.
Service innovation: the hidden value of open data (Slim Turki, Dr.)
> Presented at the Share-PSI Krems Workshop: A self sustaining business model for open data
- http://www.w3.org/2013/share-psi/workshop/krems/papers/ServiceInnovation-theHiddenValueOfOpenData
- http://www.w3.org/2013/share-psi/workshop/krems/
> Summary
The development of a data driven economy has been a major orientation of economic policies over the past few years based on (i) the wider availability of data promoted in particular by the Open Data movement and (ii) the development of dedicated tools to support heterogeneous data and data in large quantities (Big data). Reports anticipate the creation of enormous amounts of economic activity and growth opportunities. However the promise of the data-driven economy lies to a large extent in the development of new services. The return on investment of open data policies for instance should be evaluated from the services created based on open data sets. Open data promoters couple more and more open data initiatives with actions dedicated to the promotion of the datasets for the creation of new services. Nevertheless the results in terms of services created remain below the expectations of open data promoters. Indeed most services created are not sustainable and / or do not use the variety of datasets. They are to a wide extent relying on a limited number of very popular datasets. In order to make the promise of the data-driven economy a reality, it is therefore necessary to increase reuse and value extracted by services from data. Our hypothesis is that service innovation approaches can help understand the mechanisms that drive the creation of services. We therefore propose to analyse the roles that the data can have in the design of services based on a theoretical framework of service innovation.
This document provides an overview of data processing and report writing for business research methods. It discusses various steps in data processing including data preparation, coding, tabulation, cleaning, and adjusting. Data preparation involves checking questionnaires for completeness and editing data to ensure accuracy. Coding assigns symbols to responses to categorize data. Tabulation summarizes raw data in a logical format. Graphical representations like bar charts and pie charts can visualize categorized data. Data cleaning checks for consistency and treats missing values. Data adjusting may involve weighting samples, modifying variables, or transforming scales. The overall goal is to prepare raw data for meaningful analysis.
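The coding, cleaning, and tabulation steps can be sketched as follows. The responses and coding scheme are hypothetical; a production workflow would typically use a tool such as pandas or SPSS:

```python
from collections import Counter

# Raw survey responses, including blanks and inconsistent casing
raw = ["Yes", "yes", "NO", "", "No", "Y", "yes", None, "no"]

CODES = {"yes": 1, "y": 1, "no": 2, "n": 2}   # coding scheme

def clean_and_code(responses):
    """Editing + coding: normalize casing, map to codes, flag missing."""
    coded = []
    for r in responses:
        if not r:                      # treat blanks and None as missing
            coded.append(None)
        else:
            coded.append(CODES.get(r.strip().lower()))
    return coded

coded = clean_and_code(raw)
table = Counter(c for c in coded if c is not None)   # simple tabulation
print(table)   # frequencies of code 1 (yes) and code 2 (no)
```

The missing-value count from the cleaning step is what the "treating missing values" decision (imputation, weighting, or exclusion) would then act on.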
Standard procedure for selecting hospitality industry employees (luisocampo88)
The document discusses employee selection and the process of comparing job candidates' qualifications to the requirements of an open position. The goal is to maintain or increase organizational efficiency by selecting the most suitable candidates. Selection involves comparing employee characteristics to the job model or standard. There are three main models: model fitting for a single candidate, model selection for multiple candidates and one vacancy, and model classification for multiple candidates and vacancies. Effective selection tools like applications and weighted applications can help predict an applicant's potential workplace success by collecting verifiable background data and determining links to job performance.
Job analysis is the systematic process of collecting information about job tasks, duties, and requirements to understand similarities and differences between jobs. This information is used to develop job descriptions that define a job's tasks and specifications that outline required skills. Job analysis data is also used to establish fair pay structures and ensure compliance with disability laws. Reliable and valid job analysis requires collecting comprehensive data from job holders, supervisors, and managers and addressing any discrepancies in perspectives.
The document discusses the steps involved in the data science life cycle (DSLC). It describes the main steps as business understanding, data acquisition and understanding, modeling, deployment, and customer acceptance. It provides details on several of these steps, including business understanding, data acquisition and understanding, data modeling, and initial data exploration. The goal is to clearly outline the typical process and considerations for a data science project from defining the problem to exploring the available data.
Technical / Research / Lab Reports
Proposals
Progress Reports
Justification Report
To implement change; may summarize current policy; e.g., to justify hiring employees.
Annual Report
Length: ~10% the length of the original document.
It should work as a “standalone” document.
It should overview the following sections:
Purpose/Problem
Scope
Methods
Findings
Conclusions/recommendations
The reason(s) the document reaches the conclusions/recommendations that it does
Define Topic, Provide Context, Background
Statement of Purpose: goal of report / significance / opportunity
Preview key findings/subtopics.
Weak: “This report discusses low-impact aerobic exercise.”
Stronger: “This report compares three low-impact aerobic exercise options for employees, analyzing external agencies, in-house facilities, and general extracurricular programs with onsite facilities and programs deemed the best solution.”
Supports (or opposes) our business plan / thesis
An observable measurement vs. assumptions
Helps us evaluate choices & make decisions
Administrative and other data sources can help prepare, collect and process, and enhance outputs from the 2021 Census in the following ways:
1. Administrative data like broadband connections and digital uptake data can help estimate the number of paper questionnaires needed by providing information on expected digital uptake.
2. Administrative data containing demographic information can help with census sample stratification and targeting hard to count areas.
3. Sentiment analysis of social media can help target census publicity campaigns at high population churn and hard to count areas.
4. Comparing census results to high quality administrative datasets like GP registers can help check the accuracy of census estimates.
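Point 1 amounts to simple arithmetic per area, sketched below with invented figures:

```python
# Hypothetical sizing of the paper questionnaire print run from
# expected digital uptake per area (figures invented for illustration).
areas = {  # area code -> (households, expected digital uptake rate)
    "A1": (12000, 0.85),
    "A2": (8000, 0.60),
}

paper_needed = {a: round(h * (1 - uptake))
                for a, (h, uptake) in areas.items()}
print(paper_needed)  # {'A1': 1800, 'A2': 3200}
```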
Final SPSS hands-on training (descriptive analysis) May 24th 2013 (Tin Myo Han)
This document provides guidance on conducting valid data analysis in SPSS. It discusses:
1) The importance of solid study planning and design;
2) Ensuring a valid data set through proper data collection, entry, cleaning and manipulation;
3) Selecting appropriate statistical procedures that match the data types and meet statistical assumptions;
4) Correctly interpreting and summarizing results in the context of the research question.
Linking Heterogeneous Scholarly Data Sources in an Interoperable Setting: the... (Platforma Otwartej Nauki)
“Open Research Data: Implications for Science and Society”, Warsaw, Poland, May 28–29, 2015, conference organized by the Open Science Platform — an initiative of the Interdisciplinary Centre for Mathematical and Computational Modelling at the University of Warsaw. pon.edu.pl @OpenSciPlatform #ORD2015
Similar to "Linking employers' and employees' responses" (20)
Discussion Dutch RDAs on digitalisation and skills (BEYOND4.0)
Regional development agencies in the Netherlands discussed how they can support companies facing talent shortages due to digitization. They examined data showing increasing digitization especially impacts younger workers and discussed how agencies can concretely help firms through workplace innovation. Examples were presented of how agencies could form coalitions to share resources and overcome barriers. Opportunities for new projects and platforms were identified to develop workforce skills for Industry 5.0. The discussion focused on taking a systemic view of digitalization's impacts and incentivizing actors throughout the regions.
Workplace Innovation: Theory, research and practice (BEYOND4.0)
Workplace innovation aims to positively impact both organizational performance and employee well-being. The lecture discusses workplace innovation theory and provides two examples of research on the topic. It finds that successful workplace innovation combines both structural and cultural elements, and that initiatives are more likely to succeed when employees are engaged in the design and implementation process and economic goals are linked to employee interests. The role of digitalization is also addressed, noting both risks around a narrow focus on costs as well as opportunities when implemented appropriately.
Skill Intelligence in the Steel Sector mc 220329.pdf (BEYOND4.0)
On March 30, sfs scientists Mathias Cuypers, Adrian Götting and Dr. Michael Kohlgrüber presented key results from the EU projects ESSA and Beyond 4.0 at the 14th IFAC Workshop on Intelligent Manufacturing Systems. The International Federation of Automatic Control (IFAC) is an international umbrella organization concerned with automation technology and the concrete societal impact of control technology and automation.
Participation Income Book Presentation.pdf (BEYOND4.0)
An alternative to basic income for poverty reduction in the digital age.
The book provides the first in-depth analysis of the PI and its potential role in countering endemic poverty and unemployment in high-income countries.
Basic income and women in an established gender-equal welfare state: Results ... (BEYOND4.0)
Olli Kangas & Minna Ylikännö
Debates on gendered effects of universal basic income (UBI) bifurcate into two opposing views. On one hand, UBI is seen as a strong incentive for women to stay at home and be permanently locked in their care responsibilities. On the other hand, UBI is seen as a device to increase women’s autonomy, fortify their capacity to act, and guarantee them individual income and income security. Put differently, UBI would either cement the traditional division of labour between genders and trap women at home, or it would be an empowering option for women. This type of discussion is relevant in countries with significant gender inequalities. In Finland, as in other Nordic countries, gender equality in both labour markets and families is better achieved than in most other countries. From this perspective, it is difficult to establish that the implementation of basic income would have a major impact on the position of women in these established gender-equal societies. In this study, we use survey data compiled in the context of the Finnish basic income experiment (2017–2018) to analyse the effects of basic income on employment and empowerment, with the focus on its gender effects. Our results indicate that while UBI did not affect employment in the two-year experiment, it was positively associated with individual capacities and confidence in various aspects of life. However, the effects were universal and did not differ between women and men. UBI is not a gender equality-related issue in the Nordic welfare states.
The impact of technological change on the content of jobs and accompanying skills is a central topic across disciplines. To date, ample research has directly linked the technological change to shifts in skills use; however, organisational change is rarely considered as an influencing factor. Based on a panel survey, this paper uses a Luhmannian approach to understand the relationship between technological change and organisational context. This theory is tested quantitatively and shows the importance of considering the working environment's nature when studying skills changes. The results show small effects by the technological change on changing skills use but larger effects by changes in the working environment. Recommendations for future research and practical implications are discussed.
This document summarizes a working paper that analyzes the impact of digital technologies associated with Industry 4.0 on the performance of Italian firms. It uses a dataset combining survey data from the Rilevazione Imprese e Lavoro (RIL) with financial records from Orbis to study over 3,000 Italian firms from 2010 to 2018. Applying a differences-in-differences approach, the paper finds that adopting digital technologies positively impacts labor productivity, average wages, and sales. The effects are strongest for small and medium-sized firms, and appear concentrated among more mature rather than younger firms.
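The differences-in-differences logic can be sketched with invented numbers (not the paper's data): subtract the non-adopters' before/after change from the adopters' change to net out the common trend:

```python
# Hypothetical mean log productivity:        before  after
treated_before, treated_after = 4.00, 4.30   # technology adopters
control_before, control_after = 3.90, 4.05   # non-adopters

# DiD: adopters' change minus non-adopters' change
did = (treated_after - treated_before) - (control_after - control_before)
print(round(did, 2))  # 0.15 -> adoption effect net of the common trend
```

In the paper's regression form the same quantity is the coefficient on the interaction of the treatment and post-adoption indicators, estimated with firm controls.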
Bey4 0 workshop inov org upt beyond on ecs 2019 portugal results-pc-18429-2 (BEYOND4.0)
This document summarizes the key findings of the 2019 European Company Survey regarding workplace practices in Portugal:
1) Portugal ranked 7th out of EU countries for the percentage of establishments with high investment in employees and high employee involvement, just below the top Nordic countries.
2) Compared to the 2013 survey, Portugal saw an increase in establishments with systematic and involving workplace practices and a decrease in those with passive management.
3) The results show Portugal has made progress in developing more employee-centered workplace practices, though many establishments still focus more on "all talk, no backing" rather than meaningful investment and involvement.
Beyond4 0 Portugal work organisation and digitalisation 20201118 (BEYOND4.0)
This document summarizes the key findings of the 2019 European Company Survey regarding workplace practices in Portugal:
1) Portugal ranked 7th out of EU countries for the percentage of establishments with high investment in employees and high employee involvement, just below the top Nordic countries.
2) Compared to the 2013 survey, Portugal saw an increase in establishments with systematic and involving workplace practices and a decrease in those with passive management.
3) The results show Portugal has a significant percentage of establishments with high involvement of employees, though over 40% still have only moderate or irregular investment and involvement. Further improvements are needed.
The challenges and opportunities in the digitalisation of production (BEYOND4.0)
A presentation for OeNB-SUERF workshop on the occasion of Austria’s, Finland’s and Sweden’s 25th EU membership anniversary
https://www.suerf.org/oenb2020
This presentation by Thibault Schrepel, Associate Professor of Law at Vrije Universiteit Amsterdam University, was made during the discussion “Artificial Intelligence, Data and Competition” held at the 143rd meeting of the OECD Competition Committee on 12 June 2024. More papers and presentations on the topic can be found at oe.cd/aicomp.
This presentation was uploaded with the author’s consent.
Collapsing Narratives: Exploring Non-Linearity • a micro report by Rosie Wells
Insight: In a landscape where traditional narrative structures are giving way to fragmented and non-linear forms of storytelling, there lies immense potential for creativity and exploration.
'Collapsing Narratives: Exploring Non-Linearity' is a micro report from Rosie Wells.
Rosie Wells is an Arts & Cultural Strategist uniquely positioned at the intersection of grassroots and mainstream storytelling.
Their work is focused on developing meaningful and lasting connections that can drive social change.
Please download this presentation to enjoy the hyperlinks!
This presentation by Professor Giuseppe Colangelo, Jean Monnet Professor of European Innovation Policy, was made during the discussion “The Intersection between Competition and Data Privacy” held at the 143rd meeting of the OECD Competition Committee on 13 June 2024. More papers and presentations on the topic can be found at oe.cd/ibcdp.
This presentation was uploaded with the author’s consent.
This presentation by Nathaniel Lane, Associate Professor in Economics at Oxford University, was made during the discussion “Pro-competitive Industrial Policy” held at the 143rd meeting of the OECD Competition Committee on 12 June 2024. More papers and presentations on the topic can be found at oe.cd/pcip.
This presentation was uploaded with the author’s consent.
This presentation by Katharine Kemp, Associate Professor at the Faculty of Law & Justice at UNSW Sydney, was made during the discussion “The Intersection between Competition and Data Privacy” held at the 143rd meeting of the OECD Competition Committee on 13 June 2024. More papers and presentations on the topic can be found at oe.cd/ibcdp.
This presentation was uploaded with the author’s consent.
This presentation by Juraj Čorba, Chair of OECD Working Party on Artificial Intelligence Governance (AIGO), was made during the discussion “Artificial Intelligence, Data and Competition” held at the 143rd meeting of the OECD Competition Committee on 12 June 2024. More papers and presentations on the topic can be found at oe.cd/aicomp.
This presentation was uploaded with the author’s consent.
This presentation by OECD, OECD Secretariat, was made during the discussion “Artificial Intelligence, Data and Competition” held at the 143rd meeting of the OECD Competition Committee on 12 June 2024. More papers and presentations on the topic can be found at oe.cd/aicomp.
This presentation was uploaded with the author’s consent.
Career goals.pptx and their importance in real life (artemacademy2)
Career goals serve as a roadmap for individuals, guiding them toward achieving long-term professional aspirations and personal fulfillment. Establishing clear career goals enables professionals to focus their efforts on developing specific skills, gaining relevant experience, and making strategic decisions that align with their desired career trajectory. By setting both short-term and long-term objectives, individuals can systematically track their progress, make necessary adjustments, and stay motivated. Short-term goals often include acquiring new qualifications, mastering particular competencies, or securing a specific role, while long-term goals might encompass reaching executive positions, becoming industry experts, or launching entrepreneurial ventures.
Moreover, having well-defined career goals fosters a sense of purpose and direction, enhancing job satisfaction and overall productivity. It encourages continuous learning and adaptation, as professionals remain attuned to industry trends and evolving job market demands. Career goals also facilitate better time management and resource allocation, as individuals prioritize tasks and opportunities that advance their professional growth. In addition, articulating career goals can aid in networking and mentorship, as it allows individuals to communicate their aspirations clearly to potential mentors, colleagues, and employers, thereby opening doors to valuable guidance and support. Ultimately, career goals are integral to personal and professional development, driving individuals toward sustained success and fulfillment in their chosen fields.
Suzanne Lagerweij - Influence Without Power - Why Empathy is Your Best Friend...
This is a workshop about communication and collaboration. We will experience how we can analyze the reasons for resistance to change (exercise 1) and practice how to improve our conversation style and be more in control and effective in the way we communicate (exercise 2).
This session will use Dave Gray’s Empathy Mapping, Argyris’ Ladder of Inference and The Four Rs from Agile Conversations (Squirrel and Fredrick).
Abstract:
Let’s talk about powerful conversations! We all know how to lead a constructive conversation, right? Then why is it so difficult to have those conversations with people at work, especially those in powerful positions that show resistance to change?
Learning to control and direct conversations takes understanding and practice.
We can combine our innate empathy with our analytical skills to gain a deeper understanding of complex situations at work. Join this session to learn how to prepare for difficult conversations and how to improve our agile conversations in order to be more influential without power. We will use Dave Gray’s Empathy Mapping, Argyris’ Ladder of Inference and The Four Rs from Agile Conversations (Squirrel and Fredrick).
In the session you will experience how preparing and reflecting on your conversation can help you be more influential at work. You will learn how to communicate more effectively with the people needed to achieve positive change. You will leave with a self-revised version of a difficult conversation and a practical model to use when you get back to work.
Come learn more on how to become a real influencer!
This presentation by OECD, OECD Secretariat, was made during the discussion “Pro-competitive Industrial Policy” held at the 143rd meeting of the OECD Competition Committee on 12 June 2024. More papers and presentations on the topic can be found at oe.cd/pcip.
This presentation was uploaded with the author’s consent.
This presentation by OECD, OECD Secretariat, was made during the discussion “The Intersection between Competition and Data Privacy” held at the 143rd meeting of the OECD Competition Committee on 13 June 2024. More papers and presentations on the topic can be found at oe.cd/ibcdp.
This presentation was uploaded with the author’s consent.
The importance of sustainable and efficient computational practices in artificial intelligence (AI) and deep learning has become increasingly critical. This webinar focuses on the intersection of sustainability and AI, highlighting the significance of energy-efficient deep learning, innovative randomization techniques in neural networks, the potential of reservoir computing, and the cutting-edge realm of neuromorphic computing. This webinar aims to connect theoretical knowledge with practical applications and provide insights into how these innovative approaches can lead to more robust, efficient, and environmentally conscious AI systems.
Webinar Speaker: Prof. Claudio Gallicchio, Assistant Professor, University of Pisa
Claudio Gallicchio is an Assistant Professor at the Department of Computer Science of the University of Pisa, Italy. His research involves merging concepts from Deep Learning, Dynamical Systems, and Randomized Neural Systems, and he has co-authored over 100 scientific publications on the subject. He is the founder of the IEEE CIS Task Force on Reservoir Computing, and the co-founder and chair of the IEEE Task Force on Randomization-based Neural Networks and Learning Systems. He is an associate editor of IEEE Transactions on Neural Networks and Learning Systems (TNNLS).
1. Linking employers’ and employees’ responses in
EU-wide surveys: what are the solutions and
their prerequisites?
N.Greenan & M.Seghir
Monday, 29.03.2021
By Majda SEGHIR
This project has received funding from the European Union’s Horizon 2020
research and innovation programme under grant agreement No 8222293.
2. Introduction
• At the EU level, strong impulse for more integrated statistical
information that covers several socio-economic aspects
• Why?
• Decision making requires information that is as rich and timely as possible
• No single survey can provide all the necessary information
• Running new surveys requires an appreciable amount of both time and funds
• The need for information means analysing a large number of variables => a survey
with a long questionnaire => lower quality of responses and a higher frequency of
missing responses
• EU research agenda: data integration, multi-sources data combination,
linking of micro-data, integrated technical infrastructures…
3. Rationale for linking employer and employee
surveys
• Linked employer-employee surveys: the best data configuration to
disentangle both the employers’ and the employees’ effects when analysing issues
such as wage determination, productivity, innovation strategy and resource
management practices
• Survey design: employer first sampled and employees sampled in a second stage
or employees first sampled and their employers interviewed in a second stage.
• Existing linked surveys: mostly at the national level => carrying out a linked
survey is very expensive
• Instead of carrying out linked employer and employee surveys, matching
existing employer and employee data sources
4. How to integrate information from different
employer and employee data sources
• Issues in preparing data combination
• Record linkage
• Statistical matching
• Data aggregation
• Multi-level modelling
• Data availability
5. Issues in preparing data combination
• Objective: prevent biased or inconsistent data sources (D’Orazio et al.,
2006)
• Reconciliation of biased sources:
• Harmonisation of the reference population
• Harmonisation of reference time
• Reconciliation of inconsistent data sources with respect to the
common variables:
• Harmonisation of the common variables by changing the categorisation
• Creation of new (common) variables from available information in both
samples
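The recategorisation step can be sketched in Python with pandas: two surveys that code establishment size differently are collapsed into a shared, coarser classification. The size bands and mappings below are purely illustrative, not taken from the ECS or EWCS.

```python
import pandas as pd

# Hypothetical size bands: the employer survey and the employee survey
# use different categorisations, but both can be collapsed into a
# common, coarser classification available in each source.
employer = pd.DataFrame({"size_class": ["10-19", "20-49", "50-249", "250+"]})
employee = pd.DataFrame({"size_detail": ["10-49", "10-49", "50-249", "250-499"]})

to_common_employer = {"10-19": "10-49", "20-49": "10-49",
                      "50-249": "50-249", "250+": "250+"}
to_common_employee = {"10-49": "10-49", "50-249": "50-249",
                      "250-499": "250+", "500+": "250+"}

# The harmonised variable is the one later used as a common variable
# for matching or aggregation.
employer["size_common"] = employer["size_class"].map(to_common_employer)
employee["size_common"] = employee["size_detail"].map(to_common_employee)
```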
6. Record linkage of employer and employee
data files (1/2)
• Aims at identifying pairs of records in two data sets that represent the
same entity.
• Records are assumed to have some common identifying information (unique identifier
(ideally), name and/or address, age and sex)
• Exact record linkage: perfect agreement between identifier (e.g. personal
identification number)
• Probabilistic record linkage: uses probabilities for deciding when a given pair of
records refers to the same unit (is a “match”) or not
7. Record linkage of employer and employee data files
(2/2)
• The target population is different in the employer sample and in the
employee sample. Obviously record linkage is not possible unless a linkage
at the employer level is considered
• Employees may be asked exact information on their employers (e.g. name, address)
which will be used to identify the employer via a business register and then link the
employee sample to the employer sample
• Requirements:
• Reliable information to identify the employer in the employee sample
• European Business register to identify the employers OR the linkage can be
performed at the national level
• Issues:
• Representative sample of employees but a non-representative sample of employers
(more likely to be biased towards large enterprises)
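A minimal sketch of the probabilistic variant, assuming only fuzzy identifying fields (name, address) are available: candidate pairs are scored on weighted string similarity and accepted above a threshold. The field weights, threshold and example records are invented for illustration; real applications would use the Fellegi-Sunter framework with estimated agreement probabilities.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Crude string-agreement score in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def link_score(official, reported, weights=(("name", 0.6), ("address", 0.4))):
    """Weighted agreement over the common identifying fields."""
    return sum(w * similarity(official[f], reported[f]) for f, w in weights)

# Employer record from a business register vs. the employer details
# reported by an employee (illustrative values).
register = {"name": "Acme Manufacturing Ltd", "address": "12 High St, Leeds"}
reported = {"name": "ACME Manufacturing", "address": "12 High Street, Leeds"}

is_match = link_score(register, reported) > 0.8   # classification threshold
```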
8. Statistical matching of employer and
employee files (1/2)
• Aims to integrate two (or more) data sets characterised by the fact
that
• The different data sets contain information on a set of common variables and
on variables (target variables) that are not jointly observed
• SM can be viewed as an imputation problem of the target variables from a
donor to a recipient survey => construction of a synthetic file containing all the
variables of interest, although collected in different sources
• Statistical matching, unlike record linkage, aims to match similar but not
identical units
9. Statistical matching of employer and
employee files (2/2)
• The target population is different in the employer sample and in the employee
sample. The implicit nested structure of employee surveys may offer a solution to
use statistical matching.
• Workers’ information may be aggregated to the employer level; then, relying on a set of
relevant common variables, the employee file is imputed with information from the employer file
• Requirements:
• Data aggregation of the employee file to the employer level => loss of micro-details on workers
• The common variables should have high predictive power for the variables to be
matched => conditional independence assumption (CIA)
• Common variables should be consistent in terms of the definitions and classifications
• Issues:
• The CIA is very strong and hard to test (the problem is solved if there is auxiliary
information where the target variables are jointly observed, e.g. a small sample from a
purpose-designed linked survey)
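The distance hot-deck idea behind statistical matching can be illustrated with NumPy: each recipient record (here, the employee file aggregated to the employer level) receives the target value of its nearest donor in the employer file, where distance is computed on the harmonised common variables. All values and variable roles are invented, and validity still rests on the CIA.

```python
import numpy as np

# Donor file (employer survey): common variables, numerically coded
# (e.g. sector group, size class), plus a target variable to transfer.
donors = np.array([[1, 2],     # sector 1, size class 2
                   [2, 1],
                   [3, 3]])
donor_target = np.array([0.4, 0.7, 0.2])   # e.g. an innovation score

# Recipient file (employee survey aggregated to the employer level),
# observed on the same common variables.
recipients = np.array([[1, 2], [3, 2]])

def nearest_donor(rec):
    """Index of the donor minimising L1 distance on the common variables."""
    return int(np.argmin(np.abs(donors - rec).sum(axis=1)))

# Transfer the target variable from the nearest donor to each recipient.
imputed = np.array([donor_target[nearest_donor(r)] for r in recipients])
```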
10. Combining by aggregating employer and
employee information
• Most flexible approach of data combination:
• Identifying a common level to which information can be aggregated before
proceeding with the combination
• Loss of information by substituting individual data with aggregated data
• Interesting alternative when record linkage and statistical matching are not
allowed
• Requirements:
• Common variables defining the aggregate level should be available with
enough details in both employer and employee samples
• The data sources should be large enough to have a minimum number of
observations at each integration level
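This approach can be sketched with pandas, under invented column names and a toy cell-size threshold: cell means are computed in the employee survey over the common grouping variables, thinly populated cells are dropped, and the remaining estimates are merged into the employer micro-data.

```python
import pandas as pd

# Toy employee micro-data with the common grouping variables.
employees = pd.DataFrame({
    "country":  ["DE", "DE", "DE", "FR", "FR"],
    "sector":   ["C",  "C",  "C",  "C",  "G"],
    "earnings": [2.1,  2.4,  2.2,  1.9,  1.7],
})

# Aggregate to the integration level and keep only cells with enough
# observations (threshold of 3 chosen to fit the toy data).
cells = (employees.groupby(["country", "sector"])["earnings"]
         .agg(mean_earnings="mean", n="size").reset_index())
cells = cells[cells["n"] >= 3]

# Employer micro-data enriched with the aggregate employee estimates.
employers = pd.DataFrame({"country": ["DE", "FR"], "sector": ["C", "C"],
                          "has_er_body": [True, False]})
combined = employers.merge(cells, on=["country", "sector"], how="left")
```

Cells dropped by the threshold yield missing aggregate values in `combined`, which is exactly the "sufficiently populated cells" requirement above.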
11. Joint analysis of employer and employee
surveys: Multi-level modelling
• Not a data combination solution but an alternative solution to model
the nested and multi-level structure of employee and employer
surveys
• Rationale:
• Employees are nested within size classes of companies within sectors within
countries
• Layered regression models that correspond with the levels of grouping present
in the data
• Requirement:
• At least 30 groups (higher-level units) with at least 5 individuals each
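The rationale can be illustrated with a quick variance decomposition on simulated data (NumPy only): if the outcome varies mostly between groups rather than within them, the intraclass correlation is high and a multi-level (random-intercept) model is warranted. The group counts follow the rule of thumb above; the simple moment estimator below is only indicative, and a real analysis would fit a mixed model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_groups, n_per = 30, 5     # at least 30 groups with at least 5 individuals

# Simulate a two-level structure: a group-level effect (e.g. establishment)
# plus individual-level noise.
group_effect = rng.normal(0.0, 1.0, n_groups)              # between-group sd
y = group_effect[:, None] + rng.normal(0.0, 0.5, (n_groups, n_per))

# Rough moment-based decomposition (slightly biased, for intuition only).
between = y.mean(axis=1).var()       # variance of the group means
within = y.var(axis=1).mean()        # average within-group variance
icc = between / (between + within)   # share of variance at the group level
```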
12. Recommendations for an ex-post employer-
employee data linkage
• The matching of EU employer and employee surveys is possible only at the
employer level
• General recommendation: EU data harmonisation with respect to survey design and
variables definitions
• Recommendations for record linkage:
• Include employer variables with high identification power and quality in the employee
survey (e.g. name and address)
• Perform record linkage at the national level, as national statistical offices have access to more
detailed information on employers (e.g. business registers).
• Recommendations for statistical matching:
• National linked surveys can be used as auxiliary information to assess the validity of the
matching at the EU level
• A nested survey design with a common questionnaire for employees and employers, plus specific
modules for each group.
• Recommendations for data aggregation and multi-level modelling:
• Have a sample size and common grouping variables that allows for a large number of groups