2. 2
Modernisation of the statistical production
• Reference models
Data collection
Data processing
Decent Work Indicators Reference Framework
Data dissemination
Evaluation
Metadata management
OUTLINE
4. 4
ModernStats World
Workshop
together with Sharing Tools Group
26-28 June, Geneva
Linking GSBPM
and GSIM
Core Ontology
for Official
Statistics
Alignment of
GSBPM OP with
GAMSO
Metadata
Glossary
GSIM e-training
5 November 2019
Activities in 2019 contribute de facto
to setting up a more integrated view of
the modernisation standards
HLG-MOS Activities 2019
5. 5
OVER-ARCHING
The GSBPM comprises eight main phases plus over-arching processes
GSBPM version 5.1 – Phases
1. Specify Needs
2. Design
3. Build
4. Collect
5. Process
6. Analyse
7. Disseminate
8. Evaluate
PLANNING PRODUCTION
EVALUATION
Quality Management / Data & Metadata Management
7. 7
Mapping to GSBPM
In different circumstances sub-processes
- may occur in different orders
- may be revisited a number of times, forming iterative
loops (e.g. Process and Analyse)
- may be skipped (e.g. in regular processes/iterations)
9. 9
The traditional approach
[Diagram: three separate production lines (ESTABLISHMENT SURVEY, HOUSEHOLD SURVEY, AD HOC SURVEY), each with its own chain of steps such as DESIGN, COLLECT, PROCESS, TABULATE and PUBLISH]
• Statistical production seen as a "value chain"
• Different lines of production per statistical domain (“stove pipes”)
• Highly inefficient due to:
– duplication of infrastructure and resources
– lost economies of scale
– hard to get combined outputs
11. 11
Is GSBPM enough?
Modernisation of statistics requires:
• reuse and sharing of methods,
components, processes and data
repositories
• definition of a shared “plug-and-play”
modular component architecture
GSBPM helps in determining which
components are required.
Components will process information
Need for interfaces specification
GSIM
CSPA
12. 12
Generic Statistical Information Model
GSIM describes the information objects and flows
within the statistical business process.
• GSIM is a reference framework of information objects that sets
out definitions, attributes and relationships between them
14. 14
What’s the meaning of “VARIABLE”?
“An input to an indicator”
“The result of an equation”
“A column of a database”
“A place in memory to store values and operate with it”
“A field in a dataset”
“One element of a set that can change its value”
“Something that changes. Like the weather.”
- My mom
- Labour economist
- Developer (former Math teacher)
- Database Administrator
- IT Developer
- Labour analyst
- Statistical assistant
Why GSIM?
15. 15
What’s the meaning of “VARIABLE”?
“Variable is a characteristic of a unit to which a
numerical measure or a category from a classification
can be assigned”
Examples:
• Age of a person (measured in years since birthdate)
• Activity sector of an establishment (categorized according to ISIC)
• Occupation of a person (categorized according to ISCO)
• Total income of a household (measured in amount of money)
- GSIM
Why GSIM?
16. 16
Generic Statistical Information Model
GSIM is a conceptual model that provides a
standardized set of information objects that flow
through the process model in the creation of official statistics as
represented by the GSBPM.
It defines a common terminology across and between
statistical organisations.
It allows statistical organisations and standards
bodies (e.g. SDMX and DDI) to understand and map
common statistical information and processes.
Conceptual model
18. 19
• Mail interview: The questionnaire is sent by mail and respondents are asked to send it back.
• Personal interview: Questions are asked face-to-face.
• PAPI: Paper And Pencil Interviewing. The answers given in the interview are written on a paper form using a pencil.
• CAPI: Computer Assisted Personal Interviewing. Very similar to PAPI, but the data is entered directly into a computer programme instead of first being written on paper forms.
• WAPI: Web Assisted Personal Interviewing. The respondents answer the questions online, and they are also assisted online in doing so.
• TAPI: Tablet Assisted Personal Interviewing. Virtually identical to CAPI, but the data is entered into a tablet instead of a computer/laptop.
• SAPI: Smartphone Assisted Personal Interviewing. The interviewer enters the data into a smartphone.
• Telephone interview:
• CATI: Computer Assisted Telephone Interviewing. The questions are presented to the interviewers on a computer screen, and the interviewers then put them to the respondents. To ensure that each respondent is asked the correct questions, the specialised software uses "skips": certain answers change which question comes next. This also spares the respondent from answering irrelevant questions.
• IVR: Interactive Voice Response. The interview is conducted by a computerized system.
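The "skips" mentioned for CATI can be illustrated with a small routing table. This is only a sketch under stated assumptions: the question ids, question texts, and the `QUESTIONS` and `route` names are hypothetical and do not come from any particular CATI package.

```python
# Hypothetical questionnaire: each question has a text and a routing rule
# that picks the next question id from the answer just given.
QUESTIONS = {
    "Q1": {"text": "Did you work for pay in the last week?",
           "next": lambda answer: "Q2" if answer == "yes" else "Q3"},
    "Q2": {"text": "How many hours did you work?",
           "next": lambda answer: None},
    "Q3": {"text": "Did you look for work in the last four weeks?",
           "next": lambda answer: None},
}


def route(answers, start="Q1"):
    """Return the ordered list of question ids that apply to one respondent."""
    path = []
    current = start
    while current is not None:
        path.append(current)
        current = QUESTIONS[current]["next"](answers.get(current))
    return path


employed_path = route({"Q1": "yes", "Q2": "40"})
not_employed_path = route({"Q1": "no", "Q3": "yes"})
```

A respondent who answers "yes" to Q1 is never shown Q3, and vice versa, which is the effect the skip mechanism is meant to achieve.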
Modes of data collection
19. 20
• CAWI: Computer Assisted Web Interviewing. Online research in which respondents are invited via e-mail to answer the survey through online questionnaires. These questionnaires can be personalized so that each respondent is asked the correct questions.
• CASI: Computer Assisted Self Interviewing. Respondents sit at the computer themselves to fill in the questionnaire. Audio or video recorded questions may be included.
• TASI: Tablet Assisted Self Interviewing. Virtually identical to CASI, but the data is entered into a tablet instead of a computer/laptop.
• SASI: Smartphone Assisted Self Interviewing. The respondent enters the data into a smartphone.
• Multimodal approach = Mail + xASI + CATI(IVR) + xAPI
Modes of data collection
20. 21
Which modes of data
collection are used in your
institution?
Modes of data collection
21. 22
The Statistical Production
Quality Management / Data & Metadata Management
INFORMATION STORAGE TOOLS
Quality Management / Data & Metadata Management
Documentation
Configuration
& Settings
Notes
22. 23
Data Validation
• Validation rules are applied to ensure the integrity and correctness of information entering the system
• Two types of validation:
• Structural: Correctness of data types, completeness, codes, etc.
• Business rules: value ranges, totals, cross-referenced values, etc.
• Keep track of validation results (process/quality metadata)
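The two types of validation can be sketched as follows. The record layout (`SCHEMA`), the code list (`SEX_CODES`), the 0-120 age range, and the `validate` function are assumptions made for illustration, not rules taken from any real system.

```python
# Hypothetical record layout: field name -> expected Python type.
SCHEMA = {"person_id": int, "age": int, "sex": str}
SEX_CODES = {"M", "F"}          # assumed code list


def validate(record):
    """Return a list of (rule_type, message) findings for one record."""
    findings = []
    # Structural checks: completeness, data types, valid codes.
    for field, expected_type in SCHEMA.items():
        if field not in record or record[field] is None:
            findings.append(("structural", f"{field} is missing"))
        elif not isinstance(record[field], expected_type):
            findings.append(("structural", f"{field} has the wrong type"))
    if isinstance(record.get("sex"), str) and record["sex"] not in SEX_CODES:
        findings.append(("structural", "sex is not in the code list"))
    # Business rule: value range (only checked when age is a valid integer).
    if isinstance(record.get("age"), int) and not 0 <= record["age"] <= 120:
        findings.append(("business", "age is outside 0-120"))
    return findings


records = [
    {"person_id": 1, "age": 34, "sex": "F"},
    {"person_id": 2, "age": 150, "sex": "X"},
    {"person_id": 3, "age": None, "sex": "M"},
]
# Keeping the findings per record is one way to retain process/quality metadata.
validation_log = {r["person_id"]: validate(r) for r in records}
```

Storing `validation_log` alongside the data is one possible way to keep track of validation results, so that later steps can see which records failed and why.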
Data validation, editing and transformation
Data Editing
• Correction of errors found during validation
• Structural errors may require going back to the collection phase
• Different procedures for correction:
• Imputation: Data is corrected automatically based on imputation methods: deductive, substitution,
estimator, cold/hot deck, nearest neighbour, etc.
• Interactive editing: by means of an editor program (which can be the same one used for data collection)
• Corrected data must be submitted back to validation.
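As an illustration of automatic correction, here is a minimal sketch of one of the listed methods, a sequential hot deck, under simplifying assumptions: records are processed in file order, the donor is the most recent earlier record in the same class, and the field names (`sector`, `hours`, `imputed`) are hypothetical.

```python
def hot_deck(records, target, by):
    """Sequential hot deck: fill a missing `target` with the value of the most
    recent earlier record (the donor) that shares the same `by` class."""
    last_seen = {}              # imputation class -> latest observed value
    result = []
    for rec in records:
        rec = dict(rec)         # copy so the input stays untouched
        group = rec[by]
        if rec[target] is None:
            if group in last_seen:
                rec[target] = last_seen[group]
                rec["imputed"] = True       # flag for process/quality metadata
        else:
            last_seen[group] = rec[target]
        result.append(rec)
    return result


data = [
    {"id": 1, "sector": "A", "hours": 40},
    {"id": 2, "sector": "B", "hours": 35},
    {"id": 3, "sector": "A", "hours": None},
    {"id": 4, "sector": "C", "hours": None},   # no donor yet: stays missing
]
edited = hot_deck(data, target="hours", by="sector")
```

Record 4 has no donor and stays missing, so it would still fail a completeness check when the edited data is sent back through validation.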
Data Transformation
• Calculation of derived variables and/or aggregates.
• Generation of new datasets in different formats
• Anonymization
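The three transformation tasks can be sketched together in a few lines. The age bands, the field names, and the `transform` function are assumptions for illustration. Dropping direct identifiers is only the simplest part of anonymization and would not be sufficient on its own in practice.

```python
from collections import Counter


def age_group(age):
    """Derived variable: assumed age bands, chosen only for illustration."""
    if age < 15:
        return "0-14"
    if age < 65:
        return "15-64"
    return "65+"


def transform(records):
    """Derive a variable, drop direct identifiers, and build an aggregate."""
    microdata = []
    for rec in records:
        out = {k: v for k, v in rec.items() if k not in {"name", "person_id"}}
        out["age_group"] = age_group(rec["age"])
        microdata.append(out)
    aggregate = Counter(rec["age_group"] for rec in microdata)
    return microdata, dict(aggregate)


people = [
    {"person_id": 1, "name": "A. Example", "age": 12},
    {"person_id": 2, "name": "B. Example", "age": 30},
    {"person_id": 3, "name": "C. Example", "age": 47},
]
microdata, counts = transform(people)
```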
[Flow diagram: COLLECTION → VALIDATION; on ERROR → EDITING → back to VALIDATION; on OK → TRANSFORMATION → DISSEMINATION]
23. 24
The 2008 ILO Declaration on Social Justice for a Fair
Globalization recommends the establishment of appropriate
indicators to monitor and evaluate progress in the
implementation of the ILO Decent Work Agenda.
Decent work is considered central to sustainable poverty
reduction and is a means to achieve equitable, inclusive and
sustainable development.
In September 2008, the ILO convened an International Tripartite
Meeting of Experts to Develop a Decent Work Indicators
Framework, which was presented and adopted at the 18th
International Conference of Labour Statisticians in December
2008.
Decent Work Indicators Reference Framework
24. 25
The framework comprises ten substantive elements (and an
additional one on the economic and social context) corresponding
to the four strategic pillars of the Decent Work Agenda.
DECENT WORK AGENDA
Strategic Pillars
Full and productive employment
Rights at work
Social protection
Promotion of social dialogue
Decent Work Indicators Reference Framework
The Decent Work Agenda includes a cross-cutting objective of
gender equality. Thus, the Decent Work Indicators will be
disaggregated by sex, whenever possible.
25. 26
SUBSTANTIVE ELEMENTS OF DECENT WORK
1. Employment opportunities
2. Adequate earnings and productive work
3. Decent working time
4. Combining work, family and personal life
5. Work that should be abolished
6. Stability and security of work
7. Equal opportunity and treatment in employment
8. Safe work environment
9. Social security
10. Social dialogue, employers’ and workers’ representation
Decent Work Indicators Reference Framework
26. 27
1. Employment opportunities
INDICATOR DW SDG
Labour force participation rate
Employment to population ratio
Unemployment rate 8.5.2
Unemployment by education
Long term unemployment 19th ICLS
Status in employment (ICSE)
Employment by occupation (ISCO)
Wages in non-agriculture job
NEET 8.6.1
Labour underutilization
Informal employment rate 8.3.1
Decent Work Indicators Reference Framework
27. 28
Labour Force Participation Rate
\[ \text{Labour force participation rate} = \frac{\text{Labour force}}{\text{Working-age population}} \times 100 \]
– Working-Age Population and Labour Force should normally
correspond to persons aged 15 and above
– Labour force corresponds to persons either in employment or
unemployment
Sources: LFS, Census, other
28. 29
Persons outside the labour force
(inactivity rate)
\[ \text{Inactivity rate} = \frac{\text{Working-age population} - (\text{persons in employment} + \text{persons unemployed})}{\text{Working-age population}} \times 100 \]
– Share of the working-age population that is not in the labour force (%)
– Includes discouraged workers
– It is the complement of the labour force participation rate (the two rates sum to 100%, which serves as a cross-check)
Sources: LFS, Census, other household surveys
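A short worked example of both rates, including the cross-check that they are complements (they add up to 100 when expressed as percentages). The function names and the population figures are invented for illustration.

```python
def labour_force_participation_rate(employed, unemployed, working_age_pop):
    """Labour force (employed + unemployed) as a % of working-age population."""
    return (employed + unemployed) / working_age_pop * 100


def inactivity_rate(employed, unemployed, working_age_pop):
    """Persons outside the labour force as a % of working-age population."""
    return (working_age_pop - (employed + unemployed)) / working_age_pop * 100


# Made-up figures, in thousands of persons aged 15 and above.
employed, unemployed, working_age_pop = 5_400, 600, 10_000

lfpr = labour_force_participation_rate(employed, unemployed, working_age_pop)
inactive = inactivity_rate(employed, unemployed, working_age_pop)

# Cross-check: the two rates are complements and should add up to 100.
assert abs(lfpr + inactive - 100) < 1e-9
```

With these figures the labour force is 6,000 out of 10,000, so the participation rate is 60% and the inactivity rate is 40%.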
29. 30
The Statistical Production
Quality Management / Data & Metadata Management
INFORMATION STORAGE TOOLS
VTL
Quality Management / Data & Metadata Management
Documentation
Configuration
& Settings
Notes
31. 32
Technical metadata
• System’s parameters
– Govern automatic updates and scheduled batch processes
• Structural information
– Structural validation of data collection instruments
• User Access Control
– Roles, credentials, access rights, etc.
• Context information
– Dissemination website: language, country, subject
Types of Metadata
• Technical
• Process
• Business
– Structural
– Reference: Descriptive (conceptual), covering Metacontent, Methodology and Quality; Ext. Resources
32. 33
Process metadata (Paradata)
From the field
• Interview mode
– PAPI, TAPI, CAPI
• Date and time of the interview
• Geographical coordinates
• Number of attempts
Internal
• Consistency rules
• Formulas for derived variables or calculated indicators
33. 34
Structural metadata
Business > Structural metadata
• The “heart” of the metadata driven system.
– code lists for all the concepts in use
– definition of all the artefacts used for
• Data collection: questionnaires, DSD’s, etc.
• Dissemination: tables, charts, navigation menus, etc.
• Stored in a single metadata repository
– Shared by all the modules
– Single point of maintenance
34. 35
Statistical metacontent
Business > Reference metadata > Descriptive metadata > Statistical metacontent
• Typical “data about data”.
• Two classes of metacontent:
– Observation value status: sometimes called a flag
– Notes: controlled vocabulary (coded at collection time) or free text
• Cleaning module
– Checking for mandatory and contradictory notes
• Calculation module
– “Operating” on notes for derived indicators
• Dissemination module
– Table metadata, flags and footnotes
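One way to picture how a calculation module might "operate" on notes is to carry flags and notes from the inputs over to the derived indicator. This is a sketch under assumptions: the `Observation` class, the status codes ("A" for normal, "E" for estimated), the note codes, and the propagation rule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    value: float
    status: str = "A"                        # assumed codes: A = normal, E = estimated
    notes: set = field(default_factory=set)  # coded footnotes


def ratio(numerator, denominator):
    """Derive a percentage and carry the inputs' metacontent over to it."""
    status = "E" if "E" in (numerator.status, denominator.status) else "A"
    return Observation(
        value=numerator.value / denominator.value * 100,
        status=status,
        notes=numerator.notes | denominator.notes,
    )


unemployed = Observation(600, status="E", notes={"excl_armed_forces"})
labour_force = Observation(6_000, notes={"age_15_plus"})
rate = ratio(unemployed, labour_force)
```

Here the derived rate is marked as estimated because one input was, and it inherits the notes of both inputs, so a dissemination table could show the matching flag and footnotes.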
35. 36
Why is this metadata relevant?
[Images: "Unlabeled stuff" versus "Labeled stuff"]
The bean example is taken from: A Manager’s
Introduction to Adobe eXtensible Metadata Platform,
http://www.adobe.com/products/xmp/pdfs/whitepaper.pdf
36. 37
Methodological metadata
Business > Reference metadata > Descriptive metadata > Methodological metadata
• Methodology used for data collection and processing
• Sampling procedures
• Primary source’s metadata is also pertinent to secondary data
– collect and disseminate this information in DDI-C (DDI Codebook)
37. 38
Quality metadata
Business > Reference metadata > Descriptive metadata > Quality metadata
• Provides information about the quality of the data
– Ex.: After consistency checking, the data status can be “Error”,
“Ready for dissemination” or “Ready by allowance”.
– Additional quality-related information is attached as
comments/annotations
• Some paradata is also quality metadata
– Ex.: Number of substitutions in a survey sample, number of non-responses
• Stored in the Workflow tables or “Administrative” modules
38. 39
External resources
Business > Reference metadata > External resources
• Artefacts and documents related to the studies
– Questionnaires
– Methodological guidelines
– Reports
– Maps
– Computer programs
40. 41
Survey Documentation using DDI-C
The data documentation, or reference metadata, helps
the researcher to:
– find the data they are interested in.
– understand what the data is measuring and how the data
has been created.
– assess the quality of the data.
41. 42
Questionnaires
Technical, analytical, administrative docs
• Sample selection information
• Listing forms
• Manuals, lists of codes, etc.
• Logistical documentation
• Personnel organization and structure
• Budget
• Any planning documentation
Metadata collection: Valuable resources
42. 43
Computer programs
• Data entry, editing
• Tabulations, computations
Tables, photos, maps
Reports
• Final reports
• Consultant reports
Others
• Press releases and media articles
• Other information or documentation
Metadata collection: Valuable resources
43. 44
The Statistical Production
Quality Management / Data & Metadata Management
INFORMATION STORAGE
Quality Management / Data & Metadata Management
Documentation
Configuration
& Settings
Notes