A quality framework for the evaluation of administrative and survey data

Piet J.H. Daas, Judit Arends-Tóth, Barry Schouten and Léander Kuivenhoven

Statistics Netherlands
Overview

 Reason for work
 View on Quality
 Starting point
 Literature study results
 Overview of the framework
 Application
 Future work
Reason for work
 Statistics Netherlands is increasing its use of
  data (sources) collected and maintained by
  others
  • To decrease response burden and costs

 As a result:
  • More dependent on external data sources
  • Must be able to monitor the quality of external
    data sources
     – Develop a way to monitor the quality of external
       data sources
View on quality

 Statistics Netherlands definition of the quality of
  external data sources:
      “Usability for the production of statistics”


 Differs from quality as used by the data source
  maintainer
Starting point
 Previous work at Statistics Netherlands
 • Most recent: Work of Daas and Fonville
    – Determination of register quality
        – Described in a paper for the register seminar in Finland
          (2007)
 • Improvement is possible
    – Predominantly business register data
    – International experience could be included more

 New project started
 • Develop quality framework for the evaluation of
   external data sources (focus on registers and
   administrative data)
Literature study (1)
 Performed an extensive literature study
 • Publicly available papers/books that studied
   the quality of administrative data sources and
   registers → Dutch and English
 • A lot of research focuses on the quality of
   survey-collected data → these were excluded
 • Ended up with quite a limited list of important
   papers:
     –   Book by the Wallgrens (S)
     –   Daas and Fonville (NL)
     –   Eurostat paper on Quality of administrative Data
     –   Work performed at ONS and by Thomas (UK)
     –   UNECE paper of Nordic countries
Literature study (2)

 Conclusions:
 • A general level of agreement
     – The papers identified many similar quality aspects
       (quality indicators)

 • None of the ‘views’ on quality were exactly alike
     • How to combine all these views?
         • Something higher than a dimension was needed
         • Karr et al. 2006 used the term Hyperdimension to
           distinguish different views on quality

 • Combine all quality aspects identified in all studies
   and new aspects in a single framework !!
Quality framework

 Framework has 4 hyperdimensions
 • Four views on the quality of the external data source


 The hyperdimensions identified are:
 • Source     → Data source as a whole
 • Metadata → Conceptual metadata of data in source
 • Data       → Facts (values) in data source
 • Process    → Processing related quality aspects
Quality framework levels
 Levels distinguished:

 HYPERDIMENSION
   → DIMENSION (n > 1 per hyperdimension)
     → QUALITY INDICATOR (n >= 1 per dimension)
       → Measurement method (1:n per quality indicator)
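
The four levels can be sketched as a small data model. This is only an illustration in Python, not part of the framework itself: the class and field names are assumptions, and the example entries are taken from the Source hyperdimension.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Indicator:
    """A quality indicator with one or more measurement methods (1:n)."""
    name: str
    measurement_methods: List[str]

    def __post_init__(self) -> None:
        if len(self.measurement_methods) < 1:
            raise ValueError("an indicator needs at least one measurement method")


@dataclass
class Dimension:
    """A dimension holding at least one quality indicator (n >= 1)."""
    name: str
    indicators: List[Indicator]

    def __post_init__(self) -> None:
        if len(self.indicators) < 1:
            raise ValueError("a dimension needs at least one indicator")


@dataclass
class Hyperdimension:
    """A view on quality holding more than one dimension (n > 1)."""
    name: str
    dimensions: List[Dimension]

    def __post_init__(self) -> None:
        if len(self.dimensions) < 2:
            raise ValueError("a hyperdimension needs more than one dimension")


# Illustrative subset of the Source hyperdimension.
source = Hyperdimension(
    name="Source",
    dimensions=[
        Dimension("Supplier", [
            Indicator("Contact", ["Name of the data source",
                                  "Data source contact info"]),
        ]),
        Dimension("Relevance", [
            Indicator("Usefulness", ["Importance of data source for NSI"]),
        ]),
    ],
)

for dimension in source.dimensions:
    for indicator in dimension.indicators:
        print(source.name, dimension.name, indicator.name,
              indicator.measurement_methods)
```

The cardinality checks in `__post_init__` mirror the n > 1, n >= 1 and 1:n labels in the level diagram.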
1) Source hyperdimension

 Here the data source is viewed as a file
  delivered by the data source maintainer to
  the national statistical institute (NSI)



 Dimensions (5):
 • Supplier, Relevance, Privacy and security,
   Delivery, and Procedures
Source hyperdimension

Summary view:

Hyperdimension  Dimension             Indicator        Measurement method
Source          Supplier              Contact          Name, contact information
                Relevance             Adm. burden      Effect of use on adm. burden (time and money)
                Privacy and security  Legal provision  Check if Personal Data Protection Act applies
                Delivery              Costs            Costs of use for NSI

Detailed view:

1. Supplier
    1.1 Contact        - Name of the data source
                       - Data source contact info
                       - NSI contact person
    1.2 Purpose        - Reason for use of the data source by NSI
2. Relevance
    2.1 Usefulness     - Importance of data source for NSI
    2.2 Envisaged use  - Potential statistical use of data source
    2.3 Information    - Does the data source satisfy ...
2) Metadata hyperdimension

 Focuses on the conceptual metadata quality
  aspects of the data source.

 Other metadata aspects (such as process
  metadata) are not included

 Dimensions (4):
 • Clarity, Comparability, Unique keys, and Data
   treatment by data source maintainer
Metadata hyperdimension

Summary view:

Hyperdimension  Dimension               Indicator              Measurement method
Metadata        Clarity                 Population definition  Description of the population used in data source
                Unique keys             Identification keys    Presence of unique keys (which keys are present)
                Data treatment by data  Checks                 Variable value checks performed
                source maintainer       Modifications          Familiarity with data modifications

Detailed view:

1. Clarity
    1.1 Population definition                     - Clarity score of the definition
    1.2 Definition of variables (and categories)  - Clarity score of the definition
    1.3 Time dimensions                           - Clarity score of the definition
    1.4 Geographic demarcation                    - Clarity score of the definition
    1.5 Definition changes                        - Familiarity with occurred changes
2. Comparability
    2.1 Population definition comparison          - Comparability with NSI definition
3) Data hyperdimension

        Aspects related to data in the data source
        • All aspects are accuracy related

        Actively being discussed at our office
         • Future changes may be possible

        Dimensions (9)
        • Over coverage, Under coverage, Linkability, Unit
          non-response, Item non-response, Measurement,
          Processing, Precision, and Sensitivity


Remark: Precision was mainly included for (externally collected) survey data
Data hyperdimension

Summary view:

Hyperdimension  Dimension      Indicator             Measurement method
Data            Over coverage  Non-population units  Percentage of units not belonging to population of NSI
                Linkability    Linkable units        Percentage of units linked
                Measurement    Incompatible records  Fraction of fields with violated edit rules
                Processing     Adjustment            Fraction of fields adjusted
                               Imputation            Fraction of fields imputed

Detailed view:

1. Over coverage
    1.1 Non-population units      - Percentage of units not belonging to population
2. Under coverage
    2.1 Missing units             - Percentage of missing population units
    2.2 Selectivity               - R-index for population composition
    2.3 Effect on core variables  - Maximum bias of average for core variable
                                  - Maximum RMSE of average

    R-index: Representative index; RMSE: Root mean square Error; MSE: Mean Square Error
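
A few of the Data indicators can be sketched in code. The slides do not give formulas, so everything below is an assumption for illustration: units are identified by IDs that are comparable between the source and the NSI target population, the R-index follows the commonly used definition R = 1 - 2·S(ρ) with S the standard deviation of estimated propensities ρ, and all function names are hypothetical.

```python
import math
from typing import Iterable, List


def percentage(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole == 0:
        raise ValueError("cannot compute a percentage of an empty set")
    return 100.0 * part / whole


def over_coverage_pct(source_ids: Iterable[int], population_ids: Iterable[int]) -> float:
    """Percentage of source units that do not belong to the target population."""
    source = set(source_ids)
    return percentage(len(source - set(population_ids)), len(source))


def missing_units_pct(source_ids: Iterable[int], population_ids: Iterable[int]) -> float:
    """Percentage of target population units missing from the source."""
    population = set(population_ids)
    return percentage(len(population - set(source_ids)), len(population))


def r_index(propensities: List[float]) -> float:
    """R = 1 - 2 * S(rho); equals 1 when all propensities are equal."""
    n = len(propensities)
    mean = sum(propensities) / n
    variance = sum((p - mean) ** 2 for p in propensities) / (n - 1)
    return 1.0 - 2.0 * math.sqrt(variance)


def rmse(estimates: List[float], reference: float) -> float:
    """Root mean square error of estimates around a reference value."""
    return math.sqrt(sum((e - reference) ** 2 for e in estimates) / len(estimates))


# Made-up example: 5 source units, 8 target population units.
source_ids = [1, 2, 3, 4, 5]
population_ids = [2, 3, 4, 5, 6, 7, 8, 9]

print(over_coverage_pct(source_ids, population_ids))  # 20.0
print(missing_units_pct(source_ids, population_ids))  # 50.0
print(r_index([0.5, 0.5, 0.5, 0.5]))                  # 1.0
print(rmse([1.0, 3.0], 2.0))                          # 1.0
```

In practice the propensities passed to `r_index` would have to be estimated, for example from auxiliary variables; that estimation step is outside this sketch.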
4) Process hyperdimension

 Focuses on the processing of the data source
 • by the data source maintainer
 • by the NSI

 Not discussed here, future work

 Framework was developed without specifically
  focusing on process-related quality aspects
 • main focus was product related
Framework and external data sources
 Developed for administrative data
 • Registers, among others

 Why not use it for surveys?
 • In cases where the data are collected by an
   organization other than Statistics Netherlands
 • Experiences in the past resulted in many quality
   related discussions
    – transparency of data collection process?

 Resulted in some minor adjustments of the
  framework
 • Some terms were adjusted
 • Reviewed with a sample-based approach in mind
Application of the framework
 Should be applied to externally collected data
  sources
 • Administrative data, registers, non-NSI surveys

 How to apply?
 • Source and Metadata hyperdimension
     – Checklists have been developed
 • Data hyperdimension
     – Methods of calculation have been proposed
     – Currently looking at a practical means to apply these
 • Process hyperdimension
     – Under investigation
Application of the framework (2)
For each data source and use of data source
 1) Evaluate Source with checklist
    – 2 ways: a quick and a complete scan
    – When no problems occur continue
 2) Evaluate Metadata with checklist
    – 3 ways: replace, additional or new
    – When no problems occur continue
 3) Evaluate Data
    – In a standardized way (scripts or computer program)
    – Probably requires some very specific test at the end
        – Framework should be generally applicable
        – User mostly has a specific use in mind
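
The stepwise procedure above (evaluate Source, then Metadata, then Data, continuing only when no problems occur) can be sketched as a gated sequence. This is a minimal illustration, not the actual checklists: the `evaluate` function, the check functions and the example problem text are all hypothetical.

```python
from typing import Callable, List, Tuple

# Each step pairs a hyperdimension name with a check that returns the list of
# problems it found (an empty list means "no problems, continue").
Step = Tuple[str, Callable[[], List[str]]]


def evaluate(steps: List[Step]) -> Tuple[List[str], List[str]]:
    """Run the steps in order and stop at the first one that reports problems.

    Returns the names of the steps passed and the problems of the failing
    step (empty if every step passed).
    """
    completed: List[str] = []
    for name, check in steps:
        problems = check()
        if problems:
            return completed, problems
        completed.append(name)
    return completed, []


# Hypothetical checks for one data source and one intended use.
def check_source() -> List[str]:
    return []  # the Source checklist found no problems


def check_metadata() -> List[str]:
    return ["population definition unclear"]


def check_data() -> List[str]:
    return []  # not reached in this example


completed, problems = evaluate([
    ("Source", check_source),
    ("Metadata", check_metadata),
    ("Data", check_data),
])
print(completed)  # ['Source']
print(problems)   # ['population definition unclear']
```

Stopping early keeps the more expensive Data evaluation from running on a source that already failed a checklist.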
Future work

 Evaluate registers and external survey data
 • Is the framework generally applicable for all
   sources?
 Thoroughly test Source and Metadata checklists
 • Feedback on usability from users
 Calculation methods for Data
 • A single way of determining every Quality indicator
 Study how to efficiently evaluate Data
 • E.g. Scripts or computer program
 Determine the quality aspects in the Process
  hyperdimension
Questions?

Quality challenges in modernising business statistics
 
Quality Approaches to Big Data
Quality Approaches to Big DataQuality Approaches to Big Data
Quality Approaches to Big Data
 

Recently uploaded

What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
GeorgeMilliken2
 
MDP on air pollution of class 8 year 2024-2025
MDP on air pollution of class 8 year 2024-2025MDP on air pollution of class 8 year 2024-2025
MDP on air pollution of class 8 year 2024-2025
khuleseema60
 
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
Nguyen Thanh Tu Collection
 
Temple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation resultsTemple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation results
Krassimira Luka
 
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
imrankhan141184
 
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptxBeyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
EduSkills OECD
 
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptxBIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
RidwanHassanYusuf
 
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptxNEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
iammrhaywood
 
Level 3 NCEA - NZ: A Nation In the Making 1872 - 1900 SML.ppt
Level 3 NCEA - NZ: A  Nation In the Making 1872 - 1900 SML.pptLevel 3 NCEA - NZ: A  Nation In the Making 1872 - 1900 SML.ppt
Level 3 NCEA - NZ: A Nation In the Making 1872 - 1900 SML.ppt
Henry Hollis
 
Chapter wise All Notes of First year Basic Civil Engineering.pptx
Chapter wise All Notes of First year Basic Civil Engineering.pptxChapter wise All Notes of First year Basic Civil Engineering.pptx
Chapter wise All Notes of First year Basic Civil Engineering.pptx
Denish Jangid
 
Juneteenth Freedom Day 2024 David Douglas School District
Juneteenth Freedom Day 2024 David Douglas School DistrictJuneteenth Freedom Day 2024 David Douglas School District
Juneteenth Freedom Day 2024 David Douglas School District
David Douglas School District
 
A Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two HeartsA Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two Hearts
Steve Thomason
 
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
ImMuslim
 
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptxPrésentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
siemaillard
 
Stack Memory Organization of 8086 Microprocessor
Stack Memory Organization of 8086 MicroprocessorStack Memory Organization of 8086 Microprocessor
Stack Memory Organization of 8086 Microprocessor
JomonJoseph58
 
THE SACRIFICE HOW PRO-PALESTINE PROTESTS STUDENTS ARE SACRIFICING TO CHANGE T...
THE SACRIFICE HOW PRO-PALESTINE PROTESTS STUDENTS ARE SACRIFICING TO CHANGE T...THE SACRIFICE HOW PRO-PALESTINE PROTESTS STUDENTS ARE SACRIFICING TO CHANGE T...
THE SACRIFICE HOW PRO-PALESTINE PROTESTS STUDENTS ARE SACRIFICING TO CHANGE T...
indexPub
 
spot a liar (Haiqa 146).pptx Technical writhing and presentation skills
spot a liar (Haiqa 146).pptx Technical writhing and presentation skillsspot a liar (Haiqa 146).pptx Technical writhing and presentation skills
spot a liar (Haiqa 146).pptx Technical writhing and presentation skills
haiqairshad
 
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem studentsRHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
Himanshu Rai
 
Elevate Your Nonprofit's Online Presence_ A Guide to Effective SEO Strategies...
Elevate Your Nonprofit's Online Presence_ A Guide to Effective SEO Strategies...Elevate Your Nonprofit's Online Presence_ A Guide to Effective SEO Strategies...
Elevate Your Nonprofit's Online Presence_ A Guide to Effective SEO Strategies...
TechSoup
 
Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47
MysoreMuleSoftMeetup
 

Recently uploaded (20)

What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
 
MDP on air pollution of class 8 year 2024-2025
MDP on air pollution of class 8 year 2024-2025MDP on air pollution of class 8 year 2024-2025
MDP on air pollution of class 8 year 2024-2025
 
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
 
Temple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation resultsTemple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation results
 
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
 
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptxBeyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
 
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptxBIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
 
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptxNEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
 
Level 3 NCEA - NZ: A Nation In the Making 1872 - 1900 SML.ppt
Level 3 NCEA - NZ: A  Nation In the Making 1872 - 1900 SML.pptLevel 3 NCEA - NZ: A  Nation In the Making 1872 - 1900 SML.ppt
Level 3 NCEA - NZ: A Nation In the Making 1872 - 1900 SML.ppt
 
Chapter wise All Notes of First year Basic Civil Engineering.pptx
Chapter wise All Notes of First year Basic Civil Engineering.pptxChapter wise All Notes of First year Basic Civil Engineering.pptx
Chapter wise All Notes of First year Basic Civil Engineering.pptx
 
Juneteenth Freedom Day 2024 David Douglas School District
Juneteenth Freedom Day 2024 David Douglas School DistrictJuneteenth Freedom Day 2024 David Douglas School District
Juneteenth Freedom Day 2024 David Douglas School District
 
A Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two HeartsA Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two Hearts
 
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
 
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptxPrésentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
 
Stack Memory Organization of 8086 Microprocessor
Stack Memory Organization of 8086 MicroprocessorStack Memory Organization of 8086 Microprocessor
Stack Memory Organization of 8086 Microprocessor
 
THE SACRIFICE HOW PRO-PALESTINE PROTESTS STUDENTS ARE SACRIFICING TO CHANGE T...
THE SACRIFICE HOW PRO-PALESTINE PROTESTS STUDENTS ARE SACRIFICING TO CHANGE T...THE SACRIFICE HOW PRO-PALESTINE PROTESTS STUDENTS ARE SACRIFICING TO CHANGE T...
THE SACRIFICE HOW PRO-PALESTINE PROTESTS STUDENTS ARE SACRIFICING TO CHANGE T...
 
spot a liar (Haiqa 146).pptx Technical writhing and presentation skills
spot a liar (Haiqa 146).pptx Technical writhing and presentation skillsspot a liar (Haiqa 146).pptx Technical writhing and presentation skills
spot a liar (Haiqa 146).pptx Technical writhing and presentation skills
 
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem studentsRHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
 
Elevate Your Nonprofit's Online Presence_ A Guide to Effective SEO Strategies...
Elevate Your Nonprofit's Online Presence_ A Guide to Effective SEO Strategies...Elevate Your Nonprofit's Online Presence_ A Guide to Effective SEO Strategies...
Elevate Your Nonprofit's Online Presence_ A Guide to Effective SEO Strategies...
 
Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47
 

Proposal for a quality framework for the evaluation of administrative and survey data

  • 1. A quality framework for the evaluation of administrative and survey data Piet J.H. Daas, Judit Arends-Tóth, Barry Schouten and Léander Kuivenhoven Statistics Netherlands
  • 2. Overview
      ▪ Reason for work
      ▪ View on Quality
      ▪ Starting point
      ▪ Literature study results
      ▪ Overview of the framework
      ▪ Application
      ▪ Future work
  • 3. Reason for work
      ▪ Statistics Netherlands is increasing its use of data (sources) collected and maintained by others
          • To decrease response burden and costs
      ▪ As a result:
          • More dependent on external data sources
          • Must be able to monitor the quality of external data sources
              – Develop a way to monitor the quality of external data sources
  • 4. View on quality
      ▪ Statistics Netherlands definition of the quality of external data sources: “Usability for the production of statistics”
      ▪ Differs from quality as used by the data source maintainer
  • 5. Starting point
      ▪ Previous work at Statistics Netherlands
          • Most recent: Work of Daas and Fonville
              – Determination of register quality
              – Described in a paper for the register seminar in Finland (2007)
          • Improvement is possible
              – Predominantly business register data
              – International experience could be included more
      ▪ New project started
          • Develop quality framework for the evaluation of external data sources (focus on registers and administrative data)
  • 6. Literature study (1)
      ▪ Performed an extensive literature study
          • Publicly available papers/books that studied the quality of administrative data sources and registers → Dutch and English
          • A lot of research focuses on the quality of survey-collected data → these were excluded
          • Ended up with quite a limited list of important papers:
              – Book by the Wallgrens (S)
              – Daas and Fonville (NL)
              – Eurostat paper on Quality of administrative Data
              – Work performed at ONS and by Thomas (UK)
              – UNECE paper by the Nordic countries
  • 7. Literature study (2)
      ▪ Conclusions:
          • A general level of agreement
              – The papers identified many similar quality aspects (quality indicators)
          • None of the ‘views’ on quality were exactly alike
          • How to combine all these views?
          • Something higher than a dimension was needed
          • Karr et al. (2006) used the term Hyperdimension to distinguish different views on quality
          • Combine all quality aspects identified in all studies, plus new aspects, in a single framework!
  • 8. Quality framework
      ▪ Framework has 4 hyperdimensions
          • Four views on the quality of the external data source
      ▪ The hyperdimensions identified are:
          • Source → Data source as a whole
          • Metadata → Conceptual metadata of data in source
          • Data → Facts (values) in data source
          • Process → Processing related quality aspects
  • 9. Quality framework levels
      ▪ Levels distinguished:
          HYPERDIMENSION
            n > 1
          DIMENSION
            n >= 1
          QUALITY INDICATOR
            1:n
          Measurement method
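The level diagram can be read as a nested data structure. The sketch below is one possible reading, not the authors' implementation: it assumes "n > 1" means a hyperdimension holds more than one dimension, "n >= 1" means a dimension holds at least one quality indicator, and "1:n" means an indicator has one or more measurement methods. The class names and the `structure_problems` helper are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Indicator:
    name: str
    methods: List[str] = field(default_factory=list)  # measurement methods (assumed 1:n)


@dataclass
class Dimension:
    name: str
    indicators: List[Indicator] = field(default_factory=list)  # assumed n >= 1


@dataclass
class Hyperdimension:
    name: str
    dimensions: List[Dimension] = field(default_factory=list)  # assumed n > 1


def structure_problems(hyper: Hyperdimension) -> List[str]:
    """Return violations of the assumed cardinalities; an empty list means none."""
    problems = []
    if len(hyper.dimensions) <= 1:
        problems.append(f"{hyper.name}: needs more than one dimension")
    for dim in hyper.dimensions:
        if not dim.indicators:
            problems.append(f"{dim.name}: needs at least one quality indicator")
        for ind in dim.indicators:
            if not ind.methods:
                problems.append(f"{ind.name}: needs at least one measurement method")
    return problems


# Illustrative fragment only; names are taken from the Source hyperdimension slides.
source = Hyperdimension("Source", [
    Dimension("Supplier", [Indicator("Contact", ["Name", "Contact information"])]),
    Dimension("Relevance", [Indicator("Adm. burden", ["Effect of use on adm. burden"])]),
])
print(structure_problems(source))
```

Keeping the hierarchy explicit like this would make it straightforward to attach a checklist answer or a computed value to each measurement method later on.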
  • 10. 1) Source hyperdimension
      ▪ Here the data source is viewed as a file delivered by the data source maintainer to the NSI
      ▪ Dimensions (5):
          • Supplier, Relevance, Privacy and security, Delivery, and Procedures
  • 11. Source hyperdimension
      Hyperdimension | Dimension | Indicator | Measurement method
      Source | Supplier | Contact | Name, Contact information
      Source | Relevance | Adm. burden | Effect of use on adm. burden (time and money)
      Source | Privacy and security | Legal provision | Check if Personal Data Protection act applies
      Source | Delivery | Costs | Costs of use of data source

      1. Supplier
          1.1 Contact
              – Name of the data source
              – Data source contact info
              – NSI contact person
          1.2 Purpose
              – Reason for use of the data source by NSI
      2. Relevance
          2.1 Usefulness
              – Importance data source for NSI
          2.2 Envisaged use
              – Potential statistical use for NSI
          2.3 Information
              – Does the data source satisfy
  • 12. 2) Metadata hyperdimension
      ▪ Focuses on the conceptual metadata quality aspects of the data source
      ▪ Other metadata aspects (such as process metadata) are not included
      ▪ Dimensions (4):
          • Clarity, Comparability, Unique keys, and Data treatment by data source maintainer
  • 13. Metadata hyperdimension
      Hyperdimension | Dimension | Indicator | Measurement method
      Metadata | Clarity | Population definition | Description of the population used in data source
      Metadata | Unique keys | Identification keys | Presence of unique keys (which)
      Metadata | Data treatment by data source maintainer | Checks | Variable value checks performed
      Metadata | Data treatment by data source maintainer | Modifications | Familiarity with data modifications

      1. Clarity
          1.1 Population definition
              – Clarity score of the definition
          1.2 Definition of variables (and categories)
              – Clarity score of the definition
          1.3 Time dimensions
              – Clarity score of the definition
          1.4 Geographic demarcation
              – Clarity score of the definition
          1.5 Definition changes
              – Familiarity with occurred changes
      2. Comparability
          2.1 Population definition comparison
              – Comparability with NSI definition
  • 14. 3) Data hyperdimension
      ▪ Aspects related to data in the data source
          • All aspects are accuracy related
      ▪ Actively being discussed at our office
          • Future changes may be possible
      ▪ Dimensions (9)
          • Over coverage, Under coverage, Linkability, Unit non-response, Item non-response, Measurement, Processing, Precision, and Sensitivity
      Remark: Precision was mainly included for (externally collected) survey data
  • 15. Data hyperdimension
      Hyperdimension | Dimension | Indicator | Measurement method
      Data | Over coverage | Non-pop. units | Percentage of units not belonging to population of NSI
      Data | Linkability | Linkable units | Percentage of units linked
      Data | Measurement | Incompatible records | Fraction of fields with violated edit rules
      Data | Processing | Adjustment | Fraction of fields adjusted
      Data | Processing | Imputation | Fraction of fields imputed

      1. Over coverage
          1.1 Non-population units
              – Percentage of units not belonging to population
      2. Under coverage
          2.1 Missing units
              – Percentage of missing population units
          2.2 Selectivity
              – R-index for population composition
          2.3 Effect on core variables
              – Maximum bias of average for core variable
              – Maximum RMSE of average
      R-index: Representative index; RMSE: Root Mean Square Error; MSE: Mean Square Error
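Several of the Data indicators listed on this slide are simple ratios. The sketch below shows how they might be computed under stated assumptions; it is not the calculation method proposed by the authors. It assumes units are identified by comparable keys, that the NSI target population is available as a list of keys, and that the R-index follows the commonly used definition R = 1 − 2·S(ρ), with S(ρ) taken here as the population standard deviation of already-estimated propensities ρ. All function names are hypothetical.

```python
from statistics import pstdev
from typing import Iterable, Sequence


def over_coverage_pct(source_ids: Iterable, population_ids: Iterable) -> float:
    """Percentage of source units that do not belong to the target population."""
    source, population = set(source_ids), set(population_ids)
    if not source:
        raise ValueError("source has no units")
    return 100.0 * len(source - population) / len(source)


def under_coverage_pct(source_ids: Iterable, population_ids: Iterable) -> float:
    """Percentage of target population units that are missing from the source."""
    source, population = set(source_ids), set(population_ids)
    if not population:
        raise ValueError("population has no units")
    return 100.0 * len(population - source) / len(population)


def fraction_imputed(imputed_flags: Sequence[bool]) -> float:
    """Fraction of fields flagged as imputed (True) among all fields."""
    if not imputed_flags:
        raise ValueError("no fields given")
    return sum(imputed_flags) / len(imputed_flags)


def r_index(propensities: Sequence[float]) -> float:
    """Assumed definition: R = 1 - 2 * S(rho); 1.0 means equal propensities."""
    return 1.0 - 2.0 * pstdev(propensities)


# Toy data, invented for illustration only.
source_units = [1, 2, 3, 4, 5]
population_units = [2, 3, 4, 5, 6, 7, 8, 9]
print(over_coverage_pct(source_units, population_units))
print(under_coverage_pct(source_units, population_units))
print(fraction_imputed([True, False, False, True]))
print(r_index([0.2, 0.8]))
```

The bias and RMSE indicators are left out on purpose: the slide does not say how their maxima are derived, so any formula here would be a guess.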
  • 16. 4) Process hyperdimension
      ▪ Focuses on the processing of the data source
          • by the data source maintainer
          • by the NSI
      ▪ Not discussed here, future work
      ▪ Framework was developed without specifically focusing on process-related quality aspects
          • main focus was product related
  • 17. Framework and external data sources
      ▪ Developed for administrative data
          • Registers, among others
      ▪ Why not use it for surveys?
          • In cases where the data are collected by an organization other than Statistics Netherlands
          • Experiences in the past resulted in many quality related discussions
              – transparency of data collection process?
      ▪ Resulted in some minor adjustments of the framework
          • Some terms were adjusted
          • Review with sample approach in mind
  • 18. Application of the framework
      ▪ Should be applied to externally collected data sources
          • Administrative data, registers, non-NSI surveys
      ▪ How to apply?
          • Source and Metadata hyperdimension
              – Checklists have been developed
          • Data hyperdimension
              – Methods of calculation have been proposed
              – Currently looking at a practical means to apply these
          • Process hyperdimension
              – Under investigation
  • 19. Application of the framework (2)
      For each data source and use of data source
      1) Evaluate Source with checklist
          – 2 ways: a quick and a complete scan
          – When no problems occur continue
      2) Evaluate Metadata with checklist
          – 3 ways: replace, additional or new
          – When no problems occur continue
      3) Evaluate Data
          – In a standardized way (scripts or computer program)
          – Probably requires some very specific test at the end
          – Framework should be generally applicable
          – User mostly has a specific use in mind
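The three steps above form a staged evaluation in which each stage runs only if the previous one found no problems. A minimal sketch of that control flow follows, assuming each stage can be expressed as a function that returns a list of problems. The `evaluate` function and the three example checks are hypothetical placeholders, not the actual checklists.

```python
from typing import Callable, List, Tuple

Check = Callable[[], List[str]]  # a stage returns the list of problems it found


def evaluate(stages: List[Tuple[str, Check]]) -> Tuple[str, List[str]]:
    """Run stages in order and stop at the first one that reports problems."""
    for name, check in stages:
        problems = check()
        if problems:
            return name, problems
    return "passed", []


calls: List[str] = []  # records which stages actually ran


def source_check() -> List[str]:
    calls.append("Source")
    return []  # placeholder: checklist found no problems


def metadata_check() -> List[str]:
    calls.append("Metadata")
    return ["population definition unclear"]  # placeholder problem


def data_check() -> List[str]:
    calls.append("Data")
    return []


result = evaluate([
    ("Source", source_check),
    ("Metadata", metadata_check),
    ("Data", data_check),
])
print(result, calls)
```

In this toy run the Data stage is never reached, because the Metadata stage reports a problem first.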
  • 20. Future work
      ▪ Evaluate registers and external survey data
          • Is the framework generally applicable for all sources?
      ▪ Thoroughly test Source and Metadata checklists
          • Feedback on usability by users
      ▪ Calculation methods for Data
          • A single way of determining every Quality indicator
      ▪ Study how to efficiently evaluate Data
          • E.g. Scripts or computer program
      ▪ Determine the quality aspects in the Process hyperdimension