SlideShare a Scribd company logo
1 of 19
Introduction to Survey Data
Quality
Olga Maslovskaya
University of Southampton
Survey Data
• Vast amounts of survey data are collected for many
purposes, including governmental information, public
opinion and election surveys, advertising and market
research as well as scientific research
• Survey data underlie many public policy and
business decisions
• Good quality data reduces the risk of poor policies
and decisions and is of crucial importance
Survey challenges
• Budgets are severely constrained (survey costs)
• Pressures on providing timely data are greater in the
digital age
• Public interest in participating in surveys is declining
and now at all time low (response rates)
• When cooperation obtained from reluctant
respondents, responses may be less accurate
• New modes of data collection introduce new
concerns for data quality (errors)
Definition
• Quality can be defined simply as “fitness for
use”
• Quality is a requirement for survey data to be as
accurate as necessary to achieve their intended
purposes, be available at the time it is needed
(timely), and be accessible to those for whom the
survey was conducted.
Biemer and Lyberg (2003)
Total Survey Quality (TSQ)
Total Survey Quality (TSQ)
TSQ – survey quality is more than its accuracy or statistical
dimension. It also includes among other factors producing
results that fit the needs of the survey users and providing
results that users will have confidence in. Usability of results
is of crucial importance.
Statistical
Dimension
Non-statistical
Dimension
TSQ: Quality Dimensions –Statistical
• Accuracy of estimates is the difference between
the estimate and the true parameter value
• Accuracy is the larger concept of TSQ
X = T + e
Observed
item
True value Error
Variance
(random error)
Bias (systematic
error)
TSQ: Quality Dimensions – Non-statistical
• Relevance - product is relevant and meets user needs
• Timeliness and punctuality – in disseminating results
(most important user needs)
• Accessibility and clarity – of the information
• Comparability - reliable comparisons across space and
time are often crucial; cross-national comparisons
• Coherence - single source – elementary concepts can
be combined in more complex ways; different sources –
based on common definitions, classifications and
methodological standards
• Completeness - data are rich enough to satisfy the
analysis objectives
TSQ: Quality Dimensions – Non-statistical
• Credibility – credible methodology
• Interpretability – documentation is clear
• Richness of detail - data are rich enough to
satisfy the analysis objectives
• Level of confidentiality protection
• Cost – data give good value for money
Total Survey Error (TSE)
• TSE concept was developed by Robert Groves
(1989) in book on Survey Errors and Survey Costs
• Survey estimates are derived from complex survey
data, published estimates may differ from their true
parameter values due to survey errors
• Total Survey Error is the difference between a
population mean, total, or other population
parameter and the estimate of the parameter based
on the sample survey (or census) (Biemer and
Lyberg, 2003)
TSE
TSE= sampling errors + non-sampling errors
Sources of Sampling Error
Sampling errors – can be computed for probability
samples and are due to selecting a sample instead of the
entire population
Sources:
• Sampling scheme
• Sample size
• Estimator choice
Components of Non-sampling Error
Non-sampling errors (including measurement error – cannot be
formally estimated but can be improved by interviewing
procedures and question wordings etc.) - are errors due to
mistakes or system deficiencies, also from incomplete responses
to the survey or its questions, etc.
1. Specification error
2. Frame error
3. Nonresponse error
4. Measurement error
5. Data processing error
6. Modelling/Estimation error
Other Important Factors
A number of additional factors can impact survey
data quality. Four of the more important:
• the length of time the survey was fielded,
• the use of incentives,
• the reputation of the organisation conducting the
survey
• mode of data collection
Actors affecting data quality
• Respondents (respondents’ effects on data quality):
e.g., response styles, satisficing (less efforts to
provide optimal responses)
• Interviewers (interviewers’ effects on data quality):
e.g., fabrication, ability to elicit interest and
commitment to survey in respondents, duration of
interview, duplication of responses apart from say
demographic
• Supervisors and survey research organisations
(supervisors’ effects on data quality), e.g. sampling
design, training of field workers
Data quality monitoring strategies
• Continuous quality improvement (CQI)
– Special cause variation – errors made by individual coders
– Common cause variation – errors due to the process itself
• Responsive design (Groves and Heeringa, 2006)
• Adaptive design – real time control of costs and
errors – (Schouten et al. 2013)
• Adaptive Total Design (Biemer, 2010) – adaptive
design strategy that combines ides of CQI and the
TSE paradigm to reduce costs and error across
multiple survey processes
• Six Sigma – set of principles and strategies for
improving any process
Data Quality in Practice
• No instance where a total survey quality (TSQ)
measure has ever been calculated or combined
single measure of quality taking all dimensions into
account
• Cost-benefit trade-offs to minimise different errors
depending on survey aims
• Quality reports or quality declarations have been
used where information on each dimension is
provided
• Data quality guides are meant to alert the data user
to potential sources of bias that might be present
Conclusions (1)
• Data quality is a multi-dimensional concept
• Single score or measure of data quality is not available
• Cost-benefit trade-offs to minimize different errors depending
on survey aims
• Quality frameworks are developed and adopted
• Broad range of relevant data quality indicators and information
are available with data
• The chances of users misusing the data or misinterpreting
published statistics is reduced if they understand better the
strengths and limitations of the data.
• New technologies require fresh considerations of data quality
issues in new types of surveys
Conclusions (2)
High quality of the survey data brings
– improvement in the quality of surveys themselves
– improvement of the quality of research and of policy
and financial decisions that are based on the survey
data
References
• Biemer (2010) Total survey error: Design, implementation, and evaluation. Public
Opinion Quarterly, 74(5): 817-848.
• Biemer (2016) Total Survey Error Paradigm: Theory and Practice. In The Sage
handbook of survey methodology by Wolf, Joye, Smith and Fu. London: SAGE
publications.
• Biemer and Lyberg (2003) Introduction to survey quality. New York: John Wiley & Sons.
• Groves and Heeringa (2006) Responsive design for household surveys: Tools for
actively controlling survey errors and costs. Journal of the Royal Statistical Society
Series A, 169 (3): 439-457.
• Lynn (2004) Editorial: Measuring and communicating survey quality. Journal of the Royal
Statistical Society Series A, 167 (4): 575-578.
• Lyberg and Weisberg (2016) The SAGE handbook of survey methodology. London:
SAGE publications.
• Schouten et al. (2013) Optimizing quality of response through adaptive survey designs.
Survey Methodology, 39 (1): 29-39.
• Weisberg (2005) The total survey error approach. Chicago: University of Chicago Press.

More Related Content

What's hot

Predictive value and likelihood ratio
Predictive value and likelihood ratio Predictive value and likelihood ratio
Predictive value and likelihood ratio Abino David
 
Interpretation of research results comendador, g.
Interpretation of research results  comendador, g.Interpretation of research results  comendador, g.
Interpretation of research results comendador, g.Ginalyn Comendador
 
Quantitative Data - A Basic Introduction
Quantitative Data - A Basic IntroductionQuantitative Data - A Basic Introduction
Quantitative Data - A Basic IntroductionDrKevinMorrell
 
Sampling techniques
Sampling techniquesSampling techniques
Sampling techniquesDr. Adrija Roy
 
Concept of Inferential statistics
Concept of Inferential statisticsConcept of Inferential statistics
Concept of Inferential statisticsSarfraz Ahmad
 
Statistics in research by dr. sudhir sahu
Statistics in research by dr. sudhir sahuStatistics in research by dr. sudhir sahu
Statistics in research by dr. sudhir sahuSudhir INDIA
 
Tools in Qualitative Research: Validity and Reliability
Tools in Qualitative Research: Validity and ReliabilityTools in Qualitative Research: Validity and Reliability
Tools in Qualitative Research: Validity and ReliabilityDr. Sarita Anand
 
Qualitative codes and coding
Qualitative codes and coding Qualitative codes and coding
Qualitative codes and coding Heather Ford
 
Sampling design and procedures
Sampling design and proceduresSampling design and procedures
Sampling design and proceduresPrabesh Ghimire
 
variable in research
variable in researchvariable in research
variable in researchQudratullah
 
Statistical Data Analysis | Data Analysis | Statistics Services | Data Collec...
Statistical Data Analysis | Data Analysis | Statistics Services | Data Collec...Statistical Data Analysis | Data Analysis | Statistics Services | Data Collec...
Statistical Data Analysis | Data Analysis | Statistics Services | Data Collec...Stats Statswork
 
Ethics In Research
Ethics In ResearchEthics In Research
Ethics In Researchali haider
 
Data Analysis in Research: Descriptive Statistics & Normality
Data Analysis in Research: Descriptive Statistics & NormalityData Analysis in Research: Descriptive Statistics & Normality
Data Analysis in Research: Descriptive Statistics & NormalityIkbal Ahmed
 
Data Collection (Methods/ Tools/ Techniques), Primary & Secondary Data, Quali...
Data Collection (Methods/ Tools/ Techniques), Primary & Secondary Data, Quali...Data Collection (Methods/ Tools/ Techniques), Primary & Secondary Data, Quali...
Data Collection (Methods/ Tools/ Techniques), Primary & Secondary Data, Quali...Bikash Sapkota
 
bio statistics for clinical research
bio statistics for clinical researchbio statistics for clinical research
bio statistics for clinical researchRanjith Paravannoor
 
Stratified random sampling
Stratified random samplingStratified random sampling
Stratified random samplingwaiton sherekete
 

What's hot (20)

Predictive value and likelihood ratio
Predictive value and likelihood ratio Predictive value and likelihood ratio
Predictive value and likelihood ratio
 
Interpretation of research results comendador, g.
Interpretation of research results  comendador, g.Interpretation of research results  comendador, g.
Interpretation of research results comendador, g.
 
Quantitative Data - A Basic Introduction
Quantitative Data - A Basic IntroductionQuantitative Data - A Basic Introduction
Quantitative Data - A Basic Introduction
 
Sampling techniques
Sampling techniquesSampling techniques
Sampling techniques
 
Biostatistics ppt
Biostatistics  pptBiostatistics  ppt
Biostatistics ppt
 
Concept of Inferential statistics
Concept of Inferential statisticsConcept of Inferential statistics
Concept of Inferential statistics
 
Statistics in research by dr. sudhir sahu
Statistics in research by dr. sudhir sahuStatistics in research by dr. sudhir sahu
Statistics in research by dr. sudhir sahu
 
Kappa
KappaKappa
Kappa
 
Tools in Qualitative Research: Validity and Reliability
Tools in Qualitative Research: Validity and ReliabilityTools in Qualitative Research: Validity and Reliability
Tools in Qualitative Research: Validity and Reliability
 
Qualitative codes and coding
Qualitative codes and coding Qualitative codes and coding
Qualitative codes and coding
 
Sampling design and procedures
Sampling design and proceduresSampling design and procedures
Sampling design and procedures
 
variable in research
variable in researchvariable in research
variable in research
 
Statistical Data Analysis | Data Analysis | Statistics Services | Data Collec...
Statistical Data Analysis | Data Analysis | Statistics Services | Data Collec...Statistical Data Analysis | Data Analysis | Statistics Services | Data Collec...
Statistical Data Analysis | Data Analysis | Statistics Services | Data Collec...
 
Ethics In Research
Ethics In ResearchEthics In Research
Ethics In Research
 
The mann whitney u test
The mann whitney u testThe mann whitney u test
The mann whitney u test
 
Data Analysis in Research: Descriptive Statistics & Normality
Data Analysis in Research: Descriptive Statistics & NormalityData Analysis in Research: Descriptive Statistics & Normality
Data Analysis in Research: Descriptive Statistics & Normality
 
Data Collection (Methods/ Tools/ Techniques), Primary & Secondary Data, Quali...
Data Collection (Methods/ Tools/ Techniques), Primary & Secondary Data, Quali...Data Collection (Methods/ Tools/ Techniques), Primary & Secondary Data, Quali...
Data Collection (Methods/ Tools/ Techniques), Primary & Secondary Data, Quali...
 
Ppt data collection
Ppt data collectionPpt data collection
Ppt data collection
 
bio statistics for clinical research
bio statistics for clinical researchbio statistics for clinical research
bio statistics for clinical research
 
Stratified random sampling
Stratified random samplingStratified random sampling
Stratified random sampling
 

Similar to Introduction to Survey Data Quality

Module-7-Descriptive Research-survey.pdf
Module-7-Descriptive Research-survey.pdfModule-7-Descriptive Research-survey.pdf
Module-7-Descriptive Research-survey.pdfVikramjit Singh
 
MethodsofDataCollection.pdf
MethodsofDataCollection.pdfMethodsofDataCollection.pdf
MethodsofDataCollection.pdfssuser9878d0
 
MethodsofDataCollection.pdf
MethodsofDataCollection.pdfMethodsofDataCollection.pdf
MethodsofDataCollection.pdfMohdTaufiqIshak
 
Descriptive research-survey
Descriptive research-surveyDescriptive research-survey
Descriptive research-surveyVikramjit Singh
 
Questionnaire2002
Questionnaire2002Questionnaire2002
Questionnaire2002wardah azhar
 
RESEARCH APPROACHES AND DESIGNS.pptx
RESEARCH APPROACHES AND DESIGNS.pptxRESEARCH APPROACHES AND DESIGNS.pptx
RESEARCH APPROACHES AND DESIGNS.pptxPRADEEP ABOTHU
 
COMMUNITY NEED ASSESSMENT.pptx
COMMUNITY NEED ASSESSMENT.pptxCOMMUNITY NEED ASSESSMENT.pptx
COMMUNITY NEED ASSESSMENT.pptxGhaffarAhmed9
 
Data collection methods RSS6 2014
Data collection methods RSS6 2014Data collection methods RSS6 2014
Data collection methods RSS6 2014RSS6
 
Evaluating Systems Change
Evaluating Systems ChangeEvaluating Systems Change
Evaluating Systems ChangeNoel Hatch
 
Chapter Eight Quantitative Methods
Chapter Eight Quantitative MethodsChapter Eight Quantitative Methods
Chapter Eight Quantitative MethodsInternational advisers
 
Quantitative search and_qualitative_research by mubarak
Quantitative search and_qualitative_research by mubarakQuantitative search and_qualitative_research by mubarak
Quantitative search and_qualitative_research by mubarakHafiza Abas
 
Evaluation of Health IT Implementation (March 20, 2019)
Evaluation of Health IT Implementation (March 20, 2019)Evaluation of Health IT Implementation (March 20, 2019)
Evaluation of Health IT Implementation (March 20, 2019)Nawanan Theera-Ampornpunt
 
Evaluation of Health IT Implementation
Evaluation of Health IT ImplementationEvaluation of Health IT Implementation
Evaluation of Health IT ImplementationNawanan Theera-Ampornpunt
 
Evaluation of Health IT Implementation (February 17, 2021)
Evaluation of Health IT Implementation (February 17, 2021)Evaluation of Health IT Implementation (February 17, 2021)
Evaluation of Health IT Implementation (February 17, 2021)Nawanan Theera-Ampornpunt
 
Practical Research 1 about quantitative and qualitative methods
Practical Research 1 about quantitative and qualitative methodsPractical Research 1 about quantitative and qualitative methods
Practical Research 1 about quantitative and qualitative methodsAndoJoshua
 
wepik-exploring-effective-strategies-for-data-collection-navigating-the-chall...
wepik-exploring-effective-strategies-for-data-collection-navigating-the-chall...wepik-exploring-effective-strategies-for-data-collection-navigating-the-chall...
wepik-exploring-effective-strategies-for-data-collection-navigating-the-chall...Adikesavaperumal
 
MARKET RESEARCH WEEK LESSONN PLAN 5.pptx
MARKET RESEARCH WEEK LESSONN PLAN 5.pptxMARKET RESEARCH WEEK LESSONN PLAN 5.pptx
MARKET RESEARCH WEEK LESSONN PLAN 5.pptxPreciousChanaiwa
 
Educ 210-research-design
Educ 210-research-designEduc 210-research-design
Educ 210-research-designBernadetteSLomeda
 

Similar to Introduction to Survey Data Quality (20)

Data quality: total survey error
Data quality: total survey errorData quality: total survey error
Data quality: total survey error
 
Module-7-Descriptive Research-survey.pdf
Module-7-Descriptive Research-survey.pdfModule-7-Descriptive Research-survey.pdf
Module-7-Descriptive Research-survey.pdf
 
MethodsofDataCollection.pdf
MethodsofDataCollection.pdfMethodsofDataCollection.pdf
MethodsofDataCollection.pdf
 
MethodsofDataCollection.pdf
MethodsofDataCollection.pdfMethodsofDataCollection.pdf
MethodsofDataCollection.pdf
 
Descriptive research-survey
Descriptive research-surveyDescriptive research-survey
Descriptive research-survey
 
Questionnaire2002
Questionnaire2002Questionnaire2002
Questionnaire2002
 
Survey
SurveySurvey
Survey
 
RESEARCH APPROACHES AND DESIGNS.pptx
RESEARCH APPROACHES AND DESIGNS.pptxRESEARCH APPROACHES AND DESIGNS.pptx
RESEARCH APPROACHES AND DESIGNS.pptx
 
COMMUNITY NEED ASSESSMENT.pptx
COMMUNITY NEED ASSESSMENT.pptxCOMMUNITY NEED ASSESSMENT.pptx
COMMUNITY NEED ASSESSMENT.pptx
 
Data collection methods RSS6 2014
Data collection methods RSS6 2014Data collection methods RSS6 2014
Data collection methods RSS6 2014
 
Evaluating Systems Change
Evaluating Systems ChangeEvaluating Systems Change
Evaluating Systems Change
 
Chapter Eight Quantitative Methods
Chapter Eight Quantitative MethodsChapter Eight Quantitative Methods
Chapter Eight Quantitative Methods
 
Quantitative search and_qualitative_research by mubarak
Quantitative search and_qualitative_research by mubarakQuantitative search and_qualitative_research by mubarak
Quantitative search and_qualitative_research by mubarak
 
Evaluation of Health IT Implementation (March 20, 2019)
Evaluation of Health IT Implementation (March 20, 2019)Evaluation of Health IT Implementation (March 20, 2019)
Evaluation of Health IT Implementation (March 20, 2019)
 
Evaluation of Health IT Implementation
Evaluation of Health IT ImplementationEvaluation of Health IT Implementation
Evaluation of Health IT Implementation
 
Evaluation of Health IT Implementation (February 17, 2021)
Evaluation of Health IT Implementation (February 17, 2021)Evaluation of Health IT Implementation (February 17, 2021)
Evaluation of Health IT Implementation (February 17, 2021)
 
Practical Research 1 about quantitative and qualitative methods
Practical Research 1 about quantitative and qualitative methodsPractical Research 1 about quantitative and qualitative methods
Practical Research 1 about quantitative and qualitative methods
 
wepik-exploring-effective-strategies-for-data-collection-navigating-the-chall...
wepik-exploring-effective-strategies-for-data-collection-navigating-the-chall...wepik-exploring-effective-strategies-for-data-collection-navigating-the-chall...
wepik-exploring-effective-strategies-for-data-collection-navigating-the-chall...
 
MARKET RESEARCH WEEK LESSONN PLAN 5.pptx
MARKET RESEARCH WEEK LESSONN PLAN 5.pptxMARKET RESEARCH WEEK LESSONN PLAN 5.pptx
MARKET RESEARCH WEEK LESSONN PLAN 5.pptx
 
Educ 210-research-design
Educ 210-research-designEduc 210-research-design
Educ 210-research-design
 

More from University of Southampton

Generating SPSS training materials in StatJR
Generating SPSS training materials in StatJRGenerating SPSS training materials in StatJR
Generating SPSS training materials in StatJRUniversity of Southampton
 
Introduction to the Stat-JR software package
Introduction to the Stat-JR software packageIntroduction to the Stat-JR software package
Introduction to the Stat-JR software packageUniversity of Southampton
 
Multi level modelling- random coefficient models | Ian Brunton-Smith
Multi level modelling- random coefficient models | Ian Brunton-SmithMulti level modelling- random coefficient models | Ian Brunton-Smith
Multi level modelling- random coefficient models | Ian Brunton-SmithUniversity of Southampton
 
Multi level modelling - random intercept models | Ian Brunton Smith
Multi level modelling - random intercept models | Ian Brunton SmithMulti level modelling - random intercept models | Ian Brunton Smith
Multi level modelling - random intercept models | Ian Brunton SmithUniversity of Southampton
 
Introduction to multilevel modelling | Ian Brunton-Smith
Introduction to multilevel modelling | Ian Brunton-SmithIntroduction to multilevel modelling | Ian Brunton-Smith
Introduction to multilevel modelling | Ian Brunton-SmithUniversity of Southampton
 
Biosocial research:How to use biological data in social science research?
Biosocial research:How to use biological data in social science research?Biosocial research:How to use biological data in social science research?
Biosocial research:How to use biological data in social science research?University of Southampton
 
Integrating biological and social research data - Michaela Benzeval
Integrating biological and social research data - Michaela BenzevalIntegrating biological and social research data - Michaela Benzeval
Integrating biological and social research data - Michaela BenzevalUniversity of Southampton
 
Teaching research methods: pedagogy hooks
Teaching research methods: pedagogy hooksTeaching research methods: pedagogy hooks
Teaching research methods: pedagogy hooksUniversity of Southampton
 
Teaching research methods: pedagogy of methods learning
Teaching research methods: pedagogy of methods learningTeaching research methods: pedagogy of methods learning
Teaching research methods: pedagogy of methods learningUniversity of Southampton
 
Multilevel models:random coefficient models
Multilevel models:random coefficient modelsMultilevel models:random coefficient models
Multilevel models:random coefficient modelsUniversity of Southampton
 
Multilevel models:random intercept models
Multilevel models:random intercept modelsMultilevel models:random intercept models
Multilevel models:random intercept modelsUniversity of Southampton
 
Introduction to spatial interaction modelling
Introduction to spatial interaction modellingIntroduction to spatial interaction modelling
Introduction to spatial interaction modellingUniversity of Southampton
 
Survey questions and measurement error
Survey questions and measurement errorSurvey questions and measurement error
Survey questions and measurement errorUniversity of Southampton
 
Participatory performative and mobile methods
Participatory performative and mobile methodsParticipatory performative and mobile methods
Participatory performative and mobile methodsUniversity of Southampton
 
Participatory theatre as a social research methods
Participatory theatre as a social research methodsParticipatory theatre as a social research methods
Participatory theatre as a social research methodsUniversity of Southampton
 

More from University of Southampton (20)

Generating SPSS training materials in StatJR
Generating SPSS training materials in StatJRGenerating SPSS training materials in StatJR
Generating SPSS training materials in StatJR
 
Introduction to the Stat-JR software package
Introduction to the Stat-JR software packageIntroduction to the Stat-JR software package
Introduction to the Stat-JR software package
 
Multi level modelling- random coefficient models | Ian Brunton-Smith
Multi level modelling- random coefficient models | Ian Brunton-SmithMulti level modelling- random coefficient models | Ian Brunton-Smith
Multi level modelling- random coefficient models | Ian Brunton-Smith
 
Multi level modelling - random intercept models | Ian Brunton Smith
Multi level modelling - random intercept models | Ian Brunton SmithMulti level modelling - random intercept models | Ian Brunton Smith
Multi level modelling - random intercept models | Ian Brunton Smith
 
Introduction to multilevel modelling | Ian Brunton-Smith
Introduction to multilevel modelling | Ian Brunton-SmithIntroduction to multilevel modelling | Ian Brunton-Smith
Introduction to multilevel modelling | Ian Brunton-Smith
 
Biosocial research:How to use biological data in social science research?
Biosocial research:How to use biological data in social science research?Biosocial research:How to use biological data in social science research?
Biosocial research:How to use biological data in social science research?
 
Integrating biological and social research data - Michaela Benzeval
Integrating biological and social research data - Michaela BenzevalIntegrating biological and social research data - Michaela Benzeval
Integrating biological and social research data - Michaela Benzeval
 
Teaching research methods: pedagogy hooks
Teaching research methods: pedagogy hooksTeaching research methods: pedagogy hooks
Teaching research methods: pedagogy hooks
 
Teaching research methods: pedagogy of methods learning
Teaching research methods: pedagogy of methods learningTeaching research methods: pedagogy of methods learning
Teaching research methods: pedagogy of methods learning
 
better off living with parents
better off living with parentsbetter off living with parents
better off living with parents
 
Multilevel models:random coefficient models
Multilevel models:random coefficient modelsMultilevel models:random coefficient models
Multilevel models:random coefficient models
 
Multilevel models:random intercept models
Multilevel models:random intercept modelsMultilevel models:random intercept models
Multilevel models:random intercept models
 
Introduction to multilevel modelling
Introduction to multilevel modellingIntroduction to multilevel modelling
Introduction to multilevel modelling
 
How to write about research methods
How to write about research methodsHow to write about research methods
How to write about research methods
 
How to write about research methods
How to write about research methodsHow to write about research methods
How to write about research methods
 
Introduction to spatial interaction modelling
Introduction to spatial interaction modellingIntroduction to spatial interaction modelling
Introduction to spatial interaction modelling
 
Cognitive interviewing
Cognitive interviewingCognitive interviewing
Cognitive interviewing
 
Survey questions and measurement error
Survey questions and measurement errorSurvey questions and measurement error
Survey questions and measurement error
 
Participatory performative and mobile methods
Participatory performative and mobile methodsParticipatory performative and mobile methods
Participatory performative and mobile methods
 
Participatory theatre as a social research methods
Participatory theatre as a social research methodsParticipatory theatre as a social research methods
Participatory theatre as a social research methods
 

Recently uploaded

Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
Planning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxPlanning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxLigayaBacuel1
 
ROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationAadityaSharma884161
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.arsicmarija21
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfUjwalaBharambe
 
Romantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptxRomantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptxsqpmdrvczh
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 

Recently uploaded (20)

Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
Planning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxPlanning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptx
 
ROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint Presentation
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Raw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptxRaw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptx
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
 
Romantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptxRomantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptx
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 

Introduction to Survey Data Quality

  • 1. Introduction to Survey Data Quality Olga Maslovskaya University of Southampton
  • 2. Survey Data • Vast amounts of survey data are collected for many purposes, including governmental information, public opinion and election surveys, advertising and market research as well as scientific research • Survey data underlie many public policy and business decisions • Good quality data reduces the risk of poor policies and decisions and is of crucial importance
  • 3. Survey challenges • Budgets are severely constrained (survey costs) • Pressures on providing timely data are greater in the digital age • Public interest in participating in surveys is declining and now at all time low (response rates) • When cooperation obtained from reluctant respondents, responses may be less accurate • New modes of data collection introduce new concerns for data quality (errors)
  • 4. Definition • Quality can be defined simply as “fitness for use” • Quality is a requirement for survey data to be as accurate as necessary to achieve their intended purposes, be available at the time it is needed (timely), and be accessible to those for whom the survey was conducted. Biemer and Lyberg (2003)
  • 5. Total Survey Quality (TSQ) Total Survey Quality (TSQ) TSQ – survey quality is more than its accuracy or statistical dimension. It also includes among other factors producing results that fit the needs of the survey users and providing results that users will have confidence in. Usability of results is of crucial importance. Statistical Dimension Non-statistical Dimension
  • 6. TSQ: Quality Dimensions –Statistical • Accuracy of estimates is the difference between the estimate and the true parameter value • Accuracy is the larger concept of TSQ X = T + e Observed item True value Error Variance (random error) Bias (systematic error)
  • 7. TSQ: Quality Dimensions – Non-statistical • Relevance - product is relevant and meets user needs • Timeliness and punctuality – in disseminating results (most important user needs) • Accessibility and clarity – of the information • Comparability - reliable comparisons across space and time are often crucial; cross-national comparisons • Coherence - single source – elementary concepts can be combined in more complex ways; different sources – based on common definitions, classifications and methodological standards • Completeness - data are rich enough to satisfy the analysis objectives
  • 8. TSQ: Quality Dimensions – Non-statistical • Credibility – credible methodology • Interpretability – documentation is clear • Richness of detail - data are rich enough to satisfy the analysis objectives • Level of confidentiality protection • Cost – data give good value for money
  • 9. Total Survey Error (TSE) • TSE concept was developed by Robert Groves (1989) in book on Survey Errors and Survey Costs • Survey estimates are derived from complex survey data, published estimates may differ from their true parameter values due to survey errors • Total Survey Error is the difference between a population mean, total, or other population parameter and the estimate of the parameter based on the sample survey (or census) (Biemer and Lyberg, 2003)
  • 10. TSE TSE= sampling errors + non-sampling errors
  • 11. Sources of Sampling Error Sampling errors – can be computed for probability samples and are due to selecting a sample instead of the entire population Sources: • Sampling scheme • Sample size • Estimator choice
  • 12. Components of Non-sampling Error Non-sampling errors (including measurement error – cannot be formally estimated but can be improved by interviewing procedures and question wordings etc.) - are errors due to mistakes or system deficiencies, also from incomplete responses to the survey or its questions, etc. 1. Specification error 2. Frame error 3. Nonresponse error 4. Measurement error 5. Data processing error 6. Modelling/Estimation error
  • 13. Other Important Factors A number of additional factors can impact survey data quality. Four of the more important: • the length of time the survey was fielded, • the use of incentives, • the reputation of the organisation conducting the survey • mode of data collection
  • 14. Actors affecting data quality • Respondents (respondents’ effects on data quality): e.g., response styles, satisficing (less efforts to provide optimal responses) • Interviewers (interviewers’ effects on data quality): e.g., fabrication, ability to elicit interest and commitment to survey in respondents, duration of interview, duplication of responses apart from say demographic • Supervisors and survey research organisations (supervisors’ effects on data quality), e.g. sampling design, training of field workers
  • 15. Data quality monitoring strategies • Continuous quality improvement (CQI) – Special cause variation – errors made by individual coders – Common cause variation – errors due to the process itself • Responsive design (Groves and Heeringa, 2006) • Adaptive design – real time control of costs and errors – (Schouten et al. 2013) • Adaptive Total Design (Biemer, 2010) – adaptive design strategy that combines ides of CQI and the TSE paradigm to reduce costs and error across multiple survey processes • Six Sigma – set of principles and strategies for improving any process
  • 16. Data Quality in Practice • No instance where a total survey quality (TSQ) measure has ever been calculated or combined single measure of quality taking all dimensions into account • Cost-benefit trade-offs to minimise different errors depending on survey aims • Quality reports or quality declarations have been used where information on each dimension is provided • Data quality guides are meant to alert the data user to potential sources of bias that might be present
  • 17. Conclusions (1) • Data quality is a multi-dimensional concept • Single score or measure of data quality is not available • Cost-benefit trade-offs to minimize different errors depending on survey aims • Quality frameworks are developed and adopted • Broad range of relevant data quality indicators and information are available with data • The chances of users misusing the data or misinterpreting published statistics is reduced if they understand better the strengths and limitations of the data. • New technologies require fresh considerations of data quality issues in new types of surveys
  • 18. Conclusions (2) High quality of the survey data brings – improvement in the quality of surveys themselves – improvement of the quality of research and of policy and financial decisions that are based on the survey data
  • 19. References • Biemer (2010) Total survey error: Design, implementation, and evaluation. Public Opinion Quarterly, 74(5): 817-848. • Biemer (2016) Total Survey Error Paradigm: Theory and Practice. In The Sage handbook of survey methodology by Wolf, Joye, Smith and Fu. London: SAGE publications. • Biemer and Lyberg (2003) Introduction to survey quality. New York: John Wiley & Sons. • Groves and Heeringa (2006) Responsive design for household surveys: Tools for actively controlling survey errors and costs. Journal of the Royal Statistical Society Series A, 169 (3): 439-457. • Lynn (2004) Editorial: Measuring and communicating survey quality. Journal of the Royal Statistical Society Series A, 167 (4): 575-578. • Lyberg and Weisberg (2016) The SAGE handbook of survey methodology. London: SAGE publications. • Schouten et al. (2013) Optimizing quality of response through adaptive survey designs. Survey Methodology, 39 (1): 29-39. • Weisberg (2005) The total survey error approach. Chicago: University of Chicago Press.

Editor's Notes

  1. So data quality is crucial
  2. All these require reconsideration of existing data quality frameworks. New aspects should be taken into account Opt-in panels Big Data Difficult to convince clients that traditional probability surveys are the best and worth extra costs
  3. (Eurostat, Statistics Canada and Statistics Sweden) Statistics Canada: Relevance Accuracy Timeliness Accessibility Interpretability Coherence Statistics Sweden: Content Accuracy Timeliness Comparability/coherence Availability/clarity
  4. Bias – mean of errors is not equal to 0, does not cancel out; variance – mean of error is equal to 0, does cancel out Accuracy is The larger concept of Total Survey Quality (TSQ) Broader that accuracy definition is needed as users are not just interested in the accuracy of the estimates provided. Accuracy is the cornestone of quality, since without it, sruvey data are of little use. If the data are erroneous, it does not help much if relevance, timeliness, accessibility, comparability, coherence and completeness are sufficient.
  5. Relevance of statistical concept (product is relevant and meets user needs) Timeliness and punctuality in disseminating results (one of the most important user needs) Accessibility and clarity of the information Comparability (reliable comparisons across space and time are often crucial; cross-national comparisons) Coherence (single source – elementary concepts can be combined in more complex ways; different sources – based on common definitions, classifications and methodological standards) Completeness (data are rich enough to satisfy the analysis objectives) Credibility (credible methodology) Interpretability – documentation is clear Richness of detail Level of confidentiality protection Cost – data give good value for money
  6. Credibility (credible methodology) Interpretability – documentation is clear Richness of detail Level of confidentiality protection Cost – data give good value for money
  7. Simple random sampling is often neither possible nor cost-effective. Stratifying the sample can reduce the sampling error, clustering the sample can reduce costs but would increase the sampling error.
  8. In many cases non-sampling error can be much more damaging than sampling error to estimates from surveys
  9. Errors can be systematic or random and correlated or uncorrelated. Uncorrelated (e.g., interviewer mistakenly records a “yes” answer as a “no” Correlated (when interviewers take multiple interviewers and when cluster sampling is used – correlated errors increase the variance of estimates due to an effective sample size that is smaller than the intended one and thereby make it more difficult to achieve statistically significant results) Measurement errors pose a serious limitation to the validity and usefulness of the information collected via survey. Having excellent samples representative of the target population, having high response rates, having complete data, etc. does us little good if our measurement instruments evoke responses that are fraught with error. Measurement error is distinct from other survey errors and it is error that occurs when the recorded or observed value is different from the true value of the variable. Reliability and validity are important in measurement error. Reliability is “agreement between two efforts to measure the same thing, using maximally similar methods” How was the survey administered (e.g. in person, by telephone, online, multiple modes, etc.)? (sensitive questions) Were the questions well constructed, clear, and not leading or otherwise biasing? (satisficing) What steps, if any, were taken to ensure that respondents were providing truthful  answers to the questions, and were any respondents removed from the final dataset (e.g., identifying speeders, satisficers, multiple completions)? (in-survey behaviour)
  10. How long was the survey in the field and how much effort was put to ensuring a good response? What incentives, if any, were respondents offered to encourage participation? (monetary incentives could bias the survey responses towards lower income groups; some respondetns could rush through) What is the record of accomplishment of the organization that conducted the survey? (organisation with long and successful records can inspire confidence)
  11. For face to face and telephone interviews
  12. potentially modify the survey design based on the analysis of paradata collected from relevant processes CQI is methods emphasize imprivement in the underlying process rather than screening the product. Initital quality improvements efforts should focus on eliminating the special cause erros since thay re usually responsible for most of the errors in a process. Reducing the common cause errors will require changing the process in such a way that the error rate can be lowered. Approach have been adopted to control costs and errors in surveys. Uses number of quality management tools. Continuous quality improvement (CQI) uses number of standard quality management tools: worksflow diagram, cause and effect (or fishbone) diagram, Pareto histograms, statistical process control methods and various production efficiency metrics Responsive design (Groves and Heeringa, 2006) – strategy that includes some of the ideas and concepts and approaches of CQI while providing several innovative strategies that use paradata as well as survey data to monitor nonresponse bias and follow up efficiency and effectiveness Adaptive design – real time control of costs and errors – Schouten et al. 2013 – strategy for tailoring key features of the survey design for different types of sample members maximizing response rates and reduce nonresponse selectivity – appropriate when substantial prior information about sample units is available Adaptive Total Design (Biemer, 2010) – adaptive design strategy that combines ides of CQI and the TSE paradigm to reduce costs and error across multiple survey processes including frame consturction, sampling, data collection and data processing Six Sigma – set of principles and strategies for improving any process
  13. Report that provides comprehensive picture of the quality of a survey, addressing each potential source of error Supplemental to regular survey documentation and should be based on information that is available in many different forms such as survey methodology reports, user manuals on how to use microdata files, and technical reports providing details about specifics.
  14. Cost-benefit trade-offs are needed to decide which errors to minimize Quality frameworks were developed and adopted and provided statistics producers with clear description of how certain dimensions of quality can be measured and why it might be important to do so. The survey community needs to find ways of ensuring that as broad a range as possible of relevant indicators and information is made available routinely (Lynn 2004) The chances of users misusing the data or misinterpreting published statistics will be reduced if they understand better the strengths and limitations of the data. The publication of data quality measures itself represent an improvement in the quality of a survey