Total survey error (TSE) refers to all errors that occur during the survey process and contribute to estimates deviating from true population parameters. TSE comprises both sampling and non-sampling errors. The goal of the TSE framework is to minimize all sources of bias and variance through optimal survey design, implementation, data collection, processing, analysis and modelling. However, fully minimizing all errors is impossible given budget constraints, so trade-offs must be made to focus on reducing the most influential errors. Continuous quality improvement efforts such as redesigns, nonresponse reduction, quality monitoring and data quality indicators can help gradually reduce biases and unwanted variation in survey estimates over time.
Data quality: total survey error
1. Data Quality: Total Survey Error (TSE)
Dr Olga Maslovskaya
NCRM National Centre for Research Methods
University of Southampton
2. Survey Data
• Vast amounts of survey data are collected for many purposes, including governmental information, public opinion and election surveys, advertising and market research, as well as scientific research
• Survey data underlie many public policy and business decisions
• Good quality data reduces the risk of poor policies and decisions and is of crucial importance
3. Total Survey Quality (TSQ)
TSQ has two dimensions: a statistical dimension and a non-statistical dimension
4. TSQ: Quality Dimensions – Statistical
• Accuracy of an estimate is the difference between the estimate and the true parameter value
• Accuracy is the most important concept in TSQ

X = T + e

where X is the observed item, T is the true value and e is the error. The error has two components: bias (systematic error) and variance (random error).
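The decomposition above can be illustrated with a short simulation (a hypothetical sketch, not part of the slides; the values for T, the bias and the noise are invented for illustration):

```python
import random

# Hypothetical illustration of X = T + e: each observation equals the true
# value plus a systematic component (bias) and a random component (noise).
random.seed(42)

TRUE_VALUE = 50.0   # T: the true parameter
BIAS = 2.0          # systematic error: shifts every observation the same way
NOISE_SD = 5.0      # random error: varies from case to case

observations = [TRUE_VALUE + BIAS + random.gauss(0, NOISE_SD)
                for _ in range(100_000)]

mean_x = sum(observations) / len(observations)

# Averaging removes the random error but not the bias: the mean of X
# settles near T + bias, not near T.
print(f"mean of X:      {mean_x:.2f}")
print(f"systematic gap: {mean_x - TRUE_VALUE:.2f}")
```

Note how the random component shrinks as the number of observations grows, while the systematic gap remains.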
5. Total Survey Error (TSE) (1)
• The TSE concept was developed by Robert Groves (1989) in his book Survey Errors and Survey Costs
• Survey estimates are derived from complex survey data
• Published estimates may differ from their true parameter values due to survey errors
• Total Survey Error is the difference between a population mean, total, or other population parameter and the estimate of the parameter based on the sample survey (Biemer and Lyberg, 2003)
6. Total Survey Error (TSE) (2)
• Survey error is any error arising from the survey process that contributes to the deviation of an estimate from its true parameter value (Biemer, 2016)
• Survey error diminishes the accuracy of inferences derived from the survey
• TSE is the accumulation of all errors that may arise in the design, collection, processing, and analysis of survey data (Biemer, 2016)
7. TSE framework (1)
• A set of principles, methods and processes that minimise TSE within the allocated budget, subject to accuracy, timing and other constraints
• Non-statistical dimensions of TSQ can be viewed as constraints: timeliness and comparability constrain the design; accessibility, relevance and completeness constrain the budget (Biemer, 2017)
8. TSE framework (2)
The TSE framework provides principles that guide the stages of the survey process:
• Survey design
• Survey implementation
• Data collection
• Data processing
• Data analysis
• Modelling and estimation
Each stage of the survey process provides opportunities for errors, which accumulate into TSE
9. TSE
TSE = sampling errors + non-sampling errors
Survey errors:
• Sampling errors – can be computed for probability samples; they arise from selecting a sample instead of observing the entire population
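The sampling-error bullet can be made concrete with the textbook standard error of a mean under simple random sampling (a sketch under standard SRS assumptions, not taken from the slide; the numbers are invented):

```python
import math

# Sketch: sampling error of a sample mean under simple random sampling
# without replacement, including the finite population correction (fpc).

def standard_error_mean(sample_sd: float, n: int, population_size: int) -> float:
    """SE of the sample mean under SRS without replacement."""
    fpc = math.sqrt(1 - n / population_size)   # finite population correction
    return (sample_sd / math.sqrt(n)) * fpc

# A sample of 1,000 drawn from a population of 1,000,000 with sd = 12:
se = standard_error_mean(12.0, 1_000, 1_000_000)
print(f"standard error: {se:.3f}")
```

When the whole population is observed (n equals the population size), the correction drives the sampling error to zero, which matches the definition above.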
10. TSE
• Non-sampling errors – errors due to mistakes or system deficiencies, incomplete responses to the survey or its questions, etc. They include measurement error, which cannot be formally estimated but can be reduced through better interviewing procedures, question wording, and so on
In many cases non-sampling error can be much more damaging to survey estimates than sampling error
11. Sources of Sampling Error
• Sampling scheme
• Stratification
• Clustering
• Selection probabilities
• Sample size
• Overall sample size
• Effective sample size
• Estimator choice
• Simple
• Use of auxiliary information
• Model-based
• Model-assisted
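Two of the bullets above (clustering and effective sample size) are linked by Kish's approximate design effect, deff = 1 + (m - 1)ρ. A minimal sketch, assuming equal cluster sizes (the cluster size and intra-cluster correlation below are invented for illustration):

```python
# Sketch using Kish's approximate design effect, a standard survey-sampling
# formula (not given on the slide itself).

def design_effect(avg_cluster_size: float, icc: float) -> float:
    """Variance inflation from clustering: deff = 1 + (m - 1) * rho."""
    return 1 + (avg_cluster_size - 1) * icc

def effective_sample_size(n: int, avg_cluster_size: float, icc: float) -> float:
    """Size of a simple random sample with the same precision as this design."""
    return n / design_effect(avg_cluster_size, icc)

# 5,000 respondents interviewed in clusters of 25 with modest
# within-cluster similarity (rho = 0.05):
deff = design_effect(25, 0.05)
n_eff = effective_sample_size(5_000, 25, 0.05)
print(f"design effect: {deff:.2f}")   # 1 + 24 * 0.05 = 2.20
print(f"effective n:   {n_eff:.0f}")  # far fewer than the 5,000 interviewed
```

This is why clustering reduces fieldwork costs but increases sampling error, as the editor's notes also observe.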
13. Specification Error
•Refers to a question on the questionnaire
•Occurs when the concept implied by the survey
question and the concept that should be
measured in the survey differ (Biemer and
Lyberg, 2003)
14. Frame Error
•Arises from construction of the sampling
frame for the survey
•The sampling frame might have missing
elements (units), duplicates or erroneous
inclusions (nonpopulation units)
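Frame problems of the kinds listed above can be screened for mechanically (a hypothetical sketch; the frame entries and the eligibility flag are invented, and missing elements cannot be detected from the frame alone):

```python
# Hypothetical sketch: screening a sampling frame for duplicates and
# erroneous inclusions (nonpopulation units).
frame = [
    {"id": "A1", "address": "1 High St", "eligible": True},
    {"id": "A2", "address": "2 High St", "eligible": True},
    {"id": "A2", "address": "2 High St", "eligible": True},       # duplicate
    {"id": "B9", "address": "Old warehouse", "eligible": False},  # out of scope
]

seen: set[str] = set()
duplicates, erroneous = [], []
for entry in frame:
    if entry["id"] in seen:
        duplicates.append(entry["id"])      # unit listed more than once
    seen.add(entry["id"])
    if not entry["eligible"]:
        erroneous.append(entry["id"])       # unit outside the target population

print("duplicates:", duplicates)
print("erroneous inclusions:", erroneous)
```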
15. Nonresponse Error
• Unit nonresponse occurs when a sample unit (individual, household or organisation) does not respond to any part of the questionnaire
• Item nonresponse occurs when the questionnaire is only partially completed and some items are not answered
• Incomplete response occurs when the response to an open-ended question is incomplete, very short or inadequate
• Panel attrition occurs when a sample unit is lost over the period of a longitudinal study
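The distinction between unit and item nonresponse can be sketched as a simple check over survey returns (a hypothetical example; the question names and records are invented):

```python
# Hypothetical example: classifying survey returns as complete,
# item nonresponse or unit nonresponse.
QUESTIONS = ["age", "income", "satisfaction"]

responses = {
    "unit_001": {"age": 34, "income": 41_000, "satisfaction": 4},
    "unit_002": {"age": 52, "income": None, "satisfaction": 3},  # skipped income
    "unit_003": {},                                              # never responded
}

def classify(record: dict) -> str:
    """Label a single return according to which questions were answered."""
    answered = [q for q in QUESTIONS if record.get(q) is not None]
    if not answered:
        return "unit nonresponse"
    if len(answered) < len(QUESTIONS):
        return "item nonresponse"
    return "complete"

for unit_id, record in responses.items():
    print(unit_id, "->", classify(record))
```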
16. Measurement Error
• Measurement errors pose a serious limitation to the validity and usefulness of the data collected
• The most damaging source of error
• Without reliable measurements, analysis of the data makes little sense
17. Sources of Measurement Error
• Respondents
• May deliberately or unintentionally provide incorrect information
• Response-style behaviours (agreeing with everything, answering 'don't know' to every question, or choosing extreme response options); social desirability bias
• Satisficing (investing less effort than needed to provide optimal responses)
• Interviewers / enumerators
• May falsify data
• May inappropriately influence responses
18. • May have a negative impact on responses to sensitive questions
• May record responses incorrectly
• May fail to comply with the survey protocol
20. Processing Error
Contributes to measurement error
• Occurs during the data processing stage
• Errors in data editing
• Errors in data entry
• Errors in coding
• Errors in outlier editing
• Errors in assignment of survey weights
• Errors in nonresponse imputation
21. Modelling and Estimation Error
• Occurs during the data analysis (modelling) stage
• Errors in weight adjustments
• Errors in imputation
• Errors in the modelling process
22. Types of Errors
• Systematic error (bias) – errors that tend to agree, resulting in biased estimates (they can strengthen the apparent relations between variables, leading to false conclusions) – e.g. response styles or other stable behaviours – such errors distort the mean values of variables and do not cancel out
• Random error (variance) – errors that tend to disagree (e.g. unintended mistakes made by respondents) – they inflate the variance of estimates (and may weaken the relations between variables); they vary from case to case but are expected to cancel out
23. Mean Squared Error (MSE)
• Total survey error (TSE) refers to all sources of bias (systematic error) and variance (random error) that may affect the accuracy of survey data
• Mean Squared Error (MSE) is the metric for measuring TSE
• MSE is the sum of the squared total bias plus the variance components for all the various sources of error in the survey design
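The decomposition of MSE into squared bias plus variance can be verified numerically (a hedged sketch with invented numbers, simulating the estimates produced by many repeated surveys against a known true value):

```python
import random

# Hypothetical illustration: each repeated survey yields an estimate subject
# to a systematic error (bias) and a random error.
random.seed(1)

TRUE_VALUE = 100.0  # the true population parameter
BIAS = 3.0          # systematic error in every survey's estimate
SD = 4.0            # random error across repeated surveys

estimates = [TRUE_VALUE + BIAS + random.gauss(0, SD) for _ in range(100_000)]
n = len(estimates)
mean_est = sum(estimates) / n

# MSE computed directly as the average squared deviation from the truth...
mse = sum((x - TRUE_VALUE) ** 2 for x in estimates) / n

# ...and via its decomposition into squared bias plus variance.
bias = mean_est - TRUE_VALUE
variance = sum((x - mean_est) ** 2 for x in estimates) / n

print(f"MSE directly:      {mse:.2f}")   # roughly BIAS**2 + SD**2 = 25
print(f"bias^2 + variance: {bias ** 2 + variance:.2f}")
```

The two quantities agree (the identity is exact algebraically), which is why reducing either the bias or the variance of any error source reduces the total.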
24. MSE
• MSE cannot be calculated directly, but it is conceptually useful for considering how large the different components of error may be and how much they add to the total survey error
• MSE is a valuable guide for optimal survey design
25. MSE
• The survey design goal is to minimise the MSE
• When two designs are similar on other quality dimensions, the optimal design is the one achieving the smallest MSE
• Working to reduce the measurement error on one set of questions could increase the error for a different set of questions in the same survey
• Also, reducing one error could increase another error in the survey
26. Survey designers face the following questions:
• Where should additional resources be directed to generate the greatest improvement in data quality: extensive interviewer training for nonresponse reduction, greater nonresponse follow-up intensity, or larger incentives to encourage sample members to participate?
• Should a more expensive data collection mode be used, even if the sample size must be reduced significantly to stay within budget?
27. TSE in Practice
• The idea is to minimise all of these error sources
• Minimising all of these errors would require an unlimited budget (impossible)
• Cost-benefit trade-offs are needed to decide which errors to minimise
28. TSE in Practice (1)
• A realistic scenario is to work on continuous improvement of the various survey processes so that biases and unwanted variations are gradually reduced
• Redesign of surveys where needed
• Nonresponse bias reduction through real-time responsive and adaptive survey designs
• Quality monitoring strategies, e.g. paradata
• Application of data quality indicators in data analysis
29. TSE in Practice (2)
Decisions are needed:
•To ignore some errors
•To measure and to control/adjust for some
(data analysis stage: complex designs,
measurement errors, missing data, sampling
errors)
30. Conclusions
• Data accuracy is of crucial importance
• A single score or measure of data quality (Total Survey Quality) is not available
• The TSE framework was developed and adopted to address this
• Cost-benefit trade-offs are made to minimise the different errors within TSE, depending on survey aims
• TSE helps keep data quality standards high and in line with survey aims under financial constraints
31. References
• Biemer (2010) Total survey error: Design, implementation, and evaluation. Public
Opinion Quarterly, 74(5): 817-848.
• Biemer (2016) Total Survey Error Paradigm: Theory and Practice. In The Sage
handbook of survey methodology by Wolf, Joye, Smith and Fu. London: SAGE
publications.
• Biemer (2017) Total survey error: A Framework for censuses and surveys.
Presentation at the University of Southampton.
• Biemer and Lyberg (2003) Introduction to survey quality. New York: John Wiley &
Sons.
• Groves and Heeringa (2006) Responsive design for household surveys: Tools for
actively controlling survey errors and costs. Journal of the Royal Statistical Society
Series A, 169 (3): 439-457.
• Lyberg and Weisberg (2016) The SAGE handbook of survey methodology.
London: SAGE Publications.
• Lynn (2004) Editorial: Measuring and communicating survey quality. Journal of the
Royal Statistical Society Series A, 167 (4): 575-578.
• Schouten et al. (2013) Optimizing quality of response through adaptive survey
designs. Survey Methodology, 39 (1): 29-39.
• Weisberg (2005) The total survey error approach. Chicago: University of Chicago
Press.
Editor's Notes
So data quality is crucial
TSQ – survey quality is more than its accuracy or statistical dimension. It also includes, among other factors, producing results that fit the needs of the survey users and providing results that users will have confidence in. Usability of results is of crucial importance.
(Eurostat, Statistics Canada and Statistics Sweden)
Statistics Canada:
Relevance
Accuracy
Timeliness
Accessibility
Interpretability
Coherence
Statistics Sweden:
Content
Accuracy
Timeliness
Comparability/coherence
Availability/clarity
Bias – the mean of the errors is not equal to 0, so they do not cancel out; variance – the mean of the errors is equal to 0, so they do cancel out across the sample
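This distinction can be illustrated with a small simulation (all numbers hypothetical): random errors average out across respondents, while a systematic error survives as bias in the mean.

```python
# Random vs systematic errors (illustrative simulation).
import random

random.seed(42)
n = 10_000

# Random (uncorrelated) errors: mean error is close to 0,
# so they inflate variance but add little bias.
random_errors = [random.gauss(0, 5) for _ in range(n)]

# Systematic errors: every respondent over-reports by 2 units,
# so the mean error equals the bias and does not cancel out.
systematic_errors = [2.0 + random.gauss(0, 5) for _ in range(n)]

mean_random = sum(random_errors) / n          # close to 0
mean_systematic = sum(systematic_errors) / n  # close to 2 (the bias)
print(round(mean_random, 2), round(mean_systematic, 2))
```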
Accuracy sits within the larger concept of Total Survey Quality (TSQ)
A definition broader than accuracy is needed, as users are not interested only in the accuracy of the estimates provided.
Accuracy is the cornerstone of quality, since without it, survey data are of little use. If the data are erroneous, it does not help much if relevance, timeliness, accessibility, comparability, coherence and completeness are sufficient.
Simple random sampling is often neither possible nor cost-effective. Stratifying the sample can reduce the sampling error, clustering the sample can reduce costs but would increase the sampling error.
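A sketch of why stratification helps, using a made-up three-stratum population: with proportional allocation, the between-stratum component of the population variance drops out of the sampling variance of the mean.

```python
# Variance of the sample mean: SRS vs proportionally allocated
# stratified sampling (hypothetical strata; fpc ignored).
strata = [
    # (W_h = population share, stratum mean, stratum SD)
    (0.5, 20.0, 4.0),
    (0.3, 40.0, 5.0),
    (0.2, 70.0, 6.0),
]
n = 400  # total sample size

pop_mean = sum(w * m for w, m, _ in strata)
within = sum(w * s**2 for w, _, s in strata)            # within-stratum part
between = sum(w * (m - pop_mean) ** 2 for w, m, _ in strata)

var_srs = (within + between) / n   # SRS: both components contribute
var_strat = within / n             # stratified: between part removed
print(var_srs, var_strat)          # stratified variance is smaller
```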
Idea is to minimize the errors
Biemer and Lyberg, in their book Introduction to Survey Quality, introduced the division between sampling and non-sampling errors
Roots in cautioning against sole attention to sampling error
Framework contains statistical and nonstatistical notions
There are different components of non-sampling error
Errors can be systematic or random and correlated or uncorrelated.
Uncorrelated (e.g., an interviewer mistakenly records a “yes” answer as a “no”)
Correlated (when interviewers conduct multiple interviews and when cluster sampling is used – correlated errors increase the variance of estimates due to an effective sample size that is smaller than the intended one and thereby make it more difficult to achieve statistically significant results)
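The effective-sample-size point can be sketched with the usual design-effect approximation, deff = 1 + (m - 1)ρ, where m is the average interviewer workload or cluster size and ρ the intra-interviewer (or intra-cluster) correlation; the numbers below are illustrative.

```python
# Design effect for correlated errors and the resulting
# effective sample size (illustrative numbers).
def design_effect(m, rho):
    """deff = 1 + (m - 1) * rho for average cluster/workload size m."""
    return 1 + (m - 1) * rho

n = 1000      # nominal sample size
m = 10        # interviews per interviewer (hypothetical)
rho = 0.05    # intra-interviewer correlation (hypothetical)

deff = design_effect(m, rho)
n_eff = n / deff  # effective sample size after correlated errors
print(deff, round(n_eff))  # 1.45, 690
```

Even a modest correlation of 0.05 shrinks the effective sample from 1000 to about 690, which is why correlated errors make statistical significance harder to reach.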
Measurement errors pose a serious limitation to the validity and usefulness of the information collected via survey. Having excellent samples representative of the target population, having high response rates, having complete data, etc. does us little good if our measurement instruments evoke responses that are fraught with error.
Measurement error is distinct from other survey errors and it is error that occurs when the recorded or observed value is different from the true value of the variable.
Reliability and validity are important in measurement error. Reliability is “agreement between two efforts to measure the same thing, using maximally similar methods”
How was the survey administered (e.g. in person, by telephone, online, multiple modes, etc.)? (sensitive questions)
Were the questions well constructed, clear, and not leading or otherwise biasing? (satisficing)
What steps, if any, were taken to ensure that respondents were providing truthful answers to the questions, and were any respondents removed from the final dataset (e.g., identifying speeders, satisficers, multiple completions)? (in-survey behaviour)
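Identifying speeders is often done with a simple timing heuristic, e.g. flagging completion times below some fraction of the median; the 0.5 cutoff and the timings in this sketch are illustrative, not a recommendation.

```python
# Flag "speeders" whose completion time is below half the median
# (hypothetical timings in seconds).
from statistics import median

completion_seconds = {
    "r1": 610, "r2": 540, "r3": 95,   # r3 far quicker than the rest
    "r4": 580, "r5": 630, "r6": 120,  # r6 also suspiciously fast
}

cutoff = 0.5 * median(completion_seconds.values())
speeders = sorted(r for r, t in completion_seconds.items() if t < cutoff)
print(speeders)  # ['r3', 'r6']
```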
Response errors or response styles are measurement errors found in the answers respondents give to survey questions
Response styles:
Acquiescence response style – tendency to agree with items regardless of content
Disacquiescence response style – tendency to disagree with items regardless of content
Mid-point response style – tendency to use the middle response category of a rating scale regardless of content
Extreme response style – tendency to select most extreme response option regardless of content
Straightlining – tendency to rush through the survey clicking on the same response every time regardless of content
Tendency to select “do not know” options regardless of content
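Straightlining (and other low-variation styles) can be screened for by looking at the spread of each respondent's answers across a battery of items; a minimal sketch with made-up data:

```python
# Flag straightliners as respondents with zero variation across a
# battery of 1-5 rating items (hypothetical responses).
from statistics import pstdev

responses = {
    "r1": [3, 3, 3, 3, 3],  # straightliner: zero variation
    "r2": [1, 5, 2, 4, 3],  # varied answers
    "r3": [5, 5, 5, 5, 5],  # straightliner at the extreme
    "r4": [2, 3, 2, 4, 3],
}

straightliners = [rid for rid, ans in responses.items() if pstdev(ans) == 0]
print(straightliners)  # ['r1', 'r3']
```

In practice the zero-variation rule is usually combined with other indicators (timing, item content), since identical answers can occasionally be genuine.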
Reliability and validity are important concepts in measurement error. Reliability is “agreement between two efforts to measure the same thing, using maximally similar methods” (in Alwin, 2016).
The score for reliability is called the coefficient of precision
Validity is an agreement or consistency between two efforts to measure the same thing using maximally different measurements (in Alwin, 2016)
Satisficing behaviour increases measurement error (when respondents give answers that sound plausible so as to get through the task quickly)
Improving survey question wording might minimise the likelihood of satisficing
Interviewers can cause errors in a number of ways
MSE is hypothetical
Clients need to weigh these trade-offs when deciding how they want to spend limited resources to minimize the potential survey errors
Cost-benefit trade-offs are needed to decide which errors to minimize
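These trade-offs are often reasoned about through the (hypothetical) mean squared error decomposition MSE = bias² + variance; the sketch below compares two illustrative designs with made-up numbers.

```python
# Compare two hypothetical designs via MSE = bias^2 + variance.
def mse(bias, variance):
    return bias**2 + variance

cheap = mse(bias=1.5, variance=0.8)   # e.g. web-only, large sample
costly = mse(bias=0.4, variance=2.0)  # e.g. face-to-face, smaller sample

print(cheap, costly)  # here the costlier, less biased design wins
```

Which design "wins" depends entirely on the assumed bias and variance, which is exactly why the MSE is hypothetical and the decision is a judgement about where limited resources do the most good.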
Quality frameworks were developed and adopted, providing statistics producers with a clear description of how certain dimensions of quality can be measured and why it might be important to do so.
The survey community needs to find ways of ensuring that as broad a range as possible of relevant indicators and information is made available routinely (Lynn 2004)
The chances of users misusing the data or misinterpreting published statistics will be reduced if they understand better the strengths and limitations of the data.
The publication of data quality measures itself represents an improvement in the quality of a survey