Social Research Centre workshop - Telephone Surveying in the Post-Modern Era, held Thursday 10 October 2019. Presentation by Dina Neiger - Chief Statistician (Social Research Centre)
3. www.srcentre.com.au
Why weight?
Make sample more “representative”
Account for the sampling scheme
o If unequal probabilities of selection e.g. over-sampling smaller areas
Correct for non-response
o Ideally weighting variables will be correlated with response and with outcomes being measured
o Typical weighting variables include age, education, gender and location
• Pragmatic approach (reliable benchmarks and practical to collect within survey)
• Accepted by industry and government
• Many outcomes will correlate with these variables
Are all weights created equal?
Cell-weighting is the simplest way to incorporate multiple characteristics in the weighting solution
o Classify sample and population into interlocking table (e.g. LGA by Age by Gender)
o For each cell, calculate weights as the ratio of population (N) to sample (n) sizes
Advantages
o Very easy to understand and implement
Disadvantages
o Creates many (possibly small) cells leading to unstable weights and high variances
o Does not easily accommodate missing data
o Needs population counts for all interlocking cells
Alternatives
o Raking (iterative weighting to margins)
o GREG (model-based calibration)
* ABS, 3101.0 - Australian Demographic Statistics, Dec 2018
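As a rough illustration of cell weighting, the sketch below computes N / n for each interlocking cell. The function name `cell_weights`, the age-by-gender cells and the population counts are made up for the example and do not come from the presentation.

```python
from collections import Counter


def cell_weights(sample_cells, population_counts):
    """Return {cell: N / n} for every cell that appears in the sample.

    sample_cells: one cell label per respondent (here an (age, gender) tuple).
    population_counts: {cell: population size N}; hypothetical benchmarks.
    """
    n = Counter(sample_cells)
    missing = set(n) - set(population_counts)
    if missing:
        # Cell weighting needs a population count for every interlocking cell.
        raise ValueError(f"no population count for cells: {sorted(missing)}")
    return {cell: population_counts[cell] / n[cell] for cell in n}


# Hypothetical sample of 7 respondents and made-up population counts.
sample = [
    ("18-44", "F"), ("18-44", "F"), ("18-44", "M"),
    ("45+", "F"), ("45+", "M"), ("45+", "M"), ("45+", "M"),
]
population = {
    ("18-44", "F"): 300, ("18-44", "M"): 280,
    ("45+", "F"): 220, ("45+", "M"): 210,
}
weights = cell_weights(sample, population)
```

Each respondent takes the weight of their cell, so the weights sum to the population total. A cell with only one or two respondents gets a large weight, which is the instability noted under Disadvantages.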
Raking
Raking (aka rim weighting) has been around since 1940 (Deming and Stephan)
o Iterative method that adjusts by each variable successively (LGA then Age then Gender etc) until all variables are in line with the population targets
Advantages
o Easy to describe and implement
o Does not need population counts for interlocking variables
o Can accommodate missing values as well as more adjustment variables
Disadvantages
o Can produce extreme weights
o Can converge slowly, especially with missing data, and sometimes not at all
o Long ago surpassed by more efficient weighting methods
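The iterative adjustment described on this slide can be sketched as follows. This is a minimal illustration only: the function name `rake`, the two variables and the marginal targets are assumptions made for the example, and a production routine would also need weight trimming and handling of missing values.

```python
def rake(rows, targets, max_iter=200, tol=1e-10):
    """Rake unit starting weights to the marginal targets.

    rows: list of dicts, one per respondent, e.g. {"age": "45+", "gender": "F"}.
    targets: {variable: {category: population total}}; hypothetical margins.
    """
    weights = [1.0] * len(rows)
    for _ in range(max_iter):
        max_change = 0.0
        for var, margin in targets.items():
            # Current weighted total in each category of this variable.
            totals = {}
            for row, w in zip(rows, weights):
                totals[row[var]] = totals.get(row[var], 0.0) + w
            # Scale every respondent so this variable matches its targets.
            for i, row in enumerate(rows):
                factor = margin[row[var]] / totals[row[var]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            return weights
    # Raking can converge slowly or not at all, so fail loudly.
    raise RuntimeError("raking did not converge")


rows = [
    {"age": "18-44", "gender": "F"},
    {"age": "18-44", "gender": "M"},
    {"age": "45+", "gender": "F"},
    {"age": "45+", "gender": "M"},
    {"age": "45+", "gender": "M"},
]
targets = {
    "age": {"18-44": 500.0, "45+": 500.0},
    "gender": {"F": 520.0, "M": 480.0},
}
weights = rake(rows, targets)
```

Only the margins are supplied, not the age-by-gender cell counts, which is the main practical advantage over cell weighting.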
GREG weighting
First developed in the 1990s (Deville and Särndal)
o Uses non-linear optimisation to minimise the difference between design and final weights, subject to the weights meeting the benchmarks
Advantages
o Technique currently used by Australian, New Zealand, Canadian and many European official statistics agencies
o Results in well-behaved (non-extreme) weights that meet all benchmarks
o Improves weighting efficiency (higher effective base)
o Can accommodate a wide range of benchmarks and complex designs (e.g. continuous totals, means, both person and household totals, etc)
Disadvantages
o Need to impute “Don’t know” / “Refused” (DK/Ref) responses for weighting variables
o More complex to implement than raking (methods available for R, SAS and Stata)
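To give a feel for calibration, the sketch below implements only the simplest special case: the linear (chi-square distance) form, which has a closed-form solution w = d (1 + x'λ) with λ solving (Σ d x x') λ = T - Σ d x. It is not the bounded, non-linear optimisation described on the slide, and linear calibration can return negative or extreme weights. The names `solve` and `linear_calibrate`, the auxiliary variables and the benchmark totals are all assumptions made for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: auxiliary variables may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def linear_calibrate(design_weights, x, benchmarks):
    """Adjust design weights so weighted totals of x equal the benchmarks."""
    p = len(benchmarks)
    pairs = list(zip(design_weights, x))
    # a = sum of d * x x' over respondents; gap = benchmark minus design estimate.
    a = [[sum(d * xk[i] * xk[j] for d, xk in pairs) for j in range(p)]
         for i in range(p)]
    gap = [benchmarks[i] - sum(d * xk[i] for d, xk in pairs) for i in range(p)]
    lam = solve(a, gap)
    return [d * (1.0 + sum(l * v for l, v in zip(lam, xk))) for d, xk in pairs]


# Hypothetical data: x = [1 (person count), female indicator, age in years].
design = [200.0] * 5
x = [[1, 1, 30], [1, 0, 40], [1, 1, 50], [1, 0, 60], [1, 0, 35]]
benchmarks = [1000.0, 520.0, 45000.0]  # persons, females, total of ages
final = linear_calibrate(design, x, benchmarks)
```

The example mixes a categorical benchmark (number of females) with a continuous total (sum of ages), which raking to category margins cannot do directly.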
Choice of weighting variables
Standard demographic variables used in weighting:
o Age
o Gender
o State
Does not correct for other known biases e.g.:
o Telephone Status
o Education
o Country of Birth
Ideally weighting variables should be correlated with outcomes e.g.:
o Life satisfaction
o Concession card holder
o Employment
Decision criteria for weighting variables
Availability of trusted benchmarks e.g.
o Census
o Official Statistics
Are benchmarks comparable to the survey e.g.
o Question wording
o Mode
o Reference period
o Survey topic
Correlation with key outcomes
o Life satisfaction not correlated with attitude to sun protection
o Alcohol consumption is correlated with smoking
Impact on weighting efficiency
o Increasing number of variables likely to decrease efficiency
o Ideally assess both bias and efficiency
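The slides do not define weighting efficiency. One common measure, assumed here, is Kish's approximation: the effective base is (Σw)² / Σw², and efficiency is the effective base divided by the actual sample size. The function name `kish_efficiency` and the example weights are illustrative only.

```python
def kish_efficiency(weights):
    """Return (effective sample size, efficiency) using Kish's approximation."""
    n = len(weights)
    n_eff = sum(weights) ** 2 / sum(w * w for w in weights)
    return n_eff, n_eff / n


# Equal weights lose nothing; variable weights shrink the effective base.
equal = kish_efficiency([1.0, 1.0, 1.0, 1.0])
unequal = kish_efficiency([1.0, 1.0, 2.0, 4.0])
```

Comparing this figure before and after adding a candidate weighting variable is one way to judge its cost in efficiency alongside any reduction in bias.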
Considerations for multiple frame weighting
Design weights adjust for chance of selection to take part in a survey
For RDD landline and RDD mobile surveys:
o Need to account for overlapping chances of selection
o Overall chance of selection calculated as the sum of the probabilities of being selected from each frame
o Questionnaire needs to collect information on telephone characteristics
• How many in-scope persons in the household
• Whether have a landline
• Number of personal mobiles
For listed mobile surveys:
o Most people don’t know if their mobile is listed
o Need a way to estimate – work in progress
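A minimal sketch of a dual-frame design weight for the RDD landline plus RDD mobile case, built from the questionnaire items listed above. It assumes one in-scope person is selected at random in each landline household, at most one landline per household, and sampling fractions small enough that the chance of being selected from both frames can be ignored. The function name and all sample and frame sizes are hypothetical.

```python
def dual_frame_design_weight(
    landline_sample, landline_frame,
    mobile_sample, mobile_frame,
    has_landline, in_scope_persons, personal_mobiles,
):
    """Return 1 / (p_landline + p_mobile) for one respondent."""
    # Landline route: the household's number is drawn, then one
    # in-scope person is chosen at random within the household.
    p_landline = 0.0
    if has_landline:
        p_landline = (landline_sample / landline_frame) / in_scope_persons
    # Mobile route: any of the respondent's personal mobiles may be drawn.
    p_mobile = (mobile_sample / mobile_frame) * personal_mobiles
    p_total = p_landline + p_mobile
    if p_total <= 0:
        raise ValueError("respondent has no chance of selection from either frame")
    return 1.0 / p_total


# Hypothetical frames: 1,000 of 1,000,000 landline numbers and
# 2,000 of 4,000,000 mobile numbers are sampled.
dual_user = dual_frame_design_weight(1000, 1_000_000, 2000, 4_000_000,
                                     has_landline=True, in_scope_persons=2,
                                     personal_mobiles=1)
mobile_only = dual_frame_design_weight(1000, 1_000_000, 2000, 4_000_000,
                                       has_landline=False, in_scope_persons=2,
                                       personal_mobiles=1)
```

In this made-up example the dual user can be reached by two routes, so their design weight is half that of the mobile-only respondent.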
Weighting with a difference – Propensity models
Regression model that differentiates segments of the sample
o Respondents vs non-respondents (response propensity)
o Listed mobiles vs RDD mobiles (selection propensity)
Need at least some information about both types of respondents
o Auxiliary data available on the frame
o Data collected through a non-response follow-up survey
Use a model to estimate probability of being
o A non-respondent
o Part of the listed sample
Adjust weights to improve representativeness of the sample
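A response propensity model of the kind outlined on this slide can be sketched as a logistic regression. The example below fits an intercept and one binary predictor by Newton-Raphson in plain Python. The predictor (a hypothetical "aware of the agency" indicator) and the counts are invented for illustration; a real model would use several auxiliary variables and a statistical package.

```python
import math


def fit_logistic(rows, iterations=25):
    """Fit P(respond) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson.

    rows: list of (x, responded) pairs, with responded coded 1 or 0.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in rows:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p              # gradient of the log-likelihood
            g1 += (y - p) * x
            v = p * (1.0 - p)        # information matrix entries
            h00 += v
            h01 += v * x
            h11 += v * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: coefficients += inverse(information) * gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def predict(b0, b1, x):
    """Predicted probability of responding for predictor value x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


# Hypothetical contacts: x = 1 if aware of the agency, else 0.
rows = ([(0, 1)] * 20 + [(0, 0)] * 80      # not aware: 20 of 100 responded
        + [(1, 1)] * 50 + [(1, 0)] * 50)   # aware: 50 of 100 responded
b0, b1 = fit_logistic(rows)
p_not_aware = predict(b0, b1, 0)
p_aware = predict(b0, b1, 1)
```

With a single binary predictor the fitted probabilities simply reproduce the observed response rates in each group; the model earns its keep once several predictors are combined.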
Example – Community Trust in Statistics Survey
Analysis by Andrew Ward, Principal Statistician, Social Research Centre
Thanks to Dr Siu-Ming Tam and Mr Paul Schubert, ABS, for their collaboration on this work and their kind permission to use the survey data for this presentation.
Slides based on presentation by Andrew Ward at the Australian Statistical Conference in December 2016.
Community Trust in Statistics Survey
Determine public awareness and trust of official statistics
Dual frame phone survey
Also available for respondents and non-respondents: part-of-state (based on the landline prefix) or mobile
Collected info from ~727 refusals – age, sex, awareness, trust
Challenge
Incorporation of refusal information in non-response adjustment
Probability of response derived from propensity model
Base weight = Design weight / Probability of response
Limited auxiliary information available for respondents and non-respondents
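The base weight formula on this slide amounts to a one-line adjustment per respondent. In the sketch below the function name `base_weights`, the design weights and the response probabilities are hypothetical values chosen for illustration.

```python
def base_weights(design_weights, response_probabilities):
    """Base weight = design weight / predicted probability of response."""
    result = []
    for d, p in zip(design_weights, response_probabilities):
        if not 0.0 < p <= 1.0:
            raise ValueError(f"response probability out of range: {p}")
        result.append(d / p)
    return result


# Three hypothetical respondents with the same design weight but
# different predicted probabilities of responding.
adjusted = base_weights([100.0, 100.0, 100.0], [0.5, 0.25, 0.8])
```

Respondents who resemble non-respondents (low predicted probability) are weighted up, so very small probabilities produce very large base weights and may need to be bounded in practice.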
Non-response adjustment
Awareness / Trust                        Non-respondents (%)   Respondents (%)
Have heard of and trust a great deal                    12.4              20.5
Have heard of and tend to trust                         28.5              54.6
Have heard of and tend to distrust                       6.2              10.5
Have heard of and distrust a great deal                  4.8               2.2
Have heard of but DK / Refused trust                    26.3               2.9
Total awareness                                         78.2              90.7
Have not heard of                                       17.1               9.0
Don’t know / Refused                                     4.8               0.4
Propensity to respond
While there were no apparent differences in response propensity by location or gender, there were some differences by age group and by awareness / trust in the ABS.
[Figure: predicted probability of responding (axis 0.0 to 1.0) by awareness / trust category, in two panels (Refused, Responded). Categories: DK/Ref; Haven't heard of ABS; Have heard of ABS but DK/Ref trust; Have heard of ABS and distrust them a great deal; Have heard of ABS and tend to distrust them; Have heard of ABS and tend to trust them; Have heard of ABS and trust them a great deal]
Non-response adjustment
Awareness / Trust                        Unweighted (%)   Without propensity weight (%)   With propensity weight (%)
Have heard of and trust a great deal               20.5                            15.3                         14.3
Have heard of and tend to trust                    54.6                            52.6                         47.5
Have heard of and tend to distrust                 10.5                            11.1                          9.9
Have heard of and distrust a great deal             2.2                             2.4                          2.9
Have heard of but DK / Refused trust                2.9                             3.4                          7.5
Total awareness                                    90.7                            84.8                         82.1
Have not heard of                                   9.0                            14.5                         17.1
Don’t know / Refused                                0.4                             0.7                          0.9
In summary
Weighting will impact on accuracy of results and can correct for frame and response deficiencies
Consider using
o Updated methods
o Variables that are correlated with outcomes in addition to demographic variables
o Propensity model
Thank you
PO Box 13328
Law Courts Victoria 8010
03 9236 8500
Editor's Notes
Survey 1 Cyber Crime blended (case study)
Includes model design weights
Standard demographic benchmarks: Age, Gender, State.
Custom calibration benchmarks: Concession card holder, Life satisfaction, Speaks language other than English at home, Used digital camera in last 12 months, Self-rated ability with digital technology, Time spent watching online ‘catch-up tv’ services
Survey 2 VTS PR2312 (2019 2nd wave):
Includes model design weights from S&H
Std weighting: Age group, Gender, Telephone Status
Custom benchmarks: Education, Internet usage (how often post to blog forums and comment or post to social media), Plans to quit smoking
Survey 3 OPBS 2016 combined 5 non-probability panels
Custom benchmarks: early adopter, digital affinity, income and employment
The figures to look at are the ones for total awareness. 90% may be close to the mark at the moment. But you have to cast your mind back to 12-18 months ago, when ABS would have had a somewhat lower profile.
“Without propensity weight” does not include non-response follow-up (NRFU) survey respondents