Social Research Centre workshop - Telephone Surveying in the Post-Modern Era, held Thursday 10 October 2019. Presentation by Dina Neiger - Chief Statistician (Social Research Centre)
Workshop session 4 - Optimal sample designs for general community telephone surveys
1. A subsidiary of:
Optimal sample blend for
general population telephone surveys
Telephone Surveying in the Post-Modern Era Conference
Dina Neiger, Andrew Ward
2. www.srcentre.com.au
Acknowledgements
Social Research Centre
Jack Barton, Analyst
Sebastian Misson, Senior Statistician
Ben Phillips, Senior Research Director, Survey Methodology
Darren Pennay, Executive Director, Research Methods and Strategy
Australian Bureau of Statistics
2017-18 National Health Survey TableBuilder
Phil Hughes
(2018) “Dual Frame Surveys – Some Practical Issues” AMSRS Webinar
4. www.srcentre.com.au
National General Population Surveys
No statistical requirement for dual-frame [1]
RDD Mobile single frame is
o Cost-effective (labour and statistical efficiency), and
o More accurate (reduced sampling error) than dual-frame design
[1] P. Hughes (2018) “Dual Frame Surveys – Some Practical Issues” AMSRS Webinar
6. www.srcentre.com.au
RDD Mobile Single Frame
No geography indicator - cost of screening prohibitive
State/Territory | Incidence (ERP March 2019, 18+) | Number of Screeners
New South Wales | 32% | 3
Victoria | 26% | 4
Queensland | 20% | 5
South Australia | 7% | 10
Western Australia | 10% | 14
Tasmania | 2% | 47
Northern Territory | 1% | 105
Australian Capital Territory | 2% | 57
[Chart: Number of Screener Interviews (y-axis, 0 to 100) against Population Incidence (x-axis, 1% to 91%); labelled points: NSW, SA, NT]
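The relationship between incidence and screening effort can be sketched as follows. This is a minimal illustration that assumes the expected number of screener interviews per in-scope respondent is roughly the reciprocal of the population incidence; the slide does not state the formula used, and the function name `expected_screeners` is hypothetical.

```python
# Illustrative sketch only. Assumption (not stated on the slide): the expected
# number of screeners per in-scope respondent is about 1 / incidence.
incidence = {
    "New South Wales": 0.32,
    "Victoria": 0.26,
    "Queensland": 0.20,
}


def expected_screeners(p):
    """Expected screener interviews needed to find one in-scope respondent."""
    if not 0 < p <= 1:
        raise ValueError("incidence must be in (0, 1]")
    return 1.0 / p


for state, p in incidence.items():
    print(f"{state}: about {expected_screeners(p):.1f} screeners per in-scope interview")
```

Because the published incidence figures are rounded, the reciprocal will not reproduce the slide's counts exactly for the smallest jurisdictions.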
7. www.srcentre.com.au
Listed Mobiles to replace RDD?
Listed mobile – credit agencies, marketing lists, etc.
o Geography and other auxiliary information available
o Coverage growing but far from complete
SamplePages
State/Territory | Number of People with Listed Mobile (Oct 2019) | Number of Adults with Mobile Phone (NHS 2017/2018) | Theoretical coverage of Listed Mobile Frame
New South Wales | 1,583,980 | 5,493,200 | 29%
Victoria | 1,472,331 | 4,475,900 | 33%
Queensland | 1,110,183 | 3,331,400 | 33%
South Australia | 361,405 | 1,201,800 | 30%
Western Australia | 606,954 | 1,763,600 | 34%
Tasmania | 92,915 | 367,500 | 25%
Northern Territory | 29,966 | 118,300 | 25%
Australian Capital Territory | 89,361 | 283,300 | 32%
Total | 5,347,095 | 17,035,000 | 31%
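The theoretical coverage column can be reproduced from the two count columns. The sketch below assumes coverage is simply listed mobiles divided by adults with a mobile phone, which matches the published percentages; the variable and function names are illustrative.

```python
# Counts copied from the slide; coverage is assumed to be listed / adults.
listed = {
    "New South Wales": 1_583_980,
    "Victoria": 1_472_331,
    "Queensland": 1_110_183,
    "South Australia": 361_405,
    "Western Australia": 606_954,
    "Tasmania": 92_915,
    "Northern Territory": 29_966,
    "Australian Capital Territory": 89_361,
}
adults = {
    "New South Wales": 5_493_200,
    "Victoria": 4_475_900,
    "Queensland": 3_331_400,
    "South Australia": 1_201_800,
    "Western Australia": 1_763_600,
    "Tasmania": 367_500,
    "Northern Territory": 118_300,
    "Australian Capital Territory": 283_300,
}


def coverage(listed_count, adult_count):
    """Share of adult mobile users who appear on the listed mobile frame."""
    return listed_count / adult_count


coverage_pct = {s: round(100 * coverage(listed[s], adults[s])) for s in listed}
total_pct = round(100 * coverage(sum(listed.values()), sum(adults.values())))

for state, pct in coverage_pct.items():
    print(f"{state}: {pct}%")
print(f"Total: {total_pct}%")
```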
8. www.srcentre.com.au
Frame options
Must continue with blended frames
Blend options:
o Traditional dual-frame: RDD mobile/RDD landline
o Add Listed mobile to dual-frame: Tri-frame
o New-age dual-frame: RDD mobile/Listed mobile
10. www.srcentre.com.au
Bias Considerations
Systematic difference between the survey estimate and the population
parameter
Many sources of biases
TSE Framework e.g. Coverage/Response/Questionnaire/Measurement
Use independent benchmarks to estimate bias e.g.
o ABS National Health Survey
o Census
o Estimated Resident Population
Average Absolute Error = Absolute difference between survey estimate
and the benchmark averaged across multiple measures
11. www.srcentre.com.au
Effective sample size Considerations
Effective sample size is the simple random sample that would yield the
same sampling variance as achieved by the actual survey
Effective sample size = Actual Sample Size*Weighting Efficiency
o Effective sample size should be used for power calculations and statistical testing
o Weighting Efficiency reflects the increase in variance due to the complex sample
design and weighting adjustments made to the data
o Low weighting efficiency increases sampling error and reduces the accuracy of
estimates
For example
o Weighting efficiency = 38.5%
o Effective sample size for n = 2,000: 770
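A minimal sketch of the calculation, assuming weighting efficiency is derived from the weights with Kish's approximation, (sum of weights)^2 / (n x sum of squared weights). The slides do not say which formula SRC uses, so treat the `weighting_efficiency` function as one common choice rather than the presenters' method.

```python
def weighting_efficiency(weights):
    """Kish's approximation: (sum w)^2 / (n * sum w^2). Equals 1.0 for equal weights."""
    n = len(weights)
    if n == 0:
        raise ValueError("weights must not be empty")
    return sum(weights) ** 2 / (n * sum(w * w for w in weights))


def effective_sample_size(actual_n, efficiency):
    """Effective sample size = actual sample size * weighting efficiency."""
    return actual_n * efficiency


# The slide's example: efficiency of 38.5% on n = 2,000.
print(round(effective_sample_size(2000, 0.385)))

# Hypothetical weights: equal weights are fully efficient, unequal weights are not.
print(weighting_efficiency([1, 1, 1, 1]))
print(weighting_efficiency([1, 3]))
```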
Groves, Robert M., Floyd J. Fowler, Mick P. Couper, James M. Lepkowski, Eleanor Singer and Roger Tourangeau. 2009. Survey
Methodology. 2nd ed. Hoboken, NJ, USA: Wiley.
12. www.srcentre.com.au
Fixed budget considerations
Fixed budget of $1,000
Un-screened interview = $1
Cost of screened interview e.g. Vic = $4
Objective: spend $1,000 in a way that maximises the effective sample
size (effective base)
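The budget trade-off can be sketched as below. It assumes the blend is expressed as the share of completed interviews that require screening, and that the effective base is the number of affordable interviews multiplied by a weighting efficiency. The efficiency passed in is a hypothetical input, not a figure from the presentation.

```python
def interviews_for_budget(budget, screened_share,
                          cost_unscreened=1.0, cost_screened=4.0):
    """Number of interviews affordable when a given share of them is screened."""
    if not 0 <= screened_share <= 1:
        raise ValueError("screened_share must be between 0 and 1")
    average_cost = (screened_share * cost_screened
                    + (1 - screened_share) * cost_unscreened)
    return budget / average_cost


def effective_base(budget, screened_share, weighting_efficiency):
    """Effective base = affordable interviews * weighting efficiency (hypothetical input)."""
    return interviews_for_budget(budget, screened_share) * weighting_efficiency


# $1,000 budget with the slide's unit costs ($1 unscreened, $4 screened).
for share in (0.0, 0.5, 1.0):
    print(share, interviews_for_budget(1000, share))
```

More screened interviews means fewer interviews overall, so a blend only pays off if its weighting efficiency rises enough to offset the smaller sample.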
14. www.srcentre.com.au
Illustration for optimal blend assessment
Based on a large SRC survey of Victorian population
Simulations
o Multiple sub-samples of 5,000 in proportion to population (Greater Melbourne/Rest of State)
o NHS 2017/2018 Victorian estimates
o Unweighted profile by frame
o Weighted estimates to calculate bias and costs for different blends
Simulation results are for illustration purposes only & may not apply to
every set of variables and every geography – important to test in specific
context
15. www.srcentre.com.au
Unweighted Profile by Frame
Source of estimates | 25 to 34 years of age | Country of birth is Australia | Has a bachelor degree or higher | Couple with child / children household | Homes owned with a mortgage | Currently employed | Ave Abs Error
ABS NHS 2017/2018 (state weighted estimates)
Estimated population proportion (%) | 20.1 | 64.0 | 28.1 | 34.3 | 39.5 | 66.8 |
SRC state population survey (unweighted)
RDD landline (%) | 2.4 | 75.9 | 33.6 | 26.3 | 23.7 | 42.6 | 13.9
Listed mobile (%) | 13.9 | 82.9 | 36.1 | 30.6 | 38.8 | 65.4 | 6.5
RDD mobile (%) | 23.0 | 61.4 | 50.0 | 37.3 | 37.0 | 69.6 | 6.0
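The Ave Abs Error column can be reproduced from the figures on this slide. The sketch below applies the definition given earlier (the absolute difference between survey estimate and benchmark, averaged across measures) to the six demographic measures; the names are illustrative.

```python
# Benchmark and unweighted frame profiles (%) copied from the slide, in column order:
# age 25-34, born in Australia, bachelor degree or higher, couple with children,
# home owned with a mortgage, currently employed.
benchmark = [20.1, 64.0, 28.1, 34.3, 39.5, 66.8]
frames = {
    "RDD landline": [2.4, 75.9, 33.6, 26.3, 23.7, 42.6],
    "Listed mobile": [13.9, 82.9, 36.1, 30.6, 38.8, 65.4],
    "RDD mobile": [23.0, 61.4, 50.0, 37.3, 37.0, 69.6],
}


def average_absolute_error(estimates, benchmarks):
    """Mean of |estimate - benchmark| across all measures."""
    if len(estimates) != len(benchmarks):
        raise ValueError("estimates and benchmarks must have the same length")
    return sum(abs(e - b) for e, b in zip(estimates, benchmarks)) / len(benchmarks)


for frame, estimates in frames.items():
    print(f"{frame}: {average_absolute_error(estimates, benchmark):.2f}")
```

The results agree with the published 13.9, 6.5 and 6.0 to within rounding.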
16. www.srcentre.com.au
Frame Blend Options – Bias Comparison
[Chart: Absolute Average Error (y-axis, 0.0 to 6.0) by % RDD Mobile (x-axis: 30, 40, 50, 60, 70). Series: RDD Mobile and RDD Landline; Tri-frame (RDD Landline = 30%); RDD Mobile and Listed Mobile. Data labels in extraction order: 5.1, 5.1, 5.1, 5.0, 4.9; 5.0, 5.0, 5.1, 5.1, 4.9; 5.0, 5.1, 5.1, 5.1, 5.1]
17. www.srcentre.com.au
Frame Blend Options – Effective Base for Fixed Cost Comparison
[Chart: Effective Base (y-axis, 0 to 350) by % RDD Mobile (x-axis: 30, 40, 50, 60, 70). Series: RDD Mobile and RDD Landline; Tri-frame (RDD Landline = 30%); RDD Mobile and Listed Mobile. Data labels in extraction order: 241, 248, 246, 240, 230; 241, 248, 246, 240, 230; 321, 295, 267, 246, 241]
19. www.srcentre.com.au
Unless accurate estimates are needed for people aged 75+, do not use landline
National surveys – RDD Mobile Only
Sub-national surveys – RDD Mobile /Listed Mobile Blend
Use historical data to determine the best blend, for example, Victoria 70-30
20. www.srcentre.com.au
Future work
Small area estimation – Statistical technique to predict local area
characteristics based on survey and administrative data available at
multiple levels of geography
IPND pilot
o Understand costs, response rate, best methods
RDD mobiles
o Investigation into profile and response rates by those that match to the lists versus those
that don’t
Listed mobiles
o Optimal blend: further experimentation with different datasets
o Weighting solutions
21. PO Box 13328
Law Courts Victoria 8010
03 9236 8500
Thank you
Editor's Notes
RDD is best practice for ensuring a representative sample, assuming complete coverage, because it allows the probability of selection to be calculated.
However, due to the reduced prevalence of landlines, as discussed by Ben, the population covered by the landline frame is severely limited.
GIGO principle - selecting a random sample from a poor frame results in a poor sample!
One exception is older (75+) adults, who are still well represented on the landline frame. But as discussed by Kane and Ben, this will also become less useful with the introduction of the NBN and the passing of time.
Link to Ben’s presentation
A different dimension of accuracy is the uncertainty around the estimate, or sampling error. Sampling error arises because we take a random sample: if we took a different random sample, we would likely come up with a slightly different estimate. Unlike bias, this is a variable rather than a systematic error.
Weighting efficiency is a measure of uncertainty that reflects this error in the context of a complex survey design and an unbalanced sample (e.g. a lot more females than in our target population).
The trade-off of a complex survey design and/or an unbalanced sample is reduced statistical power compared to a simple random sample of the same size. The effective sample size is one way of quantifying this reduction in power. Effective sample size is defined as ‘the simple random sample that would yield the same sampling variance as achieved by the actual design’ (Groves et al 2009:112). In practice, the actual sample size is divided by the design effect (DEFF) to calculate the effective sample size, where the design effect is the estimate of the increase in variance due to the complex sample design. To illustrate, a complex sample of 1,000 with a DEFF of 2 is equivalent to a simple random sample of 500 for inferential statistics (1,000 divided by 2, or equivalently 1,000 multiplied by a weighting efficiency of 50%).
Weighting efficiency reflects both selection probability (e.g. probability of selection within a household, or disproportionate-to-size selection within an LGA) and post-stratification weighting, which aims to balance the sample in accordance with known population distributions (e.g. age/sex/education) and is effectively a disproportionate sample compared to the population.
For the purpose of illustrating cost trade-offs, we take a budget of $1,000 and assign a cost of $1 per non-screened interview (either landline or listed mobile) and $4 per screened interview (RDD mobile). Our goal is to spend our $1,000 in a way that maximises the effective base (reduces sampling error).
So let’s have a look at weighted results next. In order to do that, using data from the same survey and the SRC standard weighting methodology (more on that in a later session), Andrew undertook a range of simulations to determine how the mix of RDD mobile, Listed Mobile and RDD landline impacts on weighting efficiency and costs. Let’s start with a standard dual frame approach RDD landline and mobile and have a look at what an optimal blend would be in this case.
Background not to be stated in the presentation:
Based on VPHS 2017
Simulations based on results of state-based telephone survey
Survey used mix of random landline (50%), random mobile (30%) and listed mobile (20%)
Collected many demographic and outcome variables to compare with 2016 Census or 2017-18 National Health Survey
Simulation approach:
Keep landline proportion fixed at 30% of total sample
Vary listed mobile proportion from 0% to 70% of total sample
Use random mobile for balance of sample
Randomly selected records from the 3 frames
Weight selections
Compare weighted estimates with ABS values (average absolute difference)
Compare weighting efficiency
% listed mobile is with respect to the total sample
Total blend is 30% landline and 70% mobile
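The blend grid described in these notes can be sketched as follows. This assumes a sub-sample of 5,000 (the size quoted on the simulation slide) and a 10-point step for the listed mobile share; the step size and the function name `blend_allocations` are assumptions, and the record selection and weighting steps are not shown.

```python
def blend_allocations(n=5000, landline_pct=30, step=10):
    """Tri-frame allocations: landline fixed, listed mobile varied, RDD mobile as balance.

    The step size is an assumption; the notes only give the 0% to 70% range.
    """
    mobile_pct = 100 - landline_pct  # 70% of the sample is mobile in total
    allocations = []
    for listed_pct in range(0, mobile_pct + 1, step):
        rdd_mobile_pct = mobile_pct - listed_pct
        allocations.append({
            "rdd_landline": n * landline_pct // 100,
            "listed_mobile": n * listed_pct // 100,
            "rdd_mobile": n * rdd_mobile_pct // 100,
        })
    return allocations


for allocation in blend_allocations():
    print(allocation)
```

Each allocation would then drive one round of random record selection from the three frames, followed by weighting and comparison against the ABS benchmarks.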
Outcome variables for bias assessment:
out_alcohol B7. Had an alcoholic drink of any kind in the last 12 months
out_fruit B2. Serves of fruit usually eat each day
out_generalhealth G1. General health status
out_hypertension C6. Ever been told by a doctor have high blood pressure
out_nervous G3. How often did you feel nervous (last 4 weeks)
out_tobacco B11. Smoking status
out_vegetables B1. Serves of vegetables usually eat each day
Weighting efficiency is calculated from the weights
Weighting variables used: age gender education phone status cob
Graph has averages across a bunch of simulations
On this slide we compare population prevalence from the ABS NHS 2017/18 estimates for Victoria (TableBuilder) to a sample profile by frame type for a set of 6 demographic variables that were collected in a comparable way to the NHS.
Survey data were collected by the SRC for a State-based SRC survey [2017 VPHS (Project number 2018) – not stated during presentation] and while the exact numbers won’t necessarily be applicable to other states, in our experience, this pattern is consistent.
As can be seen from the table, the sample profile of the mobile frame is a lot less biased than the landline frame. This is not surprising given Ben’s presentation earlier today.
Background not to be included in the handouts
Please add sources into the notes
ABS Estimates – NHS 2017/18
2017 VPHS (Project number 2018)
Age group by sample type:
Age group | Landline | Listed mobile | RDD Mobile | Census (ERP)
18-24 years | 1.65% | 2.67% | 10.18% | 12.66%
25-34 years | 1.97% | 10.18% | 18.45% | 20.07%
35-44 years | 5.58% | 15.43% | 17.61% | 17.20%
45-54 years | 12.95% | 21.10% | 17.11% | 16.39%
55-64 years | 21.30% | 25.84% | 17.23% | 14.28%
65+ years | 56.55% | 24.78% | 19.40% | 19.39%
So notwithstanding some of the limitations of the analysis above, we are convinced that for phone surveys we should be looking at an RDD Mobile/Listed Mobile mix, with the blend depending on the screening cost/accuracy trade-offs, and that landline should only be used for surveys that require precise estimates for the 75+ age group.
This is not to say that there is not more we can do to better understand and measure the performance of the frames.
And of course, if RDD Mobiles can be appended with geography via IPND, that would provide a better alternative. More on this later today.