Can Post-Stratification Adjustments Do Enough to Reduce Bias in Telephone Surveys that Do Not Sample Cell Phones? It Depends
1. Can Post-Stratification Adjustments Do Enough to Reduce Bias in Telephone Surveys that Do Not Sample Cell Phones? It Depends. Kathleen Thiede Call, SHADAC. JSM, Vancouver, August 2, 2010. Funded by a grant from the Robert Wood Johnson Foundation.
8. Table 1. Selected outcomes for non-elderly; original public use weights, by phone status.
Non-CPOH subsample (A-B) significantly underestimates all key health-related outcomes.
Non-CPOH and CPOH subsamples are significantly different on all health-related estimates (C-B); CPOH rates are higher on all outcomes.
^ Status of adults (18+).
* T-tests are corrected for overlapping observations (A-B) and are significant at the p ≤ .05 level or better.
14. Table 4. Reweighted estimates for key segments of non-elderly population.
Bias reduction is greatest among total non-elderly; works least well for young adults.
Average absolute bias across estimates is relatively small for reweighted data.
Column shown: Average MSE*1000.
Editor's Notes
In addition to the support of RWJF, I wish to thank…
7/13/10: Mike suggested this revision (2-4 times more expensive vs. 1.5 to 3 times more expensive).
Notes: Blumberg and Luke put the CPOH estimate for Jan-Jun 2009 at 21.1%. Further, this research suggests that CPOH differ from non-CPOH in important ways that can bias health-related estimates from RDD surveys. For example, people in CPOH report higher rates of uninsurance and greater barriers to care, yet report better health status and more active lifestyles than their non-CPOH counterparts. [1]
Notes: Cost. Federal regulations prohibit the use of predictive dialing to cell phone lines, and hand dialing increases the cost incurred by the survey center. [4] The second cost issue is screening for eligible respondents. Given the extensive use of cell phones by minors (age 17 and under), reaching an eligible respondent requires some effort. Additionally, if the goal is to reach CPOH, one must screen out those who could also be reached by a landline telephone, which increases the number of calls made to cell phones. [5] Further, because cell phones are attached to individuals rather than geography, an additional screen must be added to ensure the willing respondent resides in the area for which the researcher wishes to produce estimates. Combined, the cost of a CPOH survey is estimated to be somewhere between two and four times the cost of an RDD interview. [6]
Beyond cost, an added complication of sampling cell phones arises when trying to merge these cases with RDD cases to produce population estimates: only national or regional estimates of telephone usage are available, and rules to account for the overlap between landline and cell samples have not been settled.
Many states continue to rely on their own surveys to inform and evaluate health policies. They have limited time and expertise. More importantly, they need a reasonable solution to this problem. One possible strategy is to account for this sample loss through post-stratification adjustments.
Notes: A standard post-stratification adjustment for non-telephone households already exists and is widely used in state health surveys. [8] However, survey researchers have yet to develop a similar agreed-upon methodology for adjusting estimates to account for non-coverage of CPOH in RDD surveys.
We examine how biased health surveys are when they omit CPOH, explore whether post-stratification can reasonably reduce this bias, and consider how well these adjustments work for key subpopulations (young adults and minorities).
To answer these questions we use the publicly available 2008 National Health Interview Survey (NHIS), which contains information on CPOH status and includes both landline households and CPOH. Approach: remove the CPOH from the public use data, then reweight the remaining non-CPOH data using distributions from the full NHIS data as population control totals. This reweighting process involved a series of post-stratification adjustments. Conventional post-stratification adjustments (i.e., region, age, race/ethnicity) were applied first, followed by less conventional adjustments informed by research on characteristics of the CPOH population (i.e., age by education, home ownership status, and household structure [households composed only of adults between the ages of 18 and 30]). (Read last bullet.)
In the analysis we first compare people living in CPOH to those who do not live in CPOH with respect to several important health surveillance domains (e.g., health insurance coverage, access to care, smoking and drinking), using the original weights. We then examine the reweighted estimates to see whether we succeed in removing the bias introduced by omitting CPOH. Goal of post-stratification: reweight the publicly available NHIS person weights so that, when applied to non-CPOH observations, they produce outcome estimates that approximate those obtained from the original weights and the total NHIS sample. Each unique household was categorized as a cell phone only household (CPOH) if at least one member of the household reported cellular service and the household had no landline telephone. All other households (i.e., households with landlines, households with no service, and households with unknown service) were grouped together.
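The household classification rule described above could be sketched roughly as follows. This is only an illustration: the function name and the True/False/None encoding of the phone-status answers are assumptions, not NHIS variable names or codes.

```python
def is_cpoh(member_has_cell, household_has_landline):
    """Return True if the household counts as cell phone only (CPOH).

    member_has_cell: one entry per household member, True / False / None
        (None = unknown). Hypothetical encoding, not NHIS codes.
    household_has_landline: True / False / None (None = unknown).
    """
    # At least one member must positively report cellular service.
    any_cell = any(flag is True for flag in member_has_cell)
    # The household must positively report having no landline.
    no_landline = household_has_landline is False
    return any_cell and no_landline


# Landline households, no-service households, and unknown-service
# households all fall into the non-CPOH group, as in the notes.
examples = {
    "cell only": is_cpoh([True, False], False),
    "cell and landline": is_cpoh([True], True),
    "no service": is_cpoh([False], False),
    "unknown landline": is_cpoh([True], None),
}
print(examples)
```

Treating unknown answers as non-CPOH mirrors the grouping in the notes, where households with unknown service are combined with landline and no-service households.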
Columns: full sample as gold standard; sample once CPOH are omitted; CPOH (~20% of households). Rows represent six health outcomes. Removing the CPOH households from the full NHIS significantly affects all health-related estimates. Compared to the non-CPOH sample, the full sample has higher rates of uninsurance, poorer access to care, and a larger proportion of adults reporting heavy drinking and current smoking. Non-CPOH and CPOH subsamples are also significantly different on all health-related outcomes, with a much greater absolute difference than was true for the contrast between the full sample and the non-CPOH sample. Thus, NHIS estimates with CPOH removed are biased, and these two subpopulations (non-CPOH and CPOH) are significantly different in terms of access to care and health behaviors. Omitting CPOH leads to significant bias in health-related outcomes.
By way of example, Table 2 presents the reweighting process for one health-related outcome: uninsurance. The first two columns contain data for the total and non-CPOH non-elderly NHIS samples, applying the original weights. The first row again shows the rate of uninsurance among the non-elderly for the full NHIS sample (16.67%) and the estimate with CPOH cases removed (14.64%); the absolute bias (difference) is 2.03 percentage points and the relative bias is 12.19%, i.e., the uninsurance estimate is about 12% lower with CPOH removed.
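As a quick check of the arithmetic, the two bias measures can be recomputed from the rounded uninsurance rates quoted in these notes. The function names are illustrative, and the relative-bias formula (absolute bias divided by the full-sample estimate) is an assumption about how the figure was derived.

```python
def absolute_bias(full_estimate, reduced_estimate):
    """Size of the gap between the gold standard and the reduced sample,
    in percentage points."""
    return abs(full_estimate - reduced_estimate)


def relative_bias(full_estimate, reduced_estimate):
    """Absolute bias as a percentage of the gold-standard estimate
    (assumed definition)."""
    return 100.0 * absolute_bias(full_estimate, reduced_estimate) / full_estimate


full_nhis = 16.67  # uninsurance rate, full NHIS non-elderly sample (%)
non_cpoh = 14.64   # same rate with CPOH cases removed (%)

abs_bias = absolute_bias(full_nhis, non_cpoh)
rel_bias = relative_bias(full_nhis, non_cpoh)
print(f"absolute bias: {abs_bias:.2f} percentage points")
print(f"relative bias: {rel_bias:.1f}%")
```

With these rounded inputs the relative bias comes out at roughly 12.2%; the small gap from the 12.19% quoted in the notes is presumably because the published figure was computed from unrounded estimates.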
The last six columns show each iterative weighting adjustment applied in our attempt to adjust for removing CPOH. The last column brings the uninsurance estimate among non-CPOH closer to the full sample estimate, our gold standard (from 14.64% to 16.21%). As shown, adjusting for the omission of CPOH generally results in larger standard errors because post-stratification introduces more heterogeneity in the weights. Turning to MSE: the biggest impact comes from the home ownership adjustment. The average bias reduction goes from 38.37% to 72.54% after fitting this variable, and MSE goes from .17 to .04. Fitting the age by education post-stratification cell provides the least bias reduction, and the average MSE increases. Whsld has the lowest MSE.
This comparison of the weights was done using all six health-related variables. There is variation across the six health outcomes in terms of the amount of bias associated with dropping CPOH and in terms of how well our reweighting accounts for this omission. For example, the largest absolute bias from removing CPOH is 2.31 percentage points, for having no usual source of care (column 3). Reweighting the data to account for omitting CPOH drops the absolute bias to less than 1 percentage point. In terms of the percentage of bias reduction in the last column, bias is reduced 77.18% for uninsurance and just under 34% for reports of heavy drinking. Overall, bias from excluding CPOH was reduced by 51% across all six estimates by introducing our post-stratification weighting adjustments.
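One plausible way to compute the percentage of bias reduction is sketched below, using the rounded uninsurance figures quoted in the Table 2 notes (16.67% full sample, 14.64% with CPOH removed, 16.21% after reweighting). The formula is an assumption, not a confirmed description of the calculation behind the table.

```python
def percent_bias_reduction(gold, unadjusted, adjusted):
    """Share of the original absolute bias removed by reweighting, in percent.

    Assumed formula: (|bias before| - |bias after|) / |bias before| * 100.
    """
    bias_before = abs(unadjusted - gold)
    bias_after = abs(adjusted - gold)
    return 100.0 * (bias_before - bias_after) / bias_before


# Uninsurance among the non-elderly, rounded values from the notes (%).
reduction = percent_bias_reduction(gold=16.67, unadjusted=14.64, adjusted=16.21)
print(f"bias reduction for uninsurance: {reduction:.1f}%")
```

The rounded inputs give a value a little above 77%, in line with the 77.18% reported for uninsurance; the difference is presumably due to rounding of the published estimates.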
Although the post-stratification adjustments lowered the bias associated with these health estimates, they also increased the variance of these estimates, as shown by the SE (adjusting for the omission of CPOH generally results in larger standard errors because post-stratification introduces more heterogeneity in the weights). MSE allows us to assess whether the decrease in bias was offset by too large an increase in variance (or vice versa). The goal of this post-stratification process was to reduce the bias introduced by excluding CPOH without overly increasing the variance. The reweighted estimates appear to have accomplished this goal: the average MSE*1000 decreased from its value for the estimates omitting CPOH with only the original weights applied to its value for the reweighted data.
The MSE represents total survey error and comprises sampling error (variance) and non-sampling error (bias). The full NHIS sample estimate is used as the gold standard, and bias is calculated as the difference between the reweighted estimate and the gold standard estimate. MSE was then calculated as the sum of the squared bias and the variance for each estimate.
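The MSE calculation described above (squared bias plus variance) can be written out as a short sketch. The estimates are the rounded uninsurance figures from these notes, expressed as proportions; the standard error is a made-up placeholder, since the notes do not quote one, and working on the proportion scale is likewise an assumption.

```python
def mse(estimate, gold_standard, standard_error):
    """Mean squared error = squared bias + variance.

    Bias is the reweighted estimate minus the gold-standard estimate;
    variance is the squared standard error of the reweighted estimate.
    """
    bias = estimate - gold_standard
    return bias ** 2 + standard_error ** 2


# Uninsurance as proportions: reweighted non-CPOH vs. full NHIS sample.
reweighted = 0.1621
gold = 0.1667
HYPOTHETICAL_SE = 0.004  # placeholder value, NOT taken from the tables

mse_x1000 = 1000 * mse(reweighted, gold, HYPOTHETICAL_SE)
print(f"MSE*1000 = {mse_x1000:.4f}")
```

Scaling by 1000 simply keeps the numbers readable, as in the tables. A reweighting step is worthwhile under this criterion only if the drop in squared bias outweighs the rise in variance.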
The reweighted estimates underestimate key health-related outcomes; it is best to err on the conservative side.
Next we want to know whether the pattern observed for the NHIS non-elderly holds for specific demographic subsets of the population, specifically Hispanic and Black non-elderly and young adults: all three are groups in which CPOH status is common. The first two columns of this table summarize the average absolute bias and MSE across the six outcomes introduced by omitting CPOH from the total sample (mns). The last columns summarize bias reduction using the reweighted estimates, first for the total non-elderly sample, with comparisons of the reduction in bias across the estimates for Hispanics, Blacks, and young adults. As shown, bias reduction is greatest among the total non-elderly sample, repeating what was presented in Table 3: average absolute bias falls from 1.45% to .62%, MSE is .05, and the average percent bias reduction is 50.94%. Reweighting the data works somewhat less well for Hispanics and Blacks, and least well for young adults.
Overall, the absolute bias resulting from leaving out cell phone only households is small for the array of health care access and health-related behavior indicators included in our analysis. It exceeded 3 percentage points for only one estimate: uninsurance among Hispanics. Answering the research question posed at the start of this paper (can post-stratification adjustments correct for bias associated with not sampling CPOH in RDD health surveys?): as foreshadowed in the title, it depends. The answer is closer to yes for the total non-elderly population, where the reduction in bias across health-related outcomes is quite high. However, reweighting the data to account for bias associated with removing CPOH works less well for Hispanics and Blacks (e.g., the average bias reduction across the six outcomes was higher among whites (51.47%) than among Blacks (30.38%) and Hispanics (41.12%)) and even worse for 18 to 30 year olds (16.44% compared to 50.95% among all non-elderly). This suggests that post-stratification adjustments may not be enough. Surveys that do not include cell phone only households may misrepresent the level of disparities in health care access as well as in health-related behaviors.
Reweighting the data to account for CPOH works better for some health-related outcomes than others. Within the non-elderly population, the adjusted non-CPOH estimates reduced coverage bias by as much as 77% for uninsurance and as little as 34% for estimates of heavy drinking. (This has important implications for public health surveillance systems.) Thus for insurance, a key indicator for monitoring health policy reforms and funding formulas, the remaining bias with the reweighted data is small and results in an estimate that is approximately 0.5 percentage points lower than the gold standard (the full NHIS estimate). In truth, the reweighted estimate for drinking (5.55%) is not dramatically different from the gold standard (5.95% for the full NHIS data). Additionally, the direction of the bias is toward underestimating these key health-related outcomes.
Bullet 2 notes: We acknowledge that our adjustments are “context specific.” We reweight the data using only demographic variables that are readily available in many state surveys and that are typically measured in a way compatible with available population control totals (e.g., Census Bureau measures of education, home ownership, race/ethnicity). This is essential to our work with states, which often do not have the resources to conduct anything beyond traditional RDD surveys.
Bullet 3: Our results demonstrate that surveys that cannot go beyond RDD telephone modes of administration would do well to include a question about home ownership and add this post-stratification adjustment to their weighting routine. This measure consistently resulted in the greatest reductions in bias without much increase in variance. Although we aspire to develop a standard and accepted post-stratification adjustment for cell phone households, it seems unlikely that the size and characteristics of the cell-phone-only and cell-any populations will stabilize to the degree that those of phoneless households have, a stability that allows the use of telephone interruptions data for coverage adjustments. [8] Therefore, we must continue to monitor the efficacy of this post-stratification approach in dealing with coverage bias as telephony continues to change.
N per outcome: 1, 2 = person file = 67,065; 3 = sampled adult and child = 28,227; 4 = sampled adult only = 18,810.
Three publicly available weights were adjusted: the person weight available in the person file, the sample adult weight, and the sample child weight. The sample weights were combined into a single variable that was equal to the sample adult weight if the case was a sample adult and to the sample child weight if the case was a sample child. The following describes the post-stratification process that was used for both the sample and person files.
An iterative strategy was used:
Adjusted Weight = (Control Total / New Estimate) * Previous Weight
In the first iteration, estimated population counts obtained from the original public-use weights and the full sample were computed across each post-stratification variable. Then CPOH observations were removed from the dataset and the original weights were applied to create new estimates. To adjust the original weights, the full sample estimates by region (i.e., the control totals) were divided by the reduced sample estimates by region. This fraction was multiplied by the original weight to obtain a region-adjusted weight. This weight was then used to create estimates across all variables. The new weight was then multiplied by the ratio of the control totals by race to the newly estimated race totals. Subsequent adjustments occurred in the same fashion. The order of iteration was Region, Race, Age by Education, Housing Tenure.
To ascertain which weight performed best, population counts were converted into percentages so that patterns could be more easily identified. Outcome estimates (uninsurance, delayed care because of cost, no usual source of sick care, and current smoking status) were computed, as were their associated standard errors and mean squared errors.
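The sequential adjustment described above might look something like the following minimal sketch. It is not the authors' code: the record layout, the field names (`region`, `tenure`, `cpoh`, `w`, `adj_w`), and the toy weights are all invented for illustration, and only two of the post-stratification variables are shown.

```python
from collections import defaultdict


def weighted_totals(records, variable, weight_key):
    """Sum the weights within each category of one variable."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[variable]] += rec[weight_key]
    return dict(totals)


def sequential_poststratify(full_sample, variables, weight_key="w"):
    """Apply Adjusted Weight = Control Total / New Estimate * Previous Weight,
    one variable at a time, to the sample with CPOH cases removed."""
    # Control totals: full sample with the original weights.
    controls = {v: weighted_totals(full_sample, v, weight_key) for v in variables}
    # Drop CPOH cases and start from the original weight.
    reduced = [dict(rec, adj_w=rec[weight_key]) for rec in full_sample if not rec["cpoh"]]
    for v in variables:
        # New estimates use the weight produced by the previous step.
        estimates = weighted_totals(reduced, v, "adj_w")
        for rec in reduced:
            rec["adj_w"] *= controls[v][rec[v]] / estimates[rec[v]]
    return reduced, controls


# Toy data (hypothetical values, not NHIS records).
full = [
    {"region": "NE", "tenure": "own", "cpoh": False, "w": 100.0},
    {"region": "NE", "tenure": "rent", "cpoh": True, "w": 50.0},
    {"region": "NE", "tenure": "rent", "cpoh": False, "w": 40.0},
    {"region": "S", "tenure": "own", "cpoh": False, "w": 120.0},
    {"region": "S", "tenure": "rent", "cpoh": False, "w": 80.0},
    {"region": "S", "tenure": "rent", "cpoh": True, "w": 60.0},
]
adjusted, controls = sequential_poststratify(full, ["region", "tenure"])
print(weighted_totals(adjusted, "tenure", "adj_w"))
print(controls["tenure"])
```

In a single pass like this, only the last variable fitted is guaranteed to match its control totals exactly; margins fitted earlier can drift slightly after later adjustments. The sketch also assumes every category in the control totals still has at least one non-CPOH case.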