Data Quality: Issues and Fixes - Presentation Transcript
CRRC
ILCS Raking
Motivate Need and
Illustrate Basic Approach
Dr. Ali Mushtaq
July 3, 2009
(for academic purposes only)
What is Raking?
• A way to Adjust Survey totals “t” to
Independent Controls “T”
• Takes existing Survey Weights,
usually w_ij = 1/p_ij, where p_ij is
the probability of selection
• Ratios them up to each total T in
turn, until results are as close as
wanted
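As a sketch of a single ratio step, under notation that is assumed here and is not on the slide: let c be one category of a control (say, one age group), let T_c be its independent control total, and let t_c be the weighted survey total for that category. Every weight in the category is then scaled by the same factor:

```latex
t_c = \sum_{k \in c} w_k ,
\qquad
w_k^{\text{new}} = w_k \cdot \frac{T_c}{t_c}
\quad \text{for every unit } k \in c .
```

After this step the new weights in category c sum exactly to T_c. Moving on to the next control generally disturbs that match slightly, which is why the steps are repeated in turn.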
What is the Value?
• Can increase stability of survey
results
Reduce Sample Variance
• Get results that are close to
desired outcomes
Reduce bias arising from
minor operational errors
What Results to Expect?
• If Controls are Reasonable,
Raking Process will converge
(“Hit” all controls)
• And improve survey results
related to Control Totals
More Information Quality
• Only Weights are Changed by
Raking, not Survey Data
• Data Quality is thus
unchanged
• But Information Quality is
usually Improved
What Does Raking Cost?
• Usually Done quickly on a PC
• Independent Controls Need to be
consistent with each other
• Sample must be reasonably large
for Raking to be Safely Applied
• Some Costs incurred to explain
Method
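One simple necessary condition for controls to be consistent is that every set of control totals adds up to the same grand total. A minimal sketch of that check follows; the function name and the numbers are hypothetical, and agreement of grand totals is only one part of what consistency can mean.

```python
import math


def controls_are_consistent(*control_sets, rel_tol=1e-9):
    """Return True if every set of control totals sums to the same grand total.

    Hypothetical helper for illustration: it checks only the grand totals,
    not any deeper relationship between the controls.
    """
    grand_totals = [sum(controls) for controls in control_sets]
    return all(
        math.isclose(total, grand_totals[0], rel_tol=rel_tol)
        for total in grand_totals
    )


# Made-up controls: both sets sum to 100, so they can be hit together.
ok = controls_are_consistent([60, 40], [55, 45])

# Made-up controls: 100 versus 105, so raking cannot hit both sets.
bad = controls_are_consistent([60, 40], [55, 50])
```

If the check fails, the ratio adjustments keep pulling the weights toward totals that contradict each other, and the process cannot hit all controls.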
Raking Made Simple
• “Fudge” Factor Intuition
• Develop a ratio of target total
divided by sample total
• Repeat this process with each
of the controls in turn
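The "fudge factor" steps above can be sketched in a few lines of Python for the two-control case. This is an illustration under stated assumptions, not the ILCS production procedure: the function name, the tolerance, and all numbers are made up, and the cells hold weighted survey totals rather than individual weights.

```python
def rake(cell_totals, row_controls, col_controls, tol=1e-8, max_iter=1000):
    """Rake a two-way table of weighted totals to row and column controls.

    Hypothetical sketch: assumes every row and column has a positive total
    and that both sets of controls sum to the same grand total.
    """
    cells = [list(row) for row in cell_totals]  # work on a copy
    for iteration in range(1, max_iter + 1):
        # Ratio each row up (or down) to its control total.
        for i, target in enumerate(row_controls):
            factor = target / sum(cells[i])
            cells[i] = [value * factor for value in cells[i]]
        # Ratio each column to its control total.
        for j, target in enumerate(col_controls):
            factor = target / sum(row[j] for row in cells)
            for row in cells:
                row[j] *= factor
        # Columns now match; stop once the rows are also close enough.
        worst_gap = max(
            abs(sum(cells[i]) - target)
            for i, target in enumerate(row_controls)
        )
        if worst_gap < tol:
            return cells, iteration
    raise RuntimeError("raking did not converge; check the controls")


# Made-up weighted survey totals and controls (not the ILCS tables).
survey = [[20.0, 30.0], [40.0, 10.0]]
row_controls = [60.0, 40.0]
col_controls = [55.0, 45.0]

raked, n_iter = rake(survey, row_controls, col_controls)
```

Each pass applies one ratio per row and one per column. The loop stops when both sets of controls are hit to within the tolerance, which matches the "as close as wanted" stopping rule.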
NSS Example from ILCS
While the NSS RA survey is raked
across four dimensions (age, gender,
marz, and urban/rural), the example
here uses just two dimensions.
Table 1: Raking Example –
Source Survey Data
Table 2: Desired Marginals
First Ratio Adjustment
Second Ratio Adjustment
After Second Iteration
ILCS Benefits Achieved
• Reduction in Bias
• Reduction (hopefully) in Variance
• Survey Results are Consistent with
Census Projections
Again Many Thanks
Data Quality and Record
Linkage Techniques
Springer 2007