Data Quality: Issues and Fixes


  1. CR RC ILCS Raking: Motivate Need and Illustrate Basic Approach. Dr. Ali Mushtaq, July 3, 2009 (for academic purposes only)
  2. What is Raking?
     • A way to adjust survey totals “t” to independent controls “T”
     • Takes existing survey weights, usually wij = 1/pij, where pij is the probability of selection
     • Ratios them up to each total T in turn, until results are as close as wanted
  3. What is the Value?
     • Can increase stability of survey results (reduce sample variance)
     • Get results that are close to desired outcomes (reduce bias arising from minor operational errors)
  4. What Results to Expect?
     • If controls are reasonable, the raking process will converge (“hit” all controls)
     • And improve survey results related to the control totals
  5. More Information Quality
     • Only weights are changed by raking, not survey data
     • Data quality is thus unchanged
     • But information quality is usually improved
  6. What Does Raking Cost?
     • Usually done quickly on a PC
     • Independent controls need to be consistent with each other
     • Sample must be reasonably large for raking to be safely applied
     • Some costs incurred to explain the method
  7. Raking Made Simple
     • “Fudge” factor intuition
     • Develop a ratio of target total divided by sample total
     • Repeat this process with each of the controls in turn
  8. NSS Example from ILCS
     While the NSS RA survey is raked across 4 dimensions (age, gender, marz and urban/rural), the example we’ll use here will use just two dimensions.
  9. Table 1: Raking Example – Source Survey Data
  10. Table 2: Desired Marginals
  11. First Ratio Adjustment
  12. Second Ratio Adjustment
  13. After Second Iteration
  14. ILCS Benefits Achieved
     • Reduction in bias
     • Reduction (hopefully) in variance
     • Survey results are consistent with Census projections
  15. Again, Many Thanks
     Reference: Data Quality and Record Linkage Techniques, Springer, 2007
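
The ratio-adjustment loop described on the "What is Raking?" and "Raking Made Simple" slides can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used for the ILCS: the function name `rake`, the tolerance, and the 2 x 2 table and control totals are all hypothetical (the tables from the original slides did not survive in this transcript). It assumes every row and column of the starting table has a positive weighted total and that the row controls and column controls sum to the same grand total.

```python
def rake(weights, row_targets, col_targets, tol=1e-8, max_iter=100):
    """Rake a 2-D table of weighted survey totals to row and column controls.

    Hypothetical helper for illustration. Assumes all row and column sums
    are positive and that both sets of controls add up to the same total.
    """
    w = [row[:] for row in weights]  # work on a copy; the input is untouched
    for iteration in range(1, max_iter + 1):
        # First ratio adjustment: scale each row by target total / current total.
        for i, target in enumerate(row_targets):
            factor = target / sum(w[i])
            w[i] = [cell * factor for cell in w[i]]
        # Second ratio adjustment: the same ratio, applied to each column.
        for j, target in enumerate(col_targets):
            factor = target / sum(row[j] for row in w)
            for row in w:
                row[j] *= factor
        # Columns now match exactly; stop once the rows are also close enough.
        gap = max(abs(sum(w[i]) - t) for i, t in enumerate(row_targets))
        if gap < tol:
            return w, iteration
    raise RuntimeError("raking did not converge; check that the controls are consistent")


# Hypothetical weighted survey totals (rows and columns are two survey dimensions).
survey = [[20.0, 30.0],
          [40.0, 10.0]]
row_controls = [55.0, 45.0]  # hypothetical independent controls, dimension 1
col_controls = [50.0, 50.0]  # hypothetical independent controls, dimension 2

raked, n_iter = rake(survey, row_controls, col_controls)
for row in raked:
    print([round(cell, 3) for cell in row])
print("iterations:", n_iter)
```

In practice the adjustment factor for each cell would be applied to the weights of the individual respondents in that cell, so only the weights change and the survey data stay as collected.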
