Data Quality: Issues and Fixes
Transcript

  • 1. CR RC ILCS Raking: Motivate Need and Illustrate Basic Approach. Dr. Ali Mushtaq, July 3, 2009 (for academic purposes only)
  • 2. What is Raking? • A way to Adjust Survey totals “t” to Independent Control totals “T” • Takes the existing Survey Weights, usually w_ij = 1/p_ij, where p_ij is the probability of selection • Ratio-adjusts them to each control total T in turn, repeating until the survey totals are as close to the controls as wanted
  • 3. What is the Value? • Can increase the stability of survey results (Reduce Sample Variance) • Gets results that are close to desired outcomes (Reduce bias arising from minor operational errors)
  • 4. What Results to Expect? • If the Controls are Reasonable, the Raking Process will converge (“Hit” all controls) • It will also improve the survey results related to the Control Totals
  • 5. More Information Quality • Only Weights are Changed by Raking, not Survey Data • Data Quality is thus unchanged • But Information Quality is usually Improved
  • 6. What Does Raking Cost? • Usually Done quickly on a PC • Independent Controls Need to be consistent with each other • Sample must be reasonably large for Raking to be Safely Applied • Some Costs incurred to explain Method
  • 7. Raking Made Simple • “Fudge” Factor Intuition • Develop a ratio of target total divided by sample total, and multiply the weights by it • Repeat this process with each of the controls in turn
  • 8. NSS Example from ILCS: While the NSS RA survey is raked across 4 dimensions (age, gender, marz and urban/rural), the example here uses just two dimensions.
  • 9. Table 1. Raking Example – Source Survey Data
  • 10. Table 2: Desired Marginals
  • 11. First Ratio Adjustment
  • 12. Second Ratio Adjustment
  • 13. After Second Iteration
  • 14. ISLS Benefits Achieved • Reduction in Bias • Reduction (hopefully) in Variance • Survey Results are Consistent with Census Projections
  • 15. Again, Many Thanks. Data Quality and Record Linkage Techniques, Springer, 2007
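
The ratio adjustments described in slides 2, 7, 11 and 12 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used for the NSS survey: the function name `rake`, its parameters, and the 2x2 table of weighted totals are all hypothetical, since the tables from slides 9 to 13 did not survive in this transcript. The sketch works on a two-way table of weighted survey totals and alternates row and column ratio adjustments until both sets of controls are hit.

```python
def rake(weighted_totals, row_targets, col_targets, tol=1e-8, max_iter=100):
    """Two-way raking sketch: alternate row and column ratio adjustments.

    weighted_totals: table of weighted survey totals "t" (list of rows).
    row_targets, col_targets: independent control totals "T" for each margin.
    Returns the adjusted table and the number of iterations used.
    """
    table = [list(row) for row in weighted_totals]  # work on a copy
    for iteration in range(1, max_iter + 1):
        # First ratio adjustment: scale each row to its control total.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                factor = target / total  # the "fudge" factor T / t
                table[i] = [cell * factor for cell in table[i]]
        # Second ratio adjustment: scale each column to its control total.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                factor = target / total
                for row in table:
                    row[j] *= factor
        # Columns now match exactly, so only the rows need checking.
        gap = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if gap < tol:
            return table, iteration
    raise RuntimeError("raking did not converge; are the controls consistent?")


# Hypothetical data, invented for illustration only.
survey = [[20.0, 30.0], [40.0, 10.0]]
row_controls = [55.0, 45.0]  # e.g. one dimension's control totals
col_controls = [50.0, 50.0]  # e.g. the other dimension's control totals

raked, iterations = rake(survey, row_controls, col_controls)
for row in raked:
    print([round(cell, 2) for cell in row])
print("iterations:", iterations)
```

The two margins must sum to the same grand total (here 100); if they do not, the row and column adjustments keep undoing each other and the sketch raises an error, which matches the slide 6 point that the independent controls need to be consistent with each other. In practice the cell factors would be applied to the individual respondent weights rather than to a table of totals.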