Data Quality: Issues and Fixes

Data Quality: Issues and Fixes Presentation Transcript

  • 1. CRRC ILCS Raking: Motivate Need and Illustrate Basic Approach. Dr. Ali Mushtaq, July 3, 2009 (for academic purposes only)
  • 2. What is Raking?
      • A way to adjust survey totals “t” to independent controls “T”
      • Takes the existing survey weights, usually w_ij = 1/p_ij, where p_ij is the probability of selection
      • Ratio-adjusts them to each control total T in turn, repeating until the results are as close as wanted
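As a minimal sketch of the starting point described on this slide, the snippet below builds base weights w = 1/p and the weighted survey total "t" that raking later adjusts. The selection probabilities are made-up illustrative values, not ILCS data, and the variable names are assumptions.

```python
# Hypothetical selection probabilities for four sampled units (illustrative only).
selection_probs = [0.01, 0.02, 0.01, 0.05]

# Base survey weight: the inverse of the probability of selection, w = 1/p.
base_weights = [1.0 / p for p in selection_probs]

# The weighted survey total "t" is the sum of the weights:
# each sampled unit stands in for w units of the population.
survey_total = sum(base_weights)

print(base_weights)
print(survey_total)
```

Raking starts from weights like these and rescales them so that totals such as `survey_total` line up with the independent controls.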
  • 3. What is the Value?
      • Can increase the stability of survey results (reduces sample variance)
      • Gets results that are close to desired outcomes (reduces bias arising from minor operational errors)
  • 4. What Results to Expect?
      • If the controls are reasonable, the raking process will converge (“hit” all controls)
      • And it will improve survey results related to the control totals
  • 5. More Information Quality
      • Only the weights are changed by raking, not the survey data
      • Data quality is thus unchanged
      • But information quality is usually improved
  • 6. What Does Raking Cost?
      • Usually done quickly on a PC
      • Independent controls need to be consistent with each other
      • Sample must be reasonably large for raking to be safely applied
      • Some costs are incurred to explain the method
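One way to read "consistent with each other" is that every set of control totals must imply the same overall population total. The check below is a rough sketch under that assumption; the function name `controls_are_consistent` and the numbers are hypothetical.

```python
def controls_are_consistent(row_controls, col_controls, tol=1e-9):
    """Return True when both sets of control totals imply the same grand total."""
    return abs(sum(row_controls) - sum(col_controls)) <= tol


# Hypothetical control totals (illustrative, not ILCS figures).
consistent = controls_are_consistent([600, 400], [300, 700])
inconsistent = controls_are_consistent([600, 400], [300, 650])

print(consistent)
print(inconsistent)
```

If the grand totals disagree, no set of weights can hit every control at once, so it is worth running a check like this before raking.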
  • 7. Raking Made Simple
      • “Fudge” factor intuition
      • Develop a ratio of target total divided by sample total, and apply it to the weights
      • Repeat this process with each of the controls in turn
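The "fudge" factor on this slide can be sketched as a single ratio adjustment: divide the target total by the current weighted total and multiply every weight by the result. The helper name `ratio_adjust` and the numbers are illustrative assumptions, not part of the presentation.

```python
def ratio_adjust(weights, target_total):
    """Scale the weights so that they sum to target_total.

    Returns the adjusted weights together with the ratio ("fudge" factor) used.
    """
    factor = target_total / sum(weights)
    return [w * factor for w in weights], factor


# Hypothetical weights for one category, summing to 200, with a control total of 250.
weights = [100.0, 50.0, 50.0]
adjusted, factor = ratio_adjust(weights, 250.0)

print(factor)
print(adjusted)
```

Raking applies this same step to each control in turn. Adjusting for one control generally disturbs the totals already matched for another, which is why the process is repeated.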
  • 8. NSS Example from ILCS: While the NSS RA survey is raked across four dimensions (age, gender, marz and urban/rural), the example here uses just two.
  • 9. Table 1. Raking Example – Source Survey Data
  • 10. Table 2: Desired Marginals
  • 11. First Ratio Adjustment
  • 12. Second Ratio Adjustment
  • 13. After Second Iteration
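The tables from slides 9 to 13 are not reproduced in this transcript, so the sketch below uses a made-up 2x2 table of weighted counts and made-up marginals to illustrate the row-then-column ratio adjustments repeated until the controls are hit. The function `rake`, its parameters and all the numbers are assumptions for illustration, not the ILCS figures or the presenter's implementation.

```python
def rake(table, row_targets, col_targets, tol=1e-6, max_iter=100):
    """Rake a 2-D table of weighted counts to the row and column control totals.

    Each pass ratio-adjusts every row to its target, then every column to its
    target. Stops when the row totals are also within tol of their targets.
    """
    cells = [row[:] for row in table]  # work on a copy of the input table
    for iteration in range(1, max_iter + 1):
        # First ratio adjustment: scale each row to its control total.
        for i, target in enumerate(row_targets):
            factor = target / sum(cells[i])
            cells[i] = [value * factor for value in cells[i]]
        # Second ratio adjustment: scale each column to its control total.
        for j, target in enumerate(col_targets):
            factor = target / sum(row[j] for row in cells)
            for row in cells:
                row[j] *= factor
        # Columns now match; check how far the rows have drifted from their targets.
        max_gap = max(abs(sum(cells[i]) - t) for i, t in enumerate(row_targets))
        if max_gap <= tol:
            return cells, iteration
    raise RuntimeError("raking did not converge; check that the controls are consistent")


# Hypothetical weighted survey counts and desired marginals (both sum to 200).
survey_table = [[40.0, 60.0], [30.0, 70.0]]
row_targets = [110.0, 90.0]
col_targets = [80.0, 120.0]

raked, iterations = rake(survey_table, row_targets, col_targets)
print(raked)
print(iterations)
```

Only the cell weights change here. The underlying survey responses are untouched, which matches the earlier point that raking changes weights, not data.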
  • 14. ILCS Benefits Achieved
      • Reduction in bias
      • Reduction (hopefully) in variance
      • Survey results are consistent with census projections
  • 15. Again, Many Thanks. Reference: Data Quality and Record Linkage Techniques, Springer, 2007