
Merging


Published in: Government & Nonprofit
  • Thank you for the slides. They are very useful. After linking the round 1 and round 2 datasets I get exactly 150,988 observations, but the data do not seem to match: for example, sex changes for 918 observations, and age shows the same kind of mismatch. Could you please explain where I might be going wrong? I followed the exact commands that you suggested.

  1. INDIA HUMAN DEVELOPMENT SURVEY (IHDS) TRAINING PROGRAM, MARCH 16, 2016: How to merge two rounds?
  2. Merging Household Files
  3. Relationship between IHDS-I and IHDS-II households
       • IHDS-I sample (N=41,554)
       • Reinterview households (N=34,621)
       • Attrition (N=6,911)
       • Split households from round 1 (N=5,397)
       • Replacement households in IHDS-II (N=2,134)
       • Most important concept in merging two data files:
           1. Some households in round 1 have no match in round 2, and vice versa
           2. Some households in round 1 match more than one household in round 2
  4. Any questions?
       • Who were chosen for reinterview?
       • What does a recontact rate of 83% mean?
       • How were replacement households chosen?
       • What is a split household?
  5. What is needed to merge household files?
       1. Round 1 household file: N=41,554
       2. Round 2 household file: N=42,152 (Why are there more cases in round 2?)
       3. Linking file: N=42,152. It gives round 1 identification codes for all round 2 households that were reinterviewed; linking codes are missing for the 2,134 households that are new
  6. Step 1: Link round 2 data to the linking file to get the round 1 ID
       use linkhh, clear
       sort STATEID DISTID PSUID HHID HHSPLITID
       * attach round 1 IDs to each round 2 household; _mergeR2link records match status
       merge 1:1 STATEID DISTID PSUID HHID HHSPLITID using round2HH, gen(_mergeR2link)
       sort STATEID DISTID PSUID HHID2005 HHSPLITID2005
       save round2HH_plus, replace
  7. Step 2: Merge this round 2+ file with the round 1 file
       use round1HH
       rename HHID HHID2005
       rename HHSPLITID HHSPLITID2005
       sort STATEID DISTID PSUID HHID2005 HHSPLITID2005
       * 1:m because one round 1 household can match several (split) round 2 households
       merge 1:m STATEID DISTID PSUID HHID2005 HHSPLITID2005 using round2HH_plus, gen(_mergeR1R2)
       sort STATEID DISTID PSUID HHID HHSPLITID
       save mergedHHR1R2, replace
  8. The merged file is a superset of cases
       • Households surveyed in both rounds: N=40,018
       • Households surveyed in round 1 only (attrition): N=6,911
       • Households surveyed in round 2 only (replacement): N=2,134
       • Total: N=49,063
       • Keep only _mergeR1R2==3 for panel analysis (N=40,018)
  9. Merging Individual Files
  10. Relationship between IHDS-I and IHDS-II individuals
       • IHDS-I sample (N=215,754)
       • Reinterviewed individuals (N=150,995)
       • Household attrition (N=29,299)
       • Individual attrition in interviewed households (N=35,464)
       • New individuals in round 1 households (N=43,822)
       • New individuals in new households (N=9,760)
       • Most important concept in merging two data files:
           1. Even reinterviewed households have new members (births, marriages)
           2. Even reinterviewed households have some members who are no longer there (deaths, marriages, migration)
  11. What is needed to merge individual files?
       1. Round 1 individual file: N=215,754
       2. Round 2 individual file: N=204,568 (Why are there fewer cases in round 2?)
       3. Linking file: N=204,568. It gives round 1 identification codes for all round 2 individuals who were reinterviewed; linking codes are missing for individuals who are new, including members of the 2,134 new households
  12. Step 1: Link round 2 data to the linking file to get the round 1 ID
       use linkind, clear
       sort STATEID DISTID PSUID HHID HHSPLITID PERSONID
       * attach round 1 IDs to each round 2 individual; _mergeR2link records match status
       merge 1:1 STATEID DISTID PSUID HHID HHSPLITID PERSONID using round2IND, gen(_mergeR2link)
       sort STATEID DISTID PSUID HHID2005 HHSPLITID2005 PERSONID2005
       save round2IND_plus, replace
  13. Step 2: Merge this round 2+ file with the round 1 file
       use round1IND
       rename HHID HHID2005
       rename HHSPLITID HHSPLITID2005
       rename PERSONID PERSONID2005
       sort STATEID DISTID PSUID HHID2005 HHSPLITID2005 PERSONID2005
       merge 1:m STATEID DISTID PSUID HHID2005 HHSPLITID2005 PERSONID2005 using round2IND_plus, gen(_mergeR1R2)
       sort STATEID DISTID PSUID HHID HHSPLITID
       save mergedINDR1R2, replace
  14. The merged file is a superset of cases
       • Individuals surveyed in both rounds: N=150,988
       • Individuals surveyed in round 1 only (attrition/death/migration): N=64,766
       • Individuals surveyed in round 2 only (replacement/new): N=53,580
       • Total: N=269,334
       • Keep only _mergeR1R2==3 for panel analysis (N=150,988)
  15. Ever-married woman file linkage
  16. Same process as individual file linkage
       • One thing to note: there was no ever-married woman file for 2004-5, so you will be merging with the household file from 2004-5
  17. Merging Caution
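  18. Merging overwrites variables
       • Variables with the same name in both files are not kept separately: only one copy survives the merge (by default, Stata keeps the master file's values)
       • So if you want to keep round 1 and round 2 variables separate, you may want to rename all round 1 variables before merging
       • Typically we use the commands:
           rename * x*
           rename xSTATEID STATEID, etc., for the merge variables
       • So xr05 will be age in 2004-5 and r05 will be age in 2011-12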
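
The rename-then-merge idea from the last slide can be sketched on toy data. This is a minimal illustration, not the IHDS files themselves: the values are made up, and the merge keys are cut down to STATEID, HHID2005 and PERSONID2005 for brevity (the real merge also needs DISTID, PSUID and HHSPLITID2005). The variable agegap is a hypothetical name introduced here to compare age across rounds.

```stata
* Toy round 1 individual file (made-up values; r05 stands for age, as on the slide)
clear
input STATEID HHID2005 PERSONID2005 r05
1 101 1 30
1 101 2 28
1 102 1 60
end

* Prefix every round 1 variable with x, then restore the merge keys
rename * x*
rename (xSTATEID xHHID2005 xPERSONID2005) (STATEID HHID2005 PERSONID2005)
tempfile r1
save `r1'

* Toy round 2 file that already carries round 1 IDs from the linking file;
* the last person is new in round 2, so the round 1 IDs are missing
clear
input STATEID HHID2005 PERSONID2005 HHID PERSONID r05
1 101 1 101 1 37
1 101 2 101 2 35
1 .   .  101 3 4
end
tempfile r2
save `r2'

* Merge round 1 (master) with round 2 (using) and inspect the match status
use `r1', clear
merge 1:m STATEID HHID2005 PERSONID2005 using `r2', gen(_mergeR1R2)
tab _mergeR1R2

* Keep the panel cases and compare age across rounds
keep if _mergeR1R2 == 3
gen agegap = r05 - xr05
list STATEID HHID2005 PERSONID2005 xr05 r05 agegap
```

Because the round 1 age is stored as xr05 and the round 2 age as r05, both survive the merge and can be compared directly. A gap that is negative, or far from the number of years between the rounds, would be a reason to look at that record more closely.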
