sumandro_the_art_of_nsso_data_5thelephant_20120728

Presentation made at The Fifth Elephant conference organised by HasGeek in July 2012, on the structure of data published by the National Sample Survey Office (NSSO), Govt. of India, and its extraction using R.

  1. The art of NSSO data
     sumandro chattapadhyay
     ajantriks.net | @ajantriks
  2. Per Capita Floor Area (Sq. Mt.)

     MPCE Quintile Class   Pucca   Semi-Pucca   Katcha     All
     0-20                   5.84         5.03     4.32    5.63
     20-40                  6.72         7.37     5.16    6.75
     40-60                  7.98         7.85     6.88    7.96
     60-80                 10.13         9.32     6.50   10.09
     80-100                16.83        14.70    20.15   16.83
     All                    9.77         6.55     4.97    9.45
  3. Structure
     - History
     - Concepts
     - Data Organisation
  5. History
     - 1862: Statistical Committee constituted; publication of the first Statistical Abstract of British India (1840-65)
     - 1881: First Decennial Population Census begins
     - 1914: Directorate of Statistics established, later became the Directorate of Commercial Intelligence and Statistics
     - 1939: Wholesale Price Index collection and calculation begins
  6. History
     - 1947: P.C. Mahalanobis appointed as the Honorary Statistical Advisor
     - 1949: The Central Statistical Unit established
     - 1951: Central Statistical Organisation (CSO) and Department of Statistics established as nodal national data gathering institutions. Presently CSO is part of the Ministry of Statistics and Programme Implementation
  8. Concepts
     - Round: Each annual cycle of data collection by NSSO
     - Schedule: Thematic focus for data collection, multiple schedules per Round
     - Thick Round: Major data collection rounds repeated every 5 years (hence called quinquennial rounds)
     - Thin Round: Minor data collection rounds
  9. Concepts
     - State-Region: Usually a cluster of three or more districts in a state
     - Fixed-Width Data Format: Data files in text format specified by fixed column widths, pad character and left/right alignment
     - Schedule File: Questionnaire for the survey concerned
     - Layout File: Description of organisation of variables
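The fixed-width format described on this slide can be illustrated with a minimal Python sketch (the talk itself used R; Python is swapped in here purely for illustration). The field names, widths, pad characters and alignments below are hypothetical and are not taken from any NSSO layout file.

```python
# Hypothetical field specification: (name, width, pad character, alignment).
FIELDS = [
    ("serial", 3, "0", "right"),
    ("age", 2, "0", "right"),
    ("name", 6, " ", "left"),
]


def encode(record):
    """Write one record as a fixed-width line using each field's pad and alignment."""
    parts = []
    for name, width, pad, align in FIELDS:
        text = str(record[name])
        if len(text) > width:
            raise ValueError(f"{name!r} does not fit in {width} columns")
        if align == "right":
            parts.append(text.rjust(width, pad))  # pad on the left
        else:
            parts.append(text.ljust(width, pad))  # pad on the right
    return "".join(parts)


def decode(line):
    """Read a fixed-width line back into a dict of strings, removing the padding."""
    record, start = {}, 0
    for name, width, pad, align in FIELDS:
        raw = line[start:start + width]
        record[name] = raw.lstrip(pad) if align == "right" else raw.rstrip(pad)
        start += width
    return record


line = encode({"serial": 7, "age": 34, "name": "Asha"})
print(repr(line))
print(decode(line))
```

Because there are no delimiters, the widths, pad character and alignment are the only way to recover the fields, which is why the layout file matters so much. Note that stripping a "0" pad from an all-zero field leaves an empty string, so real code would need to handle that case explicitly.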
  11. Data Organisation
      Organisation of Raw Data:
      - Fixed-width file (.txt)
      - Binary coding of information
      Supporting Files:
      - Schedule file
      - Layout file
      - Readme file
      - State and district codes
  12. Data Organisation
      Levels:
      - Multi-row coding of information about the same entity
      - Binary coding of information
      Questions to households and individuals:
      - Need to generate unique IDs at the household and at the individual levels
      - Appropriate weightage (by household size)
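A rough Python sketch of the two points on this slide, under stated assumptions: the identifying fields (`fsu`, `hh_no`, `person_no`), the `floor_area` variable and all values are made up for illustration, and the real identifying fields must be read from the layout file of the round concerned. The talk used R; Python is used here only as an example language.

```python
# Hypothetical household and person rows; names and values are illustrative only.
households = [
    {"fsu": "10001", "hh_no": "01", "hh_size": 4, "floor_area": 30.0},
    {"fsu": "10001", "hh_no": "02", "hh_size": 2, "floor_area": 20.0},
    {"fsu": "10002", "hh_no": "01", "hh_size": 6, "floor_area": 36.0},
]
persons = [
    {"fsu": "10001", "hh_no": "01", "person_no": "01"},
    {"fsu": "10001", "hh_no": "01", "person_no": "02"},
    {"fsu": "10002", "hh_no": "01", "person_no": "01"},
]


def household_id(row):
    """Concatenate identifying fields as strings so leading zeros are preserved."""
    return row["fsu"] + row["hh_no"]


def person_id(row):
    """A person is unique only within a household, so prefix the household ID."""
    return household_id(row) + row["person_no"]


hh_ids = [household_id(h) for h in households]
p_ids = [person_id(p) for p in persons]

# Per capita value for each household, then two ways of averaging it.
per_capita = [h["floor_area"] / h["hh_size"] for h in households]
unweighted = sum(per_capita) / len(per_capita)
total_persons = sum(h["hh_size"] for h in households)
weighted = sum(pc * h["hh_size"] for pc, h in zip(per_capita, households)) / total_persons

print(hh_ids, p_ids)
print(round(unweighted, 2), round(weighted, 2))
```

The household number "01" occurs in two different sampling units here, which is why no single field can serve as the ID. Weighting by household size makes the average describe persons rather than households, so large households are not under-counted.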
  13. Data Organisation
      Schedule:
      1. What is the serial no. of a person?
      2. What is his/her age?
      3. What is his/her daily wage?
      Layout:
      1. Serial Number: Column 1-3
      2. Age: Column 4-5
      3. Daily wage: Column 6-9
      Data:
      12121212121212
      34343434343434
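Applying the layout from the slide above to its two data rows can be sketched in Python as follows (the talk used R; this is only an illustrative equivalent). The dictionary keys and function name are assumptions; the column positions and data rows are copied from the slide, with the 1-based inclusive columns converted to Python's 0-based slices.

```python
# Layout from the slide, as 0-based half-open slices.
LAYOUT = {
    "serial_number": (0, 3),  # Column 1-3
    "age": (3, 5),            # Column 4-5
    "daily_wage": (5, 9),     # Column 6-9
}


def parse_line(line):
    """Cut one fixed-width row into named integer fields according to LAYOUT."""
    return {name: int(line[start:end]) for name, (start, end) in LAYOUT.items()}


data = ["12121212121212", "34343434343434"]
rows = [parse_line(line) for line in data]
for row in rows:
    print(row)
```

The first row yields serial number 121, age 21 and daily wage 2121. Columns 10-14 of each row are not described by this layout and are simply ignored.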
  17. Data Organisation
      Schedule:
      1. What is the serial no. of a person?
      2. What is his/her age?
      3. What is his/her daily wage?
      Layout:
      1. Serial Number: Column 1-3
      2. Level: Column 4
      2. Age: Column 5-6 (if level = 2)
      3. Daily wage: Column 5-8 (if level = 4)
      Data:
      12121212121212
      12143434343434
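With a level column, the same columns mean different things on different rows, and information about one person is spread over several rows. A minimal Python sketch of reading such data, assuming the layout and the two rows shown on the slide above (the names `LEVEL_FIELDS`, `parse_line` and `merge_levels` are hypothetical, and the talk itself used R):

```python
SERIAL = (0, 3)  # Column 1-3
LEVEL = (3, 4)   # Column 4

# Fields that exist only on rows of a given level (0-based half-open slices).
LEVEL_FIELDS = {
    2: {"age": (4, 6)},         # Column 5-6 (if level = 2)
    4: {"daily_wage": (4, 8)},  # Column 5-8 (if level = 4)
}


def parse_line(line):
    """Return (serial, level, fields) for one row, choosing fields by its level."""
    serial = line[SERIAL[0]:SERIAL[1]]
    level = int(line[LEVEL[0]:LEVEL[1]])
    spec = LEVEL_FIELDS.get(level, {})  # unknown levels contribute no fields
    fields = {name: int(line[start:end]) for name, (start, end) in spec.items()}
    return serial, level, fields


def merge_levels(lines):
    """Combine rows that share a serial number into one record per person."""
    people = {}
    for line in lines:
        serial, _level, fields = parse_line(line)
        people.setdefault(serial, {"serial_number": serial}).update(fields)
    return people


data = ["12121212121212", "12143434343434"]
people = merge_levels(data)
print(people)
```

Both rows carry serial number 121, so they merge into a single record with age 12 (from the level 2 row) and daily wage 3434 (from the level 4 row). In real files the merge key would be the generated unique ID rather than a bare serial number.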
  21. sumandro chattapadhyay
      ajantriks.net | @ajantriks
