WEIGHTING OF DATA
Robert Radics^a (riradics@ncsu.edu), Sudipta Dasmohapatra^b (sdasmoh@ncsu.edu), Steve Kelley^c (sskelley@ncsu.edu)
^a Graduate Research Assistant, ^b Associate Prof., ^c Department Head; Department of Forest Biomaterials, College of Natural Resources
Data Collection Method
Abstract
Data collected from consumer samples in the IBSS
project were adjusted (weighted) to support inferences
about the population of the states of NC and TN. This
paper presents lessons learned while weighting the
data on multiple variables to account for differences
between the selected sample and the population.
Goal of Weighting and Raking
Battaglia, M., Hoaglin, D., & Frankel, M. (2013). Practical considerations in raking survey data.
http://magmods.wordpress.com/2011/03/23/magmods-questionnaire-3/
Weighting with More Variables; Raking
Battaglia, M., Hoaglin, D., & Frankel, M. (2013). Practical considerations in raking survey data.
Basic Algorithm
Limitations
Weighting with One Variable – Gender
Weighting with Two Variables – Gender, Ethnicity
• Survey instrument
• Sampling: randomly selected consumer email addresses from a third-party consumer database
• Data collection: Fall 2013 in NC and TN
• Pilot test: 34 consumers
• Cover letter
• Completed surveys:
  • 586 in total
  • 376 in NC and 210 in TN
• Response rate: 2%
Sample and Population Demography Data
(all values are percentages, except n)

                           NC sample   NC Census   TN sample   TN Census
n                                376   9,848,000         209   6,496,000
Gender
  Male                          54.0        48.7        45.5        48.8
  Female                        46.0        51.3        54.5        51.2
Education
  College 4 or 4+               66.7        26.8        31.0        23.5
Ethnicity
  White/Caucasian               79.0        71.9        88.5        79
  Black/African-American        10.1        22.0         6.7        17
Age
  18-24                          9.6        10.0        10.3         2.1
  25-44                         26.5        43.1        26.8        26.6
  45-64                         26.9        24.9        26.2        52.1
  65+                           13.3         1.0        12.4        17.8
The sample does not have the same demographic
proportions as the population.
Weighting and raking bring the sample closer to
the population by adjusting the sampling weights
of the cases. At the end of the process, the marginal
totals of the adjusted weights on each characteristic
equal the population totals on the same
characteristic.
NC sample, weighting by gender (Weight = Census % / Sample %)

                Sample      Census   Weight
n                  376   9,848,000
Gender (%)
  Male              54        48.7     0.90
  Female            46        51.3     1.12

• Every male respondent receives a weight of 0.90 in statistical analyses.
• Every female respondent receives a weight of 1.12 in statistical analyses.
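The weight column is simply the census share divided by the sample share. A minimal Python sketch of that arithmetic, using the NC gender percentages from the table; the function and variable names are illustrative and do not come from the poster:

```python
def category_weights(sample_pct, census_pct):
    """Return weight = census % / sample % for each category."""
    return {cat: census_pct[cat] / sample_pct[cat] for cat in sample_pct}

# NC gender percentages taken from the table.
sample_pct = {"Male": 54.0, "Female": 46.0}
census_pct = {"Male": 48.7, "Female": 51.3}

weights = category_weights(sample_pct, census_pct)
for category, weight in weights.items():
    print(f"{category}: {weight:.2f}")

# Multiplying each sample share by its weight recovers the census share.
weighted_pct = {cat: sample_pct[cat] * weights[cat] for cat in sample_pct}
```

With a single weighting variable the weighted sample shares match the census shares exactly, which is why no iteration is needed in this case.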
NC sample, weighting by gender and ethnicity (Weight = Census % / Sample %)

                             Sample      Census   Weight
n                               376   9,848,000
Gender (%)
  Male                           54        48.7     0.90
  Female                         46        51.3     1.12
Ethnicity (%)
  White/Caucasian                79        71.9     0.91
  Black/African-American       10.1          22     2.18
  Others                       10.9         6.1     0.56

Each respondent gets two weights, one for gender and one for ethnicity.
Issue: once the two weights are multiplied together, the weighted gender
proportions no longer match the census.
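The issue can be reproduced with a small Python sketch. The poster reports only the marginal percentages, so the joint gender-by-ethnicity shares below are hypothetical values chosen to add up to those margins; only the weights come from the table.

```python
# Hypothetical joint sample shares (% of respondents). Only the margins
# (54 / 46 and 79 / 10.1 / 10.9) come from the table; the split within
# each row is an assumption made for illustration.
sample_cells = {
    ("Male", "White"): 45.0, ("Male", "Black"): 4.0, ("Male", "Other"): 5.0,
    ("Female", "White"): 34.0, ("Female", "Black"): 6.1, ("Female", "Other"): 5.9,
}

# One-variable weights (census % / sample %), as in the table.
gender_weight = {"Male": 48.7 / 54.0, "Female": 51.3 / 46.0}
ethnicity_weight = {"White": 71.9 / 79.0, "Black": 22.0 / 10.1, "Other": 6.1 / 10.9}

# Each respondent's final weight is the product of the two weights.
weighted = {
    (g, e): share * gender_weight[g] * ethnicity_weight[e]
    for (g, e), share in sample_cells.items()
}

total = sum(weighted.values())
male_share = 100.0 * sum(v for (g, _), v in weighted.items() if g == "Male") / total
print(f"Weighted male share: {male_share:.1f}% (census: 48.7%)")
```

The product of the two weights would reproduce both margins only if gender and ethnicity were independent in the sample; with any association between them, as in these assumed shares, the weighted gender split drifts away from the census value.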
Raking is the method of iterative
proportional fitting.
Raking adjusts a set of data so that its
marginal totals match specified control totals
on a specified set of variables.
Raking is analogous to leveling the soil
in a garden by alternately working
with a rake in two perpendicular directions.
Limitations of raking:
• Lack of convergence or slow convergence.
• Large weights (> 30): a few respondents
represent a large proportion of the
population.
• Small weights (< 0.01): a large proportion of the
sample represents a small proportion of the
population.
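A simple way to screen for the second and third limitations is to flag weights outside the bounds quoted above. The following Python sketch assumes the final weights are available as a plain list; the function name and the example weights are hypothetical.

```python
def flag_extreme_weights(weights, upper=30.0, lower=0.01):
    """Return the indices of weights above `upper` and below `lower`."""
    too_large = [i for i, w in enumerate(weights) if w > upper]
    too_small = [i for i, w in enumerate(weights) if w < lower]
    return too_large, too_small

# Hypothetical final weights for six respondents.
example_weights = [0.90, 1.12, 35.4, 0.005, 2.18, 0.56]

too_large, too_small = flag_extreme_weights(example_weights)
print("Weights above 30:", too_large)
print("Weights below 0.01:", too_small)
```

The default bounds are the ones listed in the limitations; flagged cases are candidates for review rather than automatic removal.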
The basic raking algorithm can be described in terms of the individual weights, $w_i$, $i = 1, 2, \ldots, n$. For an
unweighted (i.e., equally weighted) sample, one can simply take the initial weights to be $w_i = 1$
for each $i$. In a cross-classification that has $J$ rows and $K$ columns, we denote the sum of the $w_i$
in cell $(j,k)$ by $w_{jk}$. To indicate further summation, we replace a subscript by a + sign. Thus, the
initial row totals and column totals of the sample weights are $w_{j+}$ and $w_{+k}$, respectively.
Analogously, we denote the corresponding population control totals by $T_{j+}$ and $T_{+k}$.
We write $m_{jk}^{(1)}$ for the sum of the modified weights in cell $(j,k)$ at the end of step 1. If we begin by matching
the control totals for the rows, $T_{j+}$, the initial steps of the algorithm are

$$m_{jk}^{(0)} = w_{jk} \qquad (j = 1, \ldots, J;\ k = 1, \ldots, K)$$
$$m_{jk}^{(1)} = m_{jk}^{(0)} \left( T_{j+} / m_{j+}^{(0)} \right)$$
$$m_{jk}^{(2)} = m_{jk}^{(1)} \left( T_{+k} / m_{+k}^{(1)} \right)$$

The adjustment factors, $T_{j+} / m_{j+}^{(0)}$ and $T_{+k} / m_{+k}^{(1)}$, are actually applied to the
individual weights, which we could denote by $m_i^{(2)}$, for example.
In the iterative process an iteration rakes both rows and columns. For iteration $s$ ($s = 0, 1, \ldots$) we
may write

$$m_{jk}^{(2s+1)} = m_{jk}^{(2s)} \left( T_{j+} / m_{j+}^{(2s)} \right)$$
$$m_{jk}^{(2s+2)} = m_{jk}^{(2s+1)} \left( T_{+k} / m_{+k}^{(2s+1)} \right)$$
Raking can also adjust a set of data to control totals on three or more variables.
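The two-variable iteration can be sketched in a few lines of Python. This is a minimal illustration and not the procedure used in the project: the control totals are the NC census percentages for gender and ethnicity, while the starting cell values, the tolerance, and the iteration cap are assumptions made for the example.

```python
def rake(cells, row_targets, col_targets, tol=1e-8, max_iter=500):
    """Rake a J x K table of summed weights to row and column control totals.

    Assumes every cell is positive and both sets of targets have the same sum.
    Returns the raked table and the number of iterations used.
    """
    m = [row[:] for row in cells]  # m^(0) = w
    for iteration in range(1, max_iter + 1):
        # Odd step: scale each row j by T_j+ / m_j+.
        for j, row in enumerate(m):
            factor = row_targets[j] / sum(row)
            m[j] = [value * factor for value in row]
        # Even step: scale each column k by T_+k / m_+k.
        col_sums = [sum(row[k] for row in m) for k in range(len(col_targets))]
        for row in m:
            for k, col_sum in enumerate(col_sums):
                row[k] *= col_targets[k] / col_sum
        # Columns now match exactly; stop once the rows match as well.
        row_gap = max(abs(sum(row) - row_targets[j]) for j, row in enumerate(m))
        if row_gap < tol:
            return m, iteration
    raise RuntimeError("raking did not converge within max_iter iterations")

# Hypothetical sample shares (%): rows = Male, Female; columns = White, Black, Other.
cells = [[45.0, 4.0, 5.0],
         [34.0, 6.1, 5.9]]
row_targets = [48.7, 51.3]       # NC census gender %
col_targets = [71.9, 22.0, 6.1]  # NC census ethnicity %

raked, n_iter = rake(cells, row_targets, col_targets)

# Adjustment factor applied to every respondent in cell (j, k).
factors = [[raked[j][k] / cells[j][k] for k in range(3)] for j in range(2)]
print("iterations:", n_iter)
```

Each respondent's final weight is the initial weight times the factor for the cell the respondent falls in. Extending the loop to three or more variables means cycling through one additional scaling step per variable.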
