  • Textile Fibre Sampling

    1. 1. Length-Biased Sampling: A Review of Applications
        Termeh Shafie, Department of Statistics, Umeå University
    2. 2. Outline
        • Length-Biased Sampling & the Estimation Problem
        • Applications & Suggested Solutions
        • Simulation under Misspecified Sampling Inclusion Probabilities
    3. 3. Length-Biased Sampling
        • The probability of sample inclusion of a population unit is related to the value of the variable measured.
        • Cox (1969): Textile fibre sampling
        • A simple illustration of the problem when estimating the population mean
    4. 4. The Estimation Problem
        • Assume there is a population with elements $x_1, \dots, x_N$.
        • The mean of the population is $\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$.
    5. 5. The Estimation Problem
        • Suppose $n$ observations form a sample with sample mean $\bar{x} = \frac{1}{n} \sum_{i=1}^{N} I_i x_i$,
        • where $I_i = 1$ if individual $i$ is sampled
        • and $I_i = 0$ otherwise.
    6. 6. The Estimation Problem
        • The expected value of the sample mean is $E(\bar{x}) = \frac{1}{n} \sum_{i=1}^{N} \pi_i x_i$,
        • where $\pi_i = P(I_i = 1)$ are the inclusion probabilities of the population units.
    7. 7. The Estimation Problem
        • Using simple random sampling, $\pi_i = n/N$ for every unit,
        • and thus $E(\bar{x}) = \frac{1}{N} \sum_{i=1}^{N} x_i = \mu$.
    8. 8. The Estimation Problem
        • However, in general $\pi_i$ is unknown and need not equal $n/N$, and thus $E(\bar{x}) \neq \mu$.
        • The sample mean becomes a biased estimator of the population mean.
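The bias described on these slides can be checked with a small exact calculation. The sketch below is only an illustration: the four-element population, the sample size, and the choice of inclusion probabilities proportional to the measured value are all hypothetical, and the function name is not from the slides.

```python
from fractions import Fraction

def expected_sample_mean(values, inclusion_probs, n):
    """E(x_bar) = (1/n) * sum_i pi_i * x_i for a sample of fixed size n."""
    return sum(p * x for p, x in zip(inclusion_probs, values)) / n

population = [1, 2, 3, 4]            # hypothetical fibre lengths
N, n = len(population), 2
mu = Fraction(sum(population), N)    # population mean

# Simple random sampling: every unit has inclusion probability n/N.
srs_probs = [Fraction(n, N)] * N

# Length-biased design (assumed): pi_i proportional to x_i, summing to n.
total = sum(population)
lb_probs = [Fraction(n * x, total) for x in population]

e_srs = expected_sample_mean(population, srs_probs, n)
e_lb = expected_sample_mean(population, lb_probs, n)
print(mu, e_srs, e_lb)
```

Under simple random sampling the expectation equals the population mean, while under the length-proportional design it equals the ratio of the sum of squares to the sum, which is larger whenever the values are not all equal.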
    9. 9. Cox (1969)
        • Derived the length-biased (weighted) pdf and studied the estimation of the population mean from a length-biased sample.
        • Assume $x_1, \dots, x_n$ is a random sample with pdf $g(x) = \frac{x f(x)}{\mu}$, where $f(x)$ is the population pdf with mean $\mu$.
    10. 10. Cox (1969)
        • It can be shown that $E_g(1/X) = 1/\mu$.
        • An unbiased estimator of $1/\mu$ is $\frac{1}{n} \sum_{i=1}^{n} \frac{1}{x_i}$,
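    11. 11. Cox (1969)
        • with variance $\operatorname{Var}\left(\frac{1}{n} \sum_{i=1}^{n} \frac{1}{x_i}\right) = \frac{1}{n}\left(\frac{E_f(1/X)}{\mu} - \frac{1}{\mu^2}\right)$.
        • Note: for large $n$ this estimator is approximately normally distributed ($\sim N$) around $1/\mu$.
    12. 12. Cox (1969)
        • Relation between the moments of g(x) and f(x): $E_g(X^r) = E_f(X^{r+1}) / \mu$.
        • The relative bias of the sample mean is thus $\frac{E_g(X) - \mu}{\mu} = \frac{\sigma^2}{\mu^2}$, where $\sigma^2$ is the variance under $f(x)$.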
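A quick simulation can make the Cox (1969) results concrete. The sketch below assumes, purely for illustration, that the population density f is Gamma with shape 2 and scale 1 (so the mean is 2 and the variance is 2); the length-biased density x f(x)/mu is then Gamma with shape 3 and the same scale. The estimator coded here is the reciprocal of the mean of 1/x, i.e. the harmonic mean, which is consistent for the population mean; the function names are not from the slides.

```python
import random

def sample_mean(xs):
    """Ordinary sample mean, biased upward under length-biased sampling."""
    return sum(xs) / len(xs)

def cox_estimator(xs):
    """Harmonic-mean estimator n / sum(1/x_i) of the population mean."""
    return len(xs) / sum(1.0 / x for x in xs)

rng = random.Random(0)
shape, scale = 2.0, 1.0            # assumed population pdf f: Gamma(2, 1)
mu = shape * scale                 # population mean
sigma2 = shape * scale ** 2        # population variance

# Weighting Gamma(shape, scale) by x gives Gamma(shape + 1, scale),
# so a length-biased sample can be drawn directly from that distribution.
xs = [rng.gammavariate(shape + 1.0, scale) for _ in range(50_000)]

naive = sample_mean(xs)
cox = cox_estimator(xs)
print("sample mean:", round(naive, 3), "theory:", mu * (1 + sigma2 / mu ** 2))
print("Cox estimator:", round(cox, 3), "target:", mu)
```

The naive mean settles near mu(1 + sigma^2/mu^2), matching the relative bias sigma^2/mu^2, while the harmonic-mean estimator settles near mu.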
    13. 14. 2. APPLICATIONS: Technical/Industrial Sampling
        • Cox (1969): Sampling textile fibres and the estimation of fibre length distribution.
    14. 15. Marketing
        • Shopping Center Sampling & Mall Intercept Surveys:
            • Keillor et al. (2001): Global consumer tendencies.
            • Sudman (1980): Quota sampling techniques and weighting procedures to correct for frequency bias.
            • Nowell et al. (1991): Correction techniques for length-biased sampling in two situations: when total length of stay is known or estimated, and when only the recurrence time is known.
    15. 16. Epidemiology
        • Sampling procedures for the collection of positive-valued or lifetime data are length-biased (Simon 1980, Zelen et al. 1969).
        • Wang (1996): Statistical analysis of length-biased data under the proportional hazards model. A pseudo-likelihood approach for estimating the parameters from length-biased data is presented.
    16. 17. Resource Economics
        • On-site sampling:
            • Deriving demand functions for a recreational site (Bockstael et al. 1990, Ovaskainen et al. 2001)
            • Charting trip-taking behavior (Bowker et al. 1998)
            • Travel cost models of recreational demand (Moons et al. 2001)
            • Contingent valuation surveys for the elicitation of non-market goods (Cameron et al. 1987, Nowell et al. 1988)
    17. 18. Resource Economics
        • Shaw (1988): Three problems with on-site samples' regression:
            • Non-negative integers
            • Truncation
            • Endogenous stratification
    18. 19. Resource Economics
        • Shaw (1988): Recreational demand modeling under two assumptions about the dependent variable's distribution:
            • Normal distribution
            • Poisson distribution: $h(y) = \frac{e^{-\lambda} \lambda^{y-1}}{(y-1)!}$, $y = 1, 2, \dots$
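For the Poisson case, the effect of on-site sampling can be worked out directly: weighting a Poisson pmf by y and renormalising gives a Poisson shifted up by one, so the on-site mean is the population rate plus one. The sketch below checks this numerically; the rate value and the function names are illustrative assumptions, not taken from the slides.

```python
import math

def poisson_pmf(y, lam):
    """Poisson(lam) probability of exactly y events."""
    return math.exp(-lam) * lam ** y / math.factorial(y)

def onsite_poisson_pmf(y, lam):
    """Size-biased (on-site) Poisson pmf: y * f(y) / E(y), for y = 1, 2, ..."""
    return y * poisson_pmf(y, lam) / lam

lam = 1.5                      # hypothetical visit rate
support = range(1, 60)         # upper tail beyond 59 is negligible here

onsite = [onsite_poisson_pmf(y, lam) for y in support]
shifted = [poisson_pmf(y - 1, lam) for y in support]

max_gap = max(abs(a - b) for a, b in zip(onsite, shifted))
onsite_mean = sum(y * p for y, p in zip(support, onsite))
print(max_gap, sum(onsite), onsite_mean)
```

Because the on-site count minus one is again Poisson with the same rate, subtracting one from the on-site sample mean recovers the rate in this model.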
    19. 20. Resource Economics
        • Englin & Shonkwiler (1995):
            • The Negative Binomial Model
            • The truncated, stratified model is defined on $y = 1, 2, \dots$
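The same size-biasing step, y f(y) / E(y), can be applied to a negative binomial count. The sketch below is an assumption-laden illustration rather than the exact model of Englin & Shonkwiler (1995): it uses a negative binomial with mean lam and variance lam + alpha * lam^2, and the parameter values and function names are hypothetical.

```python
import math

def negbin_pmf(y, lam, alpha):
    """Negative binomial pmf with mean lam and variance lam + alpha * lam**2
    (an assumed parameterization, computed on the log scale for stability)."""
    r = 1.0 / alpha
    log_p = (math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
             + y * math.log(alpha * lam)
             - (y + r) * math.log(1.0 + alpha * lam))
    return math.exp(log_p)

def onsite_negbin_pmf(y, lam, alpha):
    """Truncated, endogenously stratified pmf: y * f(y) / E(y), y = 1, 2, ..."""
    return y * negbin_pmf(y, lam, alpha) / lam

lam, alpha = 2.0, 0.5          # hypothetical mean and dispersion
support = range(1, 2000)       # tail beyond this range is negligible here

probs = [onsite_negbin_pmf(y, lam, alpha) for y in support]
total_prob = sum(probs)
onsite_mean = sum(y * p for y, p in zip(support, probs))
print(total_prob, onsite_mean, 1.0 + lam + alpha * lam)
```

Under this parameterization the on-site mean is 1 + lam + alpha * lam, so overdispersion adds to the upward shift seen in the Poisson case.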
    20. 21. Resource Economics
        • Nunes (2003): Binary Choice Models
        • The count variable is described by a Poisson distribution with an unobservable heterogeneity term correlated with the error term in a probit binary choice model.
    21. 22. 3. Misspecification of Sampling Probabilities: A Simulation
        • Aim:
            • To assess how large the effect of misspecified sampling probabilities is.
            • What happens if time per visit is correlated with frequency of visits when estimating the expected number of visits?
    22. 23. Misspecification of Sampling Probabilities: A Simulation
        • Time is modeled as a function of frequency of visits when estimating the population mean.
        • The distributions used in the simulation are Poisson, Exponential, and Gamma.
        • The inclusion probabilities are proportional to the time spent at the site.
    23. 24. Misspecification of Sampling Probabilities: A Simulation
        • The three estimators used for the simulation are:
            • The sample mean: $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$
            • Shaw's estimator: $\bar{y} - 1$
            • Cox's estimator: $n \big/ \sum_{i=1}^{n} \frac{1}{y_i}$
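The simulation idea can be sketched in a few lines. This is not a reconstruction of the author's design: the Poisson rate, the population and sample sizes, the link between visit frequency and mean time per visit, and sampling with replacement are all assumptions made for illustration. The sample mean minus one is used for Shaw's estimator and the harmonic mean for Cox's estimator.

```python
import math
import random

def poisson_draw(rng, lam):
    """Knuth's multiplication method for one Poisson(lam) variate."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def sample_mean(ys):
    return sum(ys) / len(ys)

def shaw_estimator(ys):
    """Sample mean minus one (on-site Poisson correction)."""
    return sample_mean(ys) - 1.0

def cox_estimator(ys):
    """Harmonic mean n / sum(1/y_i); requires every y_i >= 1."""
    return len(ys) / sum(1.0 / y for y in ys)

rng = random.Random(1)
lam, N, n = 2.0, 20_000, 20_000
visits = [poisson_draw(rng, lam) for _ in range(N)]

# Hypothetical link: mean time per visit grows with the number of visits.
time_per_visit = [rng.expovariate(1.0 / (1.0 + 0.5 * y)) for y in visits]

w_visits = [float(y) for y in visits]                     # pi_i proportional to y_i
w_time = [y * t for y, t in zip(visits, time_per_visit)]  # pi_i proportional to total time

# On-site samples, drawn with replacement for simplicity.
ok = rng.choices(visits, weights=w_visits, k=n)
mis = rng.choices(visits, weights=w_time, k=n)

for label, ys in (("visits-proportional", ok), ("time-proportional", mis)):
    print(label, round(sample_mean(ys), 3),
          round(shaw_estimator(ys), 3), round(cox_estimator(ys), 3))
```

In this setup the correction that assumes inclusion proportional to the number of visits recovers the rate only when that assumption holds; once time per visit depends on frequency, the same correction drifts upward.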
    24. 25. Simulation Results
        (Column and row labels were not preserved in the transcript; values in parentheses are as given on the slide.)

        Cox's estimator      0.155    0.036    0.176
                            (0.100)  (0.081)  (0.112)
                             0.398    0.567    0.642
                            (0.162)  (0.327)  (0.419)
        Shaw's estimator    -0.220   -0.017    0.118
                            (0.096)  (0.050)  (0.065)
                            -0.311   -0.036    0.058
                            (0.103)  (0.011)  (0.015)
        Sample mean          0.780    0.983    1.118
                            (0.656)  (1.016)  (1.301)
                             0.689    0.964    1.058
                            (0.481)  (0.939)  (1.131)
    25. 26. Summary
        • If the probabilities of sample inclusion of population units are related to the values of the variable measured, the parameter estimates will be biased and inconsistent.
        • Thus the correct specification of the sampling inclusion mechanism should not be neglected!
    26. 27. References
        • Bockstael, N.E., Strand, I.E., McConnell, K.E., Arsanjani, F., 1990. Sample Selection Bias in the Estimation of Recreational Demand Functions: An Application to Sportfishing. Land Economics 66(1), 40-49.
        • Bowker, J.M., Leeworthy, V.R., 1998. Accounting for Ethnicity in Recreation Demand: A Flexible Count Data Approach. Journal of Leisure Research 30(1), 64-78.
        • Bush, A.J., Hair, J.F., 1985. An Assessment of the Mall Intercept as a Data Collection Method. Journal of Marketing Research 22, 158-67.
        • Cameron, T.A., James, M.D., 1987. Efficient Estimation Methods for "Close-Ended" Contingent Valuation Surveys. The Review of Economics and Statistics 69, 269-276.
        • Cox, D.R., 1969. "Some Sampling Problems in Technology" in New Developments in Survey Sampling, N.L. Johnson and H. Smith, eds. New York: Wiley Interscience.
        • Englin, J., Shonkwiler, J.S., 1995. Estimating Social Welfare Using Count Data Models: An Application to Long-Run Recreation Demand under Conditions of Endogenous Stratification and Truncation. Review of Economics and Statistics 77, 104-112.
        • Keillor, B.D., D'Amico, M., Horton, V., 2001. Global Consumer Tendencies. Psychology and Marketing 18, 1-19.
        • Laitila, T., 1998. Estimation of Combined Site-Choice and Trip-Frequency Models of Recreational Demand using Choice-based and On-Site Samples. Economics Letters 64, 17-23.
        • Moons, E., Loomis, J., Proost, S., Eggermont, K., Hermy, M., 2001. Travel Cost and Time Measurement in Travel Cost Models. Faculty of Economics and Applied Economic Sciences, Working Paper series, no. 2001-22.
        • Nakanishi, M., 1978. Frequency Bias in Shopper Surveys, in Proceedings of the American Marketing Association Educators' Conference. Chicago: American Marketing Association, 67-70.
        • Nowell, C., Evans, M.A., McDonald, L., 1988. Length-Biased Sampling in Contingent Valuation Studies. Land Economics 64 (November), 367-71.
        • Nowell, C., Stanley, L.R., 1991. Length-Biased Sampling in Mall Intercept Surveys. Journal of Marketing Research 28, 475-479.
        • Nunes, L.C., 2003. Estimating Binary Choice Models With On-Site Samples. Faculdade de Economia, Universidade Nova de Lisboa.
        • Ovaskainen, V., Mikkola, J., Pouta, E., 2001. Estimating Recreation Demand with On-Site Data: An Application of Truncated and Endogenously Stratified Count Data Models. Journal of Forest Economics 7(2), 125-144.
        • Santos Silva, J.M.C., 1997. Unobservables in Count Data Models for On-Site Samples. Economics Letters 54, 217-220.
        • Satten, G.A., Kong, F., Wright, D.J., Glynn, S.A., Schreiber, G.B., 2004. How Special is a 'Special' Interval: Modeling Departure from Length-Biased Sampling in Renewal Processes. Biostatistics 5(1), 145-151.
        • Shaw, D., 1988. On-Site Samples' Regression: Problems of Non-negative Integers, Truncation, and Endogenous Stratification. Journal of Econometrics 37, 211-223.
        • Simon, R., 1980. Length-Biased Sampling in Etiological Studies. Am. J. Epidem. 111, 444-452.
        • Sudman, S., 1980. Improving the Quality of Shopping Center Sampling. Journal of Marketing Research 17, 423-431.
        • Wang, M-C., 1996. Hazards Regression Analysis for Length-Biased Data. Biometrika 83(2), 343-354.
        • Zelen, M., Feinleib, M., 1969. On the Theory of Screening for Chronic Diseases. Biometrika 56, 601-614.
    27. 28. And finally she stops…