Sampling 1231243290208505 1


  1. 1. Revisiting Sampling Concepts
  2. 2. Population <ul><li>A population is all the possible members of a category </li></ul><ul><li>Examples: </li></ul><ul><ul><ul><li>the heights of every male or every female </li></ul></ul></ul><ul><ul><ul><li>the temperature on every day since the beginning of time </li></ul></ul></ul><ul><ul><ul><li>Every person who ever has, and ever will, take a particular drug </li></ul></ul></ul>
  3. 3. Sample <ul><li>A sample is some subset of a population </li></ul><ul><ul><li>Examples: </li></ul></ul><ul><ul><ul><li>The heights of 10 students picked at random </li></ul></ul></ul><ul><ul><ul><li>The participants in a drug trial </li></ul></ul></ul><ul><li>Researchers seek to select samples that accurately reflect the broader population from which they are drawn. </li></ul>
  4. 4. [Diagram: a sample is drawn from the population; sample statistics are used to make inferences about population parameters.] Samples are drawn to infer something about the population.
  5. 5. Reasons to Sample <ul><li>Ideally, a decision maker would like to consider every item in the population, but: </li></ul><ul><li>Contacting the whole population would be too time-consuming, e.g. election polls </li></ul><ul><li>The cost of such a study might be too high </li></ul><ul><li>In many cases the whole population would be consumed or destroyed if every item were tested </li></ul><ul><li>The sample results are usually adequate </li></ul>
  6. 6. Probability Vs Non Probability Sampling <ul><li>Probability Sampling </li></ul><ul><li>Drawing samples in a random manner </li></ul><ul><li>Using random numbers </li></ul><ul><li>Writing names on identical cards or slips and then drawing randomly </li></ul><ul><li>Choosing every nth item of the population (systematic sampling) </li></ul><ul><li>First dividing the population into homogeneous groups and then drawing samples randomly from each group (stratified sampling) </li></ul>
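The probability sampling methods listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 numbered items, the two strata, and the function names are hypothetical and do not come from the slides.

```python
import random

def simple_random_sample(population, k, seed=0):
    # Draw k distinct items, each equally likely (like drawing slips from a hat).
    rng = random.Random(seed)
    return rng.sample(population, k)

def systematic_sample(population, step, seed=0):
    # Pick a random starting point, then take every step-th item.
    rng = random.Random(seed)
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, k_per_stratum, seed=0):
    # Draw a simple random sample separately from each homogeneous group.
    rng = random.Random(seed)
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}

# Hypothetical population: items numbered 1..100, split into two strata.
population = list(range(1, 101))
strata = {"low": population[:50], "high": population[50:]}

srs = simple_random_sample(population, 10)
systematic = systematic_sample(population, 10)
stratified = stratified_sample(strata, 5)

print(srs)
print(systematic)
print(stratified)
```

The fixed seed only makes the runs repeatable; in practice the seed would be left unset so each draw differs.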
  7. 7. Probability Vs Non Probability Sampling <ul><li>Non Probability Sampling </li></ul><ul><li>man-on-the-street interviews </li></ul><ul><li>call-in surveys </li></ul><ul><li>readership surveys </li></ul><ul><li>web surveys </li></ul>
  8. 8. Types of Variables <ul><li>Qualitative (Categorical) </li></ul><ul><li>Quantitative (Numerical) </li></ul><ul><ul><li>Discrete </li></ul></ul><ul><ul><li>Continuous </li></ul></ul>
  9. 9. Sampling Error <ul><li>“Sampling error is simply the difference between the estimates obtained from the sample and the true population value.” </li></ul><ul><li>Sampling Error = x̄ − µ </li></ul><ul><li>Where </li></ul><ul><li>x̄ = Mean of the Sample </li></ul><ul><li>µ = Mean of the Population </li></ul>
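As a small worked example of the definition above, the sketch below computes the sampling error for one sample. The population and the sample are assumed values chosen for illustration.

```python
from statistics import mean

# Assumed toy data: a three-item population and one sample drawn from it.
population = [1, 2, 3]
sample = [1, 2]

mu = mean(population)        # population mean
x_bar = mean(sample)         # sample mean
sampling_error = x_bar - mu  # sample mean minus population mean

print(sampling_error)
```

A different sample from the same population would generally give a different sampling error, which is why the distribution of sample means is studied next.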
  10. 10. Validity of Sampling Process
  11. 11. Sampling Distributions <ul><li>A distribution of all possible values of a statistic calculated from all possible samples of size n drawn from a population is called a Sampling Distribution. </li></ul><ul><li>Three things we want to know about any distribution: </li></ul><ul><li>– Central Tendency </li></ul><ul><li>– Dispersion </li></ul><ul><li>– Shape </li></ul>
  12. 12. Sampling Distribution of Means <ul><li>Suppose a population consists of three numbers: 1, 2 and 3 </li></ul><ul><li>All the possible samples of size 2 are drawn from the population </li></ul><ul><li>Mean of the Pop (µ) = (1 + 2 + 3)/3 = 2 </li></ul><ul><li>Variance (σ²) = [(1 − 2)² + (2 − 2)² + (3 − 2)²]/3 = 0.67 </li></ul><ul><li>Standard Deviation (σ) = √0.67 = 0.82 </li></ul>
  13. 13. Distribution of the Population
  14. 14. Sampling distribution of means n = 2
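  15. 15. <table><tr><th>Sample #</th><th>Sample</th><th>Sample Mean</th></tr><tr><td>1</td><td>1,1</td><td>1</td></tr><tr><td>2</td><td>1,2</td><td>1.5</td></tr><tr><td>3</td><td>1,3</td><td>2</td></tr><tr><td>4</td><td>2,1</td><td>1.5</td></tr><tr><td>5</td><td>2,2</td><td>2</td></tr><tr><td>6</td><td>2,3</td><td>2.5</td></tr><tr><td>7</td><td>3,1</td><td>2</td></tr><tr><td>8</td><td>3,2</td><td>2.5</td></tr><tr><td>9</td><td>3,3</td><td>3</td></tr></table> Mean of the sample means = 2 = µ; SD of the sample means = 0.6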
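  16. 17. <ul><li>The population’s distribution has far more variability than that of the sample means </li></ul><ul><li>As the sample size increases, the dispersion of the sample means decreases </li></ul><ul><li>Here the SD of the sample means (0.6) is less than the population SD (0.8), while the mean of the sample means equals µ </li></ul>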
  17. 18. <ul><li>The mean of the sampling distribution of ALL the sample means is equal to the true population mean. </li></ul><ul><li>The standard deviation of a sampling distribution, called the Standard Error, is calculated as σ/√n, where σ is the population standard deviation and n is the sample size </li></ul>
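The three-number example can be checked by brute force. The sketch below enumerates every sample of size 2 drawn with replacement from the population 1, 2, 3, then compares the standard deviation of the sample means with σ/√n. Variable names are illustrative.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [1, 2, 3]
n = 2  # sample size

mu = mean(population)       # population mean
sigma = pstdev(population)  # population standard deviation

# All ordered samples of size n, drawn with replacement (9 samples here).
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

mean_of_means = mean(sample_means)   # should equal mu
sd_of_means = pstdev(sample_means)   # spread of the sample means
standard_error = sigma / sqrt(n)     # the formula sigma / sqrt(n)

print(len(samples), mean_of_means, round(sd_of_means, 2), round(standard_error, 2))
```

Both the enumerated standard deviation and the formula come out at about 0.58, which the slides round to 0.6.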
  18. 19. Central Limit Theorem …… <ul><li>The variability of a sample mean decreases as the sample size increases </li></ul><ul><li>If the population distribution is normal, so is the sampling distribution of the mean </li></ul><ul><li>For any population with finite variance (regardless of its shape), the distribution of sample means will approach a normal distribution as n increases </li></ul><ul><li>It can be demonstrated with the help of simulation. </li></ul>
  19. 20. Central Limit Theorem …… <ul><li>How large is a “large sample”? </li></ul><ul><li>It depends upon the form of the distribution from which the samples were taken </li></ul><ul><li>If the population distribution deviates greatly from normality, larger samples will be needed to approximate normality. </li></ul>
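One way to run the simulation mentioned above is sketched below. It assumes a strongly skewed population, an exponential distribution with mean 1 and standard deviation 1, which is a choice made here for illustration and is not taken from the slides. For each sample size it reports the mean, spread, and skewness of 5,000 simulated sample means.

```python
import random
from math import sqrt
from statistics import mean, pstdev

def sample_means(n, trials=5000, seed=1):
    # Draw `trials` samples of size n from an exponential population
    # (mean 1, SD 1) and return the mean of each sample.
    rng = random.Random(seed)
    return [mean(rng.expovariate(1.0) for _ in range(n)) for _ in range(trials)]

def skewness(values):
    # Third standardized moment; roughly 0 for a symmetric, bell-shaped distribution.
    m = mean(values)
    s = pstdev(values)
    return mean((v - m) ** 3 for v in values) / s ** 3

results = {}
for n in (2, 10, 50):
    means = sample_means(n)
    results[n] = (mean(means), pstdev(means), skewness(means))
    print(f"n={n:2d}  mean={results[n][0]:.3f}  sd={results[n][1]:.3f}  "
          f"sigma/sqrt(n)={1 / sqrt(n):.3f}  skewness={results[n][2]:.2f}")
```

As n grows, the spread of the sample means should track σ/√n and the skewness should shrink toward zero, which is the behaviour the theorem describes. Because the population is heavily skewed, small samples still show visible skew.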
  20. 22. Implications of CLT <ul><li>A light bulb manufacturer claims that the life span of its light bulbs has a mean of 54 months and a standard deviation of 6 months. A consumer advocacy group tests 50 of them. Assuming the manufacturer’s claims are true, what is the probability that it finds a mean lifetime of less than 52 months? </li></ul>
  21. 23. Implications of CLT Cont <ul><li>From the data we know that </li></ul><ul><li>µ = 54 months, σ = 6 months, n = 50 </li></ul><ul><li>By the Central Limit Theorem </li></ul><ul><li>Mean of the sample means = µ = 54 </li></ul><ul><li>Standard error = σ/√n = 6/√50 = 0.85 </li></ul>
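  22. 24. [Figure: normal curve of sample means centered at 54 (z = 0), with an area of 0.0094 shaded to the left of 52 (z = −2.35)]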
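  23. 25. <ul><li>To find the probability that the sample mean is less than 52, we need to convert to a z-score: z = (52 − 54)/0.85 = −2.35, where 0.85 = σ/√n = 6/√50 </li></ul><ul><li>From the area table, the area between z = 0 and z = 2.35 is 0.4906 </li></ul><ul><li>Hence, the probability of this happening is 0.5 − 0.4906 = 0.0094. </li></ul><ul><li>If the manufacturer’s claims are true, there is a 99.06% probability that this will not happen </li></ul>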
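The light bulb calculation can be reproduced without an area table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) and assumes, as the slides do, that the sample mean is approximately normal.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 54, 6, 50  # claimed mean, claimed SD, sample size
threshold = 52            # sample mean of interest, in months

standard_error = sigma / sqrt(n)       # SD of the sample mean
z = (threshold - mu) / standard_error  # z-score of the threshold
p = NormalDist().cdf(z)                # P(sample mean < 52)

print(round(standard_error, 4), round(z, 4), round(p, 4))

# Same lookup with z rounded to -2.35, as in the table-based calculation.
p_table = NormalDist().cdf(-2.35)
print(round(p_table, 4))
```

The unrounded z-score gives a probability of about 0.0092. Rounding z to −2.35 first, as a printed table requires, gives the 0.0094 quoted in the slides.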
  24. 26. What can go wrong <ul><li>Statistics can be manipulated by intentionally taking biased samples </li></ul><ul><li>Examples </li></ul><ul><li>Asking leading questions in interviews and questionnaires </li></ul><ul><li>A survey which showed that 2 out of 3 dentists recommend a particular brand of toothpaste </li></ul><ul><li>Sometimes a particular portion of the population does not respond, which affects the sampling design </li></ul>
  25. 27. How to do it right <ul><li>Make sure that the sample truly represents the population </li></ul><ul><li>Use random selection where possible </li></ul><ul><li>Avoid personal bias </li></ul><ul><li>Avoid measurement bias </li></ul><ul><li>Do not make any decisions about the population based on the samples until you have applied statistical inferential techniques to the sample. </li></ul>
