# T5 sampling

Published in: Technology, Education

### T5 sampling

1. Sampling, by Rama Krishna Kompella
2. Learning Objectives
    - Understand how to identify the target respondents
    - Understand sampling and the different types of sampling
    - Understand the sampling process
    - Identify the potential errors in sampling
    - Determine the sample size
3. Census vs. Sampling
    - Two methods of selecting the respondents
        - Census
        - Sampling
    - Census is used
        - When the number of respondents or units of interest is limited, or
        - When data must be gathered from every individual in the population
4. Census vs. Sampling
    - Sampling is used
        - When the population is too large
        - When the population is homogeneous
        - When considerations of time and cost play a major role
5. Sampling Process
    - Define the population
    - Identify the sampling frame
    - Specify the sampling unit
    - Selection of sampling method
    - Determination of sample size
    - Specify sampling plan
    - Selection of sample
6. Sampling Process
    - The population needs to be defined in terms of:

    | Term | Example |
    |---|---|
    | Element | Company's product |
    | Sampling unit | Retail outlet, supermarket |
    | Extent | Hyderabad & Secunderabad |
    | Time | April 10 – May 25 |
8. Sampling Process
    - Identify the sampling frame:
        - Clearly define the universe from which the sample will be picked
        - Ex: When you are studying the purchase behaviour of consumers buying premium cars, your sampling frame will be all the premium car outlets in the city
10. Sampling Process
    - Specify the sampling unit
        - We need to decide whom to contact in order to obtain the required data
        - The sampling unit must be chosen carefully, because we need to be sure the respondent can actually provide the required data
        - Ex: When studying intention to purchase a car, the sampling unit would be people who are employed and have a steady income. If we are studying trends from a dealer perspective, the sampling unit will be the dealers
12. Sampling Process
    - A sampling method must be selected in order to identify the respondents
    - There are two ways of selecting the sample:
        - Probability methods
        - Non-probability methods
14. Sampling Process
    - Decide how many respondents need to be chosen from the population
    - Generally, the sample size depends on the type of research conducted
    - For exploratory research the sample size tends to be small, whereas for conclusive research it tends to be large
16. Sampling Process
    - A sampling plan needs to clearly specify who the target population is
    - Ex: When we are planning to study the purchase pattern of groceries by households, we need to clearly specify what "household" means: a family with kids, DINKs (dual income, no kids), empty nesters, etc.
18. Sampling Design within the Research Process
19. Step 4: Specifying the sampling method
    - Probability Sampling
        - Every element in the target population or universe [sampling frame] has a known, non-zero probability of being chosen in the sample for the survey being conducted (an equal probability in the case of simple random sampling).
        - Scientific, operationally convenient and simple in theory.
        - Results may be generalized.
    - Non-Probability Sampling
        - The probability of any element in the universe [sampling frame] being chosen in the sample is not known.
        - Operationally convenient and simple in theory.
        - Results may not be generalized.
20. Types of Sampling Designs

    | Probability | Nonprobability |
    |---|---|
    | Simple random | Convenience |
    | Complex random | Purposive |
    | Systematic | Judgment |
    | Cluster | Quota |
    | Stratified | Snowball |
    | Double | |
21. Simple Random Sampling
    - In simple random sampling, every item of the population has an equal probability of being chosen
    - Two methods are used in random sampling:
        - Lottery method
        - Random number table
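The lottery method can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the frame of 1000 numbered units, the sample size of 50 and the function name `simple_random_sample` are all assumptions chosen for the example.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; every unit has the same chance of selection."""
    rng = random.Random(seed)  # the seed only makes this sketch reproducible
    return rng.sample(list(population), n)


# Hypothetical sampling frame: 1000 units numbered 1..1000
frame = range(1, 1001)
sample = simple_random_sample(frame, 50, seed=42)
print(sorted(sample)[:5])
```

Drawing without replacement (as `random.sample` does) mirrors pulling tickets from a bowl: once a unit is selected it cannot be selected again.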
22. Simple Random
    - Advantages
        - Easy to implement with random dialing
    - Disadvantages
        - Requires list of population elements
        - Time consuming
        - Uses larger sample sizes
        - Produces larger errors
        - High cost
23. Systematic
    - Advantages
        - Simple to design
        - Easier than simple random
        - Easy to determine sampling distribution of mean or proportion
    - Disadvantages
        - Periodicity within population may skew sample and results
        - Trends in list may bias results
        - Moderate cost
24. Stratified
    - Advantages
        - Control of sample size in strata
        - Increased statistical efficiency
        - Provides data to represent and analyze subgroups
        - Enables use of different methods in strata
    - Disadvantages
        - Increased error will result if subgroups are selected at different rates
        - Especially expensive if strata on population must be created
        - High cost
25. Cluster
    - Advantages
        - Provides an unbiased estimate of population parameters if properly done
        - Economically more efficient than simple random
        - Lowest cost per sample
        - Easy to do without list
    - Disadvantages
        - Often lower statistical efficiency due to subgroups being homogeneous rather than heterogeneous
        - Moderate cost
26. Stratified and Cluster Sampling

    | Stratified | Cluster |
    |---|---|
    | Population divided into few subgroups | Population divided into many subgroups |
    | Homogeneity within subgroups | Heterogeneity within subgroups |
    | Heterogeneity between subgroups | Homogeneity between subgroups |
    | Choice of elements from within each subgroup | Random choice of subgroups |
27. Area Sampling
28. Double Sampling
    - Advantages
        - May reduce costs if first stage results in enough data to stratify or cluster the population
    - Disadvantages
        - Increased costs if indiscriminately used
29. Nonprobability Samples
    - No need to generalize
    - Limited objectives
    - Feasibility
    - Time
    - Cost
30. Nonprobability Sampling Methods
    - Convenience
    - Judgment
    - Quota
    - Snowball
31. Non-probability samples
    - Convenience sampling
        - Drawn at the convenience of the researcher. Common in exploratory research. Does not lead to any conclusion.
    - Judgmental sampling
        - Sampling based on the judgment, gut feeling or experience of the researcher. Common in commercial marketing research projects. If drawing inferences is not necessary, these samples are quite useful.
    - Quota sampling
        - An extension of judgmental sampling, something like a two-stage judgmental sampling. Quite difficult to draw.
    - Snowball sampling
        - Used in studies involving respondents who are hard to find. To start with, the researcher compiles a short list of sample units from various sources. Each of these respondents is then contacted to provide names of other probable respondents.
32. Quota Sampling
    - To select a quota sample comprising 3000 persons in country X using three control characteristics: gender, age and religion.
    - Here, the three control characteristics are considered independently of one another. To calculate the desired number of sample elements possessing the various attributes of the specified control characteristics, the distribution pattern of the general population in country X in terms of each control characteristic is examined.

    | Control characteristic | Attribute | Population distribution | Sample elements |
    |---|---|---|---|
    | Gender | Male | 50.7% | 3000 x 50.7% = 1521 |
    | | Female | 49.3% | 3000 x 49.3% = 1479 |
    | Age | 20-29 years | 13.4% | 3000 x 13.4% = 402 |
    | | 30-39 years | 53.3% | 3000 x 52.3% = 1569 |
    | | 40 years & over | 33.3% | 3000 x 34.3% = 1029 |
    | Religion | Christianity | 76.4% | 3000 x 76.4% = 2292 |
    | | Islam | 14.8% | 3000 x 14.8% = 444 |
    | | Hinduism | 6.6% | 3000 x 6.6% = 198 |
    | | Others | 2.2% | 3000 x 2.2% = 66 |
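The quota arithmetic above (total sample size times each attribute's population share) can be sketched as follows. The function name `quota_allocation` is a hypothetical helper, and the example uses only the gender and religion figures from the slide.

```python
def quota_allocation(total, distribution):
    """Quota per attribute = total sample size x population share (in %)."""
    return {attr: round(total * pct / 100) for attr, pct in distribution.items()}


# Population shares taken from the slide (percent)
gender = {"Male": 50.7, "Female": 49.3}
religion = {"Christianity": 76.4, "Islam": 14.8, "Hinduism": 6.6, "Others": 2.2}

gender_quota = quota_allocation(3000, gender)
religion_quota = quota_allocation(3000, religion)
print(gender_quota)
print(religion_quota)
```

Because each control characteristic is treated independently, each set of quotas sums to the full sample of 3000. With other percentages, rounding may leave the total off by one or two, in which case the quotas need a small manual adjustment.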
33. Types of error
    - Non-sampling error
        - Error associated with collecting and analyzing the data
    - Sampling error
        - Error associated with failing to interview the entire population
34. Non-Sampling Error
    - Coverage error
        - Wrong population definition
        - Flawed sampling frame
        - Interviewer or management error in following sampling frame
    - Response error
        - Badly worded question results in invalid or incorrect response
        - Interviewer bias changes response
    - Non-response error
        - Respondent refuses to take survey or is away
        - Respondent refuses to answer certain questions
    - Processing errors
        - Error in data entry or recording of responses
    - Analysis errors
        - Inappropriate analytical techniques, weighting or imputation are applied
35. Sampling Error
    - Sampling error is known after the data are collected by calculating the Margin of Error and confidence intervals
    - Surveys don't have a Margin of Error, questions do
    - Power analyses use estimates of the parameters involved in calculating the margin of error
    - It is common to see sample sizes of 400 and 1000 for surveys (these are associated with 5% and 3% margins of error)
    - In most cases the size of the population being sampled from is irrelevant
    - The margin of error should be calculated using the size of the subgroups sampled
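The 5% and 3% figures quoted for samples of 400 and 1000 can be checked with the usual margin-of-error formula for a proportion, $z\sqrt{p(1-p)/n}$. The slide does not state its assumptions; this sketch assumes a simple random sample, a 95% confidence level (z = 1.96) and the worst-case proportion p = 0.5.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a proportion; p=0.5 and z=1.96 are assumed defaults."""
    return z * math.sqrt(p * (1 - p) / n)


moe_400 = margin_of_error(400)    # about 0.049, i.e. roughly 5%
moe_1000 = margin_of_error(1000)  # about 0.031, i.e. roughly 3%
print(round(moe_400, 3), round(moe_1000, 3))
```

Note that the population size does not appear in the formula, which is why it is irrelevant in most cases, and that `n` should be the size of the subgroup being analyzed rather than of the whole sample.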
36. What's Next?
    - Computation of sample size
    - Sampling error
37. Key Terms
    - Area sampling
    - Census
    - Cluster sampling
    - Convenience sampling
    - Disproportionate stratified sampling
    - Double sampling
    - Judgment sampling
    - Multiphase sampling
    - Nonprobability sampling
    - Population
    - Population element
    - Population parameters
    - Population proportion of incidence
    - Probability sampling
38. Key Terms
    - Proportionate stratified sampling
    - Quota sampling
    - Sample statistics
    - Sampling
    - Sampling error
    - Sampling frame
    - Sequential sampling
    - Simple random sample
    - Skip interval
    - Snowball sampling
    - Stratified random sampling
    - Systematic sampling
    - Systematic variance
40. Random Number Table
41. Systematic Random Sampling
    - Three steps are followed:
        - Select the sampling interval, K: K = Total Population / Desired Sample Size
        - Select a unit randomly between the first unit and the Kth unit
        - Add K repeatedly to the randomly chosen number to obtain the remaining units
    - Ex: If total population = 1000 and desired sample size = 50, then K = 1000/50 = 20.
        - Randomly select a number between 1 and 20
        - If the number is 17, the sample series will be 17, 37, 57, …
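The three steps can be sketched in Python as below. The function name `systematic_sample` and its `start` and `seed` parameters are assumptions made for this illustration; the numbers reproduce the slide's example.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the unit numbers selected by systematic sampling."""
    k = population_size // sample_size  # sampling interval K
    if start is None:
        # Random start between the first and the Kth unit
        start = random.Random(seed).randint(1, k)
    # Add K repeatedly to the starting unit
    return [start + i * k for i in range(sample_size)]


units = systematic_sample(1000, 50, start=17)
print(units[:3], units[-1])
```

With a population of 1000 and a sample of 50 the interval is 20, so a start of 17 gives 17, 37, 57 and so on up to 997.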
42. Stratified Random Sampling
    - Calculate the percentage of population present in each stratum
    - Determine the sample to be drawn from each stratum
    - Randomly select sample from each stratum
    - Eg: You need to select 40 people from an office, which has the following staff
        - Male, full time: 90
        - Male, part time: 18
        - Female, full time: 9
        - Female, part time: 63
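The slide leaves the office example unfinished. One way to complete the first two steps is proportional allocation, sketched below; `proportional_allocation` is a hypothetical helper name, and proportional (rather than disproportionate) allocation is an assumption.

```python
def proportional_allocation(strata, sample_size):
    """Allocate the sample to strata in proportion to stratum size."""
    total = sum(strata.values())
    return {name: round(sample_size * size / total) for name, size in strata.items()}


staff = {
    "Male, full time": 90,
    "Male, part time": 18,
    "Female, full time": 9,
    "Female, part time": 63,
}
allocation = proportional_allocation(staff, 40)
print(allocation)
```

The office has 180 staff, so the strata make up 50%, 10%, 5% and 35% of the population and receive 20, 4, 2 and 14 of the 40 places. The final step is to draw that many people at random within each stratum.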
43. Some Notations to remember

    | Population parameters | Symbol | Sample notations | Symbol |
    |---|---|---|---|
    | Size | $N$ | Size | $n$ |
    | Mean value | $\mu$ | Mean value | $\bar{x}$ |
    | Percentage value (population proportion) | $P$ | Percentage value (sample proportion) | $\bar{p}$ |
    | | $Q$ or $[1 - P]$ | | $\bar{q}$ or $[1 - \bar{p}]$ |
    | Standard deviation | $\sigma$ | Estimated standard deviation | $\bar{s}$ |
    | Variance | $\sigma^2$ | Estimated sample variance | $\bar{s}^2$ |
    | Standard error (population parameter) | $S_\mu$ or $S_P$ | Estimated standard error (sample statistics) | $S_{\bar{x}}$ or $S_{\bar{p}}$ |

    | Other sampling concepts | Symbol |
    |---|---|
    | Confidence intervals | $CI_{\bar{x}}$ or $CI_{\bar{p}}$ |
    | Tolerance level of error | $e$ |
    | Critical z-value | $Z_B$ |
    | Confidence levels | $CL$ |
    | Finite correction factor, $\sqrt{(N - n)/(N - 1)}$ (also referred to as "finite multiplier" or "finite population correction") | fcf |
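As a small illustration of the finite correction factor defined above, the sketch below evaluates $\sqrt{(N - n)/(N - 1)}$. The values N = 1000 and n = 50 are hypothetical and chosen only for the example.

```python
import math


def finite_correction_factor(N, n):
    """fcf = sqrt((N - n) / (N - 1)) for population size N and sample size n."""
    return math.sqrt((N - n) / (N - 1))


fcf = finite_correction_factor(1000, 50)  # hypothetical N and n
print(round(fcf, 4))
```

The factor is close to 1 when the sample is a small fraction of the population and shrinks toward 0 as the sample approaches a full census.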
44. Central Limit Theorem
    - The theorem states that for almost all defined target populations (virtually regardless of the actual shape of the original population), the sampling distribution of the mean ($\bar{x}$) or the percentage ($\bar{p}$) value derived from a simple random sample will be approximately normally distributed, provided that the sample size is sufficiently large (i.e., when $n \ge 30$).
    - In turn, the sample mean value ($\bar{x}$) of that random sample, with an estimated sampling error ($S_{\bar{x}}$), fluctuates around the true population mean value ($\mu$) with a standard error of $\sigma/\sqrt{n}$ and has an approximately normal sampling distribution, regardless of the shape of the probability frequency distribution curve of the overall target population.
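A quick simulation can make the theorem concrete. This sketch is not from the slides: it assumes a strongly skewed population (exponential with mean 1 and standard deviation 1), samples of size n = 30, and 2000 repeated samples, all chosen arbitrarily for illustration.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the sketch is reproducible
n = 30                   # sample size suggested by the rule of thumb
replicates = 2000        # number of repeated samples (arbitrary)

# Mean of each sample drawn from a skewed exponential population (mu = sigma = 1)
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(replicates)
]

mean_of_means = statistics.fmean(sample_means)  # should sit near mu = 1
sd_of_means = statistics.stdev(sample_means)    # should sit near sigma / sqrt(n)
expected_se = 1.0 / n ** 0.5
print(round(mean_of_means, 3), round(sd_of_means, 3), round(expected_se, 3))
```

Even though the population itself is far from normal, the sample means cluster around the population mean with a spread close to $\sigma/\sqrt{n}$.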
45. Normal Curve
46. Sampling Error
    - Sampling error is any type of bias that is attributable to mistakes made in
        - either the selection process of prospective sampling units or
        - determining the sample size
47. Statistical Precision
    - Using several statistical methods, the researcher will be able to specify the critical tolerance level of error (i.e., allowable margin of error) prior to undertaking a research study
    - This critical tolerance level of error ($e$) represents general precision ($S$) when no specific confidence level is required, or precise precision [$(S)(Z_{B,CL})$] when a specific level of confidence is required
48. Statistical Precision
    - General precision can be viewed as the amount of general sampling error associated with the given sample of raw data that was generated through some type of data collection activity.
    - Precise precision represents the amount of measured sampling error associated with the raw data at a specified level of confidence
49. Statistical Precision
    - When attempting to measure the precision of raw data, researchers must incorporate the theoretical understanding of the concepts of
        - sampling distributions,
        - the central limit theorem, and
        - estimated standard error

      in order to calculate the necessary confidence intervals.
50. Estimated Standard Error
    - Estimated standard error, also referred to as general precision, gives the researcher a measurement of the sampling error and an indication of how far the sample result lies from the actual target population parameter value estimate.
    - The formula to compute the estimated standard error of a sample mean value ($S_{\bar{x}}$) is
        - $S_{\bar{x}} = \bar{s}/\sqrt{n}$
    - where $\bar{s}$ = estimated standard deviation of the sample mean
    - $n$ = sample size
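The formula can be applied directly, as in this minimal sketch. The eight data values are made up for illustration, and the function name `estimated_standard_error` is an assumption.

```python
import math
import statistics


def estimated_standard_error(values):
    """S_xbar = s / sqrt(n), with s the sample standard deviation."""
    s = statistics.stdev(values)  # uses the n - 1 denominator
    return s / math.sqrt(len(values))


data = [12, 15, 11, 14, 13, 16, 10, 13]  # hypothetical measurements
se = estimated_standard_error(data)
print(round(se, 3))
```

For these values the sample standard deviation is 2 and n is 8, so the estimated standard error is 2/√8, about 0.707.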
51. Confidence Interval
    - A confidence interval represents a statistical range of values within which the true value of the target population parameter is expected to lie
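A confidence interval for a mean is commonly built as the sample mean plus or minus a critical z-value times the estimated standard error. The sketch below assumes a 95% confidence level (z = 1.96) and uses made-up data; neither comes from the slides. For a sample this small a t-value would normally replace z, so treat the numbers as illustrative only.

```python
import math
import statistics


def confidence_interval(values, z=1.96):
    """Return (low, high) = mean -/+ z * s / sqrt(n); z=1.96 is an assumed default."""
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean - z * se, mean + z * se


low, high = confidence_interval([12, 15, 11, 14, 13, 16, 10, 13])
print(round(low, 3), round(high, 3))
```

With a sample mean of 13 and a standard error of about 0.707, the interval runs from roughly 11.6 to 14.4.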
52. Z-Score
53. Determining Sample Size
    - Three factors play an important role in determining appropriate sample sizes:
        1. The variability of the population characteristic under investigation ($\sigma_\mu$ or $\sigma_P$).
            - The greater the variability of the characteristic, the larger the size of the sample necessary.
        2. The level of confidence desired in the estimate ($CL$).
            - The higher the level of confidence desired, the larger the sample size needed.
        3. The degree of precision desired in estimating the population characteristic ($e$).
            - The more precise the required sample results (i.e., the smaller the $e$), the larger the necessary sample size.
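The three factors combine in the standard textbook formulas $n = (Z\sigma/e)^2$ for a mean and $n = Z^2 P(1-P)/e^2$ for a proportion. The transcript does not reproduce the slide's own formula, so the sketch below is an assumption based on those standard forms, with hypothetical function names and a default of z = 1.96 (95% confidence).

```python
import math


def sample_size_for_mean(sigma, e, z=1.96):
    """n = (z * sigma / e)^2, rounded up to a whole respondent."""
    return math.ceil((z * sigma / e) ** 2)


def sample_size_for_proportion(p, e, z=1.96):
    """n = z^2 * p * (1 - p) / e^2, rounded up to a whole respondent."""
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)


n_mean = sample_size_for_mean(sigma=10, e=2)        # hypothetical sigma and e
n_5pct = sample_size_for_proportion(p=0.5, e=0.05)  # worst-case p, 5% error
n_3pct = sample_size_for_proportion(p=0.5, e=0.03)  # worst-case p, 3% error
print(n_mean, n_5pct, n_3pct)
```

Each factor moves the result in the direction the slide describes: a larger sigma or z raises n, and so does a smaller e. The proportion cases give 385 and 1068, in line with the common survey sizes of about 400 and 1000.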
54. Determining the Sample Size
55. Q & As