Samples Types and Methods


Samples types, methods of sampling, probability, random, sampling error, sample size


  1. 1. Sampling Types and Methods. Professor Tarek Tawfik Amin, Public Health Dept., Faculty of Medicine, Cairo University
  2. 2. Objectives: By the end of the lectures, the 4th-year medical student should be able to: 1- Define the indications for using a sample versus the whole population in research. 2- Define the meaning, concepts and rules of probability and non-probability sampling techniques. 3- Enumerate the different types of random sampling techniques, define the indications for using each, and use the random digit table to draw a simple random sample. 4- Identify the advantages and uses of non-probability sampling.
  3. 3. In research, what are we looking for? The variable: a condition, quality or trait that varies from one case to another in the target population (population of interest). It is studied in either the whole population OR a sample.
  4. 4. The concept of sampling. Study population: sampling units. You select a few sampling units from the study population (the sample). You collect information from these people to find answers to your research questions. You then make an estimate (“prediction”) that is extrapolated to the study population (prevalence, outcomes, etc.).
  5. 5. Basic Terms and Concepts: Target Population and Sample. A population is a complete set of units with a specified set of characteristics, while a sample is a subset of the population. In research, the defining characteristics of a population include geographic, clinical, demographic and temporal characteristics.
  6. 6. Basic Terms and Concepts. Clinical and demographic characteristics define the target population, the large set of people throughout the world to which the results will be generalized (example: all teenagers with asthma). The study sample is the subset of the target population available for study (example: teenagers with asthma in the investigator’s town in 2005).
  7. 7. Steps in designing the protocol for choosing the study subjects. Research question (truth in the universe): target population; specify clinical and demographic, then geographic and temporal, characteristics. Study plan (findings in the study): intended sample; specify the accessible population and the approach to selecting the sample. The design step links the research question to the study plan.
  8. 8. Selection Criteria • How would you define the population to be studied? • By establishing selection criteria, which comprise inclusion and exclusion criteria. • Example: describe the selection criteria for subjects in a study evaluating the efficacy of calcium supplements for preventing osteoporosis.
  9. 9. Designing selection criteria for a clinical trial of calcium supplements to prevent osteoporosis. Inclusion criteria (be specific). Considerations: specify the characteristics that define populations that are relevant to the research question and efficient for study: demographic (age, sex and race); clinical characteristics; geographic (administrative); temporal characteristics. Example: a 5-year trial of calcium supplementation for preventing osteoporosis might specify that the subjects be: white females 50 to 60 years old; in good general health; patients attending a clinic at X Hospital; between January 1st and December 31st of next year.
  10. 10. Designing selection criteria for a clinical trial of calcium supplements to prevent osteoporosis. Exclusion criteria (be parsimonious). Considerations: specify the subsets of the population that will not be studied because of: a high likelihood of being lost to follow-up; an inability to provide good data; being at high risk of side effects; characteristics that make it unethical to withhold the study treatment. Example: the calcium supplementation trial might exclude subjects who: are alcoholic or plan to move out of the country or region; are disoriented or have a language barrier; have sarcoidosis or hypercalcemia; are taking steroids.
  11. 11. Clinical versus Community populations. If the research question involves patients with a disease, hospitalized or clinic-based patients are inexpensive and easy to recruit, but the selection factors that determine who comes to the hospital or clinic may have an important effect. Tertiary clinics tend to accumulate patients with serious forms of disease. Samples drawn from the community to represent a non-clinical population (population-based samples) are difficult and expensive to recruit, but they are particularly useful for guiding public health and clinical practice in the community.
  12. 12. The Sample Population. Research question (truth in the universe) and study plan (truth in the study). Step 1, target population: specify clinical and demographic characteristics; criterion for selection: suited to the research question. Step 2, accessible population: specify temporal and geographic characteristics; criteria: representative of the target population and easy to study. Step 3, sample population: define the approach to sampling; criteria: representative of the accessible population and easy to do.
  13. 13. Terms and Concepts • The population (“universe”) is the whole collection of units from which a sample may be drawn. • The sampling units may be hospitals, institutions, houses, schools, villages, records or events, and are not necessarily individuals. • The sampling frame describes in detail the study units amenable to sampling.
  14. 14. Characteristics of a Good Sample • Adequately representative of the target population, so as to minimize bias (systematic error). • Large enough to minimize random variation, i.e. the chance differences that might occur between the sample and the target population.
  15. 15. The whole population • If we are interested in the characteristics of each individual, particularly with descriptive research questions, there is a need for generalizing the findings. • Probability sampling is the gold standard. • It provides a rigorous basis for estimating the fidelity with which phenomena observed in the sample represent those in the population, and for computing statistical significance and confidence intervals.
  16. 16. The whole population. A. It is expensive. B. It is time consuming. C. It has higher chances of error because of the many persons, the equipment and the wide geographic area covered. Study of the whole population is carried out in censuses.
  17. 17. Sampling. Resorted to if we are interested in studying the prevalence of a problem, associations, intervention effects, etc. A. It is less expensive. B. It is less time consuming. C. It has lower chances of error because fewer persons, less equipment and a smaller geographic area are involved. D. It allows for continuous study of the population (longitudinal study). Most research is carried out on a sample.
  18. 18. Principles of sampling. I. In the majority of cases of sampling there will be a difference between the sample statistic and the true population mean, which is attributable to the selection of the units in the sample (“sampling error”). II. The greater the sample size, the more accurate the estimate of the true population mean (“reduction in sampling error”). III. The greater the variability of the variable under study in a population (a “heterogeneous variable”), for a given sample size, the greater the difference between the sample statistic and the true population mean (“the larger the sampling error”).
  19. 19. Sampling error. Four individuals A, B, C, D: A = 18 years, B = 20 years, C = 23 years, D = 25 years. Their mean age is (18 + 20 + 23 + 25)/4 = 86/4 = 21.5 years (population mean).
  20. 20. Possible samples of two individuals (6 combinations): A+B = (18+20)/2 = 19.0 years. A+C = (18+23)/2 = 20.5 years. A+D = (18+25)/2 = 21.5 years. B+C = (20+23)/2 = 21.5 years. B+D = (20+25)/2 = 22.5 years. C+D = (23+25)/2 = 24.0 years. Sampling error = population mean - sample mean, which ranges from -2.5 to +2.5 years. Possible samples of three individuals (4 combinations): A+B+C = (18+20+23)/3 = 20.33 years. A+B+D = (18+20+25)/3 = 21.00 years. A+C+D = (18+23+25)/3 = 22.00 years. B+C+D = (20+23+25)/3 = 22.67 years. Sampling error = population mean - sample mean, which ranges from -1.17 to +1.17 years. If the ages were more variable, e.g. B = 26, C = 32 and D = 40 years (population mean 29 years), a sample of 2 would carry a sampling error of -7.00 to +7.00 years and a sample of 3 an error of -3.67 to +3.67 years. The greater the variability of a given variable, the larger the sampling error for a given sample size.
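The four-person example above can be checked by brute force. The following is a minimal sketch, not part of the original lecture: the helper name `sampling_error_range` is an illustrative assumption, and it simply enumerates every possible sample of a given size and reports the smallest and largest sampling error (population mean minus sample mean).

```python
from itertools import combinations
from statistics import mean


def sampling_error_range(ages, sample_size):
    """Return (lowest, highest) sampling error over all possible samples.

    Sampling error is defined as in the slides:
    population mean minus sample mean.
    """
    population_mean = mean(ages)
    errors = [
        population_mean - mean(sample)
        for sample in combinations(ages, sample_size)
    ]
    return min(errors), max(errors)


ages = [18, 20, 23, 25]  # individuals A, B, C, D from the slides

for size in (2, 3):
    low, high = sampling_error_range(ages, size)
    print(f"samples of {size}: error from {low:+.2f} to {high:+.2f} years")
```

Passing a more spread-out list of ages to the same function shows the third principle directly: the error range widens for the same sample size.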
  21. 21. Types of sampling. Random/probability: simple; stratified (proportionate, disproportionate); cluster (single stage, double stage, multi-stage). Non-random/non-probability: quota; accidental; judgmental; snowball. Mixed sampling: systematic sampling.
  22. 22. Types of Samples. Probability samples: units are selected according to probability laws, i.e. everyone in the underlying population has an equal (or a specified) and independent chance of appearing in the sample. Non-probability (convenience) samples: units are selected based on known factors. In clinical research the study sample is usually made up of people who meet the inclusion criteria and are easily accessible to the investigator.
  23. 23. Probability Samples. In order to be able to infer from sample results to the underlying population, the sample should be representative, i.e. it should represent the population from which it is drawn in every respect. Because we cannot anticipate all the characteristics of the population that the sample should represent, we choose a probability (random) sample.
  24. 24. How to draw a probability sample? I. Identify the study units (individuals, villages, houses, etc.). II. Make a complete list of the study units in the underlying population. That complete list is known as the sampling frame. III. Give each of these units a number. IV. Select the required number of units (the sample size) at random from that frame.
  25. 25. The selection of units can be made by: 1. The lottery method or “fishbowl draw” (the numbers of the frame units are written on identical pieces of paper, mixed thoroughly in a bowl, and the required number is blindly picked). 2. Random number tables. 3. Computer-generated random numbers. Two systems of drawing a random sample: sampling without replacement and sampling with replacement.
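As an illustration of computer-generated selection, here is a minimal sketch using Python's standard `random` module. The frame of 500 numbered units, the sample size of 20, the seed and the function name `draw_sample` are all assumptions made for the example, not values from the lecture.

```python
import random


def draw_sample(frame, sample_size, replacement=False, seed=None):
    """Draw units at random from a numbered sampling frame.

    Without replacement, each unit can be picked at most once.
    With replacement, a unit goes back "into the bowl" after each draw,
    so the same unit may appear more than once.
    """
    rng = random.Random(seed)  # seeded only to make the example repeatable
    if replacement:
        return rng.choices(frame, k=sample_size)
    return rng.sample(frame, sample_size)


# Hypothetical frame: 500 study units numbered 1..500.
frame = list(range(1, 501))

without_replacement = draw_sample(frame, 20, seed=1)
with_replacement = draw_sample(frame, 20, replacement=True, seed=1)
```

Sampling without replacement is the usual choice when the frame lists individuals, since nobody can be studied twice.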
  26. 26. Random number table
  27. 27. Random Sampling Techniques: 1-Simple random sample 2-Stratified random sample 3-Systematic random sample 4-Cluster random sample 5-Multistage random sample
  28. 28. 1-Simple random sample. We prepare a complete and up-to-date list of the underlying population (sampling frame). The specified sample size is drawn from that frame at random. Disadvantages: • Suitable only for a homogeneous population (e.g. single sex). • A larger sample size is required. • More expensive, as we have to get the cases from widely scattered areas. • Time consuming and more laborious. • Some groups might not be represented in the sample. • Extreme values can occur by chance.
  29. 29. Example of a simple random sample using a random digit table. Draw at random a sample of size 50 from a population of 10,000. Prepare the sampling frame and give each subject a number. A. The size of the population is 10,000, i.e. it is formed of 5 digits. B. Select at random a page from the random number tables. C. Select 5 adjacent columns (5 digits). D. Proceed from the top down (blindly); any value falling between 00001 and 10000 is chosen, and so on until you have completed your 50 cases. E. Duplicate numbers are set aside. F. The individuals with those 50 numbers compose our sample.
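Steps A to F can be mimicked in code. The sketch below is an approximation under stated assumptions: a pseudo-random generator stands in for the printed random digit table, and the function name `draw_from_digit_table` and the seed are illustrative only.

```python
import random


def draw_from_digit_table(population_size, sample_size, seed=None):
    """Imitate reading a random digit table column by column.

    Each "table entry" has as many digits as the population size
    (10,000 has 5 digits, so entries run from 00000 to 99999).
    Entries outside 1..population_size and duplicates are skipped.
    """
    rng = random.Random(seed)  # stand-in for the printed table (assumption)
    width = len(str(population_size))
    chosen = []
    while len(chosen) < sample_size:
        entry = rng.randrange(10 ** width)  # next 5-digit entry
        if 1 <= entry <= population_size and entry not in chosen:
            chosen.append(entry)
    return chosen


sample_ids = draw_from_digit_table(10_000, 50, seed=42)
```

Most entries are rejected because only about one in ten 5-digit values falls at or below 10,000, which is also why the manual procedure can take a while.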
  30. 30. Simple random sampling (figure: sampling frame and random table). Random table excerpt: 26804 00010 93445 90720 12805 58563 85027 32242 86468 09362 16212 00128 64590 75362 32348 29273 34703 23763 96215 01556 63708 59207 22211 48522 49674 01534 98685 04104 00047 14986
  31. 31. 2-Stratified random sampling • Based upon the logic of heterogeneity of the included variables (variations in population characteristics and distribution, which may result in dominance of some strata and ignoring of others). • Ensures homogeneity of sub-populations through dividing them into strata.
  32. 32. 2-Stratified random sample • Ensures representativeness with regard to important characteristics such as age, sex, educational or socio-economic levels. • The population is divided into strata (subgroups) according to the different levels of the important variable. The population in each stratum is homogeneous, so sampling accuracy is increased. • We choose a simple random sample from each stratum, the size of which is proportionate to the size of that stratum. In other words, the sampling fraction is the same for each stratum and for the total sample: $\frac{n_1}{N_1} = \frac{n_2}{N_2} = \frac{n_3}{N_3} = \frac{n}{N}$, where $n_i$ and $N_i$ are the sample size and population size of stratum $i$, and $n$ and $N$ are the total sample size and total population size.
  33. 33. Example of a stratified random sample. A town with a total population of 12,000 was classified into 4 homogeneous socioeconomic strata. The population of each stratum was 2,000 (class I), 4,000 (class II), 5,000 (class III) and 1,000 (class IV). A sample of 600 is to be drawn from the town. Calculate the number of individuals to be drawn at random from each of the 4 strata. Sampling fraction $= \frac{600}{12{,}000} = \frac{1}{20}$. Stratum 1 sample $= 2{,}000 \times \frac{1}{20} = 100$. Stratum 2 sample $= 4{,}000 \times \frac{1}{20} = 200$. Stratum 3 sample $= 5{,}000 \times \frac{1}{20} = 250$. Stratum 4 sample $= 1{,}000 \times \frac{1}{20} = 50$.
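The proportionate allocation in this example is a one-line calculation per stratum. A minimal sketch follows; the function name `proportionate_allocation` and the dictionary keys are illustrative assumptions, while the stratum sizes and total sample size come from the example.

```python
from fractions import Fraction


def proportionate_allocation(strata_sizes, total_sample):
    """Apply the same sampling fraction n/N to every stratum.

    Returns the sampling fraction and the sample size per stratum.
    Results are rounded to whole persons; when the fraction does not
    divide evenly, the rounded sizes may not add up exactly to n.
    """
    population = sum(strata_sizes.values())
    fraction = Fraction(total_sample, population)
    allocation = {
        name: round(size * fraction) for name, size in strata_sizes.items()
    }
    return fraction, allocation


strata = {"class I": 2000, "class II": 4000, "class III": 5000, "class IV": 1000}
fraction, allocation = proportionate_allocation(strata, 600)
print(fraction)    # sampling fraction n/N
print(allocation)  # sample size drawn at random from each stratum
```

Once the sizes are known, a simple random sample of that size is drawn separately within each stratum.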
  34. 34. 3-Systematic random sample. 1. The underlying population is divided into intervals: the size of each interval = the size of the population ÷ the required sample size (indicated in small, fully identified populations). 2. The first case is selected at random from the first interval, and the others are selected by systematically adding the size of the interval. 3. Accordingly, we take every nth individual, where n is the size of the interval. If the latter is 10, we take every tenth observation.
  35. 35. Example of a systematic random sample. 1000 patients visit the Kasr AlAiny outpatient clinics every day. We need a systematic random sample of 100 patients. How should we proceed in selecting the 100 patients composing our sample? We divide the patients into 100 intervals and select a patient from each. Size of each interval = 1000/100 = 10. Choose at random a number that lies between 1 and 10, say 9. From the second interval choose patient number 19 ($9 + 10 = 19$, or $9 + 1 \times 10 = 19$). From the third interval choose patient number 29 ($19 + 10 = 29$, or $9 + 2 \times 10 = 29$).
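The outpatient example translates into a few lines of code. This is a minimal sketch: the function name `systematic_sample` and its optional `start` argument are assumptions for illustration, and the population size is assumed to be an exact multiple of the sample size, as it is in the example.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Select every k-th unit, where k = population size / sample size.

    `start` is the unit picked from the first interval; if it is not
    given, it is chosen at random between 1 and k.
    """
    interval = population_size // sample_size
    if start is None:
        start = random.Random(seed).randint(1, interval)
    return [start + i * interval for i in range(sample_size)]


# Reproduce the worked example: 1000 patients, 100 needed, random start of 9.
patients = systematic_sample(1000, 100, start=9)
print(patients[:3])   # first three selected patients
print(patients[-1])   # last selected patient
```

Only the starting point is random; every later selection is fixed by the interval, which is why the method is quick to apply to a queue of arriving patients.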
  36. 36. Systematic sampling
  37. 37. (Figure: subjects numbered consecutively from 1 to 59 and beyond.) First interval: pick one subject at random. Add the interval to select the subsequent subjects.
  38. 38. 4-Cluster random sample ۞ In this method, the sampling units are clusters (groups) of individuals rather than individuals (used when the sampling frame is incomplete and/or the total sampling population is large). ۞ The clusters (schools, houses, villages, etc.) form the sampling frame, from which the required number of clusters is selected at random. ۞ All individuals in a cluster, a specific group, or a random sample of them are included. ۞ Very useful when the population is widely dispersed and it is impractical to list and sample from all its elements.
  39. 39. Example of a random cluster sample. The objective of our study was to determine the prevalence of obesity among primary school children in Giza. There are 200 primary schools in Giza. The estimated sample size is 20 clusters. Describe how you would proceed in drawing such a sample. A. List all 200 schools. B. Give each a number. C. Use the random number tables to select the 20 schools, whose numbers will fall between 001 and 200.
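Steps A to C amount to drawing 20 cluster numbers from a list of 200. The sketch below assumes the 200 numbered schools named in those steps and substitutes a computer-generated draw for the random number table; the function name `select_clusters` and the seed are illustrative.

```python
import random


def select_clusters(n_clusters, n_selected, seed=None):
    """Pick cluster numbers at random, without replacement.

    Clusters (here, schools) are numbered 1..n_clusters. The schools
    are the sampling units; the children are reached through them.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, n_clusters + 1), n_selected))


selected_schools = select_clusters(200, 20, seed=11)
# Every child in the selected schools, or a random subsample of them,
# would then be measured.
```

Note that no list of individual children is needed at this stage, only the list of schools.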
  40. 40. Example: cluster sampling (figure: an area divided into Sections 1 to 5).
  41. 41. 5-Multistage random sample. We use this method if the target population is spread over a wide geographic area and there is a limited budget or limited resources (as in community-based surveys). In this method, the sample is drawn in several stages. The area is divided into smaller clusters, the clusters are divided into still smaller clusters, and so on. Random selection is carried out at each level successively.
  42. 42. Multi-stage sampling. Country: provinces (sampling unit: province); cities (sampling unit: city); districts (sampling unit: district); households (sampling unit: household); persons (sampling unit: person).
  43. 43. You were asked to head a research team to investigate the problem of hypertension in Egypt. How would you proceed in drawing your sample? • List all governorates (provinces). • Select 4 governorates at random. • List the districts in each of the 4 governorates. • Select a district from each governorate at random. • List all villages and urban areas in each district. • Select a village and an urban centre from each district at random. • Study all, or a sub-sample, of the individuals in the selected villages and urban centres.
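The staged selection above can be sketched as nested random draws. Everything in the frame below is hypothetical placeholder data (10 governorates, each with 4 districts, each with 5 villages and 3 urban centres); the function name `multistage_sample` is likewise an assumption. Only the procedure (4 governorates, then one district each, then one village and one urban centre each) follows the slide.

```python
import random


def multistage_sample(frame, n_governorates, seed=None):
    """Random selection carried out successively at each level."""
    rng = random.Random(seed)
    plan = []
    # Stage 1: governorates. Stage 2: one district per governorate.
    for governorate in rng.sample(sorted(frame), n_governorates):
        district = rng.choice(sorted(frame[governorate]))
        areas = frame[governorate][district]
        # Stage 3: one village and one urban centre per district.
        village = rng.choice(areas["villages"])
        urban_centre = rng.choice(areas["urban"])
        plan.append((governorate, district, village, urban_centre))
    return plan


# Hypothetical frame, for illustration only.
frame = {
    f"Governorate {g}": {
        f"District {g}.{d}": {
            "villages": [f"Village {g}.{d}.{v}" for v in range(1, 6)],
            "urban": [f"Urban centre {g}.{d}.{u}" for u in range(1, 4)],
        }
        for d in range(1, 5)
    }
    for g in range(1, 11)
}

plan = multistage_sample(frame, 4, seed=7)
```

In practice each list is compiled only for the units selected at the previous stage, which is what keeps the method affordable over a wide geographic area.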
  44. 44. II-Non-probability (convenience) samples • A convenience sample can minimize volunteerism and other selection biases by consecutively selecting every accessible person who meets the inclusion criteria. • A consecutive sample is especially desirable when it amounts to taking the entire accessible population over a long enough period to include seasonal variation or other changes over time that are considered important to the research question. • Representativeness is a matter of judgment.
  45. 45. Non-probability samples. These designs are used when the number of elements in a population is either unknown or cannot be individually identified. Quota sampling. Accidental sampling. Judgmental or purposive sampling. Snowball sampling.
  46. 46. Non-probability (convenience) samples. 1-Purposive sample: chosen according to the investigator’s judgement in such a way that maximizes the chances of proving the study hypothesis, e.g. “selecting patients with ESRD”. 2-Quota sample: involves only a few strata, e.g. men and women >20 years. The enumerators select any individual belonging to those strata from whom they can get the required information in an easy, quick and accessible way.
  47. 47. Thank you