SAMPLING IN RESEARCH




      Dr. Kusum Gaur
      Professor, PSM
      WHO Fellow IEC
Selection of study population



             Whole Population

             Sample Population




12/08/2012           Dr. Kusum Gaur   2
What is a Sample?

• A sample is a small representative
  segment of a population

• Inferences drawn from a sample are expected
  to be applicable for the source population



Why do we need a sample?

• To get inferences applicable to the universe with minimum resources

Sample – Qualities
• A sample is a part of the population, but it is a true
  representative of the whole.

Qualities
• Adequate size
• Appropriate sampling technique

Factors on which SAMPLE SIZE depends:
• Population Factors
   – Type of information available
• Type of study
   – Type of Data
   – Type of study design
   – Type of sampling
   – Type of Statistical Analysis for outcome needed
• Values set by the researcher
   – Power
   – Significance level


Power: Ability to detect a true effect when it exists (1 − beta error)

Alpha Error: Chance of declaring an effect when none exists (false positive)
Type of Data & level of Measurements

• Qualitative – Counted Facts – Nominal Data
  Measured as numbers & expressed as proportions

• Quantitative – Measured Facts – Numerical Data
  Measured as quantity & expressed as Mean ± SD

• Ordinal Data – Rank Order Data
  Measured as rank & expressed as Median and Percentile




Sample size for Qualitative data

$$ \text{Sample Size} = \frac{Z^2 P Q}{L^2} \approx \frac{4 P Q}{L^2} $$

P = Prevalence of disease (in %)
Q = 100 − P
L = allowable error
Z = 1.96 ≈ 2 for 95% CL
For descriptive/case-series type of study design
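
The formula above can be sketched in Python as follows. This is only an illustration: the function name `sample_size_proportion` and the example inputs (P = 20%, L = 5) are assumptions, and L is assumed to be in the same percentage units as P.

```python
import math


def sample_size_proportion(p_percent, allowable_error, z=1.96):
    """Sample size = Z^2 * P * Q / L^2, with P, Q and L all in percent."""
    q_percent = 100.0 - p_percent
    n = (z ** 2) * p_percent * q_percent / allowable_error ** 2
    # Round up, since a fraction of a subject cannot be recruited.
    return math.ceil(n)


# Hypothetical inputs: prevalence 20%, allowable error 5 percentage points.
print(sample_size_proportion(20, 5))       # using Z = 1.96
print(sample_size_proportion(20, 5, z=2))  # using the shortcut Z ≈ 2, i.e. 4PQ/L^2
```

Rounding Z up to 2 gives a slightly larger sample, which is why the 4PQ/L² shortcut is a safe approximation.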

09/03/2010                         Dr. Kusum Gaur                14
Sample size for Quantitative data

$$ \text{Sample Size} = \frac{Z^2 \, SD^2}{L^2} \approx \frac{4 \, SD^2}{L^2} $$

SD = Standard Deviation
L = allowable error
Z = 1.96 ≈ 2 for 95% CL
For descriptive studies only
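
A minimal Python sketch of the same calculation for numerical data. The function name `sample_size_mean` and the example values (SD = 14, L = 2) are assumptions chosen for illustration; SD and L must be in the same units.

```python
import math


def sample_size_mean(sd, allowable_error, z=1.96):
    """Sample size = Z^2 * SD^2 / L^2, with SD and L in the same units."""
    n = (z ** 2) * sd ** 2 / allowable_error ** 2
    # Round up to the next whole subject.
    return math.ceil(n)


# Hypothetical inputs: SD = 14, allowable error = 2.
print(sample_size_mean(14, 2))       # using Z = 1.96
print(sample_size_mean(14, 2, z=2))  # using the shortcut 4 * SD^2 / L^2
```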



Finite Correction
Sample Size – Finite Population (where the
  population is less than 50,000)

$$ \text{New SS} = \frac{SS}{1 + \dfrac{SS - 1}{\text{Pop}}} $$
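
As a rough sketch, the finite correction can be computed as below. It assumes the standard form New SS = SS / (1 + (SS − 1) / Pop); the function name `finite_correction` and the example numbers are hypothetical.

```python
import math


def finite_correction(sample_size, population):
    """Shrink a base sample size SS for a finite population of size Pop."""
    corrected = sample_size / (1 + (sample_size - 1) / population)
    return math.ceil(corrected)


# Hypothetical inputs: base sample size 246, population of 5,000.
print(finite_correction(246, 5000))
```

For very large populations the correction term (SS − 1)/Pop approaches zero and the sample size is left practically unchanged.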
How many controls?

$$ k = \frac{n}{2 n_0 - n} $$

Here n0 = no. of cases & n = expected no. of cases;
k = number of controls per case

• k = 13 / (2*11 – 13) = 13 / 9 = 1.44
• kn0 = 1.44*11 ≈ 16 controls (and 11 cases)
   – Same precision as 13 controls and 13 cases
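
The worked example can be reproduced with a short Python sketch. The function and parameter names are hypothetical; n0 is read here as the number of cases actually available and n as the number of cases an equal-sized design would have needed, which is an interpretation of the slide.

```python
import math


def controls_per_case(cases_available, cases_needed):
    """k = n / (2 * n0 - n), where n0 = cases available and n = cases needed."""
    denominator = 2 * cases_available - cases_needed
    if denominator <= 0:
        # With n0 <= n / 2, no number of extra controls can compensate.
        raise ValueError("too few cases to reach the required precision")
    return cases_needed / denominator


k = controls_per_case(cases_available=11, cases_needed=13)
controls = math.ceil(k * 11)
print(round(k, 2), controls)
```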
Sampling Design factors of sample size

$$ \text{Design Effect} = \frac{\text{Variance of Specified Sampling Design}}{\text{Variance of Simple Random Sampling (SRS)}} $$




Sampling Technique effect on Sample Size

    Sampling Technique            Design Effect (Size Multiplier)
    Simple Random Sampling        1
    Systematic Random Sampling    1.2
    Stratified Random Sampling    0.8
    Cluster Random Sampling       2


Conventionally accepted Researcher's Estimations

    Alpha Error         0.05
    Power               80%
    Confidence Level    95%




Key Concepts: Sample size
• Sampling design – cluster sampling needs a larger sample

• Desired power – more power needs a larger sample

• Allowable error – a smaller error needs a larger sample

• Heterogeneity – a more heterogeneous population needs a larger
  sample to cover its diversity

• Nature of analysis – complex multivariate analysis needs a
  larger sample
Steps – Sample Size Estimation
    • Stage 1 – Base Sample Size Calculation (n)*

    • Stage 2 – Sample Size with Design Effect (d)
                = n × d

    • Stage 3 – Contingency Addition (e.g. 5%)
                SS estimation for study population
                = (n × d) + 5% of n

    * Use the appropriate equation for sample size calculation:
      http://stat.ubc.ca/~rollin/stats/ssize/
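
The three stages can be chained in a small Python sketch. The multipliers are the ones listed on the design effect slide; the function name `final_sample_size`, the dictionary keys and the base size of 246 are assumptions for illustration. The contingency is taken as 5% of the base n, as written in Stage 3.

```python
import math

# Design effect multipliers as listed on the slide.
DESIGN_EFFECT = {
    "simple random": 1.0,
    "systematic random": 1.2,
    "stratified random": 0.8,
    "cluster random": 2.0,
}


def final_sample_size(base_n, technique, contingency=0.05):
    """Stage 2: n * d. Stage 3: add a contingency share of the base n."""
    adjusted = base_n * DESIGN_EFFECT[technique]
    return math.ceil(adjusted + contingency * base_n)


# Hypothetical Stage 1 result: base sample size n = 246.
for technique in DESIGN_EFFECT:
    print(technique, final_sample_size(246, technique))
```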
E.g. inputting the values Mean 1 = 5, Mean 2 = 15 & SD = 14
Sample Size Tables
SAMPLING TECHNIQUES

• PROBABILITY/RANDOM SAMPLING



• NONPROBABILITY SAMPLING


Random sampling Techniques

  Aim is to give every observation unit an equal chance
  of being selected for study in the sample.

  (No observation unit should have zero probability of selection.)




Random Sampling Techniques

• Simple Random Technique
• Systematic Random Technique
• Stratified Random Technique
• Multiphase Random Technique
• Multistage Random Technique
• Cluster Random Technique

Simple Random Technique
 • Lottery Method




 • Random Table Method


Steps – Use of Random Table
• Stage 1 – Give a number to each member of the population
• Stage 2 – Determine total population size (N)
• Stage 3 – Determine sample size (S)

• Stage 4 – Drop one finger on the Random Table with eyes closed
• Stage 5 – Drop one finger with eyes closed on the direction to be
  chosen – Up/Down/Rt/Lt

• Stage 6 – Determine the first number within 1 to N
• Stage 7 – Determine further numbers until the sample size (S) is reached*

* Once a number is chosen, do not use it again
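
On a computer, the random table can be replaced by a pseudo-random number generator. The sketch below uses Python's standard `random` module as a stand-in for the printed table (a substitution, not the slide's method); the function name `simple_random_sample` and the seed are assumptions.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Draw S distinct member numbers from 1..N, without repetition."""
    rng = random.Random(seed)
    # random.sample never returns the same member twice,
    # which matches the rule "once a number is chosen, do not use it again".
    return rng.sample(range(1, population_size + 1), sample_size)


# Hypothetical run: N = 300 members, S = 50 to be selected.
chosen = simple_random_sample(300, 50, seed=2012)
print(sorted(chosen))
```

Fixing the seed makes the draw reproducible, which is useful when the selection has to be documented.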

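Steps – Use of Random Table (contd.)
e.g. N = 300, S = 50

Random no.   Selected no. (last 3 digits, within 1-300)
49468        not selected (468 > 300)
49699        not selected (699 > 300)
14043        043
15013        013
12600        not selected (600 > 300)
33122        122
94169        169
89916        not selected (916 > 300)
74169        not selected (169 already chosen)
32007        007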
www.evaluation
  wikiog/index/how_to_use_a_random_number_Table
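
The selection rule in this example (take the last three digits, keep them only if they fall within the population and have not been used before) can be sketched as follows. The function name `pick_from_random_numbers` is hypothetical; the ten random numbers are the ones shown on the slide.

```python
def pick_from_random_numbers(random_numbers, population_size, sample_size, digits=3):
    """Keep the last `digits` digits of each number if they identify an unused member."""
    selected = []
    for number in random_numbers:
        candidate = number % 10 ** digits
        # Skip values outside 1..N and values already chosen.
        if 1 <= candidate <= population_size and candidate not in selected:
            selected.append(candidate)
        if len(selected) == sample_size:
            break
    return selected


slide_numbers = [49468, 49699, 14043, 15013, 12600,
                 33122, 94169, 89916, 74169, 32007]
print(pick_from_random_numbers(slide_numbers, population_size=300, sample_size=50))
# -> [43, 13, 122, 169, 7]
```

Only five members are selected from these ten numbers, so reading from the table continues until 50 members are reached.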
Systematic Random Technique
    The selection of the sample follows a systematic
    interval of selection
•   Find the serial interval
                 (K) = total population / sample size
•   1st observation through simple random sampling
    among 1 to K
•   Next observation = (1st + K)th observation
•   Next observation = (2nd + K)th observation
•   And so on, till the no. of observations
                        = sample size

Systematic Random Technique – Example
Population: observations numbered 1 to 100

• N = 100 (given)
• S = 20 (estimated)
• K = N/S = 100/20 = 5
• 1st observation between 1 and 5, chosen through SRS, e.g. 3
• Every 5th observation from the 3rd observation will be included
  in the sample population
• So, the sample population will be the 3rd, 8th, 13th, 18th, 23rd,
  28th, 33rd, 38th, 43rd, 48th, 53rd, 58th, 63rd, 68th, 73rd, 78th,
  83rd, 88th, 93rd and 98th observations
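
The example above can be reproduced with a few lines of Python. This is a minimal sketch; the function name `systematic_sample` and its parameters are assumptions, and N is assumed to be a multiple of S so that K is a whole number.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Select every K-th observation, starting from a random point in 1..K."""
    interval = population_size // sample_size  # serial interval K
    if start is None:
        # First observation chosen by simple random sampling among 1..K.
        start = random.Random(seed).randint(1, interval)
    return [start + i * interval for i in range(sample_size)]


# Slide example: N = 100, S = 20, first observation = 3.
print(systematic_sample(100, 20, start=3))
```

With `start=3` the result is 3, 8, 13 and so on up to 98, matching the list on the slide.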
Stratified Random Technique
  Sample selection from each stratum through Simple Random/Systematic Random Technique

  Stratum 1  →  Sample
  Stratum 2  →  Sample
  Stratum 3  →  Sample
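
A minimal sketch of stratified selection in Python, assuming simple random sampling within each stratum and allocation in proportion to stratum size. The function name `stratified_sample`, the stratum sizes and the total sample of 10 are hypothetical.

```python
import random


def stratified_sample(strata, total_sample, seed=None):
    """Draw a simple random sample from each stratum, proportional to its size."""
    rng = random.Random(seed)
    population = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Proportional allocation; rounding may shift the total by a unit or so.
        share = round(total_sample * len(units) / population)
        sample[name] = rng.sample(units, share)
    return sample


# Hypothetical population of 100 units split into three strata.
strata = {
    "Stratum 1": list(range(1, 51)),
    "Stratum 2": list(range(51, 81)),
    "Stratum 3": list(range(81, 101)),
}
for name, units in stratified_sample(strata, total_sample=10, seed=1).items():
    print(name, sorted(units))
```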
Multiphase Random Technique

Population → (S/S) → Suspected cases → (Screening Test) → Probable cases → (Specific test) → Cases for study




Multistage Random Technique

At each stage, Simple RT is used.

Population of Nation → States (State 1, State 2) → districts → villages → Study Population
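
The stage-by-stage selection can be sketched in Python as nested simple random draws. Everything in this example is hypothetical: the function name `multistage_sample`, the sampling frame and the numbers selected at each stage.

```python
import random


def multistage_sample(frame, n_states, n_districts, n_villages, seed=None):
    """Simple random selection at each stage: states, then districts, then villages."""
    rng = random.Random(seed)
    chosen = []
    for state in rng.sample(sorted(frame), n_states):
        districts = frame[state]
        for district in rng.sample(sorted(districts), n_districts):
            for village in rng.sample(districts[district], n_villages):
                chosen.append((state, district, village))
    return chosen


# Hypothetical frame: 2 states, 2 districts each, 3 villages per district.
frame = {
    "State 1": {"District A": ["V1", "V2", "V3"], "District B": ["V4", "V5", "V6"]},
    "State 2": {"District C": ["V7", "V8", "V9"], "District D": ["V10", "V11", "V12"]},
}
print(multistage_sample(frame, n_states=1, n_districts=1, n_villages=2, seed=5))
```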



Cluster Random Technique
The unit of random selection is a cluster rather than an individual
• CI = Total population / 30 (in 30 Cluster Technique)

Population of Nation → Cluster 1, Cluster 2, Cluster 3, Cluster 4, ... Cluster 27, Cluster 28, Cluster 29, Cluster 30 → (through Simple RT) → Study Population
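
The slide gives only CI = Total population / 30. The sketch below reads CI as the cluster (sampling) interval and assumes the commonly used cumulative-population procedure: list the units with their populations, pick a random start between 1 and CI, then take the unit that contains every CI-th person. The function name `select_clusters` and the village populations are hypothetical.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def select_clusters(populations, n_clusters=30, seed=None):
    """Return (CI, indices of the units chosen as clusters)."""
    rng = random.Random(seed)
    cumulative = list(accumulate(populations))
    interval = cumulative[-1] // n_clusters  # CI = total population / 30
    start = rng.randint(1, interval)         # random start within the first interval
    chosen = []
    for i in range(n_clusters):
        point = start + i * interval
        # First unit whose cumulative population reaches this point.
        chosen.append(bisect_left(cumulative, point))
    return interval, chosen


# Hypothetical frame: 60 villages of 100 people each.
interval, clusters = select_clusters([100] * 60, seed=3)
print(interval, clusters)
```

With units of unequal size, a very large unit can be picked more than once; how that case is handled is left out of this sketch.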

Stratified Vs Cluster Technique

Stratified Technique
•   Homogeneous groups are made
•   Randomly select sample from each group
•   To make it more truly representative, take sample population
    proportionate to size (PPS)
•   Less chance of error than simple random

Cluster Technique
•   Comparable groups of population are made (usually 30)
•   Randomly select sample from each group
•   More chance of error than simple random
Non Probability Sampling

    •   Used when random samples are not possible:
        – Rare disease
        – Small population
        – Special population
        – Special condition
        – Difficult to reach population

Non-probability Samples

    •   Convenience
    •   Purposive
    •   Quota
    •   Snowball



Snowball sampling

• Similar to contact tracing
• Initial respondents help in recruiting new members of the population
• Useful in the network analysis approach
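
Snowball recruitment can be pictured as walking a referral network outward from the initial respondents. The Python sketch below is only an illustration under that assumption; the function name `snowball_sample`, the breadth-first order of recruitment and the referral data are all hypothetical.

```python
from collections import deque


def snowball_sample(referrals, seeds, max_size):
    """Recruit seeds first, then the people they refer, wave by wave."""
    recruited = []
    seen = set()
    queue = deque()
    for person in seeds:
        if person not in seen and len(recruited) < max_size:
            seen.add(person)
            recruited.append(person)
            queue.append(person)
    while queue and len(recruited) < max_size:
        person = queue.popleft()
        for contact in referrals.get(person, []):
            # Each respondent is recruited once, until the target size is reached.
            if contact not in seen and len(recruited) < max_size:
                seen.add(contact)
                recruited.append(contact)
                queue.append(contact)
    return recruited


# Hypothetical referral network: each respondent names the people they can introduce.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["F"]}
print(snowball_sample(referrals, seeds=["A"], max_size=5))
# -> ['A', 'B', 'C', 'D', 'E']
```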


Computers in Statistics


Web sites related to Statistics


•   http://stattrek.com
•   http://vassarstat.net
•   http://www.scribd.com
•   http://www.statistixl.com
•   http://statistics calculators.com
•   http://stat.ubc.ca/~rollin/stats/ssize/
•   ……………………………………………………………

Computer Software in Statistics


•   Microsoft Excel
•   SPSS
•   Epi Info
•   Epi tab
•   Minitab
•   GraphPad
•   Primer
•   MedCalc
•   ……………..
THANKS

