ESTIMATION
Dr. Mandar Baviskar M.D.
Associate Professor, Community Medicine
Dr. Balasaheb Vikhe Patil RMC, PIMS(DU), Loni
By the end of this session you will know:
• Why we estimate sample size
• What information you need prior to sample size estimation
• What methods are available to estimate sample size
• Complete enumeration of the study population is known as a CENSUS.
• It is often not feasible to cover the entire study population, so we take a sample.
• A sample is a subset of the study population that is ideally representative of that population.
• If the sample is representative of the population, our findings can be generalized to the population.
• In research we want to be able to say that whatever results we have got are REAL & NOT due to chance.
• There is bound to be some difference between the ACTUAL value of a variable in the population (parameter) and what we find in the STUDY (statistic).
• We want a Level of Confidence that we have enough readings to correctly identify a REAL difference in the study population.
• In medical studies we usually want to be at least 95% CONFIDENT and allow for 5% error.
• Therefore, Z for a two-sided 95% CL is Zα = 1.96.
• Parameter: denotes the value for the population (μ)
• Statistic: denotes the same value for the sample (m)
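The value Zα = 1.96 can be recovered from the standard normal distribution. The snippet below is a minimal sketch using only the Python standard library; the helper name `z_for_confidence` is illustrative and assumes a two-sided interval.

```python
from statistics import NormalDist

def z_for_confidence(confidence: float) -> float:
    """Two-sided critical value of the standard normal for a confidence level."""
    alpha = 1.0 - confidence                      # total error allowed, e.g. 0.05
    return NormalDist().inv_cdf(1.0 - alpha / 2)  # split alpha across both tails

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> Z = {z_for_confidence(level):.2f}")
```

Raising the confidence level raises Z, which in turn raises the required sample size.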
• Power (1-β): the probability that the study will detect a predetermined effect size (a deviation from the null hypothesis) should such a deviation exist. It should be at least 80%.
• Alpha error (Type I): saying there is a significant relationship when there is none. It is taken as 5%, i.e. Alpha = 0.05.
• Beta error (Type II): saying there is no significant relationship when there is one. It should be 20% or less, i.e. Beta ≤ 0.2.
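Sample size formulas for comparing groups usually need a Z value for alpha and another for power. As a small sketch (the function name `z_for_power` is an assumption made for illustration), both can be read off the standard normal distribution:

```python
from statistics import NormalDist

def z_for_power(power: float) -> float:
    """One-sided standard normal quantile for the target power (1 - beta)."""
    return NormalDist().inv_cdf(power)

alpha, beta = 0.05, 0.20
z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test at alpha = 0.05
z_beta = z_for_power(1 - beta)                 # power = 80%

print(f"Z_alpha = {z_alpha:.2f}, Z_beta = {z_beta:.2f}")
```

With the conventional choices of Alpha = 0.05 and Beta = 0.2, these come to about 1.96 and 0.84.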
• Depending on the study design and outcome variables, this may require specifying:
   • Prevalence of the condition
   • Proportion exposed
   • Expected difference between the means or proportions of the groups under study
   • Target power
   • Desired confidence level
   • Allowable error
   • Target variance
• The help of a statistician must be taken to estimate sample size correctly BEFORE beginning the study.
• The expected values of exposure can be found from a previous study, existing records, a pilot study, or an assumption (maximum).
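One common reading of "Assumption (maximum)" is that, when no prior estimate exists, a prevalence of 50% is assumed because it gives the largest variance and therefore the largest (safest) sample size. That reading is an assumption here, not something the slides state. The sketch below simply checks where the variance term p(1-p) peaks.

```python
def variance_term(p: float) -> float:
    """Variance of a single yes/no observation with probability p."""
    return p * (1.0 - p)

# Candidate prevalences from 10% to 90% in steps of 10 percentage points.
candidates = [i / 10 for i in range(1, 10)]
worst_case = max(candidates, key=variance_term)

for p in candidates:
    print(f"p = {p:.1f} -> p(1-p) = {variance_term(p):.2f}")
print("largest variance at p =", worst_case)
```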
• Mathematical formulae are used to estimate the minimum number of patients/respondents needed to make inferences about a population.
• Software like
   • Epi Info
   • G*Power
   • SAS, Stata, RStudio
   • GLIMMPSE (repeated measures)
• Simulation
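To illustrate the simulation approach, the sketch below estimates power by repeatedly generating two groups with known event rates and counting how often a two-proportion z-test rejects the null hypothesis. The function name, the event rates (20% vs 10%), the group sizes and the number of trials are all illustrative assumptions, and the pooled z-test is just one reasonable choice of test.

```python
import random
from math import sqrt
from statistics import NormalDist

def simulated_power(n_per_group, p1, p2, alpha=0.05, trials=2000, seed=1):
    """Fraction of simulated studies in which a two-sided z-test rejects H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw the number of events in each group.
        x1 = sum(rng.random() < p1 for _ in range(n_per_group))
        x2 = sum(rng.random() < p2 for _ in range(n_per_group))
        pooled = (x1 + x2) / (2 * n_per_group)
        se = sqrt(2 * pooled * (1 - pooled) / n_per_group)
        if se == 0:
            continue  # no variation in outcomes, so the test cannot reject
        z = (x1 - x2) / n_per_group / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

for n in (50, 200):
    print(f"n per group = {n}: estimated power = {simulated_power(n, 0.20, 0.10):.2f}")
```

Increasing the group size until the estimated power reaches the target (for example 80%) gives a simulation-based sample size.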
Examples
• We want to assess stress levels in students of RMC, Loni. (A similar study shows past prevalence was 30%.)
• We want to estimate Hb levels in Sickle Cell Anemia patients coming to PRH, Loni. (Pilot study: mean 11, SD = 2.)
• We want to find the effectiveness of an NRT-inclusive behavior therapy regimen compared to behavior therapy alone in Tobacco Cessation. (Other studies show a 20% cure rate with NRT & 10% without NRT.)
• We want to find the difference in dose required to produce motor block using two anesthetic drugs. (Mean A = 110, SD = 10; mean B = 100, SD = 11.)
• We want to conduct a web-based survey about the use of masks in Maharashtra.
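The first four examples can be worked through with standard textbook formulas based on the normal approximation. This is a sketch under stated assumptions, not the method used in the session: the function names are illustrative, the absolute precision of 5 percentage points for the stress study and 0.5 units for the Hb study are assumed values (the examples do not give them), and the comparisons assume a two-sided Alpha of 0.05, 80% power and equal group sizes. The web-based survey is left out because the example gives no numbers.

```python
from math import ceil
from statistics import NormalDist

Z = NormalDist().inv_cdf  # standard normal quantile function

def n_single_proportion(p, d, confidence=0.95):
    """n = Z^2 * p(1-p) / d^2, with d the absolute precision."""
    z = Z(1 - (1 - confidence) / 2)
    return ceil(z**2 * p * (1 - p) / d**2)

def n_single_mean(sd, d, confidence=0.95):
    """n = (Z * SD / d)^2, with d the absolute precision."""
    z = Z(1 - (1 - confidence) / 2)
    return ceil((z * sd / d) ** 2)

def n_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group n = (Za + Zb)^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2."""
    z_sum = Z(1 - alpha / 2) + Z(power)
    return ceil(z_sum**2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2)

def n_two_means(sd1, sd2, diff, alpha=0.05, power=0.80):
    """Per-group n = (Za + Zb)^2 * (SD1^2 + SD2^2) / diff^2."""
    z_sum = Z(1 - alpha / 2) + Z(power)
    return ceil(z_sum**2 * (sd1**2 + sd2**2) / diff**2)

print("Stress prevalence (p=30%, d=5% assumed):", n_single_proportion(0.30, 0.05))  # 323
print("Hb level (SD=2, d=0.5 assumed):", n_single_mean(2, 0.5))                     # 62
print("NRT vs no NRT (20% vs 10%), per group:", n_two_proportions(0.20, 0.10))      # 197
print("Drug A vs B (110 vs 100), per group:", n_two_means(10, 11, 110 - 100))       # 18
```

These are minimum numbers before any allowance for non-response or dropout, and dedicated software may give slightly different figures because it can apply continuity corrections or t-distributions.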
• Just taking the desired NUMBER of samples is NOT ENOUGH.
• Taking them using the correct SAMPLING TECHNIQUE is equally important.
• These methods are
1. Probability
   • Simple Random
   • Stratified Random
   • Systematic Random
   • Cluster
   • Multi Stage
2. Non-Probability
   • Convenience
   • Purposive
   • Quota
   • Snowball
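Three of the probability methods listed above can be sketched with Python's standard `random` module. The function names, the sampling frame of 100 numbered units and the two strata are hypothetical, chosen only to show how each method selects units.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Every unit in the frame has an equal chance of selection."""
    return random.Random(seed).sample(frame, n)

def systematic_sample(frame, n, seed=None):
    """Random start, then every k-th unit (k = sampling interval)."""
    k = len(frame) // n
    start = random.Random(seed).randrange(k)
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, n, seed=None):
    """Proportional allocation: each stratum contributes according to its size.
    Rounding can make the total differ slightly from n for uneven strata."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    return {
        name: rng.sample(units, round(n * len(units) / total))
        for name, units in strata.items()
    }

frame = list(range(1, 101))  # hypothetical frame: units numbered 1..100
strata = {"stratum_A": frame[:60], "stratum_B": frame[60:]}

print("Simple random:", sorted(simple_random_sample(frame, 10, seed=1)))
print("Systematic:   ", systematic_sample(frame, 10, seed=1))
print("Stratified:   ", stratified_sample(strata, 10, seed=1))
```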
• A sample size should be large enough to sufficiently describe the phenomenon of interest and address the research question at hand.
• At the same time, a large sample size risks producing repetitive data.
• The goal in qualitative research should thus be the attainment of saturation.
• Saturation occurs when adding more participants to the study does not yield additional perspectives or information.
Help will Always be given to those who ASK for it.
-Dulbus Ambeldore

Basics of Sample Size Estimation
