Designs and sample size in medical research

Design and
Sample Size
(and Analysis)
A. Indrayan
PhD(OhioState), FAMS, FRSS, FASc

Study design
Objective of study: Descriptive or Analytical
Strategy:
• Descriptive: Sample survey, Case series, Census
• Analytical: Observational, Experimental (Intervention)
Method:
• Sample survey: Random, Nonrandom (Purposive)
• Observational: Prospective (Follow-up), Cross-sectional, Retrospective
• Experimental: Laboratory experiment, Clinical trial, Field trial
Type:
• Random: SRS, SyRS, StRS, CRS/Area, MRS, PPS, Mixed in stages
• Nonrandom: Haphazard, Volunteers, Snowball, Convenience, Quota, Referred, Consecutive, Sequential
• Prospective: Longitudinal, Cohort, Other (cohort can be historical (retrospective) or concurrent)
• Retrospective: Case-control, Nested case-control, No control (cases and controls can be prospectively recruited)
• Laboratory experiment: Chemical, Cell, Animal
• Clinical trial: Therapeutic, Diagnostic, Prophylactic, Screening
• Field trial: Prophylactic, Screening
Experiments/trials:
• With control or Without control
• Randomised (RCT if trial) or Non-randomised
• Blind (Single, Double, Triple) or Open
Layout for experiments/trials:
• Cross-over, repeated measures
• One-way, two-way, factorial, etc.

Descriptive Studies
• Existing status – unless you know this,
how do you plan to proceed? (What
percentage of people aged 60+ in India
have cataract? What is the profile of cases
of benign prostatic hyperplasia?)
• No cause-effect (or antecedent-outcome)
relationship

Descriptive Studies
• Case series (HIV in SFO)
• Case studies
• Sample surveys (SRS)
Primary methodology for surveys:
• Sample design
• Random, nonrandom, etc.

Analytical Studies
• Antecedent–outcome relationship
• Aetiology, risk factors, cause-effect, etc.
Two types:
Observational and Experimental

Observational Studies
(Epidemiological Studies)
• Naturally occurring events (Nature’s
experiments)
• No human intervention (obesity and
hypertension)
• Intervention is harmful or not easily
implementable (smoking)

Types of Observational Studies
• Ecological studies (population based
variables – dietary pattern and diabetes)
• Retrospective (case-control)
• Prospective (cohort)
• Cross-sectional (different from surveys)
• There must be an antecedent and an
outcome

Experimental Studies
• Clinical trials on patients (treatment
modalities and )
• Field trials on population at large (iron
supplementation to girls)
• Laboratory experiments on animals
(pharmacological), biological material
(tissues, swabs, blood specimen, etc.)

Clinical Trial Ethics
• Intervention must have been established as potentially
beneficial in preclinical phases
• Inclusion and exclusion criteria (for a mortality end
point, do not take persons of old age; people likely to be
harmed by side effects or otherwise unsuitable are not included)
• Informed consent (self selection) – bias?
• Protection of interest of the subjects (Individual
interest more important than societal gains unless
explicitly stated)
• Done in standard conditions (bias under control)
• Done in phases
• Helsinki Declaration

Clinical Trials Methodology
• Random selection of subjects
(consecutive or random numbers)
• Controls – self (placebo effect
confounded) or parallel (matched,
unmatched)
• Random allocation (individual or cluster)
• Blinding (Double blind RCT gold
standard)
• Masking

Clinical Trial Designs
• One-way, two-way, factorial, partially
factorial
• Crossover
• Up-and-down
• Two-stage
• Adaptive

Laboratory Experiments
• Standard conditions (in the lab and in
subjects), so less variation and a smaller
sample size
• Same designs
• Harmful intervention can be tried in
lower animals and they can be sacrificed
if carried out as per guidelines

Sample size is required for
planning
Statistical requirement may conflict
with available resources
Sample size and reliability:
• Large sample: hard to execute
• Small sample: easy to do

Should be neither too small nor too big
• Small sample may fail to provide
sufficient evidence – unethical in case of
experiments as the exposure is
unnecessary (but good for
pilot/exploratory/phase-1 trial)
• A large sample is also a waste of resources
if a smaller sample can provide convincing
evidence

Some researchers expect a statistician to give
a sample size just on the basis of the title of
research
Just as a physician cannot
prescribe without knowing fully
about the pain, a statistician cannot
suggest a sample size without some
basic information

Sample size depends on –1
Statistical parameter under investigation:
• Mean/Median
• Proportion (Prevalence, Probability)
• Rate (Incidence, Mortality)
• Ratio (OR, RR, Hazard ratio)
• Coefficient (Regression, Correlation)
• Difference

Sample size depends on –2
• Estimation/Test of hypothesis
• Design (Descriptive/Analytical)
• Layout (Independent/Matching/Repeated
measures, One-way/Two-way/Etc.)
• Sampling method (SRS/Cluster/Etc.)
• Any subgroups
• Non-response
• Number of variables to be simultaneously
considered

Two types of statistical
inference
• Estimation
• Test of hypothesis
Requirement of basic information is
different for the two setups

I. ESTIMATION
Basic equation: $L = z_{\alpha/2} \cdot \mathrm{SE}(t)$
n is in the denominator of all SEs
• Variability in different measurements
(quantitative); prevalence or proportion
(qualitative)
• Minimum degree of precision (width of CI) –
Margin of error
• Least confidence you can afford

II. TEST OF HYPOTHESIS
Basic equation: Power = $P(Z \ge z_\alpha)$ when the specified medically important
difference is present; n appears on the right side of the equation
• Variability in different measurements (quantitative);
prevalence or proportion (qualitative)
• Minimum difference that would be considered
medically important
• Statistical power (The probability of detecting the
specified difference when present)
• Level of significance (The probability of incorrectly
rejecting a null) – one-tail or two-tail

Adjustments
• Expected nonresponse or dropouts (bias)
• Sampling other than SRS
• Number of covariates in the study
• Cross-classifications – sub-groups

TOOLS FOR SAMPLE SIZE
• Formula (different for small
samples)
• Online calculator
(http://www.stat.ubc.ca/~rollin/stats/ssize ,
http://statpages.org/#Power ,
http://hedwig.mgh.harvard.edu/sample_size/size.html )
• Nomogram
• Table
• Thumb rule

[Figure: nomogram with scales for prevalence rate (P), cluster size, number of clusters (C-lines) and ratio of D & B (D/B-lines), plus a P-line; drawn for a = 0.05, 0.10 and 0.20, and for L = 10% or 20% of P]
L = Precision required on either side of P, a = Size of critical region (Confidence = 1 − a)

Thumb Rules
• Normative studies: 200 per group
Clinical trials –
• Big trial: Minimum 300 per group (at each centre if multicentric)
• Medium trial: Minimum 100 per group
• Small trial (PG thesis): Minimum 30 per group
Observational studies –
• Case-control: Minimum 30 cases with the rarest exposure in a
medium-sized study, and minimum 5 cases in a small-scale (PG
thesis) study
• Cohort: Minimum 30 cases with the rarest outcome in a medium-
sized study, and minimum 5 cases in a small-scale (PG thesis)
study
Regressions
• Logistic: Minimum 5 cases in the rarest cross-classification
• Quantitative regression: Minimum 10 subjects per regressor

Resource limitations
• Many times, time and resources do not
allow a study on the required sample size.
• Do reverse calculation for the sample size
you can cover and find power of your
study. Say that the power of the study
would be so much and not more, in view
of the resource limitations.
• In case of really small sample and non-
Gaussian conditions, use non-parametric
or exact methods.
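
The reverse calculation can be sketched for the two-group comparison of means: fix the n you can afford and compute the power it gives. This uses the usual normal approximation, power = Φ(δ/SE − z_α) with SE = σ√(2/n), and ignores the negligible contribution of the opposite tail. The function name and example values are assumptions for illustration.

```python
import math
from statistics import NormalDist


def power_two_means(n_per_group, sd, diff, alpha=0.05, two_tailed=True):
    """Approximate power to detect a mean difference `diff` with n_per_group per arm."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_tailed else z.inv_cdf(1 - alpha)
    se = sd * math.sqrt(2 / n_per_group)   # SE of the difference between two means
    return z.cdf(abs(diff) / se - z_alpha)


if __name__ == "__main__":
    # Hypothetical inputs: SD 10, medically important difference 5
    for n in (20, 30, 63, 100):
        print(n, round(power_two_means(n, sd=10, diff=5), 3))
```

Reporting the power achieved at the affordable n makes the limitation explicit instead of leaving the reader to guess it.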

Analysis – 1
(broad aspects only)
Depends on design and the type of
measurements (quantitative, qualitative,
ordinal)
• For descriptive studies (Profiles,
prevalence, patterns) – percentages, cross-
tabulations, mean, SD, quartiles,
percentiles, box plot, distribution pattern,
Kaplan-Meier for durations, rates and ratios,
confidence intervals

Analysis – 2
• For analytical studies (correlations with
cause-effect overtones, differences between
groups in percentages and averages) –
various correlations, chi-square test, Student
t-test, ANOVA, ANCOVA, logistic
regression (odds ratios), quantitative
regression, Cox regression (hazard ratios),
agreement analysis, log-rank, ROC curves
