Epidemiology & Statistics
Study Designs

• Observational - Descriptive
o Case Report
o Case Series
o Cross-Sectional (Descriptive)
o Ecological
o Prevalence / Surveillance
• Observational - Analytical
o Cohort Studies
o Case Control (Trohoc)
o Cross-Sectional (Analytical)
• Experimental / Interventional
o Randomized
o Non-Randomized

LANCET
2002:359:57 - 61
Observational - Descriptive

• Describe frequency, natural history and possible determinants
• No comparison groups
• Useful for hypothesis generation about causal associations
• E.g.:
o Case reports (SLJOP) – www.sljol.info
o Case series
o Descriptive cross-sectional studies
o Ecological / population correlation studies
Observational - Descriptive

• Data analysis / presentation
o Rates
o Proportions
o Percentages
• Each reported with a 95% Confidence Interval
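
The slide does not say how the 95% confidence interval is obtained. One common choice, assumed here, is the normal-approximation (Wald) interval, p ± 1.96 x sqrt(p(1 – p)/n). A minimal sketch with hypothetical counts (the function name and the numbers are illustrative, not from the lecture):

```python
import math


def proportion_with_ci(events, n, z=1.96):
    """Return (proportion, lower, upper) using the Wald (normal-approximation) interval.

    Assumption: the sample is large enough for the normal approximation to hold.
    """
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need n > 0 and 0 <= events <= n")
    p = events / n
    se = math.sqrt(p * (1 - p) / n)   # standard error of the proportion
    lower = max(0.0, p - z * se)      # clip to the valid range [0, 1]
    upper = min(1.0, p + z * se)
    return p, lower, upper


# Hypothetical data: 30 cases among 200 people surveyed.
p, lower, upper = proportion_with_ci(30, 200)
print(f"Proportion = {p:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

For small samples or proportions near 0 or 1, other intervals (for example Wilson) are usually preferred over this approximation.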
Observational - Analytical

• Always have a comparison/control group
• Allows assessment of possible causal associations
• Hypothesis testing
• E.g.
o Cohort study
o Case Control Study (Trohoc Study)
o Analytical Cross Sectional Study
Observational - Analytical

• Data analysis / presentation - Basic
o Rates, proportions and percentages, each with 95% Confidence Interval
o Odds Ratio (95% CI) – Case control & Cross sectional
o Cohort:
o Relative Risk (95% CI)
o Attributable Risk / Risk difference
o Attributable Risk ratio / Aetiologic Fraction
o Population Attributable risk
o Number Needed to Treat / Harm
Observational - Analytical

• Data analysis / presentation – Advanced
o Comparison of 2 proportions – Z test
o Comparison of 2 means – t test, Z test
o Comparison of >2 means – ANOVA
o Comparison of 2 medians – Mann-Whitney U test / Rank Sum test
o Comparison of >2 medians – Kruskal-Wallis test
o Comparison of categories (counts) – Chi Square test
o Correlation – Pearson / Spearman
o Linear association between 2 variables – Simple linear regression

Exceptions and further classifications exist!
2 x 2 table

              Disease
Factor        Positive   Negative   Total
Positive      a          b          a+b
Negative      c          d          c+d
Total         a+c        b+d        a+b+c+d
2 x 2 table

• Odds
o Odds
o Odds ratio (OR)
• Risk
o Risk
o Relative risk (RR) / Risk ratio
o Attributable risk (AR) / Risk difference
o Attributable risk ratio (AR %) / Aetiologic fraction
o Number Needed to Treat / Harm (NNT/NNH)
• Chi square test (χ²)
• Screening
o Sensitivity
o Specificity
o Positive predictive value (PPV)
o Negative predictive value (NPV)
Odds

• Odds = Part / Non-part (number with the outcome / number without the outcome)

• Odds of disease among exposed (factor positive) = a / b
• Odds of disease among non-exposed (factor negative) = c / d
Odds ratio (OR)

• Ratio of the odds of disease among the exposed to the odds of disease among the non-exposed
• OR = (a / b) / (c / d) = (a x d) / (b x c)
• Also known as the cross product ratio
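
The odds and odds ratio formulas can be sketched in a few lines. The cell counts below are hypothetical and only illustrate that the direct ratio and the cross product give the same answer; the function names are illustrative.

```python
def odds(part, non_part):
    """Odds = part / non-part."""
    return part / non_part


def odds_ratio(a, b, c, d):
    """OR = (a / b) / (c / d), using the 2 x 2 table cells a, b, c, d."""
    return odds(a, b) / odds(c, d)


# Hypothetical 2 x 2 table: a, b = exposed with / without disease,
# c, d = non-exposed with / without disease.
a, b, c, d = 30, 70, 10, 90

or_direct = odds_ratio(a, b, c, d)   # (a / b) / (c / d)
or_cross = (a * d) / (b * c)         # cross product ratio

print(f"Odds among exposed     = {odds(a, b):.3f}")
print(f"Odds among non-exposed = {odds(c, d):.3f}")
print(f"OR (direct) = {or_direct:.2f}, OR (cross product) = {or_cross:.2f}")
```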
Odds ratio (OR)…

• If the 95% CI does not include 1, the OR is statistically significant
o OR = 2.6, CI 1.9 – 3.3: significant
o OR = 1.1, CI 0.4 – 2.3: not significant
• If the OR is between 0 and 1 and the CI does not cross 1
o OR is significant
o Factor is protective
o E.g. OR 0.4, CI 0.36 – 0.48
• Narrower CI – higher precision
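
The reading rules above can be written as a small decision function. This is only a sketch of the slide's rule of thumb (does the interval include 1, and on which side of 1 is the estimate); the function name and the returned wording are illustrative. The three calls use the examples given on the slide.

```python
def interpret_ratio(estimate, ci_lower, ci_upper):
    """Interpret an OR or RR from its 95% CI, following the rule of thumb above."""
    if ci_lower <= 1 <= ci_upper:
        return "not significant"
    if estimate > 1:
        return "significant: increased odds/risk"
    return "significant: protective factor"


print(interpret_ratio(2.6, 1.9, 3.3))
print(interpret_ratio(1.1, 0.4, 2.3))
print(interpret_ratio(0.4, 0.36, 0.48))
```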
Risk

• Risk (proportion) = Diseased / Population at risk (e.g. total exposed) [also known as (Cumulative) Incidence]

• Risk of disease among exposed (factor positive) = a / (a+b)
• Risk of disease among non-exposed (factor negative) = c / (c+d)
• Also called estimated risk / average risk
Risk Ratio / Relative Risk - RR

• Ratio of the risk of disease among the exposed to the risk of disease among the non-exposed
• RR = [a / (a+b)] / [c / (c+d)]
• Compare with the OR = (a / b) / (c / d)
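
A short sketch of the risk and RR formulas, with the OR computed from the same table for comparison. The counts are hypothetical and the function names are illustrative.

```python
def risk(cases, total):
    """Risk (cumulative incidence) = diseased / population at risk."""
    return cases / total


def relative_risk(a, b, c, d):
    """RR = [a / (a+b)] / [c / (c+d)]."""
    return risk(a, a + b) / risk(c, c + d)


# Hypothetical 2 x 2 table: a, b = exposed with / without disease,
# c, d = non-exposed with / without disease.
a, b, c, d = 30, 70, 10, 90

rr = relative_risk(a, b, c, d)
odds_ratio = (a * d) / (b * c)   # OR from the same table, for comparison

print(f"Risk among exposed     = {risk(a, a + b):.2f}")
print(f"Risk among non-exposed = {risk(c, c + d):.2f}")
print(f"RR = {rr:.2f}, OR = {odds_ratio:.2f}")
```

With these counts the OR is further from 1 than the RR, which is the usual pattern when the disease is not rare.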
Risk Ratio / Relative Risk

• If the 95% CI does not include 1, the RR is statistically significant
o RR = 2.6, CI 1.9 – 3.3: significant
o RR = 1.1, CI 0.4 – 2.3: not significant
• If the RR is between 0 and 1 and the CI does not cross 1
o RR is significant
o Factor is protective
o E.g. RR 0.4, CI 0.36 – 0.48
• The higher the RR, the higher the risk among the exposed relative to the non-exposed
Attributable Risk (AR) / Risk Difference

• Excess risk of disease among the exposed, compared to the non-exposed
• AR = (Risk exposed) – (Risk unexposed)
= [a / (a+b)] – [c / (c+d)]
• Simple difference of risks
Attributable risk ratio (AR %) / Aetiologic fraction

• Estimate of the proportion of disease among the exposed that can be attributed to the exposure
• Proportion of disease among the exposed that could be eliminated if the exposure were removed
• AR % = [(Risk exposed) – (Risk unexposed)] / (Risk exposed) x 100
= Attributable fraction x 100
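
The AR and AR % formulas can be sketched as follows. The 2 x 2 counts are hypothetical and the function names are illustrative.

```python
def attributable_risk(a, b, c, d):
    """AR = risk in exposed - risk in unexposed = a/(a+b) - c/(c+d)."""
    return a / (a + b) - c / (c + d)


def attributable_risk_percent(a, b, c, d):
    """AR % = AR / (risk in exposed) x 100."""
    risk_exposed = a / (a + b)
    return attributable_risk(a, b, c, d) / risk_exposed * 100


# Hypothetical 2 x 2 table: a, b = exposed with / without disease,
# c, d = non-exposed with / without disease.
a, b, c, d = 30, 70, 10, 90

ar = attributable_risk(a, b, c, d)
ar_pct = attributable_risk_percent(a, b, c, d)

print(f"AR   = {ar:.2f}")
print(f"AR % = {ar_pct:.1f} %")
```

Read with these numbers: about two thirds of the disease among the exposed would be attributed to the exposure.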
Number Needed to Treat / Harm (NNT/NNH)

• NNH
o Number of subjects who would need to receive a harmful exposure (Risk exposed > Risk unexposed) to cause one additional case of disease
• NNT
o Number of subjects who would need to receive a treatment / protective exposure (Risk exposed < Risk unexposed) to prevent one case of disease
• NNT/NNH = 1 / |(Risk exposed) – (Risk unexposed)|
= 1 / |AR|
Chi Square (χ²) Test – for counts

                        Variable 2
Variable 1              Category 1 (Level 1)   Category 2 (Level 2)   Total
Category 1 (Level 1)    a                      b                      a+b
Category 2 (Level 2)    c                      d                      c+d
Total                   a+c                    b+d                    a+b+c+d
Chi Square (χ²) Test (2x2 table)

• Checks the association between 2 categorical variables
• There is an association between the variables
o If the p value is less than 0.05 (or the chosen significance level)
and / or
o If the Chi Square value (test statistic) is > 3.84
• Degrees of freedom (df) for a 2x2 table = (2 – 1) x (2 – 1) = 1
• Interpret according to the variables concerned
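
A minimal sketch of the test for a 2x2 table, assuming the Pearson chi-square without continuity correction, for which the shortcut χ² = n(ad – bc)² / [(a+b)(c+d)(a+c)(b+d)] applies. The p value uses the fact that a chi-square variable with 1 df is a squared standard normal, so p = erfc(sqrt(χ²/2)). The counts are hypothetical and the function name is illustrative.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) and p value, df = 1."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column total must be non-zero")
    statistic = n * (a * d - b * c) ** 2 / denominator
    # For df = 1: P(chi-square > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts for the two categorical variables.
statistic, p_value = chi_square_2x2(30, 70, 10, 90)

print(f"Chi-square = {statistic:.2f}, p = {p_value:.4f}")
print("Association" if statistic > 3.84 else "No evidence of association")
```

When expected cell counts are small, a continuity correction or Fisher's exact test is usually considered instead; that is outside this sketch.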
Screening

              Disease (Gold standard)
Test          Positive   Negative   Total
Positive      a          b          a+b
Negative      c          d          c+d
Total         a+c        b+d        a+b+c+d

• Sensitivity
• Specificity
• Positive Predictive Value (PPV)
• Negative Predictive Value (NPV)
Sensitivity

• Sensitivity = (Diseased who test positive / All diseased) x 100 % = a / (a+c) x 100 %
Specificity

• Specificity = (Non-diseased who test negative / All non-diseased) x 100 % = d / (b+d) x 100 %
Positive Predictive Value (PPV)

• PPV = (Test positives who have the disease / All test positives) x 100 % = a / (a+b) x 100 %
Negative Predictive Value (NPV)

• NPV = (Test negatives who do not have the disease / All test negatives) x 100 % = d / (c+d) x 100 %
Thank You
Questions and Answers
Sensitivity / Specificity
1. Comment briefly on the following.
o Sensitivity
o Specificity
o Positive predictive value
o Negative predictive value
Sensitivity / Specificity
2. A new screening test for Chlamydia was administered to 500 patients attending the STD clinic and 100 of them tested positive. Out of these 100 patients, 90 gave positive results on the confirmatory test (gold standard). Among those who gave negative results for the new screening test, 20 gave positive results on the confirmatory test.
1. Present the above data as a 2 by 2 table
2. Calculate Sensitivity, Specificity, Positive predictive value, Negative predictive value
                   Disease
Diagnostic test    Positive   Negative   Total
Positive           90         10         100
Negative           20         380        400
Total              110        390        500

• Sensitivity = (90 / 110) x 100 = 81.82 %
• Specificity = (380 / 390) x 100 = 97.44 %
• PPV = (90 / 100) x 100 = 90 %
• NPV = (380 / 400) x 100 = 95 %
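
The four screening measures can be checked with a short script. The counts are the ones from the Chlamydia question above (a = 90, b = 10, c = 20, d = 380); the function name is illustrative.

```python
def screening_measures(a, b, c, d):
    """Return sensitivity, specificity, PPV and NPV (in %) from a 2 x 2 screening table.

    a = test positive, disease positive    b = test positive, disease negative
    c = test negative, disease positive    d = test negative, disease negative
    """
    return {
        "sensitivity": 100 * a / (a + c),
        "specificity": 100 * d / (b + d),
        "ppv": 100 * a / (a + b),
        "npv": 100 * d / (c + d),
    }


measures = screening_measures(a=90, b=10, c=20, d=380)
for name, value in measures.items():
    print(f"{name}: {value:.2f} %")
```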
• P value ??
• OR = (20 / 20) / (80 / 180) = 2.25
• Interpretation: the odds of having breast cancer are 2.25 times as high if the lifetime breastfeeding duration is less than 24 months
• Children who were introduced to complementary feeding before 6 months had 2.8 times the risk of hospitalization with diarrhea during infancy. The 95% CI is 1.9 to 4.2: if 100 similar samples were drawn from the population, about 95 of the intervals calculated this way would be expected to contain the true value. The interval does not include 1, so the association is significant.

• Children with illiterate mothers had 6.18 times the risk of hospitalization with diarrhea during infancy. However, the 95% CI is 0.9 to 143.2. The interval is very wide (low precision) and includes 1, so the association is not statistically significant and this factor carries less weight in drawing conclusions.

• Protective factor: the odds of hospitalization with diarrhea during infancy in the group with water on tap inside the house are lower than in the group without water on tap inside the house. In other words, the odds are 0.5 times those of the group without water on tap inside the house. The CI does not cross 1, so this is significant.
• Risk (x) = 48 / 1002 = 0.0479
• Risk (Placebo) = 53 / 500 = 0.106
• NNT = 1 / |0.0479 – 0.106| = 1 / 0.0581 = 17.21 ≈ 18 (rounded up)
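
The NNT arithmetic above can be reproduced as a quick check. The counts are the ones on the slide (48 / 1002 on treatment x, 53 / 500 on placebo); the function name is illustrative, and rounding up to the next whole person is the convention assumed here.

```python
import math


def number_needed(risk_exposed, risk_unexposed):
    """Return ("NNT" or "NNH", 1 / |risk difference|)."""
    difference = risk_exposed - risk_unexposed
    if difference == 0:
        raise ValueError("risks are equal: NNT/NNH is undefined")
    # Lower risk in the exposed (treated) group means benefit, so NNT; otherwise NNH.
    kind = "NNT" if difference < 0 else "NNH"
    return kind, 1 / abs(difference)


risk_treated = 48 / 1002   # Risk (x)
risk_placebo = 53 / 500    # Risk (Placebo)

kind, value = number_needed(risk_treated, risk_placebo)
print(f"{kind} = {value:.2f}, rounded up to {math.ceil(value)}")
```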

MD Paediatrics (Part 2) - Epidemiology and Statistics
