Interpretation of statistical values
&
fundamentals of epidemiology
Dr.Asma Rahim
Dr.Bindhu vasudevan
Dept. of Community Medicine
What are you expected to know?
• Mean
• What is SD?
• What is SE, and what are its applications?
• What are confidence limits, as quoted
in many journals?
• What is a P value? How is it interpreted?
• Which statistical tests are to be
applied in which situations?
Dilemma of a PG Student!!!
• DNB exams place more stress on original work.
• The methodology of your work is important.
• Look ahead for statistical queries.
• Examiners are familiar with research designs.
• OSCE stations have questions on
statistics.
• Descriptive Statistics
• Inferential statistics
Types of variables
• Qualitative
– Dichotomous
– Nominal
– Ordinal
• Quantitative
– Discrete
– Continuous
1. Which is a qualitative variable
• a) BMI
• b) S. bilirubin
• c) Name of residing place
• d) Blood urea
2. Which is a quantitative variable
• Causes of deaths
• Religious distribution
• Age group distribution
• Age distribution
4. Which is an ordinal variable
• A)Blood pressure
• B)Name of residing place
• C)Grading of carcinoma
• D) temperature
5. Which is not a nominal scale
variable
• A)Causes of death
• B) religion
• C)diagnosis
• D)visual analogue scale
Quantitative data      Qualitative data
Hb in gm%              Anemic / non-anemic
Height in cm           Tall / short
B.P. in mm of Hg       Hypo- / normo- / hypertensive
Measures of central Tendency
• Qualitative data – Proportion
• Quantitative data – Mean,Median,Mode
In a group of 100 under-five children
attending IMCH O.P., the mean weight is
15 kg. The standard deviation is 2 kg.
1. In what range will the weights of 95% of
the children in the sample lie?
2. In what range will the mean weight of all
children attending IMCH O.P.
lie?
Range in which the weights of 95% of children in the
sample will lie:
95% reference range =
mean +/- 2SD = 15 +/- 2 x 2 = 11-19 kg
What is the mean weight of all the
children attending IMCH O.P.?
95% confidence interval =
mean +/- 2SE (standard error), where SE = SD/√n = 2/√100 = 0.2
= 15 +/- 2 x 0.2 = 14.6-15.4 kg
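The two ranges on this slide can be checked with a short sketch. The function names are illustrative (not from the slides), and the multiplier 2 is the slides' rounding of 1.96.

```python
import math

def reference_range(mean, sd, k=2):
    """Range expected to hold about 95% of individual values (mean +/- 2 SD)."""
    return (mean - k * sd, mean + k * sd)

def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)

def confidence_interval(mean, se, k=2):
    """Approximate 95% confidence interval for the population mean (mean +/- 2 SE)."""
    return (mean - k * se, mean + k * se)

# Slide example: 100 children, mean weight 15 kg, SD 2 kg
se = standard_error(2, 100)
ref_low, ref_high = reference_range(15, 2)
ci_low, ci_high = confidence_interval(15, se)
print(ref_low, ref_high)                    # individual weights
print(round(ci_low, 2), round(ci_high, 2))  # population mean
```

The reference range describes individuals in the sample; the confidence interval describes where the population mean is likely to lie, which is why it is much narrower.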
Central limit theorem
• Difficult to study the whole population
• Researcher wants to extrapolate the study
findings to population
In a group of 100 under-five children
attending IMCH O.P., the mean weight is
15 kg. The standard deviation is 2 kg.
In what range will the mean weight of all
children attending IMCH O.P.
lie?
Central limit Theorem
• The central limit theorem states that:
• The sampling distribution of the means of random
samples is approximately normal when the samples are large enough
• The mean of the sample means
equals the population mean
• The standard deviation of the sample means
around the population mean is the standard error
[Figure: distribution of sample means (17, 19, 16, 15, 18, 17 kg) around the population mean; their spread is the standard error]
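The central limit theorem can be illustrated with a small simulation. This is only a sketch: the skewed (exponential) population, the seed and the sample counts are assumptions chosen for illustration, not data from the slides.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable

POP_MEAN = 15.0          # hypothetical population mean (kg)
N = 100                  # size of each random sample
SAMPLES = 2000           # number of repeated samples

def sample_mean():
    # Exponential population: clearly not normal, mean 15 and SD 15.
    values = [rng.expovariate(1 / POP_MEAN) for _ in range(N)]
    return statistics.fmean(values)

means = [sample_mean() for _ in range(SAMPLES)]

mean_of_means = statistics.fmean(means)    # should be close to 15
sd_of_means = statistics.stdev(means)      # observed standard error
theoretical_se = POP_MEAN / N ** 0.5       # population SD / sqrt(n) = 1.5

print(round(mean_of_means, 2), round(sd_of_means, 2), theoretical_se)
```

Even though individual values are skewed, the sample means cluster around the population mean, and their standard deviation is close to SD/√n.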
Applications of SE
• To find the range in which the population mean
will lie (95% confidence interval = sample mean +/-
2SE)
• To know whether the sample is representative of the
population, if the population mean is known
• To find whether the observed difference between two samples is
statistically significant
• In a group of 100 children the mean
weight is 15 kg. The standard error is
0.02. In what range will the population mean
lie?
95% Confidence interval
• Range in which the population mean
will lie
• Mean +/- 2 SE
• 95% confidence limits = sample mean +/- 2
SE
• 95% CI = 15 +/- 2 x 0.02 = 14.96-15.04 kg
• The PEFR of 100 11-year-old girls follows a
normal distribution with a mean of 300 l/min,
standard deviation of 20 l/min and standard error of
2 l/min
• In what range will the PEFR of 95% of the girls
in the sample lie?
• In what range will the mean PEFR of the
population from which the sample was
taken lie?
Range in which the PEFR of 95% of girls in the
sample will lie:
mean +/- 2SD = 300 +/- 2 x 20 = 260-340 l/min
Range in which the population mean PEFR will lie:
95% confidence interval = mean +/- 2SE (standard error)
= 300 +/- 2 x 2 = 296-304 l/min
Normal distribution curve
[Figure: normal distribution curve]
• In a village, 52% of the population is male.
In a sample of 100 people, 40% were male,
with a standard error of 5. Does
this sample represent the population?
Answer
• SE = 5
95% CI = sample proportion +/- 2 SE
= 40 +/- 2 x 5
= 30-50
52% lies above this range, so the sample does not represent the population
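The same check can be written as a small helper. The function name is hypothetical; it simply tests whether the known population value falls inside the sample's 95% confidence interval.

```python
def is_representative(sample_value, se, population_value, k=2):
    """Return (inside, interval): does the population value lie within
    the sample's approximate 95% CI (sample value +/- 2 SE)?"""
    low, high = sample_value - k * se, sample_value + k * se
    return low <= population_value <= high, (low, high)

# Slide example: sample 40% male, SE 5, population 52% male
inside, interval = is_representative(40, 5, 52)
print(inside, interval)
```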
Heights of 100 boys & 100 girls gave
the following values. Do these two
groups differ significantly?
         Mean height   SE
Girls    150 cm        2
Boys     160 cm        3
Answer
• Girls
– 95% CI = 150 +/- 2 x 2
= 146-154
• Boys
– 95% CI = 160 +/- 2 x 3
= 154-166
• The two 95% CIs meet at 154 cm
• By this rough check both groups could have the same population
mean; a formal test on the difference (difference in means / SE of the difference) is more sensitive
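A minimal sketch of the comparison follows. The interval check mirrors the slide. The Z statistic for the difference is an addition, not from the slides: it assumes the two samples are independent, so that the SE of the difference is the square root of the sum of the squared SEs.

```python
import math

def ci(mean, se, k=2):
    """Approximate 95% confidence interval: mean +/- 2 SE."""
    return (mean - k * se, mean + k * se)

def intervals_overlap(ci_a, ci_b):
    """True if the two intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

def z_for_difference(mean_a, se_a, mean_b, se_b):
    """Difference in means divided by the SE of the difference
    (assumes independent samples)."""
    return abs(mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)

girls, boys = ci(150, 2), ci(160, 3)
overlap = intervals_overlap(girls, boys)
z = z_for_difference(150, 2, 160, 3)
print(girls, boys, overlap, round(z, 2))
```

Under the stated assumption the Z value is about 2.77, above the critical value 1.96, so the formal test can find a significant difference even when the two intervals touch.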
Sample size
• Calculate the sample size to find out the
prevalence of a disease after implementing
a control programme with 10% allowable
error. Prevalence of the disease before
implementing the programme was 80 %
Sample size
• Qualitative data: N = 4pq / L²
• p = positive factor / prevalence / proportion
• q = 100 – p
• L = allowable error or precision or
variability
• 4 ≈ (1.96)², the square of the Z value for an alpha error of 5%
• Quantitative data: N = 4SD² / L²
• Here p = 80, q = 100 – 80 = 20 and L = 10% of 80 = 8
• N = 4 x 80 x 20 / (8 x 8) = 100
• Determine the sample size to find out the Vitamin A
requirement of under-five children in Calicut
district. From the existing literature, the mean daily
requirement was documented as 930 I.U.
with an SD of 90 I.U. Consider the precision as 9.
• N = 4SD² / L²
• N = 4 x 90 x 90 / (9 x 9) = 400
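Both single-sample formulas can be sketched as below. The function names are illustrative; the constant 4 is the slides' rounding of 1.96 squared.

```python
def n_for_proportion(p, allowable_error):
    """N = 4pq / L^2, with p and L in percent and q = 100 - p."""
    q = 100 - p
    return 4 * p * q / allowable_error ** 2

def n_for_mean(sd, allowable_error):
    """N = 4 SD^2 / L^2."""
    return 4 * sd ** 2 / allowable_error ** 2

# Prevalence example: p = 80%, L = 10% of 80 = 8
n_prev = n_for_proportion(80, 8)
# Vitamin A example: SD = 90 I.U., precision = 9
n_vit_a = n_for_mean(90, 9)
print(n_prev, n_vit_a)
```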
• Determine the sample size to prove that
drug A is better than drug B in reducing
S. cholesterol. The findings from a previous
study are given
Drug   Mean   SD
A      215    20
B      240    30
• Quantitative data: N =
(Zα + Zβ)² x S² x 2 / d²
Zα = Z value for α level = 1.96 at α 0.05
Zβ = Z value for β level = 1.28 for β at 10% &
0.84 at 20%
S = average SD
d = difference between the two means
• Here Zα + Zβ = 1.96 + 1.28 = 3.24, so (Zα + Zβ)² ≈ 10.5;
S = (20 + 30)/2 = 25; d = 240 – 215 = 25
• N = 10.5 x 25 x 25 x 2 / (25 x 25) = 21 in each group
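The two-group calculation can be reproduced with the sketch below. It uses the rounded Z values quoted on the slide (1.96 and 1.28) as defaults; the function name is an assumption made for illustration.

```python
import math

def n_per_group(mean_a, sd_a, mean_b, sd_b, z_alpha=1.96, z_beta=1.28):
    """N = (Z_alpha + Z_beta)^2 x S^2 x 2 / d^2, rounded up.

    S is the average of the two SDs and d the difference between means.
    The defaults correspond to alpha = 0.05 and beta = 10% (power 90%).
    """
    s = (sd_a + sd_b) / 2
    d = abs(mean_a - mean_b)
    n = (z_alpha + z_beta) ** 2 * s ** 2 * 2 / d ** 2
    return math.ceil(n)

# Slide example: drug A 215 (SD 20), drug B 240 (SD 30)
n = n_per_group(215, 20, 240, 30)
print(n)
```

Because the result is rounded up, using more precise Z values instead of the slide's rounded ones can change the answer by one subject.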
• Qualitative data: N =
(Zα + Zβ)² x p x q x 2 / d²
Zα = Z value for α level = 1.96 at α 0.05
Zβ = Z value for β level = 1.28 for β at 10%
p = average prevalence / proportion / positive
factor; q = 100 – p
d = difference between the
prevalences / proportions / positive factors
• In a study conducted on a sample of 400 adults, it
was found that mean daily requirement of Vit. A
was 900 I.U. From the existing literature the same
was documented as 930 I.U with a SE of 4.5 I.U.
Does the study finding differ from the existing
literature finding significantly?
Steps for testing a hypothesis
• State Null Hypothesis
• State alternate hypothesis
• Fix the alpha error
• Identify the correct statistical test
                         Reject null hypothesis        Accept null hypothesis
Null hypothesis true     Type 1 error (alpha error)    Correct decision
Null hypothesis false    Correct decision              Type 2 error (beta error)
• Alpha = 5% (0.05)
• Beta = 0.1 to 0.2, or 10 to 20%
• Power of the study = 1 - beta error
• Power is the probability of detecting a difference
between the two groups when one truly exists
Situation                                      Test
Difference in proportion                       Chi-square test, Z test
Difference in mean (before and after           Paired t test
comparison, same group)
Difference in mean (two independent groups)    Unpaired t test; if sample > 30, Z test
More than 2 means (> 2 groups)                 Anova
Association b/w 2 quantitative variables       Correlation (Pearson; Spearman for ranked or non-normal data)
Prediction                                     Regression
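The decision rules in this table can be sketched as a small lookup function. The function and its parameters are hypothetical, intended only to make the order of the questions explicit; it covers the comparison rows of the table, not correlation or regression.

```python
def choose_test(outcome, groups=2, paired=False, sizes=()):
    """Pick a test following the table above.

    outcome: "proportion" or "mean"
    groups:  number of groups compared
    paired:  True for before/after measurements on the same group
    sizes:   sample sizes of the independent groups, if known
    """
    if outcome == "proportion":
        return "Chi-square test"
    if groups > 2:
        return "Anova"
    if paired:
        return "Paired t test"
    if sizes and min(sizes) > 30:
        return "Z test"
    return "Unpaired t test"

print(choose_test("mean", sizes=(150, 200)))
print(choose_test("mean", sizes=(24, 30)))
print(choose_test("mean", groups=3))
```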
Parametric and Nonparametric tests
Parametric: when the data are normally
distributed.
Nonparametric: when the data are not normally
distributed, usually with a small sample size.
Non parametric tests
Parametric test / data        Non parametric test
Qualitative data              Chi-square test, Fisher's exact test, McNemar test
Paired t test                 Wilcoxon signed rank test
Independent t test            Wilcoxon rank sum test, Mann-Whitney U, Kolmogorov-Smirnov
Anova (more than 2 groups)    Kruskal-Wallis test
Deciding statistical tests?
• In a clinical trial of a micronutrient on
growth, the weight was measured before
and after giving the micronutrient. Which
test will you use for comparison?
• paired t test
• F test
• T test
• Chi square test
The most appropriate test for
comparing Hb values of adult
women in two different populations of
size 150 and 200 is
• A) t test
• B) Anova
• C) Z test
• D) Chi square test
Answer
• C
– Two groups
– >30
– Continuous variable
– Comparing mean
The most appropriate test to
compare birth weight in 3
different regions is
• A) t test
• B) Anova
• C) Z test
• D) Chi square test
Answer
• B
– Continuous variable
– Compare means
– > 2 groups
The most appropriate test to
compare BMI in two different
adult populations of size 24 and 30
is
• A) Two-sample t test
• B) Paired t test
• C) Z test
• D) Chi square test
Answer
• A
– Two different groups
– Continuous variable
– Size <30
The association between smoking
status and MI is tested by
• A) t test
• B) Anova
• C) F test
• D) Chi square test
With a standard drug, 40% of patients responded;
with a new drug, 60% of patients
responded. Which of the following tests of
significance is most useful in this
study?
• A) Fishers t Test
• B) Independent sample t test
• C) Paired t test
• D) Chi square test.
• A consumer group would like to evaluate
the success of three different commercial
weight loss programmes. Subjects are
assigned to one of three programmes
(Group A, Group B, Group C). Each
group follows a different diet regimen. At
the start and at the end of 6 weeks, subjects
are weighed and their BP measurements
recorded.
Test to detect mean difference in
body weight between Group A &
Group B
• T-TEST
• Difference between means of two samples
Is there a significant difference in body
weight in Group A at Time 1 and Time
2?
• Paired T Test
• Same people sampled on two Occasions.
Are the body weights of subjects in
Group A, Group B and Group C significantly
different at Time 2?
• Analysis of variance
Is there any relation between blood pressure
and body weight of these subjects?
Association b/w 2 quantitative variables
•Correlation
Correlation coefficient
• Shows the strength and direction of the linear relation
between two quantitative variables
• The rate of change of one variable as
the other changes is given by the regression coefficient
• The value lies between –1 and +1
• A correlation coefficient of zero means that
there is no linear relationship
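A minimal sketch of how a Pearson correlation coefficient is computed. The data are hypothetical and chosen so the result is easy to verify by eye; they do not come from the slides.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y rises (or falls) exactly in step with x
x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])     # perfect positive relation
r_down = pearson_r(x, [10, 8, 6, 4, 2])   # perfect negative relation
print(round(r_up, 2), round(r_down, 2))
```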
• No. of deaths in 8 villages due to water-borne
diseases before & after installation of a
water supply system.
Village:  1   2   3   4   5   6   7   8
Before:  13   6  12  13   4  13   9  10
After:   15   4  10   9   1  11   8  13
Did the installation of the water supply
system significantly reduce deaths?
Which non parametric test will be
used to test the null hypothesis?
• Small sample size
• Distribution is not normal
• Paired (before and after) observations
• Non parametric test
• Wilcoxon signed rank test
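The signed rank statistic for the village data can be worked out as below. This is a sketch using only the standard library; the exact p value is obtained by enumerating every possible pattern of signs, which is feasible because there are only 8 pairs.

```python
from itertools import product

before = [13, 6, 12, 13, 4, 13, 9, 10]
after = [15, 4, 10, 9, 1, 11, 8, 13]

def average_ranks(values):
    """Rank from smallest to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

# Differences (before - after); zero differences would be dropped
diffs = [b - a for b, a in zip(before, after) if b != a]
ranks = average_ranks([abs(d) for d in diffs])
w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
w = min(w_plus, w_minus)

# Exact two-sided p value: under the null hypothesis each rank is
# equally likely to carry a + or a - sign.
total = sum(ranks)
extreme = 0
for signs in product((0, 1), repeat=len(ranks)):
    plus = sum(r for s, r in zip(signs, ranks) if s)
    if min(plus, total - plus) <= w:
        extreme += 1
p_value = extreme / 2 ** len(ranks)
print(w_plus, w_minus, w, round(p_value, 3))
```

The smaller rank sum is then compared with the tabulated critical value, or the p value is compared with 0.05, to decide whether to reject the null hypothesis.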
For treatment of Hepatitis A, 7
patients were treated with herbal
medicines & 7 patients with
allopathic symptomatic management.
S. bilirubin values after 10 days of treatment
are given below
• Herbal:    9 6 10 3 6 3 2
• Allopathy: 6 3 5 6 2 4 8
Is herbal treatment better than
allopathic treatment?
• Small sample size
• Distribution is not normal
• Two independent groups
• Non parametric test
• Mann-Whitney test
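The Mann-Whitney U statistic for these two groups can be counted directly. The sketch below uses the pair-counting definition of U (ties count one half); the helper name is illustrative.

```python
herbal = [9, 6, 10, 3, 6, 3, 2]
allopathy = [6, 3, 5, 6, 2, 4, 8]

def u_statistic(x, y):
    """Number of (x, y) pairs in which x exceeds y; ties count one half."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in x for b in y)

u_herbal = u_statistic(herbal, allopathy)
u_allopathy = u_statistic(allopathy, herbal)
u = min(u_herbal, u_allopathy)   # compared with the tabulated critical value
print(u_herbal, u_allopathy, u)
```

The two U values always add up to the product of the group sizes (here 7 x 7 = 49), which is a quick check on the arithmetic.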
Steps for testing a hypothesis
• State the null hypothesis
• State the alternate hypothesis
• Fix the alpha error
• Identify the test statistic
• Find out the critical value
• Calculate the value of the identified
test statistic:
difference in means / SE
• If the calculated value is > the table
value (critical value), reject the null
hypothesis
• In a study conducted on a sample of 400 adults, it
was found that mean daily requirement of Vit. A
was 900 I.U. From the existing literature the same
was documented as 930 I.U with a SE of 4.5 I.U.
Does the study finding differ from the existing
literature finding significantly?
Null hypothesis: the study mean does not differ from the literature mean
Alpha error – 5%
Test statistic – Z test
SE = 4.5
Z = (930 - 900)/4.5 = 6.67
– For alpha error 5%, critical Z value = 1.96
– 6.67 > 1.96, so we reject the null hypothesis
– There is a significant difference
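The worked Z test can be reproduced with the standard library's normal distribution. This is a sketch; the function name and return layout are assumptions made for illustration.

```python
from statistics import NormalDist

def z_test(sample_mean, reference_mean, se, alpha=0.05):
    """Two-sided Z test of a sample mean against a reference mean."""
    z = abs(reference_mean - sample_mean) / se
    critical = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha 0.05
    p_value = 2 * (1 - NormalDist().cdf(z))
    return z, critical, p_value, z > critical

# Slide example: study mean 900 I.U., literature mean 930 I.U., SE 4.5
z, critical, p_value, reject = z_test(900, 930, 4.5)
print(round(z, 2), round(critical, 2), reject)
```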
P value
• After applying a statistical test an
investigator gets a p value of 0.01. What
does it mean?
• The null hypothesis states that there is no difference; if
there is any difference, it is due to chance
• P value = the probability, if the null hypothesis is true, of
a sample difference at least as large as the one observed
arising by chance
• P value 0.05 = if the null hypothesis were true, a difference
this large would arise by chance only 5% of the time
• Such a result is unlikely under the null hypothesis, so we
reject NH and conclude that there is a difference
• P = 0.01: if the null hypothesis were true, a difference
this large would arise by chance only 1% of
the time
• This is even less likely under the null
hypothesis, so we
reject NH
• As the p value decreases, the evidence against the null hypothesis
becomes stronger; the p value does not measure the size of the difference
• For practical purposes, p value < 0.05: the
difference is significant
In assessing the association between
maternal nutritional status and birth
weight of newborns, two investigators,
A and B, studied separately and found
significant results with p values of 0.02 and
0.04 respectively. From this, what can you
infer about the magnitude of the association
found by the
two investigators?
Low birth weight & selected risk
factors
Risk factor                          P value
Maternal age                         0.01
Birth order                          0.1
Employment status of mother          0.9
Mean weight gain during pregnancy    0.002
UTI during pregnancy                 0.03
Mean Hb                              0.0001
• A study was conducted to find out the
association between Per Capita National
Income and Per Capita Consumer
Expenditure from the data given below
[Figure: diagram plotting the two variables; axis ticks 240-290 and 230-260]
• What is the name of this diagram?
• What is its use?
• From the diagram what is your inference?
Type of study                  Alternative name    Unit of study
Descriptive                                        Individual
  Case series
  Cross sectional              Prevalence study
  Longitudinal                 Incidence study
Analytical studies (observational)
  Ecological                   Correlational       Populations
  Case control                 Case reference      Individuals
  Cohort                       Follow up           Individuals
Analytical studies (interventional)
  Randomised controlled trial  Clinical trial            Patients
  Field trial                                            Healthy people
  Community trials             Community intervention    Community (healthy people)
Study questions and appropriate designs
Type of question                         Appropriate study design
Burden of illness                        Cross sectional survey, Longitudinal survey
Causation, risk and prognosis            Case control study, Cohort study
Occupational risk, environmental risk    Ecological studies
Treatment efficacy                       RCT
Diagnostic test evaluation               Paired comparative study
Cost effectiveness                       RCT
Odds ratio
• In a study conducted by Gireesh G N et al.
on the "Prevalence of worm infestation
in children", 50 children in an anganwadi were
examined. Of these, 5 had worm
infestation. 2 of these 5 had a history of
pet animals at home, while 21 of the 45
non-infested children had a history of pet animals at
home. Is there any association between pet
animals and worm infestation?
Study design – Case control
• Measure of risk – Odds ratio
• Set up a 2x2 table
                    Worm infestation
                    +          -
Pet animals   +     a = 2      b = 21
              -     c = 3      d = 24
• Odds ratio = ad / bc
= (2 x 24) / (21 x 3) = 0.76
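The cross-product calculation can be sketched as a one-line function, with the cells named as in the 2x2 table above.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio = ad / bc for a 2x2 table:
    a = exposed & diseased,   b = exposed & not diseased,
    c = unexposed & diseased, d = unexposed & not diseased."""
    return (a * d) / (b * c)

# Slide example: pet animals and worm infestation
or_pets = odds_ratio(2, 21, 3, 24)
print(round(or_pets, 2))
```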
Interpretation
• OR =1,RISK FACTOR NOT RELATED
TO DISEASE
• OR <1 ,RISK FACTOR PROTECTIVE
• OR >1 RISK FACTOR POSITIVELY
ASSOCIATED WITH DISEASE
Relative risk
• In a study to find the effect of birth weight
on the subsequent growth of children, 300
children with birth weight 2 kg to 2.5 kg
were followed till age 1. A similar number
of children with birth weight greater than 2.5 kg
were followed up too. Anthropometric
measurements were done in both groups. Results
are shown below
                               Low birth weight   Normal
No. of children studied        300                300
No. malnourished at age one    102                51
Study design – Cohort study
• Measure of risk – Relative risk, Attributable
risk.
• Relative risk = incidence among exposed /
incidence among nonexposed
= (102/300) / (51/300) = 0.34 / 0.17 = 2
Inference?
RR = 1: no association
RR > 1: positive association
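Both measures of risk named on this slide can be sketched as follows. The slide gives no formula for attributable risk; the one used here (excess incidence among the exposed, as a percentage of the incidence among the exposed) is an assumption based on the usual textbook definition.

```python
def incidence(cases, total):
    return cases / total

def relative_risk(exp_cases, exp_total, unexp_cases, unexp_total):
    """Incidence among exposed / incidence among nonexposed."""
    return incidence(exp_cases, exp_total) / incidence(unexp_cases, unexp_total)

def attributable_risk_percent(exp_cases, exp_total, unexp_cases, unexp_total):
    """Assumed definition: (Ie - Iu) / Ie x 100."""
    ie = incidence(exp_cases, exp_total)
    iu = incidence(unexp_cases, unexp_total)
    return (ie - iu) / ie * 100

# Slide example: 102/300 low birth weight vs 51/300 normal birth weight
rr = relative_risk(102, 300, 51, 300)
ar = attributable_risk_percent(102, 300, 51, 300)
print(round(rr, 2), round(ar, 1))
```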
• An outbreak of Pediculosis capitis is being
investigated in a girls' school with 291
pupils. Of 130 children who live in a nearby
housing estate, 18 were infested, and of 161
who live elsewhere, 37 were infested. The
chi square value was found to be 3.93.
• P value = 0.04
• Is there a significant difference in the
infestation rates between the two groups?
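The quoted chi square value can be checked with the shortcut formula for a 2x2 table. This sketch applies no continuity correction, and gets the p value for 1 degree of freedom from the complementary error function; small differences from the quoted figures are down to rounding.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Chi square for a 2x2 table (no continuity correction):
    N (ad - bc)^2 / (row totals x column totals)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_df1(chi2):
    """Upper-tail probability of chi square with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# Housing estate: 18 infested, 112 not; elsewhere: 37 infested, 124 not
chi2 = chi_square_2x2(18, 112, 37, 124)
p = p_value_df1(chi2)
print(round(chi2, 2), round(p, 3))
```

Since the p value is below 0.05, the difference in infestation rates is significant at the 5% level.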
Results of a screening test
                     Disease
                     Positive    Negative
Test    Positive     TP (a)      FP (b)
        Negative     FN (c)      TN (d)
Features of a screening test
Sensitivity = a / (a + c)
Specificity = d / (b + d)
Positive predictive value = a / (a + b)
Negative predictive value = d / (c + d)
False positive rate = b / (b + d)
False negative rate = c / (a + c)
In a group of patients presenting to a hospital emergency
with abdominal pain, 30% of patients have acute
appendicitis. 70% of patients with appendicitis have a
temperature greater than 37.5°C and 40% of patients
without appendicitis have a temperature greater than
37.5°C. Considering these findings, which of the
following statements is correct?
a) Sensitivity of temperature greater than 37.5°C as a
marker for appendicitis is 21/49
b) Specificity of temperature greater than 37.5°C as a
marker for appendicitis is 42/70
c) The positive predictive value of temperature greater
than 37.5°C as a marker for appendicitis is 21/30
d) Specificity of the test will depend upon the
prevalence of appendicitis in the population to which it
is applied.
Sensitivity and Specificity
                       Appendicitis
                       +           -
Fever > 37.5°C   +     21 (a)      28 (b)
                 -     9 (c)       42 (d)
Total                  30 (a+c)    70 (b+d)
• Sensitivity = a/(a+c) = 21/30 = 70%
• Specificity = d/(b+d) = 42/70 = 60%
• Positive predictive value = a/(a+b) =
21/49 = 43%
• Negative predictive value = d/(c+d) = 42/51 = 82%
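All four measures follow from the same 2x2 table, as this short sketch shows. The function name and the dictionary layout are illustrative.

```python
def screening_metrics(a, b, c, d):
    """a = true positives, b = false positives,
    c = false negatives, d = true negatives."""
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
    }

# Appendicitis example: a = 21, b = 28, c = 9, d = 42
m = screening_metrics(21, 28, 9, 42)
for name, value in m.items():
    print(name, round(value * 100, 1))
```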
Exercise 11
Disease prevalence in a population of
10,000 was 5%. A urine sugar test with
sensitivity of 70% and specificity of 80%
was done on the population. The positive
predictive value will be:
a) 15.55% b) 70.08% c) 84.4%
d) 98.06%
• Total population = 10,000
• Disease prevalence = 5%
• No. diseased = 500
• Applying this to a 2x2 table:
2x2 table
            Disease +    Disease -    Total
Test +      350 (a)      1900 (b)     2250
Test -      150 (c)      7600 (d)     7750
Total       500          9500         10000
• Positive predictive value = a/(a+b) = 350/2250 = 15.55%, so the answer is a)
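The table can be built from prevalence, sensitivity and specificity alone. The sketch below uses a hypothetical helper name and returns the four cells in the order a, b, c, d.

```python
def two_by_two(population, prevalence, sensitivity, specificity):
    """Return (TP, FP, FN, TN) for a test applied to a whole population."""
    diseased = population * prevalence
    healthy = population - diseased
    tp = diseased * sensitivity      # diseased and test positive
    fn = diseased - tp               # diseased but test negative
    tn = healthy * specificity       # healthy and test negative
    fp = healthy - tn                # healthy but test positive
    return tp, fp, fn, tn

# Exercise 11: population 10,000, prevalence 5%, sensitivity 70%, specificity 80%
tp, fp, fn, tn = two_by_two(10_000, 0.05, 0.70, 0.80)
ppv = tp / (tp + fp)
print(round(tp), round(fp), round(fn), round(tn), round(ppv * 100, 2))
```

With a low prevalence, most positive results are false positives, which is why the positive predictive value is so much lower than the sensitivity.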
All the Best!!!
http://www.dnbpediatrics.com/

sta

  • 1.
    Interpretation of statisticalvalues & fundamentals of epidemiology Dr.Asma Rahim Dr.Bindhu vasudevan Dept. of Community Medicine
  • 2.
    What you areexpected to Know? • Mean • What is SD ? • What is SE & its applications • What is Confidence limits as noted in many journals?
  • 3.
    • What isP value ?How to interpret it? • Which are the different statistical tests to be applied on different situations?
  • 4.
    Dilemma of aPG Student!!! •DNB exams more stress on Original work. •Methodology of your work is important. •Look ahead for statistical queries. •Examiners familiar with research designs •OSCE stations have questions on Statistics.
  • 5.
    • Descriptive Statistics •Inferential statistics
  • 6.
    Types of variables •Qualitative – Dichotomous – Nominal – Ordinal • Quantitative – Discrete – Continuous
  • 7.
    1. Which isa qualitative variable • a) BMI • b) S. bilirubin • c) Name of residing place • d) Blood urea
  • 8.
    2. Which isa quantitative variable • Causes of deaths • Religious distribution • Age group distribution • Age distribution
  • 9.
    4. Which isan ordinal variable • A)Blood pressure • B)Name of residing place • C)Grading of carcinoma • D) temperature
  • 10.
    5. Which isnot a nominal scale variable • A)Causes of death • B) religion • C)diagnosis • D)visual analogue scale
  • 11.
    Quantitative data Qualitativedata Hb in gm% Anemic/non anemic Height in cm Tall/short B.P in mm of Hg Hypo/normo/ hypertensives
  • 12.
    Measures of centralTendency • Qualitative data – Proportion • Quantitative data – Mean,Median,Mode
  • 13.
    In a groupof 100 under five children attending IMCH O.P the mean weight is 15kg. The standard deviation is 2. 1.In what range 95% of children’s weight will lie in the sample? 2. In what range the mean weight of all children who are attending IMCH OP will lie?
  • 14.
    Range in which95% children’s weight in the sample will lie: 95% reference range = mean +/- 2SD = 11-19Kg What is the mean Birth weight of all the children attending IMCH O.P 95% Confidence interval = mean +/- 2SE( Standard error)-
  • 15.
    Central limit theorem •Difficult to study the whole population • Researcher wants to extrapolate the study findings to population
  • 16.
    In a groupof 100 under five children attending IMCH O.P the mean weight is 15kg. The standard deviation is 2. what will be the mean weight of all children who are attending IMCH OP will lie?
  • 17.
    Central limit Theorem •Central limit theorem states that • The random sampling distribution of sample means will be normal distribution • Means of random sample means will be equal to population mean • The standard deviation of sample means from population mean is the standard error
  • 18.
  • 19.
    Central limit Theorem •Central limit theorem states that • The random sampling distribution of sample means will be normal distribution • Means of random sample means will be equal to population mean • The standard deviation of sample means from population mean is the standard error
  • 20.
  • 21.
    Applications of SE •To find out the range in which the population mean will lie ( 95% confidence interval- sample mean +/- 2SE) • To know whether the sample is representative of the population if population mean is known • To find the observed difference of two samples is statistically significant
  • 22.
    • In agroup of 100 children the mean weight is 15kg. The standard error is 0.02. In what range the population mean will lie.
  • 23.
    95% Confidence interval •Range in which the mean population value will lie • Mean +/ - 2 SE
  • 24.
    • 95% confidencelimits – sample mean +/- 2 SE • 95% CI =15+/- 2x0.02=14.96-15.04kg
  • 25.
    • The PEFRof 100, 11 year old girls follow a normal distribution with a mean of 300 1/min, standard deviation 20 l/min and standrd error of 2 l/min • What will be the range in which 95% of the girl’s PEFR will lie in the sample? • What will be the range in which mean PEFR of the population will lie from which the sample was taken?
  • 26.
    Range in which95% of girls PEFR in the sample will lie: mean +/- 2SD = 260 - 340 Range in which mean PEFR Value will lie: mean +/- 2SE( Standard error)- 95% Confidence interval = 296-304
  • 27.
  • 28.
    Applications of SE •To find out the range in which the population mean will lie ( 95% confidence interval- sample mean +/- 2SE) • To know whether the sample is representative of the population if population mean is known • To find the observed difference of two samples is statistically significant
  • 29.
    • In avillage the percentage of male population is 52%. In a sample of 100 people the male percentage was 40 with a standard error of 5.Is this sample representing the population
  • 30.
    Answer • SE =5 95% CI= sample proportion +/- 2 SE = 40 +/- 2 x 5 =30- 50 52% is higher than this range
  • 31.
    Applications of SE •To find out the range in which the population mean will lie ( 95% confidence interval- sample mean +/- 2SE) • To know whether the sample is representative of the population if population mean is known • To find the observed difference of two samples is statistically significant
  • 32.
    Height of 100boys & 100 girls gave the following values. Do these two groups differ significantly Mean height SE Girls 150cm 2 boys 160cm 3
  • 33.
    Answer Girls – 95% CI= 150 +/- 2 x 2 =146 -154 • Boys – 95% CI = 160 +/- 2x 3 =154-166 • Overlapping is present among the 95% CI • Both groups can have the same population mean
  • 34.
    Sample size • Calculatethe sample size to find out the prevalence of a disease after implementing a control programme with 10% allowable error. Prevalence of the disease before implementing the programme was 80 %
  • 35.
    Sample size • Qualitativedata N = 4pq/L2 • P = positive factor /prevalence/proportion • Q = 100 – p • L = allowable error or precision or variability • 4 = 1.962(Alpha error) 2 • Quantitative data N = 4SD2/L2
  • 36.
    Sample size • Calculatethe sample size to find out the prevalence of a disease after implementing a control programme with 10% allowable error. Prevalence of the disease before implementing the programme was 80 %
  • 37.
    • N= 4x 80 x 20/8 x 8 = 100
  • 38.
    • Determine thesample size to find out the Vitamin A requirement in the under five children of Calicut district . From the existing literature the mean daily requirement of the same was documented as 930 I.U with a SD of 90 I.U. Consider the precision as 9.
  • 39.
    • N =4SD2/L2 • 4 x 90 x 90 /9 x9 = 400
  • 40.
    • Determine thesample size to prove that drug A is better than drug B in reducing the S.Cholesterol. The findings from a previous study is given Drug Mean SD A 215 20 B 240 30
  • 41.
    • Quantitative dataN = (Zα + Zβ )2 x S2 x 2 /d2 Zα = Z value for α level = 1.96 at α 0.05 Zβ = Z value for β level =1.28 for β at 10% & 0.82 at 20% S = average SD d = difference between the two means
  • 42.
    • Determine thesample size to prove that drug A is better than drug B in reducing the S.Cholesterol. The findings from a previous study is given Drug Mean SD A 215 20 B 240 30
  • 43.
    • N =10.5 x 25 x 25 x2/ 25 x 25 = 21
  • 44.
    • Qualitative dataN = (Zα + Zβ )2 p x q /d2 Zα = Z value for α level = 1.96 at α 0.5 Zβ = Z value for β level =1.28 for β at 10% P = average prevalence /proportion/positive factor d = difference between the prevalence/proportion/positive factor
  • 45.
    • In astudy conducted on a sample of 400 adults, it was found that mean daily requirement of Vit. A was 900 I.U. From the existing literature the same was documented as 930 I.U with a SE of 4.5 I.U. Does the study finding differ from the existing literature finding significantly?
  • 46.
    Steps for testinga hypothesis • State Null Hypothesis • State alternate hypothesis • Fix the alpha error • Identify the correct statistical test
  • 47.
    • In astudy conducted on a sample of 400 adults, it was found that mean daily requirement of Vit. A was 900 I.U. From the existing literature the same was documented as 930 I.U with a SE of 4.5 I.U. Does the study finding differ from the existing literature finding significantly?
  • 48.
    Reject Null hypothesis Accept Null hypothesis Nullhypothesis true Type 1 error (alpha error) Correct decision Null hypothesis false Correct decision Type 2 error (Beta error)
  • 49.
    • Alpha =5% (0.05) • Beta = 0.1 to 0.2 or 10 to 20%. • Power of the study = 1- beta error • Strength at which we conclude there is no difference between the two groups.
  • 50.
    Difference in proportionChi-square test, Z test, Difference in mean (Before and after comparison-same group) Paired t test Difference in mean (two independent groups) Unpaired t test, If sample > 30-Z test More than 2 means(> 2 groups) Anova Association b/w 2 quantitative variables Spearman correlation Prediction regression
  • 51.
    Parametric and Nonparametrictests Parametric: When the data is normally distributed. Nonparametric : When data is not normally distributed,usually with small sample size.
  • 52.
    Non parametric tests Qualitativedata Chi-square test Fishers test, Mc Nemar test Paired t test Wilcoxon Signed rank test independent t test Wilcoxon test , Mann- Whitney U , Kolmogrov independent t test Kruskal-wallis test
  • 53.
    Deciding statistical tests? •In a clinical trial of a micronutrient on growth, the weight was measured before and after giving the micronutrient.. Which test will you use for comparison? • paired t test • F test • T test • Chi square test
  • 54.
    Difference in proportionChi-square test, Z test, Difference in mean (Before and after comparison-same group) Paired t test Difference in mean (two independent groups) Unpaired t test, If sample > 30-Z test More than 2 means(> 2 groups) Anova Association b/w 2 quantitative variables Spearman correlation Prediction regression
  • 55.
    The most appropriatetest for comparing Hb values in the adult women in two different population of size 150 and 200 is • A) t test • B) Anova • C) Z test • D) Chi square test
  • 56.
    Difference in proportionChi-square test, Z test, Difference in mean (Before and after comparison-same group) Paired t test Difference in mean (two independent groups) Unpaired t test, If sample > 30-Z test More than 2 means(> 2 groups) Anova Association b/w 2 quantitative variables Spearman correlation Prediction regression
  • 57.
    Answer • C – Two groups – Each sample > 30 – Continuous variable – Comparing means
  • 58.
    The most appropriate test to compare birth weight in 3 different regions is • A) t test • B) Anova • C) Z test • D) Chi square test
  • 60.
    Answer • B – Continuous variable – Compare means – > 2 groups
  • 61.
    The most appropriate test to compare BMI in two different adult populations of size 24 and 30 is • A) Two sample t test • B) Paired t test • C) Z test • D) Chi square test
  • 63.
    Answer • A – Two different groups – Continuous variable – Sample size not more than 30
  • 64.
    The association between smoking status and MI is tested by • A) t test • B) Anova • C) F test • D) Chi square test
  • 66.
    With the standard drug, 40% of patients responded; with a new drug, 60% of patients responded. Which of the following tests of significance is most useful in this study? • A) Fisher's t test • B) Independent sample t test • C) Paired t test • D) Chi square test
  • 68.
    • A consumer group would like to evaluate the success of three different commercial weight loss programmes. Subjects are assigned to one of three programmes (Group A, Group B, Group C). Each group follows a different diet regimen. At the start and at the end of 6 weeks, subjects are weighed and their BP measurements recorded.
  • 69.
    Test to detect mean difference in body weight between Group A & Group B • t test • Difference between means of two samples
  • 70.
    Is there a significant difference in body weight in Group A at Time 1 and Time 2? • Paired t test • Same people sampled on two occasions.
  • 71.
    Is the body weight of subjects in Group A, Group B and Group C significantly different at Time 2? • Analysis of variance
  • 72.
    Is there any relation between blood pressure and body weight of these subjects?
  • 74.
    Association b/w 2 quantitative variables • Correlation
  • 75.
    Correlation coefficient • Shows the strength and direction of the linear relation between two quantitative variables • The rate of change of one variable as the other changes is given by the regression coefficient, not the correlation coefficient • The value lies between –1 and +1 • A correlation coefficient of zero means that there is no linear relationship
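As a rough illustration of the correlation coefficient described above, the sketch below computes Pearson's r from first principles. The weight and blood pressure figures are hypothetical values made up for the example (they are not from the slides), and `pearson_r` is an illustrative helper name.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, for illustration only
weight_kg = [50, 60, 70, 80, 90]
systolic_bp = [110, 118, 125, 135, 140]

r = pearson_r(weight_kg, systolic_bp)
print(f"r = {r:.3f}")  # close to +1: strong positive linear relation
```

A value near +1 or –1 indicates a strong linear relation; it says nothing about how much one variable changes per unit of the other.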
  • 77.
    • No. of deaths in 8 villages due to water borne diseases before & after installation of a water supply system. • Villages: 1 2 3 4 5 6 7 8 • Before: 13 6 12 13 4 13 9 10 • After: 15 4 10 9 1 11 8 13
  • 78.
    Did the installation of the water supply system significantly reduce deaths? Which non parametric test will be used to test the null hypothesis? • Small sample size • Distribution is not normal • Paired (before and after) data from the same villages • Non parametric test • Wilcoxon signed rank test
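The signed rank statistic for the village data can be worked out by hand; the sketch below does the same arithmetic. It only computes the rank sums (tied differences share their mean rank); comparing the statistic with a table of critical values for n = 8 is left to the reader. `average_ranks` is an illustrative helper, not a library function.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

before = [13, 6, 12, 13, 4, 13, 9, 10]
after = [15, 4, 10, 9, 1, 11, 8, 13]

# Before minus after; zero differences (none here) would be dropped
diffs = [b - a for b, a in zip(before, after) if b != a]
ranks = average_ranks([abs(d) for d in diffs])

w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)   # villages where deaths fell
w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)  # villages where deaths rose
w_stat = min(w_plus, w_minus)
print(w_plus, w_minus, w_stat)
```

The smaller of the two rank sums is the test statistic; the null hypothesis is rejected only if it is at or below the tabulated critical value.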
  • 80.
    For treatment of Hepatitis A, 7 patients were treated with herbal medicines & 7 patients with allopathic symptomatic management. Serum bilirubin (S.Br) values after 10 days of treatment are given below • Herbal: 9 6 10 3 6 3 2 • Allopathy: 6 3 5 6 2 4 8
  • 81.
    Is herbal treatment better than allopathic treatment? • Small sample size • Distribution is not normal • Two independent groups • Non parametric test • Mann-Whitney test
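A minimal sketch of the Mann-Whitney U calculation for the serum bilirubin data follows. It ranks the pooled values (ties share their mean rank) and derives U for each group; looking up the critical value for two groups of 7 is not shown. `average_ranks` is an illustrative helper.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

herbal = [9, 6, 10, 3, 6, 3, 2]
allopathy = [6, 3, 5, 6, 2, 4, 8]
n1, n2 = len(herbal), len(allopathy)

# Rank all 14 values together, then sum the ranks belonging to the herbal group
ranks = average_ranks(herbal + allopathy)
rank_sum_herbal = sum(ranks[:n1])

u_herbal = rank_sum_herbal - n1 * (n1 + 1) / 2
u_allopathy = n1 * n2 - u_herbal
u_stat = min(u_herbal, u_allopathy)
print(rank_sum_herbal, u_herbal, u_allopathy, u_stat)
```

The two U values always add up to n1 x n2 (here 49), which is a quick check on the ranking.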
  • 83.
    Steps for testing a hypothesis • State null hypothesis • State alternate hypothesis • Fix the alpha error • Identify the test statistic
  • 84.
    • Find out the critical value • Calculate the value of the identified test statistic: difference in means / SE • If the calculated value is > the table value (critical value), reject the null hypothesis
  • 85.
    • In a study conducted on a sample of 400 adults, it was found that the mean daily requirement of Vit. A was 900 I.U. In the existing literature the same was documented as 930 I.U. with an SE of 4.5 I.U. Does the study finding differ significantly from the existing literature finding?
  • 86.
    • Null hypothesis: there is no difference between the study finding and the literature value • Alpha error – 5% • Test statistic – Z test • SE = 4.5 • Z = (930 - 900) / 4.5 = 6.67
  • 87.
    – For alpha error 5%, critical Z value = 1.96 – 6.67 > 1.96, so we reject the null hypothesis – There is a significant difference – P value < 0.05
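The Vit. A example can be checked with a few lines of Python. This is a sketch using the figures on the slides (means of 900 and 930 I.U., SE of 4.5 I.U., alpha of 5%) and a two-sided test; the variable names are illustrative.

```python
from statistics import NormalDist

sample_mean = 900.0      # I.U., from the study of 400 adults
reference_mean = 930.0   # I.U., from the existing literature
standard_error = 4.5     # I.U., as given in the example
alpha = 0.05             # alpha error fixed at 5%

# Test statistic: difference in means divided by the standard error
z = abs(reference_mean - sample_mean) / standard_error

# Two-sided critical value and p value from the standard normal distribution
critical_z = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - NormalDist().cdf(z))

reject_null = z > critical_z
print(f"Z = {z:.2f}, critical Z = {critical_z:.2f}, reject null: {reject_null}")
```

Because the calculated Z is far beyond the critical value, the p value is very much smaller than 0.05.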
  • 88.
    • After applying a statistical test, an investigator gets a p value of 0.01. What does it mean?
  • 89.
    • The null hypothesis states that there is no difference; if there is any difference, it is due to chance • P value = the probability of getting a sample difference at least this large by chance if the null hypothesis is true • P value 0.05 = that probability is only 5% if the null hypothesis is true • Such a difference is unlikely to be due to chance alone, so we reject the NH and conclude that there is a difference
  • 90.
    • P = 0.01: the probability of a sample difference at least this large arising by chance is only 1% if the null hypothesis is true • Chance alone is an unlikely explanation, so we reject the NH • As the p value decreases, the evidence against the null hypothesis becomes stronger • For practical purposes, p value < 0.05: the difference is significant
  • 91.
    In assessing the association between maternal nutritional status and birth weight of newborns, two investigators A and B studied separately and found significant results with p values of 0.02 & 0.04 respectively. From this, what can you infer about the magnitude of the association found by the two investigators?
  • 92.
    Low birth weight & selected risk factors
    Risk factor: P value
    • Maternal age: 0.01
    • Birth order: 0.1
    • Employment status of mother: 0.9
    • Mean weight gain during pregnancy: 0.002
    • UTI during pregnancy: 0.03
    • Mean Hb: 0.0001
  • 93.
    • A study was conducted to find out the association between Per Capita National Income and Per Capita Consumer Expenditure from the data given below
  • 94.
  • 95.
    • What is the name of this diagram? • What is its use? • From the diagram what is your inference?
  • 96.
    Type of study (alternative name): unit of study
    Descriptive
    • Case series: individual
    • Cross sectional (prevalence study): individual
    • Longitudinal (incidence study): individual
    Analytical studies (observational)
    • Ecological (correlational): populations
    • Case control (case reference): individuals
    • Cohort (follow up): individuals
    Analytical studies (interventional)
    • Randomised controlled trial (clinical trial): patients
    • Field trial: healthy people
    • Community trials (community intervention): community, healthy people
  • 97.
    Study questions and appropriate designs
    Type of question: appropriate study design
    • Burden of illness: Cross sectional survey, Longitudinal survey
    • Causation, risk and prognosis: Case control study, Cohort study
    • Occupational risk, environmental risk: Ecological studies
    • Treatment efficacy: RCT
    • Diagnostic test evaluation: Paired comparative study
    • Cost effectiveness: RCT
  • 98.
    Odds ratio • In a study conducted by Gireesh G N et al. on the "Prevalence of worm infestation in children", 50 children in an anganwadi were examined. Of these, 5 had worm infestation. 2 of these 5 had a history of pet animals at home, while 21 of the 45 non-infested children had a history of pet animals at home. Is there any association between pet animals and worm infestation?
  • 99.
    Study design – Case control • Measure of risk – Odds ratio
  • 100.
    • Set up a 2x2 table
    Worm infestation: + / -
    Pet animals +: a = 2, b = 21
    Pet animals -: c = 3, d = 24
  • 101.
    • Odds ratio = ad / bc = (2 x 24) / (21 x 3) = 48 / 63 = 0.76
  • 102.
    Interpretation • OR = 1: risk factor not related to disease • OR < 1: risk factor protective • OR > 1: risk factor positively associated with disease
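The cross-product calculation for the worm infestation example can be sketched as below. The cell layout (a, b, c, d) follows the 2x2 table in the example; `odds_ratio` is an illustrative helper name.

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio ad/bc for a 2x2 table.

    a: exposed cases, b: exposed non-cases,
    c: unexposed cases, d: unexposed non-cases.
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)

# Pet animals at home (exposure) vs worm infestation (outcome)
or_pets = odds_ratio(a=2, b=21, c=3, d=24)
print(f"OR = {or_pets:.2f}")  # OR = 0.76
```

With only 5 infested children, an estimate like this is very imprecise, so it is best read alongside a significance test or confidence interval.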
  • 103.
    Relative risk • In a study to find the effect of birth weight on subsequent growth of children, 300 children with birth weight 2 kg to 2.5 kg were followed till age 1. A similar number of children with birth weight greater than 2.5 kg were followed up too. Anthropometric measurements were done in both groups. Results are shown below
  • 104.
    Low birth weight / Normal birth weight
    • No. of children studied: 300 / 300
    • No. malnourished at age one: 102 / 51
  • 105.
    Study design – Cohort study • Measure of risk – Relative risk, Attributable risk • Relative risk = Incidence among exposed / Incidence among nonexposed = (102/300) / (51/300) = 0.34 / 0.17 = 2 • Inference? • RR = 1: no association • RR > 1: positive association
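The relative risk from the cohort example can be reproduced as follows. The attributable risk lines are an added illustration: they assume the common definition of attributable risk as the excess incidence in the exposed, expressed as a percentage of the incidence in the exposed, which the slides name but do not work out.

```python
# Cohort data from the example: malnourished at age one / children followed
exposed_cases, exposed_total = 102, 300        # low birth weight
unexposed_cases, unexposed_total = 51, 300     # normal birth weight

incidence_exposed = exposed_cases / exposed_total        # 0.34
incidence_unexposed = unexposed_cases / unexposed_total  # 0.17

relative_risk = incidence_exposed / incidence_unexposed

# Assumed definition: excess incidence as a share of incidence in the exposed
risk_difference = incidence_exposed - incidence_unexposed
attributable_risk_pct = 100 * risk_difference / incidence_exposed

print(f"RR = {relative_risk:.1f}")
print(f"Attributable risk = {attributable_risk_pct:.0f}%")
```

An RR of 2 means that malnutrition at age one was twice as frequent among low birth weight children as among children of normal birth weight.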
  • 106.
    • An outbreak of Pediculosis capitis was being investigated in a girls' school with 291 pupils. Of 130 children who live in a nearby housing estate, 18 were infested, and of 161 who live elsewhere, 37 were infested. The Chi square value was found to be 3.93 • P value = 0.04 • Is there a significant difference in the infestation rates between the two groups?
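As a check on the figures in this exercise, the sketch below computes the uncorrected (no continuity correction) Pearson Chi square for the 2x2 table of infested and non-infested pupils. Under that assumption it gives about 3.92, close to the quoted 3.93, with a p value just under 0.05. The p value line relies on the fact that, for 1 degree of freedom, the upper tail of the Chi square distribution equals erfc(sqrt(x / 2)).

```python
import math

def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson Chi square for a 2x2 table with cells a, b, c, d."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Rows: housing estate, elsewhere; columns: infested, not infested
chi2 = chi_square_2x2(18, 130 - 18, 37, 161 - 37)

# Upper-tail probability for 1 degree of freedom
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"Chi square = {chi2:.2f}, p = {p_value:.3f}")
```

Since p is below 0.05, the difference in infestation rates (about 14% vs 23%) is statistically significant at the 5% level.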
  • 107.
    Results of a screening test
    Disease: Positive / Negative
    Test positive: TP (a), FP (b)
    Test negative: FN (c), TN (d)
  • 108.
    Features of a screening test • Sensitivity = a/(a+c) • Specificity = d/(b+d) • Positive predictive value = a/(a+b) • Negative predictive value = d/(c+d) • False positive rate = b/(b+d) • False negative rate = c/(a+c)
  • 109.
    In a group of patients presenting to a hospital emergency with abdominal pain, 30% of patients have acute appendicitis. 70% of patients with appendicitis have a temperature greater than 37.5°C and 40% of patients without appendicitis have a temperature greater than 37.5°C. Considering these findings, which of the following statements is correct? a) Sensitivity of temperature greater than 37.5°C as a marker for appendicitis is 21/49 b) Specificity of temperature greater than 37.5°C as a marker for appendicitis is 42/70 c) The positive predictive value of temperature greater than 37.5°C as a marker for appendicitis is 21/30 d) Specificity of the test will depend upon the prevalence of appendicitis in the population to which it is applied.
  • 110.
    Sensitivity and Specificity
    Appendicitis: + / -
    Fever > 37.5°C +: a = 21, b = 28
    Fever > 37.5°C -: c = 9, d = 42
    Total: a+c = 30, b+d = 70
  • 111.
    • Sensitivity = a/(a+c) = 21/30 = 70% • Specificity = d/(b+d) = 42/70 = 60% • Positive predictive value = a/(a+b) = 21/49 = 43% • Negative predictive value = d/(c+d) = 42/51 = 82%
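The screening test formulas can be collected into one small helper and applied to the appendicitis table (a = 21, b = 28, c = 9, d = 42). This is a sketch; `screening_metrics` is an illustrative name.

```python
def screening_metrics(a, b, c, d):
    """Screening test measures from a 2x2 table.

    a: true positives, b: false positives,
    c: false negatives, d: true negatives.
    """
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
        "false_positive_rate": b / (b + d),
        "false_negative_rate": c / (a + c),
    }

# Fever > 37.5°C as a marker for appendicitis, per 100 patients
metrics = screening_metrics(a=21, b=28, c=9, d=42)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Sensitivity and specificity are computed within the disease columns, so they do not change with prevalence; the predictive values are computed within the test rows and do.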
  • 112.
    Exercise 11 • Disease prevalence in a population of 10,000 was 5%. A urine sugar test with sensitivity of 70% and specificity of 80% was done on the population. The positive predictive value will be: a) 15.55% b) 70.08% c) 84.4% d) 98.06%
  • 113.
    • Total population = 10,000 • Disease prevalence = 5% • No. diseased = 500 • Applying this to a 2x2 table:
  • 114.
    2x2 table
    Disease: + / - / Total
    Test +: a = 350, b = 1900, total 2250
    Test -: c = 150, d = 7600, total 7750
    Total: 500, 9500, 10000
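The 2x2 table for this exercise can be rebuilt from the prevalence, sensitivity and specificity, and the positive predictive value read off from it. The sketch below uses the figures given in the exercise; `two_by_two` is an illustrative helper name.

```python
def two_by_two(population, prevalence, sensitivity, specificity):
    """Build the cells (a, b, c, d) of a screening 2x2 table."""
    diseased = population * prevalence
    healthy = population - diseased
    a = diseased * sensitivity   # true positives
    c = diseased - a             # false negatives
    d = healthy * specificity    # true negatives
    b = healthy - d              # false positives
    return a, b, c, d

a, b, c, d = two_by_two(10_000, prevalence=0.05, sensitivity=0.70, specificity=0.80)
ppv = a / (a + b)

print(f"a = {a:.0f}, b = {b:.0f}, c = {c:.0f}, d = {d:.0f}")
print(f"PPV = {100 * ppv:.2f}%")
```

The result, 350/2250, is about 15.6%, which corresponds to option (a). Raising the prevalence in the call above shows how the predictive value rises even though sensitivity and specificity stay the same.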
  • 116.