Statistical test
1. Parametric and
Non Parametric Tests
Presenter: Dr. Mrigesh
Facilitator: Dr. Swati
Moderator: Dr. Ranjan Das
Dr. Manish Goel
12/6/2017 1
2. Plan of presentation
• Data and its type.
• Standard normal distribution curve and its
characteristics.
• Hypothesis and types of error.
• Steps involved in testing of hypothesis.
• Parametric test
Pre-requisites of parametric test
Parametric test with examples
Limitations.
3. • Non Parametric test
Pre-requisites of non Parametric test
Types of non Parametric test
Limitations.
• Quick recap.
• References.
4. Data: A collection of items of information.
(A Dictionary of Epidemiology, Miquel Porta)
DATA
• Qualitative: Ordinal, Nominal
• Quantitative: Continuous, Discrete
  Measurement scales: Interval, Ratio
5. Standard Normal Distribution Curve
Characteristics
• Has a bell shaped curve, symmetric.
• Mean = Median = Mode
• The total area under the curve is 1 (or 100%)
• The tails of the curve are infinite.
7. Hypothesis : A supposition, arrived at from observation
or reflection that leads to refutable prediction.
• Two types:
A) Null hypothesis: No relationship between the
variables being studied.
B) Alternate hypothesis : There is a relationship
between the variables being studied.
– Eg. A new drug A came in market and the
manufacturer claims that it is more effective than
drug B for treating angina.
9. Steps in Testing of Hypothesis
1. Determine the appropriate test
2. Establish the level of significance (α)
3. Formulate the statistical hypothesis
4. Calculate the test statistic
5. Determine the degree of freedom
6. Compare computed test statistic against a
tabled/critical value
10. Parametric Statistics
• A branch of statistics that assumes data comes
from a type of probability distribution and
makes inferences about the parameters of the
distribution
• Also called distribution-dependent
statistics, or classical or standard tests
11. Assumptions
• Variable in question has a known underlying
mathematical distribution (Gaussian)
• They show the same variance
• Usually used for continuous data, ratio/interval data
12. Types of parametric test
Test of Significance based on - Number of groups
- Sample size
2 groups
Large samples : Z-test
Small samples : t-test
To compare mean of more than 2 groups: F-test -
Analysis of Variance (ANOVA)
13. Z-test
• The statistic is also called the standard normal variate,
standard score, Z score, z-value or normal score.
• Pre-requisites:
Samples must be randomly selected
Large sample size N ≥30
Data can be quantitative or qualitative
Variable assumed to follow normal distribution.
14. • Test statistic: $Z = \dfrac{\text{difference observed}}{\text{standard error}}$
• If $|Z| > 1.96$, then $p < 0.05$ (two-sided): reject NH
• If $|Z| < 1.96$, then $p > 0.05$ (two-sided): fail to reject NH
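The link between the 1.96 cut-off and p = 0.05 can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `normal_cdf` and `two_sided_p` are illustrative and do not come from the slides.

```python
import math

def normal_cdf(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_sided_p(z: float) -> float:
    """Two-sided p-value for an observed Z statistic."""
    return 2.0 * (1.0 - normal_cdf(abs(z)))

# A Z of 1.96 sits almost exactly on the 5% boundary.
print(round(two_sided_p(1.96), 3))
```

Any |Z| above 1.96 gives a two-sided p below 0.05, which is the rejection rule stated on the slide.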
15. Qualitative data:
i) Test for single proportion:
Let $p$ = proportion in the sample
$P$ = proportion in the population, $Q = 1 - P$
Then $Z = \dfrac{p - P}{SE_p}$, where $SE_p = \sqrt{\dfrac{PQ}{n}}$
16. ii) Test for two proportions:
Let $p_1$ = proportion in sample 1, $q_1 = 1 - p_1$
$p_2$ = proportion in sample 2, $q_2 = 1 - p_2$
Then $Z = \dfrac{p_1 - p_2}{SE_{p_1 - p_2}}$, where $SE_{p_1 - p_2} = \sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}$
17. Quantitative data:
i) Test for single mean:
Let $\bar{x}$ = mean of sample
$\mu$ = mean of population
Then $Z = \dfrac{\bar{x} - \mu}{SE_{\bar{x}}}$, where $SE_{\bar{x}} = \dfrac{S}{\sqrt{n}}$
18. ii) Test for two means:
Let $\bar{x}_1$ = mean of sample 1
$\bar{x}_2$ = mean of sample 2
Then $Z = \dfrac{\bar{x}_1 - \bar{x}_2}{SE_{\bar{x}_1 - \bar{x}_2}}$, where $SE_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
19. Ex: The complication rate of a drug tried on 100 patients was
15%. A new drug administered to another group of 100 patients
had a complication rate of 7%. Test whether the new drug is
superior in terms of reducing complications.
Soln: NH: P1 = P2   AH: P1 ≠ P2
n1 = 100, p1 = 15%; n2 = 100, p2 = 7%
$Z = \dfrac{p_1 - p_2}{SE_{p_1 - p_2}} = \dfrac{15 - 7}{\sqrt{\dfrac{15 \times 85}{100} + \dfrac{7 \times 93}{100}}} = \dfrac{8}{4.39} = 1.82$
As Z < 1.96, p > 0.05
Fail to reject NH
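The arithmetic of this example can be reproduced with a short script. This is a minimal sketch; the function name `two_proportion_z` is illustrative, and proportions are kept in percent to match the slide.

```python
import math

def two_proportion_z(p1: float, n1: int, p2: float, n2: int):
    """Z statistic and standard error for two sample proportions given in percent."""
    q1, q2 = 100.0 - p1, 100.0 - p2
    se = math.sqrt(p1 * q1 / n1 + p2 * q2 / n2)
    return (p1 - p2) / se, se

# Old drug: 15% complications in 100 patients; new drug: 7% in 100 patients.
z, se = two_proportion_z(15.0, 100, 7.0, 100)
print(f"SE = {se:.2f}, Z = {z:.2f}")
```

Because the resulting Z is below 1.96, the two-sided test at the 5% level does not reject NH.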
20. t-Test
• Pre-requisites:
Samples must be randomly selected
Small sample size N < 30
Data must be quantitative
Variable assumed to follow normal distribution.
22. Degrees of freedom (DF):
• The number of observations that remain free to vary once the required parameters have been estimated
• Hence, DF = n – k
where, n = no. of observations (sample size)
k = no. of population parameters estimated
from this sample.
23. • Critical value is read from the t-table according to the degrees of freedom and
the chosen p value
• Test statistic: $t_{d,p} = \dfrac{\text{difference observed}}{\text{standard error}}$
where, t = t value
d = DF
p = p value
24. Test with single mean (One sample t-test):
$t = \dfrac{\bar{x} - \mu}{SE_{\bar{x}}}$
where $SE_{\bar{x}} = \dfrac{SD}{\sqrt{n}}$
DF = n - 1
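25. Test with two means:
i) Unpaired t-test:
$t = \dfrac{\bar{x}_1 - \bar{x}_2}{SE_{\bar{x}_1 - \bar{x}_2}}$
where $SE_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{S^2}{n_1} + \dfrac{S^2}{n_2}}$ and $S^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
DF = n1 + n2 - 2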
26. Ex: In a clinical trial of comparing efficacy of two
antihypertensive drugs A & B, Drug A was given to
10 randomly selected pt’s & at the end of trial mean
DBP was 88mmHg with SD of 5mmHg. Drug B was
given to 8 randomly selected pt’s & at end of trial
mean DBP 94mmHg with SD of 6mmHg. Test
whether drug A differs from drug B in the treatment
of hypertension?
Soln: NH: Drug A = Drug B   AH: Drug A ≠ Drug B
n1 = 10, $\bar{x}_1$ = 88, SD1 = 5
n2 = 8, $\bar{x}_2$ = 94, SD2 = 6
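27. Soln (contd.):
Pooled variance: $S^2 = \dfrac{(10 - 1) \times 5^2 + (8 - 1) \times 6^2}{10 + 8 - 2} = 29.81$
$SE = \sqrt{\dfrac{29.81}{10} + \dfrac{29.81}{8}} = 2.59$
$t = \dfrac{\bar{x}_1 - \bar{x}_2}{SE} = \dfrac{88 - 94}{2.59} = -2.32$, so $|t| = 2.32$
DF = 10 + 8 - 2 = 16
Two-tailed critical t for 16 DF at p = 0.05 is 2.12
$|t| > t_{16,0.05}$
Hence, reject NH: Drug A differs from Drug B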
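The unpaired t statistic for this example can be recomputed from the summary figures alone. This is a minimal sketch assuming equal variances in the two groups (the pooled form of the test); the function name `pooled_t` is illustrative.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpaired (pooled-variance) t statistic and its degrees of freedom."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var / n1 + pooled_var / n2)
    return (mean1 - mean2) / se, n1 + n2 - 2

# Drug A: n = 10, mean DBP 88, SD 5; Drug B: n = 8, mean DBP 94, SD 6.
t, df = pooled_t(88, 5, 10, 94, 6, 8)
print(f"t = {t:.2f}, DF = {df}")
```

The sign of t only reflects which group is listed first; the comparison with the table value uses its absolute size.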
28. ii) Paired t-test (pre-post comparison):
$t = \dfrac{\bar{d}}{SE_{\bar{d}}}$, where $\bar{d} = \dfrac{\sum d}{n}$
$SE_{\bar{d}} = \dfrac{SD_d}{\sqrt{n}}$
$SD_d = \sqrt{\dfrac{\sum (d - \bar{d})^2}{n - 1}}$
DF = n - 1
30. Ex: 5 persons were chosen randomly & their PR were recorded
before & after administration of drug. Results after 5min of
administration are as follows. Test whether drug changes PR?
Individual Before DA After DA
1 92 88
2 90 88
3 96 90
4 98 88
5 92 89
31. Individual   Before DA (x1)   After DA (x2)   d = x1 - x2
1            92               88              4
2            90               88              2
3            96               90              6
4            98               88              10
5            92               89              3
Total                                         25
$\bar{d} = \dfrac{\sum d}{n} = \dfrac{25}{5} = 5$
32. Individual   Before DA (x1)   After DA (x2)   d = x1 - x2   $(d - \bar{d})^2$
1            92               88              4             1
2            90               88              2             9
3            96               90              6             1
4            98               88              10            25
5            92               89              3             4
Total                                         25            40
$SD_d = \sqrt{\dfrac{40}{5 - 1}} = \sqrt{10} = 3.162$
$SE_{\bar{d}} = \dfrac{SD_d}{\sqrt{n}} = \dfrac{3.162}{2.236} = 1.414$
33. Soln: NH: mean PR difference = 0   AH: mean PR difference ≠ 0
• Paired t = $\bar{d} / SE_{\bar{d}}$ = 5 / 1.414 = 3.54
• DF = n - 1 = 5 - 1 = 4
• Two-tailed critical t for 4 DF at p = 0.05 is 2.78
• Calculated paired t = 3.54 > $t_{4,0.05}$
• Hence, reject NH: the drug causes a change in PR
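The paired calculation can also be run directly on the before and after readings from the example. This is a minimal sketch; the function name `paired_t` is illustrative, and the differences are taken as before minus after.

```python
import math

def paired_t(x1, x2):
    """Paired t statistic and degrees of freedom for matched readings."""
    d = [a - b for a, b in zip(x1, x2)]   # per-individual differences
    n = len(d)
    mean_d = sum(d) / n
    sd_d = math.sqrt(sum((v - mean_d) ** 2 for v in d) / (n - 1))
    se_d = sd_d / math.sqrt(n)
    return mean_d / se_d, n - 1

before = [92, 90, 96, 98, 92]
after = [88, 88, 90, 88, 89]
t, df = paired_t(before, after)
print(f"t = {t:.2f}, DF = {df}")
```

Working from the raw readings avoids transcription slips in the difference column, which is where hand calculations most often go wrong.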
34. Analysis of Variance (ANOVA)
• Used to compare the means of two or more samples to see
whether they come from the same population.
• Compares variances within groups to variances between
groups (F-value)
• It compares groups (independent variable) on a single
continuous response variable (dependent variable).
• E.g. comparing test scores by level of education
36. Pre-requisites:
• Variables assumed to follow normal distribution
• Individuals in various groups must be randomly selected
• Samples comprising the groups should be independent
• All groups have same standard deviation (variance)
• Variables must be quantitative (means)
37. • Here, variance is calculated as within-group variation and
between-group variation
• Between-groups variation: $E_1 = \dfrac{\text{sum of squares between groups}}{k - 1}$
where, k = no. of groups
• Within-group variation: $E_2 = \dfrac{\text{sum of squares within groups}}{n - k}$
where, n = total no. of observations
38. • Test statistic: $F = \dfrac{E_1}{E_2} = \dfrac{\text{variance between groups}}{\text{variance within groups}}$
• Degrees of Freedom (DF) = (k - 1, n - k)
• F table: DF between groups (k - 1) in columns,
DF within groups (n - k) in rows
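The F ratio can be sketched in a few lines of code. The scores below are hypothetical and chosen only so the arithmetic is easy to follow; the function name `one_way_anova_f` is likewise illustrative.

```python
def one_way_anova_f(groups):
    """F statistic and (between, within) degrees of freedom for one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    e1 = ss_between / (k - 1)   # between-groups variation (E1)
    e2 = ss_within / (n - k)    # within-group variation (E2)
    return e1 / e2, (k - 1, n - k)

# Hypothetical test scores for three education levels.
scores = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df = one_way_anova_f(scores)
print(f_value, df)
```

The computed F is then compared with the F-table value at the two degrees of freedom returned.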
41. ANCOVA: Analysis of co-variance
• ANCOVA has single continuous response variable.
• It compares a response variable by both a factor and a
continuous independent variable
47. NON-PARAMETRIC TEST
Assumptions
• Distribution of variables need not follow Gaussian
distribution.
• Sample observations are independent
• Variable is continuous or ordinal
• Small samples (n<30)
48. Categorical Outcomes
• Chi square test
• Sign test
• McNemar test
• Fisher exact test
Numerical Outcomes
• Wilcoxon Signed Rank test
• Wilcoxon Rank Sum test
• Kruskal Wallis test
49. Chi square test
• Most commonly used non parametric test
• Introduced by Karl Pearson in 1900.
Pre-requisites
• Random sample data
• To be applied to actual counts, not percentages
• Adequate cell size: expected count of 5 or more in all cells of a 2x2 table,
and 5 or more in 80% of cells in larger tables, with no
cell having a zero count.
• Observations must be independent
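50. • Test statistic: $\chi^2 = \sum \dfrac{(O - E)^2}{E}$
where, O = observed count and $E = \dfrac{\text{Row Total} \times \text{Column Total}}{\text{Grand Total}}$
• DF = (r - 1)(c - 1), where r = no. of rows and c = no. of columns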
51. Characteristics of Chi Square
i) Can be applied to more than 2 groups (Test of
Homogeneity)
ii) Association can be found (Test of Association)
iii) Test for Goodness of fit – to test whether a given
distribution is a good fit to the given data
iv) Yates’ correction: an arbitrary, conservative adjustment to
chi square applied to a 2x2 table when one or more cells
have an expected value <5
$\chi^2 = \sum \dfrac{(|O - E| - 0.5)^2}{E}$
52. Ex: An antihypertensive drug trial was conducted in
Belgaum and it found that of 60 patients who
received drug A, 45 had some complication and of 60
patients who received drug B, 33 had complications.
Is the drug safe?
Soln: NH: No relation between drug and complications
AH: Drug and complications are related
Drug    Complications: No    Complications: Yes    Total
A       15                   45                    60
B       27                   33                    60
Total   42                   78                    120
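53. Observed counts, with expected counts in parentheses:
Drug    Complications: No    Complications: Yes    Total
A       15 (21)              45 (39)               60
B       27 (21)              33 (39)               60
Total   42                   78                    120
$\chi^2 = \sum \dfrac{(O - E)^2}{E} = \dfrac{(15 - 21)^2}{21} + \dfrac{(45 - 39)^2}{39} + \dfrac{(27 - 21)^2}{21} + \dfrac{(33 - 39)^2}{39} = 5.27$
DF = (r - 1)(c - 1) = (2 - 1)(2 - 1) = 1
Critical value: $\chi^2_{1,0.05} = 3.84$
As $\chi^2$ > critical value, reject NH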
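The expected counts and the chi square statistic for this table can be computed as follows. This is a minimal sketch without Yates' correction; the function name `chi_square` is illustrative.

```python
def chi_square(table):
    """Pearson chi square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Rows: drug A, drug B. Columns: no complication, complication.
stat, df = chi_square([[15, 45], [27, 33]])
print(f"chi2 = {stat:.2f}, DF = {df}")
```

The same function handles larger tables, since the expected count for every cell comes from its row and column totals.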
54. Sign test
• Analyzes the sign of the difference between paired
observations, from either the same individuals or related
individuals
• Alternative to ‘paired t-test’
• The probability of the observed numbers of positive and negative signs is calculated
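The probability used by the sign test comes from the binomial distribution with a 50% chance of each sign. The following is a minimal sketch; the function name `sign_test_p` and the counts passed to it are hypothetical, and pairs with zero difference are assumed to have been dropped beforehand.

```python
from math import comb

def sign_test_p(n_positive: int, n_negative: int) -> float:
    """Two-sided exact sign-test p-value under NH (each sign equally likely)."""
    n = n_positive + n_negative
    k = min(n_positive, n_negative)
    # Probability of seeing k or fewer of the rarer sign.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical case: 5 pairs, all differences in the same direction.
print(sign_test_p(5, 0))
```

With only five pairs, even a unanimous direction gives p above 0.05, which illustrates how little power the sign test has in very small samples.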
55. McNemar test
• Similar to 2x2 chi square test
• For comparison of variables from matched pairs
• Can also be used for pre and post samples
56. McNemar $\chi^2$ with continuity correction: $\chi^2_c = \dfrac{(|b - c| - 1)^2}{b + c}$
Intervention   Outcome: Yes   Outcome: No   Total
Yes            a              b             a+b
No             c              d             c+d
Total          a+c            b+d           a+b+c+d = N
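Only the two off-diagonal cells, b and c, enter the McNemar statistic. A minimal sketch follows; the function name `mcnemar_chi2` and the counts used are hypothetical.

```python
def mcnemar_chi2(b: int, c: int) -> float:
    """Continuity-corrected McNemar chi square from the off-diagonal cells b and c."""
    if b + c == 0:
        raise ValueError("the statistic is undefined when b + c = 0")
    return (abs(b - c) - 1) ** 2 / (b + c)

# Hypothetical counts: b = 15, c = 5.
print(mcnemar_chi2(15, 5))
```

The result is compared with the chi square critical value for 1 DF (3.84 at p = 0.05).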
57. Fisher exact test
• Used when the expected values are <5 in more than
20% of cells or one of them is zero
• Can be used for r x c tables
• It gives exact probability
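For a 2x2 table the exact probability comes from the hypergeometric distribution with the margins held fixed. The sketch below covers the 2x2 case only and uses one common two-sided convention (summing all tables no more probable than the observed one); the function name `fisher_exact_2x2` and the counts are hypothetical.

```python
from math import comb

def fisher_exact_2x2(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x: int) -> float:
        # Probability of x in the top-left cell with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Small tolerance so that equally probable tables are not lost to rounding.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Hypothetical small table where every expected count is below 5.
print(round(fisher_exact_2x2(3, 1, 1, 3), 4))
```

Because the p-value is computed exactly rather than from a chi square approximation, it remains valid however small the cell counts are.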
59. Wilcoxon Signed Rank test
• Comparison in a single sample
• For pre and post intervention comparison
• Medians compared
• Observations are ranked and then compared
63. Wilcoxon Rank Sum test
• Comparing two independent samples
• Medians are compared
• Observations are ranked and then compared
64. • Example: To compare the scores on a quantitative
variable obtained from two independent groups
Group A: 2 4 2 6 4 8
Group B: 8 8 4 10 12 11
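The ranking step for these two groups can be sketched as follows. Tied scores are given the average of the ranks they occupy, which is the usual convention but is an assumption here since the slide does not state how ties are handled; the function name `midranks` is illustrative.

```python
def midranks(values):
    """Rank values from 1 upwards, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1   # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

group_a = [2, 4, 2, 6, 4, 8]
group_b = [8, 8, 4, 10, 12, 11]
ranks = midranks(group_a + group_b)      # rank the pooled observations
rank_sum_a = sum(ranks[:len(group_a)])
rank_sum_b = sum(ranks[len(group_a):])
print(rank_sum_a, rank_sum_b)
```

The smaller rank sum is the quantity looked up in the Wilcoxon rank sum table for the two sample sizes.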
70. Quick recap contd..
[Flowchart: choosing a test for a two-sample problem. Decision nodes: "Distribution normal?", "Sample size" (large or small), "Samples independent?" and "All expected values <5?". Tests reached: Z test, unpaired t test, paired t test, Chi square test, Fisher's exact test, McNemar test.]
72. References
• Park K. Park’s Textbook of Preventive and Social Medicine. 23rd ed. Jabalpur: M/s Banarsidas Bhanot; 2015.
• Bhalwar R. Textbook of Public Health and Community Medicine. Pune: Department of Community Medicine, AFMC; 2009.
• Sunder L, Adarsh, Pankaj. Textbook of Community Medicine: Preventive and Social Medicine. 4th ed. New Delhi: CBS Publishers & Distributors Pvt Ltd; 2014.
• Indrayan A, Holt P. Concise Encyclopedia of Biostatistics for Medical Professionals. CRC Press, Taylor and Francis Group; 2017.
73. • Beaglehole R, Bonita R. Basic Epidemiology. 2nd ed. WHO Library Cataloguing-in-Publication Data; 2002.
• Armitage P, Berry G. Statistical Methods in Medical Research. 4th ed. Blackwell Science; 2002.
• Indrayan A, Sarmukaddam B. Medical Biostatistics. 1st ed. New York, Basel: Marcel Dekker, Inc.; 2001.
• Das R, Das P. Biomedical Research Methodology including Biostatistical Applications. Jaypee Brothers Medical Publishers (P) Ltd.; 2011.
• Negi K. Methods in Biostatistics. 1st ed. AITBS Publisher India; 2012.
• Porta M. A Dictionary of Epidemiology. 6th ed. BEA; 2014.