Kolmogorov–Smirnov Test

Presented by: Rabin BK
BSc.CSIT 5th Semester

 Introduction
 Formulation
 Example
 References

Introduction
 Named after Andrey Kolmogorov and Nikolai Smirnov
 Used to compare an observed sample distribution with a theoretical distribution (one-sample test), or to compare two samples with each other (two-sample test)
 A nonparametric test of the equality of continuous, one-dimensional probability distributions
 A general-purpose way to determine whether two samples differ significantly from each other, since it makes no assumption about the form of the underlying distribution
 Type of data: continuous and univariate (involving only one variable)
 Null hypothesis: the sample is drawn from the theoretical distribution (one-sample test), or both samples are drawn from the same distribution (two-sample test)

One Sample Test
 Formula:
D = Maximum |FO(X) − FT(X)|
Where:
FO(X) = Observed cumulative frequency distribution of a random sample of n observations:
FO(X) = k/n = (No. of observations ≤ X) / (Total no. of observations)
FT(X) = Theoretical cumulative frequency distribution under the null hypothesis.
 Acceptance criteria: If the calculated value of D is less than the critical (table) value, do not reject the null hypothesis.
 Rejection criteria: If the calculated value of D is greater than the critical (table) value, reject the null hypothesis.
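The one-sample statistic can be sketched in a few lines of Python. This is only an illustration: the function names `ks_one_sample` and `uniform_cdf`, and the three-point sample, are made up for the example and do not come from the slides. Because the observed cumulative distribution jumps at each observation, the sketch checks the gap on both sides of each jump, which is the usual convention for continuous data.

```python
def ks_one_sample(sample, cdf):
    """Return D = max |FO(x) - FT(x)| for a sample against a theoretical CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for k, x in enumerate(xs, start=1):
        f_t = cdf(x)
        # FO jumps from (k-1)/n to k/n at x, so compare FT with both values.
        d = max(d, abs(k / n - f_t), abs((k - 1) / n - f_t))
    return d


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1] (hypothetical reference)."""
    return min(1.0, max(0.0, x))


# Hypothetical sample, chosen only to keep the arithmetic easy to follow.
sample = [0.1, 0.4, 0.7]
d = ks_one_sample(sample, uniform_cdf)
print(f"D = {d:.3f}")
```

The largest gap here occurs at the last observation, where FO reaches 3/3 while FT is 0.7.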

Example:
No. of students interested in joining, by stream (O = observed, T = theoretical):

Stream    Observed (O)   Theoretical (T)   FO(X)    FT(X)    |FO(X)−FT(X)|
B.Sc.     5              12                5/60     12/60    7/60
B.A.      9              12                14/60    24/60    10/60
B.COM.    11             12                25/60    36/60    11/60
M.A.      16             12                41/60    48/60    7/60
M.COM.    19             12                60/60    60/60    0/60
Total     n=60
Source: Tutorialspoint[4]

Example contd...
 Test statistic D is calculated as:
D = Maximum |FO(X) − FT(X)| = 11/60 ≈ 0.183
 The table value of D at the 5% significance level is given by
D0 at .05 = 1.36/√n = 1.36/√60 ≈ 0.176
 The calculated value (0.183) is greater than the critical value (0.176)
 We therefore reject the null hypothesis and conclude that there is a difference among students of different streams in their intention of joining the Club.
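The arithmetic of this example can be checked with a short Python sketch. The observed and theoretical counts are taken from the table in the example; the variable names are chosen here for illustration, and the critical value uses the same 1.36/√n approximation as the slide.

```python
import math
from itertools import accumulate

# Counts per stream: B.Sc., B.A., B.COM., M.A., M.COM.
observed = [5, 9, 11, 16, 19]
theoretical = [12, 12, 12, 12, 12]
n = sum(observed)  # 60

# Cumulative relative frequencies FO(X) and FT(X).
cum_obs = [c / n for c in accumulate(observed)]
cum_theo = [c / n for c in accumulate(theoretical)]

# D is the largest absolute gap between the two cumulative columns.
diffs = [abs(o - t) for o, t in zip(cum_obs, cum_theo)]
d = max(diffs)

# 5% critical value, using the approximation 1.36 / sqrt(n).
critical = 1.36 / math.sqrt(n)
reject = d > critical

print(f"D = {d:.3f}, critical value = {critical:.3f}, reject H0: {reject}")
```

The largest gap is at B.COM. (11/60), and it exceeds the critical value only narrowly, so the conclusion depends on not rounding the critical value too coarsely.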

Two Sample Test
 May also be used to test whether two underlying one-dimensional probability distributions differ.
 The Kolmogorov–Smirnov statistic is:
Dn,m = Maximum |F1,n(X) − F2,m(X)|
where,
F1,n(X) = the empirical distribution function of the first sample
F2,m(X) = the empirical distribution function of the second sample
 The null hypothesis is rejected at level α if
Dn,m > c(α) √((n+m)/(n·m))
where,
n and m are the sizes of the first and second samples respectively
The value of c(α) is given, for large samples, by:
c(α) = √(−ln(α/2)/2), for example c(0.05) ≈ 1.36
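The two-sample test can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names and the two small samples are invented for the example, the rejection threshold is taken in its standard form c(α)·√((n+m)/(n·m)), and c(α) = √(−ln(α/2)/2) is the usual large-sample expression, so the result is only approximate for samples this small.

```python
import math
from bisect import bisect_right


def ecdf(sorted_sample, x):
    """Empirical distribution function: fraction of observations <= x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def ks_two_sample(a, b):
    """Return D = max |F1(x) - F2(x)| over all observed points."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    # Both step functions change only at data points, so checking those suffices.
    return max(abs(ecdf(a_sorted, x) - ecdf(b_sorted, x))
               for x in a_sorted + b_sorted)


def c_alpha(alpha):
    """Large-sample coefficient c(alpha) = sqrt(-ln(alpha / 2) / 2)."""
    return math.sqrt(-0.5 * math.log(alpha / 2))


def ks_two_sample_test(a, b, alpha=0.05):
    """Return (D, threshold, reject) for the two-sample test at level alpha."""
    n, m = len(a), len(b)
    d = ks_two_sample(a, b)
    threshold = c_alpha(alpha) * math.sqrt((n + m) / (n * m))
    return d, threshold, d > threshold


# Hypothetical samples that do not overlap at all, so D reaches its maximum of 1.
first = [1, 2, 3, 4, 5]
second = [6, 7, 8, 9, 10]
d, threshold, reject = ks_two_sample_test(first, second)
print(f"D = {d:.3f}, threshold = {threshold:.3f}, reject H0: {reject}")
```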

References
1. http://influentialpoints.com/Training/kolmogorov-smirnov_test.htm
2. http://www.physics.csbsju.edu/stats/KS-test.html
3. http://www.physics.csbsju.edu/stats/KS-test.n.plot_form.html
4. https://www.tutorialspoint.com/statistics/kolmogorov_smirnov_test.htm
5. https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test#Kolmogorov_distribution
6. https://www.graphpad.com/guides/prism/7/statistics/interpreting_results_kolmogorov-smirnov_test.htm?toc=0&printWindow
7. http://blog.minitab.com/blog/adventures-in-statistics-2/choosing-between-a-nonparametric-test-and-a-parametric-test
8. http://www.stats.ox.ac.uk/~massa/Lecture%2013.pdf