Biostatistics Standard deviation and variance

Standard Deviation & Variance
Dr. Harinatha Reddy
Sri Krishnadevaraya University
Department of Microbiology

Standard deviation:
• In statistics, the standard deviation is a measure used to
quantify the amount of variation or dispersion in a set of data.
• It is represented by the Greek letter σ (sigma), or in short form by S or SD.
• It is also known as the root mean square deviation.
Formula for ungrouped data:
$SD = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$ or $SD = \sqrt{\frac{\sum dx^2}{N}}$, where $dx = X - \bar{X}$

SD for Ungrouped data:
Example 1: A hen lays eight eggs. Each egg was weighed and
recorded as follows:
60, 56, 61, 68, 51, 53, 69, 54

S.No    Variable (X)   Deviation (d or dx) = X - X̄   (X - X̄)² or dx²
1       60             60 - 59 = 1                    1 × 1 = 1
2       56             56 - 59 = -3                   (-3) × (-3) = 9
3       61             61 - 59 = 2                    2 × 2 = 4
4       68             68 - 59 = 9                    81
5       51             51 - 59 = -8                   64
6       53             53 - 59 = -6                   36
7       69             69 - 59 = 10                   100
8       54             54 - 59 = -5                   25
N = 8   ∑X = 472                                      ∑dx² = 320

Mean (X̄) = ∑X/N
= 472/8
= 59
$SD = \sqrt{\frac{\sum dx^2}{N}} = \sqrt{\frac{320}{8}} = \sqrt{40}$
SD = 6.32
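
The arithmetic in Example 1 can be checked with a short Python sketch. It assumes the population form of the formula (dividing by N), as the worked example does; the variable names are illustrative only.

```python
import math
import statistics

# Egg weights from Example 1.
weights = [60, 56, 61, 68, 51, 53, 69, 54]

n = len(weights)
mean = sum(weights) / n                          # sum of X divided by N
squared_devs = [(x - mean) ** 2 for x in weights]
sum_sq = sum(squared_devs)                       # sum of dx squared
sd = math.sqrt(sum_sq / n)                       # divide by N, as in the example

print(f"mean = {mean}, sum of squared deviations = {sum_sq}, SD = {sd:.2f}")

# Cross-check against the standard library's population standard deviation.
assert math.isclose(sd, statistics.pstdev(weights))
```

`statistics.pstdev` divides by N; `statistics.stdev` divides by N - 1 and would give a slightly larger value for the same data.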

Exercise 1: Calculate SD for ungrouped data:
• 10,15,20,25,4,8

Standard deviation for grouped data (discrete variables):
$SD = \sqrt{\frac{\sum f (X - \bar{X})^2}{n}}$
Where n = ∑f

Example: Standard deviation for grouped data (discrete
variables):
Workers (X)   Frequency (f)
0             1
1             1
2             2
3             3
4             6
5             5
6             4
7             3
8             3
9             2
First calculate the mean for discrete data:
Mean (X̄) = ∑fx / ∑f

Workers (X)   Frequency (f)   fx
0             1               0
1             1               1
2             2               4
3             3               9
4             6               24
5             5               25
6             4               24
7             3               21
8             3               24
9             2               18
              ∑f = 30         ∑fx = 150
Mean (X̄) = ∑fx / ∑f
= 150/30
= 5

Workers (X)   Frequency (f)   fx          (X - X̄)       (X - X̄)²   (X - X̄)² f
0             1               0           0 - 5 = -5    25          25 × 1 = 25
1             1               1           1 - 5 = -4    16          16 × 1 = 16
2             2               4           2 - 5 = -3    9           9 × 2 = 18
3             3               9           -2            4           12
4             6               24          -1            1           6
5             5               25          0             0           0
6             4               24          1             1           4
7             3               21          2             4           12
8             3               24          3             9           27
9             2               18          4             16          32
              ∑f = 30         ∑fx = 150                             ∑(X - X̄)² f = 152
$SD = \sqrt{\frac{\sum f (X - \bar{X})^2}{n}} = \sqrt{\frac{152}{30}} = \sqrt{5.07} \approx 2.25$
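
As a rough check of the discrete-data calculation, the frequency table can be held in a dictionary and the weighted sums computed directly. This is only a sketch: it assumes the divisor is n = ∑f, and the names `freq` and `weighted_ss` are illustrative.

```python
import math

# Frequency table from the example: number of workers (X) -> frequency (f).
freq = {0: 1, 1: 1, 2: 2, 3: 3, 4: 6, 5: 5, 6: 4, 7: 3, 8: 3, 9: 2}

n = sum(freq.values())                                            # sum of f
mean = sum(x * f for x, f in freq.items()) / n                    # sum of fx / sum of f
weighted_ss = sum(f * (x - mean) ** 2 for x, f in freq.items())   # sum of f(X - mean)^2
sd = math.sqrt(weighted_ss / n)

print(f"n = {n}, mean = {mean}, weighted sum of squares = {weighted_ss}, SD = {sd:.2f}")
```

Each squared deviation is multiplied by its frequency because a value that occurs f times contributes f identical terms to the sum.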

Exercise 1: Calculate the SD for the following discrete data:
Blood cells   No. of days
90            5
55            9
60            5
70            4
80            10
100           20

Standard deviation for grouped data
(continuous series)

Hours     Number of students
10-14     2
15-19     12
20-24     23
25-29     60
30-34     77
35-39     38
40-44     8
Table 1. Number of hours per week spent
watching television
First calculate the mean: Mean (X̄) = ∑fm / ∑f
m = midpoint of the class (also written as X)

Hours      Midpoint (x)   Frequency (f)   fx
10 to 14   12             2               24
15 to 19   17             12              204
20 to 24   22             23              506
25 to 29   27             60              1,620
30 to 34   32             77              2,464
35 to 39   37             38              1,406
40 to 44   42             8               336
                          ∑f = 220        ∑fx = 6,560
Mean (X̄) = ∑fm / ∑f
= 6,560/220
= 29.82

Hours   Midpoint (M or X)   Frequency (f)   fm or fx          (M - X̄)               (M - X̄)²            (M - X̄)² f
10-14   12                  2               12 × 2 = 24       12 - 29.82 = -17.82   (-17.82)² = 317.6   317.6 × 2 = 635.2
15-19   17                  12              17 × 12 = 204     17 - 29.82 = -12.82   (-12.82)² = 164.4   164.4 × 12 = 1,972.8
20-24   22                  23              22 × 23 = 506     -7.82                 61.2                1,407.6
25-29   27                  60              27 × 60 = 1,620   -2.82                 8.0                 480.0
30-34   32                  77              32 × 77 = 2,464   2.18                  4.8                 369.6
35-39   37                  38              37 × 38 = 1,406   7.18                  51.6                1,960.8
40-44   42                  8               42 × 8 = 336      12.18                 148.4               1,187.2
                            ∑f = 220        ∑fx = 6,560                                                 ∑(M - X̄)² f = 8,013.2
Mean (X̄) = 29.82
$SD = \sqrt{\frac{\sum f (M - \bar{X})^2}{n}}$
Where n = ∑f
M is written as X in the equation
(the midpoint (M) of the class is also called X)
$SD = \sqrt{\frac{8{,}013.2}{220}} = \sqrt{36.42} \approx 6.03$
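
The continuous-series calculation follows the same pattern once each class is replaced by its midpoint. The sketch below assumes the class limits of Table 1 and a divisor of n = ∑f; the list layout and variable names are illustrative.

```python
import math

# Class limits and frequencies from Table 1 (hours of television per week).
classes = [((10, 14), 2), ((15, 19), 12), ((20, 24), 23), ((25, 29), 60),
           ((30, 34), 77), ((35, 39), 38), ((40, 44), 8)]

# The midpoint of each class stands in for every observation in that class.
midpoints = [(lo + hi) / 2 for (lo, hi), _ in classes]
freqs = [f for _, f in classes]

n = sum(freqs)
mean = sum(m * f for m, f in zip(midpoints, freqs)) / n
weighted_ss = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, freqs))
sd = math.sqrt(weighted_ss / n)

print(f"n = {n}, mean = {mean:.2f}, SD = {sd:.2f}")
```

Because the code keeps full precision, its weighted sum of squares (about 8,002.7) is slightly below the hand-computed 8,013.2, which rounds the mean and each squared deviation; the SD still rounds to 6.03.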

Coefficient of standard deviation

$\text{Coefficient of standard deviation} = \frac{\text{Standard deviation (SD)}}{\text{Arithmetic mean } (\bar{X})} = \frac{SD}{\bar{X}}$

For the television-hours data, Mean (X̄) = 29.82 and SD = 6.03:
$\text{Coefficient of SD} = \frac{SD}{\bar{X}} = \frac{6.03}{29.82} \approx 0.202$
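
The ratio is simple enough to wrap in a small helper. The function name `coefficient_of_sd` is hypothetical, and the inputs are the rounded SD and mean from the television-hours example.

```python
def coefficient_of_sd(sd: float, mean: float) -> float:
    """Return SD / mean, a unit-free measure of relative dispersion."""
    if mean == 0:
        raise ValueError("the coefficient is undefined when the mean is zero")
    return sd / mean


# Values from the television-hours example: SD = 6.03, mean = 29.82.
coef = coefficient_of_sd(6.03, 29.82)
print(round(coef, 3))
```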

Uses of standard deviation:
• Standard deviation is based on all the observations.
• Of all the measures of dispersion, standard deviation is the best
because it is least affected by sampling fluctuations.
• It is used in calculating the standard error.

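Variance:
• The variance is the arithmetic mean of the squared
deviations from the mean value of the data.
• It is represented by s² or σ².
• Formula for ungrouped data:
$s^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$
• Dividing by N - 1 gives the sample variance s²; dividing by N gives the population variance σ².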

Example: Variance for ungrouped data:
• 23, 22, 20, 24, 16, 17, 18, 19, 21
• Ascending order: 16, 17, 18, 19, 20, 21, 22, 23, 24
• Mean: 180/9 = 20

S.No    Observation (X)   Deviation from mean (dx or d = X - X̄)   Square of deviation (X - X̄)² or dx²
1       16                16 - 20 = -4                             (-4)² = 16
2       17                17 - 20 = -3                             (-3)² = 9
3       18                18 - 20 = -2                             4
4       19                19 - 20 = -1                             1
5       20                20 - 20 = 0                              0
6       21                21 - 20 = 1                              1
7       22                22 - 20 = 2                              4
8       23                23 - 20 = 3                              9
9       24                24 - 20 = 4                              16
N = 9   ∑X = 180                                                   ∑(X - X̄)² = 60
$s^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$
s² = 60/(9 - 1)
s² = 7.5
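
A minimal Python sketch of the same calculation is given here, with the N - 1 divisor used in the example and, for comparison, the N divisor. The variable names are illustrative.

```python
import math
import statistics

data = [23, 22, 20, 24, 16, 17, 18, 19, 21]

n = len(data)
mean = sum(data) / n
ss = sum((x - mean) ** 2 for x in data)    # sum of squared deviations

sample_variance = ss / (n - 1)             # divisor used in the worked example
population_variance = ss / n               # shown only for comparison

print(f"mean = {mean}, sum of squares = {ss}")
print(f"sample variance = {sample_variance}, population variance = {population_variance:.2f}")

# The standard library's variance() also divides by N - 1.
assert math.isclose(sample_variance, statistics.variance(data))
```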

Exercise 1: Calculate the variance for the following ungrouped data:
• 10, 2, 8, 6, 15, 20, 4, 5

Variance for grouped data (continuous series):
Variance of grouped data formula:
$s^2 = \frac{\sum f (M - \bar{X})^2}{N - 1}$
X̄: mean
M or X: midpoint of the class interval
f: frequency of the class
N = ∑f: total frequency
The first step in calculating the variance for grouped data is
to calculate the mean:
Mean (X̄) = ∑fm / ∑f

Example: Variance for grouped data:
Age     H1N1 patients
31-35   2
36-40   3
41-45   8
46-50   12
51-55   16
56-60   5
61-65   2
66-70   2
The first step in calculating the variance for grouped data is
to calculate the mean:
Mean (X̄) = ∑fm / ∑f

Class interval   Midpoint (M or X)   Frequency (f)   fm or fx
31-35            33                  2               66
36-40            38                  3               114
41-45            43                  8               344
46-50            48                  12              576
51-55            53                  16              848
56-60            58                  5               290
61-65            63                  2               126
66-70            68                  2               136
                                     ∑f = 50         ∑fm = 2500
Mean (X̄) = ∑fm / ∑f
= 2500/50
= 50

Mean (X̄) = 50
Class interval   Midpoint (M or X)   Frequency (f)   fm or fx     (M - X̄)         (M - X̄)²           f(M - X̄)²
31-35            33                  2               66           33 - 50 = -17   (-17)² = 289       289 × 2 = 578
36-40            38                  3               114          38 - 50 = -12   144                144 × 3 = 432
41-45            43                  8               344          -7              49                 392
46-50            48                  12              576          -2              4                  48
51-55            53                  16              848          3               9                  144
56-60            58                  5               290          8               64                 320
61-65            63                  2               126          13              169                338
66-70            68                  2               136          18              324                648
                                     ∑f = 50         ∑fm = 2500                   ∑(M - X̄)² = 1052   ∑f(M - X̄)² = 2900
$s^2 = \frac{\sum f (M - \bar{X})^2}{N - 1}$
= 2900/(50 - 1)
= 2900/49 = 59.18
SD or S = √59.18 = 7.69
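
The grouped-variance steps can be collected into one small function. This is a sketch rather than a reference implementation: `grouped_variance` and its `ddof` parameter are hypothetical names (the latter borrowed from common numerical-library usage), and `ddof=1` reproduces the N - 1 divisor of the example.

```python
import math


def grouped_variance(midpoints, freqs, ddof=1):
    """Variance of grouped data from class midpoints and frequencies.

    ddof=1 divides by (sum of f) - 1, as in the worked example;
    ddof=0 divides by the sum of f.
    """
    n = sum(freqs)
    mean = sum(m * f for m, f in zip(midpoints, freqs)) / n
    weighted_ss = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, freqs))
    return weighted_ss / (n - ddof)


# Midpoints and frequencies from the H1N1 age table.
midpoints = [33, 38, 43, 48, 53, 58, 63, 68]
freqs = [2, 3, 8, 12, 16, 5, 2, 2]

variance = grouped_variance(midpoints, freqs)
sd = math.sqrt(variance)
print(f"variance = {variance:.2f}, SD = {sd:.2f}")
```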

Co-efficient of Variation (CV)
$CV = \frac{\text{Standard deviation}}{\text{Mean}} \times 100$
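
Because the CV is a percentage with no units, it can be used to compare the relative spread of data sets measured on different scales. The sketch below applies it to two results already worked out in these notes; the function name is hypothetical and the comparison itself is an illustration, not part of the original examples.

```python
def coefficient_of_variation(sd: float, mean: float) -> float:
    """Return the SD expressed as a percentage of the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sd / mean * 100


# Rounded SD and mean taken from the two grouped-data examples.
cv_age = coefficient_of_variation(7.69, 50)       # H1N1 patient ages
cv_tv = coefficient_of_variation(6.03, 29.82)     # television hours per week

print(f"CV (age) = {cv_age:.2f}%, CV (TV hours) = {cv_tv:.2f}%")
```

The television-hours data have the smaller SD but the larger CV, so they vary more relative to their mean than the patient ages do.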

Exercise: Calculate the variance, standard deviation and co-efficient of variation for
the data.
Yield of wheat per hectare   No. of wheat fields
10-20                        22
20-30                        5
30-40                        2
50-60                        12
60-70                        16
70-80                        10
First find the SD, then the CV:
CV = (Standard deviation / Mean) × 100

Class interval   Midpoint (M or X)   Frequency (f)   fm or fx   (X - X̄) or (m - X̄)   (X - X̄)² or (m - X̄)²   f(X - X̄)² or f(m - X̄)²
10-20                                22
20-30                                5
30-40                                2
50-60                                12
60-70                                16
70-80                                10
                                     ∑f =            ∑fm =                            ∑(X - X̄)² =             ∑f(X - X̄)² =

Significance of Variance:
• It is easy to calculate.
• It indicates the variability clearly.
• The variance is the most informative among the measures of
dispersion for populations.
• It is the most frequently used measure of variation in data, especially
with normal, binomial or Poisson distributions.
