2. Introduction:
Definition of statistics: It is the ‘science of collecting,
classifying, presenting & interpreting data’ relating to any
sphere of enquiry.
Having learnt the methods of collection & presentation
of data, we have to understand & grasp the application of
mathematical techniques involved in analysis & interpretation
of the data.
As medicos, we should learn to apply the formulae
straight to our problems without worrying how they have
been deduced. Applying the methods of analysis is quite
easy, & we should become familiar with them so as to verify
our preconceived ideas or to remove doubts which might arise
on first looking at the collected figures.
3. “If a man will begin with certainties, he shall end in
doubts; but if he will be content to begin with doubts, he
shall end in certainties.”
- Francis Bacon
The characteristics of a frequency distribution are of
two types:
1. Measures of central tendency ( Location, Position, Average )
2. Measures of dispersion ( Scatteredness, Variability, Spread )
4. Definition:
A measure of central tendency is a single central number or
value that condenses the mass of data & gives an idea of
the data as a whole.
Types:
1. Arithmetic mean ( x̄ )
2. Median ( Q2 )
3. Mode ( Z )
5. Introduction:
The arithmetic mean is the most commonly used measure of
central tendency.
It is also called the ‘average’.
Definition:
It is defined as the sum of all individual observations
divided by the total number of observations.
6. Types of series
1. Ungrouped series ( Ungrouped data, Unclassified data, Raw
data ) : Includes individual observations without frequency.
2. Grouped series ( Classified data ) : Includes individual
observations with frequency & class frequency.
Calculation :
1. Direct method
2. Indirect method
7. Merits of Arithmetic Mean
1. Easy to understand & to calculate.
2. It is correctly or rigidly defined.
3. It is based on each & every observation.
4. Every set of data has one & only one A.M.
5. Used for further mathematical calculations like standard
deviation.
Demerits of Arithmetic Mean
1. Affected by extreme values ( either low or high)
2. It cannot be obtained if even a single value is missing.
8. Introduction :
The median is called Q2 because it denotes the 2nd quartile,
a positional value.
It is the 2nd measure of central tendency.
There are 3 quartiles, Q1, Q2 & Q3, which divide
the distribution into 4 equal parts:
A --- Q1 --- Q2 --- Q3 --- B
9. Definition :
Median divides the distribution into two equal parts i.e.
50% of the distribution is below the median & 50% is above
the median.
Q1 = (n/4)th value, Q3 = (3 x n/4)th value
Ungrouped data:
Arrange the observations in either ascending or descending
order. When ‘n’ is odd, the median is the value at position
Q2 = (n + 1)/2
When ‘n’ is even, the median is the mean of the two middle
values.
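The positional rule for the median of ungrouped data can be sketched in a few lines of Python. This is only an illustration: the function name `median_by_position` and the two small data sets are made up for the example and do not come from the slides.

```python
def median_by_position(values):
    """Median of ungrouped data using the positional rule."""
    ordered = sorted(values)  # arrange the observations in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # n is odd: take the ((n + 1) / 2)th value (list indexes start at 0)
        return ordered[middle]
    # n is even: take the mean of the two middle values
    return (ordered[middle - 1] + ordered[middle]) / 2


# Hypothetical values, chosen only to show both cases
odd_example = median_by_position([7, 3, 9, 5, 1])   # middle value of 1, 3, 5, 7, 9
even_example = median_by_position([4, 8, 2, 6])     # mean of 4 and 6
```

Sorting first matters: the position formula applies to the ordered observations, not to the order in which they were recorded.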
10. Definition :
Dictionary meaning of mode is common or fashionable.
Mode is the value which occurs most frequently in a given set
of data.
There are 3 types
Type 1
Ex: Selection of mode : Observation having the highest
repetition.
10,11,12,26,20,40,20,10,12,10
Mode = 10
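As a rough check of the Type 1 example, the repetitions can be counted with Python's standard library. The variable names are illustrative; `Counter` and `multimode` are standard-library tools (`multimode` needs Python 3.8 or later), not something the slides prescribe.

```python
from collections import Counter
from statistics import multimode

observations = [10, 11, 12, 26, 20, 40, 20, 10, 12, 10]

# Count how many times each observation is repeated
counts = Counter(observations)
value, repetitions = counts.most_common(1)[0]

# multimode returns every value that shares the highest count,
# which helps when a data set has more than one mode
modes = multimode(observations)
```

Here 10 is repeated three times, more than any other value, so it is the only mode.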
11. Type 2 :
Selection of mode: Observation containing highest frequency.
Ex: Number of children per family.
No. of children/family    No. of families
0                         13
1                         24
2                         25
3                         13
4                         14
25 is the highest frequency, so ‘2’ is the mode.
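For a frequency table like the one above, the mode is the observation with the largest frequency. A minimal sketch, with the table held in a dictionary whose name is chosen only for this example:

```python
# Number of children per family -> number of families (from the table)
families_by_children = {0: 13, 1: 24, 2: 25, 3: 13, 4: 14}

# Pick the observation (key) whose frequency (value) is the highest
mode_children = max(families_by_children, key=families_by_children.get)
highest_frequency = families_by_children[mode_children]
```

Note that the mode is the observation (2 children), not the frequency itself (25 families).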
Type 3:
Selection of mode: Class containing the highest frequency
( grouped data ).
12. Merits of Mode:
1. Easy to calculate & understand.
2. Not affected by extreme value.
3. Mode can be found for both qualitative & quantitative data.
Demerits of Mode:
1. Sometimes there is no mode, or more than one mode, in a
given distribution.
2. Not used for further mathematical calculation.
3. Not commonly used.
13. Examples of Ungrouped series :
1. Direct method
x̄ = ∑x/n
x = Individual observation
n = Number of observations
Ex: Systolic BP of six patients; calculate the mean, mode & median.
1. 110 mmHg x1
2. 100 mmHg x2
3. 150 mmHg x3
4. 140 mmHg x4
5. 140 mmHg x5
6. 120 mmHg x6
14. Mean ( Average ) : x̄ = ∑x/n
∑ = Summation
n = Number of observations
x = Individual observation
∑x = x1 + x2 + x3 + x4 + x5 + x6 = 760
x̄ = 760/6
= 126.7 mmHg
Mode :
Most repeated value in the data: 140 mmHg
Median : Arranged in order: 100, 110, 120, 140, 140, 150
n is even, so the median is the mean of the two middle values
= (120 + 140)/2 = 260/2
= 130 mmHg
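The three results for the systolic BP example can be cross-checked with Python's built-in `statistics` module. This is just one convenient way to verify the hand calculation; the variable names are illustrative.

```python
import statistics

# Systolic BP of the six patients, in mmHg
systolic_bp = [110, 100, 150, 140, 140, 120]

mean_bp = statistics.mean(systolic_bp)      # sum of observations / n = 760 / 6
median_bp = statistics.median(systolic_bp)  # mean of the two middle values, n being even
mode_bp = statistics.mode(systolic_bp)      # most repeated value
```

Rounded to one decimal place the mean is 126.7 mmHg; the median is 130 mmHg and the mode is 140 mmHg.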
15. Step deviation method of calculating the mean :
Ex: The heights of school children are given below; find the
mean.
1. 148cm x1
2. 143cm x2
3. 160cm x3
4. 152cm x4
5. 157cm x5
6. 150cm x6
7. 155cm x7
Working origin ( w ) = 150cm
16. Formula :
x̄ = w + ∑ ( x – w ) / n
148 - 150 = -2
143 - 150 = -7
160 - 150 = 10
152 - 150 = 2
157 - 150 = 7
150 - 150 = 0
155 - 150 = 5
∑ ( x – w ) = 15
∑ ( x – w ) / n = 15/7 = 2.1
x̄ = w + 2.1
= 150 + 2.1
= 152.1 cm
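The working-origin calculation can be checked against the direct method with a short Python sketch. The variable names are illustrative; the data and the working origin of 150 cm are those of the example.

```python
heights_cm = [148, 143, 160, 152, 157, 150, 155]
working_origin = 150  # w, chosen close to the centre of the data

# Deviation of each observation from the working origin: x - w
deviations = [x - working_origin for x in heights_cm]

# Mean of the deviations, added back to the working origin
mean_of_deviations = sum(deviations) / len(heights_cm)
mean_height = working_origin + mean_of_deviations

# Direct method for comparison: sum of observations / n
direct_mean = sum(heights_cm) / len(heights_cm)
```

Both routes give the same mean (about 152.1 cm); the working origin only makes the arithmetic smaller, and any value of w would lead to the same answer.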
17. Find the mean days of confinement after delivery in the following:
Days of confinement   No. of patients   Total days of each group
x                     f                 fx
6                     5                 30
7                     4                 28
8                     4                 32
9                     3                 27
10                    2                 20
Total                 18                137
Mean = ∑fx/n , where n = ∑f
= 137/18
= 7.61 days
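For grouped data, each observation is weighted by its frequency. A minimal sketch of the calculation above, with the table stored in a dictionary whose name is chosen only for this example:

```python
# Days of confinement (x) -> number of patients (f)
patients_by_days = {6: 5, 7: 4, 8: 4, 9: 3, 10: 2}

n = sum(patients_by_days.values())                       # total frequency, sum of f
sum_fx = sum(x * f for x, f in patients_by_days.items())  # sum of f * x
mean_days = sum_fx / n
```

The totals match the last row of the table (18 patients, 137 days), giving a mean of about 7.61 days.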
18. Definition:
Measures of variability describe the spread or scatteredness
of the individual observations around the central tendency.
Significance :
1. Gives complete idea/picture of data
2. Helps in comparison of distribution.
3. Useful for further calculations
4. Gives idea about the reliability of average value.
19. Methods of dispersion
1. Range ( R )
2. Inter quartile range ( IQR )
3. Quartile deviation / Semi inter quartile range
4. Mean deviation / Average deviation (MD)
5. Standard deviation (SD)
20. Range :
Definition:
The range is defined as the difference between the highest &
lowest values in a set of data.
R = H – L
Ex: If the weights of adults vary from 50 to 100 kg,
R = 100 – 50 = 50 kg
Merits:
Easy to calculate & understand
Has a well-defined formula
Gives first-hand information about variation
Demerits:
It is not based on all the values
Affected by extreme value
21. Definition:
The interquartile range is the interval between the
upper quartile ( Q3, the value above which 25% of the
observations fall ) & the lower quartile ( Q1, the value
below which 25% of the observations fall ).
IQR = Q3 – Q1
So the measure gives us the range of the middle
50% of the observations & it is very helpful when the
observations are not homogeneous & include extreme
values. It is superior to the range in such conditions.
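The range and the interquartile range can be compared on the school children's heights used earlier. This is a sketch under one assumption worth stating: textbooks and software use slightly different rules for locating quartiles, and `statistics.quantiles` with its default settings places them at the (n + 1)/4 positions, which may differ a little from a hand calculation based on n/4.

```python
import statistics

heights_cm = [148, 143, 160, 152, 157, 150, 155]

# Range: highest value minus lowest value
data_range = max(heights_cm) - min(heights_cm)

# Quartiles: n=4 asks for the three cut points Q1, Q2, Q3
# (the function sorts the data itself)
q1, q2, q3 = statistics.quantiles(heights_cm, n=4)

# Interquartile range: spread of the middle 50% of observations
iqr = q3 - q1
```

For these seven values the quartiles fall on the 2nd, 4th & 6th ordered observations (148, 152 & 157 cm), so the IQR is 9 cm against a range of 17 cm.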
23. Merits of IQR:
Easy & simple to understand
Easy to calculate
Not affected by extreme values
Demerits of IQR :
It is a positional value based on only two
quartiles
Not based on all the values ( the first & last 25% are
left out )
24. Definition :
Mean deviation is the average amount of scatter of
the items in a distribution from any measure of
central tendency, ignoring the mathematical signs.
Formula: M.D = ∑ |x – x̄| / n
25. Example: Marks obtained in 5 internals by a
student.
x     x – x̄
25    25 – 22 = 3
15    15 – 22 = -7
25    25 – 22 = 3
25    25 – 22 = 3
20    20 – 22 = -2
x̄ = ∑x/n
= 110/5
= 22
M.D = ( 3 + 7 + 3 + 3 + 2 ) / 5
= 18/5
= 3.6
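The mean deviation for the marks example can be worked through in Python as a check. The variable names are illustrative; the steps follow the formula M.D = ∑ |x – x̄| / n.

```python
marks = [25, 15, 25, 25, 20]

# Arithmetic mean of the marks
mean_marks = sum(marks) / len(marks)

# Deviations from the mean with the mathematical signs ignored
absolute_deviations = [abs(x - mean_marks) for x in marks]

# Mean deviation: average of the absolute deviations
mean_deviation = sum(absolute_deviations) / len(marks)
```

The absolute deviations are 3, 7, 3, 3 & 2, which add up to 18, so the mean deviation is 18/5 = 3.6 marks.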
27. Introduction:
It is the most widely used & best measure of
deviation.
Average deviation ( AD ) takes all the observations
into consideration but ignores the mathematical signs;
SD overcomes this problem by squaring the
deviations.
Definition:
SD is the square root of the sum of the squared
deviations of a given set of observations from the
AM, divided by the total number of observations.
28. Formula : Ungrouped series
Standard deviation = √( ∑ ( x – x̄ )² / n )
n > 30
Grouped series
Standard deviation = √( ∑ f ( x – x̄ )² / n )
n < 30
Where, ∑ – is Summation of,
x – is Individual observation, x̄ – is Arithmetic mean,
f – is Frequency,
n – is Total number of observations
29. Marks obtained in 5 internals by a student
Marks obtained   x – x̄           ( x – x̄ )²
x
25               25 – 22 = 3     9
15               15 – 22 = -7    49
25               25 – 22 = 3     9
25               25 – 22 = 3     9
20               20 – 22 = -2    4
∑x = 110                         ∑ ( x – x̄ )² = 80
S.D = √( ∑ ( x – x̄ )² / n ) = √( 80/5 ) = √16
= 4
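The standard deviation of the marks can be reproduced step by step & compared with Python's `statistics` module. One point to be aware of, stated here as background rather than as part of the slides: `statistics.pstdev` divides by n, as this example does, while `statistics.stdev` divides by n - 1, which many texts prefer for small samples.

```python
import math
import statistics

marks = [25, 15, 25, 25, 20]
mean_marks = statistics.mean(marks)

# Square each deviation from the mean so the signs no longer cancel
squared_deviations = [(x - mean_marks) ** 2 for x in marks]

# Square root of (sum of squared deviations / n)
sd_by_hand = math.sqrt(sum(squared_deviations) / len(marks))

sd_divide_by_n = statistics.pstdev(marks)          # divisor n
sd_divide_by_n_minus_1 = statistics.stdev(marks)   # divisor n - 1
```

With the divisor n the result is 4, matching the hand calculation; with n - 1 it comes to about 4.47, so it is worth stating which divisor was used when reporting an SD.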
30. Coefficient of variation ( CV ) = SD / Mean x 100
= 4 / 22 x 100
= 400 / 22
= 18.2 %
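The ratio SD / Mean x 100, commonly called the coefficient of variation, can be wrapped in a small helper. The function name is hypothetical, & the sketch assumes the SD with divisor n, as in the marks example.

```python
import statistics


def coefficient_of_variation(values):
    """SD (divisor n) expressed as a percentage of the mean."""
    mean_value = statistics.mean(values)
    if mean_value == 0:
        raise ValueError("undefined when the mean is zero")
    return statistics.pstdev(values) / mean_value * 100


marks = [25, 15, 25, 25, 20]
cv_percent = coefficient_of_variation(marks)  # 4 / 22 x 100
```

Because it has no units, this ratio allows the variability of series measured on different scales, for example height in cm & weight in kg, to be compared.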
Significance of SD :
Based on all observations.
Best measure of dispersion, since it does not ignore the
mathematical signs.
Useful for further statistical calculations ( e.g. tests of
significance ).
Useful for calculation of standard error.
The smaller the standard deviation, the better the estimate
of the population mean.