1. Dr. Manoj Kumar Meher
Kalahandi University
meher.manoj@gmail.com
MEASURES OF DISPERSION
2. STANDARD DEVIATION
• Standard deviation measures the dispersion of a set of data around its mean. It expresses the
absolute variability of a distribution: the greater the dispersion or variability, the larger the
standard deviation and the larger the typical deviation of the values from their mean.
5.
Year                       2008   2009   2010   2011   2012   2013   2014   2015   2016   2017   2018
Rainfall (cm)              210.8  205.6  78.6   158.4  99.5   167.7  152.5  104.6  98.8   125.8  187.6
Average temperature (°C)   32.5   29.8   31.6   27.9   33.3   30.1   27.5   27.5   28.6   33.2   28.6
Year                       2008   2009   2010   2011   2012   2013   2014   2015
Rainfall (cm)              210.8  205.6  78.6   158.4  99.5   167.7  152.5  104.6
Average temperature (°C)   32.5   29.8   31.6   27.9   33.3   30.1   27.5   27.5
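As an illustration, the standard deviation of the two series in the 2008 to 2018 table can be computed with Python's standard library. This is a minimal sketch, not the author's worked solution: it assumes the population form (dividing by N), and the variable names are illustrative.

```python
import statistics

# Values copied from the 2008 to 2018 table above.
rainfall_cm = [210.8, 205.6, 78.6, 158.4, 99.5, 167.7, 152.5, 104.6, 98.8, 125.8, 187.6]
temperature_c = [32.5, 29.8, 31.6, 27.9, 33.3, 30.1, 27.5, 27.5, 28.6, 33.2, 28.6]

# pstdev divides by N (population standard deviation); this choice is an assumption.
rainfall_sd = statistics.pstdev(rainfall_cm)
temperature_sd = statistics.pstdev(temperature_c)

print(f"Rainfall SD: {rainfall_sd:.2f} cm")
print(f"Temperature SD: {temperature_sd:.2f} °C")
```

Because the two series are measured in different units, their standard deviations are not directly comparable.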
6. VARIANCE
• Variance (σ²) in statistics is a measurement of the spread between
numbers in a data set. That is, it measures how far each number in
the set is from the mean and therefore from every other number in
the set.
• Variance measures variability around the mean. Variability can be read as
volatility, and volatility is a measure of risk, so the variance
statistic can help assess risk.
• A large variance indicates that numbers in the set are far from the
mean and from each other, while a small variance indicates the
opposite.
• Variance cannot be negative. A variance of zero indicates that all
values within a set of numbers are identical.
• All variances that are not zero are positive numbers.
7. Formula for Variance
$$\sigma^2 = \frac{\sum (X - \bar{X})^2}{N}$$
Where
• σ² = variance
• X = value of an item
• X̄ = mean
• N = number of items
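The formula can be written out directly in code. The sketch below is illustrative: the function name `population_variance` is hypothetical, and the input is the 2008 to 2015 rainfall series from the table earlier in the slides.

```python
import math

def population_variance(values):
    """Return sum((X - mean)^2) / N for a non-empty list of numbers."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

# Rainfall values (cm) for 2008 to 2015 from the table earlier in the slides.
rainfall_cm = [210.8, 205.6, 78.6, 158.4, 99.5, 167.7, 152.5, 104.6]

variance = population_variance(rainfall_cm)
std_dev = math.sqrt(variance)  # standard deviation is the square root of the variance

print(f"variance = {variance:.2f} cm^2, standard deviation = {std_dev:.2f} cm")

# Identical values give a variance of exactly zero.
print(population_variance([5.0, 5.0, 5.0]))
```

Every term in the sum is a square, so the result can be zero but never negative.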
10. COEFFICIENT OF VARIATION
• The coefficient of variation (CV) is a statistical measure of the
dispersion of data points in a data series around the mean. It is
the ratio of the standard deviation to the mean, and it is a useful
statistic for comparing the degree of variation from one data series
to another.
• Why do we use coefficient of variation?
• The data points are first plotted, and the coefficient of variation
is then used to measure their dispersion relative to each other and
to the mean. It therefore helps us understand the data and see the
pattern they form. It is calculated as the ratio of the standard
deviation of the data set to its mean value.
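A short sketch of the ratio, applied to the rainfall and temperature series from the table earlier in the slides. The function name is hypothetical, and using the population standard deviation is an assumption; multiplying by 100 to report a percentage is a common convention rather than something the slides specify.

```python
import statistics

def coefficient_of_variation(values):
    """Return the population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)

# Values copied from the 2008 to 2018 table earlier in the slides.
rainfall_cm = [210.8, 205.6, 78.6, 158.4, 99.5, 167.7, 152.5, 104.6, 98.8, 125.8, 187.6]
temperature_c = [32.5, 29.8, 31.6, 27.9, 33.3, 30.1, 27.5, 27.5, 28.6, 33.2, 28.6]

cv_rainfall = coefficient_of_variation(rainfall_cm)
cv_temperature = coefficient_of_variation(temperature_c)

# The CV has no units, so the two series can be compared directly.
print(f"Rainfall CV: {cv_rainfall * 100:.1f}%")
print(f"Temperature CV: {cv_temperature * 100:.1f}%")
```

In this sketch rainfall shows the larger relative variation, which is the kind of comparison across series that the CV is meant for.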
15. CHI-SQUARE (χ²) AND FREQUENCY DATA
• Up to this point, the inference to the population has been
concerned with “scores” on one or more variables, such as CAT
scores, mathematics achievement, and hours spent on the
computer.
• We used these scores to make inferences about population
means. To be sure, not all research questions involve score data.
• Today the data that we analyze consist of frequencies, that is, the
number of individuals falling into each category. In other words, the
variables are measured on a nominal scale.
• The test statistic for frequency data is Pearson Chi-Square. The
magnitude of Pearson Chi-Square reflects the amount of
discrepancy between observed frequencies and expected
frequencies.
16. STEPS IN TEST OF HYPOTHESIS
1. Determine the appropriate test
2. Establish the level of significance: α
3. Formulate the statistical hypothesis
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare computed test statistic against a tabled/critical
value.
17. 1. DETERMINE APPROPRIATE TEST
• Chi Square is used when both variables are measured
on a nominal scale.
• It can be applied to interval or ratio data that have
been categorized into a small number of groups.
• It assumes that the observations are randomly
sampled from the population.
• All observations are independent (an individual can
appear only once in a table and there are no
overlapping categories).
• It does not make any assumptions about the shape of
the distribution nor about the homogeneity of
variances.
18. 2. ESTABLISH LEVEL OF SIGNIFICANCE
• The significance level, denoted alpha or α, is a measure of
the strength of the evidence that must be present in
the sample before the researcher will reject the null hypothesis and
conclude that the effect is statistically significant. The researcher
determines the significance level before conducting the experiment.
• α is a predetermined value
• Conventional values:
• α = .05
• α = .01
• α = .001
19. 3. DETERMINE THE HYPOTHESIS:
WHETHER THERE IS AN ASSOCIATION OR NOT
• Ho : The two variables are independent
• Ha : The two variables are associated
20. 4. CALCULATING TEST STATISTICS
• The test statistic contrasts the observed frequencies in each cell of a
contingency table with the expected frequencies.
• The expected frequencies represent the number of cases
that would be found in each cell if the null hypothesis
were true (i.e. the nominal variables are unrelated).
• The expected frequency of a cell, when the two variables are unrelated,
is the product of its row total and column total divided by the total
number of cases.
Fe = (Fr × Fc) / N
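The rule Fe = (Fr × Fc) / N can be sketched in a few lines. The function name `expected_frequencies` is hypothetical; the observed counts are those of the voting example later in the slides.

```python
def expected_frequencies(observed):
    """Return Fe = (row total * column total) / N for every cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[fr * fc / n for fc in col_totals] for fr in row_totals]

# Observed counts from the voting example later in the slides
# (rows: NDA, UPA; columns: Favor, Neutral, Oppose).
observed = [[10, 10, 30],
            [15, 15, 10]]

expected = expected_frequencies(observed)
for row in expected:
    print([round(fe, 1) for fe in row])
```

Rounded to one decimal place, the results match the fe values shown in the worked example (13.9, 13.9, 22.2 for NDA and 11.1, 11.1, 17.8 for UPA).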
A hypothesis is a tentative assumption made in order to draw out and test its
logical or empirical consequences.
21. 4. CALCULATING TEST STATISTICS
$$\chi^2 = \sum \frac{(F_o - F_e)^2}{F_e}$$
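A minimal sketch of the Pearson chi-square statistic, summing (Fo − Fe)² / Fe over all cells of a contingency table. The function name `chi_square` is hypothetical, and the counts are taken from the voting example later in the slides.

```python
def chi_square(observed):
    """Pearson chi-square for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n  # expected frequency of the cell
            statistic += (fo - fe) ** 2 / fe
    return statistic

# Rows: NDA, UPA; columns: Favor, Neutral, Oppose.
observed = [[10, 10, 30],
            [15, 15, 10]]

statistic = chi_square(observed)
print(f"chi-square = {statistic:.3f}")
```

For these counts the unrounded statistic is 11.025, which agrees with the value of 11.03 reported in the example.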
23. 5. DETERMINE DEGREES OF FREEDOM
df = (R − 1)(C − 1), where R is the number of rows and C is the number of columns in the contingency table
The degrees of freedom in a statistical calculation represent how
many values involved in a calculation have the freedom to vary. The
degrees of freedom can be calculated to help ensure the statistical
validity of chi-square tests. These tests are commonly used to
compare observed data with data that would be expected to be
obtained according to a specific hypothesis.
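As a small illustration of df = (R − 1)(C − 1), the sketch below counts the rows and columns of a table. The function name is hypothetical and the table is assumed to be rectangular.

```python
def degrees_of_freedom(table):
    """Return (R - 1) * (C - 1) for a rectangular contingency table."""
    rows = len(table)
    cols = len(table[0])
    return (rows - 1) * (cols - 1)

# A 2 x 3 table: two parties by three response categories.
observed = [[10, 10, 30],
            [15, 15, 10]]

print(degrees_of_freedom(observed))  # (2 - 1) * (3 - 1) = 2
```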
24. 6. COMPARE COMPUTED TEST STATISTIC
AGAINST A TABLED/CRITICAL VALUE
• The computed value of the Pearson chi-square statistic is
compared with the critical value to determine whether the
computed value is improbable under the null hypothesis.
• The critical tabled values are based on sampling
distributions of the Pearson chi-square statistic.
• If the calculated χ² is greater than the tabled χ² value, reject Ho.
• Ho : The two variables are independent
• Ha : The two variables are associated
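The decision rule can be sketched with a small lookup of standard upper-tail critical values. The dictionary covers only df = 1 to 4 at the three conventional significance levels, and the names `CRITICAL_VALUES` and `reject_null` are hypothetical.

```python
# Upper-tail critical values of the chi-square distribution,
# keyed by degrees of freedom and then by significance level.
CRITICAL_VALUES = {
    1: {0.05: 3.841, 0.01: 6.635, 0.001: 10.828},
    2: {0.05: 5.991, 0.01: 9.210, 0.001: 13.816},
    3: {0.05: 7.815, 0.01: 11.345, 0.001: 16.266},
    4: {0.05: 9.488, 0.01: 13.277, 0.001: 18.467},
}

def reject_null(statistic, df, alpha=0.05):
    """Return True when the statistic exceeds the tabled critical value."""
    return statistic > CRITICAL_VALUES[df][alpha]

# Illustrative input: the statistic reported in the worked example (11.03, df = 2).
for alpha in (0.05, 0.01, 0.001):
    print(alpha, reject_null(11.03, df=2, alpha=alpha))
```

With these inputs Ho is rejected at α = .05 and α = .01 but not at α = .001, which shows why the significance level has to be fixed before the test is run.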
25. EXAMPLE
• Suppose a researcher is interested in voting
preferences on employment issues.
• A questionnaire was developed and sent to a
random sample of 90 voters.
• The researcher also collects information
about the political party membership of the
sample of 90 respondents.
26. BIVARIATE FREQUENCY TABLE OR
CONTINGENCY TABLE
               Favor   Neutral   Oppose   Row total
NDA            10      10        30       50
UPA            15      15        10       40
Column total   25      25        40       n = 90
30. 1. DETERMINE APPROPRIATE TEST
1. Party membership (2 levels), nominal
2. Voting preference (3 levels), nominal
32. 3. DETERMINE THE HYPOTHESIS
• Ho : There is no difference between NDA & UPA in their
opinion on the EMPLOYMENT issue.
• Ha : There is an association between responses to
the EMPLOYMENT survey and party
membership in the population.
33. 4. CALCULATING TEST STATISTICS
               Favor       Neutral     Oppose      Row total
NDA            fo = 10     fo = 10     fo = 30     50
               fe = 13.9   fe = 13.9   fe = 22.2
UPA            fo = 15     fo = 15     fo = 10     40
               fe = 11.1   fe = 11.1   fe = 17.8
Column total   25          25          40          n = 90
fe (NDA, Favor) = 50 × 25 / 90 = 13.9
fe (UPA, Favor) = 40 × 25 / 90 = 11.1
38. 6. COMPARE COMPUTED TEST
STATISTIC AGAINST A TABLED/CRITICAL
VALUE
• α = 0.05
• df = 2
• Critical tabled value = 5.99
• Test statistic, 11.03, exceeds critical value
• Null hypothesis is rejected
• NDA & UPA differ significantly in their opinions on
EMPLOYMENT issues.
41. • Hypothesis = association
• Example: due to pollution in Delhi, there is an increase in Covid-19
• That is, Covid-19 is associated with pollution in Delhi
• Significance level = 0.001, 0.01 or 0.05
• Degrees of freedom: df = (C − 1) × (R − 1)
• Chi-square: χ² = Σ (O − E)² / E
• The calculated χ² value should be compared with the χ² table value
• If the calculated value is higher than the table value, the null hypothesis is
rejected
• The alternative hypothesis is accepted