PREFACE
Mathematics forms an integral part of everyday life. We have to
teach it with freshness and variety to make it meaningfully
applicable to life. Statistics helps you interpret data in your daily
life and make good decisions. For example, is it possible to eat
too much grapefruit? Is it safer for your brain cells to use a
headset when you talk on the phone? Can an online profile help
you get a job? What steps can you take during college to increase
your future salary? I cannot claim that all the material
in this book is my own; I have learned the subject from
many excellent books. This textbook is designed to meet the
everyday requirements of students at school and of general
readers of mathematics.
Suggestions for improvement are welcome.
The Author
Contents
1. Measures of Central Tendency
1.1 Mean
1.2 Median
1.3 Mode
2. Measures of Dispersion
2.1 Variance
2.2 Standard deviation
3. Central Moments
3.1 Skewness
3.2 Kurtosis
Unit 1
Measures of Central Tendency
Introduction
Measures of central tendency are statistical
measures which describe the position of a distribution.
They are also called statistics of location, and are the
complement of statistics of dispersion, which provide
information concerning the variance or spread of
observations. In the univariate context, the mean, median
and mode are the most commonly used measures of
central tendency. They are computable values
that describe the behavior of the center of a distribution.
Measures of Central Tendency
The value or figure which represents the whole
series is neither the lowest value in the series nor the
highest; it lies somewhere between these two extremes.
The average represents all the measurements made on
a group, and gives a concise description of the group as
a whole.
When two or more groups are measured, the central
tendency provides the basis of comparison between
them.
1.1 Mean
In mathematics, mean has several different definitions
depending on the context.
In probability and statistics, mean and expected value are
used synonymously to refer to one measure of the central
tendency either of a probability distribution or of
the random variable characterized by that distribution.[1] In
the case of a discrete probability distribution of a random
variable X, the mean is equal to the sum over every
possible value weighted by the probability of that value;
that is, it is computed by taking the product of each
possible value x of X and its probability P(x), and then
adding all these products together, giving
$$\mu = \sum x \, P(x).$$
An analogous formula applies to the case of a continuous
probability distribution. Not every probability distribution
has a defined mean; see the Cauchy distribution for an
example. Moreover, for some distributions the mean is
infinite: for example, when the probability of the value
$2^n$ is $1/2^n$ for n = 1, 2, 3, …
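The weighted sum described above, each possible value x multiplied by its probability P(x) and the products added together, can be sketched in a few lines of Python. The fair six-sided die and the function name `expected_value` are assumptions chosen for illustration; they do not come from the text.

```python
from fractions import Fraction

def expected_value(distribution):
    """Return the sum of x * P(x) for a discrete distribution given as {x: P(x)}."""
    if sum(distribution.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in distribution.items())

# Hypothetical example: a fair six-sided die, each face with probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
die_mean = expected_value(fair_die)
print(die_mean)  # 7/2
```

Exact fractions are used so that the check on the total probability is not disturbed by floating-point rounding.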
For a data set, the terms arithmetic mean, mathematical
expectation, and sometimes average are used
synonymously to refer to a central value of a discrete set
of numbers: specifically, the sum of the values divided by
the number of values. The arithmetic mean of a set of
numbers x1, x2, ..., xn is typically denoted by $\bar{x}$, pronounced
"x bar". If the data set is based on a series of
observations obtained by sampling from a statistical
population, the arithmetic mean is termed the sample
mean (denoted $\bar{x}$) to distinguish it from the population
mean (denoted $\mu$ or $\mu_x$).
Types of mean
In mathematics, the three classical Pythagorean
means are the arithmetic mean (A), the geometric
mean (G), and the harmonic mean (H).
They are defined by:
Arithmetic mean
The most common type of average is the arithmetic
mean. If n numbers are given, each number denoted
by ai, where i = 1, ..., n, the arithmetic mean is the sum of
the ai's divided by n, or
$$A = \frac{1}{n}\sum_{i=1}^{n} a_i = \frac{a_1 + a_2 + \cdots + a_n}{n}.$$
The arithmetic mean, often simply called the mean, of
two numbers, such as 2 and 8, is obtained by finding a
value A such that 2 + 8 = A + A. One may find
that A = (2 + 8)/2 = 5. Switching the order of 2 and 8 to
read 8 and 2 does not change the resulting value
obtained for A. The mean 5 is not less than the
minimum 2 or greater than the maximum 8. If we
increase the number of terms in the list to 2, 8, and 11,
the arithmetic mean is found by solving for the value
of A in the equation 2 + 8 + 11 = A + A + A. One finds
that A = (2 + 8 + 11)/3 = 7.
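Methods of Calculating the Arithmetic Mean:
• Direct method: $\bar{x} = \dfrac{\sum f x}{\sum f}$
• Short-cut method: $\bar{x} = A + \dfrac{\sum f d}{\sum f}$, where A is an assumed mean and d = x − A
• Step-deviation method: $\bar{x} = A + \dfrac{\sum f d'}{\sum f} \times c$, where d' = (x − A)/c and c is a common factor of the deviations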
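The direct, short-cut and step-deviation methods give the same arithmetic mean; they differ only in how much arithmetic is done by hand. The sketch below assumes the usual textbook forms of the three methods (stated in the comments) and uses a hypothetical frequency table, an assumed mean A and a common factor c that are not taken from the text.

```python
import math

# Hypothetical frequency table: values x with frequencies f.
values = [10, 20, 30, 40, 50]
freqs = [1, 4, 5, 3, 2]
total = sum(freqs)  # N, the total frequency

# Direct method (assumed form): mean = sum(f * x) / sum(f)
direct = sum(f * x for x, f in zip(values, freqs)) / total

# Short-cut method (assumed form): choose an assumed mean A,
# work with deviations d = x - A, then mean = A + sum(f * d) / sum(f)
A = 30
short_cut = A + sum(f * (x - A) for x, f in zip(values, freqs)) / total

# Step-deviation method (assumed form): divide each deviation by a
# common factor c, then mean = A + (sum(f * d') / sum(f)) * c
c = 10
step_dev = A + sum(f * ((x - A) / c) for x, f in zip(values, freqs)) / total * c

print(direct, short_cut, step_dev)
```

All three results agree up to floating-point rounding, whichever assumed mean A is chosen.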
Geometric mean
The geometric mean of n non-negative numbers is
obtained by multiplying them all together and then taking
the nth root. In algebraic terms, the geometric mean
of a1, a2, …, an is defined as
$$G = \sqrt[n]{a_1 a_2 \cdots a_n}.$$
The geometric mean can be thought of as the antilog of the
arithmetic mean of the logs of the numbers.
Example: the geometric mean of 2 and 8 is $\sqrt{2 \times 8} = \sqrt{16} = 4$.
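Both descriptions of the geometric mean, the nth root of the product and the antilog of the mean of the logs, can be compared directly. This is a minimal sketch; the function names are illustrative, and the log-based version assumes strictly positive inputs.

```python
import math

def geometric_mean_root(values):
    """nth root of the product of the values."""
    return math.prod(values) ** (1 / len(values))

def geometric_mean_logs(values):
    """Antilog (exp) of the arithmetic mean of the natural logs; needs values > 0."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

gm_root = geometric_mean_root([2, 8])
gm_logs = geometric_mean_logs([2, 8])
print(gm_root, gm_logs)
```

For 2 and 8 both routes give 4, up to floating-point rounding in the log-based version.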
Harmonic mean
Harmonic mean for a non-empty collection of
numbers a1, a2, …, an, all different from 0, is defined as
the reciprocal of the arithmetic mean of the reciprocals
of the ai's:
$$H = \frac{n}{\dfrac{1}{a_1} + \dfrac{1}{a_2} + \cdots + \dfrac{1}{a_n}}.$$
One example where the harmonic mean is useful is
when examining the speed for a number of fixed-distance
trips. For example, if the speed for going
from point A to B was 60 km/h, and the speed for
returning from B to A was 40 km/h, then the harmonic
mean speed is given by
$$H = \frac{2}{\dfrac{1}{60} + \dfrac{1}{40}} = 48 \text{ km/h}.$$
Inequality concerning AM, GM, and HM
A well-known inequality concerning the arithmetic,
geometric, and harmonic means of any set of
positive numbers is
$$A \ge G \ge H.$$
It is easy to remember by noting that the
alphabetical order of the letters A, G, and H is
preserved in the inequality. See Inequality of
arithmetic and geometric means.
Thus for the above harmonic mean example: AM
= 50, GM ≈ 49, and HM = 48 km/h.
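The three figures for the 60 km/h and 40 km/h trip can be checked with Python's standard `statistics` module (its `geometric_mean` function requires Python 3.8 or later). This is only a quick numerical check of the example above.

```python
import statistics

speeds = [60, 40]  # km/h, the two legs of the trip in the example above

am = statistics.mean(speeds)
gm = statistics.geometric_mean(speeds)
hm = statistics.harmonic_mean(speeds)

print(am, round(gm, 2), hm)

# The ordering AM >= GM >= HM holds for these positive values.
assert am >= gm >= hm
```

The geometric mean is the square root of 2400, which is slightly below 49; the text rounds it to 49.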
Problems
1. Calculate the AM, GM, and HM of the following.
x  15  12  15  23  14  17  18  19  20  16
f   4   3   2   3   5   4   1   2   7   8
1.2 Median
The median is a central value of the distribution, or the
value which divides the distribution into two equal parts,
each part containing an equal number of items. Thus it is
the central value of the variable when the values are
arranged in order of magnitude.
Connor defines it as follows: “The median is that value of
the variable which divides the group into two equal
parts, one part comprising all values greater, and
the other all values less than the median.”
Calculation of Median – Discrete series:
1. Arrange the data in ascending or descending
order.
2. Calculate the cumulative frequencies.
3. Apply the formula: Median = size of the $\left(\frac{N+1}{2}\right)$th item, where N is the total frequency.
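The steps for a discrete series can be sketched as follows. The sketch assumes the common textbook rule that the median is the value of the ((N + 1)/2)-th item, where N is the total frequency, located as the first value whose cumulative frequency reaches that position. The function name and the data are hypothetical.

```python
from itertools import accumulate

def discrete_median(values, freqs):
    """Median of a discrete frequency series via cumulative frequencies."""
    # Step 1: arrange the values (with their frequencies) in ascending order.
    pairs = sorted(zip(values, freqs))
    # Step 2: calculate the cumulative frequencies.
    cumulative = list(accumulate(f for _, f in pairs))
    # Step 3 (assumed rule): locate the ((N + 1) / 2)-th item.
    position = (cumulative[-1] + 1) / 2
    for (value, _), cf in zip(pairs, cumulative):
        if cf >= position:
            return value

# Hypothetical data, not taken from the text.
median_value = discrete_median([30, 10, 40, 20], [8, 3, 4, 5])
print(median_value)  # 30
```

Here N = 20, so the position is 10.5; the cumulative frequencies are 3, 8, 16, 20, and the first to reach 10.5 belongs to the value 30.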
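Calculation of Median – Continuous series:
For calculation of the median in a continuous frequency
distribution, the following formula is employed.
Algebraically,
$$\text{Median} = L + \frac{\frac{N}{2} - cf}{f} \times c,$$
where L is the lower limit of the median class, N is the total frequency, cf is the cumulative frequency of the class preceding the median class, f is the frequency of the median class, and c is the width of the median class.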
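For a continuous (grouped) series, a common textbook form of the formula is Median = L + ((N/2 − cf) / f) × c, where L is the lower limit of the median class, cf the cumulative frequency before it, f its frequency and c its width. The sketch below assumes that form and equal class widths; the class limits and frequencies are hypothetical.

```python
def grouped_median(lower_limits, width, freqs):
    """Median of a grouped series, assuming Median = L + ((N/2 - cf) / f) * c."""
    half = sum(freqs) / 2  # N / 2
    cf_before = 0          # cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if cf_before + f >= half:
            # This is the median class: interpolate within it.
            return lower + (half - cf_before) / f * width
        cf_before += f
    raise ValueError("frequencies must not be empty")

# Hypothetical classes 0-10, 10-20, 20-30, 30-40, 40-50.
median_value = grouped_median([0, 10, 20, 30, 40], 10, [5, 8, 12, 10, 5])
print(round(median_value, 2))
```

With N = 40, the median class is 20-30 (cumulative frequency 25), so the median is 20 + (20 − 13)/12 × 10.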
Advantages of Median:
• The median can be calculated in all distributions.
• The median can be understood even by common people.
• The median can be ascertained even when extreme
items are present.
• It can be located graphically.
• It is most useful when dealing with qualitative data.
Disadvantages of Median:
• It is not based on all the values.
• It is not capable of further mathematical treatment.
• It is affected by fluctuations of sampling.
• With an even number of values, it may not be a value from
the data.
1.3 Mode
The mode is the most frequent value or score in the
distribution. It is defined as that value of the item in a
series which occurs most often.
By contrast, to find the median, order the list
according to its elements' magnitude and then
repeatedly remove the pair consisting of the highest
and lowest values until either one or two values are
left. If exactly one value is left, it is the median; if two
values, the median is the arithmetic mean of these
two. This method takes the list 1, 7, 3, 13 and orders
it to read 1, 3, 7, 13. Then the 1 and 13 are
removed to obtain the list 3, 7. Since there are two
elements in this remaining list, the median is their
arithmetic mean, (3 + 7)/2 = 5.
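The pair-removal procedure for the median, and the "most frequent value" definition of the mode, can both be sketched briefly. The function names and the list used for the mode are illustrative assumptions; the median example uses the list 1, 7, 3, 13 from the text.

```python
from collections import Counter

def median_by_trimming(values):
    """Order the list, then drop the lowest/highest pair until 1 or 2 values remain."""
    if not values:
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(values)
    while len(ordered) > 2:
        ordered = ordered[1:-1]  # remove the current lowest and highest values
    return sum(ordered) / len(ordered)

def most_frequent(values):
    """Return the most frequent value; ties go to the value seen first."""
    return Counter(values).most_common(1)[0][0]

trimmed_median = median_by_trimming([1, 7, 3, 13])
example_mode = most_frequent([2, 3, 3, 5, 3, 7])  # hypothetical data
print(trimmed_median, example_mode)  # 5.0 3
```

A data set can have more than one mode; this sketch reports only one of them.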
Advantages of Mode:
• The mode is readily comprehensible and easily calculated.
• It is often the best representative of the data.
• It is not at all affected by extreme values.
• The value of the mode can also be determined
graphically.
• It is usually an actual value of an important part of the
series.
Disadvantages of Mode:
• It is not based on all observations.
• It is not capable of further mathematical manipulation.
• The mode is affected to a great extent by sampling
fluctuations.
• The choice of grouping has a great influence on the value of
the mode.
Unit 2
Measures of Dispersion
Introduction
Measures of dispersion are descriptive statistics that
describe how similar the scores in a set are to each other.
• The more similar the scores are to each other, the
lower the measure of dispersion will be.
• The less similar the scores are to each other, the
higher the measure of dispersion will be.
• In general, the more spread out a distribution is, the
larger the measure of dispersion will be.
There are three main measures of dispersion:
1. The range
2. The semi-interquartile range (SIR)
3. Variance / standard deviation
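The first two measures are named here but not defined. A minimal sketch, assuming the usual definitions (range = largest value minus smallest value; SIR = half the distance between the first and third quartiles), is given below. The quartiles come from `statistics.quantiles` with its default method, which is one of several quartile conventions, so hand calculations using another convention may differ slightly. The data are the five test scores used in the problem later in this unit.

```python
import statistics

scores = [92, 88, 80, 68, 52]

# Range (assumed definition): largest value minus smallest value.
data_range = max(scores) - min(scores)

# Semi-interquartile range (assumed definition): (Q3 - Q1) / 2.
q1, _median, q3 = statistics.quantiles(scores, n=4)
semi_iqr = (q3 - q1) / 2

print(data_range, semi_iqr)
```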
2.1 Variance
The average of the squared deviations from
the mean (as opposed to the average of the absolute
deviations) is called the variance.
The variance is the usual measure of dispersion in
statistical theory, but it has a drawback when researchers
want to describe the dispersion in data in a practical way:
it is expressed in squared units of the original data.
To calculate the variance:
1. Find the mean of the data.
Hint: the mean is the average, so add up the values and
divide by the number of items.
2. Subtract the mean from each value; the result is called
the deviation from the mean.
3. Square each deviation from the mean.
4. Add up the squared deviations.
5. Divide the total by the number of items.
The variance formula includes the Sigma Notation,
which represents the sum of all the items to the right of
Sigma.
$$\sigma^2 = \frac{\sum (x - \mu)^2}{n}$$
where μ is the mean of the data and n is the number of items.
2.2 Standard deviation
The standard deviation shows the variation in data. If the
data are close together, the standard deviation will be small.
If the data are spread out, the standard deviation will be
large.
The standard deviation is often denoted by the lowercase
Greek letter sigma, σ.
The standard deviation formula can be represented
using Sigma Notation:
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{n}}$$
where μ is the mean of the data and n is the number of items.
Problem: The math test scores of five students are 92, 88, 80, 68 and 52.
Find the variance and standard deviation.
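One way to work this problem is to follow the variance steps from this unit literally and then take the square root. The sketch divides by the number of items, as the steps in the text do; a sample variance that divides by n − 1 would give a larger figure. The variable names are illustrative.

```python
import math

scores = [92, 88, 80, 68, 52]

# Find the mean of the data.
mean = sum(scores) / len(scores)
# Subtract the mean from each value (the deviations from the mean).
deviations = [x - mean for x in scores]
# Square each deviation.
squared = [d ** 2 for d in deviations]
# Divide the total of the squared deviations by the number of items.
variance = sum(squared) / len(scores)
# The standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(mean, variance, round(std_dev, 2))  # 76.0 211.2 14.53
```

The mean is 76, the squared deviations total 1056, so the variance is 1056/5 = 211.2 and the standard deviation is about 14.53.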
Unit 3
Central moments
Introduction
Central moments: the average of the deviations of all
observations in a dataset from the mean of the
observations, each raised to the power r:
$$\mu_r = \frac{1}{n}\sum (X - m)^r$$
In the previous equation, n is the number of observations,
X is the value of each individual observation, m is the
arithmetic mean of the observations, and r is a positive
integer.
There are 4 central moments in common use:
• The first central moment, r=1, is the average of the
differences of each observation from the sample
average (arithmetic mean), which always equals 0.
• The second central moment, r=2, is the variance.
• The third central moment, r=3, measures skewness.
• The fourth central moment, r=4, measures kurtosis.
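The definition above, the average of the deviations from the mean, each raised to the power r, translates directly into a short function. The function name is an assumption, and the data are the five test scores from Unit 2.

```python
def central_moment(data, r):
    """Average of (X - m)**r over the data, where m is the arithmetic mean."""
    n = len(data)
    m = sum(data) / n
    return sum((x - m) ** r for x in data) / n

scores = [92, 88, 80, 68, 52]
first = central_moment(scores, 1)   # always 0, up to rounding
second = central_moment(scores, 2)  # the variance (dividing by n)
print(first, second)
```

The first moment comes out as 0 and the second as 211.2, the variance of the scores when dividing by n.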
3.1 Skewness
Skewness describes how the sample differs in shape
from a symmetrical distribution. A normal distribution
has a skewness of 0; a right-skewed distribution has a skewness greater than 0,
and a left-skewed distribution has a skewness less than 0.
Negatively skewed
distributions, skewed to the left, occur when most
of the scores are toward the high end of the
distribution. In a normal distribution, where skewness is
0, the mean, median and mode are equal. In a
negatively skewed distribution, mode > median >
mean.
Positively skewed distributions, skewed to the right, occur when most of
the scores are toward the low end of the distribution.
In a positively skewed distribution, mode < median <
mean.
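The sign of the skewness can be illustrated numerically. The sketch below assumes one common coefficient, the third central moment divided by the second central moment raised to the power 3/2; the text does not say which coefficient it intends. All data sets are hypothetical.

```python
def central_moment(data, r):
    """Average of (x - mean)**r over the data."""
    n = len(data)
    m = sum(data) / n
    return sum((x - m) ** r for x in data) / n

def skewness(data):
    """Moment coefficient of skewness (assumed definition): m3 / m2**1.5."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

right_skewed = [1, 2, 2, 3, 10]            # most scores at the low end
left_skewed = [-x for x in right_skewed]   # mirror image: most scores at the high end
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_skewed), skewness(left_skewed), skewness(symmetric))
```

The first value is positive, the second negative, and the third is 0, matching the sign convention described above.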
3.2 Kurtosis
Kurtosis is based on the 4th central moment.
It describes the “peakedness” of a distribution:
it measures the extent to which the data are
distributed in the tails versus the center of the
distribution.
There are three types of peakedness:
• Leptokurtic – very peaked
• Platykurtic – relatively flat
• Mesokurtic – in between
Measured as excess kurtosis (kurtosis relative to the normal distribution):
• Mesokurtic has a kurtosis of 0
• Leptokurtic has a positive kurtosis
• Platykurtic has a negative kurtosis
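The positive and negative signs above can be illustrated with a small sketch. It assumes the excess-kurtosis convention, the fourth central moment divided by the squared second central moment, minus 3, under which a normal distribution scores 0. The function names and both data sets are hypothetical.

```python
def central_moment(data, r):
    """Average of (x - mean)**r over the data."""
    n = len(data)
    m = sum(data) / n
    return sum((x - m) ** r for x in data) / n

def excess_kurtosis(data):
    """Excess kurtosis (assumed definition): m4 / m2**2 - 3."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

flat = [1, 2, 3, 4, 5]            # evenly spread: platykurtic
peaked = [0] * 8 + [-10, 10]      # tall center with heavy tails: leptokurtic

print(excess_kurtosis(flat), excess_kurtosis(peaked))
```

The evenly spread data give a negative value and the sharply peaked, heavy-tailed data give a positive one.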