Analysis means critical evaluation of the
assembled and grouped data for studying the
characteristics of the object under study and
for determining the patterns of relationships
among the variables relating to it.
Both quantitative and qualitative methods are used.
However, social research most often requires
quantitative analysis involving the application
of various statistical techniques.
It summarizes a large mass of data into an
understandable and meaningful form.
E.g. Sensex, Nifty, GDP, PCI etc.
Statistics makes exact description possible.
E.g. percentage of literates among males and
females, percentage of degree holders among
males and females.
Statistical analysis facilitates identification of
the causal factors underlying complex phenomena.
E.g. what are the factors that determine a
variable like labor productivity or the academic
performance of students?
It aids the drawing of reliable inferences from
E.g. what would be the growth rate of
industrial production during the coming
Making estimations or generalizations from
the results of a sample survey.
Statistical analysis is useful for assessing
the significance of a specific sample result
under an assumed population condition. This
type of analysis is called hypothesis testing.
Types of statistical
analysis of data
• Descriptive analysis
• Inferential analysis
Descriptive analysis provides us with profiles of
organizations, work groups, persons and other
subjects on a multitude of characteristics such
as size, composition, efficiency and preference.
Inferential analysis involves an estimate of the
accuracy of the inferences, called reliability.
The reliability is expressed in terms of probability.
Analysis with the help of software packages.
E.g. R, Excel, SPSS etc.
Approach to statistical analysis
First aspect, descriptive analysis:
Construction of statistical distributions and
calculation of simple measures.
E.g. averages, percentages and measures
of dispersion.
Some average value that represents the
distribution is computed by using the
appropriate measures of central tendency
i.e. Mode, Median and Arithmetic mean
Second aspect, comparison of two or more
distributions.
Measures of dispersion are used to get a more
complete indication of the nature of the
distribution, e.g. range, standard deviation and
coefficient of variation.
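The dispersion measures named above can be sketched in a few lines of Python. This is only an illustration: it reuses the five heights from this document's mean example, takes the sample standard deviation (n - 1 in the denominator), and assumes the coefficient of variation is the standard deviation expressed as a percentage of the mean.

```python
from statistics import mean, stdev

heights_cm = [153, 146, 151, 170, 160]  # illustrative sample

data_range = max(heights_cm) - min(heights_cm)  # 170 - 146 = 24
sd = stdev(heights_cm)                          # sample standard deviation (n - 1)
cv_percent = sd / mean(heights_cm) * 100        # coefficient of variation, in %

print(f"range = {data_range}")
print(f"standard deviation = {sd:.2f}")
print(f"coefficient of variation = {cv_percent:.2f}%")
```

Because the coefficient of variation is unit-free, it is the measure usually chosen when the spread of two distributions with different means or units has to be compared.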
To study relationships among variables, the
coefficient of correlation and partial and multiple
correlation are used; regression is used
for prediction purposes.
Other methods like ratio, proportion and
percentage are used for comparison of
groups of unequal size. E.g. the liquidity ratio
is used for inter-firm and intra-firm comparison.
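As a rough illustration of why a ratio allows groups of unequal size to be compared, the sketch below computes the current ratio (current assets divided by current liabilities, one common liquidity ratio) for two firms. The firm names and balance-sheet figures are hypothetical.

```python
# (current assets, current liabilities) for two firms of very different size;
# all names and figures are hypothetical.
firms = {
    "Firm A": (500_000, 250_000),
    "Firm B": (60_000, 40_000),
}

current_ratio = {
    name: assets / liabilities
    for name, (assets, liabilities) in firms.items()
}

for name, ratio in current_ratio.items():
    print(f"{name}: current ratio = {ratio:.2f}")
```

Although Firm A is far larger in absolute terms, the ratios (2.00 against 1.50) put both firms on the same scale.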
Third aspect, coefficients of correlation,
partial and multiple correlation and
regression are used to find out the
nature of the relationship among variables.
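A minimal sketch of simple correlation and regression, computed from first principles. The data (hours studied against marks) are hypothetical, and only the two-variable case is shown; partial and multiple correlation are not covered here.

```python
from math import sqrt

hours = [1, 2, 3, 4, 5]   # hypothetical predictor
marks = [2, 4, 5, 4, 5]   # hypothetical outcome

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(marks) / n

# Sums of squares and cross-products about the means
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, marks))
s_xx = sum((x - mean_x) ** 2 for x in hours)
s_yy = sum((y - mean_y) ** 2 for y in marks)

r = s_xy / sqrt(s_xx * s_yy)        # Pearson coefficient of correlation
slope = s_xy / s_xx                 # least-squares regression slope
intercept = mean_y - slope * mean_x
predicted = intercept + slope * 6   # prediction for 6 hours

print(f"r = {r:.3f}")
print(f"marks = {intercept:.1f} + {slope:.1f} * hours")
print(f"predicted marks for 6 hours = {predicted:.1f}")
```

The correlation coefficient describes the strength and direction of the relationship, while the fitted line is what gets used for prediction.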
In survey research, the results of a sample may
have some errors. Parametric tests of
significance such as the t test and F test,
and non-parametric tests like the chi-square
test, K-S test and sign test, are used for
These tests are also used for testing
hypotheses relating to variables.
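To make the idea of a test of significance concrete, here is a sketch of a one-sample t test worked by hand. The sample reuses the five heights from this document's mean example; the hypothesized population mean of 150 cm is an assumption made purely for illustration, and 2.776 is the two-sided 5% critical value of the t distribution with 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

sample = [153, 146, 151, 170, 160]  # illustrative heights in cm
mu0 = 150                           # hypothetical population mean under H0
t_critical = 2.776                  # two-sided 5% critical value, 4 degrees of freedom

n = len(sample)
standard_error = stdev(sample) / sqrt(n)        # estimated standard error of the mean
t_stat = (mean(sample) - mu0) / standard_error  # one-sample t statistic
reject_h0 = abs(t_stat) > t_critical

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```

Here the t statistic is about 1.44, which is below the critical value, so the sample does not give sufficient evidence to reject the assumed population mean at the 5% level.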
Types of statistical measures
Measures of central tendency: mean, median, mode
Measures of dispersion: ranges, deviation,
Measures of relation: correlation, regression,
chi square test, factor analysis, discriminant
analysis, cluster analysis, canonical
Analysis of variance: one-way and two-way
ANOVA, MANOVA.
Time series: seasonal, cyclical, trend and
Measures of central tendency-
mean, median and mode
Descriptive statistics that identify which value
is most typical for the data set.
The mean is found by adding all of the scores in
a data set together and dividing by the number
of scores.
E.g. Height/cm: 153, 146, 151, 170, 160
Added together = 780
780 divided by 5 = 156 cm
The mean = 156 cm
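The same arithmetic can be checked in Python. This sketch uses the heights from the example above and compares the manual calculation with the standard library's `statistics.mean`.

```python
from statistics import mean

heights_cm = [153, 146, 151, 170, 160]  # heights from the example above

total = sum(heights_cm)             # 780
average = total / len(heights_cm)   # 156.0

print(f"sum = {total}, mean = {average} cm")
print(average == mean(heights_cm))  # both approaches agree
```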
The mean is the most powerful measure of central
tendency because it is made up from all of
the scores in the data set.
Any outliers can distort the mean.
Sometimes the mean does not make
sense in terms of the data set e.g. the
number of children per family in the UK =
When all of the scores in a data set have been
put in order, the median is the central number
in the set.
E.g. Age of employees/years; 21, 29, 34, 44,
The median age of the employees is 34.
The median is less affected by extreme scores
than the mean.
It is not suited to being used with small sets of
data, especially if they contain widely varying
scores, e.g. 7, 8, 9, 102, 121, where the median
would be 9. A more realistic central value might
be closer to 60.
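A short sketch of the small-sample example above shows how far apart the median and the mean can be when the scores vary widely.

```python
from statistics import mean, median

scores = [7, 8, 9, 102, 121]  # data from the example above

mid = median(scores)  # middle value of the ordered scores: 9
avg = mean(scores)    # 247 / 5 = 49.4

print(f"median = {mid}, mean = {avg}")
```

Neither figure describes this data set well on its own, which is why both are usually reported alongside a measure of dispersion.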
The mode is the most frequently occurring number in the
data set when put in order. e.g. 3, 5, 6, 6, 6,
Mode = 6
The data set could be Bimodal (two modes)
or even multimodal.
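The single-mode and bimodal cases can be illustrated with Python's `statistics` module. Both lists of shoe sizes are hypothetical; `multimode` returns every value that ties for the highest frequency.

```python
from statistics import mode, multimode

shoe_sizes = [6, 7, 7, 8, 8, 8, 9]      # hypothetical sales, one clear mode
bimodal_sizes = [6, 7, 7, 8, 8, 9]      # hypothetical sales, two modes

single = mode(shoe_sizes)               # 8
both = multimode(bimodal_sizes)         # [7, 8]

print(f"mode = {single}")
print(f"modes of the bimodal set = {both}")
```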
The mode is normally unaffected by extreme
scores and may give an idea of how often
something is occurring, e.g. which shoe sizes
sell most when ordering stock.
The mode may not be a central measure, and a
set of data may not have a most frequent score.