3. Introduction
When managers are bewildered by plethora ofWhen managers are bewildered by plethora of
data, which do not make any sense on the surfacedata, which do not make any sense on the surface
of it, they are looking for methods to classify dataof it, they are looking for methods to classify data
that would convey meaning. The idea here is tothat would convey meaning. The idea here is to
help them draw the right conclusion. This chapterhelp them draw the right conclusion. This chapter
provides theprovides the nittynitty--gritty of arranging data intogritty of arranging data into
informationinformation..
4. 1) Meaning and Example of
Raw Data
Meaning of RawMeaning of Raw
Data:Data:
Raw Data represent numbersRaw Data represent numbers
and facts in the originaland facts in the original
format in which the data haveformat in which the data have
been Collected. You need tobeen Collected. You need to
convert the raw data intoconvert the raw data into
information for managerialinformation for managerial
decision Making.decision Making.
Example of Raw Data:Example of Raw Data:
Assume that you know the weeklyAssume that you know the weekly
sales of a product in a region over thesales of a product in a region over the
past year are: (Figures in '000' units)past year are: (Figures in '000' units)
52 61 59 55 63 70 59 77 8152 61 59 55 63 70 59 77 81
83 69 91 73 83 90 81 77 7783 69 91 73 83 90 81 77 77
74 65 56 77 64 49 60 52 5074 65 56 77 64 49 60 52 50
45 42 46 39 29 38 41 43 2345 42 46 39 29 38 41 43 23
26 27 22 29 31 29 31 30 3026 27 22 29 31 29 31 30 30
29 40 44 45 46 52 5329 40 44 45 46 52 53
Suppose you present this set of dataSuppose you present this set of data
as it is to the General Manageras it is to the General Manager
(Sales). At best it will be boring to(Sales). At best it will be boring to
him.him.
5. Information is Key
Large and massive raw data tend to bewilder you so muchLarge and massive raw data tend to bewilder you so much
that the overall patterns are obscured. You cannot see thethat the overall patterns are obscured. You cannot see the
wood for the trees. This implies that the raw data must bewood for the trees. This implies that the raw data must be
processed to give you useful information.processed to give you useful information.
Raw Data Information
Process
6. 2) Frequency Distribution
In simple terms,In simple terms, frequency distributionfrequency distribution is a summarizedis a summarized
table in which raw data are arranged into classes andtable in which raw data are arranged into classes and
frequencies. Classes represent categories or groupings, whichfrequencies. Classes represent categories or groupings, which
contain a lower limit and an upper limit. Classes are formedcontain a lower limit and an upper limit. Classes are formed
conveniently following certain guidelines. Against each class,conveniently following certain guidelines. Against each class,
you count and then place the number of observations that fallyou count and then place the number of observations that fall
into it. When you do it for all classes in a given data analysiinto it. When you do it for all classes in a given data analysiss
problem, it becomes a frequency distribution.problem, it becomes a frequency distribution.
Frequency distribution focuses on classifying raw data intoFrequency distribution focuses on classifying raw data into
information. It is the most widely used data reductioninformation. It is the most widely used data reduction
technique in descriptive statistics. When you are looking fortechnique in descriptive statistics. When you are looking for
pattern that would help you understand the characteristic youpattern that would help you understand the characteristic you
measure in a problem situation, frequency distribution comesmeasure in a problem situation, frequency distribution comes
to your rescue.to your rescue.
7. Guidelines for Constructing a Frequency
Distribution Table
1)1) Identify the Minimum ValueIdentify the Minimum Value
(Min) and Maximum Value (Max)(Min) and Maximum Value (Max)
in the given Data Set. Calculatein the given Data Set. Calculate
RangeRange = Max= Max--MinMin
2)2) Decide on theDecide on the Number of ClassesNumber of Classes
you would like to have. Theyou would like to have. The
number of classes can benumber of classes can be
determined as the square root ofdetermined as the square root of
the number of observations in thethe number of observations in the
data set.. Also for any problem itdata set.. Also for any problem it
is recommended that you have notis recommended that you have not
less than 5 classes and not moreless than 5 classes and not more
than 15 classes.than 15 classes.
3)3) Determine theDetermine the WidthWidth of theof the
Class Interval =Class Interval =
Range/ Number of ClassesRange/ Number of Classes
4)4) Formulate the Boundaries ofFormulate the Boundaries of
the Classesthe Classes in such a mannerin such a manner
that it will include all thethat it will include all the
observations in the data set.observations in the data set.
Avoid overlapping of classes.Avoid overlapping of classes.
Once class boundary for eachOnce class boundary for each
class is ready, all you need toclass is ready, all you need to
do is to tally the number ofdo is to tally the number of
observations in each class.observations in each class.
8. Histogram (also known as frequency
histogram) is a snap shot photograph of
the frequency distribution. Histogram is a
graphical representation of the frequency
distribution in which the X-axis
represents the classes and the Y-axis
represents the frequencies. Rectangular
bars are constructed at the boundaries of
each class with heights proportional to
the frequency.
3) HISTOGRAM
Histogram depicts the pattern of the distribution emerging from the
characteristic being measured. If the pattern is symmetrical and bell shaped,
then it reflects the normal distribution curve. In the quality control parlance,
the system is stable; only chance causes are present and the assignable
causes are absent.
10. Histogram- Example
The inspection records of a hose assembly operation revealed a hThe inspection records of a hose assembly operation revealed a high leveligh level
of rejection. An analysis of the records showed that the "leaks"of rejection. An analysis of the records showed that the "leaks" were awere a
major contributing factor to the problem. It was decided to invemajor contributing factor to the problem. It was decided to investigate thestigate the
hose clamping operation. The hose clamping force (torque) was mehose clamping operation. The hose clamping force (torque) was measuredasured
on twenty five assemblies. (Figures in footon twenty five assemblies. (Figures in foot--pounds). The data are givenpounds). The data are given
below: Draw the frequency histogram and comment.below: Draw the frequency histogram and comment.
88 1313 1515 1010 1616
1111 1414 1111 1414 2020
1515 1616 1212 1515 1313
1212 1313 1616 1717 1717
1414 1414 1414 1818 1515
11. Histogram Example Solution
You will notice that theYou will notice that the RangeRange is 20is 20--88
=12.=12. You take theYou take the number of classesnumber of classes asas
5(Note that the square root of the number5(Note that the square root of the number
of observations isof observations is 25 = 5). The25 = 5). The widthwidth ofof
the class is Range/Number of classes =the class is Range/Number of classes =
12/5 =2.4. Round it to 3. You can now12/5 =2.4. Round it to 3. You can now
form the boundaries of the classesform the boundaries of the classes
starting with 8 and then incrementing bystarting with 8 and then incrementing by
3 successively the lower limit of each3 successively the lower limit of each
class until all the classes are formed.class until all the classes are formed.
Tally the number of observations underTally the number of observations under
each class. This would give you theeach class. This would give you the
following table of frequency distribution.following table of frequency distribution.
ClassClass FrequencyFrequency
88--1111 22
1111--1414 77
1414--1717 1212
1717--2020 33
2020--2323 11
Looking at the histogram, it is easy for you toLooking at the histogram, it is easy for you to
see that the pattern does not show a bell shapesee that the pattern does not show a bell shape
curve. The bars adjacent to the class 14curve. The bars adjacent to the class 14--1717
cause some distortion to normality. It is alsocause some distortion to normality. It is also
evident that the average is in the range 14 to 17.evident that the average is in the range 14 to 17.
Corrective action is needed. However, beforeCorrective action is needed. However, before
taking any action, you must be cautious abouttaking any action, you must be cautious about
the fact that the sample size here is only 25the fact that the sample size here is only 25
observations. Take more measurements andobservations. Take more measurements and
draw the histogram again before takingdraw the histogram again before taking
corrective steps.corrective steps.
Histogram for the Example
2
7
12
3
1
0
5
10
15
8-11 11-14 14-17 17-20 20-23
Classes
Frequency
12. Microsoft Excel and Histogram
The Microsoft Excel Chart Wizard allows you to create a varietyThe Microsoft Excel Chart Wizard allows you to create a variety of chartsof charts
for numerical as well as categorical data. The histogram picturefor numerical as well as categorical data. The histogram pictured in thed in the
previous slide is an output from Chart Wizard.previous slide is an output from Chart Wizard.
Also there is a powerful utility as addAlso there is a powerful utility as add--in supplied by Microsoft Excelin supplied by Microsoft Excel
called "Data Analysis" in the Tools Menu. This has a variety ofcalled "Data Analysis" in the Tools Menu. This has a variety of analysisanalysis
tools, which includetools, which include HistogramHistogram,, Cumulative DistributionCumulative Distribution,, FrequencyFrequency
Distribution,Distribution, Descriptive Statistics,Descriptive Statistics, ParetoPareto--ChartChart and many others.and many others.
Please get familiarized with these in Excel at the earliest so tPlease get familiarized with these in Excel at the earliest so that you couldhat you could
function as a manager taking information based decisions. The pofunction as a manager taking information based decisions. The power ofwer of
Excel spread sheet software is amazing.Excel spread sheet software is amazing.
13. 4) Cumulative Frequency Distribution
A type of frequency distribution that shows how manyA type of frequency distribution that shows how many
observations are above or below the lower boundariesobservations are above or below the lower boundaries
of the classes. You can formulate the following fromof the classes. You can formulate the following from
the previous example of hose clamping force(torque)the previous example of hose clamping force(torque)
0.080.08
0.360.36
0.840.84
0.960.96
1.001.00
22
99
2121
2424
2525
0.080.08
0.280.28
0.480.48
0.120.12
0.040.04
22
77
1212
33
11
88--1111
1111--1414
1414--1717
1717--2020
2020--2323
1.001.002525TotalTotal
CumulativeCumulative
RelativeRelative
FrequencyFrequency
CumulativeCumulative
FrequencyFrequency
RelativeRelative
FrequencyFrequency
FrequencyFrequencyClassClass
14. Ogive Curve
TheThe OgiveOgive curve is a graphicalcurve is a graphical
representation of the cumulative frequencyrepresentation of the cumulative frequency
distribution using numbers or percentages.distribution using numbers or percentages.
In this pictorial representation, less thanIn this pictorial representation, less than
values are in the Xvalues are in the X--axis and cumulativeaxis and cumulative
frequency in numbers or percentages arefrequency in numbers or percentages are
in the Yin the Y--axis. A line graph in the form of aaxis. A line graph in the form of a
curve is plotted connecting the cumulativecurve is plotted connecting the cumulative
frequencies corresponding to the upperfrequencies corresponding to the upper
boundaries of the classes. Today, thisboundaries of the classes. Today, this
ogiveogive graph is elegantly and efficientlygraph is elegantly and efficiently
obtained as output from Chart Wizard orobtained as output from Chart Wizard or
Data Analysis in the Toolbox of MicrosoftData Analysis in the Toolbox of Microsoft
Excel. TheExcel. The OgiveOgive graph for the presentgraph for the present
torque example obtained from Microsofttorque example obtained from Microsoft
Excel is given in the adjacent box:Excel is given in the adjacent box:
CumulativeDistribution(OgiveCurve)for
theExample
2
9
21 24 25
0
10
20
30
11 14 17 20 23
Torque(lessthanvalue)
Cumulative
Frequency