NewGate India
Hyderabad, Andhra Pradesh - 500038
Website: www.newgate.in
Email: contact@newgate.in
Slideshare URL : http://www.slideshare.net/newgateindia
Business Statistics
Descriptive Statistical Analysis
For
Location of PGP Jan-09 students
CONTENTS:
I. Questions
II. Solution: Calculation
   1. Data Collection
   2. Data Generated through MS EXCEL
   3. Data Classification to intervals
   4. Sum, Count & Range
   5. Mean
   6. Variance & Standard Deviation
      6.1 Standard Deviation
      6.2 Variance
   7. Median
   8. Quartile
   9. Inter Quartile Range
   10. Upper limit & Lower Limit
   11. Mode
   12. Skewness
III. Analysis:
   13. Central tendencies & Dispersion
      13.1 Central tendency
      13.2 Dispersion
      13.3 Coefficient of variation
   14. Mode
   15. Box & Whisker Plot
      15.1 Outliers
      15.2 Evidence of skewness
Appendix – 1
Appendix – 2
Appendix – 3
Bibliography
__________________________________________________________________________
I. QUESTION
__________________________________________________________________________
Collect information on any variable for a group of 30 members. Write a report
summarizing those data, including the following activities.
a. Calculate appropriate measures of central tendency and dispersion.
b. Do these data have a mode?
c. Draw a box-and-whisker plot. Are there any outliers? Does the plot show
any evidence of skewness?
__________________________________________________________________________
II. Solution: Calculation
__________________________________________________________________________
1.DATA COLLECTION:
A survey was carried out on a sample of 30 students from the PGP JAN 2010 batch of
Alliance Business School.
The survey recorded the home town of each student. The distance of each home town
was measured with reference to Bangalore.
SAMPLE : 30 Students
POPULATION : 52 Students
NOTE: Students from Bangalore were recorded at a distance of 0 Km, since all distances
were measured from Bangalore.
The distance values collected for all 30 students are quoted as follows.
Sources to calculate distance : http://maps.google.co.in/
2. DATA GENERATED through MS EXCEL
Descriptive statistics were generated in MS-EXCEL. All 30 observations above were
used to produce the report below.
MS-EXCEL REPORT 1
Mean 1221.2
Median 1443
Mode 0
Standard Deviation 915.7679423
Sample Variance 838630.9241
Skewness 0.117333504
Range 3304
Minimum 0
Maximum 3304
Sum 36636
Count 30
Largest(1) 3304
Smallest(1) 0
Confidence Level(95.0%) 341.9533643
Q1(First Quartile) 327
Q2(Second Quartile) 1443
Q3(Third Quartile) 1929
IQR 1602
Upper Limit 4332
Lower Limit -2076
3. DATA CLASSIFIED TO CLASS INTERVAL
To better understand how the sample data are distributed, we grouped the collected
distances into class intervals of 500 Km each.
CLASSIFIED CLASS INTERVAL TABLE : 2
X ( in Km) f
0-500 10
500-1000 2
1000-1500 3
1500-2000 9
2000-2500 5
2500-3000 0
3000-3500 1
4. SUM , COUNT & RANGE
ARRANGEMENT ( SORTING OF DATA) :
First of all, the data were arranged in increasing order and grouped by class interval.
X ( in Km)    Xmid ( in Km)    f     Xmid X f
0-500         250              10    2500
500-1000      750              2     1500
1000-1500     1250             3     3750
1500-2000     1750             9     15750
2000-2500     2250             5     11250
2500-3000     2750             0     0
3000-3500     3250             1     3250
Total                          30    38000
SUM : 38000
COUNT : n = 30
Range :
Range = Maximum value - Minimum Value
Max = 3304
Min = 0
Range = 3304
_________________________________________________________________________________________
5. MEAN:
$\bar{X} = \dfrac{\sum_{i=1}^{k} f_i \, x_i}{n}$   ( x_i = mid value Xmid of class i, f_i = its frequency, n = Σ f_i )
Mean = { ( 250 x 10) + ( 750 x 2) + ( 1250 x 3) + ( 1750 x 9) + ( 2250 x 5)
+ ( 2750 x 0) + ( 3250 x 1) } / 30
= 38000 / 30
Mean = 1266.66
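The grouped mean above can be checked with a short script. This is only a sketch: the class limits and frequencies are taken from the class interval table, while the name `grouped_mean` is an illustrative choice and not part of the report.

```python
# Class intervals from the table above: (lower limit, upper limit, frequency f)
classes = [
    (0, 500, 10),
    (500, 1000, 2),
    (1000, 1500, 3),
    (1500, 2000, 9),
    (2000, 2500, 5),
    (2500, 3000, 0),
    (3000, 3500, 1),
]

def grouped_mean(classes):
    """Mean of grouped data: sum of (mid value x frequency) divided by the count."""
    n = sum(f for _, _, f in classes)
    total = sum((lower + upper) / 2 * f for lower, upper, f in classes)
    return total / n

mean = grouped_mean(classes)
print(round(mean, 2))  # 1266.67 (the report truncates this to 1266.66)
```

Because every observation is replaced by the mid value of its class, the result is an approximation of the mean of the raw distances.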
________________________________________________________________________________________
6. VARIANCE & STANDARD DEVIATION:
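6.1 Standard deviation (sample, grouped data):
$S = \sqrt{\dfrac{\sum_{i} f_i \,(x_i - \bar{X})^2}{n - 1}}$
S = √[ { ( 250 - 1266.66 )² x 10 + ( 750 - 1266.66 )² x 2 + ( 1250 - 1266.66 )² x 3
+ ( 1750 - 1266.66 )² x 9 + ( 2250 - 1266.66 )² x 5 + ( 2750 - 1266.66 )² x 0
+ ( 3250 - 1266.66 )² x 1 } / 29 ]
Evaluated on the class interval table, this formula gives S ≈ 865.86 Km.
The report quotes S = 759.31 Km, and that figure is the one carried into the later sections.
6.2 VARIANCE :
S = √ Variance
Variance = S X S
With the quoted S = 759.31, Variance = 576552.2; the formula above gives Variance ≈ 749712.6.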
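As a cross-check, the sample variance and standard deviation of grouped data can be sketched as below, using squared deviations from the grouped mean and an n - 1 divisor. The mid values and frequencies come from the class interval table; the function name is illustrative. Its result need not agree with the figure quoted in the report.

```python
import math

# (mid value Xmid, frequency f) for each class interval
grouped = [(250, 10), (750, 2), (1250, 3), (1750, 9), (2250, 5), (2750, 0), (3250, 1)]

def grouped_sample_variance(grouped):
    """Sample variance of grouped data: sum of f * (x - mean)^2, divided by n - 1."""
    n = sum(f for _, f in grouped)
    mean = sum(x * f for x, f in grouped) / n
    squared = sum(f * (x - mean) ** 2 for x, f in grouped)
    return squared / (n - 1)

variance = grouped_sample_variance(grouped)
std_dev = math.sqrt(variance)
print(round(std_dev, 2), round(variance, 1))
```

On the class interval table this evaluates to a standard deviation of roughly 865.86 Km.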
_________________________________________________________________________________________
7. MEDIAN:
$\text{Median} = L + \dfrac{N/2 - C}{f} \times i$
L : Lower limit of the median class
N : Number of observations
f : Frequency of the median class
i : Width of the class interval
C : Cumulative frequency of the class before the median class
x            f     Cumulative f
0-500        10    10
500-1000     2     12
1000-1500    3     15
1500-2000    9     24
2000-2500    5     29
2500-3000    0     29
3000-3500    1     30
Total        30
N = 30
N/2 = 15
Median class : 1000-1500 ( the first class whose cumulative frequency reaches 15 )
L = 1000, C = 12, f = 3, i = 500
Q2 ( Median) = 1000 + [ { (15-12) /3 } X 500 ]
Q2 = 1500
_________________________________________________________________________________________
8
9. 8. QUARTILE:
Quartiles Q1, Q2, Q3 are the percentile values dividing the whole samples of data into
4 four equal quadrant.
Q1 = first Quartile or 25th Percentile
Q2( Median) = Second Quartile or 50th Percentile
Q3 = Third Quartile or 75th Percentile
$Q_1 = L + \dfrac{N/4 - C}{f} \times i$
x            f     Cumulative f
0-500        10    10
500-1000     2     12
1000-1500    3     15
1500-2000    9     24
2000-2500    5     29
2500-3000    0     29
3000-3500    1     30
Total        30
N = 30
N/4 = 7.5
Q1 class : 0-500
L = 0, C = 0, f = 10, i = 500
Q1 = 0 + [ { (7.5-0) /10 } X 500 ]
Q1 = 375
9
10. Q3 = L + 3N/4 – C X i
f
N = 30
3N/4 = 22.5
N =15
Cumulative C = 15
x f
f f= 9
0-500 10 10 i=500
500-1000 2 12 L=1500
1000-1500 3 15
1500-2000 9 24 Q1 ( Median)
2000-2500 5 29
= 1500 + [{ (22.5-15) /9 } X 500 ]
2500-3000 0 29
3000-3500 1 30 Q3 = 1916.66
Total 30 30
Q1 = 375
Q2 = 1500
Q3 = 1916.66
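The three quartile calculations share one interpolation formula, L + ((p x N - C) / f) x i, with p = 0.25, 0.50 and 0.75. A minimal sketch of that formula follows; the class limits and frequencies are from the class interval table, and `grouped_quantile` is an illustrative name, not something taken from the report or from MS-EXCEL.

```python
# Class intervals: (lower limit L, upper limit, frequency f)
classes = [
    (0, 500, 10),
    (500, 1000, 2),
    (1000, 1500, 3),
    (1500, 2000, 9),
    (2000, 2500, 5),
    (2500, 3000, 0),
    (3000, 3500, 1),
]

def grouped_quantile(classes, p):
    """Interpolated quantile of grouped data: L + ((p*N - C) / f) * i."""
    n = sum(f for _, _, f in classes)
    target = p * n
    cumulative = 0  # C: cumulative frequency of the classes before the current one
    for lower, upper, f in classes:
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("p must be between 0 and 1")

q1 = grouped_quantile(classes, 0.25)
q2 = grouped_quantile(classes, 0.50)
q3 = grouped_quantile(classes, 0.75)
print(q1, q2, round(q3, 2))  # 375.0 1500.0 1916.67
```

The quantile class is taken as the first class whose cumulative frequency reaches p x N, which matches the choice of the 1000-1500 class for the median.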
_________________________________________________________________________________________
9. INTER QUARTILE RANGE ( IQR)
IQR = Q3 – Q1
IQR = 1916.66 - 375 IQR = 1541.66
__________________________________________________________________________
10. UPPER LIMIT & LOWER LIMIT
Upper Limit U = Q3 + 1.5 IQR
Lower Limit L = Q1 - 1.5 IQR
U = 1916.66 + 1.5 x 1541.66 = 4229.15
L = 375 - 1.5 x 1541.66 = -1937.49
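The limits follow the usual 1.5 x IQR rule, and the same values can be used to test for outliers. The sketch below assumes the calculated quartiles (375 and 1916.66) and the minimum and maximum from the MS-EXCEL report; the function name is illustrative.

```python
def outlier_limits(q1, q3, k=1.5):
    """Lower and upper limits: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

lower, upper = outlier_limits(375, 1916.66)

# Minimum and maximum of the sample, as listed in the MS-EXCEL report
minimum, maximum = 0, 3304
has_outliers = minimum < lower or maximum > upper
print(round(lower, 2), round(upper, 2), has_outliers)
```

An observation counts as an outlier only if it falls below the lower limit or above the upper limit, so checking the minimum and maximum is enough.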
__________________________________________________________________________
11. MODE:
Mode is the value that occurs with the highest frequency. For grouped data it is taken
here as the mid value of the class interval with the highest frequency.
x            Xmid    f
0-500        250     10
500-1000     750     2
1000-1500    1250    3
1500-2000    1750    9
2000-2500    2250    5
2500-3000    2750    0
3000-3500    3250    1
Total                30
Frequency is maximum ( 10 ) in the interval 0-500.
Mode = 250
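Picking the modal class can be sketched in a few lines. The frequencies are from the class interval table; taking the mid value of the modal class as the mode is the approach used in this report, and other textbooks refine it by interpolation.

```python
# Frequency of each class interval, keyed by (lower limit, upper limit)
frequencies = {
    (0, 500): 10,
    (500, 1000): 2,
    (1000, 1500): 3,
    (1500, 2000): 9,
    (2000, 2500): 5,
    (2500, 3000): 0,
    (3000, 3500): 1,
}

# The modal class is the interval with the highest frequency
modal_class = max(frequencies, key=frequencies.get)
mode = sum(modal_class) / 2  # mid value of the modal class
print(modal_class, mode)  # (0, 500) 250.0
```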
__________________________________________________________________________
12. SKEWNESS:
Skp = {3 ( Mean – Median )} / S
Skp = [ 3 ( 1266.66 -1500)] / 759.31
Skp = - 0.9219
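The formula above is Pearson's second coefficient of skewness. A small sketch, using the mean, median and standard deviation quoted in this report as inputs (the function name is illustrative):

```python
def pearson_skewness(mean, median, std_dev):
    """Pearson's second skewness coefficient: 3 * (mean - median) / S."""
    return 3 * (mean - median) / std_dev

skp = pearson_skewness(1266.66, 1500, 759.31)
print(round(skp, 4))  # -0.9219
```

A negative value means the mean lies below the median. The coefficient is sensitive to the standard deviation used, so a different S changes its magnitude but not its sign.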
__________________________________________________________________________
III. ANALYSIS:
13. Appropriate measures of central tendency and dispersion
13.1 Central Tendency: Mean,Median,Q1,Q3
Mean 1266.66 Km
Median 1500 Km
[Bar chart: Mean and Median, in Km]
Observation:
It can be inferred that students of the PGP Jan-10 batch travel an average distance of
1266.66 Km to come to Alliance Business School.
50 % of the observations lie above 1500 Km
Q1 ( First Quartile) 375 Km
Q3 ( Third Quartile) 1916.66 Km
[Bar chart: Q1, Q2 and Q3, in Km]
Observation:
25 % of the observations lie below 375 Km
50 % of the observations lie between 375 Km and 1916.66 Km
25 % of the observations lie above 1916.66 Km
13.2 Dispersion: Standard Deviation, Sample Variance,Range,IQR,Coefficient
Standard Deviation 915.7679423 Km
Sample Variance 838630.9241 Km²
Observation:
On average, an observation deviates from the mean by about 915.77 Km
( more or less ).
Mean + Deviation = 1266.66 + 915.77 = 2182.43 Km
Mean – Deviation = 1266.66 – 915.77 = 350.89 Km
This band has a width of 2 X 915.77 = 1831.54 Km
On average, the observations range from a lower value of 350.89 Km to an upper
value of 2182.43 Km
[Bar chart: IQR, Range and Standard Deviation, in Km]
Range 3304 Km
IQR 1602 Km
Observation:
The students' home towns are spread over a range of 3304 Km
The middle 50 % of the observations lie within a span of 1602 Km
13.3 Coefficient of variation:
Coefficient of variation = ( Standard Deviation / Mean ) X 100
= ( 915.7679423 / 1266.66 ) X 100
= 72.30 %
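The coefficient of variation expresses the standard deviation as a percentage of the mean. A minimal sketch, using the same two inputs as the calculation above (the MS-EXCEL standard deviation and the calculated mean); the function name is illustrative:

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return std_dev / mean * 100

cv = coefficient_of_variation(915.7679423, 1266.66)
print(round(cv, 2))  # 72.3
```

Pairing the standard deviation and the mean from the same source (both calculated, or both from MS-EXCEL) would give a different percentage.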
_________________________________________________________________________________________
14. Maximum Observed Data ( Mode)
Observation:
The class interval with the most observations is 0 Km to 500 Km
Mode is its mid value (0 + 500) / 2 = 250 Km
[Bar chart: frequency f for each class interval]
_________________________________________________________________________________________
15. Box-and-whisker plot
              Calculated    MS-Excel
Q1            375           327 Km
Q2            1500          1443 Km
Q3            1916.66       1929 Km
IQR           1541.66       1602 Km
Upper Limit   4229.15       4332 Km
Lower Limit   -1937.49      -2076 Km
15.1 Outliers : Upper Limit, Lower Limit
Upper Limit 4229.15 Km
Lower Limit -1937.49 Km
Max value : 3304 Km < upper limit : 4229.15 Km
Min value : 0 Km > lower limit : -1937.49 Km
Hence, no outliers
Observation:
Any observation outside the range -1937.49 Km to 4229.15 Km would be an outlier.
The maximum sample value is 3304 Km < 4229.15 Km and the minimum is 0 Km > -1937.49 Km,
so there are no outliers
[Bar chart: Upper Limit and Lower Limit, in Km]
15.2 Evidence of Skewness
Skewness -0.9219
Observation:
The data are negatively skewed, as mean < median
The skewness is -0.9219
In the box plot the median does not lie exactly midway between Q1 and Q3.
________________________________________________________________________________________
APPENDIX – 2
CLASSIFIED CLASS INTERVAL TABLE : 2
X (in Km)    Xmid (in Km)    f     cumulative f    Xmid X f
0-500        250             10    10              2500
500-1000     750             2     12              1500
1000-1500    1250            3     15              3750
1500-2000    1750            9     24              15750
2000-2500    2250            5     29              11250
2500-3000    2750            0     29              0
3000-3500    3250            1     30              3250
Total                        30    30              38000
MS-EXCEL Vs CALCULATED REPORT
                     Calculated Value    Simulated By Ms-excel
Mean                 1266.66             1221.2
Mode                 250                 0
Median               1500                1443
Standard Deviation   759.31              915.7679423
Sample variance      576552.2            838630.9241
Skewness             -0.9219             0.117333504
Range                3304                3304
Minimum              0                   0
Maximum              3304                3304
Sum                  38000               36636
Count                30                  30
Largest(1)           3304                3304
Smallest(1)          0                   0
Q1                   375                 327
Q2                   1500                1443
Q3                   1916.66             1929
IQR                  1541.66             1602
Upper Limit          4229.15             4332
Lower Limit          -1937.49            -2076
APPENDIX – 3
MS-EXCEL Vs CALCULATED REPORT
[Bar chart: calculated value vs. Ms-Excel value for Mean, Median, Standard Deviation and Range]
[Bar chart: calculated data vs. MS-Excel value for Q1, Q2, Q3, IQR, Upper Limit and Lower Limit]
BIBLIOGRAPHY
Statistics for Business and Economics
( Anderson, Sweeney, Williams, Cengage Learning, 9th Edition)
www.wikipedia.org
www.stats4u.com
www.maps.google.co.in