Statistical analysis of historical crude oil prices. The data are analyzed using statistical models and graphs, and the results are interpreted. The industry was also researched thoroughly. The report uses the following statistical measures: mean and its types, median, mode, variance, standard deviation, range, correlation and regression, kurtosis, and the coefficient of skewness.
Each measure is calculated both manually and with MS Excel's built-in formulas.
A REPORT ON:
Statistical Analysis on Crude Oil Prices
Submitted to: Dr. Rajni
In partial fulfilment of the requirement of the
Statistics for Management Course
By:
Mayank Agrawal
On:
November 06, 2018
Letter of Transmittal
November 06, 2018
Dr. Rajni
Assistant Professor
Statistics for Management
Department of Business Administration
Jindal Global Business School
Subject: Statistical Analysis Report on Crude Oil Prices
Dear Ma'am,
As per the guidelines you provided during the classroom discussion, we hereby submit a report on the
statistical analysis of crude oil prices. This report has been prepared after careful scrutiny of the facts,
measuring and evaluating the relevant statistical measures.
Kindly evaluate the same and provide your valuable inputs.
Best Regards,
Mayank Agrawal
Jindal Global Business School
Arithmetic Mean:
It is defined as the average of a set of numerical values, as calculated by adding them together and dividing
by the number of terms in the set.
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
$$\bar{x} = \frac{17163.48}{393}$$
$$\bar{x} = 43.67$$
The value of $\bar{x}$ represents the mean price of crude oil over 33 years (393 observations).
Arithmetic Mean for Grouped Data:
When data are grouped into classes or intervals and presented as tables or diagrams, the definition of the
mean is unchanged, but the method of obtaining it differs from that used for ungrouped data. Each class
midpoint is multiplied by the class frequency, the products are summed, and the mean is then obtained by
dividing this sum by the number of cases.
Prices Frequency
20-30 199
30-40 27
40-50 33
50-60 26
60-70 23
70-80 22
80-90 17
90-100 22
100-110 19
110-120 2
120-130 1
130-140 2
$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$$
$$\bar{x} = \frac{18135}{393}$$
$$\bar{x} = 46.15$$
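As a cross-check, the grouped mean can be reproduced in a few lines of Python. This is only a minimal sketch: the variable names are illustrative, and it assumes each class is represented by its midpoint (25 for 20-30, 35 for 30-40, and so on) with the frequencies from the table above.

```python
# Class midpoints and frequencies copied from the grouped frequency table.
midpoints = [25, 35, 45, 55, 65, 75, 85, 95, 105, 115, 125, 135]
frequencies = [199, 27, 33, 26, 23, 22, 17, 22, 19, 2, 1, 2]

# Total number of observations (sum of f_i).
total_frequency = sum(frequencies)

# Sum of f_i * x_i over all classes.
weighted_sum = sum(f * x for f, x in zip(frequencies, midpoints))

# Grouped mean = (sum of f_i * x_i) / (sum of f_i).
grouped_mean = weighted_sum / total_frequency

print(total_frequency, weighted_sum, round(grouped_mean, 2))
```

With the table above, the sketch gives a total frequency of 393, a weighted sum of 18135 and a grouped mean of about 46.15.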
Arithmetic Mean by Step Deviation Method:
Prices     Mid Value (xi)  Frequency (fi)  di = xi - a  ui = di/h  fiui
20-30      25              199             -55          -5         -995
30-40      35              27              -45          -4.1       -110.5
40-50      45              33              -35          -3.2       -105.0
50-60      55              26              -25          -2.3       -59.1
60-70      65              23              -15          -1.4       -31.4
70-80      75              22              -5           -0.5       -10.0
80-90      85              17              5            0.5        7.7
90-100     95              22              15           1.4        30.0
100-110    105             19              25           2.3        43.2
110-120    115             2               35           3.2        6.4
120-130    125             1               45           4.1        4.1
130-140    135             2               55           5          10
Total (Σ)                  393                                     -1209.55
$$\bar{x} = a + \frac{\sum f_i u_i}{\sum f_i} \times h$$
where
• $\bar{x}$ is the mean which we're trying to find.
• a is the assumed mean.
• h is the class interval.
• fi is the frequency of each class; the total frequency of all the classes in the
data set (Σfi) is found by adding up all the fi's.
• Each ui is found from the following formula:
$$u_i = \frac{d_i}{h}$$
Prices Frequency(Fi) Mid Value (Xi) FiXi
20-30 199 25 4975
30-40 27 35 945
40-50 33 45 1485
50-60 26 55 1430
60-70 23 65 1495
70-80 22 75 1650
80-90 17 85 1445
90-100 22 95 2090
100-110 19 105 1995
110-120 2 115 230
120-130 1 125 125
130-140 2 135 270
Total (Σ) 393 18135
• where h is the class interval and each di is the difference between the midpoint of a
class and the assumed mean.
• di is calculated from the following formula:
$$d_i = x_i - a$$
• where xi is the midpoint of a given class, obtained from the following:
$$x_i = \frac{\text{lower limit} + \text{upper limit}}{2}$$
Here,
a = 80
h = 30-20+1 = 11
$$\bar{x} = 80 + \frac{-1209.55}{393} \times 11$$
$$\bar{x} = 46.145$$
Because each ui is di divided by h and the sum is multiplied by h again, the result does not depend on the
value chosen for h; the class width of 10 gives the same mean.
Grouped data mean = Step deviation method mean
Arithmetic Mean by Assumed Mean Method:
Prices     Mid Value (xi)  Frequency (fi)  di = xi - a  fidi
20-30      25              199             -55          -10945
30-40      35              27              -45          -1215
40-50      45              33              -35          -1155
50-60      55              26              -25          -650
60-70      65              23              -15          -345
70-80      75              22              -5           -110
80-90      85              17              5            85
90-100     95              22              15           330
100-110    105             19              25           475
110-120    115             2               35           70
120-130    125             1               45           45
130-140    135             2               55           110
Total (Σ)                  393                          -13305
$$\bar{x} = a + \frac{\sum f_i d_i}{\sum f_i}$$
where
• fi is the frequency of each class
• di is calculated from the following formula:
$$d_i = x_i - a$$
• a is the assumed mean.
Here,
a = 80
$$\bar{x} = 80 + \frac{-13305}{393}$$
$$\bar{x} = 46.145$$
Grouped data mean = Step deviation method mean = Assumed mean method
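The agreement between the three methods can be checked with a short Python sketch. The function names are illustrative rather than part of the report, and the midpoints and frequencies are taken from the tables above. The step-deviation function is called with two different values of h to show that the choice of h does not affect the result.

```python
# Class midpoints and frequencies copied from the tables above.
midpoints = [25, 35, 45, 55, 65, 75, 85, 95, 105, 115, 125, 135]
frequencies = [199, 27, 33, 26, 23, 22, 17, 22, 19, 2, 1, 2]


def direct_mean(xs, fs):
    """Grouped mean: sum(f * x) / sum(f)."""
    return sum(f * x for f, x in zip(fs, xs)) / sum(fs)


def assumed_mean_method(xs, fs, a):
    """Mean = a + sum(f * d) / sum(f), with d = x - a."""
    return a + sum(f * (x - a) for f, x in zip(fs, xs)) / sum(fs)


def step_deviation_method(xs, fs, a, h):
    """Mean = a + (sum(f * u) / sum(f)) * h, with u = (x - a) / h."""
    sum_fu = sum(f * (x - a) / h for f, x in zip(fs, xs))
    return a + sum_fu / sum(fs) * h


a = 80  # assumed mean used in the report
print(round(direct_mean(midpoints, frequencies), 3))
print(round(assumed_mean_method(midpoints, frequencies, a), 3))
print(round(step_deviation_method(midpoints, frequencies, a, 11), 3))
print(round(step_deviation_method(midpoints, frequencies, a, 10), 3))
```

All four printed values should agree at about 46.145, since the shortcut methods are only algebraic rearrangements of the direct formula.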
Geometric Mean:
It is defined as the central number in a geometric progression (e.g. 9 in 3, 9, 27), also calculable as the nth
root of a product of n numbers.
Prices     Mid Value (xi)  Frequency (fi)  log xi  fi log xi
20-30      25              199             1.4     278.2
30-40      35              27              1.5     41.7
40-50      45              33              1.7     54.6
50-60      55              26              1.7     45.2
60-70      65              23              1.8     41.7
70-80      75              22              1.9     41.3
80-90      85              17              1.9     32.8
90-100     95              22              2.0     43.5
100-110    105             19              2.0     38.4
110-120    115             2               2.1     4.1
120-130    125             1               2.1     2.1
130-140    135             2               2.1     4.3
Total (Σ)                  393             22.2    627.8
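$$G.M. = \operatorname{antilog}\left(\frac{\sum f_i \log x_i}{\sum f_i}\right)$$
$$G.M. = \operatorname{antilog}\left(\frac{627.8}{393}\right)$$
$$G.M. = \operatorname{antilog}(1.5975)$$
The logarithms in the table are to base 10, so the antilog is also taken to base 10:
$$G.M. = 10^{1.5975} = 39.58$$
The G.M. indicates the central tendency or typical value of a set of numbers by using the product of their
values.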
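The geometric mean of the grouped data can be sketched in Python as follows. The names are illustrative, and the sketch assumes base-10 logarithms of the class midpoints, as in the table; the antilog has to use the same base as the logarithm.

```python
import math

# Class midpoints and frequencies copied from the table above.
midpoints = [25, 35, 45, 55, 65, 75, 85, 95, 105, 115, 125, 135]
frequencies = [199, 27, 33, 26, 23, 22, 17, 22, 19, 2, 1, 2]

n = sum(frequencies)

# Frequency-weighted mean of the base-10 logarithms of the midpoints.
mean_log = sum(f * math.log10(x) for f, x in zip(frequencies, midpoints)) / n

# Antilog in the same base (10) as the logarithm.
geometric_mean = 10 ** mean_log

print(round(mean_log, 4), round(geometric_mean, 2))
```

A useful sanity check is that, for positive data, the geometric mean must lie between the harmonic mean and the arithmetic mean.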
Harmonic Mean:
The harmonic mean (sometimes called the subcontrary mean) is one of several kinds of average, and in
particular one of the Pythagorean means. Typically, it is appropriate for situations when the average
of rates is desired.
The harmonic mean can be expressed as the reciprocal of the arithmetic mean of the reciprocals of the given
set of observations.
Prices     Frequency (fi)  Mid Value (xi)  1/x     f/x
20-30      199             25              0.040   7.96
30-40      27              35              0.029   0.77
40-50      33              45              0.022   0.73
50-60      26              55              0.018   0.47
60-70      23              65              0.015   0.35
70-80      22              75              0.013   0.29
80-90      17              85              0.012   0.20
90-100     22              95              0.011   0.23
100-110    19              105             0.010   0.18
110-120    2               115             0.009   0.02
120-130    1               125             0.008   0.01
130-140    2               135             0.007   0.01
Total (Σ)  393                                     11.24
$$H.M. = \frac{\sum f_i}{\sum \frac{f_i}{x_i}}$$
$$H.M. = \frac{393}{11.24}$$
$$H.M. = 34.96$$
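The same calculation can be sketched in Python. The names are illustrative, and the data are the midpoints and frequencies from the table above. Because the sketch does not round the reciprocals, its result differs from the hand calculation in the second decimal place.

```python
# Class midpoints and frequencies copied from the table above.
midpoints = [25, 35, 45, 55, 65, 75, 85, 95, 105, 115, 125, 135]
frequencies = [199, 27, 33, 26, 23, 22, 17, 22, 19, 2, 1, 2]

n = sum(frequencies)

# Sum of f_i / x_i over all classes.
reciprocal_sum = sum(f / x for f, x in zip(frequencies, midpoints))

# Harmonic mean = (sum of f_i) / (sum of f_i / x_i).
harmonic_mean = n / reciprocal_sum

print(round(reciprocal_sum, 2), round(harmonic_mean, 2))
```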
Median:
It is the value lying at the midpoint of a frequency distribution of observed values, such that an
observation is equally likely to fall above or below it.
Prices     Frequency (fi)  Mid Value (xi)  fixi    Cumulative frequency (Cf)
20-30      199             25              4975    199
30-40      27              35              945     226
40-50      33              45              1485    259
50-60      26              55              1430    285
60-70      23              65              1495    308
70-80      22              75              1650    330
80-90      17              85              1445    347
90-100     22              95              2090    369
100-110    19              105             1995    388
110-120    2               115             230     390
120-130    1               125             125     391
130-140    2               135             270     393
Total (Σ)  393                             18135   3885
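$$M = L + \frac{\frac{N}{2} - C_f}{f} \times h$$
where
L = Lower limit of the median class
N = Total number of observations
Cf = Cumulative frequency of the class preceding the median class
f = frequency of the median class
h = class interval of the median class
Here N/2 = 196.5, which falls in the first class (20-30, cumulative frequency 199), so L = 20, Cf = 0,
f = 199 and h = 10.
$$M = 20 + \frac{196.5 - 0}{199} \times 10$$
M = 29.87
It is the middle value of all the observations.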
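The grouped-median formula can be sketched in Python as follows. The function name and arguments are illustrative. The sketch walks through the classes until the cumulative frequency reaches N/2, then interpolates linearly inside that class, assuming the observations are spread evenly within it.

```python
# Lower class limits and frequencies copied from the table above.
lower_limits = [20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130]
frequencies = [199, 27, 33, 26, 23, 22, 17, 22, 19, 2, 1, 2]


def grouped_median(lowers, fs, width):
    """Median = L + ((N/2 - Cf) / f) * h for grouped data."""
    half = sum(fs) / 2
    cumulative = 0  # Cf: cumulative frequency before the current class
    for lower, f in zip(lowers, fs):
        if cumulative + f >= half:
            # This is the median class: interpolate within it.
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("no classes supplied")


median = grouped_median(lower_limits, frequencies, width=10)
print(round(median, 2))
```

Because the first class alone holds 199 of the 393 observations, the loop stops at the first class, with Cf = 0.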
Mode:
The mode is a statistical term that refers to the most frequently occurring number found in a set of numbers.
The mode is found by collecting and organizing data in order to count the frequency of each result. The
result with the highest number of occurrences is the mode of the set.
Prices     Frequency (fi)  Mid Value (xi)  fixi    Cumulative frequency
20-30      199             25              4975    199
30-40      27              35              945     226
40-50      33              45              1485    259
50-60      26              55              1430    285
60-70      23              65              1495    308
70-80      22              75              1650    330
80-90      17              85              1445    347
90-100     22              95              2090    369
100-110    19              105             1995    388
110-120    2               115             230     390
120-130    1               125             125     391
130-140    2               135             270     393
Total (Σ)  393                             18135   3885
$$\text{Mode} = L + \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)} \times h$$
Here,
L = Lower limit of the modal class
h = class length of the modal class
f1 = frequency of the modal class
f0 = frequency of the pre-modal class
f2 = frequency of the post-modal class
The modal class is 20-30 (frequency 199). It is the first class, so f0 = 0.
$$\text{Mode} = 20 + \frac{199 - 0}{(199 - 0) + (199 - 27)} \times 10$$
Mode = 25.36
So 25.36 is the estimated most frequently occurring price.
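A minimal Python sketch of the grouped-mode formula is given below. The function name is illustrative. The sketch assumes that a modal class at either end of the table has a neighbouring frequency of zero, as was done for f0 in the calculation above.

```python
# Lower class limits and frequencies copied from the table above.
lower_limits = [20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130]
frequencies = [199, 27, 33, 26, 23, 22, 17, 22, 19, 2, 1, 2]


def grouped_mode(lowers, fs, width):
    """Mode = L + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * h."""
    # Index of the modal class (the class with the highest frequency).
    i = max(range(len(fs)), key=lambda k: fs[k])
    f1 = fs[i]
    f0 = fs[i - 1] if i > 0 else 0            # pre-modal class
    f2 = fs[i + 1] if i + 1 < len(fs) else 0  # post-modal class
    return lowers[i] + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * width


mode = grouped_mode(lower_limits, frequencies, width=10)
print(round(mode, 2))
```

With the table above, the sketch reproduces the value of about 25.36.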
Variance:
Variance is the expectation of the squared deviation of a random variable from its mean. Informally, it
measures how far a set of (random) numbers are spread out from their average value.
$$\text{Variance} = \frac{\sum x^2}{N} - \mu^2$$
where
$$\mu = \frac{\sum x}{N}$$
µ = Mean
x = Crude oil prices
N = Number of observations
$$\text{Variance} = \frac{1092281}{393} - 1907.32$$
$$\text{Variance} = 2779.34 - 1907.32 = 872.02$$
It tells how far the prices are spread out from their arithmetic mean.
Standard Deviation:
A quantity expressing by how much the members of a group differ from the mean value for the group.
$$S.D. = \sqrt{\text{Variance}}$$
$$S.D. = \sqrt{872.02}$$
$$S.D. = 29.53$$
The S.D. is about two-thirds of the mean (43.67), which shows that the observations are widely spread around the mean.
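The variance and standard deviation can be recomputed from the summary totals alone, as in the sketch below. The variable names are illustrative; the totals (Σx = 17163.48, Σx² = 1092281, N = 393) are the ones quoted in this report. This is the population form of the variance (dividing by N), which is the form used in the formula above.

```python
import math

# Summary totals quoted in the report.
n = 393               # number of observations
sum_x = 17163.48      # sum of prices
sum_x2 = 1092281      # sum of squared prices

mean = sum_x / n

# Population variance: mean of squares minus square of the mean.
variance = sum_x2 / n - mean ** 2

# Standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 2))
```

Carrying full precision for the mean gives a variance of about 872.01, in line with the Excel figure.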
Range:
Range = Upper limit (max) - Lower limit (min)
Range = 133.88 - 11.35
Range = 122.53
Pearson's Correlation Coefficient
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$$
Substituting the observed sums gives
r = 0.99
Since r = 0.99 is close to +1, Pearson's Correlation Coefficient indicates a strong positive linear correlation.
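The correlation formula can be sketched in Python as follows. The two short series in the example are hypothetical values made up purely for illustration, not the crude oil data, and the function name is an assumption of this sketch.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from the raw sums, using the computational formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical example data (NOT taken from the report).
xs = [20, 40, 60, 80, 100]
ys = [23, 44, 67, 88, 111]
print(round(pearson_r(xs, ys), 4))
```

A value near +1 or -1 indicates a strong linear relationship; a value near 0 indicates little or no linear relationship.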
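Bowley's Coefficient of Skewness
The quartiles are located in the grouped table in the same way as the median, at the cumulative positions
N/4 = 98.25, N/2 = 196.5 and 3N/4 = 294.75:
$$Q_1 = 20 + \frac{98.25 - 0}{199} \times 10 = 24.94$$
$$Q_2 = 20 + \frac{196.5 - 0}{199} \times 10 = 29.87$$
$$Q_3 = 60 + \frac{294.75 - 285}{23} \times 10 = 64.24$$
Thus, Bowley's Coefficient of Skewness is equal to
$$\frac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1} = \frac{64.24 + 24.94 - 2(29.87)}{64.24 - 24.94}$$
= 0.75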
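Bowley's coefficient is built from the three quartile values, so a sketch of it needs a way to read quartiles off the grouped table. The Python below is one possible approach, assuming linear interpolation inside the class that contains each quartile position; the function name and the constant are illustrative.

```python
# Lower class limits and frequencies copied from the grouped table.
lower_limits = [20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130]
frequencies = [199, 27, 33, 26, 23, 22, 17, 22, 19, 2, 1, 2]
CLASS_WIDTH = 10


def grouped_quantile(p):
    """Value below which a fraction p of the observations fall."""
    target = p * sum(frequencies)
    cumulative = 0  # cumulative frequency before the current class
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= target:
            return lower + (target - cumulative) / f * CLASS_WIDTH
        cumulative += f
    raise ValueError("p must lie between 0 and 1")


q1, q2, q3 = (grouped_quantile(p) for p in (0.25, 0.50, 0.75))

# Bowley's coefficient: (Q3 + Q1 - 2*Q2) / (Q3 - Q1).
bowley = (q3 + q1 - 2 * q2) / (q3 - q1)

print(round(q1, 2), round(q2, 2), round(q3, 2), round(bowley, 2))
```

The coefficient always lies between -1 and +1; a positive value means the upper quartile is further from the median than the lower quartile, which indicates a right-skewed distribution.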
Coefficient of Kurtosis
$$m_2 = \frac{330207.39}{377} = 875.881$$
$$m_4 = \frac{772412272.00}{377} = 2048838.918$$
$$k = \frac{m_4}{m_2^2} = \frac{2048838.918}{875.881^2}$$
k = 2.67
Since the value of kurtosis, k = 2.67, is less than 3 (the value for a normal distribution), the
curve is said to be PLATYKURTIC.
A platykurtic curve represents a frequency distribution with lower kurtosis than the
normal distribution: it is flatter around the mean and has lighter tails.
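The kurtosis calculation and its interpretation can be sketched in Python as follows. The two numerators are the totals quoted above; treating them as the sums of squared and fourth-power deviations from the mean is an assumption of this sketch, as are the variable names. The ratio m4 / m2² is compared with 3, the value for a normal distribution.

```python
# Totals quoted in the report (assumed to be sums of deviations from the
# mean raised to the 2nd and 4th powers) and the count used with them.
n = 377
m2_numerator = 330207.39
m4_numerator = 772412272.00

m2 = m2_numerator / n   # second moment
m4 = m4_numerator / n   # fourth moment

kurtosis = m4 / m2 ** 2
excess_kurtosis = kurtosis - 3  # a normal distribution has kurtosis 3

if excess_kurtosis > 0:
    shape = "leptokurtic"   # more peaked, heavier tails than normal
elif excess_kurtosis < 0:
    shape = "platykurtic"   # flatter, lighter tails than normal
else:
    shape = "mesokurtic"    # same kurtosis as normal

print(round(kurtosis, 2), round(excess_kurtosis, 2), shape)
```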
Regression:
Regression analysis is a set of statistical processes for estimating the relationships among variables.
$$b = \frac{2779.340}{1907.408}$$
b = 1.1005
[Figure: Regression plot of Y against X]
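For reference, a simple least-squares regression line y = a + b·x can be sketched in Python as below. This is a generic illustration and not necessarily the exact calculation used in this report; the function name is an assumption, and the example series are hypothetical values made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: covariance of x and y divided by variance of x (raw-sum form).
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: the fitted line passes through the point of means.
    a = (sum_y - b * sum_x) / n
    return a, b


# Hypothetical example data (NOT taken from the report).
xs = [20, 40, 60, 80, 100]
ys = [23, 44, 67, 88, 111]
intercept, slope = least_squares_line(xs, ys)
print(round(intercept, 3), round(slope, 3))
```

The slope b estimates how much Y changes, on average, for a one-unit change in X.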
Comparison:
Measure             Excel        Manual calculation
Mean                43.6729771   43.67
Median              29.66        29.87
Mode                21.34        25.36
Standard deviation  29.56749459  29.53
Variance            872.0122153  872.02
Range               122.53       122.53