1. Statistics and Research methodologies Assignment 1.
On June 14, 2013 the finance minister, P. Chidambaram, urged Indians
not to buy too much gold for at least one year, because gold imports
raise the country's import bill, which widens the current account
deficit, which in turn leads to the rupee's devaluation, making it
weaker compared to other currencies.
The finance minister made the above statement because he knows
that a link exists between the amount of gold a country
consumes and the value of the nation's currency.
The above statement suggests that there is a correlation between the
consumption of gold and the forex rate of a currency. As price is a
function of consumption, there must also exist a relation between the
price of gold and the currency exchange rate. The relation is: the more
gold is imported, the higher its price, and a higher price leads to a
larger current account deficit, which in turn finally leads to a weaker currency.
Thus we can say that in the above case the price of gold is the independent variable (X) and the foreign
exchange rate is the dependent variable (Y).
Let us analyze the above theory with the statistical tools of regression and correlation.
(From now on the independent variable, i.e. the price of gold, and the dependent variable, i.e. the forex
rate of the currency, will be called X and Y respectively.)
The variables X and Y are said to be correlated if a change in one affects or induces a change in the other.
If they are indeed correlated then there must be some relation between them that we can use to
predict or estimate the value of unknown variable from the value of known variable. This prediction can
be done by the process of regression analysis.
* End of Introduction *
[Table: gold price per gram (X) and rupee exchange rate (Y)]
Above is the data for variables X and Y.
The scatter chart would be as follows:
[Scatter chart of Y against X with fitted trend line: y = -0.0039x + 73.635, R² = 0.2482]
There are two types of correlation: positive correlation and negative correlation.
Positive correlation is seen when the values of the two variables move in the same direction, i.e. if the
value of one increases (or decreases), the value of the other also increases (or decreases).
Generally the scatter chart of a positive correlation resembles the figure below:
Negative correlation is seen when the values of the two variables move in opposite directions, i.e. if the
value of one increases (or decreases), the value of the other decreases (or increases).
Generally the scatter chart of a negative correlation resembles the figure below:
All charts other than the two above represent partial correlation, meaning that an increase in the value
of X is not necessarily accompanied by an increase in the value of Y, and vice versa.
Our chart resembles that of a negative correlation. It is not perfectly negative but tends towards
negative, i.e. most of the time when X, the independent variable, increases, Y decreases. We
can also say that when the price of gold increases, the currency exchange rate decreases (i.e. the currency
becomes stronger, the opposite of what we assumed in the theory).
Secondly, the scatter of the points also tells us about the strength of the correlation between the variables.
If the points are widely scattered, the variables are poorly correlated; if they lie close to a line, the
variables are strongly correlated.
In our case the points are neither widely scattered nor tightly clustered. There exists a moderate
amount of correlation between the two variables X and Y.
The mean and standard deviation of variables X and Y are as follows:
The standard deviation of X is noticeably high because gold prices keep rising, and each day's
value deviates considerably from the previous one.
On the other hand, the standard deviation of Y is low because even a small fluctuation in the
currency is the result of many factors. A drastic change in the price of gold is still not enough to induce a big
change in the forex rate.
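As a small sketch of how these two summary figures are computed outside Excel, the Python snippet below uses the standard statistics module. The five data points are hypothetical and only illustrate the calculation; they are not the data of this assignment. The sample standard deviation (dividing by n - 1) is assumed here.

```python
import statistics

# Hypothetical values for illustration only, NOT the assignment's data.
gold_per_gram = [2600.0, 2650.0, 2700.0, 2750.0, 2800.0]   # X, Rs per gram
rupee_rate = [63.0, 62.0, 62.5, 61.5, 61.0]                # Y, rupee exchange rate

mean_x = statistics.mean(gold_per_gram)
mean_y = statistics.mean(rupee_rate)

# statistics.stdev is the sample standard deviation (n - 1 in the denominator).
sd_x = statistics.stdev(gold_per_gram)
sd_y = statistics.stdev(rupee_rate)

print(f"X: mean = {mean_x:.2f}, standard deviation = {sd_x:.4f}")
print(f"Y: mean = {mean_y:.2f}, standard deviation = {sd_y:.4f}")
```

The two standard deviations are expressed in the units of their own variable, so their sizes should be read against the scale of X and Y respectively.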
The value of the covariance between the two variables can be found in Excel by using the function
“=COVARIANCE.S(array1, array2)”. It comes out to be -18.9945.
Covariance and correlation both show the relationship between two variables, but covariance carries the
units of the variables while correlation does not.
Covariance shows the direction of the relation between two variables, and its size depends on their units.
Correlation expresses the same relation on a scale of -1 to +1, without units, so it also shows the strength.
In our case there’s negative covariance which shows the relationship between X and Y is inverse in
nature i.e. if X decreases, Y increases and vice-versa.
Negative covariance in our case also shows that higher than average (or mean) values of X are paired
with lower than average (or mean) values of Y.
Covariance is generally not used to interpret how two variables vary with each other because its
magnitude is hard to interpret. Correlation is used far more often for this purpose.
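To illustrate why the magnitude of covariance is hard to interpret, here is a minimal Python sketch. The data points are hypothetical (not the assignment's data), and the helper names sample_covariance and correlation are ours. Changing the unit of X from rupees per gram to rupees per 10 grams multiplies the covariance by 10 but leaves the correlation unchanged.

```python
from math import sqrt


def sample_covariance(xs, ys):
    """Sample covariance (divides by n - 1), the convention COVARIANCE.S uses."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Covariance divided by both standard deviations, so it lies in [-1, 1]."""
    return sample_covariance(xs, ys) / sqrt(
        sample_covariance(xs, xs) * sample_covariance(ys, ys)
    )


# Hypothetical values for illustration only, NOT the assignment's data.
gold_per_gram = [2600.0, 2650.0, 2700.0, 2750.0, 2800.0]   # Rs per gram
rupee_rate = [63.0, 62.0, 62.5, 61.5, 61.0]                # rupee exchange rate

# The same prices expressed per 10 grams: only the unit changes.
gold_per_10_grams = [10 * x for x in gold_per_gram]

cov_gram = sample_covariance(gold_per_gram, rupee_rate)
cov_10_grams = sample_covariance(gold_per_10_grams, rupee_rate)
r_gram = correlation(gold_per_gram, rupee_rate)
r_10_grams = correlation(gold_per_10_grams, rupee_rate)

print(f"covariance:  {cov_gram:.2f} (per gram) vs {cov_10_grams:.2f} (per 10 grams)")
print(f"correlation: {r_gram:.2f} (per gram) vs {r_10_grams:.2f} (per 10 grams)")
```

Both quantities are negative for this made-up data, but only the correlation can be compared across units.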
The correlation coefficient, as stated above, is a scaled version of the covariance. It can be obtained with the
CORREL function in Excel. Inputting the values of X and Y in the function gives a correlation coefficient of
about -0.498 (the negative square root of the R² = 0.2482 shown on the scatter chart).
The value is negative, which means there is a negative correlation between X and Y. Since -1 is the
coefficient for perfectly negatively correlated variables, in our case the variables vary with each other
with almost half the strength of a perfect negative correlation.
Generally an absolute value of the correlation coefficient above 0.8 is considered strong
and an absolute value below 0.5 is considered weak. So in our case the variables are right on the verge of
being poorly correlated.
The regression coefficient can be calculated by using the Data Analysis tool of Excel. The regression
coefficient of Y on X comes out to be -0.003948112.
The regression coefficient of Y on X is the covariance of X and Y divided by the variance of X.
The regression coefficient is also the slope of the line that we got in the scatter graph.
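The "covariance divided by variance" rule can be sketched in a few lines of Python. The function name fit_line and the five data points are hypothetical and serve only as an illustration; the last two lines apply the same rule backwards to the covariance and regression coefficient reported above to see which variance of X they imply.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept of Y on X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    slope = cov_xy / var_x                 # regression coefficient of Y on X
    intercept = mean_y - slope * mean_x    # the fitted line passes through the means
    return slope, intercept


# Hypothetical values for illustration only, NOT the assignment's data.
gold_per_gram = [2600.0, 2650.0, 2700.0, 2750.0, 2800.0]
rupee_rate = [63.0, 62.0, 62.5, 61.5, 61.0]

slope, intercept = fit_line(gold_per_gram, rupee_rate)
print(f"slope = {slope:.4f}, intercept = {intercept:.2f}")

# Figures reported in the text: variance of X = covariance / regression coefficient.
implied_var_x = -18.9945 / -0.003948112
print(f"variance of X implied by the reported figures: {implied_var_x:.2f}")
```

The implied variance of X comes out at roughly 4811, which is only a consistency check on the reported numbers, not a new result.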
The basic difference between correlation coefficient and regression coefficient is that correlation
coefficient gives us how strong a relationship exists between the variables while on the other hand
regression coefficient allows us to estimate change in value of Y for a unit increase in X.
In our case the regression coefficient is negative, with a value of -0.0039. This means that Y decreases
by 0.0039 for each unit increase in X, i.e. for each one-rupee rise in the price of gold per gram the
forex rate decreases by 0.0039, again going against our theory.
The linear regression equation is given by Y = aX + b, where Y is the dependent variable, X is the
independent variable, a is the regression coefficient or the slope of the line, and b is the intercept cut by the
line on the Y axis. The line described by this equation is the one that fits the data most closely, so it is also
called the line of best fit.
So the equation will come out to be: Y = -0.0039X + 73.635
The intercept is the value of Y predicted when X is 0. Since a gold price of 0 does not occur, the intercept in the above equation has no specific meaning here.
The regression equation of X on Y can be found by swapping the variables and entering their values in Excel. It
gives us the equation:
X = -62.87Y + 6902.9
Graph for the above equation would be as follows:
[Chart: Regression equation of X on Y, with Y on the horizontal axis (60.5 to 63.0); trend line R² = 0.248]
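The two regression coefficients are linked: their product equals the square of the correlation coefficient. The short Python check below uses only the coefficients reported in this assignment (-0.003948112 for Y on X and -62.87 for X on Y); the variable names are ours.

```python
from math import sqrt

# Slopes as reported in the text (Excel output); nothing is re-estimated here.
slope_y_on_x = -0.003948112   # regression coefficient of Y on X
slope_x_on_y = -62.87         # regression coefficient of X on Y

# The product of the two regression coefficients is r squared.
r_squared = slope_y_on_x * slope_x_on_y

# r carries the sign shared by both slopes.
r = -sqrt(r_squared) if slope_y_on_x < 0 else sqrt(r_squared)

print(f"product of slopes (r squared): {r_squared:.4f}")
print(f"implied correlation r:         {r:.4f}")
```

The product comes out at about 0.2482, matching the R² shown on both charts, and the implied correlation is about -0.498.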
Q.10 Suppose the price of gold suddenly rises to 5000 Rs/gram. What would be the impact on the forex rate,
i.e. the dependent variable?
The value of the dependent variable can be found by putting the value of X in the linear regression
equation we got in Question no. 7.
The original equation is: Y = -0.0039X + 73.635
Substituting X = 5000 in the above, we get: Y = -0.0039 × 5000 + 73.635 = 54.135
It means that if the gold price rises to 5000 Rs/gram, the forex rate would go down to 54.135.
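The substitution above can be written as a small Python function. The slope and intercept are the ones from the regression equation of Y on X; the function name predict_forex_rate is hypothetical.

```python
# Coefficients of the regression equation of Y on X, as reported in the text.
SLOPE = -0.0039
INTERCEPT = 73.635


def predict_forex_rate(gold_price_per_gram: float) -> float:
    """Estimate Y (forex rate) from X (gold price in Rs/gram) using Y = aX + b."""
    return SLOPE * gold_price_per_gram + INTERCEPT


predicted = predict_forex_rate(5000)
print(f"predicted forex rate at 5000 Rs/gram: {predicted:.3f}")
```

The predicted value of 54.135 lies below the 60.5 to 63.0 range of Y shown on the chart axis, so it is an extrapolation of the fitted line beyond the observed data.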
Q.11 Correlation and regression are the two most widely used statistical methods for determining the
strength of the relationship between two variables and how they vary with one another.
Correlation is very useful in hypothesis testing where it is necessary to establish cause-effect relationships.
Correlation and regression play a crucial role in medical as well as scientific experiments. For example, if
one wants to observe the effect of a drug on the blood pressure of a patient, correlation and regression
analysis can be used to determine the level of blood pressure in relation to the amount of drug given.
Recently the method of regression and correlation analysis was used to determine the lentic water
quality of the important religious Brahmasarovar water tank at Kurukshetra, Haryana. A positive
correlation was found between the dissolved oxygen and the population of plankton and phytoplankton
(the microscopic organisms that take up carbon dioxide and increase the amount of oxygen, which helps
sustain the aquatic web of life).
Correlation and regression analysis have also been used extensively by business analysts to
project the future stock prices and financial condition of a company.
Correlation and regression are also used by researchers at NASA and other space agencies to predict
when the next solar maximum will occur (the peak of the roughly 11-year solar cycle, when the number of
sunspots is greatest and the irradiance of the sun increases, affecting earth's climate and temperature).
In this assignment we tried to test the theory that the gold price really affects forex rates. In some
cases we found a direct correlation between the price of gold and forex rates, and in
other questions we found the inverse. We can say that this variation is due to the many
other factors that play a role in determining the exchange rate, such as the state of the international
economy, economic decisions taken by the government etc., i.e. the forex rate, along with being dependent
on gold, is also dependent on other variables, hence the variation.