MECN4006 – Research Project
Correlation of Factors Influencing a Share Price
Name: Ashail Maharaj
Student Number: 536684
Supervisor: Dr Ian Campbell
25 August 2016
A project report submitted to the Faculty of Engineering and the Built Environment, University
of the Witwatersrand, Johannesburg, in partial fulfilment of the requirements for the degree of
Bachelor of Science in Engineering.
Johannesburg, August 2016
DECLARATION
UNIVERSITY OF THE WITWATERSRAND, JOHANNESBURG
SCHOOL OF MECHANICAL, INDUSTRIAL AND
AERONAUTICAL ENGINEERING
I, Ashail Maharaj (Student Number: 536684), am registered for the course No. MECN 4006 in the
year 2016.
I herewith submit the following task, "Research Project: Correlation of Factors Influencing a Share
Price", in partial fulfilment of the requirements of the above course.
I hereby declare the following:
• I am aware that plagiarism (the use of someone else's work without their permission and/or
without acknowledging the original source) is wrong;
• I confirm that the work submitted herewith for assessment in the above course is my own
unaided work except where I have explicitly indicated otherwise;
• This task has not been submitted before, either individually or jointly, for any course
requirement, examination or degree at this or any other tertiary educational institution;
• I have followed the required conventions in referencing the thoughts and ideas of others;
• I understand that the University of the Witwatersrand may take disciplinary action against me
if it can be shown that this task is not my own unaided work or that I have failed to acknowledge
the sources of the ideas or words in my writing in this task.
Signature: ________________________________ Date: 25/08/2016
ABSTRACT
When investors predict share prices, it is important to identify the factors that influence the way they
value the share. Investors generally value a share using the factors that they regard as strong indicators
of the company's worth, and their choice of factors is shaped by the knowledge they have gained. A
company's financial performance depends on a vast number of factors, many of which may be related
to one another, in which case including all of them would be redundant. This project examined the
correlation of factors with each other and with the share price, with the aim of finding the factors most
relevant to the share price. The objectives of this research were to determine which factors influence
the price of the Richemont share, what the relationships between the relevant factors and the Richemont
share price are, and which factors can be used to predict the share price. A group of 8 factors was
chosen: dividend yield, volume traded, rand-euro exchange rate, rand-dollar exchange rate, inflation
(CPI), the prime lending interest rate, the gold price and the platinum price. Each factor was recorded
at daily, weekly, monthly, quarterly, half-yearly and yearly intervals.
During this study a number of steps were carried out to ensure that the data used were valid for the
analysis. The correlation and cross correlation of each factor with the share price were determined. The
underlying assumptions for correlation were then tested. A linear regression was performed between
each factor and the share price and a residual analysis was done. Each series was then transformed to
obtain a better linear fit between the factors and the share price. Each transformed factor was linearly
regressed against the transformed share price and the residual analysis was repeated. Correlation and
cross correlation were evaluated between all factors to find redundant factors, which were then
eliminated. The volume traded was the only relevant factor remaining after this test. The relationship
found in the regression was then tested against each dataset's volume traded series to see whether the
regression model still held.
The study concluded that the only relevant factor of the initial 8 was the volume traded, that the
relationship between the Richemont share price and the volume traded is given by the equation
$\sqrt[3]{y} = (-2 \cdot 10^{-27}) \cdot x^{3} + 21.9$, and that the volume traded is correlated with
the share price with a correlation coefficient of -0.894.
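The correlation, cross correlation and regression steps summarised above can be illustrated with a minimal sketch. The report's own code (Appendices A to F) is written in Matlab and is not reproduced here; the sketch below uses Python with NumPy instead. The function names `pearson`, `lagged_correlation` and `linear_fit`, and the synthetic `volume` and `price` series, are assumptions made for illustration only and are not the Richemont data or the author's implementation.

```python
import numpy as np


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc**2) * np.sum(yc**2)))


def lagged_correlation(factor, price, max_lag):
    """Correlation of price[t] with factor[t - k] for k = 0..max_lag."""
    factor = np.asarray(factor, dtype=float)
    price = np.asarray(price, dtype=float)
    result = {}
    for k in range(max_lag + 1):
        n = len(price) - k
        # factor[i] is paired with price[i + k], i.e. the factor leads by k steps
        result[k] = pearson(factor[:n], price[k:])
    return result


def linear_fit(x, y):
    """Least-squares line y ~ slope * x + intercept, plus its residuals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    return float(slope), float(intercept), residuals


# Synthetic illustration only (hypothetical values, not the Richemont data).
rng = np.random.default_rng(0)
volume = rng.uniform(1.0, 10.0, size=200)
price = 50.0 - 3.0 * volume + rng.normal(0.0, 1.0, size=200)

r = pearson(volume, price)
lags = lagged_correlation(volume, price, max_lag=3)
slope, intercept, residuals = linear_fit(volume, price)
```

In a residual analysis, the `residuals` array would be plotted against the actual values to check for structure that a straight line does not capture, which is the motivation for transforming a series before regressing again.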
TABLE OF CONTENTS
DECLARATION
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
1 INTRODUCTION
1.1 Background
1.2 Motivation
2 LITERATURE SURVEY
2.1 Fundamental Analysis
2.2 Technical Analysis
2.3 Time Series Analysis
Seasonality
Trends
Correlation
Pearson's Product Moment Correlation Coefficient
Cross correlation and Auto Correlation
2.4 Feature Selection
2.5 Exploratory Data Analysis (EDA)
2.6 Confirmatory Data Analysis (CDA)
2.7 Dividend Yield
2.8 Interest Rate
2.9 Principal Component Analysis (PCA)
2.10 Causality
2.11 Linear regression
2.12 Residual Analysis
2.13 Related Studies
3 OBJECTIVES
4 APPARATUS
5 METHODOLOGY
5.1 Collection of Data
5.2 Processing of Data
5.3 Precautions
6 OBSERVATIONS
7 DATA PROCESSING
7.1 Correlation Values
7.2 Cross Correlation and auto Correlations
7.3 Assumption tests
7.4 Coefficient of determination
7.5 Confidence intervals of regression plots
7.6 Residual analysis for transformed data
7.7 Correlation of factors
7.8 Cross Correlation of all factors
7.9 Residuals for fitting regression model to each data set's volume traded data
8 RESULTS
8.1 Correlation Values
8.2 Cross Correlations and Auto Correlations
8.3 Assumption Tests
8.4 Coefficient of Determination
8.5 Regression of Price Vs Factors
8.6 Residual vs Actual
8.7 Regression of Price Vs Factors After Transformations
8.8 Confidence Intervals on regression plots
8.9 Residual Analysis for transformed data
8.10 Correlation of Factors
8.11 Cross Correlation of all Factors
8.12 Residuals for fitting regression model to each period's volume traded data
9 DISCUSSION
9.1 Correlation Coefficients
Average correlation for each factor
Absolute average correlation for each data capture frequency
9.2 Cross Correlation
9.3 Assumption Tests
Daily
Weekly
Monthly
Quarterly
Half-Yearly
Yearly
9.4 Linear Regression
Dividends Yield Vs Share Price
Volume Traded Vs Share Price
Rand-Euro Exchange rate vs Share Price
Rand-Dollar Exchange rate Vs Share Price
Inflation Vs Share Price
Interest Rate Vs Share Price
Gold Price Vs Share Price
Platinum Price Vs Share Price
9.5 Residuals vs Actual
Share Price
Dividend yield
Volume
Rand-Euro
Rand-Dollar
Inflation
Interest
Gold
Platinum
9.6 Transformation and Linear Regression
Dividends Yield
Volume
Rand-Euro
Rand-Dollar
Inflation
Interest Rate and Gold
Platinum
9.7 Confidence intervals on regression plots
9.8 Residual analysis On transformed data
Dividend Yield
Inflation
Platinum Price
Rand-Dollar Exchange Rate
Rand-Euro Exchange Rate
9.9 Correlation of factors
Dividends yield
Volume traded
Rand-Euro exchange rate
Rand-Dollar exchange rate
Inflation
Interest Rate
Gold Price
Errors from Fitting the Linear Model to Each Volume Traded and Share Price Data Set
10 CONCLUSIONS
11 RECOMMENDATIONS
12 REFERENCES
APPENDIX A: Matlab Code for Cross Correlation and Auto Correlation
APPENDIX B: Matlab Code for Testing All Assumptions
APPENDIX C: Matlab Code for A Regression Plot with Confidence Limits
APPENDIX D: Code for changing the data that is plotted
APPENDIX E: Matlab Code for The Cross Correlation with Each Data Set
APPENDIX F: Matlab Code for Extracting Error Graphs
APPENDIX G: Residual Plots
LIST OF FIGURES
Figure 1: Moods of the masses according to technical analysis [8]
Figure 2: Shifting of Time Series
Figure 3: Platinum Price Over Time
Figure 4: Share Price Over Time
Figure 5: Gold Price Over Time
Figure 6: Volume Traded Over Time
Figure 7: Rand-Euro Over Time
Figure 8: Dividend Yield Over Time
Figure 9: Rand-Dollar Over Time
Figure 10: Interest Rate Over Time
Figure 11: Inflation Over Time
Figure 12: Cross Correlation of Factors Within Daily Data Set
Figure 13: Cross Correlation of Factors Within Weekly Data Set
Figure 14: Cross Correlation of Factors Within Monthly Data Set
Figure 15: Cross Correlation of Factors Within Quarterly Data Set
Figure 16: Cross Correlation of Factors Within Half-Yearly Data Set
Figure 17: Cross Correlation of Factors Within Yearly Data Set
Figure 27: Linear Regression of DY Vs Share price
Figure 28: Linear Regression of Volume Traded Vs Share price
Figure 29: Linear Regression of Rand-Euro Exchange Rate Vs Share price
Figure 30: Linear Regression of Rand-Dollar Exchange Rate Vs Share price
Figure 31: Linear Regression of Inflation Vs Share price
Figure 32: Linear Regression of Interest Rate Vs Share price
Figure 33: Linear Regression of Gold Price Vs Share price
Figure 34: Linear Regression of Platinum Price Vs Share price
Figure 18: Residual Plot for Share Price
Figure 19: Residual Plot for Dividend Yield
Figure 20: Residual Plot for Volume Traded
Figure 21: Residual Plot for Rand-Euro Exchange Rate
Figure 22: Residual Plot for Rand-Dollar Exchange Rate
Figure 23: Residual Plot for Inflation
Figure 24: Residual Plot for Interest Rate
Figure 25: Residual Plot for Gold Price
Figure 26: Residual Plot for Platinum Price
Figure 35: Linear Regression of Dividend Yield Vs Share Price After Transformation
Figure 36: Linear Regression of Volume Traded Vs Share Price After Transformation
Figure 37: Linear Regression of Rand-Euro Exchange Rate Vs Share Price After Transformation
Figure 38: Linear Regression of Rand-Dollar Exchange Rate Vs Share Price After Transformation
Figure 39: Linear Regression of Inflation Vs Share Price After Transformation
Figure 40: Linear Regression of Platinum Price Vs Share Price After Transformation
Figure 41: Regression Plot of transformed Dividend Yield vs transformed Share Price with Confidence Intervals
Figure 42: Regression Plot of transformed Rand-Euro vs transformed Share Price with Confidence Intervals
Figure 43: Regression Plot of transformed Rand-Dollar vs transformed Share Price with Confidence Intervals
Figure 44: Regression Plot of transformed Inflation vs transformed Share Price with Confidence Intervals
Figure 45: Regression Plot of transformed Platinum Price vs transformed Share Price with Confidence Intervals
Figure 46: Residual Plot of transformed regression between Share Price and DY
Figure 47: Residual Plot of transformed regression between Share Price and Inflation
Figure 48: Residual Plot of transformed regression between Share Price and Platinum Price
Figure 49: Residual Plot of transformed regression between Share Price and Rand-Dollar
Figure 50: Residual Plot of transformed regression between Share Price and Rand-Euro
Figure 51: Error for Fitting Equation to Daily Data
Figure 52: Error for Fitting Equation to Weekly Data
Figure 53: Error for Fitting Equation to Monthly Data
Figure 54: Error for Fitting Equation to Quarterly Data
Figure 56: Error for Fitting Equation to Yearly Data
Figure 57: Error for Fitting Equation to Half-yearly Data
LIST OF TABLES
Table 1: Correlation Values of Each Factor with Share Price
Table 2: Assumption Test Results for Each Series in The Daily Data Set
Table 3: Assumption Test Results for Each Series in The Weekly Data Set
Table 4: Assumption Test Results for Each Series in The Monthly Data Set
Table 5: Assumption Test Results for Each Series in The Quarterly Data Set
Table 6: Assumption Test Results for Each Series in The Half-Yearly Data Set
Table 7: Coefficient of Determination for Each Series
Table 8: Correlation matrix of Untransformed Factors
Table 9: Cross correlation of the Platinum Price with each factor
Table 10: Cross Correlation of the Gold Price with Each Factor
Table 11: Cross Correlation of the Interest Rate with Each Factor
Table 12: Cross Correlation of Inflation with Each Factor
Table 13: Cross Correlation of the Rand-Dollar Exchange Rate with Each Factor
Table 14: Cross Correlation of the Rand-Euro Exchange Rate with Each Factor
Table 15: Cross Correlation of the Volume Traded with Each Factor
Table 16: Cross Correlation of the Dividend Yield with Each Factor
Table 17: Correlation of predicted and actual share price
Table 18: Transformations used on each dataset
Table 19: R² values
1 INTRODUCTION
1.1 Background
Many factors can influence a share price. These factors can be economic or behavioural. Economic
factors are those that influence the spending of individuals, such as the value of the currency, the GDP
(Gross Domestic Product) and commodity prices. Behavioural factors are those that influence the
decision making of individuals, such as what an individual was taught about trading (knowledge), the
risk that the individual is willing to take and the driving factors for the investment (which could be seen
as emotions). Many factors must be considered for the price of any single share. The factors are also
related to one another, which leads to many different combinations of factors that can be used to
analyse the value of a share.
The price of a share is determined by the demand for the share and by what each individual buying or
selling the share perceives its value to be. The problem of predicting the share price therefore reduces
to identifying which factors are most relevant to the market's perception of the value of the share. Many
factors are available but not all will be relevant. Some factors may even be redundant, as multiple
factors may have causal relationships and may also be correlated with each other.
Finding the few factors that are most relevant to the share price can improve the prediction accuracy
and simplify the prediction process by reducing the computational complexity and the time taken to
complete the model. This project aims to find the factors that are most relevant to predicting the
Richemont share price. These factors could later be used in a prediction model. Several methods are
currently available for obtaining the required results. The methods used in the literature are factor
analysis, Principal Component Analysis, causality and correlations between factors.
1.2 Motivation
The purpose of this study is to investigate which factors are most relevant to the Richemont share price. This study forms a preliminary investigation for the design of a simplified model of the Johannesburg Stock Exchange (JSE). In setting up this model, a number of factors as well as one specific share were chosen. The model will use agent-based simulation to simulate the decisions of entities within the market according to factors and assumptions such as:
• Fundamental pricing information, including dividends.
• Distribution of knowledge.
• Economic conditions such as exchange rates, overseas market closing levels, inflation and interest rates, as well as commodity prices that may be correlated to the share.
• Market perceptions of the company, including company ethics.
• Agents will be rational and will have a risk profile and time view (e.g. buy and hold, day trading, etc.).
• Competition and alternatives in the market.
• Agents will be segmented by financial level (and size of trades).
A search of the literature showed that, for many models, selecting the correct factors led to higher accuracy than using as many factors as possible [1]–[3]. An investigation of prior models showed that selecting factors based on quality was a more effective criterion than the quantity of factors selected. Reducing the number of factors also made the model less computationally complex and reduced the time taken for the simulations to complete. From this, the question “What factors are most influential to the Richemont share price?” was formulated, and it leads into this research.
2 LITERATURE SURVEY
2.1 Fundamental Analysis
Fundamental analysis is a method of evaluating the value of a share by looking at the company’s performance and its reaction to economic conditions [4]. Fundamental analysis seeks to understand whether the company is growing, whether it is profitable, whether it will continue to improve or become the best in its market segment, whether it is able to pay its debts and whether it is in good ethical standing. These questions, and many more, are answered in order to answer the question “Does the company make a good investment?”. Fundamental analysis is usually used for the evaluation of stocks but can be applied to securities, countries, markets and market segments [5].
Many factors can be analysed when performing fundamental analysis on a share; any factor that could influence the company’s performance could be included. Fundamental analysis is not limited to purely qualitative or purely quantitative data. Any news about a company can usually help in measuring the value, or the potential value, of the company. This is where the methodology of event studies becomes useful. The quantitative analysis usually involves examining the company’s finances and books in order to arrive at a perceived value of the share. This perceived value is called the intrinsic value; factors such as revenue, debt, dividends and company performance ratios are used as measures of this intrinsic value [6].
The objectives of fundamental analysis are [4]:
• To predict the direction of the economies that impact a company. This is done because the financial performance of the company is dependent upon the economy it resides within.
• To estimate the intrinsic value of the stock and try to predict when changes in this value will occur.
• To select the right time to buy and sell stocks to maximise investment returns.
2.2 Technical Analysis
Technical analysis is the evaluation of securities by analysing statistics that are generated by market activity. These statistics are generally past prices or volumes traded [7]. The aim of technical analysis is to identify patterns that can suggest future activity instead of trying to forecast intrinsic value. Technical analysis typically depends on the use of chart patterns, technical indicators, oscillators or some combination of these [8]. Technical analysts believe that the charts show the moods of the crowds and thus they focus on the analysis of mass human psychology. Emotional risk is inversely correlated to financial risk; Figure 1 below displays the moods associated with the different price trends.
Figure 1: Moods of the masses according to technical analysis[8]
People are generally motivated by greed and optimism when buying and are driven by fear or pessimism when selling. It is believed that people formulate scenarios based on their emotional state in order to rationalise their emotions. Using this rationale, investors will try to sell at the top, or as close to the top as possible, and buy at the bottom, or as close to the bottom as possible. Investors use this to help find turning points which they cannot otherwise see [8].
Apart from the above-mentioned methods, technical analysis also uses trend analysis, support and resistance analysis, and volume analysis.
• Trend Analysis
Trend analysis is one of the most important and most used techniques in technical analysis. A trend is the general direction in which the price is heading. Trends are not always easy to spot as there are many fluctuations in the price over time. In trend analysis, trends are classified according to their direction into three sets: uptrends, horizontal trends and downtrends. An uptrend is characterised by a series of higher highs and higher lows, whereas a downtrend is characterised by lower lows and lower highs [9][8].
Trends are then further classified by duration into another set of three: long-term, intermediate and short-term trends. A long-term trend is one that lasts longer than a year, an intermediate trend lasts between one and three months and a short-term trend lasts up to a month. Channel lines are two parallel trend lines which act as areas of support and resistance [8].
• Support and Resistance Analysis
Support is defined as the price level which the stock seldom falls below and resistance is the price level which the stock seldom rises above. Support and resistance are governed by the psychology behind supply and demand: the support level is the price at which the market is willing to buy and the resistance level is the price at which the market is willing to sell. When the price breaches the support or the resistance level, there has been a shift in the supply or demand curves for the share. Once the resistance or support level is breached, its role is reversed, i.e. a broken resistance level becomes the new support level and vice versa [9][10].
• Volume Analysis
Volume is the number of shares that are traded over a given period of time; greater volume indicates a more active security. Volume charts also have trends, which can show an increase or decrease in the demand for or supply of the share. Volume analysis is important to technical analysis because it is used to confirm chart trends and patterns. In most scenarios changes in volume precede changes in price, except when divergence occurs. Divergence is when the relationship between volume and price starts to deteriorate [10][11].
2.3 Time Series Analysis
A time series is a set of observations, each of which is recorded at a different point in time denoted by $t$. When $t$ is incremental and data is recorded at each increment, the series is discrete, whereas if $t$ is continuous the series is a continuous time series [12]. Time series analysis can be broken down into the following objectives [13]:
• Description
The description objective comprises plotting the data, looking for trends, seasonality, outliers, normality and stationarity, and using further tools to describe the data set better.
• Explanation
Explanation focuses on correlations and relationships between different time series or within a single time series.
• Prediction
Prediction focuses on estimating future values of the data. This is also known as forecasting, and includes fitting models to the data to improve forecasts.
• Control
This objective is usually used in quality control, where it is used to ensure that process outputs are within specification or are significantly within specification.
The classical decomposition model is used to describe a time series and to better forecast its future values. This model defines a time series in terms of components such as the trend, seasonality and noise [14]. There are two classical decomposition methods, the additive and the multiplicative, described by Equations 1 and 2 respectively.
$$Y = T + C + S + e \tag{1}$$
$$Y = T \times C \times S \times e \tag{2}$$
Where
$Y$ is the value of the series at a specified point
$T$ is the linear trend
$C$ is the cycle
$S$ is the seasonality
$e$ is the random error
Seasonality
Seasonality is described as the predictable changes that the data in a time series experiences and that recur over a one-year period [15]. Seasonality can be calculated using Equation 3 below.
$$K_t = \frac{Y_t}{M_t} \tag{3}$$
Where
$K_t$ is a series of seasonality and randomness
$M_t$ is the moving average of the time series
$Y_t$ is the value of the series at time $t$
In order to obtain the seasonality series, Equation 4 is used. It is important to note that the subscript $g$ is the number of increments in a season, and that the sum runs over the time that each season lasts. This is done to average out the randomness that occurs within each season [16].
$$S_g = \sum K_t \tag{4}$$
Where
$S_g$ is the seasonality of the series
Trends
The trend of a time series is found by a least squares fit of the model in Equation 5 below [16].
$$M_t = a + bt + e_t \tag{5}$$
Where
$M_t$ is the moving average value at time $t$
$a$ is the intercept
$b$ is the slope
$e_t$ is the residual
For the trend, just the linear part of Equation 5 is used and the residual term is discarded.
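The steps in Equations 3 and 5 can be sketched in code. This is a minimal illustration rather than the procedure used in this study: the function names, the odd-length centred window and the toy series are all assumptions made for the example.

```python
def moving_average(y, window):
    """Centred moving average M_t; None where the window does not fit.

    Assumes an odd window so that the average is centred on t.
    """
    half = window // 2
    m = [None] * len(y)
    for t in range(half, len(y) - half):
        m[t] = sum(y[t - half:t + half + 1]) / window
    return m


def seasonal_ratios(y, m):
    """K_t = Y_t / M_t (Equation 3) wherever M_t is defined."""
    return [yt / mt if mt is not None else None for yt, mt in zip(y, m)]


def linear_trend(m):
    """Least squares fit of M_t = a + b*t (Equation 5); returns (a, b)."""
    pts = [(t, v) for t, v in enumerate(m) if v is not None]
    n = len(pts)
    sum_t = sum(t for t, _ in pts)
    sum_m = sum(v for _, v in pts)
    sum_tm = sum(t * v for t, v in pts)
    sum_tt = sum(t * t for t, _ in pts)
    b = (n * sum_tm - sum_t * sum_m) / (n * sum_tt - sum_t ** 2)
    a = (sum_m - b * sum_t) / n
    return a, b


# Toy series (hypothetical): upward trend plus a repeating three-step pattern.
y = [10 + 2 * t + [3, 0, -3][t % 3] for t in range(12)]
m = moving_average(y, window=3)   # the window spans one full pattern
k = seasonal_ratios(y, m)         # seasonality and randomness, K_t
a, b = linear_trend(m)            # linear part of Equation 5
```

Because the window covers exactly one repetition of the pattern, the moving average removes the pattern and the fitted line recovers the underlying trend of the toy series.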
Correlation
Correlation, in terms of time series, is a measure of how closely two time series fluctuate together, that is, a measure of the linear relationship between the two. It is used to tell how well one time series is able to predict fluctuations in another [17], [18]. Correlation does not imply a causal relationship but merely that there exists a relationship between the variables that can be exploited to forecast one from the other [19]. Correlated variables could share some common variable that causes the fluctuations in both, and this may be what creates the relationship. A time series may also be correlated with lagged versions of itself, which is called serial correlation or auto correlation [20]. Correlation also allows for analysis of which time series leads which, by offsetting one series and comparing the two; this is called cross-correlation [21], [22]. When these lags are used, a model is generally fitted first and then an information criterion such as the AIC may be used to find the best lag order [23], [24]. Alternatively, the maximum lags can be used.
Pearson’s Product Moment Correlation Coefficient
Pearson’s product moment correlation (PPMC), also referred to as Pearson’s correlation coefficient, is a measure of how well two variables are related linearly. It enables the user to know whether fitting a straight line to the data accurately represents the relationship between the variables in question. Equation 6 below is used to calculate the coefficient of correlation. A strong correlation is represented by an $r$ value within the intervals [0.7; 1] or [-1; -0.7], where an absolute value of 1 represents a perfect linear relationship. A moderate correlation falls within the intervals [0.3; 0.7] or [-0.7; -0.3]. A low correlation is represented by a value within the ranges [0.1; 0.3] or [-0.3; -0.1]. There is no correlation when the $r$ value is 0 [25][17].
$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left(n\sum x^2 - \left(\sum x\right)^2\right)\left(n\sum y^2 - \left(\sum y\right)^2\right)}} \tag{6}$$
$$r = \frac{\mathrm{cov}(x, y)}{S_x S_y} \tag{7}$$
Where
$n$ is the number of observations in the sample
$x$ is the independent variable
$y$ is the dependent variable
$\mathrm{cov}(x, y)$ is the covariance between the two variables
$S_x$ is the sample standard deviation of the independent variable
$S_y$ is the sample standard deviation of the dependent variable
Equation 7 is obtained from Equation 6 by substituting the covariance for the numerator and the variances of the two variables for the bracketed terms in the denominator [26]. PPMC rests on a number of assumptions; if these assumptions are not met then the coefficient may not mean what it is thought to, or the results may not be valid [27][28]. These assumptions are:
• Normality
Normality is a measure of how the data is distributed and of whether the normal distribution can be fitted to the data to a significant degree. Testing for normality can be done graphically or numerically. The numerical methods that can be used are the Kolmogorov-Smirnov test [29] (see Equation 8 for the test statistic) and the Shapiro-Wilk test (see Equations 9 and 10 for the test statistic) [30]. The graphical methods that can be used are Q-Q plots, histograms and box-and-whisker diagrams [30]. The numerical tests can be performed at a specified significance level to see whether the data is normally distributed.
$$T = \max\left|F^*(x) - S(x)\right| \tag{8}$$
Where
$T$ is the test statistic used for the Kolmogorov-Smirnov test
$F^*(x)$ is the hypothesised cumulative distribution function (the normal distribution for normality tests)
$S(x)$ is the empirical cumulative distribution function of the data being tested
$$W = \left(\frac{b}{s\sqrt{n-1}}\right)^2 \tag{9}$$
$$b = \sum b_i = \sum a_{n-i+1}\left(X_{n-i+1} - X_i\right) \tag{10}$$
Where
$W$ is the test statistic for the Shapiro-Wilk test
$b$ is defined by Equation 10
$a_{n-i+1}$ is a Shapiro-Wilk coefficient, indexed by $n - i + 1$
$X$ is the ordered data from the series being tested
$s$ is the sample standard deviation and $n$ is the sample size
The Kolmogorov-Smirnov test uses a p-value derived from the test statistic to decide whether to reject the null hypothesis that the data is normal. The Shapiro-Wilk test compares the test statistic with critical values from a table of Shapiro-Wilk values to decide whether to reject the null hypothesis.
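The Kolmogorov-Smirnov statistic of Equation 8 can be illustrated with a short sketch. It is not the routine used in this study: the function names are hypothetical, the mean and standard deviation of the reference normal distribution are estimated from the sample (an assumption), and no p-value is computed. The Shapiro-Wilk statistic is not sketched because its coefficients come from tables.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def ks_statistic(data):
    """Largest gap between the normal CDF and the empirical CDF (Equation 8).

    Assumption: the normal distribution is fitted with the sample mean and
    the sample standard deviation of the data itself.
    """
    xs = sorted(data)
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in xs) / (n - 1))
    t = 0.0
    for i, v in enumerate(xs):
        f = normal_cdf(v, mean, sd)
        # The empirical CDF steps from i/n to (i+1)/n at v, so both sides
        # of the step are compared with the fitted normal CDF.
        t = max(t, abs(f - i / n), abs(f - (i + 1) / n))
    return t


sample = [-1.0, 0.0, 1.0]          # hypothetical data
statistic = ks_statistic(sample)
```

A small statistic means that the empirical distribution stays close to the fitted normal curve; the decision to reject normality still requires the p-value or a critical value, as described above.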
• Linearity
Linearity of data is the ability of a line to represent the relationship between the dependent variable and the independent variable. This is usually determined through linear regression. In linear regression the aim is to fit a line through the data while minimising the error. The goodness of fit can be determined by finding the coefficient of determination $R^2$ of the line. Equation 11 below can be used to find the coefficient of determination [31].
$$R^2 = \left(\frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left(n\sum x^2 - \left(\sum x\right)^2\right)\left(n\sum y^2 - \left(\sum y\right)^2\right)}}\right)^2 \tag{11}$$
Alternatively, it can be found by analysis of residuals using Equation 12.
$$R^2 = 1 - \frac{\sum \varepsilon_i^2}{nS^2} \tag{12}$$
Where
$R^2$ is the coefficient of determination
$\varepsilon_i$ is the error at each point $i$
$S$ is the standard deviation of the data set being analysed
• Stationarity
Stationarity means that the statistical properties of a series, such as its mean and variance, do not change over time. There are two types of stationarity, difference stationarity and trend stationarity [32][33]. Before continuing with these definitions it is important to first define the following processes.
Pure Random Walk
A pure random walk is defined by Equation 13.
$$Y_t = Y_{t-1} + \varepsilon_t \tag{13}$$
Where
$\varepsilon_t$ is white noise
$Y_t$ is the series value at time $t$
$Y_{t-1}$ is the series value at time $t-1$
White noise is stochastic; this means that the series will not be mean reverting, as its variance evolves over time. The variance of the series tends to infinity as time tends to infinity. This is a difference stationary process [34].
Random walk with drift
This series is defined by Equation 14.
$$Y_t = \alpha + Y_{t-1} + u_t \tag{14}$$
Where
$\alpha$ is the drift term in the series
This series also has a variance that is dependent on time and hence is not mean reverting. This is a difference stationary process. [34]
Deterministic trend
This is defined by Equation 15.
$$Y_t = \alpha + \beta t + u_t \tag{15}$$
Where
$\beta t$ is the deterministic trend
Although this series looks similar to a random walk with drift, it is different in that the series is regressed on the time trend $\beta t$ rather than on its own previous value. A nonstationary process with a deterministic trend has a mean that grows around a fixed trend, which is constant and independent of time. This is a trend stationary process. [34]
Random walk with drift and deterministic trend
This series is described by Equation 16.
$$Y_t = \alpha + Y_{t-1} + \beta t + u_t \tag{16}$$
This series has both a drift component and a deterministic trend. It is both difference and trend stationary. [34]
Difference stationarity
A series with a random walk can be transformed into a stationary process using differencing,
irrespective of whether it has drift or not. [34]
Trend Stationarity
A nonstationary process with a deterministic trend can be transformed into a stationary process by
detrending. [34]
Difference and Trend Stationary
In cases where a series is a random walk with drift and a deterministic trend, stationarity can be achieved through detrending, but differencing also needs to be applied in order to ensure that the variance does not grow to infinity over time. [34]
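The two transformations described above, differencing and detrending, can be sketched as follows. This is an illustrative sketch only: the function names, the seeded random number generator and the simulated series are assumptions and are not taken from the study.

```python
import random


def random_walk(n, drift=0.0, seed=0):
    """Simulate Y_t = drift + Y_{t-1} + noise (Equations 13 and 14)."""
    rng = random.Random(seed)
    y, out = 0.0, []
    for _ in range(n):
        y = drift + y + rng.gauss(0.0, 1.0)
        out.append(y)
    return out


def difference(y):
    """First difference Y_t - Y_{t-1}; removes a random walk component."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]


def detrend(y):
    """Subtract the least squares line a + b*t; removes a deterministic trend."""
    n = len(y)
    mean_t = (n - 1) / 2
    mean_y = sum(y) / n
    b = (sum((t - mean_t) * (v - mean_y) for t, v in enumerate(y))
         / sum((t - mean_t) ** 2 for t in range(n)))
    a = mean_y - b * mean_t
    return [v - (a + b * t) for t, v in enumerate(y)]


walk = random_walk(200, drift=0.5, seed=1)   # difference stationary series
increments = difference(walk)                # drift plus white noise
trend_series = [3 + 2 * t for t in range(6)] # purely deterministic trend
residuals = detrend(trend_series)            # all (numerically) zero
```

Differencing the simulated walk leaves only the drift plus the noise terms, whereas detrending leaves only what the fitted straight line cannot explain.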
Testing for Trend stationarity and Difference Stationarity
There are two preferred methods for testing for stationarity: the Augmented Dickey-Fuller (ADF) test and the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test [34]–[39].
Augmented Dickey-Fuller Test (ADF)
In the ADF test, Equation 17 below is used to represent an AR process [35], [40], [41].
$$\Delta Y_t = \alpha + \delta Y_{t-1} + u_t \tag{17}$$
When $\delta = 0$ the process has a unit root (it is difference stationary); when $\delta < 0$ it does not. The test sets the null hypothesis that the process has a unit root.
$$H_0: \delta = 0$$
$$H_1: \delta < 0$$
A t-statistic is calculated for $\hat{\delta}$, the estimated value of $\delta$, using Equation 19. This test statistic is then compared to the critical value $DF_{Critical}$ from the Dickey-Fuller distribution. When
$$t < DF_{Critical} \tag{18}$$
the null hypothesis is rejected.
$$t = \frac{\hat{\delta}}{SE(\hat{\delta})} \tag{19}$$
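The regression in Equation 17 and the ratio of $\hat{\delta}$ to its standard error in Equation 19 can be sketched as below. This is a simplified illustration and not the test routine used in this study: no lagged differences are included (so it is the plain Dickey-Fuller case), the function name and the simulated series are hypothetical, and the critical value of -2.86 is an assumed, commonly tabulated 5% value for a regression with a constant.

```python
import math
import random


def dickey_fuller_t(y):
    """t-statistic of delta in  dY_t = alpha + delta * Y_{t-1} + u_t.

    Plain Dickey-Fuller regression: no lagged differences are included,
    so this is the un-augmented special case of the ADF test.
    """
    x = y[:-1]                                        # Y_{t-1}
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]  # delta Y_t
    n = len(x)
    mean_x, mean_dy = sum(x) / n, sum(dy) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    delta = sum((v - mean_x) * (d - mean_dy) for v, d in zip(x, dy)) / sxx
    alpha = mean_dy - delta * mean_x
    sse = sum((d - alpha - delta * v) ** 2 for v, d in zip(x, dy))
    se_delta = math.sqrt(sse / (n - 2) / sxx)         # standard error of delta-hat
    return delta / se_delta


# Hypothetical stationary AR(1) series: Y_t = 0.5 * Y_{t-1} + noise.
rng = random.Random(42)
series = [0.0]
for _ in range(500):
    series.append(0.5 * series[-1] + rng.gauss(0.0, 1.0))

ASSUMED_CRITICAL_5PCT = -2.86   # assumed tabulated value, constant-only case
t_stat = dickey_fuller_t(series)
reject_unit_root = t_stat < ASSUMED_CRITICAL_5PCT
```

For a strongly mean-reverting series such as this one, the statistic is far below the critical value, so the unit root hypothesis is rejected.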
Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test
This test evaluates the null hypothesis that a univariate series is trend stationary against the alternative that it is a nonstationary unit root process. It does this by first defining the series with Equations 20 and 21 below [42][43].
$$Y_t = c_t + \beta t + u_{1t} \tag{20}$$
$$c_t = c_{t-1} + u'_{2t} \tag{21}$$
Where
$c_t$ is a random walk process
$u_{1t}$ is a stationary process
$u'_{2t}$ is an independent and identically distributed (iid) process with a mean of 0 and variance $\sigma^2$
$$H_0: \sigma^2 = 0$$
$$H_1: \sigma^2 > 0$$
With the test statistic
$$p_{test} = \frac{\sum_{t=1}^{T} S_t^2}{S^2 T^2} \tag{22}$$
Where
$p_{test}$ is the test statistic
$T$ is the sample size
$S^2$ is the Newey-West estimate of the long-run variance
And
$$S_t = e_1 + e_2 + e_3 + \dots + e_t \tag{23}$$
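A rough sketch of the statistic in Equations 22 and 23 is given below. It makes two labelled assumptions: the terms $e_t$ are taken to be the residuals from a least squares regression of the series on a linear time trend, and the long-run variance is replaced by the plain residual variance (a Newey-West estimate with zero lags). The function name and the example series are hypothetical.

```python
def kpss_statistic(y):
    """Sum of squared partial sums of detrended residuals / (T^2 * s2).

    s2 is the plain residual variance (a lag-0 long-run variance), a
    simplifying assumption in place of the full Newey-West estimate.
    """
    T = len(y)
    mean_t = (T - 1) / 2
    mean_y = sum(y) / T
    b = (sum((t - mean_t) * (v - mean_y) for t, v in enumerate(y))
         / sum((t - mean_t) ** 2 for t in range(T)))
    a = mean_y - b * mean_t
    e = [v - (a + b * t) for t, v in enumerate(y)]   # detrended residuals
    s2 = sum(r * r for r in e) / T
    partial, total = 0.0, 0.0
    for r in e:
        partial += r             # S_t = e_1 + ... + e_t (Equation 23)
        total += partial ** 2
    return total / (T ** 2 * s2)


# Hypothetical trend stationary series: a line plus a bounded oscillation.
trend_stationary = [t + (-1) ** t for t in range(50)]
stat = kpss_statistic(trend_stationary)
```

The partial sums of a trend stationary series stay small, giving a small statistic; for a series containing a random walk the partial sums wander and the statistic grows with the sample size.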
• Homoscedasticity
Homoscedasticity describes the variance of a series; it means that the variance does not change with time [44]. Homoscedasticity can be tested graphically by looking at plots of residuals against actuals. It can also be tested using the Engle test for residual heteroscedasticity.
Engle test for residual heteroscedasticity
Residuals of the series are defined as in Equation 24.
$$e_t = y_t - \hat{u}_t \tag{24}$$
Where
$\hat{u}_t$ is the conditional mean of the process
$e_t$ is the residual, which is identically distributed with a mean of 0 and variance of 1
$$H_0: \alpha_0 = \alpha_1 = \alpha_2 = \dots = \alpha_m = 0$$
$$H_1: e_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \alpha_2 e_{t-2}^2 + \dots + \alpha_m e_{t-m}^2 + u_t$$
Where
$u_t$ is a white noise error process
The null hypothesis is that the error at time $t$ is not dependent on the errors from previous lags, which means that the series is not heteroscedastic. The test statistic is found by using the F statistic for the regression on the squared residuals, and the critical value is found from the $\chi^2$ distribution using $m$ degrees of freedom and the required significance level [45][46].
Cross correlation and Auto Correlation
Cross correlation is the correlation of a time series with another time series at different lags. It is achieved by shifting one of the series forward or backward by several lags. This allows a lead-lag relationship between two variables to be observed. For example, if the cross correlation between two series is highest at an offset of -3, the two series are most strongly related when one is shifted by three periods relative to the other. This could mean that the one variable signals when the other is going to be affected. Auto correlation is the same as cross correlation except that, instead of using two different series, it uses one series and measures whether the series is correlated with itself. This could be exploited as this too could signal change occurring. Cross correlation and auto correlation are both calculated using Pearson’s correlation coefficient, with one adjustment for the shifting of the series, as seen in the figure below.
Figure 2: Shifting of Time Series
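The shifting shown in Figure 2 can be sketched in code by trimming the two series so that they overlap at the chosen lag and then applying Pearson's coefficient from Equation 6. The function names, the sign convention for the lag (a positive lag pairs $x_t$ with $y_{t+lag}$) and the example series are assumptions made for this illustration.

```python
import math


def pearson(x, y):
    """Pearson's r computed from the raw sums of Equation 6."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator


def cross_correlation(x, y, lag):
    """Correlate x_t with y_{t+lag}; the lag may be negative."""
    if lag >= 0:
        a, b = x[:len(x) - lag], y[lag:]
    else:
        a, b = x[-lag:], y[:len(y) + lag]
    return pearson(a, b)


# Hypothetical example: y repeats x two periods later, so x leads y by 2.
x = [(t * t) % 11 for t in range(32)]
y = [0, 0] + x[:-2]
r_at_lag_2 = cross_correlation(x, y, 2)
autocorrelation_lag_1 = cross_correlation(x, x, 1)   # same series, shifted
```

Scanning a range of lags and recording the lag with the largest absolute coefficient gives the lead-lag relationship described above; auto correlation is obtained by passing the same series twice.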
2.4 Feature Selection
When creating a model that approximates functional relationships between inputs and outputs, generally when using machine learning and artificial intelligence systems, a problem arises where many of the inputs may be irrelevant, and these lead to overfitting of the model to the output data. Feature selection was created to deal with this. Feature selection is a methodology that eliminates the irrelevant or redundant inputs [1], [47]. There are many different methods that can be applied. Principal component analysis is used to reduce the number of variables by transforming data from a higher to a lower dimension while minimising the information lost [48], [49].
2.5 Exploratory Data Analysis (EDA)
Exploratory data analysis is usually used as a first step in any data analysis. It is an approach to data analysis that uses different techniques to increase insight into the data, find which variables are important, detect outliers and anomalies, uncover an underlying structure and test the underlying assumptions before further analysis is performed. It does this by looking at variable distributions, scatterplots, correlation analysis and other multivariate approaches. EDA aims to be a more visual analysis of variables [50]. This is done through the following steps:
• Initial extraction
• Determine the number of factors to retain
• Rotation (a transformation)
• Interpret the solution
• Calculate factor scores
• Tabulate the results
• Prepare results
2.6 Confirmatory Data Analysis (CDA)
Confirmatory data analysis uses statistical techniques to verify that there is indeed a factor structure between a set of observed variables [50]. The method allows for testing of the hypothesis that a relationship does indeed exist. This is done through the following methodology:
• Review the relevant theory and research literature to support model specification
• Specify a model
• Determine model identification
• Collect data
• Conduct preliminary descriptive statistical analysis such as scaling, missing data, collinearity measures and finding outliers
• Estimate parameters in the model
• Present and interpret results
2.7 Dividend Yield
Dividend yield is the ratio of the dividends paid to the price of the share. This ratio is seen as the return on investment, and is a measure of the cash flow that results from the purchase [51]. However, the dividend irrelevance theory says that dividends are not relevant to shareholders, as a shareholder can sell shares to achieve an income [52].
2.8 Interest Rate
An interest rate is seen as the cost of borrowing money. When investing money, one would use the interest rate as a benchmark and compare it with the return the investment could yield [53]. When investing, the interest rate could be used as an indicator of economic conditions due to its strong relationships with the value of a currency and the inflation rate. In macroeconomics, the interest rate is used to balance the demand for money in a country. When inflation increases, the demand for money begins to grow. This is when the central bank increases the interest rate. The increase in the interest rate increases the cost of borrowing money and, due to this increase in cost, the demand for money decreases. The interest rate could therefore be seen as a relevant factor in determining a share’s value, as the share price is essentially determined by the demand for and supply of the share [54].
2.9 Principal Component Analysis (PCA)
The main objective of PCA is to reduce the number of variables when working with a large number of variables. With so many variables being studied, a graphical display would not be helpful in analysing them. The procedure for finding the first principal component starts with defining a matrix X containing all the variables. The matrix is then used to find a linear combination of the variables in the form of a multifactor linear regression model without an intercept. The linear combination has a matrix of coefficients which describe the linear regression model. These coefficients are chosen to maximise the variance of the linear combination. The sum of the squared coefficients is constrained to be less than or equal to 1. The process described above is repeated for each further component, except that the variance maximised is the variance remaining after the first principal component has been removed. [55]
2.10 Causality
Causality is the term used for describing the cause-and-effect relationship between two variables. The relationship described leads to the realisation of which variable causes the fluctuation in the other, unlike correlation, which only describes the strength of the relationship between the variables [56]. Causality is usually tested using the Granger-Causality test.
2.11 Linear regression
A simple regression model is used to create a linear relationship between two variables. The relationship formed is of the form $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$. The $\beta$ values represent the regression coefficients. The error that results from fitting the model to the data is represented by $\varepsilon_i$. The least squares method is used to find the $\beta$ values that minimise the sum of the squared residuals. The formulas below are used to find the coefficients.
$$\beta_1 = \frac{SS_{xy}}{SS_x} \tag{25}$$
$$\beta_0 = \bar{y} - \beta_1 \bar{x} \tag{26}$$
Where
$$SS_{xy} = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n} \tag{27}$$
$$SS_x = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \tag{28}$$
$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} \tag{29}$$
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \tag{30}$$
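Equations 25 to 30 translate directly into code. The sketch below is illustrative only; the function names and the small data set are assumptions. The goodness of fit is computed in the residual form, one minus the sum of squared residuals divided by the total sum of squares, which corresponds to Equation 12 if $nS^2$ is read as the total sum of squares.

```python
def least_squares_line(x, y):
    """Return (beta0, beta1) from Equations 25 to 30."""
    n = len(x)
    ss_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n  # Eq. 27
    ss_x = sum(a * a for a in x) - sum(x) ** 2 / n                  # Eq. 28
    beta1 = ss_xy / ss_x                                            # Eq. 25
    beta0 = sum(y) / n - beta1 * sum(x) / n                         # Eq. 26, 29, 30
    return beta0, beta1


def r_squared(x, y, beta0, beta1):
    """Coefficient of determination from the residuals of the fitted line."""
    mean_y = sum(y) / len(y)
    sse = sum((b - (beta0 + beta1 * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    return 1 - sse / sst


# Hypothetical data set used only to exercise the formulas.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
beta0, beta1 = least_squares_line(x, y)
fit_quality = r_squared(x, y, beta0, beta1)
```

For this small data set the fitted line is $y = 2.2 + 0.6x$ with $R^2 = 0.6$, which can be checked by hand against Equations 25 to 28.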
2.12 Residual Analysis
Residual analysis is done in order to see how well a model fits the data, and can be seen as the validation of the model. The residual vs actual plot is one that is rich in information. When the residual vs actual plot reveals a pattern, it means that the data is not fully described by the model. Ideally, after fitting a model to the data, a randomly distributed error should be found, scattered about the zero residual line [57]. If the residual grows with an increase in the actual data, this means that the data is heteroscedastic or that the data needs to be transformed using the log transform. Heteroscedastic data can also be transformed using the Box-Cox transformations [58].
2.13 Related Studies
Justin Colyn [23] focused his research on determining whether the Price-Earnings (P/E) ratio and the Dividend Yield (D/Y) influence future prices for a select few value-weighted equity capital market indices. To do this he used a methodology which tested for stationarity using the Augmented Dickey-Fuller (ADF) method and then for co-integration using the methodology of Johansen [59]. From here he corrected the non-stationary series using the Vector Error Correction Model (VECM) for co-integrated variables and the Vector Auto-Regression (VAR) model at the correct lag, which was found using various information criteria such as the Akaike Information Criterion (AIC), Schwarz Information Criterion (SC) and Hannan-Quinn Information Criterion (HQ). After these tests had been done and corrections made, Granger-Causality tests were performed between the series with the hypothesised relationships. His study concluded that for most indices there is very little evidence of Granger-Causality in either direction between price and P/E ratio or between price and D/Y, but that there appeared to be Granger-Causality between price and P/E ratio for the Financial Times Stock Exchange top 100 (FTSE 100) [23].
Enos Lentsoane [60] researched the stock price reaction to dividend changes. This study differs from the one carried out by Justin Colyn in that it focuses on event study methodology. Changes in dividends are treated as an event when they are announced; this methodology is usually used for qualitative information such as corporate events. The study used the event study methodology of Kothari & Warner [61], which has nine steps. These steps are as follows:
1. Define the event to be tested
2. Define period to be studied in terms of estimation window, event window, and event date.
3. Define what is meant by abnormal performance
4. Collect event data which meets data selection criteria as defined in step 2
5. Calculate pre-event abnormal returns
6. Calculate abnormal returns over event window
7. Calculate the Average Abnormal Return (AAR) and Cumulative Abnormal Return (CAR) for the
test statistic
8. Determine the critical values (Statistical significance) of the AAR and CAR
9. Analyse and interpret the results
When calculating the abnormal return three measures were used, Market Adjusted Abnormal Return
(MAAR), Market Model Abnormal Return (MMAR), and Buy-and-Hold Abnormal Return (BHAR).
This event study concluded that the market reaction is not statistically significant on the announcement day and that more negative returns occur during the pre-crisis period. He also concluded that the research does not support the irrelevance theory but seems to support the signalling hypothesis [60].
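Steps 5 to 7 of the event study methodology listed above can be illustrated with a short sketch. It is not taken from the cited study: the function names and the return figures are hypothetical, and the market adjusted abnormal return is assumed to be the share's return minus the market's return on the same day.

```python
def market_adjusted_abnormal_returns(stock_returns, market_returns):
    """Abnormal return per day: stock return minus market return (assumed)."""
    return [s - m for s, m in zip(stock_returns, market_returns)]


def average_abnormal_returns(abnormal_by_event):
    """AAR for each day of the event window: the mean across all events."""
    n_events = len(abnormal_by_event)
    return [sum(day) / n_events for day in zip(*abnormal_by_event)]


def cumulative(series):
    """Running total of a series, used here to turn the AAR into the CAR."""
    total, out = 0.0, []
    for value in series:
        total += value
        out.append(total)
    return out


# Two hypothetical events observed over a three-day event window.
event_1 = market_adjusted_abnormal_returns([0.02, 0.03, 0.00], [0.01, 0.01, 0.01])
event_2 = market_adjusted_abnormal_returns([0.04, 0.01, 0.02], [0.01, 0.01, 0.01])
aar = average_abnormal_returns([event_1, event_2])
car = cumulative(aar)
```

The significance of the AAR and CAR (step 8) would then be assessed against critical values, which this sketch does not attempt.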
Nondumiso Ngidi [62] researched the effect of strikes in South Africa on the share prices of 49 companies listed on the Johannesburg Stock Exchange (JSE), also using the event study method. The study concluded that stock prices react negatively to the news of strike action and continue to follow a downward trend for approximately 5 days after the strike action has concluded. The study also found that the JSE is not an efficient market, as it takes days for the market to return to equilibrium after these announcements [62].
Yusuf Varli et al. [24] studied a new correlation coefficient that can be used for analysing bivariate time series data. The new coefficient was tested through simulations and its performance compared to that of other correlation coefficients, mainly Pearson’s correlation coefficient. The conclusions drawn were that the coefficient being tested:
• Takes lag-difference into account
• Performs better in capturing the cross-independence of two variables over time
• Is more normal than Pearson’s coefficient
• Performs better than the Detrended Cross-Correlation Analysis (DCCA) coefficient in terms of capturing the independence and co-integration in non-stationary series [24]
Bwo-Nung Huang et al. [63] used unit root and co-integration models to determine the appropriate Granger-Causal relations between stock prices and exchange rates using the Asian Flu data. The tests included in the methodology are as follows:
• Augmented Dickey-Fuller (ADF)
• Phillips-Perron technique
• Bivariate Vector Auto-Regression (VAR) model
• Granger-Causality test
• Co-integration test
From this research it was found that the data from South Korea agree that exchange rates lead stock prices, the data from the Philippines suggest that stock prices lead exchange rates with negative correlation, and the data from Hong Kong, Malaysia, Singapore, Thailand and Taiwan indicate strong feedback relations, whereas those of Indonesia and Japan fail to reveal any recognisable pattern [63].
3 OBJECTIVES
To determine:
• What factors influence the price of the Richemont share?
• What is the relationship between the more influential factors and the Richemont share price?
• Which factors are more correlated with the share price?
4 APPARATUS
1. Personal computer running Windows 10 with an i7 processor
2. Matlab R2010a
3. Excel 2016
5 METHODOLOGY
From the factors listed in the motivation, 8 factors were chosen. The factors that were chosen are:
1. Dividend Yield
2. Volume Traded
3. Rand-Euro exchange rate
4. Rand-Dollar exchange rate
5. Inflation: the average national South African CPI was used
6. Interest rate: the prime lending rate was used
7. Gold price (Rand/ounce)
8. Platinum price (Rand/ounce)
These factors were chosen due to the nature of the company whose share was selected. Richemont is a luxury goods company based in Switzerland. Richemont's main focus is jewellery, luxury watches and writing instruments [64]. Given this focus, it was hypothesised that the gold and platinum prices would have some relationship with the share price. The remaining factors to be evaluated were exchange rates and economic factors; hence the Rand-Euro exchange rate, the Rand-Dollar exchange rate, inflation and the interest rate were used in this study. Dividend yield and volume traded are two of the more commonly used indicators when evaluating a share price, although, in contrast to this, there is a theory of the irrelevance of dividends [52].
5.1 Collection of Data
1. Each factor's data was further split into 6 data collection frequencies: Yearly, Half-Yearly, Quarterly, Monthly, Weekly and Daily.
2. A table with the names of each data set was created and ticked for each data set as it was collected. This ensured that there were no duplications of the data.
3. The data was collected from the Inet BFA database, using the student portal to access the data.
4. The data that could not be found on the Inet BFA database was found on Stats SA or Quantec EasyData.
5. All the exchange rates, gold prices, platinum prices, share prices, volumes and dividend yields were found through Inet BFA.
6. A 5-year dataset and different frequencies were chosen and all data was output to Excel files. The inflation rate and interest rate were taken from Stats SA and Quantec respectively.
7. Data had to be reordered so that it could be used easily in one workbook per period. In some cases, data needed to be extracted from one period's data into another period's data; for example, the CPI was found yearly at a monthly frequency.
5.2 Processing of Data
1. Plot each time series.
2. Find the correlation values between each factor and the share price within each dataset. This can be done using the built-in correlation function in Excel or Matlab.
3. Load the workbooks with all the data for each frequency and name them.
4. Find the cross correlation and auto correlation using Matlab and the cc function attached in Appendix A. This is done to find the lead and lag relationships between the share price and the factors.
5. When using the code in Appendix A, the process needs to be repeated for each data collection frequency: daily, weekly, monthly, quarterly, half-yearly and yearly.
6. After running the code in Appendix A, save each output variable to Excel and display it in graphs.
7. Use the code in Appendix B to test the assumptions listed below at a 5% significance level. The assumptions are:
• Linearity: tested by fitting a linear regression model to the data, obtaining the residuals and then calculating the coefficient of determination. The Matlab function detrend() was used to capture the residuals and the rest was processed in Excel using Equation 12.
• Randomness: tested using the Matlab function runstest().
• Stationarity: tested using the ADF method for unit-root stationarity and the KPSS method for trend stationarity, using adftest() and kpsstest().
• Homoscedasticity: tested using the Engle test for residual heteroscedasticity, using archtest().
• Normality: tested using a one-sample Kolmogorov-Smirnov test, using the Matlab function kstest().
8. Save the results from the assumption tests to Excel and tabulate them.
9. Each series has been detrended when running the code for the assumption tests.
10. The detrended data can be seen as residuals. The residuals must be squared and summed, divided by the number of points multiplied by the standard deviation squared, and the result subtracted from 1 to find the coefficient of determination (Equation 12).
11. In all the tests, except the test for linearity, having all zeros means that the test has been passed. Analyse the tables to see which series in the data sets have met all the assumptions.
12. If none meet all the assumptions, look for the variables that have passed the stationarity and homoscedasticity tests.
13. Plot the data and fit a linear line of best fit to the data in Excel.
14. Residuals can be found in Matlab using the detrend function, which removes the line of best fit from the data.
15. The data can then be plotted against the actual share price. This can also be done in Matlab, using the plot function and then editing the graphs as required.
16. Transform the series that meet the homoscedasticity and stationarity tests.
17. Start the transformations with each factor and iterate between different transformations to improve the coefficient of determination. Residual plots that show an increasing variance usually need to be log transformed.
18. After transforming all the data, plot the regression line through the data and calculate the coefficient of determination for this fit. This can easily be done in Excel.
19. Check whether all the data falls within the 95% confidence intervals by plotting the upper and lower confidence lines with all the data and the regression line. This can be achieved using the Matlab code in Appendix C and D.
20. Evaluate the fit of the regression model for the transformed data. This is done by using the detrended transformed data and plotting it against the actual share price data.
21. Find a correlation matrix using the built-in Matlab correlation function.
22. Find the cross correlation using the cc function in Appendix A.
23. Study the correlation matrix and the cross correlations between factors to determine which factors are not necessary.
24. From the factors that are found to be necessary, determine which factors have very poor correlations with the share price. These factors can be eliminated too.
25. From the remaining factors, investigate how well the linear regression line fits the data from all the datasets that were eliminated because they did not meet the homoscedasticity and stationarity assumptions. This can be done using the Matlab code in Appendix F to produce residual vs actual plots for each dataset.
26. Determine the correlation between the predicted share price values and the actual share price values. This can be done using the correlation function in Matlab.
5.3 Precautions
1. The data was collected one factor at a time to ensure that no data was left out and that the collection took as little time as possible.
2. Some datasets could not be found because results are not published at that particular interval; for example, interest rates and CPI do not change daily or weekly.
3. Data needed to be checked to ensure that the correct data was in the right workbooks in Excel. If the data was not, or if the file was corrupted, the data would need to be downloaded again.
4. Ensure that the function is saved in the same working directory that is being used.
5. Ensure that when running the Matlab code the variables have been updated and that the right variables have been used.
6. Ensure that the "lengthData" variable in the Matlab code is changed as the data collection frequency is changed, because the data has a different length each time.
6 OBSERVATIONS
All the data that was gathered was plotted in one dataset for each factor. This was done using the lowest resolution of the data available, which was possible because the lowest resolution contains all the factors' data for each dataset. The factors were plotted in Figure 3 and Figure 5 to Figure 11.
Figure 3: Platinum Price Over Time
Figure 4: Share Price Over Time
Figure 5: Gold Price Over Time
Figure 6: Volume Traded Over Time
Figure 7: Rand-Euro Over Time
Figure 8: Dividend Yield Over Time
Figure 9: Rand-Dollar Over Time
Figure 10: Interest Rate Over Time
Figure 11: Inflation Over Time
7 DATA PROCESSING
The data was processed within Matlab and Excel, and many built-in functions were used. The data was processed in 12 ways in order to be interpreted. In some cases the preceding processing step determined how the data would be processed next. Each step is discussed below, and the Matlab code and functions that were used are explained.
7.1 Correlation Values
To produce the correlation values seen in Table 1, the correlation function in Excel was used. This
computes the Pearson Product Moment Correlation as discussed in Section 2.3.4.
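The Pearson product-moment calculation mentioned above can be illustrated with a short sketch. It is written in Python purely for illustration and is not the Excel or Matlab function used in the study; the function name pearson and the sample values are assumptions and are not taken from the study's data.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross products and sums of squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for illustration only (not data from the study).
share_price = [10.0, 12.0, 13.0, 15.0, 18.0]
factor = [1.0, 1.4, 1.5, 2.1, 2.4]
print(pearson(share_price, factor))
```

A value close to 1 or -1 indicates a strong linear relationship, which is why the absolute value is the quantity of interest when ranking factors.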
7.2 Cross Correlation and Auto Correlations
The data displayed in Figure 12 to Figure 17 was processed using the cc function in Appendix A. The function calculates the cross correlation using Equation 7. This can be seen implemented in lines 12 and 19 of the function, with the code
n=(cov(x(k:length(x)),y(1:(length(x)-k+1))))/(sqrt(var(x(k:length(x)))*var(y(1:(length(x)-k+1)))));
and
n=cov(x(1:length(y)-l+1),y(l:length(y)))/(sqrt(var(x(1:length(y)-l+1))*var(y(l:length(y)))));
respectively. The cov function returns a covariance matrix, which has the variance of each series on its diagonal and is symmetric about the diagonal. For example, in a 2x2 covariance matrix named A, the covariance between variables 1 and 2 is found at either A(1,2) or A(2,1).
The shifting of the data was achieved by introducing the variables k and l. The variable k was used to insert values above the halfway point of the cross correlation vector, using the code "row((length(x)+k-1))=n(1,2);". The variable l was used to insert values before the halfway point of the cross correlation vector, using the code "row((length(x)-l))=n(1,2);". Note that the function had to be called for each factor in each data set against the share price. This can be seen in the following lines of the Matlab code in Appendix B:
AutoCorrel=cc(data(1:lengthData,2),data(1:lengthData,2));
DYCorrel=cc(data(1:lengthData,2),data(1:lengthData,5));
VolumeCorrel=cc(data(1:lengthData,2),data(1:lengthData,7));
Rand_Euro_Correl=cc(data(1:lengthData,2),data(1:lengthData,9));
Rand_Dollar_Correl=cc(data(1:lengthData,2),data(1:lengthData,11));
inflationCorrel=cc(data(1:lengthData,2),data(1:lengthData,13));
interestCorrel=cc(data(1:lengthData,2),data(1:lengthData,15));
GoldCorrel=cc(data(1:lengthData,2),data(1:lengthData,17));
PlatinumCorrel=cc(data(1:lengthData,2),data(1:lengthData,19));
alldataCC=[AutoCorrel;DYCorrel;VolumeCorrel;Rand_Euro_Correl;Rand_Dollar_Correl;inflationCorrel;interestCorrel;GoldCorrel;PlatinumCorrel];
The variable "lengthData" was changed manually for each dataset, and the matrix "data" also needed to be changed for each data set.
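The idea of correlating two series over their overlapping portion at each shift can be sketched as follows. This is a Python illustration only and is not the cc function of Appendix A: the function names, the max_lag parameter and the sign convention for the lags are all assumptions made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Raises ZeroDivisionError if either overlapping segment is constant.
    return sxy / math.sqrt(sxx * syy)


def cross_correlation(x, y, max_lag):
    """Return {lag: correlation} for lags -max_lag..max_lag.

    Assumed convention: for lag k >= 0, x[k:] is paired with y[:n-k];
    for lag k < 0, x[:n+k] is paired with y[-k:].
    """
    n = len(x)
    if len(y) != n:
        raise ValueError("series must have the same length")
    if max_lag > n - 3:
        raise ValueError("max_lag leaves fewer than 3 overlapping points")
    result = {}
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            xs, ys = x[k:], y[:n - k]
        else:
            xs, ys = x[:n + k], y[-k:]
        result[k] = pearson(xs, ys)
    return result


# Hypothetical series; y is x delayed by one step.
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0]
y = [0.0] + x[:-1]
auto = cross_correlation(x, x, 3)    # autocorrelation of x
cross = cross_correlation(x, y, 3)   # peaks at the lag matching the delay
```

Under this convention the autocorrelation equals 1 at lag 0 and is symmetric about it, which is the same property the report later uses as a check on the cc function.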
7.3 Assumption tests
The assumptions were tested using the following built-in Matlab functions:
detrended(1:lengthData,i)= detrend(data(1:lengthData,i),1);
arch(i)=archtest(detrended(:,i));
[h(i),p(i),k(i),c(i)]=kstest(data(1:lengthData,i));
r(i)=runstest(data(1:lengthData,i));
adf(i)=adftest(data(1:lengthData,i));
kpss(i)=kpsstest(data(1:lengthData,i));
The detrend function, as used in the code above, returns the residuals of a linear model fitted to the data. The archtest function applies the Engle test for residual heteroscedasticity, described in Section 2.3.3 under the homoscedasticity section, to these residuals at a 5% significance level; its null hypothesis is that no conditional heteroscedasticity is present. The kstest function applies the one-sample Kolmogorov-Smirnov test at a 5% significance level. The adftest function uses the Augmented Dickey-Fuller method to test whether the series is unit-root stationary at a 5% significance level. The kpsstest function uses the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test to test the hypothesis that the series is trend stationary, also at a 5% significance level. For a series to meet all the assumptions being tested, the outputs needed to be a row of zeros. This code needed to be run for every dataset.
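The linear detrending step described above amounts to fitting a least-squares straight line against the sample index and keeping the residuals. The following Python sketch illustrates that idea under stated assumptions: the function name linear_detrend and the sample series are hypothetical, and this is not the Matlab detrend implementation itself.

```python
def linear_detrend(y):
    """Return residuals of y after removing a least-squares straight line.

    The line is fitted against the sample index 0, 1, ..., n-1.
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least 2 points to fit a line")
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    slope = (
        sum((i - t_mean) * (v - y_mean) for i, v in enumerate(y))
        / sum((i - t_mean) ** 2 for i in range(n))
    )
    intercept = y_mean - slope * t_mean
    return [v - (intercept + slope * i) for i, v in enumerate(y)]


# Hypothetical series for illustration only.
series = [1.0, 2.0, 4.0, 3.0, 6.0]
residuals = linear_detrend(series)
```

Because the fit includes an intercept, the residuals sum to zero, and a perfectly linear series leaves residuals of zero.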
7.4 Coefficient of determination
This was done manually using the detrended data and the original data for each series. The sum of the squared residuals was calculated, followed by the standard deviation of the original data. These were then substituted into Equation 12. A sample calculation is shown below for the yearly share price.

$$R^2 = 1 - \frac{\sum \varepsilon_i^2}{nS^2}$$

$$\varepsilon_i^2 = \begin{bmatrix} 7304921.91 \\ 1322521.9 \\ 2192712.23 \\ 902724.77 \\ 150292.829 \\ 3880524.77 \end{bmatrix}, \quad S^2 = 7586437.77, \quad n = 6$$

$$R^2 = 1 - \frac{20753698.42}{6 \times 7586437.77}$$

$$R^2 = 0.544$$
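The final substitution into Equation 12 can be checked with a few lines of code. The sketch below is in Python for illustration only; the function name is an assumption, and the inputs are the sum of squared residuals, the variance and the number of points quoted in the sample calculation.

```python
def coefficient_of_determination(sum_sq_residuals, variance, n):
    """R^2 = 1 - (sum of squared residuals) / (n * S^2), as in Equation 12."""
    return 1 - sum_sq_residuals / (n * variance)


# Totals quoted in the sample calculation for the yearly share price.
r_squared = coefficient_of_determination(20753698.42, 7586437.77, 6)
print(round(r_squared, 3))  # 0.544
```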
7.5 Confidence intervals of regression plots
The confidence intervals of the regression plots were produced in Matlab using the code attached in Appendix C and D. The regress function in Matlab was used to obtain the values for the equations of three lines: the upper line at a 95% confidence level, the lower line at a 95% confidence level and the linear regression line.
7.6 Residual analysis for transformed data
The residual plot was done by using the detrend function in Matlab and the scatter function to plot the
data against the actual values.
7.7 Correlation of factors
The correlation of factors was done in Matlab using the correlation function, which also uses the Pearson's Product Moment Correlation coefficient calculation as shown in Section 2.3.4.
7.8 Cross Correlation of all factors
The cross correlation was calculated using the Matlab code attached in Appendix E, which uses the function "cc" attached in Appendix A.
7.9 Residuals for fitting regression model to each data set's volume traded data
The equation of the regression model was obtained from the graphs created in Excel. This equation was input into Matlab and tested to see whether it holds for each data set. The code in Appendix F was used to plot the graphs required and to determine the correlation coefficients. The code is explained below.
1. cas=(max(Dailydata(1:1250,13).^(3))-min(Dailydata(1:1250,13).^(3)))/1249;
2. xvl=min(Dailydata(1:1250,13).^(3)):cas:max(Dailydata(1:1250,13).^(3));
3. SPdy=transpose((-2E-25*(xvl))+21.94);
4. dyres=(Dailydata(1:1250,2))-SPdy.^3;
5. scatter((Dailydata(1:25:1250,2)),dyres(1:25:1250,1),'DisplayName','dyres(1:1250,1)','YDataSource','dyres(1:1250,1)');figure(gcf)
In line 1 the variable "cas" determines the step of the x values in the graphs. Line 2 creates a vector of x values. The variable "SPdy" is the predicted share price. The variable "dyres" is the residual between the actual and predicted values. Line 5 creates a scatter plot of the residuals against the actual values. In the Matlab graphical user interface (GUI), the chart is edited according to what is required. The line of best fit can be added and the equation of the line can be displayed. Residuals can also be displayed using the Matlab GUI.
8 RESULTS
In this section, only the relevant results are presented; all other results are attached in Appendix G. This section comprises 12 subsections: Correlation Values, Cross Correlations and Auto Correlations, Assumption Tests, Coefficient of Determination, Regression of Price vs Factors, Residual vs Actual, Regression of Price vs Factors After Transformations, Confidence Intervals on Regression Plots, Residual Analysis for Transformed Data, Correlation of Factors, Cross Correlation of All Factors and Residuals for fitting the regression model to each period's volume traded data. Each subsection is briefly introduced below.
8.1 Correlation Values
The correlation values in Table 1 are between each factor and the share price. This was done to find which factors impact the share price most within each data set. Six data sets were used: Daily, Weekly, Monthly, Quarterly, Half-yearly and Yearly. The blank spaces in the table are due to not having the data required to perform the calculation for that data set. The absolute average was calculated to ensure that, when looking at the average correlation for each dataset or factor, the correlation is not reduced by a few inverse relationships.
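The effect of averaging absolute values can be seen in a short sketch using the yearly correlations reported in Table 1. The code is Python for illustration only, and the variable names are assumptions.

```python
# Yearly correlations with the share price, as listed in Table 1.
yearly = {
    "DY": 0.687,
    "Volume": -0.987,
    "Rand-Euro": 0.903,
    "Rand-Dollar": 0.783,
    "Inflation": 0.913,
    "Interest": 0.374,
    "Gold": 0.294,
    "Platinum": 0.958,
}

# A plain average lets the strong inverse relationship for volume
# cancel part of the positive correlations.
plain_average = sum(yearly.values()) / len(yearly)

# Averaging magnitudes treats inverse relationships as equally strong.
absolute_average = sum(abs(v) for v in yearly.values()) / len(yearly)

print(round(plain_average, 3), round(absolute_average, 3))  # 0.491 0.737
```

The absolute average of 0.737 matches the yearly entry in the bottom row of Table 1, whereas the plain average would understate the typical strength of the relationships.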
Table 1: Correlation Values of Each Factor with Share Price
                  Yearly  Weekly  Monthly  Quarterly  Half-Yearly  Daily   Absolute Average
DY                0.687   0.451   0.471    0.460      0.360        0.458   0.481
Volume            -0.987  -0.667  -0.823   -0.907     -0.731       -0.489  0.767
Rand-Euro         0.903   0.825   0.844    0.843      0.763        0.541   0.787
Rand-Dollar       0.783   0.740   0.752    0.732      0.591        0.699   0.716
Inflation         0.913           0.827    0.807      0.603                0.788
Interest          0.374           0.490    0.391      0.188        0.565   0.401
Gold              0.294   0.294   0.298    0.251      0.127        0.450   0.286
Platinum          0.958   0.594   0.612    0.638      0.667        -0.178  0.609
Absolute Average  0.737   0.595   0.640    0.627      0.504        0.483
8.2 Cross Correlations and Auto Correlations
Figures 12 to 17 show the cross correlations and auto correlations for each factor within each dataset. Note that a lag in each dataset's graph represents a different time step due to the resolution of the data. The number of data points in each graph also decreases as the data set moves from daily to yearly. The daily data set has 1250 points and no line is fitted through it, as it is easy to see the movement of the correlation with each lag. The figures that have fewer data points have a line passing through the points because of the difficulty of following the points and seeing any patterns that may exist. Note that the autocorrelation in each figure is symmetrical; this is used as validation that the "cc" function in Appendix A works correctly. In each figure all the factors' data have been plotted, which decreases the number of figures that need to be analysed and makes the data easier to read when analysing.
Figure 12: Cross Correlation of Factors Within Daily Data Set
Figure 13: Cross Correlation of Factors Within Weekly Data Set
Figure 14: Cross Correlation of Factors Within Monthly Data Set
Figure 15: Cross Correlation of Factors Within Quarterly Data Set
Figure 16: Cross Correlation of Factors Within Half-Yearly Data Set
Figure 17: Cross Correlation of Factors Within Yearly Data Set
8.3 Assumption Tests
Table 2 to Table 6 each summarise the results of the tests done for each factor within one dataset. An assumption is met when there is a 0 in the block; a 1 represents failure to meet the assumption.
Table 2: Assumption Test Results for Each Series in The Daily Data Set
                Homoscedasticity  Normal Test  Random Test  Unit-Root Stationary  Trend Stationary
Close           1                 1            1            0                     0
Volume          1                 1            1            0                     0
Rand-Euro       1                 1            1            1                     1
Rand-Dollar     1                 1            1            0                     0
Inflation Rate  1                 1            1            0                     0
Gold Price      1                 1            1            0                     0
Platinum Price  1                 1            1            0                     0
Table 3: Assumption Test Results for Each Series in The Weekly Data Set
                Homoscedasticity  Normal Test  Random Test  Unit-Root Stationary  Trend Stationary
Close           1                 1            1            0                     0
Volume          1                 1            1            1                     1
Rand-Euro       1                 1            1            0                     0
Rand-Dollar     1                 1            1            1                     1
Gold Price      1                 1            1            0                     0
Platinum Price  1                 1            1            0                     0
Table 4: Assumption Test Results for Each Series in The Monthly Data Set
                Homoscedasticity  Normal Test  Random Test  Unit-Root Stationary  Trend Stationary
Close           1                 1            1            0                     0
Volume          1                 1            1            0                     0
Rand-Euro       1                 1            1            0                     0
Rand-Dollar     1                 1            1            1                     1
Inflation Rate  1                 1            1            1                     1
Interest Rate   1                 1            1            0                     0
Gold Price      1                 1            1            0                     0
Platinum Price  0                 1            1            0                     0
Table 5: Assumption Test Results for Each Series in The Quarterly Data Set
                Homoscedasticity  Normal Test  Random Test  Unit-Root Stationary  Trend Stationary
Close           1                 1            1            0                     0
Volume          1                 1            1            0                     0
Rand-Euro       0                 1            1            0                     0
Rand-Dollar     0                 1            1            1                     1
Inflation Rate  0                 1            1            1                     1
Interest Rate   0                 1            1            0                     0
Gold Price      0                 1            0            0                     0
Platinum Price  0                 1            0            0                     0
Table 6: Assumption Test Results for Each Series in The Half-Yearly Data Set
                Homoscedasticity  Normal Test  Random Test  Unit-Root Stationary  Trend Stationary
Close           0                 1            1            0                     0
Volume          0                 1            1            0                     0
Rand-Euro       0                 1            0            0                     0
Rand-Dollar     0                 1            1            0                     0
Inflation Rate  0                 1            1            1                     1
Interest Rate   0                 1            1            0                     0
Gold Price      0                 1            1            0                     0
Platinum Price  0                 1            0            0                     0
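The screening described in step 12 of the methodology (when no series meets every assumption, keep those that pass the homoscedasticity and stationarity tests) can be sketched using the half-yearly results of Table 6. The code is Python for illustration only; the dictionary layout and variable names are assumptions.

```python
# Half-yearly test outcomes from Table 6 (0 = assumption met, 1 = not met).
# Column order: homoscedasticity, normality, randomness,
# unit-root stationarity, trend stationarity.
half_yearly = {
    "Close": (0, 1, 1, 0, 0),
    "Volume": (0, 1, 1, 0, 0),
    "Rand-Euro": (0, 1, 0, 0, 0),
    "Rand-Dollar": (0, 1, 1, 0, 0),
    "Inflation Rate": (0, 1, 1, 1, 1),
    "Interest Rate": (0, 1, 1, 0, 0),
    "Gold Price": (0, 1, 1, 0, 0),
    "Platinum Price": (0, 1, 0, 0, 0),
}

# Series that meet every assumption (a full row of zeros).
meets_all = [name for name, row in half_yearly.items() if not any(row)]

# Fallback screen: homoscedasticity and both stationarity tests passed.
passes_screen = [
    name
    for name, (homo, _normal, _random, unit_root, trend) in half_yearly.items()
    if homo == 0 and unit_root == 0 and trend == 0
]

print(meets_all)
print(passes_screen)
```

With these values no series meets all five assumptions, and every series except the inflation rate passes the fallback screen.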
8.4 Coefficient of Determination
Table 7 is a summary of the coefficients of determination for each factor within each data set. This
coefficient of determination shows the ability to fit a straight line through the data for each factor within
each data set. The share price was also tested for linearity in this way.
Table 7: Coefficient of Determination for Each Series
             Yearly  Weekly   Monthly  Quarterly  Half-Yearly  Daily    Average
Share Price  0.544   0.701    0.715    0.615      0.455        0.706    0.606
DY           0.168   0.330    0.357    -0.015     0.095        0.324    0.187
Volume       0.933   0.286    0.451    0.565      0.754        0.152    0.598
Rand-Euro    0.835   0.823    0.831    0.838      0.827        0.573    0.831
Rand-Dollar  0.888   0.917    0.923    0.929      0.906        0.877    0.913
Inflation    0.997   No Data  0.993    0.993      0.993        No Data  0.994
Interest     0.738   No Data  0.643    0.556      0.702        0.734    0.660
Gold         0.695   0.456    0.447    0.472      0.671        0.609    0.548
Platinum     0.817   0.205    0.222    0.318      0.607        0.033    0.434
Average      0.735   0.531    0.620    0.586      0.668        0.501
8.5 Regression of Price Vs Factors
Figure 18 to Figure 25 show regression plots of the share price vs each of the factors for the half-yearly data set.
Figure 18: Linear Regression of DY Vs Share price
Figure 19: Linear Regression of Volume Traded Vs Share price
Figure 20: Linear Regression of Rand-Euro Exchange Rate Vs Share price
Figure 21: Linear Regression of Rand-Dollar Exchange Rate Vs Share price
Figure 22: Linear Regression of Inflation Vs Share price
Figure 23: Linear Regression of Interest Rate Vs Share price
Figure 24: Linear Regression of Gold Price Vs Share price
Figure 25: Linear Regression of Platinum Price Vs Share price
8.6 Residual vs Actual
Following the previous figures, which showed the regression plot of the factors vs the share price, this
section plots the residuals of the linear fits in the figures found in Section 8.5. This was done to look
for patterns in the residuals, an increasing variance in the residuals and the randomness of the
distribution of the residuals. The residuals are plotted in Figure 26 to Figure 34.
Figure 26: Residual Plot for Share Price
Figure 27: Residual Plot for Dividend Yield
Figure 28: Residual Plot for Volume Traded
Figure 29: Residual Plot for Rand-Euro Exchange Rate
Figure 30: Residual Plot for Rand-Dollar Exchange Rate
Figure 31: Residual Plot for Inflation
Figure 32: Residual Plot for Interest Rate
Figure 33: Residual Plot for Gold Price
Figure 34: Residual Plot for Platinum Price
8.7 Regression of Price Vs Factors After Transformations
Following the transformation of the factors and the share price, each series of factor vs share price has been plotted below. Note that the gold price and the interest rate are not included in the figures below, because transforming their data did not improve the linearity.
Figure 35: Linear Regression of Dividend Yield Vs Share Price After Transformation
Figure 36: Linear Regression of Volume Traded Vs Share Price After Transformation
Figure 37: Linear Regression of Rand-Euro Exchange Rate Vs Share Price After Transformation
Figure 38: Linear Regression of Rand-Dollar Exchange Rate Vs Share Price After Transformation
Figure 39: Linear Regression of Inflation Vs Share Price After Transformation
Figure 40: Linear Regression of Platinum Price Vs Share Price After Transformation
8.8 Confidence Intervals on regression plots
Below are the figures of each regression plot after transformation. Each plot has an upper and a lower confidence line, which indicate whether there is a significant outlier in the data. The confidence level used is 95%.
Figure 41: Regression Plot of transformed Dividend Yield vs transformed Share Price with Confidence Intervals
Figure 42: Regression Plot of transformed Rand-Euro vs transformed Share Price with Confidence Intervals
Figure 43: Regression Plot of transformed Rand-Dollar vs transformed Share Price with Confidence Intervals
Figure 44: Regression Plot of transformed Inflation vs transformed Share Price with Confidence Intervals
Figure 45: Regression Plot of transformed Platinum Price vs transformed Share Price with Confidence Intervals
8.9 Residual Analysis for transformed data
Here again the linear fit for each series is evaluated by observing the residuals. The residuals are
analysed for error growth with an increase of the factor as well as patterns within the plot.
Figure 46: Residual Plot of transformed regression between Share Price and DY
Figure 47: Residual Plot of transformed regression between Share Price and Inflation
Figure 48: Residual Plot of transformed regression between Share Price and Platinum Price
Figure 49: Residual Plot of transformed regression between Share Price and Rand-Dollar
Figure 50: Residual Plot of transformed regression between Share Price and Rand-Euro
8.10 Correlation of Factors
Below is the correlation matrix of the series within the half-yearly dataset. Only half the matrix is shown as it is symmetrical about the diagonal. Colour was added to the table to highlight strong negative correlations in dark red and strong positive correlations in dark green. The lower the correlation, the lighter the colour of the block.
Table 8: Correlation matrix of Untransformed Factors
                Close   DY      Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
Share Price     1.000   0.360   -0.731  0.763      0.591        0.603      0.188          0.143       0.667
DY                      1.000   -0.008  -0.065     -0.042       -0.098     -0.279         0.001       -0.245
Volume                          1.000   -0.761     -0.704       -0.859     -0.642         0.178       -0.809
Rand-Euro                               1.000      0.934        0.886      0.635          0.370       0.804
Rand-Dollar                                        1.000        0.942      0.760          0.409       0.639
Inflation                                                       1.000      0.850          0.214       0.747
Interest Rate                                                              1.000          0.066       0.603
Gold Price                                                                                1.000       0.110
Platinum Price                                                                                        1.000
8.11 Cross Correlation of all Factors
The tables below are the cross correlations between each factor and all the observed data within the
half-yearly dataset. Here too, the tables are colour coded with a colour gradient to show the strength of
the correlation. Green means positive correlation and red means negative correlation.
Table 9: Cross Correlation of the Platinum Price with Each Factor
Lag  Share Price  DY     Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
-3   0.1          0.39   -0.21   -0.11      0.26         0.42       0.29           0.66        -0.12
-2   0.43         0.11   -0.40   0.47       0.60         0.53       0.11           0.37        0.03
-1   0.63         -0.18  -0.67   0.79       0.68         0.66       0.22           0.51        0.65
0    0.67         -0.24  -0.81   0.80       0.64         0.75       0.60           0.11        1.00
1    0.58         0.01   -0.77   0.57       0.51         0.70       0.69           -0.12       0.65
2    0.48         0.63   -0.56   0.33       0.48         0.58       0.48           0.00        0.03
3    0.28         0.45   -0.06   0.61       0.79         0.59       0.38           0.50        -0.12
Table 10: Cross Correlation of the Gold Price with Each Factor
Lag  Share Price  DY     Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
-3   0.41         -0.12  -0.28   0.40       0.22         0.32       0.23           0.01        0.50
-2   0.16         0.44   -0.31   -0.09      -0.03        0.22       0.37           -0.29       0.00
-1   0.39         0.76   -0.32   0.29       0.39         0.28       0.06           -0.15       -0.12
0    0.14         0.00   0.18    0.37       0.41         0.21       0.07           1.00        0.11
1    -0.13        -0.61  -0.64   0.46       0.59         0.75       0.88           -0.15       0.51
2    -0.37        -0.48  -0.69   0.35       0.48         0.66       0.73           -0.29       0.37
3    -0.11        -0.32  -0.36   0.62       0.43         0.36       0.38           0.01        0.66
Table 11: Cross Correlation of the Interest Rate with Each Factor
Lag  Share Price  DY     Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
-3   0.74         0.67   -0.80   0.67       0.93         0.93       0.59           0.38        0.38
-2   0.79         0.68   -0.75   0.92       0.99         0.93       0.64           0.73        0.48
-1   0.53         -0.02  -0.58   0.88       0.91         0.92       0.86           0.88        0.69
0    0.19         -0.28  -0.64   0.64       0.76         0.85       1.00           0.07        0.60
1    -0.05        -0.10  -0.56   0.47       0.73         0.80       0.86           0.06        0.22
2    -0.37        -0.21  -0.23   0.44       0.71         0.67       0.64           0.37        0.11
3    -0.72        -0.73  -0.17   0.60       0.66         0.60       0.59           0.23        0.29
Table 12: Cross Correlation of Inflation with Each Factor
Lag  Share Price  DY     Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
-3   0.86         0.72   -0.92   0.80       0.97         0.98       0.60           0.36        0.59
-2   0.88         0.70   -0.86   0.90       0.95         1.00       0.67           0.66        0.58
-1   0.72         0.17   -0.78   0.90       0.94         0.99       0.80           0.75        0.70
0    0.60         -0.10  -0.86   0.89       0.94         1.00       0.85           0.21        0.75
1    0.46         -0.11  -0.79   0.86       0.94         0.99       0.92           0.28        0.66
2    0.17         -0.09  -0.72   0.83       0.95         1.00       0.93           0.22        0.53
3    -0.39        -0.27  -0.54   0.75       0.91         0.98       0.93           0.32        0.42
Table 13: Cross Correlation of the Rand-Dollar Exchange Rate with Each Factor
Lag  Share Price  DY     Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
-3   0.87         0.70   -0.87   0.84       0.87         0.91       0.66           0.43        0.79
-2   0.79         0.71   -0.82   0.72       0.80         0.95       0.71           0.48        0.48
-1   0.71         0.39   -0.77   0.80       0.89         0.94       0.73           0.59        0.51
0    0.59         -0.04  -0.70   0.93       1.00         0.94       0.76           0.41        0.64
1    0.27         -0.30  -0.67   0.81       0.89         0.94       0.91           0.39        0.68
2    -0.05        -0.26  -0.79   0.67       0.80         0.95       0.99           -0.03       0.60
3    -0.50        -0.29  -0.53   0.63       0.87         0.97       0.93           0.22        0.26
Table 14: Cross Correlation of the Rand-Euro Exchange Rate with Each Factor
Lag  Share Price  DY     Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
-3   0.63         0.68   -0.66   0.52       0.63         0.75       0.60           0.62        0.61
-2   0.70         0.48   -0.75   0.57       0.67         0.83       0.44           0.35        0.33
-1   0.80         0.28   -0.85   0.81       0.81         0.86       0.47           0.46        0.57
0    0.76         -0.06  -0.76   1.00       0.93         0.89       0.64           0.37        0.80
1    0.45         -0.18  -0.76   0.81       0.80         0.90       0.88           0.29        0.79
2    0.18         0.10   -0.82   0.57       0.72         0.90       0.92           -0.09       0.47
3    -0.14        0.18   -0.25   0.52       0.84         0.80       0.67           0.40        -0.11
Table 15: Cross Correlation of the Volume Traded with Each Factor
Lag  Share Price  DY     Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
-3   -0.35        -0.65  0.54    -0.25      -0.53        -0.54      -0.17          -0.36       -0.06
-2   -0.72        -0.25  0.62    -0.82      -0.79        -0.72      -0.23          -0.69       -0.56
-1   -0.62        0.28   0.69    -0.76      -0.67        -0.79      -0.56          -0.64       -0.77
0    -0.73        -0.01  1.00    -0.76      -0.70        -0.86      -0.64          0.18        -0.81
1    -0.87        -0.25  0.69    -0.85      -0.77        -0.78      -0.58          -0.32       -0.67
2    -0.47        -0.24  0.62    -0.75      -0.82        -0.86      -0.75          -0.31       -0.40
3    0.08         -0.11  0.54    -0.66      -0.87        -0.92      -0.80          -0.28       -0.21
Table 16: Cross Correlation of the Dividend Yield with Each Factor
Lag  Share Price  DY     Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
-3   0.17         -0.49  -0.11   0.18       -0.29        -0.27      -0.73          -0.32       0.45
-2   0.32         -0.36  -0.24   0.10       -0.26        -0.09      -0.21          -0.48       0.63
-1   0.25         0.41   -0.25   -0.18      -0.30        -0.11      -0.10          -0.61       0.01
0    0.36         1.00   -0.01   -0.06      -0.04        -0.10      -0.28          0.00        -0.24
1    0.28         0.41   0.28    0.28       0.39         0.17       -0.02          0.76        -0.18
2    -0.34        -0.36  -0.25   0.48       0.71         0.70       0.68           0.44        0.11
3    -0.45        -0.49  -0.65   0.68       0.70         0.72       0.67           -0.12       0.39
8.12 Residuals for fitting regression model to each period's volume traded data
Figure 51 to Figure 57 analyse the fit of the linear model that was found for the volume traded against
the share price. The model was evaluated to determine how well it explains the relationship between
the volume traded and the share price within the other data sets. A linear trend can be seen between the
residuals and the actual share price in most of the data sets. Table 17 shows the correlation between the
predicted and the actual share prices.
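The evaluation described above can be sketched as follows: a previously fitted linear model is applied to another data set, the residuals (actual minus predicted) are computed, and the predicted values are correlated with the actual values. This is an illustrative Python sketch only, not the report's Matlab code (Appendix F). The slope, intercept and data values are hypothetical, and any transformation applied to the series before regression is omitted for simplicity.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def evaluate_linear_model(slope, intercept, x, actual):
    """Apply y = slope * x + intercept and compare the result with the actual values."""
    predicted = [slope * xi + intercept for xi in x]
    residuals = [a - p for a, p in zip(actual, predicted)]
    return predicted, residuals, pearson(predicted, actual)


# Hypothetical coefficients and made-up data, for illustration only.
slope, intercept = -2.0, 100.0
volume = [10.0, 12.0, 15.0, 18.0, 20.0, 25.0]
price = [82.0, 75.0, 71.0, 62.0, 61.0, 49.0]

predicted, residuals, r = evaluate_linear_model(slope, intercept, volume, price)

# Pairs of (actual price, residual) are what a residual vs actual plot displays.
for actual_price, residual in zip(price, residuals):
    print(f"{actual_price:6.1f} {residual:6.1f}")
print(f"correlation of predicted and actual: {r:.3f}")
```

Repeating this calculation once per data capture frequency would give one coefficient per data set, which is the form in which Table 17 reports its results.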
Figure 51: Error for Fitting Equation to Daily Data
Figure 52: Error for Fitting Equation to Weekly Data
Figure 53: Error for Fitting Equation to Monthly Data
Figure 54: Error for Fitting Equation to Quarterly Data
Figure 55: Error for Fitting Equation to Yearly Data
Figure 56: Error for Fitting Equation to Half-yearly Data
Table 17: Correlation of predicted and actual share price
Data set     Correlation Coefficient
Daily        0.207
Weekly       0.561
Monthly      0.720
Quarterly    0.722
Half-yearly  0.865
Yearly       0.904

MECN4006

  • 1. MECN4006 – Research Project Correlation of Factors Influencing a Share Price Name: Ashail Maharaj Student Number: 536684 Supervisor: Dr Ian Campbell 25 August 2016 A project report submitted to the Faculty of Engineering and the Built Environment, University of the Witwatersrand, Johannesburg, in partial fulfilment of the requirements for the degree of Bachelor of Science in Engineering. Johannesburg, August 2016
  • 2. ii DECLARATION UNIVERSITY OF THE WITWATERSRAND, JOHANNESBURG SCHOOL OF MECHANICAL, INDUSTRIAL AND AERONAUTICAL ENGINEERING I Ashail Maharaj Student Number: 536684, am registered for the course No. MECN 4006 - in the year 2016. I herewith submit the following task, β€œResearch Project: Correlation of Factors Influencing a Share Price” in partial fulfilment of the requirements of the above course. I hereby declare the following: ο‚· I am aware that plagiarism (the use of someone else’s work without their permission and/or without acknowledging the original source) is wrong; ο‚· I confirm that the work submitted herewith for assessment in the above course is my own unaided work except where I have explicitly indicated otherwise; ο‚· This task has not been submitted before, either individually or jointly, for any course requirement, examination or degree at this or any other tertiary educational institution; ο‚· I have followed the required conventions in referencing the thoughts and ideas of others; ο‚· I understand that the University of the Witwatersrand may take disciplinary action against me if it can be shown that this task is not my own unaided work or that I have failed to acknowledge the sources of the ideas or words in my writing in this task. Signature: ________________________________ Date: 25/08/2016
  • 3. iii ABSTRACT When investors predict share prices it is important to find the relevant factors that could influence the way the investors value the share. Generally, when investors value a share they use different factors that they feel are strong indicators of the company’s worth. These factors are also determined by the knowledge the investors have gained. There are a vast number of factors that a company’s financial performance is dependent upon and each of these factors could be related to each other, in such a case it would be redundant to include all the factors. The scope of this project focused around the correlation of factors with each other and the share price, with the aim to find the factors that are most relevant to the share price. The objectives of this research were to determine what factors influence the price of the Richemont share, what the relationship between relevant factors and the Richemont share price are and which factors can be used to predict the share price. A group of 8 factors were chosen. i.e. dividends yield, volume traded, rand-euro exchange rate, rand dollar exchange rate, inflation (CPI), the prime lending interest rate, the gold price and the platinum price. Each factor was recorded at daily, weekly, monthly, quarterly, half-yearly and yearly intervals. During this study a number of steps were carried out to ensure that the data used was valid for the analysis. The correlation and cross correlation of the factors with the share price was determined. The underlying assumptions for correlation were then tested. A linear regression was performed on each factor with the share price and a residual analysis was done. Each series was then transformed to get a better linear fit between the factors and the share price. Each transformed set of factors and share price was linearly regressed after transformation and the residual analysis was performed again. 
Correlation and cross correlation were evaluated between all factors to find redundant factors, which were then eliminated. The volume traded was the only relevant factor remaining after the test. The relationship found in the regression was then tested for each dataset’s volume traded series to see if the regression model still holds. This study then concluded showing that the only relevant factor from the initial 8 was the volume, the relationship between the Richemont share price and the volume traded is showed by the equation √ 𝑦3 = (βˆ’2 βˆ™ 10βˆ’27) βˆ™ π‘₯3 + 21.9 and that the volume traded is correlated with the share price with a correlation coefficient of -0.894.
  • 4. iv TABLE OF CONTENTS DECLARATION...................................................................................................................................ii TABLE OF CONTENTS ....................................................................................................................iv LIST OF FIGURES.............................................................................................................................ix LIST OF TABLES...............................................................................................................................xi 1 INTRODUCTION.........................................................................................................................1 1.1 Background.............................................................................................................................1 1.2 Motivation...............................................................................................................................1 2 LITERATURE SURVEY.............................................................................................................2 2.1 Fundamental Analysis.............................................................................................................2 2.2 Technical Analysis..................................................................................................................3 2.3 Time Series Analysis ..............................................................................................................4 Seasonality......................................................................................................................6 Trends .............................................................................................................................7 Correlation ......................................................................................................................7 Pearson’s Product Moment Correlation 
Coefficient.......................................................7 Cross correlation and Auto Correlation ........................................................................13 2.4 Feature Selection...................................................................................................................14 2.5 Exploratory Data Analysis (EDA) ........................................................................................14 2.6 Confirmatory Data Analysis (CDA) .....................................................................................14 2.7 Dividend Yield......................................................................................................................15 2.8 Interest Rate ..........................................................................................................................15 2.9 Principle Component Analysis (PCA) ..................................................................................15 2.10 Causality ...............................................................................................................................16 2.11 Linear regression...................................................................................................................16 2.12 Residual Analysis..................................................................................................................16 2.13 Related Studies......................................................................................................................17 3 OBJECTIVES .............................................................................................................................18
  • 5. v 4 APPARATUS ..............................................................................................................................19 5 METHODOLOGY .....................................................................................................................19 5.1 Collection of Data.................................................................................................................19 5.2 Processing of Data ................................................................................................................20 5.3 Precautions............................................................................................................................21 6 OBSERVATIONS.......................................................................................................................23 7 DATA PROCESSING ................................................................................................................27 7.1 Correlation Values ................................................................................................................27 7.2 Cross Correlation and auto Correlations...............................................................................27 7.3 Assumption tests...................................................................................................................28 7.4 Coefficient of determination.................................................................................................29 7.5 Confidence intervals of regression plots...............................................................................29 7.6 Residual analysis for transformed data .................................................................................29 7.7 Correlation of factors ............................................................................................................29 7.8 Cross Correlation of all factors 
.............................................................................................29 7.9 Residuals for fitting regression model to each data sets’ volume traded data ......................29 8 RESULTS ....................................................................................................................................30 8.1 Correlation Values ................................................................................................................30 8.2 Cross Correlations and Auto Correlations ............................................................................31 8.3 Assumption Tests..................................................................................................................34 8.4 Coefficient of Determination ................................................................................................36 8.5 Regression of Price Vs Factors .............................................................................................37 8.6 Residual vs Actual ................................................................................................................41 8.7 Regression of Price Vs Factors After Transformations ........................................................45 8.8 Confidence Intervals on regression plots..............................................................................48 8.9 Residual Analysis for transformed data................................................................................50 8.10 Correlation of Factors ...........................................................................................................52 8.11 Cross Correlation of all Factors ............................................................................................53 8.12 Residuals for fitting regression model to each periods volume traded data..........................56
  • 6. vi 9 DISCUSSION..............................................................................................................................59 9.1 Correlation Coefficients........................................................................................................59 Average correlation for each factor...............................................................................59 Absolute average correlation for each data capture frequency .....................................59 9.2 Cross Correlation ..................................................................................................................60 9.3 Assumption Tests..................................................................................................................61 Daily..............................................................................................................................61 Weekly..........................................................................................................................61 Monthly.........................................................................................................................62 Quarterly .......................................................................................................................62 Half-Yearly ...................................................................................................................62 Yearly............................................................................................................................62 9.4 Linear Regression .................................................................................................................62 Dividends Yield Vs Share Price....................................................................................63 Volume Traded Vs Share Price.....................................................................................63 Rand-Euro Exchange rate vs Share 
Price......................................................................63 Rand-Dollar Exchange rate Vs Share Price ..................................................................63 Inflation Vs Share Price ................................................................................................63 Interest Rate Vs Share Price..........................................................................................63 Gold Price Vs Share Price.............................................................................................63 Platinum Price Vs Share Price ......................................................................................64 9.5 Residuals vs Actual...............................................................................................................64 Share Price ....................................................................................................................64 Dividend yield...............................................................................................................64 Volume..........................................................................................................................64 Rand-Euro.....................................................................................................................64 Rand-Dollar...................................................................................................................65 Inflation.........................................................................................................................65 Interest...........................................................................................................................65
  • 7. vii Gold...............................................................................................................................65 Platinum........................................................................................................................65 9.6 Transformation and Linear Regression.................................................................................65 Dividends Yield ............................................................................................................66 Volume..........................................................................................................................67 Rand-Euro.....................................................................................................................67 Rand-Dollar...................................................................................................................67 Inflation.........................................................................................................................67 Interest Rate and Gold...................................................................................................67 Platinum........................................................................................................................67 9.7 Confidence intervals on regression plots ..............................................................................67 9.8 Residual analysis On transformed data.................................................................................68 Dividend Yield..............................................................................................................68 Inflation.........................................................................................................................68 Platinum Price...............................................................................................................68 Rand-Dollar Exchange 
Rate..........................................................................................68 Rand-Euro Exchange Rate............................................................................................68 9.9 Correlation of factors ............................................................................................................69 Dividends yield .............................................................................................................69 Volume traded...............................................................................................................69 Rand-Euro exchange rate..............................................................................................70 Rand-Dollar exchange rate............................................................................................70 Inflation.........................................................................................................................70 Interest Rate ..................................................................................................................71 Gold Price .....................................................................................................................71 Errors from Fitting the Linear Model to Each Volume Traded and Share Price Data Set 72 10 CONCLUSIONS .....................................................................................................................72 11 RECOMMENDATIONS........................................................................................................72
  • 8. viii 12 REFERENCES........................................................................................................................74 APPENDIX A: Matlab Code for Cross Correlation and Auto Correlation..................................80 APPENDIX B: Matlab Code for Testing All Assumptions.............................................................81 APPENDIX C: Matlab Code for A Regression Plot with Confidence Limits ...............................83 APPENDIX D: Code for changing the data that is plotted.............................................................84 APPENDIX E: Matlab Code for The Cross Correlation with Each Data Set...............................85 APPENDIX F: Matlab Code for Extracting Error Graphs............................................................89 APPENDIX G: Residual Plots ...........................................................................................................91
  • 9. ix
LIST OF FIGURES
Figure 1: Moods of the masses according to technical analysis [8]
Figure 2: Shifting of Time Series
Figure 3: Platinum Price Over Time
Figure 4: Share Price Over Time
Figure 5: Gold Price Over Time
Figure 6: Volume Traded Over Time
Figure 7: Rand-Euro Over Time
Figure 8: Dividend Yield Over Time
Figure 9: Rand-Dollar Over Time
Figure 10: Interest Rate Over Time
Figure 11: Inflation Over Time
Figure 12: Cross Correlation of Factors Within Daily Data Set
Figure 13: Cross Correlation of Factors Within Weekly Data Set
Figure 14: Cross Correlation of Factors Within Monthly Data Set
Figure 15: Cross Correlation of Factors Within Quarterly Data Set
Figure 16: Cross Correlation of Factors Within Half-Yearly Data Set
Figure 17: Cross Correlation of Factors Within Yearly Data Set
Figure 27: Linear Regression of DY Vs Share Price
Figure 28: Linear Regression of Volume Traded Vs Share Price
Figure 29: Linear Regression of Rand-Euro Exchange Rate Vs Share Price
Figure 30: Linear Regression of Rand-Dollar Exchange Rate Vs Share Price
Figure 31: Linear Regression of Inflation Vs Share Price
Figure 32: Linear Regression of Interest Rate Vs Share Price
Figure 33: Linear Regression of Gold Price Vs Share Price
Figure 34: Linear Regression of Platinum Price Vs Share Price
Figure 18: Residual Plot for Share Price
Figure 19: Residual Plot for Dividend Yield
Figure 20: Residual Plot for Volume Traded
Figure 21: Residual Plot for Rand-Euro Exchange Rate
Figure 22: Residual Plot for Rand-Dollar Exchange Rate
Figure 23: Residual Plot for Inflation
Figure 24: Residual Plot for Interest Rate
Figure 25: Residual Plot for Gold Price
Figure 26: Residual Plot for Platinum Price
Figure 35: Linear Regression of Dividend Yield Vs Share Price After Transformation
Figure 36: Linear Regression of Volume Traded Vs Share Price After Transformation
  • 10. x
Figure 37: Linear Regression of Rand-Euro Exchange Rate Vs Share Price After Transformation
Figure 38: Linear Regression of Rand-Dollar Exchange Rate Vs Share Price After Transformation
Figure 39: Linear Regression of Inflation Vs Share Price After Transformation
Figure 40: Linear Regression of Platinum Price Vs Share Price After Transformation
Figure 41: Regression Plot of transformed Dividend Yield vs transformed Share Price with Confidence Intervals
Figure 42: Regression Plot of transformed Rand-Euro vs transformed Share Price with Confidence Intervals
Figure 43: Regression Plot of transformed Rand-Dollar vs transformed Share Price with Confidence Intervals
Figure 44: Regression Plot of transformed Inflation vs transformed Share Price with Confidence Intervals
Figure 45: Regression Plot of transformed Platinum Price vs transformed Share Price with Confidence Intervals
Figure 46: Residual Plot of transformed regression between Share Price and DY
Figure 47: Residual Plot of transformed regression between Share Price and Inflation
Figure 48: Residual Plot of transformed regression between Share Price and Platinum Price
Figure 49: Residual Plot of transformed regression between Share Price and Rand-Dollar
Figure 50: Residual Plot of transformed regression between Share Price and Rand-Euro
Figure 51: Error for Fitting Equation to Daily Data
Figure 52: Error for Fitting Equation to Weekly Data
Figure 53: Error for Fitting Equation to Monthly Data
Figure 54: Error for Fitting Equation to Quarterly Data
Figure 56: Error for Fitting Equation to Yearly Data
Figure 57: Error for Fitting Equation to Half-yearly Data
  • 11. xi
LIST OF TABLES
Table 1: Correlation Values of Each Factor with Share Price
Table 2: Assumption Test Results for Each Series in The Daily Data Set
Table 3: Assumption Test Results for Each Series in The Weekly Data Set
Table 4: Assumption Test Results for Each Series in The Monthly Data Set
Table 5: Assumption Test Results for Each Series in The Quarterly Data Set
Table 6: Assumption Test Results for Each Series in The Half-Yearly Data Set
Table 7: Coefficient of Determination for Each Series
Table 8: Correlation Matrix of Untransformed Factors
Table 9: Cross Correlation of the Platinum Price with Each Factor
Table 10: Cross Correlation of the Gold Price with Each Factor
Table 11: Cross Correlation of the Interest Rate with Each Factor
Table 12: Cross Correlation of Inflation with Each Factor
Table 13: Cross Correlation of the Rand-Dollar Exchange Rate with Each Factor
Table 14: Cross Correlation of the Rand-Euro Exchange Rate with Each Factor
Table 15: Cross Correlation of the Volume Traded with Each Factor
Table 16: Cross Correlation of the Dividend Yield with Each Factor
Table 17: Correlation of Predicted and Actual Share Price
Table 18: Transformations Used on Each Dataset
Table 19: $R^2$ Values
  • 12. 1
1 INTRODUCTION
1.1 Background
Many factors can influence a share price, and they may be economic or behavioural. Economic factors are those that influence the spending of individuals, such as the value of the currency, the gross domestic product (GDP) and commodity prices. Behavioural factors are those that influence the decision-making of individuals, such as what an individual was taught about trading (knowledge), the risk the individual is willing to take and the driving factors behind the investment (which could be seen as emotions).
Many factors must be considered for any single share’s price. The factors are also related to one another, which leads to many different combinations of factors that can be used to analyse the value of a share.
The price of a share is determined by the demand for the share and by what each individual buying or selling the share perceives its value to be. The problem of predicting the share price therefore reduces to identifying which factors are most relevant to the market’s perception of the value of the share. Many factors are available, but not all will be relevant. Some may even be redundant, because multiple factors have causal relationships and may also be correlated with each other. Finding the few factors that are most relevant to the share price can improve prediction accuracy and simplify the prediction process by reducing the computational complexity and the time taken to complete the model.
This project aims to find the factors that are most relevant to predicting the Richemont share price. These factors could later be used in a prediction model. There are currently many methods for reaching the required results. The methods used in the literature are factor analysis, principal component analysis, causality and correlations between factors.
1.2 Motivation
The purpose of this study is to investigate which factors are most relevant to the Richemont share price. The study forms a preliminary investigation for the design of a simplified model of the JSE. In preparing to create this model, a number of factors and one specific share were chosen. The model will use agent-based simulation to simulate the decisions of entities within the market according to economic factors such as:
• Fundamental pricing information including dividends.
• Distribution of knowledge.
• Economic conditions such as exchange rate, overseas market closing levels, inflation and interest rates, as well as commodity prices that may be correlated to the share.
• Market perceptions of the company including company ethics.
  • 13. 2
• Agents will be rational and will have a risk profile and time view (e.g. buy and hold, day trading, etc.)
• Competition and alternatives in the market.
• Agents will be segmented by financial level (and size of trades)
A search of the literature showed that, for many models, selecting the correct factors led to higher accuracy than using as many factors as possible [1]–[3]. An investigation of prior models found that selecting factors on quality was a more effective criterion than the quantity of factors selected. Reducing the number of factors also made the model less computationally complex and reduced the time the simulations took to complete. From this, the question “What factors are most influential to the Richemont share price?” was formulated, and it leads into this research.
2 LITERATURE SURVEY
2.1 Fundamental Analysis
Fundamental analysis is a method of evaluating the value of a share by looking at the company’s performance and its reactions to economic conditions [4]. It seeks to understand whether the company is growing, whether it is profitable, whether it will continue to improve or become the best in its market segment, whether it is able to pay its debts and whether it is in good ethical standing. These questions, and many more, are answered in order to answer the question “Does the company make a good investment?”. Fundamental analysis is usually used to evaluate stocks but can be applied to securities, countries, markets and market segments [5].
Many factors can be analysed when performing fundamental analysis on a share; any factor that could influence the company’s performance can be included. Fundamental analysis is not limited to using purely qualitative or purely quantitative data.
Any news about a company can usually help in measuring the value, or the potential value, of the company. This is where the methodology of event studies becomes useful. The quantitative analysis usually involves examining the company’s finances and books in order to arrive at a perceived value of the share. This perceived value is called the intrinsic value; factors such as revenue, debt, dividends and company performance ratios are used as measures of it [6].
The objectives of fundamental analysis are [4]:
• To predict the direction of the economies that impact a company. This is done because the financial performance of the company depends on the economy it resides within.
• To estimate the intrinsic value of the stock and try to predict when changes in this value will occur.
• To select the right time to buy and sell stocks to maximise investment returns.
  • 14. 3
2.2 Technical Analysis
Technical analysis is the evaluation of securities by analysing statistics generated by market activity. These statistics are generally past prices or volumes traded [7]. The aim of technical analysis is to identify patterns that can suggest future activity, rather than to forecast intrinsic value. Technical analysis typically depends on chart patterns, technical indicators, oscillators or some combination of these [8].
Technical analysts believe that the charts show the moods of the crowds, and they therefore focus on the analysis of mass human psychology. Emotional risk is inversely correlated to financial risk; Figure 1 below displays the moods associated with the different price trends.
Figure 1: Moods of the masses according to technical analysis [8]
People are generally motivated by greed and optimism when buying and driven by fear or pessimism when selling. It is believed that people formulate scenarios based on their emotional state in order to rationalise their emotions. Using this rationale, investors try to sell at, or as close as possible to, the top and to buy at, or as close as possible to, the bottom. Investors use this to help find turning points that they cannot otherwise see [8].
Apart from the above-mentioned methods, technical analysis also uses trend analysis, support and resistance analysis, and volume analysis.
• Trend Analysis
Trend analysis is one of the most important and most used techniques in technical analysis. A trend is the general direction in which the price is heading. Trends are not always easy to spot because the price fluctuates considerably over time. In trend analysis, trends are classified according to their direction into three sets: uptrends, horizontal trends and downtrends. An uptrend is characterised by a series of higher highs and higher lows, whereas a downtrend is characterised by lower lows and lower highs [8], [9].
  • 15. 4
Trends are then further classified into another set of three: long-term, intermediate and short-term trends. A long-term trend lasts longer than a year, an intermediate trend lasts between one and three months and a short-term trend lasts up to a month. Channel lines are the addition of two parallel trend lines which act as areas of support and resistance [8].
• Support and Resistance Analysis
Support is defined as the price level which the stock seldom falls below, and resistance is the price level which the stock seldom rises above. Support and resistance are governed by the psychology behind supply and demand: the support level is the price at which the market is willing to buy, and the resistance level is the price at which the market is willing to sell. When the price breaches the support or the resistance level, there has been a shift in the supply or demand curves for the shares. Once the resistance or support level is breached, its role is reversed, i.e. the resistance level becomes the support level if the resistance level is broken, and vice versa for the support level [9], [10].
• Volume Analysis
Volume is the number of shares traded over a given period of time; greater volume indicates a more active security. Volume charts also have trends, which can show an increase or decrease in the demand or supply of the share. Volume analysis is important to technical analysis because it is used to confirm chart trends and patterns. In most scenarios changes in volume precede changes in price, except when the divergence case occurs. The divergence case is when the relationship between volume and price starts to deteriorate [10], [11].
2.3 Time Series Analysis
A time series is a set of observations, each of which is recorded at a different point in time, denoted by $t$.
When $t$ is incremental and data is recorded at each increment, the series is discrete, whereas if $t$ is continuous the series is a continuous time series [12]. Time series analysis can be broken down into the following objectives [13]:
• Description
The description objective comprises plotting the data, looking for trends, seasonality, outliers, normality and stationarity, and using further tools to describe the data set better.
• Explanation
Explanation focuses on correlations and relationships between different time series or within a single time series.
• Prediction
Prediction focuses on estimating future values of the data, which is also known as forecasting. This includes fitting models to the data to improve forecasts.
  • 16. 5
  • 17. 6
• Control
This objective is usually used in quality control, where it ensures that process outputs are within specification or are significantly within specification.
The classical decomposition model is used to describe a time series and so to forecast its future values better. The model defines a time series in terms of components such as the trend, seasonality and noise [14]. There are two classical decomposition methods, the additive and the multiplicative, described by Equations 1 and 2 respectively.
$$Y = T + C + S + e \tag{1}$$
$$Y = T \times C \times S \times e \tag{2}$$
Where
$Y$ is the value of the series at a specified point
$T$ is the linear trend
$C$ is the cycle
$S$ is the seasonality
$e$ is the random error
Seasonality
Seasonality is described as the predictable changes that the data in a time series experiences and that recur over a one-year period [15]. Seasonality can be calculated using Equation 3 below.
$$K_t = \frac{Y_t}{M_t} \tag{3}$$
Where
$K_t$ is a series of seasonality and randomness
$M_t$ is the moving average of the time series
$Y_t$ is the value of the series at time $t$
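As a rough illustration of Equation 3, the sketch below divides each observation by a moving average of the series to obtain the seasonality-and-randomness series $K_t$. The function names, the choice of a trailing window and the sample data are assumptions made for illustration only; the report does not state which moving-average window was used.

```python
def moving_average(series, window):
    """Trailing moving average M_t; None where the window is incomplete."""
    averages = []
    for t in range(len(series)):
        if t + 1 < window:
            averages.append(None)
        else:
            chunk = series[t + 1 - window : t + 1]
            averages.append(sum(chunk) / window)
    return averages


def seasonal_ratio(series, window):
    """K_t = Y_t / M_t (Equation 3), None where M_t is not available."""
    averages = moving_average(series, window)
    return [None if m is None else y / m for y, m in zip(series, averages)]


# Hypothetical quarterly observations with a repeating pattern.
y = [10.0, 20.0, 30.0, 40.0, 10.0, 20.0, 30.0, 40.0]
k = seasonal_ratio(y, window=4)
print(k)
```

Values of $K_t$ above 1 mark observations that sit above the local average and values below 1 mark observations that sit below it.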
  • 18. 7
In order to obtain the seasonality series, Equation 4 is used. Note that the subscript $g$ is the number of increments in a season and that the sum is taken over the time each season lasts. This is done to average out the randomness that occurs within each season [16].
$$S_g = \sum K_t \tag{4}$$
Where
$S_g$ is the seasonality of the series
Trends
The trend of a time series is found using a least squares fit of the model in Equation 5 below [16].
$$M_t = a + bt + e_t \tag{5}$$
Where
$M_t$ is the moving average value at time $t$
$a$ is the intercept
$b$ is the slope
$e_t$ is the residual
For the trend, only the linear part of Equation 5 is used and the residual term is discarded.
Correlation
Correlation, in terms of time series, is a measure of how closely two time series fluctuate together, that is, a measure of the linear relationship between the two. It is used to tell how well one time series is able to predict fluctuations in another [17], [18]. Correlation does not imply a causal relationship, merely that a relationship exists between the variables that can be exploited to forecast one from the other [19]. Correlated variables could share some common variable that causes the fluctuations in both, and this may be what creates the relationship. A time series may also be correlated with lagged versions of itself, which is called serial correlation or autocorrelation [20]. Correlation also allows analysis of which time series leads which, by offsetting the data and comparing the two; this is called cross-correlation [21], [22]. When these lags are used, a model is generally fitted first and an information criterion such as the AIC may then be used to find the best lag order [23], [24]. Alternatively, the maximum lags can be used.
Pearson’s Product Moment Correlation Coefficient
Pearson’s product moment correlation (PPMC), also referred to as Pearson’s correlation coefficient, is a measure of how well two variables are related linearly.
It enables the user to know whether fitting a straight line to the data accurately represents the relationship between the variables in question.
  • 19. 8
Equation 6 below is used to calculate the coefficient of correlation. A strong correlation is represented by an $r$ value within the intervals [0.7; 1] or [-1; -0.7], where an absolute value of 1 represents a perfect linear relationship. A moderate correlation falls within the intervals [0.3; 0.7] or [-0.7; -0.3]. A low correlation is represented by a value within the ranges [0.1; 0.3] or [-0.3; -0.1]. An $r$ value of 0 indicates no correlation [17], [25].
$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left(n\sum x^2 - \left(\sum x\right)^2\right)\left(n\sum y^2 - \left(\sum y\right)^2\right)}} \tag{6}$$
$$r = \frac{cov(x, y)}{S_x S_y} \tag{7}$$
Where
$n$ is the number of observations in the sample
$x$ is the independent variable
$y$ is the dependent variable
$cov(x, y)$ is the covariance between the two variables
$S_x$ is the sample standard deviation of the independent variable
$S_y$ is the sample standard deviation of the dependent variable
Equation 7 follows from Equation 6 by substituting the covariance for the numerator and the variances of the two variables for the bracketed terms in the denominator [26].
PPMC rests on several assumptions. If these assumptions are not met, the data may not mean what it is thought to mean, or the results may not be valid [27], [28]. The assumptions are:
• Normality
Normality is the measure of how the data is distributed and whether the normal distribution can be fitted to the data significantly. Testing for normality can be done graphically or numerically. The numerical methods that can be used are the Kolmogorov-Smirnov test [29] (see Equation 8 for the test statistic) and the Shapiro-Wilk test (see Equations 9 and 10 for the test statistic) [30]. The graphical methods that can be used are Q-Q plots, histograms and box-and-whisker diagrams [30]. The numerical tests can be performed at a specified significance level to see whether the data is normally distributed.
$$T = \max\left|F^*(x) - S(x)\right| \tag{8}$$
Where
$T$ is the test statistic used for the Kolmogorov-Smirnov test
  • 20. 9
$F^*(x)$ is the hypothesised cumulative distribution function (the normal distribution for normality tests)
$S(x)$ is the empirical distribution function of the data being tested
$$W = \left(\frac{b}{S\sqrt{n-1}}\right)^2 \tag{9}$$
$$b = \sum b_i = \sum a_{n-i+1}\left(X_{n-i+1} - X_i\right) \tag{10}$$
Where
$W$ is the test statistic for the Shapiro-Wilk test
$b$ is defined by Equation 10
$a_{n-i+1}$ is a Shapiro-Wilk coefficient
$X$ is the data from the series being tested
The Kolmogorov-Smirnov test uses p-values to assess the test statistic and to accept or reject the null hypothesis that the data is normal. The Shapiro-Wilk test compares the test statistic with critical values from a Shapiro-Wilk table to conclude whether the null hypothesis holds.
• Linearity
Linearity of data is the ability of a line to display the relationship between the dependent variable and the independent variable. This is usually determined through linear regression, in which the aim is to fit a line through the data while minimising the error. The goodness of fit can be determined by finding the coefficient of determination $R^2$ of the line. Equation 11 below can be used to find the coefficient of determination [31].
$$R^2 = \left(\frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left(n\sum x^2 - \left(\sum x\right)^2\right)\left(n\sum y^2 - \left(\sum y\right)^2\right)}}\right)^2 \tag{11}$$
Alternatively, it can be found by analysis of the residuals using Equation 12 below.
$$R^2 = 1 - \frac{\sum \varepsilon_i^2}{nS^2} \tag{12}$$
Where
$R^2$ is the coefficient of determination
$\varepsilon_i$ is the error at each point $i$
  • 21. 10
$S$ is the standard deviation of the data set being analysed
• Stationarity
Stationarity means that the statistical properties of a series, such as its mean and variance, do not change over time. There are two types of stationarity: difference stationarity and trend stationarity [32], [33]. Before continuing with these definitions, it is important to first define the following:
Pure Random Walk
A pure random walk is defined by the equation
$$Y_t = Y_{t-1} + \varepsilon_t \tag{13}$$
Where
$\varepsilon_t$ is white noise
$Y_t$ is the series value at time $t$
$Y_{t-1}$ is the series value at time $t-1$
White noise is stochastic, so this series is not mean reverting and its variance evolves over time: the variance of the series tends to infinity as time tends to infinity. This is a difference stationary process [34].
Random walk with drift
This series is defined by the equation below
$$Y_t = \alpha + Y_{t-1} + u_t \tag{14}$$
Where
$\alpha$ is the drift term in the series
This series also has a variance that depends on time and hence is not mean reverting. This is a difference stationary process [34].
Deterministic trend
This is defined by the equation below
$$Y_t = \alpha + \beta t + u_t \tag{15}$$
Where
$\beta t$ is the deterministic trend
  • 22. 11
Although this series looks similar to a random walk with drift, it differs in that the series is regressed on the time trend $\beta t$ rather than on its own previous value. A nonstationary process with a deterministic trend has a mean that grows around a fixed trend which is constant and independent of time. This is a trend stationary process [34].
Random walk with drift and deterministic trend
This series is described by the equation below
$$Y_t = \alpha + Y_{t-1} + \beta t + u_t \tag{16}$$
This series has both a drift component and a deterministic trend. It is both difference and trend stationary [34].
Difference stationarity
A series with a random walk can be transformed into a stationary process by differencing, irrespective of whether it has drift or not [34].
Trend Stationarity
A nonstationary process with a deterministic trend can be transformed into a stationary process by detrending [34].
Difference and Trend Stationary
In the case of a random walk with drift and a deterministic trend, stationarity can be achieved through detrending, but differencing must also be applied to ensure that the variance does not grow to infinity over time [34].
Testing for Trend Stationarity and Difference Stationarity
There are two preferred methods of testing for stationarity: the Augmented Dickey-Fuller (ADF) test and the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test [34]–[39].
Augmented Dickey-Fuller Test (ADF)
In the ADF test, Equation 17 below is used to represent an AR process [35], [40], [41].
$$\Delta Y_t = \alpha + \delta Y_{t-1} + u_t \tag{17}$$
When $\delta = 0$ the process has a unit root and is non-stationary; when $\delta < 0$ it is stationary. The test sets the null hypothesis that the process has a unit root.
$$H_0: \delta = 0$$
  • 23. 12
$$H_1: \delta < 0$$
A t-statistic is calculated for $\hat{\delta}$, the estimated value of $\delta$, using Equation 19. This test statistic is then compared with the critical value $DF_{Critical}$ taken from the Dickey-Fuller distribution. When
$$t < DF_{Critical} \tag{18}$$
the null hypothesis is rejected.
$$t = \frac{\hat{\delta}}{SE(\hat{\delta})} \tag{19}$$
Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test
This test takes as its null hypothesis that a univariate series is trend stationary, against the alternative that it is a nonstationary unit root process. It does this by first defining the series with Equations 20 and 21 below [42], [43].
$$Y_t = c_t + \beta t + u_{1t} \tag{20}$$
$$c_t = c_{t-1} + u_{2t}' \tag{21}$$
Where
$c_t$ is a random walk process
$u_{1t}$ is a stationary process
$u_{2t}'$ is an independent and identically distributed (iid) process with a mean of 0 and variance $\sigma^2$
$$H_0: \sigma^2 = 0$$
$$H_1: \sigma^2 > 0$$
With the test statistic
$$p_{test} = \frac{\sum_{t=1}^{T} S_t^2}{S^2 T^2} \tag{22}$$
Where
$p_{test}$ is the test statistic
$T$ is the sample size
$S^2$ is the Newey-West estimate of the long-run variance
And
$$S_t = e_1 + e_2 + e_3 + \dots + e_t \tag{23}$$
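The regression in Equation 17 and the statistic in Equation 19 can be sketched in plain Python as follows. This is the simple Dickey-Fuller regression without the extra lagged differences that the augmented test adds, and it is not the procedure used in the report. The function name, the simulated series and the approximate large-sample 5% critical value of -2.86 (for a regression with a constant and no trend) are assumptions made for illustration. Maintained implementations of both tests are available as `adfuller` and `kpss` in `statsmodels.tsa.stattools`.

```python
import math
import random


def dickey_fuller_t(series):
    """t-statistic of delta in dY_t = alpha + delta * Y_{t-1} + u_t."""
    lagged = series[:-1]                                      # Y_{t-1}
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]  # dY_t
    n = len(diffs)
    mean_x = sum(lagged) / n
    mean_y = sum(diffs) / n
    ss_x = sum((x - mean_x) ** 2 for x in lagged)
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, diffs))
    delta = ss_xy / ss_x                                      # OLS slope
    alpha = mean_y - delta * mean_x                           # OLS intercept
    residuals = [y - (alpha + delta * x) for x, y in zip(lagged, diffs)]
    sigma2 = sum(e * e for e in residuals) / (n - 2)          # residual variance
    std_error = math.sqrt(sigma2 / ss_x)                      # SE of delta
    return delta / std_error                                  # Equation 19


# Assumed large-sample 5% critical value (constant, no trend).
APPROX_CRITICAL_5PCT = -2.86

random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(300)]  # stationary by construction
walk, level = [], 0.0
for shock in noise:
    level += shock
    walk.append(level)                                # pure random walk (Equation 13)

for name, data in (("noise", noise), ("walk", walk)):
    t_stat = dickey_fuller_t(data)
    print(name, round(t_stat, 2), "reject H0:", t_stat < APPROX_CRITICAL_5PCT)
```

The one-sided comparison mirrors Equation 18: only a sufficiently negative statistic counts as evidence against a unit root.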
  • 24. 13
• Homoscedasticity
Homoscedasticity describes the variance of a series; it means that the variance does not change with time [44]. Homoscedasticity can be tested graphically by looking at plots of residuals against actuals. It can also be tested using the Engle test for residual heteroscedasticity.
Engle test for residual heteroscedasticity
Residuals of the series are defined as in Equation 24
$$e_t = y_t - \hat{u}_t \tag{24}$$
Where
$\hat{u}_t$ is the conditional mean of the process
$e_t$ is the residual, which is identically distributed with a mean of 0 and variance of 1
$$H_0: \alpha_1 = \alpha_2 = \dots = \alpha_m = 0$$
$$H_1: e_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \alpha_2 e_{t-2}^2 + \dots + \alpha_m e_{t-m}^2 + u_t$$
Where
$u_t$ is a white noise error process
The null hypothesis is that the error at time $t$ does not depend on the errors from previous lags, which means that the series is not heteroscedastic. The test statistic is found using the F statistic for the regression on the squared residuals, and the critical value is found from the $\chi^2$ distribution using $m$ degrees of freedom and the required significance [45], [46].
Cross Correlation and Auto Correlation
Cross correlation is the correlation of a time series with another time series at different lags. It is achieved by shifting one of the series forward or backward by several lags. This allows a lead-lag relationship between two variables to be observed. For example, if the cross correlation between two series is highest at an offset of -3, the two series are most strongly related when one is shifted by three periods. This could mean that one variable signals when the other is going to be affected. Auto correlation is the same as cross correlation except that, instead of using two different series, it uses one series and measures whether the series is correlated with itself. This could be exploited, as it too could signal that a change is occurring.
Cross correlation and auto correlation are both calculated using Pearson’s correlation coefficient. The only adjustment is that one series is shifted before the coefficient is calculated, as illustrated in Figure 2.
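The shifting described above can be sketched in plain Python as follows. The function names, the sign convention (a positive lag means the first series leads the second) and the synthetic data are assumptions made for illustration; they are not taken from the report.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def cross_correlation(x, y, lag):
    """Correlate x_t with y_{t+lag}; a positive lag means x leads y."""
    if lag >= 0:
        return pearson(x[: len(x) - lag], y[lag:])
    return pearson(x[-lag:], y[: len(y) + lag])


def best_lag(x, y, max_lag):
    """Lag in [-max_lag, max_lag] with the largest absolute correlation."""
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda k: abs(cross_correlation(x, y, k)))


# Hypothetical irregular series; y repeats x three periods later.
x = [float((7 * t * t + 3 * t) % 23) for t in range(40)]
y = [0.0, 0.0, 0.0] + x[:-3]
print(best_lag(x, y, max_lag=5))
```

Calling `cross_correlation(x, x, lag)` with the same series in both positions gives the auto correlation at that lag.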
  • 25. 14
Figure 2: Shifting of Time Series
2.4 Feature Selection
When creating a model that approximates functional relationships between inputs and outputs, generally with machine learning and artificial intelligence systems, a problem arises when there are too many inputs, some of which may be irrelevant; these lead to overfitting of the model to the output data. Feature selection was created to deal with this. It is a methodology that eliminates irrelevant or redundant inputs [1], [47]. Many different methods can be applied. Principal component analysis is used to reduce the number of variables by transforming data from a higher to a lower dimension while minimising the information lost [48], [49].
2.5 Exploratory Data Analysis (EDA)
Exploratory data analysis is usually the first step in any data analysis. It is an approach that uses different techniques to increase insight into the data, find which variables are important, detect outliers and anomalies, uncover an underlying structure and test the underlying assumptions before further analysis is performed. It does this by looking at variable distributions, scatterplots, correlation analysis and other multivariate approaches. EDA aims to be a more visual analysis of variables [50]. This is done through the following steps:
• Initial extraction
• Determine number of factors to retain
• Rotation (a transformation)
• Interpret solution
• Calculate factor scores
• Results in table
• Prepare results
2.6 Confirmatory Data Analysis (CDA)
Confirmatory data analysis uses statistical techniques to verify that there is indeed a factor structure between a set of observed variables [50]. The method allows the hypothesis that a relationship exists to be tested. This is done through the following methodology:
  • 26. 15
• Review the relevant theory and research literature to support model specification
• Specify a model
• Determine model identification
• Collect data
• Conduct preliminary descriptive statistical analysis such as scaling, missing data, collinearity measures and finding outliers
• Estimate parameters in the model
• Present and interpret results
2.7 Dividend Yield
Dividend yield is the ratio of the dividends paid to the price of the share. This ratio is seen as the return on investment and is a measure of the cash flow that results from the purchase [51]. However, the dividend irrelevance theory holds that dividends are not relevant to shareholders, because shareholders can sell their shares to achieve an income [52].
2.8 Interest Rate
An interest rate is seen as the cost of borrowing money. When investing money, one would compare the return the investment could yield against the interest rate as a benchmark [53]. The interest rate can also be used as an indicator of economic conditions because of the strong relationships between the value of a currency and the inflation rate. In macroeconomics, the interest rate is used to balance the demand for money in a country. When inflation increases, the demand for money begins to grow, and the central bank responds by increasing the interest rate. The increase in the interest rate raises the cost of borrowing money, and because of this increase in cost the demand for money decreases. This could be seen as a relevant factor in determining a share’s value, as the share price is essentially determined by the demand for and supply of the share [54].
2.9 Principal Component Analysis (PCA)
The main objective of PCA is to reduce the number of variables when working with a large number of them. With so many variables being studied, a graphical display would not be helpful in their analysis.
The procedure of the first principle analysis starts with defining a matrix X with all variables. The matrix is then used to find a linear combination of the variables in the form of a multifactor linear regression model without an intercept. The linear combination will have a matrix of coefficients which describe the linear regression model. These coefficients are chosen to maximise the variance of the linear combination. The sum of all the coefficients squared is constrained such that it must be less than or equal to 1. The above described process is used again except the variance in this model is just the remaining variance of the first principle component.[55]
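The search for the first principal component described in Section 2.9 can be sketched as follows. This is a minimal illustration in Python rather than the Matlab used in this report, and it assumes one common way of solving the maximisation (power iteration on the sample covariance matrix). The function name and the small data set are hypothetical and are not taken from the study.

```python
import math

def first_principal_component(rows, iterations=500):
    """Return (weights, variance) of the first principal component.

    rows: list of observations, each a list with one value per variable.
    Assumes the variables are not all constant.
    """
    n = len(rows)
    p = len(rows[0])
    # Centre each variable so the linear combination needs no intercept.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix of the centred variables.
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Power iteration: apply the covariance matrix and rescale so that the
    # squared coefficients always sum to 1 (the constraint in Section 2.9).
    w = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        v = [sum(cov[a][b] * w[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]
    # Variance of the linear combination defined by w.
    variance = sum(w[a] * cov[a][b] * w[b]
                   for a in range(p) for b in range(p))
    return w, variance

# Hypothetical data: the second variable is roughly twice the first.
data = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
weights, variance = first_principal_component(data)
print(weights, variance)
```

In this example almost all of the variance is captured by a single combination that weights the second variable about twice as heavily as the first, which is the kind of reduction from two variables to one that PCA aims for.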
2.10 Causality
Causality is the term used to describe a relationship between two variables in which one variable causes the fluctuation in the other. This is unlike correlation, which only describes the strength of the relationship between the variables [56]. Causality is usually tested using the Granger-Causality test.

2.11 Linear regression
A simple regression model is used to create a linear relationship between two variables. The relationship is of the form
\[ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \]
The \(\beta\) values are the regression coefficients, and \(\varepsilon_i\) is the error that results from fitting the model to the data. The least squares method finds the \(\beta\) values that minimise the sum of the squared residuals. The formulas below are used to find the coefficients.
\[ \beta_1 = \frac{SS_{xy}}{SS_x} \tag{25} \]
\[ \beta_0 = \bar{y} - \beta_1 \bar{x} \tag{26} \]
where
\[ SS_{xy} = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n} \tag{27} \]
\[ SS_x = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \tag{28} \]
\[ \bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} \tag{29} \]
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \tag{30} \]

2.12 Residual Analysis
Residual analysis is done to see how well a model fits the data and can be seen as the validation of the model. The residual vs actual plot is rich in information. When this plot reveals a pattern, the data is not described by the model. Ideally, after fitting the model to the data, the error should be randomly distributed about the zero residual line [57]. If the residual grows as the actual data increases, the data is heteroscedastic or needs to be transformed using the log transform. Heteroscedastic data can still be transformed using the Box-Cox transformations [58].
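As an illustration of the least squares formulas in Section 2.11 (Equations 25 to 30), the sketch below computes the two coefficients and the residuals discussed in Section 2.12. It is written in Python rather than Matlab, and the function name and data are hypothetical.

```python
def fit_simple_regression(x, y):
    """Return (beta0, beta1) for y = beta0 + beta1 * x by least squares."""
    n = len(x)
    # Equation 27: corrected sum of cross products.
    ss_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    # Equation 28: corrected sum of squares of x.
    ss_x = sum(a * a for a in x) - sum(x) ** 2 / n
    # Equations 29 and 30: sample means.
    y_bar = sum(y) / n
    x_bar = sum(x) / n
    beta1 = ss_xy / ss_x            # Equation 25
    beta0 = y_bar - beta1 * x_bar   # Equation 26
    return beta0, beta1

# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
beta0, beta1 = fit_simple_regression(x, y)
# Residuals: actual value minus the value predicted by the fitted line.
residuals = [yi - (beta0 + beta1 * xi) for xi, yi in zip(x, y)]
print(beta0, beta1, residuals)
```

For this data the fit recovers an intercept of 1 and a slope of 2, and every residual is zero. With real data the residuals would be plotted against the actual values to look for the patterns described in Section 2.12.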
2.13 Related Studies
Justin Colyn [23] focused his research on determining whether the Price-Earnings (P/E) ratio and the Dividend Yield (DY) influence the future price of a select few value-weighted equity capital market indices. His methodology tested for stationarity using the Augmented Dickey-Fuller (ADF) method and then for co-integration using the methodology of Johansen [59]. He then corrected the non-stationary series using the Vector Error Correction Model (VECM) for co-integrated variables and the Vector Auto-Regression (VAR) model at the correct lag, which was found using information criteria such as the Akaike Information Criterion (AIC), the Schwarz Information Criterion (SC) and the Hannan-Quinn Information Criterion (HQ). After these two tests had been done and the corrections made, Granger-Causality tests were performed between the series with the hypothesised relationships. His study concluded that for most indices there is very little evidence of Granger-Causality in either direction, between price and P/E ratio or between price and DY, but that there appeared to be Granger-Causality between price and P/E ratio for the Financial Times Stock Exchange top 100 (FTSE 100) [23].

Enos Lentsoane [60] researched the stock price reaction to dividend changes. This study differs slightly from the one carried out by Justin Colyn in that it uses the event study methodology: the announcement of a change in dividends is treated as an event. This methodology is usually used for qualitative information such as corporate events. The study followed the event study methodology of Kothari & Warner [61], which has nine steps:
1. Define the event to be tested
2. Define the period to be studied in terms of estimation window, event window and event date
3. Define what is meant by abnormal performance
4. Collect event data that meets the data selection criteria defined in step 2
5. Calculate pre-event abnormal returns
6. Calculate abnormal returns over the event window
7. Calculate the Average Abnormal Return (AAR) and Cumulative Abnormal Return (CAR) for the test statistic
8. Determine the critical values (statistical significance) of the AAR and CAR
9. Analyse and interpret the results
Three measures of abnormal return were used: Market Adjusted Abnormal Return (MAAR), Market Model Abnormal Return (MMAR) and Buy-and-Hold Abnormal Return (BHAR). The event study concluded that the market reaction is not statistically significant on the announcement day and that more negative returns occur during the pre-crisis period. It also concluded that the research does not support the irrelevance theory but seems to support the signalling hypothesis [60].
Nondumiso Ngidi [62] researched the effect of strikes in South Africa on the share prices of 49 companies listed on the Johannesburg Stock Exchange (JSE). He too used the event study method. He concluded that stock prices react negatively to the news of strike action and continue to follow a downward trend for approximately 5 days after the strike action has concluded. His study also finds that the JSE is not an efficient market, as it takes days for the market to return to equilibrium after these announcements [62].

Yusuf Varli et al. [24] studied a new correlation coefficient for analysing bivariate time series data. The new coefficient was tried through simulations and its performance compared with that of other correlation coefficients, mainly Pearson's Correlation Coefficient. The conclusions drawn were:
• The coefficient being tested takes lag-difference into account
• It performs better in capturing the cross-independence of two variables over time
• It is more normal than Pearson's coefficient
• It performs better than the Detrended Cross-Correlation Analysis (DCCA) coefficient in capturing the independence and co-integration in non-stationary series [24]

Bwo-Nung Huang et al. [63] used unit root and co-integration models to determine the appropriate Granger-Causal relations between stock prices and exchange rates using the Asian Flu data. The tests included in the methodology are as follows:
• Augmented Dickey-Fuller (ADF)
• Phillips-Perron technique
• Bivariate Vector Auto-Regression (VAR) model
• Granger Causality test
• Co-integration test
This research found that the data from South Korea indicate that exchange rates lead stock prices, and the data from the Philippines suggest that stock prices lead exchange rates with negative correlation. The data from Hong Kong, Malaysia, Singapore, Thailand and Taiwan indicate strong feedback relations, whereas those of Indonesia and Japan fail to reveal any recognisable pattern [63].

3 OBJECTIVES
To determine:
• What factors influence the price of the Richemont share?
• What is the relationship between the more influential factors and the Richemont share price?
• Which factors are more correlated with the share price?
4 APPARATUS
1. Personal computer running the Windows 10 OS with an i7 processor
2. Matlab R2010a
3. Excel 2016

5 METHODOLOGY
From the factors listed in the motivation, 8 factors were chosen:
1. Dividend Yield
2. Volume Traded
3. Rand-Euro exchange rate
4. Rand-Dollar exchange rate
5. Inflation: the average national South African CPI was used
6. Interest rate: the Prime Lending Rate was used
7. Gold Price (Rand/Ounce)
8. Platinum Price (Rand/Ounce)
These factors were chosen because of the nature of the company whose share was selected. Richemont is a luxury goods company based in Switzerland whose main focus is jewellery, luxury watches and writing instruments [64]. Given this focus, it was hypothesised that the gold and platinum prices would have some relationship with the share price. The factors that then needed to be evaluated were exchange rates and economic factors, hence the Rand-Euro exchange rate, Rand-Dollar exchange rate, inflation and the interest rate were used in this study. Dividend yield and volume traded are two of the more commonly used indicators when evaluating a share price, although, in contrast, there is the theory of the irrelevance of dividends [52].

5.1 Collection of Data
1. Each factor's data was split into 6 data collection frequencies: Yearly, Half-Yearly, Quarterly, Monthly, Weekly and Daily.
2. A table with the name of each data set was created and ticked as each data set was collected. This ensured that there were no duplications of the data.
3. The data was collected from the Inet BFA database, using the student portal to access the data.
4. Data that could not be found on the Inet BFA database was found on Stats SA or Quantec EasyData.
5. All the exchange rates, gold prices, platinum prices, share prices, volumes and dividend yields were found through Inet BFA.
6. A 5-year dataset and different frequencies were chosen and all data was output to Excel files. The inflation rate and interest rate were taken from Stats SA and Quantec respectively.
7. Data had to be reordered so that it could be used easily in one workbook per period. In some cases, data needed to be extracted from one period's data into another period's data; for example, the CPI was found yearly at a monthly frequency.

5.2 Processing of Data
1. Plot each time series.
2. Find the correlation values between each factor and the share price within each dataset. This can be done using the built-in correlation function in Excel or Matlab.
3. Load the workbooks with all the data for each frequency and name them.
4. Find the cross correlation and auto correlation using Matlab and the cc function attached in Appendix A. This is done to find the lead and lag relationships between the share price and the factors.
5. When using the code in Appendix A, repeat the process for each frequency: daily, weekly, monthly, quarterly, half-yearly and yearly.
6. After running the code in Appendix A, save each output variable into Excel and display it in graphs.
7. Use the code in Appendix B to test the assumptions listed below at a 5% significance level. The assumptions are:
• Linearity: tested by fitting a linear regression model to the data, obtaining the residuals and then calculating the coefficient of determination. The Matlab function detrend() was used to capture the residuals and the rest was processed in Excel using Equation 12.
• Randomness: tested using the Matlab function runstest().
• Stationarity: tested using the ADF method for unit root stationarity and KPSS for trend stationarity, using adftest() and kpsstest().
• Homoscedasticity: tested using the Engle test for residual heteroscedasticity, using archtest().
• Normality: tested using a one-sample Kolmogorov-Smirnov test, using the Matlab function kstest().
8. Save the results from the assumption tests into Excel and tabulate them.
9. Each series is detrended when the code for the assumption tests is run.
10. The detrended data can be seen as residuals. The residuals must be squared and summed, divided by the standard deviation squared, and the result subtracted from 1 to find the coefficient of determination.
11. In all the tests, except the test for linearity, all zeros means that the test has been passed. Analyse the tables to see which series in the data sets have met all the assumptions.
12. If none meet all the assumptions, look for the variables that have passed the stationarity and homoscedasticity tests.
13. Plot the data and fit a linear best fit to the data in Excel.
14. Find the residuals in Matlab using the detrend function, which removes the line of best fit from the data.
15. Plot the residuals against the actual share price. This can also be done in Matlab using the plot function and then editing the graphs as required.
16. Transform the series that meet the homoscedasticity and stationarity tests.
17. Start the transformations with each factor and iterate between different transformations to improve the coefficient of determination. Residual plots that have an increasing variance usually need to be log transformed.
18. After transforming all the data, plot the regression line through the data and calculate the coefficient of determination for this fit. This can easily be done in Excel.
19. Check whether all the data falls within the 95% confidence intervals by plotting the upper and lower confidence lines with the data and the regression line. This can be achieved using the Matlab code in Appendix C and D.
20. Evaluate the fit of the regression model for the transformed data. This is done by plotting the detrended transformed data against the actual share price data.
21. Find a correlation matrix using the built-in Matlab correlation function.
22. Find the cross correlation using the cc function in Appendix A.
23. Study the correlation matrix and the cross correlations between factors to determine which factors are not necessary.
24. From the factors that are found to be necessary, determine which have very poor correlations with the share price. These factors can be eliminated too.
25. For the remaining factors, investigate how well the linear regression line fits the data from all the datasets that were eliminated because they had not met the homoscedasticity and stationarity assumptions. This can be done using the Matlab code in Appendix F to produce residual vs actual plots for each dataset.
26. Determine the correlation between the predicted share price values and the actual share price values. This can be done using the correlation function in Matlab.
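Steps 9 and 10 of the processing above (detrend a series, then turn the residuals into a coefficient of determination) can be sketched as follows. The sketch is in Python rather than Matlab and is not the Appendix B code. It assumes that the series is detrended against its sample index and that the "standard deviation squared" is the population variance, so that n times the variance equals the total sum of squares. The function names and data are hypothetical.

```python
def detrend_linear(series):
    """Remove the least-squares straight line fitted against the sample index."""
    n = len(series)
    t = list(range(n))
    t_bar = sum(t) / n
    y_bar = sum(series) / n
    slope = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, series))
             / sum((ti - t_bar) ** 2 for ti in t))
    intercept = y_bar - slope * t_bar
    return [yi - (intercept + slope * ti) for ti, yi in zip(t, series)]

def coefficient_of_determination(series):
    """R^2 = 1 - (sum of squared residuals) / (n * variance)."""
    n = len(series)
    residuals = detrend_linear(series)
    mean = sum(series) / n
    # Population variance (an assumption, see the lead-in above).
    variance = sum((v - mean) ** 2 for v in series) / n
    return 1.0 - sum(e * e for e in residuals) / (n * variance)

# Hypothetical series: one exactly linear, one with scatter about a trend.
perfect_line = [10.0, 12.0, 14.0, 16.0, 18.0]
noisy = [10.0, 13.0, 13.0, 17.0, 17.0]
r2_line = coefficient_of_determination(perfect_line)
r2_noisy = coefficient_of_determination(noisy)
print(r2_line, r2_noisy)
```

The exactly linear series gives a coefficient of 1, while the scattered series gives 0.9, so a value close to 1 indicates that a straight line describes the series well.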
5.3 Precautions
1. The data was collected one factor at a time to ensure that no data was left out and that the collection took as little time as possible.
2. Some datasets could not be found because results are not published at that particular interval. For example, interest rates and CPI do not change daily or weekly.
3. The data needed to be checked to ensure that the correct data was in the right Excel workbooks. If it was not, or if a file was corrupted, the data needed to be downloaded again.
4. Ensure that the function is saved in the working directory that is being used.
5. Ensure that when running the Matlab code the variables have been updated and that the right variables have been used.
6. Ensure that the "lengthData" variable in the Matlab code is changed whenever the data collection frequency is changed, because the data is of a different length each time.
6 OBSERVATIONS
All the data that was gathered was plotted in one dataset for each factor. This was done using the lowest resolution of the data available, which was possible because the lowest resolution contains every factor's data. The factors are plotted in Figure 3 and Figure 5 to Figure 11.

Figure 3: Platinum Price Over Time
Figure 4: Share Price Over Time
Figure 5: Gold Price Over Time
Figure 6: Volume Traded Over Time
Figure 7: Rand-Euro Over Time
Figure 8: Dividend Yield Over Time
Figure 9: Rand-Dollar Over Time
Figure 10: Interest Rate Over Time
Figure 11: Inflation Over Time

7 DATA PROCESSING
The data was processed in Matlab and Excel, and many built-in functions were used. The data was processed in 12 ways for interpretation. In some cases the preceding step determined how the data would be processed next. Each step is discussed below and the Matlab code and functions that were used are explained.

7.1 Correlation Values
To produce the correlation values in Table 1, the correlation function in Excel was used. This computes the Pearson Product Moment Correlation discussed in Section 2.3.4.

7.2 Cross Correlation and Auto Correlations
The data displayed in Figure 12 to Figure 17 was processed using the cc function in Appendix A. The function calculates the cross correlation using Equation 7. This is implemented in lines 12 and 19 of the function, with the code
n=(cov(x(k:length(x)),y(1:(length(x)-k+1))))/(sqrt(var(x(k:length(x)))*var(y(1:(length(x)-k+1)))));
and
n=cov(x(1:length(y)-l+1),y(l:length(y)))/(sqrt(var(x(1:length(y)-l+1))*var(y(l:length(y)))));
respectively. The covariance matrix has the variance of each set on its diagonal and is symmetric about the diagonal. For example, to find the covariance between variables 1 and 2 in a 2x2 matrix named A, one would look at either A(1,2) or A(2,1). The shifting of the data was achieved by introducing the variables k and l. The variable k was used to insert values above the halfway point in the cross correlation vector with the code "row((length(x)+k-1))=n(1,2);". The variable l was used to insert values before the halfway point in the cross correlation vector with the code "row((length(x)-l))= n(1,2);". Note that the function had to be called for each factor in each data set against the share price. This can be seen in the Matlab code in Appendix B in the following lines:
AutoCorrel=cc(data(1:lengthData,2),data(1:lengthData,2));
DYCorrel=cc(data(1:lengthData,2),data(1:lengthData,5));
VolumeCorrel=cc(data(1:lengthData,2),data(1:lengthData,7));
Rand_Euro_Correl=cc(data(1:lengthData,2),data(1:lengthData,9));
Rand_Dollar_Correl=cc(data(1:lengthData,2),data(1:lengthData,11));
inflationCorrel=cc(data(1:lengthData,2),data(1:lengthData,13));
interestCorrel=cc(data(1:lengthData,2),data(1:lengthData,15));
GoldCorrel=cc(data(1:lengthData,2),data(1:lengthData,17));
PlatinumCorrel=cc(data(1:lengthData,2),data(1:lengthData,19));
alldataCC=[AutoCorrel;DYCorrel;VolumeCorrel;Rand_Euro_Correl;Rand_Dollar_Correl;inflationCorrel;interestCorrel;GoldCorrel;PlatinumCorrel];
The variable "lengthData" was changed manually for each dataset, and the matrix data needed to be changed for each data set as well.

7.3 Assumption tests
The assumptions were tested using the following built-in Matlab functions:
detrended(1:lengthData,i)= detrend(data(1:lengthData,i),1);
arch(i)=archtest(detrended(:,i));
[h(i),p(i),k(i),c(i)]=kstest(data(1:lengthData,i));
r(i)=runstest(data(1:lengthData,i));
adf(i)=adftest(data(1:lengthData,i));
kpss(i)=kpsstest(data(1:lengthData,i));
The detrend function as used above returns the residuals of a linear model fitted to the data. The archtest function tested the residuals for heteroscedasticity using the Engle test for residual heteroscedasticity, as described in the homoscedasticity part of Section 2.3.3, at a 5% significance level. The kstest function applied the Kolmogorov-Smirnov test at a 5% significance level. The adftest function used the Augmented Dickey-Fuller method to test whether the series is unit root stationary at a 5% significance level. The kpsstest function used the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test to test the hypothesis that the series is trend stationary, also at a 5% significance level. For a series to meet all the assumptions being tested, the output needed to be a row of zeros. This code needed to be run for every dataset.
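To illustrate the kind of decision the randomness test in Section 7.3 returns, the sketch below implements a simplified two-sided runs test about the mean using the normal approximation. It is written in Python, it is not the Matlab runstest() implementation, and its result may differ from that function for short series. The function name, the critical value of 1.96 (5% significance) and the example series are assumptions made for illustration.

```python
import math

def runs_test(series, z_critical=1.96):
    """Return (reject, z) for a two-sided runs test about the mean.

    reject is True when the order of the values does not look random.
    Assumes the series has values both above and below its mean.
    """
    mean = sum(series) / len(series)
    # Classify each value as above or below the mean; drop exact ties.
    signs = [v > mean for v in series if v != mean]
    n_above = sum(signs)
    n_below = len(signs) - n_above
    n = n_above + n_below
    # A run is a block of consecutive values on the same side of the mean.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    expected = 2.0 * n_above * n_below / n + 1.0
    variance = (2.0 * n_above * n_below * (2.0 * n_above * n_below - n)
                / (n * n * (n - 1.0)))
    z = (runs - expected) / math.sqrt(variance)
    return abs(z) > z_critical, z

# Hypothetical series: a steady trend has far too few runs to be random.
trending = [float(v) for v in range(1, 21)]
# Hypothetical series with an irregular order of high and low values.
mixed = [1.0, 1.0, 2.0, 1.0, 2.0, 2.0, 1.0, 2.0, 2.0, 1.0,
         1.0, 2.0, 1.0, 1.0, 2.0, 2.0, 1.0, 2.0, 1.0, 2.0]
trend_reject, trend_z = runs_test(trending)
mixed_reject, mixed_z = runs_test(mixed)
print(trend_reject, trend_z, mixed_reject, mixed_z)
```

A trending series such as a share price over five years produces very few runs and is rejected as non-random, which is consistent with most series in this study failing the random test.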
7.4 Coefficient of determination
This was done manually using the detrended data and the original data for each series. The sum of the squared residuals was calculated, followed by the variance of the original data. These were then substituted into Equation 12. A sample calculation for the yearly share price is shown below.
\[ R^2 = 1 - \frac{\sum \varepsilon_i^2}{n S^2} \]
\[ \varepsilon_i^2 = \begin{bmatrix} 7304921.91 \\ 1322521.9 \\ 2192712.23 \\ 902724.77 \\ 150292.829 \\ 3880524.77 \end{bmatrix}, \quad S^2 = 7586437.77, \quad n = 6 \]
\[ R^2 = 1 - \frac{20753698.42}{6 \times 7586437.77} \]
\[ R^2 = 0.544 \]

7.5 Confidence intervals of regression plots
The confidence intervals of the regression plots were produced in Matlab using the code attached in Appendix C and D. The regress function in Matlab was used to obtain the values for the equations of the three lines: the upper line at the 95% confidence level, the lower line at the 95% confidence level and the linear regression line.

7.6 Residual analysis for transformed data
The residual plot was produced using the detrend function in Matlab and the scatter function to plot the residuals against the actual values.

7.7 Correlation of factors
The correlation of factors was calculated in Matlab using the correlation function, which also uses the Pearson Product Moment Correlation coefficient shown in Section 2.3.4.

7.8 Cross Correlation of all factors
The cross correlation was calculated using the Matlab code attached in Appendix E, which uses the function "cc" attached in Appendix A.

7.9 Residuals for fitting regression model to each data set's volume traded data
The equation of the regression model was obtained from the graphs created in Excel. This equation was entered into Matlab and tested to see whether it holds for each data set. The code in Appendix F was used to plot the graphs required and to determine the correlation coefficients. The code is explained below.
1. cas=(max(Dailydata(1:1250,13).^(3))-min(Dailydata(1:1250,13).^(3)))/1249;
2. xvl=min(Dailydata(1:1250,13).^(3)):cas:max(Dailydata(1:1250,13).^(3));
3. SPdy=transpose((-2E-25*(xvl))+21.94);
4. dyres=(Dailydata(1:1250,2))-SPdy.^3;
5. scatter((Dailydata(1:25:1250,2)),dyres(1:25:1250,1),'DisplayName','dyres(1:1250,1)','YDataSource','dyres(1:1250,1)');figure(gcf)
In line 1 the variable "cas" determines the step of the x values in the graphs. Line 2 creates a vector of x values. The variable "SPdy" is the predicted share price. The variable "dyres" is the residual between the actual and predicted values. Line 5 creates a scatter plot of the residuals against the actual values. The chart is then edited as required in the Matlab graphical user interface (GUI), where the line of best fit can be added and the equation of the line displayed. Residuals can also be displayed using the Matlab GUI.

8 RESULTS
Only the relevant results are placed in this section; all other results are attached in Appendix G. The section comprises 12 subsections: Correlation Values, Cross Correlation and Auto Correlation, Assumption Tests, Coefficients of Determination, Regression of Price vs Factors, Residual vs Actual plots of data before transformation, Regression of Price vs Factors after transformation, Confidence Intervals on Regression Plots, Residual Analysis of transformed data, Correlation of Factors, Cross Correlation of Factors and Residuals for fitting the regression model to each period's volume data. Each is briefly introduced below.

8.1 Correlation Values
The correlation values in Table 1 are between each factor and the share price. This was done to find which factors impact the share price most within each data set. Six data sets were used: Daily, Weekly, Monthly, Quarterly, Half-Yearly and Yearly. The blank spaces in the table are due to not having the data required to perform the calculation for that data set. The absolute average was calculated to ensure that, when looking at the average correlation for each dataset or factor, the average is not reduced by a few inverse relationships.
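The effect of taking the absolute average can be illustrated with the Yearly column of Table 1. The sketch below is in Python and the function name is hypothetical. It compares a plain average, which is pulled down by the strongly negative volume correlation, with the absolute average.

```python
def absolute_average(correlations):
    """Mean of the absolute correlations, skipping entries with no data."""
    present = [abs(r) for r in correlations if r is not None]
    return sum(present) / len(present)

# Yearly correlations from Table 1, in the order DY, Volume, Rand-Euro,
# Rand-Dollar, Inflation, Interest, Gold, Platinum.
yearly = [0.687, -0.987, 0.903, 0.783, 0.913, 0.374, 0.294, 0.958]

signed_average = sum(yearly) / len(yearly)
abs_average = absolute_average(yearly)
print(round(signed_average, 3), round(abs_average, 3))
```

The plain average is about 0.49, whereas the absolute average is about 0.737, which matches the value reported for the yearly data set. Blank cells can be passed as None so that they are left out of the average rather than counted as zero.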
Table 1: Correlation Values of Each Factor with Share Price

| Factor | Yearly | Weekly | Monthly | Quarterly | Half-Yearly | Daily | Absolute Average |
|---|---|---|---|---|---|---|---|
| DY | 0.687 | 0.451 | 0.471 | 0.460 | 0.360 | 0.458 | 0.481 |
| Volume | -0.987 | -0.667 | -0.823 | -0.907 | -0.731 | -0.489 | 0.767 |
| Rand-Euro | 0.903 | 0.825 | 0.844 | 0.843 | 0.763 | 0.541 | 0.787 |
| Rand-Dollar | 0.783 | 0.740 | 0.752 | 0.732 | 0.591 | 0.699 | 0.716 |
| Inflation | 0.913 | | 0.827 | 0.807 | 0.603 | | 0.788 |
| Interest | 0.374 | | 0.490 | 0.391 | 0.188 | 0.565 | 0.401 |
| Gold | 0.294 | 0.294 | 0.298 | 0.251 | 0.127 | 0.450 | 0.286 |
| Platinum | 0.958 | 0.594 | 0.612 | 0.638 | 0.667 | -0.178 | 0.609 |
| Absolute Average | 0.737 | 0.595 | 0.640 | 0.627 | 0.504 | 0.483 | |

8.2 Cross Correlations and Auto Correlations
Figures 12 to 17 show the cross correlations and auto correlations for each factor within each dataset. Note that a lag represents a different time step in each dataset's graph because of the resolution of the data. The number of data points in each graph also decreases as the data set moves from daily to yearly. The daily data set has 1250 points and no line is fitted through it, as the movement of the correlation with each lag is easy to see. The figures with fewer data points have a line passing through the points, because it is otherwise difficult to follow the points and see any patterns that may exist. Note that the autocorrelation in each figure is symmetrical; this is used as validation that the "cc" function in Appendix A works correctly. All the factors' data is plotted in each figure to reduce the number of figures that need to be analysed and to make the data easier to read when analysing.
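The lagged correlation and the symmetry check described in Section 8.2 can be sketched as follows. This is a Python illustration, not the cc function from Appendix A: it returns the Pearson correlation of the overlapping samples for each lag, and its lag sign convention, function names and data are assumptions made for the example.

```python
import math

def pearson(a, b):
    """Pearson product moment correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

def cross_correlation(x, y, max_lag):
    """Map each lag in [-max_lag, max_lag] to a correlation.

    A negative lag -k pairs x[t] with y[t + k]; a positive lag k pairs
    x[t + k] with y[t]. Only the overlapping samples are used.
    """
    n = len(x)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[lag:], y[:n - lag]
        else:
            a, b = x[:n + lag], y[-lag:]
        result[lag] = pearson(a, b)
    return result

# Hypothetical series x, and y that repeats x two samples later.
x = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0,
     5.0, 3.0, 5.0, 8.0, 9.0, 7.0, 9.0, 3.0]
y = [0.0, 0.0] + x[:-2]

cc = cross_correlation(x, y, 4)
best_lag = max(cc, key=cc.get)
auto = cross_correlation(x, x, 4)
print(best_lag, cc[best_lag])
```

Here the correlation peaks at lag -2, which under this convention means y follows x by two samples. The autocorrelation of x is the same at lag k and lag -k, which is the symmetry used above to validate the cross correlation function.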
Figure 12: Cross Correlation of Factors Within Daily Data Set
Figure 13: Cross Correlation of Factors Within Weekly Data Set
Figure 14: Cross Correlation of Factors Within Monthly Data Set
Figure 15: Cross Correlation of Factors Within Quarterly Data Set
Figure 16: Cross Correlation of Factors Within Half-Yearly Data Set
Figure 17: Cross Correlation of Factors Within Yearly Data Set

8.3 Assumption Tests
Table 2 to Table 6 each summarise the results of the tests done for each factor within one dataset. A 0 in a block means that the assumption is met; a 1 represents failure to meet the assumption.
Table 2: Assumption Test Results for Each Series in The Daily Data Set

| Series | Homoscedasticity | Normal Test | Random Test | Unit-Root Stationary | Trend Stationary |
|---|---|---|---|---|---|
| Close | 1 | 1 | 1 | 0 | 0 |
| Volume | 1 | 1 | 1 | 0 | 0 |
| Rand-Euro | 1 | 1 | 1 | 1 | 1 |
| Rand-Dollar | 1 | 1 | 1 | 0 | 0 |
| Inflation Rate | 1 | 1 | 1 | 0 | 0 |
| Gold Price | 1 | 1 | 1 | 0 | 0 |
| Platinum Price | 1 | 1 | 1 | 0 | 0 |

Table 3: Assumption Test Results for Each Series in The Weekly Data Set

| Series | Homoscedasticity | Normal Test | Random Test | Unit-Root Stationary | Trend Stationary |
|---|---|---|---|---|---|
| Close | 1 | 1 | 1 | 0 | 0 |
| Volume | 1 | 1 | 1 | 1 | 1 |
| Rand-Euro | 1 | 1 | 1 | 0 | 0 |
| Rand-Dollar | 1 | 1 | 1 | 1 | 1 |
| Gold Price | 1 | 1 | 1 | 0 | 0 |
| Platinum Price | 1 | 1 | 1 | 0 | 0 |

Table 4: Assumption Test Results for Each Series in The Monthly Data Set

| Series | Homoscedasticity | Normal Test | Random Test | Unit-Root Stationary | Trend Stationary |
|---|---|---|---|---|---|
| Close | 1 | 1 | 1 | 0 | 0 |
| Volume | 1 | 1 | 1 | 0 | 0 |
| Rand-Euro | 1 | 1 | 1 | 0 | 0 |
| Rand-Dollar | 1 | 1 | 1 | 1 | 1 |
| Inflation Rate | 1 | 1 | 1 | 1 | 1 |
| Interest Rate | 1 | 1 | 1 | 0 | 0 |
| Gold Price | 1 | 1 | 1 | 0 | 0 |
| Platinum Price | 0 | 1 | 1 | 0 | 0 |

Table 5: Assumption Test Results for Each Series in The Quarterly Data Set

| Series | Homoscedasticity | Normal Test | Random Test | Unit-Root Stationary | Trend Stationary |
|---|---|---|---|---|---|
| Close | 1 | 1 | 1 | 0 | 0 |
| Volume | 1 | 1 | 1 | 0 | 0 |
| Rand-Euro | 0 | 1 | 1 | 0 | 0 |
| Rand-Dollar | 0 | 1 | 1 | 1 | 1 |
| Inflation Rate | 0 | 1 | 1 | 1 | 1 |
| Interest Rate | 0 | 1 | 1 | 0 | 0 |
| Gold Price | 0 | 1 | 0 | 0 | 0 |
| Platinum Price | 0 | 1 | 0 | 0 | 0 |

Table 6: Assumption Test Results for Each Series in The Half-Yearly Data Set

| Series | Homoscedasticity | Normal Test | Random Test | Unit-Root Stationary | Trend Stationary |
|---|---|---|---|---|---|
| Close | 0 | 1 | 1 | 0 | 0 |
| Volume | 0 | 1 | 1 | 0 | 0 |
| Rand-Euro | 0 | 1 | 0 | 0 | 0 |
| Rand-Dollar | 0 | 1 | 1 | 0 | 0 |
| Inflation Rate | 0 | 1 | 1 | 1 | 1 |
| Interest Rate | 0 | 1 | 1 | 0 | 0 |
| Gold Price | 0 | 1 | 1 | 0 | 0 |
| Platinum Price | 0 | 1 | 0 | 0 | 0 |

8.4 Coefficient of Determination
Table 7 is a summary of the coefficients of determination for each factor within each data set. The coefficient of determination shows how well a straight line can be fitted through the data for each factor within each data set. The share price was also tested for linearity in this way.
Table 7: Coefficient of Determination for Each Series

| Series | Yearly | Weekly | Monthly | Quarterly | Half-Yearly | Daily | Average |
|---|---|---|---|---|---|---|---|
| Share Price | 0.544 | 0.701 | 0.715 | 0.615 | 0.455 | 0.706 | 0.606 |
| DY | 0.168 | 0.330 | 0.357 | -0.015 | 0.095 | 0.324 | 0.187 |
| Volume | 0.933 | 0.286 | 0.451 | 0.565 | 0.754 | 0.152 | 0.598 |
| Rand-Euro | 0.835 | 0.823 | 0.831 | 0.838 | 0.827 | 0.573 | 0.831 |
| Rand-Dollar | 0.888 | 0.917 | 0.923 | 0.929 | 0.906 | 0.877 | 0.913 |
| Inflation | 0.997 | No Data | 0.993 | 0.993 | 0.993 | No Data | 0.994 |
| Interest | 0.738 | No Data | 0.643 | 0.556 | 0.702 | 0.734 | 0.660 |
| Gold | 0.695 | 0.456 | 0.447 | 0.472 | 0.671 | 0.609 | 0.548 |
| Platinum | 0.817 | 0.205 | 0.222 | 0.318 | 0.607 | 0.033 | 0.434 |
| Average | 0.735 | 0.531 | 0.620 | 0.586 | 0.668 | 0.501 | |

8.5 Regression of Price Vs Factors
Figure 18 to Figure 25 show regression plots of the share price vs each of the factors for the half-yearly data set.

Figure 18: Linear Regression of DY Vs Share price
Figure 19: Linear Regression of Volume Traded Vs Share price
Figure 20: Linear Regression of Rand-Euro Exchange Rate Vs Share price
Figure 21: Linear Regression of Rand-Dollar Exchange Rate Vs Share price
Figure 22: Linear Regression of Inflation Vs Share price
Figure 23: Linear Regression of Interest Rate Vs Share price
Figure 24: Linear Regression of Gold Price Vs Share price
Figure 25: Linear Regression of Platinum Price Vs Share price

8.6 Residual vs Actual
Following the previous figures, which showed the regression plots of the factors vs the share price, this section plots the residuals of the linear fits in Section 8.5. This was done to look for patterns in the residuals, increasing variance in the residuals and randomness in the distribution of the residuals. The residuals are plotted in Figure 26 to Figure 34.

Figure 26: Residual Plot for Share Price
Figure 27: Residual Plot for Dividend Yield
Figure 28: Residual Plot for Volume Traded
Figure 29: Residual Plot for Rand-Euro Exchange Rate
Figure 30: Residual Plot for Rand-Dollar Exchange Rate
Figure 31: Residual Plot for Inflation
Figure 32: Residual Plot for Interest Rate
Figure 33: Residual Plot for Gold Price
Figure 34: Residual Plot for Platinum Price

8.7 Regression of Price Vs Factors After Transformations
Following the transformation of the factors and the share price, each series of factor vs share price is plotted below. Note that the gold price and the interest rate are not included in the figures below, because transforming the data did not improve their linearity.
Figure 35: Linear Regression of Dividend Yield Vs Share Price After Transformation
Figure 36: Linear Regression of Volume Traded Vs Share Price After Transformation
Figure 37: Linear Regression of Rand-Euro Exchange Rate Vs Share Price After Transformation
Figure 38: Linear Regression of Rand-Dollar Exchange Rate Vs Share Price After Transformation
  • 59. 48 Figure 39: Linear Regression of Inflation Vs Share Price After Transformation Figure 40: Linear Regression of Platinum Price Vs Share Price After Transformation 8.8 Confidence Intervals on regression plots The figures below show each regression plot after transformation. Each plot has an upper and a lower confidence bound, which indicate whether the data contain a significant outlier. The confidence level is 95%.
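The bounds described above can be sketched as follows. The report does not state whether its bands are confidence intervals for the fitted line or prediction intervals for individual points; this sketch assumes prediction intervals, since those are the ones normally used to flag outliers. The data are hypothetical, and the critical value `t_crit` is supplied by hand from a Student-t table rather than computed.

```python
import math


def prediction_band(x, y, t_crit):
    """Lower and upper prediction bounds around a least-squares line.

    t_crit is the two-sided Student-t critical value for n - 2 degrees
    of freedom at the chosen confidence level.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    s = math.sqrt(sse / (n - 2))  # residual standard error
    band = []
    for xi, fi in zip(x, fitted):
        half = t_crit * s * math.sqrt(1 + 1 / n + (xi - mean_x) ** 2 / sxx)
        band.append((fi - half, fi + half))
    return band


# Hypothetical transformed factor and share price (not the report's data).
factor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
price = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 30.0]

t_crit = 2.447  # two-sided 95% value for n - 2 = 6 degrees of freedom
band = prediction_band(factor, price, t_crit)
outside = [i for i, (p, (lo, hi)) in enumerate(zip(price, band))
           if not lo <= p <= hi]
print(band[0], outside)
```

With few observations a single extreme point inflates the residual standard error and widens the band, so a point can look unusual on the plot and still fall inside the bounds.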
  • 60. 49 Figure 41: Regression Plot of transformed Dividend Yield vs transformed Share Price with Confidence Intervals Figure 42: Regression Plot of transformed Rand-Euro vs transformed Share Price with Confidence Intervals Figure 43: Regression Plot of transformed Rand-Dollar vs transformed Share Price with Confidence Intervals
  • 61. 50 Figure 44: Regression Plot of transformed Inflation vs transformed Share Price with Confidence Intervals Figure 45: Regression Plot of transformed Platinum Price vs transformed Share Price with Confidence Intervals 8.9 Residual Analysis for transformed data Here again the linear fit for each series is evaluated by observing the residuals. The residuals are analysed for patterns within the plot and for error growth as the factor increases.
  • 62. 51 Figure 46: Residual Plot of transformed regression between Share Price and DY Figure 47: Residual Plot of transformed regression between Share Price and Inflation Figure 48: Residual Plot of transformed regression between Share Price and Platinum Price Figure 49: Residual Plot of transformed regression between Share Price and Rand-Dollar
  • 63. 52 Figure 50: Residual Plot of transformed regression between Share Price and Rand-Euro 8.10 Correlation of Factors The table below gives the correlation between each pair of observed variables in the half-yearly dataset. Only half of the matrix is shown as it is symmetrical about the diagonal. Colour has been added to the table to highlight strong negative correlations in dark red and strong positive correlations in dark green. The weaker the correlation, the lighter the colour of the block.
  • 64. 53 Table 8: Correlation matrix of Untransformed Factors

                Share Price      DY  Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
Share Price           1.000   0.360  -0.731      0.763        0.591      0.603          0.188       0.143           0.667
DY                            1.000  -0.008     -0.065       -0.042     -0.098         -0.279       0.001          -0.245
Volume                                1.000     -0.761       -0.704     -0.859         -0.642       0.178          -0.809
Rand-Euro                                        1.000        0.934      0.886          0.635       0.370           0.804
Rand-Dollar                                                   1.000      0.942          0.760       0.409           0.639
Inflation                                                                1.000          0.850       0.214           0.747
Interest Rate                                                                           1.000       0.066           0.603
Gold Price                                                                                          1.000           0.110
Platinum Price                                                                                                      1.000

8.11 Cross Correlation of all Factors The tables below give the cross correlations, at lags of -3 to 3 periods, between each factor and all the observed data in the half-yearly dataset. Here too, the tables are colour coded with a colour gradient to show the strength of the correlation. Green means positive correlation and red means negative correlation.
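The correlation matrix and the lagged cross correlations can both be built from one Pearson helper. The sketch below is illustrative only: the series are hypothetical, and the sign convention for the lag (lag k pairs x at period t with y at period t + k) is an assumption, since the report does not state which convention its tables use.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def lagged_corr(x, y, lag):
    """Correlate x[t] with y[t + lag] over the overlapping observations."""
    if lag >= 0:
        xs, ys = x[:len(x) - lag], y[lag:]
    else:
        xs, ys = x[-lag:], y[:len(y) + lag]
    return pearson(xs, ys)


def upper_triangle(series):
    """Pairwise correlations for a dict of named series, upper half only."""
    names = list(series)
    return {(a, b): pearson(series[a], series[b])
            for i, a in enumerate(names) for b in names[i:]}


# Hypothetical series: y repeats x one period later.
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0]
y = [0.0] + x[:-1]

matrix = upper_triangle({"x": x, "y": y})
by_lag = {lag: lagged_corr(x, y, lag) for lag in range(-3, 4)}
print(round(matrix[("x", "y")], 3), round(by_lag[1], 3))
```

Each lag drops observations from the ends of the series, so with a short half-yearly dataset the coefficients at lags of 2 or 3 rest on very few points and should be read with caution.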
  • 65. 54 Table 9: Cross correlation of the Platinum Price with each factor

Lag  Share Price     DY  Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
 -3         0.10   0.39   -0.21      -0.11         0.26       0.42           0.29        0.66           -0.12
 -2         0.43   0.11   -0.40       0.47         0.60       0.53           0.11        0.37            0.03
 -1         0.63  -0.18   -0.67       0.79         0.68       0.66           0.22        0.51            0.65
  0         0.67  -0.24   -0.81       0.80         0.64       0.75           0.60        0.11            1.00
  1         0.58   0.01   -0.77       0.57         0.51       0.70           0.69       -0.12            0.65
  2         0.48   0.63   -0.56       0.33         0.48       0.58           0.48        0.00            0.03
  3         0.28   0.45   -0.06       0.61         0.79       0.59           0.38        0.50           -0.12

Table 10: Cross Correlation of the Gold Price with Each Factor

Lag  Share Price     DY  Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
 -3         0.41  -0.12   -0.28       0.40         0.22       0.32           0.23        0.01            0.50
 -2         0.16   0.44   -0.31      -0.09        -0.03       0.22           0.37       -0.29            0.00
 -1         0.39   0.76   -0.32       0.29         0.39       0.28           0.06       -0.15           -0.12
  0         0.14   0.00    0.18       0.37         0.41       0.21           0.07        1.00            0.11
  1        -0.13  -0.61   -0.64       0.46         0.59       0.75           0.88       -0.15            0.51
  2        -0.37  -0.48   -0.69       0.35         0.48       0.66           0.73       -0.29            0.37
  3        -0.11  -0.32   -0.36       0.62         0.43       0.36           0.38        0.01            0.66

Table 11: Cross Correlation of the Interest Rate with Each Factor

Lag  Share Price     DY  Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
 -3         0.74   0.67   -0.80       0.67         0.93       0.93           0.59        0.38            0.38
 -2         0.79   0.68   -0.75       0.92         0.99       0.93           0.64        0.73            0.48
 -1         0.53  -0.02   -0.58       0.88         0.91       0.92           0.86        0.88            0.69
  0         0.19  -0.28   -0.64       0.64         0.76       0.85           1.00        0.07            0.60
  1        -0.05  -0.10   -0.56       0.47         0.73       0.80           0.86        0.06            0.22
  2        -0.37  -0.21   -0.23       0.44         0.71       0.67           0.64        0.37            0.11
  3        -0.72  -0.73   -0.17       0.60         0.66       0.60           0.59        0.23            0.29
  • 66. 55 Table 12: Cross Correlation of Inflation with Each Factor

Lag  Share Price     DY  Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
 -3         0.86   0.72   -0.92       0.80         0.97       0.98           0.60        0.36            0.59
 -2         0.88   0.70   -0.86       0.90         0.95       1.00           0.67        0.66            0.58
 -1         0.72   0.17   -0.78       0.90         0.94       0.99           0.80        0.75            0.70
  0         0.60  -0.10   -0.86       0.89         0.94       1.00           0.85        0.21            0.75
  1         0.46  -0.11   -0.79       0.86         0.94       0.99           0.92        0.28            0.66
  2         0.17  -0.09   -0.72       0.83         0.95       1.00           0.93        0.22            0.53
  3        -0.39  -0.27   -0.54       0.75         0.91       0.98           0.93        0.32            0.42

Table 13: Cross Correlation of the Rand-Dollar Exchange Rate with Each Factor

Lag  Share Price     DY  Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
 -3         0.87   0.70   -0.87       0.84         0.87       0.91           0.66        0.43            0.79
 -2         0.79   0.71   -0.82       0.72         0.80       0.95           0.71        0.48            0.48
 -1         0.71   0.39   -0.77       0.80         0.89       0.94           0.73        0.59            0.51
  0         0.59  -0.04   -0.70       0.93         1.00       0.94           0.76        0.41            0.64
  1         0.27  -0.30   -0.67       0.81         0.89       0.94           0.91        0.39            0.68
  2        -0.05  -0.26   -0.79       0.67         0.80       0.95           0.99       -0.03            0.60
  3        -0.50  -0.29   -0.53       0.63         0.87       0.97           0.93        0.22            0.26

Table 14: Cross Correlation of the Rand-Euro Exchange Rate with Each Factor

Lag  Share Price     DY  Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
 -3         0.63   0.68   -0.66       0.52         0.63       0.75           0.60        0.62            0.61
 -2         0.70   0.48   -0.75       0.57         0.67       0.83           0.44        0.35            0.33
 -1         0.80   0.28   -0.85       0.81         0.81       0.86           0.47        0.46            0.57
  0         0.76  -0.06   -0.76       1.00         0.93       0.89           0.64        0.37            0.80
  1         0.45  -0.18   -0.76       0.81         0.80       0.90           0.88        0.29            0.79
  2         0.18   0.10   -0.82       0.57         0.72       0.90           0.92       -0.09            0.47
  3        -0.14   0.18   -0.25       0.52         0.84       0.80           0.67        0.40           -0.11
  • 67. 56 Table 15: Cross Correlation of the Volume Traded with Each Factor

Lag  Share Price     DY  Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
 -3        -0.35  -0.65    0.54      -0.25        -0.53      -0.54          -0.17       -0.36           -0.06
 -2        -0.72  -0.25    0.62      -0.82        -0.79      -0.72          -0.23       -0.69           -0.56
 -1        -0.62   0.28    0.69      -0.76        -0.67      -0.79          -0.56       -0.64           -0.77
  0        -0.73  -0.01    1.00      -0.76        -0.70      -0.86          -0.64        0.18           -0.81
  1        -0.87  -0.25    0.69      -0.85        -0.77      -0.78          -0.58       -0.32           -0.67
  2        -0.47  -0.24    0.62      -0.75        -0.82      -0.86          -0.75       -0.31           -0.40
  3         0.08  -0.11    0.54      -0.66        -0.87      -0.92          -0.80       -0.28           -0.21

Table 16: Cross Correlation of the Dividend Yield with Each Factor

Lag  Share Price     DY  Volume  Rand-Euro  Rand-Dollar  Inflation  Interest Rate  Gold Price  Platinum Price
 -3         0.17  -0.49   -0.11       0.18        -0.29      -0.27          -0.73       -0.32            0.45
 -2         0.32  -0.36   -0.24       0.10        -0.26      -0.09          -0.21       -0.48            0.63
 -1         0.25   0.41   -0.25      -0.18        -0.30      -0.11          -0.10       -0.61            0.01
  0         0.36   1.00   -0.01      -0.06        -0.04      -0.10          -0.28        0.00           -0.24
  1         0.28   0.41    0.28       0.28         0.39       0.17          -0.02        0.76           -0.18
  2        -0.34  -0.36   -0.25       0.48         0.71       0.70           0.68        0.44            0.11
  3        -0.45  -0.49   -0.65       0.68         0.70       0.72           0.67       -0.12            0.39

8.12 Residuals for fitting regression model to each period's volume traded data Figure 51 to Figure 57 analyse the fit of the linear model that was found for the volume traded against the share price. The model was evaluated to determine how well it explains the relationship between the volume traded and the share price in the other datasets. A linear trend can be seen between the residual and the actual share price in most of the datasets. Table 17 shows the correlation of the predicted and the actual observations.
  • 68. 57 Figure 51: Error for Fitting Equation to Daily Data Figure 52: Error for Fitting Equation to Weekly Data Figure 53: Error for Fitting Equation to Monthly Data Figure 54: Error for Fitting Equation to Quarterly Data
  • 69. 58 Figure 55: Error for Fitting Equation to Yearly Data Figure 56: Error for Fitting Equation to Half-yearly Data Table 17: Correlation of predicted and actual share price

Dataset      Correlation Coefficient
Daily        0.207
Weekly       0.561
Monthly      0.720
Quarterly    0.722
Half-yearly  0.865
Yearly       0.904
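The comparison in Table 17 amounts to fitting the volume traded equation on one dataset, applying it unchanged to a dataset sampled at another frequency, and correlating the predictions with the actual share prices. The sketch below illustrates that procedure under stated assumptions: the numbers are hypothetical, the model is a plain straight line, and fitting on the half-yearly sample is an assumption made for the example.

```python
import math


def fit_line(x, y):
    """Least-squares straight line y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sample used to fit the equation (not the report's data).
half_yearly_volume = [10.0, 8.0, 6.0, 4.0, 2.0]
half_yearly_price = [20.0, 26.0, 29.0, 36.0, 40.0]
intercept, slope = fit_line(half_yearly_volume, half_yearly_price)

# Hypothetical sample at another frequency, scored with the same equation.
monthly_volume = [9.0, 7.0, 5.0, 3.0, 1.0, 8.0]
monthly_price = [22.0, 30.0, 33.0, 37.0, 45.0, 28.0]
predicted = [intercept + slope * v for v in monthly_volume]
errors = [actual - pred for actual, pred in zip(monthly_price, predicted)]
r = pearson(predicted, monthly_price)
print(round(slope, 2), round(r, 3))
```

Plotting `errors` against `monthly_price` gives the kind of residual vs actual view shown in the figures, where a visible trend suggests the equation misses part of the relationship at that sampling frequency.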