Volume 1, Oct 2017
Prediction of Stock Performance in the Indian Stock Market Using Machine Learning
Santosh Kumar Joshi,
Orange Business Services, Cyber City Gurgaon India
ABSTRACT
I use machine learning, with various financial ratios as independent variables, to investigate indicators that significantly affect the performance of stocks actively traded on the Indian stock
market. The study sample consists of ratios derived from the quarterly financial data of 50 large market capitalization companies (the Nifty 50) over a one-year period. The study identifies and examines 14
financial indicators that can classify the companies into two categories, "Outperformer" or "Underperformer", based on their rate of return. The paper asserts that the model developed can
enhance an investor's stock price forecasting ability. Macroeconomic variables such as GDP, the unemployment rate, and the inflation rate of the country, which can also influence the share
price, were not taken into account, however. There is also always a risk of geopolitical developments, which cause uncertainty globally. The paper discusses the practical implications of using ML
to predict the probability of good stock performance. The author states that the model can be used by investors, fund managers, and investment companies, at their own risk, to enhance their
ability to select outperforming stocks.
Keywords: Classification of stock performance, Indian stock market, market rate of return, financial ratios, NIFTY 50
1. INTRODUCTION
It is important for shareholders and potential investors to use relevant financial information to enable them to make good investment decisions in the stock market. Predicting stock
performance is certainly very complicated and difficult. In the history of stock performance literature, no comprehensive, accurate model has been suggested to date for predicting stock
market performance. A stock's performance can, to some extent, be analyzed based on financial indicators presented in the company's annual report. The annual report contains a vast
amount of information that can be transformed into various ratios. Previous literature suggests that financial ratios are important tools for assessing future stock performance. Analysts,
investors, and researchers use financial ratios to project future stock price trends. Ratio analysis has emerged, therefore, as one of the key tools used by fund managers and investors to
determine the intrinsic value of shares; thus, financial ratios are used extensively for the valuation of stock. Ratios are also used extensively in fundamental analysis to predict the future
performance of a company. Various newer ratios, such as book value and price/cash earnings per share, have been included in this discipline for share valuation. Financial ratios help to form the
basis of investor stock price expectations and, hence, influence investment decision making. The level of importance given to financial ratios differs from industry to industry and from one
country to another. Thus, selecting appropriate ratios is crucial for increasing the prediction success rate.
The objective of this paper is to apply statistical methods to survey and analyze financial data in order to develop a simplified model for interpretation. This study aims to develop a model for
classifying stocks into two categories, Outperformer ("good") and Underperformer ("poor"), based on their rate of return. A company's stock is classified as "good" if its share returns exceed the
market returns given by the NIFTY, the composite index of the National Stock Exchange of India.
In this study, the SVC (support vector classification) method is used to classify the selected companies based on their performance.
2. REVIEW OF LITERATURE
In stock performance literature, little attention has been given in the past to the Indian stock market. In recent years, however, there has been a greater focus on the market because of its
rapid growth and its increasing potential for global investors. A number of research papers predict stock performance as well as pricing of the stock index across the globe. Harvey [1995]
observes that emerging market returns are usually more predictable than developed market returns because emerging market returns are more likely to be influenced by local information
than developed markets.
Fundamental variables such as earnings yield, cash flow yield, book-to-market ratio, and size are demonstrated to have some power in predicting stock returns [Fama and French, 1992].
Studies based on European markets also demonstrate similar findings.
• Ferson and Harvey [1993] observe that returns are predictable, to an extent, across a number of European markets (e.g., UK, France, and Germany).
• Jung and Boyd [1996], in their study of forecasting UK stock prices, suggest that the predictive strength of their stock performance models is quite significant.
• In the Japanese stock market, studies carried out by Jaffe and Westerfield [1985] and Kato et al. [1990] also demonstrate some evidence of predictability in the behavior of index returns.
• In recent literature, artificial neural networks (ANN) have been successfully used for modeling financial time series [Cheng, 1996; Van and Robert, 1997]. In the United States, several studies
have examined the cross-sectional relationship between fundamental variables and stock returns.
3. RESEARCH OBJECTIVE AND METHODOLOGY
The objective of this study is to build a model using financial ratios of the firms for the purpose of predicting out-performing shares in the Indian stock market. This study aims, therefore, to
answer two questions:
(1) Can the yields of stocks be explained with the help of financial ratios?
(2) Can we analyze stock yields using a logistic regression model?
DATA COLLECTION METHODOLOGY
3.1 DATASET
The dataset was obtained from NDTV Profit. I selected the 50 stocks of the Nifty index (NIFTY 50). For each stock I obtained the stock price at the end of each quarter from
the first quarter of 2016 until the second quarter of 2017. Along with the price, I also retrieved the following financial indicators for each company in the dataset:
Book value - the net asset value of a company, calculated as total assets minus intangible assets (patents, goodwill) and liabilities.
Market capitalization - the market value of a company's issued share capital; it is equal to the share price times the number of shares outstanding.
Change in net stock price over a one-month period
Percentage change in net stock price over a one-month period
Dividend yield - indicates how much a company pays out in dividends each year relative to its share price.
Earnings per share - a company's profit divided by the number of issued shares. Earnings per share serves as an indicator of a company's profitability.
Earnings per share growth - the growth of earnings per share over the trailing one-year period.
Sales revenue turnover -
Net revenue - the proceeds from the sale of an asset, minus commissions, taxes, or other expenses related to the sale.
Net revenue growth - the growth of net revenue over the trailing one-year period.
Sales growth - sales growth over the trailing one-year period.
Price to earnings ratio - measures a company's current share price relative to its per-share earnings.
Price to earnings ratio, five-year average - the price to earnings ratio averaged over a period of five years.
Price to book ratio - compares a company's current market price to its book value.
Price to sales ratio - calculated by dividing the company's market capitalization by its revenue in the most recent year.
Dividend per share - the total dividends paid out over an entire year divided by the number of ordinary shares issued.
Current ratio - compares a firm's current assets to its current liabilities.
Quick ratio - compares the total amount of cash, marketable securities, and accounts receivable to the amount of current liabilities.
Total debt to equity - a ratio used to measure a company's financial leverage, calculated by dividing a company's total liabilities by its stockholders' equity.
Analyst ratio - a rating given by a human analyst.
Revenue growth adjusted by 5-year compound annual growth rate
Stock Market Prediction Page 1
Profit margin - a profitability ratio calculated as net income divided by revenue, or net profits divided by sales.
Operating margin - a ratio used to measure a company's pricing strategy and operating efficiency. It measures what proportion of a company's revenue is left over after paying for
variable costs of production such as wages, raw materials, etc.
Asset turnover - the ratio of the value of a company's sales or revenues generated relative to the value of its assets.
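As a minimal sketch of how a few of the indicators listed above follow from raw figures, the snippet below computes the price to earnings ratio, price to book ratio, current ratio, and debt to equity ratio. The function name, argument names, and input numbers are illustrative assumptions and are not taken from the dataset.

```python
def financial_ratios(price, eps, book_value_per_share,
                     current_assets, current_liabilities,
                     total_liabilities, shareholders_equity):
    """Compute a few of the listed ratios from raw (hypothetical) inputs."""
    return {
        "price_to_earnings": price / eps,
        "price_to_book": price / book_value_per_share,
        "current_ratio": current_assets / current_liabilities,
        "debt_to_equity": total_liabilities / shareholders_equity,
    }


# Illustrative numbers only; they do not describe any real company.
ratios = financial_ratios(
    price=200.0, eps=10.0, book_value_per_share=80.0,
    current_assets=500.0, current_liabilities=250.0,
    total_liabilities=300.0, shareholders_equity=600.0,
)
```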
3.2 PREDICTING EQUITY PRICE MOVEMENT METHODOLOGY
I modelled the task of predicting equity price movement as a classification task with the Nifty 50 index as the benchmark. A stock is classified as "Outperform" over a three-month
period when its price performs better than the broader Nifty 50 index, and as "Underperform" when it underperforms the Nifty 50. Using historical
data (stock prices and Nifty index prices) retrieved from the NSE, I created a dataset containing the indicator values and the price 90 days after the recording date. I wrote a Python script that
compares the historical price with the price exactly 90 days after the first price was measured.
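The labelling rule described above can be sketched as follows. The original script is not shown in the paper, so this is only an illustration: the data layout (a dict mapping dates to closing prices), the function names, and the choice to label ties as "Underperform" are all assumptions.

```python
from datetime import date, timedelta


def pct_return(start_price, end_price):
    """Simple return between two prices."""
    return (end_price - start_price) / start_price


def label_stock(stock_prices, index_prices, record_date, horizon_days=90):
    """Compare the stock's return with the index's return over the horizon.

    Both price arguments are dicts mapping a date to a closing price
    (an assumed layout). A real script would also have to handle the case
    where the date 90 days later is not a trading day.
    """
    future_date = record_date + timedelta(days=horizon_days)
    stock_ret = pct_return(stock_prices[record_date], stock_prices[future_date])
    index_ret = pct_return(index_prices[record_date], index_prices[future_date])
    # Ties are labelled "Underperform" here; that choice is an assumption.
    return "Outperform" if stock_ret > index_ret else "Underperform"


# Illustrative prices only.
d0 = date(2017, 1, 2)
d1 = d0 + timedelta(days=90)
stock = {d0: 100.0, d1: 112.0}    # +12% over the horizon
nifty = {d0: 8000.0, d1: 8400.0}  # +5% over the horizon
label = label_stock(stock, nifty, d0)
```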
Many stocks did not have updated quarterly financial data on NDTV, and many had duplicate data. These anomalies would have caused inconsistencies in the data,
so I dropped all such rows.
Since not all financial indicators were available for all companies in the dataset, I assigned the value -9999 to missing or unavailable values.
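A minimal sketch of this cleaning step is given below, assuming each record is a dict keyed by hypothetical field names ("ticker", "quarter", "price", and indicator names). Treating a row without a price as "not updated" is an assumption made for illustration; the paper does not state the exact rule.

```python
MISSING = -9999  # sentinel value named in the text


def clean_rows(rows, required_key="price"):
    """Drop duplicate and incomplete rows, then fill gaps with the sentinel."""
    seen = set()
    cleaned = []
    all_keys = sorted({k for row in rows for k in row})
    for row in rows:
        key = (row.get("ticker"), row.get("quarter"))
        # Skip duplicates and rows lacking the required field.
        if key in seen or row.get(required_key) is None:
            continue
        seen.add(key)
        cleaned.append(
            {k: (row[k] if row.get(k) is not None else MISSING) for k in all_keys}
        )
    return cleaned


# Hypothetical records for illustration.
rows = [
    {"ticker": "AAA", "quarter": "2016Q1", "price": 100.0, "pe": 15.0},
    {"ticker": "AAA", "quarter": "2016Q1", "price": 100.0, "pe": 15.0},  # duplicate
    {"ticker": "BBB", "quarter": "2016Q1", "price": 50.0, "pe": None},   # missing indicator
    {"ticker": "CCC", "quarter": "2016Q1", "price": None, "pe": 9.0},    # no price
]
cleaned = clean_rows(rows)
```

Note that a sentinel such as -9999 is treated by a linear model as an ordinary numeric value, so it can distort the fit unless features are handled with that in mind.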
4. Analysis of SVM (Support Vector Machine)
A "Support Vector Machine" (SVM) is a supervised machine learning algorithm that can be used for both classification and regression problems.
However, it is mostly used for classification.
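To make the idea concrete, the following is a plain-Python illustration of a linear SVM trained by subgradient descent on the L2-regularised hinge loss. It is not the author's implementation, which the paper does not show; the function names, hyperparameters, and toy data are assumptions chosen only to demonstrate the technique.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Fit weights w and bias b by subgradient descent on the hinge loss.

    X is a list of feature lists; y holds labels in {-1, +1}.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Margin violated: hinge-loss subgradient plus regulariser.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Margin satisfied: only the regulariser shrinks the weights.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy, linearly separable data: +1 could stand for "Outperform", -1 for "Underperform".
X = [[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]]
y = [1, 1, -1, -1]
w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

In practice a library implementation would normally be preferred over a hand-written loop like this one.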
5. Results
Our dataset initially contained 50 stocks, but we had to discard some because they would have imbalanced the dataset, leaving 29 stocks, each with 4 quarters of
financial data.
We used a linear SVM to test our model and adopted two approaches to see how the performance differs. The first approach was to find outperformers among the 29 stocks across all sectors. The output
is shown below; accuracy ranged from 45% to 65% across multiple runs, which is a large variation.
Stocks to Invest: ICICIBANK, HDFCBANK, IBULHSGFIN, YESBANK, HDFC
Stocks to Ignore: All others
These results look highly biased towards one sector, which is also reflected in the accuracy figures, so I changed the approach. I decided to train and test sector-wise, i.e., to use the data of all energy
sector companies to test the performance of energy sector companies. This gives the model comparatively more relevant data to train and test on.
With this approach I found surprisingly good results: the model gave 80% accuracy, recommending 12 stocks as good to invest in and 17 as not so good.
Sector | Stocks to Invest | Stocks to Ignore
ENERGY | RELIANCE | NTPC
IT | N/A | TCS, INFY, WIPRO, HCLTECH, TECHM
FINANCIAL SERVICES | HDFCBANK, HDFC, ICICIBANK, YESBANK, IBULHSGFIN | KOTAKBANK, AXISBANK, INDUSINDBK
CONSUMER GOODS | ITC, HINDUNILVR | ASIANPAINT
AUTOMOBILE | HEROMOTOCO | MARUTI, BAJAJ-AUTO
TELECOM | INFRATEL | BHARTIARTL
CEMENT & CEMENT PRODUCTS | ULTRACEMCO | AMBUJACEM, ACC
METALS | VEDL | N/A
MEDIA & ENTERTAINMENT | N/A | ZEEL
PHARMA | N/A | DRREDDY
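The sector-wise training and testing described in this section can be sketched as follows. The paper does not state how the data were split, so holding out one quarter per sector is an assumption, as are the record layout and function names. The majority-label model is only a placeholder standing in for the SVM.

```python
from collections import defaultdict


def split_by_sector(samples):
    """Group samples (dicts with a 'sector' key) by sector."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample["sector"]].append(sample)
    return dict(groups)


def sector_wise_accuracy(samples, test_quarter, fit, predict):
    """Train and test within each sector; return the accuracy per sector."""
    results = {}
    for sector, rows in split_by_sector(samples).items():
        train = [r for r in rows if r["quarter"] != test_quarter]
        test = [r for r in rows if r["quarter"] == test_quarter]
        if not train or not test:
            continue  # not enough data in this sector
        model = fit(train)
        hits = sum(predict(model, r) == r["label"] for r in test)
        results[sector] = hits / len(test)
    return results


def fit_majority(train):
    """Placeholder model: remember the most frequent training label."""
    labels = [r["label"] for r in train]
    return max(set(labels), key=labels.count)


def predict_majority(model, row):
    """Placeholder prediction: always return the remembered label."""
    return model


# Hypothetical labelled records for illustration.
samples = [
    {"sector": "ENERGY", "quarter": "Q1", "label": "Outperform"},
    {"sector": "ENERGY", "quarter": "Q2", "label": "Outperform"},
    {"sector": "ENERGY", "quarter": "Q3", "label": "Outperform"},
    {"sector": "IT", "quarter": "Q1", "label": "Underperform"},
    {"sector": "IT", "quarter": "Q2", "label": "Underperform"},
    {"sector": "IT", "quarter": "Q3", "label": "Outperform"},
]
accuracy = sector_wise_accuracy(samples, "Q3", fit_majority, predict_majority)
```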
6. Future Work
Future iterations of this work should first try to reduce the model's generalization error and overfitting. In this study we used 5 quarters of data to predict the latest quarter's performance of
stocks. We are very excited to see the outcome of our approach. Future work should include at least 8 quarters of data to see how the model performs with more data.
At the time of this study, overall global market sentiment was benign. I would like to test the model in volatile times.
I would like to extend the model to predict 1-year stock performance with more variables.
I shall extend the study to include technical analysis and algorithmic trading.
Additional computing power could be used to work with network-derived data at much more granular periods of time, such as weekly or monthly data, as opposed to the quarterly splits used
in this paper.
Another avenue for further improvement is to include more external factors in the study, such as geopolitical developments, macroeconomic data, and sentiment analysis. In this paper
we targeted Nifty 50 stocks; however, there are about 2000 stocks on the NSE.
7. Conclusion
This paper presents a machine learning aided methodology for predicting equity movement over the long term. With all selected financial indicators, the methodology achieves an
accuracy of 80.1%.
Some of the features from the larger set were not necessary, since they did not give any relevant information about a company's valuation, while others merely duplicated facts
already conveyed by other financial indicators. For example, the value of earnings can be inferred if the total number of shares and the earnings per share are available.
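The redundancy in this example is simple arithmetic: total earnings equal earnings per share multiplied by the number of shares, so a model given two of the three values gains nothing from the third. A short sketch with made-up numbers (the function name is hypothetical):

```python
def implied_earnings(eps, shares_outstanding):
    """Total earnings implied by earnings per share and the share count."""
    return eps * shares_outstanding


# Illustrative values only.
eps = 12.5
shares = 4_000_000
earnings = implied_earnings(eps, shares)
```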
It seems that information about growth is not necessary. From this it could be deduced that ratios and information describing the current financial state of a company, without a look at
past performance, are enough for predicting the future behavior of the company (with the accuracy shown in this work). This principle can be especially useful for investors who want to invest in new
companies. The hypothesis that companies can be valued and their future predicted only by looking at present data has to be tested further; however, it proved to be correct for
our dataset. We leave this hypothesis to be tested in more detail by future researchers.
8. Annexure
Later I carried out a market study of how the markets actually performed relative to my predictions. Below are the graphs with our predictions to invest and ignore.
Energy:
Invest RELIANCE
Not Invest NTPC
In the above graph we can see that our prediction turned out right, as RELIANCE is outperforming by a huge margin.
IT:
Invest
Not Invest TECHM HCLTECH WIPRO INFY TCS
We recommended avoiding IT stocks for now, and the result reflects a similar sentiment: the Nifty is outperforming all the IT stocks.
Finance:
Invest ICICIBANK HDFCBANK IBULHSGFIN YESBANK HDFC
Not Invest INDUSINDBK AXISBANK KOTAKBANK
Auto:
Invest HEROMOTOCO
Not Invest MARUTI BAJAJ-AUTO
Telecom:
Invest INFRATEL
Not Invest BHARTIARTL
Cement:
Invest ULTRACEMCO
Not Invest ACC AMBUJACEM
The results are surprisingly motivating.
9. REFERENCES
Altman, E.I. 1968. Financial ratios, discriminant analysis, and the prediction of corporate bankruptcy, Journal of Finance 23, 589-609.
Awales, George S. Jr. 1988. Another look at the President's letter to stockholders, Financial Analysts Journal, 71-73, March-April.
Bhattacharya, Hrishikes. 2007. Total Management by Ratios, 2nd ed., New Delhi, India: Sage Publications.
Bildirici, Melike, and Özgür Ömer Ersin. 2009. Improving forecasts of GARCH family models with the artificial neural networks: An application to the daily returns in Istanbul Stock Exchange,
Expert Systems with Applications 36(4), 7355-7362.
Connor, M.C. 1973. On the usefulness of financial ratios to investors in common stock, The Accounting Review, 339-352.
Chen, Mu-Yen. 2011. Predicting corporate financial distress based on integration of decision tree classification and logistic regression, Expert Systems with Applications 38(9), 11261-11272.
Cheng, W.; L. Wanger; and Ch. Lin. 1996. Forecasting the 30-year US treasury bond with a system of neural networks, Journal of Computational Intelligence in Finance 4, 10-6.
Davis, D. 2005. Business Research for Decision-Making, 1st ed., Belmont, CA: Thomson Brooks/Cole.
Dutta, A., et al. 2008. Classification and Prediction of Stock Performance using Logistic Regression: An Empirical Examination from Indian Stock Market: Redefining Business Horizons: McMillan
Advanced Research Series, 46-62.
Fama, E., and K. French. 1988. Permanent and temporary components of stock prices, Journal of Political Economy 96, 246-73.
Ferson, W.E., and C.R. Harvey. 1993. The risk and predictability of international equity returns, Review of Financial Studies 6, 527-66.
Guresen, Erkam, et al. 2011. Using artificial neural network models in stock market index prediction, Expert Systems with Applications 38(8), 10389-10397.
Haines, L.M., et al. 2007. D-optimal designs for logistic regression in two variables, mODa 8 - Advances in Model-Oriented Design and Analysis, Physica-Verlag HD, 91-98.
Hair, J.F., et al. 2008. Multivariate Data Analysis, 6th ed.: Pearson Education, Inc.
Hair, J.F. 1995. Multivariate Data Analysis with Readings, 4th ed., Englewood Cliffs, NJ: Prentice Hall.