STOCK MARKET PREDICTION
Enroll No. 9910103561
Name of Student Aditya datta
Name of Supervisor Mr. Bansidhar Joshi
MAY’ 2014
Submitted in partial fulfillment of the Degree of
Bachelor of Technology
In
Computer Science Engineering
DEPARTMENT OF COMPUTER SCIENCE ENGINEERING &
INFORMATION TECHNOLOGY
JAYPEE INSTITUTE OF INFORMATION TECHNOLOGY, NOIDA
(I)
TABLE OF CONTENTS
Chapter No. Topics
Student Declaration II
Certificate from the Supervisor III
Acknowledgement IV
Summary (Not more than 250 words) V
List of Figures VI
List of Tables VII
List of Symbols VIII
List of Acronyms IX
Chapter-1 Introduction
1.1 General Introduction
1.2 Problem Statement
1.3 Empirical Study
1.4 Approach to problem in terms of technology
1.5 Support for novelty/significance of problem
Chapter-2 Literature Survey
2.1 Summary of papers
2.2 Integrated summary of the literature studied
Chapter-3 Analysis, Design and Modeling
3.1 Overall Description of the project
3.2 Functional Requirements
3.3 Non Functional Requirements
3.4 Design Diagrams
3.4.1 Use Case diagrams
3.4.2 Data Flow Diagram
Chapter-4 Implementation details and issues
4.1 Implementation Details and Issues
4.1.1 Implementation Issues
4.1.2 Algorithms
4.2 Risk Analysis and Mitigation plan
Chapter-5 Testing
5.1 Testing Plan
5.2 Component decomposition & type of testing required
5.3 Test cases
5.4 Error and Exception Handling
5.5 Limitation of the solution
Chapter-6 Findings & Conclusion
6.1 Findings
6.2 Conclusion
6.3 Future Work
References
(II)
DECLARATION
I hereby declare that this submission is my/our own work and that, to the best of my knowledge
and belief, it contains no material previously published or written by another person nor material
which has been accepted for the award of any other degree or diploma of the university or other
institute of higher learning, except where due acknowledgment has been made in the text.
Place: Signature:
Date: Name: Aditya datta
Enrollment No: 9910103561
(III)
CERTIFICATE
This is to certify that the work titled “Stock market prediction” submitted by “Aditya datta” in
partial fulfillment for the award of degree of Bachelors of Technology of Jaypee Institute of
Information Technology University, Noida has been carried out under my supervision. This work
has not been submitted partially or wholly to any other University or Institute for the award of
this or any other degree or diploma.
Signature of Supervisor ……………………..
Name of Supervisor ……………………..
Designation ……………………..
Date ……………………..
(IV)
ACKNOWLEDGEMENT
I take this opportunity to acknowledge all the people who have helped me wholeheartedly at
every stage of the project.
I would like to express my special thanks and gratitude to my respected supervisor, Mr. Bansidhar
Joshi, who gave me the golden opportunity to do this great project on the topic “Stock market
prediction”. The guidance and support received from him was vital for the success of the project.
I also extend my sincere thanks to all other faculty members of the Computer Science Engineering
Department who helped me in the project.
Signature of the Student ……………………..
Name of Student ……………………..
Enrollment Number ……………………..
Date ……………………..
(V)
SUMMARY
Stock market prediction is a classic problem which has been analyzed extensively using tools
and techniques of Machine Learning. Interesting properties which make this modeling non-trivial
are the time dependence, volatility and other similar complex dependencies of this problem. To
incorporate these, Hidden Markov Models (HMMs) have recently been applied to forecast and
predict the stock market. We present the Maximum a Posteriori HMM approach for forecasting
stock values for the next day given historical data. In our approach, we consider the fractional
change in stock value and the intra-day high and low values of the stock to train the continuous
HMM.
This HMM is then used to make a Maximum a Posteriori decision over all the possible stock
values for the next day. We test our approach on several stocks, and compare the performance to
some of the existing methods using HMMs and Artificial Neural Networks using Mean Absolute
Percentage Error (MAPE).
__________________ __________________
Signature of Student Signature of Supervisor
Name Name
Date Date
(VI)
LIST OF FIGURES
FIGURE NUMBER NAME
Figure 1 USE CASE DIAGRAM
Figure 2 DFD DIAGRAM
Figure 3 DFD LEVEL1 DIAGRAM
Figure 4 ARCHITECTURE
Figure 6 PROCESS OVERVIEW
Figure 7 RISK ANALYSIS AND MITIGATION DIAGRAM
Figure 8 CORRELATION BETWEEN ACTUAL AND PREDICTED VALUE FOR DELL
Figure 9 CORRELATION BETWEEN ACTUAL AND PREDICTED VALUE FOR GOOGLE
(VII)
LIST OF TABLES
TABLE NUMBER NAME
Table 1 Risk Analysis & Mitigation plan
Table 2 Test Plan
Table 3 Test Activity
Table 4 Software & Hardware Items
Table 5 Component Testing
Table 6 Test Cases
(IX)
LIST OF ACRONYMS
S.NO. ABBREVIATION EXPANSION
1 POSM Prediction of Stock Market
2 HMM Hidden Markov Model
3 CSV Comma Separated Values
4 LRA Left-Right Algorithm
5 BWA Baum-Welch Algorithm
6 ARIMA Autoregressive Integrated Moving Average
CHAPTER 1
INTRODUCTION
1.1. GENERAL
The stock market is a network which provides a platform for almost all major economic
transactions in the world at a dynamic rate called the stock value which is based on market
equilibrium. Predicting this stock value offers enormous arbitrage profit opportunities which are
a huge motivation for research in this area. Knowledge of a stock value beforehand by even a
fraction of a second can result in high profits. Similarly, a probabilistically correct prediction can
be extremely profitable in the amortized case. The attractiveness of finding a solution has
prompted researchers in both industry and academia to find a way past problems like
volatility, seasonality and dependence on time, economies and the rest of the market. Previously,
techniques of Artificial Intelligence and Machine Learning, like Artificial Neural Networks,
Fuzzy Logic and Support Vector Machines, have been used to solve these problems. Recently,
the Hidden Markov Model (HMM) approach was applied to this problem to predict the
pattern. The reason for using this approach is fairly intuitive. HMMs have been successful in
analyzing and predicting time-dependent phenomena, or time series. They have been used
extensively in the past in speech recognition, ECG analysis etc. The stock market prediction
problem is similar in its inherent relation with time. Hidden Markov Models are based on a set of
unobserved underlying states amongst which transitions can occur and each state is associated
with a set of possible observations. The stock market can also be seen in a similar manner. The
underlying states, which determine the behavior of the stock value, are usually invisible to the
investor. The transitions between these underlying states are based on company policy, decisions
and economic conditions etc. The visible effect which reflects these is the value of the stock.
Clearly, the HMM conforms well to this real life scenario. The choice of attributes, or feature
selection, is significant in this approach. In the past, various attempts have been made using the
volume of trade, the momentum of the stock, correlation with the market, the volatility of the
stock etc. In our model we use the daily fractional change in the stock value, and the fractional
deviation of the intra-day high and low. The fractional change is necessary in order to make the
required prediction. Measuring the fractional deviation of both the intra-day high and low values
is useful as it gives the direction of the volatility as well. We use three different stocks
for evaluating the approach: Google, Apple Inc. and Dell Inc. A separate HMM is trained for
each stock. The one constraint on the training set is that it needs suitable variability in the
data. This is taken care of by taking appropriately large periods of time in which the stock value
changes steadily yet significantly. The remaining paper is organized as follows. In Section II we
review some of the existing techniques for stock market prediction, especially the ones using
HMMs. In Section III we give details of our approach with mathematical justifications.
In Section IV we describe the data-sets and provide the experimental results. Finally, in Section
V we discuss the results and conclude the paper.
1.2. PROBLEM STATEMENT
Stock market prediction has been one of the more active research areas in the past, given the
obvious interest of a lot of major companies. In this research, several machine learning
techniques have been applied with varying degrees of success. However, stock forecasting is still
severely limited due to its non-stationary, seasonal and in general unpredictable nature.
Producing forecasts from just the previous stock data is an even more challenging task since it
ignores several outlying factors (such as the state of the company, economic conditions,
ownership etc.). Machine learning techniques which have been widely applied to forecasting
stock market data include Artificial Neural Networks (ANNs), Fuzzy Logic (FL) and Support
Vector Machines (SVMs). Of these, ANNs have been the most successful; however, even
their performance is quite limited and not reliable enough. Wang and Leu trained a system based
on a recurrent neural network which used features extracted using Autoregressive Integrated
Moving Average (ARIMA) analysis, and which showed reasonable accuracy. HMMs have only
rarely been applied to the given problem in the past, which is surprising given the time-dependent
nature of the market. HMMs have been successfully applied in the areas of speech recognition,
DNA sequencing and ECG analysis. Shi and Weigend used HMMs to predict changes in the
trajectories of financial time series data. Recently, Hassan combined HMMs and fuzzy logic rules
to improve the prediction accuracy on non-stationary stock data sets. His performance was
significantly better than that of past approaches. The basic idea there was to combine the HMM's
data pattern identification method, used to partition the data space, with the generation of fuzzy
logic rules for the prediction of multivariate financial time series data. Our approach is similar to
the one taken by Nobakht et al. They model the daily opening, closing, high and low indices as
continuous observations from underlying hidden states. The main difference between our
approach and theirs lies in the features used (we use fractional changes in the above quantities)
and in the manner of forecasting. While they look for similar data patterns in the past data, we
maximize the likelihood of a sequence of observations over all possible forecasted future values.
1.3. APPROACH TO PROBLEM IN TERMS OF TECHNOLOGY
/PLATFORM TO BE USED
We use a continuous Hidden Markov Model (CHMM) to model the stock data as a time series.
An HMM (denoted by λ) can be written as
\[ \lambda = (\pi, A, B) \]
where A is the transition matrix whose elements give the probability of a transition from one
state to another, B is the emission matrix giving the probability $b_j(O_t)$ of observing $O_t$
when in state j, and π gives the initial probabilities of the states at t = 1. Further, for a
continuous HMM the emission probabilities are modelled as Gaussian Mixture Models (GMMs):
\[ b_j(O_t) = \sum_{m=1}^{M} c_{jm} \, \mathcal{N}(O_t;\ \mu_{jm}, \Sigma_{jm}) \]
where:
• M is the number of Gaussian mixture components.
• $c_{jm}$ is the weight of the mth mixture component in state j.
• $\mu_{jm}$ is the mean vector for the mth component in the jth state.
• $\mathcal{N}(O_t;\ \mu_{jm}, \Sigma_{jm})$ is the probability of observing $O_t$ in the
multi-dimensional Gaussian distribution with mean $\mu_{jm}$ and covariance matrix $\Sigma_{jm}$.
Training of the above HMM from given sequences of observations is done using the
Baum-Welch algorithm, which uses Expectation-Maximization (EM) to arrive at (locally) optimal
parameters for the HMM. In our model the observations are the daily stock data in the form of
a three-component vector derived from the four daily values:
\[ O_t = \left( \frac{close - open}{open},\ \frac{high - open}{open},\ \frac{open - low}{open} \right) := (fracChange,\ fracHigh,\ fracLow) \]
Here open is the day opening value, close is the day closing value, high is the day high, and low
is the day low. We use fractional changes to model the variation in stock data, which
remains constant over the years.
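The feature extraction described here can be sketched in Java, the language of the project. This is a minimal illustration and not the project's code: the class name `FractionalFeatures` and its method names are hypothetical, and the sign convention for the low deviation (open minus low, so that the value is non-negative) is an assumption.

```java
/** Hypothetical helper: turns one day's (open, high, low, close) into fractional features. */
final class FractionalFeatures {
    private FractionalFeatures() {}

    /** Returns {fracChange, fracHigh, fracLow} for a single trading day. */
    static double[] fromOhlc(double open, double high, double low, double close) {
        if (open <= 0.0) {
            throw new IllegalArgumentException("open must be positive");
        }
        double fracChange = (close - open) / open; // signed daily move
        double fracHigh = (high - open) / open;    // upward deviation from the open
        double fracLow = (open - low) / open;      // downward deviation (assumed sign convention)
        return new double[] {fracChange, fracHigh, fracLow};
    }

    /** Recovers a close value from the open and a (predicted) fractional change. */
    static double closeFromFracChange(double open, double fracChange) {
        return open * (1.0 + fracChange);
    }
}
```

For a day with open 100, high 104, low 98 and close 102, `fromOhlc` gives a fractional change of 0.02, a high deviation of 0.04 and a low deviation of 0.02.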
Once the model is trained, testing is done using an approximate Maximum a Posteriori (MAP)
approach. We assume a latency of d days while forecasting future stock values. Hence, the
problem becomes as follows: given the HMM model λ and the stock values for d days
$(O_1, O_2, \ldots, O_d)$ along with the stock open value for the (d+1)th day, we need to compute
the close value for the (d+1)th day. This is equivalent to estimating the fractional change
$(close - open)/open$ for the (d+1)th day. For this, we compute the MAP estimate of the
observation vector $O_{d+1}$. Let $\hat{O}_{d+1}$ be the MAP estimate of the observation on the
(d+1)th day, given the values of the first d days. Then,
\[ \hat{O}_{d+1} = \arg\max_{O_{d+1}} P(O_{d+1} \mid O_1, \ldots, O_d, \lambda) = \arg\max_{O_{d+1}} \frac{P(O_1, \ldots, O_d, O_{d+1} \mid \lambda)}{P(O_1, \ldots, O_d \mid \lambda)} \]
The observation vector $O_{d+1}$ is varied over all possible values. Since the denominator is
constant with respect to $O_{d+1}$, the MAP estimate becomes
\[ \hat{O}_{d+1} = \arg\max_{O_{d+1}} P(O_1, \ldots, O_d, O_{d+1} \mid \lambda) \]
The joint probability value $P(O_1, \ldots, O_d, O_{d+1} \mid \lambda)$ can be computed using the
forward-backward algorithm for HMMs. In practice, we compute the probability over a discrete
set of possible values of $O_{d+1}$ (see Table II) and find the maximum, hence the name MAP HMM
model. The computational complexity of the forward-backward algorithm for finding the
likelihood of a given observation is $O(n^2 d)$, where n is the number of states in the HMM and d
is the latency. This procedure is repeated over the discrete set of possible values of $O_{d+1}$.
In our case n = 4, d = 10 and there are 50×10×10 possible values of $O_{d+1}$. The closing value
of a particular day can be computed by using the day opening value and the predicted fractional
change for that day, i.e. $close = open \times (1 + \text{fractional change})$.
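A search over a discrete grid of candidate observations can be sketched as follows. This is only an outline under stated assumptions: `MapGridSearch` and its `Scorer` interface are hypothetical names, and the scorer stands in for a routine that returns the (log) joint likelihood of the d past observations followed by the candidate. The report does not show such a routine.

```java
/** Hypothetical sketch of a MAP decision over a discrete grid of candidate observations. */
final class MapGridSearch {
    /** Placeholder for a routine returning log P(O_1..O_d, candidate | lambda). */
    interface Scorer {
        double score(double[] candidate);
    }

    private MapGridSearch() {}

    /** Evenly spaced grid of `points` values from lo to hi, both inclusive. */
    static double[] linspace(double lo, double hi, int points) {
        if (points < 2) {
            throw new IllegalArgumentException("need at least 2 points");
        }
        double[] grid = new double[points];
        double step = (hi - lo) / (points - 1);
        for (int i = 0; i < points; i++) {
            grid[i] = lo + i * step;
        }
        return grid;
    }

    /** Returns the candidate {change, high, low} with the highest score, or null if a grid is empty. */
    static double[] argmax(double[] changes, double[] highs, double[] lows, Scorer scorer) {
        double[] best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (double c : changes) {
            for (double h : highs) {
                for (double l : lows) {
                    double[] candidate = {c, h, l};
                    double score = scorer.score(candidate);
                    if (best == null || score > bestScore) {
                        bestScore = score;
                        best = candidate;
                    }
                }
            }
        }
        return best;
    }
}
```

With grids of 50, 10 and 10 points the loop evaluates the 50×10×10 candidates mentioned in the text. The grid ranges are left to the caller here because the report refers to Table II for them; any ranges passed to `linspace` would be assumptions.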
CHAPTER 2
ADDITIONAL LITERATURE SURVEY
2.1. SUMMARY OF RELEVANT PAPERS
Paper 1
Title of paper Stock Market Forecasting Using Hidden Markov Model: A New Approach
Authors Md. Rafiul Hassan and Baikunth Nath
Year of Publication 2005
Publishing details Proceedings of the 2005 5th International Conference on Intelligent
Systems Design and Applications
Summary This paper presents Hidden Markov Models (HMM) approach for
forecasting stock price for interrelated markets. We apply HMM to
forecast some of the airlines stock. HMMs have been extensively
used for pattern recognition and classification problems because of
its proven suitability for modelling dynamic systems. However,
using HMM for predicting future events is not straightforward.
Here we use only one HMM that is trained on the past dataset of
the chosen airlines. The trained HMM is used to search for the
variable of interest behavioural data pattern from the past dataset.
By interpolating the neighbouring values of these datasets
forecasts are prepared. The results obtained using HMM are
encouraging and HMM offers a new paradigm for stock market
forecasting, an area that has been of much research interest lately.
Key Words: HMM, stock market forecasting,
financial time series, feature selection
Web link hassan-nath-2005 stock_market_forecasting_using_hidden_markov_model_a_new_approach.pdf
Paper 2
Title of paper Analysis of Hidden Markov Models and Support Vector Machines
in Financial Applications.
Authors Satish Rao
Jerry Hong
Year of Publication 12 May 2010
Publishing details Electrical Engineering and Computer Sciences University of
California at Berkeley
Summary This paper presents two approaches in helping investors make
better decisions. First, we discuss conventional methods, such as
using the Efficient Market Hypothesis and technical indicators, for
forecasting stock prices and movements.
We will show that these methods are inadequate, and thus, we
need to rethink the issue. Afterwards, we will discuss using
artificial intelligence, such as Hidden Markov Models and Support
Vector Machines, to help investors gather and compute enormous
amount of data that will enable them to make informed decisions.
We will leverage the Simlio engine to train both the HMM and
SVM on past datasets and use it to predict future stock
movements. The results are encouraging and they warrant future
research on using AI for market forecasts.
Web link EECS-2010-63.pdf
Paper 3
Title of paper Stock Market Prediction using Hidden Markov Model
Authors Aditya Gupta, Bhuwan Dhingra
Year of Publication 2010
Publishing details International Journal of Multimedia and Ubiquitous Engineering
Summary Stock market prediction is a classic problem which has been
analyzed extensively using tools and techniques of Machine
Learning. Interesting properties which make this modelling
non-trivial are the time dependence, volatility and other similar complex
dependencies of this problem. To incorporate these, Hidden
Markov Models (HMM’s) have recently been applied to forecast
and predict the stock market. We present the Maximum a
Posteriori HMM approach for forecasting the stock value for the
next day given historical data. In our approach, we consider the
fractional change in Stock value, and the intra-day high and low
values of the stock, to train the continuous HMM. This HMM is
then used to make a Maximum a Posteriori decision over all the
possible stock values for the next day. We test our approach on
several stocks, and compare the performance to some of the
existing methods using HMMs and Artificial Neural Networks
using Mean Absolute Percentage Error (MAPE).
Web link Y8036_Y8167.pdf
Paper 4
Title of paper Stock market trend analysis using Hidden Markov Model
Authors Kavitha G, Udhaya kumar A, Nagarajan D
Year of Publication 2009
Publishing details IEEE Transactions on Network and Service Management
Summary Price movements of stock market are not totally random. In fact,
what drives the financial market and what pattern financial time
series follows have long been the interest that attracts economists,
mathematicians and most recently computer scientists. This paper
gives an idea about the trend analysis of stock market behaviour
using Hidden Markov Model (HMM). The trend once followed
over a particular period will surely repeat in future. The one day
difference in close value of stocks for a certain period is found and
its corresponding steady state probability distribution values are
determined. The pattern of the stock market behaviour is then
decided based on these probability values for a particular time.
The goal is to figure out the hidden state sequence given the
observation sequence so that the trend can be analyzed using the
steady state probability distribution(π ) values. Six optimal hidden
state sequences are generated and compared. The one day
difference in close value when considered is found to give the best
optimum state sequence.
Web link http://arxiv.org/ftp/arxiv/papers/1311/1311.4771.pdf
2.2. INTEGRATED SUMMARY OF LITERATURE STUDIED
The papers studied mainly focus on the Hidden Markov Model (HMM) approach to forecasting
stock prices for interrelated markets. Hassan and Nath apply an HMM to forecast the stock of
some airlines. HMMs have been used extensively for pattern recognition and classification
problems because of their proven suitability for modelling dynamic systems. However, using an
HMM to predict future events is not straightforward. They use a single HMM trained on the past
dataset of the chosen airlines. The trained HMM is used to search the past dataset for data
patterns that behave like the variable of interest, and forecasts are prepared by interpolating the
neighbouring values of these datasets. The results obtained using HMMs are encouraging, and the
HMM offers a new paradigm for stock market forecasting, an area that has attracted much
research interest lately.
Conventional tools, such as the Efficient Market Hypothesis and technical indicators, offer much
insight into the workings of the financial market. However, they provide only a
macro-simplification that does not always reflect how the real market works. These tools have
limitations that prevent them from modeling the market in a more focused, “micro” manner. One
of the major issues is that many conventional finance theories take in only a limited number of
factors. This limited scope prevents us from accurately modeling the real market, which has a
countably infinite number of patterns. We need a model that can constantly adapt to the dynamic
nature of the market. Technical indicators can only help an investor so much before the different
combinations and patterns cause the investor to question whether any formula actually works
consistently. This is where AI models such as HMMs and SVMs come into play. Using these tools,
we can achieve a more realistic micro-representation of the market while overcoming the
limitations of the earlier techniques.
In recent years, a variety of forecasting methods have been proposed and implemented for stock
market analysis, and a brief survey of this literature has been presented above. A Markov process
is a stochastic process in which the probability of being in a certain state at a certain time is
conditioned only on a finite history. A Markov chain satisfies the property “given the present,
the future is independent of the past”. An HMM is a form of probabilistic finite state system
where the actual states are not directly observable. They can only be estimated using observable
symbols associated with the hidden states. At each time point, the HMM emits a symbol and
changes state with a certain probability. HMMs analyze and predict time series or time-dependent
phenomena. There is not a one-to-one correspondence between the states and the observation
symbols.
CHAPTER 3
ANALYSIS, DESIGN AND MODELLING
3.1. OVERALL DESCRIPTION OF THE PROJECT
3.1.1. PRODUCT PERSPECTIVE
The product to be developed, “Prediction of stock market (POSM)”, is a stand-alone application
and an independent product. It is developed in the Java programming language using the Swing
framework.
3.1.2. PRODUCT FUNCTIONS
Hardware Requirements:
The hardware requirements may serve as the basis for a contract for the implementation of the
system and should therefore be a complete and consistent specification of the whole system.
They are used by software engineers as the starting point for the system design. They should
state what the system does, not how it should be implemented.
PROCESSOR: PENTIUM 4 2.1 GHZ & ABOVE.
RAM: 512 MB DDR2 RAM
MONITOR: COLOR
HARD DISK SPACE: 10 MB
Software Requirements:
The software requirements document is the specification of the system. It should include both a
definition and a specification of requirements. It is a statement of what the system should do
rather than how it should do it. The software requirements provide a basis for creating the
software requirements specification. They are useful in estimating cost, planning team activities,
performing tasks and tracking the team’s progress throughout the development
activity.
OPERATING SYSTEM: WINDOWS 7/8/8.1
LANGUAGE: JAVA
FRAMEWORK: JAVA SWINGS
TOOL USED: NETBEANS IDE 7.1 & ABOVE, JDK 1.7 & ABOVE.
3.1.3. USER CHARACTERISTICS
• The user needs to have the Java Virtual Machine and Java Runtime Environment installed
on the computer.
• The user must have some knowledge of the share market.
3.1.4. CONSTRAINTS
Reliability requirements:
The system must be reliable and must not crash due to higher number of operations and
higher processing time. A high speed processor must be employed.
3.1.5. ASSUMPTIONS AND DEPENDENCIES
Assumption 1: The user knows how to operate Windows system and is aware of Share
Marketing basics.
Assumption 2: The JDK Version should not be less than the Version 1.7.
Dependency 1: Dependent on Internet speed of the System.
3.1.6. APPORTIONING OF REQUIREMENTS
The system may be optimized in terms of accuracy and speed in the future versions of the
product.
3.2. FUNCTIONAL REQUIREMENTS
1. Fetch chart data from Yahoo Finance.
2. Convert the CSV file to an XLS spreadsheet.
3. Apply the left-right algorithm to the incoming data and pass the result on to build the
transition matrix.
4. Apply the Baum-Welch or prediction algorithm to the received observations and calculate the
fractional change.
5. Calculate the opening, closing, high and low values of the given stock using the hidden
Markov model.
3.3. NON FUNCTIONAL REQUIREMENTS
1. Data fetched from Yahoo Finance needs to be delivered reliably and quickly.
2. Study the specifications and configuration settings of the Java APIs while integrating
them with the user application.
3.4. DESIGN DIAGRAMS
3.4.1. USE CASE DIAGRAM
DATA FLOW DIAGRAM
LEVEL 0-DFD
LEVEL 1 DFD
3.4.2. ARCHITECTURAL DIAGRAM
Fig. 1. Architecture of stock market prediction
There are a number of sources of stock and financial data on the web, including Google Finance
and Yahoo! Finance. In the specific case of the SENSEX index, it seems that Google Finance does
not provide the data in the required format. As an alternative, Yahoo! Finance was used to load
and store the stock data from 1984. The application developed for this project includes an
interface that can be used, when needed, to load the data before the training and prediction
algorithms are run.
[Figure: architecture diagram. Recoverable labels: SMP, Java architecture, Excel reader,
Microsoft Excel reader, stock data extracted, data returned as data source.]
In this problem, there is a sequence of data over time with which we need to train an HMM.
To train an HMM matching a sequence of stock index data $\vec{O} = (open, low, high, close)$,
the following settings and considerations have been taken into account:
1. States: in the experiments, we have N = 4; intuitively, the states denote the stages in time
between which transitions are allowed during HMM training.
2. Dimensions: as a mixture of multivariate Gaussians is utilized in this problem, we have D = 4,
since the observation vector for each stock date is (open, low, high, close).
3. Mixtures: [10] proposes M = 3.
4. Left-Right Delta: experimentally, we have tried delta = 1 and delta = 3.
5. Prior Probability: adhering to the left-right HMM, we have π = (1, 0, 0, 0).
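The left-right settings in this list can be illustrated with a short Java sketch. It assumes that "delta" is the maximum number of states the chain may advance in one step and that the allowed transitions start out equally likely; both are assumptions, since the report does not spell out the initial values. The class name `LeftRightInit` is hypothetical.

```java
/** Hypothetical sketch: initial parameters for a left-right (Bakis) HMM. */
final class LeftRightInit {
    private LeftRightInit() {}

    /**
     * Builds a transition matrix in which state i may stay in place or move
     * forward by at most `delta` states. Allowed transitions get equal
     * probability (an assumed starting point before training).
     */
    static double[][] transitionMatrix(int numStates, int delta) {
        if (numStates < 1 || delta < 0) {
            throw new IllegalArgumentException("numStates >= 1 and delta >= 0 required");
        }
        double[][] a = new double[numStates][numStates];
        for (int i = 0; i < numStates; i++) {
            int last = Math.min(i + delta, numStates - 1);
            double p = 1.0 / (last - i + 1);
            for (int j = i; j <= last; j++) {
                a[i][j] = p; // entries left of the diagonal stay 0
            }
        }
        return a;
    }

    /** Prior that forces the chain to start in the first state, e.g. (1, 0, 0, 0). */
    static double[] prior(int numStates) {
        double[] pi = new double[numStates];
        pi[0] = 1.0;
        return pi;
    }
}
```

With N = 4 and delta = 1, the first row of the matrix is (0.5, 0.5, 0, 0) and the last row is (0, 0, 0, 1); with delta = 3, the first row is (0.25, 0.25, 0.25, 0.25).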
CHAPTER 4
IMPLEMENTATION DETAILS AND ISSUES
4.1. IMPLEMENTATION DETAILS AND ISSUES
The software is implemented in the Java programming language and the user interface is designed
using the Java Swing framework.
DETAILS OF THE PROJECT
As the prediction problem can be complex and lengthy, it was broken down into a series of actions
and activities organized in several phases to manage the complexity. Figure 1 depicts the overall
process that is considered when solving the problem, and the following section discusses each of
the phases in more detail. As mentioned, through this experiment we try to take advantage of
Hidden Markov Models (HMMs) to address some interesting problems regarding stock market
analysis.
Specifically, stock market index prediction is done in this assignment. First, a set of past data is
loaded and analyzed; then, an HMM is modelled and trained for the problem. Afterwards,
similar past data are identified and used to predict future stock market values. The stock market
data used in this assignment are from SENSEX, an index over the stock data in India. Each stock
market data point is a quadruple (open, low, high, close): each day the stock market starts its
activity with an opening value, during the day it reaches its highest value and drops to its lowest
value, and it stops with a close value. Such data are of great interest to stock workers and
business shareholders who want to predict future stock trends. In this project, we try to estimate
the next day's close value as precisely as possible.
4.1.1. IMPLEMENTATION ISSUES
A number of important implementation issues must be addressed before POSM can be
feasibly deployed. Some of these issues are discussed below:
1. Interesting HMM Questions:
1. How to compute the probability of the occurrence of a specific sequence of observations,
$P(O \mid \lambda)$, in which $O = (O_1, O_2, \ldots, O_T)$.
2. How to choose the state sequence $(q(1), q(2), \ldots, q(T))$ that best explains the
observations O in the model.
3. How to tune the parameters (π, A, B) to find a model that best matches the observations O.
In stock market analysis and prediction, we are facing the first and the third problem.
Training the HMM corresponds to problem (3) and prediction is achieved with problem
(1).
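Problem (1) is usually solved with the forward recursion. The sketch below shows a scaled version in Java that returns the log-likelihood, which avoids numerical underflow on long sequences. It is an illustration and not the project's code: the class name `ForwardAlgorithm` is hypothetical, and the emission probabilities are assumed to be precomputed as `b[t][j]`, the probability (or density) of the observation at time t in state j.

```java
/** Hypothetical sketch of the scaled forward algorithm for HMM problem (1). */
final class ForwardAlgorithm {
    private ForwardAlgorithm() {}

    /**
     * @param pi initial state probabilities, length n
     * @param a  transition matrix, a[i][j] = P(state j at t+1 | state i at t)
     * @param b  emission values, b[t][j] = probability of observation t in state j
     * @return log P(O_1..O_T | lambda); negative infinity if the sequence is impossible
     */
    static double logLikelihood(double[] pi, double[][] a, double[][] b) {
        int n = pi.length;
        double[] alpha = new double[n];
        double logLik = 0.0;
        for (int t = 0; t < b.length; t++) {
            double[] next = new double[n];
            double scale = 0.0;
            for (int j = 0; j < n; j++) {
                double sum;
                if (t == 0) {
                    sum = pi[j]; // initialization step
                } else {
                    sum = 0.0;
                    for (int i = 0; i < n; i++) {
                        sum += alpha[i] * a[i][j]; // induction step
                    }
                }
                next[j] = sum * b[t][j];
                scale += next[j];
            }
            if (scale == 0.0) {
                return Double.NEGATIVE_INFINITY;
            }
            for (int j = 0; j < n; j++) {
                next[j] /= scale; // rescale so alpha sums to 1
            }
            alpha = next;
            logLik += Math.log(scale); // scale factors multiply to P(O | lambda)
        }
        return logLik;
    }
}
```

The two nested loops over states inside the loop over time give a cost proportional to n²T for n states and T observations.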
2. Initializing the HMM: Another important issue in modelling with an HMM is how to
initialize the parameters of the defined HMM, i.e. the transition probabilities, observation
probabilities (distributions) and prior probabilities. Although the HMM adjusts its parameters as
well as possible in the learning process, badly initialized parameters can lead to an imprecise
model even after the HMM has been trained. A basic approach is therefore required to obtain
good initial values of the HMM parameters. In this project we propose an approach that
initializes the parameters of the HMM based on the physical behavior of the data. One of the best
HMM types that can be used for time series prediction is the left-right HMM.
Consequently, the prior probabilities are (1, 0, 0, 0), since the left-right HMM imposes this.
The transition probabilities again depend mainly on the defined model; they are initialized based
on the value of the left-right delta. The observation distributions can be initialized based on the
training data. Since our proposed HMM has 4 states, we divide our training sequence into 4 equal
sections, and each section is used to estimate the best possible mixture of multivariate Gaussian
distributions for the corresponding state. The parameters of the Gaussian distributions are
estimated with the maximum likelihood method.
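The section-based initialization can be sketched in Java as follows. This is a simplified illustration: it fits a single Gaussian with a diagonal covariance to each section rather than a full mixture, and the class name `SegmentInit` and its methods are hypothetical.

```java
/** Hypothetical sketch: per-state Gaussian initialization from equal sections of the training data. */
final class SegmentInit {
    private SegmentInit() {}

    /** Returns means[state][dim], where state s is estimated from the s-th contiguous section. */
    static double[][] sectionMeans(double[][] obs, int numStates) {
        if (numStates < 1 || obs.length < numStates) {
            throw new IllegalArgumentException("need at least one observation per state");
        }
        int total = obs.length;
        int dims = obs[0].length;
        double[][] means = new double[numStates][dims];
        for (int s = 0; s < numStates; s++) {
            int start = s * total / numStates;
            int end = (s + 1) * total / numStates;
            for (int t = start; t < end; t++) {
                for (int k = 0; k < dims; k++) {
                    means[s][k] += obs[t][k];
                }
            }
            for (int k = 0; k < dims; k++) {
                means[s][k] /= (end - start);
            }
        }
        return means;
    }

    /** Maximum likelihood variances (divide by the section size, not size minus one). */
    static double[][] sectionVariances(double[][] obs, int numStates) {
        double[][] means = sectionMeans(obs, numStates);
        int total = obs.length;
        int dims = obs[0].length;
        double[][] vars = new double[numStates][dims];
        for (int s = 0; s < numStates; s++) {
            int start = s * total / numStates;
            int end = (s + 1) * total / numStates;
            for (int t = start; t < end; t++) {
                for (int k = 0; k < dims; k++) {
                    double diff = obs[t][k] - means[s][k];
                    vars[s][k] += diff * diff;
                }
            }
            for (int k = 0; k < dims; k++) {
                vars[s][k] /= (end - start);
            }
        }
        return vars;
    }
}
```

In practice a small floor would be added to each variance so that a flat section does not produce a degenerate Gaussian; that step is omitted here for brevity.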
3. Continuous HMM: In some applications, such as stock market analysis or speech recognition,
the observations are not drawn from a discrete space; the discrete observation $O_t$ is then
replaced by a vector $\vec{O}_t$, meaning that in each state a series of values could be observed.
Specifically, the observation vector in our assignment is $\vec{O}_t = (open, low, high, close)$,
which are the values of the stock index. Usually, for such HMMs, the probability density
function (pdf) is represented as a mixture of multi-dimensional Gaussian
distributions.
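A mixture density of this kind can be evaluated as in the following Java sketch. It assumes diagonal covariance matrices, which keeps the code short; a full-covariance version would need a matrix inverse and determinant. The class name `DiagonalGmm` is hypothetical.

```java
/** Hypothetical sketch: density of a Gaussian mixture with diagonal covariances. */
final class DiagonalGmm {
    private DiagonalGmm() {}

    /** Density of x under one Gaussian with the given mean and per-dimension variances. */
    static double gaussian(double[] x, double[] mean, double[] var) {
        double logP = 0.0;
        for (int k = 0; k < x.length; k++) {
            double diff = x[k] - mean[k];
            logP += -0.5 * (Math.log(2.0 * Math.PI * var[k]) + diff * diff / var[k]);
        }
        return Math.exp(logP);
    }

    /** Weighted sum of component densities; the weights are expected to sum to 1. */
    static double density(double[] x, double[] weights, double[][] means, double[][] vars) {
        double p = 0.0;
        for (int m = 0; m < weights.length; m++) {
            p += weights[m] * gaussian(x, means[m], vars[m]);
        }
        return p;
    }
}
```

One such mixture per hidden state supplies the emission value for that state at each time step.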
4.1.2. ALGORITHMS
Likelihood update algorithm
On the path to prediction, we first need to find the day in the stock market data that is most
similar to a specific day, so that it can be used to predict the following day's close value. To do
so, we first compute the likelihood of the previous days in the desired range. Given one day's
stock data, it is straightforward to compute the likelihood of that specific day from the HMM.
This is Problem (1), which is computed using the Forward-Backward Algorithm proposed in [12, 2,
13]. Algorithm 1 outlines the method used to compute the likelihoods.
Likelihood update algorithm in POSM
Prediction Algorithm
Once the likelihoods of the different days are computed, the last phase is to predict a day's
close value, which is the target of this experiment. To do so, we introduce a parameter, the
likelihood tolerance, denoting the neighborhood within which past days are accepted as similar
to the previous day. Using the likelihood tolerance, we fetch a list of days similar
to yesterday's stock data and then take as the best guess the one that has the highest
likelihood of all. In this experiment, we used likelihood tolerance values in the range
[0.001, 0.01]. From this point, prediction is straightforward: we calculate the difference between
the similar day's values and yesterday's values and then calculate tomorrow's close value.
Algorithm 2 shows the overall pseudo-code used for prediction. Along with the prediction, we also
calculate the MAPE (Mean Absolute Percentage Error) measure.
Prediction algorithm for POSM
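Since the pseudo-code figure is not reproduced in this text, the following Java sketch shows one possible reading of the prediction step and the MAPE measure. It is an assumption-laden outline: the class name `SimilarDayPredictor` is hypothetical, and the rule "tomorrow's close = yesterday's close + (close of the day after the similar day - close of the similar day)" is one interpretation of the description above.

```java
/** Hypothetical sketch: similar-day prediction and MAPE, under one reading of the text. */
final class SimilarDayPredictor {
    private SimilarDayPredictor() {}

    /**
     * likelihood[t] and close[t] refer to the same day t; the last index is "yesterday".
     * Returns the predicted next close, or NaN if no past day is within the tolerance.
     */
    static double predictNextClose(double[] likelihood, double[] close, double tolerance) {
        if (likelihood.length != close.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        int yesterday = close.length - 1;
        int best = -1;
        for (int t = 0; t < yesterday; t++) {
            boolean similar = Math.abs(likelihood[t] - likelihood[yesterday]) <= tolerance;
            if (similar && (best < 0 || likelihood[t] > likelihood[best])) {
                best = t; // keep the similar day with the highest likelihood
            }
        }
        if (best < 0) {
            return Double.NaN;
        }
        // Apply the move that followed the similar day to yesterday's close.
        return close[yesterday] + (close[best + 1] - close[best]);
    }

    /** Mean Absolute Percentage Error, in percent; actual values must be non-zero. */
    static double mape(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs((actual[i] - predicted[i]) / actual[i]);
        }
        return 100.0 * sum / actual.length;
    }
}
```

A caller would typically widen the tolerance within the stated range when no similar day is found, which is why the method signals that case with NaN instead of guessing.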
CHAPTER 5
TESTING
5.1. TESTING PLAN
Type of test: Requirements testing
Will test be performed: Yes
Comments/explanation: Needs to be done to cope up with changing environment
Software component: Fluctuation in the share market.

Type of test: Unit
Will test be performed: Yes
Comments/explanation: Maximum number of defects are found. Each component of code was
tested or analyzed accordingly not only to ensure the best quality of the developed software
but also to make sure that code behaves in the same way as it was intended to. Unit testing was
performed as and when the component was developed.
Software component:
• User interface code
• Baum welch code
• Left right code
• Sgenerator code
• distribution Code
• hmm code
• prediction of day code

Type of test: Integration
Will test be performed: Yes
Comments/explanation: All the well-developed sub-system are integrated together and
tested called as integration testing.
Software component:
• Left right Bakis Algorithm
• Baum welch Algorithm

Type of test: Performance
Will test be performed: Yes
Comments/explanation: Performance is the major criteria for evaluating any type of the
system. It holds importance and is tested likewise.
Software component: Performance of different Algorithms is measured in combination.
Algorithms are:
• Left right Algorithm
• Baum Welch algorithm
Performance is also measured on the precision of output

Type of test: Stress
Will test be performed: No
Comments/explanation: -
Software component: -

Type of test: Compliance
Will test be performed: No
Comments/explanation: -
Software component: -

Type of test: Security
Will test be performed: No
Comments/explanation: -
Software component: -

Table 1: Testing Plan
TEST TEAM DETAILS
Role | Name | Responsibility
Chief Testing Incharge | Aditya datta | To perform requirements, unit, integration, performance and load testing.
Table 2: Test Team details
Activity | Start Date | Completion Date | Hours | Comments
Develop Input | 25-03-2014 | 28-03-2014 | 6 | Nominal and trivial issues tested using the standard test cases designed for the system.
Test Region Setup | 5-04-2014 | 10-04-2014 | 10 | Test region is so defined to check all the features individually as well as in combination.
Table 3: Testing Schedule
Test Environment
Software Items
• Windows 7/8/8.1 Stability
• Mac Stability
• Internet connection
• Java Runtime Environment & Development Kit 1.7 & above
• NetBeans 7.1 & above
Hardware Items
• Personal Computer/Laptop
• Network Interface card
• Wireless connection or connecting cable
Table 4: Test Environment
5.2. COMPONENT TESTING
S.No | Components that require testing | Type of testing required | Technique for writing test case
1 | Left right code | Unit Testing | White Box Testing
2 | Csv to xls converter code | Unit Testing | White Box Testing
3 | Model code | Unit Testing | White Box Testing
4 | graph code | Unit Testing | White Box Testing
5 | sgenerator code | Unit Testing | White Box Testing
6 | Destination | System Testing | Black Box Testing
7 | Source | System Testing | Black Box Testing
8 | User interface code | Performance Testing | Black Box Testing
9 | Utils code | Performance Testing | Black Box Testing
Table 5: Component decomposition and identification of tests required
5.3. TEST CASES
Test Id T1
Input Enter the starting and the last date to update the data
Expected Output Data fetched from Yahoo Finance.
Status Pass
Test Id T2
Input Predict the stock rate for the very next day.
Expected Output We get low, high, opening and closing for the next day.
Status Pass
Test Id T3
Input Predict the stock rate for any day after a week from the given set of data.
Expected Output Message: "Enter a date within a week"
Status Pass
Test Id T4
Input Check the precision of output by entering a date whose values are already known.
Expected Output Outputs are almost precise.
Status Pass
5.4. ERROR AND EXCEPTION HANDLING
Test Case Id | Test Case | Debugging Technique
T1 | Fetching data from Yahoo Finance. | Check the posmdownloader code and check for errors.
Table 6: Error and Exception Handling
5.5. LIMITATION OF THE SOLUTION
• The precision of the output is sometimes not even near the actual value.
• The system sometimes hangs due to loss of connection to the Internet.
5.6. Risk analysis and Mitigation plan
Risk Id | Classification | Description | Risk Area | Probability | Impact | RE = (P*I)
R1 | Performance | Low Performance | Product Engineering | L | H | 81
R2 | Budget | Medium Budget | Program Constraints | M | L | 3
R3 | Project Specification | Infeasible Specifications | Product Engineering | L | M | 3
R4 | Hardware | Hardware Constraints | Product Engineering | L | L | 1
R5 | Accuracy | Low Accuracy | Product Engineering | M | H | 27
R6 | External Inputs | Inaccurate Inputs | Program Constraints | H | H | 81
Table 7: Risk Identification
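The risk exposure column follows RE = P * I. The report does not state the numeric values behind the L/M/H ratings; the short sketch below assumes the scale L = 1, M = 3, H = 9, which reproduces rows such as R4, R5 and R6. The names `SCALE` and `risk_exposure` are hypothetical.

```python
# Assumed numeric scale for the Low/Medium/High ratings (not stated in the report).
SCALE = {"L": 1, "M": 3, "H": 9}


def risk_exposure(probability, impact):
    """Risk exposure RE = P * I for ratings given as 'L', 'M' or 'H'."""
    return SCALE[probability] * SCALE[impact]


print(risk_exposure("M", "H"))  # 27, as listed for R5
print(risk_exposure("H", "H"))  # 81, as listed for R6
print(risk_exposure("L", "L"))  # 1, as listed for R4
```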
S. No. | Risk Area | # of Risk Statements | Weight (in+out) | Total Weight | Priority
1 | Performance | 5 | 1+1+9+9+9 | 29 | 1
2 | Accuracy | 3 | 9+9+9 | 27 | 2
3 | External Inputs | 3 | 3+9+9 | 21 | 3
4 | Hardware | 2 | 9+9 | 18 | 4
5 | Project Specification | 3 | 3+3+1 | 7 | 5
6 | Budget | 2 | 3+1 | 4 | 6
Table 8: Risk Area Wise Total Weighting Factor
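The priorities in Table 8 follow from summing the weights of each risk area and ranking the areas by total weight. A minimal sketch of that arithmetic, using the weights from the table (the variable names are illustrative):

```python
# Weights per risk area, copied from Table 8.
weights = {
    "Performance": [1, 1, 9, 9, 9],
    "Accuracy": [9, 9, 9],
    "External Inputs": [3, 9, 9],
    "Hardware": [9, 9],
    "Project Specification": [3, 3, 1],
    "Budget": [3, 1],
}

totals = {area: sum(values) for area, values in weights.items()}
# The highest total weight gets priority 1.
ranking = sorted(totals, key=totals.get, reverse=True)

for priority, area in enumerate(ranking, start=1):
    print(priority, area, totals[area])
```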
Risk Id | Risk Statement | Risk Area | Priority
R1 | Risk of Performance | Performance | 1
R5 | Risk of Low Accuracy | Accuracy | 2
R6 | Risk of Inaccurate Inputs | External Inputs | 3
R4 | Risk of Inaccurate Hardware equipment | Hardware | 4
Table 9: Risk with Maximum Weight
MITIGATION APPROACHES
Approach 1: To ensure high performance, optimize the system by reducing response time
10 April 2014 25 April 2014 Aditya datta
Additional Resources: High Speed Processor
Approach 2: To ensure high accuracy, optimize the code
10 April 2014 20 April 2014 Aditya datta
Additional Resources: Internet
Approach 3: Ensure the system is secure
16 April 2014 25 April 2014 Aditya datta
Additional Resources: None
Approach 4: Ensure that all the specifications and inputs are correct
16 April 2014 25 April 2014 Aditya datta
Additional Resources: None
CHAPTER 6
FINDINGS AND CONCLUSION
As the results reveal, we have been successful in producing a rough estimation of the future data
required in the project. However, precision becomes more significant as the sensitivity of the
data rises. Thus, building on the work done so far, one idea for achieving better quality in the
future is to consider a weighted ranking of the most similar past data when searching within the
likelihood tolerance. Intuitively, this would help control the deviation from the actual values
that is seen over time. Additionally, further boundary checks could be applied to the predicted
data to prevent undesired deviations in the predictions.

Another idea is "continuous training", as opposed to the current situation in which a period of
time is chosen, the data for that period is collected and used to train an HMM, and the trained
HMM is then used for prediction. A better idea is to persist the trained HMM and, over time,
optimize and tune it according to the latest data as it emerges. This way, intuitively, we would
be improving the HMM over time without losing what was learned from the past.

ANN is a well-researched and established method that has been successfully used to predict time
series behaviour from past datasets. In this work, we proposed the use of an HMM, a new
approach, to predict unknown values in a time series (the stock market). The mean absolute
percentage error (MAPE) values of the two methods are quite similar. However, the primary
weakness of ANNs is the inability to properly explain the models. According to Ripley, "the
design and learning for feed-forward networks are hard". The proposed method using an HMM to
forecast stock prices is explainable and has a solid statistical foundation. The results show the
potential of using HMMs for time series prediction. In our future work we plan to develop hybrid
systems using AI paradigms with HMM to further improve the accuracy and efficiency of our
forecasts.
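The weighted-ranking and boundary-check ideas can be made concrete with a small sketch. It is only one possible reading of these proposals and was not part of the implemented system: the weights are assumed to be the likelihood values of the similar days, the boundary is assumed to be a cap on the daily move expressed as a fraction of the last close, and the names `weighted_prediction` and `max_move` are hypothetical.

```python
def weighted_prediction(last_close, candidates, max_move=0.05):
    """Combine several similar days instead of using only the best one.

    candidates: list of (likelihood, next_day_change) pairs for the similar days.
    max_move: assumed cap on the daily move, as a fraction of last_close.
    """
    total = sum(likelihood for likelihood, _ in candidates)
    if total <= 0:
        return last_close  # nothing to weight by; fall back to "no change"
    # Likelihood-weighted average of the changes that followed each similar day.
    change = sum(likelihood * delta for likelihood, delta in candidates) / total
    # Boundary check: keep the prediction within +/- max_move of last_close.
    limit = max_move * last_close
    change = max(-limit, min(limit, change))
    return last_close + change


# Hypothetical similar days: (likelihood, change in close on the following day).
candidates = [(0.30, 2.0), (0.10, -2.0)]
print(weighted_prediction(100.0, candidates))     # close to 101.0
print(weighted_prediction(100.0, [(0.5, 20.0)]))  # capped near 105.0
```

Weighting by likelihood lets a strongly matching day dominate without discarding the other similar days, while the cap keeps a single outlier from producing an implausible jump.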
Correlation between predicted and actual closing stock price for Google.
Correlation between predicted and actual closing stock price for Dell.
REFERENCES
[1] Kuo R J, Lee L C and Lee C F (1996), Integration of Artificial NN and Fuzzy Delphi for
Stock market forecasting, IEEE International Conference on Systems, Man, and Cybernetics,
Vol. 2, pp. 1073-1078.
[2] Kimoto T, Asakawa K, Yoda M and Takeoka M (1990), Stock market prediction system with
modular neural networks, Proc. International Joint Conference on Neural Networks, San Diego,
Vol. 1, pp. 1-6.
[3] White H (1988), Economic Prediction Using Neural Networks: The Case of IBM Daily Stock
Returns, Proceedings of the Second Annual IEEE Conference on Neural Networks, Vol. 2, pp.
451-458.
[4] Chiang W C, Urban T L and Baldridge G W (1996), A Neural Network Approach to Mutual
Fund Net Asset Value Forecasting. Omega, Vol. 24 (2), pp. 205-215.
[5] Kim S H and Chun S H (1998), Graded forecasting using an array of bipolar predictions:
application of probabilistic neural networks to a stock market index. International Journal of
Forecasting, Vol. 14, pp. 323-337.
[6] Romahi Y and Shen Q (2000), Dynamic Financial Forecasting with Automatically Induced
Fuzzy Associations, Proceedings of the 9th international conference on Fuzzy systems,
pp. 493-498.
[7] Thammano A (1999), Neuro-fuzzy Model for Stock Market Prediction, Proceedings of the
Artificial Neural Networks in Engineering Conference, ASME Press, New York, pp. 587-591.
[8] Abraham A, Nath B and Mahanti P K (2001), Hybrid Intelligent Systems for Stock Market
Analysis, Proceedings of the International Conference on Computational Science. Springer, pp.
337-345.
[9] Raposo R De C T and Cruz A J De O (2004), Stock Market prediction based on
fundamentalist analysis with Fuzzy-Neural Networks.
http://www.labic.nce.ufrj.br/downloads/3wses_fsfs_2002.pdf
[10] Cao L and Tay F E H (2001), Financial Forecasting Using Support Vector Machines, Neural
Computation and Application, Vol. 10, pp. 184-192.
[11] Huang X, Ariki Y, Jack M (1990), Hidden Markov Models for speech recognition.
Edinburgh University Press.
.

More Related Content

What's hot

IRJET- Stock Market Prediction using Machine Learning
IRJET- Stock Market Prediction using Machine LearningIRJET- Stock Market Prediction using Machine Learning
IRJET- Stock Market Prediction using Machine Learning
IRJET Journal
 
Stock Market Prediction
Stock Market PredictionStock Market Prediction
Stock Market Prediction
MRIDUL GUPTA
 
STOCK MARKET PREDICTION USING MACHINE LEARNING METHODS
STOCK MARKET PREDICTION USING MACHINE LEARNING METHODSSTOCK MARKET PREDICTION USING MACHINE LEARNING METHODS
STOCK MARKET PREDICTION USING MACHINE LEARNING METHODS
IAEME Publication
 
stock market prediction
stock market predictionstock market prediction
stock market prediction
SRIGINES
 
Stock market prediction technique:
Stock market prediction technique:Stock market prediction technique:
Stock market prediction technique:
Paladion Networks
 
STOCK MARKET PRREDICTION WITH FEATURE EXTRACTION USING NEURAL NETWORK TEHNIQUE
STOCK MARKET PRREDICTION WITH FEATURE EXTRACTION USING NEURAL NETWORK TEHNIQUESTOCK MARKET PRREDICTION WITH FEATURE EXTRACTION USING NEURAL NETWORK TEHNIQUE
STOCK MARKET PRREDICTION WITH FEATURE EXTRACTION USING NEURAL NETWORK TEHNIQUE
Richa Handa
 
Final PPT.pptx
Final PPT.pptxFinal PPT.pptx
Final PPT.pptx
samarth70133
 
Stock prediction system using ann
Stock prediction system using annStock prediction system using ann
Stock prediction system using ann
eSAT Publishing House
 
Stock Price Trend Forecasting using Supervised Learning
Stock Price Trend Forecasting using Supervised LearningStock Price Trend Forecasting using Supervised Learning
Stock Price Trend Forecasting using Supervised Learning
Sharvil Katariya
 
Machine learning prediction of stock markets
Machine learning prediction of stock marketsMachine learning prediction of stock markets
Machine learning prediction of stock markets
Nikola Milosevic
 
Stock Price Prediction PPT
Stock Price Prediction  PPTStock Price Prediction  PPT
Stock Price Prediction PPT
PrashantGanji4
 
Sales and inventory management system project report
Sales and inventory management system project reportSales and inventory management system project report
Sales and inventory management system project report
Fuckboy123
 
Crop Recommendation System to Maximize Crop Yield using Machine Learning Tech...
Crop Recommendation System to Maximize Crop Yield using Machine Learning Tech...Crop Recommendation System to Maximize Crop Yield using Machine Learning Tech...
Crop Recommendation System to Maximize Crop Yield using Machine Learning Tech...
IRJET Journal
 
Farmer Recommendation system
Farmer Recommendation systemFarmer Recommendation system
Farmer Recommendation system
Sandeep Wakchaure
 
Google Stock Price Forecasting
Google Stock Price ForecastingGoogle Stock Price Forecasting
Google Stock Price Forecasting
Arkaprava Kundu
 
Inventory Managment
Inventory ManagmentInventory Managment
Inventory Managment
sai prakash
 
Performance analysis and prediction of stock market for investment decision u...
Performance analysis and prediction of stock market for investment decision u...Performance analysis and prediction of stock market for investment decision u...
Performance analysis and prediction of stock market for investment decision u...
Hari KC
 

What's hot (20)

IRJET- Stock Market Prediction using Machine Learning
IRJET- Stock Market Prediction using Machine LearningIRJET- Stock Market Prediction using Machine Learning
IRJET- Stock Market Prediction using Machine Learning
 
Stock Market Prediction
Stock Market PredictionStock Market Prediction
Stock Market Prediction
 
STOCK MARKET PREDICTION USING MACHINE LEARNING METHODS
STOCK MARKET PREDICTION USING MACHINE LEARNING METHODSSTOCK MARKET PREDICTION USING MACHINE LEARNING METHODS
STOCK MARKET PREDICTION USING MACHINE LEARNING METHODS
 
stock market prediction
stock market predictionstock market prediction
stock market prediction
 
STOCK MARKET PREDICTION
STOCK MARKET PREDICTIONSTOCK MARKET PREDICTION
STOCK MARKET PREDICTION
 
Stock market prediction technique:
Stock market prediction technique:Stock market prediction technique:
Stock market prediction technique:
 
STOCK MARKET PRREDICTION WITH FEATURE EXTRACTION USING NEURAL NETWORK TEHNIQUE
STOCK MARKET PRREDICTION WITH FEATURE EXTRACTION USING NEURAL NETWORK TEHNIQUESTOCK MARKET PRREDICTION WITH FEATURE EXTRACTION USING NEURAL NETWORK TEHNIQUE
STOCK MARKET PRREDICTION WITH FEATURE EXTRACTION USING NEURAL NETWORK TEHNIQUE
 
Timeseries forecasting
Timeseries forecastingTimeseries forecasting
Timeseries forecasting
 
Final PPT.pptx
Final PPT.pptxFinal PPT.pptx
Final PPT.pptx
 
Stock prediction system using ann
Stock prediction system using annStock prediction system using ann
Stock prediction system using ann
 
Stock Price Trend Forecasting using Supervised Learning
Stock Price Trend Forecasting using Supervised LearningStock Price Trend Forecasting using Supervised Learning
Stock Price Trend Forecasting using Supervised Learning
 
Machine learning prediction of stock markets
Machine learning prediction of stock marketsMachine learning prediction of stock markets
Machine learning prediction of stock markets
 
Presentation1
Presentation1Presentation1
Presentation1
 
Stock Price Prediction PPT
Stock Price Prediction  PPTStock Price Prediction  PPT
Stock Price Prediction PPT
 
Sales and inventory management system project report
Sales and inventory management system project reportSales and inventory management system project report
Sales and inventory management system project report
 
Crop Recommendation System to Maximize Crop Yield using Machine Learning Tech...
Crop Recommendation System to Maximize Crop Yield using Machine Learning Tech...Crop Recommendation System to Maximize Crop Yield using Machine Learning Tech...
Crop Recommendation System to Maximize Crop Yield using Machine Learning Tech...
 
Farmer Recommendation system
Farmer Recommendation systemFarmer Recommendation system
Farmer Recommendation system
 
Google Stock Price Forecasting
Google Stock Price ForecastingGoogle Stock Price Forecasting
Google Stock Price Forecasting
 
Inventory Managment
Inventory ManagmentInventory Managment
Inventory Managment
 
Performance analysis and prediction of stock market for investment decision u...
Performance analysis and prediction of stock market for investment decision u...Performance analysis and prediction of stock market for investment decision u...
Performance analysis and prediction of stock market for investment decision u...
 

Similar to Aditya report finaL

Stock Market Analysis and Prediction (1) (2).pdf
Stock Market Analysis and Prediction (1) (2).pdfStock Market Analysis and Prediction (1) (2).pdf
Stock Market Analysis and Prediction (1) (2).pdf
digitallynikitasharm
 
Stock Price Prediction Using Sentiment Analysis and Historic Data of Stock
Stock Price Prediction Using Sentiment Analysis and Historic Data of StockStock Price Prediction Using Sentiment Analysis and Historic Data of Stock
Stock Price Prediction Using Sentiment Analysis and Historic Data of Stock
IRJET Journal
 
Stock Market Prediction Using Deep Learning
Stock Market Prediction Using Deep LearningStock Market Prediction Using Deep Learning
Stock Market Prediction Using Deep Learning
IRJET Journal
 
Stock Market Prediction Analysis
Stock Market Prediction AnalysisStock Market Prediction Analysis
Stock Market Prediction Analysis
IRJET Journal
 
4317mlaij02
4317mlaij024317mlaij02
4317mlaij02
mlaij
 
STOCK PRICE PREDICTION USING TIME SERIES
STOCK PRICE PREDICTION USING TIME SERIESSTOCK PRICE PREDICTION USING TIME SERIES
STOCK PRICE PREDICTION USING TIME SERIES
IRJET Journal
 
STOCK PRICE PREDICTION USING TIME SERIES
STOCK PRICE PREDICTION USING TIME SERIESSTOCK PRICE PREDICTION USING TIME SERIES
STOCK PRICE PREDICTION USING TIME SERIES
IRJET Journal
 
IRJET- Stock Market Prediction using Machine Learning Techniques
IRJET- Stock Market Prediction using Machine Learning TechniquesIRJET- Stock Market Prediction using Machine Learning Techniques
IRJET- Stock Market Prediction using Machine Learning Techniques
IRJET Journal
 
The Analysis of Share Market using Random Forest & SVM
The Analysis of Share Market using Random Forest & SVMThe Analysis of Share Market using Random Forest & SVM
The Analysis of Share Market using Random Forest & SVM
IRJET Journal
 
ELASTIC PROPERTY EVALUATION OF FIBRE REINFORCED GEOPOLYMER COMPOSITE USING SU...
ELASTIC PROPERTY EVALUATION OF FIBRE REINFORCED GEOPOLYMER COMPOSITE USING SU...ELASTIC PROPERTY EVALUATION OF FIBRE REINFORCED GEOPOLYMER COMPOSITE USING SU...
ELASTIC PROPERTY EVALUATION OF FIBRE REINFORCED GEOPOLYMER COMPOSITE USING SU...
IRJET Journal
 
STOCK MARKET PRICE PREDICTION MANAGEMENT SYSTEM.pdf
STOCK MARKET PRICE PREDICTION MANAGEMENT SYSTEM.pdfSTOCK MARKET PRICE PREDICTION MANAGEMENT SYSTEM.pdf
STOCK MARKET PRICE PREDICTION MANAGEMENT SYSTEM.pdf
programmersridhar
 
Analysis of Trends in Stock Market.pdf
Analysis of Trends in Stock Market.pdfAnalysis of Trends in Stock Market.pdf
Analysis of Trends in Stock Market.pdf
Valerie Felton
 
STOCK MARKET PREDICTION USING MACHINE LEARNING IN PYTHON
STOCK MARKET PREDICTION USING MACHINE LEARNING IN PYTHONSTOCK MARKET PREDICTION USING MACHINE LEARNING IN PYTHON
STOCK MARKET PREDICTION USING MACHINE LEARNING IN PYTHON
IRJET Journal
 
stock price prediction using machine learning
stock price prediction using machine learningstock price prediction using machine learning
stock price prediction using machine learning
gauravwankar27
 
STOCK PRICE PREDICTION AND RECOMMENDATION USINGMACHINE LEARNING TECHNIQUES AN...
STOCK PRICE PREDICTION AND RECOMMENDATION USINGMACHINE LEARNING TECHNIQUES AN...STOCK PRICE PREDICTION AND RECOMMENDATION USINGMACHINE LEARNING TECHNIQUES AN...
STOCK PRICE PREDICTION AND RECOMMENDATION USINGMACHINE LEARNING TECHNIQUES AN...
IRJET Journal
 
A Compendium of Various Applications of Machine Learning
A Compendium of Various Applications of Machine LearningA Compendium of Various Applications of Machine Learning
A Compendium of Various Applications of Machine Learning
IRJET Journal
 
Project report on Share Market application
Project report on Share Market applicationProject report on Share Market application
Project report on Share Market application
KRISHNA PANDEY
 
IRJET- Stock Market Forecasting Techniques: A Survey
IRJET- Stock Market Forecasting Techniques: A SurveyIRJET- Stock Market Forecasting Techniques: A Survey
IRJET- Stock Market Forecasting Techniques: A Survey
IRJET Journal
 
OR 14 15-unit_1
OR 14 15-unit_1OR 14 15-unit_1
OR 14 15-unit_1
Nageswara Rao Thots
 
A novel hybrid deep learning model for price prediction
A novel hybrid deep learning model for price prediction A novel hybrid deep learning model for price prediction
A novel hybrid deep learning model for price prediction
IJECEIAES
 

Similar to Aditya report finaL (20)

Stock Market Analysis and Prediction (1) (2).pdf
Stock Market Analysis and Prediction (1) (2).pdfStock Market Analysis and Prediction (1) (2).pdf
Stock Market Analysis and Prediction (1) (2).pdf
 
Stock Price Prediction Using Sentiment Analysis and Historic Data of Stock
Stock Price Prediction Using Sentiment Analysis and Historic Data of StockStock Price Prediction Using Sentiment Analysis and Historic Data of Stock
Stock Price Prediction Using Sentiment Analysis and Historic Data of Stock
 
Stock Market Prediction Using Deep Learning
Stock Market Prediction Using Deep LearningStock Market Prediction Using Deep Learning
Stock Market Prediction Using Deep Learning
 
Stock Market Prediction Analysis
Stock Market Prediction AnalysisStock Market Prediction Analysis
Stock Market Prediction Analysis
 
4317mlaij02
4317mlaij024317mlaij02
4317mlaij02
 
STOCK PRICE PREDICTION USING TIME SERIES
STOCK PRICE PREDICTION USING TIME SERIESSTOCK PRICE PREDICTION USING TIME SERIES
STOCK PRICE PREDICTION USING TIME SERIES
 
STOCK PRICE PREDICTION USING TIME SERIES
STOCK PRICE PREDICTION USING TIME SERIESSTOCK PRICE PREDICTION USING TIME SERIES
STOCK PRICE PREDICTION USING TIME SERIES
 
IRJET- Stock Market Prediction using Machine Learning Techniques
IRJET- Stock Market Prediction using Machine Learning TechniquesIRJET- Stock Market Prediction using Machine Learning Techniques
IRJET- Stock Market Prediction using Machine Learning Techniques
 
The Analysis of Share Market using Random Forest & SVM
The Analysis of Share Market using Random Forest & SVMThe Analysis of Share Market using Random Forest & SVM
The Analysis of Share Market using Random Forest & SVM
 
ELASTIC PROPERTY EVALUATION OF FIBRE REINFORCED GEOPOLYMER COMPOSITE USING SU...
ELASTIC PROPERTY EVALUATION OF FIBRE REINFORCED GEOPOLYMER COMPOSITE USING SU...ELASTIC PROPERTY EVALUATION OF FIBRE REINFORCED GEOPOLYMER COMPOSITE USING SU...
ELASTIC PROPERTY EVALUATION OF FIBRE REINFORCED GEOPOLYMER COMPOSITE USING SU...
 
STOCK MARKET PRICE PREDICTION MANAGEMENT SYSTEM.pdf
STOCK MARKET PRICE PREDICTION MANAGEMENT SYSTEM.pdfSTOCK MARKET PRICE PREDICTION MANAGEMENT SYSTEM.pdf
STOCK MARKET PRICE PREDICTION MANAGEMENT SYSTEM.pdf
 
Analysis of Trends in Stock Market.pdf
Analysis of Trends in Stock Market.pdfAnalysis of Trends in Stock Market.pdf
Analysis of Trends in Stock Market.pdf
 
STOCK MARKET PREDICTION USING MACHINE LEARNING IN PYTHON
STOCK MARKET PREDICTION USING MACHINE LEARNING IN PYTHONSTOCK MARKET PREDICTION USING MACHINE LEARNING IN PYTHON
STOCK MARKET PREDICTION USING MACHINE LEARNING IN PYTHON
 
stock price prediction using machine learning
stock price prediction using machine learningstock price prediction using machine learning
stock price prediction using machine learning
 
STOCK PRICE PREDICTION AND RECOMMENDATION USINGMACHINE LEARNING TECHNIQUES AN...
STOCK PRICE PREDICTION AND RECOMMENDATION USINGMACHINE LEARNING TECHNIQUES AN...STOCK PRICE PREDICTION AND RECOMMENDATION USINGMACHINE LEARNING TECHNIQUES AN...
STOCK PRICE PREDICTION AND RECOMMENDATION USINGMACHINE LEARNING TECHNIQUES AN...
 
A Compendium of Various Applications of Machine Learning
A Compendium of Various Applications of Machine LearningA Compendium of Various Applications of Machine Learning
A Compendium of Various Applications of Machine Learning
 
Project report on Share Market application
Project report on Share Market applicationProject report on Share Market application
Project report on Share Market application
 
IRJET- Stock Market Forecasting Techniques: A Survey
IRJET- Stock Market Forecasting Techniques: A SurveyIRJET- Stock Market Forecasting Techniques: A Survey
IRJET- Stock Market Forecasting Techniques: A Survey
 
OR 14 15-unit_1
OR 14 15-unit_1OR 14 15-unit_1
OR 14 15-unit_1
 
A novel hybrid deep learning model for price prediction
A novel hybrid deep learning model for price prediction A novel hybrid deep learning model for price prediction
A novel hybrid deep learning model for price prediction
 

Recently uploaded

Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Teleport Manpower Consultant
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
zwunae
 
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdfTutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
aqil azizi
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
ydteq
 
HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generation
Robbie Edward Sayers
 
English lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdfEnglish lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdf
BrazilAccount1
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
manasideore6
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdfHybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
fxintegritypublishin
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
bakpo1
 
Building Electrical System Design & Installation
Building Electrical System Design & InstallationBuilding Electrical System Design & Installation
Building Electrical System Design & Installation
symbo111
 
Investor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptxInvestor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptx
AmarGB2
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
Neometrix_Engineering_Pvt_Ltd
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
Amil Baba Dawood bangali
 
Basic Industrial Engineering terms for apparel
Basic Industrial Engineering terms for apparelBasic Industrial Engineering terms for apparel
Basic Industrial Engineering terms for apparel
top1002
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
gerogepatton
 
space technology lecture notes on satellite
space technology lecture notes on satellitespace technology lecture notes on satellite
space technology lecture notes on satellite
ongomchris
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
Kamal Acharya
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
Massimo Talia
 
MCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdfMCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdf
Osamah Alsalih
 

Recently uploaded (20)

Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
 
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdfTutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
 
HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generation
 
English lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdfEnglish lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdf
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf

Aditya report finaL

  • 4. 4 (II) DECLARATION I hereby declare that this submission is my/our own work and that, to the best of my knowledge and belief, it contains no material previously published or written by another person nor material which has been accepted for the award of any other degree or diploma of the university or other institute of higher learning, except where due acknowledgment has been made in the text. Place: Signature: Date: Name: Aditya datta Enrollment No: 9910103561
  • 5. 5 (III) CERTIFICATE This is to certify that the work titled “Stock market prediction” submitted by “Aditya datta” in partial fulfillment for the award of the degree of Bachelor of Technology of Jaypee Institute of Information Technology University, Noida has been carried out under my supervision. This work has not been submitted partially or wholly to any other University or Institute for the award of this or any other degree or diploma. Signature of Supervisor …………………….. Name of Supervisor …………………….. Designation …………………….. Date ……………………..
  • 6. 6 (IV) ACKNOWLEDGEMENT I take this opportunity to acknowledge all the people who have helped me wholeheartedly at every stage of the project. I would like to express my special thanks and gratitude to my respected supervisor Mr. Bansidhar Joshi, who gave me the golden opportunity to do this project on the topic “Stock market prediction”. The guidance and support received from him were vital for the success of the project. I also extend my sincere thanks to all other faculty members of the Computer Science Engineering Department who helped me in the project. Signature of the Student …………………….. Name of Student …………………….. Enrollment Number …………………….. Date ……………………..
  • 7. 7 (V) SUMMARY Stock market prediction is a classic problem which has been analyzed extensively using tools and techniques of Machine Learning. Interesting properties which make this modeling non-trivial are the time dependence, volatility and other similar complex dependencies of this problem. To incorporate these, Hidden Markov Models (HMMs) have recently been applied to forecast and predict the stock market. We present the Maximum a Posteriori HMM approach for forecasting stock values for the next day given historical data. In our approach, we consider the fractional change in stock value and the intra-day high and low values of the stock to train the continuous HMM. This HMM is then used to make a Maximum a Posteriori decision over all the possible stock values for the next day. We test our approach on several stocks, and compare the performance to some of the existing methods using HMMs and Artificial Neural Networks using Mean Absolute Percentage Error (MAPE). __________________ __________________ Signature of Student Signature of Supervisor Name Name Date Date
  • 8. 8 (VI) LIST OF FIGURES
    FIGURE NUMBER | NAME | PAGE NUMBER
    Figure 1 | USE CASE DIAGRAM | 22
    Figure 2 | DFD DIAGRAM | 23
    Figure 3 | DFD LEVEL1 DIAGRAM | 24
    Figure 4 | ARCHITECTURE | 25
    Figure 6 | PROCESS OVERVIEW | 25
    Figure 7 | RISK ANALYSIS AND MITIGATION DIAGRAM | 35
    Figure 8 | CORRELATION BETWEEN ACTUAL AND PREDICTED VALUE FOR DELL | 39
    Figure 9 | CORRELATION BETWEEN ACTUAL AND PREDICTED VALUE FOR GOOGLE | 39
  • 9. 9 (VII) LIST OF TABLES
    TABLE NUMBER | NAME | PAGE NUMBER
    Table 1 | Risk Analysis & Mitigation plan | 30
    Table 2 | Test Plan | 33
    Table 3 | Test Activity | 34
    Table 4 | Software & Hardware Items | 34
    Table 5 | Component Testing | 35
    Table 6 | Test Cases | 36
  • 10. 10 (IX) LIST OF ACRONYMS
    S.NO. | ABBREVIATION | EXPANSION
    1 | POSM | Prediction of stock market
    2 | HMM | Hidden Markov Model
    3 | CSV | Comma Separated Values
    4 | LRA | Left-right algorithm
    5 | BWA | Baum-Welch algorithm
    6 | ARIMA | Autoregressive Integrated Moving Average
  • 11. 11 CHAPTER 1 INTRODUCTION 1.1. GENERAL The stock market is a network which provides a platform for almost all major economic transactions in the world at a dynamic rate called the stock value, which is based on market equilibrium. Predicting this stock value offers enormous arbitrage profit opportunities, which are a huge motivation for research in this area. Knowledge of a stock value beforehand by even a fraction of a second can result in high profits. Similarly, a probabilistically correct prediction can be extremely profitable in the amortized case. This attractiveness of finding a solution has prompted researchers in both industry and academia to find a way past problems like volatility, seasonality and dependence on time, economies and the rest of the market. Previously, techniques of Artificial Intelligence and Machine Learning, like Artificial Neural Networks, Fuzzy Logic and Support Vector Machines, have been used to solve these problems. Recently, the Hidden Markov Model (HMM) approach was applied to this problem to predict the pattern. The reason for using this approach is fairly intuitive. HMMs have been successful in analyzing and predicting time-dependent phenomena, or time series. They have been used extensively in the past in speech recognition, ECG analysis etc. The stock market prediction problem is similar in its inherent relation with time. Hidden Markov Models are based on a set of unobserved underlying states amongst which transitions can occur, and each state is associated with a set of possible observations. The stock market can be seen in a similar manner. The underlying states, which determine the behavior of the stock value, are usually invisible to the investor. The transitions between these underlying states are based on company policy, decisions, economic conditions etc. The visible effect which reflects these is the value of the stock. Clearly, the HMM conforms well to this real-life scenario. 
The choice of attributes, or feature selection, is significant in this approach. In the past, various attempts have been made using the volume of trade, the momentum of the stock, correlation with the market, the volatility of the stock etc. In our model we use the daily fractional change in the stock value, and the fractional
  • 12. 12 deviation of intra-day high and low. The fractional change is necessary in order to make the required prediction. Measuring the fractional deviation of both the intra-day high and low value is a good measure as it gives the direction of the volatility as well. We use three different stocks for evaluating the approach: Google, Apple Inc., and Dell Inc. A separate HMM is trained for each stock. The one constraint on the training set is that it must have suitable variability in the data. This is taken care of by taking appropriately large periods of time in which the stock value changes steadily yet significantly. The remainder of the paper is organized as follows. In Section II we review some of the existing techniques for stock market prediction, especially the ones using HMMs. In Section III we give details of our approach with mathematical justifications. In Section IV we describe the data sets and provide the experimental results. Finally, in Section V we discuss the results and conclude the paper. 1.2. PROBLEM STATEMENT Stock market prediction has been one of the more active research areas in the past, given the obvious interest of a lot of major companies. In this research several machine learning techniques have been applied with varying degrees of success. However, stock forecasting is still severely limited due to its non-stationary, seasonal and in general unpredictable nature. Predicting forecasts from just the previous stock data is an even more challenging task since it ignores several outlying factors (such as the state of the company, economic conditions, ownership etc.). Machine learning techniques which have been widely applied to forecasting stock market data include Artificial Neural Networks (ANNs), Fuzzy Logic (FL), and Support Vector Machines (SVMs). 
Out of these, ANNs have been the most successful; however, even their performance is quite limited and not reliable enough. Wang and Leu trained a system based on a recurrent neural network which used features extracted using Autoregressive Integrated Moving Average (ARIMA) analysis, which showed reasonable accuracy. HMMs have only rarely been applied to the given problem in the past, which is surprising given the time-dependent nature of the market. HMMs have been successfully applied in the areas of Speech Recognition, DNA sequencing, and ECG analysis. Shi and Weigend used HMMs to predict changes in the trajectories of financial time series data. Recently, Hassan combined HMMs and fuzzy logic rules to improve the prediction accuracy on non-stationary stock data sets. The performance of this approach was significantly better than that of past approaches. The basic idea there was to combine HMM's
  • 13. 13 data pattern identification method to partition the data space with the generation of fuzzy logic for the prediction of multivariate financial time series data. Our approach is similar to the one taken by Nobakht et al. They model the daily opening, closing, high and low indices as continuous observations from underlying hidden states. The main difference between our approach and theirs lies in the features used (we use fractional changes in the above quantities) and in the manner of forecasting. While they look for similar data patterns in the past data, we maximize the likelihood of a sequence of observations over all possible forecasted future values. 1.3. APPROACH TO PROBLEM IN TERMS OF TECHNOLOGY /PLATFORM TO BE USED We use a continuous Hidden Markov Model (CHMM) to model the stock data as a time series. An HMM (denoted by $\lambda$) can be written as $\lambda = (\pi, A, B)$, where $A$ is the transition matrix whose elements give the probability of a transition from one state to another, $B$ is the emission matrix giving the probability $b_j(O_t)$ of observing $O_t$ when in state $j$, and $\pi$ gives the initial probabilities of the states at $t = 1$. Further, for a continuous HMM the emission probabilities are modelled as Gaussian Mixture Models (GMMs):
    $$b_j(O_t) = \sum_{m=1}^{M} c_{jm}\, \mathcal{N}(O_t; \mu_{jm}, \Sigma_{jm})$$
    where:
    • $M$ is the number of Gaussian mixture components.
    • $c_{jm}$ is the weight of the $m$th mixture component in state $j$.
    • $\mu_{jm}$ is the mean vector for the $m$th component in the $j$th state.
    • $\mathcal{N}(O_t; \mu_{jm}, \Sigma_{jm})$ is the probability of observing $O_t$ in the multi-dimensional Gaussian distribution with mean $\mu_{jm}$ and covariance matrix $\Sigma_{jm}$.
    Training of the above HMM from given sequences of observations is done using the Baum-Welch algorithm, which uses Expectation-Maximization (EM) to arrive at the optimal parameters for the HMM. In our model the observations are the daily stock data in the form of the three-dimensional vector
  • 14. 14
    $$O_t = \left(\frac{close - open}{open},\ \frac{high - open}{open},\ \frac{open - low}{open}\right) := (fracChange,\ fracHigh,\ fracLow)$$
    Here open is the day opening value, close is the day closing value, high is the day high, and low is the day low. We use fractional changes to model the variation in stock data, which remains constant over the years. Once the model is trained, testing is done using an approximate Maximum a Posteriori (MAP) approach. We assume a latency of $d$ days while forecasting future stock values. Hence, the problem becomes as follows: given the HMM model $\lambda$ and the stock values for $d$ days along with the stock open value for the $(d+1)$th day, we need to compute the close value for the $(d+1)$th day. This is equivalent to estimating the fractional change $\frac{close - open}{open}$ for the $(d+1)$th day. For this, we compute the MAP estimate of the observation vector $O_{d+1}$. Let $\hat{O}_{d+1}$ be the MAP estimate of the observation on the $(d+1)$th day, given the values of the first $d$ days. Then,
    $$\hat{O}_{d+1} = \arg\max_{O_{d+1}} P(O_{d+1} \mid O_1, \ldots, O_d, \lambda) = \arg\max_{O_{d+1}} \frac{P(O_1, \ldots, O_d, O_{d+1} \mid \lambda)}{P(O_1, \ldots, O_d \mid \lambda)}$$
    The observation vector $O_{d+1}$ is varied over all possible values. Since the denominator is constant with respect to $O_{d+1}$, the MAP estimate becomes
    $$\hat{O}_{d+1} = \arg\max_{O_{d+1}} P(O_1, \ldots, O_d, O_{d+1} \mid \lambda)$$
    The joint probability value can be computed using the forward-backward algorithm for HMMs. In practice, we compute the probability over a discrete set of possible values of $O_{d+1}$ (see Table II) and find the maximum, hence the name MAP HMM model. The computational complexity of the forward-backward algorithm for finding the likelihood of a given observation is $O(n^2 d)$, where $n$ is the number of states in the HMM and $d$ is the latency. This procedure is repeated over the discrete set of possible values of $O_{d+1}$. In our case $n = 4$, $d = 10$ and there are 50x10x10 possible values of $O_{d+1}$. The closing value of a
  • 15. 15 particular day can be computed by using the day opening value and the predicted fractional change for that day. CHAPTER 2 ADDITIONAL LITERATURE SURVEY 2.1. SUMMARY OF RELEVANT PAPERS
    Paper 1
    Title of paper: Stock Market Forecasting Using Hidden Markov Model: A New Approach
    Authors: Md. Rafiul Hassan and Baikunth Nath
    Year of Publication: 2005
    Publishing details: Proceedings of the 2005 5th International Conference on Intelligent Systems Design and Applications
    Summary: This paper presents a Hidden Markov Model (HMM) approach for forecasting stock prices for interrelated markets. We apply HMM to forecast some of the airlines stock. HMMs have been extensively used for pattern recognition and classification problems because of their proven suitability for modelling dynamic systems. However, using HMM for predicting future events is not straightforward. Here we use only one HMM that is trained on the past dataset of the chosen airlines. The trained HMM is used to search for the variable of interest behavioural data pattern from the past dataset. By interpolating the neighbouring values of these datasets, forecasts are prepared. The results obtained using HMM are encouraging and HMM offers a new paradigm for stock market forecasting, an area that has been of much research interest lately.
    Key Words: HMM, stock market forecasting, financial time series, feature selection
    Web link: hassan-nath-2005 stock_market_forecasting_using_hidden_markov_model_a_new_approach.pdf
  • 16. 16
    Paper 2
    Title of paper: Analysis of Hidden Markov Models and Support Vector Machines in Financial Applications
    Authors: Satish Rao, Jerry Hong
    Year of Publication: 12 May 2010
    Publishing details: Electrical Engineering and Computer Sciences, University of California at Berkeley
    Summary: This paper presents two approaches to helping investors make better decisions. First, we discuss conventional methods, such as using the Efficient Market Hypothesis and technical indicators, for forecasting stock prices and movements. We will show that these methods are inadequate, and thus we need to rethink the issue. Afterwards, we will discuss using artificial intelligence, such as Hidden Markov Models and Support Vector Machines, to help investors gather and compute the enormous amount of data that will enable them to make informed decisions. We will leverage the Simlio engine to train both the HMM and SVM on past datasets and use it to predict future stock movements. The results are encouraging and they warrant future research on using AI for market forecasts.
    Web link: EECS-2010-63.pdf
  • 17. 17
    Paper 3
    Title of paper: Stock Market Prediction using Hidden Markov Model
    Authors: Aditya Gupta, Bhuwan Dhingra
    Year of Publication: 2010
    Publishing details: International Journal of Multimedia and Ubiquitous Engineering
    Summary: Stock market prediction is a classic problem which has been analyzed extensively using tools and techniques of Machine Learning. Interesting properties which make this modelling non-trivial are the time dependence, volatility and other similar complex dependencies of this problem. To incorporate these, Hidden Markov Models (HMMs) have recently been applied to forecast and predict the stock market. We present the Maximum a Posteriori HMM approach for forecasting the stock value for the next day given historical data. In our approach, we consider the fractional change in stock value, and the intra-day high and low values of the stock, to train the continuous HMM. This HMM is then used to make a Maximum a Posteriori decision over all the possible stock values for the next day. We test our approach on several stocks, and compare the performance to some of the existing methods using HMMs and Artificial Neural Networks using Mean Absolute Percentage Error (MAPE).
    Web link: Y8036_Y8167.pdf
  • 18. 18
    Paper 4
    Title of paper: Stock market trend analysis using Hidden Markov Model
    Authors: Kavitha G, Udhaya kumar A, Nagarajan D
    Year of Publication: 2009
    Publishing details: IEEE Transactions on Network and Service Management
    Summary: Price movements of the stock market are not totally random. In fact, what drives the financial market and what pattern financial time series follow have long been of interest to economists, mathematicians and, most recently, computer scientists. This paper gives an idea about the trend analysis of stock market behaviour using the Hidden Markov Model (HMM). The trend once followed over a particular period will surely repeat in future. The one-day difference in close value of stocks for a certain period is found and its corresponding steady state probability distribution values are determined. The pattern of the stock market behaviour is then decided based on these probability values for a particular time. The goal is to figure out the hidden state sequence given the observation sequence so that the trend can be analyzed using the steady state probability distribution (π) values. Six optimal hidden state sequences are generated and compared. The one-day difference in close value is found to give the best state sequence.
    Web link: http://arxiv.org/ftp/arxiv/papers/1311/1311.4771.pdf
  • 19. 19 2.2. INTEGRATED SUMMARY OF LITERATURE STUDIED The papers studied mainly focus on the Hidden Markov Model (HMM) approach for forecasting stock prices for interrelated markets. We apply HMM to forecast some of the airlines stock. HMMs have been extensively used for pattern recognition and classification problems because of their proven suitability for modelling dynamic systems. However, using HMM for predicting future events is not straightforward. Here we use only one HMM that is trained on the past dataset of the chosen airlines. The trained HMM is used to search for the variable of interest behavioural data pattern from the past dataset. By interpolating the neighbouring values of these datasets, forecasts are prepared. The results obtained using HMM are encouraging and HMM offers a new paradigm for stock market forecasting, an area that has been of much research interest lately. Conventional tools, such as the Efficient Market Hypothesis and technical indicators, offered much insight into the workings of the financial market. However, they provide only a macro-simplification that does not always reflect how the real market works. There are definite limitations to these tools that prevent them from modeling the market in a more focused, “micro” manner. One of the major issues is that many conventional finance theories only take in so many factors. This limited scope prevents us from accurately modeling the real market, which has a countably infinite number of patterns. We need a model that can constantly adapt to the dynamic nature of the market. Technical indicators can only help an investor so much before the different combinations and patterns cause the investor to question whether any formula actually works consistently. This is where AI models such as HMMs and SVMs come into play. Using these tools, we can achieve a more realistic micro-representation of the market while overcoming the limitations of the earlier techniques. 
In recent years, a variety of forecasting methods have been proposed and implemented for stock market analysis. A brief study of the literature is presented. A Markov process is a stochastic process where the probability at one time is only conditioned on a finite history, being in a certain state at a certain time. The Markov chain property is “Given the present, the future is independent of the past”. An HMM is a form of probabilistic finite state system where the actual states are not directly observable. They can only be estimated using observable symbols associated with the hidden states. At each time point, the HMM emits a symbol and changes state with a certain probability. HMMs analyze and
  • 20. 20 predict time series or time-dependent phenomena. There is not a one-to-one correspondence between the states and the observation symbols. CHAPTER 3 ANALYSIS, DESIGN AND MODELLING 3.1. OVERALL DESCRIPTION OF THE PROJECT 3.1.1. PRODUCT PERSPECTIVE The product to be developed, “Prediction of stock market (POSM)”, is a stand-alone application and an independent product, and is developed in the Java programming language using the Swing framework. 3.1.2. PRODUCT FUNCTIONS Hardware Requirements: The hardware requirements may serve as the basis for a contract for the implementation of the system and should therefore be a complete and consistent specification of the whole system. They are used by software engineers as the starting point for the system design. They should state what the system does and not how it should be implemented.
    PROCESSOR: PENTIUM 4 2.1 GHZ & ABOVE
    RAM: 512 MB DDR2 RAM
    MONITOR: COLOR
    HARD DISK SPACE: 10 MB
    Software Requirements: The software requirements document is the specification of the system. It should include both a definition and a specification of requirements. It is a statement of what the system should do rather than how it should do it. The software requirements provide a basis for creating the software requirements specification. It is useful in estimating cost, planning team activities, performing tasks and tracking the team’s progress throughout the development activity.
    OPERATING SYSTEM: WINDOWS 7/8/8.1
    LANGUAGE: JAVA
  • 21. 21
    FRAMEWORK: JAVA SWING
    TOOL USED: NETBEANS IDE 7.1 & ABOVE, JDK 1.7 & ABOVE
    3.1.3. USER CHARACTERISTICS
    • The user needs to have the Java Virtual Machine and Java Runtime Environment installed on the computer.
    • The user must have some knowledge in the field of the share market.
    3.1.4. CONSTRAINTS
    Reliability requirements: The system must be reliable and must not crash due to a higher number of operations and higher processing time. A high-speed processor must be employed.
    3.1.5. ASSUMPTIONS AND DEPENDENCIES
    Assumption 1: The user knows how to operate a Windows system and is aware of share market basics.
    Assumption 2: The JDK version should not be lower than version 1.7.
    Dependency 1: Dependent on the Internet speed of the system.
    3.1.6. APPORTIONING OF REQUIREMENTS
    The system may be optimized in terms of accuracy and speed in future versions of the product.
    3.2. FUNCTIONAL REQUIREMENTS
    1. Fetch chart data from Yahoo Finance.
    2. Convert the CSV file to an XLS spreadsheet.
    3. Apply the left-right algorithm to the incoming data to build the transition matrix.
    4. Apply the Baum-Welch or prediction algorithm to the received observations and calculate the fractional change.
    5. Calculate the opening, closing, high and low values of the given stock using the hidden Markov model.
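Functional requirements 1 and 4 can be illustrated with a minimal Java sketch that turns one row of downloaded price data into fractional-change features. It is an illustration under stated assumptions, not the project's actual code: the class name `FractionalChange`, the column order `Date,Open,High,Low,Close`, and the choice of the opening value as the reference for every fraction are all hypothetical.

```java
// Hypothetical helper: converts one day's prices into fractional changes.
final class FractionalChange {

    /** Returns {fracChange, fracHigh, fracLow}, each measured relative to the opening value. */
    static double[] features(double open, double high, double low, double close) {
        if (open <= 0) {
            throw new IllegalArgumentException("open must be positive");
        }
        return new double[] {
            (close - open) / open,  // fractional change over the day
            (high - open) / open,   // fractional deviation of the intra-day high
            (open - low) / open     // fractional deviation of the intra-day low
        };
    }

    /** Parses one CSV row laid out as Date,Open,High,Low,Close (an assumed column order). */
    static double[] fromCsvRow(String row) {
        String[] f = row.split(",");
        if (f.length < 5) {
            throw new IllegalArgumentException("expected at least 5 columns: " + row);
        }
        return features(
            Double.parseDouble(f[1].trim()),
            Double.parseDouble(f[2].trim()),
            Double.parseDouble(f[3].trim()),
            Double.parseDouble(f[4].trim()));
    }
}
```

Working with fractions of the opening value rather than raw prices keeps the features on a comparable scale across years in which the absolute price level changes.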
  • 22. 22 3.3. NON FUNCTIONAL REQUIREMENTS 1. Data fetched from Yahoo Finance needs to be delivered reliably and quickly. 2. Study the specifications and configuration settings of the Java APIs while integrating them with the user application. 3.4. DESIGN DIAGRAMS 3.4.1. USE CASE DIAGRAM
  • 23. 23
  • 26. 26 3.4.2. ARCHITECTURAL DIAGRAM Fig. 1. Architecture of stock market prediction There are a number of sources of stock and financial data on the web, including Google Finance and Yahoo! Finance. In the specific case of the SENSEX index, it seems that Google Finance does not provide the data in the required format. As an alternative, Yahoo! Finance was used to load and store the stock data from 1984. In the application developed for this project, an interface is included to load the data when needed, before using the algorithms for training and prediction.
    [Figure labels: SMP; Microsoft Excel; Excel reader; Java architecture; stock data extracted; data returned as data source]
  • 27. 27 In this problem, there is a sequence of data over time with which we need to train an HMM. To train an HMM matching a sequence of stock index data $\vec{O} = (open, low, high, close)$, the following settings and considerations have been taken into account:
    1. States: in the experiments, we have N = 4; intuitively, the states denote the stages in time between which transitions are allowed in the HMM training.
    2. Dimensions: as a mixture of multivariate Gaussians is utilized in this problem, we have D = 4, since the observation vector for each stock date is (open, low, high, close).
    3. Mixtures: [10] proposes M = 3.
    4. Left-Right Delta: experimentally, we have tried delta = 1 and delta = 3.
    5. Prior Probability: adhering to the left-right HMM, we have $\pi$ = (1, 0, 0, 0).
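The left-right structure implied by settings 1, 4 and 5 can be sketched in Java as follows. This is only an illustration: the class `LeftRightHmm` is hypothetical, and spreading each row's probability uniformly over the allowed transitions is an assumed starting point that Baum-Welch training would then adjust; the report does not state the initial values it used.

```java
// Hypothetical sketch of a left-right (Bakis) HMM topology.
final class LeftRightHmm {

    /**
     * Builds an n x n transition matrix in which state i may stay in place or
     * move forward by at most delta states. Allowed transitions in each row
     * share the probability mass uniformly (an assumed initialization).
     */
    static double[][] transitionMatrix(int n, int delta) {
        if (n < 1 || delta < 0) {
            throw new IllegalArgumentException("need n >= 1 and delta >= 0");
        }
        double[][] a = new double[n][n];
        for (int i = 0; i < n; i++) {
            int last = Math.min(i + delta, n - 1); // furthest reachable state
            double p = 1.0 / (last - i + 1);
            for (int j = i; j <= last; j++) {
                a[i][j] = p;
            }
        }
        return a;
    }

    /** Prior that forces the model to start in the first state: (1, 0, ..., 0). */
    static double[] prior(int n) {
        double[] pi = new double[n];
        pi[0] = 1.0;
        return pi;
    }
}
```

With N = 4 and delta = 1, the first row becomes (0.5, 0.5, 0, 0) and the last row (0, 0, 0, 1), so the final state is absorbing; with delta = 3 the first row spreads evenly over all four states.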
  • 28. 28 CHAPTER 4 IMPLEMENTATION DETAILS AND ISSUES 4.1. IMPLEMENTATION DETAILS AND ISSUES The software is implemented in the Java programming language and the user interface is designed using the Java Swing framework. DETAILS OF THE PROJECT As the prediction problem can be complex and lengthy, it was broken down into a series of phases to manage the complexity. Figure 1 depicts the overall process that is considered when solving the problem. The following section discusses each of the phases in more detail. As mentioned, through this experiment we try to take advantage of Hidden Markov Models (HMMs) to address some interesting problems regarding stock market analysis. Specifically, stock market index prediction is done in this assignment. First, a set of past data is loaded and analyzed; then, an HMM is modelled and trained for the problem. Afterwards, similar past data are identified and used to predict future stock market values. The stock market data used in this assignment are from SENSEX, an index over the stock data in India. Basically, each stock market data point is a quadruple (open, low, high, close): each day the stock market starts with some opening value, during the day it reaches its highest and lowest values, and it stops with a close value. Such data are of great interest to stock traders and business shareholders for predicting future stock trends. In this project, we try to estimate the future day's close values as precisely as possible.
  • 29. 29 4.1.1. IMPLEMENTATION ISSUES A number of important implementation issues must be addressed before POSM can be feasibly deployed. Some of these issues are discussed below:
    1. Interesting HMM Questions:
       1. How to compute the probability of the occurrence of a specific sequence of observations, $P(\vec{O} \mid \lambda)$, in which $\vec{O} = \{O_1, O_2, \ldots, O_T\}$.
       2. How to choose the state sequence $(q(1), q(2), \ldots, q(T))$ that best explains the observation $\vec{O}$ in the model.
       3. How to tune the parameters $(\pi, A, B)$ to find a model that best matches the observations $\vec{O}$.
       In stock market analysis and prediction, we are facing the first and the third problem. Training the HMM is done through problem (3) and prediction is achieved with problem (1).
    2. Initializing the HMM: Another important issue in modeling with an HMM is how to initialize the parameters of the defined HMM, i.e. transition probabilities, observation probabilities (distributions) and prior probabilities. Although the HMM adjusts its parameters as well as possible in the learning process, badly initialized parameters can result in an imprecise model even after the HMM is trained. A basic approach is required to obtain good initial values of the HMM parameters. In this project we propose an approach that initializes the parameters of the HMM based on the physical behavior of the data. One of the best HMM types for time series prediction is the left-right HMM. Consequently, the prior probabilities are (1, 0, 0, 0), since the left-right HMM imposes this. Transition probabilities again depend mainly on the defined model; they are initialized based on the value of the left-right delta. Observation distributions can be initialized based on the training data. Since our proposed HMM has 4 states, we divide the training sequence into 4 equal sections, and each section is used to estimate the best possible mixture of multivariate Gaussian distributions for one state. 
The parameters of the multivariate Gaussian distributions are estimated with the maximum likelihood method.
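The sectioning step described above can be sketched in Java. This is a deliberately simplified illustration: it fits a single Gaussian with a diagonal covariance to each section, rather than the mixture of M = 3 components used in the report, and the class `SegmentInit` and its method names are hypothetical.

```java
// Hypothetical sketch: per-state initial Gaussian parameters from equal sections.
final class SegmentInit {

    /** means[s][d] is the maximum-likelihood mean of dimension d over the s-th section. */
    static double[][] means(double[][] obs, int states) {
        check(obs, states);
        int dims = obs[0].length;
        double[][] mean = new double[states][dims];
        for (int s = 0; s < states; s++) {
            int start = s * obs.length / states;
            int end = (s + 1) * obs.length / states;
            for (int t = start; t < end; t++) {
                for (int d = 0; d < dims; d++) {
                    mean[s][d] += obs[t][d];
                }
            }
            for (int d = 0; d < dims; d++) {
                mean[s][d] /= (end - start);
            }
        }
        return mean;
    }

    /** variances[s][d] is the maximum-likelihood variance (divide by n, not n - 1). */
    static double[][] variances(double[][] obs, int states) {
        double[][] mean = means(obs, states);
        int dims = obs[0].length;
        double[][] var = new double[states][dims];
        for (int s = 0; s < states; s++) {
            int start = s * obs.length / states;
            int end = (s + 1) * obs.length / states;
            for (int t = start; t < end; t++) {
                for (int d = 0; d < dims; d++) {
                    double diff = obs[t][d] - mean[s][d];
                    var[s][d] += diff * diff;
                }
            }
            for (int d = 0; d < dims; d++) {
                var[s][d] /= (end - start);
            }
        }
        return var;
    }

    private static void check(double[][] obs, int states) {
        if (states < 1 || obs.length < states) {
            throw new IllegalArgumentException("need at least one observation per state");
        }
    }
}
```

Because the sections are contiguous in time, the first state is seeded from the oldest data and the last state from the most recent, which matches the forward-only movement of a left-right model.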
  • 30. 30 3. Continuous HMM: In some applications such as stock market analysis or speech recognition, the observations are not from a discrete space; the discrete observation $O_t$ is then replaced by a vector $\vec{O}_t$, meaning that in each state a vector of observed values is received. Specifically, the observation vector in our assignment would be $\vec{O}_t = (open, low, high, close)$, which are the values of the stock index. Usually, for such HMMs, the probability density function (pdf) is represented as a mixture of higher-dimensional Gaussian distributions. 4.1.2. ALGORITHMS Likelihood update algorithm
    [Figure: Likelihood update algorithm in POSM]
    On the path to prediction, there is first a need to find the day in the stock market data most similar to a specific day, so that it can be used to predict the following day's close value. To do so, we first need to compute the likelihood of the previous days in the desired range. Given one day's stock data, it is straightforward to compute the likelihood of that specific day from the HMM. This is Problem (1), which is computed using the Forward-Backward Algorithm proposed in [12, 2, 13]. Algorithm 1 depicts the overall method to compute likelihoods.
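The likelihood computation of Problem (1) can be sketched in Java with the forward pass alone, which is sufficient when only the likelihood is needed. The sketch below is an assumption-laden illustration, not Algorithm 1 itself: the class `Forward` is hypothetical, the emission densities are assumed to be precomputed into a matrix `b[t][j]` (the density of observation t in state j, for example from the Gaussian mixtures), and per-step scaling is added to avoid numerical underflow on long sequences.

```java
// Hypothetical sketch of the scaled forward algorithm.
final class Forward {

    /**
     * Returns log P(O | lambda) for an HMM with prior pi, transition matrix a,
     * and precomputed emission values b[t][j] for observation t in state j.
     */
    static double logLikelihood(double[] pi, double[][] a, double[][] b) {
        int n = pi.length;
        double[] alpha = new double[n];
        double logLik = 0.0;
        for (int t = 0; t < b.length; t++) {
            double[] next = new double[n];
            double scale = 0.0;
            for (int j = 0; j < n; j++) {
                double reach;
                if (t == 0) {
                    reach = pi[j]; // probability of starting in state j
                } else {
                    reach = 0.0;
                    for (int i = 0; i < n; i++) {
                        reach += alpha[i] * a[i][j]; // arrive in j from any state i
                    }
                }
                next[j] = reach * b[t][j];
                scale += next[j];
            }
            if (scale == 0.0) {
                return Double.NEGATIVE_INFINITY; // sequence impossible under the model
            }
            for (int j = 0; j < n; j++) {
                next[j] /= scale; // renormalize to avoid underflow
            }
            logLik += Math.log(scale);
            alpha = next;
        }
        return logLik;
    }
}
```

Each time step costs on the order of n squared operations, so a window of d days costs on the order of n squared times d, which is why the likelihood can be recomputed for many candidate days.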
  • 31. 31 Prediction Algorithm When the likelihoods of the different days are computed, the last phase is to predict some day's close value, the target of this experiment. To do so, we introduce a parameter, the likelihood tolerance, denoting the neighborhood within which we accept past days as similar to the previous day. Using the likelihood tolerance, we fetch a list of days similar to yesterday's stock data and then take as the best guess the one that has the highest likelihood of all. In this experiment, we used likelihood tolerance values in the range [0.001, 0.01]. From this point, prediction is straightforward: we calculate the difference between the similar day's and yesterday's values and then calculate tomorrow's close value. Algorithm 2 shows the overall pseudo-code used for prediction. Along with the prediction, we also calculate the MAPE (Mean Absolute Percentage Error) measure.
    [Figure: Prediction algorithm for POSM]
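The prediction step and the MAPE measure can be sketched in Java as follows. This is one possible reading of the description above, not the report's Algorithm 2: the class `SimilarDayPredictor` is hypothetical, the forecast is assumed to add the similar day's next-day change in close to yesterday's close, and falling back to yesterday's close when no day lies within the tolerance is likewise an assumption.

```java
// Hypothetical sketch of similar-day prediction with a likelihood tolerance.
final class SimilarDayPredictor {

    /**
     * close[t] and likelihood[t] describe day t; the last entry is "yesterday".
     * Among earlier days whose likelihood is within the tolerance of yesterday's,
     * the one with the highest likelihood is taken as the similar day.
     */
    static double predictNextClose(double[] close, double[] likelihood, double tolerance) {
        if (close.length != likelihood.length || close.length < 2) {
            throw new IllegalArgumentException("need two equally long arrays of length >= 2");
        }
        int last = close.length - 1;
        int best = -1;
        for (int t = 0; t < last; t++) {
            boolean similar = Math.abs(likelihood[t] - likelihood[last]) <= tolerance;
            if (similar && (best < 0 || likelihood[t] > likelihood[best])) {
                best = t;
            }
        }
        if (best < 0) {
            return close[last]; // assumed fallback: no similar day found
        }
        // Apply the change that followed the similar day to yesterday's close.
        return close[last] + (close[best + 1] - close[best]);
    }

    /** Mean Absolute Percentage Error, in percent. */
    static double mape(double[] actual, double[] predicted) {
        if (actual.length != predicted.length || actual.length == 0) {
            throw new IllegalArgumentException("need two equally long, non-empty arrays");
        }
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(actual[i] - predicted[i]) / Math.abs(actual[i]);
        }
        return 100.0 * sum / actual.length;
    }
}
```

A wider tolerance admits more candidate days and makes a match more likely, at the cost of accepting days that resemble yesterday less closely.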
  • 32. CHAPTER 5 TESTING

5.1. TESTING PLAN

Type of test | Will test be performed | Comments/explanation | Software component
Requirements testing | Yes | Needed to cope with the changing environment. | Fluctuation in the share market.
Unit | Yes | Finds the maximum number of defects. Each component of the code was tested or analyzed, not only to ensure the best quality of the developed software but also to make sure that the code behaves as intended. Unit testing was performed as and when each component was developed. | User interface code; Baum-Welch code; Left-right code; Sgenerator code; distribution code; hmm code; prediction of day code
Integration | Yes | All the well-developed sub-systems are integrated together and tested; this is called integration testing. | Left-right (Bakis) algorithm; Baum-Welch algorithm
  • 33. Type of test | Will test be performed | Comments/explanation | Software component
Performance | Yes | Performance is the major criterion for evaluating any type of system. It holds importance and is tested likewise. | Performance of the algorithms is measured in combination: Left-right algorithm; Baum-Welch algorithm. Performance is also measured on the precision of the output.
Stress | No | - | -
Compliance | No | - | -
Security | No | - | -

Table 1: Testing Plan

TEST TEAM DETAILS

Role | Name | Responsibility
Chief Testing Incharge | Aditya datta | To perform requirements, unit, integration, performance and load testing.

Table 2: Test Team details

  • 34. Activity | Start Date | Completion Date | Hours | Comments
Develop Input | 25-03-2014 | 28-03-2014 | 6 | Nominal and trivial issues tested using the standard test cases designed for the system.
Test Region Setup | 5-04-2014 | 10-04-2014 | 10 | The test region is defined so as to check all the features individually as well as in combination.

Table 3: Testing Schedule

Test Environment

Software Items
• Windows 7/8/8.1 stability
• Mac stability
• Internet connection
• Java Runtime Environment & Development Kit 1.7 & above
• NetBeans 7.1 & above

Hardware Items
• Personal Computer/Laptop
• Network Interface card
• Wireless connection or connecting cable

Table 4: Test Environment

5.2. COMPONENT TESTING
  • 35. S.No | Components that require testing | Type of testing required | Technique for writing test case
1 | Left right code | Unit Testing | White Box Testing
2 | Csv to xls converter code | Unit Testing | White Box Testing
3 | Model code | Unit Testing | White Box Testing
4 | graph code | Unit Testing | White Box Testing
5 | sgenerator code | Unit Testing | White Box Testing
6 | Destination | System Testing | Black Box Testing
7 | Source | System Testing | Black Box Testing
8 | User interface code | Performance Testing | Black Box Testing
9 | Utils code | Performance Testing | Black Box Testing

Table 5: Component decomposition and identification of tests required

5.3. TEST CASES

Test Id | Input | Expected Output | Status
T1 | Enter the starting and the last date to update the data. | Data fetched from Yahoo Finance. | Pass
T2 | Predict the stock rate for the very next day. | We get low, high, opening and closing for the next day. | Pass
  • 36. Test Id | Input | Expected Output | Status
T3 | Predict the stock rate for any day after a week from the given set of data. | The system asks the user to enter a date within a week. | Pass
T4 | Check the precision of the output by entering a date whose values are already known. | Outputs are almost precise. | Pass

5.4. ERROR AND EXCEPTION HANDLING

Test Case Id | Test Case | Debugging Technique
T1 | Fetching data from Yahoo Finance. | Check the posmdownloader code and check for errors.

Table 6: Error and Exception Handling

5.5. LIMITATION OF THE SOLUTION

• The output is sometimes not even near the actual value.
• The system sometimes hangs when the Internet connection is lost.
  • 37. 5.6. Risk analysis and Mitigation plan

Risk Id | Classification | Description | Risk Area | Probability | Impact | RE = (P*I)
R1 | Performance | Low Performance | Product Engineering | L | H | 81
R2 | Budget | Medium Budget | Program Constraints | M | L | 3
R3 | Project Specification | Infeasible Specifications | Product Engineering | L | M | 3
R4 | Hardware | Hardware Constraints | Product Engineering | L | L | 1
R5 | Accuracy | Low Accuracy | Product Engineering | M | H | 27
R6 | External Inputs | Inaccurate Inputs | Program Constraints | H | H | 81

Table 7: Risk Identification
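The risk exposure column follows RE = P * I. The report does not state the numeric scale behind the L/M/H ratings, so the small Python sketch below assumes L = 1, M = 3 and H = 9, a scale inferred from rows such as M x H = 27 and H x H = 81. The names `RATING` and `risk_exposure` are hypothetical.

```python
# Assumed rating scale, inferred from the RE column: L = 1, M = 3, H = 9.
RATING = {"L": 1, "M": 3, "H": 9}


def risk_exposure(probability, impact):
    """Risk exposure RE = P * I for ratings given as 'L', 'M' or 'H'."""
    return RATING[probability] * RATING[impact]
```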
  • 38. S. No. | Risk Area | # of Risk Statements | Weight (in+out) | Total Weight | Priority
1 | Performance | 5 | 1+1+9+9+9 | 29 | 1
2 | Accuracy | 3 | 9+9+9 | 27 | 2
3 | External Inputs | 3 | 3+9+9 | 21 | 3
4 | Hardware | 2 | 9+9 | 18 | 4
5 | Project Specification | 3 | 3+3+1 | 7 | 5
6 | Budget | 2 | 3+1 | 4 | 6

Table 8: Risk Area Wise Total Weighting Factor

Risk Id | Risk Statement | Risk Area | Priority
R1 | Risk of Performance | Performance | 1
R5 | Risk of Low Accuracy | Accuracy | 2
R6 | Risk of Inaccurate Inputs | External Inputs | 3
R4 | Risk of Inaccurate Hardware equipment | Hardware | 4

Table 9: Risk with Maximum Weight
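The totals and priorities in Table 8 can be reproduced by summing the weights of each risk area and ranking the areas by total, highest first. The Python sketch below shows that arithmetic; the function name `rank_risk_areas` is hypothetical, and the weights in the usage example are the ones listed in Table 8.

```python
def rank_risk_areas(weights_by_area):
    """Return (area, total weight, priority) tuples, highest total first."""
    totals = {area: sum(ws) for area, ws in weights_by_area.items()}
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [
        (area, total, rank)
        for rank, (area, total) in enumerate(ordered, start=1)
    ]


table_8_weights = {
    "Budget": [3, 1],
    "Hardware": [9, 9],
    "Performance": [1, 1, 9, 9, 9],
    "Project Specification": [3, 3, 1],
    "External Inputs": [3, 9, 9],
    "Accuracy": [9, 9, 9],
}
ranking = rank_risk_areas(table_8_weights)
```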
  • 39. MITIGATION APPROACHES

Approach 1: To ensure high performance, optimize the system by reducing response time.
10 April 2014 | 25 April 2014 | Aditya datta
Additional Resources: High Speed Processor

Approach 2: To ensure high accuracy, optimize the code.
10 April 2014 | 20 April 2014 | Aditya datta
Additional Resources: Internet

Approach 3: Ensure the system is secure.
16 April 2014 | 25 April 2014 | Aditya datta
Additional Resources: None

Approach 4: Ensure that all the specifications and inputs are correct.
16 April 2014 | 25 April 2014 | Aditya datta
Additional Resources: None
  • 40. CHAPTER 6 FINDINGS AND CONCLUSION

The results show that we have been successful in producing a rough estimate of the future data required in the project. However, precision becomes more important as the sensitivity of the data rises. For future work, one idea for gaining better quality is to consider a weighted ranking of the most similar past data when searching within the likelihood tolerance. Intuitively, this would help control the deviation from the actual values that is seen over time. Additionally, further boundary checks could be applied to the predicted data to prevent undesired deviations in the predictions.

Another idea is "continuous training". In the current approach, a period of time is chosen, the data for that period is collected and used to train an HMM, and the trained HMM is then used for prediction. A better idea is to persist the trained HMM and, over time, optimize and tune it according to the latest data as it emerges. This way, intuitively, we would be improving the HMM over time without losing what it learned from the past.

ANN is a well-researched and established method that has been successfully used to predict time series behaviour from past datasets. In this paper, we proposed the use of HMM, a new approach, to predict unknown values in a time series (the stock market). The mean absolute percentage error (MAPE) values of the two methods are quite similar. However, the primary weakness of ANNs is the inability to properly explain the models. According to Ripley, "the design and learning for feed-forward networks are hard". The proposed method of using HMM to forecast stock prices is explainable and has a solid statistical foundation. The results show the potential of using HMM for time series prediction.
In our future work we plan to develop hybrid systems using AI paradigms with HMM to further improve accuracy and efficiency of our forecasts.
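The weighted-ranking and boundary-check ideas raised in the findings could be sketched in Python as follows. This is only an illustration of the proposal, not something the project implemented. The inverse-distance weighting, the `max_change` cap of 5% and the function name `weighted_prediction` are all assumptions made for this example.

```python
def weighted_prediction(likelihoods, closes, tolerance, max_change=0.05):
    """Predict tomorrow's close from all similar days, weighted by similarity.

    likelihoods[i] and closes[i] describe day i; the last entry is yesterday.
    max_change is a hypothetical cap on the move, as a fraction of
    yesterday's close. Returns None when no similar day is found.
    """
    yesterday = len(closes) - 1
    target = likelihoods[yesterday]
    weighted_sum = 0.0
    weight_total = 0.0
    for i in range(yesterday):
        gap = abs(likelihoods[i] - target)
        if gap <= tolerance:
            # Closer likelihoods get larger weights; the small offset
            # avoids division by zero for an exact match.
            weight = 1.0 / (gap + 1e-9)
            weighted_sum += weight * (closes[i + 1] - closes[i])
            weight_total += weight
    if weight_total == 0.0:
        return None
    change = weighted_sum / weight_total
    # Boundary check: keep the predicted move inside the allowed band.
    limit = max_change * abs(closes[yesterday])
    change = max(-limit, min(limit, change))
    return closes[yesterday] + change
```

Averaging over several similar days trades the sharpness of a single best match for lower variance, while the cap guards against a single unusual day dominating the forecast.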
  • 41. Figure: Correlation between predicted and actual closing stock price for Google.
Figure: Correlation between predicted and actual closing stock price for Dell.
  • 43. REFERENCES

[1] Kuo R J, Lee L C and Lee C F (1996), Integration of Artificial NN and Fuzzy Delphi for Stock market forecasting, IEEE International Conference on Systems, Man, and Cybernetics, Vol. 2, pp. 1073-1078.
[2] Kimoto T, Asakawa K, Yoda M and Takeoka M (1990), Stock market prediction system with modular neural networks, Proc. International Joint Conference on Neural Networks, San Diego, Vol. 1, pp. 1-6.
[3] White H (1988), Economic Prediction Using Neural Networks: The Case of IBM Daily Stock Returns, Proceedings of the Second Annual IEEE Conference on Neural Networks, Vol. 2, pp. 451-458.
[4] Chiang W C, Urban T L and Baldridge G W (1996), A Neural Network Approach to Mutual Fund Net Asset Value Forecasting, Omega, Vol. 24 (2), pp. 205-215.
[5] Kim S H and Chun S H (1998), Graded forecasting using an array of bipolar predictions: application of probabilistic neural networks to a stock market index, International Journal of Forecasting, Vol. 14, pp. 323-337.
[6] Romahi Y and Shen Q (2000), Dynamic Financial Forecasting with Automatically Induced Fuzzy Associations, Proceedings of the 9th International Conference on Fuzzy Systems, pp. 493-498.
[7] Thammano A (1999), Neuro-fuzzy Model for Stock Market Prediction, Proceedings of the Artificial Neural Networks in Engineering Conference, ASME Press, New York, pp. 587-591.
[8] Abraham A, Nath B and Mahanti P K (2001), Hybrid Intelligent Systems for Stock Market Analysis, Proceedings of the International Conference on Computational Science, Springer, pp. 337-345.
[9] Raposo R De C T and Cruz A J De O (2004), Stock Market prediction based on fundamentalist analysis with Fuzzy-Neural Networks. http://www.labic.nce.ufrj.br/downloads/3wses_fsfs_2002.pdf
[10] Cao L and Tay F E H (2001), Financial Forecasting Using Support Vector Machines, Neural Computation and Application, Vol. 10, pp. 184-192.
[11] Huang X, Ariki Y and Jack M (1990), Hidden Markov Models for Speech Recognition, Edinburgh University Press.