Bitcoin close price prediction
Abstract—The objective of this project is to determine the
accuracy with which the closing price of bitcoin can be
predicted using classification and linear regression
methods. For classification, we implemented several ANN
models with different numbers of layers and neurons to find the
model with the best accuracy, and compared the result with LSTM.
LSTM achieved an accuracy of 54.35% with a log loss of 7.18
when predicting the direction of the close price. The best ANN
model had an accuracy of 55.1%, which was almost on par with
the LSTM accuracy. Among the multiple linear regression models,
elastic net performed better than lasso and ridge, as it had the
lowest RMSE and the highest R-squared value. The RMSE recorded
for elastic net regression was 0.00808, the lowest of the
regression models compared.
Keywords—Bitcoin, machine learning, ANN, LSTM, multiple
linear regression model, ridge, lasso, elastic net regression.
I. INTRODUCTION
In recent years cryptocurrencies have been on a constant
rise. They are used as a means of digital transaction and
for investment purposes around the world [3]. Bitcoin's
combination of monetary units and encryption technology has
lately attracted substantial attention in fields such as
economics, computer science and cryptography [1]. Bitcoin,
one of the first decentralized cryptocurrencies, now has a
market capitalization of 170 billion US dollars [2]. Since
bitcoin is decentralized, it is not owned by a government
body or restricted to a certain location, but is used as a
form of peer-to-peer payment [4].
With the ever-increasing demand to understand the
fluctuation in cryptocurrency prices, it is vital to have a
system that can help predict daily price changes. Like stock
prices, the bitcoin price is quite volatile, which makes it
difficult to predict with high accuracy. The value of bitcoin,
or of any other cryptocurrency, is never static and can vary
almost every second. The fluctuation depends on the amount
buyers are willing to pay for bitcoin. As bitcoin is used as
an investment, the principle applied to stocks of buying at a
low price and selling at a high price also applies to
cryptocurrency [4]. The volatile nature of the cryptocurrency
makes predicting the right price both more challenging and
more interesting for analysts. The prediction and
approximation of bitcoin prices is an area in which little
research has been done [1]. The traditional time series
methodology is not suitable, since it relies on trend,
seasonal and noise components and the cryptocurrency market
lacks seasonality [5].
Since investors are keen to know the direction of the
cryptocurrency price, i.e. high or low, it is vital to have an
algorithm that determines this direction as accurately as
possible. A lot of research has gone into predicting the
direction of stock prices, and far less into cryptocurrency.
In the following sections we describe the related
work, the methodology used, and the results achieved. One
of the main papers this project builds on is [5]. We try
to improve on the accuracy achieved there by adding more
parameters than were used in that study. Using classification
methods, we classify the closing price as high, low or no
change. We also apply several multiple linear regression
methods to find the one best suited to this project. The
results of each model are then analyzed to find the
best-suited model for classification and for multiple linear
regression.
II. RELATED WORK
Our work improves on the existing research on predicting
bitcoin prices in [5]. McNally, Roche and Caton (2018)
investigated RNN along with LSTM for prediction. The
algorithms were benchmarked on GPU and CPU performance.
The results were then compared to ARIMA, whose accuracy
was found to be poor in comparison to RNN and LSTM. The
accuracy measured was 52.78%, 50.25% and 50.05% for LSTM,
RNN and ARIMA respectively.
[1] used a Bayesian neural network (BNN) to analyse the time
series. Linear and non-linear benchmark models were used for
comparison. Resampling was done with bootstrap and
cross-validation. The predicted prices were then compared with
those of SVR and linear regression. BNN gave the best results in
terms of accuracy. [3] implemented the general linear model and
Bayesian regression to predict the daily change in price
values from the parameters considered. Five normalization
techniques were applied to the data. Finally, random forest
was applied to both time series datasets, and the results
were combined to predict the macro change in price.
[4] used four ANN methods: BPNN, GANN, GABPNN and NEAT.
Training was run for 30 iterations. The study focused
only on one-day-ahead prediction. BPNN outperformed GABPNN.
[6] implemented a genetic algorithm based selective neural
network ensemble built from multi-layered perceptrons. The
supervised Levenberg-Marquardt (LM) algorithm was used
because of the complexity and the computational cost. [7]
analysed social media posts and performed sentiment analysis
to obtain positive, neutral and negative scores. Using this
data, the Granger causality test was performed to test, and
reject, the null hypothesis that community comments do not
help in predicting fluctuations in cryptocurrency prices. That
paper analyses the market using sentiment scores, in contrast
to the HMM model, which uses time series data to predict
prices. One of the major risks with social media posts is
that they can be easily exploited.
[8] used the Hidden Markov Model (HMM) to examine
social media posts and predict the transition to another state
at a certain point in time, given the current state of the
currency. By identifying the hidden state for a given data
point, the model can predict the state of the cryptocurrency
at a certain point in time. This model focuses particularly on
time series data to predict cryptocurrency prices. [9]
predicted the highest and closing prices of bitcoin using a
time-delay neural network (TDNN) and a recurrent neural
network (RNN). The models were trained on data from the past
eight quarters and tested on the next quarter. TDNN needed
less training time and predicted values closer to the actual
price than RNN. [10] approached classification and regression
problems in machine learning by proposing a neural network
based on a regularization method. Their results showed that,
for obtaining directional accuracy of up to 5%, the rolling
volatility and rolling skewness were the best auxiliary
objectives to forecast. The best regularization parameters for
the tasks were found by applying Bayesian optimization.
[11], in their thesis work, used the autoregressive
fractionally integrated moving average (ARFIMA) model to
predict the value of currencies using the exchange rate of
bitcoin. Their research is based on the Lewellen approach and
the approaches of Westerlund and Narayan to find any
statistical effects that could be responsible for a bias in
the regression estimates. [12] applied the ARIMA
(Autoregressive Integrated Moving Average) model to
predict the exchange rate of bitcoin, using autocorrelation
function and partial autocorrelation function analysis to
determine the parameters of the ARIMA model. The MAPE of the
model was found to be 5.36%, while it explained
approximately 44% of the variability of the response data
around the model's mean. [13] predicted stock prices using
four different models, namely ANN, Naïve Bayes, SVM and
random forest. Naïve Bayes showed the lowest performance
while random forest had the highest. [14] applied a deep
neural network (DNN) to predict future stock returns. DNN
outperformed a linear autoregressive model on the training
set but did not retain the advantage on the test set.
[15] used GARCH (Generalized Autoregressive Conditional
Heteroskedasticity) and LSTM (Long Short-Term Memory)
to forecast the volatility of the stock price index. Multiple
GARCH models gave much-improved prediction over other
hybrid neural networks. [16] proposed a BNNMAS (bat-neural
network multi-agent system) architecture with four
layers to tackle the problem of stock prediction. The model
proved to be quite robust. [17] implemented a WNN (wavelet
neural network) to reduce the size of the network and
simplify its structure. [18] tried to identify the
factors that determine the price of bitcoin, taking Twitter
sentiment into consideration. SVM (support vector
machine) was used to analyse the sentiment ratio on a
day-to-day basis. The research showed that bitcoin prices
were positively affected by search queries on
Wikipedia. [19] tried to predict the bitcoin price one
hour into the future, using a naïve approach to set
the baseline prediction and evaluating the results with
mean squared error (MSE). Other tree-based algorithms and the
k-nearest neighbour algorithm did not match the baseline
prediction, while SVM and linear regression performed better
in comparison. Finally, [20] used a neural network to predict
stock prices. Higher performance was achieved by increasing
the number of hidden units, although increasing the units
beyond a certain point diminished the performance of the
model. The neural network gave significant results when
compared to multiple discriminant analysis (MDA) for
predicting stock prices.
III. DATA MINING METHODOLOGY
In this project, we used the CRISP-DM data mining
methodology. The business objective was clear: to
predict the closing price of bitcoin against USD.
The data was sourced from blockchain and coinmarketcap.
The Quandl package was used to dynamically source data from
blockchain, covering 1st January 2014 to 27th July 2018.
The second source of data was coinmarketcap. The htmltab
package was used to source data from this site by
giving the start and end dates. Data from 2013 was not
considered, as the volume column did not have any values.
Cleaning was then performed to adjust the date and number
formats. Finally, the two datasets were merged on date.
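The merge on date can be illustrated with a minimal Python sketch. The paper did this step in R; the function name, column names and values below are hypothetical, and an inner join (keeping only dates present in both sources) is an assumption.

```python
def merge_by_date(left, right):
    """Inner-join two {date: {column: value}} tables on their date keys."""
    merged = {}
    # ISO-formatted date strings sort chronologically.
    for date in sorted(left.keys() & right.keys()):
        row = dict(left[date])
        row.update(right[date])
        merged[date] = row
    return merged


# Hypothetical rows; the column names and numbers are illustrative only.
blockchain_rows = {
    "2014-01-01": {"n_transactions": 100},
    "2014-01-02": {"n_transactions": 120},
}
market_rows = {
    "2014-01-02": {"close": 10.0, "volume": 5.0},
    "2014-01-03": {"close": 11.0, "volume": 6.0},
}

merged = merge_by_date(blockchain_rows, market_rows)
print(merged)
```

Only dates found in both inputs survive the join, so a source with missing days shortens the merged dataset.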
We then performed the Granger causality test to check whether
classification methods could be used for prediction. The ANN
and LSTM classification methods were run without handling the
high correlation between variables. For multiple linear
regression (MLR), the high correlation was handled with
principal component analysis (PCA). Lasso, ridge and elastic
net linear regression models were implemented, and the results
were compared on root mean squared error (RMSE). We first
describe the classification methods, i.e. ANN and LSTM, and
then the multiple linear regression methods. Figure 1 shows
the correlations between the variables.
A. Classification
1. ANN
Artificial Neural Network (ANN) is part of cognitive
learning and is used for approximation, as mentioned
in [22]. In recent times, the use of ANN has increased for
tasks such as classification, time series forecasting and
pattern recognition, and it has increased drastically in
financial organizations. As the data used is time series in
nature, ANN was considered for implementation. Moreover, ANN
is a non-linear model and can handle complex relationships
between variables. It can also generalize and infer
relationships that are not directly visible in the data. In
addition, ANN does not impose any restriction on the input
data [23].
Figure 1: Correlation Matrix
Here, ANN is used to classify the direction of the close
price, i.e. high or low, so the classification is
binary. To prepare the model, some initial cleaning was
required and was done in R. One of the key requirements of
ANN is normalized data, so the data were scaled to bring the
values between 0 and 1. Price-Direction was the dependent
variable and was derived from the "Close" attribute. It was
encoded with 1 for higher and 0 for lower. Various models were
constructed with single and multiple hidden layers. It was
found that a single hidden layer gave the model better and
more consistent accuracy.
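The scaling and encoding steps can be sketched in Python as follows. The paper did this in R and does not state exactly how Price-Direction was derived from "Close"; comparing each close with the previous day's close is an assumption, and the prices are made up.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def price_direction(close):
    """Assumed rule: 1 if the close rose versus the previous day,
    0 if it fell, None if it did not change."""
    labels = []
    for prev, cur in zip(close, close[1:]):
        if cur > prev:
            labels.append(1)
        elif cur < prev:
            labels.append(0)
        else:
            labels.append(None)
    return labels


close = [100.0, 110.0, 105.0, 105.0, 120.0]  # made-up prices
scaled = min_max_scale(close)
direction = price_direction(close)
print(scaled)
print(direction)
```

Rows labelled None correspond to the 'no change' class, which a binary classifier has to drop.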
After preparing the inputs as required by the model, the
dataset was divided into training and test data
with 70 and 30 percent of the rows respectively.
Models with different hidden layers were then run on this
input and the results were collected. Tables 1, 2
and 3 give the confusion matrix calculations for the
single hidden layer with one neuron, two hidden layers with
(2,1) neurons and two hidden layers with (4,3) neurons
respectively.
Model 1:
ANN model with a single hidden layer and one neuron:
TABLE 1: Confusion matrix with one hidden layer
                    Lower    Higher
Lower               120      111
Higher              253      326
Accuracy            0.55061728
Misclassification   0.44938272
Sensitivity         120 / (120 + 253) = 0.3217
Specificity         326 / (111 + 326) = 0.7460
Model 2:
ANN model with two hidden layers with two and one
neurons:
TABLE 2: Confusion matrix with two hidden layers (2,1)
                    Lower    Higher
Lower               138      141
Higher              219      298
Accuracy            0.5477386935
Misclassification   0.4522613065
Sensitivity         138 / (138 + 219) = 0.3866
Specificity         298 / (141 + 298) = 0.6788
Model 3:
ANN model with two hidden layers with four and three
neurons:
Figure 2: ANN model with one hidden layer
Figure 3: ANN model with two hidden layers (2,1)
Figure 4: ANN model with two hidden layers (4,3)
TABLE 3: Confusion matrix with two hidden layers (4,3)
                    Lower    Higher
Lower               125      148
Higher              248      289
Accuracy            0.51111111
Misclassification   0.48888889
Sensitivity         125 / (125 + 248) = 0.3351
Specificity         289 / (148 + 289) = 0.6613
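The metrics reported under each confusion matrix can be recomputed with a short Python sketch. It uses the Table 1 counts and assumes, following the sensitivity formula given there, that "Lower" is the positive class and that each column holds one actual class; the function name is illustrative.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Standard binary-classification metrics from the four cell counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "misclassification": (fp + fn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Table 1 counts, with "Lower" taken as the positive class (assumption).
metrics = confusion_metrics(tp=120, fn=253, fp=111, tn=326)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Accuracy and misclassification always sum to 1, which gives a quick consistency check on any reported pair.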
Several other models with different configurations were
also run. The model with one hidden layer and one neuron gave
consistent results on the training and test datasets and the
best accuracy.
Disadvantages of the ANN model:
1. Execution time is high on moderate hardware.
2. Reduced trust, as different model configurations give
different results.
2. LSTM
The model consists of memory blocks, each containing an
input gate, an output gate and a forget gate that act on the
cell state (Ct) [15]. LSTM was preferred over MLP because of
the temporal nature of bitcoin data [5]. The deep learning
model was built with the Keras package in R, a library that
is already well known in the Python environment. The sample
data was split into 80% for training and 20% for testing.
Dense and dropout layers were defined, with two hidden layers
of 60 and 50 neurons and an output layer of 1 neuron, as the
problem is a binary classification.
The activation function is applied to the weighted sum of
the inputs plus a bias [15]. The classification probability
lies in the range 0 to 1 and is therefore produced by the
sigmoid, a non-linear activation function. The Rectified
Linear Unit (ReLU) outputs max(x,0), i.e. 0 for negative
inputs and the input itself otherwise. As a value passes
through the gates, the forget gate controls how much
information from the previous state (Ct-1) is kept, using a
sigmoid function. Binary cross-entropy was used as the loss:
it scores predicted probabilities between 0 and 1 against the
true labels, with 0 indicating a perfect model and larger
values a worse one. Adam, a stochastic optimizer, adapts the
learning rate for the weights during training. The learning
rate (lr) was set to 0.0001 with a decay of 1e-6. The metrics
parameter was set to accuracy to measure model performance.
The model was tested with epochs ranging from 50 to 150, and
a batch size of 150 gave the better results.
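The activation functions and the loss described above can be written out in a few lines of Python. This is a sketch of the textbook definitions rather than the Keras internals; the clipping constant eps is an assumption added to avoid log(0).

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def relu(x):
    """Rectified Linear Unit: max(x, 0)."""
    return max(x, 0.0)


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean log loss for 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log() is defined
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


print(sigmoid(0.0), relu(-3.0), relu(2.5))
print(binary_cross_entropy([1, 0], [0.5, 0.5]))  # ln 2 for a 50/50 guess
```

A model that always outputs 0.5 scores ln 2 (about 0.693), so log loss values far above that indicate confident predictions that are often wrong.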
In comparison to [5], where LSTM reported an accuracy
of 52.78%, we improved on the accuracy by considering more
parameters. The confusion matrix and the log loss were
tracked to plot the graph, resulting in a loss of 7.18
with an accuracy of 54.35%. The sensitivity and
specificity are 0.98 and 0.038 respectively. As less data
was available to the model than is typical for other
financial data, the accuracy was limited; it could be
improved over time by collecting more historical data along
with current data.
B. Regression Analysis
1. Assumptions
To conduct multiple linear regression analysis, certain
assumptions [24] about the data need to be met. These
assumptions are:
1) Adequate Sample Size: According to Tabachnick and
Fidell, cited in Pallant, 2007, the appropriate sample size
is N > 50 + 8m, where m is the total number of
independent variables [24]. With eleven independent variables
the minimum is 50 + 8 × 11 = 138, and the sample size for
the project was 1669, well above this limit.
2) Outliers: Outliers in the dataset were handled by
replacing them with the mean. They were detected in R with
boxplot.stats(column_name)$out, wrapped in a function to
which each column was passed as input.
3) Multicollinearity and Singularity: The independent
and dependent variables showed high correlations, as
mentioned before. This violates the assumption of no
multicollinearity. Principal Component Analysis (PCA) was
performed to overcome this [25], keeping only those
linear combinations of the independent variables that
explain the most variance. PCA was conducted on the train
and test datasets using the prcomp function in R with the
scale argument set to true to normalize the data, which is
one of the requirements of PCA. Figure 6 shows the
proportion of variance explained by each component for the
input train set.
Figure 5: LSTM model [21]
Figure 6: Proportion of variance against each component
After the PCA on the train set was conducted only those
components that explain the maximum variance (excluding
the ones that are tending towards 0 i.e., component 9 and
10), were considered as the inputs to create our test PCA
dataset, and as inputs for our regression analysis. Thus a
total of eight normalized principal components for the test
and the train datasets were considered as the inputs for the
multiple linear regression (MLR) algorithms conducted in
the sections that follow.
4) Normality, Linearity, Homoscedasticity and
Independence of residuals: Normality and independence of
the components of the data after PCA were checked using the
Normal Q-Q plots (Figure 7) and the plot of residuals vs
fitted values (Figure 8). The straight red line in Figure 8
indicates that the assumption of homoscedasticity is not
violated.
The linear distribution of the residuals in Figure 7
indicates the linearity of the data as well. Having met all the
assumptions, the following sections explain how we
performed MLR. Before the models were built, a custom control
function was created using the trainControl function of the
caret package in R, which allowed us to choose the number
of times each model should run when selecting the best fit.
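The outlier handling in assumption 2) can be sketched in Python. This only approximates the R approach: statistics.quantiles with the inclusive method is not identical to the hinges used by boxplot.stats, and filling with the mean of the non-outlying values is an assumption, since the paper does not say which mean was used. The data are made up.

```python
import statistics


def impute_outliers_with_mean(values, coef=1.5):
    """Replace values beyond coef * IQR from the quartiles with the
    mean of the remaining (non-outlying) values."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - coef * iqr, q3 + coef * iqr
    inliers = [v for v in values if low <= v <= high]
    fill = statistics.mean(inliers)
    return [v if low <= v <= high else fill for v in values]


column = [10.0, 12.0, 11.0, 13.0, 12.0, 100.0]  # made-up column with one outlier
cleaned = impute_outliers_with_mean(column)
print(cleaned)
```

In the paper each column was passed through such a function in turn; the same call can be applied per column here.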
2. Linear Regression
If Y is the dependent or response variable, x is the predictor
or explanatory variable, β is the coefficient and ε is the
random error or noise, with n the total number of variables,
then the linear regression model and its squared error are
represented as:
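Equation (1) is not legible in this copy. Assuming the usual multiple linear regression setup, with N records indexed by i (N and the indices are added here for illustration) and the n predictors indexed by j, a form consistent with the symbols defined above is:

```latex
\begin{align}
Y_i &= \beta_0 + \sum_{j=1}^{n} \beta_j x_{ij} + \varepsilon_i, \nonumber \\
\mathrm{SSE}(\beta) &= \sum_{i=1}^{N} \Bigl( Y_i - \beta_0 - \sum_{j=1}^{n} \beta_j x_{ij} \Bigr)^{2}. \tag{1}
\end{align}
```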
Based on (1), the method of the caret train function was set
to the linear model. This trained model was then used as a
baseline for comparison with the regression models considered
below.
C. Ridge Regression
The least squares linear regression model can be modified to
create a ridge regression model by applying a non-negative
cost (penalty) function lambda to the coefficients [27], thus
modifying the equation to calculate the squared error as
follows:
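Equation (2) is not legible in this copy. Assuming the parameterisation used by the glmnet software of [27], with N records indexed by i, n predictors indexed by j and penalty λ ≥ 0, the ridge objective can be written as:

```latex
\begin{equation}
\min_{\beta_0,\,\beta}\;
\frac{1}{2N} \sum_{i=1}^{N} \Bigl( Y_i - \beta_0 - \sum_{j=1}^{n} \beta_j x_{ij} \Bigr)^{2}
+ \frac{\lambda}{2} \sum_{j=1}^{n} \beta_j^{2}
\tag{2}
\end{equation}
```

Other texts omit the 1/(2N) and 1/2 factors, which only rescales λ.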
Based on (2), and as we were using a custom control to find
the best penalty value for lambda, a sequence of lambda
values ranging from 0.0001 to 0.2 was run.
A linear regression model under ridge regression is trained
with the L2 regularization norm, which tends to shrink the
coefficients of correlated predictors towards one
another, permitting them to influence each other [27].
Figure 9 depicts the behaviour of the components under the
L2 penalty. The trend shows that the higher the
penalty applied, the further the coefficients of the
components deviate from one another; hence the
model chosen among the ten iterations of the custom control
is the one with lambda at 0.0001.
The influence of each of the eight selected principal
components on the prediction of the final ridge
regression model is depicted in Figure 10. The eighth and
the first principal components seem to have the highest
influence in the ridge regression’s final model, while the
seventh and the second have the least impact on the model’s
prediction capabilities, though not completely zero.
Figure 7: Normal Q-Q plots checked to confirm the assumptions
of normality and linearity.
Figure 8: Residuals vs fitted values plot to confirm
assumptions of homoscedasticity and independence of
residuals.
Figure 9: Coefficients variation on Ridge regression's (L2
regularization) best model
D. Least Absolute Shrinkage and Selection Operator
(Lasso) Regression
In addition to applying the lambda cost penalty, the LASSO
regression equation shrinks the coefficients of the
model and selects only those that have a
significant impact on the prediction outcome [26]. The
linear regression model for squared error calculation can
now be modified to:
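Equation (3) is not legible in this copy. Assuming the parameterisation used by the glmnet software of [27], with N records indexed by i, n predictors indexed by j and penalty λ ≥ 0, the lasso objective replaces the squared (L2) penalty with an absolute-value (L1) penalty:

```latex
\begin{equation}
\min_{\beta_0,\,\beta}\;
\frac{1}{2N} \sum_{i=1}^{N} \Bigl( Y_i - \beta_0 - \sum_{j=1}^{n} \beta_j x_{ij} \Bigr)^{2}
+ \lambda \sum_{j=1}^{n} \lvert \beta_j \rvert
\tag{3}
\end{equation}
```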
Based on (3), we again use the custom control to find the best
penalty value for lambda, with the same sequence of lambda
values as before and the alpha value set to 1.
Lasso regression applies the L1 penalty norm, under
which the model tends to favour one component/coefficient
over the rest while choosing its coefficients/components
[27]. The lasso penalty corresponds to a Laplace prior, which
expects most of the coefficients to be close to zero and
only a small number to be larger in magnitude [27]. As the
L1 penalty is applied in the lasso model, the coefficients
respond as depicted in Figure 11. The component represented
in black (PC1) is the only component at a lambda value of
0.0001, while all other coefficients appear at later stages of
the lambda value; thus the final lasso regression model
chooses 0.0001 as the optimum L1 penalty lambda.
The parameters that influence the final lasso regression
model’s prediction are depicted in Figure 12, which again
confirms component eight as the major influence, while
component seven has zero influence. This is possible in
lasso regression, as it favours some components while
completely ignoring others [27].
E. Elastic Net Regression
The Elastic Net Regression is a combination of both the
Lasso and Ridge regression model equations [27]. This is
achieved by introducing a mixing parameter alpha that takes a
sequence of values from 0 (ridge) to 1 (lasso). The elastic
net regression model for squared error is now modified to
include alpha:
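Equation (4) is not legible in this copy. Assuming the parameterisation used by the glmnet software of [27], with N records indexed by i, n predictors indexed by j, penalty λ ≥ 0 and mixing parameter 0 ≤ α ≤ 1, the elastic net objective can be written as:

```latex
\begin{equation}
\min_{\beta_0,\,\beta}\;
\frac{1}{2N} \sum_{i=1}^{N} \Bigl( Y_i - \beta_0 - \sum_{j=1}^{n} \beta_j x_{ij} \Bigr)^{2}
+ \lambda \Bigl[ \frac{1-\alpha}{2} \sum_{j=1}^{n} \beta_j^{2}
+ \alpha \sum_{j=1}^{n} \lvert \beta_j \rvert \Bigr]
\tag{4}
\end{equation}
```

Setting α = 0 leaves only the L2 (ridge) term and α = 1 leaves only the L1 (lasso) term.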
Figure 10: Variable importance for ridge model
Figure 11: Lasso regression's best model
Figure 12: Variable importance for lasso model
Figure 13: Elastic regression's best model
Based on (4), the custom control is run to find the best
penalty value for lambda as well as for alpha.
Figure 13 depicts the reaction of the predictor components
to the mix of L1 and L2 penalty norms as the model varies
alpha from 0 to 1 and lambda from 0.0001 to 0.2. The final
outcome is given in Table 4. The penalty term is
extremely useful in cases where the number of predictors
exceeds the number of records in the dataset (population)
[27]. The important variables for the elastic net regression’s
final model, which as expected lies between lasso and ridge,
show minor variations, with the eighth component still having
the maximum influence on the model.
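How the L1 and L2 penalties act on the coefficients can be seen in a small coordinate descent sketch in Python. It assumes the glmnet-style objective of [27], (1/2N)·SSE + λ[(1−α)/2·Σβ² + α·Σ|β|], with centred data and no intercept. It is a naive illustration rather than the caret/glmnet implementation used in the paper, and the toy data are made up.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator: sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_cd(X, y, lam, alpha, n_iter=100):
    """Naive coordinate descent for
        (1/2N) * sum_i (y_i - x_i . b)^2
        + lam * ((1 - alpha)/2 * sum_j b_j^2 + alpha * sum_j |b_j|).
    X is a list of rows and y a list; both are assumed centred,
    so no intercept is fitted."""
    n_rows, n_cols = len(X), len(X[0])
    beta = [0.0] * n_cols
    for _ in range(n_iter):
        for j in range(n_cols):
            rho = 0.0      # (1/N) * sum_i x_ij * partial residual
            mean_sq = 0.0  # (1/N) * sum_i x_ij^2
            for i in range(n_rows):
                fit_without_j = sum(
                    X[i][k] * beta[k] for k in range(n_cols) if k != j
                )
                rho += X[i][j] * (y[i] - fit_without_j)
                mean_sq += X[i][j] ** 2
            rho /= n_rows
            mean_sq /= n_rows
            beta[j] = soft_threshold(rho, lam * alpha) / (
                mean_sq + lam * (1.0 - alpha)
            )
    return beta


# Made-up, centred toy data: y = 2 * x1 + 0.1 * x2 exactly.
X = [[-1.0, 1.0], [0.0, -2.0], [1.0, 1.0]]
y = [-1.9, -0.2, 2.1]

ols = elastic_net_cd(X, y, lam=0.0, alpha=1.0)    # no penalty
lasso = elastic_net_cd(X, y, lam=0.5, alpha=1.0)  # pure L1 penalty
ridge = elastic_net_cd(X, y, lam=1.0, alpha=0.0)  # pure L2 penalty
print(ols, lasso, ridge)
```

On this toy data the L1 penalty sets the weak second coefficient exactly to zero, whereas the L2 penalty only shrinks both coefficients, which mirrors the zero-influence component reported for lasso.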
IV. EVALUATION AND RESULTS
For multiple linear regression, the final prediction
model was chosen by comparing the above-mentioned
regression models on their Root Mean Square Error and
R-Squared values, which are
consolidated in Table 4.
Table 4: Regression model comparison
Model                    Cost penalties            RMSE      R-Squared
Lasso Regression         α = 1, λ = 0.0001         0.00873   0.998
Ridge Regression         α = 0, λ = 0.0001         0.02190   0.998
Elastic Net Regression   α = 0.1111, λ = 0.0001    0.00808   0.999
This indicates Elastic Net regression outperforms the other
models with the lowest RMSE. Elastic Net Regression
Model was chosen to conduct the prediction on the test data
set generated from the PCA. The RMSE value of the
prediction using Elastic Net regression was observed as
6.73% with an R-squared value of 99.99%.
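The two comparison metrics can be computed as in the Python sketch below. It uses the coefficient-of-determination definition of R-squared; the caret package may instead report the squared correlation between observed and predicted values, so treating the two as equivalent is an assumption. The numbers are made up.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]  # made-up observations
y_pred = [1.0, 2.0, 3.0, 5.0]  # made-up predictions
print(rmse(y_true, y_pred), r_squared(y_true, y_pred))
```

RMSE is expressed in the units of the (scaled) response, so values are only comparable between models fitted to the same scaling of the data.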
In classification, the best ANN model (Model 1) and LSTM were
compared on accuracy, specificity and sensitivity. The results
are shown in Table 5.
Table 5: ANN vs LSTM
Model   Sensitivity   Specificity   Accuracy
ANN     0.3217        0.7460        55.06%
LSTM    0.98          0.038         54.35%
From Table 5 we can see that ANN performed slightly better
than LSTM. We can also see that the models found it hard
to learn from the data.
V. CONCLUSION AND FUTURE WORK
Finding the best ANN model can be time-consuming.
From the results, the accuracy of LSTM and that of the best
ANN model are very close, differing by less than one
percentage point. Among the linear regression models,
elastic net outperformed the lasso and ridge models, with
the lowest RMSE and the highest R-squared value.
However, it was noticed that with more recent data there
were fluctuations in the RMSE and R-squared values. There
is a chance that, with more recent data and further price
fluctuations, the performance of the model may change.
Due to time constraints, LSTM could not be evaluated
alongside RNN. In further study, LSTM
performance can be evaluated along with RNN using more
recent data and more parameters. Some of the
parameters not included in this study are minutes
per transaction, the number of unspent transactions and
transaction fees. These parameters were excluded as they
had low correlation with the other attributes. By considering
these parameters, multiple linear regression might give
better predictions. One of the limitations of
this study was that we performed binary
classification, because only 2 of the 1669 records had ‘no
change’ and these were removed. With more recent data
over a longer period and more records having ‘no change’,
multiclass classification can be performed. Another
limitation was that the keras package for R does not support
all versions of R, unlike its Python counterpart, and better
results might have been obtained with full access to
the package in R.
REFERENCES
[1] H.Jang, and J.Lee, 2018. An empirical study on modeling and
prediction of bitcoin prices with bayesian neural networks based on
blockchain information. IEEE Access, 6, pp.5427-5437.
[2] M.Nakano, A.Takahashi and S.Takahashi, 2018. Bitcoin technical
trading with artificial neural network.
[3] S.Velankar, S.Valecha and S.Maji, 2018, February. Bitcoin price
prediction using machine learning. In Advanced Communication
Technology (ICACT), 2018 20th International Conference on (pp.
144-147). IEEE.
[4] A.Radityo, Q.Munajat and I.Budi, 2017, October. Prediction of
Bitcoin exchange rate to American dollar using artificial neural
network methods. In Advanced Computer Science and Information
Systems (ICACSIS), 2017 International Conference on (pp. 433-438).
IEEE.
[5] S.McNally, J.Roche and S.Caton, 2018, March. Predicting the price
of Bitcoin using Machine Learning. In Parallel, Distributed and
Network-based Processing (PDP), 2018 26th Euromicro International
Conference on (pp. 339-343). IEEE.
Figure 14: Variable importance for elastic net model
[6] E.Sin and L.Wang, 2017, July. Bitcoin price prediction using
ensembles of neural networks. In 2017 13th International Conference
on Natural Computation, Fuzzy Systems and Knowledge Discovery
(ICNC-FSKD) (pp. 666-671). IEEE.
[7] Y.B.Kim, J.G.Kim, W.Kim, J.H.Im, T.H.Kim, S.J.Kang and
C.H.Kim, 2016. Predicting fluctuations in cryptocurrency transactions
based on user comments and replies. PloS one, 11(8), p.e0161197.
[8] R.C.Phillips and D.Gorse, 2017, November. Predicting
cryptocurrency price bubbles using social media data and epidemic
modelling. In Computational Intelligence (SSCI), 2017 IEEE
Symposium Series on (pp. 1-7). IEEE.
[9] S.Gullapalli, 2018. Learning to predict cryptocurrency price using
artificial neural network models of time series.
[10] L.D Persio, and O.Honchar, Multitask machine learning for financial
forecasting.
[11] A.A.Salisu, L.O. Akanni and R.O.Azeez, 2018. Could this be
affliction? Bitcoin forecasts most tradable currency pairs better than
ARFIMA.
[12] N.A.Bakar and S.Rosbi, 2017. Autoregressive Integrated Moving
Average (ARIMA) Model for Forecasting Cryptocurrency Exchange
Rate in High Volatility Environment: A New Insight of Bitcoin
Transaction. International Journal of Advanced Engineering Research
and Science, 4(11).
[13] J.Patel, S.Shah, P.Thakkar and K.Kotecha, 2015. Predicting stock and
stock price index movement using trend deterministic data
preparation and machine learning techniques. Expert Systems with
Applications, 42(1), pp.259-268.
[14] E.Chong, C.Han and F.C.Park, 2017. Deep learning networks for
stock market analysis and prediction: Methodology, data
representations, and case studies. Expert Systems with Applications,
83, pp.187-205.
[15] H.Y.Kim and C.H.Won, 2018. Forecasting the volatility of stock
price index: A hybrid model integrating LSTM with multiple
GARCH-type models. Expert Systems with Applications, 103, pp.25-
37.
[16] R.Hafezi, J.Shahrabi and E.Hadavandi, 2015. A bat-neural network
multi-agent system (BNNMAS) for stock price prediction: Case study
of DAX stock price. Applied Soft Computing, 29, pp.196-210.
[17] L.Lei, 2018. Wavelet neural network prediction method of stock price
trend based on rough set attribute reduction. Applied Soft Computing,
62, pp.923-932.
[18] I.Georgoula, D.Pournarakis, C.Bilanakos, D.Sotiropoulos and
G.M.Giaglis, 2015. Using time-series and sentiment analysis to detect
the determinants of bitcoin prices.
[19] A.Greaves, and B.Au, 2015. Using the bitcoin transaction graph to
predict the price of bitcoin.
[20] Y.Yoon and G.Swales, 1991, January. Predicting stock price
performance: A neural network approach. In System Sciences, 1991.
Proceedings of the Twenty-Fourth Annual Hawaii International
Conference on (Vol. 4, pp. 156-162). IEEE.
[21] T.Gao, Y.Chai and Y.Liu, 2017, November. Applying long short term
momory neural networks for predicting stock closing price. In
Software Engineering and Service Science (ICSESS), 2017 8th IEEE
International Conference on (pp. 575-578). IEEE.
[22] I.Kaastra and M.Boyd, 1996. Designing a neural network for
forecasting financial and economic time series. Neurocomputing,
10(3), pp.215-236.
[23] J.Mahanta, 2017. Introduction to Neural Networks, Advantages and
Applications. [Online]
Available at: "https://towardsdatascience.com/introduction-to-neural-networks-advantages-and-applications-96851bd1a207"
[Accessed 1 7 2018].
[24] J.Pallant, 2013. SPSS survival manual. McGraw-Hill Education
(UK).
[25] H.Zou, T.Hastie and R.Tibshirani, 2006. Sparse principal component
analysis. Journal of computational and graphical statistics, 15(2),
pp.265-286.
[26] S.S.Roy, D.Mittal, A.Basu and A.Abraham, 2015. Stock market
forecasting using LASSO linear regression model. In Afro-European
Conference for Industrial Advancement (pp. 371-381). Springer,
Cham.
[27] J.Friedman, T.Hastie and R.Tibshirani, 2010. Regularization paths for
generalized linear models via coordinate descent. Journal of statistical
software, 33(1), p.1.

Bitcoin Close Price Prediction Report

  • 1. 1 Bitcoin close price prediction Abstract—The objective of this project is to determine the accuracy with which the closing price of the bitcoin can be predicted with the help of classification and linear regression methods. For classification, we have implemented several ANN models with different layers and neurons to find the model with the best accuracy and compared the result with LSTM. Using LSTM an accuracy of 54.35% was achieved with a log loss of 7.18 to predict the direction of the close price. Also, the best ANN model had an accuracy of 55.1 % which was almost at power with LSTM accuracy. Using multiple linear regression models, we deduced that elastic net performed better in comparison to lasso and ridge model as it had lower RMSE and R squared value. RMSE value recorded for elastic net regression is 0.00808 which was lowest when compared to other regression models. Keywords—Bitcoin, machine learning, ANN, LSTM, multiple linear regression model, ridge, lasso, elastic net regression. I. INTRODUCTION In recent years cryptocurrencies has been on a constant rise. Cryptocurrencies are used as a means for digital transactions and for investment purposes around the world [3]. Bitcoins nature of combining monetary units and encryption technology lately has attracted substantial recognition in fields such as economics, computer science and cryptography [1]. Bitcoin being one of the first cryptocurrency which was decentralized now has a market capital of 170 billion US dollars [2]. Since bitcoin is a decentralized cryptocurrency, it is not owned by a government body or restricted to a certain location but applied as a type of peer to peer payment [4]. With the ever-increasing demand in trying to understand the fluctuation in prices of cryptocurrencies, it is vital to have a system that can help predict the change in prices daily. 
Like the stock exchange bitcoin price change is quite volatile and can be difficult to get a high accuracy in terms of prediction. The value of bitcoin or any other cryptocurrency cannot be static and can vary almost every second. The fluctuation is completely dependent on the amount being paid for bitcoin by buyers. As bitcoin is used as an investment, the same principle applied in stocks for buying cheap and selling at a high price is applicable for cryptocurrency [4]. The volatile nature of the cryptocurrency makes it much more challenging and interesting for analysts to predict the right price. The prediction and approximation of bitcoin prices is an area where much research has not been done [1]. The traditional time series methodology is not suitable since there is a lack of seasonality in the cryptocurrency market and the major factors that help in this methodology are trend, seasonal and noise [5]. Since investors are keen to know the direction of cryptocurrency price i.e. high or low it is vital to have an algorithm that gives the best accuracy in terms of determining the range. A lot of work and research has been done in trying to predict the direction of stock prices and very less in terms of cryptocurrency. In the following sections we will investigate the related work, the methodology used, and the results achieved. One of the main papers referred to the project is [5]. We will be trying to improve on the accuracy achieved by adding more parameters used in the study. With the help of classification methods, we will be classifying the closing price into high, low and no change. Also, we will be using various multiple linear regression methods and find the best method suited for this project. The results of each model would then be analyzed to find the best-suited model for classification and multiple linear regression. II. RELATED WORK Our work improves on the existing research done to predict the bitcoin prices in [5]. 
McNally, Roche and Caton, 2018 investigated RNN along with LSTM for prediction. The algorithms were benchmarked based on GPU and CPU performances. Results were then compared to ARIMA where it was known that the accuracy of ARIMA was very poor in comparison to RNN and LSTM. Accuracy measured was 52.78%, 50.25% and 50.05% for RNN, LSTM, and ARIMA respectively. [1] used BNN analysing time series. Linear and non-linear benchmark models were used to compare. Resampling was done with the help of bootstrap and cross- validation. The prices were then compared with SVR and linear regression. BNN gave the best results in terms of accuracy in comparison to others. [3] implemented the general linear model and Bayesian regression to predict the daily change of price values considering the parameters. Five normalization techniques were used on the data. Finally, with random forest was applied on both the time series datasets, the results of which were combined to predict the macro change in price. [4] used four ANN methods BPNN, GANN, GABPNN and NEAT. Data was executed with 30 iterations on training. The study focused only on a day’s prediction. BPNN outperformed GAPNN. [6] implemented the genetic algorithm based selective neural network ensemble which is built using multi-layered perceptron. Supervised algorithm Levenberg-Marquardt (LM) was used as a result of the complexity and the computational cost. [7] analysed the social media posts and performed a sentimental analysis to get a positive, neutral and negative score. Using this data ‘Granger causality test’ was performed to test and reject the null hypothesis which was assumed that community comments do not help in predicting the fluctuations in cryptocurrency prices. This paper analyses the market using sentiment score as compared to the HMM model which uses time series data to predict the prices. One of the major risks with social media posts being that it can be easily exploited. 
[8] used the Hidden Markov Model (HMM) to examine social media posts to predict the transition to another state at a certain point in time, given the current state of the currency. With the help of this model, by identifying the hidden state, given the data point, the state of the
  • 2. 2 cryptocurrency at a certain point in time can be predicted. This model particularly focuses on time series data to predict the prices of the cryptocurrency. [9] predicted the highest and closing price of bitcoin using time-delay neural network (TDNN) and recurrent neural network (RNN). The models were trained across data from past eight quarters to test over the next quarter. TDNN needed less training time and predicted values closer to the actual price as compared to RNN. [10] approached classification and regression problems of machine learning by proposing a regularization method based neural network. Their results depicted that obtaining directional accuracy of up to 5%, the rolling volatility and rolling skewness were the best auxiliary objectives to forecast. The best regularization parameters for the tasks were found by applying Bayesian optimization. [11] in their thesis work, used fractionally integrated autoregressive moving average (ARFIMA) model to predict the value of currency using the exchange rate of the Bitcoin. Their research is based on the Lewellen approach and the approaches of Westerlund and Narayan to find any statistical effects that could be responsible for a bias of regression estimates. [12] applied the ARIMA (Autoregressive Integrated Moving Average) model to predict the exchange rate of Bitcoin, by conducting autocorrelation function and partial autocorrelation function analysis to determine the parameters for the ARIMA model. The MAPE of the model was found to be 5.36% while explaining approximately 44% of variability from the response of the data around the models mean. [13] predicted the stock prices using four different models namely ANN, Naïve Bayes, SVM and Random forest. Naïve Bayes exhibited the least performance while random forest had the highest performance. [14] applied deep neural network (DNN) to predict the stock returns in future. 
DNN outperformed linear autoregressive model in training set but did not have the advantage the test set. [15] used GARCH (General Autoregressive Conditional Heteroskedasticity) and LSTM (Long short-term memory) to forecast the volatility of the stock price index. Multiple GARCH models gave much-improved prediction over other hybrid neural networks. [16] proposed a BNNMAS (bat- neural network multi-agent system) architecture with four layers to tackle the problem of stock prediction. The model proved to be quite robust. [17] implemented WNN (Wavelet neural network) to reduce the size of the network and simplify the structure. [18] tried to determine the various factors that determine the price of bitcoin by taking into consideration the twitter sentiment. SVM (Support vector machine) was used to analyse the sentiment ratio on a day- to-day basis. The research showed that the bitcoin prices were positively affected by the search queries from Wikipedia. [19] tried to predict the bitcoin price for one hour in the future with the help of a naïve approach to set the baseline prediction and evaluated the results by using mean squared error (MSE). Other tree-based algorithms and k nearest neighbour algorithm were used which didn’t even match up to the baseline prediction. SVM and linear regression performed better in comparison. Finally, [20] used the neural network to predict the stock prices. Higher performance was achieved by increasing the number of hidden units although increasing the units beyond a certain point diminished the performance of the model. Neural network gave significant results when compared to multiple discriminant analysis (MDA) for predicting stock prices. III. DATA MINING METHODOLOGY In this project, we have used the CRISP data mining methodology. We had a clear business understanding as to predict the closing price of bitcoin when comparing to USD. The data was sourced from blockchain and coinmarketcap. 
The Quandl package was used to source data dynamically from Blockchain. The data covers 1 January 2014 to 27 July 2018. The second source of data is CoinMarketCap, from which data was retrieved with the htmltab package by supplying the start and end dates. Data from 2013 was not considered because the volume column had no values. Cleaning was then performed to standardize the date and number formats. Finally, the two data sets were merged on date.

We then performed the Granger causality test to check whether classification methods can be used for prediction. The ANN and LSTM classification methods were run without handling the high correlation between variables. For multiple linear regression (MLR), the high correlation was handled with principal component analysis (PCA). Lasso, ridge and elastic net linear regression models were implemented, and the results were compared on root mean squared error (RMSE). We first examine the classification methods, i.e. ANN and LSTM, and then the multiple linear regression methods. Figure 1 shows the correlations between the variables.

Figure 1: Correlation Matrix

A. Classification

2. ANN

An Artificial Neural Network (ANN) is a cognitive learning technique used for approximation, as described in [22]. In recent times, the use of ANNs has increased for tasks such as classification, time series forecasting and pattern recognition, and it has increased drastically in financial organizations. As the data used is a time series, ANN was considered for implementation. Moreover, an ANN is a non-linear model and can handle complex relationships between variables. It can also generalize and infer relationships that are not seen in the data. In addition, an ANN does not impose any restriction on the input data [23].

Here, ANN is used to classify the direction of the close price, i.e. higher or lower, so the classification is binary. To prepare the model, some initial cleaning was required and was done in R. One of the key requirements of an ANN is normalized data, so the data was scaled to bring the values between 0 and 1. Price-Direction was the dependent variable and was derived from the "Close" attribute. Price-Direction was then encoded with 1 for higher and 0 for lower. Various models were constructed with single and multiple hidden layers. A single hidden layer was found to give the model better and more consistent accuracy. After the inputs were prepared as the model requires, the dataset was divided into training and test data with 70 and 30 percent of the rows respectively. Models with different hidden layers were then run on this input and the results were collected. Tables 1, 2 and 3 consolidate the confusion matrix calculations for the single hidden layer with one neuron, and for two hidden layers with (2,1) and (4,3) neurons, respectively.
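The preparation and evaluation steps described above can be illustrated with a minimal sketch. The study itself was carried out in R; the Python below is only an illustration, and every function name in it is hypothetical. It assumes that Price-Direction compares each close with the previous day's close, that the 70/30 split is chronological, and that the confusion matrices are laid out with predicted classes as rows and actual classes as columns. None of these details is stated in the text.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def price_direction(close):
    """Encode close-price direction: 1 = higher than the previous close, 0 = lower.

    Assumption: direction is measured against the previous close. Ties map
    to 0 here, whereas the study removed its two 'no change' records.
    """
    return [1 if today > prev else 0 for prev, today in zip(close, close[1:])]


def chronological_split(rows, train_fraction=0.7):
    """Split rows into train and test parts without shuffling (assumed)."""
    cut = round(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def confusion_metrics(matrix):
    """Summarise a 2x2 confusion matrix given as [[a, b], [c, d]].

    Assumed layout: rows are predicted classes, columns are actual classes,
    both in the order (Lower, Higher).
    """
    (a, b), (c, d) = matrix
    total = a + b + c + d
    accuracy = (a + d) / total
    return {
        "accuracy": accuracy,
        "misclassification": 1 - accuracy,
        "sensitivity": a / (a + c),  # first-column cases found in the first row
    }


# Toy close prices, purely illustrative.
close = [100.0, 102.0, 101.0, 105.0, 104.0]
scaled = min_max_scale(close)
labels = price_direction(close)
train, test = chronological_split(list(range(10)))

# Counts taken from Table 1 (single hidden layer, one neuron).
table1 = confusion_metrics([[120, 111], [253, 326]])
```

With the Table 1 counts, the sketch gives an accuracy of about 0.5506 and a sensitivity of about 0.3217, matching the reported values.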
Model 1: ANN model with a single hidden layer and one neuron

Figure 2: ANN model with one hidden layer

TABLE 1: Confusion matrix with one hidden layer

                   Lower    Higher
Lower              120      111
Higher             253      326
Accuracy           0.55061728
Misclassification  0.44938272
Sensitivity        120 / (120 + 253) = 0.3217
Specificity        326 / (111 + 326) = 0.8624

Model 2: ANN model with two hidden layers with two and one neurons

Figure 3: ANN model with two hidden layers (2,1)

TABLE 2: Confusion matrix with two hidden layers (2,1)

                   Lower    Higher
Lower              138      141
Higher             219      298
Accuracy           0.5477386935
Misclassification  0.4522613065
Sensitivity        138 / (138 + 219) = 0.3876
Specificity        141 / (141 + 298) = 0.3211

Model 3: ANN model with two hidden layers with four and three neurons

Figure 4: ANN model with two hidden layers (4,3)

TABLE 3: Confusion matrix with two hidden layers (4,3)

                   Lower    Higher
Lower              125      148
Higher             248      289
Accuracy           0.55111111
Misclassification  0.48888889
Sensitivity        125 / (125 + 248) = 0.3351
Specificity        289 / (148 + 289) = 0.2860

Several other models with different configurations were executed. The model with one hidden layer and one neuron gave consistent results on the training and test datasets, and better accuracy.

Disadvantages of the ANN model:
1. Execution time is high on moderate hardware.
2. Reduced trust, as it gives different results with different model configurations.

1. LSTM

The LSTM model consists of memory blocks, each containing an input gate, an output gate and a forget gate acting on the cell memory (Ct) [15]. LSTM was preferred over MLP because of the temporal nature of bitcoin data [5]. The deep learning model was built with the Keras package for R; Keras is already well established in the Python environment. The sample data was split 80% and 20% into train and test sets respectively. Dense and dropout layers were defined, with two hidden layers of 60 and 50 neurons and an output layer of one neuron, since the problem is binary classification. The activation function is applied to the weighted sum of the inputs plus a bias [15]. Because the classification probability lies between 0 and 1, the output uses the sigmoid, which is a non-linear activation function. The Rectified Linear Unit (ReLU) outputs max(x, 0), i.e. zero for negative inputs and the input itself otherwise. As a value passes through the block, the forget gate, driven by a sigmoid function, controls how much information from the previous state (Ct-1) is retained. Binary cross-entropy (log loss) is used as the loss function on the predicted probabilities between 0 and 1; a loss of 0 corresponds to a perfect model and the loss grows as the predictions get worse. The stochastic optimizer Adam adapts the learning rate for the weights during training. The learning rate (lr) was set to 0.0001 with a decay of 1e-6.
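The activation and loss functions named above can be written out in a few lines. This is a plain Python sketch for illustration, not the Keras code used in the study; the function names and the clipping constant eps are assumptions introduced here.

```python
import math


def sigmoid(x):
    """Logistic function: maps any real number into the interval (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # numerically stable branch for negative inputs
    return e / (1.0 + e)


def relu(x):
    """Rectified Linear Unit: max(x, 0)."""
    return max(x, 0.0)


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean log loss for binary labels (0 or 1) and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log() stays finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


good = binary_cross_entropy([1, 0], [0.9, 0.1])  # confident and correct
bad = binary_cross_entropy([1, 0], [0.0, 1.0])   # confident and wrong
```

Here `good` is small while `bad` is far above 1, which shows that log loss is not bounded by 1 and that a value such as the 7.18 reported for the LSTM model is possible when many predictions are confidently wrong.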
The metrics parameter was set to accuracy to measure model performance. The model was tested over multiple epoch counts ranging from 50 to 150, and a batch size of 150 tended to give the best results. Compared with [5], where LSTM reported an accuracy of 52.78%, considering more parameters improved the accuracy. The confusion matrix and the log loss were tracked and plotted, giving a loss of 7.18 with an accuracy of 54.35%. The sensitivity and specificity are 0.98 and 0.038 respectively. Because less data was available than for other financial data, the accuracy was lower; it can be improved over time as more historical and current data is collected.

B. Regression Analysis

1. Assumptions

To conduct multiple linear regression analysis, certain assumptions [24] about the data need to be met. These assumptions include:

1) Adequate Sample Size: According to Tabachnick and Fidell, cited in Pallant (2007), the required sample size is N > 50 + 8m, where m is the total number of independent variables [24]. With eleven independent variables the threshold is 50 + 8 × 11 = 138, and the project's sample size of 1669 is well above it.

2) Outliers: Outliers in the dataset were detected in R with boxplot.stats(column_name)$out, wrapped in a function that takes each column as input, and the detected values were replaced with the mean.

3) Multicollinearity and Singularity: The independent and dependent variables showed high correlations, as mentioned before. This violates the assumption of no multicollinearity. Principal component analysis was performed to overcome this [25], keeping only those linear combinations of the independent variables that explain the most variance. PCA was conducted on the train and test datasets using the prcomp function in R with scaling set to TRUE to normalize the data, which is one of the requirements for performing PCA.
Figure 5: LSTM model [21]

The figure below indicates the components that explain the maximum proportion of variance for the input train set of data:

Figure 6: Proportion of variance against each component
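The checks behind assumptions 1) to 3) can be sketched briefly. The study used R (boxplot.stats and prcomp); the Python below is only an illustration with hypothetical function names. It assumes that the replacement value is the mean of the whole column and that quartiles are a good enough stand-in for the hinges R uses, so the flagged points may differ slightly from R's. The component standard deviations passed in the example are made-up values.

```python
import statistics


def required_sample_size(num_predictors):
    """Lower bound from the rule N > 50 + 8m."""
    return 50 + 8 * num_predictors


def impute_outliers_with_mean(values, coef=1.5):
    """Replace values beyond coef * IQR from the quartiles with the column mean."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - coef * iqr, q3 + coef * iqr
    mean = statistics.fmean(values)  # assumption: mean of the full column
    return [mean if (v < low or v > high) else v for v in values]


def proportion_of_variance(sdev):
    """Share of total variance explained by each principal component.

    sdev holds the component standard deviations, as reported by a PCA routine.
    """
    variances = [s * s for s in sdev]
    total = sum(variances)
    return [v / total for v in variances]


threshold = required_sample_size(11)  # eleven independent variables
column = [10, 12, 11, 13, 12, 11, 100]  # toy column with one extreme value
cleaned = impute_outliers_with_mean(column)
shares = proportion_of_variance([2.0, 1.0, 1.0])  # hypothetical sdev values
```

In the toy column only the value 100 falls outside the whiskers, so it alone is replaced by the column mean; components whose share of variance tends towards zero are the ones a selection step would drop.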
After PCA was conducted on the train set, only the components that explain the most variance were kept (excluding those tending towards 0, i.e. components 9 and 10). These were used to create the test PCA dataset and as inputs for the regression analysis. Thus a total of eight normalized principal components for the test and train datasets were used as inputs for the multiple linear regression (MLR) algorithms in the sections that follow.

4) Normality, Linearity, Homoscedasticity, Independence of Residuals: Normality and linearity of the data after PCA were checked with the Normal Q-Q plots (Figure 7), and homoscedasticity and independence of residuals with the residuals vs fitted values plot (Figure 8). The straight red line in Figure 8 indicates that the assumption of homoscedasticity is not violated. The linear distribution of the residuals in Figure 7 indicates the linearity of the data as well.

Figure 7: Normal Q-Q plots checked to confirm the assumptions of normality and linearity.

Figure 8: Residuals vs fitted values plot to confirm assumptions of homoscedasticity and independence of residuals.

Having met all the assumptions, the following sections explain how we performed MLR. Before the models were built, a custom control was created with the train control function of the caret package in R, which let us choose the number of times each model is run in order to select the best fit.

2. Linear Regression

If Y is the dependent or response variable, x is the predictor or explanatory variable, β is the coefficient and ε is the random error or noise, with n the total number of variables, then the linear regression model equation for squared error calculation is represented as:

[Equation (1)]

Based on (1), the method of caret's train function was set to linear model. This trained model was then used in the comparison with the regression models considered next.

C. Ridge Regression

The least squares linear regression model can be modified into a ridge regression model by applying a non-negative cost (penalty) term, lambda (λ), to the coefficients [27], which modifies the squared error equation as follows:

[Equation (2)]

Based on (2), and as we were using a custom control to find the best penalty value for lambda, a sequence of lambda values ranging from 0.0001 to 0.2 was run. A ridge regression model is trained under the L2 regularization norm, which tends to shrink the coefficients of correlated predictors towards one another, allowing them to borrow strength from each other [27]. Figure 9 depicts the behaviour of the components under the L2 penalty. The trend shows that the higher the penalty applied, the further the coefficients of the components deviate from one another; hence the model chosen among the ten iterations of the custom control is the one at lambda = 0.0001.

Figure 9: Coefficients variation on Ridge regression's (L2 regularization) best model

The influence of each of the eight selected principal components on the final ridge regression model is depicted in Figure 10. The eighth and the first principal components have the highest influence in the ridge regression's final model, while the seventh and the second have the least impact on the model's prediction capabilities, though not completely zero.

Figure 10: Variable importance for ridge model

D. Least Absolute Shrinkage and Selection Operator (Lasso) Regression

In addition to applying the lambda cost penalty, lasso regression shrinks the size of the model's coefficients and selects only those coefficients that have a significant impact on the prediction outcome [26]. The linear regression model for squared error calculation is now modified to:

[Equation (3)]

Based on (3), we again use the custom control to find the best penalty value for lambda, with the same sequence of lambda values as before and the alpha value set to 1. Lasso regression applies the L1 penalty norm, under which the model tends to favour one component/coefficient over the rest when choosing its coefficients [27]. The lasso penalty corresponds to a Laplace prior, which expects most of the coefficients to be close to zero and only a small number to be larger in magnitude [27].

As the L1 penalty is applied in the lasso model, the coefficients respond to the variation in the model as depicted in Figure 11. The component shown in black (PC1) is the only component present at a lambda value of 0.0001, while all other coefficients appear at later stages of the lambda sequence; thus the final lasso regression model chooses 0.0001 as the optimum lambda for the L1 penalty.

Figure 11: Lasso regression's best model

The parameters that influence the predictions of the final lasso regression model are depicted in Figure 12. It still confirms component eight as the major influencer, while component seven has zero influence, which is possible in lasso regression because it favours one component while completely ignoring the rest [27].

Figure 12: Variable importance for lasso model

E. Elastic Net Regression

Elastic net regression combines the lasso and ridge regression model equations [27]. This is achieved by introducing alpha values as a sequence from 0 (ridge) to 1 (lasso). The elastic net regression model for squared error is now modified to include alpha:

[Equation (4)]

Figure 13: Elastic regression's best model
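For reference, the squared-error objectives that equations (1) to (4) describe are commonly written as shown below. This is a sketch in standard textbook notation and not necessarily the exact form used in the original equations. The observation index i, the number of observations N and the intercept β_0 are symbols assumed here. The elastic net penalty is shown in the simple convex-combination form, so that alpha = 0 gives exactly the ridge objective and alpha = 1 exactly the lasso objective; the formulation in [27] additionally scales the loss by 1/(2N) and the ridge term by 1/2.

```latex
% Linear model: Y = \beta_0 + \sum_{j=1}^{n} \beta_j x_j + \varepsilon
\begin{align}
\mathrm{RSS}(\beta) &= \sum_{i=1}^{N} \Bigl( y_i - \beta_0 - \sum_{j=1}^{n} \beta_j x_{ij} \Bigr)^{2} \tag{1} \\
\text{Ridge:} \quad & \mathrm{RSS}(\beta) + \lambda \sum_{j=1}^{n} \beta_j^{2} \tag{2} \\
\text{Lasso:} \quad & \mathrm{RSS}(\beta) + \lambda \sum_{j=1}^{n} \lvert \beta_j \rvert \tag{3} \\
\text{Elastic net:} \quad & \mathrm{RSS}(\beta) + \lambda \Bigl[ (1 - \alpha) \sum_{j=1}^{n} \beta_j^{2} + \alpha \sum_{j=1}^{n} \lvert \beta_j \rvert \Bigr] \tag{4}
\end{align}
```

In each case the coefficients are chosen to minimise the expression, with λ ≥ 0 controlling the strength of the penalty and 0 ≤ α ≤ 1 the mix between the L2 and L1 norms.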
Based on (4), the custom control is run to find the best penalty value for lambda as well as for alpha. Figure 13 depicts the response of the predictor components to the L1 and L2 penalty norms as the model varies alpha from 0 to 1 and lambda from 0.0001 to 0.2. The final outcome is given in Table 4. The penalty term is extremely useful in cases where the number of predictors exceeds the number of records in the dataset [27]. As expected, the important variables for the elastic net regression's final model lie between those of lasso and ridge with minor variations, and the eighth component still has the maximum influence on the model.

IV. EVALUATION AND RESULTS

For multiple linear regression, the final prediction model was chosen by comparing the above-mentioned regression models on their root mean square error (RMSE) and R-squared values, which are consolidated in Table 4.

Table 4: Regression model comparison

Model                    Cost penalties            RMSE      R-squared
Lasso Regression         α = 1, λ = 0.0001         0.00873   0.998
Ridge Regression         α = 0, λ = 0.0001         0.02190   0.998
Elastic Net Regression   α = 0.1111, λ = 0.0001    0.00808   0.999

This indicates that elastic net regression outperforms the other models, with the lowest RMSE. The elastic net regression model was therefore chosen to make predictions on the test data set generated from the PCA. The RMSE of the prediction using elastic net regression was 6.73%, with an R-squared value of 99.99%.

In classification, ANN and LSTM were compared on accuracy, specificity and sensitivity. The results are shown in Table 5.

Table 5: ANN vs LSTM

Model   Sensitivity   Specificity   Accuracy
ANN     0.3351        0.286         55.11%
LSTM    0.98          0.038         54.35%

From Table 5 we can see that ANN performed slightly better than LSTM. The low accuracies also show that both models found it hard to learn from the data.

V. CONCLUSION AND FUTURE WORK

Coming up with the best model for ANN can be time-consuming.
From the results, the accuracy of LSTM and that of the best ANN model are very close, differing by 0.76 percentage points. Among the linear regression models, elastic net outperformed the lasso and ridge models: it had the lowest RMSE and the highest R-squared value. However, it was noticed that with more recent data there were fluctuations in the RMSE and R-squared values. With more recent data and further price fluctuations, the performance of the model may change. Due to time constraints, LSTM could not be evaluated alongside RNN. For further study in this area, LSTM performance can be evaluated alongside RNN with much more recent data and more parameters. Some of the parameters that were not included in this study are minutes per transaction, the number of unspent transactions and transaction fees. These parameters were excluded because they had low correlation with the other attributes. Multiple linear regression including these parameters may give better predictions. One limitation of this study is that only binary classification was performed: among the 1669 records, only 2 had 'no change', and these were removed. With much more recent data over a longer period and more 'no change' records, multiclass classification can be performed. There was also a limitation in the keras package for R, which, unlike in Python, does not support all versions of R; much better results could have been obtained with full access to the package in R.

Figure 14: Variable importance for elastic net model

REFERENCES

[1] H. Jang and J. Lee, 2018. An empirical study on modeling and prediction of bitcoin prices with bayesian neural networks based on blockchain information. IEEE Access, 6, pp.5427-5437.
[2] M. Nakano, A. Takahashi and S. Takahashi, 2018. Bitcoin technical trading with artificial neural network.
[3] S. Velankar, S. Valecha and S. Maji, 2018, February. Bitcoin price prediction using machine learning. In Advanced Communication Technology (ICACT), 2018 20th International Conference on (pp. 144-147). IEEE.
[4] A. Radityo, Q. Munajat and I. Budi, 2017, October. Prediction of Bitcoin exchange rate to American dollar using artificial neural network methods. In Advanced Computer Science and Information Systems (ICACSIS), 2017 International Conference on (pp. 433-438). IEEE.
[5] S. McNally, J. Roche and S. Caton, 2018, March. Predicting the price of Bitcoin using Machine Learning. In Parallel, Distributed and Network-based Processing (PDP), 2018 26th Euromicro International Conference on (pp. 339-343). IEEE.
[6] E. Sin and L. Wang, 2017, July. Bitcoin price prediction using ensembles of neural networks. In 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD) (pp. 666-671). IEEE.
[7] Y.B. Kim, J.G. Kim, W. Kim, J.H. Im, T.H. Kim, S.J. Kang and C.H. Kim, 2016. Predicting fluctuations in cryptocurrency transactions based on user comments and replies. PloS one, 11(8), p.e0161197.
[8] R.C. Phillips and D. Gorse, 2017, November. Predicting cryptocurrency price bubbles using social media data and epidemic modelling. In Computational Intelligence (SSCI), 2017 IEEE Symposium Series on (pp. 1-7). IEEE.
[9] S. Gullapalli, 2018. Learning to predict cryptocurrency price using artificial neural network models of time series.
[10] L.D. Persio and O. Honchar. Multitask machine learning for financial forecasting.
[11] A.A. Salisu, L.O. Akanni and R.O. Azeez, 2018. Could this be affliction? Bitcoin forecasts most tradable currency pairs better than ARFIMA.
[12] N.A. Bakar and S. Rosbi, 2017. Autoregressive Integrated Moving Average (ARIMA) Model for Forecasting Cryptocurrency Exchange Rate in High Volatility Environment: A New Insight of Bitcoin Transaction. International Journal of Advanced Engineering Research and Science, 4(11).
[13] J. Patel, S. Shah, P. Thakkar and K. Kotecha, 2015. Predicting stock and stock price index movement using trend deterministic data preparation and machine learning techniques. Expert Systems with Applications, 42(1), pp.259-268.
[14] E. Chong, C. Han and F.C. Park, 2017. Deep learning networks for stock market analysis and prediction: Methodology, data representations, and case studies. Expert Systems with Applications, 83, pp.187-205.
[15] H.Y. Kim and C.H. Won, 2018. Forecasting the volatility of stock price index: A hybrid model integrating LSTM with multiple GARCH-type models. Expert Systems with Applications, 103, pp.25-37.
[16] R. Hafezi, J. Shahrabi and E. Hadavandi, 2015. A bat-neural network multi-agent system (BNNMAS) for stock price prediction: Case study of DAX stock price. Applied Soft Computing, 29, pp.196-210.
[17] L. Lei, 2018. Wavelet neural network prediction method of stock price trend based on rough set attribute reduction. Applied Soft Computing, 62, pp.923-932.
[18] I. Georgoula, D. Pournarakis, C. Bilanakos, D. Sotiropoulos and G.M. Giaglis, 2015. Using time-series and sentiment analysis to detect the determinants of bitcoin prices.
[19] A. Greaves and B. Au, 2015. Using the bitcoin transaction graph to predict the price of bitcoin.
[20] Y. Yoon and G. Swales, 1991, January. Predicting stock price performance: A neural network approach. In System Sciences, 1991. Proceedings of the Twenty-Fourth Annual Hawaii International Conference on (Vol. 4, pp. 156-162). IEEE.
[21] T. Gao, Y. Chai and Y. Liu, 2017, November. Applying long short term momory neural networks for predicting stock closing price. In Software Engineering and Service Science (ICSESS), 2017 8th IEEE International Conference on (pp. 575-578). IEEE.
[22] I. Kaastra and M. Boyd, 1996. Designing a neural network for forecasting financial and economic time series. Neurocomputing, 10(3), pp.215-236.
[23] J. Mahanta, 2017. Introduction to Neural Networks, Advantages and Applications. [Online] Available at: "https://towardsdatascience.com/introduction-to-neural-networks-advantages-and-applications-96851bd1a207" [Accessed 1 7 2018].
[24] J. Pallant, 2013. SPSS survival manual. McGraw-Hill Education (UK).
[25] H. Zou, T. Hastie and R. Tibshirani, 2006. Sparse principal component analysis. Journal of computational and graphical statistics, 15(2), pp.265-286.
[26] S.S. Roy, D. Mittal, A. Basu and A. Abraham, 2015. Stock market forecasting using LASSO linear regression model. In Afro-European Conference for Industrial Advancement (pp. 371-381). Springer, Cham.
[27] J. Friedman, T. Hastie and R. Tibshirani, 2010. Regularization paths for generalized linear models via coordinate descent. Journal of statistical software, 33(1), p.1.