Bitcoin Close Price Prediction Report
Bitcoin close price prediction
Abstract—The objective of this project is to determine how accurately the closing price of Bitcoin can be predicted using classification and linear regression methods. For classification, we implemented several ANN models with different numbers of layers and neurons to find the model with the best accuracy, and compared the result with an LSTM. The LSTM achieved an accuracy of 54.35% with a log loss of 7.18 in predicting the direction of the close price, while the best ANN model had an accuracy of 55.1%, almost on par with the LSTM. Among the multiple linear regression models, we found that elastic net performed better than the lasso and ridge models, with lower RMSE and R-squared values. The RMSE recorded for elastic net regression was 0.00808, the lowest of all the regression models compared.
Keywords—Bitcoin, machine learning, ANN, LSTM, multiple linear regression model, ridge, lasso, elastic net regression.
I. INTRODUCTION
In recent years cryptocurrencies have been on a constant rise. They are used around the world both as a means of digital payment and as an investment [3]. Bitcoin's combination of monetary units and encryption technology has lately attracted substantial recognition in fields such as economics, computer science and cryptography [1]. Bitcoin, one of the first decentralized cryptocurrencies, now has a market capitalization of 170 billion US dollars [2]. Because it is decentralized, Bitcoin is not owned by any government body or restricted to a particular location, but is used as a form of peer-to-peer payment [4].

With the ever-increasing demand for understanding the fluctuation of cryptocurrency prices, it is vital to have a system that can help predict daily price changes. As on the stock exchange, Bitcoin's price is quite volatile, which makes high prediction accuracy difficult to achieve. The value of Bitcoin, or of any other cryptocurrency, is never static and can vary almost every second; the fluctuation depends entirely on what buyers are willing to pay for Bitcoin. Since Bitcoin is used as an investment, the same principle applied to stocks, buying cheap and selling high, applies to cryptocurrency [4]. This volatility makes predicting the right price all the more challenging and interesting for analysts, and the prediction and approximation of Bitcoin prices is an area in which little research has been done [1]. Traditional time series methodology is not suitable here, since it relies mainly on trend, seasonal and noise components, and the cryptocurrency market lacks seasonality [5].
Since investors are keen to know the direction of a cryptocurrency's price, i.e. high or low, it is vital to have an algorithm that determines this direction as accurately as possible. A great deal of research has gone into predicting the direction of stock prices, but far less into cryptocurrency.

In the following sections we investigate the related work, the methodology used, and the results achieved. One of the main papers this project builds on is [5]; we attempt to improve on the accuracy reported there by adding more parameters to those used in that study. Using classification methods, we classify the closing price into high, low and no change (a minimal sketch of such a labelling follows below). We also apply several multiple linear regression methods and identify the one best suited to this project. The results of each model are then analyzed to find the best-suited model for classification and for multiple linear regression.
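To make the three-class target concrete, below is a minimal R sketch of how such a label might be derived from daily closing prices. The data layout, column name and the 1% "no change" band are illustrative assumptions, not values taken from this paper.

```r
# Illustrative three-class labelling of daily close-price moves.
# The 1% "no change" band is an assumed threshold, not from the paper.
label_direction <- function(close, band = 0.01) {
  ret <- c(NA, diff(close) / head(close, -1))   # simple daily returns
  ifelse(ret >  band, "high",
  ifelse(ret < -band, "low", "no_change"))      # first element stays NA
}

# Toy example:
prices <- c(100, 103, 102.5, 98, 98.2)
label_direction(prices)
# NA "high" "no_change" "low" "no_change"
```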
II. RELATED WORK
Our work builds on the existing research on Bitcoin price prediction in [5], where McNally, Roche and Caton (2018) investigated RNN and LSTM models for prediction. The algorithms were benchmarked on both GPU and CPU. The results were then compared with ARIMA, whose accuracy proved very poor relative to RNN and LSTM: the measured accuracies were 52.78%, 50.25% and 50.05% for RNN, LSTM and ARIMA respectively.
[1] used Bayesian neural networks (BNN) to analyse the time series, with linear and non-linear benchmark models for comparison. Resampling was done using bootstrap and cross-validation, and the predicted prices were compared with those of SVR and linear regression; the BNN gave the best accuracy of all. [3] implemented a general linear model and Bayesian regression to predict the daily change in price given the chosen parameters. Five normalization techniques were applied to the data, and finally random forest was applied to both time series datasets, the results of which were combined to predict the macro change in price.
[4] used four ANN methods: BPNN, GANN, GABPNN and NEAT. Training was run for 30 iterations, and the study focused on predicting only a single day ahead; BPNN outperformed GABPNN. [6] implemented a genetic algorithm based selective neural network ensemble built from multi-layered perceptrons; the supervised Levenberg-Marquardt (LM) algorithm was chosen on account of the complexity and computational cost involved. [7] analysed social media posts and performed sentiment analysis to obtain positive, neutral and negative scores. On this data a Granger causality test was performed, rejecting the null hypothesis that community comments do not help in predicting fluctuations in cryptocurrency prices. That paper analyses the market using sentiment scores, in contrast to the HMM model, which uses time series data to predict prices. A major risk with social media posts is that they can be easily exploited.
[8] used a Hidden Markov Model (HMM) on social media posts to predict, given the current state of the currency, the transition to another state at a certain point in time. By identifying the hidden state corresponding to a given data point, the model can predict the state of the cryptocurrency at a given time; it focuses particularly on time series data to predict cryptocurrency prices. [9] predicted the highest and closing prices of Bitcoin using a time-delay neural network (TDNN) and a recurrent neural network (RNN). The models were trained on data from the past eight quarters and tested on the following quarter; the TDNN needed less training time and predicted values closer to the actual price than the RNN. [10] approached the classification and regression problems of machine learning by proposing a regularization-based neural network. Their results showed that, for gains in directional accuracy of up to 5%, rolling volatility and rolling skewness were the best auxiliary objectives to forecast, and the best regularization parameters for the tasks were found by Bayesian optimization.
[11], in their thesis work, used the fractionally integrated
autoregressive moving average (ARFIMA) model to predict
the value of the currency using the Bitcoin exchange rate.
Their research is based on the Lewellen approach and the
approaches of Westerlund and Narayan to find any
statistical effects that could bias the regression estimates.
[12] applied the ARIMA (Autoregressive Integrated Moving
Average) model to predict the exchange rate of Bitcoin,
conducting autocorrelation function and partial
autocorrelation function analysis to determine the
parameters of the ARIMA model. The MAPE of the model
was found to be 5.36%, while explaining approximately
44% of the variability of the data around the model's mean.
[13] predicted
the stock prices using four different models, namely ANN,
Naïve Bayes, SVM and random forest. Naïve Bayes
exhibited the worst performance while random forest
performed best. [14] applied a deep neural network (DNN)
to predict future stock returns. The DNN outperformed a
linear autoregressive model on the training set but did not
retain this advantage on the test set.
[15] used GARCH (Generalized Autoregressive Conditional
Heteroskedasticity) models and LSTM (long short-term
memory) to forecast the volatility of a stock price index. A
hybrid integrating LSTM with multiple GARCH models
gave much-improved predictions over other hybrid neural
networks. [16] proposed a BNNMAS (bat-
neural network multi-agent system) architecture with four
layers to tackle the problem of stock prediction. The model
proved to be quite robust. [17] implemented WNN (Wavelet
neural network) to reduce the size of the network and
simplify the structure. [18] tried to determine the various
factors that influence the price of bitcoin, taking Twitter
sentiment into consideration. SVM (support vector machine)
was used to analyse the sentiment ratio on a day-to-day
basis. The research showed that bitcoin prices were
positively affected by search queries from Wikipedia. [19]
tried to predict the bitcoin price one hour into the future,
using a naïve approach to set the baseline prediction and
evaluating the results using mean squared error (MSE).
Tree-based algorithms and the k-nearest-neighbour
algorithm did not even match the baseline prediction; SVM
and linear
regression performed better in comparison. Finally, [20]
used a neural network to predict stock prices. Higher
performance was achieved by increasing the number of
hidden units, although increasing them beyond a certain
point diminished the performance of the model. The neural
network gave significantly better results than multiple
discriminant analysis (MDA) for predicting stock prices.
III. DATA MINING METHODOLOGY
In this project, we have used the CRISP-DM data mining
methodology. The business understanding was clear: to
predict the closing price of bitcoin against USD. The data
was sourced from blockchain and coinmarketcap. The
Quandl package is used to dynamically source data from
blockchain, covering 1st January 2014 to 27th July 2018.
The second source of data is coinmarketcap; the htmltab
package is used to source data from this site by giving the
start and end dates. Data from 2013 was not considered as
the volume column did not have any values. Cleaning was
then performed to adjust the date and number formats.
Finally, the data sets were merged based on date.
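The sourcing step can be sketched in R as follows; the Quandl dataset code (BCHAIN/MKPRU) and the coinmarketcap URL shape are illustrative assumptions, as the exact identifiers are not given above:

library(Quandl)   # dynamic sourcing of blockchain data
library(htmltab)  # scraping the coinmarketcap historical table

# One blockchain indicator as an example; BCHAIN/MKPRU (market price, USD)
# is an assumed code -- the full set of sourced indicators is not listed here.
chain <- Quandl("BCHAIN/MKPRU",
                start_date = "2014-01-01", end_date = "2018-07-27")

# Coinmarketcap history over the same window; the URL format is an assumption.
url <- paste0("https://coinmarketcap.com/currencies/bitcoin/historical-data/",
              "?start=20140101&end=20180727")
cmc <- htmltab(doc = url, which = 1)

# Adjust the date format, then merge the two sources on date.
cmc$Date <- as.Date(cmc$Date, format = "%b %d, %Y")
btc      <- merge(chain, cmc, by = "Date")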
We then performed the Granger causality test to check
whether classification methods could be used for prediction.
ANN and LSTM classification were performed without
handling the high correlation, while multiple linear
regression (MLR) was performed after handling the high
correlation with the help of principal component analysis
(PCA). Lasso, ridge and elastic net linear regression models
were implemented, and the results were compared based on
the root mean squared error (RMSE). We first investigate
the classification methods, i.e. ANN and LSTM, and then
the multiple linear regression methods. Figure 1 depicts the
correlations between the variables.
A. Classification
1. ANN
Artificial Neural Network (ANN) is part of cognitive
learning and is used for function approximation, as
mentioned in [22]. In recent times, the use of ANN has
increased for tasks such as classification, time series
forecasting and pattern recognition. Moreover, the use of
ANN has drastically increased in financial organizations. As
the data used is time series in nature, ANN was considered
for implementation. Moreover, ANN is a non-linear model
and can handle complex relationships between variables. It
can also generalize and infer relationships that are unseen in
the data. In addition, ANN does not impose any restrictions
on the input data [23].
Figure 1: Correlation Matrix
Here, ANN is used for classifying the direction of the close
price, i.e. high or low; the classification was therefore binary
in nature. To prepare the model, some initial cleaning was
required and was done in R. One of the key requirements of
ANN is normalized data, so the values were scaled to lie
between 0 and 1. Price-Direction, the dependent variable,
was derived from the "Close" attribute and encoded as 1 for
higher and 0 for lower. Various models were constructed
with single and multiple hidden layers; it was found that a
single hidden layer gave the model better and more
consistent accuracy.
After modelling the inputs as per the requirements of the
ANN, the dataset was divided into training and test data
with a 70/30 percent split of the rows. Based on the inputs,
models with different hidden layers were run and the results
collected; a sketch of this setup is given below. Tables 1, 2
and 3 consolidate the confusion matrix calculations for a
single hidden layer with one neuron, and for two hidden
layers with (2,1) and (4,3) neurons respectively.
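A minimal sketch of this setup, assuming the neuralnet package (the paper does not name its ANN implementation) and an assumed derivation of Price-Direction:

library(neuralnet)  # assumed package; the ANN library is not named above

# Min-max scaling to [0, 1], a stated requirement of the ANN.
scale01 <- function(x) (x - min(x)) / (max(x) - min(x))
dat <- as.data.frame(lapply(Filter(is.numeric, btc), scale01))

# Price-Direction from "Close": 1 if it rose against the previous day, else 0
# (this exact derivation is an assumption).
dat$PriceDirection <- c(NA, as.integer(diff(btc$Close) > 0))
dat <- na.omit(dat)

set.seed(42)
idx   <- sample(nrow(dat), 0.7 * nrow(dat))  # 70/30 train-test split
train <- dat[idx, ]
test  <- dat[-idx, ]

# Best reported configuration: a single hidden layer with one neuron.
inputs <- setdiff(names(train), "PriceDirection")
f      <- as.formula(paste("PriceDirection ~", paste(inputs, collapse = " + ")))
nn     <- neuralnet(f, data = train, hidden = 1, linear.output = FALSE)

prob <- compute(nn, test[, inputs])$net.result
pred <- ifelse(prob > 0.5, "Higher", "Lower")
table(Predicted = pred, Actual = ifelse(test$PriceDirection == 1, "Higher", "Lower"))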
Model 1:
ANN model with a single hidden layer and one neuron:
TABLE 1: Confusion matrix with one hidden layer
(rows: predicted class, columns: actual class)
                   Lower   Higher
Lower               120     111
Higher              253     326
Accuracy           0.55061728
Misclassification  0.44938272
Sensitivity        120/(120+253) = 0.3217
Specificity        0.8624
Model 2:
ANN model with two hidden layers with two and one
neurons:
TABLE 2: Confusion matrix with two hidden layers (2,1)
(rows: predicted class, columns: actual class)
                   Lower   Higher
Lower               138     141
Higher              219     298
Accuracy           0.5477386935
Misclassification  0.4522613065
Sensitivity        138/(138+219) = 0.3866
Specificity        141/(141+298) = 0.3212
Model 3:
ANN model with two hidden layers with four and three
neurons:
Figure 2: ANN model with one hidden layer
Figure 3: ANN model with two hidden layers (2,1)
Figure 4: ANN model with two hidden layers (4,3)
TABLE 3: Confusion matrix with two hidden layers (4,3)
(rows: predicted class, columns: actual class)
                   Lower   Higher
Lower               125     148
Higher              248     289
Accuracy           0.55111111
Misclassification  0.48888889
Sensitivity        125/(125+248) = 0.3351
Specificity        0.2860
Several other models with different configurations were
executed. The model with one hidden layer and one neuron
gave consistent results across the training and test datasets
and better accuracy.
Disadvantages of the ANN model:
1. Execution time is high on moderate hardware.
2. Reduced trust, as it gives different results with
different model configurations.
2. LSTM
The model consists of memory blocks comprising an input
gate, an output gate and a forget gate acting on the memory
cell (Ct) [15]. LSTM was preferred over MLP owing to the
time-series nature of the bitcoin data [5]. The deep learning
model is supported by the keras package in R, already well
known in the Python environment. The sample data was
split 80%/20% into train and test sets respectively. Dense
and dropout layers were defined, with two hidden layers of
60 and 50 neurons and an output layer of one neuron, as the
problem is binary classification.
The activation function is applied to the weighted sum of
the inputs plus a bias [15]. The classification probability lies
in the range of 0 to 1 and is thus supported by the sigmoid, a
non-linear activation function. The Rectified Linear Unit
(ReLU) outputs the positive part of its input, given by
max(x, 0). Values passing through the gates are controlled
by the forget gate, which decides how much information
from the previous cell state (Ct-1) is retained via a sigmoid
function. Binary cross-entropy is used as the log-loss
measure on probabilities between 0 and 1, with values
towards 1 indicating a poor model and 0 a perfect one.
Adam, a stochastic optimizer, adapts the learning rate
applied to the weights during training. The learning rate (lr)
was specified as 0.0001 with a decay of 1e-6, and the
metrics parameter was set to accuracy to monitor model
performance. The data was tested against epoch counts
ranging from 50 to 150 with a batch size of 150; the higher
epoch counts gravitated towards the better results.
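A minimal sketch of this configuration with keras in R; the dropout rate and the input shape are assumptions, as the text does not state them:

library(keras)

model <- keras_model_sequential() %>%
  layer_dense(units = 60, activation = "relu",
              input_shape = ncol(x_train)) %>%   # x_train: scaled predictor matrix
  layer_dropout(rate = 0.2) %>%                  # rate is an assumption
  layer_dense(units = 50, activation = "relu") %>%
  layer_dropout(rate = 0.2) %>%
  layer_dense(units = 1, activation = "sigmoid") # binary output in [0, 1]

model %>% compile(
  loss      = "binary_crossentropy",
  optimizer = optimizer_adam(lr = 0.0001, decay = 1e-6),
  metrics   = "accuracy"
)

# The 80/20 train/test split is applied beforehand; epochs were varied 50-150.
model %>% fit(x_train, y_train, epochs = 150, batch_size = 150)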
In comparison to [5], where LSTM reported an accuracy
of 52.78%, we have improved on the accuracy by
considering more parameters. The confusion matrix and the
log loss were traced to plot the graph, resulting in a loss of
7.18 and an accuracy of 54.35%. The values of sensitivity
and specificity are 0.98 and 0.038 respectively. As the data
used in the model was small compared to other financial
datasets, it limited the accuracy; this can be enhanced over
time as more historical data accumulates alongside current
data.
B. Regression Analysis
1. Assumptions
To conduct Multiple Linear Regression Analysis, certain
assumptions [24] on the data need to be met. These
assumptions include:
1) Adequate Sample Size: According to Tabachnick and
Fidell, cited in Pallant [24], the appropriate sample size
formula is N > 50 + 8m, where m is the number of
independent variables. With eleven independent variables
the required minimum is 50 + 8 × 11 = 138; the sample size
for the project was 1669, well above this limit.
2) Outliers: Outliers were handled by imputing the mean
for the values flagged in R by boxplot.stats(column_name)$out,
using a function to which each column was passed as input;
a sketch follows.
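A minimal sketch of this imputation; whether the mean excludes the flagged outliers is an assumption:

# Replace boxplot outliers in a numeric column with the column mean.
impute_outliers <- function(x) {
  outs <- boxplot.stats(x)$out              # values flagged as outliers
  x[x %in% outs] <- mean(x[!(x %in% outs)]) # mean of non-outlying values (assumption)
  x
}

num_cols      <- sapply(btc, is.numeric)
btc[num_cols] <- lapply(btc[num_cols], impute_outliers)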
3) Multicollinearity and Singularity: The independent
and dependent variables demonstrated high correlations, as
mentioned before, violating the multicollinearity
assumption. Principal Component Analysis (PCA) was
performed to overcome this [25], deducing only those
components that explain the maximum variance among the
linear combinations of the independent variables. PCA was
conducted on the train and test datasets using the prcomp
function in R, with scaling set to true to normalize the data,
which is one of the requirements for performing PCA. The
figure below indicates the components that explain the
maximum proportion of variance for the input train set:
Figure 5: LSTM model [21]
Figure 6: Proportion of variance against each component
After the PCA on the train set was conducted, only those
components that explain the maximum variance (excluding
those tending towards 0, i.e. components 9 and 10) were
considered as inputs to create our test PCA dataset and as
inputs for our regression analysis. Thus a total of eight
normalized principal components for the test and train
datasets were used as inputs for the multiple linear
regression (MLR) algorithms in the sections that follow.
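A sketch of this step; train_x, test_x and the Close response vectors are placeholder names:

# PCA on the training predictors; scale. = TRUE normalizes each variable.
pca <- prcomp(train_x, scale. = TRUE)
summary(pca)  # proportion of variance per component (cf. Figure 6)

# Keep the first eight components; project the test set onto the same rotation.
train_pcs <- as.data.frame(pca$x[, 1:8])
test_pcs  <- as.data.frame(predict(pca, newdata = test_x)[, 1:8])

# Reattach the response for the regression models that follow.
train_pcs$Close <- train_y
test_pcs$Close  <- test_y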
4) Normality, Linearity, Homoscedasticity,
Independence of Residuals: Normality and independence of
the components after PCA were checked in the Normal Q-Q
plots of the residuals and in the residuals-vs-fitted-values
plots (Figure 7). The straight red line in Figure 8 indicates
that the assumption of homoscedasticity is not violated, and
the linear distribution of the residuals in Figure 7 also
indicates the linearity of the data. Having met all the
assumptions, the following sections explain how we
performed MLR. Before modelling, a custom control
function was created so that the caret package's train control
could specify how many times each model should run in
order to choose the best fit; a sketch follows.
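A sketch of the custom control; the exact resampling settings are assumptions, as the text only says the control chooses how many times each model runs:

library(caret)

# Repeated cross-validation governs how many times each model is fit
# before the best one is chosen (method and counts are assumptions).
custom <- trainControl(method = "repeatedcv", number = 10, repeats = 5)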
2. Linear Regression
If $y$ is the dependent or response variable, $x_j$ the
predictor or explanatory variables, $\beta_j$ the coefficients
and $\varepsilon$ the random error or noise, with $n$
observations and $p$ predictors, then the squared-error
objective minimized by linear regression is:
$\min_{\beta} \sum_{i=1}^{n} \big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \big)^2$   (1)
Based on (1), the caret train function's method was set to the
linear model. This trained model was then used as part of a
comparison with the regression models considered further; a
sketch follows.
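Under the same assumptions as the earlier sketches, the linear model fit is a single call:

# Plain linear model on the eight principal components, using the custom control.
lm_fit <- train(Close ~ ., data = train_pcs, method = "lm", trControl = custom)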
C. Ridge Regression
The least squares linear regression model can be modified
into a ridge regression model by applying a non-negative
penalty term $\lambda$ to the coefficients [27], modifying
the squared-error objective as follows:
$\min_{\beta} \sum_{i=1}^{n} \big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2$   (2)
Based on (2), and as we were using a custom control to find
the best penalty value for lambda, a sequence of lambda
values ranging from 0.0001 to 0.2 was run.
A linear regression model under ridge regression is trained
under the L2 regularization norm, which tends to shrink the
coefficients of correlated predictors towards one another,
permitting them to influence each other [27].
Figure 9 depicts the behaviour of the components under the
L2 penalty. The trend observed shows that the higher the
penalty applied, the further the coefficients of the
components deviate from one another; hence the model
chosen among the ten iterations of the custom control uses a
lambda of 0.0001.
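A sketch of the ridge fit via caret's glmnet method; the grid granularity beyond the stated 0.0001-0.2 range is an assumption:

# Ridge regression: alpha fixed at 0, lambda swept over the stated range.
ridge_fit <- train(Close ~ ., data = train_pcs, method = "glmnet",
                   trControl = custom,
                   tuneGrid  = expand.grid(alpha  = 0,
                                           lambda = seq(0.0001, 0.2, length = 10)))
plot(ridge_fit$finalModel, xvar = "lambda", label = TRUE)  # cf. Figure 9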
Figure 7: Normal Q-Q plots checked to confirm the assumptions of normality and linearity.
Figure 8: Residuals vs fitted values plot to confirm the assumptions of homoscedasticity and independence of residuals.
Figure 9: Coefficients variation on ridge regression's (L2 regularization) best model
Each of the selected eight principal components that
influence the prediction in the final model of the ridge
regression model are depicted in Figure 10. The eighth and
the first principal components seem to have the highest
influence in the ridge regression's final model, while the
seventh and the second have the least impact on the model's
predictive capability, though not completely zero.
D. Least Absolute Shrinkage and Selection Operator
(Lasso) Regression
In addition to the lambda cost penalty, LASSO regression
shrinks the sizes of the model's coefficients and retains only
those that have a significant impact on the prediction
outcome [26]. The squared-error objective is now modified
to:
$\min_{\beta} \sum_{i=1}^{n} \big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \big)^2 + \lambda \sum_{j=1}^{p} |\beta_j|$   (3)
Based on (3), we again use the custom control to find the
best penalty value for lambda, running the same sequence of
lambda values as before and setting the alpha value equal to
1; a sketch follows.
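The corresponding lasso sketch, differing from the ridge call only in alpha:

# Lasso regression: alpha fixed at 1, same lambda sequence as before.
lasso_fit <- train(Close ~ ., data = train_pcs, method = "glmnet",
                   trControl = custom,
                   tuneGrid  = expand.grid(alpha  = 1,
                                           lambda = seq(0.0001, 0.2, length = 10)))
plot(varImp(lasso_fit))  # variable importance, cf. Figure 12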
The L1 penalty norm is observed in lasso regression, under
which the model tends to favour one component/coefficient
over the rest when choosing its coefficients [27]. The lasso
penalty conforms to the Laplace prior, which anticipates
most coefficients being close to zero and only a few being
larger in magnitude [27]. As the L1 penalty is applied in the
lasso model, the coefficients respond as depicted in Figure
11. The component represented in black (PC1) is the only
component present at the lambda value of 0.0001, while all
other coefficients appear at later stages of the lambda path;
the final lasso model therefore chooses the optimal L1
penalty lambda as 0.0001.
The parameters that influence the lasso regression final
model's prediction are depicted in Figure 12, which still
confirms component eight as the major influencer, while
component seven has zero influence; this is possible in lasso
regression as it favours one component while completely
ignoring the rest [27].
E. Elastic Net Regression
The Elastic Net Regression combines the lasso and ridge
penalties [27], introducing alpha values as a sequence from
0 (ridge) to 1 (lasso). The squared-error objective is now
modified to include alpha:
$\min_{\beta} \sum_{i=1}^{n} \big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \big)^2 + \lambda \sum_{j=1}^{p} \big[ \tfrac{1-\alpha}{2} \beta_j^2 + \alpha |\beta_j| \big]$   (4)
Figure 13: Elastic net regression's best model
Figure 10: Variable importance for ridge model
Figure 11: Lasso regression's best model
Figure 12: Variable importance for lasso model
Based on (4), the custom control runs to find the best penalty
values for both lambda and alpha; a sketch follows.
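A sketch of the joint search; notably, the reported best alpha of 0.1111 (= 1/9) is exactly the second point of a ten-value grid from 0 to 1, so such a grid is assumed here:

# Elastic net: alpha swept from 0 (ridge) to 1 (lasso) together with lambda.
en_fit <- train(Close ~ ., data = train_pcs, method = "glmnet",
                trControl = custom,
                tuneGrid  = expand.grid(alpha  = seq(0, 1, length = 10),
                                        lambda = seq(0.0001, 0.2, length = 10)))
en_fit$bestTune  # reported best pair: alpha = 0.1111, lambda = 0.0001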
Figure 13 depicts the reaction of the predictor components
to the L1 and L2 penalty norms as the model varies alpha
from 0 to 1 and lambda from 0.0001 to 0.2. The final
outcome is depicted in Table 4. The penalty term is
extremely useful in cases where the number of predictors
exceeds the number of records in the dataset [27]. The
important variables for the elastic net regression's final
model lie, as expected, between those of lasso and ridge,
showing minor variations, with the eighth component still
having the maximum influence on the model.
Figure 14: Variable importance for elastic net model
IV. EVALUATION AND RESULTS
For the multiple linear regression task, the final prediction
model was chosen by comparing the above-mentioned
regression models on the Root Mean Square Error (RMSE)
and R-Squared values; the results are consolidated in
Table 4.
Table 4: Regression model comparison
Model                   Cost Penalties              RMSE      R-Squared
Lasso Regression        α = 1, λ = 0.0001           0.00873   0.998
Ridge Regression        α = 0, λ = 0.0001           0.02190   0.998
Elastic Net Regression  α = 0.1111, λ = 0.0001      0.00808   0.999
This indicates that Elastic Net regression outperforms the
other models with the lowest RMSE. The Elastic Net model
was therefore chosen to conduct the prediction on the test
data set generated from the PCA. The RMSE of this
prediction was observed as 6.73% with an R-squared value
of 99.99%; a sketch of this evaluation step follows.
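A sketch of this evaluation, assuming the objects defined in the earlier sketches:

# Evaluate the chosen elastic net model on the PCA test set.
preds <- predict(en_fit, newdata = test_pcs)
postResample(pred = preds, obs = test_pcs$Close)  # returns RMSE, R-squared, MAE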
In classification, ANN and LSTM were compared based on
accuracy, specificity and sensitivity; the results are shown in
Table 5.
Table 5: ANN vs LSTM
Model Sensitivity Specificity Accuracy
ANN 0.3351 0.286 55.11%
LSTM 0.98 0.038 54.35%
From Table 5 we can see that ANN performed slightly better
than LSTM. We can also see that the models found it hard
to learn from the data.
V. CONCLUSION AND FUTURE WORK
Coming up with the best ANN model can be time-
consuming. From the results, we know that the accuracies of
LSTM and the best ANN model are very close, with a
difference of 0.76%. Among the linear regression models,
elastic net outperformed the lasso and ridge models, having
the lowest RMSE and the highest R-squared value.
However, it was noticed that with more recent data there
were fluctuations in the RMSE and R-squared values; with
more recent data and price fluctuations, the performance of
the model may change.
Due to time constraints, LSTM could not be evaluated
alongside RNN. For further study in this area, LSTM
performance can be evaluated against RNN with much more
recent data and more parameters. Some of the parameters
not included in this study are minutes per transaction, the
number of unspent transactions and transaction fees; these
were excluded as they had low correlation with the other
attributes. By considering these parameters, multiple linear
regression could be performed for better prediction. One
limitation of this study is that we performed binary
classification: among the 1669 records, only 2 had 'no
change', and these were removed. With more recent data
collected over a longer period and more 'no change' records,
multiclass classification can be performed. Another
limitation was the keras package in R, which does not
support all versions of R as fully as its Python counterpart;
better results might have been obtained with full access to
the package in R.
REFERENCES
[1] H. Jang and J. Lee, 2018. An empirical study on modeling and
prediction of bitcoin prices with bayesian neural networks based on
blockchain information. IEEE Access, 6, pp.5427-5437.
[2] M.Nakano, A.Takahashi and S.Takahashi, 2018. Bitcoin technical
trading with artificial neural network.
[3] S. Velankar, S. Valecha and S. Maji, 2018, February. Bitcoin price
prediction using machine learning. In Advanced Communication
Technology (ICACT), 2018 20th International Conference on (pp.
144-147). IEEE.
[4] A. Radityo, Q. Munajat and I. Budi, 2017, October. Prediction of
Bitcoin exchange rate to American dollar using artificial neural
network methods. In Advanced Computer Science and Information
Systems (ICACSIS), 2017 International Conference on (pp. 433-438).
IEEE.
[5] S. McNally, J. Roche and S. Caton, 2018, March. Predicting the price
of Bitcoin using Machine Learning. In Parallel, Distributed and
Network-based Processing (PDP), 2018 26th Euromicro International
Conference on (pp. 339-343). IEEE.
[6] E.Sin and L.Wang, 2017, July. Bitcoin price prediction using
ensembles of neural networks. In 2017 13th International Conference
on Natural Computation, Fuzzy Systems and Knowledge Discovery
(ICNC-FSKD) (pp. 666-671). IEEE.
[7] Y.B. Kim, J.G. Kim, W. Kim, J.H. Im, T.H. Kim, S.J. Kang and
C.H. Kim, 2016. Predicting fluctuations in cryptocurrency transactions
based on user comments and replies. PloS one, 11(8), p.e0161197.
[8] R.C.Phillips and D.Gorse, 2017, November. Predicting
cryptocurrency price bubbles using social media data and epidemic
modelling. In Computational Intelligence (SSCI), 2017 IEEE
Symposium Series on (pp. 1-7). IEEE.
[9] S.Gullapalli, 2018. Learning to predict cryptocurrency price using
artificial neural network models of time series.
[10] L. Di Persio and O. Honchar. Multitask machine learning for financial
forecasting.
[11] A.A.Salisu, L.O. Akanni and R.O.Azeez, 2018. Could this be
affliction? Bitcoin forecasts most tradable currency pairs better than
ARFIMA.
[12] N.A.Bakar and S.Rosbi, 2017. Autoregressive Integrated Moving
Average (ARIMA) Model for Forecasting Cryptocurrency Exchange
Rate in High Volatility Environment: A New Insight of Bitcoin
Transaction. International Journal of Advanced Engineering Research
and Science, 4(11).
[13] J.Patel, S.Shah, P.Thakkar and K.Kotecha, 2015. Predicting stock and
stock price index movement using trend deterministic data
preparation and machine learning techniques. Expert Systems with
Applications, 42(1), pp.259-268.
[14] E.Chong, C.Han and F.C.Park, 2017. Deep learning networks for
stock market analysis and prediction: Methodology, data
representations, and case studies. Expert Systems with Applications,
83, pp.187-205.
[15] H.Y.Kim and C.H.Won, 2018. Forecasting the volatility of stock
price index: A hybrid model integrating LSTM with multiple
GARCH-type models. Expert Systems with Applications, 103, pp.25-
37.
[16] R.Hafezi, J.Shahrabi and E.Hadavandi, 2015. A bat-neural network
multi-agent system (BNNMAS) for stock price prediction: Case study
of DAX stock price. Applied Soft Computing, 29, pp.196-210.
[17] L.Lei, 2018. Wavelet neural network prediction method of stock price
trend based on rough set attribute reduction. Applied Soft Computing,
62, pp.923-932.
[18] I. Georgoula, D. Pournarakis, C. Bilanakos, D. Sotiropoulos and
G.M. Giaglis, 2015. Using time-series and sentiment analysis to detect
the determinants of bitcoin prices.
[19] A.Greaves, and B.Au, 2015. Using the bitcoin transaction graph to
predict the price of bitcoin.
[20] Y.Yoon and G.Swales, 1991, January. Predicting stock price
performance: A neural network approach. In System Sciences, 1991.
Proceedings of the Twenty-Fourth Annual Hawaii International
Conference on (Vol. 4, pp. 156-162). IEEE.
[21] T. Gao, Y. Chai and Y. Liu, 2017, November. Applying long short-term
memory neural networks for predicting stock closing price. In
Software Engineering and Service Science (ICSESS), 2017 8th IEEE
International Conference on (pp. 575-578). IEEE.
[22] I.Kaastra and M.Boyd, 1996. Designing a neural network for
forecasting financial and economic time series. Neurocomputing,
10(3), pp.215-236.
[23] J. Mahanta, 2017. Introduction to Neural Networks, Advantages and
Applications. [Online] Available at:
"https://towardsdatascience.com/introduction-to-neural-
networks-advantages-and-applications-96851bd1a207"
[Accessed 1 July 2018].
[24] J.Pallant, 2013. SPSS survival manual. McGraw-Hill Education
(UK).
[25] H. Zou, T. Hastie and R. Tibshirani, 2006. Sparse principal component
analysis. Journal of Computational and Graphical Statistics, 15(2),
pp.265-286.
[26] S.S.Roy, D.Mittal, A.Basu and A.Abraham, 2015. Stock market
forecasting using LASSO linear regression model. In Afro-European
Conference for Industrial Advancement (pp. 371-381). Springer,
Cham.
[27] J.Friedman, T.Hastie and R.Tibshirani, 2010. Regularization paths for
generalized linear models via coordinate descent. Journal of statistical
software, 33(1), p.1.