This document discusses predicting stock prices over the short term using vector autoregressive (VAR) and vector autoregressive moving average (VARMA) models. It first describes acquiring stock price data from Kaggle and preprocessing it to suit these time series models. It then explains formulating the prediction problem in MATLAB and using tools to estimate VAR and VARMA model parameters. Results show that neither model reaches the Kaggle winning benchmark of 0.42 MAE, although VARMA(10,4) slightly outperforms VAR(4); a mean absolute error of 0.27 was obtained only at a shorter horizon, forecasting 2 PM values from data up to 1 PM. The document concludes that more work is needed to improve short-term prediction, such as incorporating the additional features.
3. Introduction
• Stock prices – a subject of curiosity for both the general public and expert investors
• They give a picture of the overall volatility of the market and the financial condition of a particular company
• They depend on the classical economic theory of supply and demand – there is no mathematical formula to calculate them!
• However, there are ways to predict them!
4. Introduction – Nature of Stock data
• Stock prices are affected by internal and external factors
• Internal factors – earnings statements, market share, etc.
• External factors – competitor performance, public financial indicators
• Unless significantly affected by these factors, stock prices tend to follow their historical trend, i.e. they are correlated with past data
• They are also correlated with other stock prices, which act as external factors
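5. Sample Auto-correlation plot of a stock on a particular day & Correlation between stocks
Stock data are correlated both with their own past values and with other stocks, which violates the independence assumption behind traditional Multiple Linear Regression; hence its use is not suitable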
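The two quantities named on this slide can be computed as in the following minimal sketch. The series values are made up for illustration and the function names are hypothetical; the slides do not show how the original plots were produced.

```python
from statistics import fmean


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer `lag`."""
    n = len(series)
    mean = fmean(series)
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom


def correlation(a, b):
    """Pearson correlation between two equally long series."""
    mean_a, mean_b = fmean(a), fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Made-up intraday percentage changes for two hypothetical stocks.
stock_a = [0.0, 0.1, 0.3, 0.4, 0.6, 0.5, 0.7, 0.9]
stock_b = [0.0, 0.2, 0.2, 0.5, 0.5, 0.6, 0.8, 0.8]

print("lag-1 autocorrelation of A:", autocorrelation(stock_a, 1))
print("correlation between A and B:", correlation(stock_a, stock_b))
```

Values far from zero in either quantity indicate the kind of dependence that motivates a multivariate time series model over plain regression.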
6. Data Acquisition and Pre-processing
• Source – Kaggle Repository for the competition “Predict Short Term Movements in Stock Prices Using News and Sentiment Data Provided by RavenPack”
• Given: the percentage change in stock price relative to the previous day's closing price for 198 anonymous stocks, measured at 5-minute intervals from 9 AM to 2 PM over 510 days
• Also given: the values of 244 anonymous features at the same intervals for the 510 days
• Task: predict the % change in prices 25 intervals ahead, i.e. at 4 PM
• Training set: closing data at 4 PM given for 200 days
7. Data Acquisition and Pre-processing
• Pre-processing
• Zeroing out the 9 AM value for each day
• Checking for data entry errors
• Replacing missing values with zeros
• Testing for stationarity and differencing the time series – VAR models assume stationary time series
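The pre-processing steps listed above can be sketched as follows. This is an illustration under stated assumptions, not the authors' MATLAB code: "zeroing out the 9 AM value" is read here as setting the day's first observation to zero, missing values are assumed to arrive as `None`, and the function names are hypothetical. The stationarity test itself is omitted.

```python
def preprocess_day(values):
    """Clean one day's series of percentage changes.

    Assumptions: `values[0]` is the 9 AM observation and missing
    entries are represented by None.
    """
    filled = [0.0 if v is None else float(v) for v in values]  # missing -> 0
    filled[0] = 0.0  # zero out the 9 AM value
    return filled


def first_difference(series):
    """First-order differencing: d[t] = x[t] - x[t-1]."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


# Made-up raw values for one stock on one day.
raw = [0.5, None, 0.2, 0.4]
clean = preprocess_day(raw)
print(clean)
print(first_difference(clean))
```

Differencing shortens the series by one observation, so forecasts of the differenced series have to be summed back up to recover the percentage change.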
8. Data Acquisition and Pre-processing cont.
Figure: stock data for a single day, before and after first-order differencing
12. Evaluation of model performance
• Kaggle’s criterion: the best model is the one with the lowest Mean Absolute Error (MAE) score. Mean Absolute Error is calculated as:
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
• where $y_i$ is the true value of the percentage change in stock price, $\hat{y}_i$ is the predicted value of the percentage change in stock price at the end of the day (4 PM), and $n$ is the number of predictions
• Forecasts were made for the initial 200 days only, because closing price changes at 4 PM are available only for those days
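The MAE definition above translates directly into code. A minimal sketch, with a hypothetical function name and made-up values:

```python
def mean_absolute_error(y_true, y_pred):
    """Mean of |y_i - yhat_i| over all predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up true and predicted percentage changes at 4 PM for three stocks.
true_changes = [0.50, -0.20, 1.10]
predicted_changes = [0.30, 0.10, 1.00]
print(mean_absolute_error(true_changes, predicted_changes))
```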
13. What does MAE (Mean Absolute Error) signify?
• $100 worth of stock yesterday jumped to $105 this morning at 10 AM, but the model predicted $98
• Actual jump of 5%
• The model predicted a decline of 2%
MAE = |5 – (–2)| % = 7 %
• % prediction error = |98 – 105| / 105 ≈ 6.67 %
• Hence, the forecast price is off by about 6.67% of the true value
• Good or bad? Depends!
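The arithmetic of this example can be checked with a few lines. The variable names are hypothetical, and the relative price error is taken against the true price of $105, which is an assumption about what the slide intends.

```python
prev_close = 100.0       # yesterday's value in dollars
actual_price = 105.0     # true price at 10 AM
predicted_price = 98.0   # model's predicted price


def pct_change(price, base):
    """Percentage change of `price` relative to `base`."""
    return (price - base) / base * 100.0


actual_change = pct_change(actual_price, prev_close)        # about +5 %
predicted_change = pct_change(predicted_price, prev_close)  # about -2 %

# Absolute error in percentage-change units (the quantity MAE averages).
abs_error = abs(actual_change - predicted_change)

# Error of the predicted price relative to the true price.
relative_price_error = abs(predicted_price - actual_price) / actual_price * 100.0

print(round(abs_error, 2), round(relative_price_error, 2))
```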
14. Results and Conclusion
• Best results: the VARMA(10,4) model performs slightly better than the VAR(4) model, with a lower Mean Absolute Error (MAE)
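For readers unfamiliar with the notation, a VAR(p) model expresses each stock's current value as a linear combination of the previous p values of all stocks. The sketch below fits such a model by ordinary least squares with NumPy and produces multi-step forecasts. It is an illustration on simulated data, not the MATLAB tooling the authors used. The function names are hypothetical, and the moving-average part of a VARMA model is not covered.

```python
import numpy as np


def fit_var(data, p):
    """OLS fit of y_t = c + A_1 y_{t-1} + ... + A_p y_{t-p} + e_t.

    data: array of shape (T, k). Returns (c, A) with c of shape (k,)
    and A of shape (p, k, k), where A[l] multiplies y_{t-l-1}.
    """
    T, k = data.shape
    lagged = [data[p - lag:T - lag] for lag in range(1, p + 1)]
    X = np.hstack([np.ones((T - p, 1))] + lagged)  # rows: [1, y_{t-1}, ..., y_{t-p}]
    Y = data[p:]
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)      # shape (1 + k*p, k)
    c = B[0]
    A = B[1:].reshape(p, k, k).transpose(0, 2, 1)
    return c, A


def forecast_var(history, c, A, steps):
    """Iterate the fitted model forward `steps` periods."""
    p = A.shape[0]
    window = list(history[-p:])                    # oldest first
    preds = []
    for _ in range(steps):
        nxt = c + sum(A[l] @ window[-1 - l] for l in range(p))
        preds.append(nxt)
        window.append(nxt)                         # feed forecast back in
    return np.array(preds)


# Simulate a small 2-variable VAR(1) process with made-up coefficients.
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.1], [0.0, 0.3]])
data = np.zeros((2000, 2))
for t in range(1, len(data)):
    data[t] = A_true @ data[t - 1] + rng.normal(size=2)

c_hat, A_hat = fit_var(data, p=1)
print(A_hat[0])
print(forecast_var(data, c_hat, A_hat, steps=3))
```

Because each forecast is fed back in as an input for the next step, errors compound as the horizon grows.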
16. Conclusion / Future work
• Neither model could reach the Kaggle winning benchmark of 0.42 MAE
• MAE was found to keep increasing as the forecasting horizon is extended
• Forecasting at 2 PM with the data up to 1 PM as the training set gave an MAE of 0.27
• VAR and VARMA models are more suitable for very short-term forecasting
• Use of the features in future models may improve performance