This document presents a project proposal for developing a sentiment analysis dashboard to predict stock trends on the Bursa Malaysia stock exchange using news articles. The project aims to benefit traders and companies by analyzing sentiment towards companies from media coverage. It will involve collecting news articles from Malaysian outlets, preprocessing the data, using natural language processing and sentiment analysis techniques to classify the text sentiment, and building a model to identify trends. The dashboard will analyze sentiment to give traders a clearer understanding of public perception of companies. Risks include high resource usage, inconsistent data availability from sources, and potential bias in some news sources.
Sentiment Analysis Dashboard for Bursa Malaysia stocks
1. Sentiment analysis dashboard for Bursa Malaysia stocks
26 September 2018
Muhammad Zahriel Bin Ismail (1151101702)
Supervisor: Dr. Nor’ain Mohd Yusoff
Moderator: Dr. Khor Kok Chin
2. Project Members
Member 1 (Sentiment Analysis): Muhammad Zahriel Bin Ismail (1151101702)
Member 2 (Technical Analysis): Zaidee, Yisau (1141127819)
3. Dr. Nor’ain Mohd Yusoff
Faculty of Computing and Informatics
http://mmuexpert.mmu.edu.my/norainyusoff
4. Table of contents
1) Introduction
2) Project Overview
3) Problem Statement
4) Objectives
5) Research Motivation
6) Project Scope
7) Justification for Project Scope
8) Literature Review
9) Proposed Solution
10) Design and Implementation
11) Research Highlights
12) Primary References
13) Prototype Demonstration
14) Question and Answer Session
5. Introduction
There are many indicators that brokers/traders utilize to assist in
predicting future stock trends.
Company activity is reported by news outlets on a daily basis, and by
social media users on a less regular schedule.
Sentiment analysis is the task of categorizing the opinion expressed in a
piece of text.
6. Project Overview
This project aims to create a sentiment analysis dashboard that will
predict stock trends by utilizing news outlet coverage.
This project aims to benefit traders/brokers as well as companies by helping
them understand the public's sentiment towards specific companies.
7. Problem Statement
Society's outlook on a company can be used to indicate whether the company
is seen positively or negatively by the community.
Traders/brokers lack a simple and easy way to view this kind of sentiment
and to identify the outlook on each company.
8. Objectives
- To develop a data scraper to collect data from different data sources
- To develop an algorithm that can derive sentiment from the data provided
by the scraper
- To correctly identify stock market trends with an acceptable accuracy
utilizing said data
9. Research Motivation
- To increase traders' awareness of the importance of public sentiment
towards specific companies
- To give traders a clearer picture of specific companies from different
points of view
- To give traders an easy way to identify the public's sentiment
10. Project scope
- The sentiment analysis aims to encompass news outlet websites,
utilizing news articles of companies that are within Bursa Malaysia’s
listing.
- This project will primarily use Malaysian currency (MYR).
11. Justification for project scope
- Target audience is Malaysian traders
- Reputable news outlets tend to provide less biased views of companies
12. Data Scraper
- Identify the CSS element of a news website that contains the wanted
text
- Structure the output data
- Store data
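The three scraper steps above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual scraper. The CSS class `article-body`, the sample HTML, the `example-outlet` and `EXAMPLE` labels, and all helper names are assumptions made for the example; a real news site would need its own class name and a step to download the page.

```python
import json
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the text inside every element carrying a given CSS class."""

    # Void elements never get a closing tag, so they must not affect depth.
    VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0      # greater than 0 while inside a matching element
        self.chunks = []    # text pieces of the element being read
        self.results = []   # finished text blocks

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or self.css_class in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            # Collapse runs of whitespace before saving the block.
            self.results.append(" ".join("".join(self.chunks).split()))
            self.chunks = []

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def scrape_article(html, source, company, css_class="article-body"):
    """Step 1 and 2: pick out the wanted text and structure it as a record."""
    parser = ClassTextExtractor(css_class)
    parser.feed(html)
    parser.close()
    return {"source": source, "company": company,
            "text": " ".join(parser.results)}


def store(records, path):
    """Step 3: store the records, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as fh:
        for rec in records:
            fh.write(json.dumps(rec, ensure_ascii=False) + "\n")


# Hypothetical page content, used only to exercise the sketch.
SAMPLE_HTML = """
<html><body>
  <div class="nav">Home | Business</div>
  <div class="article-body"><p>Profit rose <b>strongly</b> this quarter.</p></div>
</body></html>
"""

record = scrape_article(SAMPLE_HTML, source="example-outlet", company="EXAMPLE")
print(record["text"])
```

Calling `store([record], "articles.jsonl")` would write the record to disk. The navigation text is skipped because its element does not carry the chosen class, which is the point of identifying the CSS element first.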
13. Literature Review (Studies)
- Methods for preprocessing
- Methods for maintaining data semantics
- Methods for analysing data
17. Sentiment analysis method
Lexicon Based Approach
1) Each piece of text is tokenized
2) Classify the tokens
3) Compare the tokens against a sentiment lexicon
4) Produce positive/neutral/negative result
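The lexicon-based steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ten-word `LEXICON` and its +1/-1 scores are invented for the example (a real lexicon would hold thousands of scored words), and the function names are hypothetical.

```python
import re

# Tiny illustrative lexicon: +1 for positive words, -1 for negative words.
LEXICON = {
    "profit": 1, "growth": 1, "gain": 1, "strong": 1, "surge": 1,
    "loss": -1, "decline": -1, "weak": -1, "lawsuit": -1, "debt": -1,
}


def tokenize(text):
    """Split the text into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def token_polarities(tokens):
    """Look each token up in the lexicon; unknown words count as 0."""
    return [LEXICON.get(token, 0) for token in tokens]


def classify(text):
    """Sum the token polarities and map the total to a label."""
    total = sum(token_polarities(tokenize(text)))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


print(classify("Strong profit growth this quarter"))
print(classify("The company reported a loss and rising debt"))
print(classify("The board met on Tuesday"))
```

A plain word count like this ignores negation ("no profit") and word forms ("losses"), which is one reason the preprocessing methods in the literature review matter.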
26. Justification for Sources
- The Sun Daily: Popular news outlet, daily coverage of different
companies
- Bloomberg: International views of the companies, low bias rating
- Malay Mail: Local views of companies
28. Implementation (Model Planning)
- Testing different analytical models on data
- Fine-tuning existing models to suit the data and vice versa
- Identifying most appropriate model
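One possible way to compare candidate models is to score each on the same labelled articles and keep the most accurate. The sketch below assumes this approach; the two candidate models, the word lists, and the four labelled samples are toy data invented for the example, not results from the project.

```python
def always_neutral(text):
    """Baseline candidate: ignores the text entirely."""
    return "neutral"


POSITIVE_WORDS = {"profit", "growth", "surged", "strong"}
NEGATIVE_WORDS = {"fell", "losses", "decline"}


def keyword_model(text):
    """Candidate that counts positive and negative keywords."""
    tokens = text.lower().split()
    pos = sum(token in POSITIVE_WORDS for token in tokens)
    neg = sum(token in NEGATIVE_WORDS for token in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


def accuracy(model, samples):
    """Fraction of labelled samples the model classifies correctly."""
    correct = sum(1 for text, label in samples if model(text) == label)
    return correct / len(samples)


def pick_best(models, samples):
    """Score every candidate on the same samples and return the best name."""
    scores = {name: accuracy(fn, samples) for name, fn in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical labelled headlines, for illustration only.
SAMPLES = [
    ("Profit surged on strong demand", "positive"),
    ("Shares fell after heavy losses", "negative"),
    ("The annual meeting is next week", "neutral"),
    ("Revenue growth beat expectations", "positive"),
]

best, scores = pick_best(
    {"always_neutral": always_neutral, "keyword": keyword_model}, SAMPLES
)
print(best, scores)
```

In practice the labelled samples should be held out from any data used to tune the models, so that the comparison reflects performance on unseen articles.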
30. Implementation (Delivery)
- Communication of results/progress
- Preparing the program for delivery (Bug checking/fixes)
- Completion of design aspects of the project (Aesthetics)
33. Risk
Resource consumption: The program may use large amounts of resources, so
hardware limitations may become an issue.
Data availability: Certain data sources are inconsistent in providing articles
on specific companies.
Source bias: Certain news sources may be biased towards specific countries.
34. References
[1] Ahmed T. (2015), Text Classification and Sentiment Analysis
http://ataspinar.com/2015/11/16/text-classification-and-sentiment-analysis/
[2] Chowdhury, G. (2003), Natural language processing
https://strathprints.strath.ac.uk/2611/1/strathprints002611.pdf