Name of the students: 1) Rohit Jain (10103473)
2) Neeraj Chaudhary (10103525)
Name of the supervisor: Mr. Vivek Mishra.
Importance of project
Today the stock market is one of the major markets in which people invest
to earn money. The stock market reflects the variation of the market
economy, and has attracted the attention of ten million investors since
it opened. The stock market is characterized by high risk and high yield,
so investors are concerned with analyzing the stock market and try to
forecast its trend. However, the stock market is affected by politics,
the economy and many other factors, coupled with the complexity of its
internal laws, such as non-linear price changes and share data with high
noise. Therefore, traditional mathematical and statistical techniques for
forecasting the stock market have not yielded suitable results.
Hence, we are going to analyze a stock using different algorithms,
designed with tools and techniques that include Hadoop and MapReduce.
INTRODUCTION
Analysis of data is a process of inspecting, cleaning, transforming, and
modeling data with the goal of discovering useful information, suggesting
conclusions, and supporting decision making. Data analysis has multiple
facets and approaches, encompassing diverse techniques under a variety
of names, in different business, science, and social science domains.
We are going to analyse the data of a stock to find different types of
trends in this stock using Hadoop and MapReduce.
Hadoop is an open source framework for writing and running distributed
applications that process large amounts of data. Distributed computing is
a wide and varied field, and Hadoop has several key distinctions within it.
MapReduce is a data processing model. Its greatest advantage is the
easy scaling of data processing over multiple computing nodes. Under
the MapReduce model, the data processing primitives are called
mappers and reducers. Decomposing a data processing application
into mappers and reducers is sometimes nontrivial. But, once you
write an application in the MapReduce form, scaling the application
to run over hundreds, thousands, or even tens of thousands of
machines in a cluster is merely a configuration change. This simple
scalability is what has attracted many programmers to the MapReduce
model.
Technical and graphical indicators used
We are analyzing the data of a particular stock for the past several years
with different types of algorithms. We need to find the number of days on
which the same percentage change occurred across the whole data set.
For example, given a particular stock, we’d like to know how often in the
past several years it has changed by 1%, 2%, 3% etc. (somewhat like a
Fourier transform, which transforms temporal-domain data into the
frequency domain).
Further, we will be using different technical indicators, for analysis
purposes only, which will include:
•Simple Moving Average (SMA)
•Exponential Moving Average (EMA)
•On Balance Volume (OBV)
TECHNICAL INDICATORS USED
These methods are used for analysis. Using any one of the following
features, users can see a graph for the company by entering a period as
input.
Simple Moving Average (SMA)
1) SMA is the most basic moving average used for trading.
2) It is based on the closing price.
Exponential Moving Average (EMA)
1) It tries to reduce lag by applying more weight to recent prices.
2) EMA(Current) = ((Price(Current) - EMA(Prev)) * Multiplier) + EMA(Prev)
Multiplier = 2 / (Time period + 1)
Overall description of the project
Our project aims at analyzing the data of a particular stock using Hadoop
MapReduce.
We propose some algorithms to analyse the stock data. Initially we find
the frequency of the stock's changes, using an Excel sheet of the stock's
data as input. We use MapReduce functions to perform this operation
so that the data can be analysed.
Then we use another algorithm to forecast the trend of the stock using a
technical indicator, the exponential moving average (EMA). After this we
use a graphical stock trend indicator to understand the trend of the
stock.
The whole project runs on Hadoop MapReduce.
Functional and Non-Functional Requirements
Having worked out the procedure to implement our algorithm over the
course of the project, we identified some requirements that are needed
for the project to function properly.
A functional requirement describes what a software system should do,
while non-functional requirements place constraints on how the system
will do so.
•Functional Requirements:
•Hadoop should handle the input stock data.
•The mapper must have a key for mapping the data.
•The reducer must combine the data into an output.
•Non-Functional Requirements:
•Scalability: The application must work for large data sets. It should
not fail under this condition.
•Reliability: The application must be reliable in every respect for the
user who is analyzing the data.
•Efficiency: Specifies how well the software utilizes scarce resources.
Component description and dependency details
An Excel file of a particular stock is used as the input for the project.
We have used the Excel file of BP stock data from Yahoo Finance.
•Software Requirements
•Oracle (Sun) Java 6: Oracle (Sun) Java 6 is the reference
implementation for Java 6.
•Hadoop: Hadoop Map/Reduce is a software framework for easily
writing applications which process vast amounts of data (multi-terabyte
data-sets) in-parallel on large clusters (thousands of nodes) of
commodity hardware in a reliable, fault-tolerant manner.
•Hardware Requirements
•PC with a 1.6 GHz or faster processor
•3 GB RAM or more
•Operating System: Ubuntu
Overall Architecture
We take an Excel file as input, let the map function perform a task on
it, and then reduce the result to get the output.
Proposed Algorithm
Algorithm based on percentage change of stock:
It is an algorithm to compute the frequency of stock market changes.
For example, given a particular stock, we’d like to know how often in the
past several years it has changed by 1%, 2%, 3% etc. (somewhat like a
Fourier transform, which transforms temporal-domain data into the
frequency domain).
Yahoo Finance provides the BP stock data as an Excel sheet for the
analysis.
Map Function:
Primarily we are writing a stream processor here that atomically
performs what needs to happen on one line of data. That’s perfect for us:
we simply take the opening price and the closing price, calculate the
percent change and emit it.
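// Body of Mapper.map(); needs java.text.DecimalFormat.
// "word" (Text) and "one" (IntWritable holding 1) are assumed mapper fields.
// Input columns: Date,Open,High,Low,Close,Volume,Adj Close
String[] tokens = value.toString().split(",");
float open = Float.parseFloat(tokens[1]);      // opening price
float close = Float.parseFloat(tokens[4]);     // closing price
float change = ((close - open) / open) * 100;  // percent change for the day
// The key is the change rounded to at most two decimals, e.g. "1.2%".
word.set(new DecimalFormat("0.##").format((double) change) + "%");
context.write(word, one);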
We will get a stream of (name, value) pairs with the name being the
percentage change for the day and the value being the integer ‘1’. This
function can be distributed over X number of machines, each one
performing its streaming function in parallel and independent of the
others.
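The per-line work of the map step can also be tried without a cluster. The sketch below is a plain-Java illustration, not the project's mapper: the class name ChangeKey and the method keyFor are hypothetical, and it assumes Java 8 or later (for Optional) and the Yahoo column order shown above. It also skips the header row and malformed rows, which the slide's snippet does not handle.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper: turns one CSV row into the key a mapper would emit. */
final class ChangeKey {

    /**
     * Returns the day's percentage change formatted like "1.2%", or an empty
     * Optional for rows that cannot be used (header row, too few columns,
     * or an opening price of zero).
     */
    static Optional<String> keyFor(String line) {
        // Assumed columns: Date,Open,High,Low,Close,Volume,Adj Close
        String[] tokens = line.split(",");
        if (tokens.length < 5) {
            return Optional.empty();
        }
        try {
            double open = Double.parseDouble(tokens[1].trim());
            double close = Double.parseDouble(tokens[4].trim());
            if (open == 0.0) {
                return Optional.empty(); // avoid dividing by zero
            }
            double change = (close - open) / open * 100.0;
            // "0.##" keeps at most two decimals; Locale.ROOT fixes '.' as separator.
            DecimalFormat format =
                new DecimalFormat("0.##", DecimalFormatSymbols.getInstance(Locale.ROOT));
            return Optional.of(format.format(change) + "%");
        } catch (NumberFormatException e) {
            return Optional.empty(); // e.g. the "Date,Open,..." header row
        }
    }

    public static void main(String[] args) {
        System.out.println(keyFor("Date,Open,High,Low,Close,Volume,Adj Close").isPresent());
        System.out.println(keyFor("2014-01-02,10.0,11.0,9.5,10.5,1000,10.5").orElse("skipped"));
    }
}
```

Inside a real mapper, an empty result would simply mean that nothing is written to the context for that line.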
Reduce Function:
This function is going to take the (name, value) outputs from all the
mappers and process that data accordingly (often ‘reducing’ it). In our
case we are simply going to count the number of times a particular
percentage change happens. In essence we are going to change this:
1.2% 1
1.3% 1
1.2% 1
Into
1.2% 2
1.3% 1
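// Body of Reducer.reduce(): add up the 1s emitted for this key.
int sum = 0;
for (IntWritable val : values) {
    sum += val.get();
}
// Emit (percentage change, number of days on which it occurred).
context.write(key, new IntWritable(sum));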
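To see the grouping and counting in isolation, the following plain-Java sketch imitates what happens between the mappers and the final output. It is only a local simulation under our own assumptions: the class LocalReduce and its methods are hypothetical names, the grouping step stands in for the shuffle that Hadoop performs itself, and Java 8 or later is assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for the shuffle and reduce steps. */
final class LocalReduce {

    /** Shuffle: group the mappers' (key, 1) pairs by key. */
    static Map<String, List<Integer>> shuffle(List<String> mapperKeys) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String key : mapperKeys) {
            grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    /** Reduce: sum the values of one key, the same loop as in the reducer. */
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    /** Runs shuffle and reduce over all keys and returns key -> frequency. */
    static Map<String, Integer> frequencies(List<String> mapperKeys) {
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(mapperKeys).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // The three mapper outputs from the example above.
        System.out.println(frequencies(Arrays.asList("1.2%", "1.3%", "1.2%")));
        // prints {1.2%=2, 1.3%=1}
    }
}
```

A TreeMap is used here only so that the printed order is stable.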
Technical indicators algorithm:
These methods are used for analysis. Using any one of the following
features, users can see a graph for the company by entering a period as
input.
A. Simple Moving Average (SMA)
1) SMA is the most basic moving average used for trading.
2) It is based on the closing price.
Ex. Daily closing prices: 11, 12, 13, 14, 15, 16, 17
To find the 5-day moving average:
1st day: (11+12+13+14+15)/5 = 13
2nd day: (12+13+14+15+16)/5 = 14
3rd day: (13+14+15+16+17)/5 = 15, and so on.
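The worked SMA example can be reproduced with a few lines of Java. This is a minimal sketch rather than the project's code: the class and method names are hypothetical, and it keeps a running window sum so that each closing price is added and removed once.

```java
import java.util.Arrays;

/** Hypothetical helper for the simple moving average of closing prices. */
final class SimpleMovingAverage {

    /**
     * Returns one average per complete window of "period" closing prices,
     * so the result has closes.length - period + 1 entries.
     */
    static double[] sma(double[] closes, int period) {
        if (period <= 0 || period > closes.length) {
            return new double[0]; // no complete window exists
        }
        double[] averages = new double[closes.length - period + 1];
        double windowSum = 0.0;
        for (int i = 0; i < closes.length; i++) {
            windowSum += closes[i];               // newest price enters the window
            if (i >= period) {
                windowSum -= closes[i - period];  // oldest price leaves the window
            }
            if (i >= period - 1) {
                averages[i - period + 1] = windowSum / period;
            }
        }
        return averages;
    }

    public static void main(String[] args) {
        double[] closes = {11, 12, 13, 14, 15, 16, 17};
        System.out.println(Arrays.toString(sma(closes, 5)));
        // prints [13.0, 14.0, 15.0]
    }
}
```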
B. Exponential Moving Average (EMA)
1) It tries to reduce lag by applying more weight to recent prices.
2) EMA(Current) = ((Price(Current) - EMA(Prev)) * Multiplier) + EMA(Prev)
Multiplier = 2 / (Time period + 1)
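The EMA recurrence translates directly into a loop. The sketch below uses hypothetical names and makes one assumption that the slides leave open: the very first EMA value is seeded with the first closing price (another common choice is the SMA of the first period).

```java
import java.util.Arrays;

/** Hypothetical helper for the exponential moving average of closing prices. */
final class ExponentialMovingAverage {

    /** Returns one EMA value per price, using Multiplier = 2 / (period + 1). */
    static double[] ema(double[] prices, int period) {
        if (prices.length == 0 || period <= 0) {
            return new double[0];
        }
        double multiplier = 2.0 / (period + 1);
        double[] result = new double[prices.length];
        result[0] = prices[0]; // assumption: seed EMA(Prev) with the first price
        for (int i = 1; i < prices.length; i++) {
            // EMA(Current) = ((Price(Current) - EMA(Prev)) * Multiplier) + EMA(Prev)
            result[i] = (prices[i] - result[i - 1]) * multiplier + result[i - 1];
        }
        return result;
    }

    public static void main(String[] args) {
        // With period 3 the multiplier is 2 / (3 + 1) = 0.5.
        System.out.println(Arrays.toString(ema(new double[] {10, 12, 16}, 3)));
        // prints [10.0, 11.0, 13.5]
    }
}
```

With a multiplier of 0.5 each new price moves the average halfway toward itself: 10 to 11 after the price 12, then 11 to 13.5 after the price 16.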
Conclusion
Investing in stocks is a common side business for companies and
individuals, offering compound interest, the time value of money, tax
benefits and diversification. Investing in a good, rising stock is
therefore necessary to get the desired profit. A stock change indicator is
very helpful to the user in selecting a good stock. Hadoop is open source
software which can handle huge amounts of data quite easily. Hadoop has
modules such as MapReduce, HDFS, Hadoop Common and Hadoop YARN. MapReduce
is a programming model. Here the map function calculates the percentage
change of the stock, assigns that change as the key and gives each key a
value of 1. The reduce function reads each key and the set of values
associated with it, then calculates the sum of those values and gives the
key and that sum (the frequency) as the final output. The EMA algorithm
mainly focuses on recent price values. By analyzing these values the user
can choose a less risky stock. By drawing a graph of the EMA of closing
prices the user can understand the trend of the stock, and so can invest
in a less risky or more risky stock (according to their choice) with an
upward trend.
Future work
We have planned the following things to do in the future.
•We want to add some more technical indicators (like back propagation
neural networks) to this program so that a person can compare the result
of each indicator. The user will have the freedom to give importance to a
particular condition (indicator).
•We also want to add some graphical indicators (like OBVP) to this
project so that the user gets graphical knowledge along with statistical
knowledge, and can better understand the trend of the stock.
•We want to link this project to a website so that more people can
benefit from this project.
References
I. Apache Software Foundation, Official Apache Hadoop website,
http://hadoop.apache.org
II. The Hadoop Architecture and Design,
Available: http://hadoop.apache.org/common/docs/r0.16.4/hdfs_design.html
III. Aditya B. Patel, Manashvi Birla, Ushma Nair, Addressing Big Data
Problem Using Hadoop and Map Reduce, Nirma University International
Conference on Engineering, NUiCONE-2012, 06-08 December 2012.
IV. Jeffrey Dean and Sanjay Ghemawat, MapReduce: Simplified Data
Processing on Large Clusters, OSDI 2004.
V. Kushagra Sahu, Revati Pawar, Sonali Tilekar, Reshma Satpute, Stock
Exchange Forecasting Using Hadoop Map-Reduce Technique, International
Journal of Advancements in Research & Technology, Volume 2, Issue 4,
April 2013.
VI. Chuck Lam, Hadoop in Action.
VII. Jason Venner, Pro Hadoop: Build Scalable, Distributed Applications
in the Cloud.
VIII. Michael G. Noll, tutorials (Applied Research. Big Data. Distributed
Systems.), website: http://www.michael-noll.com

 
Best Persuasive selling skills presentation.pptx
Best Persuasive selling skills  presentation.pptxBest Persuasive selling skills  presentation.pptx
Best Persuasive selling skills presentation.pptx
 
2024 SEO Trends for Business Success (WSA)
2024 SEO Trends for Business Success (WSA)2024 SEO Trends for Business Success (WSA)
2024 SEO Trends for Business Success (WSA)
 
Local SEO Domination: Put your business at the forefront of local searches!
Local SEO Domination:  Put your business at the forefront of local searches!Local SEO Domination:  Put your business at the forefront of local searches!
Local SEO Domination: Put your business at the forefront of local searches!
 
VIP 7001035870 Find & Meet Hyderabad Call Girls Film Nagar high-profile Call ...
VIP 7001035870 Find & Meet Hyderabad Call Girls Film Nagar high-profile Call ...VIP 7001035870 Find & Meet Hyderabad Call Girls Film Nagar high-profile Call ...
VIP 7001035870 Find & Meet Hyderabad Call Girls Film Nagar high-profile Call ...
 
Moving beyond multi-touch attribution - DigiMarCon CanWest 2024
Moving beyond multi-touch attribution - DigiMarCon CanWest 2024Moving beyond multi-touch attribution - DigiMarCon CanWest 2024
Moving beyond multi-touch attribution - DigiMarCon CanWest 2024
 
Do More with Less: Navigating Customer Acquisition Challenges for Today's Ent...
Do More with Less: Navigating Customer Acquisition Challenges for Today's Ent...Do More with Less: Navigating Customer Acquisition Challenges for Today's Ent...
Do More with Less: Navigating Customer Acquisition Challenges for Today's Ent...
 
Cost-effective tactics for navigating CPC surges
Cost-effective tactics for navigating CPC surgesCost-effective tactics for navigating CPC surges
Cost-effective tactics for navigating CPC surges
 
marketing strategy of tanishq word PPROJECT.pdf
marketing strategy of tanishq word PPROJECT.pdfmarketing strategy of tanishq word PPROJECT.pdf
marketing strategy of tanishq word PPROJECT.pdf
 
Social Samosa Guidebook for SAMMIES 2024.pdf
Social Samosa Guidebook for SAMMIES 2024.pdfSocial Samosa Guidebook for SAMMIES 2024.pdf
Social Samosa Guidebook for SAMMIES 2024.pdf
 
pptx.marketing strategy of tanishq. pptx
pptx.marketing strategy of tanishq. pptxpptx.marketing strategy of tanishq. pptx
pptx.marketing strategy of tanishq. pptx
 
Branding strategies of new company .pptx
Branding strategies of new company .pptxBranding strategies of new company .pptx
Branding strategies of new company .pptx
 
Brighton SEO April 2024 - The Good, the Bad & the Ugly of SEO Success
Brighton SEO April 2024 - The Good, the Bad & the Ugly of SEO SuccessBrighton SEO April 2024 - The Good, the Bad & the Ugly of SEO Success
Brighton SEO April 2024 - The Good, the Bad & the Ugly of SEO Success
 
TAM AdEx 2023 Cross Media Advertising Recap - Auto Sector
TAM AdEx 2023 Cross Media Advertising Recap - Auto SectorTAM AdEx 2023 Cross Media Advertising Recap - Auto Sector
TAM AdEx 2023 Cross Media Advertising Recap - Auto Sector
 
BrightonSEO - Addressing SEO & CX - CMDL - Apr 24 .pptx
BrightonSEO -  Addressing SEO & CX - CMDL - Apr 24 .pptxBrightonSEO -  Addressing SEO & CX - CMDL - Apr 24 .pptx
BrightonSEO - Addressing SEO & CX - CMDL - Apr 24 .pptx
 
BLOOM_April2024. Balmer Lawrie Online Monthly Bulletin
BLOOM_April2024. Balmer Lawrie Online Monthly BulletinBLOOM_April2024. Balmer Lawrie Online Monthly Bulletin
BLOOM_April2024. Balmer Lawrie Online Monthly Bulletin
 

Stock Market Trend Analysis Using Hadoop MapReduce

  • 1. Name of the students: 1) Rohit Jain (10103473) 2) Neeraj Chaudhary (10103525) Name of the supervisor: Mr. Vivek Mishra.
  • 2. Importance of project: Today the stock market is one of the major markets in which to invest and earn money. The stock market reflects the variation of the market economy and has held the attention of ten million investors since it opened. The stock market is characterized by high risk and high yield, so investors are concerned with analyzing the market and trying to forecast its trend. However, the stock market is affected by politics, the economy, and many other factors, coupled with the complexity of its internal laws, such as non-linear price changes and share data with high noise. As a result, traditional mathematical and statistical techniques for forecasting the stock market have not yielded suitable results. Hence, we analyze the stock with algorithms built on tools and techniques that include Hadoop and MapReduce.
  • 3. INTRODUCTION: Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains. We are going to analyse the data of a stock to find different types of trends in this stock using Hadoop and MapReduce. Hadoop is an open source framework for writing and running distributed applications that process large amounts of data. Distributed computing is a wide and varied field, but the key distinctions of Hadoop are that it is
  • 4. MapReduce is a data processing model. Its greatest advantage is the easy scaling of data processing over multiple computing nodes. Under the MapReduce model, the data processing primitives are called mappers and reducers. Decomposing a data processing application into mappers and reducers is sometimes nontrivial. But once you write an application in the MapReduce form, scaling it to run over hundreds, thousands, or even tens of thousands of machines in a cluster is merely a configuration change. This simple scalability is what has attracted many programmers to the MapReduce model.
  • 5. Technical and graphical indicators used: We analyze the data of a particular stock over the past several years using different types of algorithms. We need to find the number of days on which the same percentage change occurred across the whole data set. For example, given a particular stock, we would like to know how often in the past several years it has changed by 1%, 2%, 3%, etc. (loosely analogous to a Fourier transform, which transforms temporal-domain data into the frequency domain). Further, we will be using different technical indicators, for analysis purposes only, which include: • Simple Moving Average (SMA) • Exponential Moving Average (EMA) • On Balance Volume (OBV)
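On Balance Volume is listed among the indicators but is not defined in these slides. As a minimal sketch, assuming the commonly used definition (add the day's volume when the close rises, subtract it when the close falls, and leave the total unchanged otherwise), it could be computed in plain Java as follows. The class name, method name, and sample data are illustrative assumptions, not part of the project.

```java
import java.util.Arrays;

// Illustrative sketch only: the class and method names are hypothetical.
class OnBalanceVolume {

    // Returns the running OBV series, starting from 0 on the first day.
    static long[] compute(double[] close, long[] volume) {
        if (close.length != volume.length) {
            throw new IllegalArgumentException("close and volume must have the same length");
        }
        long[] obv = new long[close.length];
        for (int i = 1; i < close.length; i++) {
            if (close[i] > close[i - 1]) {
                obv[i] = obv[i - 1] + volume[i];   // price rose: add volume
            } else if (close[i] < close[i - 1]) {
                obv[i] = obv[i - 1] - volume[i];   // price fell: subtract volume
            } else {
                obv[i] = obv[i - 1];               // unchanged close: carry forward
            }
        }
        return obv;
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        double[] close = {10.0, 10.5, 10.2, 10.2, 10.8};
        long[] volume = {1000, 1500, 1200, 900, 2000};
        System.out.println(Arrays.toString(compute(close, volume)));
    }
}
```

A rising OBV alongside a rising price is usually read as volume confirming the trend.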
  • 6. TECHNICAL INDICATORS USED: These indicators are used for analysis. By choosing one of the following features and entering a period as input, users can see a graph for that company. Simple Moving Average (SMA): 1) SMA is the most basic moving average used for trading. 2) It is based on the closing price. Exponential Moving Average (EMA): 1) It tries to reduce lag by applying more weight to recent prices. 2) EMA(current) = ((Price(current) - EMA(previous)) * Multiplier) + EMA(previous), where Multiplier = 2 / (Time period + 1).
  • 7. Overall description of the project: Our project aims at analyzing the data of a particular stock using Hadoop MapReduce. We propose some algorithms to analyse the stock's data. First, we find the frequency of the stock's changes, using an Excel sheet of the stock's data as input. We use MapReduce functions to perform this operation so that the data can be analysed. Then we use another algorithm to forecast the trend of the stock using a technical indicator, the exponential moving average (EMA). After this, we use a graphical stock trend indicator to understand the trend of the stock. The whole project runs on Hadoop MapReduce.
  • 8. Functional requirements and non-functional requirements: Having worked out the procedure for implementing our algorithm, we identified some requirements that are needed for the project to function properly. A functional requirement describes what a software system should do, while non-functional requirements place constraints on how the system will do so. • Functional requirements: • Hadoop should handle the input data of the stock. • The mapper must have a key for mapping the data. • The reducer must integrate the data as an output. • Non-functional requirements: • Scalability: The application must work for large data. It should not fail in this condition. • Reliability: The application must be reliable in every aspect for the user who is using it to analyze the data. • Efficiency: Specifies how well the software utilizes scarce resources:
  • 9. Component description and dependency details: An Excel file of a particular stock is used as input for the project. We have used the Excel file of BP stock from Yahoo Finance. • Software requirements • Oracle (Sun) Java 6: Oracle (Sun) Java 6 is the reference implementation for Java 6. • Hadoop: Hadoop Map/Reduce is a software framework for easily writing applications which process vast amounts of data (multi-terabyte data sets) in parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner. • Hardware requirements • PC, 1.6 GHz or higher • 3 GB RAM or higher • Operating system: Ubuntu
  • 10. Overall Architecture: We take an Excel file as input, let the map function perform its task on it, and then reduce the result to get the output.
  • 11. Proposed Algorithm: Algorithm based on the percentage change of a stock: this is an algorithm to compute the frequency of stock market changes. For example, given a particular stock, we would like to know how often in the past several years it has changed by 1%, 2%, 3%, etc. (loosely analogous to a Fourier transform, which transforms temporal-domain data into the frequency domain). Yahoo Finance provides the BP stock data as an Excel sheet for the analysis.
  • 12. Map Function: Here we are essentially writing a stream processor that atomically performs what needs to happen on one line of data. That suits this problem well: we simply take the opening price and the closing price, calculate the percent change, and emit it.
      // Date,Open,High,Low,Close,Volume,Adj Close
      String[] tokens = value.toString().split(",");
      float open = Float.parseFloat(tokens[1]);
      float close = Float.parseFloat(tokens[4]);
      float change = ((close - open) / open) * 100;
      word.set(new DecimalFormat("0.##").format((double) change) + "%");
      context.write(word, one);
    We get a stream of (name, value) pairs, with the name being the percentage change for the day and the value being the integer 1. This function can be distributed over X machines, each one performing its streaming function in parallel and independently of the others.
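The map step can be exercised without a Hadoop cluster. The following is a minimal plain-Java sketch of the same per-line logic, assuming the CSV column order Date,Open,High,Low,Close,Volume,Adj Close given in the slide. The class name, method name, and sample line are hypothetical, and the Hadoop `Text` and `Context` types are deliberately replaced by a simple return value.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.AbstractMap.SimpleEntry;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch only: a plain-Java stand-in for the Hadoop map step.
class PercentChangeMap {

    // Fixed symbols so the decimal separator is '.' whatever the default locale is.
    private static final DecimalFormat FORMAT =
            new DecimalFormat("0.##", DecimalFormatSymbols.getInstance(Locale.ROOT));

    // Turns one CSV line into a (percentage change, 1) pair.
    static Map.Entry<String, Integer> map(String line) {
        String[] tokens = line.split(",");
        float open = Float.parseFloat(tokens[1]);   // column 1: Open
        float close = Float.parseFloat(tokens[4]);  // column 4: Close
        float change = ((close - open) / open) * 100;
        return new SimpleEntry<>(FORMAT.format((double) change) + "%", 1);
    }

    public static void main(String[] args) {
        // Made-up sample line, for illustration only.
        System.out.println(map("2013-01-02,40.00,41.00,39.50,40.60,1000000,40.60"));
    }
}
```

Rounding to two decimal places is what makes the keys collide: days with changes of 1.498% and 1.502% both land on the key `1.5%`, so the later count is effectively a histogram with 0.01% bins.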
  • 13. Reduce Function: This function takes the (name, value) outputs from all the mappers and processes that data accordingly (often "reducing" it). In our case we simply count the number of times a particular percentage change happens. In essence we are going to change this:
      1.2%  1
      1.3%  1
      1.2%  1
    into this:
      1.2%  2
      1.3%  1
    The reduce code is:
      int sum = 0;
      for (IntWritable val : values) {
          sum = sum + val.get();
      }
      context.write(key, new IntWritable(sum));
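As a minimal plain-Java sketch of the reduce step, the counting can be reproduced with a sorted map standing in for Hadoop's shuffle and reduce phases. The class and method names are hypothetical, and the input keys are the three example pairs from the slide.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: a plain-Java stand-in for shuffle + reduce.
class PercentChangeReduce {

    // Each input key stands for one (key, 1) pair emitted by a mapper.
    // The result maps every distinct key to the sum of its 1s.
    static Map<String, Integer> reduce(List<String> mappedKeys) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : mappedKeys) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(reduce(Arrays.asList("1.2%", "1.3%", "1.2%")));
    }
}
```

In a real job the framework performs the grouping by key before the reducer is called, so the reducer itself only has to sum the values it is handed.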
  • 14. Technical indicators algorithm: These indicators are used for analysis. By choosing one of the following features and entering a period as input, users can see a graph for that company. A. Simple Moving Average (SMA): 1) SMA is the most basic moving average used for trading. 2) It is based on the closing price. Example with daily closing prices 11, 12, 13, 14, 15, 16, 17 and a 5-day period:
      1st value: (11+12+13+14+15)/5 = 13
      2nd value: (12+13+14+15+16)/5 = 14
      3rd value: (13+14+15+16+17)/5 = 15
    and so on.
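The worked example can be checked with a short plain-Java sketch. This is one possible implementation using a running window sum; the class and method names are hypothetical, and the input is the slide's own price series with a 5-day period.

```java
import java.util.Arrays;

// Illustrative sketch only: the class and method names are hypothetical.
class SimpleMovingAverage {

    // Returns one average per full window of `period` closing prices.
    static double[] compute(double[] close, int period) {
        if (period <= 0 || period > close.length) {
            throw new IllegalArgumentException("period must be between 1 and the number of prices");
        }
        double[] sma = new double[close.length - period + 1];
        double windowSum = 0;
        for (int i = 0; i < close.length; i++) {
            windowSum += close[i];
            if (i >= period) {
                windowSum -= close[i - period];  // drop the price that left the window
            }
            if (i >= period - 1) {
                sma[i - period + 1] = windowSum / period;
            }
        }
        return sma;
    }

    public static void main(String[] args) {
        double[] close = {11, 12, 13, 14, 15, 16, 17};
        System.out.println(Arrays.toString(compute(close, 5)));  // prints [13.0, 14.0, 15.0]
    }
}
```

Keeping a running sum avoids re-adding the whole window for every day, which matters only for long series but costs nothing here.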
  • 15. B. Exponential Moving Average (EMA): 1) It tries to reduce lag by applying more weight to recent prices. 2) EMA(current) = ((Price(current) - EMA(previous)) * Multiplier) + EMA(previous), where Multiplier = 2 / (Time period + 1).
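The recurrence can be sketched in plain Java as follows. The slides do not say how the first EMA value is chosen, so this sketch assumes it is seeded with the first closing price (seeding with an SMA of the first period is another common choice). The class name, method name, and sample prices are hypothetical.

```java
import java.util.Arrays;

// Illustrative sketch only: the class and method names are hypothetical.
class ExponentialMovingAverage {

    // Applies EMA(cur) = ((Price(cur) - EMA(prev)) * multiplier) + EMA(prev).
    static double[] compute(double[] close, int period) {
        if (close.length == 0 || period <= 0) {
            throw new IllegalArgumentException("need at least one price and a positive period");
        }
        double multiplier = 2.0 / (period + 1);
        double[] ema = new double[close.length];
        ema[0] = close[0];  // assumption: seed with the first closing price
        for (int i = 1; i < close.length; i++) {
            ema[i] = ((close[i] - ema[i - 1]) * multiplier) + ema[i - 1];
        }
        return ema;
    }

    public static void main(String[] args) {
        // Made-up prices; a period of 3 gives a multiplier of 0.5.
        System.out.println(Arrays.toString(compute(new double[]{10, 12, 14}, 3)));
    }
}
```

With a period of 3 the multiplier is 0.5, so each new value moves halfway from the previous EMA toward the latest price: 10, then 11, then 12.5.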
  • 16. Conclusion: Investing in stocks is a common side business for companies and individuals seeking compound interest, the time value of money, tax benefits, and diversification. Investing in a good, rising stock is therefore necessary to obtain the desired profit, and a stock change indicator is very helpful to the user in selecting such a stock. Hadoop is open source software that can handle huge amounts of data quite easily. Hadoop has modules such as Hadoop MapReduce, HDFS, Hadoop Common, and Hadoop YARN. The map function calculates the percentage change of the stock, assigns that change as the key, and emits a value of 1 for each key. The reduce function reads each key and the set of values associated with it, then calculates the sum of those values and gives the key and that sum (the frequency) as the final output. The EMA algorithm mainly focuses on recent price values. By analyzing these values, the user can choose a less risky stock. By plotting the EMA of the closing prices, the user can understand the trend of the stock and so invest in a less risky or more risky stock (according to preference) with an upward trend.
  • 17. Future work: We have planned the following things for the future. • We want to add more technical indicators (like back propagation neural networks) to this program so that a person can compare the result of each indicator. The user will have the freedom to give importance to a particular condition (indicator). • We want to add some graphical indicators (like OBVP) to this project so that the user gets graphical knowledge along with statistical knowledge and can better understand the trend of the stock. • We want to link this project to a website so that more people can benefit from it.