Data Science Project
Case Study 1:
EDGAR Standalone Application Case Study
Situation:
For the first time, a powerful tool brings together the collection of data provided by EDGAR in an organized, uniform manner: financial data from different sectors and industries is collected, analyzed, and displayed in a consistent format understandable to everyone, covering every aspect of a company's finances. It is intended to help students, financial data analysts, stock brokers, and anyone else working with financial data. The tool provides access to the financial data of different sectors, industries, and even countries, and allows it to be explored and analyzed for both educational and professional purposes.
Requirement:
• Companies from different sectors submit their financial data files in XML format to EDGAR, following the guidelines of the U.S. Securities and Exchange Commission (SEC). The system connects to the EDGAR database over FTP and downloads the data files periodically. The data is then parsed from the files and analyzed with a purpose-built analysis tool (a minimal download-and-parse sketch appears after this list).
• Using the parsed data, the system creates uniform balance sheets, profit and loss statements, and cash flow statements for different companies. These financial records are stored in the system database for later perusal by any user of the system.
• The APIs of various stock exchanges such as AMEX, NASDAQ, and NYSE are accessed on a regular basis for the stock prices of companies from different sectors and industries, and the prices are stored in the system database. A system user can then run statistical analyses of the stock price, such as multiple regression analysis and volatility estimation, and share the results with others.
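
The case study does not name its stack, so the following Python sketch is only one way the download-and-parse step could look. ftp.sec.gov was EDGAR's legacy FTP host (EDGAR has since moved to HTTPS distribution), and the filing path and XML tag names below are hypothetical.

```python
# Minimal sketch: fetch one filing over anonymous FTP and parse it as XML.
# Endpoint, path, and tag names are assumptions, not the system's actual ones.
import ftplib
import io
import xml.etree.ElementTree as ET

FTP_HOST = "ftp.sec.gov"                          # legacy EDGAR FTP host (assumption)
REMOTE_PATH = "edgar/data/0000320193/filing.xml"  # hypothetical filing path

def download_filing(host: str, path: str) -> bytes:
    """Download a single filing over anonymous FTP and return its raw bytes."""
    buf = io.BytesIO()
    with ftplib.FTP(host) as ftp:
        ftp.login()                               # anonymous login
        ftp.retrbinary(f"RETR {path}", buf.write)
    return buf.getvalue()

def parse_filing(raw: bytes) -> dict:
    """Extract a few illustrative fields; real filings use XBRL-style tags."""
    root = ET.fromstring(raw)
    return {
        "company": root.findtext(".//CompanyName", default=""),
        "revenue": root.findtext(".//Revenue", default=""),
    }

if __name__ == "__main__":
    print(parse_filing(download_filing(FTP_HOST, REMOTE_PATH)))
```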
Challenges:
The first challenge was dealing with large-scale data derived from a huge number of companies all over the world: managing it efficiently, storing it, and accessing it promptly on a regular basis.
The second challenge was creating a purpose-built analysis tool to analyze the collected data and present it in a uniform manner.
The third challenge was sharing the large-scale data with other users after analysis, while ensuring they have efficient and prompt access to it.
Action:
To address the above-mentioned requirements, the following actions were taken:
• The data files arriving over FTP were parsed, and the resulting data was stored in the database using an indicator-separating rule for faster access later on.
• An exclusive parsing tool was built around the filing structure mandated by the U.S. Securities and Exchange Commission, so that the data could be stored in a uniform manner and used to prepare balance sheets, profit and loss statements, and cash flow reports.
• Data from the various stock exchange APIs was stored in the database and later retrieved quickly by applying a big-data handling mechanism.
• Users could analyze stock exchange data such as open and close prices through the statistical tool provided. Following that structure, a unique and efficient analysis tool was created for analyzing the stock exchange data and sharing the results with others (a small volatility and regression sketch appears after this list).
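
As a concrete illustration of the statistics mentioned in the last bullet, this sketch computes daily volatility from close prices and a simple multiple regression of the close price on the open price and volume. The column names and the pandas/NumPy stack are assumptions; the case study does not describe the actual tool.

```python
# Sketch: volatility of daily log returns, and a least-squares fit of
# close ~ intercept + open + volume. Input columns are assumed names.
import numpy as np
import pandas as pd

def daily_volatility(close: pd.Series) -> float:
    """Standard deviation of daily log returns."""
    log_returns = np.log(close / close.shift(1)).dropna()
    return float(log_returns.std())

def regression_coefficients(df: pd.DataFrame) -> np.ndarray:
    """Multiple regression of close price on open price and volume."""
    X = np.column_stack([np.ones(len(df)), df["open"], df["volume"]])
    coeffs, *_ = np.linalg.lstsq(X, df["close"], rcond=None)
    return coeffs

if __name__ == "__main__":
    prices = pd.DataFrame({
        "open":   [100.0, 101.5, 102.0, 101.0],
        "close":  [101.0, 102.0, 101.5, 102.5],
        "volume": [1.0e6, 1.2e6, 0.9e6, 1.1e6],
    })
    print("volatility:", daily_volatility(prices["close"]))
    print("coefficients (intercept, open, volume):", regression_coefficients(prices))
```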
Results:
Through an innovative methodology:
• A golden opportunity for financial data analysts, stock brokers, students, and individuals working in the financial sector to promptly access the balance sheets, profit and loss statements, and cash flow statements of companies from different sectors and industries, presented in a unified manner on a single platform.
• The statistical analysis tool incorporated in this platform gives financial data analysts, stock brokers, students, and individuals working in the financial sector the perfect opening to stay aware of the current financial scenario at all times. Now, the world and its financial data are always within their reach.
Case Study 2:
High Frequency Trading Estimator Case Study
Situation:
Data science and technology have grown capable enough to support an innovative algorithm for handling huge amounts of Forex market data, analyzing it, and predicting the estimated value of a foreign exchange rate, to ensure better profit for both professional and first-time traders who exchange currencies at regular intervals according to market conditions. Whether one is an existing trader, an investor, or someone about to start Forex trading, one would certainly like some guidance on buying or selling currencies. Instead of depending on dubious information or an individual's prediction, it would be far better to rely on a system with a purpose-built structure and algorithm that predicts the best possible profit margin at a given point in time with a low chance of error.
Requirement:
• Real market bid prices arriving continuously through the Forex market API must be fed into the database, and this huge volume of data must be handled efficiently (a minimal ingestion sketch appears after this list).
• The system is required to be a stand-alone application or product with a certain set of capabilities. For instance, it must be structured so that an existing or first-time trader can configure their own admin portal, which provides facilities such as API set-up, database source location set-up, time-zone set-up, and artificial intelligence parameter set-up.
• For prediction, the machine must be configured accordingly: selected currency pairs serve as the source feed, which in turn helps predict a target pair based on the selected time observers.
• Once the machine starts, the trader has access to a live chart showing continuous information on the real market bid price and the estimated value simultaneously, based on the selected time observers.
• The trader has access to a statistical analysis of the real moving average and the current estimation error, portrayed graphically for ease of understanding.
• Application stability is extremely important, and data searching should be fast so that the selected time frame is utilized properly.
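
To make the first requirement concrete, here is a minimal ingestion sketch. The feed URL, JSON shape, SQLite store, and one-second polling interval are all assumptions; the real system's API and schema are not described in the case study.

```python
# Sketch: poll a (hypothetical) Forex quote endpoint and append each bid
# to a local SQLite table. Everything about the feed here is assumed.
import json
import sqlite3
import time
import urllib.request

FEED_URL = "https://example.com/forex/bid?pair=EURUSD"  # hypothetical endpoint

def ensure_table(conn: sqlite3.Connection) -> None:
    conn.execute("CREATE TABLE IF NOT EXISTS bids (ts REAL, pair TEXT, bid REAL)")

def ingest_once(conn: sqlite3.Connection) -> None:
    """Fetch one quote and append it to the bids table."""
    with urllib.request.urlopen(FEED_URL) as resp:
        quote = json.load(resp)          # assumed shape: {"pair": ..., "bid": ...}
    conn.execute(
        "INSERT INTO bids VALUES (?, ?, ?)",
        (time.time(), quote["pair"], float(quote["bid"])),
    )
    conn.commit()

if __name__ == "__main__":
    conn = sqlite3.connect("bids.db")
    ensure_table(conn)
    while True:                          # poll at a fixed interval
        ingest_once(conn)
        time.sleep(1.0)
```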
Challenges:
The biggest challenge was creating a low-error prediction algorithm based on an artificial neural network.
The second challenge was creating an algorithm for handling the huge volume of data and fishing out the most relevant data in as little time as possible.
Another important challenge was the stability of the application and the optimization of system resource utilization (memory and CPU).
Action:
To address the above-mentioned requirements, the following actions were taken:
• An algorithm was created for fishing out the most relevant history data by generating keys and using them to look up the data in the database, so that searching could be optimized effectively (a keyed-lookup sketch appears after this list).
• An admin portal was created where the trader can set up the API (i.e. the API's source location and authentication token), the database (i.e. database source locations and credentials), and the time zone (i.e. the start and stop of data feeding into the database are synchronized automatically with Forex market timings).
• An algorithm was created with the help of an artificial neural network, in which the latest history data of the input and output pairs is used for training in order to predict the estimated value (a small prediction sketch also appears after this list).
• A live chart was created with the help of the amCharts library, on which the estimated price and the real market bid price are plotted.
• For statistical analysis, a separate chart was created with the amCharts library, comparing the predicted (estimated) value against the real-time value to measure the error, along with its continuous fluctuation, and plotting both on the graph.
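
The key-generation idea in the first action item could look like the sketch below: quotes are bucketed under a (pair, minute) key so that a time window can be fetched without scanning the full history. This bucketing scheme is an assumption; the case study does not spell out the actual key design.

```python
# Sketch of key-based history lookup: the key scheme is hypothetical.
from collections import defaultdict

def make_key(pair: str, ts: float) -> str:
    """Bucket a timestamp into a one-minute key, e.g. 'EURUSD:28351042'."""
    return f"{pair}:{int(ts // 60)}"

class HistoryStore:
    def __init__(self) -> None:
        self._buckets = defaultdict(list)    # key -> list of (ts, bid)

    def add(self, pair: str, ts: float, bid: float) -> None:
        self._buckets[make_key(pair, ts)].append((ts, bid))

    def window(self, pair: str, start: float, end: float) -> list:
        """Fetch quotes in [start, end) by touching only the needed buckets."""
        out = []
        for minute in range(int(start // 60), int(end // 60) + 1):
            for ts, bid in self._buckets.get(f"{pair}:{minute}", []):
                if start <= ts < end:
                    out.append((ts, bid))
        return out
```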
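For the neural-network estimator, the sketch below trains on lagged history of a price series and predicts the next value. scikit-learn's MLPRegressor and the lag count of 10 stand in for the original network, whose architecture and training regime the case study does not describe.

```python
# Sketch: predict the next tick from the previous `lags` values using a
# small feed-forward network. The data here is synthetic.
import numpy as np
from sklearn.neural_network import MLPRegressor

def lagged_features(series: np.ndarray, lags: int):
    """Build (X, y): each row of X holds the previous `lags` values of y."""
    X = np.column_stack([series[i:len(series) - lags + i] for i in range(lags)])
    y = series[lags:]
    return X, y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    history = 1.08 + np.cumsum(rng.normal(0.0, 1e-4, 2000))  # EURUSD-like walk
    X, y = lagged_features(history, lags=10)

    model = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
    model.fit(X, y)

    latest_window = history[-10:].reshape(1, -1)  # most recent lags as features
    print("next-tick estimate:", model.predict(latest_window)[0])
```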
Results:
Through an innovative methodology:
• The prediction algorithm created is up to 99% error free, though there remains a small margin for improvement.
• The load balancing of the database and the system has been optimized.
• The application runs smoothly and is completely stable, with optimal use of system resources.
• Statistical analysis shows that the error moving average does not fluctuate greatly (i.e. the application is working almost perfectly).