Analysing data analytics use cases
to understand the purpose of big data ecosystem components
by dataeaze systems
The purpose of any data platform (big or not)
is to enable analytics on data.
dataeaze
Why?
Different analytics use cases expect different sets of features from a data platform.
The components of the big data ecosystem are built to provide the features those use cases need.
So, to understand a data platform and its components, it is necessary to know their purpose: the needs of the analytics use cases that the platform serves.
Here we take a look at all categories of analytics use cases on a data platform.
What?
Analytics data processing use case categories
We analyse each use case in terms of:
Nature of data processing: the processing needed in order to serve the use case.
Expectations from data platform: what the platform must provide to enable that processing.
Static Reports
Static reports are summary reports prepared to give a status update to decision makers.
Example: an end-of-day report for top management covering daily sales, transactions, revenue and total traffic.
Static Reports: nature of data processing
Static reports are scheduled to execute at a fixed time interval, generate analysis reports for a given time period, and can run either on raw data directly or on an intermediate store.
Static Reports: expectations from data platform
Scheduled data processing: static reports are executed repeatedly on a predefined schedule.
Timely arrival of data: a generated report should represent the complete picture of the given timeframe and should be generated before its deadline.
Processing raw data: the platform should be able to generate a report from raw data if it cannot be extracted from an intermediate data form.
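As an illustration of the static report use case, here is a minimal sketch of a report job that summarises raw events for one day. The event layout, the `RAW_EVENTS` sample and the `daily_report` function are hypothetical and not part of the original slides; in practice a scheduler would invoke such a job at the end of each day.

```python
from datetime import date

# Hypothetical raw sales events: (day, order_id, revenue).
RAW_EVENTS = [
    (date(2024, 1, 1), "o1", 120.0),
    (date(2024, 1, 1), "o2", 80.0),
    (date(2024, 1, 2), "o3", 50.0),
]


def daily_report(events, day):
    """Summarise transactions and revenue for one day directly from raw events."""
    rows = [event for event in events if event[0] == day]
    return {
        "day": day.isoformat(),
        "transactions": len(rows),
        "revenue": sum(row[2] for row in rows),
    }


# A scheduler (assumed, not shown) would call this once per reporting period.
report = daily_report(RAW_EVENTS, date(2024, 1, 1))
print(report)
```

The report is only complete if all events for the day have arrived before the job runs, which is why timely arrival of data is listed as an expectation.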
Dashboard Reports
A dashboard is a reporting user interface where users interactively choose their own view of the data through a limited set of filters.
Example: an e-commerce company provides a dashboard where sellers can see how much inventory was sold across demographics, product categories and time ranges.
Dashboard: nature of data processing
Raw data is processed periodically to bring it into the form required by the dashboards.
The transformed data is loaded into the interactive store that backs the dashboards.
Dashboard: expectations from data platform
ETL: converts raw data into the format required by the dashboard.
Scheduled data processing: timely, repeated execution of ETL jobs keeps dashboards populated with the latest updates.
Interactive data store: dashboard reports are interactive, so the backend store is expected to return results in near real time.
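The dashboard pattern can be sketched as an ETL step that pre-aggregates raw records, followed by cheap filtered lookups against the aggregated store. The record fields, `run_etl` and `dashboard_view` below are hypothetical names chosen for illustration, and a plain dictionary stands in for the interactive data store.

```python
from collections import defaultdict

# Hypothetical raw order records.
RAW_ORDERS = [
    {"seller": "s1", "category": "books", "region": "north", "units": 3},
    {"seller": "s1", "category": "books", "region": "south", "units": 2},
    {"seller": "s1", "category": "toys", "region": "north", "units": 5},
    {"seller": "s2", "category": "books", "region": "north", "units": 1},
]


def run_etl(orders):
    """Pre-aggregate units sold per (seller, category, region)."""
    store = defaultdict(int)
    for order in orders:
        key = (order["seller"], order["category"], order["region"])
        store[key] += order["units"]
    return dict(store)


def dashboard_view(store, seller, category=None, region=None):
    """Answer a filtered dashboard request from the pre-aggregated store."""
    return sum(
        units
        for (s, c, r), units in store.items()
        if s == seller
        and (category is None or c == category)
        and (region is None or r == region)
    )


store = run_etl(RAW_ORDERS)  # would be re-run on a schedule
print(dashboard_view(store, "s1"))
print(dashboard_view(store, "s1", category="books"))
```

Because the dashboard only offers a limited set of filters, the store can be keyed by exactly those dimensions, which keeps each request fast.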
Ad Hoc data analysis
Ad hoc analysis answers business queries raised as the need arises. It is not scheduled; it is executed once, whenever necessary.
Example: a product manager wants a detailed analysis of customer behavior on a navigation panel in order to optimise ad placements.
Ad Hoc: nature of data processing
Steps to serve an ad hoc report:
1. Identify the data sources that can satisfy the request.
2. Run the data processing (preferably a SQL-like query) on the identified sources.
3. Load the results into a data representation tool.
Ad Hoc: expectations from data platform
SQL data processing engine: a SQL query engine makes it easy to express the required analysis as a SQL query and saves the analyst's time.
Complex data processing: the platform should support writing custom, complex data analysis that cannot be expressed in SQL.
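To show why a SQL engine suits ad hoc questions, the sketch below answers a one-off question ("which navigation panel items get the most clicks?") with a single query. It uses Python's built-in `sqlite3` module as a stand-in for the platform's SQL engine; the `nav_clicks` table and its sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the platform's SQL query engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nav_clicks (user_id TEXT, panel_item TEXT)")
conn.executemany(
    "INSERT INTO nav_clicks VALUES (?, ?)",
    [
        ("u1", "home"),
        ("u2", "home"),
        ("u1", "deals"),
        ("u3", "home"),
        ("u3", "deals"),
        ("u3", "cart"),
    ],
)

# The ad hoc question expressed as one SQL query, written and run once.
rows = conn.execute(
    "SELECT panel_item, COUNT(*) AS clicks "
    "FROM nav_clicks "
    "GROUP BY panel_item "
    "ORDER BY clicks DESC, panel_item"
).fetchall()

for panel_item, clicks in rows:
    print(panel_item, clicks)
```

Analysis that does not fit this shape, such as a custom algorithm over the click sequences, is where the second expectation (custom complex data processing) comes in.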
BI Reporting
Business Intelligence tools provide advanced general-purpose dashboards backed by a data store that holds a wide array of dimensions. Users can define and save transformations and analysis queries through the BI tool and get reports back in tabular or graphical form.
Example: a BI report showing weekly sales statistics across multiple regions for the previous 6 months. The report is created and saved once; users execute the saved report whenever they want.
BI Reporting: nature of data processing
Scheduled ETL jobs convert raw data into the required intermediate form.
The data is loaded into interactive SQL data stores.
BI tools connect to the SQL data store as their backend.
BI Reporting: expectations from data platform
ETL: raw data should be transformed into the required format and loaded into a SQL data warehouse.
Scheduling of ETL: the defined ETL jobs should be scheduled to execute at a fixed time interval.
SQL data processing engine: a SQL query engine makes it easy to extract data and saves time. BI tools can connect to this SQL data store.
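The BI flow (ETL into a SQL store, then a saved report run on demand) might look roughly like the following. This is a sketch under assumptions: `sqlite3` stands in for the SQL data warehouse, and `RAW_SALES`, `etl_load` and `SAVED_REPORT` are hypothetical names. The ETL step derives the ISO year and week so the saved report can group by week without further transformation.

```python
import sqlite3
from datetime import date

# Hypothetical raw sales records: (day, region, amount).
RAW_SALES = [
    (date(2024, 1, 1), "north", 100.0),
    (date(2024, 1, 3), "north", 50.0),
    (date(2024, 1, 8), "north", 70.0),
    (date(2024, 1, 2), "south", 30.0),
]


def etl_load(conn, raw_sales):
    """Transform raw records into weekly-keyed rows and load them into SQL."""
    conn.execute(
        "CREATE TABLE sales "
        "(iso_year INTEGER, iso_week INTEGER, region TEXT, amount REAL)"
    )
    rows = []
    for day, region, amount in raw_sales:
        iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        rows.append((iso[0], iso[1], region, amount))
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)


# A report defined once and saved; users re-run it whenever they want.
SAVED_REPORT = (
    "SELECT iso_year, iso_week, region, SUM(amount) "
    "FROM sales "
    "GROUP BY iso_year, iso_week, region "
    "ORDER BY iso_year, iso_week, region"
)

conn = sqlite3.connect(":memory:")
etl_load(conn, RAW_SALES)  # would run on a fixed schedule
weekly = conn.execute(SAVED_REPORT).fetchall()
for row in weekly:
    print(row)
```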
Data Processing for Applications
This is data processing done to feed results back into business applications, which make better decisions based on the latest data.
Example: ad servers are periodically updated with the latest minimum eCPM to expect for an ad placement that is filled dynamically.
App data processing: nature of data processing
Complex data processing (machine learning) runs on raw data.
The processing is scheduled.
Results are written to an interactive key-value store, from which applications fetch them directly.
App data processing: expectations from data platform
Capability to implement custom complex data processing: users should be able to easily define custom, complex data processing algorithms (such as machine learning).
Scheduled data processing: required for periodic execution of data processing jobs.
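A minimal sketch of this feedback loop, assuming a dictionary as the key-value store and the lowest observed eCPM per placement as a deliberately simple stand-in for the machine learning step. `RAW_BIDS`, `refresh_min_ecpm` and `ad_server_floor` are hypothetical names, not part of the original slides.

```python
from collections import defaultdict

# Hypothetical raw bid log: (placement, observed eCPM).
RAW_BIDS = [
    ("top_banner", 2.5),
    ("top_banner", 1.8),
    ("sidebar", 0.9),
    ("sidebar", 1.1),
]

# A plain dict standing in for the interactive key-value store.
kv_store = {}


def refresh_min_ecpm(bids, store):
    """Scheduled job: recompute the minimum eCPM per placement and publish it.

    Taking the minimum is a simple stand-in for a real model.
    """
    observed = defaultdict(list)
    for placement, ecpm in bids:
        observed[placement].append(ecpm)
    for placement, values in observed.items():
        store["min_ecpm:" + placement] = min(values)


def ad_server_floor(store, placement, default=0.0):
    """Application side: fetch the latest published value by key."""
    return store.get("min_ecpm:" + placement, default)


refresh_min_ecpm(RAW_BIDS, kv_store)  # would run periodically
print(ad_server_floor(kv_store, "top_banner"))
```

The application never touches raw data; it only performs key lookups, which is why the results are published to a key-value store.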
Real time stream data processing
This means analysing an event as soon as it happens. The sooner the analysis, the greater the value obtained from it.
Example: the stock ticker displayed on Yahoo Finance.
Real time stream: nature of data processing
As soon as an event happens, its log entry is collected.
All log entries are buffered and made available to the processing layer.
The processing layer pulls records from the message buffer and processes them.
Real time stream: expectations from data platform
Scalable message buffer: holds received messages until they are pulled for processing.
Real time stream processing engine: pulls and processes records in real time, and lets users define custom data processing.
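The collect, buffer and pull steps can be illustrated on a single machine with Python's standard `queue.Queue` as the message buffer. This is only a sketch of the pattern, not of a distributed buffer or stream engine; `collect`, `process_available` and the sample ticker events are hypothetical.

```python
import queue

# In-process queue standing in for a scalable message buffer.
buffer = queue.Queue()


def collect(event):
    """Collection layer: push an event into the buffer as soon as it happens."""
    buffer.put(event)


def process_available(buf, state):
    """Processing layer: pull every buffered record and keep the latest price."""
    processed = 0
    while True:
        try:
            symbol, price = buf.get_nowait()
        except queue.Empty:
            return processed
        state[symbol] = price  # custom processing logic goes here
        processed += 1


latest = {}
for event in [("ACME", 10.0), ("INIT", 5.0), ("ACME", 10.5)]:
    collect(event)

count = process_available(buffer, latest)
print(count, latest)
```

The buffer decouples the rate at which events arrive from the rate at which they are processed, which is why it appears as a separate expectation from the processing engine.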
Let us take a look at the superset of expectations across all use cases.
Expectations from data platform
across all use cases
Summarise all
Superset of expectations
Expectation / capability (needed by use cases):
Complex data analysis using query language
Scheduled ETL data processing
Data store for interactive data analysis
Data ingestion with timely arrival of data
Scalable message buffer to be consumed by stream data processing
Streaming data processing platform
Use cases:
Static reports
Ad hoc data analysis
BI reporting
Dashboard reports
App-specific data processing
Real time stream data processing
Let’s conclude
We have identified a common set of features that most analytics use cases expect from a data platform.
Let us map these to data platform components.
Conclude
Capabilities provided by data platform components
Expectation / capability (supported by data platform components):
Complex data analysis using query language
Scheduled ETL data processing
Data store for interactive data analysis
Data ingestion with timely arrival of data
Scalable message buffer to be consumed by stream data processing
Streaming data processing platform
Data platform component: tools
Data ingestion: Flume, Kafka, Scribe
Batch data processing: Hive, MapReduce
Workflow scheduler: Oozie
Interactive data stores: HBase, Spark, ..
Message buffers: Kafka
Real time stream engine: Storm, Spark
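The component-to-tool mapping from this slide can also be written down as a small lookup structure, which makes it easy to see which tools serve more than one role. The dictionary below only restates the tools named on the slide (the slide's list for interactive data stores is open-ended); `PLATFORM_COMPONENTS` and `components_using` are hypothetical names.

```python
# Component-to-tool mapping as listed on the slide.
# The slide's list for interactive data stores is open-ended ("..").
PLATFORM_COMPONENTS = {
    "Data ingestion": ["Flume", "Kafka", "Scribe"],
    "Batch data processing": ["Hive", "MapReduce"],
    "Workflow scheduler": ["Oozie"],
    "Interactive data stores": ["HBase", "Spark"],
    "Message buffers": ["Kafka"],
    "Real time stream engine": ["Storm", "Spark"],
}


def components_using(tool):
    """Return the components for which the given tool is listed."""
    return [
        component
        for component, tools in PLATFORM_COMPONENTS.items()
        if tool in tools
    ]


print(components_using("Kafka"))
print(components_using("Spark"))
```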
Data platform components satisfying expectations
Going backwards
Now you know about data platform components, the capabilities they support, and the analytics use case features those capabilities satisfy.
Thank You

Analysing data analytics use cases to understand big data platform

  • 1. Analysing data analytics use cases to understand purpose of big data ecosystem components by
  • 2. Purpose of any data platform (big / not big) is to enable analytics on data dataeaze Why?
  • 3. Different analytics use cases expect different set of features from data platform Components part of big data ecosystem are made to serve needed features of analytics use cases dataeaze Why?
  • 4. So to understand data platform to understand data platform components It is necessary to know purpose It is necessary to know needs of analytics use cases which are served by data platform dataeaze Why?
  • 5. Here We take look at all categories of analytics use cases on data platform dataeaze What?
  • 6. Analytics data processing use case categories dataeaze What?
  • 7. We analyse each use case as Nature of data processing in order to serve this use case Expectations from data platform to enable required data processing dataeaze What?
  • 8. Static Reports are summary reports prepared for the purpose of giving status to decision makers Example Report for top management at end of day specifying daily sales, transactions, revenue, total traffic dataeaze
  • 9. Nature of data processing Static reports are Scheduled to execute at fixed time interval, Generate analysis reports for given time period, Can execute on raw data directly or on intermediate store dataeaze Static Reports
  • 10. Expectations from data platform Scheduled data processing Static reports are executed at predefined schedule repeatedly Timely arrival of data Generated reports should represent complete picture of given timeframe, and should be generated before deadline. Process raw data to get result Capability to generate report from raw data if it cannot be extracted from intermediate data form dataeaze Static Reports
  • 11. Dashboard Reports Dashboard is reporting user interface where user can interactively choose his own view of data with limited set of filters. Example An e-commerce company having dashboard for sellers where sellers get to know how much inventory sold across demographic, across product categories, across time range. dataeaze
  • 12. Nature of data processing Periodically keep on processing raw data to bring it in form required by dashboards Populate transformed data into interactive store backend of dashboards dataeaze Dashboard
  • 13. Expectations from data platform ETL To convert raw data in format required by dashboard Scheduled data processing Timely repeated executions of ETL jobs to populate dashboards with latest updates Interactive data store Dashboard reports are interactive in nature, so backend store is supposed to return results in near real time dataeaze Dashboard
  • 14. Ad Hoc data analysis: for business queries raised as the need arises. It is not scheduled and is executed once, whenever necessary. Example: a product manager wants a detailed analysis of customer behaviour on a navigation panel in order to define optimised ad placements.
  • 15. Ad Hoc: Nature of data processing. Steps to serve an ad hoc report: (1) identify the data sources that will satisfy the request; (2) execute data processing (preferably an SQL-like query) on the identified source; (3) load the results into a data representation tool.
  • 16. Ad Hoc: Expectations from data platform. Data processing SQL engine: an SQL query engine makes it easy to express the required analysis as an SQL query and saves the analyst's time. Complex data processing: a platform that supports writing custom, complex data analysis that is not possible through SQL.
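As a rough illustration of why an SQL engine suits ad hoc questions, the sketch below uses Python's built-in `sqlite3` module as a stand-in for the platform's query engine. The `nav_clicks` table and its columns are hypothetical, loosely modelled on the navigation panel example.

```python
import sqlite3

# In-memory SQLite database standing in for the platform's SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nav_clicks (user_id INTEGER, panel_item TEXT)")
conn.executemany(
    "INSERT INTO nav_clicks VALUES (?, ?)",
    [(1, "home"), (2, "deals"), (3, "deals"), (1, "deals"), (2, "cart")],
)

# One-off question: which navigation items attract the most clicks?
rows = conn.execute(
    """
    SELECT panel_item, COUNT(*) AS clicks
    FROM nav_clicks
    GROUP BY panel_item
    ORDER BY clicks DESC, panel_item ASC
    """
).fetchall()
print(rows)
conn.close()
```

The whole analysis is a single declarative query, written once and then discarded, which matches the one-time nature of ad hoc requests.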
  • 17. BI Reporting: Business Intelligence tools provide advanced general-purpose dashboards backed by a data store that hosts a wide array of dimensions. Users can define and save transformations and analysis queries through the BI tool and get back reports in tabular or graphical form. Example: a BI report showing weekly sales stats across multiple regions for the previous 6 months. The report is created and saved once; users execute the saved report whenever they want.
  • 18. BI Reporting: Nature of data processing. Scheduled ETL jobs convert raw data to the required intermediate form; the data is loaded into interactive SQL data stores; BI tools connect to the SQL data store as their backend.
  • 19. BI Reporting: Expectations from data platform. ETL: raw data should be transformed to the required format and loaded into an SQL data warehouse. Scheduling of ETL: defined ETL jobs should be scheduled to execute at a fixed time interval. Data processing SQL engine: an SQL query engine makes it easy to extract data and saves time; BI tools can connect to this SQL data store.
  • 20. Data Processing for Applications: data processing done to provide feedback input to business applications, which make better decisions based on the latest data feedback. Example: ad servers being periodically updated with the latest minimum eCPM to expect for an ad placement that is filled dynamically.
  • 21. App data processing: Nature of data processing. Complex data processing (machine learning) on raw data; scheduled data processing; results are written to an interactive key-value store from which applications fetch them directly.
  • 22. App data processing: Expectations from data platform. Capability to implement custom complex data processing: users should be able to easily define custom, complex data processing algorithms (such as machine learning). Scheduled data processing: required for periodic execution of data processing jobs.
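The flow of a scheduled job feeding a key-value store that an application reads can be sketched as follows. This is a deliberately simplified illustration: a plain minimum over observed eCPM values stands in for the machine learning step, a Python dictionary stands in for the key-value store, and all names (`compute_min_ecpm`, `refresh`, `kv_store`, the placement ids) are hypothetical.

```python
def compute_min_ecpm(impressions):
    """Batch step: lowest observed eCPM per placement.

    A simple minimum is used here as a placeholder for the complex
    (e.g. machine learning) processing a real job would run.
    """
    floors = {}
    for placement, ecpm in impressions:
        floors[placement] = min(ecpm, floors.get(placement, ecpm))
    return floors


def refresh(kv_store, impressions):
    """Scheduled step: publish fresh results into the key-value store."""
    kv_store.update(compute_min_ecpm(impressions))


# Hypothetical raw impression log as (placement, ecpm) pairs.
impressions = [
    ("top_banner", 1.8),
    ("sidebar", 0.4),
    ("top_banner", 1.2),
    ("sidebar", 0.9),
]

kv_store = {}                    # stand-in for an interactive key-value store
refresh(kv_store, impressions)   # would be triggered periodically by a scheduler

# The application (e.g. an ad server) only performs a key lookup.
print(kv_store["top_banner"])
```

The application never runs the heavy computation itself; it only reads the latest published value by key.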
  • 23. Real time stream data processing: analysing an event as soon as it happens. The sooner the analysis, the greater the value obtained from it. Example: a stock ticker displayed on Yahoo Finance.
  • 24. Real time stream: Nature of data processing. As soon as an event happens, its log entry is collected. All log entries are buffered and made available to the processing layer, which pulls records from the message buffer and processes them.
  • 25. Real time stream: Expectations from data platform. Scalable message buffer: a buffer that holds received messages until they are pulled for processing. Real time stream processing engine: to pull and process records in real time and to let users define custom data processing.
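The buffer-and-pull pattern described above can be sketched with Python's standard `queue` module. This is a single-process illustration only: `queue.Queue` stands in for a scalable message buffer, `process_stream` is a hypothetical name for the processing layer, and the sample events are invented. A real engine would consume continuously rather than drain a fixed batch.

```python
import queue


def process_stream(buffer, handler):
    """Pull every buffered record and apply the user-defined handler to it."""
    results = []
    while True:
        try:
            event = buffer.get_nowait()
        except queue.Empty:
            break  # buffer drained; a real engine would keep waiting for events
        results.append(handler(event))
    return results


# Collection layer: events are buffered as soon as they happen.
buffer = queue.Queue()
for price in (101.5, 102.0, 100.75):
    buffer.put({"symbol": "XYZ", "price": price})


# Processing layer: custom logic supplied by the user.
def format_tick(event):
    return f"{event['symbol']} {event['price']:.2f}"


ticker_lines = process_stream(buffer, format_tick)
print(ticker_lines)
```

Keeping the buffer separate from the handler means producers and the processing logic can evolve independently, which is the role the message buffer plays in the platform.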
  • 26. Let us take a look at the superset of expectations across all use cases.
  • 27. Expectations from data platform across all use cases
  • 28. Superset of expectations. Expectation / Capability: complex data analysis using query language; scheduled ETL data processing; data store for interactive data analysis; data ingestion with timely arrival of data; scalable message buffer to be consumed by stream data processing; streaming data processing platform. Needed by (use case): static reports; ad hoc data analysis; BI reporting; dashboard reports; app-specific data processing; real time stream data processing.
  • 30. We have identified a common set of features that most analytics use cases expect from the data platform. Let us map these to data platform components.
  • 31. Capabilities provided by data platform components. Expectation / Capability: complex data analysis using query language; scheduled ETL data processing; data store for interactive data analysis; data ingestion with timely arrival of data; scalable message buffer to be consumed by stream data processing; streaming data processing platform. Supported by (data platform component, with data platform tools): data ingestion (Flume, Kafka, Scribe); batch data processing (Hive, MapReduce); workflow scheduler (Oozie); interactive data stores (HBase, Spark, ..); message buffers (Kafka); real time stream engine (Storm, Spark).
  • 32. Data platform components satisfying expectations
  • 33. Going backwards: now you know about the data platform components, the capabilities they support, and the analytics use case features those capabilities satisfy.