Analytics
Induction
Ankit Rathi
Breaking
the ice…
AGENDA
A. Foundation
B. Lifecycle
C. Tools & Techniques
D. Case Study
And some activities in between…
Let me start with
a story…
But, why my story?
Story I (2008-09)
Story II (2012-13)
A. Foundation
Why is Analytics
required?
Analytics
Growth
Analytics in
Industry
What is
Analytics?
Analytics is the discovery, interpretation,
and communication of meaningful patterns
in data. Especially valuable in areas rich
with recorded information, analytics relies
on the simultaneous application of
statistics, computer programming and
operations research to quantify
performance.
History of
Analytics
Types of
Analytics
Activity
Time - I
Identify use-cases in my business to apply:
• Descriptive Analytics
• Diagnostic Analytics
• Predictive Analytics
• Prescriptive Analytics
• Cognitive Analytics
AR Pizza Chain
B. Lifecycle
Types of
Analytics
Projects
Strategy
Management
Consulting
Visualization
Analytics
Strategy
Analytics
Consulting
Analytics
Management
Appropriate management methodologies
Unavailability of past metrics
The value of a PoC
Achieving repeatability
Agile analytics
Management communication
Analytics
Virtualization
with Advanced
Technologies
with Internet of
Things
with Blockchain
with Artificial
Intelligence
Analytics
Visualization
When to use what?
Pie Chart
Bar Chart
Scatterplot
Violin Plots
Histograms
Box-Whisker Plots
Line Chart
Bubble Chart
Heat Map
Tree Map
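As a rough aid for the "when to use what" question, the chart types above can be kept in a small lookup. The one-line descriptions are common rules of thumb, not taken from the slides, and `CHART_GUIDE` and `suggest_charts` are hypothetical names used only for this sketch.

```python
# Rule-of-thumb guide: chart type -> what it is usually good at showing.
# The descriptions are assumptions added for illustration.
CHART_GUIDE = {
    "Pie Chart": "share of a whole across a few categories",
    "Bar Chart": "comparison of values across categories",
    "Histograms": "distribution of one numeric variable",
    "Box-Whisker Plots": "distribution summary (median, quartiles) and outliers",
    "Violin Plots": "distribution shape compared across groups",
    "Scatterplot": "relationship between two numeric variables",
    "Bubble Chart": "relationship between three numeric variables (third as size)",
    "Line Chart": "trend over time",
    "Heat Map": "magnitude across two dimensions shown as colour",
    "Tree Map": "share of a whole within a hierarchy",
}


def suggest_charts(keyword):
    """Return the chart types whose description mentions the keyword."""
    keyword = keyword.lower()
    return [chart for chart, use in CHART_GUIDE.items() if keyword in use.lower()]


print(suggest_charts("distribution"))
print(suggest_charts("trend"))
```

Searching by a word such as "relationship" or "share of a whole" returns the candidate charts to consider first.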
Pie
Chart
Bar
Chart
Histograms
Box & Whisker
Scatterplots
Line Chart
Bubble
Chart
Violin
Plots
Analytics
Integration
Dashboards
Analytics Workbench
Data Engineers
Data Scientists
Analytics Architect
Structured Data
Semi/Unstructured Data
Batch
Streaming
Ingestion &
Staging Layer
Transformation Layer
Consumption Layer
APIs
Business
Sqoop
Storm
Flume
Kafka
Hadoop
Spark
UI
Tableau
PowerBI
MapReduce
SparkSQL
MongoDB
Cassandra
SparkR
PySpark
How to do
Analytics
Projects?
Identifying
Acquiring
Cleaning
Framing
Understanding
Modeling
Evaluating
Presenting
Deploying
Monitoring
Identifying Analytics Opportunity
Framing the Problem
Determine Your Goals and Objectives
Build Your Proof of Concept Team
Obtain Sample Data Set
Select the predictive model
Analyse POC Results
Stakeholders Buy-in
Conducting the
POC
Identify Data Sources
Filter Data Attributes
Build Ingestion Pipeline
Acquiring
the
Data
Identify Data
Types
Describe
Data
Explore
Data
Verify Data
Quality
Understanding the Data
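The four steps above (identify data types, describe, explore, verify quality) can be sketched in plain Python. The records, the column names and the `profile` helper are made up for illustration; a real project would more likely use a data-frame library.

```python
from statistics import mean, median

# Hypothetical sample records; None marks a missing value.
orders = [
    {"store": "North", "pizzas": 12, "revenue": 96.0},
    {"store": "South", "pizzas": 7, "revenue": None},
    {"store": "North", "pizzas": 15, "revenue": 120.0},
    {"store": "East", "pizzas": None, "revenue": 64.0},
]


def profile(rows):
    """Return per-column type, missing count and basic statistics."""
    report = {}
    for col in rows[0]:
        values = [r[col] for r in rows]
        present = [v for v in values if v is not None]
        # Identify the data type from the values that are present.
        numeric = bool(present) and all(isinstance(v, (int, float)) for v in present)
        info = {
            "type": "numeric" if numeric else "categorical",
            "missing": len(values) - len(present),  # data-quality check
        }
        if numeric:
            # Describe numeric columns with simple summary statistics.
            info.update(min=min(present), max=max(present),
                        mean=mean(present), median=median(present))
        else:
            # Explore categorical columns through their distinct values.
            info["distinct"] = sorted(set(present))
        report[col] = info
    return report


summary = profile(orders)
for column, info in summary.items():
    print(column, info)
```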
Fill Missing
Data
Treat
Outliers
Standardize
Data
Normalize
Data
Cleaning
the
Data
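A minimal sketch of the cleaning steps named above, using only the Python standard library. The choices made here (median for missing values, the 1.5 x IQR rule for outliers, z-score for standardizing, min-max for normalizing) are common defaults assumed for illustration, and the sample numbers are invented.

```python
from statistics import mean, median, pstdev, quantiles


def fill_missing(values):
    """Replace None with the median of the observed values."""
    observed = [v for v in values if v is not None]
    mid = median(observed)
    return [mid if v is None else v for v in values]


def clip_outliers(values, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [min(max(v, low), high) for v in values]


def standardize(values):
    """Z-score: zero mean, unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def normalize(values):
    """Min-max scale into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


raw = [10, 12, None, 11, 13, 95, 12]  # hypothetical daily order counts
filled = fill_missing(raw)
clipped = clip_outliers(filled)
z_scores = standardize(clipped)
scaled = normalize(clipped)
print(filled, clipped, scaled, sep="\n")
```

Standardizing and normalizing are alternatives rather than a sequence; which one fits depends on the model used later.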
Feature Engineering
Select the Model
Apply the Model
Building the Model
Identify Evaluation
Metrics
Prepare Cross-validation
Sets
Evaluate the
Model
Evaluating
the Model
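The three evaluation steps above can be illustrated with a small k-fold cross-validation loop. This is a sketch under assumptions: accuracy is the chosen metric, the "model" is only a majority-class baseline, and the labels are invented. In practice the data would normally be shuffled before splitting.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if not start <= i < start + size]
        yield train_idx, test_idx
        start += size


def accuracy(y_true, y_pred):
    """Evaluation metric: fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_class(labels):
    """Baseline 'model': always predict the most common training label."""
    return max(set(labels), key=labels.count)


labels = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # hypothetical binary outcomes

scores = []
for train_idx, test_idx in k_fold_indices(len(labels), k=5):
    prediction = majority_class([labels[i] for i in train_idx])
    y_true = [labels[i] for i in test_idx]
    scores.append(accuracy(y_true, [prediction] * len(y_true)))

mean_accuracy = sum(scores) / len(scores)
print(scores, mean_accuracy)
```

Any real model should beat this baseline score before it is worth presenting.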
Engage the
Audience
Be Creative
Know the
Audience
Craft a
Data Story
Presenting
the Insights
Model
Scoring
Performance
Monitoring
Deploying the Model
Data
Architecture
DB, ETL,
DWH & BI
Case
Studies
Associative Mining
in Retail Banking
Anomaly Detection
in AML
Automation in
BPM
Flight Prediction
Activity
Time - II
Build a high-level conceptual architecture for
my business to apply analytics:
• Operational System
• ETL Pipelines
• Data Warehouse/Data Lake
• Business Intelligence/Data Science
AR Pizza Chain
C. Tools &
Techniques
Techniques
Linear Regression
Logistic Regression
Decision Trees
Naïve Bayes
Associative Mining
Clustering
Anomaly Detection
Text Analytics
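To make the first technique in the list concrete, here is a minimal sketch of simple linear regression fitted by ordinary least squares. The function name `fit_line` and the data are hypothetical; the example has one input variable only and is not meant as a production implementation.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y divided by the variance of x gives the slope.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: promotions run per week vs. pizzas sold (in hundreds).
promotions = [1, 2, 3, 4, 5]
pizzas_sold = [3, 5, 7, 9, 11]

slope, intercept = fit_line(promotions, pizzas_sold)
forecast = slope * 6 + intercept  # predicted sales for 6 promotions
print(slope, intercept, forecast)
```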
Tools
Business Layer
Data Analytics
Tools
Data Ingestion
Tools
Data
Transformation
Tools
Data
Presentation
Tools
D. Case Study
• Pick an existing business process
• Identify analytics opportunity
• Hypothesize transformed
business process
• Present the business case
