Intended for Knowledge Sharing only
PREDICTIVE ANALYTICS & BUSINESS INSIGHTS SUMMIT
Mar 2016
Disclaimer:
Participation in this summit is purely on a personal basis and does not represent VISA in any form or
manner. The talk is based on learnings from work across industries and firms. Care has been taken to
ensure that no proprietary or work-related information of any firm is used in any material.
REAL TIME ANALYTICS
AS THEY ARE ENVISIONED TODAY…
SPEED PRECISION POWER
…BUT IT HAS GROWN TO
SPEED PRECISION POWER
DISTANCE
PAYLOADS
RE-USABLE
MISSION
LONGEVITY
OH MY…
HOUSTON, WE HAVE A PROBLEM!
ARE YOU SURE IT’S POSSIBLE IN BUSINESS WORLD?
AN EXAMPLE FROM OUR BUSINESS WORLD
...sync with business hours, predictive alternative means, nearby businesses instead,
book an online appointment for future, mail/call instead, suggest virtual interaction,
discovery
LET’S SEE IT IN ACTION…
ADOBE CAPTURED IT PERFECTLY…
HOW COULD IT HAVE BEEN AVOIDED
No knee-jerk reaction
Statistical significance
Cross validation across multiple data sources
Explanation of the drivers
Proper response mechanism
HOW REAL IS REAL TIME ANALYTICS?
UNITED BREAKS GUITARS
OK AGREED, BUT WHAT ARE THE OTHER USE CASES?
FUNCTIONS AND TYPICAL USE CASES
OPERATIONAL
• System downtime, user experience issues, API failures, load times, etc. – by regions, products, browsers, devices, etc.
FRAUD
• Fraud rates, types, amount, hacking, system compromise, gaming/misuse, etc.
PRODUCT LAUNCHES
• New Product/Flow/App/Feature/Plug-ins performance, issues
• User Behavioral changes
MARKETING CAMPAIGNS
• Campaign usage & inventory management – popular/flop/gaming
SALES
• Recommendation engines – Cross/Up sell
• New Product sales
• Inventory Management
BRAND MANAGEMENT
• Social Media Monitoring – VOC, NPS, SOV (a trending issue or opportunity)
HOW DO WE PULL IT OFF?
CURRENT ANALYTICAL FRAMEWORK NEEDS END-TO-END OPTIMIZATION…
Problem Statement -> Setting up right Analytical Framework -> Data Collection & Preparation -> Analysis -> Action
Problem Statement
1 Strategy
• Type of functional use case
• Objective & strategic measurements (& impact on Corporate KPI)
• Analyses, Alert thresholds, impact sizing
2 Execution
• Command-Control (Working Group)
• Communication protocols & methods
• Response Framework (Approvals)
• Fall back options, alternatives, ramps
3 Organizational Transformation
• People-Process-Technology-Culture
CURRENT ANALYTICAL FRAMEWORK NEEDS END-TO-END OPTIMIZATION…
Setting up right Analytical Framework
Type of reporting: Statistical Process Controls (deviation from mean, median, expected values, benchmarking)
Other techniques required: A/B Testing, VOC, Social Media Monitoring, Mining of patterns, etc.
Sizing & Prioritization of issues depending on impact on corporate KPIs
Types of alerts based on metric: Statistical significance of deviation, consistency (VOC, Social), absolute count thresholds (statistical significance calculation based), benchmarking
Level of explanation required: Multi-level drilldown, early warning indicators and data points to cross-validate with
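A minimal sketch of the statistical process control idea on this slide, assuming a simple band of mean ± k standard deviations over a recent baseline. The function names, the value of k, the minimum sample count and the sample data are all illustrative assumptions, not part of the talk.

```python
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Return (lower, upper) limits as mean +/- k sample standard deviations."""
    center = mean(baseline)
    spread = stdev(baseline)
    return center - k * spread, center + k * spread


def check_metric(value, baseline, k=3.0, min_count=30):
    """Classify a new observation against the baseline control band.

    min_count is a hypothetical absolute count threshold: with too few
    baseline points the band is not trusted and no alert is raised.
    """
    if len(baseline) < min_count:
        return "insufficient_data"
    lower, upper = control_limits(baseline, k)
    if value < lower:
        return "below_band"
    if value > upper:
        return "above_band"
    return "in_control"


# Hypothetical baseline: hourly API failure counts.
baseline = [12, 15, 11, 14, 13, 12, 16, 14, 13, 15] * 3
print(check_metric(14, baseline))
print(check_metric(40, baseline))
```

A band breach on its own is only a trigger; the slide's other items (consistency across VOC and social signals, benchmarking, drilldown) would still be needed before acting.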
CURRENT ANALYTICAL FRAMEWORK NEEDS END-TO-END OPTIMIZATION…
Data Collection & Preparation
Data ingestion: Volume, Variety (OLTP, Clickstream, Social, Server Logs, Campaign, Industry, Search traffic, Devices, Regions), Velocity & Value
Data blending: Ability to blend these sources quickly and at scale to arrive at a complete view
Data Governance: Data Quality (monitoring to ensure the data feed is reliable and sensible, and is not itself the issue), Data Lineage (ability to backtrack and confirm the data is what it is supposed to be) and Data Understanding (knowing the usage the data was intended for).
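The Data Quality point above (checking that the feed itself is reliable before trusting an alert) could look roughly like the sketch below. The field names, the 5-minute lag limit and the 5% null-rate limit are assumptions chosen for illustration only.

```python
from datetime import datetime, timedelta


def check_feed(records, now, max_lag=timedelta(minutes=5), max_null_rate=0.05,
               required_fields=("event_time", "region", "amount")):
    """Return a list of data quality issues found in a batch of feed records."""
    if not records:
        return ["empty_feed"]
    issues = []
    # Freshness: the newest event should be recent enough.
    times = [r["event_time"] for r in records if r.get("event_time") is not None]
    if not times:
        issues.append("no_timestamps")
    elif now - max(times) > max_lag:
        issues.append("stale_feed")
    # Completeness: required fields should rarely be missing.
    for field in required_fields:
        missing = sum(1 for r in records if r.get(field) is None)
        if missing / len(records) > max_null_rate:
            issues.append(f"high_null_rate:{field}")
    return issues


# Hypothetical usage with a fixed clock.
now = datetime(2016, 3, 1, 12, 0)
fresh = [{"event_time": now - timedelta(minutes=1), "region": "US", "amount": 10.0}
         for _ in range(10)]
print(check_feed(fresh, now))
```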
CURRENT ANALYTICAL FRAMEWORK NEEDS END-TO-END OPTIMIZATION…
Analysis
Reporting: Depending on required analytical framework, audience, use case
A/B Testing: Analyze multiple variations and/or benchmark with current experience
Sizing & Investigation: Estimation of impact on Corporate KPI, Prioritization, ability to explain numbers and evolving patterns
Investigation: Cross Validation, Continued trends, benchmarking
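One common way to judge an A/B test of the kind listed above is a two-proportion z-test. The sketch below assumes conversion counts for a current experience (A) and a variation (B); the talk does not name a specific test, so this choice and the sample numbers are assumptions.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # No variation at all (every rate is 0 or 1): nothing to test.
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Standard normal CDF via the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


# Hypothetical counts: 200 of 10,000 convert on A, 260 of 10,000 on B.
z, p = two_proportion_z_test(200, 10_000, 260, 10_000)
print(f"z={z:.2f}, p={p:.4f}")
```

A small p-value only says the difference is unlikely to be noise; sizing the impact on the corporate KPI is a separate step.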
CURRENT ANALYTICAL FRAMEWORK NEEDS END-TO-END OPTIMIZATION…
Action
Mode of communication: Email/Text alerts, App Notifications, Calls?
Content (post investigation – cross validated, continuing, benchmarking):
- What has happened: Bands breached, Statistically Significant size, Threshold counts, trending topic
- Where & for whom: Region, Product Type, Flow, Browsers, Customer Segment
- How big: Dollar impact, impact on Corporate KPI
- Possible drivers: Based on data analyses, Domain expert input, working group
- Recommendation
Response Type: Approval to stop/continue/ramp/alternative – over mail/app/calls
Feedback Loop: Learning needs to be fed back into mainstream analytics
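The alert content listed above maps naturally onto a small record type. The sketch below is one possible shape, with alerts ordered by dollar impact; the class name, field names and sample values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    what: str                     # what has happened
    where: str                    # where and for whom
    dollar_impact: float          # how big
    drivers: list = field(default_factory=list)
    recommendation: str = ""

    def render(self):
        """Format the alert as plain text for email, text or app delivery."""
        return "\n".join([
            f"What happened: {self.what}",
            f"Where & for whom: {self.where}",
            f"How big: ${self.dollar_impact:,.0f}",
            f"Possible drivers: {', '.join(self.drivers) or 'under investigation'}",
            f"Recommendation: {self.recommendation or 'none yet'}",
        ])


def prioritize(alerts):
    """Order alerts so the largest dollar impact is handled first."""
    return sorted(alerts, key=lambda a: a.dollar_impact, reverse=True)


# Hypothetical alerts.
a1 = Alert("Load time band breached", "EU, mobile browsers", 5_000)
a2 = Alert("Checkout failures above threshold", "US, new flow", 120_000,
           drivers=["API timeout"], recommendation="Ramp down new flow")
print(prioritize([a1, a2])[0].render())
```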
TECHNOLOGICAL FRAMEWORK
DATA PROCESSING PIPELINE
DATA -> Ingest / Collect -> Store -> Process / Analyze -> Consume / Visualize -> Answers
DATA CATEGORIZATION
               HOT                 WARM                     COLD
Data Volume    MB-GB               GB-TB                    TBs
Item size      B-KB                KB-MB                    KB-TB
Latency        Millisec-sec        Minutes-hour             Hrs, Day
Durability     Low-Medium          High                     Very High
Maintenance    Very High           High                     Low
Applications   Real-time, Alerts   Analysis and reporting   Deep dive analysis and Machine learning
DATA EVOLUTION (MASLOW HIERARCHY OF NEEDS)
Batch: Reports
Real-time: Alerts
Prediction: Forecast
LAMBDA ARCHITECTURE
Aims to satisfy the need for a robust system that is fault-tolerant, both against hardware failures
and human mistakes, that can serve a wide range of workloads and use cases, and that supports
low-latency reads and updates. The resulting system should be linearly scalable.
1. All data entering the system is dispatched to both the batch layer and the speed layer for processing.
2. The batch layer has two functions: (1) managing the master dataset (an immutable, append-only set of raw data) and (2) pre-computing the batch views.
3. The serving layer indexes the batch views so that they can be queried with low latency.
4. The speed layer compensates for the high latency of updates to the serving layer and deals with recent data only.
5. Any incoming query can be answered by merging results from batch views and real-time views.
Reference: http://lambda-architecture.net/
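The five steps above can be illustrated with a toy, single-process sketch that counts events per key. It is only a model of the data flow: a real deployment would run the layers on separate systems, and the class and method names here are assumptions made for illustration.

```python
from collections import Counter


class LambdaCounter:
    """Toy event counter following the batch / speed / serving split."""

    def __init__(self):
        self.master = []                 # append-only master dataset
        self.batch_view = Counter()      # precomputed by the batch layer
        self.realtime_view = Counter()   # recent data only, from the speed layer

    def ingest(self, key):
        # Step 1: every event goes to both the batch and the speed layer.
        self.master.append(key)
        self.realtime_view[key] += 1

    def run_batch(self):
        # Step 2: recompute the batch view from the whole master dataset.
        self.batch_view = Counter(self.master)
        # Step 4: the speed layer only needs data the batch view has not seen,
        # so its view is discarded once the batch run has caught up.
        self.realtime_view = Counter()

    def query(self, key):
        # Step 5: merge the batch view with the real-time view.
        return self.batch_view[key] + self.realtime_view[key]


store = LambdaCounter()
for event in ["login", "login", "purchase"]:
    store.ingest(event)
store.run_batch()
store.ingest("login")
print(store.query("login"))
```

Recomputing the batch view from scratch is what gives the design its tolerance to human mistakes: a buggy view can be rebuilt from the untouched master dataset.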
LAMBDA ARCHITECTURE
[Diagram. Layers: Batch Layer, Speed Layer, Access Layer. Labels: New data stream, HADOOP, All data (HDFS), Enriched data, SPARK, Data Stream, Increment Views, Query]
LAMBDA ARCHITECTURE – WITH BENCHMARKS
[Diagram. Layers: Batch Layer, Speed Layer, Access Layer. Labels: New data stream, HADOOP, All data (HDFS), Enriched, SPARK, Data Stream, Benchmarks (rules engine), Alerts]
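One way to read the "benchmarks (rules engine)" boxes: benchmarks are derived from historical data, and incoming events are compared against them to raise alerts. The sketch below assumes median-based benchmarks and ratio rules; the metric names, rule format and numbers are hypothetical and not taken from the diagram.

```python
from statistics import median


def build_benchmarks(history):
    """Batch side: derive one benchmark per metric from historical values."""
    return {metric: median(values) for metric, values in history.items()}


def evaluate(stream, benchmarks, rules):
    """Stream side: yield an alert when a value exceeds its rule's limit."""
    limits = {
        rule["metric"]: rule["max_ratio"] * benchmarks[rule["metric"]]
        for rule in rules
        if rule["metric"] in benchmarks
    }
    for metric, value in stream:
        limit = limits.get(metric)
        if limit is not None and value > limit:
            yield {"metric": metric, "value": value, "limit": limit}


# Hypothetical history, rules and incoming events.
history = {"fraud_rate": [0.010, 0.012, 0.011], "page_load_ms": [800, 900, 850]}
rules = [
    {"metric": "fraud_rate", "max_ratio": 1.5},
    {"metric": "page_load_ms", "max_ratio": 2.0},
]
stream = [("fraud_rate", 0.012), ("fraud_rate", 0.020),
          ("page_load_ms", 1600), ("page_load_ms", 1800), ("unknown", 5)]

alerts = list(evaluate(stream, build_benchmarks(history), rules))
for alert in alerts:
    print(alert)
```

Keeping the rules as plain data means thresholds can be changed without touching the stream-processing code.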
IN CONCLUSION…
WHY DO WE THINK THE TIME IS NOW?
Evolution in the value prop of Real Time Analytics:
What/where/how much (Descriptive) -> what can happen (Predictive) -> what should we do (Prescriptive)?
Audience has broadened (From Operational to other key functions)
Demands on RoI have gone up
Data Mining is maturing enough to be used to answer “Real time Pattern identifications”
KPI of Analytics has changed from Turn-Around-Time (TAT) to Time-to-Action (TTA)
KEY TAKEAWAYS
• “Know” that Real Time Analytics is a need, not a luxury
• “Must have” a strong Strategic, Tactical & Organizational framework
• “Ensure” Cross validation, Sizing & Prioritizing
• “Develop” Command-Control Structure & Working Group to ensure “rapid but right” response
• “Prepare” for evolution of Real Time Analytics closer towards Artificial Intelligence
Appendix
RAMKUMAR RAVICHANDRAN
Director, Insights at Visa, Inc.
Enable Decision Making at the Executives/Product/Marketing level via actionable insights derived from Data.
BHARATHIRAJA CHANDRASEKHARAN
Data Warehouse Architect at Visa, Inc.
Architect a data-shop in Hadoop to get 360-degree view of the interaction. Technology interface for the Data Stakeholder Community.
THANK YOU!
Would love to hear from you on any of the following forums…
https://twitter.com/decisions_2_0
http://www.slideshare.net/RamkumarRavichandran
https://www.youtube.com/channel/UCODSVC0WQws607clv0k8mQA/videos
http://www.odbms.org/2015/01/ramkumar-ravichandran-visa/
https://www.linkedin.com/pub/ramkumar-ravichandran/10/545/67a
https://www.linkedin.com/in/dataisbig
http://bigdatadw.blogspot.com/
BHARATHIRAJA CHANDRASEKHARAN
RAMKUMAR RAVICHANDRAN

Real time analytics in Big Data
