Advanced Analytics in Banking
Juan M. Huerta
Global Decision Management
VP Advanced Analytics
Citibank
I will talk about…
• Big Data Adoption process at Citi
• Realizing the Technical Value of Big Data
• Global Solutions
140 countries
200 million accounts
Citi: A Customer Centered Organization
As a customer-centered bank, the goal of our Big Data strategy is to shift
the focus from independent vertical silos to Common Horizontal Solutions
focused around Citi’s 200 million customer accounts
Big Data Adoption Stakeholders
• Lines of Business
• Strategy & Decision Management Organizations: cross-LOB & Geo, global
• Data Innovation Office: Governance & Regulatory
• CitiData – Big Data & Analytics Engineering
Big Data Adoption Roadmap
Adoption will not occur at once. The level of capability maturity across the
organization will vary significantly.
In theory, we think in terms of the Staged Competencies of a Big Data
Maturity Model.
In practice, a hybrid process, which fits the level of maturity of
participants, is needed.
• Common Data
• Common Analytic Platform
• Common Tools & Techniques
• Common Solutions
• Common Focus
• Strategy
Big Data Adoption Hybrid Participation Model
• Novice: Proof of Concept
• Expert: R&D Environment
• Shadowed
End-to-end Analytic Process for a POC Project
This is one component of the hybrid model
R&D process
Ideas and Hypotheses
• Pipeline of ideas to use data for competitive advantage
Information Asset Inventory Navigator (“IAIN”)
• Robust, comprehensive ontology allowing analysts and economists to search, sort, and select data for analysis
R&D Project Approval
• Preliminary assessment for business value, data safekeeping and alignment to business practices
Data Acquisition
• Where necessary, acquire new data sets to support R&D project
Data Set Preparation & Provisioning
• Basic preparation of data set (e.g., consolidation, conformation)
• Permission-based provisioning of data set into a Big Data Analytics environment
Analytics Execution
• Advanced analytic tools mine business insight from large volumes of data
Analytics Peer Review
• Data scientist peers review model findings and results
Engineering / Production process
Product Approval
• Formal approval process through Business Steering Committee based on understanding expected use of production data
Data Transformation & Provisioning
• Transformation rules executed to normalize and conform production data
• Conformed data set made available in production environment
Production Model Development
• Develop scalable, productizable analytics
Model Deployment
• Exploit insights and analyses across the enterprise to maximize value
• Models measured for quality / usage
Analytics Knowledge Management
• Robust, comprehensive ontology allowing analysts and economists to search, sort, and select data for analysis
Advanced Global Solutions
• A global solution is a tested algorithm or analytic model that carries
out a particular business analysis and which is leveraged at a global
scale
• A big data global solution enables the interplay of complex algorithms
and large datasets
• When a global solution is built upon big data approaches, a delivery
roadmap should be considered
• In the exploratory process, a Global Solution is developed in the
Innovation R&D environment and validated through a POC process
• Alignment with Innovation, UAT, PRD environments
Technical Value of Big Data:
Benchmarks and Analysis
The Boom Driving Big Data is Technological
Source: Heebyung Koh, Christopher L. Magee, "A functional approach for studying technological progress: Extension to energy technology," Technological Forecasting and Social Change, Volume 75, Issue 6, July 2008, Pages 735–758
The Quadrant Of Analytic Opportunity
Run Time is affected by Data Size and Algorithmic Complexity
[Figure: datasets and algorithms placed along two axes, Data Size (10^5 to 10^10) and Algorithmic Complexity]
• Dataset labels: Mtg+Cards+Banking; Accounts Transaction features; Accounts Transactions; Branches Transactions; Accounts Summary Stats.; Employees Summary Stats.; GL-GOCS GL-Entries; Branches Summary Stats.
• Algorithm labels: Sequence Mining; Predictive filtering; Latent Dirichlet Allocation; HMM Baum-Welch, O(ns nf nt); CART, O(nf ns log ns); Iterative SVD-CF; K-means; Logistic Regression; PCA; PageRank; Self-Org. Maps; Neural Nets; Collaborative Filtering (CF); HMM; Conditional Random Fields; Support Vector Machines
• Region labels: Database Interaction; Vector based Approaches; Traditional Statistical; Machine Learning; Big Data/Pattern Mining
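To make the two axes concrete, the sketch below estimates how a rough operation count grows with data size for the O(nf ns log ns) cost that appears next to CART on the slide. The feature count of 50 and the function name `cart_cost` are illustrative assumptions, not values from the talk, and constant factors are ignored, so only the relative growth is meaningful.

```python
import math

def cart_cost(n_samples: int, n_features: int) -> float:
    """Rough operation count for a cost of O(nf * ns * log ns)."""
    return n_features * n_samples * math.log2(n_samples)

N_FEATURES = 50  # hypothetical feature count, not taken from the slides

# Cost at each data size, relative to the smallest size on the chart (10^5).
baseline = cart_cost(10**5, N_FEATURES)
relative_cost = {
    exp: cart_cost(10**exp, N_FEATURES) / baseline for exp in range(5, 11)
}

for exp, ratio in relative_cost.items():
    print(f"ns = 10^{exp}: about {ratio:,.0f}x the cost at 10^5")
```

Even for this comparatively cheap algorithm, moving from 10^5 to 10^10 records multiplies the estimated work by about 200,000, which is the kind of growth the quadrant is meant to convey.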
Breaking down the gains of P13n:
A Controlled Incremental Benchmark on a
Workstation grade processor (x500)
Implemented an incremental-SVD (Netflix Prize) predictive model that
runs on midsize datasets…
• Compiled code (vs. interpreted): x30
• In memory (vs. disk access): x4
• Multithread (vs. single thread): x3.12
• Workstation grade processor: x1.3
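The headline x500 figure can be checked against the individual factors. The sketch below assumes the four gains are independent and therefore multiply, which the slide implies but does not state; the dictionary keys are just labels for the slide's bullets.

```python
import math

# Speedup factors as reported on the slide.
factors = {
    "compiled code (vs. interpreted)": 30.0,
    "in memory (vs. disk access)": 4.0,
    "multithread (vs. single thread)": 3.12,
    "workstation grade processor": 1.3,
}

# Assumption: the gains compound multiplicatively.
total_speedup = math.prod(factors.values())
print(f"combined speedup: x{total_speedup:.1f}")
```

Under that assumption the product is roughly x487, consistent with the rounded x500 in the slide title.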
Basic Map Reduce Benchmarks
[Figure: Impact of overhead as a function of input volume]
[Figure: Relative Map Throughput as a function of # Mappers. Axes: Relative Map CPU time speedup vs. Number of Maps; Tokens per Wall Clock Second vs. Number of Maps, each with a linear fit]
Series values shown in the legend: 0.003351955, 0.032258065, 0.319148936, 1, 2.631578947, 21.12676056
HAMSTER: Hadoop Multi-signature Search
for Text-based Entity Retrieval
• Core algorithm: String Edit Distance O(mnk^2)
• Baseline runs at 100 matches per day
• HAMSTER speedup: 33x (5-node speedup) × 60x (Java speedup) ≈ 2000x faster
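For readers unfamiliar with the core algorithm, here is a minimal sketch of string edit distance (Levenshtein distance) computed by dynamic programming, plus a brute-force best-match search. It is not the HAMSTER implementation: the function names are hypothetical, the multi-signature search is not shown, and this version costs O(mn) for one pair of strings of lengths m and n.

```python
def edit_distance(source: str, target: str) -> int:
    """Levenshtein distance via dynamic programming, O(m * n) per pair."""
    m, n = len(source), len(target)
    # previous[j] holds the distance between source[:i-1] and target[:j].
    previous = list(range(n + 1))
    for i in range(1, m + 1):
        current = [i] + [0] * n
        for j in range(1, n + 1):
            substitution = previous[j - 1] + (source[i - 1] != target[j - 1])
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current[j] = min(substitution, deletion, insertion)
        previous = current
    return previous[n]

def best_match(source_item: str, targets: list[str]) -> tuple[str, int]:
    """Return the target closest to source_item, with its distance."""
    scored = ((t, edit_distance(source_item, t)) for t in targets)
    return min(scored, key=lambda pair: pair[1])

print(best_match("citibank", ["citybank", "chase"]))
```

Because every source item is compared against every target independently, the comparisons can be split across map tasks, which is what makes the problem a natural fit for Hadoop.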
| Source Items | Target Items | Source items per target | Input Size | MAP Records | Cluster Max Map Tasks | Effective Map Tasks | CPU map (secs) | Wall time |
|---|---|---|---|---|---|---|---|---|
| 34k | 618k | 100 | 4.40GB | 345 | 33 | 33 | 196k | 2h 14 secs |
| 34k | 618k | 50 | 8.8GB | 690 | 40 | 66 | 196k | 1h 47 min |
| 34k | 618k | 30 | 14.6GB | 1,149 | 40 | 110 | 199k | 1h 39 min |
Leveraging Big Data Global Solutions
Creating Global Big Data solutions
Our goal is to evolve from Big Data algorithms to Big Data
Solutions
Example of Advanced Global Solution Matrix
• Analytic patterns: Outlier Detection; Multivariate Segmentation; Sequence Matching; Network Analysis; Structured Prediction
• Business area labels (as extracted): Customer, Contextual, Clickstream, Action, Marketing, Risk/Fraud, Digital
• Example cell: K-Medoids Clustering
Example: Transactional Time Series
Anomalous Behavior
On Demand Simulation: Generate Branches’ DNA
• Case Scenario: Unusual number of cash advances by 2 tellers.
[Figure panels: Original branch (August); Single day fraud; Multi day fraud]
Creating Regions of Interest based on On-Demand-Simulation
Minimum-Spanning-Tree based branch association for region of interest generation
[Figure labels: Multi-day fraud simulation; Original branch; Region of interest]
• Numbers shown are randomized indices
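As an illustration of minimum-spanning-tree based association, the sketch below links branches by building an MST over pairwise distances with Prim's algorithm. The slides do not say which distance or MST algorithm was used; Euclidean distance, the two-dimensional branch profiles, and the function name are all assumptions made for the example.

```python
import math

def minimum_spanning_tree(
    points: dict[str, tuple[float, ...]],
) -> list[tuple[str, str, float]]:
    """Prim's algorithm on the complete graph of Euclidean distances."""
    names = list(points)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge that connects the tree to a new branch.
        u, v, d = min(
            ((a, b, math.dist(points[a], points[b]))
             for a in in_tree for b in names if b not in in_tree),
            key=lambda edge: edge[2],
        )
        in_tree.add(v)
        edges.append((u, v, d))
    return edges

# Hypothetical branch profiles (e.g. normalized transaction statistics).
branches = {
    "A": (0.0, 0.0),
    "B": (1.0, 0.0),
    "C": (1.0, 1.0),
    "D": (5.0, 5.0),
}
tree = minimum_spanning_tree(branches)
for u, v, d in tree:
    print(f"{u} - {v}: {d:.2f}")
```

One plausible way to turn such a tree into regions of interest is to cut its longest edges, which here would separate branch D from the tightly connected A, B, C group; whether the original work did this is not stated.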
Conclusion: Lessons Learned
• One Size does not fit all
• Follow a Hybrid Approach
• Leverage Analytic patterns: Global Solutions
• Big Data is about Parallelization
• The future: expensive Algorithms applied to large datasets
• Global Solutions are the combination of algorithmic building blocks
applied to specific business problems
Thank You!
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 

Recently uploaded (20)

The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 

Advanced Analytics in Banking, CITI

  • 1. Advanced Analytics in Banking. Juan M. Huerta, Global Decision Management, VP Advanced Analytics, Citibank
  • 2. I will talk about… • Big Data Adoption process at Citi • Realizing the Technical Value of Big Data • Global Solutions
  • 4. Citi: A Customer Centered Organization. As a customer-centered bank, the goal of our Big Data strategy is to shift the focus from independent vertical silos to Common Horizontal Solutions focused around Citi’s 200-million customer accounts
  • 5. Big Data Adoption Stakeholders • Lines of Business • Strategy & Decision Management Organizations: cross LOB & Geo, global • Data Innovation Office: Governance & Regulatory • CitiData – Big Data & Analytics Engineering
  • 6. Big Data Adoption Roadmap. Adoption will not occur all at once. The level of capability maturity across the organization will vary significantly. In theory, we think in terms of Staged Competencies of a Big Data Maturity Model. In practice, a hybrid process that fits the maturity level of the participants is needed. Diagram labels: Common Data; Common Analytic Platform; Common Tools & Techniques; Common Solutions; Common Focus Strategy
  • 7. Big Data Adoption Hybrid Participation Model • Novice: Proof of Concept • Expert: R&D Environment • Shadowed
  • 8. End-to-end Analytic Process for a POC Project. This is one component of the hybrid model. Ideas and Hypotheses; Information Asset Inventory Navigator (“IAIN”) • Pipeline of ideas to use data for competitive advantage • Robust, comprehensive ontology allowing analysts and economists to search, sort, and select data for analysis • Preliminary assessment for business value, data safekeeping and alignment to business practices. Data Transformation & Provisioning • Transformation rules executed to normalize and conform production data • Conformed data set made available in production environment. Production Model Development • Develop scalable, productizable analytics. Model Deployment • Exploit insights and analyses across the enterprise to maximize value • Models measured for quality / usage • Formal approval process through Business Steering Committee based on understanding expected use of production data. Process labels: R&D process; R&D Project Approval; Product Approval; Engineering / Production process. Analytics Knowledge Management • Robust, comprehensive ontology allowing analysts and economists to search, sort, and select data for analysis. Data Set Preparation & Provisioning • Basic preparation of data set (e.g., consolidation, conformation) • Permission-based provisioning of data set into a Big Data Analytics environment. Analytics Execution • Advanced analytic tools mine business insight from large volumes of data • Data scientist peers review model findings and results. Analytics Peer Review. Data Acquisition • Where necessary, acquire new data sets to support R&D project
  • 9. Advanced Global Solutions • A global solution is a tested algorithm or analytic model that carries out a particular business analysis and is leveraged at a global scale • A big data global solution enables the interplay of complex algorithms and large datasets • When a global solution is built upon big data approaches, a delivery roadmap should be considered • In the exploratory process, a Global Solution is developed in the Innovation R&D environment and validated through a POC process • Alignment with Innovation, UAT, PRD environments
  • 10. Technical Value of Big Data: Benchmarks and Analysis
  • 11. The Boom Driving Big Data is Technological. Source: Heebyung Koh, Christopher L. Magee, "A functional approach for studying technological progress: Extension to energy technology," Technological Forecasting and Social Change, Volume 75, Issue 6, July 2008, Pages 735–758
  • 12. The Quadrant of Analytic Opportunity. Run time is affected by data size and algorithmic complexity. [Figure: algorithms and data sets plotted by Algorithmic Complexity against Data Size, with ticks from 10^5 to 10^10 and a Database Interaction label.] Data set labels (as extracted): Mtg+Cards+Banking Accounts Transaction features; Accounts Transactions; Branches Transactions; Accounts Summary Stats.; Employees Summary Stats.; GL-GOCS GL-Entries; Branches Summary Stats. Algorithm groups: Traditional Statistical; Machine Learning; Big Data/Pattern Mining. Algorithms shown: Sequence Mining; Predictive Filtering; Latent Dirichlet Allocation; HMM Baum-Welch, O(ns nf nt); CART, O(nf ns log ns); Iterative SVD-CF; K-means; Logistic Regression; PCA; PageRank; Self-Organizing Maps; Neural Nets; Collaborative Filtering (CF); Vector-based Approaches; HMM; Conditional Random Fields; Support Vector Machines
  • 13. Breaking down the gains of P13n: A Controlled Incremental Benchmark on a Workstation-grade Processor (x500). Implemented an incremental-SVD (Netflix Cup) predictive model that runs on midsize datasets… • Compiled code (vs. interpreted): x30 • In memory (vs. disk access): x4 • Multithreaded (vs. single thread): x3.12 • Workstation-grade processor: x1.3
  • 14. Basic Map Reduce Benchmarks. [Charts; data points not recoverable from the extraction: (a) impact of overhead as a function of input volume; (b) relative Map throughput as a function of the number of mappers, plotted as relative Map CPU time speedup vs. number of maps with a linear fit; (c) tokens per wall-clock second vs. number of maps with a linear fit.]
  • 15. HAMSTER: Hadoop Multi-signature Search for Text-based Entity Retrieval • Core algorithm: String Edit Distance O(mnk2) • Baseline runs at 100 matches per day • HAMSTER speedup: 33x (5-node speedup) × 60x (Java speedup) ≈ 2000x faster
    Source Items | Target Items | Source items per target | Input Size | MAP Records | Cluster Max Map Tasks | Effective Map Tasks | CPU map (secs) | Wall time
    34k | 618k | 100 | 4.40GB | 345 | 33 | 33 | 196k | 2h 14 secs
    34k | 618k | 50 | 8.8GB | 690 | 40 | 66 | 196k | 1h 47 min
    34k | 618k | 30 | 14.6GB | 1,149 | 40 | 110 | 199k | 1h 39 min
  • 16. Leveraging Global Big Data Global Solutions
  • 17. Creating Global Big Data Solutions. Our goal is to evolve from Big Data algorithms to Big Data Solutions
  • 18. Example of Advanced Global Solution Matrix. [Matrix figure; labels as extracted: Outlier Detection; Multivariate Segmentation; Sequence Matching; Network Analysis; Customer; Contextual; Clickstream; Action; Marketing; Risk/Fraud; Digital; Structured Prediction; K-Medoids Clustering]
  • 19. Example: Transactional Time Series Anomalous Behavior
  • 20. On Demand Simulation: Generate Branches’ DNA • Case Scenario: Unusual number of cash advances by 2 tellers. [Figure panels: Original branch (August); Single-day fraud; Multi-day fraud]
  • 21. Creating Regions of Interest based on On-Demand Simulation. Minimum-Spanning-Tree based branch association for region of interest generation. [Figure labels: Multi-day fraud simulation; Original branch; Region of interest] • Numbers shown are randomized indices
  • 22. Conclusion: Lessons Learned • One size does not fit all • Follow a Hybrid Approach • Leverage Analytic patterns: Global Solutions • Big Data is about Parallelization • The future: expensive Algorithms applied to large datasets • Global Solutions are the combination of algorithmic building blocks applied to specific business problems
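
As a worked check on slide 13, the four benchmark factors listed there (30, 4, 3.12 and 1.3) can be multiplied to see how they relate to the headline x500. The sketch below is only arithmetic on the slide's numbers. The dictionary keys are descriptive labels and the pairing of each factor with a label is an assumption read from the slide layout; the product does not depend on that pairing.

```python
from math import prod

# Speedup factors quoted on slide 13. The label attached to each factor is an
# assumption read from the slide layout; the product is the same either way.
FACTORS = {
    "compiled code (vs. interpreted)": 30.0,
    "in memory (vs. disk access)": 4.0,
    "multithreaded (vs. single thread)": 3.12,
    "workstation-grade processor": 1.3,
}


def combined_speedup(factors):
    """Multiply independent speedup factors into one overall factor."""
    return prod(factors)


total = combined_speedup(FACTORS.values())
print(f"combined speedup: x{total:.2f}")
```

The product comes to roughly x487, which the slide rounds to x500. Multiplying the factors assumes that each gain is independent of the others, which is how the slide presents them.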
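
Slide 15 names string edit distance as the core of HAMSTER. A minimal sketch of the textbook Levenshtein dynamic program is shown below to make that building block concrete. It is not the HAMSTER implementation: it runs in O(mn) time for strings of length m and n, it does not model the extra factor in the slide's complexity figure, and the function name `edit_distance` is chosen here for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b using two rolling rows."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, ch_b in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ch_a == ch_b else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


# The two speedup factors quoted on slide 15, combined by multiplication.
hamster_speedup = 33 * 60

print(edit_distance("kitten", "sitting"))
print(hamster_speedup)
```

Each source item and target item pair is scored independently, so the pairs can be split across map tasks, which is the kind of workload the table on slide 15 describes. The two quoted factors multiply to 1980, which the slide rounds to 2000x.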
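
Slide 21 mentions minimum-spanning-tree based branch association for generating regions of interest. The sketch below shows one way this could look, under stated assumptions: each branch is a numeric feature vector, association strength is Euclidean distance, the tree is built with Prim's algorithm, and a region of interest is taken to be every branch within a few tree hops of a seed branch. The feature vectors, the hop rule and both function names are hypothetical and do not come from the slides.

```python
import heapq
from collections import deque
from math import dist


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j, weight) edges."""
    n = len(points)
    if n == 0:
        return []
    visited = [False] * n
    edges = []
    heap = [(0.0, 0, 0)]  # (weight, from_index, to_index); start at point 0
    while heap and len(edges) < n - 1:
        weight, u, v = heapq.heappop(heap)
        if visited[v]:
            continue
        visited[v] = True
        if u != v:
            edges.append((u, v, weight))
        for x in range(n):
            if not visited[x]:
                heapq.heappush(heap, (dist(points[v], points[x]), v, x))
    return edges


def region_of_interest(edges, seed, max_hops):
    """Indices reachable from seed within max_hops edges of the tree (hypothetical rule)."""
    neighbours = {}
    for u, v, _ in edges:
        neighbours.setdefault(u, []).append(v)
        neighbours.setdefault(v, []).append(u)
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue
        for nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return seen


# Toy branch feature vectors, invented for illustration only.
branches = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
tree = minimum_spanning_tree(branches)
print(tree)
print(sorted(region_of_interest(tree, seed=0, max_hops=2)))
```

A tree keeps only the strongest association for each branch, so a hop-limited neighbourhood stays small even when many branches look broadly similar.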