Harvard Business Club presentation
_______________________________
Big Data Case Studies
Nitin Kabra
Agenda
Case Study 1 – Customer Risk Profiling
Case Study 2 – Trade Surveillance and Reporting
Case Study 3 – Online Account Opening
Case Study 4 – Legacy Migration to Hadoop
Case Study 5 – ATM/Mobile adjustment data
Questions and next steps
Use Cases implementation
Customer Risk Profiling: A very large bank with several consumer lines of business needed to analyze customer activity across multiple channels and build a customer scoring model, based on behavioral analysis, to detect fraudulent activity (both real-time and batch).
Trade Surveillance & Reporting (DF, Volcker): The bank already captured trading activity and used that data to assess, predict, and manage risk for both regulatory and non-regulatory purposes.
Online Account Opening: Fraud detection cases. The bank wanted to focus on online account opening fraud.
Legacy migration to Hadoop: Make fraud detection case data available to build customer scores from various LOBs for risk management and customer retention.
ATM/Mobile adjustment data: Provide the adjustment data from ATM/mobile deposits to fraud analysts so that they can make decisions on the same day.
Case Study 1 – Customer Risk Profiling
Challenge
• A very large bank with several consumer lines of business needed to analyze customer activity across multiple products to predict credit risk with greater accuracy.
• Over the years, the bank had acquired a number of regional banks. Each of those banks had a checking and savings business, a home mortgage business, credit card offerings and other financial products.
• Those applications generally ran in separate silos. Each used its own database and application software, leaving a large number of independent systems that could not share data easily.
• With the economic downturn of 2008, the bank had significant exposure in its mortgage business to defaults by its borrowers. Better risk management was needed.
Solution
• The bank set up a single Hadoop cluster containing more than a petabyte of data collected from multiple enterprise data warehouses.
• With all of the information in one place, the bank added new sources of data, including customer call center recordings, chat sessions, emails to the customer service desk and others.
• It used pattern matching, text processing, sentiment analysis and graph creation to combine, digest and analyze the data.
Advantage
• The bank used the Hadoop cluster to construct a new and more accurate score of the risk in its customer portfolios. The score gives a clear picture of a customer’s financial situation, risk of default or late payment, and satisfaction with the bank and its services.
• The more accurate score allowed the bank to manage its exposure better and to offer each customer better products and advice.
• Hadoop increased revenue and improved customer satisfaction.
• The benefit was not just reduced cost relative to the existing system, but improved revenue from better risk management and customer retention.
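The slides do not say how the risk score is computed. As a minimal sketch only, the idea of folding payment behavior, credit usage and service sentiment into one number could look like the following. The field names, weights and the cap of six late payments are all hypothetical; a real model would be fitted to historical default data.

```python
from dataclasses import dataclass


@dataclass
class CustomerSignals:
    """Per-customer inputs from the consolidated cluster (hypothetical fields)."""
    late_payments_12m: int     # late payments across all products, last 12 months
    utilization: float         # credit used / credit available, 0.0 to 1.0
    negative_sentiment: float  # share of negative service interactions, 0.0 to 1.0


# Illustrative weights only (assumption, not from the slides).
WEIGHTS = {"late": 0.5, "utilization": 0.3, "sentiment": 0.2}


def clamp01(x: float) -> float:
    """Clamp a value into the range [0, 1]."""
    return min(max(x, 0.0), 1.0)


def risk_score(s: CustomerSignals) -> float:
    """Return a score in [0, 1]; higher means riskier."""
    late = clamp01(s.late_payments_12m / 6.0)  # cap so one signal cannot dominate
    return (WEIGHTS["late"] * late
            + WEIGHTS["utilization"] * clamp01(s.utilization)
            + WEIGHTS["sentiment"] * clamp01(s.negative_sentiment))


if __name__ == "__main__":
    print(risk_score(CustomerSignals(2, 0.8, 0.1)))
```

A linear weighted sum is used here only because it is easy to read; it stands in for whatever model the bank actually built.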
Case Study 2 – Trade Surveillance and Reporting (DF, Volcker)
Challenge
• The bank already captured trading activity and used that data to assess, predict, and manage risk for both regulatory and non-regulatory purposes.
• The very large volume of data, however, made it difficult to monitor trades for compliance, and virtually impossible to catch “rogue” traders, who engage in trades that violate policies or expose the bank to too much risk.
• Regulatory reporting under the new guidelines introduced after the 2008 market meltdown was proving a mammoth challenge.
Solution
• The bank built a Hadoop cluster that runs alongside its existing trading systems. The cluster receives copies of all of the trading data (margins, limits, exposure) and also holds information about the parties to each trade.
• It built a suite of novel algorithms, using statistical and other techniques, to monitor human and automated (program) trading.
• Using pattern matching techniques, the bank developed a very good picture of normal trading activity, and now watches for unusual patterns that may reflect rogue trading.
Advantage
• Detect a variety of illegal activity, including money laundering, insider trading, front-running, intra-day manipulation, marking the close and more.
• Fast detection allows the bank to protect itself from considerable losses.
• Hadoop increased revenue and improved customer satisfaction.
• The Hadoop cluster also helps the bank comply with financial industry regulations.
• Hadoop provides cost-effective, scalable, reliable storage so that the bank can retain records and deliver reports on activities for years, as required by law.
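The deck does not describe the surveillance algorithms themselves. One simple way to picture "learn normal activity, then watch for unusual patterns" is a z-score check of today's trades against a trader's own history. The sketch below assumes a hypothetical input shape (a list of past notionals and a list of `(trade_id, notional)` pairs) and an arbitrary threshold of three standard deviations.

```python
from statistics import mean, stdev


def unusual_trades(history, todays_trades, threshold=3.0):
    """Flag trades whose notional is far from the trader's historical norm.

    history:       past notionals for one trader (the baseline of normal activity)
    todays_trades: list of (trade_id, notional) pairs to screen
    threshold:     z-score above which a trade is flagged (assumed value)
    """
    if len(history) < 2:
        return []  # not enough data to define "normal"
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Every historical trade was identical; any deviation is unusual.
        return [tid for tid, n in todays_trades if n != mu]
    return [tid for tid, n in todays_trades if abs(n - mu) / sigma > threshold]


if __name__ == "__main__":
    baseline = [100, 110, 90, 105, 95]
    print(unusual_trades(baseline, [("T1", 102), ("T2", 500)]))
```

A production system would look at many more dimensions (counterparty, instrument, timing), but the baseline-then-deviation shape is the same.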
Case Study 3 – Online Account Opening
Challenge
• Fraud detection cases: the bank wanted to focus on online account opening fraud.
• Detection can be real-time or batch, and requires data from structured and unstructured sources.
• It requires identifying patterns, performing relational analysis, and defining rules for online account opening.
Solution
• User identity can be validated against past patterns in seconds.
• Connectivity with external credit rating agencies, using complex event processing (CEP) and semantic search, for continuous data flow.
• Ingest a high volume of requests into the Hadoop ecosystem.
• Identify patterns based on vast historical data stored in the Hadoop cluster from various sources (rating agencies, investigating agencies).
• Define rules based on geospatial and locational analysis and on the format of the requests sent.
Advantage
• Could analyze petabytes of data with latency within the SLA period.
• Reduced the cost of handling huge volumes.
• Further alert and event processing using CEP to take appropriate action on a fraudulent request.
• Quick analytics helped the bank track down the locations and regions of such fraudulent requests and identify a pattern in them.
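The rules themselves are not listed in the slides. As a hedged illustration of what a geospatial rule for account opening might look like, the sketch below compares the location of the request with the stated address using the haversine great-circle distance, and adds a simple per-device velocity rule. The dictionary keys, both thresholds and the reason codes are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def review_application(app, max_distance_km=500.0, max_requests_per_device=3):
    """Return a list of reason codes; an empty list means no rule fired.

    app is a dict with hypothetical keys:
      ip_location, stated_address: (lat, lon) tuples
      requests_from_device_24h:    int
    """
    reasons = []
    if haversine_km(*app["ip_location"], *app["stated_address"]) > max_distance_km:
        reasons.append("ip_far_from_address")
    if app["requests_from_device_24h"] > max_requests_per_device:
        reasons.append("device_velocity")
    return reasons


if __name__ == "__main__":
    suspicious = {
        "ip_location": (40.7128, -74.0060),      # New York
        "stated_address": (34.0522, -118.2437),  # Los Angeles
        "requests_from_device_24h": 5,
    }
    print(review_application(suspicious))
```

Returning reason codes rather than a yes/no answer keeps the output usable by a downstream alerting step.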
Case Study 4 – Legacy Migration to Hadoop
Challenge
• Fraud detection cases: collect data from various products and build a customer score.
• The data processing runs on legacy mainframes, state by state. It takes hours and delays downstream batch processing.
• The SAS Fraud Management product requires the data to be in a specific format, and the critical reformat step is time-consuming.
Solution
• The bank set up a single Hadoop cluster containing more than a petabyte of data collected from multiple data sources.
• The reformat process completes quickly because the processing is done on tables instead of file by file.
• With all of the information in one place, data from other sources is also easy to add.
• Pattern matching, text processing, sentiment analysis and graph creation are used to combine, digest and analyze the data.
Advantage
• The bank used the Hadoop cluster to construct a more accurate customer score. The score gives a clear picture of a customer’s financial situation, risk of default or late payment, and satisfaction with the bank and its services.
• The more accurate score allowed the bank to manage its exposure better and to offer each customer better products and advice.
• Hadoop increased revenue and improved customer satisfaction.
• The benefit was not just reduced cost relative to the existing system, but improved revenue from better risk management and customer retention.
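The slides do not give the target format that the downstream fraud product expects. Purely to illustrate what a reformat step over table rows can look like, the sketch below turns rows into fixed-width records. The layout, column names and widths are invented for the example and do not describe the real SAS Fraud Management input format.

```python
# Hypothetical layout: (column name, field width). Not the real product format.
LAYOUT = [("customer_id", 10), ("state", 2), ("score", 5)]


def to_fixed_width(row, layout=LAYOUT):
    """Render one table row (a dict) as a fixed-width record.

    Values are converted to text, truncated to the field width,
    and padded on the right with spaces.
    """
    parts = []
    for name, width in layout:
        text = str(row.get(name, ""))
        parts.append(text[:width].ljust(width))
    return "".join(parts)


if __name__ == "__main__":
    rows = [
        {"customer_id": "C123", "state": "NY", "score": 712},
        {"customer_id": "C456", "state": "CA", "score": 655},
    ]
    # The whole table is reformatted in one pass, rather than file by file.
    for record in (to_fixed_width(r) for r in rows):
        print(repr(record))
```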
Case Study 5 – ATM/Mobile Adjustment Data
Challenge
• Fraud detection cases: adjustment data from ATM/mobile deposits.
• The data is available to fraud analysts only on day 2 of the transaction.
• The current business process makes funds available to the customer as soon as the adjustment is posted to the account.
• Analysts have to toggle between multiple screens to get data from different sources when verifying transactions.
Solution
• Make the adjustment data available to fraud analysts as soon as possible, before funds are posted to the customer account.
• Make customer and transaction data available to fraud analysts in a single place.
• Ingest data from multiple sources into the Hadoop ecosystem.
• Identify patterns based on vast historical data stored in the Hadoop cluster from various sources.
• Block or close the account as soon as a fraudulent transaction is identified, preventing funds from leaving the bank.
Advantage
• Could load petabytes of data into Hadoop much faster than with traditional processes.
• Reduced the cost of handling huge volumes.
• Analysts gained the ability to identify patterns easily with all the data available in a single place.
• Quick analytics helped the bank track down the accounts and close or block them from operating further.
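How the bank decides which adjustments to hold is not described. As a rough sketch of a same-day check that runs before funds are posted, the function below holds an adjustment when its amount is far above anything seen on the account before. The `amount` key, the multiple of three and the 1000.0 limit for accounts with no history are assumptions made for illustration.

```python
def decide_adjustment(adjustment, past_amounts, multiple=3.0, no_history_limit=1000.0):
    """Return "hold" or "post" for a deposit adjustment, before funds are released.

    adjustment:       dict with a hypothetical "amount" key
    past_amounts:     earlier adjustment amounts on the same account
    multiple:         how far above the historical maximum counts as unusual (assumed)
    no_history_limit: fixed limit used when the account has no history (assumed)
    """
    if past_amounts:
        unusual = adjustment["amount"] > multiple * max(past_amounts)
    else:
        unusual = adjustment["amount"] > no_history_limit
    # A held adjustment goes to a fraud analyst instead of posting automatically.
    return "hold" if unusual else "post"


if __name__ == "__main__":
    print(decide_adjustment({"amount": 5000.0}, [200.0, 350.0]))
    print(decide_adjustment({"amount": 300.0}, [200.0, 350.0]))
```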
Traditional Landscape – Analytical Sources
[Architecture diagram; labels as they appear on the slide]
Information Ingestion: Traditional Sources (Database, Files), 3rd Party Applications (Database, Files)
Analytics Sources: Transformation Engines, Warehouse, Data Marts, Data Cubes
Shared Operational Information: Content Repository, Master Data Hubs, Reference Data Hubs
Information Governing Systems: Metadata Catalog
Information Consumption: Visualization, Reporting & Dashboards, Geospatial Analysis, Statistical Analysis, Scorecards & Metrics, Events & Alerts, Data Mining, Investigation, Social Analysis, Case Management, Link Analysis, Pattern Analysis
Security and Business Continuity Management
Case study for Customer Risk Profiling
[Architecture diagram with three stages (Observation Space, Decisioning & Context, Action) and a Feedback loop; labels as they appear on the slide]
Observation Space
• Line of Business Applications: market providers, broker dealer, research, transactional data
• Real-time streaming data
• Unstructured Sources: voice, images, email, web interactions, telematics
• Investigative Data: Twitter, Facebook, Streetwatch, finviz, documents & reports
• External Sources: social media, government records, watch lists
Decisioning & Context
• Capture and Integration: real-time and batch data acquisition and provisioning, data movement, data quality, metadata management
• Identity & Relationship Analysis: Who Is Who, Who Knows Who, Who Does What, Who’s Name
• Analytical Modeling: anomaly analytics, geospatial analytics, bitemporal analytics, text analytics, data exploration & discovery, in-database analytics, industry data model
• Unstructured Analytics Discovery, Analytical Data Mart, Real-time Decision & Analytics
Action
• Visualization, Reporting & Dashboards, Geospatial Analysis, Statistical Analysis, Scorecards & Metrics, Events & Alerts, Data Mining
• Investigation, Social Analysis, Case Management, Link Analysis, Pattern Analysis
Uses
• Scoring model
• Linking customer accounts, e.g. brokerage, HELOC, mortgage, insurance, credit cards, savings and checking, MM, etc.
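Linking a customer's accounts across products (the "Who Is Who" question on this slide) can be pictured as grouping accounts that share an identifier. The slides do not say how the bank did this; the sketch below uses a union-find structure as one common approach, with hypothetical account ids and identifier strings.

```python
from collections import defaultdict


def link_accounts(accounts):
    """Group accounts that share at least one identifier, directly or transitively.

    accounts: list of (account_id, set_of_identifier_strings)
    Returns a list of sets of account ids, ordered by their sorted members.
    """
    parent = {aid: aid for aid, _ in accounts}

    def find(x):
        # Walk to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_seen = {}  # identifier -> first account that carried it
    for aid, keys in accounts:
        for k in keys:
            if k in first_seen:
                union(first_seen[k], aid)
            else:
                first_seen[k] = aid

    groups = defaultdict(set)
    for aid, _ in accounts:
        groups[find(aid)].add(aid)
    return sorted(groups.values(), key=lambda g: sorted(g))


if __name__ == "__main__":
    sample = [
        ("brokerage-1", {"ssn:111"}),
        ("heloc-7", {"ssn:111", "phone:555"}),
        ("card-3", {"phone:555"}),
        ("savings-9", {"ssn:222"}),
    ]
    print(link_accounts(sample))
```

Here `card-3` is linked to `brokerage-1` only through `heloc-7`, which is why a transitive grouping is used rather than a pairwise match.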
Case study for Customer Risk Profiling (cont.)
Big Data Enhanced Analytical Sources Enable Analytics: new sources, real-time analytics, deeper regulation, new technology
[Diagram labels: Rules Engine, Authorization, Permissions]
Investigation Management
Our analysts have instant access to consolidated information across various internal and external sources, and collaborating across departments, channels and LOBs is seamless.
Analytics
We use extensive contextual information from a variety of sources. New patterns we identify are fed back into screening.
Data Management
Our data is consolidated across channels, we use all of the customer interaction and behavior information we capture, and there is no delay before all of this information can be analyzed.
Ingestion & Real-Time Analytic Zone
• Real-time (µs) data movement and analysis (anomaly detection, correlation, scoring, etc.)
• Structured and unstructured data (geospatial, text, voice, video, etc.)
Landing & Historical Zone
• Structured and unstructured data
• Single data hub across products, channels, behaviors and relationships
Analytic Appliances
• Self-provisioning
• Deep analytics
Deep & Predictive Models
• Real-time
• Consolidated
• Integrated
Case study for Customer Risk Profiling (cont.)
Reference Architecture
[Architecture diagram] Observation Space: Line of Business Applications, Unstructured Sources, Investigative Data, External Sources
Case study for Customer Risk Profiling (cont.)
Development
Messaging, Web services, Logs, Parsing, JSON, Cassandra DB, Rules Engine, HDFS, Flume NG, Oozie, ZooKeeper, Hive, Pig, MapReduce, Sqoop, HCatalog, App Server, Web server, Reporting, Elasticsearch, Ikanow, Lexalytics, Provalis, Lucene, Mahout, R
Future Roadmap
Sentry, Ambari, Falcon, Knox, YARN – MR2.0
Questions and next steps


VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 

Harvard case studies presentation 09102013

  • 1. Harvard Business Club presentation: Big Data Case Studies. Nitin Kabra
  • 2. Agenda
        • Case Study 1 – Customer Risk Profiling
        • Case Study 2 – Trade Surveillance and Reporting
        • Case Study 3 – Online Account Opening
        • Case Study 4 – Legacy Migration to Hadoop
        • Case Study 5 – ATM/Mobile adjustment data
        • Questions and next steps
  • 3. Use Cases implementation
        • Customer Risk Profiling: A very large bank with several consumer lines of business needed to analyze customer activity across multiple channels and build a customer scoring model based on behavioral analysis for fraudulent activity (both real-time and batch).
        • Trade Surveillance & Reporting (DF, Volcker): The bank already captured trading activity and used that data to assess, predict, and manage risk for both regulatory and non-regulatory purposes.
        • Online Account Opening: Fraud detection cases. The bank wanted to focus on online account opening frauds.
        • Legacy migration to Hadoop: Availability of fraud detection case data to build customer scores from various LOBs for risk management and customer retention.
        • ATM/Mobile adjustment data: Provide the adjustment data from ATM/mobile deposits so that fraud analysts can make a decision on the same day.
  • 4. Case Study 1: Customer Risk Profiling
      Challenge
        • A very large bank with several consumer lines of business needed to analyze customer activity across multiple products to predict credit risk with greater accuracy.
        • Over the years, the bank had acquired a number of regional banks. Each of those banks had a checking and savings business, a home mortgage business, credit card offerings and other financial products.
        • Those applications generally ran in separate silos: each used its own database and application software. The result was a large number of independent systems that could not share data easily.
        • With the economic downturn of 2008, the bank had significant exposure in its mortgage business to defaults by its borrowers. Better risk management was needed.
      Solution
        • The bank set up a single Hadoop cluster containing more than a petabyte of data collected from multiple enterprise data warehouses.
        • With all of the information in one place, the bank added new sources of data, including customer call center recordings, chat sessions, emails to the customer service desk and others.
        • It used pattern matching techniques, text processing, sentiment analysis and graph creation to combine, digest and analyze the data.
      Advantage
        • The bank used the Hadoop cluster to construct a new and more accurate score of the risk in its customer portfolios. The score gives a clear picture of a customer's financial situation, risk of default or late payment, and satisfaction with the bank and its services.
        • The more accurate score allowed the bank to manage its exposure better and to offer each customer better products and advice.
        • Hadoop increased revenue and improved customer satisfaction.
        • The benefit was not just a reduction of cost from the existing system, but improved revenue from better risk management and customer retention.
  • 5. Case Study 2: Trade Surveillance and Reporting (DF, Volcker)
      Challenge
        • The bank already captured trading activity and used that data to assess, predict, and manage risk for both regulatory and non-regulatory purposes.
        • The very large volume of data, however, made it difficult to monitor trades for compliance, and virtually impossible to catch "rogue" traders, who engage in trades that violate policies or expose the bank to too much risk.
        • Regulatory reporting based on the new guidelines issued after the 2008 market meltdown was proving a mammoth challenge.
      Solution
        • The bank built a Hadoop cluster that runs alongside its existing trading systems. The cluster gets copies of all of the trading data (margins, limits, exposure) and also holds information about the parties in each trade.
        • It built a suite of novel algorithms, using statistical and other techniques, to monitor human and automated (program) trading.
        • Using pattern matching techniques, the bank developed a very good picture of normal trading activity and now watches for unusual patterns that may reflect rogue trading.
      Advantage
        • Detects a variety of illegal activity, including money laundering, insider trading, front-running, intra-day manipulation, marking the close and more.
        • Fast detection allows the bank to protect itself from considerable losses.
        • Hadoop increased revenue and improved customer satisfaction.
        • The Hadoop cluster also helps the bank comply with financial industry regulations.
        • Hadoop provides cost-effective, scalable, reliable storage so that the bank can retain records and deliver reports on activities for years, as required by law.
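The idea in Case Study 2 of building a picture of normal trading activity and then watching for unusual patterns can be illustrated with a very small sketch. The slides do not say which statistical techniques the bank used, so the z-score test, the function name `flag_unusual_trades`, the threshold, and the sample data below are all assumptions chosen for illustration only.

```python
from statistics import mean, stdev


def flag_unusual_trades(history, new_trades, threshold=3.0):
    """Flag trades whose notional is far from the trader's own baseline.

    history: dict mapping trader id -> list of past trade notionals
    new_trades: list of (trader id, notional) tuples
    threshold: hypothetical z-score cutoff, not taken from the slides
    """
    flagged = []
    for trader, notional in new_trades:
        past = history.get(trader, [])
        if len(past) < 2:
            # No usable baseline: route to manual review.
            flagged.append((trader, notional))
            continue
        mu, sigma = mean(past), stdev(past)
        if sigma == 0:
            # All past trades identical: any different size is unusual.
            if notional != mu:
                flagged.append((trader, notional))
            continue
        if abs(notional - mu) / sigma > threshold:
            flagged.append((trader, notional))
    return flagged


# Illustrative data only.
history = {"T1": [100.0, 110.0, 90.0, 105.0, 95.0]}
new_trades = [("T1", 102.0), ("T1", 500.0), ("T2", 50.0)]
flagged = flag_unusual_trades(history, new_trades)
print(flagged)
```

In this sketch the 500.0 trade is flagged because it sits far outside trader T1's history, and the T2 trade is flagged because there is no baseline for that trader. A production system would presumably use far richer features (counterparties, limits, timing) than a single notional value.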
  • 6. Case Study 3: Online Account Opening
      Challenge
        • Fraud detection cases: the bank wanted to focus on online account opening frauds.
        • Detection can be real-time or batch and requires data from structured and unstructured sources.
        • It involves identifying patterns, relational analysis, and defining rules for opening accounts online.
      Solution
        • User identity can be validated against past patterns in seconds.
        • Connectivity with external credit rating agencies uses CEP (complex event processing) and semantic search for continuous data flow.
        • A very large number of requests is ingested into the Hadoop ecosystem.
        • Patterns are identified from the vast historical data stored in the Hadoop cluster from various sources (rating agencies, investigating agencies).
        • Rules are defined based on geospatial and locational analysis and on the format of the requests sent.
      Advantage
        • Could analyze petabytes of data with latency within the SLA period.
        • Reduced the cost of handling huge volumes.
        • Further alert and event processing using CEP makes it possible to take appropriate action on a fraudulent request.
        • Quick analytics helped the bank track down the locations and regions involved and identify a pattern for such fraudulent requests.
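As a rough illustration of the rules mentioned in Case Study 3 (rules based on locational analysis and on the format of the request), the sketch below screens a single account-opening request. The field names, the 500 km limit, and the function `screen_request` are hypothetical; the slides do not describe the bank's actual rules.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def screen_request(request, max_distance_km=500.0,
                   required_fields=("name", "date_of_birth", "address")):
    """Return a list of reasons the request looks suspicious (empty if none).

    Both rules are assumptions made for this sketch:
    1. format rule: every required field must be present and non-empty
    2. location rule: the stated address and the location the request
       came from must be within max_distance_km of each other
    """
    reasons = []
    missing = [f for f in required_fields if not request.get(f)]
    if missing:
        reasons.append("missing fields: " + ", ".join(missing))
    distance = haversine_km(*request["stated_location"], *request["origin_location"])
    if distance > max_distance_km:
        reasons.append("location mismatch")
    return reasons


# Illustrative request: stated address near New York, request sent from near Los Angeles.
request = {
    "name": "Jane Doe",
    "date_of_birth": "1980-01-01",
    "address": "",
    "stated_location": (40.71, -74.01),
    "origin_location": (34.05, -118.24),
}
reasons = screen_request(request)
print(reasons)
```

Rules of this kind are cheap to evaluate per request, which fits the real-time side of the use case; the historical pattern analysis described in the slides would run separately over the data stored in the cluster.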
  • 7. Case Study 4: Legacy Migration to Hadoop
      Challenge
        • Fraud detection cases: collect data from various products and build a customer score.
        • Data processing happens on legacy mainframes, state by state. It takes hours and delays downstream batch processing.
        • The SAS Fraud Management product requires the data to be in a specific format. The reformat process is critical and time-consuming.
      Solution
        • The bank set up a single Hadoop cluster containing more than a petabyte of data collected from multiple data sources.
        • The reformat process completes quickly because processing is done on tables instead of on files.
        • With all of the information in one place, data from other sources is also added easily.
        • Pattern matching techniques, text processing, sentiment analysis and graph creation are used to combine, digest and analyze the data.
      Advantage
        • The bank used the Hadoop cluster to construct a more accurate customer score. The score gives a clear picture of a customer's financial situation, risk of default or late payment, and satisfaction with the bank and its services.
        • The more accurate score allowed the bank to manage its exposure better and to offer each customer better products and advice.
        • Hadoop increased revenue and improved customer satisfaction.
        • The benefit was not just a reduction of cost from the existing system, but improved revenue from better risk management and customer retention.
  • 8. Case Study 5: ATM/Mobile adjustment data
      Challenge
        • Fraud detection cases: adjustment data from ATM/mobile deposits.
        • The data is available to fraud analysts only on day 2 of the transaction.
        • The current business process makes funds available to the customer as soon as the adjustment is posted to the account.
        • Analysts have to toggle between multiple screens to get data from different sources when verifying transactions.
      Solution
        • Make the adjustment data available to the fraud analysts as soon as possible, before funds are posted to the customer account.
        • Make customer and transaction data available to fraud analysts in a single place.
        • Ingest data from multiple sources into the Hadoop ecosystem.
        • Identify patterns from the vast historical data stored in the Hadoop cluster from various sources.
        • Block or close the account as soon as a fraudulent transaction is identified, preventing funds from leaving the bank.
      Advantage
        • Could load petabytes of data into Hadoop much faster than with traditional processes.
        • Reduced the cost of handling huge volumes.
        • Analysts gained the ability to identify patterns easily, with all the data available in a single place.
        • Quick analytics helped the bank track down the accounts and close or block them from operating further.
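The core of the Case Study 5 solution is to look at each adjustment together with customer data before funds are posted. A minimal sketch of that decision step follows. The record layout, the `avg_deposit` field, the ratio rule, and the function `review_adjustments` are assumptions made for illustration; the slides do not specify how the bank decided which adjustments to hold.

```python
def review_adjustments(adjustments, customers, ratio_limit=5.0):
    """Decide, per adjustment, whether to post funds or hold for an analyst.

    adjustments: list of dicts with "id", "account" and "amount"
    customers: dict mapping account -> profile dict with "avg_deposit"
    ratio_limit: hypothetical cutoff; an adjustment larger than this
        multiple of the customer's average deposit is held
    """
    decisions = {}
    for adj in adjustments:
        profile = customers.get(adj["account"])
        if profile is None or profile["avg_deposit"] <= 0:
            # No usable customer history: hold for manual review.
            decisions[adj["id"]] = "hold"
            continue
        ratio = abs(adj["amount"]) / profile["avg_deposit"]
        decisions[adj["id"]] = "hold" if ratio > ratio_limit else "post"
    return decisions


# Illustrative data only.
customers = {"A1": {"avg_deposit": 200.0}}
adjustments = [
    {"id": 1, "account": "A1", "amount": 150.0},
    {"id": 2, "account": "A1", "amount": 5000.0},
    {"id": 3, "account": "A9", "amount": 20.0},
]
decisions = review_adjustments(adjustments, customers)
print(decisions)
```

Here adjustment 1 is posted, adjustment 2 is held because it is 25 times the account's average deposit, and adjustment 3 is held because the account has no profile. Joining the adjustment with the customer profile in one step mirrors the slide's point about analysts no longer toggling between screens.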
  • 9. Case study for Customer Risk Profiling: Traditional Landscape – Analytical Sources
      [Diagram; labels only]
        • Layers: Information Ingestion, Information Consumption, Security and Business Continuity Management
        • Traditional Sources: Database, Files, 3rd Party Applications
        • Analytics Sources: Warehouse, Data Marts, Data Cubes, Transformation Engines
        • Shared Operational Information: Master Data Hubs, Reference Data Hubs, Content Repository
        • Information Governing Systems: Metadata Catalog
        • Consumption: Visualization, Reporting & Dashboards, Geospatial Analysis, Statistical Analysis, Scorecards & Metrics, Events & Alerts, Data Mining, Investigation, Social Analysis, Case Management, Link Analysis, Pattern Analysis
  • 10. Case study for Customer Risk Profiling (cont.)
      [Diagram; labels only]
        • Stages: Observation Space, Decisioning & Context, Action
        • Line of Business Applications: Market providers, Broker Dealer, Research, Transactional data
        • Real time streaming data
        • Unstructured Sources: Voice, Images, eMail, Web Interactions, Telematics
        • Investigative Data: Twitter, Facebook, Streetwatch, finviz, Documents & Reports
        • External Sources: Social Media, Government Records, Watch Lists
        • Real-time and Batch Data Acquisition and Provisioning: Data Movement, Data Quality, Metadata Mgmt
        • Capture and Integration; Unstructured Analytics; Discovery; Analytical Data Mart; Real-time Decision & Analytics; Industry Data Model; In Database Analytics
        • Identity & Relationship Analysis: Who Is Who, Who Knows Who, Who Does What, Who's Name
        • Analytical Modeling: Anomaly Analytics, Geospatial Analytics, Bitemporal Analytics, Data Exploration & Discovery, Text Analytics
        • Consumption: Visualization, Reporting & Dashboards, Geospatial Analysis, Statistical Analysis, Scorecards & Metrics, Events & Alerts, Data Mining, Investigation, Social Analysis, Case Management, Link Analysis, Pattern Analysis
        • Rules Engine; Authorization; Permissions; Feedback
      Uses
        • Scoring model
        • Linking customer accounts, e.g. brokerage, HELOC, mortgage, insurance, credit cards, savings and checking, MM etc.
  • 11. Case study for Customer Risk Profiling (cont.): Big Data Enhanced Analytical Sources
      Enable Analytics: new sources, real-time analytics, deeper regulation, new technology
        • Investigation Management: Our analysts have instant access to consolidated information across various internal and external sources, and collaborating across departments, channels and LOBs is seamless.
        • Analytics: We use extensive contextual information from a variety of sources. New patterns we identify are fed back into screening.
        • Data Management: Our data is consolidated across channels, we use all of the customer interaction and behavior information we capture, and there is no delay before all of this information can be analyzed.
      Ingestion & Real-Time Analytic Zone
        • Real-time (µs) data movement and analysis (anomaly detection, correlation, scoring, etc.)
        • Structured and unstructured data (geospatial, text, voice, video, etc.)
      Landing & Historical Zone
        • Structured and unstructured data
        • Single data hub across products, channels, behaviors and relationships
      Analytic Appliances
        • Self-provisioning
        • Deep analytics
      [Diagram callouts: Deep & predictive models; Real-time; Consolidated; Integrated]
  • 12. Case study for Customer Risk Profiling (cont.): Reference Architecture
      [Diagram; labels only]
        • Observation Space: Line of Business Applications, Unstructured Sources, Investigative Data, External Sources
        • Components: Messaging, Web services, Logs, Parsing, JSON, Cassandra DB, Rules Engine, HDFS, Flume NG, Oozie, ZooKeeper, Hive, Pig, MapReduce, Sqoop, HCatalog, App Server, Web server, Reporting, Elasticsearch, Ikanow, Lexalytics, Provalis, Lucene, Mahout, R
  • 14. Questions and next steps