© 2016 Fair Isaac Corporation. Confidential.
This presentation is provided for the recipient only and cannot be reproduced or shared without Fair Isaac Corporation’s express consent.
A Consortium and Its Data
Fraud Screening for 2/3rds of All Card Transactions
Scott Zoldi, PhD
Chief Analytics Officer, FICO
ScottZoldi@fico.com
@ScottZoldi
Predictive Analytics World for Business
San Francisco
May 16, 2017
About Me
• Responsible for analytic development of FICO’s product and technology solutions, including Falcon Fraud Manager
• 17 years at FICO
• Author of 77 patents
─ 38 granted and 39 in process
• Recent focus on self-learning analytics for real-time detection of cybersecurity attacks, AML detection, and mobile device analytics
• Ph.D. in theoretical physics from Duke University
History of Fraud Detection at FICO
[Chart: Fraud Losses by year, 1990 to 2014 (vertical axis 0 to 20), with the introduction of Falcon marked]
• 90%: percentage of US payment cards covered by FICO fraud solutions
• 9,000: banks participating in FICO’s fraud data consortium
• 2.6B: active financial accounts protected by FICO worldwide
• 10 ms: average response time for fraud decisions rendered by Falcon
Profiles Summarize Customer Transaction History
Recursive analytic features to efficiently summarize history
Customer History (too big!)
12:31:05, MCC123, USD, ...
13:01:15, MCC234, USD, ...
13:32:07, MCC345, USD, ...
14:03:25, ATM223, USD, ...
...
18:42:27, MCC567, USD, ...
...
Profile(t-2), Profile(t-1), Profile(t): each profile holds the current value of every feature
• numTrx
• numCashTrx
• avgAmount4hr
[Chart: Amount against Time, with a clothing expense, a restaurant expense, and a cash withdrawal at an ATM marked]
Patents 14/796,547 (USA), 14/613,300 (USA)
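The slide names the profile variables but not their update rules. As a minimal sketch, assuming a time-decayed average for avgAmount4hr (the decay form, the field names, and the sample amounts are illustrative and not taken from the presentation), a recursive update that needs only Profile(t-1) and the current transaction could look like this:

```python
import math
from dataclasses import dataclass

FOUR_HOURS_S = 4 * 3600.0  # decay constant for the "4hr" average (assumed)


@dataclass(frozen=True)
class Profile:
    """Fixed-size summary of one customer's history (hypothetical layout)."""
    num_trx: int = 0
    num_cash_trx: int = 0
    decayed_sum: float = 0.0    # time-decayed sum of amounts
    decayed_count: float = 0.0  # time-decayed number of transactions
    last_time_s: float = 0.0

    @property
    def avg_amount_4hr(self) -> float:
        return self.decayed_sum / self.decayed_count if self.decayed_count else 0.0


def update(profile: Profile, time_s: float, amount: float, is_cash: bool) -> Profile:
    """Compute Profile(t) from Profile(t-1) and the current transaction only."""
    elapsed = max(time_s - profile.last_time_s, 0.0)
    decay = math.exp(-elapsed / FOUR_HOURS_S)
    return Profile(
        num_trx=profile.num_trx + 1,
        num_cash_trx=profile.num_cash_trx + (1 if is_cash else 0),
        decayed_sum=profile.decayed_sum * decay + amount,
        decayed_count=profile.decayed_count * decay + 1.0,
        last_time_s=time_s,
    )


# Replay three transactions (seconds since midnight; amounts are invented).
profile = Profile()
for time_s, amount, is_cash in [(45065, 40.0, False), (46875, 25.0, False), (50605, 200.0, True)]:
    profile = update(profile, time_s, amount, is_cash)
```

The point of the recursion is that the raw transaction list never has to be stored or rescanned at decision time; only the small profile record is read and written.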
Fraud Detection Through Neural Networks
Powerful detection of known patterns of fraud using supervised learning
[Diagram: feature extraction feeding an input layer, a hidden layer, and an output layer]
Model Inputs
• Features based on raw transactions
Hidden Layer
• Computational unit takes input and generates output
• Uses an activation function
Model Output
• Single output indicating fraud/non-fraud
Activation function
$$f(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}$$
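To make the layer structure concrete, here is a minimal sketch of one forward pass with the tanh activation from the slide. The weights are invented and untrained, and the logistic output unit is an assumption; the slide only says the model has a single fraud/non-fraud output.

```python
import math


def tanh_activation(z: float) -> float:
    """Hidden-layer activation from the slide: (e^z - e^-z) / (e^z + e^-z)."""
    return (math.exp(z) - math.exp(-z)) / (math.exp(z) + math.exp(-z))


def fraud_score(features, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass: inputs -> tanh hidden layer -> single output in (0, 1)."""
    hidden = [
        tanh_activation(sum(w * x for w, x in zip(weights, features)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    z_out = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return 1.0 / (1.0 + math.exp(-z_out))  # logistic output unit (an assumption)


# Hypothetical, untrained weights: 3 input features, 2 hidden nodes.
features = [0.2, -1.0, 0.5]
hidden_weights = [[0.4, -0.3, 0.8], [-0.6, 0.1, 0.2]]
hidden_biases = [0.0, 0.1]
output_weights = [1.2, -0.7]
score = fraud_score(features, hidden_weights, hidden_biases, output_weights, -0.2)
```

In supervised training the weights would be fitted to transactions labeled fraud or non-fraud; that step is omitted here.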
Fraud Detection With Unsupervised Learning
Effective detection of unknown and changing patterns of fraud
Streaming self-calibration
• Track the 95% and 99% points of each feature automatically
• For each feature, create an outlier model
• Memory and time efficient; no historical data storage
• Adapts in real time to every transaction
[Chart: feature probability against current feature value, running from low risk to high risk, with the 95% and 99% points marked]
Multi-Layered Self-Calibrating (MLSC) Score
• Combine the outlier models from all features
• Features in hidden nodes are selected to minimize correlation
• Weights based on:
─ Expert knowledge
─ Limited data
• Scores 1–999
[Diagram: Multi-Layer Self-Calibrating Score network with input layer, hidden layer, and output layer, plus weight tuning]
Patents 8,027,439, 8,041,597, 13/367,344 (USA), 14/796,547 (USA), 14/613,300 (USA)
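The streaming idea can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the quantile tracker uses a generic stochastic-approximation update, and the scaling of a feature value against the 95% and 99% points (including the cap of 6) is an assumed form. The uniform data stands in for one feature's values.

```python
import random


class StreamingQuantile:
    """Tracks one quantile of a stream without storing past values."""

    def __init__(self, p: float, step: float, initial: float = 0.0):
        self.p = p
        self.step = step
        self.value = initial

    def update(self, x: float) -> float:
        # Move up by step*p when x is above the estimate, down by step*(1-p)
        # otherwise; the two balance when P(x <= value) equals p.
        indicator = 1.0 if x <= self.value else 0.0
        self.value += self.step * (self.p - indicator)
        return self.value


def outlier_value(x: float, q95: float, q99: float, cap: float = 6.0) -> float:
    """Scaled distance above the 95% point, clipped to [0, cap]."""
    spread = max(q99 - q95, 1e-9)
    return min(max((x - q95) / spread, 0.0), cap)


random.seed(0)
q95 = StreamingQuantile(0.95, step=0.5)
q99 = StreamingQuantile(0.99, step=0.5)
for _ in range(20000):
    amount = random.uniform(0.0, 100.0)  # stand-in for one feature's values
    q95.update(amount)
    q99.update(amount)

typical = outlier_value(40.0, q95.value, q99.value)   # below the 95% point
extreme = outlier_value(500.0, q95.value, q99.value)  # far into the tail
```

Each tracker holds a single number per feature, which is what makes the approach memory efficient. A combined score would then weight the per-feature outlier values, as the MLSC bullets describe.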
Auto-Encoder Learns a Data Representation
• Deep learning algorithm that sets target values equal to the input and applies unsupervised learning to minimize the reconstruction error
─ Provides a compressed, distributed representation (encoding) of the original data
[Diagram: input $x$ is encoded by weights $W_E$ into latent features and decoded by weights $W_D$ into the reconstruction $x_R$; learning minimizes the reconstruction error]
Reconstruction error
$$E_R = \lVert x - x_R \rVert^{2}$$
[Image: original and reconstructed image, https://commons.wikimedia.org/w/index.php?curid=488211]
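A deliberately tiny example may help fix the idea. The sketch below trains a linear auto-encoder with one latent feature on 2-D points that lie on a line, then compares the reconstruction error of a point that fits the learned pattern with one that does not. The architecture, learning rate, and data are all assumptions chosen for illustration; production auto-encoders are larger and nonlinear.

```python
def encode(x, w_e):
    """Latent feature: a single number summarizing the 2-D input."""
    return w_e[0] * x[0] + w_e[1] * x[1]


def reconstruct(x, w_e, w_d):
    h = encode(x, w_e)
    return [w_d[0] * h, w_d[1] * h]


def reconstruction_error(x, w_e, w_d):
    """E_R = ||x - x_R||^2."""
    x_r = reconstruct(x, w_e, w_d)
    return (x[0] - x_r[0]) ** 2 + (x[1] - x_r[1]) ** 2


def train(data, epochs=300, lr=0.02):
    w_e, w_d = [0.5, 0.5], [0.5, 0.5]  # arbitrary starting weights
    for _ in range(epochs):
        for x in data:
            h = encode(x, w_e)
            err = [w_d[0] * h - x[0], w_d[1] * h - x[1]]  # x_R - x
            back = w_d[0] * err[0] + w_d[1] * err[1]      # error passed back to the encoder
            # Gradient-descent step on E_R for decoder and encoder weights.
            w_d = [w_d[i] - lr * 2 * err[i] * h for i in range(2)]
            w_e = [w_e[i] - lr * 2 * back * x[i] for i in range(2)]
    return w_e, w_d


# Training data: the target equals the input, and every point satisfies y = 2x.
data = [(t / 10, 2 * t / 10) for t in range(-10, 11)]
w_e, w_d = train(data)

familiar = reconstruction_error((0.5, 1.0), w_e, w_d)     # follows the pattern
unfamiliar = reconstruction_error((1.0, -2.0), w_e, w_d)  # breaks the pattern
```

The familiar point reconstructs almost perfectly while the unfamiliar one does not, which is the property the governance applications later in the deck rely on.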
Power of Pooling
Data Consortium
Data Consortium
[Figure: percentage of credit card accounts in the world that are covered by FICO fraud solutions]
• Most transactions are genuine
─ Fraud: a rare-class problem
─ Genuine cases could be non-representative
─ Hard to create a great model
• Consortium
─ Pools data from across the globe
─ More fraud cases
─ More diverse data
─ Superior models
Clients benefit from the pooled data
Challenges Working With Consortium
Terabytes of raw data received each month
• Media & frequency
• Processing power & time
• Data format
• Data security & compliance
• Cross-contamination
• Data quality
• Client uniqueness
Consortium Data Flow: 4 Steps
1. Receive File
2. Process File
3. Process Data
4. Clean Data
(1) Data Transmission
Data arrives at the FICO “Landing Zone”
• Media: electronic, disc, etc.
• Frequency: daily, weekly, monthly
• Process 50,000+ consortium files per month
• About 1 petabyte of payment card data in 5 years
(2) Data Security and Basic ETL
Steps: receipt, security, basic file processing, archive data
File receipt
• Encryption check
• Extract
• File tagging
• Assign client & build archives
Security and file processing
• Encrypt and obfuscate (hash) PII and sensitive client data
• Quarantine data that fails the FICO Data Security Analysis
• File ETL: archive, inventory, join, transform, and tag
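The slide says PII is obfuscated by hashing but does not name an algorithm. As one possible sketch, assuming a keyed HMAC-SHA-256 digest (the key, field names, and sample record are hypothetical), a card number can be replaced by a stable token so that records for the same card still join after obfuscation:

```python
import hashlib
import hmac


def obfuscate(value: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest that stands in for a sensitive value."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key and record; a real key would come from a key-management system.
SECRET_KEY = b"example-key-not-for-production"
record = {"pan": "4111111111111111", "mcc": "5812", "amount": "1.00"}

# Replace the PAN with its digest and leave non-sensitive fields untouched.
clean_record = dict(record, pan=obfuscate(record["pan"], SECRET_KEY))
```

A keyed digest is used here rather than a plain hash because card numbers have a small, guessable value space; whether the consortium pipeline does the same is not stated in the presentation.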
(3) High Level Statistically-Based Alerting and Trend Analysis
• FICO processes about 20 billion Falcon records/month
• Each file is checked against global and client-specific statistical distributions
Steps: health checks, statistical analysis, tabulate
Data checks
• Transaction volume over time
• Missing data
Example alert email to analystA@fico.com
Subject: Data Issue
File x has invalid fraud dates
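One simple form such a check could take is a z-score of a new file's record count against that client's own history. The sketch below is an assumption about the kind of test involved, not the actual FICO checks; the threshold and the daily counts are invented.

```python
from statistics import mean, stdev


def volume_alert(history, new_count, z_threshold=3.0):
    """Return an alert string if new_count is far from the client's history, else None."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return None if new_count == mu else "volume differs from a constant history"
    z = (new_count - mu) / sigma
    if abs(z) > z_threshold:
        return f"transaction volume z-score {z:.1f} exceeds {z_threshold}"
    return None


# Invented daily record counts for one client.
history = [98_000, 101_500, 99_700, 100_200, 102_100, 97_900, 100_600]
normal_day = volume_alert(history, 100_900)     # within the usual range
truncated_file = volume_alert(history, 42_000)  # e.g. a partially delivered file
```

A non-empty result would be routed to an analyst, in the spirit of the example alert email on the slide.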
(4) Apply Domain and Client Specific Transformations
[Diagram: a raw file is transformed into a clean file; raw bits such as 001000001 01010100 01001101 become readable values such as ATM, currency code, Spain, CVV2 valid, customer not present, .com merchant, and eCommerce]
Record lengths in a sample file: 500, 492, 494, 500, 500, 500, 500, 500, 500, 500
• Data fixes
• Automatic, client-specific
• Cross-client uniformity
Regularly review transformations for relevance and sunset if obsolete
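The record-length listing suggests a fixed-width format in which short records need attention. A minimal sketch of that check follows; the function name and the synthetic records are illustrative, and what the fix for a short record should be is client-specific and not shown.

```python
EXPECTED_LENGTH = 500  # fixed record length shown on the slide


def split_by_length(records, expected=EXPECTED_LENGTH):
    """Separate records of the expected length from those needing a fix."""
    good, bad = [], []
    for index, record in enumerate(records):
        (good if len(record) == expected else bad).append((index, len(record)))
    return good, bad


# Ten synthetic records with the lengths listed on the slide.
lengths = [500, 492, 494, 500, 500, 500, 500, 500, 500, 500]
records = ["x" * n for n in lengths]
good, bad = split_by_length(records)
```

Here `bad` holds the positions and lengths of the two short records, which a client-specific transformation would then repair or reject.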
Model Governance
Applications of Supervised and Unsupervised Learning Technologies
Model Governance Is Serious Business
[Diagram: regulators, customers, and internal audit (OCC, FICO, client requests) drive FICO Model Governance]
Modeling Data
• Model Inputs
• Development Data
• Data Quality
• Sensitivity Analysis
Model Specifications
• Model Structure
• Model Assumptions
• Benchmark/Alternative Architectures
• Model Updates
Model Validation
• Data Validation
• Model Validation
Deployment Validation
• Post-Implementation Validation
• Performance Monitoring
The OLD: Data Quality Reporting
Check basic data integrity
─ Data quality reports
─ Red flags: missing records, missing fields, or incorrect data types
Monitor before and during model deployment
─ Data statistics
─ Score distributions
─ Model performance snapshots
The NEW: Auto-Encoders in Model & Data Governance
[Diagram: production data and the production model, surrounded by four auto-encoder applications]
1. Data Feed Validation
2. New Client Model Selection
3. Trigger Model Retrain
4. Anomalous Transaction Identification
1. Data Feed Validation
Statistical analysis is often too generic to point to data integrity issues
• An auto-encoder can readily identify sets of transactions across clients whose reconstruction errors differ, which points to key data integrity issues
• Cluster the reconstruction errors, then run a per-cluster root-cause analysis
[Chart: frequency against transaction amount, contrasting wrong currency conversion with correct currency conversion]
2. New Client Model Selection
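Identify the model trained on consortium data that has the minimum reconstruction error on the new client’s transaction data
[Diagram: candidate consortium models compared for a new client (Indonesia?); the one with the minimum reconstruction error is selected]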
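The selection rule can be sketched in a few lines. In this illustration each candidate auto-encoder is replaced by a toy projection onto one direction, and the segment names and client data are invented; only the rule itself (pick the candidate with the lowest mean reconstruction error on the new client's data) comes from the slide.

```python
def projection_autoencoder(direction):
    """Toy stand-in for a trained auto-encoder: project onto one direction."""
    norm = sum(d * d for d in direction) ** 0.5
    unit = [d / norm for d in direction]

    def reconstruct(x):
        h = sum(u * xi for u, xi in zip(unit, x))  # 1-D latent feature
        return [u * h for u in unit]

    return reconstruct


def mean_reconstruction_error(reconstruct, data):
    total = 0.0
    for x in data:
        x_r = reconstruct(x)
        total += sum((a - b) ** 2 for a, b in zip(x, x_r))
    return total / len(data)


def select_model(candidates, client_data):
    """Return (name, error) of the candidate that reconstructs the client data best."""
    errors = {name: mean_reconstruction_error(r, client_data) for name, r in candidates.items()}
    best = min(errors, key=errors.get)
    return best, errors[best]


# Hypothetical consortium segment models and new-client transactions.
candidates = {
    "segment_a": projection_autoencoder([1.0, 0.0]),
    "segment_b": projection_autoencoder([1.0, 2.0]),
    "segment_c": projection_autoencoder([0.0, 1.0]),
}
client_data = [[0.5, 1.1], [1.0, 1.9], [-0.4, -0.9], [2.0, 4.2]]
best_name, best_error = select_model(candidates, client_data)
```

The client data roughly follows the pattern that `segment_b` was built around, so that candidate wins; the fraud model paired with the winning auto-encoder would be the starting point for the new client.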
3. Trigger Model-Retrain
Learn a companion auto-encoder network based on the same data as the unsupervised model
─ The unsupervised model and the auto-encoder network are packaged together and installed in the production environment.
[Charts: reconstruction error over the timeline. No significant deviation: keep scoring. Significant deviation: rebuild.]
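What counts as a significant deviation is not defined on the slide. One simple possibility, shown here as an assumption, is to compare the mean reconstruction error of a recent production window with the development baseline and trigger a rebuild when it exceeds the baseline mean by k standard errors. All numbers below are invented.

```python
from statistics import mean, stdev


def needs_rebuild(baseline_errors, recent_errors, k=3.0):
    """True when the recent mean reconstruction error deviates significantly from baseline."""
    mu, sigma = mean(baseline_errors), stdev(baseline_errors)
    # Standard error of a mean taken over len(recent_errors) observations.
    standard_error = sigma / len(recent_errors) ** 0.5
    return mean(recent_errors) > mu + k * standard_error


# Invented reconstruction errors from the companion auto-encoder.
baseline = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13, 0.08, 0.11]
stable_window = [0.11, 0.10, 0.12, 0.09]
shifted_window = [0.25, 0.31, 0.28, 0.27]

keep_scoring = not needs_rebuild(baseline, stable_window)
rebuild = needs_rebuild(baseline, shifted_window)
```

Because the auto-encoder was trained on the same data as the production model, a sustained rise in its error suggests the production data has drifted away from what the model learned.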
4. Anomalous Transaction Identification
[Chart: reconstruction error over the timeline, with above-threshold errors flagged]
Above-threshold error feeds:
• Outlier detection
• Feature engineering
• Score- or rule-triggered review
Surfacing Patterns
Leveraging Consortium Data to Inform Predictive Modeling
Investigative Analysis
Three large US credit issuers for 2014 and 2015
[Chart: number of transactions (Tran#) against $ amount; an anomaly emerges at $1 transactions]
• 8% of $1 CNP
MCC = Merchant Category Code
Investigating: Eating Places Analysis
• 75% of $1 CNP fraud
• 2 food vendors
What can we learn about the vendors?
Investigating: Eating Places Analysis
• Vendor 1: 32% Ep-MCC
• Vendor 2: 42% Ep-MCC
• Fraud PANs: 7,791 and 19,199
Why?
Investigating: Visibility Makes a Huge Difference
Compromise Detection
• Vendor 1: 32% Ep-MCC
• Vendor 2: 42% Ep-MCC
• ~55% of PANs closed on the 1st day (industry avg.)
• ~90% closed on the 1st day
• Specifically targeted 1 issuer!
• Effective strategy by credit issuer!
Investigating New Card Testing Scheme
Fraudsters do “test” transactions
─ Usually $1
─ Fraudsters invent new testing schemes to evade detection
• $22M loss could have been reduced by Vendor 1
• US CNP fraud: ~$3.8 billion/year (Aite Group)
CNP = Card Not Present, i.e., online, phone, etc.
[Chart: example trend of fraud dollars after a “test” transaction; vertical axis ticks at $1, $10, $100, and $1,000, horizontal axis 1 to 16]
Investigating New Card Testing Scheme
• Detecting and responding to schemes
─ Fraudsters will continuously attempt new attack methods
─ Data science informs model design
• Data science
─ Brute-force statistics can miss changes
─ Auto-encoders can detect shifts in fraud patterns in real time
• Model design
─ Adaptive models to learn new fraud patterns
─ Entity profiles respond at the entity level
Consortium, Deep Learning, and Data Science step up the fight against Cybercrime
Thank You
Scott Zoldi, Chief Analytics Officer
FICO
scottzoldi@fico.com
@ScottZoldi

More Related Content

What's hot

Satyam open analytics nyc
Satyam open analytics nycSatyam open analytics nyc
Satyam open analytics nyc
Open Analytics
 
Harnessing the Power of Big Data at Freddie Mac
Harnessing the Power of Big Data at Freddie MacHarnessing the Power of Big Data at Freddie Mac
Harnessing the Power of Big Data at Freddie Mac
DataWorks Summit
 
BizDataX White paper Test Data Management
BizDataX White paper Test Data ManagementBizDataX White paper Test Data Management
BizDataX White paper Test Data Management
Dragan Kinkela
 

What's hot (20)

Audit Webinar: Surefire ways to succeed with Data Analytics
Audit Webinar: Surefire ways to succeed with Data AnalyticsAudit Webinar: Surefire ways to succeed with Data Analytics
Audit Webinar: Surefire ways to succeed with Data Analytics
 
Introduction to CaseWare IDEA - Designed by Auditors for Auditors
Introduction to CaseWare IDEA - Designed by Auditors for AuditorsIntroduction to CaseWare IDEA - Designed by Auditors for Auditors
Introduction to CaseWare IDEA - Designed by Auditors for Auditors
 
Why You Need to STOP Using Spreadsheets for Audit Analysis
Why You Need to STOP Using Spreadsheets for Audit AnalysisWhy You Need to STOP Using Spreadsheets for Audit Analysis
Why You Need to STOP Using Spreadsheets for Audit Analysis
 
Testing the Data Warehouse
Testing the Data WarehouseTesting the Data Warehouse
Testing the Data Warehouse
 
Accelerating Insight - Smart Data Lake Customer Success Stories
Accelerating Insight - Smart Data Lake Customer Success StoriesAccelerating Insight - Smart Data Lake Customer Success Stories
Accelerating Insight - Smart Data Lake Customer Success Stories
 
IDEA 10.3 Launch Webinar
IDEA 10.3 Launch WebinarIDEA 10.3 Launch Webinar
IDEA 10.3 Launch Webinar
 
Testing the Data Warehouse―Big Data, Big Problems
Testing the Data Warehouse―Big Data, Big ProblemsTesting the Data Warehouse―Big Data, Big Problems
Testing the Data Warehouse―Big Data, Big Problems
 
Satyam open analytics nyc
Satyam open analytics nycSatyam open analytics nyc
Satyam open analytics nyc
 
Introduction to Anzo Unstructured
Introduction to Anzo UnstructuredIntroduction to Anzo Unstructured
Introduction to Anzo Unstructured
 
Harnessing the Power of Big Data at Freddie Mac
Harnessing the Power of Big Data at Freddie MacHarnessing the Power of Big Data at Freddie Mac
Harnessing the Power of Big Data at Freddie Mac
 
BizDataX White paper Test Data Management
BizDataX White paper Test Data ManagementBizDataX White paper Test Data Management
BizDataX White paper Test Data Management
 
Conducting basic Data Analysis with IDEA
Conducting basic Data Analysis with IDEAConducting basic Data Analysis with IDEA
Conducting basic Data Analysis with IDEA
 
microsoft r server for distributed computing
microsoft r server for distributed computingmicrosoft r server for distributed computing
microsoft r server for distributed computing
 
Balancing Data Governance and Innovation
Balancing Data Governance and InnovationBalancing Data Governance and Innovation
Balancing Data Governance and Innovation
 
Extracting data from IDEA
Extracting data from IDEA Extracting data from IDEA
Extracting data from IDEA
 
Transforming Business Intelligence Testing
Transforming Business Intelligence TestingTransforming Business Intelligence Testing
Transforming Business Intelligence Testing
 
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
 
Graph-based Discovery and Analytics at Enterprise Scale
Graph-based Discovery and Analytics at Enterprise ScaleGraph-based Discovery and Analytics at Enterprise Scale
Graph-based Discovery and Analytics at Enterprise Scale
 
Analytics for Audit
Analytics for AuditAnalytics for Audit
Analytics for Audit
 
The Hive Data Virtualization Introduction - Sanjay Krishnamurti, Chief Archit...
The Hive Data Virtualization Introduction - Sanjay Krishnamurti, Chief Archit...The Hive Data Virtualization Introduction - Sanjay Krishnamurti, Chief Archit...
The Hive Data Virtualization Introduction - Sanjay Krishnamurti, Chief Archit...
 

Similar to 1330 keynote shahapurkar

Curiosity and Lemontree present - Data Breaks DevOps: Why you need automated ...
Curiosity and Lemontree present - Data Breaks DevOps: Why you need automated ...Curiosity and Lemontree present - Data Breaks DevOps: Why you need automated ...
Curiosity and Lemontree present - Data Breaks DevOps: Why you need automated ...
Curiosity Software Ireland
 
EAS-SEC Project
EAS-SEC ProjectEAS-SEC Project
EAS-SEC Project
ERPScan
 

Similar to 1330 keynote shahapurkar (20)

Necessity of Data Lakes in the Financial Services Sector
Necessity of Data Lakes in the Financial Services SectorNecessity of Data Lakes in the Financial Services Sector
Necessity of Data Lakes in the Financial Services Sector
 
MongoDB World 2018: A Journey to the Cloud with Fraud Detection, Transactions...
MongoDB World 2018: A Journey to the Cloud with Fraud Detection, Transactions...MongoDB World 2018: A Journey to the Cloud with Fraud Detection, Transactions...
MongoDB World 2018: A Journey to the Cloud with Fraud Detection, Transactions...
 
Data Driven Decisions - Big Data Warehousing Meetup, FICO
Data Driven Decisions - Big Data Warehousing Meetup, FICOData Driven Decisions - Big Data Warehousing Meetup, FICO
Data Driven Decisions - Big Data Warehousing Meetup, FICO
 
[CONFidence 2016] Gaweł Mikołajczyk - Making sense out of the Security Operat...
[CONFidence 2016] Gaweł Mikołajczyk - Making sense out of the Security Operat...[CONFidence 2016] Gaweł Mikołajczyk - Making sense out of the Security Operat...
[CONFidence 2016] Gaweł Mikołajczyk - Making sense out of the Security Operat...
 
dataProtection_p3.ppt
dataProtection_p3.pptdataProtection_p3.ppt
dataProtection_p3.ppt
 
Cisco Connect Toronto 2018 DNA assurance
Cisco Connect Toronto 2018  DNA assuranceCisco Connect Toronto 2018  DNA assurance
Cisco Connect Toronto 2018 DNA assurance
 
Curiosity and Lemontree present - Data Breaks DevOps: Why you need automated ...
Curiosity and Lemontree present - Data Breaks DevOps: Why you need automated ...Curiosity and Lemontree present - Data Breaks DevOps: Why you need automated ...
Curiosity and Lemontree present - Data Breaks DevOps: Why you need automated ...
 
Solnet dev secops meetup
Solnet dev secops meetupSolnet dev secops meetup
Solnet dev secops meetup
 
EAS-SEC Project
EAS-SEC ProjectEAS-SEC Project
EAS-SEC Project
 
Elevate your Splunk Deployment by Better Understanding your Value Breakfast S...
Elevate your Splunk Deployment by Better Understanding your Value Breakfast S...Elevate your Splunk Deployment by Better Understanding your Value Breakfast S...
Elevate your Splunk Deployment by Better Understanding your Value Breakfast S...
 
Applying Auto-Data Classification Techniques for Large Data Sets
Applying Auto-Data Classification Techniques for Large Data SetsApplying Auto-Data Classification Techniques for Large Data Sets
Applying Auto-Data Classification Techniques for Large Data Sets
 
The Notorious 9: Is Your Data Secure in the Cloud?
The Notorious 9: Is Your Data Secure in the Cloud?The Notorious 9: Is Your Data Secure in the Cloud?
The Notorious 9: Is Your Data Secure in the Cloud?
 
CisCon 2018 - Analytics per Storage Area Networks
CisCon 2018 - Analytics per Storage Area NetworksCisCon 2018 - Analytics per Storage Area Networks
CisCon 2018 - Analytics per Storage Area Networks
 
Scalar Security Roadshow April 2015
Scalar Security Roadshow April 2015Scalar Security Roadshow April 2015
Scalar Security Roadshow April 2015
 
Sean White- Kansas City
Sean White- Kansas CitySean White- Kansas City
Sean White- Kansas City
 
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
 
BREACHED: Data Centric Security for SAP
BREACHED: Data Centric Security for SAPBREACHED: Data Centric Security for SAP
BREACHED: Data Centric Security for SAP
 
Data in Motion - tech-intro-for-paris-hackathon
Data in Motion - tech-intro-for-paris-hackathonData in Motion - tech-intro-for-paris-hackathon
Data in Motion - tech-intro-for-paris-hackathon
 
Cisco Connect Toronto 2018 an introduction to Cisco kinetic
Cisco Connect Toronto 2018   an introduction to Cisco kineticCisco Connect Toronto 2018   an introduction to Cisco kinetic
Cisco Connect Toronto 2018 an introduction to Cisco kinetic
 
Cisco Connect Toronto 2018 an introduction to Cisco kinetic
Cisco Connect Toronto 2018   an introduction to Cisco kineticCisco Connect Toronto 2018   an introduction to Cisco kinetic
Cisco Connect Toronto 2018 an introduction to Cisco kinetic
 

More from Rising Media, Inc.

More from Rising Media, Inc. (20)

1415 track 1 wu_using his laptop
1415 track 1 wu_using his laptop1415 track 1 wu_using his laptop
1415 track 1 wu_using his laptop
 
Matt gershoff
Matt gershoffMatt gershoff
Matt gershoff
 
Keynote adam greco
Keynote adam grecoKeynote adam greco
Keynote adam greco
 
1620 keynote olson_using our laptop
1620 keynote olson_using our laptop1620 keynote olson_using our laptop
1620 keynote olson_using our laptop
 
1530 track 2 stuart_using our laptop
1530 track 2 stuart_using our laptop1530 track 2 stuart_using our laptop
1530 track 2 stuart_using our laptop
 
1530 track 1 fader_using our laptop
1530 track 1 fader_using our laptop1530 track 1 fader_using our laptop
1530 track 1 fader_using our laptop
 
1415 track 2 richardson
1415 track 2 richardson1415 track 2 richardson
1415 track 2 richardson
 
1215 daa lunch owusu_using our laptop
1215 daa lunch owusu_using our laptop1215 daa lunch owusu_using our laptop
1215 daa lunch owusu_using our laptop
 
1215 daa lunch a bos intro slides_using our laptop
1215 daa lunch a bos intro slides_using our laptop1215 daa lunch a bos intro slides_using our laptop
1215 daa lunch a bos intro slides_using our laptop
 
915 e metrics_claudia perlich
915 e metrics_claudia perlich915 e metrics_claudia perlich
915 e metrics_claudia perlich
 
855 sponsor movassate_using our laptop
855 sponsor movassate_using our laptop855 sponsor movassate_using our laptop
855 sponsor movassate_using our laptop
 
1615 plack using our laptop
1615 plack using our laptop1615 plack using our laptop
1615 plack using our laptop
 
1530 rimmele do not share
1530 rimmele do not share1530 rimmele do not share
1530 rimmele do not share
 
1325 keynote yale_pdf shareable
1325 keynote yale_pdf shareable1325 keynote yale_pdf shareable
1325 keynote yale_pdf shareable
 
1115 fiztgerald schuchardt
1115 fiztgerald schuchardt1115 fiztgerald schuchardt
1115 fiztgerald schuchardt
 
1000 kondic do not share
1000 kondic do not share1000 kondic do not share
1000 kondic do not share
 
905 keynote peele_using our laptop
905 keynote peele_using our laptop905 keynote peele_using our laptop
905 keynote peele_using our laptop
 
Stephen morse sharable
Stephen morse sharableStephen morse sharable
Stephen morse sharable
 
Elder shareable
Elder shareableElder shareable
Elder shareable
 
1115 ramirez using our laptop
1115 ramirez using our laptop1115 ramirez using our laptop
1115 ramirez using our laptop
 

Recently uploaded

如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
ju0dztxtn
 
如何办理哥伦比亚大学毕业证(Columbia毕业证)成绩单原版一比一
如何办理哥伦比亚大学毕业证(Columbia毕业证)成绩单原版一比一如何办理哥伦比亚大学毕业证(Columbia毕业证)成绩单原版一比一
如何办理哥伦比亚大学毕业证(Columbia毕业证)成绩单原版一比一
fztigerwe
 
原件一样伦敦国王学院毕业证成绩单留信学历认证
原件一样伦敦国王学院毕业证成绩单留信学历认证原件一样伦敦国王学院毕业证成绩单留信学历认证
原件一样伦敦国王学院毕业证成绩单留信学历认证
pwgnohujw
 
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
dq9vz1isj
 
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Valters Lauzums
 
如何办理加州大学伯克利分校毕业证(UCB毕业证)成绩单留信学历认证
如何办理加州大学伯克利分校毕业证(UCB毕业证)成绩单留信学历认证如何办理加州大学伯克利分校毕业证(UCB毕业证)成绩单留信学历认证
如何办理加州大学伯克利分校毕业证(UCB毕业证)成绩单留信学历认证
a8om7o51
 
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
zifhagzkk
 
Audience Researchndfhcvnfgvgbhujhgfv.pptx
Audience Researchndfhcvnfgvgbhujhgfv.pptxAudience Researchndfhcvnfgvgbhujhgfv.pptx
Audience Researchndfhcvnfgvgbhujhgfv.pptx
Stephen266013
 

Recently uploaded (20)

Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
 
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
 
Seven tools of quality control.slideshare
Seven tools of quality control.slideshareSeven tools of quality control.slideshare
Seven tools of quality control.slideshare
 
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
 
Northern New England Tableau User Group (TUG) May 2024
Northern New England Tableau User Group (TUG) May 2024Northern New England Tableau User Group (TUG) May 2024
Northern New England Tableau User Group (TUG) May 2024
 
如何办理哥伦比亚大学毕业证(Columbia毕业证)成绩单原版一比一
如何办理哥伦比亚大学毕业证(Columbia毕业证)成绩单原版一比一如何办理哥伦比亚大学毕业证(Columbia毕业证)成绩单原版一比一
如何办理哥伦比亚大学毕业证(Columbia毕业证)成绩单原版一比一
 
What is Insertion Sort. Its basic information
What is Insertion Sort. Its basic informationWhat is Insertion Sort. Its basic information
What is Insertion Sort. Its basic information
 
Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...
Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...
Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...
 
Digital Marketing Demystified: Expert Tips from Samantha Rae Coolbeth
Digital Marketing Demystified: Expert Tips from Samantha Rae CoolbethDigital Marketing Demystified: Expert Tips from Samantha Rae Coolbeth
Digital Marketing Demystified: Expert Tips from Samantha Rae Coolbeth
 
The Significance of Transliteration Enhancing
The Significance of Transliteration EnhancingThe Significance of Transliteration Enhancing
The Significance of Transliteration Enhancing
 
MATERI MANAJEMEN OF PENYAKIT TETANUS.ppt
MATERI  MANAJEMEN OF PENYAKIT TETANUS.pptMATERI  MANAJEMEN OF PENYAKIT TETANUS.ppt
MATERI MANAJEMEN OF PENYAKIT TETANUS.ppt
 
Statistics Informed Decisions Using Data 5th edition by Michael Sullivan solu...
Statistics Informed Decisions Using Data 5th edition by Michael Sullivan solu...Statistics Informed Decisions Using Data 5th edition by Michael Sullivan solu...
Statistics Informed Decisions Using Data 5th edition by Michael Sullivan solu...
 
原件一样伦敦国王学院毕业证成绩单留信学历认证
原件一样伦敦国王学院毕业证成绩单留信学历认证原件一样伦敦国王学院毕业证成绩单留信学历认证
原件一样伦敦国王学院毕业证成绩单留信学历认证
 
Formulas dax para power bI de microsoft.pdf
Formulas dax para power bI de microsoft.pdfFormulas dax para power bI de microsoft.pdf
Formulas dax para power bI de microsoft.pdf
 
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
 
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
 
Bios of leading Astrologers & Researchers
Bios of leading Astrologers & ResearchersBios of leading Astrologers & Researchers
Bios of leading Astrologers & Researchers
 
如何办理加州大学伯克利分校毕业证(UCB毕业证)成绩单留信学历认证
如何办理加州大学伯克利分校毕业证(UCB毕业证)成绩单留信学历认证如何办理加州大学伯克利分校毕业证(UCB毕业证)成绩单留信学历认证
如何办理加州大学伯克利分校毕业证(UCB毕业证)成绩单留信学历认证
 
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
 
Audience Researchndfhcvnfgvgbhujhgfv.pptx
Audience Researchndfhcvnfgvgbhujhgfv.pptxAudience Researchndfhcvnfgvgbhujhgfv.pptx
Audience Researchndfhcvnfgvgbhujhgfv.pptx
 

1330 keynote shahapurkar

  • 1. © 2016 Fair Isaac Corporation. Confidential. 1 © 2016 Fair Isaac Corporation. Confidential. This presentation is provided for the recipient only and cannot be reproduced or shared without Fair Isaac Corporation’s express consent. A Consortium and Its Data Fraud Screening for 2/3rds of All Card Transactions Scott Zoldi, PhD Chief Analytics Officer, FICO ScottZoldi@fico.com @ScottZoldi Predictive Analytics World for Business San Francisco May 16, 2017
  • 2. © 2016 Fair Isaac Corporation. Confidential. 2 About Me • Responsible for analytic development of FICO’s product and technology solutions, including Falcon Fraud Manager • 17 years at FICO • Author of 77 patents ─ 38 granted and 39 in process • Recent focus on self learning analytics for real-time detection of Cyber Security attacks, AML detection, and mobile device analytics • Ph.D. in theoretical physics from Duke University
  • 3. © 2016 Fair Isaac Corporation. Confidential. 3 History of Fraud Detection at FICO Falcon introduced Percentage of the US payment cards covered by FICO fraud solutions90% 20 15 10 5 0 1990 1994 1998 2002 2006 2014 Fraud Losses Banks participating in FICO’s fraud data consortium9,000 Active financial accounts protected by FICO worldwide2.6B Average response time for fraud decisions rendered by Falcon10ms 2010
  • 4. © 2016 Fair Isaac Corporation. Confidential. 4 Profiles Summarize Customer Transaction History Recursive analytic features to efficiently summarize history numTrx(t-2) numCashTrx(t-2) avgAmount4hr(t-2) Profile(t-2) Profile(t-1) Profile(t) numTrx(t-1) numCashTrx(t-1) avgAmount4hr(t-1) numTrx(t) numCashTrx(t) avgAmount4hr(t) Patents 14/796,547 (USA) ,14/613,300 (USA) 12:31:05, MCC123, USD, ... 13:01:15, MCC234, USD, ... 13:32:07, MCC345, USD, ... 14:03:25, ATM223, USD, ... ... ... 18:42:27, MCC567, USD, ... ... ... Customer History Too big! Time  Amount Clothing Expense Restaurant Expense Cash withdrawal at ATM
  • 5. © 2016 Fair Isaac Corporation. Confidential. 5 Fraud Detection Through Neural Networks Activation function 𝑓 𝑧 = 𝑒 𝑧 − 𝑒−𝑧 𝑒 𝑧 + 𝑒−𝑧 Powerful detection of known patterns of frauds using supervised learning • Features based on raw transactions Model Inputs • Computational unit takes input and generates output • Uses an activation function Hidden Layer • Single output indicating fraud/non-fraud Model Output Hidden Layer Input Layer Feature Extraction Output Layer
  • 6. © 2016 Fair Isaac Corporation. Confidential. 6 Fraud Detection With Unsupervised Learning Streaming self-calibration • Track 95% and 99% points automatically • For each feature, create outlier model • Memory & time efficient, no historical data storage • Real-time adapting to every transaction Multi-layered Self-Calibrating (MLSC) Score • Combine the outlier models from all features • Features in hidden nodes are selected to minimize correlation • Weights based on: • Expert knowledge • Limited data Multi-Layer Self-Calibrating Score Hidden Layer Input Layer Output Layer Weight Tuning Patents 8,027,439, 8,041,597 13/367,344 (USA), 14/796,547 (USA) ,14/613,300 (USA Low Risk Feature Probability Current feature value High Risk 95%99% Scores 1– 999 Effective detection of unknown and changing patterns of frauds
  • 7. © 2016 Fair Isaac Corporation. Confidential. 7 Auto-Encoder Learns a Data Representation • Deep Learning algorithm that sets target values equal to input and applies unsupervised learning to minimize the reconstruction error ─ Provides a compressed distributed representation (encoding) of original data. 𝑥 𝑊𝐸 𝑊𝐷 𝐸 𝑅 = 𝑥 − 𝑥 𝑅 2 𝑥 𝑅𝑥 𝑥 𝑅 Learning Latent features Reconstruction Error Reconstructed Image https://commons.wikimedia.org/w/index.php?curid=488211
  • 8. Power of Pooling Data: Consortium
  • 9. Data Consortium
      Clients benefit from the pooled data.
      • Most transactions are genuine
        ─ Fraud: a rare-class problem
        ─ Genuine cases could be non-representative
        ─ Hard to create a great model
      • Consortium
        ─ Pools data from across the globe
        ─ More fraud cases
        ─ More diverse data
        ─ Superior models
      [Graphic: percentage of credit card accounts in the world that are covered by FICO fraud solutions]
  • 10. Challenges Working With Consortium
      Terabytes of raw data received each month.
      • Media and frequency
      • Processing power and time
      • Data format
      • Data security and compliance
      • Cross-contamination
      • Data quality
      • Client uniqueness
  • 11. Consortium Data Flow: 4 Steps
      Receive File → Process File → Process Data → Clean Data
  • 12. (1) Data Transmission
      Data → FICO “Landing Zone”
      • Process 50,000+ consortium files per month
      • About 1 petabyte of payment card data in 5 years
      • Electronic, disc, etc.
      • Daily, weekly, monthly
  • 13. (2) Data Security and Basic ETL
      • Encrypt and obfuscate (hash) PII and sensitive client data
      • Quarantine data that fails the FICO Data Security Analysis
      • File ETL: archive, inventory, join, transform, and tag
      • Step summary: receipt, security, basic file processing, archive data
      [Diagram: Files Receipt → Encryption Check → Extract → File Tagging → Assign Client & Build Archives]
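The "obfuscate (hash)" step could look roughly like the sketch below. The slide does not say which hash is used, so the keyed HMAC-SHA256, the field names in `SENSITIVE_FIELDS`, and the record layout are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical names of fields that carry PII or sensitive client data.
SENSITIVE_FIELDS = {"pan", "customer_name"}


def obfuscate(value: str, secret_key: bytes) -> str:
    """Replace a sensitive value with a keyed SHA-256 digest (hex string)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def obfuscate_record(record: dict, secret_key: bytes) -> dict:
    """Hash the sensitive fields of one record and pass the others through."""
    return {
        name: obfuscate(value, secret_key) if name in SENSITIVE_FIELDS else value
        for name, value in record.items()
    }


# Usage: the same input always maps to the same token, so records can still be joined.
clean = obfuscate_record({"pan": "4111111111111111", "amount": "12.50"}, b"demo-key")
```

A keyed hash keeps tokens consistent across files, which the join step of the file ETL needs, without exposing the original values.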
  • 14. (3) High Level Statistically-Based Alerting and Trend Analysis
      • FICO processes about 20 billion Falcon records/month
      • Each file is checked against global and client-specific statistical distributions
      • Step summary: health checks, statistical analysis, tabulate
      [Figure: data checks on transaction volume over time, with missing data flagged; example alert sent to analystA@fico.com, "Subject: Data Issue. File x has invalid fraud dates"]
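A check of one file statistic against client-specific and global baselines might be sketched as follows. The z-score test, the threshold of 3, and the choice of statistic (for example, the share of records with an invalid fraud date) are assumptions; the slide only says that files are compared with statistical distributions.

```python
import statistics


def z_score(value, history):
    """Distance of value from the historical mean, in standard deviations."""
    mean = statistics.mean(history)
    spread = statistics.stdev(history)
    return (value - mean) / spread if spread > 0 else 0.0


def check_file(stat_value, client_history, global_history, threshold=3.0):
    """Return one alert message per baseline the file statistic deviates from."""
    alerts = []
    for label, history in (("client", client_history), ("global", global_history)):
        z = z_score(stat_value, history)
        if abs(z) > threshold:
            alerts.append(f"{label} baseline: z = {z:.1f}")
    return alerts


# Usage: invalid-date rates of earlier files (made-up numbers) vs. a new file.
client_rates = [0.010, 0.012, 0.011, 0.009, 0.010]
global_rates = [0.020, 0.010, 0.030, 0.015, 0.025]
alerts = check_file(0.30, client_rates, global_rates)
```

A non-empty result would feed an alert like the example email on the slide.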
  • 15. (4) Apply Domain and Client Specific Transformations
      Regularly review transformations for relevance and sunset if obsolete.
      • Data fixes
      • Automatic, client-specific
      • Cross-client uniformity
      [Figure: Raw File → Clean File; binary field values (001000001, 01010100, 01001101) alongside labels such as ATM, Currency Code, Spain, CVV2 valid, Cust. not present, .com merchant, eCommerce; a listing of record lengths (500, 492, 494, then 500 repeated)]
  • 16. Model Governance: Applications of Supervised and Unsupervised Learning Technologies
  • 17. Model Governance Is Serious Business
      Stakeholders: regulators, customers, internal audit (diagram labels: OCC, FICO Client Requests, FICO Model Governance)
      • Modeling data: model inputs, development data, data quality, sensitivity analysis
      • Model specifications: model structure, model assumptions, benchmark/alternative architectures, model updates
      • Model validation: data validation, model validation
      • Deployment validation: post-implementation validation, performance monitoring
  • 18. The OLD: Data Quality Reporting
      • Check basic data integrity
        ─ Data quality reports
        ─ Red flags: missing records, fields, or incorrect data types
      • Monitor before and during model deployment
        ─ Data statistics
        ─ Score distributions
        ─ Model performance snapshots
  • 19. The NEW: Auto-Encoders in Model & Data Governance
      Four uses with production data and the production model:
      1. Data Feed Validation
      2. New Client Model Selection
      3. Trigger Model-Retrain
      4. Anomalous Transaction Identification
  • 20. 1. Data Feed Validation
      Statistical analysis is often too generic to point to data integrity issues.
      • An auto-encoder can easily identify sets of transactions across clients with different reconstruction errors, which point to key data integrity issues
      • Cluster the reconstruction errors, then run a per-cluster root cause analysis
      [Figure: frequency vs. transaction amount, contrasting wrong and correct currency conversion]
  • 21. 2. New Client Model Selection
      Identify the model trained on consortium data that gives the minimal reconstruction error on the new client’s transaction data.
      [Figure: minimum reconstruction error; "Indonesia?"]
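The selection rule can be sketched as an argmin over candidate models. The linear auto-encoders, their hand-picked weights, and the names `portfolio_a` and `portfolio_b` are hypothetical stand-ins for auto-encoders trained alongside each consortium model.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def mean_reconstruction_error(data, W_E, W_D):
    """Average squared reconstruction error of an auto-encoder over a data sample."""
    total = 0.0
    for x in data:
        x_R = matvec(W_D, matvec(W_E, x))
        total += sum((a - b) ** 2 for a, b in zip(x, x_R))
    return total / len(data)


def select_model(candidates, client_data):
    """Pick the candidate whose auto-encoder reconstructs the client data best."""
    return min(
        candidates,
        key=lambda name: mean_reconstruction_error(client_data, *candidates[name]),
    )


# Hypothetical candidates: name -> (W_E, W_D) of the companion auto-encoder.
candidates = {
    "portfolio_a": ([[0.5, 0.5]], [[1.0], [1.0]]),
    "portfolio_b": ([[0.5, -0.5]], [[1.0], [-1.0]]),
}
best = select_model(candidates, [[1.0, 1.0], [2.0, 2.0]])
```

The client's data needs no fraud labels for this comparison, which is what makes it usable before a new client has any fraud history.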
  • 22. 3. Trigger Model-Retrain
      Learn a companion auto-encoder network based on the same data as the unsupervised model.
        ─ The unsupervised model and the auto-encoder network are packaged together and installed in the production environment
      [Figures: reconstruction error over the timeline; one panel with no significant deviation, one with a significant deviation that leads to a rebuild of the score]
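A retrain trigger based on reconstruction-error drift could be sketched like this. The rule (recent mean above baseline mean plus k standard deviations) and the value k = 3 are assumptions; the slide only distinguishes "no significant deviation" from "significant deviation".

```python
import statistics


def needs_retrain(baseline_errors, recent_errors, k=3.0):
    """Flag a rebuild when recent reconstruction error drifts well above baseline."""
    baseline_mean = statistics.mean(baseline_errors)
    baseline_std = statistics.stdev(baseline_errors)
    return statistics.mean(recent_errors) > baseline_mean + k * baseline_std


# Usage with made-up error values from the companion auto-encoder.
baseline = [1.0, 1.1, 0.9, 1.0, 1.0]            # errors observed at deployment time
stable = needs_retrain(baseline, [1.0, 1.05, 0.95])  # production still looks like training data
drifted = needs_retrain(baseline, [2.0, 2.1, 1.9])   # production has moved away from it
```

Because the auto-encoder is trained on the same data as the model, rising error suggests production data no longer resembles the training data.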
  • 23. 4. Anomalous Transaction Identification
      • Outlier detection
      • Score- or rule-triggered review
      • Feature engineering
      [Figure: reconstruction error over the timeline, with above-threshold errors marked]
  • 24. Surfacing Patterns: Leveraging Consortium Data to Inform Predictive Modeling
  • 25. Investigative Analysis: $1 Transactions
      Three large US credit issuers for 2014 and 2015. MCC = Merchant Category Code.
      [Figure: $ amount and transaction count; anomaly emerges: 8% of $1 CNP]
  • 26. Investigating: Eating Places Analysis
      75% of $1 CNP fraud: 2 food vendors. What can we learn about the vendors?
  • 27. Investigating: Eating Places Analysis
      • Vendor 1: 32% Ep-MCC; Vendor 2: 42% Ep-MCC
      • 7,791 fraud PANs and 19,199 fraud PANs
      • Why?
  • 28. Investigating: Visibility Makes a Huge Difference
      • Vendor 1: 32% Ep-MCC; Vendor 2: 42% Ep-MCC
      • ~55% of PANs closed on the 1st day (industry avg.)
      • ~90% closed on the 1st day
      • Specifically targeted 1 issuer!
      • Compromise detection: an effective strategy by the credit issuer!
  • 29. Investigating New Card Testing Scheme
      • Fraudsters do “test” transactions
        ─ Usually $1
        ─ Fraudsters invent new testing schemes to evade detection
      • $22M loss could have been reduced by Vendor 1
      • US CNP fraud ~ $3.8 billion/year (Aite Group)
      CNP = Card Not Present, i.e., online, phone, etc.
      [Figure: example trend of fraud dollars after a “test” transaction; logarithmic dollar axis from $1 to $1,000, horizontal axis 1 to 16]
  • 30. Investigating New Card Testing Scheme
      • Detecting and responding to schemes
        ─ Fraudsters will continuously attempt new attack methods
        ─ Data science informs model design
      • Data science
        ─ Brute-force statistics can miss changes
        ─ Autoencoders can detect shifts in fraud patterns in real time
      • Model design
        ─ Adaptive models to learn new fraud patterns
        ─ Entity profiles respond at the entity level
  • 31. Consortium, Deep Learning, and Data Science Step Up the Fight Against Cybercrime
  • 32. Thank You
      Scott Zoldi, Chief Analytics Officer, FICO
      scottzoldi@fico.com, @ScottZoldi