From fraudulence to adversarial learning
The First NIDA Business Analytics and Data Sciences Contest/Conference
1-2 September 2016, Navamindradhiraj Building, National Institute of Development Administration (NIDA)
https://businessanalyticsnida.wordpress.com
https://www.facebook.com/BusinessAnalyticsNIDA/
- Fraud detection (ID theft): approach and process
- Evolution of fraud toward sophisticated actors: adversarial learning
จรัล งามวิโรจน์เจริญ
Current chief data scientist and VP of Data Innovation Lab at Sertis,
Former lead data scientist of Booz Allen Hamilton
Navamindradhiraj 3002, 1 September 2016, 15:15-15:45
From Fraudulence to Adversarial Learning
Theft
Address
National ID
Phone Number
Child Name
Spouse Name
Bank Account
Credit Card Number
User Profile
Electronic Record
Who?
ID Theft Definition
Business Objectives
• Financial/Medical/Insurance ID Theft
• Synthetic
• Account takeover (ATO)
Common Types of ID Theft
Business Objectives
Data Exploration/Preparation
Modeling
Evaluation
Deployment
Fraud Definition
Objectives
Account
Transaction
Behavior
External Data
Feature Engineering
Supervised Learning
Unsupervised Learning
Ensemble Model
Performance Metrics
Parameter Tuning
Platform Testing
Train vs Test
Fraud Modeling
Random Forest
Support Vector Machine (SVM)
Deep Learning - Stacked Denoising Autoencoder (SdA)
Unsupervised
Supervised
Multistage Ensemble Model
Feature Extraction
Boosting
Feature Extraction - Ensemble
                IDT   NonIDT    Total
Selected          8        2       10
Not Selected      8      982      990
Total            16      984    1,000
Determined by the Model's Performance
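The table above can be read as a confusion matrix. A minimal sketch, assuming "Selected" means the model flagged the record and "IDT" means the record is a real ID-theft case; the metric names are standard, but the slide itself does not say which ones were used:

```python
# Counts copied from the slide's 2x2 table (1,000 records in total).
tp = 8    # selected by the model, actually IDT
fp = 2    # selected by the model, actually NonIDT
fn = 8    # not selected, actually IDT
tn = 982  # not selected, actually NonIDT

total = tp + fp + fn + tn

precision = tp / (tp + fp)            # share of selected records that are IDT
recall = tp / (tp + fn)               # share of IDT records that get selected
false_positive_rate = fp / (fp + tn)  # share of NonIDT records wrongly selected
accuracy = (tp + tn) / total          # share of all records classified correctly

print(f"precision={precision:.3f} recall={recall:.3f}")
print(f"false_positive_rate={false_positive_rate:.4f} accuracy={accuracy:.3f}")
```

Accuracy is high mainly because NonIDT records dominate, which is why precision and recall are usually more informative for a rare class like this one.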
                IDT   NonIDT    Total
Selected          8        2       10
Not Selected      8      982      990
Total            16      984    1,000
IDT Definition -> IDT Prevalence Estimate in Population
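The column totals depend on how IDT is defined, and they set the prevalence (16 of 1,000 here). A rough sketch of why that matters, assuming the model's recall and false-positive rate stay fixed while prevalence changes; the function name and the two lower prevalence values are hypothetical and not from the slides:

```python
def expected_precision(prevalence, recall, false_positive_rate):
    """Expected share of selected records that are IDT (Bayes' rule)."""
    true_alerts = recall * prevalence
    false_alerts = false_positive_rate * (1.0 - prevalence)
    return true_alerts / (true_alerts + false_alerts)

# Rates implied by the slide's table.
recall = 8 / 16
false_positive_rate = 2 / 984
table_prevalence = 16 / 1000

# The last two prevalence values are hypothetical, for comparison only.
for prevalence in (table_prevalence, 0.005, 0.001):
    p = expected_precision(prevalence, recall, false_positive_rate)
    print(f"prevalence={prevalence:.3%} -> expected precision={p:.3f}")
```

With the table's own prevalence the function reproduces the table's precision of 8 out of 10; a narrower IDT definition (lower prevalence) pushes expected precision down even though the model is unchanged.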
                IDT   NonIDT    Total
Selected          8        2       10
Not Selected      8      982      990
Total            16      984    1,000
Unverifiable During the Operation
http://manager.co.th/Daily/ViewNews.aspx?NewsID=9590000083749
Dark Web Marketplace - Credentials for Sale / Hacking Services
References:
Trend Micro, "Follow the Data: Dissecting Data Breaches and Debunking Myths"
SecureWorks, "Underground Hacker Markets"
New Trend - Adversarial Learning
Reference: https://sarahjamielewis.com/posts/adversarial-machine-learning.html
[Diagram: evasion] Model -> Generate new sample -> Desired outcome? Yes: evasion success. No: generate another sample.
[Diagram: poisoning] Regular training samples + generated malicious samples -> Model -> Desired outcome? Yes: model poisoned.
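The two attack loops in the diagrams can be sketched with a toy model. This is an illustration under stated assumptions, not the speaker's method: the threshold "model", the training data, and the function names (fit_threshold, evade, poison) are all hypothetical.

```python
import random
from statistics import mean

def fit_threshold(samples):
    """Toy model: flag amounts above the midpoint between the class means.
    samples is a list of (amount, is_fraud) pairs."""
    legit = [a for a, is_fraud in samples if not is_fraud]
    fraud = [a for a, is_fraud in samples if is_fraud]
    return (mean(legit) + mean(fraud)) / 2

def is_flagged(amount, threshold):
    return amount > threshold

def evade(amount, threshold, max_queries=1000, seed=0):
    """Evasion: keep generating new samples until the model stops flagging."""
    rng = random.Random(seed)
    for _ in range(max_queries):
        if not is_flagged(amount, threshold):
            return amount              # desired outcome: evasion success
        amount -= rng.uniform(10, 50)  # generate a new, slightly altered sample
    return None                        # gave up within the query budget

def poison(samples, target_amount, batch=5, max_rounds=100):
    """Poisoning: add malicious samples labelled legitimate to the training
    data until the retrained model no longer flags target_amount."""
    poisoned = list(samples)
    for _ in range(max_rounds):
        if not is_flagged(target_amount, fit_threshold(poisoned)):
            return poisoned            # desired outcome: model is poisoned
        poisoned += [(target_amount, False)] * batch
    return None

# Hypothetical training data: (transaction amount, is_fraud)
training = [(100, False), (150, False), (200, False),
            (900, True), (1000, True), (1100, True)]
threshold = fit_threshold(training)
evaded = evade(1000, threshold)
poisoned = poison(training, target_amount=800)

print("clean threshold:", threshold)
print("evading amount:", evaded)
print("threshold after poisoning:", fit_threshold(poisoned))
```

Evasion leaves the model untouched and changes the input; poisoning leaves the input untouched and changes what the model learns.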
