DutchMLSchool. Anomaly Detection in KYC

Machine Learning for Your Business: Anomaly Detection in KYC (Know Your Customer) - Main Conference: Introduction to Machine Learning.
DutchMLSchool: 1st edition of the Machine Learning Summer School in The Netherlands.

Published in: Data & Analytics

  1. Jan W. Veldsink MSc
  2. Co-organized by: Sponsored by: Business Partners:
  3. 1st edition | July 8 - 11, 2019
  4. THE ART OF AI: AI - DEMYSTIFIED. Jan W. Veldsink MSc

  7. Anomaly detection
  8. Types of Machine Learning
  9. Early Detection of Unwanted Behaviour
      • The process can be automated completely.
      • BigML’s early detection systems are capable of predicting, in batch or in real time, whether a transaction is suspicious or not.
      • Generation of reports that rank transactions by anomaly score.
      • Diagram labels: Data; Application / Reports; Clients / Accounts; Operations; Payment Records; Anomaly Detector; Algorithmic Modeling Process; Historical Data; Data Filtered by Peer Group; Peer Group Anomaly Detector; Anomalous Data.
      • Multiple Levels of Anomaly Scores: account anomaly score, customer anomaly score, cluster-level anomaly score. Anomaly scores can also be computed at the account / customer level, or even at the cluster (similar user) level, to reinforce positive alarms.
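
As an illustration of the ranked report and the multi-level scores mentioned on the slide above, the following minimal sketch ranks transactions by anomaly score and rolls the scores up to account and customer level. The transaction tuples, the function names `rank_transactions` and `aggregate_scores`, and the use of the mean as the roll-up are assumptions made for this example. The slides do not state how the scores were aggregated.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical, already-scored transactions:
# (transaction_id, customer_id, account_id, anomaly_score in [0, 1]).
scored = [
    ("t1", "c1", "a1", 0.20),
    ("t2", "c1", "a1", 0.90),
    ("t3", "c1", "a2", 0.30),
    ("t4", "c2", "a3", 0.10),
    ("t5", "c2", "a3", 0.40),
]


def rank_transactions(rows):
    """Return the transactions sorted from most to least anomalous."""
    return sorted(rows, key=lambda row: row[3], reverse=True)


def aggregate_scores(rows, key_index):
    """Average transaction scores per customer (key_index=1) or account (key_index=2).

    The mean is an assumed roll-up; a maximum or a separate model would also be plausible.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[3])
    return {key: mean(values) for key, values in groups.items()}


report = rank_transactions(scored)
account_scores = aggregate_scores(scored, key_index=2)
customer_scores = aggregate_scores(scored, key_index=1)
```

A transaction that ranks high in `report` and also belongs to an account or customer with a high aggregated score would be a stronger alarm than one that is high at a single level only.
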
  10. Timeline
      • Days: T F S S M T W T F S S M
      • Events: First call with Wibout; Getting my head right and searching for data; Machine learning concepts and setup; Technical bumpy road / Data / RiskShield / BigML; Experiments and reaching goal; Some panic; Data arrived; First results.
      • Thanks to: Wijnand Nuij, for helping to conceptualise the data; Robert de Jong, for the flexibility and delivery of the data; the BigML team, for the relentless support and help (24/7).
  11. CDD LOW
      • What is the question?
      • Do we as a bank have good sight of our Low customers?
      • NP?
      • ORG?
      • Yes, in monitoring… but what about the clients we never see or get alerts on?
      • Can we create a machine learning model that gives insight into the anomalies in those client groups? And can we use peer groups to do this?
  12. CDD LOW
      • Preprocessing:
      • Copied matrices (customer profiles, aggregated and more detailed customer and transaction data)
      • Generated an output with transaction and customer KYC data per account number
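
The preprocessing step on the slide above, one output row per account number combining transaction and customer KYC data, might look roughly like the sketch below. All field names, the sample records, and the helper `build_account_rows` are hypothetical. The actual matrices and their columns are not shown in the slides.

```python
from collections import defaultdict

# Hypothetical customer KYC records keyed by customer ID.
customers = {
    "c1": {"customer_type": "NP", "segment": "retail", "risk_rating": "low"},
    "c2": {"customer_type": "ORG", "segment": "sme", "risk_rating": "low"},
}
# Hypothetical account table: account number -> owning customer ID.
accounts = {"acc-1": "c1", "acc-2": "c2"}
# Hypothetical transactions: (account_number, amount, is_cash).
transactions = [
    ("acc-1", 100.0, True),
    ("acc-1", 250.0, False),
    ("acc-2", 40.0, True),
]


def build_account_rows(customers, accounts, transactions):
    """Return one flat record per account: KYC fields plus transaction aggregates."""
    totals = defaultdict(
        lambda: {"n_transactions": 0, "total_amount": 0.0, "cash_amount": 0.0}
    )
    for account_number, amount, is_cash in transactions:
        agg = totals[account_number]
        agg["n_transactions"] += 1
        agg["total_amount"] += amount
        if is_cash:
            agg["cash_amount"] += amount

    rows = []
    for account_number, customer_id in accounts.items():
        row = {"account_number": account_number, "customer_id": customer_id}
        row.update(customers[customer_id])  # customer KYC data
        row.update(totals[account_number])  # zeros if the account has no transactions
        rows.append(row)
    return rows


rows = build_account_rows(customers, accounts, transactions)
```
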
  13. Data Sources
      • Customer information
      • KYC information
      • Account information
      • Transaction information
      • Cash transaction information
  14. Data Fields NP customers
      • Customer information
      • Customer Internal ID in the bank
      • Customer Type
      • Customer Segment
      • Residency Country
      • Date of Birth
      • Risk rating of the customer
  15. Data Fields ORG customers
      • Customer Internal ID in the bank
      • Customer Type
      • Customer Segment
      • Country of incorporation
      • Domiciled organization
      • Industry
      • Risk rating of the customer
  16. Split on NP - ORG
      • NP: Dataset = NP customers with only CDD=A → Anomaly model per peer group: Age category _ Account type → Filter anomaly score > XX% → Output; Create explain clustering
      • ORG: Dataset = ORG customers with only CDD=A → Anomaly model per peer group: SBI code _ Account type → Filter anomaly score > XX% → Output; Create explain clustering
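
The flow on the slide above (keep CDD=A customers, split NP from ORG, score each record against its own peer group, keep scores above a threshold) can be sketched as follows. This is not the BigML anomaly detector used in the project: a simple median/MAD distance, squashed into the range 0 to 1, stands in for it here. The field names (`cdd`, `age_category`, `sbi_code`, `account_type`, `cash_total`) and the sample rows are hypothetical, and the threshold stays a parameter because the slide only gives it as XX%.

```python
from collections import defaultdict
from statistics import median


def robust_score(value, values):
    """Stand-in anomaly score in [0, 1): distance from the peer-group median in MAD units."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return 0.0 if value == med else 1.0
    distance = abs(value - med) / mad
    return distance / (1.0 + distance)


def peer_group_key(row):
    """NP peers share age category and account type; ORG peers share SBI code and account type."""
    first = row["age_category"] if row["customer_type"] == "NP" else row["sbi_code"]
    return f'{row["customer_type"]}:{first}_{row["account_type"]}'


def flag_anomalies(rows, feature, threshold):
    """Score each CDD=A record within its peer group and keep scores above the threshold."""
    groups = defaultdict(list)
    for row in rows:
        if row["cdd"] == "A":
            groups[peer_group_key(row)].append(row)

    flagged = []
    for key, members in groups.items():
        values = [member[feature] for member in members]
        for member in members:
            score = robust_score(member[feature], values)
            if score > threshold:
                flagged.append((member["account"], key, score))
    return flagged


def np_row(account, cash_total, cdd="A"):
    return {"account": account, "customer_type": "NP", "cdd": cdd,
            "age_category": "<30", "account_type": "current", "cash_total": cash_total}


def org_row(account, cash_total):
    return {"account": account, "customer_type": "ORG", "cdd": "A",
            "sbi_code": "4711", "account_type": "business", "cash_total": cash_total}


rows = [
    np_row("np-1", 100.0), np_row("np-2", 120.0), np_row("np-3", 110.0),
    np_row("np-4", 130.0), np_row("np-5", 5000.0),
    np_row("np-6", 9000.0, cdd="B"),  # excluded: not CDD=A
    org_row("org-1", 1000.0), org_row("org-2", 1100.0), org_row("org-3", 900.0),
]

# 0.9 is an arbitrary illustration value, not the XX% threshold from the slide.
flagged = flag_anomalies(rows, feature="cash_total", threshold=0.9)
```

Scoring within the peer group rather than across all customers is the point of the design: a cash total that is ordinary for one group can be unusual for another.
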
  17. Some results
  18. A lot of cash in half a year for an NP customer younger than 30
  19. Final Thoughts on ML Projects
      • Project cycle: Collect & transform data; Build a model; Validate; Analyze what’s missing; Deploy model; Monitor model.
      • And finally, monitoring deployed models is important for business-critical scenarios.
  20. RiskShield Server
      • Diagram labels: File per peer group; Create Anomaly / Model; Execute RiskShield; NP - file; ORG - file; RiskShield Output; Place Sem-per-anomaly model; PMML-Wrapped-Json (shown twice).
      • When all models are loaded: rerun the NP - ORG by adding sem per input file.
  23. THE ART OF AI: AI - DEMYSTIFIED. JAN VELDSINK MSC, JAN@GRIO.NL, J.VELDSINK@NYENRODE.NL
  24. Co-organized by: Sponsored by: Business Partners:
