
How Big Data and Deep Learning are Revolutionizing AML and Financial Crime Detection

Banks, payment providers and capital markets firms are under intense regulatory mandate to process huge amounts of transaction-related data from both traditional and non-traditional sources. Compliance teams need to constantly analyze data-in-motion (wires, fund transfers, banking transactions) and data-at-rest (years' worth of historical data) for the actionable intelligence required for Suspicious Activity Reports (SARs), both to discover illegal activity and to provide detailed reporting to authorities. Annual estimates of global money laundering flows range anywhere from $1 trillion to $2 trillion, almost 5% of global GDP. Almost all of this is laundered via retail and merchant banks, payment networks, securities and futures firms, casino services and clubs, etc., which explains why annual AML-related fines on banking organizations run into the billions and are increasing every year. However, the number of SARs filed by banking institutions is much higher than the number filed by these other businesses. In this presentation we will discuss the business imperatives, value drivers and the woeful inadequacy of current technology architectures and approaches in tackling AML. We will then pivot to a deep dive on Big Data and predictive analytics and how they can ease and solve these vexing challenges that banking executives are grappling with globally.

Speaker
Sanjay Kumar, GM Industry Solutions - Telecom and FS, Hortonworks

Published in: Technology

  1. Big Data and Predictive Analytics for AML and Financial Crime Detection. Sanjay Kumar, GM Industry Solutions – Telecom & FS
  2. Agenda (© Hortonworks Inc. 2011 – 2016. All Rights Reserved)
     • Introduction
     • What is Financial Crime, AML and what we are seeing in the AML space
     • Brief discussion of customer activity in AML
     • Illustrative use cases
     • Where current implementations fall short
     • Reference architecture for AML and predictive analytics
     • Q&A
  3. FSI Industry Market Segments
     • Diagram labels: FSI Industry; Capital Markets; Investment Banks; Hedge Funds; Wealth Mgmt; Retail Lines; Consumer Lines; Corporate; Payments; Acquirer & Issuer Banks; Schemes; Market Exchanges
     • There are 4 primary market segments/sectors comprising the global FSI industry: Capital Markets, Retail Banking, Payments and Market Exchanges.
     • Each geography, country and state may have its own regulation and compliance requirements for products, distribution and rating. Banking is the most regulated industry!
     • It is key to understand the market segment of the banking company, as the business processes and the data/information needs and challenges are very different across the four segments. Additionally, challenges vary by premium/revenue tier.
     • There are many global FS companies which may define standards globally and deploy locally.
  4. Impact of Big Data in 5 major areas
     • Predictive Analytics and ML/DL: analytics enabling both defensive and offensive use cases
     • Digital Banking: enabling the digital bank, providing a seamless customer experience
     • Capital Markets: enhancing capabilities across investment banking, trading etc.
     • Wealth Management: improving wealth management capabilities, thereby providing enhanced customer service
     • Cybersecurity: helping defend institutions against cyber threats
  5. Why Big Data for Financial Crimes and Controls
     • Firms, large and small, need to navigate a set of increasingly complex compliance rules and regulations as regulatory bodies clamp down on loopholes in the financial regulatory framework. With tighter regulation comes the need to seek out more advanced and cost-effective compliance solutions.
     • The Financial Action Task Force estimates that over one trillion dollars is laundered annually.
     • Regulators increasingly require greater oversight from institutions, including closer monitoring for anti-money laundering (AML) and know your customer (KYC) compliance.
     • The methods and tactics used to launder money are constantly evolving. From loan-back schemes and front companies to trusts and black market currency exchanges, there is no "typical" money laundering case.
  6. What Is AML, Financial Crime and What We Are Seeing in AML
  7. What is AML and Financial Crimes
     • Financial crime is commonly considered as covering the following offences:
       – Fraud
       – Electronic crime (credit card, stolen information etc.)
       – Money laundering
       – Terrorist financing
       – Bribery and corruption (KYC)
       – Market abuse and insider dealing (trade surveillance)
       – Information security (cyber security)
     • Anti-money laundering (AML) is a term mainly used in the financial and legal industries to describe the legal controls that require financial institutions and other regulated entities to prevent or report money laundering activities.
  8. Financial Crime Is On the Rise!
     A poll of 500 executives and owners of small and medium businesses showed:
     • 58% of businesses were victims of fraud
     • 80% of banks failed to catch fraud before funds were transferred out
     • In 87% of fraud attacks, the bank was unable to fully recover assets
     • 40% of businesses said they have moved their banking activities elsewhere
     Only 20% of banks were able to identify fraud before money was transferred. "The ROI of investing in fraud prevention is clear."
     Source: Ponemon Institute/Guardian Analytics study, March 2010
  9. Key AML Use Cases
  10. Case 1: Understand Customer Profile (KYC)
     • Case description: Mr Alex is a compliance officer at ABC bank. While scrutinizing a number of customer profiles and their account activity, he noted suspicious activity in one customer's account. The customer profile and account activity show the following.
     • Customer profile:
       – Individual customer account; risk type classification: Sensitive Client, Senior Public Figure; customer carrying out large transactions
       – A number of transactions in the range of $10,000 to $5,000,000 carried out by the same customer within a short space of time
       – A number of customers sending payments to the same individual
     • Uniqueness of use case: multi-channel linked accounts involving multiple geographies
     • Data elements involved: customer data; transaction data over a 5-year period
     • Challenges with current technology: multiple linked accounts and retrieval of past history data beyond 6 months; real-time visualization
     • Supporting data required to simulate the use case: cross-currency, cross-geography locations; multiple-channel transactions; multiple cross-currency transactions from USD, SGD, GBP and EUR; nearly x accounts; across geography in 50 countries; between 500 and 600 CR/DB transactions every month
     • Results / objective of use case: to demonstrate multi-channel transactions with a historic data set
     • Visualization to show results of use case: to be identified
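The two behaviors listed in the Case 1 customer profile (many in-range transactions by one customer in a short time, and many customers paying the same individual) can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the record fields (`sender`, `beneficiary`, `amount`, `time`), the one-day window and the count thresholds are all assumptions chosen for the example; only the $10,000 to $5,000,000 range comes from the slide.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_fan_in(transactions, min_senders=3):
    """Beneficiaries paid by at least `min_senders` distinct senders (threshold is an assumption)."""
    senders = defaultdict(set)
    for tx in transactions:
        senders[tx["beneficiary"]].add(tx["sender"])
    return {b: sorted(s) for b, s in senders.items() if len(s) >= min_senders}


def flag_bursts(transactions, low=10_000, high=5_000_000,
                window=timedelta(days=1), min_count=3):
    """Senders with at least `min_count` in-range transactions inside one sliding window."""
    by_sender = defaultdict(list)
    for tx in transactions:
        if low <= tx["amount"] <= high:
            by_sender[tx["sender"]].append(tx["time"])

    flagged = set()
    for sender, times in by_sender.items():
        times.sort()
        start = 0
        for end, current in enumerate(times):
            # Shrink the window from the left until it spans at most `window`.
            while current - times[start] > window:
                start += 1
            if end - start + 1 >= min_count:
                flagged.add(sender)
                break
    return flagged
```

In practice the window length and counts would be tuned to the institution's risk appetite; a single pass like this also only covers one channel, whereas the use case calls for linking accounts across channels and geographies first.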
  11. Case 2: Multi-Product Linked Accounts (KYC)
     • Case description: A customer with a business profile, linked accounts and transactions across products and investments. Many transactions are funneled into the account, and investments span geographical locations in high-risk countries.
     • Customer profile:
       – Business customer account; risk type classification: High Risk Client; customer carrying out large transactions
       – Complex and large cash transactions of $50,000 and above
       – Multiple exchanges of cash in one currency for foreign currency
       – High-cash businesses such as restaurants, pubs, casinos, taxi firms, beauty salons and amusement arcades
       – A number of customers sending payments to the same individual
     • Uniqueness of use case: multi-product linked accounts
     • Data elements involved: customer master profile; product master; transactions over an x-year data set
     • Challenges with current technology: multiple linked accounts with multiple products; real-time link visualization and tracking
     • Supporting data required to simulate the use case: cross-currency, cross-geography locations; multiple product transactions and wired transactions; multiple cross-currency transactions from USD, SGD, GBP and EUR; nearly x linked accounts; across geography in 50 countries; 2,000 CR/DB transactions every month
     • Results / objective of use case: to demonstrate product transaction links with a historic data set
     • Visualization to show results of use case: to be identified
  12. Case 3: $200 Million Credit Card Fraud
     • Case description: On Feb. 5, federal authorities arrested 13 individuals allegedly connected to one of the biggest payment card schemes ever uncovered by the Department of Justice. The defendants' alleged criminal enterprise, built on synthetic (fake) identities and fraudulent credit histories, crossed numerous state and international borders, investigators say.
     • Customer profile:
       – 169 bank accounts
       – 25,000 fraudulent credit cards
       – 7,000 false identities
       – Wired transactions across geographies
     • Uniqueness of use case: tracking of multiple customer profiles
     • Data elements involved: customer master profile
     • Challenges with current technology:
       – Multi-customer profile tracking and verification
       – Accurate profile verification by cross-verification of public records with utility bills and bank accounts around the world
       – Create a single entity view (SEV) of similar entities
       – Detect aliases, whether they are created intentionally or through human error
       – Identify irregularities in user input
       – Reduce false positives through data enrichment
     • Supporting data required to simulate the use case: cross-geography location profiles; x linked accounts across different banks and products
     • Results / objective of use case: to demonstrate de-duplication of customer profiles and verification of identity
     • Visualization to show results of use case: to be identified
  13. Case 4: Social Network Analysis
     • Case description: Analysis of social network sites to establish links with fraudulent customers
     • Customer profile:
       – Customer profiles with over 5 million records
       – Across geography in 50 countries
       – Search, match and link with telephone, mobile number, email and social network IDs
       – Identify irregularities in user input
       – Protect individual privacy concerns through anonymous resolution, displaying either the full matching records
       – Reduce false positives through data enrichment
     • Uniqueness of use case: social network analysis of customer profiles
     • Data elements involved: customer master profile
     • Challenges with current technology: ability to link to social network sites; text analysis
     • Supporting data required to simulate the use case: customer profiles gleaned from social network sites like Facebook, LinkedIn, Myspace and other social networks/communities
     • Results / objective of use case: to demonstrate social network identity links with customer profiles, to establish fraudulent customer profiles and to reduce false identities
     • Visualization to show results of use case: to be identified
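The "search, match and link" step in Case 4 amounts to grouping profiles that share an identifier. One simple way to sketch it, assuming exact matches on normalized identifiers, is a union-find over profile IDs. The field names (`id`, `phone`, `email`, `social_id`) are hypothetical, and a real deployment at the 5-million-record scale mentioned on the slide would likely use a graph database or a distributed job rather than an in-memory dictionary.

```python
from collections import defaultdict


def link_profiles(profiles, keys=("phone", "email", "social_id")):
    """Group profile IDs that share any identifier, directly or transitively."""
    parent = {p["id"]: p["id"] for p in profiles}

    def find(x):
        # Follow parent pointers to the cluster root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[max(root_a, root_b)] = min(root_a, root_b)

    first_seen = {}  # (field, normalized value) -> first profile ID carrying it
    for p in profiles:
        for key in keys:
            value = p.get(key)
            if not value:
                continue
            token = (key, value.strip().lower())
            if token in first_seen:
                union(p["id"], first_seen[token])
            else:
                first_seen[token] = p["id"]

    clusters = defaultdict(list)
    for p in profiles:
        clusters[find(p["id"])].append(p["id"])
    return sorted(sorted(ids) for ids in clusters.values())
```

Because the links are transitive, two profiles with no identifier in common still land in the same cluster when a third profile bridges them, which is the kind of indirect connection the use case is after.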
  14. Case 5: Watch List Filtering and Text Mining
     • Case description: The primary requirement of watch list filtering is to routinely scan current and prospective clients against a database (watch list) consisting of names, aliases (AKA) and address entries.
     • Customer profile:
       – Compare and scrutinize 1,000,000 names on the global PEP list
       – Nearly 120 sanctions lists that collectively have more than 20,000 profiles
       – Watch list screening means creating an effective screening process that minimizes false positives and false negatives
       – Search, match and link with names, and provide comparison with actual and original records
     • Uniqueness of use case: text mining of unstructured data
     • Data elements involved: customer master profile
     • Challenges with current technology:
       – Unstructured data results in false positives
       – Number of matching rules and ease of incorporating match matrix changes
       – Customer data integrity
       – Foreign names, multipart names, hyphenated names, and names which "sound" similar but are spelled differently (e.g. Muhammed vs. Mohamad)
     • Supporting data required to simulate the use case: OFAC's SDN list, Bank of England list, Denied Persons List
     • Results / objective of use case: to demonstrate reliable and scalable watch list filtering
     • Visualization to show results of use case: to be identified
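The name-matching challenge in Case 5 (hyphenated or reordered name parts, similar-sounding spellings) can be illustrated with Python's standard `difflib`. This is a rough sketch under stated assumptions: the normalization steps and the 0.8 default threshold are choices made for the example, and production screening typically combines several matchers (phonetic, transliteration-aware, rule-based) rather than a single string ratio.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name):
    """Strip accents and punctuation, lowercase, and sort name parts so order does not matter."""
    decomposed = unicodedata.normalize("NFKD", name)
    letters = "".join(
        ch if ch.isalpha() else " "
        for ch in decomposed
        if not unicodedata.combining(ch)
    )
    return " ".join(sorted(letters.lower().split()))


def similarity(a, b):
    """Similarity in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def screen(name, watchlist, threshold=0.8):
    """Return (entry, score) pairs at or above the threshold, best match first."""
    scored = [(entry, similarity(name, entry)) for entry in watchlist]
    return sorted((s for s in scored if s[1] >= threshold),
                  key=lambda s: s[1], reverse=True)
```

The threshold is where the trade-off named on the slide shows up: "Muhammed" and "Mohamad" on their own score between 0.6 and 0.8, so a threshold of 0.8 would miss them (a false negative), while lowering it admits more unrelated names (false positives).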
  15. Data Analysis Trends in AML
     • Need for highly interactive and visually appealing UIs for investigation
     • Need for advanced analytics for deeper insight into trends in customer behavior
     • Higher degree of depth of analysis in the AML program
     • Guard against aging technology and manual approaches
     • Automated risk classification approaches
     • Need to reduce the volume of false positives
     • Need for structured and unstructured data analysis
  16. What we are seeing in AML
     • Higher degree of technology sophistication among criminals
     • AML programs need to move from running detection processes on similar data sets to operating across diverse data. Patterns of fraud demand a 360-degree view of risk as well as an ability to work across more complex and larger data sets
     • Most illicit activities span geographies, products and accounts
     • Lack of efficiency in investigation tools and processes
     • Expert-system or rules-engine based approaches are becoming ineffective
     • A predictive approach to detecting fraud is emerging as a key trend
     • Move to increased automation
     • The amount of data needed to feed the predictive approaches is growing exponentially
  17. Where current solutions fall short
  18. What We Have Seen at Banks
     • Fragmented book-of-record transaction systems
       – Lending systems along geographic and business lines
       – Trading systems along desk and geographic lines
     • Fragmented enterprise systems
       – Multiple general ledgers
       – Multiple enterprise risk systems
       – Multiple compliance systems by business line
         • AML for Retail, AML for Commercial Lending, AML for Capital Markets…
         • Lack of real-time data processing, transaction monitoring and historical analytics
     • Typically proprietary vendor and in-house built solutions that have been acquired over the years, building up significant technical debt
     • Unable to keep pace with the progress of technology
     • Move to combine fraud (AML, credit card fraud and InfoSec) into one platform
     • Issues with flexibility, cost and scalability
  19. High Level Solution Architecture - Predictive Analytics
  20. Some essential data elements for AML: Structured and Unstructured
     • Inflow and outflow
     • Links between entities and accounts
     • Account activity: speed, volume, anonymity, etc.
     • Reactivation of dormant accounts
     • Signer relationship
     • Deposit mix
     • Transactions in areas of concern
     • Use of multiple accounts and account types
     • Social media behavior
     • Etc.
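As one example of turning a listed data element into a signal, the "reactivation of dormant accounts" item can be sketched as a gap check over an account's activity dates. The 180-day dormancy threshold is an assumption for illustration; institutions define dormancy differently.

```python
from datetime import date


def find_reactivations(activity_dates, dormancy_days=180):
    """Return (last_active, reactivated_on, gap_in_days) for every gap of at least `dormancy_days`."""
    ordered = sorted(activity_dates)
    reactivations = []
    for previous, current in zip(ordered, ordered[1:]):
        gap = (current - previous).days
        if gap >= dormancy_days:
            reactivations.append((previous, current, gap))
    return reactivations
```

A reactivation on its own is not suspicious; it becomes more informative when combined with the other elements on the slide, such as the volume of activity that follows.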
  21. Big Data for Financial Crimes and Controls - Solution
     • The unique nature of money laundering requires a new generation of solutions based on:
       – A vast variety of historical data
       – Business rules
       – Fuzzy logic
       – Data mining
       – Supervised and unsupervised learning and other machine learning technologies to increase detection and reduce false positives
     • To implement a next-generation solution for BSA/AML, firms must look towards updated machine learning tools that allow finer-grained resolution at the scale needed to detect money laundering.
     • Phased approach:
       – Rule-based model (Crawl phase)
       – Feature-based model (Walk phase)
       – Data-driven model (Run phase)
  22. AML Solution: Rule-Based Solution (Crawl Phase)
     Key highlights and challenges:
     • Manual analysis by an investigator
     • Subjective and inconsistent
     • Time consuming
     • High false positive rate
     • Constant updates to rules
     • Not able to catch new modes of fraud
     Diagram labels: data sources (Transaction Data, LexisNexis, Accounts Database, Payment Data, Card Data); Rule Based AML Solution; Alerts from Rule Based System; Dashboard to Match Data; outcomes Suspicious / NOT Suspicious
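A Crawl-phase system of the kind described on this slide can be pictured as a fixed list of hand-written predicates applied to each transaction. The sketch below is illustrative only: the transaction fields, the rule names and the placeholder country codes `XX` and `YY` are hypothetical, and the $50,000 cash threshold is borrowed from Case 2 rather than from any regulation.

```python
# Placeholder codes standing in for a maintained high-risk country list (assumption).
HIGH_RISK_COUNTRIES = {"XX", "YY"}

# Each rule is a (name, predicate) pair written and maintained by hand.
RULES = [
    ("large_cash", lambda tx: tx["channel"] == "cash" and tx["amount"] >= 50_000),
    ("high_risk_geography", lambda tx: tx["country"] in HIGH_RISK_COUNTRIES),
]


def evaluate(tx, rules=RULES):
    """Names of all rules that fire for one transaction."""
    return [name for name, predicate in rules if predicate(tx)]


def triage(transactions, rules=RULES):
    """Alerts for an investigator's queue: one entry per transaction that trips any rule."""
    alerts = []
    for tx in transactions:
        fired = evaluate(tx, rules)
        if fired:
            alerts.append({"tx_id": tx["id"], "rules": fired})
    return alerts
```

The limitations on the slide follow from this shape: every alert still needs manual review, thresholds are judgment calls, and a pattern nobody has written a rule for passes through unflagged until someone adds one.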
  23. AML Solution: Feature-Based Solution (Walk Phase)
     Rule base plus supervised and unsupervised learning for AML. Key highlights:
     • Features are metadata extracted from the data, for example the average balance over the last 7 days
     • Features help algorithms capture information from the data
     • Feature engineering is a form of translation between the raw data and the algorithm
     • Uses supervised and/or unsupervised machine learning
     • Quick classification
     • Low false positive rate, tweaked based on risk appetite
     Diagram labels: data sources (Transaction Data, LexisNexis, Accounts Database, Payment Data, Card Data); Historical Alerts; Machine Learning Algorithms; Alerts from ML Based System; Dashboard to Match Data; outcomes Suspicious / NOT Suspicious
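The slide's own example of a feature, the average balance over the last 7 days, can be computed as follows. This is a minimal sketch assuming end-of-day balances keyed by date; the second feature (today's balance relative to that average) and the feature names are additions for illustration and do not come from the deck.

```python
from datetime import date, timedelta


def average_balance(daily_balances, as_of, n_days=7):
    """Mean end-of-day balance over the `n_days` ending on `as_of`; None if no data in the window."""
    days = [as_of - timedelta(days=i) for i in range(n_days)]
    values = [daily_balances[d] for d in days if d in daily_balances]
    return sum(values) / len(values) if values else None


def build_features(daily_balances, as_of):
    """Turn raw balances into a small feature dictionary a model could consume."""
    avg_7d = average_balance(daily_balances, as_of, 7)
    today = daily_balances.get(as_of)
    ratio = today / avg_7d if today is not None and avg_7d else None
    return {"avg_balance_7d": avg_7d, "balance_to_avg_7d": ratio}
```

Features like these are what the supervised or unsupervised algorithm actually sees, which is why the slide describes feature engineering as a translation step between raw data and the algorithm.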
  24. Types of Machine Learning and Potential Usage
  25. Next-Gen AML Solution: Data-Driven Solution (Deep Learning)
     Key highlights:
     • The algorithm understands malicious behavior through data
     • The algorithm is smart enough to work without features (metadata)
     • Does not need alerts for training
     • Helps in identifying any kind of anomalous behavior
     • Deeper insights about the customer
     Diagram labels: data sources (Transaction Data, LexisNexis, Accounts Database, Payment Data, Card Data); Deep Learning Algorithms; Data Driven Solution; outcomes Suspicious / NOT Suspicious
  26. High Level System Architecture: Max ROI and Future-Proof Solution, Not Just for AML/Fraud
     • Source data (examples): Data.gov, accounts, transactions, LexisNexis, social; external/3rd-party data sources
     • Components: real-time event streaming engine; dynamic customer profile / risk appetite model; central data lake; real-time intelligent action; visualization / analytical views
     • Listed capabilities: risk similarity / risk profiling; related entity analysis (graph database); fraud / social network analysis; multi-line "profitable" class code; geospatial data; updated risk appetite
     • Risk scoring engine (examples): credit score (if allowed by regulatory agencies); rating attributes (demographic, geographic, social, property attributes); likelihood of fraud/risk (frequency/severity)
     • Flows: enrich events with customer/risk info and scoring models; update profiles and scoring models; update data lake
     • Access: native API, REST API, ODBC/JDBC
  27. Key Deliverables to Build a Big Data Solution
     • Automating due diligence around KYC data
       – Simple information collected during customer onboarding
       – More complex information for certain entities
       – Applying sophisticated analysis to such entities
       – Automating research across news feeds (LexisNexis, DB, TR, DJ, Google etc.)
     • Efficient case management
     • Capture all data sets in one place
     • Applying advanced analytics (two sub use cases)
       – Exploratory data science
       – Advanced transaction intelligence
       – Machine learning / deep learning
  28. Business Analytics Must Evolve To Deal With Data Tipping Point
     • Provide insight into the past: via data aggregation, data mining, business reporting, OLAP, visualization, dashboards, etc.
     • Understand the future: via statistical models, forecasting techniques, machine learning, etc.
     • Advise on possible outcomes: via rules, optimization and simulation algorithms
  29. The Data Tipping Point: Drivers of a Connected Data Architecture
  30. Why Will This Work Now?
     • A free, open source, linearly scalable platform has only become available within the last few years
     • Due to the amount of regulation over the last 15 years, all bank enterprise compliance, risk and finance systems now function essentially the same way
     • Banks partnering with an open source partner is very different from partnering with a vendor who develops proprietary software
     • Proprietary software vendors will adopt the new standards since it is in their self-interest to do so
     • Regulators can now streamline their regulatory practices by adopting a Big Data based approach
     • Having a standards-based open source platform means that regulators can use the same platform as the banks
  31. Digital Banking Solution Architecture
     Diagram labels:
     • Banking sources: Core Banking, Customer Journey, Social, RDBMS, Mainframe, Document Mgmt Systems, Data Silos, Industry Ref., Web Logs, Other…
     • Retail Banking Enterprise Data & Compute Lake
       – Storage: distributed file system (staging, database, structured, unstructured, archival, document data)
       – Processing: data operating system (multi-purpose platform enablement); Batch, Search, In-Memory, Real-Time, Pivotal HAWQ, SQL, Predictive
       – Governance & Integration; Enterprise Security; Business Workflow
     • Applications & workloads: Retail Banking Apps, Marketing Apps, SVC, NBA, Business Analytics, Data Science, BI & Reporting, SAS; Business Logic Layer
     • Cloud computing stack (public or private): public cloud, private cloud or hybrid cloud supporting a full stack of VMs and Docker
  32. Q & A
