Automated Anti-Money Laundering Using Artificial Intelligence and Machine Learning
Authors: Santhosh L (LSanthosh@hotmail.com), Rohan Chatterjee (rohan.chatterjee94@gmail.com)
Abstract
This paper shows how anti-money laundering (AML) work can be automated with the help of
machine learning (ML) and artificial intelligence (AI). AML demands the attention of many
people, and calling customers to complete or update their “Know Your Customer” (KYC) records
costs institutions hundreds of thousands. Financial institutions invest billions in technology
to curb these threats. Despite the effort, the money laundering problem appears to be on the
rise and the risk of penalties remains high. We therefore ask whether a system can be trained
with ML and AI to perform KYC processes without human intervention and to detect risks that
could seriously harm the financial institution. Alert routing, anomaly detection, and robotic
process automation are among the processes discussed in this paper in which ML can be applied
and improved to automate AML.
Introduction
Nowadays the number of financial institutions is growing rapidly. As the number of
institutions increases, so does an unprecedented range of technical challenges, such as
cyber-security defense, customer service innovation, and data management. Financial
institutions invest billions in technology to curb these challenges. Despite the effort, the
money laundering problem appears to be on the rise and the risk of penalties remains high.
According to a WealthInsight report, global AML spending will exceed $8 billion in 2017.
Additionally, banks around the globe have paid approximately $321 billion in fines since the
2007-2008 financial crisis as regulators stepped up enforcement, according to a report by the
Boston Consulting Group.
Having spent billions, institutions set up anti-money laundering programs with independent
oversight functions. Some institutions successfully manage the large teams and manual
processes these programs require, but most do not. They continue to face challenges that
erode the effectiveness and efficiency of their AML programs, including the following:
One of the most significant problems in this field is the management of data. Non-standard
data structures and fragmented sources make data aggregation by legal entity difficult. For
example, customers still receive calls asking them to refresh KYC documents and to correct
information that is wrong. For the bank, this means tens of thousands of costly customer calls
every month.
Analytical approaches for customer risk scoring and transaction monitoring suffer from high rates
of false positives, resulting in significant resources focused on investigating low-risk accounts and
transactions. Adding new calibration tools and thresholds often leads to another spike in the
number of false alerts.
Inconsistent standards in processes such as customer identification, enhanced due diligence, and
account monitoring and screening mean that businesses do not agree on what constitutes risk
and violation of compliance requirements.
Similarly, inconsistency in the reporting of suspicious activities and currency transactions means
banks sometimes produce too many reports, and sometimes too few, exposing them to the twin
dangers of regulatory sanctions and excessive cost.
Fragmented systems and platforms limit the ability to automate transaction monitoring and due
diligence. As a result, compliance teams spend the bulk of their time collecting data and on
“stare and compare” sessions, instead of investigative work.
Reliable quantitative metrics to assess risk across products, geographies, and processes are often
not available.
Leading banks are trying to crack these problems by turning to new technologies. Machine learning,
real-time data-aggregation platforms using fuzzy logic, rapid automation, and text and voice
analytics offer a fundamentally new approach to managing compliance. Even better, they also offer
an opportunity to simultaneously cut structural costs and improve the customer experience. As they
take up these new tools, banks are shifting financial-crime compliance toward a more forward-
looking and sustainable approach.
However, despite some low-hanging-fruit opportunities to use AI/ML for AML, adoption of AI/ML
has been slow. There are three main reasons for this:
(1) A limited understanding of the application of AI/ML within the context of AML compliance
programs;
(2) The notion of AI/ML being a “black box” where the inner workings are not clearly understood
by business people, regulators, and compliance officers; and
(3) The cautious tack taken by regulators requiring compliance officers to demonstrate
understanding and provide validation about the results provided by the models.
Fig-1: Money Laundering
Let us now analyze how AI/ML can help financial institutions get out of this rut:
Alert Routing
A recent PwC survey of the anti-money laundering alerts generated by banks found that more
than 90% of the alerts were false positives.
Banks all over the world employ hundreds of thousands of staff to detect anomalies in
transactions, which costs them billions of dollars. If the number of false alerts can be
reduced, fewer staff are needed to review them, and the bank saves money. These savings can be
invested in technology to automate AML.
Machine learning and artificial intelligence are widely regarded as well suited to automating
AML. Financial institutions face several setbacks today. First, the vast number of alerts
generated requires a very large operations team to review them. Machine learning can instead
be used to detect and process suspicious transactions. Moreover, it can divide the alerts into
various “buckets” based on the severity level of the transaction. Heuristics can then be
applied to each bucket to decide the best way to proceed with its alerts. For example, in the
case of a severe-risk alert, the ML program should inform the bank manager about the risk so
that he or she can verify it and carry out the required follow-up.
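The bucketing and routing idea can be sketched in a few lines of Python. This is a minimal illustration only: the thresholds, bucket names, and queue names are hypothetical, and the risk score is assumed to come from an upstream ML model that is not shown.

```python
from dataclasses import dataclass

# Hypothetical severity thresholds (lower bound, bucket name), highest first.
# A real deployment would calibrate these against investigated alerts.
BUCKET_THRESHOLDS = [(0.8, "severe"), (0.5, "medium"), (0.0, "low")]

# Hypothetical routing heuristics: which queue handles each bucket.
ROUTES = {
    "severe": "branch_manager",
    "medium": "aml_analyst_queue",
    "low": "auto_close_with_sampling",
}


@dataclass
class Alert:
    alert_id: str
    risk_score: float  # assumed output of an upstream ML model, in [0, 1]


def bucket_for(alert: Alert) -> str:
    """Return the first bucket whose lower bound the score reaches."""
    for lower_bound, bucket in BUCKET_THRESHOLDS:
        if alert.risk_score >= lower_bound:
            return bucket
    return "low"  # negative scores fall back to the lowest bucket


def route(alert: Alert) -> str:
    """Map an alert to the queue that should handle it."""
    return ROUTES[bucket_for(alert)]


alerts = [Alert("A1", 0.93), Alert("A2", 0.61), Alert("A3", 0.12)]
routed = {a.alert_id: route(a) for a in alerts}
print(routed)
```

Keeping the thresholds and routes in plain data structures means compliance staff can review and adjust the heuristics without touching the scoring model.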
Anomaly detection
Alerts can be generated using machine learning and artificial intelligence. The ML component
combines program code with data mining and some modern mathematical techniques. Some of the
logic is simple and mirrors what humans do when performing AML checks manually. The system is
trained to detect anomalies in a certain manner, and ML lets the system refine its
understanding as it is given more and more cases. However, there can be cases such as many
people each sending 100 dollars to a single bank account. This is a very common money
laundering method, yet it can easily be missed by automated AML. There will always be
offenders who devise schemes intended to go undetected. If we develop a program that keeps a
record of the methods by which people breached the system in the past, we can use advanced ML
to make the system learn from those misses. It would then be far harder for an offender to
evade detection.
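The pattern of many people each sending a small amount to one account can be caught with a simple aggregation over receivers. The sketch below is a rule-style illustration, not a learned model; the `max_amount` and `min_senders` parameters and the sample account names are assumptions made for the example.

```python
from collections import defaultdict


def find_fan_in_accounts(transactions, max_amount=100.0, min_senders=5):
    """Flag accounts that receive small credits from many distinct senders.

    transactions: iterable of (sender, receiver, amount) tuples.
    Returns {receiver: number_of_distinct_small_senders} for flagged accounts.
    """
    small_senders = defaultdict(set)
    for sender, receiver, amount in transactions:
        if amount <= max_amount:
            small_senders[receiver].add(sender)
    return {
        receiver: len(senders)
        for receiver, senders in small_senders.items()
        if len(senders) >= min_senders
    }


# Six different people each send 100 dollars to acct_X (hypothetical data).
transactions = [(f"person_{i}", "acct_X", 100.0) for i in range(6)]
# acct_Y receives one small credit and one ordinary salary payment.
transactions += [("person_0", "acct_Y", 100.0), ("employer", "acct_Y", 2500.0)]

flagged = find_fan_in_accounts(transactions)
print(flagged)
```

In a learning system, each newly discovered evasion pattern of this kind could be turned into a feature like the sender count computed here, so that the model is retrained on past misses.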
Fig-2: Know Your Customer Guidelines
Data aggregation
Banks in all markets struggle with the quality of the data they keep on their customers, creating a
significant obstacle to data aggregation. Longtime clients may have signed up when information
standards weren’t as rigorous and manual forms were prone to error. Many banks have
implemented modern data-entry processes for new customers; however, these may not be
followed consistently across countries or even branches. The challenge is greatest in
countries such as the United States or the United Kingdom that have only limited nationwide
identification systems.
Financial institutions and banks are relying on several new software tools to combine
mediocre-quality data. These tools can help them avoid the heavy cost of manual data
structuring and cleansing, and also avoid investing hundreds of millions of dollars to build
central “data lakes.” For instance, most of these data platforms use machine learning or
“fuzzy logic” (an approach to computing based on degrees of truth, rather than the more
conventional binary true/false logic) on unstructured account and transaction data to create a
360-degree view of suspected cases of money laundering. In practice, these new software
components allow banks to automatically validate the credibility of more customers, identify
beneficial owners more quickly, and map how specific customers are connected to legal entities
and individuals, especially those marked as high-risk profiles.
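To illustrate matching by degree rather than by exact equality, the following sketch links customer records whose text is similar but not identical. It uses the standard-library `difflib.SequenceMatcher` as a stand-in for whatever matching engine a data platform actually uses; the sample records and the 0.85 threshold are assumptions.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lower-case the text and drop punctuation that varies between systems."""
    cleaned = text.lower().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


def similarity(a, b):
    """Degree of match between two records, from 0.0 (none) to 1.0 (identical)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_records(records, threshold=0.85):
    """Return index pairs of records that probably describe the same customer."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if similarity(records[i], records[j]) >= threshold:
                pairs.append((i, j))
    return pairs


# Hypothetical records held by two different branches.
records = [
    "Jonathan A. Smith, 12 High Street",
    "Jonathon A Smith 12 High St.",
    "Maria Gonzalez, 4 Park Lane",
]
print(link_records(records))
```

The pairwise loop is quadratic in the number of records, so a production system would first group candidates (for example by postcode or date of birth) before comparing them.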
This can have significant implications for the volume of accounts and transactions that get escalated
for manual review. For example, our analysis at one global institution showed that about half of the
transactions flagged as “suspicious” would not require investigation if the institution were able to
connect the data across its divisions, some of which had previously cleared the parties
involved.
Robotic Process Automation
Robotic process automation (RPA) is a technology that lets staff in an institution
configure an automated software application, or “robot”, to capture and interpret existing
applications for routine work such as processing a transaction, running a notification system,
generating a preconfigured report, triggering a response notification, manipulating data, and
communicating with other existing digital systems within the institution.
1. Customer Due Diligence (CDD). There are various use cases within CDD, including
setting up the customer profile for onboarding and updating existing data for enhanced
due diligence. For example, RPA can be used to validate existing customer information by
pulling the customer data from various internal software applications or repositories, either
to verify the customer’s information or to generate a report for human review. To merge the
several internal and external sources available for customer verification, RPA can
search internal data repositories as well as approved third-party data sources for customer
information. RPA can also send emails to frontline staff and customers automatically,
requesting the necessary “Know Your Customer” (KYC) documentation.
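The validation step described in the CDD use case can be sketched as a comparison of one customer's fields across several repositories. The repository names, field names, and sample data below are hypothetical, and a real robot would read from the institution's actual systems rather than in-memory dictionaries.

```python
def validate_customer(customer_id, sources, required_fields):
    """Compare a customer's fields across repositories and build a review report.

    sources: {repository_name: {customer_id: {field: value}}}
    """
    report = {"customer_id": customer_id, "missing": [], "conflicts": {}}
    for field in required_fields:
        # Collect every non-empty value of this field, keyed by repository.
        values = {
            name: records[customer_id][field]
            for name, records in sources.items()
            if customer_id in records and records[customer_id].get(field)
        }
        if not values:
            report["missing"].append(field)        # no repository has it
        elif len(set(values.values())) > 1:
            report["conflicts"][field] = values     # repositories disagree
    report["needs_review"] = bool(report["missing"] or report["conflicts"])
    return report


# Hypothetical internal repositories.
sources = {
    "core_banking": {"C1": {"name": "Asha Rao", "address": "1 Main Road", "id_number": "X123"}},
    "crm": {"C1": {"name": "Asha Rao", "address": "7 Lake View"}},
}
required = ["name", "address", "id_number", "date_of_birth"]

report = validate_customer("C1", sources, required)
print(report)
```

A report with `needs_review` set could then trigger the automatic email to frontline staff or to the customer requesting the missing KYC documentation.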
RPA can be helpful in most of these areas; however, it has downsides that push us to find a
better or enhanced approach, one that overcomes the following drawbacks of RPA in large-scale
institutions and over the longer run:
1. Need for skilled staff. Over the past decade it has been challenging for organizations to
find suitably skilled staff to operate RPA and configure its rules, which require
constant review and revision. The configuration administrator needs to be aware of the
current industry situation and fraud patterns, and also needs the coding skills to achieve
good results. This is possible to an extent, but over the years it becomes very difficult to
find matching resources.
2. Monetary expense. The most challenging aspect is the cost of building a
system with skilled staff. The resulting high cost, together with the need for constant
maintenance, makes institutions hesitate to implement RPA.
3. Major upgrades. Adopting a new technology through an upgrade is challenging both
technically and in cost, since the upgrade involves rewriting the existing
algorithms and rules. Upgrades to any system are inevitable, but the cost of
rebuilding an entire RPA system would be challenging for any organization.
How AI Technology integrates into the AML process
The application of AI to AML is a logical one. With an AI system, AML data points can be
pulled and consolidated automatically, transactions scored for risk, and anomalies
documented for AML investigators. Investigators can then evolve from researchers desperately
fighting against the clock to unearth relevant data into analysts presented with automated
financial crime reports that allow them to be better informed and more accurate in their
determinations. An AI-based AML solution can analyze massive amounts of transactional
and client information from a variety of sources, such as the transaction monitoring system
(TMS), KYC databases, Lines of Business (LOB) customer information, investigative databases,
public Internet sources, and the deep web where criminals often interact and transact
business. To conduct such an analysis, AI solutions use agents, which are highly specialized
algorithms responsible for collecting and interpreting data, modeling behaviors, detecting
anomalies, inferring relationships, and identifying issues. These agents report issues to a
machine learning engine by delivering both the alerts and all necessary supporting evidence.
The machine learning engine accepts all of the collected artifacts and develops an overall risk
score. This score embodies the level of suspicion around transactions, transacting entities,
and entity networks. These three areas represent the what, who, and why of financial
activities, and provide a holistic view of transacting entities and their motives. The score
also includes a measure of confidence about the decision. Confidence is calculated based
on the number of alerts and anomalies detected, as well as similarity to past cases. This
also allows the AI system to constantly evolve, learning from past decisions. Specific AI and
machine learning techniques that are employed to identify transactional anomalies worthy
of further investigation include:
• Collaborative filtering: capable of finding transactions with missing, matching and/or odd
information
• Feature matching: utilized to identify transactions below a specific monetary threshold
• Fuzzy logic: used to find data matches with slight changes to names or addresses
• Cluster analysis: can detect abnormalities in transactions benefiting a single person or
entity
• Time series analysis: detects transactions benefiting a person or entity over an extended
period
• Focused keyword searches: ability to dynamically monitor, screen and filter transactions
based on keywords from high-risk AML, CTF and financial crimes typologies
• Ability to learn from an AI-identified suspicious activity to enhance transaction monitoring
and KYC platforms
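As a rough illustration of how agent alerts might be combined into an overall risk score with a confidence measure, consider the sketch below. The agent names, the equal weighting of the three areas, and the confidence formula are all assumptions made for the example; the text above does not specify how a real engine computes them.

```python
from dataclasses import dataclass

AREAS = ("transaction", "entity", "network")  # the what, who, and why


@dataclass
class AgentAlert:
    agent: str       # hypothetical agent name
    area: str        # one of AREAS
    severity: float  # assumed to be scaled to [0, 1]


def overall_risk(alerts, past_case_similarity, saturation=5):
    """Combine agent alerts into (risk score, confidence, per-area scores).

    past_case_similarity: assumed similarity to past confirmed cases, in [0, 1].
    saturation: alert count at which the evidence term stops growing (assumed).
    """
    # Per area, keep the strongest signal; an area with no alerts scores 0.
    area_scores = {
        area: max((a.severity for a in alerts if a.area == area), default=0.0)
        for area in AREAS
    }
    score = sum(area_scores.values()) / len(AREAS)
    # Confidence grows with the number of alerts and with similarity to past cases.
    evidence = min(len(alerts), saturation) / saturation
    confidence = 0.5 * evidence + 0.5 * past_case_similarity
    return score, confidence, area_scores


alerts = [
    AgentAlert("threshold_agent", "transaction", 0.9),
    AgentAlert("name_match_agent", "entity", 0.6),
    AgentAlert("time_series_agent", "transaction", 0.4),
]
score, confidence, area_scores = overall_risk(alerts, past_case_similarity=0.8)
print(round(score, 3), round(confidence, 3), area_scores)
```

Returning the per-area scores alongside the total keeps the supporting evidence visible to the investigator, which matters for the “black box” concern raised earlier.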
Moreover, financial institutions can address these risks in two steps:
1. Protecting Against False Negatives
2. Identifying and Reducing Intermediary Risk
1. Protecting Against False Negatives
False negatives, which are instances of illicit transactions not flagged by the TMS, represent
significant risk to financial organizations. It is estimated that 50 percent of financial crimes
trafficked through the banking system pass through transaction monitoring systems unnoticed.
Unlike static rules-based TM engines, AI systems can detect patterns of behavior, analyze the
intent of those patterns and expose anomalous activities. For example, transactions that do not
follow the usual frequency and directional patterns expected for a given type of account may
not be flagged by a TMS, but would be identified with an effective AI solution. An AI solution
can learn the baseline of normal reported payroll account activity and thus identify any
irregularities in payroll transactions as potentially fictitious and worthy of further investigation.
While there is no economic purpose for a Yemen-based government fire protection agency to
purchase fertilizer from a farm in the UK, that business relationship would not be flagged by a
TMS. However, by comparing entities’ NAICS codes, an AI solution could quickly determine that
the two entities were not engaged in complementary lines of business triggering the system to
further investigate those entities and their business transactions. Its ability to efficiently
evaluate large numbers of transactions means that an AI system can also be used to examine
transactions that currently go unexamined as part of Above-the-Line/ Below-the-Line practices.
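The idea of learning a baseline of normal account activity and flagging departures from it can be sketched with a simple deviation score. This is a deliberately simplified stand-in for a learned behavioral model: it looks only at monthly totals, and the payroll figures and the threshold of 3 standard deviations are assumptions.

```python
from statistics import mean, stdev


def deviation_score(value, history):
    """Number of standard deviations by which value departs from the baseline."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return 0.0 if value == mu else float("inf")
    return abs(value - mu) / sigma


def is_irregular(value, history, threshold=3.0):
    """Flag a value that lies far outside the account's usual range."""
    return deviation_score(value, history) > threshold


# Hypothetical monthly payroll totals for one business account.
payroll_history = [50_000, 51_000, 49_500, 50_500, 50_200, 49_800]

print(is_irregular(50_400, payroll_history))  # close to the baseline
print(is_irregular(95_000, payroll_history))  # far above the baseline
```

A fuller model would also track frequency and direction of payments per account type, as the text describes, rather than a single total.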
2. Identifying and Reducing Intermediary Risk
The ability to send and receive payments internationally via correspondent banking is vital to
the global economy. The World Bank estimates that global remittances will increase to more
than $636 billion in 2017. However, high remittance volume can bring increased regulatory
scrutiny, risk, and compliance costs that nip at the heels of financial institutions. The result,
unfortunately, is that institutions often abandon revenue sources by de-risking foreign
correspondent relationships rather than deal with the inherent risk and problems of
maintaining correspondent banking accounts. The financial institutions that choose not to
de-risk will often file defensive Suspicious Activity Reports (SARs) because the data is not
available for them to establish the
economic purpose and verify complementary lines of business for their correspondent banking
customers. Effective entity resolution and entity relationship investigation are integral to
curbing the de-risking cycle. An important function of an AI solution is its ability to monitor
customers’ relationships to other customers and entities and learn from their associated
behavior. An AI-based AML solution can automate the transactional analysis of these
intermediary relationships to find anomalous behaviors and identify the end clients causing
those anomalies. An AI-enhanced solution can account for seasonality, mergers and
acquisitions, randomness and other legitimate variances to find the illegitimate anomalies that
are presenting significant risks to financial institutions. An AI solution can also provide
predictive insights into transactional behaviors, and automate the required regulatory analysis
and reporting. Correspondent banking relationships continue to grow so the AML risk to banks
will remain. Deployment of AI-enhanced solutions can assist banks in better SAR reporting and
ultimately prevent unnecessary de-risking of correspondent banking customers.
How to identify the right customer: Soft-Clustering Behavioral Misalignment Score
Customer KYC data is usually divided into hard segments with a complex structure. At present,
such “hard segmentation” of customer KYC data is used to identify usual behavioral patterns,
but the approach has not yielded adequate results. We therefore propose soft clustering, a
collaborative profiling of the segmented data. The suggested technique is a Bayesian learning
technique (collaborative profiling) that analyzes customers’ banking transactions in aggregate
and generates models (archetypes) of customer behavior. Each customer is then represented as a
mixture of these archetypes (rather than as a single archetype), and the mixture is adjusted
in real time with the financial and non-financial activities of the customer.
Customers with similar archetype distributions are clustered together in peer groups. Different
clusters carry different risk, and customers that do not fall within any cluster are suspicious.
Further, these behavioral soft clusters allow precision in determining whether a customer’s
behavior starts deviating from what is usual for their behavioral peer group. This deviation
informs the soft-clustering misalignment score, which can be used to prioritize alert-based
cases or drive entirely new SAR events undetected by traditional KYC and rule-based systems.
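One way to picture the misalignment score is as a distance between a customer's archetype mixture and the average mixture of the peer group. The sketch below assumes the archetype mixtures have already been estimated (the Bayesian learning step is not shown) and uses the Jensen-Shannon divergence as the distance, which is an illustrative choice rather than the measure used by any particular vendor.

```python
from math import log2


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded in [0, 1] with log base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def centroid(mixtures):
    """Average archetype mixture of a peer group."""
    n = len(mixtures)
    return [sum(column) / n for column in zip(*mixtures)]


def misalignment_score(customer_mixture, peer_mixtures):
    """Distance between a customer's mixture and the peer-group centroid."""
    return js_divergence(customer_mixture, centroid(peer_mixtures))


# Hypothetical mixtures over three archetypes, e.g. (salaried, cash-heavy, cross-border).
peers = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.8, 0.1, 0.1]]
aligned = [0.65, 0.25, 0.10]   # behaves like the peer group
drifted = [0.05, 0.05, 0.90]   # has shifted toward the third archetype

print(misalignment_score(aligned, peers))
print(misalignment_score(drifted, peers))
```

Because the score is bounded, it can be compared across peer groups and used directly to rank alert-based cases.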
Auto-Encoders for Unsupervised Anomaly Scoring
An Auto-Encoder is a type of artificial neural network used to learn efficient data coding in an
unsupervised manner. The aim of an autoencoder is to learn a representation (encoding) for a
set of data, typically for dimensionality reduction.
An auto-encoder can be used to augment the soft-clustering misalignment score. The
auto-encoder, a powerful type of neural network, is trained to reconstruct its input data with
minimal (ideally negligible) reconstruction error.
When we process a new record through such an auto-encoder, the reconstruction error is
computed. The lower the reconstruction error, the better the record conforms to the expected
normal transaction behavior and distribution. Conversely, a large reconstruction error provides
a very powerful way to identify behaviors not seen before or rarely seen in past historical data.
Source: FICO Blog
This is essentially a powerful magnet to surface needles in the haystack. We train these auto-
encoders on soft clustering collaborative profiling archetypes and behavioral profile variables of
the customers. When the auto-encoder finds a customer with large reconstruction error, it flags
it with a high score. Such high-score cases are anomalous and corresponding SARs are
generated, allowing subtle behaviors to be surfaced from the masses of transactions daily in the
bank.
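The reconstruction-error scoring described in this section can be demonstrated with the smallest possible auto-encoder: a linear one with a single hidden unit and tied weights, trained by gradient descent in plain Python. This is a toy sketch, not the deep network a production system would use; the two behavioral features, the training data, and the flagging threshold are assumptions.

```python
def train_autoencoder(records, epochs=500, learning_rate=0.05):
    """Train a tied-weight linear auto-encoder with one hidden unit.

    Encoder: h = w . x        Decoder: x_hat = h * w
    Loss: mean squared reconstruction error over the records.
    """
    dim = len(records[0])
    w = [0.5] * dim  # deterministic starting weights
    n = len(records)
    for _ in range(epochs):
        grad = [0.0] * dim
        for x in records:
            h = sum(wi * xi for wi, xi in zip(w, x))            # encode
            residual = [xi - h * wi for xi, wi in zip(x, w)]    # x - x_hat
            rw = sum(ri * wi for ri, wi in zip(residual, w))
            for k in range(dim):
                # Gradient of ||x - (w.x) w||^2 with respect to w[k].
                grad[k] += -2.0 * (rw * x[k] + h * residual[k]) / n
        w = [wi - learning_rate * gi for wi, gi in zip(w, grad)]
    return w


def reconstruction_error(w, x):
    """Squared error between a record and its reconstruction."""
    h = sum(wi * xi for wi, xi in zip(w, x))
    return sum((xi - h * wi) ** 2 for xi, wi in zip(x, w))


# Hypothetical scaled features per customer, e.g. (deposits, withdrawals).
# Normal customers lie close to a common pattern: the second is about twice the first.
normal_records = [
    (0.1, 0.21), (0.2, 0.39), (0.3, 0.61), (0.4, 0.79), (0.5, 1.01),
    (0.6, 1.19), (0.7, 1.41), (0.8, 1.59), (0.9, 1.81), (1.0, 1.99),
]
weights = train_autoencoder(normal_records)

THRESHOLD = 0.05  # hypothetical cut-off for flagging a record
for record in [(0.55, 1.10), (1.0, 0.1)]:
    error = reconstruction_error(weights, record)
    print(record, "flag" if error > THRESHOLD else "ok")
```

The first record follows the learned pattern and reconstructs almost perfectly, while the second does not and receives a large error. In practice the inputs would be the archetype mixtures and behavioral profile variables mentioned above.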
Conclusion:
The proven field of AI and machine learning for AML is the only technology that can effectively
improve financial crime investigations, scale to the volume and velocity of the modern financial
system, and counter criminals’ evolving approaches to money laundering. This is the right time
for financial institutions to deploy AI in their AML ecosystems. To reduce the risk of
financial fraud, financial crime, and money laundering, AI coupled with machine learning is
the best approach: it addresses regulatory requirements while cutting operational cost
through improved efficiency. AI with ML helps prevent criminals and terrorists from using
the banking industry for their agendas, thus securing the nation’s financial institutions.
Reference:
1. “Supervisory Guidance on Model Risk Management,” Board of Governors of the Federal
Reserve System, the Office of the Comptroller of the Currency, April 4, 2011. Access at:
https://www.occ.treas.gov/newsissuances/bulletins/2011/bulletin-2011-12a.pdf
2. FinTech Innovation Lab, London: http://www.fintechinnovationlablondon.co.uk/
3. Compliance with the AML/CFT International Standard: Lessons from a Cross-Country Analysis
https://www.imf.org/external/pubs/ft/wp/2011/wp11177.pdf
4. Anti-Money Laundering and Terrorist Financing Measures and Financial Inclusion
http://www.fatf-gafi.org/media/fatf/documents/reports/AML_CFT_Measures_and_Financial_Inclusion_2013.pdf
5. Using Artificial Intelligence and Machine Learning to Help Financial Institutions Increase
Compliance with Know Your Customer (KYC) Regulations
https://www.h2o.ai/wp-content/uploads/2017/06/Know-Your-Customer_v3_pages.pdf
6. Analytics Optimizations
https://www.fico.com/blogs/analytics-optimization/when-ai-meets-aml-a-report-from-edinburgh/