Minimize Fraud And Maximize Revenue Deposit Risk Scoring
1. adeliarisk
Minimize Losses & Maximize Revenue
A Step-by-Step Guide to Getting Started in Deposit Risk Modeling
DDA models recoup lost revenue by:
Allowing an intelligent, deliberate balance between risk and revenue.
Enabling risk-rating of both customers and transactions.
This step-by-step guide, compiled from actual practitioners’
experiences, will help you apply statistics to recoup lost revenue.
by Josh Ablett (Adelia Risk) and Dr. Jiang Zhou (Business Data Miners)
Introduction
Between the Durbin Amendment and the Federal Reserve prohibitions on overdraft fees, pressure has
never been greater on DDA fee revenue. Many financial institutions are eyeing an investment in data
analytics to make up for lost revenue. This paper describes the practical steps you should follow in
launching your own data analytics efforts to increase your chances of successfully replacing lost
revenue. We’ve used these steps multiple times to deliver millions of dollars of bottom-line
contribution through DDA risk scoring.
Step 1 – Compiling essential data
The single factor that will make or break your statistical modeling project is, without a doubt, the
availability of data. Three-month sprint projects turn into 18-month marathons when assumptions are
made without having all of the appropriate data available.
Before jumping in, it’s important to be sure you have the following data feeds:
All items processed by your proof department (the “all items file”).
All returned deposited items (RDIs) sent back to your bank.
All available data from your DDA system (name, form of ID, etc.)
Metadata about accounts from your DDA system (opening date, average balance, etc.)
Metadata from your customer tracking system (customer address, signers, etc.)
This is absolutely critical; don’t even think about getting started until you have this data in your hands.
You don’t want to mobilize an entire project team only to watch them sit on their hands as they wait for
essential data to become available.
If possible, get these feeds too; they’ll make your models significantly more accurate:
Credit score (plus all other information available from the credit bureau)
Any information gathered from ChexSystems, eFunds or another derogatory bureau
A file of overdraft transactions
Any alerts from the EARNS process (Early Notification System of RDIs)
Customer claims
Step 2 – Identifying the real problem
“We want to build a model that increases revenue.” That was easy!
Or was it? Directing your analysts and statisticians towards such a vague finish line will cost you months
of project expense while doing nothing to replace revenue. To be effective, models need to focus on
very specific problems; choosing the wrong problem (or an ambiguous one) invariably results in dead
ends and rework.
This is the problem-selection process that we have followed in the past, with remarkable success:
First, start at the end. Identify the precise conditions you want to deliver in your financial
accounts. “We want chargeoffs posted to GL account 1234567 to go down.” Or “we want fee revenue to
GL account 9876543 to go up.” This may seem incredibly obvious. But many – perhaps most –
organizations skip this step, and end up wasting time when it becomes necessary to make an expensive
course correction.
Next, analyze root cause. Take the chargeoffs example. This should be easy. If you reduce your RDIs,
then your chargeoffs will go down, right? In reality, it’s not that simple. Enlist your team to
understand the factors that truly drive revenue or losses. In the case of losses, what contributes a
higher percentage of loss: fraud RDIs or non-fraud RDIs? RDIs that alerted or not? RDIs that
returned in 4 days or in 7 days?
Refine, refine, refine. Focusing on your areas of highest revenue or loss, create refined, specific
problem statements. For example, banks that go through a loss-focused analysis learn that a model
targeted at reducing fraud RDIs lowers their losses significantly compared to a model that simply
reduces the level of overall RDIs. Now that’s a great problem for your analytics team to solve.
Rinse and repeat. Re-apply this same process to each area you’d like to improve. Do you want to
increase fee revenue? Did your analysis show that customers with a high rate of RDIs pay higher fees
with a lower rate of chargeoff? Then focus your model on increasing fees from that population.
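The root-cause questions above boil down to asking which segment of RDIs carries the largest share of loss dollars. Here is a minimal sketch of that breakdown in Python. The field names (`is_fraud`, `alerted`, `days_to_return`, `loss`) and the sample records are made up for illustration; your own RDI and chargeoff feeds will look different.

```python
from collections import defaultdict


def loss_share_by_segment(rdis, key):
    """Return (segment, share of total loss dollars) pairs, largest share first."""
    totals = defaultdict(float)
    for rdi in rdis:
        totals[rdi[key]] += rdi["loss"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    shares = [(segment, amount / grand_total) for segment, amount in totals.items()]
    return sorted(shares, key=lambda pair: pair[1], reverse=True)


# Hypothetical RDI records, for illustration only.
sample_rdis = [
    {"is_fraud": True, "alerted": False, "days_to_return": 7, "loss": 900.0},
    {"is_fraud": True, "alerted": True, "days_to_return": 4, "loss": 300.0},
    {"is_fraud": False, "alerted": False, "days_to_return": 4, "loss": 200.0},
    {"is_fraud": False, "alerted": True, "days_to_return": 7, "loss": 100.0},
]

# One breakdown per root-cause question raised in the text.
for field in ("is_fraud", "alerted", "days_to_return"):
    print(field, loss_share_by_segment(sample_rdis, field))
```

Running the same function over each candidate field gives your team a quick, comparable view of where the loss dollars concentrate, which is the input you need for a refined problem statement.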
Step 3 – Finding predictive variables
Now that you’ve got your data and have isolated a specific problem, it’s time for your statisticians to get
to work. The goal of this exercise is to develop an easy-to-understand, easy-to-discuss document that
looks like this:
Variable                              Predictive of Chargeoff
Account Type                          High
Number of items in past three days    Medium
Current account balance               High
First three digits of zip code        Medium
And so on…
However, testing the predictive power of hundreds of variables is both time consuming and expensive.
The following practical lessons, based on our experience deploying models that measure both
transaction and customer risk, will properly orient your efforts and save you valuable time:
Start with common sense. Experience tells you that new accounts with low balances are
riskiest. Similarly, large deposits made to accounts with low balances are risky. Well guess
what? You’re right! We’ve found that these variables are highly predictive in determining
chargeoffs. You can save a lot of time in your analysis by first talking to the fraud and revenue
analysts to get a “gut check” from them regarding the most predictive variables.
Combine variables. You may find, as we have, that the number of items recently transacted on
an account is a useful indicator of risk. You may find, as we have, that the number of overdrafts
on an account is also a good indicator of risk – and you might be satisfied with that. However,
skilled statisticians testing numerous combinations of variables are often rewarded by
uncovering incredibly predictive variables. In this example, you may find that dividing the
number of overdrafts in the past 60 days by the number of items processed in the past 10 days
is a much more powerful predictor of risk.
The changing power of time. Some variables’ predictive power has a very short shelf life – in
the previous example, the number of items processed is only predictive for the past 10 days or
so. Some variables last much longer – again, from the example above, the number of overdrafts
is predictive for 60 days or longer. Your analytical team can test correlation against time in
buckets (0-20 days, 20-40 days, 40-60 days, etc.) to zero in on the timeframe that works best in
your model.
Consider the opposite cases. Sure, your model can reduce chargeoffs, but at what cost to your
customers? A new model can definitely increase fee revenue, but will it drive your chargeoffs
up too? Your statistician should be able to represent these opposing factors in a gain chart. The
real example pictured below shows how such a chart frames the discussion: should you capture 90% of
the fraud RDIs by holding 2.5% of the good items, or “go for broke” and capture almost all of the
fraud RDIs while inconveniencing 35% of your customers with held funds?
[Figure: gain chart for the Refined Model, “% of Fraud Returned Items Captured vs. % of Good Items
Falsely Alerted.” Vertical axis: % of Fraud Returned Items Captured (0% to 100%). Horizontal axis:
% of Good Items Falsely Alerted (0% to 100%).]
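If you want to see how the points on a gain chart like this one are produced, the sketch below ranks scored items from riskiest to safest and tracks, at each score threshold, the share of fraud items captured and the share of good items falsely alerted. The function name, the `(score, is_fraud)` input shape, and the sample items are assumptions for illustration, not the data behind the chart above.

```python
def gain_curve(items):
    """items: list of (score, is_fraud) pairs.

    Returns a list of (threshold, pct_good_alerted, pct_fraud_captured),
    one point per distinct score, for alerting on every item scoring at
    or above that threshold.
    """
    total_fraud = sum(1 for _, is_fraud in items if is_fraud)
    total_good = len(items) - total_fraud
    if total_fraud == 0 or total_good == 0:
        raise ValueError("need at least one fraud item and one good item")

    ranked = sorted(items, key=lambda item: item[0], reverse=True)
    points = []
    fraud_seen = 0
    good_seen = 0
    for index, (score, is_fraud) in enumerate(ranked):
        if is_fraud:
            fraud_seen += 1
        else:
            good_seen += 1
        # Items with the same score share a threshold, so emit one point per score.
        is_last_at_score = index == len(ranked) - 1 or ranked[index + 1][0] != score
        if is_last_at_score:
            points.append((score, good_seen / total_good, fraud_seen / total_fraud))
    return points


# Hypothetical scored items, for illustration only.
sample_items = [
    (950, True), (900, True), (850, False), (700, True),
    (600, False), (400, False), (300, True), (100, False),
]

for threshold, good_alerted, fraud_captured in gain_curve(sample_items):
    print(f"score >= {threshold}: {fraud_captured:.0%} fraud captured, "
          f"{good_alerted:.0%} good items alerted")
```

Plotting the second value against the third gives the curve; picking a point on it is the risk-versus-inconvenience decision described above.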
Guess what? You’ve just developed the most important component of your business case to justify a
project. And you now have the data to confidently commit to delivering some tempting benefits. When
you can clearly demonstrate the annualized benefits you can deliver by increasing your fraud RDI
capture rate while reducing your false positive rate, you’re looking at a very high likelihood of project
approval.
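Two of the lessons in this step, combining variables and testing how long a variable stays predictive, can be sketched in a few lines of Python. The ratio below follows the example in the text (overdrafts in the past 60 days divided by items processed in the past 10 days). Everything else is an assumption for illustration: the function names, the choice to treat zero recent items as one to avoid dividing by zero, the use of a simple Pearson correlation, and the sample accounts.

```python
from datetime import date, timedelta
from math import sqrt


def count_in_window(event_dates, as_of, min_age_days, max_age_days):
    """Count events whose age in days falls in [min_age_days, max_age_days)."""
    return sum(1 for d in event_dates if min_age_days <= (as_of - d).days < max_age_days)


def overdraft_to_item_ratio(overdraft_dates, item_dates, as_of):
    """Overdrafts in the past 60 days divided by items processed in the past 10 days."""
    overdrafts = count_in_window(overdraft_dates, as_of, 0, 60)
    items = count_in_window(item_dates, as_of, 0, 10)
    # Assumption: treat zero recent items as one so the ratio is always defined.
    return overdrafts / max(items, 1)


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 when either input does not vary."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_by_bucket(accounts, as_of, bucket_days=20, buckets=3):
    """accounts: list of (event_dates, charged_off) pairs.

    Returns {(start_day, end_day): correlation between the event count in
    that age bucket and the chargeoff flag}.
    """
    outcomes = [1.0 if charged_off else 0.0 for _, charged_off in accounts]
    result = {}
    for i in range(buckets):
        start, end = i * bucket_days, (i + 1) * bucket_days
        counts = [count_in_window(dates, as_of, start, end) for dates, _ in accounts]
        result[(start, end)] = pearson(counts, outcomes)
    return result


as_of = date(2024, 3, 1)


def days_ago(n):
    return as_of - timedelta(days=n)


# Hypothetical accounts: (dates of processed items, whether the account charged off).
sample_accounts = [
    ([days_ago(5), days_ago(8)], True),
    ([days_ago(50)], False),
    ([days_ago(3)], True),
    ([], False),
]
print(correlation_by_bucket(sample_accounts, as_of))
```

The buckets are half-open (0 up to but not including 20, and so on) so that no event is counted twice. On real data your statistician would run this per candidate variable and keep the window where the relationship is strongest.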
Step 4 – Building your model
Applying the results of their analyses, your analytics team will be able to combine all of the variables
identified in step three into a clean scoring model that rates each transaction or customer on a
scale of 0-1000. From that, they should be able to produce a chart that looks like this:
Score       $ Fraud    $ Good     Cumulative % Fraud    Cumulative % Good
900-1000    $93,458    $1,224     25%                   0.15%
800-899     $35,678    $3,124     36%                   0.35%
700-799     $23,456    $9,123     50%                   1.51%
600-699     $8,124     $21,457    53%                   10.67%
And so on…
As you can see, this chart makes it easy to see where this scoring model goes from being very accurate
(800-1000) to being not so accurate (799 and down). This lets you easily determine which scoring
bucket should be used to make an automated decision (e.g., automatically holding funds, automatically
declining accounts), a manual action (e.g., routing to an analyst for review), or no action at all.
Behind the scenes, things get slightly more complex. Your statistical analysis will most likely produce a
scoring model similar to this one:
Account Type
  If account type is Personal                  Add 100 points
  If account type is Small Business            Add 10 points
  If account type is Platinum                  Subtract 5 points
Number of Items in Past 3 Days
  If >10 items processed on this account       -
  If 7-9 items processed on this account       Add 10 points
  If 4-6 items processed on this account       Add 20 points
  If 1-3 items processed on this account       Add 40 points
  If 0 items processed on this account         Add 50 points
Current Account Balance
  If > $10,134                                 -
  If between $7,322 and $10,133                Add 10 points
  If between $4,356 and $7,321                 Add 25 points
  If between $1,321 and $4,355                 Add 50 points
  If between $0 and $1,320                     Add 100 points
And so on…
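A point grid like this translates almost line for line into code. The Python sketch below implements only the three sample variables shown, so its output is a partial score rather than the full 0-1000 score. A few choices are assumptions, because the sample grid does not cover them: exactly 10 items and a balance of exactly $10,134 are treated as belonging to the top (zero-point) bands, negative balances fall into the lowest balance band, and unknown account types add no points.

```python
ACCOUNT_TYPE_POINTS = {"Personal": 100, "Small Business": 10, "Platinum": -5}


def account_type_points(account_type):
    # Assumption: account types not listed in the sample grid add no points.
    return ACCOUNT_TYPE_POINTS.get(account_type, 0)


def items_points(items_past_3_days):
    # Assumption: exactly 10 items is grouped with ">10" (the grid leaves it uncovered).
    if items_past_3_days >= 10:
        return 0
    if items_past_3_days >= 7:
        return 10
    if items_past_3_days >= 4:
        return 20
    if items_past_3_days >= 1:
        return 40
    return 50


def balance_points(balance):
    # Assumption: $10,134 itself is grouped with "> $10,134", and negative
    # balances are scored like the lowest band.
    if balance >= 10134:
        return 0
    if balance >= 7322:
        return 10
    if balance >= 4356:
        return 25
    if balance >= 1321:
        return 50
    return 100


def partial_score(account_type, items_past_3_days, balance):
    """Sum the points for the three sample variables shown in the grid."""
    return (
        account_type_points(account_type)
        + items_points(items_past_3_days)
        + balance_points(balance)
    )


print(partial_score("Personal", 2, 500))
print(partial_score("Platinum", 12, 20000))
```

Because the model is just a sum of lookups, it can run almost anywhere: in a nightly batch job, in a database stored procedure, or inside a transaction-processing system.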
While the grid above is just a sample of what a real model looks like, many people are
surprised to discover that a model can actually be this specific. While your gut instincts might tell you
that “an account with a low balance is risky,” a proper statistical analysis is really the only way to know
with any real certainty that “low balance” really means “anything under $1,320.”
Step 5 – All dressed up and no place to go
Remember how we said that the most common delay for these types of projects was the lack of data?
You’ve just reached the second most common point of delay for your project, and boy, can this one
hurt.
Before you start to build the model described in steps three and four, you must ensure that you actually
have an effective way to implement it. Many institutions simply assume they’ll be able to implement
whatever model is produced, only to discover that a six-to-nine month IT project stands between them
and the ability to start risk-scoring transactions or accounts.
An ounce of prevention is the best solution here. Before you start down this path, take this paper to the
systems owners and IT staff who support your critical systems and have a conversation about how much
time and effort will be required to implement the kind of model described here. By collecting
order-of-magnitude estimates, you’ll be able to complete the costing side of the business case that
you began in step three.
Beyond this key preventative conversation with IT, there are other important elements to successfully
implementing risk models:
Start by evaluating multi-factor regression models. In our experience, regression models
provide the best combination of performance, understandability, and ease of implementation.
They perform just as well in DDA modeling situations as traditional models built on neural
network (or similar) analysis, but are considerably easier to implement and can be installed on a
wider range of target systems.
Don’t be afraid to do it yourself. Many large account scoring and fraud prevention vendors try
to wrap statistical models in a shroud of mystery. However, you’d be absolutely amazed by the
returns you can deliver independently by assembling a team of a statistician, a skilled .NET or
Java developer, and a part-time DBA. With a properly managed effort, you’ll certainly be able to
deliver enough of a return to justify additional investment in expanding the project.
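To show how little machinery a regression model needs, here is a minimal do-it-yourself sketch in plain Python. The paper does not say which kind of regression to use; logistic regression is assumed here because the outcome (chargeoff or not) is yes/no. The two features (new account, low balance), the tiny data set, and the training settings are all made up for illustration. A real project would use a statistics package and far more data.

```python
from math import exp


def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    e = exp(z)
    return e / (1.0 + e)


def predict(weights, row):
    """Predicted chargeoff probability; weights[0] is the intercept."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], row))
    return sigmoid(z)


def fit_logistic(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit weights by batch gradient descent on the average log loss."""
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        gradient = [0.0] * (n_features + 1)
        for row, label in zip(rows, labels):
            error = predict(weights, row) - label
            gradient[0] += error
            for j, value in enumerate(row):
                gradient[j + 1] += error * value
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, gradient)]
    return weights


# Hypothetical training data: [is_new_account, has_low_balance] -> charged off (1) or not (0).
rows = [[1, 1], [1, 1], [1, 1], [1, 0], [1, 0], [0, 1], [0, 1], [0, 0], [0, 0], [0, 0]]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

weights = fit_logistic(rows, labels)
print("weights:", [round(w, 2) for w in weights])
print("new account, low balance:", round(predict(weights, [1, 1]), 2))
print("established account, healthy balance:", round(predict(weights, [0, 0]), 2))
```

The fitted weights are what your statistician would then rescale into a point grid, which is one reason regression models are easy to explain and easy to install on a wide range of systems.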
Step 6 – “I'm sorry, Dave. I'm afraid I can't do that.”
People don’t trust machines.
And as part of implementing this project you are going to ask people to switch from trusting common
sense to blindly following a computer-generated score from 0-1000.
Take the time to train your deposit fraud analysts, your new account review analysts, and even your
account opening staff on the variables that sit behind the score that they see. People don’t trust what
they don’t understand; you need to teach them to understand the logic behind the score.
An even better approach, if it’s within your budget, is to build logic into your system that generates
reason codes to explain these scores to staff. “Personal account with low balance” or “Corporate
account with low rate of items processed” can make people a lot more comfortable than simply seeing
the score 832.
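One simple way to generate reason codes is to keep track of how many points each variable contributed to the score and report the largest contributors. The Python sketch below assumes the scoring step can hand over a list of (reason text, points) pairs; the function name, the reason wording, and the sample contributions are hypothetical.

```python
def reason_codes(contributions, top_n=2):
    """Return the reason texts for the largest positive point contributions.

    contributions: list of (reason text, points added to the score).
    Ties keep the order in which the contributions were supplied.
    """
    positive = [(text, points) for text, points in contributions if points > 0]
    positive.sort(key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in positive[:top_n]]


# Hypothetical contributions for a single scored account.
sample_contributions = [
    ("Personal account", 100),
    ("Low balance", 100),
    ("Few items processed recently", 40),
    ("Platinum account", -5),
]

print(reason_codes(sample_contributions))
```

Showing two or three plain-language reasons next to the score gives analysts something they can check against their own judgment, which is what builds trust.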
Step 7 – Don’t be afraid to ask for help.
Here it comes – the shameless self-promotion. Don’t worry, it’s not too bad.
Building and deploying statistical models that successfully replace lost revenue is a project that you can
absolutely, positively build yourself if you are willing to make the investment of time and resources.
That being said, we’d be happy to help you in whatever capacity you require, including:
Coaching your team
Leading training workshops
Developing custom models
Managing analytics projects and systems integration
We’ll also be happy to answer any questions you might have – please, feel encouraged to:
Learn more about Adelia Risk by writing to Josh.Ablett@adeliarisk.com or by visiting
www.adeliarisk.com.
Learn more about Business Data Miners by writing to jzhou@businessdataminers.com or by visiting
www.businessdataminers.com.