Presentation Gertjan Kaart (Graydon Nederland)
Credit Alliance, Coface
18th of January, 2011, Paris
I would like to address briefly three topics: see slide.
When you talk to internationally oriented customers about credit scoring, or when you discuss the
issue of this workshop ‘worldwide quality of scores’, it very often raises the question of
standardization in scoring or unified global scoring models.
Competing for the best scoring models drives us to get the most we can out of our databases.
That means we build our scoring models based on the availability and quality of data. So, based
on specific conditions, mostly pertaining to data, we develop credit scoring models that are fine-tuned to all the data that is available.
Even in our own database for The Netherlands we use 11 plus 1 (I will come back to this one)
different scorecards in order to rate 1.8 million companies. That is simply because we do not have
the same level of detail on all companies: for instance the availability of financials, or
the type of data (do we have statistical turnover, or turnover taken from officially filed P&L statements?).
You might accept the fact that the data is different, but the methods of scoring should follow the same
global principles. I agree. For instance, a few of our principles are that:
• we use a statistical (logistic regression) scoring model,
• the scoring is automated and recalculated dynamically based on new data input,
• we have a point-in-time score looking at a horizon of 12 months (as opposed to through-the-cycle),
• and we use the same DEFAULT definition in all scoring models (which is not 90 days overdue, but officially registered bankruptcies).
But even then we sometimes need to change a key principle and switch from a statistical model to
an expert model if we do not have sufficient data to run a statistical program.
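As an illustration of the logistic-regression principle above, here is a minimal sketch of how such a scorecard turns company features into a 12-month PD. The feature names, weights and intercept are hypothetical stand-ins, not Graydon's actual model, which is estimated on historical default data:

```python
import math

def pd_12m(features, coefficients, intercept):
    """Logistic-regression probability of default over a 12-month horizon.

    The linear score z is passed through the logistic link, which maps
    any real number into a probability between 0 and 1.
    """
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical company: [years active, avg. payment delay (days), solvency ratio]
features = [12.0, 5.0, 0.35]
coefficients = [-0.08, 0.04, -2.5]  # illustrative weights only
intercept = -2.0

print(round(pd_12m(features, coefficients, intercept), 4))
```

The same function serves all scorecards; what differs per scorecard is which features are available and which fitted weights apply.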
Slide: Blended scoring
We did a project together with Experian in the Netherlands with the objective of constructing a new
scoring model to improve the scoring on 400,000 sole-ownership companies (one-person
companies).
We created a blended score (mixed score). We take the Graydon PD score on the company and we
also take from Experian the Delphi consumer score on the owner. Both the Graydon score and the
Delphi score by Experian had predictive value on their own. Using the two scores, we constructed a new
scoring model that statistically performs better than the individual scores in terms of CDR (cumulative
default rates), simply because, by using the Delphi consumer score, we added relevant new
information to the model.
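A minimal sketch of how such a blended score could combine the two inputs. The weights and bias below are hypothetical; in the actual project the second-stage model was fitted statistically, which is what makes the blend outperform either input on CDR:

```python
import math

def blended_pd(company_pd, consumer_score, w_company=1.2, w_consumer=-0.01, bias=0.5):
    """Blend a company PD and a consumer score of the owner into one PD.

    The company PD is converted to log-odds so both inputs live on a
    linear scale; a negative consumer weight means a higher (better)
    consumer score lowers the blended PD.
    """
    log_odds = math.log(company_pd / (1.0 - company_pd))
    z = bias + w_company * log_odds + w_consumer * consumer_score
    return 1.0 / (1.0 + math.exp(-z))

# Same company PD, but a creditworthy owner versus a risky one:
print(blended_pd(0.05, 900))  # owner with a strong consumer score
print(blended_pd(0.05, 300))  # owner with a weak consumer score
```

The point of the sketch: two companies with identical company-level PDs end up with different blended PDs once the owner's consumer information is added.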
Conclusion:
Credit scoring models should be highly granular, meaning they use deeper levels of detail in the scoring
model when and wherever possible in order to improve the quality and power of the model. This means
that scoring models can be different depending on the situation.
So scoring models are not the same over the world, but how can you compare or use the outcomes?
Slide: Interpretation and presentation of scores
Maybe I am saying something tricky here now, being a credit information company ourselves. But if
the crisis taught us one thing, it is not to blindly trust your rating agency!
If you read the papers, there was a lot of criticism of rating agencies like Fitch, Moody's and S&P. The
criticism is that they were incorrect and could not prevent big defaults from having a big negative financial
impact on the economy. They were too slow in their responses. They even helped banks develop
financial instruments with predetermined ratings (often junk bonds), and were even accused of
manipulating the market that way.
The big rating agencies had become part of the financial system in a way. The commercial credit
assessment bureaus like ourselves are in a different position (unsolicited ratings, point in time,
statistical, mass). But still, interpretation of scores is key.
The best outcome/deliverable of a scoring model is the Probability of Default, or PD%. It is a very
straightforward way to express the chance that a company goes into default. It is of course crucial to
know the default definition of the model, so you know in fact what chance you are talking about.
Much trickier are the rating scales or classes: notations in letter combinations,
colors (red, yellow, green), credit flags, etc. Because what do they really tell us? Triple-A is better
than single A. But what does 'better' mean in assessing risk? You need to quantify the risk. So is
Triple-A really good? Or just relatively good compared to others?
It is like grading on the curve, very Anglo-Saxon. One of the key observations regarding rating agencies
like Moody's and S&P in the crisis is that they benchmarked against other risks. I studied in Los Angeles
for one year, and this is exactly what grading on the curve is: as long as I was better than most of the other
students in my class, I was likely to get an A. And I could still be one of the worst students in the US.
PS: There is another aspect to using risk scales, because you can play with them and use them to make a
nice-looking (normal) distribution of companies with a nice spread from AAA to C. That also makes
comparison difficult.
So be careful with risk categories or scales. You need insight into the mapping of the default
rates into the different classes in order to know how much risk is in one scale. That is why the Basel
Committee has now published a mapping table for all Basel-compliant (ECAI) credit scores.
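To make the mapping idea concrete, here is a minimal sketch of such a table. The PD boundaries below are invented for illustration, not the actual Basel/ECAI mapping; the point is that knowing the PD range behind each class is what makes scales comparable:

```python
# Hypothetical mapping of 12-month PD into rating classes.
# Each entry is (upper PD bound, class label); bands are cumulative.
RATING_BANDS = [
    (0.0003, "AAA"),
    (0.001,  "AA"),
    (0.004,  "A"),
    (0.015,  "BBB"),
    (0.05,   "BB"),
    (0.15,   "B"),
    (1.0,    "C"),
]

def rating_class(pd):
    """Return the first rating band whose upper PD bound covers the PD."""
    for upper, label in RATING_BANDS:
        if pd <= upper:
            return label
    raise ValueError("PD must lie in [0, 1]")

print(rating_class(0.0002))  # a very low PD lands in the top class
print(rating_class(0.03))    # a mid-range PD lands mid-scale
```

Without the table, 'BB' tells you only the relative position; with it, you know the risk it quantifies.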
Slide: Quality of scores
If you talk about the quality of scores, we need a definition of quality. If you talk to
econometricians or statisticians, you very quickly talk about the technical quality of the model: what is
the predictive value? The most-used measures in the credit industry are now the Gini of
the model, the CDR, and the distribution of the score in the population.
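The Gini mentioned here can be computed directly from score/default pairs via pairwise comparisons (Gini = 2·AUC − 1). A minimal sketch, assuming in this example that a higher score means higher risk:

```python
def gini(scores, defaults):
    """Gini coefficient of a score: 2*AUC - 1, where AUC is the probability
    that a randomly chosen defaulter scores riskier than a randomly
    chosen non-defaulter (ties count as half).
    """
    bads = [s for s, d in zip(scores, defaults) if d]
    goods = [s for s, d in zip(scores, defaults) if not d]
    wins = sum((b > g) + 0.5 * (b == g) for b in bads for g in goods)
    auc = wins / (len(bads) * len(goods))
    return 2.0 * auc - 1.0

# Perfect separation: every defaulter scores above every non-defaulter.
print(gini([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))  # 1.0
```

A Gini of 1 means perfect rank-ordering, 0 means the score is no better than random; this quadratic-time version is for illustration only, as production measurement would use a sort-based formulation.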
But the market and our customers also have other criteria by which they value credit scores. In fact,
they sometimes more or less take the technical quality as a given and look at other criteria to
value our credit scores.
Important quality aspects of a score are also:
• Coverage: can you deliver scores on the companies that you need?
• Speed and timeliness: can you deliver them just in time, without delays?
• Delivery (structured data, workflow integration, etc.): can you integrate it into customer workflows?
• Presentation (interpretation): is it easy to understand?
• Services (value add: monitoring, benchmarking, decision engines)
• Price: price should follow from quality, or should be measured against the benefits of using scores (better-managed risk). But oftentimes, price is a starting point.
• Customer complaints (type 1 errors): mind you, generally speaking customers only see the things that go wrong one way (accepting a risk that goes bad). They have less focus on the other side (rejecting a risk that turned out to be good).
• Usage / users: an important indicator of quality is the number of users and the volume of usage of your scores. It generates a lot of experience and information.
Conclusion:
So as a credit rating provider there is more to manage than the technical quality of the score alone.
And most of the time the quality experience is a trade-off between the predictive value of the model,
the coverage and the costs involved.
Slide: Lessons learned
A few observations on what we have today:
• More educated users and customers
• More awareness of the value of info and ratings
Then we see a trend in social media and on the internet where companies take a more active approach to
managing their own credit scores. It is like a form of financial PR.
• Trend? Social business media: publish your own ratings
Then there is something else: we see that the banks have indeed changed the rules of the game. If you
look at the statistics on defaulted companies, we see that companies go bust with better solvency
positions than in earlier years. As the rules change, we need to calibrate the models more frequently
than before; the market has become more volatile in this respect.
It is also crucial to have sufficient behavioral statistics like payment and transaction data, as this type
of data is more current and has great predictive power in the models.
So improving scoring models should focus on two aspects:
1. Timely modeling and calibration of scoring models
2. And on acquiring and using more relevant and dynamic data like payment behavior.
Finally, I would make a point of breaking the Chinese wall between the suppliers of information, the
users of information (for instance the credit insurers) and the subjects of information (the
companies). This is in fact what we already see happening more and more today. Buyers in your
systems would be motivated to update their own records in the databases of the information
providers. We can co-create with you as an insurer, as we do in some of our other customer segments, if
you would push relevant data back to us. But today the information that we provide to you has a
one-way ticket.
This type of integration of business critical information flows between USERS, OBJECTS (the buyers in
your systems) and PROVIDERS is what we need more in the future.
Thank you for your attention.
END