Me & My Company
• Kristina Reicher
o MSc in Mathematical Statistics, Faculty of Science,
University of Zagreb
o Career: BI, data science, risk models, banking
• Koios
o BI, analytics, data science, and software
development in the finance industry
o 14 years, 60 people, 25 M kn income
o Data modeling, data integration, predictive analytics,
systems development
Introduction
A key reason for the existence of financial intermediaries is their superior
ability to access and process information relevant for screening and
monitoring of borrowers (Berg et al., 2018)
Questions
How good are our predictions?
How much does it cost?
How quickly are lending decisions made?
Two main areas of development
Better models
New data
Better models - Traditional approach vs. „new stuff”
Traditional approach (of banks):
• Logistic regression / Probit model / Scorecards
• Great explainability and transparency
• Expert knowledge of model developer
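A minimal sketch of how a logistic regression is commonly turned into a scorecard, using the standard "points to double the odds" (PDO) scaling. The coefficients, feature names, base score, base odds, and PDO below are all hypothetical values chosen for illustration, not figures from this talk.

```python
import math


def scorecard_points(log_odds_good, base_score=600.0, base_odds=50.0, pdo=20.0):
    """Map log-odds of being a good payer to scorecard points.

    base_score, base_odds and pdo are illustrative assumptions:
    odds of 50:1 map to 600 points, and every doubling of the odds adds 20 points.
    """
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    return offset + factor * log_odds_good


def log_odds_of_default(intercept, coefficients, features):
    """Linear predictor of a logistic regression (log-odds of default)."""
    return intercept + sum(coefficients[name] * features[name] for name in coefficients)


# Hypothetical fitted model and applicant.
intercept = -4.0
coefficients = {"utilisation": 1.5, "past_due_count": 0.8}
applicant = {"utilisation": 0.4, "past_due_count": 1}

z = log_odds_of_default(intercept, coefficients, applicant)
probability_of_default = 1.0 / (1.0 + math.exp(-z))
score = scorecard_points(-z)  # log-odds of "good" is the negative of log-odds of default

print(f"PD = {probability_of_default:.3f}, score = {score:.0f}")
```

Each coefficient contributes a fixed number of points per unit of its feature, which is what makes this kind of model easy to explain to a customer or a supervisor.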
Machine learning and AI exceed statistical methods:
• March 2018: “Ensemble learning or deep learning? Application to default risk analysis”, Journal of Risk and Financial Management
Comparison of 11 different models – BOOSTING wins
• April 2018: “Credit risk analysis using machine and deep learning models”, Risks
Gradient tree boosting models outperformed logistic regression, random forest, and several neural network architectures
Regulatory supervision of credit rating
Logistic regression – remains the standard credit scoring model in banking
ML & AI methods in credit scoring lack explainability and interpretability - „black box” models
EBA Discussion paper on ML for IRB models (November 2021)
Fintech is not subject to the same rigorous rules as traditional banks – an uneven
playing field!
New data for Credit Scoring
• Digital footprint (device attributes)
• IP addresses and GPS coordinates
• Account history, spending habits (e-commerce)
• Telecom / Utility / Rental Data
• Social networks profile data
• Clickstream Data
• Audio and Text Data
• Survey / Questionnaire Data - psychometric
Correlation with FICO score
July 2017 Federal Reserve Bank of Philadelphia Working Paper (Jagtiani & Lemieux)
• Correlation between Lending Club rating grade and FICO score down from 80% to 35%
“it is obvious that these credit grades are increasingly defined using additional metrics beyond FICO scores”
• Rating grades remain highly correlated with loan performance!
FICO project for personal lending origination portfolio (February 2022)
Alternative data add predictive value to a credit risk model based on traditional data
Using Digital Footprints for Credit Scoring
July 2018 paper: Berg, Burg, Gombović, Puri. On the Rise of FinTechs - Credit Scoring using Digital Footprints. National Bureau of Economic Research.
Digital footprint – information that people leave online
simply by accessing or registering on a website
Fintechs have a superior ability to access and process digital
footprints
10 variables derived from „digital footprint” are simple and
easily accessible for every firm operating in the digital sphere
Almost no cost to collect those „digital footprint” variables
Data in case study
250,000 observations
Purchases above €100 from an e-commerce company in Germany
Default: the customer did not pay for the online order (~1%)
A classic credit score from a private credit bureau
• „Scorable customers”: 94% of the sample
• „Unscorable customers”: 6% of the sample; credit history is not sufficient for the credit bureau to calculate a credit score
The data set is largely representative of the German population, and its default rates are representative of a typical consumer loan sample in Germany
Interesting findings – income & wealth proxies
Orders from cell phones are three times as likely to default as orders from desktops or tablets
Orders from Android OS are twice as likely to default as orders from iOS
(Bertrand and Kamenica (2017) study: owning an iOS device is one of the best predictors for being in the top quartile of the income distribution)
Customers with premium Internet service are significantly less likely to default
Default rates for customers from shrinking platforms (like Hotmail or Yahoo) are twice the average
Interesting findings – character proxies
Customers arriving on the webshop through paid ads exhibit the largest default rate
Customers arriving via affiliate links (e.g., price comparison sites) or direct URL have lower than average default rates
Customers ordering during the night have default rates two times higher (marketing research links impulse shopping to personality traits)
Customers making typing mistakes while inputting their email addresses are five times more likely to default than average
Customers only using lowercase when typing are more than twice as likely to default
Digital footprint variables - proxies for reputation
Customers with numbers in their email addresses default more frequently (strong indicator for fraud, which accounts for 10-15% of all defaults)
Customers with their first and/or last name in their email address are less likely to default (consistent with Belenzon et al. (2017) study that eponymous firms perform better)
Some variables can be proxies for several characteristics, e.g., owning an iOS device is a
predictor for economic status, but might also proxy for the character (status-seeking
users). Interpretation of digital variables does not affect prediction but gives guidance to
connect with existing research.
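Several of the variables described above can be derived from data a webshop already holds at checkout. The following is a rough sketch only: the function name, the field names, the list of "shrinking" email hosts, and the definition of night as 00:00 to 05:59 are all assumptions made for illustration, not the definitions used in the paper.

```python
from datetime import datetime

# Assumed list of "shrinking platform" email hosts (illustrative only).
SHRINKING_HOSTS = {"hotmail", "yahoo"}


def digital_footprint(email, first_name, last_name, typed_name, order_time):
    """Derive a few simple digital-footprint flags from one order."""
    local, _, domain = email.partition("@")
    local_lower = local.lower()
    host = domain.lower().split(".")[0]
    name_parts = [part.lower() for part in (first_name, last_name) if part]
    return {
        # Reputation proxies
        "email_has_digits": any(ch.isdigit() for ch in local),
        "email_has_name": any(part in local_lower for part in name_parts),
        # Character proxies
        "typed_all_lowercase": typed_name.islower(),
        "ordered_at_night": order_time.hour < 6,  # assumption: night = 00:00-05:59
        # Income and wealth proxy
        "shrinking_email_host": host in SHRINKING_HOSTS,
    }


features = digital_footprint(
    email="anna.schmidt84@hotmail.com",
    first_name="Anna",
    last_name="Schmidt",
    typed_name="anna schmidt",
    order_time=datetime(2022, 3, 1, 2, 30),
)
print(features)
```

The flags are plain booleans, so they can be fed into a logistic regression as dummy variables alongside a credit bureau score.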
Some conclusions
Findings are comparable with similar studies
• June 2020, „An Alternative Credit Scoring System in China’s Consumer Lending Market: A System
Based on Digital Footprint Data”
• December 2021, „Can System Log Data Enhance the Performance of Credit Scoring?—Evidence
from an Internet Bank in Korea”
Variables with the highest economic significance: email error, mobile/Android, and the „Night” dummy
Model results hold for an alternative default definition (loss given default – collection agency fully recovers 40% of the claims)
• The World Bank Group (2016): promotes new use of data and digital technology for expanding access to financial services
• Help lenders and borrowers to overcome a lack of information infrastructure, such as credit bureau scores
• Discriminatory power (measured by AUC) is broadly similar for unscorable customers and for scorable customers
• Give billions of unbanked people access to credit
• Lack of access to financial services affects around 2 billion working-age adults worldwide, particularly in developing countries
Chance for financial inclusion
FinTech industry opportunity in emerging markets
Financial inclusion is a key for reducing poverty and boosting prosperity
Digital footprints
Digital footprint as a predictor of change in future
credit bureau score?
YES!
A „good” digital footprint today can forecast an increase in the credit bureau score
Digital footprints matter for other loan products
– a window into the traditional banking world
Digital Footprints today
• Sesame Credit
o Credit rating agency of Alibaba (16 million active users)
o Relies on users’ online-shopping habits to calculate their credit scores
o Teamed up with Baihe (dating service), encouraging users to display their credit scores on their dating profiles
• China Rapid Finance
o Partnership with Tencent (social media & online gaming, WeChat leading messaging
platform; 800m users/month)
o Combs through its users’ social networks
• Other FinTechs have publicly announced using digital footprints for lending
o ZestFinance and Earnest in the U.S.
o Kreditech in various emerging markets
o Rapid Finance, CreditEase, and Yongqianbao in China
Implications of Digital footprint for behavior of consumers
Some digital footprints are costly to manipulate
Change of intrinsic habits – impulse shopping or typing mistakes
Fear of expressing individual personality
Desire to portray a positive image
Considerable impact on everyday life, with consumers constantly considering their digital footprints
Implications of Digital footprint for behavior of firms and regulators
Firms associated with low-creditworthiness products may conceal the digital footprint of their products
Commercial services that offer to manage an individual’s digital footprint may emerge
Firms want to ensure that their digital footprint clearly distinguishes them from lower-reputation products in the same category
Lending acts worldwide legally prohibit the use of variables such as race, color, gender, national origin, religion
Incumbent financial institutions may lobby for restrictions on, or scrutiny of, the use of digital footprints
Digital footprint credit scoring has wide implications for financial intermediaries (both Fintech and traditional banking)
Using a digital footprint can help to overcome information asymmetries between lenders and borrowers for unscorable customers
Consumers might intentionally and convincingly change their online behavior if digital footprints are widely used for lending decisions
Every digital firm that operates on the BNPL principle can benefit from incorporating digital footprints in customer scoring
Peer-to-peer lending platform, world’s largest, since 2007, 2000 employees
Our mission is to transform the banking system to make credit more affordable and investing more rewarding.
Headquartered in San Francisco, California
$1 billion IPO in 2014 (largest that year in USA)
High proportion of funds from hedge funds
Started as a social network service: LendingMatch; until end of 2008.
LendingClub enabled borrowers to create unsecured personal loans between $1,000 and $40,000.
As of December 31, 2020, LendingClub no longer operates as a retail peer-to-peer lender.
LendingClub is moving towards becoming a bank holding company, institutional investors only.
Ant Financial Services Group, formerly known as Alipay, is an affiliate company of the Chinese Alibaba Group.
Ant Financial is the highest valued fintech company in the world, and the world's most valuable unicorn (start-up) company, with a valuation of US$315 billion.
In June 2018, the company launched a blockchain-powered cash remittance service that will allow real-time transfers of cash between individuals in Hong Kong and the Philippines.
In November 2021, Alipay began allowing users to track how the app collects, stores, and shares data about them (a privacy protection feature introduced because of China's Personal Information Protection Law (PIPL)).
WeChat is a social network used by 1.27 billion people, but it is also an „app for everything”. Developed by Tencent, a Chinese tech giant.
WeiLiDai, which literally means “a tiny bit of loan”, is an online lending platform that offers loans up to $30,000 that can be approved in a matter of minutes.
Leanpay, a Slovenian startup, on the Croatian market since November 2021.
Purchase by installment payment on the spot. Buy Now Pay Later business model.
Offers loans up to HRK 22,500 payable in 24 installments.
Credit bureau score (CS): 68.3%
CS in banks: 66.5%
CS in US peer-to-peer lending data: 62.5%
Lending Club: 59.8%
ONLY DIGITAL FOOTPRINT VARIABLES MODEL: 69.6%
Robust! (not proxies for time or region, various default definitions, sample splits)
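Assuming the percentages above are AUC values (the measure of discriminatory power named earlier in the talk), the sketch below shows how such a figure could be computed from default flags and model scores. It uses the pairwise definition of AUC in plain Python; the function name and the toy data are illustrative only, and the quadratic loop is meant for small samples, not for 250,000 observations.

```python
def auc(defaulted, risk_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen defaulter has a higher
    risk score than a randomly chosen non-defaulter (ties count as one half).
    """
    bad = [s for y, s in zip(defaulted, risk_scores) if y == 1]
    good = [s for y, s in zip(defaulted, risk_scores) if y == 0]
    if not bad or not good:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = 0.0
    for b in bad:
        for g in good:
            if b > g:
                wins += 1.0
            elif b == g:
                wins += 0.5
    return wins / (len(bad) * len(good))


# Toy data: 1 = defaulted, 0 = repaid; higher score = riskier.
defaulted = [0, 0, 1, 1]
risk_scores = [0.10, 0.40, 0.35, 0.80]
print(f"AUC = {auc(defaulted, risk_scores):.2%}")
```

An AUC of 50% corresponds to random guessing and 100% to a perfect ranking of defaulters above non-defaulters.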