[DSC Croatia 22] Digital Footprints in Credit Scoring - Kristina Reicher

  1. Digital Footprints in Credit Scoring Kristina Reicher
  2. Me & My Company • Kristina Reicher o MSc in Mathematical Statistics, Faculty of Science, University of Zagreb o Career: BI, data science, risk models, banking • Koios o BI, analytics, data science, and software development in the finance industry o 14 years, 60 people, 25 M kn income o Data modeling, data integration, predictive analytics, systems development
  3. Introduction A key reason for the existence of financial intermediaries is their superior ability to access and process information relevant for screening and monitoring of borrowers (Berg et al., 2018) Questions How good are our predictions? How much do they cost? How quickly are lending decisions made? Two main areas of development Better models New data
  4. Better models - Traditional approach vs. „new stuff” Traditional approach (of banks): • Logistic regression / Probit model / Scorecards • Great explainability and transparency • Expert knowledge of model developer Machine learning and AI exceed statistical methods: • March 2018: “Ensemble learning or deep learning? Application to default risk analysis”, Journal of Risk and Financial Management: comparison of 11 different models, BOOSTING wins • April 2018: “Credit risk analysis using machine and deep learning models”, Risks: gradient tree boosting models outperformed logistic regression, random forest, and several neural network architectures
  5. Regulatory supervision of credit rating Logistic regression – remains the standard credit scoring model in banking ML & AI methods in credit scoring lack explainability and interpretability - „black box” models EBA Discussion paper on ML for IRB models (November 2021) Fintech is not subject to the same rigorous rules as traditional banks – an uneven playing field!
  6. New data for Credit Scoring Digital footprint (device attributes) IP addresses and GPS coordinates Account history, spending habits (e-commerce) Telecom / Utility / Rental Data Social networks profile data Clickstream Data Audio and Text Data Survey / Questionnaire Data - psychometric
  7. New Data – Alternative credit scoring
  8. Correlation with FICO score July 2017 Federal Reserve Bank of Philadelphia Working Paper (Jagtiani & Lemieux) • Correlation Between Lending Club rating grade and FICO score down from 80% to 35% “it is obvious that these credit grades are increasingly defined using additional metrics beyond FICO scores” • High correlation with loan performance kept!
  9. Composition of loans for each rating grade and evolving over the years
  10. FICO project for personal lending origination portfolio (February 2022) Alternative data add predictive value on credit risk model based on traditional data
  11. Using Digital Footprints for Credit Scoring July 2018 paper: Berg, Burg, Gombović, Puri. On the Rise of FinTechs - Credit Scoring using Digital Footprints. National Bureau of Economic Research. Digital footprint – information that people leave online simply by accessing or registering on a website Fintechs have a superior ability to access and process digital footprints 10 variables derived from „digital footprint” are simple and easily accessible for every firm operating in the digital sphere Almost no cost to collect those „digital footprint” variables
  12. Data in case study 250,000 observations Purchases above €100 from an e-commerce company in Germany Default defined as customers who didn’t pay for an online order (~1%) A classic credit score from a private credit bureau • „scorable customers” - 94% of the sample • 6% of the sample is „unscorable customers” – credit history is not sufficient for the credit bureau to calculate a credit score Data set is largely representative of the German population, and default rates are representative of a typical consumer loan sample in Germany
  13. Interesting findings – income & wealth proxies Orders from cell phones are three times as likely to default as orders from desktops or tablets Orders from Android OS are twice as likely to default as orders from iOS (Bertrand and Kamenica (2017) study: owning an iOS device is one of the best predictors for being in the top quartile of the income distribution) Customers with premium Internet service are significantly less likely to default Default rates for customers from shrinking platforms (like Hotmail or Yahoo) are twice the average
  14. Interesting findings – character proxies Customers arriving on the webshop through paid ads exhibit the largest default rate Customers arriving via affiliate links (e.g., price comparison sites) or direct URL have lower than average default rates Customers ordering during the night have two times higher default rates (marketing research: important personality traits for impulse shopping) Customers making typing mistakes while inputting their email addresses are five times more likely to default than average Customers only using lowercase when typing are more than twice as likely to default
  15. Digital footprint variables - proxies for reputation Customers with numbers in their email addresses default more frequently (strong indicator for fraud, which is 10-15% of all defaults) Customers with their first and/or last name in their email address are less likely to default (consistent with Belenzon et al. (2017) study that eponymous firms perform better) Some variables can be proxies for several characteristics, e.g., owning an iOS device is a predictor for economic status, but might also proxy for the character (status-seeking users). Interpretation of digital variables does not affect prediction but gives guidance to connect with existing research.
  16. The Results ROC – AUC Logistic regression for every model Correlation in combined model: 10% Digital footprints complement, rather than substitute for, the credit bureau score
  17. Some conclusions Findings are comparable with similar studies • June 2020, „An Alternative Credit Scoring System in China’s Consumer Lending Market: A System Based on Digital Footprint Data” • December 2021, „Can System Log Data Enhance the Performance of Credit Scoring?—Evidence from an Internet Bank in Korea” Variables: email error, mobile/Android and the „Night” dummy have the highest economic significance Model results hold for an alternative default definition (loss given default – collection agency fully recovers 40% of the claims)
  18. • The World Bank Group (2016): promotes new use of data and digital technology for expanding access to financial services • Help lenders and borrowers to overcome a lack of information infrastructure, such as credit bureau scores • Discriminatory power (measured by AUC) is broadly similar for unscorable customers as for scorable customers • Give billions of unbanked people access to credit • Lack of access to financial services affects around 2 billion working-age adults worldwide, particularly in developing countries Chance for financial inclusion FinTech industry opportunity in emerging markets Financial inclusion is a key for reducing poverty and boosting prosperity Digital footprints
  19. Digital footprint as a predictor of change in future credit bureau score? YES! A „good” digital footprint today can forecast an increase in the credit bureau score Digital footprints matter for other loan products – a window into the traditional banking world
  20. Usage of Digital footprints
  21. Digital Footprints today • Sesame Credit o Credit rating agency of Alibaba (16 million active users) o Relies on users’ online-shopping habits to calculate their credit scores o Teamed up with Baihe (dating service) - Encouraging users to display their credit scores on their dating profiles • China Rapid Finance o Partnership with Tencent (social media & online gaming, WeChat leading messaging platform; 800m users/month) o Combs through its users’ social networks • Other FinTechs have publicly announced using digital footprints for lending o ZestFinance and Earnest in the U.S. o Kreditech in various emerging markets o Rapid Finance, CreditEase, and Yongqianbao in China
  22. Implications of Digital footprint for behavior of consumers Some digital footprints are costly to manipulate Change of intrinsic habits – impulse shopping or typing mistakes Fear of expressing individual personality Desire to portray a positive image Considerable impact on everyday life with consumers constantly considering their digital footprints
  23. Implications of Digital footprint for behavior of firms and regulators Firms associated with low-creditworthiness products may conceal the digital footprint of their products Commercial services that offer to manage an individual’s digital footprint may emerge Firms want to ensure that their digital footprint clearly distinguishes them from lower-reputation products in the same category Lending acts worldwide legally prohibit the use of variables such as race, color, gender, national origin, religion Incumbent financial institutions may lobby for a restriction or scrutiny in using digital footprint
  24. Final thoughts
  25. Digital footprint credit scoring has wide implications for financial intermediaries (both Fintech and traditional banking) Using a digital footprint can help to overcome information asymmetries between lenders and borrowers for unscorable customers Consumers might intentionally and convincingly change their online behavior if digital footprints are widely used for lending decisions Every digital firm that operates on the BNPL principle can benefit from incorporating digital footprints in customer scoring
  26. Thank you!
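
The proxy variables on slides 13 to 15 can be derived from fields a webshop already stores. Below is a minimal Python sketch of that idea. The `Order` fields, the feature names, and the 00:00 to 05:59 night window are assumptions made for illustration; they are not the variable definitions used by Berg et al.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Order:
    """One webshop order; every field name here is hypothetical."""
    email: str
    first_name: str
    last_name: str
    typed_address: str   # free-text field exactly as the customer typed it
    placed_at: datetime
    device: str          # e.g. "desktop", "tablet", "mobile"
    channel: str         # e.g. "paid", "affiliate", "direct"


def footprint_features(order: Order) -> dict:
    """Derive simple digital-footprint flags from a single order."""
    local_part = order.email.split("@", 1)[0].lower()
    host = order.email.rsplit("@", 1)[-1].lower()
    names = [n.lower() for n in (order.first_name, order.last_name) if n]
    return {
        # reputation proxies (slide 15)
        "email_has_digits": any(ch.isdigit() for ch in local_part),
        "email_has_name": any(n in local_part for n in names),
        "email_host": host,
        # character proxies (slide 14)
        "all_lowercase": order.typed_address.islower(),
        "night_order": order.placed_at.hour < 6,  # assumed night window
        "paid_channel": order.channel == "paid",
        # income and wealth proxy (slide 13)
        "mobile_device": order.device == "mobile",
    }


# Made-up example order, not data from the case study.
order = Order(
    email="anna.kovac84@example.com",
    first_name="Anna",
    last_name="Kovac",
    typed_address="ilica 1, zagreb",
    placed_at=datetime(2022, 5, 14, 2, 30),
    device="mobile",
    channel="paid",
)
features = footprint_features(order)
print(features)
```

Each flag would then enter a scoring model as a dummy variable; whether a given flag raises or lowers estimated risk is left to the model fit rather than hard-coded here.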

Editor's Notes

  1. Peer to Peer lending platform, world’s largest, since 2007, 2000 employees Our mission is to transform the banking system to make credit more affordable and investing more rewarding. Headquartered in San Francisco, California $1 billion IPO in 2014 (largest that year in USA) High proportion of funds from hedge funds Started as a social network service: LendingMatch; until end of 2008. LendingClub enabled borrowers to create unsecured personal loans between $1,000 and $40,000. As of December 31, 2020, LendingClub no longer operates as a retail peer-to-peer lender. LendingClub is moving towards becoming a bank holding company, institutional investors only.
  2. Ant Financial Services Group, formerly known as Alipay, is an affiliate company of the Chinese Alibaba Group. Ant Financial is the highest valued fintech company in the world, and the world's most valuable unicorn (start-up) company, with a valuation of US$315 billion. In June 2018, the company launched a blockchain-powered cash remittance service that allows real-time transfers of cash between individuals in Hong Kong and the Philippines. Since November 2021, Alipay allows users to track how the app collects, stores, and shares data about them (privacy protection feature due to China's Personal Information Protection Law (PIPL)). WeChat is a social network used by 1.27 billion people, but it is also an „app for everything”. Developed by Tencent, a Chinese tech giant. WeiLiDai, which literally means “a tiny bit of loan”, is an online lending platform that offers loans up to $30,000 that can be approved in a matter of minutes.
  3. Leanpay, a Slovenian startup, on the Croatian market since November 2021. Purchase by installment payment on the spot. Buy Now Pay Later business model. Offers loans up to HRK 22,500 payable in 24 installments.
  4. Credit bureau score (CS): 68.3%; CS in banks: 66.5%; CS in US peer-to-peer lending data: 62.5%; Lending Club: 59.8%; ONLY DIGITAL FOOTPRINT VARIABLES MODEL: 69.6%. Robust! (not proxies for time or region, various default definitions, sample splits)
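
The percentages in note 4 appear to be the AUC values behind the ROC comparison on slide 16. As a reminder of what that number measures, here is a minimal Python sketch of the rank-based AUC: the share of (defaulter, non-defaulter) pairs in which the defaulter receives the higher risk score, with ties counted as half. The six customers and their integer risk points are made up for illustration and do not reproduce the study's results.

```python
def auc(risk_scores, defaulted):
    """Rank-based AUC: P(defaulter's score > non-defaulter's score), ties count 0.5.

    Compares every pair, so it is O(n * m); fine for a sketch, slow for large samples.
    """
    pos = [s for s, d in zip(risk_scores, defaulted) if d]
    neg = [s for s, d in zip(risk_scores, defaulted) if not d]
    if not pos or not neg:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 1 = customer defaulted, higher points = riskier.
defaulted      = [1, 0, 1, 0, 0, 0]
bureau_risk    = [9, 8, 2, 3, 1, 4]   # hypothetical credit-bureau-based points
footprint_risk = [3, 2, 9, 8, 1, 4]   # hypothetical digital-footprint points
combined_risk  = [b + f for b, f in zip(bureau_risk, footprint_risk)]

print(auc(bureau_risk, defaulted))     # 0.625
print(auc(footprint_risk, defaulted))  # 0.75
print(auc(combined_risk, defaulted))   # 0.9375
```

In this toy case each score catches a defaulter that the other one misses, so the combined score ranks better than either alone. That is the sense in which two weakly correlated scores complement rather than substitute for each other.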