Imagine a world where you transform a business problem into a data specification, collect a data set, train an ML model and put it into stable, scheduled production within 7 days, rather than spending weeks on data preparation and on implementing a production pipeline. At Raiffeisen Data Science this became reality with our Scoring Factory, an MLOps framework with two parts: a “feature data layer”, in which all customer data is condensed and prepared so that it is ready for ML algorithms, and a “Scoring Template”, with which ML models are created in hours and which delivers production-ready code. This overcomes our biggest obstacles in implementing new ML use cases: data preparation and deployment used to demand so much time that we were very limited in taking on new ones. In particular, we can now try new ideas quickly, without risking a large initial time investment. The Scoring Factory is the final piece so far in a three-year effort, which started with local models executed on desktop PCs, to build a versatile, automated, robust data platform that unleashes the full potential of customer analytics. The foundation is the Customer Analytics Platform, a Hadoop cluster where data from various sources is collected, cleaned, interconnected and aggregated, resulting in thousands of attributes that describe the customer from any angle the business needs to answer its questions. In this talk, we give details about the architecture of the Customer Analytics Platform and the key ingredients of the Scoring Factory.
From Ideation to Production in 7 days: The Scoring Factory at Raiffeisen
1. 2023-02-24 1
From Ideation to Production in 7 days
The Scoring Factory at Raiffeisen
Philipp Thomas, Raiffeisen Schweiz
Philipp Thomann, D ONE
2. 2023-02-24 2
Who we are
Raiffeisen Schweiz
■ Third largest bank in Switzerland
■ Cooperative organisation with 226 independent regional units
■ Retail bank
Data Science
■ Since 2016
■ Extract and deliver information to client advisors, marketing, sales and other business units
■ Infer patterns from customer data (3.5 million customers) and model customer behavior and needs
D ONE
■ Consultancy for data-driven value creation
■ Curation of the most talented data team in Switzerland
■ 100+ consultants in Zurich
■ 20+ in Athens
■ Founded 2005, 1’000+ data projects
■ International and Swiss clients
■ Selectively invested in startups
3. 2023-02-24 3
How does sales analytics generate value?
Customer selection (Analytics) → Contact → Appointment with client advisor → Deal
■ Impact of good selection: efficient allocation of time and money
■ Limited time availability & budget (Data Scientists and client advisors)
4. 2023-02-24 4
Data Science process
■ Time to prototype: several months
→ High upfront costs for every idea
■ Manual recycling of old code
Solution: 3 Pillars
■ Pillar 1: Customer Centric View
■ Pillar 2: Feature Layer
■ Pillar 3: Scoring Template
Foundation: Customer Analytics Platform (CAP)
Scoring Factory: Feature Layer and Scoring Template
5. 2023-02-24 5
Foundation: Customer Analytics Platform (CAP)
■ 100+ Users
• Business Analysts, Data Scientists
• Customer Analytics, Marketing, Risk
■ Data Science
• R, Python, SQL
■ Technology
• Cloudera Data Platform
• Lab-UI and Factory running in project-specific Docker containers
• Azure DevOps Server
The Information Factory
6. 2023-02-24 6
Pillar 1: Customer centric view
Data sources: Payment, Investment, Contacts, Web, Open Data, Banking
Data layers: RAW (Historization) → CUR (Cleaning, enrichment & connection) → BUSINESS (Customer centric view)
7. 2023-02-24 7
Pillar 2: The feature layer
Data sources: Payment, Investment, Contacts, Web, Open Data, Banking
Data layers: RAW (Historization) → CUR (Cleaning, enrichment & connection) → BUSINESS (Customer centric view) → FEATURE (Algorithmic readiness, feature_table)
8. 2023-02-24 8
Pillar 2: Data prep in feature layer
Three-step feature reduction: reduce the number and redundancy of the customer attributes
■ Technical attributes: remove via a blacklist
■ Variance: filter out attributes with low variance
■ Correlation: remove redundant, correlated attributes
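The three reduction steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the Scoring Factory code: the function name `reduce_features`, the thresholds, the attribute names, and the greedy "keep the first of two correlated attributes" rule are all assumptions made for the example.

```python
from math import sqrt
from statistics import fmean, pvariance


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant numeric lists."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def reduce_features(columns, blacklist, min_variance=1e-3, max_abs_corr=0.95):
    """Return the names of the attributes that survive all three steps.

    columns: dict mapping attribute name -> list of numeric values.
    The thresholds are hypothetical defaults chosen for this sketch.
    """
    # Step 1: drop technical attributes listed on the blacklist.
    kept = {n: v for n, v in columns.items() if n not in blacklist}
    # Step 2: drop attributes whose variance is (almost) zero.
    kept = {n: v for n, v in kept.items() if pvariance(v) > min_variance}
    # Step 3: greedily keep an attribute only if it is not strongly
    # correlated with an attribute that was already selected.
    selected = []
    for name, values in kept.items():
        if all(abs(pearson(values, kept[s])) < max_abs_corr for s in selected):
            selected.append(name)
    return selected


# Hypothetical toy data: one technical, one constant, one redundant attribute.
columns = {
    "load_timestamp": [1.0, 2.0, 3.0, 4.0],
    "constant_flag": [1.0, 1.0, 1.0, 1.0],
    "balance": [100.0, 250.0, 50.0, 400.0],
    "balance_copy": [200.0, 500.0, 100.0, 800.0],
    "n_card_payments": [3.0, 1.0, 8.0, 2.0],
}
selected = reduce_features(columns, blacklist={"load_timestamp"})
print(selected)
```

Running the variance filter before the correlation filter also keeps the correlation well defined, since a constant attribute would lead to a division by zero.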
Pre-defined, consistent and flexible model for treating missing values
One-hot encoding of nominal attributes
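One way to read "pre-defined, consistent and flexible" is a per-attribute policy table that every use case shares. The sketch below assumes that reading; the policy names (`median`, `constant`), the attribute names and the helper functions are hypothetical and only illustrate the idea together with one-hot encoding.

```python
from statistics import median

# Hypothetical policy table: one missing-value rule per attribute.
MISSING_POLICY = {
    "balance": ("median", None),
    "n_contacts": ("constant", 0.0),
}


def impute(values, strategy, fill_value=None):
    """Replace None entries according to the chosen strategy."""
    if strategy == "median":
        fill = median(v for v in values if v is not None)
    elif strategy == "constant":
        fill = fill_value
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


def one_hot(values, prefix):
    """Turn one nominal attribute into one 0/1 column per category."""
    categories = sorted(set(values))
    return {
        f"{prefix}_{c}": [1 if v == c else 0 for v in values]
        for c in categories
    }


raw = {
    "balance": [10.0, None, 30.0],
    "n_contacts": [2.0, None, 5.0],
}
# Apply the shared policy table to every numeric attribute.
clean = {name: impute(vals, *MISSING_POLICY[name]) for name, vals in raw.items()}
region = one_hot(["urban", "rural", "urban"], prefix="region")
print(clean)
print(region)
```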
[Figure: attribute count after each reduction step; axis ticks 1 K, 10 K, 100 K; bar labels: PANDB, blacklist, variance, correlation]
9. 2023-02-24 9
Pillar 2: The architecture of the feature layer
[Diagram: Customer centric view → Learning (learn transformations) → Versioned feature models → Transformation (apply models) → Feature tables → ML models (use a specific version of the feature table)]
Execution modes:
1. On demand
2. Scheduled
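The split between learning transformations once and applying a pinned version later can be sketched as a small registry. Everything here is an assumption for illustration: the class names, the version label `v1`, and the choice of median imputation as the only learned transformation.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class FeatureModel:
    """A learned, immutable set of transformations (here: fill values only)."""
    version: str
    fill_values: dict


class FeatureModelRegistry:
    """Hypothetical in-memory store of versioned feature models."""

    def __init__(self):
        self._models = {}

    def learn(self, version, rows):
        """Learning step: derive the transformations from customer rows."""
        names = sorted({name for row in rows for name in row})
        fill_values = {
            n: median(r[n] for r in rows if r.get(n) is not None)
            for n in names
        }
        self._models[version] = FeatureModel(version, fill_values)

    def transform(self, version, rows):
        """Transformation step: apply one specific model version."""
        model = self._models[version]
        return [
            {
                n: row[n] if row.get(n) is not None else fill
                for n, fill in model.fill_values.items()
            }
            for row in rows
        ]


registry = FeatureModelRegistry()
# Learn once on a snapshot of the customer centric view.
registry.learn("v1", [
    {"balance": 100.0, "age": 30.0},
    {"balance": None, "age": 50.0},
    {"balance": 300.0, "age": None},
])
# Later (on demand or scheduled): apply the pinned version to new data.
feature_table = registry.transform("v1", [{"balance": None, "age": 41.0}])
print(feature_table)
```

Because an ML model refers to a specific version, relearning the transformations as `v2` would not change the inputs of a model trained against `v1`.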
10. 2023-02-24 10
Pillar 3: Scoring template
[Diagram: Use case specific data loading → Code shell → Production-ready ML model]
■ Use case specific data loading
• Labels
• Eligible customers
• Features
■ Code shell
• Production-ready structure
• Modular
• Automated
• Consistent
■ Production-ready ML model
Business relevant questions
Automation and abstraction of all technicalities
11. 2023-02-24 11
Pillar 3: Data flow in the scoring template
[Diagram: data flow in the scoring template]
■ Use case specific data loading
• Labels
• Eligible customers
• Features
■ Train module
• Input: labeled training data set
• ML: XGBoost, hyperparameter tuning
• Output: model, model metrics
■ Inference module
• Input: model, unlabeled data set to score
• Predict: scores, XAI
■ Persistent storage of output
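The data flow above can be sketched as two small functions around a pluggable model. This is a structural illustration only: the names `UseCaseData`, `train_module` and `inference_module` are hypothetical, and a trivial per-feature-value rate model stands in for XGBoost so the example runs with the standard library alone. Hyperparameter tuning, XAI output and persistent storage are left out.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class UseCaseData:
    """What the use case specific data loading step is assumed to deliver."""
    features: dict  # customer_id -> {feature name: value}
    labels: dict    # customer_id -> 0 or 1, only for labeled customers
    eligible: list  # customer_ids that should receive a score


class RateModel:
    """Stand-in for XGBoost: label rate per value of a single feature."""

    def __init__(self, feature):
        self.feature = feature
        self.rates = {}
        self.default = 0.0

    def fit(self, rows, labels):
        groups = defaultdict(list)
        for row, label in zip(rows, labels):
            groups[row[self.feature]].append(label)
        self.rates = {k: sum(v) / len(v) for k, v in groups.items()}
        self.default = sum(labels) / len(labels)
        return self

    def predict_proba(self, rows):
        return [self.rates.get(r[self.feature], self.default) for r in rows]


def train_module(data, model):
    """Fit the model on the labeled customers and report simple metrics."""
    ids = sorted(data.labels)
    labels = [data.labels[i] for i in ids]
    model.fit([data.features[i] for i in ids], labels)
    metrics = {"n_train": len(ids), "positive_rate": sum(labels) / len(labels)}
    return model, metrics


def inference_module(data, model):
    """Score the eligible customers with the trained model."""
    probs = model.predict_proba([data.features[i] for i in data.eligible])
    return dict(zip(data.eligible, probs))


data = UseCaseData(
    features={
        "c1": {"has_mortgage": 1}, "c2": {"has_mortgage": 1},
        "c3": {"has_mortgage": 0}, "c4": {"has_mortgage": 0},
        "c5": {"has_mortgage": 1},
    },
    labels={"c1": 1, "c2": 0, "c3": 0, "c4": 0},
    eligible=["c5"],
)
model, metrics = train_module(data, RateModel("has_mortgage"))
scores = inference_module(data, model)
print(metrics, scores)
```

Keeping training and inference behind the same data-loading interface is what lets a new use case change only the loading step while the rest of the shell stays untouched.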
12. 2023-02-24 12
Use Case: Product recommendation
Product recommendation: for each customer, the products with a combination of high sales probability and high business value are prepared.
[Diagram: Customer centric view → Feature layer → Product score → Business rules → Lead management]
■ Feature layer
• Algorithmic readiness
■ Product score
• Predict sales probabilities for every product and customer
• SHAP (Shapley) explanations
■ Business rules
• Reweight sales probabilities with business value
• Up to 3 product recommendations per customer
■ Lead management
• Consistency and compatibility checks
• Deliver to core banking system
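The business-rules step can be illustrated with a short sketch. The slides do not say how the reweighting is done; multiplying the sales probability by a per-product business value is an assumption here, and the function name, product names and numbers are hypothetical.

```python
def recommend(scores, business_value, max_products=3):
    """Rank products per customer by probability times business value.

    scores: customer_id -> {product: predicted sales probability}
    business_value: product -> relative value of a sale (assumed weights)
    """
    recommendations = {}
    for customer, product_scores in scores.items():
        weighted = {
            product: prob * business_value[product]
            for product, prob in product_scores.items()
        }
        ranked = sorted(weighted, key=weighted.get, reverse=True)
        # Keep at most `max_products` recommendations per customer.
        recommendations[customer] = ranked[:max_products]
    return recommendations


scores = {
    "c1": {"mortgage": 0.10, "card": 0.60, "fund": 0.30, "pension": 0.20},
}
business_value = {"mortgage": 10.0, "card": 1.0, "fund": 3.0, "pension": 2.0}
recs = recommend(scores, business_value)
print(recs)
```

In this toy example the mortgage ranks first despite its low probability, because its assumed business value outweighs the more likely but less valuable card sale.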
13. 2023-02-24 13
Impact
■ Reduced implementation time and upfront costs
→ From idea to production in ~1 week
■ Business drives use cases (instead of data scientists)
■ Conversion rate increased by a factor of 3 for leads from product scores