At ING Bank, machine learning models are a key factor in making relevant engagements with our customers, empowering them to stay a step ahead in life and in business. In our efforts to make the model building process more rapid, compliant, validated and accessible to roles other than data scientists (such as data analysts or customer journey experts), we have structured it for the easy creation of propensity models.
In this talk, I will present this structure, focusing on pipelining data science models in Apache Spark. In particular, I will show how we use Apache Sqoop & Ranger to comply with GDPR, build a data science workflow on top of Python and Jupyter, extend the SparkML libraries on PySpark to create custom standardizers and cross-validators, and demonstrate an in-house monitoring tool built on top of Elasticsearch for model evaluation.
Finally, I will describe how analysts and customer journey experts engage with the result sets of the models created, and how we refine our dashboards (in IBM Cognos) accordingly.
Speaker: Dor Kedem, Lead Data Scientist
ING Bank
Model Factory at ING Bank
1. Model Factory @ING Bank
Presentation to DataWorks Summit, 2019-03-20
Dor Kedem
2. A bit about me
• Extensive software development career since 2002.
• Working on machine learning research & data science applications since 2010.
• Today, a lead data scientist and product owner @ ING Bank in Amsterdam.
Grab me (or via LinkedIn) during these couple of days to talk about:
• CI/CD solutions for a data science project lifecycle.
• Impact-driven data science (from POCs to MVPs).
• Modelling techniques and machine learning applications.
• Transitioning from software development or IT roles to data science.
• Boardgames and 3D puzzles.
Linkedin.com/in/KedemDor | Dor.Kedem (at) ing.com
(Photo credit: my wife, adorageek.com)
3. ING Bank at a glance
• Active in more than 40 countries
• 54,000+ employees in ING Group
• 38M retail customers and 12.5M primary customers in 4Q18
• Net Promoter Scores: #1 in 6 out of 13 retail countries
Source: https://www.ing.com/About-us/Profile/Key-figures.htm
4. Challenges in the European banking scene
• Historically low interest rates = less revenue from lending (chart: historical LIBOR rates, grey = recession; source: macrotrends.net)
• Regulation leads to a more transparent and open banking environment (source: https://hollandfintech.com/)
• Fintech companies are looking into innovative ways to fill traditional banking roles
5. How does a bank differentiate itself from the rest?
Our purpose: empowering people to stay a step ahead in life and in business.
Our strategic priorities.
Sources: https://www.forbes.com/sites/kurtbadenhausen/2019/03/04/the-worlds-best-banks-ing-and-citibank-lead-the-way/
https://www.ing.com/About-us
6. Analytics Efforts in ING
"Data is the language of the future. If you don’t speak it yet, we’ll help you master it," says Görkem Köseoğlu, ING’s chief analytics officer.
Artificial intelligence: ING currently employs around 80 data scientists, working on various AI projects.
Analytics skillset: thousands of employees engage with analytical tools and insights.
7. Our ambition: all customer interactions driven by analytics
One-to-One Analytics: maximising the number of analytics-driven service and sales interactions.
• Data > insight > action is in ING’s DNA.
• Democratize big data usage across ING.
• Users of our services are extremely happy.
8. Data Analytics for customer interactions (NL+BE)
Customer Journey Experts (CJE – Christina): over 300 (outside 1:1)
• Know: banking, marketing theory, customer engagement, message framing
• Create: product specification, online & offline content, customer engagement
Data Analysts (DA – Arjen): over 100
• Know: BI tools (SAS, IBM Cognos), data privacy, SQL
• Create: reports, dashboards, A/B testing
Data Scientists (DS – Samir): roughly 20
• Know: statistics & ML, data privacy, programming (e.g. Python, R, Scala)
• Create: statistical models, data products
Data Engineers (DE – Eleanor): roughly 15
• Know: big data technologies, CI/CD solutions, security & compliance
• Create: ETL systems, data lake, model hosting
10. Example case: Credit Card Acquisition
For Black Friday (Nov 23rd), a customer engagement business unit wants to engage eligible customers (via NBA + emails) with the option to acquire a new credit card. We have two types of offers: regular credit cards & platinum credit cards.
How do we find whom to contact with these offerings? Two approaches:
DS (Samir):
• Build a likelihood model based on past behaviour and engagements.
• Rank customers according to this model.
DA (Arjen) / CJE (Christina):
• Plot customer engagements on different demographics.
• Come up with business rules based on personal understanding.
11. (Repeat of the previous slide, annotated: today, the very vast majority of customer selections follow the manual DA / CJE approach rather than the model-based one.)
12. Problems with manual selection of customers
Enticements are offered to the wrong customers:
• Customer disengagement (unsubscribes, ad-blindness)
• Wasted work by CJEs and DAs
• Loss of potential revenue
It takes a lot of time to make customer selections:
• No structured way to figure out the target population.
• No structured learning from our past campaigns.
• Only a binary selection (to send / not to send) – no ranking.
Not leveraging the full potential of our data:
• Not taking into account a large set of features.
• Not taking into account engagement with past offerings.
• Not taking into account engagement with other products.
• You only target what you can code.
One of the added values of models: ranking customers. Instead of an unordered all-clients pool split by a binary threshold, customers are ranked by relevance and we can take, say, the top 10% (chart: purchase vs. no purchase, all clients vs. top 10%).
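The ranking point can be made concrete with a toy sketch (made-up customers and scores): a hand-written rule gives only a fixed send / no-send split, while model scores let you rank customers and pick any top slice.

```python
# Toy illustration: model scores turn a binary send/no-send decision into
# a ranking, so any top slice can be selected. Scores are made up.
customers = {"c1": 0.91, "c2": 0.15, "c3": 0.55, "c4": 0.78, "c5": 0.30}

# Rank by model score, most likely purchasers first.
ranked = sorted(customers, key=customers.get, reverse=True)

# Pick the top 40% instead of relying on a hard-coded yes/no rule.
top_40pct = ranked[: int(len(ranked) * 0.4)]
print(top_40pct)  # the two most relevant of five customers
```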
13. Learning from our past & present…
Some responses gathered from CJEs (Christina) and DAs (Arjen) when asking about the current way of working and gaps from best practices, both organizational and personal:
"I don’t have time for experiments & evaluation. We have a schedule and need to create the next campaigns"
"I get personal satisfaction from weighing in my opinion"
"I don’t see how testing everything to death leads to better results"
"Management is not critical enough about measuring our performance"
"I can’t get clear insights from my DA / CJE colleague"
"I know my customers"
"There’s a lack of clear guidelines and standards across ING tribes"
"I need to be able to contact more customers, even if models disagree"
14. How should we interpret the interviews?
Fears need to be addressed early on (e.g. fear of measurement, of loss of control, of automation).
Focusing on empowerment before the revolution:
• Helping DAs and CJEs to make better decisions, not making all decisions for them.
• Creating a direct link: standardization → gains in personal efficiency.
• Incorporating customers in our development squad.
Resource: check out ING PACE, an evidence-based, design-driven lean approach. (Figure: PACE phases and the experiment loop.)
16. Our Objective
Democratizing model building: enabling DAs to create models for gaining customer insights.
Accelerating best practices: making it easy & fast to be effective in customer selections.
• Model building process "built-in": tell us "what" you want – we take care of the "how".
• Evaluation "built-in": build a model → get a model & campaign evaluation for free!
• Compliance "built-in": GDPR, archiving, legal, commercial pressure, risk – we’ve got you covered.
What everyone gains – CJE (Christina), DA (Arjen), DS (Samir), DE (Eleanor), the customer (Claire) and ING Bank: saving time, better engagement, large-scale impact, understanding the customer better, growing in skills, meeting objectives, and more relevant offerings.
17. Model Factory
Building customer models without reinventing the model building process:
Model Recipe → Building Blocks → Model Building Process → Scoring Model f(x) = y → scoring eligible customers → feeding scores to ING processes → creating reports in BI tools for ING business units.
Somewhat similar open-source approaches: Uber’s Ludwig, Airbnb’s BigHead & KPN’s model factory.
18. Model Recipe
The model specification is translated into a 10-15 line JSON file, filled in by a DA.
Mandatory ingredients:
• Business objective – selection from: acquisition, deepsell, retention, customer journey.
• Business objective specification – based on the objective. For example: which product to acquire?
• Features to include / exclude – selection from a list, based on domain expertise.
• Customers to include / exclude – an SQL "where clause", based on domain expertise.
Optional ingredients (with defaults):
• Time specification – how long does acquisition take, how long before the customer makes a decision?
• Modelling techniques – for advanced users / data scientists.
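As a rough illustration of what such a recipe could look like (every field name here is hypothetical, not ING's actual schema):

```python
import json

# Hypothetical model recipe -- field names are illustrative only,
# not ING's actual schema.
recipe = {
    "objective": "acquisition",          # acquisition / deepsell / retention / customer journey
    "objective_spec": {"product": "credit_card_regular"},
    "features_include": ["engagement_*", "product_holdings_*"],
    "features_exclude": ["f_857"],
    "customer_filter": "age >= 18 AND country = 'NL'",   # SQL "where clause"
    # optional ingredients, with defaults:
    "time_spec": {"decision_window_months": 2},
    "modelling": {"classifier": "gbt", "automl": True},
}
print(json.dumps(recipe, indent=2))
```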
19. Building Blocks
Available to all models built with a recipe specification:
• Analytics features extraction
• Data-set creators
• Target templates (e.g. acquisition, deepsell)
• Classifiers
• Evaluators
• Hyperparameter / model selection (AutoML)
• Fairness & bias reduction
• Machine learning monitoring processes
• Uplift measurement
• Storage management
• Scheduling
• Hosting
• GDPR applications
• Interaction with ING services
21. Building Blocks Example (1): Data Sources – creating the model feature sources
Data dumps & streams from ING sources → Data Lake → structured data sources → analytics features table(s).
• Data engineers (Eleanor) build the ETL processes that create the data sources.
• Data scientists & analysts (Samir, Arjen) build an analytics repository from the data sources: clients (~80), products (~600), engagements (~300).
• Features in the table are GDPR validated.
Built on top of:
• IBM PureData for Analytics (PDA)
• SAS Enterprise Global
22. Building Blocks Example (1): Data Sources – transferring the data to our model building environment
Data is transferred from IBM PureData for Analytics to the Hortonworks Data Platform via Apache Sqoop (diffs / full), deployed with Ansible and scheduled by IBM Workload Scheduler:
1. Files are extracted from PDA’s tables to HDFS.
2. Hive tables’ metadata is created / updated.
Access is managed via Apache Ranger on both levels:
• HDFS policies: hdfs://ingestion/raw
• Hive policies on specific tables and model results, e.g. hdfs://access/BEL/models and hdfs://access/NED/models
Users & permissions are synchronized from LDAP via an ldap_user_sync container.
23. Building Blocks Example (2): Data Set Creators
Some tips for building datasets:
• Selecting different customers at each timestamp → generalizing to new customers.
• Arranging the data set in time-series order → generalizing better for forecasting.
Time-series cross-validator for picking the best hyper-parameters, with expanding training windows (reconstructed from the slide’s diagram):
• Train Jan ’17–Jan ’18, validate Mar ’18
• Train Jan ’17–Mar ’18, validate May ’18
• Train Jan ’17–May ’18, validate Jul ’18
• Finally: train Jan ’17–Jul ’18, test Dec ’18
Legend: Train – used to fit the model. Valid – used to learn hyper-parameters. Test – used to evaluate and to pick the best model.
Useful resource: Timothy Lin’s Creating a Custom Cross-Validation Function in PySpark.
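The expanding-window fold logic can be sketched in plain Python; in PySpark it becomes DataFrame filters on a snapshot-date column inside a custom cross-validator (per Timothy Lin's post). Month labels, window sizes and gaps below are illustrative.

```python
# Sketch of the expanding-window splits behind a time-series
# cross-validator. Each fold trains on all months up to a cut-off and
# validates on a month `gap` steps later, so the model is always
# evaluated on unseen, later data.

def expanding_window_splits(months, first_train_len, n_folds, step=2, gap=2):
    """Return (train_months, valid_month) pairs with an expanding train window."""
    splits = []
    for k in range(n_folds):
        end = first_train_len + k * step   # exclusive end of the training window
        train = months[:end]
        valid = months[end - 1 + gap]      # `gap` months past the last train month
        splits.append((train, valid))
    return splits

months = [f"{y}-{m:02d}" for y in (2017, 2018) for m in range(1, 13)]
splits = expanding_window_splits(months, first_train_len=13, n_folds=3)
# e.g. first fold: train 2017-01..2018-01, validate on 2018-03
```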
24. Building Blocks Example (3): Preparing data for training
Pipeline approach (pyspark.ml.pipeline): an elegant way to manage the workflow of your data processing. Each stage simply appends new transformers to the pipeline model’s stages.
Transformers: the vast majority of the transformers we use can be found in pyspark.ml.feature, plus some custom transformers we’ve defined ourselves to make sure all preprocessing is managed in the pipeline object.
(Slide code snippets: each fit function adds stages to the pipeline & transforms the data for the next stage; an example of a basic custom transformer.)
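The mechanics the snippet captions describe can be mimicked in plain Python (illustrative class names, not Spark's API): each stage's fit produces a fitted transformer that is appended to the stages, and its transform feeds the next stage. This is what pyspark.ml.Pipeline automates for Spark DataFrames.

```python
# Pure-Python mimic of the fit-append-transform pattern described above.
# Class and function names are illustrative, not pyspark.ml's API.

class MeanCenterer:
    """Toy stage: subtracts the column mean (like a custom Spark Transformer)."""
    def fit(self, rows):
        self.mean = sum(rows) / len(rows)
        return self
    def transform(self, rows):
        return [x - self.mean for x in rows]

class MinMaxScaler01:
    """Toy stage: rescales values to the [0, 1] range."""
    def fit(self, rows):
        self.lo, self.hi = min(rows), max(rows)
        return self
    def transform(self, rows):
        span = (self.hi - self.lo) or 1.0
        return [(x - self.lo) / span for x in rows]

def fit_pipeline(stages, rows):
    """Fit each stage, append it, and transform the data for the next stage."""
    fitted = []
    for stage in stages:
        fitted.append(stage.fit(rows))
        rows = stage.transform(rows)   # the output feeds the next stage
    return fitted, rows

stages, out = fit_pipeline([MeanCenterer(), MinMaxScaler01()], [1.0, 2.0, 3.0])
```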
25. Building Blocks Example (3 – bonus): Filling missing values
Apart from the pyspark.ml.feature Imputer class, we also experiment with autoencoders for filling missing values.
• Encoder: a dimensionality-reduction model, from the original features to a lower-dimensional code.
• Decoder: the reverse action – tries to recreate the original input from the code (not a perfect match).
Application: data with a missing feature goes in, X = [5, 10, 23, <?>, 0], and a filled, approximately reconstructed record comes out, X' = [5, 10, 22, 7, 0].
A very good resource for distributed deep learning on top of Spark: dist-keras (by the CERN database group).
To learn more about autoencoders, check out Irhum Shafkat’s Intuitively Understanding Variational Autoencoders.
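To make the encode/decode/fill idea concrete, here is a toy linear autoencoder in plain numpy. It is a sketch of the principle only, not the deep, distributed (dist-keras) setup mentioned above, and all data, sizes and learning rates are made up; as the slide notes, the reconstruction is not a perfect match.

```python
import numpy as np

# Toy linear autoencoder: encoder W1, decoder W2, trained by gradient
# descent to reconstruct complete rows, then used to fill a missing entry.
rng = np.random.default_rng(0)
a, b = rng.normal(size=200), rng.normal(size=200)
X = np.stack([a, b, a - b, a + 0.5 * b], axis=1)   # rank-2 data: a 2D code suffices

W1 = rng.normal(scale=0.1, size=(4, 2))            # encoder: 4 features -> 2D code
W2 = rng.normal(scale=0.1, size=(2, 4))            # decoder: 2D code -> 4 features
loss0 = np.mean((X @ W1 @ W2 - X) ** 2)            # reconstruction error before training
for _ in range(2000):
    Z = X @ W1                                     # encode
    E = Z @ W2 - X                                 # reconstruction error
    W2 -= 0.05 * Z.T @ E / len(X)                  # gradient steps on the MSE
    W1 -= 0.05 * X.T @ (E @ W2.T) / len(X)
loss1 = np.mean((X @ W1 @ W2 - X) ** 2)

def impute(row, missing_idx):
    """Zero the missing slot, push through the autoencoder, read the output."""
    filled = np.where(np.isnan(row), 0.0, row)
    return (filled @ W1 @ W2)[missing_idx]

# Decoder's rough estimate for the missing fourth feature of a new row.
guess = impute(np.array([1.0, 2.0, -1.0, np.nan]), 3)
```

The zero-init for the missing slot makes this a rough first guess; real pipelines would iterate the fill or use deeper, nonlinear models.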
26. Building Blocks Example (4): Model Building
Relying on open-source big data technologies as building blocks.
Classifiers (the model types) – mainly based on the Spark machine learning framework, including:
• Linear / logistic regression
• Naïve Bayes
• Decision trees
• Ensemble methods (random forest, GBRT)
• Neural networks (MLP)
Evaluators (model performance validation): everything under the Spark MLlib evaluation metrics.
AutoML (finding the best model): currently experimenting with auto-sklearn & H2O for faster hyper-parameter tuning. See Georgian Partners’ comparison.
27. Building Blocks Example (5): Fairness
For an easy explanation: Attacking discrimination with smarter machine learning – https://research.google.com/bigpicture/attacking-discrimination-in-ml/
For approaches to reducing bias: IBM AI Fairness 360 – http://aif360.mybluemix.net/
29. Engaging with the model factory process & results
The personas – Customer Journey Expert, Data Analyst, Data Scientist and Data Engineer – engage through two channels: an in-house monitoring tool and BI tools (IBM Cognos Analytics). Activities include:
• Validating building blocks
• Validating model execution
• Validating model quality
• Understanding the customer better
• Selecting customers for a campaign
• Post-hoc campaign evaluation
• Getting the big picture of model usage
31. Designated system for monitoring production ML models
This architecture enables access to metrics data from production models:
• When a user creates a model, we create a new model monitoring project.
• The development / production cluster (on ING servers, test / prd) produces JSON files with project details, metrics, models, etc.
• The project JSON files are pushed via Logstash into Elasticsearch.
• The monitoring dashboard’s backend loads the project, and the frontend shows the user the project metrics.
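For illustration, a monitoring document of the kind such a pipeline might push could look like the following; every field name here is an assumption, since the actual in-house schema isn't shown in the talk.

```python
import json

# Hypothetical model-monitoring document as it might be pushed (via
# Logstash) into Elasticsearch. All field names are illustrative.
doc = {
    "project": "credit_card_acquisition",
    "environment": "prd",                         # test / prd
    "model": {"type": "gbt", "trained": "2019-03-01"},
    "metrics": {"auc": 0.81, "lift_at_10pct": 3.3},
    "timestamp": "2019-03-20T08:00:00Z",
}
line = json.dumps(doc)   # serialized as one JSON document per line
```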
32. Designated system for monitoring production ML models
Useful resource: Google’s "What’s your ML test score? A rubric for ML production systems" (Breck et al., 2016)
Open-source alternative: mlflow.org (a platform for the machine learning lifecycle)
34. Reporting on the model built
The model report’s panels:
A. Technical quality metrics (recall, precision, AUCs)
B. Lift curve
C. Cumulative gains
D. Overlap with manual selection
E. Feature importance
F. Customer segmentation
G. Model comparison heat map
H. Compare feature distributions
I. Score distribution
J. Conversion for feature values
35. Credit Card Acquisition – Model Performance (panels B/C)
A couple of examples to explain visually how good the model is (x-axis: customer percentiles, 1% = 54k customers):
• Cumulative gains – how many of our conversions did the model catch?
• Lift curve – how much better is our selection than random selection?
DA (Arjen): "Ok… It’s better than random… But how does it compare to my previous selection?"
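Both curves can be computed from first principles; this toy computation (made-up scores and outcomes) shows exactly what each panel measures.

```python
# Cumulative gains = share of all conversions captured in the top slice;
# lift = conversion rate in the slice vs. the overall rate (i.e. vs. a
# random selection). Toy data, not ING's actual code.

def gains_and_lift(scores, converted, pct):
    """Cumulative gains and lift for the top `pct` fraction of customers."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    k = max(1, int(len(scores) * pct))
    top = order[:k]
    caught = sum(converted[i] for i in top)
    total = sum(converted)
    gains = caught / total                        # share of conversions captured
    lift = (caught / k) / (total / len(scores))   # vs. random selection
    return gains, lift

scores    = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
converted = [1,   1,   0,   1,   0,   0,   0,   0,   0,   0]
g, l = gains_and_lift(scores, converted, 0.2)   # top 20% -> 2 of 10 customers
```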
36. What’s the difference between my old selection and the model’s? (panel D)
(Bar chart: for each ranked-customer percentile (left = most relevant), the number of customers selected in that percentile, split into customers in the old selection vs. customers not in the old selection, with a 20% threshold marked.)
DA (Arjen):
"I feel more confident the model makes meaningful selections"
"I see that the model found top customers that I haven’t contacted yet"
"I still don’t understand whom I missed…"
37. Who are the likely customers? (In this example: by age – panel H)
Age distribution in each customer group (portion of the entire available population), per age bucket 18-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60+:
"How does [age] distribute over the most likely customers?" – top 10% versus bottom 90%, one series per group:
6.24%, 10.82%, 10.55%, 10.32%, 10.13%, 12.06%, 12.72%, 12.82%, 14.34%
1.94%, 7.05%, 8.38%, 8.00%, 8.35%, 11.59%, 15.05%, 18.01%, 21.64%
"Whom did we miss in the top 20%?" – selected versus not selected, one series per group:
5.82%, 13.77%, 12.34%, 9.61%, 8.68%, 10.63%, 12.12%, 12.35%, 14.68%
1.63%, 6.81%, 8.27%, 8.38%, 8.72%, 12.10%, 15.29%, 17.99%, 20.81%
38. Credit Card Acquisition – Important Features (panel E)
Strategy for features: iterative refinement of the filtered features.
(Word clouds: larger = more important, blue = significant to only one model.)
Regular credit card acquisitions: f_53, f_348, f_127, f_218, f_31, f_38, f_43, f_8, f_857, f_842
Platinum credit card acquisitions: f_842, f_12, f_15, f_38, f_31, f_457, f_458, f_857, f_218, f_127
DA (Arjen):
"I gain confidence that the models generate meaningful results."
"I can easily troubleshoot issues with the model recipe."
"I learn more about our customers."
39. Customer Segmentation (panel F)
Grouping customers together based on the model’s important features:
• Segment size: an indication of the number of customers.
• Segment color: average conversion (more yellow = higher conversion).
• X/Y axes: don’t mean much by themselves, but the distance between segments does – customers in distant segments differ more on the important features (closer segments = more similar).
Allows for further analysis on customer segments.
CJE (Christina): "This helps me understand who my customers are and tailor a message for each type of customer."
40. Credit Card Acquisition – Which proposal to whom? (panel G)
Combined ranking for both credit card acquisition models. X-axis: customers ranked by interest in a regular credit card (left = most interested). Y-axis: customers ranked by interest in a platinum credit card (down = most interested). Rectangles mark the top 10% of customers in each group; brighter = more customers in the shared percentile (log scale).
Number of customers in each shared percentile:
                      Top 10% regular    Bottom 90% regular
Top 10% platinum      232k               412k
Bottom 90% platinum   347k               4.9M
DA (Arjen): "I can now send the relevant offer to the relevant customers and avoid spamming."
41. What to expect from the top-scored customers next month?
(Chart: expected next-month conversion rate for each customer percentile as ranked by the model; lower percentile = more relevant; y-axis 0.00%–3.50%.)
Percentile | Expected percentile conversion | Expected total conversion | Expected total conversions (#)
5          | 1.81%                          | 2.25%                     | 1,404
10         | 1.37%                          | 1.88%                     | 2,351
20         | 0.95%                          | 1.50%                     | 3,737
50         | 0.48%                          | 1.00%                     | 6,268
100        | 0.09%                          | 0.65%                     | 8,103
43. Model Factory @ING Bank – Summary
• Enabling model creation, using data scientists’ best practices and cumulative efforts.
• Simple specification, modular design.
• Accelerates DAs, empowers CJEs, and makes all of us more relevant to our customers.
Selected Resources:
Driving innovation:
• ING PACE: an evidence-based, design-driven lean approach
Model building:
• Uber’s Ludwig – building models without coding
• Airbnb’s BigHead
• Georgian Partners’ AutoML comparison
• Creating a Custom Cross-Validation Function in PySpark
• Distributed deep learning on Spark: dist-keras
Machine learning in production:
• What’s your ML test score? A rubric for ML production systems
• MLflow: machine learning lifecycle
Fairness & bias removal:
• Google’s "Attacking Discrimination in ML"
• IBM’s AI Fairness 360
Linkedin.com/in/KedemDor
Dor.Kedem (at) ing.com