Predictive analytics can benefit virtually any data-intensive and knowledge-intensive organization. Integrated into existing business processes and applications, it opens up many powerful opportunities. In this brief presentation I outline my experience and skills in data analytics.
2. Discovery of hidden opportunities in data and design of automated
Machine Learning solutions with integration into existing
systems/processes
Areas of applications:
● Reducing risk with early warning systems;
● Improving operations, for example with advanced credit and
risk scoring methods and reduction of customer churn (attrition);
● Optimizing marketing campaigns with targeted
up-selling and cross-selling and optimized budget allocation;
● Detecting fraud with pattern matching and detection of unusual operations
Industries:
Banking & Financial Services, Health Insurance, Retail, Government
& Public Sector, Energy & Utilities, Manufacturing
Cross-industry Synergies: Banking + Retail
[Diagram: data sources (Source 1, Source 2, Source 3), ML methods, data marts, own predictive reporting, integration of insights into business processes]
Types of data:
Transactions
Log files
H/w usage data
Web statistics
Mobile statistics
Call center stat.
etc.
3. Summary of Opportunities
Banking
● Consumer Credit Risk Models via Machine-Learning Algorithms
● Online Risk Monitoring and Fraud Detection
● Customer Analytics for Targeted Cross-selling and Up-selling
Insurance
● Differentiated Insurance Premiums Based on Predictive Risk Factors
● Online Risk Monitoring and Fraud Detection
Retail
● Demand Forecasting and Retail Pricing Optimization
● Promotional optimization
4. Consumer Credit Risk Models via Machine Learning Algorithms
Predictive models based on consumer credit card usage (behavioural data) are
becoming increasingly popular in banks because they offer a science-based
approach to reducing risk and generating revenue.
The typical solution is based on Machine Learning algorithms that regularly
process credit card usage and other available behavioural data in the
background, classify consumers into one or several risk groups and
recommend appropriate actions, for example upgrading or downgrading
a consumer's limits, referral to special care, and others.
Consumer classification models operate with confidence above 85%. That
provides a scientific basis to:
1. profitably increase credit portfolio while lowering risks
2. dynamically detect high-risk patterns and suggest appropriate actions
on time
3. differentiate on the market with new credit products
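The classify-then-recommend step described above can be sketched as follows. This is a minimal illustration, not the production logic: the probability cut-offs, group names, action texts and the `recommend` helper are all hypothetical, and the default probability is assumed to come from an upstream scoring model.

```python
from dataclasses import dataclass

# Hypothetical bands: (upper bound on default probability, risk group, action).
# A real deployment would calibrate these against portfolio data.
RISK_BANDS = [
    (0.20, "low", "consider limit upgrade"),
    (0.60, "medium", "keep current limit"),
    (1.01, "high", "downgrade limit or send to special care"),
]


@dataclass
class Recommendation:
    customer_id: str
    risk_group: str
    action: str


def recommend(customer_id: str, default_probability: float) -> Recommendation:
    """Map a model's default probability to a risk group and an action."""
    if not 0.0 <= default_probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    for upper, group, action in RISK_BANDS:
        if default_probability < upper:
            return Recommendation(customer_id, group, action)
    raise AssertionError("unreachable: bands cover [0, 1]")


# Hypothetical scores produced by the background scoring run.
scores = {"C001": 0.05, "C002": 0.45, "C003": 0.91}
recommendations = [recommend(cid, p) for cid, p in scores.items()]
for r in recommendations:
    print(r.customer_id, r.risk_group, r.action)
```

Keeping the bands in a single table makes it easy to adjust cut-offs without touching the scoring model itself.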
Banking (1)
Proposition
1. Pilot project that would confirm the
model, present the findings. Duration
6-8 weeks
2. Development and integration into
existing application infrastructure.
Duration 16-24 weeks
3. Stepwise roll-out and technical
support
5. Banking (2)
Online Risk Monitoring and Fraud Detection
Depending on regulatory or business requirements, online or offline risk
monitoring systems can be implemented. Using advanced classification
methods, such systems continuously monitor the data flow and flag transactions
with a high probability of fraud or risk.
Predictive Customer Analytics for Targeted Cross-selling and Up-selling
Existing customer analytics and CRM systems could be enriched with advanced
segmentation methods. The resulting segmentation is more precise, and is
ultimately based on the likelihood that a consumer will accept a given offer. The
result is a win-win situation as customers are offered more relevant products and
services, leading to a more profitable relationship with the bank.
Predictive Cash Flow Management
Optimize cash flows based on a recommended schedule of ATM or branch cash
collection, taking into account the required level of service and associated costs.
Proposition
1. Prioritise business goals and project
deliverables
2. Pilot project that would confirm the
model, present the findings. Duration
6-8 weeks
3. Development and integration into
existing application infrastructure.
Duration 16-24 weeks
4. Stepwise roll-out and technical
support
6. Banking: Case Study
Predictive Customer Analytics for Targeted Cross-selling and
Up-selling
The data come from direct marketing campaigns of a Portuguese banking
institution. The marketing campaigns were based on phone calls.
Often, more than one contact with the same client was required in
order to assess whether the product (bank term deposit) would be
subscribed ('yes') or not ('no').
There are 36190 accounts that could potentially be interested in the
new product. We construct a predictive model using Neural
Networks. The training set is based on 10% of the data (3619 call records).
The model predicts responses for the remaining 90% of customers with
accuracy above 95%.
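The evaluation protocol above (train on 10% of accounts, score the remaining 90%) can be sketched as follows. This is only an illustration of the split and the accuracy calculation: the data are synthetic, and a simple logistic regression stands in for the neural network used in the case study, so the resulting accuracy says nothing about the real model.

```python
import math
import random

random.seed(0)

N_ACCOUNTS = 36190  # number of accounts quoted in the case study


def make_account():
    """Synthetic stand-in for a customer record: two features and a yes/no label."""
    x1, x2 = random.gauss(0, 1), random.gauss(0, 1)
    label = 1 if x1 + 0.5 * x2 + random.gauss(0, 0.3) > 0 else 0
    return (x1, x2), label


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


accounts = [make_account() for _ in range(N_ACCOUNTS)]
random.shuffle(accounts)
n_train = N_ACCOUNTS // 10  # 10% of accounts for training
train, holdout = accounts[:n_train], accounts[n_train:]

# Logistic regression fitted by stochastic gradient descent
# (a stand-in for the neural network in the case study).
w = [0.0, 0.0]
b = 0.0
LEARNING_RATE = 0.1
for _ in range(10):
    for (x1, x2), y in train:
        err = sigmoid(w[0] * x1 + w[1] * x2 + b) - y
        w[0] -= LEARNING_RATE * err * x1
        w[1] -= LEARNING_RATE * err * x2
        b -= LEARNING_RATE * err


def predict(features):
    x1, x2 = features
    return 1 if sigmoid(w[0] * x1 + w[1] * x2 + b) >= 0.5 else 0


accuracy = sum(predict(x) == y for x, y in holdout) / len(holdout)
print(f"trained on {len(train)} accounts, holdout accuracy {accuracy:.3f}")
```

Scoring only accounts the model has never seen is what makes the reported accuracy a fair estimate of performance on new customers.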
[Figure: the purple area indicates prediction error]
7. Example of work: Customer Attrition Analysis
Project: Assessment project to identify
customers whose credit usage will gradually
reduce
Goal: Identify patterns with high probability that
a customer will leave or become inactive
Data: Card transaction trends + call center
inquiries
Results: Identified patterns that matched the
goal with 86% probability.
8. Opportunities in Insurance
Differentiated Insurance Premiums Based on Predictive Risk Factors
Differentiate insurance premiums based on risk scores calculated from a wide
range of factors including: car model, year, colour and other attributes, statistics
of driver incidents and fines, driving intensity, regional statistics and trends, and
other factors.
Statistically safer profiles get lower premiums, so you win new customers with
better scores. Statistically risky deals get higher premiums; those customers
either leave to competitors for a lower price or stay and become more profitable.
Online Risk Monitoring and Fraud Detection
Various online early warning systems based on financial and non-financial
indicators. Automatic identification of fraudulent operations. Stress testing.
Customer Targeting and Personalization
Using advanced segmentation methods, identify the most loyal customers and those
with a high risk of attrition. Anticipate how a specific customer will react to
bundled packages or discounts. Identify opportunities for cross-sales of different
insurance products.
Proposition
1. Prioritise business goals and project
deliverables
2. Pilot project that would confirm the
model, present the findings. Duration
6-8 weeks
3. Development and integration into
existing application infrastructure.
Duration 16-24 weeks
4. Stepwise roll-out and technical
support
9. Insurance Case Study
The company XYZ is one of the top ten non-life insurance companies in
Thailand.
Questions: How can insurance firms retain their best customers? Will this damaged
car be covered and receive a claim payment? How large will the loss of claims
associated with this policy be?
Summary: This data mining project helps develop a program to aid in the
underwriting process. The scoring model gives the probability of a given insurance
applicant defaulting on claims and the total loss of claims. The threshold can be
selected such that all applicants whose probability of default exceeds the
threshold level (80%, for instance) will be recommended for rejection, closer attention,
or higher deductible options. Some other key findings of this study are as follows:
● When gross premium increases, the total loss of claims decreases.
● When the maximum amount of liability increases, the total loss of claims
decreases.
● Non-Japanese vehicles have a higher chance of reporting claims than
Japanese vehicles.
● Sedans have the lowest risk of loss of claims, while trucks have the
highest risk of loss of claims.
● For the market channel, direct sale has the lowest risk of loss of claims
compared to Agent, Broker, Motor Partner and Kbank sales.
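The effect of choosing a threshold, as described in the summary above, can be sketched as follows. The applicant scores are hypothetical, and `referred` is an illustrative helper rather than part of the study; only the 80% level comes from the text.

```python
def referred(scores, threshold):
    """Applicants whose predicted probability exceeds the threshold."""
    return [s for s in scores if s > threshold]


# Hypothetical model scores for ten applicants.
scores = [0.05, 0.12, 0.33, 0.47, 0.58, 0.71, 0.82, 0.86, 0.93, 0.97]

# 0.8 is the example threshold from the case study; the others are for comparison.
for threshold in (0.6, 0.8, 0.9):
    share = len(referred(scores, threshold)) / len(scores)
    print(f"threshold {threshold:.0%}: {share:.0%} of applicants referred")
```

A lower threshold refers more applicants for rejection, closer attention or a higher deductible; a higher threshold refers fewer but lets more risk through.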
10. Insurance Case Study
The data used constitute a French auto insurance portfolio containing
50000 policies registered during the year 2009.
Non-life insurance pricing consists of establishing a premium or
tariff paid by the insured to the insurance company in exchange for the
risk transfer. The insurance premium is calculated by multiplying
the estimated frequency of claims by their estimated cost.
We performed an analysis using Generalized Linear Models in order to
establish the pure premium given the characteristics of the
policyholders. As a first stage, the frequency of claims is
estimated through a Poisson regression model. In the next stage,
a Gamma model is used to estimate the average claim
cost for each class of policyholders.
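The frequency-times-cost calculation can be sketched as follows. This is an illustration under stated assumptions: the coefficients and risk factors are hypothetical rather than fitted to the French portfolio, and a log link is assumed for both the Poisson and the Gamma model, which is a common choice but is not stated in the study.

```python
import math

# Hypothetical coefficients on the log scale (not taken from the study).
FREQ_COEF = {"intercept": -2.0, "young_driver": 0.4, "commercial_use": 0.3}
SEV_COEF = {"intercept": 7.0, "young_driver": 0.1, "commercial_use": 0.2}


def linear_predictor(coef, profile):
    """Intercept plus the weighted sum of the policyholder's risk factors."""
    return coef["intercept"] + sum(coef[name] * value for name, value in profile.items())


def pure_premium(profile):
    """Expected claim frequency multiplied by expected cost per claim."""
    frequency = math.exp(linear_predictor(FREQ_COEF, profile))  # claims per policy-year
    severity = math.exp(linear_predictor(SEV_COEF, profile))    # cost per claim
    return frequency * severity


base = pure_premium({"young_driver": 0, "commercial_use": 0})
young = pure_premium({"young_driver": 1, "commercial_use": 0})
print(f"base profile: {base:.2f}, young driver: {young:.2f}")
```

With log links, each risk factor multiplies the pure premium by a fixed relativity, which is what makes the resulting tariff easy to present as a table of factors.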
Eventually, the research results have shown that for the new customers,
the insurance premium will be established while considering a series of
risk factors, like age and profession of the insured, purpose of vehicle
usage, bonus-malus coefficient and the age of the insurance
contract. Therefore, within the analyzed insurance portfolio, a decrease
of the pure premium is observed along with an increase of the insure
11. Opportunities for FMCG Retail
Demand Forecasting and Retail Pricing Optimization
Optimal retail pricing based on consumer price elasticity of demand,
competitive environment, customer loyalty, stock level and other factors.
The solution is currently working in 96 non-food supermarkets in Russia.
Retail case studies on the next page.
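How price elasticity drives the optimal price can be sketched as follows. This is a deliberately simplified illustration, not the deployed solution: it assumes a constant-elasticity demand curve and a single product, ignores the competitive environment, loyalty and stock level mentioned above, and uses hypothetical numbers throughout.

```python
def demand(price, base_price, base_units, elasticity):
    """Constant-elasticity demand: units sold scale as (price / base_price) ** elasticity."""
    return base_units * (price / base_price) ** elasticity


def profit(price, unit_cost, base_price, base_units, elasticity):
    return (price - unit_cost) * demand(price, base_price, base_units, elasticity)


def optimal_price(unit_cost, elasticity):
    """Closed-form profit-maximizing price under constant elasticity."""
    if elasticity >= -1:
        raise ValueError("closed form requires elastic demand (elasticity < -1)")
    return unit_cost * elasticity / (elasticity + 1)


# Hypothetical product: cost 60, currently sold at 90, 1000 units, elasticity -2.5.
UNIT_COST, BASE_PRICE, BASE_UNITS, ELASTICITY = 60.0, 90.0, 1000.0, -2.5

best = optimal_price(UNIT_COST, ELASTICITY)

# Cross-check the closed form with a simple grid search over candidate prices.
grid = [UNIT_COST + 0.5 * i for i in range(1, 201)]
grid_best = max(grid, key=lambda p: profit(p, UNIT_COST, BASE_PRICE, BASE_UNITS, ELASTICITY))

print(f"closed-form optimum: {best:.2f}, grid-search optimum: {grid_best:.2f}")
```

The more elastic the demand, the closer the optimal price sits to unit cost, which is why estimating elasticity per item matters before changing prices.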
Promotional optimization
Promotional optimization helps to choose the right items to promote and to set the optimal
discount level and duration of the promo. We can handle such factors as price
elasticity of demand, cross-elasticity of sales, cannibalization and halo effect,
consumer demand cycles and seasonality.
Proposition
1. Pilot project that would confirm the
model, present the findings. Duration
6-8 weeks
2. Development and integration into
existing application infrastructure.
Duration 16-24 weeks
3. Stepwise roll-out and technical
support
12. Example of work: Retail Pricing Optimization
Project: Optimization of retail prices in a drugstore (drogerie)
retail chain with 96 stores (Russia)
Goal: Optimize retail prices for in-store
assortment to maximize retailer profits.
Data: 28 months of sales history
Results: 4 months of testing in 10 pilot stores
demonstrated 6.14% like-for-like (LFL) growth in gross profits.
The client has decided to roll out the solution to
the entire chain of 96 stores.
[Figure: profit maximization]
13. Example of work: Retail Demand Forecasting
Project: Improve demand forecasting
methods within Supply Chain process.
Goal: Accurate promotional forecasting that
outperforms a 6-week moving average.
Data: Ticket level data and promo history for
28 months.
Results: Store-level forecasts demonstrated
92% accuracy over a 6-month period, category-level
forecasts 74% accuracy, and SKU-level forecasts a mean
accuracy of 65% and 85% with min/max
accuracy intervals.
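The 6-week moving average baseline that the forecasts were meant to beat can be sketched as follows. The weekly sales figures are hypothetical, and accuracy is defined here as 1 minus the mean absolute percentage error, which is an assumption: the slide does not say how accuracy was measured.

```python
def moving_average_forecast(history, window=6):
    """Forecast the next period as the mean of the last `window` observations."""
    if len(history) < window:
        raise ValueError("need at least `window` observations")
    return sum(history[-window:]) / window


def forecast_accuracy(actual, forecast):
    """Accuracy as 1 - MAPE, floored at 0 (an assumed definition)."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actuals to score against")
    mape = sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)
    return max(0.0, 1.0 - mape)


# Hypothetical weekly unit sales for one store.
weekly_sales = [120, 135, 128, 140, 150, 145, 160, 155, 170, 165, 180, 175]

WINDOW = 6
actuals = weekly_sales[WINDOW:]
# Rolling evaluation: each week is forecast from the six weeks before it.
baseline = [
    moving_average_forecast(weekly_sales[:t], WINDOW)
    for t in range(WINDOW, len(weekly_sales))
]
baseline_accuracy = forecast_accuracy(actuals, baseline)
print(f"6-week moving average baseline accuracy: {baseline_accuracy:.1%}")
```

Any candidate forecasting model can be scored with the same `forecast_accuracy` function on the same weeks, so the comparison with the baseline is like for like.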
14. Example of work: Assortment Clustering
Project: Cluster stores based on customer
purchasing behaviour to optimize assortment
carried in the stores.
Goal: Increase store traffic and sales.
Data: Ticket level data and promo history for 28
months.
Approach: Clustering by the k-means method
using 100 high-level categories, each with 4
price segments, totaling 400 clustering
variables.
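The k-means approach can be sketched as follows. This is a minimal pure-Python version of the standard algorithm with hypothetical data: each store is described by only 3 sales-share variables instead of the 400 used in the project, and the store names and values are invented for illustration.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=50, seed=0):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assignment = []
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return assignment, centroids


# Hypothetical stores: share of sales in three category/price-segment variables.
stores = {
    "S1": (0.70, 0.20, 0.10),
    "S2": (0.65, 0.25, 0.10),
    "S3": (0.20, 0.30, 0.50),
    "S4": (0.15, 0.35, 0.50),
    "S5": (0.68, 0.22, 0.10),
    "S6": (0.18, 0.32, 0.50),
}

names = list(stores)
assignment, centroids = k_means([stores[n] for n in names], k=2)
clusters = dict(zip(names, assignment))
print(clusters)
```

Using sales shares rather than absolute sales keeps large and small stores comparable, so clusters reflect purchasing behaviour rather than store size.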
18. Farid Gurbanov: Publications and Speeches
● “An Algorithm and Demand Estimation Procedure for Retail Price Optimization”,
Invited speaker to Symposium on Data Science & Statistics 2018, Reston, VA, USA
● “Bringing online survey techniques to offline stores”, Invited speaker to Big Data
Meets Survey Science 2018, Barcelona, Spain
● “Artificial intelligence in sales: potential uses and competitive advantages”, New
Pharmacy magazine, 2018, Russia
● “Dynamic Pricing”, Interview to Retailer.ru, 2018, Russia
https://retailer.ru/dinamicheskoe-cenoobrazovanie/
● “Smart Pricing in Pharma”, Interview. Radio Mediametrics 2017, Moscow, Russia
● “Customer segmentation and other methods of predictive analytics in retail”, Guest
Speaker at CNEWS Forum 2016, Moscow, Russia
● “Predictive analytics and technological transformation of marketing”, Guest Speaker
at IT-Retail Forum 2016, Moscow, Russia