Machine Learning in
Customer Analytics

January 23, 2014 | Proprietary and Confidential
Transformation Through Integration: Realizing
the Full Potential of Your Information

blueocean is a next-generation services organization with a deep focus on analytics, market
intelligence and digital media, all uniquely delivered under one roof by 650 plus professionals.
Our 360 Discovery™ process ensures the comprehensive utilization of all available structured and
unstructured data sources, enabling us to bring the best to bear against each project.
By combining the talent, speed and cost benefit of a flat world, along with our scalable delivery
model, we are able to achieve a more nuanced and comprehensive understanding of the market at
the delivery speed and price advantage that today’s business climate demands.

What is Machine Learning?
• The machine learns patterns in the training data using input features
• The learned patterns are applied to unseen data to check that they generalize
• If generalization fails, the input features are modified and more training data is fed to the algorithm
• Regression or classification is then performed
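
The learn-then-check loop described on this slide can be sketched in a few lines. This is only an illustration: the data, the feature names (monthly calls, monthly data in GB) and the nearest-centroid classifier are assumptions chosen for brevity, not something taken from the deck.

```python
# Hypothetical toy data: (monthly_calls, monthly_data_gb) -> usage label.
train = [((10, 1.0), "low"), ((12, 1.5), "low"), ((11, 0.8), "low"),
         ((40, 6.0), "high"), ((45, 7.5), "high"), ((38, 5.5), "high")]
unseen = [((9, 1.2), "low"), ((42, 6.8), "high"),
          ((14, 2.0), "low"), ((36, 5.0), "high")]


def fit_centroids(rows):
    """Learn one mean feature vector (centroid) per label."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, rows):
    """Share of rows whose predicted label matches the true label."""
    hits = sum(predict(centroids, features) == label for features, label in rows)
    return hits / len(rows)


model = fit_centroids(train)           # learn patterns from training data
train_acc = accuracy(model, train)     # fit on the data the model has seen
unseen_acc = accuracy(model, unseen)   # generalization check on held-out data
```

If `unseen_acc` were much lower than `train_acc`, that would be the signal to revisit the input features or gather more training data.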

Machine Learning Comes of Age
The era of Analytics 3.0 combines structured transactional data and unstructured text data with
complex machine learning algorithms to generate better and faster insights

Analytics 1.0
• Implementing business intelligence
• Reporting
• Descriptive Analytics
• Focus on internal, structured data

Analytics 3.0
• Combining structured and unstructured data formats
• Analytics central to the business strategy
• Faster technologies
• Analytics model embedded into operational and decision processes

Key Technology Enablers for Machine Learning
• Better and inexpensive storage capacities
• Increased processing power of machines
• Large scale availability of data
• Open source revolution
• Advent of Hadoop and NoSQL technologies

Key Business Enablers for Machine Learning
• Applications in unconventional fields, thus gaining wider acceptance
• Organizations are higher on the analytics maturity curve
• Lower implementation cost

From Science to Enterprise – How Big Data is Assisting Machine Learning
• Big Data Analytics offers access to speech, text and social analytics tools and expertise on demand
• Machine Learning allows rapid processing of large amounts of customer-centric data, including customer conversations in the form of calls, email and chat

Unstructured data comes from multiple sources:
• CCTV camera data
• CDR data (Telecom)
• Digital pictures and videos posted online
• Sensors used to gather information
• Telephonic conversations
• Emails and feedback
• GPS data (from mobile devices)
• Transaction records
• Access logs
• Posts to social media sites

Turning big data into actionable insights brings new practical and theoretical challenges:
Data Acquisition | Storage | Processing | Data Transport and Dissemination | Data Management and Curation | Archiving | Security | Analyzing for Business Actions

What can Machine Learning Do for Business?
Learn – Algorithms and computational models to learn and gain knowledge about users
Predict – Predictive analytics to provide actionable information for organizations

Cloud Computing
Big data

Natural Language Processing
• Sentiment Analysis
• Text Classification
• Knowledge Acquisition
• Multilingual language processing

Algorithms
• Bayesian Classifier
• Neural Networks
• SVM

With machine learning everybody wins
Wide applications across industries:
• Recommender Systems
• Biotechnology
• Supply chain optimization
• Product Marketing
• Counter-Terrorism
• Fraud Detection

Use-Case: Machine Learning in Customer Analytics (Telecom)
Build single view of customer
[Diagram: structured and unstructured data sources (network data, Call Data Records, GPRS Data Records, contact centre logs) feed data aggregation and an analytics engine, supporting next best offer, churn prediction, campaign management and social network analytics]

Categories of Machine Learning Algorithms
Supervised Learning Algorithms:
• Training the machine on a training dataset with a set of input features and a corresponding output
• Generalization: the machine learns a mathematical function which can be generalized and applied to unseen data

Examples:
• Classifying email as spam/not spam
• Predicting loan default (Yes/No)
• Forecasting stock prices
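
As an illustration of the spam/not-spam example, here is a minimal sketch of a naive Bayes text classifier (a Bayesian classifier with Laplace smoothing). The four training messages and the `ham` label for non-spam are made up for the example; a real system would need far more data and proper text preprocessing.

```python
import math
from collections import Counter

# Hypothetical labeled training set: (message text, label).
train = [
    ("win cash prize now", "spam"),
    ("cheap loan offer win", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project report attached", "ham"),
]


def fit(rows):
    """Count words per label and documents per label."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in rows:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the highest log posterior under naive Bayes."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # log prior
        for word in text.split():
            if word in vocab:  # words never seen in training are ignored
                # Laplace (add-one) smoothed word likelihood
                score += math.log((word_counts[label][word] + 1)
                                  / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


model = fit(train)
label_a = predict(model, "win a cash prize")
label_b = predict(model, "monday project meeting")
```

The labeled outputs in `train` are what make this supervised: the function learned from them is then applied to messages the model has not seen.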

Unsupervised Learning Algorithms:
• Training dataset does not require labeled outputs.
• No function mapping from inputs to outputs is learned.
• Objective is to understand structure in the data.

Examples:
• Discovering different segments of telecom subscribers based on their call patterns and data usage.
• Social Network Analysis: discovering communities within large groups of people.
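
The subscriber-segmentation example can be sketched with k-means clustering, one common unsupervised technique (the deck does not say which algorithm was used). The subscriber figures, the choice of two segments and the fixed starting centroids are all assumptions made so the example is small and deterministic.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
            )
            clusters[nearest].append(point)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical subscribers: (monthly voice minutes, monthly data usage in GB).
# No labels are supplied; in practice the features would be scaled first so that
# minutes do not dominate the distance.
subscribers = [(520, 1.0), (480, 0.5), (510, 1.5),
               (60, 9.0), (45, 11.0), (75, 10.0)]

centroids, segments = kmeans(subscribers, centroids=[subscribers[0], subscribers[3]])
```

Here the two recovered segments correspond to voice-heavy and data-heavy subscribers, but that interpretation is added by the analyst, not by the algorithm.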
Advantages of Machine Learning
• Useful where large scale data is available
• Large scale deployments of Machine Learning are beneficial in terms of improved speed and accuracy
• Captures non-linearity in the data and generates a function mapping input to output (Supervised Learning)
• Recommended for solving classification and regression problems
• Enables better profiling of customers to understand their needs
• Helps serve customers better and reduce attrition

Disadvantages of Machine Learning

• Limited understanding of the machinery of classifiers (Black Box)
• Requires a significant amount of data
• May not work in cases where data collection is difficult or expensive
• Risk of overfitting if the model is fitted on a small dataset

Challenges in Machine Learning Implementation
• Integration of data from different sources within the organization
• Good business understanding required to build better input features
• Thorough understanding of algorithms required before they can be deployed
• Appropriate selection of machine learning algorithm essential
• Implementing algorithms which can give more business interpretability and insights

Statistics in the Age of Machine Learning

• Statistics mainly deals with probabilistic or deterministic approaches
• Popular in fields where data collection can be difficult or expensive
• Provides a good understanding of a population where only sample data can be collected, e.g. brand surveys, quality control checks, clinical trials
• Intuitively provides more understanding of the drivers of the objective function

Case Studies

Case Study: Gender Prediction Using Supervised Learning Algorithms
Machine Learning

Challenge
• The client is a pioneer in measurement of mobile subscriber behavior
• The metering application installed on smart devices accurately captures the behavior of the device
• The client wanted to predict the gender of subscribers based on installed mobile applications
• This information was to be used by advertisers to ensure focused and targeted marketing

Approach
• Initial data provided by the client was a set of user IDs along with the application names
• Data cleansing and transformations were performed so that the data could be fed to a supervised learning algorithm
• The data provided was highly imbalanced and skewed towards males, the dominant class
• Weighted measures were applied to give more importance to the minority class
• A Support Vector Machine learning algorithm was applied to predict the gender of the subscribers

Result
• Achieved accuracy close to 80% for both classes of interest
• Developed an integrated solution with a GUI to deliver real-time results based on real-time data feeds to the learning algorithm

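
The deck does not specify which weighting scheme was applied. One common choice, shown here purely as an assumption, is inverse-frequency class weights, together with per-class accuracy, which is the kind of figure the result above reports. The 8-to-2 label split is made up for the example.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


def per_class_accuracy(y_true, y_pred):
    """Fraction of each class's examples that were predicted correctly."""
    totals, hits = Counter(y_true), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            hits[truth] += 1
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical imbalanced labels: 8 male subscribers, 2 female subscribers.
labels = ["M"] * 8 + ["F"] * 2
weights = balanced_class_weights(labels)  # minority class gets the larger weight

# A model that always predicts the majority class looks fine on overall accuracy...
always_male = ["M"] * 10
overall = sum(t == p for t, p in zip(labels, always_male)) / len(labels)
# ...but per-class accuracy exposes that the minority class is never found.
by_class = per_class_accuracy(labels, always_male)
```

Weights like these are typically passed to the classifier's loss so that errors on the minority class cost more, which is why reporting accuracy for both classes, as the case study does, is more informative than a single overall figure.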
Case Study: Incentivizing existing policies for a leading Insurance Company
Machine Learning & Predictive Analytics

Challenge
• Assess lapsed insurance policies with a potential for repayment (and hence reactivation) within a specific time frame
• Identify criteria to incentivize existing in-force policies

Approach
• The two policy types, Traditional and ULIP, were in two states: In-force and Lapsed
• Data cleansing was done using a proprietary statistical tool
• A binary logistic regression algorithm was applied to each policy type using lapsed and in-force data

Result
• Predictors that influenced the predictive model were:
  o Premium to be paid
  o Income of the policy holder
  o Occupation and total sum assured at the end of maturity
• It was important to target lapsed policies within a specific time frame, beyond which customers would be difficult to reactivate
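
The binary logistic regression step in this case study can be sketched as follows. Everything concrete here is an assumption for illustration: the six policies, the two pre-scaled features (premium and income, picked from the predictors listed above) and the plain gradient-descent fit. The actual work used a proprietary tool and one model per policy type.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=1000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for features, target in zip(X, y):
            error = sigmoid(sum(wi * xi for wi, xi in zip(w, features)) + b) - target
            for i, xi in enumerate(features):
                grad_w[i] += error * xi
            grad_b += error
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, features):
    """Estimated probability that a policy is in the 'lapsed' class."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, features)) + b)


# Hypothetical policies with features scaled to 0..1: (premium, income).
# Label 1 = lapsed, 0 = in-force.
X = [(0.9, 0.2), (0.8, 0.3), (0.7, 0.1), (0.2, 0.8), (0.3, 0.9), (0.1, 0.7)]
y = [1, 1, 1, 0, 0, 0]

model = fit_logistic(X, y)
p_high_premium_low_income = predict_proba(model, (0.85, 0.25))
p_low_premium_high_income = predict_proba(model, (0.15, 0.85))
```

The sign of each fitted weight indicates whether a predictor pushes a policy towards lapsing or staying in force, which is how a model of this kind surfaces influential predictors.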
Case Study: Applying face recognition to enable multiple applications
Challenge
• Design a face detection and recognition algorithm for applications across multiple domains

Approach
• Created a database of faces and performed face detection using the Haar cascades algorithm
• Matched captured face images against the existing database of facial images, using face recognition algorithms based on principal component analysis

Result
• Achieved accuracy close to 60% for face recognition and 70% for face detection
• Can be applied to strengthening security measures in organizations, and to identifying and providing offers to repeat customers in retail stores

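
The recognition step (matching by principal component analysis) can be illustrated with a heavily simplified eigenfaces-style sketch. It covers recognition only, not Haar cascade detection. The 4-pixel "images", the names and the use of a single principal component found by power iteration are all assumptions made to keep the example tiny; real systems use full-size images and many components.

```python
import math


def mean_vector(rows):
    """Per-pixel mean across all images."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def top_component(centered, iterations=50):
    """Power iteration for the leading principal component.

    The covariance matrix is applied implicitly as X^T (X v), so it is never
    built explicitly. The fixed start vector must not be orthogonal to the
    leading component; that holds for the toy data below.
    """
    v = [1.0] + [0.0] * (len(centered[0]) - 1)
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(row, v)) for row in centered]   # X v
        v = [sum(s * row[j] for s, row in zip(scores, centered))            # X^T (X v)
             for j in range(len(v))]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return v


# Hypothetical gallery: one flattened 4-pixel "face" per known person.
gallery = {"alice": [10, 10, 0, 0], "bob": [0, 0, 10, 10], "carol": [5, 5, 5, 5]}

mean = mean_vector(list(gallery.values()))
centered = [[x - m for x, m in zip(img, mean)] for img in gallery.values()]
component = top_component(centered)


def project(image):
    """Coordinate of an image along the leading principal component."""
    return sum((x - m) * c for x, m, c in zip(image, mean, component))


gallery_scores = {name: project(img) for name, img in gallery.items()}


def recognize(image):
    """Return the gallery identity whose projection is closest to the probe's."""
    score = project(image)
    return min(gallery_scores, key=lambda name: abs(gallery_scores[name] - score))


match_a = recognize([9, 8, 1, 2])    # noisy version of alice's image
match_b = recognize([1, 0, 9, 10])   # noisy version of bob's image
```

Projecting onto a few principal components before matching reduces each image to a handful of numbers, which is what makes nearest-neighbour comparison against a face database cheap.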
In Summary

• With big data a reality, machine learning is finding wider acceptance across various industries
• Machine learning is paving the way to solve complex business challenges in an efficient and effective manner
• To reap the benefits of machine learning, it is essential to identify the areas where it can be applied effectively
• Good business understanding is required to build smarter solutions

Blueocean Analytics Service Areas

Customer Analytics
Focus on better customer experience through enhanced engagement
• Customer Acquisition
• Portfolio Management
• Attrition/Churn Analysis
• Loyalty Management
• Customer Contact Analytics
• Customer Risk Analytics
• Others …

Marketing Analytics
Develop and optimize marketing strategies through smart evaluation of programs
• ROMI
• Market Mix Modelling
• Simulated Pricing Models
• Promotion Analytics
• Product Analysis
• Others …

Special Focus Areas
Specialized intelligent solutions that keep pace with socioeconomic trends
• Collections Analytics
• Real Time Analytics
• Social Network Analytics
• Telemetry
• Visual Analytics
• Speech and Text Analytics
• Social Media Analytics
• Others …

Data Management, Big Data and Smart Business Intelligence
Focus on creating a single source of “truth” and providing insightful analysis rather than a plethora of reports
• Datamart Solution
• Reporting and Smart BI Services
• Big Data Services

Thank you
For more information:
Durjoy Patranabish
Senior Vice President
durjoy.p@blueoceanmi.com
Eron Kar
Analytics Delivery Lead
eron.k@blueoceanmi.com
analytics@blueoceanmi.com
