Best Practices In Predictive
Analytics
Keeping things simple…
June 10, 2014
Ajay Gopikrishnan
Lead Architect – Analytics & Bigdata
Business Information Management
Copyright © 2014 Capgemini. All rights reserved.
Best Practices in Predictive Analytics | June 2014
What is Predictive Analytics (PA)?
Definition
Predictive Analytics is the application of statistical techniques and BI technologies to uncover relationships and patterns within large volumes of data that can be used to predict behavior or events of interest.
Source: TDWI.org
Predictive Analytics in action…
- Predictive analytics (PA) is used to address churn
- PA is used to predict the likelihood of response to a mailer
- PA is used to predict the risk of default on a credit card
- PA is used to predict the average time to failure for a particular heavy industrial machine
Predictive Analytics is more forward-looking than regular BI: we use past events to anticipate the future!
Why is PA not widely deployed?
- It is usually complex and calls for a combination of skills
- The value generated is often underrated
- Software is expensive
- It depends on good-quality data
- PA is often taken up as an experiment rather than as a core business function
Best practices in Predictive Analytics
Best Practices
- Process
- Organization
- Data & Technology (structured; unstructured / semi-structured)
Best practices in Predictive Analytics
Impact
Marketing
It is important to measure the RoI of PA projects because organizational resources are committed on the strength of their output:
- Mailers are sent
- Credit card applications are denied or downgraded
A Canadian bank uses PA to increase campaign response rates by 600%, cut customer acquisition costs in half, and boost campaign ROI by 100%. (Source: TDWI.org)
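As a rough illustration of what measuring the RoI of a PA-driven campaign can look like, the sketch below compares an untargeted mailing with a model-targeted one. All figures, the function name `campaign_roi` and the `fixed_cost` parameter (standing in for the cost of the PA project itself) are hypothetical assumptions, not numbers from the TDWI case.

```python
def campaign_roi(mailers_sent, response_rate, revenue_per_response,
                 cost_per_mailer, fixed_cost=0.0):
    """Return (net_profit, roi) for one campaign.

    roi is net profit divided by total cost. fixed_cost is a hypothetical
    placeholder for what the PA project itself costs (software, staff time).
    """
    responses = mailers_sent * response_rate
    revenue = responses * revenue_per_response
    cost = mailers_sent * cost_per_mailer + fixed_cost
    net_profit = revenue - cost
    return net_profit, net_profit / cost


# Hypothetical scenario: mail everyone vs. mail only the model's top prospects.
baseline_profit, baseline_roi = campaign_roi(100_000, 0.01, 120.0, 0.80)
targeted_profit, targeted_roi = campaign_roi(20_000, 0.04, 120.0, 0.80,
                                             fixed_cost=15_000.0)

print(f"baseline: profit={baseline_profit:,.0f}  roi={baseline_roi:.2f}")
print(f"targeted: profit={targeted_profit:,.0f}  roi={targeted_roi:.2f}")
```

Charging the PA project cost against the targeted campaign keeps the comparison honest: the model has to pay for itself before it counts as an improvement.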
Best practices in Predictive Analytics
It is more important to evaluate PA projects using a set of business metrics than using analytical metrics.
Analytical Metrics
- R-square
- Lift
- ROC curve
Business Metrics
- Response rate
- Gross sales
- Net profit
No one gets a raise or a bonus based on R-square or lift! Build business metrics into the analytical plan.
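To make the contrast between an analytical metric and a business metric concrete, here is a minimal sketch that computes lift for the top-scored customers and then the net profit of contacting them. The scores, responses, margin and contact cost are invented for illustration, and the function names are hypothetical.

```python
def top_fraction_lift(scores, responded, fraction):
    """Lift = response rate among the top-scored fraction / overall response rate."""
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)
    cutoff = max(1, int(len(ranked) * fraction))
    top_rate = sum(flag for _, flag in ranked[:cutoff]) / cutoff
    overall_rate = sum(responded) / len(responded)
    return top_rate / overall_rate


def profit_from_top_fraction(scores, responded, fraction,
                             margin_per_response, cost_per_contact):
    """Net profit if only the top-scored fraction of customers is contacted."""
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)
    cutoff = max(1, int(len(ranked) * fraction))
    responses = sum(flag for _, flag in ranked[:cutoff])
    return responses * margin_per_response - cutoff * cost_per_contact


# Hypothetical model scores and observed responses (1 = responded).
scores = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
responded = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

lift = top_fraction_lift(scores, responded, 0.2)
profit_top = profit_from_top_fraction(scores, responded, 0.2, 50.0, 5.0)
profit_all = profit_from_top_fraction(scores, responded, 1.0, 50.0, 5.0)
print(f"lift={lift:.2f}  profit(top 20%)={profit_top:.0f}  profit(all)={profit_all:.0f}")
```

In this toy data the top 20% shows a lift above 3, yet contacting everyone earns slightly more profit (100 versus 90) because contacts are cheap relative to the margin. A good analytical metric does not by itself settle the business decision.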
Best practices in Predictive Analytics
Be prepared to spend a good deal of your project time (sometimes 75%) on data management.
The goal of PA is to isolate, from amongst a large set, the variables that best explain the event or behavior of interest; EDW tables normally cannot be used as they are.
Data jobs
- Merges & joins
- Transformations
- Data quality
Process jobs
- Exploratory analysis using central tendency measures
- Outlier detection
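The exploratory steps listed above can be sketched with nothing more than Python's standard `statistics` module. The sample values are hypothetical, and the 1.5 x IQR rule used here is one common outlier convention among several, not a method prescribed by the deck.

```python
import statistics


def summarize(values):
    """Central tendency and spread for a single numeric variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical monthly spend per customer; one record looks like a data error.
monthly_spend = [120, 135, 128, 140, 132, 125, 138, 130, 2500]

print(summarize(monthly_spend))
print("outliers:", iqr_outliers(monthly_spend))
```

The single extreme record pulls the mean far above the median, which is the kind of signal that prompts a data quality check before any modelling starts.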
Best practices in Predictive Analytics
You are trying to model human behaviour, so do not expect a silver bullet; expect incremental improvements in the current level of organizational performance.
PA models have a learning curve, so be prepared to stay invested over time to improve performance and reap the benefits.
Best practices in Predictive Analytics
Complex ≠ Better
A neural network is not necessarily better than linear regression if the basic assumptions of linear regression are met in the given business problem.
Better is usually measured in terms of business metrics.
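One way to test whether a more complex model earns its keep is to score both candidates on held-out data. In the sketch below, the data is synthetic and truly linear with noise, and a 1-nearest-neighbour model stands in for the "complex" option (it is an assumption chosen for brevity, not a neural network). Mean absolute error is used as the yardstick; in a real project a business metric would take its place.

```python
def fit_line(points):
    """Ordinary least squares for one predictor; returns a prediction function."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_nearest_neighbour(points):
    """Flexible model: predict the y of the closest training x (memorizes noise)."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


def mean_abs_error(model, points):
    return sum(abs(model(x) - y) for x, y in points) / len(points)


# Synthetic data: y = 2x + 1, with alternating +1 / -1 noise in the training set.
train = [(x, 2 * x + 1 + (1 if x % 2 == 0 else -1)) for x in range(10)]
holdout = [(x + 0.25, 2 * (x + 0.25) + 1) for x in range(9)]

linear_mae = mean_abs_error(fit_line(train), holdout)
neighbour_mae = mean_abs_error(fit_nearest_neighbour(train), holdout)
print(f"linear MAE={linear_mae:.3f}  nearest-neighbour MAE={neighbour_mae:.3f}")
```

Because the underlying relationship really is linear, the simpler model generalizes better here: the flexible model reproduces the training noise on the holdout set.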
Best practices in Predictive Analytics
PA projects can be outsourced to an expert agency after suitable due diligence, based on resource availability and a comparative cost-benefit analysis against in-sourcing.
Outsourcing models
- Leading organizations are known to set up captive analytics service centers in remote locations where skills are available
- BPOs/KPOs are known to undertake process elements in a PA project
Best practices in Predictive Analytics
There is a substantial shortage of data science skills in the market; companies are tying up with academic institutions for PA programs.
PA project teams combine multiple skills; the most successful teams branch out as “Information Management”, a bridge between business and IT.
PA teams
- Business analyst
- Quantitative expert
- Tools expert
Best practices in Predictive Analytics
PA projects require executive sponsorship; it is better to adopt a top-down approach that originates from the business, even if the projects are small to begin with.
Do not let PA be reduced to a research effort by a PA enthusiast within a line function.
Best practices in Predictive Analytics
PA needs to exploit the Enterprise Data Warehouse and enabling technologies such as in-database and in-memory processing for scale, faster throughput and a more comprehensive approach (think enterprise PA!).
The cost of deployment is high; hence the need to measure the RoI of PA projects.
Example technology
- SAS + Teradata
- SPSS on Netezza
- Analytical sandboxes
Best practices in Predictive Analytics
PA is a combination of both art and science!
There is no best software for Predictive Analytics; a tool's major contribution is efficiency. It is up to the user to design the project, define KPIs, evaluate candidate models and choose the model most appropriate for the problem.
PA + Big Data + Cloud
- PA applications are deployable on the cloud
- PA vendors are now building compatibility with Hadoop and related technology to handle unconventional data types
The PA Maturity Curve
PA can create business value for the organization
Ajay Gopikrishnan
Lead Architect – Big Data & Analytics
ajay.gopikrishnan@capgemini.com
Thank you!
Reference: TDWI 2010 paper by Thomas Rathburn - 10 mistakes to avoid in predictive analytics
