Transcript

  • 1. Adventures in Segmentation: Using Applied Data Mining to Add Business Value
    Drew Minkin
  • 2. The Value Add of Data Mining
    Segmentation 101
    Segmentation Tools in Analysis Services
    Methodology for Segmentation Analysis
    Building Confidence in your Model
    2
    Agenda
  • 3. 3
    The Value Add of Data Mining
  • 4. Statistics for the Computer Age
    Evolution, not revolution with traditional statistics
    Statistics enriched with brute-force capabilities of modern computing
    Associated with industrial-sized data sets
    4
    Value Add - What is Data Mining?
  • 5. 5
    Value Add - Data Mining in the BI Spectrum
    Chart: Reports (Static), Reports (Ad hoc), OLAP, and Data Mining arranged along the SQL-Server 2008 BI spectrum, with relative business value rising as difficulty moves from easy to difficult, all resting on business knowledge.
  • 6. VoterVault
    From the mid-1990s
    Massive get-out-the-vote drive for those expected to vote Republican
    Demzilla
    Names typically have 200 to 400 information items
    6
    Value Add – Data Mining and Democracy
  • 7. “The quiet statisticians have changed our world; not by discovering new facts or technical developments, but by changing the ways that we reason, experiment and form our opinions.”
    -- Ian Hacking
    Value Add – The Promise of Data Mining
    7
  • 8. 8
    Value Add – Spheres of Influence
  • 9. Value Add – Operational Benefits
    Improved efficiency
    Inventory management
    Risk management
  • 10. Value Add – Strategic Benefits
    The Bottom Line
    Increased agility
    Brand building
    Differentiate message
    “Relationship” building
  • 11. Value Add – Tactical Benefits
    Reduction of costs
    Transactional leakage
    Outlier analysis
  • 12. Identify a group of customers who are expected to attrite
    Conduct marketing campaigns to change their behavior in the desired direction and reduce the attrition rate
    Value Add - Customer Attrition Analysis
  • 13. Slow attriters: Customers who slowly pay down their outstanding balance until they become inactive.
    Fast attriters: Customers who quickly pay down their balance and either lapse it or close it by phone call or in writing.
    Value Add - Target Result
  • 14. Credit models
    Retention models
    Elasticity models
    Cross-sell models
    Lifetime Value models
    Agent/agency monitoring
    Target marketing
    Fraud detection
    Value Add - Sample Applications
    14
  • 15. 15
    Segmentation 101
  • 16. Unsupervised learning
    Associations and patterns among many entities; no target information
    Market basket analysis (“diapers and beer”)
    Supervised learning
    Predict the value of a target variable from well-defined predictive variables
    Credit / non-credit scoring engines
    16
    Segmentation – Machine Learning
  • 17. Segmentation – Sample Data Sources
    Data Warehouse: Credit card data warehouse containing about 200 product-specific fields
    Third Party Data: A set of account-related demographic and credit bureau information
    Segmentation Files: Account-related segmentation values based on our client's segmentation scheme, which combines risk, profitability, and external potential
    Payment Database: Database that stores all checks processed and can categorize the source of checks
  • 18. 18
    Methodology for Segmentation Analysis
  • 19. 19
    Methodology–Distribution of Effort
  • 20. 20
    Methodology – Segmentation Lifecycle
  • 21. Research/Evaluate possible data sources
    Availability
    Hit rate
    Implementability
    Cost-effectiveness
    Extract/purchase data
    Check data for quality (QA)
    At this stage, data is still in a “raw” form
    Often start with voluminous transactional data
    Much of the data mining process is “messy”
    Methodology – Acquiring Raw Data
    21
  • 22. Reflects data changes over time.
    Recognizes and removes statistically insignificant fields
    Defines and introduces the "target" field
    Allows for second stage preprocessing and statistical analysis.
    Methodology – Goals of Refinement
  • 23. Scoring engine
    Formula that classifies or separates policies (or risks, accounts, agents…) into
    profitable vs. unprofitable
    Retaining vs. non-retaining…
    (Non-)Linear equation f() of several predictive variables
    Produces continuous range of scores
    score = f(X1, X2, …, XN)
    Methodology - Scoring Engines
    23
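A minimal sketch of such a scoring engine in Python (the variable names and coefficients are invented for illustration): a logistic function of a linear combination of predictive variables that yields a continuous score.

```python
import math

# Hypothetical coefficients, as if fitted during modeling (illustrative values only).
WEIGHTS = {"age": -0.02, "balance": 0.0004, "tenure_years": -0.15}
INTERCEPT = 1.2

def score(case: dict) -> float:
    """score = f(X1, X2, ..., XN): logistic transform of a linear combination."""
    z = INTERCEPT + sum(w * case[name] for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))  # continuous score in (0, 1)

# e.g. score a single policy/account
print(score({"age": 45, "balance": 3200.0, "tenure_years": 2}))
```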
  • 24. Methodology – Deployed Model
    Diagram: training data is used to build the mining model; new entries and transactions (“just one row”) arriving from DB data, client data, and application logs are scored by the DM engine against that model to produce predicted data.
  • 25. Randomly divide data into 3 pieces
    Training data
    Test data
    Validation data
    Use Training data to fit models
    Score the Test data to create a lift curve
    Perform the train/test steps iteratively until you have a model you’re happy with
    During this iterative phase, validation data is set aside in a “lock box”
    Score the Validation data and produce a lift curve
    Unbiased estimate of future performance
    Methodology - Testing
    25
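A minimal sketch of that three-way split in Python (the proportions and seed are arbitrary):

```python
import random

def three_way_split(rows, train=0.6, test=0.2, seed=42):
    """Randomly divide the data into training, test, and validation pieces."""
    rows = rows[:]                          # copy so the caller's list is untouched
    random.Random(seed).shuffle(rows)
    n_train = int(len(rows) * train)
    n_test = int(len(rows) * test)
    training = rows[:n_train]
    testing = rows[n_train:n_train + n_test]
    validation = rows[n_train + n_test:]    # stays in the "lock box" until the end
    return training, testing, validation

training, testing, validation = three_way_split(list(range(1000)))
# Fit candidate models on `training`, score `testing` for lift curves, and score
# `validation` only once for an unbiased estimate of future performance.
```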
  • 26. Examine correlations among the variables
    Weed out redundant, weak, poorly distributed variables
    Model design
    Build candidate models
    Regression/GLM
    Decision Trees/MARS
    Neural Networks
    Select final model
    26
    Methodology - Multivariate Analysis
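One way the correlation check might be operationalized, sketched with pandas (an assumption; the deck itself works in Analysis Services). Column names are invented:

```python
import pandas as pd

def redundant_pairs(df: pd.DataFrame, threshold: float = 0.9):
    """Return variable pairs whose absolute correlation exceeds the threshold."""
    corr = df.corr(numeric_only=True).abs()
    cols = corr.columns
    return [(cols[i], cols[j], corr.iloc[i, j])
            for i in range(len(cols))
            for j in range(i + 1, len(cols))
            if corr.iloc[i, j] > threshold]

# Toy example: "balance" and "limit" are perfectly correlated, so one can be weeded out.
df = pd.DataFrame({"balance": [1, 2, 3, 4], "limit": [2, 4, 6, 8], "age": [30, 41, 25, 52]})
print(redundant_pairs(df))
```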
  • 27. 27
    Segmentation Tools in Analysis Services
  • 28. Data Mining - Algorithm Matrix
    Tasks: Segmentation, Advanced Data Exploration, Classification, Forecasting, Association, Text Analysis, Estimation
    Algorithms: Association Rules, Clustering, Decision Trees, Linear Regression, Logistic Regression, Naïve Bayes, Neural Nets, Sequence Clustering, Time Series
    The slide's matrix maps each algorithm to the tasks it supports.
  • 29. 29
    Data Mining - SQL-Server Algorithms
    Decision Trees
    Time Series
    Neural Net
    Clustering
    Sequence Clustering
    Association
    Naïve Bayes
    Linear and Logistic Regression
  • 30. Offline and online modes
    Online: everything you do happens directly on the server
    Offline: requires server admin privileges to deploy
    • Define Data Sources and Data Source Views
    • 31. Define Mining Structure and Models
    • 32. Train (process) the Structures
    • 33. Verify accuracy
    • 34. Explore and visualise
    • 35. Perform predictions
    • 36. Deploy for other users
    • 37. Regularly update and re-validate the Model
    Data Mining - Blueprint for Toolset
  • 38. Data Mining - Cross-Validation
    SQL Server 2008
    X iterations of retraining and retesting the model
    Results from each test statistically collated
    Model deemed accurate (and perhaps reliable) when variance is low and results meet expectations
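The same idea in miniature, independent of SQL Server (Python; the fold count and the model-fitting callback are placeholders):

```python
import statistics

def cross_validate(rows, labels, train_and_score, folds=10):
    """X iterations of retrain/retest: hold out one fold at a time and collect accuracies."""
    scores = []
    for k in range(folds):
        test_idx = set(range(k, len(rows), folds))   # every folds-th row goes to the test piece
        train = [(r, y) for i, (r, y) in enumerate(zip(rows, labels)) if i not in test_idx]
        test = [(r, y) for i, (r, y) in enumerate(zip(rows, labels)) if i in test_idx]
        scores.append(train_and_score(train, test))  # caller fits a model and returns its accuracy
    # Low variance plus acceptable mean accuracy is the sign of a stable, usable model.
    return statistics.mean(scores), statistics.pstdev(scores)
```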
  • 39. Data Mining - Microsoft Decision Trees
    Use for:
    Classification: churn and risk analysis
    Regression: predict profit or income
    Association analysis based on multiple predictable variables
    Builds one tree for each predictable attribute
    Fast
  • 40. COMPLEXITY_PENALTY
    FORCE_REGRESSOR
    MAXIMUM_INPUT_ATTRIBUTES
    MAXIMUM_OUTPUT_ATTRIBUTES
    MINIMUM_SUPPORT
    SCORE_METHOD
    SPLIT_METHOD
    Data Mining - Decision Tree Parameters
  • 41. Data Mining - Microsoft Naïve Bayes
    Use for:
    Classification
    Association with multiple predictable attributes
    Assumes all inputs are independent
    Simple classification technique based on conditional probability
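A toy illustration of the conditional-probability idea (Python; the counts are invented): each input is assumed independent given the class, so the per-attribute probabilities simply multiply.

```python
# Hypothetical conditional probabilities P(attribute value | class), estimated from training data.
p_given_class = {
    "churn":    {"tenure=short": 0.70, "complaints=yes": 0.60},
    "no_churn": {"tenure=short": 0.20, "complaints=yes": 0.10},
}
prior = {"churn": 0.15, "no_churn": 0.85}

def naive_bayes(evidence):
    """Score each class as prior * product of per-attribute conditional probabilities."""
    scores = {}
    for cls in prior:
        p = prior[cls]
        for attr in evidence:
            p *= p_given_class[cls][attr]   # the "naïve" independence assumption
        scores[cls] = p
    total = sum(scores.values())
    return {cls: p / total for cls, p in scores.items()}

print(naive_bayes(["tenure=short", "complaints=yes"]))  # churn comes out far more likely
```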
  • 42. MAXIMUM_INPUT_ATTRIBUTES
    MAXIMUM_OUTPUT_ATTRIBUTES
    MAXIMUM_STATES
    MINIMUM_DEPENDENCY_PROBABILITY
    Data Mining - Naïve Bayes Parameters
  • 43. Data Mining - Clustering
    Applied to
    Segmentation: Customer grouping, Mailing campaign
    Also: classification and regression
    Anomaly detection
    Discrete and continuous
    Note:
    “Predict Only” attributes not used for clustering
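For comparison outside Analysis Services, the same customer-grouping idea in a few lines of Python. This uses scikit-learn's k-means as a stand-in, not the Microsoft Clustering algorithm, and the features are invented:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical customer features: [age, annual_spend, visits_per_month]
customers = np.array([
    [25,  800, 2], [31, 1200, 3], [45, 9500, 1],
    [52, 8700, 1], [23,  600, 8], [29,  700, 9],
])

# Group customers into segments, e.g. for a mailing campaign.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(customers)
print(kmeans.labels_)           # segment assignment per customer
print(kmeans.cluster_centers_)  # the "profile" of each segment
```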
  • 44. CLUSTER_COUNT
    CLUSTER_SEED
    CLUSTERING_METHOD
    MAXIMUM_INPUT_ATTRIBUTES
    MAXIMUM_STATES
    MINIMUM_SUPPORT
    MODELLING_CARDINALITY
    SAMPLE_SIZE
    STOPPING_TOLERANCE
    Data Mining - Clustering Parameters
  • 45. Data Mining - Neural Network
    Applied to
    Classification
    Regression
    Great for finding complicated relationships among attributes
    Difficult to interpret results
    Gradient Descent method
    Diagram: input layer (Age, Education, Sex, Income) → hidden layers → output layer (Loyalty)
  • 46. HIDDEN_NODE_RATIO
    HOLDOUT_PERCENTAGE
    HOLDOUT_SEED
    MAXIMUM_INPUT_ATTRIBUTES
    MAXIMUM_OUTPUT_ATTRIBUTES
    MAXIMUM_STATES
    SAMPLE_SIZE
    Data Mining - Neural Network Parameters
  • 47. Data Mining - Sequence Clustering
    Analysis of:
    Customer behaviour
    Transaction patterns
    Click stream
    Customer segmentation
    Sequence prediction
    Mix of clustering and sequence technologies
    Groups individuals based on their profiles including sequence data
  • 48. To discover the most likely beginning, paths, and ends of a customer’s journey through our domain, consider using:
    Association Rules
    Sequence Clustering
    Data Mining - What is a Sequence?
  • 49. Data Mining - Sequence Data
  • 50. Your “if” statement will test the value returned from a prediction – typically, predicted probability or outcome
    Steps:
    Build a case (set of attributes) representing the transaction you are processing at the moment
    E.g. Shopping basket of a customer plus their shipping info
    Execute a “SELECT ... PREDICTION JOIN” on the pre-loaded mining model
    Read returned attributes, especially the case probability for some outcome
    E.g. Probability > 50% that “TransactionOutcome=ShippingDeliveryFailure”
    Your application has just made an intelligent decision!
    Remember to refresh and retest the model regularly – daily?
    Data Mining – Minor Introduction to DMX
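A sketch of what such a call might look like. The DMX query shape (SELECT ... PREDICTION JOIN) is standard, but the model name, column names, and the execute_dmx helper are hypothetical placeholders; in practice the query would be sent through an ADOMD or similar client:

```python
# Hypothetical case: a shopping basket plus shipping info, assembled by the application.
dmx = """
SELECT
  Predict([Transaction Outcome]),
  PredictProbability([Transaction Outcome], 'ShippingDeliveryFailure')
FROM [TransactionOutcomeModel]
NATURAL PREDICTION JOIN
(SELECT 'Express' AS [Shipping Method],
        3         AS [Item Count],
        'NV'      AS [Ship To State]) AS t
"""

def execute_dmx(query):
    """Placeholder: send the DMX query to Analysis Services and return the result rows."""
    raise NotImplementedError("wire this up to your ADOMD client")

# outcome, probability = execute_dmx(dmx)[0]
# if probability > 0.5:
#     ...   # the application has just made an intelligent decision
```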
  • 51. CLUSTER_COUNT
    MAXIMUM_SEQUENCE_STATES
    MAXIMUM_STATES
    MINIMUM_SUPPORT
    Data Mining- Sequence Clustering Parameters
  • 52. 45
    Data Mining – Detailed Workflow
  • 53. 46
    Data Mining – Detailed Mining Model
  • 54. 47
    Data Mining – Detailed Mining Model
  • 55. 48
    Data Mining – Detailed Mining Model
  • 56. 49
    Building Confidence in your Segmentation
  • 57. Which target variable to use?
    Frequency & severity
    Loss Ratio, other profitability measures
    Binary targets: defection, cross-sell
    …etc
    How to prepare the target variable?
    Period - 1-year or Multi-year?
    Losses evaluated as of what date?
    Cap large losses?
    Catastrophe (cat) losses?
    How / whether to re-rate, adjust premium?
    What counts as a “retaining” policy?
    …etc
    Building Confidence - Model Design
    50
  • 58. Approaches
    Change the algorithm
    Change model parameters
    Change inputs/outputs to avoid bad correlations
    Clean the data set
    Perhaps there are no good patterns in the data
    Verify statistics (Data Explorer)
    Building Confidence - Improving Models
  • 59. Capping
    Outliers are reduced in influence to produce better estimates.
    Binning
    Small and insignificant levels of character variables are regrouped.
    Box-Cox Transformations
    These transformations are commonly included, especially the square root and logarithm.
    Johnson Transformations
    Performed on numeric variables to make them more ‘normal’.
    Weight of Evidence
    Created for character variables and binned numeric variables.
    52
    Building Confidence – Alternate Methods
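A condensed illustration of capping, a log transformation, and binning in Python/pandas (column names and cut points are invented):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"loss": [120.0, 950.0, 14_000.0, 310.0, 2_500_000.0]})

# Capping: pull extreme outliers back to a ceiling so they influence estimates less.
df["loss_capped"] = df["loss"].clip(upper=100_000)

# Log transformation (one of the Box-Cox family) to make the variable more "normal".
df["loss_log"] = np.log1p(df["loss_capped"])

# Binning: regroup a continuous variable into a few coarse levels.
df["loss_band"] = pd.cut(df["loss_capped"], bins=[0, 1_000, 10_000, 100_000],
                         labels=["small", "medium", "large"])
print(df)
```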
  • 60. 53
    Building Confidence - Confusion Matrix
    1241 correct predictions (516 + 725).
    35 incorrect predictions (25 + 10).
    The model scored 1276 cases (1241+35).
    The error rate is 35/1276 = 0.0274.
    The accuracy rate is 1241/1276 = 0.9725.
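The arithmetic spelled out in Python, using the four cell counts from the slide (which class each cell belongs to is not stated, so the names are generic):

```python
correct_class_a, correct_class_b = 516, 725   # on-diagonal (correct) predictions
wrong_class_a, wrong_class_b = 25, 10         # off-diagonal (incorrect) predictions

correct = correct_class_a + correct_class_b   # 1241
incorrect = wrong_class_a + wrong_class_b     # 35
total = correct + incorrect                   # 1276 scored cases

print("error rate:", incorrect / total)       # ≈ 0.0274
print("accuracy  :", correct / total)         # ≈ 0.9725
```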
  • 61. “All models are wrong, but some are useful.”
    -- George Box
    54
    Building Confidence – Warning Signs
  • 62. Extrapolation
    Applying models from unrelated disciplines
    Equality
    The real world contains a surprising amount of uncertainty, fuzziness, and precariousness.
    Copula
    Binding probabilities can mask errors
    Distribution functions
    Small miscalculations can make coincidences look like certainties
    Gamma
    Human behavior difficult to quantify as a linear parameter
    55
    Building Confidence –Li’s Revenge
  • 63. 56
    Building Confidence - Lift Curves
    Sort data by score
    Break the dataset into 10 equal pieces
    Best “decile”: lowest score → lowest LR
    Worst “decile”: highest score → highest LR
    Difference: “Lift”
    Lift = segmentation power
    Lift translates into ROI of the modeling project
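A sketch of the decile construction (Python; records is assumed to be a list of (score, loss_ratio) pairs with at least ten rows):

```python
def decile_lift(records):
    """Sort by score, break into 10 equal pieces, and compare the best and worst deciles."""
    ranked = sorted(records, key=lambda r: r[0])        # sort data by score
    size = len(ranked) // 10
    deciles = [ranked[i * size:(i + 1) * size] for i in range(10)]
    avg_lr = [sum(lr for _, lr in d) / len(d) for d in deciles]
    best, worst = avg_lr[0], avg_lr[-1]                 # lowest-score vs. highest-score decile
    return worst - best                                 # the difference is the "lift"

# Larger lift means stronger segmentation power, which is what translates into modeling ROI.
```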
  • 64. Building Confidence – Vetted Results
    ~Top 5% of 750000
    2 groups with 10000 customers from random sampling
    37500 top customers from the prediction list sorted by the score
    Group 1
    Engaged or offered incentives by marketing department
    Group 2
    No action
    Results
    Group 1 has an attrition rate of 0.8%
    Group 2 has 10.6%
    Average attrition rate is 2.2%; lift is 4.8 (10.6% / 2.2%)
  • 65. 58
    Discussion
  • 66. Xiaohua Hu, Drexel University
    Jerome Friedman, Trevor Hastie, Robert Tibshirani, The Elements of Statistical Learning
    James Guszcza, Deloitte
    Jeff Kaplan, Apollo Data Technologies
    Rafal Lukawiecki, Project Botticelli Ltd
    David L. Olson, University of Nebraska Lincoln
    Donald Farmer, ZhaoHui Tang and Jamie MacLennan, Microsoft
    ASA Corporation
    Richard Boire, Boire Filler Group
    John Spooner, SAS Corporation
    Shin-Yuan Hung, Hsiu-Yu Wang, National Chung-Cheng University
    Felix Salmon and Chris Anderson, Wired Magazine
    59
    Credits
