
    1. 1. Open World 2003
    2. 2. Data Warehousing for the Communications Industry: A Data Mining Approach to Customer Churn Analysis in the Wireless Industry. Shyam Varan Nath, Senior Database Engineer, Daleen Technologies. Session id: 40332
    3. 3. Introduction <ul><li>Oracle Data Mining </li></ul><ul><ul><li>JDeveloper </li></ul></ul><ul><ul><li>DM4J </li></ul></ul><ul><li>Wireless Industry and Customer Churn </li></ul><ul><li>Data Modeling for Churn Management </li></ul>
    4. 4. “WLNP Threatens to Significantly Impact Wireless Churn Rates.” Source: In-Stat 2002
    5. 5. Churn: North American wireless industry monthly churn rate in Q4-02. [Chart: Monthly Churn (%) - 4Q-02; series in order: Canadian Average, U.S. Average; values in order: 2.4%, 2.8%] <ul><li>Source: Company & analyst reports </li></ul>
    6. 6. Wireless Industry: Some Facts <ul><li>Wireless Local Number Portability (WLNP) from Nov 2003 </li></ul><ul><li>Average Cost to Acquire a New Wireless Customer: $400 to $500 </li></ul><ul><li>Data Mining as a Solution to the Business Problem </li></ul>
    7. 7. …facts Source: Duke Teradata 2002
    8. 8. …facts
    9. 9. Reasons for Churn <ul><li>Many companies to choose from </li></ul><ul><li>Similarity of their offerings </li></ul><ul><li>Low handset prices </li></ul><ul><li>The biggest current barrier to churn: the lack of phone number portability! </li></ul>
    10. 10. A Dilemma <ul><li>Cross-Selling Through Database Marketing </li></ul><ul><ul><li>cross-selling is effective for customer retention by increasing switching costs and enhancing customer loyalty </li></ul></ul><ul><ul><li>on the other hand, cross-selling can also potentially weaken the firm’s relationship with the customer, because frequent attempts to cross-sell can render the customer non-responsive or even motivated to switch to a competitor </li></ul></ul>
    11. 11. Role of Data Mining: Business Issues in the Wireless Industry
    12. 12. Some Definitions <ul><li>Data Warehousing: A data warehouse is a database or a collection of databases designed to give business decision-makers instant access to information </li></ul><ul><li>Data Mining: Data mining is the process of using raw data to infer important business relationships that can then be used for business advantage </li></ul>
    13. 13. “Simply put, data mining is used to discover [hidden] patterns and relationships in your data in order to help you make better business decisions.” Source: Oracle9i Data Mining 2001
    14. 14. Choice of Tools
    15. 15. Justification for Data Mining <ul><li>Reporting Tools : Good at drilldowns into the details </li></ul><ul><li>OLAP/Statistical Tools : Used to draw conclusions from representative samples </li></ul><ul><li>Data Mining: Goes deep into the data. It uses machine-learning algorithms to automatically sift through each record and variable to uncover patterns and information that may have been hidden. </li></ul>
    16. 16. Predictive Modeling Visual Representation of Predictive Modeling
    17. 17. Benefits Of Data Warehousing And Predictive Modeling <ul><li>Immediate Information Delivery </li></ul><ul><li>Data Integration from across—and even outside—the Organization </li></ul><ul><li>Future Vision from Historical Trends </li></ul><ul><li>Tools for Looking at Data in New Ways </li></ul>
    18. 18. What is ODM? Connected to: Oracle9i Enterprise Edition Release - Production With the Partitioning, OLAP and Oracle Data Mining options JServer Release - Production SQL> Oracle9i Data Mining is an option to Oracle9i Enterprise Edition that allows users to build advanced business intelligence applications that mine corporate databases to discover new insights and integrate those insights into business applications.
    19. 19. Why Oracle? Integrated Environment of Oracle Relational Database
    20. 20. Supervised vs. Unsupervised Learning <ul><li>Supervised learning requires identification of a target field or dependent variable. The supervised-learning technique then sifts through the data, trying to find patterns and relationships between the independent variables and the dependent variable. (ODM provides the Naïve Bayes data mining algorithm for supervised-learning problems.) </li></ul><ul><li>Unsupervised learning does not require the user to specify an objective for the data mining algorithm. Association and clustering algorithms make no assumptions about the target field. Instead, they try to find associations and clusters in the data independent of any a priori defined business objective, as in market-basket analysis. (ODM provides the Association Rules data mining algorithm for unsupervised-learning problems.) </li></ul>
    21. 21. Naive Bayes algorithm <ul><li>The Naive Bayes algorithm uses the mathematics of Bayes' Theorem to make its predictions. The algorithm is typically used for: </li></ul><ul><ul><li>Identifying which customers are likely to purchase a certain product </li></ul></ul><ul><ul><li>Identifying customers who are likely to churn </li></ul></ul><ul><ul><li>Predicting the likelihood that a part will be defective </li></ul></ul><ul><li>Adaptive Bayes Network </li></ul><ul><ul><li>Human-readable rules </li></ul></ul>IF RELATIONSHIP = "Husband" AND EDUCATION_NUM = "13-16" THEN CHURN = "TRUE"
    22. 22. Bayes Theorem
        According to the Bayesian rule, the probability of an example E with attribute values $a_1, a_2, \ldots, a_n$ being in class $c$ is:
        $$P(C = c \mid a_1, a_2, \ldots, a_n) = \frac{p(a_1, a_2, \ldots, a_n \mid C = c)\, p(C = c)}{p(a_1, a_2, \ldots, a_n)}$$
        The classification is taken as the value of $C$ with the largest probability.
        Assume all attributes are independent given the class:
        $$p(a_1, a_2, \ldots, a_n \mid c) = p(a_1 \mid c)\, p(a_2 \mid c) \cdots p(a_n \mid c)$$
        The resulting Bayesian classifier is called the Naïve Bayesian classifier.
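The rule above can be illustrated with a minimal sketch of a categorical Naive Bayes classifier. This is not ODM's implementation or the Java code DM4J generates. The toy rows, the function names, and the add-one (Laplace) smoothing are assumptions made only for illustration; the attribute names are borrowed from the sample rule on the previous slide.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen

def posterior(model, row):
    """Return P(C = c | a_1, ..., a_n) for every class c."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total  # prior p(C = c)
        for i, value in enumerate(row):
            # p(a_i | c) with add-one smoothing (an assumption, not from the slides)
            score *= (value_counts[(i, c)][value] + 1) / (n_c + len(values_seen[i]))
        scores[c] = score
    evidence = sum(scores.values())  # p(a_1, ..., a_n), the denominator
    return {c: s / evidence for c, s in scores.items()}

def classify(model, row):
    """Pick the class with the largest posterior probability."""
    probs = posterior(model, row)
    return max(probs, key=probs.get)

# Hypothetical toy data: (RELATIONSHIP, EDUCATION_NUM) -> CHURN
rows = [("Husband", "13-16"), ("Husband", "13-16"), ("Wife", "9-12"),
        ("Wife", "13-16"), ("Husband", "9-12"), ("Wife", "9-12")]
labels = ["TRUE", "TRUE", "FALSE", "FALSE", "TRUE", "FALSE"]

model = train_naive_bayes(rows, labels)
probs = posterior(model, ("Husband", "13-16"))
prediction = classify(model, ("Husband", "13-16"))
```

Dividing by the sum of the per-class scores plays the role of the denominator in Bayes' rule, so the returned probabilities sum to one.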
    23. 23. Major Steps Of Data Mining <ul><li>Build Model : Models are built in the data-mining server </li></ul><ul><li>Test Model : Model testing gives an estimate of model accuracy </li></ul><ul><li>Compute Lift : ODM supports computing lift for a binary classification model (confidence of prediction) </li></ul><ul><li>Apply Model : Applying a supervised learning model to data results in scores or predictions with an associated probability </li></ul>
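The lift step can be sketched with the common textbook definition: rank records by predicted churn probability, then compare the churn rate in the top fraction with the overall churn rate. This is a generic illustration, not ODM's own lift computation; the function name and the sample scores are hypothetical.

```python
def cumulative_lift(scores, actual, fraction):
    """Lift of the top `fraction` of records ranked by predicted churn probability."""
    ranked = sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    top_rate = sum(a for _, a in ranked[:top_n]) / top_n  # churn rate among top-ranked
    overall_rate = sum(actual) / len(actual)              # churn rate in all records
    return top_rate / overall_rate

# Hypothetical predicted probabilities and actual outcomes (1 = churned)
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
actual = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

lift_top_20 = cumulative_lift(scores, actual, 0.2)
lift_top_50 = cumulative_lift(scores, actual, 0.5)
lift_all = cumulative_lift(scores, actual, 1.0)
```

A lift above 1 for a small top fraction means the model concentrates churners at the head of the ranking; over the whole data set the lift is 1 by construction.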
    24. 24. Build Model
    25. 25. Apply Process
    26. 26. Data For Modeling: Nature of Dataset Used for Study (real Wireless Customer Data)
        | | Calibration | Current Score Data | Future Score Data |
        | Sample Size | 100,000 | 51,306 | 100,462 |
        | # of Predictor Variables | 171 | 171 | 171 |
        | Churn Indicator | Yes | No | No |
        | Customer ID | 1,000,001 – 1,100,000 | 2,000,001 – 2,051,306 | 3,000,001 – 3,100,462 |
    27. 27. System Setup <ul><li>Database </li></ul><ul><li>Java Environment </li></ul><ul><li>Data Mining Wizard </li></ul>
    28. 28. Database: Oracle Installation of Oracle Database Software with Oracle Data Mining Option, with the database patch for version .
    29. 29. Java Environment: JDeveloper Installation of JDeveloper 9.0.3
    30. 30. Data Mining Wizard: DM4J
    31. 31. Question
    32. 32. Getting Started… <ul><li>Unlock the odm user </li></ul><ul><li>Grant privileges on the tables so that the wizard can display them </li></ul><ul><li>Odm_mtr schema </li></ul>
    33. 33. Working with the DM4J Wizard: Creating a new Workspace; Configuring a Database Connection
    34. 34. …DM4J Selecting a model type in the DM4J wizard.
    35. 35. Algorithm for Data Modeling: Selecting the Algorithm; Fine-tuning the algorithm
    36. 36. …DM4J The DM4J wizard generates the Java code that is compiled and executed to create the model.
    37. 37. …DM4J Here is the Java Code!
    38. 38. Our Study The input data was stored in a table called CALIBRATION. Our target variable for prediction is CHURN.
    39. 39. …study We pick all the input predictor variables (except Customer ID) from the list of 171 to predict churn.
    40. 40. …study Compilation and execution of the Java code containing the ODM model. The program runs asynchronously, and we can monitor the progress of the task. The screenshot shows the successful completion of the model build.
    41. 41. …study The Adaptive Bayes Network also generates the rules for the model in human readable form.
    42. 42. …study Testing the Model using the data from table PRESENT: Confusion Matrix; Cumulative Lift Chart
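A confusion matrix for a binary churn label can be sketched as a count of (actual, predicted) pairs. This is a generic illustration under assumed data, not the output of the ODM test task; the function names and the sample outcomes are hypothetical.

```python
from collections import Counter

def confusion_matrix(predicted, actual):
    """Count (actual, predicted) pairs for a binary churn label."""
    return Counter(zip(actual, predicted))

def accuracy(matrix):
    """Share of records where the prediction matches the actual outcome."""
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    return correct / sum(matrix.values())

# Hypothetical outcomes: True means the customer churned
actual = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]

matrix = confusion_matrix(predicted, actual)
model_accuracy = accuracy(matrix)
```

The four cells are the counts of churners caught, churners missed, false alarms, and correctly predicted stayers.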
    43. 43. …study The last step is to apply the tested model to the data set for which we want to predict CHURN.
    44. 44. …study When we apply the model, the predictions are obtained and stored in an output table. After the Apply task is run
    45. 45. …study Rating the importance of the various predictor variables.
    46. 46. Top Ten Variables <ul><li>DUALBAND type of phone set </li></ul><ul><li>CARTYPE dominant vehicle lifestyle </li></ul><ul><li>EDUC1 education level of first household member </li></ul><ul><li>ETHNIC ethnicity </li></ul><ul><li>TOT_ACPT total offers accepted from retention team </li></ul><ul><li>OCCU1 occupation of the first household member </li></ul><ul><li>AREA geographic area </li></ul><ul><li>INCOME estimated household income </li></ul><ul><li>DWLLSIZE dwelling size </li></ul><ul><li>PROPTYPE property type details </li></ul>
    47. 47. Cost Savings Based on Churn Data
        $$\text{savings per churnable subscriber} = \frac{\text{net(no intervention)} - \text{net(incentive)}}{L + NL}$$
        $$\text{net(no intervention)} = (L + NL)\, C_l$$
        $$\text{net(incentive)} = (L + LS)\, C_i + (P_i L + NL)\, C_l$$
        To estimate cost savings, the parameters $C_i$ (cost of incentive per customer), $P_i$ (reduction in probability to churn due to incentive $C_i$), and $C_l$ (lost-revenue cost when a subscriber churns) are combined with four statistics obtained from a predictor model:
        L : number of subscribers who are predicted to leave (churn) and who actually leave barring intervention.
        NL : number of subscribers who are predicted to stay (nonchurn) and who actually leave barring intervention.
        LS : number of subscribers who are predicted to leave and who actually stay.
        SS : number of subscribers who are predicted to stay and who actually stay.
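The savings formula on this slide can be evaluated directly. The sketch below follows the formula as written, with Pi multiplying L; the function name and every input value are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def savings_per_churnable_subscriber(L, NL, LS, Ci, Pi, Cl):
    """Evaluate the slide's savings formula as written.

    L, NL, LS are counts from the predictor model; Ci, Pi, Cl are the
    incentive cost, the Pi parameter, and the lost-revenue cost.
    SS does not appear in the formula, so it is not a parameter here.
    """
    net_no_intervention = (L + NL) * Cl
    net_incentive = (L + LS) * Ci + (Pi * L + NL) * Cl
    return (net_no_intervention - net_incentive) / (L + NL)

# Hypothetical inputs, not taken from the study
savings = savings_per_churnable_subscriber(L=60, NL=40, LS=100, Ci=50, Pi=0.5, Cl=500)
```

With these inputs, net(no intervention) is 100 x 500 = 50,000 and net(incentive) is 160 x 50 + 70 x 500 = 43,000, so the saving is 7,000 / 100 = 70 per churnable subscriber.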
    48. 48. Churn Management Expected Saving to Carrier / Churnable Subscriber Source: Mozer 2000
    49. 49. Future Trends and Conclusion <ul><li>Real-time analytics and text mining (Oracle 10g) can take data mining to the next level. </li></ul><ul><li>Oracle Data Mining can help solve a business problem. </li></ul><ul><li>Churn prediction and churn management can yield significant savings to the wireless provider. </li></ul>
    50. 50. Daleen at a Glance <ul><li>Founded in 1989 with a mission to build custom software for finance & telecom sectors </li></ul><ul><li>Worldwide base of over 80 billing & customer care contracts since 1997 </li></ul><ul><li>Innovator in deployment of convergent billing, event management & revenue assurance solutions for next-generation services </li></ul><ul><li>Long term focus on delivering exceptional customer service through a site license or service bureau relationship </li></ul><ul><li>Offices in Boca Raton, St. Louis, Amsterdam & Sydney </li></ul>
        RevChain – high performance billing & customer management <ul><li>Commerce - convergent billing & customer mgmt. </li></ul><ul><li>Interact - pure web CSR interface for comprehensive account management </li></ul><ul><li>Care - web-based self-care with EBPP </li></ul><ul><li>mCommerce - account mgmt. via the mobile device </li></ul>
        Asuriti – centralized event management & revenue assurance <ul><li>Configurable, rules-based architecture </li></ul><ul><li>Centralized management of event data </li></ul><ul><li>Data transformation & enrichment </li></ul><ul><li>Revenue assurance & error management </li></ul>
        BillingCentral – comprehensive outsourcing solution <ul><li>Advanced billing & event management technologies </li></ul><ul><li>Proven best practices & process controls </li></ul><ul><li>Carrier-class hardware & networks </li></ul><ul><li>Performance guarantees & revenue assurance </li></ul>
    51. 51. Q & A: Questions and Answers
    52. 52. References & Useful Links <ul><li>Technet </li></ul><ul><li>Armstrong, G., and P. Kotler. 2001. Principles of Marketing. Prentice Hall, New Jersey. </li></ul><ul><li>Duke Teradata. 2002. Teradata Center for Customer Relationship Management. [Online]. Retrieved on: Nov 7, 2002. Available: </li></ul><ul><li>In-Stat. 2002. WLNP threatens to significantly impact wireless churn rates. [Online]. Retrieved on: Sep 2002. Available: </li></ul><ul><li>Mozer, Michael, Richard Wolniewicz, Eric Johnson and Howard Kaushansky. 1999. Churn reduction in the wireless industry. Proceedings of the Neural Information Processing Systems Conference, San Diego, CA. </li></ul><ul><li>Oracle9i Data Mining. 2001. An Oracle white paper, December 2001. [Online]. Retrieved on: Nov 8, 2002. Available: </li></ul><ul><li>Skedd, Kirsten. 2002. WLNP threatens to significantly impact wireless churn rates. [Online]. Retrieved on: Sep 14, 2002. Available: </li></ul>
    53. 53. Acknowledgements <ul><li>Dr Ravi Behara, Faculty (Florida Atlantic University) </li></ul><ul><li>David Eastlund and Jennifer from Oracle </li></ul><ul><li>Cohorts at Daleen Technologies </li></ul>
    54. 54. Reminder – Please complete the OracleWorld online session survey. Session id: 40332 Data Warehousing for the Communications Industry Thank you.
    55. 55. Contact Information <ul><li>Email: [email_address] </li></ul><ul><li>Cell Phone: (954) 609-2402 </li></ul><ul><li>Test Message: </li></ul>