Data Mining Executive Overview Alan Montgomery


  • This seminar gives an executive-level appreciation of what data mining is and how it can be used to increase profits, decrease costs, and improve service quality in industry, commerce, science and academia, and in the public services. There are many case studies showing data mining in action, all of them from clients of SPSS. Attendees should have a general understanding of business and of data. They do not need specific IT or analysis expertise, though that will help. This seminar will not teach detailed data mining techniques. By the end of it we hope that you will be strongly motivated to learn more about data mining and apply it in your own or your client’s organization. Dr Montgomery is Vice President of Business Development at SPSS Inc, specialising in data mining. He has degrees in Physics from Oxford and Sussex Universities. His career has included real-time systems, managing ICL’s compilers group, and software engineering at SD-Scicon, where he came across artificial intelligence. In 1989 Alan, with six colleagues, founded ISL, creators of the Clementine data mining system. ISL was sold to SPSS Inc in December 1998. Dr Montgomery is a Fellow of the British Computer Society and of the Royal Society of Arts. He advises UK and EU governments on IT research and business development.
  • The famous and mythical “Diapers and Beer” basket association. This sort of association analysis lists everything that is associated with everything else and reports the associations that deviate, by some criterion, from the norm. The trouble is that in a large, real-world database nearly everything does differ in some way from the norm, so data mining algorithms can find a myriad of perfectly useless associations. Here’s another from an advertisement by IBM for its data mining services.
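The note above describes association analysis as comparing what is bought together against "the norm". A minimal sketch of that idea follows, assuming the norm is what independent purchasing would predict; the function name `pair_lifts` and the toy baskets are hypothetical, and real tools apply further criteria before reporting a pair.

```python
from collections import Counter
from itertools import combinations


def pair_lifts(baskets):
    """Return {(item_a, item_b): lift} for every pair that co-occurs at least once.

    Lift is the observed number of co-occurrences divided by the number
    expected if the two items were bought independently (the "norm").
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates, fix pair ordering
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    lifts = {}
    for (a, b), together in pair_counts.items():
        expected = item_counts[a] * item_counts[b] / n
        lifts[(a, b)] = together / expected
    return lifts


# Hypothetical toy data, not from the seminar.
baskets = [
    ["diapers", "beer", "nuts"],
    ["diapers", "beer"],
    ["beer", "nuts"],
    ["milk", "bread"],
    ["milk", "diapers"],
    ["bread", "nuts"],
]
lifts = pair_lifts(baskets)
for pair, value in sorted(lifts.items()):
    print(pair, round(value, 2))
```

Even in six baskets most pairs have a lift different from 1, which illustrates the note's warning: deviation from the norm is easy to find, and only business judgement can say which deviations matter.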
  • The slide is fairly self-explanatory. The data relevant to discovering the high-value patterns underlying many business problems is a tiny fraction of the total available. Using all the data is logically wrong as well as inefficient. Once discovered, the pattern may be applied to all (or better, the complete relevant subset of all) the data. But this is a simple process, usually linear in time with the volume of data, whereas the running time of discovery algorithms typically rises as some power of the number of factors to be considered, as well as linearly or faster with the number of records in the training data. Of course there are some problems where large volumes of data may have to be processed.
  • This slide summarises the whole talk. Business users, whether marketeers, engineers, finance, or manufacturing people, need not worry. There is no algorithm yet which has discovered any new useful business knowledge. I immediately have to qualify that to avoid the wrath of AI gurus like Prof. Donald Michie. Algorithms have discovered new knowledge, such as how to win a chess end-game previously thought to be a draw, but only in worlds with closed boundaries. My colleague Colin Shearer tells of a war-gaming program which always won naval battles; its first action was to sink all the weakest ships on its own side. The point is that the algorithms only find patterns in data. There is a long way to go to establish that the patterns found are real and, if real, that they are of value for marketing or other business purposes. They may just be coincidences, or they may be self-evident from other considerations. In my view, only someone who understands the business meaning and context of the data can judge. And that context includes a whole world of common sense which we don’t yet know how to represent in machines, let alone all the detailed knowledge specific to particular industries and markets.
  • In 1992, ISL initiated "Project Clementine", aiming to build a comprehensive data mining system accessible to business and professional end-users such as doctors. Making advanced analysis techniques available to data owners requires a variety of techniques (visualisation, statistics, and machine learning), packaged so that technology details are hidden from the user, and presented in an intuitive, easy-to-learn tool. The user is encouraged to interact with data. Data features and patterns - a cluster, for example - can be identified using the mouse; the user can then generate icons which select the cases corresponding to the defined regions. These facilities allow very rapid data exploration, and formulation and testing of hypotheses based on observed features. It is also essential to encourage and reward exploration of the data and provide the responsiveness to keep up with the excitement of discovery. The configuration of the machine learning engines is automatic. The user expresses only high-level preferences - such as "favour generality over accuracy" - and Clementine configures the tools by considering both this user input and the structure of the data. The user is protected from consideration of the details of the technologies involved. (Tools can also be used in "expert" mode.)

    1. 1. Data Mining Executive Overview Alan Montgomery VP Business Development, SPSS “Data mining makes the difference”
    2. 2. Agenda <ul><li>What is data mining? </li></ul><ul><li>Who is using data mining, and for what? </li></ul><ul><li>How data mining fits into an IT system </li></ul><ul><li>Some myths about data mining </li></ul>
    3. 3. Information: Internet <ul><li>SPSS: </li></ul><ul><li> </li></ul><ul><li>Two Crows Corp (Herb Edelstein): </li></ul><ul><li> </li></ul><ul><li>Andy Pryke’s Data Mine </li></ul><ul><li> </li></ul><ul><li>Knowledge Discovery Mine: </li></ul>
    4. 4. Bibliography (by Herb Edelstein) <ul><li>M. Berry, G. Linoff, Data Mining Techniques, John Wiley, 1997 </li></ul><ul><li>William S. Cleveland, The Elements of Graphing Data, Hobart Press, 1994 </li></ul><ul><li>Howard Wainer, Visual Revelations, Copernicus, 1997 </li></ul><ul><li>R. Kennedy, Lee, Reed, Van Roy, Solving Pattern Recognition Problems, Prentice-Hall, 1998 </li></ul><ul><li>U. Fayyad, Piatetsky-Shapiro, Smyth, and Uthurusamy, Advances in Knowledge Discovery and Data Mining, MIT Press, 1996 </li></ul><ul><li>Dorian Pyle, Data Preparation for Data Mining, Morgan Kaufmann, 1999 </li></ul><ul><li>C. Westphal, T. Blaxton, Data Mining Solutions, John Wiley, 1998 </li></ul><ul><li>Vasant Dhar, Roger Stein, Seven Methods for Transforming Corporate Data into Business Intelligence, Prentice Hall, 1997 </li></ul><ul><li>Joseph P. Bigus, Data Mining With Neural Networks, McGraw-Hill, 1996 </li></ul><ul><li>L. Breiman, Friedman, Olshen, Stone, Classification and Regression Trees, Wadsworth, 1984 </li></ul><ul><li>J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, 1992 </li></ul>
    5. 5. Data holds Knowledge <ul><li>Data can hold organization’s operations history, what we did . . . and what was the outcome </li></ul><ul><li>Can we find which actions gave good (bad) outcomes? </li></ul><ul><li>So learn from our past failures and successes to do better in future. </li></ul>
    6. 6. What we learn from data <ul><li>Marketing - who’s likely to buy? </li></ul><ul><li>Forecasts - what demand will we have? </li></ul><ul><li>Loyalty - who’s likely to defect? </li></ul><ul><li>Credit - which loans were profitable? </li></ul><ul><li>Fraud - when did it occur? </li></ul>In each case can we: find the signs? . . . find others showing similar signs?
    7. 7. Data mining is natural <ul><li>This process is simply “learning from experience” </li></ul><ul><li>It is a totally natural and routine part of every successful business. </li></ul><ul><li>Data mining just helps you do it more quickly, accurately, and systematically. </li></ul>
    8. 8. An example Winterthur Insurance, Spain
    9. 9. Winterthur: Customer Loyalty or “Churn” <ul><li>Churn is a common data mining issue. </li></ul><ul><li>What’s at stake? Losing car insurance clients at rate of 13.25% a year ($$$$). </li></ul><ul><li>Business Goal: retain profitable clients. </li></ul><ul><li>Data Mining Goals: predict which clients are likely to resign their policy. </li></ul><ul><li>Winterthur can then take action. </li></ul>
    10. 10. Approach to churn <ul><li>Select data on customers who resigned </li></ul><ul><li>Divide this sample into: </li></ul><ul><ul><li>a training set to learn from; </li></ul></ul><ul><ul><li>a test set to check the results. </li></ul></ul><ul><li>Compare leavers in training set with similar customers who did not leave. </li></ul><ul><li>Learn the signature of likely churners. </li></ul>
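The split into a training set and a test set described on this slide can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_train_test`, the 30% test fraction, the fixed seed, and the toy customer records are all hypothetical, not details of the Winterthur project.

```python
import random


def split_train_test(records, test_fraction=0.3, seed=42):
    """Shuffle a copy of the records and hold back a fraction as the test set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical customer records: "left" marks customers who resigned their policy.
customers = [{"id": i, "left": i % 8 == 0} for i in range(200)]
train, test = split_train_test(customers)
print(len(train), "records to learn from;", len(test), "held back to check the results")
```

The model only ever sees the training records; the held-back test records show how well the learned signature carries over to customers it has not seen.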
    11. 11. Winterthur Application <ul><li>Two complementary approaches </li></ul><ul><li>In both we learn from a training set, and build a model. </li></ul><ul><li>1 Classify customers into leavers and non-leavers. Model gives Yes/No Answer. </li></ul><ul><li>2 Predict “likelihood” of people leaving. Generates a “propensity to leave”, or “score” for each case. Model gives numeric answer. </li></ul>
    12. 12. Winterthur Results <ul><li>Result on churn classification. </li></ul><ul><li>Achieved > 91.5% accuracy predicting churn (Yes/No) on the test set. </li></ul><ul><li>This was 20% better than next competitor! </li></ul>
    13. 13. Summary Data Mining <ul><li>Data Mining means </li></ul><ul><li>finding patterns in your data </li></ul><ul><li>which you can use </li></ul><ul><li>to do your business better. </li></ul><ul><li>Decisions from data </li></ul><ul><li>It is a completely natural business process </li></ul><ul><li>. . . with a very wide range of applicability. </li></ul>
    14. 14. Applications of Data Mining <ul><li>Four Case Studies </li></ul><ul><ul><li>Reuters </li></ul></ul><ul><ul><li>BBC </li></ul></ul><ul><ul><li>Halfords </li></ul></ul><ul><li>Survey of other users and applications </li></ul>
    15. 15. Reuters Validating Forex Data <ul><li>Reuters gets currency prices from many sources </li></ul><ul><li>May contain errors </li></ul><ul><li>Easy to spot afterwards (spikes, dips) </li></ul><ul><li>Conventional checking systems spot only obvious errors </li></ul><ul><li>What’s at stake? </li></ul><ul><li>Reuters’ reputation, therefore sales </li></ul>
    16. 16. Reuters Real-time Forex Data [Figure: two charts of the £/$ rate against time up to “NOW”, one labelled OK and one labelled ERROR]
    17. 17. Reuters - Validating Forex Data <ul><li>Used historical Forex data </li></ul><ul><li>Derived dynamic, time-based descriptors </li></ul><ul><li>Built models (neural networks, rules) to predict price movements </li></ul><ul><li>Report deviations from predictions </li></ul>
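The "predict, then report deviations" idea on this slide can be illustrated with a deliberately simple stand-in. The sketch below does not use neural networks or rules as the Reuters project did; it assumes the prediction is just the median of the last few quotes, and the function name `flag_deviations`, the window of 5, the 1% tolerance, and the quotes are all hypothetical.

```python
from statistics import median


def flag_deviations(prices, window=5, tolerance=0.01):
    """Return the indexes of quotes that stray from the predicted price.

    The prediction for each quote is the median of the previous `window`
    quotes; a quote is flagged when it differs from that prediction by
    more than `tolerance`, expressed as a fraction of the prediction.
    """
    flagged = []
    for i in range(window, len(prices)):
        predicted = median(prices[i - window:i])
        if abs(prices[i] - predicted) / predicted > tolerance:
            flagged.append(i)
    return flagged


# Hypothetical £/$ quotes with one spike at index 6.
quotes = [1.600, 1.601, 1.602, 1.601, 1.603, 1.602, 1.702, 1.603, 1.604, 1.603]
suspect = flag_deviations(quotes)
print("Suspect quote positions:", suspect)
```

The median is used rather than the mean so that one bad quote does not distort the predictions for the quotes that follow it.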
    18. 18. BBC TV Audience Prediction <ul><li>What’s at stake? Survival of BBC! </li></ul><ul><li>Business goal </li></ul><ul><ul><li>increase audience for TV programs </li></ul></ul><ul><li>Proposed business action </li></ul><ul><ul><li>better scheduling of programs </li></ul></ul><ul><li>Data mining goal </li></ul><ul><ul><li>predict audience share a programme will achieve in a particular slot </li></ul></ul>
    19. 19. BBC Results <ul><li>Neural network trained on 1 year’s data </li></ul><ul><ul><li>predicts audience share within 4% </li></ul></ul><ul><ul><li>equals best (> 2 years) human schedulers </li></ul></ul><ul><li>Some problem programmes </li></ul><ul><ul><li>human schedulers had same problems! </li></ul></ul><ul><li>Rules gave insight into “reasons” </li></ul><ul><li>. . . but beware of reasons . . . </li></ul>
    20. 20. Take care with “explanations” <ul><li>“Any program (X) which follows a UK “soap” will achieve 6% less share than if X is put anywhere else” </li></ul><ul><li>So UK “soaps” cause audience to turn off ?? </li></ul><ul><li>No! The competition is at work! </li></ul>
    21. 21. Halfords - Predicting Sales <ul><li>Halfords are a retail organization </li></ul><ul><li>. . . planning to open new stores </li></ul><ul><li>What’s at stake? </li></ul><ul><ul><li>$10M investment / store </li></ul></ul><ul><li>Goal: predict sales from a new store </li></ul><ul><li>500 stores to learn from, many factors: </li></ul><ul><ul><li>site, competition, catchment area, management practice, . . . . </li></ul></ul>
    22. 22. Halfords - Predicting Sales. Clementine models much more accurate than previous statistical models. [Figure: two plots of predicted sales against actual sales, one for the Regression Model (6m) and one for the Clementine Model (3w)]
    23. 23. Who is using data mining? <ul><li>Telcos </li></ul><ul><ul><li>AT&T </li></ul></ul><ul><ul><li>Cable & Wireless </li></ul></ul><ul><ul><li>Cellnet </li></ul></ul><ul><ul><li>Airtouch Cellular </li></ul></ul><ul><ul><li>Singapore Telecoms </li></ul></ul><ul><li>Finance </li></ul><ul><ul><li>Reuters </li></ul></ul><ul><ul><li>Barclays </li></ul></ul><ul><ul><li>National Westminster </li></ul></ul><ul><ul><li>Citibank </li></ul></ul><ul><li>Pharmaceutical </li></ul><ul><ul><li>Glaxo-Wellcome </li></ul></ul><ul><ul><li>Pfizer </li></ul></ul><ul><ul><li>Du Pont </li></ul></ul><ul><ul><li>Unilever </li></ul></ul><ul><li>Government </li></ul><ul><ul><li>HM Customs & Excise </li></ul></ul><ul><ul><li>IRS </li></ul></ul><ul><ul><li>The Home Office </li></ul></ul><ul><ul><li>DERA </li></ul></ul><ul><li>Manufacturing </li></ul><ul><ul><li>Daimler Benz </li></ul></ul><ul><ul><li>Ford </li></ul></ul><ul><ul><li>British Steel </li></ul></ul><ul><ul><li>Caterpillar </li></ul></ul><ul><li>Retail </li></ul><ul><ul><li>Boots </li></ul></ul><ul><ul><li>Tandy </li></ul></ul><ul><ul><li>ICL Retail </li></ul></ul><ul><ul><li>Halfords </li></ul></ul>
    24. 24. Value of Reducing Attrition by 5%. Based on The Loyalty Effect; Frederick F. Reichheld, Thomas Teal; Harvard Business School Press, 1996. [Figure: chart of increase in profitability, on a scale from 0 to 100, for Auto/Home Insurance, Branch Bank Deposits, Credit Card, Industrial Brokerage, Industrial Distribution, Life Insurance, Publishing, and Software]
    25. 25. Two Crows Survey Results
    26. 26. Evolution of Marketing <ul><li>Market products to </li></ul><ul><ul><li>Everyone </li></ul></ul><ul><ul><li>Segments </li></ul></ul><ul><ul><li>Customers based on behavior (RFM) </li></ul></ul><ul><ul><li>Customers and non-customers based on demographics and psychographics </li></ul></ul>
    27. 27. Evolution of Marketing Technology <ul><li>Mailing list management </li></ul><ul><li>Ad-hoc segmentation </li></ul><ul><li>RFM </li></ul><ul><li>Statistical selection: clustering, regression, logistic regression, etc. </li></ul><ul><li>Statistical selection: CHAID </li></ul><ul><li>Statistical selection: data mining </li></ul>
    28. 28. Lift Lift measures the improvement between two treatments of the data
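One common way to put a number on lift is to compare the response rate in a group selected by a model with the response rate across everyone. The sketch below assumes that definition, which is only one reading of "two treatments of the data"; the function name `lift` and the campaign figures are hypothetical.

```python
def lift(responders_selected, size_selected, responders_all, size_all):
    """Response rate in the selected group divided by the overall response rate."""
    selected_rate = responders_selected / size_selected
    baseline_rate = responders_all / size_all
    return selected_rate / baseline_rate


# Hypothetical mailing: 500 of 10,000 customers respond overall,
# while 300 of the 2,000 customers picked by a model respond.
example = lift(300, 2_000, 500, 10_000)
print(f"Lift: {example:.1f}")
```

A lift of about 3 here would mean that mailing the model's selection reaches responders roughly three times as efficiently as mailing at random.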
    29. 29. Return on Investment
    30. 30. Typical Applications <ul><li>Finance and Financial Services </li></ul><ul><ul><li>Lending risk assessment </li></ul></ul><ul><ul><li>Prediction of customer profitability </li></ul></ul><ul><ul><li>Targeting direct marketing </li></ul></ul><ul><ul><li>Predicting market rates </li></ul></ul><ul><ul><li>Fraud detection </li></ul></ul><ul><ul><li>Calculating insurance claim profiles </li></ul></ul>
    31. 31. Typical Applications <ul><li>Utilities </li></ul><ul><ul><li>Electricity demand forecasting </li></ul></ul><ul><ul><li>Modeling energy pricing </li></ul></ul><ul><ul><li>Developing control algorithms </li></ul></ul><ul><li>Retail </li></ul><ul><ul><li>“ Basket Analysis” (shopping patterns) </li></ul></ul><ul><ul><li>Promotions analysis </li></ul></ul><ul><ul><li>Analysis of personnel data </li></ul></ul>
    32. 32. Typical Applications <ul><li>Science and Healthcare </li></ul><ul><ul><li>Drug discovery </li></ul></ul><ul><ul><li>Predicting corrosivity of chemicals </li></ul></ul><ul><ul><li>Assessing treatment effectiveness </li></ul></ul><ul><ul><li>Monitoring intensive care patients </li></ul></ul><ul><ul><li>Predict crop yield from environmental factors </li></ul></ul><ul><ul><li>Choosing dental treatment for children </li></ul></ul><ul><ul><li>Predicting recovery time </li></ul></ul><ul><ul><li>Analysis of child care projects </li></ul></ul>
    33. 33. Typical Applications <ul><li>Market Research </li></ul><ul><ul><li>Increasing response rates to surveys </li></ul></ul><ul><ul><li>Estimating missing values in data </li></ul></ul><ul><li>Manufacturing/Defence </li></ul><ul><ul><li>Analyzing equipment failures </li></ul></ul><ul><ul><li>Managing spares, warranty claims, recalls </li></ul></ul><ul><ul><li>Quality management </li></ul></ul><ul><ul><li>Supply logistics </li></ul></ul>
    34. 34. Customer relationships <ul><li>Forecasting </li></ul><ul><ul><li>what demand will we have? </li></ul></ul><ul><li>Loyalty </li></ul><ul><ul><li>who’s likely to defect? </li></ul></ul><ul><li>Credit analysis </li></ul><ul><ul><li>What loans are the most risky? </li></ul></ul><ul><li>Profit modeling: </li></ul><ul><ul><li>which customers generate most, or least, profit </li></ul></ul><ul><li>Fraud detection </li></ul><ul><ul><li>When did it occur; what were the signs? </li></ul></ul><ul><ul><li>Do others show same signs? </li></ul></ul>
    35. 35. Summary <ul><li>Data mining has very broad range of applications </li></ul><ul><li>It is already being used by leading companies in many sectors world-wide </li></ul>
    36. 36. Agenda <ul><li>What is data mining? </li></ul><ul><li>Who is using data mining, and for what? </li></ul><ul><li>Systems Architecture for data mining </li></ul><ul><li>Some myths about data mining </li></ul>
    37. 37. Recall the decision-value pyramid Data from operational systems TPS, D/B, Management Reports Management information RD/B, EIS, OLAP Knowledge Data Mining Decision Value
    38. 38. “ Typical” multi-level IS Designed for: short transactions resilience. Big danger: killer SQL query Data Warehouse Designed for: killer SQL query. Big dangers: size? politics? unclean data? Receipts Orders Invoices Transaction Databases Operations management Supervisory Management Data Marts Strategy
    39. 39. BI architecture Browser Paper reports KNOWLEDGE WORKERS Data collection software External data ERP systems Other transaction systems Extract Cleanse Manage Load Calculate Enrich Impute Transform Functional department systems Legacy databases Data warehouse Reporting OLAP Pattern recognition Exception detection Segmentation Classification Profiling Scoring Forecasting Simulation Optimization Data sources Data preparation Data storage Data analysis & data mining Deployment INFORMATION CONSUMERS Web server Desktop software Services / Application development / Prototyping MODEL BUILDERS Data mart Data mart Browser Browser
    40. 40. DM in an Information System <ul><li>The only requirements for data mining are </li></ul><ul><ul><li>a business problem </li></ul></ul><ul><ul><li>some relevant data </li></ul></ul><ul><li>The data can come from any data source </li></ul><ul><li>. . . or combination of data sources </li></ul><ul><li>Successful data mining requires two viewpoints </li></ul><ul><ul><li>knowledge of the business meaning of the data </li></ul></ul><ul><ul><li>some common-sense analytical knowledge </li></ul></ul>
    41. 41. Data Mining Process in a multi-level IS Transaction Databases Data Warehouse Data Marts Orders Invoices Receipts Other e.g. geographic, demographic, etc. Eureka??
    42. 42. Business intelligence tools The data “mine” Neural networks Tree builders, Rule induction Statistics Data visualisation On Line Analytical Processing (OLAP) Automatic High dimensionality Non-Linear relations Highly predictive Query, SQL, Spreadsheets User driven Low dimensionality Little predictive value
    43. 43. Business intelligence compared <ul><li>Query/Reporting </li></ul><ul><ul><li>Validation driven </li></ul></ul><ul><ul><li>Manual </li></ul></ul><ul><ul><li>‘What were sales of product X in October’ </li></ul></ul><ul><li>OLAP </li></ul><ul><ul><li>Visualisation-driven </li></ul></ul><ul><ul><li>Manual </li></ul></ul><ul><ul><li>Dimensions shown: time, profit, product </li></ul></ul><ul><ul><li>‘Drill down October Sales of product X at 4% profit level, all regions’ </li></ul></ul><ul><li>Data Mining </li></ul><ul><ul><li>Goal-driven </li></ul></ul><ul><ul><li>Automatic </li></ul></ul><ul><ul><li>Goal = ‘significant loss’: ‘If period = week 40 and product = BBQ then profit level = significant loss’ </li></ul></ul>Outputs shown on the slide: Reports & Graphs; Executable Decision Model
    44. 44. Discovered Knowledge is a non-trivial pattern in data <ul><li>classification </li></ul><ul><ul><li>these people will buy; those people will not </li></ul></ul><ul><li>association </li></ul><ul><ul><li>people who buy beer also buy nuts </li></ul></ul><ul><li>sequence </li></ul><ul><ul><li>after marriage, people buy insurance </li></ul></ul><ul><li>clustering/segmentation </li></ul><ul><ul><li>health, convenience, luxury food eaters . . . </li></ul></ul>
    45. 45. Select appropriate modeling technique <ul><li>Classification (categorize your customers or clients): rule induction, neural networks, tree generators </li></ul><ul><li>Prediction (forecast future sales or usage): rule induction, neural networks, regression </li></ul><ul><li>Segmentation (group similar customers or clients): kohonen networks, rule induction, k-means </li></ul><ul><li>Association (discover products that are purchased together): web diagrams, a priori, rule induction </li></ul><ul><li>Sequence (find patterns and trends over time): trend functions, rule induction, neural networks </li></ul>
    46. 46. Decision models <ul><li>The ideal result is actionable knowledge </li></ul><ul><li>… executable software which makes a decision </li></ul><ul><ul><li>market to these people out of the list </li></ul></ul><ul><ul><li>accept/decline this loan application </li></ul></ul><ul><ul><li>predicted revenue from this store is $205M </li></ul></ul><ul><ul><li>weight this premium by -5% </li></ul></ul><ul><ul><li>sales in this area are below par: investigate! </li></ul></ul><ul><li>Models (software agents) can be deployed wherever appropriate in the existing IS </li></ul>
    47. 47. Models deployed in an IS Decision models (“agents”) in action Reports Orders Invoices Receipts Data Marts Data Warehouse Transaction Databases
    48. 48. Model used for new process New product? New promotion? Data Marts Data Warehouse
    49. 49. Warehousing and mining <ul><li>Warehouse not required for data mining... </li></ul><ul><li>... but it is usually an excellent platform </li></ul><ul><li>Warehouse cleans data and solves politics </li></ul><ul><ul><li>mine first, learn what the warehouse should hold </li></ul></ul><ul><ul><li>mine first, use the savings to pay for warehouse! </li></ul></ul>[Figure: Data Warehouse (storage, management, organisation, control; $0.5-5M) compared with Data Mining (discovery, understanding, modelling; $30-200K)]
    50. 50. Data mining is natural <ul><li>DM automates the oldest, most natural process: learning from experience </li></ul><ul><li>Finds models of best business practice that can be deployed throughout the enterprise </li></ul>Deploy models for best practice Data Data Mining Enterprise learning feedback loop
    51. 51. The Vision <ul><li> decision-enabled enterprises that continually adapt to new customer and market situations </li></ul>
    52. 52. Summary of this section <ul><li>Data mining automates “learning from experience” </li></ul><ul><li>. . . helps create organizations that adapt </li></ul><ul><li>there is no limit to the number of applications </li></ul><ul><li>only requirement is business problem plus relevant data </li></ul><ul><li>results can be reports, but better as active best practice models learned from data </li></ul><ul><li>models provide benefit only when deployed! </li></ul><ul><li>you don’t need to have a warehouse, </li></ul><ul><li>. . . but it can help. </li></ul>
    53. 53. Agenda <ul><li>What is data mining? </li></ul><ul><li>Who is using data mining, and for what? </li></ul><ul><li>How data mining fits into an IT system </li></ul><ul><li>Some myths about data mining </li></ul>
    54. 54. Data mining myths <ul><li>Myth: “data mining is something algorithms do to large volumes of data; algorithms can discover new knowledge” </li></ul><ul><li>Fact: “Data mining is something people do on their businesses.” High-value results are often obtained with modest amounts of data. </li></ul><ul><li>Myth: Data mining requires a high degree of analytical skills (e.g. a PhD in statistics) </li></ul><ul><li>Fact: The best data miner is someone who knows and understands the business. </li></ul>
    55. 55. Data mining vendors - the myth-makers! <ul><li>Vendors position DM to sell their: </li></ul><ul><ul><li>parallel machines or large disks </li></ul></ul><ul><ul><li>expensive parallel algorithms </li></ul></ul><ul><ul><li>dramatic visualisation </li></ul></ul><ul><ul><li>high-power external consulting </li></ul></ul><ul><li>Some problems need these (and their cost); many do not. </li></ul>
    56. 56. Mine data intelligently <ul><li>Data mining is not blundering blindly about in data using the most powerful shovel (algorithm). </li></ul><ul><li>Though it is smart to have a lot of quality tools (algorithms) available. </li></ul><ul><li>Contrast: </li></ul><ul><ul><li>hydraulic mining by washing away mountains </li></ul></ul><ul><ul><li>mining by intelligent prospecting </li></ul></ul>
    57. 57. Hydraulic mining at Malakoff Diggins
    58. 58. Hydraulic data mining? Picture from Tandem TM advertisement
    59. 59. Good Data Mining is: <ul><li>. . . “intelligent prospecting” </li></ul><ul><li>decide what you are looking for first, </li></ul><ul><li>then apply knowledge (cf. geology, mineralogy..), </li></ul><ul><li>then take samples, </li></ul><ul><li>assay the results from the samples, </li></ul><ul><li>finally mine. </li></ul>
    60. 60. Good Data Mining is: <ul><li>. . best with known business problem / opportunity patterns to learn from </li></ul><ul><li>(known buyers, bad debts, fraud cases, good promotions, profitable lines . . .) </li></ul><ul><li>This determines: </li></ul><ul><ul><li>business goals and goal variables, </li></ul></ul><ul><ul><li>data that is rich in information for this problem </li></ul></ul><ul><ul><li>suggest the analysis strategy </li></ul></ul>
    61. 61. Understand the Business Problem First [Diagram labels: Business problem, Increase revenue, Improve processes, $, What you know, Data, Clustering (C1, C2), Insight]
    62. 62. DM rarely requires massive data during the prospecting phase <ul><li>Case of the mysterious disappearing Terabytes </li></ul><ul><li>“Can Clementine handle our data base? We have 3Tb going back 20 years, 17M clients.” </li></ul><ul><li>“Probably, tell us what you want to investigate.” </li></ul><ul><li>“Account closure patterns, to reduce churn” </li></ul><ul><li>“How many occur each month?” (1700) 10^-4 </li></ul><ul><li>What’s important? (age, marriage, . . . . ) 10^-5 </li></ul><ul><li>When did you start saving this? (2 years ago) 10^-6 </li></ul><ul><li>When do closure signs begin? (3 months) 10^-7 </li></ul>
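The first reduction on this slide can be checked with simple arithmetic. The sketch below uses only the figures quoted on the slide (17M clients, 1700 closures a month, 2 years of saved data) and assumes, purely for illustration, that the closure rate was constant over those 2 years; it does not attempt to reproduce the later 10^-5 to 10^-7 factors, whose derivation the slide does not give.

```python
clients = 17_000_000          # "17M clients"
closures_per_month = 1_700    # "How many occur each month?" (1700)
months_recorded = 24          # relevant factors saved for "2 years"

# Fraction of the client base that closes an account in a month.
monthly_fraction = closures_per_month / clients

# Upper bound on closure cases with the relevant factors recorded,
# assuming a constant closure rate (an assumption, not a slide figure).
closure_cases = closures_per_month * months_recorded

print(f"Monthly closure fraction: {monthly_fraction:.0e}")
print(f"Closure cases available to learn from: {closure_cases:,}")
```

A few tens of thousands of cases is a long way from 3Tb, which is the point of the "disappearing Terabytes" story.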
    63. 63. Winterthur Result <ul><li>Recall the Winterthur “churn” problem </li></ul><ul><li>Result on churn classification. </li></ul><ul><li>Achieved > 91.5% accuracy predicting churn (Yes/No) on the (unseen) test set. </li></ul><ul><li>This was 20% better than next competitor! (SAS EM, IBM IM, HNC, Thinking Machines Inc.) </li></ul>
    64. 64. Halfords - Predicting Sales. Recall the store sales prediction result. [Figure: two plots of predicted sales against actual sales, one for the Regression Model (6m) and one for the Clementine Model (3w)]
    65. 65. <ul><li>Why? </li></ul>
    66. 66. The data is not the business [Diagram: “Business Data” (the table below) set against “The Business”]
        Name | Age | Income | Mar/Sin/Div | Car | C Card | Purch | Val | Last Purch | Children | Source
        F. Bloggs | 25 | 25000 | Single | Yes | M/C | 5 | 23.5 | 34 | 0 | L1
        J. Smith | 37 | 33000 | Mar. | Yes | VISA | 3 | 123.4 | 102 | 2 | L2
        J. Dow | 45 | 40000 | Div. | No | VISA | 12 | 15.2 | 48 | 1 | L1
    67. 67. Business deals with the real world <ul><li>Most of what is interesting to business is fuzzy - customers, customers’ behaviour </li></ul><ul><li>Hard to give a numeric value. </li></ul><ul><li>Business/market people know strengths and weaknesses in the data </li></ul><ul><li>Garbage (or bias) in = garbage (or bias) out. </li></ul>
    68. 68. What’s in the chasm? <ul><li>Business knowledge that’s in your head (or library, or in another department) </li></ul><ul><li>Data we aren’t yet using e.g. MR data. </li></ul><ul><li>E.g. company launched new product </li></ul><ul><ul><li>90% of our non-buyers are close to buying </li></ul></ul><ul><ul><li>90% of our non-buyers will never buy </li></ul></ul><ul><li>Same transaction data, but dramatically different prospects </li></ul>
    69. 69. Business knowledge <ul><li>Which factors are relevant? </li></ul><ul><ul><li>quality/blend of raw materials </li></ul></ul><ul><ul><li>time of year / weather </li></ul></ul><ul><li>Maybe key predictors must be derived </li></ul><ul><ul><li>a sum: household income, </li></ul></ul><ul><ul><li>a trend: rate of sales decrease </li></ul></ul><ul><ul><li>a ratio: sales/sq ft. </li></ul></ul><ul><li>Business/Market knowledge is the key </li></ul>
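The three kinds of derived predictor named on this slide (a sum, a trend, a ratio) can be sketched as small helper functions. The function names, the least-squares slope used for the trend, and the sample figures are all assumptions made for illustration; the slide does not say how Clementine users derived such fields.

```python
def household_income(incomes):
    """A sum: combine the incomes of everyone in the household."""
    return sum(incomes)


def sales_trend(period_sales):
    """A trend: least-squares slope of sales over equally spaced periods."""
    n = len(period_sales)
    mean_x = (n - 1) / 2
    mean_y = sum(period_sales) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(period_sales))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def sales_per_sq_ft(sales, floor_area_sq_ft):
    """A ratio: sales normalised by store size."""
    return sales / floor_area_sq_ft


# Hypothetical inputs.
print(household_income([25_000, 33_000]))
print(sales_trend([100, 90, 80, 70]))      # negative slope: sales are falling
print(sales_per_sq_ft(500_000, 2_500))
```

Which of these fields is worth deriving is exactly the business and market knowledge the slide refers to; the arithmetic itself is trivial once someone knows to ask for it.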
    70. 70. Halfords’ application. Higher accuracy than previous statistical models. Why? [Figure: two plots of predicted sales against actual sales, one labelled “External statistics company, Regression (6 months)” and one labelled “In-house business manager, Clementine (3 weeks)”]
    71. 71. Halfords - Merging data
    72. 72. Halfords - Adding market knowledge
    73. 73. 1 Split into train and test data 2 Train models 3 Test the models
    74. 74. Rationale for Clementine TM <ul><li>Algorithms have no business knowledge or common sense </li></ul><ul><li>Need to use algorithms alongside business/ market expertise </li></ul><ul><li>DM is a creative/discovery process. We need fluency to follow train of thought (hunches). </li></ul><ul><li>Hunching is hard if business user must keep telling technology expert what to do. </li></ul>
    75. 75. Clementine objectives <ul><li>A data mining system which users can drive themselves </li></ul><ul><li>Many fully-packaged algorithms (no one silver bullet) </li></ul><ul><li>Can follow up clues discovered in the data </li></ul><ul><li>Easy to input own ideas / knowledge </li></ul><ul><li>As easy as a spreadsheet </li></ul>
    76. 76. Clementine
    77. 77. SPSS’ data mining workbench of the future User interface Algorithms Infrastructure Clementine Clementine SPSS Other algorithms Scalable architecture Common deployment vehicles
    78. 78. Data mining: decisions from data to do your business better [Diagram labels: Business problem, Increase revenue, Improve processes, $, What you know, Data, Clustering (C1, C2), Insight]
    79. 79. Thank you for listening. Any Questions?