Sunz 2010 Evan Stubbs When Good Intentions Fail


  1. When Good Intentions Fail
     Tips on avoiding common advanced analytics traps
     Evan Stubbs
     Solution Manager, ANZ – SAS
     16th February, 2010
  2. Today’s Agenda
     Four (hopefully) thought-provoking statements
     Some answers
  3. My provocative statements for the day …
     Seeing only part of the picture is worse than seeing nothing at all.
     Rule-based detection systems will seduce, distract, and eventually trap you.
     Focusing on tools is the fastest road to failure.
     Insight generated in isolation is less than useless and will actually hurt you.
  4. Seeing only part of the picture is worse than seeing nothing at all.
  5. Anyone know what this is?
     Formula courtesy of Wired: http://www.wired.com/techbiz/it/magazine/17-03/wp_quant?currentPage=all
  6. Consider this …
     A process identifies non-state individuals conspiring against the government based on:
        The contents of their communications
        Their communication methods of choice
        The frequency of their interactions
     If the individuals are conspiring, 99% of the time the test will be positive
     If the individuals are not conspiring, 99% of the time the test will be negative
  7. So we execute!
     The test is put into production
     A collection of individuals are identified as conspiring against the government
     The test is known to be 99% accurate, so enforcement is mobilised and set into action
     Pretty conclusive, right?
     It may be wrong as much as 99.99% of the time, despite being 99% accurate (Huh?!?)
  8. Here’s why …
     Few people actually conspire against the government:
        Assume 1 / 500,000 people actually conspire
        Assume Australia’s population is 22 million
     General formula for the expected number of true positives:
        Population * (Incidence rate / Sample Population) * Test Efficiency
     That is 22,000,000 * (1 / 500,000) * 0.99, or about 44 real conspirators flagged, against roughly 220,000 innocent people flagged by the 1% false-positive rate
     A positive result will be wrong in about 99.98% of cases, despite the test being 99% accurate
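
The arithmetic behind slide 8 can be checked with a few lines of Python. This is only a sketch under the slide's stated assumptions (1 in 500,000 incidence, a population of 22 million, 99% detection for conspirators and 99% clearance for everyone else); the function and variable names are illustrative and not taken from the presentation.

```python
def false_alarm_share(population, incidence, sensitivity, specificity):
    """Return the share of positive results that are wrong (1 - precision)."""
    actual = population * incidence                        # real conspirators
    true_pos = actual * sensitivity                        # conspirators correctly flagged
    false_pos = (population - actual) * (1 - specificity)  # innocents wrongly flagged
    return false_pos / (true_pos + false_pos)


population = 22_000_000      # assumed population, from the slide
incidence = 1 / 500_000      # assumed incidence rate, from the slide

share = false_alarm_share(population, incidence, sensitivity=0.99, specificity=0.99)
print(f"Real conspirators: {population * incidence:.0f}")
print(f"Share of positives that are wrong: {share:.2%}")
```

With these inputs there are about 44 real conspirators in the whole population, so the wrongly flagged 1% of everyone else swamps them: the share of wrong positives lands just above 99.9%.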
  9. The Lessons
     If you look through a keyhole, you’ll only ever see a tiny part of the room.
     If you rely too heavily on a single detection method, you will be wrong, catastrophically so at times.
     It’s only a matter of time.
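
One way to see why a single detection method is not enough is to ask how the odds move when further, separate tests also come back positive. The sketch below is not from the presentation: it assumes each additional test is 99% accurate in both directions and, importantly, that the tests are conditionally independent of one another, which real detection methods rarely are. The function name is hypothetical.

```python
def posterior_after_positives(prior, sensitivity, specificity, n_tests):
    """Probability of guilt after n_tests positive results.

    Assumes the tests are conditionally independent (an illustrative
    assumption, not a claim from the slides).
    """
    odds = prior / (1 - prior)
    likelihood_ratio = sensitivity / (1 - specificity)  # evidence from one positive
    odds *= likelihood_ratio ** n_tests
    return odds / (1 + odds)


prior = 1 / 500_000  # incidence rate assumed on slide 8

for n in (1, 2, 3):
    p = posterior_after_positives(prior, 0.99, 0.99, n)
    print(f"{n} positive test(s): probability of a real conspirator = {p:.4%}")
```

Under these assumptions one positive leaves the probability at roughly 0.02%, two at roughly 2%, and it takes three independent confirmations before a real conspirator becomes more likely than not (about 66%).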
  10. Anyone know what this is?
      David X. Li’s Gaussian Copula function, the formula that almost brought down the financial world
  11. Rule-based detection systems will seduce, distract, and eventually trap you.
  12. Another one …
      Identification of the communication point of a seditious cell could involve:
         Their relationships
         The directionality and frequency of ‘interesting’ communication
      Analysis of the information shows that two individuals are equally possible information dissemination points
      There is one standout who, over three months, leads the number of ‘interesting’ messages sent
      Pretty conclusive, right?
  13. Nope, yet again …
      Bad rules lead to bad results.
      Even worse, you may not know until well after the fact!
  14. The Lessons
      Rules don’t work well with ‘context’, but they do provide a false sense of security.
      Maintaining a rules list can be a fun job in its own right!
      Rule-based detection works great when your subjects maintain their behaviour and are happy to be observed. How often does that happen?
  15. Focusing on tools is the fastest road to failure.
  16. There are many methodologies …
      [Figure: Methodology Tree for Forecasting (forecastingprinciples.com, JSA-KCG, September 2005). The tree starts from the knowledge source and splits into judgmental and statistical branches. Branch labels: self / others, univariate / multivariate, data-based / theory-based, role / no role, unstructured / structured, feedback / no feedback, linear / classification. Methods shown: unaided judgment, role playing (simulated interaction), intentions/expectations, conjoint analysis, prediction markets, Delphi, structured analogies, game theory, decomposition, judgmental bootstrapping, expert systems, extrapolation models, quantitative analogies, rule-based forecasting, neural nets, data mining, causal models, segmentation.]
  17. And picking an approach can be complicated …
      [Figure: Selection Tree for Forecasting Methods (forecastingprinciples.com, JSA-KCG, January 2006). A yes/no flowchart that starts from whether sufficient objective data exist and branches into judgmental methods and quantitative methods. Decision points: large changes expected, good knowledge of relationships, conflict among a few decision makers, policy analysis, type of data (time series / cross-section), large changes likely, accuracy feedback, similar cases exist, good domain knowledge, type of knowledge (domain / self). Methods at the leaves: unaided judgment, Delphi / prediction markets, judgmental bootstrapping / decomposition, conjoint analysis, intentions/expectations, role playing (simulated interaction / game theory), structured analogies, expert systems, rule-based forecasting, extrapolation / neural nets / data mining, causal models / segmentation, quantitative analogies. Final steps: if several methods provide useful forecasts, combine forecasts, otherwise use a single method; if information was omitted, use the adjusted forecast, otherwise the unadjusted forecast.]
  18. Six months later …
  19. Here’s a simpler approach …
      Which one gives me the answers?
      Which one lets me automate the manual stuff?
      Which one plays with everything else I have?
  20. The Lessons
      The tools aren’t as important as answering the question quickly, accurately, and in a way that can be executed.
      Focus on solving the intelligence problem, not on the colour of widget X.
  21. Insight generated in isolation is less than useless and will actually hurt you.
  22. Evan’s Generalised Formula for Analysis Paralysis
      Every isolated information source, s, will create p new ‘possibilities’
      Comparing and validating each of these possibilities will take t time
      The total time to compare and validate these possibilities:
         (((s*p) * ((s*p) - 1)) / 2) * t
  23. Evan’s Generalised Formula for Analysis Paralysis
      Let’s say you have:
         Five people
         Each coming up with their own set of ten calculations
         On their standalone desktops with their own extract of data
         And it takes two hours to validate and compare who has the ‘best’ answer
      Total time elapsed: 306 work days, or two months of wasted team effort
      And this is just for one small case!
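
The formula on slide 22 counts every pairwise comparison between the s*p possibilities and multiplies by the time per comparison. A small Python sketch reproduces the slide 23 figure; the function name is illustrative, and the eight-hour working day used to convert hours into days is an assumption that the slide does not state.

```python
def validation_hours(sources, possibilities_per_source, hours_per_comparison):
    """Total hours to compare every pair of possibilities: ((s*p)((s*p)-1)/2) * t."""
    n = sources * possibilities_per_source  # s * p possibilities in total
    pairs = n * (n - 1) // 2                # number of distinct pairs to compare
    return pairs * hours_per_comparison


HOURS_PER_WORK_DAY = 8  # assumption: not stated on the slide

hours = validation_hours(sources=5, possibilities_per_source=10, hours_per_comparison=2)
work_days = hours / HOURS_PER_WORK_DAY
print(f"{hours} hours, or about {work_days:.0f} work days")
```

Doubling the number of isolated sources from five to ten, with everything else unchanged, raises the total from 2,450 to 9,900 hours, roughly four times the effort.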
  24. The Lessons
      Every time you create a new standalone datasource, you quadratically increase your pointless workload.
      Every time you use another non-integrated tool, you waste time and money.
      Make sure your tools operationalise on a common platform, even if you find you must use multiple tools.
  25. The Answers …
  26. The Core Answers
      Focus on solving the problem
      Build a process that uses a wide range of validating / confirming techniques
      Integrate, re-use, automate, and operationalise everything
      Measure success by business outcomes, not models developed
      Keep things as simple as possible, but no simpler
  27. Integrated Business Analytics
      [Figure: architecture diagram. Labels shown: Alert Generation Process, Operational Data Sources, Exploratory Data Analysis & Transformation, Alert Administration, Business Rules, Social Network Analysis, Analytics Data Staging, Network Rules, Network Analytics, Individuals, Analytics, Text Analytics, Predictive Modeling, Alert Management & Reporting, Accounts, Learn and Improve Cycle, Interaction Management, Transactions, Intelligent Data Repository.]
  28. Thanks for the time!
  29. Copyright © 2006, SAS Institute Inc. All rights reserved.
