Sunz 2010   Evan Stubbs   When Good Intentions Fail

Transcript

  • 1. When Good Intentions Fail
    Tips on avoiding common advanced analytics traps
    Evan Stubbs
    Solution Manager, ANZ – SAS
    16th February, 2010
  • 2. Today’s Agenda
    Four (hopefully) thought-provoking statements
    Some answers
  • 3. My provocative statements for the day …
    Seeing only part of the picture is worse than seeing nothing at all.
    Rule-based detection systems will seduce, distract, and eventually trap you.
    Focusing on tools is the fastest road to failure.
    Insight generated in isolation is less than useless and will actually hurt you.
  • 4. Seeing only part of the picture is worse than seeing nothing at all.
  • 5. Anyone know what this is?
    Formula courtesy of Wired: http://www.wired.com/techbiz/it/magazine/17-03/wp_quant?currentPage=all
  • 6. Consider this …
    A process identifies non-state individuals conspiring against the government based on:
    The contents of their communications
    Their communication methods of choice
    The frequency of their interactions
    If the individuals are conspiring, 99% of the time the test will be positive
    If the individuals are not conspiring, 99% of the time the test will be negative
  • 7. So we execute!
    The test is put into production
    A collection of individuals are identified as conspiring against the government
    The test is known to be 99% accurate, so enforcement is mobilised and set into action
    Pretty conclusive, right?
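    In fact, a positive result may be wrong about 99.98% of the time, despite the test being 99% accurate (Huh?!?)
  • 8. Here’s why …
    Few people actually conspire against the government:
    Assume 1 / 500,000 people actually conspire
    Assume Australia’s population is 22 million
    General formula:
    True positives = Population * Incidence rate * Test Efficiency = 22,000,000 * (1 / 500,000) * 0.99 ≈ 44
    False positives = (Population - Actual conspirators) * (1 - Test Efficiency) = (22,000,000 - 44) * 0.01 ≈ 220,000
    A positive result will be wrong in about 99.98% of cases (roughly 220,000 of 220,044 positives), despite the test being 99% accurate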
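The arithmetic behind this slide can be checked with a few lines of Python. This is a minimal sketch, not part of the original deck: the function name `positive_predictive_value` and its parameter names are illustrative, and it assumes "99% accurate" means both a 99% detection rate for real conspirators and a 99% clearance rate for everyone else, as stated on slide 6.

```python
def positive_predictive_value(prevalence, sensitivity, specificity, population):
    """Return (true_positives, false_positives, ppv) for a screening test.

    prevalence  -- fraction of the population that is actually conspiring
    sensitivity -- P(test positive | conspiring)
    specificity -- P(test negative | not conspiring)
    """
    actual = population * prevalence
    true_pos = actual * sensitivity
    false_pos = (population - actual) * (1.0 - specificity)
    ppv = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, ppv


# Figures taken from the slides: 22 million people, 1 in 500,000 conspiring,
# and a test that is right 99% of the time in both directions.
true_pos, false_pos, ppv = positive_predictive_value(
    prevalence=1 / 500_000,
    sensitivity=0.99,
    specificity=0.99,
    population=22_000_000,
)

print(f"True positives:  {true_pos:,.0f}")
print(f"False positives: {false_pos:,.0f}")
print(f"Share of positive results that are wrong: {1 - ppv:.2%}")
```

With these inputs, about 44 genuine hits are swamped by roughly 220,000 false alarms, which is why a positive result on its own says so little.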
  • 9. The Lessons
    If you look through a keyhole, you’ll only ever see a tiny part of the room.
    If you rely too heavily on a single detection method, you will be wrong, catastrophically so at times.
    It’s only a matter of time.
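One way to see why a single detection method is not enough is to ask how the odds change when further, independent checks also come back positive. The sketch below is an illustration rather than anything from the original slides: the `update` function is hypothetical, it reuses the 1 in 500,000 incidence and 99% accuracy from the earlier example, and it assumes each extra test is fully independent of the others, which real detection methods rarely are.

```python
def update(prior, sensitivity, specificity):
    """Posterior probability of conspiring after one more positive result (Bayes' rule)."""
    true_pos = prior * sensitivity
    false_pos = (1.0 - prior) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


prior = 1 / 500_000  # incidence assumed on the slides
belief = prior
posteriors = []

# Apply three positive results in a row, each from an (assumed) independent
# test that is 99% accurate in both directions.
for _ in range(3):
    belief = update(belief, sensitivity=0.99, specificity=0.99)
    posteriors.append(belief)

for n, p in enumerate(posteriors, start=1):
    print(f"After {n} independent positive result(s): {p:.2%} chance of a real conspirator")
```

Under these assumptions one positive leaves the chance of a real conspirator at about 0.02%, two raise it to about 2%, and three to about 66%. Confirming techniques matter far more than squeezing extra accuracy out of a single test.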
  • 10. Anyone know what this is?
    David X. Li’s Gaussian Copula function, the formula that almost brought down the financial world
  • 11. Rule-based detection systems will seduce, distract, and eventually trap you.
  • 12. Another one …
    Identification of the communication point of a seditious cell could involve
    Their relationships
    The directionality and frequency of ‘interesting’ communication
    Analysis of the information shows that two individuals are equally possible information dissemination points
    There is one standout who, over three months, leads the number of ‘interesting’ messages sent
    Pretty conclusive, right?
  • 13. Nope, yet again …
    Bad rules lead to bad results.
    Even worse, you may not know until well after the fact!
  • 14. The Lessons
    Rules don’t work well with ‘context’, but they do provide a false sense of security.
    Maintaining a rules list can be a fun job in its own right!
    Rule-based detection works great when your subjects maintain their behaviour and are happy to be observed. How often does that happen?
  • 15. Focusing on tools is the fastest road to failure.
  • 16. There are many methodologies …
    [Diagram: Methodology Tree for Forecasting (forecastingprinciples.com, JSA-KCG, September 2005)]
    The tree splits methods by knowledge source (judgmental or statistical), then by criteria such as self/others, role/no role, unstructured/structured, feedback/no feedback, univariate/multivariate, data-based/theory-based, and linear/classification.
    Methods shown: unaided judgment, role playing (simulated interaction), intentions/expectations, conjoint analysis, prediction markets, Delphi, structured analogies, game theory, decomposition, judgmental bootstrapping, expert systems, extrapolation models, quantitative analogies, rule-based forecasting, neural nets, data mining, causal models, segmentation.
  • 17. And picking an approach can be complicated …
    [Diagram: Selection Tree for Forecasting Methods (forecastingprinciples.com, JSA-KCG, January 2006)]
    A yes/no decision tree that starts with whether sufficient objective data is available (no: judgmental methods; yes: quantitative methods).
    Later questions include: large changes expected, good knowledge of relationships, conflict among a few decision makers, policy analysis, type of data (time series or cross-section), accuracy feedback, similar cases exist, good domain knowledge, and type of knowledge (domain or self).
    Methods at the leaves: unaided judgment, Delphi/prediction markets, judgmental bootstrapping/decomposition, conjoint analysis, intentions/expectations, role playing (simulated interaction/game theory), structured analogies, expert systems, rule-based forecasting, extrapolation/neural nets/data mining, causal models/segmentation, quantitative analogies.
    Final steps: if several methods provide useful forecasts, combine forecasts, otherwise use a single method; if information was omitted, use the adjusted forecast, otherwise use the unadjusted forecast.
  • 18. Six months later …
  • 19. Here’s a simpler approach …
    Which one gives me the answers?
    Which one lets me automate the manual stuff?
    Which one plays with everything else I have?
  • 20. The Lessons
    The tools aren’t as important as answering the question quickly, accurately, and in a way that can be executed.
    Focus on solving the intelligence problem, not on the colour of widget X.
  • 21. Insight generated in isolation is less than useless and will actually hurt you.
  • 22. Evan’s Generalised Formula for Analysis Paralysis
    Every isolated information source, s, will create p new ‘possibilities’
    Comparing and validating each of these possibilities will take t time
    The total time to compare and validate these possibilities:
    ((s*p) * ((s*p) - 1) / 2) * t
  • 23. Evan’s Generalised Formula for Analysis Paralysis
    Let’s say you have:
    Five people
    Each coming up with their own set of ten calculations
    On their standalone desktops with their own extract of data
    And it takes two hours to validate and compare who has the ‘best’ answer
    Total time elapsed: 2,450 hours, or about 306 eight-hour work days of wasted team effort (roughly 61 work days for each of the five people)
    And this is just for one small case!
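The formula from the previous slide is simply a count of pairwise comparisons multiplied by the time each one takes. The following sketch reproduces the worked example; the function name `validation_hours` is illustrative, and the eight-hour work day used to convert hours into days is an assumption, since the slides do not state the length of a work day.

```python
def validation_hours(sources, possibilities_per_source, hours_per_comparison):
    """Hours needed to compare every possibility against every other one.

    Implements (((s*p) * ((s*p) - 1)) / 2) * t from the slide.
    """
    n = sources * possibilities_per_source  # total possibilities, s * p
    pairs = n * (n - 1) // 2                # number of distinct pairs to compare
    return pairs * hours_per_comparison


HOURS_PER_WORK_DAY = 8  # assumption: not stated on the slides

hours = validation_hours(sources=5, possibilities_per_source=10, hours_per_comparison=2)
work_days = hours / HOURS_PER_WORK_DAY

print(f"Total hours: {hours:,}")
print(f"Work days:   {work_days:,.2f}")
print(f"Per person:  {work_days / 5:,.2f} work days")

# Adding a sixth person with their own ten calculations:
print(f"With six people: {validation_hours(6, 10, 2):,} hours")
```

Five people produce 2,450 hours of comparison work, and a sixth pushes that to 3,540 hours: the cost rises with the square of the number of isolated sources, not in proportion to it.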
  • 24. The Lessons
    Every time you create a new standalone data source, your pointless workload grows quadratically.
    Every time you use another non-integrated tool, you waste time and money.
    Make sure your tools operationalise on a common platform, even if you find you must use multiple tools.
  • 25. The Answers …
  • 26. The Core Answers
    Focus on solving the problem
    Build a process that uses a wide range of validating / confirming techniques
    Integrate, re-use, automate, and operationalise everything
    Measure success by business outcomes, not models developed
    Keep things as simple as possible, but no simpler
  • 27. Integrated Business Analytics
    [Diagram: an integrated analytics and alerting architecture. Labels recoverable from the slide include: Operational Data Sources; Exploratory Data Analysis & Transformation; Analytics Data Staging; Intelligent Data Repository; Individuals; Accounts; Transactions; Alert Generation Process; Business Rules; Network Rules; Network Analytics; Social Network Analysis; Text Analytics; Predictive Modeling; Alert Administration; Alert Management & Reporting; Interaction Management; Learn and Improve Cycle.]
  • 28. Thanks for the time!
  • 29. Copyright © 2006, SAS Institute Inc. All rights reserved.