Base Rate Fallacy Sira Con 2012 05

Base Rate Fallacy: how fourfold tables can help in information security decision analysis. Understanding how to construct and use this tool helps us determine the correct probabilities of true positive, false positive, true negative, and false negative events.

Transcript

  • 1. Patrick Florer, Risk Centric Security, Inc. www.riskcentricsecurity.com. Authorized reseller of ModelRisk from Vose Software. Risk Analysis for the 21st Century®. Risk Centric Security, Inc. Confidential and Proprietary. Copyright © 2012 Risk Centric Security, Inc. All rights reserved.
  • 2. Patrick Florer has worked in information technology for 32 years. In addition, he worked a parallel track in medical outcomes research, analysis, and the creation of evidence-based guidelines for medical treatment. His roles have included IT operations, programming, and systems analysis. From 1986 until now, he has worked as an independent consultant, helping customers with strategic development, analytics, risk analysis, and decision analysis. He is a cofounder of Risk Centric Security and currently serves as Chief Technology Officer.
  • 3. What is the Base Rate Fallacy? What are fourfold tables? How do fourfold tables work? How can fourfold tables help solve information security problems? How can the use of Monte Carlo simulation improve the use of fourfold tables?
  • 4. A technology under evaluation claims: 95% accuracy in detecting malicious traffic; 15% false positive rate.
  • 5. What is the probability that a sample identified as malicious is really malicious?
  • 6. What is the probability that a sample identified as malicious is really malicious? Without knowing, or being able to estimate, the base rate in the sample or population, you cannot answer the question!
  • 7. Fourfold table layout (first sign: whether it exists; second sign: whether it was detected):

                                                Exists / True          Does not exist / Not true / False
       Identified/detected as                   True Positive (TP)     False Positive (FP)
       existing/true/positive                   (+ +)                  (- +)
       Identified/detected as                   False Negative (FN)    True Negative (TN)
       not existing/false/negative              (+ -)                  (- -)
  • 8. A fourfold table, also called a 2 x 2 table, is a four-cell table (2 rows x 2 columns) based upon two sets of dichotomous or “yes/no” facts.
  • 9. The two sets of facts could be: (1) something that exists or is true, or doesn’t exist/isn’t true; and (2) something else related to #1 that exists or is true, or doesn’t exist/isn’t true, including “something” that attempts to identify/detect whether #1 exists/is true or does not exist/is not true.
  • 10. The “something” that exists or doesn’t exist could be a disease, a virus or worm, malware, exploit code, or a malicious packet; e.g., you have a disease or you don’t; a piece of code is malicious or it isn’t. The “something else” that “identifies/detects” could be a medical diagnostic test, anti-virus/anti-malware software, an IDS/IPS system, etc. The diagnostic test or software either correctly identifies the disease, virus, or malware, or it doesn’t.
  • 11. True Positive: “something” does exist/is true and is correctly identified as existing/true. False Positive: “something” does not exist/is not true, but is incorrectly identified as existing/true. False Negative: “something” does exist/is true, but is incorrectly identified as not existing/not true. True Negative: “something” does not exist/is false, and is correctly identified as not existing/false.
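The four definitions on slide 11 amount to a lookup on two yes/no facts. The sketch below is one way to express that in Python; the function names `classify` and `tally` and the five-item sample are hypothetical illustrations, not part of the presentation.

```python
from collections import Counter


def classify(exists: bool, detected: bool) -> str:
    """Return the fourfold-table cell for one (exists, detected) observation."""
    if detected:
        # Flagged as existing: correct only if the thing really exists.
        return "TP" if exists else "FP"
    # Not flagged: a miss if the thing exists, otherwise a correct rejection.
    return "FN" if exists else "TN"


def tally(observations):
    """Count how many observations fall into each of the four cells."""
    counts = Counter({"TP": 0, "FP": 0, "FN": 0, "TN": 0})
    counts.update(classify(exists, detected) for exists, detected in observations)
    return dict(counts)


# Hypothetical sample of (really malicious?, identified as malicious?) pairs.
sample = [(True, True), (False, True), (True, False), (False, False), (False, False)]
cells = tally(sample)
print(cells)
```

Every observation lands in exactly one cell, which is why the four counts always sum to the number of observations.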
  • 12. Using the previous example: 95% accuracy in detecting malicious traffic and a 15% false positive rate, and assuming that 3% of all traffic is malicious (the prevalence).
  • 13. Out of 1 million packets: 3% are malicious = 30,000; 97% are non-malicious = 970,000.
  • 14. Out of 30,000 malicious packets: 95% are correctly identified as malicious = 28,500 (True Positive); 5% are incorrectly identified as harmless = 1,500 (False Negative).
  • 15. Out of 970,000 non-malicious packets: 15% are incorrectly identified as malicious = 145,500 (False Positive); 85% are correctly identified as non-malicious = 824,500 (True Negative).
  • 16. The completed fourfold table for 1 million packets:

                                                Actually malicious             Actually non-malicious
       Identified as malicious (Positive)       True Positive (TP) 28,500      False Positive (FP) 145,500
       Identified as non-malicious (Negative)   False Negative (FN) 1,500      True Negative (TN) 824,500
  • 17. What is the probability that a packet identified as malicious is really malicious? P(mal) = TP / (TP + FP) = 28,500 / (28,500 + 145,500) ≈ 16.4%. What happened to the 95% accuracy rate? The 95% figure describes how often malicious packets are flagged, not how often flagged packets are malicious; because malicious traffic is rare, the false positives outnumber the true positives.
  • 18. What is the probability that a packet identified as non-malicious is really non-malicious? P(not mal) = TN / (TN + FN) = 824,500 / (824,500 + 1,500) ≈ 99.8%.
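The arithmetic on slides 12 through 18 can be reproduced in a few lines. The following is a minimal Python sketch, assuming that the claimed "95% accuracy" means the share of malicious packets that are flagged; `FourfoldTable` and `build_table` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class FourfoldTable:
    tp: float  # malicious, identified as malicious
    fp: float  # non-malicious, identified as malicious
    fn: float  # malicious, identified as non-malicious
    tn: float  # non-malicious, identified as non-malicious

    @property
    def ppv(self) -> float:
        """P(really malicious | identified as malicious) = TP / (TP + FP)."""
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        """P(really non-malicious | identified as non-malicious) = TN / (TN + FN)."""
        return self.tn / (self.tn + self.fn)


def build_table(total, prevalence, detection_rate, false_positive_rate):
    """Split `total` items into the four cells from three rates."""
    malicious = total * prevalence
    benign = total - malicious
    tp = malicious * detection_rate
    fn = malicious - tp
    fp = benign * false_positive_rate
    tn = benign - fp
    return FourfoldTable(tp, fp, fn, tn)


# Inputs from the slides: 1 million packets, 3% prevalence,
# 95% detection rate, 15% false positive rate.
table = build_table(1_000_000, 0.03, 0.95, 0.15)
print(f"TP={table.tp:,.0f} FP={table.fp:,.0f} FN={table.fn:,.0f} TN={table.tn:,.0f}")
print(f"PPV={table.ppv:.1%} NPV={table.npv:.1%}")
```

With the slide inputs, the four cells come out to 28,500, 145,500, 1,500, and 824,500, and the two probabilities to roughly 16.4% and 99.8%.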
  • 19. Some things to remember: The numbers in the four cells must add up to 100% of the total number being analyzed (1M in this example). As the base rate approaches 100%, the base rate fallacy ceases to apply.
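The second point on slide 19 can be checked by holding the detector's rates fixed and varying only the base rate. This is a small Python sketch under that assumption; the function name `ppv` and the chosen base rates are illustrative, and the 95% and 15% defaults are the figures claimed on slide 4.

```python
def ppv(prevalence, detection_rate=0.95, false_positive_rate=0.15):
    """P(really malicious | identified as malicious) for a given base rate."""
    true_pos = prevalence * detection_rate
    false_pos = (1.0 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)


# Same detector, different base rates (illustrative values).
for base_rate in (0.001, 0.03, 0.25, 0.50, 0.90, 0.99):
    print(f"base rate {base_rate:6.1%} -> P(malicious | flagged) = {ppv(base_rate):.1%}")
```

The probability that a flagged packet is really malicious rises steadily with the base rate, so the gap between the advertised detection rate and the real-world result narrows as malicious traffic becomes common.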
  • 20. What’s the difference? Prevalence: cross-sectional; how much is out there right now? Incidence: longitudinal; the proportion of new cases found during a time period. Both prevalence and incidence can be expressed as rates.
  • 21. How can the use of Monte Carlo simulation improve the use of fourfold tables? Examples in Excel.
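The Excel examples referred to on slide 21 are not part of this transcript. As a stand-in, the sketch below shows the general idea in Python using only the standard library: instead of single point estimates, each input is drawn from a range and the fourfold calculation is repeated many times. The triangular distributions, their bounds, the seed, and the name `simulate_ppv` are all assumptions made for illustration; they are not the author's model and do not use ModelRisk.

```python
import random
import statistics


def simulate_ppv(trials=10_000, seed=2012):
    """Repeat the fourfold-table calculation with uncertain inputs."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    results = []
    for _ in range(trials):
        # Assumed ranges: triangular(low, high, mode) around the slide values.
        prevalence = rng.triangular(0.01, 0.06, 0.03)
        detection_rate = rng.triangular(0.90, 0.98, 0.95)
        false_positive_rate = rng.triangular(0.10, 0.20, 0.15)
        true_pos = prevalence * detection_rate
        false_pos = (1.0 - prevalence) * false_positive_rate
        results.append(true_pos / (true_pos + false_pos))
    return results


ppvs = sorted(simulate_ppv())
low = ppvs[int(0.05 * len(ppvs))]
high = ppvs[int(0.95 * len(ppvs))]
print(f"median P(malicious | flagged) = {statistics.median(ppvs):.1%}")
print(f"5th to 95th percentile: {low:.1%} to {high:.1%}")
```

The output is a distribution rather than a single number, which shows how sensitive the answer is to uncertainty in the base rate and in the vendor's claimed rates.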
  • 22. Thank you! Risk Analysis for the 21st Century®. Patrick Florer, CTO and Co-founder, Risk Centric Security, Inc. patrick@riskcentricsecurity.com, 214.828.1172. Authorized reseller of ModelRisk from Vose Software.