SOURCE Seattle 2012

The Base Rate Fallacy and Why it matters to information security professionals.

Transcript

  • 1. WHY THE BASE RATE FALLACY MATTERS TO INFOSEC & HOW TO AVOID IT. September 14, 2012. Patrick Florer and Jeff Lowder, CISSP.
  • 2. Agenda: (1) Introduction, (2) Base Rate Fallacy, (3) Natural Frequencies, (4) Fourfold Tables & InfoSec.
  • 3. Agenda: (1) Introduction, (2) Base Rate Fallacy, (3) Natural Frequencies, (4) Fourfold Tables & InfoSec.
  • 4. Patrick Florer • Cofounder and CTO of Risk Centric Security. • Fellow of and Chief Research Analyst at the Ponemon Institute. • 32 years of IT experience, including roles in IT operations, development, and systems analysis. • In addition, he worked a parallel track in medical outcomes research, analysis, and the creation of evidence-based guidelines for medical treatment. • From 1986 until now, he has worked as an independent consultant, helping customers with strategic development, analytics, risk analysis, and decision analysis.
  • 5. Jeff Lowder • President of the Society for Information Risk Analysts (www.societyinforisk.org) • Director, Global Information Security and Privacy, OpenMarket • Industry thought leader who leads world-class security organizations by building and implementing custom methodologies and frameworks that balance information protection and business agility. • 16 years of experience in information security and risk management • Previous leadership roles include: – Director, Information Security, The Walt Disney Company – Manager, Network Security, NetZero/United Online – Director, Security and Privacy, Elemica – Director, Network Security, United States Air Force Academy
  • 6. Agenda: (1) Introduction, (2) Base Rate Fallacy, (3) Natural Frequencies, (4) Fourfold Tables & InfoSec.
  • 7. Quick Check #1: How Good Are You at Estimating Risk? “I saw an orange cab” What color was the cab? The witness gets the color right 80% of the time
  • 8. Quick Check #1: What is the probability the cab was orange? (a) 80% (b) Somewhere between 50 and 80% (c) 50% (d) Less than 50% (the cab was green)
  • 9. Quick Check #2: How Good Are You at Estimating Risk? 85% of the taxis on the road are green cabs. 15% of the taxis on the road are orange cabs. “I saw an orange cab” What color was the cab? The witness gets the color right 80% of the time
  • 10. Quick Check #2: What is the probability the cab was orange? (a) 80% (b) Somewhere between 50 and 80% (c) 50% (d) Less than 50% (the cab was green)
  • 11. Quick Check #3 • 850 out of every 1,000 taxis are green cabs. • 150 out of every 1,000 taxis are orange cabs. • Of these 850 green cabs, the witness will see 170 of them as orange. • Of the 150 remaining orange cabs, the witness will see 120 of them as orange. • The witness says she saw an orange cab.
  • 12. Quick Check #3: What is the probability the cab was orange? (a) 80% (b) Somewhere between 50 and 80% (c) 50% (d) Less than 50% (the cab was green)
  • 13. Correct Answers. Quick Check #1: Unknown. Without some idea of the base rate of orange cabs, it is impossible to estimate the probability that the witness is correct. Quick Check #2: (d) Less than 50% (the cab was green). Quick Check #3: (d) Less than 50% (the cab was green). The exact probability is 41%.
  • 14. Base Rates and the Base Rate Fallacy • Base Rate: “The base rate of an attribute (or event) in a population is the proportion of individuals manifesting that attribute (at a certain point in time). A synonym for base rate is prevalence.” Source: Gerd Gigerenzer, Calculated Risks: How to Know When Numbers Deceive You • Base Rate Fallacy: A fallacy of statistical reasoning that occurs when both base rate information and specific information about an individual case are available. The fallacy is to draw a conclusion about the probability (or frequency) from the specific information alone, ignoring the base rate.
  • 15. How Did Your Peers Do (with the Taxi Cab Question)? Jeff Lowder’s research: how often do GRC professionals commit the base rate fallacy? Distribution of answers: (a) 35%, (b) 45%, (c) 15%, (d) 5%. When presented with conditional probabilities (such as Quick Check #2), 95% of GRC professionals commit the base rate fallacy.
  • 16. Why the Base Rate Matters to InfoSec (Example | Output | Relevant Base Rate):
      - Common Vulnerability Scoring System | Vuln scores (0-10) used to prioritize remediation | Base rate of vuln exploitation
      - Vulnerability Scanners | Scan results | Base rate of vulnerabilities
      - Intrusion Detection Systems | Intrusion alerts | Base rate of intrusion events
      - Anti-Virus / Malware / Spyware | Virus / malware / spyware alerts | Base rate of virus / malware / spyware events
      - Risk Analysis | Residual risk | Base rate of hazard (part of inherent risk)
      - Third-Party Assurance | Third-party audit report | Base rate of compliant vendors; base rate of accurate audit reports
  • 17. Why the Base Rate Matters to InfoSec (Example | Questions to Ask):
      - Common Vulnerability Scoring System | What is the base rate of vuln exploitation?
      - Vulnerability Scanners | What is the base rate of each type of vulnerability?
      - Intrusion Detection Systems | What is the base rate of intrusion events?
      - Anti-Virus / Malware / Spyware | What is the base rate of virus / malware / spyware events?
      - Risk Analysis | What is the base rate of the hazard (part of inherent risk)?
      - Third-Party Assurance | What is the base rate of compliant vendors? What is the base rate of accurate audit reports?
  • 18. Agenda: (1) Introduction, (2) Base Rate Fallacy, (3) Natural Frequencies, (4) Fourfold Tables & InfoSec.
  • 19. Use Natural Frequencies to Avoid the Base Rate Fallacy. Of 1,000 cabs, 850 are green (680 seen as green by the witness, 170 seen as orange) and 150 are orange (120 seen as orange by the witness, 30 seen as green). Pr(orange | witness says it was orange) = 120 / (170 + 120) = 41%.
  • 20. Use Natural Frequencies to Avoid the Base Rate Fallacy. Same tree: 1,000 cabs, 850 green (680 seen as green, 170 seen as orange), 150 orange (120 seen as orange, 30 seen as green). Pr(green | witness says it was orange) = 170 / (170 + 120) = 59%.
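
A minimal Python sketch of the natural-frequency arithmetic on slides 19-20, using the counts from the deck (850 green and 150 orange cabs per 1,000, witness correct 80% of the time). The variable names are illustrative, not from the deck.

    # Natural-frequency counts from the taxi example (per 1,000 cabs).
    green_cabs, orange_cabs = 850, 150
    witness_accuracy = 0.80  # witness reports the true color 80% of the time

    # Split each group by what the witness reports.
    green_seen_orange = green_cabs * (1 - witness_accuracy)   # 170
    orange_seen_orange = orange_cabs * witness_accuracy       # 120

    says_orange = green_seen_orange + orange_seen_orange      # 290 "orange" reports
    p_orange = orange_seen_orange / says_orange               # ~0.41
    p_green = green_seen_orange / says_orange                 # ~0.59

    print(f"Pr(orange | says orange) = {p_orange:.0%}")
    print(f"Pr(green  | says orange) = {p_green:.0%}")
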
  • 21. Natural Frequencies Work! “People tend to get the right answers using natural frequencies much more often than using conditional probabilities: 80% vs. 33%.” Gerd Gigerenzer, Max Planck Institute (Berlin), former Professor of Psychology (University of Chicago)
  • 22. True Positives (aka “hits”). Assumption: the cab was green. Of 1,000 cabs: 850 green (680 seen as green, 170 seen as orange) and 150 orange (120 seen as orange, 30 seen as green). Let TP = number of True Positives = # of times the cab was seen as green AND the cab was green. TP = 680.
  • 23. False Positives (aka “false alarms” or “Type I errors”). Assumption: the cab was green. Let FP = number of False Positives = # of times the cab was seen as green AND the cab was orange. FP = 30.
  • 24. True Negatives. Assumption: the cab was green. Let TN = number of True Negatives = # of times the cab was seen as orange AND the cab was orange. TN = 120.
  • 25. False Negatives (aka “misses” or “Type II errors”). Assumption: the cab was green. Let FN = number of False Negatives = # of times the cab was seen as orange AND the cab was green. FN = 170.
  • 26. Summary. Assumption: the cab was green. Fourfold table of the 1,000 cabs (actual color vs. what the witness reported):
      - Seen as green cab: Green cab = 680 True Positives (TP = 680) | Orange cab = 30 False Positives (FP = 30)
      - Seen as orange cab: Green cab = 170 False Negatives (FN = 170) | Orange cab = 120 True Negatives (TN = 120)
  • 27. Agenda: (1) Introduction, (2) Base Rate Fallacy, (3) Natural Frequencies, (4) Fourfold Tables & InfoSec.
  • 28. A question: Let’s assume a technology that detects malicious packets: 95% accuracy in detecting malicious packets, 15% false positive rate. What is the probability that a packet identified as malicious is really malicious?
  • 29. What are fourfold tables? (Columns: Exists/True | Does not exist/Not true/False)
      - Identified/Detected as existing/true/positive: True Positive (TP) (+ +) | False Positive (FP) (- +)
      - Identified/Detected as not existing/false/negative: False Negative (FN) (+ -) | True Negative (TN) (- -)
  • 30. What are fourfold tables? A fourfold table, also called a 2 x 2 table, is a four-cell table (2 rows x 2 columns) based upon two sets of dichotomous or “yes/no” facts.
  • 31. What are fourfold tables? The two sets of facts could be: Something that exists or is true, or doesn’t exist/isn’t true, and Something else related to #1 that exists or is true, or doesn’t exist/isn’t true, including “something” that attempts to identify/detect whether #1 exists/is true or #1 does not exist/is not true.
  • 32. What are fourfold tables? The “something” that exists or doesn’t exist could be a disease, a virus or worm, malware, exploit code, or a malicious packet: i.e., you have a disease or you don’t; a piece of code is malicious or it isn’t, etc. The “something else” that “identifies/detects” could be a medical diagnostic test, anti-virus/anti-malware software, IDS/IPS systems, etc. The diagnostic test or software either correctly identifies the disease, virus, or malware, or it doesn’t.
  • 33. What are fourfold tables? True Positive: “something” does exist/is true and is correctly identified as existing/true. False Positive: “something” does not exist/is not true, but is incorrectly identified as existing/true. True Negative: “something” does not exist/is false, and is correctly identified as not existing/false. False Negative: “something” does exist/is true, but is incorrectly identified as not existing/not true.
  • 34. How do fourfold tables work? Using the previous example: 95% accuracy in detecting malicious traffic, 15% false positive rate, and assuming that 3% of all traffic is malicious (prevalence).
  • 35. How do fourfold tables work? Out of 1 million packets: 3% are malicious = 30,000; 97% are non-malicious = 970,000.
  • 36. How do fourfold tables work? Out of 30,000 malicious packets: 95% are correctly identified as malicious = 28,500 (True Positive); 5% are incorrectly identified as harmless = 1,500 (False Negative).
  • 37. How do fourfold tables work? Out of 970,000 non-malicious packets: 15% are incorrectly identified as malicious = 145,500 (False Positive); 85% are correctly identified as non-malicious = 824,500 (True Negative).
  • 38. How do fourfold tables work? The resulting fourfold table:
      - Positive findings: True Positive (TP) = 28,500 | False Positive (FP) = 145,500
      - Negative findings: False Negative (FN) = 1,500 | True Negative (TN) = 824,500
  • 39. Questions we can answer with fourfold tables: What is the probability that a packet identified as malicious is really malicious? P(mal) = TP / (TP + FP) = 28,500 / (28,500 + 145,500) = 16.4%. What happened to the 95% accuracy rate?
  • 40. Questions we can answer with fourfold tables: What is the probability that a packet identified as non-malicious is really non-malicious? P(<>mal) = TN / (TN + FN) = 824,500 / (824,500 + 1,500) = 99.8%.
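
As a cross-check on the packet arithmetic above, here is a short, illustrative Python sketch (not from the original deck) that builds the fourfold table from a base rate, a detection rate, and a false positive rate. The function name and parameters are assumptions made for the example.

    def fourfold(total, prevalence, detection_rate, false_positive_rate):
        """Return (TP, FP, FN, TN) counts for a detector and a base rate."""
        positives = total * prevalence            # truly malicious packets
        negatives = total - positives             # truly benign packets
        tp = positives * detection_rate           # malicious and flagged
        fn = positives - tp                       # malicious but missed
        fp = negatives * false_positive_rate      # benign but flagged
        tn = negatives - fp                       # benign and passed
        return tp, fp, fn, tn

    tp, fp, fn, tn = fourfold(1_000_000, 0.03, 0.95, 0.15)
    print(f"TP={tp:,.0f} FP={fp:,.0f} FN={fn:,.0f} TN={tn:,.0f}")
    print(f"P(mal | flagged)      = {tp / (tp + fp):.1%}")   # ~16.4%
    print(f"P(benign | unflagged) = {tn / (tn + fn):.1%}")   # ~99.8%
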
  • 41. Questions we can answer with fourfold tables: Some things to remember: The numbers in the four cells must add up to 100% of the total number being analyzed (1M in this example). As the base rate approaches 100%, the base rate fallacy ceases to apply.
  • 42. Contact Info.
      Patrick Florer, CTO and Cofounder, Risk Centric Security, Inc. Web: www.riskcentricsecurity.com. Email: patrick@riskcentricsecurity.com
      Jeff Lowder, President, Society for Information Risk Analysts (SIRA). Web: www.jefflowder.com. Blog: bloginfosec.com and www.societyinforisk.org. Twitter: @agilesecurity
      Proprietary. Copyright © 2012 by Jeff Lowder (www.jefflowder.com) and Risk Centric Security, Inc (www.riskcentricsecurity.com). All rights reserved.
  • 43. APPENDIX
  • 44. Accuracy (aka Efficiency). Assumption: the cab was green. What is the percentage of correct observations made by the witness? Accuracy = (TP + TN) / (TP + TN + FP + FN) = (# of cabs correctly seen as green + # of cabs correctly seen as orange) / (# of cabs seen) = (680 + 120) / (680 + 120 + 30 + 170) = 80%.
  • 45. True Positive Rate (aka “Sensitivity”). Assumption: the cab was green. True Positive Rate (Sensitivity) = TP / (TP + FN) = # of cabs correctly seen as green / # of green cabs = 680 / (680 + 170) = 80%.
  • 46. False Positive Rate (aka “False Alarm Rate”). Assumption: the cab was green. False Positive Rate (α) = FP / (FP + TN) = # of cabs incorrectly seen as green / # of orange cabs = 30 / (30 + 120) = 20%.
  • 47. True Negative Rate (aka “Specificity”). Assumption: the cab was green. True Negative Rate (Specificity) = TN / (TN + FP) = # of cabs correctly seen as orange / # of orange cabs = 120 / (120 + 30) = 80%.
  • 48. False Negative Rate. Assumption: the cab was green. False Negative Rate (β) = FN / (FN + TP) = # of green cabs incorrectly seen as orange / # of green cabs = 170 / (170 + 680) = 20%.
  • 49. Summary of rates (actual color vs. what the witness reported):
      - Seen as green cab: Green cab = 80% TP Rate (Sensitivity = 80%) | Orange cab = 20% FP Rate (α = 20%)
      - Seen as orange cab: Green cab = 20% FN Rate (β = 20%) | Orange cab = 80% TN Rate (Specificity = 80%)
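
The appendix rates above can be reproduced in a few lines of Python; this is an illustrative sketch using the taxi counts (TP = 680, FP = 30, TN = 120, FN = 170), not code from the deck.

    # Taxi example counts, with "positive" meaning the cab was green.
    TP, FP, TN, FN = 680, 30, 120, 170

    accuracy    = (TP + TN) / (TP + TN + FP + FN)   # 0.80
    sensitivity = TP / (TP + FN)                    # true positive rate, 0.80
    fp_rate     = FP / (FP + TN)                    # false alarm rate (alpha), 0.20
    specificity = TN / (TN + FP)                    # true negative rate, 0.80
    fn_rate     = FN / (FN + TP)                    # miss rate (beta), 0.20

    print(f"accuracy={accuracy:.0%} sensitivity={sensitivity:.0%} "
          f"FPR={fp_rate:.0%} specificity={specificity:.0%} FNR={fn_rate:.0%}")
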
  • 50. Why the Base Rate Matters to InfoSec (Example | Questions to Ask):
      - Common Vulnerability Scoring System | What is the base rate of vuln exploitation?
      - Vulnerability Scanners | What is the base rate of each type of vulnerability?
      - Intrusion Detection Systems | What is the maximum acceptable false positive rate for my organization’s IDS?
      - Anti-Virus / Malware / Spyware | What is the base rate of virus / malware / spyware events?
      - Risk Analysis | What is the base rate of the hazard (part of inherent risk)?
      - Third-Party Assurance | What is the base rate of compliant vendors? What is the base rate of accurate audit reports?
  • 51. True Positive Rate (aka “Sensitivity”) for IBM AppScan and SQL Injection Vulns. Source: “The SQL Injection Detection Accuracy of Web Application Scanners,” SecToolMarket.com. 146 test cases: 136 vulnerable (136 detected as vulnerable by IBM AppScan, 0 detected as non-vulnerable) and 10 non-vulnerable (3 detected as vulnerable, 7 detected as not vulnerable). True Positive Rate (Sensitivity) = TP / (TP + FN) = # of test cases correctly identified as vulnerable / # of vulnerable test cases = 136 / (136 + 0) = 100%.
  • 52. True Negative Rate (aka “Specificity”) for IBM AppScan and SQL Injection Vulns. Same data. True Negative Rate (Specificity) = TN / (TN + FP) = # of test cases correctly identified as non-vulnerable / # of non-vulnerable test cases = 7 / (7 + 3) = 70%.
  • 53. Positive Prediction Value (PPV). Assumption: the cab was green. If the witness says the cab was green, what’s the probability the witness is right? PPV = TP / (TP + FP) = # of cabs correctly seen as green / # of cabs seen as green (correctly or incorrectly) = 680 / (680 + 30) = 95.77%.
  • 54. Negative Prediction Value (NPV). Assumption: the cab was green. If the witness says the cab was orange, what’s the probability the witness is right? NPV = TN / (TN + FN) = # of cabs correctly seen as orange / # of cabs seen as orange (correctly or incorrectly) = 120 / (120 + 170) = 41.38%.
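
A short continuation of the illustrative sketch above (not from the deck) showing PPV and NPV for the same taxi counts:

    # PPV/NPV for the taxi counts: TP=680, FP=30, TN=120, FN=170.
    TP, FP, TN, FN = 680, 30, 120, 170

    ppv = TP / (TP + FP)   # P(cab was green  | witness says green)  ~= 95.8%
    npv = TN / (TN + FN)   # P(cab was orange | witness says orange) ~= 41.4%
    print(f"PPV = {ppv:.2%}, NPV = {npv:.2%}")

The low NPV here is the 41% answer to Quick Check #3: even when the witness says “orange,” the cab is more likely to be green.
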
  • 55. Example: PPV for IBM AppScan and SQL Injection Vulns. Same test data (136 vulnerable and 10 non-vulnerable test cases). If IBM AppScan says a SQL injection vulnerability is present, what’s the probability that IBM AppScan is right? PPV = TP / (TP + FP) = # of test cases correctly identified as vulnerable / # of test cases identified as vulnerable (correctly or incorrectly) = 136 / (136 + 3) = 97.84%.
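
The same three formulas applied to the published AppScan test-case counts, as an illustrative Python check (not code from the deck):

    # IBM AppScan SQL injection benchmark counts from SecToolMarket.com:
    # 136 vulnerable test cases (all detected), 10 non-vulnerable (3 flagged).
    TP, FN = 136, 0     # vulnerable cases: detected vs. missed
    TN, FP = 7, 3       # non-vulnerable cases: correctly passed vs. flagged

    sensitivity = TP / (TP + FN)   # 100%
    specificity = TN / (TN + FP)   # 70%
    ppv_testset = TP / (TP + FP)   # ~97.84% on this test set
    print(f"sensitivity={sensitivity:.0%} specificity={specificity:.0%} PPV={ppv_testset:.2%}")

Note that 97.84% is the PPV on a test set where roughly 93% of cases are vulnerable; against a code base with a much lower base rate of SQL injection flaws, the PPV would be correspondingly lower.
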
  • 56. Bayes Theorem: P(A|B) = P(B|A) * P(A) / P(B), where: P(A) = prevalence of malware, P(B) = probability of a positive finding, P(B|A) = probability of correctly identifying malware.
  • 57. Bayes Theorem: Using the malware example: P(A) = 3% (prevalence of malware), P(B|A) = 95% (detection rate), P(B) = 17.4% (overall rate of positive findings = (28,500 + 145,500) / 1,000,000). P(A|B) = (.95 * .03) / .174 = .1637 = 16.4%.
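
The same Bayes calculation in Python, deriving P(B) with the law of total probability rather than reading it off the fourfold table; an illustrative sketch, with variable names that are not from the deck.

    # Bayes' theorem applied to the malware example on the slides.
    p_a = 0.03             # P(A): prevalence of malware (base rate)
    p_b_given_a = 0.95     # P(B|A): probability the detector flags real malware
    false_positive = 0.15  # P(B|not A): probability the detector flags clean traffic

    # P(B): overall probability of a positive finding (law of total probability).
    p_b = p_b_given_a * p_a + false_positive * (1 - p_a)   # = 0.174

    p_a_given_b = p_b_given_a * p_a / p_b                  # ~= 0.164
    print(f"P(B) = {p_b:.3f}, P(malicious | flagged) = {p_a_given_b:.1%}")
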
  • 58. Speaking of prevalence and incidence - What’s the difference? Prevalence: cross-sectional, how much is out there right now? Incidence: longitudinal, a proportion of new cases found during a time period. Both prevalence and incidence can be expressed as rates.
  • 59. Monte Carlo Simulation. How can the use of Monte Carlo simulation improve the use of fourfold tables? Examples in Excel.
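
The deck demonstrates this in Excel; the sketch below is an equivalent, illustrative Python version. The triangular input ranges around the malware example’s 3% prevalence, 95% detection rate, and 15% false positive rate are assumptions made purely for illustration.

    import random

    def simulate_ppv(trials=100_000):
        """Monte Carlo over uncertain detector inputs; returns sampled PPV values."""
        ppvs = []
        for _ in range(trials):
            prevalence = random.triangular(0.01, 0.05, 0.03)   # uncertain base rate
            sensitivity = random.triangular(0.90, 0.99, 0.95)  # detection rate
            fp_rate = random.triangular(0.10, 0.20, 0.15)      # false positive rate
            tp = prevalence * sensitivity                      # flagged and malicious
            fp = (1 - prevalence) * fp_rate                    # flagged but benign
            ppvs.append(tp / (tp + fp))                        # P(malicious | flagged)
        return ppvs

    ppvs = sorted(simulate_ppv())
    print(f"median PPV ~ {ppvs[len(ppvs) // 2]:.1%}, "
          f"90% interval ~ {ppvs[int(0.05 * len(ppvs))]:.1%} "
          f"to {ppvs[int(0.95 * len(ppvs))]:.1%}")

Instead of a single 16.4% point estimate, the simulation yields a distribution of PPV values that reflects uncertainty in the base rate and in the detector’s characteristics.
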
