The Base Rate Fallacy - Source Boston 2013

A base rate is the prevalence of an item of interest in a population. In medicine, it would be the prevalence of a disease in a group of people. In information security, it might be the prevalence of SQL injection flaws in web applications or the prevalence of malware in the population of downloaded *.exe files. Without an estimate of the base rate, it isn’t possible to talk meaningfully about detection rates (true positives) or false positives. Those who do so commit the “base rate fallacy.” If the base rate is known, then a fourfold table, also called a 2 x 2 table or matrix, is a mechanism that helps us understand the correct probabilities of True Positive, False Positive, True Negative, and False Negative events and avoid the base rate fallacy. Understanding these probabilities enables us to evaluate the claims of many types of security technologies, including the effectiveness of antivirus software, web application scanners, and IDS/IPS systems.
  • The base rate fallacy, explained and demonstrated
  • Gigerenzer’s natural frequencies technique for avoiding the base rate fallacy
  • Examples of why base rates apply to information risk management:
    Common Vulnerability Scoring System (CVSS)
    The distinction between inherent risk and residual risk
    Intrusion Detection Systems
    Vendor management, hosting providers, and SOC 2 (formerly SAS 70) audit reports

Presentation Transcript

  • INFORMATION SECURITY: WHY THE BASE RATE FALLACY MATTERS TO INFOSEC & HOW TO AVOID IT
    Source Boston, Boston, MA
    April 16, 2013
    Patrick Florer and Jeff Lowder, CISSP
  • Agenda: 1. Introduction  2. Base Rate Fallacy & InfoSec  3. Natural Frequencies  4. Fourfold Tables
  • Proprietary. Copyright © 2012, 2013 by Jeff Lowder (www.jefflowder.com) and Risk Centric Security, Inc. (www.riskcentricsecurity.com). All rights reserved.
  • Patrick Florer
    • Cofounder and CTO of Risk Centric Security.
    • Fellow at the Ponemon Institute.
    • 33 years of IT experience, including roles in IT operations, development, and systems analysis.
    • In addition, he worked a parallel track in medical outcomes research, analysis, and the creation of evidence-based guidelines for medical treatment.
    • From 1986 until now, he has worked as an independent consultant, helping customers with strategic development, analytics, risk analysis, and decision analysis.
  • Jeff Lowder
    • President of the Society for Information Risk Analysts (www.societyinforisk.org)
    • Director, Global Information Security and Privacy, OpenMarket
    • Industry thought leader who leads world-class security organizations by building and implementing custom methodologies and frameworks that balance information protection and business agility.
    • 16 years of experience in information security and risk management
    • Previous leadership roles include:
      – Director, Information Security, The Walt Disney Company
      – Manager, Network Security, NetZero/United Online
      – Director, Security and Privacy, Elemica
      – Director, Network Security, United States Air Force Academy
  • Agenda – Part 2: Base Rate Fallacy & InfoSec
  • Quick Check #1: How Good Are You at Estimating Risk?
    The witness gets the color right 80% of the time.
    “I saw an orange cab.”
    What color was the cab?
  • Quick Check #1: What is the probability the cab was orange?
    (a) 80%
    (b) Somewhere between 50 and 80%
    (c) 50%
    (d) Less than 50% (the cab was green)
    (e) There is not enough information to answer the question.
  • Quick Check #2: How Good Are You at Estimating Risk?
    85% of the taxis on the road are green cabs.
    15% of the taxis on the road are orange cabs.
    The witness gets the color right 80% of the time.
    “I saw an orange cab.”
    What color was the cab?
  • Quick Check #2: What is the probability the cab was orange?
    (a) 80%
    (b) Somewhere between 50 and 80%
    (c) 50%
    (d) Less than 50% (the cab was green)
    (e) There is not enough information to answer the question.
  • Quick Check #3
    • 850 out of every 1,000 taxis are green cabs.
    • 150 out of every 1,000 taxis are orange cabs.
    • Of these 850 green cabs, the witness will see 170 of them as orange.
    • Of the 150 remaining orange cabs, the witness will see 120 of them as orange.
    • The witness says she saw an orange cab.
  • Quick Check #3: What is the probability the cab was orange?
    (a) 80%
    (b) Somewhere between 50 and 80%
    (c) 50%
    (d) Less than 50% (the cab was green)
    (e) There is not enough information to answer the question.
  • Correct Answers
    Quick Check #1: Unknown. Without some idea of the base rate of orange cabs, it is impossible to estimate the probability that the witness is correct.
    Quick Check #2: (d) Less than 50% (the cab was green)
    Quick Check #3: (d) Less than 50% (the cab was green). The exact probability is 41%.
  • Base Rates and the Base Rate Fallacy
    • Base Rate: “The base rate of an attribute (or event) in a population is the proportion of individuals manifesting that attribute (at a certain point in time). A synonym for base rate is prevalence.”
      Source: Gerd Gigerenzer, Calculated Risks: How to Know When Numbers Deceive You
    • Base Rate Fallacy: A fallacy of statistical reasoning that occurs when both base rate information and specific data about an individual are available. The fallacy is to draw a conclusion about the probability (or frequency) by ignoring the base rate information and considering only the specific data.
  • How Did Your Peers Do (with the Taxi Cab Question)?
    [Poll results: (a) 35%, (b) 45%, (c) 15%, (d) 5%]
    Jeff Lowder’s research: How often do GRC professionals commit the base-rate fallacy? When presented with conditional probabilities (such as Quick Check #2), 95% of GRC professionals commit the base-rate fallacy.
  • Why the Base Rate Matters to InfoSec
    Example | Output | Relevant Base Rate
    Common Vulnerability Scoring System | Vuln scores (0-10) used to prioritize remediation | Base rate of vuln exploitation
    Vulnerability Scanners | Scan results | Base rate of vulnerabilities
    Intrusion Detection Systems | Intrusion alerts | Base rate of intrusion events
    Anti-Virus / Malware / Spyware | Virus / malware / spyware alerts | Base rate of virus / malware / spyware events
    Risk Analysis | Residual risk | Base rate of hazard (part of inherent risk)
    Third-Party Assurance | Third-party audit report | Base rate of compliant vendors; base rate of accurate audit reports
  • Why the Base Rate Matters to InfoSec
    Example | Questions to Ask
    Common Vulnerability Scoring System | What is the base rate of vuln exploitation?
    Vulnerability Scanners | What is the base rate of each type of vulnerability?
    Intrusion Detection Systems | What is the base rate of intrusion events?
    Anti-Virus / Malware / Spyware | What is the base rate of virus / malware / spyware events?
    Risk Analysis | What is the base rate of the hazard (part of inherent risk)?
    Third-Party Assurance | What is the base rate of compliant vendors? What is the base rate of accurate audit reports?
  • Agenda – Part 3: Natural Frequencies
  • Use Natural Frequencies to Avoid the Base Rate Fallacy
    1,000 cabs: 850 green cabs (680 seen as green by the witness, 170 seen as orange); 150 orange cabs (120 seen as orange by the witness, 30 seen as green).
    Pr(orange | witness says it was orange) = 120 / (170 + 120) = 41%
  • Use Natural Frequencies to Avoid the Base Rate Fallacy
    Pr(green | witness says it was orange) = 170 / (170 + 120) = 59%
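
To make the arithmetic concrete, here is a minimal Python sketch of the natural-frequency calculation; the counts and rates are the slides' numbers, while the variable names are ours:

    # Taxi problem via natural frequencies: split 1,000 cabs by the
    # base rate, then by the witness's 80% color accuracy.
    total_cabs = 1000
    green_cabs = round(total_cabs * 0.85)    # 850
    orange_cabs = round(total_cabs * 0.15)   # 150

    accuracy = 0.80
    green_seen_as_orange = round(green_cabs * (1 - accuracy))   # 170 (witness errs)
    orange_seen_as_orange = round(orange_cabs * accuracy)       # 120 (witness correct)

    said_orange = green_seen_as_orange + orange_seen_as_orange  # 290 "orange" reports
    print(f"Pr(orange | says orange) = {orange_seen_as_orange / said_orange:.0%}")  # 41%
    print(f"Pr(green  | says orange) = {green_seen_as_orange / said_orange:.0%}")   # 59%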
  • Natural Frequencies Work!
    Gerd Gigerenzer, Max Planck Institute (Berlin), former Professor of Psychology (University of Chicago):
    “People tend to get the right answers using natural frequencies much more often than using conditional probabilities: 80% vs. 33%.”
  • True Positives (aka “hits”)
    Assumption: the cab was green (“green” is the positive class).
    Let TP = # of True Positives
    TP = # of times the cab was seen as green AND the cab was green
    TP = 680
  • False Positives (aka “false alarms” or “Type I errors”)
    Let FP = # of False Positives
    FP = # of times the cab was seen as green AND the cab was orange
    FP = 30
  • True Negatives
    Let TN = # of True Negatives
    TN = # of times the cab was seen as orange AND the cab was orange
    TN = 120
  • False Negatives (aka “misses” or “Type II errors”)
    Let FN = # of False Negatives
    FN = # of times the cab was seen as orange AND the cab was green
    FN = 170
  • Summary
    Assumption: the cab was green (“green” is the positive class).
                          Green Cab                    Orange Cab
    Seen as Green Cab     680 True Positives (TP)      30 False Positives (FP)
    Seen as Orange Cab    170 False Negatives (FN)     120 True Negatives (TN)
  • Agenda – Part 4: Fourfold Tables
  • What are fourfold tables?
    A fourfold table, also called a 2 x 2 table, is a four-cell table (2 rows x 2 columns) based upon two sets of dichotomous or “yes/no” facts.
  • What are fourfold tables?
                                                          Exists / True              Does Not Exist / Not True / False
    Identified/Detected as Existing/True/Positive         True Positive (TP) (+ +)   False Positive (FP) (- +)
    Identified/Detected as Not Existing/False/Negative    False Negative (FN) (+ -)  True Negative (TN) (- -)
  • What are fourfold tables?
    The two sets of facts could be:
    1. Something that exists or is true, or doesn’t exist / isn’t true, and
    2. Something else related to #1 that exists or is true, or doesn’t exist / isn’t true, including “something” that attempts to identify/detect whether #1 exists/is true or does not exist/is not true.
  • What are fourfold tables?
    The “something” that exists or doesn’t exist could be a disease, a virus or worm, malware, exploit code, or a malicious packet; i.e., you have a disease or you don’t, a piece of code is malicious or it isn’t, etc.
    The “something else” that “identifies/detects” could be a medical diagnostic test, anti-virus/anti-malware software, IDS/IPS systems, etc. The diagnostic test or software either correctly identifies the disease, virus, or malware, or it doesn’t.
  • What are fourfold tables?
    True Positive: “something” does exist/is true and is correctly identified as existing/true.
    False Positive: “something” does not exist/is not true, but is incorrectly identified as existing/true.
    True Negative: “something” does not exist/is false, and is correctly identified as not existing/false.
    False Negative: “something” does exist/is true, but is incorrectly identified as not existing/not true.
  • A question:
    Let’s assume a technology that detects malware with:
    95% accuracy in detecting malware
    10% false positive rate
    What is the probability that a file identified as malware really is malware (Positive Predictive Value, or PPV)?
  • A question:
    If this is all we know, can we even answer the question?
    No, we need to know something else: the base rate of malware in our target population (the population prevalence), i.e., what is our estimate of the proportion of malware on our network?
  • How do fourfold tables work?
    Assuming that 3% of all files on our network are malware (base rate / prevalence),
    and accepting that the detection technology has:
    95% accuracy in detecting malware
    10% false positive rate
    We are now ready to build a fourfold table.
  • How do fourfold tables work?
    Out of 1 million files:
    3% are malware = 30,000
    97% are not malware = 970,000
  • How do fourfold tables work?
    Out of 30,000 malicious files:
    95% are correctly identified as malware = 28,500 (True Positives)
    5% are incorrectly identified as not malware = 1,500 (False Negatives)
  • How do fourfold tables work?
    Out of 970,000 non-malicious files:
    10% are incorrectly identified as malware = 97,000 (False Positives)
    90% are correctly identified as not malware = 873,000 (True Negatives)
  • How do fourfold tables work?
                  True                            False
    Positive      True Positive (TP): 28,500      False Positive (FP): 97,000
    Negative      False Negative (FN): 1,500      True Negative (TN): 873,000
  • Questions we can answer with fourfold tables:
    What is the probability that a file identified as malware really is malware (PPV)?
    PPV = TP / (TP + FP)
        = 28,500 / (28,500 + 97,000)
        = 22.7%
    What happened to 95%?
  • Questions we can answer with fourfold tables:
    What is the probability that a file identified as not malware really is not malware (Negative Predictive Value, or NPV)?
    NPV = TN / (TN + FN)
        = 873,000 / (873,000 + 1,500)
        = 99.8%
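
A short sketch that builds the whole malware table and both predictive values from the three inputs above; the population, base rate, TP rate, and FP rate are the slides' numbers, and the helper name fourfold_counts is ours:

    def fourfold_counts(population, base_rate, tp_rate, fp_rate):
        """Return (TP, FP, FN, TN) for a detector applied to a population."""
        positives = population * base_rate   # files that really are malware
        negatives = population - positives   # files that are not
        tp = positives * tp_rate             # malware correctly flagged
        fn = positives - tp                  # malware missed
        fp = negatives * fp_rate             # clean files incorrectly flagged
        tn = negatives - fp                  # clean files correctly passed
        return tp, fp, fn, tn

    tp, fp, fn, tn = fourfold_counts(1_000_000, 0.03, 0.95, 0.10)
    print(tp, fp, fn, tn)                 # 28500.0 97000.0 1500.0 873000.0
    print(f"PPV = {tp / (tp + fp):.1%}")  # 22.7%
    print(f"NPV = {tn / (tn + fn):.1%}")  # 99.8%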
  • Some things to remember:
    The numbers in the four cells must add up to 100% of the total number being analyzed (1M in this example).
    As the base rate approaches 100%, the base rate fallacy ceases to apply.
  • Bayes’ Theorem:
    P(A|B) = P(B|A) * P(A) / P(B)
    where:
    A = Malware
    B = Detection
    P(A) = probability that malware is present (prevalence)
    P(B|A) = probability of correct detection (TP rate)
    P(B) = probability of a positive detection ((TP + FP) / All)
  • Bayes’ Theorem:
    Using the numbers from the example:
    P(A) = 3%
    P(B|A) = 95%
    P(B) = (TP rate * P(A)) + ((1 - P(A)) * FP rate)
         = (0.95 * 0.03) + ((1 - 0.03) * 0.10)
         = 0.0285 + 0.097
         = 0.1255
  • Bayes’ Theorem:
    We can solve the equation as follows:
    P(A|B) = (0.95 * 0.03) / 0.1255
           = 0.227
           = 22.7%
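
The same answer straight from Bayes' Theorem, as a sketch:

    # Bayes' Theorem for the malware example.
    p_malware = 0.03       # P(A): base rate (prevalence)
    p_hit = 0.95           # P(B|A): TP rate
    p_false_alarm = 0.10   # FP rate

    # P(B): total probability of a positive detection.
    p_detect = p_hit * p_malware + p_false_alarm * (1 - p_malware)        # 0.1255
    print(f"P(malware | detected) = {p_hit * p_malware / p_detect:.1%}")  # 22.7%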
  • Speaking of prevalence and incidence: what’s the difference?
    Prevalence: cross-sectional; how much is out there right now?
    Incidence: longitudinal; a proportion of new cases found during a time period.
    Both prevalence and incidence can be expressed as rates.
  • Homework #1 – Solve with a fourfold table
    The government estimates that 1,000 terrorists live in the US, in a population of 300 million people. It wishes to install a nationwide surveillance system with the following estimated characteristics:
    99% probability of correctly identifying a terrorist (TP rate)
    0.1% (0.001) probability of incorrectly identifying a non-terrorist (FP rate)
    What is the positive predictive value of this system?
    Credit: Schneier on Security, July 10, 2006
  • Homework #2 – Solve with a fourfold table or Bayes’ Theorem
    You are thinking about licensing a vulnerability scanning technology. From experience and from published reports, you believe that the base rate of vulnerable systems in your environment could be as high as 15%.
    The vendor claims 95% accuracy (TP rate), with a false positive rate of 10%.
    What is the positive predictive value of this technology?
  • Monte Carlo Simulation
    How can the use of Monte Carlo simulation improve the use of fourfold tables?
    Examples in Excel
  • Contact Info
    Patrick Florer, CTO and Cofounder, Risk Centric Security, Inc.
    • Web: www.riskcentricsecurity.com
    • Email: patrick@riskcentricsecurity.com
    Jeff Lowder, President, Society for Information Risk Analysts (SIRA)
    • Web: www.jefflowder.com
    • Blog: bloginfosec.com and www.societyinforisk.org
    • Twitter: @agilesecurity
  • APPENDIX
  • Accuracy (aka Efficiency)
    What is the percentage of correct observations made by the witness?
    Accuracy = (TP + TN) / (TP + TN + FP + FN)
    Accuracy = (# of cabs correctly seen as green + # of cabs correctly seen as orange) / # of cabs seen
    Accuracy = (680 + 120) / (680 + 120 + 30 + 170) = 80%
  • True Positive Rate (aka “Sensitivity”)
    True Positive Rate (Sensitivity) = TP / (TP + FN)
    = # of cabs correctly seen as green / # of green cabs
    = 680 / (680 + 170) = 80%
  • False Positive Rate (aka “False Alarm Rate”)
    False Positive Rate (α) = FP / (FP + TN)
    = # of cabs incorrectly seen as green / # of orange cabs
    = 30 / (30 + 120) = 20%
  • True Negative Rate (aka “Specificity”)
    True Negative Rate (Specificity) = TN / (TN + FP)
    = # of cabs correctly seen as orange / # of orange cabs
    = 120 / (120 + 30) = 80%
  • False Negative Rate
    False Negative Rate (β) = FN / (FN + TP)
    = # of green cabs incorrectly seen as orange / # of green cabs
    = 170 / (170 + 680) = 20%
  • Summary
                          Green Cab                           Orange Cab
    Seen as Green Cab     80% TP Rate (Sensitivity = 80%)     20% FP Rate (α = 20%)
    Seen as Orange Cab    20% FN Rate (β = 20%)               80% TN Rate (Specificity = 80%)
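
All five appendix rates, reproduced in one sketch from the taxi cells:

    TP, FP, FN, TN = 680, 30, 170, 120   # taxi cells; "green" is positive

    print(f"Accuracy                 = {(TP + TN) / (TP + TN + FP + FN):.0%}")  # 80%
    print(f"True positive rate (TPR) = {TP / (TP + FN):.0%}")   # sensitivity, 80%
    print(f"False positive rate (α)  = {FP / (FP + TN):.0%}")   # 20%
    print(f"True negative rate (TNR) = {TN / (TN + FP):.0%}")   # specificity, 80%
    print(f"False negative rate (β)  = {FN / (FN + TP):.0%}")   # 20%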
  • Why the Base Rate Matters to InfoSec
    Example | Questions to Ask
    Common Vulnerability Scoring System | What is the base rate of vuln exploitation?
    Vulnerability Scanners | What is the base rate of each type of vulnerability?
    Intrusion Detection Systems | What is the maximum acceptable false positive rate for my organization’s IDS?
    Anti-Virus / Malware / Spyware | What is the base rate of virus / malware / spyware events?
    Risk Analysis | What is the base rate of the hazard (part of inherent risk)?
    Third-Party Assurance | What is the base rate of compliant vendors? What is the base rate of accurate audit reports?
  • True Positive Rate (aka “Sensitivity”) for IBM AppScan and SQL Injection Vulns
    146 test cases: 136 vulnerable test cases (136 detected as vulnerable by IBM AppScan, 0 detected as non-vulnerable); 10 non-vulnerable test cases (7 detected as not vulnerable, 3 detected as vulnerable).
    True Positive Rate (Sensitivity) = TP / (TP + FN)
    = # of test cases correctly identified as vulnerable / # of vulnerable test cases
    = 136 / (136 + 0) = 100%
    Source: “The SQL Injection Detection Accuracy of Web Application Scanners,” SecToolMarket.com
  • True Negative Rate (aka “Specificity”) for IBM AppScan and SQL Injection Vulns
    True Negative Rate (Specificity) = TN / (TN + FP)
    = # of test cases correctly identified as non-vulnerable / # of non-vulnerable test cases
    = 7 / (7 + 3) = 70%
    Source: “The SQL Injection Detection Accuracy of Web Application Scanners,” SecToolMarket.com
  • Positive Predictive Value (PPV)
    If the witness says the cab was green, what’s the probability the witness is right?
    PPV = TP / (TP + FP)
    = # of cabs correctly seen as green / # of cabs seen (correctly + incorrectly) as green
    = 680 / (680 + 30) = 95.77%
  • Negative Predictive Value (NPV)
    If the witness says the cab was orange, what’s the probability the witness is right?
    NPV = TN / (TN + FN)
    = # of cabs correctly seen as orange / # of cabs seen (correctly + incorrectly) as orange
    = 120 / (120 + 170) = 41.38%
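
The witness's two predictive values, as a sketch:

    TP, FP, FN, TN = 680, 30, 170, 120   # taxi cells; "green" is positive

    print(f"PPV = {TP / (TP + FP):.2%}")  # says green and is right: 95.77%
    print(f"NPV = {TN / (TN + FN):.2%}")  # says orange and is right: 41.38%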
  • Example: PPV for IBM AppScan and SQL Injection Vulns
    If IBM AppScan says a SQL injection vulnerability is present, what’s the probability that IBM AppScan is right?
    PPV = TP / (TP + FP)
    = # of test cases correctly identified as vulnerable / # of test cases identified (correctly + incorrectly) as vulnerable
    = 136 / (136 + 3) = 97.8%
    Source: “The SQL Injection Detection Accuracy of Web Application Scanners,” SecToolMarket.com
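
The AppScan figures can be checked the same way (test-case counts are from the cited SecToolMarket.com study):

    # 146 SQL injection test cases: 136 vulnerable, 10 non-vulnerable.
    TP, FN = 136, 0   # vulnerable cases detected / missed
    FP, TN = 3, 7     # non-vulnerable cases flagged / correctly passed

    print(f"Sensitivity (TPR) = {TP / (TP + FN):.0%}")  # 100%
    print(f"Specificity (TNR) = {TN / (TN + FP):.0%}")  # 70%
    print(f"PPV               = {TP / (TP + FP):.1%}")  # 97.8%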
  • Homework #1 – Solve with a fourfold table – Answer
    The government estimates that 1,000 terrorists live in the US, in a population of 300 million people. It wishes to install a nationwide surveillance system with the following estimated characteristics:
    99% probability of correctly identifying a terrorist (TP rate)
    0.1% (0.001) probability of incorrectly identifying a non-terrorist (FP rate)
    What is the positive predictive value of this system?
    Credit: Schneier on Security, July 10, 2006
  • Homework #1 – Answer
    Out of 300 million people, there are 1,000 terrorists.
    Base rate = 1,000 / 300,000,000 = 0.00000333 = 0.000333%
  • Homework #1 – Answer
    Out of 1,000 terrorists, surveillance will correctly identify:
    1,000 * 99% = 990 (True Positives)
    And mis-identify 10 (False Negatives)
  • Homework #1 – Answer
    There are 300,000,000 - 1,000 = 299,999,000 non-terrorists.
    Of these, surveillance will correctly identify 299,999,000 * 99.9% = 299,699,001 (True Negatives)
    And mis-identify 299,999 (False Positives)
  • Homework #1 – Answer
                  True                          False
    Positive      True Positive (TP): 990       False Positive (FP): 299,999
    Negative      False Negative (FN): 10       True Negative (TN): 299,699,001
  • Homework #1 – Answer
    What is the probability that a person identified as a terrorist is really a terrorist (PPV)?
    PPV = TP / (TP + FP)
        = 990 / (990 + 299,999)
        = 0.33%
    Or ~1 out of every 303
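
A sketch checking the Homework #1 arithmetic:

    population = 300_000_000
    terrorists = 1_000
    tp_rate, fp_rate = 0.99, 0.001

    tp = terrorists * tp_rate                  # 990 true positives
    fp = (population - terrorists) * fp_rate   # 299,999 false positives
    print(f"PPV = {tp / (tp + fp):.2%}")       # 0.33%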
  • Homework #2 – Solve with a fourfold table or Bayes’ Theorem
    You are thinking about licensing a vulnerability scanning technology. From experience and from published reports, you believe that the base rate of vulnerable systems in your environment could be as high as 15%.
    The vendor claims 95% accuracy (TP rate), with a false positive rate of 10%.
    What is the positive predictive value of this technology?
  • Homework #2: Bayes’ Theorem – Answer
    P(A|B) = P(B|A) * P(A) / P(B)
    where:
    A = Vulnerability
    B = Detection
    P(A) = probability that the vuln is present (prevalence)
    P(B|A) = probability of correct detection (TP rate)
    P(B) = probability of a positive detection ((TP + FP) / All)
  • Homework #2: Bayes’ Theorem – Answer
    Using the numbers from the example:
    P(A) = 15%
    P(B|A) = 95%
    P(B) = (TP rate * P(A)) + ((1 - P(A)) * FP rate)
         = (0.95 * 0.15) + ((1 - 0.15) * 0.10)
         = 0.1425 + 0.085
         = 0.2275
  • Homework #2: Bayes’ Theorem – Answer
    We can solve the equation as follows:
    P(A|B) = (0.95 * 0.15) / 0.2275
           = 0.626
    Or: PPV = 62.6%
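
And Homework #2, verified with the same Bayes arithmetic:

    p_vuln = 0.15                    # base rate of vulnerable systems
    tp_rate, fp_rate = 0.95, 0.10    # vendor's claimed rates

    p_positive = tp_rate * p_vuln + fp_rate * (1 - p_vuln)   # 0.2275
    print(f"PPV = {tp_rate * p_vuln / p_positive:.1%}")      # 62.6%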