METRICS FOR SECURITY EFFORT PRIORITIZATION
Christopher Theisen
AGENDA
◦ Motivate Security Metrics
◦ Solution: Attack Surfaces
◦ Results
◦ Undergraduate Contributions
◦ Future Work
Where are the vulnerabilities?
Windows: a 300 GB repository with millions of source code files [2]
[2] https://arstechnica.com/information-technology/2017/02/microsoft-hosts-the-windows-source-in-a-monstrous-300gb-git-repository/
780,000: cybersecurity jobs in the U.S. (2017)
350,000: cybersecurity openings worldwide (2017)
3,500,000: estimated cybersecurity openings worldwide (2021)
Source: https://cybersecurityventures.com/jobs/
Tons of code…
Rare, expensive vulnerabilities…
Not enough people to find them…
How do we prioritize code for security testing/inspection?
http://www.classroomnook.com/2017/08/dealing-with-teacher-overwhelm.html
ATTACK SURFACE
Definition:
◦ All paths for data and commands in a software system
◦ The data that travels these paths
◦ The code that implements and protects both
A concept used for security effort prioritization, but hard to measure in practice…
RISK-BASED ATTACK SURFACE APPROXIMATION
Crashes represent activity that puts the system under stress.
Stack Traces tell us what happened.
foo!foobarDeviceQueueRequest+0x68
foo!fooDeviceSetup+0x72
foo!fooAllDone+0xA8
bar!barDeviceQueueRequest+0xB6
bar!barDeviceSetup+0x08
bar!barAllDone+0xFF
center!processAction+0x1034
center!dontDoAnything+0x1030
Pull out individual code artifacts from traces. If code appears on a crash dump stack trace, it is on the attack surface.
Crashes are used by attackers!
Great source of forensic data.
HYPOTHESIS
◦ Crashes are empirical evidence of…
▫ Paths through the system with flaws
▫ Data paths through software
◦ Code appearing on crashes is therefore…
▫ More likely to have vulnerabilities
▫ Vulnerabilities more likely to be exploited
Expectation: a high percentage of code with vulnerabilities also crashes.
Use known vulnerabilities as an “oracle” to measure effectiveness
CRASHES – WINDOWS, FIREFOX, FEDORA
◦ Crashing artifacts cover the majority of vulnerable files
▫ Focus testing and review effort on crashing files
▫ A large amount of code is irrelevant for security review

System (granularity)   | Code Coverage | Vulnerability Coverage
Windows (binaries)     | 48.4%         | 94.8%
Firefox (source files) | 14.8%         | 85.6%
Fedora (packages)      | 8.9%          | 63.3%
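Read the two columns as follows (worked here with hypothetical file names, not study data): code coverage is the fraction of all artifacts that appear on a crash trace, and vulnerability coverage is the fraction of known-vulnerable artifacts that do.

# Hypothetical sets for illustration only
crashing = {"a.c", "b.c", "c.c"}                 # files seen on crash stack traces
vulnerable = {"b.c", "c.c", "d.c"}               # the known-vulnerability "oracle"
all_files = {"a.c", "b.c", "c.c", "d.c", "e.c"}  # everything shipped

code_coverage = len(crashing) / len(all_files)               # 3/5 = 60%
vuln_coverage = len(crashing & vulnerable) / len(vulnerable) # 2/3 ≈ 67%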
DATA SCALE
◦ Windows – 10 million crashes
◦ Firefox – 1.2 million crashes
◦ Fedora – 47 million crashes
◦ Common complaint:
▫ “We don’t have that much data!”
◦ How does this approach scale down?
DATA SCALE
[Chart: Percent of Vulnerabilities on a Stack Trace (y-axis, 71%–75%) vs. Random Sample Size (x-axis, 10%–100%)]
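A curve like this can be produced by repeatedly downsampling the crash corpus and re-measuring coverage. A sketch under the assumption that each crash is reduced to the set of files on its trace (not necessarily the study's actual pipeline):

import random

def coverage_at_sample(crashes, vulnerable, fraction, trials=30):
    """Average vulnerability coverage when only a random fraction of crashes is kept.

    crashes: list of sets, each holding the files on one crash's stack trace.
    """
    k = int(len(crashes) * fraction)
    totals = []
    for _ in range(trials):
        sample = random.sample(crashes, k)
        on_surface = set().union(*sample) if sample else set()
        totals.append(len(on_surface & vulnerable) / len(vulnerable))
    return sum(totals) / trials

crashes = [{"a.c", "b.c"}, {"b.c"}, {"c.c"}]
print(coverage_at_sample(crashes, {"b.c", "d.c"}, fraction=0.5))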
COMPARING VULNERABILITY PREDICTION MODELS
What about other approaches?
◦ Crash Dump Stack Traces
◦ Software Metrics
▫ Lines of Code, Code Churn, # of Developers…
◦ Text Mining
▫ Do specific words/strings mean more vulnerabilities?
COMPARING VULNERABILITY PREDICTION MODELS
Single Model     | True Positives | Vulnerability Coverage
Crashes          | 5%             | 86%
Text Mining      | 1%             | 74%
Software Metrics | 13%            | 42%

Combined Model                    | True Positives | Vulnerability Coverage
Crashes + Text Mining             | 22%            | 34%
Crashes + Software Metrics        | 18%            | 33%
Text Mining + Software Metrics    | 1%             | 85%
Crashes + Text + Software Metrics | 23%            | 36%
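The talk does not name the learner behind the combined models; purely as an assumption for illustration, here is one way such binary and numeric signals could be fused into a single ranking model, using scikit-learn's LogisticRegression on hypothetical feature values:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical per-file features: [on a crash trace?, text-mining score, code churn]
X = np.array([[1, 0.8, 120],
              [0, 0.1,  15],
              [1, 0.4, 300],
              [0, 0.0,   5]])
y = np.array([1, 0, 1, 0])  # 1 = file is in the known-vulnerability oracle

model = LogisticRegression().fit(X, y)
scores = model.predict_proba(X)[:, 1]  # higher score = review sooner
print(scores)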
CONCLUSIONS
◦ Other approaches need an “oracle” of vulnerabilities to build their model.
▫ A known set of vulnerabilities so the model “knows” what a vulnerability looks like.
◦ Crashes have no such restriction; a single metric beats or equals models built on tens or hundreds of metrics.
◦ Better to optimize for vulnerability coverage.
▫ Professionals at Microsoft and elsewhere agree!
UNDERGRADUATE RESEARCH
What kinds of vulnerabilities are we covering?
Dawson Tripp – 3rd year undergraduate from NCSU
◦ Learned some basic exploits
◦ Developed a classification scheme (CWE with caveats)
◦ Mined vulnerabilities from Firefox
◦ Classified vulnerabilities with two graduate students
Still using his code today!
UNDERGRADUATE CONTRIBUTIONS
Practical application: visualize the data
Dalisha Rodriguez – 2nd-year undergraduate from Hanover (now at Fayetteville State)
NEXT STEPS
◦ More public vulnerability datasets
◦ We need metrics that model attacker behavior
◦ Dynamic (runtime) metrics! (see the sketch after this list)
▫ Think of crashes as dynamic: generated at runtime
▫ Count “messages” to/from/between objects (AspectJ)
▫ Combine with techniques like fuzzing
◦ Chamoli et al. – dynamic metrics for defect prediction
▫ Better on vulnerabilities…?
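AspectJ would capture these “messages” with pointcuts on method calls in Java; as a language-neutral illustration (my sketch, not tooling from the talk), the same dynamic metric can be approximated in Python with sys.settrace:

import sys
from collections import Counter

calls = Counter()

def tracer(frame, event, arg):
    # Count each method invocation ("message") by receiving class and method name.
    if event == "call":
        receiver = frame.f_locals.get("self")
        if receiver is not None:
            calls[(type(receiver).__name__, frame.f_code.co_name)] += 1
    return tracer

class Queue:
    def __init__(self):
        self.items = []
    def push(self, x):
        self.items.append(x)

sys.settrace(tracer)
q = Queue()
for i in range(3):
    q.push(i)
sys.settrace(None)
print(calls)  # Counter({('Queue', 'push'): 3, ('Queue', '__init__'): 1})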
Tons of code…
Rare, expensive vulnerabilities…
Not enough people to find them…
MASSIVE OPEN ONLINE COURSES
◦ “Flipped” NCSU’s Software Security course
▫ Iteration 1: ~400 students
▫ Iteration 2: ~1100 students
◦ Lessons Learned:
▫ Amplifies gulf between veterans and newbies
▫ Assume no more than 2-3 hours of work/week
▫ Important to “seed” discussions
SoftSec Materials publicly available!
https://tinyurl.com/ncsu-softsec
theisen.cr@gmail.com
theisencr.github.io
RELATED WORK
Data in this talk:
In submission [SecDev ’18]: Chris Theisen, Hyunwoo Song, Dawson Tripp, Laurie Williams, “What Are We Missing? Vulnerability Coverage by Type”
In submission [FSE ’18]: Chris Theisen, Hyunwoo Song, Dalisha Rodriguez, Laurie Williams, “Better Together: Comparing Vulnerability Prediction Models”
[ICSE-SEIP ’17] Chris Theisen, Kim Herzig, Laurie Williams, “Risk-Based Attack Surface Approximation: How Much Data is Enough?”
[ICSE-SEIP ’15] Chris Theisen, Kim Herzig, Pat Morrison, Brendan Murphy, and Laurie Williams, “Approximating Attack Surfaces with Stack Traces”, Companion Proceedings of the 37th International Conference on Software Engineering
Other works:
Revising [IST]: Chris Theisen, Nuthan Munaiah, Mahran Al-Zyoud, Jeffery C. Carver, Laurie Williams, “Attack Surface Definitions”
[HotSoS ’17] Chris Theisen and Laurie Williams, “Advanced Metrics for Risk-Based Attack Surface Approximation”, Proceedings of the Symposium and Bootcamp on the Science of Security
[FSE-SRC ’15, 1st Place] Chris Theisen, “Automated Attack Surface Approximation”, 23rd ACM SIGSOFT International Symposium on the Foundations of Software Engineering, Student Research Competition, 2015
[SRC Grand Finals ’15, 3rd Place] Chris Theisen, “Automated Attack Surface Approximation”, 23rd ACM SIGSOFT International Symposium on the Foundations of Software Engineering, Student Research Competition, 2015
IDENTIFYING VULNERABLE CODE
#include <stdio.h>

int main(int argc, char **argv)
{
    char buf[8];          // buffer for eight characters
    gets(buf);            // read from stdin with no bounds check: input longer
                          // than 7 characters overflows buf (gets() was removed in C11)
    printf("%s\n", buf);  // print out the data in buf
    return 0;             // 0 as return value
}
https://www.owasp.org/index.php/Buffer_overflow_attack
