1. deDacota: Toward Preventing Server-Side XSS via Automatic Code and Data Separation
Adam Doupé, Weidong Cui†, Mariusz H. Jakubowski†, Marcus Peinado†, Christopher Kruegel, and Giovanni Vigna
University of California, Santa Barbara
†Microsoft Research
CCS 2013 - 11/7/13
14. XSS - Impact
• Steal cookies
• Perform actions as user
• Exploit user's browser
• Fake login form
Doupé - 11/7/13
15. Fixing XSS - Sanitization
<html>
<body>
<p>Hello
<%= HtmlEncode(this.Name) %>
</p>
</body>
</html>
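A minimal sketch of what the encoding step on this slide does, written in Python rather than ASP.NET so it runs standalone. Here `html.escape` stands in for `HtmlEncode`, and `render_greeting` is a hypothetical helper that is not part of the original application.

```python
import html


def render_greeting(name: str) -> str:
    """Render the greeting, encoding the untrusted name before output.

    html.escape is an assumed stand-in for ASP.NET's HtmlEncode.
    """
    return "<p>Hello " + html.escape(name) + "</p>"


payload = '<script>alert("xss");</script>'
# The markup characters in the payload are encoded, so the browser
# displays them as text instead of parsing a script element.
print(render_greeting(payload))
```

Encoding protects only the output point where it is applied, and only for the context it was written for, which is the limitation the next slides build on.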
16. Fixing XSS - Sanitization
<html>
<script>alert("xss");</script>
<body>
<p>Hello
<%= HtmlEncode(this.Name) %>
</p>
</body>
<script>alert("xss");</script>
</html>
18. XSS as Input Validation
Problem | Research
Find All Paths | WWW 2004, USENIX 2005, Oakland 2006
Many Different Contexts | CCS 2011, CCS 2011
Is Sanitization Correct? | Oakland 2008, USENIX 2011
Parsing Quirks | Oakland 2009
22. XSS as Input Validation
Problem | Research
Find All Paths | WWW 2004, USENIX 2005, Oakland 2006
Different Context | CCS 2011, CCS 2011
Is Sanitization Correct? | Oakland 2008, USENIX 2011
Parsing Quirks | Oakland 2009, CCS 2013
We want to fundamentally solve XSS vulnerabilities
24. Another Example
Developer intended for this code to be executed on the browser
<html>
<body>
<script>
alert("welcome to example.com!");
</script>
<p>Hello <%= this.Name %></p>
</body>
</html>
27. The Fundamental Problem
http://example.com/Test.aspx?name=<script>alert("xss");</script>
<html>
<body>
<script>
alert("welcome to example.com!");
</script>
<p>Hello <script>alert("xss");</script>
</p>
</body>
</html>
Developer intended for the welcome script to be executed on the browser.
Developer did not intend for the injected script to be executed on the browser.
28. The Fundamental Problem
The browser can't tell the difference!
30. The Fundamental Solution
To fundamentally solve XSS vulnerabilities, we must apply the basic security principles of Code and Data separation!
Data:
<html>
<body>
<script>
alert("welcome to example.com!");
</script>
<p>Hello <%= this.Name %>
</p>
</body>
</html>
Code:
alert("welcome to example.com!");
31. Content Security Policy (CSP)
• Mechanism for the website to communicate a policy to the browser about what JavaScript to execute
• The browser then enforces this policy
• Supported by many modern browsers (68% of users use one of these browsers)
  - Firefox
  - Chrome
  - IE (10)
  - Safari
  - Opera
  - iOS
  - Android
32. Content Security Policy
Content-Security-Policy: script-src http://example.com/0cc111eb135.js
Data:
<html>
<body>
<script>
alert("welcome to example.com!");
</script>
<p>Hello <%= this.Name %>
</p>
</body>
</html>
Code:
alert("welcome to example.com!");
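A small sketch of how a server could assemble the header shown on this slide. `script_src_policy` is a hypothetical helper name; only the header name, the `script-src` directive, and the example URL come from the slide.

```python
def script_src_policy(allowed_scripts):
    """Build a Content-Security-Policy header that lists the allowed script URLs.

    Hypothetical helper: returns a (header name, header value) pair.
    """
    sources = " ".join(allowed_scripts)
    return ("Content-Security-Policy", "script-src " + sources)


header = script_src_policy(["http://example.com/0cc111eb135.js"])
print(": ".join(header))
```

Because the policy names only external script URLs and does not include `'unsafe-inline'`, a browser that enforces it refuses to run inline script blocks, including injected ones.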
34. Code and Data Separation
• Code and Data separation from start
  - No legacy applications
• Manually rewrite application
  - Difficult and error-prone (HotSec 2011)
deDacota: Automatically separate code and data of a web application
35. Threat Model
• Benign web application
  - The developer has not obfuscated the web application
• Server-side XSS
  - Our approach will only address traditional XSS, in other words, XSS where the resulting bug is in the server-side code
• Inline JavaScript
  - For the deDacota prototype, we focused only on inline JavaScript
  - We ignore JavaScript in HTML attributes and CSS
38. deDacota Process
The goal is to rewrite the web application so that it is semantically equivalent yet separates the code and data.
1. Approximate HTML Output
2. Extract Inline JavaScript
3. Rewrite Web Application
39. Approximate HTML Output
<%@ Page Language="C#"
CodeBehind="CodeBehind.cs" Inherits="Test" %>
<html>
<body>
<p>Hello <%= this.Name %></p>
<%= Scripts() %>
</body>
</html>
40. Approximate HTML Output
class test_aspx : System.Web.UI.Page {
    public test_aspx() {
        // Name comes straight from user input; Year is a constant.
        this.Name = Request.QueryString["name"];
        this.Year = "2013";
    }
    protected void Render(HtmlTextWriter writer) {
        // Each Write call contributes one piece of the page's HTML output.
        writer.Write("<html><body><p>");
        writer.Write(this.Name);
        writer.Write(Scripts());
        writer.Write("</p></body></html>");
    }
    protected string Scripts() {
        return "<script>alert('" + this.Year + "');</script>";
    }
}
41. Approximate HTML Output
The goal here is to create a graph that approximates the HTML content of the web page. We use static analysis techniques to construct the graph.
42. Approximate HTML Output
Approximation graph so far: "<html><body><p>"
43. Approximate HTML Output
Approximation graph so far: "<html><body><p>" -> this.Name
45. Approximate HTML Output
Here we need to analyze the control flow of the application, which means following the control flow into the Scripts() method.
47. Approximate HTML Output
Here we encounter string concatenation, which our analysis is able to handle.
48. Approximate HTML Output
Approximation graph: "<html><body><p>" -> this.Name -> "<script>alert('" -> this.Year -> "');</script>"
49. Approximate HTML Output
Now that we have constructed the approximation graph, we must determine what is being output by each node in the graph. Here we use data-flow analysis and points-to analysis.
51. Approximate HTML Output
"<html><body><p>" outputs <html><body><p>
52. Approximate HTML Output
In this case, Request.QueryString["name"] is statically undecidable because it comes from user input. In the approximation graph we represent this as a *, which means the output at this node could be anything.
56. Approximate HTML Output
Node | Output
"<html><body><p>" | <html><body><p>
this.Name | *
"<script>alert('" | <script>alert('
this.Year | 2013
"');</script>" | ');</script>
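The resolved graph for the running example can be sketched as a small data structure. This is an illustration in Python with hand-built nodes, not the prototype's static analysis; the class names `Static` and `Undecidable` and the function `approximate` are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Static:
    """A node whose output is fully known at analysis time."""
    text: str


@dataclass(frozen=True)
class Undecidable:
    """A node whose output cannot be determined statically (shown as *)."""
    source: str


Node = Union[Static, Undecidable]


def approximate(nodes: List[Node]) -> str:
    """Concatenate node outputs in order, writing * for undecidable nodes."""
    return "".join(n.text if isinstance(n, Static) else "*" for n in nodes)


# Nodes from the slides' example, in output order (built by hand here).
graph = [
    Static("<html><body><p>"),
    Undecidable('Request.QueryString["name"]'),  # this.Name
    Static("<script>alert('"),
    Static("2013"),                              # this.Year
    Static("');</script>"),
]

print(approximate(graph))
```

The inline script survives intact in the approximation because every node inside it is statically known; only the user-controlled name collapses to `*`.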
59. In this example approximation graph from a real-world application, the branch in the graph comes from a conditional branch in the control flow of the application.
60. Statically undecidable content, represented here as a *, can come from two different sources:
1. Content that the static analysis cannot resolve.
2. Loops: to make our analysis conservative, we treat all loops as outputting a *, because we cannot statically determine how many times a loop will execute.
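Branches and loops can be folded into the same kind of sketch. The tuple encoding below (`"static"`, `"any"`, `"loop"`, `"branch"`, `"seq"`) and the function `outputs` are assumptions for illustration; the slides only state that branches come from conditionals and that every loop is treated as `*`.

```python
from itertools import product


def outputs(node):
    """Return the set of approximated outputs a graph node can produce."""
    kind = node[0]
    if kind == "static":
        return {node[1]}
    if kind == "any":
        return {"*"}
    if kind == "loop":
        # Conservative: the iteration count is unknown, so the body is ignored.
        return {"*"}
    if kind == "branch":
        # One alternative per arm of the conditional.
        return set().union(*(outputs(alt) for alt in node[1]))
    if kind == "seq":
        parts = [outputs(child) for child in node[1]]
        return {"".join(combo) for combo in product(*parts)}
    raise ValueError("unknown node kind: " + repr(kind))


page = ("seq", [
    ("static", "<ul>"),
    ("loop", ("static", "<li>item</li>")),
    ("branch", [("static", "<b>admin</b>"), ("static", "<i>guest</i>")]),
    ("static", "</ul>"),
])

for candidate in sorted(outputs(page)):
    print(candidate)
```

Collapsing a loop to `*` keeps the approximation conservative, but it also means any inline JavaScript emitted inside the loop is hidden from the later extraction step.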
62. In the second step, we simply extract the inline JavaScript (that is, the code the developer intended) from the approximation graph.
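A rough sketch of the extraction step over an approximated page. The slides do not say how the prototype parses the approximated HTML, so Python's standard `html.parser` is swapped in here; `InlineScriptExtractor` and `extract_inline_scripts` are hypothetical names.

```python
from html.parser import HTMLParser


class InlineScriptExtractor(HTMLParser):
    """Collect the bodies of <script> elements that have no src attribute."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self._in_inline_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script" and "src" not in dict(attrs):
            self._in_inline_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_inline_script = False

    def handle_data(self, data):
        # Text inside an inline script may arrive in several chunks.
        if self._in_inline_script:
            self.scripts[-1] += data


def extract_inline_scripts(approximated_html):
    parser = InlineScriptExtractor()
    parser.feed(approximated_html)
    parser.close()
    return parser.scripts


page = "<html><body><p>*<script>alert('2013');</script></p></body></html>"
print(extract_inline_scripts(page))
```

Scripts that already load from a `src` URL are skipped, since only inline code needs to be moved out of the page.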
65. Rewrite Web Application
At this point, if the inline JavaScript code is static, we have protected the application. No attacker-supplied data in the Data segment will ever be interpreted as Code.
Data:
<html>
<body>
<script src="0cc111eb135.js">
</script>
<p>Hello <%= this.Name %>
</p>
</body>
</html>
Code:
alert("welcome to example.com!");
Content-Security-Policy: script-src http://example.com/0cc111eb135.js
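The rewrite of a static inline script can be sketched as follows. Naming the external file after a truncated SHA-1 of its contents is an assumption made for this sketch; the slide shows a file called `0cc111eb135.js` but does not say how that name is chosen. `externalize` is a hypothetical helper.

```python
import hashlib


def externalize(inline_js: str, base_url: str):
    """Move a static inline script to an external file.

    Returns the file name, the replacement <script> tag, and the CSP header.
    The content-hash file name is an assumed naming scheme.
    """
    digest = hashlib.sha1(inline_js.encode("utf-8")).hexdigest()[:11]
    filename = digest + ".js"
    tag = '<script src="' + filename + '"></script>'
    csp = "Content-Security-Policy: script-src " + base_url + filename
    return filename, tag, csp


filename, tag, csp = externalize('alert("welcome to example.com!");', "http://example.com/")
print(tag)
print(csp)
```

The contents of the original script would be written to the named file unchanged, so the page behaves the same while the browser can refuse any script that arrives inline.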
66. Rewrite Web Application
Unfortunately, developers sometimes dynamically generate the Code of an application. If this happens with untrusted Data, there can still be an XSS vulnerability.
70. Dynamic Inline JavaScript
We developed a technique to safely transform cases of dynamic inline JavaScript. If the statically undecidable content is used in a known JavaScript context (JavaScript string or comment), we can safely rewrite the application. We call these cases "safe dynamic inline JavaScript."
Data:
<html>
<script>
var username = "<%= Username %>";
</script>
</html>
Code:
var username = "*";
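One way to picture the "known JavaScript context" check is a small scanner that tracks whether each undecidable piece falls inside a string literal or a comment. This is a simplified sketch, not the prototype's analysis: the script is given as a list of static segments with `None` marking each `*`, and regular-expression literals, template literals, and tokens split across segments are ignored. The function names are hypothetical.

```python
def wildcard_contexts(segments):
    """Return 'string', 'comment', or 'code' for each None (wildcard) segment."""
    contexts = []
    state = "code"  # code | single | double | line_comment | block_comment
    for segment in segments:
        if segment is None:
            if state in ("single", "double"):
                contexts.append("string")
            elif state in ("line_comment", "block_comment"):
                contexts.append("comment")
            else:
                contexts.append("code")
            continue
        i = 0
        while i < len(segment):
            ch = segment[i]
            two = segment[i:i + 2]
            if state == "code":
                if two == "//":
                    state = "line_comment"
                    i += 2
                    continue
                if two == "/*":
                    state = "block_comment"
                    i += 2
                    continue
                if ch == '"':
                    state = "double"
                elif ch == "'":
                    state = "single"
            elif state in ("single", "double"):
                if ch == "\\":
                    i += 2  # skip the escaped character
                    continue
                if (state == "double" and ch == '"') or (state == "single" and ch == "'"):
                    state = "code"
            elif state == "line_comment":
                if ch == "\n":
                    state = "code"
            elif state == "block_comment":
                if two == "*/":
                    state = "code"
                    i += 2
                    continue
            i += 1
    return contexts


def only_safe_contexts(segments):
    """True when every wildcard sits inside a string literal or a comment."""
    return all(c in ("string", "comment") for c in wildcard_contexts(segments))


print(wildcard_contexts(['var username = "', None, '";']))
print(only_safe_contexts(['var id = ', None, ';']))
```

The sketch only classifies the context. Making the rewritten application safe additionally requires that the dynamic value be delivered in a way that cannot break out of that context, which is the part the transformation itself has to handle.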
73. Evaluation
• Security
  - Crafted exploits for applications with known vulnerabilities
  - Transformed applications, along with CSP, blocked the exploits
• Functional correctness
  - ChronoZoom had 160 JavaScript tests and all passed after the transformation
  - Manually browsed the application and source code looking for missing inline JavaScript
75. Here we are going to look at what percentage of the inline JavaScript in each application is either: static, safe dynamic, or unsafe dynamic.
[Chart: percentage (0% to 100%) of Static, Safe Dynamic, and Unsafe Dynamic inline JavaScript for BugTracker.NET, BlogEngine.NET, BlogSA.NET, ScrewTurn Wiki, WebGoat.NET, and ChronoZoom]
80. In cases of unsafe dynamic inline JavaScript, we alert the developer that the transformation could potentially contain an XSS vulnerability. After the developer confirms the absence of an XSS vulnerability in the unsafe dynamic inline JavaScript, then the application is guaranteed free of XSS vulnerabilities.
[Chart: percentage (0% to 100%) of Static, Safe Dynamic, and Unsafe Dynamic inline JavaScript for BugTracker.NET, BlogEngine.NET, BlogSA.NET, ScrewTurn Wiki, WebGoat.NET, and ChronoZoom, with per-bar counts]
81. Limitations
• Might miss inline JavaScript
  - Loops
  - Dynamic code execution
• Does not handle HTML attributes and CSS
82. Summary
• Code and Data separation necessary to prevent XSS
• deDacota can automatically separate Code and Data of web application
• deDacota works in practice
Editor's Notes
Meeting Notes (11/7/13 11:22)
• [3] We want to fundamentally solve XSS vulnerabilities.
• [8:30] Server-side: Traditional XSS attacks. Result of server-side code.
• [12] Branches. Loops.
• Just say we extract all the possible inline JavaScript from the approximation graph.
• We solved the problem! Hurray! Then talk about dynamic JS.
• The developer is choosing to break the code/data separation model. This is fundamentally a bad thing. However, we developed a technique to handle some of these cases.
• Missing inline JavaScript: dynamic code, loops.