ACM Chicago Chapter seminar 
September 10, 2014 
Loyola University Chicago 
Computer Science Department 
Web security: Securing 
untrusted web content at 
browsers 
Phu H. Phung 
University of Illinois at Chicago
Web pages are rendered in browsers
• Web pages contain JavaScript code, a scripting language run in browsers
• JavaScript can provide a lot of functionality and rich interactions
92% of all websites use 
JavaScript [w3techs.com] 
“88.45% of the Alexa top 10,000 
web sites included at least one 
remote JavaScript library” 
CCS’12
Web-based mobile 
applications 
• HTML5 + JavaScript is a new trend for mobile 
developers 
o Cross-platform development, "Write once, run everywhere"
Third-party JavaScript is 
everywhere 
• Advertisements 
o Adhese ad network 
• Social web 
o Facebook Connect 
o Google+ 
o Twitter 
o FeedBurner
• Tracking 
o Scorecardresearch 
• Web Analytics 
o Yahoo! Web Analytics
o Google Analytics 
• … 
Two basic composition 
techniques 
• Iframe integration
<html><body>
…
<iframe src="http://3rdparty.com/frame.html">
</iframe>
…
</body></html>
Two basic composition 
techniques 
• Script inclusion
<html><body>
…
<script src="http://3rdparty.com/script.js">
</script>
…
</body></html>
Third-party JavaScript issues 
• An included third-party script runs with the same privileges as the hosting page.
• Security issues:
o Malicious third-party code
o A trusted third party is compromised
o Confidentiality, integrity, and other security risks
Difficult issues with 
JavaScript 
• JavaScript is a powerful language, but the 
language design is bad for security, e.g.: 
o Dynamic scripts: document.write, eval, ... 
o Encapsulation leakage 
o ... 
A lot of attacks were launched in practice
<script>
document.write('<scr');
document.write('ipt> malic');
var i = 1;
document.write('ious code; </sc');
document.write('ript>');
</script>
<script> malicious code; </script>
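The snippet above can be replayed outside a browser to see why a filter that inspects each written string in isolation misses the attack. This is only a sketch: `fakeDocument` and `naiveFilter` are hypothetical stand-ins introduced for illustration, not browser APIs or any real filter.

```javascript
// Hypothetical stand-in for document.write: it only collects the written chunks.
var written = [];
var fakeDocument = { write: function (s) { written.push(s); } };

// Hypothetical naive filter: flags a string only if it contains a script tag.
function naiveFilter(s) {
  return /<script\b/i.test(s);
}

fakeDocument.write('<scr');
fakeDocument.write('ipt> malic');
fakeDocument.write('ious code; </sc');
fakeDocument.write('ript>');

// No single chunk looks dangerous to the filter ...
var anyChunkFlagged = written.some(naiveFilter);
// ... but the concatenated stream, which is what the HTML parser sees, does.
var assembled = written.join('');
var assembledFlagged = naiveFilter(assembled);

console.log(anyChunkFlagged, assembledFlagged, assembled);
```

The point of the sketch is that the decision has to be made on what the browser finally interprets, not on the individual arguments passed to document.write.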
Samy attack on MySpace
• MySpace tried to filter out JavaScript code in user data
• BUT: The malicious code was injected in a "strange" way that escaped the filter
<div id=mycode style="BACKGROUND: url('java 
script:eval(document.all.mycode.expr)')" 
expr="var B=String.fromCharCode(34);………"> 
</div>
Another real-world attack
• Million Browser Botnet (July 2013), Jeremiah Grossman & Matt Johansen, WhiteHat Security
o Leverage advertising networks, using JavaScript to launch application-level DDoS
o Paid 2 ad networks to display treacherous advertisements on pages visited by hundreds of thousands of people
(Malicious code runs automatically without the user's knowledge)
A recent attack on the Reuters homepage (June 2014)
• The Reuters website was compromised by the Syrian Electronic Army
o By code injection via a compromised third-party ad network.
State-of-the-art
• Limit third-party code to a safe subset of JavaScript
o Facebook JS, ADSafe, ADSafety, ...
o Drawback: no compatibility with existing scripts
• Browser-based sandboxing solutions
o ConScript, WebJail, Contego, ...
o Drawback: browser modifications imply short-term deployment issues
• Server-side transformations of scripts to be included
o Google Caja, BrowserShield, ...
o Drawbacks: no direct script delivery to the browser; great runtime overhead
Our approach 
Lightweight Self-Protecting JavaScript 
• A behavioral sandbox model for 
JavaScript 
o Using only JS libraries and wrappers 
o No browser modification is required 
o The original JS code is kept unmodified
o Easily deals with the dynamic features of JavaScript
API call interception
[Figure: JavaScript execution environment (e.g. browsers). Labels: native implementations (alert implementation); code pointers; user functions (alert('Hi!'), window.alert); unique alert wrapper (+policy code); attacker code (alert = function(){...};); alert wrapper (enforced by SPJS)]
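The interception idea in the figure can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SPJS implementation: `env` is a hypothetical object standing in for the browser's window, and `wrap` and the counting policy are names invented for this example.

```javascript
// Hypothetical environment object standing in for the browser's window.
var env = {
  alert: function (msg) { return 'native alert: ' + msg; }
};

// Replace obj[name] with a wrapper. The original function is kept only in
// the closure, so code that runs later can reach it only through the wrapper.
function wrap(obj, name, policy) {
  var original = obj[name];
  obj[name] = function () {
    var args = Array.prototype.slice.call(arguments);
    if (!policy(args)) {
      return undefined;            // policy says no: suppress the call
    }
    return original.apply(obj, args);
  };
}

// Example policy (an assumption for illustration): allow at most 2 alerts.
var count = 0;
wrap(env, 'alert', function (args) {
  count += 1;
  return count <= 2;
});

var r1 = env.alert('Hi!');     // allowed
var r2 = env.alert('again');   // allowed
var r3 = env.alert('third');   // suppressed by the policy

console.log(r1, r2, r3);
```

In a browser the wrapper would have to be installed before any other script runs, and a real implementation would also need to consider other ways of obtaining the native function; this sketch does not address that.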
Deployment illustration
<html>
<head>
<script src="selfprotectingJS.js"></script>
<title>Self-protecting JavaScript </title>
<meta content=…> <style>…</style>
<script>…</script>
<!-- more heading setting -->
</head>
<body>
<script type="text/javascript">
(function() {..})();
</script>
<!-- the content of page -->
</body>
</html>
• Policy code and enforcement code are defined in a text file
• The enforcement code can be deployed anywhere: server side, proxy or browser plug-in, i.e. no need for a modified browser
• The original code is not syntactically modified
Runtime overhead, slowdown (times): Self-Protecting 6.33, BrowserShield 66.03
Effectiveness 
• Defends against almost all known XSS attack vectors
o 34 out of 38 successful attack vectors
• Provides Security Policy Patterns to build realistic policies, e.g. to prevent the Firesheep attack on Facebook
• Defends against real-world exploits
o phpBB 2.0.18 vulnerabilities: a stored XSS attack
o WebCal vulnerabilities: a reflected XSS attack
Lightweight Self-Protecting 
JavaScript 
Safe Wrappers and Sane 
Policies for 
Self-Protecting JavaScript
SPJS with Untrusted JavaScript 
• No privilege distinction between hosting code and external code
[Figure: a page protected by Self-Protecting JavaScript code, mixing TRUSTED hosting code and UNTRUSTED external code]
Goals 
• Deploy SPJS in the context of untrusted JS 
o Load and execute untrusted code without pre-processing 
the code 
o No browser modification is required 
• Enforce modular, fine-grained, stateful security policies for a piece of untrusted code
o Protect the hosting page from untrusted code
• Robust to potential flaws in security policies
o Badly written policies should not break security
Two-tier Sandbox Architecture
• Base-line API implementation, in e.g. an 'api.js' file
• Sandbox running policy code, defined in a separate JS file, e.g. 'policy.js'
• Sandbox running untrusted code, defined in a separate file, e.g. 'untrusted.js'
• The policy code can only access the base-line API and provided wrapper functions
• The untrusted code can only access objects returned by the outer sandbox
• JavaScript environment, e.g. the DOM
Two-tier sandbox 
architecture 
var api = loadAPI(…);
var outerSandbox = cajaVM.compileModule(policyCode);
var enforcedAPI = outerSandbox(api);
var innerSandbox = cajaVM.compileModule(untrustedCode);
innerSandbox(enforcedAPI);
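The data flow of the two tiers can be tried without SES by swapping in a stand-in compiler. The sketch below is an assumption-laden illustration: `compileModuleStandIn` uses the Function constructor, which provides NO isolation (unlike cajaVM.compileModule), and the `show` API and both code strings are invented for the example.

```javascript
// Stand-in for cajaVM.compileModule: NOT isolating, for illustration only.
// It turns source text into a function that receives a single `api` argument.
function compileModuleStandIn(src) {
  return new Function('api', '"use strict";\n' + src);
}

// Base-line API (hypothetical): one operation that records what it is given.
var log = [];
var api = { show: function (msg) { log.push(msg); return true; } };

// Policy code (outer tier): returns the API that untrusted code may see.
// Example policy: only the first call to show() is let through.
var policyCode =
  'var calls = 0;' +
  'return { show: function (msg) {' +
  '  calls += 1;' +
  '  return calls <= 1 ? api.show(String(msg)) : false;' +
  '} };';

// Untrusted code (inner tier): it only ever receives the enforced API.
var untrustedCode = 'api.show("first"); return api.show("second");';

var outerSandbox = compileModuleStandIn(policyCode);
var enforcedAPI = outerSandbox(api);
var innerSandbox = compileModuleStandIn(untrustedCode);
var lastResult = innerSandbox(enforcedAPI);

console.log(log, lastResult);
```

The structure mirrors the slide: the untrusted code never holds a reference to `api` itself, only to whatever object the policy code chose to return.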
The architecture for multiple-principal untrusted code
[Figure: the base-line API implementation (in e.g. an 'api.js' file) shared by three policy sandboxes (Policy 1, Policy 2, Policy 3), each enclosing its own untrusted code]
Sandboxing untrusted code 
• Use the Secure ECMAScript (SES) library developed by the Google Caja team
o Load a piece of code to execute within an isolated environment
• The code can only interact with the outside world via provided APIs
var api = {...}; // constructing
var makeSandbox = cajaVM.compileModule(untrustedCodeSrc);
var sandboxed = makeSandbox(api);
Isolation technique: The SES 
library 
Object-capability environment 
• Scripts can access 
o Objects they create themselves 
o Objects explicitly handed to them 
[Figure: untrustedCode runs inside a sandbox, separated from the global context, with the API as its only connection to the outside]
Policy definition 
• Base-line APIs implementation 
o Can enforce coarse-grained, generic policies, e.g.: 
• Sanitize HTML 
• Ensure complete mediation 
• Fine-grained policies for multiple pieces of untrusted JavaScript code
o Modular, principal-specific, e.g.: script1 is allowed to read/write reg_A, script2 is allowed to read reg_A
o Stateful, e.g.: limit the number of popups to 3
o Cross-principal stateful policies, e.g.: after script1 writes to reg_A, disallow access from script2 to reg_A
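The three kinds of fine-grained policy listed above can be sketched as plain closures over shared state. Everything here is hypothetical and chosen for illustration: `makePolicies`, the per-principal objects, and the convention that a denied read returns null are assumptions, not the format used by the actual system.

```javascript
// Builds one enforced API object per principal; state is shared via closure.
function makePolicies() {
  var regA = '';                 // the protected region, modelled as a string
  var popups = 0;                // stateful: number of popups opened so far
  var script1HasWritten = false; // cross-principal state

  return {
    // script1 may read and write reg_A.
    script1: {
      readRegA: function () { return regA; },
      writeRegA: function (v) {
        regA = String(v);
        script1HasWritten = true;
        return true;
      },
      // Stateful policy: at most 3 popups.
      openPopup: function () { popups += 1; return popups <= 3; }
    },
    // script2 may only read reg_A, and only until script1 has written to it.
    script2: {
      readRegA: function () { return script1HasWritten ? null : regA; }
    }
  };
}

var p = makePolicies();
var beforeWrite = p.script2.readRegA();   // allowed: nothing written yet
p.script1.writeRegA('secret');
var afterWrite = p.script2.readRegA();    // denied after script1's write
var popupResults = [1, 2, 3, 4].map(function () { return p.script1.openPopup(); });

console.log(beforeWrite, afterWrite, popupResults);
```

Each principal would be handed only its own object, so the modularity comes from which capabilities are passed to which sandbox.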
Base-line APIs 
implementation 
• Create a Virtual DOM 
o Intercepting wrapper around real DOM 
o Consult security policy on each operation 
o Use Harmony Proxies to generically intercept 
property accesses on objects 
• Virtual DOM implementation uses the 
Membrane Pattern 
o Wrap any object passed from DOM to sandbox 
(return values) 
o Unwrap any object passed from sandbox to DOM 
(arguments) 
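The wrap/unwrap discipline of the Membrane Pattern can be illustrated with the standardized ES2015 Proxy and Reflect APIs, which Harmony Proxies later became. This is a simplified sketch, not the Virtual DOM implementation: `makeMembrane`, `realDoc`, and the cookie-blocking policy are assumptions, and objects created inside the sandbox are passed through unchanged, which a complete membrane would not do.

```javascript
// Minimal one-directional membrane around a real object graph.
function makeMembrane(realRoot, policy) {
  var realToWrapped = new WeakMap();
  var wrappedToReal = new WeakMap();

  function isObject(v) {
    return (typeof v === 'object' && v !== null) || typeof v === 'function';
  }

  // Sandbox -> real direction (arguments): strip the proxy if there is one.
  function unwrap(v) {
    return (isObject(v) && wrappedToReal.has(v)) ? wrappedToReal.get(v) : v;
  }

  // Real -> sandbox direction (return values): wrap in a policy-checking proxy.
  function wrap(real) {
    if (!isObject(real)) { return real; }
    if (realToWrapped.has(real)) { return realToWrapped.get(real); }
    var proxy = new Proxy(real, {
      get: function (target, prop) {
        if (!policy('get', String(prop))) {
          throw new Error('denied: get ' + String(prop));
        }
        return wrap(Reflect.get(target, prop));
      },
      set: function (target, prop, value) {
        if (!policy('set', String(prop))) {
          throw new Error('denied: set ' + String(prop));
        }
        return Reflect.set(target, prop, unwrap(value));
      },
      apply: function (target, thisArg, args) {
        return wrap(Reflect.apply(target, unwrap(thisArg), args.map(unwrap)));
      }
    });
    realToWrapped.set(real, proxy);
    wrappedToReal.set(proxy, real);
    return proxy;
  }

  return wrap(realRoot);
}

// Hypothetical "real DOM" and a policy that hides the cookie property.
var realDoc = {
  title: 'page',
  cookie: 'secret=1',
  body: { children: [] },
  isBody: function (node) { return node === realDoc.body; }
};
var doc = makeMembrane(realDoc, function (op, prop) { return prop !== 'cookie'; });

var sameIdentity = doc.body === doc.body;  // wrappers are cached per real object
var isWrapped = doc.body !== realDoc.body; // the sandbox never sees the real node
var unwrappedArg = doc.isBody(doc.body);   // the real side receives the real node
var denied = false;
try { doc.cookie; } catch (e) { denied = true; }

console.log(sameIdentity, isWrapped, unwrappedArg, denied);
```

Caching wrappers in WeakMaps keeps object identity stable on the sandbox side, and the policy is consulted on every property access, which is the complete-mediation property the slide asks for.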
Deployment model 
• Untrusted code is loaded into a string 
variable 
o Using server-side proxy + XMLHttpRequest (to 
overcome same origin policy) 
o CORS/UMP headers set by the script provider 
Before:
<script src="http://3rdparty.com/script.js"></script>
After:
<script src="ses.js"></script>
<script src="api.js"></script>
<script src="policy0.js"></script>
<script>
var script = get("http://3rdparty.com/script.js");
ses.execute(script, policy0);
</script>
Secure dynamic script 
evaluation 
• Special handlers to intercept all methods 
that allow script tags to be added 
o node.appendChild, node.insertBefore, node.replaceChild
o document.write, … 
o Event handlers in HTML, e.g. 
<…onclick="javascript:xyz(…)">
1. Parse partial DOM tree/HTML 
2. Execute scripts in the sandbox 
environment 
Case studies 
• Single principal code 
• Multiple-principal code 
o Context-aware ads 
Two-tier Sandbox 
Architecture 
A Two-tier Sandbox 
Architecture for 
Untrusted JavaScript 
JSand: complete client-side 
sandboxing of third-party 
JavaScript without 
browser modifications
A recently published work
Phu H. Phung, Maliheh Monshizadeh, Meera Sridhar, Kevin W. 
Hamlen, and V.N. Venkatakrishnan. 
Between Worlds: Securing Mixed JavaScript/ActionScript Multi-party 
Web Content. IEEE Transactions on Dependable and 
Secure Computing (TDSC), forthcoming. 
Extra slides 
Wrapper example 
Different parsing techniques 
• Via a sandboxed iframe
1. Create sandbox iframe
2. Set content via srcdoc attribute
o Better performance
o Parsed exactly as it will be interpreted by the browser
o Executed asynchronously
• (Alternative) Via an HTML parsing library in JavaScript
Loading additional code in 
the sandbox 
• Several use cases require external 
code to be executed in a previously 
set up sandbox 
o Loading API + glue code 
o Dynamic script loading 
• Two new operations: 
o innerEval(code) 
o innerLoadScript(url) 
