Sandboxing JavaScript via Libraries
and Wrappers
Phu H. Phung
University of Gothenburg, Sweden, and
University of Illinois at Chicago
About Me

• Recipient of an international postdoc grant (3
years) from the Swedish Research Council
(VR), employed by the Univ. of Gothenburg.
• Research Associate at UIC.
• PhD in Computer Science in 2011 from
Chalmers University, Sweden.

Hosted by OWASP & the NYC Chapter
• Selected research projects
– European WebSand (completed)
• End-to-end secure web framework

– Secure Web Advertisements, funded by NSF
(on-going)
– Defensive Optimizing Compiler, funded by
DARPA (on-going)

This talk

• Based on two published papers:
– PH Phung, L Desmet. A two-tier sandbox
architecture for untrusted JavaScript, invited
paper, JSTools’12.
– P Agten, S Van Acker, Y Brondsema, PH Phung, L
Desmet, F Piessens. JSand: complete client-side
sandboxing of third-party JavaScript without
browser modifications, ACSAC’12.
92% of all websites use JavaScript
[w3techs.com]

“88.45% of the Alexa top 10,000 web
sites included at least one remote
JavaScript library”
CCS’12

Third-party JavaScript is
everywhere
• Advertisements
– Adhese ad network

• Social web
– Facebook Connect
– Google+
– Twitter
– Feedburner

• Tracking
– Scorecardresearch

• Web Analytics
– Yahoo! Web Analytics
– Google Analytics

• …
Two basic composition
techniques

Iframe integration
<html><body>
…
<iframe src="http://3rdparty.com/frame.html">
</iframe>
…
</body></html>

3rd party

Two basic composition techniques

Script inclusion
<html><body>
…
<script src="http://3rdparty.com/script.js">
</script>
…
</body></html>
3rd party

Third-party JavaScript issues

• Scripts included from third parties run with the same
privileges as the hosting page.
• Security issues:
– Malicious third-party code
– A trusted third party is compromised
– Confidentiality, integrity, and other security risks

Difficult issues with JavaScript

• JavaScript is a powerful language, but its
design is bad for security, e.g.:
– Dynamic scripts: document.write, eval, ...
– Encapsulation leakage
– ...
• A lot of attacks were launched in practice, e.g.:
<script>
document.write('<scr');
document.write('ipt> malic');
var i= 1;
document.write('ious code; </sc');
document.write('ript>');
</script>

which the browser assembles into:
<script> malicious code; </script>

Malicious third-party JavaScript
example

The most reliable, cost effective
method to inject evil code is to buy
an ad.
Principles of Security. Douglas Crockford
http://fromonesrc.com/blog/page/2/
An attack scenario

Million Browser Botnet (July 2013)
Jeremiah Grossman & Matt Johansen, WhiteHat Security
– Leverage advertising networks using JavaScript
to launch application-level DDoS
– Paid 2 ad networks to display treacherous
advertisements on pages visited
by hundreds of thousands of people
– In one day, got 13.6 million views of the ads while spending less
than $100
State-of-the-art

• Limit third-party code to a safe subset of JavaScript
– Facebook JS, ADSafe, ADSafety, ...
– Drawback: no compatibility with existing scripts

• Browser-based sandboxing solutions
– ConScript, WebJail, Contego, ...
– Drawback: browser modifications imply short-term
deployment issues

• Server-side transformations of scripts to be included
– Google Caja, BrowserShield, ...
– Drawbacks: no direct script delivery to the browser,
high runtime overhead
Our approach

• A sandbox model for third-party JavaScript
– Using only JS libraries and wrappers
– Whitelist (least-privilege) implementation approach
• Only properties and objects defined in policies are
available to the untrusted code

– No browser modification is required
– The third-party code is kept in its original form
– Deals easily with the dynamic features of JavaScript

“Lightweight Self-Protecting JavaScript”, ASIACCS’09
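The whitelist idea can be illustrated with a minimal sketch. The helper `restrict` and the `fakeDocument` object are hypothetical names invented for this example; they are not part of the talk's framework, and this code alone does not isolate anything. It only shows how a policy can expose a least-privilege view of a powerful object.

```javascript
// Hypothetical helper (not from the talk): build a view that exposes
// only the whitelisted members of a target object.
function restrict(target, allowedNames) {
  const view = Object.create(null); // no inherited properties to leak
  for (const name of allowedNames) {
    const value = target[name];
    // Bind methods so they still operate on the real target.
    view[name] = typeof value === 'function' ? value.bind(target) : value;
  }
  return Object.freeze(view); // untrusted code cannot extend the view
}

// Stand-in for a powerful host object (assumption, not a browser API).
const fakeDocument = {
  title: 'Home',
  cookie: 'session=secret',
  getTitle() { return this.title; },
  getCookie() { return this.cookie; },
};

// The policy whitelists getTitle only; getCookie is simply absent.
const policyView = restrict(fakeDocument, ['getTitle']);
console.log(policyView.getTitle());
console.log('getCookie' in policyView);
```

Anything not named in the policy never reaches the untrusted code, which is the opposite of a blacklist that has to enumerate every dangerous member.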
Two-tier sandbox architecture
• Base-line API implementation, e.g. in an 'api.js' file
• Outer sandbox running policy code, defined in a
separate JS file, e.g. 'policy.js'
– The policy code can only access the base-line API
and provided wrapper functions
• Inner sandbox running untrusted code, defined in a
separate file, e.g. 'untrusted.js'
– The untrusted code can only access objects
returned by the outer sandbox
• JavaScript environment, e.g. the DOM
Two-tier sandbox architecture

var api = loadAPI(…);
var outerSandbox = cajaVM.compileModule(policyCode);
var enforcedAPI = outerSandbox(api);
var innerSandbox = cajaVM.compileModule(untrustedCode);
innerSandbox(enforcedAPI);
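The composition above can be tried without the SES library using a stand-in. In this sketch, `compileModuleStandIn` is a hypothetical replacement for `cajaVM.compileModule` built on `new Function`; it provides NO isolation and only mimics the call shape. The `setText` API and the "ad-" id convention are likewise assumptions made for illustration.

```javascript
// Stand-in for cajaVM.compileModule (assumption): turns source text into
// a function of one parameter named `api`. Unlike SES, this does NOT
// isolate the code from the global environment.
function compileModuleStandIn(source) {
  return new Function('api', '"use strict";\n' + source);
}

// Base-line API (hypothetical): writes text into named elements.
const api = {
  elements: {},
  setText(id, text) { this.elements[id] = text; },
};

// Policy code: receives the base-line API, returns the enforced API.
const policyCode = `
  return {
    setText: function (id, text) {
      if (id.indexOf('ad-') !== 0) { return false; } // deny other elements
      api.setText(id, text);
      return true;
    }
  };
`;

// Untrusted code: only ever sees the enforced API under the name `api`.
const untrustedCode = `
  return [api.setText('ad-banner', 'Buy!'), api.setText('login-form', 'phish')];
`;

const outerSandbox = compileModuleStandIn(policyCode);
const enforcedAPI = outerSandbox(api);
const innerSandbox = compileModuleStandIn(untrustedCode);
const results = innerSandbox(enforcedAPI);
console.log(results, api.elements);
```

The point of the two tiers is that the policy itself runs with limited authority: it can reach the base-line API but nothing else, and the untrusted code can reach only what the policy returns.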
The architecture for multiple-principal untrusted code

Base-line API implementation, e.g. in an 'api.js' file
• Policy 1: untrusted code
• Policy 2: untrusted code
• Policy 3: untrusted code
Sandboxing untrusted code

• Use the Secure ECMAScript (SES) library
developed by the Google Caja team
– Load a piece of code to execute within an isolated
environment
• The code can only interact with the outside world via the provided
APIs

var api = {...}; // constructing the API object
var makeSandbox = cajaVM.compileModule(untrustedCodeSrc);
var sandboxed = makeSandbox(api);
Isolation technique: The SES library

Object-capability environment
• Scripts can access
– Objects they create themselves
– Objects explicitly handed to them

[Figure: untrustedCode, API, sandbox, global context]

Isolation technique: The SES library

Base-line APIs implementation

• Create a Virtual DOM
– Intercepting wrapper around real DOM
– Use Harmony Proxies to generically intercept
property accesses on objects

• Virtual DOM implementation uses the
Membrane Pattern
– Wrap any object passed from DOM to sandbox (return
values)
– Unwrap any object passed from sandbox to DOM
(arguments)
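A minimal sketch of the membrane pattern, using the standard `Proxy` and `Reflect` objects (the standardized form of Harmony Proxies). The names `makeMembrane`, `onAccess`, and `realDom` are hypothetical and the object being wrapped is a plain stand-in, not a real DOM; a production virtual DOM would need many more traps and policy checks.

```javascript
// Hypothetical membrane: values leaving the real side are wrapped,
// values entering it are unwrapped. onAccess sees every property read.
function makeMembrane(realRoot, onAccess) {
  const realToWrapped = new WeakMap();
  const wrappedToReal = new WeakMap();

  const isObject = (v) =>
    (typeof v === 'object' && v !== null) || typeof v === 'function';

  const unwrap = (v) => (wrappedToReal.has(v) ? wrappedToReal.get(v) : v);

  function wrap(value) {
    if (!isObject(value)) return value;            // primitives pass through
    if (realToWrapped.has(value)) return realToWrapped.get(value);
    const proxy = new Proxy(value, {
      get(target, name) {
        onAccess(name);                             // interception point
        return wrap(Reflect.get(target, name));    // wrap return values
      },
      apply(target, thisArg, args) {
        // Unwrap receiver and arguments before touching the real side.
        const result = Reflect.apply(target, unwrap(thisArg), args.map(unwrap));
        return wrap(result);
      },
    });
    realToWrapped.set(value, proxy);
    wrappedToReal.set(proxy, value);
    return proxy;
  }
  return wrap(realRoot);
}

// Stand-in for the real DOM (assumption for the example).
const realDom = {
  children: [],
  createElement(tag) { return { tag, children: [] }; },
  appendChild(node) { this.children.push(node); return node; },
};

const seen = [];
const virtualDom = makeMembrane(realDom, (name) => seen.push(String(name)));
const node = virtualDom.createElement('div'); // return value is wrapped
const same = virtualDom.appendChild(node);    // argument is unwrapped
console.log(seen, realDom.children[0].tag);
```

Because every object crossing the boundary is wrapped, the sandboxed code never holds a direct reference to a real node, which is what makes complete mediation possible.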
Wrapper example

Policy definition

• Base-line API implementation
– Can enforce coarse-grained, generic policies, e.g.:
• Sanitize HTML
• Ensure complete mediation

• Fine-grained policies for multiple pieces of untrusted
JavaScript code
– Modular, principal-specific, e.g.: script1 is allowed to read/write
elemt_A, script2 is allowed to read elemt_A
– Stateful, e.g.: limit the number of popups to 3
– Cross-principal stateful policies, e.g.: after script1 writes to
elemt_A, disallow access from script2 to elemt_A
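The stateful and cross-principal examples above can be sketched as ordinary closures. The helpers `limitCalls` and `makeElementPolicy`, and the `store` object standing in for the page, are hypothetical names for this illustration; the talk does not show its policy code.

```javascript
// Stateful policy (hypothetical helper): allow at most maxCalls calls.
function limitCalls(fn, maxCalls) {
  let calls = 0;
  return function (...args) {
    if (calls >= maxCalls) return undefined; // silently deny further calls
    calls += 1;
    return fn.apply(this, args);
  };
}

const opened = [];
const realOpen = (url) => { opened.push(url); return { url }; };
const guardedOpen = limitCalls(realOpen, 3);
['a.html', 'b.html', 'c.html', 'd.html'].forEach((u) => guardedOpen(u));

// Cross-principal stateful policy: once script1 has written elemt_A,
// script2 loses its read access. State is shared between both views.
function makeElementPolicy(store) {
  let script1HasWritten = false;
  return {
    forScript1: {
      read() { return store.elemt_A; },
      write(value) { script1HasWritten = true; store.elemt_A = value; },
    },
    forScript2: {
      read() {
        if (script1HasWritten) throw new Error('access to elemt_A denied');
        return store.elemt_A;
      },
    },
  };
}

const store = { elemt_A: 'initial' };
const policy = makeElementPolicy(store);
const firstRead = policy.forScript2.read(); // allowed before any write
policy.forScript1.write('updated');
let denied = false;
try { policy.forScript2.read(); } catch (e) { denied = true; }
console.log(opened.length, firstRead, denied);
```

Each principal receives only its own view object, so the shared state stays inside the policy and cannot be reset by either script.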
Deployment model

• Untrusted code is loaded into a string variable
– Using a server-side proxy + XMLHttpRequest (to
overcome the same-origin policy)
– Or using CORS (Cross-Origin Resource Sharing) /
UMP (Uniform Messaging Policy) headers set by
the script provider

Before:
<script src="http://3rdparty.com/script.js">
</script>

After:
<script src="ses.js"></script>
<script src="api.js"></script>
<script src="policy0.js"></script>
<script>
var script = get("http://3rdparty.com/script.js");
ses.execute(script,policy0);
</script>
Secure dynamic script
evaluation

• Special handlers to intercept all methods that
allow script tags to be added
– node.appendChild, node.insertBefore, node.replaceChild,
node.insertAfter
– document.write, …
– Event handlers in HTML, e.g.
<…onclick="javascript:xyz(…)">

1. Parse partial DOM tree/HTML
2. Execute scripts in the sandbox environment
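The two steps can be sketched for the `document.write` case, where a script tag may arrive split across several calls. This is a rough illustration only: `makeGuardedWriter`, `runInSandbox`, and `emitHtml` are hypothetical names, and the regular expression is a crude stand-in for real HTML parsing (the talk's parsing techniques rely on a sandboxed iframe or an HTML parsing library instead).

```javascript
// Hypothetical handler for intercepted document.write calls: fragments
// are buffered, complete <script> elements are routed to the sandbox,
// and any markup preceding them is passed on as plain HTML.
function makeGuardedWriter(runInSandbox, emitHtml) {
  let buffer = '';
  // Crude pattern for one complete script element (assumption: real
  // implementations parse HTML properly rather than using a regex).
  const scriptPattern = /<script\b[^>]*>([\s\S]*?)<\/script\s*>/i;

  return {
    write(fragment) {
      buffer += fragment;
      let match;
      while ((match = scriptPattern.exec(buffer)) !== null) {
        const before = buffer.slice(0, match.index);
        if (before) emitHtml(before);
        runInSandbox(match[1]);                 // never run in the page
        buffer = buffer.slice(match.index + match[0].length);
      }
    },
    // Markup still waiting for more fragments.
    pending() { return buffer; },
  };
}

const sandboxed = [];
const emitted = [];
const writer = makeGuardedWriter((src) => sandboxed.push(src),
                                 (html) => emitted.push(html));

// The split-tag trick from earlier in the talk no longer bypasses anything.
writer.write('<scr');
writer.write('ipt> malic');
writer.write('ious code; </sc');
writer.write('ript>');
writer.write('<p>hi</p><script>x()</script>');
console.log(sandboxed, emitted, writer.pending());
```

Buffering is what defeats the fragment-by-fragment obfuscation: the decision is made on the assembled markup, not on each individual write.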
Dynamic script loading in
JavaScript

• Example from Google Maps

Different parsing techniques

• Via a sandboxed iframe
1. Create sandbox iframe
2. Set content via srcdoc attribute
– Better performance
– Parsed exactly as the browser will interpret it
– Executed asynchronously

• (Alternative) Via an HTML parsing library in
JavaScript

Loading additional code in the
sandbox

• External code needs to be executed in a
previously set up sandbox
– Loading API + glue code
– Dynamic script loading

• Two new operations:
– innerEval(code)
– innerLoadScript(url)
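One way to picture the two operations is as methods of an already constructed sandbox that reuse its API object. The sketch below is an assumption about their shape, not the JSand implementation: `makeSandboxStandIn` uses `new Function` (no isolation), `fetchSource` is a hypothetical synchronous loader injected by the caller, and the URL is a placeholder.

```javascript
// Hypothetical sandbox stand-in exposing the two operations. A real
// implementation would evaluate code inside the SES environment and
// would likely load scripts asynchronously.
function makeSandboxStandIn(api, fetchSource) {
  function innerEval(code) {
    // Every piece of code sees the same `api` as the original script.
    return new Function('api', '"use strict";\n' + code)(api);
  }
  function innerLoadScript(url) {
    return innerEval(fetchSource(url)); // fetch as text, then evaluate inside
  }
  return { innerEval, innerLoadScript };
}

// Placeholder script store instead of a network request (assumption).
const sources = {
  'http://3rdparty.example/extra.js': "api.record('extra loaded'); return 2;",
};
const recorded = [];
const sandbox = makeSandboxStandIn(
  { record: (msg) => recorded.push(msg) },
  (url) => sources[url]
);

const evalResult = sandbox.innerEval("api.record('glue'); return 1 + 1;");
const loadResult = sandbox.innerLoadScript('http://3rdparty.example/extra.js');
console.log(evalResult, loadResult, recorded);
```

Routing dynamically loaded scripts through the same sandbox keeps later code under the policy that was set up for the first script.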

Case studies

• Single principal code

• Multiple-principal code
– Context-aware ads
Implementation challenges

• Legacy scripts need additional preprocessing to be compatible with the
framework
– Secure ECMAScript restrictions
• A subset of ECMAScript strict mode
• Global variables aliased as window properties
• No 'this' auto-coercion
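Two of these restrictions come straight from strict mode and can be observed in any modern engine. The sketch below uses the `Function` constructor so that each snippet's strictness is explicit; the variable names are made up for the example, and it shows only the language behaviour that legacy scripts tend to rely on, not the framework's preprocessing itself.

```javascript
// 'this' auto-coercion: sloppy code called without a receiver sees the
// global object, strict code sees undefined.
const sloppyThis = new Function('return this;');
const strictThis = new Function('"use strict"; return this;');
console.log(sloppyThis() === globalThis);
console.log(strictThis() === undefined);

// Implicit globals: sloppy code creates a property on the global object
// (window in browsers) when assigning to an undeclared name; strict code
// throws instead.
const sloppyAssign = new Function('legacyCounter = 1; return typeof legacyCounter;');
const strictAssign = new Function('"use strict"; legacyCounterStrict = 1;');

const sloppyType = sloppyAssign();
const leaked = Object.prototype.hasOwnProperty.call(globalThis, 'legacyCounter');

let strictError = null;
try { strictAssign(); } catch (e) { strictError = e; }
console.log(sloppyType, leaked, strictError instanceof ReferenceError);

delete globalThis.legacyCounter; // clean up the implicit global
```

A legacy script that depends on either behaviour has to be rewritten before it can run under SES, which is why the preprocessing step is needed.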
JS transformation examples

Summary

– A client-side JavaScript architecture for
untrusted JavaScript
• Only using libraries and wrappers

– Complete mediation using Secure
ECMAScript
• DOM node operations
• JavaScript APIs

– Backward compatibility
• No browser modifications
• Direct script delivery to the browser
• Support for legacy scripts