BrowserShield: Vulnerability-Driven Filtering of Dynamic HTML

    Charles Reis∗           John Dunagan†               Helen J. Wang†      Opher Dubrovsky†             Saher Esmeir‡


    ∗ University of Washington CS Dept., creis@cs.washington.edu
    † Microsoft, {jdunagan, helenw, opherd}@microsoft.com
    ‡ Technion CS Dept., esaher@cs.technion.ac.il

Abstract

Vulnerability-driven filtering of network data can offer a fast and easy-to-deploy alternative or intermediary to software patching, as exemplified in Shield [43]. In this paper, we take Shield's vision to a new domain, inspecting and cleansing not just static content, but also dynamic content. The dynamic content we target is the dynamic HTML in web pages, which have become a popular vector for attacks. The key challenge in filtering dynamic HTML is that it is undecidable to statically determine whether an embedded script will exploit the browser at run-time. We avoid this undecidability problem by rewriting web pages and any embedded scripts into safe equivalents, inserting checks so that the filtering is done at run-time. The rewritten pages contain logic for recursively applying run-time checks to dynamically generated or modified web content, based on known vulnerabilities. We have built and evaluated BrowserShield, a system that performs this dynamic instrumentation of embedded scripts, and that admits policies for customized run-time actions like vulnerability-driven filtering.

1 Introduction

Web browsers have become an important interface between users and many electronic services such as information access, personal communications, office tasks, and e-commerce. The importance of web browsers is accompanied by rich functionality and extensibility, which arguably have also contributed to their popularity as a vector of attack. During the year 2005, 8 out of 29 critical Microsoft security bulletins, corresponding to 19 vulnerabilities, were due to flaws in Internet Explorer (IE) or its extensions such as ActiveX controls [1]. There were also 6 security bulletins for Firefox [14], corresponding to 59 vulnerabilities over the same period of time.

To date, the primary way to defend against browser vulnerabilities is through software patching. However, studies have shown that the deployment of software patches is often delayed after the patches become available. Services such as Windows Update download patches automatically, but typically delay enactment if the patch requires a reboot or application restart. This delay helps both home and corporate users to save work and schedule downtime. An additional delay in the corporate setting is that patches are typically tested prior to deployment, to avoid the potentially high costs for recovering from a faulty patch [5].

As a result, there is a dangerous time window between patch release and patch application during which attackers often reverse-engineer patches to gain vulnerability knowledge and then launch attacks. One study showed that a large majority of existing attacks target known vulnerabilities [4].

For vulnerabilities that are exploitable through application-level protocols (e.g., HTTP, RPC), previous work, Shield [43], addresses the patch deployment problem by filtering malicious traffic according to vulnerability signatures at a firewall above the transport layer. The vulnerability signatures consist of a vulnerability state machine that characterizes all possible message sequences that may lead to attacks, along with the message formats that can trigger the exploitation of the application (e.g., an overly long field of a message that triggers a buffer overrun). The key characteristic of this approach is that it cleanses the network data without modifying the code of the vulnerable application. This data-driven approach makes signature deployment (and removal if needed) easier than it is for patches. Vulnerability signature deployment can be automatic rather than user-driven and use the same deployment model as anti-virus signatures.

These desirable features of vulnerability-driven filtering motivated us to explore its potential for exploit removal in web pages. The Shield approach is able to filter static HTML pages by treating HTML as another protocol layer over HTTP. However, the challenge lies in dynamic HTML, where pages can be dynamically generated or modified through scripts embedded in the page: attackers could easily evade Shield filters by using scripts to generate malicious web content at run-time, possibly with additional obfuscation. Determining whether a
script will eventually exploit a vulnerability is undecidable. Our approach to cleansing dynamic content is to rewrite HTML pages and any embedded scripts into safe equivalents before they are rendered by the browser. The safe equivalent pages contain logic for recursively applying run-time checks to dynamically generated or modified web content, based on known vulnerabilities. To this end, we have designed BrowserShield, a system that performs dynamic instrumentation of embedded scripts and that admits policies for changing web page behavior. A vulnerability signature is one such policy, which sanitizes web pages according to a known vulnerability. Figure 1 gives an overview of the BrowserShield system, showing how it transforms HTML and JavaScript using a set of policies. Our system focuses on JavaScript because it is the predominant scripting language used on the web; a full-fledged system would require additionally rewriting or disabling VBScript and any other script languages used by web browsers that BrowserShield protects.

Figure 1: The BrowserShield System (diagram: Web server -> HTML + JavaScript -> BrowserShield Logic Injector, configured with Policies -> Shielded HTML + JavaScript -> Client Browser)

Our general approach of code rewriting for interposition has been used in other contexts. Code rewriting has been used to isolate faults of software extensions [41]. Java bytecode rewriting has been used to enable security policies [10, 37], such as stack inspection policies for access control. However, rewriting script code for web browsers poses additional challenges: JavaScript is a prototype-based language, and the combination of this with JavaScript's scoping rules, implicit garbage collection and pervasive reflection required a number of techniques not needed by previous rewriting work in other contexts.

We have designed BrowserShield to adhere to well-established principles for protection systems: complete interposition of the underlying resource (i.e., the HTML document tree), tamper-proofness and transparency [3, 10, 33]. In addition, BrowserShield is a general framework that supports applications other than vulnerability-driven filtering. For example, we have authored policies that add UI invariants to prevent certain phishing attempts.

Because BrowserShield protects web browsers by transforming their inputs, not the browser itself, the BrowserShield logic injector can be deployed at client or edge firewalls, browser extensions, or web publishers that republish third-party content such as ads.

We have implemented a prototype of the BrowserShield system, in which the rewriting logic is injected into a web page at an enterprise firewall and executed by the browser at rendering time. Our prototype can transparently render many familiar websites that contain JavaScript (e.g., www.google.com, www.cs.washington.edu, www.mit.edu). We also successfully translated and ran a large intranet portal application (Microsoft SharePoint) that uses 549 KB of JavaScript libraries.

We chose the firewall deployment scenario because it offers the greatest manageability benefit, as BrowserShield updates can be centralized at the firewall, immediately protecting all client machines in the organization without any BrowserShield-related installation at either clients or web servers. The main disadvantage of this deployment scenario is that firewalls have no visibility into end-to-end encrypted traffic. Nevertheless, commercial products [35] already exist that force traffic crossing the organization boundary to use the firewall (instead of a client within the organization) as the encryption endpoint, trading client privacy for aggregate organization security. Also, the browser extension and web publisher deployment scenarios transparently handle encrypted traffic.

Our evaluation focuses on the effectiveness of the BrowserShield design and the performance of our implementation. Our analysis of recent IE vulnerabilities shows that BrowserShield significantly advances the state of the art; existing firewall and anti-virus techniques alone can only provide patch-equivalent protection for 1 of the 8 IE patches from 2005, but combining these two with BrowserShield is sufficient to cover all 8. We evaluated BrowserShield's performance on real-world pages containing over 125 KB of JavaScript. Our evaluation shows a 22% increase in firewall CPU utilization, and client rendering latencies that are comparable to the original page latencies for most pages.

The rest of the paper is organized as follows: In Section 2 we describe a typical browser vulnerability that we would like to filter. We discuss the design of BrowserShield in Section 3, and give BrowserShield's JavaScript rewriting approach in detail in Section 4. We describe our implementation in Section 5. In Section 6 we give our evaluation of BrowserShield. We discuss related work in Section 7, and conclude in Section 8.

2 A Motivating Example

As a motivating example of vulnerability-driven filtering, we consider MS04-040: the HTML Elements Vulnerability [28] of IE from December, 2004. In this vulnerability, IE had a vulnerable buffer that was overrun if
both the name and the src attributes were too long in an iframe, frame, or embed HTML element.

Figure 2 shows a corresponding snippet of JavaScript code that can be used to identify and to remove exploits of this vulnerability. As input, the function takes an object representing an HTML tag, including an associative array of its attributes. When invoked on an <iframe>, <frame> or <embed> tag, the function determines whether the relevant attributes exceed the size of the vulnerable buffer.

    function (tag) {
      var len = 255; // not the actual limit
      // Look for long attribute values
      if ((contains("name", tag.attrs) &&
           tag.attrs["name"].length > len) &&
          (contains("src", tag.attrs) &&
           tag.attrs["src"].length > len)) {
        // Remove all attributes to be safe
        tag.attrs = [];
        // Return false to indicate exploit
        return false;
      }
      // Return true to indicate safe tag
      return true;
    }

Figure 2: JavaScript code snippet to identify exploits of the MS04-040 vulnerability

The goal of BrowserShield is to take this vulnerability-specific filtering function as a policy and apply it to all occurrences of the vulnerable tags whether they are in static HTML pages or dynamically generated by scripts. The framework could react in many ways to detected exploits; our current system simply stops page rendering and notifies the user. Vulnerability-driven filtering, used as a patch alternative or intermediary, should prevent all exploits of the vulnerability (i.e., zero false negatives), and should not disrupt any exploit-free pages (i.e., zero false positives). We design BrowserShield to meet these requirements.

3 Overview

The BrowserShield system consists of a JavaScript library that translates web pages into safe equivalents and a logic injector (such as a firewall) that modifies web pages to use this library.

BrowserShield uses two separate translations along with policies that are enforced at run-time. The first translation, T_HTML, translates the HTML: It tokenizes an HTML page, modifies the page according to its policies (such as the one depicted in Figure 2) and wraps the script elements so that the second translation, T_script, will be applied at run-time during page rendering at the browser. T_HTML is depicted in Figure 3 using bshield.translate(...) to invoke T_script. T_script, as depicted in Figure 4, parses and rewrites JavaScript to access the HTML document tree through an interposition layer. This layer regulates all accesses and manipulations of the underlying document tree, recursively applies T_HTML to any dynamically generated HTML, and recursively applies T_script to any dynamically generated script code. Additionally, the interposition layer enforces policies, such as filtering exploits of known vulnerabilities.

    Original page:

    <html>
     <head>
     </head>
     <body><script>
       alert("hello world!");
     </script></body>
    </html>

    Page after T_HTML:

    <html>
     <head>
      <script src="http://internal/bshield.js"></script>
     </head>
     <body><script>
       eval(bshield.translate(
           'alert("hello world!");'
       ));
     </script></body>
    </html>

Figure 3: T_HTML Translation

Figure 4: T_script Translation (diagram: T_script turns the original JavaScript, which accesses the HTML document tree directly, into translated JavaScript that reaches the HTML document tree through an interposition layer with policies)

Since users can choose to disable scripting in their web browsers, we must ensure BrowserShield protects such users even without the JavaScript library. We transparently handle such clients by applying T_HTML at the logic injector, independent of the user's browser. Any modifications due to T_script are still in place, but disabling scripts has made them irrelevant, along with the original script code.

Browser extensions, such as ActiveX controls, can also manipulate the document tree. The security model for such extensions is that they have the same privileges as the browser, and thus we focus on interposing between script and the extensions, not between the extensions and the document tree. This allows BrowserShield to prevent malicious script from exploiting known vulnerabilities in trusted browser extensions.

We have designed BrowserShield to adhere to well-established principles for protection systems [3, 10, 33]:
• Complete interposition: All script access to the HTML document tree must be mediated by the BrowserShield framework.
• Tamper-proof: Web pages must not be able to modify or tamper with the BrowserShield framework in unintended ways.
• Transparency: Apart from timing considerations and reasonable increases in resource usage, web pages should not be able to detect any changes in behavior due to the BrowserShield framework. The sole exception is for policy enforcement (e.g., the behavior of a page containing an exploit is visibly modified).
• Flexible policies: We desire the BrowserShield framework to have a good separation between mechanism and policy, to make the system flexible for many applications.

4 Design

We now give a detailed discussion of the BrowserShield script library. While much previous work uses code rewriting for interposition [10, 11, 12, 41], our approach is heavily influenced by the fact that our code lives in the same name space as the code it is managing, and also several subtleties of JavaScript. First, JavaScript is a prototype-based language [39], not a class-based language like Java. In prototype-based languages, objects are created using other objects as prototypes, and can then be modified to have a different set of member variables and methods. A consequence of this is that JavaScript has no static typing: different data types can be assigned to the same variable, even for references to functions and object methods. Second, scoping issues must be dealt with carefully, as assigning a method to a new object causes any use of the this keyword in the method to bind to the new object. Thus, any interposition mechanisms must ensure that this is always evaluated in the intended context. Third, JavaScript uses a garbage collector that is not exposed to the language. Fourth, the language has pervasive reflection features that let a script explore its own code and object properties.

As a result of these subtleties, BrowserShield must use a series of interposition mechanisms: method wrappers, new invocation syntax, and name resolution management. We justify and describe these mechanisms in the following subsections, organized by our goals for the framework.

4.1 Complete Interposition

To provide complete interposition, BrowserShield must mediate all possible accesses and manipulations allowed by the Document Object Model (DOM) over the HTML document trees (including script elements). In this subsection, we detail how we achieve this using script rewriting to interpose on function calls, object method calls, object property accesses, object creation, and control constructs. We summarize our rewriting rules in Table 1.

Function and Object Method Calls There are two ways to rewrite function or method calls for interposition: callee rewriting or caller rewriting.

In callee rewriting, the original function or method definition is first saved under a different name, and then the original function or method is redefined to allow interception before calling the saved original. We call the redefined function the wrapper. The benefit of callee rewriting is that the rewritten code is localized: only functions or methods of interest are modified, but not their invocations throughout the code. However, callee rewriting does not work for cases where functions or methods cannot be redefined.

In caller rewriting, the invocation is rewritten to an interposition function without changing the original function's definition. The interposition function looks up the appropriate interposition logic based on the identity of the target function or method. Although caller rewriting causes more pervasive code changes, it can interpose on those functions or methods that cannot be overwritten.

In BrowserShield, we have to use a hybrid of both approaches to accommodate the previously mentioned JavaScript subtleties.

JavaScript contains some native functions that cannot be redefined (e.g., alert), which necessitates caller rewriting. The first row of Table 1 shows how BrowserShield indirectly invokes a function with its list of parameter values by passing it to bshield.invokeFunc(func, paramList), where bshield is a global object that we introduce to contain BrowserShield library code.

However, using caller rewriting alone for interposing on method calls requires maintaining references to state otherwise eligible for garbage collection. Caller rewriting requires maintaining a map from functions and methods of interest to their associated interposition logic. Maintaining this map as a global table would require maintaining a reference to methods of interest on every object ever created, since each object may correspond to a distinct prototype requiring distinct interposition logic. These global table references would prevent reclamation of objects otherwise eligible for garbage collection, possibly causing pages that render normally without BrowserShield to require unbounded memory. To avoid this, BrowserShield maintains the necessary interposition logic on each method, allowing unused state to be reclaimed.

It might seem tempting to maintain this interposition logic as a property on the object. Unfortunately,
Construct            Original Code                          Rewritten Code
 Function Calls       foo(x);                                bshield.invokeFunc(foo, x);
 Method Calls         document.write(s);                     bshield.invokeMeth(document, "write", s);
 Object Properties    obj.x = obj.y;                         bshield.propWrite(obj, "x",
                                                               bshield.propRead(obj, "y") );
 Object Creation      var obj = new MyClass(x);              var obj = bshield.createObj(
                                                               "MyClass", [x]);
 with Construct       with (obj) { x = 3; }                  (bshield.undefined(obj.x) ? x = 3 :
                      // x refers to obj.x                     bshield.propWrite(obj, "x", 3));
 Variable Names       bshield = x;                           bshield_ = x;
 in Construct         for (i in obj) {...}                   for (i in obj) {
                                                               if (i=="bshield") continue; ...
                                                             }

                               Table 1: Sample Code for BrowserShield Rewrite Rules
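
To make the rewrite rules in Table 1 concrete, the following is a minimal sketch of a pass-through interposition layer. It is not BrowserShield's implementation: the object name bshieldDemo, the log array, and the constructors registry used to resolve a class name are all assumptions made for illustration, and the methods only record each interposed operation before forwarding it, where a real layer would consult policies.

```javascript
// A sketch under stated assumptions, not the BrowserShield library.
const log = [];            // hypothetical: records each interposed operation
const constructors = {};   // hypothetical registry: class name -> constructor

const bshieldDemo = {
  invokeFunc(func, ...params) {
    log.push("call");
    return func(...params);
  },
  invokeMeth(obj, methName, ...params) {
    log.push("method:" + methName);
    return obj[methName](...params);   // calling through obj keeps `this` bound to obj
  },
  propRead(obj, propName) {
    log.push("read:" + propName);
    return obj[propName];
  },
  propWrite(obj, propName, val) {
    log.push("write:" + propName);
    obj[propName] = val;
    return val;
  },
  createObj(className, params) {
    log.push("new:" + className);
    return new constructors[className](...params);
  },
};

function MyClass(x) { this.x = x; this.y = x * 2; }
MyClass.prototype.sum = function () { return this.x + this.y; };
constructors.MyClass = MyClass;

// Original code:  var obj = new MyClass(3); obj.x = obj.y; obj.sum();
// Rewritten form, following the rows of Table 1:
const obj = bshieldDemo.createObj("MyClass", [3]);
bshieldDemo.propWrite(obj, "x", bshieldDemo.propRead(obj, "y"));
const total = bshieldDemo.invokeMeth(obj, "sum");
const doubled = bshieldDemo.invokeFunc(function (n) { return n * 2; }, total);
```

The rewritten statements compute the same values as the original ones, but every call, property access, and object creation now passes through a single object where checks could be attached.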


aliases to the interposed method can be created, and these aliases provide no reference to the object containing the interposition logic. For example, after "f = document.write", any interposition logic associated with document.write is not associated with f; finding the logic would require a global scan of JavaScript objects. Therefore, we use callee rewriting to install a wrapper for the methods of interest, such as those that insert new HTML. These wrappers are installed by replacing the original method with the wrapper and saving the original method as a property on the wrapper (which is itself an object). Because we interpose on object property accesses, object creation, and method invocations, we can install wrappers when an object is first created or used.

Thus far we have justified caller rewriting for functions and callee rewriting for methods. Because JavaScript allows functions to be aliased as methods on objects (e.g., "obj.m = eval"), we also must perform caller rewriting for method calls. The rewritten method invocations can then check for potential aliased functions.

JavaScript scoping introduces additional complexity in method interposition. The original method cannot be simply called from the method wrapper, because saving the original method as a property of the wrapper causes the keyword this in the original method to refer to the wrapper rather than the intended object. To avoid this problem, we use a swapping technique: The wrapper temporarily restores the original method during the wrapper execution, and then reinstalls the wrapper for the object method before the wrapper returns.

During swapping, the first step is to restore the original method. One challenge here is that the method name may not be the same as when the method wrapper was installed, because methods can be reassigned. We solve this problem again with caller rewriting through the rewritten method invocation syntax invokeMeth(obj, methName, paramList), passing the name of the method to the method wrapper.

The swapping process requires an additional check to handle recursive methods, since otherwise a recursive call would directly invoke the original method rather than the swapped-out method wrapper, bypassing any interposition logic on nested calls. To this end, the invokeMeth method checks to see if a wrapper is already swapped out. If so, invokeMeth invokes the wrapper again, ignoring any swapping logic until the original recursive call completes. Because JavaScript is single-threaded, we have not needed to handle concurrency during this process.

Object Properties The HTML document tree can be accessed and modified through JavaScript object property reads and writes. For example, the HTML in a page can be modified by assigning values to document.body.innerHTML, and a script element's code can be modified by changing its text property. To interpose on such actions, BrowserShield replaces any attempts to read or write object properties with calls to the global bshield object's propRead(obj, propName) and propWrite(obj, propName, val) methods, as shown in Table 1. We use an object's identity at run-time to check whether an assignment will create new HTML or script code. If so, propWrite applies either T_HTML or T_script to the value as needed. These identity checks can be done by calling JavaScript library functions that reveal whether the object is part of the HTML document tree. We ensure that BrowserShield uses the authentic library functions, and not malicious replacements, by creating private aliases of the functions before the script begins to run.
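
The property-write interposition described above might look roughly like the sketch below. It runs outside a browser, so the document tree is modeled with plain objects, and every name in it (bshieldSketch, docTreeNodes, translateHTML, translateScript) is a hypothetical stand-in rather than part of BrowserShield. The two translate functions only tag their input so the effect is visible; they do not implement T_HTML or T_script.

```javascript
// Sketch only: all names below are assumptions, not BrowserShield's code.
const docTreeNodes = new WeakSet(); // models "this object is part of the HTML document tree"

function translateHTML(html) {      // placeholder for T_HTML (policy filtering would go here)
  return "<!-- T_HTML applied -->" + html;
}

function translateScript(code) {    // placeholder for T_script
  return "/* T_script applied */ " + code;
}

const bshieldSketch = {
  propRead(obj, propName) {
    return obj[propName];
  },
  propWrite(obj, propName, val) {
    let stored = val;
    // Identity check: only writes to document-tree objects can create HTML or script.
    if (docTreeNodes.has(obj)) {
      if (propName === "innerHTML") {
        stored = translateHTML(String(val));
      } else if (propName === "text" && obj.tagName === "SCRIPT") {
        stored = translateScript(String(val));
      }
    }
    obj[propName] = stored;
    return val; // an assignment expression evaluates to the assigned value
  },
};

// Plain objects standing in for a <body> element, a <script> element, and ordinary data.
const pageBody = { tagName: "BODY", innerHTML: "" };
const scriptElem = { tagName: "SCRIPT", text: "" };
const plainObject = { innerHTML: "" };
docTreeNodes.add(pageBody);
docTreeNodes.add(scriptElem);

bshieldSketch.propWrite(pageBody, "innerHTML", "<b>hi</b>");
bshieldSketch.propWrite(scriptElem, "text", "alert(1);");
bshieldSketch.propWrite(plainObject, "innerHTML", "<b>hi</b>");
```

In this sketch only the two objects registered as document-tree nodes have their values translated; the write to plainObject passes through unchanged, which mirrors the point that the decision depends on the object's identity at run-time and not on the property name alone.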
This interposition on property accesses is required for installing wrappers when an object is first accessed. Additionally, while wrappers are swapped out during method execution, propRead must ensure that any attempts to access the original method are redirected to the swapped-out wrapper.

Object Creation. To ensure that method wrappers are initialized in the case of new object creation, BrowserShield must also rewrite the instantiation of new objects to use the createObj(className, paramList) method. The createObj method is also responsible for interposing on the JavaScript Function constructor, which can create new executable functions from its parameters as follows:

f = new Function("x", "return x+1;");

In this case, createObj applies Tscript to the code argument before instantiating the function.

Control Constructs. For control constructs (e.g., if-then blocks, loops, etc.), the bodies of the constructs are translated by Tscript. The bodies of traditional function constructors (e.g., function foo() {...}) are translated by Tscript as well.
   JavaScript’s with construct presents a special case, as it has the ability to modify scope. As shown in Table 1, free variables within a with block are assumed to refer to properties on the designated object, unless such properties are undefined. This construct is purely “syntactic sugar” for JavaScript, and thus we handle this case with a syntactic transformation.

4.2   Tamper-Proof

Preventing scripts from tampering with BrowserShield is challenging because BrowserShield logic lives in the same name space as the code it is managing. To address this, we use name resolution management to ensure that all BrowserShield logic is inaccessible.

Variable Names. In the common case, variable names in a script can remain unchanged. However, we make the bshield name inaccessible to scripts to prevent tampering with the global BrowserShield data structure.
   To do this, we rename any variable references to bshield by appending an underscore to the end of the name. We also append an underscore to any name that matches the bshield(_*) regular expression (i.e., that begins with bshield and is optionally followed by any number of underscores). Note that JavaScript places no limit on variable name length.

Reflection. Reflection in JavaScript allows script code to explore the properties of objects as well as its own code, using two pervasive language features: the syntax for accessing object properties (such as myScript.text or myScript[i]), and the JavaScript in construct.
   In the first case, BrowserShield must hide some object properties, because it maintains per-object interposition state (details given in Section 4.3) on some objects. Such state is stored on a bshield property of the object, which we hide using property access interposition. Specifically, if a call to propRead or propWrite attempts to access a property name beginning with bshield, we simply append an underscore to the name, thus returning the property value that the original script would have seen. Since array indices can also be used to access object properties, we must return the appropriate value for the given index.
   In the second case, the in construct allows iteration through all of an object’s properties by name. The bshield property of an object must be hidden during the iteration if it is present. Thus, BrowserShield inserts a check as the first line of the iteration loop, jumping to the next item if the property name is bshield. This is accomplished using the rewrite rule shown in Table 1.

4.3   Transparency

The BrowserShield framework must also ensure its presence is transparent to the original script’s semantics. The techniques for preventing tampering described in Section 4.2 contribute to this goal by making BrowserShield inaccessible. Transparency additionally requires that we present to scripts the context they would have in the absence of BrowserShield.

Shadow Copies. Scripts can access both their own script code and HTML, which BrowserShield modifies for interposition. To preserve the intended semantics of such scripts, BrowserShield retains a “shadow copy” of all original code before rewriting it. The shadow copy is stored on a bshield property of the object. Interposition on property reads and writes allows the shadow copy to be exposed to scripts for access and modification.
   Shadowing translated HTML requires additional care. During THTML transformation, a policy may rewrite static HTML elements. We must similarly create shadow copies for such translated HTML elements, but we cannot directly create a JavaScript object in HTML to store the shadow copy. Thus, we persist the shadow copy to a bshield HTML tag attribute during THTML, which is later used by the BrowserShield library. For example, a policy function that rewrites link URLs may modify the href attribute of <a> tags during the THTML transformation. Then, the persisted shadow copy looks like this:

<a href="http://translatedLink"
   bshield="{href:'http://originalLink'}">

When BrowserShield looks for the bshield property of the DOM object corresponding to this tag, it interprets this string into an actual bshield property with a shadow copy for the href attribute.
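The two reflection cases handled in Section 4.2 (property access and the in construct) can be sketched as follows. This is a minimal illustration under stated assumptions, not the BrowserShield library's code: propRead and propWrite take the signatures given in the text, but their bodies, the hideName helper, and the visibleNames helper are hypothetical.

```javascript
// Assumed rule from the text: any property name beginning with "bshield"
// gets one underscore appended, so scripts never reach the real state.
const RESERVED = /^bshield/;

// hideName is a hypothetical helper; String() also covers numeric indices.
function hideName(propName) {
  const name = String(propName);
  return RESERVED.test(name) ? name + "_" : name;
}

function propRead(obj, propName) {
  return obj[hideName(propName)];
}

function propWrite(obj, propName, val) {
  obj[hideName(propName)] = val;
  return val;
}

// Models the check inserted as the first line of a rewritten for-in loop:
// skip to the next item when the property name is bshield.
function visibleNames(obj) {
  const names = [];
  for (const name in obj) {
    if (name === "bshield") continue;
    names.push(name);
  }
  return names;
}

// Usage: the script's own "bshield" value lands on a different property.
const node = { bshield: { shadow: "original code" }, text: "translated code" };
propWrite(node, "bshield", "script value");
const seenByScript = propRead(node, "bshield");
const listed = visibleNames(node);
```

The sketch does not map renamed properties back to their original names during iteration; the actual rewrite rule is the one given in Table 1.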
   Because scripts can only interact with shadow copies of their code and not modified copies, our transformations are not required to be idempotent. That is, we will never apply THTML or Tscript to code that has already been transformed.

Preserving Context. The JavaScript eval function evaluates a string as script code in the current scope, and any occurrence of the this keyword in the string is bound to the current enclosing object. Thus, if eval were to be called from within bshield.invokeFunc, the this keyword might evaluate differently than in the original context.
   For this reason, the rewriting rule for functions is actually more complex than shown in Table 1. Instead, the rewritten code first checks if the function being invoked is eval. If so, the parameter is translated using Tscript and then evaluated in the correct context; otherwise, invokeFunc is called as described before. Thus, the code is rewritten as follows:

bshield.isEval(bshield.func = foo) ?
  eval(bshield.translate(x)) :
  bshield.invokeFunc(bshield.func, x);

Note that the function expression foo is assigned to a temporary state variable on the bshield object, so that the expression is not evaluated a second time in the call to invokeFunc.
   This check is a special case that is only needed for eval, because eval is the only native function in JavaScript that accesses this. Other native functions, such as alert or parseInt, do not access this, and can be evaluated within invokeFunc.

4.4   Flexible policies

The final goal of BrowserShield is to support flexible policy enforcement. This can be achieved by separating mechanism from policy: Our mechanism consists of the rewrite rules for translating HTML and script code, and our policy consists of the run-time checks invoked by the rewritten code. Some run-time checks are critical for complete interposition, such as applying Tscript to any string passed to eval or the Function constructor, or applying THTML to any string passed to document.write or assigned to document.body.innerHTML. These checks are always applied, regardless of what policy is in place. Because the interposition is policy-driven, our system can be made incrementally complete. For example, if an undocumented API is discovered that can manipulate the document tree, we simply add a new policy to interpose on this API.
   The remaining run-time checks are used for enforcing flexible policies, such as the MS04-040 vulnerability filter in Figure 2. Such policy functions are downloaded separately from the remainder of the BrowserShield code, and they can be updated and customized based on the intended application.
   Policy functions are given the chance to inspect and modify script behavior at all interposition points, including property reads and writes, function and method invocations, and object creations. We also allow policy writers to introduce new global state and functions as part of the global bshield object, or introduce local state and methods for all objects or for specific objects. Policy functions for HTML can also be registered by tag name. The tags are presented to HTML policy functions as part of a token stream of tags and text, without a full parse tree. It is also possible for policy functions to further parse the HTML token stream to gain additional context, although we have not yet encountered a need for this in the policies we have authored.

5   Implementation

We have implemented a prototype of BrowserShield as a service deployed at a firewall and proxy cache. Our prototype consists of a standard plugin to Microsoft’s Internet Security and Acceleration (ISA) Server 2004 [21], and a JavaScript library that is sent to the client with transformed web documents. The ISA plugin plays the role of the BrowserShield logic injector.
   We implemented our ISA plugin in C++ with 2,679 lines of code. Our JavaScript library has 3,493 lines (including comments). Most of the ISA plugin code is devoted to parsing HTML, while about half of the JavaScript library is devoted to parsing HTML or JavaScript. This is a significantly smaller amount of code than in a modern web browser, which implies that our trusted computing base is small compared to the code base we are protecting.
   The ISA plugin is responsible for applying the THTML transformation to static HTML. The ISA plugin first inserts a reference to the BrowserShield JavaScript library into the web document. Because this library is distributed in a separate file, clients automatically cache it, reducing network traffic on later requests. THTML then rewrites all script elements such that they will be transformed using Tscript at the client before they are executed. Figure 3 depicts this transformation; note that it does not require translating the JavaScript at the firewall.
   In our implementation, the firewall component applies THTML using a streaming model, such that the ISA Server can begin sending transformed data to the client before the entire page is received. This streaming model also means that we do not expect the filter to be vulnerable to state-holding DoS attacks by malicious web pages.
   One complexity is that BrowserShield’s HTML parsing and JavaScript parsing must be consistent with that of the underlying browser. Any inconsistency will cause
false positives and false negatives in BrowserShield run-time checks. For our prototype, we have sought to match IE’s behavior through testing and refinement. If future versions of browsers exposed this logic to other programs, it would make this problem trivial.
   When the browser starts to run the script in the page, the library applies Tscript to each piece of script code, translating it to call into the BrowserShield interposition layer. This may sometimes require decoding scripts, a procedure that is implemented in publicly available libraries [34] and which does not require cryptanalysis, though we have not yet incorporated it in our implementation.
   A final issue in Tscript is translating scripts that originate in source files linked to from a source tag. THTML rewrites such source URLs so that they are fetched through a proxy. The proxy wraps the scripts in the same way that script code embedded directly in the page is wrapped. For example, a script source URL of http://foo.com/script.js would be translated to http://rewritingProxy/translateJS.pl?url=http://foo.com/script.js. Tscript is then applied at the client after the script source file is downloaded.

6   Evaluation

Our evaluation focuses on measuring BrowserShield’s vulnerability coverage, the complexity of authoring vulnerability filters, the overhead of applying the BrowserShield transformations at firewalls, and the overhead of running the BrowserShield interposition layer and vulnerability filters at end hosts.

6.1   Vulnerability Coverage

We evaluated BrowserShield’s ability to protect IE against all critical vulnerabilities for which Microsoft released patches in 2005 [1]. Of the 29 critical patches that year, 8 are for IE, corresponding to 19 IE vulnerabilities. These vulnerabilities fall into three classes: IE’s handling of (i) HTML, script, or ActiveX components, (ii) HTTP, and (iii) images or other files. Table 2 shows how many vulnerabilities there were in each area, and whether BrowserShield or another technology could provide patch-equivalent protection. The BrowserShield design is focused on HTML, script, and ActiveX controls, and it can successfully handle all 12 of these vulnerabilities. This includes vulnerabilities where the underlying programmer error is at a higher layer of abstraction than a buffer overrun, e.g., a cross-domain scripting vulnerability. Handling HTTP accounted for 3 of the 19 vulnerabilities. Perhaps surprisingly, 2 out of 3 of these vulnerabilities required BrowserShield in addition to an existing HTTP filter, such as Snort [38] or Shield [43]. This is because malformed URLs could trigger the HTTP layer vulnerabilities regardless of whether the URL came over the network or was generated internally by the browser. BrowserShield is able to prevent the HTML/script layer from triggering the generation of these bad HTTP requests. Processing images or other files accounted for the remaining 4 vulnerabilities. Patch-equivalent protection for these vulnerabilities is already available using existing anti-virus solutions [13].

                                      protected by
vulnerability type        #     BrowserShield   HTTP filter   antivirus
HTML, script, ActiveX     12    12              0             0
HTTP                      3     2*              3*            0
images and other files    4     0               0             4

Table 2: BrowserShield Vulnerability Coverage. * Two of the HTTP vulnerabilities required both BrowserShield and an HTTP filter to provide patch-equivalent protection.

   Because management and deployment costs are often incurred on a per-patch basis, we also analyze the vulnerabilities in Table 2 in terms of the corresponding patches. For the 8 IE patches released in 2005, combining BrowserShield with standard anti-virus and HTTP filtering would have provided patch-equivalent protection in every case, greatly reducing the costs associated with multiple patch deployments. In the absence of BrowserShield, anti-virus and HTTP filtering would have provided patch-equivalent protection for only 1 of the IE patches.

6.2   Authoring Vulnerability Filters

To evaluate the complexity of vulnerability filtering, we chose three vulnerabilities from three different classes: HTML Elements Vulnerability (MS04-040), COM Object Memory Corruption (MS05-037), and Mismatched DOM Object Memory Corruption (MS05-054).
   We filtered for the MS04-040 vulnerability using the function shown in Figure 2. Registering this filter for each of the three vulnerable tags is as simple as:

bshield.addHTMLTagPolicy("IFRAME", func);

   COM object vulnerabilities typically result from IE instantiating COM objects that have memory errors in their constructors. The IE patch blacklists particular COM objects (identified by their clsid). Implementing an equivalent blacklist requires adding checks for an HTML tag (the OBJECT tag) and sometimes a JavaScript function (the ActiveXObject constructor, which can be used to instantiate a subset of the COM objects accessible through the OBJECT tag). In the case of MS05-037, it does not appear to be possible to instantiate the vulnerable COM object using the ActiveXObject constructor. The OBJECT tag filter is conceptually similar to the function shown in Figure 2.
   The MS05-054 vulnerability results when the window object, which is not a function, is called as a function in the outermost scope. Our interposition layer itself prevents window from being called as a function in the outermost scope since all function calls are mediated by BrowserShield with invokeFunc. Hence there is no need for a filter. Nevertheless, if this vulnerability had not depended on such a scoping constraint, we could simply have added a filter to prevent calling the object as a function.
   To test the correctness of our vulnerability filters, we installed an unpatched image of Windows XP Pro within a virtual machine, and created web pages for each of the vulnerabilities that caused IE to crash. With BrowserShield and the filters applied, IE no longer crashed upon viewing the malicious web pages. We tested the fidelity of our filters using the same set of URLs that we used in our evaluation of BrowserShield’s overhead (details are in Section 6.3). Under side-by-side visual comparisons, we found that the filters had not changed the behavior of any of the web pages, as desired.

6.3   Firewall Performance

We evaluated BrowserShield’s performance by scripting multiple IE clients to download web pages (and all their embedded objects) through an ISA server running the BrowserShield firewall plugin. The ISA firewall ran on a Compaq Evo PC containing a 1.7 GHz Pentium 4 microprocessor and 1 GB RAM. Because we are within a corporate intranet, our ISA server connected to another HTTP proxy, not directly to web sites over the internet. We disabled caching at our ISA proxy, and we fixed our IE client cache to contain only the BrowserShield JavaScript library, consistent with the scenario of a firewall translating all web sites to contain a reference to this library.
   We ran 10 IE processes concurrently using 10 pages that IE could render quickly (so as to increase the load on the firewall), and repeatedly initiated each page visit every 5 seconds. We used manual observation to determine when the load on the ISA server had reached a steady state.
   We chose these 10 pages out of a set of 70 URLs that are the basis for our client performance macrobenchmarks. This set is based on a sample of 250 of the top 1 million URLs clicked on after being returned as MSN Search results in Spring 2005, weighted by click-through count. Specifically, the 70 URLs are those that BrowserShield can currently render correctly; the remaining URLs in the sample encountered problems due to incomplete aspects of our implementation, such as JavaScript parsing bugs.

resource           unmodified    browsershield
cpu utilization    15.0%         18.3%
virtual memory     317 MB        319 MB
working set        45.5 MB       46.6 MB
private bytes      26.3 MB       27.3 MB

Table 3: BrowserShield Firewall overheads. “Virtual memory” measures the total virtual memory allocated to the process; “working set” measures memory pages that are referenced regularly; “private bytes” measures memory pages that are not sharable.

   We measured CPU and memory usage at the firewall, as shown in Table 3. CPU usage increased by about 22%, resulting in a potential degradation of throughput by 18.1%; all aspects of memory usage we measured increased by negligible amounts. We also found that network usage increased only slightly (more detail in Section 6.4.2).

6.4   Client Performance

We evaluated the client component of our BrowserShield implementation through microbenchmarks on the JavaScript interposition layer and macrobenchmarks on network load, client memory usage, and the latency of page rendering.

6.4.1   Microbenchmarks

We designed microbenchmarks to measure the overhead of individual JavaScript operations after translation. Table 4 lists our microbenchmarks and their respective BrowserShield slowdown. Our results are averages over 10 trials, where each trial evaluated its microbenchmark repeatedly, and lasted over half a second. For the first 11 microbenchmarks, the standard deviation over the 10 trials was less than 2%. In the last case it was less than 8%. The slowdown ratio was computed using the average time required per microbenchmark evaluation with and without the interposition framework.

      operation                                                        slowdown
 1    i++                                                              1.00
 2    a=b+c                                                            1.00
 3    if                                                               1.07
 4    string concat (‘+’)                                              1.00
 5    string concat (‘concat’)                                         61.9
 6    string split (‘split’)                                           21.9
 7    no-op function call                                              44.8
 8    x.a = b (property write)                                         342
 9    eval of minimal syntactic structure                              47.3
10    eval of moderate syntactic structure, minimal computation        136
11    eval of moderate syntactic structure, significant computation    1.34
12    image swap                                                       1.07

Table 4: BrowserShield Microbenchmarks. Slowdown is the ratio of the execution time of BrowserShield-translated code to that of the original code.

   Microbenchmarks 1-4 measure operations for which we expect no changes during rewriting, and hence no
slowdown. The only slowdown we measure is in the case of the if statement. Further examination showed that the BrowserShield translation inserted a semi-colon (e.g., var a = 1 (linebreak) changed to var a = 1; (linebreak)). This results in a 7% slowdown.
   Microbenchmarks 5-8 measure operations we expect to incur a slowdown comparable to an interpreter’s slowdown. As detailed in Section 4, BrowserShield translation introduces additional logic around method calls, function calls, and property writes, leading to a slowdown in the range of 20x-400x. This slowdown is in line with good interpreters [32], but worse than what is achieved by rewriting systems targeting other languages, e.g., Java bytecode [10]. BrowserShield is paying a price for the JavaScript subtleties that previous rewriting systems did not have to deal with. We were curious about the difference in slowdown between the two string methods; an additional experiment showed that the difference can be attributed to the JavaScript built-in concat method requiring about 3 times as much CPU as the built-in split method. Also, it is not surprising that property writes have a greater slowdown than function or method calls because property writes need to both guard the BrowserShield namespace and interpose on writes to DOM elements (such as the text property of scripts).
   Microbenchmarks 9-11 explore the overhead of translating JavaScript code of varying complexity. The “eval of minimal syntactic structure” microbenchmark measures the cost of translating and then evaluating a simple assignment. The cause of the large slowdown is the additional work done by eval in the BrowserShield framework: parsing, constructing an AST, modifying the AST, and outputting the new AST as a JavaScript program. The two subsequent “eval of moderate syntactic structure” microbenchmarks measure the cost of translating and evaluating a simple for(;;) loop. This simply demonstrates that as the cost of the computation inside the simple loop increases, the cost of translating the code can decrease to a small fraction of the overall computational cost.
   The last microbenchmark measures the overhead of performing a simple manipulation of the DOM: swapping two 35 KB images. This microbenchmark is designed to measure the relative importance of overheads in the JavaScript engine when the JavaScript is manipulating the layout of the HTML page. The JavaScript code to swap these two images requires two property writes (i.e., img.src = 'newLink'), and we described above how BrowserShield translation adds significant overhead to property writes. Nonetheless, the overall slowdown is less than 8%. In particular, the raw time to swap the image only increases from 26.7 milliseconds to 28.5 milliseconds. This suggests that even the large overheads that BrowserShield translation adds to some language constructs may still be quite small in the context of a complete web page.
   In summary, BrowserShield incurs a significant overhead on the language constructs where it must add interpreter-like logic, but these overheads can be quite small within the context of the larger DOM manipulations in embedded scripts.

6.4.2   Macrobenchmarks

We designed macrobenchmarks to measure the overall client experience when the BrowserShield framework is in place. In particular, the macrobenchmarks include all the dynamic parsing and translation that occurs before the page is rendered, while the microbenchmarks primarily evaluated the performance of the translated code accomplishing a task relative to the untranslated code accomplishing that same task. To this end, we scripted an instance of IE to download each of the 70 web pages in our workload 10 times. For the same reasons given in our evaluation of the BrowserShield ISA component, we ensured that the only object in the IE cache was the BrowserShield JavaScript library. These caching policies represent a worst case for client latency. This measurement includes the overhead of the three filters that we discussed in Section 6.2. We then repeated these measurements without the BrowserShield framework and translation.
   We set a 30-second upper limit on the time to render the web page, including launching secondary (popup) windows and displaying embedded objects, but not waiting for secondary windows to render. We visually verified that the programmatic signal that rendering had completed indeed corresponded to the user’s perception that the page had rendered. IE hit the 30-second timeout several times in these trials, and it hit the timeouts both when the BrowserShield framework and translation were present and when the framework and translation were absent. We did not discern any pattern in these timeouts, and because our experiments include factors outside our control, such as the wide-area network and the servers originating the content, we do not expect page download times to be constant over our trials. We re-ran the trials that experienced the timeouts.
   Figure 5 shows the CDF of page rendering latency with and without BrowserShield. On average BrowserShield added 1.7 seconds to page rendering time. By way of contrast, the standard deviation in rendering time without BrowserShield was 1.0 seconds.
   In Figure 6, we further break down the latency for the 10 pages that took the most time to render under BrowserShield. They experienced an average increase in latency of 6.3 seconds, requiring 3.9 seconds on average without BrowserShield and 10.2 seconds on average with
Figure 5: Latency CDF with and without BrowserShield (x-axis: Time to Render (seconds); y-axis: Percentage of Pages; series: unmodified, browsershield)

Figure 6: Breakdown of latency for slowest 10 pages under BrowserShield (x-axis: Pages; y-axis: Time to Render (seconds); series: unmodified, browsershield: translation at client, browsershield: other overhead)

Figure 7: Latency of JavaScript parsing (x-axis: JavaScript (KB); y-axis: Time to Parse JavaScript (seconds))

Figure 8: Memory Usage at Client (x-axis: Client Memory Usage (MB); y-axis: Percentage of Pages; series: unmodified, browsershield)

BrowserShield. Of this 6.3 seconds of increased latency, we found that 2.8 seconds (45%) could be attributed to the overhead of dynamically translating JavaScript and HTML within IE. We attribute the remaining overhead to effects such as the overhead of evaluating the translated code, and the time to modify the HTML at the firewall.
   We broke down the latency of dynamic translation for both HTML and JavaScript into 2 parts each: time to parse the JavaScript/HTML into an AST and convert the modified AST back to a string, and the time to modify the AST. We found that the time to parse the JavaScript to and from a string was always more than 70% of the overall latency of dynamic translation, and it averaged 80% of the overall latency. Figure 7 shows the JavaScript parsing time versus the number of kilobytes. Fitting a least-squares line to this data yields an average parse rate of 4.1 KB of JavaScript per second, but there was significant variation; the slowest parse rate we observed was 1.3 KB/second.
   Figure 8 shows the memory usage of page rendering with and without BrowserShield. We found that private bytes (memory pages that are not sharable) was the client memory metric that increased the most when rendering the transformed page. Private memory usage increased on average by 11.8%, from 19.8 MB to 22.1 MB. This increase was quite consistent; no page caused memory usage to increase by more than 3 MB.
   We also measured the increased network load over a single run through the pages both with and without BrowserShield. We measured an average increase of 9 KB, less than the standard deviation in the network load over any individual trial due to background traffic during our measurements. We expect BrowserShield rewriting to only slightly increase the network load, because the firewall just adds script wrappers, while the translation itself happens at the client.

7   Related Work
We first compare with other protection systems in Section 7.1. We then discuss BrowserShield’s relation to the extensive work on code rewriting and interposition in Section 7.2.

7.1 Remote Exploit Defense
In our prior work on Shield [43], we proposed using vulnerability-specific filters to identify and block network traffic that would exploit known software vulnerabilities. Shield maintains protocol-specific state machines in an end-host’s network stack, allowing it to recognize when a packet will trigger a vulnerability. However, the Shield approach does not address dynamic content such as scripts in web documents, since it is undecidable whether script code in a document will eventually exploit a vulnerability. BrowserShield shares Shield’s focus on vulnerability-specific filters, but in contrast to
Shield, its use of runtime interposition allows it to handle exploits in dynamic HTML.
   Like BrowserShield and Shield, IntroVirt also employs vulnerability-specific predicates, specifically to detect past and present intrusions using virtual machine introspection and replay [23]. As a result, IntroVirt allows “just in time” patch application: postponing the application of a patch while an exploit has not occurred, and rolling back system state to apply a patch if an exploit does occur. BrowserShield instead offers protection while a patch is being tested (or otherwise delayed) by the administrator of a vulnerable system, buying time even in cases where exploits are immediately attempted. Additionally, BrowserShield supports more flexible deployment scenarios. For example, it does not require the client’s browser to run inside an instrumented virtual machine.
   Opus [2] seeks to address the problem of patching by allowing patches to be applied without restarting the application. Opus provides tools for developers to increase the reliability of such “dynamic” patches during development. However, these tools reduce, but do not eliminate, the programmer’s burden to produce a correct dynamic patch.
   Vigilante [7] focuses on worm containment, automatically detecting and preventing the spread of worm traffic. Vigilante combines rapid distribution of self-certifying alerts and automatic filter generation, along with vulnerability detection techniques such as non-executable pages [30] and dynamic dataflow analysis [29]. These techniques, even with the Vigilante improvements, admit false negatives. BrowserShield does not share the speed constraint of Vigilante, since browser exploits require human involvement and therefore do not spread on the same time scales as worms. Therefore, we are able to trade off the speed of automatically generated vulnerability filters for the accuracy of hand-coded filters.
   EarlyBird [36] and Autograph [24] are two exemplary systems that use pattern matching to block network traffic containing exploits. Pattern matching scales to high data rates, crucial to the authors’ goal of stopping worm outbreaks at network choke points. The HTML scripts that are BrowserShield’s focus seem difficult to detect consistently with pattern matching, as they can trivially modify themselves at the client.
   HoneyMonkey [44] aims to discover web servers that distribute malicious code. In HoneyMonkey, virtual machines automatically fetch content from the web and use black-box techniques to discover exploits. Exploit discovery is complementary to BrowserShield’s approach of providing patch-equivalent protection to clients.
   Finally, a number of techniques have aimed to sandbox the browser or other applications, in effect protecting the operating system from the impact of an exploit. These techniques include system call interposition [17, 18, 22] and Microsoft’s “protected mode” for IE in Windows Vista [20]. These may limit damage to a user’s computing environment, but they do not protect the browser itself, allowing attacks such as keylogging to easily be conducted from exploited browsers. Tahoma [8] takes the confinement approach one step further, sandboxing browsers in virtual machines and using site-specific manifests to restrict browser traffic to known servers. While this could help to mitigate many browser related problems, the difficulty of getting such manifests widely adopted is unclear.

7.2 Interposition Techniques
Interposition techniques such as code rewriting have been used in previous work to achieve additional safety properties or otherwise modify the behavior of existing code. Code rewriting is only one of several alternatives for backward compatible modifications, and the choice of technique is influenced by tradeoffs in deployability and performance. Directly modifying the execution environment, such as the Java Virtual Machine, has the highest deployment barriers. Some work instead uses a level of indirection, such as emulation (e.g., Bochs [6]), easing deployment but incurring a high performance overhead. Thus, BrowserShield and others [10, 41, 42] employ code rewriting, with its low barriers to deployment and smaller performance overhead than that required by an emulator.
   We characterize interposition techniques by the target of interposition, since the technical differences between targets require different solutions. Compared to approaches for other interposition targets, BrowserShield must address a new combination of technical challenges presented by JavaScript: its scoping rules, an implicit garbage collector, pervasive reflection, and its prototype-based object model (which implies a lack of static typing).

Machine Code. Many approaches focus on the machine code interface, whether rewriting binary instructions or emulating them at runtime. Software Fault Isolation (SFI) [41] rewrites binary code to insert runtime checks, creating sandboxes that prevent code from writing or jumping to addresses outside its fault domain. This creates process-like memory boundaries between units of code within a process. The more recent XFI [9] uses binary rewriting to provide flexible access control and additional integrity guarantees. VMware ESX Server [42] also rewrites machine code, in its case to allow programs to be virtualized on x86 hardware. Etch [31] rewrites machine code with the goals of profiling and measurement. Valgrind [40] and Program Shepherding [25] are dynamic binary instrumentation tools. Valgrind’s goal is
to offer debugging and profiling support, while Program Shepherding’s goal is to monitor control flow, preventing the transfer of control to data regions which might include malicious code.
   The techniques used for rewriting at the machine code interface do not need to address any of the four challenges of JavaScript rewriting that have influenced BrowserShield: scoping, reflection, garbage collection or typing. Most work interposing at the machine code interface only adds semantics that can be defined in terms of low level operations, such as enforcing a process-like memory boundary, as in SFI. Indeed, Erlingsson and Schneider [11] note the difficulty of extending rewriting at the machine code interface to enforce policies on the abstractions internal to an application. BrowserShield’s interposition target (the HTML document tree) is such an application-internal abstraction.

System Call Interface. Much previous work has modified user level program behavior by interposing on the system call interface. Jones introduces a toolkit for system call interposition agents that simplifies tasks such as tracing, emulation, and sandboxing [22]. Wagner et al. use system call interposition in Janus to confine untrusted applications to a secure sandbox environment [18]. Garfinkel notes difficulties in trying to interpose on the system call interface [16], such as violating OS semantics, side effects, and overlooking indirect paths. Garfinkel et al. discuss a delegation-based architecture to address some of these problems [17]. Naccio describes an approach to provide similar guarantees by rewriting x86 code that links against the Win32 system call interface [12]. Naccio can also rewrite Java bytecode.
   Work on the system call interface differs from BrowserShield both in goal and in technique. System call interposition can guard external resources from an application, while the goal of BrowserShield is to guard an application-internal resource, the HTML document tree. Naccio’s use of rewriting as a technique to interpose on the system call interface does not present any of the four technical challenges (scoping, reflection, garbage collection or typing) relevant to JavaScript rewriting. For example, Naccio also wraps methods to accomplish interposition, but Naccio’s method wrappers do not need to handle JavaScript’s scoping rules, and so do not need to implement swapping.

Java Bytecode. Several pieces of previous work [10, 11, 37], including the previously mentioned Naccio [12], have used rewriting at the Java Virtual Machine bytecode interface [26]. This interface is type-safe, and provides good support for reasoning about application-internal abstractions. In the most similar of these works to BrowserShield, Erlingsson’s PoET mechanism rewrites Java bytecode to enforce security policies expressed in the PSLang language [10].
   JavaScript’s pervasive reflection, scoping rules, and prototype-based object model forced us to develop several techniques not needed for Java bytecode rewriting. For example, where Java bytecode rewriting can interpose on Java’s reflection API, BrowserShield must interpose on all property reads and writes, as well as some for loops, to achieve similar control over reflection. Additionally, Java bytecode rewriting can achieve complete interposition by only modifying callees (using method wrappers) and without maintaining state, though some previous work allowed modifying callers or adding state to simplify policy construction [10]. In contrast, BrowserShield must modify both callers and callees to appropriately handle scoping and the possibility of functions aliased as methods (and vice versa). Also, BrowserShield must maintain state, requiring careful attention to its interaction with the JavaScript garbage collector.

Web Scripting Languages. We are not aware of any full interposition techniques for web scripting languages like JavaScript. The SafeWeb anonymity service used a JavaScript rewriting engine that failed to provide either complete interposition or transparency [27]. The Greasemonkey [19] extension to the Firefox browser allows users to run additional site-specific scripts when a document is loaded, but it does not provide complete interposition between existing script code and the HTML document tree.

8   Conclusion
Web browser vulnerabilities have become a popular vector of attacks. Filtering exploits of these vulnerabilities is made challenging by the dynamic nature of web content. We have presented BrowserShield, a general framework that rewrites HTML pages and any embedded scripts to enforce policies on run-time behavior. We have designed BrowserShield to provide complete interposition over the underlying resource (the HTML document tree) and to be transparent and tamper-proof. Because BrowserShield transforms content rather than browsers, it supports deployment at clients, firewalls, or web publishers. Our evaluation shows that adding this approach to existing firewall and anti-virus techniques increases the fraction of IE patches from 2005 that can be protected at the network level from 12.5% to 100%, and that this protection can be done with only moderate overhead.
   We have focused on the application of vulnerability-driven filtering in this paper, but JavaScript rewriting techniques may also enable new functionality for AJAX (Asynchronous JavaScript and XML) applications. Some potential uses include: eliminating the effort currently required to modify a website for the Coral [15]
CDN; modifying the cached search results returned by                     [18] I. Goldberg, D. Wagner, R. Thomas, and E. A. Brewer. A Se-
web search engines to redirect links back into the cache                      cure Environment for Untrusted Helper Applications. In Usenix
                                                                              Security, 1996.
(since the original site may be unavailable); allowing ap-
                                                                         [19] Greasemonkey. http://greasemonkey.mozdev.org/.
propriately sandboxed dynamic third-party content on a
community site (such as a blog or wiki) that currently                   [20] Protected          Mode           in        Vista        IE7.
                                                                              http://blogs.msdn.com/ie/archive/2006/02/09/528963.aspx.
must restrict third-party content to be static; and debug-
                                                                         [21] ISA Server. http://www.microsoft.com/isaserver/default.mspx.
ging JavaScript code when attaching a debugger is in-
                                                                         [22] M. B. Jones. Interposition Agents: Transparently Interposing
feasible, perhaps offering call traces or breakpoint func-                    User Code at the System Interface. In SOSP, 1993.
tionality for complex scripts. User interface changes
                                                                         [23] A. Joshi, S. T. King, G. W. Dunlap, and P. M. Chen. Detecting
could even be added to make phishing more difficult,                           Past and Present Intrusions Through Vulnerability-specific Pred-
e.g., enforcing the display of origin URLs on all pop-                        icates. In SOSP, 2005.
up windows. As this list suggests, we are optimistic that                [24] H.-A. Kim and B. Karp. Autograph: Toward Automated, Dis-
JavaScript rewriting is a widely applicable technique.                        tributed Worm Signature Detection. In Usenix Security, 2004.
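
The call-trace use named in the conclusion can be illustrated with plain function wrapping. What follows is only a minimal sketch under stated assumptions, not BrowserShield's translation machinery: `traceCalls`, `callLog`, and the `geometry` object are hypothetical names introduced here, and the sketch wraps the methods of one object at run time rather than rewriting script source.

```javascript
// Hypothetical call-trace sketch (not BrowserShield's implementation).
// Every recorded call is appended to this log.
const callLog = [];

// Replace each function-valued own property of `target` with a wrapper
// that records the call and then delegates to the original function.
function traceCalls(target, label) {
  for (const name of Object.keys(target)) {
    const original = target[name];
    if (typeof original !== "function") continue; // skip data properties
    target[name] = function (...args) {
      callLog.push({ fn: `${label}.${name}`, args });
      // Forward the caller's `this` so wrapped methods behave as before.
      return original.apply(this, args);
    };
  }
  return target;
}

// Assumed example object standing in for a script to be traced.
const geometry = {
  unit: "px",
  area(w, h) { return w * h; },
  describe(w, h) { return `${this.area(w, h)} ${this.unit}^2`; },
};

traceCalls(geometry, "geometry");
const description = geometry.describe(3, 4);
```

After the call, `callLog` holds an entry for `geometry.describe` followed by one for `geometry.area`, because the nested call goes through the wrapper as well. Wrapping of this kind only sees calls made through the wrapped properties; a script holding a direct reference to the original function bypasses it, which is one reason complete interposition needs more than callee wrappers.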
References
 [1] Microsoft Security Bulletin Summaries and Webcasts, 2005. http://www.microsoft.com/technet/security/bulletin/summary.mspx.
 [2] G. Altekar, I. Bagrak, P. Burstein, and A. Schultz. OPUS: Online Patches and Updates for Security. In Usenix Security, 2005.
 [3] J. P. Anderson. Computer Security Technology Planning Study Volume II. ESD-TR-73-51, Vol. II, Electronic Systems Division, Air Force Systems Command, Hanscom Field, Bedford, MA, October 1972.
 [4] W. A. Arbaugh, W. L. Fithen, and J. McHugh. Windows of Vulnerability: a Case Study Analysis. IEEE Computer, December 2000.
 [5] S. Beattie, S. Arnold, C. Cowan, P. Wagle, and C. Wright. Timing the Application of Security Patches for Optimal Uptime. In LISA, 2002.
 [6] Bochs. http://bochs.sourceforge.net/.
 [7] M. Costa, J. Crowcroft, M. Castro, A. Rowstron, L. Zhou, L. Zhang, and P. Barham. Vigilante: End-to-End Containment of Internet Worms. In SOSP, 2004.
 [8] R. S. Cox, J. G. Hansen, S. D. Gribble, and H. M. Levy. A Safety-Oriented Platform for Web Applications. In IEEE Symposium on Security and Privacy, 2006.
 [9] Ú. Erlingsson, M. Abadi, M. Vrable, M. Budiu, and G. C. Necula. XFI: Software guards for system address spaces. In OSDI, 2006.
[10] Ú. Erlingsson and F. B. Schneider. IRM Enforcement of Java Stack Inspection. In IEEE Symposium on Security and Privacy, 2000.
[11] Ú. Erlingsson and F. B. Schneider. SASI Enforcement of Security Policies: A Retrospective. In WNSP: New Security Paradigms Workshop, 2000.
[12] D. Evans and A. Twyman. Flexible Policy-Directed Code Safety. In IEEE Symposium on Security and Privacy, 1999.
[13] Eweek: Anti-Virus Protection for WMF Flaw, December 2005. http://www.eweek.com/article2/0,1895,1907102,00.asp.
[14] Mozilla Security Alerts and Announcements. http://www.mozilla.org/security/.
[15] M. J. Freedman, E. Freudenthal, and D. Mazières. Democratizing Content Publication with Coral. In NSDI, 2004.
[16] T. Garfinkel. Traps and Pitfalls: Practical Problems in System Call Interposition based Security Tools. In NDSS, 2003.
[17] T. Garfinkel, B. Pfaff, and M. Rosenblum. Ostia: A Delegating Architecture for Secure System Call Interposition. In NDSS, 2004.
[25] V. Kiriansky, D. Bruening, and S. Amarasinghe. Secure Execution via Program Shepherding. In Usenix Security, 2002.
[26] T. Lindholm and F. Yellin. The Java Virtual Machine Specification, 2nd edition, 1999.
[27] D. Martin and A. Schulman. Deanonymizing Users of the SafeWeb Anonymizing Service. In USENIX Security, 2002.
[28] Microsoft Security Bulletin MS04-040, December 2004. http://www.microsoft.com/technet/security/Bulletin/MS04-040.mspx.
[29] J. Newsome and D. Song. Dynamic Taint Analysis for Automatic Detection, Analysis, and Signature Generation of Exploits on Commodity Software. In NDSS, 2005.
[30] Pax. http://pax.grsecurity.net/.
[31] T. Romer, G. Voelker, D. Lee, A. Wolman, W. Wong, H. Levy, and B. Bershad. Instrumentation and Optimization of Win32/Intel Executables Using Etch. In Usenix NT Workshop, 1997.
[32] T. H. Romer, D. Lee, G. M. Voelker, A. Wolman, W. A. Wong, J.-L. Baer, B. N. Bershad, and H. M. Levy. The Structure and Performance of Interpreters. In ASPLOS, 1996.
[33] J. H. Saltzer and M. D. Schroeder. The Protection of Information in Computer Systems. In SOSP, 1973.
[34] Windows Script Decoder. http://www.virtualconspiracy.com.
[35] Secure Computing. http://www.securecomputing.com/pdf/WW-SSLscan-PO.pdf.
[36] S. Singh, C. Estan, G. Varghese, and S. Savage. Automated Worm Fingerprinting. In OSDI, 2004.
[37] E. G. Sirer, R. Grimm, A. J. Gregory, and B. N. Bershad. Design and Implementation of a Distributed Virtual Machine for Networked Computers. In SOSP, 1999.
[38] The Open Source Network Intrusion Detection System. http://www.snort.org/.
[39] D. Ungar and R. B. Smith. Self: The Power of Simplicity. In OOPSLA, 1987.
[40] Valgrind. http://www.valgrind.org/.
[41] R. Wahbe, S. Lucco, T. Anderson, and S. Graham. Efficient Software-Based Fault Isolation. In SOSP, 1993.
[42] C. A. Waldspurger. Memory Resource Management in VMware ESX Server. In OSDI, 2002.
[43] H. J. Wang, C. Guo, D. R. Simon, and A. Zugenmaier. Shield: Vulnerability-Driven Network Filters for Preventing Known Vulnerability Exploits. In SIGCOMM, 2004.
[44] Y.-M. Wang, D. Beck, X. Jiang, R. Roussev, C. Verbowski, S. Chen, and S. King. Automated Web Patrol with Strider HoneyMonkeys: Finding Web Sites That Exploit Browser Vulnerabilities. In NDSS, 2006.

More Related Content

Similar to Bshield osdi2006

Web Access Firewall
Web Access FirewallWeb Access Firewall
Isolation of vm
Isolation of vmIsolation of vm
Isolation of vm
Home
 
INLINE_PATCH_PROXY_FOR_XEN_HYPERVISOR
INLINE_PATCH_PROXY_FOR_XEN_HYPERVISORINLINE_PATCH_PROXY_FOR_XEN_HYPERVISOR
INLINE_PATCH_PROXY_FOR_XEN_HYPERVISOR
Neha Rana
 
Top Ten Web Hacking Techniques – 2008
Top Ten Web Hacking Techniques – 2008Top Ten Web Hacking Techniques – 2008
Top Ten Web Hacking Techniques – 2008
Jeremiah Grossman
 
Bshield osdi2006

  • 1. BrowserShield: Vulnerability-Driven Filtering of Dynamic HTML

Charles Reis∗, John Dunagan†, Helen J. Wang†, Opher Dubrovsky†, Saher Esmeir‡

∗ University of Washington CS Dept., creis@cs.washington.edu
† Microsoft, {jdunagan, helenw, opherd}@microsoft.com
‡ Technion CS Dept., esaher@cs.technion.ac.il

Abstract

Vulnerability-driven filtering of network data can offer a fast and easy-to-deploy alternative or intermediary to software patching, as exemplified in Shield [43]. In this paper, we take Shield’s vision to a new domain, inspecting and cleansing not just static content, but also dynamic content. The dynamic content we target is the dynamic HTML in web pages, which have become a popular vector for attacks. The key challenge in filtering dynamic HTML is that it is undecidable to statically determine whether an embedded script will exploit the browser at run-time. We avoid this undecidability problem by rewriting web pages and any embedded scripts into safe equivalents, inserting checks so that the filtering is done at run-time. The rewritten pages contain logic for recursively applying run-time checks to dynamically generated or modified web content, based on known vulnerabilities. We have built and evaluated BrowserShield, a system that performs this dynamic instrumentation of embedded scripts, and that admits policies for customized run-time actions like vulnerability-driven filtering.

1 Introduction

Web browsers have become an important interface between users and many electronic services such as information access, personal communications, office tasks, and e-commerce. The importance of web browsers is accompanied by rich functionality and extensibility, which arguably have also contributed to their popularity as a vector of attack. During the year 2005, 8 out of 29 critical Microsoft security bulletins, corresponding to 19 vulnerabilities, were due to flaws in Internet Explorer (IE) or its extensions such as ActiveX controls [1]. There were also 6 security bulletins for Firefox [14], corresponding to 59 vulnerabilities over the same period of time.

To date, the primary way to defend against browser vulnerabilities is through software patching. However, studies have shown that the deployment of software patches is often delayed after the patches become available. Services such as Windows Update download patches automatically, but typically delay enactment if the patch requires a reboot or application restart. This delay helps both home and corporate users to save work and schedule downtime. An additional delay in the corporate setting is that patches are typically tested prior to deployment, to avoid the potentially high costs for recovering from a faulty patch [5].

As a result, there is a dangerous time window between patch release and patch application during which attackers often reverse-engineer patches to gain vulnerability knowledge and then launch attacks. One study showed that a large majority of existing attacks target known vulnerabilities [4].

For vulnerabilities that are exploitable through application-level protocols (e.g., HTTP, RPC), previous work, Shield [43], addresses the patch deployment problem by filtering malicious traffic according to vulnerability signatures at a firewall above the transport layer. The vulnerability signatures consist of a vulnerability state machine that characterizes all possible message sequences that may lead to attacks, along with the message formats that can trigger the exploitation of the application (e.g., an overly long field of a message that triggers a buffer overrun). The key characteristic of this approach is that it cleanses the network data without modifying the code of the vulnerable application. This data-driven approach makes signature deployment (and removal if needed) easier than it is for patches. Vulnerability signature deployment can be automatic rather than user-driven and use the same deployment model as anti-virus signatures.

These desirable features of vulnerability-driven filtering motivated us to explore its potential for exploit removal in web pages. The Shield approach is able to filter static HTML pages by treating HTML as another protocol layer over HTTP. However, the challenge lies in dynamic HTML, where pages can be dynamically generated or modified through scripts embedded in the page; attackers could easily evade Shield filters by using scripts to generate malicious web content at run-time, possibly with additional obfuscation. Determining whether a
  • 2. script will eventually exploit a vulnerability is undecidable. Our approach to cleansing dynamic content is to rewrite HTML pages and any embedded scripts into safe equivalents before they are rendered by the browser. The safe equivalent pages contain logic for recursively applying run-time checks to dynamically generated or modified web content, based on known vulnerabilities. To this end, we have designed BrowserShield, a system that performs dynamic instrumentation of embedded scripts and that admits policies for changing web page behavior. A vulnerability signature is one such policy, which sanitizes web pages according to a known vulnerability. Figure 1 gives an overview of the BrowserShield system, showing how it transforms HTML and JavaScript using a set of policies. Our system focuses on JavaScript because it is the predominant scripting language used on the web; a full-fledged system would require additionally rewriting or disabling VBScript and any other script languages used by web browsers that BrowserShield protects.

[Figure 1: The BrowserShield System. Diagram labels: Web server, HTML + JavaScript, BrowserShield Logic Injector, Policies, Shielded HTML + JavaScript, Client Browser.]

Our general approach of code rewriting for interposition has been used in other contexts. Code rewriting has been used to isolate faults of software extensions [41]. Java bytecode rewriting has been used to enable security policies [10, 37], such as stack inspection policies for access control. However, rewriting script code for web browsers poses additional challenges: JavaScript is a prototype-based language, and the combination of this with JavaScript’s scoping rules, implicit garbage collection and pervasive reflection required a number of techniques not needed by previous rewriting work in other contexts.

We have designed BrowserShield to adhere to well-established principles for protection systems: complete interposition of the underlying resource (i.e., the HTML document tree), tamper-proofness and transparency [3, 10, 33]. In addition, BrowserShield is a general framework that supports applications other than vulnerability-driven filtering. For example, we have authored policies that add UI invariants to prevent certain phishing attempts.

Because BrowserShield protects web browsers by transforming their inputs, not the browser itself, the BrowserShield logic injector can be deployed at client or edge firewalls, browser extensions, or web publishers that republish third-party content such as ads.

We have implemented a prototype of the BrowserShield system, in which the rewriting logic is injected into a web page at an enterprise firewall and executed by the browser at rendering time. Our prototype can transparently render many familiar websites that contain JavaScript (e.g., www.google.com, www.cs.washington.edu, www.mit.edu). We also successfully translated and ran a large intranet portal application (Microsoft SharePoint) that uses 549 KB of JavaScript libraries.

We chose the firewall deployment scenario because it offers the greatest manageability benefit, as BrowserShield updates can be centralized at the firewall, immediately protecting all client machines in the organization without any BrowserShield-related installation at either clients or web servers. The main disadvantage of this deployment scenario is that firewalls have no visibility into end-to-end encrypted traffic. Nevertheless, commercial products [35] already exist that force traffic crossing the organization boundary to use the firewall (instead of a client within the organization) as the encryption endpoint, trading client privacy for aggregate organization security. Also, the browser extension and web publisher deployment scenarios transparently handle encrypted traffic.

Our evaluation focuses on the effectiveness of the BrowserShield design and the performance of our implementation. Our analysis of recent IE vulnerabilities shows that BrowserShield significantly advances the state-of-the-art; existing firewall and anti-virus techniques alone can only provide patch-equivalent protection for 1 of the 8 IE patches from 2005, but combining these two with BrowserShield is sufficient to cover all 8. We evaluated BrowserShield’s performance on real-world pages containing over 125 KB of JavaScript. Our evaluation shows a 22% increase in firewall CPU utilization, and client rendering latencies that are comparable to the original page latencies for most pages.

The rest of the paper is organized as follows: In Section 2 we describe a typical browser vulnerability that we would like to filter. We discuss the design of BrowserShield in Section 3, and give BrowserShield’s JavaScript rewriting approach in detail in Section 4. We describe our implementation in Section 5. In Section 6 we give our evaluation of BrowserShield. We discuss related work in Section 7, and conclude in Section 8.

2 A Motivating Example

As a motivating example of vulnerability-driven filtering, we consider MS04-040: the HTML Elements Vulnerability [28] of IE from December 2004. In this vulnerability, IE had a vulnerable buffer that was overrun if
  • 3. both the name and the src attributes were too long in an iframe, frame, or embed HTML element.

Figure 2 shows a corresponding snippet of JavaScript code that can be used to identify and to remove exploits of this vulnerability. As input, the function takes an object representing an HTML tag, including an associative array of its attributes. When invoked on an <iframe>, <frame> or <embed> tag, the function determines whether the relevant attributes exceed the size of the vulnerable buffer.

    function (tag) {
      var len = 255; // not the actual limit
      // Look for long attribute values
      if ((contains("name", tag.attrs) &&
           tag.attrs["name"].length > len) &&
          (contains("src", tag.attrs) &&
           tag.attrs["src"].length > len)) {
        // Remove all attributes to be safe
        tag.attrs = [];
        // Return false to indicate exploit
        return false;
      }
      // Return true to indicate safe tag
      return true;
    }

Figure 2: JavaScript code snippet to identify exploits of the MS04-040 vulnerability

The goal of BrowserShield is to take this vulnerability-specific filtering function as a policy and apply it to all occurrences of the vulnerable tags whether they are in static HTML pages or dynamically generated by scripts. The framework could react in many ways to detected exploits; our current system simply stops page rendering and notifies the user. Vulnerability-driven filtering, used as a patch alternative or intermediary, should prevent all exploits of the vulnerability (i.e., zero false negatives), and should not disrupt any exploit-free pages (i.e., zero false positives). We design BrowserShield to meet these requirements.

3 Overview

The BrowserShield system consists of a JavaScript library that translates web pages into safe equivalents and a logic injector (such as a firewall) that modifies web pages to use this library.

BrowserShield uses two separate translations along with policies that are enforced at run-time. The first translation, T_HTML, translates the HTML: it tokenizes an HTML page, modifies the page according to its policies (such as the one depicted in Figure 2) and wraps the script elements so that the second translation, T_script, will be applied at run-time during page rendering at the browser. T_HTML is depicted in Figure 3 using bshield.translate(...) to invoke T_script. T_script, as depicted in Figure 4, parses and rewrites JavaScript to access the HTML document tree through an interposition layer. This layer regulates all accesses and manipulations of the underlying document tree, recursively applies T_HTML to any dynamically generated HTML, and recursively applies T_script to any dynamically generated script code. Additionally, the interposition layer enforces policies, such as filtering exploits of known vulnerabilities.

Original page:

    <html>
    <head>
    </head>
    <body><script>
    alert("hello world!");
    </script></body>
    </html>

Page after T_HTML:

    <html>
    <head>
    <script src="http://internal/bshield.js"></script>
    </head>
    <body><script>
    eval(bshield.translate(
      "alert(\"hello world!\");"
    ));
    </script></body>
    </html>

Figure 3: T_HTML Translation

[Figure 4: T_script Translation. Diagram labels: Original JavaScript, T_script, Translated JavaScript, Interposition Layer with Policies, HTML Document Tree.]

Since users can choose to disable scripting in their web browsers, we must ensure BrowserShield protects such users even without the JavaScript library. We transparently handle such clients by applying T_HTML at the logic injector, independent of the user’s browser. Any modifications due to T_script are still in place, but disabling scripts has made them irrelevant, along with the original script code.

Browser extensions, such as ActiveX controls, can also manipulate the document tree. The security model for such extensions is that they have the same privileges as the browser, and thus we focus on interposing between script and the extensions, not between the extensions and the document tree. This allows BrowserShield to prevent malicious script from exploiting known vulnerabilities in trusted browser extensions.

We have designed BrowserShield to adhere to well-established principles for protection systems [3, 10, 33]:
• Complete interposition: All script access to the HTML document tree must be mediated by the BrowserShield framework.
• Tamper-proof: Web pages must not be able to modify or tamper with the BrowserShield framework in unintended ways.
• Transparency: Apart from timing considerations and reasonable increases in resource usage, web pages should not be able to detect any changes in behavior due to the BrowserShield framework. The sole exception is for policy enforcement (e.g., the behavior of a page containing an exploit is visibly modified).
• Flexible policies: We desire the BrowserShield framework to have a good separation between mechanism and policy, to make the system flexible for many applications.

4 Design

We now give a detailed discussion of the BrowserShield script library. While much previous work uses code rewriting for interposition [10, 11, 12, 41], our approach is heavily influenced by the fact that our code lives in the same name space as the code it is managing, and also several subtleties of JavaScript. First, JavaScript is a prototype-based language [39], not a class-based language like Java. In prototype-based languages, objects are created using other objects as prototypes, and can then be modified to have a different set of member variables and methods. A consequence of this is that JavaScript has no static typing: different data types can be assigned to the same variable, even for references to functions and object methods. Second, scoping issues must be dealt with carefully, as assigning a method to a new object causes any use of the this keyword in the method to bind to the new object. Thus, any interposition mechanisms must ensure that this is always evaluated in the intended context. Third, JavaScript uses a garbage collector that is not exposed to the language. Fourth, the language has pervasive reflection features that let a script explore its own code and object properties.

As a result of these subtleties, BrowserShield must use a series of interposition mechanisms: method wrappers, new invocation syntax, and name resolution management. We justify and describe these mechanisms in the following subsections, organized by our goals for the framework.

4.1 Complete Interposition

To provide complete interposition, BrowserShield must mediate all possible accesses and manipulations allowed by the Document Object Model (DOM) over the HTML document trees (including script elements). In this subsection, we detail how we achieve this using script rewriting to interpose on function calls, object method calls, object property accesses, object creation, and control constructs. We summarize our rewriting rules in Table 1.

Function and Object Method Calls. There are two ways to rewrite function or method calls for interposition: callee rewriting or caller rewriting.

In callee rewriting, the original function or method definition is first saved under a different name, and then the original function or method is redefined to allow interception before calling the saved original. We call the redefined function the wrapper. The benefit of callee rewriting is that the rewritten code is localized: only functions or methods of interest are modified, but not their invocations throughout the code. However, callee rewriting does not work for cases where functions or methods cannot be redefined.

In caller rewriting, the invocation is rewritten to an interposition function without changing the original function's definition. The interposition function looks up the appropriate interposition logic based on the identity of the target function or method. Although caller rewriting causes more pervasive code changes, it can interpose on those functions or methods that cannot be overwritten.

In BrowserShield, we have to use a hybrid of both approaches to accommodate the previously mentioned JavaScript subtleties.

JavaScript contains some native functions that cannot be redefined (e.g., alert), which necessitates caller rewriting. The first row of Table 1 shows how BrowserShield indirectly invokes a function with its list of parameter values by passing it to bshield.invokeFunc(func, paramList), where bshield is a global object that we introduce to contain BrowserShield library code.

However, using caller rewriting alone for interposing on method calls requires maintaining references to state otherwise eligible for garbage collection. Caller rewriting requires maintaining a map from functions and methods of interest to their associated interposition logic. Maintaining this map as a global table would require maintaining a reference to methods of interest on every object ever created, since each object may correspond to a distinct prototype requiring distinct interposition logic. These global table references would prevent reclamation of objects otherwise eligible for garbage collection, possibly causing pages that render normally without BrowserShield to require unbounded memory. To avoid this, BrowserShield maintains the necessary interposition logic on each method, allowing unused state to be reclaimed.

It might seem tempting to maintain this interposition logic as a property on the object. Unfortunately,
aliases to the interposed method can be created, and these aliases provide no reference to the object containing the interposition logic. For example, after "f = document.write", any interposition logic associated with document.write is not associated with f; finding the logic would require a global scan of JavaScript objects. Therefore, we use callee rewriting to install a wrapper for the methods of interest, such as those that insert new HTML. These wrappers are installed by replacing the original method with the wrapper and saving the original method as a property on the wrapper (which is itself an object). Because we interpose on object property accesses, object creation, and method invocations, we can install wrappers when an object is first created or used.

Construct           Original Code                Rewritten Code
Function Calls      foo(x);                      bshield.invokeFunc(foo, x);
Method Calls        document.write(s);           bshield.invokeMeth(document, "write", s);
Object Properties   obj.x = obj.y;               bshield.propWrite(obj, "x",
                                                   bshield.propRead(obj, "y"));
Object Creation     var obj = new MyClass(x);    var obj = bshield.createObj("MyClass", [x]);
with Construct      with (obj) { x = 3; }        (bshield.undefined(obj.x) ? x = 3 :
                    // x refers to obj.x           bshield.propWrite(obj, "x", 3));
Variable Names      bshield = x;                 bshield_ = x;
in Construct        for (i in obj) {...}         for (i in obj) {
                                                   if (i=="bshield") continue;
                                                   ...
                                                 }

Table 1: Sample Code for BrowserShield Rewrite Rules

Thus far we have justified caller rewriting for functions and callee rewriting for methods. Because JavaScript allows functions to be aliased as methods on objects (e.g., "obj.m = eval"), we also must perform caller rewriting for method calls. The rewritten method invocations can then check for potential aliased functions.

JavaScript scoping introduces additional complexity in method interposition. The original method cannot be simply called from the method wrapper, because saving the original method as a property of the wrapper causes the keyword this in the original method to refer to the wrapper rather than the intended object. To avoid this problem, we use a swapping technique: The wrapper temporarily restores the original method during the wrapper execution, and then reinstalls the wrapper for the object method before the wrapper returns.

During swapping, the first step is to restore the original method. One challenge here is that the method name may not be the same as when the method wrapper was installed, because methods can be reassigned. We solve this problem again with caller rewriting through the rewritten method invocation syntax invokeMeth(obj, methName, paramList), passing the name of the method to the method wrapper.

The swapping process requires an additional check to handle recursive methods, since otherwise a recursive call would directly invoke the original method rather than the swapped out method wrapper, bypassing any interposition logic on nested calls. To this end, the invokeMeth method checks to see if a wrapper is already swapped out. If so, invokeMeth invokes the wrapper again, ignoring any swapping logic until the original recursive call completes. Because JavaScript is single threaded, we have not needed to handle concurrency during this process.

Object Properties. The HTML document tree can be accessed and modified through JavaScript object property reads and writes. For example, the HTML in a page can be modified by assigning values to document.body.innerHTML, and a script element's code can be modified by changing its text property. To interpose on such actions, BrowserShield replaces any attempts to read or write object properties with calls to the global bshield object's propRead(obj, propName) and propWrite(obj, propName, val) methods, as shown in Table 1. We use an object's identity at run-time to check whether an assignment will create new HTML or script code. If so, propWrite applies either T_HTML or T_script to the value as needed. These identity checks can be done by calling JavaScript library functions that reveal whether the object is part of the HTML document tree. We ensure that BrowserShield uses the authentic library functions, and not malicious replacements, by creating private aliases of the functions before the script begins to run.
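The wrapper swapping and the recursion check described above can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions, not the BrowserShield library: invokeMeth and the bshield name come from the text, while installWrapper, state, the isWrapper flag, the policy callback, and the wrapper's calling convention are hypothetical. The sketch also assumes every call site has already been rewritten to go through invokeMeth.

```javascript
// Illustrative stand-in for the global bshield object; not the paper's code.
const bshield = {
  // Hypothetical per-object record of wrappers that are currently swapped out.
  state(obj) {
    if (!obj.bshield) obj.bshield = { swapped: {} };
    return obj.bshield;
  },

  // Callee rewriting: replace obj[name] with a wrapper that keeps the original.
  installWrapper(obj, name, policy) {
    const wrapper = function (target, methName, args, alreadySwapped) {
      policy(target, methName, args); // interposition logic runs first
      if (alreadySwapped) {
        // Nested call: the original is already installed on the target.
        return target[methName](...args);
      }
      const swapped = bshield.state(target).swapped;
      target[methName] = wrapper.original; // restore the original method
      swapped[methName] = wrapper;
      try {
        return target[methName](...args); // `this` binds to target
      } finally {
        delete swapped[methName];
        target[methName] = wrapper; // reinstall before returning
      }
    };
    wrapper.original = obj[name];
    wrapper.isWrapper = true;
    obj[name] = wrapper;
  },

  // Caller rewriting: obj.m(a, b) becomes bshield.invokeMeth(obj, "m", [a, b]).
  invokeMeth(obj, methName, args) {
    const pending = bshield.state(obj).swapped[methName];
    if (pending) return pending(obj, methName, args, true);
    const method = obj[methName];
    if (method && method.isWrapper) return method(obj, methName, args, false);
    return obj[methName](...args);
  },
};

// Usage: a recursive method whose nested call was also rewritten.
const doc = {
  out: [],
  write(s, depth = 0) {
    this.out.push(s);
    if (depth > 0) bshield.invokeMeth(this, "write", [s + "!", depth - 1]);
  },
};
const seen = [];
bshield.installWrapper(doc, "write", (target, name, args) => seen.push(args[0]));
bshield.invokeMeth(doc, "write", ["hi", 1]);

// An alias under a different method name still reaches the wrapper.
const other = { out: [] };
other.put = doc.write;
bshield.invokeMeth(other, "put", ["z"]);
```

Passing the method name at the call site is what lets the wrapper restore the original under the alias `put` rather than under `write`. In current JavaScript engines an ordinary function could be invoked with `Function.prototype.apply` to set `this` directly; the sketch follows the swapping description instead.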
This interposition on property accesses is required for installing wrappers when an object is first accessed. Additionally, while wrappers are swapped out during method execution, propRead must ensure that any attempts to access the original method are redirected to the swapped-out wrapper.

Object Creation. To ensure that method wrappers are initialized in the case of new object creation, BrowserShield must also rewrite the instantiation of new objects to use the createObj(className, paramList) method. The createObj method is also responsible for interposing on the JavaScript Function constructor, which can create new executable functions from its parameters as follows:

f = new Function("x", "return x+1;");

In this case, createObj applies T_script to the code argument before instantiating the function.

Control Constructs. For control constructs (e.g., if-then blocks, loops, etc.), the bodies of the constructs are translated by T_script. The bodies of traditional function constructors (e.g., function foo() {...}) are translated by T_script as well.

JavaScript's with construct presents a special case, as it has the ability to modify scope. As shown in Table 1, free variables within a with block are assumed to refer to properties on the designated object, unless such properties are undefined. This construct is purely "syntactic sugar" for JavaScript, and thus we handle this case with a syntactic transformation.

4.2 Tamper-Proof

Preventing scripts from tampering with BrowserShield is challenging because BrowserShield logic lives in the same name space as the code it is managing. To address this, we use name resolution management to ensure that all BrowserShield logic is inaccessible.

Variable Names. In the common case, variable names in a script can remain unchanged. However, we make the bshield name inaccessible to scripts to prevent tampering with the global BrowserShield data structure. To do this, we rename any variable references to bshield by appending an underscore to the end of the name. We also append an underscore to any name that matches the bshield(_*) regular expression (i.e., that begins with bshield and is optionally followed by any number of underscores). Note that JavaScript places no limit on variable name length.

Reflection. Reflection in JavaScript allows script code to explore the properties of objects as well as its own code, using two pervasive language features: the syntax for accessing object properties (such as myScript.text or myScript[i]), and the JavaScript in construct.

In the first case, BrowserShield must hide some object properties, because it maintains per-object interposition state (details given in Section 4.3) on some objects. Such state is stored on a bshield property of the object, which we hide using property access interposition. Specifically, if a call to propRead or propWrite attempts to access a property name beginning with bshield, we simply append an underscore to the name, thus returning the property value that the original script would have seen. Since array indices can also be used to access object properties, we must return the appropriate value for the given index.

In the second case, the in construct allows iteration through all of an object's properties by name. The bshield property of an object must be hidden during the iteration if it is present. Thus, BrowserShield inserts a check as the first line of the iteration loop, jumping to the next item if the property name is bshield. This is accomplished using the rewrite rule shown in Table 1.

4.3 Transparency

The BrowserShield framework must also ensure its presence is transparent to the original script's semantics. The techniques for preventing tampering described in Section 4.2 contribute to this goal by making BrowserShield inaccessible. Transparency additionally requires that we present to scripts the context they would have in the absence of BrowserShield.

Shadow Copies. Scripts can access both their own script code and HTML, which BrowserShield modifies for interposition. To preserve the intended semantics of such scripts, BrowserShield retains a "shadow copy" of all original code before rewriting it. The shadow copy is stored on a bshield property of the object. Interposition on property reads and writes allows the shadow copy to be exposed to scripts for access and modification.

Shadowing translated HTML requires additional care. During T_HTML transformation, a policy may rewrite static HTML elements. We must similarly create shadow copies for such translated HTML elements, but we cannot directly create a JavaScript object in HTML to store the shadow copy. Thus, we persist the shadow copy to a bshield HTML tag attribute during T_HTML, which is later used by the BrowserShield library. For example, a policy function that rewrites link URLs may modify the href attribute of <a> tags during the T_HTML transformation. Then, the persisted shadow copy looks like this:

<a href="http://translatedLink"
   bshield="{href:'http://originalLink'}">

When BrowserShield looks for the bshield property of the DOM object corresponding to this tag, it interprets this string into an actual bshield property with a shadow copy for the href attribute.
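The name hiding and shadow-copy behavior described above can be illustrated with a small JavaScript sketch. It is a simplified model under stated assumptions rather than the library's implementation: propRead and propWrite are named in the text, but hideName, translate (a stand-in for T_script that only tags the code), the isScript flag (a stand-in for the run-time identity check), and the layout of the hidden state are all hypothetical.

```javascript
// Hypothetical stand-in for T_script: it only tags the code to make the effect visible.
function translate(code) {
  return "/*rewritten*/ " + code;
}

// Script-visible property names beginning with "bshield" are shifted by one underscore.
function hideName(name) {
  const key = String(name);
  return key.startsWith("bshield") ? key + "_" : key;
}

function propWrite(obj, name, value) {
  const key = hideName(name);
  if (obj.isScript && key === "text") {
    // Keep the original code as a shadow copy in the hidden per-object state.
    obj.bshield = obj.bshield || {};
    obj.bshield.shadow = { ...obj.bshield.shadow, text: value };
    obj.text = translate(value); // the object itself holds the rewritten code
    return value;
  }
  obj[key] = value;
  return value;
}

function propRead(obj, name) {
  const key = hideName(name);
  const shadow = obj.bshield && obj.bshield.shadow;
  if (shadow && Object.prototype.hasOwnProperty.call(shadow, key)) {
    return shadow[key]; // scripts see their original code, not the rewrite
  }
  return obj[key];
}

// Rewritten form of: for (i in obj) { keys.push(i); }
function visibleKeys(obj) {
  const keys = [];
  for (const i in obj) {
    if (i == "bshield") continue; // skip the hidden state
    keys.push(i);
  }
  return keys;
}

// Usage: a script-like object whose code and "bshield" property are both written.
const script = { isScript: true };
propWrite(script, "text", "x = 1;");
propWrite(script, "bshield", 42);
```

After these two writes, `propRead(script, "text")` yields the original `"x = 1;"` while `script.text` holds the tagged rewrite, and the script's own `bshield` value lives under `bshield_` without touching the hidden state.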
Because scripts can only interact with shadow copies of their code and not modified copies, our transformations are not required to be idempotent. That is, we will never apply T_HTML or T_script to code that has already been transformed.

Preserving Context. The JavaScript eval function evaluates a string as script code in the current scope, and any occurrence of the this keyword in the string is bound to the current enclosing object. Thus, if eval were to be called from within bshield.invokeFunc, the this keyword might evaluate differently than in the original context.

For this reason, the rewriting rule for functions is actually more complex than shown in Table 1. Instead, the rewritten code first checks if the function being invoked is eval. If so, the parameter is translated using T_script and then evaluated in the correct context; otherwise, invokeFunc is called as described before. Thus, the code is rewritten as follows:

bshield.isEval(bshield.func = foo) ?
    eval(bshield.translate(x)) :
    bshield.invokeFunc(bshield.func, x);

Note that the function expression foo is assigned to a temporary state variable on the bshield object, so that the expression is not evaluated a second time in the call to invokeFunc.

This check is a special case that is only needed for eval, because eval is the only native function in JavaScript that accesses this. Other native functions, such as alert or parseInt, do not access this, and can be evaluated within invokeFunc.

4.4 Flexible policies

The final goal of BrowserShield is to support flexible policy enforcement. This can be achieved by separating mechanism from policy: Our mechanism consists of the rewrite rules for translating HTML and script code, and our policy consists of the run-time checks invoked by the rewritten code. Some run-time checks are critical for complete interposition, such as applying T_script to any string passed to eval or the Function constructor, or applying T_HTML to any string passed to document.write or assigned to document.body.innerHTML. These checks are always applied, regardless of what policy is in place. Because the interposition is policy-driven, our system can be made incrementally complete. For example, if an undocumented API is discovered that can manipulate the document tree, we simply add a new policy to interpose on this API.

The remaining run-time checks are used for enforcing flexible policies, such as the MS04-040 vulnerability filter in Figure 2. Such policy functions are downloaded separately from the remainder of the BrowserShield code, and they can be updated and customized based on the intended application.

Policy functions are given the chance to inspect and modify script behavior at all interposition points, including property reads and writes, function and method invocations, and object creations. We also allow policy writers to introduce new global state and functions as part of the global bshield object, or introduce local state and methods for all objects or for specific objects. Policy functions for HTML can also be registered by tag name. The tags are presented to HTML policy functions as part of a token stream of tags and text, without a full parse tree. It is also possible for policy functions to further parse the HTML token stream to gain additional context, although we have not yet encountered a need for this in the policies we have authored.

5 Implementation

We have implemented a prototype of BrowserShield as a service deployed at a firewall and proxy cache. Our prototype consists of a standard plugin to Microsoft's Internet Security and Acceleration (ISA) Server 2004 [21], and a JavaScript library that is sent to the client with transformed web documents. The ISA plugin plays the role of the BrowserShield logic injector.

We implemented our ISA plugin in C++ with 2,679 lines of code. Our JavaScript library has 3,493 lines (including comments). Most of the ISA plugin code is devoted to parsing HTML, while about half of the JavaScript library is devoted to parsing HTML or JavaScript. This is a significantly smaller amount of code than in a modern web browser, which implies that our trusted computing base is small compared to the code base we are protecting.

The ISA plugin is responsible for applying the T_HTML transformation to static HTML. The ISA plugin first inserts a reference to the BrowserShield JavaScript library into the web document. Because this library is distributed in a separate file, clients automatically cache it, reducing network traffic on later requests. T_HTML then rewrites all script elements such that they will be transformed using T_script at the client before they are executed. Figure 3 depicts this transformation; note that it does not require translating the JavaScript at the firewall.

In our implementation, the firewall component applies T_HTML using a streaming model, such that the ISA Server can begin sending transformed data to the client before the entire page is received. This streaming model also means that we do not expect the filter to be vulnerable to state-holding DoS attacks by malicious web pages.

One complexity is that BrowserShield's HTML parsing and JavaScript parsing must be consistent with that of the underlying browser. Any inconsistency will cause
false positives and false negatives in BrowserShield run-time checks. For our prototype, we have sought to match IE's behavior through testing and refinement. If future versions of browsers exposed this logic to other programs, it would make this problem trivial.

When the browser starts to run the script in the page, the library applies T_script to each piece of script code, translating it to call into the BrowserShield interposition layer. This may sometimes require decoding scripts, a procedure that is implemented in publicly available libraries [34] and which does not require cryptanalysis, though we have not yet incorporated it in our implementation.

A final issue in T_script is translating scripts that originate in source files linked to from a source tag. T_HTML rewrites such source URLs so that they are fetched through a proxy. The proxy wraps the scripts in the same way that script code embedded directly in the page is wrapped. For example, a script source URL of http://foo.com/script.js would be translated to http://rewritingProxy/translateJS.pl?url=http://foo.com/script.js. T_script is then applied at the client after the script source file is downloaded.

6 Evaluation

Our evaluation focuses on measuring BrowserShield's vulnerability coverage, the complexity of authoring vulnerability filters, the overhead of applying the BrowserShield transformations at firewalls, and the overhead of running the BrowserShield interposition layer and vulnerability filters at end hosts.

6.1 Vulnerability Coverage

We evaluated BrowserShield's ability to protect IE against all critical vulnerabilities for which Microsoft released patches in 2005 [1]. Of the 29 critical patches that year, 8 are for IE, corresponding to 19 IE vulnerabilities. These vulnerabilities fall into three classes: IE's handling of (i) HTML, script, or ActiveX components, (ii) HTTP, and (iii) images or other files. Table 2 shows how many vulnerabilities there were in each area, and whether BrowserShield or another technology could provide patch-equivalent protection. The BrowserShield design is focused on HTML, script, and ActiveX controls, and it can successfully handle all 12 of these vulnerabilities. This includes vulnerabilities where the underlying programmer error is at a higher layer of abstraction than a buffer overrun, e.g., a cross-domain scripting vulnerability. Handling HTTP accounted for 3 of the 19 vulnerabilities. Perhaps surprisingly, 2 out of 3 of these vulnerabilities required BrowserShield in addition to an existing HTTP filter, such as Snort [38] or Shield [43]. This is because malformed URLs could trigger the HTTP layer vulnerabilities regardless of whether the URL came over the network or was generated internally by the browser. BrowserShield is able to prevent the HTML/script layer from triggering the generation of these bad HTTP requests. Processing images or other files accounted for the remaining 4 vulnerabilities. Patch-equivalent protection for these vulnerabilities is already available using existing anti-virus solutions [13].

                                  protected by
vulnerability type        #       BrowserShield   HTTP filter   antivirus
HTML, script, ActiveX     12      12              0             0
HTTP                      3       2*              3*            0
images and other files    4       0               0             4

Table 2: BrowserShield Vulnerability Coverage. * Two of the HTTP vulnerabilities required both BrowserShield and an HTTP filter to provide patch-equivalent protection.

Because management and deployment costs are often incurred on a per-patch basis, we also analyze the vulnerabilities in Table 2 in terms of the corresponding patches. For the 8 IE patches released in 2005, combining BrowserShield with standard anti-virus and HTTP filtering would have provided patch-equivalent protection in every case, greatly reducing the costs associated with multiple patch deployments. In the absence of BrowserShield, anti-virus and HTTP filtering would have provided patch-equivalent protection for only 1 of the IE patches.

6.2 Authoring Vulnerability Filters

To evaluate the complexity of vulnerability filtering, we chose three vulnerabilities from three different classes: HTML Elements Vulnerability (MS04-040), COM Object Memory Corruption (MS05-037), and Mismatched DOM Object Memory Corruption (MS05-054).

We filtered for the MS04-040 vulnerability using the function shown in Figure 2. Registering this filter for each of the three vulnerable tags is as simple as:

bshield.addHTMLTagPolicy("IFRAME", func);

COM object vulnerabilities typically result from IE instantiating COM objects that have memory errors in their constructors. The IE patch blacklists particular COM objects (identified by their clsid). Implementing an equivalent blacklist requires adding checks for an HTML tag (the OBJECT tag) and sometimes a JavaScript function (the ActiveXObject constructor, which can be used to instantiate a subset of the COM objects accessible through the OBJECT tag). In the case of MS05-037, it does not appear to be possible to instantiate the vulnerable COM object using the ActiveXObject constructor. The OBJECT tag filter is conceptually similar to the function shown in Figure 2.

The MS05-054 vulnerability results when the window object, which is not a function, is called as a function in the outermost scope. Our interposition layer itself prevents window from being called as a function in the outermost scope since all function calls are mediated by BrowserShield with invokeFunc. Hence there is no need for a filter. Nevertheless, if this vulnerability had not depended on such a scoping constraint, we could simply have added a filter to prevent calling the object as a function.

To test the correctness of our vulnerability filters, we installed an unpatched image of Windows XP Pro within a virtual machine, and created web pages for each of the vulnerabilities that caused IE to crash. Applying BrowserShield with the filters caused IE to not crash upon viewing the malicious web pages. We tested the fidelity of our filters using the same set of URLs that we used in our evaluation of BrowserShield's overhead (details are in Section 6.3). Under side-by-side visual comparisons, we found that the filters had not changed the behavior of any of the web pages, as desired.

6.3 Firewall Performance

We evaluated BrowserShield's performance by scripting multiple IE clients to download web pages (and all their embedded objects) through an ISA server running the BrowserShield firewall plugin. The ISA firewall ran on a Compaq Evo PC containing a 1.7GHz Pentium 4 microprocessor and 1 GB RAM. Because we are within a corporate intranet, our ISA server connected to another HTTP proxy, not directly to web sites over the internet. We disabled caching at our ISA proxy, and we fixed our IE client cache to contain only the BrowserShield JavaScript library, consistent with the scenario of a firewall translating all web sites to contain a reference to this library.

We ran 10 IE processes concurrently using 10 pages that IE could render quickly (so as to increase the load on the firewall), and repeatedly initiated each page visit every 5 seconds. We used manual observation to determine when the load on the ISA server had reached a steady state.

We chose these 10 pages out of a set of 70 URLs that are the basis for our client performance macrobenchmarks. This set is based on a sample of 250 of the top 1 million URLs clicked on after being returned as MSN Search results in Spring 2005, weighted by click-through count. Specifically, the 70 URLs are those that BrowserShield can currently render correctly; the remaining URLs in the sample encountered problems due to incomplete aspects of our implementation, such as JavaScript parsing bugs.

We measured CPU and memory usage at the firewall, as shown in Table 3. CPU usage increased by about 22%, resulting in a potential degradation of throughput by 18.1%; all aspects of memory usage we measured increased by negligible amounts. We also found that network usage increased only slightly (more detail in Section 6.4.2).

resource           unmodified   browsershield
cpu utilization    15.0%        18.3%
virtual memory     317 MB       319 MB
working set        45.5 MB      46.6 MB
private bytes      26.3 MB      27.3 MB

Table 3: BrowserShield Firewall overheads. "Virtual memory" measures the total virtual memory allocated to the process; "working set" measures memory pages that are referenced regularly; "private bytes" measures memory pages that are not sharable.

6.4 Client Performance

We evaluated the client component of our BrowserShield implementation through microbenchmarks on the JavaScript interposition layer and macrobenchmarks on network load, client memory usage, and the latency of page rendering.

6.4.1 Microbenchmarks

We designed microbenchmarks to measure the overhead of individual JavaScript operations after translation. Table 4 lists our microbenchmarks and their respective BrowserShield slowdown. Our results are averages over 10 trials, where each trial evaluated its microbenchmark repeatedly, and lasted over half a second. For the first 11 microbenchmarks, the standard deviation over the 10 trials was less than 2%. In the last case it was less than 8%. The slowdown ratio was computed using the average time required per microbenchmark evaluation with and without the interposition framework.

      operation                                                         slowdown
1     i++                                                               1.00
2     a=b+c                                                             1.00
3     if                                                                1.07
4     string concat ('+')                                               1.00
5     string concat ('concat')                                          61.9
6     string split ('split')                                            21.9
7     no-op function call                                               44.8
8     x.a = b (property write)                                          342
9     eval of minimal syntactic structure                               47.3
10    eval of moderate syntactic structure, minimal computation         136
11    eval of moderate syntactic structure, significant computation     1.34
12    image swap                                                        1.07

Table 4: BrowserShield Microbenchmarks. Slowdown is the ratio of the execution time of BrowserShield translated code and that of the original code.

Microbenchmarks 1-4 measure operations for which we expect no changes during rewriting, and hence no
slowdown. The only slowdown we measure is in the case of the if statement. Further examination showed that the BrowserShield translation inserted a semicolon (e.g., var a = 1 (linebreak) changed to var a = 1; (linebreak)). This results in a 7% slowdown.

Microbenchmarks 5-8 measure operations we expect to incur a slowdown comparable to an interpreter's slowdown. As detailed in Section 4, BrowserShield translation introduces additional logic around method calls, function calls, and property writes, leading to a slowdown in the range of 20x-400x. This slowdown is in line with good interpreters [32], but worse than what is achieved by rewriting systems targeting other languages, e.g., Java bytecode [10]. BrowserShield is paying a price for the JavaScript subtleties that previous rewriting systems did not have to deal with. We were curious about the difference in slowdown between the two string methods; an additional experiment showed that the difference can be attributed to the JavaScript built-in concat method requiring about 3 times as much CPU as the built-in split method. Also, it is not surprising that property writes have a greater slowdown than function or method calls because property writes need to both guard the BrowserShield namespace and interpose on writes to DOM elements (such as the text property of scripts).

Microbenchmarks 9-11 explore the overhead of translating JavaScript code of various complexity. The "eval of minimal syntactic structure" microbenchmark measures the cost of translating and then evaluating a simple assignment. The cause of the large slowdown is the additional work done by eval in the BrowserShield framework: parsing, constructing an AST, modifying the AST, and outputting the new AST as a JavaScript program. The two subsequent "eval of moderate syntactic structure" microbenchmarks measure the cost of translating and evaluating a simple for(;;) loop. This simply demonstrates that as the cost of the computation inside the simple loop increases, the cost of translating the code can decrease to a small fraction of the overall computational cost.

The last microbenchmark measures the overhead of performing a simple manipulation of the DOM: swapping two 35 KB images. This microbenchmark is designed to measure the relative importance of overheads in the JavaScript engine when the JavaScript is manipulating the layout of the HTML page. The JavaScript code to swap these two images requires two property writes (i.e., img.src = 'newLink'), and we described above how BrowserShield translation adds significant overhead to property writes. Nonetheless, the overall slowdown is less than 8%. In particular, the raw time to swap the image only increases from 26.7 milliseconds to 28.5 milliseconds. This suggests that even the large overheads that BrowserShield translation adds to some language constructs may still be quite small in the context of a complete web page.

In summary, BrowserShield incurs a significant overhead on the language constructs where it must add interpreter-like logic, but these overheads can be quite small within the context of the larger DOM manipulations in embedded scripts.

6.4.2 Macrobenchmarks

We designed macrobenchmarks to measure the overall client experience when the BrowserShield framework is in place. In particular, the macrobenchmarks include all the dynamic parsing and translation that occurs before the page is rendered, while the microbenchmarks primarily evaluated the performance of the translated code accomplishing a task relative to the untranslated code accomplishing that same task. To this end, we scripted an instance of IE to download each of the 70 web pages in our workload 10 times. For the same reasons given in our evaluation of the BrowserShield ISA component, we maintained that the only object in the IE cache was the BrowserShield JavaScript library. These caching policies represent a worst case for client latency. This measurement includes the overhead of the three filters that we discussed in Section 6.2. We then repeated these measurements without the BrowserShield framework and translation.

We set a 30-second upper limit on the time to render the web page, including launching secondary (popup) windows and displaying embedded objects, but not waiting for secondary windows to render. We visually verified that the programmatic signal that rendering had completed indeed corresponded to the user's perception that the page had rendered. IE hit the 30-second timeout several times in these trials, and it hit the timeouts both when the BrowserShield framework and translation were present and when the framework and translation were absent. We did not discern any pattern in these timeouts, and because our experiments include factors outside our control, such as the wide-area network and the servers originating the content, we do not expect page download times to be constant over our trials. We re-ran the trials that experienced the timeouts.

Figure 5 shows the CDF of page rendering with and without BrowserShield. On average BrowserShield added 1.7 seconds to page rendering time. By way of contrast, the standard deviation in rendering time without BrowserShield was 1.0 seconds.

In Figure 6, we further break down the latency for the 10 pages that took the most time to render under BrowserShield. They experienced an average increase in latency of 6.3 seconds, requiring 3.9 seconds on average without BrowserShield and 10.2 seconds on average with
  • 11. BrowserShield. Of this 6.3 seconds of increased latency, we found that 2.8 seconds (45%) could be attributed to the overhead of dynamically translating JavaScript and HTML within IE. We attribute the remaining overhead to effects such as the overhead of evaluating the translated code, and the time to modify the HTML at the firewall.

We broke down the latency of dynamic translation for both HTML and JavaScript into two parts each: the time to parse the JavaScript/HTML into an AST and convert the modified AST back to a string, and the time to modify the AST. We found that the time to parse the JavaScript to and from a string was always more than 70% of the overall latency of dynamic translation, and it averaged 80% of the overall latency. Figure 7 shows the JavaScript parsing time versus the number of kilobytes. Fitting a least-squares line to this data yields an average parse rate of 4.1 KB of JavaScript per second, but there was significant variation; the slowest parse rate we observed was 1.3 KB/second.

Figure 8 shows the memory usage of page rendering with and without BrowserShield. We found that private bytes (memory pages that are not sharable) was the client memory metric that increased the most when rendering the transformed page. Private memory usage increased on average by 11.8%, from 19.8 MB to 22.1 MB. This increase was quite consistent; no page caused memory usage to increase by more than 3 MB.

We also measured the increased network load over a single run through the pages both with and without BrowserShield. We measured an average increase of 9 KB, less than the standard deviation in the network load over any individual trial due to background traffic during our measurements. We expect BrowserShield rewriting to only slightly increase the network load, because the firewall just adds script wrappers, while the translation itself happens at the client.

Figure 5: Latency CDF with and without BrowserShield. [Plot: Percentage of Pages vs. Time to Render (seconds); series: unmodified, browsershield]
Figure 6: Breakdown of latency for slowest 10 pages under BrowserShield. [Plot: Time to Render (seconds) per page; series: unmodified, browsershield: translation at client, browsershield: other overhead]
Figure 7: Latency of JavaScript parsing. [Plot: Time to Parse JavaScript (seconds) vs. JavaScript (KB)]
Figure 8: Memory Usage at Client. [Plot: Percentage of Pages vs. Client Memory Usage (MB); series: unmodified, browsershield]

7 Related Work

We first compare with other protection systems in Section 7.1. We then discuss BrowserShield’s relation to the extensive work on code rewriting and interposition in Section 7.2.

7.1 Remote Exploit Defense

In our prior work on Shield [43], we proposed using vulnerability-specific filters to identify and block network traffic that would exploit known software vulnerabilities. Shield maintains protocol-specific state machines in an end-host’s network stack, allowing it to recognize when a packet will trigger a vulnerability. However, the Shield approach does not address dynamic content such as scripts in web documents, since it is undecidable whether script code in a document will eventually exploit a vulnerability. BrowserShield shares Shield’s focus on vulnerability-specific filters, but in contrast to
  • 12. Shield, its use of runtime interposition allows it to handle exploits in dynamic HTML.

Like BrowserShield and Shield, IntroVirt also employs vulnerability-specific predicates, specifically to detect past and present intrusions using virtual machine introspection and replay [23]. As a result, IntroVirt allows “just in time” patch application: postponing the application of a patch while an exploit has not occurred, and rolling back system state to apply a patch if an exploit does occur. BrowserShield instead offers protection while a patch is being tested (or otherwise delayed) by the administrator of a vulnerable system, buying time even in cases where exploits are immediately attempted. Additionally, BrowserShield supports more flexible deployment scenarios. For example, it does not require the client’s browser to run inside an instrumented virtual machine.

Opus [2] seeks to address the problem of patching by allowing patches to be applied without restarting the application. Opus provides tools for developers to increase the reliability of such “dynamic” patches during development. However, these tools reduce, but do not eliminate, the programmer’s burden to produce a correct dynamic patch.

Vigilante [7] focuses on worm containment, automatically detecting and preventing the spread of worm traffic. Vigilante combines rapid distribution of self-certifying alerts and automatic filter generation, along with vulnerability detection techniques such as non-executable pages [30] and dynamic dataflow analysis [29]. These techniques, even with the Vigilante improvements, admit false negatives. BrowserShield does not share the speed constraint of Vigilante, since browser exploits require human involvement and therefore do not spread on the same time scales as worms. Therefore, we are able to trade off the speed of automatically generated vulnerability filters for the accuracy of hand-coded filters.

EarlyBird [36] and Autograph [24] are two exemplary systems that use pattern matching to block network traffic containing exploits. Pattern matching scales to high data rates, crucial to the authors’ goal of stopping worm outbreaks at network choke points. The HTML scripts that are BrowserShield’s focus seem difficult to detect consistently with pattern matching, as they can trivially modify themselves at the client.

HoneyMonkey [44] aims to discover web servers that distribute malicious code. In HoneyMonkey, virtual machines automatically fetch content from the web and use black-box techniques to discover exploits. Exploit discovery is complementary to BrowserShield’s approach of providing patch-equivalent protection to clients.

Finally, a number of techniques have aimed to sandbox the browser or other applications, in effect protecting the operating system from the impact of an exploit. These techniques include system call interposition [17, 18, 22] and Microsoft’s “protected mode” for IE in Windows Vista [20]. These may limit damage to a user’s computing environment, but they do not protect the browser itself, allowing attacks such as keylogging to easily be conducted from exploited browsers. Tahoma [8] takes the confinement approach one step further, sandboxing browsers in virtual machines and using site-specific manifests to restrict browser traffic to known servers. While this could help to mitigate many browser-related problems, it is unclear how difficult it would be to get such manifests widely adopted.

7.2 Interposition Techniques

Interposition techniques such as code rewriting have been used in previous work to achieve additional safety properties or otherwise modify the behavior of existing code. Code rewriting is only one of several alternatives for backward-compatible modifications, and the choice of technique is influenced by tradeoffs in deployability and performance. Directly modifying the execution environment, such as the Java Virtual Machine, has the highest deployment barriers. Some work instead uses a level of indirection, such as emulation (e.g., Bochs [6]), easing deployment but incurring a high performance overhead. Thus, BrowserShield and others [10, 41, 42] employ code rewriting, with its low barriers to deployment and smaller performance overhead than that required by an emulator.

We characterize interposition techniques by the target of interposition, since the technical differences between targets require different solutions. Compared to approaches for other interposition targets, BrowserShield must address a new combination of technical challenges presented by JavaScript: its scoping rules, an implicit garbage collector, pervasive reflection, and its prototype-based object model (which implies a lack of static typing).

Machine Code. Many approaches focus on the machine code interface, whether rewriting binary instructions or emulating them at runtime. Software Fault Isolation (SFI) [41] rewrites binary code to insert runtime checks, creating sandboxes that prevent code from writing or jumping to addresses outside its fault domain. This creates process-like memory boundaries between units of code within a process. The more recent XFI [9] uses binary rewriting to provide flexible access control and additional integrity guarantees. VMware ESX Server [42] also rewrites machine code, in its case to allow programs to be virtualized on x86 hardware. Etch [31] rewrites machine code with the goals of profiling and measurement. Valgrind [40] and Program Shepherding [25] are dynamic binary instrumentation tools. Valgrind’s goal is
  • 13. to offer debugging and profiling support, while Program Shepherding’s goal is to monitor control flow, preventing the transfer of control to data regions which might include malicious code.

The techniques used for rewriting at the machine code interface do not need to address any of the four challenges of JavaScript rewriting that have influenced BrowserShield: scoping, reflection, garbage collection, or typing. Most work interposing at the machine code interface only adds semantics that can be defined in terms of low-level operations, such as enforcing a process-like memory boundary, as in SFI. Indeed, Erlingsson and Schneider [11] note the difficulty of extending rewriting at the machine code interface to enforce policies on the abstractions internal to an application. BrowserShield’s interposition target (the HTML document tree) is such an application-internal abstraction.

System Call Interface. Much previous work has modified user-level program behavior by interposing on the system call interface. Jones introduces a toolkit for system call interposition agents that simplifies tasks such as tracing, emulation, and sandboxing [22]. Wagner et al. use system call interposition in Janus to confine untrusted applications to a secure sandbox environment [18]. Garfinkel notes difficulties in trying to interpose on the system call interface [16], such as violating OS semantics, side effects, and overlooking indirect paths. Garfinkel et al. discuss a delegation-based architecture to address some of these problems [17]. Naccio describes an approach to provide similar guarantees by rewriting x86 code that links against the Win32 system call interface [12]. Naccio can also rewrite Java bytecode.

Work on the system call interface differs from BrowserShield both in goal and in technique. System call interposition can guard external resources from an application, while the goal of BrowserShield is to guard an application-internal resource, the HTML document tree. Naccio’s use of rewriting as a technique to interpose on the system call interface does not present any of the four technical challenges (scoping, reflection, garbage collection, or typing) relevant to JavaScript rewriting. For example, Naccio also wraps methods to accomplish interposition, but Naccio’s method wrappers do not need to handle JavaScript’s scoping rules, and so do not need to implement swapping.

Java Bytecode. Several pieces of previous work [10, 11, 37], including the previously mentioned Naccio [12], have used rewriting at the Java Virtual Machine bytecode interface [26]. This interface is type-safe, and provides good support for reasoning about application-internal abstractions. In the most similar of these works to BrowserShield, Erlingsson’s PoET mechanism rewrites Java bytecode to enforce security policies expressed in the PSLang language [10].

JavaScript’s pervasive reflection, scoping rules, and prototype-based object model forced us to develop several techniques not needed for Java bytecode rewriting. For example, where Java bytecode rewriting can interpose on Java’s reflection API, BrowserShield must interpose on all property reads and writes, as well as some for loops, to achieve similar control over reflection. Additionally, Java bytecode rewriting can achieve complete interposition by only modifying callees (using method wrappers) and without maintaining state, though some previous work allowed modifying callers or adding state to simplify policy construction [10]. In contrast, BrowserShield must modify both callers and callees to appropriately handle scoping and the possibility of functions aliased as methods (and vice versa). Also, BrowserShield must maintain state, requiring careful attention to its interaction with the JavaScript garbage collector.

Web Scripting Languages. We are not aware of any full interposition techniques for web scripting languages like JavaScript. The SafeWeb anonymity service used a JavaScript rewriting engine that failed to provide either complete interposition or transparency [27]. The Greasemonkey [19] extension to the Firefox browser allows users to run additional site-specific scripts when a document is loaded, but it does not provide complete interposition between existing script code and the HTML document tree.

8 Conclusion

Web browser vulnerabilities have become a popular vector for attacks. Filtering exploits of these vulnerabilities is made challenging by the dynamic nature of web content. We have presented BrowserShield, a general framework that rewrites HTML pages and any embedded scripts to enforce policies on run-time behavior. We have designed BrowserShield to provide complete interposition over the underlying resource (the HTML document tree) and to be transparent and tamper-proof. Because BrowserShield transforms content rather than browsers, it supports deployment at clients, firewalls, or web publishers. Our evaluation shows that adding this approach to existing firewall and anti-virus techniques increases the fraction of IE patches from 2005 that can be protected at the network level from 12.5% to 100%, and that this protection can be done with only moderate overhead.

We have focused on the application of vulnerability-driven filtering in this paper, but JavaScript rewriting techniques may also enable new functionality for AJAX (Asynchronous JavaScript and XML) applications. Some potential uses include: eliminating the effort currently required to modify a website for the Coral [15]
  • 14. CDN; modifying the cached search results returned by web search engines to redirect links back into the cache (since the original site may be unavailable); allowing appropriately sandboxed dynamic third-party content on a community site (such as a blog or wiki) that currently must restrict third-party content to be static; and debugging JavaScript code when attaching a debugger is infeasible, perhaps offering call traces or breakpoint functionality for complex scripts. User interface changes could even be added to make phishing more difficult, e.g., enforcing the display of origin URLs on all pop-up windows. As this list suggests, we are optimistic that JavaScript rewriting is a widely applicable technique.

References

[1] Microsoft Security Bulletin Summaries and Webcasts, 2005. http://www.microsoft.com/technet/security/bulletin/summary.mspx.
[2] G. Altekar, I. Bagrak, P. Burstein, and A. Schultz. OPUS: Online Patches and Updates for Security. In USENIX Security, 2005.
[3] J. P. Anderson. Computer Security Technology Planning Study Volume II. ESD-TR-73-51, Vol. II, Electronic Systems Division, Air Force Systems Command, Hanscom Field, Bedford, MA, October 1972.
[4] W. A. Arbaugh, W. L. Fithen, and J. McHugh. Windows of Vulnerability: a Case Study Analysis. IEEE Computer, December 2000.
[5] S. Beattie, S. Arnold, C. Cowan, P. Wagle, and C. Wright. Timing the Application of Security Patches for Optimal Uptime. In LISA, 2002.
[6] Bochs. http://bochs.sourceforge.net/.
[7] M. Costa, J. Crowcroft, M. Castro, A. Rowstron, L. Zhou, L. Zhang, and P. Barham. Vigilante: End-to-End Containment of Internet Worms. In SOSP, 2004.
[8] R. S. Cox, J. G. Hansen, S. D. Gribble, and H. M. Levy. A Safety-Oriented Platform for Web Applications. In IEEE Symposium on Security and Privacy, 2006.
[9] Ú. Erlingsson, M. Abadi, M. Vrable, M. Budiu, and G. C. Necula. XFI: Software guards for system address spaces. In OSDI, 2006.
[10] Ú. Erlingsson and F. B. Schneider. IRM Enforcement of Java Stack Inspection. In IEEE Symposium on Security and Privacy, 2000.
[11] Ú. Erlingsson and F. B. Schneider. SASI Enforcement of Security Policies: A Retrospective. In WNSP: New Security Paradigms Workshop, 2000.
[12] D. Evans and A. Twyman. Flexible Policy-Directed Code Safety. In IEEE Symposium on Security and Privacy, 1999.
[13] Eweek: Anti-Virus Protection for WMF Flaw, December 2005. http://www.eweek.com/article2/0,1895,1907102,00.asp.
[14] Mozilla Security Alerts and Announcements. http://www.mozilla.org/security/.
[15] M. J. Freedman, E. Freudenthal, and D. Mazières. Democratizing Content Publication with Coral. In NSDI, 2004.
[16] T. Garfinkel. Traps and Pitfalls: Practical Problems in System Call Interposition based Security Tools. In NDSS, 2003.
[17] T. Garfinkel, B. Pfaff, and M. Rosenblum. Ostia: A Delegating Architecture for Secure System Call Interposition. In NDSS, 2004.
[18] I. Goldberg, D. Wagner, R. Thomas, and E. A. Brewer. A Secure Environment for Untrusted Helper Applications. In USENIX Security, 1996.
[19] Greasemonkey. http://greasemonkey.mozdev.org/.
[20] Protected Mode in Vista IE7. http://blogs.msdn.com/ie/archive/2006/02/09/528963.aspx.
[21] ISA Server. http://www.microsoft.com/isaserver/default.mspx.
[22] M. B. Jones. Interposition Agents: Transparently Interposing User Code at the System Interface. In SOSP, 1993.
[23] A. Joshi, S. T. King, G. W. Dunlap, and P. M. Chen. Detecting Past and Present Intrusions Through Vulnerability-specific Predicates. In SOSP, 2005.
[24] H.-A. Kim and B. Karp. Autograph: Toward Automated, Distributed Worm Signature Detection. In USENIX Security, 2004.
[25] V. Kiriansky, D. Bruening, and S. Amarasinghe. Secure Execution via Program Shepherding. In USENIX Security, 2002.
[26] T. Lindholm and F. Yellin. The Java Virtual Machine Specification, 2nd edition, 1999.
[27] D. Martin and A. Schulman. Deanonymizing Users of the SafeWeb Anonymizing Service. In USENIX Security, 2002.
[28] Microsoft Security Bulletin MS04-040, December 2004. http://www.microsoft.com/technet/security/Bulletin/MS04-040.mspx.
[29] J. Newsome and D. Song. Dynamic Taint Analysis for Automatic Detection, Analysis, and Signature Generation of Exploits on Commodity Software. In NDSS, 2005.
[30] Pax. http://pax.grsecurity.net/.
[31] T. Romer, G. Voelker, D. Lee, A. Wolman, W. Wong, H. Levy, and B. Bershad. Instrumentation and Optimization of Win32/Intel Executables Using Etch. In USENIX NT Workshop, 1997.
[32] T. H. Romer, D. Lee, G. M. Voelker, A. Wolman, W. A. Wong, J.-L. Baer, B. N. Bershad, and H. M. Levy. The Structure and Performance of Interpreters. In ASPLOS, 1996.
[33] J. H. Saltzer and M. D. Schroeder. The Protection of Information in Computer Systems. In SOSP, 1973.
[34] Windows Script Decoder. http://www.virtualconspiracy.com.
[35] Secure Computing. http://www.securecomputing.com/pdf/WW-SSLscan-PO.pdf.
[36] S. Singh, C. Estan, G. Varghese, and S. Savage. Automated Worm Fingerprinting. In OSDI, 2004.
[37] E. G. Sirer, R. Grimm, A. J. Gregory, and B. N. Bershad. Design and Implementation of a Distributed Virtual Machine for Networked Computers. In SOSP, 1999.
[38] The Open Source Network Intrusion Detection System. http://www.snort.org/.
[39] D. Ungar and R. B. Smith. Self: The Power of Simplicity. In OOPSLA, 1987.
[40] Valgrind. http://www.valgrind.org/.
[41] R. Wahbe, S. Lucco, T. Anderson, and S. Graham. Efficient Software-Based Fault Isolation. In SOSP, 1993.
[42] C. A. Waldspurger. Memory Resource Management in VMware ESX Server. In OSDI, 2002.
[43] H. J. Wang, C. Guo, D. R. Simon, and A. Zugenmaier. Shield: Vulnerability-Driven Network Filters for Preventing Known Vulnerability Exploits. In SIGCOMM, 2004.
[44] Y.-M. Wang, D. Beck, X. Jiang, R. Roussev, C. Verbowski, S. Chen, and S. King. Automated Web Patrol with Strider HoneyMonkeys: Finding Web Sites That Exploit Browser Vulnerabilities. In NDSS, 2006.
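
Supplementary sketch for Section 6.4.2. The parse-rate figure reported there comes from fitting a least-squares line to parse time versus script size (Figure 7). The snippet below is a minimal illustration of that kind of fit, not the code used in the evaluation. It assumes an ordinary least-squares line with an intercept and takes the parse rate to be the reciprocal of the slope; the paper does not state either detail. The function name fitLine and the sample measurements are hypothetical and are not data from the paper.

```javascript
// Ordinary least-squares fit of y = slope * x + intercept.
// fitLine is a hypothetical helper name, not part of BrowserShield.
function fitLine(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("need at least two (x, y) pairs of equal length");
  }
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let sxy = 0; // sum of (x - meanX) * (y - meanY)
  let sxx = 0; // sum of (x - meanX)^2
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - meanX) * (ys[i] - meanY);
    sxx += (xs[i] - meanX) ** 2;
  }
  if (sxx === 0) {
    throw new Error("all x values are identical; slope is undefined");
  }
  const slope = sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
}

// Made-up measurements for illustration only (NOT from the paper):
// script size in KB and the time in seconds taken to parse it.
const scriptSizesKB = [2, 5, 10, 15, 20];
const parseTimesSec = scriptSizesKB.map((kb) => 0.25 * kb + 0.1);

// The slope is in seconds per KB, so its reciprocal is KB per second.
const fit = fitLine(scriptSizesKB, parseTimesSec);
const parseRateKBPerSec = 1 / fit.slope;
console.log("parse rate (KB/s):", parseRateKBPerSec.toFixed(2));
```

Reading the rate off the slope rather than dividing total kilobytes by total seconds keeps any fixed per-script cost in the intercept, which is one plausible reason to summarize such data with a fitted line.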