Preventing
Information Leaks
with Jeeves
Jean Yang, MIT
July 21, 2015
Wearable
devices
Data, Data Everywhere
Jean Yang / Jeeves 2
Social
media
Electronic health
records
Online courses
All Kinds of People Are
Writing All Kinds of
Code
Open source
lines of code
Journalists
Medical
researchers
Social
scientists
Children
Even Trained
Developers Leak
Information
Why Aren’t Existing
Approaches Enough?
Exploit
Patch
But leaves system
builders a step
behind.
Defensive
protection
But people are still
showing the data
wrong.
Encrypting
Data
My Approach:
Privacy by Construction
Factor out security and privacy to
reduce opportunity for leaks.
• Programmer specifies high-level policies about
how sensitive data can be used.
• Rest of program is policy-agnostic.
• System manages policies automatically.
Social Calendar Example
Alice and Bob throw a surprise party for Carol.
Even Seemingly Simple
Policies Have Subtleties
Guests Carol Strangers
Surprise
party for
Carol at
Chuck E.
Cheese.
Pizza with
Alice/Bob.
Private event
at Chuck E.
Cheese.
Policy: Must be guest.
Policy: Only visible to hosts until finalized.
Policies can depend on sensitive values and other policies.
Problem:
Enforcing Policies Can
Leak Information!
Guests
Surprise
party at
Chuck E.
Cheese.
Policy: Only visible to
hosts until finalized.
Policy: Must be guest.
Guest list finalized
Guests can’t see
event
Guests can see
event
• Subtle mistake:
check for policy 1
neglects
dependency on
policy 2.
• Problem arises
when programmers
trusted to get
dependencies right.
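The subtle mistake described above can be sketched in a few lines of plain Python. This is an illustration only: the `Event` class and the three check functions are hypothetical names, not code from the talk or from any real calendar system.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    hosts: set
    guests: set = field(default_factory=set)
    finalized: bool = False


def can_see_guest_list(viewer, event):
    # Guest-list policy: only hosts may see the list until it is finalized.
    if viewer in event.hosts:
        return True
    return event.finalized and viewer in event.guests


def can_see_event_buggy(viewer, event):
    # "Must be guest" checked in isolation: forgets that guest membership
    # is itself sensitive until the list is finalized.
    return viewer in event.hosts or viewer in event.guests


def can_see_event(viewer, event):
    # Same check, but it respects the dependency on the guest-list policy.
    if viewer in event.hosts:
        return True
    return viewer in event.guests and can_see_guest_list(viewer, event)


draft = Event(hosts={"Alice", "Bob"}, guests={"Dave"}, finalized=False)
leaked = can_see_event_buggy("Dave", draft)   # potential guest sees a draft event
guarded = can_see_event("Dave", draft)        # hidden until the list is finalized
```

The two functions differ by one conjunct, which is the kind of detail that is easy to miss when the check is repeated by hand across a code base.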
1
2
Policies Are Intertwined
Across the Code
“What is the most
popular location
among friends 7pm
Tuesday?”
Update to
event
subscribers
• Track information flow through derived values.
• Track where derived values flow.
Problem:
“Policy Spaghetti”
in Real Systems
Code from
HotCRP
conference
management
system
Highlighted: conditional permissions checks everywhere.
Programming model
provides mathematical
guarantees.
Implementation strategy is
practically feasible.
Automatic Enforcement
with Jeeves
The well-intentioned programmer
writes same code no matter what
policies are.
Jeeves Factors Out
Policies
• Centralized policies.
• Policy-agnostic
program.
• Runtime
differentiates
behavior.
Model View Controller
Policy-Agnostic
Programming in Jeeves
Application Code
Separate from policies.
policies
Sensitive values
encapsulate multiple
behaviors.
Policies describe rules for
how values may flow to
output contexts.
.guests [ ]
Jeeves Supports
Expressive Policies
def isNotCarol(oc): return oc !=
Output context can be of arbitrary type.
def isGuest(oc): return oc in .guests
A policy is an arbitrary function that takes the
output context and returns a Boolean value.
Policies can depend on sensitive values.
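As a rough illustration of "a policy is a function from output context to Boolean", here is a plain-Python sketch. The `User` and `Event` classes and the sample people are assumptions made for the example; the slide's own code elides the concrete values, and this is not the Jeeves library API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    name: str


@dataclass
class Event:
    hosts: list
    guests: list = field(default_factory=list)


alice, bob, carol, eve = (User(n) for n in ("Alice", "Bob", "Carol", "Eve"))
party = Event(hosts=[alice, bob], guests=[alice, bob, carol])


def is_not_carol(oc):
    # The output context here is a User, but it could be any type.
    return oc != carol


def is_guest(oc):
    # This policy depends on a value (the guest list) that may itself be sensitive.
    return oc in party.guests
```

Both policies are ordinary functions, so nothing stops them from reading other program state; that flexibility is what makes dependencies between policies possible.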
==
true false
print { } print { }
true false
Jeeves Programming
Model
1. Programmer specifies policies and facets.
2. Programmer writes policy-agnostic programs.
3. Runtime propagates values and policies.
4. Runtime produces differentiated output based on the viewer.
The Jeeves
Programming Model
• Well-defined runtime semantics
for policy-agnostic programming
with information flow policies.
• Can be implemented standalone
or embedded as a library.
• Has been adapted across
runtimes in web frameworks.
FINALLY.. I CAN FOCUS ON
FUNCTIONALITY!
if == :
x += 1
return x
x = 0
print { } print { }
1 0
Jeeves Execution Model
Runtime
propagates
values and
policies.
Runtime
solves for
values to show
based on
policies and
viewer.
Runtime simulates simultaneous multiple
executions.
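One way to picture "simulating simultaneous multiple executions" is a value that carries both a secret and a public facet, with every operation applied to both. The sketch below is a toy model under that assumption; `Facet`, `lift`, and `faceted_if` are hypothetical helpers, not the Jeeves runtime, and the location strings are sample data.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Facet:
    label: str    # name of the label guarding this value
    secret: Any   # what permitted viewers see
    public: Any   # what everyone else sees


def lift(fn, value):
    """Apply fn to a plain value, or to both facets of a faceted value."""
    if isinstance(value, Facet):
        return Facet(value.label, lift(fn, value.secret), lift(fn, value.public))
    return fn(value)


def faceted_if(cond, then_fn, else_fn):
    """Run both branches when the condition is faceted, keeping the label."""
    if isinstance(cond, Facet):
        return Facet(cond.label,
                     faceted_if(cond.secret, then_fn, else_fn),
                     faceted_if(cond.public, then_fn, else_fn))
    return then_fn() if cond else else_fn()


location = Facet("a", "Chuck E. Cheese", "undisclosed location")

x_old = 0
cond = lift(lambda loc: loc == "Chuck E. Cheese", location)
x = faceted_if(cond, lambda: x_old + 1, lambda: x_old)
```

After the conditional, `x` is itself faceted under label `a`, so the indirect flow from `location` into `x` stays guarded by the same policy.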
Using Policies to
Produce Outputs
print { }
0
( != ) ? 1 : 0
policy( )
Jeeves uses
policies to defacet
appropriately.
1 0
def isNotCarol(oc):
return oc !=
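A small sketch of defaceting at an output sink, assuming a simple policy that does not depend on sensitive values. The `Facet` class, the `policies` table, and `defacet` are hypothetical stand-ins for what the runtime does internally, and viewers are modeled as plain strings.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Facet:
    label: str
    secret: Any
    public: Any


def is_not_carol(oc):
    return oc != "Carol"


# Each label maps to the policy that decides whether the secret facet may be shown.
policies = {"a": is_not_carol}


def defacet(value, oc):
    """Collapse a (possibly nested) faceted value for one output context."""
    while isinstance(value, Facet):
        allowed = policies[value.label](oc)
        value = value.secret if allowed else value.public
    return value


x = Facet("a", 1, 0)
shown_to_alice = defacet(x, "Alice")   # policy holds, secret facet
shown_to_carol = defacet(x, "Carol")   # policy fails, public facet
```

The program that computed `x` never mentions the viewer; the choice between facets happens only at the sink.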
print { }
( == ) ? :
policy( )
def isMaybeCarol(oc):
return oc ==
But What About
Dependencies?
Possible solutions:
( == ) ? :
( == ) ? :
Jeeves runtime
will pick the secret
value if allowed.
Need to find a
fixed point!
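The circular case on this slide can be worked through by hand in a few lines. This sketch assumes a single label whose value is either secret or public, and it models viewers as strings; `value_under`, `consistent`, and `solve` are hypothetical names used only for illustration.

```python
CAROL, STRANGER = "Carol", "Stranger"


def value_under(label_is_secret):
    # The sensitive value: behaves as Carol if secret, as the stranger if public.
    return CAROL if label_is_secret else STRANGER


def is_maybe_carol(oc, label_is_secret):
    # The policy compares the viewer to the very value it protects.
    return oc == value_under(label_is_secret)


def consistent(oc, label_is_secret):
    # "label = secret" is only allowed when the policy holds under that choice;
    # "label = public" is always allowed.
    return (not label_is_secret) or is_maybe_carol(oc, label_is_secret)


def solve(oc):
    # Try the secret choice first so the secret facet wins whenever it is allowed.
    candidates = [s for s in (True, False) if consistent(oc, s)]
    return value_under(candidates[0])


solutions_for_carol = [s for s in (True, False) if consistent(CAROL, s)]
```

For Carol both choices are self-consistent, so preferring the secret facet shows her "Carol"; for anyone else only the public choice survives.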
Using Constraints to
Handle Dependencies
Label Policy
a def isGuest(oc):
return oc in .guests
𝑎 = 𝑠𝑒𝑐𝑟𝑒𝑡 ⇒ in .guests
policy( )
𝑎 ∈ {𝑠𝑒𝑐𝑟𝑒𝑡, 𝑝𝑢𝑏𝑙𝑖𝑐}
print { }
𝑎 = 𝑠𝑒𝑐𝑟𝑒𝑡 ⇒ 𝑓𝑎𝑙𝑠𝑒
¬(𝑎 = 𝑠𝑒𝑐𝑟𝑒𝑡)
⊢
⊢
0
1 0
a
• Constraints contain
only Boolean
variables.
• Always a consistent
assignment.
Evaluated with
respect to state
at time of
output.
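The constraint view can be sketched with brute-force enumeration over label assignments. Enumeration is swapped in here purely for illustration and is not a claim about how the Jeeves runtime solves constraints; the labels `a` and `b`, the sample guest list, and `solve_labels` are all assumptions.

```python
from itertools import product

SECRET_GUESTS = ["Alice", "Bob", "Carol"]


def guests_under(assignment):
    # The guest list is faceted under label "a": real list if secret, empty if public.
    return SECRET_GUESTS if assignment["a"] else []


# One policy per label. Both consult the guest list, so label "b" (event details)
# depends on the choice made for label "a" (the guest list itself).
policies = {
    "a": lambda oc, asg: oc in guests_under(asg),
    "b": lambda oc, asg: oc in guests_under(asg),
}


def solve_labels(policies, oc):
    """Pick a consistent assignment (True = secret) with as many secrets as possible."""
    labels = sorted(policies)
    best = None
    for bits in product([True, False], repeat=len(labels)):
        assignment = dict(zip(labels, bits))
        # Constraint per label: (label = secret) implies policy(oc) under this assignment.
        ok = all((not assignment[l]) or policies[l](oc, assignment) for l in labels)
        if ok and (best is None or sum(bits) > sum(best.values())):
            best = assignment
    return best  # never None: the all-public assignment is always consistent


for_alice = solve_labels(policies, "Alice")
for_stranger = solve_labels(policies, "Stranger")
```

Because every constraint has the form "secret implies policy", setting every label to public satisfies all of them, which is why a consistent assignment always exists.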
Tracking Policies
Through Execution
if == :
a
x += 1
true false
a
if :
x += 1
x = xold+1 xold
a
Labels follow values
through all
computations, including
conditionals and
assignments.
Web server code
Target Domain:
Database-Backed Web
Applications
Application Queries Database
Facets in the Application
and Database
Application
Queries
select * from Users
where location =
SQL
Database
Application All data SQL
Database
select * from Users
Database
queries can
leak
information!
Impractical
and
potentially
slow!
Solution: Use object-relational mapping to facet the
database. Map facets onto non-faceted relational database.
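To make "map facets onto a non-faceted relational database" concrete, here is a sketch using Python's standard `sqlite3` module. The table layout (one row per facet, plus `label` and `facet` bookkeeping columns) and the sample rows are assumptions for illustration; the schema actually used by the framework is not shown in the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " record_id INTEGER, label TEXT, facet TEXT, name TEXT, location TEXT)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?, ?)",
    [
        # One logical record stored as two rows, one per facet.
        (1, "a", "secret", "Alice", "Chuck E. Cheese"),
        (1, "a", "public", "Alice", "undisclosed location"),
        # An unfaceted record needs no bookkeeping.
        (2, None, None, "Bob", "Library"),
    ],
)


def query_by_location(location, label_is_secret):
    """Run an ordinary query, then keep only rows whose facet matches the viewer.

    label_is_secret maps each label to the choice made for the current viewer.
    """
    cur = conn.execute(
        "SELECT label, facet, name FROM users WHERE location = ? ORDER BY record_id",
        (location,),
    )
    names = []
    for label, facet, name in cur:
        if label is None or (facet == "secret") == label_is_secret[label]:
            names.append(name)
    return names


permitted = query_by_location("Chuck E. Cheese", {"a": True})
not_permitted = query_by_location("Chuck E. Cheese", {"a": False})
```

The query text itself never mentions policies; a viewer who is not permitted simply gets results computed from the public rows.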
Jeeves runtime
Jacqueline Web
Framework
Application
Frontend Database
@jeevesViewer
Attach policies.
Programmer is responsible
Framework is responsible
Django-like
data schema
for describing
fields.
Policy for
‘location’ field.
Helper
functions for
policy include
queries.
Public value for
‘location’ field.
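The slide's code is an image, so the following is only a plain-Python approximation of the three pieces it labels: a schema, a public value for the location field, and a policy for that field. It deliberately avoids Django and Jacqueline APIs; `Event`, `public_location`, `may_see_location`, and `show_location` are hypothetical names, and treating non-"guests only" events as visible to all is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    name: str
    location: str
    visibility: str                      # "everyone" or "guests only"
    hosts: set = field(default_factory=set)
    guests: set = field(default_factory=set)


def public_location(event):
    # Public facet for the location field, computed from the current row.
    return "undisclosed location"


def may_see_location(event, viewer):
    # Policy for the location field: takes the current row and the output context.
    if event.visibility == "guests only":
        return viewer in event.hosts or viewer in event.guests
    return True


def show_location(event, viewer):
    # What a framework would do on the programmer's behalf at display time.
    return event.location if may_see_location(event, viewer) else public_location(event)


party = Event(
    name="Surprise party",
    location="Chuck E. Cheese",
    visibility="guests only",
    hosts={"Alice", "Bob"},
    guests={"Dave"},
)
```

The point of the design is that only `public_location` and `may_see_location` are written by the application programmer; a function like `show_location` belongs to the framework.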
Compare to Django
Conference
management system
Course manager Health record
manager
(based on
representative
HIPAA fragment)
Implemented in
Jacqueline
Web Frameworks:
Django vs. Jacqueline
Demo
CMS Running Times
Tests from Amazon AWS machine via HTTP requests from another machine.
[Plots: time to show page (s) vs. papers in database (0 to 1000), Jacqueline vs. Django. Two panels: Single paper (y-axis 0 to 0.2 s) and All Papers (y-axis 0 to 12 s).]
Policy-Agnostic
Programming in Jeeves
Design of a policy-
agnostic
programming
language
[POPL ’12]
Semantics and
guarantees
[PLAS ’13]
Web framework,
and case studies
[in submission]
==
Other functionality
Policies
Sensitive
values
Python/DBCore team
Language evaluation and case studies
Semantics
Jeeves Team
Armando
Solar-Lezama
Thomas Austin
Cormac
Flanagan
Travis
Hance
Benjamin
Shaibu
Pat Long &
Jesse Klimov
Lena
Abdalla
Amadu
Durham
Ariel
Jacobs
Scala
Kuat
Yessenov
Jean
Yang
Applying Policy-
Agnostic Ideas at Home
1. Associate policies with data.
2. Make rest of program aware of
data’s policies.
It pays to think about policy enforcement
systematically: can get end-to-end guarantees—
often with negligible overheads!
Works not just for security and privacy, but also for
other customization!
Parting Thoughts
By reducing opportunity for
programmer error, we can
eliminate whole classes of
information leaks.
http://jeeveslang.org

Preventing Information Flow with Jeeves - Singapore Data Privacy Workshop


Editor's Notes

  • #2 Hi. I’m Jean. Today, I’m going to talk about how to prevent information leaks. TRANSITION: Before we talk about leaks, let’s talk about data.
  • #3 There’s more and more data coming from all kinds of places. Sources include social media, electronic health records, online courses, and wearable devices. TRANSITION: To process all this data, all kinds of people have started writing all kinds of code.
  • #4 The lines of open source code have been growing exponentially since the early 90s. It’s not just professional developers coding anymore, but also medical researchers, social scientists, journalists, and even children. TRANSITION: This is great news from an information processing perspective, but not so much from an information leak perspective.
  • #5 Even trained developers leak information. We’re not at all surprised when information is leaked—everything from our social media data to our dental records. TRANSITION: You might be wondering why people haven’t already solved the problem of securing our data.
  • #6 Well, people can encrypt the data, but often the issue isn’t that people aren’t protecting their data at all, but that they’re showing it under the wrong circumstances. This leaves us with the strategy of finding exploits and developing patches, but this leaves system builders always a step behind. TRANSITION: The goal of my work is to help programmers show data correctly.
  • #7 I want to prevent information leaks by factoring privacy out from the rest of the program. I want to allow the programmer to specify high-level policies about how sensitive data can be shown. I want the rest of the program to be agnostic to these policies and I want the system to automatically manage these policies. In this model, the attacker is the user and the programmer is not assumed to be malicious. TRANSITION: Before we talk about how to make this happen, let’s talk about what’s hard.
  • #8 Suppose we have a social calendar like Google Calendar, but with more sharing and searching features. Let’s say Alice and Bob want to throw a surprise party for Carol. TRANSITION: Now I’ll explain how even seemingly simple policies have subtle interactions that the programmer can easily get wrong.
  • #9 The guests should be able to see that there’s a surprise party and that it happens to be at Chuck E. Cheese. Carol, the subject of the surprise party, is on her own guest list. She should be able to see that she’s having pizza with Alice and Bob, but not necessarily where it is. Others should be able to see that there’s a private event at Chuck E. Cheese so they know not to go there. The guest list might have a policy that a viewer has to be on the guest list to see the list. To allow hosts to play around with potential guest lists before finalizing, there might be an additional policy that the list is visible only to hosts until it is finalized. These policies depend on both sensitive values and other policies. TRANSITION: Now let me explain how enforcing these policies can leak information.
  • #10 I said before that only guests should be able to see full event information and only hosts should know who is on the guest list until the list is finalized. Potential guests shouldn’t be able to see event information until the event is finalized. An easy mistake, however, is to neglect the connection between the two policies and show event information to potential guests who don’t end up being invited. This bug can arise whenever programmers are trusted to keep track of policy dependencies. There was an analogous bug reported in the HotCRP conference management system. When I worked at Facebook, I learned that this kind of problem is pretty common in the real world. TRANSITION: Not only do policies have dependencies on each other, but they are also deeply intertwined across the code.
  • #11 If the user can not only access events directly but also perform searches, the programmer needs to track how sensitive values flow into derived values. For instance, only people who can see that their friends are at Chuck E. Cheese at 7pm Tuesday can see that that’s the most popular location among their friends. It’s also increasingly common to share the result of a search not just with a single user, but with a set of users. To prevent leaks, the programmer needs to track not just how sensitive values are used to compute derived values, but also where derived values are flowing. TRANSITION: In real systems, reasoning about policies and functionality together can be a real mess.
  • #12 Here are two screenshots from the HotCRP conference management system. I don’t expect you to read the code, but I want you to see how conditional access checks are intertwined with the program. You’ve all probably used this or a similar system to submit and review academic papers. You may be familiar with policies about who can see the titles of papers, the names of authors, and the bodies of reviews. What I’ve highlighted are checks about roles like this. On the right there are even dynamically generated SQL queries. In HotCRP, policies are in at least 24 of the 82 files. To implement a policy or to fix a bug, the programmer has to touch many parts of the code. TRANSITION: You might say this is just a software engineering problem, but the issue isn’t that developers don’t know how to fix these policies.
  • #13 But the thing is, developers know how to enforce policies. The problem is that they have to enforce the same policy over and over again across the program. And if there’s one thing we know in software engineering, it’s that doing something over and over again is error-prone. TRANSITION: My work addresses this issue by factoring policy implementation out of programs.
  • #14 In this talk I describe Jeeves, a language for automatically enforcing information flow policies. The goal of Jeeves is to help the well-intentioned programmer write the same code no matter what the policies are. I will describe the design of a policy-agnostic language and web framework and its mathematical guarantees about what it means to enforce policies. I have implemented Jeeves as libraries in Scala and Python and as a Python web framework. I describe the web framework and case studies I’ve built on top of it. TRANSITION: In Jeeves, there is no policy spaghetti.
  • #15 Here is the Jeeves web framework code for the calendar example I’ve described. I don’t expect you to read this code and later I will show some of it in more detail. In Jeeves, the policies are centralized, the rest of the program is policy-agnostic, and the runtime differentiates the behavior. TRANSITION: Now I’m going to tell you how this works.
  • #16 Sensitive values in Jeeves encapsulate multiple behaviors. Policies describe rules for how sensitive values and derived values may flow to different output contexts. I’m going to use this hut notation to denote policies guarding two facets of a sensitive value. The rest of the program is policy-agnostic. The programmer needs to know that different values may be flowing through the program, for instance a GPS location or a city, but the programmer does not have to know the policies determining this flow. The programmer can then rely on the runtime to enforce the policies. TRANSITION: A major contribution of Jeeves is that it supports expressive policies.
  • #17 Jeeves policies are arbitrary Boolean functions that take an argument corresponding to the output context. This output context can be of arbitrary type. Because policies are arbitrary functions, they can capture dependencies on sensitive values. For instance, we can have a policy protecting a guest list that the viewer must be a member of the sensitive list. I describe later how Jeeves handles these policies. TRANSITION: The Jeeves programming model works as follows.
  • #18 The programmer specifies policies and the different facets of sensitive values. The programmer writes policy-agnostic programs that can use faceted values interchangeably with other values. These programs can contain mutable state. The runtime propagates the values and policies and produces differentiated outputs based on the viewer. Computation sinks like print and write to file take an additional argument corresponding to the output context. TRANSITION: So this is what we mean when we say Jeeves.
  • #19 Jeeves is a programming model with a well-defined runtime semantics for policy-agnostic programming with information flow policies. It can be implemented as a standalone interpreter or as a library. I have implemented Jeeves as libraries in Scala and Python. I have also adapted the programming model across the multiple runtimes of a web framework. I’ll describe a bit later how this works. TRANSITION: For now, I will describe how Jeeves works in a single runtime.
  • #20 To conclude, Jeeves is taking us closer to a world in which programmers, freed from the burden of having to enforce privacy policies across the program, can finally focus on functionality. You can find more information and code online!
  • #21 The runtime propagates values and policies and solves for values to show based on the policies and the viewer. I picked this snippet of code to illustrate indirect flows. Here we are setting a variable x to zero and incrementing it if the sensitive location value is equal to Chuck E Cheese. Even if the location doesn’t leak directly, the programmer can infer its value by examining x. I’m going to describe how Jeeves prevents indirect flows. TRANSITION: But first let’s talk about how Jeeves enforces policies at outputs.
  • #22 For computation sinks like print and write to file, the Jeeves runtime uses the policies to defacet appropriately. For simple policies, the Jeeves runtime plugs the output channel into the relevant policy functions and then produces an output. Here the policy says the viewer can’t be Carol, so when it is Carol, the system shows the public output. TRANSITION: Now let’s look at what happens when policies can depend on sensitive values.
  • #23 Suppose instead we had a policy checking whether the output context is equal to a sensitive value with a Carol facet and an unknown stranger facet, and that this is the policy protecting the value itself. While this may seem like a contrived example, this sort of dependency arises with the guest list example I described before. When we plug in Carol as the viewer and we go to check the equality, we get stuck because there’s a mutual dependency between this value and the policy itself. We need to find the fixed point. There are actually two possible solutions: the sensitive value behaves as Carol or the sensitive value behaves as the stranger. The policies allow both, so the system uses the more secret value, in this case Carol. The Jeeves runtime will always try to show the secret facet if the policies allow. TRANSITION: Now I’ll describe how the Jeeves runtime uses constraints to handle these dependencies.
  • #24 To use constraints, we introduce labels that get mapped to policies. Labels can take on the values “secret” and “public” and guard the faceted values. To produce outputs, the system evaluates the policies with respect to the output channel and the state at the time of output. Here, if we’re trying to show the guest list to the stranger, we get the constraint “a is secret” implies the stranger is a guest. This stranger is not a guest, so we get “a is secret” implies false, and so a cannot be secret. Some things to note are that constraints contain only Boolean free variables corresponding to the labels. Also, there’s always a consistent assignment to labels. We can always assign everything to be public. TRANSITION:
  • #25 All values computed from sensitive values are associated with the same policies. All assignments occurring under conditional branches are associated with the assumptions that were made to enter that branch. This way, all values computed from sensitive values can be shown only according to the policies. TRANSITION: To make sure that our runtime is enforcing the policies even when there are these subtle interactions, we have defined a dynamic semantics for Jeeves.
  • #26 Web applications often leak information by revealing the results of arbitrary database queries, for instance through a search interface. It doesn’t matter how much policies are enforced in the application when a single query can subvert our guarantees. One way to take advantage of language-based solutions like Jeeves is to load all the data into the application and use the enforcement there. Language-based approaches often assume a single runtime so this is what you have to do. The issue is that doing all the work in the application is slow and we often can’t load the database into memory. Our solution is to facet the database. Object-relational mappings typically support a uniform data representation between the application and database and mediate a set of queries. We have implemented a Jeeves-based object-relational mapping that supports a uniform facet representation and mediates policy-agnostic queries. We do this by augmenting standard SQL tables with metadata to do facet bookkeeping and storing multiple rows for a single faceted value. This allows the programmer to write application code and database queries without worrying about the policies. TRANSITION: We have built a web framework based on this using the model-view-controller paradigm.
  • #28 The application layer of our web framework runs according to the semantics I described. The programmer is responsible for specifying the policies once, along with the database schemas. The runtime keeps track of the viewer. All values input through the frontend go directly to the database, where they are associated with policies. All values pass back through the Jeeves runtime before display so Jeeves can figure out which value to show. The Jeeves web framework is responsible for keeping track of who is looking. TRANSITION: Now let’s go back to the policy code in the calendar example.
  • #29 Here I show how to implement Jacqueline policies and how they can contain arbitrary database queries. On top I show the data schema for describing Jacqueline fields. We define the name, location, time, and description fields for an event, as well as the “visibility” field where the user can specify whether an event is visible to everyone or only guests. The fields of the event are the secret values by default. The programmer can introduce a facet by specifying a function to compute a public value from the current row. Here we show the public value for any “location” field is “undisclosed location.” To specify when the actual location is to be shown versus this public value, the programmer can declare a policy. The policy takes as arguments the current row and an output context argument. Here, our policy says that if the event visibility is “guests only,” then the secret location field can be shown if the viewer is a host or a guest. Whether the viewer is a host or guest is computed using regular database queries. It’s important to note, however, that the policies are enforced throughout these queries so the programmer can be sure that they do not leak information! TRANSITION: We’ve implemented the following systems using our web framework.
  • #30 We’ve built a conference management system that we deployed for a small workshop, a course manager, and a health record manager. We compared the conference management system implementation to an implementation written using the standard Django Python web framework and found that Jacqueline reduces the overall amount of policy code. Instead of having to implement policies as checks and filters across the program, the programmer needs to specify them only once. TRANSITION: Before I show you some numbers, let’s do a quick demo of our conference management system.
  • #31 To evaluate the feasibility of the paradigm in the web setting, we created a web framework called Jacqueline for policy-agnostic programming in web servers. TRANSITION: The main problem we solved was addressing the issue of how to extend the model across applications and databases.
  • #32 Make new account for my host. Submit a paper. Log in as admin and see that the paper is not there. Change policy; refresh. TRANSITION: We took some measurements to see how this conference management system scales.
  • #33 We ran these tests on an Amazon AWS machine by sending multiple HTTP requests from another machine and averaging over them. We compare the time it takes to show a single paper in Jacqueline vs. a Django conference management system with hand-implemented policies. The times for Jacqueline are actually better because the Django implementation requires multiple passes over some query results to apply policy checks. We then compare the time it takes to show all papers. In this case, both implementations are resolving policy checks with database queries for each data item being shown. There is a 1.5x slowdown with Jacqueline. TRANSITION: We learned the following from running our systems.
  • #34  TRANSITION: To summarize, the goal of this work is to allow the programmer to focus on functionality.
  • #35 My advisor Armando, our students, and our collaborators.
  • #36 Here are some ways to apply the ideas I’ve been working on to your existing systems. What I’ve shown is that it’s useful—and possible—to associate policies directly with data and to make the rest of the program aware of the policies. A higher-level takeaway is that it pays to think about policy enforcement systematically. We can get end-to-end guarantees, often with negligible overheads. TRANSITION: To summarize, I have demonstrated the feasibility of an approach that reduces programmer error to eliminate whole classes of information leaks.
  • #37 With more work, automatically managing privacy and security policies can be as practical and common as automatically managing memory. It’s incredibly exciting that language technologies have brought us to the point where we can even think about factoring out global, intertwined concerns like privacy. And I look forward to thinking about other ways we can make programmers’ lives easier by helping them focus on what’s interesting about their code.