Slideshow for presentation at the International Society for Industrial Ecology 2015 meeting in Surrey UK, July 7-10.
The presentation deals with the importance of data privacy in LCA, introduces a provenance framework for the dissemination of LCA studies, and discusses two use cases in which private data are used to compute and publish LCA results.
Privacy and Provenance in Environmental Impact Assessment - ISIE 2015 Surrey UK
1. photo: flickr.com/people/44073224@N04 (CC-A)
Privacy and Provenance in Environmental Impact Assessment
Brandon Kuczenski, Amr El Abbadi, Cetin Sahin
University of California, Santa Barbara
ISIE 2015 – University of Surrey
2. Discerning our Environmental Impacts
Kuczenski et al. ISIE 2015 – Surrey UK – 1 / 12
UNEP-SETAC 2011
Even the simplest query in life-cycle assessment (LCA) requires vast and varied data, all of which is “private” to some extent:
● Manufacturing: product composition, equipment operation, utility demand, waste;
● Use: individual consumption habits, product usage behavior, disposal decisions;
● Supply Chain: materials sourcing, supply contracts, logistics;
● Background: aggregated industrial-economic models.
Data privacy is at the heart of LCA practice:
● Averaging (Horizontal): similar processes operated in parallel;
● Aggregation (Vertical): grouping related activities;
● Background Aggregation (“roll-up”): cradle-to-gate computation.
Data privacy is at odds with the objectives of LCA:
● Attribution of impacts to specific activities;
● Identifying ways to improve environmental performance through operational changes.
How do we evaluate environmental claims while ensuring data privacy?
3. What is privacy in life-cycle assessment?
PROV Primer (W3C)
Privacy means confidentiality (a.k.a. secrecy):
● Companies don’t want to reveal sensitive details to competitors, regulators, or the public;
● Usually accomplished through roll-up or vertical aggregation.
Privacy means anonymity:
● Published results should not reveal the details of any contributor;
● Usually accomplished through horizontal aggregation among “at least three” contributors.
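The “at least three” rule can be illustrated with a minimal sketch. The function name, the threshold constant and the plant values below are hypothetical, chosen only for illustration; the slides do not prescribe an implementation.

```python
from statistics import mean

# Assumption: the "at least three" rule of thumb from the slide.
MIN_CONTRIBUTORS = 3

def horizontal_average(values_by_contributor):
    """Return the mean across contributors, or None if too few to publish."""
    if len(values_by_contributor) < MIN_CONTRIBUTORS:
        return None  # suppress: a mean of one or two values reveals too much
    return mean(values_by_contributor.values())

# Hypothetical exchange values (e.g. kWh per unit of output) from three plants.
plants = {"plant_a": 1.20, "plant_b": 0.95, "plant_c": 1.10}
published = horizontal_average(plants)
suppressed = horizontal_average({"plant_a": 1.20, "plant_b": 0.95})
```

With only two contributors, either one could subtract its own value from the published mean to recover the other's, which is why the result is suppressed.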
Privacy goes hand in hand with provenance:
● Data provenance is the attribution of results to specific observations and/or computations.
● Standardized by the W3C as a directed graph model linking agents, entities and activities.
● Rich parallels with LCI modeling.
In LCA publishing it is desirable to make two assurances:
● Assure data providers that their information cannot be discerned from published results;
● Assure data users that the results reflect an accurate model of the system under study.
Both assurances need a provenance framework!
4. Provenance Framework
● Linked observations form a tree called an inventory fragment.
● Fragments are equivalent to provenance graphs.
Define an inventory foreground model on the basis of observations of flows between a parent node and its child nodes, together with their directed implications:
● Direction: Input. The parent node requires the flow; the flow is generated by the child node.
● Direction: Output. The parent node generates the flow; the flow is consumed by the child node.
[Figure: an inventory fragment drawn as a tree of nodes linked by exchanges, rooted at a reference exchange.]
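As a rough illustration of how such a fragment maps onto provenance statements, the sketch below stores the tree as nested nodes and exchanges and walks it depth-first. The class names, the example flows and the node names are assumptions made for this example, not part of the framework as presented.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Exchange:
    flow: str
    direction: str  # "Input" or "Output", as seen from the parent node
    child: "Node"

@dataclass
class Node:
    name: str
    exchanges: List[Exchange] = field(default_factory=list)

def provenance_statements(node):
    """Walk the fragment depth-first and list its directed implications."""
    out = []
    for ex in node.exchanges:
        if ex.direction == "Input":
            out.append(f"{node.name} requires {ex.flow}")
            out.append(f"{ex.flow} generated by {ex.child.name}")
        else:
            out.append(f"{node.name} generates {ex.flow}")
            out.append(f"{ex.flow} consumed by {ex.child.name}")
        out.extend(provenance_statements(ex.child))
    return out

# Hypothetical fragment: an assembly step with one input and one output.
smelter = Node("smelter")
landfill = Node("landfill")
assembly = Node("assembly", [
    Exchange("steel", "Input", smelter),
    Exchange("scrap", "Output", landfill),
])
statements = provenance_statements(assembly)
```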
5. Provenance Framework: Data Set Resolution
● Foreground nodes and background dependencies are mapped to specific datasets (public or private).
● Each reference must be resolved at time of computation.
[Figure: the inventory fragment (nodes linked by exchanges, rooted at a reference exchange) with its references resolved to common background processes: Electricity (EU), Electricity (CN), Thermal Energy from Gas, Structural Steel, Freight Transport (truck), ...]
6. Provenance Framework: LCI Publication
[Figure: inventory fragment with foreground nodes and exchanges resolved to background processes such as Electricity (EU), Electricity (CN), Thermal Energy from Gas, Structural Steel, Freight Transport (truck), ...]
LCI model publication is a serialization of the graph model:
● Fragment table describes the structure of the foreground tree;
● Foreground table describes node resolutions;
● Background table describes background resolutions.
These three pieces precisely describe an LCI model in a database-independent fashion, even if data sets themselves remain private.
LCI model can be formulated as a foreground tree and a strongly connected background:
$$A_P = \begin{bmatrix} A_f & 0 \\ A_d & A^\star \end{bmatrix}; \quad B_P = \begin{bmatrix} B_f & B^\star \end{bmatrix}$$
● $A_f$ are links between foreground nodes (fragment table);
● $A_d$ are the dependencies of the foreground on the background (fragment table);
● $B_f$ includes foreground direct emissions (dereferenced node table);
● $A^\star$ and $B^\star$ are the background database (dereferenced background table).
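The block structure above can be sketched with plain nested lists. The helper names and all numeric values are made up for illustration; the only thing taken from the slide is the layout of the blocks (foreground columns first, background columns second, and a zero block because the background does not depend on the foreground).

```python
def assemble_AP(Af, Ad, Astar):
    """Stack the blocks [[Af, 0], [Ad, A*]] from nested lists of floats."""
    n_b = len(Astar)  # number of background processes
    top = [row + [0.0] * n_b for row in Af]             # [Af | 0]
    bottom = [Ad[i] + Astar[i] for i in range(n_b)]     # [Ad | A*]
    return top + bottom

def assemble_BP(Bf, Bstar):
    """Place foreground and background emission columns side by side: [Bf | B*]."""
    return [Bf[k] + Bstar[k] for k in range(len(Bf))]

# Hypothetical model: 2 foreground nodes, 2 background processes, 1 emission.
Af = [[0.0, 0.0],
      [0.5, 0.0]]        # foreground node 0 draws 0.5 units from node 1
Ad = [[2.0, 0.0],
      [0.0, 1.5]]        # foreground demands on the background
Astar = [[0.0, 0.1],
         [0.2, 0.0]]     # background processes depend on each other
Bf = [[0.3, 0.0]]        # direct emissions of the foreground nodes
Bstar = [[1.0, 4.0]]     # emissions of the background processes

AP = assemble_AP(Af, Ad, Astar)
BP = assemble_BP(Bf, Bstar)
```

Publishing only the nonzero pattern of `Af` and `Ad` (the fragment table) would describe the model structure while the values themselves stay private.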
7. LCI Publication Use Cases
Use Case 1: Secure Multiparty Computation
8. Secure Multiparty Computation: The Private Jet Problem
photo: flickr.com/photos/rodeime (CC-A)
I have some bad news, gentlemen.
9. Secure Multiparty Computation (SMC) and LCA
Setting: a group of untrusting parties with private inputs.
Goal: jointly compute a function of their inputs while maintaining secrecy of all private data.
SMC uses cryptographic techniques to collaboratively compute a function of private inputs (average, maximum, quantile, etc.) without any party revealing information to any other party.
● The output of the computation can be known to every party.
● The output can remain secret, reporting “flags” (high/low) or rank ordering to each contributor privately.
● No trusted party is required.
SMC can be used anywhere horizontal averaging is needed, to securely compute exchange coefficients (e.g. values in $A_d$, $A_f$ or $B_f$).
● Contributors must agree on a provenance model.
● Vulnerable to false information provided by a careless or malicious contributor.
● Audit mechanisms can be established (but require a trusted party).
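One textbook way to compute a secure average is additive secret sharing, sketched below. This is not necessarily the protocol the authors use; it assumes honest-but-curious parties, simulates all of them in a single process, and the modulus, fixed-point scale and input values are illustrative assumptions.

```python
import secrets

MODULUS = 2**61 - 1   # assumed public modulus, far larger than any possible sum
SCALE = 1000          # assumed fixed-point scale so that shares are integers

def make_shares(value, n_parties):
    """Split one private value into n random shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (round(value * SCALE) - sum(shares)) % MODULUS
    return shares + [last]

def secure_average(private_values):
    """Simulate the protocol: no single message reveals any party's input."""
    n = len(private_values)
    # Row i holds the shares party i creates; column j is what party j receives.
    dealt = [make_shares(v, n) for v in private_values]
    # Each party adds the shares it received and announces only that partial sum.
    partial_sums = [sum(dealt[i][j] for i in range(n)) % MODULUS for j in range(n)]
    total = sum(partial_sums) % MODULUS
    return total / SCALE / n

# Hypothetical exchange coefficients held privately by three contributors.
average = secure_average([1.20, 0.95, 1.10])
```

Each share on its own is a uniformly random number, so a contributor learns nothing beyond the final average. As the slide notes, nothing here prevents a contributor from submitting a false value.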
10. LCI Publication Use Cases
Use Case 2: Secure Publication
11. Secure Publication: The Supplier Problem
A product designer wants to publish LCA results with as much detail as is permitted by data providers’ confidentiality policies. The upstream supplier’s objectives:
● Conceal its impacts if they are “large” (may want complete anonymity);
● Publicize its impacts if they are “small” (wants credit for the low impacts attributed to it).
● “Private” data are values of entries in $A_f$, $A_d$, $B_f$.
● Other portions of the model may be public; adversary may have partial information or use statistical methods.
How much information can be published without revealing private data?
[Figure: the resolved inventory fragment and its background processes (Electricity (EU), Electricity (CN), Thermal Energy from Gas, Structural Steel, Freight Transport (truck), ...) linked to impact categories: GWP, AP, EP, Smog, ADP.]
12. Study Formulation and Obfuscation
The obfuscated study is constructed through graph transformation, e.g. grouping foreground nodes (vertical aggregation) and background LCI results (background aggregation):
$$A_f \to A'_f; \quad A_d \to A'_d; \quad B_f \to B'_f$$
The obfuscated study can be formulated as a linear equation:
$$\hat{s} = E \cdot (B'_f + B_x A'_d) \cdot \tilde{x}$$
● $\tilde{x}$ is derived from foreground traversal;
● $B_x = B^\star \cdot (I - A^\star)^{-1}$ is the aggregated background database;
● $E$ is the characterization matrix;
● Privacy protection depends on the locations of nonzero elements in $A'_d$, $B'_f$ and $E$.
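A small numeric sketch of the linear equation follows. Rather than forming $(I - A^\star)^{-1}$ explicitly, it solves $(I - A^\star) z = A'_d \tilde{x}$ and then applies $B^\star$, which gives the same result. The helper names and every number (including the characterization factors) are hypothetical.

```python
def matvec(M, v):
    """Multiply a nested-list matrix by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def solve(M, rhs):
    """Solve M z = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(map(float, M[i])) + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def obfuscated_scores(E, Bf_agg, Bstar, Astar, Ad_agg, x_tilde):
    """Evaluate s = E (B'_f + B_x A'_d) x~ with B_x = B* (I - A*)^-1."""
    n = len(Astar)
    I_minus_A = [[(1.0 if i == j else 0.0) - Astar[i][j] for j in range(n)]
                 for i in range(n)]
    background_activity = solve(I_minus_A, matvec(Ad_agg, x_tilde))
    direct = matvec(Bf_agg, x_tilde)                  # B'_f x~
    indirect = matvec(Bstar, background_activity)     # B_x A'_d x~
    emissions = [d + b for d, b in zip(direct, indirect)]
    return matvec(E, emissions)

# Hypothetical study: one aggregated foreground node, two background
# processes, two emissions, one impact category.
Astar = [[0.0, 0.5], [0.0, 0.0]]
Bstar = [[1.0, 4.0], [0.0, 0.1]]
Ad_agg = [[2.0], [1.0]]
Bf_agg = [[0.5], [0.0]]
E = [[1.0, 25.0]]
scores = obfuscated_scores(E, Bf_agg, Bstar, Astar, Ad_agg, [1.0])
```

In this example the single published score mixes direct and background contributions, so an adversary who knows `E`, `Bstar` and `Astar` still faces one equation with several unknown private entries.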
Operationalize the two competency questions of privacy-preserving LCA publication:
1. How closely can an adversary (possibly with partial information) estimate the values of private data?
2. How can a data user (or critical reviewer) be convinced of the accuracy of a computation that conceals private data?
13. Conclusions and Outlook
● LCA interpretation is greatly facilitated with an explicit provenance framework.
● LCI models can be published precisely without revealing private data. Two use cases:
1. Mutually untrusting peers wish to privately evaluate their collective performance (Secure multiparty computation);
2. Trusted party wishes to publicly reveal detailed results of a study that includes private data (Secure publication).
● Current work:
− Implement SMC for horizontal averaging;
− Develop minimal constraints on privacy-preserving aggregation for secure publication;
− Operationalize result validation with private data.
Thanks to:
● Co-PI Amr El Abbadi (UCSB CS); PhD student Cetin Sahin (UCSB CS)
● Omer Eğecioğlu; Roland Geyer, Pascal Lesage, Kyle Meisterling
● NSF CCF-1442966
Thank you!
bkuczenski@ucsb.edu