Privacy-preserving
Metrics
ISRG: a nonprofit founded in 2013 as a home for
public-benefit digital infrastructure projects,
including Let’s Encrypt, Divvi Up, and Prossimo.
Launched in October 2020, Divvi Up is a system
for privacy-preserving metrics collection based
on the Distributed Aggregation Protocol (DAP),
which is being standardized in the IETF.
Presenting:
Brandon Pitman, Technical
Lead
Sarah Gran, VP of Brand &
Donor Development
The need for privacy-preserving
metrics
● Privacy policies are insufficient as a privacy safeguard
● Hacks happen
● Mere presence of PII is a liability
But…
● Data provides valuable insight
● Data enables improved user experiences
● Data identifies problem areas to fix
Introducing Divvi Up
A service allowing private aggregation of sensitive data.
● No one but the client ever sees the original measurement.
● No one but the collector ever sees the aggregate.
The benefits:
● Users like a technically-enforced guarantee that their data
can’t be mishandled.
● Organizations like a technically-enforced guarantee that they
are not exposed to sensitive user data.
What can be aggregated?
Technically speaking:
● Numerical data: sums, mean, variance, most statistical functions
● Vectors of numbers
● Histograms, or vectors with a constrained number of nonzero
elements
● Extensible: new aggregation functions can be added
Common applications:
● Metrics/telemetry
● Survey results
● Machine-learning training data
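To illustrate why these aggregation types fit a sum-based system, here is a minimal sketch that works on plain (unprotected) values. It assumes a histogram is encoded as a one-hot vector per client, and that mean and variance are derived from a sum and a sum of squares. The function names are hypothetical and are not part of Divvi Up or DAP.

```python
def one_hot(bucket: int, num_buckets: int) -> list[int]:
    """Encode a bucket index as a vector with exactly one nonzero element."""
    if not 0 <= bucket < num_buckets:
        raise ValueError("bucket out of range")
    return [1 if i == bucket else 0 for i in range(num_buckets)]


def sum_vectors(vectors: list[list[int]]) -> list[int]:
    """Element-wise sum; summing one-hot vectors yields a histogram."""
    return [sum(column) for column in zip(*vectors)]


def mean_and_variance(values: list[int]) -> tuple[float, float]:
    """Derive mean and (population) variance from two sums only."""
    n = len(values)
    total = sum(values)
    total_of_squares = sum(v * v for v in values)
    mean = total / n
    variance = total_of_squares / n - mean ** 2
    return mean, variance


# Four hypothetical clients each report one bucket out of four.
histogram = sum_vectors([one_hot(b, 4) for b in [0, 2, 2, 3]])
print(histogram)  # [1, 0, 2, 1]

mean, variance = mean_and_variance([2, 4, 4, 6])
print(mean, variance)  # 4.0 2.0
```

In both cases the only operation applied across clients is addition, which is why sums, vectors, and histograms can share one aggregation mechanism.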
How does it work?
[Diagram: three Clients, a Leader Aggregator, a Helper Aggregator, and a Collector]
Protocol actors
There are several different protocol actors in Divvi Up:
● Client: generates measurements & uploads them to the Aggregators.
● Aggregator: receives report shares from Clients, verifies & aggregates
them, and provides aggregates to Collector. Every deployment involves
a Leader & Helper Aggregator.
○ Leader: directly receives reports from the Clients, drives
aggregation with the Helper, and provides aggregated batches to
the Collector.
○ Helper: driven by the Leader to perform aggregation & collection.
● Collector: retrieves batches of aggregated reports from the
Aggregators.
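The roles above can be sketched with additive secret sharing, the basic idea behind this style of private aggregation. This is a simplified illustration, not the DAP wire protocol: it omits encryption, report verification, and batching, and the modulus and function names are assumptions chosen for the example.

```python
import secrets

# Illustrative prime modulus (an assumption); a real deployment uses the
# field defined by its chosen aggregation function.
MODULUS = 2**61 - 1


def split_measurement(measurement: int) -> tuple[int, int]:
    """Client: split a measurement into two additive shares."""
    leader_share = secrets.randbelow(MODULUS)
    helper_share = (measurement - leader_share) % MODULUS
    return leader_share, helper_share


def aggregate_shares(shares: list[int]) -> int:
    """Aggregator: sum the shares it holds. Each share alone looks random."""
    return sum(shares) % MODULUS


def collect(leader_aggregate: int, helper_aggregate: int) -> int:
    """Collector: combine the two aggregate shares into the final result."""
    return (leader_aggregate + helper_aggregate) % MODULUS


measurements = [3, 5, 10]
share_pairs = [split_measurement(m) for m in measurements]
leader_total = aggregate_shares([pair[0] for pair in share_pairs])
helper_total = aggregate_shares([pair[1] for pair in share_pairs])
result = collect(leader_total, helper_total)
print(result)  # 18
```

Neither aggregator can recover an individual measurement from its own shares, yet the collector still obtains the exact sum. This is why the two aggregators must not collude.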
Subscribers: ENPA
Current status: turned down.
Use case: private analytics over COVID-19 exposure rates.
Apple & Google deployed the clients; ISRG & NIH operated the
aggregators; MITRE operated the collector.
The initial use-case for the technology behind Divvi Up was to
permit private analytics over COVID-19 exposure rates, operated as
part of an exposure notification system implemented by Apple &
Google during the pandemic.
Subscribers: Mozilla
Current status: in production.
Use case: sensitive telemetry.
Mozilla deploys the clients, and operates the helper aggregator and
the collector; the ISRG operates the leader aggregator.
Mozilla’s initial deployment targets sensitive metrics for their Firefox
web browser, such as determining which domains trigger a browser
crash. Mozilla’s use is interesting as they compose Divvi
Up with Oblivious HTTP to fully remove Divvi Up’s ability to see
metadata (e.g. IP address) associated with each report.
Subscribers: Horizontal
Current status: in production.
Use case: sensitive telemetry & survey results.
Horizontal deploys the clients, and operates the helper aggregator
and the collector. The ISRG operates the leader aggregator.
Horizontal has deployed private survey result collection in its
Shira product and telemetry in its Tella product.
Horizontal is notable as the only subscriber that has
deployed our Android client.
Divvi Up & Oblivious HTTP
● Very high-level: OHTTP is an encrypted HTTP proxy requiring two
non-colluding servers, which hides all request metadata (e.g. IP)
from the server.
● OHTTP, when composed with a classic telemetry/aggregation system,
would hide the source of each measurement, but not the
measurement itself.
● Divvi Up hides the individual measurements, but may reveal which
clients contribute to the aggregates.
● Composing OHTTP with Divvi Up allows for private aggregation,
while preventing client metadata (e.g. IP address) from leaking to
the aggregators.
Divvi Up & Differential Privacy
● Very high-level: DP is a method to hide whether an individual
contributed to an aggregate via statistical noise.
● DP says nothing about how the aggregate is generated.
● Divvi Up can be composed with DP: Divvi Up protects the individual
measurements from being leaked while producing aggregates, DP
protects the aggregate from leaking information about individual
measurements.
● We are investigating “central” DP (noise added by aggregators) and
“client” DP (noise added by clients).
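As a rough illustration of the "central" approach, the sketch below adds Laplace noise to an already-computed sum. It assumes the textbook Laplace mechanism with a floating-point sampler, which is not necessarily what Divvi Up will adopt. The function names and parameters are hypothetical, and production systems typically need discrete noise and more careful sampling.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_sum(true_sum: float, sensitivity: float, epsilon: float,
              rng: random.Random) -> float:
    """Release a sum with noise scaled to sensitivity / epsilon.

    `sensitivity` is the most one client can change the sum;
    a smaller `epsilon` means more noise and stronger privacy.
    """
    return true_sum + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # seeded only to make the example repeatable
released = noisy_sum(true_sum=100.0, sensitivity=1.0, epsilon=0.5, rng=rng)
print(released)
```

The noise averages out to zero over many releases, so the aggregate stays useful while any single contribution is masked.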
What’s next
● Discover more about how subscribers will use Divvi Up, as well as
further applications of the underlying DAP technology
● Gain production deployment experience with partners at scale
● Improve efficiency & lower cost to operate
● Continue to refine the Divvi Up subscriber web portal
● Publish DAP as an IETF RFC
Questions now? Ask!
Questions later? contact@divviup.org
Standardization Work
The Distributed Aggregation Protocol (DAP) is used by Divvi Up to
perform private aggregation. DAP inherently requires interoperation
by two non-colluding “aggregator” servers; therefore, DAP is being
standardized at the IETF.
Verifiable Distributed Aggregation Functions (VDAFs) provide the
cryptographic primitives used by the higher-level Distributed
Aggregation Protocol to perform aggregation. VDAF is being
standardized at the CFRG (Crypto Forum Research Group).
Internet Engineering Task Force (IETF) Specification

Open Source Privacy-Preserving Metrics - Sarah Gran & Brandon Pitman
