2. Privacy techniques: On Device / Anonymize / Obfuscate / Smash / Encrypt
• On Device: local calculations; only download large datasets
• Anonymize: random identifiers
• Obfuscate: add noise, quantize, differential privacy
• Smash: convert to lower dimensions or a model
• Encrypt: use 'shares' and multiple servers; ZKP (zero-knowledge proof)
Imaging devices, Distributed ML and Privacy
Guiding Question
How to create portable skin imaging device(s) (like a wristwatch) that collaborate and train to do better diagnostics without sharing sensitive data?
5. Assuming a penalty of just $6 per person = $480M.
Reality: per-person costs are far higher, in the hundreds or thousands of dollars.
Ethical, moral, legal, trust, economic, news and PR repercussions.
Follow-up effects.
6. What's new in DPC @ MIT: Distributed & Private Computation Projects
• Semantic Privacy: architectures to prevent empirical reconstruction attacks
• Formal Privacy + Images: Differentially Private Image Retrieval
• Iterative feedback / private data structures: improving DP histograms & set intersection verification
• Parallel combinatorial optimization without submodularity
• Distributed ML: Split Learning
• AirMixML PoC: wireless + formal Rényi DP + ML
• Splintering: a foundation for distributed scientific computation
17. Challenges in Federated Learning
• IoT: low compute/communications (cannot train models)
• Health data: few clients (non-homogeneous data)
• Complex models (large, unoptimized models)
• Too little data per client (unviable to train or send large models)
• Many untrusted parties (how to encourage 3rd-party developers)
21. NoPeek-Infer via Decorrelation
[Diagram: on the client, a smasher maps input data 𝐱 to smashed data 𝐳, which is sent to the FL server; labels are shown alongside the input data.]
NoPeek-Infer: Preventing reconstruction attacks in distributed predictive inference after on-premise training. Vepakomma, Singh, Zhang, Raskar, 2021.
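NoPeek-style decorrelation is typically realized by adding a distance-correlation penalty between the raw inputs and the smashed activations to the task loss. Below is a minimal PyTorch sketch of that loss; the weighting alpha, the cross-entropy task loss, and the estimator details are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def distance_correlation(x, z, eps=1e-9):
    """Squared sample distance correlation between mini-batches x and z (0 = independent)."""
    def centered_dist(a):
        # Pairwise Euclidean distances between flattened samples, then double-center.
        d = torch.cdist(a.flatten(1), a.flatten(1), p=2)
        return d - d.mean(0, keepdim=True) - d.mean(1, keepdim=True) + d.mean()
    A, B = centered_dist(x), centered_dist(z)
    dcov2 = (A * B).mean()
    return dcov2 / (torch.sqrt((A * A).mean() * (B * B).mean()) + eps)

def nopeek_style_loss(x, smashed_z, logits, labels, alpha=0.1):
    """Task loss plus a penalty discouraging the smashed data 𝐳 from remaining
    statistically dependent on (and hence reconstructable from) the input 𝐱."""
    return F.cross_entropy(logits, labels) + alpha * distance_correlation(x, smashed_z)
```

During client-side training, the smasher is updated with this combined loss so 𝐳 stays useful for the task while correlating less with 𝐱.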
24. Colorectal histology image public dataset
[Figure: pairs of original images and reconstructions from activations, comparing a traditional split network against NoPeek; leakage decreases under NoPeek.]
27. Differential Privacy (Zoomed-out view)
Learning nothing about an individual while learning useful statistical information about a population.
• Is the provided (noisy) data enough for task accuracy? (Utility-privacy tradeoff)
(ε, δ)-Differential Privacy:
• The distribution of the output M(D) (a query) on database D is nearly the same as M(D′) for all adjacent databases D and D′:
∀S: Pr[M(D) ∈ S] ≤ exp(ε) ∙ Pr[M(D′) ∈ S] + δ.
Example mechanisms: the Laplace mechanism for ε-DP, the Gaussian mechanism for (ε, δ)-DP, contractive noisy iterations, and the list goes on.
2 key properties: post-processing (once DP, DP forever) & composition (loss of privacy across multiple queries).
How much noise to add? It depends on the global sensitivity of the query (not of the data).
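For the "how much noise" question, here is a minimal Laplace-mechanism sketch in Python; the counting-query example and parameter values are illustrative, not from the slides.

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Release true_value under epsilon-DP by adding Laplace(sensitivity / epsilon) noise."""
    rng = rng or np.random.default_rng()
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# A counting query has global sensitivity 1: adding or removing one person
# changes the count by at most 1, regardless of the data itself.
noisy_count = laplace_mechanism(true_value=1234, sensitivity=1.0, epsilon=0.5)
```

By post-processing, anything computed from noisy_count stays ε-DP; issuing further queries on the same data consumes additional budget under composition.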
29. Private Image Retrieval with Differential Privacy
[Diagram: the client sends a privatized image query; the server retrieves the nearest matches to the privatized query.]
Differentially Private Supervised Manifold Learning with Applications like Private Image Retrieval. Vepakomma, Balla, Raskar, 2021.
30. Lifecycle of PrivateMail
Key: perform retrieval after differentially private manifold learning on deep-learning features for image retrieval.
Big question: how to perform differentially private manifold learning?
35. Challenges/Choices in Decentralized & Private ML: Compute, Privacy, Bandwidth, Memory, Accuracy/Utility
Compute
• Compute or memory on clients (e.g., IoT)
• Unknown / large number of clients
• Unreliable nodes (slowdowns, faults)
• Adversarial clients
• Trusting services; unregulated/unethical use
• Ownership + control of the ecosystem
• Local vs. remote compute tradeoff
Communication
• Bandwidth
• Limited communication (few rounds, unreliable comms)
• Latency of training (queuing delays)
• Dynamic availability (streaming, time zones)
Data
• Too little or too much data per client
• Unbalanced (size, distribution, adversarial)
• Highly non-IID (heterogeneous data)
• Personalization
• Vertical partitions (instead of horizontal)
• Exposing labels
ML
• Validation on hidden data
• Parameter tuning
• Convergence guarantees
• Incremental updates (new features)
• Data leakage / invertibility
37. Split learning ported into PySyft & FedML.ai
Project Website: splitlearning.mit.edu
Annual Research Workshop: SLDML
Tutorials/Talk Videos
vepakom@mit.edu
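To make the split-learning handoff concrete, here is a minimal single-client PyTorch sketch: the client runs the network up to a cut layer and hands off smashed activations; the server finishes the forward pass, backpropagates, and returns the gradient at the cut. The layer sizes, optimizer, and in-process "send/receive" are simplifying assumptions; a real deployment would route tensors through the PySyft or FedML communication stack.

```python
import torch
import torch.nn as nn

# Client-side and server-side partitions of one model (sizes are illustrative).
client_net = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
server_net = nn.Sequential(nn.Linear(64, 10))
opt_c = torch.optim.SGD(client_net.parameters(), lr=0.1)
opt_s = torch.optim.SGD(server_net.parameters(), lr=0.1)

def split_training_step(x, y):
    # Client: forward to the cut layer; "send" the smashed data to the server.
    z = client_net(x)
    smashed = z.detach().requires_grad_(True)

    # Server: finish the forward pass, compute the loss, backprop to the cut.
    loss = nn.functional.cross_entropy(server_net(smashed), y)
    opt_s.zero_grad()
    loss.backward()
    opt_s.step()

    # Client: "receive" the gradient at the cut and finish backprop locally.
    opt_c.zero_grad()
    z.backward(smashed.grad)
    opt_c.step()
    return loss.item()

x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
print(split_training_step(x, y))
```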
38. AirMixML
• Over-the-Air Data Mixup for Inherently Privacy-Preserving Edge Machine Learning
• Wireless communications + PPML
39. Splintering (Stochastic Splintering)
• Client creates secure splinters from raw data and sends the secure splinters to the server.
• Server computes on the splinters and returns intermediate results.
• Client unsplinters the intermediate results to obtain the final result.
Splintering, Vepakomma, Raskar et al., 2020
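The slide does not spell out how splinters are constructed, so the sketch below uses plain additive secret sharing of a sum query as a stand-in for the same client → servers → client pattern; it is not the actual splintering algorithm, and the function names are invented for illustration.

```python
import numpy as np

def make_shares(x, num_servers=2, rng=None):
    """Split vector x into additive shares: the shares sum to x, and each share alone looks random."""
    rng = rng or np.random.default_rng()
    shares = [rng.normal(size=x.shape) for _ in range(num_servers - 1)]
    shares.append(x - sum(shares))
    return shares

def server_compute(share):
    """Each server runs the query (here: a sum) on its share only."""
    return share.sum()

def client_recombine(partial_results):
    """Client recombines the servers' intermediate results into the final answer."""
    return sum(partial_results)

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0])
shares = make_shares(x)
print(client_recombine([server_compute(s) for s in shares]))  # ≈ 14.0
```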
43. DAMS: proposed private sketching data structure for set intersection verification queries
[Diagram: a client device issues a query and receives a set intersection verification result.]
• Key idea: the algorithm is run q times, using a different dictionary of hash functions in each run of the private sketching algorithm (a minimal sketch follows below).
• The final result is obtained as the average of the estimated counts. We refer to this option as our proposed private DAMS estimator.
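A hedged sketch of that construction: the helper below is a simplified noisy count-mean sketch (the DP calibration and exact estimator in the paper may differ), and the DAMS estimator simply reruns it q times with a fresh hash dictionary each run and averages the q estimates.

```python
import zlib
import numpy as np

def make_hash_dictionary(num_hashes, width, seed):
    """One 'dictionary' of simple keyed hash functions mapping items to [0, width)."""
    keys = np.random.default_rng(seed).integers(0, 2**31 - 1, size=num_hashes)
    return [lambda x, k=int(k): zlib.crc32(f"{k}:{x}".encode()) % width for k in keys]

def private_cms_estimate(items, query, num_hashes, width, epsilon, seed):
    """One run of a simplified count-mean sketch with Laplace noise on the counters."""
    hashes = make_hash_dictionary(num_hashes, width, seed)
    table = np.zeros((num_hashes, width))
    for x in items:
        for j, h in enumerate(hashes):
            table[j, h(x)] += 1
    rng = np.random.default_rng(seed + 10_000)
    table += rng.laplace(scale=num_hashes / epsilon, size=table.shape)
    n = len(items)
    # Debias each row and average across rows ("count-mean").
    per_row = [(table[j, h(query)] - n / width) / (1 - 1 / width)
               for j, h in enumerate(hashes)]
    return float(np.mean(per_row))

def dams_estimate(items, query, q, num_hashes=4, width=64, epsilon=1.0):
    """DAMS: q runs, a different hash dictionary in each run, average of the q estimates."""
    return float(np.mean([private_cms_estimate(items, query, num_hashes, width, epsilon, seed=r)
                          for r in range(q)]))

items = ["alice"] * 40 + ["bob"] * 10 + ["carol"] * 5
print(dams_estimate(items, "alice", q=8))
```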
44. Theoretical guarantees on utility
• Baseline 1: use eps = q·eps′, with the algorithm run once with one set of hash functions.
– By sequential composition, this matches the privacy level obtained when the same set of hash functions is used across q runs of the algorithm on the same dataset.
• Baseline 2: use eps = eps′, with the algorithm run q times using the same dictionary of hash functions in each run of the private count-mean-sketch algorithm.
– This baseline is important to confirm that changing the hash-function dictionary across the q runs is a better option than keeping it the same.
• DAMS (Diversified Averaging for Meta-estimation of Sketches):
– The algorithm is run q times using a different dictionary of hash functions in each run of the private count-mean-sketch algorithm.
– The final result is obtained as the average of the estimated counts.
– Proved guarantees:
• The private DAMS estimator is unbiased.
• Var(DAMS) < Var(Baseline 1) when eps > 2.
• Var(DAMS) < Var(Baseline 2) always.
DAMS: Meta-estimation of private sketch data structures for differentially private COVID-19 contact tracing. P. Vepakomma, S. N. Pushpita, R. Raskar (PriML and PPML Joint Edition, NeurIPS 2020).
45. DAMS: improved TPR/FPR and lower variance than traditional private data structures like the count-mean sketch.
46. Iterative Feedback to Improve the Utility-Privacy Tradeoff for DP Histograms
Key idea: use a subsample to privately estimate heavy hitters, and use that as feedback to distribute epsilon across the heavy-hitter (HH) and non-heavy-hitter (!HH) partitions (a sketch follows below).
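An illustrative sketch of that budget-splitting idea, with assumed parameters and splits rather than the published algorithm: a small budget flags heavy-hitter bins from a subsample, then the remaining budget is divided unevenly between the HH and !HH partitions of the full histogram.

```python
import numpy as np

def dp_histogram_with_feedback(data, bin_edges, eps_subsample=0.2, eps_hh=0.6,
                               eps_rest=0.2, subsample_frac=0.1, hh_frac=0.05, seed=0):
    """Sketch of iterative feedback for DP histograms (not the exact published method).

    Each bin count has sensitivity 1, so Laplace(1/eps) noise per bin suffices for
    each step; the two disjoint bin partitions compose in parallel (cost = max of
    their budgets), and the subsample step composes sequentially on top of that.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)

    # Step 1: noisy histogram of a small subsample to guess heavy-hitter (HH) bins.
    sub = rng.choice(data, size=max(1, int(subsample_frac * len(data))), replace=False)
    sub_counts, _ = np.histogram(sub, bins=bin_edges)
    noisy_sub = sub_counts + rng.laplace(scale=1.0 / eps_subsample, size=sub_counts.shape)
    hh = noisy_sub >= hh_frac * max(noisy_sub.sum(), 1.0)

    # Step 2: full histogram; HH bins get the larger budget (less noise).
    counts, _ = np.histogram(data, bins=bin_edges)
    noisy = counts.astype(float)
    noisy[hh] += rng.laplace(scale=1.0 / eps_hh, size=int(hh.sum()))
    noisy[~hh] += rng.laplace(scale=1.0 / eps_rest, size=int((~hh).sum()))
    return noisy, hh

vals = np.random.default_rng(1).normal(0, 1, 1000)
hist, hh_bins = dp_histogram_with_feedback(vals, bin_edges=np.linspace(-4, 4, 17))
```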
47. Parallel Quasi-concave set optimization: A new frontier that scales without needing submodularity
[Diagram: duality between quasi-concave set functions (exponential time) and induced quasi-concave set functions defined via monotone linkage functions (polynomial time); data is broadcast across workers, each computing a pi-series, which are combined into maximal/minimal pi-clusters; reported complexities: O(K² N log N) + O(log K), and O(log K).]
Parallel Quasi-concave set optimization: A new frontier that scales without needing submodularity. Vepakomma, Kempner, Raskar, 2021.