Preserving privacy while sharing data

1
Gordon Haff
Emerging Technology Evangelist
@ghaff
https://bitmason.blogspot.com
September 2020

2
The opportunity
Shared data can accelerate innovation and improve outcomes in energy, telecoms, healthcare...

3
The problem
But data can be private and sensitive at the individual person or organization level

4
Source: Andrew Trask

5
Anonymization
Remove/tokenize personal data fields
Encrypt/transform personal data fields
Aggregation by trusted agency

6
Does it work?
Sort of...

7
What’s personal data?
Who can you really trust?
Lack of data diversity (e.g. k-anonymity failures)
Susceptibility to attack
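
The k-anonymity failure mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not a tool from the talk: the records, field names, and both helper functions are made up for the example. A table is k-anonymous when every combination of quasi-identifier values is shared by at least k records, yet a group whose members all have the same sensitive value still leaks that value.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

def homogeneous_groups(records, quasi_identifiers, sensitive):
    """Groups in which every record carries the same sensitive value."""
    seen = {}
    for r in records:
        key = tuple(r[q] for q in quasi_identifiers)
        seen.setdefault(key, set()).add(r[sensitive])
    return [key for key, values in seen.items() if len(values) == 1]

# Hypothetical, already generalized records (ZIP truncated, age bucketed).
records = [
    {"zip": "021**", "age": "30-39", "diagnosis": "flu"},
    {"zip": "021**", "age": "30-39", "diagnosis": "asthma"},
    {"zip": "021**", "age": "40-49", "diagnosis": "diabetes"},
    {"zip": "021**", "age": "40-49", "diagnosis": "diabetes"},
]
QUASI_IDENTIFIERS = ["zip", "age"]

k = k_anonymity(records, QUASI_IDENTIFIERS)                       # k == 2
leaky = homogeneous_groups(records, QUASI_IDENTIFIERS, "diagnosis")
print(k, leaky)
```

The table is 2-anonymous, but anyone known to be in the 40-49 group is revealed to have diabetes because that group has no diversity in the sensitive field.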

9
Reconstruction
Source: US Census

10
Identification of patterns
Source: https://avtanski.net/

11
Linkage attacks
Source: Privitar
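
A linkage attack joins a "de-identified" release with a second data set that still carries names, using fields common to both. The sketch below is illustrative only: both data sets, all names, and the `link` helper are invented for the example, and the choice of ZIP code, birth date, and sex as join keys is an assumption.

```python
# "Anonymized" release: names removed, quasi-identifiers kept (hypothetical data).
released = [
    {"zip": "02138", "birth": "1960-07-31", "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth": "1975-01-12", "sex": "M", "diagnosis": "flu"},
]

# A second, public data set that includes names (hypothetical data).
public = [
    {"name": "Alice Example", "zip": "02138", "birth": "1960-07-31", "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth": "1980-03-02", "sex": "M"},
]

def link(released, public, keys):
    """Map names to sensitive values wherever the join keys match exactly one person."""
    index = {}
    for person in public:
        index.setdefault(tuple(person[k] for k in keys), []).append(person["name"])
    matches = {}
    for record in released:
        names = index.get(tuple(record[k] for k in keys), [])
        if len(names) == 1:  # a unique match re-identifies the record
            matches[names[0]] = record["diagnosis"]
    return matches

reidentified = link(released, public, ["zip", "birth", "sex"])
print(reidentified)
```

Removing the name column did nothing here: the remaining fields are unique enough to act as an identifier once a second data set is available.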

12
Re-identification

13
US Census

14
US Census

15
Differential Privacy
Response to erosion of traditional Statistical Disclosure Limitation (SDL) techniques
Widely share statistics over a set of data while provably limiting what is revealed about any individual
Introduced in 2006 by Dwork, McSherry, Nissim, and Smith (ε-differential privacy)

16
Requirements
Formal model
Resist linkage attacks
Resist unknown future attacks
Effective in settings in which extensive external information may be available

17
Injects random noise into a data set or its query results (in a mathematically rigorous way) to protect individual privacy
Amount of randomness trades off privacy against utility/accuracy
https://www.accessnow.org/understanding-differential-privacy-matters-digital-rights/
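
One common way to inject that randomness is the Laplace mechanism, sketched below for a counting query. The slides do not name a specific mechanism, so treat this as an assumed example: the function names, the sample ages, and the ε values are all illustrative. Noise is drawn with scale sensitivity/ε, so a smaller ε means more noise (more privacy, less accuracy).

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(values, predicate, epsilon, rng):
    """Noisy count of the values matching `predicate`."""
    true_count = sum(1 for v in values if predicate(v))
    # Adding or removing one person changes a count by at most 1,
    # so the sensitivity is 1 and the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1 / epsilon, rng)

def over_40(age):
    return age > 40

rng = random.Random(0)                      # seeded only to make the demo repeatable
ages = [23, 35, 41, 52, 67, 29, 38, 44]     # hypothetical data; true count over 40 is 4

noisy_strict = private_count(ages, over_40, epsilon=0.1, rng=rng)   # strong privacy, noisy
noisy_loose = private_count(ages, over_40, epsilon=10.0, rng=rng)   # weak privacy, accurate
print(noisy_strict, noisy_loose)
```

A production system would use a vetted library and a cryptographically sound noise source rather than `random`; the sketch only shows how ε controls the privacy/accuracy trade-off.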

19
Limitations
Base rate
Noise
Repeated queries
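
The repeated-queries limitation can be shown with a short simulation. This is a sketch under assumptions: it uses Laplace noise on a single numeric query with sensitivity 1, and the true value, the per-query ε, and the function name are made up. Averaging many independently noised answers to the same question cancels the noise, which is why each query has to be charged against a total privacy budget.

```python
import random
import statistics

def noisy_answer(true_value, epsilon, rng):
    """One Laplace-noised answer to a sensitivity-1 query."""
    scale = 1 / epsilon
    return true_value + rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

rng = random.Random(42)          # seeded only to make the demo repeatable
true_value = 100                 # hypothetical secret statistic
epsilon_per_query = 0.5

single = noisy_answer(true_value, epsilon_per_query, rng)

# An analyst who may ask the same question 10,000 times simply averages.
answers = [noisy_answer(true_value, epsilon_per_query, rng) for _ in range(10_000)]
averaged = statistics.fmean(answers)

# Under basic sequential composition the privacy losses add up.
total_epsilon = epsilon_per_query * len(answers)

print(f"single answer error:   {abs(single - true_value):.2f}")
print(f"averaged answer error: {abs(averaged - true_value):.2f}")
print(f"total epsilon spent:   {total_epsilon}")
```

Each answer on its own satisfies ε = 0.5, but the set of 10,000 answers corresponds to a far weaker guarantee, so deployments typically cap the number of queries or the cumulative ε.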

20
But what if you don’t have a trusted third party?

21
Multi-Party Computation
Collaborative analysis of siloed datasets without trusting a third party
● Equivalence to incorruptible trusted party
● Parties jointly compute a function on their inputs using a protocol
● No information is revealed about inputs beyond what the output itself implies
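
The idea of jointly computing a function without revealing inputs can be sketched with additive secret sharing, one of the simplest MPC building blocks. The example is an assumption-laden toy, not the protocol from the talk: it simulates all parties in one process, assumes honest-but-curious participants who do not collude, omits the encrypted channels a real deployment needs, and the modulus, function names, and inputs are illustrative.

```python
import secrets

MODULUS = 2**61 - 1  # public modulus agreed on by all parties (illustrative choice)

def share(value, n_parties):
    """Split `value` into n random-looking shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def secure_sum(private_inputs):
    """Simulate n parties computing the sum of their inputs from shares only."""
    n = len(private_inputs)
    # Party i splits its input and sends share j to party j.
    distributed = [share(x, n) for x in private_inputs]
    # Party j adds the shares it received and publishes only that partial sum.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # Anyone can combine the published partial sums into the final result.
    return sum(partial_sums) % MODULUS

inputs = [120, 75, 310]          # hypothetical private values, one per organization
total = secure_sum(inputs)
print(total)                     # 505
```

No single share or partial sum reveals an individual input, but note how every party must send a share to every other party; that is a small-scale view of why communication, rather than computation, tends to dominate the cost.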

22
Considerations
Preserve privacy and correctness
Adversarial participants
Collusion
Threat models
Overhead

23
Protocol distributes encrypted (AES) shares of (masked) data
Implementations and efficiency depend on threat assumptions
In general, low compute but high communications overhead

24
Ongoing research
● Subscribe to: https://research.redhat.com/quarterly/
● Boston University Red Hat Collaboratory
● openmined.org (PySyft)

25
Thank you
