The talk discusses how analytics can attack privacy and what we can do about it. It covers the legal responses (e.g. GDPR) as well as the technical responses (differential privacy and homomorphic encryption).
The video is available at https://www.facebook.com/eduscopelive/videos/314847475765297/ starting from 1.18.
2. What is Privacy?
Privacy is the ability of an individual or group to seclude themselves, or information about themselves, and thereby express themselves selectively.
3. I have nothing to Hide
"If you have something that you don't want anyone to know, maybe you shouldn't be doing it in the first place"
- Eric Schmidt, then CEO of Google
4. I have nothing to Hide
"If one would give me six lines written by the hand of the most honest man, I would find something in them to have him hanged"
- Cardinal Richelieu (first used in this context by Bruce Schneier)
5. I have nothing to Hide
It is hard to avoid discrimination, bias, and misinterpretation
We put unreasonable expectations on others (e.g. fundamental attribution error, illusory superiority)
People behave differently when they are watched (e.g. they become more conformist and take fewer risks)
People change, norms change, but data is forever
In a zero-privacy world, competition is hard and the balance of power is skewed.
• Powerful countries have an advantage
• In a democracy, the reigning government has an advantage
• Bigger companies have an advantage
7. What can anonymized CDR (call detail record) data tell about you?
• Where you live and work, your name, when you come home, when you leave home, your friends, your family, your income bracket
8. What can cameras tell about you?
• Eigenface (unique face), number plate, where you drive
• What you drive, where you go, who you met (*), when you leave home, your habits, how you feel
• This happens in public space, where you are not protected by privacy laws
9. What can electricity data tell about you?
• Are you in the house? When do you leave, and when do you come back?
• What appliances are in the house?
• What are you doing? (limited granularity)
• What programs are you watching?
12. Fighting Back
• Stop sharing
  • Not possible, due to the value of data
• Having greater control over what we share and how it is stored
  • Law and policy
  • Using algorithms
13. Law and Policy: HIPAA
• In healthcare, people must share data with their health care provider
• That data must in turn be shared with other parties (other hospitals, insurers, doctors)
• Health Insurance Portability and Accountability Act of 1996 (US)
• Dictates how individually identifiable health information can be collected, stored, and shared
• It works, but implementation is expensive
14. Law and Policy: GDPR
• The General Data Protection Regulation (EU) 2016/679
• Personal data can be collected, stored, or processed only on a lawful basis, such as the owner's explicit consent
• Dictates how the data is stored
• The owner can revoke consent at any time
• Might be a burden to startups and small companies
15. Law and Policy: Limiting Correlations
• We can limit what information is legal to use
• We can limit the correlation of data and the publication of correlated data
16. Algorithms: Differential Privacy
• Adding noise to the data so that aggregate values are preserved while individual values become harder to recover
• Creating artificial data sets for machine learning that yield a similar model
• Apple reported that it adopted differential privacy in 2016
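The noise-adding idea can be illustrated with the Laplace mechanism, one common way to achieve differential privacy. This is a minimal sketch, not the method of any particular vendor: the talk does not name a mechanism, and the function names, the sample `ages` list, and the choice of `epsilon` are assumptions made for illustration.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace distribution centred at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(values, predicate, epsilon, rng):
    # Counting query: adding or removing one person changes the true
    # answer by at most 1, so the sensitivity is 1 and the noise scale
    # is 1 / epsilon (smaller epsilon means more noise, more privacy).
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical data set: ages of eight people.
ages = [23, 35, 41, 29, 62, 57, 33, 48]
rng = random.Random(0)
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print("noisy count of people aged 40+:", round(noisy, 2))
```

A single noisy answer hides whether any one person is in the data, while the average over many independent queries stays close to the true count. That is the aggregate-versus-individual trade-off described above.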
17. Algorithms: Distributed Data
• Storing data in a distributed manner (e.g. on each user's phone or machine) and allowing queries against it
• Doing computations by combining the results
• Much more expensive than centralized methods
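As a rough illustration of computing by combining results, the sketch below keeps raw readings on each device and sends only a local summary (sum and count) to whoever runs the query. The device names, readings, and function names are hypothetical, and sharing summaries alone is not a privacy guarantee; it only shows the shape of the approach.

```python
def local_summary(readings):
    # Runs on the user's own device: only the sum and the count
    # leave the device, never the individual readings.
    return (sum(readings), len(readings))

def combine(summaries):
    # Runs at the querying side: merges per-device summaries
    # into a global mean without seeing any raw reading.
    total = sum(s for s, _ in summaries)
    count = sum(n for _, n in summaries)
    return total / count

# Hypothetical per-device data that stays where it was collected.
devices = {
    "phone_a": [2.0, 4.0],
    "phone_b": [6.0],
    "phone_c": [8.0, 10.0, 12.0],
}

summaries = [local_summary(r) for r in devices.values()]
global_mean = combine(summaries)
print("global mean:", global_mean)
```

The extra cost mentioned above shows up here as one round trip per device for every query, instead of a single scan over a central table.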
18. Algorithms: Homomorphic Encryption
• An encryption technique that enables computations on encrypted data
• Encrypted data can be used to do limited calculations
• It is still computationally expensive, so it can't be used widely yet
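To make "limited calculations on encrypted data" concrete, here is a toy version of the Paillier cryptosystem, which supports only addition of encrypted values. The talk does not name a scheme, so Paillier is an assumption chosen for illustration, and the tiny primes and all function names are hypothetical. The key is far too small to be secure.

```python
import math
import random

def keygen(p, q):
    # Toy key generation from two small primes (insecure sizes).
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu, n)

def encrypt(n, m, rng):
    # c = g^m * r^n mod n^2, with g = n + 1 and random r coprime to n.
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(private_key, c):
    lam, mu, n = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    # Multiplying ciphertexts adds the underlying plaintexts (mod n).
    return (c1 * c2) % (n * n)

rng = random.Random(42)
public_n, private_key = keygen(1009, 1013)

c1 = encrypt(public_n, 1200, rng)
c2 = encrypt(public_n, 3400, rng)
c_sum = add_encrypted(public_n, c1, c2)  # computed without decrypting

print("decrypted sum:", decrypt(private_key, c_sum))
```

The party holding `c1` and `c2` can produce `c_sum` without ever learning 1200 or 3400; only the key holder can read the total. Every operation is a modular exponentiation on numbers the size of the squared modulus, which hints at why such schemes are costly at realistic key sizes.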