User profiles collected by social networks and search engines are often used for targeted advertising, but adding noise to these profiles to protect privacy is challenging due to the sparse and correlated nature of the data. While the "smart noise" hypothesis proposes that knowing the exact model of user profiles allows for properly calibrated noise to be added, the document argues this is incorrect and that even simple monotone or linear threshold models of profiles can still allow for non-private inferences despite noise corruption of 25% or more of profile bits.
2. Privacy of profile-based targeting 2
User-profile targeting
• Goal: increase impact of your ads by targeting a group
potentially interested in your product.
• Examples:
• Social Network
Profile = user’s personal information + friends
• Search Engine
Profile = search queries + webpages visited by user
3. Privacy of profile-based targeting 3
Facebook ad targeting
4. Privacy of profile-based targeting 4
Characters
Advertising company: “My system is private!”
Privacy researcher
5. Privacy of profile-based targeting 5
Simple attack [Korolova’10]
• Jon’s profile
• Public: 32 y.o. single man; Mountain View, CA; ….; has cat
• Private: likes fishing
• Eve’s targeted ad: “Amazing cat food for $0.99!”
• Targeting: Jon’s public attributes + “likes fishing”
• The ad is shown to Jon (“Nice!”)
• Eve observes the # of impressions (with noise) and learns that Jon likes fishing
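The attack on this slide can be sketched as a toy simulation. Everything below is hypothetical: the `User` record, `count_impressions`, and `infer_private_bit` are illustrative names, not any ad platform's real API, and the bounded integer noise and the 0.5 decision threshold are assumptions chosen only to keep the example small.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    public: frozenset   # attributes anyone can see
    private: frozenset  # attributes only the ad platform knows


def count_impressions(users, targeting, noise=0, rng=None):
    """Hypothetical platform report: users matching every targeting attribute,
    plus integer noise drawn from [-noise, noise], clamped at zero."""
    rng = rng or random.Random(0)
    matches = sum(1 for u in users if targeting <= (u.public | u.private))
    return max(0, matches + rng.randint(-noise, noise))


def infer_private_bit(users, victim_public, candidate, runs=1, noise=0, rng=None):
    """Eve targets the victim's public attributes plus one candidate private
    attribute and averages the reported impressions over repeated campaigns."""
    rng = rng or random.Random(0)
    targeting = victim_public | {candidate}
    total = sum(count_impressions(users, targeting, noise, rng) for _ in range(runs))
    return total / runs > 0.5  # assumed threshold for this toy noise model


jon = User("Jon",
           frozenset({"32 y.o.", "single", "Mountain View, CA", "has cat"}),
           frozenset({"likes fishing"}))
bob = User("Bob", frozenset({"45 y.o.", "married"}), frozenset({"likes golf"}))
users = [jon, bob]

print(infer_private_bit(users, jon.public, "likes fishing"))
print(infer_private_bit(users, jon.public, "likes golf"))
```

The sketch assumes Jon's public attributes single him out; under that assumption, a small amount of noise on the impression count can be averaged away by repeating the campaign.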
6. Privacy of profile-based targeting 6
Advertising company: “My system is private!”
Privacy researcher: “Unless your targeting is private, it is not!”
Advertising company: “How can I target privately?”
7. Privacy of profile-based targeting 7
How to protect information?
• Basic idea: add some noise
• Explicitly
• Implicit in the data
• noiseless privacy [BBGLT11]
• natural privacy [BD11]
• Two types of explicit noise
• Output perturbation
• Dynamically add noise to answers
• Input perturbation
• Modify the database
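The two kinds of explicit noise can be contrasted in a minimal sketch. The function names, the choice of Laplace noise for answers, and independent bit flips for the database are assumptions made for illustration; the slides do not fix a particular mechanism.

```python
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def output_perturbation(database, query, scale, rng):
    """Keep the database intact and add fresh noise to each answer."""
    return query(database) + laplace(scale, rng)


def input_perturbation(database, flip_prob, rng):
    """Modify the database once: flip every profile bit independently
    with probability flip_prob and return the noisy copy."""
    return [[bit ^ (rng.random() < flip_prob) for bit in profile]
            for profile in database]


rng = random.Random(0)
profiles = [[1, 0, 1, 1], [0, 0, 1, 0]]
noisy_profiles = input_perturbation(profiles, 0.1, rng)
noisy_count = output_perturbation(profiles, lambda db: sum(p[2] for p in db), 1.0, rng)
print(noisy_profiles, noisy_count)
```

With input perturbation only `noisy_profiles` would need to be stored, whereas output perturbation keeps the original profiles and draws new noise for every query.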
8. Privacy of profile-based targeting 8
Advertising company: “I like input perturbation better…”
9. Privacy of profile-based targeting 9
Input perturbation
• Pro:
• Pan-private (not storing initial data)
• Do it once
• Simpler architecture
10. Privacy of profile-based targeting 10
Advertising company: “I like input perturbation better…”
Privacy researcher: “Signal is sparse and non-random”
12. Privacy of profile-based targeting 12
Advertising company: “I like input perturbation better…”
Privacy researcher: “Signal is sparse and non-random”
Advertising company: [speech bubble mentioning “smart noise”; remaining text not recoverable]
13. Privacy of profile-based targeting 13
“Smart noise”
• Consider two extreme cases
• All bits are independent: independent noise
• All bits are correlated with correlation coefficient 1: correlated noise
“Aha!”
• “Smart noise” hypothesis:
“If we know the exact model, we can add the right noise”
14. Privacy of profile-based targeting 14
Dependent bits in real data
• Netflix prize competition data
• ~480k users, ~18k movies, ~100M ratings
• Estimate movie-to-movie correlation
• Fact that a user rated a movie
• Visualize graph of correlations
• Edge – correlation with correlation coefficient > 0.5
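The correlation graph described above could be built roughly as follows. This is a sketch under assumptions: `rated` is a hypothetical 0/1 matrix recording only whether a user rated a movie, Pearson correlation is used as the correlation coefficient, and the tiny matrix is made-up illustration data, not Netflix data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: correlation undefined, treated as no edge
    return cov / sqrt(vx * vy)


def correlation_edges(rated, threshold=0.5):
    """rated[u][m] is 1 if user u rated movie m, else 0.
    Returns the movie pairs whose correlation exceeds the threshold."""
    n_movies = len(rated[0])
    columns = [[row[m] for row in rated] for m in range(n_movies)]
    return [(i, j) for i, j in combinations(range(n_movies), 2)
            if pearson(columns[i], columns[j]) > threshold]


# Made-up example: 4 users, 3 movies.
rated = [[1, 1, 0],
         [1, 1, 1],
         [0, 0, 1],
         [0, 0, 0]]
print(correlation_edges(rated))
```

At the scale quoted on the slide, the quadratic loop over movie pairs would need a vectorized or sparse implementation; the sketch only shows the definition of an edge.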
15. Privacy of profile-based targeting 15
Netflix movie correlations
16. Privacy of profile-based targeting 16
Advertising company: [speech bubble mentioning “smart noise”; remaining text not recoverable]
Privacy researcher: “Let’s construct models where ‘smart noise’ fails”
17. Privacy of profile-based targeting 17
How can “smart noise” fail?
18. Privacy of profile-based targeting 18
Models of user profiles
1 0 1 … 0 1
1 1 0 1 … 0 1 0 1
• Are users well separated?
19. Privacy of profile-based targeting 19
Error-correcting codes
• Constant relative distance
• Unique decoding
• Explicit, efficient
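To illustrate why well-separated profiles defeat input perturbation, here is a small sketch in which profiles are codewords of an error-correcting code. The Hadamard code and the brute-force decoder are illustrative choices, not necessarily the code the talk had in mind. Distinct Hadamard codewords differ in exactly half of their positions, so the original is the unique nearest codeword whenever fewer than a quarter of the bits are flipped.

```python
import random


def hadamard_encode(message_bits):
    """Codeword of length 2**k: position x holds the inner product of the
    message with the bits of x, taken mod 2."""
    k = len(message_bits)
    return [sum(m & ((x >> i) & 1) for i, m in enumerate(message_bits)) % 2
            for x in range(2 ** k)]


def hamming(a, b):
    """Number of positions where two bit lists differ."""
    return sum(u != v for u, v in zip(a, b))


def decode_nearest(noisy, k):
    """Brute-force unique decoding: the message whose codeword is closest."""
    candidates = ([(m >> i) & 1 for i in range(k)] for m in range(2 ** k))
    return min(candidates, key=lambda msg: hamming(hadamard_encode(msg), noisy))


def corrupt(codeword, n_flips, rng):
    """Flip exactly n_flips distinct positions of the codeword."""
    flipped = set(rng.sample(range(len(codeword)), n_flips))
    return [bit ^ (i in flipped) for i, bit in enumerate(codeword)]


secret = [1, 0, 1, 1]                            # hypothetical 4-bit user identity
profile = hadamard_encode(secret)                # 16-bit "profile"
noisy = corrupt(profile, 3, random.Random(1))    # 3 of 16 bits flipped (< 25%)
print(decode_nearest(noisy, 4) == secret)
```

Brute-force decoding is exponential in the message length and is used here only for clarity; the slide's "explicit, efficient" bullet refers to codes with efficient decoders.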
20. Privacy of profile-based targeting 20
Privacy researcher: “See: unless the noise is >25%, no privacy”
Advertising company: “But this model is unrealistic!”
Privacy researcher: “Let me see what I can do with monotone functions…”
25. Privacy of profile-based targeting 25
Privacy researcher: “If the model is monotone, blatant non-privacy is still possible.”
Advertising company: “Hmmm. Does smart noise ever work?”
27. Privacy of profile-based targeting 27
Conclusion
• Two separate issues with input perturbation:
• Sparseness
• Dependencies
• “Smart noise” hypothesis fallacy (arbitrary, monotone, and linear threshold models):
Even for a publicly known, relatively simple model, constant
corruption of profiles may lead to blatant non-privacy.
• Connection between noise sensitivity of boolean functions and
privacy
• Open questions:
• Linear threshold privacy-preserving mechanism?
• Existence of interactive privacy-preserving solutions?
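The noise sensitivity mentioned in the conclusion can be estimated numerically. The sketch below uses the standard definition, the probability that a boolean function changes value when each input bit is flipped independently with a fixed probability, and assumes uniformly random inputs for simplicity; the talk's profile models are not uniform, so this only illustrates the quantity, not the talk's results.

```python
import random


def majority(bits):
    """A simple linear threshold function: 1 if more than half the bits are set."""
    return int(2 * sum(bits) > len(bits))


def noise_sensitivity(f, n, flip_prob, trials, rng):
    """Monte Carlo estimate of Pr[f(x) != f(y)], where x is uniform over
    n-bit strings and y flips each bit of x independently with flip_prob."""
    disagreements = 0
    for _ in range(trials):
        x = [rng.randint(0, 1) for _ in range(n)]
        y = [b ^ (rng.random() < flip_prob) for b in x]
        disagreements += f(x) != f(y)
    return disagreements / trials


rng = random.Random(0)
print(noise_sensitivity(majority, 101, 0.25, 2000, rng))
```

Intuitively, when a function of the profile has low noise sensitivity, its value on the noisy profile usually equals its value on the true profile, so the perturbation hides little about it.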
28. Privacy of profile-based targeting 28
Thank you for your attention!
Special thanks to Cynthia Dwork, Moises Goldszmidt,
Parikshit Gopalan, Frank McSherry, Moni Naor, Kunal
Talwar, and Sergey Yekhanin.