Is Privacy possible without
Anonymity?
The case for microblogging services
Panagiotis (Panos) Papadopoulos
Brave Software Inc.
Antonis Papadogiannakis, Michalis Polychronakis, Evangelos P. Markatos
The news delivery model changed
Shift to microblogging services and news
aggregators for news consumption
Source: Statista
Publish-Subscribe Model
• Information providers (or channels):
  - Artists
  - News channels or journalists
  - Health institutes or doctors
  - Politicians
  - Religious channels
  - Other communities
• Users subscribe to/follow channels
• Receive interesting information in a timely manner
What about users’ privacy?
• Based on the user’s channel subscriptions, the service provider learns
the user’s interests:
• Political preferences (e.g., by following Donald Trump)
• Health issues (e.g., by following Alcoholics Anonymous)
• Detailed user profiling
• Can be used for many purposes
• Beyond the control of the users
• Privacy-sensitive channels
Threat model
• A service that enables users to receive personalized news about their topics of interest:
  - news aggregators and RSS readers (e.g., Digg, Apple News, Google News, Reddit)
  - microblogging services (e.g., Twitter, Tout, Tumblr, Weibo)
• Users subscribe to individual channels of information that distribute content about various topics
• An honest-but-curious microblogging service tracks users' interests by monitoring the channels they follow
• Collected user data is the property of the service operator and can later be passed or sold to third parties (advertisers, DMPs)
How can users protect their privacy?
• Pseudonym or fake account
• IP address and third-party cookies can reveal the user's identity
• Fake account +
• Device fingerprints can correlate anonymous and eponymous browsing
sessions
• Fake account + + VM per browsing session
• Too complex for ordinary users and mobile devices
In a world where anonymity is
(practically) not possible:
Can we still protect the privacy of users of a microblogging service?
Our approach
An obfuscation-based approach to preserve the privacy of the users’
interests:
1. The user clicks to follow a channel S
2. The browser extension follows, in the background, N other noise channels (say S1, S2, ..., SN)
3. The browser extension filters out the noise channels at rendering time
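The three steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the authors' browser extension: the class name `ObfuscatingClient`, the `catalog` of candidate channels, and the uniform random choice of noise are all assumptions made for the example (the later slides argue that noise should be chosen by a priori probability, not at random).

```python
import random


class ObfuscatingClient:
    """Hypothetical client-side helper: follows noise channels and hides them."""

    def __init__(self, catalog, n_noise, rng=None):
        self.catalog = list(catalog)   # all channels the service offers (assumed known)
        self.n_noise = n_noise         # N noise channels per real follow
        self.rng = rng or random.Random()
        self.real = set()              # channels the user actually chose
        self.noise = set()             # channels followed only as cover

    def follow(self, channel):
        """Step 1 + 2: return every channel to follow on the service."""
        self.real.add(channel)
        self.noise.discard(channel)    # a former noise channel may become a real one
        candidates = [c for c in self.catalog
                      if c not in self.real and c not in self.noise]
        picked = self.rng.sample(candidates, min(self.n_noise, len(candidates)))
        self.noise.update(picked)
        return [channel] + picked

    def render(self, timeline):
        """Step 3: keep only posts from channels the user really chose."""
        return [post for post in timeline if post["channel"] in self.real]


client = ObfuscatingClient(["S", "A", "B", "C"], n_noise=2, rng=random.Random(0))
sent = client.follow("S")
timeline = [{"channel": c, "text": "post from " + c} for c in sent]
visible = client.render(timeline)
```

The service sees three follows of equal standing, while the user's rendered timeline contains only posts from S.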
Example of the tracker’s view
[Figure: example of the tracker's view for N = 2, distinguishing real choices from noise]
Tracker’s confidence
• A priori confidence ($P^{U}_{pr}(S)$):
  - before user U follows channel S, the tracker has some confidence that U is interested in S
    • general background information available (e.g., what percentage of people are alcoholics),
    • background information available specifically tailored to the profile of U (e.g., what percentage of white males between 40 and 50 are alcoholics)
• A posteriori confidence ($P^{U}_{po}(S)$):
  - after U follows channel S, the tracker has a new confidence level that U is interested in S.
  - For example:
    • if U follows only S and no noise channels: $P^{U}_{po}(S) = 1$
    • if U follows S along with 1 noise channel (say S1), the tracker will know that U is interested either in S or S1: $P^{U}_{po}(S) = P^{U}_{po}(S_1) = 0.5$
Our scope:
• Tracker would like to:
  • increase as much as possible the a posteriori confidence that U is interested in S
  • i.e., $P^{U}_{po}(S) \gg P^{U}_{pr}(S)$
• We would like to:
  • reduce the a posteriori confidence to the level where it is no higher than the a priori confidence
  • i.e., $P^{U}_{po}(S) = P^{U}_{pr}(S)$
How much noise should be added?
Example:
• The tracker is 10% confident that U is interested in channel S => $P^{U}_{pr}(S) = 0.1$
• By adding 9 noise channels (+ 1 real = 10 channels), the tracker will still be about 10% confident (a posteriori confidence) that S is the “real” channel
✓ Add $\lceil 1/P^{U}_{pr}(S) \rceil - 1$ noise channels so that $P^{U}_{po}(S) = 1/\lceil 1/P^{U}_{pr}(S) \rceil \approx P^{U}_{pr}(S)$
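The rule on this slide is easy to check numerically. The sketch below assumes the tracker treats all followed channels as equally likely, as in the slide's example; the function names are illustrative, and the conversion through `Fraction` is only there to avoid floating-point surprises when computing the ceiling.

```python
import math
from fractions import Fraction


def noise_channels_needed(prior):
    """Number of noise channels: ceil(1 / prior) - 1."""
    p = Fraction(prior).limit_denominator(10**6)  # e.g. 0.1 -> 1/10 exactly
    return math.ceil(1 / p) - 1


def posterior_confidence(prior):
    """A posteriori confidence: 1 / ceil(1 / prior)."""
    return Fraction(1, noise_channels_needed(prior) + 1)


n = noise_channels_needed(0.1)      # 9 noise channels for a 10% prior
po = posterior_confidence(0.1)      # 1/10, equal to the prior
po_third = posterior_confidence(0.3)  # 1/4: ceil(1/0.3) = 4, slightly below 0.3
```

Because of the ceiling, the a posteriori confidence never exceeds the a priori one; it is equal when 1/prior is an integer and slightly lower otherwise.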
What kind of noise should be added?
User U must be as likely to be interested in each noise channel as she is in channel S.
• An 80-year-old metalhead who follows Justin Bieber's channel is most likely using it as noise (don’t his songs sound a bit like that anyway?!)
So:
• the a priori probability that U is interested in S should be
  • the same as the a priori probability that U is interested in S1
  • the same as the a priori probability that U is interested in S2
  • …
• $P^{U}_{pr}(S) = P^{U}_{pr}(S_1) = P^{U}_{pr}(S_2) = \dots = P^{U}_{pr}(S_N)$
Challenge in noise selection
$P^{U}_{pr}(S) = P^{U}_{pr}(S_1) = P^{U}_{pr}(S_2) = \dots = P^{U}_{pr}(S_N)$
• Good luck finding such channels…
• Constant amount of noise… sounds like k-anonymity
• What is the proper k though?
  - A low value of k will increase the confidence of the tracker
  - A high value of k will impose unnecessary load on the service
Bucketing
When user U is interested in following 1 channel from a bucket, she is instructed to follow all channels of this bucket
  - The size of each bucket is not constant: it depends on the lowest channel probability contained in the bucket.
Example:
  - If user U is interested in channel S with probability 10%, the bucket should contain 10 channels.
    • If it contains fewer, the tracker's confidence will increase: $P^{U}_{po}(S) \gg 10\%$
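As a rough check of the bucket-size rule, the sketch below tests whether a bucket is large enough for its least likely channel. The helper names and the list-of-priors representation are assumptions for illustration; as in the slide's example, the tracker is assumed to spread its belief evenly over the bucket.

```python
import math
from fractions import Fraction


def required_bucket_size(bucket_priors):
    """Size dictated by the lowest a priori probability in the bucket."""
    lowest = Fraction(min(bucket_priors)).limit_denominator(10**6)
    return math.ceil(1 / lowest)


def bucket_is_large_enough(bucket_priors):
    """True if following the whole bucket does not raise the tracker's belief."""
    return len(bucket_priors) >= required_bucket_size(bucket_priors)


def posterior_in_bucket(bucket_priors):
    """Tracker's a posteriori confidence in any single channel of the bucket."""
    return Fraction(1, len(bucket_priors))


full_bucket = [0.1] * 10    # ten channels, each with a 10% prior
small_bucket = [0.1] * 5    # only five such channels

ok_full = bucket_is_large_enough(full_bucket)     # posterior 1/10 matches the prior
ok_small = bucket_is_large_enough(small_bucket)   # posterior 1/5 doubles the prior
```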
Bounds in the error estimation (1/2)
So far we assumed that we always know the probability that U is interested in S: $P^{U}_{pr}(S)$
But:
• what if we do not?
• or we do not accurately estimate it?
• or the tracker has some background knowledge?
How is this expected to influence our methodology?
Bounds in the error estimation (2/2)
Assume channel S of a political party (general feeling: $P^{U}_{pr}(S) = 10\%$ of voters)
1. Overestimation of a priori probability:
   • Tracker knows: people like user U are 2x less likely to vote for this party ($P^{Au}_{pr}(S) = 5\%$)
   • $P^{U}_{po}(S) = 10\% > P^{Au}_{pr}(S) = 5\%$ -> our algorithm uses 9 noise channels instead of 19 (the tracker's confidence increases)
2. Underestimation of a priori probability:
   • Tracker knows: people like user U are 2x more likely to vote for this party ($P^{Au}_{pr}(S) = 20\%$)
   • $P^{U}_{po}(S) = 10\% < P^{Au}_{pr}(S) = 20\%$ -> our algorithm uses 9 noise channels instead of 4 (we are golden!)
So if:
• the a priori probability is overestimated by a factor of f, this will increase the tracker’s belief by no more than a factor of f
• the a priori probability is underestimated by a factor of f, this won’t increase the tracker’s belief
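The two cases can be reproduced with a small calculation. In this sketch, `estimated_prior` stands for the value our algorithm assumes and `actual_prior` for the value the tracker really holds; both names, and the function `belief_gain`, are illustrative assumptions. The returned ratio is the factor by which the tracker's belief changes after the user follows the bucket.

```python
import math
from fractions import Fraction


def belief_gain(estimated_prior, actual_prior):
    """Ratio of the tracker's a posteriori belief to its actual a priori belief."""
    est = Fraction(estimated_prior).limit_denominator(10**6)
    act = Fraction(actual_prior).limit_denominator(10**6)
    posterior = Fraction(1, math.ceil(1 / est))  # noise was sized from the estimate
    return posterior / act


over = belief_gain(0.1, 0.05)   # prior overestimated by f = 2
under = belief_gain(0.1, 0.2)   # prior underestimated by f = 2
exact = belief_gain(0.1, 0.1)   # prior estimated correctly
```

A ratio above 1 means the tracker learned something; in the overestimation case the ratio equals f = 2, and in the underestimation case it stays below 1.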
Conclusion
✓ We explored whether it is possible for users to retrieve sensitive information (about health issues, sexual preferences) without revealing their interests
  • while consuming information from microblogging services or news aggregators.
✓ We proposed an obfuscation-based algorithm that aims to decrease the confidence of the service provider when it infers the user’s interests, and we analyzed its effectiveness
✓ We showed that it is possible for users to protect their privacy even if they cannot hide their identity.
Background Slides
Correlation Attack
The longer a user interacts with a system, the more information she reveals
Example:
• User U is already following channel S (+ noise channels).
• U is interested in following channel T, whose content is semantically related to the content of S.
  - The tracker identifies the correlation and infers that U is interested in channel S and not in any of the noise channels.
Mitigation against Correlation Attack
What is the definition of semantically related channels?
• $P(S)$ is the probability that user U is interested in channel S
• $P(S \mid T)$ is the probability that U is actually interested in an already followed channel S given that she now follows channel T.
So two channels are semantically related if: $P(S \mid T) > P(S)$
Our approach: $P(S \mid T) = P(S_1 \mid T_1) = P(S_2 \mid T_2) = \dots = P(S_{N-1} \mid T_{N-1})$
• where $S_1, S_2, \dots, S_{N-1}$ are the already followed noise channels for sensitive channel S,
• and $S_1$ and $T_1$, $S_2$ and $T_2$, ..., $S_{N-1}$ and $T_{N-1}$ are also correlated channels.
Bucket creation algorithm
1. Order all channels C according to their a priori probability $P^{U}_{pr}(C)$, in decreasing order.
2. Construct a set of channels with probabilities in the range [1/2, 1).
   • If there are more than two such channels, create buckets of size two.
   • Each bucket contains at least 2 channels with probability at least 1/2.
3. Construct a set of channels with probabilities in the range [1/3, 1/2).
   • If there are more than three such channels in this range, create buckets of size 3.
   • If, however, no three such channels are found, the algorithm will try to find four channels in the range [1/4, 1/2).
   • If unsuccessful, it will try to find five channels in the range [1/5, 1/2), etc.
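One way to approximate these steps is a single greedy pass over the sorted channels: keep adding channels to the current bucket and close it once it holds at least ⌈1/p⌉ channels, where p is the lowest probability added so far. This is a simplified sketch, not the authors' exact procedure; the function name, the dictionary input, and the handling of leftover channels (returned separately, unbucketed) are assumptions, since the slide does not say what happens to them.

```python
import math
from fractions import Fraction


def create_buckets(priors):
    """Greedy bucketing sketch.

    priors: dict mapping channel name -> a priori probability in (0, 1).
    Returns (buckets, leftover), where leftover holds channels that could
    not fill a final bucket.
    """
    # Step 1: order channels by a priori probability, highest first.
    ordered = sorted(priors.items(), key=lambda item: item[1], reverse=True)

    buckets, current = [], []
    for name, p in ordered:
        current.append(name)
        # The channel just added has the lowest probability in the bucket,
        # so it dictates the required size ceil(1 / p).
        lowest = Fraction(p).limit_denominator(10**6)
        if len(current) >= math.ceil(1 / lowest):
            buckets.append(current)
            current = []
    return buckets, current


example_priors = {"a": 0.6, "b": 0.5, "c": 0.4, "d": 0.35, "e": 0.34, "f": 0.2}
buckets, leftover = create_buckets(example_priors)
```

With these made-up priors, the two channels in [1/2, 1) form a bucket of size two, the three channels in [1/3, 1/2) form a bucket of size three, and the single 20% channel is left over because it would need four companions.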

Is privacy possible without Anonymity? The case for microblogging services

  • 1. Is Privacy possible without Anonymity? The case for microblogging services Panagiotis (Panos) Papadopoulos Brave Software Inc. Antonis Papadogiannakis , Michalis Polychronakis, Evangelos P. Markatos
  • 2. The news delivery model changed Shift to microblogging services and news aggregators for news consumption Panagiotis (Panos) Papadopoulos Source: Statista 2
  • 3. Publish-Subscribe Model • Information providers (or channels): - Artists - Politicians - News channels or Journalists - Religious channels - Health institutes or doctors - Other communities • Users subscribe to/follow channels • Receive interesting information in a timely manner Panagiotis (Panos) Papadopoulos 3
  • 4. What about users’ privacy? • Based on the user’s channel subscriptions, the service provider learns the user’s interests: • Political preferences (e.g., by following Donald Trump) • Health issues (e.g., by following Alcoholic Anonymous) • Detailed user profiling • Can be used for many purposes • Beyond the control of the users • Privacy-sensitive channels Panagiotis (Panos) Papadopoulos 4
  • 5. Threat model • a service that enables users to receive personalized news about their topics of interest Ønews aggregators and RSS readers (e.g., Digg, Apple News, Google News, Reddit) Ømicroblogging services (e.g., Twitter, Tout, Tumblr, Weibo) • users subscribe themselves to individual channels of information that distribute content about various topics • honest-but-curious microblogging service that tracks users interests by monitoring the channels they follow • Collected user data is property of the service operator and can be later passed/sold to third parties (advertisers, DMPs) Panagiotis (Panos) Papadopoulos 5
  • 6. How can user protect their privacy? • Pseudonym or fake account • IP address, third-party cookies, can reveal user’s identity • Fake account + • Device fingerprints can correlate anonymous and eponymous browsing sessions • Fake account + + VM per browsing session • Too complex for ordinary users and mobile devices Panagiotis (Panos) Papadopoulos 6
  • 7. In a world where anonymity is (practically) not possible: Can we still protect the privacy of users of a microblogging service? Panagiotis (Panos) Papadopoulos 7
  • 8. Our approach An obfuscation-based approach to preserve the privacy of the users’ interests: 1. user clicks to follow a channel S 2. browser extension follows on the background N other noise channels (say S1, S2, ..., SN ) 3. browser extension filters-out noise channels during rendering time Panagiotis (Panos) Papadopoulos 8
  • 9. Example of the tracker’s view N=2 Real choicesNoise Panagiotis (Panos) Papadopoulos 9
  • 10. Tracker’s confidence • A priori confidence (PU pr(S)): Øbefore user U follows channel S tracker has some confidence that U is interested in S • general background information available (e.g. what percentage of people are alcoholics), • background information available specifically tailored to the profile of U (e.g. what percentage of white males between 40 and 50 are alcoholics) • A posteriori confidence (PU po(S)): Øafter U follows channel S tracker has a new confidence level that U is interested in S. ØFor example: • if U follows only S and no noise channels (PU po(S) = 1) • if U follows S along with 1 noise channel (say S1), the tracker will know that U is interested either in S or S1 (PU po(S) = PU po(S1) = 0.5) Panagiotis (Panos) Papadopoulos 10
  • 11. Our scope: • Tracker would like to: • increase as much as possible the a posteriori confidence that U is interested in S • i.e PU po (S) ≫ PU pr (S) • We would like to: • reduce a posteriori confidence to the level where it is no higher than the a priori confidence • i.e. PU po (S) = PU pr (S) Panagiotis (Panos) Papadopoulos 11
  • 12. How much noise should be added? Example: • tracker 10% confident that U is interested in channel S => PU pr(S) = 0.1 • By adding 9 noise channels (+ 1 real = 10 channels) tracker will still be about 10% confident (a posteriori confidence) that user S is the “real” channel üAdd ⌈1/PU pr (S)⌉ − 1 noise channels so PU po (S) = 1/ ⌈1/PU pr(S )⌉ ≈ PU pr (S) Panagiotis (Panos) Papadopoulos 12
  • 13. What kind of noise should be added? User U must be interested in following all noise channels with the same level of interest she is interested in following channel S. • Most likely a 80 years old metalhead uses channel of Justin Bieber as noise (don’t his songs sound a bit like that anyway? !) So: • a priori probability that U is interested in S should be • same as the a priori probability that U is interested in S1 • same as the a priori probability that U is interested in S2 • … • PU pr(S) = PU pr(S1) = PU pr(S2) = ... = PU pr(SN) Panagiotis (Panos) Papadopoulos 13
  • 14. Challenge in noise selection P^U_pr(S) = P^U_pr(S1) = P^U_pr(S2) = ... = P^U_pr(SN) • Good luck finding such channels… • A constant amount of noise… sounds like k-anonymity • What is the proper k, though? • A low value of k increases the tracker’s confidence • A high value of k imposes unnecessary load on the service
  • 15. Bucketing When user U is interested in following 1 channel from a bucket, she is instructed to follow all channels of that bucket • The size of each bucket is not constant: it depends on the lowest channel probability contained in the bucket. Example: • if user U is interested in channel S with probability 10%, the bucket should contain 10 channels • If it contains fewer, the tracker’s a posteriori confidence increases: P^U_po(S) >> 10%
  • 16. Bounds in the error estimation (1/2) So far we assumed that we always know the probability that U is interested in S: P^U_pr(S) But: • what if we do not? • or what if we do not estimate it accurately? • or what if the tracker has some background knowledge? How is this expected to influence our methodology?
  • 17. Bounds in the error estimation (2/2) Assume S is the channel of a political party (general estimate: P^U_pr(S) = 10% of voters) 1. Overestimation of the a priori probability: • The tracker knows that people like user U are 2x less likely to vote for this party (P^Au_pr(S) = 5%) • P^U_po(S) = 10% > P^Au_pr(S) = 5% -> our algorithm uses 9 noise channels instead of 19 (the tracker’s confidence increases) 2. Underestimation of the a priori probability: • The tracker knows that people like user U are 2x more likely to vote for this party (P^Au_pr(S) = 20%) • P^U_po(S) = 10% < P^Au_pr(S) = 20% -> our algorithm uses 9 noise channels instead of the 4 that would suffice (we are golden!) So: • if the a priori probability is overestimated by a factor of f, the tracker’s belief increases by no more than a factor of f • if the a priori probability is underestimated by a factor of f, the tracker’s belief does not increase
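The two cases on slide 17 can be checked numerically with a small sketch. The helper name `belief_change_factor` is hypothetical, and the sketch assumes the a posteriori confidence is 1/⌈1/p⌉ for an estimated a priori probability p, as in the noise rule of slide 12. A returned factor above 1 means the tracker learned something; a factor of 1 or less means it did not.

```python
import math
from fractions import Fraction


def belief_change_factor(estimated_prior: Fraction, actual_prior: Fraction) -> Fraction:
    """Ratio of the tracker's a posteriori confidence to its actual a priori
    confidence, when noise is sized from `estimated_prior` (assumption:
    posterior = 1 / ceil(1 / estimated_prior))."""
    posterior = Fraction(1, math.ceil(1 / estimated_prior))
    return posterior / actual_prior


if __name__ == "__main__":
    estimate = Fraction(1, 10)  # general estimate: 10% of voters
    # Overestimation by f = 2: the tracker's actual a priori is 5%
    print(belief_change_factor(estimate, Fraction(1, 20)))
    # Underestimation by f = 2: the tracker's actual a priori is 20%
    print(belief_change_factor(estimate, Fraction(1, 5)))
```

Because 1/⌈1/p⌉ never exceeds p, the factor is bounded by estimated_prior / actual_prior, which is the overestimation factor f from the slide.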
  • 18. Conclusion • We explored whether it is possible for users to retrieve sensitive information (about health issues, sexual preferences) without revealing their interests while consuming information from microblogging services or news aggregators • We proposed an obfuscation-based algorithm that aims to decrease the confidence of the service provider when it infers the user’s interests, and we analyzed its effectiveness • We showed that it is possible for users to protect their privacy even if they cannot hide their identity
  • 20. Correlation Attack The longer a user interacts with a system, the more information she reveals. Example: • user U is already following channel S (+ noise channels) • U is interested in following channel T, whose content is semantically related to the content of S • The tracker identifies the correlation and concludes that U is interested in channel S and not in any of the noise channels
  • 21. Mitigation against Correlation Attack What is the definition of semantically related channels? • P(S) is the probability that user U is interested in channel S • P(S|T) is the probability that U is actually interested in an already followed channel S, given that she now follows channel T • So two channels are semantically related if P(S|T) > P(S) Our approach: P(S|T) = P(S1|T1) = P(S2|T2) = ... = P(S(N-1)|T(N-1)) • where S1, S2, ..., S(N-1) are the already followed noise channels for sensitive channel S • and S1 and T1, S2 and T2, ..., S(N-1) and T(N-1) are also correlated channels
  • 22. Bucket creation algorithm 1. Order all channels C according to their a priori probability P^U_pr(C), in decreasing order. 2. Construct a set of channels with probabilities in the range [1/2, 1). • If there are more than two such channels, create buckets of size two. • Each bucket contains at least 2 channels with probability at least 1/2. 3. Construct a set of channels with probabilities in the range [1/3, 1/2). • If there are more than three such channels in this range, create buckets of size 3. • If, however, no three such channels are found, the algorithm tries to find four channels in the range [1/4, 1/2). • If unsuccessful, it tries to find five channels in the range [1/5, 1/2), etc.
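A simplified greedy variant of the bucket creation steps above might look like the following sketch. It is not the authors' implementation: the function name `create_buckets`, the dictionary input, and the handling of leftover channels are all assumptions. It walks the channels in decreasing order of a priori probability and closes a bucket once its size reaches ⌈1/p⌉, where p is the lowest probability in the bucket, which is the size requirement stated on the Bucketing slide.

```python
import math
from fractions import Fraction


def create_buckets(priors: dict[str, Fraction]) -> tuple[list[list[str]], list[str]]:
    """Greedy sketch: returns (complete buckets, leftover channels).

    `priors` maps a channel name to its a priori probability in (0, 1].
    """
    if any(not 0 < p <= 1 for p in priors.values()):
        raise ValueError("every prior must be in (0, 1]")

    # Step 1: order channels by a priori probability, highest first.
    ordered = sorted(priors, key=lambda c: priors[c], reverse=True)

    buckets: list[list[str]] = []
    current: list[str] = []
    for channel in ordered:
        current.append(channel)
        # The channel just added has the lowest prior in the current bucket,
        # so it determines the required bucket size.
        required = math.ceil(1 / priors[channel])
        if len(current) >= required:
            buckets.append(current)
            current = []

    # Channels that could not fill a bucket are returned separately
    # (how to treat them is left open here).
    return buckets, current


if __name__ == "__main__":
    example = {
        "a": Fraction(3, 5), "b": Fraction(1, 2),
        "c": Fraction(2, 5), "d": Fraction(1, 3), "e": Fraction(1, 3),
        "f": Fraction(1, 10),
    }
    print(create_buckets(example))
```

In the example, the two channels with probability at least 1/2 form a bucket of size two, the three channels in [1/3, 1/2) form a bucket of size three, and the single 10% channel is left over because nine more channels would be needed to hide it.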