SlideShare a Scribd company logo
Privacy-preserving Information
Sharing: Tools and Applications
Emiliano De Cristofaro
University College London
https://emilianodc.com
Koc University, Jan 2016
Prologue
Privacy-Enhancing Technologies (PETs):
Increase privacy of users, groups, and/or organizations
PETs often respond to privacy threats
Protect personally identifiable information
Support anonymous communications
Privacy-respecting data processing
Another angle: privacy as an enabler
Actively enabling scenarios otherwise impossible w/o
clear privacy guarantees
2
Sharing Information w/ Privacy
When parties with limited mutual trust willing or
required to share information
Only the required minimum amount of information should
be disclosed in the process
3
Secure Computation
Alice (a) Bob (b)
f(a,b)
f(a,b)f(a,b)
Map information sharing to f(·,·)?
Realize secure f(·,·) efficiently?
Quantify information disclosure from output of f(·,·)?
4
Private Set Intersection (PSI)
Server Client
S ={s1, ,sw} C ={c1, ,cv}
Private
Set Intersection
SÇC
5
PSI Cardinality (PSI-CA)
Server Client
S ={s1, ,sw} C ={c1, ,cv}
PSI-CA
SÇC
6
PSI w/ Data Transfer (PSI-DT)
Server Client
C ={c1, ,cv}
PSI-DT
 ),(),...,,( 11 ww datasdatasS 
SÇC = (sj,dataj ) $ci Î C :ci = sj{ }
7
PSI w/ Data Transfer
Client Server
8
Authorized Private Set Intersection
Server Client
S ={s1, ,sw} C ={c1, ,cv}
Private
Set Intersection
SÇC
What if the client
populates C with its best
guesses for S?
Client needs to prove that inputs satisfy
a policy or be authorized
Authorizations issued by appropriate authority
Authorizations need to be verified implicitly
9
Size-Hiding Private Set Intersection
Server Client
S ={s1, ,sw} C ={c1, ,cv}
Private
Set Intersection
SÇC
v w
10
Garbled-Circuit Based PSI
Very fast implementations
[HKE12, DCW13, PSZ14], possibly more 
Based on pipelining
Exploits advances in OT Extension
Do not support all functionalities
Still high communication overhead
11
Special-purpose PSI
[DT10]: scales efficiently to very large sets
First protocol with linear complexities and fast crypto
[DKT10]: extends to arbitrarily malicious adversaries
Works also for Authorized Private Set Intersection
[DJLLT11]: PSI-based database querying
Won IARPA APP challenge, basis for IARPA SPAR
[DT12]: optimized toolkit for PSI
Privately intersect sets – 2,000 items/sec
[ADT11]: size-hiding PSI
12
Oblivious Pseudo-Random Functions
Server Client
fk (x)
OPRF
k
fk (x)
x
13
OPRF-based PSI
Server Client
fk (x)
OPRF
k
Unless sj is in the intersection fk(sj)
indistinguishable from random
ci
fk (sj )
fk (ci )
14
OPRF from Blind-RSA Signatures
RSA Signatures:
PRF: fd (x)= H(sigd (x))
e×d º1mod(p-1)(q-1)(N = p×q, e), d
Sigd (x)= H(x)d
modN,
Ver(Sig(x), x)=1ÛSig(x)e
= H(x)modN
Server (d) Client (x)
(H one way function)
a = H(x)×re r Î ZN
(= H(x)d
red
)
sigd (x) = b /rb = ad
fd (x)= H(sigd (x))
15
Authorized Private Set Intersection (APSI)
Server Client
S ={s1, ,sw} C ={(c1,auth(c1)), ,(cv,auth(cv ))}
Authorized Private
Set Intersection
SÇC =
def
sj Î S $ci Î C :ci = sj Ùauth(ci )is valid{ }
Court
16
OPRF w/ Implicit Signature Verification
Server Client
fk (x)
OPRF with ISV
k sig(x)
fk (x) if Ver(sig(x), x)=1
$ otherwise
17
A simple OPRF-like with ISV
Court issues authorizations:
OPRF: fk (x)= F(H(x)2k
modN)
Sig(x)= H(x)d
modN
Server (k) Client (H(x)d)
a = H(x)d
gr r Î ZN
(b = H(x)2edk
g2rek
)
H(x)2k
= b/g2erk
b = a2×e×k
;gk
fk (x)= F(H(x)2k
)(Implicit Verification)
18
OPRF with ISV – Malicious Security
OPRF: fk (x)= F(H(x)2k
)
Server (k) Client (H(x)d)
a = H(x)d
gr r Î ZN
(b = H(x)2edk
g2rek
)
H(x)2k
= b/g2erkb = a2ek
fk (x)= F(H(x)2k
)
a = H(x)(g')r
p = ZKPK{r :a2e
/a2
=(ge
/g')2r
}
gk
p'=ZKPK{k:b=a2ek
}
19
Other Building Blocks
[DGT12]: Private Set Intersection Cardinality-only
[BDG12]: Private Sample Set Similarity
[DFT13]: Private Substring/Pattern Matching
20
Cool! So what? 
21
22
Genomics…
23
24
25
26
The Bad News
Sensitivity of human genome:
Uniquely identifies an individual
Discloses ethnicity, disease predispositions (including mental)
Progress aggravates fears of discrimination
Once leaked, it cannot be “revoked”
De-identification and obfuscation are not effective
More info:
[ADHT13] Chills and Thrills of Whole Genome Sequencing. IEEE
Computer Magazine.
27
Secure Genomics?
Privacy:
Individuals remain in control of their genome
Allow doctors/clinicians/labs to run genomic tests, while
disclosing the required minimum amount of information, i.e.:
(1) Individuals don’t disclose their entire genome
(2) Testing facilities keep test specifics (“secret sauce”)
confidential
[BBDGT11]: Secure genomics via *-PSI
Most personalized medicine tests in < 1 second
Works on Android too
28
Private Set
Intersection
Cardinality
Test Result:
(#fragments with same length)
Private RFLP-based Paternity Test
29
doctor
or lab
genome
individual
test specifics
Secure
Function
Evaluation
test result test result
• Private Set Intersection (PSI)
• Authorized PSI
• Private Pattern Matching
• Homomorphic Encryption
• Garbled Circtuis
• […]
Output reveals nothing beyond
test result
• Paternity/Ancestry Testing
• Testing of SNPs/Markers
• Compatibility Testing
• Disease Predisposition […]
30
Open Problems
Where do we store genomes?
Encryption can’t guarantee security past 30-50 yrs
Reliability and availability issues?
Cryptography
Efficiency overhead
Data representation assumptions
How much understanding required from users?
31
Collaborative Anomaly Detection
Anomaly detection is hard
Suspicious activities deliberately mimic normal behavior
But, malevolent actors often use same resources
Wouldn’t it be better if organizations
collaborated?
It’s a win-win, no?
“It is the policy of the United States
Government to increase the volume, timelines,
and quality of cyber threat information shared
with U.S. private sector entities so that these
entities may better protect and defend
themselves against cyber attacks.”
Barack Obama
2013 State of the Union Address
32
Problems with Collaborations
Trust
Will others leak my data?
Legal Liability
Will I be sued for sharing customer data?
Will others find me negligible?
Competitive concerns
Will my competitors outperform me?
Shared data quality
Will data be reliable?
33
Solution Intuition [FDB15]
t
now
Sharing
Information
w/ Privacy
Company 2
tnow
Company 1
Better Analytics
Securely assess
the benefits of
sharing
Securely assess
the risks of
sharing
34
Training Machine Learning Models
The Big Data “Hype”
Large-scale collection of contextual information often
essential to gather statistics, train machine learning models,
and extract knowledge from data
Doing so privately…
35
Efficient Private Statistics [MDD16]
Real-world problems:
1. Recommender systems for online streaming services
2. Statistics about mass transport movements
3. Traffic statistics for the Tor Network
Available tools for computing private statistics are
impractical for large streams collection
Intuition: Approximate statistics are acceptable in
some cases?
36
Preliminaries: Count-Min Sketch
An estimate of an item’s frequency in a stream
Mapping a stream of values (of length T) into a matrix of size
O(logT)
The sum of two sketches results in the sketch of the union of
the two data streams
37
ItemKNN Recommender Systems
Predict favorite TV programs based on their own
ratings and those of “similar” users
Consider N users, M programs and binary ratings
Build a co-views matrix C, where Cab is the number of
views for the pair of programs (a,b)
Compute the Similarity Matrix
Identify K-Neighbors based on the Similarity Matrix
38
Private Recommender System
We build a global matrix of co-views for training
ItemKNN in a privacy-friendly way by relying on:
Private data aggregation based on [Kursawe et al. 2011]
Count-Min Sketch to reduce overhead
System Model
Users (in groups)
Tally Server (e.g, the BBC)
39
Security & Implementation
Security
In the honest-but-curious model under the CDH assumption
Prototype implementation:
Tally as a Node.js web server
Users run in the browser or as a mobile cross-
platform application (Apache Cordova)
Transparency, ease of use, ease of deployment
40
41
User side
Server side
Accuracy
42
Challenges Ahead…
43
This slide is intentionally left blank

More Related Content

Viewers also liked

Capçelera del tresor
Capçelera del tresorCapçelera del tresor
Capçelera del tresor
Edu Alias
 
Christmas Spelling List
Christmas Spelling ListChristmas Spelling List
Christmas Spelling Liststandrewmlewis
 
HudiNi Ni FoNggiS KuNo !
HudiNi Ni FoNggiS KuNo !HudiNi Ni FoNggiS KuNo !
HudiNi Ni FoNggiS KuNo !
drcuray
 
Retos de la educacion universitaria
Retos de la educacion universitariaRetos de la educacion universitaria
Retos de la educacion universitariayonnysb
 
Umang Mehta - Emotion AI Developer Day 2016
Umang Mehta - Emotion AI Developer Day 2016Umang Mehta - Emotion AI Developer Day 2016
Umang Mehta - Emotion AI Developer Day 2016
Affectiva
 
Sistema informatic
Sistema informaticSistema informatic
Sistema informatic
Edu Alias
 
Facility Planning for Fire extinguisher
Facility Planning for Fire extinguisherFacility Planning for Fire extinguisher
Facility Planning for Fire extinguisherMahmoud Farg
 
Apple Watch Teardown and more
Apple Watch Teardown and moreApple Watch Teardown and more
Apple Watch Teardown and more
Bill Cardoso
 
Dtsi engine
Dtsi engineDtsi engine
Dtsi engine
pgayatrinaidu
 
Drilling Operations of Ground Water Wells AAPG-TUSC Tanta Univ 05-09-2016
Drilling Operations of Ground Water Wells AAPG-TUSC Tanta Univ 05-09-2016Drilling Operations of Ground Water Wells AAPG-TUSC Tanta Univ 05-09-2016
Drilling Operations of Ground Water Wells AAPG-TUSC Tanta Univ 05-09-2016
Mohamed _el_shora
 
problems in tablet manufacturing and coating
problems in tablet manufacturing and coatingproblems in tablet manufacturing and coating
problems in tablet manufacturing and coating
venkatesh thota
 
Tec 2016
Tec 2016Tec 2016
Tec 2016
panchoeri
 
Granulation ppt.
Granulation ppt.Granulation ppt.
Granulation ppt.
Namdeo Shinde
 
Personal informatic
Personal informaticPersonal informatic
Personal informatic
Alex Ramon
 

Viewers also liked (16)

Capçelera del tresor
Capçelera del tresorCapçelera del tresor
Capçelera del tresor
 
Christmas Spelling List
Christmas Spelling ListChristmas Spelling List
Christmas Spelling List
 
HudiNi Ni FoNggiS KuNo !
HudiNi Ni FoNggiS KuNo !HudiNi Ni FoNggiS KuNo !
HudiNi Ni FoNggiS KuNo !
 
Retos de la educacion universitaria
Retos de la educacion universitariaRetos de la educacion universitaria
Retos de la educacion universitaria
 
Umang Mehta - Emotion AI Developer Day 2016
Umang Mehta - Emotion AI Developer Day 2016Umang Mehta - Emotion AI Developer Day 2016
Umang Mehta - Emotion AI Developer Day 2016
 
Sistema informatic
Sistema informaticSistema informatic
Sistema informatic
 
Facility Planning for Fire extinguisher
Facility Planning for Fire extinguisherFacility Planning for Fire extinguisher
Facility Planning for Fire extinguisher
 
Apple Watch Teardown and more
Apple Watch Teardown and moreApple Watch Teardown and more
Apple Watch Teardown and more
 
Gaas
GaasGaas
Gaas
 
Semiconductordevices
SemiconductordevicesSemiconductordevices
Semiconductordevices
 
Dtsi engine
Dtsi engineDtsi engine
Dtsi engine
 
Drilling Operations of Ground Water Wells AAPG-TUSC Tanta Univ 05-09-2016
Drilling Operations of Ground Water Wells AAPG-TUSC Tanta Univ 05-09-2016Drilling Operations of Ground Water Wells AAPG-TUSC Tanta Univ 05-09-2016
Drilling Operations of Ground Water Wells AAPG-TUSC Tanta Univ 05-09-2016
 
problems in tablet manufacturing and coating
problems in tablet manufacturing and coatingproblems in tablet manufacturing and coating
problems in tablet manufacturing and coating
 
Tec 2016
Tec 2016Tec 2016
Tec 2016
 
Granulation ppt.
Granulation ppt.Granulation ppt.
Granulation ppt.
 
Personal informatic
Personal informaticPersonal informatic
Personal informatic
 

Similar to Privacy-preserving Information Sharing: Tools and Applications

Privacy-Preserving Data Analysis, Adria Gascon
Privacy-Preserving Data Analysis, Adria GasconPrivacy-Preserving Data Analysis, Adria Gascon
Privacy-Preserving Data Analysis, Adria Gascon
Ulrik Lyngs
 
Towards Statistical Queries over Distributed Private User Data
Towards Statistical Queries over Distributed Private User Data Towards Statistical Queries over Distributed Private User Data
Towards Statistical Queries over Distributed Private User Data
Serafeim Chatzopoulos
 
Detecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking DataDetecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking Data
James Sirota
 
Battista Biggio @ MCS 2015, June 29 - July 1, Guenzburg, Germany: "1.5-class ...
Battista Biggio @ MCS 2015, June 29 - July 1, Guenzburg, Germany: "1.5-class ...Battista Biggio @ MCS 2015, June 29 - July 1, Guenzburg, Germany: "1.5-class ...
Battista Biggio @ MCS 2015, June 29 - July 1, Guenzburg, Germany: "1.5-class ...
Pluribus One
 
230208 MLOps Getting from Good to Great.pptx
230208 MLOps Getting from Good to Great.pptx230208 MLOps Getting from Good to Great.pptx
230208 MLOps Getting from Good to Great.pptx
Arthur240715
 
IntelFlow: Toward adding Cyber Threat Intelligence to Software Defined Networ...
IntelFlow: Toward adding Cyber Threat Intelligence to Software Defined Networ...IntelFlow: Toward adding Cyber Threat Intelligence to Software Defined Networ...
IntelFlow: Toward adding Cyber Threat Intelligence to Software Defined Networ...
Open Networking Perú (Opennetsoft)
 
Detecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking DataDetecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking Data
DataWorks Summit
 
Qualifying exam-2015-final
Qualifying exam-2015-finalQualifying exam-2015-final
Qualifying exam-2015-final
Open Networking Perú (Opennetsoft)
 
Data-Mining
Data-MiningData-Mining
Data-Mining
Wakimul Alam
 
Approximating Attack Surfaces with Stack Traces [ICSE 15]
Approximating Attack Surfaces with Stack Traces [ICSE 15]Approximating Attack Surfaces with Stack Traces [ICSE 15]
Approximating Attack Surfaces with Stack Traces [ICSE 15]
Chris Theisen
 
Ontologies Ontop Databases
Ontologies Ontop DatabasesOntologies Ontop Databases
Ontologies Ontop Databases
Martín Rezk
 
MobiDE’2012, Phoenix, AZ, United States, 20 May, 2012
MobiDE’2012, Phoenix, AZ, United States, 20 May, 2012MobiDE’2012, Phoenix, AZ, United States, 20 May, 2012
MobiDE’2012, Phoenix, AZ, United States, 20 May, 2012
Charith Perera
 
Data attribute security and privacy in Collaborative distributed database Pub...
Data attribute security and privacy in Collaborative distributed database Pub...Data attribute security and privacy in Collaborative distributed database Pub...
Data attribute security and privacy in Collaborative distributed database Pub...
International Journal of Engineering Inventions www.ijeijournal.com
 
Access control in decentralized online social networks applying a policy hidi...
Access control in decentralized online social networks applying a policy hidi...Access control in decentralized online social networks applying a policy hidi...
Access control in decentralized online social networks applying a policy hidi...
IGEEKS TECHNOLOGIES
 
Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021 Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021
Sanjay Padhi, Ph.D
 
NETWORJS3.pdf
NETWORJS3.pdfNETWORJS3.pdf
NETWORJS3.pdf
neeshukumari4
 
zkStudyClub: Zero-Knowledge Proofs Security, in Practice [JP Aumasson, Taurus]
zkStudyClub: Zero-Knowledge Proofs Security, in Practice [JP Aumasson, Taurus]zkStudyClub: Zero-Knowledge Proofs Security, in Practice [JP Aumasson, Taurus]
zkStudyClub: Zero-Knowledge Proofs Security, in Practice [JP Aumasson, Taurus]
Alex Pruden
 
Building and Measuring Privacy-Preserving Mobility Analytics
Building and Measuring Privacy-Preserving Mobility AnalyticsBuilding and Measuring Privacy-Preserving Mobility Analytics
Building and Measuring Privacy-Preserving Mobility Analytics
Emiliano De Cristofaro
 
Introduction to the Open Source HPCC Systems Platform by Arjuna Chala
Introduction to the Open Source HPCC Systems Platform by Arjuna ChalaIntroduction to the Open Source HPCC Systems Platform by Arjuna Chala
Introduction to the Open Source HPCC Systems Platform by Arjuna Chala
HPCC Systems
 

Similar to Privacy-preserving Information Sharing: Tools and Applications (20)

Privacy-Preserving Data Analysis, Adria Gascon
Privacy-Preserving Data Analysis, Adria GasconPrivacy-Preserving Data Analysis, Adria Gascon
Privacy-Preserving Data Analysis, Adria Gascon
 
Towards Statistical Queries over Distributed Private User Data
Towards Statistical Queries over Distributed Private User Data Towards Statistical Queries over Distributed Private User Data
Towards Statistical Queries over Distributed Private User Data
 
Detecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking DataDetecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking Data
 
Battista Biggio @ MCS 2015, June 29 - July 1, Guenzburg, Germany: "1.5-class ...
Battista Biggio @ MCS 2015, June 29 - July 1, Guenzburg, Germany: "1.5-class ...Battista Biggio @ MCS 2015, June 29 - July 1, Guenzburg, Germany: "1.5-class ...
Battista Biggio @ MCS 2015, June 29 - July 1, Guenzburg, Germany: "1.5-class ...
 
230208 MLOps Getting from Good to Great.pptx
230208 MLOps Getting from Good to Great.pptx230208 MLOps Getting from Good to Great.pptx
230208 MLOps Getting from Good to Great.pptx
 
IntelFlow: Toward adding Cyber Threat Intelligence to Software Defined Networ...
IntelFlow: Toward adding Cyber Threat Intelligence to Software Defined Networ...IntelFlow: Toward adding Cyber Threat Intelligence to Software Defined Networ...
IntelFlow: Toward adding Cyber Threat Intelligence to Software Defined Networ...
 
Detecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking DataDetecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking Data
 
Qualifying exam-2015-final
Qualifying exam-2015-finalQualifying exam-2015-final
Qualifying exam-2015-final
 
Data-Mining
Data-MiningData-Mining
Data-Mining
 
Approximating Attack Surfaces with Stack Traces [ICSE 15]
Approximating Attack Surfaces with Stack Traces [ICSE 15]Approximating Attack Surfaces with Stack Traces [ICSE 15]
Approximating Attack Surfaces with Stack Traces [ICSE 15]
 
Ontologies Ontop Databases
Ontologies Ontop DatabasesOntologies Ontop Databases
Ontologies Ontop Databases
 
MobiDE’2012, Phoenix, AZ, United States, 20 May, 2012
MobiDE’2012, Phoenix, AZ, United States, 20 May, 2012MobiDE’2012, Phoenix, AZ, United States, 20 May, 2012
MobiDE’2012, Phoenix, AZ, United States, 20 May, 2012
 
Data attribute security and privacy in Collaborative distributed database Pub...
Data attribute security and privacy in Collaborative distributed database Pub...Data attribute security and privacy in Collaborative distributed database Pub...
Data attribute security and privacy in Collaborative distributed database Pub...
 
Access control in decentralized online social networks applying a policy hidi...
Access control in decentralized online social networks applying a policy hidi...Access control in decentralized online social networks applying a policy hidi...
Access control in decentralized online social networks applying a policy hidi...
 
Dynamix IoT 2012
Dynamix IoT 2012Dynamix IoT 2012
Dynamix IoT 2012
 
Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021 Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021
 
NETWORJS3.pdf
NETWORJS3.pdfNETWORJS3.pdf
NETWORJS3.pdf
 
zkStudyClub: Zero-Knowledge Proofs Security, in Practice [JP Aumasson, Taurus]
zkStudyClub: Zero-Knowledge Proofs Security, in Practice [JP Aumasson, Taurus]zkStudyClub: Zero-Knowledge Proofs Security, in Practice [JP Aumasson, Taurus]
zkStudyClub: Zero-Knowledge Proofs Security, in Practice [JP Aumasson, Taurus]
 
Building and Measuring Privacy-Preserving Mobility Analytics
Building and Measuring Privacy-Preserving Mobility AnalyticsBuilding and Measuring Privacy-Preserving Mobility Analytics
Building and Measuring Privacy-Preserving Mobility Analytics
 
Introduction to the Open Source HPCC Systems Platform by Arjuna Chala
Introduction to the Open Source HPCC Systems Platform by Arjuna ChalaIntroduction to the Open Source HPCC Systems Platform by Arjuna Chala
Introduction to the Open Source HPCC Systems Platform by Arjuna Chala
 

More from Emiliano De Cristofaro

The Genomics Revolution: The Good, The Bad, and The Ugly (Confessions of a Pr...
The Genomics Revolution: The Good, The Bad, and The Ugly (Confessions of a Pr...The Genomics Revolution: The Good, The Bad, and The Ugly (Confessions of a Pr...
The Genomics Revolution: The Good, The Bad, and The Ugly (Confessions of a Pr...
Emiliano De Cristofaro
 
A Measurement Study of 4chan’s Politically Incorrect Forum and Its Effects on...
A Measurement Study of 4chan’s Politically Incorrect Forum and Its Effects on...A Measurement Study of 4chan’s Politically Incorrect Forum and Its Effects on...
A Measurement Study of 4chan’s Politically Incorrect Forum and Its Effects on...
Emiliano De Cristofaro
 
MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral ...
MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral ...MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral ...
MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral ...
Emiliano De Cristofaro
 
The Genomics Revolution: The Good, The Bad, The Ugly
The Genomics Revolution: The Good, The Bad, The UglyThe Genomics Revolution: The Good, The Bad, The Ugly
The Genomics Revolution: The Good, The Bad, The Ugly
Emiliano De Cristofaro
 
Understanding, Characterizing, and Detecting Facebook Like Farms
Understanding, Characterizing, and Detecting Facebook Like FarmsUnderstanding, Characterizing, and Detecting Facebook Like Farms
Understanding, Characterizing, and Detecting Facebook Like Farms
Emiliano De Cristofaro
 
The Genomics Revolution: The Good, The Bad, and The Ugly (UEOP16 Keynote)
The Genomics Revolution: The Good, The Bad, and The Ugly (UEOP16 Keynote)The Genomics Revolution: The Good, The Bad, and The Ugly (UEOP16 Keynote)
The Genomics Revolution: The Good, The Bad, and The Ugly (UEOP16 Keynote)
Emiliano De Cristofaro
 
The Genomics Revolution: The Good, The Bad, and The Ugly
The Genomics Revolution: The Good, The Bad, and The UglyThe Genomics Revolution: The Good, The Bad, and The Ugly
The Genomics Revolution: The Good, The Bad, and The Ugly
Emiliano De Cristofaro
 
The Chills and Thrills of Whole Genome Sequencing
The Chills and Thrills of Whole Genome SequencingThe Chills and Thrills of Whole Genome Sequencing
The Chills and Thrills of Whole Genome Sequencing
Emiliano De Cristofaro
 

More from Emiliano De Cristofaro (8)

The Genomics Revolution: The Good, The Bad, and The Ugly (Confessions of a Pr...
The Genomics Revolution: The Good, The Bad, and The Ugly (Confessions of a Pr...The Genomics Revolution: The Good, The Bad, and The Ugly (Confessions of a Pr...
The Genomics Revolution: The Good, The Bad, and The Ugly (Confessions of a Pr...
 
A Measurement Study of 4chan’s Politically Incorrect Forum and Its Effects on...
A Measurement Study of 4chan’s Politically Incorrect Forum and Its Effects on...A Measurement Study of 4chan’s Politically Incorrect Forum and Its Effects on...
A Measurement Study of 4chan’s Politically Incorrect Forum and Its Effects on...
 
MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral ...
MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral ...MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral ...
MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral ...
 
The Genomics Revolution: The Good, The Bad, The Ugly
The Genomics Revolution: The Good, The Bad, The UglyThe Genomics Revolution: The Good, The Bad, The Ugly
The Genomics Revolution: The Good, The Bad, The Ugly
 
Understanding, Characterizing, and Detecting Facebook Like Farms
Understanding, Characterizing, and Detecting Facebook Like FarmsUnderstanding, Characterizing, and Detecting Facebook Like Farms
Understanding, Characterizing, and Detecting Facebook Like Farms
 
The Genomics Revolution: The Good, The Bad, and The Ugly (UEOP16 Keynote)
The Genomics Revolution: The Good, The Bad, and The Ugly (UEOP16 Keynote)The Genomics Revolution: The Good, The Bad, and The Ugly (UEOP16 Keynote)
The Genomics Revolution: The Good, The Bad, and The Ugly (UEOP16 Keynote)
 
The Genomics Revolution: The Good, The Bad, and The Ugly
The Genomics Revolution: The Good, The Bad, and The UglyThe Genomics Revolution: The Good, The Bad, and The Ugly
The Genomics Revolution: The Good, The Bad, and The Ugly
 
The Chills and Thrills of Whole Genome Sequencing
The Chills and Thrills of Whole Genome SequencingThe Chills and Thrills of Whole Genome Sequencing
The Chills and Thrills of Whole Genome Sequencing
 

Recently uploaded

Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptxBody fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
muralinath2
 
Structures and textures of metamorphic rocks
Structures and textures of metamorphic rocksStructures and textures of metamorphic rocks
Structures and textures of metamorphic rocks
kumarmathi863
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
AADYARAJPANDEY1
 
Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
SAMIR PANDA
 
platelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptxplatelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptx
muralinath2
 
Lab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerinLab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerin
ossaicprecious19
 
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
muralinath2
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
Nistarini College, Purulia (W.B) India
 
insect taxonomy importance systematics and classification
insect taxonomy importance systematics and classificationinsect taxonomy importance systematics and classification
insect taxonomy importance systematics and classification
anitaento25
 
NuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final versionNuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final version
pablovgd
 
filosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptxfilosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptx
IvanMallco1
 
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
Sérgio Sacani
 
Structural Classification Of Protein (SCOP)
Structural Classification Of Protein  (SCOP)Structural Classification Of Protein  (SCOP)
Structural Classification Of Protein (SCOP)
aishnasrivastava
 
general properties of oerganologametal.ppt
general properties of oerganologametal.pptgeneral properties of oerganologametal.ppt
general properties of oerganologametal.ppt
IqrimaNabilatulhusni
 
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATIONPRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
ChetanK57
 
Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...
Sérgio Sacani
 
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
Scintica Instrumentation
 
Comparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebratesComparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebrates
sachin783648
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
silvermistyshot
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
Lokesh Patil
 

Recently uploaded (20)

Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptxBody fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
 
Structures and textures of metamorphic rocks
Structures and textures of metamorphic rocksStructures and textures of metamorphic rocks
Structures and textures of metamorphic rocks
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
 
Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
 
platelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptxplatelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptx
 
Lab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerinLab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerin
 
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
 
insect taxonomy importance systematics and classification
insect taxonomy importance systematics and classificationinsect taxonomy importance systematics and classification
insect taxonomy importance systematics and classification
 
NuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final versionNuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final version
 
filosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptxfilosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptx
 
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
 
Structural Classification Of Protein (SCOP)
Structural Classification Of Protein  (SCOP)Structural Classification Of Protein  (SCOP)
Structural Classification Of Protein (SCOP)
 
general properties of oerganologametal.ppt
general properties of oerganologametal.pptgeneral properties of oerganologametal.ppt
general properties of oerganologametal.ppt
 
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATIONPRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
 
Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...
 
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
 
Comparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebratesComparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebrates
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
 

Privacy-preserving Information Sharing: Tools and Applications

  • 1. Privacy-preserving Information Sharing: Tools and Applications Emiliano De Cristofaro University College London https://emilianodc.com Koc University, Jan 2016
  • 2. Prologue Privacy-Enhancing Technologies (PETs): Increase privacy of users, groups, and/or organizations PETs often respond to privacy threats Protect personally identifiable information Support anonymous communications Privacy-respecting data processing Another angle: privacy as an enabler Actively enabling scenarios otherwise impossible w/o clear privacy guarantees 2
  • 3. Sharing Information w/ Privacy When parties with limited mutual trust willing or required to share information Only the required minimum amount of information should be disclosed in the process 3
  • 4. Secure Computation Alice (a) Bob (b) f(a,b) f(a,b)f(a,b) Map information sharing to f(·,·)? Realize secure f(·,·) efficiently? Quantify information disclosure from output of f(·,·)? 4
  • 5. Private Set Intersection (PSI) Server Client S ={s1, ,sw} C ={c1, ,cv} Private Set Intersection SÇC 5
  • 6. PSI Cardinality (PSI-CA) Server Client S ={s1, ,sw} C ={c1, ,cv} PSI-CA SÇC 6
  • 7. PSI w/ Data Transfer (PSI-DT) Server Client C ={c1, ,cv} PSI-DT  ),(),...,,( 11 ww datasdatasS  SÇC = (sj,dataj ) $ci Î C :ci = sj{ } 7
  • 8. PSI w/ Data Transfer Client Server 8
  • 9. Authorized Private Set Intersection Server Client S ={s1, ,sw} C ={c1, ,cv} Private Set Intersection SÇC What if the client populates C with its best guesses for S? Client needs to prove that inputs satisfy a policy or be authorized Authorizations issued by appropriate authority Authorizations need to be verified implicitly 9
  • 10. Size-Hiding Private Set Intersection Server Client S ={s1, ,sw} C ={c1, ,cv} Private Set Intersection SÇC v w 10
  • 11. Garbled-Circuit Based PSI Very fast implementations [HKE12, DCW13, PSZ14], possibly more  Based on pipelining Exploits advances in OT Extension Do not support all functionalities Still high communication overhead 11
  • 12. Special-purpose PSI [DT10]: scales efficiently to very large sets First protocol with linear complexities and fast crypto [DKT10]: extends to arbitrarily malicious adversaries Works also for Authorized Private Set Intersection [DJLLT11]: PSI-based database querying Won IARPA APP challenge, basis for IARPA SPAR [DT12]: optimized toolkit for PSI Privately intersect sets – 2,000 items/sec [ADT11]: size-hiding PSI 12
  • 13. Oblivious Pseudo-Random Functions Server Client fk (x) OPRF k fk (x) x 13
  • 14. OPRF-based PSI Server Client fk (x) OPRF k Unless sj is in the intersection fk(sj) indistinguishable from random ci fk (sj ) fk (ci ) 14
  • 15. OPRF from Blind-RSA Signatures RSA Signatures: PRF: fd (x)= H(sigd (x)) e×d º1mod(p-1)(q-1)(N = p×q, e), d Sigd (x)= H(x)d modN, Ver(Sig(x), x)=1ÛSig(x)e = H(x)modN Server (d) Client (x) (H one way function) a = H(x)×re r Î ZN (= H(x)d red ) sigd (x) = b /rb = ad fd (x)= H(sigd (x)) 15
  • 16. Authorized Private Set Intersection (APSI) Server Client S ={s1, ,sw} C ={(c1,auth(c1)), ,(cv,auth(cv ))} Authorized Private Set Intersection SÇC = def sj Î S $ci Î C :ci = sj Ùauth(ci )is valid{ } Court 16
  • 17. OPRF w/ Implicit Signature Verification Server Client fk (x) OPRF with ISV k sig(x) fk (x) if Ver(sig(x), x)=1 $ otherwise 17
  • 18. A simple OPRF-like with ISV Court issues authorizations: OPRF: fk (x)= F(H(x)2k modN) Sig(x)= H(x)d modN Server (k) Client (H(x)d) a = H(x)d gr r Î ZN (b = H(x)2edk g2rek ) H(x)2k = b/g2erk b = a2×e×k ;gk fk (x)= F(H(x)2k )(Implicit Verification) 18
  • 19. OPRF with ISV – Malicious Security OPRF: fk (x)= F(H(x)2k ) Server (k) Client (H(x)d) a = H(x)d gr r Î ZN (b = H(x)2edk g2rek ) H(x)2k = b/g2erkb = a2ek fk (x)= F(H(x)2k ) a = H(x)(g')r p = ZKPK{r :a2e /a2 =(ge /g')2r } gk p'=ZKPK{k:b=a2ek } 19
  • 20. Other Building Blocks [DGT12]: Private Set Intersection Cardinality-only [BDG12]: Private Sample Set Similarity [DFT13]: Private Substring/Pattern Matching 20
  • 21. Cool! So what?  21
  • 23. 23
  • 24. 24
  • 25. 25
  • 26. 26
  • 27. The Bad News Sensitivity of human genome: Uniquely identifies an individual Discloses ethnicity, disease predispositions (including mental) Progress aggravates fears of discrimination Once leaked, it cannot be “revoked” De-identification and obfuscation are not effective More info: [ADHT13] Chills and Thrills of Whole Genome Sequencing. IEEE Computer Magazine. 27
  • 28. Secure Genomics? Privacy: Individuals remain in control of their genome Allow doctors/clinicians/labs to run genomic tests, while disclosing the required minimum amount of information, i.e.: (1) Individuals don’t disclose their entire genome (2) Testing facilities keep test specifics (“secret sauce”) confidential [BBDGT11]: Secure genomics via *-PSI Most personalized medicine tests in < 1 second Works on Android too 28
  • 29. Private Set Intersection Cardinality Test Result: (#fragments with same length) Private RFLP-based Paternity Test 29
  • 30. doctor or lab genome individual test specifics Secure Function Evaluation test result test result • Private Set Intersection (PSI) • Authorized PSI • Private Pattern Matching • Homomorphic Encryption • Garbled Circtuis • […] Output reveals nothing beyond test result • Paternity/Ancestry Testing • Testing of SNPs/Markers • Compatibility Testing • Disease Predisposition […] 30
  • 31. Open Problems Where do we store genomes? Encryption can’t guarantee security past 30-50 yrs Reliability and availability issues? Cryptography Efficiency overhead Data representation assumptions How much understanding required from users? 31
  • 32. Collaborative Anomaly Detection Anomaly detection is hard Suspicious activities deliberately mimic normal behavior But, malevolent actors often use same resources Wouldn’t it be better if organizations collaborated? It’s a win-win, no? “It is the policy of the United States Government to increase the volume, timelines, and quality of cyber threat information shared with U.S. private sector entities so that these entities may better protect and defend themselves against cyber attacks.” Barack Obama 2013 State of the Union Address 32
  • 33. Problems with Collaborations Trust Will others leak my data? Legal Liability Will I be sued for sharing customer data? Will others find me negligible? Competitive concerns Will my competitors outperform me? Shared data quality Will data be reliable? 33
  • 34. Solution Intuition [FDB15] t now Sharing Information w/ Privacy Company 2 tnow Company 1 Better Analytics Securely assess the benefits of sharing Securely assess the risks of sharing 34
  • 35. Training Machine Learning Models The Big Data “Hype” Large-scale collection of contextual information often essential to gather statistics, train machine learning models, and extract knowledge from data Doing so privately… 35
  • 36. Efficient Private Statistics [MDD16] Real-world problems: 1. Recommender systems for online streaming services 2. Statistics about mass transport movements 3. Traffic statistics for the Tor Network Available tools for computing private statistics are impractical for large streams collection Intuition: Approximate statistics are acceptable in some cases? 36
  • 37. Preliminaries: Count-Min Sketch An estimate of an item’s frequency in a stream Mapping a stream of values (of length T) into a matrix of size O(logT) The sum of two sketches results in the sketch of the union of the two data streams 37
  • 38. ItemKNN Recommender Systems Predict favorite TV programs based on their own ratings and those of “similar” users Consider N users, M programs and binary ratings Build a co-views matrix C, where Cab is the number of views for the pair of programs (a,b) Compute the Similarity Matrix Identify K-Neighbors based on the Similarity Matrix 38
  • 39. Private Recommender System We build a global matrix of co-views for training ItemKNN in a privacy-friendly way by relying on: Private data aggregation based on [Kursawe et al. 2011] Count-Min Sketch to reduce overhead System Model Users (in groups) Tally Server (e.g, the BBC) 39
  • 40. Security & Implementation Security In the honest-but-curious model under the CDH assumption Prototype implementation: Tally as a Node.js web server Users run in the browser or as a mobile cross- platform application (Apache Cordova) Transparency, ease of use, ease of deployment 40
  • 43. Challenges Ahead… 43 This slide is intentionally left blank

Editor's Notes

  1. using pipeling and multi-threading
  2. There are a few possible ways to instantiate OPRFs and apply them to PSI. I’d like to show one that is relatively simple and, incidentally, turns out to yield the most efficient PSI protocol. It’s based on Blind RSA. First, notice that, in the random oracle model, the hash of a signature is a PRF. Therefore, we can construct an OPRF using Blind RSA Signatures, whereby the client obtains, from the server, a signature on a message x without disclosing x and without the server disclosing the RSA private signing key to the client. We can do so with this simple two round protocol.
  3. ZKPK of equality…
  4. using pipeling and multi-threading
  5. So progress in sequencing can help researchers gain better understanding of diseases and their relationship with genetic features, but it is also already bearing fruits in the clinical setting as we hear more and more news of how doctors have been able to diagnose and/or cure patients thanks to whole genome sequencing
  6. Another important point I’d like to make is that the genomics revolution is not only happening within the walls of our research labs and our hospitals. Take as an example the case of Angelina Jolie, who was diagnosed … her choice of undergoing … really put genetic testing in the spotlight, if not for the debate it generated
  7. Another important phenomenon is the emergence of direct-to-consumer genomics, i.e., genetic tests that are marketed directly to customers, without necessarily involving a doctor or a health professional. One of the most popular companies in this market is 23andme, a US company now operating in the UK as well, that provides individuals with reports on a number of genetic traits, inherited risk factors, etc.
  8. Not clear if cryptographic techniques can scale to (full) genomes The ultimate ID, invariant, the most precise way of authentication, it survives time, even death
  9. Different sizes
  10. Users keep sequenced genomes (encrypted) But still allow doctors/clinicians to run tests Only disclose minimum amount of information E.g. privacy-preserving version of paternity, ancestry, genealogy, personalized medicine tests