SlideShare a Scribd company logo
Privacy-Preserving Data Analysis
Adria Gascon
The Alan Turing Institute & Warwick University
Based on joint work with Borja Balle, Phillipp Schoppmann,
Mariana Raykova, Jack Doerner, Samee Zahur, David
Evans, Age Chapman, Alan Davoust, Peter Buneman
What analysis on what data?

Fined grained private data, e.g. tracking for targeted
advertising, credit scoring...

Data held by several organisations, e.g. hospitals?

Data held by individuals, e.g. on their phones?
Who cares?

Data owners (of course)

Data controllers
Adria Gascon Phillipp Schoppmann Borja Balle
Mariana Raykova Jack Doerner Samee Zahur David Evans
Privacy Preserving
Distributed Linear Regression on
High-Dimensional Data
Motivation
Treatment
Outcome
Medical Data
Census Data
Financial Data
Atr. 1 Atr. 2 … Atr. 4 Atr. 5 … Atr. 7 Atr. 8 …
-1.0 0 54.3 … North 34 … 5 1 …
1.5 1 0.6 … South 12 … 10 0 …
-0.3 1 16.0 … East 56 … 2 0 …
0.7 0 35.0 … Centre 67 … 15 1 …
3.1 1 20.2 … West 29 … 7 1 …
Note: This is vertcally-parttoned data; similar problems with horizontally-parttoned
Private Multi-Party Machine Learning
Assumptons
• Parameters of the model will be received by all partes
• Partes can engage in on-line secure communicatons
• External partes might be used to outsource
computaton or initalize cryptographic primitves
Problem
• Two or more partes want to jointly learn a model of
their data
• But they can’t share their private data with other partes
The Trusted Party “Solution”
(secure channel)
(secure channel)
(secure channel)
Trusted
Party
Receives plain-text data, runs
algorithm, returns result to partes
?
The Trusted Party assumpton:
• Introduces a single point of failure
• Relies on weak incentves
• Requires agreement between all data providers
=> Useful but unrealistc. Maybe can be simulated?
Secure Multi-Party Computation (MPC)
Public:
Private:
(party i)
Goal:
Compute f in a way that each party
learns y (and nothing else!)
Our Contribution
A PMPML system for vertcally parttoned linear regression
Features:
• Scalable to millions of records and hundreds of dimensions
• Formal privacy guarantees (semi-honest security)
• Open source implementaton
Tools:
• Combine standard MPC constructons (GC, OT, TI, …)
• Efcient private inner product protocols
• Conjugate gradient descent robust to fxed-point encodings
FAQ: Why is PMPML…
Excitng?
Can provide access to previously ”locked” data
Hard?
Privacy is tricky to formalize, hard to implement,
and inherently interdisciplinary
Worth?
Beter models while avoiding legal risks and bad
PR
Read It, Use It
https://github.com/schoppmp/linreg-mpc
http://eprint.iacr.org/2016/892PETS’17
Adria Gascon Phillipp Schoppmann Borja Balle
Private Document Classifcaton in
Federated Databases
Secure document classification
Secure document classification
Adria Gascon James Bell Tejas Kulkarni
Privacy-Preserving Distributed
Hypothesis Testng
● Drop off in Manhattan?
● Tip over 25 %?
● Was it a short journey?
● Was payment method
credit card?
Drop-off in Manhattan and tip over 25%
are significantly correlated events.
But this result is differentially private, so I cannot easily tell
if a given journey was included in the training dataset or not.
Problem: model-check security properties on
private source code.
Privacy-Preserving Model Checking
●
Problem: Check security properties on (private)
source code.
●
“Public” equivalent: MOPS [1], and some others.
– Security property expressed as regular expression over
sequences of instructions
– Find all paths in control flow graph that match path
●
Application of Private Regular Path Queries
[1] Hao Chen and David Wagner. 2002. MOPS: an infrastructure for examining security properties of software.
In Proceedings of the 9th ACM conference on Computer and communications security (CCS '02), Vijay Atluri
(Ed.). ACM, New York, NY, USA, 235-244. DOI=http://dx.doi.org/10.1145/586110.586142
Privacy-Preserving Model Checking
Secure queries on graph data
Simple Example
1 #include <stdio.h>
2 #include <sys/types.h>
3 #include <unistd.h>
4 #include <pwd.h>
5
6 void drop_priv()
7 {
8 struct passwd *passwd;
9
10 if ((passwd = getpwuid(getuid())) == NULL)
11 {
12 printf("getpwuid() failed");
13 return;
14 }
15 printf("Drop user %s's privilegen", passwd-
>pw_name);
16 seteuid(getuid());
17 }
18
19 int main(int argc, char *argv[])
20 {
21 drop_priv();
22 printf("About to execn");
hello.c
Simple Example
Control flow graph Security property FSA
(system call with root priviledge)
Interesting case:
distributed private graph (code)
main.c library.c
Related Work
Verification Across Intellectual Property Boundaries [2]:
[2] Chaki, Sagar, Christian Schallhart, and Helmut Veith. "Verification across intellectual property boundaries."
ACM Transactions on Software Engineering and Methodology (TOSEM) 22.2 (2013): 15.
Related Work
Verification Across Intellectual Property Boundaries [2]
They also say...
“While we are aware of advanced methods such as secure multiparty computation
[Goldreich 2002] and zeroknowledge proofs [Ben-Or et al. 1988], we believe that they are
impracticable for our problem, as such methods cannot be easily wrapped over given
validation tools. Finally, we believe that any advanced method without an intuitive proof for
its secrecy will be heavily opposed by the supplier—and might therefore be hard to
establish in practice.”
Case study: thttpd
●
Tiny http server
●
2 main modules (thttp.c and libhttp.c)
thttp.c
(2k loc)
libhttp.c
(4k loc)
thttpd control flow graph...
●
2 main modules only
●
functions are disconnected
thttpd: next steps
●
Adapt private Regular Path Queries work for
pushdown automata
●
Find some bugs.
●
Write paper.
●
Voila!
Thanks!

More Related Content

What's hot

BSidesLV -The SOC Counter ATT&CK
BSidesLV -The SOC Counter ATT&CKBSidesLV -The SOC Counter ATT&CK
BSidesLV -The SOC Counter ATT&CK
Mathieu Saulnier
 
ATT&CKcon Power Hour - ATT&CK-onomics - gert-jan bruggink
ATT&CKcon Power Hour - ATT&CK-onomics - gert-jan brugginkATT&CKcon Power Hour - ATT&CK-onomics - gert-jan bruggink
ATT&CKcon Power Hour - ATT&CK-onomics - gert-jan bruggink
Gert-Jan Bruggink
 
Leveraging MITRE ATT&CK - Speaking the Common Language
Leveraging MITRE ATT&CK - Speaking the Common LanguageLeveraging MITRE ATT&CK - Speaking the Common Language
Leveraging MITRE ATT&CK - Speaking the Common Language
Erik Van Buggenhout
 
Tracking Noisy Behavior and Risk-Based Alerting with ATT&CK
Tracking Noisy Behavior and Risk-Based Alerting with ATT&CKTracking Noisy Behavior and Risk-Based Alerting with ATT&CK
Tracking Noisy Behavior and Risk-Based Alerting with ATT&CK
MITRE ATT&CK
 
SOC2016 - The Investigation Labyrinth
SOC2016 - The Investigation LabyrinthSOC2016 - The Investigation Labyrinth
SOC2016 - The Investigation Labyrinth
chrissanders88
 
Threat hunting in cyber world
Threat hunting in cyber worldThreat hunting in cyber world
Threat hunting in cyber world
Akash Sarode
 
Resistance Isn't Futile: A Practical Approach to Threat Modeling
Resistance Isn't Futile: A Practical Approach to Threat ModelingResistance Isn't Futile: A Practical Approach to Threat Modeling
Resistance Isn't Futile: A Practical Approach to Threat Modeling
Katie Nickels
 

What's hot (7)

BSidesLV -The SOC Counter ATT&CK
BSidesLV -The SOC Counter ATT&CKBSidesLV -The SOC Counter ATT&CK
BSidesLV -The SOC Counter ATT&CK
 
ATT&CKcon Power Hour - ATT&CK-onomics - gert-jan bruggink
ATT&CKcon Power Hour - ATT&CK-onomics - gert-jan brugginkATT&CKcon Power Hour - ATT&CK-onomics - gert-jan bruggink
ATT&CKcon Power Hour - ATT&CK-onomics - gert-jan bruggink
 
Leveraging MITRE ATT&CK - Speaking the Common Language
Leveraging MITRE ATT&CK - Speaking the Common LanguageLeveraging MITRE ATT&CK - Speaking the Common Language
Leveraging MITRE ATT&CK - Speaking the Common Language
 
Tracking Noisy Behavior and Risk-Based Alerting with ATT&CK
Tracking Noisy Behavior and Risk-Based Alerting with ATT&CKTracking Noisy Behavior and Risk-Based Alerting with ATT&CK
Tracking Noisy Behavior and Risk-Based Alerting with ATT&CK
 
SOC2016 - The Investigation Labyrinth
SOC2016 - The Investigation LabyrinthSOC2016 - The Investigation Labyrinth
SOC2016 - The Investigation Labyrinth
 
Threat hunting in cyber world
Threat hunting in cyber worldThreat hunting in cyber world
Threat hunting in cyber world
 
Resistance Isn't Futile: A Practical Approach to Threat Modeling
Resistance Isn't Futile: A Practical Approach to Threat ModelingResistance Isn't Futile: A Practical Approach to Threat Modeling
Resistance Isn't Futile: A Practical Approach to Threat Modeling
 

Similar to Privacy-Preserving Data Analysis, Adria Gascon

SplunkLive! - Splunk for Security
SplunkLive! - Splunk for SecuritySplunkLive! - Splunk for Security
SplunkLive! - Splunk for Security
Splunk
 
Operationalizing Security Intelligence
Operationalizing Security IntelligenceOperationalizing Security Intelligence
Operationalizing Security Intelligence
Splunk
 
Splunk for Security Breakout Session
Splunk for Security Breakout SessionSplunk for Security Breakout Session
Splunk for Security Breakout Session
Splunk
 
Security Breakout Session
Security Breakout Session Security Breakout Session
Security Breakout Session
Splunk
 
First Responders Course - Session 3 - Monitoring and Controlling Incident Costs
First Responders Course - Session 3 - Monitoring and Controlling Incident CostsFirst Responders Course - Session 3 - Monitoring and Controlling Incident Costs
First Responders Course - Session 3 - Monitoring and Controlling Incident Costs
Phil Huggins FBCS CITP
 
Privacy-preserving Information Sharing: Tools and Applications
Privacy-preserving Information Sharing: Tools and ApplicationsPrivacy-preserving Information Sharing: Tools and Applications
Privacy-preserving Information Sharing: Tools and Applications
Emiliano De Cristofaro
 
Technical track chris calvert-1 30 pm-issa conference-calvert
Technical track chris calvert-1 30 pm-issa conference-calvertTechnical track chris calvert-1 30 pm-issa conference-calvert
Technical track chris calvert-1 30 pm-issa conference-calvert
ISSA LA
 
Big security for big data
Big security for big dataBig security for big data
Big security for big data
Giuliano Tavaroli
 
Security Analytics for Data Discovery - Closing the SIEM Gap
Security Analytics for Data Discovery - Closing the SIEM GapSecurity Analytics for Data Discovery - Closing the SIEM Gap
Security Analytics for Data Discovery - Closing the SIEM Gap
Eric Johansen, CISSP
 
Meetup presenation 06192013
Meetup presenation 06192013 Meetup presenation 06192013
Meetup presenation 06192013
Sqrrl
 
Technology Threat Prediction
Technology Threat PredictionTechnology Threat Prediction
Technology Threat Prediction
H4Diadmin
 
Splunk Discovery: Warsaw 2018 - Solve Your Security Challenges with Splunk En...
Splunk Discovery: Warsaw 2018 - Solve Your Security Challenges with Splunk En...Splunk Discovery: Warsaw 2018 - Solve Your Security Challenges with Splunk En...
Splunk Discovery: Warsaw 2018 - Solve Your Security Challenges with Splunk En...
Splunk
 
A guide to Sustainable Cyber Security
A guide to Sustainable Cyber SecurityA guide to Sustainable Cyber Security
A guide to Sustainable Cyber Security
Ernest Staats
 
IRJET- Data Leak Prevention System: A Survey
IRJET-  	  Data Leak Prevention System: A SurveyIRJET-  	  Data Leak Prevention System: A Survey
IRJET- Data Leak Prevention System: A Survey
IRJET Journal
 
Securing IoT medical devices
Securing IoT medical devicesSecuring IoT medical devices
Securing IoT medical devices
Benjamin Biwer
 
Data Democratization at Nubank
 Data Democratization at Nubank Data Democratization at Nubank
Data Democratization at Nubank
Databricks
 
Comparative Analysis of Digital Forensic Extraction Tools
Comparative Analysis of Digital Forensic Extraction ToolsComparative Analysis of Digital Forensic Extraction Tools
Comparative Analysis of Digital Forensic Extraction Tools
ijtsrd
 
Cloud Intrusion Detection Reloaded - 2018
Cloud Intrusion Detection Reloaded - 2018Cloud Intrusion Detection Reloaded - 2018
Cloud Intrusion Detection Reloaded - 2018
randomuserid
 
Threat Hunting with Splunk
Threat Hunting with SplunkThreat Hunting with Splunk
Threat Hunting with Splunk
Splunk
 
Cybersecurity - Jim Butterworth
Cybersecurity - Jim ButterworthCybersecurity - Jim Butterworth
Cybersecurity - Jim Butterworth
TechBiz Forense Digital
 

Similar to Privacy-Preserving Data Analysis, Adria Gascon (20)

SplunkLive! - Splunk for Security
SplunkLive! - Splunk for SecuritySplunkLive! - Splunk for Security
SplunkLive! - Splunk for Security
 
Operationalizing Security Intelligence
Operationalizing Security IntelligenceOperationalizing Security Intelligence
Operationalizing Security Intelligence
 
Splunk for Security Breakout Session
Splunk for Security Breakout SessionSplunk for Security Breakout Session
Splunk for Security Breakout Session
 
Security Breakout Session
Security Breakout Session Security Breakout Session
Security Breakout Session
 
First Responders Course - Session 3 - Monitoring and Controlling Incident Costs
First Responders Course - Session 3 - Monitoring and Controlling Incident CostsFirst Responders Course - Session 3 - Monitoring and Controlling Incident Costs
First Responders Course - Session 3 - Monitoring and Controlling Incident Costs
 
Privacy-preserving Information Sharing: Tools and Applications
Privacy-preserving Information Sharing: Tools and ApplicationsPrivacy-preserving Information Sharing: Tools and Applications
Privacy-preserving Information Sharing: Tools and Applications
 
Technical track chris calvert-1 30 pm-issa conference-calvert
Technical track chris calvert-1 30 pm-issa conference-calvertTechnical track chris calvert-1 30 pm-issa conference-calvert
Technical track chris calvert-1 30 pm-issa conference-calvert
 
Big security for big data
Big security for big dataBig security for big data
Big security for big data
 
Security Analytics for Data Discovery - Closing the SIEM Gap
Security Analytics for Data Discovery - Closing the SIEM GapSecurity Analytics for Data Discovery - Closing the SIEM Gap
Security Analytics for Data Discovery - Closing the SIEM Gap
 
Meetup presenation 06192013
Meetup presenation 06192013 Meetup presenation 06192013
Meetup presenation 06192013
 
Technology Threat Prediction
Technology Threat PredictionTechnology Threat Prediction
Technology Threat Prediction
 
Splunk Discovery: Warsaw 2018 - Solve Your Security Challenges with Splunk En...
Splunk Discovery: Warsaw 2018 - Solve Your Security Challenges with Splunk En...Splunk Discovery: Warsaw 2018 - Solve Your Security Challenges with Splunk En...
Splunk Discovery: Warsaw 2018 - Solve Your Security Challenges with Splunk En...
 
A guide to Sustainable Cyber Security
A guide to Sustainable Cyber SecurityA guide to Sustainable Cyber Security
A guide to Sustainable Cyber Security
 
IRJET- Data Leak Prevention System: A Survey
IRJET-  	  Data Leak Prevention System: A SurveyIRJET-  	  Data Leak Prevention System: A Survey
IRJET- Data Leak Prevention System: A Survey
 
Securing IoT medical devices
Securing IoT medical devicesSecuring IoT medical devices
Securing IoT medical devices
 
Data Democratization at Nubank
 Data Democratization at Nubank Data Democratization at Nubank
Data Democratization at Nubank
 
Comparative Analysis of Digital Forensic Extraction Tools
Comparative Analysis of Digital Forensic Extraction ToolsComparative Analysis of Digital Forensic Extraction Tools
Comparative Analysis of Digital Forensic Extraction Tools
 
Cloud Intrusion Detection Reloaded - 2018
Cloud Intrusion Detection Reloaded - 2018Cloud Intrusion Detection Reloaded - 2018
Cloud Intrusion Detection Reloaded - 2018
 
Threat Hunting with Splunk
Threat Hunting with SplunkThreat Hunting with Splunk
Threat Hunting with Splunk
 
Cybersecurity - Jim Butterworth
Cybersecurity - Jim ButterworthCybersecurity - Jim Butterworth
Cybersecurity - Jim Butterworth
 

More from Ulrik Lyngs

Social Machines: Theoretical perspectives, Paul Smart
Social Machines: Theoretical perspectives, Paul SmartSocial Machines: Theoretical perspectives, Paul Smart
Social Machines: Theoretical perspectives, Paul Smart
Ulrik Lyngs
 
Mandevillian Intelligence, Paul Smart
Mandevillian Intelligence, Paul SmartMandevillian Intelligence, Paul Smart
Mandevillian Intelligence, Paul Smart
Ulrik Lyngs
 
Human-Extended Machine Cognition, Paul Smart
Human-Extended Machine Cognition, Paul SmartHuman-Extended Machine Cognition, Paul Smart
Human-Extended Machine Cognition, Paul Smart
Ulrik Lyngs
 
Understanding Algorithmic Decisions
Understanding Algorithmic DecisionsUnderstanding Algorithmic Decisions
Understanding Algorithmic Decisions
Ulrik Lyngs
 
Zooniverse Update
Zooniverse UpdateZooniverse Update
Zooniverse Update
Ulrik Lyngs
 
Data sharing in the age of the Social Machine
Data sharing in the age of the Social MachineData sharing in the age of the Social Machine
Data sharing in the age of the Social Machine
Ulrik Lyngs
 
Ulysses in Cyberspace: Distraction and Self-Regulation in Social Machines
Ulysses in Cyberspace: Distraction and Self-Regulation in Social MachinesUlysses in Cyberspace: Distraction and Self-Regulation in Social Machines
Ulysses in Cyberspace: Distraction and Self-Regulation in Social Machines
Ulrik Lyngs
 
SoLiD co operating.systems
SoLiD co operating.systemsSoLiD co operating.systems
SoLiD co operating.systems
Ulrik Lyngs
 
Sociagrams: How to design a social machine
Sociagrams: How to design a social machineSociagrams: How to design a social machine
Sociagrams: How to design a social machine
Ulrik Lyngs
 
Safe Haven in a Box, Petros Papapanagiotou
Safe Haven in a Box, Petros PapapanagiotouSafe Haven in a Box, Petros Papapanagiotou
Safe Haven in a Box, Petros Papapanagiotou
Ulrik Lyngs
 
App Observatory
App ObservatoryApp Observatory
App Observatory
Ulrik Lyngs
 
A Privacy Framework for Social Machines
A Privacy Framework for Social MachinesA Privacy Framework for Social Machines
A Privacy Framework for Social Machines
Ulrik Lyngs
 
SOCIAM Book: The Theory and Practice of Social Machines
SOCIAM Book: The Theory and Practice of Social MachinesSOCIAM Book: The Theory and Practice of Social Machines
SOCIAM Book: The Theory and Practice of Social Machines
Ulrik Lyngs
 
Provenance and Analytics for Social Machines, Trung Dong Huynh
Provenance and Analytics for Social Machines, Trung Dong HuynhProvenance and Analytics for Social Machines, Trung Dong Huynh
Provenance and Analytics for Social Machines, Trung Dong Huynh
Ulrik Lyngs
 

More from Ulrik Lyngs (14)

Social Machines: Theoretical perspectives, Paul Smart
Social Machines: Theoretical perspectives, Paul SmartSocial Machines: Theoretical perspectives, Paul Smart
Social Machines: Theoretical perspectives, Paul Smart
 
Mandevillian Intelligence, Paul Smart
Mandevillian Intelligence, Paul SmartMandevillian Intelligence, Paul Smart
Mandevillian Intelligence, Paul Smart
 
Human-Extended Machine Cognition, Paul Smart
Human-Extended Machine Cognition, Paul SmartHuman-Extended Machine Cognition, Paul Smart
Human-Extended Machine Cognition, Paul Smart
 
Understanding Algorithmic Decisions
Understanding Algorithmic DecisionsUnderstanding Algorithmic Decisions
Understanding Algorithmic Decisions
 
Zooniverse Update
Zooniverse UpdateZooniverse Update
Zooniverse Update
 
Data sharing in the age of the Social Machine
Data sharing in the age of the Social MachineData sharing in the age of the Social Machine
Data sharing in the age of the Social Machine
 
Ulysses in Cyberspace: Distraction and Self-Regulation in Social Machines
Ulysses in Cyberspace: Distraction and Self-Regulation in Social MachinesUlysses in Cyberspace: Distraction and Self-Regulation in Social Machines
Ulysses in Cyberspace: Distraction and Self-Regulation in Social Machines
 
SoLiD co operating.systems
SoLiD co operating.systemsSoLiD co operating.systems
SoLiD co operating.systems
 
Sociagrams: How to design a social machine
Sociagrams: How to design a social machineSociagrams: How to design a social machine
Sociagrams: How to design a social machine
 
Safe Haven in a Box, Petros Papapanagiotou
Safe Haven in a Box, Petros PapapanagiotouSafe Haven in a Box, Petros Papapanagiotou
Safe Haven in a Box, Petros Papapanagiotou
 
App Observatory
App ObservatoryApp Observatory
App Observatory
 
A Privacy Framework for Social Machines
A Privacy Framework for Social MachinesA Privacy Framework for Social Machines
A Privacy Framework for Social Machines
 
SOCIAM Book: The Theory and Practice of Social Machines
SOCIAM Book: The Theory and Practice of Social MachinesSOCIAM Book: The Theory and Practice of Social Machines
SOCIAM Book: The Theory and Practice of Social Machines
 
Provenance and Analytics for Social Machines, Trung Dong Huynh
Provenance and Analytics for Social Machines, Trung Dong HuynhProvenance and Analytics for Social Machines, Trung Dong Huynh
Provenance and Analytics for Social Machines, Trung Dong Huynh
 

Recently uploaded

Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 

Recently uploaded (20)

Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 

Privacy-Preserving Data Analysis, Adria Gascon

  • 1. Privacy-Preserving Data Analysis Adria Gascon The Alan Turing Institute & Warwick University Based on joint work with Borja Balle, Phillipp Schoppmann, Mariana Raykova, Jack Doerner, Samee Zahur, David Evans, Age Chapman, Alan Davoust, Peter Buneman
  • 2. What analysis on what data?  Fined grained private data, e.g. tracking for targeted advertising, credit scoring...  Data held by several organisations, e.g. hospitals?  Data held by individuals, e.g. on their phones?
  • 3. Who cares?  Data owners (of course)  Data controllers
  • 4. Adria Gascon Phillipp Schoppmann Borja Balle Mariana Raykova Jack Doerner Samee Zahur David Evans Privacy Preserving Distributed Linear Regression on High-Dimensional Data
  • 5. Motivation Treatment Outcome Medical Data Census Data Financial Data Atr. 1 Atr. 2 … Atr. 4 Atr. 5 … Atr. 7 Atr. 8 … -1.0 0 54.3 … North 34 … 5 1 … 1.5 1 0.6 … South 12 … 10 0 … -0.3 1 16.0 … East 56 … 2 0 … 0.7 0 35.0 … Centre 67 … 15 1 … 3.1 1 20.2 … West 29 … 7 1 … Note: This is vertcally-parttoned data; similar problems with horizontally-parttoned
  • 6. Private Multi-Party Machine Learning Assumptons • Parameters of the model will be received by all partes • Partes can engage in on-line secure communicatons • External partes might be used to outsource computaton or initalize cryptographic primitves Problem • Two or more partes want to jointly learn a model of their data • But they can’t share their private data with other partes
  • 7. The Trusted Party “Solution” (secure channel) (secure channel) (secure channel) Trusted Party Receives plain-text data, runs algorithm, returns result to partes ? The Trusted Party assumpton: • Introduces a single point of failure • Relies on weak incentves • Requires agreement between all data providers => Useful but unrealistc. Maybe can be simulated?
  • 8. Secure Multi-Party Computation (MPC) Public: Private: (party i) Goal: Compute f in a way that each party learns y (and nothing else!)
  • 9. Our Contribution A PMPML system for vertcally parttoned linear regression Features: • Scalable to millions of records and hundreds of dimensions • Formal privacy guarantees (semi-honest security) • Open source implementaton Tools: • Combine standard MPC constructons (GC, OT, TI, …) • Efcient private inner product protocols • Conjugate gradient descent robust to fxed-point encodings
  • 10. FAQ: Why is PMPML… Excitng? Can provide access to previously ”locked” data Hard? Privacy is tricky to formalize, hard to implement, and inherently interdisciplinary Worth? Beter models while avoiding legal risks and bad PR
  • 11. Read It, Use It https://github.com/schoppmp/linreg-mpc http://eprint.iacr.org/2016/892PETS’17
  • 12. Adria Gascon Phillipp Schoppmann Borja Balle Private Document Classifcaton in Federated Databases
  • 15. Adria Gascon James Bell Tejas Kulkarni Privacy-Preserving Distributed Hypothesis Testng
  • 16. ● Drop off in Manhattan? ● Tip over 25 %? ● Was it a short journey? ● Was payment method credit card? Drop-off in Manhattan and tip over 25% are significantly correlated events. But this result is differentially private, so I cannot easily tell if a given journey was included in the training dataset or not.
  • 17. Problem: model-check security properties on private source code. Privacy-Preserving Model Checking
  • 18. ● Problem: Check security properties on (private) source code. ● “Public” equivalent: MOPS [1], and some others. – Security property expressed as regular expression over sequences of instructions – Find all paths in control flow graph that match path ● Application of Private Regular Path Queries [1] Hao Chen and David Wagner. 2002. MOPS: an infrastructure for examining security properties of software. In Proceedings of the 9th ACM conference on Computer and communications security (CCS '02), Vijay Atluri (Ed.). ACM, New York, NY, USA, 235-244. DOI=http://dx.doi.org/10.1145/586110.586142 Privacy-Preserving Model Checking
  • 19. Secure queries on graph data
  • 20. Simple Example 1 #include <stdio.h> 2 #include <sys/types.h> 3 #include <unistd.h> 4 #include <pwd.h> 5 6 void drop_priv() 7 { 8 struct passwd *passwd; 9 10 if ((passwd = getpwuid(getuid())) == NULL) 11 { 12 printf("getpwuid() failed"); 13 return; 14 } 15 printf("Drop user %s's privilegen", passwd- >pw_name); 16 seteuid(getuid()); 17 } 18 19 int main(int argc, char *argv[]) 20 { 21 drop_priv(); 22 printf("About to execn"); hello.c
  • 21. Simple Example Control flow graph Security property FSA (system call with root priviledge)
  • 22. Interesting case: distributed private graph (code) main.c library.c
  • 23. Related Work Verification Across Intellectual Property Boundaries [2]: [2] Chaki, Sagar, Christian Schallhart, and Helmut Veith. "Verification across intellectual property boundaries." ACM Transactions on Software Engineering and Methodology (TOSEM) 22.2 (2013): 15.
  • 24. Related Work Verification Across Intellectual Property Boundaries [2] They also say... “While we are aware of advanced methods such as secure multiparty computation [Goldreich 2002] and zeroknowledge proofs [Ben-Or et al. 1988], we believe that they are impracticable for our problem, as such methods cannot be easily wrapped over given validation tools. Finally, we believe that any advanced method without an intuitive proof for its secrecy will be heavily opposed by the supplier—and might therefore be hard to establish in practice.”
  • 25. Case study: thttpd ● Tiny http server ● 2 main modules (thttp.c and libhttp.c) thttp.c (2k loc) libhttp.c (4k loc)
  • 26. thttpd control flow graph... ● 2 main modules only ● functions are disconnected
  • 27. thttpd: next steps ● Adapt private Regular Path Queries work for pushdown automata ● Find some bugs. ● Write paper. ● Voila!