Can Privacy Exist With
Machine Learning?
Matt Vogt, Sr. Solutions Architect, Immuta - Gartner Cool Vendor 2018
“Data can be either useful or
perfectly anonymous but never both.”
Paul Ohm,
Broken Promises of Privacy,
57 UCLA Law Review 1701 (2010)
I know stuff about
Judd and Leslie
Judd Apatow & Leslie Mann
Photo Credit: PacificCoastNews.com
© 2017 Immuta All Rights Reserved. 3
New York Taxi &
Limousine Commission
• Released trip data included pickup and drop-off locations and times,
fare amount, and tip amount, among other fields
• This seems pretty harmless?
Well, Judd and Leslie May
Not Think It’s Harmless
This photo was geotagged (with time), so
by simply querying by medallion and time,
we know how much Judd and Leslie tip!
This is an example of a “link attack”
Medallion & Photo Time
Medallion & Pickup Time
New York
Taxi Data
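A link attack of this kind amounts to a join between outside knowledge and the released records. Below is a minimal Python sketch, assuming made-up records and hypothetical field names (`medallion`, `pickup_time`, `tip`) rather than the actual released schema:

```python
from datetime import datetime, timedelta

# Hypothetical released trip records (all values are made up for illustration).
trips = [
    {"medallion": "7A11", "pickup_time": datetime(2013, 7, 8, 18, 2), "tip": 0.0},
    {"medallion": "7A11", "pickup_time": datetime(2013, 7, 8, 21, 40), "tip": 3.5},
    {"medallion": "9C42", "pickup_time": datetime(2013, 7, 8, 18, 3), "tip": 2.0},
]

# Outside knowledge: a photo that shows the cab's medallion and carries a timestamp.
photo = {"medallion": "7A11", "taken_at": datetime(2013, 7, 8, 18, 0)}

def link(photo, trips, window=timedelta(minutes=5)):
    """Return trips with the same medallion whose pickup is close to the photo time."""
    return [
        t for t in trips
        if t["medallion"] == photo["medallion"]
        and abs(t["pickup_time"] - photo["taken_at"]) <= window
    ]

matches = link(photo, trips)
```

With these sample values only one trip survives the join, so its tip is tied to the people in the photo.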
New York Actually Tried to Anonymize the Data
By hashing the medallion
But that didn’t matter…
New York
Taxi Data
Medallion & Photo Time
Pickup Time & Pickup Loc
Pickup Loc & Dropoff Loc
Dropoff Loc & Dropoff Time
Dropoff Time & Receipt
Medallion & Pickup Time
Pickup Time & Pickup Loc
Pickup Loc & Dropoff Loc
Dropoff Loc & Dropoff Time
Dropoff Time & Amount
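The chains above suggest why hashing one column is not enough: the remaining columns still line up with outside knowledge. Below is a hedged Python sketch, with made-up records and hypothetical field names, that ignores the medallion entirely and matches on pickup time and location instead:

```python
from datetime import datetime, timedelta

# Hypothetical released records: the medallion is replaced by an opaque hash.
trips = [
    {"medallion_hash": "a3f1", "pickup_time": datetime(2013, 7, 8, 18, 2),
     "pickup_loc": (40.7210, -74.0050), "amount": 10.5},
    {"medallion_hash": "b7c9", "pickup_time": datetime(2013, 7, 8, 18, 3),
     "pickup_loc": (40.7580, -73.9855), "amount": 22.0},
]

# Outside knowledge: where and when someone was seen getting into a cab.
sighting = {"time": datetime(2013, 7, 8, 18, 0), "loc": (40.7211, -74.0049)}

def near(a, b, tol=0.001):
    """Crude proximity check on (lat, lon) pairs; tol is in degrees."""
    return abs(a[0] - b[0]) <= tol and abs(a[1] - b[1]) <= tol

def link_without_medallion(sighting, trips, window=timedelta(minutes=5)):
    """Match on pickup time and pickup location only; the hash is never used."""
    return [
        t for t in trips
        if abs(t["pickup_time"] - sighting["time"]) <= window
        and near(t["pickup_loc"], sighting["loc"])
    ]

candidates = link_without_medallion(sighting, trips)
```

The time window and distance tolerance are arbitrary choices for the sketch; tighter or looser values trade false matches against missed ones.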
Remember!
Data can be either useful or perfectly
anonymous but never both.
In fact
“...just three data points were enough to
identify an even larger percentage of
people in the data set. That means that
someone with copies of just three of
your recent receipts — or one receipt,
one Instagram photo of you having
coffee with friends, and one tweet about
the phone you just bought — would have
a 94 percent chance of extracting your
credit card records from those of a
million other people”
“...one Instagram photo of you having
coffee with friends, and one tweet
about the phone you just bought…”
More data is available to us than ever, which
means link attacks become increasingly simple
It’s very easy to build profiles of individuals...
The European Union responds
General Data Protection Regulation (GDPR)
Effective May 25, 2018
Fines up to 4 percent of global annual revenue
Applies to any company processing personal data of people in the EU
GDPR Article 4(1):
'personal data' means any information relating to an identified or identifiable
natural person ('data subject'); an identifiable natural person is one who can be
identified, directly or indirectly, in particular by reference to an identifier such as
a name, an identification number, location data, an online identifier or to one or
more factors specific to the physical, physiological, genetic, mental, economic,
cultural or social identity of that natural person;
In Q3 alone, we’ve seen a huge uptick in interest
from regulators in regulating data, including:
• The California Consumer Privacy Act was passed in June 2018 and will take effect in 2020.
• Vermont became the first state in the nation to regulate data brokers.
• In September 2018, the Trump administration, acting through the National Telecommunications and
Information Administration, released a “Request for Comments on Developing the Administration’s
Approach to Consumer Privacy.”
• This is the first concrete illustration that a national-level privacy regulation like the GDPR is coming to the US.
• Immuta prediction: By 2020, no major economic zone will be free of an overarching data protection law.
PRIVACY
MACHINE
LEARNING
MACHINE LEARNING WILL
CHANGE THE ECONOMY AS WE
KNOW IT
It’s all about the data!
What Amazon Teaches Us About the Future
Responding to data is at the core of what Amazon does… and
why organizations across verticals need to follow its lead
• Supply chain optimization: optimize distribution, storage, routes, schedules, products
• Pricing and profit optimization: elastically tailor pricing to products and consumers
• Customer segmentation: real-time analysis to boost marketing/advertising efficiency
• Software/hardware system analytics: optimizing use and distribution of IT infrastructure globally
• Competitive analysis: automatically process billions of data points about the company, its
competitors, and new trends to create daily / hourly / real-time, automated analyses
The Newer Guys Have the Upper Hand
Low technical debt
• Futuristic software architectures
Centralized Data
• No data silos
• Specific problem set drove data schemas
Fewer Regulatory Controls
• Not for long!!
They are Data Agile
The Three Pillars to Data Agility
• Centralized Policy Enforcement
• Rapid Access to Data
• Frictionless to Data Analysts
Focus on this today
Centralized Policy Enforcement
Old World
• Policies managed uniquely at each data source
• Use ETL to create “safe” versions of data
• IT teams interpret legal guidance themselves
• Audit logs are disjointed/inconsistent
New World
• Consistent layer for creating data policies
• Policies are enforced dynamically
• Plain-English policy builder usable by any author and understandable by all
• An unprecedented list of policy logic at your fingertips
• All actions monitored granularly and consistently
Introducing
Immuta
Privacy Preserving Techniques
(we do a bunch, I’m only going to touch on a few here)
Right To Privacy?
• Early on, photography was expensive
• Near the turn of the 20th century, the masses had general use of photography
• “Instantaneous photographs and newspaper enterprise have invaded the sacred precincts of
private and domestic life.” - Samuel Warren and Louis Brandeis (later a U.S. Supreme Court Justice)
• Proposed a right “to be let alone”
• We generally accept being observed, but rarely accept being identified
The End of Privacy
[as we know it]?
• The rise of technology and data science has killed privacy as we know it
• Instead of focusing on how and when our data is gathered...
• Privacy should now be about how our data is being used.
Immuta can do this
The GDPR understands this!
• A cornerstone of the GDPR is consent
• You should only process data for the purposes to which your data subjects have explicitly consented
• In other words: you must consider analytical context as a guide to what data you can see
• This is very different from role-based access controls
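One way to picture the difference from role-based access control: the decision depends on the declared purpose of the analysis, not only on who is asking. Below is a minimal Python sketch; the column names, purposes, and the `can_read` and `visible_columns` helpers are hypothetical illustrations and are not Immuta's API:

```python
# Hypothetical mapping from column to the purposes data subjects consented to.
CONSENTED_PURPOSES = {
    "email": {"billing"},
    "trip_history": {"billing", "fraud_detection"},
}

def can_read(column: str, purpose: str) -> bool:
    """Allow access only when the declared purpose is covered by consent."""
    return purpose in CONSENTED_PURPOSES.get(column, set())

def visible_columns(columns, purpose):
    """Filter a requested column list down to what the declared purpose permits."""
    return [c for c in columns if can_read(c, purpose)]
```

Under this sketch the same analyst sees different columns depending on the purpose declared for the query, which a role lookup alone would not capture.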
Towards Practical Differential Privacy for SQL Queries
Johnson, Near, Song, Aug 2017
The internal study of queries at Uber
• SQL queries written by employees at Uber
• 8.1 million queries executed between March 2013 and August 2016
• Broad range of sensitive data including rider and driver information, trip logs, and
customer support data
34% of Uber Data Science
Queries are aggregates
Statistical queries matter!
IF WE CONSIDER STATISTICAL QUERIES USEFUL, THIS CAN BE A LIE:
Data can be either useful or perfectly anonymous but never both.
How?
Let’s play a game
• Think of a number from 1 to 6
• Now I’m going to ask you a question you probably don’t want to answer in public
• Do you hide spending from your spouse?
• Now raise your hand if you thought of a 3 OR answered yes to the above
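The game is a form of randomized response, and the arithmetic behind it can be sketched in a few lines. The Python sketch below assumes every participant picks a number uniformly at random (real audiences may not) and uses a made-up true rate of 30 percent; the function names are illustrative only:

```python
import random

def simulate_hands(n, true_rate, rng):
    """Fraction of raised hands: a hand goes up if the person picked 3 or the honest answer is yes."""
    raised = 0
    for _ in range(n):
        picked_three = rng.randint(1, 6) == 3
        answer_yes = rng.random() < true_rate
        if picked_three or answer_yes:
            raised += 1
    return raised / n

def estimate_true_rate(hand_fraction, p_three=1 / 6):
    """Invert P(hand) = p_three + (1 - p_three) * true_rate."""
    return (hand_fraction - p_three) / (1 - p_three)

rng = random.Random(0)
observed = simulate_hands(100_000, true_rate=0.30, rng=rng)
estimate = estimate_true_rate(observed)
```

No single raised hand reveals an answer, since it may only mean the person thought of a 3, yet the estimate over the whole room lands close to the true rate.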
This is Differential Privacy
• I protected your privacy by providing plausible deniability
• But I can also estimate the percentage of people who hide spending from their
spouse because I know the probability of you selecting a 3
• Differential Privacy is restricted to statistical queries and adds the appropriate
amount of noise based on the sensitivity of the question
• ‘Differential privacy formalizes the idea that a "private" computation should not reveal
whether any one person participated in the input or not, much less what their data are.’
- Frank McSherry (https://github.com/frankmcsherry/blog/blob/master/posts/2016-02-03.md)
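Noise scaled to sensitivity is commonly implemented with the Laplace mechanism. The Python sketch below is a generic illustration of that mechanism, not a description of Immuta's implementation; the privacy parameter `epsilon`, the sample tip values, and the function names are assumptions made for the example:

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(values, predicate, epsilon, rng):
    """Noisy count: one person changes a count by at most 1, so sensitivity is 1
    and the noise scale is 1 / epsilon."""
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(42)
tips = [0.0, 2.0, 3.5, 0.0, 5.0, 1.0]  # made-up tip amounts
noisy = private_count(tips, lambda t: t == 0.0, epsilon=0.5, rng=rng)
```

A smaller `epsilon` adds more noise and gives stronger deniability; a query with higher sensitivity (a sum of tips, for example) would need a proportionally larger scale.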
How Could NYT Have Done it?
Localized Sensitivity
How do we do it?
Simple…
In plain English everyone can understand
Can Privacy and
Machine Learning
Exist Together?
We believe they can:
data agility is what
you need
Questions
mvogt@immuta.com
@mattvogt
www.immuta.com
Come visit our Booth #xxx!

Can Privacy Exist with Machine Learning? Presentation from Gartner Data & Analytics Summit 2018

  • 1. Can Privacy Exist With Machine Learning? Matt Vogt, Sr. Solutions Architect, Immuta - Gartner Cool Vendor 2018
  • 2. “Data can be either useful or perfectly anonymous but never both.” Paul Ohm, Broken Promises of Privacy, 57 UCLA Law Review 1701 (2010)
  • 3. I know stuff about Judd and Leslie Judd Apatow & Leslie Mann Photo Credit: PacificCoastNews.com © 2017 Immuta All Rights Reserved. 3
  • 4. New York Taxi & Limousine Commission • Data was released containing taxi pickups, dropoffs, location, time, amount, and tip amount, among others • This seems pretty harmless? © 2017 Immuta All Rights Reserved. 4
  • 5. Well, Judd and Leslie May Not Think It’s Harmless This photos was geotagged (with time), so by simply querying by medallion and time, we know how much Judd and Leslie tip! © 2017 Immuta All Rights Reserved. 5
  • 6. This is an example of a “link attack” Medallion & Photo Time Medallion & Pickup Time New York Taxi Data © 2017 Immuta All Rights Reserved. 6
  • 7. New York Actually Tried to Anonymize the data By hashing the medallion But that didn’t matter…. © 2017 Immuta All Rights Reserved. 7
  • 8. New York Taxi Data Medallion & Photo Time Pickup Time & Pickup Loc Pickup Loc & Dropoff Loc Dropoff Loc & Dropoff Time Dropoff Time & Receipt Medallion & Pickup Time Pickup Time & Pickup Loc Pickup Loc & Dropoff Loc Dropoff Loc & Dropoff Time Dropoff Time & Amount © 2017 Immuta All Rights Reserved. 8
  • 9. Remember! Data can be either useful or perfectly anonymous but never both.
  • 10. In fact “...just three data points were enough to identify an even larger percentage of people in the data set. That means that someone with copies of just three of your recent receipts — or one receipt, one Instagram photo of you having coffee with friends, and one tweet about the phone you just bought — would have a 94 percent chance of extracting your credit card records from those of a million other people” © 2017 Immuta All Rights Reserved. 10
  • 11. “...one Instagram photo of you having coffee with friends, and one tweet about the phone you just bought…” More data is available to us than ever, which means link attacks become increasingly simple It’s very easy to build profiles of individuals... © 2017 Immuta All Rights Reserved. 11
  • 12. The European Union responds General Data Protection Regulation (GDPR) Effective May 25, 2018 Fines up to 4 percent of global revenue Applies to any company collecting data on EU citizens © 2017 Immuta All Rights Reserved. 12
  • 13. GDPR Article 4(1): 'personal data' means any information relating to an identified or identifiable natural person ('data subject'); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;
  • 14. In Q3 alone, we’ve seen a huge uptick in regulators’ interest in regulating data, including: • The California Consumer Privacy Act was passed in June 2018 and will take effect in 2020. • Vermont became the first state in the nation to regulate data brokers. • In September 2018, the Trump administration, acting through the National Telecommunications and Information Administration, released a “Request for Comments on Developing the Administration’s Approach to Consumer Privacy.” • This is the first concrete illustration that a national-level privacy regulation like the GDPR is coming to the US. • Immuta prediction: By 2020, no major economic zone will be free of an overarching data protection law.
  • 16. MACHINE LEARNING WILL CHANGE THE ECONOMY AS WE KNOW IT
  • 17. It’s all about the data! What Amazon Teaches Us About the Future: Responding to data is at the core of what Amazon does… and why organizations across verticals need to follow its lead • Supply chain optimization: optimize distribution, storage, routes, schedules, products • Pricing and profit optimization: elastically tailor pricing to products and consumers • Customer segmentation: real-time analysis to boost marketing/advertising efficiency • Software/hardware system analytics: optimizing use and distribution of IT infrastructure globally • Competitive analysis: automatically process billions of data points about the company, its competitors, and new trends to create daily / hourly / real-time, automated analyses
  • 18. The Newer Guys Have the Upper Hand. Low technical debt: • Futuristic software architectures. Centralized data: • No data silos • Specific problem set drove data schemas. Fewer regulatory controls: • Not for long!! They are Data Agile.
  • 20. The Three Pillars to Data Agility: Centralized Policy Enforcement (focus on this today); Rapid Access to Data; Frictionless to Data Analysts
  • 21. Centralized Policy Enforcement. Old World: • Policies managed uniquely at each data source • Use ETL to create “safe” versions of data • IT interprets legal guidance themselves • Audit logs are disjointed/inconsistent. New World: • Consistent layer for creating data policies • Policies are enforced dynamically • Plain-English policy builder usable by any author and understandable by all • An unprecedented list of policy logic at your fingertips • All actions monitored granularly and consistently
  • 22. Introducing Immuta
  • 23. Privacy-Preserving Techniques (we do a bunch, I’m only going to touch on a few here)
  • 24. Right to Privacy? • Early on, photography was expensive • Near the turn of the 20th century, the masses had general use of photography • "Instantaneous photographs and newspaper enterprise have invaded the sacred precincts of private and domestic life." - Samuel Warren and Louis Brandeis (later a U.S. Supreme Court Justice) • They proposed a right “to be let alone” • We generally accept being observed, but rarely accept being identified
  • 25. The End of Privacy [as we know it]? • The rise of technology and data science has killed privacy as we know it • Instead of focusing on how and when our data is gathered... • Privacy should now be about how our data is being used.
  • 26. The GDPR understands this! • The cornerstone of GDPR is consent • You should only process data for the purposes for which your data subjects have explicitly consented • In other words: you must consider analytical context as a guide to what data you can see • This is very different from role-based access controls. Immuta can do this.
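To make the contrast with role-based access controls concrete, here is a minimal sketch. It is not Immuta's API; the `Dataset` and `Analyst` classes, the two check functions, and the purpose names are all hypothetical. The point is only that a role check asks who is querying, while a purpose check asks what the data will be used for and compares that with what the data subjects consented to.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dataset:
    name: str
    consented_purposes: frozenset  # purposes the data subjects agreed to


@dataclass
class Analyst:
    name: str
    roles: set = field(default_factory=set)


def role_based_allows(analyst, required_role):
    """Role-based check: access depends only on who the analyst is."""
    return required_role in analyst.roles


def purpose_based_allows(dataset, declared_purpose):
    """Purpose-based check: access depends on what the data will be used for."""
    return declared_purpose in dataset.consented_purposes


# Hypothetical data: subjects consented to fraud detection only.
rides = Dataset("rides", frozenset({"fraud_detection"}))
alice = Analyst("alice", roles={"data_scientist"})

print(role_based_allows(alice, "data_scientist"))      # same analyst, same role
print(purpose_based_allows(rides, "fraud_detection"))  # consented purpose
print(purpose_based_allows(rides, "marketing"))        # purpose without consent
```

The same analyst with the same role gets a different answer depending on the declared purpose, which a role check alone cannot express.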
  • 27. Towards Practical Differential Privacy for SQL Queries (Johnson, Near, Song, Aug 2017). An internal study of queries at Uber: • SQL queries written by employees at Uber • 8.1 million queries executed between March 2013 and August 2016 • Broad range of sensitive data including rider and driver information, trip logs, and customer support data
  • 28. 34% of Uber data science queries are aggregates. Statistical queries matter!
  • 29. “Data can be either useful or perfectly anonymous but never both.” IF WE CONSIDER STATISTICAL QUERIES USEFUL, THIS CAN BE A LIE: How?
  • 30. Let’s play a game • Think of a number between 1 and 6 • Now I’m going to ask you a question you probably don’t want to answer in public • Do you hide spending from your spouse? • Now raise your hand if you thought of a 3 OR answered yes to the above
  • 31. This is Differential Privacy • I protected your privacy by providing plausible deniability • But I can still estimate the percentage of people who hide spending from their spouse, because I know the probability of you selecting a 3 • Differential privacy is restricted to statistical queries and adds an appropriate amount of noise based on the sensitivity of the question • ‘Differential privacy formalizes the idea that a "private" computation should not reveal whether any one person participated in the input or not, much less what their data are.’ - Frank McSherry (https://github.com/frankmcsherry/blog/blob/master/posts/2016-02-03.md)
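The arithmetic behind the game can be sketched as a small simulation. This is a randomized-response-style illustration, not Immuta's implementation. It assumes each participant picks a number from 1 to 6 uniformly at random, independently of their true answer, and the 30% true rate is made up. Under those assumptions, P(hand raised) = 1/6 + (5/6) · p_yes, so the observed fraction of raised hands can be inverted to estimate p_yes.

```python
import random

P_THREE = 1 / 6  # assumed chance that a participant thinks of a 3


def raise_hand(truth, rng):
    """One participant: raise a hand if they picked 3 OR the true answer is yes."""
    picked = rng.randint(1, 6)
    return picked == 3 or truth


def estimate_yes_rate(raised_fraction, p_three=P_THREE):
    """Invert P(raised) = p_three + (1 - p_three) * p_yes to recover p_yes."""
    estimate = (raised_fraction - p_three) / (1 - p_three)
    return min(1.0, max(0.0, estimate))  # clamp sampling noise into [0, 1]


rng = random.Random(0)  # fixed seed so the simulation is repeatable
true_rate = 0.30        # hypothetical share of people who would answer yes
truths = [rng.random() < true_rate for _ in range(100_000)]
hands = [raise_hand(t, rng) for t in truths]

estimate = estimate_yes_rate(sum(hands) / len(hands))
print(f"estimated yes rate: {estimate:.3f}")
```

A raised hand on its own does not reveal an individual's answer, because it may only mean that the person thought of a 3, yet the aggregate estimate lands close to the true rate when the group is large.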
  • 32. How Could NYT Have Done it? Localized Sensitivity
  • 33. How do we do it? Simple… In plain English everyone can understand
  • 34. Can Privacy and Machine Learning Exist Together? We believe they can; data agility is what you need.