Originally shared at the Gartner Symposium/ITxpo 2018, this presentation looks at the conflicts between privacy and machine learning, including common concerns such as link attacks and profiling. It also explores strategies that allow personal information to be used while protecting privacy, including concepts around consent and analytical context, as well as techniques for implementing privacy by design, a cornerstone of General Data Protection Regulation (GDPR) compliance.
13. GDPR Article 4(1):
'personal data' means any information relating to an identified or identifiable natural person ('data subject'); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;
27. Towards Practical Differential Privacy for SQL Queries
Johnson, Near, Song, Aug 2017
The internal study of queries at Uber
• SQL queries written by employees at Uber
• 8.1 million queries executed between March 2013 and August 2016
• Broad range of sensitive data including rider and driver information, trip logs, and customer support data
28. 34% of Uber Data Science Queries are aggregates
Statistical queries matter!
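To illustrate why aggregate queries are a natural fit for differential privacy, here is a minimal sketch of a noisy COUNT using the standard Laplace mechanism. This is not the method from the Johnson, Near, Song paper (which develops its own sensitivity analysis for SQL); the `trips` rows, the `private_count` helper, and the choice of epsilon are all hypothetical and exist only for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(rows, predicate, epsilon, rng=None):
    """Return a noisy count of rows matching `predicate` (hypothetical helper)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for row in rows if predicate(row))
    # Adding or removing one row changes a count by at most 1, so the
    # sensitivity is 1 and the Laplace scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Made-up trip records standing in for a real table.
trips = [
    {"city": "SF", "fare": 12.5},
    {"city": "SF", "fare": 8.0},
    {"city": "NYC", "fare": 23.0},
    {"city": "SF", "fare": 15.25},
]

# Roughly: SELECT COUNT(*) FROM trips WHERE city = 'SF', with noise added.
noisy = private_count(trips, lambda r: r["city"] == "SF", epsilon=0.5)
print(f"noisy count: {noisy:.2f}")
```

A smaller epsilon gives stronger privacy but noisier answers. Queries that return individual rows rather than aggregates cannot be protected this way, which is why the share of aggregate queries matters.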