Deep learning and machine learning more broadly depend on large quantities of data to develop accurate predictive models. In areas such as medical research, sharing data among institutions can lead to even greater value. However, data often includes personally identifiable information that we may not want to (or even be legally allowed to) share with others. Traditional anonymization techniques only help to a degree.
In this talk, Red Hat's Gordon Haff will describe the active work taking place in academia, open source communities, and elsewhere on techniques such as differential privacy and secure multi-party computation. The goal of this research and ongoing work is to help individuals and organizations work collaboratively while preserving the anonymity of individual data points.
15. @ghaff https://bitmason.blogspot.com
Differential Privacy
Response to erosion of traditional Statistical Disclosure Limitation (SDL) techniques
Widely share statistics over a set of data without revealing anything about individuals
2006 Dwork, McSherry, Nissim, and Smith (ε-differential privacy)
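For reference, the guarantee named on this slide is usually stated as follows. The symbols are standard notation rather than anything taken from the slide: M is a randomized mechanism, D and D' are any two data sets that differ in one individual's record, and S is any set of possible outputs. M is ε-differentially private if:

```latex
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(D') \in S\,]
```

A smaller ε forces the two output distributions closer together, so the presence or absence of any one individual has less influence on what is released.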
Injects random noise into a data set (in a mathematically rigorous way) to protect individual privacy
Amount of randomness trades off privacy against utility/accuracy
https://www.accessnow.org/understanding-differential-privacy-matters-digital-rights/
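The noise injection and the privacy/accuracy tradeoff can be illustrated with a minimal sketch of the Laplace mechanism, one common way to achieve ε-differential privacy for a counting query. This is an illustration under stated assumptions, not the method used by any particular system mentioned in the talk: the function names, the `ages` data, and the chosen ε values are all hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng):
    """Return a noisy count of the values that satisfy `predicate`.

    Assumption: each individual contributes exactly one value, so
    adding or removing one person changes the true count by at most 1
    (sensitivity 1). Noise with scale sensitivity / epsilon then gives
    epsilon-differential privacy for this single query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for v in values if predicate(v))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    ages = [34, 51, 29, 62, 47, 70, 38]  # hypothetical records
    # Smaller epsilon means more noise: stronger privacy, lower accuracy.
    for eps in (0.1, 1.0, 10.0):
        noisy = private_count(ages, lambda a: a >= 50, eps, rng)
        print(f"epsilon={eps}: noisy count = {noisy:.2f}")
```

With a small ε the released count can be far from the true value, which is what protects any single record; with a large ε the count is nearly exact but offers little protection. Answering many queries against the same data consumes additional privacy budget, which this sketch does not track.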