Differential Privacy
Young-Geun Choi
SNU STAT - Multivariate Statistics Lab.
17th August, 2016
Contents
• Differential privacy? + examples, properties
• 4 articles (LNCS 2014)
𝜖-differential privacy
• 𝑋1 and 𝑋2 : 𝑛 by 𝑝 data.frames
• Both can be arbitrary, but they differ in only one row
• 𝜅 : a user query on 𝑋, plus noise
  e.g. Let 𝑝 = 1 and let 𝑋 be 𝑛 by 1 binary data.
  Consider sum(𝑋) + random Laplace noise with density 𝑝(𝑥) ∝ exp(−|𝑥|/𝜆).
• 𝜅 is 𝜖-differentially private if 𝑃(𝜅(𝑋1) ∈ 𝑆) ≤ exp(𝜖) 𝑃(𝜅(𝑋2) ∈ 𝑆) for all such 𝑋1, 𝑋2 and all sets 𝑆 of outputs.
• The probability 𝑃(⋅) comes from the randomness 𝑝(⋅) of the noise.
• Other queries : counting, histogram, first name, ...
• Interactive setting vs. non-interactive setting
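The binary-sum example can be sketched in a few lines of Python. This is a minimal illustration, not code from [1]: the function names are hypothetical, and the choice 𝜆 = 1/𝜖 is an assumption based on the fact that changing one binary row changes sum(𝑋) by at most 1.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two iid exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_sum(x, epsilon, rng):
    # Assumption: entries of x are 0/1, so one changed row moves sum(x)
    # by at most 1 and lambda = 1/epsilon is used as the noise scale.
    return sum(x) + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(100)]   # n by 1 binary data
release = noisy_sum(x, epsilon=0.5, rng=rng)  # the value shown to the user
```

A smaller 𝜖 means a larger 𝜆, hence more noise and a less accurate answer.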
[1] Sánchez, Domingo-Ferrer, and Martínez. Improving the Utility of Differential Privacy via Univariate Microaggregation. LNCS 2014.
Why consider differential privacy?
• Data Cannot be Fully Anonymized and Remain Useful.
  • (de-identified) medical encounter data + (publicly available) voter registration records in Massachusetts = re-identification
  • (de-identified) movie records published by Netflix + (publicly available) the Internet Movie Database (IMDb) = re-identification
• Re-Identification of “Anonymized” Records is Not the Only Risk.
  • membership disclosure + harm (e.g. the records cover only a small number of complaints or diagnoses)
• Queries Over Large Sets are Not Protective.
  • e.g. if it is known that Mr. X is in a certain medical database,
    "How many people in the database have disease A?" +
    "How many people, not named X, in the database have disease A?"
    together yield the A-status of Mr. X.
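The differencing argument can be made concrete with a toy example. The records and names below are hypothetical; the point is only that two exact counts over large sets pin down one person's status.

```python
# Hypothetical toy database of (name, has_disease_A) pairs.
records = [("X", True), ("Y", False), ("Z", True), ("W", False)]

def count_disease(rows):
    # Exact (noise-free) counting query.
    return sum(1 for _, has_a in rows if has_a)

q1 = count_disease(records)                               # everyone
q2 = count_disease([r for r in records if r[0] != "X"])   # everyone not named X

# The difference of the two answers is exactly Mr. X's A-status.
x_has_disease = (q1 - q2) == 1
```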
Chapter 1.1 of Dwork and Roth (2014), The Algorithmic Foundations of Differential Privacy, Foundations and Trends in Theoretical Computer Science Vol. 9, Nos. 3–4, 211–407.
Why consider differential privacy?
• Query Auditing Is Problematic.
  • refusing to answer can itself be disclosive
  • query auditing can be computationally infeasible
• Summary Statistics are Not “Safe.”
  • (GWAS, SNP) the National Institutes of Health and the Wellcome Trust terminated public access to aggregate frequency data from the studies they fund.
• “Ordinary” Facts are Not “OK.”
  • [frequent bread purchases over a long time -> infrequent purchases] may suggest type II diabetes
• “Just a Few.”
  • “Just a few” philosophy can involve ethnic issues.
Why consider differential privacy?
• What differential privacy promises
  • “the probability of harm is not significantly increased by their choice to participate.”
• What differential privacy does not promise
  • NOT : that what one believes to be one’s secrets will remain secret.
“Differential privacy promises that the behavior of an algorithm will be roughly unchanged even if a single entry in the database is modified.”
• Summary
  • De-identification is not sufficient.
  • Need to perturb not only the (demographic) keys but also all the other attributes.
  • Need to apply this philosophy at the stage of responding to user queries.
Chapter 2.3 of Dwork and Roth (2014), The Algorithmic Foundations of Differential Privacy, Foundations and Trends in Theoretical Computer Science Vol. 9, Nos. 3–4, 211–407.
Some remarks on differential privacy
• Another definition ((𝜖, 𝛿)-differential privacy)
• Algorithms can be combined
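The formula for this slide did not survive extraction. For reference, the standard statement of (𝜖, 𝛿)-differential privacy in Dwork and Roth (2014), written here in the notation used earlier for 𝜅, 𝑋1 and 𝑋2 (the set 𝑆 of outputs is notation added for this sketch), is:

```latex
\[
  P\bigl(\kappa(X_1) \in S\bigr) \;\le\; e^{\epsilon}\, P\bigl(\kappa(X_2) \in S\bigr) + \delta
  \quad \text{for all sets } S \text{ of outputs and all } X_1, X_2 \text{ differing in one row.}
\]
```

Setting 𝛿 = 0 recovers 𝜖-differential privacy.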
Dwork and Roth (2014), The Algorithmic Foundations of Differential Privacy, Foundations and Trends in Theoretical Computer Science Vol. 9, Nos. 3–4, 211–407.
Some remarks on differential privacy
• Algorithms can be combined (continued)
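As a rough illustration of combining algorithms, the basic (sequential) composition result in Dwork and Roth (2014) says that running several mechanisms on the same data adds up their 𝜖 and 𝛿 values. The helper names below are hypothetical, and the even split of a budget is just one possible design choice.

```python
def sequential_composition(budgets):
    # budgets: list of (epsilon, delta) pairs, one per mechanism run on the same data.
    # Basic composition: the combined release is (sum of epsilons, sum of deltas)-DP.
    total_epsilon = sum(eps for eps, _ in budgets)
    total_delta = sum(delta for _, delta in budgets)
    return total_epsilon, total_delta

def per_query_epsilon(total_epsilon, num_queries):
    # Even split of a total budget across queries (an assumed, simple policy).
    return total_epsilon / num_queries

combined = sequential_composition([(0.5, 0.0), (0.25, 0.0)])
```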
Differential privacy, Wikipedia webpage
[1] Univariate microaggregation
• “Microaggregation” : an alternative to histogram queries
• Procedure
  • Assume an 𝑛 by 𝑝 table 𝑋 with numeric attributes only
  • Input : 𝑋 and 𝑘 ∈ ℕ
  • For 𝑗 = 1, ..., 𝑝 :
    1. Divide X[ , j] into [𝑛/𝑘] clusters of 𝑘 consecutive values.
    2. Calculate the average of each cluster.
    3. Perturb the averages with independent Laplace noise.
    4. Replace the original values in X[ , j] by the perturbed averages from step 3.
• Guarantee : The output is 𝑝𝜖-differentially private.
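The four steps can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation from [1]: "consecutive" is taken to mean consecutive in sorted order, a short final cluster is allowed when 𝑘 does not divide 𝑛, and the Laplace scale is passed in as a parameter because its calibration to 𝜖 should be taken from the paper. All function names are hypothetical.

```python
import random

def laplace_noise(scale, rng):
    # Difference of two iid exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def microaggregate_column(values, k, scale, rng):
    # Step 1: clusters of k consecutive values (assumed: in sorted order).
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    for start in range(0, len(order), k):
        cluster = order[start:start + k]  # last cluster may be shorter (simplification)
        # Step 2: average of the cluster.
        mean = sum(values[i] for i in cluster) / len(cluster)
        # Step 3: perturb the average with independent Laplace noise.
        noisy = mean + laplace_noise(scale, rng)
        # Step 4: replace the original values by the perturbed average.
        for i in cluster:
            out[i] = noisy
    return out

def microaggregate_table(X, k, scale, rng):
    # Apply the univariate procedure to each of the p columns in turn.
    columns = [microaggregate_column(list(col), k, scale, rng) for col in zip(*X)]
    return [list(row) for row in zip(*columns)]

X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
released = microaggregate_table(X, k=2, scale=0.1, rng=random.Random(0))
```

Every row in a cluster receives the same released value, so each column of the output has at most [𝑛/𝑘] distinct values (one more if there is a short final cluster).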
[2] Exponential random graphs
• 𝑋 : (observed) exponential random graph
• Non-interactive query
• Goal : release a differentially private random graph 𝑌 based on 𝑋
• Randomized response for edges
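Randomized response on edges can be sketched as below. The sketch assumes an undirected graph on nodes 0, ..., n−1 and flips each dyad independently with probability 1/(1 + exp(𝜖)), which gives 𝜖-differential privacy with respect to a single edge. The function name and this parameterization are assumptions for illustration and are not necessarily those used in [2].

```python
import math
import random
from itertools import combinations

def randomized_response_graph(n, edges, epsilon, rng):
    # Probability of flipping each dyad; the true state is kept with
    # probability exp(epsilon) / (1 + exp(epsilon)).
    flip_prob = 1.0 / (1.0 + math.exp(epsilon))
    edge_set = {frozenset(e) for e in edges}
    released = set()
    for dyad in combinations(range(n), 2):
        present = frozenset(dyad) in edge_set
        if rng.random() < flip_prob:
            present = not present  # edge becomes non-edge, or the reverse
        if present:
            released.add(dyad)
    return released

Y = randomized_response_graph(5, [(0, 1), (1, 2), (3, 4)], epsilon=1.0,
                              rng=random.Random(0))
```

Because 𝑌 is released once and then analysed freely, this fits the non-interactive setting.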
[2] Karwa, Slavković, and Krivitsky. Differentially Private Exponential Random Graphs. LNCS 2014.
[2] Exponential random graphs
• Of interest : likelihood-based inference on 𝜃
• Given 𝑌 : if 𝜋 (𝛾 below) is known, MCMC can be applied.
[3] 𝑘ᵐ-Anonymity for Continuous Data
• How to guarantee 𝑘ᵐ-anonymity with minimal information loss?
[3] Gkountouna, Angeli, Zigomitros, Terrovitis, and Vassiliou. 𝑘ᵐ-Anonymity for Continuous Data Using Dynamic Hierarchies. LNCS 2014.
[3] 𝑘ᵐ-Anonymity for Continuous Data
• Illustration of the proposed algorithm
[3] 𝑘ᵐ-Anonymity for Continuous Data
• 𝑘-anonymity vs. 𝑘ᵐ-anonymity
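To make the notion concrete, here is a small checker for 𝑘ᵐ-anonymity in its usual set-valued form: every combination of at most 𝑚 items that appears in some record must appear in at least 𝑘 records. This definition is taken from the general literature and is an assumption here; the adaptation to continuous data through generalization, which is the subject of [3], is not shown. The function name is hypothetical.

```python
from itertools import combinations

def is_km_anonymous(records, k, m):
    # Count, for every item combination of size 1..m that occurs in a record,
    # how many records contain it.
    supports = {}
    for record in records:
        items = sorted(set(record))
        for size in range(1, m + 1):
            for combo in combinations(items, size):
                supports[combo] = supports.get(combo, 0) + 1
    # k^m-anonymity: no occurring combination is shared by fewer than k records.
    return all(count >= k for count in supports.values())

records = [{"a", "b"}, {"a", "c"}, {"b", "c"}]
ok_for_m1 = is_km_anonymous(records, k=2, m=1)  # single items
ok_for_m2 = is_km_anonymous(records, k=2, m=2)  # pairs of items as well
```

In contrast with plain 𝑘-anonymity, the check only considers what an attacker who knows at most 𝑚 items could use, so it gets stricter as 𝑚 grows.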
[4] Logistic regression + elastic net
• Query : 𝜃 (penalized regression coefficients) from (𝑋, 𝑌) stored in a DB. Here 𝑋 is, for example, a SNP dataset and 𝑌 = ±1.
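The objective behind this query is not reproduced on the slide. A common form of elastic-net penalized logistic regression for labels 𝑌 = ±1 is sketched below; the parameterization by `lam` and `alpha` is an assumption and may differ from the one in [4]. The sketch shows only the non-private objective, not the differentially private estimator.

```python
import math

def elastic_net_logistic_objective(theta, X, y, lam, alpha):
    # Average logistic loss log(1 + exp(-y_i * <theta, x_i>)) for labels in {-1, +1}.
    n = len(X)
    loss = 0.0
    for x_i, y_i in zip(X, y):
        margin = y_i * sum(t * v for t, v in zip(theta, x_i))
        loss += math.log1p(math.exp(-margin))
    # Elastic net: a mix of L1 and squared L2 penalties (assumed parameterization).
    l1 = sum(abs(t) for t in theta)
    l2 = sum(t * t for t in theta)
    return loss / n + lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2)

value = elastic_net_logistic_objective([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]],
                                       [1, -1], lam=0.1, alpha=0.5)
```

The L1 part of the penalty is convex but not differentiable at zero.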
[4] Yu, Rybar, Uhler, and Fienberg. Differentially-Private Logistic Regression for Detecting Multiple-SNP Association in GWAS Databases. LNCS 2014.
[4] Logistic regression + elastic net
• If we select the tuning parameter by cross-validation, will the model fitted after CV also be differentially private?
• Answering this requires some heavier notation.
[4] Logistic regression + elastic net
• If 𝑞 is (𝛽1, 𝛽2, 𝛿)-stable and 𝒯 is 𝜖-differentially private,
  • [5] : an ordinary CV procedure (with some ‘randomization’ during validation, with ‘randomness’ 𝜖′) is (𝜖 + 𝜖′)-differentially private.
  • [5] assumed the regularization function is differentiable. [4] relaxed this assumption to non-differentiable but convex penalties.
  • [4] applied this general framework to [logistic regression + elastic net].
• Application to GWAS : usually two-step (screening -> logistic regression)
  • Differentially private first-stage screening had already been developed in the literature.
  • This paper focused on the second stage only.
[5] Chaudhuri and Vinterbo (2013). A stability-based validation procedure for differentially private machine learning. Advances in Neural Information Processing Systems, 1-19.
