Literature Review: The Role of Signal Processing in Meeting Privacy Challenges: An Overview

Transcript of "Literature Review: The Role of Signal Processing in Meeting Privacy Challenges: An Overview"

  1. Signal Processing and Data Privacy: Literature Review
     By Kato Mivule, COSC891, Fall 2013
     Bowie State University, Department of Computer Science
     The Role of Signal Processing in Meeting Privacy Challenges: An Overview
     • Sankar, L.; Trappe, W.; Ramchandran, K.; Poor, H.V.; Debbah, M., "The Role of Signal Processing in Meeting Privacy Challenges: An Overview," IEEE Signal Processing Magazine, vol. 30, no. 5, pp. 95-106, Sept. 2013, doi: 10.1109/MSP.2013.2264541
  2. Introduction: Information Leakage Everywhere
     • The growth of information technology has made personal data easily available.
     • This overflow of unconstrained personal data raises privacy concerns.
     • Yet such data sources have remarkable value (utility) to their users.
     • This leads to a tension between data privacy and utility needs.
  3. Introduction: Information Leakage Everywhere
     • Users post information to social networks, unaware of the privacy risks.
     • Companies use the cloud for data processing, unaware of the privacy risks.
     • Data in the cloud is at risk of “leakage” of private information.
  4. Privacy differs from Security
     • Data privacy deals with confidentiality control.
     • Data security deals with access control.
     • Data privacy is the protection of data against illegal disclosure.
     • Data security is the protection of data against illegal access. [1, 2]
     • To illustrate: a house might be secured with locks to ensure access control, yet bystanders could still look inside from a distance if the windows have no curtains. Access is denied to the bystanders, but there is no privacy.
  5. Privacy differs from Cryptography
     • Cryptography works as an access control method.
     • Privacy works as a confidentiality control method.
     • After decryption, a plaintext database is exposed to ‘inside knowledge’ attacks.
     • After decryption, a plaintext database loses its confidentiality.
  6. Adversaries could be insiders
     • Every user is a potential adversary.
     • A database might be secure but still vulnerable to confidentiality breaches.
     • For example, a user may learn private information by inference.
     [Figure: Inference vulnerabilities. Image source: Sankar, Trappe, Ramchandran, Poor, Debbah (2013)]
  7. Attribute types in statistical databases
     • The authors distinguish public attributes and private attributes.
     • More on attributes:
       • PII (Personally Identifiable Information) attributes
       • Quasi attributes
       • Non-sensitive attributes
       • Sensitive attributes
  8. Privacy versus utility – a.k.a. the Utility-Privacy (U-P) tradeoff problem
     • Data utility: how beneficial a privatized dataset is to a user.
     • Data utility (usefulness) diminishes during the data privacy process:
       • When PII is removed.
       • When data is perturbed.
     • Finding an equilibrium between data privacy and utility needs is an intractable problem.
     • “Perfect privacy can be achieved by publishing nothing at all, but this has no utility; perfect utility can be obtained by publishing the data exactly as received, but this offers no privacy” Dwork (2006)
  9. Types of Privacy
     • Database privacy
     • Consumer privacy
     • Competitive privacy
  10. Data Privacy Mechanisms
     Covered by the authors: k-anonymity and differential privacy
     • Non-perturbative methods: original data values are not modified.
       • k-anonymity
       • l-diversity
       • Suppression
       • Generalization
     • Perturbative methods: original data values are transformed.
       • Noise addition
       • Multiplicative noise
       • Logarithmic noise
       • Differential privacy
  11. Data Privacy Mechanisms
     • Non-perturbative methods: original data values are not modified.
  12. Data Privacy Mechanisms – Perturbation methods – data values modified
     • Noise addition: random values are added to sensitive numerical attribute values to ensure privacy. The general expression is:
       • $X + \varepsilon = Z$
       • $X$ is the original continuous dataset, $\varepsilon$ is the set of random values (noise) with distribution $\varepsilon \sim N(0, \sigma^2)$ that is added to $X$, and $Z$ is the privatized dataset.
     • Multiplicative noise: random values with mean $\mu = 1$ and variance $\sigma^2$ are multiplied with the original values. The general expression is:
       • $X_j \varepsilon_j = Y_j$
     • Logarithmic noise: a logarithmic transformation of the original values is taken:
       • $\ln X_j = Y_j$
       • Random values $\varepsilon_j$ are then generated and added to the logarithmic values $Y_j$, producing the privatized values $Z_j$:
       • $Y_j + \varepsilon_j = Z_j$
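The three perturbation schemes on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample salary values are hypothetical, and the multiplicative noise is drawn from a Gaussian with mean 1, which the slide does not specify (it only gives the mean and variance).

```python
import math
import random


def additive_noise(values, sigma, rng):
    """Z = X + eps, with eps ~ N(0, sigma^2)."""
    return [x + rng.gauss(0.0, sigma) for x in values]


def multiplicative_noise(values, sigma, rng):
    """Y_j = X_j * eps_j, with eps_j of mean 1 and variance sigma^2.

    Assumption: eps_j is Gaussian; the slide names no distribution.
    """
    return [x * rng.gauss(1.0, sigma) for x in values]


def logarithmic_noise(values, sigma, rng):
    """Z_j = ln(X_j) + eps_j; every X_j must be strictly positive."""
    return [math.log(x) + rng.gauss(0.0, sigma) for x in values]


# Hypothetical sensitive attribute (illustrative values only).
rng = random.Random(0)
salaries = [52000.0, 61000.0, 58500.0, 73000.0]

z_additive = additive_noise(salaries, sigma=1000.0, rng=rng)
y_multiplicative = multiplicative_noise(salaries, sigma=0.05, rng=rng)
z_logarithmic = logarithmic_noise(salaries, sigma=0.05, rng=rng)
```

A larger `sigma` gives more privacy and less utility in all three cases, which is the trade-off the later slides quantify.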
  13. Data Privacy Mechanisms – Perturbation methods – data values modified
     Differential Privacy (DP):
     • DP is enforced by adding Laplace noise to query results from the database.
     • With DP, the users of the database cannot discern whether an item has been changed in that database.
     • A DP mechanism satisfies the following criterion, for databases $D_1$ and $D_2$ that differ in one item:
       • $\frac{P[q_n(D_1) \in R]}{P[q_n(D_2) \in R]} \le e^{\varepsilon}$
     • Laplace noise with location 0 and scale $b$, written $Laplace(0, b)$, is generated and added to $f(x)$, the original query response, such that:
       • $b = \frac{\Delta f}{\varepsilon}$
     • $\Delta f$ is the maximum difference in the query response (the most influential observation):
       • $\Delta f = \max |f(D_1) - f(D_2)|$
     • Finally, differentially private data $= f(x) + Laplace(0, b)$
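A minimal sketch of the Laplace mechanism described on this slide, assuming a counting query (whose sensitivity Δf is 1, since changing one record changes a count by at most 1). The function name, the ages list, and the choice of ε = 0.5 are hypothetical. The Laplace draw uses the fact that the difference of two independent exponential variables with mean b follows Laplace(0, b).

```python
import random


def laplace_mechanism(true_answer, sensitivity, epsilon, rng):
    """Return f(x) + Laplace(0, b) with scale b = sensitivity / epsilon."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    b = sensitivity / epsilon
    # Difference of two Exp(mean=b) draws is distributed as Laplace(0, b).
    noise = rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
    return true_answer + noise


# Hypothetical database and counting query: how many people are over 40?
ages = [23, 45, 31, 67, 52, 38, 41, 29]
true_count = sum(1 for age in ages if age > 40)

noisy_count = laplace_mechanism(
    true_count, sensitivity=1.0, epsilon=0.5, rng=random.Random(42)
)
```

A smaller ε gives a larger scale b and therefore noisier answers, so stronger privacy costs utility.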
  14. Utility-Privacy (U-P) Trade-off
     • Data sanitization is concerned with:
       • The statistics of the output that achieve a desired level of utility and privacy.
       • Deciding which input values to perturb.
       • How to probabilistically perturb values.
     • The U-P tradeoff framework requires the following three components:
       • A (statistical) model for the data
       • Measures for privacy and utility
       • A method to formalize the mappings from $X$ to $\hat{X}$
  15. Utility-Privacy (U-P) Trade-off
     • Data sanitization is concerned with:
       • The statistics of the output that achieve a desired level of utility and privacy.
       • Deciding which input values to perturb.
       • How to probabilistically perturb values.
     [Figure: Tension between privacy and utility. Image source: Sankar, Trappe, Ramchandran, Poor, Debbah (2013)]
  16. Data Utility Measure
     • Utility captures how close the revealed database is to the original.
     • A possible measure for the utility $u$ is the requirement that the average distortion of the public variables is upper bounded, for each $\epsilon > 0$ and all sufficiently large $n$:
       • $u \equiv E\left[\frac{1}{n} \sum_{i=1}^{n} \rho\left(X_{K_r,i}, \hat{X}_{K_r,i}\right)\right] \le D + \epsilon$
     • $\rho(\cdot, \cdot)$ is the distortion function.
     • $E$ is the expectation over the joint distribution of $(X_{K_r,i}, \hat{X}_{K_r,i})$.
     • The subscript $i$ denotes the $i$-th entry of the database.
     • Distance-based distortion examples include:
       • Euclidean distance
       • Hamming distance
       • Kullback-Leibler divergence
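As a rough illustration of the utility measure, the sketch below computes the empirical average distortion between an original column and its revealed version, using two of the distortion functions listed on the slide. The slide's definition takes an expectation over the joint distribution; here a single realization stands in for it, which is a simplifying assumption. All names and sample values are hypothetical.

```python
def squared_error(x, x_hat):
    """Squared Euclidean distortion for one numerical entry."""
    return (x - x_hat) ** 2


def hamming(x, x_hat):
    """Hamming distortion: 0 if the entry is unchanged, 1 otherwise."""
    return 0 if x == x_hat else 1


def average_distortion(original, revealed, rho):
    """Empirical (1/n) * sum_i rho(x_i, x_hat_i) over n database entries."""
    if len(original) != len(revealed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(rho(x, x_hat) for x, x_hat in zip(original, revealed))
    return total / len(original)


def meets_utility_bound(original, revealed, rho, max_distortion):
    """Check the constraint: average distortion <= D."""
    return average_distortion(original, revealed, rho) <= max_distortion


# Hypothetical public attribute before and after sanitization.
original_ages = [34, 45, 29, 61]
revealed_ages = [34, 47, 29, 60]

mse = average_distortion(original_ages, revealed_ages, squared_error)
changed_fraction = average_distortion(original_ages, revealed_ages, hamming)
```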
  17. Privacy Quantification
     • Entropy is used as a measure of information or uncertainty.
     • Privacy requires that there is randomness or uncertainty in all the private variables:
       • $e \equiv \frac{1}{n} H(X_{K_h} \mid J) \ge E - \epsilon$
     • $H(\cdot \mid \cdot)$ is Shannon’s conditional entropy.
     • $X$ and $Y$ are two random variables with a joint distribution $p_{XY}$.
     • The conditional entropy is $H(X \mid Y) = -\sum_{(x,y)} p_{XY}(x, y) \log p_{X|Y}(x \mid y)$
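The conditional entropy formula can be evaluated directly from a joint probability table. The sketch below is one way to do that, assuming the joint distribution is given as a dictionary keyed by (x, y) pairs and using base-2 logarithms (bits); the slide does not fix the base. The function name and the two example distributions are hypothetical.

```python
import math
from collections import defaultdict


def conditional_entropy(joint):
    """H(X|Y) in bits, from a dict mapping (x, y) to p_XY(x, y)."""
    # Marginal p_Y(y) = sum over x of p_XY(x, y).
    p_y = defaultdict(float)
    for (_, y), p in joint.items():
        p_y[y] += p

    entropy = 0.0
    for (_, y), p in joint.items():
        if p > 0:
            # p_{X|Y}(x|y) = p_XY(x, y) / p_Y(y)
            entropy -= p * math.log2(p / p_y[y])
    return entropy


# Private bit X independent of the revealed bit Y: full uncertainty remains.
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

# Revealed bit equals the private bit: no uncertainty remains.
fully_leaked = {(0, 0): 0.5, (1, 1): 0.5}

h_independent = conditional_entropy(independent)
h_leaked = conditional_entropy(fully_leaked)
```

In the slide's terms, the first table has the highest privacy for a single bit and the second has none.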
  18. Signal Processing applications of Data Privacy
     Categories of database privacy:
     • Statistical Data Privacy: guaranteeing the privacy of any individual in a database that is used for statistical information processing (utility).
     • Competitive Privacy: information sharing for a common system good (utility) between the competing agents that comprise the system.
     • Consumer Privacy: guaranteeing privacy in smart devices.
     • Image Classification Privacy: privacy guarantees in biometric identification.
       • The FBI is to spend US$1 billion on a face recognition system to scan surveillance video.
       • Civil rights groups are raising objections about possible privacy violations.
       • Privacy-preserving algorithms could be employed to find a balance by focusing on criminals only.
  19. Signal Processing applications of Data Privacy
     Categories of database privacy:
     • Statistical Data Privacy
     • Competitive Privacy
     • Consumer Privacy
     • Image Classification Privacy
     [Figure: Database privacy categories. Image source: Sankar, Trappe, Ramchandran, Poor, Debbah (2013)]
  20. Conclusion
     • The paper gives a general overview of the data privacy and utility problem.
     • Most data privacy implementations center on data perturbation methods.
     • Signal processing could be applied to:
       • Finding the optimal balance between privacy and utility.
       • Filtering out unneeded noise during the perturbation process.
     • The paper focused largely on the data privacy and utility problem.
     • The paper offered a new quantification approach to the data privacy and utility problem.
     • The actual application and implementation of signal processing to data privacy problems is left to the readers.
  21. References
     • Sankar, L.; Trappe, W.; Ramchandran, K.; Poor, H.V.; Debbah, M., "The Role of Signal Processing in Meeting Privacy Challenges: An Overview," IEEE Signal Processing Magazine, vol. 30, no. 5, pp. 95-106, Sept. 2013, doi: 10.1109/MSP.2013.2264541
     • Mivule, K.; Turner, C., "A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Using Machine Learning Classification as a Gauge," Complex Adaptive Systems 2013, Nov. 13-15, 2013, Baltimore, MD, USA (in press)
     • Mivule, K.; Turner, C., "A Review of Privacy Essentials for Confidential Mobile Data Transactions," eprint arXiv:1309.3953, Sept. 2013, online: http://arxiv.org/pdf/1309.3953v1
     • Mivule, K., "Utilizing Noise Addition for Data Privacy, an Overview," Proceedings of the International Conference on Information and Knowledge Engineering (IKE 2012), pp. 65-71, Las Vegas, NV, USA