Privacy Preserving DB Systems


Privacy-preserving databases: how they are managed, built, and secured, with an introduction to the main anonymization techniques, privacy-preserving data mining, P3P, and Hippocratic databases.

Published in: Technology, News & Politics


  1. Privacy-Preserving Database Systems
     Presented by: Ashraf Bashir
  2. Agenda
     • Privacy
     • Data Privacy
     • Anonymization Techniques
     • Privacy-Preserving Data Mining
     • Platform for Privacy Preferences (P3P) Project
     • Hippocratic Databases
     • References
  3. Privacy
     • Privacy is the ability of an individual or group to seclude themselves, or information about themselves, and thereby reveal themselves selectively. [1]
     • Privacy is the right of individuals to determine for themselves when, how, and to what extent information about them is communicated to others. [2]
  4. Data Privacy
     • A term used wherever data relating to a person or persons are collected and stored, in digital form or otherwise.
     • Improper or non-existent disclosure control can be the root cause of privacy issues.
     • Research directions in privacy-preserving database systems:
       ◦ Anonymization Techniques
       ◦ Privacy-Preserving Data Mining
       ◦ P3P
       ◦ Hippocratic Databases
       ◦ Fine-Grained Access Control Techniques
  5. Anonymization Techniques
     • Under k-anonymity, anonymization prevents linking a record in a set of released records to a specific individual.
     • Any given record refers indistinguishably to at least k individuals.
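As an illustration of the k-anonymity property on slide 5, here is a minimal Python sketch that groups released records by their quasi-identifier values and checks that every group has at least k members. The sample table, the column names (`zip`, `age`, `diagnosis`), and the function names are hypothetical and do not come from the presentation.

```python
from collections import Counter


def smallest_group(records, quasi_identifiers):
    """Return the size of the smallest group of records that share
    the same values on all quasi-identifier columns (0 if no records)."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values(), default=0)


def is_k_anonymous(records, quasi_identifiers, k):
    """A release is k-anonymous if every record is indistinguishable
    from at least k - 1 others on the quasi-identifiers."""
    return smallest_group(records, quasi_identifiers) >= k


# Hypothetical released table: zip and age are already generalized.
released = [
    {"zip": "479**", "age": "20-29", "diagnosis": "flu"},
    {"zip": "479**", "age": "20-29", "diagnosis": "asthma"},
    {"zip": "478**", "age": "30-39", "diagnosis": "flu"},
    {"zip": "478**", "age": "30-39", "diagnosis": "diabetes"},
    {"zip": "478**", "age": "30-39", "diagnosis": "flu"},
]

if __name__ == "__main__":
    print(is_k_anonymous(released, ["zip", "age"], 2))  # True
    print(is_k_anonymous(released, ["zip", "age"], 3))  # False
```

The sensitive column (`diagnosis`) is deliberately left out of the grouping key: k-anonymity constrains only the attributes an attacker could use to link a record to a person.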
  6. Data Privacy (cont’d)
     • Example [3]:
  7. Data Privacy (cont’d)
     • Open Topic:
     • Given an arbitrary table, what is the minimum number of entries that must be “suppressed” in order to achieve k-anonymity?
  8. Data Privacy (cont’d)
     • Open Topic:
     • Given an arbitrary table, what is the minimum number of entries that must be “suppressed” in order to achieve k-anonymity?
     • This is an NP-hard problem!
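To make the suppression question on slides 7 and 8 concrete, the following Python sketch tries every set of cells in increasing size until the masked table becomes k-anonymous. It is only a brute-force illustration: its running time grows exponentially with the number of cells, so it is usable on toy tables only. The table and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def is_k_anonymous(rows, k):
    """Every distinct row must occur at least k times."""
    return all(count >= k for count in Counter(rows).values())


def min_suppression(rows, k):
    """Exhaustively search for the fewest cells to replace with '*'
    so that the table is k-anonymous. Returns (count, masked_rows),
    or None if no suppression can achieve k-anonymity."""
    if not rows:
        return 0, []
    cells = [(r, c) for r in range(len(rows)) for c in range(len(rows[0]))]
    for size in range(len(cells) + 1):
        for chosen in combinations(cells, size):
            hidden = set(chosen)
            masked = [
                tuple("*" if (r, c) in hidden else value
                      for c, value in enumerate(row))
                for r, row in enumerate(rows)
            ]
            if is_k_anonymous(masked, k):
                return size, masked
    return None  # fewer than k rows in total


# Hypothetical table with columns (age, sex).
table = [("30", "M"), ("30", "F"), ("40", "M"), ("40", "M")]

if __name__ == "__main__":
    size, masked = min_suppression(table, 2)
    print(size)    # 2: the sex of the first two rows is suppressed
    print(masked)
```

Because the subsets are visited from smallest to largest, the first k-anonymous result found is guaranteed to use the minimum number of suppressed entries.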
  9. Privacy-Preserving Data Mining
     • Most data mining applications assume that all data is available at a single central repository, called a data warehouse.
     • This poses a huge privacy problem, because breaching the security of that single repository exposes all the data.
  10. Privacy-Preserving Data Mining (cont’d)
     • Proposed solutions:
       ◦ Data swapping and randomization. [4]
       ◦ Extension of data mining techniques to preserve privacy.
       ◦ Distributed privacy-preserving data mining based on secure multi-party computation (SMC) techniques: several parties own different portions of the data, and each party wishes to share the data mining results without disclosing its original data to the other parties.
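One basic building block often used to explain the SMC idea on slide 10 is a secure sum based on additive secret sharing. The sketch below simulates it inside a single Python process, assuming honest-but-curious parties, non-negative integer inputs, and a total smaller than `MODULUS`. The function names and the modulus are illustrative choices, not part of the presentation.

```python
import secrets

# Illustrative modulus; all arithmetic on shares is done modulo this value.
MODULUS = 2**61 - 1


def share(value, n_parties):
    """Split value into n_parties random shares that add up to value
    modulo MODULUS. Any subset of fewer than n_parties shares is
    uniformly random and reveals nothing about value."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values):
    """Simulate the protocol: party i sends its j-th share to party j,
    each party publishes only the sum of the shares it received, and
    the published partial sums are added to obtain the total."""
    n = len(private_values)
    # distributed[i][j] is the share that party i sends to party j.
    distributed = [share(v, n) for v in private_values]
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % MODULUS
        for j in range(n)
    ]
    return sum(partial_sums) % MODULUS


if __name__ == "__main__":
    # Three parties each hold a private count; only the total is revealed.
    print(secure_sum([12, 7, 30]))  # 49
```

Each party sees only random-looking shares from the others plus the final total, which is the property the slide describes: the result is shared while the original data stays with its owner.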
  11. P3P [5]
     • P3P stands for the Platform for Privacy Preferences Project. [6]
     • It gives users a way to establish a contract with the server they want to communicate with; if the server breaks the contract, it can be held liable.
  12. P3P (cont’d)
     • Websites encode their privacy practices in a machine-readable format (XML):
       ◦ what information is collected
       ◦ who can access the data, and for what purposes
       ◦ how long the data will be stored by the sites
  13. P3P (cont’d)
  14. P3P (cont’d)
  15. P3P (cont’d)
  16. P3P (cont’d)
  17. P3P (cont’d)
  18. P3P (cont’d)
  19. P3P (cont’d)
  20. P3P (cont’d)
     • Is this scenario acceptable from a privacy standpoint?
  21. P3P (cont’d)
  22. P3P (cont’d)
     • Reasonable questions:
       ◦ 1- Is passing data through the safe zone reasonable?
       ◦ 2- Does the user need to accept the policy manually?
  23. P3P (cont’d)
  24. P3P (cont’d)
  25. P3P (cont’d)
     • Preference file sample
  26. P3P (cont’d)
     • Preference file sample
     • APPEL stands for “A P3P Preference Exchange Language”.
     • Annotations on the sample:
       ◦ 1. Block when ….
       ◦ 2. HTTP request-related information is …
       ◦ 3. delivered to other parties, i.e., when the P3P policy tag is set to public, delivery, or unrelated.
  27. P3P (cont’d)
     • Policy file sample
  28. P3P (cont’d)
     • Policy file sample
     • Annotations on the sample:
       ◦ If the user asks what information the content provider has stored, access will be granted.
       ◦ HTTP log files (called dynamic misc-data) are stored, just for data about the computer.
       ◦ Human-readable version.
       ◦ Nobody else gets access to these data.
       ◦ There is no limit on how long data is stored.
       ◦ A purpose is always required when querying for data.
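The interplay between the preference file (slide 26) and the policy file (slide 28) can be sketched in Python as follows. This is a deliberately simplified model, not real P3P or APPEL syntax: the `Statement` class, the `evaluate` function, the `"dynamic."` prefix test, and the `"block"` / `"request"` return values are assumptions made for illustration. Only the idea comes from the slides: block when HTTP request-related data is delivered to recipients tagged public, delivery, or unrelated.

```python
from dataclasses import dataclass

# Recipient tags that the sample preference on slide 26 refuses.
BLOCKED_RECIPIENTS = frozenset({"public", "delivery", "unrelated"})


@dataclass(frozen=True)
class Statement:
    """Simplified stand-in for one statement of a site's policy."""
    data: str              # what information is collected
    purpose: str           # why it is collected
    recipients: frozenset  # who can access it
    retention: str         # how long it is stored


def evaluate(statements, blocked=BLOCKED_RECIPIENTS, data_prefix="dynamic."):
    """Return "block" if any statement about request-related data
    (assumed here to be data names starting with data_prefix) names a
    blocked recipient; otherwise return "request"."""
    for statement in statements:
        if not statement.data.startswith(data_prefix):
            continue  # the rule only concerns request-related data
        if statement.recipients & blocked:
            return "block"
    return "request"


# Roughly what slide 28 describes: log data kept indefinitely,
# used only by the site itself.
sample_policy = [
    Statement(
        data="dynamic.miscdata",
        purpose="current",
        recipients=frozenset({"ours"}),
        retention="indefinitely",
    )
]

if __name__ == "__main__":
    print(evaluate(sample_policy))  # request
```

Under this model the sample policy passes the sample preference, because nobody outside the site receives the log data, even though the data is retained without limit.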
  29. Hippocratic Databases
     • “And about whatever I may see or hear in treatment, or even without treatment, in the life of human beings – things that should not ever be blurted out outside – I will remain silent, holding such things to be unutterable” – Hippocratic Oath [7]
     • Databases that include privacy as a central concern are called Hippocratic databases.
     • They incorporate privacy protection within relational database systems.
  30. Hippocratic DBs ten principles [2]
     • 1- Purpose Specification: For personal information stored in the database, the purposes for which the information has been collected shall be associated with that information.
     • 2- Purpose Consent: The purposes associated with personal information shall have consent of the donor of the personal information.
     • 3- Limited Collection: The personal information collected shall be limited to the minimum necessary for accomplishing the specified purposes.
  31. Hippocratic DBs ten principles (cont’d)
     • 4- Limited Use (queries just for purpose): The database shall run only those queries that are consistent with the purposes for which the information has been collected.
     • 5- Limited Disclosure (outer purpose): The personal information stored in the database shall not be communicated outside the database for purposes other than those for which there is consent from the donor of the information.
     • 6- Limited Retention: Personal information shall be retained only as long as necessary for the fulfillment of the purposes for which it has been collected.
  32. Hippocratic DBs ten principles (cont’d)
     • 7- Accuracy: Personal information stored in the database shall be accurate and up-to-date.
     • 8- Safety: Personal information shall be protected by security safeguards against theft and other misappropriations.
     • 9- Openness: A donor shall be able to access all information about the donor stored in the database.
     • 10- Compliance: A donor shall be able to verify compliance with the above principles.
  33. Hippocratic Databases (cont’d)
     • Uses privacy metadata, which consists of privacy policies and privacy authorizations stored in two tables:
       ◦ A privacy policy <attribute, purpose>
       ◦ A privacy authorization <user, purpose>
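The two metadata tables on slide 33 can be sketched as Python sets of pairs, with a check that enforces the Limited Use principle: a query returns an attribute only if the user is authorized for the stated purpose and the attribute is registered for that purpose. This follows the simplified <attribute, purpose> and <user, purpose> shapes from the slide; the sample rows, user names, and the function name `allowed_attributes` are hypothetical.

```python
# Privacy policy table: which attribute may be used for which purpose.
PRIVACY_POLICY = {
    ("name", "shipping"),
    ("address", "shipping"),
    ("email", "shipping"),
    ("email", "marketing"),
}

# Privacy authorization table: which user may act for which purpose.
PRIVACY_AUTHORIZATION = {
    ("alice", "shipping"),
    ("bob", "marketing"),
}


def allowed_attributes(user, purpose, requested):
    """Return the subset of requested attributes that the user may read
    for the given purpose. An unauthorized user gets nothing."""
    if (user, purpose) not in PRIVACY_AUTHORIZATION:
        return []
    return [a for a in requested if (a, purpose) in PRIVACY_POLICY]


if __name__ == "__main__":
    # bob is authorized for marketing, which covers only the email column.
    print(allowed_attributes("bob", "marketing", ["name", "email"]))
    # alice is not authorized for marketing at all.
    print(allowed_attributes("alice", "marketing", ["email"]))
```

Requiring a purpose on every call mirrors the idea that each query against a Hippocratic database is tied to the purpose for which the data was collected.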
  34. Hippocratic Databases (cont’d)
     • Example
  35. Hippocratic Databases (cont’d)
  36. Hippocratic Databases (cont’d)
  37. References
     • [1] Privacy. Scientific American Magazine, published September 18th, 2008.
     • [2] R. Agrawal, J. Kiernan, R. Srikant, and Yirong Xu. Hippocratic Databases. August 28th, 2002. International Conference on Very Large Data Bases, Hong Kong, China.
     • [3] Elisa Bertino. Privacy-Preserving Database Systems. March 24th, 2008.
     • [4] Richard A. Moore. Controlled Data-Swapping Techniques. Statistical Research Division, US Bureau of the Census, Washington, DC 20233.
     • [5] [last visited March 25th, 2009]
     • [6] Helena Lindskog and Stefan Lindskog. “Web Site Privacy with P3P”, Wiley Publishing, 2003 (chapter 5).
     • [7] Translation by Heinrich von Staden. In a Pure and Holy Way: Personal and Professional Conduct in the Hippocratic Oath. Journal of the History of Medicine and Allied Sciences 51 (1996) 406–408.
  38. Questions?