Cryptography for privacy preserving data mining


This was my first seminar topic; it wasn't that good a presentation, though...

Published in: Technology

  1. Cryptography For Privacy Preserving Data Mining
     CSE 4120: Technical Writing & Seminar
     Submitted by: MD. MesbahUddin Khan, Roll: 0707059
     Dated June 14, 2011
  2. Things we need to know
     • Privacy
     • Privacy Preserving
     • Privacy Preserving Computation
     • Secure Computations
     • Privacy Preserving Data Mining
     • Cryptography
  3. Privacy (1/2)
     Let's consider the following scenario:
     • Separate medical institutions wish to conduct joint research while preserving the privacy of their patients.
  4. • In this scenario, privileged information must be protected, yet it must also remain usable for research.
     How can we solve this problem?
  5. Privacy (2/2)
     Therefore we need a protocol which:
     • is secure, i.e. it behaves as if the original parties had handed their inputs to a trusted third party who performs the computation and returns only the results to the original parties.
     • limits information leakage in distributed computation.
  6. Privacy Preserving
     • Ultra-large databases hold a large number of transactional records.
     • Privacy preserving protocols are designed to preserve privacy even in the presence of adversarial participants.
     • Adversarial participants attempt to gather information about the inputs of their peers.
  7. Adversarial participants
     Two types of adversaries:
     • Semi-honest adversary: also known as a passive, or honest-but-curious, adversary.
     • Malicious adversary: may arbitrarily deviate from the protocol specification.
  8. Privacy Preserving Computations (1/3)
     Classification
     • Separate parties try to build decision trees without disclosing the contents of their private databases.
     • Algorithms and split criteria: ID3, Gain Ratio, Gini Index, etc.
     Data Clustering
     • Both parties want to jointly cluster their data.
     • The clustering follows standard data clustering principles.
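The slide above names the Gini Index as one possible split criterion. As a minimal sketch, assuming the standard Gini impurity formula and not the presenter's own method, the following shows that scoring a tree node needs only per-class counts, so two parties could combine counts rather than records. The function name `gini_impurity` and the example counts are hypothetical, and the counts are added in the clear here; a privacy preserving protocol would compute that sum securely.

```python
from collections import Counter


def gini_impurity(class_counts):
    """Gini impurity 1 - sum(p_i^2), computed from per-class record counts."""
    total = sum(class_counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in class_counts.values())


# Hypothetical per-class counts held by two parties for the same tree node.
party_a = Counter({"yes": 3, "no": 1})
party_b = Counter({"yes": 1, "no": 3})

# Only the summed counts are needed to score the node, not the records.
combined = party_a + party_b
node_impurity = gini_impurity(combined)
```

The design point is that the criterion is a function of aggregates, which is what makes a joint computation over separate databases plausible in the first place.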
  9. Privacy Preserving Computations (2/3)
     Mining Association Rules
     • Both parties jointly find the association rules in their databases without revealing the information in the individual databases.
     Fraud Detection
     • Two parties want to cooperate in preventing fraud in their systems, without sharing their data patterns.
     • Each private database contains sensitive data.
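To make the association rule case concrete, here is a rough sketch of how the support and confidence of a rule A → B could be assembled from per-party counts. It assumes the data is horizontally partitioned (each party holds complete transactions), which the slides do not state. The helper names `local_counts` and `rule_metrics` and the sample transactions are hypothetical, and the counts are exchanged in the clear, so this only illustrates which aggregates a private protocol would need to combine.

```python
def local_counts(transactions, antecedent, consequent):
    """Return (n_transactions, n_with_antecedent, n_with_both) for one database."""
    antecedent, consequent = set(antecedent), set(consequent)
    n_ante = sum(1 for t in transactions if antecedent <= set(t))
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    return len(transactions), n_ante, n_both


def rule_metrics(counts_per_party):
    """Combine per-party counts into global support and confidence."""
    total = sum(c[0] for c in counts_per_party)
    n_ante = sum(c[1] for c in counts_per_party)
    n_both = sum(c[2] for c in counts_per_party)
    support = n_both / total if total else 0.0
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# Hypothetical transaction databases held by two parties.
db_a = [{"bread", "milk"}, {"bread"}, {"milk"}]
db_b = [{"bread", "milk"}, {"bread", "milk", "eggs"}]

# Each party computes its counts locally; only the counts are combined.
counts = [local_counts(db, {"bread"}, {"milk"}) for db in (db_a, db_b)]
support, confidence = rule_metrics(counts)
```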
  10. Privacy Preserving Computations (3/3)
      Profile Matching
      • Mr. X has a database of hacker profiles.
      • Mr. Y has recently traced the behavior of a person whom he suspects of being a hacker.
      • To check whether his suspicion is correct, Mr. Y needs to check Mr. X's database.
      • Mr. X's database needs to be protected because it contains sensitive hacker-related information.
      • Therefore, when Mr. Y enters the hacker's behavior and searches Mr. X's database, he cannot view the whole database; he only gets the comparison result for the matching behavior.
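One way to picture the profile matching scenario is a Diffie-Hellman-style commutative blinding, a technique swapped in here for illustration and not taken from the slides. In this toy sketch both parties are simulated in one process, the modulus is a small illustrative prime, and the names `hash_to_group`, `new_secret`, `blind` and `matches` are hypothetical. It assumes semi-honest parties, and Mr. Y still learns the size of Mr. X's database.

```python
import hashlib
import math
import secrets

# Toy modulus (the Mersenne prime 2**127 - 1); an illustration only,
# a real deployment would use a vetted group and parameters.
P = 2**127 - 1


def hash_to_group(item):
    """Map a profile string to a nonzero residue modulo P."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def new_secret():
    """Pick a secret exponent that is invertible modulo P - 1."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k


def blind(value, secret):
    """Raise a group element to a secret exponent modulo P."""
    return pow(value, secret, P)


def matches(query, database):
    """Simulate both parties in one process and return only the match bit."""
    a = new_secret()  # Mr. X's secret exponent
    b = new_secret()  # Mr. Y's secret exponent

    # Mr. Y blinds his query and sends it; Mr. X blinds it again and returns it.
    double_blinded_query = blind(blind(hash_to_group(query), b), a)

    # Mr. X sends his blinded profiles; Mr. Y adds his own blinding on top.
    blinded_db = [blind(hash_to_group(p), a) for p in database]
    double_blinded_db = {blind(v, b) for v in blinded_db}

    # Exponentiation commutes, so equal profiles give equal double-blinded values.
    return double_blinded_query in double_blinded_db
```

For example, `matches("sql injection probe", ["sql injection probe", "port scan"])` reports a match, while neither side ever sees the other's profile strings in the clear.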
  11. Two distinct problems
      Secure Computation:
      • Asks which functions can be safely computed.
      • Safety means that the privacy of individuals is preserved.
      Privacy Preserving Data Mining:
      • Compute results while minimizing the damage to privacy.
      • Compute the results without pooling the data, and in a way that reveals nothing but the final results of the data mining computation.
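The idea of computing a result without pooling the data can be sketched with a secure sum based on additive secret sharing. This is a standard textbook construction chosen for illustration, not something the slides describe. The sketch simulates all parties in one process and assumes semi-honest participants with private pairwise channels; `MODULUS`, `make_shares` and `secure_sum` are hypothetical names.

```python
import secrets

MODULUS = 2**61 - 1  # public modulus, assumed larger than any possible total


def make_shares(value, n_parties):
    """Split value into n_parties random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_inputs):
    """Simulate all parties in one process and return only the total."""
    n = len(private_inputs)
    # Row i holds the shares produced by party i; column j goes to party j.
    share_matrix = [make_shares(v, n) for v in private_inputs]
    # Each party publishes only the sum of the shares it received.
    partial_sums = [sum(row[j] for row in share_matrix) % MODULUS for j in range(n)]
    return sum(partial_sums) % MODULUS
```

Calling `secure_sum([120, 45, 300])` yields the total of the three inputs, while each individual share and each published partial sum looks random on its own.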
  12. Cryptography
      Cryptography is the practice and study of hiding information.
  13. Concluding Remarks
      • Some functions can be computed efficiently using specialized constructions.
      • A secure protocol for computing a certain function will always be more costly than a naive protocol that provides no security.
  14. Thank you all…