Cryptography for Privacy Preserving Data Mining
CSE 4120: Technical Writing & Seminar
Submitted by: MD. MesbahUddin Khan, Roll – 0707059
Dated: June 14, 2011
Things we need to know
  • Privacy
  • Privacy Preserving
  • Privacy Preserving Computation
  • Secure Computations
  • Privacy Preserving Data Mining
  • Cryptography
Privacy (1/2)
Let's consider the following scenario:
  • Separate medical institutions wish to conduct joint research while preserving the privacy of their patients.
  • In this scenario the privileged information must be protected, yet it must also remain usable for research.
How can we solve this problem?
Privacy (2/2)
Therefore we need a protocol which:
  • is secure, i.e. it achieves what the original parties would otherwise require a third party for: doing the computation and leaving the results to the original parties.
  • limits information leakage in distributed computation.
Privacy Preserving
  • An ultra-large database holds a lot of transactional records.
  • Privacy preserving protocols are designed to preserve privacy even in the presence of adversarial participants.
  • Adversarial participants attempt to gather information about the inputs of their peers.
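Adversarial participants
Two types of adversaries:
  • Semi-honest adversary: also known as a passive, or honest-but-curious, adversary. It follows the protocol specification but tries to learn more than it should from the messages it receives.
  • Malicious adversary: may arbitrarily deviate from the protocol specification.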
Privacy Preserving Computations (1/3)
Classification
  • Separate parties try to build decision trees without disclosing the contents of their private databases.
  • Algorithms and split criteria: ID3, Gain Ratio, Gini Index, etc.
Data Clustering
  • Both parties want to jointly perform data clustering.
  • Performed based on data clustering principles.
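As a rough illustration of the split criteria named on this slide, the sketch below computes entropy, information gain (the quantity ID3 maximizes) and the Gini index from class counts. It is the plain, non-private calculation only; a privacy preserving protocol would have to obtain the same quantities from jointly held counts without revealing them. The function names are hypothetical, and every count list is assumed to be non-empty with a positive total.

```python
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a class-count vector, e.g. [5, 5]."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def gini(counts):
    """Gini index of a class-count vector: 1 minus the sum of squared class shares."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def information_gain(parent_counts, children_counts):
    """Entropy of the parent node minus the size-weighted entropy of its children."""
    total = sum(parent_counts)
    weighted = sum(sum(child) / total * entropy(child) for child in children_counts)
    return entropy(parent_counts) - weighted


if __name__ == "__main__":
    # A node with 5 positive and 5 negative records, split perfectly by an attribute.
    print(entropy([5, 5]))
    print(gini([5, 5]))
    print(information_gain([5, 5], [[5, 0], [0, 5]]))
```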
Privacy Preserving Computations (2/3)
Mining Association Rules
  • Both parties jointly find the association rules from their databases without revealing the information in the individual databases.
Fraud Detection
  • Two parties want to cooperate in preventing fraud in their systems without sharing their data patterns.
  • Their private databases contain sensitive data.
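To make the association rule case concrete, here is a minimal sketch that assumes the data is horizontally partitioned (each party holds complete transactions of its own). Each party reduces its database to three local counts, and the support and confidence of a rule A → B are computed from the combined counts, so no transaction leaves its owner. The class and function names are hypothetical. Note that in this sketch the local counts themselves are still disclosed; a real privacy preserving protocol would combine them cryptographically.

```python
from dataclasses import dataclass


@dataclass
class LocalCounts:
    transactions: int  # number of transactions held by this party
    antecedent: int    # transactions containing every item of A
    both: int          # transactions containing every item of A and B


def local_counts(transactions, antecedent, consequent):
    """Run by each party on its own database; only the counts are shared."""
    a = set(antecedent)
    ab = a | set(consequent)
    n_a = sum(1 for t in transactions if a <= set(t))
    n_ab = sum(1 for t in transactions if ab <= set(t))
    return LocalCounts(len(transactions), n_a, n_ab)


def global_rule_stats(counts):
    """Support and confidence of A -> B over the union of all databases."""
    total = sum(c.transactions for c in counts)
    n_a = sum(c.antecedent for c in counts)
    n_ab = sum(c.both for c in counts)
    support = n_ab / total
    confidence = n_ab / n_a if n_a else 0.0
    return support, confidence


if __name__ == "__main__":
    party1 = [["bread", "milk"], ["bread"], ["milk"]]
    party2 = [["bread", "milk"], ["bread", "milk", "eggs"]]
    counts = [local_counts(db, ["bread"], ["milk"]) for db in (party1, party2)]
    print(global_rule_stats(counts))
```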
Privacy Preserving Computations (3/3)
Profile Matching
  • Mr. X has a database of hacker profiles.
  • Mr. Y has recently traced the behavior of a person whom he suspects of being a hacker.
  • To check whether his suspicion is correct, Mr. Y needs to check Mr. X’s database.
  • Mr. X’s database must be protected because it contains sensitive hacker-related information.
  • Therefore, when Mr. Y enters the suspect’s behavior and searches Mr. X’s database, he cannot view the whole database; he only gets the comparison result for the matching behavior.
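One way such a comparison could work is a private matching protocol built on commutative blinding: each side raises hashed profiles to its own secret exponent, and because exponentiation commutes, doubly blinded values are equal exactly when the underlying profiles are equal. The sketch below is an assumption of this write-up, not a method taken from the slides. It simulates both parties in one process, uses a toy-sized prime, and is meant for semi-honest parties only. All names (`P`, `hash_to_group`, `profile_matches`, and so on) are hypothetical.

```python
import hashlib
import secrets
from math import gcd

# Toy modulus (the Mersenne prime 2^127 - 1). Far too small for real use;
# it only keeps the illustration short.
P = 2**127 - 1


def hash_to_group(profile: str) -> int:
    """Map a profile string to an integer in [1, P - 1]."""
    digest = hashlib.sha256(profile.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def random_exponent() -> int:
    """Secret exponent coprime to P - 1, so blinding is a bijection."""
    while True:
        k = secrets.randbelow(P - 1)
        if k > 1 and gcd(k, P - 1) == 1:
            return k


def blind(value: int, key: int) -> int:
    return pow(value, key, P)


def profile_matches(x_profiles, y_profile) -> bool:
    """Simulates both parties; returns what Mr. Y learns (match or no match)."""
    a = random_exponent()  # Mr. Y's secret
    b = random_exponent()  # Mr. X's secret

    # Mr. Y -> Mr. X: the suspect's behavior, blinded with a.
    y_blinded = blind(hash_to_group(y_profile), a)

    # Mr. X -> Mr. Y: X's own profiles blinded with b, plus Y's value
    # blinded a second time with b. In practice X would shuffle the list.
    x_blinded = [blind(hash_to_group(p), b) for p in x_profiles]
    y_double = blind(y_blinded, b)

    # Mr. Y adds his own blinding to X's values and compares.
    return any(blind(v, a) == y_double for v in x_blinded)


if __name__ == "__main__":
    database = ["port scan at night", "sql injection burst"]
    print(profile_matches(database, "port scan at night"))
    print(profile_matches(database, "normal login"))
```

In this sketch Mr. Y never sees Mr. X's profiles in the clear, and Mr. X never sees the behavior Mr. Y searched for; Mr. Y does learn how many profiles the database holds.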
Two distinct problems
Secure Computation:
  • which functions can be safely computed.
  • safety means that the privacy of individuals is preserved.
Privacy Preserving Data Mining:
  • compute results while minimizing the damage to privacy.
  • compute the results without pooling the data, and in a way that reveals nothing but the final results of the data mining computation.
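A small example of computing a result "without pooling the data" is a secure sum based on additive secret sharing. Each party splits its private value into random shares that add up to the value modulo a public number, hands one share to every party, and only sums of shares are ever published. The sketch below simulates all parties in one process and assumes semi-honest behavior, non-negative inputs, and a total smaller than the modulus. The names `MODULUS`, `make_shares` and `secure_sum` are hypothetical.

```python
import secrets

# Public modulus; assumed to be larger than any possible total.
MODULUS = 2**61 - 1


def make_shares(value, n_parties):
    """Split value into n_parties random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values):
    """Simulate the protocol: only share sums are combined, never raw inputs."""
    n = len(private_values)
    # shares[i][j] is the share that party i sends to party j.
    shares = [make_shares(v, n) for v in private_values]
    # Each party j adds up the shares it received and publishes that partial sum.
    partial = [sum(shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return sum(partial) % MODULUS


if __name__ == "__main__":
    # Three institutions, each with a private count.
    print(secure_sum([120, 75, 310]))  # 505
```

Each individual share is uniformly random, so a single share reveals nothing about the value behind it. The final total can still leak information on its own; with only two parties, for instance, each side can subtract its own input from the result.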
Cryptography
  • Cryptography is the practice and study of hiding information.
Concluding Remarks
  • Functions can be computed efficiently using specialized constructions.
  • A secure protocol for computing a certain function will always be more costly than a naive protocol.
