Privacy Preserving in Data Mining With Hybrid Approach
Guided by: Prof. Paresh M. Solanki
Presented by: Narenndra Dhadhal, M.Tech. (III) IT, 14014021007
OUTLINE
1) Introduction to PPDM
2) Need for Privacy
3) Privacy Preserving Techniques
4) Literature Survey
5) K-Anonymization
6) Proposed Work
7) References
Introduction
• Privacy preservation is one of the most important research topics in the data security field, and in recent years it has become a serious concern in the secure transformation of personal data. [1]
• A number of algorithmic techniques have been designed for Privacy Preserving Data Mining (PPDM). [1]
Introduction (cont.)
• PPDM is used to protect individual privacy efficiently in data sharing. [1]
• Various models have therefore been designed for privacy preserving data sharing. [1]
• The survey in [1] analyzes these privacy preserving approaches to data sharing along with their merits and demerits. [1]
Need for Privacy[2]
• Privacy preserving data mining has become increasingly popular because it allows privacy-sensitive data to be shared for analysis purposes.
• Suppose a hospital has some person-specific patient data which it wants to publish.
• It wants to publish the data such that:
   - the information remains practically useful
   - the identity of an individual cannot be determined
Need for Privacy[4]
     Non-Sensitive Data          Sensitive Data
#    Zip     Age   Nationality   Name    Condition
1    13053   28    Indian        Kumar   Heart Disease
2    13067   29    American      Bob     Heart Disease
3    13053   35    Canadian      Ivan    Viral Infection
4    13067   36    Japanese      Umeko   Cancer
Fig 1: Sensitive and Non-Sensitive Data. [4]
• Quasi-identifiers are a set of attributes that could potentially identify a record owner when combined with publicly available data.
• Sensitive attributes are a set of attributes that contain sensitive person-specific information such as disease or salary.
• Non-sensitive attributes are a set of attributes that create no problem if revealed, even to untrustworthy parties.
Need for Privacy[5]
Published Data
     Non-Sensitive Data          Sensitive Data
#    Zip     Age   Nationality   Condition
1    13053   28    Indian        Heart Disease
2    13067   29    American      Heart Disease
3    13053   35    Canadian      Viral Infection
4    13067   36    Japanese      Cancer

#    Name    Zip     Age   Nationality
1    John    13053   28    American
2    Bob     13067   29    American
3    Chris   13053   23    American

Data leak! Bob's record matches published record 2 on Zip, Age, and Nationality, which reveals his condition.
Fig 2: Sensitive and Non-Sensitive Data Leak. [4]
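The leak in Fig 2 can be reproduced with a simple join on the quasi-identifiers. The following is a minimal sketch, not code from the slides: the function name `link_records`, the dictionary keys, and the rule "report a leak when exactly one published row matches" are assumptions made for illustration.

```python
# Published records: quasi-identifiers plus the sensitive attribute (values from Fig 2).
published = [
    {"zip": "13053", "age": 28, "nationality": "Indian", "condition": "Heart Disease"},
    {"zip": "13067", "age": 29, "nationality": "American", "condition": "Heart Disease"},
    {"zip": "13053", "age": 35, "nationality": "Canadian", "condition": "Viral Infection"},
    {"zip": "13067", "age": 36, "nationality": "Japanese", "condition": "Cancer"},
]

# Second table from Fig 2: the same quasi-identifiers, but with names attached.
external = [
    {"name": "John", "zip": "13053", "age": 28, "nationality": "American"},
    {"name": "Bob", "zip": "13067", "age": 29, "nationality": "American"},
    {"name": "Chris", "zip": "13053", "age": 23, "nationality": "American"},
]

QUASI_IDENTIFIERS = ("zip", "age", "nationality")


def link_records(published_rows, external_rows, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return (name, condition) pairs for named rows that match exactly one published row."""
    leaks = []
    for person in external_rows:
        key = tuple(person[q] for q in quasi_identifiers)
        matches = [row for row in published_rows
                   if tuple(row[q] for q in quasi_identifiers) == key]
        if len(matches) == 1:  # a unique match re-identifies the person
            leaks.append((person["name"], matches[0]["condition"]))
    return leaks


leaks = link_records(published, external)
print(leaks)
```

Only Bob matches a published row on all three quasi-identifiers, so only his condition is disclosed; John and Chris differ in nationality or age and are not linked.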
Privacy Preserving Techniques
• The important techniques of Privacy Preserving Data Mining are: [3]
1) The randomization method
2) The encryption method
3) The anonymization method
1. The Randomization Method [3]
• The randomization method is an important and popular method among current privacy preserving data mining techniques.
• It masks the values of the records by adding additional data (noise) to the original data.
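As an illustration of masking by added noise, the sketch below perturbs a numeric attribute with zero-mean Gaussian noise. The slides do not fix a noise distribution, so the Gaussian choice, the function name `randomize`, and the parameter `sigma` are assumptions.

```python
import random
import statistics


def randomize(values, sigma, seed=None):
    """Mask each value by adding independent zero-mean Gaussian noise (assumed distribution)."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in values]


ages = [28, 29, 35, 36, 26, 24, 31, 39]
masked = randomize(ages, sigma=5.0, seed=42)

print(masked)
# Individual values are hidden, while the mean stays roughly comparable.
print(statistics.mean(ages), statistics.mean(masked))
```

Because the noise has zero mean, aggregate statistics such as the mean are approximately preserved on large samples, which is what keeps the masked data useful for mining.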
2. The Encryption Method [3]
• The encryption method mainly addresses the setting in which several parties jointly conduct mining tasks based on the private inputs they each provide.
• These privacy mining tasks could occur between mutually untrusted parties, or even between competitors.
• Protecting privacy therefore becomes an important concern in the distributed data mining setting.
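The slides do not name a specific protocol for this setting. As one hedged illustration, the sketch below simulates a secure sum based on additive secret sharing, a standard building block for joint computation over private inputs. The names `make_shares`, `secure_sum`, and `MODULUS` are hypothetical, all parties are simulated in one process, and `random.Random` stands in for a cryptographically secure generator.

```python
import random

MODULUS = 2**61 - 1  # hypothetical public modulus, assumed larger than any possible total


def make_shares(secret, n_parties, rng):
    """Split secret into n_parties random shares that sum to secret modulo MODULUS."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_inputs, seed=None):
    """Compute the total of all private inputs using only shares and partial sums."""
    rng = random.Random(seed)  # illustration only; not cryptographically secure
    n = len(private_inputs)
    # Row i holds the shares party i sends out; column j is what party j receives.
    distributed = [make_shares(x, n, rng) for x in private_inputs]
    # Each party publishes only the sum of the shares it received.
    partial_sums = [sum(distributed[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return sum(partial_sums) % MODULUS


counts = [120, 45, 310]  # e.g. local counts held by three parties
total = secure_sum(counts, seed=7)
print(total)
```

Each party sees only random-looking shares and partial sums, yet the published partial sums add up to the true total of the private inputs.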
3. The Anonymization Method [3]
• The anonymization method aims to make an individual record indistinguishable within a group of records by using generalization and suppression techniques.
• K-anonymity is the representative anonymization method.
Literature Survey[1]
Title: Privacy Preserving Data Mining Techniques-Survey
Author: Ms. Dhanalakshmi.M, Mrs. Siva Sankari (2014)
Summary: This paper discusses models of privacy preservation: the Trusted Third Party model, the Semi-honest model, the Malicious model, and other models such as Incentive Compatibility. It also surveys privacy preserving techniques such as the randomization, anonymization, and encryption methods.
Issues/Challenges: Personalized privacy preservation will become an issue.
Literature Survey[2]
Title: A Survey on Privacy Preserving Data Mining
Author: K. Saranya, K. Premalatha, S. S. Rajasekar (2015)
Summary: This paper presents a brief survey of standard techniques for privacy preserving data mining, namely classification, clustering, and association rule mining.
Issues/Challenges: The merits and demerits of the different techniques are pointed out. As future work, a hybrid approach combining these techniques is proposed.
Literature Survey[3]
Title: A Survey on Privacy Preserving Data Mining
Author: Jian Wang, Yongcheng Luo, Yan Zhao, Jiajin Le (2009)
Summary: This paper reviews several privacy preserving data mining technologies and then analyzes their merits and shortcomings.
Issues/Challenges: The limitations of the k-anonymity model stem from two assumptions. First, it may be very hard for the owner of a database to determine which of the attributes are or are not available in external tables. Second, the k-anonymity model assumes a certain method of attack, while in real scenarios there is no reason why the attacker should not try other methods.
Literature Survey[4]
Title: A Survey on Anonymity-based Privacy Preserving
Author: Jian Wang, Yongcheng Luo, Shuo Jiang, Jiajin Le (2009)
Summary: The authors first show that a k-anonymous dataset permits strong attacks due to a lack of diversity in the sensitive attributes.
Issues/Challenges: While k-anonymity protects against identity disclosure, it does not provide sufficient protection against attribute disclosure.
Literature Survey[5]
Title: Analysis of Privacy Preserving K-Anonymity Methods and Techniques
Author: S. Vijayarani, A. Tamilarasi, M. Sampoorna (2010)
Summary: This paper presents a survey of recent approaches that have been applied to the k-anonymity problem. Two main techniques have been proposed for enforcing k-anonymity on a private table: generalization and suppression.
Issues/Challenges: Threats to k-anonymity that can arise from performing mining on a collection of data, and the approaches to combining k-anonymity with data mining.
Literature Survey[6]
Title: Privacy Preserving in Data Mining Using Hybrid Approach
Author: Savita Lohiya, Lata Ragha (2012)
Summary: This paper proposes a hybrid approach to privacy preservation. The original data is first randomized, and generalization is then applied to the randomized (modified) data. According to the paper, this technique protects private data with better accuracy, can reconstruct the original data, and provides data with no information loss, which keeps the data usable.
Issues/Challenges: The k-anonymity method has the shortcoming of being vulnerable to homogeneity and background knowledge attacks.
K-Anonymization
• Data anonymization is a type of information sanitization whose intent is privacy protection. [6]
• It is the process of either encrypting or removing personally identifiable information from data sets, so that the people whom the data describe remain anonymous. [6]
• For example, a hospital may release patient records so that researchers can study the characteristics of various diseases. [6]
K-Anonymization
• There are two common methods for achieving k-anonymity for some value of k. [3]
• Suppression: In this method, certain values of the attributes are replaced by an asterisk '*'. All or some values of a column may be replaced by '*'. [3]
• Generalization: In this method, individual values of attributes are replaced with a broader category. For example, the value '33' of the attribute 'Age' may be replaced by '< 40', the value '24' by '20 < Age ≤ 30', etc. [3]
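The two methods above can be sketched as small helper functions. This is an illustrative sketch only: the function names, the ten-year age bands, and the choice to keep three ZIP digits are assumptions, and `<=` is written in ASCII in place of '≤'.

```python
def generalize_age(age):
    """Replace an exact age with a ten-year band such as '20 < Age <= 30'."""
    lower = ((age - 1) // 10) * 10
    return f"{lower} < Age <= {lower + 10}"


def generalize_zip(zip_code, keep=3):
    """Keep the first `keep` digits of a ZIP code and mask the rest with '*'."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def suppress(_value):
    """Replace a value entirely with '*' (cell-level suppression)."""
    return "*"


record = {"zip": "13053", "age": 24, "nationality": "Indian"}
anonymized = {
    "zip": generalize_zip(record["zip"]),
    "age": generalize_age(record["age"]),
    "nationality": suppress(record["nationality"]),
}
print(anonymized)
```

Wider bands or fewer retained ZIP digits give larger groups of identical records, at the cost of less precise data.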
K-Anonymization (cont.)
#    Zip     Age    Nationality   Condition
1    130**   < 40   *             Heart Disease
2    130**   < 40   *             Heart Disease
3    130**   < 40   *             Viral Infection
4    130**   < 40   *             Cancer
Generalization: Zip, Age. Suppression (cell-level): Nationality.
Fig 3: Generalization and Suppression. [2]
K-Anonymization (cont.) [4]

TABLE I. MICRODATA
ID   Age   Sex   Zip Code   Disease
1    26    M     83661      Headache
2    24    M     83634      Headache
3    31    M     83967      Viral Infection
4    39    F     83949      Cough

TABLE II. VOTER REGISTRATION LIST
ID   Name   Age   Sex   Zip Code
1    Jim    26    M     83661
2    Jay    24    M     83634
3    Tom    31    M     83967
4    Lily   39    F     83949
Classification of Attributes
1) Key attributes: [5]
Name, address, phone number: uniquely identifying. Always removed before release.
2) Quasi-identifiers: [5]
A set of features whose associated values may be useful for linking with another data set to re-identify the entity that is the subject of the data.
Example: the combination (5-digit ZIP code, birth date, gender) can be uniquely identifying.
K-Anonymization (cont.) [4]

TABLE III. 2-ANONYMOUS TABLE
ID   Age   Sex   Zip Code   Disease
1    2*    M     836**      Headache
2    2*    M     836**      Headache
3    3*    *     839**      Viral Infection
4    3*    *     839**      Cough
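Whether a released table such as Table III is k-anonymous can be checked by grouping rows on their quasi-identifier values and taking the smallest group size. The sketch below assumes Age, Sex, and Zip Code are the quasi-identifiers; the function name `anonymity_level` is hypothetical, and the function expects a non-empty table.

```python
from collections import Counter

# Rows of Table III; Disease is the sensitive attribute.
table_iii = [
    {"age": "2*", "sex": "M", "zip": "836**", "disease": "Headache"},
    {"age": "2*", "sex": "M", "zip": "836**", "disease": "Headache"},
    {"age": "3*", "sex": "*", "zip": "839**", "disease": "Viral Infection"},
    {"age": "3*", "sex": "*", "zip": "839**", "disease": "Cough"},
]


def anonymity_level(rows, quasi_identifiers):
    """Return the size of the smallest equivalence class (the largest k that holds)."""
    classes = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(classes.values())


k = anonymity_level(table_iii, ("age", "sex", "zip"))
print(k)
```

Every combination of quasi-identifier values appears in at least two rows, so the table is 2-anonymous.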
K-Anonymization [3]
• In general, k-anonymity guarantees that an individual can be associated with his or her real tuple with probability at most 1/k.
• While k-anonymity protects against identity disclosure, it does not provide sufficient protection against attribute disclosure.
• Two attacks were identified: the homogeneity attack and the background knowledge attack.
K-Anonymization [6]
• Homogeneity attack: Suppose Jay knows that Jim is a 26-year-old man whose zip code is 83661. Jay can conclude that Jim corresponds to a record in the first equivalence class and thus must have a headache, because every record in that class has the same disease.
• Background knowledge attack: Suppose that, by knowing Lily's age and zip code, Jay can conclude that Lily corresponds to a record in the last equivalence class. Furthermore, suppose that Jay knows that Lily has a very low risk of viral infection. This background knowledge enables Jay to conclude that Lily most likely has a cough.
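Both attacks can be illustrated on the 2-anonymous table by listing the sensitive values inside each equivalence class. The sketch below is an illustration under assumptions: the helper name `sensitive_values_by_class` is hypothetical, and background knowledge is modelled simply as ruling out one disease.

```python
from collections import defaultdict

# Rows of the 2-anonymous table (Table III).
table_iii = [
    {"age": "2*", "sex": "M", "zip": "836**", "disease": "Headache"},
    {"age": "2*", "sex": "M", "zip": "836**", "disease": "Headache"},
    {"age": "3*", "sex": "*", "zip": "839**", "disease": "Viral Infection"},
    {"age": "3*", "sex": "*", "zip": "839**", "disease": "Cough"},
]
QUASI_IDENTIFIERS = ("age", "sex", "zip")


def sensitive_values_by_class(rows, quasi_identifiers, sensitive):
    """Map each equivalence class to the set of sensitive values it contains."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].add(row[sensitive])
    return dict(groups)


groups = sensitive_values_by_class(table_iii, QUASI_IDENTIFIERS, "disease")

# Homogeneity attack: a class with a single sensitive value discloses it for every member.
homogeneous = [key for key, values in groups.items() if len(values) == 1]

# Background knowledge attack: ruling out one value in Lily's class leaves a single candidate.
lily_class = ("3*", "*", "839**")
remaining = groups[lily_class] - {"Viral Infection"}

print(homogeneous)
print(remaining)
```

Jim's class contains only "Headache", and once "Viral Infection" is ruled out for Lily only "Cough" remains, even though the table satisfies 2-anonymity.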
Proposed Work [5]
• In today's world, privacy is a major concern in protecting sensitive data. People are very concerned about sensitive information that they do not want to share.
• The proposed method combines k-anonymity with a perturbation technique.
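The slides do not specify how the two techniques are combined. One possible reading, following the order described for the hybrid approach of [6] (randomize first, then generalize), is sketched below. The Gaussian noise, the interval width, and all function names are assumptions, not the presenter's method.

```python
import random


def perturb(value, sigma, rng):
    """Add zero-mean Gaussian noise (assumed distribution) and round to an integer."""
    return round(value + rng.gauss(0.0, sigma))


def generalize(value, width=10):
    """Map a number to the half-open interval [lower, lower + width) that contains it."""
    lower = (value // width) * width
    return f"[{lower}, {lower + width})"


def hybrid_anonymize(ages, sigma=2.0, width=10, seed=None):
    """Perturb each age first, then generalize the perturbed value."""
    rng = random.Random(seed)
    return [generalize(perturb(a, sigma, rng), width) for a in ages]


ages = [26, 24, 31, 39]
released = hybrid_anonymize(ages, seed=0)
print(released)
```

This pipeline does not by itself guarantee k-anonymity; the released table would still need to be checked for the minimum equivalence class size.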
References
[1] Dhanalakshmi, M., and E. Siva Sankari. "Privacy preserving data mining techniques-survey." Information Communication and Embedded Systems (ICICES), 2014 International Conference on. IEEE, 2014.
[2] Saranya, K., K. Premalatha, and S. S. Rajasekar. "A Survey on Privacy Preserving Data Mining." International Journal of Innovations & Advancement in Computer Science, 2015. IEEE, 2015.
[3] Wang, Jian, et al. "A survey on privacy preserving data mining." Database Technology and Applications, 2009 First International Workshop on. IEEE, 2009.
[4] Wang, Jian, et al. "A survey on anonymity-based privacy preserving." E-Business and Information System Security, 2009. EBISS'09. International Conference on. IEEE, 2009.
[5] Vijayarani, S., A. Tamilarasi, and M. Sampoorna. "Analysis of privacy preserving k-anonymity methods and techniques." Communication and Computational Intelligence (INCOCCI), 2010 International Conference on. IEEE, 2010.
[6] Lohiya, Savita, and Lata Ragha. "Privacy Preserving in Data Mining Using Hybrid Approach." Computational Intelligence and Communication Networks (CICN), 2012 Fourth International Conference on. IEEE, 2012.
Thank You
 
Additive gaussian noise based data perturbation in multi level trust privacy ...
Additive gaussian noise based data perturbation in multi level trust privacy ...Additive gaussian noise based data perturbation in multi level trust privacy ...
Additive gaussian noise based data perturbation in multi level trust privacy ...
 

Recently uploaded

CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
camakaiclarkmusic
 
The Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collectionThe Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collection
Israel Genealogy Research Association
 
Life upper-Intermediate B2 Workbook for student
Life upper-Intermediate B2 Workbook for studentLife upper-Intermediate B2 Workbook for student
Life upper-Intermediate B2 Workbook for student
NgcHiNguyn25
 
PIMS Job Advertisement 2024.pdf Islamabad
PIMS Job Advertisement 2024.pdf IslamabadPIMS Job Advertisement 2024.pdf Islamabad
PIMS Job Advertisement 2024.pdf Islamabad
AyyanKhan40
 
Types of Herbal Cosmetics its standardization.
Types of Herbal Cosmetics its standardization.Types of Herbal Cosmetics its standardization.
Types of Herbal Cosmetics its standardization.
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdfবাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
eBook.com.bd (প্রয়োজনীয় বাংলা বই)
 
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
RitikBhardwaj56
 
Pride Month Slides 2024 David Douglas School District
Pride Month Slides 2024 David Douglas School DistrictPride Month Slides 2024 David Douglas School District
Pride Month Slides 2024 David Douglas School District
David Douglas School District
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
PECB
 
Smart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICTSmart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICT
simonomuemu
 
How to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRMHow to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRM
Celine George
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
WaniBasim
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
Top five deadliest dog breeds in America
Top five deadliest dog breeds in AmericaTop five deadliest dog breeds in America
Top five deadliest dog breeds in America
Bisnar Chase Personal Injury Attorneys
 
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptxC1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
mulvey2
 
A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
thanhdowork
 
A Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in EducationA Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in Education
Peter Windle
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
Colégio Santa Teresinha
 
DRUGS AND ITS classification slide share
DRUGS AND ITS classification slide shareDRUGS AND ITS classification slide share
DRUGS AND ITS classification slide share
taiba qazi
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
Jean Carlos Nunes Paixão
 

Recently uploaded (20)

CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
 
The Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collectionThe Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collection
 
Life upper-Intermediate B2 Workbook for student
Life upper-Intermediate B2 Workbook for studentLife upper-Intermediate B2 Workbook for student
Life upper-Intermediate B2 Workbook for student
 
PIMS Job Advertisement 2024.pdf Islamabad
PIMS Job Advertisement 2024.pdf IslamabadPIMS Job Advertisement 2024.pdf Islamabad
PIMS Job Advertisement 2024.pdf Islamabad
 
Types of Herbal Cosmetics its standardization.
Types of Herbal Cosmetics its standardization.Types of Herbal Cosmetics its standardization.
Types of Herbal Cosmetics its standardization.
 
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdfবাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
 
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
 
Pride Month Slides 2024 David Douglas School District
Pride Month Slides 2024 David Douglas School DistrictPride Month Slides 2024 David Douglas School District
Pride Month Slides 2024 David Douglas School District
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
 
Smart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICTSmart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICT
 
How to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRMHow to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRM
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
 
Top five deadliest dog breeds in America
Top five deadliest dog breeds in AmericaTop five deadliest dog breeds in America
Top five deadliest dog breeds in America
 
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptxC1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
 
A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
 
A Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in EducationA Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in Education
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
 
DRUGS AND ITS classification slide share
DRUGS AND ITS classification slide shareDRUGS AND ITS classification slide share
DRUGS AND ITS classification slide share
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
 

Privacy preserving in data mining with hybrid approach

  • 1. Privacy Preserving in Data Mining With Hybrid Approach. Guided by: Prof. Paresh M. Solanki. Presented by: Narenndra Dhadhal, M.Tech. (III) IT, 14014021007
  • 2. Outline: 1) Introduction to PPDM 2) Need for Privacy 3) Privacy Preserving Techniques 4) Literature Survey 5) K-Anonymization 6) Proposed Work 7) References
  • 3. Introduction: Privacy preservation is one of the most important research topics in the data security field, and it has become a serious concern in the secure transformation of personal data in recent years. [1] A number of algorithmic techniques have been designed for Privacy Preserving Data Mining (PPDM). [1]
  • 4. Introduction (cont.): PPDM is used to efficiently protect individual privacy in data sharing. [1] Various models have therefore been designed for privacy-preserving data sharing. [1] The privacy-preserving approaches to data sharing, and their merits and demerits, are analyzed. [1]
  • 5. Need for Privacy [2]: Privacy-preserving data mining has become increasingly popular because it allows privacy-sensitive data to be shared for analysis. Suppose a hospital has some person-specific patient data that it wants to publish. It wants to publish the data such that: (a) the information remains practically useful; (b) the identity of an individual cannot be determined.
  • 6. Need for Privacy [4]
       Column groups on the slide: Non-Sensitive Data, Sensitive Data
       #  Zip    Age  Nationality  Name   Condition
       1  13053  28   Indian       Kumar  Heart Disease
       2  13067  29   American     Bob    Heart Disease
       3  13053  35   Canadian     Ivan   Viral Infection
       4  13067  36   Japanese     Umeko  Cancer
       Fig 1: Sensitive and Non-Sensitive Data. [4]
  • 7. Need for Privacy [5]: Quasi-identifiers are a set of attributes that could potentially identify a record owner when combined with publicly available data. Sensitive attributes are a set of attributes that contain sensitive person-specific information such as disease or salary. Non-sensitive attributes are a set of attributes that create no problem if revealed, even to untrustworthy parties.
  • 8. Need for Privacy [4]
       First table (Non-Sensitive Data: Zip, Age, Nationality; Sensitive Data: Condition)
       #  Zip    Age  Nationality  Condition
       1  13053  28   Indian       Heart Disease
       2  13067  29   American     Heart Disease
       3  13053  35   Canadian     Viral Infection
       4  13067  36   Japanese     Cancer
       Second table
       #  Name   Zip    Age  Nationality
       1  John   13053  28   American
       2  Bob    13067  29   American
       3  Chris  13053  23   American
       Slide labels: Published Data; Data leak!
       Fig 2: Sensitive and Non-Sensitive Data Leak. [4]
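The leak illustrated in Fig 2 can be reproduced with a small join on the quasi-identifiers. The sketch below is illustrative only: the rows are the ones shown on the slide, while the field names and the `link_records` helper are assumptions introduced here and do not come from the deck.

```python
# Rows copied from Fig 2; the field names are illustrative assumptions.
published = [
    {"zip": "13053", "age": 28, "nationality": "Indian", "condition": "Heart Disease"},
    {"zip": "13067", "age": 29, "nationality": "American", "condition": "Heart Disease"},
    {"zip": "13053", "age": 35, "nationality": "Canadian", "condition": "Viral Infection"},
    {"zip": "13067", "age": 36, "nationality": "Japanese", "condition": "Cancer"},
]
external = [
    {"name": "John", "zip": "13053", "age": 28, "nationality": "American"},
    {"name": "Bob", "zip": "13067", "age": 29, "nationality": "American"},
    {"name": "Chris", "zip": "13053", "age": 23, "nationality": "American"},
]

QUASI_IDENTIFIERS = ("zip", "age", "nationality")


def link_records(published, external, keys=QUASI_IDENTIFIERS):
    """Return (name, condition) pairs whose quasi-identifier values match exactly."""
    # Index the published rows by their quasi-identifier tuple.
    index = {}
    for row in published:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    # Look up each named person from the external table in that index.
    matches = []
    for person in external:
        for row in index.get(tuple(person[k] for k in keys), []):
            matches.append((person["name"], row["condition"]))
    return matches


leaks = link_records(published, external)
for name, condition in leaks:
    print(f"{name} is linked to: {condition}")
```

With the rows above, Bob's zip code, age, and nationality match exactly one published record, so his condition is exposed even though the published table contains no names.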
  • 9. Privacy Preserving Techniques: The important techniques of Privacy Preserving Data Mining are: [3] 1) the randomization method 2) the encryption method 3) the anonymization method
  • 10. Privacy Preserving Techniques: 1. The Randomization Method [3]. The randomization method is an important and popular method among current privacy-preserving data mining techniques. It masks the values of the records by adding additional data to the original data.
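One common way to realize the randomization idea is additive noise. The sketch below is a minimal illustration under that assumption: the slide does not specify the noise distribution, so the zero-mean Gaussian, the `scale` parameter, and the `randomize` helper are all hypothetical choices made here.

```python
import random


def randomize(values, scale=5.0, seed=None):
    """Return a copy of `values` with zero-mean Gaussian noise added to each entry.

    Gaussian noise and the `scale` parameter are assumptions for illustration;
    the slide only says that additional data is added to the original values.
    """
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, scale) for v in values]


ages = [28, 29, 35, 36]  # ages from Fig 1
masked = randomize(ages, scale=5.0, seed=0)
print(masked)
```

A larger `scale` hides individual values better but distorts aggregate statistics more, which is the usual privacy versus utility trade-off for this family of methods.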
  • 11. Privacy Preserving Techniques: 2. The Encryption Method [3]. The encryption method mainly addresses settings in which several parties jointly conduct mining tasks based on the private inputs they each provide. These privacy-preserving mining tasks may take place between mutually untrusted parties, or even between competitors. Protecting privacy therefore becomes an important concern in the distributed data mining setting.
  • 12. Privacy Preserving Techniques: 3. The Anonymization Method [3]. The anonymization method aims to make an individual record indistinguishable within a group of records by using generalization and suppression techniques. K-anonymity is the representative anonymization method.
  • 13. Literature Survey [1]. Title: Privacy Preserving Data Mining Techniques-Survey. Author: Ms. Dhanalakshmi M., Mrs. Siva Sankari (2014). Summary: The paper discusses models of privacy preservation: the Trusted Third Party Model, the Semi-honest Model, the Malicious Model, and other models (Incentive Compatibility). It also surveys privacy-preserving techniques such as the randomization method, the anonymization method, and the encryption method. Issues/Challenges: Personalized privacy preservation will become an issue.
  • 14. Literature Survey [2]. Title: A Survey on Privacy Preserving Data Mining. Author: K. Saranya, K. Premalatha, S. S. Rajasekar (2015). Summary: The paper presents a brief survey of standard techniques for privacy-preserving data mining, namely classification, clustering, and association rule mining. Issues/Challenges: The merits and demerits of the different techniques are pointed out. As future work, a hybrid approach combining these techniques is proposed.
  • 15. Literature Survey [3]. Title: A Survey on Privacy Preserving Data Mining. Author: Jian Wang, Yongcheng Luo, Yan Zhao, Jiajin Le (2009). Summary: The paper reiterates several privacy-preserving data mining technologies and then analyzes their merits and shortcomings. Issues/Challenges: The limitations of the k-anonymity model stem from two assumptions. First, it may be very hard for the owner of a database to determine which attributes are or are not available in external tables. Second, the k-anonymity model assumes a certain method of attack, while in real scenarios there is no reason why the attacker should not try other methods.
  • 16. Literature Survey [4]. Title: A Survey on Anonymity-based Privacy Preserving. Author: Jian Wang, Yongcheng Luo, Shuo Jiang, Jiajin Le (2009). Summary: The authors first show that a k-anonymous dataset permits strong attacks because of a lack of diversity in the sensitive attributes. Issues/Challenges: Although k-anonymity protects against identity disclosure, it does not provide sufficient protection against attribute disclosure.
  • 17. Literature Survey [5]. Title: Analysis of Privacy Preserving K-Anonymity Methods and Techniques. Author: S. Vijayarani, A. Tamilarasi, M. Sampoorna (2010). Summary: The paper surveys recent approaches that have been applied to the k-anonymity problem. Two main techniques have been proposed for enforcing k-anonymity on a private table: generalization and suppression. Issues/Challenges: Threats to k-anonymity that can arise from performing mining on a collection of data, and approaches to combining k-anonymity with data mining.
  • 18. Literature Survey [6]. Title: Privacy Preserving in Data Mining Using Hybrid Approach. Author: Savita Lohiya, Lata Ragha (2012). Summary: The paper proposes a hybrid approach to privacy preservation: first randomize the original data, then apply generalization to the randomized (modified) data. The technique protects private data with better accuracy; it can also reconstruct the original data and provide data with no information loss, which keeps the data usable. Issues/Challenges: The k-anonymity method is vulnerable to the homogeneity attack and the background knowledge attack.
  • 19. K-Anonymization: Data anonymization is a type of information sanitization whose intent is privacy protection. [6] It is the process of either encrypting or removing personally identifiable information from data sets, so that the people whom the data describe remain anonymous. [6] For example, a hospital may release patient records so that researchers can study the characteristics of various diseases. [6]
  • 20. K-Anonymization: There are two common methods for achieving k-anonymity for some value of k. [3] Suppression: certain values of the attributes are replaced by an asterisk '*'. All or some values of a column may be replaced by '*'. [3] Generalization: individual values of attributes are replaced with a broader category. For example, the value '33' of the attribute 'Age' may be replaced by '< 40', the value '24' by '20 < Age ≤ 30', etc. [3]
  • 21. K-Anonymization (cont.)
        #  Zip    Age   Nationality  Condition
        1  130**  < 40  *            Heart Disease
        2  130**  < 40  *            Heart Disease
        3  130**  < 40  *            Viral Infection
        4  130**  < 40  *            Cancer
        Slide labels: Generalization; Suppression (cell-level)
        Fig 3: Generalization and Suppression. [2]
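The transformation from the table in Fig 2 to the one in Fig 3 can be sketched in a few lines. This is a minimal illustration, assuming a fixed zip prefix length of 3 and an age bound of 40 as in Fig 3; the helper names and field names are hypothetical and not part of the deck.

```python
def generalize_zip(zip_code, keep=3):
    """Generalization: keep the first `keep` characters and mask the rest with '*'."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def generalize_age(age, bound=40):
    """Generalization: replace an exact age with a coarse range relative to `bound`."""
    return f"< {bound}" if age < bound else f">= {bound}"


def suppress(_value):
    """Suppression: replace the value entirely with '*'."""
    return "*"


def anonymize(row):
    """Apply the Fig 3 treatment to one record; the sensitive value is left untouched."""
    return {
        "zip": generalize_zip(row["zip"]),
        "age": generalize_age(row["age"]),
        "nationality": suppress(row["nationality"]),
        "condition": row["condition"],
    }


rows = [
    {"zip": "13053", "age": 28, "nationality": "Indian", "condition": "Heart Disease"},
    {"zip": "13067", "age": 29, "nationality": "American", "condition": "Heart Disease"},
    {"zip": "13053", "age": 35, "nationality": "Canadian", "condition": "Viral Infection"},
    {"zip": "13067", "age": 36, "nationality": "Japanese", "condition": "Cancer"},
]
table = [anonymize(r) for r in rows]
for r in table:
    print(r)
```

After this step all four records share the same quasi-identifier values, so none of them can be singled out by zip code, age, or nationality.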
  • 22. K-Anonymization (cont.) [4]
        TABLE I. MICRODATA
        ID  Age  Sex  Zip Code  Disease
        1   26   M    83661     Headache
        2   24   M    83634     Headache
        3   31   M    83967     Viral Infection
        4   39   F    83949     Cough
        TABLE II. VOTER REGISTRATION LIST
        ID  Name  Age  Sex  Zip Code
        1   Jim   26   M    83661
        2   Jay   24   M    83634
        3   Tom   31   M    83967
        4   Lily  39   F    83949
  • 23. Classification of Attributes: 1) Key attributes [5]: name, address, phone number. These uniquely identify an individual and are always removed before release. 2) Quasi-identifiers [5]: a set of features whose associated values may be useful for linking with another data set to re-identify the entity that is the subject of the data. Example: 5-digit ZIP code, birth date, and gender, which together can uniquely identify an individual.
  • 24. K-Anonymization (cont.) [4]
        TABLE III. 2-ANONYMOUS TABLE
        ID  Age  Sex  Zip Code  Disease
        1   2*   M    836**     Headache
        2   2*   M    836**     Headache
        3   3*   *    839**     Viral Infection
        4   3*   *    839**     Cough
  • 25. K-Anonymization [3]: In general, k-anonymity guarantees that an individual can be associated with their real tuple with probability at most 1/k. While k-anonymity protects against identity disclosure, it does not provide sufficient protection against attribute disclosure. Two attacks have been identified: the homogeneity attack and the background knowledge attack.
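The 1/k guarantee can be checked mechanically: group the records by their quasi-identifier values and take the size of the smallest group. The sketch below applies this to Table III; the `k_of` helper and the field names are assumptions made for illustration.

```python
from collections import Counter


def k_of(table, quasi_identifiers):
    """Return the largest k for which `table` is k-anonymous.

    This is the size of the smallest equivalence class, i.e. the smallest group
    of records sharing the same quasi-identifier values.
    """
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in table)
    return min(counts.values()) if counts else 0


# Table III from the slides; field names are illustrative.
table_iii = [
    {"age": "2*", "sex": "M", "zip": "836**", "disease": "Headache"},
    {"age": "2*", "sex": "M", "zip": "836**", "disease": "Headache"},
    {"age": "3*", "sex": "*", "zip": "839**", "disease": "Viral Infection"},
    {"age": "3*", "sex": "*", "zip": "839**", "disease": "Cough"},
]

k = k_of(table_iii, ("age", "sex", "zip"))
max_reid_probability = 1 / k
print(k, max_reid_probability)
```

For Table III the smallest equivalence class has two records, so the table is 2-anonymous and a record can be tied to a specific individual with probability at most 1/2.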
  • 26. K-Anonymization [6]: Suppose Jay knows that Jim is a 26-year-old man whose zip code is 83661. Jay can conclude that Jim corresponds to the first equivalence class, and thus must have a headache. This is the homogeneity attack. Now suppose that, by knowing Lily's age and zip code, Jay can conclude that Lily corresponds to a record in the last equivalence class. Furthermore, suppose Jay knows that Lily has a very low risk of viral infection. This background knowledge enables Jay to conclude that Lily most likely has a cough. This is the background knowledge attack.
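The homogeneity attack succeeds whenever every record in an equivalence class carries the same sensitive value. A minimal sketch of a check for that condition follows, using Table III; the `homogeneous_classes` helper and the field names are hypothetical and introduced only for illustration.

```python
from collections import defaultdict


def homogeneous_classes(table, quasi_identifiers, sensitive):
    """Return {equivalence class: value} for classes with a single sensitive value."""
    groups = defaultdict(set)
    for row in table:
        groups[tuple(row[q] for q in quasi_identifiers)].add(row[sensitive])
    # A class with exactly one distinct sensitive value reveals it for every member.
    return {qi: next(iter(vals)) for qi, vals in groups.items() if len(vals) == 1}


# Table III from the slides; field names are illustrative.
table_iii = [
    {"age": "2*", "sex": "M", "zip": "836**", "disease": "Headache"},
    {"age": "2*", "sex": "M", "zip": "836**", "disease": "Headache"},
    {"age": "3*", "sex": "*", "zip": "839**", "disease": "Viral Infection"},
    {"age": "3*", "sex": "*", "zip": "839**", "disease": "Cough"},
]

exposed = homogeneous_classes(table_iii, ("age", "sex", "zip"), "disease")
print(exposed)
```

Only the first equivalence class is flagged, which matches the Jim example: knowing the class is enough to learn the disease. The second class has two distinct diseases, so learning Lily's condition requires the extra background knowledge described on the slide.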
  • 27. Proposed Work [5]: In today's world, privacy is a major concern in protecting sensitive data. People are very concerned about sensitive information that they do not want to share. The proposed method combines k-anonymity with a perturbation technique.
  • 28. References: [1] Dhanalakshmi, M., and E. Siva Sankari. "Privacy preserving data mining techniques-survey." Information Communication and Embedded Systems (ICICES), 2014 International Conference on. IEEE, 2014. [2] Saranya, K., K. Premalatha, and S. S. Rajasekar. "A Survey on Privacy Preserving Data Mining." International Journal of Innovations & Advancement in Computer Science 2015, IEEE, 2015.
  • 29. References (cont.): [3] Wang, Jian, et al. "A survey on privacy preserving data mining." Database Technology and Applications, 2009 First International Workshop on. IEEE, 2009. [4] Wang, Jian, et al. "A survey on anonymity-based privacy preserving." E-Business and Information System Security, 2009. EBISS'09. International Conference on. IEEE, 2009.
  • 30. References (cont.): [5] Vijayarani, S., A. Tamilarasi, and M. Sampoorna. "Analysis of privacy preserving k-anonymity methods and techniques." Communication and Computational Intelligence (INCOCCI), 2010 International Conference on. IEEE, 2010. [6] Lohiya, Savita, and Lata Ragha. "Privacy Preserving in Data Mining Using Hybrid Approach." Computational Intelligence and Communication Networks (CICN), 2012 Fourth International Conference on. IEEE, 2012.