SlideShare a Scribd company logo
1 of 4
Download to read offline
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 02 | Feb 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 2243
RANDOM DATA PERTURBATION TECHNIQUES IN PRIVACY PRESERVING DATA MINING
Sangavi N1, Jeevitha R2, Kathirvel P3, Dr. Premalatha K4
1,2,3PG Scholars, Bannari Amman Institute of Technology, Sathyamangalam
4Professor, Bannari Amman Institute of Technology, Sathyamangalam
----------------------------------------------------------------------***---------------------------------------------------------------------
ABSTRACT-Data mining strategies have been facing a
serious challenge in recent years due to heightened privacy
concerns and concerns, i.e. protecting the privacy of
important and sensitive data. Data perturbation is a
common Data Mining privacy technique. Dataperturbation's
biggest challenge is to balance privacy protection and data
quality, which is normally considered to be a pair of
contradictory factors. Geometric perturbation techniquefor
data is a combination of perturbationtechniqueforrotation,
translation, and noise addition. Publishing data while
protecting privacy–sensitive details–isparticularlyusefulfor
data owners. Typical examplesincludepublishingmicrodata
for research purposes or contractingthe datatothirdparties
providing services for data mining. In this paper we are
trying to explore the latest trends in the technique of
perturbation of geometric results.
Keywords: Data mining, Privacy preserving, data
perturbation, randomization, cryptography, Geometric
Data Perturbation.
INTRODUCTION
Enormous volumes of extensivepersonal data areroutinely
collected and analyzed using data mining tools. These data
include, among others, shopping habits, criminal records,
medical history, and credit records. Such data, on the one
hand, is an important asset for business organizations and
governments, both in decision-making processes and in
providing social benefits such as medical research, crime
reduction, national security, etc. Data mining techniques
are capable of deriving highly sensitive information from
unclassified data which is not even exposed to database
holders. Worse is the privacy invasion triggered by
secondary data use when people are unaware of usingdata
mining techniques "behind the scenes"[3].
The daunting problem: how can we defend against the
misuse of information that has been uncovered from
secondary data use and meet the needs of organizations
and governments to facilitate decision-making or even
promote social benefits? They claim that a solution to such
a problem involves two essential techniques: anonymityin
the first step of privacy protection to delete identifiers (e.g.
names, social insurance numbers, addresses, etc.) anddata
transformation to preserve those sensitive attributes (e.g.
income, age, etc.) since the release of data, after removal of
data. identifiers, may contain other information thatcanbe
linked with other datasets to re-identify individuals or
entities [4].
We cannot effectively safeguard dataprivacyagainstnaive
estimation. Rotation perturbation and random projection
perturbation are all threatened by prior knowledge
allowed Independent Component Analysis
Multidimensional-anonymization is only intended for
general-purpose utility preservationandmayresultinlow-
quality data mining models.Inthispaperweproposea new
multidimensional data perturbation technique: geometric
data perturbation that can be appliedforseveral categories
of popular data mining models with better utility
preservation and privacy preservation[5].
Need for Privacy in Data Mining
Presumably information is the most important and
demanded resource today. We live in an online societythat
relies on dissemination andinformationsharinginboththe
private as well as the public and government sectors.State,
federal, and private entitiesareincreasinglybeing required
to make their data electronicallyavailable[5][6].Protecting
respondent privacy (individuals, groups, associations,
businesses, etc.). Though technically anonymous, de-
identified data may include other data, such as ethnicity,
birth date, gender and ZIP code, which may be unique or
nearly unique. Identifying the characteristics of publicly
available databases associating these characteristics with
the identity of the respondent, data recipients may decide
to which respondent each pieceof releaseddata belongs, or
limit their confusion to a specific sub-set of persons.
DATA PERTURBATION
Data-perturbation-based approaches fall into two main
categories which we call the probability distribution
category and the fixed data perturbation category[8]. The
probability distribution group considerstheaggregationas
a sample from a given population with a given probability
distribution. In this case, the security check method
substitutes for the original data With another sample or by
the allotment itself from the same distribution. In the
context of fixed data perturbation the values of the
attributes in the database to be used to calculate statistics
are once and for all disrupted. The perturbationmethodsof
fixed data were developed solely for numerical or
categorical data[9].
Within the category of probability distribution two
methods can be defined. The firstiscalled"data swap-ping"
or "multidimensional transformation" This approach
replaces the original database with a randomly generated
database of exactly the same probability distributionasthe
original database[10]. When calculating a new
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 02 | Feb 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 2244
perturbation, consideration must be given to the
relationship between this entity and the rest of the
database, as long as a new entity is added or a current
entity is eliminated. A one-to - one mapping between the
original database and the disrupted database is needed.
The Precision resulting from this method may be
considered unacceptable, since in some cases the method
may have an error of up to 50 per cent. The lattermethodis
called probability distribution method. The method
consists of three steps: (1) Identify the underlying density
function of the attribute values, and estimate the
parameters of that function. (2) Generate a confidential
attribute data sample sequence from the approximate
density function. The most recent sample would need tobe
the same size as the database. (3) delete these genera In
other words, the lower value of thenewsamplewill replace
the lower value in the original data, and so on.
Data perturbation is a popular Data Mining privacy
technique. Data perturbation's biggest challenge is to
balance privacy protection and data quality, which are
normally considered as a pair of contradictory factors[11 ].
The distribution of in this method Reconstructed
independently of every data dimension. This means that
any data mining algorithm based on the distributionworks
under an implicit assumption that each dimension is
treated independently.
Approach to data perturbation is divided into two: the
approach to probability distribution and the approach to
value distortion. The approach to probability distribution
replaces the data with another sample from the same
distribution or the distribution itself, and the approach to
value distortion Disrupts data elements or attributes
directly by either additive noise, multiplicative noise, or
other procedures of randomization. There are three types
of approaches to data perturbation: Rotation perturbation,
Projection perturbation and perturbation of geometric
data.
DIFFERENT METHODS OF DATA
PERTURBATION
3.1 Noise Additive Perturbation
The standard technique of additive perturbation[13 ] is
column-based randomisation of additives. This type of
techniques is based on the facts that 1) data owners may
not want to protect all values in a record equally, so a
distortion of the column-based value can be applied to
disturb some sensitive columns. 2) The data classification
models to be used do not necessarily requiretheindividual
records, but only the distribution of the column value
assuming separate columns. The basic method is to
disguise the original values by injecting some amount of
random additive noise, while the specific information,such
as the column distribution, can still be effectively
reconstructed from the perturbed data.
We treat the original values (x1,x2,...,xn)froma columnto be
randomly drawn from a random variable X, which has
some kind of distribution. By adding random noises R to
the original data values, the randomization process
changes the original data and generates a disturbed data
column Y, Y= X+ R. It publishes the resulting record(x1+r1,
x2+r2,...,xn+rn) and the R distribution. The trick to
introducing random noise is the algorithm of distribution
reconstruction, which restores X's column distribution
based on perturbed data and R's distribution.
3.2 Condensation-based Perturbation:
The approach to condensation is a standard
multidimensional perturbation technique, aimed at
maintaining the matrix of covariance for multiplecolumns.
So some geometric properties like decision boundaryform
are well maintained. Unlike the randomization approach,
multiple columns as a whole are disturbed in order to
generate the whole "perturbed data set." As for the The
perturbed dataset preserves the covariance matrix, and
many existing data mining algorithms can be applied
directly to the perturbed dataset without requiring
algorithm modifications or new development.
It begins by partitioning the original data into groups k-
record. The groupconsistsof twosteps–randomlychoosing
a record as the center of the group from the current
records, and then identifying the (k − 1) nearest neighbors
of the center as the other (k − 1) members. Before the next
community is created the selected k records are extracted
from the original dataset. Since each group has a limited
locality, a set of k records may be regenerated to maintain
the distribution and covariance roughly. The record
replication algorithm aims to maintain each group's
ownvectors and values, as shown in the Figure 1.
Fig. 1 Eigen values of each group
3.3 Random Projection Perturbation:
Random projection perturbation (Liu, Kargupta and Ryan,
2006) refers to the technique of projecting a set of data
points to another randomly selected space from the
original multidimensional space.LetPk average bea matrix
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 02 | Feb 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 2245
of random projection, where the rows of P are
orthonormal[14 ].
G(X) = is applied to perturb the dataset X.
3.4 Geometric data perturbation:
Def: Geometric data perturbation consists of a sequence of
random geometric transformations, including
multiplicative transformation (R), translation
transformation (Ψ), and distance perturbation ∆.
G(X) = RX + Ψ + ∆ [15]
The data is assumed to be an Apxq matrix where each of
the p rows is an observation, Oi, and for each of the q
attributes, Ai, each observation containsvalues.Thematrix
may include both numerical and categorical attributes.
However, our methods of geometric data transformation
rely on numerical attributes d, such that d <=q.Thus,inthe
Euclidean space, the matrix px d, which is subject to
transformation, can be regarded as a vector subspace V, so
that each vectorvi€ V is the form vi= (a1;::; ad),1 <= i<=d,
where ai is one instance of Ai, ai€R, and R is the set of real
numbers. Before releasing the data for clustering analysis,
the vector subspace V must be transformed to preservethe
privacy of the individual data records. We need to add or
even multiply a constant noise term e to each element vi of
V in order to transform V into a distorted vector subspace
V.'
Translation Transformation: A constant is added for all
attribute values. The constant may be a negative or a
positive number. Although its degree of privacy protection
is 0 according to the formula for calculating the degree of
privacy protection, it makes us unable to see the raw data
directly from transformed data, so translation transform
can also play the role of privacy protection as well.
Translation is the task of moving a point with coordinates
(X;Y) through displacements(X0;Y0) to a new location.
Using a matrix representation v'=Tv, where T is a 2x 3
transformation matrix depicted in Figure 1(a), v is the
vector column containing theoriginal co-ordinates,andv' is
a column vector whose co-ordinates are thetransformed
co-ordinates, is easily achieved. This form of matrix is also
extended to Scaling and Rotation.
Rotation Transformation: Consider them as two-
dimensional space points for a pair of arbitrarily selected
attributes and rotate them according to a given angle with
the origin as the middle. If it is positive, we must rotateitin
anti-clockwise direction. Otherwise we'll rotatetheminthe
clockwise direction.
A more challenging transformation is rotation. This
transformation, in its simplest form, is for the rotation of a
point around the coordinate axes. Rotation of a point by
angle in a discrete 2D space is achieved using the
transformation matrix shown in Figure 1(b). The rotation
angle is measured clockwise and this transformation ects
the values of X and Y coordinates.
Fig. 2 (a) Translation Matrix (b) Rotation Matrix
The two elements above, translation and rotation maintain
the relationship between the distances. A bunch of
essential classification models will be "perturbation-
invariant" by retaining distances, which is the center of
geometric perturbation. In some situations, distance
conserving perturbation may be subject to distance-
inference attacks. The objective of distance disturbance is
to preserve the Distances are approximate, whileresilience
to distance-inference attacks is effectively increased. We
define the third component as a random matrix, where
each entry is a separate sample with zero mean and small
variance from the same distribution. By adding this
component it slightly disturbs the distance between a pair
of points.
CONCLUSIONS
It focuses mainly on random geometric perturbation
approach to privacy preservingdata classification.Random
geometric perturbation, G(X) = RX + Ψ + ∆, includes the
linear combination of the three components: rotation
perturbation, translation perturbation, and distance
perturbation. Geometric perturbation can preserve the
important geometric properties, thus most data mining
models that search for geometric class boundariesare well
preserved with the perturbed data.
Geometric perturbation perturbs multiple columns in one
transformation, which introduces new challenges in
evaluating the privacy guarantee for multi-dimensional
perturbation.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 02 | Feb 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 2246
REFERENCES
1. Chhinkaniwala H. and Garg S., “Privacy Preserving
Data Mining Techniques: Challenges and Issues”,
CSIT, 2011.
2. L.Golab and M.T.Ozsu, Data Stream Management
issues-”A Survey Technical Report”, 2003.
3. Majid, M.Asger, Rashid Ali, “PrivacypreservingData
Mining Techniques:Current Scenario and Future
Prospects”, IEEE 2012.
4. Aggrawal, C.C, and Yu.PS. ,” A condensation
approach to privacy preserving data mining”. Proc.
Of Int.conf. on extending Database
Technology(EDBT)(2004).
5. Chen K, and Liu, “Privacy Preserving Data
Classification with Rotation Perturbation”,
proc.ICDM, 2005, pp.589-592.
6. K.Liu, H Kargupta, and J.Ryan,”Randomprojection–
based multiplicative data perturbation for privacy
preserving distributed data mining.” IEEE
Transaction on knowledge and Data Engg,Jan
2006,pp 92-106.
7. Keke Chen, Gordon Sun, and Ling Liu. Towards
attack-resilient geometric data perturbation.” In
proceedings of the 2007 SIAM international
conference on Data mining, April 2007.
8. M. Reza,Somayyeh Seifi,” Classification and
Evaluation the PPDM Techniues by using a data
Modification -based framework”, IJCSE,Vol3.No2
Feb 2011.
9. Vassilios S.Verylios,E.Bertino,Igor N,”State –of-the
art in Privacy preserving Data Mining”,published in
SIGMOD 2004 pp.121-154.
10. Ching-Ming, Po-Zung & Chu-Hao,” Privacy
Preserving Clustering of Data streams”, Tamkang
Journal of Sc. & Engg, Vol.13 no. 3 pp.349-358
11. Jie Liu, Yifeng XU, “Privacy Preserving Clusteringby
Random Response Method of
Geometric Transformation”, IEEE 2010
12. Keke Chen, Ling lui, Privacy Preserving Multiparty
Collabrative Mining with Geometric Data
Perturbation , IEEE, January 2009

More Related Content

What's hot

A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETSA HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETSEditor IJCATR
 
Efficient classification of big data using vfdt (very fast decision tree)
Efficient classification of big data using vfdt (very fast decision tree)Efficient classification of big data using vfdt (very fast decision tree)
Efficient classification of big data using vfdt (very fast decision tree)eSAT Journals
 
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CARE
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CARECLUSTERING DICHOTOMOUS DATA FOR HEALTH CARE
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CAREijistjournal
 
An Analysis of Outlier Detection through clustering method
An Analysis of Outlier Detection through clustering methodAn Analysis of Outlier Detection through clustering method
An Analysis of Outlier Detection through clustering methodIJAEMSJORNAL
 
Privacy Preserving Clustering on Distorted data
Privacy Preserving Clustering on Distorted dataPrivacy Preserving Clustering on Distorted data
Privacy Preserving Clustering on Distorted dataIOSR Journals
 
An Approach to Mixed Dataset Clustering and Validation with ART-2 Artificial ...
An Approach to Mixed Dataset Clustering and Validation with ART-2 Artificial ...An Approach to Mixed Dataset Clustering and Validation with ART-2 Artificial ...
An Approach to Mixed Dataset Clustering and Validation with ART-2 Artificial ...Happiest Minds Technologies
 
Volume 2-issue-6-2143-2147
Volume 2-issue-6-2143-2147Volume 2-issue-6-2143-2147
Volume 2-issue-6-2143-2147Editor IJARCET
 
Feature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering TechniquesFeature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering TechniquesIRJET Journal
 
Digital image hiding algorithm for secret communication
Digital image hiding algorithm for secret communicationDigital image hiding algorithm for secret communication
Digital image hiding algorithm for secret communicationeSAT Journals
 
Protection models
Protection modelsProtection models
Protection modelsG Prachi
 
Kato Mivule - Utilizing Noise Addition for Data Privacy, an Overview
Kato Mivule - Utilizing Noise Addition for Data Privacy, an OverviewKato Mivule - Utilizing Noise Addition for Data Privacy, an Overview
Kato Mivule - Utilizing Noise Addition for Data Privacy, an OverviewKato Mivule
 
IRJET- A Detailed Study on Classification Techniques for Data Mining
IRJET- A Detailed Study on Classification Techniques for Data MiningIRJET- A Detailed Study on Classification Techniques for Data Mining
IRJET- A Detailed Study on Classification Techniques for Data MiningIRJET Journal
 
COMBINED MINING APPROACH TO GENERATE PATTERNS FOR COMPLEX DATA
COMBINED MINING APPROACH TO GENERATE PATTERNS FOR COMPLEX DATACOMBINED MINING APPROACH TO GENERATE PATTERNS FOR COMPLEX DATA
COMBINED MINING APPROACH TO GENERATE PATTERNS FOR COMPLEX DATAcscpconf
 
Combined mining approach to generate patterns for complex data
Combined mining approach to generate patterns for complex dataCombined mining approach to generate patterns for complex data
Combined mining approach to generate patterns for complex datacsandit
 
Towards A Differential Privacy and Utility Preserving Machine Learning Classi...
Towards A Differential Privacy and Utility Preserving Machine Learning Classi...Towards A Differential Privacy and Utility Preserving Machine Learning Classi...
Towards A Differential Privacy and Utility Preserving Machine Learning Classi...Kato Mivule
 

What's hot (19)

Saif_CCECE2007_full_paper_submitted
Saif_CCECE2007_full_paper_submittedSaif_CCECE2007_full_paper_submitted
Saif_CCECE2007_full_paper_submitted
 
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETSA HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
 
G44093135
G44093135G44093135
G44093135
 
Efficient classification of big data using vfdt (very fast decision tree)
Efficient classification of big data using vfdt (very fast decision tree)Efficient classification of big data using vfdt (very fast decision tree)
Efficient classification of big data using vfdt (very fast decision tree)
 
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CARE
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CARECLUSTERING DICHOTOMOUS DATA FOR HEALTH CARE
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CARE
 
An Analysis of Outlier Detection through clustering method
An Analysis of Outlier Detection through clustering methodAn Analysis of Outlier Detection through clustering method
An Analysis of Outlier Detection through clustering method
 
Privacy Preserving Clustering on Distorted data
Privacy Preserving Clustering on Distorted dataPrivacy Preserving Clustering on Distorted data
Privacy Preserving Clustering on Distorted data
 
An Approach to Mixed Dataset Clustering and Validation with ART-2 Artificial ...
An Approach to Mixed Dataset Clustering and Validation with ART-2 Artificial ...An Approach to Mixed Dataset Clustering and Validation with ART-2 Artificial ...
An Approach to Mixed Dataset Clustering and Validation with ART-2 Artificial ...
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
130509
130509130509
130509
 
Volume 2-issue-6-2143-2147
Volume 2-issue-6-2143-2147Volume 2-issue-6-2143-2147
Volume 2-issue-6-2143-2147
 
Feature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering TechniquesFeature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering Techniques
 
Digital image hiding algorithm for secret communication
Digital image hiding algorithm for secret communicationDigital image hiding algorithm for secret communication
Digital image hiding algorithm for secret communication
 
Protection models
Protection modelsProtection models
Protection models
 
Kato Mivule - Utilizing Noise Addition for Data Privacy, an Overview
Kato Mivule - Utilizing Noise Addition for Data Privacy, an OverviewKato Mivule - Utilizing Noise Addition for Data Privacy, an Overview
Kato Mivule - Utilizing Noise Addition for Data Privacy, an Overview
 
IRJET- A Detailed Study on Classification Techniques for Data Mining
IRJET- A Detailed Study on Classification Techniques for Data MiningIRJET- A Detailed Study on Classification Techniques for Data Mining
IRJET- A Detailed Study on Classification Techniques for Data Mining
 
COMBINED MINING APPROACH TO GENERATE PATTERNS FOR COMPLEX DATA
COMBINED MINING APPROACH TO GENERATE PATTERNS FOR COMPLEX DATACOMBINED MINING APPROACH TO GENERATE PATTERNS FOR COMPLEX DATA
COMBINED MINING APPROACH TO GENERATE PATTERNS FOR COMPLEX DATA
 
Combined mining approach to generate patterns for complex data
Combined mining approach to generate patterns for complex dataCombined mining approach to generate patterns for complex data
Combined mining approach to generate patterns for complex data
 
Towards A Differential Privacy and Utility Preserving Machine Learning Classi...
Towards A Differential Privacy and Utility Preserving Machine Learning Classi...Towards A Differential Privacy and Utility Preserving Machine Learning Classi...
Towards A Differential Privacy and Utility Preserving Machine Learning Classi...
 

Similar to IRJET - Random Data Perturbation Techniques in Privacy Preserving Data Mining

A Survey on Features and Techniques Description for Privacy of Sensitive Info...
A Survey on Features and Techniques Description for Privacy of Sensitive Info...A Survey on Features and Techniques Description for Privacy of Sensitive Info...
A Survey on Features and Techniques Description for Privacy of Sensitive Info...IRJET Journal
 
Survey paper on Big Data Imputation and Privacy Algorithms
Survey paper on Big Data Imputation and Privacy AlgorithmsSurvey paper on Big Data Imputation and Privacy Algorithms
Survey paper on Big Data Imputation and Privacy AlgorithmsIRJET Journal
 
Distance based transformation for privacy preserving data mining using hybrid...
Distance based transformation for privacy preserving data mining using hybrid...Distance based transformation for privacy preserving data mining using hybrid...
Distance based transformation for privacy preserving data mining using hybrid...csandit
 
Enhanced Privacy Preserving Accesscontrol in Incremental Datausing Microaggre...
Enhanced Privacy Preserving Accesscontrol in Incremental Datausing Microaggre...Enhanced Privacy Preserving Accesscontrol in Incremental Datausing Microaggre...
Enhanced Privacy Preserving Accesscontrol in Incremental Datausing Microaggre...rahulmonikasharma
 
Additive gaussian noise based data perturbation in multi level trust privacy ...
Additive gaussian noise based data perturbation in multi level trust privacy ...Additive gaussian noise based data perturbation in multi level trust privacy ...
Additive gaussian noise based data perturbation in multi level trust privacy ...IJDKP
 
Using Randomized Response Techniques for Privacy-Preserving Data Mining
Using Randomized Response Techniques for Privacy-Preserving Data MiningUsing Randomized Response Techniques for Privacy-Preserving Data Mining
Using Randomized Response Techniques for Privacy-Preserving Data Mining14894
 
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEYCLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEYEditor IJMTER
 
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREMPRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREMIJNSA Journal
 
journal for research
journal for researchjournal for research
journal for researchrikaseorika
 
Framework to Avoid Similarity Attack in Big Streaming Data
Framework to Avoid Similarity Attack in Big Streaming Data Framework to Avoid Similarity Attack in Big Streaming Data
Framework to Avoid Similarity Attack in Big Streaming Data IJECEIAES
 
Cluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for DatabasesCluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for DatabasesEditor IJMTER
 
An efficient algorithm for privacy
An efficient algorithm for privacyAn efficient algorithm for privacy
An efficient algorithm for privacyIJDKP
 
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATION
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATIONPRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATION
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATIONcscpconf
 
Different Classification Technique for Data mining in Insurance Industry usin...
Different Classification Technique for Data mining in Insurance Industry usin...Different Classification Technique for Data mining in Insurance Industry usin...
Different Classification Technique for Data mining in Insurance Industry usin...IOSRjournaljce
 
Privacy preserving clustering on centralized data through scaling transf
Privacy preserving clustering on centralized data through scaling transfPrivacy preserving clustering on centralized data through scaling transf
Privacy preserving clustering on centralized data through scaling transfIAEME Publication
 
Data mining techniques
Data mining techniquesData mining techniques
Data mining techniqueseSAT Journals
 
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data MiningPerformance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Miningidescitation
 
A Survey on Privacy-Preserving Data Aggregation Without Secure Channel
A Survey on Privacy-Preserving Data Aggregation Without Secure ChannelA Survey on Privacy-Preserving Data Aggregation Without Secure Channel
A Survey on Privacy-Preserving Data Aggregation Without Secure ChannelIRJET Journal
 
MDAV2K: A VARIABLE-SIZE MICROAGGREGATION TECHNIQUE FOR PRIVACY PRESERVATION
MDAV2K: A VARIABLE-SIZE MICROAGGREGATION TECHNIQUE FOR PRIVACY PRESERVATIONMDAV2K: A VARIABLE-SIZE MICROAGGREGATION TECHNIQUE FOR PRIVACY PRESERVATION
MDAV2K: A VARIABLE-SIZE MICROAGGREGATION TECHNIQUE FOR PRIVACY PRESERVATIONcscpconf
 

Similar to IRJET - Random Data Perturbation Techniques in Privacy Preserving Data Mining (20)

A Survey on Features and Techniques Description for Privacy of Sensitive Info...
A Survey on Features and Techniques Description for Privacy of Sensitive Info...A Survey on Features and Techniques Description for Privacy of Sensitive Info...
A Survey on Features and Techniques Description for Privacy of Sensitive Info...
 
Survey paper on Big Data Imputation and Privacy Algorithms
Survey paper on Big Data Imputation and Privacy AlgorithmsSurvey paper on Big Data Imputation and Privacy Algorithms
Survey paper on Big Data Imputation and Privacy Algorithms
 
Distance based transformation for privacy preserving data mining using hybrid...
Distance based transformation for privacy preserving data mining using hybrid...Distance based transformation for privacy preserving data mining using hybrid...
Distance based transformation for privacy preserving data mining using hybrid...
 
130509
130509130509
130509
 
Enhanced Privacy Preserving Accesscontrol in Incremental Datausing Microaggre...
Enhanced Privacy Preserving Accesscontrol in Incremental Datausing Microaggre...Enhanced Privacy Preserving Accesscontrol in Incremental Datausing Microaggre...
Enhanced Privacy Preserving Accesscontrol in Incremental Datausing Microaggre...
 
Additive gaussian noise based data perturbation in multi level trust privacy ...
Additive gaussian noise based data perturbation in multi level trust privacy ...Additive gaussian noise based data perturbation in multi level trust privacy ...
Additive gaussian noise based data perturbation in multi level trust privacy ...
 
Using Randomized Response Techniques for Privacy-Preserving Data Mining
Using Randomized Response Techniques for Privacy-Preserving Data MiningUsing Randomized Response Techniques for Privacy-Preserving Data Mining
Using Randomized Response Techniques for Privacy-Preserving Data Mining
 
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEYCLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
 
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREMPRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
 
journal for research
journal for researchjournal for research
journal for research
 
Framework to Avoid Similarity Attack in Big Streaming Data
Framework to Avoid Similarity Attack in Big Streaming Data Framework to Avoid Similarity Attack in Big Streaming Data
Framework to Avoid Similarity Attack in Big Streaming Data
 
Cluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for DatabasesCluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for Databases
 
An efficient algorithm for privacy
An efficient algorithm for privacyAn efficient algorithm for privacy
An efficient algorithm for privacy
 
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATION
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATIONPRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATION
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATION
 
Different Classification Technique for Data mining in Insurance Industry usin...
Different Classification Technique for Data mining in Insurance Industry usin...Different Classification Technique for Data mining in Insurance Industry usin...
Different Classification Technique for Data mining in Insurance Industry usin...
 
Privacy preserving clustering on centralized data through scaling transf
Privacy preserving clustering on centralized data through scaling transfPrivacy preserving clustering on centralized data through scaling transf
Privacy preserving clustering on centralized data through scaling transf
 
Data mining techniques
Data mining techniquesData mining techniques
Data mining techniques
 
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data MiningPerformance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
 
A Survey on Privacy-Preserving Data Aggregation Without Secure Channel
A Survey on Privacy-Preserving Data Aggregation Without Secure ChannelA Survey on Privacy-Preserving Data Aggregation Without Secure Channel
A Survey on Privacy-Preserving Data Aggregation Without Secure Channel
 
MDAV2K: A VARIABLE-SIZE MICROAGGREGATION TECHNIQUE FOR PRIVACY PRESERVATION
MDAV2K: A VARIABLE-SIZE MICROAGGREGATION TECHNIQUE FOR PRIVACY PRESERVATIONMDAV2K: A VARIABLE-SIZE MICROAGGREGATION TECHNIQUE FOR PRIVACY PRESERVATION
MDAV2K: A VARIABLE-SIZE MICROAGGREGATION TECHNIQUE FOR PRIVACY PRESERVATION
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
DATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage exampleDATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage examplePragyanshuParadkar1
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptSAURABHKUMAR892774
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvLewisJB
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
EduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIEduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIkoyaldeepu123
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 

Recently uploaded (20)

Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
DATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage exampleDATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage example
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.ppt
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvv
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
EduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIEduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AI
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 

IRJET - Random Data Perturbation Techniques in Privacy Preserving Data Mining

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 02 | Feb 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 2243 RANDOM DATA PERTURBATION TECHNIQUES IN PRIVACY PRESERVING DATA MINING Sangavi N1, Jeevitha R2, Kathirvel P3, Dr. Premalatha K4 1,2,3PG Scholars, Bannari Amman Institute of Technology, Sathyamangalam 4Professor, Bannari Amman Institute of Technology, Sathyamangalam ----------------------------------------------------------------------***--------------------------------------------------------------------- ABSTRACT-Data mining strategies have been facing a serious challenge in recent years due to heightened privacy concerns and concerns, i.e. protecting the privacy of important and sensitive data. Data perturbation is a common Data Mining privacy technique. Dataperturbation's biggest challenge is to balance privacy protection and data quality, which is normally considered to be a pair of contradictory factors. Geometric perturbation techniquefor data is a combination of perturbationtechniqueforrotation, translation, and noise addition. Publishing data while protecting privacy–sensitive details–isparticularlyusefulfor data owners. Typical examplesincludepublishingmicrodata for research purposes or contractingthe datatothirdparties providing services for data mining. In this paper we are trying to explore the latest trends in the technique of perturbation of geometric results. Keywords: Data mining, Privacy preserving, data perturbation, randomization, cryptography, Geometric Data Perturbation. INTRODUCTION Enormous volumes of extensivepersonal data areroutinely collected and analyzed using data mining tools. These data include, among others, shopping habits, criminal records, medical history, and credit records. Such data, on the one hand, is an important asset for business organizations and governments, both in decision-making processes and in providing social benefits such as medical research, crime reduction, national security, etc. Data mining techniques are capable of deriving highly sensitive information from unclassified data which is not even exposed to database holders. Worse is the privacy invasion triggered by secondary data use when people are unaware of usingdata mining techniques "behind the scenes"[3]. The daunting problem: how can we defend against the misuse of information that has been uncovered from secondary data use and meet the needs of organizations and governments to facilitate decision-making or even promote social benefits? They claim that a solution to such a problem involves two essential techniques: anonymityin the first step of privacy protection to delete identifiers (e.g. names, social insurance numbers, addresses, etc.) anddata transformation to preserve those sensitive attributes (e.g. income, age, etc.) since the release of data, after removal of data. identifiers, may contain other information thatcanbe linked with other datasets to re-identify individuals or entities [4]. We cannot effectively safeguard dataprivacyagainstnaive estimation. Rotation perturbation and random projection perturbation are all threatened by prior knowledge allowed Independent Component Analysis Multidimensional-anonymization is only intended for general-purpose utility preservationandmayresultinlow- quality data mining models.Inthispaperweproposea new multidimensional data perturbation technique: geometric data perturbation that can be appliedforseveral categories of popular data mining models with better utility preservation and privacy preservation[5]. Need for Privacy in Data Mining Presumably information is the most important and demanded resource today. We live in an online societythat relies on dissemination andinformationsharinginboththe private as well as the public and government sectors.State, federal, and private entitiesareincreasinglybeing required to make their data electronicallyavailable[5][6].Protecting respondent privacy (individuals, groups, associations, businesses, etc.). Though technically anonymous, de- identified data may include other data, such as ethnicity, birth date, gender and ZIP code, which may be unique or nearly unique. Identifying the characteristics of publicly available databases associating these characteristics with the identity of the respondent, data recipients may decide to which respondent each pieceof releaseddata belongs, or limit their confusion to a specific sub-set of persons. DATA PERTURBATION Data-perturbation-based approaches fall into two main categories which we call the probability distribution category and the fixed data perturbation category[8]. The probability distribution group considerstheaggregationas a sample from a given population with a given probability distribution. In this case, the security check method substitutes for the original data With another sample or by the allotment itself from the same distribution. In the context of fixed data perturbation the values of the attributes in the database to be used to calculate statistics are once and for all disrupted. The perturbationmethodsof fixed data were developed solely for numerical or categorical data[9]. Within the category of probability distribution two methods can be defined. The firstiscalled"data swap-ping" or "multidimensional transformation" This approach replaces the original database with a randomly generated database of exactly the same probability distributionasthe original database[10]. When calculating a new
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 02 | Feb 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 2244 perturbation, consideration must be given to the relationship between this entity and the rest of the database, as long as a new entity is added or a current entity is eliminated. A one-to - one mapping between the original database and the disrupted database is needed. The Precision resulting from this method may be considered unacceptable, since in some cases the method may have an error of up to 50 per cent. The lattermethodis called probability distribution method. The method consists of three steps: (1) Identify the underlying density function of the attribute values, and estimate the parameters of that function. (2) Generate a confidential attribute data sample sequence from the approximate density function. The most recent sample would need tobe the same size as the database. (3) delete these genera In other words, the lower value of thenewsamplewill replace the lower value in the original data, and so on. Data perturbation is a popular Data Mining privacy technique. Data perturbation's biggest challenge is to balance privacy protection and data quality, which are normally considered as a pair of contradictory factors[11 ]. The distribution of in this method Reconstructed independently of every data dimension. This means that any data mining algorithm based on the distributionworks under an implicit assumption that each dimension is treated independently. Approach to data perturbation is divided into two: the approach to probability distribution and the approach to value distortion. The approach to probability distribution replaces the data with another sample from the same distribution or the distribution itself, and the approach to value distortion Disrupts data elements or attributes directly by either additive noise, multiplicative noise, or other procedures of randomization. There are three types of approaches to data perturbation: Rotation perturbation, Projection perturbation and perturbation of geometric data. DIFFERENT METHODS OF DATA PERTURBATION 3.1 Noise Additive Perturbation The standard technique of additive perturbation[13 ] is column-based randomisation of additives. This type of techniques is based on the facts that 1) data owners may not want to protect all values in a record equally, so a distortion of the column-based value can be applied to disturb some sensitive columns. 2) The data classification models to be used do not necessarily requiretheindividual records, but only the distribution of the column value assuming separate columns. The basic method is to disguise the original values by injecting some amount of random additive noise, while the specific information,such as the column distribution, can still be effectively reconstructed from the perturbed data. We treat the original values (x1,x2,...,xn)froma columnto be randomly drawn from a random variable X, which has some kind of distribution. By adding random noises R to the original data values, the randomization process changes the original data and generates a disturbed data column Y, Y= X+ R. It publishes the resulting record(x1+r1, x2+r2,...,xn+rn) and the R distribution. The trick to introducing random noise is the algorithm of distribution reconstruction, which restores X's column distribution based on perturbed data and R's distribution. 3.2 Condensation-based Perturbation: The approach to condensation is a standard multidimensional perturbation technique, aimed at maintaining the matrix of covariance for multiplecolumns. So some geometric properties like decision boundaryform are well maintained. Unlike the randomization approach, multiple columns as a whole are disturbed in order to generate the whole "perturbed data set." As for the The perturbed dataset preserves the covariance matrix, and many existing data mining algorithms can be applied directly to the perturbed dataset without requiring algorithm modifications or new development. It begins by partitioning the original data into groups k- record. The groupconsistsof twosteps–randomlychoosing a record as the center of the group from the current records, and then identifying the (k − 1) nearest neighbors of the center as the other (k − 1) members. Before the next community is created the selected k records are extracted from the original dataset. Since each group has a limited locality, a set of k records may be regenerated to maintain the distribution and covariance roughly. The record replication algorithm aims to maintain each group's ownvectors and values, as shown in the Figure 1. Fig. 1 Eigen values of each group 3.3 Random Projection Perturbation: Random projection perturbation (Liu, Kargupta and Ryan, 2006) refers to the technique of projecting a set of data points to another randomly selected space from the original multidimensional space.LetPk average bea matrix
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 02 | Feb 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 2245 of random projection, where the rows of P are orthonormal[14 ]. G(X) = is applied to perturb the dataset X. 3.4 Geometric data perturbation: Def: Geometric data perturbation consists of a sequence of random geometric transformations, including multiplicative transformation (R), translation transformation (Ψ), and distance perturbation ∆. G(X) = RX + Ψ + ∆ [15] The data is assumed to be an Apxq matrix where each of the p rows is an observation, Oi, and for each of the q attributes, Ai, each observation containsvalues.Thematrix may include both numerical and categorical attributes. However, our methods of geometric data transformation rely on numerical attributes d, such that d <=q.Thus,inthe Euclidean space, the matrix px d, which is subject to transformation, can be regarded as a vector subspace V, so that each vectorvi€ V is the form vi= (a1;::; ad),1 <= i<=d, where ai is one instance of Ai, ai€R, and R is the set of real numbers. Before releasing the data for clustering analysis, the vector subspace V must be transformed to preservethe privacy of the individual data records. We need to add or even multiply a constant noise term e to each element vi of V in order to transform V into a distorted vector subspace V.' Translation Transformation: A constant is added for all attribute values. The constant may be a negative or a positive number. Although its degree of privacy protection is 0 according to the formula for calculating the degree of privacy protection, it makes us unable to see the raw data directly from transformed data, so translation transform can also play the role of privacy protection as well. Translation is the task of moving a point with coordinates (X;Y) through displacements(X0;Y0) to a new location. Using a matrix representation v'=Tv, where T is a 2x 3 transformation matrix depicted in Figure 1(a), v is the vector column containing theoriginal co-ordinates,andv' is a column vector whose co-ordinates are thetransformed co-ordinates, is easily achieved. This form of matrix is also extended to Scaling and Rotation. Rotation Transformation: Consider them as two- dimensional space points for a pair of arbitrarily selected attributes and rotate them according to a given angle with the origin as the middle. If it is positive, we must rotateitin anti-clockwise direction. Otherwise we'll rotatetheminthe clockwise direction. A more challenging transformation is rotation. This transformation, in its simplest form, is for the rotation of a point around the coordinate axes. Rotation of a point by angle in a discrete 2D space is achieved using the transformation matrix shown in Figure 1(b). The rotation angle is measured clockwise and this transformation ects the values of X and Y coordinates. Fig. 2 (a) Translation Matrix (b) Rotation Matrix The two elements above, translation and rotation maintain the relationship between the distances. A bunch of essential classification models will be "perturbation- invariant" by retaining distances, which is the center of geometric perturbation. In some situations, distance conserving perturbation may be subject to distance- inference attacks. The objective of distance disturbance is to preserve the Distances are approximate, whileresilience to distance-inference attacks is effectively increased. We define the third component as a random matrix, where each entry is a separate sample with zero mean and small variance from the same distribution. By adding this component it slightly disturbs the distance between a pair of points. CONCLUSIONS It focuses mainly on random geometric perturbation approach to privacy preservingdata classification.Random geometric perturbation, G(X) = RX + Ψ + ∆, includes the linear combination of the three components: rotation perturbation, translation perturbation, and distance perturbation. Geometric perturbation can preserve the important geometric properties, thus most data mining models that search for geometric class boundariesare well preserved with the perturbed data. Geometric perturbation perturbs multiple columns in one transformation, which introduces new challenges in evaluating the privacy guarantee for multi-dimensional perturbation.
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 02 | Feb 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 2246 REFERENCES 1. Chhinkaniwala H. and Garg S., “Privacy Preserving Data Mining Techniques: Challenges and Issues”, CSIT, 2011. 2. L.Golab and M.T.Ozsu, Data Stream Management issues-”A Survey Technical Report”, 2003. 3. Majid, M.Asger, Rashid Ali, “PrivacypreservingData Mining Techniques:Current Scenario and Future Prospects”, IEEE 2012. 4. Aggrawal, C.C, and Yu.PS. ,” A condensation approach to privacy preserving data mining”. Proc. Of Int.conf. on extending Database Technology(EDBT)(2004). 5. Chen K, and Liu, “Privacy Preserving Data Classification with Rotation Perturbation”, proc.ICDM, 2005, pp.589-592. 6. K.Liu, H Kargupta, and J.Ryan,”Randomprojection– based multiplicative data perturbation for privacy preserving distributed data mining.” IEEE Transaction on knowledge and Data Engg,Jan 2006,pp 92-106. 7. Keke Chen, Gordon Sun, and Ling Liu. Towards attack-resilient geometric data perturbation.” In proceedings of the 2007 SIAM international conference on Data mining, April 2007. 8. M. Reza,Somayyeh Seifi,” Classification and Evaluation the PPDM Techniues by using a data Modification -based framework”, IJCSE,Vol3.No2 Feb 2011. 9. Vassilios S.Verylios,E.Bertino,Igor N,”State –of-the art in Privacy preserving Data Mining”,published in SIGMOD 2004 pp.121-154. 10. Ching-Ming, Po-Zung & Chu-Hao,” Privacy Preserving Clustering of Data streams”, Tamkang Journal of Sc. & Engg, Vol.13 no. 3 pp.349-358 11. Jie Liu, Yifeng XU, “Privacy Preserving Clusteringby Random Response Method of Geometric Transformation”, IEEE 2010 12. Keke Chen, Ling lui, Privacy Preserving Multiparty Collabrative Mining with Geometric Data Perturbation , IEEE, January 2009