SlideShare a Scribd company logo
Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010
DOI : 10.5121/acij.2010.1101 1
Data Transformation Technique for Protecting
Private Information in Privacy Preserving Data
Mining
S.Vijayarani#1
, Dr.A.Tamilarasi*2
#1
Assistant Professor, School of Computer Science and Engineering, Bharathiar
University, Coimbatore
*2
Prof&Head, Department of MCA, Kongu Engg. College, Erode
1
vijimohan_2000@yahoo.com, 2
drtamil@kongu.ac.in
Abstract - Data mining is the process of extracting patterns from data. Data mining is seen as an
increasingly important tool by modern business to transform data into an informational advantage. Data
Mining can be utilized in any organization that needs to find patterns or relationships in their data. A group
of techniques that find relationships that have not previously been discovered. In many situations, the
extracted patterns are highly private and it should not be disclosed. In order to maintain the secrecy of data,
there is in need of several techniques and algorithms for modifying the original data in order to limit the
extraction of confidential patterns. There have been two types of privacy in data mining. The first type of
privacy is that the data is altered so that the mining result will preserve certain privacy. The second type of
privacy is that the data is manipulated so that the mining result is not affected or minimally affected. The
aim of privacy preserving data mining researchers is to develop data mining techniques that could be
applied on data bases without violating the privacy of individuals. Many techniques for privacy preserving
data mining have come up over the last decade. Some of them are statistical, cryptographic, randomization
methods, k-anonymity model, l-diversity and etc. In this work, we propose a new perturbative masking
technique known as data transformation technique can be used for protecting the sensitive information. An
experimental result shows that the proposed technique gives the better result compared with the existing
technique.
Keywords- Privacy, Sensitive data, Data transformation, Micro-aggregation, K-means clustering.
I. INTRODUCTION
With the speedy development in database, networking, and computing technologies, a
large amount of individual data can be integrated and investigated digitally, leading to an
increased use of data-mining tools to infer trends and patterns. This has lifted worldwide concerns
about protecting the privacy of individuals. Privacy preserving data mining (PPDM) refers to the
area of data mining that seeks to safeguard sensitive information from unsolicited or unsanctioned
disclosure. The main objective in privacy preserving data mining is to develop algorithms for
modifying the original data in some way, so that the private data and private knowledge remain
private even after the mining process.[11]
The need for shielding numerical data from leak has gained significant importance in
recent years. Government organizations which release data have always been interested in this
problem. However, with the increase in the ability of organizations to gather, store, analyze,
disseminate, and share data, there has also been a growing demand for commercial organizations
to secure sensitive data from disclosure. Recent legislation worldwide has made this an important
issue for all organizations that gather and store any sensitive information.
Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010
2
The advancement of information technologies has enabled various organizations (e.g.,
census agencies, hospitals) to collect large volumes of sensitive personal data (e.g., census data,
medical records). Due to the great research value of such data, it is often released for public
benefit purposes, which, however, poses a risk to individual privacy. A typical solution to this
problem is to anonymize the data before releasing it to the public. In particular, the
anonymization should be conducted in a careful manner, such that the published data not only
prevents an adversary from inferring sensitive information, but also remains useful for data
analysis.
Most methods for privacy computations use some form of transformation on the data in
order to perform the privacy preservation. A host of techniques are available for protecting
numerical data from disclosure. These include rounding or coarsening, perturbation, micro-
aggregation, data swapping, and more recently, data shuffling. Muralidhar and Sarathy [9]
provide a comprehensive discussion of the different techniques for protecting numerical data.
With the exception of swapping and shuffling, most other data masking techniques involve the
modification of the original values of the confidential variables. Many users find such
modification of values to be objectionable and hence are less likely to use the modified data. By
contrast, by transforming the original values leaves the original data unmodified. Hence, this
type of data transformation techniques are more likely to be accepted by users who find “data
modification” objectionable.
The rest of this paper is organized as follows. In Section 2, we present an overview of
micro data and masking techniques. Section 3 we discuss about the data transformation
perturbative masking technique. Micro-aggregation technique is discussed in section 4. Section 5
gives the experimental results of data transformation and micro-aggreagation. Conclusions are
given in Section 6.
II. STATISTICAL DISCLOSURE CONTROL
Inference control in statistical databases, also known as Statistical Disclosure Control
(SDC) or Statistical Disclosure Limitation (SDL), seeks to protect statistical data in such a way
that they can be publicly released and mined without giving away private information that can be
linked to specific individuals or entities
A. Micro Data
A microdata set is a set of records containing data of individuals being studied, who can
be persons, companies, etc. The individual records of a microdata set are stored in a microdata
file. Each individual j is assigned a data vector Vj, also called data record or data set. A data
vector is formed by several variables or attributes. The attributes in an initial micro data table are
usually classified as follows.[12]
1) Identifiers - Attributes that exclusively recognize a micro data respondent. For instance,
attribute employee number uniquely identifies the employee with which is associated.
2) Quasi-identifiers - Attributes that, in combination, can be linked with external
information to re-identify, all or some of the respondents to whom information refers or
reduce the uncertainty over their identities. For instance, attributes DOB, ZIP, and SEX
are quasi-identifiers: they can be linked to external public information to reveal the name
Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010
3
and address of the corresponding respondents or to reduce the uncertainty to a specific set
of respondents.
Confidential attributes - Attributes of the micro-data table contains confidential
information. For instance, attribute salary can be considered as confidential.
B. Classification of Micro data Disclosure Protection
The micro data protection techniques can be classified into two main categories: masking
techniques, and synthetic data generation techniques
Fig.1. Classification of MPTs
1) Masking techniques - The original data are transformed to produce new data that are valid for
statistical analysis and such that they preserve the confidentiality of respondents. Masking
techniques can be classified as:
1.1 Non-perturbative - the original data are not modified, but some data are suppressed
and/or some details are removed; Non-perturbative techniques produce protected
micro data by eliminating details from the original micro data. Some of the non-
perturbative techniques are Sampling, Local suppression, Global recoding, Top-
coding, Bottom-coding and Generalization.
1.2 Perturbative - the original data are modified. With perturbative techniques, the micro
data table is modified for publication. Modifications can make unique combinations
of values in the original table disappear as well as introduce new combinations.
Resampling, Lossy compression, Rounding, PRAM, MASSC, Random noise,
Swapping, Rank swapping and Micro-aggregation are some of the perturbative
masking techniques.
2) Synthetic data generation techniques - The original set of tuples in a micro data table is
replaced with a new set of tuples generated in such a way to preserve the key statistical
properties of the original data. The generation process is usually based on a statistical model
and the key statistical properties that are not included in the model will not be necessarily
respected by the synthetic data. Since the released micro data table contains synthetic data,
the re-identification risks are reduced. Note that the released micro data table can be entirely
synthetic (i.e., fully synthetic) or mixed with the original data (i.e., partially synthetic).
Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010
4
III. PROPOSED SYSTEM
A. Objective of the Problem
The sensitive attribute can be selected from the micro data and it can be modified by a
data transformation perturbative masking technique. After modification, the modified data can be
released to data mining researchers or any agency or firm. If they can apply data mining
techniques such as clustering, classification, etc for data analysis, the modified table does not
affect the result. In this work, we have applied k-means clustering algorithm to the modified data
and verified the result. The steps involved in this work are,
1. Sensitive Attribute Selection
2. Data Transformation perturbative masking technique for modifying the sensitive attribute
3. Applying k-means algorithm for original and modified data
4. Compare the results
B. Sensitive Attribute Selection
From the micro data table select the sensitive numeric attributes. For example, an
employee database the attributes are employee number, employee name, date of birth, salary,
account no, qualification, designation and etc... The attributes salary and account number are
considered as sensitive attributes.
C. Data Transformation perturbative masking technique
Fig.2 Proposed System Architecture
Algorithm for Data Transformation
1. Consider a database D consists of T tuples. D={t1,t2,…tn}. Each tuple in T consists of set
of attributes T={A1,A2,…Ap} where Ai Є T and Ti Є D
2. Identify the sensitive or confidential numeric attribute AR
Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010
5
3. Consider the sensitive attribute AR, find out the maximum number of digits, maxD Є AR
and minimum number of digits, minD Є AR
4. For grouping the same number of digits, find out the number of groups
5. Verify if (maxD=mind) then assign r as 1
Arrange the entire sensitive data item Є r group only
Go to step 7
6. Compute r=(maxD-minD)+1
Arrange all the sensitive data item Є Xk (k=1,2,…r) groups, according to the number
of digits. Check if (minD ≤ r ≤ maxD)
7. Find the number of items Kl(l=1,2,…m) in ∀ Xk group
∀ Xk group (k=1,2,…r)
If number of items kl Є Xk =1 then no modification
8. Otherwise, if number of items kl Є Xk =2
Then transform the value of kl to T, km to kl ; and T to km
else
9. Assign the value of k1 to T
10. Repeat (for l=1 to m)
begin
Assign the value of kl+1 to k
end
until (l>m)
11. Assign the value of T to km
D. K-means Clustering Algorihm
The k-means algorithm for partitioning, where each cluster’s center is represented by the mean
value of the objects in the cluster
Input
• K : the number of clusters
• D : a data set containing n objects
Output: Set of k clusters
Method
(1) arbitrarily choose k objects from D as the initial cluster centers;
(2) Repeat
(3) (re)assign each object to the cluster to which the object is the most similar, based on the
mean value of the objects in the cluster
(4) Update the cluster means, i.e. calculate the mean value of the objects for each cluster
(5) Until no change Apply the k-means clustering for both original and modified data set to
get the clusters
Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010
6
E. Comparing the results
The data items found in the clusters are verified in both original data and modified data.
IV. MICRO-AGGREGATION
Microaggregation is a statistical disclosure control technique for microdata. Raw
microdata (i. e. individual records) are grouped into small aggregates prior to publication. Each
aggregate should contain at least k records to prevent disclosure of individual information.
Microaggregation is a family of statistical disclosure control techniques for microdata which
belong to the data modification category. The rationale behind microaggregation is that
confidentiality rules in use allow publication of microdata sets if the data vectors correspond to
groups of k or more individuals, where no individual dominates (i. e. contributes too much to) the
group and k is a threshold value. Strict application of such confidentiality rules leads to replacing
individual values with values computed on small aggregates (microaggregates) prior to
publication. This is the basic principle of microaggregation.[3]
To obtain microaggregates in a microdata set with n data vectors, these are combined to
form g groups of size at least k. For each variable, the average value over each group is computed
and is used to replace each of the original averaged values. Groups are formed using a criterion of
maximal similarity. Once the procedure has been completed, the resulting (modified) data vectors
can be published. [1]
V. EXPERIMENTAL RESULTS
In order to conduct the experiments, synthetic employee dataset can be created with 500
records. From this dataset, we select the sensitive numeric attribute. i.e. income.
Data transformation perturbative masking technique is used to modify this sensitive
attribute.
The following performance factors are considered for evaluating the technique
A. Statistical performance of the original data and modified data
In order to calculate the statistical properties such as mean, variance and standard
deviation for original data and modified data. The chart shows that, in data transformation
technique has produced the same statistical results after the modification also. Microaggregation
technique returns only the mean value is same as the original. But other statistical property such
as variance and standard deviation does not produce the same results. We have applied different
size of data sets for verification.
Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010
7
Fig.1. Statistical performances
The data-transformation technique will produce accurate statistical results compared to micro-
aggregation
B. Privacy protection
To verify the privacy protection, we test whether all the original data items are modified
using the data transformation approach or not. All the data items are modified then we get 100%
privacy protection. The following chart depicts this. In the given data set both methods would
produce 100% of privacy protection.
Fig.2. Privacy Protection
C. Accuracy of data mining algorithm
The chart shows that the percentage of clustering accuracy is obtained from the data
transformation and microaggregation. From the results, we come know that the data items found
in the original clusters are same as the data transformation approach. Comparing the data-
transformation and micro-aggregation the clustering accuracy is higher in data-transformation.
Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010
8
Fig.3. Accuracy of clustering algorithm
VI. CONCLUSION AND FUTURE WORK
Protecting the sensitive data and also extracting knowledge is a very complicated
problem. Based on the above experimental results we come know that the proposed data
transformation technique is a good technique for protecting and modifying the sensitive data.
After modification, the data could be used for data mining also. And also it is very easy to get the
original data. After modification we need the original data only two steps are required to get the
original data. In the proposed work, the data transformation technique is used for numerical
attributes. In future, we would develop new masking techniques for protecting the categorical
attributes.
ACKNOWLEDGEMENT
I would like to thank “The UGC, New Delhi” for providing me the necessary funds.
REFERENCES
[1] Brand R (2002). “Micro data protection through noise addition”. In Domingo-Ferrer J, editor, Inference
Control in Statistical Databases, vol. 2316 of LNCS,pp. 97{116. Springer, Berlin Heidelberg.
[2] Charu C.Aggarwal IBM T.J. Watson Research Center, USA and Philip S. “Privacy preserving data
mining: Models and algorithms” Yu University of Illinois at Chicago, USA.
[3] Domingo-Ferrer, J & Torra, V (2002), “Aggregation Techniques for Statistical confidentiality”. In:
Aggregation operators: new trends and applications, pp. 260-271. Physica-Verlag GmbH, Heidelberg
(2002).
[4] Domingo-Ferrer, J & Mateo-Sanz, J. M. (2002), “Practical data-oriented microaggregation for
statistical disclosure control”, IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 1, pp.
189-201, 2002.
[5] Domingo-Ferrer, J & Torra, V (2005), Ordinal, continuous and heterogenerous k-anonymity through
microaggregation, Data Mining and Knowledge Discovery, vol. 11, no. 2, pp. 195-212, 2005.
Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010
9
[6] Defays, D & Nanopoulous, P (1993), “Panels of enterprises and confidentiality: the small aggregates
method”, in Proc. of 92 Symposium on Design and Analysis of Longitudinal Surveys. Ottawa: Statistics
Canada, 1993, pp. 195-204.
[7] J.m. mateo-sanz, j. Domingo-ferrer, “A comparative study of microaggregation methods”, question,
vol. 22, 3, p. 511-526, 1998.
[8] Krishnamurty Muralidhar, Rahul Parsa, Rathindra Sarathy, “A general Additive Data Perturbation
Method for database Security”, management science, Vol. 45, No. 10, October 1999, pp. 1399-1415
DOI: 10.1287/mnsc.45.10.1399
[9] Rathindra Sarathy, Krishnamurty Muralidhar, “The Security of Confidential Numerical Data in
Databases”, information systems research,Vol. 13, No. 4, December 2002, pp. 389-403 DOI:
10.1287/isre.13.4.389.74.
[10] Samarati, P (2001), “Protecting respondents' identities in microdata release”, IEEE Transactions on
Knowledge and Data Engineering, 13(6):1010-1027. 2001.
[11] Vassilios S. Veryhios, Elisa Bertino, Igor Nai Fovino Loredana Parasiliti Provenza, Yucel Saygin,
Yannis eodoridis, “State-of-the-art in Privacy Preserving Data Mining” , SIGMOD Record, Vol. 33, No.
1, March 2004.
[12] V.Ciriani, S.De Capitani di Vimercati, S.Foresti, and P.Samarati Universitua degli Studi di Milano,
“Micro data protection” 26013 Crema, ltalia., Springer US, Advances in Information Security (2007)

More Related Content

What's hot

Implementation of Data Privacy and Security in an Online Student Health Recor...
Implementation of Data Privacy and Security in an Online Student Health Recor...Implementation of Data Privacy and Security in an Online Student Health Recor...
Implementation of Data Privacy and Security in an Online Student Health Recor...Kato Mivule
 
Applying Data Privacy Techniques on Published Data in Uganda
 Applying Data Privacy Techniques on Published Data in Uganda Applying Data Privacy Techniques on Published Data in Uganda
Applying Data Privacy Techniques on Published Data in UgandaKato Mivule
 
Towards A Differential Privacy and Utility Preserving Machine Learning Classi...
Towards A Differential Privacy and Utility Preserving Machine Learning Classi...Towards A Differential Privacy and Utility Preserving Machine Learning Classi...
Towards A Differential Privacy and Utility Preserving Machine Learning Classi...Kato Mivule
 
A Codon Frequency Obfuscation Heuristic for Raw Genomic Data Privacy
A Codon Frequency Obfuscation Heuristic for Raw Genomic Data PrivacyA Codon Frequency Obfuscation Heuristic for Raw Genomic Data Privacy
A Codon Frequency Obfuscation Heuristic for Raw Genomic Data PrivacyKato Mivule
 
Kato Mivule - Utilizing Noise Addition for Data Privacy, an Overview
Kato Mivule - Utilizing Noise Addition for Data Privacy, an OverviewKato Mivule - Utilizing Noise Addition for Data Privacy, an Overview
Kato Mivule - Utilizing Noise Addition for Data Privacy, an OverviewKato Mivule
 
Overview of Data Mining
Overview of Data MiningOverview of Data Mining
Overview of Data Miningijtsrd
 
PERFORMING DATA MINING IN (SRMS) THROUGH VERTICAL APPROACH WITH ASSOCIATION R...
PERFORMING DATA MINING IN (SRMS) THROUGH VERTICAL APPROACH WITH ASSOCIATION R...PERFORMING DATA MINING IN (SRMS) THROUGH VERTICAL APPROACH WITH ASSOCIATION R...
PERFORMING DATA MINING IN (SRMS) THROUGH VERTICAL APPROACH WITH ASSOCIATION R...Editor IJMTER
 
Privacy preserving in data mining with hybrid approach
Privacy preserving in data mining with hybrid approachPrivacy preserving in data mining with hybrid approach
Privacy preserving in data mining with hybrid approachNarendra Dhadhal
 
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and ApproachesA Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches14894
 
A Comparative Study on Privacy Preserving Datamining Techniques
A Comparative Study on Privacy Preserving Datamining  TechniquesA Comparative Study on Privacy Preserving Datamining  Techniques
A Comparative Study on Privacy Preserving Datamining TechniquesIJMER
 
Privacy Preserving Data Mining
Privacy Preserving Data MiningPrivacy Preserving Data Mining
Privacy Preserving Data MiningVrushali Malvadkar
 
Distributed Data mining using Multi Agent data
Distributed Data mining using Multi Agent dataDistributed Data mining using Multi Agent data
Distributed Data mining using Multi Agent dataIRJET Journal
 
Privacy preservation techniques in data mining
Privacy preservation techniques in data miningPrivacy preservation techniques in data mining
Privacy preservation techniques in data miningeSAT Publishing House
 
IRJET- Fault Detection and Prediction of Failure using Vibration Analysis
IRJET-	 Fault Detection and Prediction of Failure using Vibration AnalysisIRJET-	 Fault Detection and Prediction of Failure using Vibration Analysis
IRJET- Fault Detection and Prediction of Failure using Vibration AnalysisIRJET Journal
 
New approaches of Data Mining for the Internet of things with systems: Litera...
New approaches of Data Mining for the Internet of things with systems: Litera...New approaches of Data Mining for the Internet of things with systems: Litera...
New approaches of Data Mining for the Internet of things with systems: Litera...IRJET Journal
 
Association rule visualization technique
Association rule visualization techniqueAssociation rule visualization technique
Association rule visualization techniquemustafasmart
 
A review on data mining
A  review on data miningA  review on data mining
A review on data miningEr. Nancy
 
Privacy preserving dm_ppt
Privacy preserving dm_pptPrivacy preserving dm_ppt
Privacy preserving dm_pptSagar Verma
 

What's hot (20)

Implementation of Data Privacy and Security in an Online Student Health Recor...
Implementation of Data Privacy and Security in an Online Student Health Recor...Implementation of Data Privacy and Security in an Online Student Health Recor...
Implementation of Data Privacy and Security in an Online Student Health Recor...
 
Applying Data Privacy Techniques on Published Data in Uganda
 Applying Data Privacy Techniques on Published Data in Uganda Applying Data Privacy Techniques on Published Data in Uganda
Applying Data Privacy Techniques on Published Data in Uganda
 
Towards A Differential Privacy and Utility Preserving Machine Learning Classi...
Towards A Differential Privacy and Utility Preserving Machine Learning Classi...Towards A Differential Privacy and Utility Preserving Machine Learning Classi...
Towards A Differential Privacy and Utility Preserving Machine Learning Classi...
 
www.ijerd.com
www.ijerd.comwww.ijerd.com
www.ijerd.com
 
A Codon Frequency Obfuscation Heuristic for Raw Genomic Data Privacy
A Codon Frequency Obfuscation Heuristic for Raw Genomic Data PrivacyA Codon Frequency Obfuscation Heuristic for Raw Genomic Data Privacy
A Codon Frequency Obfuscation Heuristic for Raw Genomic Data Privacy
 
Kato Mivule - Utilizing Noise Addition for Data Privacy, an Overview
Kato Mivule - Utilizing Noise Addition for Data Privacy, an OverviewKato Mivule - Utilizing Noise Addition for Data Privacy, an Overview
Kato Mivule - Utilizing Noise Addition for Data Privacy, an Overview
 
Overview of Data Mining
Overview of Data MiningOverview of Data Mining
Overview of Data Mining
 
PERFORMING DATA MINING IN (SRMS) THROUGH VERTICAL APPROACH WITH ASSOCIATION R...
PERFORMING DATA MINING IN (SRMS) THROUGH VERTICAL APPROACH WITH ASSOCIATION R...PERFORMING DATA MINING IN (SRMS) THROUGH VERTICAL APPROACH WITH ASSOCIATION R...
PERFORMING DATA MINING IN (SRMS) THROUGH VERTICAL APPROACH WITH ASSOCIATION R...
 
G045033841
G045033841G045033841
G045033841
 
Privacy preserving in data mining with hybrid approach
Privacy preserving in data mining with hybrid approachPrivacy preserving in data mining with hybrid approach
Privacy preserving in data mining with hybrid approach
 
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and ApproachesA Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches
 
A Comparative Study on Privacy Preserving Datamining Techniques
A Comparative Study on Privacy Preserving Datamining  TechniquesA Comparative Study on Privacy Preserving Datamining  Techniques
A Comparative Study on Privacy Preserving Datamining Techniques
 
Privacy Preserving Data Mining
Privacy Preserving Data MiningPrivacy Preserving Data Mining
Privacy Preserving Data Mining
 
Distributed Data mining using Multi Agent data
Distributed Data mining using Multi Agent dataDistributed Data mining using Multi Agent data
Distributed Data mining using Multi Agent data
 
Privacy preservation techniques in data mining
Privacy preservation techniques in data miningPrivacy preservation techniques in data mining
Privacy preservation techniques in data mining
 
IRJET- Fault Detection and Prediction of Failure using Vibration Analysis
IRJET-	 Fault Detection and Prediction of Failure using Vibration AnalysisIRJET-	 Fault Detection and Prediction of Failure using Vibration Analysis
IRJET- Fault Detection and Prediction of Failure using Vibration Analysis
 
New approaches of Data Mining for the Internet of things with systems: Litera...
New approaches of Data Mining for the Internet of things with systems: Litera...New approaches of Data Mining for the Internet of things with systems: Litera...
New approaches of Data Mining for the Internet of things with systems: Litera...
 
Association rule visualization technique
Association rule visualization techniqueAssociation rule visualization technique
Association rule visualization technique
 
A review on data mining
A  review on data miningA  review on data mining
A review on data mining
 
Privacy preserving dm_ppt
Privacy preserving dm_pptPrivacy preserving dm_ppt
Privacy preserving dm_ppt
 

Similar to Data Transformation Technique for Protecting Private Information in Privacy Preserving Data Mining

A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...
A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...
A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...IJSRD
 
Enabling Use of Dynamic Anonymization for Enhanced Security in Cloud
Enabling Use of Dynamic Anonymization for Enhanced Security in CloudEnabling Use of Dynamic Anonymization for Enhanced Security in Cloud
Enabling Use of Dynamic Anonymization for Enhanced Security in CloudIOSR Journals
 
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...Editor IJMTER
 
A survey on privacy preserving data publishing
A survey on privacy preserving data publishingA survey on privacy preserving data publishing
A survey on privacy preserving data publishingijcisjournal
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Miningijujournal
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Miningijujournal
 
A review on privacy preservation in data mining
A review on privacy preservation in data miningA review on privacy preservation in data mining
A review on privacy preservation in data miningijujournal
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Miningijujournal
 
Multilevel Privacy Preserving by Linear and Non Linear Data Distortion
Multilevel Privacy Preserving by Linear and Non Linear Data DistortionMultilevel Privacy Preserving by Linear and Non Linear Data Distortion
Multilevel Privacy Preserving by Linear and Non Linear Data DistortionIOSR Journals
 
A Frame Work for Ontological Privacy Preserved Mining
A Frame Work for Ontological Privacy Preserved MiningA Frame Work for Ontological Privacy Preserved Mining
A Frame Work for Ontological Privacy Preserved MiningIJNSA Journal
 
78201919
7820191978201919
78201919IJRAT
 
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacy
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata PrivacyTwo-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacy
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacydbpublications
 
Fundamentals of data mining and its applications
Fundamentals of data mining and its applicationsFundamentals of data mining and its applications
Fundamentals of data mining and its applicationsSubrat Swain
 
Bj32809815
Bj32809815Bj32809815
Bj32809815IJMER
 
AN EFFICIENT SOLUTION FOR PRIVACYPRESERVING, SECURE REMOTE ACCESS TO SENSITIV...
AN EFFICIENT SOLUTION FOR PRIVACYPRESERVING, SECURE REMOTE ACCESS TO SENSITIV...AN EFFICIENT SOLUTION FOR PRIVACYPRESERVING, SECURE REMOTE ACCESS TO SENSITIV...
AN EFFICIENT SOLUTION FOR PRIVACYPRESERVING, SECURE REMOTE ACCESS TO SENSITIV...cscpconf
 

Similar to Data Transformation Technique for Protecting Private Information in Privacy Preserving Data Mining (20)

A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...
A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...
A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...
 
Enabling Use of Dynamic Anonymization for Enhanced Security in Cloud
Enabling Use of Dynamic Anonymization for Enhanced Security in CloudEnabling Use of Dynamic Anonymization for Enhanced Security in Cloud
Enabling Use of Dynamic Anonymization for Enhanced Security in Cloud
 
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...
 
A survey on privacy preserving data publishing
A survey on privacy preserving data publishingA survey on privacy preserving data publishing
A survey on privacy preserving data publishing
 
Ib3514141422
Ib3514141422Ib3514141422
Ib3514141422
 
Data attribute security and privacy in Collaborative distributed database Pub...
Data attribute security and privacy in Collaborative distributed database Pub...Data attribute security and privacy in Collaborative distributed database Pub...
Data attribute security and privacy in Collaborative distributed database Pub...
 
130509
130509130509
130509
 
130509
130509130509
130509
 
J017536064
J017536064J017536064
J017536064
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Mining
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Mining
 
A review on privacy preservation in data mining
A review on privacy preservation in data miningA review on privacy preservation in data mining
A review on privacy preservation in data mining
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Mining
 
Multilevel Privacy Preserving by Linear and Non Linear Data Distortion
Multilevel Privacy Preserving by Linear and Non Linear Data DistortionMultilevel Privacy Preserving by Linear and Non Linear Data Distortion
Multilevel Privacy Preserving by Linear and Non Linear Data Distortion
 
A Frame Work for Ontological Privacy Preserved Mining
A Frame Work for Ontological Privacy Preserved MiningA Frame Work for Ontological Privacy Preserved Mining
A Frame Work for Ontological Privacy Preserved Mining
 
78201919
7820191978201919
78201919
 
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacy
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata PrivacyTwo-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacy
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacy
 
Fundamentals of data mining and its applications
Fundamentals of data mining and its applicationsFundamentals of data mining and its applications
Fundamentals of data mining and its applications
 
Bj32809815
Bj32809815Bj32809815
Bj32809815
 
AN EFFICIENT SOLUTION FOR PRIVACYPRESERVING, SECURE REMOTE ACCESS TO SENSITIV...
AN EFFICIENT SOLUTION FOR PRIVACYPRESERVING, SECURE REMOTE ACCESS TO SENSITIV...AN EFFICIENT SOLUTION FOR PRIVACYPRESERVING, SECURE REMOTE ACCESS TO SENSITIV...
AN EFFICIENT SOLUTION FOR PRIVACYPRESERVING, SECURE REMOTE ACCESS TO SENSITIV...
 

More from acijjournal

Jan_2024_Top_read_articles_in_ACIJ.pdf
Jan_2024_Top_read_articles_in_ACIJ.pdfJan_2024_Top_read_articles_in_ACIJ.pdf
Jan_2024_Top_read_articles_in_ACIJ.pdfacijjournal
 
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdf
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdfCall for Papers - Advanced Computing An International Journal (ACIJ) (2).pdf
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdfacijjournal
 
cs - ACIJ (4) (1).pdf
cs - ACIJ (4) (1).pdfcs - ACIJ (4) (1).pdf
cs - ACIJ (4) (1).pdfacijjournal
 
cs - ACIJ (2).pdf
cs - ACIJ (2).pdfcs - ACIJ (2).pdf
cs - ACIJ (2).pdfacijjournal
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)acijjournal
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)acijjournal
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)acijjournal
 
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)acijjournal
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)acijjournal
 
3rdInternational Conference on Natural Language Processingand Applications (N...
3rdInternational Conference on Natural Language Processingand Applications (N...3rdInternational Conference on Natural Language Processingand Applications (N...
3rdInternational Conference on Natural Language Processingand Applications (N...acijjournal
 
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)acijjournal
 
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable Development
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable DevelopmentGraduate School Cyber Portfolio: The Innovative Menu For Sustainable Development
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable Developmentacijjournal
 
Genetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary MethodologyGenetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary Methodologyacijjournal
 
Advanced Computing: An International Journal (ACIJ)
Advanced Computing: An International Journal (ACIJ) Advanced Computing: An International Journal (ACIJ)
Advanced Computing: An International Journal (ACIJ) acijjournal
 
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principles
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & PrinciplesE-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principles
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principlesacijjournal
 
10th International Conference on Software Engineering and Applications (SEAPP...
10th International Conference on Software Engineering and Applications (SEAPP...10th International Conference on Software Engineering and Applications (SEAPP...
10th International Conference on Software Engineering and Applications (SEAPP...acijjournal
 
10th International conference on Parallel, Distributed Computing and Applicat...
10th International conference on Parallel, Distributed Computing and Applicat...10th International conference on Parallel, Distributed Computing and Applicat...
10th International conference on Parallel, Distributed Computing and Applicat...acijjournal
 
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...acijjournal
 
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...acijjournal
 
10th International Conference on Cloud Computing: Services and Architecture (...
10th International Conference on Cloud Computing: Services and Architecture (...10th International Conference on Cloud Computing: Services and Architecture (...
10th International Conference on Cloud Computing: Services and Architecture (...acijjournal
 

More from acijjournal (20)

Jan_2024_Top_read_articles_in_ACIJ.pdf
Jan_2024_Top_read_articles_in_ACIJ.pdfJan_2024_Top_read_articles_in_ACIJ.pdf
Jan_2024_Top_read_articles_in_ACIJ.pdf
 
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdf
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdfCall for Papers - Advanced Computing An International Journal (ACIJ) (2).pdf
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdf
 
cs - ACIJ (4) (1).pdf
cs - ACIJ (4) (1).pdfcs - ACIJ (4) (1).pdf
cs - ACIJ (4) (1).pdf
 
cs - ACIJ (2).pdf
cs - ACIJ (2).pdfcs - ACIJ (2).pdf
cs - ACIJ (2).pdf
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
 
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
 
3rdInternational Conference on Natural Language Processingand Applications (N...
3rdInternational Conference on Natural Language Processingand Applications (N...3rdInternational Conference on Natural Language Processingand Applications (N...
3rdInternational Conference on Natural Language Processingand Applications (N...
 
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
 
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable Development
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable DevelopmentGraduate School Cyber Portfolio: The Innovative Menu For Sustainable Development
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable Development
 
Genetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary MethodologyGenetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary Methodology
 
Advanced Computing: An International Journal (ACIJ)
Advanced Computing: An International Journal (ACIJ) Advanced Computing: An International Journal (ACIJ)
Advanced Computing: An International Journal (ACIJ)
 
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principles
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & PrinciplesE-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principles
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principles
 
10th International Conference on Software Engineering and Applications (SEAPP...
10th International Conference on Software Engineering and Applications (SEAPP...10th International Conference on Software Engineering and Applications (SEAPP...
10th International Conference on Software Engineering and Applications (SEAPP...
 
10th International conference on Parallel, Distributed Computing and Applicat...
10th International conference on Parallel, Distributed Computing and Applicat...10th International conference on Parallel, Distributed Computing and Applicat...
10th International conference on Parallel, Distributed Computing and Applicat...
 
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...
 
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...
 
10th International Conference on Cloud Computing: Services and Architecture (...
10th International Conference on Cloud Computing: Services and Architecture (...10th International Conference on Cloud Computing: Services and Architecture (...
10th International Conference on Cloud Computing: Services and Architecture (...
 

Recently uploaded

Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwoodseandesed
 
Scaling in conventional MOSFET for constant electric field and constant voltage
Scaling in conventional MOSFET for constant electric field and constant voltageScaling in conventional MOSFET for constant electric field and constant voltage
Scaling in conventional MOSFET for constant electric field and constant voltageRCC Institute of Information Technology
 
Explosives Industry manufacturing process.pdf
Explosives Industry manufacturing process.pdfExplosives Industry manufacturing process.pdf
Explosives Industry manufacturing process.pdf884710SadaqatAli
 
Online blood donation management system project.pdf
Online blood donation management system project.pdfOnline blood donation management system project.pdf
Online blood donation management system project.pdfKamal Acharya
 
HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationRobbie Edward Sayers
 
KIT-601 Lecture Notes-UNIT-4.pdf Frequent Itemsets and Clustering
KIT-601 Lecture Notes-UNIT-4.pdf Frequent Itemsets and ClusteringKIT-601 Lecture Notes-UNIT-4.pdf Frequent Itemsets and Clustering
KIT-601 Lecture Notes-UNIT-4.pdf Frequent Itemsets and ClusteringDr. Radhey Shyam
 
fluid mechanics gate notes . gate all pyqs answer
fluid mechanics gate notes . gate all pyqs answerfluid mechanics gate notes . gate all pyqs answer
fluid mechanics gate notes . gate all pyqs answerapareshmondalnita
 
Furniture showroom management system project.pdf
Furniture showroom management system project.pdfFurniture showroom management system project.pdf
Furniture showroom management system project.pdfKamal Acharya
 
KIT-601 Lecture Notes-UNIT-3.pdf Mining Data Stream
KIT-601 Lecture Notes-UNIT-3.pdf Mining Data StreamKIT-601 Lecture Notes-UNIT-3.pdf Mining Data Stream
KIT-601 Lecture Notes-UNIT-3.pdf Mining Data StreamDr. Radhey Shyam
 
NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...
NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...
NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...Amil baba
 
Toll tax management system project report..pdf
Toll tax management system project report..pdfToll tax management system project report..pdf
Toll tax management system project report..pdfKamal Acharya
 
Event Management System Vb Net Project Report.pdf
Event Management System Vb Net  Project Report.pdfEvent Management System Vb Net  Project Report.pdf
Event Management System Vb Net Project Report.pdfKamal Acharya
 
2024 DevOps Pro Europe - Growing at the edge
2024 DevOps Pro Europe - Growing at the edge2024 DevOps Pro Europe - Growing at the edge
2024 DevOps Pro Europe - Growing at the edgePaco Orozco
 
IT-601 Lecture Notes-UNIT-2.pdf Data Analysis
IT-601 Lecture Notes-UNIT-2.pdf Data AnalysisIT-601 Lecture Notes-UNIT-2.pdf Data Analysis
IT-601 Lecture Notes-UNIT-2.pdf Data AnalysisDr. Radhey Shyam
 
Pharmacy management system project report..pdf
Pharmacy management system project report..pdfPharmacy management system project report..pdf
Pharmacy management system project report..pdfKamal Acharya
 
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdfA CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdfKamal Acharya
 
Hall booking system project report .pdf
Hall booking system project report  .pdfHall booking system project report  .pdf
Hall booking system project report .pdfKamal Acharya
 
WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234AafreenAbuthahir2
 
RESORT MANAGEMENT AND RESERVATION SYSTEM PROJECT REPORT.pdf
RESORT MANAGEMENT AND RESERVATION SYSTEM PROJECT REPORT.pdfRESORT MANAGEMENT AND RESERVATION SYSTEM PROJECT REPORT.pdf
RESORT MANAGEMENT AND RESERVATION SYSTEM PROJECT REPORT.pdfKamal Acharya
 
BRAKING SYSTEM IN INDIAN RAILWAY AutoCAD DRAWING
BRAKING SYSTEM IN INDIAN RAILWAY AutoCAD DRAWINGBRAKING SYSTEM IN INDIAN RAILWAY AutoCAD DRAWING
BRAKING SYSTEM IN INDIAN RAILWAY AutoCAD DRAWINGKOUSTAV SARKAR
 

Recently uploaded (20)

Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
 
Scaling in conventional MOSFET for constant electric field and constant voltage
Scaling in conventional MOSFET for constant electric field and constant voltageScaling in conventional MOSFET for constant electric field and constant voltage
Scaling in conventional MOSFET for constant electric field and constant voltage
 
Explosives Industry manufacturing process.pdf
Explosives Industry manufacturing process.pdfExplosives Industry manufacturing process.pdf
Explosives Industry manufacturing process.pdf
 
Online blood donation management system project.pdf
Online blood donation management system project.pdfOnline blood donation management system project.pdf
Online blood donation management system project.pdf
 
HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generation
 
KIT-601 Lecture Notes-UNIT-4.pdf Frequent Itemsets and Clustering
KIT-601 Lecture Notes-UNIT-4.pdf Frequent Itemsets and ClusteringKIT-601 Lecture Notes-UNIT-4.pdf Frequent Itemsets and Clustering
KIT-601 Lecture Notes-UNIT-4.pdf Frequent Itemsets and Clustering
 
fluid mechanics gate notes . gate all pyqs answer
fluid mechanics gate notes . gate all pyqs answerfluid mechanics gate notes . gate all pyqs answer
fluid mechanics gate notes . gate all pyqs answer
 
Furniture showroom management system project.pdf
Furniture showroom management system project.pdfFurniture showroom management system project.pdf
Furniture showroom management system project.pdf
 
KIT-601 Lecture Notes-UNIT-3.pdf Mining Data Stream
KIT-601 Lecture Notes-UNIT-3.pdf Mining Data StreamKIT-601 Lecture Notes-UNIT-3.pdf Mining Data Stream
KIT-601 Lecture Notes-UNIT-3.pdf Mining Data Stream
 
NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...
NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...
NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...
 
Toll tax management system project report..pdf
Toll tax management system project report..pdfToll tax management system project report..pdf
Toll tax management system project report..pdf
 
Event Management System Vb Net Project Report.pdf
Event Management System Vb Net  Project Report.pdfEvent Management System Vb Net  Project Report.pdf
Event Management System Vb Net Project Report.pdf
 
2024 DevOps Pro Europe - Growing at the edge
2024 DevOps Pro Europe - Growing at the edge2024 DevOps Pro Europe - Growing at the edge
2024 DevOps Pro Europe - Growing at the edge
 
IT-601 Lecture Notes-UNIT-2.pdf Data Analysis
IT-601 Lecture Notes-UNIT-2.pdf Data AnalysisIT-601 Lecture Notes-UNIT-2.pdf Data Analysis
IT-601 Lecture Notes-UNIT-2.pdf Data Analysis
 
Pharmacy management system project report..pdf
Pharmacy management system project report..pdfPharmacy management system project report..pdf
Pharmacy management system project report..pdf
 
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdfA CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
 
Hall booking system project report .pdf
Hall booking system project report  .pdfHall booking system project report  .pdf
Hall booking system project report .pdf
 
WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234
 
RESORT MANAGEMENT AND RESERVATION SYSTEM PROJECT REPORT.pdf
RESORT MANAGEMENT AND RESERVATION SYSTEM PROJECT REPORT.pdfRESORT MANAGEMENT AND RESERVATION SYSTEM PROJECT REPORT.pdf
RESORT MANAGEMENT AND RESERVATION SYSTEM PROJECT REPORT.pdf
 
BRAKING SYSTEM IN INDIAN RAILWAY AutoCAD DRAWING
BRAKING SYSTEM IN INDIAN RAILWAY AutoCAD DRAWINGBRAKING SYSTEM IN INDIAN RAILWAY AutoCAD DRAWING
BRAKING SYSTEM IN INDIAN RAILWAY AutoCAD DRAWING
 

Data Transformation Technique for Protecting Private Information in Privacy Preserving Data Mining

  • 1. Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010 DOI : 10.5121/acij.2010.1101 1 Data Transformation Technique for Protecting Private Information in Privacy Preserving Data Mining S.Vijayarani#1 , Dr.A.Tamilarasi*2 #1 Assistant Professor, School of Computer Science and Engineering, Bharathiar University, Coimbatore *2 Prof&Head, Department of MCA, Kongu Engg. College, Erode 1 vijimohan_2000@yahoo.com, 2 drtamil@kongu.ac.in Abstract - Data mining is the process of extracting patterns from data. Data mining is seen as an increasingly important tool by modern business to transform data into an informational advantage. Data Mining can be utilized in any organization that needs to find patterns or relationships in their data. A group of techniques that find relationships that have not previously been discovered. In many situations, the extracted patterns are highly private and it should not be disclosed. In order to maintain the secrecy of data, there is in need of several techniques and algorithms for modifying the original data in order to limit the extraction of confidential patterns. There have been two types of privacy in data mining. The first type of privacy is that the data is altered so that the mining result will preserve certain privacy. The second type of privacy is that the data is manipulated so that the mining result is not affected or minimally affected. The aim of privacy preserving data mining researchers is to develop data mining techniques that could be applied on data bases without violating the privacy of individuals. Many techniques for privacy preserving data mining have come up over the last decade. Some of them are statistical, cryptographic, randomization methods, k-anonymity model, l-diversity and etc. In this work, we propose a new perturbative masking technique known as data transformation technique can be used for protecting the sensitive information. An experimental result shows that the proposed technique gives the better result compared with the existing technique. Keywords- Privacy, Sensitive data, Data transformation, Micro-aggregation, K-means clustering. I. INTRODUCTION With the speedy development in database, networking, and computing technologies, a large amount of individual data can be integrated and investigated digitally, leading to an increased use of data-mining tools to infer trends and patterns. This has lifted worldwide concerns about protecting the privacy of individuals. Privacy preserving data mining (PPDM) refers to the area of data mining that seeks to safeguard sensitive information from unsolicited or unsanctioned disclosure. The main objective in privacy preserving data mining is to develop algorithms for modifying the original data in some way, so that the private data and private knowledge remain private even after the mining process.[11] The need for shielding numerical data from leak has gained significant importance in recent years. Government organizations which release data have always been interested in this problem. However, with the increase in the ability of organizations to gather, store, analyze, disseminate, and share data, there has also been a growing demand for commercial organizations to secure sensitive data from disclosure. Recent legislation worldwide has made this an important issue for all organizations that gather and store any sensitive information.
  • 2. Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010 2 The advancement of information technologies has enabled various organizations (e.g., census agencies, hospitals) to collect large volumes of sensitive personal data (e.g., census data, medical records). Due to the great research value of such data, it is often released for public benefit purposes, which, however, poses a risk to individual privacy. A typical solution to this problem is to anonymize the data before releasing it to the public. In particular, the anonymization should be conducted in a careful manner, such that the published data not only prevents an adversary from inferring sensitive information, but also remains useful for data analysis. Most methods for privacy computations use some form of transformation on the data in order to perform the privacy preservation. A host of techniques are available for protecting numerical data from disclosure. These include rounding or coarsening, perturbation, micro- aggregation, data swapping, and more recently, data shuffling. Muralidhar and Sarathy [9] provide a comprehensive discussion of the different techniques for protecting numerical data. With the exception of swapping and shuffling, most other data masking techniques involve the modification of the original values of the confidential variables. Many users find such modification of values to be objectionable and hence are less likely to use the modified data. By contrast, by transforming the original values leaves the original data unmodified. Hence, this type of data transformation techniques are more likely to be accepted by users who find “data modification” objectionable. The rest of this paper is organized as follows. In Section 2, we present an overview of micro data and masking techniques. Section 3 we discuss about the data transformation perturbative masking technique. Micro-aggregation technique is discussed in section 4. Section 5 gives the experimental results of data transformation and micro-aggreagation. Conclusions are given in Section 6. II. STATISTICAL DISCLOSURE CONTROL Inference control in statistical databases, also known as Statistical Disclosure Control (SDC) or Statistical Disclosure Limitation (SDL), seeks to protect statistical data in such a way that they can be publicly released and mined without giving away private information that can be linked to specific individuals or entities A. Micro Data A microdata set is a set of records containing data of individuals being studied, who can be persons, companies, etc. The individual records of a microdata set are stored in a microdata file. Each individual j is assigned a data vector Vj, also called data record or data set. A data vector is formed by several variables or attributes. The attributes in an initial micro data table are usually classified as follows.[12] 1) Identifiers - Attributes that exclusively recognize a micro data respondent. For instance, attribute employee number uniquely identifies the employee with which is associated. 2) Quasi-identifiers - Attributes that, in combination, can be linked with external information to re-identify, all or some of the respondents to whom information refers or reduce the uncertainty over their identities. For instance, attributes DOB, ZIP, and SEX are quasi-identifiers: they can be linked to external public information to reveal the name
  • 3. Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010 3 and address of the corresponding respondents or to reduce the uncertainty to a specific set of respondents. Confidential attributes - Attributes of the micro-data table contains confidential information. For instance, attribute salary can be considered as confidential. B. Classification of Micro data Disclosure Protection The micro data protection techniques can be classified into two main categories: masking techniques, and synthetic data generation techniques Fig.1. Classification of MPTs 1) Masking techniques - The original data are transformed to produce new data that are valid for statistical analysis and such that they preserve the confidentiality of respondents. Masking techniques can be classified as: 1.1 Non-perturbative - the original data are not modified, but some data are suppressed and/or some details are removed; Non-perturbative techniques produce protected micro data by eliminating details from the original micro data. Some of the non- perturbative techniques are Sampling, Local suppression, Global recoding, Top- coding, Bottom-coding and Generalization. 1.2 Perturbative - the original data are modified. With perturbative techniques, the micro data table is modified for publication. Modifications can make unique combinations of values in the original table disappear as well as introduce new combinations. Resampling, Lossy compression, Rounding, PRAM, MASSC, Random noise, Swapping, Rank swapping and Micro-aggregation are some of the perturbative masking techniques. 2) Synthetic data generation techniques - The original set of tuples in a micro data table is replaced with a new set of tuples generated in such a way to preserve the key statistical properties of the original data. The generation process is usually based on a statistical model and the key statistical properties that are not included in the model will not be necessarily respected by the synthetic data. Since the released micro data table contains synthetic data, the re-identification risks are reduced. Note that the released micro data table can be entirely synthetic (i.e., fully synthetic) or mixed with the original data (i.e., partially synthetic).
  • 4. Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010 4 III. PROPOSED SYSTEM A. Objective of the Problem The sensitive attribute can be selected from the micro data and it can be modified by a data transformation perturbative masking technique. After modification, the modified data can be released to data mining researchers or any agency or firm. If they can apply data mining techniques such as clustering, classification, etc for data analysis, the modified table does not affect the result. In this work, we have applied k-means clustering algorithm to the modified data and verified the result. The steps involved in this work are, 1. Sensitive Attribute Selection 2. Data Transformation perturbative masking technique for modifying the sensitive attribute 3. Applying k-means algorithm for original and modified data 4. Compare the results B. Sensitive Attribute Selection From the micro data table select the sensitive numeric attributes. For example, an employee database the attributes are employee number, employee name, date of birth, salary, account no, qualification, designation and etc... The attributes salary and account number are considered as sensitive attributes. C. Data Transformation perturbative masking technique Fig.2 Proposed System Architecture Algorithm for Data Transformation 1. Consider a database D consists of T tuples. D={t1,t2,…tn}. Each tuple in T consists of set of attributes T={A1,A2,…Ap} where Ai Є T and Ti Є D 2. Identify the sensitive or confidential numeric attribute AR
  • 5. Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010 5 3. Consider the sensitive attribute AR, find out the maximum number of digits, maxD Є AR and minimum number of digits, minD Є AR 4. For grouping the same number of digits, find out the number of groups 5. Verify if (maxD=mind) then assign r as 1 Arrange the entire sensitive data item Є r group only Go to step 7 6. Compute r=(maxD-minD)+1 Arrange all the sensitive data item Є Xk (k=1,2,…r) groups, according to the number of digits. Check if (minD ≤ r ≤ maxD) 7. Find the number of items Kl(l=1,2,…m) in ∀ Xk group ∀ Xk group (k=1,2,…r) If number of items kl Є Xk =1 then no modification 8. Otherwise, if number of items kl Є Xk =2 Then transform the value of kl to T, km to kl ; and T to km else 9. Assign the value of k1 to T 10. Repeat (for l=1 to m) begin Assign the value of kl+1 to k end until (l>m) 11. Assign the value of T to km D. K-means Clustering Algorihm The k-means algorithm for partitioning, where each cluster’s center is represented by the mean value of the objects in the cluster Input • K : the number of clusters • D : a data set containing n objects Output: Set of k clusters Method (1) arbitrarily choose k objects from D as the initial cluster centers; (2) Repeat (3) (re)assign each object to the cluster to which the object is the most similar, based on the mean value of the objects in the cluster (4) Update the cluster means, i.e. calculate the mean value of the objects for each cluster (5) Until no change Apply the k-means clustering for both original and modified data set to get the clusters
  • 6. Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010 6 E. Comparing the results The data items found in the clusters are verified in both original data and modified data. IV. MICRO-AGGREGATION Microaggregation is a statistical disclosure control technique for microdata. Raw microdata (i. e. individual records) are grouped into small aggregates prior to publication. Each aggregate should contain at least k records to prevent disclosure of individual information. Microaggregation is a family of statistical disclosure control techniques for microdata which belong to the data modification category. The rationale behind microaggregation is that confidentiality rules in use allow publication of microdata sets if the data vectors correspond to groups of k or more individuals, where no individual dominates (i. e. contributes too much to) the group and k is a threshold value. Strict application of such confidentiality rules leads to replacing individual values with values computed on small aggregates (microaggregates) prior to publication. This is the basic principle of microaggregation.[3] To obtain microaggregates in a microdata set with n data vectors, these are combined to form g groups of size at least k. For each variable, the average value over each group is computed and is used to replace each of the original averaged values. Groups are formed using a criterion of maximal similarity. Once the procedure has been completed, the resulting (modified) data vectors can be published. [1] V. EXPERIMENTAL RESULTS In order to conduct the experiments, synthetic employee dataset can be created with 500 records. From this dataset, we select the sensitive numeric attribute. i.e. income. Data transformation perturbative masking technique is used to modify this sensitive attribute. The following performance factors are considered for evaluating the technique A. Statistical performance of the original data and modified data In order to calculate the statistical properties such as mean, variance and standard deviation for original data and modified data. The chart shows that, in data transformation technique has produced the same statistical results after the modification also. Microaggregation technique returns only the mean value is same as the original. But other statistical property such as variance and standard deviation does not produce the same results. We have applied different size of data sets for verification.
  • 7. Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010 7 Fig.1. Statistical performances The data-transformation technique will produce accurate statistical results compared to micro- aggregation B. Privacy protection To verify the privacy protection, we test whether all the original data items are modified using the data transformation approach or not. All the data items are modified then we get 100% privacy protection. The following chart depicts this. In the given data set both methods would produce 100% of privacy protection. Fig.2. Privacy Protection C. Accuracy of data mining algorithm The chart shows that the percentage of clustering accuracy is obtained from the data transformation and microaggregation. From the results, we come know that the data items found in the original clusters are same as the data transformation approach. Comparing the data- transformation and micro-aggregation the clustering accuracy is higher in data-transformation.
  • 8. Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010 8 Fig.3. Accuracy of clustering algorithm VI. CONCLUSION AND FUTURE WORK Protecting the sensitive data and also extracting knowledge is a very complicated problem. Based on the above experimental results we come know that the proposed data transformation technique is a good technique for protecting and modifying the sensitive data. After modification, the data could be used for data mining also. And also it is very easy to get the original data. After modification we need the original data only two steps are required to get the original data. In the proposed work, the data transformation technique is used for numerical attributes. In future, we would develop new masking techniques for protecting the categorical attributes. ACKNOWLEDGEMENT I would like to thank “The UGC, New Delhi” for providing me the necessary funds. REFERENCES [1] Brand R (2002). “Micro data protection through noise addition”. In Domingo-Ferrer J, editor, Inference Control in Statistical Databases, vol. 2316 of LNCS,pp. 97{116. Springer, Berlin Heidelberg. [2] Charu C.Aggarwal IBM T.J. Watson Research Center, USA and Philip S. “Privacy preserving data mining: Models and algorithms” Yu University of Illinois at Chicago, USA. [3] Domingo-Ferrer, J & Torra, V (2002), “Aggregation Techniques for Statistical confidentiality”. In: Aggregation operators: new trends and applications, pp. 260-271. Physica-Verlag GmbH, Heidelberg (2002). [4] Domingo-Ferrer, J & Mateo-Sanz, J. M. (2002), “Practical data-oriented microaggregation for statistical disclosure control”, IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 1, pp. 189-201, 2002. [5] Domingo-Ferrer, J & Torra, V (2005), Ordinal, continuous and heterogenerous k-anonymity through microaggregation, Data Mining and Knowledge Discovery, vol. 11, no. 2, pp. 195-212, 2005.
  • 9. Advanced Computing: An International Journal ( ACIJ ), Vol.1, No.1, November 2010 9 [6] Defays, D & Nanopoulous, P (1993), “Panels of enterprises and confidentiality: the small aggregates method”, in Proc. of 92 Symposium on Design and Analysis of Longitudinal Surveys. Ottawa: Statistics Canada, 1993, pp. 195-204. [7] J.m. mateo-sanz, j. Domingo-ferrer, “A comparative study of microaggregation methods”, question, vol. 22, 3, p. 511-526, 1998. [8] Krishnamurty Muralidhar, Rahul Parsa, Rathindra Sarathy, “A general Additive Data Perturbation Method for database Security”, management science, Vol. 45, No. 10, October 1999, pp. 1399-1415 DOI: 10.1287/mnsc.45.10.1399 [9] Rathindra Sarathy, Krishnamurty Muralidhar, “The Security of Confidential Numerical Data in Databases”, information systems research,Vol. 13, No. 4, December 2002, pp. 389-403 DOI: 10.1287/isre.13.4.389.74. [10] Samarati, P (2001), “Protecting respondents' identities in microdata release”, IEEE Transactions on Knowledge and Data Engineering, 13(6):1010-1027. 2001. [11] Vassilios S. Veryhios, Elisa Bertino, Igor Nai Fovino Loredana Parasiliti Provenza, Yucel Saygin, Yannis eodoridis, “State-of-the-art in Privacy Preserving Data Mining” , SIGMOD Record, Vol. 33, No. 1, March 2004. [12] V.Ciriani, S.De Capitani di Vimercati, S.Foresti, and P.Samarati Universitua degli Studi di Milano, “Micro data protection” 26013 Crema, ltalia., Springer US, Advances in Information Security (2007)