International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 2835
Survey on Clustering based Categorical Data Protection
Amrutha HJ1, Anu A Kittur2, Chaitra MS3, Gowri M4, Sowmya SR5
1,2,3,4BE Student, Department of Information Science and Engineering
5Professor, Dept. of ISE, Dayananda Sagar Academy of Technology & Management, Karnataka, India
----------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - The number of publicly accessible datasets is rising every day. Improving data privacy therefore becomes mandatory. This has motivated sustained research into effective protection techniques that prevent the disclosure of entities in the datasets while preserving data utility. A comprehensive approach to categorical data protection is carried out by clustering the dataset and then protecting every data segment.
Key Words: Categorical Data, Clustering, Data mining,
Data privacy
1. INTRODUCTION
Providing the requisite privacy is the main objective in protecting data or information. All clients who enter data expect it to be protected. Data mining is a method that transforms raw data into finished data. It is an approach that examines the vast quantity of data collected in order to extract trends. Categorical data is statistical data consisting of categorical values.
There are three major kinds of attributes to consider in a dataset: identifiers, quasi-identifiers and confidential attributes. Quasi-identifiers are pieces of information that are not by themselves unique identifiers but that, in combination, can identify an individual with some degree of uncertainty.
Confidential attributes include information on employment, health issues or religion. Clustering is the process of grouping a set of abstract objects into classes of similar objects. Clustering is applied in areas such as market research, data analysis, pattern recognition and image processing.
Protection approaches are evaluated on the basis of two important measures: information loss and disclosure risk.
The information loss is calculated by comparing statistical parameters of the anonymized data table with those of the original data table. Protection approaches can be classified into two general categories: perturbative and non-perturbative.
A perturbative technique changes an attribute's sensitive value by replacing it with a new value.
A non-perturbative technique does not change the attribute's sensitive value; rather, it suppresses or deletes certain data.
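The two categories and the information loss measure can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the random replacement rule used as the perturbative method, the rare-value suppression used as the non-perturbative method, and the choice of total variation distance between category frequencies as the statistical parameter being compared.

```python
import random
from collections import Counter


def perturb(values, domain, p, rng):
    """Perturbative masking: with probability p, replace a value by a different category."""
    out = []
    for v in values:
        if rng.random() < p:
            out.append(rng.choice([c for c in domain if c != v]))
        else:
            out.append(v)
    return out


def suppress_rare(values, min_count, marker="*"):
    """Non-perturbative masking: suppress categories seen fewer than min_count times."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else marker for v in values]


def frequency_loss(original, masked):
    """Information loss as total variation distance between category frequencies."""
    n = len(original)
    f_orig, f_mask = Counter(original), Counter(masked)
    categories = set(f_orig) | set(f_mask)
    return 0.5 * sum(abs(f_orig[c] - f_mask[c]) / n for c in categories)


# Hypothetical toy attribute.
jobs = ["nurse", "nurse", "teacher", "teacher", "teacher", "pilot"]
suppressed = suppress_rare(jobs, min_count=2)
noisy = perturb(jobs, sorted(set(jobs)), p=0.3, rng=random.Random(0))
print(suppressed, frequency_loss(jobs, suppressed))
print(noisy, frequency_loss(jobs, noisy))
```

A loss of 0 means the masked attribute has the same category frequencies as the original; larger values indicate that more of the distribution was changed.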
2. METHODOLOGY
2.1 Subtractive Clustering:
The existing subtractive clustering approach can be used only for numerical data, not for data with categorical values. In the conventional mountain clustering process, candidate cluster centres are the grid points with the maximum value of the mountain function. However, the computational complexity of mountain clustering can grow quickly, so a subtractive approach to clustering has been proposed. This approach can be used only on numerical data, since categorical data has no natural ordering. Though clustering using k-means gives better efficiency, subtractive clustering is powerful.
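For numerical data, the idea can be sketched as follows. This is a minimal illustration of the commonly described potential-based formulation of subtractive clustering, not the categorical variant surveyed here; the function name, the radius `ra`, the factor 1.5 for the subtraction radius and the stopping ratio are assumptions chosen for the example.

```python
import math


def subtractive_clustering(points, ra=1.0, stop_ratio=0.15):
    """Return cluster centres chosen among `points` by their potential (local density)."""
    rb = 1.5 * ra                      # wider radius used when subtracting potential
    alpha = 4.0 / ra ** 2
    beta = 4.0 / rb ** 2

    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Potential of each point: high when many other points lie nearby.
    potential = [sum(math.exp(-alpha * sqdist(p, q)) for q in points) for p in points]

    centres = []
    first_peak = max(potential)
    while True:
        best = max(range(len(points)), key=potential.__getitem__)
        if potential[best] < stop_ratio * first_peak:
            break
        centre, peak = points[best], potential[best]
        centres.append(centre)
        # Subtract the new centre's influence so nearby points are not picked again.
        potential = [pot - peak * math.exp(-beta * sqdist(p, centre))
                     for p, pot in zip(points, potential)]
    return centres


# Two well-separated hypothetical groups of 2-D points.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centres = subtractive_clustering(points, ra=1.0)
print(centres)
```

Unlike mountain clustering, the potential is evaluated only at the data points rather than on a grid, which is why the cost depends on the number of points and not on the dimension of the grid. The squared Euclidean distance is also what ties the method to numerical data.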
2.2 Robust Hierarchical Clustering (RHC):
Hierarchical clustering is a popular unsupervised technique used for metabolomics data. Conventional hierarchical clustering is highly sensitive to outliers, and the presence of outliers can lead to misleading clustering results. The Two-Stage Generalized S-estimator (TSGS) is used to robustify hierarchical clustering through a robust estimate of the covariance matrix.
There are 3 major steps in the robust hierarchical clustering methodology.
1. Estimation of the robust covariance matrix:
The biggest hurdle here is to estimate an appropriate correlation or dispersion matrix in the presence of both case-wise and cell-wise outliers.
2. Robust estimation of the correlation-based dissimilarity matrix using the TSGS covariance matrix.
3. Computation of the proposed RHC with the TSGS dispersion matrix.
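Step 2 can be sketched under stated assumptions. The TSGS estimator itself is not reproduced here; the covariance matrix `cov` is assumed to have been produced by some robust estimator, and the dissimilarity d_ij = 1 - r_ij is one common correlation-based choice, not necessarily the one used in the surveyed work.

```python
import math


def correlation_dissimilarity(cov):
    """Turn a covariance matrix into a dissimilarity matrix d_ij = 1 - r_ij."""
    n = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[1.0 - cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]


# Hypothetical 2 x 2 covariance matrix standing in for a robust estimate.
cov = [[4.0, 2.0],
       [2.0, 9.0]]
d = correlation_dissimilarity(cov)
print(d)
```

The resulting matrix can then be passed to any standard agglomerative clustering routine, which corresponds to step 3.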
2.3 Decision Tree Categorical Value Clustering
Data perturbation methods add noise to the data to prevent exact confidential values from being revealed. Categorical values of attributes are clustered in the beginning, and these clusters are then used in the later stages to generate noise. The decision-tree-based categorical value clustering and perturbation technique perturbs one non-class categorical attribute of a dataset. Therefore, we apply it once for each non-class categorical attribute of the original dataset in order to perturb all of them. Each time, a dataset is generated with one perturbed attribute in it. Lastly, we construct a dataset (combining all perturbed datasets) in which each non-class categorical attribute is perturbed and all other attributes are left unchanged.
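The per-attribute procedure can be sketched as below. The decision-tree step that produces the value clusters is not shown; `value_clusters` is a hypothetical input standing in for its result, and replacing a value by a random member of its own cluster is an assumed noise model used only for illustration.

```python
import random


def perturb_attribute(records, attribute, clusters, rng):
    """Replace each value of `attribute` by a random value from the same cluster."""
    cluster_of = {value: cluster for cluster in clusters for value in cluster}
    perturbed = []
    for record in records:
        new_record = dict(record)
        cluster = cluster_of[record[attribute]]
        new_record[attribute] = rng.choice(sorted(cluster))
        perturbed.append(new_record)
    return perturbed


def perturb_all(records, value_clusters, rng):
    """Perturb one attribute at a time, then combine the perturbed columns."""
    combined = [dict(r) for r in records]
    for attribute, clusters in value_clusters.items():
        one_attr = perturb_attribute(records, attribute, clusters, rng)
        for target, source in zip(combined, one_attr):
            target[attribute] = source[attribute]
    return combined


# Hypothetical records; "class" is the class attribute and is never perturbed.
records = [
    {"job": "nurse", "city": "Pune", "class": "yes"},
    {"job": "teacher", "city": "Mysore", "class": "no"},
    {"job": "doctor", "city": "Mumbai", "class": "yes"},
]
value_clusters = {
    "job": [("nurse", "doctor"), ("teacher", "tutor")],
    "city": [("Pune", "Mumbai"), ("Mysore",)],
}
masked = perturb_all(records, value_clusters, random.Random(1))
print(masked)
```

Because a value can only be exchanged for another value of the same cluster, the noise stays within groups of values that the clustering step considered similar.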
2.4 Outlier Diagnosis:
An outlier is an observation that does not conform to the pattern of the dataset or to any other expected behaviour. Outliers can be diagnosed using anomaly detection methods. They are also called anomalies, novelties, noise, or deviations.
Anomaly detection methods come in three different types:
1. Supervised anomaly detection
2. Semi-supervised anomaly detection
3. Unsupervised anomaly detection
Unsupervised anomaly detection identifies anomalies in an unlabeled test data set, under the assumption that the majority of instances in the data set are normal, by searching for the instances that fit the rest of the data set least.
2.4.1 Outlier Detection Techniques:
A. Statistical outlier detection:
It estimates the parameters of a statistical distribution, assuming that all the data points are generated by that distribution.
B. Depth based outlier detection:
Depth based methods search for outliers at the border of the data space. They are independent of the statistical distribution of the data.
C. Distance based outlier detection:
This judges a point based on its distance to its neighbours.
D. Density based outlier detection:
It uses the distribution of data point density within the data set.
E. Deviation based outlier detection:
The data components are scattered as a sparse matrix in the data set, which creates confusion over the analysis of results. Points that deviate from the standard points are considered anomalies.
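As an illustration of technique C, the following is a minimal sketch of distance based detection. The rule used here, flagging a point when the distance to its k-th nearest neighbour exceeds a threshold, is one common formulation; the function name and the parameter values are assumptions.

```python
import math


def knn_distance_outliers(points, k=2, threshold=1.0):
    """Flag points whose distance to their k-th nearest neighbour exceeds threshold."""
    outliers = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        if dists[k - 1] > threshold:
            outliers.append(p)
    return outliers


# Four hypothetical points on a unit square plus one point far away.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8)]
print(knn_distance_outliers(points, k=2, threshold=2.0))  # [(8, 8)]
```

The method is unsupervised: it needs no labels, only a distance function, which is why it is usually applied to numerical data or to categorical data with a suitable dissimilarity.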
Table 1: Comparison table for outlier algorithms
2.5 Evolutionary Optimization Approach
An evolutionary approach to data protection is based on an evolutionary algorithm driven by a combination of information loss and disclosure risk measures. Such an algorithm is designed to find exact or approximate solutions to optimization and search problems. The algorithm uses two simple genetic operators: mutation and crossover. It uses state-of-the-art techniques for categorical stability.
Mutation: Some elements of a chromosome are changed at random to obtain a new offspring.
Crossover: Two chromosomes are recombined to produce two new offspring.
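The two operators can be sketched on a chromosome that encodes one categorical attribute as a list of values. The encoding, the per-gene mutation rate and the one-point crossover are assumptions made for this example; the fitness function combining information loss and disclosure risk is not shown.

```python
import random


def mutate(chromosome, domain, rate, rng):
    """Replace each gene by a random category from `domain` with probability `rate`."""
    return [rng.choice(domain) if rng.random() < rate else gene for gene in chromosome]


def crossover(parent_a, parent_b, point):
    """One-point crossover: swap the tails of two parents to get two offspring."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


# Hypothetical parents over the category domain {A, B, C}.
parent_a = ["A", "A", "B", "C"]
parent_b = ["C", "B", "B", "A"]
child_a, child_b = crossover(parent_a, parent_b, point=2)
print(child_a, child_b)  # ['A', 'A', 'B', 'A'] ['C', 'B', 'B', 'C']
print(mutate(parent_a, ["A", "B", "C"], rate=0.25, rng=random.Random(0)))
```

In a full algorithm, offspring produced this way would be scored and the better ones kept for the next generation.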
2.6 L-Diversity
Anonymity models based on generalization can shield the confidentiality of individuals but often lead to information loss. (K, l, al)-diversity diminishes information loss and ensures data quality. This method protects against attribute disclosure even when the adversary's background knowledge is unknown. In this model, sensitive attributes are well represented. The technique is a modification of k-anonymity. Given a set of n records, (k, l, range) diversity requires a clustering in which every cluster includes at least k (k ≤ n) data elements as well as at least l distinct sensitive values, and in which the sum of all intra-cluster distances is minimized.
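The size and diversity conditions on each cluster can be checked with a few lines of code. This sketch only verifies the two constraints; the minimization of intra-cluster distance is not shown, and the function name and record layout are assumptions.

```python
def satisfies_k_l(clusters, sensitive, k, l):
    """True if every cluster has >= k records and >= l distinct sensitive values."""
    return all(
        len(cluster) >= k and len({rec[sensitive] for rec in cluster}) >= l
        for cluster in clusters
    )


# Hypothetical clustering of five records with one sensitive attribute.
clusters = [
    [{"disease": "flu"}, {"disease": "asthma"}, {"disease": "flu"}],
    [{"disease": "diabetes"}, {"disease": "flu"}],
]
print(satisfies_k_l(clusters, "disease", k=2, l=2))  # True
print(satisfies_k_l(clusters, "disease", k=3, l=2))  # False
```

Setting l = 1 reduces the check to the cluster-size condition of k-anonymity, which shows in what sense the model is a modification of k-anonymity.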
3. RESULT COMPARISON
3.1 Clustering Algorithms
Table 2: Comparison table for clustering algorithms
Algorithm | Benefit | Drawback
Subtractive clustering | The method is efficient. Experiments on several UCI datasets indicate that the approach can attain better clustering accuracy than the k-modes algorithm. | Unsupervised clustering is not clear.
Robust hierarchical clustering | Simulation studies clearly show that the proposed approach improves performance considerably over conventional hierarchical clustering. | 1. The preceding step cannot be undone. 2. Time complexity: not suitable for large datasets.
3.2 Outlier Algorithms
Table 3: Comparison table for outlier algorithms
3.3 Protection Algorithms
Table 4: Comparison table for protection algorithms
Algorithm | Advantage | Disadvantage
L-Diversity | 1. Makes the distribution within the category of critical attributes more robust, thereby increasing data protection. 2. Protects against attribute disclosure. | 1. It can be redundant and laborious to achieve. 2. Prone to attacks such as the skewness attack.
Evolutionary Optimization Approach | Performs better on higher-dimensional problems. | Although robust to noisy evaluation functions, it may not produce a sensible outcome within a stipulated amount of time.
4. CONCLUSIONS
In this paper, a new approach is used to deal with categorical data confidentiality using the SCCA clustering technique, which can yield better clustering accuracy than the conventional k-modes algorithm on each dataset. The efficiency of the TSGS algorithm is greater than that of other robust estimation techniques.
L-diversity strengthens the privacy of the respondent, but it is not sufficient to protect critical attributes. Hence, the evolutionary optimization strategy is a better method of protection.
REFERENCES
[1]H. Zhao and Z. Qi, "Hierarchical Agglomerative Clustering
with Ordering Constraints," 2010 Third International
Conference on Knowledge Discovery and Data Mining,
Phuket, 2010, pp. 195-199.
doi:10.1109/WKDD.2010.123
[2] Lei Gu, "A novel locality sensitive k-means clustering
algorithm based on subtractive clustering," 2016 7th IEEE
International Conference on Software Engineering and
Service Science (ICSESS), Beijing, 2016, pp. 836-839.
doi:10.1109/ICSESS.2016.7883196
[3] Jiang Chundong, Jia Haipeng, Du Taihang, Zhang Lei and
Chunbo Jiang, "Evolutionary algorithm and its application in
structural topology optimization," 2008 27th Chinese
Control Conference,Kunming,2008, pp.10-14.
doi:10.1109/CHICC.2008.4605057
[4] Marés J., Torra V. (2012) Clustering-Based Categorical
Data Protection. In: Domingo-Ferrer J., Tinnirello I . (eds)
Privacy in Statistical Databases PSD 2012.Lecture Notes in
Computer Science,vol 7556.Springer,Berlin,Heidelberg
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 2838
[5] Wanliang Fu, "Multi-media data mining technology for
the systematic framework," 2012 IEEE International
Conference on Computer Science and Automation
Engineering, Beijing, 2012, pp. 570-572.
doi:10.1109/ICSESS.2012.6269531
[6] H. C. Mandhare and S. R. Idate, "A comparative study of
cluster based outlier detection, distance based outlier
detection and density based outlier detection techniques,"
2017 International ConferenceonIntelligentComputingand
Control Systems (ICICCS),Madurai,2017,pp.931-935.
[7] B. M. Varghese and U. A., "Recursive Decision Tree
Induction Based on Homogeneousness for Data Clustering,"
2008 International Conference on Cyberworlds,
Hangzhou,2008,pp.754-758.
doi:10.1109/CW.2008.56
[8] Han Jianmin, Cen Tingting and Yu Juan, "An l-MDAV
microaggregation algorithm for sensitive attribute l-
diversity," 2008 27th Chinese Control Conference,Kunming,
2008, pp. 713-718.
doi:10.1109/CHICC.2008.4605421
[9] S. Banerjee, A. Choudhary and S. Pal, "Empirical
evaluation of K-Means, Bisecting K-Means, Fuzzy C-Means
and Genetic K-Means clustering algorithms," 2015 IEEE
International WIE Conference on Electrical and Computer
Engineering (WIECON-ECE), Dhaka, 2015, pp. 168-172.
doi:10.1109/WIECON-ECE.2015.7443889
[10] Fayyoumi and O.Nofal,"ApplyingGenetic Algorithmson
Multi-level Micro-Aggregation Techniques for Secure
Statistical Databases," 2018 IEEE/ACS 15th International
Conference on Computer Systems and Applications
(AICCSA),Aqaba,2018,pp.1-6.
doi: 10.1109/AICCSA.2018.8612813

IRJET- Sampling Selection Strategy for Large Scale Deduplication of Synthetic...IRJET- Sampling Selection Strategy for Large Scale Deduplication of Synthetic...
IRJET- Sampling Selection Strategy for Large Scale Deduplication of Synthetic...
 
A Firefly based improved clustering algorithm
A Firefly based improved clustering algorithmA Firefly based improved clustering algorithm
A Firefly based improved clustering algorithm
 





IRJET - Survey on Clustering based Categorical Data Protection

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 2835
Amrutha HJ1, Anu A Kittur2, Chaitra MS3, Gowri M4, Sowmya SR5
1,2,3,4 BE Student, Department of Information Science and Engineering
5 Professor, Dept. of ISE, Dayananda Sagar Academy of Technology & Management, Karnataka, India
Abstract - The number of publicly accessible datasets is rising every day. Improving data privacy therefore becomes mandatory. This is a major reason why prolonged research has been undertaken to deliver effective protection techniques that prevent the disclosure of entities in the datasets while preserving data utility. A comprehensive approach to categorical data protection is carried out by applying clustering to the dataset and then safeguarding every data segment.
Key Words: Categorical Data, Clustering, Data mining, Data privacy
1. INTRODUCTION
Providing the requisite privacy is the main goal when protecting data or information. All clients who enter data expect it to be protected. Data mining is a method that transforms raw data into finished data: it examines the vast quantity of data collected in order to find trends. Categorical data is statistical data consisting of categorical values. There are three major kinds of attribute to consider in a dataset: identifiers, quasi-identifiers and confidential attributes. Quasi-identifiers are pieces of information that are not by themselves distinct identifiers but can identify an individual with some degree of uncertainty. Confidential attributes include information such as employment, health issues or religion.
Clustering can be defined as the process of grouping abstract objects into classes of similar objects within the set. Clustering is used in applications such as market surveys, data analysis, pattern recognition and image processing. Protection approaches are evaluated on two important measures: information loss and disclosure risk. Information loss is calculated by comparing statistical parameters of the anonymized data table with those of the original data table. Protection approaches can be classified into two general categories: perturbative and non-perturbative. A perturbative technique replaces an attribute's sensitive value with a new value. A non-perturbative technique does not change the attribute's sensitive value; rather, it suppresses or deletes certain data.
2. METHODOLOGY
2.1 Subtractive Clustering:
The existing subtractive clustering approach can be used only for numerical data; it cannot be used for data with categorical values. In the conventional mountain clustering process, the grid points with maximum value are taken as cluster centres. However, the computational complexity of mountain clustering can grow quickly, so a subtractive method of clustering has been proposed. This approach applies only to numerical data, since categorical data has no natural ordering. Although clustering with k-means is more efficient, subtractive clustering is powerful.
2.2 Robust Hierarchical Clustering (RHC):
Hierarchical clustering is a popular unsupervised technique used for metabolomics data. Conventional hierarchical clustering is highly sensitive to outliers, and in their presence it can produce misleading clustering results. The Two-Stage Generalized S-estimator (TSGS) is used to robustify hierarchical clustering by providing a robust estimate of the covariance matrix.
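The sensitivity of correlation-based dissimilarities to outliers, described above for conventional hierarchical clustering, can be illustrated with a small sketch. It is not the TSGS estimator: as a simple stand-in, it swaps in Spearman rank correlation to show how a more robust estimate changes the dissimilarity `1 - correlation`. The data values and all function names are hypothetical, chosen only for illustration.

```python
from statistics import mean


def pearson(x, y):
    """Classical (non-robust) Pearson correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def ranks(values):
    """Rank positions starting at 1 (this sketch assumes there are no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Rank correlation, used here only as a simple robust stand-in for TSGS."""
    return pearson(ranks(x), ranks(y))


def dissimilarity(correlation):
    """Correlation-based dissimilarity fed to hierarchical clustering."""
    return 1.0 - correlation


# Two variables that move together (illustrative values).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_clean = [1.1, 2.0, 2.9, 4.2, 5.1, 5.9]

# The same variable with a single contaminated cell (a cell-wise outlier).
y_dirty = list(y_clean)
y_dirty[2] = 60.0

print("clean, Pearson dissimilarity: ", dissimilarity(pearson(x, y_clean)))
print("dirty, Pearson dissimilarity: ", dissimilarity(pearson(x, y_dirty)))
print("dirty, Spearman dissimilarity:", dissimilarity(spearman(x, y_dirty)))
```

With one contaminated cell, the Pearson-based dissimilarity jumps from near 0 to about 1, so the two variables would no longer be merged early in the dendrogram, while the rank-based value degrades far less. A robust covariance estimator plays the analogous role in RHC.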
There are three major steps in the robust hierarchical clustering methodology:
1. Estimation of a robust covariance matrix: The biggest hurdle here is to estimate an appropriate correlation or dispersion matrix in the presence of both case-wise and cell-wise outliers.
2. Robust evaluation of the correlation-based dissimilarity matrix using the TSGS covariance matrix.
3. Estimation of the proposed RHC with the TSGS dispersion matrix.
2.3 Decision Tree Categorical Value Clustering
Data perturbation methods add noise to the data to prevent the exact confidential values from being revealed. Categorical values
of attributes are clustered at the beginning, and these clusters are then used in the later stages to create noise. The decision-tree categorical value clustering and perturbation technique perturbs one non-class categorical attribute of a dataset. Therefore, we apply it once for each non-class attribute of the original dataset in order to perturb all non-class categorical attributes. Each time, a dataset is generated with one perturbed attribute in it. Lastly, we construct a dataset (combining all the perturbed datasets) in which each non-class categorical attribute is perturbed and all other attributes are left unchanged.
2.4 Outlier Diagnosis:
An outlier is a data point that does not conform to the pattern in the dataset or to other expected behaviour. Outliers can be diagnosed using anomaly detection methods. These phenomena are also called anomalies, novelties, noise or deviations. Anomaly detection comes in three different types:
1. Supervised anomaly detection
2. Semi-supervised anomaly detection
3. Unsupervised anomaly detection
Unsupervised anomaly detection identifies anomalies in an unlabeled test dataset, under the assumption that the majority of instances are normal, by searching for the instances that appear to fit the rest of the dataset least.
2.4.1 Outlier Detection Techniques:
A. Statistical outlier detection: It assumes that all data points are generated by a statistical distribution and estimates the parameters of that distribution.
B. Depth based outlier detection: Depth-based methods search for outliers at the border of the data space. They are independent of the statistical distribution of the data.
C. Distance based outlier detection: This judges a point based on its distance to its neighbours.
D.
Density based outlier detection: It uses the density distribution of the data elements in the dataset.
E. Deviation based outlier detection: The data elements are scattered as a sparse matrix in the dataset, which complicates the analysis of results. Points that deviate from the standard points are considered anomalies.
Table 1: Comparison table for outlier algorithms
2.5 Evolutionary Optimization Approach
This approach to data protection is based on an evolutionary algorithm driven by a combination of information loss and disclosure risk measures. Such an algorithm searches for exact or approximate solutions to optimization or search problems. The algorithm uses two simple genetic operators: mutation and crossover. It uses state-of-the-art techniques for categorical stability.
Mutation: The parts of a chromosome are randomly rearranged to obtain a new offspring.
Crossover: Two chromosomes are recombined to produce two new offspring.
2.6 L-Diversity
Anonymity models based on generalization can protect the confidentiality of individuals but often lead to information loss. (k, l, al)-diversity reduces information loss and ensures data quality. This method ensures data privacy even without knowledge of the adversary's background knowledge, in order to avoid attribute disclosure. In this case the sensitive attributes are well represented. The technique is a modification of k-anonymity. Given a set of n records, (k, l, range)-diversity clusters the records in such a way that each cluster includes at least k (k ≤ n) data elements as well as at least l distinct sensitive values, and the sum of all intra-cluster distances is minimized.
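The cluster-level conditions in the definition above (at least k records and at least l distinct sensitive values per cluster, with a small total intra-cluster distance) can be checked with a short sketch. This is an illustration under stated assumptions, not the algorithm from the surveyed work: the record layout, the function names, and the use of Hamming distance over quasi-identifiers are all hypothetical choices, and the range condition is not modelled.

```python
from collections import defaultdict
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two categorical tuples differ (assumed metric)."""
    return sum(x != y for x, y in zip(a, b))


def group_by_cluster(records, labels):
    """Map each cluster label to the list of records assigned to it."""
    groups = defaultdict(list)
    for record, label in zip(records, labels):
        groups[label].append(record)
    return groups


def satisfies_k_l(records, labels, k, l):
    """True if every cluster has >= k records and >= l distinct sensitive values."""
    groups = group_by_cluster(records, labels)
    return all(
        len(members) >= k and len({sensitive for _, sensitive in members}) >= l
        for members in groups.values()
    )


def intra_cluster_distance(records, labels):
    """Sum of pairwise quasi-identifier distances inside each cluster."""
    groups = group_by_cluster(records, labels)
    return sum(
        hamming(qa, qb)
        for members in groups.values()
        for (qa, _), (qb, _) in combinations(members, 2)
    )


# Each record: (quasi-identifier tuple, sensitive value). Illustrative data only.
records = [
    (("F", "Engineer"), "Flu"),
    (("F", "Engineer"), "Asthma"),
    (("M", "Teacher"), "Flu"),
    (("M", "Teacher"), "Diabetes"),
]

compact = [0, 0, 1, 1]   # groups identical quasi-identifiers together
mixed = [0, 1, 0, 1]     # puts both "Flu" records in the same cluster

print(satisfies_k_l(records, compact, k=2, l=2), intra_cluster_distance(records, compact))
print(satisfies_k_l(records, mixed, k=2, l=2), intra_cluster_distance(records, mixed))
```

In this toy data the `compact` assignment meets both conditions with zero intra-cluster distance, whereas `mixed` fails the diversity condition because one cluster contains a single sensitive value. A clustering-based protection method would search for assignments like the former.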
3. RESULT COMPARISON
3.1 Clustering Algorithms
Table 2: Comparison table for clustering algorithms
Algorithm | Benefit | Drawback
Subtractive clustering | An efficient method is used in this case. Experiments on numerous UCI datasets indicate that the approach can attain better clustering accuracy than the k-modes algorithm. | Unsupervised clustering is not clear.
Robust hierarchical clustering | Simulation studies clearly show that the proposed approach improves performance considerably over conventional hierarchical clustering. | 1. A preceding step cannot be undone. 2. Time complexity: not suitable for large datasets.
3.2 Outlier Algorithms
Table 3: Comparison table for outlier algorithms
3.3 Protection Algorithms
Table 4: Comparison table for protection algorithms
Algorithm | Advantage | Disadvantage
L-Diversity | 1. Makes the distribution within the category of critical attributes more robust, thereby increasing data protection. 2. Protects against attribute disclosure. | 1. Can be redundant and laborious to achieve. 2. Prone to attacks such as the skewness attack.
Evolutionary Optimization Approach | Performs better on high-dimensional problems. Robust with respect to noisy evaluation functions. | May not yield a sensible outcome in a given stipulated amount of time.
4. CONCLUSIONS
In this paper, a new approach is used to deal with categorical data confidentiality using the SCCA clustering technique, which can achieve more satisfactory clustering accuracy than the older k-modes algorithm on each dataset. The efficiency of the TSGS algorithm is greater than that of other robust estimation techniques.
L-diversity strengthens the privacy of individuals, but it is not sufficient to protect critical attributes. Hence, the evolutionary optimization strategy is a better method of protection.
REFERENCES
[1] H. Zhao and Z. Qi, "Hierarchical Agglomerative Clustering with Ordering Constraints," 2010 Third International Conference on Knowledge Discovery and Data Mining, Phuket, 2010, pp. 195-199. doi: 10.1109/WKDD.2010.123
[2] Lei Gu, "A novel locality sensitive k-means clustering algorithm based on subtractive clustering," 2016 7th IEEE International Conference on Software Engineering and Service Science (ICSESS), Beijing, 2016, pp. 836-839. doi: 10.1109/ICSESS.2016.7883196
[3] Jiang Chundong, Jia Haipeng, Du Taihang, Zhang Lei and Chunbo Jiang, "Evolutionary algorithm and its application in structural topology optimization," 2008 27th Chinese Control Conference, Kunming, 2008, pp. 10-14. doi: 10.1109/CHICC.2008.4605057
[4] Marés J., Torra V. (2012) Clustering-Based Categorical Data Protection. In: Domingo-Ferrer J., Tinnirello I. (eds) Privacy in Statistical Databases, PSD 2012. Lecture Notes in Computer Science, vol 7556. Springer, Berlin, Heidelberg
[5] Wanliang Fu, "Multi-media data mining technology for the systematic framework," 2012 IEEE International Conference on Computer Science and Automation Engineering, Beijing, 2012, pp. 570-572. doi: 10.1109/ICSESS.2012.6269531
[6] H. C. Mandhare and S. R. Idate, "A comparative study of cluster based outlier detection, distance based outlier detection and density based outlier detection techniques," 2017 International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, 2017, pp. 931-935.
[7] B. M. Varghese and U. A., "Recursive Decision Tree Induction Based on Homogeneousness for Data Clustering," 2008 International Conference on Cyberworlds, Hangzhou, 2008, pp. 754-758. doi: 10.1109/CW.2008.56
[8] Han Jianmin, Cen Tingting and Yu Juan, "An l-MDAV microaggregation algorithm for sensitive attribute l-diversity," 2008 27th Chinese Control Conference, Kunming, 2008, pp. 713-718. doi: 10.1109/CHICC.2008.4605421
[9] S. Banerjee, A. Choudhary and S. Pal, "Empirical evaluation of K-Means, Bisecting K-Means, Fuzzy C-Means and Genetic K-Means clustering algorithms," 2015 IEEE International WIE Conference on Electrical and Computer Engineering (WIECON-ECE), Dhaka, 2015, pp. 168-172. doi: 10.1109/WIECON-ECE.2015.7443889
[10] Fayyoumi and O. Nofal, "Applying Genetic Algorithms on Multi-level Micro-Aggregation Techniques for Secure Statistical Databases," 2018 IEEE/ACS 15th International Conference on Computer Systems and Applications (AICCSA), Aqaba, 2018, pp. 1-6. doi: 10.1109/AICCSA.2018.8612813