A SURVEY ON ONE CLASS CLUSTERING
HIERARCHY FOR PERFORMING DATA
LINKAGE
S. Rajalakshmi,
Assistant Professor, Department of CSE,
Velammal Engineering College, Anna University,
Chennai, India.
raji780@yahoo.co.in
A. Jayanthi,
M.E. (CSE), Department of CSE,
Velammal Engineering College, Anna University,
Chennai, India.
jayanthiarumugamk@gmail.com
Abstract— Data linkage refers to the process of matching the
data from several databases that refer to entities of the same
type. Linkage is possible even for entities that do not share a
common identifier. As databases grow, the complexity of the
matching process becomes a major challenge for data linkage.
Many indexing techniques have been developed for data linkage,
but each has efficiency limitations. In this paper, a data
linkage method called the One Class Clustering Tree (OCCT) is
presented to overcome these challenges and to link entities
that do not share a common identifier. The technique builds a
tree in which the inner nodes represent features of the first
set of entities and the leaves represent features of the
matching entities in the second set. OCCT uses dedicated
splitting criteria and pruning methods for data linkage.
Keywords--Linkage, classification, clustering, splitting, decision
tree induction, index techniques.
I. INTRODUCTION
Data linkage is the process of identifying different entries that
refer to the same entity across different data sources [1]. Its
main aim is to join datasets that do not share a common
identifier or foreign key. Data linkage is often performed to
consolidate large datasets, and it also helps to remove duplicate
records within a dataset, a task known as deduplication [19].
Data linkage can be classified into two types, namely one-to-one
data linkage and one-to-many data linkage [15]. In one-to-one
data linkage, the aim is to link an entity from one dataset with
the matching entity from the other dataset. In one-to-many
data linkage, the aim is to link an entity from the first dataset
with a group of matching entities from the other dataset. This
paper discusses a data linkage approach called the One Class
Clustering Tree (OCCT), which is aimed at performing one-to-many
data linkage. OCCT is presented as preferable to the indexing
techniques because its tree can easily be translated into
linkage rules.
The paper is structured as follows: Section II reviews indexing
techniques, Section III deals with data linkage using OCCT, and
Section IV concludes the paper.
II. INDEXING TECHNIQUES
This section discusses the main indexing techniques and the
differences among them. The indexing process of data linkage
can be divided into two phases. 1) Build: all records in the
database are read and their Blocking Key Values (BKVs) are
generated. Most indexing techniques use an inverted index [6],
in which the identifiers of records that have the same BKV are
inserted into the same inverted index list. 2) Retrieve: for
every block, the list of record identifiers is retrieved from
the inverted index and the candidate record pairs are generated
from that list.
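The two phases can be sketched in a few lines of Python. This is only a minimal illustration: the blocking key (the first three letters of a surname), the record layout, and the function names are assumptions made for the example and do not come from any of the surveyed systems.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record):
    # Hypothetical BKV: first three letters of the surname, lower-cased.
    return record["surname"][:3].lower()


def build_index(records):
    """Build phase: map each BKV to the list of record identifiers sharing it."""
    index = defaultdict(list)
    for rec_id, record in records.items():
        index[blocking_key(record)].append(rec_id)
    return index


def retrieve_pairs(index):
    """Retrieve phase: emit candidate pairs from every inverted index list."""
    pairs = set()
    for rec_ids in index.values():
        pairs.update(combinations(sorted(rec_ids), 2))
    return pairs


records = {
    "r1": {"surname": "Smith"},
    "r2": {"surname": "Smithe"},
    "r3": {"surname": "Jones"},
    "r4": {"surname": "Smyth"},
}
index = build_index(records)
candidate_pairs = retrieve_pairs(index)
```

With this toy key, only `r1` and `r2` share a BKV, so they form the single candidate pair; `Smyth` receives a different key from `Smith` and is never compared with it.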
A. TRADITIONAL BLOCKING
Traditional blocking is one of the techniques used in data
linkage [1]. All records that have the same BKV are inserted
into the same block, and only the records within a block are
compared with each other. The technique can be implemented
with an inverted index [6]. Its main disadvantage is that
errors and variations in the record fields used to generate the
BKVs can cause a record to be inserted into the wrong block.
A second disadvantage is that the block sizes depend on the
frequency distribution of the BKVs, so it is difficult to
predict the total number of candidate record pairs that will be
generated.
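The dependence on the BKV frequency distribution can be made concrete: a block of n records yields n(n-1)/2 comparisons. The sketch below, with made-up key lists, counts candidate pairs for two databases of the same size but different key distributions.

```python
from collections import Counter


def candidate_pair_count(bkvs):
    """Number of within-block comparisons for a list of blocking key values."""
    block_sizes = Counter(bkvs)
    return sum(n * (n - 1) // 2 for n in block_sizes.values())


uniform = ["a", "a", "b", "b", "c", "c"]   # three blocks of size 2
skewed = ["a", "a", "a", "a", "b", "c"]    # one block of size 4, two singletons

uniform_pairs = candidate_pair_count(uniform)
skewed_pairs = candidate_pair_count(skewed)
```

Both lists hold six records, yet the skewed distribution produces twice as many candidate pairs, because the count grows quadratically with the size of the largest block.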
B. SORTED NEIGHBORHOOD INDEXING
Sorted neighborhood indexing sorts the database according to
the BKVs and then moves a window of a fixed number of records
over the sorted values. Candidate record pairs are generated
only from the records within the current window. There are
three approaches, namely the
sorted array based approach [4], the inverted index based
Proceedings of International Conference on Advancements in Engineering and Technology
ISBN NO : 978 - 1502893314
www.iaetsd.in
International Association of Engineering and Technology for Skill Development
51
approach [14], and the adaptive sorted neighborhood
approach [16]. The sorted array based approach is not
applicable when the window size is small. The inverted index
based approach shares the drawbacks of traditional blocking
and is inefficient, as it takes a long time to split the
entities. The adaptive sorted neighborhood approach is not
suitable when the window size is too large.
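A minimal sketch of the sorted array based variant follows, assuming the BKVs are plain strings and the window is at least two records wide. The function name and the sample data are illustrative only.

```python
def sorted_neighborhood_pairs(records, key, window):
    """Slide a fixed-size window over records sorted by BKV; pair records inside it."""
    if window < 2:
        raise ValueError("window must cover at least two records")
    ordered = sorted(records, key=lambda rec_id: key(records[rec_id]))
    pairs = set()
    # max(1, ...) ensures one window even when there are fewer records than the window size.
    for start in range(max(1, len(ordered) - window + 1)):
        in_window = ordered[start:start + window]
        for i in range(len(in_window)):
            for j in range(i + 1, len(in_window)):
                pairs.add(tuple(sorted((in_window[i], in_window[j]))))
    return pairs


# Record identifier -> BKV (here simply a lower-cased surname).
records = {"r1": "smith", "r2": "smyth", "r3": "jones", "r4": "smithe"}
pairs = sorted_neighborhood_pairs(records, key=lambda value: value, window=2)
```

The sorted order is jones, smith, smithe, smyth, so a window of two never places `smith` and `smyth` together. This shows how a small window can miss similar records that are not adjacent after sorting.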
C. Q-GRAM BASED INDEXING
Q-gram based indexing overcomes a drawback of traditional
blocking and sorted neighborhood indexing. Its aim is to index
the database such that records with similar, and not just
identical, BKVs are inserted into the same block [8]. However,
a much larger number of candidate record pairs is generated,
which makes the process more time consuming.
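One common way to realise this idea is to index every BKV under several keys built from sub-lists of its q-grams. The sketch below follows that reading; the `threshold` parameter, the key construction, and the function names are assumptions for illustration, not details given in this survey.

```python
from collections import defaultdict
from itertools import combinations


def qgrams(value, q=2):
    """Split a BKV into its overlapping substrings of length q."""
    return [value[i:i + q] for i in range(len(value) - q + 1)]


def qgram_keys(value, q=2, threshold=0.8):
    """Index keys: the full q-gram list plus shorter sub-lists down to a minimum length."""
    grams = qgrams(value, q)
    min_len = max(1, int(len(grams) * threshold))
    keys = set()
    for length in range(min_len, len(grams) + 1):
        # combinations() keeps the q-grams in their original order.
        for sub in combinations(grams, length):
            keys.add("".join(sub))
    return keys


def build_qgram_index(bkvs, q=2, threshold=0.8):
    """Insert every record under each of its q-gram keys."""
    index = defaultdict(set)
    for rec_id, value in bkvs.items():
        for key in qgram_keys(value, q, threshold):
            index[key].add(rec_id)
    return index


bkvs = {"r1": "smith", "r2": "smyth"}
index = build_qgram_index(bkvs, q=2, threshold=0.5)
shared = {key for key, ids in index.items() if ids == {"r1", "r2"}}
```

Here `smith` and `smyth` differ as whole keys but meet under the sub-list key `smth`, so they become a candidate pair. The number of sub-lists grows combinatorially with the length of the BKV, which is the source of the extra cost noted above.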
D. SUFFIX ARRAY-BASED INDEXING
Suffix array-based indexing is one of the most efficient
approaches among the previous works. The basic idea is to
insert the BKVs and their suffixes into a suffix array based
inverted index [11]. A variation called robust suffix array
based indexing merges the inverted index lists of suffix values
that are similar to each other in the sorted suffix array [13].
This merging step, however, also takes a lot of time.
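The basic (non-robust) variant can be sketched as follows. The minimum suffix length `min_len` and the sample names are assumptions for the example; in this simplified version a BKV shorter than `min_len` is not indexed at all, and the merging step of the robust variant is not shown.

```python
from collections import defaultdict


def suffixes(value, min_len=3):
    """All suffixes of a BKV that are at least min_len characters long."""
    return [value[i:] for i in range(len(value) - min_len + 1)]


def build_suffix_index(bkvs, min_len=3):
    """Insert each BKV and its suffixes into an inverted index, keys kept sorted."""
    index = defaultdict(set)
    for rec_id, value in bkvs.items():
        for suffix in suffixes(value, min_len):
            index[suffix].add(rec_id)
    return dict(sorted(index.items()))


bkvs = {"r1": "catherine", "r2": "katherine", "r3": "kathryn"}
index = build_suffix_index(bkvs, min_len=4)
```

In this example `catherine` and `katherine` differ in their first letter but share every shorter suffix, such as `atherine`, so they end up in common inverted index lists. `kathryn` shares no suffix of length four or more with them and stays separate.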
E. CANOPY CLUSTERING
Canopy clustering [14] is built by converting BKVs into lists
of tokens, with each unique token becoming a key in the
inverted index. Clusters are then formed using either a
threshold-based approach or a nearest-neighbor-based approach.
The drawback of canopy clustering is similar to that of the
sorted array based sorted neighborhood technique.
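A minimal sketch of the threshold-based variant is given below. It assumes q-gram tokens, Jaccard similarity, and two thresholds named `loose` and `tight`; these choices are illustrative. For brevity it compares token sets directly instead of going through a token inverted index, and it picks canopy centres in sorted order rather than at random.

```python
def tokens(value, q=2):
    """Token set of a BKV; here tokens are q-grams (an assumption)."""
    return {value[i:i + q] for i in range(len(value) - q + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets, defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def canopy_clusters(bkvs, loose=0.3, tight=0.7):
    """Threshold-based canopies: loose admits to a canopy, tight removes from the pool."""
    if tight < loose:
        raise ValueError("tight threshold must be >= loose threshold")
    token_sets = {rec_id: tokens(value) for rec_id, value in bkvs.items()}
    pool = sorted(token_sets)          # deterministic order instead of a random pick
    canopies = []
    while pool:
        centre = pool[0]
        canopy = [r for r in pool
                  if jaccard(token_sets[centre], token_sets[r]) >= loose]
        canopies.append(canopy)
        pool = [r for r in pool
                if r != centre
                and jaccard(token_sets[centre], token_sets[r]) < tight]
    return canopies


bkvs = {"r1": "smith", "r2": "smithe", "r3": "jones", "r4": "smyth"}
canopies = canopy_clusters(bkvs)
```

Canopies may overlap: `r4` (smyth) is loosely similar to `r1`, so it joins the first canopy, but it is not similar enough to be removed from the pool and later forms a canopy of its own.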
F. STRING-MAP-BASED INDEXING
String-map-based indexing [9] maps BKVs to objects in a
multidimensional Euclidean space such that the distances
between pairs of strings are preserved. Groups of similar
strings are then generated by extracting objects that are
close to each other. However, this technique fails when the
database is too large or too small.
All of the indexing techniques discussed above thus have
drawbacks in the data linkage process. To overcome them, a new
approach called the One Class Clustering Tree is proposed. It
uses four splitting criteria, namely the coarse-grained Jaccard
coefficient, the fine-grained Jaccard coefficient, Least
Probable Intersection (LPI), and Maximum Likelihood Estimation
(MLE), together with pruning techniques.
III. DATA LINKAGE USING OCCT
OCCT is induced using one of the splitting criteria. The
splitting criterion determines which attribute should be used
at each step of building the tree. OCCT uses prepruning to
decide which branches should be trimmed.
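To illustrate how a splitting criterion can pick an attribute, the sketch below uses the standard Jaccard coefficient |X ∩ Y| / |X ∪ Y| on the sets of second-table records reached by each branch, and chooses the attribute whose branches overlap least. This is one plausible reading of a coarse-grained Jaccard criterion, stated here as an assumption; the exact formulation used by OCCT is given in [20]. The attribute names and data are invented for the example.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Jaccard coefficient of two sets, defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def split_score(pairs, attribute):
    """Mean pairwise Jaccard between the B-record sets induced by an A attribute."""
    groups = defaultdict(set)
    for a_record, b_id in pairs:
        groups[a_record[attribute]].add(b_id)
    subsets = list(groups.values())
    if len(subsets) < 2:
        return 1.0  # no split happens, so treat the subsets as identical
    scores = [jaccard(x, y) for x, y in combinations(subsets, 2)]
    return sum(scores) / len(scores)


def best_split(pairs, attributes):
    """Pick the attribute whose partitions overlap least."""
    return min(attributes, key=lambda attr: split_score(pairs, attr))


# Matching pairs: (record from the first table, id of a linked second-table record).
pairs = [
    ({"city": "Chennai", "gender": "F"}, "b1"),
    ({"city": "Chennai", "gender": "M"}, "b2"),
    ({"city": "Delhi", "gender": "F"}, "b3"),
    ({"city": "Delhi", "gender": "M"}, "b3"),
    ({"city": "Delhi", "gender": "F"}, "b4"),
]
chosen = best_split(pairs, ["city", "gender"])
```

Splitting on `city` separates the linked records into disjoint groups (score 0.0), while splitting on `gender` leaves `b3` in both branches (score 0.25), so `city` is chosen in this toy case.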
Fig 1: Work Flow Diagram
First, the tree is constructed so that the inner nodes consist
of attributes and the leaves represent clusters of matching
entities. Second, prepruning is applied, which means that the
algorithm stops expanding a branch whenever the sub-branch
does not improve the accuracy of the model. OCCT uses a
probabilistic model to find the similar entities that are to be
matched, and this probabilistic approach helps to avoid
overfitting. For these reasons, OCCT is considered a better
approach for data linkage than the indexing techniques.
IV. CONCLUSION
In this paper, the OCCT approach, which performs one-to-many
data linkage, is discussed. The method is based on a one-class
decision tree model that sums up the knowledge of which records
should be linked together. Its one-class approach gives more
accurate results. The OCCT model has also proved successful in
three different domains, namely data leakage prevention,
recommender systems, and fraud detection.
[Fig 1 diagram labels: Database A, Database B, Construct OCCT using all entities, Prepruning technique, Compare entities, Matching entity, Non-matching entity, Final result]
REFERENCES
1. I.P. Fellegi and A.B. Sunter, “A Theory for Record
Linkage,” J. Am. Statistical Assoc., vol. 64, no. 328, pp.
1183-1210, Dec. 1969.
2. D.D. Dorfman and E. Alf, “Maximum-Likelihood
Estimation of Parameters of Signal-Detection Theory
and Determination of Confidence Intervals: Rating-
Method Data,” J. Math. Psychology, vol. 6, no. 3, pp.
487-496, 1969.
3. J.R. Quinlan, “Induction of Decision Trees,” Machine
Learning, vol. 1, no. 1, pp. 81-106, March 1986.
4. M.A. Hernandez and S.J. Stolfo, “The Merge/Purge
Problem for Large Databases,” Proc. ACM SIGMOD
Int’l Conf. Management of Data (SIGMOD ’95),
1995.
5. P. Langley, Elements of Machine Learning, San
Francisco, Morgan Kaufmann, 1996.
6. I.H. Witten, A. Moffat, and T.C. Bell, Managing
Gigabytes, second ed. Morgan Kaufmann, 1999.
7. S. Guha, R. Rastogi, and K. Shim, “Rock: A Robust
Clustering Algorithm for Categorical Attributes,”
Information Systems, vol. 25, no. 5, pp. 345-366,
July 2000.
8. L. Gravano, P.G. Ipeirotis, H.V. Jagadish, N. Koudas,
S. Muthukrishnan, and D. Srivastava, “Approximate
String Joins in a Database (Almost) for Free,” Proc.
27th Int’l Conf. Very Large Data Bases (VLDB ’01),
pp. 491-500, 2001.
9. L. Jin, C. Li, and S. Mehrotra, “Efficient Record
Linkage in Large Data Sets,” Proc. Eighth Int’l Conf.
Database Systems for Advanced Applications
(DASFAA ’03), pp. 137-146, 2003.
10. I.S. Dhillon, S. Mallela, and D.S. Modha,
“Information-Theoretic Co-Clustering,” Proc. Ninth
ACM SIGKDD Int’l Conf. Knowledge Discovery
and Data Mining, pp. 89-98, 2003.
11. A. Aizawa and K. Oyama, “A Fast Linkage Detection
Scheme for Multi-Source Information Integration,”
Proc. Int’l Workshop Challenges in Web
Information Retrieval and Integration (WIRI ’05),
2005.
12. A.J. Storkey, C.K.I. Williams, E. Taylor, and R.G. Mann,
“An Expectation Maximisation Algorithm for
One-to-Many Record Linkage,” University of Edinburgh
Informatics Research Report, 2005.
13. P. Christen, “A Comparison of Personal Name
Matching: Techniques and Practical Issues,” Proc.
IEEE Sixth Data Mining Workshop (ICDM ’06),
2006.
14. P. Christen, “Towards Parameter-Free Blocking for
Scalable Record Linkage,” Technical Report
TR-CS-07-03, Dept. of Computer Science, The Australian
Nat’l Univ., 2007.
15. P. Christen and K. Goiser, “Quality and Complexity
Measures for Data Linkage and Deduplication,”
Quality Measures in Data Mining, vol. 43, pp.
127-151, 2007.
16. S. Yan, D. Lee, M.Y. Kan, and L.C. Giles, “Adaptive
Sorted Neighborhood Methods for Efficient Record
Linkage,” Proc. Seventh ACM/IEEE-CS Joint Conf.
Digital Libraries (JCDL ’07), 2007.
17. A. Gershman et al., “A Decision Tree Based
Recommender System,” Proc. 10th Int’l Conf.
Innovative Internet Community Services, pp.
170-179, 2010.
18. M. Yakout, A.K. Elmagarmid, H. Elmeleegy,
M. Ouzzani, and A. Qi, “Behavior Based Record
Linkage,” Proc. VLDB Endowment, vol. 3,
no. 1-2, pp. 439-448, 2010.
19. P. Christen, “A Survey of Indexing Techniques for
Scalable Record Linkage and Deduplication,” IEEE
Trans. Knowledge and Data Eng., vol. 24, no. 9, pp.
1537-1555, Sept. 2012, doi:10.1109/TKDE.2011.127.
20. M. Dror, A. Shabtai, L. Rokach, and Y. Elovici, “OCCT: A
One-Class Clustering Tree for Implementing
One-to-Many Data Linkage,” IEEE Trans. Knowledge
and Data Eng., TKDE-2011-09-0577, 2013.
AN ENTROPIC OPTIMIZATION TECHNIQUE IN HETEROGENEOUS GRID COMPUTING USING BION...ijcsit
 
Data Security In Relational Database Management System
Data Security In Relational Database Management SystemData Security In Relational Database Management System
Data Security In Relational Database Management SystemCSCJournals
 
Paper id 37201536
Paper id 37201536Paper id 37201536
Paper id 37201536IJRAT
 
A Soft Set-based Co-occurrence for Clustering Web User Transactions
A Soft Set-based Co-occurrence for Clustering Web User TransactionsA Soft Set-based Co-occurrence for Clustering Web User Transactions
A Soft Set-based Co-occurrence for Clustering Web User TransactionsTELKOMNIKA JOURNAL
 

A SURVEY ON ONE CLASS CLUSTERING HIERARCHY FOR PERFORMING DATA LINKAGE

S.Rajalakshmi,
Assistant Professor, Department of CSE,
Velammal Engineering College, Anna University,
Chennai, India.
raji780@yahoo.co.in

A.Jayanthi,
M.E(CSE), Department of CSE,
Velammal Engineering College, Anna University,
Chennai, India.
jayanthiarumugamk@gmail.com

Abstract— Data linkage is the process of matching records from several databases that refer to entities of the same type. Data linkage is also possible for entities that do not share a common identifier. With the growing size of today's databases, the complexity of the matching process becomes a major challenge for data linkage. Many indexing techniques have been developed for data linkage, but they are not efficient. In this paper, a data linkage method called the One Class Clustering Tree (OCCT) is presented to overcome the existing challenges and to perform data linkage for entities that do not share a common identifier. The technique builds the tree so that the inner nodes represent the features of the first set of entities and the leaves represent the features of the second set that are similar. The one class clustering tree uses certain splitting criteria and pruning methods for data linkage.

Keywords--Linkage, classification, clustering, splitting, decision tree induction, index techniques.

I. INTRODUCTION

Data linkage is the process of identifying different entries that refer to the same entity across different data sources [1]. The main aim of data linkage is to join datasets that do not share a common identifier or foreign key. Data linkage is usually performed to reduce large data into smaller data. It also helps in removing duplicate data from the datasets; this technique is called deduplication [19].
Data linkage can be classified into two types, namely one-to-one data linkage and one-to-many data linkage [15]. In one-to-one data linkage, the aim is to link an entity from one dataset with the matching entity from the other dataset. In one-to-many data linkage, the aim is to link an entity from the first dataset with the group of matching entities from the other dataset. This paper uses a data linkage approach called the One Class Clustering Tree (OCCT), which is aimed at performing one-to-many data linkage. OCCT is preferable to the indexing techniques because it can easily be translated into linkage rules.

The paper is structured as follows: Section II reviews indexing techniques, Section III deals with data linkage using OCCT, and Section IV concludes the paper.

II. INDEXING TECHNIQUES

In this section the various indexing techniques and the differences among them are discussed in more detail. The indexing process of data linkage can be divided into two phases.

1) Build- All the records in the database are read and their Blocking Key Values (BKV) are generated. Most indexing techniques use an inverted index approach [6], where the record identifiers that have the same BKV are inserted into the same inverted index list.

2) Retrieve- For every block, the list of record identifiers is retrieved from the inverted index and the candidate record pairs are generated from the list.

A. TRADITIONAL BLOCKING

Traditional blocking is one of the techniques used in data linkage [1]. In traditional blocking, all the records that have the same BKV are inserted into the same block, and the records within that block are compared with each other. This technique can be implemented using the inverted index [6]. The main disadvantage of traditional blocking is that errors and variations in the record fields used to generate the BKVs lead to records being inserted into the wrong block.
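The build and retrieve phases, as used by traditional blocking, can be illustrated with a minimal Python sketch. The blocking key used here (first three letters of the surname plus the postcode), the field names and the sample records are assumptions chosen for illustration; they are not taken from the surveyed papers.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record):
    # Hypothetical BKV: first three letters of the surname plus the postcode.
    return record["surname"][:3].lower() + record["postcode"]


def build_index(records):
    # Build phase: record identifiers with the same BKV go into the
    # same inverted index list.
    index = defaultdict(list)
    for rec_id, record in records.items():
        index[blocking_key(record)].append(rec_id)
    return index


def candidate_pairs(index):
    # Retrieve phase: generate candidate record pairs within each block.
    pairs = set()
    for rec_ids in index.values():
        pairs.update(combinations(sorted(rec_ids), 2))
    return pairs


# Illustrative records (made up for this sketch).
records = {
    1: {"surname": "Smith", "postcode": "2600"},
    2: {"surname": "Smithe", "postcode": "2600"},
    3: {"surname": "Smyth", "postcode": "2600"},
    4: {"surname": "Jones", "postcode": "2611"},
}

index = build_index(records)
pairs = candidate_pairs(index)
```

In this sketch, records 1 and 2 share a block and form the only candidate pair, while the spelling variation in record 3 ("Smyth") places it in a different block, which is the kind of error sensitivity described for traditional blocking.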
The second disadvantage of traditional blocking is that the sizes of the generated blocks depend on the frequency distribution of the BKVs, so it is difficult to predict the total number of candidate record pairs that will be generated.

B. SORTED NEIGHBORHOOD INDEXING
Sorted neighborhood indexing sorts the database according to the BKVs and then moves a window of a fixed number of records over the sorted values; candidate record pairs are generated only from the records within the current window. It has three variants: the sorted array based approach [4], the inverted index based approach [14] and the adaptive sorted neighborhood approach [16]. The sorted array based approach is not applicable when the window size is small. The inverted index based approach shares the drawback of traditional blocking and is inefficient because it takes a long time to split the entities. The adaptive sorted neighborhood approach is not suitable when the window size is too large.

Proceedings of International Conference on Advancements in Engineering and Technology ISBN NO : 978 - 1502893314 www.iaetsd.in International Association of Engineering and Technology for Skill Development 51

C. Q-GRAM BASED INDEXING
Q-gram based indexing overcomes the drawbacks of traditional blocking and sorted neighborhood indexing. Its main aim is to index the database so that records with a similar BKV, and not just the same BKV, are inserted into the same block [8]. However, a much larger number of candidate record pairs is generated, which makes the process more time consuming.

D. SUFFIX ARRAY-BASED INDEXING
Suffix array-based indexing is one of the most efficient approaches among the techniques reviewed so far. Its basic idea is to insert the BKVs and their suffixes into a suffix array based inverted index [11]. A variant called robust suffix array based indexing merges the inverted index lists of suffix values that are similar to each other in the sorted suffix array [13]. This technique also takes a long time to merge the values.

E. CANOPY CLUSTERING
Canopy clustering [14] is built by converting the BKVs into lists of tokens, with each unique token becoming a key in the inverted index. It uses two approaches, the threshold-based approach and the nearest neighbor-based approach. The drawback of canopy clustering is similar to that of the sorted neighborhood technique based on the sorted array.

F. STRING-MAP-BASED INDEXING
String-map-based indexing [9] maps the BKVs to objects in a multidimensional Euclidean space such that the distances between pairs of strings are preserved. Groups of similar strings are then generated by extracting the objects that are close to each other. However, this technique fails when the database is too large or too small.

Each of the indexing techniques discussed above therefore has drawbacks in the data linkage process. To overcome these problems, a new approach called the One Class Clustering Tree has been proposed. It uses one of four splitting criteria, namely the coarse-grained Jaccard coefficient, the fine-grained Jaccard coefficient, Least Probable Intersection (LPI) and Maximum Likelihood Estimation (MLE), to split the data, together with pruning techniques.

III. DATA LINKAGE USING OCCT
OCCT is induced using one of the splitting criteria. The splitting criterion determines which attribute should be used at each step of building the tree. OCCT uses prepruning to decide which branches should be trimmed.

Fig 1: Work Flow Diagram

First, the tree is constructed so that its inner nodes consist of attributes and its leaves represent the clusters of matching entities. Second, prepruning is applied, which means that the algorithm stops expanding a branch whenever the sub-branch does not improve the accuracy of the model. OCCT uses a probabilistic model to find the similar entities that are to be matched. This probabilistic approach helps to avoid overfitting. For these reasons OCCT is preferred over the indexing techniques for data linkage.

IV. CONCLUSION
This paper has discussed the OCCT approach, which performs one-to-many data linkage. The method is based on a one-class decision tree model that summarizes which records should be linked together.
The method uses a one-class approach, which gives more accurate results. The OCCT model has also proved successful in three different domains, namely data leakage prevention, recommender systems and fraud detection.

[Fig 1 box labels: Construct OCCT using all entities; Prepruning technique; Compare entities; Matching entity; Non-matching entity; Final result; Database A; Database B]
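The attribute-selection step described in Section III can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the published algorithm: the paper names the coarse-grained Jaccard coefficient as a splitting criterion without giving its formula, so the sketch assumes that a candidate attribute of the first dataset is scored by the mean pairwise Jaccard overlap between the sets of matching second-dataset records in each branch, and that the attribute with the lowest overlap is chosen. The function names and the toy training pairs are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(s, t):
    """Jaccard coefficient |s & t| / |s | t| of two sets (0.0 if both are empty)."""
    union = s | t
    return len(s & t) / len(union) if union else 0.0


def split_score(pairs, attribute):
    """Mean pairwise Jaccard overlap between the matching B-record sets per branch.

    Assumption of this sketch: a lower score means the branches separate
    the matching entities better.
    """
    branches = defaultdict(set)
    for a_record, b_id in pairs:
        branches[a_record[attribute]].add(b_id)
    groups = list(branches.values())
    if len(groups) < 2:
        return 1.0  # a single branch separates nothing
    scores = [jaccard(s, t) for s, t in combinations(groups, 2)]
    return sum(scores) / len(scores)


def choose_split(pairs, attributes):
    """Pick the attribute whose branches overlap least."""
    return min(attributes, key=lambda attr: split_score(pairs, attr))


# Toy training pairs: (record from dataset A, identifier of a matching record in B).
pairs = [
    ({"city": "Chennai", "gender": "F"}, "b1"),
    ({"city": "Chennai", "gender": "M"}, "b2"),
    ({"city": "Delhi", "gender": "F"}, "b3"),
    ({"city": "Delhi", "gender": "M"}, "b4"),
    ({"city": "Chennai", "gender": "F"}, "b2"),
    ({"city": "Delhi", "gender": "M"}, "b3"),
]
best = choose_split(pairs, ["city", "gender"])
```

On this toy data, splitting on "city" yields two disjoint sets of matching records (score 0.0), while splitting on "gender" yields sets that share two of four records (score 0.5), so "city" is selected. A prepruning check could be added by refusing to split when the best score does not improve on the parent node; that step is omitted here.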
REFERENCES
1. I.P. Fellegi and A.B. Sunter, “A Theory for Record Linkage,” J. Am. Statistical Assoc., vol. 64, no. 328, pp. 1183-1210, Dec. 1969.
2. D.D. Dorfman and E. Alf, “Maximum-Likelihood Estimation of Parameters of Signal-Detection Theory and Determination of Confidence Intervals—Rating-Method Data,” J. Math. Psychology, vol. 6, no. 3, pp. 487-496, 1969.
3. J.R. Quinlan, “Induction of Decision Trees,” Machine Learning, vol. 1, no. 1, pp. 81-106, March 1986.
4. M.A. Hernandez and S.J. Stolfo, “The Merge/Purge Problem for Large Databases,” Proc. ACM SIGMOD Int’l Conf. Management of Data (SIGMOD ’95), 1995.
5. P. Langley, Elements of Machine Learning, San Francisco, Morgan Kaufmann, 1996.
6. I.H. Witten, A. Moffat, and T.C. Bell, Managing Gigabytes, second ed. Morgan Kaufmann, 1999.
7. S. Guha, R. Rastogi, and K. Shim, “Rock: A Robust Clustering Algorithm for Categorical Attributes,” Information Systems, vol. 25, no. 5, pp. 345-366, July 2000.
8. L. Gravano, P.G. Ipeirotis, H.V. Jagadish, N. Koudas, S. Muthukrishnan, and D. Srivastava, “Approximate String Joins in a Database (Almost) for Free,” Proc. 27th Int’l Conf. Very Large Data Bases (VLDB ’01), pp. 491-500, 2001.
9. L. Jin, C. Li, and S. Mehrotra, “Efficient Record Linkage in Large Data Sets,” Proc. Eighth Int’l Conf. Database Systems for Advanced Applications (DASFAA ’03), pp. 137-146, 2003.
10. I.S. Dhillon, S. Mallela, and D.S. Modha, “Information-Theoretic Co-Clustering,” Proc. Ninth ACM SIGKDD Int’l Conf. Knowledge Discovery and Data Mining, pp. 89-98, 2003.
11. A. Aizawa and K. Oyama, “A Fast Linkage Detection Scheme for Multi-Source Information Integration,” Proc. Int’l Workshop Challenges in Web Information Retrieval and Integration (WIRI ’05), 2005.
12. A.J. Storkey, C.K.I. Williams, E. Taylor, and R.G. Mann, “An Expectation Maximisation Algorithm for One-to-Many Record Linkage,” University of Edinburgh Informatics Research Report, 2005.
13. P. Christen, “A Comparison of Personal Name Matching: Techniques and Practical Issues,” Proc. IEEE Sixth Data Mining Workshop (ICDM ’06), 2006.
14. P. Christen, “Towards Parameter-Free Blocking for Scalable Record Linkage,” Technical Report TR-CS-07-03, Dept. of Computer Science, The Australian Nat’l Univ., 2007.
15. P. Christen and K. Goiser, “Quality and Complexity Measures for Data Linkage and Deduplication,” Quality Measures in Data Mining, vol. 43, pp. 127-151, 2007.
16. S. Yan, D. Lee, M.Y. Kan, and L.C. Giles, “Adaptive Sorted Neighborhood Methods for Efficient Record Linkage,” Proc. Seventh ACM/IEEE-CS Joint Conf. Digital Libraries (JCDL ’07), 2007.
17. A. Gershman et al., “A Decision Tree Based Recommender System,” in Proc. the 10th Int. Conf. on Innovative Internet Community Services, pp. 170-179, 2010.
18. M. Yakout, A.K. Elmagarmid, H. Elmeleegy, M. Ouzzani, and A. Qi, “Behavior Based Record Linkage,” in Proc. of the VLDB Endowment, vol. 3, no. 1-2, pp. 439-448, 2010.
19. P. Christen, “A Survey of Indexing Techniques for Scalable Record Linkage and Deduplication,” IEEE Trans. Knowledge and Data Eng., vol. 24, no. 9, pp. 1537-1555, Sept. 2012, doi:10.1109/TKDE.2011.127.
20. M. Dror, A. Shabtai, L. Rokach, and Y. Elovici, “OCCT: A One-Class Clustering Tree for Implementing One-to-Many Data Linkage,” IEEE Trans. on Knowledge and Data Engineering, TKDE-2011-09-0577, 2013.