Information-Theoretic Outlier Detection for Large-Scale Categorical Data

Outlier detection can usually be considered a pre-processing step for locating, in a data set, those objects that do not conform to well-defined notions of expected behavior. It is very important in data mining for discovering novel or rare events, anomalies, vicious actions, exceptional phenomena, etc. We investigate outlier detection for categorical data sets. This problem is especially challenging because it is difficult to define a meaningful similarity measure for categorical data. In this paper, we propose a formal definition of outliers and an optimization model of outlier detection, based on a new concept, holoentropy, that takes both entropy and total correlation into consideration. Based on this model, we define a function for the outlier factor of an object that is determined solely by the object itself and can be updated efficiently. We propose two practical one-parameter outlier detection methods, named ITB-SS and ITB-SP, which require no user-defined parameters for deciding whether an object is an outlier. Users need only provide the number of outliers they want to detect. Experimental results show that ITB-SS and ITB-SP are more effective and efficient than mainstream methods and can be used to deal with both large and high-dimensional data sets where existing algorithms fail.
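To make the optimization idea concrete, here is a minimal Python sketch. It is not ITB-SS or ITB-SP, whose details the abstract does not give. It assumes that holoentropy is the plain sum of the joint entropy and the total correlation; because total correlation is the sum of the per-attribute entropies minus the joint entropy, that sum reduces to the sum of the per-attribute entropies. The function names `attribute_entropy`, `holoentropy`, and `greedy_outliers` are hypothetical, and the search is a naive greedy loop that recomputes the objective for every candidate instead of using an efficiently updatable outlier factor.

```python
from collections import Counter
from math import log2


def attribute_entropy(column):
    """Shannon entropy (in bits) of one categorical attribute."""
    n = len(column)
    if n == 0:
        return 0.0
    counts = Counter(column)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def holoentropy(rows):
    """Sum of the per-attribute entropies of a data set.

    Assumption: holoentropy = joint entropy + total correlation,
    which equals the sum of the marginal (per-attribute) entropies.
    """
    if not rows:
        return 0.0
    # zip(*rows) turns the list of objects into one tuple per attribute.
    return sum(attribute_entropy(col) for col in zip(*rows))


def greedy_outliers(rows, num_outliers):
    """Naive greedy search (illustrative only).

    Repeatedly removes the object whose removal leaves the remaining
    data with the lowest holoentropy. The only user input is the
    number of outliers requested.
    """
    remaining = list(range(len(rows)))
    outliers = []
    for _ in range(min(num_outliers, len(rows))):
        best = min(
            remaining,
            key=lambda i: holoentropy([rows[j] for j in remaining if j != i]),
        )
        outliers.append(best)
        remaining.remove(best)
    return outliers


# Toy data: four identical objects and one that differs in both attributes.
data = [
    ("red", "small"),
    ("red", "small"),
    ("red", "small"),
    ("red", "small"),
    ("blue", "large"),
]
print(greedy_outliers(data, 1))  # prints [4]
```

Each greedy step here rescans the whole data set for every candidate object, so the sketch only suits small examples; it illustrates the objective, not the efficiency claimed for the proposed methods.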