Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Depth based app


Published on

depth based approach for outlier detection

Published in: Engineering
  • Be the first to comment

  • Be the first to like this

Depth based app

  1. 1. Depth based approach for outlier detection Presented By: Madhav solanki M.Tech(CSE) II Year 14815000210/1/2015 1
  2. 2. Introduction Applications Algorithm Conclusion References Outline 10/1/2015 2
  3. 3. Introduction What is an outlier? Definition of Hawkins [Hawkins 1980]: “An outlier is an observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism” 3
  4. 4. Application • Sample applications of outlier detection – Fraud detection • Purchasing behaviour of a credit card owner usually changes when the card is stolen • Abnormal buying patterns can characterize credit card abuse – Medicine • Unusual symptoms or test results may indicate potential health problems of a patient
  5. 5. cont……… – Public health • The occurrence of a particular disease, e.g. tetanus, scattered across various hospitals of a city indicate problems with the corresponding vaccination program in that city Sports statistics • In many sports, various parameters are recorded for players in order to evaluate the players’ performances • Outstanding (in a positive as well as a negative sense) players may be identified as having abnormal parameter values 5
  6. 6. cont……… Detecting measurement errors: • Data derived from sensors (e.g. in a given scientific experiment) may contain measurement errors • Abnormal values could provide an indication of a measurement error • Removing such errors can be important in other data mining and data analysis tasks 6
  7. 7. General application scenarios – Supervised scenario • In some applications, training data with normal and abnormal data objects are provided • There may be multiple normal and/or abnormal classes • Often, the classification problem is highly unbalanced – Semi-supervised Scenario • In some applications, only training data for the normal class(es) (or only the abnormal class(es)) are provided – Unsupervised Scenario • In most applications there are no training data available • In this tutorial, we focus on the unsupervised scenario 7
  8. 8. Depth-based Approach: • General idea:: – Search for outliers at the border of the data space but independent of statistical distributions – Organize data objects in convex hull layers – Outliers are objects on outer layers • Basic assumption:: – Outliers are located at the border of the data space – Normal objects are in the centre of the data space 8
  9. 9. Cont….. • Model [Tukey 1977] • – Points on the convex hull of the full data space have depth = 1 • – Points on the convex hull of the data set after removing all points with • depth = 1 have depth = 2 • – Points having a depth ≤ k are reported as outliers 9
  10. 10. References 10/1/2015 10 [1] Wikipedia [2] [3] [4]
  11. 11. Thank You.......... 10/1/2015 11