Outlier detection for high dimensional data

Published in: Technology, Education

  1. Outlier Detection for High Dimensional Data Presented by
  2. Outline • Problem definition • Literature survey • System features • System architecture • Analysis models • UML diagrams • System implementation plan • Gantt chart, cost implementation model
  3. Problem definition Outlier detection has applications in credit card fraud detection, network intrusion detection, financial applications and marketing. The problem typically arises in very high dimensional data sets, yet much of the recent work on finding outliers uses methods that implicitly assume relatively low dimensionality of the data. We therefore discuss new techniques that find outliers by studying the behavior of projections of the data set.
  4. Literature survey • Many algorithms have been proposed in recent years for outlier detection, but they are not specifically designed to deal with the curse of high dimensionality. • Two interesting algorithms define outliers using the full dimensional distances of the points from one another. This measure is naturally susceptible to the curse of high dimensionality. • According to Knorr and Ng, a point p in a data set is an outlier with respect to the parameters k and A if no more than k points in the data set are at a distance A or less from p. • As has been pointed out, this method is sensitive to the choice of the parameter A, which is hard to determine a priori. In addition, as the dimensionality increases, it becomes increasingly difficult to pick a suitable A.
  5. System features • Hardware & software requirements: Hard disk: 80 GB; RAM: 1 GB; Technology: Java; Tools: NetBeans IDE; Processor: Intel Pentium IV or above; Operating system: Windows XP
  6. System features (continued) Quality attributes • Usability: The application is user friendly because the GUI is interactive. • Maintainability: The application can be maintained over a long period because it is implemented on the Java platform. • Reusability: The application can be reused by extending it with new modules. • Portability: The application is a mobile application that can be operated only on the Android operating system.
  7. System architecture The system architecture is divided into three modules: • High dimensional outlier detection • Lower dimensional projection • Post processing
  8. Use case diagram
  9. Activity diagram
  10. System Implementation Plan
      Sr. No.  Task Name                                                     Start  Duration
      1        Project topic finalization                                           10 days
      2        Literature                                                           10 days
      3        Studying Core Java, J2SE                                             30 days
      4        Implementation of High Dimensional Outlier detection system          7 days
      5        Implementation of Lower projections                                  10 days
  11. Gantt chart & cost implementation model [Chart: cost (in Rs) and time (in days) for Training, System installation, Studying Core Java, Implementation of Modules, Testing, Documentation; axis scale 0 to 10000]
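
The distance-based definition attributed to Knorr and Ng in the literature survey (slide 4) can be illustrated with a minimal Java sketch. This is a brute-force reading of that definition only, not the presenters' system and not Knorr and Ng's own algorithms. The class name, the method names and the toy data are hypothetical, and the sketch assumes Euclidean distance and that a point is not counted as its own neighbor.

```java
import java.util.ArrayList;
import java.util.List;

/** Brute-force sketch of the distance-based outlier test with parameters k and A. */
public class DistanceOutliers {

    /** Euclidean distance over all dimensions (the "full dimensional" distance). */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Returns true if no more than k other points lie within distance
     * radius (the parameter A on the slide) of data[p].
     * Assumption: the point itself is not counted as a neighbor.
     */
    static boolean isOutlier(double[][] data, int p, int k, double radius) {
        int neighbors = 0;
        for (int q = 0; q < data.length; q++) {
            if (q != p && distance(data[p], data[q]) <= radius) {
                neighbors++;
                if (neighbors > k) {
                    return false; // more than k neighbors: not an outlier
                }
            }
        }
        return true;
    }

    /** Indices of all points flagged as outliers, using O(n^2) distance computations. */
    static List<Integer> findOutliers(double[][] data, int k, double radius) {
        List<Integer> result = new ArrayList<>();
        for (int p = 0; p < data.length; p++) {
            if (isOutlier(data, p, k, radius)) {
                result.add(p);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up toy data: four clustered points and one isolated point.
        double[][] data = {
            {1.0, 1.0}, {1.1, 0.9}, {0.9, 1.2}, {1.2, 1.1}, {8.0, 8.0}
        };
        System.out.println(findOutliers(data, 1, 1.0)); // prints [4]
    }
}
```

On the same toy data, raising the radius to 20.0 flags no point at all, because every point then has four neighbors. This is a small illustration of the sensitivity to A that the slide mentions.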
