Your SlideShare is downloading. ×
Comparison of fuzzy   neural clustering based outlier detection techniques
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×

Introducing the official SlideShare app

Stunning, full-screen experience for iPhone and Android

Text the download link to your phone

Standard text messaging rates apply

Comparison of fuzzy neural clustering based outlier detection techniques

520
views

Published on

Published in: Technology, Education

0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
520
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
26
Comments
0
Likes
0
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 4, July-August (2013), © IAEME 422 COMPARISON OF FUZZY - NEURAL CLUSTERING BASED OUTLIER DETECTION TECHNIQUES Nilam Upasani Research Scholar, Computer Engineering, Indian School of Mines, Dhanbad, India Hari Om Computer Engineering, Indian School of Mines, Dhanbad, India ABSTRACT Fuzzy logic can be used to reason like humans and can deal with uncertainty other than randomness. Ability to learn, adapt, fault tolerance and reason with available knowledge, are the distinguished features of neural networks. Outlier detection is a difficult task to be performed, due to uncertainty involved in it. The outlier itself is a fuzzy concept and difficult to determine in a deterministic way. The hybridization of fuzzy and neural soft computing system is very promising, since they exactly tackle the situation associated with outliers. This paper provides a detailed comparison of fuzzy and/or neural approaches used for outlier detection by discussing their pros and cons. Thus this is a very helpful document for naive researchers in this field. Keywords: outlier detection; fuzzy logic; neural nerwork; data mining I. INTRODUCTION Business Intelligence is a de-facto standard that uses Data Mining techniques to manage and plan businesses. Mined knowledge is in the form of rules, association among the patterns, current and future customer trends. But there is a serious issue concerning with data mining, called outliers. In literature different definitions of outlier exist: the most commonly referred are [1]: - “An outlier is an observation that deviates so much from other observations as to arouse suspicions that is was generated by a different mechanism “ (Hawkins, 1980). - “An outlier is an observation (or subset of observations) which appear to be inconsistent with the remainder of the dataset” (Barnet & Lewis, 1994). - “An outlier is an observation that lies outside the overall pattern of a distribution” (Moore and McCabe, 1999). - “Outliers are those data records that do not follow any pattern in an application” (Chen and al., 2002). INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) ISSN 0976 – 6367(Print) ISSN 0976 – 6375(Online) Volume 4, Issue 4, July-August (2013), pp. 422-431 © IAEME: www.iaeme.com/ijcet.asp Journal Impact Factor (2013): 6.1302 (Calculated by GISI) www.jifactor.com IJCET © I A E M E
  • 2. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 4, July-August (2013), © IAEME 423 - “An outlier in a set of data is an observation or a point that is considerably dissimilar or inconsistent with the remainder of the data” (Ramasmawy at al., 2000). Outlier may occur due to data error, noise, anomaly, improper sensors or typo error. Outliers may cause incorrect knowledge discovery and obstruct the process of data mining and thus may badly affect the business profit. Recently, many application domains have recognized the direct mapping between the outliers in data and real world anomalies that are of great interest to data analyst. The outlier detection is searching for objects in the database that do not obey laws valid for the major part of the data. Outlier detection is an important pre-processing task [2]. It has several realistic applications in areas such as, image processing [3], remote sensing [4], fraud detection [5], identifying computer network intrusions and bottlenecks [6], criminal activities in e-commerce and detecting suspicious activities [7], Satellite image analysis, etc. For any outlier detection technique, there are two ways to report the outliers which are scores and labels [8]. In scoring techniques, outlier score is assigned to each instance in the test data depending on the degree to which that instance is considered an outlier. Thus the output of such techniques is a ranked list of outliers. An analyst may choose to either analyze top few outliers or use a cut-off threshold to select the outliers. In label techniques, a label (normal or anomalous) is assigned to each test instance. Outlier detection and clustering analysis are two greatly interconnected tasks, although they serve different purposes. Clustering locates the majority patterns in a data set and categorizes data accordingly, whereas outlier detection tries to capture those exceptional cases that deviate substantially from the majority patterns. Without intrinsic clustering in the data, no natural outliers will exist. Clustering is an important tool for outlier analysis. A number of clustering-based outlier detection techniques have been developed, most of which rely on the key assumption that normal objects belong to large and dense clusters, while outliers form very small clusters [9, 10]. Outlier detection methods are derived from three fields of computing which are statistics (proximity-based, parametric, non-parametric and semi-parametric), neural networks (supervised and unsupervised) and machine learning. The main advantage of statistical models is they are mathematically justified. They are commonly appropriate to quantitative real-valued data sets. In case of least quantitative ordinal data distributions, the ordinal data should be transformed to suitable numerical values for statistical (numerical) processing. This restricts their applicability and in case of complex data transformations it also increases the processing time. Much outlier detection has only focused on continuous real-valued data attributes whereas Machine Learning has mainly centered on categorical data. Machine learning techniques are robust, do not suffer the Curse of Dimensionality, work well on noisy data and have simple class boundaries compared with the complex class boundaries. Neural Network approaches are normally non-parametric and model based. They generalize well to unseen patterns, capable of learning complex class boundaries and have better reasoning ability along with fault tolerance. Much of the pattern recognition task is now implemented using fuzzy logic as it can be used to reason in imprecise situations and can handle uncertainty. If we hybridize fuzzy logic with the neural networks, it combines advantages of both. Such a hybrid system can also be used for Outlier Detection. As outlier detection and clustering analysis are two greatly interconnected tasks, this survey aims at providing an overview of the outlier detection techniques involving fuzzy and/or neural network approaches based on clustering, by focusing on their strengths and weaknesses. This paper is organized as follows. Section II and III gives a comprehensive overview of the fuzzy clustering based outlier detection and neural network based outlier detection techniques. Advantages and disadvantages of these two approaches are discussed in Section IV. Section V points towards some hybridized fuzzy neural network approaches effectively used for pattern recognition, which can also be used for the task of outlier detection. Conclusions are given in section VI and references are cited at the end.
  • 3. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 4, July-August (2013), © IAEME 424 II. FUZZY CLUSTERING BASED OUTLIER DETECTION Crisp set can be used to handle structured data which is generally free from outlier points. But it is difficult to handle unstructured natural data which often contain outlier data points. Fuzzy set (introduced by Zadeh [11]) is different from its crisp counterpart, as it allows the elements to have a degree of membership. Thus, to handle unstructured natural data which is qualitative and imprecise in nature, many of the data mining techniques are integrated with fuzzy logic [12]. In traditional clustering approaches each pattern belongs to one and only one cluster [13]. In case of fuzzy clustering, the membership function associates each pattern with all the clusters. Thus fuzzy clusters grow in their natural shapes. In this section some fuzzy clustering outlier detection techniques are analyzed. A. Fuzzy C-Means algorithm (FCM) [14]: FCM is one of the well known fuzzy clustering algorithms, and used in a wide variety of applications, such as medical imaging, remote sensing, data mining and pattern recognition [15,16,17,18,19]. Chiu, Yager & Filev have proposed Fuzzy c-means data clustering algorithm in which each data point belongs to a clustering to a degree specified by a membership grade [20]. Fuzzy c-means clustering has two processes: the calculation of cluster centers and the assignment of points to these centers using a form of Euclidian distance. This process is repeated until the cluster centers are stabilized. FCM partitions a collection of data points into fuzzy groups. Unlike k-means where data point must exclusively belong to one cluster center, FCM finds a cluster center in each group in order to minimize the objective function of the dissimilarity measure. FCM employs fuzzy partitioning so that a given data point can belong to several groups with the degree of belongingness specified by membership grades between 0 and 1. The membership of each data point corresponding to each cluster center will be calculated on the basis of distance between the cluster center and the data point, using exp. [1], ………. [1] where µj(xi) : is the membership of xi in the jth cluster dji : is the distance of xi in cluster cj m : is the fuzzification parameter p : is the number of specified clusters dki : is the distance of xi in cluster Ck The new cluster centers are calculated with these membership values using the exp. [2]. ………. [2] where Cj : is the center of the jth cluster xi : is the ith data point µj : the function which returns the membership m : is the fuzzification parameter
  • 4. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 4, July-August (2013), © IAEME 425 More the data is near to the cluster center more is its membership towards the particular cluster center. Clearly, summation of membership of each data point should be equal to one. After each iteration, the membership and cluster centers are updated. Good characteristics of this algorithm are, it is simple to implement, robust in behavior, can handle multidimensional data, gives best result for overlapped data set and can model uncertainty within the data. The main drawback of this algorithm is that it is unable to handle outlier points effectively because of the restriction that the sum of membership values of a data point in all the clusters must be one. This tends to give high membership values for the outlier points. It is also incapable to calculate the membership value if the distance of a data point is zero. We get better result but at the cost of more number of iteration B. Fuzzy clustering based algorithm proposed by Binu Thomas and Raju Ghave [21]: They have proposed a novel fuzzy clustering method in which they have modified the membership function proposed by Fuzzy C-Means algorithm. In this method a very low membership value is assigned to the outlier points to handle unstructured natural data. In FCM, many limitations arise because the membership of a point in a cluster is calculated based on its membership in other clusters. In this method, the membership of a point in a cluster depends only on its distance in that cluster calculated by a simple expression exp. [3]. ………. [3] Where µj(xi) : is the membership of xi in the jth cluster dji : is the distance of xi in cluster cj Max(dj) : is the maximum distance in the cluster cj With this it is possible to calculate the membership value if the distance of a data point is zero which is the limitation of FCM. It is more proficient in handling the natural data with outlier points by allocating very low membership values to the outlier points. It is far better in the calculation of new cluster centers. This method has limitations in exploring highly structured crisp data which is free from outlier points. C. Fuzzy clustering based algorithm proposed by Moh'd Belal et.al. [22]: Moh'd Belal Al-Zoubi, Ali Al-Dahoud, Abdelfatah A. Yahya proposed an efficient method for outlier detection based on fuzzy clustering techniques. This technique is based on the key assumption that normal objects belong to large and dense clusters, whereas outliers form very small clusters. Here a small cluster is a cluster with fewer points than half the average number of points in the other clusters. In this approach, FCM algorithm is performed first to produce a set of k clusters as well as the objective function (OF) and Compute the threshold (T) to determine outliers. Then small clusters are determined and the points that belong to these clusters are considered as outliers. After this if any outliers are left then those would be detected in the remaining clusters by temporary removing a point from the data set and re-calculating the objective function. If a noticeable change is observed in the Objective Function (OF), the point is considered an outlier. This approach is efficient to locate the outliers when applied to different data sets. However, the proposed method is very time consuming because the FCM algorithm has to run multiple times.
  • 5. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 4, July-August (2013), © IAEME 426 III. NEURAL NETWORK BASED OUTLIER DETECTION Neural networks serve to learn, process, and predict the information using layers of interconnected units resembling the neurons that comprise the nervous system of the human brain. Neural networks have the advantage of powerful modeling ability. Neural network does not involve the complex process and has the advantage of requiring poor or no a priori assumption on the considered data. There are many outlier detection techniques based on neural networks. A. Outlier Detection Using Replicator Neural Networks[23] Hawkins S., H. X. Williams, G.J. and Baxter R.A. have proposed replicator neural network (RNN) based technique to find outliers in large multivariate databases. Earlier Replicator Neural Network was used for its data compression capabilities in several applications of image and speech processing [24, 25]. RNNs have a flexible, non-parametric representation of clusters and hence they have used it as a powerful tool for outlier detection [26]. In this approach, a feed-forward multi-layer neural network is used with three hidden layers sandwiched between the input and output layers which have n units each, corresponding to the n features of the training data. In order to minimize the average reconstruction error, the number of units in the three hidden layers is chosen experimentally across all training patterns. To develop a compressed and implicit model of the data during training, the input variables are also the output variables. A measure of outlyingness (called the Outlier Factor) of individuals is then developed as the reconstruction error of individual data points. The Outlier Factor of the ith data record OFi as the measure of outlyingness is defined by the average reconstruction error over all features (variables) as given in exp. [4]. ………. [4] where, n is the number of features, xij is the input value and Oij is the input value. The OF is evaluated for all data records using the trained RNN. Instead of the usual sigmoid activation function for the middle hidden layer a staircase-like function with parameters N (number of activation levels) and a3 (transition rate from one level to the next) are employed. This activation function provides the data compression as it continuously divides distributed data points into a number of discrete valued vectors. The mapping to discrete categories naturally places the data points into a number of clusters which is essential for outlier detection. For scalability the RNN is trained with a smaller training set and then applied to all the data to calculate their outlyingness. Neural Network methods often have difficulty with such smaller datasets, though RNN performs satisfactory for both small and large datasets. But its performance degrades with datasets containing radial outliers and so it is not suggested for this type of dataset. B. Multiple Outlier Detection in Multivariate Data Using Self-Organizing Maps Title [27] Nag A.K., Mitra A. and Mitra S. proposed an artificial intelligence technique of self- organizing map (SOM) based non-parametric method for outlier detection. It can be used to detect outliers from large multidimensional datasets and also provides information about the entire outlier neighborhood. SOM is a feed forward neural network that uses an unsupervised training algorithm, and through a process called self-organization, configures the output units into a topological representation of the original data (Kohonen 1997). SOM produces a topology-preserving mapping of the multidimensional data cloud onto lower dimensional visualizable plane and hence provides an easy way of detection of multidimensional outliers in the data, at respective levels of influence.
  • 6. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 4, July-August (2013), © IAEME 427 The outcome of the SOM algorithm is the set of weight vectors which are used to connect the multidimensional input vector space and the output grid. A trained weight vector associated with a grid on the output layer gives a unique identification of that unit. First the training process of winner selection and weight updation is done. Then the appropriate visualization of the topology of map is done and the outliers are detected by interpreting the results. Advantage includes the easy implementation as it does not require any priori assumption on the variable. C. Subtractive Clustering Based RBF Neural Network Model for Outlier Detection [28] Peng Yang, Qingsheng Zhu and Xun Zhong proposed a RBF neural network model using subtractive clustering algorithm for selecting the hidden node centers, which can achieve faster training speed. This method has adopted the Subtractive Clustering (SC) to determine the hidden node centers in RBF network since it allows a scatter input-output space partitioning, which can lead to a proper network structure and achieve computation efficiency. Accurate prediction is achieved by minimizing the variances of the nodes in the hidden layer. An RBF network is a three-layer feed forward neural network which consists of an input layer, a hidden layer and an output layer. The network is trained to implement a desired input-output mapping by producing incremental changes of the weights of the network. Finally, according to the degree of outlier for each input instance candidate outliers can be obtained. Advantage of this method is that it can perform more accurate prediction with lower false positive rate and higher detection rate and it can be an effective solution for detecting outliers. IV. ADVANTAGES AND DISADVANTAGES OF FUZZY - NEURAL CLUSTERING BASED TECHNIQUES A. Advantages and disadvantages of fuzzy based techniques • The use of a fuzzy set approach to pattern classification inherently provides degree of membership information that is extremely useful in higher level decision making. • Fuzzy logic is capable of supporting, to a reasonable extent, human type reasoning in natural form by allowing partial membership for data items in fuzzy subsets [29]. • Fuzzy logic can handle uncertainty at various stages. • With this approach knowledge can be expressed in terms of If–THEN rules. • Disadvantage of fuzzy clustering is that with a growing number of objects the amount of output of the results becomes huge, so that the information received often cannot be worked up. B. Advantages and disadvantages of neural networks based techniques • Neural networks have an adaptive capability of learning new patterns and refine existing ones. • These techniques are capable of learning complex class boundaries and have ability to learn complex nonlinear input-output relationships • As neural networks are massively parallel in nature, Complex computational task can be performed in great speed. • They are nonparametric that is it can work without any prior information available. • Neural network have disadvantages such as lower detection precision in case of small learning sample size, weaker detection stability.
  • 7. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 4, July-August (2013), © IAEME 428 V. HYBRID APPROACHES INVOLVING BOTH FUZZY LOGIC AND NEURAL NETWORKS FOR PATTERN RECOGNITION We have surveyed that the variations of fuzzy logic and neural network are best suitable for classification and clustering. This hybrid approach is used for speaker identification, real time heart diseases detection, face recognition, etc [30, 31, 32, 33]. However, not much work is done for using this hybrid approach for the outlier detection. Wang G., Jinxing Hao, Jian Ma, Lihua Huang have proposed a novel approach for Intrusion Detection System to solve network security problems based on the hybrid approach [34]. The technique is called Fuzzy Clustering Artificial Neural Network (FC-ANN). The basic objective was to enhance the detection precision for low-frequent attacks and detection stability. The general procedure of FC-ANN approach has three major stages. In the first stage, they have used one of the popular soft clustering techniques, fuzzy c-means clustering to divide the training data into several subsets. In the second stage, based on different training sets, different neural networks are trained. They have used classic feed-forward neural networks trained with the back-propagation algorithm for training. In the third stage, to eliminate the errors of different ANNs, a meta-learner, fuzzy aggregation module is used to learn again and combine the results of different ANN’s. This method has employed divide and conquer approach. Fuzzy clustering is used to divide the whole training set into subsets which have less number and lower complexity. Thus the ANN can learn each subset more quickly, robustly and precisely so that low-frequent attacks can also be detected. The proposed technique is found better in performance compared to BPNN and other well- known methods such as decision tree, the naïve Bayes in terms of detection precision and detection stability. VI. CONCLUSION AND FUTURE DIRECTIONS In this survey we have discussed different ways in which the problem of cluster based outlier detection has been formulated in literature, and we attempted to provide an overview of the huge literature on different techniques. Different outlier detection algorithms are available which are based on fuzzy logic and neural network, which have their own advantages and disadvantages. Some researchers have used hybrid approach, based on fuzzy logic and neural network to combine the advantages and to overcome the disadvantages of both the techniques for various domains which are mentioned in section V. But very few have used this hybrid approach for outlier detection. Thus, there is still a wide scope to apply modern fuzzy-neural techniques in this area to improve the quality and required time of outlier detection process. REFERENCES [1] Silvia Cateni, Valentina Colla and Marco Vannucci, Outlier Detection Methods for Industrial Applications, Advances in Robotics, Automation and Control, Book edited by: Jesús Arámburo and Antonio Ramírez Treviño, ISBN 78-953-7619-16-9, pp. 472, October 2008, I- Tech, Vienna, Austria [2] Han, J. and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann, 2nd edition, 2006. [3] Al-Zoubi, M. B, A. M. Kamel and M. J. Radhy Techniques for Image Enhancement and Noise Removal in the Spatial Domain, WSEAS Transactions on Computers, Vol. 5, No. 5, 2006, pp. 1047-1052.
  • 8. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 4, July-August (2013), © IAEME 429 [4] Yang J. and Z. Zhao, A Fast Geometric Rectification of Remote Sensing Imagery Based on Feature Ground Control Point Database, WSEAS Transactions on Computers, Vol. 8, No. 2, 2009, pp. 195-204. [5] Bolton, R. and D. J. Hand, Statistical Fraud Detection: A Review, Statistical Science, Vol. 17, No. 3, 2002, pp. 235-255. [6] Lane, T. and C. E. Brodley. Temporal Sequence Learning and Data Reduction for Anomaly Detection, 1999. ACM Transactions on Information and System Security, Vol. 2, No. 3, 1999, pp. 295-331. [7] Chiu, A. and A. Fu, Enhancement on Local Outlier Detection. 7th International DatabaseEngineering and Application Symposium (IDEAS03), 2003, pp. 298-307. [8] Barnett, V. and T. Lewis, Outliers in Statistical Data, John Wiley, 1994. [9] Hodge, V. and J. Austin, A Survey of Outlier Detection Methodologies, Artificial Intelligence Review, Vol. 22, 2004, pp. 85–126. [10] Victoria J. Hodge and Jim Austin, A Survey of Outlier Detection Methodologies, Kluwer Academic Publishers, 2004 [11] L.Zadeh, Fuzzy sets, Inform. And control, vol.8, pp. 338-353, ,1965. [12] Sankar K. Pal, P. Mitra, Data Mining in Soft Computing Framework: A Survey, IEEE transactions on neural networks, vol. 13, no. 1, January 2002 [13] M.Deepa, P. Revathy, Validation of Document Clustering based on Purity and Entropy measures, International Journal of Advanced Research in Computer and Communication Engineering Vol. 1, Issue 3, May 2012 [14] E. Cox, Fuzzy Modeling And Genetic Algorithms For Data Mining And Exploration, Elsevier, 2005 [15] Bezdek, J, L. Hall, and L. Clarke, Review of MR Image Segmentation Techniques Using Pattern Recognition, Medical Physics, Vol. 20, No. 4, 1993, pp. 1033–1048. [16] Pham, D, Spatial Models for Fuzzy Clustering, Computer Vision and Image Understanding, Vol. 84, No. 2, 2001, pp. 285–297. [17] Rignot, E, R. Chellappa, and P. Dubois, Unsupervised Segmentation of Polarimetric SAR Data Using the Covariance Matrix, IEEE Trans. Geosci. Remote Sensing, Vol. 30, No. 4, 1992, pp. 697–705. [18] Al- Zoubi, M. B., A. Hudaib and B Al- Shboul A Proposed Fast Fuzzy C-Means Algorithm, WSEAS Transactions on Systems, Vol. 6, No. 6, 2007, pp. 1191-1195. [19] Maragos E. K and D. K. Despotis, The Evaluation of the Efficiency with Data Envelopment Analysis in Case of Missing Values: A fuzzy approach, WSEAS Transactions on Mathematics, Vol. 3, No. 3, 2004, pp. 656-663 [20] Chiu, S. L. (1994). Fuzzy model identification based on cluster estimation. Journal of Intelligent and Fuzzy Systems, 2, 267–278. [21] Binu Thomas and Raju G, A Novel Fuzzy Clustering Method for Outlier Detection in Data Mining, International Journal of Recent Trends in Engineering, Vol. 1, No. 2, May 2009 [22] Moh'd Belal Al-Zoubi, Ali Al-Dahoud, Abdelfatah A. Yahya, New Outlier Detection Method Based on Fuzzy Clustering, WSEAS transactions on information science and applications, Issue 5, Volume 7, May 2010, pp. 681 – 690. [23] Hawkins, S.; He, X.; Williams, G.J. & Baxter, R.A., Outlier detection using replicator neural networks. Proceedings of the 5th international conference on Knowledge Discovery and Data Warehousing, 2002. [24] D. H. Ackley, G. E. Hinton, and T. J. Sejinowski. A learning algorithm for boltzmann machines. Cognit. Sci., 9:147–169, 1985. [25] R. Hecht-Nielsen. Replicator neural networks for universal optimal source coding. Science, 269(1860-1863), 1995.
  • 9. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 4, July-August (2013), © IAEME 430 [26] Williams, G.; Baxter, R.; He, H. & Hawkison,S., A comparative study of RNN for outlier detection in data mining, Proceedings of the IEEE International Conference on Data Mining, pp. 709–712, 9-12 December 2002, Australia. [27] Nag, A.K.; Mitra, A. & Mitra, S., Multiple outlier Detection in Multivariate Data Using Self- Organizing Maps Title, Computational Statistical, N.20, 2005, pp.245-264 [28] Peng Yang, Qingsheng Zhu and Xun Zhong, Subtractive Clustering Based RBF Neural Network Model for Outlier Detection, Journal Of Computers, Vol. 4, No. 8, August 2009, pp. 755-762 [29] Varun chandola, arindam banerjee and vipin kumar, outlier detection : a survey [30] N. P. Jawarkar, R. S. Holambe and T. K. Basu, Use of fuzzy min-max neural network for speaker identification, Proc. IEEE Int. Conference on Recent Trends in Information Technology (ICRTIT 2011), MIT, Anna university, Chennai, Jun.-3-5, 2011. [31] S. S. Panicker, P. S. Dhabe, M. L. Dhore, Fault Diagnosis Using Fuzzy Min-Max Neural Network Classifier, CiiT International Journal of Artificial Intelligent Systems and Machine Learning, Issue Jul. 2010. http://www.ciitresearch.org/aimljuly2010.html. [32] M. Mohammadi, R. V. Pawar, P. S. Dhabe, Heart Diseases Detection Using Fuzzy Hyper Sphere Neural Network Classifier, CiiT International Journal of Artificial Intelligent Systems and Machine Learning, Issue Jul. 2010. http://www.ciitresearch.org/aimljuly2010.html. [33] Nilam Upasani and M. Potey, Generalized Ring averaging: a new method for left and right directional illumination invariant face recognition for frontal poses and their small variants, Int. Conf. and Workshop on Emerging Trends in Technology, Mumbai, ACM-ISBN 978-1- 60558-812-4, 2010. [34] Wang G., Jinxing Hao, Jian Ma, Lihua Huang, A new approach to intrusion detection using Artificial Neural Networks and fuzzy clustering. Expert Systems with Applications (2010), doi:10.1016/j.eswa.2010.02.102 [35] Hawkins d. (1980). Identification of outliers. Chpaman and hall. London. [36] Knorr, E. and R. Ng, Algorithms for Mining Distance-based Outliers in Large Data Sets, Proc.the 24th International Conference on Very Large Databases (VLDB), 1998, pp. 392- 403. [37] Knorr, E., R. Ng, and V. Tucakov, Distancebased Outliers: Algorithms and Applications. VLDB Journal, Vol. 8, No. 3-4, 2000, pp. 237- 253. [38] Rousseeuw, P. and A. Leroy, Robust Regression and Outlier Detection, 3rd ed., John Wiley & Sons, 1996.Chandola, V., Banerjee, A. and Kumar, V. (2009) Anomaly detection: A survey, ACM Computing Surveys (CSUR), ACM Digital Library,Vol. 41, Issue 3, Pp.1-58. [39] Ramaswami, S., R. Rastogi and K. Shim,Efficient Algorithm for Mining Outliers from Large Data Sets. Proc. ACM SIGMOD, 2000, pp. 427-438. [40] Angiulli, F. and C. Pizzuti, Outlier Mining in Large High-Dimensional Data Sets, IEEE Transactions on Knowledge and Data Engineering, Vol. 17 No. 2, 2005, pp. 203- 215. [41] Breunig, M., H. Kriegel, R. Ng and J.Sander, 2000. LOF: Identifying Density-Based Local Outliers. In Proceedings of ACM SIGMOD International Conference on Management of Data. ACM Press, 2000, pp. 93–104. [42] Papadimitriou, S., H. Kitawaga, P. Gibbons, and C. Faloutsos, LOCI: Fast Outlier Detection Using the Local Correlation Integral. Proc. Of the International Conference on Data Engineering, 2003, pp. 315-326. [43] Gath, I and A. Geva, Fuzzy Clustering for the Estimation of the Parameters of the Components of Mixtures of Normal Distribution, Pattern Recognition Letters, Vol. 9, 1989, pp. 77-86. [44] Cutsem, B and I. Gath, Detection of Outliers and Robust Estimation using Fuzzy Clustering, Computational Statistics & Data Analyses, Vol. 15, 1993, pp. 47-61.
  • 10. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 4, July-August (2013), © IAEME 431 [45] Jiang, M., S. Tseng and C. Su, Twophase Clustering Process for Outlier Detection, Pattern Recognition Letters, Vol.22, 2001, pp. 691- 700. [46] G. Raju, Th. Shanta Kumar, Binu Thomas, Integration of Fuzzy Logic in Data Mining: A comparative Case Study, Proc. of International Conf. on Mathematicsand Computer Science, Loyola College, Chennai, 128-136,2008 [47] M. I. Petrovskiy, “Outlier Detection Algorithms in Data Mining Systems”, Programming and Computer Software, Vol. 29, No. 4, pp. 228–237, 2003. [48] Munoz, A. & Muruzabal, J. Self organizing maps for outliers detection, Elsevier, Neurocomputing, N°18, pp.33-60, August 1997. [49] S. S. Shirish and A. G. Ashok, “Use of Instance Typicality for Efficient Detection of Outliers with Neural Network Classifiers,” Proceedings of the 9th International Conference on Information Technology, 2006. [50] T. C. Lu, J. C. Juang and G. R. Yu, “On-line Outliers Detection by Neural Network with Quantum Evolutionary Algorithm,” Proceedings of International Conference on Innovative Computing, Information and Control, Kumamoto, 2007. [51] Patrick K. Simpson, Fuzzy Min-Max Neural Networks-Part 1: Classification, IEEE Trans. on Neural Network,” Vol. 3, No.5, pp. 776-786, Sept. 1992. [52] Patrick K. Simpson, Fuzzy min-max neural networks—Part 2: Clustering, IEEE Trans. on Fuzzy Systems,” Vol. 1, No. 1, pp. 32–45, Feb. 1993. [53] Bogdan G. and Andrzej B., General Fuzzy Min-Max Neural Network for Clustering and Classification, IEEE Trans. on Neural Networks, Vol.11, No.3, May 2000. [54] A. Joshi, N. Ramakrishman, E. N. Houstis, J. R. Rice, On neurobiological, neurofuzzy, machine learning, and statistical pattern recognition techniques, IEEE Trans. on Neural Networks, Vol. 8, No.1, pp. 18-31, Jan. 1997. [55] M. Marsella, M. Meneganti, and R. Tagliaferri, Detection of anomalies with fuzzy neural networks, Neural Nets WIRN Vietri 95, M. Marinaro and R. Tagliaferri, Eds. Singapore: World Scientific, pp. 311–317, 1996. [56] M. Meneganti, F. S. Saviello, and R. Tagliaferri, Fuzzy neural networks for classification and detection of anomalies, IEEE Trans. on Neural Networks, Vol. 9, No.5, Sept. 1998. [57] M. Meneganti, F. Saviello, and R. Tagliaferri, Fuzzy neural networks for classification and detection of anomalies, Proc. 7th IFSA World Congress Prague, M. Mares et al. Eds. Prague, Czechoslovakia: Academia, pp. 561–566, 1997. [58] A. V. Nandedkar and P. K. Biswas, A fuzzy in max neural network classifier with compensatory neuron architecture, Proc. 17th Int. Conference on pattern recognition (ICPR), 2004. [59] Liu J, H. Xin, Yang J., A fuzzy min-max neural network classifier based on centroid, Proc. 30th Chinese Control conference, Yantai, China, pp. 2749-2763, Jul. 22-24, 2011. [60] Hari Om, Alok Kumar Gupta, Design of Host based Intrusion Detection System using Fuzzy Inference Rule, International Journal of Computer Applications (0975 – 8887), Volume 64– No.9, February 2013. [61] M. Karthikeyan, M. Suriya Kumar and Dr. S. Karthikeyan, “A Literature Review on the Data Mining and Information Security”, International Journal of Computer Engineering & Technology (IJCET), Volume 3, Issue 1, 2012, pp. 141 - 146, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375. [62] Chayadevi M.L and Dr. Raju G.T, “Classification and Clusters Extraction from Bacterial Images using Statistical and Neural Network Approaches”, International Journal of Computer Engineering & Technology (IJCET), Volume 3, Issue 2, 2012, pp. 288 - 296, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.