# FUAT – A Fuzzy Clustering Analysis Tool

For Full Paper:

http://www.scribd.com/doc/64588432/FUAT-%E2%80%93-A-Fuzzy-Clustering-Analysis-Tool

### FUAT – A Fuzzy Clustering Analysis Tool

1. A. Selman BOZKIR, Ebru Akçapınar Sezer (Hacettepe University, Computer Eng. Dept.)
2. Perspective
   - What is clustering and FCM?
   - Principle of Fuzzy Clustering
   - The difficulties in FCM
   - Proposed solution: FUAT
   - Details
   - Conclusion
3. Clustering
   - Cluster analysis, or clustering, is the task of assigning a set of objects to groups (called clusters) so that the distances between objects in the same cluster (intra-class) are smaller than the distances between objects in different clusters (inter-class).
4. Clustering (Schemas)
   - Hard clustering (e.g. k-means): each data element belongs to exactly one cluster.
   - Soft clustering (e.g. EM, FCM): elements can belong to more than one cluster, and each element has an associated set of membership levels.
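The difference between the two schemas can be illustrated for a single point. The sketch below is only an illustration under assumptions: the function names, the two centroids and the fuzziness value `m=2.0` are hypothetical, and the soft assignment uses the standard fuzzy c-means membership formula rather than anything taken from FUAT itself.

```python
import math

def distances(point, centroids):
    """Euclidean distance from one point to each centroid."""
    return [math.dist(point, c) for c in centroids]

def hard_assignment(point, centroids):
    """k-means style: index of the single nearest centroid."""
    d = distances(point, centroids)
    return d.index(min(d))

def soft_assignment(point, centroids, m=2.0):
    """FCM style: one membership level per cluster, summing to 1."""
    d = distances(point, centroids)
    # A point lying exactly on a centroid belongs fully to that cluster.
    if 0.0 in d:
        hits = [1.0 if dist == 0.0 else 0.0 for dist in d]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    inv = [dist ** -power for dist in d]
    total = sum(inv)
    return [v / total for v in inv]

# Hypothetical example data: two centroids and one point closer to the first.
centroids = [(0.0, 0.0), (4.0, 0.0)]
point = (1.0, 0.0)
hard = hard_assignment(point, centroids)
soft = soft_assignment(point, centroids)
print(hard)  # index of the nearest centroid
print(soft)  # membership levels, roughly 0.9 and 0.1 here
```

The hard schema discards the information that the point is also somewhat close to the second centroid, while the soft schema keeps it as a small membership level.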
5. Fuzzy c-means clustering
   - Based on Zadeh's fuzzy set theory.
   - Introduced by Dunn (1973) and generalized by Bezdek (1981).
   - A soft clustering method.
   - Combines the c-means approach with handling of the fuzziness that exists in the data.
   - One of the most popular unsupervised clustering algorithms, widely used in pattern recognition, image recognition, gene classification, etc. [1]
6. FCM in Principle
   - Takes the number of clusters c as an input parameter.
   - Segments the data into fuzzy clusters by providing a typical prototype for each of them.
   - The link between objects and cluster prototypes is expressed via a membership matrix.
   - FCM minimizes the objective function

     $$J_m = \sum_{i=1}^{N} \sum_{j=1}^{c} u_{ij}^{m} \, \lVert x_i - c_j \rVert^2$$

     where $N$ is the number of data points, $u_{ij}$ is the membership degree of $x_i$ in cluster $j$, $m$ is a real number greater than 1 denoting the fuzziness coefficient, $x_i$ is the $i$-th $d$-dimensional data point and $c_j$ is the centroid of cluster $j$.
   - Fuzzy segmentation is then carried out by iteratively optimizing this objective.
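The optimization mentioned above is usually carried out by alternating two updates: each centroid becomes the mean of the data weighted by the memberships raised to the power m, and each membership is recomputed from the relative distances to the new centroids. The following is a minimal sketch of that textbook procedure in plain Python, not FUAT's own implementation; the function name `fcm`, its defaults (`m=2.0`, `tol`, `seed`) and the sample data are assumptions made for illustration.

```python
import math
import random

def fcm(data, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Fuzzy c-means by alternating optimization.

    Returns (centroids, u), where u[i][j] is the membership of data[i]
    in cluster j and every row of u sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Random initial membership matrix, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])

    power = 2.0 / (m - 1.0)
    centroids = []
    for _ in range(max_iter):
        # Centroid update: mean of the data weighted by u_ij^m.
        centroids = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centroids.append(tuple(
                sum(w[i] * data[i][k] for i in range(n)) / total
                for k in range(dim)
            ))
        # Membership update from distances to the new centroids.
        shift = 0.0
        for i in range(n):
            d = [math.dist(data[i], cj) for cj in centroids]
            if 0.0 in d:
                hits = [1.0 if x == 0.0 else 0.0 for x in d]
                new_row = [h / sum(hits) for h in hits]
            else:
                inv = [x ** -power for x in d]
                s = sum(inv)
                new_row = [v / s for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        # Stop once no membership changed by more than tol.
        if shift < tol:
            break
    return centroids, u

# Hypothetical toy data: two well-separated groups of three points.
data = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, u = fcm(data, c=2)
for point, row in zip(data, u):
    print(point, [round(v, 3) for v in row])
```

The result depends on the random initial membership matrix, which is one of the difficulties of FCM: a different seed can swap the cluster indices or, on harder data, lead to a different local optimum.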
7. Difficulties of Fuzzy c-means clustering
   - As stated by [2], there are three major difficulties:
   - (1) How to detect the optimal number of clusters?
   - (2) How to choose the initial cluster centroids?
   - (3) How to evaluate cluster results, which are characterized by large variations in cluster shape, cluster density and the number of points in different clusters?
8. Solution: FUAT
   - Analyze, explore and visualize different aspects of the obtained fuzzy clusters.
   - Turn the black box of fuzzy clustering into a transparent box.
9. FUAT – General Overview
   - FCM and EM based clustering
   - Automatic cluster count estimator for non-domain-experts
   - Various interactive viewers for different insights
   - Zooming, filtering and saving are available for results
   - CSV file support
   - The R connectivity package (StatConn's R(D)COM), ZedGraph and Microsoft GLEE were employed during development
   - Developed in C#.NET
10. FUAT: General FCM Settings and Membership Table
11. FUAT: Automatic cluster count detection is based on the Bayesian Information Criterion (BIC), as implemented in the EM framework of the Mclust package of R.
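The idea behind BIC-based cluster count detection can be sketched as follows: fit a Gaussian mixture with EM for each candidate number of components, score each fit with BIC, and keep the best-scoring count. The sketch below is a simplified one-dimensional stand-in written in plain Python, not the Mclust code that FUAT calls through R. The function names, the quantile-based initialization, the variance floor and the synthetic data are all assumptions, and the score is written as 2·logL − p·ln(n) so that larger is better, which is the sign convention assumed here for mclust.

```python
import math
import random

def gmm_loglik_1d(xs, k, iters=200):
    """EM for a 1-D Gaussian mixture; returns the log-likelihood of the last E-step."""
    n = len(xs)
    ordered = sorted(xs)
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n
    # Assumed variance floor that keeps components from collapsing onto single points.
    floor = max(0.01 * var_all, 1e-12)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [max(var_all, floor)] * k
    weights = [1.0 / k] * k
    loglik = 0.0
    for _ in range(iters):
        # E-step: responsibilities, computed in log space for stability.
        resp, loglik = [], 0.0
        for x in xs:
            logp = [math.log(w) - 0.5 * math.log(2 * math.pi * v)
                    - (x - mu) ** 2 / (2 * v)
                    for w, mu, v in zip(weights, means, variances)]
            top = max(logp)
            scaled = [math.exp(lp - top) for lp in logp]
            total = sum(scaled)
            loglik += top + math.log(total)
            resp.append([s / total for s in scaled])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, floor)
    return loglik

def bic_scores(xs, candidates):
    """BIC per candidate component count (larger is better in this convention)."""
    n = len(xs)
    scores = {}
    for k in candidates:
        n_params = 3 * k - 1  # k means + k variances + (k - 1) free weights
        scores[k] = 2.0 * gmm_loglik_1d(xs, k) - n_params * math.log(n)
    return scores

# Hypothetical synthetic data: two groups centered at 0 and 10.
rng = random.Random(42)
xs = [rng.gauss(0.0, 1.0) for _ in range(30)] + [rng.gauss(10.0, 1.0) for _ in range(30)]
scores = bic_scores(xs, range(1, 5))
best_k = max(scores, key=scores.get)
print(best_k, scores)
```

On data like this, the two-component model should score well above the single-component one, because the gain in log-likelihood far outweighs the penalty for the three extra parameters. Mclust additionally compares several covariance structures in higher dimensions, which this sketch leaves out.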
12. FUAT: Cluster Population Distribution Viewer
13. FUAT: Cluster Centroids Viewer
14. FUAT: Cluster Membership Histogram Viewer
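The slides show the population distribution and membership histogram viewers only as screenshots. As a rough guess at the kind of summary such viewers could be built on, the sketch below derives both from a membership matrix. The function names, the rule of counting each point in its highest-membership cluster, the bin count and the sample matrix are assumptions, not details confirmed by the slides.

```python
def cluster_populations(u):
    """Count points per cluster, assigning each point to its highest membership."""
    counts = [0] * len(u[0])
    for row in u:
        counts[row.index(max(row))] += 1
    return counts

def membership_histogram(u, cluster, bins=5):
    """Histogram of the membership degrees in [0, 1] for one cluster."""
    hist = [0] * bins
    for row in u:
        # A membership of exactly 1.0 falls into the last bin.
        idx = min(int(row[cluster] * bins), bins - 1)
        hist[idx] += 1
    return hist

# Hypothetical membership matrix: 4 points, 2 clusters, rows sum to 1.
u = [[0.9, 0.1], [0.7, 0.3], [0.55, 0.45], [0.25, 0.75]]
populations = cluster_populations(u)
histogram = membership_histogram(u, cluster=0)
print(populations)  # [3, 1]
print(histogram)    # [0, 1, 1, 1, 1]
```

A histogram concentrated near 0 and 1 suggests crisp, well-separated clusters, while mass in the middle bins points to elements shared between clusters.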
15. FUAT: Points of Interest Viewer
16. FUAT: Cluster Dependency Viewer
17. Conclusion
    - FUAT is useful for gaining insight from cluster analysis.
    - Clusters can be analyzed both separately and in an integrated way, which helps overcome the difficulties of using FCM.
    - R can be used from native applications, via suitable interfaces, to power third-party ML and DM applications.
    - Some examples of practical benefits:
      - Useful for revealing the inner structure of imbalanced data sets
      - Useful for detecting important and dominant attributes in data sets
18. References
    - [1] Jingwei Liu, Meizhi Xu, "Kernelized fuzzy attribute C-means clustering algorithm", Fuzzy Sets and Systems 159 (2008) 2428–2445.
    - [2] Dae-Won Kim, Kwang H. Lee, Doheon Lee, "A novel initialization scheme for the fuzzy c-means algorithm for color clustering", Pattern Recognition Letters 25 (2004) 227–237.
19. Thanks for listening. Questions?