Java: A Fast Clustering-Based Feature Subset Selection Algorithm for High-Dimensional Data

  1. A FAST CLUSTERING-BASED FEATURE SUBSET SELECTION ALGORITHM FOR HIGH-DIMENSIONAL DATA

     ABSTRACT: Feature selection involves identifying a subset of the most useful features that produces results compatible with those of the original, entire set of features. A feature selection algorithm may be evaluated from both the efficiency and the effectiveness points of view. Efficiency concerns the time required to find a subset of features, while effectiveness relates to the quality of that subset. Based on these criteria, a fast clustering-based feature selection algorithm (FAST) is proposed and experimentally evaluated in this paper.

     The FAST algorithm works in two steps. In the first step, features are divided into clusters by using graph-theoretic clustering methods. In the second step, the most representative feature that is strongly related to the target classes is selected from each cluster to form a subset of features.

     Because features in different clusters are relatively independent, the clustering-based strategy of FAST has a high probability of producing a subset of useful and independent features. To ensure the efficiency of FAST, we adopt the efficient minimum-spanning tree (MST) clustering method.

     The efficiency and effectiveness of the FAST algorithm are evaluated through an empirical study. Extensive experiments are carried out to compare FAST with several representative feature selection algorithms. The results, on 35 publicly available real-world high-dimensional image, microarray, and text data sets, demonstrate that FAST not only produces smaller subsets of features but also improves the performance of the four types of classifiers.

     ECWAY TECHNOLOGIES
     IEEE PROJECTS & SOFTWARE DEVELOPMENTS
     OUR OFFICES @ CHENNAI / TRICHY / KARUR / ERODE / MADURAI / SALEM / COIMBATORE
     CELL: +91 98949 17187, +91 875487 2111 / 3111 / 4111 / 5111 / 6111
     VISIT: www.ecwayprojects.com
     MAIL TO: ecwaytechnologies@gmail.com
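The two steps described in the abstract (MST-based clustering of features, then one representative per cluster) can be illustrated with a minimal Java sketch. The abstract does not say which correlation measure, edge weights, or cutting rule FAST uses, so everything specific here is an assumption made for illustration: the class name `FastSketch`, the use of absolute Pearson correlation as the relevance measure, the distance `1 - |corr|`, and the `cutThreshold` parameter are all hypothetical and are not taken from the paper.

```java
import java.util.Arrays;

// Illustrative sketch only: not the authors' implementation of FAST.
final class FastSketch {

    // Absolute Pearson correlation, used here as a stand-in relevance measure (assumption).
    static double absCorr(double[] x, double[] y) {
        int n = x.length;
        double mx = 0, my = 0;
        for (int i = 0; i < n; i++) { mx += x[i]; my += y[i]; }
        mx /= n;
        my /= n;
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - mx, dy = y[i] - my;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        if (sxx == 0 || syy == 0) return 0; // constant column: treat as uncorrelated
        return Math.abs(sxy / Math.sqrt(sxx * syy));
    }

    // features[j] holds the values of feature j; target holds numeric class labels.
    // cutThreshold is a hypothetical parameter: MST edges longer than it are removed.
    static int[] select(double[][] features, double[] target, double cutThreshold) {
        int m = features.length;
        if (m == 0) return new int[0];

        // Step 1a: Prim's algorithm on the complete feature graph,
        // with edge length 1 - |corr| (assumed distance).
        double[] best = new double[m];   // length of the edge that attaches each vertex
        int[] parent = new int[m];       // MST parent of each vertex (-1 for the root)
        int[] order = new int[m];        // vertices in the order they joined the tree
        boolean[] inTree = new boolean[m];
        Arrays.fill(best, Double.POSITIVE_INFINITY);
        Arrays.fill(parent, -1);
        best[0] = 0;
        for (int step = 0; step < m; step++) {
            int u = -1;
            for (int v = 0; v < m; v++) {
                if (!inTree[v] && (u == -1 || best[v] < best[u])) u = v;
            }
            inTree[u] = true;
            order[step] = u;
            for (int v = 0; v < m; v++) {
                if (inTree[v]) continue;
                double d = 1.0 - absCorr(features[u], features[v]);
                if (d < best[v]) { best[v] = d; parent[v] = u; }
            }
        }

        // Step 1b: remove long MST edges; each remaining subtree is one cluster.
        int[] cluster = new int[m];
        int clusters = 0;
        for (int v : order) { // a parent always appears in 'order' before its children
            boolean startsCluster = parent[v] == -1 || best[v] > cutThreshold;
            cluster[v] = startsCluster ? clusters++ : cluster[parent[v]];
        }

        // Step 2: from each cluster keep the feature most related to the target.
        int[] rep = new int[clusters];
        double[] repScore = new double[clusters];
        Arrays.fill(rep, -1);
        for (int j = 0; j < m; j++) {
            double score = absCorr(features[j], target);
            int c = cluster[j];
            if (rep[c] == -1 || score > repScore[c]) { rep[c] = j; repScore[c] = score; }
        }
        Arrays.sort(rep);
        return rep; // indices of the selected features, ascending
    }
}
```

Calling `FastSketch.select(features, labels, 0.5)` returns the indices of one feature per cluster. A lower `cutThreshold` removes more MST edges and so yields more, smaller clusters. A threshold of 1.0 or more removes no edges and leaves a single cluster.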
