1. The 7th International Conference of SNIKOM 2023 (Hybrid)
(ICoSNIKOM 2023)
November 10, 2023
A SOFT SET APPROACH FOR
FAST CLUSTERING ATTRIBUTE
SELECTION
Dedy Hartama, Iwan Tri Riyadi Yanto, Muhammad Zarlis
Invited Speaker, ICoSNIKOM 2023
Hotel Sinabung - Brastagi (10 November 2023)
2. Outline Presentation
• Introduction
• Novelty Research
• MDDS Algorithm
• SSAA Algorithm
• Experimental Result and Discussion
• Conclusion
3. Introduction
• Data clustering is one of the most popular processes in data mining. It is a method of
analysing data that groups items with similar characteristics.
• Data clustering has been applied in many fields, such as information retrieval and
text mining, spatial-data applications (for example GIS or astronomy data),
web-based applications, and DNA analysis in computational biology.
6. Research About Categorical Data
• Several techniques for clustering categorical data based on soft set theory
have been developed. Qin et al. [19] presented new ideas for selecting a clustering
attribute using soft set theory.
• The same authors presented another way to use soft sets for selecting a clustering attribute
in categorical datasets and derived a heuristic algorithm, the NSS model
[20].
• Mamat et al. [21] proposed an alternative technique for categorical data clustering
using soft set theory, selecting the attribute based on the Maximum Attribute Relative
(MAR).
• Suhirman et al. [22] proposed another novel idea, the Maximum Degree of Dominance of
Soft set theory (MDDS), which applies the concept of dominance relationships in
multi-soft sets to determine the most dominant attribute.
7. Gap Research
MDDS uses the most dominant attribute as the clustering attribute. It has been applied in
many fields to select the best attribute, e.g. in education data clustering for the assessment of students.
However, determining the dominance values of each soft set in the multi-soft sets is
computationally expensive. Computing time becomes a problem because the method
compares all sub-attributes to calculate the domination degree.
This work aims to overcome that weakness of the MDDS technique.
8. Novelty Research
• This paper presents an alternative technique that improves MDDS by identifying
the sub-attributes that have no dominating effect on other sub-attributes.
Therefore, there is no need to compare all of the attributes as in
MDDS.
• The proposed technique potentially achieves lower computation
time than the baseline algorithms [19, 20, 21, 22].
9. MDDS ALGORITHM
• The MDDS technique was proposed by Suhirman et al. It is
based on the definition of the soft set domination degree, where the degree is
computed by the domain-value of each soft set on others in multi-soft sets.
• MDDS selects the attribute with the highest MD as the clustering
attribute, using the equation MD = max{K1, K2, ..., Kn}.
• If more than one attribute shares the highest k value, the next
highest k of each attribute is compared, and so on, until the tie is broken.
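The tie-breaking rule above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function name `select_by_domination` and the k values are hypothetical. Sorting each attribute's k values in descending order and comparing the lists lexicographically reproduces "compare the highest k, then the next highest, until the tie is broken".

```python
def select_by_domination(k_values):
    """Return the attribute whose descending-sorted k list is lexicographically largest.

    k_values maps an attribute name to its domination degrees over the other attributes.
    """
    ranked = {attr: sorted(ks, reverse=True) for attr, ks in k_values.items()}
    # Python compares lists element by element, which matches the tie-breaking rule.
    return max(ranked, key=lambda attr: ranked[attr])


# Hypothetical domination degrees, chosen so that a1 and a2 tie on the highest k.
k_values = {
    "a1": [0.5, 0.2, 0.1],
    "a2": [0.3, 0.5, 0.0],
    "a3": [0.4, 0.4, 0.4],
}
best = select_by_domination(k_values)
print(best)
```

Here a1 and a2 share the highest value 0.5, so the second-highest values (0.2 versus 0.3) decide. If two attributes have identical lists, `max` simply returns the first one encountered, which is an arbitrary choice of this sketch.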
10. MDDS Algorithm
Input: Categorical-valued data set
Output: A clustering attribute
Begin
1. Build the multi-soft set approximation
2. Calculate the domination of attribute ai on all aj, where i ≠ j
3. Select the maximum domination degree of each attribute
4. Select the clustering attribute based on the maximum degree of
domination of attributes
End
Fig 1. Pseudo code of MDDS
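A minimal Python sketch of the four steps in Fig 1 follows. The slides do not give the formula for the domination degree, so the sketch assumes one: the fraction of objects whose class under ai is fully contained in some class under aj. That definition, the function names, and the toy table are all assumptions for illustration, not the authors' MATLAB code.

```python
from collections import defaultdict


def multi_soft_set(table, attribute):
    """Step 1: group object ids by their value of `attribute` (one class per value)."""
    classes = defaultdict(set)
    for obj_id, row in table.items():
        classes[row[attribute]].add(obj_id)
    return list(classes.values())


def domination_degree(classes_i, classes_j, n_objects):
    """Step 2 (ASSUMED definition): share of objects whose a_i class fits inside an a_j class."""
    covered = sum(len(x) for x in classes_i if any(x <= y for y in classes_j))
    return covered / n_objects


def mdds(table):
    attributes = list(next(iter(table.values())))
    classes = {a: multi_soft_set(table, a) for a in attributes}
    n = len(table)
    # Step 3: domination degrees of each attribute, highest first.
    degrees = {
        a: sorted(
            (domination_degree(classes[a], classes[b], n) for b in attributes if b != a),
            reverse=True,
        )
        for a in attributes
    }
    # Step 4: pick the attribute with the largest degrees (ties fall to the next value).
    return max(degrees, key=lambda a: degrees[a]), degrees


# Hypothetical categorical data set with six objects.
table = {
    1: {"colour": "red", "size": "small", "shape": "round"},
    2: {"colour": "red", "size": "small", "shape": "square"},
    3: {"colour": "blue", "size": "large", "shape": "round"},
    4: {"colour": "blue", "size": "large", "shape": "square"},
    5: {"colour": "green", "size": "large", "shape": "round"},
    6: {"colour": "green", "size": "small", "shape": "square"},
}
best, degrees = mdds(table)
print(best, degrees[best])
```

Every pair of attributes and every pair of classes is compared here, which is the cost the slides identify as the weakness of MDDS.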
11. Novelty: Soft Set Fast Clustering Attribute Selection
Selecting Sub Attribute Algorithm (SSAA)
Input: Categorical-valued data set
Output: A clustering attribute
Begin
1. Construct the multi-soft sets representing the categorical-valued data set
and their cardinalities
2. Exclude each class X ∈ C(F,ai) of the soft set in attribute ai with respect to
Y ∈ C(F,aj), where |X| > |Y|
3. Calculate the domination degrees of the soft sets remaining from step 2
4. Select the maximum domination degree of each attribute
5. Select the clustering attribute based on the maximum degree of
domination of all attributes
End
Pseudo code of the Selecting Sub Attribute Algorithm
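The exclusion in step 2 can be illustrated with a short Python sketch. It rests on two assumptions that the slides do not confirm: that the domination degree counts the objects whose class X under one attribute is contained in some class Y under another, and that step 2 means skipping any pair where X has more elements than Y (such an X can never be a subset of Y). The function name and the toy classes are hypothetical.

```python
def domination_degree(classes_i, classes_j, n_objects, prune=True):
    """Return (degree, number of subset tests actually performed).

    ASSUMED definition: degree = |objects whose class X is inside some class Y| / n_objects.
    With prune=True, pairs with |X| > |Y| are skipped before the subset test.
    """
    covered = 0
    tests = 0
    for x in classes_i:
        for y in classes_j:
            if prune and len(x) > len(y):
                continue  # X is larger than Y, so X cannot be a subset of Y
            tests += 1
            if x <= y:
                covered += len(x)
                break  # classes of one attribute are disjoint; one match is enough
    return covered / n_objects, tests


# Hypothetical classes of two attributes over six objects.
size_classes = [{1, 2, 6}, {3, 4, 5}]
colour_classes = [{1, 2}, {3, 4}, {5, 6}]

pruned = domination_degree(size_classes, colour_classes, 6, prune=True)
unpruned = domination_degree(size_classes, colour_classes, 6, prune=False)
print(pruned, unpruned)
```

Both calls return the same degree, but the pruned version performs no subset tests at all on this input because every class of the first attribute is larger than every class of the second. A cardinality comparison is cheaper than a set-inclusion test, which is one plausible reading of where the time saving comes from.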
12. EXPERIMENTAL RESULT AND DISCUSSION
• To compare the proposed algorithm with the existing
algorithms, the authors implemented them in MATLAB Version 8
(R2015b).
• The amount of main memory is 2 GB. The experiments use
the UCI benchmark datasets listed in Table 1:
14. Response Time of the New Algorithm
• The new algorithm is proposed to reduce
computational complexity. The resulting
execution times can be seen in Table 2.
• The reduction in execution time of the
proposed algorithm relative to MDDS is
calculated by the formula
Improvement (%) = ((MDDS - SSAA) / MDDS) × 100%
where MDDS and SSAA denote the execution times of the two algorithms.
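As a worked illustration of the improvement formula, the following Python sketch computes the percentage from two execution times. The function name and the timings are hypothetical and are not taken from Table 2.

```python
def improvement_percent(mdds_time, ssaa_time):
    """Relative reduction in execution time of SSAA versus MDDS, in percent."""
    return (mdds_time - ssaa_time) / mdds_time * 100.0


# Hypothetical timings in seconds, for illustration only.
example = improvement_percent(2.0, 1.5)
print(example)
```

With these made-up values, SSAA saves 0.5 s out of 2.0 s, giving an improvement of 25%. A negative result would indicate that SSAA was slower than MDDS.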
15. Conclusion
Several algorithms for selecting a clustering attribute in categorical
data have been proposed previously. One of them is
MDDS.
This paper has proposed a modified MDDS algorithm based on multi-soft
set theory that selects sub-attributes for clustering multi-valued
information.
The experimental results show that the proposed
algorithm achieves lower execution time.
16. Thank You.