International Journal of Electrical and Computer Engineering (IJECE)
Vol. 10, No. 3, June 2020, pp. 3227~3234
ISSN: 2088-8708, DOI: 10.11591/ijece.v10i3.pp3227-3234
Journal homepage: http://ijece.iaescore.com/index.php/IJECE
The pertinent single-attribute-based classifier
for small datasets classification
Mona Jamjoom
Department of Computer Sciences, Princess Nourah Bint Abdulrahman University, Kingdom of Saudi Arabia
Article Info ABSTRACT
Article history:
Received Jul 27, 2019
Revised Dec 5, 2019
Accepted Dec 11, 2019
Classifying a dataset using machine learning algorithms can be a big
challenge when the target is a small dataset. The OneR classifier can be used
for such cases due to its simplicity and efficiency. In this paper, we reveal
the power of a single attribute by introducing the pertinent single-attribute-
based heterogeneity-ratio classifier (SAB-HR), which uses a pertinent attribute
to classify small datasets. SAB-HR applies a feature-selection method based on
the Heterogeneity-Ratio (H-Ratio) measure to identify the most homogeneous
attribute among the attributes in the set. Our empirical results on 12 benchmark
datasets from the UCI Machine Learning Repository show that the SAB-HR classifier
significantly outperforms the classical OneR classifier on small datasets.
In addition, using the H-Ratio as a feature-selection criterion for selecting
the single attribute was more effective than traditional criteria such as
Information Gain (IG) and Gain Ratio (GR).
Keywords:
Classification
Feature selection
OneR classifier
Single-attribute-based classifier
Small dataset
Copyright © 2020 Institute of Advanced Engineering and Science.
All rights reserved.
Corresponding Author:
Mona Jamjoom,
Department of Computer Sciences,
Princess Nourah Bint Abdulrahman University,
Airport Road, Riyadh 11671, Kingdom of Saudi Arabia.
Email: mmjamjoom@pnu.edu.sa
1. INTRODUCTION
Classification is one of the main tasks of data mining and machine learning [1] that is widely used to
predict different real-life situations. High accuracy is a key indicator for a successful prediction model.
Building an accurate classifier is one of the important goals, and rich datasets make this task easier and more
effective [2]. Classifying small datasets efficiently is essential as some real situations cannot provide
a sufficient number of cases. A limited training set is difficult to learn from and, consequently,
to base decisions on. In many multivariable classification or regression problems, such as estimation
or forecasting, we have a training set T_p = {(x_i, t_i)} of p pairs of input vectors x ∈ ℝ^n and
scalar targets t. According to Vapnik's definition, a dataset T_p is small in the following sense:
"For estimating functions with VC dimension h, we consider the size p of data to be small if
the ratio p/h is small (say p/h < 20)" [3].
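Vapnik's rule of thumb amounts to a one-line check. The sketch below is only an illustrative Python restatement of the definition above; the function name and the `threshold` parameter (defaulting to Vapnik's suggested 20) are our own.

```python
def is_small_dataset(p: int, h: int, threshold: float = 20) -> bool:
    """Vapnik's rule of thumb: a dataset of size p is 'small' relative to
    an estimator of VC dimension h when the ratio p/h is below threshold."""
    return p / h < threshold

# Example: 90 instances against VC dimension 9 gives p/h = 10 < 20.
print(is_small_dataset(90, 9))     # True
print(is_small_dataset(2000, 10))  # False (ratio 200)
```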
The problem with a small dataset is that, unless carefully collected, it is not a representative
sample. Non-representative instances prevent the learning model from obtaining enough information
because of the gaps between instances; thus, the model does not generalize well. Many works
in the literature have addressed the small-data problem with different methods. One
common method is to increase the data size by adding artificial instances [4], but this
approach can compromise data credibility and may not reflect real-life use. Some researchers have
used feature-selection methods [5-8], whereas [9] and others proposed a technique based on
multiple runs for model development.
A simple solution is desirable when a problem grows increasingly complex; this is
the philosophy of Occam's razor [1]. The classification literature has shown that
very simple rules can achieve high accuracy on many datasets [10]. OneR is one of
the simplest and most widely used machine learning algorithms for building a classifier.
Its trade-off between simplicity and performance [10] makes OneR slightly less accurate than
state-of-the-art classification algorithms [11, 12], although it sometimes outperforms
them [13, 14]. Its main advantage is that it balances the best possible accuracy with a model
that is still simple enough for humans to understand [12].
OneR is a single-attribute-based classifier that involves only one attribute at classification time.
The single-attribute concept is powerful when that attribute directly and positively influences
the classification accuracy on the dataset. Yet not all attributes contribute positively to
the classification process, which is precisely what makes a well-chosen single attribute powerful.
A single-attribute rule can be more effective than complex methods when the dataset is difficult
to learn from because it is small, noisy, or complex. A study by [15] used the single-attribute
concept by creating multiple one-dimensional classifiers from the original dataset in
the training phase and combining their results in the prediction phase; unlike OneR, that method
considers the contributions of all attributes at prediction time. Feature selection is
a data-mining preprocessing step widely used to improve classification and reduce running time.
It effectively reduces a dataset's dimensionality by eliminating non-contributing attributes,
using different techniques to arrive at a single attribute or a subset of attributes [16, 17].
Moreover, it has proven effective in improving predictive accuracy across various
applications [18-20].
In this paper, we tackle the problem of classifying small datasets by exploiting the power of
a pertinent single attribute through the SAB-HR classifier. SAB-HR resembles OneR in using a single
attribute at the classification phase, but differs in that, instead of generating a rule for each
attribute, it employs a feature-selection method to select the attribute that is least heterogeneous
among the attributes. We calculated the H-Ratio [21] for each attribute (att) and identified
the attribute with the lowest H-Ratio value (attH-Ratio). We then used the pair (attH-Ratio, c),
where c is the class value, to learn and classify the small dataset. The results were encouraging
and showed a significant improvement over the classical OneR classifier. In addition, we created
classifiers analogous to SAB-HR that use different criteria to select the pertinent single
attribute: using IG and GR in the feature-selection process yields the SAB-IG and SAB-GR
classifiers, respectively. We compared the new SAB-HR classifier individually with each of
the others (i.e., SAB-IG and SAB-GR). The remainder of this paper is organized as follows:
Section 2 reviews the background of our work. Section 3 presents the proposed SAB-HR classifier;
the experiments and a discussion of the findings appear in subsections 3.1 and 3.2, respectively.
Finally, Section 4 concludes the paper.
2. BACKGROUND
In this section, we review the techniques used in this study.
2.1. OneR classifier
OneR, short for "One Rule", was introduced by Rob Holte [22, 10]. It is one of the most
primitive techniques, based on a 1-level decision tree: it creates one rule for each attribute
in the dataset, then selects the rule with the minimum classification error as its "one rule".
To create a rule for an attribute, it constructs a frequency table of that attribute against
the class [22]. Figure 1 shows the pseudocode of the OneR algorithm. OneR has been shown to work
remarkably well in practice on real-world data and can compete with state-of-the-art
classification algorithms in some situations [13, 14, 23]. Because OneR uses a single attribute
for classification, many consider it a feature-selection method whose selected subset contains
a single attribute [24]. Compared with the baseline classifier ZeroR [14], OneR is one step
beyond; both are useful for establishing a minimum performance standard for other classification
algorithms. When evaluated on the training data, OneR's accuracy is always at least that of
the baseline classifier. The authors in [25] proposed enhancements to OneR addressing two issues:
the quantization of continuous-valued attributes and the treatment of missing values.
Figure 1. The pseudocode of OneR algorithm [15]
For each attribute (att):
    For each value of that att, make a rule as follows:
        Count how often each class value appears
        Find the most frequent class
        Make the rule assign that class to this value of the att
    Calculate the total error of the rules of each att
Choose the att with the smallest total error.
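The pseudocode above can be sketched in Python. This is a minimal illustrative implementation, not the WEKA implementation used in the experiments; the dataset representation (a list of attribute-value dicts plus a parallel label list) is our own assumption.

```python
from collections import Counter, defaultdict

def one_r(X, y):
    """Minimal OneR sketch: X is a list of attribute-value dicts,
    y the parallel list of class labels.
    Returns (best_attribute, rule), where rule maps value -> class."""
    best_att, best_rule, best_errors = None, None, None
    for att in X[0]:
        # Frequency table: class counts for each value of this attribute.
        freq = defaultdict(Counter)
        for row, label in zip(X, y):
            freq[row[att]][label] += 1
        # Rule: each value predicts its most frequent class.
        rule = {v: counts.most_common(1)[0][0] for v, counts in freq.items()}
        # Total error of this attribute's rule on the training data.
        errors = sum(1 for row, label in zip(X, y) if rule[row[att]] != label)
        if best_errors is None or errors < best_errors:
            best_att, best_rule, best_errors = att, rule, errors
    return best_att, best_rule
```

For instance, on a toy weather-style dataset where `outlook` perfectly determines the class and `windy` does not, `one_r` returns the `outlook` rule.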
2.2. Feature selection
Feature selection methods attempt to find the minimal subset of features that does not
significantly decrease classification accuracy. They can be categorized as wrapper methods or
filter methods [17]; the surveys in [17] and [16] cover many such methods. A wrapper method is
a model-based approach in which the quality of the selected features is measured by
the classification accuracy of the learning algorithm being used; some wrappers use a greedy
search to select the subset [16]. A filter method, by contrast, is a model-free approach:
features are selected independently of the classification algorithm, based on general measurable
characteristics such as Information Gain, Gain Ratio, Pearson Correlation, Mutual Information
(MI) [16], and the Heterogeneity Ratio [21]. In this paper, we use filter-based feature selection
(i.e., attribute evaluation) and focus on three of these measures (IG, GR, and H-Ratio). A brief
description of each follows.
- Information gain [21] measures the amount of information an attribute gives about the class.
It is defined by formula (1):

IG(att) = H(Y) − H_att(Y) (1)

where H_att(Y) is the entropy of class Y conditioned on attribute att and H(Y) is the entropy of
class Y. Entropy quantifies the information contained in, or delivered by, a source of
information; it is also used to measure relevancy and is defined by formula (2):

H(Y) = ∑_i −P(v_i) log₂ P(v_i) (2)
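Formulas (1) and (2) can be computed directly. The sketch below is an illustrative Python rendering (the function names are our own); `info_gain` takes the attribute's value column and the class column as parallel lists.

```python
import math
from collections import Counter

def entropy(values):
    """H(Y) = -sum_i P(v_i) * log2 P(v_i), formula (2)."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def info_gain(att_column, labels):
    """IG(att) = H(Y) - H_att(Y), formula (1). H_att(Y) is the class
    entropy averaged over the partitions induced by the attribute's values."""
    n = len(labels)
    h_att_y = sum(
        (att_column.count(v) / n)
        * entropy([y for a, y in zip(att_column, labels) if a == v])
        for v in set(att_column)
    )
    return entropy(labels) - h_att_y
```

A perfectly predictive attribute attains IG equal to H(Y), while an attribute independent of the class attains IG = 0.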
- Gain ratio [26] is the ratio of information gain to intrinsic information and determines
the relevancy of an attribute. GR is calculated using formula (3):

GR(att) = IG(att) / H(att) (3)

where H(att) = ∑_j −P(v_j) log₂ P(v_j) and P(v_j) is the probability that attribute att takes
the value v_j.
- Heterogeneity ratio is a measure introduced by [21] that quantifies the heterogeneity of
a nominal attribute among the dataset attributes; in other words, it measures the homogeneity of
the set of instances sharing the same attribute value. The H-Ratio is defined by formula (4):

H-Ratio(att) = (H(att) + H_att(Y)) / H(Y) (4)

The term H(att)/H(Y) accounts for the homogeneity of instances with respect to the attribute and
the class simultaneously, whereas the term H_att(Y)/H(Y) rewards the homogeneity of instances
that belong to the same class and share the same attribute value.
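Formulas (3) and (4) can likewise be sketched in Python. This is a self-contained illustration with our own helper names, repeating the entropy definitions for completeness.

```python
import math
from collections import Counter

def _entropy(values):
    """H(X) = -sum P(v) log2 P(v) over the distinct values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def _cond_entropy(att_column, labels):
    """H_att(Y): class entropy weighted over the attribute's value partitions."""
    n = len(labels)
    return sum(
        (att_column.count(v) / n)
        * _entropy([y for a, y in zip(att_column, labels) if a == v])
        for v in set(att_column)
    )

def gain_ratio(att_column, labels):
    """GR(att) = IG(att) / H(att), formula (3)."""
    ig = _entropy(labels) - _cond_entropy(att_column, labels)
    return ig / _entropy(att_column)

def h_ratio(att_column, labels):
    """H-Ratio(att) = (H(att) + H_att(Y)) / H(Y), formula (4);
    a lower value means a more homogeneous attribute."""
    return (_entropy(att_column) + _cond_entropy(att_column, labels)) / _entropy(labels)
```

Note that a perfectly predictive attribute attains the lowest H-Ratio, which is exactly the property the SAB-HR selection criterion exploits.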
3. RESULTS AND ANALYSIS
In this section, we introduce SAB-HR, a new single-attribute-based classifier for small datasets.
The new algorithm uses a new criterion to select the powerful pertinent single attribute that will
contribute to the classification. Unlike OneR, SAB-HR does not generate a rule for each attribute;
instead, it calculates the H-Ratio for each attribute in the dataset to determine the attribute
that is least heterogeneous among the others. The attribute with the lowest heterogeneity-ratio
value (attH-Ratio) is paired with the class c as (attH-Ratio, c) in the classification process,
while the remaining attributes are eliminated. The power of the single attribute selected by
SAB-HR lies in its homogeneity with the other attributes, which provides enough information for
the classifier to predict correctly; attH-Ratio is a representative attribute that is sufficient
for small datasets. The algorithmic description of SAB-HR is presented in Figure 2.
Figure 2. The pseudocode of SAB-HR algorithm
For each attribute (att):
    Calculate the att's H-Ratio
Choose the att with the smallest H-Ratio value
Remove all attributes in the dataset except the pair (attH-Ratio, c)
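The steps above can be sketched as follows. This is an illustrative Python rendering with an assumed data representation (a list of attribute-value dicts plus a parallel label list), not the WEKA-based pipeline used in the experiments.

```python
import math
from collections import Counter

def _entropy(values):
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def _h_ratio(att_vals, labels):
    # H-Ratio(att) = (H(att) + H_att(Y)) / H(Y), as in formula (4).
    n = len(labels)
    h_att_y = sum(
        (att_vals.count(v) / n)
        * _entropy([y for a, y in zip(att_vals, labels) if a == v])
        for v in set(att_vals)
    )
    return (_entropy(att_vals) + h_att_y) / _entropy(labels)

def select_sab_hr_attribute(X, y):
    """Keep only the attribute with the lowest H-Ratio; the reduced
    dataset is the pair (attH-Ratio, class) used for classification."""
    scores = {att: _h_ratio([row[att] for row in X], y) for att in X[0]}
    best = min(scores, key=scores.get)
    return best, [{best: row[best]} for row in X]
```

The reduced dataset can then be handed to any classifier; in the paper's setup, classification proceeds on the single retained attribute paired with the class.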
3.1. Experiments
In the following experiments, we aim to evaluate the performance of the new SAB-HR classifier
when dealing with small datasets. In addition, we want to compare the performance of SAB-HR with other
single attribute classifiers that use different criteria, such as IG and GR, when selecting the single attribute
during the feature-selection process. We used the well-known open source software WEKA [27].
The datasets were obtained from the UCI Machine Learning Repository [28]. We selected 12 small
datasets that satisfy Vapnik's definition [3]. Table 1 lists the main characteristics of
the datasets in terms of number of instances, number of attributes, and Vapnik's ratio for
determining dataset size. The number beside each dataset name is its reference in the figures.
OneR was used as the base classifier; 10-fold cross-validation and a paired t-test with
a confidence level of 95% were used to determine whether the differences in classification
accuracy were statistically significant; significant differences are underlined in the tables.
We compared the methods with respect to average classification accuracy and the number of
datasets on which each method achieved better results; better results are shown in bold in the tables.
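The significance check can be illustrated with a small stdlib-only sketch. The paper used WEKA's built-in paired t-test; the plain Student's paired t statistic below is a simplified stand-in, with the per-fold accuracy pairing as an assumed input format.

```python
import math
import statistics

def paired_t_statistic(acc_a, acc_b):
    """Student's paired t statistic over per-fold accuracy pairs:
    t = mean(d) / (stdev(d) / sqrt(n)), with d the per-fold differences."""
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

# With 10 folds (df = 9), |t| > 2.262 indicates significance at the 95%
# confidence level (two-tailed critical value of Student's t distribution).
```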
In the tables, each technique is named with the abbreviation SAB (single-attribute-based),
suffixed with an abbreviation of the measure used to select the single attribute in
the feature-selection process. The new classifiers are thus named SAB-HR, SAB-IG, and SAB-GR.
In our experiments, we applied the feature-selection process with each measure
(H-Ratio, IG, and GR) to select the pertinent single attribute, eliminated the remaining
(i.e., unselected) attributes, and classified with the pair (pertinent single attribute, class).
Table 1. Characteristics of datasets used in the experiments
# Dataset # instances # attributes # instances/# attributes
1 Postoperative-patient-data 90 9 10
2 contact-lenses 24 4 6
3 weather-nominal 14 4 3.5
4 colic.ORIG 368 27 13.63
5 cylinder-bands 540 39 13.85
6 Dermatology 366 34 10.76
7 Flags 194 29 6.69
8 lung-cancer 32 56 0.57
9 spect_train 80 22 3.64
10 Sponge 72 45 1.6
11 Zoo 101 17 5.94
12 primary-tumor 339 17 19.94
3.2. Results and discussion
The experimental results are summarized in Table 2, which compares the performance of classical
OneR with the newly created classifiers. Noticeably, the performance of classical OneR is
markedly lower than that of the new classifiers. The overall average accuracies of the new
classifiers (SAB-HR, SAB-IG, and SAB-GR) are 64.6%, 49.72%, and 61.31%, respectively, versus
48.53% for the classical OneR classifier. Furthermore, the difference in average accuracy
between SAB-HR and classical OneR is statistically significant. The average improvements of
the new classifiers (SAB-HR, SAB-IG, and SAB-GR) over classical OneR are 16.07%, 1.19%, and
12.78%, respectively.
Table 2. The performance’s summary of applied classifiers compared to the classical OneR classifier
Dataset OneR SAB-HR OneR SAB-IG OneR SAB-GR
Postoperative-patient-data 67.78 71.11 67.78 71.11 67.78 68.89
contact-lenses 70.83 70.83 70.83 70.83 70.83 70.83
weather-nominal 42.86 57.14 42.86 50 42.86 50
colic.ORIG 67.66 65.76 67.66 67.66 67.66 63.86
cylinder-bands 49.63 67.59 49.63 49.63 49.63 65
dermatology 49.73 36.07 49.73 50.27 49.73 36.07
flags 4.64 33.51 4.64 4.64 4.64 42.78
lung-cancer 87.5 96.88 87.5 87.5 87.5 78.13
spect_train 67.5 92.5 67.5 75 67.5 75
sponge 4.17 98.61 4.17 4.17 4.17 95.83
zoo 42.57 60.4 42.57 42.57 42.57 60.4
primary-tumor 27.43 24.78 27.43 23.3 27.43 28.9
Average Accuracy 48.53 64.6 48.53 49.72 48.53 61.31
# of better dataset 3 8 1 4 3 8
Figures 3(a)-(c) compare the applied classifiers to the classical OneR classifier in terms of
average accuracy. The least-heterogeneous-attribute classifier (SAB-HR) ranks first, followed by
SAB-GR with a slight difference (3.29%) from first place, and then SAB-IG, which lags well
behind the other classifiers and behaves almost identically to classical OneR; the two lines are
approximately identical in Figure 3(b). The attribute (attIG) used in SAB-IG contains the largest
amount of information about the class. In the small-dataset case, however, it may be more
important to consider the consistency of the attribute with the other attributes, owing to
the limited number of instances; this minimizes the gaps between the instances in the dataset.
The homogeneity of the dataset helps make it more representative and thus easier to learn
accurately. In addition, the new classifiers achieved better average accuracy on more datasets
than OneR, as shown in Table 2. Figures 4(a)-(c) show each new classifier in comparison to OneR.
The numbers of datasets with better results are 8, 4, and 8 for SAB-HR, SAB-IG, and SAB-GR,
respectively, versus 3, 1, and 3 for the OneR classifier.
Figure 3. Comparison of applied classifiers versus OneR classifier in terms of average accuracy
Figure 4. Comparison of applied classifiers versus OneR classifier in terms of number
of better datasets achieved
From Table 2, it is clear that selecting the single attribute with the lowest classification
error rate, as the OneR classifier does, is not always optimal, especially for small datasets.
Using a more deliberate technique to select the single attribute has a positive impact on
classification accuracy and on the number of datasets with better results. Table 3 highlights
the new classifier SAB-HR, which uses homogeneity for pertinent single-attribute selection,
comparing it against the other classifiers created for the same purpose (i.e., SAB-IG and
SAB-GR). The results show that SAB-HR's average accuracy exceeds SAB-IG's by nearly 14.88%,
while the difference from SAB-GR is only 1.37%. In general, the performance of the SAB-HR
classifier is remarkable compared to both the classical OneR and the other applied classifiers
(i.e., SAB-IG and SAB-GR). Figures 5(a) and (b) show the per-dataset performance differences
between SAB-HR and the other applied classifiers in terms of average accuracy.
Table 3. A comparison between the new classifier SAB-HR and the other classifiers
Dataset SAB-HR SAB-IG SAB-HR SAB-GR
Postoperative-patient-data 71.11 71.11 71.11 68.89
Contact-lenses 70.83 70.83 70.83 70.83
Weather-nominal 57.14 50 57.14 50
Colic.ORIG 65.76 67.66 65.76 63.86
Cylinder-bands 67.59 49.63 67.59 65
Dermatology 36.07 50.27 36.07 36.07
Flags 33.51 4.64 33.51 42.78
Lung-cancer 96.88 87.5 96.88 78.13
Spect_train 92.5 75 75 75
Sponge 98.61 4.17 93.06 95.83
Zoo 60.4 42.57 60.4 60.4
Primary-tumor 24.78 23.3 24.78 28.9
Average Accuracy 64.6 49.72 62.68 61.31
# of better dataset 8 2 5 3
Figure 5. Comparison of applied classifiers versus SAB-HR classifier in terms of average accuracy
In summary, we conclude that, for small datasets, a simple classifier such as OneR is a viable
option, and its classification accuracy can be enhanced by employing a feature-selection method
to choose the single attribute using a measure such as H-Ratio, IG, or GR. In particular,
considering the homogeneity of the attribute when selecting the pertinent single attribute
positively impacts the classification process: it helps reduce the gaps between instances and
accordingly yields a more representative dataset, which in turn provides enough information for
the classifier to learn and achieve a decent average accuracy. These results show that
a single-attribute-based classifier can be powerful for classifying small datasets when
the pertinent attribute is well selected, as is the case for the new SAB-HR, the recommended
classifier among those tested in this work.
4. CONCLUSION
In this work, we explored the power of a single attribute selected using an effective
feature-selection criterion. We addressed the small-dataset mining problem, since it is not
always easy to gather large amounts of real data. The new algorithm SAB-HR is a pertinent
single-attribute-based classifier
that combines simplicity and effectiveness to contribute positively to classifying small
datasets. Selecting the single attribute that is most homogeneous with the other attributes in
the dataset yields more consistency between instances. Our empirical evaluation used 12
benchmark datasets that are small according to Vapnik's definition. The results show that SAB-HR
significantly outperforms the classical OneR classifier. In addition, we compared
the performance of SAB-HR with other single-attribute classifiers that use different
attribute-selection criteria (i.e., IG and GR), and all the results confirmed the effectiveness
of the SAB-HR classifier. In future work, we intend to investigate ways to improve
the classification accuracy of small datasets using more advanced classifiers, and to propose
even simpler classification methods.
ACKNOWLEDGEMENTS
This research was funded by the Deanship of Scientific Research at Princess Nourah bint
Abdulrahman University through the Fast-track Research Funding Program. The author is grateful
for this support in conducting the research.
REFERENCES
[1] T. Mitchell, “Machine Learning,” McGraw Hill, 1997.
[2] T. Van Gemert, “On the influence of dataset characteristics on classifier performance,” Bachelor Thesis, Faculty of
Humanities, Utrecht University, pp. 1–13, 2017.
[3] V. Vapnik, “Statistical Learning Theory,” Wiley, New York, 2000.
[4] N. H. Ruparel, N. M. Shahane, and D. P. Bhamare, “Learning from small data set to build classification model :
A survey,” IJCA Proceedings on International Conference on Recent Trends in Engineering and Technology
ICRTET, vol. 4, pp. 23–26, 2013.
[5] X. Chen and J. C. Jeong, “Minimum reference set based feature selection for small sample classifications,”
Proceedings of the 24th International Conference on Machine Learning - ICML ’07, pp. 153–160, 2007.
[6] S. L. Happy, R. Mohanty, and A. Routray, “An effective feature selection method based on pair-wise feature
proximity for high dimensional low sample size data,” 25th European Signal Processing Conference, EUSIPCO,
pp. 1574–1578, 2017.
[7] A. Golugula, G. Lee, and A. Madabhushi, “Evaluating feature selection strategies for high dimensional, small
sample size datasets,” Conference Proceedings Annual International Conference of the IEEE Engineering in
Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society, pp. 949–952, 2011.
[8] I. Soares, J. Dias, H. Rocha, M. do Carmo Lopes, and B. Ferreira, “Feature selection in small databases: A medical-
case study,” IFMBE Proceedings: XIV Mediterranean Conference on Medical and Biological Engineering and
Computing, vol. 57, pp. 808–813, 2016.
[9] T. Shaikhina, D. Lowe, S. Daga, D. Briggs, R. Higgins, and N. Khovanova, “Machine learning for predictive
modelling based on small data in biomedical engineering,” IFAC-PapersOnLine, vol. 28, pp. 469–474, 2015.
[10] R. C. Holte, “Very simple classification rules perform well on most commonly used datasets,” Machine Learning,
vol. 11, pp. 63–91, 1993.
[11] A. K. Dogra and T. Wala, “A comparative study of selected classification algorithms of data mining,” International
Journal of Computer Science and Mobile Computing, vol. 4, no. 6, pp. 220–229, 2015.
[12] F. Alam and S. Pachauri, “Comparative study of J48, Naive Bayes and One-R classification technique for
credit card fraud detection using WEKA,” Advances in Computational Sciences and Technology, vol. 10, no. 6,
pp. 1731–1743, 2017.
[13] V. S. Parsania, N. N. Jani, and N. H. Bhalodiya, “Applying Naïve Bayes, BayesNet, PART, JRip and OneR
algorithms on hypothyroid database for comparative analysis,” IJDI-ERET, vol. 3, pp. 1–6, 2015.
[14] C. Nasa and Suman, “Evaluation of different classification techniques for WEB data,” International Journal of
Computer Applications, vol. 52, pp. 34–40, 2012.
[15] L. Du and Q. Song, “A simple classifier based on a single attribute,” Proceedings of the 14th IEEE International
Conference on High Performance Computing and Communications, HPCC-2012 & 9th IEEE International
Conference on Embedded Software and Systems, ICESS-2012, pp. 660–665, 2012.
[16] M. Dash and H. Liu, “Feature selection for classification,” Intelligent Data Analysis, vol. 1, pp. 131–156, 1997.
[17] L. Huan and L. Yu, “Toward integrating feature selection algorithms for classification and clustering,” IEEE
Transactions on Knowledge and Data Engineering, vol. 17, pp. 491–502, 2005.
[18] M. Ramaswami and R. Bhaskaran, “A study on feature selection techniques in educational data mining,” Journal of
Computing, vol. 1, pp. 7–11, 2009.
[19] Y. Pan, “A proposed frequency-based feature selection method for cancer classification,” Masters Theses &
Specialist Projects, TopSCHOLAR, Department of Computer Science, Western Kentucky University,
2017.
[20] I. Sangaiah, A. V. A. Kumar, and A. Balamurugan, “An empirical study on different ranking methods for effective
data classification,” Journal of Modern Applied Statistical Methods, vol. 14, pp. 35–52, 2015.
[21] M. Trabelsi, N. Meddouri, and M. Maddouri, “A new feature selection method for nominal classifier based on
formal concept analysis,” Procedia Computer Science, vol. 112, pp. 186–194, 2017.
[22] R. Holte, “Machine learning,” Proceeding of the Tenth International Conference, University of Massachusetts,
Amherst, June 1993.
[23] D. I. Morariu, R. G. Crețulescu, and M. Breazu, “Feature selection in document classification,” The Fourth
International Conference in Romania of Information Science and Information Literacy, Romania, 2013.
[24] J. Novakovic, “Using information gain attribute evaluation to classify sonar targets,” 17th Telecommunications
Forum (TELFOR), pp. 1351–1354, 2009.
[25] C. G. Nevill-Manning, G. Holmes, and I. H. Witten, “The development of Holte’s 1R classifier,” Proceedings 1995
Second New Zealand International Two-Stream Conference on Artificial Neural Networks and Expert Systems,
1995.
[26] J. Novaković, P. Strbac, and D. Bulatović, “Toward optimal feature selection using ranking methods and
classification algorithms,” Yugoslav Journal of Operations Research, vol. 21, pp. 119–135, 2011.
[27] U of Waikato, “WEKA: The Waikato environment for knowledge analysis,” 2018. [Online]. Available:
http://www.cs.waikato.ac.nz/ml/weka/
[28] UCI, “UCI machine learning repository,” 2018. [Online]. Available:
http://archive.ics.uci.edu/ml/machine-learning-databases/
 
The improved k means with particle swarm optimization
The improved k means with particle swarm optimizationThe improved k means with particle swarm optimization
The improved k means with particle swarm optimization
 
Hypothesis on Different Data Mining Algorithms
Hypothesis on Different Data Mining AlgorithmsHypothesis on Different Data Mining Algorithms
Hypothesis on Different Data Mining Algorithms
 
BINARY SINE COSINE ALGORITHMS FOR FEATURE SELECTION FROM MEDICAL DATA
BINARY SINE COSINE ALGORITHMS FOR FEATURE SELECTION FROM MEDICAL DATABINARY SINE COSINE ALGORITHMS FOR FEATURE SELECTION FROM MEDICAL DATA
BINARY SINE COSINE ALGORITHMS FOR FEATURE SELECTION FROM MEDICAL DATA
 
Performance Evaluation: A Comparative Study of Various Classifiers
Performance Evaluation: A Comparative Study of Various ClassifiersPerformance Evaluation: A Comparative Study of Various Classifiers
Performance Evaluation: A Comparative Study of Various Classifiers
 
Ijetr021251
Ijetr021251Ijetr021251
Ijetr021251
 
Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests
 

Similar to The pertinent single-attribute-based classifier for small datasets classification

Threshold benchmarking for feature ranking techniques
Threshold benchmarking for feature ranking techniquesThreshold benchmarking for feature ranking techniques
Threshold benchmarking for feature ranking techniquesjournalBEEI
 
IRJET- A Comparative Research of Rule based Classification on Dataset using W...
IRJET- A Comparative Research of Rule based Classification on Dataset using W...IRJET- A Comparative Research of Rule based Classification on Dataset using W...
IRJET- A Comparative Research of Rule based Classification on Dataset using W...IRJET Journal
 
Comparative study of various supervisedclassification methodsforanalysing def...
Comparative study of various supervisedclassification methodsforanalysing def...Comparative study of various supervisedclassification methodsforanalysing def...
Comparative study of various supervisedclassification methodsforanalysing def...eSAT Publishing House
 
Feature selection for multiple water quality status: integrated bootstrapping...
Feature selection for multiple water quality status: integrated bootstrapping...Feature selection for multiple water quality status: integrated bootstrapping...
Feature selection for multiple water quality status: integrated bootstrapping...IJECEIAES
 
Review of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & PredictionReview of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & PredictionIRJET Journal
 
Survey on classification algorithms for data mining (comparison and evaluation)
Survey on classification algorithms for data mining (comparison and evaluation)Survey on classification algorithms for data mining (comparison and evaluation)
Survey on classification algorithms for data mining (comparison and evaluation)Alexander Decker
 
N ETWORK F AULT D IAGNOSIS U SING D ATA M INING C LASSIFIERS
N ETWORK F AULT D IAGNOSIS U SING D ATA M INING C LASSIFIERSN ETWORK F AULT D IAGNOSIS U SING D ATA M INING C LASSIFIERS
N ETWORK F AULT D IAGNOSIS U SING D ATA M INING C LASSIFIERScsandit
 
Proposing an Appropriate Pattern for Car Detection by Using Intelligent Algor...
Proposing an Appropriate Pattern for Car Detection by Using Intelligent Algor...Proposing an Appropriate Pattern for Car Detection by Using Intelligent Algor...
Proposing an Appropriate Pattern for Car Detection by Using Intelligent Algor...Editor IJCATR
 
IRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET Journal
 
IRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET Journal
 
GRID COMPUTING: STRATEGIC DECISION MAKING IN RESOURCE SELECTION
GRID COMPUTING: STRATEGIC DECISION MAKING IN RESOURCE SELECTIONGRID COMPUTING: STRATEGIC DECISION MAKING IN RESOURCE SELECTION
GRID COMPUTING: STRATEGIC DECISION MAKING IN RESOURCE SELECTIONIJCSEA Journal
 
IRJET- Diverse Approaches for Document Clustering in Product Development Anal...
IRJET- Diverse Approaches for Document Clustering in Product Development Anal...IRJET- Diverse Approaches for Document Clustering in Product Development Anal...
IRJET- Diverse Approaches for Document Clustering in Product Development Anal...IRJET Journal
 
A Threshold fuzzy entropy based feature selection method applied in various b...
A Threshold fuzzy entropy based feature selection method applied in various b...A Threshold fuzzy entropy based feature selection method applied in various b...
A Threshold fuzzy entropy based feature selection method applied in various b...IJMER
 
An Empirical Comparison and Feature Reduction Performance Analysis of Intrusi...
An Empirical Comparison and Feature Reduction Performance Analysis of Intrusi...An Empirical Comparison and Feature Reduction Performance Analysis of Intrusi...
An Empirical Comparison and Feature Reduction Performance Analysis of Intrusi...ijctcm
 
A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM...
A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM...A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM...
A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM...IJNSA Journal
 
A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM...
A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM...A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM...
A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM...IJNSA Journal
 
CLASSIFIER SELECTION MODELS FOR INTRUSION DETECTION SYSTEM (IDS)
CLASSIFIER SELECTION MODELS FOR INTRUSION DETECTION SYSTEM (IDS)CLASSIFIER SELECTION MODELS FOR INTRUSION DETECTION SYSTEM (IDS)
CLASSIFIER SELECTION MODELS FOR INTRUSION DETECTION SYSTEM (IDS)ieijjournal1
 
New Feature Selection Model Based Ensemble Rule Classifiers Method for Datase...
New Feature Selection Model Based Ensemble Rule Classifiers Method for Datase...New Feature Selection Model Based Ensemble Rule Classifiers Method for Datase...
New Feature Selection Model Based Ensemble Rule Classifiers Method for Datase...ijaia
 
Opinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classicationOpinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classicationIJECEIAES
 

Similar to The pertinent single-attribute-based classifier for small datasets classification (20)

Threshold benchmarking for feature ranking techniques
Threshold benchmarking for feature ranking techniquesThreshold benchmarking for feature ranking techniques
Threshold benchmarking for feature ranking techniques
 
IRJET- A Comparative Research of Rule based Classification on Dataset using W...
IRJET- A Comparative Research of Rule based Classification on Dataset using W...IRJET- A Comparative Research of Rule based Classification on Dataset using W...
IRJET- A Comparative Research of Rule based Classification on Dataset using W...
 
Comparative study of various supervisedclassification methodsforanalysing def...
Comparative study of various supervisedclassification methodsforanalysing def...Comparative study of various supervisedclassification methodsforanalysing def...
Comparative study of various supervisedclassification methodsforanalysing def...
 
Feature selection for multiple water quality status: integrated bootstrapping...
Feature selection for multiple water quality status: integrated bootstrapping...Feature selection for multiple water quality status: integrated bootstrapping...
Feature selection for multiple water quality status: integrated bootstrapping...
 
Review of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & PredictionReview of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & Prediction
 
Survey on classification algorithms for data mining (comparison and evaluation)
Survey on classification algorithms for data mining (comparison and evaluation)Survey on classification algorithms for data mining (comparison and evaluation)
Survey on classification algorithms for data mining (comparison and evaluation)
 
N ETWORK F AULT D IAGNOSIS U SING D ATA M INING C LASSIFIERS
N ETWORK F AULT D IAGNOSIS U SING D ATA M INING C LASSIFIERSN ETWORK F AULT D IAGNOSIS U SING D ATA M INING C LASSIFIERS
N ETWORK F AULT D IAGNOSIS U SING D ATA M INING C LASSIFIERS
 
Proposing an Appropriate Pattern for Car Detection by Using Intelligent Algor...
Proposing an Appropriate Pattern for Car Detection by Using Intelligent Algor...Proposing an Appropriate Pattern for Car Detection by Using Intelligent Algor...
Proposing an Appropriate Pattern for Car Detection by Using Intelligent Algor...
 
IRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms Comparison
 
IRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms Comparison
 
GRID COMPUTING: STRATEGIC DECISION MAKING IN RESOURCE SELECTION
GRID COMPUTING: STRATEGIC DECISION MAKING IN RESOURCE SELECTIONGRID COMPUTING: STRATEGIC DECISION MAKING IN RESOURCE SELECTION
GRID COMPUTING: STRATEGIC DECISION MAKING IN RESOURCE SELECTION
 
61_Empirical
61_Empirical61_Empirical
61_Empirical
 
IRJET- Diverse Approaches for Document Clustering in Product Development Anal...
IRJET- Diverse Approaches for Document Clustering in Product Development Anal...IRJET- Diverse Approaches for Document Clustering in Product Development Anal...
IRJET- Diverse Approaches for Document Clustering in Product Development Anal...
 
A Threshold fuzzy entropy based feature selection method applied in various b...
A Threshold fuzzy entropy based feature selection method applied in various b...A Threshold fuzzy entropy based feature selection method applied in various b...
A Threshold fuzzy entropy based feature selection method applied in various b...
 
An Empirical Comparison and Feature Reduction Performance Analysis of Intrusi...
An Empirical Comparison and Feature Reduction Performance Analysis of Intrusi...An Empirical Comparison and Feature Reduction Performance Analysis of Intrusi...
An Empirical Comparison and Feature Reduction Performance Analysis of Intrusi...
 
A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM...
A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM...A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM...
A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM...
 
A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM...
A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM...A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM...
A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM...
 
CLASSIFIER SELECTION MODELS FOR INTRUSION DETECTION SYSTEM (IDS)
CLASSIFIER SELECTION MODELS FOR INTRUSION DETECTION SYSTEM (IDS)CLASSIFIER SELECTION MODELS FOR INTRUSION DETECTION SYSTEM (IDS)
CLASSIFIER SELECTION MODELS FOR INTRUSION DETECTION SYSTEM (IDS)
 
New Feature Selection Model Based Ensemble Rule Classifiers Method for Datase...
New Feature Selection Model Based Ensemble Rule Classifiers Method for Datase...New Feature Selection Model Based Ensemble Rule Classifiers Method for Datase...
New Feature Selection Model Based Ensemble Rule Classifiers Method for Datase...
 
Opinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classicationOpinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classication
 

More from IJECEIAES

Remote field-programmable gate array laboratory for signal acquisition and de...
Remote field-programmable gate array laboratory for signal acquisition and de...Remote field-programmable gate array laboratory for signal acquisition and de...
Remote field-programmable gate array laboratory for signal acquisition and de...IJECEIAES
 
Detecting and resolving feature envy through automated machine learning and m...
Detecting and resolving feature envy through automated machine learning and m...Detecting and resolving feature envy through automated machine learning and m...
Detecting and resolving feature envy through automated machine learning and m...IJECEIAES
 
Smart monitoring technique for solar cell systems using internet of things ba...
Smart monitoring technique for solar cell systems using internet of things ba...Smart monitoring technique for solar cell systems using internet of things ba...
Smart monitoring technique for solar cell systems using internet of things ba...IJECEIAES
 
An efficient security framework for intrusion detection and prevention in int...
An efficient security framework for intrusion detection and prevention in int...An efficient security framework for intrusion detection and prevention in int...
An efficient security framework for intrusion detection and prevention in int...IJECEIAES
 
Developing a smart system for infant incubators using the internet of things ...
Developing a smart system for infant incubators using the internet of things ...Developing a smart system for infant incubators using the internet of things ...
Developing a smart system for infant incubators using the internet of things ...IJECEIAES
 
A review on internet of things-based stingless bee's honey production with im...
A review on internet of things-based stingless bee's honey production with im...A review on internet of things-based stingless bee's honey production with im...
A review on internet of things-based stingless bee's honey production with im...IJECEIAES
 
A trust based secure access control using authentication mechanism for intero...
A trust based secure access control using authentication mechanism for intero...A trust based secure access control using authentication mechanism for intero...
A trust based secure access control using authentication mechanism for intero...IJECEIAES
 
Fuzzy linear programming with the intuitionistic polygonal fuzzy numbers
Fuzzy linear programming with the intuitionistic polygonal fuzzy numbersFuzzy linear programming with the intuitionistic polygonal fuzzy numbers
Fuzzy linear programming with the intuitionistic polygonal fuzzy numbersIJECEIAES
 
The performance of artificial intelligence in prostate magnetic resonance im...
The performance of artificial intelligence in prostate  magnetic resonance im...The performance of artificial intelligence in prostate  magnetic resonance im...
The performance of artificial intelligence in prostate magnetic resonance im...IJECEIAES
 
Seizure stage detection of epileptic seizure using convolutional neural networks
Seizure stage detection of epileptic seizure using convolutional neural networksSeizure stage detection of epileptic seizure using convolutional neural networks
Seizure stage detection of epileptic seizure using convolutional neural networksIJECEIAES
 
Analysis of driving style using self-organizing maps to analyze driver behavior
Analysis of driving style using self-organizing maps to analyze driver behaviorAnalysis of driving style using self-organizing maps to analyze driver behavior
Analysis of driving style using self-organizing maps to analyze driver behaviorIJECEIAES
 
Hyperspectral object classification using hybrid spectral-spatial fusion and ...
Hyperspectral object classification using hybrid spectral-spatial fusion and ...Hyperspectral object classification using hybrid spectral-spatial fusion and ...
Hyperspectral object classification using hybrid spectral-spatial fusion and ...IJECEIAES
 
Fuzzy logic method-based stress detector with blood pressure and body tempera...
Fuzzy logic method-based stress detector with blood pressure and body tempera...Fuzzy logic method-based stress detector with blood pressure and body tempera...
Fuzzy logic method-based stress detector with blood pressure and body tempera...IJECEIAES
 
SADCNN-ORBM: a hybrid deep learning model based citrus disease detection and ...
SADCNN-ORBM: a hybrid deep learning model based citrus disease detection and ...SADCNN-ORBM: a hybrid deep learning model based citrus disease detection and ...
SADCNN-ORBM: a hybrid deep learning model based citrus disease detection and ...IJECEIAES
 
Performance enhancement of machine learning algorithm for breast cancer diagn...
Performance enhancement of machine learning algorithm for breast cancer diagn...Performance enhancement of machine learning algorithm for breast cancer diagn...
Performance enhancement of machine learning algorithm for breast cancer diagn...IJECEIAES
 
A deep learning framework for accurate diagnosis of colorectal cancer using h...
A deep learning framework for accurate diagnosis of colorectal cancer using h...A deep learning framework for accurate diagnosis of colorectal cancer using h...
A deep learning framework for accurate diagnosis of colorectal cancer using h...IJECEIAES
 
Swarm flip-crossover algorithm: a new swarm-based metaheuristic enriched with...
Swarm flip-crossover algorithm: a new swarm-based metaheuristic enriched with...Swarm flip-crossover algorithm: a new swarm-based metaheuristic enriched with...
Swarm flip-crossover algorithm: a new swarm-based metaheuristic enriched with...IJECEIAES
 
Predicting churn with filter-based techniques and deep learning
Predicting churn with filter-based techniques and deep learningPredicting churn with filter-based techniques and deep learning
Predicting churn with filter-based techniques and deep learningIJECEIAES
 
Taxi-out time prediction at Mohammed V Casablanca Airport
Taxi-out time prediction at Mohammed V Casablanca AirportTaxi-out time prediction at Mohammed V Casablanca Airport
Taxi-out time prediction at Mohammed V Casablanca AirportIJECEIAES
 
Automatic customer review summarization using deep learningbased hybrid senti...
Automatic customer review summarization using deep learningbased hybrid senti...Automatic customer review summarization using deep learningbased hybrid senti...
Automatic customer review summarization using deep learningbased hybrid senti...IJECEIAES
 

More from IJECEIAES (20)

Remote field-programmable gate array laboratory for signal acquisition and de...
Remote field-programmable gate array laboratory for signal acquisition and de...Remote field-programmable gate array laboratory for signal acquisition and de...
Remote field-programmable gate array laboratory for signal acquisition and de...
 
Detecting and resolving feature envy through automated machine learning and m...
Detecting and resolving feature envy through automated machine learning and m...Detecting and resolving feature envy through automated machine learning and m...
Detecting and resolving feature envy through automated machine learning and m...
 
Smart monitoring technique for solar cell systems using internet of things ba...
Smart monitoring technique for solar cell systems using internet of things ba...Smart monitoring technique for solar cell systems using internet of things ba...
Smart monitoring technique for solar cell systems using internet of things ba...
 
An efficient security framework for intrusion detection and prevention in int...
An efficient security framework for intrusion detection and prevention in int...An efficient security framework for intrusion detection and prevention in int...
An efficient security framework for intrusion detection and prevention in int...
 
Developing a smart system for infant incubators using the internet of things ...
Developing a smart system for infant incubators using the internet of things ...Developing a smart system for infant incubators using the internet of things ...
Developing a smart system for infant incubators using the internet of things ...
 
A review on internet of things-based stingless bee's honey production with im...
A review on internet of things-based stingless bee's honey production with im...A review on internet of things-based stingless bee's honey production with im...
A review on internet of things-based stingless bee's honey production with im...
 
A trust based secure access control using authentication mechanism for intero...
A trust based secure access control using authentication mechanism for intero...A trust based secure access control using authentication mechanism for intero...
A trust based secure access control using authentication mechanism for intero...
 
Fuzzy linear programming with the intuitionistic polygonal fuzzy numbers
Fuzzy linear programming with the intuitionistic polygonal fuzzy numbersFuzzy linear programming with the intuitionistic polygonal fuzzy numbers
Fuzzy linear programming with the intuitionistic polygonal fuzzy numbers
 
The performance of artificial intelligence in prostate magnetic resonance im...
The performance of artificial intelligence in prostate  magnetic resonance im...The performance of artificial intelligence in prostate  magnetic resonance im...
The performance of artificial intelligence in prostate magnetic resonance im...
 
Seizure stage detection of epileptic seizure using convolutional neural networks
Seizure stage detection of epileptic seizure using convolutional neural networksSeizure stage detection of epileptic seizure using convolutional neural networks
Seizure stage detection of epileptic seizure using convolutional neural networks
 
Analysis of driving style using self-organizing maps to analyze driver behavior
Analysis of driving style using self-organizing maps to analyze driver behaviorAnalysis of driving style using self-organizing maps to analyze driver behavior
Analysis of driving style using self-organizing maps to analyze driver behavior
 
Hyperspectral object classification using hybrid spectral-spatial fusion and ...
Hyperspectral object classification using hybrid spectral-spatial fusion and ...Hyperspectral object classification using hybrid spectral-spatial fusion and ...
Hyperspectral object classification using hybrid spectral-spatial fusion and ...
 
Fuzzy logic method-based stress detector with blood pressure and body tempera...
Fuzzy logic method-based stress detector with blood pressure and body tempera...Fuzzy logic method-based stress detector with blood pressure and body tempera...
Fuzzy logic method-based stress detector with blood pressure and body tempera...
 
SADCNN-ORBM: a hybrid deep learning model based citrus disease detection and ...
SADCNN-ORBM: a hybrid deep learning model based citrus disease detection and ...SADCNN-ORBM: a hybrid deep learning model based citrus disease detection and ...
SADCNN-ORBM: a hybrid deep learning model based citrus disease detection and ...
 
Performance enhancement of machine learning algorithm for breast cancer diagn...
Performance enhancement of machine learning algorithm for breast cancer diagn...Performance enhancement of machine learning algorithm for breast cancer diagn...
Performance enhancement of machine learning algorithm for breast cancer diagn...
 
A deep learning framework for accurate diagnosis of colorectal cancer using h...
A deep learning framework for accurate diagnosis of colorectal cancer using h...A deep learning framework for accurate diagnosis of colorectal cancer using h...
A deep learning framework for accurate diagnosis of colorectal cancer using h...
 
Swarm flip-crossover algorithm: a new swarm-based metaheuristic enriched with...
Swarm flip-crossover algorithm: a new swarm-based metaheuristic enriched with...Swarm flip-crossover algorithm: a new swarm-based metaheuristic enriched with...
Swarm flip-crossover algorithm: a new swarm-based metaheuristic enriched with...
 
Predicting churn with filter-based techniques and deep learning
Predicting churn with filter-based techniques and deep learningPredicting churn with filter-based techniques and deep learning
Predicting churn with filter-based techniques and deep learning
 
Taxi-out time prediction at Mohammed V Casablanca Airport
Taxi-out time prediction at Mohammed V Casablanca AirportTaxi-out time prediction at Mohammed V Casablanca Airport
Taxi-out time prediction at Mohammed V Casablanca Airport
 
Automatic customer review summarization using deep learningbased hybrid senti...
Automatic customer review summarization using deep learningbased hybrid senti...Automatic customer review summarization using deep learningbased hybrid senti...
Automatic customer review summarization using deep learningbased hybrid senti...
 

Recently uploaded

HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARKOUSTAV SARKAR
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network DevicesChandrakantDivate1
 
AIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsAIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsvanyagupta248
 
Query optimization and processing for advanced database systems
Query optimization and processing for advanced database systemsQuery optimization and processing for advanced database systems
Query optimization and processing for advanced database systemsmeharikiros2
 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiessarkmank1
 
Electromagnetic relays used for power system .pptx
Electromagnetic relays used for power system .pptxElectromagnetic relays used for power system .pptx
Electromagnetic relays used for power system .pptxNANDHAKUMARA10
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdfKamal Acharya
 
Introduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdfIntroduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdfsumitt6_25730773
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdfAldoGarca30
 
Standard vs Custom Battery Packs - Decoding the Power Play
predict outcomes in many real-life situations. High accuracy is a key indicator of a successful prediction model. Building an accurate classifier is therefore an important goal, and rich datasets make this task easier and more effective [2]. Classifying small datasets efficiently is essential, since some real situations cannot provide a sufficient number of cases; a limited training set is difficult to learn from and, consequently, difficult to base decisions on.

In many multivariable classification or regression problems, such as estimation or forecasting, we have a training set T_p = {(x_i, t_i)} of p pairs of an input vector x ∈ ℝ^n and a scalar target t. According to Vapnik's definition, T_p is a small dataset in the following sense: "For estimating functions with VC dimension h, we consider the size p of data to be small if the ratio p/h is small (say p/h < 20)" [3]. The problem with a small dataset is that, unless it is carefully collected, it is not a representative sample. Non-representative instances deprive the learner of information because of the gaps between instances; thus, the model does not generalize well.

Many works in the literature have addressed the small-data problem with different methods. One common approach is to increase the size of the data by adding artificial instances [4], but this approach lacks data credibility and does not reflect real-life use. Some researchers have used feature-selection methods [5-8], whereas a novel technique using multiple runs for model development was proposed by [9], among others. A simple solution is desirable precisely when the problem grows increasingly complex, a philosophy expressed by Occam's razor [1].
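As a rough illustration of Vapnik's criterion (not part of the original paper), the ratio p/h can be checked programmatically. In the sketch below the VC dimension h is approximated by the number of attributes, as this paper later does in Table 1; the function name is illustrative.

```python
def is_small_dataset(num_instances: int, vc_dim: int, threshold: float = 20.0) -> bool:
    """Vapnik's rule of thumb: a dataset of size p is 'small' when p/h < 20,
    where h is the VC dimension of the function class being estimated."""
    return num_instances / vc_dim < threshold

# Approximating h by the number of attributes (as in Table 1 of this paper):
print(is_small_dataset(90, 9))    # Postoperative-patient-data: 90/9 = 10 -> True (small)
print(is_small_dataset(540, 39))  # cylinder-bands: 540/39 = 13.85 -> True (small)
```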
The classification literature has shown that very simple rules can achieve high accuracy on many datasets [10]. OneR is one of the simplest and most widely used machine learning algorithms for building a basic classifier. Its trade-off between simplicity and performance [10] makes OneR slightly less accurate than state-of-the-art classification algorithms [11, 12], although it sometimes outperforms them [13, 14]. Its main advantage is that it balances the best possible accuracy with a model that is still simple enough for humans to understand [12].

OneR is a single-attribute-based classifier: it involves only one attribute at classification time. The single-attribute concept is powerful when that attribute directly and positively influences the classification accuracy of the dataset. Yet not every attribute contributes positively to the classification process, which is what makes the choice of the single attribute so important. A single-attribute rule can be more effective than complex methods when the dataset is hard to learn from because it is simple, small, noisy, or complex. A study by [15] used the single-attribute concept by creating multiple one-dimensional classifiers from the original dataset in the training phase and combining their results in the prediction phase; unlike OneR, that method considers the contributions of all attributes at prediction time.

Feature selection is a data-mining pre-processing step widely used to improve classification and reduce running time. It is effective in reducing a dataset's dimensionality by eliminating non-contributing attributes, and it uses different techniques to arrive at a single attribute or a subset of attributes [16, 17]. Moreover, it has proven effective in improving predictive accuracy across various applications [18-20].
In this paper, we tackle the problem of classifying small datasets by exploiting the power of a pertinent single attribute through the SAB-HR classifier. SAB-HR resembles the OneR classifier in using a single attribute at classification time, but differs in that, instead of generating a rule for each attribute, it employs a feature-selection method to select the attribute that is least heterogeneous among the attributes in the set. We calculate the H-Ratio [21] for each attribute (att) and identify the attribute with the lowest H-Ratio value (attH-Ratio). We then use the pair (attH-Ratio, c), where c is the class value, to learn and classify the small dataset. The results were encouraging and showed a significant improvement over the classical OneR classifier. In addition, we created classifiers in the same manner as SAB-HR but with different criteria for selecting the pertinent single attribute: using IG and GR in the feature-selection process, we created the SAB-IG and SAB-GR classifiers, respectively, and compared the new SAB-HR classifier individually against each of them.

The remainder of this paper is organized as follows: Section 2 reviews the background of our work. Section 3 proposes the SAB-HR classifier; the experiments and a discussion of the findings appear in subsections 3.1 and 3.2, respectively. Finally, Section 4 concludes the paper.

2. BACKGROUND
In this section, we review the techniques used in this study.

2.1. OneR classifier
OneR, short for "One Rule", was introduced by Rob Holte [22, 10]. It is one of the most primitive techniques, based on a one-level decision tree: it creates one rule for each attribute in the dataset, then selects the rule with the minimum classification error as its "one rule". To create a rule for an attribute, it constructs a frequency table of that attribute against the class [22]. Figure 1 shows the pseudocode of the OneR algorithm.
OneR has been shown to work distinctively well in practice on real-world data and can compete with state-of-the-art classification algorithms in some situations [13, 14, 23]. Because OneR uses a single attribute for classification, many consider it a feature-selection method whose selected feature subset contains a single attribute [24]. Compared with the baseline classifier ZeroR [14], OneR is one step beyond: both are useful for establishing a minimum-standard baseline for other classification algorithms, and OneR's accuracy is always higher than, or at least equal to, the baseline classifier's when evaluated on the training data. The authors of [25] proposed enhancements to OneR's performance that address two issues: the quantization of continuous-valued attributes and the treatment of missing values.

Figure 1. The pseudocode of the OneR algorithm [15]

    For each attribute (att):
        For each value of that att, make a rule as follows:
            Count how often each class value appears
            Find the most frequent class
            Make the rule assign that class to this value of the att
        Calculate the total error of the rules of each att
    Choose the att with the smallest total error
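To make the procedure above concrete, here is a minimal Python sketch of OneR for nominal attributes. It is an illustration, not the WEKA implementation; the function and variable names are our own.

```python
from collections import Counter, defaultdict

def one_r(rows, class_idx):
    """OneR for nominal data.
    rows: list of tuples; class_idx: index of the class column.
    Returns (best_attribute_index, {value: predicted_class}, error_count)."""
    n_cols = len(rows[0])
    best = None
    for att in range(n_cols):
        if att == class_idx:
            continue
        # Frequency table: attribute value -> class counts
        freq = defaultdict(Counter)
        for row in rows:
            freq[row[att]][row[class_idx]] += 1
        # Rule: each attribute value predicts its most frequent class
        rule = {v: counts.most_common(1)[0][0] for v, counts in freq.items()}
        # Total error: instances not covered by the majority class of their value
        errors = 0
        for v, counts in freq.items():
            errors += sum(counts.values()) - counts[rule[v]]
        if best is None or errors < best[2]:
            best = (att, rule, errors)
    return best

# Toy example: attribute 0 predicts the class perfectly, attribute 1 does not.
data = [("a", "x", "yes"), ("a", "y", "yes"), ("b", "x", "no"), ("b", "y", "no")]
att, rule, err = one_r(data, class_idx=2)
print(att, err)  # 0 0
```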
2.2. Feature selection
Feature selection methods attempt to find the minimal subset of features that does not significantly decrease classification accuracy. They can be categorized as wrapper methods or filter methods [17]; surveys by [17] and [16] cover many such methods. A wrapper method is a model-based approach in which the quality of the selected features is measured by the classification accuracy of the classification algorithm being used; some wrappers use a greedy search to select the subset [16]. A filter method, by contrast, is a model-free approach: features are selected independently of the classification algorithm, based on general measurable characteristics such as Information Gain, Gain Ratio, Pearson Correlation, Mutual Information (MI) [16], and the Heterogeneity Ratio [21].

In this paper, we use filter-based feature selection (i.e., attribute evaluation) and focus on three of these measures: IG, GR, and H-Ratio. A brief description of each follows.

- Information gain [21] measures the amount of information an attribute gives about the class. It is defined by formula (1):

    IG(att) = H(Y) − H_att(Y)    (1)

where H_att(Y) is the entropy of class Y conditioned on attribute att, and H(Y) is the entropy of class Y. Entropy is the quantity of information contained in or delivered by a source of information; it is also used to measure relevancy and is defined by formula (2):

    H(Y) = Σ_i −P(v_i) log2 P(v_i)    (2)

- Gain ratio [26] is the ratio of information gain to intrinsic information; it determines the relevancy of an attribute.
GR is calculated using formula (3):

    GR(att) = IG(att) / H(att)    (3)

where H(att) = Σ_j −P(v_j) log2 P(v_j) and P(v_j) is the probability of attribute value v_j among all the attribute's values.

- Heterogeneity ratio is a measure defined by [21] that quantifies how heterogeneous a nominal attribute is among the dataset's attributes; equivalently, it quantifies the homogeneity of a set of instances sharing the same attribute value. The H-Ratio is defined by formula (4):

    H-Ratio(att) = (H(att) + H_att(Y)) / H(Y)    (4)

The ratio H(att)/H(Y) rewards instance homogeneity with respect to the attribute and the class simultaneously, whereas the ratio H_att(Y)/H(Y) rewards the homogeneity of instances that belong to the same class and share the same attribute value.

3. RESULTS AND ANALYSIS
In this section, we introduce the new single-attribute-based classifier SAB-HR for classifying small datasets. The new algorithm uses a new criterion to select the powerful pertinent single attribute that will contribute to the classification. Unlike OneR, SAB-HR does not generate a rule for each attribute; instead, it calculates the H-Ratio of each attribute in the dataset to determine the attribute that is least heterogeneous among the others. The attribute with the lowest heterogeneity ratio (attH-Ratio) is paired with the class c as (attH-Ratio, c) and used in the classification process, while the remaining attributes are eliminated. The power of the single attribute selected by SAB-HR lies in its homogeneity with the other attributes, through which it provides enough information for the classifier to predict correctly; attH-Ratio is a representative attribute that is sufficient for small datasets. The algorithmic description of SAB-HR is presented in Figure 2.

Figure 2. The pseudocode of the SAB-HR algorithm

    For each attribute (att):
        Calculate the H-Ratio of att
    Choose the att with the smallest H-Ratio value (attH-Ratio)
    Remove all attributes in the dataset except the pair (attH-Ratio, c)
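To make measures (1), (2), and (4) and the selection step of Figure 2 concrete, here is a minimal Python sketch for nominal attributes. The helper names are illustrative and not from the paper.

```python
from collections import Counter
from math import log2

def entropy(values):
    """H(Y) = sum_i -P(v_i) * log2 P(v_i)  -- formula (2)."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def cond_entropy(att_values, class_values):
    """H_att(Y): class entropy weighted over the attribute's value groups."""
    n = len(class_values)
    groups = {}
    for a, y in zip(att_values, class_values):
        groups.setdefault(a, []).append(y)
    return sum(len(g) / n * entropy(g) for g in groups.values())

def info_gain(att_values, class_values):
    """IG(att) = H(Y) - H_att(Y)  -- formula (1)."""
    return entropy(class_values) - cond_entropy(att_values, class_values)

def h_ratio(att_values, class_values):
    """H-Ratio(att) = (H(att) + H_att(Y)) / H(Y)  -- formula (4).
    Assumes H(Y) > 0, i.e., the class is not constant."""
    return (entropy(att_values) + cond_entropy(att_values, class_values)) / entropy(class_values)

def select_sab_hr(attributes, class_values):
    """Selection step of Figure 2: keep the attribute with the lowest H-Ratio.
    attributes: {name: list of nominal values, aligned with class_values}."""
    return min(attributes, key=lambda name: h_ratio(attributes[name], class_values))

y = ["yes", "yes", "no", "no"]
atts = {"f1": ["a", "a", "b", "b"],   # two values, each pure in one class
        "f2": ["p", "q", "r", "s"]}   # four distinct values: higher H(att)
print(select_sab_hr(atts, y))         # f1  (H-Ratio 1.0 vs 2.0)
```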
3.1. Experiments
The following experiments evaluate the performance of the new SAB-HR classifier on small datasets and compare it with other single-attribute classifiers that use different criteria, such as IG and GR, for selecting the single attribute during the feature-selection process. We used the well-known open-source software WEKA [27]. The datasets were obtained from the UCI Machine Learning Repository [28]; we selected 12 small datasets according to Vapnik's definition [3]. Table 1 lists the main characteristics of the collected datasets in terms of number of instances, number of attributes, and Vapnik's ratio for determining the dataset's size; the number beside each dataset name is its reference in the figures.

OneR was used as the base classifier. A 10-fold cross-validation and a paired t-test with a confidence level of 95% were used to determine whether differences in classification accuracy were statistically significant; significant differences are underlined in the tables. We compared the methods with respect to average classification accuracy and the number of datasets on which each method achieved better results; better results are shown in bold. Each technique is named with the abbreviation SAB (single-attribute-based), suffixed with an abbreviation of the measure used for selecting the single attribute in the feature-selection process. The new classifiers are thus named SAB-HR, SAB-IG, and SAB-GR.
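The significance test described above can be sketched with a paired t-test over matched per-fold accuracies. The accuracy arrays below are made-up placeholders, not the paper's results, and the implementation computes only the t statistic (scipy would supply the p-value directly).

```python
from math import sqrt

def paired_t_statistic(acc_a, acc_b):
    """t statistic for a paired t-test over matched accuracy measurements."""
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / sqrt(var / n)

# Made-up per-fold accuracies for two classifiers (NOT the paper's numbers):
a = [0.70, 0.72, 0.68, 0.71, 0.69, 0.73, 0.70, 0.72, 0.71, 0.70]
b = [0.60, 0.63, 0.58, 0.61, 0.59, 0.64, 0.62, 0.60, 0.61, 0.62]
t = paired_t_statistic(a, b)
# With n - 1 = 9 degrees of freedom, |t| > 2.262 rejects equality at the 95% level.
print(t > 2.262)  # True
```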
In our experiments, we applied the feature-selection process using the different measures (H-Ratio, IG, and GR) to select the pertinent single attribute, eliminated the remaining (i.e., unselected) attributes, and classified with the pair (pertinent single attribute, class).

Table 1. Characteristics of the datasets used in the experiments

    #    Dataset                      # instances   # attributes   # instances / # attributes
    1    Postoperative-patient-data        90             9            10
    2    contact-lenses                    24             4             6
    3    weather-nominal                   14             4             3.5
    4    colic.ORIG                       368            27            13.63
    5    cylinder-bands                   540            39            13.85
    6    dermatology                      366            34            10.76
    7    flags                            194            29             6.69
    8    lung-cancer                       32            56             0.57
    9    spect_train                       80            22             3.64
    10   sponge                            72            45             1.6
    11   zoo                              101            17             5.94
    12   primary-tumor                    339            17            19.94

3.2. Results and discussion
The experimental results are combined in Table 2, which compares the performance of the classical OneR with the newly created classifiers. Noticeably, the classical OneR performs poorly compared with the new classifiers. The overall average accuracies of SAB-HR, SAB-IG, and SAB-GR are 64.6%, 49.72%, and 61.31%, respectively, against 48.53% for the classical OneR classifier. Furthermore, the difference in average accuracy between SAB-HR and the classical OneR is statistically significant. The average differences between the classical OneR and SAB-HR, SAB-IG, and SAB-GR are 16.07%, 1.19%, and 12.78%, respectively, in favor of the new classifiers.

Table 2. Performance summary of the new classifiers compared with the classical OneR classifier

    Dataset                       OneR   SAB-HR   OneR   SAB-IG   OneR   SAB-GR
    Postoperative-patient-data   67.78    71.11  67.78    71.11  67.78    68.89
    contact-lenses               70.83    70.83  70.83    70.83  70.83    70.83
    weather-nominal              42.86    57.14  42.86    50     42.86    50
    colic.ORIG                   67.66    65.76  67.66    67.66  67.66    63.86
    cylinder-bands               49.63    67.59  49.63    49.63  49.63    65
    dermatology                  49.73    36.07  49.73    50.27  49.73    36.07
    flags                         4.64    33.51   4.64     4.64   4.64    42.78
    lung-cancer                  87.5     96.88  87.5     87.5   87.5     78.13
    spect_train                  67.5     92.5   67.5     75     67.5     75
    sponge                        4.17    98.61   4.17     4.17   4.17    95.83
    zoo                          42.57    60.4   42.57    42.57  42.57    60.4
    primary-tumor                27.43    24.78  27.43    23.3   27.43    28.9
    Average accuracy             48.53    64.6   48.53    49.72  48.53    61.31
    # of better datasets          3        8      1        4      3        8
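The summary figures reported above (average accuracies, their difference, and the number of datasets won by SAB-HR) can be verified directly from the per-dataset accuracies of the OneR and SAB-HR columns of Table 2, transcribed below.

```python
# Per-dataset accuracies from Table 2 (OneR vs SAB-HR columns), transcribed for verification.
oner   = [67.78, 70.83, 42.86, 67.66, 49.63, 49.73, 4.64, 87.5, 67.5, 4.17, 42.57, 27.43]
sab_hr = [71.11, 70.83, 57.14, 65.76, 67.59, 36.07, 33.51, 96.88, 92.5, 98.61, 60.4, 24.78]

avg_oner = sum(oner) / len(oner)        # ~48.53
avg_sab_hr = sum(sab_hr) / len(sab_hr)  # ~64.60
wins = sum(h > o for o, h in zip(oner, sab_hr))  # datasets where SAB-HR is strictly better

print(avg_oner, avg_sab_hr, avg_sab_hr - avg_oner, wins)
```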
Figures 3(a)-(c) compare the new classifiers with the classical OneR classifier in terms of average accuracy. The least-heterogeneous-attribute classifier (SAB-HR) ranks first, followed by SAB-GR with a slight difference (3.29%), while the SAB-IG classifier lags well behind the other two and performs almost identically to the classical OneR; their two lines are approximately identical, as shown in Figure 3(b). The attribute attIG used by SAB-IG carries the largest amount of information about the class, but in the small-dataset case it may be more important to consider the consistency of the attribute with the other attributes, owing to the limited number of instances. Doing so minimizes the gaps between the instances in the dataset: the homogeneity of the dataset makes it more representative and, thus, easier to learn accurately. In addition, the new classifiers achieved better average accuracy on more datasets than OneR, as shown in Table 2. Figures 4(a)-(c) show each new classifier in comparison with OneR; the number of datasets on which each performs better is 8, 4, and 8 for SAB-HR, SAB-IG, and SAB-GR, respectively, against 3, 1, and 3 for the OneR classifier.

Figure 3. Comparison of the new classifiers versus the OneR classifier in terms of average accuracy

Figure 4. Comparison of the new classifiers versus the OneR classifier in terms of the number of better datasets achieved
From Table 2 it is clear that selecting the single attribute with the lowest classification error rate, as the OneR classifier does, is not always optimal, especially for small datasets; a more deliberate technique for selecting the single attribute has a positive impact on classification accuracy and on the number of datasets won. Table 3 highlights the new classifier SAB-HR, which uses homogeneity for pertinent single-attribute selection, by comparing it with the other classifiers created for the same purpose (SAB-IG and SAB-GR). The results show that SAB-HR's average accuracy exceeds SAB-IG's by nearly 14.88%, while the difference from SAB-GR is only 1.37%. In general, the performance of the SAB-HR classifier is remarkable compared with both the classical OneR and the other new classifiers. Figures 5(a) and (b) show the per-dataset difference in performance between SAB-HR and the other classifiers in terms of average accuracy.

Table 3. A comparison between the new classifier SAB-HR and the other classifiers

    Dataset                      SAB-HR   SAB-IG   SAB-HR   SAB-GR
    Postoperative-patient-data    71.11    71.11    71.11    68.89
    contact-lenses                70.83    70.83    70.83    70.83
    weather-nominal               57.14    50       57.14    50
    colic.ORIG                    65.76    67.66    65.76    63.86
    cylinder-bands                67.59    49.63    67.59    65
    dermatology                   36.07    50.27    36.07    36.07
    flags                         33.51     4.64    33.51    42.78
    lung-cancer                   96.88    87.5     96.88    78.13
    spect_train                   92.5     75       75       75
    sponge                        98.61     4.17    93.06    95.83
    zoo                           60.4     42.57    60.4     60.4
    primary-tumor                 24.78    23.3     24.78    28.9
    Average accuracy              64.6     49.72    62.68    61.31
    # of better datasets           8        2        5        3

Figure 5. Comparison of the new classifiers versus the SAB-HR classifier in terms of average accuracy

In summary, for small datasets, a simple classifier such as OneR is a reasonable starting point, and employing a feature-selection method that selects a single attribute using a common measure such as H-Ratio, IG, or GR improves its results. Moreover, considering the homogeneity of the attribute during pertinent single-attribute selection positively impacts the classification process: it helps to reduce the gaps between instances, yielding a more representative dataset that provides enough information for the classifier to learn and achieve decent average accuracy. From these results, a single-attribute-based classifier can be powerful for classifying small datasets when the pertinent attribute is well selected, as is the case with the new SAB-HR, which we recommend among the classifiers tested in this work.

4. CONCLUSION
In this work we have explored the power of a single attribute selected using an effectual feature-selection criterion. We have addressed the small-dataset mining problem, since it is not always easy to gather a large amount of real data. The new algorithm SAB-HR is a pertinent single-attribute-based classifier
that combines simplicity with effectiveness to contribute positively to classifying small datasets. The single attribute is selected to be the most homogeneous with the other attributes in the dataset, which gives more consistency between instances. Our empirical results used 12 benchmark datasets that are small according to Vapnik's definition. The results show that SAB-HR significantly outperforms the classical OneR classifier. In addition, we compared the performance of SAB-HR with other single-attribute classifiers that use different attribute-selection criteria (e.g., IG and GR), and all the results confirmed the effectiveness of the SAB-HR classifier. In future work, we intend to investigate algorithms that improve the classification accuracy of small datasets using more advanced classifiers, and we aim to propose further simple classification methods.

ACKNOWLEDGEMENTS
This research was funded by the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University through the Fast-track Research Funding Program. The author is grateful for all this support in conducting this research and making it successful.

REFERENCES
[1] T. Mitchell, Machine Learning, McGraw Hill, 1997.
[2] T. Van Gemert, "On the influence of dataset characteristics on classifier performance," Bachelor Thesis, Faculty of Humanities, Utrecht University, pp. 1-13, 2017.
[3] V. Vapnik, Statistical Learning Theory, Wiley, New York, 2000.
[4] N. H. Ruparel, N. M. Shahane, and D. P. Bhamare, "Learning from small data set to build classification model: A survey," IJCA Proceedings on International Conference on Recent Trends in Engineering and Technology (ICRTET), vol. 4, pp. 23-26, 2013.
[5] X. Chen and J. C. Jeong, "Minimum reference set based feature selection for small sample classifications," Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 153-160, 2007.
[6] S. L. Happy, R. Mohanty, and A. Routray, "An effective feature selection method based on pair-wise feature proximity for high dimensional low sample size data," 25th European Signal Processing Conference (EUSIPCO), pp. 1574-1578, 2017.
[7] A. Golugula, G. Lee, and A. Madabhushi, "Evaluating feature selection strategies for high dimensional, small sample size datasets," Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 949-952, 2011.
[8] I. Soares, J. Dias, H. Rocha, M. do Carmo Lopes, and B. Ferreira, "Feature selection in small databases: A medical-case study," IFMBE Proceedings: XIV Mediterranean Conference on Medical and Biological Engineering and Computing, vol. 57, pp. 808-813, 2016.
[9] T. Shaikhina, D. Lowe, S. Daga, D. Briggs, R. Higgins, and N. Khovanova, "Machine learning for predictive modelling based on small data in biomedical engineering," IFAC-PapersOnLine, vol. 28, pp. 469-474, 2015.
[10] R. C. Holte, "Very simple classification rules perform well on most commonly used datasets," Machine Learning, vol. 11, pp. 63-91, 1993.
[11] A. K. Dogra and T. Wala, "A comparative study of selected classification algorithms of data mining," International Journal of Computer Science and Mobile Computing, vol. 4, no. 6, pp. 220-229, 2015.
[12] F. Alam and S. Pachauri, "Comparative study of J48, Naive Bayes and One-R classification technique for credit card fraud detection using WEKA," Advances in Computational Sciences and Technology, vol. 10, no. 6, pp. 1731-1743, 2017.
[13] V. S. Parsania, N. N. Jani, and N. H. Bhalodiya, "Applying Naïve Bayes, BayesNet, PART, JRip and OneR algorithms on hypothyroid database for comparative analysis," IJDI-ERET, vol. 3, pp. 1-6, 2015.
[14] C. Nasa and Suman, "Evaluation of different classification techniques for WEB data," International Journal of Computer Applications, vol. 52, pp. 34-40, 2012.
[15] L. Du and Q. Song, "A simple classifier based on a single attribute," Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications (HPCC-2012) & 9th IEEE International Conference on Embedded Software and Systems (ICESS-2012), pp. 660-665, 2012.
[16] M. Dash and H. Liu, "Feature selection for classification," Intelligent Data Analysis, vol. 1, pp. 131-156, 1997.
[17] H. Liu and L. Yu, "Toward integrating feature selection algorithms for classification and clustering," IEEE Transactions on Knowledge and Data Engineering, vol. 17, pp. 491-502, 2005.
[18] M. Ramaswami and R. Bhaskaran, "A study on feature selection techniques in educational data mining," Journal of Computing, vol. 1, pp. 7-11, 2009.
[19] Y. Pan, "A proposed frequency-based feature selection method for cancer classification," Master's Thesis, Department of Computer Science, Western Kentucky University, 2017.
[20] I. Sangaiah, A. V. A. Kumar, and A. Balamurugan, "An empirical study on different ranking methods for effective data classification," Journal of Modern Applied Statistical Methods, vol. 14, pp. 35-52, 2015.
[21] M. Trabelsi, N. Meddouri, and M. Maddouri, "A new feature selection method for nominal classifier based on formal concept analysis," Procedia Computer Science, vol. 112, pp. 186-194, 2017.
[22] R. Holte, "Machine learning," Proceedings of the Tenth International Conference, University of Massachusetts, Amherst, June 1993.
[23] D. I. Morariu, R. G. Cretulescu, and M. Breazu, "Feature selection in document classification," The Fourth International Conference in Romania of Information Science and Information Literacy, Romania, 2013.
[24] J. Novakovic, "Using information gain attribute evaluation to classify sonar targets," 17th Telecommunications Forum, pp. 1351-1354, 2009.
[25] C. G. Nevill-Manning, G. Holmes, and I. H. Witten, "The development of Holte's 1R classifier," Proceedings 1995 Second New Zealand International Two-Stream Conference on Artificial Neural Networks and Expert Systems, 1995.
[26] J. Novaković, P. Strbac, and D. Bulatović, "Toward optimal feature selection using ranking methods and classification algorithms," Yugoslav Journal of Operations Research, vol. 21, pp. 119-135, 2011.
[27] University of Waikato, "WEKA: The Waikato environment for knowledge analysis," 2018. [Online]. Available: http://www.cs.waikato.ac.nz/ml/weka/
[28] UCI, "UCI machine learning repository," 2018. [Online]. Available: http://archive.ics.uci.edu/ml/machine-learningdatabases/