Integrated Intelligent Research (IIR) International Journal of Data Mining Techniques and Applications
Volume: 03 Issue: 01 June 2014, Page No. 24- 29
ISSN: 2278-2419
Performance Analysis of Selected Classifiers in
User Profiling
Manas Ranjan Patra, V. Mohan Patro
Department of Computer Science, Berhampur University, Berhampur-760007, Odisha, India
Email: mrpatra12@gmail.com, vmpatro@gmail.com
Abstract-User profiles can serve as indicators of personal
preferences, which can be used effectively while providing
personalized services. Building user profiles that capture
accurate information about individuals has been a daunting task.
Several attempts have been made by researchers to extract
information from different data sources to build user profiles
in different application domains. Towards this end, in this
paper we employ different classification algorithms to create
accurate user profiles based on information gathered from
demographic data. The aim of this work is to analyze the
performance of five of the most effective classification methods,
namely Bayesian Network (BN), Naïve Bayesian (NB), Naïve
Bayes Updateable (NBU), J48, and Decision Table (DT). Our
simulation results show that, in general, J48 has the highest
classification accuracy with the lowest error rate.
On the other hand, the Naïve Bayesian and Naïve
Bayes Updateable classifiers are found to need the least time
to build the classification model.
Keywords–Classifiers; data mining; Confusion matrix; evaluation
I. INTRODUCTION
A user profile refers to the explicit digital representation of a
person's identity. A user profile can also be considered as the
computer representation of a user model. User profiles describe
user interests, characteristics, behavior and preferences. A user
profile is a set of information about a given user in a given
context and over a specific period of time [1]. User profiling is
the practice of gathering, organizing and interpreting the user
profile information. It refers to the construction of a profile by
extracting it from a set of data. User profiles can be found on
recommender systems, or dynamic websites (such as online
social networking sites or bulletin boards). Behind every
instance of personalization, there is a profile that stores the
user preferences, context of use and other information that can
be used to deliver a user experience tailored to the individual
needs and preferences [2].
Recommendation systems are widely
adopted in e-commerce businesses to help customers
locate products they would like to purchase, and the major challenge
for these systems is bridging the gap between the physical
characteristics of data and the users’ perceptions [3]. To
address this challenge, employing user profiles to improve
accuracy becomes essential. Targeting customers based on
inaccurate user profile data will not be as effective as targeting
customers based on accurate user profile data. Thus, companies
preprocess user profile data as part of their effort to
maintain its accuracy [4].
Yet another application which can benefit from user profiling is
E-governance, wherein personalized online services can be
provided to citizens. User profiling would help e-Gov service
systems communicate effectively and efficiently with their
users. With the provision of user profiles, the systems can
recognize the user every time he/she logs on to the system and
the user does not need to enter the same personal information
again. This will make users’ interactions with the system easier
and faster [5].
Several techniques are being applied to build user
profiles which can be effectively used in different application
domains to provide customized services. Among others,
classification is one of the widely used techniques and has
been successfully applied to classify users based on
information gathered from different sources. However, the
performance of classification techniques varies greatly in
terms of how accurately they classify the data for building
user profiles.
In [6], Alicia et al. compared the performance of NB, J48 and
DT in terms of the rate of correctly classified instances, the time
taken to build the model, the area under the curve, etc. In [7],
Cufoglu et al. compared Naïve Bayesian Tree (NBTree), Sequential
Minimal Optimization (SMO), Naïve Bayesian (NB), Instance-Based
Learner (IB1), J48 (a version of C4.5), Classification and
Regression Tree (SimpleCART) and Iterative Dichotomiser
Tree (ID3) classifiers in personalization applications. In
[6], using an automobile data set, the authors obtained a
correctly classified instances rate of only up to 73%, whereas in
our case, using a user profile data set, we could achieve more
than 95%. For both data sets the time taken to build the
model is the highest for the Decision Table classifier in
comparison to the other classifiers used in the experimentation.
From [7] and our observations it is clear that there is no
substantial increase in the time taken to build the model when
the number of instances is within 1000. However, the time
taken to build the model increases steadily when the
number of instances is in the thousands.
In [8], Panda et al.
compared the performance of NB, ID3 and J48 algorithms for
network intrusion detection. According to the authors, NB
performs better than ID3 and J48 with respect to overall
classification accuracy. However, the authors added that
Decision Trees (ID3 and J48) are robust in detecting new
intrusive behaviours in comparison to NB.
In our work, we
have applied all the classifiers available in the WEKA (Waikato
Environment for Knowledge Analysis) workbench [9] to the
user profile dataset and observed that 20 classifiers resulted in
classification accuracy above 98%. Out of these 20 classifiers
we have selected the top 5 classifiers, namely, Bayesian
Network (BN), Naïve Bayesian (NB), Naïve Bayes
Updateable (NBU), J48 (Java implementation of C4.5), and
Decision Table (DT) for further analysis.
II. CLASSIFICATION TECHNIQUES
Bayesian networks are probability based, are used for
reasoning and decision making under uncertainty, and rely
heavily on Bayes’ rule, which is applied as follows [10]:
• Assume attributes Ai, where i = 1, 2, 3, …, n, which
take values ai, where i = 1, 2, 3, …, n.
• Assume C is the class label and E = (a1, a2, …, an) is an
unclassified test instance.
• E is classified into the class C* with the maximum
posterior class probability P(C | E). Since the evidence P(E) is
the same for every class, it can be dropped from the maximization:
$$C^{*} = \arg\max_{C} P(C \mid E) = \arg\max_{C} P(C)\,P(E \mid C)$$
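The maximum-posterior rule above can be illustrated with a minimal Python sketch. The class names, priors and conditional probabilities below are made-up illustrative numbers, not values from our data set, and the likelihood P(E | C) is computed as a product of per-attribute probabilities, which is a simplifying assumption (the naïve independence assumption discussed in this section).

```python
from math import prod


def classify(instance, priors, cond_probs):
    """Return the class maximizing P(C) * P(E | C), plus all class scores.

    cond_probs[cls][i][value] is P(attribute i takes `value` | cls).
    Assumption: P(E | C) factorizes over attributes.
    """
    scores = {}
    for cls, prior in priors.items():
        likelihood = prod(
            cond_probs[cls][i][value] for i, value in enumerate(instance)
        )
        scores[cls] = prior * likelihood
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical probabilities for two attributes (education, marital status).
priors = {">50K": 0.25, "<=50K": 0.75}
cond_probs = {
    ">50K": [{"college": 0.7, "hs": 0.3}, {"married": 0.8, "single": 0.2}],
    "<=50K": [{"college": 0.3, "hs": 0.7}, {"married": 0.4, "single": 0.6}],
}

predicted, scores = classify(("college", "married"), priors, cond_probs)
print(predicted, scores)
```

The scores are unnormalized posteriors; dividing each by their sum would give P(C | E), but the normalization does not change which class wins.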
Bayesian networks can represent uncertain attribute
dependencies; however, it has been proven that learning an
optimal Bayesian network is NP (Non-deterministic
Polynomial) hard [11]. The Naïve Bayesian classifier is one of the
Bayesian classifier techniques and is regarded as the state of
the art among Bayesian classifiers. Many works have
shown that Naïve Bayesian classifiers are among the most
computationally efficient, effective and simple algorithms for
machine learning and data mining applications [12][13][14].
Naïve Bayesian classifiers assume that all attributes are
independent of one another, given the class label. Based on this
assumption, the Bayesian rule is modified as follows to
define the Naïve Bayesian rule [10]:
$$C^{*} = \arg\max_{C} P(C) \prod_{i=1}^{n} P(a_i \mid C)$$
where C* is the class assigned to the test instance E = (a1, a2, …, an).
Naïve Bayesian classifiers are normally used for interactive
applications. However, because of the naïve conditional independence
assumption, optimal accuracy cannot always be achieved.
Naïve Bayes Updateable is the updateable version of Naïve Bayes in
the WEKA (Waikato Environment for Knowledge Analysis) workbench.
This classifier uses a default precision of 0.1 for numeric
attributes when buildClassifier (a method in the Java code) is
called with zero training instances.
J48 is an open source Java implementation of the C4.5 algorithm in
the WEKA data mining tool. C4.5 builds decision trees from a
set of training data in the same way as ID3, using the concept
of information entropy [15].
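The entropy-based choice of a splitting attribute can be sketched as follows. This is a minimal illustration of plain information gain as used by ID3; C4.5 itself ranks splits by gain ratio, a normalized variant that is not shown here. The four toy rows and their labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(
        -(count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, labels, attr_index):
    """Entropy reduction obtained by splitting on one nominal attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in partitions.values()
    )
    return entropy(labels) - remainder


# Hypothetical samples: (education, marital status) with an income class.
rows = [("hs", "married"), ("hs", "single"),
        ("college", "married"), ("college", "single")]
labels = ["<=50K", "<=50K", ">50K", ">50K"]

gain_education = information_gain(rows, labels, 0)  # separates the classes
gain_marital = information_gain(rows, labels, 1)    # carries no information
print(gain_education, gain_marital)
```

A tree builder would pick the attribute with the larger gain (here the first one) as the test at the current node and recurse on each partition.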
The training data is a set S = s1, s2, … of already classified
samples. Each sample si consists of a p-dimensional
vector (x1,i, x2,i, …, xp,i), where the xj represent attributes or
features of the sample, as well as the class in which si falls.
A decision table is based on logical relationships, just like a truth
table. It is a tool that helps us to examine both the completeness
and the inconsistency of combinations of conditions [6]. Decision
tables, like decision trees or neural nets, are classification
models used for prediction. They are induced by machine
learning algorithms. A decision table consists of a hierarchical
table in which each entry in a higher level table is decomposed
into values of a pair of additional attributes to form another
table. The structure is similar to dimensional stacking.
III. EXPERIMENTAL SETUP
A. Data Set
For our experimentation we have used the Census Income data
set [16], which has 15 attributes, namely, Age, Work-class,
Final-weight, Education, Education-num, Marital-status,
Occupation, Relationship, Race, Sex, Native-country, capital-gain,
capital-loss, Hours-per-week and Income. This data set has
been used by different authors for different purposes. It has
been used to classify whether the income of a person is greater than
50K based on several census parameters, such as age,
education and marital status.
B. Kappa Statistics (KS)
Kappa is a chance-corrected measure of agreement between the
classification and the true classes. The Kappa statistic is used to
assess the accuracy of any particular measuring case, and to
distinguish between the reliability of the data collected
and their validity. The value of kappa is less than or equal to 1,
where a value of 1 indicates perfect agreement and a value of 0
indicates agreement no better than chance.
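As an illustration, the kappa statistic can be computed from a confusion matrix with the standard Cohen's kappa formula, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The sketch below assumes this standard definition; the 2 x 2 counts are hypothetical.

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix.

    Rows are actual classes, columns are predicted classes.
    """
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    # Chance agreement: product of matching row and column marginals.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (observed - expected) / (1 - expected)


# Hypothetical counts: 35 of 50 instances lie on the diagonal.
example = [[20, 5],
           [10, 15]]
kappa = cohen_kappa(example)
print(round(kappa, 4))
```

For this example the observed agreement is 0.7 and the chance agreement is 0.5, so kappa is 0.4; a matrix with all instances on the diagonal gives a kappa of 1.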
C. Cross-Validation
Cross-validation estimates the accuracy of the model by
separating the data into two different populations: a training
set and a testing set. The model is created from the training set
and its accuracy is measured by how well it classifies the
testing set. This process is repeated k times to
complete the k-fold cross-validation procedure. We have used
10-fold cross-validation, in which the available data are
randomly divided into 10 disjoint subsets of approximately
equal size. One of the subsets is then used as the test set and
the remaining 9 subsets are used for building the classifier. The
test set is then used to estimate the accuracy. This is
repeated 10 times so that each subset is used as the test set
once. The accuracy estimate is then the mean of the estimates
obtained over the 10 folds.
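The procedure described above can be sketched in a few lines of Python. This is an illustrative sketch, not the WEKA implementation: the function names and the `evaluate` callback are hypothetical, and the folds are not stratified by class.

```python
import random


def k_fold_splits(n_instances, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k disjoint folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validated_accuracy(n_instances, evaluate, k=10, seed=0):
    """Mean of the per-fold accuracies returned by evaluate(train, test)."""
    accuracies = [
        evaluate(train, test)
        for train, test in k_fold_splits(n_instances, k, seed)
    ]
    return sum(accuracies) / len(accuracies)


splits = list(k_fold_splits(1000, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Here `evaluate` stands for any routine that trains a classifier on the training indices and returns its accuracy on the test indices.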
D. Confusion Matrix
A confusion matrix (also known as a contingency table or an
error matrix) is a table layout that allows visualization of the
performance of a supervised learning algorithm. The following
confusion matrix is for a class attribute with 2 class values.
Table-1
                     Predicted Class
                     C1                C2
Actual Class   C1    True positive     False negative
               C2    False positive    True negative
C1 – particular class, C2 – different class
True positive (TP) - The number of instances correctly
classified as C1
True negative (TN) - The number of instances correctly
classified as C2
False positive (FP) - The number of instances incorrectly
classified as C1 (actually C2)
False negative (FN) - The number of instances incorrectly
classified as C2 (actually C1)
P = Actual positive = TP + FN
P1 = Predicted positive = TP + FP
N = Actual negative = FP + TN
N1 = Predicted negative = FN + TN
TP rate = Sensitivity = TP / P
TN rate = Specificity = TN / N
FP rate = 1 – TN rate = FP / N
Precision = TP / P1
TP rate = % of actual C1 instances that are correctly classified
Accuracy = (TP + TN) / (P + N)
= TP / P * P / (P + N) + TN / N * N / (P + N)
= Sensitivity * P / (P + N) + Specificity * N / (P + N)
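The definitions above translate directly into code. The following minimal sketch computes the rates from the four cells of a 2-class confusion matrix and checks the decomposition of accuracy into sensitivity and specificity; the counts are hypothetical.

```python
def binary_metrics(tp, fn, fp, tn):
    """Rates derived from the four cells of a 2-class confusion matrix."""
    p = tp + fn            # actual positives
    n = fp + tn            # actual negatives
    predicted_p = tp + fp  # predicted positives (P1 in the text)
    return {
        "sensitivity": tp / p,
        "specificity": tn / n,
        "fp_rate": fp / n,
        "precision": tp / predicted_p,
        "accuracy": (tp + tn) / (p + n),
    }


# Hypothetical counts for illustration only.
tp, fn, fp, tn = 40, 10, 5, 45
m = binary_metrics(tp, fn, fp, tn)

# Accuracy as the class-weighted mix of sensitivity and specificity.
p, n = tp + fn, fp + tn
weighted = m["sensitivity"] * p / (p + n) + m["specificity"] * n / (p + n)
print(m, weighted)
```

The weighted value equals the accuracy, which shows why a classifier can reach a high accuracy on an imbalanced data set even when its sensitivity is poor.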
Here, it can be noted that false negative cases are risky. For
example, in medical diagnosis, if a cancerous patient is
incorrectly predicted as non-cancerous, it can be a
major mistake. Similarly, in student examination results,
a student who has passed an examination can be incorrectly
indicated as failed. Therefore, it is essential that false
negatives be minimized as far as possible.
Further, we define two risk measures for a classifier, namely,
high risk rate and low risk rate. High risk rate is
defined as the ratio of false negatives to predicted negatives (i.e.,
FN / N1) and low risk rate is defined as the ratio of false
positives to predicted positives (i.e., FP / P1). This can be made
clear by the following instance of a confusion matrix, where the
class attribute “result” has 4 class values, i.e. FAIL,
PASS, 2nd and 1st.
TABLE 2 - Instance of Confusion Matrix
This is an instance of the result obtained from the Naïve Bayes
classifier in the WEKA workbench for a real examination system
having 14 attributes, of which “result” is one. The data file
has 33254 instances. Here it is observed that 2 “pass”
students are predicted as “fail”. This is a case of false
negative, which is of high risk.
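The two risk measures can be computed for the examination-result confusion matrix reported in this paper. Since the measures are defined for two classes, the sketch below makes an assumption: FAIL is treated as the negative class and PASS, 2nd and 1st are merged into a single positive class. The function name is hypothetical.

```python
# Confusion matrix of the examination data set (rows: actual class,
# columns: predicted class), in the order FAIL, PASS, 2ND, 1ST.
MATRIX = [
    [2556, 0, 0, 0],
    [2, 17274, 1398, 47],
    [0, 590, 5847, 850],
    [0, 0, 896, 3794],
]


def risk_rates(matrix, negative_index=0):
    """Return (high risk rate, low risk rate) = (FN / N1, FP / P1).

    Assumption: the class at negative_index is the negative class and
    all remaining classes are merged into one positive class.
    """
    k = len(matrix)
    tn = matrix[negative_index][negative_index]
    fn = sum(matrix[i][negative_index] for i in range(k) if i != negative_index)
    fp = sum(matrix[negative_index][j] for j in range(k) if j != negative_index)
    tp = sum(sum(row) for row in matrix) - tn - fn - fp
    high_risk = fn / (fn + tn)  # false negatives among predicted negatives
    low_risk = fp / (fp + tp)   # false positives among predicted positives
    return high_risk, low_risk


high_risk, low_risk = risk_rates(MATRIX)
print(high_risk, low_risk)
```

Under this grouping the 2 passed students predicted as FAIL are the only false negatives, out of 2558 instances predicted as FAIL.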
IV. SIMULATION AND EVALUATION
We compare the results of five different classifiers
(BN, NB, NBU, J48 and DT). Simulations were conducted
using demographic profile data provided by UCI’s census
dataset. The data set consists of 15 attributes. Several rounds of
simulation were carried out using different numbers of data
instances, namely, 1000, 2000, 3000 and 4000, on an Intel(R)
Core(TM)2 Duo machine running at 1.83 GHz with 1.99 GB of RAM. All
simulations were performed in the WEKA (Waikato
Environment for Knowledge Analysis) machine learning
platform, which provides a workbench consisting of a collection
of implemented popular learning schemes that can be used for
practical data mining and machine learning work. We chose
10-fold cross-validation as the test mode, in which 10 pairs of
training sets and testing sets are created. The selected
classification algorithms were run on the same training data sets
and were tested on the same testing sets to obtain the
classification accuracy. The performance comparison under
different cases is presented below.
A. Calculation for time to build the model
It takes some time to build the model for a classifier; this time
depends on the number of instances and on the classifier, as
depicted in figures 1 and 2.
Table-3. Time taken to build the model using different instances of
the user profile dataset
Figures 1 and 2 - Time taken in seconds to build the model vs.
number of instances in user profile dataset
According to Table-3, time analysis shows that DT takes the maximum
time to build the model, whereas NB and NBU take the minimum time.
As the number of data instances was gradually increased, we found
that the increase in model building time is much higher for DT
than for BN, NB, NBU, and J48.
Confusion matrix for TABLE 2 (rows: actual class; columns: class assigned by the classifier)
a = FAIL   b = PASS   c = 2ND   d = 1ST    classified as
2556       0          0         0          a = FAIL
2          17274      1398      47         b = PASS
0          590        5847      850        c = 2ND
0          0          896       3794       d = 1ST
B. Calculation of CCI rate
Once a model is built for the classifier, the rate of correctly
classified instances (CCI) can be computed, which is the ratio of
correctly classified instances to the total number of instances in
the data set.
TABLE-4. Correctly Classified Instances (CCI) rate (in %) of the
classifiers for different instances of user profile dataset
Instances   BN      NB      NBU     J48     DT
1000        95.9    97.1    97.1    99.8    99.8
2000        98.6    98.65   98.65   99.9    99.9
3000        98.77   99.03   99.03   100     100
4000        99.03   99.15   99.15   100     100
Figures 3 and 4 - Correctly Classified Instances rate (in %) of five
classifiers for different instances
As depicted in Table-4, the Correctly Classified Instances rate or
TP rate is the highest for J48 and DT and the lowest for BN.
NB and NBU have identical values, as do J48 and DT. It is further
observed that the rate increases with the number of data
instances for all classifiers, as presented in figures 3 and 4.
C. Calculation of KS, MAE and RAE
TABLE-5. KS, MAE and RAE for the 4th data set, having 4000 instances
Classifiers   KS       MAE      RAE
BN            0.9879   0.005    4.9409
NB            0.9894   0.0018   1.7679
NBU           0.9894   0.0018   1.7679
J48           1        0        0
DT            1        0.0066   6.5745
Table-5 provides the values of the Kappa Statistic (KS), Mean
Absolute Error (MAE) and Relative Absolute Error (RAE) for
all five classifiers. J48 gives the most accurate results,
with a KS of 1, an MAE of 0 and an RAE of 0. DT also reaches a KS
of 1, although its MAE and RAE are the highest of the five classifiers.
Fig 5 - Plot of the values in Table-5
TABLE 6-A. KS of 3 data sets for 5 classifiers
Classifiers   KS for Instances
              1000      2000      3000
BN            0.9488    0.9825    0.9847
NB            0.9639    0.9831    0.988
NBU           0.9639    0.9831    0.988
J48           0.9975    0.9987    1
DT            0.9975    0.9987    1
TABLE 6-B. MAE of 3 data sets for 5 classifiers
Classifiers   MAE for Instances
              1000      2000      3000
BN            0.0122    0.0074    0.0059
NB            0.0049    0.0027    0.002
NBU           0.0049    0.0027    0.002
J48           0.0002    0.0001    0
DT            0.0189    0.0113    0.0084
TABLE 6-C. RAE of 3 data sets for 5 classifiers
Classifiers   RAE for Instances
              1000      2000      3000
BN            12.076    7.4191    5.8911
NB            4.8186    2.7255    1.9934
NBU           4.8186    2.7255    1.9934
J48           0.2428    0.1249    0
DT            18.8063   11.2926   8.3107
It is evident from Tables 6-A, 6-B and 6-C and figures 6, 7 and 8
that the values of KS tend to 1 as the number of instances
increases, whereas the error values (MAE and RAE) approach 0.
Fig 6 - Kappa Statistics for classifiers
Fig 7 - Mean Absolute Errors in classifiers
Fig 8 - Relative Absolute Errors in classifiers
D. Comparison of performances with the complete data set
Here, we have used the complete data set for our
experimentation, both before and after removing some irrelevant
attributes, and computed the time to build the model, correctly
classified instances (CCI), KS, MAE and RAE. First, capital-gain,
an attribute that is less relevant to education (class attribute),
was removed and the values were calculated again, which showed an
improvement in the results. In the same way, 2 other attributes
(capital-loss and race) were removed. Table 7 presents the values
for the complete data set (CDS) with all attributes, and Table
8 presents the values obtained for the reduced data set
(RDS), after eliminating 3 attributes from the CDS. Further, a
comparison of the values obtained for CDS and RDS is depicted in
figures 9 and 10, which clearly indicate an improvement when
irrelevant attributes are removed from the data set.
TABLE 7 – VALUES OF CDS FOR 4 CLASSIFIERS
TABLE 8 – VALUES OF RDS FOR 4 CLASSIFIERS
Classifiers   Time    CCI %     KS       MAE      RAE
BN            0.94    99.874    0.9984   0.0011   1.0531
NB            0.31    99.9536   0.9994   0.0006   0.5495
J48           1.67    100       1        0        0
DT            12.75   100       1        0.0011   1.048
Fig 9 - Time analysis of classifiers for CDS vs RDS
Fig 10 - CCI analysis of classifiers for CDS vs RDS
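The effect of attribute removal on model building time can be quantified from the two tables of values reported for this section. The sketch below assumes that the slower set of timings (1.73, 0.5, 2.05, 16.99) belongs to the complete data set and the faster set (0.94, 0.31, 1.67, 12.75) to the reduced data set; this assignment is inferred from the statement that removing attributes improves the results.

```python
# Time taken to build the model, as listed in the two tables.
# Assumption: the larger timings are for CDS, the smaller ones for RDS.
cds_time = {"BN": 1.73, "NB": 0.5, "J48": 2.05, "DT": 16.99}
rds_time = {"BN": 0.94, "NB": 0.31, "J48": 1.67, "DT": 12.75}


def percent_reduction(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


reductions = {
    name: percent_reduction(cds_time[name], rds_time[name])
    for name in cds_time
}
for name, value in reductions.items():
    print(f"{name}: {value:.1f}% less time to build the model")
```

Expressing the change as a percentage makes the classifiers comparable even though their absolute build times differ by more than an order of magnitude.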
V. CONCLUSION
In this work, we evaluated the performance of five classifiers,
namely, Bayesian Network, Naïve Bayesian, Naïve Bayes
Updateable, J48, and Decision Table, in terms of their accuracy.
Simulations were performed in the WEKA machine learning
platform. The objective of the work was to find the
classification algorithm with the highest classification accuracy
for building user profiles. According to the simulation
results, the J48 classifier performs best on user related
information, with the least errors. Furthermore, Naïve Bayesian and
Naïve Bayes Updateable took the least time to build the
model. This indicates that the J48 classification algorithm should
be favored over the Bayesian Network, Naïve Bayesian, Naïve
Bayes Updateable, and Decision Table classifiers in
personalization applications where classification accuracy
is important.
REFERENCES
[1] Teixeira, C., Pinto, J.S., Martins, J.A., “User Profiles in Corporate
Scenarios”, IEEE, 2004, pp. 614-619.
[2] Bartolomeo, G., Kovacikova, T., “User Profile Management in Next
Generation Networks”, AICT '09, Fifth Advanced International
Conference on Telecommunications, 2009, pp. 129-135.
[3] Damian Fijałkowski, Radosław Zatoka, “An architecture of a Web
recommender system using social network user profiles for
e-commerce”, IEEE, 2011, pp. 287–290.
[4] Sung Hyuk Park, Sang Pil Han, Soon Young Huh, Hojin Lee,
“Preprocessing Uncertain User Profile Data: Inferring User's Actual Age
from Ages of the User's Neighbors”, ICDE '09, IEEE 25th International
Conference on Data Engineering, 2009, pp. 1619-1024.
[5] Malak Al-hassan, Haiyan Lu, Jie Lu, “A Framework for Delivering
Personalized e-Government Services from a Citizen-Centric Approach”,
Proceedings of iiWAS2009, Kuala Lumpur, Malaysia, 2009, pp. 436-440.
[6] Alicia Y.C. Tang, Nur Hanani Azami, Norfaezah Osman, “Application of
Data Mining Techniques in Customer Relationship Management for An
Automobile Company”, Proceedings of the 5th International Conference
on IT & Multimedia at UNITEN (ICIMU 2011) Malaysia, IEEE, 2011.
[7] Cufoglu A., Lohi M., Madani K., “A Comparative Study of Selected
Classifiers with Classification Accuracy in User Profiling”, Seventh
International Conference on Machine Learning and Applications, IEEE,
2009, pp. 787-791.
Values for Table 7 (CDS):
Classifiers   Time    CCI %     KS       MAE      RAE
BN            1.73    99.8177   0.9977   0.0012   1.2107
NB            0.5     99.4728   0.9935   0.0012   1.2248
J48           2.05    100       1        0        0
DT            16.99   100       1        0.0011   1.048
[8] Panda M., Patra M. R., “A comparative study of data mining algorithms
for network intrusion detection”, 1st International Conference on
Emerging Trends in Engineering and Technology, IEEE, 2008, pp. 504-507.
[9] Weka homepage. Available at: http://www.cs.waikato.ac.nz/ml/weka/
accessed on 24/12/11.
[10] Jensen F.V., “Introduction to Bayesian Networks”, Denmark, Hugin
Expert A/S, 1993.
[11] Jiang L., Guo Y., “Learning lazy Naïve Bayesian classifier for ranking”,
Proceedings of the 17th IEEE International Conference on Tools with
Artificial Intelligence (ICTAI’05), IEEE, 2005, Page(s) 5pp.
[12] Wang Z., I. Geoffrey Webb, “Comparison of lazy Bayesian rule and
tree-augmented Bayesian learning”, Paper presented at the IEEE Conference
on Data Mining, ICDM 2002, Page(s): 490 – 497.
[13] Shi Z., Huang Y., Zhang S., “Fisher score based naive Bayesian
classifier”, Paper appears in ICNN&B International Conference on
Neural Networks and Brain, IEEE, 2005, pp. 1616-1621.
[14] Santafe G., Lozano J.A., Larranaga P., “Bayesian model averaging of
Naive Bayes for clustering”, IEEE Transactions on Systems, Man, and
Cybernetics, 2006, Vol. 36, No. 5, pp. 1149 – 1161.
[15] http://en.wikipedia.org/wiki/C4.5_algorithm accessed on 21/03/13.
[16] Asuncion A. and Newman D.J. (2007) UCI Machine Learning
Repository, Irvine, CA: University of California, School of Information
and Computer Science. [Online] Available from:
http://www.ics.uci.edu/~mlearn/MLRepository.html accessed on
11/02/13.

More Related Content

What's hot

Prediction of Default Customer in Banking Sector using Artificial Neural Network
Prediction of Default Customer in Banking Sector using Artificial Neural NetworkPrediction of Default Customer in Banking Sector using Artificial Neural Network
Prediction of Default Customer in Banking Sector using Artificial Neural Networkrahulmonikasharma
 
Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...
Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...
Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...cscpconf
 
Selecting the correct Data Mining Method: Classification & InDaMiTe-R
Selecting the correct Data Mining Method: Classification & InDaMiTe-RSelecting the correct Data Mining Method: Classification & InDaMiTe-R
Selecting the correct Data Mining Method: Classification & InDaMiTe-RIOSR Journals
 
Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca...
Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca...Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca...
Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca...Ferhat Ozgur Catak
 
Study and Analysis of K-Means Clustering Algorithm Using Rapidminer
Study and Analysis of K-Means Clustering Algorithm Using RapidminerStudy and Analysis of K-Means Clustering Algorithm Using Rapidminer
Study and Analysis of K-Means Clustering Algorithm Using RapidminerIJERA Editor
 
Profile Analysis of Users in Data Analytics Domain
Profile Analysis of   Users in Data Analytics DomainProfile Analysis of   Users in Data Analytics Domain
Profile Analysis of Users in Data Analytics DomainDrjabez
 
IRJET - Recommendation System using Big Data Mining on Social Networks
IRJET -  	  Recommendation System using Big Data Mining on Social NetworksIRJET -  	  Recommendation System using Big Data Mining on Social Networks
IRJET - Recommendation System using Big Data Mining on Social NetworksIRJET Journal
 
Distributed Digital Artifacts on the Semantic Web
Distributed Digital Artifacts on the Semantic WebDistributed Digital Artifacts on the Semantic Web
Distributed Digital Artifacts on the Semantic WebEditor IJCATR
 
CONFIGURING ASSOCIATIONS TO INCREASE TRUST IN PRODUCT PURCHASE
CONFIGURING ASSOCIATIONS TO INCREASE TRUST IN PRODUCT PURCHASECONFIGURING ASSOCIATIONS TO INCREASE TRUST IN PRODUCT PURCHASE
CONFIGURING ASSOCIATIONS TO INCREASE TRUST IN PRODUCT PURCHASEIJwest
 
IMPACT OF DIFFERENT SELECTION STRATEGIES ON PERFORMANCE OF GA BASED INFORMATI...
IMPACT OF DIFFERENT SELECTION STRATEGIES ON PERFORMANCE OF GA BASED INFORMATI...IMPACT OF DIFFERENT SELECTION STRATEGIES ON PERFORMANCE OF GA BASED INFORMATI...
IMPACT OF DIFFERENT SELECTION STRATEGIES ON PERFORMANCE OF GA BASED INFORMATI...ijcsa
 
Comparison Between WEKA and Salford System in Data Mining Software
Comparison Between WEKA and Salford System in Data Mining SoftwareComparison Between WEKA and Salford System in Data Mining Software
Comparison Between WEKA and Salford System in Data Mining SoftwareUniversitas Pembangunan Panca Budi
 
Ijatcse71852019
Ijatcse71852019Ijatcse71852019
Ijatcse71852019loki536577
 
Annotation Approach for Document with Recommendation
Annotation Approach for Document with Recommendation Annotation Approach for Document with Recommendation
Annotation Approach for Document with Recommendation ijmpict
 
IRJET- Sentiment Analysis to Segregate Attributes using Machine Learning Tech...
IRJET- Sentiment Analysis to Segregate Attributes using Machine Learning Tech...IRJET- Sentiment Analysis to Segregate Attributes using Machine Learning Tech...
IRJET- Sentiment Analysis to Segregate Attributes using Machine Learning Tech...IRJET Journal
 
On the benefit of logic-based machine learning to learn pairwise comparisons
On the benefit of logic-based machine learning to learn pairwise comparisonsOn the benefit of logic-based machine learning to learn pairwise comparisons
On the benefit of logic-based machine learning to learn pairwise comparisonsjournalBEEI
 
UML MODELING AND SYSTEM ARCHITECTURE FOR AGENT BASED INFORMATION RETRIEVAL
UML MODELING AND SYSTEM ARCHITECTURE FOR AGENT BASED INFORMATION RETRIEVALUML MODELING AND SYSTEM ARCHITECTURE FOR AGENT BASED INFORMATION RETRIEVAL
UML MODELING AND SYSTEM ARCHITECTURE FOR AGENT BASED INFORMATION RETRIEVALijcsit
 
V2 i9 ijertv2is90699-1
V2 i9 ijertv2is90699-1V2 i9 ijertv2is90699-1
V2 i9 ijertv2is90699-1warishali570
 

What's hot (19)

Prediction of Default Customer in Banking Sector using Artificial Neural Network
Prediction of Default Customer in Banking Sector using Artificial Neural NetworkPrediction of Default Customer in Banking Sector using Artificial Neural Network
Prediction of Default Customer in Banking Sector using Artificial Neural Network
 
Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...
Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...
Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...
 
Selecting the correct Data Mining Method: Classification & InDaMiTe-R
Selecting the correct Data Mining Method: Classification & InDaMiTe-RSelecting the correct Data Mining Method: Classification & InDaMiTe-R
Selecting the correct Data Mining Method: Classification & InDaMiTe-R
 
Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca...
Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca...Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca...
Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca...
 
Study and Analysis of K-Means Clustering Algorithm Using Rapidminer
Study and Analysis of K-Means Clustering Algorithm Using RapidminerStudy and Analysis of K-Means Clustering Algorithm Using Rapidminer
Study and Analysis of K-Means Clustering Algorithm Using Rapidminer
 
Profile Analysis of Users in Data Analytics Domain
Profile Analysis of   Users in Data Analytics DomainProfile Analysis of   Users in Data Analytics Domain
Profile Analysis of Users in Data Analytics Domain
 
IRJET - Recommendation System using Big Data Mining on Social Networks
IRJET -  	  Recommendation System using Big Data Mining on Social NetworksIRJET -  	  Recommendation System using Big Data Mining on Social Networks
Performance Analysis of Selected Classifiers in User Profiling

  • 1. Integrated Intelligent Research (IIR) International Journal of Data Mining Techniques and Applications, Volume: 03 Issue: 01 June 2014, Page No. 24-29, ISSN: 2278-2419
Performance Analysis of Selected Classifiers in User Profiling
Manas Ranjan Patra, V. Mohan Patro
Department of Computer Science, Berhampur University, Berhampur-760007, Odisha, India
Email: mrpatra12@gmail.com, vmpatro@gmail.com
Abstract - User profiles can serve as indicators of personal preferences, which can be used effectively when providing personalized services. Building user profiles that capture accurate information about individuals has been a daunting task. Researchers have made several attempts to extract information from different data sources to build user profiles in different application domains. Towards this end, in this paper we employ different classification algorithms to create accurate user profiles based on information gathered from demographic data. The aim of this work is to analyze the performance of five of the most effective classification methods, namely Bayesian Network (BN), Naïve Bayesian (NB), Naïve Bayes Updateable (NBU), J48, and Decision Table (DT). Our simulation results show that, in general, J48 has the highest classification accuracy with the lowest error rate. On the other hand, the Naïve Bayesian and Naïve Bayes Updateable classifiers require the least time to build the classification model.
Keywords - Classifiers; data mining; confusion matrix; evaluation
I. INTRODUCTION
A user profile refers to the explicit digital representation of a person's identity. A user profile can also be considered as the computer representation of a user model. User profiles describe user interests, characteristics, behavior and preferences. A user profile is a set of information about a given user in a given context and over a specific period of time [1].
User profiling is the practice of gathering, organizing and interpreting user profile information. It refers to the construction of a profile through extraction from a set of data. User profiles can be found in recommender systems and on dynamic websites (such as online social networking sites or bulletin boards). Behind every instance of personalization there is a profile that stores the user preferences, context of use and other information that can be used to deliver a user experience tailored to individual needs and preferences [2].
Recommendation systems are widely adopted in e-commerce businesses to help customers locate products they would like to purchase. The major challenge for these systems is bridging the gap between the physical characteristics of the data and the users' perceptions [3]. To address this challenge, employing user profiles to improve accuracy becomes essential. Targeting customers based on inaccurate user profile data will not be as effective as targeting them based on accurate user profile data. Thus, companies preprocess user profile data as part of their effort to maintain its accuracy [4].
Yet another application that can benefit from user profiling is e-governance, wherein personalized online services can be provided to citizens. User profiling would help e-Gov service systems communicate effectively and efficiently with their users. With user profiles available, a system can recognize the user every time he/she logs on, and the user does not need to enter the same personal information again. This makes users' interactions with the system easier and faster [5].
Several techniques are being applied to build user profiles which can be used effectively in different application domains to provide customized services.
Among others, classification is one of the widely used techniques and has been successfully applied to classify users based on information gathered from different sources. However, the performance of these classification techniques varies greatly in terms of how accurately they classify the data for building accurate user profiles.
In [6], Alicia et al. compared the performance of NB, J48 and DT, reporting the rate of correctly classified instances, the time taken to build the model, the area under the curve, etc. In [7], Cufoglu et al. compared the Naïve Bayesian Tree (NBTree), Sequential Minimal Optimization (SMO), Naïve Bayesian (NB), Instance-Based Learner (IB1), J48 (a version of C4.5), Classification and Regression Tree (SimpleCART) and Iterative Dichotomiser Tree (ID3) classifiers in personalization applications.
In [6], using an automobile data set, the authors obtained a correctly classified instances rate of only up to 73%, whereas in our case, using a user profile data set, we could achieve more than 95%. For both data sets, the time taken to build the model is highest for the Decision Table classifier in comparison to the other classifiers used in the experimentation. From [7] and our own observation, there is no substantial increase in the time taken to build the model when the number of instances is within 1000. However, the time taken to build the model always increases when the number of instances is in the thousands.
In [8], Panda et al. compared the performance of the NB, ID3 and J48 algorithms for network intrusion detection. According to the authors, NB performs better than ID3 and J48 with respect to overall classification accuracy. However, the authors added that decision trees (ID3 and J48) are more robust than NB in detecting new intrusive behaviours.
In our work, we applied all the classifiers available in the WEKA (Waikato Environment for Knowledge Analysis) workbench [9] to the user profile dataset and observed that 20 classifiers resulted in classification accuracy above 98%. Out of these 20 classifiers we selected the top 5, namely, Bayesian Network (BN), Naïve Bayesian (NB), Naïve Bayes
  • 2. Updateable (NBU), J48 (Java implementation of C4.5), and Decision Table (DT) for further analysis.
II. CLASSIFICATION TECHNIQUES
Bayesian networks are probability based, are used for reasoning and decision making under uncertainty, and rely heavily on Bayes' rule, which is defined as follows [10]:
- Assume attributes $A_i$, where $i = 1, 2, 3, \ldots, n$, which take values $a_i$, where $i = 1, 2, 3, \ldots, n$.
- Assume $C$ as the class label and $E = (a_1, a_2, \ldots, a_n)$ as an unclassified test instance.
- $E$ is classified into the class $C$ with the maximum posterior class probability $P(C \mid E)$:
$$\hat{C} = \arg\max_{C} P(C \mid E) = \arg\max_{C} P(C)\, P(E \mid C)$$
Bayesian networks can represent uncertain attribute dependencies; however, it has been proven that learning an optimal Bayesian network is NP (Non-deterministic Polynomial) hard [11].
The Naïve Bayesian classifier is one of the Bayesian classifier techniques and is also known as the state of the art of Bayesian classifiers. Many works have shown that Naïve Bayesian classifiers are among the most computationally efficient, effective and simple algorithms for machine learning and data mining applications [12][13][14]. Naïve Bayesian classifiers assume that all attributes within the same class are independent, given the class label. Based on this assumption, the Bayesian rule is modified as follows to define the Naïve Bayesian rule [10]:
$$\hat{C} = \arg\max_{C} P(C) \prod_{i=1}^{n} P(a_i \mid C)$$
Naïve Bayesian classifiers are normally used for interactive applications. However, because of the naïve conditional independence assumption, optimal accuracy cannot be achieved.
Naïve Bayes Updateable is the updateable version of Naïve Bayes in the WEKA (Waikato Environment for Knowledge Analysis) workbench.
This classifier uses a default precision of 0.1 for numeric attributes when buildClassifier (a method in the Java code) is called with zero training instances.
J48 is an open source Java implementation of the C4.5 algorithm in the WEKA data mining tool. C4.5 builds decision trees from a set of training data in the same way as ID3, using the concept of information entropy [15]. The training data is a set $S = \{s_1, s_2, \ldots\}$ of already classified samples. Each sample $s_i$ consists of a p-dimensional vector $(x_{1,i}, x_{2,i}, \ldots, x_{p,i})$, where the $x_j$ represent attributes or features of the sample, as well as the class in which $s_i$ falls.
A decision table is based on logical relationships, just like a truth table. It is a tool that helps to examine combinations of conditions for both completeness and inconsistency [6]. Decision tables, like decision trees or neural nets, are classification models used for prediction. They are induced by machine learning algorithms. A decision table consists of a hierarchical table in which each entry in a higher level table is decomposed into the values of a pair of additional attributes to form another table. The structure is similar to dimensional stacking.
III. EXPERIMENTAL SETUP
A. Data Set
For our experimentation we used the Census Income data set [16], which has 15 different attributes, namely, Age, Work-class, Final-weight, Education, Education-num, Marital-status, Occupation, Relationship, Race, Sex, Native-country, Capital-gain, Capital-loss, Hours-per-week and Income. This data set has been used by different authors for different purposes. It has been used to classify whether the income of a person is greater than 50K based on several census parameters, such as age, education and marital status.
B. Kappa Statistics (KS)
Kappa is a chance-corrected measure of agreement between the classification and the true classes. The Kappa statistic is used to assess the accuracy of any particular measuring case.
It is used to distinguish between the reliability of the data collected and their validity. The value of kappa is less than or equal to 1, where a value of 1 indicates perfect agreement.
C. Cross-Validation
Cross-validation calculates the accuracy of the model by separating the data into two different populations: a training set and a testing set. The model is created from the training set and its accuracy is measured by how well it classifies the testing set. This testing process is repeated k times to complete the k-fold cross-validation procedure. We used 10-fold cross-validation, in which the available data are randomly divided into 10 disjoint subsets of approximately equal size. One of the subsets is used as the test set and the remaining 9 subsets are used for building the classifier. The test set is then used to estimate the accuracy. This is repeated 10 times so that each subset is used as the test set once. The accuracy estimate is then the mean of the estimates obtained from the 10 classifiers built in this way.
D. Confusion Matrix
A confusion matrix (also known as a contingency table or an error matrix) is a table layout that allows visualization of the performance of a supervised learning algorithm. The following confusion matrix is for an attribute with 2 class values.
Table-1
                     Predicted class
                     C1               C2
Actual class   C1    True positive    False negative
               C2    False positive   True negative
C1 - particular class
C2 - different class
True positive (TP) - The number of instances correctly classified as C1
  • 3. True negative (TN) - The number of instances correctly classified as C2
False positive (FP) - The number of instances incorrectly classified as C1 (actually C2)
False negative (FN) - The number of instances incorrectly classified as C2 (actually C1)
P = Actual positive = TP + FN
P1 = Predicted positive = TP + FP
N = Actual negative = FP + TN
N1 = Predicted negative = FN + TN
TP rate = Sensitivity = TP / P
TN rate = Specificity = TN / N
FP rate = selectivity = 1 - TN rate = FP / N
Precision = TP / P1
TP rate = % of correctly classified instances
Accuracy = (TP + TN) / (P + N)
         = (TP / P) * (P / (P + N)) + (TN / N) * (N / (P + N))
         = Sensitivity * P / (P + N) + Specificity * N / (P + N)
Here, it can be noted that false negative cases are risky. For example, in medical diagnosis, predicting a patient with cancer as cancer-free can be a major mistake. Similarly, in student examination results, a student who has passed an examination can be incorrectly indicated as having failed. Therefore, it is essential that false negatives be minimized as far as possible.
Further, we define two risk measures for a classifier, namely, the high risk rate and the low risk rate. The high risk rate is defined as the ratio of false negatives to predicted negatives (i.e., FN / N1) and the low risk rate is defined as the ratio of false positives to predicted positives (i.e., FP / P1).
This is illustrated by the following confusion matrix instance, in which the class attribute "result" has 4 class values: FAIL, PASS, 2nd and 1st.
TABLE 2 - Instance of Confusion Matrix
This is an instance of the result obtained from the Naïve Bayes classifier in the WEKA workbench for a real examination system having 14 attributes, of which "result" is one.
This data file has 33254 instances. Here it is observed that 2 of the "pass" students are predicted as "fail". This is a case of false negatives, which is of high risk.
IV. SIMULATION AND EVALUATION
We compare the results of five different classifiers (BN, NB, NBU, J48 and DT). Simulations were conducted using demographic profile data provided by UCI's census dataset. The data set consists of 15 attributes. Several rounds of simulation were carried out using different numbers of data instances, namely 1000, 2000, 3000 and 4000, on an Intel(R) Core(TM)2 Duo machine with a 1.83 GHz processor and 1.99 GB of RAM. All simulations were performed in the WEKA (Waikato Environment for Knowledge Analysis) machine learning platform, which provides a workbench consisting of a collection of implemented popular learning schemes that can be used for practical data mining and machine learning work. We chose 10-fold cross-validation as the test mode, in which 10 pairs of training sets and testing sets are created. The selected classification algorithms were run on the same training sets and tested on the same testing sets to obtain the classification accuracy. The performance comparison under different cases is presented below.
A. Calculation for time to build the model
It takes some time to build the model for a classifier, which depends on the number of instances and on the classifier, as depicted in figures 1 and 2.
Table-3. Time taken to build the model using different instances of the user profile dataset
Figures 1 & 2 - Time taken in seconds to build the model vs. number of instances in user profile dataset
According to Table-3, the time analysis shows that DT takes the maximum time to build the model, whereas NB and NBU take the minimum time. On gradually increasing the number of data instances, we found that the increase in model building time is much higher for DT than for BN, NB, NBU and J48.
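The 10-fold test mode described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the WEKA implementation: the names `k_fold_indices`, `cross_validated_accuracy` and `evaluate` are hypothetical, the folds are plain random partitions (WEKA may additionally stratify them by class, which is not done here), and `evaluate` stands in for whatever routine trains a classifier on the training indices and returns its accuracy on the test indices.

```python
import random


def k_fold_indices(n_instances, k=10, seed=0):
    """Yield (train, test) index lists for k disjoint, near-equal folds."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)  # random division of the data
    folds = [indices[i::k] for i in range(k)]  # k disjoint subsets
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validated_accuracy(n_instances, evaluate, k=10, seed=0):
    """Mean of the k per-fold accuracy estimates.

    `evaluate(train, test)` is a caller-supplied (hypothetical) function that
    builds a model on `train` and returns its accuracy on `test`.
    """
    scores = [evaluate(train, test)
              for train, test in k_fold_indices(n_instances, k, seed)]
    return sum(scores) / len(scores)
```

With 1000 instances and k = 10, each round uses 900 instances for building the model and 100 for testing, and every instance is tested exactly once.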
   a = FAIL   b = PASS   c = 2ND   d = 1ST    classified as
     2556          0         0         0      a = FAIL
        2      17274      1398        47      b = PASS
        0        590      5847       850      c = 2ND
        0          0       896      3794      d = 1ST
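As a rough illustration of the risk measures defined in Section III-D, the following Python sketch collapses the four-class matrix above into two classes and computes the high and low risk rates. Treating FAIL as the negative class C2 and every passing grade as the positive class C1 is an assumption made here for illustration; the paper does not state how the multi-class matrix is reduced. The function and variable names are hypothetical and not part of WEKA.

```python
# Rows are actual classes, columns are predicted classes.
LABELS = ["FAIL", "PASS", "2ND", "1ST"]
MATRIX = [
    [2556, 0, 0, 0],
    [2, 17274, 1398, 47],
    [0, 590, 5847, 850],
    [0, 0, 896, 3794],
]


def collapse(matrix, negative):
    """Return (TP, FN, FP, TN), treating class index `negative` as C2."""
    total = sum(sum(row) for row in matrix)
    tn = matrix[negative][negative]
    fp = sum(matrix[negative]) - tn  # actually negative, predicted positive
    fn = sum(row[negative] for row in matrix) - tn  # actually positive, predicted negative
    tp = total - tn - fp - fn
    return tp, fn, fp, tn


def risk_rates(tp, fn, fp, tn):
    """High risk rate = FN / N1, low risk rate = FP / P1."""
    predicted_negative = fn + tn
    predicted_positive = tp + fp
    high = fn / predicted_negative if predicted_negative else 0.0
    low = fp / predicted_positive if predicted_positive else 0.0
    return high, low


def accuracy(matrix):
    """Correctly classified instances (the diagonal) over all instances."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


tp, fn, fp, tn = collapse(MATRIX, LABELS.index("FAIL"))
high_risk, low_risk = risk_rates(tp, fn, fp, tn)
```

Under this assumption the 2 PASS students predicted as FAIL are the only false negatives, so the high risk rate is 2 / 2558, and the low risk rate is 0 because no FAIL student is predicted as passing.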
  • 4. B. Calculation of CCI rate
Once a model is built for the classifier, the rate of correctly classified instances (CCI) can be computed, which is the ratio of correctly classified instances to the total number of instances in the data set.
TABLE-4. Correctly Classified Instances (CCI) rate (in %) of the classifiers for different instances of the user profile dataset
Instances   BN       NB       NBU      J48      DT
1000        95.9     97.1     97.1     99.8     99.8
2000        98.6     98.65    98.65    99.9     99.9
3000        98.77    99.03    99.03    100      100
4000        99.03    99.15    99.15    100      100
Fig 3 & 4 - Correctly Classified Instances rate (in %) of five classifiers for different instances
As depicted in Table-4, the Correctly Classified Instances rate (TP rate) is highest for J48 and DT and lowest for BN. NB and NBU have the same values, as do J48 and DT. It is further observed that the rate increases for all classifiers as the number of data instances increases, as presented in figures 3 and 4.
C. Calculation of KS, MAE and RAE
TABLE-5. KS, MAE and RAE for the 4th data set having 4000 instances
Classifiers   KS       MAE      RAE
BN            0.9879   0.005    4.9409
NB            0.9894   0.0018   1.7679
NBU           0.9894   0.0018   1.7679
J48           1        0        0
DT            1        0.0066   6.5745
Table-5 provides the values of Kappa Statistics (KS), Mean Absolute Error (MAE) and Relative Absolute Error (RAE) for all five classifiers. J48 gives the most accurate values, as KS is 1, MAE is 0 and RAE is 0. Next to it is DT.
Fig 5 - Plot for Table-5
TABLE 6-A. KS of 3 data sets for 5 classifiers
Classifiers   1000     2000     3000
BN            0.9488   0.9825   0.9847
NB            0.9639   0.9831   0.988
NBU           0.9639   0.9831   0.988
J48           0.9975   0.9987   1
DT            0.9975   0.9987   1
TABLE 6-B. MAE of 3 data sets for 5 classifiers
Classifiers   1000     2000     3000
BN            0.0122   0.0074   0.0059
NB            0.0049   0.0027   0.002
NBU           0.0049   0.0027   0.002
J48           0.0002   0.0001   0
DT            0.0189   0.0113   0.0084
TABLE 6-C. RAE of 3 data sets for 5 classifiers
Classifiers   1000      2000      3000
BN            12.076    7.4191    5.8911
NB            4.8186    2.7255    1.9934
NBU           4.8186    2.7255    1.9934
J48           0.2428    0.1249    0
DT            18.8063   11.2926   8.3107
It is evident from Tables 6-A to 6-C and figures 6, 7 and 8 that the values of KS tend to 1 as the number of instances increases, whereas the error values (MAE and RAE) approach 0.
Fig 6 - Kappa Statistics for classifiers
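To make the Kappa Statistic values in the tables above easier to interpret, the following Python sketch computes kappa from a confusion matrix, assuming the standard Cohen's kappa definition (observed agreement corrected by the agreement expected from the row and column totals). The function name and the small matrices in the usage note are made up for illustration; this is not WEKA's code, and returning 1.0 when chance agreement is already 1 is a convention chosen here.

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix (rows actual, columns predicted)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(n)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(n)
    ) / (total * total)
    if expected == 1:  # degenerate case: only one class is ever present
        return 1.0
    return (observed - expected) / (1 - expected)
```

For example, a purely diagonal matrix such as `[[5, 0], [0, 5]]` gives a kappa of 1 (perfect agreement), which corresponds to the KS value of 1 reported for J48 and DT, while `[[25, 25], [25, 25]]` gives 0, meaning the agreement is no better than chance.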
  • 5. Fig 7 - Mean Absolute Errors in classifiers
Fig 8 - Relative Absolute Errors in classifiers
D. Comparison of performances with the complete data set
Here, we used the complete data set for our experimentation, computing the time to build the model, correctly classified instances (CCI), KS, MAE and RAE before and after removing some irrelevant attributes. First, the attribute capital-gain, which is less relevant to education (the class attribute), was removed and the values were calculated again, which showed an improvement in the results. In the same way, 2 other attributes (capital-loss and race) were removed. Table 7 presents the values for the complete data set (CDS) with all attributes, and Table 8 presents the values obtained for the reduced data set (RDS), after eliminating 3 attributes from CDS. Further, a comparison of the values obtained for CDS and RDS is depicted in figures 9 and 10, which clearly indicate an improvement when irrelevant attributes are removed from the data set.
TABLE 7 - VALUES OF CDS FOR 4 CLASSIFIERS
TABLE 8 - VALUES OF RDS FOR 4 CLASSIFIERS
Classifiers   Time    CCI       KS       MAE      RAE
BN            0.94    99.874    0.9984   0.0011   1.0531
NB            0.31    99.9536   0.9994   0.0006   0.5495
J48           1.67    100       1        0        0
DT            12.75   100       1        0.0011   1.048
Fig 9 - Time analysis of classifiers for CDS vs RDS
Fig 10 - CCI analysis of classifiers for CDS vs RDS
V. CONCLUSION
In this work, we evaluated the performance of five classifiers, namely, Bayesian Network, Naïve Bayesian, Naïve Bayes Updateable, J48, and Decision Table, in terms of their accuracy. Simulations were performed in the WEKA machine learning platform. The objective of the work was to find the classification algorithm with the highest classification accuracy for building user profiles.
According to the simulation results, the J48 classifier performs best on user-related information, with the least errors. Furthermore, Naïve Bayesian and Naïve Bayes Updateable take the least time to build the model. This indicates that the J48 classification algorithm should be favored over the Bayesian Network, Naïve Bayesian, Naïve Bayes Updateable, and Decision Table classifiers in personalization applications where classification accuracy is important.
REFERENCES
[1] Teixeira, C., Pinto, J.S., Martins, J.A., "User Profiles in Corporate Scenarios", IEEE, 2004, pp. 614-619.
[2] Bartolomeo, G., Kovacikova, T., "User Profile Management in Next Generation Networks", AICT '09, Fifth Advanced International Conference on Telecommunications, 2009, pp. 129-135.
[3] Damian Fijałkowski, Radosław Zatoka, "An architecture of a Web recommender system using social network user profiles for e-commerce", IEEE, 2011, pp. 287-290.
[4] Sung Hyuk Park, Sang Pil Han, Soon Young Huh, Hojin Lee, "Preprocessing Uncertain User Profile Data: Inferring User's Actual Age from Ages of the User's Neighbors", ICDE '09, IEEE 25th International Conference on Data Engineering, 2009, pp. 1619-1024.
[5] Malak Al-hassan, Haiyan Lu, Jie Lu, "A Framework for Delivering Personalized e-Government Services from a Citizen-Centric Approach", Proceedings of iiWAS2009, Kuala Lumpur, Malaysia, 2009, pp. 436-440.
[6] Alicia Y.C. Tang, Nur Hanani Azami, Norfaezah Osman, "Application of Data Mining Techniques in Customer Relationship Management for An Automobile Company", Proceedings of the 5th International Conference on IT & Multimedia at UNITEN (ICIMU 2011), Malaysia, IEEE, 2011.
[7] Cufoglu A., Lohi M., Madani K., "A Comparative Study of Selected Classifiers with Classification Accuracy in User Profiling", Seventh International Conference on Machine Learning and Applications, IEEE, 2009, pp. 787-791.
Classifiers   Time    CCI %     KS       MAE      RAE
BN            1.73    99.8177   0.9977   0.0012   1.2107
NB            0.5     99.4728   0.9935   0.0012   1.2248
J48           2.05    100       1        0        0
DT            16.99   100       1        0.0011   1.048
  • 6. [8] Panda M., Patra M. R., "A comparative study of data mining algorithms for network intrusion detection", 1st International Conference on Emerging Trends in Engineering and Technology, IEEE, 2008, pp. 504-507.
[9] Weka homepage. Available at: http://www.cs.waikato.ac.nz/ml/weka/ accessed on 24/12/11.
[10] Jensen F.V., "Introduction to Bayesian Networks", Denmark, Hugin Expert A/S, 1993.
[11] Jiang L., Guo Y., "Learning lazy Naïve Bayesian classifier for ranking", Proceedings of the 17th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'05), IEEE, 2005, Page(s) 5pp.
[12] Wang Z., I. Geoffrey Webb, "Comparison of lazy Bayesian rule and tree-augmented Bayesian learning", Paper presented at the IEEE Conference on Data Mining, ICDM 2002, Page(s): 490-497.
[13] Shi Z., Huang Y., Zhang S., "Fisher score based naive Bayesian classifier", Paper appears in ICNN&B International Conference on Neural Networks and Brain, IEEE, 2005, pp. 1616-1621.
[14] Santafe G., Lozano J.A., Larranaga P., "Bayesian model averaging of Naive Bayes for clustering", IEEE Transactions on Systems, Man, and Cybernetics, 2006, Vol. 36, No. 5, pp. 1149-1161.
[15] http://en.wikipedia.org/wiki/C4.5_algorithm accessed on 21/03/13.
[16] Asuncion A. and Newman D.J. (2007) UCI Machine Learning Repository, Irvine, CA: University of California, School of Information and Computer Science. [Online] Available from: http://www.ics.uci.edu/~mlearn/MLRepository.html accessed on 11/02/13.