Payal P. Dhakate et al Int. Journal of Engineering Research and Applications www.ijera.com 
ISSN : 2248-9622, Vol. 4, Issue 8( Version 5), August 2014, pp.91-93 
www.ijera.com 91 | P a g e 
Preprocessing and Classification in WEKA Using Different 
Classifiers 
Payal P.Dhakate1, Suvarna Patil2, K. Rajeswari3, Deepa Abin4 
1,2,3,4 Department of Computer Science, Pune University, Aakurdi Pune 
ABSTRACT 
Data mining is the process of extracting information from a dataset and transforming it into an understandable structure for further use; it also discovers patterns in large data sets [1]. Data mining includes a number of important techniques, such as preprocessing and classification. Classification is a supervised learning technique used to predict group membership for data instances. In this paper we apply preprocessing and classification to a diabetes database, run several classifiers on it, and compare the results on a set of parameters using WEKA. In India, 77.2 million people suffer from pre-diabetes, and ICMR estimates that around 65.1 million are diabetes patients. Globally, in 2010, 227 to 285 million people had diabetes, of which 90% of cases were type 2; this is equal to 3.3% of the population, with equal rates in women and men. In 2011 diabetes resulted in 1.4 million deaths worldwide, making it a leading cause of death.
Keywords- AD Tree; J48; Random Tree; REP Tree; Simple CART; WEKA
I. INTRODUCTION 
WEKA was introduced by the University of Waikato; it is open-source software written in Java and used for purposes such as research and education. Fig. 1 illustrates the WEKA interface. Classification is the process of finding a model or function that describes and distinguishes data classes or concepts, with the intention of using the model to predict the class of objects whose class label is unknown. The problem addressed here is a comparative study of classification techniques such as Random Tree, FT Tree and LMT Tree, using various parameters, on the diabetes dataset containing 9 attributes and 768 instances.

Classification and Regression Trees (CART) were introduced by Breiman. CART is based on binary splitting of the attributes, and the Gini index is used to select the splitting attribute. It uses both numeric and categorical attributes for building the decision tree and has built-in features to deal with missing attribute values. FT Tree is a classifier for building functional trees, which are classification trees that can have logistic regression functions at the inner nodes and/or leaves. The algorithm can deal with binary and multi-class target variables, numeric and nominal attributes, and missing values. LMT Tree is a classifier for building logistic model trees, which are classification trees with logistic regression functions at the leaves. This algorithm can also deal with binary and multi-class target variables, numeric and nominal attributes, and missing values. REP Tree is a fast decision tree learner: it builds a decision/regression tree using information gain/variance and prunes it using reduced-error pruning. It sorts values for numeric attributes only once. Missing values are dealt with by splitting the corresponding instances into pieces. LAD Tree is a class for generating a multi-class alternating decision tree using the LogitBoost strategy. In this experiment we use the diabetes database, which has 9 attributes and 768 instances; some of the attributes are preg, plas, pedi, age and class.
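The two split criteria named above, the Gini index (CART) and information gain (REP Tree), can be illustrated with a minimal sketch. This is not WEKA's implementation: the function names, the toy glucose readings and the threshold of 120 are all hypothetical and chosen only to show the arithmetic for one binary split on a numeric attribute.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_scores(values, labels, threshold):
    """Score the binary split `value <= threshold` versus `value > threshold`.

    Returns (weighted Gini impurity after the split, information gain).
    """
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    w_left, w_right = len(left) / n, len(right) / n
    weighted_gini = w_left * gini(left) + w_right * gini(right)
    info_gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    return weighted_gini, info_gain


# Hypothetical toy data: six glucose readings with made-up class labels.
plas = [85, 89, 116, 137, 148, 183]
label = ["neg", "neg", "neg", "pos", "pos", "pos"]

weighted_gini, info_gain = split_scores(plas, label, threshold=120)
print(weighted_gini, info_gain)
```

A lower weighted Gini impurity and a higher information gain both indicate a purer split; a tree learner would evaluate many candidate thresholds and attributes and keep the best one.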
Fig. 1. WEKA interface
In this paper, Section II defines the problem statement and discusses the proposed method for identifying the class of diabetes using data mining classifiers such as Random Tree. Experimental results and performance evaluation are discussed in Section III, and the conclusion is put forth in Section IV.
II. PROPOSED METHOD 
Classification is the process of finding a model 
(or function) that describes and distinguishes data 
classes or concepts, for the purpose of being able to use the model to predict the class of objects whose class label is unknown [3]. It is a technique used to predict group membership for data instances. Classification is a two-step process. First, a classification model is built using training data; every object in the training dataset must be pre-classified, i.e. its class label must be known. Second, the model generated in the preceding step is tested by assigning class labels to data objects in a test dataset.

Here we use a diabetes dataset because nowadays the percentage of diabetes patients is growing very fast. According to the Diabetes Atlas published by the International Diabetes Federation (IDF), there were an estimated 40 million persons with diabetes in India in 2007, and this number is predicted to rise to almost 70 million by 2025. India accounts for the largest number of people suffering from diabetes in the world, 50.8 million. India continues to be the "diabetes capital" of the world, and by 2030 nearly 9 per cent of the country's population is likely to be affected by the disease. It is estimated that every fifth person with diabetes will be an Indian. This means that India has the highest number of people with diabetes of any country in the world.

The diabetes database used here has 9 attributes and 768 instances. It includes attributes such as preg, mass, age and insu, which are used to predict whether or not a person has diabetes. For example, we may wish to determine whether a person has type 1 or type 2 diabetes. For classification we apply different classifiers such as Simple CART, AD Tree and Random Tree. Discretization is used as the preprocessing technique; it can be applied to all attributes, such as preg, plas, pres and insu. Of these, the preprocessing of the preg attribute is shown in Fig. 2.

Fig. 2. Preprocessing of diabetes database
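The discretization step described above can be sketched as follows. The paper does not state which discretization variant or how many bins were used, so this example assumes simple equal-width binning with three bins; the function name `equal_width_bins` and the sample values for the preg attribute are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in [0, n_bins - 1].

    Bins have equal width over the range [min(values), max(values)].
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so everything falls into a single bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land at index n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical values for the preg attribute (number of pregnancies).
preg = [0, 1, 2, 3, 5, 6, 8, 10, 12]
bins = equal_width_bins(preg, n_bins=3)
print(bins)
```

After this step each numeric value is replaced by the index of its interval, so a classifier sees a small set of nominal categories instead of a continuous range.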
We can then compare the results of the different classification techniques. Fig. 4 shows a comparison of classifiers using different measures. In this paper, out of all the classifiers, Random Tree gives better accuracy and also requires less time. Fig. 3 shows this tree, and Fig. 5 illustrates the accuracy and time required for the different trees.
Fig. 3. Classification using FT Tree
III. EXPERIMENTAL RESULTS AND PERFORMANCE EVALUATION 
a) Measures for performance evaluation: Sensitivity, specificity and accuracy are used to measure performance; these concepts are readily usable for the evaluation of any binary classifier. TP is true positive, FP is false positive, TN is true negative and FN is false negative. TPR is the true positive rate, which is equivalent to recall [2].
Sensitivity = True Positive Rate = True Positive / (True Positive + False Negative) (1)
Specificity = True Negative / (False Positive + True Negative) (2)
b) Precision: Precision is the number of instances correctly classified as belonging to class X divided by the total number of instances classified as belonging to class X; that is, it is the proportion of true positives out of all positive results [2].
Precision = True Positive / (True Positive + False Positive) (3)
c) Accuracy: Accuracy is the ratio of the number of correctly classified instances to the total number of instances, multiplied by 100 when expressed as a percentage [2].
Accuracy = (True Positive + True Negative) / (True Positive + False Positive + False Negative + True Negative) (4)
d) False Positive Rate: This is simply the ratio of false positives to false positives plus true negatives. In an ideal world the false positive rate is zero [2].
False Positive Rate = False Positive / (False Positive + True Negative) (5)
e) F-Measure: The F-measure combines the recall and precision scores into a single measure of performance [2].
F-Measure = 2 * Recall * Precision / (Recall + Precision) (6)
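As a worked illustration of the measures defined above, the following minimal sketch computes them from the four confusion-matrix counts, with accuracy taken as correctly classified instances over all instances. The function name `binary_metrics` and the counts are hypothetical, and the sketch assumes every denominator is non-zero.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute the evaluation measures for a binary classifier.

    Assumes every denominator is non-zero (each class occurs at least once
    and at least one instance is predicted positive).
    """
    recall = tp / (tp + fn)        # sensitivity, i.e. true positive rate
    specificity = tn / (fp + tn)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    fp_rate = fp / (fp + tn)
    f_measure = 2 * recall * precision / (recall + precision)
    return {
        "sensitivity": recall,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "fp_rate": fp_rate,
        "f_measure": f_measure,
    }


# Hypothetical counts for 100 test instances.
m = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Note that specificity and the false positive rate always sum to 1, which gives a quick consistency check on reported results.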
Table 1. Comparison of classifiers using different measures
Table 2. Error rates of different classifiers
IV. CONCLUSION 
In this paper we discussed data mining, preprocessing and different classification techniques on a diabetes database using the WEKA tool. FT Tree has shown more accurate results than other techniques such as LAD Tree, Simple CART, J48, LMT Tree and REP Tree in terms of accuracy, time and errors.
REFERENCES 
[1] J. Han and M. Kamber, "Data Mining: Concepts and Techniques", Morgan Kaufmann, 2000.
[2] Varun Kumar and Nisha Rathee, "Knowledge discovery from database using an integration of clustering and classification", (IJACSA) International Journal of Advanced Computer Science and Applications, 2011.
[3] Swasti Singhal, Monika Jena, "A Study on WEKA Tool for Data Preprocessing, Classification and Clustering", International Journal of Innovative Technology and Exploring Engineering (IJITEE), 2013.
Table 1
Trees                               J48    FT Tree  LAD    Simple Cart  LMT    REP Tree
TP Rate                             1      0.593    0.57   0.534        0.56   0.58
FP Rate                             0.053  0.13     0.17   0.132        0.11   0.154
Precision                           0.833  0.71     0.644  0.684        0.732  0.67
Recall                              1      0.593    0.575  0.534        0.56   0.582
Time Taken                          0.04   0.08     0.08   0.06         0.59   0.02
Correctly Classified Instances (%)  91     77       74     75           77     75
Accuracy (%)                        84     77       74     75           77     75
Table 2
Classifier   Kappa statistic  Mean absolute error  Root mean squared error  Relative absolute error  Root relative squared error
LAD tree     0.415            0.0322               0.4237                   70.85                    88.88
REP tree     0.4415           0.3261               0.4181                   75.21                    90.7
J48          0.6319           0.2383               0.3452                   52.4339                  72.4207
Simple cart  0.4232           0.3419               75.21                    76.49                    87.47
LMT Tree     0.4756           0.3175               0.3963                   69.84                    83.15
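Table 2 reports a kappa statistic for each classifier without defining it. As a hedged illustration, the sketch below computes Cohen's kappa for a binary classifier from confusion-matrix counts, that is, the agreement between predicted and actual labels corrected for the agreement expected by chance. The function name `cohen_kappa` and the counts are hypothetical and are not taken from the paper's experiments.

```python
def cohen_kappa(tp, fp, tn, fn):
    """Cohen's kappa for a 2x2 confusion matrix.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    # Chance agreement: probability that the actual and predicted labels
    # coincide if they were independent, summed over both classes.
    chance_pos = ((tp + fn) / n) * ((tp + fp) / n)
    chance_neg = ((tn + fp) / n) * ((tn + fn) / n)
    expected = chance_pos + chance_neg
    return (observed - expected) / (1 - expected)


# Hypothetical counts for 100 test instances.
kappa = cohen_kappa(tp=40, fp=10, tn=45, fn=5)
print(round(kappa, 4))
```

A kappa of 1 means perfect agreement and a kappa of 0 means agreement no better than chance, so higher values in the table indicate a classifier whose accuracy is less attributable to the class distribution alone.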

More Related Content

What's hot

A Survey of Modern Data Classification Techniques
A Survey of Modern Data Classification TechniquesA Survey of Modern Data Classification Techniques
A Survey of Modern Data Classification Techniquesijsrd.com
 
Comprehensive Survey of Data Classification & Prediction Techniques
Comprehensive Survey of Data Classification & Prediction TechniquesComprehensive Survey of Data Classification & Prediction Techniques
Comprehensive Survey of Data Classification & Prediction Techniquesijsrd.com
 
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...ijistjournal
 
Decision Tree Classifiers to determine the patient’s Post-operative Recovery ...
Decision Tree Classifiers to determine the patient’s Post-operative Recovery ...Decision Tree Classifiers to determine the patient’s Post-operative Recovery ...
Decision Tree Classifiers to determine the patient’s Post-operative Recovery ...Waqas Tariq
 
Analysis on Data Mining Techniques for Heart Disease Dataset
Analysis on Data Mining Techniques for Heart Disease DatasetAnalysis on Data Mining Techniques for Heart Disease Dataset
Analysis on Data Mining Techniques for Heart Disease DatasetIRJET Journal
 
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEYCLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEYEditor IJMTER
 
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...ijistjournal
 
Regularized Weighted Ensemble of Deep Classifiers
Regularized Weighted Ensemble of Deep Classifiers Regularized Weighted Ensemble of Deep Classifiers
Regularized Weighted Ensemble of Deep Classifiers ijcsa
 
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...IRJET Journal
 
The use of rough set theory in determining the preferences of the customers o...
The use of rough set theory in determining the preferences of the customers o...The use of rough set theory in determining the preferences of the customers o...
The use of rough set theory in determining the preferences of the customers o...Alexander Decker
 
Comparative Analysis: Effective Information Retrieval Using Different Learnin...
Comparative Analysis: Effective Information Retrieval Using Different Learnin...Comparative Analysis: Effective Information Retrieval Using Different Learnin...
Comparative Analysis: Effective Information Retrieval Using Different Learnin...RSIS International
 
IRJET- Comparative Analysis of Data Mining Classification Techniques for Hear...
IRJET- Comparative Analysis of Data Mining Classification Techniques for Hear...IRJET- Comparative Analysis of Data Mining Classification Techniques for Hear...
IRJET- Comparative Analysis of Data Mining Classification Techniques for Hear...IRJET Journal
 
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINEA NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINEaciijournal
 
Data mining technique for opinion
Data mining technique for opinionData mining technique for opinion
Data mining technique for opinionIJDKP
 
IRJET- Disease Prediction System
IRJET- Disease Prediction SystemIRJET- Disease Prediction System
IRJET- Disease Prediction SystemIRJET Journal
 
Comparative study of various supervisedclassification methodsforanalysing def...
Comparative study of various supervisedclassification methodsforanalysing def...Comparative study of various supervisedclassification methodsforanalysing def...
Comparative study of various supervisedclassification methodsforanalysing def...eSAT Publishing House
 
MLTDD : USE OF MACHINE LEARNING TECHNIQUES FOR DIAGNOSIS OF THYROID GLAND DIS...
MLTDD : USE OF MACHINE LEARNING TECHNIQUES FOR DIAGNOSIS OF THYROID GLAND DIS...MLTDD : USE OF MACHINE LEARNING TECHNIQUES FOR DIAGNOSIS OF THYROID GLAND DIS...
MLTDD : USE OF MACHINE LEARNING TECHNIQUES FOR DIAGNOSIS OF THYROID GLAND DIS...cscpconf
 
Multi-label Classification Methods: A Comparative Study
Multi-label Classification Methods: A Comparative StudyMulti-label Classification Methods: A Comparative Study
Multi-label Classification Methods: A Comparative StudyIRJET Journal
 
Analysis On Classification Techniques In Mammographic Mass Data Set
Analysis On Classification Techniques In Mammographic Mass Data SetAnalysis On Classification Techniques In Mammographic Mass Data Set
Analysis On Classification Techniques In Mammographic Mass Data SetIJERA Editor
 

What's hot (20)

A Survey of Modern Data Classification Techniques
A Survey of Modern Data Classification TechniquesA Survey of Modern Data Classification Techniques
A Survey of Modern Data Classification Techniques
 
Comprehensive Survey of Data Classification & Prediction Techniques
Comprehensive Survey of Data Classification & Prediction TechniquesComprehensive Survey of Data Classification & Prediction Techniques
Comprehensive Survey of Data Classification & Prediction Techniques
 
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
 
Decision Tree Classifiers to determine the patient’s Post-operative Recovery ...
Decision Tree Classifiers to determine the patient’s Post-operative Recovery ...Decision Tree Classifiers to determine the patient’s Post-operative Recovery ...
Decision Tree Classifiers to determine the patient’s Post-operative Recovery ...
 
G046024851
G046024851G046024851
G046024851
 
Analysis on Data Mining Techniques for Heart Disease Dataset
Analysis on Data Mining Techniques for Heart Disease DatasetAnalysis on Data Mining Techniques for Heart Disease Dataset
Analysis on Data Mining Techniques for Heart Disease Dataset
 
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEYCLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
 
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
 
Regularized Weighted Ensemble of Deep Classifiers
Regularized Weighted Ensemble of Deep Classifiers Regularized Weighted Ensemble of Deep Classifiers
Regularized Weighted Ensemble of Deep Classifiers
 
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...
 
The use of rough set theory in determining the preferences of the customers o...
The use of rough set theory in determining the preferences of the customers o...The use of rough set theory in determining the preferences of the customers o...
The use of rough set theory in determining the preferences of the customers o...
 
Comparative Analysis: Effective Information Retrieval Using Different Learnin...
Comparative Analysis: Effective Information Retrieval Using Different Learnin...Comparative Analysis: Effective Information Retrieval Using Different Learnin...
Comparative Analysis: Effective Information Retrieval Using Different Learnin...
 
IRJET- Comparative Analysis of Data Mining Classification Techniques for Hear...
IRJET- Comparative Analysis of Data Mining Classification Techniques for Hear...IRJET- Comparative Analysis of Data Mining Classification Techniques for Hear...
IRJET- Comparative Analysis of Data Mining Classification Techniques for Hear...
 
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINEA NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE
 
Data mining technique for opinion
Data mining technique for opinionData mining technique for opinion
Data mining technique for opinion
 
IRJET- Disease Prediction System
IRJET- Disease Prediction SystemIRJET- Disease Prediction System
IRJET- Disease Prediction System
 
Comparative study of various supervisedclassification methodsforanalysing def...
Comparative study of various supervisedclassification methodsforanalysing def...Comparative study of various supervisedclassification methodsforanalysing def...
Comparative study of various supervisedclassification methodsforanalysing def...
 
MLTDD : USE OF MACHINE LEARNING TECHNIQUES FOR DIAGNOSIS OF THYROID GLAND DIS...
MLTDD : USE OF MACHINE LEARNING TECHNIQUES FOR DIAGNOSIS OF THYROID GLAND DIS...MLTDD : USE OF MACHINE LEARNING TECHNIQUES FOR DIAGNOSIS OF THYROID GLAND DIS...
MLTDD : USE OF MACHINE LEARNING TECHNIQUES FOR DIAGNOSIS OF THYROID GLAND DIS...
 
Multi-label Classification Methods: A Comparative Study
Multi-label Classification Methods: A Comparative StudyMulti-label Classification Methods: A Comparative Study
Multi-label Classification Methods: A Comparative Study
 
Analysis On Classification Techniques In Mammographic Mass Data Set
Analysis On Classification Techniques In Mammographic Mass Data SetAnalysis On Classification Techniques In Mammographic Mass Data Set
Analysis On Classification Techniques In Mammographic Mass Data Set
 

Viewers also liked

Strategies for assessing cloud security
Strategies for assessing cloud securityStrategies for assessing cloud security
Strategies for assessing cloud securityArun Gopinath
 
Computational approach of the homeostatic turnover of memory B cells.
Computational approach of the homeostatic turnover of memory B cells.Computational approach of the homeostatic turnover of memory B cells.
Computational approach of the homeostatic turnover of memory B cells.IJERA Editor
 
A New Approach of Learning Hierarchy Construction Based on Fuzzy Logic
A New Approach of Learning Hierarchy Construction Based on Fuzzy LogicA New Approach of Learning Hierarchy Construction Based on Fuzzy Logic
A New Approach of Learning Hierarchy Construction Based on Fuzzy LogicIJERA Editor
 
Meteor Goodie Bag case study
Meteor Goodie Bag case studyMeteor Goodie Bag case study
Meteor Goodie Bag case studyPost Media
 
HeroCloud: The Next Generation Solution to Online Game Development
HeroCloud: The Next Generation Solution to Online Game Development HeroCloud: The Next Generation Solution to Online Game Development
HeroCloud: The Next Generation Solution to Online Game Development HeroEngine
 
Securing virtualization in real world environments
Securing virtualization in real world environmentsSecuring virtualization in real world environments
Securing virtualization in real world environmentsArun Gopinath
 
Cloud computing white paper who do you trust
Cloud computing white paper who do you trustCloud computing white paper who do you trust
Cloud computing white paper who do you trustArun Gopinath
 
Photoluminescence of Bioceramic Materials and Bioceramic Resonance
Photoluminescence of Bioceramic Materials and Bioceramic ResonancePhotoluminescence of Bioceramic Materials and Bioceramic Resonance
Photoluminescence of Bioceramic Materials and Bioceramic ResonanceIJERA Editor
 
Enhanced Seamless Handoff Using Multiple Access Points in Wireless Local Area...
Enhanced Seamless Handoff Using Multiple Access Points in Wireless Local Area...Enhanced Seamless Handoff Using Multiple Access Points in Wireless Local Area...
Enhanced Seamless Handoff Using Multiple Access Points in Wireless Local Area...IJERA Editor
 
Separation of Water to Concentrate Aloe Vera Juice:
Separation of Water to Concentrate Aloe Vera Juice:Separation of Water to Concentrate Aloe Vera Juice:
Separation of Water to Concentrate Aloe Vera Juice:IJERA Editor
 

Viewers also liked (20)

Cx4301574577
Cx4301574577Cx4301574577
Cx4301574577
 
Strategies for assessing cloud security
Strategies for assessing cloud securityStrategies for assessing cloud security
Strategies for assessing cloud security
 
K045066265
K045066265K045066265
K045066265
 
B044040916
B044040916B044040916
B044040916
 
Computational approach of the homeostatic turnover of memory B cells.
Computational approach of the homeostatic turnover of memory B cells.Computational approach of the homeostatic turnover of memory B cells.
Computational approach of the homeostatic turnover of memory B cells.
 
A New Approach of Learning Hierarchy Construction Based on Fuzzy Logic
A New Approach of Learning Hierarchy Construction Based on Fuzzy LogicA New Approach of Learning Hierarchy Construction Based on Fuzzy Logic
A New Approach of Learning Hierarchy Construction Based on Fuzzy Logic
 
Meteor Goodie Bag case study
Meteor Goodie Bag case studyMeteor Goodie Bag case study
Meteor Goodie Bag case study
 
R04504114117
R04504114117R04504114117
R04504114117
 
HeroCloud: The Next Generation Solution to Online Game Development
HeroCloud: The Next Generation Solution to Online Game Development HeroCloud: The Next Generation Solution to Online Game Development
HeroCloud: The Next Generation Solution to Online Game Development
 
Securing virtualization in real world environments
Securing virtualization in real world environmentsSecuring virtualization in real world environments
Securing virtualization in real world environments
 
Cloud computing white paper who do you trust
Cloud computing white paper who do you trustCloud computing white paper who do you trust
Cloud computing white paper who do you trust
 
Photoluminescence of Bioceramic Materials and Bioceramic Resonance
Photoluminescence of Bioceramic Materials and Bioceramic ResonancePhotoluminescence of Bioceramic Materials and Bioceramic Resonance
Photoluminescence of Bioceramic Materials and Bioceramic Resonance
 
Enhanced Seamless Handoff Using Multiple Access Points in Wireless Local Area...
Enhanced Seamless Handoff Using Multiple Access Points in Wireless Local Area...Enhanced Seamless Handoff Using Multiple Access Points in Wireless Local Area...
Enhanced Seamless Handoff Using Multiple Access Points in Wireless Local Area...
 
J044025054
J044025054J044025054
J044025054
 
Separation of Water to Concentrate Aloe Vera Juice:
Separation of Water to Concentrate Aloe Vera Juice:Separation of Water to Concentrate Aloe Vera Juice:
Separation of Water to Concentrate Aloe Vera Juice:
 
Eu4301883896
Eu4301883896Eu4301883896
Eu4301883896
 
A42020106
A42020106A42020106
A42020106
 
O44087882
O44087882O44087882
O44087882
 
Aj044236240
Aj044236240Aj044236240
Aj044236240
 
Y04404152158
Y04404152158Y04404152158
Y04404152158
 

Similar to Preprocessing and Classification in WEKA Using Different Classifiers

Assessment of Decision Tree Algorithms on Student’s Recital
Assessment of Decision Tree Algorithms on Student’s RecitalAssessment of Decision Tree Algorithms on Student’s Recital
Assessment of Decision Tree Algorithms on Student’s RecitalIRJET Journal
 
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASETSURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASETEditor IJMTER
 
Enhanced Detection System for Trust Aware P2P Communication Networks
Enhanced Detection System for Trust Aware P2P Communication NetworksEnhanced Detection System for Trust Aware P2P Communication Networks
Enhanced Detection System for Trust Aware P2P Communication NetworksEditor IJCATR
 
Comparative Study of Diabetic Patient Data’s Using Classification Algorithm i...
Comparative Study of Diabetic Patient Data’s Using Classification Algorithm i...Comparative Study of Diabetic Patient Data’s Using Classification Algorithm i...
Comparative Study of Diabetic Patient Data’s Using Classification Algorithm i...Editor IJCATR
 
C omparative S tudy of D iabetic P atient D ata’s U sing C lassification A lg...
C omparative S tudy of D iabetic P atient D ata’s U sing C lassification A lg...C omparative S tudy of D iabetic P atient D ata’s U sing C lassification A lg...
C omparative S tudy of D iabetic P atient D ata’s U sing C lassification A lg...Editor IJCATR
 
Disease prediction in big data healthcare using extended convolutional neural...
Disease prediction in big data healthcare using extended convolutional neural...Disease prediction in big data healthcare using extended convolutional neural...
Disease prediction in big data healthcare using extended convolutional neural...IJAAS Team
 
IRJET- Agricultural Crop Classification Models in Data Mining Techniques
IRJET- Agricultural Crop Classification Models in Data Mining TechniquesIRJET- Agricultural Crop Classification Models in Data Mining Techniques
IRJET- Agricultural Crop Classification Models in Data Mining TechniquesIRJET Journal
 
IRJET- Predicting Heart Disease using Machine Learning Algorithm
IRJET- Predicting Heart Disease using Machine Learning AlgorithmIRJET- Predicting Heart Disease using Machine Learning Algorithm
IRJET- Predicting Heart Disease using Machine Learning AlgorithmIRJET Journal
 
Advanced Computational Intelligence: An International Journal (ACII)
Advanced Computational Intelligence: An International Journal (ACII)Advanced Computational Intelligence: An International Journal (ACII)
Advanced Computational Intelligence: An International Journal (ACII)aciijournal
 
Propose a Enhanced Framework for Prediction of Heart Disease
Propose a Enhanced Framework for Prediction of Heart DiseasePropose a Enhanced Framework for Prediction of Heart Disease
Propose a Enhanced Framework for Prediction of Heart DiseaseIJERA Editor
 
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...IRJET Journal
 
PREDICTION OF DISEASE WITH MINING ALGORITHMS IN MACHINE LEARNING
PREDICTION OF DISEASE WITH MINING ALGORITHMS IN MACHINE LEARNINGPREDICTION OF DISEASE WITH MINING ALGORITHMS IN MACHINE LEARNING
PREDICTION OF DISEASE WITH MINING ALGORITHMS IN MACHINE LEARNINGIRJET Journal
 
Machine Learning Approaches and its Challenges
Machine Learning Approaches and its ChallengesMachine Learning Approaches and its Challenges
Machine Learning Approaches and its Challengesijcnes
 
Decision Tree Models for Medical Diagnosis
Decision Tree Models for Medical DiagnosisDecision Tree Models for Medical Diagnosis
Decision Tree Models for Medical Diagnosisijtsrd
 
An efficient feature selection algorithm for health care data analysis
An efficient feature selection algorithm for health care data analysisAn efficient feature selection algorithm for health care data analysis
An efficient feature selection algorithm for health care data analysisjournalBEEI
 
IRJET - Machine Learning for Diagnosis of Diabetes
IRJET - Machine Learning for Diagnosis of DiabetesIRJET - Machine Learning for Diagnosis of Diabetes
IRJET - Machine Learning for Diagnosis of DiabetesIRJET Journal
 
A Hybrid Apporach of Classification Techniques for Predicting Diabetes using ...
A Hybrid Apporach of Classification Techniques for Predicting Diabetes using ...A Hybrid Apporach of Classification Techniques for Predicting Diabetes using ...
A Hybrid Apporach of Classification Techniques for Predicting Diabetes using ...ijtsrd
 
Performance evaluation of random forest with feature selection methods in pre...
Performance evaluation of random forest with feature selection methods in pre...Performance evaluation of random forest with feature selection methods in pre...
Performance evaluation of random forest with feature selection methods in pre...IJECEIAES
 

Preprocessing and Classification in WEKA Using Different Classifiers

RESEARCH ARTICLE, OPEN ACCESS
Payal P. Dhakate et al., Int. Journal of Engineering Research and Applications, ISSN: 2248-9622, Vol. 4, Issue 8 (Version 5), August 2014, pp. 91-93, www.ijera.com

Payal P. Dhakate (1), Suvarna Patil (2), K. Rajeswari (3), Deepa Abin (4)
(1,2,3,4) Department of Computer Science, Pune University, Aakurdi, Pune

ABSTRACT
Data mining is the process of extracting information from a dataset and transforming it into an understandable structure for further use; it also discovers patterns in large data sets [1]. Data mining includes a number of important techniques, such as preprocessing and classification. Classification is based on supervised learning and is used to predict group membership for data instances. In this paper we apply preprocessing and classification to a diabetes database, run several classifiers on it, and compare the results on selected parameters using WEKA. In India, 77.2 million people suffer from prediabetes, and ICMR estimates that around 65.1 million are diabetes patients. Globally, in 2010, 227 to 285 million people had diabetes, and about 90% of those cases were type 2; this equals 3.3% of the population, with equal rates in women and men. In 2011 diabetes resulted in 1.4 million deaths worldwide, making it a leading cause of death.

Keywords - AD Tree; J48; Random Tree; REP Tree; Simple Cart; WEKA

I. INTRODUCTION
WEKA was developed at the University of Waikato. It is open-source software written in Java and is used for purposes such as research and education. Fig. 1 illustrates the WEKA interface. Classification is the process of finding a model or function that describes and distinguishes data classes or concepts, with the intention of using the model to predict the class of objects whose class label is unknown.

The problem addressed here is a comparative study of classification techniques such as Random Tree, FT Tree, and LMT Tree on several parameters, using the diabetes dataset, which contains 9 attributes and 768 instances.

Classification and Regression Trees (CART) were introduced by Breiman et al. CART is based on binary splitting of the attributes, and the Gini index is used to select the splitting attribute. It uses both numeric and categorical attributes to build the decision tree and has built-in features for dealing with missing attribute values.

FT Tree is a classifier for building functional trees, which are classification trees that can have logistic regression functions at the inner nodes and/or leaves. The algorithm can deal with binary and multi-class target variables, numeric and nominal attributes, and missing values.

LMT Tree is a classifier for building logistic model trees, which are classification trees with logistic regression functions at the leaves. The algorithm can deal with binary and multi-class target variables, numeric and nominal attributes, and missing values.

REP Tree is a fast decision tree learner. It builds a decision or regression tree using information gain or variance and prunes it using reduced-error pruning. It sorts the values of numeric attributes only once. Missing values are dealt with by splitting the corresponding instances into pieces.

LAD Tree is a class for generating a multi-class alternating decision tree using the LogitBoost strategy.

In this experiment we use the diabetes database, which has 768 instances and 9 attributes, including preg, plas, pedi, age, and class.

Fig. 1. WEKA interface

The rest of the paper is organized as follows. Section II discusses the proposed method and defines the problem statement, including the classification approach used to identify the class of diabetes. Section III presents the experimental results and performance evaluation, and Section IV puts forth the conclusion.

II. PROPOSED METHOD
Classification is the process of finding a model (or function) that describes and distinguishes data classes or concepts, for the purpose of being able to use the model to predict the class of objects whose class label is unknown [3]. It is a technique used to predict group membership for data instances. Classification is a two-step process. First, a classification model is built from training data; every object in the training set must be pre-classified, i.e. its class label must be known. Second, the model generated in the first step is tested by assigning class labels to the data objects in a test dataset.

We use a diabetes dataset because the percentage of diabetes patients is growing very fast. According to the Diabetes Atlas published by the International Diabetes Federation (IDF), there were an estimated 40 million persons with diabetes in India in 2007, and this number is predicted to rise to almost 70 million by 2025. India accounts for the largest number of people suffering from diabetes in the world, 50.8 million. India continues to be the "diabetes capital" of the world, and by 2030 nearly 9 per cent of the country's population is likely to be affected by the disease. It is estimated that every fifth person with diabetes will be an Indian. This means that India has the highest number of diabetes patients of any country in the world.

The diabetes database used here has 768 instances and 9 attributes, including preg, mass, age, and insu. These attributes are used to predict whether or not a person has diabetes; as an example, we may wish to determine whether a person has type 1 or type 2 diabetes. For classification we apply different classifiers such as Simple Cart, AD Tree, and Random Tree. Discretization is used as the preprocessing technique and is applied to all attributes, such as preg, plas, pres, and insu.
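As a rough illustration of the discretization step described above, the following sketch bins a numeric attribute into equal-width intervals. The paper does not state which discretization method or how many bins were used, so the equal-width scheme, the bin count of 3, the helper name equal_width_bins, and the sample preg values are all assumptions made purely for illustration.

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in the range 0 .. n_bins - 1.

    Assumption: equal-width binning over the observed min/max of the attribute.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls into the first bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    bins = []
    for v in values:
        idx = int((v - lo) / width)
        # The maximum value would land in bin n_bins, so clamp it into the last bin.
        bins.append(min(idx, n_bins - 1))
    return bins


# Hypothetical values for the preg attribute (not taken from the dataset).
preg = [0, 1, 3, 6, 8, 10, 13, 17]
preg_bins = equal_width_bins(preg, 3)
print(preg_bins)
```

After this step each numeric value is replaced by the index of its interval, so the attribute can be treated as nominal by a classifier.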
Of these attributes, the preprocessing of the preg attribute is shown in Fig. 2.

Fig. 2. Preprocessing of diabetes database

We can then compare the results of the different classification techniques. Fig. 4 shows a comparison of the classifiers using different measures. In this paper, Random Tree gives better accuracy than the other classifiers and also requires less time. Fig. 3 shows this random tree, and Fig. 5 illustrates the accuracy and time required for the different trees.

Fig. 3. Classification using FT Tree

III. EXPERIMENTAL RESULTS AND PERFORMANCE EVALUATION
a) Measures for performance evaluation: Sensitivity, specificity, and accuracy are used to measure performance; these concepts are readily usable for evaluating any binary classifier. TP is the number of true positives, FP the number of false positives, TN the number of true negatives, and FN the number of false negatives. TPR is the true positive rate, which is equivalent to recall [2].
Sensitivity = True Positive Rate = TP / (TP + FN)    (1)
Specificity = TN / (FP + TN)    (2)

b) Precision: Precision is the number of instances correctly classified as belonging to class X divided by the number of instances classified as belonging to class X; that is, it is the proportion of true positives among all positive results [2].
Precision = TP / (TP + FP)    (3)

c) Accuracy: Accuracy is the ratio of the number of correctly classified instances to the total number of instances; multiplied by 100, it is expressed as a percentage [2].
Accuracy = (TP + TN) / ((TP + FN) + (FP + TN))    (4)

d) False Positive Rate: This is the ratio of false positives to false positives plus true negatives. In an ideal world the false positive rate is zero [2].
False Positive Rate = FP / (FP + TN)    (5)

e) F-Measure: The F-measure combines the recall and precision scores into a single measure of performance [2].
F-Measure = 2 * Recall * Precision / (Recall + Precision)    (6)
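To make the measures in (1) to (6) concrete, the sketch below computes them from the four confusion-matrix counts. Accuracy is taken as correctly classified instances over all instances, following the verbal definition in (c). The function name binary_metrics and the example counts are hypothetical and are not results from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute the evaluation measures for a binary classifier.

    Assumption: every denominator is non-zero, i.e. both classes occur
    and at least one instance is predicted positive.
    """
    sensitivity = tp / (tp + fn)                 # Eq. (1), also recall / TPR
    specificity = tn / (fp + tn)                 # Eq. (2)
    precision = tp / (tp + fp)                   # Eq. (3)
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # Eq. (4), as a fraction
    false_positive_rate = fp / (fp + tn)         # Eq. (5)
    f_measure = (2 * sensitivity * precision
                 / (sensitivity + precision))    # Eq. (6)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "false_positive_rate": false_positive_rate,
        "f_measure": f_measure,
    }


# Hypothetical confusion-matrix counts, chosen only for illustration.
metrics = binary_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that specificity and the false positive rate always sum to 1, which gives a quick sanity check on any reported pair of values.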
Table 1. Comparison of classifiers using different measures

Trees                                J48     FT Tree  LAD     Simple Cart  LMT     REP Tree
TP Rate                              1       0.593    0.57    0.534        0.56    0.58
FP Rate                              0.053   0.13     0.17    0.132        0.11    0.154
Precision                            0.833   0.71     0.644   0.684        0.732   0.67
Recall                               1       0.593    0.575   0.534        0.56    0.582
Time Taken                           0.04    0.08     0.08    0.06         0.59    0.02
Correctly Classified Instances (%)   91      77       74      75           77      75
Accuracy (%)                         84      77       74      75           77      75

Table 2. Error rates of different classifiers

Classifier    Kappa statistic  Mean absolute error  Root mean squared error  Relative absolute error (%)  Root relative squared error (%)
LAD tree      0.415            0.0322               0.4237                   70.85                        88.88
REP tree      0.4415           0.3261               0.4181                   75.21                        90.7
J48           0.6319           0.2383               0.3452                   52.4339                      72.4207
Simple cart   0.4232           0.3419               75.21                    76.49                        87.47
LMT Tree      0.4756           0.3175               0.3963                   69.84                        83.15

IV. CONCLUSION
In this paper we discussed data mining, preprocessing, and different classification techniques on the diabetes database using the WEKA tool. The FT Tree has shown more accurate results than other techniques such as LAD Tree, Simple Cart, J48, LMT Tree, and REP Tree in terms of accuracy, time, and errors.

REFERENCES
[1] J. Han and M. Kamber, "Data Mining: Concepts and Techniques", Morgan Kaufmann, 2000.
[2] Varun Kumar and Nisha Rathee, "Knowledge discovery from database Using an integration of clustering and classification", (IJACSA) International Journal of Advanced Computer Science and Applications, 2011.
[3] Swasti Singhal and Monika Jena, "A Study on WEKA Tool for Data Preprocessing, Classification and Clustering", International Journal of Innovative Technology and Exploring Engineering (IJITEE), 2013.