This document discusses using fuzzy set theory and decision trees to predict student performance. It proposes using fuzzy sets to represent numeric student data like test scores and attendance to allow for imprecise values. A decision tree is generated on this fuzzy data set to classify students as passing or failing. The fuzzy decision tree achieves an accuracy of 81.5% compared to 76% for a non-fuzzy decision tree, indicating fuzzy sets improve predictive performance. Location, attendance, and prior academic performance were identified as important factors impacting student results.
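To make the fuzzy representation concrete, the following is a minimal sketch of how a numeric test score could be mapped to fuzzy membership degrees. The set names (`low`, `medium`, `high`) and the breakpoints (40, 60, 80) are illustrative assumptions, not values taken from the document.

```python
def falling(x, lo, hi):
    """Membership 1 at or below lo, falling linearly to 0 at hi."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def rising(x, lo, hi):
    """Membership 0 at or below lo, rising linearly to 1 at hi."""
    return 1.0 - falling(x, lo, hi)


def triangle(x, lo, peak, hi):
    """Triangular membership: 0 at lo and hi, 1 at peak."""
    return min(rising(x, lo, peak), falling(x, peak, hi))


def fuzzify_score(score):
    """Map a 0-100 test score to degrees of membership in three fuzzy sets.

    The breakpoints 40/60/80 are hypothetical and chosen only for illustration.
    """
    return {
        "low": falling(score, 40, 60),
        "medium": triangle(score, 40, 60, 80),
        "high": rising(score, 60, 80),
    }


if __name__ == "__main__":
    # A score of 55 belongs partly to "low" and mostly to "medium".
    print(fuzzify_score(55))
```

A crisp threshold would put a score of 55 entirely in one bin; here it belongs to two sets at once, which is the imprecision the fuzzy approach is meant to capture.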
Supervised WSD Using Master-Slave Voting Technique - iosrjce
IOSR Journal of Computer Engineering (IOSR-JCE) is a double-blind, peer-reviewed international journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes high-quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high-quality technical notes are invited for publication.
Preprocessing and Classification in WEKA Using Different Classifiers - IJERA Editor
Data mining is the process of extracting information from a dataset and transforming it into an understandable structure for further use; it also discovers patterns in large data sets [1]. Data mining includes a number of important techniques, such as preprocessing and classification. Classification is based on supervised learning and is used to predict group membership for data instances. In this paper we apply preprocessing and classification to a diabetes database, run several classifiers on it, and compare the results on certain parameters using WEKA. In India, 77.2 million people suffer from prediabetes, and ICMR estimates that around 65.1 million are diabetes patients. Globally, in 2010, 227 to 285 million people had diabetes, with about 90% of cases being type 2; this equals 3.3% of the population, with equal rates in women and men. In 2011 it resulted in 1.4 million deaths worldwide, making it a leading cause of death.
Hybrid Method HVS-MRMR for Variable Selection in Multilayer Artificial Neural... - IJECEIAES
Variable selection is an important technique for reducing the dimensionality of data and is frequently used in data preprocessing for data mining. This paper presents a new variable selection algorithm that uses heuristic variable selection (HVS) and Minimum Redundancy Maximum Relevance (MRMR). We enhance the HVS method for variable selection by incorporating the MRMR filter. Our algorithm is based on a wrapper approach using a multi-layer perceptron. We call this algorithm the HVS-MRMR Wrapper for variable selection. The relevance of a set of variables is measured by a convex combination of the relevance given by the HVS criterion and the MRMR criterion. This approach selects new relevant variables; we evaluate the performance of HVS-MRMR on eight benchmark classification problems. The experimental results show that, on most datasets, HVS-MRMR selected fewer variables with high classification accuracy compared to MRMR alone, HVS alone, and no variable selection. HVS-MRMR can be applied to various classification problems that require high classification accuracy.
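The convex combination mentioned in the abstract can be written out as follows. This is only a sketch of the general form: the symbols $R_{\mathrm{HVS}}$, $R_{\mathrm{MRMR}}$ and the mixing weight $\lambda$ are notation assumed here, and the abstract does not state how the weight is chosen.

```latex
% Assumed notation: S is a candidate set of variables,
% R_HVS(S) and R_MRMR(S) are the relevance scores from the two criteria,
% and lambda is a mixing weight between 0 and 1.
\[
  R(S) \;=\; \lambda \, R_{\mathrm{HVS}}(S) \;+\; (1 - \lambda)\, R_{\mathrm{MRMR}}(S),
  \qquad 0 \le \lambda \le 1 .
\]
```

With $\lambda = 1$ the score reduces to the HVS criterion alone, and with $\lambda = 0$ to the MRMR criterion alone.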
A Survey on Decision Tree Learning Algorithms for Knowledge Discovery - IJERA Editor
Immense volumes of data are populated into repositories from various applications. Data mining techniques are very helpful for finding desired information and knowledge in large datasets. Classification is one of the knowledge discovery techniques. In classification, decision trees are very popular in the research community due to their simplicity and easy comprehensibility. This paper presents an updated review of recent developments in the field of decision trees.
Hypothesis on Different Data Mining Algorithms - IJERA Editor
In this paper, different classification algorithms for data mining are discussed. Data mining is about explaining the past and predicting the future by means of data analysis. Classification is a data mining task that categorizes data based on numerical or categorical variables. Many algorithms have been proposed to classify data; five of them are studied comparatively here. There are four different classification approaches, namely Frequency Table, Covariance Matrix, Similarity Functions and Others. As part of this research on classification methods, the Naive Bayesian, K Nearest Neighbors, Decision Tree, Artificial Neural Network and Support Vector Machine algorithms are studied and examined using benchmark datasets such as Iris and Lung Cancer.
Tuition discounts for students are usually granted at the same nominal amount. In this paper, the discount for less well-off students instead varies with the parents' income and the number of children they support. Parents' income is considered for a tuition discount at IDR 1,500,000, and the number of children in the family also determines the size of the discount. The Tsukamoto fuzzy system is the model used in this paper. Each input variable is divided into three membership functions, and nine Tsukamoto fuzzy rules are applied. The system also allows the consequent parameters to be changed if the current parameter values need to change. The smaller the parents' income, the greater the discount obtained. The more children supported, the greater the discount obtained.
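The structure described above (two inputs, three membership functions each, nine rules) can be sketched as a small Tsukamoto inference routine. Everything specific here is an assumption made for illustration: the membership breakpoints, the units (income in millions of IDR), the rule table, and the output expressed as a discount fraction between 0 and 1. Only the Tsukamoto mechanism itself is standard: each rule's firing strength is inverted through a monotonic consequent, and the crisp output is the firing-strength-weighted average.

```python
def falling(x, lo, hi):
    """Membership 1 at or below lo, falling linearly to 0 at hi."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def rising(x, lo, hi):
    """Membership 0 at or below lo, rising linearly to 1 at hi."""
    return 1.0 - falling(x, lo, hi)


def triangle(x, lo, peak, hi):
    """Triangular membership: 0 at lo and hi, 1 at peak."""
    return min(rising(x, lo, peak), falling(x, peak, hi))


# Hypothetical rule table: (income term, children term) -> discount term.
RULES = {
    ("low", "few"): "large",     ("low", "some"): "large",     ("low", "many"): "large",
    ("medium", "few"): "small",  ("medium", "some"): "large",  ("medium", "many"): "large",
    ("high", "few"): "small",    ("high", "some"): "small",    ("high", "many"): "small",
}


def tsukamoto_discount(income_millions, children):
    """Return a discount fraction in [0, 1] using Tsukamoto inference."""
    income = {
        "low": falling(income_millions, 1.5, 3.0),
        "medium": triangle(income_millions, 1.5, 3.0, 4.5),
        "high": rising(income_millions, 3.0, 4.5),
    }
    kids = {
        "few": falling(children, 1, 3),
        "some": triangle(children, 1, 3, 5),
        "many": rising(children, 3, 5),
    }
    weighted_sum = 0.0
    total_strength = 0.0
    for (income_term, kids_term), out in RULES.items():
        alpha = min(income[income_term], kids[kids_term])  # firing strength (AND = min)
        # Invert the monotonic consequent: "large" has mu(z) = z, "small" has mu(z) = 1 - z.
        z = alpha if out == "large" else 1.0 - alpha
        weighted_sum += alpha * z
        total_strength += alpha
    return weighted_sum / total_strength if total_strength > 0 else 0.0


if __name__ == "__main__":
    print(tsukamoto_discount(1.0, 5))   # low income, many children
    print(tsukamoto_discount(5.0, 1))   # high income, few children
```

The rule table encodes the two trends stated in the abstract: lower income and more children both push the result toward a larger discount.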
Extensive Analysis on Generation and Consensus Mechanisms of Clustering Ensem... - IJECEIAES
Data analysis plays a prominent role in interpreting various phenomena. Data mining is the process of hypothesizing useful knowledge from extensive data. Building on classical statistical models, the data can be exploited beyond mere storage and management. Cluster analysis, a primary investigation carried out with little or no prior knowledge, spans research and development across a wide variety of communities. Cluster ensembles are a mélange of individual solutions obtained from different clusterings, combined to produce a final quality clustering as required in wider applications. The method arises from the goal of increasing robustness, scalability and accuracy. This paper gives a brief overview of the generation methods and consensus functions used in cluster ensembles. The survey analyzes the various techniques and cluster ensemble methods.
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM... - ijistjournal
Classification is a widely used technique in the data mining domain, where scalability and efficiency are the immediate problems for classification algorithms on large databases. We suggest improvements to the existing C4.5 decision tree algorithm. In this paper, attribute-oriented induction (AOI) and relevance analysis are combined with concept hierarchy knowledge and the HeightBalancePriority algorithm to construct a decision tree, along with multi-level mining. Priorities are assigned to attributes by evaluating information entropy at different levels of abstraction when building the decision tree with the HeightBalancePriority algorithm. Modified DMQL queries are used to understand and explore the shortcomings of the decision trees generated by the C4.5 classifier for an education dataset, and the results are compared with the proposed approach.
Efficient classification of big data using vfdt (very fast decision tree) - eSAT Journals
Abstract
Decision tree learning algorithms have been able to capture knowledge successfully. Decision trees are best suited to cases where instances are described by attribute-value pairs and the target function has a discrete value. The main task of these decision trees is to apply inductive methods to the given attribute values of an unknown object and determine an appropriate classification by applying decision tree rules. Decision trees are very effective for representing algorithms and evaluating their performance because of their robustness, simplicity, capability of handling numerical and categorical data, ability to work with large datasets, and comprehensibility, to name a few. There are various decision tree algorithms available, such as ID3, CART, C4.5, VFDT, QUEST, CTREE, GUIDE, CHAID, CRUISE, etc. In this paper a comparative study has been made of three of these popular decision tree algorithms: ID3 (Iterative Dichotomizer 3), C4.5, which is an evolution of ID3, and VFDT (Very Fast Decision Tree). An empirical study has been conducted to compare C4.5 and VFDT in terms of accuracy and execution time, and various conclusions have been drawn.
Key Words: Decision tree, ID3, C4.5, VFDT, Information Gain, Gain Ratio, Gini Index, Over-fitting.
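The split measures named in the key words can be sketched in a few lines. ID3 selects attributes by information gain, C4.5 by gain ratio, and CART by the Gini index; the helper names and the toy data below are assumptions made for illustration and do not come from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def partition(values, labels):
    """Group class labels by the attribute value of each instance."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    return list(groups.values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting on a categorical attribute."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in partition(values, labels))
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain normalised by the split information of the attribute."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0


if __name__ == "__main__":
    # Toy data: the attribute separates the two classes perfectly.
    attribute = ["a", "a", "b", "b"]
    classes = ["yes", "yes", "no", "no"]
    print(information_gain(attribute, classes), gain_ratio(attribute, classes), gini(classes))
```

Gain ratio divides by the entropy of the attribute's own value distribution, which penalises attributes with many distinct values; this is the main change C4.5 makes to ID3's selection criterion.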
Building a Classifier Employing Prism Algorithm with Fuzzy Logic - IJDKP
Classification in data mining has received immense interest in recent times. Because knowledge is based on historical data, classifying that data is essential for discovering the knowledge. To decrease classification complexity, the quantitative attributes of the data need to be split, but splitting with classical logic is less accurate. This can be overcome by using fuzzy logic. This paper illustrates how to build classification rules using fuzzy logic. The fuzzy classifier is built using the Prism decision tree algorithm. This classifier produces more realistic results than the classical one. The effectiveness of the method is demonstrated on a sample dataset.
A survey of modified support vector machine using particle of swarm optimizat... - Editor Jacotech
The main objective of this survey paper is to provide a detailed description of Wireless Sensor Networks with a Medium Access Control layer and a routing layer. In the Medium Access Control layer, the Event Driven Time Division Multiple Access protocol is studied, and in the network layer, two routing protocols, Bellman-Ford and Dynamic Source Routing, are studied.
Among the many data clustering approaches available today, mixed data sets of numeric and categorical data pose a significant challenge because it is difficult to choose and apply appropriate distance/similarity functions for clustering and for verifying the result. Unsupervised learning models for artificial neural networks offer an alternative means of data clustering and analysis. The objective of this study is to highlight an approach, and its associated considerations, for clustering mixed data sets with the Adaptive Resonance Theory 2 (ART-2) artificial neural network model and subsequently validating the clusters with dimensionality reduction using an Autoencoder neural network model.
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET - Editor IJMTER
The data mining environment produces a large amount of data that needs to be analyzed. With traditional databases and architectures, it has become difficult to process, manage and analyze patterns. To gain knowledge from Big Data, a proper architecture should be understood. Classification is an important data mining technique with broad applications, used to classify the various kinds of data found in nearly every field of life. Classification assigns an item to one of a predefined set of classes according to the item's features. This paper sheds light on various classification algorithms, including J48, C4.5 and Naive Bayes, using a large dataset.
Perfomance Comparison of Decsion Tree Algorithms to Findout the Reason for St... - ijcnes
Educational data mining is used to study the data available in the educational field and bring out the hidden knowledge in it. Classification methods such as decision trees and rule mining can be applied to educational data to predict student behavior. This paper focuses on finding the most suitable algorithm for identifying the reasons behind student absenteeism in an academic year. The first step in this process is to gather student data using a questionnaire. The data is collected from 123 undergraduate students at a private college situated in a semi-rural area. The second step is to clean the data so that it is appropriate for mining and to choose the relevant attributes. In the final step, three different decision tree induction algorithms, namely ID3 (Iterative Dichotomiser), C4.5 and CART (Classification and Regression Tree), were applied to the same questionnaire data sample. The results were compared to find the algorithm that yields the best result in predicting the reason for students' absenteeism.
AN EFFICIENT FEATURE SELECTION IN CLASSIFICATION OF AUDIO FILES - cscpconf
In this paper we focus on an efficient feature selection method for the classification of audio files. The main objective is feature selection and extraction. We have selected a set of features for further analysis, which represent the elements of the feature vector. Using the extraction method, we can compute a numerical representation that characterizes the audio using the existing toolbox. In this study Gain Ratio (GR) is used as the feature selection measure. GR is used to select the splitting attribute that will separate the tuples into different classes. Pulse clarity is considered a subjective measure and is used to calculate the gain of the features of audio files. The splitting criterion is employed in the application to identify the class, or music genre, of a specific audio file from the testing database. Experimental results indicate that by using GR the application can produce a satisfactory result for music genre classification. After dimensionality reduction, the best three features are selected from the various features of the audio file, and with this technique we obtain a classification success rate of more than 90%.
Fuzzy clustering has been widely studied and applied in a variety of key areas of science and engineering. In this paper the Improved Teaching Learning Based Optimization (ITLBO) algorithm is used for data clustering, in which the objects in the same cluster are similar. The algorithm has been tested on several datasets and compared with other popular clustering algorithms. The results show that the proposed method improves the clustering output and can be used efficiently for fuzzy clustering.
A Combined Approach for Feature Subset Selection and Size Reduction for High ... - IJERA Editor
Selecting relevant features from a given feature set is one of the important issues in data mining and classification. In general a dataset may contain many features, but not all of them are necessarily important for a particular analysis or decision-making task, because features may share common information or be completely irrelevant to the processing at hand. This generally happens because of improper feature selection during dataset formation or because of insufficient information about the observed system. In both cases the data will contain features that only increase the processing burden, which may ultimately lead to improper outcomes when used for analysis. For these reasons, methods are required to detect and remove such features. In this paper we present an efficient approach that not only removes unimportant features but also reduces the size of the complete dataset. The proposed algorithm uses information theory to measure the information gain of each feature and a minimum spanning tree to group similar features; fuzzy c-means clustering is then used to remove similar entries from the dataset. Finally, the algorithm is tested with an SVM classifier on 35 publicly available real-world high-dimensional datasets, and the results show that it not only reduces the feature set and data length but also improves the performance of the classifier.
Data mining, or knowledge discovery (KDD), is the extraction of unknown (hidden) and useful knowledge from data. Data mining is widely used in many areas such as retail, sales, e-commerce, remote sensing and bioinformatics. With the tremendous growth of recent years, student performance has become one of the most complex puzzles for universities and colleges. In this paper, the authors applied data mining techniques such as classification, association rules and chi-square for knowledge discovery. For this study, the authors used a data set containing the results of approximately 180 MCA (postgraduate) students from 3 colleges. The study found that data mining functionalities such as chi-square, association rules and lift can be applied in education to discover areas of improvement.
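As a rough illustration of the association-rule measures mentioned above, the sketch below computes support, confidence and lift for one rule. The four student records and the attribute names are made up for this example and are not taken from the study's data set.

```python
def support(records, items):
    """Fraction of records that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(record) for record in records) / len(records)


def confidence(records, antecedent, consequent):
    """Estimated P(consequent | antecedent); antecedent must occur at least once."""
    both = set(antecedent) | set(consequent)
    return support(records, both) / support(records, antecedent)


def lift(records, antecedent, consequent):
    """Confidence divided by the baseline support of the consequent.

    A lift above 1 suggests the antecedent makes the consequent more likely.
    """
    return confidence(records, antecedent, consequent) / support(records, consequent)


# Hypothetical toy records, one set of attribute=value items per student.
RECORDS = [
    {"attendance=high", "result=pass"},
    {"attendance=high", "result=pass"},
    {"attendance=low", "result=fail"},
    {"attendance=high", "result=fail"},
]

if __name__ == "__main__":
    rule = ({"attendance=high"}, {"result=pass"})
    print(support(RECORDS, rule[0] | rule[1]))
    print(confidence(RECORDS, *rule))
    print(lift(RECORDS, *rule))
```

On this toy data the rule "attendance=high implies result=pass" has a lift above 1, meaning passing is more common among high-attendance students than in the group overall.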
ADABOOST ENSEMBLE WITH SIMPLE GENETIC ALGORITHM FOR STUDENT PREDICTION MODEL - ijcsit
Predicting student performance is of great concern to higher education management. This prediction helps to identify and improve students' performance. Several factors may improve this performance. In the present study, we employ data mining processes, particularly classification, to enhance the quality of the higher educational system. Recently, a new direction for improving classification accuracy has been to combine classifiers. In this paper, we design and evaluate a fast learning algorithm using an AdaBoost ensemble with a simple genetic algorithm, called "Ada-GA", where the genetic algorithm is demonstrated to successfully improve the accuracy of the combined classifier. The Ada-GA algorithm proved to be of considerable usefulness in identifying students at risk early, especially in very large classes. This early prediction allows the instructor to provide appropriate advising to those students. The Ada-GA algorithm is implemented and tested on the ASSISTments dataset; the results showed that this algorithm successfully improved the detection accuracy and also reduced the computational complexity.
The objective of this project was to classify the given set of events as either tau-tau decay of Higgs Boson or as a background noise. This project was completed as a part of the Machine Learning module. We have come up with an ensemble model with XGBoosting and Random Forest classifiers to solve this problem.
ASSOCIATION RULE DISCOVERY FOR STUDENT PERFORMANCE PREDICTION USING METAHEURI...cscpconf
According to the increase of using data mining techniques in improving educational systems
operations, Educational Data Mining has been introduced as a new and fast growing research
area. Educational Data Mining aims to analyze data in educational environments in order to
solve educational research problems. In this paper a new associative classification technique
has been proposed to predict students final performance. Despite of several machine learning
approaches such as ANNs, SVMs, etc. associative classifiers maintain interpretability along
with high accuracy. In this research work, we have employed Honeybee Colony Optimization
and Particle Swarm Optimization to extract association rule for student performance prediction
as a multi-objective classification problem. Results indicate that the proposed swarm based
algorithm outperforms well-known classification techniques on student performance prediction
classification problem.
Fuzzy Association Rule Mining based Model to Predict Students’ Performance IJECEIAES
The major intention of higher education institutions is to supply quality education to its students. One approach to get maximum level of quality in higher education system is by discovering knowledge for prediction regarding the internal assessment and end semester examination. The projected work intends to approach this objective by taking the advantage of fuzzy inference technique to classify student scores data according to the level of their performance. In this paper, student’s performance is evaluated using fuzzy association rule mining that describes Prediction of performance of the students at the end of the semester, on the basis of previous database like Attendance, Midsem Marks, Previous semester marks and Previous Academic Records were collected from the student’s previous database, to identify those students which needed individual attention to decrease fail ration and taking suitable action for the next semester examination.
Scalable decision tree based on fuzzy partitioning and an incremental approachIJECEIAES
Classification as a data mining materiel is the process of assigning entities to an already defined class by examining the features. The most significant feature of a decision tree as a classification method is its ability to data recursive partitioning. To choose the best attributes for partition, the value range of each continuous attribute should be divided into two or more intervals. Fuzzy partitioning can be used to reduce noise sensitivity and increase the stability of trees. Also, decision trees constructed with existing approaches, tend to be complex, and consequently are difficult to use in practical applications. In this article, a fuzzy decision tree has been introduced that tackles the problem of tree complexity and memory limitation by incrementally inserting data sets into the tree. Membership functions are generated automatically. Then fuzzy information gain is used as a fast-splitting attribute selection criterion and the expansion of a leaf is done attending only with the instances stored in it. The efficiency of this algorithm is examined in terms of accuracy and tree complexity. The results show that the proposed algorithm by reducing the complexity of the tree can overcome the memory limitation and make a balance between accuracy and complexity.
Using ID3 Decision Tree Algorithm to the Student Grade Analysis and Predictionijtsrd
Data mining techniques play an important role in data analysis. For the construction of a classification model which could predict performance of students, particularly for engineering branches, a decision tree algorithm associated with the data mining techniques has been used in the research. A number of factors may affect the performance of students. Data mining technology can be related to student grades, and we also used classification algorithms for prediction. In this paper, we used educational data mining to predict students' final grade based on their performance. We proposed student data classification using the ID3 (Iterative Dichotomiser 3) decision tree algorithm. Khin Khin Lay | San San Nwe, "Using ID3 Decision Tree Algorithm to the Student Grade Analysis and Prediction", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5, August 2019. URL: https://www.ijtsrd.com/papers/ijtsrd26545.pdf Paper URL: https://www.ijtsrd.com/computer-science/data-miining/26545/using-id3-decision-tree-algorithm-to-the-student-grade-analysis-and-prediction/khin-khin-lay
IOSR Journal of Computer Engineering (IOSR-JCE)
e-ISSN: 2278-0661, p-ISSN: 2278-8727, Volume 18, Issue 2, Ver. II (Mar-Apr. 2016), PP 37-41
www.iosrjournals.org
DOI: 10.9790/0661-1802023741
Role of Fuzzy Set in Students’ Performance Prediction
Mrs. Jyoti Upadhyay (1), Dr. Pratima Gautam (2)
(1) Ph.D. (Research Scholar), AISECT University, Bhopal, M.P.
(2) Dean (IT Dept.), AISECT University, Bhopal, M.P.
Abstract: Educational data mining can be used to predict students’ performance on the basis of different
attributes. In this paper, a classification task is used to predict students’ results. Decision tree (DT)
learning is one of the most widely used practical methods for analytical study, and we use it as the prediction
method. We propose a model that uses fuzzy sets to predict results more accurately. Fuzzy logic improves
the analysis because of the elasticity of the fuzzy set formalism. We therefore propose a
decision tree on fuzzy set data, which combines ID3 with fuzzy theory. The results are compared with those of
some other popular classification algorithms.
Key words: Decision tree, Fuzzy set, Data mining
I. Introduction
1.1 Decision tree: In data mining, decision trees (DT) are among the most popular prediction techniques.
Although DTs are better known in their role as classifiers, they also have prominent applications in regression,
clustering and feature selection. A decision tree presents the possible outcomes of a decision as a graph. We use
specialized software, RapidMiner, to draw the graph. A DT is useful for focusing discussion when a group
must make a decision.
Ross Quinlan developed ID3 (Iterative Dichotomiser 3) in 1986. The ID3 algorithm is based on a top-down
search of the given data set. To select the attribute that is most useful for classifying a given set, it uses
information gain. To find an optimal way to classify a learning set we need some function that provides the
most balanced splitting [1]. A data set contains several attributes, and we have to select the most suitable
attribute for the root node of the DT. We use entropy and information gain for this. Entropy is the index used
to measure the degree of impurity [2].
The entropy is calculated as follows:
$$\mathrm{Entropy}(S) = -\sum_{j} p_j \log_2 p_j$$
where $p_j$ is the proportion of examples in S that belong to class j.
Information gain is the criterion used for splitting the nodes of the tree: it determines the best attribute for a
particular node. The information gain, Gain(S, A), of an attribute A relative to a collection of examples S is
defined as:
$$\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)$$
where $S_v$ is the subset of S for which attribute A has value v.
To find the best split we calculate the gain for each attribute and repeat the process until every branch ends
in a value of the label attribute.
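The two formulas above can be illustrated with a minimal Python sketch. The function names and the four toy records are hypothetical and are not taken from the paper's data set; the sketch only shows how entropy and gain would be computed for categorical attributes.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, label):
    """Gain(S, A): entropy of S minus the size-weighted entropy of each subset S_v."""
    base = entropy([r[label] for r in rows])
    subsets = {}
    for r in rows:
        # Group the class labels by the value v that attribute A takes.
        subsets.setdefault(r[attribute], []).append(r[label])
    remainder = sum(len(s) / len(rows) * entropy(s) for s in subsets.values())
    return base - remainder


# Hypothetical toy records, not the paper's data.
rows = [
    {"attn": "good", "location": "Urban", "result": "pass"},
    {"attn": "good", "location": "Rural", "result": "pass"},
    {"attn": "poor", "location": "Urban", "result": "fail"},
    {"attn": "poor", "location": "Rural", "result": "fail"},
]
gain_attn = information_gain(rows, "attn", "result")
gain_location = information_gain(rows, "location", "result")
```

In this toy case attendance separates the classes perfectly while location does not separate them at all, so ID3 would place attendance at the root.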
1.2 Fuzzy set: Selecting the best classification algorithm for a given dataset is a very widespread
problem, and it requires several methodological choices. Among the classification methods, this research
focuses on decision tree algorithms, which are used to assess classification
performance and to find the best algorithm for qualitative student data. Fuzzy set theory, which is closely
related to possibility theory, was proposed by Lotfi Zadeh in 1965. Researchers see it as an alternative to
traditional set theory. Most importantly, fuzzy set theory allows us to deal with inexact facts. For example, if
we say that a student who has 80% is eligible for a course but anything less is not, then what happens to a
student with 79.8%? A fuzzy set can be thought of as an extension of a traditional crisp set, in which each
element must either be in or not in the set. Fuzzy sets are defined on a non-fuzzy universe of discourse, which
is an ordinary set. Fuzzy logic can model ordinary attributes as linguistic variables (such as good, poor, avg,
tall, high) in the inductive generation of the tree structure. Fuzzy decision trees (FDTs) are better suited to
dealing with uncertainty and fuzziness [3]. A fuzzy set F of a universe of discourse U is characterized by a
membership function $\mu_F(x)$, which assigns to every element $x \in U$ a membership degree
$\mu_F(x) \in [0,1]$. An element $x \in U$ is said to be in the fuzzy set F if and only if $\mu_F(x) > 0$, and to
be a full member if and only if $\mu_F(x) = 1$ [4]. Membership functions can be chosen by the user either
arbitrarily or on the basis of experience. Typically, a fuzzy subset A can be represented as
$$A = \{\mu_A(x_1)/x_1\}, \{\mu_A(x_2)/x_2\}, \ldots, \{\mu_A(x_n)/x_n\}$$
where the separating symbol / associates the membership value with its coordinate on the horizontal
axis. Fuzzy sets and fuzzy logic provide a symbolic framework for knowledge clarity. Fuzzy decision
trees differ from traditional crisp decision trees in that they use splitting criteria based on fuzzy
restrictions and the fuzzy sets representing the defined data.
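As an illustration of how a numeric attribute could be mapped to linguistic values, the following Python sketch defines three membership functions for attendance. The breakpoints (50, 60, 70, 80) and the function names are assumptions made only for this example; the paper does not state the membership functions it used.

```python
def falling(x, lo, hi):
    """Membership that is 1 below lo, 0 above hi, and linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def rising(x, lo, hi):
    """Membership that is 0 below lo, 1 above hi, and linear in between."""
    return 1.0 - falling(x, lo, hi)


def attendance_memberships(pct):
    """Membership degrees of an attendance percentage in three fuzzy sets.

    The breakpoints are assumed for illustration only.
    """
    return {
        "poor": falling(pct, 50, 60),
        "AVG": min(rising(pct, 50, 60), falling(pct, 70, 80)),
        "good": rising(pct, 70, 80),
    }


def fuzzify(pct):
    """Return the linguistic label with the highest membership degree."""
    degrees = attendance_memberships(pct)
    return max(degrees, key=degrees.get)


# The 79.8% student from the text is almost a full member of "good"
# instead of being excluded by a crisp 80% threshold.
degrees_798 = attendance_memberships(79.8)
label_798 = fuzzify(79.8)
```

Under these assumed breakpoints, a student with 79.8% keeps a membership of about 0.98 in "good" and about 0.02 in "AVG", which is the gradual transition that a crisp cut-off cannot express.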
Building a fuzzy decision tree requires the following: an attribute value partitioning method, a
branching attribute selection method, a branching test method to decide to what degree data follows down the
branches of a node, and a leaf node labeling method to determine the classes for which leaf nodes stand [5]. A
data set with some condition attributes and one decision attribute can be presented in the form of a knowledge
representation system $J = (U, C \cup D)$, where $U = \{u_1, u_2, \ldots, u_s\}$ is the set of data samples,
$C = \{c_1, c_2, \ldots, c_n\}$ is the set of condition attributes, and $D = \{d\}$ is the one-element set
containing the decision attribute, or class label attribute. Suppose this class label attribute has m distinct
values, defining m distinct classes $d_i$ (for i = 1, ..., m).
For a given subset $S_j$, the expected information is expressed as
$$I(s_{1j}, \ldots, s_{mj}) = -\sum_{i=1}^{m} p_{ij} \log_2 p_{ij}$$
where $p_{ij}$ is the proportion of samples in $S_j$ that belong to class $d_i$.
So the information gain of attribute $c_i$ is given by
$$\mathrm{Gain}(c_i) = I(s_{1j}, \ldots, s_{mj}) - E(c_i)$$
The attribute with the highest information gain is the most informative attribute of the given data set.
II. Methodology
For our work we use the RapidMiner tool, which provides a powerful, easy-to-use graphical user
interface for the design of analytic processes [6]. In this paper, we propose a computational model based on
fuzzy-rough decision trees to learn the most significant factors for student performance. Our methodology uses
fuzzy rough sets to discard irrelevant features on the basis of their dependency. The work methodology is shown
in Fig. 1. We collected data from government schools of Chhattisgarh for the 2014-15 session. Using RapidMiner
we obtained different confidence and support values. Model evaluation shows the accuracy of the model.
Figure 1: Work methodology
III. Proposed work
To implement the above methodology we follow these steps:
3.1 Data Set: For hypothesis formation we select the educational environment. In an educational environment it
is important to know which factors can affect students' results, so that those factors can be emphasized
to improve results and reduce the dropout ratio. The relevant data selected for mining is shown in Table 1.
Attribute                   | Values
Prev_year %                 | Result of 12th class
Cast                        | SC, ST, OBC, GEN
Attendance %                | Attendance % in present class
Location                    | RURAL, URBAN, SEMI-URBAN
Recent Result (label class) | Pass, Fail
We convert the data set into a fuzzy set.
Figure 2: Data Set
Fuzzy Set:
A fuzzy set is defined by a class of objects that has no sharp boundaries [7]. A fuzzy set is formed by a
combination of linguistic variables.
Figure 3: Fuzzy Data Set
3.2 Data Preparation: The success of data mining techniques depends highly on appropriate
pre-processing of the data. Pre-processing includes data selection, data normalization and transformation.
Selection: Data selection is critical for the result of a data mining process. Even when a relation between a
certain attribute and the desired result is not obvious, the attribute has to be considered, because
some information may be hidden in the data.
3.3 Unsuitable Size: Remove inconsistent or noisy data and treat incomplete and
erroneous data. Then apply data transformation to bring the modified data into a new
format. We apply the filter operator of RapidMiner, which takes an Example Set as input and returns a new
Example Set including only those examples that satisfy the specified condition. Several predefined
conditions are provided, and users can select any of them. Users can also define their own conditions to filter
examples. This operator may reduce the number of examples in an Example Set, but it has no effect on the
number of attributes. The Select Attributes operator is used to select the required attributes.
Figure 4: Graphical representation of attributes
3.4 Experiment: We collected qualitative data for the experiment and applied 10-fold cross-validation. After
applying the model we evaluate it and generate the confusion matrix.
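The 10-fold cross-validation step can be sketched in Python as follows. This is a plain shuffled split written only for illustration; the function name and the fixed seed are assumptions, and RapidMiner's own sampling (which may be stratified) is not reproduced here. The figure of 56 examples is the data set size reported in the results section.

```python
import random


def k_fold_indices(n_examples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 56 examples, the data set size reported in the paper.
splits = list(k_fold_indices(56, k=10))
```

Each example appears in exactly one test fold, so every student is used once for testing and nine times for training.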
IV. Result and Discussion:
The data set consists of 56 students. According to the decision tree algorithm, fuzzy_attn has the highest
information gain, and the generated decision tree is shown in Figure 5. We extract classification rules from
the decision tree, which show that location and cast also play an important role in students’ performance.
Figure 5: Decision tree
We use the condition attributes Fuzzy_attn, Fuzzy_prev_result, cast and location, with Result as the class label.
Fuzzy attn = AVG: fail {fail=17, pass=0, Pass=0}
Fuzzy attn = good
|   fuzzy prev result = first: pass {fail=0, pass=7, Pass=1}
|   fuzzy prev result = second
|   |   Cast = Gen.: pass {fail=0, pass=7, Pass=0}
|   |   Cast = OBC: pass {fail=0, pass=1, Pass=1}
|   |   Cast = SC: pass {fail=1, pass=2, Pass=0}
|   |   Cast = ST: fail {fail=1, pass=1, Pass=0}
|   fuzzy prev result = thierd
|   |   Location = Rural: pass {fail=1, pass=2, Pass=0}
|   |   Location = Semi-urban: fail {fail=2, pass=0, Pass=0}
|   |   Location = Urban: pass {fail=1, pass=2, Pass=0}
Fuzzy attn = poor: fail {fail=9, pass=0, Pass=0}
Figure 6: Predicted Result
From the rules we can interpret that students’ performance will be poor if
their attendance is poor. Their location also has an impact on their result.
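The extracted rules can be written as a small Python function. This is a sketch that assumes the nesting implied by the order of the rules in Figure 6 (cast is tested under the second prev-result branch and location under the third); the dictionary keys are hypothetical names chosen for the example, and unknown attribute values are not handled.

```python
def predict(student):
    """Apply the classification rules of Figure 6 to one student record."""
    if student["fuzzy_attn"] in ("AVG", "poor"):
        # Average or poor attendance leads directly to "fail".
        return "fail"
    # From here on, fuzzy_attn is "good".
    prev = student["fuzzy_prev_result"]
    if prev == "first":
        return "pass"
    if prev == "second":
        # Only the ST leaf is labeled "fail" in this branch.
        return "fail" if student["cast"] == "ST" else "pass"
    # Remaining branch: third division in the previous result.
    return "fail" if student["location"] == "Semi-urban" else "pass"


example = {
    "fuzzy_attn": "good",
    "fuzzy_prev_result": "second",
    "cast": "OBC",
    "location": "Rural",
}
prediction = predict(example)
```

Writing the tree this way makes the interpretation explicit: attendance decides the outcome alone unless it is good, and only then do prior result, cast and location matter.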
V. Model Evaluation
Rapid Miner provides a powerful, easy-to-use and intuitive graphical user interface for the design of
analytic processes [8]. Classification performance refers to how successful the
models built with the DT and fuzzy rough set decision tree learning algorithms are in accurately classifying
data points from the testing dataset and/or the independent one-class dataset. The confusion matrix shows the
accuracy of the DT for the given data sets. The proposed model classified 81.5% of the input instances
correctly. The results show that the proposed method performs well compared with the decision tree on
non-fuzzy data, which reached 76.0%.
PerformanceVector:
accuracy: 81.50% +/- 10.97% (mikro: 81.63%)
ConfusionMatrix:
True: fail pass Pass
fail: 24 2 1
pass: 5 16 1
kappa: 0.636 +/- 0.207 (mikro: 0.639)
ConfusionMatrix:
True: fail pass Pass
fail: 24 2 1
pass: 5 16 1
Performance Vector for non Fuzzy Set
accuracy: 76.00% +/- 14.97% (mikro: 75.51%)
ConfusionMatrix:
True: fail pass Pass
fail: 25 6 0
pass: 4 12 2
kappa: 0.502 +/- 0.327 (mikro: 0.501)
ConfusionMatrix:
True: fail pass Pass
fail: 25 6 0
pass: 4 12 2
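The micro-averaged accuracy and kappa values in the two performance vectors can be recomputed from the confusion matrices with a short Python sketch. It assumes that rows are predicted classes and columns are true classes, and that the prediction row for the class "Pass", which is not printed above, contains only zeros; the function name is hypothetical.

```python
def accuracy_and_kappa(matrix):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of examples predicted as class i
    whose true class is j.
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return observed, (observed - expected) / (1 - expected)


# Class order: fail, pass, Pass. The last row is assumed to be all zeros.
fuzzy_matrix = [[24, 2, 1], [5, 16, 1], [0, 0, 0]]
crisp_matrix = [[25, 6, 0], [4, 12, 2], [0, 0, 0]]

fuzzy_acc, fuzzy_kappa = accuracy_and_kappa(fuzzy_matrix)
crisp_acc, crisp_kappa = accuracy_and_kappa(crisp_matrix)
```

Under these assumptions the sketch gives about 0.8163 and 0.639 for the fuzzy data and about 0.7551 and 0.501 for the non-fuzzy data, which agrees with the "mikro" values in the performance vectors.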
VI. Conclusion
Our work is concerned with fuzzy sets and decision trees. We propose that numeric data can be
represented by fuzzy values. When we apply the decision tree model to the fuzzy set, it produces more accurate
results, as the confusion matrices show. Our analysis shows that students’ performance can be
affected by many factors. In our next paper we will present more attributes with weights.
References
[1]. T. Miranda Lakshmi, A. Martin, and others, An Analysis on Performance of Decision Tree Algorithms using Student’s Qualitative
Data, I.J. Modern Education and Computer Science, 5, 18-27, 2013.
[2]. Abeer Badr El Din Ahmed, Ibrahim Sayed Elaraby, Data Mining: A Prediction for Student's Performance Using Classification
Method, World Journal of Computer Application and Technology, vol. 2, 43-47, 2014.
[3]. Y.-l. Chen, T. Wang, B.-s. Wang, and Z.-j. Li, “A survey of fuzzy decision tree classifier,” Fuzzy Information and Engineering, vol.
1, pp. 149–159, June 2009.
[4]. FuzzyLogical Toolbox: zmf. Matlab Help. The MathWorks, Inc. 1984-2004.
[5]. Tien-Chin Wang, Hsien-Da Lee, Constructing a Fuzzy Decision Tree by Integrating Fuzzy Sets and Entropy.
[6]. https://rapidminer.com/
[7]. Nasution, H., Design Methodology of Fuzzy Logic Control, UTM, 2002.
[8]. Han J, Rodriguez J, Beheshti M. Diabetes data analysis and prediction model discovery using RapidMiner. 2nd International
Conference on Future Generation Communication and Networking, IEEE; 2008.