Educational data mining (EDM) has had a considerable impact on the academic domain. Its methods play a key role in improving students' knowledge. EDM explores students' behavioral patterns and offers insights that help them choose a suitable career path. This survey discusses the techniques used in educational data mining to improve student knowledge, along with the different types of EDM tools. Among these tools and techniques, the most suitable categories are suggested for real-world use.
Technology Enabled Learning to Improve Student Performance: A Survey IIRindia
Recent technology has a growing impact on teaching and learning. Technologies such as smart classroom environments, the internet, mobile phones, television programs, and iPods play a very important role in improving students’ knowledge. Most educational institutions conduct classroom teaching with advanced technologies such as smart class environments and PowerPoint projection. This research work focuses on such technologies for improving student performance, using Data Mining (DM) techniques, particularly classification and clustering. Information repositories (educational databases, data warehouses) are where students collect study materials for learning, and they are the primary source for examination preparation. In particular, this work analyzes the use of clustering and classification algorithms to improve students’ performance and learning capabilities with these modern technologies. During the study period, a student’s family background and economic status also play a very important role in daily activities; these factors are not considered in this survey. A comparative study is carried out by comparing student performance based on the results of several classification and clustering algorithms. Finally, the work identifies the best of these algorithms for improving student performance.
Using data mining in e learning-a generic framework for military education Elena Susnea
Susnea E. (2013). Using Data Mining in eLearning: A Generic Framework for Military Education, in Proceedings of "The International Scientific Conference eLearning and Software for Education", Iss. 01 (pp. 411-415).
Multiple educational data mining approaches to discover patterns in universit... IJICTJOURNAL
This paper presented the utilization of pattern discovery techniques by using multiple relationships and clustering educational data mining approaches to establish a knowledge base that will aid in the prediction of ideal college program selection and enrollment forecasting for incoming freshmen. Results show a significant level of accuracy in predicting college programs for students by mining two years of student college admission and graduation final grade scholastic records. The results of educational predictive data mining methods can be applied in improving the services of the admission department of an educational institution, particularly in its course alignment, student mentoring, admission forecast, marketing, and enrollment preparedness.
Recommendation of Data Mining Technique in Higher Education Prof. Priya Thaka... ijceronline
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
Extending the Student’s Performance via K-Means and Blended Learning IJEACS
In this paper, we use clustering to monitor students’ scholastic performance. The paper focuses on improving the education system via K-means clustering. Clustering is the process of grouping similar objects. In academia, students’ performance is commonly grouped by Grade Point (GP). We adopted the K-means algorithm and applied it to students’ mark data. The system is a promising index for monitoring students’ development and categorizing them by academic performance. Based on these categories, students are trained according to their GP. The method was implemented in MATLAB and clustered the students accurately.
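The grouping step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract reports a MATLAB implementation that is not shown, so the Python code, the function name `kmeans_1d`, the sample grade points, and the starting centroids below are all hypothetical.

```python
from statistics import mean

def kmeans_1d(values, centroids, max_iter=100):
    """Cluster scalar grade points with plain K-means (Lloyd's algorithm)."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = updated
    return centroids, clusters

# Hypothetical grade points (GP) for nine students
grade_points = [3.9, 3.7, 3.8, 2.9, 3.0, 2.7, 1.8, 2.0, 1.5]
centroids, clusters = kmeans_1d(grade_points, centroids=[4.0, 3.0, 2.0])
```

With three starting centroids the students fall into high, middle, and low GP groups. In practice the number of clusters and the initial centroids are choices that affect the result.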
Educational Data Mining is used to predict students’ future learning behavior. It remains an active research topic for researchers seeking better prediction results. The results of these techniques help teachers, management, and administrators draft new rules and policies to improve educational standards and, in turn, overall results and student retention. With this in mind, work has been done to identify slow learners in a high school class and provide them with timely help to improve their overall results. Many data mining techniques are available, but we select only those most commonly used by other researchers for result prediction: J48, REPTree, Naive Bayes, SMO, and Multilayer Perceptron. On the collected dataset, the Multilayer Perceptron classifier gives 87.43% accuracy when the whole dataset is used as the training set, and SMO and J48 give 69.00% accuracy under 10-fold cross-validation.
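The two figures quoted above come from different evaluation protocols, and the gap between them is easier to see in a small sketch. The code below is not the authors' experiment: it swaps in a 1-nearest-neighbour classifier as a stand-in for J48, SMO, and the Multilayer Perceptron, and the marks, labels, and function names are hypothetical.

```python
def predict_1nn(train, x):
    """Return the label of the training point closest to x (1-nearest neighbour)."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

def accuracy(train, test):
    """Fraction of test records whose predicted label matches the true label."""
    hits = sum(predict_1nn(train, x) == y for x, y in test)
    return hits / len(test)

def k_fold_accuracy(data, k):
    """Average held-out accuracy over k folds (fold i holds every k-th record)."""
    scores = []
    for i in range(k):
        test = data[i::k]
        train = [row for j, row in enumerate(data) if j % k != i]
        scores.append(accuracy(train, test))
    return sum(scores) / k

# Hypothetical (average mark, label) records
data = [(35, "slow"), (42, "slow"), (48, "regular"), (50, "slow"), (55, "regular"),
        (58, "slow"), (61, "regular"), (67, "regular"), (72, "regular"), (80, "regular")]

resubstitution = accuracy(data, data)         # train and test on the same records
cross_validated = k_fold_accuracy(data, k=5)  # every record is scored while held out
```

Because each record is its own nearest neighbour, accuracy on the training set is perfect here by construction, while the cross-validated estimate is lower. Accuracy measured on training data is therefore usually treated as optimistic, and the cross-validated figure is the more meaningful one for comparing classifiers.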
STUDENTS’ PERFORMANCE PREDICTION SYSTEM USING MULTI AGENT DATA MINING TECHNIQUE IJDKP
High prediction accuracy for students’ performance helps identify low-performing students at the beginning of the learning process. Data mining is used to attain this objective. Data mining techniques discover models or patterns in data and are very helpful in decision-making. Boosting is one of the most popular techniques for constructing ensembles of classifiers to improve classification accuracy. Adaptive Boosting (AdaBoost) is a generation of boosting algorithm. It is used for binary classification and is not directly applicable to multiclass classification. The SAMME boosting technique extends AdaBoost to multiclass classification without reducing it to a set of binary sub-classifications. In this paper, a students’ performance prediction system using Multi Agent Data Mining is proposed to predict students’ performance from their data with high accuracy and to help low-performing students through optimization rules. The proposed system has been implemented and evaluated by investigating the prediction accuracy of the AdaBoost.M1 and LogitBoost ensemble classifiers and the C4.5 single classifier. The results show that the SAMME boosting technique improves prediction accuracy and outperforms the C4.5 single classifier and LogitBoost.
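How SAMME extends AdaBoost to several classes can be shown through its learner-weight formula. The sketch below follows the commonly cited SAMME formulation and is not taken from the paper's implementation. The function names are hypothetical, and only the weighting and reweighting steps are shown, not the training of the weak learners.

```python
import math

def samme_alpha(error, n_classes):
    """Weight of one weak learner in SAMME.

    With n_classes == 2 the extra log term is zero, which leaves the
    AdaBoost weight log((1 - error) / error).
    """
    if not 0.0 < error < 1.0 - 1.0 / n_classes:
        raise ValueError("weak learner must beat random guessing")
    return math.log((1.0 - error) / error) + math.log(n_classes - 1.0)

def reweight(weights, misclassified, alpha):
    """Boost the weights of misclassified samples, then renormalise to sum to 1."""
    boosted = [w * math.exp(alpha) if miss else w
               for w, miss in zip(weights, misclassified)]
    total = sum(boosted)
    return [w / total for w in boosted]

# Hypothetical round: four samples, the first one misclassified
new_weights = reweight([0.25] * 4, [True, False, False, False], alpha=math.log(3))
```

The added `log(n_classes - 1)` term means a weak learner only needs an error below `1 - 1/n_classes`, rather than below one half, to receive a positive weight. This is what lets the method handle multiclass labels directly.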
Educational Data Mining is used to find interesting patterns in data from educational settings in order to improve teaching and learning. Our literature review did not identify any work that assesses students’ ability and performance with EDM methods in an e-learning environment for school-level math education in India. Our method is a novel approach to providing quality math education, with assessments indicating a student’s knowledge level in each lesson. This paper illustrates how the Learning Curve, an EDM visualization method, is used to compare rural and urban students’ progress in learning mathematics in an e-learning environment. The experiment was conducted in two schools in Tamil Nadu, India. After practicing the problems, the students took a test; their interaction data were collected and their performance was analyzed in several aspects: knowledge component level, time taken to solve a problem, and error rate. This work studies student actions to identify learning progress. The results show that the learning curve method helps teachers visualize students’ performance at a granular level, which is not possible manually. It also helps students know their skill level when they complete each unit.
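A learning curve of the kind described above is typically built by computing the error rate at each successive practice opportunity. The sketch below assumes a simple log format of (student, opportunity number, correct or not) for one knowledge component. The format, the function name, and the sample log are hypothetical and not taken from the study.

```python
from collections import defaultdict

def learning_curve(attempts):
    """Error rate at each practice opportunity, pooled over students.

    attempts: iterable of (student, opportunity_number, was_correct).
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for _student, opportunity, was_correct in attempts:
        totals[opportunity] += 1
        if not was_correct:
            errors[opportunity] += 1
    # One point per opportunity, in increasing order of opportunity number.
    return {n: errors[n] / totals[n] for n in sorted(totals)}

# Hypothetical interaction log for one knowledge component
log = [("s1", 1, False), ("s2", 1, False), ("s3", 1, True),
       ("s1", 2, False), ("s2", 2, True), ("s3", 2, True),
       ("s1", 3, True), ("s2", 3, True), ("s3", 3, True)]
curve = learning_curve(log)
```

A falling error rate across opportunities suggests the skill is being learned. Computing one curve per group (for example, rural and urban students) allows the kind of comparison the abstract describes.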
The increasing interest in big data, especially in the educational field and online education, has led to a conflict over student performance indicators. In this paper we discuss the methodology of assessing student performance in terms of success indicators, revealing a number of indicators that are recommended for indicating success in final academic achievement.
A study model on the impact of various indicators in the performance of stude... eSAT Publishing House
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching, and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars, and students of related fields of Engineering and Technology.
A SURVEY ON EDUCATIONAL DATA MINING AND RESEARCH TRENDS ijdms
Educational Data Mining (EDM) is an emerging field that explores data in educational contexts by applying different Data Mining (DM) techniques and tools. It provides intrinsic knowledge of the teaching and learning process for effective education planning. This survey focuses on the components and research trends (1998 to 2012) of EDM, highlighting its related tools, techniques, and educational outcomes. It also highlights the challenges of EDM.
Big data integration for transition from e-learning to smart learning framework eraser Juan José Calderón
Big data integration for transition from e-learning to smart learning framework. Dr. Prakash Kumar Udupi, Mr. Puttaswamy Malali, and Mr. Herald Noronha, Department of Computing, Middle East College.
Predicting student performance in higher education using multi-regression models TELKOMNIKA JOURNAL
Supporting the goal of higher education to produce graduates who will be professional leaders is crucial. Most universities implement an intelligent information system (IIS) to support their vision and mission. One feature of an IIS is student performance prediction. By implementing a data mining model in the IIS, this feature can precisely predict students’ grades for their enrolled subjects. Moreover, it can recognize at-risk students and allow top educational management to make educational interventions so that these students succeed academically. In this research, a multi-regression model is proposed that builds a model for every student. In our model, learning management system (LMS) activity logs were computed. Test results on large datasets of students, courses, and activities indicate that these models can improve prediction accuracy by over 15%.
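The idea of fitting a separate regression model for every student can be sketched as follows. This is a deliberate simplification and not the paper's model: it uses a single predictor and ordinary least squares, whereas the abstract describes multi-regression over LMS activity logs. The predictor (weekly LMS hours), the data, and all names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical history per student: (weekly LMS activity hours, grade)
history = {
    "student_a": [(2, 55), (4, 65), (6, 75)],
    "student_b": [(1, 70), (3, 72), (5, 74)],
}

# One model per student, fitted only on that student's own records.
models = {student: fit_line(*zip(*rows)) for student, rows in history.items()}

def predict(student, hours):
    """Predicted grade for a student at a given level of LMS activity."""
    slope, intercept = models[student]
    return slope * hours + intercept
```

Per-student models can capture individual differences that a single global model averages away, but each one needs enough history for that student to be reliable.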
Data mining approach to predict academic performance of students BOHRInternationalJou1
Powerful data mining techniques are available in a variety of educational fields. Educational research is advancing rapidly due to the vast amount of student data that can be used to create insightful patterns related to student learning. Educational data mining is a tool that helps universities assess and identify student performance. Well-known classification techniques have been widely used in data mining to determine student success. A decisive and growing research area in educational data mining (EDM) is predicting student academic performance. This area uses data mining and machine learning approaches to extract data from education repositories. According to relevant research, several academic performance prediction methods aim to support administrative and teaching staff in academic institutions. In the proposed approach, the collected data set is preprocessed to ensure data quality, and labeled student education data is used to train ANN, support vector, random forest, and decision tree (DT) classifiers. The performance of the four classifiers is measured by accuracy, receiver operating characteristic (ROC) curve, F1 score, and the confusion matrix of each model. Finally, we found that the top three models had an accuracy of 86–95%, an F1 score of 85–95%, and an average area under the OVA ROC curve of 98–99.6%.
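The evaluation measures named above can be computed directly from predicted and true labels. The sketch below covers the confusion matrix counts, accuracy, and F1 score for one class treated as positive. The labels and function names are hypothetical and do not reproduce the paper's data or results. ROC analysis is omitted because it needs predicted scores rather than hard labels.

```python
from collections import Counter

def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fp, fn, tn) with one class treated as positive."""
    pairs = Counter((t == positive, p == positive) for t, p in zip(y_true, y_pred))
    return (pairs[(True, True)], pairs[(False, True)],
            pairs[(True, False)], pairs[(False, False)])

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical true and predicted outcomes for eight students
y_true = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "fail"]
y_pred = ["pass", "fail", "fail", "pass", "pass", "fail", "pass", "fail"]

tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive="pass")
accuracy = (tp + tn) / len(y_true)
f1 = f1_score(tp, fp, fn)
```

Reporting F1 alongside accuracy matters when classes are imbalanced, for example when at-risk students are a small minority, since accuracy alone can look high while most of the minority class is missed.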
An Investigation into Brain Tumor Segmentation Techniques IIRindia
A tumor is an anomalous mass in the brain that can be cancerous. Such anomalous growth within the restricted space inside the skull can cause problems. Detecting brain tumors from medical imaging modalities such as CT or MRI involves segmentation (division into parts) for analysis and can be a challenging task. Accurate segmentation of brain images is essential for properly distinguishing tumor and non-tumor areas in clinical analysis. This paper details segmentation algorithms for brain images, their advantages and disadvantages, and a comparison of the algorithms.
The agricultural sector is the backbone of our country and plays a vital role in the overall economic growth of our nation. India uses about 59% of its total area for agriculture, and the sector contributes about 17% of GDP. Advanced techniques in agriculture are expected to increase the efficiency of certain farming activities. In this paper we introduce a concept for smart farming that uses wireless sensor web technology with a web-based application. It is intended to help farmers by improving the facilities available to them and by focusing on the measurement of crop production. With the help of data mining techniques and algorithms such as K-nearest neighbor and decision trees, we will gather farming-related data, which should be updated frequently so that farmers and consumers get accurate knowledge about the respective crops and about suitable farming equipment. Existing systems are not very efficient at displaying such data characteristics. Our main aim is to enhance growth in the agriculture sector and make the existing system smarter, so that decision-makers can define the expansion of agricultural activities to empower the different forces in the existing agriculture sector.
Similar to A Survey on Educational Data Mining Techniques
A Survey on the Analysis of Dissolved Oxygen Level in Water using Data Mining...IIRindia
Data Mining (DM) is a powerful and a new field having various techniques to analyses the recent real world problems. In DM, environmental mining is one of the essential and interesting research areas. DM enables to collect fundamental insights and knowledge from massive volume of environmental data. The water quality is determining the condition of water in the environment. It represents the concentration and state (dissolved or particulate) of some or all the organic and inorganic material present in the water, together with certain physical characteristics of the water. The Dissolved Oxygen (DO) is one of the important aspects of water quality. The DO is the quantity of gaseous oxygen (O2) incorporated into the water. The DO is essential for keeping the water organisms alive. The amount of DO level in the water can be detected by various methods. The data mining techniques are properly used to find DO Level in the different types of water. A number of DM methods used to analyze the DO level such as Multi-Layer Perceptron, Multivariate Linear Regression, Factor Analysis, and Feed Forward Neural Network. This survey work discusses about such type of methods, particularly used for the analysis of DO level elaborately.
Kidney Failure Due to Diabetics – Detection using Classification Algorithm in...IIRindia
In order to analyse the chosen data from various points of view, data mining is used as the effective process. This process is also used to sum-up all those views into useful information. There are several types of algorithms in data mining such as Classification algorithms, Regression, Segmentation algorithms, association algorithms, sequence analysis algorithms, etc.,. The classification algorithm can be used to bifurcate the data set from the given data set and foretell one or more discrete variables, based on the other attributes in the dataset. The ID3 (Iterative Dichotomiser 3) algorithm is an original data set S as the root node. An unutilised attribute of the data set S calculates the entropy H(S) (or Information gain IG (A)) of the attribute. Upon its selection, the attribute should have the smallest entropy (or largest information gain) value. The prime objective of this paper is to analyze the data from a Kidney disorder due to diabetics by using classification technique to predict class accurately.
Silhouette Threshold Based Text Clustering for Log AnalysisIIRindia
Automated log analysis has been a dominant subject area of interest to both industry and academics alike. The heterogeneous nature of system logs, the disparate sources of logs (Infrastructure, Networks, Databases and Applications) and their underlying structure & formats makes the challenge harder. In this paper I present the less frequently used document clustering techniques to dynamically organize real time log events (e.g. Errors, warnings) to specific categories that are pre-built from a corpus of log archives. This kind of syntactic log categorization can be exploited for automatic log monitoring, priority flagging and dynamic solution recommendation systems. I propose practical strategies to cluster and correlate high volume log archives and high velocity real time log events; both in terms of solution quality and computational efficiency. First I compare two traditional partitional document clustering approaches to categorize high dimensional log corpus. In order to select a suitable model for our problem, Entropy, Purity and Silhouette Index are used to evaluate these different learning approaches. Then I propose computationally efficient approaches to generate vector space model for the real time log events. Then to dynamically relate them to the categories from the corpus, I suggest the use of a combination of critical distance measure and least distance approach. In addition, I introduce and evaluate three different critical distance measures to ascertain if the real time event belongs to a totally new category that is unobserved in the corpus.
Analysis and Representation of Igbo Text Document for a Text-Based SystemIIRindia
The advancement in Information Technology (IT) has assisted in inculcating the three Nigeria major languages in text-based application such as text mining, information retrieval and natural language processing. The interest of this paper is the Igbo language, which uses compounding as a common type of word formation and as well has many vocabularies of compound words. The issues of collocation, word ordering and compounding play high role in Igbo language. The ambiguity in dealing with these compound words has made the representation of Igbo language text document very difficult because this cannot be addressed using the most common and standard approach of the Bag-Of-Words (BOW) model of text representation, which ignores the word order and relation. However, this cause for a concern and the need to develop an improved model to capture this situation. This paper presents the analysis of Igbo language text document, considering its compounding nature and describes its representation with the Word-based N-gram model to properly prepare it for any text-based application. The result shows that Bigram and Trigram n-gram text representation models provide more semantic information as well addresses the issues of compounding, word ordering and collocations which are the major language peculiarities in Igbo. They are likely to give better performance when used in any Igbo text-based system.
A Survey on E-Learning System with Data MiningIIRindia
E-learning process has been widely used in university campus and educational institutions are playing vital role to enhance the skill set of students. Modern E-learning done by many electronic devices, such as smartphones, Tabs, and so on, on existing E-learning tools is insufficient to achieve the purpose of online training of education. This paper presents a survey of online e-Learning authoring tools for creating and integrating reusable e-learning tool for generation and enhancing existing learning resources with them. The work concentrates on evaluation of the existing e-learning tools a, and authoring tools that have shown good performance in the past for online learners. This survey work takes more than 20 online tools that deal with the educational sector mechanism, for the purpose of observations, and the outcome were analyzed. The findings of this paper are the main reason for developing a new tool, and it shows that educators can enhance existing learning resources by adding assessment resources, if suitable authoring tools are provided. Finally, the different factors that assure the reusability of the created new e-learning tool has been analysed in this paper.E-learning environment is a guide for both students and tutorial management system. The useful on the e-learning system for apart from students and distance learning students. The purpose of using e-learning environment for online education system, developed in data mining for more number of clustering servers and resource chain has been good.
Image Segmentation Based Survey on the Lung Cancer MRI ImagesIIRindia
Educational data mining (EDM) creates high impact in the field of academic domain. The methods used in this topic are playing a major advanced key role in increasing knowledge among students. EDM explores and gives ideas in understanding behavioral patterns of students to choose a correct path for choosing their carrier. This survey focuses on such category and it discusses on various techniques involved in making educational data mining for their knowledge improvement. Also, it discusses about different types of EDM tools and techniques in this article. Among the different tools and techniques, best categories are suggested for real world usage.
The Preface Layer for Auditing Sensual Interacts of Primary Distress Conceali...IIRindia
Resting anterior brain electrical activity, self-report measures of Behavioral Approach System (BAS) and Behavioral Inhibition System (BIS) strength, and common levels of Positive Affect (PA) and Negative Affect (NA) were composed from 46 unselected undergraduates two split occasions Electroencephalogram (EEG) measures of prefrontal asymmetry and the self-report measures showed excellent internal reliability, steadiness and tolerable test-retest stability. Strong connection betweens the unconstrained facial emotional expressions and the full of feeling states correlated cerebrum movement. When seeing dreadful as contrasted with unbiased faces, members showed larger amounts of actuation inside the privilege average prefrontal cortex (PFC). To propose a multimodal method to deal with assess Efficient Practical near Infrared Spectroscopy (EPNIS) signals and EEG signals for full of feeling state identification. Outcomes demonstrate that proposed technique with EPNIS enhances execution over EPNIS methodologies. Based on
Feature Based Underwater Fish Recognition Using SVM ClassifierIIRindia
An approach for underwater fish recognition based on wavelet transform is presented in this paper. This approach decomposes the input image into sub-bands by using the multi resolutional analysis known as Discrete Wavelet Transform (DWT). As each sub-band in the decomposed image contains useful information about the image, the mean values of every sub-band are assumed as features. This approach is tested on Underwater Photography - A Fish Database. The database contains 7953 pictures of 1458 different species. The database is considered for the classification based on Support Vector machine (SVM) classifier. The result shows that maximum recognition accuracy of 90.74% is achieved by the wavelet features.
The objective of this research work is focused on the right cluster creation of lung cancer data and analyzed the efficiency of k-Means and k-Medoids algorithms. This research work would help the developers to identify the characteristics and flow of algorithms. In this research work is pertinent for the department of oncology in cancer centers. This implementation helps the oncologist to make decision with lesser execution time of the algorithm.It is also enhances the medical care applications. This work is very suitable for selection of cluster development algorithm for lung cancer data analysis.Clustering is an important technique in data mining which is applied in many fields including medical diagnosis to find diseases. It is the process of grouping data, where grouping is recognized by discovering similarities between data based on their features. In this research work, the lung cancer data is used to find the performance of clustering algorithms via its computational time. Considering a limited number attributes of lung cancer data, the algorithmic steps are applied to get results and compare the performance of algorithms. The partition based clustering algorithms k-Means and k-Mediods are selected to analyze the lung cancer data.The efficiency of both the algorithms is analyzed based on the results produced by this approach. The finest outcome of the performance of the algorithm is reported for the chosen data concept.
A Study on MRI Liver Image Segmentation using Fuzzy Connected and Watershed T...IIRindia
A comparison study between automatic and interactive methods for liver segmentation from contrast-enhanced MRI images is ocean. A collection of 20 clinical images with reference segmentations was provided to train and tune algorithms in advance. Employed algorithms include statistical shape models, atlas registration, level-sets, graph-cuts and rule-based systems. All results were compared to refer five error measures that highlight different aspects of segmentation accuracy. The measures were combined according to a specific scoring system relating the obtained values to human expert variability. In general, interactive methods like Fuzzy Connected and Watershed Methods reached higher average scores than automatic approaches and featured a better consistency of segmentation quality. However, the best automatic methods (mainly based on statistical shape models with some additional free deformation) could compete well on the majority of test images. The study provides an insight in performance of different segmentation approaches under real-world conditions and highlights achievements and limitations of current image analysis techniques. In this paper only Fuzzy Connected and Watershed Methods are discussed.
A Clustering Based Collaborative and Pattern based Filtering approach for Big...IIRindia
With web services developing and aggregating in application range, benefit revelation has turned into a hot issue for benefit organization and service management. Service clustering gives a promising approach to part the entire seeking space into little areas in order to limit the disclosure time successfully. In any case, semantic data is a basic component amid the entire arranging process. Current industrialized Web Service Portrayal Language (WSPL) does not contain enough data for benefit depiction. Thusly, a service clustering technique has been proposed, which upgrades unique WSPL report with semantic data by methods for Connected Open Information (COI). Examination based genuine service information has been performed, and correlation with comparable techniques has additionally been given to exhibit the adequacy of the strategy. It is demonstrated that using semantic data from COI improves the exactness of service grouping. Furthermore, it shapes a sound base for promote thorough preparing with semantic data.
Hadoop and Hive Inspecting Maintenance of Mobile Application for Groceries Ex...IIRindia
Numerous movable applications on secure groceries expenditure and e-health have designed recently. Health aware clients respect such applications for secure groceries expenditure, particularly to avoid irritating groceries and added substances. However, there is the lack of a complete database including organized or unstructured information to help such applications. In the paper propose the Multiple Scoring Frameworks (MSF), a healthy groceries expenditure search service for movable applications using Hadoop and MapReduce (MR). The MSF works in a procedure behind a portable application to give a search service for data on groceries and groceries added substances. MSF works with similar logic from a web search engine (WSE) and it crawls over Web sources cataloguing important data for possible utilize in reacting to questions from movable applications. MSF outline and advancement are featured in the paper during its framework design, inquiry understanding, its utilization of the Hadoop/MapReduce infrastructure, and activity contents.
Performance Evaluation of Feature Selection Algorithms in Educational Data Mi...IIRindia
Educational Data mining(EDM)is a prominent field concerned with developing methods for exploring the unique and increasingly large scale data that come from educational settings and using those methods to better understand students in which they learn. It has been proved in various studies and by the previous study by the authors that data mining techniques find widespread applications in the educational decision making process for improving the performance of students in higher educational institutions. Classification techniques assumes significant importance in the machine learning tasks and are mostly employed in the prediction related problems. In machine learning problems, feature selection techniques are used to reduce the attributes of the class variables by removing the redundant and irrelevant features from the dataset. The aim of this research work is to compares the performance of various feature selection techniques is done using WEKA tool in the prediction of students’ performance in the final semester examination using different classification algorithms. Particularly J48, Naïve Bayes, Bayes Net, IBk, OneR, and JRip are used in this research work. The dataset for the study were collected from the student’s performance report of a private college in Tamil Nadu state of India. The effectiveness of various feature selection algorithms was compared with six classifiers and the results are discussed. The results of this study shows that the accuracy of IBK is 99.680% which is found to be
A Review of Edge Detection Techniques for Image SegmentationIIRindia
Edge detection is a key stride in Image investigation. Edges characterize the limits between areas in a image, which assists with division and article acknowledgment.Edge discovery is a image preparing method for finding the limits of articles inside Image. It works by distinguishing irregular in brilliance and utilized for Image division and information extraction in zones, for example, Image preparing, PC vision and Image vision. There are likely more algorithms in a writing of upgrading and distinguishing edges than whatever other single subject.In this paper, the principle is to concentrate most usually utilized edge methods for Image segmentation.
Leanness Assessment using Fuzzy Logic Approach: A Case of Indian Horn Manufac...IIRindia
Lean principles are being implemented by many industries today that focus on improving the efficiency of the operations for reducing the waste, efforts and consumption. Organizations implementing lean principles can be assessed using the some tools. This paper attempts to assess the lean implementation in a leading Horn manufacturing industry in South India. The twofold objectives are set to be achieved through this paper. First is to find the leanness level of a manufacturing organization for which a horn manufacturing company has been selected as the case company. Second is to find the critical obstacles for the lean implementation. The fuzzy logic computation method is used to extract the perceptions about the particular variables by using linguistic values and then match it with fuzzy numbers to compute the precise value of the leanness level of the organization. Based on the results obtained from this analysis, it was found that the case study company has performed in the lean to vey lean range and the weaker areas have been identified to improve the performance further.
Comparative Analysis of Weighted Emphirical Optimization Algorithm and Lazy C...IIRindia
Health care has millions of centric data to discover the essential data is more important. In data mining the discovery of hidden information can be more innovative and useful for much necessity constraint in the field of forecasting, patient’s behavior, executive information system, e-governance the data mining tools and technique play a vital role. In Parkinson health care domain the hidden concept predicts the possibility of likelihood of the disease and also ensures the important feature attribute. The explicit patterns are converted to implicit by applying various algorithms i.e., association, clustering, classification to arrive at the full potential of the medical data. In this research work Parkinson dataset have been used with different classifiers to estimate the accuracy, sensitivity, specificity, kappa and roc characteristics. The proposed weighted empirical optimization algorithm is compared with other classifiers to be efficient in terms of accuracy and other related measures. The proposed model exhibited utmost accuracy of 87.17% with a robust kappa statistics measurement and roc degree indicated the strong stability of the model when compared to other classifiers. The total penalty cost generated by the proposed model is less when compared with the penalty cost of other classifiers in addition to accuracy and other performance measures.
Survey on Segmentation Techniques for Spinal Cord ImagesIIRindia
Medical imaging is a technique which is used to expose the interior part of the body, to diagnose the diseases and to treat them as well. Different modalities are used to process the medical images. It helps the human specialists to make diagnosis ailments. In this paper, we surveyed segmentation on the spinal cord images using different techniques such as Data mining, Support vector machine, Neural Networks and Genetic Algorithm which are applied to find the disorders and syndromes affected in the spinal cord system. As a result, we have gained knowledge in an identified disarrays and ailments affected in lumbar vertebra, thoracolumbar vertebra and spinal canal. Finally how the Disc Similarity Index values are generated in each method is also analysed.
An Approach for Breast Cancer Classification using Neural NetworksIIRindia
Breast Cancer,an increasing predominant death causing disease among women has become a social concern. Early detection and efficient treatment helps to reduce the breastcancerrisk.AdaptiveResonanceTheory(ART1),anunsupervised neural network has become an efficient tool in the classification of breast cancer as Benign(non dangerous tumour) or Malignant (dangerous tumour). 400 instances were pre processed to convert real data into binary data and the classification was carried out using ART1 network. The results of the classified data and the physician diagnosed data were compared and the standard performance measures accuracy, sensitivity and specificity were computed. The results show that the simulation results are analogous to the clinical results.
Automobile Management System Project Report.pdfKamal Acharya
The proposed project is developed to manage the automobile in the automobile dealer company. The main module in this project is login, automobile management, customer management, sales, complaints and reports. The first module is the login. The automobile showroom owner should login to the project for usage. The username and password are verified and if it is correct, next form opens. If the username and password are not correct, it shows the error message.
When a customer search for a automobile, if the automobile is available, they will be taken to a page that shows the details of the automobile including automobile name, automobile ID, quantity, price etc. “Automobile Management System” is useful for maintaining automobiles, customers effectively and hence helps for establishing good relation between customer and automobile organization. It contains various customized modules for effectively maintaining automobiles and stock information accurately and safely.
When the automobile is sold to the customer, stock will be reduced automatically. When a new purchase is made, stock will be increased automatically. While selecting automobiles for sale, the proposed software will automatically check for total number of available stock of that particular item, if the total stock of that particular item is less than 5, software will notify the user to purchase the particular item.
Also when the user tries to sale items which are not in stock, the system will prompt the user that the stock is not enough. Customers of this system can search for a automobile; can purchase a automobile easily by selecting fast. On the other hand the stock of automobiles can be maintained perfectly by the automobile shop manager overcoming the drawbacks of existing system.
Forklift Classes Overview by Intella PartsIntella Parts
Discover the different forklift classes and their specific applications. Learn how to choose the right forklift for your needs to ensure safety, efficiency, and compliance in your operations.
For more technical information, visit our website https://intellaparts.com
Final project report on grocery store management system..pdfKamal Acharya
In today’s fast-changing business environment, it’s extremely important to be able to respond to client needs in the most effective and timely manner. If your customers wish to see your business online and have instant access to your products or services.
Online Grocery Store is an e-commerce website, which retails various grocery products. This project allows viewing various products available enables registered users to purchase desired products instantly using Paytm, UPI payment processor (Instant Pay) and also can place order by using Cash on Delivery (Pay Later) option. This project provides an easy access to Administrators and Managers to view orders placed using Pay Later and Instant Pay options.
In order to develop an e-commerce website, a number of Technologies must be studied and understood. These include multi-tiered architecture, server and client-side scripting techniques, implementation technologies, programming language (such as PHP, HTML, CSS, JavaScript) and MySQL relational databases. This is a project with the objective to develop a basic website where a consumer is provided with a shopping cart website and also to know about the technologies used to develop such a website.
This document will discuss each of the underlying technologies to create and implement an e- commerce website.
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdfKamal Acharya
The College Bus Management system is completely developed by Visual Basic .NET Version. The application is connect with most secured database language MS SQL Server. The application is develop by using best combination of front-end and back-end languages. The application is totally design like flat user interface. This flat user interface is more attractive user interface in 2017. The application is gives more important to the system functionality. The application is to manage the student’s details, driver’s details, bus details, bus route details, bus fees details and more. The application has only one unit for admin. The admin can manage the entire application. The admin can login into the application by using username and password of the admin. The application is develop for big and small colleges. It is more user friendly for non-computer person. Even they can easily learn how to manage the application within hours. The application is more secure by the admin. The system will give an effective output for the VB.Net and SQL Server given as input to the system. The compiled java program given as input to the system, after scanning the program will generate different reports. The application generates the report for users. The admin can view and download the report of the data. The application deliver the excel format reports. Because, excel formatted reports is very easy to understand the income and expense of the college bus. This application is mainly develop for windows operating system users. In 2017, 73% of people enterprises are using windows operating system. So the application will easily install for all the windows operating system users. The application-developed size is very low. The application consumes very low space in disk. Therefore, the user can allocate very minimum local disk space for this application.
Immunizing Image Classifiers Against Localized Adversary Attacksgerogepatton
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNN)s, to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), itsignificantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
Vaccine management system project report documentation..pdfKamal Acharya
The Division of Vaccine and Immunization is facing increasing difficulty monitoring vaccines and other commodities distribution once they have been distributed from the national stores. With the introduction of new vaccines, more challenges have been anticipated with this additions posing serious threat to the already over strained vaccine supply chain system in Kenya.
Integrated Intelligent Research (IIR) International Journal of Data Mining Techniques and Applications
Volume: 05 Issue: 02 December 2016, Page No.167-171
ISSN: 2278-2419
A Survey on Educational Data Mining Techniques
A.S. Arunachalam1, T.Velmurugan2
1 Research scholar, Department of Computer Science, Vels University, Chennai, India.
2 Associate Professor, PG and Research Department of Computer Science, D.G.Vaishanav College, Chennai, India.
Mail: arunachalam1976@gmail.com, velmurugan_dgvc@yahoo.co.in
Abstract - Educational data mining (EDM) has a high impact
in the academic domain. The methods used in this field
play a major role in increasing knowledge
among students. EDM explores and gives insight into the
behavioral patterns of students, helping them choose a
correct path for their career. This survey focuses on
this area and discusses the various techniques involved
in applying educational data mining to improve students'
knowledge. It also discusses different types of EDM
tools and techniques. Among the different tools
and techniques, the best categories are suggested for real-world
usage.
Key words: Educational Data Mining, Web Mining, E-Learning,
Data Mining Techniques
I. INTRODUCTION
Collecting relevant student records and analyzing them within a
huge record set always remains a difficult task for researchers.
Data mining, the process of extracting hidden information from
large databases, provides a meaningful solution for educational
data mining. Researchers also face many problems in
implementing systems developed for educational data
mining on different platforms. The huge number of
educational courses being developed makes it difficult for students
to choose the best course. Current web-based course applications
do not tailor their learning materials to an understanding of
students' mentality. A user-friendly environment for web-based
educational systems remains a good route to a richer
learning environment. In the traditional education system, students
share their learning experiences through one-to-one interaction and
a continual evaluation process [1]. Classroom evaluation
proceeds by observing students' attitudes, analyzing records,
and appraising students against teaching strategies. Such supervision is
not possible when the students are working in the IT field, so the
pedagogue chooses other techniques to obtain classroom data.
Institutions that run websites for distance learning collect
huge amounts of data automatically, through server access logs and
web servers. Web-based learning analysis tools available
online increase the interaction data between academicians and
students [2]. The most effective learning environment can be
achieved by following data mining techniques. The stages of these
techniques run from preprocessing to post-processing,
following the knowledge discovery in databases (KDD) process of
identifying the necessary educational data. The web-based domain of
e-commerce uses data mining techniques that also help advance
educational mining. The e-learning process gives an optimal solution
for improving the educational data mining process. Some differences
between e-learning and e-commerce systems are discussed below [3].
E-commerce technology is used for communication between client
and server for commercial purposes. Web-based applications are
used to carry out this technology. Web access log files store
E-commerce transaction data for money transfers. The commercial
website passes control of the transaction to the bank's website
application, and the purchase takes place. E-learning technology
is used for learning from websites. Users can gather the
necessary knowledge through a web-based learning process. The
web applications used in this type of website are very useful in
providing knowledge to learners. These applications store
information in web access log files, which record information
about the users working on the websites.
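As an illustration of how such access logs can be turned into usable data, the following minimal Python sketch parses log lines and counts successful page requests per user. It assumes the logs follow the Common Log Format; the sample lines, user names and paths are invented for illustration and do not come from any of the surveyed systems.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident user [time] "METHOD path PROTOCOL" status size
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)$'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def pages_per_user(lines):
    """Count successful (status 200) requests per authenticated user."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record and record["status"] == "200" and record["user"] != "-":
            counts[record["user"]] += 1
    return counts


# Hypothetical sample lines, invented for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - alice [10/Oct/2016:13:55:36 +0530] "GET /course/unit1.html HTTP/1.1" 200 2326',
    '10.0.0.2 - bob [10/Oct/2016:13:56:01 +0530] "GET /course/unit1.html HTTP/1.1" 200 2326',
    '10.0.0.1 - alice [10/Oct/2016:13:57:12 +0530] "GET /course/quiz1.html HTTP/1.1" 200 1045',
    '10.0.0.3 - - [10/Oct/2016:13:58:40 +0530] "GET /missing.html HTTP/1.1" 404 512',
]

USAGE = pages_per_user(SAMPLE_LOG)
```

Per-user counts of this kind are one possible starting point for the usage analyses discussed in the rest of the survey; a real deployment would need to adapt the pattern to its own server's log configuration.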
In an educational system, knowledge assessment techniques are
applied to improve students' learning process. Formative
assessment evaluates the continuous improvement of students'
learning capacity, and it helps the educator to improve
instructional materials. Data mining techniques help the
educator to make academic decisions when designing or editing
the teaching methodology. Educational data mining follows the
common data mining methods. Extracted information should
re-enter the loop of the system and guide the fine-tuning and
refinement of learning [4]. This data not only becomes
knowledge; it also improves the mined knowledge used for
decision making. The rest of the survey is organized as follows.
Section 2 discusses the basic concepts of educational data
mining. Section 3 discusses the tools used for educational data
mining. The applications or techniques of educational data
mining are illustrated in Section 4. Finally, Section 5
concludes the survey.
II. EDUCATIONAL DATA MINING
Educational data mining plays a major role in society and in
education. The sequence in which data mining is applied in an
educational system is shown in Figure 1.
Figure 1: Data mining sequence applied in an educational system
2. Integrated Intelligent Research (IIR) International Journal of Data Mining Techniques and Applications
Volume: 05 Issue: 02 December 2016, Page No.167-171
ISSN: 2278-2419
168
a. Academician
Academicians play a major role in designing the educational
system, which should be constructed to give students the
greatest benefit. This effort may bring dramatic changes to the
educational environment and to society. Academicians and
teachers should analyze students' records and construct a better
educational system. Data mining techniques give academicians an
additional benefit by letting them analyze students' behavior
from historical data.
b. Educational System
A proper educational system provides a healthy environment and
society. Students are one of the major factors in an educational
system, and they eventually make society healthy. Its rules and
regulations are framed with the help of experienced
academicians.
c. Data mining Techniques
Historical student records are collected from various
localities, and data mining techniques are applied to them.
Unwanted data are removed by preprocessing. Multiple data mining
techniques have been used in the EDM process, including
classification, association, prediction, clustering, sequential
pattern mining and decision trees.
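The preprocessing step described above can be sketched as follows. This is a minimal illustration, not the procedure of any surveyed paper: the field names (student_id, attendance, grade) and the sample records are hypothetical, and "unwanted data" is assumed here to mean rows with missing values or duplicated student identifiers.

```python
def preprocess(records, required_fields):
    """Drop rows with missing required fields and rows that repeat a student id."""
    seen_ids = set()
    cleaned = []
    for record in records:
        # Skip rows where a required field is absent or empty.
        if any(record.get(field) in (None, "") for field in required_fields):
            continue
        # Keep only the first row seen for each student id.
        student_id = record.get("student_id")
        if student_id is None or student_id in seen_ids:
            continue
        seen_ids.add(student_id)
        cleaned.append(record)
    return cleaned


# Hypothetical raw records, invented for illustration only.
RAW_RECORDS = [
    {"student_id": "S01", "attendance": 92, "grade": "A"},
    {"student_id": "S02", "attendance": None, "grade": "B"},  # missing value
    {"student_id": "S01", "attendance": 92, "grade": "A"},    # duplicate
    {"student_id": "S03", "attendance": 75, "grade": ""},     # empty grade
    {"student_id": "S04", "attendance": 60, "grade": "C"},
]

CLEAN_RECORDS = preprocess(RAW_RECORDS, ["student_id", "attendance", "grade"])
```

Dropping incomplete rows is only one possible policy; imputing missing values is a common alternative when records are scarce.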
d. Analyzing for Recommendation
The goal is to gain insight into how to improve the efficiency
of the educational system and adapt it to the performance of its
students, to obtain measures of how to better organize
institutional resources (human and material) and the educational
offer, to enhance the educational programs offered, and to
determine the effectiveness of the new computer-mediated
distance learning approach.
e. Students
The objective is to capture students' learning experiences,
resources, activities and learning interests, based on the tasks
the student has already completed and their successes, and on
the tasks completed by other similar learners.
III. TOOLS FOR EDUCATIONAL DATA MINING
Zaiane and Luo discuss the Web Utilization Miner (WUM) for a
student feedback system. The value added by recording specific
events on the E-learning side gives the click streams and the
discovered patterns a better meaning and interpretation [5].
Silva and Vieira use a web usage ranking technique to identify
student information web pages. Their tool, MultiStar, presents
the patterns it finds textually, and the patterns resulting from
the classification task are expressed as rules [6]. Shen
implements visualization of student academic performance through
statistical graphs [7]. Romero discovers interesting prediction
rules from student usage information to improve adaptive web
courses such as AHA! (Adaptive Hypermedia for All). He also uses
a visual tool (EPRules) to discover prediction rules; it is
oriented towards use by the teacher [8]. Tane proposes an
ontology-based tool to make the most of the resources available
on the web. He uses text mining and text clustering techniques
to group papers according to their topics and similarities [9].
Merceron and Yacef use traditional SQL queries to mine student
data captured from a web-based tutoring tool, together with
association rules and symbolic data analysis. Their objective is
to find mistakes that often occur together [10]. Becker shows
that sequential patterns can expose which content has prompted
access to other content, or how tools and content are
intertwined in the learning process [11]. Avouris, Komis,
Fiotakis, Margaritis and Voyiatzaki enrich automatically
generated log files by introducing contextual information as
additional events and by associating comments and static files
[12]. Mazza and Milani introduce the tool GISMO/CourseVis.
Information visualization techniques can graphically render the
multidimensional, complex student tracking data collected by
web-based educational systems, and they make it easier to
analyze large amounts of information by representing the data in
a visual display [13]. Mostow uses the Listen tool to gain an
understanding of learners and to become aware of what is
happening in distance classes [14]. Damez, Dang, Marsala and
Bouchon-Meunier use a fuzzy decision tree for user modeling and
for automatically discriminating a learner from an experienced
user. They use an agent to learn the cognitive characteristics
of a user's interactions and classify users as experienced or
not [15]. Bari and Benzater retrieve data from PDF hypermedia
productions to support the assessment of multimedia
presentations, for statistical purposes and for extracting
relevant data. They identify the major blocks of multimedia
presentations and retrieve their internal properties [16]. Qasem
A. Al-Radaideh introduced the CRISP framework for mining
student-related academic data. He used a decision tree as the
classification technique for rule mining in academic data [17].
Cristobal Romero also introduced KEEL, a software tool to assess
evolutionary algorithms for data mining problems in regression,
classification and unsupervised learning [18]. C. Marquez-Vera
used the SMOTE algorithm to rebalance the data before
classifying student success, with 10-fold cross-validation; the
rules were implemented and tested in WEKA [19]. Dragan Gašević
identifies the critical topics that require immediate research
attention if learning analytics is to make a sustainable impact
on the research and practice of learning and teaching, using an
online learning tool and a video annotation tool [20].
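To make the idea of mining mistakes that often occur together concrete, the sketch below computes support and confidence for pairs of items, the two standard measures behind association rules. It is not the TADA-ED implementation of [10]; the mistake labels (M1, M2, M3), the thresholds and the function name pair_rules are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support, min_confidence):
    """Return rules (lhs, rhs, support, confidence) over item pairs.

    support(a, b)    = fraction of transactions containing both a and b
    confidence(a->b) = count(a and b) / count(a)
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for items in transactions:
        unique = sorted(set(items))
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Check the rule in both directions.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical mistake labels recorded per exercise attempt.
ATTEMPTS = [
    ["M1", "M2"],
    ["M1", "M2", "M3"],
    ["M1"],
    ["M2", "M3"],
    ["M1", "M2"],
]

RULES = pair_rules(ATTEMPTS, min_support=0.5, min_confidence=0.75)
```

With these toy attempts, only the pair (M1, M2) clears the support threshold, so the rules returned relate those two mistakes in both directions. Full association rule miners such as Apriori extend the same counting to itemsets larger than pairs.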
IV. TECHNIQUES FOR EDUCATIONAL DATA MINING
Cristobal Romero compares different data mining classification
techniques using Moodle usage data from Cordoba University. He
applied a rebalancing preprocessing technique when classifying
the original numerical data [21]. Ramaswami developed a
predictive data mining model to identify slow learners and to
study the influence of the dominant factors on their academic
performance. He also introduced a CHAID model for predicting
slow learners accurately [22]. Edin Osmanbegovic applied a data
mining technique in higher education, analyzing socio-demographic
variables, the high school entrance exam and attributes related
to it. He also uses several supervised algorithms to analyze the
student data [23]. Surjeet Kumar Yadav used three decision tree
machine learning algorithms (ID3, C4.5 and CART) to obtain a
predictive model of students. He also reports accuracy values
for the classifiers and identifies the ratio of student
successes to failures [24].
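The decision tree algorithms named above choose which attribute to split on at each node; ID3 does so by picking the attribute with the highest information gain, Gain(S, A) = Entropy(S) - sum over values v of (|S_v| / |S|) * Entropy(S_v). The sketch below computes this quantity on a toy dataset. The attribute names (attendance, lab) and the records are hypothetical, and the code shows only the split criterion, not a full tree builder.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([row[target] for row in rows])
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    remainder = sum(
        len(part) / len(rows) * entropy(part) for part in partitions.values()
    )
    return base - remainder


# Hypothetical student records, invented for illustration only.
STUDENTS = [
    {"attendance": "high", "lab": "yes", "result": "pass"},
    {"attendance": "high", "lab": "no", "result": "pass"},
    {"attendance": "low", "lab": "yes", "result": "fail"},
    {"attendance": "low", "lab": "no", "result": "fail"},
]

GAIN_ATTENDANCE = information_gain(STUDENTS, "attendance", "result")
GAIN_LAB = information_gain(STUDENTS, "lab", "result")
```

In this toy data, attendance separates pass from fail perfectly while lab carries no information, so an ID3-style learner would split on attendance first.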
Table 1: Educational Data Mining Tools
Authors | Tool | Mining task | Findings
Zaiane and Luo (2001) | WUM | Association and patterns | The E-learning side will give click streams and the patterns discovered a better meaning and interpretation.
Silva and Vieira (2002) | MultiStar | Association and classification | MultiStar textually presents the patterns it finds. The patterns resulting from the classification task are expressed through certain rules.
Shen et al. (2002) | Data Analysis Center | Association and classification | E-learning system for solving problems such as student and teacher interaction problems in assignments, and other problems.
Romero et al. (2003) | EPRules | Association | Grammar-based genetic programming with multi-objective optimization techniques was developed to provide feedback to courseware authors; it identifies relationships in students' usage data.
Tane et al. (2004) | KAON | Text mining and clustering | The Courseware Watchdog addresses the different needs of teachers and students. It integrates the Semantic Web vision by using ontologies and a peer-to-peer network of semantically annotated learning material.
Merceron and Yacef (2005) | TADA-ED | Classification and association | Traditional SQL queries are used to mine student data from a web-based tutoring tool. The objective of finding mistakes that occur often is achieved with accurate rating.
Becker et al. (2005) | O3R | Sequential patterns | The proposed filtering functionality 1) has the support of the ontology to understand the domain and establish interesting filters, and 2) uses direct manipulation of domain concepts and structural operators to minimize the skills required for defining filters.
Mostow et al. (2005) | Listen tool | Visualization | Log each distinct type of tutorial event in its own table. Include student ID, computer, start time and end time as fields of each such table so as to identify its records as events.
Avouris et al. (2005) | Synergo/ColAT | Statistics and visualization | Main features of two tools that facilitate analysis of complex field data of technology-mediated learning activities: the Synergo Analysis Tool and ColAT.
Mazza and Milani (2005) | GISMO/CourseVis | Visualization | GISMO has been implemented based on the authors' previous experience with the CourseVis research, and proposes some graphical representations that can be useful to gain insights into the students of the course.
Damez et al. (2005) | TAFPA | Classification | Four new steps were expected as four questions were asked, but it was noticed that one user missed a question by accidentally double-clicking the "Show next question" button. This can lead to some mistakes when using the LCS.
Qasem A. Al-Radaideh (2006) | CRISP Classifier | Classification | For the classification algorithms ID3, C4.5 and Naive Bayes, the percentage of correctly classified instances is not very high.
Cristobal Romero (2009) | KEEL | Regression, classification and unsupervised learning | Association rule mining has been used to provide new, important and therefore demand-oriented impulses for the development of new bachelor and master courses.
C. Marquez-Vera (2010) | 10-fold cross-validation using the WEKA tool | Classification | Rule induction algorithms such as JRip, NNge, OneR, Prism and Ridor, and decision tree algorithms such as J48, SimpleCart, ADTree, RandomTree and REPTree, are used for the experiment because these algorithms can be used directly for decision making and provide detailed classification results.
Dragan Gasevic (2015) | Online learning tools and video annotation tool | Classification and segmentation | Transition graphs are constructed from a contingency matrix in which rows and columns were all events logged by the video annotation tool.
Table 2: Techniques for Educational Data Mining
Authors | Techniques | Mining task | Findings
Cristobal Romero (2007) | Moodle | Classification | C4.5 and CART algorithms are simple for instructors to understand and interpret. GGP algorithms have a higher expressive power, allowing the user to determine the specific format of the rules.
M. Ramaswami (2010) | CHAID model | Association | Features whose chi-square values were greater than 100 were given due consideration as the highly influencing variables. These features were used to construct the CHAID prediction model.
Edin Osmanbegovic (2012) | L86 Classifier | Classification and association | The results show that the performance and accuracy of the naive Bayes algorithm and the decision tree are higher than those of the neural network method.
Surjeet Kumar Yadav (2012) | C4.5, ID3, CART algorithm-based tool | Classification | The ID3, C4.5 and CART machine learning algorithms produce predictive models with the best class-wise accuracy. The true positive rate for the FAIL class is 0.786 for ID3 and C4.5. The created model classified successfully.
Dorina Kabakchieva (2013) | CRISP-DM | Classification | The J48 classifier correctly classifies about 66% of the instances with 10-fold cross-validation testing and 66.59% with percentage split testing. The results are slightly better for the percentage split testing option.
Abeer Badr El Din Ahmed (2014) | Decision tree (ID3 algorithm) | Classification | The information gain of an attribute A, relative to a collection of samples S, is used to measure the best attribute for a particular node in the tree.
Xing Wanli (2015) | GP-ICRM for rule generation and workflow | Classification | Implemented EDM, theory and application to solve the problem of predicting students' performance in a CSCL learning environment with small datasets. The performance prediction model is evaluated using a GP algorithm.
Ashwin Satyanarayana (2016) | Bootstrap averaging | Classification and clustering | Single model: decision trees (J48) were used as the single filtering base model. Online bagging: implemented using Naive Bayes as the base model. Ensemble filtering: the proposed algorithm uses the classifiers J48, RandomForest and Naive Bayes.
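Table 2 notes that the CHAID model kept features with high chi-square values. As a hedged illustration of the statistic itself, the sketch below computes the Pearson chi-square value for a contingency table of one feature against a pass/fail outcome. The counts are invented, and the function is a generic textbook computation rather than the procedure used in [22].

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the feature and the outcome were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts: rows are feature values, columns are (pass, fail).
DEPENDENT = [[30, 10], [10, 30]]
INDEPENDENT = [[20, 20], [20, 20]]

CHI_DEPENDENT = chi_square(DEPENDENT)
CHI_INDEPENDENT = chi_square(INDEPENDENT)
```

A larger statistic indicates a stronger association between the feature and the outcome, which is why a threshold on it can serve as a feature screening rule. The sketch assumes every row and column total is non-zero.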
Dorina Kabakchieva followed the CRISP-DM (Cross-Industry
Standard Process for Data Mining) approach, a non-proprietary,
freely available and application-neutral standard for data
mining projects. The author compares the J48 decision tree
classifier with the NaiveBayes and BayesNet classifiers under
10-fold cross-validation and percentage split testing and
reports weighted averages. The same 10-fold cross-validation and
percentage split comparison was carried out for the k-NN
classifier (with k=100 and k=250) and for the OneR and JRip
classifiers [25]. Abeer Badr El Din Ahmed uses the decision tree
method, with the ID3 algorithm, to predict students' performance
[26]. Xing Wanli introduces student prediction measures based on
different rules and uses genetic operators for classification,
evaluating offspring to analyze student participation [27].
Ashwin Satyanarayana uses multiple classifiers, namely J48,
NaiveBayes and Random Forest, to predict student performance. He
also uses the K-means clustering algorithm to compute the
centroid (average) of each cluster of similar students [28].
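The centroid computation mentioned for K-means can be sketched as follows. This is a minimal version of the standard Lloyd iteration under stated assumptions, not the implementation of [28]: the two features (attendance percentage, exam score), the sample points and the fixed starting centroids are all hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, initial_centroids, max_iterations=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = [tuple(c) for c in initial_centroids]
    clusters = []
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            distances = [squared_distance(point, c) for c in centroids]
            clusters[distances.index(min(distances))].append(point)
        # Update step: move each centroid to the mean of its cluster.
        updated = []
        for centroid, cluster in zip(centroids, clusters):
            if cluster:
                updated.append(
                    tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                )
            else:
                updated.append(centroid)  # an empty cluster keeps its centroid
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


# Hypothetical (attendance %, exam score) pairs, invented for illustration.
STUDENT_POINTS = [(90, 85), (95, 80), (88, 90), (40, 35), (45, 30), (50, 40)]

CENTROIDS, CLUSTERS = k_means(STUDENT_POINTS, [(90, 85), (40, 35)])
```

Starting centroids are passed in explicitly so the result is reproducible; practical implementations usually choose them at random or with a seeding heuristic and rerun the algorithm several times.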
V. CONCLUSION
Educational data mining is a valuable research area that
benefits society by providing useful prediction techniques for
academicians, teachers and students. The papers discussed in
this survey give a detailed picture of educational data mining
and its core directions. The techniques and tools discussed will
give young educational data mining researchers a clear idea of
how to carry out their work in this field. This work has also
covered the areas in which the data mining process can be
applied to educational data in a better way. Finally, the
surveyed studies suggest that most classification algorithms
perform well in helping students as well as academicians
understand current trends in EDM.
References
[1] Sheard. J, Ceddia. J, Hurst. J & Tuovinen. J, “Inferring
student learning behaviour from website interactions: A
usage analysis”, Journal of Education and Information
Technologies, 2003, Vol: 8(3), pp. 245-266.
[2] Sheard. J, Ceddia. J, Hurst. J & Tuovinen. J, “Determining
website usage time from interactions: Data preparation and
analysis”, Journal of Educational Technology Systems,
2003, Vol: 32(1), pp.101-121.
[3] Muehlenbrock, Martin.,“Automatic action analysis in an
interactive learning environment”, Proceedings of the 12th
International Conference on Artificial Intelligence in
Education. 2005, pp.452-455.
[4] Anuradha. C and T. Velmurugan, “A Data Mining based
Survey on Student Performance Evaluation System”, IEEE
Int. Conference on Computational Intelligence and
Computing Research, 2014, pp. 452-455.
[5] Zaiane. O, & Luo. J, “Web usage mining for a better web-
based learning environment”, In Proceedings of
conference on advanced technology for education, Banff,
Alberta, 2001, pp. 60–64.
[6] Silva, D., & Vieira, M. “Using data warehouse and data
mining resources for ongoing assessment in distance
learning”. In IEEE international conference on advanced
learning technologies, Kazan, Russia, 2002, pp. 40–45.
[7] Shen, Ruimin, Fan Yang, and Peng Han. “Data analysis
center based on e-learning platform”, The Internet
Challenge: Technology and Applications. Springer
Netherlands, 2002, pp.19-28.
[8] Cristobal Romero, Sebastian Ventura, Paul de Bra &
Carlos de Castro, “Discovering prediction rules in AHA!
Courses”, International Conference on User Modeling,
Springer Berlin Heidelberg, 2003, pp.25-34.
[9] Tane, Julien, Christoph Schmitz, and Gerd Stumme,
“Semantic resource management for the web: an e-
learning application”, Proceedings of the 13th
international World Wide Web conference on Alternate
track papers & posters, ACM, 2004, pp. 1-10.
[10]Merceron, Agathe, and Kalina Yacef, “Tada-ed for
educational data mining”, Interactive multimedia
electronic journal of computer-enhanced learning, 2005,
Vol:7(1), pp: 267-287.
[11]Vanzin, Mariangela, Karin Becker, and Duncan Dubugras
Alcoba Ruiz , “Ontology-based filtering mechanisms for
web usage patterns retrieval”, International Conference on
Electronic Commerce and Web Technologies, Springer
Berlin Heidelberg, 2005, pp. 267-277.
[12]Avouris, N., Komis, V., Fiotakis, G., Margaritis, M. and
Voyiatzaki, E., “Logging of fingertip actions is not enough
for analysis of learning activities”, In 12th International
Conference on Artificial Intelligence in Education, AIED
05 Workshop1: Usage analysis in learning systems, 2005,
pp.1-8.
[13]Mazza. R , and Milani. C , “Exploring usage analysis in
learning systems: Gaining insights from visualizations”,
Workshop on usage analysis in learning systems at 12th
international conference on artificial intelligence in
education, 2005, pp. 65-72.
[14]Mostow. J, Beck. J, Cen. H, Cuneo. A, Gouvea. E, &
Heiner. C, “An educational data mining tool to browse
tutor-student interactions: Time will tell”, Proceedings of
the Workshop on Educational Data Mining, National
Conference on Artificial Intelligence. AAAI Press, 2005,
pp. 15-22.
[15]Damez. M, Marsala. C, Dang. T, & Bouchon-Meunier. B,
“Fuzzy decision tree for user modeling from human–
computer interactions”, In International conference on
human system learning: Who is in control?,2005, pp.287–
302.
[16]Bari. M, & Benzater. B, “Retrieving data from pdf
interactive multimedia productions”. In International
conference on human system learning: Who is in control?
,2005, pp.321–330.
[17]Qasem A. Al-Radaideh, Emad M. Al-Shawakfa, Mustafa
I. Al-Najjar, “Mining Student Data Using Decision Trees”,
The 2006 International Arab Conference on Information
Technology, Jordan,2006, pp.1-5.
[18]Romero. C, Alcala-Fdez. J, Sanchez. L, Garcia. S, del
Jesus. M. J, Ventura. S, Garrell. J. M,& Fernandez. J,
“KEEL: a software tool to assess evolutionary algorithms
for data mining problems.” Soft Computing, 2009,
Vol:13(3), pp. 307-318.
[19]Marquez-Vera. C, Romero. C, "Predicting School
Failure Using Data Mining", Educational Data
Mining, 2010, 2011.
[20]Dragan Gasevic, “Let’s not forget: Learning analytics
are about learning", Association for Educational
Communications and Technology, 2015, Vol: 59(1), pp.
64-71.
[21]Romero. C, Ventura. S, Espejo. P. G, & Hervas. C, “Data
mining algorithms to classify students”, Educational Data
Mining 2007, 2007, pp:1-10.
[22]Ramaswami. M, and R. Bhaskaran, “A CHAID based
performance prediction model in educational data
mining”, 2010, arXiv preprint arXiv:1002.1144.
[23]Edin Osmanbegovic & Mirza Suljic , “DATA MINING
APPROACH FOR PREDICTING STUDENT
PERFORMANCE”, Journal of Economics and Business,
2012, Vol: 10(1), pp. 3-12.
[24]Surjeet Kumar Yadav, Saurabh Pal, “Data Mining: A
Prediction for Performance Improvement of Engineering
Students using Classification”, World of Computer
Science and Information Technology Journal (WCSIT),
2012, Vol: 2(2), pp. 51-56.
[25]Dorina Kabakchieva, “Predicting Student Performance by
Using Data Mining Methods for Classification”,
Cybernetics and Information Technologies, 2013, Vol:
13(1), pp.61-72.
[26]Abeer Badr El Din Ahmed and Ibrahim Sayed Elaraby,
“Data Mining: A prediction for Student's Performance
Using Classification Method”, World Journal of Computer
Application and Technology, 2014, Vol: 2(2), pp. 43-47.
[27]Xing Wanli , Guo Rui, Petakovic Eva & Goggins Sean,
“Participation-based student final performance prediction
model through interpretable Genetic Programming:
Integrating learning analytics, Educational data mining
and theory”, Computers in Human Behavior, 2015, Vol:
47, pp. 168–181.
[28]Ashwin Satyanarayana, Gayathri Ravichandran, "Mining
Student data by Ensemble Classification and Clustering
for Profiling and Prediction of Student Academic
Performance", 2016 ASEE Mid-Atlantic Section
Conference, 2016.