The growing need for data-driven decision making has led to the application of data mining in many fields, including education, where it is known as educational data mining. Improving the performance of data mining models has also been identified as a gap for future research. In Nigeria, higher educational institutions collect a variety of student data, but these data are rarely used in decision or policy making to improve students’ academic performance. This research attempts to improve the performance of data mining models for predicting students’ academic performance using a stacking classifier ensemble and the synthetic minority over-sampling technique. The research was conducted by adopting and evaluating the J48, IBk and SMO classifiers. The individual classifier models, the standard stacking classifier ensemble model and the proposed modified stacking classifier ensemble model were trained and tested on a data set of 206 students from the Faculty of Science, Federal University Dutse. Students’ previous academic performance records, namely Unified Tertiary Matriculation Examination results, Senior Secondary Certificate Examination results and first-year Cumulative Grade Point Average, were used as inputs to the WEKA 3.9.1 data mining tool to predict students’ class of degree at graduation at undergraduate level. The results show that applying the synthetic minority over-sampling technique for class balancing improves the performance of all the models, with the proposed modified stacking classifier ensemble model outperforming the other classifier models in both accuracy and RMSE, making it the best model.
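The class-balancing step described above can be illustrated with a minimal sketch of the synthetic minority over-sampling idea: each synthetic record is placed on the line segment between a minority-class record and one of its nearest minority-class neighbours. This is a plain-Python illustration only, not the WEKA filter used in the study; the function name `smote`, its parameters and the sample records are assumptions made for the example.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)

    def dist2(a, b):
        # Squared Euclidean distance between two feature tuples
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: dist2(base, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> nb
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, nb)))
    return synthetic

# Hypothetical minority-class records: (entrance exam score, first-year CGPA)
minority = [(180.0, 1.9), (175.0, 2.1), (190.0, 2.0), (185.0, 1.8)]
new_points = smote(minority, n_new=6)
print(len(new_points), "synthetic records generated")
```

Because every synthetic record is an interpolation of two real minority records, it always stays inside the range of values already seen for that class.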
The main objective of this paper is to develop a basic prototype model that can determine and extract
previously unknown knowledge (patterns, concepts and relations) related to multiple factors from the past
database records of specific students. Data mining is the science and engineering of extracting previously
undiscovered patterns from a huge set of data. Data mining techniques are helpful for decision making as well
as for discovering patterns in data. In this paper, a student eligibility prediction system using rule-based
classification is proposed to predict the eligibility of students from their details with high prediction
accuracy. Educational institutes generate a tremendous amount of data. This paper outlines the idea of
predicting a particular student’s placement eligibility by performing operations on the stored data. An
efficient prediction algorithm based on a fuzzy technique is also proposed.
Predicting Success : An Application of Data Mining Techniques to Student Outc...IJDKP
This project examines the effectiveness of applying machine learning techniques to the realm of college
student success, specifically with the intent of discovering and identifying those student characteristics and
factors that show the strongest predictive capability with regards to successful graduation. The student
data examined consists of first time freshmen and transfer students who matriculated at California State
University San Marcos in the period of Fall 2000 through Fall 2010 and who either graduated successfully
or discontinued their education. Operating on over 30,000 student observations, random forests are used
to determine the relative importance of the student characteristics with genetic algorithms to perform
feature selection and pruning. To improve the machine learning algorithm cross validated hyperparameter
tuning was also implemented. Overall predictive strength is relatively high as measured by the
Matthews Correlation Coefficient, and both intuitive and novel features which provide support for the
learning model are explored.
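For readers unfamiliar with the metric named above, the Matthews Correlation Coefficient for a binary outcome (for example graduated versus discontinued) can be computed directly from the four confusion-matrix counts. The sketch below uses the standard formula; the function name `mcc` and the example counts are illustrative assumptions, not figures from the study.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts.
    Ranges from -1 (total disagreement) to +1 (perfect prediction)."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero
    return numerator / denominator if denominator else 0.0

# Hypothetical counts for a graduated / discontinued classifier
score = mcc(tp=90, tn=80, fp=20, fn=10)
print(round(score, 4))
```

Unlike plain accuracy, the coefficient uses all four cells of the confusion matrix, which makes it a reasonable choice when the two outcome classes are of unequal size.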
Clustering Students of Computer in Terms of Level of Programming Editor IJCATR
Educational data mining (EDM) is one of the applications of data mining. In educational data mining there are two key domains: the student domain and the faculty domain. Different types of research work have been done in both domains.
In the existing system, faculty performance is calculated on the basis of two parameters: student feedback and students’ results in the subject. The existing system defines two approaches, a multiple classifier approach and a single classifier approach, and compares them for relative evaluation of faculty performance using data mining techniques. In the multiple classifier approach, K-nearest neighbor (KNN) is used in the first step of classification and rule-based classification in the second step, while in the single classifier approach KNN is used in both steps.
In the proposed system, I will analyse faculty performance using four parameters: student complaints about faculty, student review feedback for faculty, student feedback, and student results.
For the proposed system I will use an opinion mining technique to analyse faculty performance and calculate a score for each faculty member.
Fuzzy Association Rule Mining based Model to Predict Students’ Performance IJECEIAES
The major aim of higher education institutions is to provide quality education to their students. One approach to achieving the highest level of quality in the higher education system is to discover knowledge for prediction regarding internal assessment and the end-semester examination. The proposed work approaches this objective by using a fuzzy inference technique to classify student score data according to level of performance. In this paper, students’ performance is evaluated using fuzzy association rule mining, which predicts their performance at the end of the semester on the basis of previous records such as attendance, mid-semester marks, previous semester marks and previous academic records collected from the student database. The aim is to identify students who need individual attention, to decrease the failure ratio, and to take suitable action for the next semester examination.
A Study on Learning Factor Analysis – An Educational Data Mining Technique fo...iosrjce
This study used multilevel analysis to determine the predictive value of selected intrinsic factors (gender,
computer ownership, mathematics background and computer experience) and institutional type (an
extrinsic factor) on undergraduates’ Self Efficacy in Java Computer Programming (SEiJCP) in South-West
Nigeria. The study adopted a correlational design. Purposive sampling was used to select 254
computer science undergraduates from four universities (three federal-owned and one state-owned) in
South-West Nigeria. Three research questions were answered. Two research instruments, namely the
Computer Experience Scale (r = 0.84) and the Java Programming Self Efficacy Scale (JPSES, r = 0.96), were
used to collect data. Data were analysed using descriptive statistics and null and linear growth model
(LGM) procedures. The intercorrelation coefficients among the extrinsic factor, intrinsic factors and
SEiJCP were moderate. The null model shows that the variation in SEiJCP accounted for by institutional-level
differences was 99.0%. The fixed part of the LGM of intrinsic factors showed that only mathematics
background contributed significantly (p < 0.05) to the prediction of SEiJCP. The random part of the LGM
showed no significant contributions of the interactions of the intrinsic factors to the prediction of SEiJCP.
About 60.0% of the student-level variation in SEiJCP is explained by differences in intrinsic factors.
The institution-level variable had large predictive value for programming self efficacy. Computer science
departments should increase the number of mathematics courses in their curriculum.
Educational Data Mining (EDM) is one of the crucial application areas of data mining. It helps in predicting educational dropout and hence in providing timely help to students. In the Indian context, predicting educational dropout is a major problem. By implementing EDM, we can predict the learning habits of students. At present EDM has not been introduced at higher education level, so we cannot recognize the genuine problems students face during their education. The objective of this analysis is to find the existing gaps in predicting educational dropout and to find the missing attributes, if any, which may further contribute to better prediction. After that, we try to find the best attributes and the data mining techniques most frequently used for dropout prediction. Based on the combination of missing attributes and the best attributes of student data identified so far, a new algorithm can be tested which may overcome the shortcomings of previous work.
Data Mining Application in Advertisement Management of Higher Educational Ins...ijcax
In recent years, competition among Indian higher educational institutes to attract students for enrollment has grown rapidly. To attract students, institutes must select the best advertisement method. Many advertisement options are available in the market, but choosing among them is very difficult for institutes. This paper helps institutes to select the best advertisement medium using data mining methods.
PREDICTING ACADEMIC MAJOR OF STUDENTS USING BAYESIAN NETWORKS TO THE CASE OF ...ijcax
This study took place in the current year in the city of Maragheh, Iran. High school students in the following fields of study were studied and compared: mathematics, experimental sciences, humanities, vocational, business and science. The purpose of this research is to predict the academic major of high school students using Bayesian networks. The factors that affect the choice of academic major have been used for the first time as indicators in Bayesian networks. Evaluation of the impact of the indicators on each other, data discretization and data processing were performed with GeNIe. The appropriate course can then be recommended to students for continuing their education.
Educational Data Mining is used to predict the future learning behavior of students. It remains a research topic for researchers who want to obtain better results in student prediction. The results of these techniques help teachers, management and administrators to draft new rules and policies for improving educational standards and hence overall results and student retention. With this in mind, work has been done to find the slow learners in a high school class and then provide timely help to improve their overall results. Many data mining techniques are available, but we select only those most often used by other researchers for result prediction: J48, REPTree, Naive Bayes, SMO and Multilayer Perceptron. On the collected dataset, the Multilayer Perceptron classification algorithm gives 87.43% accuracy when the whole dataset is used as the training dataset, and SMO and J48 give 69.00% accuracy under 10-fold cross-validation.
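The two evaluation modes mentioned above (testing on the training data versus 10-fold cross-validation) can give very different accuracy figures. The sketch below shows the mechanics with a deliberately simple 1-nearest-neighbour classifier in plain Python. It is an illustration under stated assumptions, not the WEKA procedure used in the study; the toy records and the helper names `one_nn_predict`, `accuracy` and `k_fold_accuracy` are made up for the example.

```python
def one_nn_predict(train, x):
    """Return the label of the training record closest to x."""
    nearest = min(train, key=lambda rec: sum((a - b) ** 2 for a, b in zip(rec[0], x)))
    return nearest[1]

def accuracy(train, test):
    """Fraction of test records whose predicted label matches the true one."""
    hits = sum(one_nn_predict(train, x) == y for x, y in test)
    return hits / len(test)

def k_fold_accuracy(data, k=10):
    """Average accuracy over k folds; each fold is held out exactly once."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        train = [rec for j, fold in enumerate(folds) if j != i for rec in fold]
        scores.append(accuracy(train, held_out))
    return sum(scores) / k

# Hypothetical records: ((feature_1, feature_2), label)
data = [((i, (i * 7) % 5), "slow" if i % 3 == 0 else "regular") for i in range(20)]

resub = accuracy(data, data)        # evaluated on the training data itself
cv = k_fold_accuracy(data, k=10)    # evaluated on held-out folds
print("training-set accuracy:", resub)
print("10-fold CV accuracy:", cv)
```

A 1-nearest-neighbour model always finds each training record as its own nearest neighbour, so its training-set accuracy is perfect by construction. This is why accuracy measured on the training data tends to be optimistic compared with a cross-validated estimate.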
With the emergence of virtualization and cloud computing technologies, many services are hosted on virtualization platforms. Virtualization is the technology that many cloud service providers rely on for efficient management and coordination of the resource pool. As essential services are also hosted on cloud platforms, it is necessary to ensure continuous availability by implementing all necessary measures. Windows Active Directory is one such service, developed by Microsoft for Windows domain networks. It is included in Windows Server operating systems as a set of processes and services for authentication and authorization of users and computers in a Windows domain network. The service is required to run continuously without downtime. As a result, errors or garbage may accumulate, leading to software aging, which in turn may lead to system failure and associated consequences. In this work, the software aging patterns of the Windows Active Directory service are studied. Software aging of Active Directory needs to be predicted properly so that rejuvenation can be triggered to ensure continuous service delivery. In order to predict the time accurately, a model that uses a time series forecasting technique is built.
STUDENTS’ PERFORMANCE PREDICTION SYSTEM USING MULTI AGENT DATA MINING TECHNIQUE IJDKP
High prediction accuracy of students’ performance helps to identify low-performing students at the beginning of the learning process. Data mining is used to attain this objective. Data mining techniques are used to discover models or patterns in data, and they are very helpful in decision making. Boosting is one of the most popular techniques for constructing ensembles of classifiers to improve classification accuracy. Adaptive Boosting (AdaBoost) is one generation of boosting algorithm. It is used for binary classification and is not directly applicable to multiclass classification. The SAMME boosting technique extends AdaBoost to multiclass classification without reducing it to a set of binary sub-classifications. In this paper, a students’ performance prediction system using Multi Agent Data Mining is proposed to predict the performance of students based on their data with high prediction accuracy and to provide help to low-performing students through optimization rules. The proposed system has been implemented and evaluated by investigating the prediction accuracy of the AdaBoost.M1 and LogitBoost ensemble classifier methods and of the C4.5 single classifier method. The results show that using the SAMME boosting technique improves the prediction accuracy and outperforms the C4.5 single classifier and LogitBoost.
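The difference between binary AdaBoost and its multiclass SAMME extension comes down to one extra term in the weight given to each weak classifier. The sketch below shows that weight and the accompanying re-weighting of training records, following the published SAMME update rule. It is an illustrative fragment, not the system described above; the function names and example numbers are assumptions.

```python
import math

def samme_alpha(error, n_classes):
    """Weight of a weak classifier under SAMME.
    With n_classes == 2 the extra term is log(1) = 0, which gives the
    binary AdaBoost weight."""
    return math.log((1.0 - error) / error) + math.log(n_classes - 1.0)

def reweight(weights, misclassified, alpha):
    """Boost the weight of misclassified records, then renormalise."""
    updated = [w * math.exp(alpha) if miss else w
               for w, miss in zip(weights, misclassified)]
    total = sum(updated)
    return [w / total for w in updated]

# Hypothetical round: 4 grade classes, weak classifier with 60% weighted error
alpha = samme_alpha(error=0.6, n_classes=4)
weights = reweight([0.25, 0.25, 0.25, 0.25], [True, False, False, True], alpha)
print(alpha, weights)
```

With four classes, a weak classifier only needs to beat random guessing (75% error) to receive a positive weight, whereas the binary formula would assign a negative weight to anything above 50% error.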
The Architecture of System for Predicting Student Performance based on the Da...Thada Jantakoon
The goals of this study are to develop the architecture of a system for predicting student performance based on data science approaches (SPPS-DSA Architecture) and to evaluate that architecture. The research process is divided into two stages: (1) context analysis and (2) development and assessment. The data are analyzed statistically by means of standardized deviations. The research findings suggest that the SPPS-DSA architecture consists of three key components: (i) data source, (ii) machine learning methods and attributes, and (iii) data science process. Overall, the SPPS-DSA architecture is rated at the highest level of appropriateness. Predicting student performance helps educators and students improve their teaching and learning processes. Work on predicting student performance using various analytical methods is reviewed here. Most researchers used CGPA and internal assessment as data sets. In terms of prediction methods, classification is widely used in educational data science. Among classification techniques, researchers most commonly used neural networks and decision trees to predict student performance.
Data Mining Techniques for School Failure and Dropout System Kumar Goud
Abstract: Data mining techniques are applied to predict school failure and student dropout. The method uses real data on middle-school students to predict failure and dropout. It implements white-box classification strategies such as induction rules and decision trees. A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences. It is a flowchart-like structure in which each internal node represents a "test" on an attribute, each branch represents the outcome of the test, and each leaf node represents a class label. The attributes are real information about students collected from schools in middle or secondary education. The paths from root to leaf represent classification rules. A decision tree consists of three kinds of nodes (decision nodes, chance nodes and end nodes) and is used specifically in decision analysis. These techniques are used to improve accuracy in predicting which students might fail or drop out, first by using all the available attributes and then by choosing the most effective attributes. Attribute selection is done using the WEKA tool.
Keywords: dataset, classification, clustering.
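The statement that each root-to-leaf path of a decision tree is a classification rule can be made concrete with a small hand-built tree. The sketch below stores the tree as nested dictionaries, classifies one record, and lists every path as a rule. The attribute names, values and class labels are hypothetical and chosen only for illustration; they are not taken from the study's data.

```python
# A hand-built tree: internal nodes test an attribute, leaves are class labels.
tree = {
    "attr": "attendance",
    "branches": {
        "low": "dropout",
        "high": {
            "attr": "prior_grade",
            "branches": {"fail": "dropout", "pass": "continue"},
        },
    },
}

def classify(node, record):
    """Follow the branch matching the record's value until a leaf is reached."""
    while isinstance(node, dict):
        node = node["branches"][record[node["attr"]]]
    return node

def rules(node, path=()):
    """Return every root-to-leaf path as (conditions, class_label)."""
    if not isinstance(node, dict):
        return [(path, node)]
    found = []
    for value, child in node["branches"].items():
        found.extend(rules(child, path + ((node["attr"], value),)))
    return found

for conditions, label in rules(tree):
    clause = " AND ".join(f"{a} = {v}" for a, v in conditions)
    print(f"IF {clause} THEN {label}")
```

Rules of this form are what makes the approach "white-box": a teacher can read each rule directly instead of trusting an opaque score.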
Predicting students' performance using id3 and c4.5 classification algorithms IJDKP
An educational institution needs to have an approximate prior knowledge of enrolled students to predict
their performance in future academics. This helps them to identify promising students and also provides
them an opportunity to pay attention to and improve those who would probably get lower grades. As a
solution, we have developed a system which can predict the performance of students from their previous
performances using concepts of data mining techniques under Classification. We have analyzed the data
set containing information about students, such as gender, marks scored in the board examinations of
classes X and XII, marks and rank in entrance examinations and results in first year of the previous batch
of students. By applying the ID3 (Iterative Dichotomiser 3) and C4.5 classification algorithms on this data,
we have predicted the general and individual performance of freshly admitted students in future
examinations.
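The two algorithms named above differ mainly in how they score a candidate attribute when growing the tree: ID3 uses information gain, while C4.5 uses gain ratio, which divides the gain by the entropy of the split itself. A minimal sketch of both measures follows; the records and attribute names are hypothetical and serve only to show the calculation.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def split_stats(rows, attr, target):
    """Return (information gain, gain ratio) for splitting rows on attr."""
    n = len(rows)
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    # Expected entropy of the target after the split
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy([r[target] for r in rows]) - remainder
    # Split information: entropy of the attribute's own value distribution
    split_info = entropy([r[attr] for r in rows])
    ratio = gain / split_info if split_info else 0.0
    return gain, ratio

# Hypothetical student records
rows = [
    {"gender": "F", "entrance": "high", "result": "pass"},
    {"gender": "M", "entrance": "high", "result": "pass"},
    {"gender": "F", "entrance": "high", "result": "pass"},
    {"gender": "M", "entrance": "high", "result": "pass"},
    {"gender": "F", "entrance": "low", "result": "fail"},
    {"gender": "M", "entrance": "low", "result": "fail"},
    {"gender": "F", "entrance": "low", "result": "fail"},
    {"gender": "M", "entrance": "low", "result": "fail"},
]

for attr in ("gender", "entrance"):
    gain, ratio = split_stats(rows, attr, "result")
    print(attr, round(gain, 3), round(ratio, 3))
```

In this toy data the entrance attribute separates the classes completely and gender carries no information, so a tree built with either measure would test the entrance attribute first.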
The role of education is critical to the development of any country, so it is the responsibility of every person to do something for the betterment of education. Taking this into consideration, we started working on the education system, which ranges from basic to higher education. Nowadays the education system generates a lot of data related to students. If we cannot analyze that data properly, it is useless. With the help of data mining techniques we can find hidden information in the data collected from different educational settings. With that information we can review our educational process or improve our education system. In this article we consider the case of engineering college students and try to predict their final result in advance. The prediction provides timely help to those students who are at risk of failing the final examination. Different data mining techniques are available, and we use J48, RandomForest and ADTree to predict students’ performance in their final examination. On the basis of this prediction we can decide whether a student will be promoted to the next year or not. With the help of the result we can improve the performance of students who are at risk of failing or who have been promoted. After the final result is declared, it is fed into the system and analysed for the next semester. The comparative results show that prediction helps to improve the overall results of weaker students.
Performance Evaluation of Feature Selection Algorithms in Educational Data Mi...IIRindia
Educational data mining (EDM) is a prominent field concerned with developing methods for exploring the unique and increasingly large-scale data that come from educational settings, and using those methods to better understand students and the settings in which they learn. Various studies, including a previous study by the authors, have shown that data mining techniques find widespread application in the educational decision-making process for improving the performance of students in higher educational institutions. Classification techniques assume significant importance in machine learning tasks and are mostly employed in prediction-related problems. In machine learning problems, feature selection techniques are used to reduce the number of attributes by removing redundant and irrelevant features from the dataset. The aim of this research work is to compare the performance of various feature selection techniques, using the WEKA tool, in predicting students’ performance in the final semester examination with different classification algorithms. In particular, J48, Naïve Bayes, Bayes Net, IBk, OneR and JRip are used in this research work. The dataset for the study was collected from the student performance reports of a private college in Tamil Nadu state, India. The effectiveness of the feature selection algorithms was compared across the six classifiers and the results are discussed. The results of this study show that the accuracy of IBk is 99.680%, which is found to be
PREDICTING SUCCESS: AN APPLICATION OF DATA MINING TECHNIQUES TO STUDENT OUTCOMES IJDKP
This project examines the effectiveness of applying machine learning techniques to the realm of college
student success, specifically with the intent of discovering and identifying those student characteristics and
factors that show the strongest predictive capability with regards to successful graduation. The student
data examined consists of first time freshmen and transfer students who matriculated at California State
University San Marcos in the period of Fall 2000 through Fall 2010 and who either graduated successfully
or discontinued their education. Operating on over 30,000 student observations, random forests are used
to determine the relative importance of the student characteristics with genetic algorithms to perform
feature selection and pruning. To improve the machine learning algorithm cross validated hyperparameter tuning was also implemented. Overall predictive strength is relatively high as measured by the
Matthews Correlation Coefficient, and both intuitive and novel features which provide support for the
learning model are explored.
CORRELATION BASED FEATURE SELECTION (CFS) TECHNIQUE TO PREDICT STUDENT PERFRO...IJCNCJournal
Education data mining is an emerging stream which helps in mining academic data for solving various
types of problems. One of the problems is the selection of a proper academic track. The admission of a
student in engineering college depends on many factors. In this paper we have tried to implement a
classification technique to assist students in predicting their success in admission in an engineering
stream. We have analyzed the data set containing information about students' academic as well as socio-demographic variables, with attributes such as family pressure, interest, gender, XII marks and CET rank
in entrance examinations and historical data of the previous batch of students. Feature selection is a process
for removing irrelevant and redundant features which will help improve the predictive accuracy of
classifiers. In this paper we first used the feature selection attribute algorithms Chi-square, InfoGain, and
GainRatio to predict the relevant features. Then we applied a fast correlation-based filter on the given
features. Later, classification is done using NBTree, MultilayerPerceptron, NaiveBayes and instance-based
k-nearest neighbor. Results showed a reduction in computational cost and time and an increase in predictive
accuracy for the student model.
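As a rough illustration of how an information-gain ranking of attributes works, the sketch below scores each categorical attribute by how much it reduces the entropy of the class label. It is a minimal sketch, not the paper's WEKA configuration; the attribute names and the four toy records are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on values."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels):
    """Return (attribute, gain) pairs sorted from most to least informative."""
    gains = {
        name: information_gain([row[name] for row in rows], labels)
        for name in rows[0]
    }
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy records: "interest" separates the classes, "gender" does not.
rows = [
    {"interest": "high", "gender": "F"},
    {"interest": "high", "gender": "M"},
    {"interest": "low", "gender": "F"},
    {"interest": "low", "gender": "M"},
]
labels = ["yes", "yes", "no", "no"]
ranking = rank_features(rows, labels)
```

Attributes with a gain close to zero are candidates for removal before training a classifier.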
Data Mining Application in Advertisement Management of Higher Educational Ins...ijcax
In recent years, competition among Indian higher educational institutes to attract student enrollment has grown rapidly. To attract students, educational institutes select the best advertisement method. Many advertisement media are available in the market, but selecting among them is very difficult
for institutes. This paper helps institutes select the best advertisement medium using data mining methods.
PREDICTING ACADEMIC MAJOR OF STUDENTS USING BAYESIAN NETWORKS TO THE CASE OF ...ijcax
This study took place in the current year in the city of Maragheh, Iran. A number of high school students in the fields of mathematics, experimental sciences, humanities, vocational, business and science were studied and compared. The purpose of this research is to predict the academic major of high school students using Bayesian networks. The factors that affect academic major selection have been used, for the first time, as indicators in Bayesian networks. Evaluation of the impact of the indicators on each other, data discretization and data processing were performed with GeNIe. The proper course can then be advised for students to continue their education.
Educational Data Mining is used to predict the future learning behavior of students. It remains an active research topic for researchers who want better prediction results. The results of these techniques help teachers, management, and administrators draft new rules and policies to improve educational standards and hence overall results and student retention. With this in mind, work has been done to find the slow learners in a high school class and provide them timely help to improve their overall results. Many data mining techniques are available, but we select only those most often used by other researchers for result prediction: J48, REPTree, Naive Bayes, SMO, and Multilayer Perceptron. On the collected dataset, the Multilayer Perceptron classification algorithm gives 87.43% accuracy when using the whole dataset as the training dataset, and SMO and J48 give 69.00% accuracy when using 10-fold cross validation.
With the emergence of virtualization and cloud computing technologies, several services are housed on virtualization platforms. Virtualization is the technology that many cloud service providers rely on for efficient management and coordination of the resource pool. As essential services are also housed on cloud platforms, it is necessary to ensure continuous availability by implementing all necessary measures. Windows Active Directory is one such service that Microsoft developed for Windows domain networks. It is included in Windows Server operating systems as a set of processes and services for authentication and authorization of users and computers in a Windows domain type network. The service is required to run continuously without downtime. As a result, errors or garbage may accumulate, leading to software aging, which in turn may lead to system failure and associated consequences. In this work, software aging patterns of the Windows Active Directory service are studied. Software aging of Active Directory needs to be predicted properly so that rejuvenation can be triggered to ensure continuous service delivery. In order to predict the time accurately, a model that uses a time series forecasting technique is built.
STUDENTS’ PERFORMANCE PREDICTION SYSTEM USING MULTI AGENT DATA MINING TECHNIQUEIJDKP
A high prediction accuracy of students' performance helps to identify low-performing students at the beginning of the learning process. Data mining is used to attain this objective. Data mining techniques are used to discover models or patterns in data, and they are very helpful in decision-making. Boosting is one of the most popular techniques for constructing ensembles of classifiers to improve classification accuracy. Adaptive Boosting (AdaBoost) is a generation of boosting algorithm. It is used for
binary classification and is not directly applicable to multiclass classification. The SAMME boosting
technique extends AdaBoost to multiclass classification without reducing it to a set of binary sub-classifications. In this paper, a students' performance prediction system using Multi Agent Data Mining is proposed to predict the performance of students based on their data with high prediction accuracy and to provide help to low-performing students through optimization rules. The proposed system has been implemented and evaluated by investigating the prediction accuracy of the AdaBoost.M1 and LogitBoost ensemble classifier methods and the C4.5 single classifier method. The results show that using the SAMME boosting technique improves the prediction accuracy and outperforms the
C4.5 single classifier and LogitBoost.
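The step that lets SAMME handle more than two classes is the extra log(K - 1) term in the weight given to each weak classifier. The sketch below shows only that weight computation, as a minimal illustration under the standard SAMME formulation; the function name is hypothetical and this is not the implementation evaluated in the paper.

```python
import math


def samme_alpha(error: float, n_classes: int) -> float:
    """Weight of one weak classifier in SAMME.

    error     -- weighted training error of the weak classifier, in (0, 1)
    n_classes -- number of classes K (K = 2 gives the binary AdaBoost weight)
    """
    if not 0.0 < error < 1.0:
        raise ValueError("error must lie strictly between 0 and 1")
    if n_classes < 2:
        raise ValueError("n_classes must be at least 2")
    return math.log((1.0 - error) / error) + math.log(n_classes - 1)


if __name__ == "__main__":
    # With three classes, a classifier with 60% error still beats random
    # guessing (about 66.7% error), so it receives a positive weight.
    print(samme_alpha(0.6, 3))
```

With K classes the weight stays positive as long as the weak classifier's error is below 1 - 1/K, that is, as long as it is better than random guessing among K classes rather than better than 50%.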
The Architecture of System for Predicting Student Performance based on the Da...Thada Jantakoon
The goals of this study are to develop the architecture of a system for predicting student performance based on data science approaches (SPPS-DSA Architecture) and to evaluate the SPPS-DSA Architecture. The research process is divided into two stages: (1) context analysis and (2) development and assessment. The data is analyzed statistically by means of standardized deviations. The research findings suggest that the SPPS-DSA architecture consists of three key components: (i) data source, (ii) machine learning methods and attributes, and (iii) data science process. The SPPS-DSA architecture is rated at the highest level of appropriateness overall. Predicting student performance helps educators and students improve their teaching and learning processes. Predicting student performance using various analytical methods is reviewed here. Most researchers used CGPA and internal assessment as data sets. In terms of prediction methods, classification is widely used in educational data science. Under classification techniques, researchers most commonly used neural networks and decision trees to predict student performance.
Data Mining Techniques for School Failure and Dropout SystemKumar Goud
Abstract: Data mining techniques are applied to predict school failure and dropout of students. The method uses real data on middle-school students to predict failure and dropout. It implements white-box classification strategies, such as induction rules and decision trees. A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences. A decision tree is a flowchart-like structure in which each internal node represents a "test" on an attribute, each branch represents the outcome of the test and each leaf node represents a class label. The attributes are real information about students collected from the school at the middle or secondary level. The paths from root to leaf represent classification rules. A decision tree consists of three kinds of nodes: decision nodes, chance nodes and end nodes. It is used specifically in decision analysis. These techniques are used to improve accuracy in predicting which students might fail or drop out (idler), first by using all the available attributes and next by choosing the most effective attributes. Attribute selection is done using the WEKA tool.
Keywords: dataset, classification, clustering.
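To make the tree structure in the abstract above concrete, the sketch below stores a small tree as nested dictionaries and reads off one classification rule per root-to-leaf path. It is only an illustration of why such a model is called "white-box"; the attribute names, outcomes and class labels are hypothetical and are not taken from the paper's data.

```python
def extract_rules(node, path=()):
    """Return one (conditions, class_label) pair per root-to-leaf path.

    An internal node is {"attribute": name, "branches": {outcome: child}};
    a leaf node is {"label": class_label}.
    """
    if "label" in node:
        return [(list(path), node["label"])]
    rules = []
    for outcome, child in node["branches"].items():
        condition = (node["attribute"], outcome)
        rules.extend(extract_rules(child, path + (condition,)))
    return rules


# Hypothetical tree: each internal node tests one student attribute.
tree = {
    "attribute": "attendance",
    "branches": {
        "low": {"label": "fail"},
        "high": {
            "attribute": "prior_grade",
            "branches": {
                "low": {"label": "fail"},
                "high": {"label": "pass"},
            },
        },
    },
}

rules = extract_rules(tree)
for conditions, label in rules:
    tests = " AND ".join(f"{attr} = {value}" for attr, value in conditions)
    print(f"IF {tests} THEN {label}")
```

Each printed line is a rule that a teacher or administrator can read directly, which is the practical appeal of rule and tree models for this kind of prediction.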
Predicting students' performance using id3 and c4.5 classification algorithmsIJDKP
An educational institution needs to have an approximate prior knowledge of enrolled students to predict
their performance in future academics. This helps them to identify promising students and also provides
them an opportunity to pay attention to and improve those who would probably get lower grades. As a
solution, we have developed a system which can predict the performance of students from their previous
performances using concepts of data mining techniques under Classification. We have analyzed the data
set containing information about students, such as gender, marks scored in the board examinations of
classes X and XII, marks and rank in entrance examinations and results in first year of the previous batch
of students. By applying the ID3 (Iterative Dichotomiser 3) and C4.5 classification algorithms on this data,
we have predicted the general and individual performance of freshly admitted students in future
examinations.
The role of education is critical to the development of any country, so it is the responsibility of every person to do something for the betterment of education. Taking this into consideration, we started working on the education system, which ranges from basic to higher education. Nowadays the education system generates a lot of data related to students. If we cannot analyze that data properly, it is useless. With the help of data mining techniques we can find the hidden information in the data collected from different educational settings. With that information we can review our educational process or make improvements in our education system. In this article we consider the case of engineering college students and try to predict their final results in advance. The result of the prediction provides timely help to those students who are at risk of failure in the final examination. Different data mining techniques are available, and we use J48, RandomForest, and ADTree to predict the performance of students in their final examination. On the basis of this prediction we can decide whether a student will be promoted to the next year or not. With the help of the result we can improve the performance of students who are at risk of failing or of being promoted. After the declaration of the students' final results, the results are fed into the system and analysed for the next semester. The comparative result shows that prediction helps to improve the overall results of the weaker students.
Performance Evaluation of Feature Selection Algorithms in Educational Data Mi...IIRindia
Educational Data Mining (EDM) is a prominent field concerned with developing methods for exploring the unique and increasingly large-scale data that come from educational settings and using those methods to better understand students and the settings in which they learn. It has been shown in various studies, including a previous study by the authors, that data mining techniques find widespread application in the educational decision-making process for improving the performance of students in higher educational institutions. Classification techniques assume significant importance in machine learning tasks and are mostly employed in prediction-related problems. In machine learning problems, feature selection techniques are used to reduce the number of attributes by removing redundant and irrelevant features from the dataset. The aim of this research work is to compare, using the WEKA tool, the performance of various feature selection techniques in the prediction of students' performance in the final semester examination using different classification algorithms. In particular, J48, Naïve Bayes, Bayes Net, IBk, OneR, and JRip are used in this research work. The dataset for the study was collected from the students' performance reports of a private college in the state of Tamil Nadu, India. The effectiveness of various feature selection algorithms was compared with six classifiers and the results are discussed. The results of this study show that the accuracy of IBk is 99.680%, which is found to be
Multiple educational data mining approaches to discover patterns in universit...IJICTJOURNAL
This paper presented the utilization of pattern discovery techniques by using multiple relationships and clustering educational data mining approaches to establish a knowledge base that will aid in the prediction of ideal college program selection and enrollment forecasting for incoming freshmen. Results show a significant level of accuracy in predicting college programs for students by mining two years of student college admission and graduation final grade scholastic records. The results of educational predictive data mining methods can be applied in improving the services of the admission department of an educational institution, particularly in its course alignment, student mentoring, admission forecast, marketing, and enrollment preparedness.
A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...Editor IJCATR
Higher learning institutions nowadays operate in a more complex and competitive environment due to high demand from prospective
students and a growing number of universities, both public and private. Management of universities faces challenges and concerns in
predicting students’ academic performance in order to put mechanisms in place early enough for their improvement. This research aims at
employing Decision tree and K-means data mining algorithms to model an approach to predict the performance of students in advance
so as to devise mechanisms of alleviating student dropout rates and improving performance. In Kenya, for example, there has been
an increase in students enrolling in universities since the Government started free primary education. Therefore the Government
expects an increased workforce of professionals from these institutions without compromising quality so as to achieve its millennium
development and vision 2030. The backlog of students not finishing their studies in the stipulated time due to poor performance is another
issue that can be addressed from the results of this research, since predicting student performance in advance will enable University
management to devise ways of assisting weak students and even make more decisions on how to select students for particular courses.
Previous studies in Educational Data Mining have mostly focused on factors affecting students’ performance and have used
different algorithms to predict students’ performance. In all this research, accuracy of prediction is key and is what researchers
seek to improve.
Data Mining Techniques in Higher Education an Empirical Study for the Univer...IJMER
Nowadays, one of the biggest challenges that educational institutions face is the explosive
growth of educational data and how to use these data to improve the quality of managerial decisions.
Data mining, as an analytical tool that can be used to extract meaningful knowledge from large data
sets, can be used to achieve this goal.
This paper addresses the applications of Educational Data Mining (EDM) to extract useful information
from the registration information of students at the University of Palestine in the Gaza Strip. The data cover a five
year period [2005-2011]. The paper provides an analytical tool to view and use this information for decision
making processes, taking real-life examples such as grades and GPA for the students.
Using ID3 Decision Tree Algorithm to the Student Grade Analysis and Predictionijtsrd
Data mining techniques play an important role in data analysis. To construct a classification model that can predict the performance of students, particularly for engineering branches, a decision tree algorithm associated with data mining techniques has been used in this research. A number of factors may affect the performance of students. Data mining technology relates well to student grades, and we also used classification algorithms for prediction. In this paper, we used educational data mining to predict students' final grades based on their performance. We proposed student data classification using the ID3 (Iterative Dichotomiser 3) decision tree algorithm.
Khin Khin Lay | San San Nwe, "Using ID3 Decision Tree Algorithm to the Student Grade Analysis and Prediction", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5, August 2019. URL: https://www.ijtsrd.com/papers/ijtsrd26545.pdf Paper URL: https://www.ijtsrd.com/computer-science/data-miining/26545/using-id3-decision-tree-algorithm-to-the-student-grade-analysis-and-prediction/khin-khin-lay
ISSN 2454-535X
International Journal of Mechanical Engineering Research and Technology aims to provide the best possible service to authors of original research articles, and the fairest system of peer review.
The International Journal of Mechanical Engineering Research and Technology is an international online journal in English published Quarterly. This offers a fast publication schedule whilst maintaining rigorous peer review; the use of recommended electronic formats for article delivery expedites the process. All submitted research articles are subjected to immediate rapid screening by the editors, in consultation with the Editorial Board or others working in the field as appropriate, to ensure they are likely to be of the level of interest and importance appropriate for the journal.
In our homes and offices, security has been a vital issue. Remote control of a home security system offers large advantages, such as arming or disarming alarms, video monitoring, and energy management control, apart from keeping the home free of intruders. The oldest simple method of security is the mechanical lock system, which has a key as the authentication element; this was later upgraded to a universal type, and now to unique codes for the lock. Recent advances in communication systems have brought communication gadgets into many areas of life. This work is a real-time smart doorbell notification system for home security, as opposed to the traditional security methods. It is composed of a doorbell interfaced with a GSM module. Pressing the doorbell triggers the GSM module to send an SMS to the house owner; the owner responds to the guest by pressing a button to open the door, otherwise a message is displayed to the guest for appropriate action. A keypad is provided for authorized persons to enter a password to unlock the door; if multiple wrong password attempts are made, a message reporting a burglary attempt is sent to the house owner for prompt action. The main benefit of this system is its combination of password and messaging systems, which denies access to any unauthorized person and keeps the owner informed.
Augmented reality, the new age technology, has widespread applications in every field imaginable. This technology has proven to be an inflection point in numerous verticals, improving lives and improving performance. In this paper, we explore the various possible applications of Augmented Reality (AR) in the field of medicine. The reason for using AR in medicine, or in any field, is that AR helps motivate the user, makes sessions interactive and assists in faster learning. In this paper, we discuss the applicability of AR in the field of medical diagnosis. Augmented reality technology reinforces remote collaboration, allowing doctors to diagnose patients from a different locality. Additionally, we believe that a much more pronounced effect can be achieved by bringing together the cutting-edge technology of AR and the lifesaving field of medical sciences. AR is a mechanism that could be applied in the learning process too. Similarly, virtual reality could be used in fields where more practical experience is needed, such as driving, sports, and neonatal care training.
Image fusion is a subfield of image processing in which more than one image is fused to create an image where all the objects are in focus. The process of image fusion is performed for multi-sensor and multi-focus images of the same scene. Multi-sensor images of the same scene are captured by different sensors whereas multi-focus images are captured by the same sensor. In multi-focus images, the objects in the scene which are closer to the camera are in focus and the farther objects are blurred. Conversely, when the farther objects are in focus, the closer objects are blurred in the image. To achieve an image where all the objects are in focus, image fusion is performed either in the spatial domain or in a transformed domain. In recent times, the applications of image processing have grown immensely. Usually, due to the limited depth of field of optical lenses, especially those with greater focal length, it becomes impossible to obtain an image where all the objects are in focus. Image fusion thus plays an important role in performing other image processing tasks such as image segmentation, edge detection, stereo matching and image enhancement. Hence, a novel feature-level multi-focus image fusion technique has been proposed which fuses multi-focus images. The results of extensive experimentation performed to highlight the efficiency and utility of the proposed technique are presented. The proposed work further explores a comparison between fuzzy based image fusion and a neuro-fuzzy fusion technique, along with quality evaluation indices.
Graphs have become the dominant life-form of many tasks as they advance a
structure to represent many tasks and the corresponding relations. A powerful
role of networks/graphs is to bridge local feats that exist in vertices as they
blossom into patterns that help explain how nodal relations and their edges
impact a complex effect that ripples via a graph. User clusters are formed as a
result of interactions between entities. Many users can hardly categorize their
contacts into groups today such as “family”, “friends”, “colleagues” etc. Thus,
the need to analyze such user social graphs via implicit clusters enables
dynamism in contact management. This study seeks to implement this dynamism
via a comparative study of a deep neural network and a friend-suggest algorithm.
We analyze a user’s implicit social graph and seek to automatically create
custom contact groups using metrics that classify such contacts based on a
user’s affinity to contacts. Experimental results demonstrate the importance
of both the implicit group relationships and the interaction-based affinity in
suggesting friends.
In this paper, the Gryllidae Optimization Algorithm (GOA) is applied to solve the optimal reactive power problem. The proposed GOA approach is based on the chirping characteristics of Gryllidae. In general, male Gryllidae chirp, although some female Gryllidae do as well. Male Gryllidae attract females with the sound they produce. Moreover, they warn other Gryllidae of dangers with this sound. The hearing organs of the Gryllidae are housed in an expansion of their forelegs. Through this, they respond to the fluttering sounds produced. The proposed Gryllidae Optimization Algorithm (GOA) has been tested on the standard IEEE 14 and 30 bus test systems, and simulation results show that the proposed algorithm reduces the real power loss considerably.
In the wake of the sudden replacement of wood and kerosene by gas cookers for several purposes in Nigeria, gas leakage has caused several damages in our homes, Laboratories among others. installation of a gas leakage detection device was globally inspired to eliminate accidents related to gas leakage. We present an alternative approach to developing a device that can automatically detect and control gas leakages and also monitor temperature. The system detects the leakage of the LPG (Liquefied Petroleum Gas) using a gas sensor, then triggred the control system response which employs ventilator system, Mobile phone alert and alarm when the LPG concentration in the air exceeds a certain level. The performance of two gas sensors (MQ5 and MQ6) were tested for a guided decision. Also, when the temperature of the environment poses a danger, LED (indicator), buzzer and LCD (16x2) display was used to indicate temperature and gas leakage status in degree Celsius and PPM respectively. Attension was given to the response time of the control system, which was ascertained that this system significantly increases the chances and efficiency of eliminating gas leakage related accident.
Feature selection problem is one of the main important problems in the text and data mining domain. This paper presents a comparative study of feature selection methods for Arabic text classification. Five of the feature selection methods were selected: ICHI square, CHI square, Information Gain, Mutual Information and Wrapper. It was tested with five classification algorithms: Bayes Net, Naive Bayes, Random Forest, Decision Tree and Artificial Neural Networks. In addition, Data Collection was used in Arabic consisting of 9055 documents, which were compared by four criteria: Precision, Recall, F-measure and Time to build model. The results showed that the improved ICHI feature selection got almost all the best results in comparison with other methods.
In this paper Gentoo Penguin Algorithm (GPA) is proposed to solve optimal reactive power problem. Gentoo Penguins preliminary population possesses heat radiation and magnetizes each other by absorption coefficient. Gentoo Penguins will move towards further penguins which possesses low cost (elevated heat concentration) of absorption. Cost is defined by the heat concentration, distance. Gentoo Penguins penguin attraction value is calculated by the amount of heat prevailed between two Gentoo penguins. Gentoo Penguins heat radiation is measured as linear. Less heat is received in longer distance, in little distance, huge heat is received. Gentoo Penguin Algorithm has been tested in standard IEEE 57 bus test system and simulation results show the projected algorithm reduced the real power loss considerably.
08 20272 academic insight on applicationIAESIJEECS
This research has thrown up many questions in need of further investigation.There was an expressive quantitative-qualitative research, which a common investigation form was used in.The dialogue item was also applied to discover if the contributors asserted the media-based attitude supplements their learning of academic English writing classes or not.Data recounted academic” insights toward using Skype as a sustaining implement for lessons releasing based on chosen variables: their occupation, year of education, and knowledge with Skype discovered that there were no important statistical differences in the use of Skype units because of medical academics major knowledge. There are statistically important differences in using Skype units. The findings also, disclosed that there are statistically significant differences in using Skype units due to the practice with Skype variable, in favors of academics with no Skype use practice. Skype instrument as an instructive media is a positive medium to be employed to supply academic medical writing data and assist education. Academics who do not have enough time to contribute in classes believe comfortable using the Skype-based attitude in scientific writing. They who took part in the course claimed that their approval of this media is due to learning academic innovative medical writing.
Cloud computing has sweeping impact on the human productivity. Today it’s used for Computing, Storage, Predictions and Intelligent Decision Making, among others. Intelligent Decision-Making using Machine Learning has pushed for the Cloud Services to be even more fast, robust and accurate. Security remains one of the major concerns which affect the cloud computing growth however there exist various research challenges in cloud computing adoption such as lack of well managed service level agreement (SLA), frequent disconnections, resource scarcity, interoperability, privacy, and reliability. Tremendous amount of work still needs to be done to explore the security challenges arising due to widespread usage of cloud deployment using Containers. We also discuss Impact of Cloud Computing and Cloud Standards. Hence in this research paper, a detailed survey of cloud computing, concepts, architectural principles, key services, and implementation, design and deployment challenges of cloud computing are discussed in detail and important future research directions in the era of Machine Learning and Data Science have been identified.
Notary is an official authorized to make an authentic deed regarding all deeds, agreements and stipulations required by a general rule. Activities carried out at the notary office such as recording client data and file data still use traditional systems that tend to be manual. The problem that occurs is the inefficiency in data processing and providing information to clients. Clients have difficulty getting information related to the progress of documents that are being taken care of at the notary's office. The client must take the time to arrive to the notary's office repeatedly to check the progress of the work of the document file. The purpose of this study is to facilitate clients in obtaining information about the progress of the work in progress, and make it easier for employees to process incoming documents by implementing an administrative system. This system was developed with the waterfall system development method and uses the Multi-Channel Access Technology integrated in the website to simplify the process of delivering information and requesting information from clients and to clients with Telegram and SMS Gateway. Clients will come to the office only when there is a notification from the system via Telegram or SMS notifying that the client must come directly to the notary's office, thus leading to an efficient time and avoiding excessive transportation costs. The overall functional system can function properly based on the results of alpha testing. The results of beta testing conducted by distributing the system feasibility test questionnaire to end users, get a percentage of 96% of users agree the system is feasible to be implemented.
In this work Tundra wolf algorithm (TWA) is proposed to solve the optimal reactive power problem. In the projected Tundra wolf algorithm (TWA) in order to avoid the searching agents from trapping into the local optimal the converging towards global optimal is divided based on two different conditions. In the proposed Tundra wolf algorithm (TWA) omega tundra wolf has been taken as searching agent as an alternative of indebted to pursue the first three most excellent candidates. Escalating the searching agents’ numbers will perk up the exploration capability of the Tundra wolf wolves in an extensive range. Proposed Tundra wolf algorithm (TWA) has been tested in standard IEEE 14, 30 bus test systems and simulation results show the proposed algorithm reduced the real power loss effectively.
In this work Predestination of Particles Wavering Search (PPS) algorithm has been applied to solve optimal reactive power problem. PPS algorithm has been modeled based on the motion of the particles in the exploration space. Normally the movement of the particle is based on gradient and swarming motion. Particles are permitted to progress in steady velocity in gradient-based progress, but when the outcome is poor when compared to previous upshot, immediately particle rapidity will be upturned with semi of the magnitude and it will help to reach local optimal solution and it is expressed as wavering movement. In standard IEEE 14, 30, 57,118,300 bus systems Proposed Predestination of Particles Wavering Search (PPS) algorithm is evaluated and simulation results show the PPS reduced the power loss efficiently.
In this paper, Mine Blast Algorithm (MBA) has been intermingled with Harmony Search (HS) algorithm for solving optimal reactive power dispatch problem. MBA is based on explosion of landmines and HS is based on Creativeness progression of musicians-both are hybridized to solve the problem. In MBA Initial distance of shrapnel pieces are reduced gradually to allow the mine bombs search the probable global minimum location in order to amplify the global explore capability. Harmony search (HS) imitates the music creativity process where the musicians supervise their instruments’ pitch by searching for a best state of harmony. Hybridization of Mine Blast Algorithm with Harmony Search algorithm (MH) improves the search effectively in the solution space. Mine blast algorithm improves the exploration and harmony search algorithm augments the exploitation. At first the proposed algorithm starts with exploration & gradually it moves to the phase of exploitation. Proposed Hybridized Mine Blast Algorithm with Harmony Search algorithm (MH) has been tested on standard IEEE 14, 300 bus test systems. Real power loss has been reduced considerably by the proposed algorithm. Then Hybridized Mine Blast Algorithm with Harmony Search algorithm (MH) tested in IEEE 30, bus system (with considering voltage stability index)- real power loss minimization, voltage deviation minimization, and voltage stability index enhancement has been attained.
Artificial Neural Networks have proved their efficiency in a large number of research domains. In this paper, we have applied Artificial Neural Networks on Arabic text to prove correct language modeling, text generation, and missing text prediction. In one hand, we have adapted Recurrent Neural Networks architectures to model Arabic language in order to generate correct Arabic sequences. In the other hand, Convolutional Neural Networks have been parameterized, basing on some specific features of Arabic, to predict missing text in Arabic documents. We have demonstrated the power of our adapted models in generating and predicting correct Arabic text comparing to the standard model. The model had been trained and tested on known free Arabic datasets. Results have been promising with sufficient accuracy.
In the present-day communications speech signals get contaminated due to
various sorts of noises that degrade the speech quality and adversely impacts
speech recognition performance. To overcome these issues, a novel approach
for speech enhancement using Modified Wiener filtering is developed and
power spectrum computation is applied for degraded signal to obtain the
noise characteristics from a noisy spectrum. In next phase, MMSE technique
is applied where Gaussian distribution of each signal i.e. original and noisy
signal is analyzed. The Gaussian distribution provides spectrum estimation
and spectral coefficient parameters which can be used for probabilistic model
formulation. Moreover, a-priori-SNR computation is also incorporated for
coefficient updation and noise presence estimation which operates similar to
the conventional VAD. However, conventional VAD scheme is based on the
hard threshold which is not capable to derive satisfactory performance and a
soft-decision based threshold is developed for improving the performance of
speech enhancement. An extensive simulation study is carried out using
MATLAB simulation tool on NOIZEUS speech database and a comparative
study is presented where proposed approach is proved better in comparison
with existing technique.
Previous research work has highlighted that neuro-signals of Alzheimer’s disease patients are least complex and have low synchronization as compared to that of healthy and normal subjects. The changes in EEG signals of Alzheimer’s subjects start at early stage but are not clinically observed and detected. To detect these abnormalities, three synchrony measures and wavelet-based features have been computed and studied on experimental database. After computing these synchrony measures and wavelet features, it is observed that Phase Synchrony and Coherence based features are able to distinguish between Alzheimer’s disease patients and healthy subjects. Support Vector Machine classifier is used for classification giving 94% accuracy on experimental database used. Combining, these synchrony features and other such relevant features can yield a reliable system for diagnosing the Alzheimer’s disease.
Attenuation correction designed for PET/MR hybrid imaging frameworks along with portion making arrangements used for MR-based radiation treatment remain testing because of lacking high-energy photon weakening data. We present a new method so as to uses the learned nonlinear neighborhood descriptors also highlight coordinating toward foresee pseudo-CT pictures starting T1w along with T2w MRI information. The nonlinear neighborhood descriptors are acquired through anticipating the direct descriptors interested in the nonlinear high-dimensional space utilizing an unequivocal constituent guide also low-position guess through regulated complex regularization. The nearby neighbors of every near descriptor inside the data MR pictures are looked during an obliged spatial extent of the MR pictures among the training dataset. By that point, the pseudo-CT patches are evaluated through k-closest neighbor relapse. The planned procedure designed for pseudo-CT forecast is quantitatively broke downward on top of a dataset comprising of coordinated mind MRI along with CT pictures on or after 13 subjects.
The cognitive radio prototype performance is to alleviate the scarcity of spectral resources for wireless communication through intelligent sensing and quick resource allocation techniques. Secondary users (SU’s) actively obtain the spectrum access opportunity by supporting primary users (PU’s) in cognitive radio networks (CRNs). In present generation, spectrum access is endowed through cooperative communication-based link-level frame-based cooperative (LLC) principle. In this SUs independently act as conveyors for PUs to achieve spectrum access opportunities. Unfortunately, this LLC approach cannot fully exploit spectrum access opportunities to enhance the throughput of CRNs and fails to motivate PUs to join the spectrum sharing processes. Therefore, to overcome this con, network level cooperative (NLC) principle was used, where SUs are integrated mutually to collaborate with PUs session by session, instead of frame based cooperation for spectrum access opportunities. NLC approach has justified the challenges facing in LLC approach. In this paper we make a survey of some models that have been proposed to tackle the problem of LLC. We show the relevant aspects of each model, in order to characterize the parameters that we should take in account to achieve a spectrum access opportunity.
In this paper, the author provides insights and lessons that can be learned from colleagues at American universities about their online education experiences. The literature review and previous studies of online educations gains are explored and summarized in this research. Emerging trends in online education are discussed in detail, and strategies to implement these trends are explained. The author provides several tools and strategies that enable universities to ensure the quality of online education. At the end of this research paper, the researcher provides examples from Arab universities who have successfully implemented online education and expanded their impact on the society. This research provides a strategy and a model that can be used by universities in the Middle East as a roadmap to implement online education in their regions.
Int J Inf & Commun Technol ISSN: 2252-8776
Classifiers ensemble and synthetic minority oversampling techniques for academic … (Abdulazeez Yusuf)
technique for solving problems related to students’ success using data originating from educational
environments. Various data mining techniques have been implemented in research studies on educational data
mining. The work of [4] grouped the methods used in educational data mining into five
categories: classification, clustering, relationship mining, discovery with models, and
distillation of data for human judgment. These data mining methods have been applied in many research
works and were reported to perform better than other methods. A cross-validation test result in [5]
indicates that data mining techniques predicted significantly better than their statistical counterparts. Therefore,
[6] suggested that, with the increasing need for data mining and analysis, there is a need to improve the
performance of data mining models and machine learning algorithms.
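The cross-validation comparison mentioned above can be illustrated with a minimal sketch. It is not the procedure used in [5]. It only shows, under simplifying assumptions, how k-fold cross-validation estimates accuracy by holding out each fold in turn. The function names (`k_fold_indices`, `cross_validated_accuracy`), the majority-class baseline, and the toy pass/fail labels are all hypothetical, and the folds are contiguous rather than shuffled or stratified.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        # The first (n_samples % k) folds receive one extra sample.
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_accuracy(X, y, k, train, predict):
    """Train on k-1 folds, test on the held-out fold, and pool the correct predictions."""
    correct = 0
    for test_idx in k_fold_indices(len(y), k):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(y)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        model = train(train_X, train_y)
        correct += sum(predict(model, X[i]) == y[i] for i in test_idx)
    return correct / len(y)


# Hypothetical baseline: always predict the most frequent training label.
def train_majority(X, y):
    return Counter(y).most_common(1)[0][0]


def predict_majority(model, x):
    return model


# Toy data (illustrative only): 8 "pass" records followed by 2 "fail" records.
X = list(range(10))
y = ["pass"] * 8 + ["fail"] * 2
accuracy = cross_validated_accuracy(X, y, 5, train_majority, predict_majority)
```

Any classifier can be substituted by passing different `train` and `predict` functions. In practice the records would normally be shuffled or stratified before the folds are formed.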
2. RELATED STUDIES
In recent times, various research studies have been conducted on predicting students’ academic
performance using data mining techniques and machine learning algorithms. The study of [7] adopted
only two classifier algorithms to predict student dropout. It used three
different datasets containing student attributes such as nationality, sex, city of
residence, high school grades, program enrolled, number of credits earned in the first year of study and average
grade in the first year of study. The results indicate that the J48 decision tree algorithm is more accurate than
the Naïve Bayes classifier algorithm, with an accuracy of 81.1679%.
Meanwhile, [8] used more classifier algorithms, namely Neural Network (NN), Decision Tree,
Support Vector Machine (SVM), K-nearest neighbor (KNN), Naïve Bayes and Rule Based, to predict learners’
progression in tertiary education. The findings indicate that SVM had the highest
accuracy, 73.33%, and that Logistic Regression had the lowest, 60.05%.
Only psychometric factors related to students were considered in that research. According to [2], however, the academic
performance of a student is not the result of a single deciding factor; it hinges heavily on various factors
such as personal, socio-economic, psychological and other environmental variables.
The work of [9] made use of three classifiers: Naïve Bayes, Decision Tree and Neural
Network. In the study, continuous attributes were discretized using optimal equal-width binning, and the Synthetic
Minority Over-Sampling Technique (SMOTE) was used to increase the volume of data because the
data acquired during preprocessing contained few instances. Neural Network and Naïve Bayes were reported
to be more accurate than Decision Tree for classification when optimal equal-width binning and
SMOTE were applied to the data. Both classifiers achieved an accuracy of 71.6%;
however, the Neural Network algorithm was found to be slower than the Naïve Bayes algorithm,
making the Naïve Bayes model preferable to the Neural Network model. The work of [2] focused on identifying the slow
learners among students and presenting them through a predictive data mining model using classification-based
algorithms (Multilayer Perceptron, Naïve Bayes, SMO, J48 and REPTree). The work shows that Multilayer
Perceptron has the highest accuracy, 75%, and that REPTree has the lowest, 67.76%.
A comparative analysis of three selected classification algorithms, Decision Tree (DT), Naïve Bayes
(NB) and Rule Based (RB), was conducted by [10] to predict students’ academic performance. The analysis
was done to discover the best technique for developing a predictive model of the first-semester academic
performance of first-year Bachelor of Computer Science students at Universiti Sultan Zainal
Abidin. The Rule Based classifier was found to be the best model among the classifiers, achieving
the highest performance accuracy of 71.3%. The model in that study does not provide detailed
information about the students’ performance: it predicts only the first-year performance of students, not the
graduation performance, and classifies students’ performance into poor, average and good. The research on
early prediction of students’ Grade Point Average (GPA) by [11] also showed that the support vector machine
(SVM) classifier is more accurate than the extreme learning machine and neural
network methods, with a performance accuracy of 93.06% when the second-year GPA of students is considered and
97.98% when the third-year GPA is considered. The performance of three supervised machine learning
algorithms was evaluated on students’ assessment data by [14] to predict success in a course
(either passed or failed). The result indicates that, based on prediction accuracy, ease of learning and
user-friendliness, the Naïve Bayes classifier outperforms the decision tree and neural network classifiers.
The research conducted by [13] to predict students’ performance shows that Random Forest is a more
accurate and faster algorithm than the decision tree, K-Nearest Neighbour (IBK) and Multi-layer
Perceptron algorithms, with an accuracy of 89.23%. According to [14], predicting the academic performance of
students is challenging since it depends on diverse factors such as personal, socio-economic,
psychological and other environmental variables. The study also identifies ensemble methods as the
most influential development in data mining and machine learning in the past decade. An approach for
predicting students’ academic performance using an ensemble model was presented in the study of [15]. The
stacking ensemble technique was used in [16] for predicting the academic achievement of students. The
performance of three classifier algorithms was evaluated, and the stacking ensemble technique was used to
combine the three classifiers; a better root mean square error (RMSE) value of 0.1291 was obtained, compared
with 0.1898 for the back propagation neural network, 0.1314 for M5P and 0.1343 for the support vector machine.
Int J Inf & Commun Technol, Vol. 8, No. 3, Dec 2019: 122 – 127, ISSN: 2252-8776
Stacking is one of the ensemble techniques used by researchers with the aim of improving a model’s
performance. According to [17], the stacking ensemble technique can combine heterogeneous base classifiers, and
a meta classifier is trained for the final prediction. The prediction outputs of the base classifiers are fed
directly as input data into the meta classifier for training and final prediction.
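The flow described above can be illustrated with a minimal sketch. It is not the WEKA implementation used in this study: the three base learners below are simple stand-ins rather than J48, IBK and SMO, the meta classifier is a plain lookup table, and the four training rows are hypothetical. Stacking implementations commonly build the meta-level data from cross-validated base predictions; for brevity this sketch reuses predictions on the training rows.

```python
from collections import Counter, defaultdict

def nearest_neighbour(train):
    """1-NN base learner: returns the label of the closest training row."""
    def predict(x):
        closest = min(train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)))
        return closest[1]
    return predict

def nearest_centroid(train):
    """Base learner that assigns the class whose mean vector is closest."""
    groups = defaultdict(list)
    for x, y in train:
        groups[y].append(x)
    centroids = {y: [sum(col) / len(xs) for col in zip(*xs)] for y, xs in groups.items()}
    def predict(x):
        return min(centroids, key=lambda y: sum((a - b) ** 2 for a, b in zip(centroids[y], x)))
    return predict

def majority_class(train):
    """Trivial base learner: always predicts the most frequent training class."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label

def fit_stacking(train, base_learners):
    """Train the base models, then a table-based meta classifier on their outputs."""
    models = [learn(train) for learn in base_learners]
    votes = defaultdict(Counter)
    for x, y in train:
        key = tuple(m(x) for m in models)  # base predictions are the meta-level input
        votes[key][y] += 1
    meta = {key: counts.most_common(1)[0][0] for key, counts in votes.items()}
    fallback = Counter(y for _, y in train).most_common(1)[0][0]
    def predict(x):
        key = tuple(m(x) for m in models)
        return meta.get(key, fallback)  # unseen combination: fall back to majority class
    return predict

# Hypothetical rows: (first-year CGPA, entrance score) -> class of degree.
train = [
    ((4.5, 70.0), "first"),
    ((4.2, 65.0), "first"),
    ((2.1, 40.0), "third"),
    ((2.4, 45.0), "third"),
]
model = fit_stacking(train, [nearest_neighbour, nearest_centroid, majority_class])
```

Calling `model((4.4, 68.0))` routes the new row through all three base learners and lets the meta table decide from their combined outputs.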
Therefore, this research work seeks to improve the performance of the stacking classifiers ensemble
model by feeding the meta classifier only those instances that are correctly predicted by the base
classifiers.
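One possible reading of this modification is sketched below. The text does not state whether an instance must be predicted correctly by every base classifier or by at least one, so both readings are exposed through a hypothetical `require` parameter; the function name and the sample predictions are likewise illustrative assumptions.

```python
def filter_meta_instances(base_predictions, labels, require="all"):
    """Keep meta-level rows whose base predictions match the true label.

    base_predictions: one tuple of base-classifier outputs per instance.
    require: "all" keeps rows where every base classifier is right,
             "any" keeps rows where at least one is right.
             Both readings are assumptions, not taken from the paper.
    """
    check = all if require == "all" else any
    kept = []
    for preds, y in zip(base_predictions, labels):
        if check(p == y for p in preds):
            kept.append((preds, y))
    return kept

# Hypothetical outputs of three base classifiers for four students.
base_predictions = [
    ("first", "first", "first"),
    ("second", "first", "second"),
    ("third", "third", "third"),
    ("second", "second", "third"),
]
labels = ["first", "second", "third", "first"]

strict = filter_meta_instances(base_predictions, labels, require="all")
lenient = filter_meta_instances(base_predictions, labels, require="any")
```

Only the rows that survive the filter would then be used to train the meta classifier; the fourth row is dropped under both readings because no base classifier predicted it correctly.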
3. RESEARCH METHOD
The methodology of Educational Data Mining is not yet clearly defined, and there are no clear
standards about which data mining methods or algorithms are preferable in this context. Various data mining
methods have been used by different researchers for estimating preferable algorithms in this context [7]. In
general, however, it was stated in [18] that data mining processes follow a set of steps that must be executed
regardless of the algorithms or methodology to be implemented. In this study, the Cross Industry Standard
Process for Data Mining (CRISP-DM) was adopted.
3.1. Data collection
A total of 206 students’ records from the Faculty of Science, Federal University Dutse were collected. The
data set was divided into two subsets for model training and testing respectively. The models were trained using
164 students’ records, which represent about 80% of the data set, while the remaining 42 students’ records,
about 20% of the data set, were used for model testing.
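The split sizes can be reproduced with a short sketch. Whether the records were shuffled or stratified before splitting is not stated, so the shuffle, the seed and the function name here are assumptions made only for illustration.

```python
import random

def holdout_split(records, train_fraction=0.8, seed=42):
    """Shuffle a copy of the records and cut it into training and testing parts."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)  # 206 * 0.8 = 164.8, truncated to 164
    return shuffled[:cut], shuffled[cut:]

# Placeholder identifiers standing in for the 206 student records.
student_ids = list(range(206))
train_ids, test_ids = holdout_split(student_ids)
```

Truncating 164.8 to 164 gives the 164 training and 42 testing records reported above.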
3.2. Data preparation and cleaning
The data preparation phase covers all activities required to construct, from the initial raw data, the final
data set that was fed into the WEKA 3.9.1 data mining tool. Real-world data tend to be
incomplete, inconsistent and noisy. Therefore, for real-world data to be utilized by the data mining tool, they
have to be pre-processed. The attribute filter in WEKA 3.9.1 was used to remove noisy and incomplete
data. The final summary of attributes used for conducting the experiments after data cleaning
is presented in Table 1.
Table 1. Summary of selected students’ attributes
S/N  Attributes                     Data Type
1    English Score                  Float
2    Subject 2 Score                Float
3    Subject 3 Score                Float
4    Subject 4 Score                Float
5    English Grade                  String
6    Subject 2 Grade                String
7    Subject 3 Grade                String
8    Subject 4 Grade                String
9    First Year CGPA                Float
10   Predicted Class of Graduation  Nominal
3.3. Modeling
The model this research work attempts to develop is a stacking classifiers ensemble model. Because the data
set for the study is small and imbalanced, machine learning algorithms that, based on previous studies, are
likely to perform well in developing this type of model were adopted. The dataset before class balancing is
shown in Figure 1. SMOTE was used to balance the classes in the data set, increasing the volume of
the training data set from 164 instances to 312 instances so that each of the four classes has 78
instances, as illustrated in Figure 2.
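The idea behind SMOTE can be sketched as follows: a synthetic instance is placed at a random point on the line between a minority-class instance and one of its nearest same-class neighbours. This is a simplified illustration, not the WEKA filter applied in the study; the function name, the default of five neighbours and the toy rows are assumptions, and every class is simply topped up to the size of the largest class.

```python
import random
from collections import Counter, defaultdict

def smote_balance(rows, k=5, seed=0):
    """Oversample every class up to the size of the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in rows:
        by_class[y].append(x)
    target = max(len(xs) for xs in by_class.values())
    balanced = list(rows)
    for y, xs in by_class.items():
        for _ in range(target - len(xs)):
            base = rng.choice(xs)
            # k nearest original instances of the same class, excluding the base itself
            others = [x for x in xs if x is not base]
            others.sort(key=lambda x: sum((a - b) ** 2 for a, b in zip(x, base)))
            neighbours = others[:k]
            partner = rng.choice(neighbours) if neighbours else base
            gap = rng.random()  # position along the segment base -> partner
            synthetic = tuple(a + gap * (b - a) for a, b in zip(base, partner))
            balanced.append((synthetic, y))
    return balanced

# Toy imbalanced data (hypothetical): 6, 3 and 2 instances per class.
rows = (
    [((float(i), float(i % 3)), "A") for i in range(6)]
    + [((10.0 + i, 1.0), "B") for i in range(3)]
    + [((20.0 + i, 2.0), "C") for i in range(2)]
)
balanced = smote_balance(rows)
counts = Counter(y for _, y in balanced)
```

In the toy data every class ends with six instances, mirroring on a small scale how the four classes in this study each reach 78 instances.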
The various models were trained and tested using 10-fold cross validation to reduce the risk of over-fitting
the models. The proposed model framework is shown in Figure 3.
4. Int J Inf & Commun Technol ISSN: 2252-8776
Classifiers ensemble and synthetic minority oversampling techniques for academic … (Abdulazeez Yusuf)
125
Figure 1. Dataset before class balancing
Figure 2. Dataset after class balancing with SMOTE on WEKA explorer
Figure 3. Proposed model framework
[Diagram: data set; data processing; training data (balanced with SMOTE) and testing data; modelling with a
stacking ensemble of IBK, J48 and SMO base classifiers and a meta classifier; evaluation by accuracy and RMSE;
predicted outcome]
3.4. Model training and testing
In this research, a series of training and testing runs was carried out on the various models. The data set
was divided into two subsets for model training and testing: 80% of the data
set was used for training and the remaining 20% was used for testing. 10-fold cross validation was used
throughout model training and testing to reduce over-fitting of the models. Since the data set is small and
imbalanced, the SMOTE technique was used to balance the classes and increase the volume of the training data
set. The WEKA 3.9.1 data mining tool provides a training and testing option to train and test on the same
data set.
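The 10-fold procedure mentioned above can be outlined in a few lines. This is a plain sketch, not WEKA's routine, which to our understanding also randomizes and stratifies the folds; here the rows are assumed to be shuffled already, and the function names are illustrative.

```python
def k_fold_indices(n, k=10):
    """Yield (train_idx, test_idx) pairs; every index is tested exactly once."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, test_idx
        start += size

def cross_validate(rows, fit, k=10):
    """Mean accuracy over k folds; fit(train_rows) must return a predict function."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(rows), k):
        model = fit([rows[i] for i in train_idx])
        hits = sum(model(rows[i][0]) == rows[i][1] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)

# Fold layout for the 312 balanced training instances.
folds = list(k_fold_indices(312, k=10))
fold_sizes = sorted(len(test) for _, test in folds)
```

With 312 instances the ten folds cannot be equal: two folds hold 32 instances and eight hold 31.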
3.5. Model evaluation
To evaluate the performance of the various models, performance accuracy and root mean square
error (RMSE) were used; the results are presented in tabular form.
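The two measures can be computed as in the sketch below. The RMSE definition is an assumption: it compares each predicted class probability with a 0/1 indicator of the true class and averages over instances and classes, which is how WEKA is understood to report RMSE for nominal classes. The probability rows are hypothetical.

```python
import math

def accuracy(predicted, actual):
    """Fraction of instances whose predicted class equals the actual class."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

def rmse(prob_rows, actual, classes):
    """RMSE between predicted class probabilities and 0/1 class indicators.

    prob_rows: one dict per instance mapping class name -> predicted probability.
    Averaging over instances and classes is an assumption about the convention.
    """
    total = 0.0
    for probs, y in zip(prob_rows, actual):
        for c in classes:
            target = 1.0 if c == y else 0.0
            total += (probs.get(c, 0.0) - target) ** 2
    return math.sqrt(total / (len(actual) * len(classes)))

# Hypothetical two-class example with two instances.
classes = ["first", "second"]
actual = ["first", "second"]
prob_rows = [
    {"first": 1.0, "second": 0.0},  # confident and correct
    {"first": 0.5, "second": 0.5},  # undecided
]
acc = accuracy(["first", "first"], actual)
err = rmse(prob_rows, actual, classes)
```

Under this definition a classifier can share the accuracy of another model and still have a different RMSE, because RMSE also reflects how confident the predicted probabilities are.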
4. RESULTS AND DISCUSSION
The results obtained on training the various models using the training data set before class balancing
are presented in Table 2, while the models’ performance results after class balancing with SMOTE are
presented in Table 3. The results in Table 3 indicate that class balancing using SMOTE improves
the performance of all the models. Although all the models recorded an improvement in
performance, the proposed modified stacking classifiers ensemble model outperformed the other models in
both performance accuracy and RMSE, which makes it better than the other classifier models.
Table 2. Performance of the various models on the training dataset before class balancing
S/N  Classifiers        Accuracy   RMSE    TPR    FPR    Precision
1    J48                87.1951%   0.2322  0.872  0.082  0.881
2    SMO                86.5854%   0.3301  0.866  0.096  0.870
3    IBK                82.9268%   0.2097  0.829  0.134  0.840
4    Standard Stacking  87.1951%   0.2404  0.872  0.082  0.881
5    Modified Stacking  93.9024%   0.1719  0.939  0.048  0.941
Table 3. Performance of the various models on the training dataset after class balancing
S/N  Classifiers        Accuracy   RMSE    TPR    FPR    Precision
1    J48                95.1923%   0.1455  0.952  0.016  0.953
2    SMO                90.7051%   0.3248  0.907  0.031  0.908
3    IBK                90.7051%   0.1591  0.907  0.031  0.919
4    Standard Stacking  96.7949%   0.1098  0.968  0.011  0.969
5    Modified Stacking  97.7564%   0.1060  0.978  0.007  0.978
The performance accuracy results obtained on testing the various classifier models indicate that
the modified stacking ensemble model outperformed the other models, with an accuracy of 97.7564% and an RMSE
of 0.1060.
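In Tables 2 and 3 the TPR column equals the accuracy in every row (for example 0.872 and 87.1951% for J48). This is what one would expect if TPR, FPR and precision are class-weighted averages, as in WEKA's summary output; that reading is an assumption. The sketch below computes such averages from a confusion matrix whose counts are hypothetical, not taken from the study.

```python
def weighted_metrics(matrix):
    """Class-weighted TPR, FPR and precision from a confusion matrix.

    matrix[i][j] = number of instances of actual class i predicted as class j.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    tpr = fpr = precision = 0.0
    for c in range(n_classes):
        tp = matrix[c][c]
        actual_c = sum(matrix[c])
        predicted_c = sum(matrix[r][c] for r in range(n_classes))
        fp = predicted_c - tp
        negatives = total - actual_c
        weight = actual_c / total  # share of instances that belong to class c
        tpr += weight * (tp / actual_c if actual_c else 0.0)
        fpr += weight * (fp / negatives if negatives else 0.0)
        precision += weight * (tp / predicted_c if predicted_c else 0.0)
    accuracy = sum(matrix[c][c] for c in range(n_classes)) / total
    return {"accuracy": accuracy, "tpr": tpr, "fpr": fpr, "precision": precision}

# Hypothetical three-class confusion matrix.
matrix = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
metrics = weighted_metrics(matrix)
```

Because each class's TPR is weighted by its share of the data, the weighted TPR reduces to the number of correct predictions divided by the total, which is the accuracy.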
5. CONCLUSION
Data mining can be applied to the students’ data available to higher educational institutions to develop
models that predict students’ graduation classes of degrees early, using students’ first-year CGPA, UTME
subject scores and their corresponding grades in SSCE. Resolving the class imbalance problem in the data sets
used for developing such models with the synthetic minority over-sampling technique (SMOTE) improves model
performance.
REFERENCES
[1] B. Ryan & Y. Kalina, "The State of Educational Data Mining in 2009: A Review and Future Visions," Journal of
Educational Data Mining, Article 1, vol. 1, no. 1, 2009.
[2] K. Parneet, M. Singh & G. S. Josan, "Classification and Prediction Based Data Mining Algorithms to Predict Slow
Learners in Education Sector," 3rd International Conference on Recent Trends in Computing 2015(ICRTC-2015)
Procedia Computer Science 57, pp. 500-508, 2015.
[3] N. Srečko & Z. Moti, "Student Data Mining Solution-Knowledge Management System Related to Higher Education
Institutions," Expert Systems with Applications, at ScienceDirect, vol. 41, pp. 6400-6407, 2014.
[4] P. Nithya, B. Umamaheswari, A. Umadevi., "A Survey on Educational Data Mining in Field of Education,"
International Journal of Advanced Research in Computer Engineering & Technology (IJARCET),
vol. 5, no. 1, pp.69-78 2016.
[5] S. A. Mohamed, W. Husain, N. A. Rashid., "A Review on Predicting Student’s Performance using Data Mining
Techniques," ScienceDirect Procedia Computer Science the Third Information Systems International Conference,
no. 72, pp. 414-422, 2015.
[6] M. S. Tabra & A. Lawan, "A Comparative Analysis of the Performance of Three Machine Learning Algorithms for
Tweets on Nigerian Dataset," The International Journal of E-learning and Educational Technologies in the Digital
Media (IJEETDM), vol. 3, no. 1, pp. 23-30, 2017.
[7] N. Vlatko, S. Riste et al., "Educational Data Mining: Case Study for Predicting Student Dropout in Higher Education,"
https://www.researchgate.net/publication/282333827 Conference Paper, April, 2015.
[8] G. Geraldine, C. McGuinness, P. Owende., "An Application of Classification Models to Predict Learner Progression
in Tertiary Education," Conference Paper, February 2014 DOI: 10.1109/IAdCC.2014.6779384. 2014.
[9] J. S. Tanveer, R. I. Rashu et al., "Improving Accuracy of Students’ Final Grade Prediction Model Using Optimal
Equal Width Binning and Synthetic Minority Over-Sampling Technique," Decision Analytics (2015) 2:1 A Springer
Open Journal, 2015.
[10] A. Fadhilah, N. H. Ismail, A. Abdul Aziz., "The Prediction of Students’ Academic Performance Using Classification
Data Mining Techniques," Applied Mathematical Sciences, vol. 9, no. 129, pp. 6415-6426, 2015.
http://dx.doi.org/10.12988/ams.2015.53289
[11] Teressa T. & Chikohora, "A Study of the Factors Considered when Choosing an Appropriate Data Mining
Algorithm," International Journal of Soft Computing and Engineering (IJSCE), vol. 4, no. 3, pp.42-45, July 2014.
ISSN: 2231-2307
[12] O. Edin & M. Suljić, "Data Mining Approach for Predicting Student Performance," Economic Review – Journal of
Economics and Business, vol. x, no. 1, pp. 4-12, May 2012.
[13] M. S. Mythili & A. R. M. Shanavas, "An Analysis of students’ Performance Using Classification Algorithms," IOSR
Journal of Computer Engineering (IOSR-JCE) e-ISSN: 2278-0661, p- ISSN: 2278-8727, vol. 6, no. 1, pp. 63-69, Ver.
III Jan. 2014. www.iosrjournals.org. 2014
[14] S. Sajadin, M. Zarlis, D. Hartama, S. Ramliana, E. Wani, "Prediction of Student Academic Performance by an
Application of Data Mining Techniques," 2011 International Conference on Management and Artificial Intelligence
Ipedr, vol. 6, pp. 110-114, 2011.
[15] R. Sikora & O. H. Al-laymoun, "A Modified Stacking Ensemble Machine Learning Algorithm Using Genetic
Algorithms," Journal of International Technology and Information Management, vol. 23, no.1, pp. 1-12. 2014
[16] N. Chanamarn, K. Tamee, P. Sittidech., "Stacking Technique for Academic Achievement Prediction," 2016
International Workshop on Smart Info-Media Systems in Asia (SISA 2016), pp.14-17. 2016
[17] D. Saso & B. Zenko, "Is Combining Classifiers with Stacking Better than Selecting the Best One?," Springer Journal
of Machine Learning, vol. 54, pp. 255-273, 2004.
[18] T. Ahmet., "Early Prediction of Students’ Grade Point Averages at Graduation: A Data Mining Approach," Eurasian
Journal of Educational Research, no. 54, pp. 207-226, 2014.
BIOGRAPHIES OF AUTHORS
Yusuf Abdulazeez received his BSc. degree in Computer Science from Adamawa State University
Mubi, Nigeria in 2011 and is presently a post-graduate student at the Department of Computer
Science, Bayero University Kano, Nigeria. His research interests include Data Mining, Artificial
Intelligence and HCI.
Ayuba John received his B.Eng. degree in Computer Engineering from the University of
Maiduguri, Nigeria in 2010, and his M.Eng. degree in Computer Engineering from the University of Benin,
Nigeria, in 2017. He worked as a transmission engineer at the National Control Centre (NCC),
Transmission Company of Nigeria (TCN) in 2014. He is a member of the Nigerian Society of
Engineers (NSE) and is currently a lecturer at Federal University Dutse, Nigeria. His research
interests are in the areas of Microelectronics, Intelligent Security Systems and Wireless Sensor
Networks.