This paper presents the use of pattern discovery techniques, applying multiple-relationship and clustering educational data mining approaches, to establish a knowledge base that aids in predicting the ideal college program and in forecasting enrollment for incoming freshmen. Mining two years of student admission records and final-grade scholastic records yielded a significant level of accuracy in predicting college programs for students. The results of these educational predictive data mining methods can be applied to improving the services of an institution's admission department, particularly in course alignment, student mentoring, admission forecasting, marketing, and enrollment preparedness.
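Clustering of this kind can be sketched with a minimal k-means pass over applicant records. The two exam-score features below are hypothetical stand-ins for the admission and scholastic attributes the paper mines:

```python
# Minimal k-means sketch for grouping applicants by admission-exam scores.
# The (math, english) features and the data are invented for illustration.

def kmeans(points, k, iters=20):
    """Cluster 2-D points [(math, english), ...] into k groups."""
    centroids = points[:k]  # naive initialization: first k points
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            # assign each point to its nearest centroid (squared distance)
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            groups[i].append(p)
        # recompute each centroid as the mean of its group
        centroids = [
            tuple(sum(dim) / len(g) for dim in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids, groups

# Two obvious clusters: high-math vs. high-english applicants.
applicants = [(90, 60), (88, 65), (92, 58), (55, 91), (60, 88), (58, 93)]
centroids, groups = kmeans(applicants, k=2)
```

In practice each discovered cluster would be mapped to the program most of its members graduated from, turning the clustering into a program recommendation.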
A Survey on Educational Data Mining Techniques (IIRindia)
Educational data mining (EDM) has a high impact in the academic domain. Its methods play a key role in increasing knowledge among students: EDM explores and helps explain students' behavioral patterns so that they can choose the correct path for their careers. This survey focuses on that area and discusses the various techniques involved in applying educational data mining for knowledge improvement. It also discusses the different types of EDM tools and techniques, and among them suggests the best categories for real-world use.
Predicting student performance in higher education using multi-regression models (TELKOMNIKA JOURNAL)
Supporting the goal of higher education to produce graduates who will become professional leaders is crucial. Most universities implement an intelligent information system (IIS) to support their vision and mission, and one feature of an IIS is student performance prediction. By embedding a data mining model in the IIS, this feature can precisely predict students' grades for their enrolled subjects. Moreover, it can recognize at-risk students and allow top educational management to take educative interventions so that those students succeed academically. In this research, a multi-regression approach was proposed to build a model for every student, computed from learning management system (LMS) activity logs. Testing on large datasets of students, courses, and activities indicates that these models improve prediction accuracy by over 15%.
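A per-student regression of the sort described can be sketched as an ordinary least-squares fit. The clicks-per-week predictor below is a hypothetical stand-in for whatever LMS activity-log features the paper's model computes:

```python
# Sketch of a per-student regression model: one predictor (weekly LMS
# activity), one target (grade). Data and feature choice are hypothetical.

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (single predictor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b  # intercept, slope

# One student's LMS clicks per week vs. grades in past subjects.
clicks = [10, 20, 30, 40]
grades = [55, 65, 75, 85]
a, b = fit_line(clicks, grades)
predicted = a + b * 50  # forecast the grade at 50 clicks/week
```

Building one such model per student, rather than one global model, is what lets the approach adapt to individual activity-to-grade relationships.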
Data Mining Model for Predicting Student Enrolment in STEM Courses in Higher ... (Editor IJCATR)
Educational data mining is the process of applying data mining tools and techniques to analyze data at educational institutions. In this paper, educational data mining was used to predict enrollment of students in Science, Technology, Engineering and Mathematics (STEM) courses in higher educational institutions. The study examined the extent to which individual, sociodemographic and school-level contextual factors help in pre-identifying successful and unsuccessful students in enrollment in STEM disciplines in higher education institutions in Kenya. The Cross-Industry Standard Process for Data Mining framework was applied to a dataset drawn from first-, second- and third-year undergraduate female students enrolled in STEM disciplines in one university in Kenya to model student enrollment. Feature selection was used to rank the predictor variables by their importance, and various predictive algorithms were evaluated for predicting enrollment in STEM courses. Empirical results showed the following: (i) the most important factors separating successful from unsuccessful students are high-school final grade, teacher inspiration, career flexibility, pre-university awareness, and mathematics grade; (ii) among classification algorithms, the decision tree (CART) was the most successful classifier, with an overall correct classification rate of 85.2%. This paper showcases the importance of prediction- and classification-based data mining algorithms in the field of education and also presents some promising future lines.
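The CART splitting criterion behind result (ii) can be sketched as a search for the feature and threshold minimizing weighted Gini impurity. The two features and labels below are invented for illustration, not the study's Kenyan dataset:

```python
# Sketch of the CART split search: pick the (feature, threshold) pair with
# the lowest weighted Gini impurity. Data below is fabricated.

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels):
    """rows: list of feature tuples; returns (feature_index, threshold, score)."""
    best = (None, None, float("inf"))
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best[2]:
                best = (f, t, score)
    return best

# (high-school grade, math grade) -> enrolled successfully (1) or not (0)
rows = [(45, 50), (50, 40), (80, 85), (90, 70), (85, 95), (40, 60)]
labels = [0, 0, 1, 1, 1, 0]
feature, threshold, score = best_split(rows, labels)
```

A full CART tree applies this search recursively to each side of the split; feature ranking falls out of how much each feature's best split reduces impurity.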
Data mining approach to predict academic performance of students (BOHRInternationalJou1)
Powerful data mining techniques are available in a variety of educational fields. Educational research is advancing rapidly due to the vast amount of student data that can be used to create insightful patterns related to student learning. Educational data mining is a tool that helps universities assess and identify student performance, and well-known classification techniques have been widely used to determine student success. A decisive and growing exploration area in educational data mining (EDM) is predicting student academic performance; it uses data mining and machine learning approaches to extract data from education repositories. According to relevant research, there are several academic performance prediction methods aimed at supporting administrative and teaching staff in academic institutions. In the proposed approach, the collected dataset is preprocessed to ensure data quality, and labeled student education data is used to train artificial neural network (ANN), support vector, random forest, and decision tree (DT) classifiers. The achievement of the four classifiers is measured by accuracy, the receiver operating characteristic (ROC) curve, F1 score, and the confusion matrix scored by each model. Finally, we found that the top three models had an accuracy of 86–95%, an F1 score of 85–95%, and an average one-vs-all (OVA) area under the ROC curve of 98–99.6%.
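The metrics named above all derive from the confusion matrix. The binary labels and predictions below are fabricated purely to show the computation:

```python
# How accuracy, precision, recall, and F1 fall out of a confusion matrix.
# Labels and predictions are made up for illustration.

def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels in {0, 1}."""
    tp = sum(t == p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
tp, tn, fp, fn = confusion(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
```

The OVA ROC area extends this to multiclass by scoring each class against the rest and averaging.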
Recommendation of Data Mining Technique in Higher Education Prof. Priya Thaka... (ijceronline)
The increasing interest in big data, especially in the educational field and in online education, has led to conflicting student performance indicators. In this paper we discuss a methodology for assessing student performance in terms of success indicators, identifying a number of indicators recommended for predicting success in final academic achievement.
Technology Enabled Learning to Improve Student Performance: A Survey (IIRindia)
The use of recent technology has a growing impact on the teaching and learning process. Improving students' knowledge through technologies such as smart classroom environments, the internet, mobile phones, television programs, and iPods plays a very important role, and most educational institutions now teach using advanced technologies such as smart classrooms and PowerPoint projection. This research work focuses on such technologies for improving student performance using Data Mining (DM) techniques, particularly classification and clustering. Information repositories (educational databases, data warehouses) are the primary source for collecting study materials and using them for learning and examination preparation. In particular, this work analyzes the use of clustering and classification algorithms to assess students' performance and learning capabilities with these modern technologies. Students' family background and economic status also play a very important role in their daily activities, but these factors are not considered in this survey. A comparative study is carried out by comparing student performance, based on their results, under several classification and clustering algorithms, and the work concludes by stating the best algorithm for improving student performance.
A comparative study of machine learning algorithms for virtual learning envir... (IAESIJAI)
The virtual learning environment is becoming an increasingly popular study option for students from diverse cultural and socioeconomic backgrounds around the world. Although this learning environment is quite adaptable, improving student performance is difficult due to the online-only learning method, so it is essential to investigate students' participation and performance in virtual learning. Using the publicly available Open University learning analytics dataset, this study examines a variety of machine-learning-based prediction algorithms to determine the best method for predicting students' academic success, hence providing additional alternatives for enhancing their achievement. Support vector machines, random forests, Naïve Bayes, logistic regression, and decision trees are employed for prediction. Random forest and logistic regression predict student performance with the highest average accuracy compared to the alternatives, while in a number of instances the support vector machine outperforms the other methods.
Predictive and Statistical Analyses for Academic Advisory Support (ijcsit)
The ability to recognize students' weaknesses and to solve, in a timely fashion, any problem that may confront them is a constant target of all educational institutions. This study was designed to explore how predictive and statistical analysis can support the academic advisor's work, mainly in analyzing students' progress. The sample consisted of 249 undergraduate students, 46% female and 54% male. A one-way analysis of variance (ANOVA) and a t-test were conducted to analyze whether there were differences in course-registration behaviour. Predictive data mining is used to support advisors in decision making. Several classification techniques with 10-fold cross-validation were applied; among them, C4.5 produced the best agreement with the findings.
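The 10-fold cross-validation protocol can be sketched as follows, with a trivial majority-class predictor standing in for C4.5, and hypothetical pass/fail labels:

```python
# Sketch of k-fold cross-validation: split the data into k folds, train on
# k-1 of them, test on the held-out fold, and pool the results. The
# majority-class "classifier" here is a placeholder, not C4.5.

def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    fold = n // k
    for i in range(k):
        test = list(range(i * fold, (i + 1) * fold if i < k - 1 else n))
        train = [j for j in range(n) if j not in test]
        yield train, test

def majority_predict(train_labels):
    """Predict the most frequent class seen in training."""
    return max(set(train_labels), key=train_labels.count)

labels = [1] * 70 + [0] * 30  # hypothetical pass/fail outcomes
correct = 0
for train, test in k_fold_indices(len(labels), 10):
    pred = majority_predict([labels[j] for j in train])
    correct += sum(labels[j] == pred for j in test)
accuracy = correct / len(labels)
```

Real protocols shuffle (or stratify) the data before folding; the contiguous folds here keep the sketch deterministic.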
Clustering Students of Computer in Terms of Level of Programming (Editor IJCATR)
Educational data mining (EDM) is one of the applications of data mining. In educational data mining there are two key domains, the student domain and the faculty domain, and different research has been done in both. In the existing system, faculty performance is calculated on the basis of two parameters: student feedback and student results in the subject. Two approaches are defined and compared for the relative evaluation of faculty performance using data mining techniques: a multiple-classifier approach and a single-classifier approach. In the multiple-classifier approach, K-nearest neighbor (KNN) is used in the first classification step and rule-based classification in the second, while in the single-classifier approach only KNN is used in both steps. In the proposed system, faculty performance is analyzed using four parameters: student complaints about faculty, student review feedback for faculty, student feedback, and student results. The proposed system uses opinion mining to analyze faculty performance and calculate a score for each faculty member.
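The first classification step shared by both approaches, K-nearest neighbor, can be sketched in a few lines. The feedback and result scores below are invented:

```python
# Minimal K-nearest-neighbor classifier: label a query by majority vote
# among the k closest training points. Scores below are fabricated.

def knn_predict(train, query, k=3):
    """train: list of (features, label) pairs; classify query by k nearest."""
    nearest = sorted(
        train,
        key=lambda fl: sum((a - b) ** 2 for a, b in zip(fl[0], query))
    )[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

# (avg feedback score out of 5, avg student result) -> performance class
train = [((4.5, 82), "good"), ((4.2, 78), "good"), ((4.8, 90), "good"),
         ((2.1, 55), "poor"), ((2.5, 60), "poor"), ((1.9, 50), "poor")]
label = knn_predict(train, (4.4, 80))
```

A second, rule-based pass (as in the multiple-classifier approach) would then refine these coarse labels with hand-written rules.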
THE USE OF COMPUTER-BASED LEARNING ASSESSMENT FOR PROFESSIONAL COURSES: A STR... (IAEME Publication)
Background/Objectives: With the increase in classroom technology, it is necessary to examine how assessment is administered through technology. The purpose of this study is to understand students' and faculty members' perceptions and to examine the effectiveness of computer-based assessment in professional education courses (Educational Technology) at Northern Iloilo Polytechnic State College, Iloilo, Philippines. Methods: The study used a mixed-method research design. A computer-based assessment, validated and pilot-tested to establish reliability, was used to assess students' performance in educational technology. Each campus of NIPSC selected ten students, 70 in total, as respondents during Academic Year 2016-2017. Frequency count, mean, standard deviation, and the Wilcoxon signed-rank test were the statistical tools used for data analysis. Findings: High posttest scores indicated better student performance in educational technology, and the increase in the posttest at each performance level was attributed to an accurate measure of what the students had learned. The majority of student users agreed that online assessment was faster than the paper-and-pencil form, as well as contemporary and more systematic; they also stated that online assessment is consistent with the teaching style and that they felt less anxious. Furthermore, ninety percent (90%) of faculty and students believed that computer-based assessment accurately measures what they teach and what they learn, respectively. Novelty: With the education system in the new normal, computer-based learning is important for flexible learning, and assessment using technology is a great help to both faculty and students. Thus, state universities and colleges (SUCs) should adopt this innovation to support teaching and learning.
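The Wilcoxon signed-rank statistic used for the pretest-posttest comparison can be sketched as follows. The scores are invented; tied absolute differences receive averaged ranks, and zero differences are dropped, as in the standard procedure:

```python
# Sketch of the Wilcoxon signed-rank test statistic W for paired scores.
# Pretest/posttest values below are fabricated for illustration.

def wilcoxon_w(pre, post):
    """Return W = min(W+, W-) for paired samples (zero diffs dropped)."""
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    diffs.sort(key=abs)
    ranks = {}
    i = 0
    while i < len(diffs):
        j = i
        while j < len(diffs) and abs(diffs[j]) == abs(diffs[i]):
            j += 1
        avg = (i + 1 + j) / 2  # mean of ranks i+1 .. j for the tie group
        for k in range(i, j):
            ranks[k] = avg
        i = j
    w_plus = sum(ranks[k] for k, d in enumerate(diffs) if d > 0)
    w_minus = sum(ranks[k] for k, d in enumerate(diffs) if d < 0)
    return min(w_plus, w_minus)

pre = [60, 55, 70, 65, 50]
post = [75, 68, 72, 80, 66]
w = wilcoxon_w(pre, post)  # every student improved, so W- = 0 and W = 0
```

A small W relative to the critical value (or the normal approximation for larger samples) indicates a significant pretest-posttest difference.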
Nowadays, data volumes are growing rapidly in several domains. Many factors have contributed to this growth, including the proliferation of observational devices, the miniaturization of various sensors, improved logging and tracking of systems, and improvements in the quality and capacity of both disk storage and networks. Analyzing such data provides insights that can guide decision making, but to be effective the analysis must be timely and must cope with the scale of the data: the volumes and rates at which data arrive make manual inspection infeasible. As an educational management tool, predictive analytics can help improve the quality of education by letting decision makers address critical issues such as enrollment management and curriculum development. This paper presents an analytical study of this approach's prospects for education planning. The goals of predictive analytics are to produce relevant information, actionable insight, better outcomes, and smarter decisions, and to predict future events by analyzing the volume, veracity, velocity, variety, and value of large amounts of data through interactive exploration.
The big data concept emerged to meet the growing demands of analysing large volumes of fast-moving, heterogeneous, and complex data, which traditional data analysis systems could no longer manage. The application of big data technology across various sectors of the economy has aided better utilization of the multiple data collected, and hence decision making. Organizations no longer base operations solely on assumptions or constructed models, but can make inferences from generated data. Educational organizations are more efficient, and pedagogical processes more effective, when multiple streams of data can be collated from the various personnel and facilitators involved; when analysed, this data maximizes the performance of administrators and recipients alike. This paper looks at the components and techniques of big data technology, and how it can be implemented in the education system for effective administration and delivery.
STUDENTS' PERFORMANCE PREDICTION SYSTEM USING MULTI AGENT DATA MINING TECHNIQUE (IJDKP)
High prediction accuracy of students' performance is helpful for identifying low-performing students at the beginning of the learning process, and data mining is used to attain this objective. Data mining techniques discover models or patterns in data and are very helpful in decision making. Boosting is the most popular technique for constructing ensembles of classifiers to improve classification accuracy. Adaptive Boosting (AdaBoost) is a generation of the boosting algorithm; it is used for binary classification and is not directly applicable to multiclass classification. The SAMME boosting technique extends AdaBoost to multiclass classification without reducing it to a set of binary sub-classifications. In this paper, a students' performance prediction system using multi-agent data mining is proposed to predict student performance with high accuracy and to help low-performing students through optimization rules. The proposed system has been implemented and evaluated by investigating the prediction accuracy of the AdaBoost.M1 and LogitBoost ensemble classifier methods and of the C4.5 single-classifier method. The results show that the SAMME boosting technique improves prediction accuracy and outperforms the C4.5 single classifier and LogitBoost.
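One round of the SAMME update that the system builds on can be sketched directly. The predictions below are fabricated; K is the number of classes, and SAMME's extra log(K-1) term is what makes the multiclass extension work:

```python
# Sketch of one SAMME boosting round: compute the weighted error of the
# current weak learner, derive its vote weight alpha, and up-weight the
# misclassified examples. Predictions are fabricated for illustration.
import math

def samme_round(weights, y_true, y_pred, K):
    """Return (alpha, new_weights) for one multiclass boosting iteration."""
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p) / sum(weights)
    # SAMME adds log(K-1) so alpha stays positive whenever err < 1 - 1/K,
    # i.e. whenever the weak learner beats random guessing over K classes.
    alpha = math.log((1 - err) / err) + math.log(K - 1)
    new = [w * math.exp(alpha) if t != p else w
           for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(new)
    return alpha, [w / total for w in new]

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 1, 2, 0, 2, 2]  # one mistake out of six
w0 = [1 / 6] * 6
alpha, w1 = samme_round(w0, y_true, y_pred, K=3)
```

Repeating this over many weak learners and predicting by alpha-weighted vote gives the ensemble; with K = 2 the log(K-1) term vanishes and the update reduces to binary AdaBoost.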
The most critical parameters characterizing a Wi-Fi network are throughput, delay, latency, and packet loss, since they matter most to the end user. This research investigates Wi-Fi performance in an indoor environment under line-of-sight (LOS) and non-line-of-sight (NLOS) conditions; the effects of surrounding obstacles and distance are also reported. The parameters measured are packet loss, packets sent, packets received, throughput, and latency. Site measurements were done to obtain real-time and optimal results, and the measured parameters were then validated using the EMCO Ping Monitor 8 software. The comparison between measurement and simulation is presented in the paper, with measurements taken at distances of up to 30 meters. The results indicate that throughput decreases with increasing distance, with the lowest throughput value at 24.64 Mbps and the highest at 70.83 Mbps. The maximum latency value from the measurements is 79 ms, while the lowest is 56.09 ms. Finally, this research verified that obstacles and distance are among the contributing factors affecting the throughput and latency performance of a Wi-Fi network.
Subarrays of phased-array antennas for multiple-input multiple-output radar a... (IJICTJOURNAL)
The subarray MIMO radar (SMIMO) is a multiple-input multiple-output (MIMO) radar whose elements form subarrays that each act as a phased array (PAR), so it simultaneously combines the key advantage of the PAR (high directional gain, which increases target range) and the key advantage of MIMO radar (high diversity gain, which increases the maximum number of detectable targets). Different schemes for the number of antenna elements in the transmit and receive zones, such as uniform or variable and overlapped or non-overlapped, significantly determine the radar's performance in terms of virtual arrays (VARs), maximum number of detected targets, target-angle accuracy, detection resolution, SNR detection, and detection probability. Performance is also compared with the PAR, MIMO, and phased-MIMO (PMIMO) radars. The SMIMO radar offers great versatility for radar applications, adapting to different configurations of the multiple targets to be detected and their environment. For example, for a transmit-receive array with M = N = 8 antenna elements, the range of the number of detected targets for the SMIMO radar is flexible compared to the other radars. The proposed radar's signal-to-noise ratio (SNR) detection performance and detection probability (K = 5, L = 3) are 1,999 and above 90%, respectively, which are better than those of the other radars.
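The "virtual array" behind MIMO's diversity gain can be sketched as a sum co-array: each transmit-receive element pair contributes a virtual element at the sum of the two positions. The half-wavelength-unit spacings below are a standard textbook configuration, not the paper's subarray geometry:

```python
# Sketch of the MIMO virtual (sum co-)array: an M-element transmit array
# with sparse spacing and an N-element receive array with dense spacing
# yield up to M*N distinct virtual elements. Geometry here is assumed,
# in half-wavelength units, not taken from the paper.

def virtual_array(tx_positions, rx_positions):
    """Return the sorted, de-duplicated virtual element positions."""
    return sorted({t + r for t in tx_positions for r in rx_positions})

M, N = 4, 4
tx = [m * N for m in range(M)]  # sparse transmit spacing: 0, 4, 8, 12
rx = list(range(N))             # dense receive spacing: 0, 1, 2, 3
va = virtual_array(tx, rx)      # a filled aperture of M*N positions
```

The larger virtual aperture is what lets a MIMO (or subarrayed MIMO) radar resolve more targets than the physical element count alone would allow.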
More Related Content
Similar to Multiple educational data mining approaches to discover patterns in university admissions for program prediction
Data mining approach to predict academic performance of studentsBOHRInternationalJou1
Powerful data mining techniques are available in a variety of educational fields. Educational research is
advancing rapidly due to the vast amount of student data that can be used to create insightful patterns
related to student learning. Educational data mining is a tool that helps universities assess and identify student
performance. Well-known classification techniques have been widely used to determine student success in
data mining. A decisive and growing exploration area in educational data mining (EDM) is predicting student
academic performance. This area uses data mining and automaton learning approaches to extract data from
education repositories. According to relevant research, there are several academic performance prediction
methods aimed at improving administrative and teaching staff in academic institutions. In the put-forwarded
approach, the collected data set is preprocessed to ensure data quality and labeled student education data
is used to apply ANN classifiers, support vector classifiers, random forests, and DT Compute and train a
classifier. The achievement of the four classifications is measured by accuracy value, receiver operating curve
(ROC), F1 score, and confusion matrix scored by each model. Finally, we found that the top three algorithmic
models had an accuracy of 86–95%, an F1 score of 85–95%, and an average area under ROC curve of
OVA of 98–99.6%
Recommendation of Data Mining Technique in Higher Education Prof. Priya Thaka...ijceronline
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
Due to the increasing interest in big data especially in the educational field and online education has led to a conflict in terms of performance indicators of the student. In this paper we discuss the methodology of assessing the student performance in terms of the success indicators revealing a number of indicators that is recommended to indicate success of the final academic achievement.
Due to the increasing interest in big data especially in the educational field and online education has led to a conflict in terms of performance indicators of the student. In this paper we discuss the methodology of assessing the student performance in terms of the success indicators revealing a number of indicators that is recommended to indicate success of the final academic achievement
Technology Enabled Learning to Improve Student Performance: A SurveyIIRindia
The use of recent technology creates more impact in the teaching and learning process nowadays. Improvement of students’ knowledge by using the various technologies like smart class room environment, internet, mobile phones, television programs, use of iPods and etc. are play a very important role. Most of the education institutions used classroom teaching using advanced technologies such as smart class environment, visualization by power point projector and etc. This research work focusses on such technologies used for the improvement of student’s performance using some of the Data Mining (DM) techniques particularly classification and clustering. Information repositories (Educational Data Bases, Data Warehouses) are the source place for collecting study materials and use them for their learning purposes is the number one source for preparation of examinations. Particularly, this research work analyzes about the use of clustering and classification algorithms to enable the student’s performances and their learning capabilities using these modern technologies. During the study period, the student’s family background and their economic status are also play a very important role in their daily activities. These things are not considered in this survey work. A comparative study is carried out in this work by comparing students performance based on their results. The comparison is carried out based on the results of some of the classification and clustering algorithms. Finally, it states that the best algorithm for the improvement of students performance using these algorithms.
Technology Enabled Learning to Improve Student Performance: A SurveyIIRindia
The use of recent technology creates more impact in the teaching and learning process nowadays. Improvement of students’ knowledge by using the various technologies like smart class room environment, internet, mobile phones, television programs, use of iPods and etc. are play a very important role. Most of the education institutions used classroom teaching using advanced technologies such as smart class environment, visualization by power point projector and etc. This research work focusses on such technologies used for the improvement of student’s performance using some of the Data Mining (DM) techniques particularly classification and clustering. Information repositories (Educational Data Bases, Data Warehouses) are the source place for collecting study materials and use them for their learning purposes is the number one source for preparation of examinations. Particularly, this research work analyzes about the use of clustering and classification algorithms to enable the student’s performances and their learning capabilities using these modern technologies. During the study period, the student’s family background and their economic status are also play a very important role in their daily activities. These things are not considered in this survey work. A comparative study is carried out in this work by comparing students performance based on their results. The comparison is carried out based on the results of some of the classification and clustering algorithms. Finally, itstates that the best algorithm for the improvement of students performance using these algorithms.
A comparative study of machine learning algorithms for virtual learning envir...IAESIJAI
Virtual learning environment is becoming an increasingly popular study option for students from diverse cultural and socioeconomic backgrounds around the world. Although this learning environment is quite adaptable, improving student performance is difficult due to the online-only learning method. Therefore, it is essential to investigate students' participation and performance in virtual learning in order to improve their performance. Using a publicly available Open University learning analytics dataset, this study examines a variety of machine learning-based prediction algorithms to determine the best method for predicting students' academic success, hence providing additional alternatives for enhancing their academic achievement. Support vector machine, random forest, Nave Bayes, logical regression, and decision trees are employed for the purpose of prediction using machine learning methods. It is noticed that the random forest and logistic regression approach predict student performance with the highest average accuracy values compared to the alternatives. In a number of instances, the support vector machine has been seen to outperform the other methods.
Predictive and Statistical Analyses for Academic Advisory Support (ijcsit)
The ability to recognize students’ weaknesses and to solve any problem that may confront them in a timely fashion is always a target of all educational institutions. This study was designed to explore how predictive and statistical analysis can support the academic advisor’s work, mainly in analyzing students’ progress. The sample consisted of a total of 249 undergraduate students; 46% of them were female and 54% male. A one-way analysis of variance (ANOVA) and a t-test were conducted to analyze whether there were differences in behaviour when registering courses. Predictive data mining is used to support the advisor in decision making. Several classification techniques with 10-fold cross-validation were applied; among them, C4.5 achieved the best agreement with the findings.
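The 10-fold cross-validation protocol mentioned above can be sketched in pure Python: each record is used exactly once for testing and nine times for training. The classifier itself (C4.5 in the paper) is abstracted away here; only the fold bookkeeping is shown.

```python
# Sketch of 10-fold cross-validation index splitting (round-robin assignment).

def k_fold_indices(n, k=10):
    folds = [list(range(i, n, k)) for i in range(k)]  # round-robin split
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

n = 249  # sample size reported in the study
splits = list(k_fold_indices(n))
print(len(splits))                            # 10 train/test splits
print(sum(len(test) for _, test in splits))   # every record tested exactly once
```

Reported accuracies are then averaged over the ten held-out folds, which is what makes the comparison between classifiers fair on a small sample.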
Clustering Students of Computer in Terms of Level of Programming (Editor IJCATR)
Educational data mining (EDM) is one of the applications of data mining. In educational data mining there are two key domains, i.e., the student domain and the faculty domain, and different types of research work have been done in both.
In the existing system, faculty performance is calculated on the basis of two parameters, i.e., student feedback and student results in the subject. The existing system defines two approaches, a multiple classifier approach and a single classifier approach, and compares them for the relative evaluation of faculty performance using data mining techniques. In the multiple classifier approach, K-nearest neighbor (KNN) is used in the first step and rule-based classification in the second step, while in the single classifier approach only KNN is used in both steps of classification.
In the proposed system, faculty performance will be analysed using four parameters, i.e., student complaints about faculty, student review feedback for faculty, student feedback, and student results.
The proposed system will use an opinion mining technique to analyze faculty performance and calculate a score for each faculty member.
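The KNN first stage of the multiple classifier approach can be sketched as a distance-plus-majority-vote rule. The features (feedback score, result score) and labels below are invented toy values, not the system's actual data.

```python
# Minimal k-nearest-neighbour sketch: squared-distance ranking, majority vote.
from collections import Counter

def knn_predict(train, query, k=3):
    # train: list of ((feature tuple), label); query: feature tuple
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(train, key=lambda row: dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [((8, 9), "good"), ((7, 8), "good"), ((3, 4), "poor"),
         ((2, 3), "poor"), ((5, 6), "average"), ((6, 5), "average")]
print(knn_predict(train, (8, 8)))  # high feedback and result scores
```

In the two-step variant, the KNN output would then be refined by rule-based classification in the second stage.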
THE USE OF COMPUTER-BASED LEARNING ASSESSMENT FOR PROFESSIONAL COURSES: A STR... (IAEME Publication)
Background/Objectives: With the increase in classroom technology, it is necessary to examine how assessment is administered through technology. The purpose of this study is to understand how students and faculty perceive, and to examine the effectiveness of, computer-based assessment in professional education courses (Educational Technology) at Northern Iloilo Polytechnic State College, Iloilo, Philippines. Methods: The research design utilized in this study is mixed-method research. A computer-based assessment was utilized to assess students' performance in educational technology. This instrument was validated and pilot tested to establish reliability. Each campus of NIPSC selected ten students, out of 70, as respondents during Academic Year 2016-2017. Frequency count, mean, standard deviation, and the Wilcoxon signed-rank test were the statistical tools used for data analyses. Findings: The study's findings showed that the students' high posttest scores reflected better performance in educational technology. The increase in the posttest per performance level of the students was due to an accurate measure of what they had learned in educational technology. The majority of student users agreed that online assessment was faster than the paper-and-pencil form, and that online assessment is contemporary and more systematic. They also stated that online assessment is consistent with the teaching style and that they felt less anxious. Furthermore, ninety percent (90%) of faculty and students believed that computer-based assessment accurately measures what they teach and what they learn in school, respectively. Novelty: With the education system now in the new normal, computer-based learning is important in flexible learning, and assessment using technology is a great help to both faculty and students. Thus, state universities and colleges (SUCs) should adopt this innovation to help teaching and learning.
Nowadays, data volumes are growing rapidly in several domains. Many factors have contributed to this growth, including, inter alia, the proliferation of observational devices, the miniaturization of various sensors, improved logging and tracking of systems, and improvements in the quality and capacity of both disk storage and networks. Analyzing such data provides insights that can be used to guide decision making. To be effective, analysis must be timely and cope with data scales; the scale of the data and the rates at which they arrive make manual inspection infeasible. As an educational management tool, predictive analytics can help improve the quality of education by letting decision makers address critical issues such as enrollment management and curriculum development. This paper presents an analytical study of this approach’s prospects for education planning. The goals of predictive analytics are to produce relevant information, actionable insight, better outcomes, and smarter decisions, and to predict future events by analyzing the volume, veracity, velocity, variety, and value of large amounts of data through interactive exploration.
The big data concept emerged to meet the growing demands of analysing large volumes of fast moving, heterogeneous and complex data, which traditional data analysis systems could no longer manage. The application of big data technology across various sectors of the economy has aided better utilization of the multiple data collated and hence decision making. Organizations no longer base operations solely on assumptions or constructed models, but can make inferences from generated data. Educational organizations are more efficient, and the pedagogical processes more effective, when multiple streams of data can be collated from the various personnel and facilitators involved. This data, when analysed, maximizes the performance of administrators and recipients alike. This paper looks at the components and techniques in big data technology, and how it can be implemented in the education system for effective administration and delivery.
STUDENTS’ PERFORMANCE PREDICTION SYSTEM USING MULTI AGENT DATA MINING TECHNIQUE (IJDKP)
A high prediction accuracy for students’ performance is very helpful for identifying low-performing students at the beginning of the learning process, and data mining is used to attain this objective. Data mining techniques are used to discover models or patterns in data, which is very helpful in decision-making. Boosting is one of the most popular techniques for constructing ensembles of classifiers to improve classification accuracy. Adaptive Boosting (AdaBoost) is a well-known boosting algorithm; it is used for binary classification and is not directly applicable to multiclass classification. The SAMME boosting technique extends AdaBoost to multiclass classification without reducing it to a set of binary sub-classifications. In this paper, a students’ performance prediction system using multi-agent data mining is proposed to predict the performance of students from their data with high prediction accuracy and to provide help to low-performing students through optimization rules. The proposed system has been implemented and evaluated by investigating the prediction accuracy of the AdaBoost.M1 and LogitBoost ensemble classifier methods and the C4.5 single classifier method. The results show that using the SAMME boosting technique improves the prediction accuracy and outperforms the C4.5 single classifier and LogitBoost.
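The key change SAMME makes to AdaBoost is in the classifier weight: an extra ln(K - 1) term lets a K-class weak learner contribute as long as it beats random guessing (accuracy above 1/K), instead of the binary 1/2 threshold. A small sketch of that formula, with illustrative error values:

```python
# SAMME classifier weight: alpha = ln((1 - err) / err) + ln(K - 1).
# With K = 2 it reduces to the classic binary AdaBoost weight.
import math

def samme_alpha(err, n_classes):
    return math.log((1 - err) / err) + math.log(n_classes - 1)

# Binary case at chance level (err = 0.5): alpha is zero, the learner is useless.
print(samme_alpha(0.5, 2))
# Four classes with 40% accuracy (err = 0.6): still positive, since 0.4 > 1/4.
print(samme_alpha(0.6, 4))
```

This is why SAMME can boost genuinely multiclass weak learners directly, without the binary sub-problems the abstract mentions.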
The most critical parameters that indicate Wi-Fi network performance are throughput, delay, latency, and packet loss, since they provide significant benefits, especially to the end-user. This research aims to investigate Wi-Fi performance in an indoor environment for line-of-sight (LOS) and non-line-of-sight (NLOS) conditions. The effect of the surrounding obstacles and distance has also been reported in the paper. The parameters measured are packet loss, packets sent, packets received, throughput, and latency. Site measurement is done to obtain real-time and optimum results. The measured parameters are then validated using the EMCO Ping Monitor 8 software. The comparison results between the measurement and the simulation are well presented in this paper. Additionally, the measurement distance extends up to 30 meters and the results are reported in the paper as well. The results indicate that the throughput value decreases with increasing distance, where the lowest throughput value is 24.64 Mbps and the highest throughput value is 70.83 Mbps. Next, the maximum latency value from the measurement is 79 ms, while the lowest latency value is 56.09 ms. Finally, this research verified that obstacles and distances are among the contributing factors affecting the throughput and latency performance of a Wi-Fi network.
Subarrays of phased-array antennas for multiple-input multiple-output radar a... (IJICTJOURNAL)
The subarray MIMO radar (SMIMO) is a multiple-input multiple-output (MIMO) radar with elements in the form of a sub-array that acts as a phased array (PAR), so it combines at the same time the key advantage of the PAR radar, which is high directional gain to increase target range, and the key advantage of the MIMO radar, i.e., high diversity gains to increase the maximum number of detected targets. Different schemes for the number of antenna elements in the transceiver zones, such as uniform and/or variable, overlapped and non-overlapped, significantly determine the performance of radars as virtual arrays (VARs), maximum number of detected targets, accuracy of target angle, detection resolution, SNR detection, and detection probability. Performance is also compared with the PAR, the MIMO, and the phased MIMO radars (PMIMO). The SMIMO radar offers great versatility for radar applications, being able to adapt to different shapes of the multiple targets to be detected and their environment. For example, for a transmit-receive with an antenna element number, i.e., M = N = 8, the range of the number of detected targets for the SMIMO radar is flexible compared to the other radars. On the other hand, the proposed radar's signal-to-noise ratio (SNR) detection performance and detection probability (K = 5, L = 3) are both 1,999 and above 90%, which are better than other radars.
Statistical analysis of an orographic rainfall for eight north-east region of... (IJICTJOURNAL)
Autoregressive integrated moving average (ARIMA) models are used to predict the rain rate for orographic rainfall over a long period of time, from 1980 to 2018. As orographic rainfall may cause landslides and other natural disasters, this study is very important for rainfall prediction analysis. In this research, statistical calculations have been done based on rainfall data for twelve regions of India (Cherrapunji, Darjeling, Dawki, Ghum, Itanagar, Kamchenjunga, Mizoram, Nagaland, Pakyong, Saser Kangri, Slot Kangri, and Tripura) from eight states, i.e., Sikkim, Meghalaya, West Bengal, Ladakh (a Union Territory of India), Arunachal Pradesh, Mizoram, Tripura, and Nagaland, with varying altitude. The model's output is assessed using several error calculations. The model's performance is represented by the fit value, which is reliable for the northeast region of India with increasing altitude. The statistical dependability of the rainfall prediction is shown by the parameters; the lowest value of root mean square error (RMSE) indicates better prediction for orographic rainfall.
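The RMSE score used above to rank the ARIMA fits is a standard error measure; a minimal pure-Python sketch is below. The rain values are invented illustration numbers, not the study's data.

```python
# Root mean square error: sqrt of the mean squared difference between
# observed and predicted series.
import math

def rmse(observed, predicted):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted))
                     / len(observed))

observed  = [10.0, 12.0, 9.0, 11.0]   # hypothetical rain rates
predicted = [ 9.0, 12.0, 10.0, 13.0]  # hypothetical model output
print(round(rmse(observed, predicted), 3))
```

Because errors are squared before averaging, RMSE penalizes large misses more than small ones, which suits rainfall prediction where big errors matter most.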
A broadband MIMO antenna's channel capacity for WLAN and WiMAX applications (IJICTJOURNAL)
This paper describes the findings of research into the multiple input multiple output (MIMO) channel capacity of a broadband dual-element printed inverted-F antenna (PIFA) array. The dual-element antenna array is made up of two PIFAs that are meant to fit on a very small wireless communication device that runs at 5 GHz. The device's frequency range is between 3.5 and 4.5 GHz. These PIFAs are also loaded into the device during the installation process. In order to investigate the channel capacity, the ray tracing method is employed in two different kinds of circumstances. For the purpose of carrying out this analysis of the channel capacity, both the simulated and measured mutual couplings of the broadband MIMO antenna are utilized.
Satellite dish antenna control for distributed mobile telemedicine nodes (IJICTJOURNAL)
The positioning control of a dish antenna mounted on distributed mobile telemedicine nodes (DMTNs) within Nigeria communicating via NigComSat-1R is presented. It was desired to improve the transient and steady-state performance of the satellite dish antenna and reduce the effect of delay during satellite communication. To achieve this, the equations describing the dynamics of the antenna positioning system were obtained and transformed into state-space variable equations. A full state feedback controller was developed with a forward path gain and an observer. The proposed controller was introduced into the closed loop of the dish antenna positioning control system. The system was subjected to a unit step forcing function in the MATLAB/Simulink simulation environment, considering three different cases, so as to obtain the time domain parameters that characterize the transient and steady-state response performance. The simulation results revealed that the introduction of the full state feedback controller provided improved position tracking to a unit step input, with a rise time of 0.42 s, settling time of 1.22 s, and overshoot of 4.91%. With the addition of the observer, the rise time achieved was 0.39 s, with a settling time of 1.31 s and overshoot of 10.7%. The time domain performance comparison of the proposed system with existing systems revealed its superiority over them.
High accuracy sensor nodes for a peat swamp forest fire detection using ESP32... (IJICTJOURNAL)
The use of smoke sensors in high-precision, low-cost forest fire detection kits needs to be developed immediately to assist the authorities in monitoring forest fires, especially in remote areas, more efficiently and systematically. The implementation of an automatic reclosing operation allows the fire detector kit to distinguish between real smoke and non-real smoke successfully. This has profitably reduced kit errors when detecting fires and in turn prevents users from receiving incorrect messages. However, using a smoke sensor with automatic reclosing operation has not been able to optimize the accuracy of identifying actual smoke, because the sensor node’s working situation is difficult to predict and sometimes unexpected, such as the source of the smoke received. Thus, to further improve the accuracy when detecting the presence of smoke, the system is equipped with two digital cameras that can capture and send pictures of fire smoke to the users. The system gives the users a choice of three options for having the camera capture and send pictures: request, smoke trigger, and movement for security purposes. In all cases, users can request the system to send pictures at any time. The system equipped with this camera demonstrates the accuracy of smoke detection by confirming the actual smoke that has been detected through images sent to the user’s Telegram channel and shown on the graphical user interface (GUI) display. Comparing the system before and after using this camera, it was found that the system that uses the camera gives users an advantage in monitoring fire smoke more effectively and accurately.
Prediction analysis on the pre and post COVID outbreak assessment using machi... (IJICTJOURNAL)
In this time of global urgency, where people are losing their lives each day in large numbers, people are trying to develop ways and technology to solve the challenges of COVID-19. Machine learning (ML) and artificial intelligence (AI) tools have been employed in previous pandemics as well, where they proved their worth by providing reliable results in varied fields; this is why ML tools are being used extensively to fight this pandemic too. This review describes the applications of ML in post- and pre-COVID-19 conditions for contact tracing, vaccine development, prediction and diagnosis, risk management, and outbreak prediction, to help the healthcare system work efficiently. It discusses ongoing research on the pandemic virus, where various ML models have been applied to data sets to produce outputs that can be used for risk or outbreak prediction in the population, vaccine development, and contact tracing. Thus, the significance and contribution of ML against COVID-19 are self-explanatory, but what should not be compromised is the quality and accuracy of the solutions, methods, and policies produced from this analysis, which will be applied in the real world to real people.
Meliorating usable document density for online event detection (IJICTJOURNAL)
Online event detection (OED) has seen a rise in the research community, as it can provide quick identification of possible events happening around the world. Through these systems, potential events can be indicated well before they are reported by the news media, by grouping similar documents shared over social media by users. Most OED systems use textual similarities for this purpose. Similar documents that may indicate a potential event are further strengthened by the replies made by other users, thereby improving the potentiality of the group. However, these replies are at times unusable as independent documents, as they may replace previously appearing noun phrases with pronouns, leading OED systems to fail while grouping these replies into their suitable clusters. In this paper, a pronoun resolution system that tries to replace pronouns with relevant nouns over social media data is proposed. Results show significant improvement in performance using the proposed system.
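The core idea, replacing a pronoun in a reply with the noun phrase it refers to so the reply becomes usable for text-similarity clustering, can be shown with a drastically simplified toy rule (resolve every pronoun to the most recent entity). The pronoun set and the "last entity" shortcut are inventions for illustration, far simpler than a real resolution system.

```python
# Toy pronoun substitution: make a reply self-contained for similarity matching.
PRONOUNS = {"he", "she", "it", "they"}

def resolve(reply, last_entity):
    # Replace each pronoun token with the most recently mentioned entity.
    return " ".join(last_entity if w.lower() in PRONOUNS else w
                    for w in reply.split())

# A reply like "it just crashed again" says nothing on its own, but after
# substitution it can be clustered with documents about the same event.
print(resolve("it just crashed again", "the power grid"))
```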
Performance analysis on self organization based clustering scheme for FANETs ... (IJICTJOURNAL)
With the fast-increasing development of wireless communication networks, the unmanned aerial vehicle (UAV) has emerged as a flying platform for wireless communication with efficient coverage, capacity, and reliability; its network is called a flying ad-hoc network (FANET), which keeps changing its topology due to its dynamic nature, causing inefficient communication, and therefore needs cluster formation. In this paper, we propose cluster formation, selection of the cluster head and its members, and connectivity and transmission with the base station using the K-means algorithm, together with the choice of an optimized path for transmission using the firefly optimization algorithm, for efficient communication. Performance is evaluated experimentally and compared, with and without the firefly optimization algorithm, in terms of cluster building time, cluster lifetime, energy consumption, and probability of delivery success. The results proved that K-means with the firefly optimization algorithm is better than without it in terms of cluster building time, energy consumption, cluster lifetime, and probability of delivery success.
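One common way to pick a cluster head after K-means has grouped the UAV positions is to choose the node closest to its cluster's centroid; the sketch below shows that step for one cluster. The 2-D coordinates are invented, and this simple nearest-to-centroid rule is an illustrative assumption, not necessarily the paper's exact criterion.

```python
# Pick the cluster head as the member nearest the cluster centroid.

def nearest_to_centroid(cluster):
    cx = sum(x for x, _ in cluster) / len(cluster)
    cy = sum(y for _, y in cluster) / len(cluster)
    # Squared Euclidean distance is enough for choosing a minimum.
    return min(cluster, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)

cluster = [(0, 0), (2, 0), (1, 1), (1, 3)]   # hypothetical UAV positions
print(nearest_to_centroid(cluster))          # centroid is (1, 1)
```

A central head minimizes the average intra-cluster link distance, which helps the energy-consumption and delivery-success metrics compared in the paper.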
A persuasive agent architecture for behavior change intervention (IJICTJOURNAL)
A persuasive agent makes use of persuasion attributes to ensure that its predefined objective(s) are achieved within its immediate environment. This is made possible by five unique features, namely its sociable, persuasive, autonomous, reactive, and proactive natures. However, only limited successes have been recorded in behavioural interventions, and psychological reactance is responsible for these failures. Psychological reactance is the stage at which rejection, negative response, and frustration are felt by the users of the persuasive system. Thus, this study proposes a persuasive agent (PAT) architecture that limits the experience of psychological reactance to achieve an improved behavioural intervention. The PAT architecture adopts a combination of the reactance model for behavior change and persuasive design principles. The architecture is evaluated by conducting an experimental study using a user-centred approach. The evaluation showed a reduction in the number of users who experienced psychological reactance from 70 per cent to 3 per cent, a better improvement compared with previous outcomes. The contribution made in this study provides a design model and a step-like approach for software designers on how to limit the effect of psychological reactance in persuasive system applications and interventions.
Enterprise architecture-based ISA model development for ICT benchmarking in c... (IJICTJOURNAL)
Building on a concurrent work-in-progress paper, this paper constructs and evaluates an information systems architecture (ISA) model for the Bahraini architecture, engineering and construction (AEC) sector, through the lens of enterprise architecture (EA). This model acts as an information and communication technology (ICT) barometer tool to identify and benchmark the ICT gaps, duplication levels, and future investments. Following design science research, and through the utilization of a tailored version of the open group architecture framework (TOGAF) embedded in a rigorous case study approach, the construction, testing, and evaluation of the conceptual ISA model is approached to benchmark ICT measurement. Empirically, the study revealed the appropriateness of the model and its ability to identify the availability of 28 groups of 38 individual ICT applications in the Bahraini AEC sector and benchmark them, scoring an average of 18.5% against 17 countries that scored an average of 18.6%.
Empirical studies on the effect of electromagnetic radiation from multiple so... (IJICTJOURNAL)
Ever since Michael Faraday's pioneering work on electricity, there has been a revolution in communication technology, which led to the invention of radio, television, radar, satellite, and mobile devices. While these machines have made our lives higher in quality, safer, and simpler, they have been associated with alarming probable health hazards owing to their electromagnetic radiation (EMR) emission. In several cases, individuals have reported various health-related issues relating to exposure to electromagnetic fields (EMF) and EMR. Although some people showed mild symptoms and responded by avoiding electric fields (EF) and EMR fields as much as possible, others were so affected that they changed their entire lifestyle. In this paper, an empirical survey study was carried out in the laboratories of the Daffodil International University (DIU) main and permanent campus. It was found that some of the instruments had higher EMF levels. The findings from this survey may help students who work for long durations in the various laboratories during their practical classes to take precautionary measures, as well as users of domestic appliances, office equipment, and industrial instruments.
Cyber attack awareness and prevention in network security (IJICTJOURNAL)
This article aims to provide an overview of cyber attack awareness and prevention in network security. It discusses the different types of cyber attacks, current trends in cyber attacks, how to prevent cyber attacks, and UUM students' awareness of cyber attacks. First, we go over the different types of cyber attack, current trends, the impact of cyber attacks, and prevention. The approach entailed comparing and observing the outcomes of 13 different papers. The survey's findings demonstrate the results obtained after analyzing the collected data, namely a questionnaire filled out by respondents after watching a cyber attack awareness video intended to improve students' awareness of cyber attacks. Depending on the outcome of this survey, we will have a better understanding of current students' knowledge and awareness of cyber attacks, allowing us to improve students' understanding of cyber threats and the necessity of cyber security.
An internet of things-based irrigation and tank monitoring system (IJICTJOURNAL)
Agriculture plays a significant role in the development of a nation and provides the main source of food production, income, and employment. It was the most practiced occupation in Nigeria and formed the backbone of the economy in the early 1960s, before the discovery of crude oil, which has led to the derailment of sufficient food production, exportation, and the agricultural economy at large. Over time, the dry season has always been challenging, with little or no rainfall and no irrigation facilities that incorporate water-saving practices to adapt to these climate changes. In this paper, a cost-effective internet of things irrigation system that is capable of reducing water wastage and manual labor, monitoring the tank water level, and being controlled remotely is designed. The system integrates an Arduino UNO with a soil moisture sensor, an HC-SR04 ultrasonic sensor, and an ESP8266 Wi-Fi module, which makes the system capable of being controlled remotely via the internet, thus achieving optimal irrigation using the internet of things (IoT). Some of the challenges facing existing irrigation systems are water wastage, poor performance, and high cost of implementation. The designed system helps to control the water supply to crops when it is needed, and also monitors soil moisture, temperature, and the water tank level. After carrying out experiments for 15 days, the system saved approximately 49% of the water used in the traditional irrigation method. The system is useful in large farming areas to minimize human effort and reduce the cost of hiring personnel.
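The watering decision such a system makes each cycle can be sketched as a simple threshold rule: pump only when the soil is dry and the tank has water. The thresholds and function below are hypothetical illustrations, not the paper's firmware.

```python
# Hypothetical irrigation control rule (thresholds are invented examples).

def pump_command(moisture_pct, tank_level_pct,
                 moisture_min=40, tank_min=10):
    # Water only when soil is dry AND the tank is not nearly empty,
    # so the pump never runs dry.
    return moisture_pct < moisture_min and tank_level_pct > tank_min

print(pump_command(25, 80))   # dry soil, full tank  -> pump on
print(pump_command(55, 80))   # moist soil           -> pump off
print(pump_command(25, 5))    # dry soil, empty tank -> pump off
```

On the actual hardware this rule would run in the Arduino loop, with the ESP8266 reporting the same readings remotely.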
About decentralized swarms of asynchronous distributed cellular automata usin... (IJICTJOURNAL)
This research describes the simple implementation of asynchronous distributed cellular automata and decentralized swarms of asynchronous distributed cellular automata built on top of the InterPlanetary File System’s publish-subscribe (IPFS PubSub) experimentation. Various publish-subscribe (PubSub) models are described. As an illustration, two distributed versions and a decentralized swarm version of a 2D elementary cellular automaton are thoroughly detailed to highlight the simplicity of implementation with IPFS and the inner workings of these kinds of cellular automata (CA). Both algorithms were implemented, and experiments were conducted across five datacenters of the Grid’5000 testbed in France to obtain preliminary performance results in terms of network bandwidth usage. This work is a precursor to implementing a large-scale decentralized epidemic propagation modeling and prediction system based upon asynchronous distributed cellular automata, with application to the current pandemic of SARS-CoV-2 coronavirus disease 2019 (COVID-19).
A convolutional neural network for skin cancer classification (IJICTJOURNAL)
Skin diseases can be seen clearly by oneself and others. Although such a disease is visible on the skin, it is often feared to be harmful. People who experience skin diseases immediately visit a dermatologist to have their complaints and symptoms checked. The skin protects the body, especially from the sun, so it can be lethal if something goes wrong; one example of a deadly skin disease is skin cancer, or skin tumors. In this research, we classified skin cancer into benign and malignant using the convolutional neural network (CNN) algorithm. The purpose of this research is to develop CNN architectures to help identify skin diseases. We used a dataset of 3,297 skin cancer images which are publicly available on the Kaggle website. We propose two CNN architectures that differ in the number of parameters: the first architecture has 6,427,745 parameters and the second has 2,797,665. The accuracy of the proposed models is 93% and 74%, respectively. The first model, with 6,427,745 parameters, was saved for use in the creation of the website. We created a web-based application with the Django framework for skin disease identification.
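Parameter totals like the 6,427,745 and 2,797,665 quoted above come from summing per-layer counts: a conv layer has (kernel_height × kernel_width × in_channels + 1) × filters parameters, and a dense layer (inputs + 1) × outputs. The layer shapes below are invented for illustration; the paper does not list its exact architecture here.

```python
# Back-of-the-envelope CNN parameter counting (hypothetical layer shapes).

def conv2d_params(in_ch, out_ch, kh, kw):
    return (kh * kw * in_ch + 1) * out_ch   # weights + one bias per filter

def dense_params(n_in, n_out):
    return (n_in + 1) * n_out

total = (conv2d_params(3, 32, 3, 3)         # RGB input -> 32 filters
         + conv2d_params(32, 64, 3, 3)      # 32 -> 64 filters
         + dense_params(64 * 16 * 16, 128)  # flattened feature maps -> 128
         + dense_params(128, 2))            # benign vs malignant
print(total)
```

Note how the first dense layer dominates the count, which is why the two proposed architectures can differ by millions of parameters while sharing similar conv stacks.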
A review on notification sending methods to the recipients in different techn... (IJICTJOURNAL)
Women have progressed a lot in terms of social empowerment and economics. They are going for higher education, jobs, and many other similar endeavors, but harassment cases have also been on the rise. So, women’s safety is a big concern nowadays, especially in developing countries. Many previous studies and attempts were made to create a feasible safety solution for women. Out of the various features to ensure women’s safety in critical situations, location tracking is a very common and key feature in most previously proposed solutions. This study surveys the mechanisms for sending the location to different types of recipients in various women’s safety solutions. In addition, the advantages and drawbacks of location-sending methods in women’s safety solutions were analyzed.
Detection of myocardial infarction on recent dataset using machine learning (IJICTJOURNAL)
In developing countries such as India, with a large aging population and limited access to medical facilities, remote and timely diagnosis of myocardial infarction (MI) has the potential to save many lives. An electrocardiogram is the primary clinical tool utilized in detecting the onset of, or a previous, MI incident. Artificial intelligence has made a great impact on every area of research as well as on medical diagnosis, where doctors' experience can be used as input to predict a disease and save lives. It has been observed that a properly cleaned and pruned dataset provides far better accuracy than an unclean one with missing values. Selecting suitable techniques for data cleaning alongside proper classification algorithms will lead to prediction systems that give enhanced accuracy. In this proposal, detection of myocardial infarction using new parameters is proposed, increasing the accuracy and efficiency of the existing model; the additional parameters are used to predict MI with more accuracy. The proposed model is used to make an early diagnosis of MI with the help of expert experience and data gathered from hospitals.
Correcting optical character recognition result via a novel approach (IJICTJOURNAL)
Optical character recognition (OCR) is a recognition system used to recognize the content of a scanned image. Such systems give erroneous results, which necessitates post-processing for sentence correction. In this paper, we propose a new method for the syntactic and semantic correction of sentences, based on the frequency of pairs of correct words in the sentence and a recursive technique. The approach starts by calculating the frequency of each pair of successive words in the corpora; the word pairs with the greatest frequency build a correction center. We obtained 98% with our approach when combined with the noisy channel model, and 96% using the same corpus under the same conditions.
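The bigram-frequency idea can be illustrated with a toy sketch: rank candidate corrections of an OCR-garbled word by how often each forms a pair with the preceding word in a reference corpus. The corpus, candidates, and single-pass rule below are inventions for illustration; the paper's method adds a recursive correction-center technique on top.

```python
# Toy bigram-frequency correction: pick the candidate that most often
# follows the previous word in the corpus.
from collections import Counter

corpus = "the cat sat on the mat the cat ate the fish".split()
bigrams = Counter(zip(corpus, corpus[1:]))   # counts of successive word pairs

def correct(prev_word, candidates):
    # Unseen pairs count as 0, so a frequent pair wins.
    return max(candidates, key=lambda w: bigrams[(prev_word, w)])

# Suppose OCR read "cet"; an edit-distance lookup yields these candidates:
print(correct("the", ["cot", "cat", "cut"]))
```

Combining this corpus-frequency signal with a noisy channel model of the OCR errors is what yields the accuracy figures reported above.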
A novel enhanced algorithm for efficient human tracking (IJICTJOURNAL)
Tracking moving objects has been an issue in recent years in computer vision and image processing, and human tracking makes it an even more significant challenge. This category has various aspects and wide applications, such as autonomous driving, human-robot interaction, and human movement analysis. Among the issues that have always made tracking algorithms difficult are their interaction with goal recognition methods, the mutable appearance of variable targets, and the simultaneous tracking of multiple targets. In this paper, a method for tracking objects with a fixed camera is introduced, with high efficiency and higher accuracy compared to previous methods. The proposed algorithm operates in four steps: it identifies a fixed background and removes noise from it; this background is used for subtraction from moving objects; then, while the image is filtered, the shadows and noise of the filmed image are removed; and finally, using the bubble routing method, the mobile object is separated and tracked. Experimental results indicated that the proposed model for detecting and tracking mobile objects works well and can improve the motion and trajectory estimation of objects in terms of speed and accuracy to a desirable level compared with previous methods.
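The background-subtraction step of such a pipeline can be sketched in a few lines: pixels whose absolute difference from a fixed background exceeds a threshold are marked as foreground. The tiny grayscale grids and threshold below are invented illustrations of the idea, not the paper's implementation.

```python
# Threshold-based background subtraction on toy grayscale frames.

def foreground_mask(background, frame, threshold=30):
    # 1 where the pixel differs strongly from the background, else 0.
    return [[1 if abs(f - b) > threshold else 0
             for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

background = [[10, 10, 10],
              [10, 10, 10]]
frame      = [[10, 200, 10],
              [10, 210, 12]]
print(foreground_mask(background, frame))   # object appears in middle column
```

In the full pipeline this mask would then be denoised and shadow-filtered before the separated object is handed to the tracking stage.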
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Multiple educational data mining approaches to discover patterns in university admissions for program prediction
International Journal of Informatics and Communication Technology (IJ-ICT)
Vol. 11, No. 1, April 2022, pp. 45~56
ISSN: 2252-8776, DOI: 10.11591/ijict.v11i1.pp45-56
Journal homepage: http://ijict.iaescore.com

Julius Cesar O. Mamaril¹, Melvin A. Ballera²
¹ College of Computing Sciences, Pangasinan State University, Lingayen Pangasinan, Philippines
² Graduate Programs, Technological Institute of the Philippines, Manila, Philippines
Article Info

Article history:
Received Nov 25, 2021
Revised Feb 18, 2022
Accepted Mar 2, 2022

ABSTRACT

This paper presented the utilization of pattern discovery techniques by using multiple relationships and clustering educational data mining approaches to establish a knowledge base that will aid in the prediction of ideal college program selection and enrollment forecasting for incoming freshmen. Results show a significant level of accuracy in predicting college programs for students by mining two years of student college admission and graduation final grade scholastic records. The results of educational predictive data mining methods can be applied in improving the services of the admission department of an educational institution, particularly in its course alignment, student mentoring, admission forecast, marketing, and enrollment preparedness.
Keywords: Attribute selection, Design algorithm, Educational data mining, Pattern discovery, Program prediction, Supervised learning, University admission
This is an open access article under the CC BY-SA license.
Corresponding Author:
Julius Cesar O. Mamaril
College of Computing Sciences, Pangasinan State University
Alvear E, Poblacion, Lingayen, 2401 Pangasinan, Philippines
Email: jmamaril@psu.edu.ph
1. INTRODUCTION
The second half of the 20th century brought swift and unceasing growth in computing power, with a significant impact on both the practice of statistical science and computer technology [1]. These advancements in computer technology allowed databases to handle enormous and more complex datasets, augmenting the power of data processing in the fields of artificial intelligence and data mining [2]. Among the sectors that benefited from these developments is the education industry, which now harnesses the promise of educational data mining: the academy is a silent data warehouse that continuously and regularly grows its data collection, almost every day for educational institutions with computer systems, or every semester for schools that still employ paper-based, manual records processing and management [3].
This paper is in preparation for the authors' future research and is crucial to the development plans of a custom-built educational data mining (EDM) system for program prediction to be deployed in a local state university. Although numerous data mining software packages are currently available, both open-source and proprietary in nature, the following drawbacks can be observed in utilizing them: they all require high technical proficiency and experience to operate, which most university admission and student registration employees will fall short of; they were developed as general-purpose DM engines and thus include features beyond the needs of EDM; some limit the number of rows of records that can be processed for free; they require re-uploading and reprocessing of records for a different prediction approach or when appending records; and the majority cannot be added as a free third-party software module, library, or service to a software development endeavor.
This paper sought to answer the following problems: i) only a few schools offer customized data mining systems for admission purposes; ii) in some studies, historical data were never converged for preprocessing in data mining; iii) many attributes are used, but it is difficult to extract effective attributes to formulate a dynamic rule-based approach in the admission process; iv) only a few studies discover knowledge or patterns based on historical data that has not been used by the university; and v) not all forecasting met the university's expectations.
To bridge the gap and mitigate the problems enumerated above, and given the authors' aim to custom-develop an EDM software, this paper is limited for the meantime to the following objectives: identify the data mining pre-processing techniques to be used with university admission historical records; enumerate which attributes among the university admission datasets are significant for data mining and can be modeled to predict an ideal program for an incoming college freshman; determine how relationship and clustering algorithms can be implemented to discover patterns in university historical records; and identify which discovered patterns can be a significant addition to the knowledge base for predicting a college program for freshmen students during university admissions.
2. RELATED WORKS
EDM is concerned with developing, researching, and applying computerized methods to detect patterns in large collections of educational data, patterns that would otherwise be hard or impossible to analyze due to the enormous volume of data they exist within [4]. EDM pertains to the growing research and exploration in the education sector that concerns primarily the combined use of statistics, machine learning, artificial intelligence, and data mining on historical records in the academe. Generally, this sector seeks to improve different data exploration methodologies by observing significant hierarchies at different levels to uncover new information and understand how stakeholders in an educational institution learn, choose, decide, and improve themselves in the context of school settings [2]. Among the most common applications of EDM are intelligent tutoring systems, student behavior forecasting, university admission forecasting, and student graduation forecasting [3]. The selection of potential students while still in high school using educational historical records has been one of the major applications of EDM among first-world western countries [5].
Data mining is defined as the process of extracting useful information from vast amounts of data. The author noted that many other terms are used to interpret data mining, such as knowledge mining from databases, knowledge extraction, data analysis, data archaeology, and data dredging [6]. There are four goals of EDM: i) establishing a student model that integrates the measurement of students' overall learning satisfaction with detailed student information and characteristics to predict future learning behavior; ii) improving education domain frameworks to discover new methodologies that support learning and improve previous student models; iii) investigating the results and causality of educational support through the implementation of electronic learning systems; and iv) expanding the breadth and reach of scientific information regarding the system of learning by implementing and improving student models through computer software [7].
High school report cards, admission test results, and the final GWA from historical records of graduated students have been used as attributes for a rule-based classification technique that applies university admission policies for course/career alignment as data mining rules and looks for similar patterns to predict when students drop out of college or transfer to other schools [8]. Furthermore, historical data per course and department, as well as variables extracted from the admission form, are used as input parameters for a forecasting technique to predict the total number of "would-be" enrollees and assess preparation and enrollment capacity [9].
2.1. Phases of educational data mining
The task of discovering patterns in the educational sector involves the same data mining algorithms and approaches as the business sector, although the intentions of the former are the betterment of learning in general among stakeholders in the academe, with little to do with monetary gains. These educational data mining tasks, similar to general-purpose data mining, are usually divided into phases [10].
The first phase of EDM is uncovering relationships among historical educational data. Information
submitted by students from enrollment to graduation is among the richest and most detailed relevant
information that can be used as models for prediction, analytical and quantitative research. Because these
data are required to be given by each student, relationships that will be discovered can be validated and
double checked for consistency, frequency, and integrity. The pre-processing phase of EDM usually involves
identifying which data are related to one another and what other attributes that are available on record can be
utilized to create or sustain a stronger attribute relationship. To identify all possible or relevant relationships,
several algorithms can be utilized, such as classification-based approaches, sequential pattern mining, clustering, regression, association rule mining, and factor analysis. Under this phase, a huge part of the historical data is used as a training set to create the model [11].
After identifying the relationships among attributes, phase 2 of EDM involves relationship validation to avoid overfitting. In the field of statistics, overfitting refers to an analysis that relates too narrowly or almost exactly to a specific dataset and may consequently fail to reliably predict future or upcoming data. Usually, a portion of historical data is set aside as test data. By adjusting criteria, parameters, factors, or thresholds, a significant level of accuracy can be attained while still providing leeway for upcoming data [12].
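The held-out test data described above can be sketched as follows. This is a minimal, hypothetical example in Python; the record fields and the 80/20 split ratio are illustrative assumptions, not the authors' actual schema or protocol.

```python
import random

def train_test_split(records, test_ratio=0.2, seed=42):
    """Set aside a portion of historical records as test data (phase 2)."""
    rng = random.Random(seed)              # fixed seed so the split is reproducible
    shuffled = records[:]                  # copy, so the original list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]  # (training set, test set)

# 100 hypothetical admission records
records = [{"student_id": i, "gwa": 85 + i % 10} for i in range(100)]
train, test = train_test_split(records)
print(len(train), len(test))  # 80 20
```

The model is then fit on `train` only; accuracy measured on the unseen `test` portion reveals whether the discovered relationships generalize or merely overfit the training data.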
Phase 3 of EDM involves the application of the validated relationships from phase 2 to make reliable predictions in the chosen educational context for which the historical data, attributes, and relationships were selected [13]. Under this phase, the rate of prediction accuracy is monitored before it can be used to render decisions. There are usually two main approaches for EDM under this phase: discovery with models and distillation of data for human judgment. The discovery-with-models approach takes the high-accuracy predictions made under this phase and utilizes them as a component in another analysis, either by predicting another variable for next-level analysis or by using the predicted variable as it relates to the attributes used in the prediction. This approach requires that the predictions have proven generalizability across applicable contexts [11]. Meanwhile, distillation of data for human judgment literally applies human inferences, or the higher-level human intelligence, intuition, and mental grasp that the computerized EDM process may lack. Data distillation by humans is purposed either to classify or to identify. Predictions, data frequencies, or patterns that humans can classify can aid immensely in improving the prediction model, while data relationships and patterns that humans can identify are purposed for interpretation: computers may provide a visual representation of predictions or results, but they lack the ability to interpret these visual data and the underlying reasons why such a visual presentation was arrived at or formed [12].
After approaching the prediction results of the EDM process with either discovery with models or distillation of data for human judgment, phase 4 takes place, which involves using the EDM predictions to support decision-making procedures and to craft decision policies. The decision-making process reflects the actual addition of the obtained information to the knowledge base [14].
2.2. Data mining pre-processing techniques
Nowadays, there exist several definitions and characterizations of data mining and its corresponding tasks. However, data mining can be simply viewed as a four-step process: data pre-processing, data mining, pattern evaluation, and knowledge presentation. Among these four steps, pre-processing involves the most crucial tasks for gaining relevant results, per the old cliché in the computer world: garbage in, garbage out. Aside from being the most crucial part of the process and the foundation of the whole DM system, it involves dealing with actual, physical raw data; thus, a substantial amount of effort and time is usually allotted for this task [10].
Data mining pre-processing is a process consisting of an iterative sequence of the following steps:
i) Data cleaning which involves tasks of removing noise and inconsistent data; ii) Data integration wherein
multiple data sources may be combined; iii) Data selection which involves database retrieval of data relevant
to the analysis task, and iv) Data transformation which pertains to transforming or consolidating data into
forms appropriate for mining by performing summary or aggregation operations [15].
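The four pre-processing steps above can be sketched on a handful of hypothetical admission records. The field names, values, and sources below are illustrative assumptions only; they are not the authors' actual dataset.

```python
# Raw records from two hypothetical sources (one is noisy/incomplete)
admissions = [
    {"id": 1, "name": "Ana",  "exam": 88},
    {"id": 2, "name": "Ben",  "exam": None},   # noisy: missing exam score
    {"id": 3, "name": "Carl", "exam": 75},
]
grades = [
    {"id": 1, "gwa": 1.50},
    {"id": 3, "gwa": 2.25},
]

# i) Data cleaning: remove noisy records with missing values
clean = [r for r in admissions if r["exam"] is not None]

# ii) Data integration: combine the two sources by student id
gwa_by_id = {g["id"]: g["gwa"] for g in grades}
merged = [{**r, "gwa": gwa_by_id[r["id"]]} for r in clean if r["id"] in gwa_by_id]

# iii) Data selection: keep only the attributes relevant to the analysis
selected = [{"exam": r["exam"], "gwa": r["gwa"]} for r in merged]

# iv) Data transformation: aggregate into a summary form for mining
mean_exam = sum(r["exam"] for r in selected) / len(selected)
print(selected)
print(round(mean_exam, 1))  # 81.5
```

In practice each step would run over thousands of rows, but the iterative shape is the same: clean, integrate, select, then transform.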
Depending on the nature or context of the data mining goal, the pre-processing of data may involve simple or sophisticated selection methods or algorithms in order to identify the most significant attributes and determine the difficulty and general quality of the models that can be taken into account in the actual data mining stage [16].
2.3. Educational data mining attributes selection
Data mining attributes refer to the fundamental qualities or characteristics of the individual data to be mined. In terms of raw data, a filled-up form refers to a record, while each item in the form pertains to a data attribute. Technically, when viewed in a column-row relationship or a two-dimensional dataset, a filled-up form or record becomes a row, while the individual items that make up a record become the unique identifiers of the columns. More rows means more records to process; more columns means more attributes or characterizations available for each record. For simplification and to relate to the current subject of this paper, a student record refers to a row, while items such as name, age, gender, and grades refer to the attributes of the record [17].
In data mining, an attribute is directly equivalent to a variable in a model. Thus, the fewer variables
you have and the more records or samples you can aggregate, the more interpretable the result of the model
can be. Meanwhile, having more variables but fewer samples could mean less significant accuracy and results that are difficult to interpret [18].
In order to efficiently process voluminous data, the important and relevant variables are often selected, a process called feature selection. Feature selection is a dimension reduction methodology that aims to reduce the number of original predictors (usually denoted as p predictors) to be operated on in a model by choosing the best predictors (denoted as d predictors) [11].
Figure 1 exhibits a graphical representation of feature selection, where X refers to an attribute and n to the total number of samples or data points. In the left portion of the image, the eight attributes pertain to the original predictors, or p predictors, which require R^(n×8) operations to come up with a contextual result R. Meanwhile, the right portion of the image shows that only three variables (d predictors) and R^(n×3) operations are required to arrive at the same contextual result R. This dimension reduction directly benefits computational time and resources. It also avoids overfitting and increases prediction performance [16].
Figure 1. A graphical representation of feature selection as a data mining dimensional reduction methodology
of reducing multiple variables to only a few important variables to be used by a model
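A reduction from p predictors to d predictors can be sketched in a filter style: score each feature by its absolute correlation with the target and keep the top d. The features, values, and target encoding below are synthetic assumptions for illustration only.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length number lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

def select_top_d(features, target, d):
    """Keep the d feature names with the highest |correlation| to the target."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:d]

features = {                      # p = 4 hypothetical predictors
    "exam":   [80, 85, 90, 95],   # strongly related to the outcome
    "age":    [17, 18, 17, 18],   # weakly related
    "height": [160, 150, 170, 155],
    "gwa":    [2.0, 1.8, 1.5, 1.2],
}
target = [1, 2, 3, 4]             # numerically encoded program outcome
print(select_top_d(features, target, d=2))  # ['exam', 'gwa']
```

Only the surviving d columns are then fed to the model, which is the computational saving the figure depicts.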
2.3.1. Approaches in feature selection of data mining
There are three approaches to feature selection in data mining: the filter, wrapper, and embedded approaches [9]. Figure 2 visually represents the filter approach of feature selection in data mining. This method chooses the features to be used in a model independently of the type of classifier algorithm to be utilized. It is simple and only needs to be performed once; however, it does not take into account feature dependencies and relevant interactions with the classifier [14].
Figure 2. Filter approach of data mining feature selection
Meanwhile, Figure 3 encapsulates the idea behind the wrapper approach of data mining feature selection, in which the process of selecting the best subset of features is an iterative method of generating a subset of predictors (s predictors) from the set of all original p predictors and subjecting it to a learning algorithm that scores each subset by its impact on the total result or prediction accuracy. This approach is contrary to the filter approach, as the wrapper approach directly communicates with the classifier algorithm [14].
Figure 3. Wrapper approach of data mining feature selection
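The wrapper idea can be sketched as a greedy forward selection in which every candidate subset is scored by an actual learner, here a leave-one-out nearest-neighbour classifier. The records, labels, and feature names are synthetic assumptions, and this greedy loop is only one of many subset-generation strategies.

```python
def loo_accuracy(rows, labels, feats):
    """Leave-one-out 1-NN accuracy using only the chosen features."""
    hits = 0
    for i, (r, y) in enumerate(zip(rows, labels)):
        best, pred = float("inf"), None
        for j, (r2, y2) in enumerate(zip(rows, labels)):
            if i == j:
                continue                    # never compare a row with itself
            dist = sum((r[f] - r2[f]) ** 2 for f in feats)
            if dist < best:
                best, pred = dist, y2
        hits += pred == y
    return hits / len(rows)

def forward_select(rows, labels, all_feats, d):
    """Greedily grow a subset of d features, scored by the learner itself."""
    chosen = []
    while len(chosen) < d:
        scored = {f: loo_accuracy(rows, labels, chosen + [f])
                  for f in all_feats if f not in chosen}
        chosen.append(max(scored, key=scored.get))
    return chosen

rows = [{"exam": 90, "noise": 3}, {"exam": 92, "noise": 9},
        {"exam": 60, "noise": 4}, {"exam": 58, "noise": 8}]
labels = ["science", "science", "arts", "arts"]
print(forward_select(rows, labels, ["exam", "noise"], d=1))  # ['exam']
```

Because the classifier itself does the scoring, the subset that survives reflects feature interactions the filter approach would miss, at the cost of rerunning the learner for every candidate.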
Finally, Figure 4 summarizes the embedded approach of data mining feature selection, which is similar to the wrapper method in iteratively finding the best subset of features; however, the feature subset selection process is incorporated as part of the classifier algorithm itself, hence the name embedded approach. This approach offers reduced computational time; however, it requires a lot of technical proficiency to customize the feature selection [18].
Figure 4. The embedded approach of data mining feature selection
2.4. Data mining pattern discovery techniques
Patterns are everywhere around us. Patterns can be tangible, naturally present and visible to the naked eye, or intangible, in the sense that they can only be perceived or detected by applying mathematical solutions through pattern recognition algorithms [19].
In data mining, pattern discovery or pattern recognition is the process of identifying relationships, arrangements, configurations, outlines, or repetitions of features from the bulk of information contained in a dataset through machine learning algorithms. In this definition, the information or knowledge is already there in the form of records, and the search for relationships among the features of one record to another pertains to the process of extracting patterns. The set or subset of features used in pattern recognition is referred to as the feature vector, which is composed of the equivalent numerical values of each feature packed into one multi-dimensional array [20].
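Building such a feature vector from one student record can be sketched as follows; the record fields and the categorical encoding are illustrative assumptions, not the authors' actual scheme.

```python
# One hypothetical student record (a row of the dataset)
record = {"gender": "F", "age": 17, "exam": 88, "gwa": 1.75}

GENDER_CODE = {"F": 0, "M": 1}   # categorical attribute -> numerical value

# Pack the equivalent numerical value of each feature into one array
feature_vector = [
    GENDER_CODE[record["gender"]],
    record["age"],
    record["exam"],
    record["gwa"],
]
print(feature_vector)  # [0, 17, 88, 1.75]
```

Every record in the dataset is encoded the same way, so pattern recognition algorithms can compare rows purely by arithmetic on these vectors.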
Generally, pattern discovery can be summarized as involving two aspects, classification and clustering, which then identify the type of machine learning implemented, whether supervised or unsupervised learning [21], as presented in Figure 5.
Figure 5. Data mining aspects and underlying techniques
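The two aspects can be contrasted in a few lines: classification learns from labelled examples, while clustering groups unlabelled values on its own. The scores, labels, and two-cluster setup below are synthetic assumptions for illustration.

```python
# --- Classification (supervised): labels are known in advance ---
labelled = [(95, "science"), (90, "science"), (60, "arts"), (55, "arts")]

def classify(score):
    # 1-nearest-neighbour on the single 'score' feature
    return min(labelled, key=lambda p: abs(p[0] - score))[1]

# --- Clustering (unsupervised): no labels, the data groups itself ---
def kmeans_1d(xs, iters=10):
    c1, c2 = min(xs), max(xs)                     # initial centroids
    for _ in range(iters):
        g1 = [x for x in xs if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in xs if abs(x - c1) > abs(x - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return sorted(g1), sorted(g2)

print(classify(88))                 # science
print(kmeans_1d([95, 90, 60, 55]))  # ([55, 60], [90, 95])
```

The classifier needs the label column to exist beforehand; the clustering routine discovers the same two groups without ever seeing a label, which is exactly the supervised/unsupervised split Figure 5 summarizes.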
2.4.1. Classification (supervised learning)
Classification is a thorough data analysis task of finding a model that will embody, define, characterize, and distinguish data classes and concepts from other data classes and concepts belonging to the same dataset. This task particularly refers to the problem of ascertaining the category, label, or class of a new observation, taking into consideration that the category of the new observation is present among the categories, labels, or classes of the training set that defined the model. Because the observations, as well as their categories, are already known and naturally occur as part of the model, supervised machine learning is implemented [20]. These ideas are exhibited in Figure 6.
ISSN: 2252-8776
Int J Inf & Commun Technol, Vol. 11, No. 1, April 2022: 45-56
Figure 6. Supervised learning model
(image source: IBM https://developer.ibm.com/ articles/cc-models-machine-learning/)
As shown in Figure 6, supervised learning has a critic function that is triggered before rendering an output. This critic function serves as a rater of how accurate the output is and is therefore a crucial part of the learning system, as it indicates where adjustments can be made or whether a new observation is not present in the list of labels.
Under supervised learning, the task is to train the machine to predict the category of a new observation by referring to the results of all previous observations. In simple terms, it is like teaching a child to learn the red, green, and blue colors of objects around him, then showing him a new object and asking him to identify whether its color is red, green, or blue [22].
Technically speaking, labels or categories can be identified because they are numerically quantifiable, as in our example of color, which is a combination of certain numerical values in either the red, green, blue (RGB) or cyan, magenta, yellow, black (CMYK) color model. Thus, classification may involve a certain process to identify or recognize labels, categories, or classes. If the method of identification involves meeting a series of feature conditions in an if-then-else fashion, or finding differences with other features' values, then it basically falls under decision trees. Meanwhile, if the recognition procedure involves connecting or linking features based on how frequently they naturally occur in the dataset, it involves associative modeling, or simply association. Finally, if the process of classification consists of meeting certain criteria, thresholds, or weights, then it involves rule induction [23].
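The if-then-else style of identification can be illustrated with a toy sketch on the RGB color example above. This example is illustrative only and is not part of the study; it simply labels a color by its dominant channel:

```python
def classify_color(r, g, b):
    """Toy rule-based classifier: pick the label of the dominant RGB channel."""
    if r >= g and r >= b:
        return "red"
    elif g >= b:
        return "green"
    else:
        return "blue"

print(classify_color(200, 30, 10))   # dominant red channel -> "red"
```

A chain of such feature conditions, grown automatically from data, is exactly what a decision tree encodes.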
Currently, there are several commonly used classifiers in the fields of data mining and AI. However, for conciseness, this paper succinctly discusses decision trees, linear regression, multilayer perceptron, and random forest, which were utilized in this research through Weka.
a. Decision trees and rules
Decision trees and rules use univariate splits, giving the inferred model a simple representational form of the decision-making structure that end-users can comprehend easily [1]. The algorithm depends on likelihood-based model-evaluation methods, with varying degrees of sophistication in penalizing model complexity. Technically, multiple nested if-then-else statements are used to model varying inputs with similar outcomes until a decision is reached, which then becomes a rule [21]. Figure 7 illustrates the general diagram of a decision tree.
A decision tree can be formalized into decision rules, where the contents of each leaf node become the outcome while the if clause is the path of conditions leading to it. Each rule has the following general form [1]: if condition 1 and condition 2 and … condition n then outcome.
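A small rule set in this general form might be sketched as follows. The attribute names follow the model codes of Table 1, but the thresholds and outcomes are hypothetical illustrations, not rules mined in this study:

```python
# Each rule: (list of condition predicates over a record, outcome label).
rules = [
    ([lambda r: r["HS_MATH"] >= 90, lambda r: r["ATEST_SCORE"] >= 85], "BS MATH"),
    ([lambda r: r["HS_ENGLISH"] >= 90], "AB ENGLISH"),
]

def apply_rules(record, default="UNDECIDED"):
    """Fire the first rule whose conditions all hold (if c1 and c2 ... then outcome)."""
    for conditions, outcome in rules:
        if all(cond(record) for cond in conditions):
            return outcome
    return default

student = {"HS_MATH": 93, "ATEST_SCORE": 88, "HS_ENGLISH": 85}
print(apply_rules(student))   # both conditions of the first rule hold -> "BS MATH"
```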
Multiple educational data mining approaches to discover patterns … (Julius Cesar O. Mamaril)
Figure 7. General diagram of a decision tree
b. Linear regression
Linear regression is a direct approach to modeling the relationship between a dependent variable and one or more independent variables through linear predictor functions such as the mean function [2]. Practical applications of linear regression usually fall into two categories: i) fitting a predictive model to observed values of both dependent and independent variables and ii) quantifying the strength of the relationship between the dependent and independent variables [1]. The first category is usually used for prediction or forecasting and error reduction, while the second is usually utilized in explaining how the dependent variable varies with different values of the independent variables.
Figure 8 shows a general graph of the mean values of x and y, which are used in linear regression analysis to predict the next mean values.
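A minimal least-squares fit built around the means of x and y can be sketched as follows (illustrative only). Note that the fitted line always passes through the point of mean values, which is the property the figure depicts:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x, computed around the means of x and y."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx          # the fitted line passes through (mx, my)
    return a, b

a, b = fit_line([1, 2, 3, 4], [2, 4, 6, 8])  # perfectly linear toy data
print(a, b)                                  # intercept 0.0, slope 2.0
```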
Figure 8. General graph of a linear regression thru mean values
c. Multilayer perceptron
A multilayer perceptron (MLP) is a type of feed-forward artificial neural network that usually consists of an input layer at one end, at least one hidden layer in the middle, and an output layer at the other end. Each node except the input nodes is considered a neuron that uses a nonlinear activation function to produce a value forwarded for use by the next layer [4]. MLP most of the time uses a form of supervised learning technique commonly known as backpropagation during training to adjust weight and bias values until the desired output is achieved [2]. Figure 9 exhibits a general diagram of a multilayer perceptron. MLP makes a suitable classifier algorithm due to its ability to create models through regression analysis and statistically solve problems with random probability distributions [16].
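The layered forward pass described above can be sketched in a few lines. The weights below are hand-picked (not trained by backpropagation) so that the network reproduces the XOR function, a classic example of a problem that a single-layer perceptron cannot solve but a network with one hidden layer can:

```python
import math

def sigmoid(z):
    """Nonlinear activation function used by each neuron."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One dense layer: each neuron activates on the weighted sum of its inputs."""
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def mlp_forward(x):
    # 2 inputs -> 2 hidden neurons -> 1 output; hand-picked weights model XOR.
    hidden = layer(x, [[20, 20], [-20, -20]], [-10, 30])
    output = layer(hidden, [[20, 20]], [-30])
    return output[0]

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, round(mlp_forward([a, b])))   # 0, 1, 1, 0 respectively
```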
Figure 9. General diagram of a multilayer perceptron
d. Random forest
Random forest is an ensemble of learning methods for classification tasks that forms multiple decision trees during the training period and outputs the class, or mean prediction, of the individual trees formed [1]. Figure 10 shows the general diagram of a random forest algorithm. Random forest usually corrects the over-fitting behavior of simple decision trees, which makes it a good classifier algorithm [16].
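The majority-vote idea can be sketched with three toy "trees", written by hand here rather than grown from bootstrap samples; the feature meanings (high school Math grade, admission test score) and thresholds are hypothetical:

```python
from collections import Counter

# Three toy decision trees over a record (hs_math, atest_score).
def tree1(r): return "BS MATH" if r[0] >= 85 else "BSBA"
def tree2(r): return "BS MATH" if r[1] >= 80 else "BSBA"
def tree3(r): return "BS MATH" if r[0] + r[1] >= 160 else "BSBA"

def forest_predict(record, trees=(tree1, tree2, tree3)):
    """Random forest output: the majority vote of the individual trees."""
    votes = Counter(t(record) for t in trees)
    return votes.most_common(1)[0][0]

print(forest_predict((90, 75)))   # votes: BS MATH, BSBA, BS MATH -> "BS MATH"
```

Because each tree sees the data differently, an individual tree's over-fitted mistake is usually outvoted by the rest of the forest.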
Figure 10. General diagram of a random forest
2.4.2. Clustering (unsupervised learning)
Clustering, or unsupervised learning, involves the problem of classifying a new observation from a set of observations without any categories, labels, or classes. Because it does not know what category or label to use to identify a new observation, its aim is to find one without directly tagging each observation with a certain label, category, or class.
To identify observations without tagging or labeling, it uses the numerical features of each observation, maps them, and groups them according to their values, so that when a new observation arrives, it can tell whether its feature values are far from, near, or within the group of previous observations [22]. This process can be seen in Figure 11.
As clearly shown in Figure 11, unsupervised learning does not incorporate a critic function and uses all, or mostly all, of the numerical values of the features it has observed. Thus, it cannot provide an accuracy rating for each output [24]. Most unsupervised learning algorithms use the nearest neighbor concept to tell how far or near the features of a new observation are to those of previous observations [25].
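A minimal sketch of this nearest-neighbor assignment, assuming each group of previous observations is summarized by a centroid in feature space (the group names and coordinates are toy values):

```python
import math

def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_group(point, centroids):
    """Assign a new observation to the closest group by feature-space distance."""
    return min(centroids, key=lambda name: euclidean(point, centroids[name]))

# Two groups of previous observations, summarized by their centroids.
centroids = {"group_a": (1.0, 1.0), "group_b": (8.0, 8.0)}
print(nearest_group((2.0, 1.5), centroids))   # closer to group_a
```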
Figure 11. Unsupervised learning model
(image source: IBM https://developer.ibm.com/ articles/cc-models-machine-learning/)
3. RESEARCH METHOD
This paper proposed a framework for an effective educational process that uses data mining techniques to uncover hidden trends and patterns and make accurate predictions, based on a higher level of analytical sophistication, in the university admission process.
The major impact of this study is that it helps the administration make better decisions by projecting the number of expected registrants for the following semester and alerting the concerned colleges and departments about their enrollment readiness in their respective areas of responsibility. The number of teachers, classrooms, classroom equipment, materials and tools, pertinent supplies, and non-academic personnel can all be assessed for sufficiency using the forecasted data, and decisions, actions, and reporting can be made quickly to ensure smooth enrollment and class operations.
This paper involved three major tasks: attribute selection, pattern discovery, and prediction. These three tasks were performed in Weka, an open-source data mining software package with a collection of machine learning algorithms for data mining tasks; it contains tools for data preparation, classification, regression, clustering, association rule mining, and visualization [9]. The researcher used a feature selection technique to find the traits or features that have the greatest impact on the outcome variable. Weka offers a diverse set of attribute selection algorithms that can be categorized in a variety of ways. One of the most prominent categorizations is the way in which attributes are evaluated, by which the algorithms can be classed as filters and wrappers: wrappers employ the performance of learning algorithms to determine the desirability of an attribute subset, while filters pick and analyze features independently of learning techniques.
The attribute selection task used Weka's correlation attribute eval algorithm to find which among the 11 fundamental student attributes shown in Table 1 can be used as strong predictors of successful graduation from a student's college program. Each attribute was assigned a model code identifier for brevity. Meanwhile, for the pattern recognition and prediction tasks, a two-year record set consisting of 1,817 records of students admitted to a local university in school year 2014 who graduated in 2018 was pre-processed and converted into a comma-separated value (CSV) file using the attributes' model codes in Table 1 as column headers. The labels chosen for the model to classify were the fourteen university program offerings shown in Table 2.
For the input vector, seventy percent (70%), or a total of 1,272 records, was apportioned as the training dataset, while the remaining 30%, or 545 records, was set aside as the test dataset. Given the 70%-30% proportion, Weka chose the respective 70% training and 30% test datasets. Multiple data mining predictive techniques were utilized, including multilayer perceptron, decision trees, linear regression, and random forest.
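The 70%-30% apportionment can be reproduced outside Weka with a simple shuffled split. The seed below is an assumed value for reproducibility only, not the split Weka actually chose:

```python
import random

records = list(range(1817))        # stand-ins for the 1,817 student records
rng = random.Random(42)            # fixed seed (assumption) so the split is repeatable
rng.shuffle(records)

split = round(len(records) * 0.70)             # 70% of 1,817 rounds to 1,272
train, test = records[:split], records[split:]
print(len(train), len(test))                   # 1272 545
```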
Table 1. Student attributes and model code
# Attribute Description Model code
1 General weighted average High school general weighted average HS_GWA
2 Admission test score (ATS) College admission test score ATEST_SCORE
3 Preferred program Preferred collegiate program of the student PREF_PROGRAM
4 Prescribed program Prescribed college program to the student based on university policy PRESC_PROGRAM
5 Final math grade Final mathematics average grade of student in high school report card HS_MATH
6 Final science grade Final science and technology average grade of student in high school report card HS_SCIENCE
7 Final English grade Final English average grade of student in high school report card HS_ENGLISH
8 Admission test math score College admission test mathematics score ATEST_MATH
9 Admission test science score College admission test science and technology score ATEST_SCIENCE
10 Admission test English score College admission test English score ATEST_ENGLISH
11 College general weighted average (CGWA) General weighted average of student in his chosen college COLLEGE_GWA
Table 2. Model labels
# College program Assigned label
1 B.S. Biology BS BIO
2 A.B. English AB ENGLISH
3 A.B. Economics AB ECONOMICS
4 A.B. Public administration AB PA
5 B.S. Mathematics BS MATH
6 B.S. Computer science BSCS
7 B.S. Information and communications technology BSICT
8 B.S. Nutrition and dietetics BSND
9 B.S. Hospitality management BSHM
10 Bachelor of secondary education BSEd
11 B.S. Business administration BSBA
12 B.S. Social work BSSW
13 Bachelor of technology in teacher education BTTE
14 Bachelor in industrial technology BIT
4. RESULTS AND DISCUSSION
The major purpose of this study was to develop an educational data mining and knowledge discovery system for Pangasinan State University (PSU) in order to improve its college admission services by analyzing historical data on college admissions; identifying and choosing essential features or qualities from student data entry credentials to include in a dynamic rule-based approach to data mining; discovering knowledge using various data mining approaches; assessing the resulting system's acceptability among stakeholders; and identifying how the discovered patterns may be used to improve college admission services.
Under the present admission process, each admission officer manually assesses each candidate's data against the numerous entrance requirements before making admission and placement decisions. This method takes far too long in terms of processing time. Thus, the PSU mining and knowledge discovery system was created by mining the historical databases and discovering knowledge.
Classification (supervised learning) and association rule mining employing the Apriori algorithm were used to determine the most frequent attribute values in a topic dataset and then reduce the attributes to reveal distinctive patterns. These discovered patterns serve as the foundation for establishing the ideal course for students and for admission forecasting, which is useful for enrollment capacity planning.
Figure 12 exhibits the enrollment forecasting generated by the developed customized data mining and knowledge discovery system. The researchers made use of previous first-semester enrollment records from school years 2013 to 2018 and projected the first-semester 2019 enrollment. As can be gleaned from the graph, the forecast estimated 6,972 enrollees; however, only 6,216 enrolled, making the forecast 88.59% accurate, which is still above the 85% forecast accuracy threshold.
Figure 13 shows the attribute selection task performed using Weka's correlation attribute eval algorithm. Based on the graph, the admission test score in Science is the strongest predictor, with a correlation rate of 4.94%, followed by the final grade in Math during high school, the admission test score, the high school Science final grade, the high school general weighted average, and the admission test scores in Math and in English. The correlation also shows that the rest of the attributes are weaker predictors if a cut-off of 0.1% is imposed.
Figure 14 presents the prediction accuracy of the four subject supervised learning classifiers (simple linear regression, decision table, multilayer perceptron, and random forest), each of which was trained with 70% of the dataset, with the remaining 30% utilized as the test dataset, using Weka. The graph shows that the multilayer perceptron rendered a significant average accuracy of 92.12% on its predictions, followed by the decision table approach (91.33%), simple linear regression (89.78%), and the random forest approach (89.15%), respectively.
Figure 12. Graph of enrollment forecasting generated
Figure 13. Graph of attribute selection task
performed using Weka’s correlation attribute eval
algorithm
Figure 14. Graph of prediction accuracy among the four subject supervised learning classifiers
5. CONCLUSION
The careful selection of attributes that embody the data mining model is a crucial task in order to gain substantially accurate results. The results show that the university program of college freshmen can be predicted with a significant level of accuracy, using students' grades from high school report cards and college admission test scores as EDM model attributes, by multiple supervised learning classifiers such as multilayer perceptron, decision table, linear regression, and random forest.
REFERENCES
[1] M. J. Zaki, W. Meira Jr, and W. Meira, Data mining and analysis: fundamental concepts and algorithms. Cambridge University
Press, 2014.
[2] D. L. Poole and A. K. Mackworth, Artificial Intelligence: foundations of computational agents, 2nd ed. Cambridge University
Press, 2010.
[3] R. T. Watson, Data Management: Databases and Organizations, 6th ed. Prospect Press, 2015.
[4] C. Romero, S. Ventura, M. Pechenizkiy, and R. S. J. d. Baker, Handbook of educational data mining. CRC Press, 2010.
[5] J. Grus, Data science from scratch: first principles with python. O’Reilly Media, 2019.
[6] F. Gorunescu, Data mining: concepts, models and techniques. Berlin: Springer-Verlag Berlin and Heidelberg GmbH and Co. K,
2013.
[7] J. Dean, Big data, data mining, and machine learning: value creation for business leaders and practitioners. John Wiley and
Sons, 2014.
[8] S. Huang and N. Fang, “Predicting student academic performance in an engineering dynamics course: A comparison of four types
of predictive mathematical models,” Computers & Education, vol. 61, pp. 133–145, Feb. 2013, doi:
10.1016/j.compedu.2012.08.015.
[9] E. Muratov, M. Lewis, D. Fourches, A. Tropsha, and W. C. Cox, “Computer-assisted decision support for student admissions
based on their predicted academic performance,” American Journal of Pharmaceutical Education, vol. 81, no. 3, p. 46, Apr.
2017, doi: 10.5688/ajpe81346.
[10] C. C. Aggarwal and Others, Data mining: The textbook. Springer, 2015.
[11] V. Kotu and B. Deshpande, Predictive Analytics and Data Mining. Elsevier, 2015.
[12] D. T. Larose and C. D. Larose, Discovering knowledge in data: an introduction to data mining. John Wiley and Sons, 2014.
[13] R. S. J. d. Baker and K. Yacef, “The state of educational data mining in 2009: A review and future visions,” Journal of
educational data mining, vol. 1, no. 1, pp. 3–17, 2009, doi: 10.5281/zenodo.3554657.
[14] C. Heumann, M. Schomaker, and Shalabh, Introduction to statistics and data analysis. Springer, 2016.
[15] B. Christian and T. Griffiths, Algorithms to live by: The computer science of human decisions. Henry Holt and Co.; 1st edition,
2016.
[16] P. Frasconi, N. Landwehr, G. Manco, and J. Vreeken, Machine learning and knowledge discovery in databases: European
Conference, ECML PKDD 2016, Riva Del Garda, Italy, September 19-23, 2016, Proceedings, Part II. Springer, 2016.
[17] S. Shalev-Shwartz and S. Ben-David, Understanding machine learning: From theory to algorithms. Cambridge University Press,
2013.
[18] University of Waikato, "Weka - Machine learning software in Java," 2014. http://www.cs.waikato.ac.nz/ml/weka/.
[19] Y. Kodratoff and R. S. Michalski, Machine learning: An artificial intelligence approach, volume 3. Elsevier, 2024.
[20] J. Bibodi, A. Vadodaria, A. Rawat, and J. Patel, “Admission prediction system using machine learning.” pp. 1–8, 2017.
[21] T. H. H. Cormen, C. E. E. Leiserson, R. L. L. Rivest, and C. Stein, Introduction to algorithms, 3rd ed. MIT Press, 2014.
[22] S. Russell and P. Norvig, Artificial intelligence: a modern approach, 3rd ed. Pearson, 2009.
[23] L. Rusdiana and Marfuah, “The Application of Determining Students’ Graduation Status of STMIK Palangkaraya Using K-
Nearest Neighbors Method,” IOP Conference Series: Earth and Environmental Science, vol. 97, p. 012040, Dec. 2017, doi:
10.1088/1755-1315/97/1/012040.
[24] R. E. Neapolitan and X. Jiang, Artificial Intelligence With an Introduction to Machine Learning, vol. 53, no. 9. Chapman and
Hall/CRC, 2019.
[25] S. Asadianfam, S. Shamsi, and M. Asadianfam, "Predicting academic major of students using Bayesian networks to the case of Iran," International Journal of Computer-Aided Technologies (IJCAx), vol. 2, no. 3, p. 7, 2015, doi: 10.5121/ijcax.2015.2304.
BIOGRAPHIES OF AUTHORS
Julius Cesar O. Mamaril is an Associate Professor at the College of Computing Sciences, Pangasinan State University, Philippines. He holds a degree in Computer Science (BSCS), a Master in Public Administration (MPA), and a Doctorate in Business Administration (DBA), and is currently completing his Doctorate in Information Technology (DIT) at the Technological Institute of the Philippines, Manila. He is a former Dean of the College of Computing Sciences and Deputy Executive Director of the Open University Systems, and is currently the Director of the ICT Management Office, formerly the MIS Office. He can be contacted at email: jmamaril@psu.edu.ph.
Melvin A. Ballera is a professor at the Technological Institute of the Philippines, Manila, with a PhD in Computer Science (FIT/SGS), De La Salle University. He has held positions including academic coordinator and director for linkages, and is currently the Assistant Research Director. He has published numerous papers in the areas of Artificial Intelligence, E-Learning, and Instructional Technology. He can be contacted at email: melvin.ballera@tip.edu.ph.