This document describes the AugmentED model for predicting students' academic performance using multisource behavioral data. The model consists of three modules: (1) A Data Module that aggregates behavioral data from multiple sources on campus and extracts features representing students' behavioral changes. (2) A Prediction Module that uses machine learning algorithms to predict academic performance as a classification problem. (3) A Feedback Module that provides individualized feedback to students based on predictions and feature analysis. The model is examined using a dataset of 156 college students, showing it can predict academic performance with high accuracy by analyzing patterns in students' campus lifestyle behaviors.
A comparative study of machine learning algorithms for virtual learning envir...IAESIJAI
Virtual learning environments are becoming an increasingly popular study option for students from diverse cultural and socioeconomic backgrounds around the world. Although this learning environment is quite adaptable, improving student performance is difficult due to the online-only learning method. It is therefore essential to investigate students' participation and performance in virtual learning in order to improve their outcomes. Using the publicly available Open University learning analytics dataset, this study examines a variety of machine learning-based prediction algorithms to determine the best method for predicting students' academic success, thereby providing additional options for enhancing their academic achievement. Support vector machines, random forests, naïve Bayes, logistic regression, and decision trees are employed for prediction. Random forest and logistic regression predict student performance with the highest average accuracy compared to the alternatives, while in a number of instances the support vector machine has been seen to outperform the other methods.
This document summarizes a literature review that analyzed research predicting student performance and dropout rates using machine learning techniques. The review identified 78 relevant papers published between 2009 and 2021. These papers mostly used student data from universities and MOOC platforms to test machine learning classifiers for predicting at-risk students and dropout likelihood. The review found that machine learning methods effectively predicted student performance and helped universities develop intervention strategies to improve student outcomes.
A Systematic Literature Review Of Student Performance Prediction Using Machi...Angie Miller
This document summarizes a systematic literature review of research predicting student performance using machine learning techniques. The review examined studies from 2009 to 2021 that identified students at risk of dropping out. It found that various machine learning methods were used to understand challenges and predict performance. Most studies used data from university databases and online learning platforms. Machine learning was shown to effectively predict student risk levels and dropout rates, helping improve student outcomes.
Predicting student performance in higher education using multi-regression modelsTELKOMNIKA JOURNAL
Supporting the goal of higher education to produce graduates who will be professional leaders is crucial. Most universities implement an intelligent information system (IIS) to support achieving their vision and mission. One of the features of an IIS is student performance prediction. By implementing a data mining model in the IIS, this feature can precisely predict students' grades for their enrolled subjects. Moreover, it can recognize at-risk students and allow top educational management to take educative interventions so that those students succeed academically. In this research, a multi-regression model was proposed to build a model for every student, computed from learning management system (LMS) activity logs. Testing on large datasets of students, courses, and activities indicates that these models can improve prediction accuracy by over 15%.
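The per-student regression idea described above can be sketched as follows; the activity features, synthetic data, and function names are illustrative assumptions, not the paper's actual variables.

```python
import numpy as np

def fit_student_model(X, y):
    """Fit a multiple linear regression for one student.

    X: (n_subjects, n_features) LMS activity counts per subject
       (e.g. logins, forum posts, resource views -- assumed features).
    y: (n_subjects,) grades for the subjects the student enrolled in.
    Returns the coefficient vector (intercept first) via least squares.
    """
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

def predict_grade(coef, x):
    """Predict a grade for a new subject from its activity features."""
    return coef[0] + x @ coef[1:]

# Synthetic example: one student, 6 subjects, 2 activity features.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(6, 2))
true = np.array([50.0, 2.0, 3.0])   # assumed intercept + weights
y = true[0] + X @ true[1:]
coef = fit_student_model(X, y)
print(np.round(coef, 2))            # recovers [50.  2.  3.]
```

Fitting one small model per student, rather than one global model, is what lets the coefficients reflect how *that* student's activity relates to their grades.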
The document discusses using Learning Factor Analysis (LFA), an educational data mining technique, to model student knowledge based on student-tutor interaction log data. LFA uses a multiple logistic regression model with difficulty factors defined by subject experts to quantify skills. A combinatorial search method called A* search is used to select the best-fitting model. The document illustrates applying LFA to data from an online math tutor, identifying 5 skills and presenting the results of the logistic regression modeling, including fit statistics and learning rates for skills. Learning curves are used to visualize student performance over time.
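The logistic-regression core of LFA is commonly written in additive-factor form: the log-odds of a correct response is a sum of student proficiency, skill easiness, and a skill learning rate multiplied by practice opportunities. A minimal numeric sketch, with all parameter values assumed purely for illustration:

```python
import math

def afm_probability(theta, beta, gamma, opportunity):
    """Additive-factor form of the LFA logistic model:
    logit P(correct) = theta + beta + gamma * opportunity,
    where theta is student proficiency, beta the skill's easiness,
    gamma the skill's learning rate, and opportunity the number of
    prior practice attempts on that skill. Values are illustrative.
    """
    z = theta + beta + gamma * opportunity
    return 1.0 / (1.0 + math.exp(-z))

# A student improving with practice on one skill:
probs = [afm_probability(theta=0.0, beta=-1.0, gamma=0.5, opportunity=t)
         for t in range(5)]
print([round(p, 3) for p in probs])   # monotonically increasing
```

A positive learning rate gamma is exactly what produces the downward-sloping error-rate learning curves mentioned above.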
A Study on Learning Factor Analysis – An Educational Data Mining Technique fo...iosrjce
IOSR Journal of Computer Engineering (IOSR-JCE) is a double-blind peer-reviewed international journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes high-quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high-quality technical notes are invited for publication.
This document discusses using data mining techniques to analyze faculty performance at an engineering college in India. It proposes analyzing 4 parameters - student complaints, feedback, results, and reviews - to evaluate faculty instead of just 2 parameters (feedback and results) used previously. It will use opinion mining to analyze faculty performance and calculate scores. The system will collect data, preprocess it, apply a KNN algorithm to the 4 parameters to calculate scores for each faculty, sum the scores, classify results using rule-based classification, and analyze outcomes by subject and class. It reviews related work applying educational data mining and concludes the multiple classifier approach is better, and future work could consider more parameters and expand to all college branches and departments.
The big data concept emerged to meet the growing demands of analysing large volumes of fast-moving, heterogeneous, and complex data, which traditional data analysis systems could no longer manage. The application of big data technology across various sectors of the economy has aided better utilization of the multiple data collated, and hence decision making. Organizations no longer base operations solely on assumptions or constructed models, but can make inferences from generated data. Educational organizations are more efficient, and their pedagogical processes more effective, when multiple streams of data can be collated from the various personnel and facilitators involved. This data, when analysed, maximizes the performance of administrators and recipients alike. This paper looks at the components and techniques of big data technology, and how it can be implemented in the education system for effective administration and delivery.
Assignments As Influential Factor To Improve The Prediction Of Student Perfor...Kate Campbell
This document discusses using assignment information to predict student performance in online courses. It proposes representing assignment data using Multiple Instance Learning (MIL) to better handle sparse data. The study compares:
1) Representing assignments as single instances vs MIL representation. Algorithms using MIL outperform single instance by over 20%, showing MIL better captures assignment information.
2) Predictions using only assignment data represented with MIL vs prior studies using other factors like demographics and interactions. MIL assignment models achieve competitive results, showing assignments are a relevant predictive factor.
3) The document concludes that representing assignments with MIL reveals their importance in predicting student success, whereas the sparse nature of assignment data had limited its consideration in prior studies.
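A simple way to see the MIL idea is to treat each student as a bag containing a variable number of per-assignment instances, then summarize the bag before classification. The features and the mean/max aggregation below are a minimal illustrative baseline, not the paper's exact representation.

```python
import numpy as np

def bag_embedding(bag):
    """Collapse a bag of per-assignment instances into one fixed-length
    vector by concatenating mean and max statistics -- a common simple
    MIL baseline that tolerates bags of different sizes."""
    arr = np.asarray(bag, dtype=float)
    return np.concatenate([arr.mean(axis=0), arr.max(axis=0)])

# Each student = a bag with a different number of assignments;
# per-assignment features (assumed): [score, days_late].
students = {
    "s1": [[0.9, 0], [0.8, 1]],
    "s2": [[0.4, 5], [0.3, 7], [0.5, 2]],
}
embeddings = {sid: bag_embedding(bag) for sid, bag in students.items()}
print(embeddings["s1"])   # [mean_score, mean_late, max_score, max_late]
```

Because every bag maps to the same fixed-length vector regardless of how many assignments a student submitted, any standard classifier can then be trained on the embeddings, which is how MIL sidesteps the sparsity problem the abstract describes.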
Clustering Students of Computer in Terms of Level of ProgrammingEditor IJCATR
Educational data mining (EDM) is one of the applications of data mining. In educational data mining there are two key domains, i.e. the student domain and the faculty domain. Different types of research work have been done in both domains.
In the existing system, faculty performance is calculated on the basis of two parameters, i.e. student feedback and student results in that subject. The existing system defines two approaches, a multiple-classifier approach and a single-classifier approach, and compares them for the relative evaluation of faculty performance using data mining techniques. In the multiple-classifier approach, K-nearest neighbour (KNN) is used in the first step and rule-based classification in the second step, while in the single-classifier approach only KNN is used in both steps of classification.
In the proposed system, however, I will analyse faculty performance using four parameters, i.e. student complaints about faculty, student review feedback for faculty, student feedback, and student results.
For the proposed system I will use opinion mining techniques to analyse faculty performance and calculate a score for each faculty member.
Predicting students’ intention to continue business courses on online platfor...Samsul Alam
The objective of this study was to analyze the intention of a university's business department students to continue their studies on e-learning platforms during the ongoing COVID-19 pandemic. To this end, a questionnaire was developed to collect primary data from students in business fields. The study took into account more than 285 respondents from two different universities and relied on expectation confirmation model (ECM) theory and structural equation modeling; the partial least squares (PLS-SEM) method was used to analyze the data. The results showed that task skills (TS) and task challenges (TC) were significant for students' enjoyment (EN), which in turn had a positive effect on satisfaction levels. Confirmation (CON) had an impact on post-adoption perceived usefulness (PAPU), which was deemed positive for student satisfaction (SAT). Finally, both SAT and the psychological safety (PS) of online learning platforms were observed to positively influence continuance intention (CI) on e-learning platforms. Further research in this area could be useful in making decisions about promoting educational programs based on e-learning. The researchers recommend that academicians and policymakers ensure appropriate arrangements for teaching on e-learning platforms.
Understanding the role of individual learner in adaptive and personalized e-l...journalBEEI
The dynamic learning environment has emerged as a powerful platform in modern e-learning systems. Learning situations that constantly change have forced learning platforms to adapt and personalize their learning resources for students. Evidence suggests that adaptation and personalization of e-learning systems (APLS) can be achieved by utilizing learner modeling, domain modeling, and instructional modeling. In the APLS literature, questions have been raised about the role of the individual characteristics that are relevant for adaptation. With several options available, a further problem arises: the attributes of students in APLS often overlap and are not related across studies. Therefore, this study proposes a list of learner-model attributes in dynamic learning to support adaptation and personalization. The study was conducted by exploring concepts from literature selected on best-fit criteria. We then describe the important concepts in student modeling and provide definitions and examples of data values that researchers have used. We also discuss the implementation of the selected learner model in providing adaptation in dynamic learning.
UNIVERSITY ADMISSION SYSTEMS USING DATA MINING TECHNIQUES TO PREDICT STUDENT ...IRJET Journal
This document summarizes a research study that aimed to predict student performance and support decision making for university admission systems using data mining techniques. The study analyzed data from 2,039 students at a university in Saudi Arabia to compare the predictive power of different data mining classification models (ANN, decision trees, SVM, naive Bayes). It found that a student's score on the pre-admission Scholastic Proficiency Admission Test was the best predictor of their first year GPA. Based on this, the university adjusted its admission criteria to give greater weight to this pre-admission test score. After making this change, the number of students with high GPAs increased while the number with low GPAs decreased.
The paper examined the accessibility and efficiency of the developed Online Learning System (OLS) of Surigao del Sur State University Main Campus. It was based on the waterfall model, and descriptive research was applied to non-computer-program students. Data from the pre-assessment survey and interviews were treated using the weighted mean to determine the level of accessibility and efficiency of the developed online learning system. The accessibility and efficiency of information from the web, posting messages, and attending synchronous discussions show that users can manage to use the developed system; even a low-rated experience in uploading files can be developed while using the system. Thus, the developed system is important in maintaining teaching and learning, as online interaction can be used to enhance learning, especially for students who tend to hold back in the learning process. Bryan L. Guibijar, "Development of Online Learning System", published in International Journal of Trend in Scientific Research and Development (IJTSRD), ISSN: 2456-6470, Volume 6, Issue 4, June 2022. URL: https://www.ijtsrd.com/papers/ijtsrd50256.pdf Paper URL: https://www.ijtsrd.com/computer-science/data-processing/50256/development-of-online-learning-system/bryan-l-guibijar
AN INTEGRATED SYSTEM FRAMEWORK FOR PREDICTING STUDENTS’ ACADEMIC PERFORMANCE ...ijcsit
Accurate prediction and early identification of students at risk of attrition are of high concern for higher educational institutions (HEIs). It is of great importance not only to the students but also to the educational administrators and the institutions in the areas of improving academic quality and efficient utilisation of the available resources for effective intervention. However, despite the different frameworks and various models that researchers have used across institutions for predicting performance, only negligible success has been recorded in terms of accuracy, efficiency, and reduction of student attrition. This has been attributed to the inadequate and selective use of variables in the predictive models. This paper presents a multi-dimensional and integrated system framework that involves considerable learner input and engagement in predicting students' academic performance and intervention in HEIs. The purpose and functionality of the framework are to produce a comprehensive, unbiased, and efficient way of predicting student performance, implemented on top of multi-source data and a database system. It makes use of student demographic and learning management system (LMS) data from the institutional databases, as well as student psychosocial-personality (SPP) data from a survey collected from the students, to predict performance. The proposed approach is intended to be robust, generalizable, and able to give predictions at a higher level of accuracy that educational administrators can rely on for providing timely intervention to students.
Recommendation of Data Mining Technique in Higher Education Prof. Priya Thaka...ijceronline
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
PREDICTING ACADEMIC MAJOR OF STUDENTS USING BAYESIAN NETWORKS TO THE CASE OF ...ijcax
In this study, which took place current year in the city of Maragheh in IRAN. Number of high school students in the fields of study: mathematics, Experimental Sciences, humanities, vocational, business and science were studied and compared. The purpose of this research is to predict the academic major of high school students using Bayesian networks. The effective factors have been used in academic major selection
for the first time as an effective indicator of Bayesian networks. Evaluation of Impacts of indicators on each other, discretization data and processing them was performed by GeNIe. The proper course would be advised for students to continue their education.
Multiple educational data mining approaches to discover patterns in universit...IJICTJOURNAL
This paper presented the utilization of pattern discovery techniques by using multiple relationships and clustering educational data mining approaches to establish a knowledge base that will aid in the prediction of ideal college program selection and enrollment forecasting for incoming freshmen. Results show a significant level of accuracy in predicting college programs for students by mining two years of student college admission and graduation final grade scholastic records. The results of educational predictive data mining methods can be applied in improving the services of the admission department of an educational institution, particularly in its course alignment, student mentoring, admission forecast, marketing, and enrollment preparedness.
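The clustering step described above can be sketched with a plain k-means implementation; the admission features, the number of clusters, and the synthetic records here are assumptions for illustration, not the paper's actual data.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means -- an illustrative stand-in for the clustering
    approach described; features and k are assumptions."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]  # random initial centers
    for _ in range(iters):
        # assign each record to its nearest center
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        # move each center to the mean of its assigned records
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Synthetic admission records: [admission_score, final_grade] (assumed features)
X = np.array([[60, 65], [62, 63], [61, 64],
              [90, 92], [88, 91], [91, 89]], dtype=float)
labels, centers = kmeans(X, k=2)
print(labels)   # two clearly separated groups of applicants
```

Cluster membership of an incoming applicant's record against such centers is one simple way a knowledge base could suggest a likely college program.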
Students' Intention to Use Technology and E-learning probably Influenced by s...IRJET Journal
This document examines factors that influence students' intention to use technology and e-learning in Libyan higher education. It investigates the effects of computer-internet experience, computer self-efficacy, technology-internet quality, and attitudes toward use on intention to use technology and e-learning. It also examines potential differences based on gender and field of study. The document describes a study that distributed questionnaires to 217 students to test 14 hypotheses related to these factors and differences. The results found that computer-internet experience, computer self-efficacy, technology-internet quality, and attitudes toward use were all positively related to intention to use technology and e-learning. However, no significant differences were found based on gender or field of study.
Educational data mining is used to find interesting patterns in data taken from educational settings in order to improve teaching and learning. Assessing students' ability and performance with EDM methods in an e-learning environment for school-level math education in India has not been identified in our literature review. Our method is a novel approach to providing quality math education, with assessments indicating a student's knowledge level in each lesson. This paper illustrates how the learning curve, an EDM visualization method, is used to compare rural and urban students' progress in learning mathematics in an e-learning environment. The experiment was conducted in two different schools in Tamil Nadu, India. After practicing the problems, the students took the test; their interaction data were collected and their performance analyzed in different aspects: knowledge-component level, time taken to solve a problem, and error rate. This work studies student actions to identify learning progress. The results show that the learning curve method is very helpful for teachers in visualizing students' performance at a granular level, which is not possible manually. It also helps students know their skill level when they complete each unit.
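A learning curve of the kind used above can be computed directly from interaction logs by averaging errors at each practice opportunity; the log format and field names below are illustrative assumptions.

```python
from collections import defaultdict

def learning_curve(log):
    """Compute an error-rate learning curve per skill from a log.

    log: list of (student, skill, correct) tuples in time order.
    Returns {skill: [error_rate_at_opp_0, error_rate_at_opp_1, ...]},
    where the n-th entry averages errors over every student's n-th
    attempt at that skill.
    """
    attempts = defaultdict(int)                    # (student, skill) -> count
    buckets = defaultdict(lambda: defaultdict(list))
    for student, skill, correct in log:
        opp = attempts[(student, skill)]
        buckets[skill][opp].append(0 if correct else 1)
        attempts[(student, skill)] += 1
    return {skill: [sum(v) / len(v) for _, v in sorted(opps.items())]
            for skill, opps in buckets.items()}

# Two students practicing one skill; errors decline with practice.
log = [("a", "fractions", False), ("b", "fractions", False),
       ("a", "fractions", True),  ("b", "fractions", False),
       ("a", "fractions", True),  ("b", "fractions", True)]
print(learning_curve(log))   # error rate falls: 1.0 -> 0.5 -> 0.0
```

Plotting each skill's list against opportunity number gives exactly the per-knowledge-component curves that let a teacher see progress at a granular level.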
Data Mining for Education
Ryan S.J.d. Baker, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
rsbaker@cmu.edu
Article to appear as
Baker, R.S.J.d. (in press) Data Mining for Education. To appear in McGaw, B., Peterson, P.,
Baker, E. (Eds.) International Encyclopedia of Education (3rd edition). Oxford, UK: Elsevier.
This is a pre-print draft. Final article may involve minor changes and different formatting.
A Survey on Educational Data Mining TechniquesIIRindia
Educational data mining (EDM) has a high impact in the academic domain. The methods used in this field play a major role in increasing knowledge among students. EDM explores students' behavioral patterns and gives ideas for understanding them, helping students choose a correct path for their careers. This survey focuses on this category and discusses the various techniques involved in educational data mining for knowledge improvement. It also discusses different types of EDM tools and techniques; among them, the best categories are suggested for real-world usage.
The document discusses educational data mining and a proposed Student-Staff-Tutor (SSTT) framework. It summarizes the following:
1) Educational data mining uses techniques like machine learning, statistics and data mining to analyze educational data to better understand the learning process and student performance.
2) The SSTT framework models relationships between students, staff, and tutors and how these interactions impact student learning and outcomes.
3) An experiment applies clustering and social network analysis to educational data to analyze student knowledge distribution and interactions under the SSTT framework. The results found tutors play an important role in strengthening student-staff relationships and improving student performance.
How to Make a Field Mandatory in Odoo 17Celine George
In Odoo, making a field required can be done through both Python code and XML views. When you set the required attribute to True in Python code, it makes the field required across all views where it's used. Conversely, when you set the required attribute in XML views, it makes the field required only in the context of that particular view.
Assignments As Influential Factor To Improve The Prediction Of Student Perfor...Kate Campbell
This document discusses using assignment information to predict student performance in online courses. It proposes representing assignment data using Multiple Instance Learning (MIL) to better handle sparse data. The study compares:
1) Representing assignments as single instances vs MIL representation. Algorithms using MIL outperform single instance by over 20%, showing MIL better captures assignment information.
2) Predictions using only assignment data represented with MIL vs prior studies using other factors like demographics and interactions. MIL assignment models achieve competitive results, showing assignments are a relevant predictive factor.
3) The document concludes representing assignments with MIL reveals their importance in predicting student success, whereas the sparse nature of assignment data limited its consideration in
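The single-instance vs. MIL contrast summarized above can be illustrated with a minimal sketch. All field names, feature values, and the "max" bag rule below are hypothetical assumptions for illustration, not the study's actual representation:

```python
# Single-instance representation: one flat vector per student, which becomes
# sparse when assignments are missing (None = assignment never submitted).
single_instance = {"student_1": [0.8, None, 0.6]}

# MIL representation: each student is a *bag* of instances, one per submitted
# assignment; a missing assignment simply contributes no instance instead of
# forcing a null feature into a fixed-length vector.
mil_bags = {
    "student_1": [
        {"assignment": "a1", "score": 0.8, "days_early": 2},
        {"assignment": "a3", "score": 0.6, "days_early": 0},
    ]
}

def bag_prediction(bag, threshold=0.7):
    """Toy bag-level rule: the bag is positive ('likely to succeed') if any
    instance scores above the threshold (the classic MIL 'max' assumption)."""
    return any(inst["score"] > threshold for inst in bag)

print(bag_prediction(mil_bags["student_1"]))  # True: a1 scores 0.8 > 0.7
```

Real MIL classifiers learn the bag-level decision rather than using a fixed threshold, but the data-structure difference, bags of variable size instead of fixed sparse vectors, is the point the study exploits.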
Clustering Students of Computer in Terms of Level of ProgrammingEditor IJCATR
Educational data mining (EDM) is one of the applications of data mining. It has two key domains, the student domain and the faculty domain, and different types of research have been done in both.
In the existing system, faculty performance is calculated on the basis of two parameters: student feedback and the students' results in that subject. Two approaches are defined and compared for the relative evaluation of faculty performance using data mining techniques: a multiple-classifier approach and a single-classifier approach. In the multiple-classifier approach, K-nearest neighbor (KNN) is used in the first step and rule-based classification in the second step, while in the single-classifier approach only KNN is used in both steps.
In the proposed system, faculty performance is analyzed using four parameters: student complaints about faculty, student review feedback for faculty, student feedback, and student results.
The proposed system uses opinion mining to analyze faculty performance and to calculate a score for each faculty member.
Predicting students’ intention to continue business courses on online platfor...Samsul Alam
The objective of this study was to analyze the intention of a university's business department students to continue their studies on e-learning platforms during the ongoing COVID-19 pandemic. To this end, a questionnaire was developed to collect primary data from students in business fields. The study took into account more than 285 respondents from two different universities and relied on expectation confirmation model (ECM) theory and the structural equation model. The partial least squares (SEM-PLS) method was used to analyze the data. The results showed that task skills (TS) and task challenges (TC) were significant for the enjoyment (EN) of the students, which in turn had a positive effect on satisfaction levels. Confirmation (CON) had an impact on post-adoption perceived usefulness (PAPU), which was deemed positive for student satisfaction (SAT). Finally, both SAT and the psychological safety (PS) of online learning platforms were observed to positively influence continuance intention (CI) on e-learning platforms. Further research in this area could be useful in making decisions about promoting educational programs based on e-learning. The researchers recommend that academicians and policymakers ensure appropriate arrangements for teaching on e-learning platforms.
Understanding the role of individual learner in adaptive and personalized e-l...journalBEEI
The dynamic learning environment has emerged as a powerful platform in modern e-learning systems. The constantly changing learning situation has forced learning platforms to adapt and personalize their learning resources for students. Evidence suggests that adaptation and personalization of e-learning systems (APLS) can be achieved by utilizing learner modeling, domain modeling, and instructional modeling. In the APLS literature, questions have been raised about the role of the individual characteristics that are relevant for adaptation. With several options available, a new problem has arisen in which the attributes of students in APLS often overlap and are not related across studies. Therefore, this study proposed a list of learner model attributes in dynamic learning to support adaptation and personalization. The study was conducted by exploring concepts from the literature, selected based on the best criteria. We then described the important concepts in student modeling and provided definitions and examples of the data values that researchers have used. We also discussed the implementation of the selected learner model in providing adaptation in dynamic learning.
UNIVERSITY ADMISSION SYSTEMS USING DATA MINING TECHNIQUES TO PREDICT STUDENT ...IRJET Journal
This document summarizes a research study that aimed to predict student performance and support decision making for university admission systems using data mining techniques. The study analyzed data from 2,039 students at a university in Saudi Arabia to compare the predictive power of different data mining classification models (ANN, decision trees, SVM, naive Bayes). It found that a student's score on the pre-admission Scholastic Proficiency Admission Test was the best predictor of their first year GPA. Based on this, the university adjusted its admission criteria to give greater weight to this pre-admission test score. After making this change, the number of students with high GPAs increased while the number with low GPAs decreased.
The paper examined the accessibility and efficiency of the Online Learning System (OLS) developed at Surigao del Sur State University Main Campus. It was based on the waterfall model, and descriptive research was applied to non-computer-program students. Data from the pre-assessment survey and interviews were analyzed using the weighted mean to determine the level of accessibility and efficiency of the developed online learning system. The accessibility and efficiency of retrieving information from the web, posting messages, and attending synchronous discussions show that users can manage the developed system; even a low-rated experience in uploading files can improve while using the system. The developed system is thus important in sustaining teaching and learning, since online interaction can enhance learning, especially for students who tend to hold back in the learning process. Bryan L. Guibijar, "Development of Online Learning System," International Journal of Trend in Scientific Research and Development (IJTSRD), ISSN: 2456-6470, Volume 6, Issue 4, June 2022. URL: https://www.ijtsrd.com/papers/ijtsrd50256.pdf Paper URL: https://www.ijtsrd.com/computer-science/data-processing/50256/development-of-online-learning-system/bryan-l-guibijar
AN INTEGRATED SYSTEM FRAMEWORK FOR PREDICTING STUDENTS’ ACADEMIC PERFORMANCE ...ijcsit
Accurate prediction and early identification of students at risk of attrition are of high concern for higher educational institutions (HEIs). It is of great importance not only to the students but also to the educational administrators and the institutions in the areas of improving academic quality and efficiently utilising the available resources for effective intervention. However, despite the different frameworks and various models that researchers have used across institutions for predicting performance, only negligible success has been recorded in terms of accuracy, efficiency and reduction of student attrition. This has been attributed to the inadequate and selective use of variables in the predictive models. This paper presents a multi-dimensional, integrated system framework that involves considerable learner input and engagement in predicting academic performance and intervention in HEIs. The purpose of the framework is to produce a comprehensive, unbiased and efficient way of predicting student performance whose implementation is based upon multisource data and a database system. It makes use of student demographic and learning management system (LMS) data from the institutional databases, as well as student psychosocial-personality (SPP) data from a survey of the students, to predict performance. The proposed approach aims to be robust and generalizable and to give predictions at a level of accuracy that educational administrators can rely on for providing timely intervention to students.
Recommendation of Data Mining Technique in Higher Education Prof. Priya Thaka...ijceronline
PREDICTING ACADEMIC MAJOR OF STUDENTS USING BAYESIAN NETWORKS TO THE CASE OF ...ijcax
This study took place this year in the city of Maragheh, Iran. High school students in the fields of mathematics, experimental sciences, humanities, vocational studies, business, and science were studied and compared. The purpose of this research is to predict the academic major of high school students using Bayesian networks. The factors affecting academic major selection were used for the first time as indicators in Bayesian networks. Evaluating the indicators' impacts on each other, discretizing the data, and processing them were performed with GeNIe. An appropriate course of study can then be recommended to students continuing their education.
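As a rough illustration of the kind of model a tool like GeNIe builds, here is a minimal two-node discrete Bayesian network in Python. The indicator ("Interest"), its conditional probability table, and all probabilities are invented for illustration; the study's actual indicators and CPTs are not given here:

```python
# Hypothetical two-node network: Interest -> Major.
p_interest = {"math": 0.4, "humanities": 0.6}            # prior P(Interest)
p_major_given = {                                        # CPT P(Major | Interest)
    "math": {"mathematics": 0.7, "humanities": 0.3},
    "humanities": {"mathematics": 0.2, "humanities": 0.8},
}

def posterior_major(evidence_interest):
    """P(Major | Interest = evidence): with a single parent this is just a
    CPT lookup, the operation GeNIe automates for larger networks."""
    return p_major_given[evidence_interest]

def marginal_major():
    """P(Major) by summing out Interest: sum_i P(Major | i) * P(i)."""
    out = {}
    for i, pi in p_interest.items():
        for m, pm in p_major_given[i].items():
            out[m] = out.get(m, 0.0) + pm * pi
    return out

print(marginal_major())
```

Summing out Interest gives P(mathematics) = 0.7·0.4 + 0.2·0.6 = 0.40; a real network would chain many such indicators before recommending a major.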
Multiple educational data mining approaches to discover patterns in universit...IJICTJOURNAL
This paper presented the utilization of pattern discovery techniques by using multiple relationships and clustering educational data mining approaches to establish a knowledge base that will aid in the prediction of ideal college program selection and enrollment forecasting for incoming freshmen. Results show a significant level of accuracy in predicting college programs for students by mining two years of student college admission and graduation final grade scholastic records. The results of educational predictive data mining methods can be applied in improving the services of the admission department of an educational institution, particularly in its course alignment, student mentoring, admission forecast, marketing, and enrollment preparedness.
Students' Intention to Use Technology and E-learning probably Influenced by s...IRJET Journal
This document examines factors that influence students' intention to use technology and e-learning in Libyan higher education. It investigates the effects of computer-internet experience, computer self-efficacy, technology-internet quality, and attitudes toward use on intention to use technology and e-learning. It also examines potential differences based on gender and field of study. The document describes a study that distributed questionnaires to 217 students to test 14 hypotheses related to these factors and differences. The results found that computer-internet experience, computer self-efficacy, technology-internet quality, and attitudes toward use were all positively related to intention to use technology and e-learning. However, no significant differences were found based on gender or field of study.
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPRAHUL
This Dissertation explores the particular circumstances of Mirzapur, a region located in the
core of India. Mirzapur, with its varied terrains and abundant biodiversity, offers an optimal
environment for investigating the changes in vegetation cover dynamics. Our study utilizes
advanced technologies such as GIS (Geographic Information Systems) and Remote sensing to
analyze the transformations that have taken place over the course of a decade.
The complex relationship between human activities and the environment has been the focus
of extensive research and worry. As the global community grapples with swift urbanization,
population expansion, and economic progress, the effects on natural ecosystems are becoming
more evident. A crucial element of this impact is the alteration of vegetation cover, which plays a
significant role in maintaining the ecological equilibrium of our planet. Land serves as the
foundation for all human activities and provides the necessary materials for
these activities. As the most crucial natural resource, its utilization by humans results in different
'Land uses,' which are determined by both human activities and the physical characteristics of the
land.
The utilization of land is impacted by human needs and environmental factors. In countries
like India, rapid population growth and the emphasis on extensive resource exploitation can lead
to significant land degradation, adversely affecting the region's land cover.
Therefore, human intervention has significantly influenced land use patterns over many
centuries, evolving its structure over time and space. In the present era, these changes have
accelerated due to factors such as agriculture and urbanization. Information regarding land use and
cover is essential for various planning and management tasks related to the Earth's surface,
providing crucial environmental data for scientific, resource management, policy purposes, and
diverse human activities.
Accurate understanding of land use and cover is imperative for the development planning
of any area. Consequently, a wide range of professionals, including earth system scientists, land
and water managers, and urban planners, are interested in obtaining data on land use and cover
changes, conversion trends, and other related patterns. The spatial dimensions of land use and
cover support policymakers and scientists in making well-informed decisions, as alterations in
these patterns indicate shifts in economic and social conditions. Monitoring such changes with the
help of advanced technologies like Remote Sensing and Geographic Information Systems is
crucial for coordinated efforts across different administrative levels.
Changes in vegetation cover refer to variations in the distribution, composition, and overall
structure of plant communities across different temporal and spatial scales. These changes can
occur naturally.
This presentation was provided by Steph Pollock of The American Psychological Association’s Journals Program, and Damita Snow, of The American Society of Civil Engineers (ASCE), for the initial session of NISO's 2024 Training Series "DEIA in the Scholarly Landscape." Session One: 'Setting Expectations: a DEIA Primer,' was held June 6, 2024.
How to Setup Warehouse & Location in Odoo 17 InventoryCeline George
In this slide, we'll explore how to set up warehouses and locations in Odoo 17 Inventory. This will help us manage our stock effectively, track inventory levels, and streamline warehouse operations.
A workshop hosted by the South African Journal of Science aimed at postgraduate students and early career researchers with little or no experience in writing and publishing journal articles.
How to Manage Your Lost Opportunities in Odoo 17 CRMCeline George
Odoo 17 CRM allows us to track why we lose sales opportunities with "Lost Reasons." This helps analyze our sales process and identify areas for improvement. Here's how to configure lost reasons in Odoo 17 CRM
A review of the growth of the Israel Genealogy Research Association Database Collection over the last 12 months. Our collection has now passed the 3 million mark and is still growing. See which archives have contributed the most, the different types of records we have, and which years have had records added. You can also see what we have planned for the future.
This presentation includes basic of PCOS their pathology and treatment and also Ayurveda correlation of PCOS and Ayurvedic line of treatment mentioned in classics.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2020.3002791, IEEE Access
Academic Performance Prediction Based on Multisource, Multifeature Behavioral Data
Liang Zhao1,*, Kun Chen1, Jie Song1, Xiaoliang Zhu1, Jianwen Sun1, Brian Caulfield2, Brian Mac Namee2
1 National Engineering Laboratory for Educational Big Data (NELEBD), National Engineering Research Center for E-learning (NERCEL), Central China Normal University (CCNU), Wuhan, P. R. China
2 Insight Center for Data Analytics, University College Dublin (UCD), Dublin, Ireland
Corresponding author: Liang Zhao (liang.zhao@mail.ccnu.edu.cn)
The authors acknowledge the support received from the National Key R&D Program of China (Grant No: 2017YFB1401300 and
2017YFB1401303), the National Natural Science Foundation of China (Grant No: 61977030), and “the Fundamental Research Funds for the
Central University” (Grant No: 20205170443)
ABSTRACT Digital data trails from disparate sources covering different aspects of student life are stored
daily in most modern university campuses. However, it remains challenging to (i) combine these data to
obtain a holistic view of a student, (ii) use these data to accurately predict academic performance, and (iii)
use such predictions to promote positive student engagement with the university. To initially alleviate this
problem, in this paper, a model named Augmented Education (AugmentED) is proposed. In our study, (1)
first, an experiment is conducted based on a real-world campus dataset of college students (N = 156) that
aggregates multisource behavioral data covering not only online and offline learning but also behaviors inside
and outside of the classroom. Specifically, to gain in-depth insight into the features leading to excellent or
poor performance, metrics measuring the linear and nonlinear behavioral changes (e.g., regularity and
stability) of campus lifestyles are estimated; furthermore, features representing dynamic changes in temporal
lifestyle patterns are extracted by means of long short-term memory (LSTM) networks. (2) Second, machine
learning-based classification algorithms are developed to predict academic performance. (3) Finally,
visualized feedback enabling students (especially at-risk students) to potentially optimize their interactions
with the university and achieve a study-life balance is designed. The experiments show that the AugmentED
model can predict students’ academic performance with high accuracy.
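As a rough sketch of the feature-extraction step described above, the snippet below computes one hypothetical regularity metric: one minus the normalized Shannon entropy of the hour-of-day distribution of a student's logged campus events. This is an illustrative stand-in, not the paper's actual metric, and the example data are invented:

```python
import math
from collections import Counter

def regularity_score(event_hours):
    """1 - normalized Shannon entropy of the hour-of-day histogram.
    Near 1: events cluster at the same hours every day (regular lifestyle);
    0: events are spread uniformly over all 24 hours (irregular)."""
    counts = Counter(event_hours)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return 1.0 - entropy / math.log2(24)  # log2(24) = entropy of uniform hours

# Invented examples: fixed meal times vs. one event in every hour of the day.
regular = [8, 8, 8, 12, 12, 12, 18, 18, 18]
irregular = list(range(24))
print(regularity_score(regular))    # ~0.65: three fixed hours out of 24
print(regularity_score(irregular))  # ~0.0: maximally spread
```

A vector of such scores (one per behavior stream: eating, library entry, sleep, etc.) could then feed the classification step; the paper additionally learns dynamic temporal features with an LSTM, which is beyond this sketch.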
INDEX TERMS academic performance prediction, behavioral pattern, digital campus, machine learning (ML),
long short-term memory (LSTM)
I. INTRODUCTION
As an important step to achieving personalized education,
academic performance prediction is a key issue in the
education data mining field. It has been extensively
demonstrated that academic performance can be profoundly
affected by the following factors:
⚫ Students’ Personality (e.g., neuroticism, extraversion,
and agreeableness) [1-4];
⚫ Personal Status (e.g., gender, age, height, weight,
physical fitness, cardiorespiratory fitness, aerobic
fitness, stress, mood, mental health, intelligence, and
executive functions) [1-12];
⚫ Lifestyle Behaviors (e.g., eating, physical activity,
sleep patterns, social ties, and time management) [7-28];
and
⚫ Learning Behaviors (e.g., class attendance, study
duration, library entry, and online learning) [7,8,23-
26,28-38].
For example, [2] investigated the incremental validity of the
Big Five personality traits in predicting college GPA. [21]
demonstrated that physical fitness in boys and obesity status
in girls could be important factors related to academic
achievement. Meanwhile, [22] showed that a regular lifestyle
could lead to good performance among college students. [24]
showed that the degree of effort exerted while working could
be strongly correlated with academic performance.
Additionally, [32] showed that compared with high- and
medium-achieving students, low-achieving students were less
emotionally engaged throughout the semester and tended to
express more confusion during the final stage of the semester.
By analyzing the effect of the factors influencing
academic performance, many systems using data to predict
academic performance have been developed in the literature
[1-4,7,8,13-19,22-27,29-31,33,34,37-41]. For instance, in
[8], academic performance was predicted based on passive
sensing data and self-reports from students’ smart phones. In
[23], a multitask predictive framework that captures
intersemester and intermajor correlations and integrates
student similarity was built to predict students’ academic
performance. In [34], based on homework submission data,
the academic performance of students enrolled in a blended
learning course was predicted.
Based on their predicted academic performance, early
feedback and interventions can be individually applied to
at-risk students. For example, in [33], to help students with
a low GPA, basic interventions are defined based on GPA
predictions. However, research on feedback and intervention
is still at an early stage, and relatively few results have been
reported.
In recent years, compared with primary and secondary
education (i.e., K12) [6,10,12,17], increasing attention has
been paid to academic performance prediction for
higher education [7-9,14,15,22-25,27,28,30-32,36-38]. The
reasons contributing to this phenomenon warrant further
investigation and might include the following. First, for
college students on a modern campus, life involves a
combination of studying, eating, exercising, socializing, etc.
(see Fig. 1) [7,8,22-25,27,42,43]. All activities that students
engage in (e.g., borrowing a book from the library) leave a
digital trail in some database. Therefore, it is relatively easy
to track college students’ behaviors, e.g. online learning
behaviors captured from massive open online courses
(MOOC) and small private online courses (SPOC)
platforms [30-32,36-38]. Second, given the diverse range
of activities listed above, it can be difficult for college
students to maintain a balanced, self-disciplined, and healthy
university experience, including excellent academic
performance.
Although many academic performance prediction systems
have been developed for college students, the following
challenges persist: (i) capturing a sufficiently rich profile of a
student and integrating these data to obtain a holistic view; (ii)
exploring the factors affecting students’ academic
performance and using this information to develop a robust
prediction model with high accuracy; and (iii) taking
advantage of the prediction model to deliver personalized
services that potentially enable students to drive behavioral
change and optimize their study-life balance.
To address these challenges, four representative prediction
systems (including one online system and three offline
systems) are summarized in Table I. We first discuss the
online prediction system, System A [32] (proposed by Z. Liu).
This system is relatively simple because its data are captured
only from SPOC or MOOC platforms. Regarding the three
offline prediction systems, i.e., Systems B–D [8,22,24]
(proposed by R. Wang, Y. Cao, and Z. Wang, respectively),
as the number of data sources decreases, the corresponding
scale size rapidly increases; unfortunately, the number of
different types of behaviors that can be considered also
decreases. Ideally, multisource data at a medium/large scale
could lead to a better prediction system design. However, in
practice, due to limitations such as computing capability,
either data diversity or sample size is sacrificed during the
system design process.
TABLE I
FOUR TYPICAL PREDICTION SYSTEMS (PROPOSED BY PREVIOUS
RESEARCHERS)
To initially alleviate the challenges mentioned above, a
model named Augmented Education (AugmentED) is
proposed in this paper. As shown in Fig. 2, this model mainly
consists of the following three modules: (1) a Data Module in
which multisource data on campus covering a large variety of
data trails are aggregated and fused, and the
characteristics/features that can represent students’ behavioral
change from three different perspectives are evaluated; (2) a
Prediction Module in which academic performance
prediction is considered a classification problem that is solved
by machine learning (ML)-based algorithms; and (3) a
Feedback Module in which visualized feedback is delivered
individually based on the predictions made and feature
analysis. Finally, AugmentED is examined using a real-world
dataset of 156 college students.
Systems | Scale Size (N) | Data Source | Behaviors
Online: System A (single-source + medium scale) [32] | 243 | SPOC/MOOC | online study; discussions on the forum
Offline: System B (multisource + small scale) [8] | 30 | wearable sensors (smart phone: accelerometer, light sensor, microphone, GPS/Bluetooth); self-reports (SurveyMonkey, mobile EMA) | activity; conversation; sleeping; location; socializing; exercising; mental health; stress; mood
Offline: System C (almost multisource + medium scale) [24] | 528 | WiFi; campus network; smart card; class schedule | usage of smart card (showering, eating, consumption); trajectory (wake-up time…); network (network cost…)
Offline: System D (single-source + large scale) [22] | 18960 | smart card | usage of smart card (showering, eating, library entry-exit, fetching water)
The remainder of this paper is organized as follows. In
Section II, a literature review is given. In Section III, the
methodology of AugmentED is described in detail. In Section
IV, the experimental results are discussed and analyzed.
Finally, a brief conclusion is given in Section V.
FIGURE 1. Digital data remaining on a modern campus: (a) Multisource (learning, emotion, library interaction, consumption, clinic visit, meal, and trajectory data on the digital campus); (b) Multispace, covering not only online and offline learning but also students’ behaviors inside and outside of the classrooms.
FIGURE 2. Overview of AugmentED. The Data Module aggregates data trails (online learning, WiFi, smart card, meal, library interaction, consumption, clinic visit, trajectory, and emotion) into central storage and extracts features: BC-Linear (slope, breakpoint, RSS), BC-nonLinear (entropy, HMM-based entropy, LyE, HurstE, DFA), and BC-LSTM (LSTM-based features). The Prediction Module performs feature selection, applies an intelligence algorithm, and runs cross validation. The Feedback Module provides feature analysis and visualized feedback. In the data module, the features enclosed in dashed boxes (including LyE, HurstE, DFA, and LSTM-based features) are proposed in our study and, to the best of our knowledge, are used for the first time in students’ behavioral analysis.
II. RELATED WORK
A. FEATURE EXTRACTION
Feature evaluation plays an important role in designing
prediction systems. Features that measure the various
behavioral patterns can enhance our understanding of how a
student’s behavior changes as the semester progresses. In
this part, previous features that quantify students’ behavioral
patterns are first summarized; then, new features worthy of
inclusion are introduced.
In general, behavioral change can be quantified by the
following three groups of metrics.
1) BEHAVIORAL CHANGE-LINEAR (BC-LINEAR)
Traditionally, behavioral change is mainly quantified by two
linear metrics: behavioral slope and behavioral breakpoint.
First, the behavioral slope can be captured by computing
the slope of the behavioral time series of each student using
a linear regression [8]. The value of the slope indicates the
direction and strength of the behavioral changes, e.g., a
positive slope with a greater absolute value indicates a faster
increase in behavioral change [8]. Given a mid-term day
during the semester [8], both the pre-slope and post-slope can
be calculated to represent the students’ behavioral change
during the first and second halves of the semester,
respectively.
Second, the behavioral breakpoint can be captured by
computing the rate of behavioral changes occurring across
the semester. The value of the breakpoint identifies the day
during the semester before and after which a student’s
behavioral patterns differed. Two linear regressions can be
used to fit a behavioral time series, with the Bayesian
information criterion (BIC) then used to select the best breakpoint [8].
If a single regression algorithm is selected, the breakpoint
can be set to the last day.
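As a minimal illustration of the slope and breakpoint metrics described above, the following sketch fits two least-squares segments and picks the breakpoint with the lowest BIC. The exact BIC form (four free parameters) and the minimum segment length are our assumptions, not the precise formulation used in [8]:

```python
import math

def slope(ts):
    """Least-squares slope of a behavioral time series (day index vs. value)."""
    n = len(ts)
    xm = (n - 1) / 2
    ym = sum(ts) / n
    num = sum((i - xm) * (y - ym) for i, y in enumerate(ts))
    den = sum((i - xm) ** 2 for i in range(n))
    return num / den

def rss(ts):
    """Residual sum of squares of a single linear fit."""
    n = len(ts)
    b = slope(ts)
    a = sum(ts) / n - b * (n - 1) / 2
    return sum((y - (a + b * i)) ** 2 for i, y in enumerate(ts))

def best_breakpoint(ts, min_seg=3):
    """Pick the day minimising BIC over a two-segment linear fit.
    BIC = n * ln(RSS / n) + k * ln(n), with k = 4 free parameters
    (one slope + one intercept per segment)."""
    n = len(ts)
    best_day, best_bic = None, float("inf")
    for day in range(min_seg, n - min_seg):
        total_rss = rss(ts[:day]) + rss(ts[day:])
        bic = n * math.log(max(total_rss, 1e-12) / n) + 4 * math.log(n)
        if bic < best_bic:
            best_day, best_bic = day, bic
    return best_day

# A series that rises and then flattens: the breakpoint lands at the change.
series = [i * 1.0 for i in range(10)] + [9.0] * 10
print(best_breakpoint(series))
```

The pre-slope and post-slope of [8] would then simply be `slope(ts[:day])` and `slope(ts[day:])` for the selected day.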
2) BEHAVIORAL CHANGE-NONLINEAR (BC-NONLINEAR)
In recent years, nonlinear metrics have been increasingly
applied to time series analysis [22,44-59].
Regarding the students’ behavioral time series, nonlinear
metrics have been used to discover nonlinear behavioral
patterns. Consider entropy as an example. In [22], entropy
is proposed to quantify the regularity/orderliness of students’
behaviors, and it was demonstrated that a small entropy value
generally indicates high regularity, which in turn is associated
with high academic performance. Another example is entropy calculated based
on a Hidden Markov Model (HMM) analysis [44], which is
called HMM-based entropy for simplicity in our study.
HMM-based entropy is proposed to quantify the
uncertainty/diversity of students’ behaviors, e.g., the
uncertainty between the transition of different behaviors and
the various activities that a behavior exhibits. In [44], HMM-
based entropy is evaluated by the following two steps: (i)
extracting the hidden states of a behavioral time series by
HMM [45,46]; and (ii) subsequently calculating the HMM-
based entropy of the extracted hidden states.
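As a small illustration of the first kind of entropy, the following sketch computes the Shannon entropy of a discrete behavior sequence. It is a simplified stand-in for the orderliness measure of [22]; encoding behaviors as single characters ("B" for breakfast, "x" for no record) is purely hypothetical:

```python
import math
from collections import Counter

def shannon_entropy(seq):
    """Shannon entropy (bits) of a discrete behavior sequence; lower
    values indicate a more regular (orderly) lifestyle."""
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in Counter(seq).values())

# A regular student (breakfast almost every day) vs. an irregular one.
regular   = "BBBBBBB" * 4 + "x"
irregular = "BxBBxxB" * 4 + "x"
print(shannon_entropy(regular) < shannon_entropy(irregular))  # True
```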
To further recognize students’ activities and discover their
nonlinear behavioral patterns, the following three new
metrics, which have not previously been applied to students’
behavioral time series analysis, are also worth studying.
⚫ Lyapunov Exponent (LyE) [47-51] is a measure of the
stability of a time series. For example, in [47], LyE is
used to quantify the stability of a gait time series, and
the results demonstrate that a time series with a large
LyE value is less stable than a series with a small LyE
value, i.e., generally, a large LyE value indicates high
instability. Therefore, in gait analyses, LyE is
considered a stability risk indicator for falls [47] that
can distinguish healthy subjects from those at a high
risk of falling.
⚫ Hurst Exponent (HurstE) [52-54] is a measure of
predictability (in some studies, it is also called long-
term memory) of a time series. For example, in [53],
HurstE is applied to quantify the predictability of a
financial time series, and the results demonstrate that a
time series with a large HurstE value can be predicted
more accurately than a series with a HurstE value close
to 0.5.
⚫ Detrended Fluctuation Analysis (DFA) [54-57] is a
measure of the long-range correlation (also called
statistical self-affinity or long-range dependence) of a
time series [56]. For example, in [56], DFA is used to
quantify the long-range correlation of a heart rate time
series, and it is demonstrated that a time series with a
small DFA value indicates less long-range correlation
behavior than a series with a large DFA value.
Therefore, in heart rate analyses, DFA is considered a
long-range correlation indicator that can distinguish
healthy subjects from those with severe heart disease
[56].
In summary, the above three nonlinear metrics can measure
the stability, predictability, and long-range correlation of a
time series. Although these metrics have already been
extensively applied in time series analyses, e.g., gait time
series [47], in this study, for the first time, they are used in a
behavioral time series analysis. These metrics can enhance
our understanding not only of whether a student’s behavior
is stable, predictable, and long-range correlated, but also of
how well-regulated it is (e.g., in terms of self-discipline).
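To give a concrete flavor of one of these metrics, the following is a minimal DFA sketch. The box sizes and first-order detrending are our choices for illustration; a production analysis would use a dedicated library such as nolds:

```python
import numpy as np

def dfa(ts, box_sizes=(4, 8, 16, 32)):
    """Minimal detrended fluctuation analysis (DFA) sketch.
    Returns the scaling exponent alpha: roughly 0.5 for uncorrelated
    (white) noise, with larger values indicating long-range correlation."""
    y = np.cumsum(np.asarray(ts, float) - np.mean(ts))  # integrated profile
    flucts = []
    for n in box_sizes:
        n_boxes = len(y) // n
        rms = []
        for b in range(n_boxes):
            seg = y[b * n:(b + 1) * n]
            x = np.arange(n)
            coef = np.polyfit(x, seg, 1)  # local linear trend
            rms.append(np.sqrt(np.mean((seg - np.polyval(coef, x)) ** 2)))
        flucts.append(np.mean(rms))
    # Slope of the log-log fluctuation curve is the DFA exponent.
    alpha = np.polyfit(np.log(box_sizes), np.log(flucts), 1)[0]
    return alpha

rng = np.random.default_rng(0)
white = rng.standard_normal(1024)
print(round(float(dfa(white)), 2))  # close to 0.5 for uncorrelated noise
```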
3) BEHAVIORAL CHANGE-LSTM (BC-LSTM)
Features that represent temporal change over time are also
worth studying. Such features can be extracted by long
short-term memory (LSTM) networks [58] and are called
LSTM-based features in this paper for short. LSTM-based
features have been applied in many fields, including, for
example, emotion recognition [59,60], traffic forecasting
[61], and video action classification [62]. However, these
features have not previously been applied in lifestyle
behavioral analysis.
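The idea of compressing a weekly behavior sequence into a fixed-length feature vector can be sketched with a single LSTM cell in plain numpy. The weights here are random and untrained, purely to show the mechanics; the paper’s actual features come from a trained Keras network:

```python
import numpy as np

def lstm_features(weekly, hidden=4, seed=0):
    """Run weekly behavior vectors through one (randomly initialised,
    untrained) LSTM cell and return the final hidden state as a
    feature vector. A sketch of the mechanism only."""
    rng = np.random.default_rng(seed)
    d = weekly.shape[1]
    # One weight matrix per gate: input, forget, output, candidate.
    W = rng.standard_normal((4, hidden, d + hidden)) * 0.1
    b = np.zeros((4, hidden))
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    sigm = lambda x: 1.0 / (1.0 + np.exp(-x))
    for x in weekly:                   # one step per week
        z = np.concatenate([x, h])
        i = sigm(W[0] @ z + b[0])      # input gate
        f = sigm(W[1] @ z + b[1])      # forget gate
        o = sigm(W[2] @ z + b[2])      # output gate
        g = np.tanh(W[3] @ z + b[3])   # candidate cell state
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

# 16 weeks x 3 behaviors (e.g., breakfasts, library visits, SPOC logins).
weeks = np.abs(np.random.default_rng(1).standard_normal((16, 3)))
print(lstm_features(weeks).shape)  # (4,)
```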
B. PREDICTION ALGORITHMS
In general, academic performance prediction can be
considered either a regression or a classification problem. A
wide variety of algorithms have been used or proposed in
the literature to predict academic performance.
For example, in [8], Lasso (least absolute shrinkage and
selection operator) regularized linear regression model,
proposed by Tibshirani [63] in 1996, is used to predict
academic performance. In [24], four supervised learning
algorithms (consisting of support vector machine (SVM),
logistic regression (LR), decision tree, and naïve Bayes) are
used to classify students’ performance. In [22], RankNET, a
neural network method proposed by Burges et al. [64] in
2015, is used to predict the ranks of students’ semester
grades. Similarly, in [27], a layer-supervised MLP-based
method is proposed for academic performance prediction. In
[32], a temporal emotion-aspect model (TEAM), modeling
time jointly with emotions and aspects extracted from SPOC
platform, is proposed to explore the effect of most concerned
emotion-aspects as well as their evolutionary trends on
academic achievement. In [65], four classification methods
(consisting of naïve Bayes, SMO, J48, and JRip) are used to
predict students’ performance by considering student
heterogeneity.
In general, due to the lack of open-access, large-scale, and
multisource datasets in the education field, on the one hand,
it is largely impossible to compare the performance of
existing academic performance prediction algorithms; on
the other hand, the algorithms proposed in this field are
relatively simple, being mainly based on basic statistical
models (e.g., ANOVA and post hoc tests) or ML algorithms
(e.g., SVM and LR).
C. MULTISOURCE AND MULTIFEATURE
It has been verified in many literatures that the predictive
power could be improved by multisource data and
multifeatured fusion. For example, it is demonstrated that
the performances of predicting both at-risk students [65] and
stock market [66] could be improved by combining multi-
source data. Similarly, in [22,23], the performances of
academic performance prediction are improved by combing
traditional diligence features with orderliness (and sleep
patterns) features. In [67], the accuracy of scholars’
scientific impact prediction is improved by using multi-field
feature extraction and fusion. In [68], a contrast experiments
of eleven different feature combinations were conducted,
demonstrating that the performances of sentiment
classification can be improved by multifeatured fusion.
However, we note that multisource and/or multifeature
data cannot always guarantee higher predictive power. For
instance, [69] shows that the results of predictive modeling,
even when the data are collected within a single institution,
vary strongly across courses. In fact, compared with
single-course models, the portability of prediction models
across courses (multisource data) is lower [69]. Therefore,
the effect of multisource and multifeature data needs to be
verified in experiments.
III. METHODOLOGY
In our study, academic performance prediction is considered
a classification problem. According to the high-low
discrimination index proposed by Kelley [41], academic
performance is divided into low-, medium-, and high-performing groups.
Given a digital campus dataset, according to Fig. 2, the main
task is to first extract features from the raw multisource data;
then select the features that are strongly correlated with
academic performance and use these features to train the
classification algorithm; and finally provide visualized
feedback based on the prediction results.
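The low/medium/high grouping step can be sketched as follows. The 27% cut fraction follows Kelley’s classic discrimination-index rule; whether the paper uses exactly this fraction is our assumption:

```python
def kelley_groups(gpas, frac=0.27):
    """Split students into low/medium/high groups using Kelley's 27% rule:
    bottom 27% -> low, top 27% -> high, the rest -> medium.
    The exact fraction used in the paper is an assumption here."""
    ranked = sorted(range(len(gpas)), key=lambda i: gpas[i])
    k = round(len(gpas) * frac)
    labels = ["medium"] * len(gpas)
    for i in ranked[:k]:       # lowest-GPA students
        labels[i] = "low"
    for i in ranked[-k:]:      # highest-GPA students
        labels[i] = "high"
    return labels

gpas = [3.9, 2.1, 3.0, 3.2, 2.5, 3.7, 2.8, 3.5, 3.1, 2.9]
print(kelley_groups(gpas))
```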
In this section, the three modules designed in AugmentED
(see Fig. 2) are described in detail.
A. DATA MODULE
A flowchart of this module is shown in Fig. 3, which includes
the following three parts.
1) RAW DATA
Permission to access the raw data was granted by the
Academic Affairs Office of our university. The raw dataset
used in our study was captured from students enrolled in the
“Freshman Seminar” course during the fall semester of
2018-2019. The “Freshman Seminar” was chosen for the
following reasons: (1) more students were enrolled in this
course (N = 156) than other comparable courses, and (2) these
156 students were more active on our self-developed SPOC
platform, thus providing abundant valuable behavioral data.
Our dataset consists of the following four data sources (see
Table II):
⚫ SPOC Data. Two different types of data were collected
on the SPOC platform. The first type is log files, which
are recorded when a student logs in or out of the system,
and the second type is posts on the SPOC discussion
forum, which records discussions related to students’
learning experience.
⚫ Smart Card Data. Similar to most modern universities,
in our university, all students have a campus smart card
registered under their real name. The usage of this smart
card, such as for borrowing books from the library,
entering the library, consuming meals in campus
cafeterias, shopping on campus, or making an
appointment with the school clinic, is captured daily.
⚫ WiFi Data. There are approximately 3000 wireless
access points at our university, covering most areas of
campus. Once a student passes by one of these points,
the MAC address of his/her device (e.g., tablet, laptop,
or smart phone) can be recorded [40]. In our study, to
distinguish among diverse behaviors, the entire campus
is divided into several different areas, including a study
area and a relaxation/dormitory area.
⚫ Central Storage Data. As shown in Table II, other
features used in our study, including the students’
personal information and academic records, are recorded
by the central storage system of our university.
For simplicity, the former three data sources are designated D1,
D2, and D3, see Table II. To evaluate the effect of multisource
data on the academic performance prediction, which is similar
to the studies introduced in Section II.C, contrast experiments
of different data source combinations were conducted in our
study (see Section IV). To be specific, based on D1, D2, and
D3, in total, the following seven data combinations could be
obtained: D1, D2, D3, D1+D2, D1+D3, D2+D3, and D1+D2+D3.
The fourth data source, i.e., Central Storage (which is relatively
static and simple), is considered fundamental information
shared by all seven combinations.
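The seven data-source combinations enumerated above can be generated mechanically:

```python
from itertools import combinations

# D1 = SPOC, D2 = Smart Card, D3 = WiFi.
sources = ["D1", "D2", "D3"]
combos = ["+".join(c) for r in (1, 2, 3) for c in combinations(sources, r)]
print(combos)
# ['D1', 'D2', 'D3', 'D1+D2', 'D1+D3', 'D2+D3', 'D1+D2+D3']
```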
In our study, privacy protection is seriously considered, and
all students’ identifying information is anonymized. The
infringement of students’ privacy is avoided during both the
data collection period and data analysis period. First, the
student IDs are already pseudonymous in our raw data.
Moreover, the resolution of the students’ spatial-temporal
trajectory is reduced. All information regarding the exact
date/area showing when/where a behavior occurred is
removed. Therefore, it would be reasonably difficult to
reidentify individuals through our dataset.
2) DATA TRAILS
In our study, to initially understand how a student’s behavior
changes as the semester progresses, on the one hand, data
trails across the whole semester are processed and organized in
chronological order, including when, where, and how a
behavior occurs; on the other hand, data trails per week are
summarized via preliminary statistics, including the
following information for each week: how often a behavior
occurs (i.e., total frequency), how long a behavior lasts (i.e.,
duration), and how much money a student spends.
Regarding the SPOC data (D1), online learning is
quantified by (i) learning frequency and duration, which are
extracted from the raw log files; and (ii) online learning
emotion, which is extracted from the discussion forum.
Regarding the Smart Card data (D2), multiple behaviors are
involved, e.g. library interaction (including borrowing a book
and library entry), see Table II. Regarding the WiFi data (D3),
first, a student’s trajectory is calculated, mainly including when
a student arrives at a place, how often he/she visits it (i.e.,
frequency), and how long he/she stays there (i.e.,
duration). Second, attendance is calculated by combining
WiFi data with class schedules. Specifically, to distinguish
among behavioral patterns during different periods, three
types of durations (namely, durations on working days, on
weekends, and throughout the semester) and two types of
attendances (namely, attendance during the final study week
and attendance throughout the semester) are evaluated in our
study.
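As a toy illustration of the per-week aggregation, the following sketch sums the time spent in a given campus area from a simplified WiFi trace. The record layout `(week, area, minutes)` is a hypothetical stand-in for the real (anonymized) WiFi data:

```python
from collections import defaultdict

def weekly_durations(wifi_log, area="study"):
    """Aggregate per-week minutes spent in one campus area from a
    trace of (week, area, minutes) records."""
    totals = defaultdict(int)
    for week, a, minutes in wifi_log:
        if a == area:
            totals[week] += minutes
    return dict(totals)

log = [(1, "study", 120), (1, "relaxation", 45),
       (2, "study", 90), (2, "study", 30)]
print(weekly_durations(log))  # {1: 120, 2: 120}
```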
3) FEATURE EXTRACTION
To gain a deeper insight into students’ behavioral patterns,
as summarized in Section II.A, in our study behavioral
change is evaluated by linear, nonlinear, and deep learning
(LSTM) methods, see Fig. 3.
⚫ BC-Linear. Similar to the traditional approach, linear
behavioral change is quantified by behavioral slope and
behavioral breakpoint. Students’ behavioral series are
fitted by two linear regressions; subsequently, the
optimal breakpoint is selected by BIC, and the behavioral
slopes are calculated. Additionally, to further measure
the amount of variance in the dataset that is not
explained by the traditional regression model, the
residual sum of squares (RSS) is also evaluated (see
Table II). In our study, these linear metrics are mainly
calculated with the Python module sklearn.linear_model.
⚫ BC-nonLinear. Similar to the traditional approach,
first, entropy and HMM-based entropy are evaluated in
our study, measuring the regularity and diversity of
campus lifestyles respectively. Notably, the hidden
states are numerically extracted by the MATLAB
function hmmestimate, then the HMM-based entropy of
the extracted hidden states is evaluated by the
MATLAB function entropy. Second, to further
discover nonlinear behavioral patterns, the following
three nonlinear metrics are proposed and extracted for
the first time: LyE, HurstE, and DFA, measuring the
stability, predictability, and long-range correlation of
campus lifestyles, respectively. In our study, four
nonlinear metrics (entropy, LyE, HurstE, and DFA) are
evaluated with nolds, a numpy-based Python library,
based on the 0&1 sequence (see Appendix A).
⚫ BC-LSTM. LSTM-based features representing
dynamic changes in temporal behavioral patterns are
calculated as follows. First, as input, data
trails from multiple behaviors are organized together
week by week (see Fig. 3). For each week, the basic
information of all the behaviors involved in our
study is summarized, for example, how many
times a student had breakfast or borrowed books from
the library. Subsequently, this weekly information is
fed into a Keras LSTM network, and features
representing the weekly behavioral patterns that might
change throughout the semester are extracted.
TABLE II
CHARACTERISTICS AND FEATURES EVALUATED IN OUR STUDY

Feature columns: BC-Linear (Behavioral Change-Linear): Slope, Breakpoint, RSS; BC-nonLinear (Behavioral Change-nonLinear): Entropy, HMM-based Entropy, LyE, HurstE, DFA; BC-LSTM (Behavioral Change-LSTM): LSTM-based Features; Basic info.: Freq, Duration.

Data Source | Data Label | Data Content
Online: SPOC | D1 | online study; emotion
Offline: Smart Card | D2 | borrowing a book; library entry; eating; breakfast; consumption; clinical visit
Offline: WiFi | D3 | study area; relaxation area
Central Storage | - | academic records: academic history and class schedule; personal information: gender, age, and grade

Note: (i) “✓” denotes the features/characteristics extracted; (ii) “emotion” represents the students’ emotion patterns on SPOC forums in terms of engagement behaviors and the following three different emotions [36-38]: positivity, negativity, and confusion; (iii) “Freq” represents the total number of times a behavior occurs throughout the semester; and (iv) “grade” represents the year in which a student starts his/her college study.
FIGURE 3. Flowchart of the data module. Raw data are organized into data trails (per week and across the whole semester, week 1 through week 16); features are then extracted: BC-Linear (slope, breakpoint, RSS via linear regressions + the Bayesian information criterion (BIC)), BC-nonLinear (entropy and HMM-based entropy via the MATLAB functions hmmestimate and entropy; LyE, HurstE, and DFA via nolds, a numpy-based Python library), and BC-LSTM (LSTM-based features via a Keras LSTM network).
B. PREDICTION MODULE
The main task of this module is to select features and use these
features to train the prediction algorithm.
1) FEATURE SELECTION
In our study, 708 different types of features are extracted,
including 510 linear features, 119 nonlinear features, 50
LSTM-based features, and 29 basic features (including, e.g.,
frequency, duration, gender, age, and grade). For instance,
because multiple behaviors are involved in our study, there are
20 DFA related features in total to quantify long-range
correlation for each behavior individually (e.g. library entry).
The evaluated features and GPA are distributed over
different value ranges. Therefore, to eliminate a
potential effect on the correlation analysis, both the features
and GPA are normalized by min-max normalization.
Additionally, to improve the performance of the prediction
algorithms, the top 130 features with the most significant
effect on academic performance are selected by the
SelectKBest function in the Python library scikit-learn.
2) PREDICTION ALGORITHM
Subsequently, the selected features are used to train the ML-
based classification algorithm for the academic performance
prediction. Specifically, in our study, five ML algorithms are
applied, including RF (random forest), GBRT (gradient boost
regression tree), KNN (k-nearest neighbor), SVM, and
XGBoost (extreme gradient boosting). The hyperparameters
of the ML and LSTM algorithms are optimized by
GridSearchCV in scikit-learn.
3) CROSS VALIDATION
Our dataset is divided into a training set and a test set at the
ratio of 7:3. The classification algorithm is first trained and
then applied to the test set to predict academic performance.
Finally, the robustness of the algorithm is tested by 10-fold
cross validation.
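The 7:3 split and 10-fold cross validation can be sketched as follows (synthetic data of the same shape; the hyperparameters are illustrative, not the values tuned by GridSearchCV):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in: 156 students x 130 selected features, 3 classes.
rng = np.random.default_rng(0)
X = rng.standard_normal((156, 130))
y = rng.integers(0, 3, 156)

# 7:3 train/test split, then 10-fold cross validation on the training set.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=50, random_state=0)
scores = cross_val_score(clf, X_tr, y_tr, cv=10)
print(len(scores))  # 10 fold scores
```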
C. VISUALIZATION MODULE
The main task of this module is to provide personalized
feedback, including GPA prediction and a visualized summary
of the students’ behavioral patterns.
IV. EXPERIMENTAL RESULTS
In this section, first, the experimental results of AugmentED
are presented and analyzed. Second, to evaluate the
effectiveness of multisource data and multifeature fusion,
contrast experiments are conducted, and the corresponding
results are discussed. Finally, the visualized feedback offered
to students is designed.
A. PREDICTION RESULTS
The experimental results of AugmentED are shown in the last
five rows of Table III (i.e., RF*, GBRT*, KNN*, SVM* and
XGBoost*), which are highlighted in bold. Five indexes
(accuracy, precision, recall, f1, and AUC) are used to evaluate
the performance.
Notably, AugmentED is based on
(i) multisource data, i.e., D1+D2+D3 (including SPOC,
Smart Card, and WiFi data); and
(ii) multiple features, i.e., C-III (including BC-Linear,
BC-nonLinear, and BC-LSTM features).
The asterisk (*) in Table III denotes that the C-III feature
combination is used in the corresponding ML algorithm for
academic performance prediction.
From Table III, it can be seen that, first, academic
performance can be predicted by AugmentED with quite high
accuracy. Second, the performances of the five different ML
algorithms (RF*, GBRT*, KNN*, SVM*, and XGBoost*) are
similar, and all can lead to a good prediction result. To
clarify, consider the precision values in the 5th
column of Table III. The precision values of the five ML
algorithms are 0.873, 0.877, 0.863, 0.889, and 0.871,
respectively, indicating that (i) the minimum value is 0.863, i.e.,
the precision of AugmentED is no less than 86.3%; and (ii) the
difference between the minimum and maximum values is
0.026, which is quite small, i.e., AugmentED is largely
insensitive to the choice of ML algorithm.
TABLE III
PREDICTION RESULTS (THE AVERAGE CLASSIFICATION RESULTS OF 10-
FOLD CROSS VALIDATION)
Data | Feature | Algorithm | accuracy | precision | recall | f1 | AUC
D1 (SPOC)
  C-I    RF        0.604  0.642  0.604  0.602  0.698
         GBRT      0.584  0.643  0.584  0.581  0.684
         KNN       0.500  0.497  0.500  0.473  0.627
         SVM       0.539  0.590  0.539  0.534  0.641
         XGBoost   0.521  0.586  0.521  0.520  0.605
  C-II   LSTM      0.488  0.517  0.488  0.491  0.583
  C-III  RF*       0.776  0.806  0.776  0.774  0.807
         GBRT*     0.764  0.784  0.764  0.761  0.800
         KNN*      0.801  0.825  0.801  0.802  0.840
         SVM*      0.795  0.818  0.795  0.792  0.836
         XGBoost*  0.789  0.823  0.789  0.787  0.836
D2 (Smart Card)
  C-I    RF        0.507  0.538  0.507  0.496  0.600
         GBRT      0.487  0.510  0.487  0.481  0.604
         KNN       0.462  0.523  0.462  0.454  0.617
         SVM       0.528  0.571  0.528  0.515  0.637
         XGBoost   0.508  0.522  0.508  0.495  0.617
  C-II   LSTM      0.436  0.467  0.436  0.425  0.552
  C-III  RF*       0.742  0.795  0.742  0.746  0.808
         GBRT*     0.737  0.778  0.737  0.733  0.791
         KNN*      0.733  0.780  0.730  0.729  0.771
         SVM*      0.719  0.744  0.719  0.712  0.773
         XGBoost*  0.733  0.760  0.733  0.733  0.767
D3 (WiFi)
  C-I    RF        0.424  0.408  0.424  0.395  0.511
         GBRT      0.437  0.473  0.437  0.428  0.532
         KNN       0.399  0.424  0.399  0.396  0.531
         SVM       0.391  0.432  0.391  0.387  0.480
         XGBoost   0.425  0.383  0.425  0.395  0.503
  C-II   LSTM      0.413  0.398  0.413  0.380  0.512
  C-III  RF*       0.545  0.620  0.545  0.539  0.639
         GBRT*     0.558  0.600  0.558  0.548  0.650
         KNN*      0.501  0.591  0.501  0.487  0.607
         SVM*      0.487  0.404  0.487  0.416  0.590
         XGBoost*  0.546  0.624  0.546  0.529  0.646
D1+D2 (SPOC + Smart Card)
  C-I    RF        0.616  0.660  0.616  0.608  0.688
         GBRT      0.603  0.676  0.603  0.605  0.692
         KNN       0.534  0.608  0.534  0.526  0.626
         SVM       0.608  0.661  0.608  0.617  0.683
         XGBoost   0.565  0.603  0.565  0.556  0.641
  C-II   LSTM      0.493  0.520  0.493  0.488  0.579
  C-III  RF*       0.814  0.839  0.814  0.815  0.855
         GBRT*     0.809  0.843  0.809  0.809  0.853
         KNN*      0.815  0.839  0.815  0.815  0.857
         SVM*      0.821  0.848  0.821  0.821  0.862
         XGBoost*  0.833  0.860  0.833  0.826  0.813
D1+D3 (SPOC + WiFi)
  C-I    RF        0.616  0.650  0.616  0.614  0.697
         GBRT      0.616  0.664  0.616  0.612  0.702
         KNN       0.538  0.594  0.538  0.529  0.626
         SVM       0.602  0.640  0.602  0.598  0.690
         XGBoost   0.573  0.606  0.573  0.568  0.626
  C-II   LSTM      0.488  0.500  0.488  0.469  0.559
  C-III  RF*       0.800  0.867  0.800  0.799  0.849
         GBRT*     0.793  0.851  0.793  0.795  0.851
         KNN*      0.814  0.859  0.814  0.811  0.857
         SVM*      0.807  0.843  0.807  0.799  0.843
         XGBoost*  0.807  0.843  0.807  0.803  0.867
D2+D3 (Smart Card + WiFi)
  C-I    RF        0.584  0.612  0.584  0.567  0.662
         GBRT      0.571  0.584  0.571  0.560  0.626
         KNN       0.551  0.593  0.551  0.558  0.641
         SVM       0.545  0.550  0.545  0.530  0.579
         XGBoost   0.559  0.563  0.559  0.544  0.579
  C-II   LSTM      0.449  0.469  0.449  0.434  0.551
  C-III  RF*       0.781  0.819  0.781  0.782  0.837
         GBRT*     0.781  0.801  0.781  0.779  0.816
         KNN*      0.755  0.793  0.755  0.754  0.793
         SVM*      0.762  0.794  0.762  0.763  0.815
         XGBoost*  0.762  0.801  0.762  0.767  0.813
D1+D2+D3 (SPOC + Smart Card + WiFi)
  C-I    RF        0.630  0.679  0.630  0.630  0.699
         GBRT      0.637  0.671  0.637  0.620  0.703
         KNN       0.604  0.647  0.604  0.596  0.669
         SVM       0.635  0.696  0.635  0.644  0.716
         XGBoost   0.608  0.670  0.608  0.598  0.691
  C-II   LSTM      0.501  0.544  0.501  0.473  0.578
  C-III  RF*       0.852  0.873  0.852  0.844  0.857
         GBRT*     0.852  0.877  0.852  0.852  0.876
         KNN*      0.847  0.863  0.847  0.841  0.851
         SVM*      0.866  0.889  0.866  0.865  0.872
         XGBoost*  0.859  0.871  0.859  0.850  0.874
Note: (i) D1+D2+D3 is the multiple data source used in AugmentED to predict
academic performance, including SPOC, Smart Card and WiFi data; (ii) The
rows highlighted in light blue, light pink, light green, denote the three
following feature or feature combinations are used for prediction, C-I (BC-
linear and BC-nonlinear features), C-II (only BC-LSTM, i.e., LSTM-based
features), C-III (BC-Linear, BC-nonLinear, and BC-LSTM features).
B. COMPARATIVE EXPERIMENTS
In this part, comparative experiments are conducted to evaluate the prediction performance of multisource and multifeature combinations.
1) MULTISOURCE
The performance of different data source combinations is compared; see the 1st column of Table III. As Table III shows, combining more data sources leads to more accurate predictions.
To clarify, consider the case of SVM*: from D1 to D1+D2 to D1+D2+D3 (see Table III and Fig. 4), all five evaluation indexes improve significantly as the number of data sources increases. Specifically, (i) the accuracy values of D1, D1+D2, and D1+D2+D3 are 0.795, 0.821, and 0.866, respectively; (ii) the precision values are 0.818, 0.848, and 0.889; (iii) the recall values are 0.795, 0.821, and 0.866; (iv) the f1 values are 0.792, 0.821, and 0.865; and (v) the AUC values are 0.836, 0.862, and 0.872. This verifies that multisource data can deepen the insight gained into students' behavioral patterns.
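The multisource comparison amounts to concatenating the per-source feature blocks column-wise and scoring the same classifier on each combination. A minimal sketch, with random placeholder blocks standing in for the real SPOC, smart-card, and WiFi features (the block widths and labels below are assumptions):

```python
# Sketch: concatenate feature blocks per data source, score one model
# on each combination. All features/labels here are random placeholders.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 156
y = rng.integers(0, 3, size=n)                 # low/medium/high labels
sources = {"D1": rng.normal(size=(n, 20)),     # SPOC features (width assumed)
           "D2": rng.normal(size=(n, 15)),     # Smart Card features
           "D3": rng.normal(size=(n, 10))}     # WiFi features

for combo in (["D1"], ["D1", "D2"], ["D1", "D2", "D3"]):
    X = np.hstack([sources[s] for s in combo])  # column-wise concatenation
    acc = cross_val_score(SVC(), X, y, cv=10).mean()
    print("+".join(combo), f"accuracy={acc:.3f}")
```

With the real features, the three printed accuracies would correspond to the D1, D1+D2, and D1+D2+D3 rows for SVM*/SVM in Table III.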
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2020.3002791, IEEE Access
FIGURE 4. Comparisons of the SVM* performance of different data source
combinations.
2) MULTIFEATURE
The performance of the three feature combinations (C-I, C-II, C-III) is also compared; see the 2nd column of Table III and Fig. 5.
⚫ C-I (including BC-linear and BC-nonlinear features): see the rows of Table III highlighted in light blue. The corresponding ML algorithms are denoted as RF, GBRT, KNN, SVM, and XGBoost;
⚫ C-II (only including BC-LSTM features): see the rows of Table III highlighted in light pink;
⚫ C-III (including BC-linear, BC-nonlinear, and BC-LSTM features): see the rows of Table III highlighted in light green. The corresponding ML algorithms are denoted as RF*, GBRT*, KNN*, SVM*, and XGBoost*.
FIGURE 5. Comparisons of the accuracy values of three feature
combinations in the (D1+D2+D3) dataset.
As shown in Fig. 5 and Table III, all five evaluation indexes (accuracy, precision, recall, f1, and AUC) of C-III are significantly higher than those of C-I and C-II. To clarify, consider SVM* on the (D1+D2+D3) dataset: its accuracy is 0.866, which is much higher than that of SVM and LSTM (0.635 and 0.501, respectively); see the 4th column of Table III. This result indicates that the multifeature combination proposed in our study (i.e., C-III) can significantly improve the predictive power.
C. IDENTIFICATION OF AT-RISK STUDENTS BASED ON THE PREDICTION
The prediction result obtained by AugmentED can be used to identify at-risk students, i.e., to determine whether a student belongs to the low-performance group. This makes it possible to provide early warning and feedback to at-risk students before the final exam week.
To illuminate how AugmentED could potentially help students optimize their college lifestyles and consequently improve their academic performance, a feedback example delivered to one at-risk student is shown in Fig. 6.
FIGURE 6. Example of the feedback given to one at-risk student, including the average values and 95% confidence intervals of the following nine assistant indicators from the low-, medium-, and high-performance groups. First, (a1) D-Linear, (a2) D-postRSS, and (a3) D-preSlope are the indicators representing the (weighted) linear, post-semester RSS, and pre-semester slope patterns of all behaviors (rather than one single behavior), respectively. Second, (b1) D-nonLinear, (b2) D-Entropy, and (b3) D-DFA are the indicators representing the (weighted) nonlinear, entropy, and DFA patterns of all behaviors. Finally, (c1) D-LSTM is the indicator representing the temporal pattern of all behaviors, while (c2) LSTM-49 and (c3) LSTM-1 are two of the 50 features extracted by our LSTM network.
We note that in addition to the prediction result itself, the extracted features that are strongly correlated with academic performance can also serve as assistant indicators to identify at-risk students. Traditionally, such features can be selected either by statistical analysis (e.g., ANOVA) or by ML algorithms (e.g., the feature importance returned by RF). Recall that in our study, multiple behaviors are involved, and each behavior is quantified by many -linear, -nonlinear, and -LSTM features. Therefore, a particular feature (e.g., entropy) of one single behavior (e.g., either having breakfast or learning online) might not suffice for a comprehensive evaluation of students' behavioral patterns.
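The ML-based screening route mentioned above (feature importance returned by RF) can be sketched in a few lines. This is an illustration with placeholder data, not the paper's feature set:

```python
# Sketch of RF-based feature screening: fit a forest, rank features by
# impurity-based importance. Features and labels here are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(156, 30))        # placeholder behavioral features
y = rng.integers(0, 3, size=156)      # low/medium/high labels

rf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, y)
top = np.argsort(rf.feature_importances_)[::-1][:5]   # 5 most important
print("top-5 feature indices:", top)
```

Impurity-based importances are known to favor high-cardinality features, which is one reason the paper aggregates many features into composite indicators instead of relying on any single one.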
From this perspective, in Fig. 6, nine assistant indicators are
calculated and plotted.
We begin by discussing the indicators of the -linear, -nonlinear, and -LSTM features (see Appendix B), denoted as D-linear, D-nonLinear, and D-LSTM, respectively, which represent the (weighted) linear, nonlinear, and temporal patterns of all the behaviors involved in our study (rather than one single behavior). Regarding these three indicators,
(i) The average values and 95% confidence intervals (from the low-, medium-, and high- academic performance groups) are plotted in the left column of Fig. 6.
(ii) The Pearson correlation between the indicators and academic performance is calculated; see the 2nd, 5th, and 8th rows of Table IV, which are highlighted in light gray.
Furthermore, six more indicators are calculated and provided as supplementary; see the 2nd and 3rd columns of Fig. 6. The Pearson correlation between these indicators and academic performance is also calculated and listed in Table IV. From Table IV it can be seen that all nine indicators are strongly correlated with academic performance. Additionally, in Fig. 6, the apparent distinction among the three academic performance groups demonstrates that all nine indicators can offer strong support in at-risk student identification.
To clarify, consider the case of D-linear. On the one hand, its average values and 95% confidence intervals from the low-, medium-, and high- academic performance groups are 1.457±0.199, 2.160±0.193, and 3.035±0.341, respectively (see Fig. 6(a1)), indicating clear separation. On the other hand, its correlation coefficient is 0.534 (see the 3rd row of Table IV), i.e., this indicator is significantly correlated with academic performance. Therefore, D-linear can be taken as an indicator for identifying students at risk of low performance.
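The group statistics quoted above (mean ± 95% confidence interval per performance group) can be computed with a standard t-based interval. A minimal sketch, with synthetic samples whose centers merely mimic the D-linear group means (group sizes and spreads are assumptions):

```python
# Sketch: mean and 95% t-based confidence interval per performance group.
# Samples are synthetic, loosely centered on the D-linear group means.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
groups = {"low": rng.normal(1.46, 0.5, 50),
          "medium": rng.normal(2.16, 0.5, 50),
          "high": rng.normal(3.04, 0.5, 56)}

cis = {}
for name, x in groups.items():
    m, sem = x.mean(), stats.sem(x)                 # mean, standard error
    half = sem * stats.t.ppf(0.975, len(x) - 1)     # 95% CI half-width
    cis[name] = (m, half)
    print(f"{name}: {m:.3f} ± {half:.3f}")
```

Non-overlapping intervals across the three groups are what makes an indicator usable for at-risk identification.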
TABLE IV
CORRELATION COEFFICIENTS AND P-VALUES

Assistant Indicator        Correlation coefficient   P-value
Fig. 6(a1) D-Linear        0.534                     7.18e-13
Fig. 6(a2) D-postRSS       0.366                     2.65e-06
Fig. 6(a3) D-preSlope      0.425                     3.12e-08
Fig. 6(b1) D-nonLinear     0.392                     4.28e-07
Fig. 6(b2) D-entropy       0.402                     2.02e-07
Fig. 6(b3) D-DFA           0.345                     1.05e-05
Fig. 6(c1) D-LSTM          0.703                     1.32e-24
Fig. 6(c2) LSTM-49         0.254                     0.001
Fig. 6(c3) LSTM-1          0.734                     1.23e-27
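Each row of Table IV pairs a Pearson correlation coefficient with its p-value; both come from a single call in scipy. A sketch on synthetic data (the 0.5 coefficient and noise scale are illustrative, not the paper's values):

```python
# Sketch: Pearson r and its p-value, as reported in Table IV.
# Indicator and performance values below are synthetic.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(3)
performance = rng.normal(size=156)                           # 156 students
indicator = 0.5 * performance + rng.normal(scale=0.8, size=156)

r, p = pearsonr(indicator, performance)
print(f"r={r:.3f}, p={p:.2e}")
```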
V. CONCLUSION AND FUTURE WORK
As an important issue in the educational data mining field, academic performance prediction has been studied by many researchers. However, due to the lack of richness and diversity in both data sources and features, many challenges remain in prediction accuracy and interpretability. To alleviate this problem, our study aims to develop a robust academic performance prediction model that gains in-depth insight into students' behavioral patterns and can potentially help students optimize their interactions with the university.
In our study, a model named AugmentED is proposed to predict the academic performance of college students. Our contributions are threefold. First, regarding data fusion, to the best of our knowledge, this work is the first to capture, analyze, and use multisource data covering not only online and offline learning but also campus-life behaviors inside and outside of the classroom for academic performance prediction. Based on these multisource data, a rich profile of each student is obtained. Second, regarding feature evaluation, behavioral change is evaluated by linear, nonlinear, and deep learning (LSTM) methods, respectively, which provides a systematic view of students' behavioral patterns. Specifically, this is the first time that three novel nonlinear metrics (LyE, HurstE, and DFA) and LSTM have been applied to the analysis of students' behavioral time series. Third, our experimental results demonstrate that AugmentED can predict academic performance with high accuracy, which helps formulate personalized feedback for at-risk (or poorly self-disciplined) students.
However, our study also has some limitations. To obtain a multisource dataset, we sacrificed the scale of the dataset by only using student-generated data from a single course. This limitation might negatively influence the generalizability of AugmentED. Furthermore, in this study we mainly focus on behavioral change; other characteristics/features worthy of consideration (e.g., peer effects, sleep) were not evaluated.
In conclusion, our study is based on a completely passive daily data capture system that exists in most modern universities. This system can potentially support continued investigations on a larger scale. The knowledge obtained in this study can also potentially contribute to related research among K-12 students.
Appendix A
To evaluate the four nonlinear metrics (entropy, LyE, HurstE, and DFA) of the time series, we concentrate on the precise time of day at which the behaviors occurred. Therefore, in our study, the time involved is first converted to a discrete time sequence. Then, according to this discrete time sequence, the raw behavioral time series data are converted to 0/1 sequences as follows:
STEP 1. TIME DATA REPRESENTATION
The time data are converted to a discrete sequence with a
normalized time interval by the following three steps:
⚫ Step 1.1. The entire semester ran from 01/09/2018 (September 1st) to 20/01/2019 (January 20th) and includes a total of 140 days. Thus, each day is numbered from 1 to 140, resulting in a discrete sequence {p1, p2, . . ., pi} = {1, 2, . . ., 140}, where i denotes the ith day of the semester;
⚫ Step 1.2. We divide each day into 48 time bins such that each bin spans 30 minutes. Subsequently, every bin is numbered from 1 to 48, i.e., {q1, q2, . . ., qj} = {1, 2, . . ., 48}, where j denotes the jth time bin. For example, "0:01—0:30" is the 1st time bin, "0:31—1:00" is the 2nd bin, etc.
⚫ Step 1.3. By combining the sequences of days and time bins, each time during the semester is mapped to a discrete time sequence of length Nt, i.e., {T1, T2, . . ., Tij} = {1, 2, . . ., Nt}, where

Tij = (pi − 1) × 48 + qj,   (A-1)

and Nt = 140 × 48 = 6720. Specifically, if the time is "03/09/2018, 10:24", then pi = 3 and qj = 21 (the 21st time bin of the 3rd day); according to Eq. A-1, Tij = 2 × 48 + 21 = 117, i.e., "03/09/2018, 10:24" is encoded by 117.
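Eq. A-1 can be written directly as a small function. A sketch following the paper's bin convention ("0:01—0:30" is bin 1), so an event at exactly 0:00 is a boundary case not handled here:

```python
# Eq. A-1: day number p (1..140, counted from 01/09/2018) and 30-minute
# bin q (1..48) give the discrete time index T = (p - 1) * 48 + q.
from datetime import date, datetime

SEMESTER_START = date(2018, 9, 1)

def encode_time(ts: datetime) -> int:
    """Map a timestamp to its discrete time index T in 1..6720."""
    p = (ts.date() - SEMESTER_START).days + 1   # day number, 1-based
    minutes = ts.hour * 60 + ts.minute
    q = (minutes - 1) // 30 + 1                 # 30-minute bin, 1-based
    return (p - 1) * 48 + q

print(encode_time(datetime(2018, 9, 3, 10, 24)))   # the paper's example -> 117
```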
STEP 2. BEHAVIORAL DATA REPRESENTATION
Following the time data representation, the raw behavioral data are converted to a 0/1 sequence by the following two steps:
⚫ Step 2.1. First, a zero sequence Xij of length Nt is generated; and
⚫ Step 2.2. If a behavior occurs at time Tij, the Tij-th element of the corresponding discrete behavioral sequence is set to 1, i.e., Xij = 1. For instance, if a student has a meal at "03/09/2018, 10:24" (where Tij = 117), the 117th element of the discrete meal sequence is set to 1.
This process can be described as follows:

Xij = 1, if a behavior happens at time Tij; Xij = 0, otherwise,   (A-2)

where Xij ∈ {0, 1}. According to Eq. A-2, all behavioral data listed in Table II (including SPOC online study, borrowing a book, library entry, meal consumption, breakfast consumption, consumption, clinical visits, and WiFi data in the study and relaxation areas) are converted to discrete behavioral sequences.
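Step 2 is simply a length-6720 zero vector with a 1 wherever the behavior occurs (Eq. A-2). A sketch, with illustrative event indices:

```python
# Eq. A-2: build the 0/1 behavioral sequence from encoded times T.
import numpy as np

N_T = 140 * 48   # 6720 discrete time bins for the semester

def behavior_sequence(event_indices):
    """Return the 0/1 sequence with a 1 at each occupied time bin."""
    x = np.zeros(N_T, dtype=np.int8)
    for t in event_indices:
        x[t - 1] = 1       # T is 1-based, the array is 0-based
    return x

seq = behavior_sequence([117, 300, 6720])   # illustrative event times
print(seq.sum())                            # -> 3
```

One such sequence per behavior per student is then fed to the entropy, LyE, HurstE, and DFA computations.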
Appendix B
Regarding the nine assistant indicators described in Section IV.C, the first seven are calculated according to [24], while the last two (LSTM-49, LSTM-1) are selected from the 50 extracted LSTM features without any further processing. The fundamental mathematical approach for calculating the first seven indicators is the same: they all represent a certain property of all the behaviors involved in our study, and they differ only in the input features used for the calculation. To clarify, in this section the mathematical approach for calculating D-linear is given.
⚫ Step 1. The score of each linear feature (e.g., slope) for each student is calculated as follows:

Score_k^n = (N − Rank(xn)) / N, if Corr(Xk) > 0;
Score_k^n = Rank(xn) / N, if Corr(Xk) < 0,   (B-1)

We assume that there are N students and K extracted features in total. Corr(Xk) is the Pearson correlation coefficient between the kth feature Xk and students' academic performance, where k ≤ K. Rank(xn) is the ranking of the nth student's (denoted as un, where n ≤ N) feature value among all students. For example, if there are three students (u1, u2, u3) whose kth feature (e.g., the slope value of having breakfast) is (0.8, 0.4, 0.6), then Score_k^1 = 0, Score_k^2 = 0.667, and Score_k^3 = 0.333 because Corr(Xk) > 0.
⚫ Step 2. The indicator of the linear feature group, D-Linear, is calculated from the feature scores as follows:

D-Linear^n = Σ (k = 1 to K) Corr(Xk) × Score_k^n,   (B-2)

We note that essentially D-Linear is the weighted mean of all linear feature scores, and its weights are the correlation coefficients.
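Eqs. B-1 and B-2 can be implemented in a few lines; the sketch below reproduces the worked example (three students with slope values 0.8, 0.4, 0.6 and a positive correlation). The ascending-rank convention (smallest value gets rank 1) is inferred from that example:

```python
# Eq. B-1: rank-based feature score; Eq. B-2: correlation-weighted
# combination over all K features.
import numpy as np
from scipy.stats import rankdata

def feature_scores(values, corr):
    """Eq. B-1: Score = (N - Rank)/N if corr > 0, else Rank/N."""
    n = len(values)
    ranks = rankdata(values)   # ascending: smallest value -> rank 1
    return (n - ranks) / n if corr > 0 else ranks / n

def d_linear(feature_matrix, corrs):
    """Eq. B-2: sum over features of Corr(X_k) * Score_k^n per student."""
    return sum(c * feature_scores(col, c)
               for col, c in zip(feature_matrix.T, corrs))

scores = feature_scores(np.array([0.8, 0.4, 0.6]), corr=0.5)
print(np.round(scores, 3))   # -> [0.    0.667 0.333], as in the example
```

The other six composite indicators (D-postRSS, D-preSlope, D-nonLinear, etc.) follow the same recipe with different input feature groups.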
REFERENCES
[1] A. Furnham, and J. Monsen, “Personality traits and intelligence
predict academic school grades," Learning and Individual Differences,
vol. 19, no. 1, pp. 0-33, 2009.
[2] M. A. Conard, “Aptitude is not enough: How personality and behavior
predict academic performance,” Journal of Research in Personality,
vol. 40, no. 3, pp. 339-346, 2006.
[3] T. Chamorropremuzic, and A. Furnham, “Personality predicts
academic performance: Evidence from two longitudinal university
samples,” Journal of Research in Personality, vol. 37, no. 4, pp. 319-
338, 2003.
[4] R. Langford, C. P. Bonell, H. E. Jones, T. Pouliou, S. M. Murphy, and
E. Waters, “The WHO health promoting school framework for
improving the health and well‐being of students and their academic
achievement,” Cochrane Database of Systematic Reviews, vol. 4, no.
4, pp. CD008958, 2014.
[5] A. Jones, and K. Issroff, “Learning technologies: Affective and social
issues in computer-supported collaborative learning,” Computers &
Education, vol. 44, no. 4, pp. 395-408, 2005.
[6] D. N. A. G. Van, E. Hartman, J. Smith, and C. Visscher, “Modeling
relationships between physical fitness, executive functioning, and
academic achievement in primary school children,” Psychology of
Sport & Exercise, vol. 15, no. 4, pp. 319-325, 2014.
[7] R. Wang, F. Chen, Z. Chen, T. Li, and A. T. Campbell, “StudentLife:
Assessing mental health, academic performance and behavioral trends
of college students using smartphones,” In Proc. of the ACM
International Joint Conference on Pervasive & Ubiquitous Computing,
Seattle, WA, USA, 2014.
[8] R. Wang, G. Harari, P. Hao, X. Zhou, and A. T. Campbell, “SmartGPA:
How smartphones can assess and predict academic performance of
college students,” In Proc. of the ACM International Joint Conference
on Pervasive & Ubiquitous Computing, Osaka, Japan, 2015.
[9] M. T. Trockel, M. D. Barnes, and D. L. Egget, “Health-related
variables and academic performance among first-year college students:
Implications for sleep and other behaviors,” Journal of American
College Health, vol. 49, no. 3, pp. 125-131, 2000.
[10] D. M. Hansen, S. D. Herrmann, K. Lambourne, J. Lee, and J. E.
Donnelly, “Linear/nonlinear relations of activity and fitness with
children’s academic achievement,” Med Sci Sports Exerc. vol. 46, no.
12, pp. 2279-2285, 2014.
[11] A. K. Porter, K. J. Matthews, D. Salvo, and H. W. Kohl, “Associations
of physical activity, sedentary time, and screen time with
cardiovascular fitness in United States adolescents: Results from the
NHANES national youth fitness survey (NNYFS),” Journal of
Physical Activity and Health, pp. 1-21, 2017.
[12] K. N. Aadland, O. Yngvar, A. Eivind, K. S. Bronnick, L. Arne, and G.
K. Resaland, “Executive functions do not mediate prospective
relations between indices of physical activity and academic
performance: the active smarter kids (ask) study,” Frontiers in
Psychology, vol. 8, pp. 1088, 2017.
[13] M. Credé, S. G. Roch, and U. M. Kieszczynska, "Class attendance in college: A meta-analytic review of the relationship of class attendance with grades and student characteristics," Review of Educational Research, vol. 80, no. 2, pp. 272-295, 2010.
[14] S. P. Gilbert, and C. C. Weaver, “Sleep quality and academic
performance in university students: A wake-up call for college
psychologists,” Journal of College Student Psychotherapy, vol. 24, no.
4, pp. 295-306, 2010.
[15] A. Wald, P. A. Muennig, K. A. O'Connell, and C. E. Garber, "Associations between healthy lifestyle behaviors and academic performance in U.S. undergraduates: A secondary analysis of the American College Health Association's National College Health Assessment II," American Journal of Health Promotion, vol. 28, no. 5, pp. 298-305, 2014.
[16] P. Scanlon, and A. Smeaton, “Identifying the impact of friends on their
peers academic performance,” In Proc. of the IEEE/ACM
International Conference on Advances in Social Networks Analysis
and Mining (ASONAM), San Francisco, CA, USA, 2016.
[17] E. L. Faught, J. P. Ekwaru, D. Gleddie, K. E. Storey, M. Asbridge, and
P. J. Veugelers, “The combined impact of diet, physical activity, sleep
and screen time on academic achievement: A prospective study of
elementary school students in Nova Scotia, Canada,” International
Journal of Behavioral Nutrition and Physical Activity, vol. 14, no. 1,
pp. 29, 2017.
[18] E. L. Faught, D. Gleddie, K. E. Storey, C.M. Davison, and P. J.
Veugelers, “Healthy lifestyle behaviours are positively and
independently associated with academic achievement: an analysis of
self-reported data from a nationally representative sample of Canadian
early adolescents,” Plos One, vol. 12, no. 7, pp. e0181938, 2017.
[19] V. Kassarnig, E. Mones, A. Bjerre-Nielsen, P. Sapiezynski, D. D.
Lassen, and S. Lehmann, “Academic performance and behavioral
patterns,” Epj Data Science, vol. 7, no. 1, pp. 10, 2017.
[20] J. E. Donnelly, C. H. Hillman, J. L. Greene, D. M. Hansen, C. A.
Gibson, D. K. Sullivan, “Physical activity and academic achievement
across the curriculum: Results from a 3-year cluster-randomized trial,”
Preventive Medicine, vol. 99, pp. 140-145, 2017.
[21] N. Morita, T. Nakajima, K. Okita, T. Ishihara, M. Sagawa, and K.
Yamatsu, “Relationships among fitness, obesity, screen time and
academic achievement in Japanese adolescents,” Physiology &
Behavior, vol. 163, pp. 161-166, 2016.
[22] Y. Cao, J. Gao, D. Lian, Z. Rong, J. Shi, and Q. Wang, “Orderliness
predicts academic performance: Behavioural analysis on campus
lifestyle,” Journal of the Royal Society Interface, vol. 15, pp. 146,
2018.
[23] H. Yao, D. Lian, Y. Cao, Y. Wu, and T. Zhou, “Predicting academic
performance for college students: A campus behavior perspective,”
ACM Transactions on Intelligent Systems and Technology, vol. 1, no.
1, pp. 1-20, 2019.
[24] Z. Wang, X. Zhu, J. Huang, X. Li, and Y. Ji, “Prediction of academic
achievement based on digital campus,” In Proc. of the 11th
International Conference on Educational Data Mining, Buffalo, NY,
USA, 2018.
[25] I. Pytlarz, S. Pu, and M. Patel, “What can we learn from college
students’ network transactions? Constructing useful features for
students prediction,” in Proc. of the 11th International Conference on
Educational Data Mining, Buffalo, NY, USA, 2018, pp. 444-448.
[26] S. Ahmad, K. Li, A. Amin, M. S. Abnwar, and W. Khan, “A multilayer
prediction approach for the student cognitive skills measurement,”
IEEE Access, vol. 6, pp. 57470-57484, 2018.
[27] S. Qu, K. Li, S. Zhang, and Y. Wang, “Predicting achievement of
students in smart campus,” IEEE Access, vol. 6, pp. 60264-60273,
2018.
[28] N. Alalwan, W. M. Al-Rahmi, O. Alfarraj, A. Alzahrani, N. Yahaya,
and A. Al-Rahmi, “Integrated three theories to develop a model of
factors affecting students’ academic performance in higher education,”
IEEE Access, vol. 7, pp. 98725-98742, 2019.
[29] T. Phan, S. G. Mcneil, and B. R. Robin, “Students’ patterns of
engagement and course performance in a massive open online course,”
Computers & Education, vol. 95, pp. 36-44, 2016.
[30] S. Liu, X. Peng, H. Cheng, Z. Liu, J. Sun, and C. Yang, “Unfolding
sentimental and behavioral tendencies of learners’ concerned topics
from course reviews in a MOOC,” Journal of Educational Computing
Research, vol. 57, no. 3, pp. 670-696, 2018.
[31] S. Helal, J. Li, L. Liu, E. Ebrahimie, S. Dawson, and D. J. Murray,
“Predicting academic performance by considering student
heterogeneity,” Knowledge-Based Systems, vol. 161, pp. 134-146,
2018.
[32] Z. Liu, C. Yang, L. S. Rüdian, S. Liu, L. Zhao, and T. Wang, "Temporal emotion-aspect modeling for discovering what students are concerned about in online course forums," Interactive Learning Environments, vol. 27, pp. 598-627, 2019.
[33] A. Zollanvari, R. C. Kizilirmak, Y. H. Kho, and D. Hernandez-torrano,
“Predicting students’ GPA and developing intervention strategies
based on self-regulatory learning behaviors,” IEEE Access, vol. 5, pp.
23792-23802, 2017.
[34] A. Akram, C. Fu, Y. Li, M. Y. Javed, R. Lin, Y. Jiang, and Y. Tang,
“Predicting students academic procrastination in blended learning
course using homework submission data,” IEEE Access, vol. 7, pp.
102487-102498, 2019.
[35] L. Gao, Z. Zhao, L. Qi, Y. Liang, and J. Du, “Modeling the effort and
learning ability of students in MOOCs,” IEEE Access, vol. 7, pp.
128035-128042, 2019.
[36] Z. Liu, H. Cheng, S. Liu, and J. Sun, “Discovering the two-step lag
behavioral patterns of learners in the college SPOC platform,”
International Journal of Information and Communication Technology
Education, vol. 13, no. 1, pp. 1-13, 2017.
[37] Z. Liu, W. Zhang, H. Cheng, J. Sun, and S. Liu, “Investigating
relationship between discourse behavioral patterns and academic
achievements of students in SPOC discussion forum,” International
Journal of Distance Education Technologies, vol. 16, no. 2, pp. 37-50,
2018.
[38] Z. Liu, N. Pinkwart, H. Liu, S. Liu, and G. Zhang, “Exploring students
engagement patterns in SPOC forums and their association with
course performance,” EURASIA Journal of Mathematics, Science and
Technology Education, vol. 14, no. 7, pp. 3143-3158, 2018.
[39] B. Kim, B. Vizitei, and V. Ganapathi, “GritNet: Student performance
prediction with deep learning,” in Proc. of the 11th International
Conference on Educational Data Mining, Buffalo, NY, USA, 2018,
pp. 625-629.
[40] S. Sahebi, and P. Brusilovshky, “Student performance prediction by
discovering inter-activity relations,” In Proc. of the 11th International
Conference on Educational Data Mining, Buffalo, NY, USA, 2018,
pp. 87-96.
[41] T. L. Kelley, “The selection of upper and lower groups for the
validation of test items,” Journal of Educational Psychology, vol. 30,
no. 1, pp. 17–24, 1939.
[42] J. Heo, S. Yoon, W. S. Oh, J. W. Ma, S. Ju, and S. B. Yun, "Spatial
computing goes to education and beyond: Can semantic trajectory
characterize students?" In Proc. of the 5th ACM SIGSPATIAL
International Workshop on Analytics for Big Geospatial Data
(BigSpatial'16), Oct. 31-Nov. 03, Burlingame, CA, USA, 2016.
[43] J. Heo, H. Lim, S. B. Yun, S. Ju, S. Park, and R. Lee, "Descriptive and
predictive modeling of student achievement, satisfaction, and mental
health for data-driven smart connected campus life service," In Proc.
of the 9th International Conference on Learning Analytics &
Knowledge (LAK'19), March, Tempe, AZ, USA, 2019.
[44] X. Zhang, G. Sun, Y. Pan, H. Sun, Y. He, and J. Tan, “Students
performance modeling based on behavior pattern,” Journal of Ambient
Intelligence and Humanized Computing, vol. 9, pp. 1659–1670, 2018.
[45] S. R. Eddy, “Hidden markov models,” Current Opin Struct Biol, vol.
6, no. 3, 361–365, 1996.
[46] https://ww2.mathworks.cn/help/stats/hidden-markov-models-hmm.html.
[47] J. Howcroft, J. Kofman, and E. D. Lemaire, “Review of fall risk
assessment in geriatric populations using inertial sensors,” Journal of
Neuroengineering & Rehabilitation, vol. 10, no. 1, pp. 1-12, 2013.
[48] A. Wolf, J. B. Swift, H. L. Swinney, and J. A. Vastano, "Determining Lyapunov exponents from a time series," Physica D: Nonlinear Phenomena, vol. 16, no. 3, pp. 285-317, 1985.
[49] M. T. Rosenstein, J. J. Collins, and C. J. De Luca, "A practical method for calculating largest Lyapunov exponents from small data sets," Physica D: Nonlinear Phenomena, vol. 65, no. 1-2, pp. 117-134, 1993.
[50] S. M. Bruijn, D. J. J. Bregman, O. G. Meijer, P. J. Beek, and J. H.
Dieen, “Maximum lyapunov exponents as predictors of global gait
stability: A modeling approach,” Medical Engineering & Physics, vol.
34, no. 4, pp. 428-436, 2012.
[51] N. F. Güler, E. Ubeyli, and I. Güler, “Recurrent neural networks
employing lyapunov exponents for EEG signals classification,” Expert
Systems with Applications, vol. 29, no. 3, pp. 506-514, 2005.
[52] H. E. Hurst, “Suggested statistical model of some time series which
occurs in nature,” Nature, vol. 180, no. 4584, pp. 494, 1957.
[53] B. Qian, and K .Rasheed, “Hurst exponent and financial market
predictability,” In Proc. of the 2nd IASTED International Conference
on Financial Engineering and Applications, Cambridge, MA, USA,
2004, pp. 356-362.
[54] R. Weron, “Estimating long range dependence: Finite sample
properties and confidence intervals,” Physica A Statistical Mechanics
& Its Applications, vol. 312, no. 1, pp. 285-299, 2001.
[55] C. K. Peng, S. V. Buldyrev, S. Havlin, M. Simons, and A. L.
Goldberger, “Mosaic organization of DNA nucleotides,” Physical
Review E, vol. 49, pp. 1685-1689, 1994.
[56] C. K. Peng, S. Havlin, E. H. Stanley, and A. L. Goldberger, "Quantification of scaling exponents and crossover phenomena in nonstationary heartbeat time series," Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 5, no. 1, pp. 82-87, 1995.
[57] R. Hardstone, S. S. Poil, G. Schiavone, R. Jansen, V. V. Nikulin, and
H. D. Mansvelder, “Detrended fluctuation analysis: A scale-free view
on neuronal oscillations,” Frontiers in Physiology, vol. 3, pp. 1-12,
2012.
[58] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[59] S. Alhagry, A. Fahmy, and R. El-Khoribi, “Emotion recognition based
on EEG using LSTM recurrent neural network,” International Journal
of Advanced Computer Science and Applications, vol. 8, no. 10, pp.
355-358, 2017.
[60] M. Wöllmer, A. Metallinou, F. Eyben, B. Schuller, and S. Narayanan, "Context-sensitive multimodal emotion recognition from speech and facial expression using bidirectional LSTM modeling," in Proc. of Interspeech, Conference of the International Speech Communication Association (ISCA), Makuhari, Chiba, Japan, 2010.
[61] Z. Zhao, W. Chen, X. Wu, P. C. Y. Chen, and J. Liu, “LSTM network:
A deep learning approach for short-term traffic forecast,” IET
Intelligent Transport Systems, vol. 11, no. 2, pp. 68-75, 2017.
[62] M. Baccouche, F. Mamalet, C. Wolf, C. Garcia, and A. Baskurt, "Action classification in soccer videos with long short-term memory recurrent neural networks," in Proc. of ICANN 2010: International Conference on Artificial Neural Networks, Thessaloniki, Greece, September 2010, Part II.
[63] R. Tibshirani. “Regression shrinkage and selection via the lasso,”
Journal of the Royal Statistical Society. Series B (Methodological), vol.
58, no. 1, pp. 267–288, 1996.
[64] C. Burges, T. Shaked, E. Renshaw, A. Lazier, and G. N. Hullender,
“Learning to rank using gradient descent,” In: Proc. of the Twenty-
Second International Conference (ICML 2005), Bonn, Germany,
August 7-11, 2005.
[65] S. Helal, J. Li, L. Liu, E. Ebrahimie, S. Dawson, D. J. Murray, and Q.
Long, “Predicting academic performance by considering student
heterogeneity,” Knowledge-Based Systems, vol. 161, pp. 134-146,
2018.
[66] X. Zhang, S. Qu, J. Huang, B. Fang, and P. Yu, “Stock market
prediction via multi-source multiple instance learning,” IEEE Access,
vol. 6, pp. 50720-50728, 2018.
[67] Z. Wu, W. Lin, P. Liu, J. Chen, and L. Mao, “Predicting long-term
scientific impact based on multi-field feature extraction,” IEEE
ACCESS, vol. 7, 2019, 51759-51770.
[68] L. Qiu, Q. Lei, and Z. Zhang, “Advanced sentiment classification of
Tibetan microblogs on smart campuses based on multi-feature fusion,”
IEEE Access, vol. 6, pp. 17896-17904, 2018.
[69] R. Conijn, C. Snijders, A. Kleingeld, and U. Matzat, “Predicting
student performance from LMS data: A comparison of 17 blended
courses using moodle LMS,” IEEE Trans. on Learning Technologies,
vol. 10, no. 1, pp. 17-29, 2017.