The document reviews the history and trends in educational data mining (EDM) research. It discusses how EDM has grown from early work analyzing student-computer interaction logs to using a variety of data mining and machine learning methods. Relationship mining was historically prominent but prediction and discovery with models have increased. The document also summarizes key applications of EDM including student modeling, knowledge modeling, pedagogical support analysis, and exploring educational theories. It analyzes the most influential early EDM papers and identifies trends like using EDM to study gaming behavior and develop student models.
What forty years of research says about the impact of technology on learning...Cathy Cavanaugh
This research study employs a second-order meta-analysis procedure to summarize
40 years of research activity addressing the question, does computer
technology use affect student achievement in formal face-to-face classrooms
as compared to classrooms that do not use technology? A study-level meta-analytic
validation was also conducted for purposes of comparison. An
extensive literature search and a systematic review process resulted in the
inclusion of 25 meta-analyses with minimal overlap in primary literature,
encompassing 1,055 primary studies. The random effects mean effect size of
0.35 was significantly different from zero. The distribution was heterogeneous
under the fixed effects model. To validate the second-order meta-analysis,
574 individual independent effect sizes were extracted from 13 out
of the 25 meta-analyses. The mean effect size was 0.33 under the random
effects model, and the distribution was heterogeneous. Insights about the
state of the field, implications for technology use, and prospects for future
research are discussed.
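The abstract above reports a random effects mean effect size and heterogeneity under the fixed effects model without naming the estimator. As a rough illustration only, the sketch below pools effect sizes with the DerSimonian-Laird method, which is one common random-effects estimator and is an assumption here, not the study's stated procedure. The function name and the input values are hypothetical.

```python
def random_effects_mean(effects, variances):
    """Pool effect sizes with the DerSimonian-Laird random-effects estimator.

    Returns (pooled_mean, tau_squared, q_statistic).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two effect sizes with matching variances")
    k = len(effects)
    # Fixed-effects (inverse-variance) weights and mean.
    w = [1.0 / v for v in variances]
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q measures heterogeneity around the fixed-effects mean.
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, tau2, q


if __name__ == "__main__":
    # Made-up effect sizes and variances, for illustration only.
    pooled, tau2, q = random_effects_mean([0.20, 0.45, 0.31], [0.010, 0.020, 0.015])
    print(f"pooled mean = {pooled:.3f}, tau^2 = {tau2:.4f}, Q = {q:.3f}")
```

A Q value well above its degrees of freedom (k - 1) is the usual signal of the kind of heterogeneity the abstract mentions.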
An investigation of factors influencing student use of technology in K-12 cla...Cathy Cavanaugh
The purpose of this research was to examine the effects of teachers’ characteristics,
school characteristics, and contextual characteristics on classroom
technology integration and teacher use of technology as mediators of student
use of technology. A research-based path model was designed and tested
based on data gathered from 732 teachers from 17 school districts and 107
different schools in the state of Florida. Results show that a teacher’s level
of education and experience teaching with technology positively and significantly
influence his/her use of technology. Teacher use of technology
strongly and positively explains classroom technology integration and
student use of technology. Further, how a teacher integrates technology
into the classroom explains how frequently students use technology in a
school setting. The findings provided significant evidence that the path
model is useful in explaining factors affecting student use of technology
and the relationships among the factors.
Data Mining Techniques in Higher Education an Empirical Study for the Univer...IJMER
Nowadays, one of the biggest challenges that educational institutions face is the explosive
growth of educational data and how to use these data to improve the quality of managerial decisions.
Data mining, as an analytical tool that can be used to extract meaningful knowledge from large data
sets, can be used to achieve this goal.
This paper addresses the application of Educational Data Mining (EDM) to extract useful information
from student registration records at the University of Palestine in the Gaza Strip. The data cover a five-year
period [2005-2011]. The paper provides an analytical tool for viewing and using this information in decision-making
processes, taking real-life examples such as students' grades and GPA.
Predicting Success : An Application of Data Mining Techniques to Student Outc...IJDKP
This project examines the effectiveness of applying machine learning techniques to the realm of college
student success, specifically with the intent of discovering and identifying those student characteristics and
factors that show the strongest predictive capability with regards to successful graduation. The student
data examined consists of first time freshmen and transfer students who matriculated at California State
University San Marcos in the period of Fall 2000 through Fall 2010 and who either graduated successfully
or discontinued their education. Operating on over 30,000 student observations, random forests are used
to determine the relative importance of the student characteristics with genetic algorithms to perform
feature selection and pruning. To improve the machine learning algorithm cross validated hyperparameter
tuning was also implemented. Overall predictive strength is relatively high as measured by the
Matthews Correlation Coefficient, and both intuitive and novel features which provide support for the
learning model are explored.
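The abstract above measures predictive strength with the Matthews Correlation Coefficient (MCC). As a minimal sketch, the function below computes MCC from binary confusion-matrix counts using the standard formula. The label encoding (1 = graduated, 0 = discontinued) and the sample predictions are assumptions made for illustration, not data from the study.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical labels: 1 = graduated, 0 = discontinued.
    y_true = [1, 1, 1, 0, 0, 1, 0, 1]
    y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
    print(matthews_corrcoef(*confusion_counts(y_true, y_pred)))
```

MCC ranges from -1 to 1 and, unlike plain accuracy, stays informative when one outcome is much more frequent than the other.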
An Analysis of Behavioral Intention toward Actual Usage of Open Source Softwa...IJAEMSJORNAL
This study analyzed behavioral intention toward the actual usage of open source software (OSS) in private universities in Tanzania. Questionnaires were used to collect quantitative data at two private universities, namely Iringa University and Ruaha Catholic University. Stratified sampling was used to ensure that the sample was representative of the two universities, and simple random sampling was used to draw a sample from each stratum during the survey. Findings from Structural Equation Modeling indicated that performance expectancy (source code production and software localization) and social factors (vendors, internet service providers and lecturers) have a significant influence on behavioral intention, while effort expectancy was found to be insignificant. In addition, behavioral intention was found to have a significant effect on students' actual usage of open source software in universities. The study recommends that, for students to develop behavioral intention toward actual OSS usage, internet service providers should improve the level of internet service so that university communities can access and download open source software. In addition, to increase actual use, open source software vendors and lecturers or experts should make sure that their software source code is free for distribution and localization; this will increase students' self-motivation and interest in actually using open source software.
Data Mining Application in Advertisement Management of Higher Educational Ins...ijcax
In recent years, competition among Indian higher educational institutes to attract students for enrollment has grown rapidly. To attract students, educational institutes select the best advertisement method. Many different forms of advertisement are available in the market, but selecting among them is very difficult for institutes. This paper helps institutes select the best advertisement medium using data mining methods.
Data Mining Model for Predicting Student Enrolment in STEM Courses in Higher ...Editor IJCATR
Educational data mining is the process of applying data mining tools and techniques to analyze data at educational
institutions. In this paper, educational data mining was used to predict enrollment of students in Science, Technology, Engineering and
Mathematics (STEM) courses in higher educational institutions. The study examined the extent to which individual, sociodemographic
and school-level contextual factors help in pre-identifying successful and unsuccessful students in enrollment in STEM
disciplines in Higher Education Institutions in Kenya. The Cross Industry Standard Process for Data Mining framework was applied to
a dataset drawn from the first, second and third year undergraduate female students enrolled in STEM disciplines in one University in
Kenya to model student enrollment. Feature selection was used to rank the predictor variables by their importance for further analysis.
Various predictive algorithms were evaluated in predicting enrollment of students in STEM courses. Empirical results showed the
following: (i) the most important factors separating successful from unsuccessful students are: High School final grade, teacher
inspiration, career flexibility, pre-university awareness and mathematics grade. (ii) among classification algorithms for prediction,
decision tree (CART) was the most successful classifier with an overall percentage of correct classification of 85.2%. This paper
showcases the importance of Prediction and Classification based data mining algorithms in the field of education and also presents
some promising future lines.
PREDICTING ACADEMIC MAJOR OF STUDENTS USING BAYESIAN NETWORKS TO THE CASE OF ...ijcax
In this study, conducted in the current year in the city of Maragheh, Iran, a number of high school students in the following fields of study were studied and compared: mathematics, experimental sciences, humanities, vocational, business and science. The purpose of this research is to predict the academic major of high school students using Bayesian networks. Factors that affect the choice of academic major are used for the first time as indicators in the Bayesian networks. The evaluation of the indicators' effects on each other, data discretization, and processing were performed with GeNIe. An appropriate course of study is then recommended for students to continue their education.
Clustering Students of Computer in Terms of Level of Programming Editor IJCATR
Educational data mining (EDM) is one of the applications of data mining. In educational data mining, there are two key domains: the student domain and the faculty domain. Different types of research work have been done in both domains.
In the existing system, faculty performance is calculated on the basis of two parameters: student feedback and the students' results in the subject. The existing system defines two approaches, a multiple-classifier approach and a single-classifier approach, and compares them for the relative evaluation of faculty performance using data mining techniques.
In the multiple-classifier approach, K-nearest neighbor (KNN) is used in the first step and rule-based classification is used in the second step of classification, while in the single-classifier approach only KNN is used in both steps of classification.
In the proposed system, I will analyse faculty performance using four parameters: student complaints about faculty, student review feedback for faculty, student feedback, and student results.
For this proposed system I will use an opinion mining technique to analyze the performance of faculty and calculate a score for each faculty member.
Data Mining Techniques for School Failure and Dropout System Kumar Goud
Abstract: Data mining techniques are applied to predict school failure and dropout of students. The method uses real data on middle-school students to predict failure and dropout. It implements white-box classification strategies, such as rule induction and decision trees. A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences. It is a flowchart-like structure in which each internal node represents a "test" on an attribute, each branch represents the outcome of the test, and each leaf node represents a class label. Here the attributes are real information about students collected from schools at the middle or secondary level. The paths from root to leaf represent classification rules. A decision tree consists of three kinds of nodes: decision nodes, chance nodes and end nodes. It is specifically used in decision analysis. The technique is used to improve accuracy in predicting which students might fail or drop out, first by using all the available attributes and then by selecting the best attributes. Attribute selection is done using the WEKA tool.
Keywords: dataset, classification, clustering.
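The abstract above describes a tree in which internal nodes test an attribute, branches carry test outcomes, leaves carry class labels, and each root-to-leaf path is a classification rule. The sketch below illustrates that idea on a tiny hand-built tree. The attributes (`absences`, `prior_grade`), their values, and the tree itself are hypothetical and are not taken from the paper or from WEKA output.

```python
# A tiny hand-built tree: internal nodes hold a "test" attribute and
# "branches" keyed by outcome; leaf nodes hold a class "label".
TREE = {
    "test": "absences",
    "branches": {
        "high": {"label": "fail"},
        "low": {
            "test": "prior_grade",
            "branches": {
                "poor": {"label": "fail"},
                "good": {"label": "pass"},
            },
        },
    },
}


def extract_rules(node, conditions=()):
    """Return one (conditions, label) rule per root-to-leaf path."""
    if "label" in node:
        return [(list(conditions), node["label"])]
    rules = []
    for outcome, child in node["branches"].items():
        rules.extend(extract_rules(child, conditions + ((node["test"], outcome),)))
    return rules


def classify(node, student):
    """Follow the branch matching each tested attribute until a leaf."""
    while "label" not in node:
        node = node["branches"][student[node["test"]]]
    return node["label"]


if __name__ == "__main__":
    for conditions, label in extract_rules(TREE):
        clause = " AND ".join(f"{attr} = {value}" for attr, value in conditions)
        print(f"IF {clause} THEN {label}")
    print(classify(TREE, {"absences": "low", "prior_grade": "good"}))
```

Readable rules of this kind are what makes such models "white-box": a teacher can inspect why a student was flagged.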
The conceptual landscape of iSchools: Examining current research interests of...Kim Holmberg
Introduction
This study describes the intellectual landscape of iSchools and examines how the various iSchools map onto these research areas.
Method
The primary focus of the data collection process was on faculty members’ current research interests as described by the individuals themselves. A co-word analysis of all iSchool faculty members’ research interests was used as a research method. The relations between the current research profiles of the iSchools were compared by calculating the cosine similarity between co-word profiles and visualized in network graphs.
Results
The results show that the iSchools still contain many dominant themes from LIS, but have an expanded conceptual landscape with the introduction of new iSchools. The methods used for data collection guaranteed the most current data available (in contrast to using publications) and the methods used for analyses gave multiple perspectives to the research landscape of the iSchools.
Conclusions
The results of the present study showed how the current research landscape of the iSchools and the shared research interests were built by many topics that still reflect dominant LIS topics (e.g., bibliometrics, information retrieval, and information seeking behaviour), but that there are also growing areas that reflect the iSchools’ interdisciplinary composition, thus answering the research questions.
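The Method paragraph above compares iSchools by the cosine similarity between their co-word profiles. The study's exact preprocessing is not given here, so the following is only a sketch under stated assumptions: a co-word profile is taken to be the count of unordered term pairs that co-occur within one faculty member's list of research interests. The school data and the function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def coword_profile(interest_lists):
    """Count unordered term pairs that co-occur in the same interest list."""
    profile = Counter()
    for terms in interest_lists:
        for pair in combinations(sorted(set(terms)), 2):
            profile[pair] += 1
    return profile


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical research-interest lists, one inner list per faculty member.
    school_a = [["bibliometrics", "information retrieval"],
                ["information retrieval", "data mining"]]
    school_b = [["bibliometrics", "information retrieval"],
                ["information seeking", "human-computer interaction"]]
    print(cosine_similarity(coword_profile(school_a), coword_profile(school_b)))
```

Pairwise similarities computed this way could then serve as edge weights in a network graph of schools.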
RESEARCH TRENDS IN EDUCATIONAL TECHNOLOGY IN TURKEY: 2010-2018 YEAR THESIS AN...ijcax
The purpose of this research is to analyze, using meta-analysis, studies in the field of Educational
Technology in Turkey and to show the trends in the field. For this purpose, a total of
263 studies published between 2010 and 2018 were analyzed, including 98 theses and 165 articles. The purposive
sampling method was used when selecting publications. Articles and theses that address
Turkey were selected from the YOK Tez Tarama Database, Journal of Hacettepe University Faculty of Education,
Educational Sciences: Theory & Practice Journal, Education and Science Journal, Elementary Education
Online Journal, The Turkish Online Journal of Education and The Turkish Online Journal of Educational
Technology. Publications were reviewed under 11 criteria: index, year of
publication, research scope, method, education level, sample, number of samples, data collection methods,
analysis techniques, research tendency, and research topics. These reveal the trends in Educational Technology research in Turkey.
The data are interpreted based on percentages and frequencies, and the results are shown in
tables.
Big data is prevalent in our daily life. Not surprisingly, big data has become a hot topic discussed by the commercial world, media, magazines, the general public and elsewhere. From an academic point of view, is it a research area worth exploring? Or is it just another hype? Are only computer or IS related scholars suited to big data research due to its nature? Or are scholars from other research areas also suited to this subject? This study aims to answer these questions through an informetric approach, using the SSCI journal database as its data source and leveraging the quantitative power of informetrics to analyze information in any form on a representative data source. This research shows that big data research is in its growth phase, with an exponential growth pattern since 2012 and great potential for years to come. Perhaps surprisingly, computer or IS related disciplines are not in the top 5 research areas in these results. In fact, the top five research disciplines are more diversified than expected: Business Economics (#1), Government Law (#2), Information Science/Library Science (#3), Social Science (#4) and Computer Science (#5). Scholars from US universities are the most productive in this subject, while Asian countries, including Taiwan, are also visible. In addition, this study finds that big data publications in the SSCI journal database during 2005-2015 fit Lotka's law. This study contributes to understanding current big data research trends and also shows the way for researchers who are interested in conducting future research in big data, regardless of their research backgrounds.
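The abstract above states that the publications fit Lotka's law, under which the number of authors with n publications falls off roughly as C / n^a, with a close to 2 in Lotka's original formulation. The sketch below estimates a and C by ordinary least squares on log-transformed counts. That fitting method is an assumption chosen for brevity (bibliometric studies often use other estimators and a goodness-of-fit test), and the author counts are made up.

```python
import math


def fit_lotka(author_counts):
    """Fit y = C / n**a to {n_publications: n_authors} by log-log least squares.

    Returns (a, C).
    """
    if len(author_counts) < 2:
        raise ValueError("need at least two distinct publication counts")
    xs = [math.log(n) for n in author_counts]
    ys = [math.log(count) for count in author_counts.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # log y = log C - a * log n, so the exponent is the negated slope.
    exponent = -slope
    constant = math.exp(mean_y - slope * mean_x)
    return exponent, constant


if __name__ == "__main__":
    # Hypothetical data: number of authors who wrote 1, 2, 3, 4 papers.
    observed = {1: 620, 2: 150, 3: 70, 4: 38}
    a, c = fit_lotka(observed)
    print(f"estimated exponent a = {a:.2f}, constant C = {c:.1f}")
```

An estimated exponent near 2 would be consistent with the classic form of the law, although a formal claim of fit would need a statistical test.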
A Survey on the Classification Techniques In Educational Data Mining Editor IJCATR
Due to increasing interest in data mining and educational systems, educational data mining is an emerging topic for the research
community. Educational data mining means extracting hidden knowledge from large repositories of data with the use of techniques
and tools. Educational data mining develops new methods to discover knowledge from educational databases, which is used for decision
making in educational systems. Various data mining techniques, such as classification and clustering, can be applied to bring out hidden
knowledge from educational data.
In this paper, we focus on educational data mining and classification techniques. In this study we analyze attributes for the
prediction of students' behavior and academic performance by using the WEKA open source data mining tool and various classification
methods such as decision trees, the C4.5 algorithm and the ID3 algorithm.
Data Mining for Education
Ryan S.J.d. Baker, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
rsbaker@cmu.edu
Article to appear as
Baker, R.S.J.d. (in press) Data Mining for Education. To appear in McGaw, B., Peterson, P.,
Baker, E. (Eds.) International Encyclopedia of Education (3rd edition). Oxford, UK: Elsevier.
This is a pre-print draft. Final article may involve minor changes and different formatting.
Cognitive Computing and Education and Learning ijtsrd
Cognitive Computing is spurred by its enormous potential in learning. The overarching purpose here is to devise computational frameworks to help us learn better by exploiting the learning process and activities. The research challenge recognizes the broad spectrum of human learning, the complex and not fully understood human learning process, and various learning factors, such as pedagogy, technology, and social elements. From the theoretical point of view, Cognitive Computing could replace existing calculators in many applications. This paper focuses on applying data mining and learning analytics, clustering student modeling, and predicting student performance in the education field, with possible approaches. Latifa Rahman "Cognitive Computing and Education and Learning" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-6 | Issue-3 , April 2022, URL: https://www.ijtsrd.com/papers/ijtsrd49783.pdf Paper URL: https://www.ijtsrd.com/humanities-and-the-arts/education/49783/cognitive-computing-and-education-and-learning/latifa-rahman
Increasing interest in big data, especially in the educational field and online education, has led to a conflict in terms of student performance indicators. In this paper we discuss the methodology of assessing student performance in terms of success indicators, revealing a number of indicators that are recommended to indicate the success of final academic achievement.
An Analysis of Behavioral Intention toward Actual Usage of Open Source Softwa...IJAEMSJORNAL
This study focused on analyzing behavioral intention toward the actual usage of open source software in private universities in Tanzania. Questionnaires were used to collect quantitative data in two private universities namely Iringa University and Ruaha Catholic University. Stratified sampling technique was utilized to ensure sample representativeness among two universities where simple random sampling was used to draw a sample from each stratum during the survey. Finding Using Structural Equation Modeling indicated that performance expectancy (source code production and software localization) and social factor (Vendor, internet services provider and lecturer) have a significant influence toward behavioral intention while effort expectancy was found to be insignificant. In addition the behavioral intention was found to be significant toward student’s actual usage of open source software in Universities. This study recommended that for students to develop behavioral intention toward OSS actual usage, internet service provider have to increase the level of internet services that can assist the university communities to access and download open source software. In addition, to increase actual use, open source software vendors and lecturer or experts have to make sure that their software source code is free for distribution and localization, this will increase self-motivation and interest of the students toward actual usage of open source software.
Data Mining Application in Advertisement Management of Higher Educational Ins...ijcax
In recent years, Indian higher educational institute’s competition grows rapidly for attracting students to get enrollment in their institutes. To attract students educational institutes select a best advertisement method. There are different advertisements available in the market but a selection of them is very difficult
for institutes. This paper is helpful for institutes to select a best advertisement medium using some data mining methods.
Data Mining Model for Predicting Student Enrolment in STEM Courses in Higher ...Editor IJCATR
Educational data mining is the process of applying data mining tools and techniques to analyze data at educational
institutions. In this paper, educational data mining was used to predict enrollment of students in Science, Technology, Engineering and
Mathematics (STEM) courses in higher educational institutions. The study examined the extent to which individual, sociodemographic
and school-level contextual factors help in pre-identifying successful and unsuccessful students in enrollment in STEM
disciplines in Higher Education Institutions in Kenya. The Cross Industry Standard Process for Data Mining framework was applied to
a dataset drawn from the first, second and third year undergraduate female students enrolled in STEM disciplines in one University in
Kenya to model student enrollment. Feature selection was used to rank the predictor variables by their importance for further analysis.
Various predictive algorithms were evaluated in predicting enrollment of students in STEM courses. Empirical results showed the
following: (i) the most important factors separating successful from unsuccessful students are: High School final grade, teacher
inspiration, career flexibility, pre-university awareness and mathematics grade. (ii) among classification algorithms for prediction,
decision tree (CART) was the most successful classifier with an overall percentage of correct classification of 85.2%. This paper
showcases the importance of Prediction and Classification based data mining algorithms in the field of education and also presents
some promising future lines.
PREDICTING ACADEMIC MAJOR OF STUDENTS USING BAYESIAN NETWORKS TO THE CASE OF ...ijcax
In this study, which took place current year in the city of Maragheh in IRAN. Number of high school students in the fields of study: mathematics, Experimental Sciences, humanities, vocational, business and science were studied and compared. The purpose of this research is to predict the academic major of high school students using Bayesian networks. The effective factors have been used in academic major selection for the first time as an effective indicator of Bayesian networks. Evaluation of Impacts of indicators on each other, discretization data and processing them was performed by GeNIe. The proper course would be advised for students to continue their education.
Clustering Students of Computer in Terms of Level of ProgrammingEditor IJCATR
Educational data mining (EDM) is one of the applications of data mining. In educational data mining, there are two key domains, i.e. student domain and faculty domain. Different type of research work has been done in both domains.
In existing system the faculty performance has calculated on the basis of two parameters i.e. Student feedback and the result of student in that subject. In existing system we define two approaches one is multiple classifier approach and the other is a single classifier approach and comparing them, for relative evaluation of faculty performance using data mining
Techniques. In multiple classifier approach K-nearest neighbor (KNN) is used in first step and Rule based classification is used in the second step of classification while in single classifier approach only KNN is used in both steps of classification.
But in proposed system, I will analyse the faculty performance using 4 parameters i.e., student complaint about faculty, Student review feedback for faculty, students feedback, and students result etc.
For this proposed system I will be going to use opinion mining technique for analyzing performance of faculty and calculating score of each faculty.
Data Mining Techniques for School Failure and Dropout SystemKumar Goud
Abstract: Data mining techniques are applied to predict college failure and bum of the student. This is method uses real data on middle-school students for prediction of failure and drop out. It implements white-box classification strategies, like induction rules and decision trees or call trees. Call tree could be a call support tool that uses tree-like graph or a model of call and their possible consequences. A call tree is a flowchart-like structure in which internal node represents a "test" on an attribute. Attribute is the real information of students that is collected from college in middle or pedagogy, each branch represents the outcome of the test and each leaf node represents a class label. The paths from root to leaf represent classification rules and it consists of three kinds of nodes which incorporates call node, likelihood node and finish node. It is specifically used in call analysis. Using this technique to boost their correctness for predicting which students might fail or dropout (idler) by first, using all the accessible attributes next, choosing the most effective attributes. Attribute choice is done by using WEKA tool.
Keywords: dataset, classification, clustering.
The conceptual landscape of iSchools: Examining current research interests of...Kim Holmberg
Introduction
This study describes the intellectual landscape of iSchools and examines how the various iSchools map onto these research areas.
Method
The primary focus of the data collection process was on faculty members’ current research interests as described by the individuals themselves. A co-word analysis of all iSchool faculty members’ research interests was used as a research method. The relations between the current research profiles of the iSchools were compared by calculating the cosine similarity between co-word profiles and visualized in network graphs.
Results
The results show that the iSchools still contain many dominant themes from LIS, but have an expanded conceptual landscape with the introduction of new iSchools. The methods used for data collection guaranteed the most current data available (in contrast to using publications) and the methods used for analyses gave multiple perspectives to the research landscape of the iSchools.
Conclusions
The results of the present study showed how the current research landscape of the iSchools and the shared research interests were built by many topics that still reflect dominant LIS topics (e.g., bibliometrics, information retrieval, and information seeking behaviour), but that there are also growing areas that reflect the iSchools’ interdisciplinary composition, thus answering the research questions.
RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...ijcax
The purpose of this research is to analyze, using meta-analysis, studies in the field of Educational
Technology in Turkey and to show the trends in the field. For this purpose, a total of
263 studies were analyzed, including 98 theses and 165 articles published between 2010 and 2018. Purposive
sampling was used to select publications. Articles and theses addressing Turkey were selected from
the YOK Tez Tarama Database, Journal of Hacettepe University Faculty of Education,
Educational Sciences: Theory & Practice Journal, Education and Science Journal, Elementary Education
Online Journal, The Turkish Online Journal of Education and The Turkish Online Journal of Educational
Technology. Publications were reviewed under 11 criteria: index, year of
publication, research scope, method, education level, sample, number of samples, data collection methods,
analysis techniques, research tendency, and research topics in Educational Technology research in Turkey.
The data are interpreted based on percentages and frequencies, and the results are presented in
tables.
Big data is prevalent in our daily life. Not surprisingly, big data has become a hot topic discussed by the commercial world, media, magazines, the general public and elsewhere. From an academic point of view, is it a research area worth exploring, or is it just another hype? Are only computer or IS related scholars suited to big data research due to its nature, or are scholars from other research areas also suited to this subject? This study aims to answer these questions using an informetrics approach and data from the SSCI journal database, leveraging the quantitative power of informetrics to analyze information in any form on a representative data source. This research shows that big data research is in its growth phase, with an exponential growth pattern since 2012 and great potential for years to come. Perhaps surprisingly, computer or IS related disciplines are not at the top of the top five research areas in these results. In fact, the top five research disciplines are more diversified than expected: Business Economics (#1), Government Law (#2), Information Science/Library Science (#3), Social Science (#4) and Computer Science (#5). Scholars from US universities are the most productive in this subject, while Asian countries, including Taiwan, are also visible. In addition, this study finds that big data publications in the SSCI journal database during 2005-2015 fit Lotka’s law. This study contributes to understanding current big data research trends and shows the way for researchers who are interested in conducting future research in big data, regardless of their research backgrounds.
A Survey on the Classification Techniques In Educational Data MiningEditor IJCATR
Due to increasing interest in data mining and educational systems, educational data mining is an emerging topic for the research
community. Educational data mining means extracting hidden knowledge from large repositories of data with the use of techniques
and tools. Educational data mining develops new methods to discover knowledge from educational databases, which is used for decision
making in educational systems. Various data mining techniques, such as classification and clustering, can be applied to bring out hidden
knowledge from the educational data.
In this paper, we focus on the educational data mining and classification techniques. In this study we analyze attributes for the
prediction of student's behavior and academic performance by using WEKA open source data mining tool and various classification
methods like decision trees, C4.5 algorithm, ID3 algorithm etc.
Data Mining for Education
Ryan S.J.d. Baker, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
rsbaker@cmu.edu
Article to appear as
Baker, R.S.J.d. (in press) Data Mining for Education. To appear in McGaw, B., Peterson, P.,
Baker, E. (Eds.) International Encyclopedia of Education (3rd edition). Oxford, UK: Elsevier.
This is a pre-print draft. Final article may involve minor changes and different formatting.
Cognitive Computing and Education and Learningijtsrd
Its enormous potential in learning spurs Cognitive Computing. The overreaching purpose here is to devise computational frameworks to help us learn better by exploiting the learning process and activities. The research challenge recognized the broad spectrum of human learning, the complex and not fully understood human learning process, and various learning factors, such as pedagogy, technology, and social elements. From the theoretical point of view, Cognitive Computing could replace existing calculators in many applications. This paper focuses on applying data mining and learning analytics, clustering student modeling, and predicting student performance when involved in the education field with possible approaches. Latifa Rahman "Cognitive Computing and Education and Learning" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-6 | Issue-3 , April 2022, URL: https://www.ijtsrd.com/papers/ijtsrd49783.pdf Paper URL: https://www.ijtsrd.com/humanities-and-the-arts/education/49783/cognitive-computing-and-education-and-learning/latifa-rahman
The increasing interest in big data, especially in the educational field and online education, has led to a conflict over student performance indicators. In this paper we discuss the methodology of assessing student performance in terms of success indicators, revealing a number of indicators that are recommended for indicating the success of final academic achievement.
The International Journal of Multimedia & Its Applications (IJMA)ijma
The International Journal of Multimedia & Its Applications (IJMA)
ISSN : 0975-5578(Online); 0975-5934 (Print)
WJCI, ERA Indexed, H Index 31
Web Page URL : http://airccse.org/journal/ijma.html
current issue link: https://airccse.org/journal/ijma_current23.html
Exploring the Aspects of Educational Robotics: A Mini Systematic Literature Review
Theinmoli Munusamy, Maizatul Hayati Mohamad Yatim and Suhazlan bin Suhaimi
Universiti Pendidikan Sultan Idris (UPSI), Malaysia
Abstract URL :https://aircconline.com/abstract/ijma/v15n4/15423ijma01.html
Article URL :https://aircconline.com/ijma/V15N4/15423ijma01.pdf
EXPLORING THE ASPECTS OF EDUCATIONAL ROBOTICS: A MINI SYSTEMATIC LITERATURE R...ijma
Educational robotics is employed in both formal education and extracurricular activities to foster student
interest, engagement, and academic performance across various subjects. The research on robot-based
learning and its impact on academics has been continuously growing in recent years. Hence, this mini
Systematic Literature Review (SLR) is aimed at reviewing previous studies on using robotics in education.
Articles from 2019-2023 were retrieved from three databases, Scopus, Springer and ScienceDirect, to identify
relevant papers and documents. This research follows the Preferred
Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) model. The findings of the articles
showed that, among regions, Asia had carried out more research. The research method most often employed in
educational robotics articles was the survey questionnaire, with the highest share at 40%. Most of the
articles focused on primary education. The findings can guide future research that needs to be conducted
concerning educational robotics among remedial students.
The main objective of this paper is to develop a basic prototype model which can determine and extract
unknown knowledge (patterns, concepts and relations) related with multiple factors from past database records of
specific students. Data mining is the science and engineering of extracting previously undiscovered patterns
from a huge set of data. Data mining techniques are helpful for decision making as well as for discovering patterns
in data. In this paper, a student eligibility prediction system using rule-based classification is proposed to predict
the eligibility of students based on their details with high prediction accuracy. Educational institutes generate a
tremendous amount of data. This paper outlines the idea of predicting a particular student’s placement
eligibility by performing operations on the stored data. An efficient algorithm using a fuzzy technique
for prediction is proposed.
In the discovery with models method, identifying relationships among students' behaviors and characteristics or contextual variables is a key application.
The State of Educational Data Mining in 2009: A
Review and Future Visions
RYAN S.J.D. BAKER
Department of Social Science and Policy Studies
Worcester Polytechnic Institute
Worcester, MA USA
AND
KALINA YACEF
School of Information Technologies
University of Sydney
Sydney, NSW Australia
________________________________________________________________________
We review the history and current trends in the field of Educational Data Mining (EDM). We consider the
methodological profile of research in the early years of EDM, compared to that in 2008 and 2009, and discuss trends
and shifts in the research conducted by this community. In particular, we discuss the increased emphasis on
prediction, the emergence of work using existing models to make scientific discoveries (“discovery with
models”), and the reduction in the frequency of relationship mining within the EDM community. We discuss
two ways that researchers have attempted to categorize the diversity of educational data mining
research, and review the types of research problems that these methods have been used to address. The
most-cited papers in EDM between 1995 and 2005 are listed, and their influence on the EDM community (and
beyond the EDM community) is discussed.
________________________________________________________________________
1. INTRODUCTION
The year 2009 finds the nascent research community of Educational Data Mining (EDM)
growing and continuing to develop. This summer, the second annual international
conference on Educational Data Mining, EDM2009, was held in Cordoba, Spain, and
plans are already underway for the third international conference to occur in June 2010 in
Pittsburgh, USA. With the publication of this issue, the Educational Data Mining
community now has its own journal, the Journal of Educational Data Mining. In addition,
it is anticipated that in the next year, Chapman & Hall/CRC Press, Taylor and Francis
Group will publish the first Handbook of Educational Data Mining.
This moment in the educational data mining community’s history provides a unique
opportunity to consider where we come from and where we are headed. In this article, we
will review some of the major areas and trends in EDM, some of the most prominent
articles in the field (both those published in specific EDM venues, and in other venues
where top-quality EDM research can be found), and consider what the future may hold
for our community.
3 Journal of Educational Data Mining, Article 1, Vol 1, No 1, Fall 2009
2. WHAT IS EDM?
The Educational Data Mining community website, www.educationaldatamining.org,
defines educational data mining as follows: “Educational Data Mining is an emerging
discipline, concerned with developing methods for exploring the unique types of data that
come from educational settings, and using those methods to better understand students,
and the settings which they learn in.”
Data mining, also called Knowledge Discovery in Databases (KDD), is the field
of discovering novel and potentially useful information from large amounts of data
[Witten and Frank 1999]. It has been proposed that educational data mining methods are
often different from standard data mining methods, due to the need to explicitly account
for (and the opportunities to exploit) the multi-level hierarchy and non-independence in
educational data [Baker in press]. For this reason, it is increasingly common to see the
use of models drawn from the psychometrics literature in educational data mining
publications [Barnes 2005; Desmarais and Pu 2005; Pavlik et al. 2008].
3. EDM METHODS
Educational data mining methods are drawn from a variety of literatures, including data
mining and machine learning, psychometrics and other areas of statistics, information
visualization, and computational modeling. Romero and Ventura [2007] categorize work
in educational data mining into the following categories:
• Statistics and visualization
• Web mining
    o Clustering, classification, and outlier detection
    o Association rule mining and sequential pattern mining
    o Text mining
This viewpoint is focused on applications of educational data mining to web data, a
perspective that accords with the history of the research area. To a large degree,
educational data mining emerged from the analysis of logs of student-computer
interaction. This is perhaps most clearly shown by the name of an early EDM workshop
(according to the EDM community website, the third workshop in the history of the
community – the workshop at AIED2005 on Usage Analysis in Learning Systems
[Choquet et al. 2005]). The methods listed by Romero and Ventura as web mining
methods are quite prominent in EDM today, both in mining of web data and in mining
other forms of educational data.
A second viewpoint on educational data mining is given by Baker [in press], which
classifies work in educational data mining as follows:
• Prediction
    o Classification
    o Regression
    o Density estimation
• Clustering
• Relationship mining
    o Association rule mining
    o Correlation mining
    o Sequential pattern mining
    o Causal data mining
• Distillation of data for human judgment
• Discovery with models
The first three categories of Baker’s taxonomy of educational data mining methods
would look familiar to most researchers in data mining (the first set of sub-categories are
directly drawn from Moore’s categorization of data mining methods [Moore 2006]). The
fourth category, though not necessarily universally seen as data mining, accords with
Romero and Ventura’s category of statistics and visualization, and has had a prominent
place both in published EDM research [Kay et al. 2006], and in theoretical discussions of
educational data mining [Tanimoto 2007].
The fifth category of Baker’s EDM taxonomy is perhaps the most unusual category,
from a classical data mining perspective. In discovery with models, a model of a
phenomenon is developed through any process that can be validated in some fashion
(most commonly, prediction or knowledge engineering), and this model is then used as a
component in another analysis, such as prediction or relationship mining. Discovery with
models has become an increasingly popular method in EDM research, supporting
sophisticated analyses such as which learning material sub-categories of students will
most benefit from [Beck and Mostow 2008], how different types of student behavior
impact students’ learning in different ways [Cocea et al. 2009], and how variations in
intelligent tutor design impact students’ behavior over time [Jeong and Biswas 2008].
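The two-step structure of discovery with models can be sketched in a few lines. The example below is only an illustration under stated assumptions: `detect_gaming` is a hypothetical stand-in for a previously validated behavior detector, the threshold inside it is invented, and the student logs and learning gains are toy values, not data from any cited study. The point is the shape of the analysis: the model's output becomes a variable in a second analysis (here, a correlation).

```python
from statistics import mean

def detect_gaming(action):
    """Hypothetical stand-in for a validated detector (assumption):
    flags an action as gaming if it is a very fast wrong answer."""
    return action["seconds"] < 2.0 and not action["correct"]

def gaming_rate(actions):
    """Step 1: apply the existing model to every logged action of one student."""
    return sum(detect_gaming(a) for a in actions) / len(actions)

def pearson(xs, ys):
    """Step 2 helper: Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def make_log(pairs):
    """Build a toy action log from (seconds, correct) pairs."""
    return [{"seconds": s, "correct": c} for s, c in pairs]

# Toy logs and learning gains for three students (illustrative values only).
logs = {
    "s1": make_log([(1.0, False), (1.5, False), (8.0, True), (9.0, True)]),
    "s2": make_log([(6.0, True), (7.0, False), (1.0, False), (10.0, True)]),
    "s3": make_log([(5.0, True), (6.0, True), (7.0, False), (9.0, True)]),
}
learning_gain = {"s1": 0.10, "s2": 0.35, "s3": 0.50}

students = sorted(logs)
rates = [gaming_rate(logs[s]) for s in students]
gains = [learning_gain[s] for s in students]

# The detector's output is now an input to a relationship-mining analysis.
r = pearson(rates, gains)
print(f"gaming rates: {rates}, correlation with gain: {r:.2f}")
```

In a real study the detector would first be validated on labeled data, and the second analysis would need to account for the non-independence of observations within students.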
Historically, relationship mining methods of various types have been the most
prominent category in EDM research. In Romero & Ventura’s survey of EDM research
from 1995 to 2005, 60 papers were reported that utilized EDM methods to answer
research questions of applied interest (according to a post-hoc analysis conducted for the
current article). 26 of those papers (43%) involved relationship mining methods. 17 more
papers (28%) involved prediction methods of various types. Other methods were less
common. The full distribution of methods across papers is shown in Figure 1.
Figure 1. The proportion of papers involving each type of EDM method, in Romero & Ventura’s [2007]
1995-2005 survey. Note that papers can use multiple methods, and thus some papers can be found in multiple
categories.
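The percentages quoted above follow directly from the paper counts. A minimal check, using only the two counts stated in the text (the variable names are illustrative):

```python
total_papers = 60  # applied papers in the 1995-2005 survey, as stated in the text
papers_by_method = {"relationship mining": 26, "prediction": 17}

# Share of papers per method, rounded to the nearest whole percent.
shares = {
    method: round(100 * count / total_papers)
    for method, count in papers_by_method.items()
}
print(shares)
```

Because a paper can use more than one method, shares computed this way across all categories need not sum to 100%.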
4. KEY APPLICATIONS OF EDM METHODS
Educational Data Mining researchers study a variety of areas, including individual
learning from educational software, computer supported collaborative learning,
computer-adaptive testing (and testing more broadly), and the factors that are associated
with student failure or non-retention in courses.
Across these domains, one key area of application has been in the improvement of
student models. Student models represent information about a student’s characteristics or
state, such as the student’s current knowledge, motivation, meta-cognition, and attitudes.
Modeling student individual differences in these areas enables software to respond to
those individual differences, significantly improving student learning [Corbett 2001].
Educational data mining methods have enabled researchers to model a broader range of
potentially relevant student attributes in real-time, including higher-level constructs than
were previously possible. For instance, in recent years, researchers have used EDM
methods to infer whether a student is gaming the system [Baker et al. 2004], experiencing
poor self-efficacy [McQuiggan et al. 2008], off-task [Baker 2007], or even if a student is
bored or frustrated [D'Mello et al. 2008]. Researchers have also been able to extend
student modeling even beyond educational software, towards figuring out what factors
are predictive of student failure or non-retention in college courses or in college
altogether [Dekker et al. 2009; Romero et al. 2008; Superby et al. 2006].
A second key area of application of EDM methods has been in discovering or
improving models of a domain’s knowledge structure. Through the combination of
psychometric modeling frameworks with space-searching algorithms from the machine
learning literature, a number of researchers have been able to develop automated
approaches that can discover accurate domain structure models, directly from data. For
instance, Barnes [2005] has developed algorithms which can automatically discover a
Q-Matrix from data, and Desmarais & Pu [2005] and Pavlik et al. [Pavlik et al. 2009; Pavlik,
Cen, Wu and Koedinger 2008] have developed algorithms for finding partial order
knowledge structure (POKS) models that explain the interrelationships of knowledge in a
domain.
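To make the idea of a Q-Matrix concrete, the sketch below shows how a given binary Q-Matrix maps a student's concept state to expected responses. It assumes a conjunctive reading (a question is answered correctly only if every concept it requires is known) and a concepts-by-questions orientation; both are assumptions for illustration, and the matrix itself is a toy example. The sketch does not implement the discovery algorithms cited above, which search for the matrix that best explains observed responses.

```python
def ideal_responses(q_matrix, concept_state):
    """Expected responses for one concept state.

    q_matrix[c][q] == 1 means question q requires concept c (assumed orientation).
    concept_state[c] == 1 means the student knows concept c.
    Returns a list with 1 for each question expected to be answered correctly.
    """
    n_concepts = len(q_matrix)
    n_questions = len(q_matrix[0])
    responses = []
    for q in range(n_questions):
        required = [c for c in range(n_concepts) if q_matrix[c][q] == 1]
        responses.append(int(all(concept_state[c] == 1 for c in required)))
    return responses

# Toy Q-Matrix: 2 concepts (rows) by 3 questions (columns).
# Question 0 needs concept 0, question 1 needs concept 1, question 2 needs both.
toy_q_matrix = [
    [1, 0, 1],
    [0, 1, 1],
]

for state in ([0, 0], [1, 0], [0, 1], [1, 1]):
    print(state, "->", ideal_responses(toy_q_matrix, state))
```

A discovery procedure could then compare such ideal response vectors against observed student responses and adjust the matrix to reduce the mismatch.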
A third key area of application of EDM methods has been in studying pedagogical
support (both in learning software, and in other domains, such as collaborative learning
behaviors), towards discovering which types of pedagogical support are most effective,
either overall or for different groups of students or in different situations [Beck and
Mostow 2008; Pechenizkiy et al. 2008]. One popular method for studying pedagogical
support is learning decomposition [Beck and Mostow 2008]. Learning decomposition fits
exponential learning curves to performance data, relating a student’s later success to the
amount of each type of pedagogical support the student received up to that point. The
relative weights for each type of pedagogical support, in the best-fit model, can be used
to infer the relative effectiveness of each type of support for promoting learning.
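As a rough illustration of the fitting step, the sketch below assumes a two-support exponential form, performance = A * exp(-b * (B * t1 + t2)), where t1 and t2 count prior exposures to two types of support and B is the weight of type 1 relative to type 2. The functional form, the names, the grid search over B, and the synthetic data are all assumptions made for this example; they are not taken from this article, and a real analysis would typically use a nonlinear regression package and per-student parameters.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.
    Returns (intercept, slope, sum of squared errors)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sse

def fit_learning_decomposition(trials, candidate_weights):
    """trials: list of (t1, t2, performance), with performance > 0 and
    decreasing with practice (e.g., an error rate).
    For each candidate B, taking logs makes the model linear in B*t1 + t2,
    so A and b come from a linear fit; the B with the lowest error wins."""
    log_perf = [math.log(p) for _, _, p in trials]
    best = None
    for weight in candidate_weights:
        xs = [weight * t1 + t2 for t1, t2, _ in trials]
        intercept, slope, sse = linear_fit(xs, log_perf)
        if best is None or sse < best[0]:
            best = (sse, math.exp(intercept), -slope, weight)
    _, a_hat, b_hat, weight_hat = best
    return a_hat, b_hat, weight_hat

# Synthetic, noise-free data generated from known parameters (illustration only).
TRUE_A, TRUE_RATE, TRUE_WEIGHT = 0.8, 0.1, 2.0
trials = [
    (t1, t2, TRUE_A * math.exp(-TRUE_RATE * (TRUE_WEIGHT * t1 + t2)))
    for t1 in range(4)
    for t2 in range(4)
]
candidates = [k / 10 for k in range(1, 41)]  # B from 0.1 to 4.0

a_hat, b_hat, weight_hat = fit_learning_decomposition(trials, candidates)
print(f"A={a_hat:.2f}, b={b_hat:.2f}, B={weight_hat:.1f}")
```

On this noise-free data the search should recover the generating weight. A fitted B above 1 would be read as type 1 support being worth more per exposure than type 2, and a value below 1 as the reverse.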
A fourth key area of application of EDM methods has been in looking for empirical
evidence to refine and extend educational theories and well-known educational
phenomena, towards gaining deeper understanding of the key factors impacting learning,
often with a view to designing better learning systems. For instance, Gong, Rai and
Heffernan [2009] investigated the impact of self-discipline on learning and found that,
whilst it correlated with higher incoming knowledge and fewer mistakes, the actual impact
on learning was marginal. Perera et al. [2009] used the Big 5 theory for teamwork as a
driving theory to search for successful patterns of interaction within student teams.
Madhyastha and Tanimoto [2009] investigated the relationship between consistency and
student performance with the aim to provide guidelines for scaffolding instruction, basing
their work on prior theory on the implications of consistency in student behavior
[Abelson 1968].
5. IMPORTANT TRENDS IN EDUCATIONAL DATA MINING RESEARCH
In this section, we consider how educational data mining has developed in recent years,
and investigate what some of the major trends are in EDM research. In order to
investigate what the trends are, we analyze what researchers were studying previously,
and what they are studying now, towards understanding what is new and what attributes
EDM research has had for some time.
5.1. Prominent Papers From Early Years
One way to see where EDM has been is to look at which articles were the most
influential in its early years. We have an excellent resource, in Romero and Ventura’s
[2007] survey. This survey gives us a comprehensive list of papers, published between
1995 and 2005, which are seen as educational data mining by a prominent pair of
authorities in EDM (beyond authoring several key papers in EDM, Romero and Ventura
were conference chairs of EDM2009). To determine which articles were most influential,
we use how many citations each paper received, a bibliometric or scientometric measure
often used to indicate influence of papers, researchers, or institutions. As Bartneck and
Hu [2009] have noted, Google Scholar, despite imperfections in its counting scheme, is
the most comprehensive source for citations – particularly for the conferences which are
essential for understanding Computer Science research.
The top 8 most cited applied papers in Romero and Ventura’s survey (as of
September 9, 2009) are listed in Table 1. These articles have been highly influential, both
on educational data mining researchers, and on related fields; as such, they exemplify
many of the key trends in our research community.
The most cited article, [Zaïane 2001], suggests an application for data mining, using it
to study on-line courses. This article proposes and evangelizes EDM’s usefulness, and in
this fashion was highly influential to the formation of our community.
The second and fourth most cited articles, [Zaïane 2002] and [Tang and McCalla
2005] center around how educational data mining methods (specifically association rules,
and clustering to support collaborative filtering) can support the development of more
sensitive and effective e-learning systems. As in his other paper in this list, Zaïane makes
a detailed and influential proposal as to how educational data mining methods can make
an impact on e-learning systems. Tang and McCalla report an instantiation of such a
system, which integrates clustering and collaborative filtering to recommend content to
students. The authors present a study conducted with simulated students; successful
evaluation of the system with real students is presented in [Tang and McCalla 2004].
The third most-cited article, [Baker, Corbett and Koedinger 2004] gives a case study
on how educational data mining methods (specifically prediction methods) can be used to
open new research areas, in this case the scientific study of gaming the system
(attempting to succeed in an interactive learning environment by exploiting properties of
the system rather than by learning the material). Though this topic had seen some prior
interest (including [Aleven and Koedinger 2001; Schofield 1995; Tait et al. 1973]),
publication and research into this topic exploded after it became clear that educational
data mining now opened this topic to concrete, quantitative, and fine-grained analysis.
The fifth and sixth most cited articles, [Merceron and Yacef 2003] and [Romero et al.
2003], present tools that can be used to support educational data mining. This theme is
carried forward in these groups’ later work [Merceron and Yacef 2005; Romero, Ventura,
Espejo and Hervas 2008], and in EDM tools developed by other researchers [Donmez et
al. 2005].
The seventh most cited article [Beck and Woolf 2000] shows how educational data
mining prediction methods can be used to develop student models. They use a variety of
variables to predict whether a student will make a correct answer. This work has inspired
a great deal of later educational data mining work – student modeling is a key theme in
modern educational data mining, and the paradigm of testing EDM models’ ability to
predict future correctness – advocated strongly by Beck & Woolf – has become very
common (e.g., [Beck 2007; Mavrikis 2008]).
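The paradigm of testing a student model by its ability to predict future correctness can be illustrated with a small sketch. This is not Beck & Woolf's model, which used machine learning over a variety of variables; it is a deliberately simple smoothed-frequency baseline, and the log format, the function names, and the smoothing parameters are all assumptions made for illustration. The model is fit on earlier attempts and scored on later ones.

```python
def fit_skill_rates(train, prior=0.5, weight=2.0):
    """Estimate P(correct) per (student, skill) from earlier attempts,
    smoothed toward a prior so that sparse histories are not over-trusted."""
    counts = {}
    for student, skill, correct in train:
        right, total = counts.get((student, skill), (0, 0))
        counts[(student, skill)] = (right + correct, total + 1)
    return {
        key: (right + prior * weight) / (total + weight)
        for key, (right, total) in counts.items()
    }


def predict(rates, student, skill, prior=0.5):
    """Predicted probability of a correct answer; falls back to the prior."""
    return rates.get((student, skill), prior)


def accuracy(rates, test, threshold=0.5):
    """Fraction of held-out attempts whose correctness is predicted correctly."""
    hits = sum(
        (predict(rates, student, skill) >= threshold) == bool(correct)
        for student, skill, correct in test
    )
    return hits / len(test)


# Hypothetical log of (student, skill, correct), in chronological order.
log = [
    ("s1", "fractions", 1), ("s1", "fractions", 1), ("s1", "fractions", 1),
    ("s2", "fractions", 0), ("s2", "fractions", 0), ("s2", "fractions", 1),
    ("s1", "fractions", 1), ("s2", "fractions", 0),
]
train, test = log[:6], log[6:]  # fit on the past, evaluate on the future

rates = fit_skill_rates(train)
print(rates)
print(accuracy(rates, test))
```

Splitting the log by time rather than at random is what makes this a test of predicting future correctness: the model never sees an attempt that occurred after the ones it is asked to predict.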
Table 1. The top 8 most cited papers in Romero & Ventura’s 1995-2005 survey.
Citations are from Google Scholar, retrieved 9 September, 2009.
Article | Citations
Zaïane, O. (2001). Web usage mining for a better web-based learning environment. Proceedings of Conference on Advanced Technology for Education, 60–64. | 110
Zaïane, O. (2002). Building a recommender agent for e-learning systems. Proceedings of the International Conference on Computers in Education, 55–59. | 89
Baker, R.S., Corbett, A.T., Koedinger, K.R. (2004) Detecting Student Misuse of Intelligent Tutoring Systems. Proceedings of the 7th International Conference on Intelligent Tutoring Systems, 531-540. | 83
Tang, T., McCalla, G. (2005) Smart recommendation for an evolving e-learning system: architecture and experiment, International Journal on E-Learning, 4 (1), 105–129. | 63
Merceron, A., Yacef, K. (2003). A web-based tutoring tool with mining facilities to improve learning and teaching. Proceedings of the 11th International Conference on Artificial Intelligence in Education, 201–208. | 54
Romero, C., Ventura, S., de Bra, P., & Castro, C. (2003). Discovering prediction rules in aha! courses. Proceedings of the International Conference on User Modeling, 25–34. | 46
Beck, J., & Woolf, B. (2000). High-level student modeling with machine learning. Proceedings of the 5th International Conference on Intelligent Tutoring Systems, 584–593. | 43
Dringus, L.P., Ellis, T. (2005) Using data mining as a strategy for assessing asynchronous discussion forums, Computer and Education Journal, 45, 141–160. | 37
5.2. Shift In Paper Topics Over The Years
As discussed earlier in this paper (see Figure 1), relationship mining methods of various
types were the most prominent type of EDM research between 1995 and 2005. 43% of
papers in those years involved relationship mining methods. Prediction was the second
most prominent research area, with 28% of papers in those years involving prediction
methods of various types. Human judgment/exploratory data analysis and clustering
followed with (respectively) 17% and 15% of papers.
A very different pattern is seen in the papers from the first two years of the
Educational Data Mining conference [Baker et al. 2008; Barnes et al. 2009], as shown in
Figure 2. Whereas relationship mining was dominant between 1995 and 2005, in 2008-2009
it slipped to fifth place, with only 9% of papers involving relationship mining.
Prediction, which was in second place between 1995 and 2005, moved to the dominant
position in 2008-2009, representing 42% of EDM2008 papers. Human
judgment/exploratory data analysis and clustering remain in approximately the same
position in 2008-2009 as 1995-2005, with (respectively) 12% and 15% of papers.
A new method, significantly more prominent in 2008-2009 than in earlier years, is
discovery with models. Whereas no papers in Romero & Ventura’s survey involved
discovery with models, by 2008-2009 it has become the second most common category
of EDM research, representing 19% of papers.
Another key trend is the increase in prominence of modeling frameworks from Item
Response Theory, Bayes Nets, and Markov Decision Processes. These methods were rare
at the very beginning of educational data mining, began to become more prominent
around 2005 (appearing, for instance, in [Barnes 2005] and [Desmarais and Pu 2005]),
and were found in 28% of the papers in EDM2008 and EDM2009. The increase in the
commonality of these methods is likely a reflection of the integration of researchers from
the psychometrics and student modeling communities into the EDM community.
Figure 2. The proportion of papers involving each type of EDM method, in the proceedings of Educational
Data Mining 2008 and 2009 [Baker, Barnes and Beck 2008; Barnes, Desmarais, Romero and Ventura 2009].
Note that papers can use multiple methods, and thus some papers can be found in multiple categories.
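Because a paper can be counted under several methods, the category percentages need not sum to 100% (the four 1995-2005 figures quoted earlier in this section, 43%, 28%, 17% and 15%, sum to 103%). The sketch below shows how such per-category shares can be computed; the paper identifiers and their method labels are hypothetical and are not taken from the EDM proceedings.

```python
from collections import Counter


def method_shares(papers):
    """Return, for each method category, the percentage of papers involving it.

    `papers` maps a paper identifier to the set of methods it uses. A paper
    using several methods is counted once in each of its categories.
    """
    counts = Counter(m for methods in papers.values() for m in set(methods))
    return {method: 100.0 * n / len(papers) for method, n in counts.items()}


# Hypothetical coding of four papers, invented for illustration only.
papers = {
    "paper_1": {"prediction"},
    "paper_2": {"prediction", "clustering"},
    "paper_3": {"relationship mining"},
    "paper_4": {"discovery with models", "prediction"},
}

shares = method_shares(papers)
print(shares)
print(sum(shares.values()))  # exceeds 100 because categories overlap
```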
It is worth noting that educational data mining publications in 2008 and 2009 are not
limited solely to those appearing in the proceedings of the conference (though our
analysis in this paper was restricted to those publications). One of the notable metrics of
our community’s growth is that the proceedings of EDM2008 and EDM2009 alone
accounted for approximately as many papers as were published in the first 10 years of the
community’s existence (according to Romero & Ventura’s review). Hence, EDM appears
to be growing rapidly, and the next major review of the field is likely to be a
time-consuming process. However, we encourage future researchers to conduct such a survey.
In general, it will be very interesting to see how the methodological trends exposed in
Figures 1 and 2 develop in the next few years.
5.3. Emergence of public data and public data collection tools
One interesting difference between the work in EDM2008 and EDM2009, and earlier
educational data mining work, is where the educational data comes from. Between 1995
and 2005, data almost universally came from the research group conducting the analysis
– that is to say, in order to do educational data mining research, a researcher first needed
to collect their own educational data.
This necessity appears to be disappearing in 2008, due to two developments. First, the
Pittsburgh Science of Learning Center has opened a public data repository, the PSLC
DataShop [Koedinger et al. 2008], which makes substantial quantities of data from a
variety of online learning environments available, for free, to any researcher worldwide.
14% of the papers published in EDM2008 and EDM2009 utilized data publicly available
from the PSLC DataShop.
Second, researchers are increasingly instrumenting existing online course
environments used by large numbers of students worldwide, such as Moodle and
WebCAT. 12% of the papers in EDM2008 and EDM2009 utilized data coming from the
instrumentation of existing online courses.
Hence, around a quarter of the papers published at EDM2008 and EDM2009 involved
data from these two readily available sources. If this trend continues, there will be
significant benefits for the educational data mining community. Among them, it will
become significantly easier to externally validate an analysis. If a researcher does an
analysis that produces results that seem artifactual or “too good to be true”, another
researcher can download the data and check for themselves. A second benefit is that
researchers will be more able to build on others’ past efforts. As reasonably predictive
models of domain structure and student moment-to-moment knowledge become available
for public data sets, other researchers will be able to test new models of these phenomena
in comparison to a strong baseline, or to develop new models of higher grain-size
constructs that leverage these existing models. The result is a science of education that is
more concrete, validated, and progressive than was previously possible.
6. CONCLUSIONS
The publication of this first issue of the Journal of Educational Data Mining finds the
field growing rapidly, but also in a period of transition. The advent of the EDM
conference series has led to a significant increase in the volume of research published. In
addition, public educational databases and tools for instrumenting online courses increase
the accessibility of educational data to a wider pool of individuals, lowering the barriers
to becoming an educational data mining researcher. Hence further growth can be
expected.
It is possible that these trends will make educational data mining an increasingly
international community as well. Between the papers in Romero & Ventura and the
EDM2008 and EDM2009 proceedings, it can be seen that the EDM community remains
focused in North America, Western Europe, and Australia/New Zealand, with relatively
lower participation from other regions. However, the increasing accessibility of relevant
and usable educational data has the potential to “lower the barriers” to entry for
researchers in the rest of the world.
Recent years have also seen major changes in the types of EDM methods that are
used, with prediction and discovery with models increasing while relationship mining
becomes rarer. It will be interesting to see how these trends shift in the years to come,
and what new types of research will emerge from the increase in discovery with models,
a method prominent in cognitive modeling and bioinformatics, but thus far rare in
education research.
At this point, educational data mining methods have had some level of impact on
education and related interdisciplinary fields (such as artificial intelligence in education,
intelligent tutoring systems, and user modeling). However, so far only a handful of
articles have achieved more than 50 citations (as shown in Table 1), indicating that there
is still considerable scope for an increase in educational data mining’s scientific
influence. It is hoped that this journal will play a role in raising the profile of the
educational data mining field and bringing to educational research the mathematical and
scientific rigor that similar methods have previously brought to cognitive psychology and
biology.
ACKNOWLEDGEMENTS
We thank Cristobal Romero and Sebastian Ventura for their excellent review in 2005 of
the state of Educational Data Mining, which influenced our article – and the field –
considerably. We acknowledge support from the Pittsburgh Science of Learning Center, which is
funded by the National Science Foundation, award number SBE-0354420.
REFERENCES
ABELSON, R. 1968. Theories of Cognitive Consistency: A Sourcebook. Rand McNally,
Chicago.
ALEVEN, V. and KOEDINGER, K.R. 2001. Investigations into help seeking and
learning with a Cognitive Tutor. In Proceedings of the AIED-2001 Workshop on Help
Provision and Help Seeking in Interactive Learning Environments, 47-58. R. LUCKIN
Ed.
BAKER, R.S., CORBETT, A.T. and KOEDINGER, K.R. 2004. Detecting Student
Misuse of Intelligent Tutoring Systems. In Proceedings of the 7th International
Conference on Intelligent Tutoring Systems, Maceio, Brazil, 531-540.
BAKER, R.S.J.D. 2007. Modeling and Understanding Students’ Off-Task Behavior in
Intelligent Tutoring Systems. In Proceedings of the ACM CHI 2007: Computer-Human
Interaction conference, 1059-1068.
BAKER, R.S.J.D. in press. Data Mining For Education. In International Encyclopedia of
Education (3rd edition), B. MCGAW, P. PETERSON and E. BAKER Eds. Elsevier, Oxford,
UK.
BAKER, R.S.J.D., BARNES, T. and BECK, J.E. 2008. 1st International Conference on
Educational Data Mining, Montreal, Quebec, Canada.
BARNES, T. 2005. The q-matrix method: Mining student response data for knowledge.
In Proceedings of the AAAI-2005 Workshop on Educational Data Mining.
BARNES, T., DESMARAIS, M., ROMERO, C. and VENTURA, S. 2009. Educational
Data Mining 2009: 2nd International Conference on Educational Data Mining,
Proceedings, Cordoba, Spain.
BARTNECK, C. and HU, J. 2009. Scientometric Analysis of the CHI Proceedings. In
Proceedings of the Conference on Human Factors in Computing Systems (CHI2009),
699-708.
BECK, J. and WOOLF, B. 2000. High-level student modeling with machine learning. In
Proceedings of the International Conference on Intelligent tutoring systems, 584-593.
BECK, J.E. 2007. Difficulties in inferring student knowledge from observations (and
why you should care). Proceedings of the AIED2007 Workshop on Educational Data
Mining, 21-30.
BECK, J.E. and MOSTOW, J. 2008. How who should practice: Using learning
decomposition to evaluate the efficacy of different types of practice for different types of
students. In Proceedings of the 9th International Conference on Intelligent Tutoring
Systems, 353-362.
CHOQUET, C., LUENGO, V. and YACEF, K. 2005. Proceedings of "Usage Analysis in
Learning Systems" workshop, held in conjunction with AIED 2005, Amsterdam, The
Netherlands, July 2005.
COCEA, M., HERSHKOVITZ, A. and BAKER, R.S.J.D. 2009. The Impact of Off-task
and Gaming Behaviors on Learning: Immediate or Aggregate? In Proceedings of the 14th
International Conference on Artificial Intelligence in Education, 507-514.
CORBETT, A.T. 2001. Cognitive Computer Tutors: Solving the Two-Sigma Problem. In
Proceedings of the International Conference on User Modeling, 137-147.
D'MELLO, S.K., CRAIG, S.D., WITHERSPOON, A.W., MCDANIEL, B.T. and
GRAESSER, A.C. 2008. Automatic Detection of Learner’s Affect from Conversational
Cues. User Modeling and User-Adapted Interaction 18, 45-80.
DEKKER, G., PECHENIZKIY, M. and VLEESHOUWERS, J. 2009. Predicting Students
Drop Out: A Case Study. In Proceedings of the International Conference on Educational
Data Mining, Cordoba, Spain, T. BARNES, M. DESMARAIS, C. ROMERO and S.
VENTURA Eds., 41-50.
DESMARAIS, M.C. and PU, X. 2005. A Bayesian Student Model without Hidden Nodes
and Its Comparison with Item Response Theory. International Journal of Artificial
Intelligence in Education 15, 291-323.
DONMEZ, P., ROSÉ, C., STEGMANN, K., WEINBERGER, A. and FISCHER, F. 2005.
Supporting CSCL with automatic corpus analysis technology. In Proceedings of the
International Conference of Computer Support for Collaborative Learning (CSCL 2005),
125-134.
GONG, Y., RAI, D., BECK, J. and HEFFERNAN, N. 2009. Does Self-Discipline Impact
Students’ Knowledge and Learning? In Proceedings of the 2nd International Conference
on Educational Data Mining, 61-70.
JEONG, H. and BISWAS, G. 2008. Mining Student Behavior Models in
Learning-by-Teaching Environments. In Proceedings of the 1st International Conference on
Educational Data Mining, 127-136.
KAY, J., MAISONNEUVE, N., YACEF, K. and REIMANN, P. 2006. The Big Five and
Visualisations of Team Work Activity. In Intelligent Tutoring Systems, M. IKEDA, K.D.
ASHLEY and T.-W. CHAN Eds. Springer-Verlag, Taiwan, 197-206.
KOEDINGER, K.R., CUNNINGHAM, K., SKOGSHOLM, A. and LEBER, B. 2008. An open
repository and analysis tools for fine-grained, longitudinal learner data. In Proceedings of
the 1st International Conference on Educational Data Mining, 157-166.
MADHYASTHA, T. and TANIMOTO, S. 2009. Student Consistency and Implications
for Feedback in Online Assessment Systems. In Proceedings of the 2nd International
Conference on Educational Data Mining, 81-90.
MAVRIKIS, M. 2008. Data-driven modeling of students’ interactions in an ILE. In
Proceedings of the 1st International Conference on Educational Data Mining, 87-96.
MCQUIGGAN, S., MOTT, B. and LESTER, J. 2008. Modeling Self-Efficacy in
Intelligent Tutoring Systems: An Inductive Approach. User Modeling and User-Adapted
Interaction 18, 81-123.
MERCERON, A. and YACEF, K. 2003. A Web-based Tutoring Tool with Mining
Facilities to Improve Learning and Teaching. In 11th International Conference on
Artificial Intelligence in Education, F. VERDEJO and U. HOPPE Eds. IOS Press,
Sydney, 201-208.
MERCERON, A. and YACEF, K. 2005. Educational Data Mining: a Case Study. In
Artificial Intelligence in Education (AIED2005), C.-K. LOOI, G. MCCALLA, B.
BREDEWEG and J. BREUKER Eds. IOS Press, Amsterdam, The Netherlands, 467-474.
MOORE, A.W. 2006. Statistical Data Mining Tutorials. Downloaded 1 August 2009
from http://www.autonlab.org/tutorials/
PAVLIK, P., CEN, H. and KOEDINGER, K.R. 2009. Learning Factors Transfer
Analysis: Using Learning Curve Analysis to Automatically Generate Domain Models. In
Proceedings of the 2nd International Conference on Educational Data Mining, 121-130.
PAVLIK, P., CEN, H., WU, L. and KOEDINGER, K. 2008. Using Item-type
Performance Covariance to Improve the Skill Model of an Existing Tutor. In Proceedings
of the 1st International Conference on Educational Data Mining, 77-86.
PECHENIZKIY, M., CALDERS, T., VASILYEVA, E. and DE BRA, P. 2008. Mining
the Student Assessment Data: Lessons Drawn from a Small Scale Case Study. In
Proceedings of the 1st International Conference on Educational Data Mining, 187-191.
PERERA, D., KAY, J., KOPRINSKA, I., YACEF, K. and ZAIANE, O. 2009. Clustering
and sequential pattern mining to support team learning. IEEE Transactions on Knowledge
and Data Engineering 21, 759-772.
ROMERO, C. and VENTURA, S. 2007. Educational Data Mining: A Survey from 1995
to 2005. Expert Systems with Applications 33, 125-146.
ROMERO, C., VENTURA, S., DE BRA, P. and CASTRO, C. 2003. Discovering
prediction rules in aha! courses. In Proceedings of the International Conference on User
Modeling, 25–34.
ROMERO, C., VENTURA, S., ESPEJO, P.G. and HERVAS, C. 2008. Data Mining
Algorithms to Classify Students. In Proceedings of the 1st International Conference on
Educational Data Mining, 8-17.
SCHOFIELD, J. 1995. Computers and Classroom Culture. Cambridge University Press,
Cambridge, UK.
SUPERBY, J.F., VANDAMME, J.-P. and MESKENS, N. 2006. Determination of factors
influencing the achievement of the first-year university students using data mining
methods. In Proceedings of the Workshop on Educational Data Mining at the 8th
International Conference on Intelligent Tutoring Systems (ITS 2006), 37-44.
TAIT, K., HARTLEY, J.R. and ANDERSON, R.C. 1973. Feedback Procedures in
Computer-Assisted Arithmetic Instruction. British Journal of Educational Psychology 43,
161-171.
TANG, T. and MCCALLA, G. 2004. Utilizing Artificial Learners to Help Overcome the
Cold-Start Problem in a Pedagogically-Oriented Paper Recommendation System. In
Proceedings of the International Conference on Adaptive Hypermedia, 245-254.
TANG, T. and MCCALLA, G. 2005. Smart recommendation for an evolving e-learning
system: architecture and experiment. International Journal on E-Learning 4, 105-129.
TANIMOTO, S.L. 2007. Improving the Prospects for Educational Data Mining. In
Proceedings of the Complete On-Line Proceedings of the Workshop on Data Mining for
User Modeling, at the 11th International Conference on User Modeling (UM 2007),
106-110.
WITTEN, I.H. and FRANK, E. 1999. Data mining: Practical Machine Learning Tools
and Techniques with Java Implementations. Morgan Kaufmann, San Francisco, CA.
ZAÏANE, O. 2001. Web usage mining for a better web-based learning environment. In
Proceedings of Conference on Advanced Technology for Education, 60-64.
ZAÏANE, O. 2002. Building a recommender agent for e-learning systems. In
Proceedings of the International Conference on Computers in Education, 55–59.
16 Journal of Educational Data Mining, Article 1, Vol 1, No 1, Fall 2009