This document summarizes a literature review on prediction and personalization in Massive Open Online Courses (MOOCs).
It finds that MOOC data are commonly used to predict outcomes such as certificate earning, dropout, scores, and forum post classification. Prediction features include demographics, video interactions, and platform usage. Common techniques are regression, decision trees, random forests, and neural networks; models are evaluated with metrics such as accuracy, AUC, F-score, and recall/precision.
The review also identifies needs for personalization in MOOCs, such as accommodating learner diversity, offering personalized paths and assessment, and improving community continuity after courses end. The seminar topic could be extended to a project applying predictive models to student performance data.
3. Introduction
◦ Massive Open Online Courses (MOOCs) are among the most important platforms for education in the 21st century.
It is therefore important to personalize the environment to learners' needs, and to use the features available in MOOCs to predict different outcomes.
◦ The main objectives of the literature review on prediction in MOOCs are:
To identify the characteristics of the MOOCs used for prediction
To describe the prediction outcomes
To classify the prediction features
To determine the techniques used to predict the variables
To identify the metrics used to evaluate the predictive models
4. ◦ Why is personalization of MOOCs needed?
MOOCs have several drawbacks that need to be addressed, and these motivate personalizing them.
Drawbacks:
◦ Lack of cooperative activities among learners
◦ High dropout rate (low completion rate in MOOCs)
◦ MOOCs provide limited opportunities for interaction between instructors and learners
6. ◦ Focus questions/ Research questions:
◦ RQ1: What are the most common characteristics/features of the MOOCs that have been used for prediction?
◦ RQ2: What outcomes have been predicted in contributions about MOOCs?
◦ RQ3: What are the prediction features that are used to build prediction models in MOOCs?
◦ RQ4: What are the techniques/models used for prediction in MOOCs?
◦ RQ5: What metrics have been used to evaluate prediction results in MOOCs?
8. Methodology
Use of PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines
The SPIDER tool is used to identify suitable keywords to search in digital databases
Search engines used: Scopus and IEEE Xplore
Years: 2018-2021
9. ◦ KEYWORDS:
◦ For papers related to prediction in MOOCs:
◦ (“predict” OR “prediction” OR “predictive” OR “forecasting”) AND (“MOOC” OR “MOOCs” OR
“Massive Open Online Course” OR “Massive Open Online Courses”)
◦ For papers related to personalization in MOOCs:
◦ (“Personalize” OR “personalization” OR “personalized” OR “adaptive”) AND (“MOOC” OR “MOOCs” OR
“Massive Open Online Course” OR “Massive Open Online Courses”)
11. ◦ FILTERING CRITERIA:
Title
Index keywords
Author keywords
Year (2018-2021)
Excluded learning-style-related papers
Excluded non-English papers
Removed duplicates
12. Result and discussion for Prediction in MOOCs
(1) RQ1: What are the most common characteristics/features of the MOOCs that have been used for prediction?
The platforms on which these MOOCs are deployed (Coursera, edX, FutureLearn)
The subject areas of these MOOCs (Engineering, Mathematics)
The number of enrolled users in these MOOCs (between 100 and 10,432)
The duration of these MOOCs (1 to 8 weeks)
13. (2) RQ2: What outcomes have been predicted in contributions about MOOCs?
◦ Certificate earner
◦ Dropout
◦ Scores prediction
◦ Forum posts classification
◦ Relevance of content
◦ Student behaviour
◦ Course Recommendation in MOOCs
15. (3) RQ3: What are the prediction features that are used to build prediction models in MOOCs?
◦ Demographic variables (age, level of schooling, country of origin, primary language, employment status, etc.)
◦ Video-related variables (play, pause, skip back, skip forward, change to faster rate, change to slower rate, and
change to default rate)
◦ Platform use variables (time elapsed after first activity, number of accesses, number of periods when the user
has been on the course, number of weeks the student spends on the course, last access time, registration
time, total click count, course access interval)
◦ Temporal assignment-related behaviors (e.g., submission order sequence, completion time sequence)
◦ Learner's number of accesses and time spent per access
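The platform-use variables above can be derived from raw activity logs. A minimal sketch in Python (the event format and exact feature names are illustrative assumptions, not taken from the reviewed papers):

```python
from collections import Counter
from datetime import datetime

def platform_features(events):
    """Aggregate raw (timestamp, action) events into platform-use features
    mirroring the variables above: total click count, number of active days,
    and course access interval."""
    timestamps = sorted(ts for ts, _ in events)
    actions = Counter(action for _, action in events)
    first, last = timestamps[0], timestamps[-1]
    return {
        "total_click_count": len(events),
        "n_video_events": actions["play"] + actions["pause"],
        "active_days": len({ts.date() for ts in timestamps}),
        "access_interval_days": (last - first).days,
    }

# Hypothetical log for one learner.
log = [
    (datetime(2021, 3, 1, 10, 0), "play"),
    (datetime(2021, 3, 1, 10, 5), "pause"),
    (datetime(2021, 3, 3, 9, 0), "play"),
    (datetime(2021, 3, 8, 20, 0), "quiz_submit"),
]
print(platform_features(log))
```

In practice, one such feature vector would be built per learner per week and fed to the models discussed under RQ4.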
16. (4) RQ4: What are the techniques/models used for prediction in MOOCs?
◦ Gradient Boosting Machine (GBM)
◦ Neural Networks
◦ Decision Trees (DTs)
◦ Random Forest (RF)
◦ Regression
◦ Support Vector Machines (SVMs)
◦ Latent Dynamic Factor Graph (LadFG)
◦ Long Short-Term Memory (LSTM) neural network
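As a concrete illustration of the regression family above, a dropout classifier can be sketched as logistic regression trained by stochastic gradient descent. The toy data and feature choices are invented for illustration, not taken from any reviewed paper:

```python
import math

def train_logreg(X, y, lr=0.1, epochs=500):
    """Plain SGD logistic regression (no libraries).
    X: list of feature vectors, y: list of 0/1 dropout labels."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted dropout probability
            err = p - yi                     # gradient of log loss w.r.t. z
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

# Toy features: [active_days, total_clicks / 100]; label 1 = dropped out.
X = [[1, 0.2], [2, 0.1], [1, 0.3], [10, 2.5], [12, 3.0], [9, 2.0]]
y = [1, 1, 1, 0, 0, 0]
w, b = train_logreg(X, y)
print(predict(w, b, [1, 0.2]), predict(w, b, [11, 2.8]))
```

The same feature vectors could instead be fed to random forests or an LSTM over weekly activity sequences, as in the deep-learning dropout papers cited below.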
18. (5) RQ5: What metrics have been used to evaluate prediction results in MOOCs?
◦ Kappa
◦ RMSE
◦ Recall and Precision
◦ F-score
◦ Accuracy
◦ AUC
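For reference, the classification metrics in this list can be computed from scratch for the binary case. The labels and scores below are invented for illustration:

```python
def evaluate(y_true, y_pred, y_score):
    """Accuracy, precision, recall, F-score, and AUC for binary labels
    (1 = positive class, e.g. 'dropout')."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    # AUC = probability a random positive scores above a random negative.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score, "auc": auc}

y_true  = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
y_pred  = [1 if s >= 0.5 else 0 for s in y_score]
print(evaluate(y_true, y_pred, y_score))
```

AUC is popular in the dropout literature because it is insensitive to the classification threshold, which matters when dropout classes are imbalanced.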
19. Distribution of Metrics
[Figure: bar chart showing how often each evaluation metric (Accuracy, AUC, F-score, Recall and Precision) appears in the reviewed papers]
20. Result and discussion for Personalization in MOOCs
◦ The current MOOC design does not consider the diversity of its learners (the same content is provided to all
learner types, only English is used, and learners' disabilities are not accommodated)
◦ Learner-determined or learner-contributed content is very limited, as are learner-selected learning pathways
(i.e., different routes to learn the same content)
◦ Personalized learning paths, personalized assessment and feedback, personalized forum threads, and
recommendation services for related learning materials are needed
◦ Poor continuity of learning communities after a MOOC ends
◦ Lack of cooperative activities among learners
21. Extension of seminar to project
◦ Big Data Analytics: building on this literature review on prediction and personalization in MOOC environments,
NPTEL at IITM has collected profile and performance data of 24K students in a data science and programming
course.
◦ Develop regression models to predict the students' performance in the qualifier exam and analyze the impact of
the results.
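A minimal version of the proposed regression model can be sketched with ordinary least squares on a single predictor. All numbers below are hypothetical stand-ins, not the actual NPTEL data:

```python
def fit_linear(xs, ys):
    """Ordinary least squares for one predictor: y ~ a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

# Hypothetical data: weekly assignment average -> qualifier exam score.
assignment_avg = [40, 55, 60, 70, 85, 90]
exam_score     = [35, 50, 58, 66, 80, 88]
a, b = fit_linear(assignment_avg, exam_score)
print(f"predicted qualifier score at assignment avg 75: {a + b * 75:.1f}")
```

The real project would use many predictors from the student profile and activity data, so a multivariate model (or the regularized and tree-based techniques from RQ4) would replace this single-feature sketch.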
22. References
◦ Manuel Moreno-Marcos, Carlos Alario-Hoyos, Pedro J. Muñoz-Merino, Carlos Delgado Kloos. “Prediction in
MOOCs: A Review and Future Research Directions”, IEEE Transactions on Learning Technologies, pp. 384-401.
DOI: 10.1109/TLT.2018.2856808
◦ Ayse Saliha Sunar, Nor Aniza Abdullah, Su White, Hugh Davis. “Personalisation in MOOCs: A Critical
Literature Review”, International Conference on Computer Supported Education, February 2016.
DOI: 10.1007/978-3-319-29585-5_9
◦ Di Sun; Yueheng Mao; Junlei Du; Pengfei Xu; Qinhua Zheng; Hongtao Sun. “Deep Learning for Dropout
Prediction in MOOCs”, 2019 Eighth International Conference on Educational Innovation through Technology
(EITT), DOI: 10.1109/EITT.2019.00025
◦ Raghad Alshabandar; Abir Hussain; Robert Keight; Wasiq Khan, “Students Performance Prediction in Online
Courses Using Machine Learning Algorithms”, 2020 International Joint Conference on Neural Networks
(IJCNN), DOI: 10.1109/IJCNN48605.2020.9207196