1. Analysis of social interactions and prediction
of assignment grades in a Massive Open
Online Course
Pedro Manuel Moreno Marcos
Universidad Carlos III de Madrid
eMadrid Seminar on ‘OERs & Smart Education’
UNED, Madrid, 24th November 2017
2. INDEX
1. INTRODUCTION
2. RELATED WORK
3. FORUM DASHBOARD
4. JAVA PROGRAMMING MOOC: CASE STUDY
5. ASSIGNMENT PREDICTION: METHODOLOGY
6. ASSIGNMENT PREDICTION: RESULTS
7. CONCLUSIONS AND FUTURE WORK
3. INTRODUCTION: CONTEXT
Greller, W., & Drachsler, H. (2012). Translating learning into
numbers: A generic framework for learning analytics. Journal of
Educational Technology & Society, 15(3), 42-57
Prediction Visualizations
4. INTRODUCTION: MOTIVATION
• BENEFITS
– Teachers: Improve learning
processes. Support students.
– Learners: Self-reflection
• Use of dashboards to display
information
• Importance of timing considerations
5. INTRODUCTION: OBJECTIVES
• Design of a Web application with
different visualizations regarding forum
interactions
• Draw conclusions regarding learners’
behaviour in a real MOOC
• Analyze how assignment grades can be
anticipated and which factors affect the
predictive power
7. RELATED WORK: VISUALIZATIONS
• Objective: present visual results to stakeholders
• Examples: ANALYSE (Open edX) / edX Insights
• Lack of visualizations related to the forum activity
8. RELATED WORK: PREDICTION IN EDUCATION
• Two types: future prediction / detection
• Course completion
• Students’ behaviors: motivations, problems,
etc.
• Scores
– ASSISTments
– Peer-review activities
9. RELATED WORK: PREDICTION IN MOOCs
• Systematic review
• Search terms: predict(ion) AND MOOC(s)
• 35 analysed papers
[Figure: Distribution of predictor variables in MOOCs. x-axis: Number of articles (0 to 25); y-axis: Type of variables. Categories as extracted: Others, Platform use, Forum-related, Exercises-related, Video-related, Demographic. Bar values as extracted: 6, 18, 20, 18, 16, 7.]
[Figure: Distribution of prediction parameters in MOOCs. x-axis: Number of articles (0 to 12); y-axis: Prediction parameters. Categories as extracted: Others, Student engagement/personality, Value/interest of items, Forum posts classification, Scores prediction, Drop-out, Certificate earners. Bar values as extracted: 5, 3, 2, 3, 9, 11, 6.]
11. FORUM DASHBOARD: FIRST FUNCTIONALITIES
• Basic Statistics
– Number of messages,
votes, response
times, etc.
• Participation
– Number of learners,
top contributors, etc.
• Messages with the most
responses/votes
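The statistics listed on this slide can be sketched in a few lines. The following is only a minimal illustration, not the dashboard's actual code: the tuple layout `(author, votes, responses)`, the sample messages and the name `forum_summary` are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical forum messages: (author, votes, number_of_responses).
messages = [
    ("ana", 3, 2),
    ("ben", 0, 0),
    ("ana", 5, 1),
    ("eva", 1, 4),
]

def forum_summary(msgs, top_n=2):
    """Return basic statistics for a list of (author, votes, responses) tuples."""
    per_author = Counter(author for author, _, _ in msgs)
    return {
        "n_messages": len(msgs),
        "n_learners": len(per_author),
        "total_votes": sum(votes for _, votes, _ in msgs),
        "top_contributors": per_author.most_common(top_n),
        "most_voted": max(msgs, key=lambda m: m[1]),
        "most_responded": max(msgs, key=lambda m: m[2]),
    }

summary = forum_summary(messages)
print(summary)
```

Response times, also mentioned on the slide, would need message timestamps, which this sketch leaves out.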
12. FORUM DASHBOARD: COURSE ABILITIES
• Definition of abilities
– Plain or hierarchical
structure
– JavaScript (D3)
• Visualize which
abilities appear
most often
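One way to obtain the counts behind such a visualization is sketched below. The slide names JavaScript (D3) for the drawing itself; this Python sketch only covers the counting step. The `ABILITIES` hierarchy, the naive plural handling and the sample posts are assumptions for illustration.

```python
import re
from collections import Counter

# Hypothetical hierarchical ability definition: category -> terms.
ABILITIES = {
    "data structures": ["array", "string"],
    "control flow": ["loop", "conditional"],
    "types": ["casting"],
}

def count_abilities(messages, abilities=ABILITIES):
    """Count how many messages mention each ability term at least once."""
    term_counts = Counter()
    for text in messages:
        words = set(re.findall(r"[a-z]+", text.lower()))
        for terms in abilities.values():
            for term in terms:
                # Accept a naive plural form ("arrays" for "array").
                if term in words or term + "s" in words:
                    term_counts[term] += 1
    # Roll term counts up to their parent category.
    category_counts = {
        category: sum(term_counts[t] for t in terms)
        for category, terms in abilities.items()
    }
    return term_counts, category_counts

posts = [
    "How do I iterate over arrays with a loop?",
    "Casting an int to double fails",
    "My loop never ends",
]
terms, categories = count_abilities(posts)
print(terms, categories)
```

A plain (non-hierarchical) definition is the special case of a single category holding every term.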
13. FORUM DASHBOARD: SENTIMENT ANALYSIS (I)
• Determine if a
message is positive,
negative or neutral
• Algorithm:
– Based on dictionaries
– Use emoticons
– Consider negations
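The three ingredients named above (dictionaries, emoticons, negations) can be combined as in the following sketch. It is not the algorithm used in the dashboard: the word lists, the emoticon scores and the rule that a negation flips only the word directly after it are all assumptions chosen to keep the example small.

```python
import re

# Tiny illustrative lexicons; a real system would use much larger dictionaries.
POSITIVE = {"good", "great", "thanks", "clear", "helpful"}
NEGATIVE = {"bad", "confusing", "wrong", "difficult", "broken"}
EMOTICONS = {":)": 1, ":-)": 1, ":D": 1, ":(": -1, ":-(": -1}
NEGATIONS = {"not", "no", "never"}

def classify(message):
    """Label a message as 'positive', 'negative' or 'neutral'."""
    score = 0
    # Emoticons are matched on the raw text before tokenizing.
    for emoticon, value in EMOTICONS.items():
        score += value * message.count(emoticon)
    tokens = re.findall(r"[a-z']+", message.lower())
    for i, token in enumerate(tokens):
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        # A negation directly before a sentiment word flips its polarity.
        if polarity and i > 0 and tokens[i - 1] in NEGATIONS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for text in ["Thanks, very helpful :)", "This is not clear", "The deadline is on Monday"]:
    print(text, "->", classify(text))
```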
14. FORUM DASHBOARD: SENTIMENT ANALYSIS (II)
APPROACH
• Two main categories:
– Supervised (machine learning based)
• 8 types of indicators, including votes, length, responses, etc.
– Unsupervised (lexicon based)
METRICS
• Accuracy
• AUC (Area Under the Curve)
Method            AUC      Accuracy
Dictionaries      71/78    74/78
SentiWordNet      65/75    66/77
Logistic Reg.     68/84    70/81
SVM               70/77    72/72
Decision Trees    64/74    69/74
Random Forest     71/82    72/74
Naïve Bayes       66/85    57/79
Results expressed in %
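For readers unfamiliar with the two metrics, the sketch below computes them for a binary classifier. The labels and scores are invented for illustration, and the slide does not say how the pairs of values in each table cell were obtained, so the sketch makes no attempt to reproduce them. AUC is computed here through its rank interpretation: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Each (positive, negative) pair counts 1 if ranked correctly, 0.5 if tied.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = positive message) and classifier scores.
labels = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.2]
predictions = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(labels, predictions))  # 0.75
print(auc(labels, scores))            # 0.75
```

Accuracy depends on the chosen decision threshold (0.5 here), whereas AUC depends only on how the scores rank the examples.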
16. JAVA PROGRAMMING MOOC: CASE STUDY
• Introduction to
Programming with Java
– Part I: Starting to
Program in Java
• 5 weeks
• Instructor-led
• Typically 14 days for
each assignment
• Passing grade: 60%
• Evaluation:
– 5 graded tests (Ti)
– 2 programming
assignments (Pi)
17. JAVA PROGRAMMING MOOC: FORUM USE
• 13,302 messages
• Activity rises around critical dates
18. JAVA PROGRAMMING MOOC: MESSAGES
MOST RESPONSES
• Cover varied issues:
– Technical questions
– Course-related
questions
MOST VOTES
• Provide answers to
questions related to
course concepts
• Top three messages
belong to the first week
19. JAVA PROGRAMMING MOOC: SENTIMENTS
• 5,292 positive
• 2,934 negative
• 5,076 neutral
• 64.33% positive (excluding neutral messages)
• Higher positivity at the
beginning
• Decrease near the deadlines
of programming tasks
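The counts on this slide can be checked with simple arithmetic. The three classes add up to the 13,302 forum messages reported earlier, and the 64.33% figure matches the share of positive messages among non-neutral ones. The variable names below are only for the example.

```python
positive, negative, neutral = 5292, 2934, 5076

total = positive + negative + neutral
share_polar = 100 * positive / (positive + negative)  # neutral excluded
share_all = 100 * positive / total                    # neutral included

print(total)                  # 13302
print(round(share_polar, 2))  # 64.33
print(round(share_all, 2))
```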
20. JAVA PROGRAMMING MOOC: ABILITIES
• Analysis based on
42 abilities, e.g. method,
casting, calculator,
array
• Analysis based on
10 relevant terms, e.g.
array, loop,
certificate, deadline
22. ASSIGNMENT PREDICTION: DATA COLLECTION
SOURCE OF DATA
• Data provided by edX
• Database data:
– Course structure
– State of course
components per learner
– Forum interactions
• Instructor dashboard:
– Grade report
SAMPLE SELECTION
• 95,555 enrolled users
• Two filters:
– Consider only participants
in the forum
– Exclude unenrolled users
• Result: 4,358 learners
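The two sample-selection filters amount to a simple predicate over learner records. The sketch below assumes a hypothetical `Learner` record with `forum_posts` and `unenrolled` fields; the actual edX data schema is not described in the slides.

```python
from dataclasses import dataclass

@dataclass
class Learner:
    user_id: int
    forum_posts: int   # hypothetical field: messages written in the forum
    unenrolled: bool   # hypothetical field: True if the user left the course

def select_sample(learners):
    """Keep forum participants who are still enrolled."""
    return [l for l in learners if l.forum_posts > 0 and not l.unenrolled]

# Made-up records for illustration only.
enrolled = [
    Learner(1, forum_posts=4, unenrolled=False),
    Learner(2, forum_posts=0, unenrolled=False),  # never used the forum
    Learner(3, forum_posts=2, unenrolled=True),   # unenrolled
]
sample = select_sample(enrolled)
print([l.user_id for l in sample])  # [1]
```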
23. ASSIGNMENT PREDICTION: VARIABLES AND TECHNIQUES
TYPES OF VARIABLES
• Forum
• Exercises
• Video
• Previous grades
TECHNIQUES
• Regression (RG)
• Support Vector Machines (SVM)
• Decision Trees (DT)
• Random Forest (RF)
METRIC
• Root Mean Squared Error (RMSE)
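RMSE is the square root of the mean squared difference between predicted and actual grades, so lower values mean better predictions. A minimal sketch follows. The grades are invented, and scaling them to the range 0 to 1 is an assumption made for the example.

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Squared Error between actual and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical grades, assumed to be scaled to [0, 1].
actual = [0.8, 0.6, 1.0, 0.4]
predicted = [0.7, 0.9, 0.8, 0.4]
error = rmse(actual, predicted)
print(round(error, 3))  # 0.187
```

Because errors are squared before averaging, a few large misses raise RMSE more than many small ones.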
25. ASSIGNMENT PREDICTION: PREDICTIVE POWER IN COURSE ASSIGNMENTS
• Model A: Exercises and video variables
• Model B: Model A + previous grades
Results expressed in RMSE
Model    Method  T1    T2    T3    T4    T5    P3    P5    FG
Model A  Best    0.26  0.21  0.20  0.18  0.16  0.25  0.20  0.14
Model A  Worst   0.34  0.28  0.26  0.22  0.18  0.31  0.27  0.16
Model B  Best    0.26  0.20  0.18  0.15  0.13  0.24  0.19  -
Model B  Worst   0.34  0.26  0.23  0.20  0.17  0.32  0.26  -
26. ASSIGNMENT PREDICTION: EFFECT OF FORUM-RELATED VARIABLES
• Model C: Forum variables
• Model D: Model C + exercises and videos
• Model E: Model D + previous grades
Results expressed in RMSE
Model    Method  T1    T2    T3    T4    T5    P3    P5    FG
Model C  Best    0.41  0.36  0.33  0.31  0.27  0.34  0.24  0.25
Model C  Worst   0.46  0.40  0.35  0.33  0.30  0.36  0.28  0.28
Model D  Best    0.25  0.21  0.20  0.18  0.16  0.25  0.20  0.14
Model D  Worst   0.34  0.28  0.26  0.23  0.19  0.32  0.28  0.17
Model E  Best    0.25  0.20  0.18  0.15  0.13  0.24  0.19  -
Model E  Worst   0.34  0.26  0.23  0.20  0.17  0.32  0.26  -
27. ASSIGNMENT PREDICTION: CLOSED-ENDED VS. OPEN-ENDED QUESTIONS
Results expressed in RMSE
Assignment     Forum      Problems and video  Problems, video and grades
               (Model C)  (Model A)           (Model B)
Test 3         0.33       0.20                0.18
Peer-review 3  0.34       0.25                0.24
Test 5         0.27       0.16                0.13
Peer-review 5  0.25       0.20                0.19
• No differences in Model C
• Statistically significant difference in Models A
and B (p<0.05)
28. ASSIGNMENT PREDICTION: EFFECT OF VARIABLES FROM PREVIOUS WEEKS
• Model F (Model A +
previous data)
• Assignments →
Non-cumulative
• Final Grade →
Cumulative
• Factors:
– Independence
– Engagement over
time
[Figure: Grade prediction using data from previous weeks]
29. ASSIGNMENT PREDICTION: STABILISATION OF PREDICTIVE POWER IN A DAY-BY-DAY ANALYSIS
• Threshold is
between days 7-9
• Trade-off
between
anticipation and
predictive power
[Figure: Evolution of the predictive power day-by-day]
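One possible way to locate such a stabilisation threshold is to look for the first day after which the error no longer improves by more than a small tolerance. The sketch below is an assumption about how this could be done, not the method used in the study. The function name `stabilisation_day`, the tolerance and the RMSE series are all made up for illustration and are not the course data.

```python
def stabilisation_day(rmse_by_day, tolerance=0.01):
    """Return the first day (1-indexed) after which no later day
    improves RMSE by more than `tolerance`."""
    if not rmse_by_day:
        raise ValueError("rmse_by_day must not be empty")
    for day, value in enumerate(rmse_by_day, start=1):
        later_days = rmse_by_day[day:]
        if all(value - later <= tolerance for later in later_days):
            return day

# Hypothetical RMSE obtained when predicting with data up to each day.
rmse_series = [0.40, 0.36, 0.33, 0.30, 0.27, 0.25, 0.23,
               0.212, 0.21, 0.208, 0.207, 0.206, 0.205, 0.205]
day = stabilisation_day(rmse_series)
print(day)  # 8 for this made-up series
```

A larger tolerance yields an earlier day, which is the trade-off between anticipation and predictive power in miniature.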
31. CONCLUSIONS: FORUM ACTIVITY
• Acceptable
functioning
• Deadlines alter
learners’ behaviors
and thus forum
activity
• Low participation
• Higher activity in
some concepts:
arrays, loops or
casting
• Different valid
approaches for
sentiment analysis
32. CONCLUSIONS: ASSIGNMENT PREDICTION
1) Early assignments are harder to predict
2) Algorithms are less important than data
3) Previous grades always enhance models
4) Forum-related variables have low predictive power
5) Closed-ended assignments can be predicted better
6) Previous interactions make models worse
7) Data from the nearest previous week have a stronger
relationship with current grades
8) Interactions from the current week become relevant
after 7 days
33. LIMITATIONS AND FUTURE WORK: FORUM ACTIVITY
LIMITATIONS
• Limited evaluation of
the usability
• Applicability depending
on the context
• Lack of labelled data
• Subjectivity of the
labelling process
FUTURE WORK
• Incorporate data from
new courses
• Automatic detection
of abilities
• Improve training set
for sentiment analysis
34. LIMITATIONS AND FUTURE WORK: ASSIGNMENT PREDICTION
LIMITATIONS
• Data restrictions
• Sample selection
criteria
• Applicability
depending on context
FUTURE WORK
• Use courses with more
comprehensive traces
• Comparison with other
learners
• Assess applicability
• Differentiate learners who
fail
• Put models into practice
• Analyse possible
interventions
35. SUBMITTED PUBLICATIONS
• P.M. Moreno-Marcos, C. Alario-Hoyos, P.J. Muñoz-Merino
and C. Delgado Kloos. Prediction in MOOCs: A review and
future research directions. IEEE Transactions on Learning
Technologies.
• P.M. Moreno-Marcos, C. Alario-Hoyos, P.J. Muñoz-Merino,
I. Estévez-Ayres and C. Delgado Kloos. Sentiment Analysis
in MOOCs: A case study. EDUCON Conference 2018.
• P.M. Moreno-Marcos, P.J. Muñoz-Merino, C. Alario-Hoyos,
I. Estévez-Ayres and C. Delgado Kloos. Analysing the
predictive power for anticipating assignment grades in a
Massive Open Online Course. Behaviour & Information
Technology.