Improving retention: predicting at-risk students by analysing clicking behaviour in a virtual learning environment

Presentation from 'InFocus: Learner analytics and big data', a CDE technology symposium held at Senate House on 10 December 2013. Conducted by Annika Wolff, Knowledge Media Institute, Open University.

Audio of the session and more details can be found at www.cde.london.ac.uk.


  1. Improving retention: predicting at-risk students by analysing clicking behaviour in a virtual learning environment
     Annika Wolff and Zdenek Zdrahal, 10th December 2013
  2. Student retention
     • Struggling students don’t always ask for help: they drop out of the module, or fail and then don’t progress further.
     • Timely help can make the difference between success and failure.
     • It can be hard to know who is in trouble and where to direct resources.
  3. Open University context
     Distance learning:
     • Content through VLE
     • Contact mediated through VLE: how to tell if students are struggling?
     Solution: develop predictive models from student data
     [Diagram labels: students, tutors]
  4. Data sources and data sets
     • VLE: learning content, forums, quizzes, …
     • Assessment: ongoing assessments, final exam
     • Demographic: age, gender, previous study, …
  5. Typical VLE clicks
     [Chart: number of clicks (y-axis, 0 to 3000) for students and tutors; x-axis ticks run from 1 to 47]
  6. VLE activity (prior to TMA1)
     • No VLE activity: 317 students
     • 1-20 clicks: 609 students
     • 21-80 clicks: 943 students
     • 81-150 clicks: 621 students
     • 151-300 clicks: 803 students
     • 301-600 clicks: 516 students
     • > 600 clicks: 355 students
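The banding on slide 6 can be reproduced with a small lookup. This is a minimal sketch, not the authors' code: the function name `click_band` and the constant names are hypothetical, and only the band edges and student counts are taken from the slide.

```python
from bisect import bisect_left

# Inclusive upper bounds of the click bands on slide 6; more than 600 falls in "> 600".
UPPER_BOUNDS = [0, 20, 80, 150, 300, 600]
LABELS = ["No VLE activity", "1-20", "21-80", "81-150", "151-300", "301-600", "> 600"]
# Student counts per band, copied from the slide.
COUNTS = [317, 609, 943, 621, 803, 516, 355]


def click_band(clicks: int) -> str:
    """Map a student's click count before TMA1 to its band label (hypothetical helper)."""
    if clicks < 0:
        raise ValueError("click count cannot be negative")
    return LABELS[bisect_left(UPPER_BOUNDS, clicks)]


def band_shares() -> dict:
    """Share of all students that falls into each band."""
    total = sum(COUNTS)
    return {label: count / total for label, count in zip(LABELS, COUNTS)}
```

Calling `band_shares()["No VLE activity"]` gives the fraction of students who never clicked before TMA1, based on the slide's counts alone.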
  7. Problem specification
     • Given:
       – Demographic data at the start (may include information about the student’s previous modules studied at the OU and his/her objectives)
       – Assessments (TMAs) as they become available during the module
       – VLE activities between TMAs
       – Conditions the student must satisfy to pass the module
     • Goal:
       – Identify students at risk of failing the module as early as possible, so that OU intervention is meaningful.
  8. Comments on problem specification
     • OU intervention is meaningful if the cost of the intervention is lower than the expected gain from retaining the student.
     • Modelling the problem:
       [Timeline: We are here]
  9. Comments on problem specification
     • OU intervention is meaningful if the cost of intervention is lower than the expected gain from retaining the student.
     • Modelling the problem:
       [Timeline: History we know | We are here]
  10. Comments on problem specification
      • OU intervention is meaningful if the cost of intervention is lower than the expected gain from retaining the student.
      • Modelling the problem:
        [Timeline: History we know | We are here | Future we can estimate]
  11. Comments on problem specification
      • OU intervention is meaningful if the cost of intervention is lower than the expected gain from retaining the student.
      • Modelling the problem:
        [Timeline: History we know | We are here | Future we can estimate … and we can influence!]
  12. Comments on problem specification
      • OU intervention is meaningful if the cost of intervention is lower than the expected gain from retaining the student.
      • Modelling the problem:
        [Timeline: History we know | We are here | Future we can estimate]
      • How can we estimate the future? … Based on the student’s history and on properties of upcoming parts of the module known from previous presentations.
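The criterion stated on slides 8 to 12 (intervene when the cost is lower than the expected gain) can be written as a one-line test. The sketch below is illustrative only: the slides do not say how the expected gain is computed, so the decomposition into failure probability, risk reduction and value of retention, along with every parameter name and number, is an assumption.

```python
def expected_gain(p_fail: float, risk_reduction: float, value_of_retention: float) -> float:
    """Assumed model: gain = P(fail without help) * share of those failures the
    intervention averts * value of one retained student. Not from the slides."""
    return p_fail * risk_reduction * value_of_retention


def should_intervene(p_fail: float, cost: float,
                     risk_reduction: float, value_of_retention: float) -> bool:
    """Slide criterion: the intervention is meaningful if cost < expected gain."""
    return cost < expected_gain(p_fail, risk_reduction, value_of_retention)


# Hypothetical figures: the intervention costs 50 units, averts a quarter of
# failures, and a retained student is worth 1000 units.
high_risk = should_intervene(p_fail=0.64, cost=50, risk_reduction=0.25, value_of_retention=1000)
low_risk = should_intervene(p_fail=0.02, cost=50, risk_reduction=0.25, value_of_retention=1000)
```

Under these made-up figures the same intervention is worthwhile for the high-risk student and not for the low-risk one, which is why the predicted failure probability matters.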
  13. Prediction at TMA1
      – Why? TMA1 is a good predictor of success or failure
      – There is still enough time to intervene
      [Timeline: History we know | We are here | Future we can affect]
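  14. Building a classifier
      [Diagram: training instances labelled Pass or Fail; new instances are classified as PASS or FAIL]
      Decision tree: first results (no demographics)
      [Diagram: tree testing “Assessment 1 score?” with branches >40% and <40%; leaves labelled Fail, Pass, Pass, Fail]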
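The idea on slide 14, learning a rule from labelled training instances and applying it to new ones, can be sketched with a single-split tree (a decision stump). This is a simplification under stated assumptions: the slide's tree has more leaves than one split can produce, the training data below is invented, and `fit_stump` and `predict` are hypothetical names.

```python
def fit_stump(scores, labels):
    """Pick the threshold t that minimises training errors for the rule
    'predict Pass if score >= t, otherwise Fail'."""
    best_threshold, best_errors = None, None
    for t in sorted(set(scores)):
        errors = sum((s >= t) != (y == "Pass") for s, y in zip(scores, labels))
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


def predict(score, threshold):
    """Classify a new instance with the learned threshold."""
    return "Pass" if score >= threshold else "Fail"


# Invented training instances: Assessment 1 scores with final outcomes.
train_scores = [15, 30, 35, 40, 55, 70, 85]
train_labels = ["Fail", "Fail", "Fail", "Pass", "Pass", "Pass", "Pass"]

threshold = fit_stump(train_scores, train_labels)
new_predictions = [predict(s, threshold) for s in (39, 62)]
```

A full decision tree repeats this split search recursively on each branch and over several attributes; the stump shows only the first step.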
  15. Performance drop (VLE+TMA)
  16. Final outcome
  17. Naïve Bayes network
      [Diagram: nodes Sex, N/C, Education, VLE and TMA1]
      Goal: calculate the probability of failing at TMA1
      • either by not submitting TMA1,
      • or by submitting with a score < 40.
      Variables:
      • Education:
        – No formal qualification
        – Lower than A level
        – A level
        – HE qualification
        – Postgraduate qualification
      • VLE:
        – No engagement
        – 1-20 clicks
        – 21-100 clicks
        – 101-800 clicks
      • N/C:
        – New student
        – Continuing student
      • Sex:
        – Female
        – Male
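A naive Bayes model over categorical variables like those on slide 17 can be sketched in plain Python. This is an illustration under assumptions, not the authors' model: the class name, the Laplace smoothing parameter `alpha`, and the eight training rows are all invented, and only two of the four variables are used to keep the example short.

```python
from collections import Counter


class CategoricalNaiveBayes:
    """Minimal categorical naive Bayes with Laplace smoothing (hypothetical sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        features = sorted(rows[0])
        # counts[feature][class][value] = number of training rows
        self.counts = {f: {c: Counter() for c in self.classes} for f in features}
        self.values = {f: set() for f in features}
        for row, label in zip(rows, labels):
            for f in features:
                self.counts[f][label][row[f]] += 1
                self.values[f].add(row[f])
        return self

    def predict_proba(self, evidence):
        """Posterior over classes given any subset of the features."""
        n = sum(self.class_counts.values())
        scores = {}
        for c in self.classes:
            p = self.class_counts[c] / n  # prior
            for f, v in evidence.items():
                num = self.counts[f][c][v] + self.alpha
                den = self.class_counts[c] + self.alpha * len(self.values[f])
                p *= num / den  # smoothed P(value | class)
            scores[c] = p
        total = sum(scores.values())
        return {c: s / total for c, s in scores.items()}


# Invented toy data: outcome at TMA1 for eight students.
rows = [
    {"education": "A level", "vle": "0"},
    {"education": "A level", "vle": "0"},
    {"education": "A level", "vle": "101-800"},
    {"education": "HE qualif.", "vle": "101-800"},
    {"education": "HE qualif.", "vle": "21-100"},
    {"education": "A level", "vle": "1-20"},
    {"education": "HE qualif.", "vle": "1-20"},
    {"education": "A level", "vle": "21-100"},
]
labels = ["fail", "fail", "pass", "pass", "pass", "fail", "pass", "pass"]

model = CategoricalNaiveBayes().fit(rows, labels)
without_vle = model.predict_proba({"education": "A level"})
no_clicks = model.predict_proba({"education": "A level", "vle": "0"})
many_clicks = model.predict_proba({"education": "A level", "vle": "101-800"})
```

Querying with and without the `vle` key shows how VLE evidence changes the estimate for a student whose demographic profile is held fixed.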
  18. Predicting final result from TMA1
      [Diagram labels: TMA1, TMA2, TMA7, VLE, Final result; TMA1 >= 40 → Pass/Distinction; TMA1 < 40 → Fail]
      Prior probabilities:
        P(Success) = 0.807, P(Fail) = 0.193
      Posterior probabilities (TMA1: TMA1 score >= 40; ~TMA1: TMA1 score < 40):
        P(Success|TMA1) = 0.858, P(Fail|TMA1) = 0.142
        P(Success|~TMA1) = 0.093, P(Fail|~TMA1) = 0.907
      Bayes minimum error classifier: if a student fails TMA1, he/she is likely to fail the final result.
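The probabilities on slide 18 can be checked against each other with the law of total probability, and the minimum error rule follows directly from the posteriors. This sketch assumes that “TMA1” on the slide denotes passing TMA1 (score >= 40) and that the three figures are mutually consistent; the derived quantities are arithmetic consequences of the slide's numbers, not results reported by the authors.

```python
# Figures copied from the slide.
P_SUCCESS = 0.807
P_SUCCESS_GIVEN_PASS_TMA1 = 0.858
P_SUCCESS_GIVEN_FAIL_TMA1 = 0.093

# Law of total probability:
#   P(S) = P(S|T) * P(T) + P(S|~T) * (1 - P(T))
# solved for P(T), the implied share of students who pass TMA1.
p_pass_tma1 = (P_SUCCESS - P_SUCCESS_GIVEN_FAIL_TMA1) / (
    P_SUCCESS_GIVEN_PASS_TMA1 - P_SUCCESS_GIVEN_FAIL_TMA1
)


def classify(passed_tma1: bool) -> str:
    """Bayes minimum error rule: predict the outcome with the larger posterior."""
    p_success = P_SUCCESS_GIVEN_PASS_TMA1 if passed_tma1 else P_SUCCESS_GIVEN_FAIL_TMA1
    return "Success" if p_success >= 0.5 else "Fail"


# Expected error of that rule, versus always predicting the majority class.
error_with_tma1 = (
    p_pass_tma1 * (1 - P_SUCCESS_GIVEN_PASS_TMA1)
    + (1 - p_pass_tma1) * P_SUCCESS_GIVEN_FAIL_TMA1
)
error_prior_only = 1 - P_SUCCESS
```

With these inputs the implied pass rate at TMA1 is 14/15, and conditioning on TMA1 lowers the expected error below the 0.193 obtained by always predicting success.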
  19. P(Fail|TMA1-score), P(Pass/Dist|TMA1-score)
      [Chart: probability (y-axis, 0 to 1) of Fail and of Pass/Dist for TMA1 score bands 0-39, 40-59, 60-69, 70-79 and 80-100]
  20. Predicting final result from TMA1
      [Diagram labels: Sex, N/C, Education, VLE, TMA1, TMA2, TMA7, Final result; TMA1 >= 40 → Pass/Distinction; TMA1 < 40 → Fail]
      Prior probabilities:
        P(Success) = 0.807, P(Fail) = 0.193
      Posterior probabilities (TMA1: TMA1 score >= 40; ~TMA1: TMA1 score < 40):
        P(Success|TMA1) = 0.858, P(Fail|TMA1) = 0.142
        P(Success|~TMA1) = 0.093, P(Fail|~TMA1) = 0.907
      Bayes minimum error classifier: if a student fails TMA1, he/she is likely to fail the final result.
  21. Demo Case 1
      • Demographic data (Sex, N/C, Education)
        – Student fits a certain demographic profile of gender, educational background etc.
      Without VLE: probability of failing at TMA1 = 18.5%
      With VLE:
        Clicks     Probability   Nr of students
        0          64%           4
        1-20       44%           3
        21-100     26%           5
        101-800    6.3%          14
  22. Demo Case 2
      • Demographic data (Sex, N/C, Education)
        – Different demographic profile to the previous slide
      Without VLE: probability of failing at TMA1 = 7.7%
      With VLE:
        Clicks     Probability   Nr of students
        0          39%           35
        1-20       22%           74
        21-100     11.2%         178
        101-800    2.4%          461
  23. TMA1? … it might be too late!
      [Timeline: History | We are here | Future we can affect]
      Can we predict TMA1 from VLE activities 1 week before the TMA1 deadline? How about 2, 3, … weeks?
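The question on slide 23 amounts to rebuilding the VLE feature from only the clicks made before a cutoff that lies some weeks ahead of the TMA1 deadline. A minimal sketch, assuming a per-student log of `(date, clicks)` pairs; the log format, the function name and the dates are hypothetical.

```python
from datetime import date, timedelta


def clicks_before_cutoff(click_log, deadline, weeks_before):
    """Total clicks a student made up to `weeks_before` weeks before the deadline.

    click_log is an assumed format: a list of (date, number_of_clicks) pairs.
    """
    cutoff = deadline - timedelta(weeks=weeks_before)
    return sum(clicks for day, clicks in click_log if day <= cutoff)


# Invented example: one student's activity and a made-up TMA1 deadline.
log = [
    (date(2013, 10, 1), 5),
    (date(2013, 10, 20), 12),
    (date(2013, 10, 28), 30),
]
tma1_deadline = date(2013, 11, 1)

# Feature values available 0, 1, 2 and 5 weeks before the deadline.
features = {w: clicks_before_cutoff(log, tma1_deadline, w) for w in (0, 1, 2, 5)}
```

The earlier the cutoff, the less activity the model sees, which is the trade-off between prediction quality and time left to intervene that the slide raises.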
  24. Dashboard and Chart
      [Screenshot annotations:]
      • has not engaged with VLE
      • at least one TMA below 40
      • predicted to fail
      • has not submitted TMA5
      • average score < 40
      • however, average score = 81.71 !!!
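The annotations on slide 24 read like rule-based flags attached to each student. The sketch below shows one way such flags could be derived; it is not the dashboard's actual logic, and the function name, its inputs and the order of the flags are assumptions.

```python
from statistics import mean

PASS_MARK = 40  # threshold used throughout the slides


def risk_flags(vle_clicks, tma_scores, predicted_fail):
    """Return dashboard-style flags for one student (hypothetical helper).

    tma_scores maps a TMA name to its score, or to None if it was not submitted.
    """
    flags = []
    if vle_clicks == 0:
        flags.append("has not engaged with VLE")
    submitted = [s for s in tma_scores.values() if s is not None]
    if any(s < PASS_MARK for s in submitted):
        flags.append("at least one TMA below 40")
    for name, score in tma_scores.items():
        if score is None:
            flags.append(f"has not submitted {name}")
    if submitted and mean(submitted) < PASS_MARK:
        flags.append("average score < 40")
    if predicted_fail:
        flags.append("predicted to fail")
    return flags


# Invented students.
struggling = risk_flags(0, {"TMA1": 35, "TMA5": None}, predicted_fail=True)
doing_well = risk_flags(250, {"TMA1": 80, "TMA2": 83}, predicted_fail=False)
```

Keeping the model's prediction as one flag among several lets a tutor see when the rules and the prediction disagree for the same student.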
  25. Dashboard: new design
  26. Conclusions
      • In a distance learning context, VLE data is a valuable source of data for prediction.
      • Prediction improves as a module progresses, but by then it is too late to intervene!
      • We need to optimise methods for early prediction.
