This document summarizes the development of a predictive analytics tool called OU Analyse that was created to identify at-risk students at The Open University. It describes how OU Analyse uses demographic and virtual learning environment activity data to build predictive models and identify important course activities. The tool provides weekly predictions of at-risk students to tutors so interventions can be made early. It has expanded from supporting 2 courses to 18 courses in a year. Future work includes scaling the tool to more courses and evaluating the impact of interventions.
1. Analysing at-risk students at The Open University
Date: 18th March, 2015
Authors: Jakub Kuzilek, Martin Hlosta, Drahomira Herrmannova, Zdenek Zdrahal, Annika Wolff
2. The Open University
• Largest distance learning university in the UK (200k students, over 500 courses)
• Main campus in Milton Keynes
• Many OU efforts aim at improving the student retention rate
• Several university units exist solely to help at-risk students
3. Problem specification
• Given:
– Demographic data at the start (may include information about the student's previous modules studied at the OU and his/her objectives)
– Assessments (TMAs) as they become available during the module
– Virtual Learning Environment activities between TMAs
– Conditions the student must satisfy to pass the module
• Goal:
– Identify students at risk of failing the module as early as possible, so that OU intervention is efficient and meaningful.
9. Genesis of OU Analyse
(Timeline)
• Darkness
• OU Analyse prehistory (2011–2013):
– Project with MSR Cambridge
– Analysis of VLE data
– 3 courses
– Experimental dashboard
• OU Analyse success stories (weekly support for OU courses):
– Feb 2014: 2 courses, 4 predictive models
– Oct 2014: 12 courses, dashboard
– Feb 2015: 18 courses, dashboard & recommender
• OU Analyse future
10. Our approach to support at-risk students
• Predictive modeling
12. What is the best time for at-risk student identification?
• Students who fail the first Tutor Marked Assignment (TMA) in the fourth week have a high probability of course failure (>95%)
We need to start predicting before the first TMA
13. Data
• Demographic data
– Static data during the course
– Gender, age, highest education, new/continuing student, Index of Multiple Deprivation, number of previous course attempts, student workload during the course
• Virtual Learning Environment (VLE) data
– Data from student interaction with VLE
– One day summary data
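A minimal sketch of how raw VLE interactions could be rolled up into one-day summaries. The event layout (student id, ISO timestamp, activity type) and the function name `daily_click_summary` are assumptions made for illustration; the slides do not describe the actual schema or pipeline.

```python
from collections import Counter
from datetime import datetime


def daily_click_summary(events):
    """Count clicks per (student, day, activity type).

    `events` is an iterable of (student_id, iso_timestamp, activity_type)
    tuples. This layout is hypothetical, not the OU's real schema.
    """
    summary = Counter()
    for student_id, timestamp, activity_type in events:
        day = datetime.fromisoformat(timestamp).date()
        summary[(student_id, day, activity_type)] += 1
    return dict(summary)


# Hypothetical click events for one student.
events = [
    ("s1", "2015-02-02T09:15:00", "forum"),
    ("s1", "2015-02-02T21:40:00", "forum"),
    ("s1", "2015-02-03T08:00:00", "resource"),
]
summary = daily_click_summary(events)
```

Each key of `summary` identifies one student, one day and one activity type, which matches the "one day summary" granularity named on the slide.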
14. Importance of VLE data
• Example student (demographic data): new student, male, no formal qualification
• Variables in the model: Sex, Education, N/C (new/continuing), TMA1, plus VLE in the second case
• Without VLE:
– Probability of failing at TMA1 = 18.5%
• With VLE:

Clicks     Probability   Nr of students
0          64%           4
1-20       44%           3
21-100     26%           5
101-800    6.3%          14
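The click ranges used on this slide can be sketched as a simple binning step followed by a per-bin failure rate. The slide does not say how its probabilities were estimated, so the sketch below uses plain relative frequencies as a stand-in; the function names, the `>800` overflow bin and the sample records are all assumptions for illustration, not OU data.

```python
from bisect import bisect_left

# Upper bounds of the click bins shown on the slide: 0, 1-20, 21-100, 101-800.
BIN_UPPER_BOUNDS = [0, 20, 100, 800]
BIN_LABELS = ["0", "1-20", "21-100", "101-800"]


def click_bin(clicks):
    """Map a click count to its bin label."""
    if clicks < 0:
        raise ValueError("click count cannot be negative")
    index = bisect_left(BIN_UPPER_BOUNDS, clicks)
    if index == len(BIN_LABELS):
        return ">800"  # assumption: the slide shows no bin above 800
    return BIN_LABELS[index]


def failure_rate_by_bin(records):
    """Return {bin label: (failure rate, number of students)}.

    `records` is an iterable of (clicks, failed) pairs.
    """
    totals = {}
    for clicks, failed in records:
        label = click_bin(clicks)
        fails, count = totals.get(label, (0, 0))
        totals[label] = (fails + int(failed), count + 1)
    return {label: (fails / count, count)
            for label, (fails, count) in totals.items()}


# Hypothetical records: (clicks before TMA1, failed TMA1).
records = [(0, True), (0, False), (15, True),
           (50, False), (300, False), (300, False)]
rates = failure_rate_by_bin(records)
```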
16. Important VLE activities
• Identified VLE activities: Forum (F), Subpage (S),
Resource (R), OU_content (O), No activity (N)
• Possible activities each week are: F, FS, N, O, OF, OFS,
OR, ORF, ORFS, ORS, OS, R, RF, RFS, RS, S
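The 16 weekly labels listed above are exactly the subsets of the four activity codes, with the empty subset written as N. A short sketch that generates them; the letter order O, R, F, S is inferred from the labels on the slide, and the function name is an assumption.

```python
from itertools import combinations

# Letter order inferred from the slide's labels (e.g. "ORFS"):
# OU_content, Resource, Forum, Subpage.
ACTIVITY_CODES = "ORFS"


def weekly_activity_labels(codes=ACTIVITY_CODES):
    """Return every possible weekly label, sorted alphabetically."""
    labels = ["N"]  # N stands for a week with no activity
    for size in range(1, len(codes) + 1):
        for combo in combinations(codes, size):
            labels.append("".join(combo))
    return sorted(labels)


labels = weekly_activity_labels()
```

With four activity types there are 2^4 = 16 labels, which is the size of the list on the slide.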
17. Activity space
(Figure: the activity space. The possible weekly activity combinations are laid out along the time axis, from the VLE opening and module start to TMA-1; outcomes shown are Pass, Fail, No submit.)
18. VLE trail: successful student
(Figure: the path of a successful student through the weekly activity space, from the VLE opening and module start to TMA-1; outcomes shown are Pass, Fail, No submit.)
19. VLE trail: student who did not submit
(Figure: the path of a student who did not submit through the weekly activity space, from the VLE opening and module start to TMA-1; outcomes shown are Pass, Fail, No submit.)
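A VLE trail can be read as one activity label per week, from the VLE opening up to TMA-1. The sketch below encodes a student's weekly activity sets into such a trail. The input layout, the function names and the sample weeks are assumptions for illustration; the slides do not show how trails are actually computed.

```python
# Letter order inferred from the slide's labels: O, R, F, S.
CODE_ORDER = "ORFS"


def week_label(activities):
    """Turn one week's set of activity codes into a label such as 'ORF'."""
    unknown = set(activities) - set(CODE_ORDER)
    if unknown:
        raise ValueError(f"unknown activity codes: {sorted(unknown)}")
    label = "".join(code for code in CODE_ORDER if code in activities)
    return label or "N"  # N stands for a week with no activity


def vle_trail(weekly_activities):
    """Return the list of weekly labels for one student."""
    return [week_label(week) for week in weekly_activities]


# Hypothetical student: four weeks of activity before TMA-1.
weeks = [{"F", "S"}, {"O", "R", "F"}, set(), {"O"}]
trail = vle_trail(weeks)
```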
28. Results
• Four predictive models
• Identification of important activities -> recommendations
• Support of 18 modules
• Weekly predictions
• Dashboard (from scratch to working application in 1 year)
29. Future work
• Scaling up
• 2nd round of evaluations
• Addressing new challenges: modules without historical data, model voting, new models, module fingerprints, alignment of assessments
• Evaluation of interventions
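Model voting is named above only as a future challenge, so the following is a sketch of one possible scheme rather than the OU Analyse method: each predictive model casts an at-risk vote, and a student is flagged when at least `min_votes` models agree. The function name and the default threshold are assumptions.

```python
def vote_at_risk(predictions, min_votes=2):
    """Flag a student as at-risk if enough models say so.

    `predictions` holds one boolean per model (True means at-risk).
    `min_votes` is a hypothetical threshold; with four models a strict
    majority can tie, so an explicit threshold avoids ambiguity.
    """
    if not predictions:
        raise ValueError("at least one model prediction is required")
    return sum(bool(p) for p in predictions) >= min_votes


# Hypothetical votes from four models for two students.
flag_a = vote_at_risk([True, False, True, False])
flag_b = vote_at_risk([True, False, False, False])
```

Lowering `min_votes` flags more students at the cost of more false alarms; raising it does the opposite.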
30. Thank you and see you at tech showcase!
(SC 3102-3105, Thu 19th March, 4:30-5:30 PM)