UNIVERSITY OF SOUTHAMPTON
Faculty of Physical Sciences and Engineering
Electronics and Computer Science
A mini-thesis submitted for transfer from
MPhil to PhD
Supervisors: Ed Zaluska (ejz), Dave Millard (dem)
Examiner: Mark Weal (mjw)
Predicting Student Success with Learning Analytics on Big Data Sets:
Conditioning and Behavioural Factors
by Adriana Wilde
July 10, 2014
ABSTRACT
Advances in computing technologies have a profound impact on many areas of human
concern, especially in education. Teaching and learning are undergoing a (digital) rev-
olution, not only by changing the media and methods of delivery but by facilitating
a conceptual shift from traditional face-to-face instruction towards a learner-centered
paradigm with delivery increasingly becoming tailored to student needs. Educational
institutions of the immediate future have the potential to predict (and even facilitate)
student success by applying learning analytics techniques on the large amount of data
they hold about their learners, which include a number of indicators that measure both
the conditioning factors (to which students are subjected) and the behavioural factors (what
students do) influencing whether a given student will be successful. More than ever
before, key information about successful student habits and learning context can be
discovered.
Our hypothesis is that collective data can be used to construct a model of success for
Higher Education students, which then can be used to identify students at risk. This
is a complex issue which is receiving increased attention amongst e-learning communities
(of which Massive Open Online Courses are an example) and administrators of
learning management systems alike. Smartphones, as sensor-rich, ubiquitous devices, are
expected to become an important source of such data in the imminent future, increasing
significantly the complexity of the problem of devising an accurate predictive model of
success.
This interim thesis presents the relevant issues in predicting student success using learn-
ing analytics approaches by incorporating both conditioning and behavioural factors
with the ultimate goal of informing behavioural change interventions in the context of
learning in Higher Education. It then discusses our work to date and concludes with a
workplan to generate publishable results.
Contents
1 Introduction
2 Background and Literature Review
2.1 Higher education learners today
2.1.1 A digitally-literate generation of students
2.1.2 Mature students in HE
2.1.3 Summary
2.2 Computers and learning
2.2.1 Learning Management Systems
2.2.2 Learning analytics
2.2.3 Massive Open Online Courses
2.2.4 Summary
2.3 Smart badges and smartphones
2.3.1 Summary
2.4 Behaviour sensing and intervention
2.5 Final comments
3 A research question
3.1 What are the measurable factors for the prediction of student academic success?
4 Outcomes of Work to Date
4.1 Survey of HE English-speaking students
4.1.1 Methodology
4.1.2 Findings
4.2 Survey of students from the University of Chile
4.2.1 Methodology
4.2.2 Findings
4.3 U-Cursos
4.3.1 Current status
4.3.2 Summary
5 Research Plan for Final Thesis
5.1 Motivation
5.2 Research question and research hypotheses
5.3 Work Packages
5.4 Contingency research plan
5.5 Summary
6 Conclusions
References
A Beyond this thesis
A.1 How to help students reflect on their behaviour?
B Predictability of human behaviour
C Survey questions
D A word cloud of concerns
E The U-Cursos experience
F U-Campus Screenshots
G Chilean University Selection Test
H Additional research
H.1 Audience response systems (zappers)
H.1.1 Own experience with zappers
H.2 Privacy
H.3 Internet of Things
H.4 Activity Theory
List of Figures
2.1 Multi-level categorisation model of conceptions of teaching
2.2 Smart badges: The Active Badge by Palo Alto Research Centre
2.3 Smart badges: The HBM (external and internal appearance)
2.4 Smart badges: The MIT wearable sociometric badge
2.5 A smartphone sensing architecture
2.6 Components of digital behaviour interventions using smartphones
4.1 Survey responses from UK students (excluding qualitative data)
4.2 Survey of University of Chile students: First screen
4.3 Survey responses from students of the University of Chile (excluding qualitative data)
4.4 U-Cursos view
4.5 Cramped look to the U-Cursos web interface from a smartphone
4.6 Access graph between 2010 and 2014 for U-Cursos
5.1 Data architecture at the University of Chile
D.1 Participants’ answers to the question “Do any of the potential applications described cause you any concern? Which ones? Why?”
F.1 U-Campus courses catalogue
F.2 U-Campus module catalogue for the Computer Science course
G.1 Chilean University Selection Test (PSU) - step one
G.2 PSU - step two
G.3 PSU - step three
G.4 PSU - step four
H.1 A commercial zapper: A TurningPoint™ response card
H.2 Zappers in action: Example exam question with student responses
H.3 Zappers in action: Appraising students’ confidence in their self-assessment before (left slide) and after (right slide) the solution was discussed in class
List of Tables
3.1 What do students do?
4.1 U-Cursos services ranked in ascending order of popularity amongst users
5.1 Schedule of research work and thesis submission (A Gantt chart)
5.2 University Selection Tests (PSU) data fields
5.3 FutureLearn Platform Data Exports
A.1 Table of interventions
Chapter 1
Introduction
Recent developments in mobile technologies are characterised by a high integration of
information processing, connectivity and sensing capabilities into everyday objects. It
is now easier than ever to collect, analyse and exchange data about our daily activities,
revolutionising how humans live, work and learn. This is particularly true amongst
higher education students, who already generate a rich “data trail” as they navigate
their way towards successful completion of their studies.
Traditional learning analytics research focuses on the use of data an educational
institution holds about its students to promptly identify poor performance so that
actions can be taken to encourage success. Struggling students in particular need to
be directed to be able to complete their courses more successfully (Baepler and Murdoch,
2010), as the failure to do so comes at a great cost, not only to these students but to
their institutions. This is a difficult issue, as measures of success are usually limited
to traditional indicators such as progression and academic performance. For a student,
an educational institution and the wider society, “success” would have to be defined by
retention, level of engagement and contentment as well as achievement of higher marks.
In this context, Higher Education institutions have, in recent years, devoted
great efforts to supporting students and encouraging them to succeed, for example by
making learning materials widely available to their students. Furthermore, the greater
affordability of smartphones and the ubiquity of the Internet not only allow students
to access learning materials at any time and anywhere (although students may well
not see this as the primary benefit of such technologies), but also allow academics to
learn more about student habits and context than ever before. In other words: what do
students actually do and could this information empower them to do better?
One valid approach to understanding how students learn may use technology to
gather data about the conditioning factors for their success as well as the behaviours
they adopt in their student lives. A second step would then use these indicators to
predict student success in time to perform an intervention on those students identified
as “at risk”. The technology available for collecting activity data is not only becoming
more diverse and powerful but is also becoming widely available at a decreasing cost,
hence increasing the potential for building “Big Data” collections on which sophisticated
prediction models could be devised.
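The two-step approach outlined here can be illustrated with a minimal sketch. It assumes, purely for illustration, one conditioning factor (an entry score) and one behavioural factor (weekly LMS logins), and fits a small logistic regression by gradient descent. The data, the feature names, the student identifiers and the choice of model are all hypothetical and are not taken from this work.

```python
import math

# Hypothetical training data, invented for illustration:
# (entry_score, weekly_lms_logins) -> 1 if the student passed, else 0.
# entry_score stands in for a conditioning factor, logins for a behavioural one.
history = [
    ((520, 1), 0), ((560, 2), 0), ((600, 1), 0), ((610, 3), 0),
    ((630, 6), 1), ((680, 5), 1), ((700, 8), 1), ((740, 7), 1),
]

def standardise(rows):
    """Return per-feature (mean, std) so features share a comparable scale."""
    stats = []
    for column in zip(*rows):
        mu = sum(column) / len(column)
        sd = (sum((v - mu) ** 2 for v in column) / len(column)) ** 0.5
        stats.append((mu, sd))
    return stats

def scale(row, stats):
    """Apply the stored mean and std to one feature row."""
    return [(v - mu) / sd for v, (mu, sd) in zip(row, stats)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit(data, epochs=2000, rate=0.1):
    """Step 1: fit a logistic regression by batch gradient descent."""
    stats = standardise([x for x, _ in data])
    weights, bias = [0.0] * len(stats), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in data:
            xs = scale(x, stats)
            error = sigmoid(sum(w * v for w, v in zip(weights, xs)) + bias) - y
            grad_w = [g + error * v for g, v in zip(grad_w, xs)]
            grad_b += error
        weights = [w - rate * g / len(data) for w, g in zip(weights, grad_w)]
        bias -= rate * grad_b / len(data)
    return stats, weights, bias

def success_probability(model, x):
    """Predicted probability of success for one student's indicators."""
    stats, weights, bias = model
    return sigmoid(sum(w * v for w, v in zip(weights, scale(x, stats))) + bias)

def at_risk(model, cohort, cutoff=0.5):
    """Step 2: flag students whose predicted success falls below the cutoff."""
    return [sid for sid, x in cohort.items() if success_probability(model, x) < cutoff]

model = fit(history)
cohort = {"s01": (540, 1), "s02": (720, 7), "s03": (590, 2)}
print(at_risk(model, cohort))
```

In a real setting the indicators would come from institutional records and activity data rather than a hand-written list, and the cutoff would need to be chosen with the cost of a missed or unnecessary intervention in mind.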
Students of today have unprecedented access to a breadth of technology, and this
increase in access justifies in its own right a study into how to bring pervasive computing
ideas into learning analytics. Pervasive computing is a ‘post-desktop’ computing model
under which greater processing power, connectivity and sensing are all available at a low
cost, facilitating a widespread adoption of sensor-loaded, powerful, mobile devices. This
active area of research is concerned with context-awareness, i.e. how tailored services
can be offered to users via interconnected computing devices that are sensitive to the
user’s context as determined by the processing of sensor data. One area of application
of increasing interest is education. However, in this area much of the current interest
tends to focus on the delivery of learning resources to students (Laine and Joy, 2009,
and references therein) and the provision of virtual learning environments rather than
identifying what students do.
The application of pervasive computing in the area of education exploits both the
opportunity of the ubiquity of devices and the increasing interest in new technology
exhibited across the current generation of students. Although there has been a great
amount of research in this direction (Laine and Joy, 2009; Hwang and Tsai, 2011, and
references therein), most of this research has been focused on the use of pervasive technologies to:
• enrich student learning experiences indoors and/or outdoors with digital augmentation
(Rogers et al., 2004, 2005);
• assess students (Cheng et al., 2005);
• increase access to content and annotation capabilities in support of peer-to-peer
learning (Yang, 2006);
• inform the learning activity design taking student context into account (Hwang,
Tsai, and Yang, 2008);
• increase interaction by broadening discourse in the classroom (Anderson and Serra,
2011; Griswold et al., 2004) or by playing mobile learning games (Laine et al.,
2010);
• enable ubiquitous learning in resource-limited settings, and observe the influence
of new tools in the adaptation of learning activities and community rules (Pimmer
et al., 2013);
• “deconstruct” everyday experiences into digital environments (Owens, Millard,
and Stanford-Clark, 2009; Dix, 2004).
These examples demonstrate the possibility of applying such technologies in education.
However, they did not set out to use contextual information in order to predict
or even understand student behaviours. To address this shortcoming, we will consider
context-aware computing methods and techniques that have been applied successfully in
the areas of healthcare, assisted living and social networking, and apply them to Higher
Education to complement knowledge gained through traditional educational analytics.
Many researchers have worked on the acquisition of context in general and on the dis-
crimination of human activity in particular, such as dos Santos et al. (2010); Lau (2012);
Bieber and Peter (2008); Huynh and Schiele (2005) and Khattak et al. (2011). Their
findings could be applied in this area of research too, especially as the rapid emergence of
the Internet of Things (IoT) means that the available sensor data will grow exponentially
(Manyika et al., 2011). In my opinion, the application of novel techniques from pervasive
computing into an investigation of student behaviour is worth exploring (Wilde, 2013;
Wilde, Zaluska, and Davis, 2013c,d). Indeed, I am interested in exploring the untapped
possibilities of extending learning analytics in a data-rich environment such as the one
that will be prevalent in the Internet of Things, where all specific activities and general
behaviour of students will leave “fingerprints of data” about them. This data trail affords
specific contextual information, capable of analysis for measures of engagement,
collaboration and attainment, thereby enabling the provision of adequate and timely
feedback.
Within this research I have already considered certain aspects related to the study of
behaviour in the population of interest, akin to those in ethnographic methods, with my
specific contribution residing in the disconnect between intentions of privacy as declared
by smartphone users and the actual privacy levels evident in their phone interactions
(Wilde et al., 2013b), which is one of the findings from a survey described in detail later
in this report.
The remainder of this upgrade report is organised as follows: Chapter 2 considers
the characteristics of our learners, explores the state of the art in context-aware technologies
and their existing use in education, and looks at the predictability of
human behaviour and the type of data that is available in order to infer behaviour.
Chapter 3 examines the research question to be addressed during this research: what
are the measurable factors for the prediction of student academic success? Chapter 4
presents the research work to date, specifically the design and application of a survey of
Higher Education students (in the UK and in Chile), as well as information discovery
for a suitable dataset to explore these factors (on University of Chile students), which
will be prepared by combining data from the platforms U-Campus and U-Cursos
described here. These chapters lead into a plan for the remaining work, which is detailed in
Chapter 5. Finally, the conclusions of this upgrade thesis are presented in Chapter 6.
Chapter 2
Background and Literature
Review
The general motivation for this research is assisting higher education students to achieve
success. As they are the subjects of interest, they are more precisely described in Sec-
tion 2.1. Then, I look into the use of digital technologies for learning (in Section 2.2),
both from the educational institutions’ and their students’ viewpoints, as well as ways
of using mobile and wearable technologies to learn more about students (Section 2.3).
Section 2.4 reviews existing literature on the identification of human behaviour through
these technologies. Finally, Section 2.5 appraises this review as a foundation for predic-
tion of student success using a characterisation of students from measurable data about
their conditioning and behavioural factors.
2.1 Higher education learners today
To learn about student behaviour, it is useful to start by identifying salient characteristics
of the students in higher education today, considering those of the “typical”
student, as well as those pertaining to students that do not fit into that classification.
Specifically, I will look into two dimensions: first, the student’s level of efficacy or
even engagement with digital technologies (in sub-section 2.1.1) and, second, the
age group to which the student belongs (sub-section 2.1.2).
2.1.1 A digitally-literate generation of students
Prensky’s term digital natives (Prensky, 2001a) is one amongst many¹ used to identify
those born “typically between 1982 and 2003 (standard error of ±2 years)” (Berk, 2009,
2010). Members of this group, by this definition, are now 11 to 32 years old, so the majority
of students in higher education today would belong to it. Furthermore, according
to Prensky (2001b), many may even process and interpret information differently (allegedly
due to the plasticity of the brain). These assertions would imply that what have
been regarded as traditionally effective study habits and behaviours for previous generations
are no longer effective and need to be reviewed to accommodate the needs of
the current generation of students.
Nevertheless, since only a fraction of the world population has sufficient access to digital
technologies to achieve ‘native’-like fluency in their use, the term “digital natives” is not a
fitting description (Palfrey and Gasser, 2010), and for this reason (amongst others) it has become less
accepted in the current educational discourse. Education, experience, breadth of use
and self-efficacy are more relevant than age in explaining how people become “digital
natives” (Helsper and Eynon, 2010). As a response, Kennedy et al. (2010) proposed
a different classification based on a study comprising 2096 students in Australian universities:
“power users (14% of sample), ordinary users (27%), irregular users (14%)
and basic users (45%)”. However, rather than a discrete classification, a more useful
typology is a continuum, with individuals placed along it depending on a number of
factors. Jones and Shao (2011) indicate that various demographic factors affect student
responses to new technologies, such as gender, mode of study (distance or place-based)
and whether the student is a home or international one. A JISC report questions the
validity of certain attributed characteristics of this generation (Nicholas, Rowlands, and
Huntington, 2008). Examples are a preference for “quick information” and the need
to be constantly connected to the web, both now proved to be myths: these traits are not
generational. Whilst Turkle (2008) notes that young people have digital devices always-on
and always-on-them, becoming virtually ‘tethered’, this behaviour is not restricted
to young people. For these reasons, the term has increasingly been replaced by the
term digital residents and its counterpart digital visitors (White et al., 2012).
In any case, we acknowledge that many of our students today are not only engaged
with digital technologies on a daily basis, but in their world there have always been digital
technologies in various forms. Even with the proviso that this behaviour may not be
generalisable “outside of the social class currently wealthy enough to afford such things”
(Turkle, 2008), it is an observable behaviour that is becoming increasingly common as
digital technologies have become more affordable than ever before. This suggests that
in the planning of a study involving higher education students as participants, not only
those in this generation should be considered, but also those outside it, such as mature
students.
¹ Terms include: Millennials, Generation Y, Echo Boomers, Trophy Kids, Net Generation, Net Geners,
First Digitals, Dot.com Generation and Nexters (Berk, 2009). Other terms are: cybercitizens, netizens,
homo digitalis, homo sapiens digital, technologically enhanced beings, digital youth and the “yuk/wow”
generation (Hockly, 2011; Dawson, 2010).
2.1.2 Mature students in HE
Ascribing generational traits to today’s learners is something of an overgeneralisation. As
Jones and Shao (2011) point out, global empirical evidence indicates that, on the whole,
students do not form a generational cohort but they are “a mixture of groups with various
interests, motives, and behaviours”, not cohering into a single group or generation
of students with common characteristics. In particular, research on higher education
students often focuses on the standard age band of students under 21 years of age, not
accounting for mature students (this term is typically used to refer to those who are over
this threshold upon entrance).
Even amongst this group, there are significant differences in behaviour and attainment.
Studies have found that older mature students were more likely to study part-time
than full-time, as family and work commitments have been acquired. In fact, 90% of
part-time undergraduate students are 25 years old or over and as many as 67% are over
30 (Smith, 2008).
On this note, Baxter and Hatt (1999) argued that mature students could be disag-
gregated according to age bands seemingly correlating with various levels of academic
success. Therefore, instead of considering standard and mature students solely (under
and over 21 respectively), they introduce the distinction between younger and older
matures, as those over 24 were more likely to progress through into their second year,
despite a longer period of time out of education. In general the younger mature learners
were more at risk of leaving the course than older mature students.
However, even this division may well still be a poor generalisation about (mature)
students as, besides their age, there are a myriad of more relevant factors affecting their
experience, such as their route into HE, their background and their motivation to study, all
of which are difficult (if not pointless) to use for a classification of mature learners (Waller, 2006). An
approach that acknowledges the individual characteristics of learners is to be preferred
to those that conflate them into a homogeneous group, as concluded by Waller
(2006); this requires educational providers to find means of identifying these characteristics
in order to adopt such an approach.
2.1.3 Summary
The literature reviewed in this area validates the need for individualised support and
feedback, delivered promptly and directly to each student, if it is to make an impact.
Another conclusion from this review is that students in higher education today have been
exposed to digital technologies (of which wearable and mobile devices are an example),
suggesting that these can become appropriate channels to facilitate this delivery.
2.2 Computers and learning
A natural consequence of the pervasiveness of digital technologies in recent years is that
they are now almost universally used in teaching and learning (to various degrees). In fact,
coinciding with the advent of the personal computer in the 1970s, the term Computer
Assisted Learning was first coined, alongside Computer Assisted Instruction and similar
others. However, these terms are now less commonly used, as they are being replaced in the
educational discourse by the term e-learning. The former have been used to characterise
the use of computers in education or, more specifically, situations where digital content is
used in teaching and learning. In contrast, the latter is generally used only when the content is
accessed over the Internet (Derntl, 2005; Hughes, 2007; Jones, 2011; Sun et al., 2008).
2.2.1 Learning Management Systems
Learning Management Systems (LMS), also known as virtual learning environments
(VLE) and course management systems, are excellent examples of the application of
e-learning to support traditional face-to-face instruction. These are systems used in the
context of educational institutions offering technology-enhanced learning or computer-assisted
instruction; Blackboard™ and Moodle are the best-known examples.
Stakeholders may have different objectives for using an LMS. For example, Romero
and Ventura (2010) reviewed 304 studies indicating that students use LMS to person-
alise their learning, reviewing specific material and engaging in relevant discussions as
they prepare for their exams. Lecturers and instructors use them to give and receive
prompt feedback about their instruction, as well as to provide timely support to stu-
dents (e.g. struggling students need additional attention to complete their courses more
successfully (Baepler and Murdoch, 2010), as the failure to do so comes at a great cost,
not only to these students but to their institutions). Administrators use LMS to inform
their allocation of institutional resources, and other decision-making processes (Romero
and Ventura, 2010). These authors argue the need for the integration of educational
data mining tools into the e-learning environment, which can be achieved via LMS.
LMS are being increasingly offered by Higher Education institutions (HEIs), a technological
trend making an impact on these institutions. Another trend is the proliferation
of powerful mobile devices such as smartphones and tablets, from which on-line
resources can be accessed².
² These two trends push HEIs to provide LMS access via smartphones in a visually appealing and
accessible way. These are inherent requirements of the mobile experience, which is fundamentally different
to the desktop one (Benson and Morgan, 2013). Benson and Morgan present their experiences
migrating an existing LMS (StudySpace) to a mobile development, as a response to these pressures and
the pitfalls identified on the Blackboard Mobile™ app.
It is worth noting that the majority of these systems have a client-server architecture
supporting teacher-centric models of learning (common scenarios have teachers
producing the content while students ‘consume’ it) (Yang, 2006). To put this assertion
in context, pedagogic conceptions of teaching and learning are usually understood in
the literature as falling into one of two categories: teacher-centred (content driven) and
student-centred (learning driven) (Jones, 2011, and references therein). Figure 2.1 shows
these orientations as overarching the five main conceptions of teaching and learning
which act as landmarks along a continuum of roles in learning. Deep learning occurs
at the bottom end of the scale, as opposed to shallow learning which occurs at the top
end. When student-centred, computer assisted learning can increase students’ satisfaction
and therefore engagement and attainment. It is remarkable that the move towards
learner-centredness in Higher Education coincides with the trends towards personalisation
and user-centredness in Human-Computer Interaction and computing technologies
in general.
Figure 2.1: Multi-level categorisation model of conceptions of teaching (adapted), Kember (1997).
Figure labels, from top to bottom: the teacher-centred (content-driven) orientation, with the conceptions
imparting information and transmitting structured knowledge; student-teacher interaction / apprenticeship;
and the student-centred (learning-oriented) orientation, with the conceptions facilitating understanding
and conceptual change / intellectual development.
The trend towards a widespread use of mobile devices, identified earlier, brings an
increased number of opportunities for effecting the conceptual change from the categorisation
above, as it has the potential to make learning more student-centred than
before: it would take place wherever the student goes, whenever it suits the student
best³. Additional opportunities to reach students, either to deliver content or to assess
their learning, are coupled with opportunities for other stakeholders at educational institutions
to gain an insight into student achievement (typically progression and completion)
via learning analytics, as presented in the next subsection.
2.2.2 Learning analytics
As well as facilitating engagement, content delivery and even assessment and feedback,
digital technologies have increasingly been used for facilitating administrative
tasks and decision-making at educational institutions. In particular, in recent years
HE institutions have begun to use data held about their students for learning analytics
(Barber and Sharkey, 2012; Sharkey, 2011; Bhardwaj and Pal, 2011; Glynn, Sauer, and
Miller, 2003).
Learning analytics (also known as academic analytics and educational data mining)
are widely regarded as the analysis of student records held by the institution as well
as course management system audits, including statistics on online participation and
similar metrics, in order to inform stakeholders’ decisions in HE institutions. Academic
analytics are considered useful tools to study scholarly innovations in teaching and
learning (Baepler and Murdoch, 2010). According to these authors, the term academic
analytics was originally coined by the makers of the virtual learning environment (VLE)
Blackboard™, and it has become widely accepted to describe the actions “that can be
taken with real-time data reporting and with predictive modeling” which in turn helps
to suggest likely outcomes from certain behavioural patterns (Baepler and Murdoch,
2010).
Educational data mining involves processing such data (collected from the VLE
or other sources) through machine learning algorithms, enabling knowledge discovery,
which is “the nontrivial extraction of implicit, previously unknown, and potentially
useful information from data” (Frawley, Piatetsky-Shapiro, and Matheus, 1992). Whilst
data mining does not explain causality, it can discover important correlations which
might still offer interesting insights. When applied to higher education, this might enable
the discovery of positive behaviours: for example, whether students posting more
than a certain number of times in an online forum tend to have higher final marks, or
whether attendance at lectures is a defining factor for academic success, or for any
of its measures such as “retention, progression and completion” (Sarker, 2014).
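The kind of correlation mentioned above can be illustrated with a minimal sketch. It computes Pearson's correlation coefficient between the number of forum posts and the final mark of each student. The data are invented purely for illustration, and the function name `pearson` is an assumption rather than part of any study cited here. A high coefficient would indicate an association only, not a causal link.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one entry per student (not taken from any real cohort).
forum_posts = [2, 5, 9, 12, 20, 31]
final_marks = [48, 55, 61, 58, 70, 74]

r = pearson(forum_posts, final_marks)
print(f"Correlation between forum posts and final marks: {r:.2f}")
```

In practice such a coefficient would be computed over institutional records rather than a hand-written list, and would need to be interpreted alongside the conditioning factors of each student.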
3 The “anywhere, anytime” maxim driving pervasive computing is also a motivator for the
development of the next generation of e-learning. Rubens, Kaplan, and Okamoto (2014) discuss the
evolution of the field, aligning it to the advent of Web 2.0 and 3.0, central to this paradigm of learning.
2.2.3 Massive Open Online Courses
Developments in these digital learning technologies have facilitated the rise of massive
open online courses (MOOCs)4, where the already difficult issues of assessing and providing
feedback increase dramatically in complexity with classes of up to tens of thousands
of learners (Hyman, 2012). Within this context, a considerable amount of interest has
very recently been devoted to the use of learning analytics, for example:
• On social factors contributing to student attrition in MOOCs (Rosé et al., 2014;
Yang et al., 2013);
• On linguistic analysis of forum posts to predict learner motivation and cognitive
engagement levels in MOOCs (Wen, Yang, and Rosé, 2014).
2.2.4 Summary
The literature reviewed in this area evidences the impact of digital technologies in the
provision of support and feedback to learners and other stakeholders of educational
institutions, both in terms of facilitating learning and assessment (as in MOOCs, for
example, but in e-learning in general) as well as in terms of characterising the learners
using learning analytics. In doing so, it is possible to identify the variations amongst
learners to better facilitate the learning experience. An important category of digital
technologies used in education includes portable, lightweight devices, which can
additionally function as sensor carriers, as presented in the following section.
2.3 Smart badges and smartphones
Until recently, cumbersome sensing equipment (often carried in backpacks) was required,
as shown in a survey of early developments in sensing technologies for wearable computers
(Amft and Lukowicz, 2009). This equipment has now been replaced by small, lightweight
sensors which can be embedded within badges and phones, for example.
Smart badges are identity cards with embedded processors, sensors and transmitters.
The concept is not new. The first of these wearable computers, the Active Badge
(Want et al., 1992; Weiser, 1999), shown in Figure 2.2, was developed two decades ago
by the Olivetti Research Laboratory (Cambridge) and then further developed by
Xerox PARC.
Figure 2.2: Smart badges: The Active Badge by Palo Alto Research Centre
(Weiser, 1999)
More recently, smart badges have been used to study social behaviour, as with
Hitachi’s Business Microscope (HBM) (Ara et al., 2011; Watanabe, Matsuda, and
Yano, 2013) and with its predecessor, the MIT wearable sociometric badge (Wu et al.,
2008; Pentland, 2010; Dong et al., 2012), shown in Figures 2.3 and 2.4. These badges,
containing tri-axial accelerometers, are able to capture some characteristics of the motion
of the wearer (e.g. being still, walking, gesturing). Thanks to additional sensors such as
infrared transceivers, they are also able to capture face-to-face interaction time. Being
lightweight and with a long battery life, these badges can be carried unobtrusively for
several hours a day.
4 MOOCs are occasionally referred to as “Massively-Open Online Courses”.
Figure 2.3: Smart badges: Hitachi’s Business Microscope
(external and internal appearance) (Ara et al., 2011)
Figure 2.4: Smart badges: The MIT wearable sociometric badge (Dong et al., 2012)
Watanabe et al. (2012) used the HBM in an office environment, finding evidence
that the level of physical activity and interaction with others during break periods
(rather than during working activities) is highly correlated with team performance.
Watanabe et al. (2013) then applied this methodology within a learning
environment, this time using the smart badges on primary school children, observing
a strong correlation between the scholastic attainment of a class and the degree to
which its members are “bodily synchronised”. In other words, classes whose
members are consistently either physically active or resting during the same periods
perform better. Another correlation these authors observed involves the number of
face-to-face interactions per child during break. Their findings suggest that when children in a
class move in a cohesive manner, the class performs well overall, and also that the more
face-to-face interactions an individual has, the better their attainment.
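To make the idea of a class moving "in a cohesive manner" concrete, the sketch below computes a simple synchrony score from per-child activity states. This is an illustrative assumption only: it is not the measure used by Watanabe et al. (2013), whose exact definition is not reproduced here, and the function name `class_synchrony` and the sample data are hypothetical.

```python
def class_synchrony(activity):
    """Average share of children in the majority state per time window.

    `activity` holds one list per child; each entry is True when the child
    is physically active in that time window and False when resting.
    The score ranges from 0.5 (class evenly split) to 1.0 (fully in step).
    """
    if not activity or not activity[0]:
        raise ValueError("need at least one child and one time window")
    n_windows = len(activity[0])
    if any(len(child) != n_windows for child in activity):
        raise ValueError("every child needs the same number of windows")
    n_children = len(activity)
    total = 0.0
    for t in range(n_windows):
        active = sum(1 for child in activity if child[t])
        # Share of the class in whichever state is more common in this window.
        total += max(active, n_children - active) / n_children
    return total / n_windows


# Hypothetical class of three children observed over four windows.
observations = [
    [True, True, False, False],
    [True, True, False, False],
    [True, False, False, True],
]
print(class_synchrony(observations))
```

A score of this kind could then be compared across classes against an attainment measure, which is the sort of analysis the correlation reported above suggests.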
The use of badges by all participants is easily enforced in an environment with a
strict dress code, such as school uniforms. Since our population of interest is higher
education students, smartphones are probably more appropriate than smart badges as
sensor carriers. It is nonetheless interesting to see how much can be learned from sensor
data, especially when combined with learning analytics: as in the case of Watanabe
et al. (2013), certain behaviours can be found to be related to a measure of success.
Smartphones present another advantage over badges. Equipped with ambient light
sensors, proximity sensors, accelerometers, GPS, camera(s), microphone, compass and
gyroscope, plus WiFi and Bluetooth radios, they allow a variety of applications to be
built to gather a great range of sensed data (Lane et al., 2010). Thanks to their
communication and processing capabilities, smartphones could support a sensing
architecture such as the one depicted in Figure 2.5.
Contextual information can be inferred from the sensor data thus gathered, and aspects
of the context, such as location, determined. However, it has long been accepted that
“there is more to context than location” (Schmidt, Beigl, and Gellersen, 1999). Contextual
information broadly falls into one of two types: physical environment context (such
as light, pressure, humidity, temperature, etc) and human factor related context, such
as information about users (habits, emotional state, bio-physiological conditions, etc),
their social environment (co-location with others, social interaction, group dynamics,
etc), and their tasks (spontaneous activity, engaged tasks, goals, plans, etc) (Schmidt
et al., 1999).
Figure 2.5: A smartphone sensing architecture (Lane et al., 2010).
Context acquisition is, however, important not just because of the possibility of offering
customised services that adapt to the circumstances. Context processing can increase
user awareness (Andrew et al., 2007), and thereby prompt alternative actions to better
achieve a desired goal given the current context, thus modifying an intended
behaviour in some way.
2.3.1 Summary
The literature in this area indicates that sensor data has the potential to help us understand
human behaviour, both collectively and individually, as well as the
context in which it is situated. This would be a suitable foundation for a behavioural
intervention which is aligned to the user’s goals, and the smartphone is a suitable sensing
platform which could be used to understand users’ behaviour as well as to support them
in achieving their higher goals, as discussed in the next Section.
2.4 Behaviour sensing and intervention
Despite its inherent complexity, researchers have shown that human behaviour is highly
predictable in certain contexts. In the context of scale-free networks, the degree of
predictability has been quantified at 93% (Song et al., 2010). Evidence suggests that
behaviour can be “mined” and even predicted using sensors on phones or smart badges
(presented in the previous Section):
• identifying structure in routine (for location and activity) to infer organisational
dynamics (Eagle and Pentland, 2006);
• analysing behaviour based on physical activity as detected via smartphones (Bieber
and Peter, 2008);
• predicting work productivity based on face-to-face interaction metrics (Wu et al.,
2008; Watanabe et al., 2012);
• inferring friendship network structure with mobile phone data (Eagle, Pentland,
and Lazer, 2009);
• using mobile phone data to predict the next geographical location based on peers’
mobility (De Domenico, Lima, and Musolesi, 2012), even predicting when the
transition will occur (Baumann, Kleiminger, and Santini, 2013);
• classifying social interactions in contexts where a crowd disaggregates into small
groups (Hung, Englebienne, and Kools, 2013);
• predicting personality traits with mobile phones (de Montjoye et al., 2013);
• Bahamonde et al. (2014) showed that even data from smart cards, which can be
regarded as less personal than phones or identity cards, are suitable for
behaviour mining. In particular, these researchers were able to deduce users’ home
addresses through the data exposed by their bip! cards, which are used for payment
for public transport in Santiago de Chile.
From this research we can assert that, given sufficient information, some human be-
haviour can be predicted (see Appendix B for more on its high predictability).
Specifically relevant to behaviour sensing in the educational context is the possibility
of “seeing” the learning community (Dawson, 2010) by studying the frequency and types
of interactions amongst learners using social network analysis (SNA), as a factor such as
degree centrality5 is a positive predictor of a student’s sense of community, which is
measurable.
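As an illustration of how such a measure might be obtained, the sketch below computes degree centrality, taken here simply as the number of distinct learners a given learner is connected to, from a list of pairwise interactions. The interaction data and the function name `degree_centrality` are hypothetical and are not drawn from Dawson (2010).

```python
from collections import defaultdict


def degree_centrality(edges):
    """Map each node to the number of distinct nodes it is connected to."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        # Interactions are treated as undirected connections.
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(peers) for node, peers in neighbours.items()}


# Hypothetical forum interactions: (author of a post, author of the reply).
interactions = [
    ("ana", "ben"),
    ("ana", "carla"),
    ("ben", "carla"),
    ("carla", "dev"),
    ("ana", "ben"),  # repeated interaction, counted once
]

degrees = degree_centrality(interactions)
print(degrees)
```

Repeated interactions between the same pair are counted once in this sketch; a weighted variant that keeps the interaction frequency would be a natural alternative.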
Srivastava, Abdelzaher, and Szymanski (2012) acknowledge that the use of smartphones
for sensing is becoming increasingly commonplace for human-centric sensing systems
(whether the humans are the sensing targets, sensor operators or data sources). They
identify various technical challenges to their wider adoption for these systems, one of
them being the difficulty of inferring a rich context in the wild. They warn that earlier
successes on inferences about mobility do not replicate with ease when making inferences
about “physical, physiological, behavioural, social, environmental and other contexts”
(my emphasis).
In terms of behavioural change, the state of the art includes:
• using computers as persuasive technologies6 (Fogg, 2003, 2009, 2003; Müller,
Rivera-Pelayo, and Heuer, 2012);
• promoting preventive health behaviours to healthy individuals through SMS, with
positive behaviour change in 13 out of 14 reviewed interventions (Fjeldsoe, Marshall,
and Miller, 2009);
• health-promoting mobile applications (Halko and Kientz, 2010);
• HCI frameworks for assessing technologies for behaviour change for health (Klasnja,
Consolvo, and Pratt, 2011);
• “soft-paternalistic” approaches to nudge users to adopt good behaviours to protect
their own privacy on mobile devices (Balebako et al., 2011);
• nonverbal behaviour approaches to identify emergent leaders in small groups
(Sanchez-Cortes et al., 2012);
• interactions of great impact and recall to facilitate behaviour change (Benford
et al., 2012);
• protocols for behaviour intervention for new university students (Epton et al., 2013);
• using smartphones for digital behavioural interventions (Lathia et al., 2013; Weal
et al., 2012);
• guidance for planning, implementation and assessment of behavioural interventions
for health (Wallace, Brown, and Hilton, 2014).
5 The degree centrality is defined by the number of connections a given node has.
6 Persuasive technologies are not to be confused with pervasive ones: here the emphasis is on
“persuasion” rather than ubiquity.
In particular, Wallace et al. (2014) argue that interventions involve change processes
“linked to psychological theories of human behaviour, cognition, beliefs and motivation”
with a primary aim of improving experiences and well-being. This must be incorporated
in the planning and implementation of any behavioural intervention, in particular for
digital interventions. Lathia et al. (2013) identify the need for monitoring and learning
about the behaviour before delivering an intervention, the effects of which must continue
to be monitored (Figure 2.6).
Monitor
• Gather mobile sensing data
• Collect online social network relationships and interactions
Learn
• Develop behaviour models
• Infer when to trigger intervention
• Adapt sensing
Deliver
• Tailored behaviour change intervention
• User feedback via the smartphone
Figure 2.6: The three components of digital behaviour interventions using
smartphones (Lathia et al., 2013, adapted).
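The monitor, learn and deliver components can be pictured as a simple processing cycle. The toy sketch below is not the architecture of Lathia et al. (2013): the daily step counts, the 50% drop threshold, the message text and all function names are assumptions made solely to show how the three stages might hand data to one another.

```python
from statistics import mean


def monitor(raw_readings):
    """Monitor: keep only valid (non-negative) daily step counts."""
    return [r for r in raw_readings if r is not None and r >= 0]


def learn(history, drop_ratio=0.5):
    """Learn: flag when the latest day falls well below the earlier baseline."""
    if len(history) < 2:
        return False  # not enough data to model the behaviour yet
    baseline = mean(history[:-1])
    return history[-1] < drop_ratio * baseline


def deliver(trigger):
    """Deliver: return a tailored message, or None when no nudge is due."""
    if trigger:
        return "You have been less active than usual today. Time for a short walk?"
    return None


def intervention_cycle(raw_readings):
    """Run one pass of the monitor, learn, deliver cycle."""
    history = monitor(raw_readings)
    return deliver(learn(history))


# Hypothetical step counts for four consecutive days.
print(intervention_cycle([8000, 8200, 7900, 3000]))
```

In a real deployment the cycle would run continuously, so that the effects of each delivered intervention feed back into the monitoring stage.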
Furthermore, Klasnja et al. (2011) assert that the development of such technologies
presupposes the need for large studies, suggesting that “a critical contribution of
evaluations in this domain, even beyond efficacy, should be to deeply understand how
the design of a technology for behavior change affects the technology’s use by its target
audience in situ”. Translating this experience to the educational context means that it
is not realistic to measure the success of the development by actual behaviour change
but, instead, by the degree of understanding of its potential to influence behaviour.
2.5 Final comments
In the previous section, smartphones and badges were considered as sensing platforms for
behaviour. In addition to the data that could be collected implicitly (i.e. without explicit
intervention from the user) via these, the possibility of incorporating user-generated data
is also valuable. Examples include life annotations (Smith, O’Hara, and Lewis, 2006) and
‘lifelogging’ (O’Hara, 2010; Smith et al., 2011). This data could potentially be used to
enrich the data typically studied in learning analytics by giving an insight into an additional
dimension of student lives: what do they do when they are not studying?
Through this (still ongoing) survey of the relevant literature, I have now gained a
greater understanding of the characteristics of Higher Education students (which may
condition their levels of academic success), the devices they use in their learning (in
and out of the classroom), and others from which their behaviour can be sensed, as
behavioural factors may complement conditioning factors in determining student
success. I also explored the state of the art in behavioural interventions, and what data can
be used to facilitate one. This is the foundation upon which key research components
have been created, which are presented in the next Chapter.
Chapter 3
A research question
The literature review presented in the previous Chapter surveyed the type of data and
techniques that can be used to understand and predict student behaviour. This Chapter
formulates the research question to be addressed, in order to plan an experimental
methodology and a road map for future work.
The research question stated in the introduction is “What are the measurable factors
for the prediction of student academic success?”. This Chapter discusses conditioning
and behavioural factors affecting students’ academic success and how to gather data for
measures of these factors against academic performance (a proxy for success).
3.1 What are the measurable factors for the prediction of
student academic success?
Most context-aware pervasive systems use location as the most important contextual
information available. Indeed, there is a wealth of research and commercial products
which offer location-based services, which focus on the use of readily available informa-
tion relevant to users in a given location. Not yet so well exploited, although gathering
significant scientific interest, is the use of physical activities as contextual information.
Other sources of contextual information that can become readily available include
the use of social media and learning analytics. Additionally, using sentiment analysis
on social media could help capture users’ mood and general outlook over the observable
period. Data mining algorithms could be applied over collected data; however, the
“ground truth” measure of what constitutes a successful student needs to be established
beforehand and, as explained earlier, it is in itself a very difficult question. Proxy
measures of success can be used, such as academic achievement and progression, but
other aspects of student life such as level of engagement and contentedness (if somehow
measurable) could also be taken into account for a more complete portrait of a successful
student.
Table 3.1 lists a range of activities that students in higher education are likely to
engage in, as well as the means of gathering data which could lead to identifying a given
activity (assuming participants’ consent and unrestricted access to data sources), and the
practical viability of creating such a data collection based on existing research. As
Table 3.1 suggests, a substantial amount of information about student behaviour can
be harvested and quantified (albeit exhibiting “Big Data” challenges for any practical
purposes). In other words, it is viable to investigate the behavioural factors affecting
student success if, as in traditional learning analytics (based on conditioning
factors1), these are analysed against metrics of academic success, such as retention,
progression and completion. This would give a more complete characterisation of a
student than ever before and, as a consequence, a more powerful, accurate prediction of
their success.
I have now specified the research question, and will next discuss the practical work
conducted to date in pursuit of answers to aspects of this question, which arose from the
literature review presented in Chapter 2. This is followed by the formulation of specific
research hypotheses, which will qualify the scope of this research (in Chapter 5).
1 Conditioning factors such as, for example, those highlighted in Table 5.2, page 38.
Table 3.1: What do students do?

| Activity | What could be measured? | Possible data source | Research using “similar” data sources |
|---|---|---|---|
| Attend lectures | Number of lectures attended during the semester, punctuality (by comparing calendar against actual arrival times) | GPS, University timetable, co-location with peer learners, wi-fi | Ara et al. (2011); Watanabe et al. (2013); Wu et al. (2008); Pentland (2010); Dong et al. (2012) |
| Use a VLE | Forum participation (frequency, number of posts), number of downloads | VLE records | Barber and Sharkey (2012) |
| Visit libraries | Number of items borrowed, length of the loan, medium, material type | Smartcard, Radio-Frequency Identification (RFID), library records | |
| Take exams | Academic performance measures (exam results, history of academic performance) | University records, VLE | |
| Travel | Mode of transport, distance travelled, periodicity | Accelerometer, transport smart card records, GPS | Hemminki, Nurmi, and Tarkoma (2013a); Bahamonde et al. (2014) |
| Meet other students | Co-location with other learners, certain locations (labs, etc), noise levels at location | GPS, Bluetooth, microphone, smartcard, RFID tags | Hemminki, Zhao, Ding, Rannanjärvi, Tarkoma, and Nurmi (2013b) |
| Extra-curricular activities | Participation in societies, sports, games, etc | VLE forums, Facebook | Wen et al. (2014) |
| Social networking | Number and frequency of tweets and Facebook posts, number of uploaded photos | Twitter, Facebook | |
| Physical activities | Frequency, level of activity (walk, cycle, run), fidgeting? | Accelerometer, gyroscope | Hung et al. (2013); Huynh (2008) |
| Play and rest | Number of hours watching TV or movies | Lifelogging, ambient light sensors, accelerometer | Smith et al. (2011) |
| Other activities of daily living | Eating and drinking (regularity of meals, frequency) | Lifelogging | Smith et al. (2011) |
Chapter 4
Outcomes of Work to Date
In addition to the literature review presented in Chapter 2, other work to date has
involved the investigation of students’ views via two surveys of Higher Education
students, one in English, of students in the UK (Section 4.1), and a version in Spanish,
of students at the University of Chile (Section 4.2), as well as an investigation into a
platform and its dataset from which student behaviour could be inferred: the U-Cursos
platform (Section 4.3).
4.1 Survey of HE English-speaking students
4.1.1 Methodology
A survey1 of Higher Education students, including undergraduate and postgraduate
students in several disciplines, was conducted between 16th August and 18th October
2013. This survey focused on exploring the current use of smartphones by Higher
Education students as well as establishing the acceptability of a future application. It was
developed iteratively, applying early versions amongst fellow researchers before deploying
it on the survey platform iSurvey. Data collected using early versions of the survey
was discarded as their purpose was only to inform the design. The questions appearing
in the final version of the survey can be seen in Appendix C.
Some of the elements in the literature review informed the questionnaire design. For
example, the exploration of smartphone use in Questions 2 and 3 was intended to
test the extent to which the characterisation of a virtually “tethered” student presented
in Section 2.1.1 is true. Similarly, the considerations presented in Section 2.1.2 helped
in determining the age groups within question 5(b). In all, the information required fell
across the following areas:
• Smartphone ownership: to establish whether participants own (or intend to
acquire shortly) a smartphone and, if so, which brand, to confirm whether an Android
development would be suitable.
• Current use of the smartphone: participants are asked about the frequency
of their use of their phone across a range of activities.
• Perception of whether the smartphone helps or hinders participants’ personal goals
in general, and their academic success specifically.
• Acceptability of a pervasive application that would provide behavioural “nudges”,
and desired features of such an application.
• Other controlled information, including discipline studied, level of study, modality
of studies (part-time or full-time) and views on adoption of technology.
1 Hosted at https://www.isurvey.soton.ac.uk/admin/section_list.php?surveyID=8728.
The survey was publicised on various social networks (LinkedIn, Facebook and Twitter)
as well as by direct e-mail invitation to University of Southampton students2.
Participants were required to be students in Higher Education and over 18 years old. No
compensation was offered, as no detriment arose from participation in the research
other than an investment of ten minutes for the typical participant (of which participants
were duly warned beforehand). Participants were not required to give sensitive
information, as questions in the demographics section of the survey were not
open (instead, meaningful bands were offered for selection whenever possible). Many
questions could be skipped if the participant so wished3.
A total of 807 students attempted this questionnaire; however, many could not complete
it due to a limitation of the iSurvey platform, which hosted the survey4. After
discarding incomplete submissions and those from participants in academic institutions
outside the UK, data from 164 participants remained for analysis.
4.1.2 Findings
An analysis of the responses indicates that participants, despite actively using
smartphones in their daily lives, are hesitant about allowing these devices to track their
behaviour and unsure whether such feedback is desirable. On the one hand, participants
report their use of a smartphone for a number of activities, as shown in the charts in
Figure 4.1.
2 Via Joyce Lewis, Senior Fellow for Partnerships and Business Development.
3 Compliant with recommendations by the British Educational Research Association (BERA),
outlined in “Ethical Guidelines for Educational Research”,
http://www.bera.ac.uk/system/files/BERA%20Ethical%20Guidelines%202011.pdf. Also compliant with
our institutional guidelines collated under
https://sharepoint.soton.ac.uk/sites/fpas/governance/ethics/default.aspx (both last
accessed 28th February 2014). Ethics reference number: ERGO/FoPSE/7447.
4 At the time, there was a requirement for the participants to have Flash-enabled devices to complete
surveys with slider questions (as was the case here), so participants accessing via iPhones or iPads had
to re-start the survey on other platforms. It is not possible to estimate how many did (given that the
survey was anonymous). This problem has now been resolved
(https://www.isurvey.soton.ac.uk/help/changes-to-the-slider-question-type/) but unfortunately it
affected this data collection.
Figure 4.1: Survey responses from UK students (excluding qualitative data).
The first 18 charts refer to activities that participants report undertaking with their
smartphones, which correspond to the 18 activities indicated in Question 2 of the survey.
A dominance towards lower numbers on the x axis corresponds to a high frequency in
performing a given activity as reported by the participants. For example, this applies to
making or receiving phone calls and text messages, using social networks and calendars
or reminders. Conversely, a dominance towards higher numbers on the x axis corresponds
to a low frequency, as is the case for blogging, searching for a job, and playing podcasts.
The next two charts in Figure 4.1 show the reported purpose for participants to use
their smartphone both in term time and outside of term. Whilst there is a preference
towards the use of their smartphones for personal reasons, as expected, this was much
more marked for periods outside of term. With regard to the perception of their phone
being a help or a barrier towards their personal goals and their academic success (the
subsequent two charts), most participants leaned towards the left end of the spectrum
(a help).
Figure 4.1 also indicates the reported desirability of features of a future smartphone
application, in charts 23 to 28. In this case, a preference towards the left indicates that
the given category is very desirable, and towards the right that it is not. With various
degrees of acceptance, the majority welcomed features that provided them with
information about themselves and their peers, with the exception of checking in at
learning spaces, which is not desired by the majority of the participants in the survey.
Participants were then asked whether they were concerned about any of these possible
features5.
Out of 164 participants, as many as 95 reported no concern about the features
mentioned. The remaining 69 participants had a variety of concerns, most prominently
regarding feedback on their behaviour and about their peers, as well as privacy concerns
regarding the capability of an application to check them in when entering learning spaces.
Other privacy concerns focused on the data itself, and who would access and control
it. Many commented they would not want their smartphones to have these features,
in particular those regarding physical activity tracking (terms such as “surveillance”,
“big brother” and “panopticon” were mentioned), but some others would welcome some
feedback on how they use their time and see the benefits of using such an application.
However, not all respondents have the same attitude towards adopting innovation6,
as they identify with different classes of Rogers’ (1962) taxonomy: “Innovators,
Early adopters, Early majority, Late majority, or Laggards”7.
4.2 Survey of students from the University of Chile
4.2.1 Methodology
Once it was decided to use data from University of Chile students, it became relevant
to adapt the survey previously described in Section 4.1 for its application to these
students8. As well as translating the content for each of the screens (see the example in
Figure 4.2), a question was removed as it was not relevant within this context (the concept
of part-time studying is not formalised via registration), and further options were added
to the educational stage question (as graduate courses typically last a minimum of 5 years,
as opposed to the UK’s three-year courses).
Figure 4.2: Survey of University of Chile students: First screen.
5 See Appendix D for a word cloud based on participants’ responses.
6 Rogers’ taxonomy is succinctly summarised as follows: Innovators: first to adopt an innovation; Early
adopters: judicious in balancing financial risks; Early majority: adopt an innovation with early adopters’
advice; Late majority: adopt innovation after the majority; Laggards: the last to adopt an innovation
(Rogers, 1962).
7 Currently, this data is being analysed using NVIVO (for the open responses) and SPSS and
SigmaPlot, and further conclusions will be reported in the final thesis.
4.2.2 Findings
The general trend of the responses is remarkably similar to that of UK students. There
are only two exceptions, which are explained in the following paragraphs.
Firstly, the Chilean participants seem to prefer phone calls to SMS messaging. This
may be explained by the fact that each SMS text is typically charged (unlike in the
UK, where most providers offer a number of free messages as part of their services).
Given that Internet providers in Chile offer affordable flat-rate packages,
Chilean students may prefer to send short texts via social networks (such as Twitter direct
messaging or Facebook chat) or messaging apps (such as WhatsApp and Viber).
A second difference worth commenting on is that whilst the UK participants perceive
their smartphones as helpful towards the achievement of both their personal goals and
their academic success, this is not so clear for the Chilean participants, who seem divided
in their responses. Although the justification for this difference is yet to emerge from
further analysis of the data, one possible explanation may lie with the stage in their
studies: it is conceivable that students who have not progressed as quickly as they had
expected may attribute their lack of progress to distractions related to their use of their
smartphones, which is nevertheless comparable to that of their UK counterparts.
Figure 4.3: Survey responses from students of the University of Chile (excluding
qualitative data). Note that it has one chart fewer than Figure 4.1 because there is no
distinction between Full- and Part-Time at registration at the University of Chile.
8 The version of this survey in Spanish is hosted at
https://www.isurvey.soton.ac.uk/admin/section_list.php?surveyID=10807 (closed at present).
4.3 U-Cursos
U-Cursos is a web-based platform designed to support classroom teaching. An in-house
development by the University of Chile, it was first released in 1999, when the Faculty of
Engineering required the automation of academic and administrative tasks. In doing so,
the quality and efficiency of their processes improved, whilst supporting specific tasks
such as coordination, discussion, document sharing and marks publication, amongst oth-
ers. Within a decade, U-Cursos became an indispensable platform to support teaching
across the University, used in all 37 faculties and other related institutions.
Figure 4.4: A typical U-Cursos view. Left: a list of current channels (courses,
communities and associated institutions). Top right: services available for the selected
channel. Bottom right: contents of a service. From Cádiz et al. (2014) (in Appendix
E).
The success of U-Cursos is demonstrated by the high levels of use amongst students
and academics, reaching more than 30,000 active users in 2013. U-Cursos provides
over twenty services to support teaching, as well as community and institutional “channels”,
which allow students to network, share interests and engage in discussion about
various topics. Figure 4.4 shows a typical view of U-Cursos. On the left, a list of
“channels” available for the current term is shown. Channels are the “courses”, “communities”
and “institutions” associated with the user. Typically, courses are transient,
so they are replaced with new courses (if any) at the start of the term. Communities
are permanent subscription channels which typically refer to special interest
groups, usually managed by students, covering extracurricular topics. Finally, institutions
Figure 4.5: Cramped look of the U-Cursos web interface on a smartphone (Cádiz
et al., 2014).
refer to administrative figures within the organisation. The institutional channels are
used to communicate official messages on the news publication service and also to allow
students to interact using forums containing students from all of the programmes within
each institution.
A number of services are available for each type of channel. Users can select any
of the shown services and interact with it on the content area of the view. Note that
the majority of the services are provided for all types of channels, but courses also offer
academic services such as homework publication and hand-in, partial marks publication
and electronic transcripts of the final marks. These features make course channels official
points of access for the most important events in a course and have become indispensable
for students.
4.3.1 Current status
The current version of U-Cursos displays well on all regular-size screens (above 9"), such
as desktop computers and tablets. However, the user interaction becomes cumbersome
on small displays, such as those in smartphones, as shown in Figure 4.5.
Figure 4.6: Access graph between 2010 and 2014 for U-Cursos (Cádiz et al., 2014).
The plot shows hits per month (vertical axis from 300,000 to 3,000,000), with the
1st term, 2nd term and student strike periods marked.
Another shortcoming is the lack of notification facilities, in particular those alerting
users to relevant content updates. The current setting requires users to manually access
the platform repeatedly to confirm that the information is still current. This behaviour
can be observed in Figure 4.6, which shows access statistics for U-Cursos over the last four
years. There are clear peaks during the end-of-term periods⁹.
Additional factors may trigger an increased access rate to the service: students ask
more questions, download class material for the final exams and coordinate projects,
amongst other activities. According to the users, there is a component of uncertainty which
encourages users to repeatedly access the platform during these periods. As a response,
researchers from ADI designed a mobile application for the platform, currently in beta
testing.
A research visit to NIC Labs (University of Chile) took place from the 9th to the
19th of March 2014, to gain access to and an understanding of the historical data collected
across the University and also to study the platform itself. A paper on the collaboration
was written and submitted to the 28th British HCI Conference (see Appendix E).
U-Cursos offers a number of services, of which the most frequently used are shown
in Table 4.1, with an indication of how popular they are amongst users, as well as a list
of features students would like to see in U-Cursos (both for mobile and web).
The unique advantage of using this data above any other dataset currently available
is that it has over 30,000 users (staff and students) and covers the past ten years; it is
therefore, in principle, viable for longitudinal and cross-sectional analysis. Whilst the mobile
platform is still in beta testing, having access to this wide range of data would enable
its analysis via educational analytics.
9
Terms run from March to July and from August to December in Chile. Some events may induce
small variations on the actual dates. The university closes for summer holidays in February. Source:
http://escuela.ing.uchile.cl/calendarios (in Spanish; last accessed 9th July 2014).
Table 4.1: U-Cursos services ranked in descending order of popularity amongst users.
The number in parentheses indicates the percentage of students who flagged the relevant
service or feature as especially useful or desirable (Cádiz, 2013, adapted).
| Current services | New mobile features | New general features |
| My timetable (92) | Granular push (20) | Chat (39) |
| E-mail (74) | Preview material (11) | Library (7) |
| Notifications (70) | Search for a room (10) | Multiplatform (6) |
| Teaching material (58) | More simplicity (9) | Tablet support (6) |
| Calendar (50) | Attendance log (5) | Facebook integration (4) |
| Partial marks (46) | People search (4) | Campus map (3) |
| Forum (20) | Offline access (4) | Room status (2) |
| Dropbox (14) | Book a lab (4) | Staff timetable (2) |
| Guidance notes (11) | Timeline (4) | “Read later” (2) |
| Coursework (7) | Certificate requests (4) | Virtual Classroom (2) |
| News (7) | Android widget (4) | Notes bank (1) |
| Access to past courses (5) | Marks calculator (4) | Health benefits (1) |
| Favourites (3) | Google drive (3) | Evernote integration (1) |
| Resolutions (2) | Printing queues (2) | Anonymous feedback (1) |
| Polls (2) | Institutional mail (2) | Foursquare integration (1) |
| Links (2) | Enrolment (2) | Group making (1) |
| Official transcripts (2) | Course catalogue (1) | Compare timetables (1) |
| Course administration (1) | Find staff offices (1) | Anonymous feedback (1) |
| Posters (1) | Shortcuts (1) | Reporting admin errors (1) |
4.3.2 Summary
This chapter has described the practical experiences in my research, in particular, those
related to the application of a survey amongst two different groups of HE students,
and those related to the process of securing a dataset from which a model of student
behaviour could be created in answering our first research question. This foundational
work informs the steps for future action, described in the next Chapter, which lays out
a plan for the following months up to the final thesis submission¹⁰.
10
Further work identified yet beyond the scope of this thesis is presented in Appendix A.
Chapter 5
Research Plan for Final Thesis
This research will explore the predictability of student success by applying learning analytics
on big data sets. In particular, I will analyse a rich “data trail” of student activities
as gathered via their interactions with a Learning Management System (LMS), such as
the University of Chile’s U-Cursos¹. This data can be combined with data captured by
the institution at first enrolment, such as socio-economic indicators (typically used in
traditional learning analytics). From this analysis, a model of academic success will be
developed, providing insight on the factors influencing academic performance amongst
other measurable proxies for success.
5.1 Motivation
A primary motivation behind seeking such an insight is that it would facilitate the
identification of students “at risk”, and further enable behavioural interventions so that
students can be supported in becoming successful in their studies. A greater, lasting
goal would be to influence student behaviour via persuasive technologies, so that the
students themselves are empowered to effect a significant change in their study. However,
this is a long-term goal beyond the scope of the present research. Whilst the rich
interconnection necessary for a digital behavioural intervention is not yet fully supported,
and the existing student data is both incomplete and noisy for this specific purpose, we
can still gain a good understanding of how it might look by examining current student
data, from both the educational and the pervasive computing perspectives.
A central theme of this research is learning analytics, informed by relevant studies on
behavioural interventions and the application of pervasive computing to education. In
order to build on the traditional learning analytics research approaches (generally limited
1
Developed by the University of Chile’s Information Technologies group (ADI, Área de Infotecnologías
in Spanish).
to data controlled by the educational institution), I have also considered including data
that could offer an additional insight into student behaviour, by articulating descriptions
of the activities successful students do even when they study.
5.2 Research question and research hypotheses
The general research question to be addressed is:
“What are the measurable factors for the prediction of student academic
success?”
This is a very wide-ranging question, which includes a number of conditioning factors
(e.g. what students bring with them before starting Higher Education) as well as
behavioural ones (e.g. how students engage in Higher Education studies). To focus
the research, a number of specific research hypotheses have been identified:
H1: Traditional learning analytics on conditioning factors are suitable predictors
of success. Specifically, are socioeconomic indicators and student competences²
acquired during secondary schooling adequate predictors of student
performance in Higher Education? Existing research has strongly indicated this
to be true; however, the work published to date has limitations, such as:
(a) small sample sizes. For example, Bhardwaj and Pal (2011) studied data
from up to 300 participants;
(b) studies predicting only persistence or attrition rather than measured academic
performance (Glynn et al., 2003).
My investigation of H1 is designed to extend the scope of the analysis and remove
some of these limitations. However, since this and other work published to date
highlight some factors as good predictors of student success, I will especially look
for evidence of such a correlation in the data to either support or falsify hypothesis
H1. These factors are: socio-economic factors such as age and parents’ level of
education, as well as academic performance in previous learning (such as high-school
marks).
H2: Learning analytics data in the traditional sense can be significantly
enriched by incorporating data from social media and other student-
generated data. Students interacting with the LMS leave a data trail which can
be quantified. Engagement in social forums within the U-Cursos platform is an
additional variable that can be incorporated in the prediction model. Does the
model become more accurate by doing so?
2
By student competences we refer to those measured by the University Selection Test in Chile (or
PSU, Prueba de Selección Universitaria in Spanish (Dinkelman and Martínez A, 2014)), which is used
for university admissions across the country.
H3: Smartphone data can be used to inform the prediction model. In par-
ticular, do measures of engagement with the U-Cursos mobile platform correlate
with those in the web-based version (for which there is substantial historical data
available)?
To test hypothesis H1, I will work with institutional data held by the University
of Chile via the platform U-Campus³, which holds databases of administrative data
related to each student, e.g. status, courses in which they are enrolled, enrolment, progression,
withdrawal and completion, as well as the socio-economic indicators reported
at the time the PSU test (Prueba de Selección Universitaria in Spanish) was taken. U-Campus
offers a number of services to five⁴ faculties across the university: those
related to curriculum management (e.g. enrolments, course programmes, prospectuses,
accreditation), and to administration and personal management (e.g. repository of University
Council minutes, accreditation statistics).
U-Campus is of interest for this research since the student data held (as outlined
above) could well be used to predict success if H1 is true. In particular, and following
on from previous research (Sarker, 2014; Bhardwaj and Pal, 2011; Glynn et al., 2003), I expect
to find a correlation between academic performance and socioeconomic indicators such
as education level and occupation of the parents.
To test hypothesis H2, I will include in the analysis log data from U-Cursos in-
dicating the time and frequency of interactions with the LMS, including not only the
instances in which students upload content (e.g. submitting coursework) but also the
instances in which they retrieve information of interest (e.g. assessment results and
course information).
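As a minimal sketch of how the time and frequency of interactions might be summarised per student, the following assumes a hypothetical log of (student, timestamp, action) records. The record layout and the action labels are illustrative assumptions only and do not reflect the actual U-Cursos schema.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log records: (pseudonymous student id, ISO timestamp, action).
# The action labels are illustrative, not the real U-Cursos schema.
LOG = [
    ("s1", "2014-03-10T09:15:00", "view_marks"),
    ("s1", "2014-03-10T22:40:00", "submit_coursework"),
    ("s1", "2014-03-12T08:05:00", "view_material"),
    ("s2", "2014-03-11T14:30:00", "view_material"),
]

# Assumed set of actions that count as uploading content.
UPLOAD_ACTIONS = {"submit_coursework"}


def interaction_features(log):
    """Summarise each student's interactions: totals, uploads, retrievals,
    number of distinct active days, and mean hour of day of access."""
    per_student = defaultdict(list)
    for student, stamp, action in log:
        per_student[student].append((datetime.fromisoformat(stamp), action))

    features = {}
    for student, events in per_student.items():
        uploads = sum(1 for _, action in events if action in UPLOAD_ACTIONS)
        features[student] = {
            "total": len(events),
            "uploads": uploads,
            "retrievals": len(events) - uploads,
            "active_days": len({t.date() for t, _ in events}),
            "mean_hour": sum(t.hour for t, _ in events) / len(events),
        }
    return features


FEATURES = interaction_features(LOG)
```

Features of this kind (counts of uploads and retrievals, active days) could then be treated as candidate behavioural variables alongside the conditioning factors.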
In testing hypothesis H3, I will follow closely the development of the mobile extension
of U-Cursos, which aims firstly at improving accessibility and usability, and
secondly at exploiting smartphone capabilities, such as nudges via granular pushes for
delivery of information and the possibility of adding location data to the timestamp
of an interaction. Rather than investigating the effectiveness of these additions,
I am interested in proposing a framework so that mobile data can be incorporated into
the learning analytics.
There are certain limitations regarding the mobile data which will be available in
the coming months. In particular, this development is still in progress: beta testing
is expected to finish by the end of July 2014 and therefore there is no historical data
available. Additionally, the number of users is currently limited to just 50 (as opposed
to the current 30,000 users of the web-based version of the platform). Despite this
limitation, it is worth exploring whether the prediction model applied using the mobile
3
Access-restricted portal: https://www.u-campus.cl. See Appendix F for screenshots.
4
The University of Chile faculties currently using U-Campus are: Mathematical and Physical Sci-
ences, Medicine, Architecture and Landscaping, Social Sciences, and Philosophy.
data is reasonably aligned with the prediction results achieved when using the web-based
platform.
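One simple way to check such alignment is a correlation coefficient between an engagement measure on the web platform and the same measure on the mobile platform. The sketch below computes Pearson's r in plain Python; the weekly access counts are made-up values for illustration, and with only around 50 mobile users any such estimate would need to be interpreted cautiously.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Illustrative (made-up) weekly access counts for the same five students.
web_hits = [12, 30, 7, 22, 15]
mobile_hits = [5, 14, 2, 11, 6]
r = pearson(web_hits, mobile_hits)
```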
5.3 Work Packages
In order to test the hypotheses presented in the previous section, a number of activities
have been planned. The timescales for the proposed future work are given in the Gantt
chart in Table 5.1, and detailed in the following work packages:
WP1: Enhanced literature review, with a focus on learning analytics as applied to the
three research hypotheses.
WP2: Additional data analysis on surveys conducted in Chile and the UK.
WP3: Data acquisition and the collation of a complete dataset (a subset of U-Campus
and U-Cursos).
WP4: Analysis of historical data from the PSU admission test of University of Chile
students, for indicators associated with completion (available via U-Campus).
WP5: Analysis of U-Cursos data, for factors associated with high marks.
WP6: Integrating WP4 with WP5 findings for a predictive model of academic success.
WP7: Incorporating the additional variables gathered via U-Cursos mobile into the
predictive model from WP6.
I am currently working on the first three work packages (WP1 to WP3). WP1 is
necessary to complement my existing literature review, and will continue for the next
12 months, to ensure awareness of state-of-the-art research. In WP2, I will finalise the
quantitative and qualitative analysis of the survey data that was described in Chapter 4.
WP3 also completes ongoing work, this time regarding the datasets needed for
this research. Work for this package started during my research visit to the University
of Chile from the 9th to the 19th of March 2014, when an improved understanding of the
data architecture of both U-Cursos and U-Campus was achieved (beyond the general
concept presented by Cádiz (2013)). During this trip the collaboration with ADI and
NIC Labs became formally established. Figure 5.1 provides an outline of the processes
and the kind of data stored, as well as the domains of responsibility for each.
WP4 will undertake a full analysis and evaluation of the PSU test data of students
who have enrolled in the University of Chile since 2003, when the test was first intro-
duced. More specifically, I will study correlations and statistical dependencies (using
Table 5.1: Schedule of research work and thesis submission
2014 2015
Jul Aug Sep Oct Nov Dec Jan Feb Mar Apr May Jun Jul Aug Sep Oct
Mini-thesis viva
H1 – conditioning factors
WP1: Extending literature review
WP2: Additional data analysis on survey data
WP3: Securing U-Campus and U-Cursos data
WP4: Analysis of U-Campus data (with PSU data)
Second research visit to Chile
H2 – behavioural factors
WP5: Analysis of U-Cursos data (SPSS and WEKA)
WP6: Integration for a predictive model
Submit WP6 results to Computers and Education
H3 – smartphone data
WP7: Incorporating mobile data
Working with visiting researcher from Chile
Thesis write-up
Thesis submission
Figure 5.1: Data architecture at the University of Chile: U-Campus and U-Cursos,
with processes and entities responsible for their management: ADI is the University of
Chile’s Information Technologies group (Área de Infotecnologías in Spanish) and STI
is the University of Chile’s Division of IT and Communications (Dirección de Servicios
de Tecnologías de Información y Comunicaciones).
Diagram labels: U-Campus; U-Cursos; U-Cursos Mobile; PSU; ADI; STI; monthly forum
“dump”; manual enrolments at Faculty level; students’ automated enrolments;
digitalisation (some); institutional information; student RUT, name, address,
socioeconomic data, age, etc.; course data (e.g. syllabus, resources, coursework
specs, timetable, news, student polls); student data (e.g. RUT, names, email addresses,
avatars, courseworks, partial marks, timetables, final marks or fail status (R/E/I));
lecturer/instructor data (e.g. roles, courses, permissions).
SPSS) between “conditioning” factors and the academic performance to date as measured
by the PSU test. Table 5.2 shows the data fields available for this test⁵, with marks
(X) next to those which are of interest for this analysis, in particular socio-economic
indicators and the average high-school marks, since they are generally accepted as reliable
predictors of academic performance in the literature. Additional factors, such as
gender, age and nationality, have been identified in the global literature as influential,
so I will also incorporate this data. Specifically for the Chilean case, it has been
reported that the PSU test is widely regarded as being biased towards school-leavers
from private schools and towards the metropolitan area. Therefore, I will also study the
impact of the educational institution of origin and the home city on the academic performance
prior to the test (in this work package) and then later in Higher Education
(in WP5). Finally, after certain pre-processing⁶, other fields (marked with †) are also
5
See Appendix G for further details, including screenshots of a sample student application.
6
In order to guarantee anonymity, it is necessary to avoid sensitive data, such as the name, phone
numbers, email, exact home address (street and house number), and exact date of birth (month and
year will suffice).
necessary. In particular, I will require the national identification number (hashed or
otherwise protected), since this will act as a unique key which could be used to link the
data from the PSU test (“conditioning data”) to the measures of academic performance
available via U-Cursos in WP5.
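As an illustration of how a protected identifier could still serve as a linking key, the sketch below pseudonymises the national identification number with a keyed hash (HMAC-SHA256) and joins two record sets on the result. The choice of a keyed hash, the secret key, the ID normalisation, the field names and the sample records are all assumptions made for this example; the actual protection scheme would be agreed with the data owners.

```python
import hashlib
import hmac

# Hypothetical secret key, held only by the data owner, so that pseudonyms
# cannot be reversed by enumerating all possible ID numbers.
SECRET_KEY = b"replace-with-a-secret-kept-by-the-data-owner"


def pseudonymise(national_id: str) -> str:
    """Map a national ID to a stable pseudonym using keyed HMAC-SHA256."""
    # Assumed normalisation: drop dots and hyphens, upper-case check letters.
    normalised = national_id.replace(".", "").replace("-", "").upper()
    return hmac.new(SECRET_KEY, normalised.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def link(psu_rows, ucursos_rows):
    """Inner-join two lists of dicts on the pseudonym, dropping the raw ID."""
    marks_by_key = {pseudonymise(r["id"]): r["final_mark"]
                    for r in ucursos_rows}
    linked = []
    for row in psu_rows:
        key = pseudonymise(row["id"])
        if key in marks_by_key:
            linked.append({"key": key,
                           "school_average": row["school_average"],
                           "final_mark": marks_by_key[key]})
    return linked


# Made-up records; IDs and values are purely illustrative.
psu = [{"id": "12.345.678-5", "school_average": 6.1},
       {"id": "9.876.543-3", "school_average": 5.4}]
ucursos = [{"id": "12345678-5", "final_mark": 5.8}]
linked = link(psu, ucursos)
```

In practice the pseudonymisation step would be run by the data owners before export, so that the raw identifier never leaves the institution.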
At this point, I will have sufficient evidence to either support or reject hypothesis
H1 (“traditional learning analytics on conditioning factors are suitable predictors of
success”), as indicated in the Gantt chart (Table 5.1). My findings will be discussed
with researchers in ADI and NIC Labs during my second research visit (for two weeks,
exact dates TBA), where I will complete the analysis and commence work on WP5.
The visit will be used also to agree with these researchers on measurable behavioural
factors that are feasible to study via the smartphone extension of U-Cursos, which will
be required for WP7.
For WP5, data from U-Cursos will offer some information on measures of academic
performance and “behavioural factors”, limited to how students interact with the platform,
in terms of type and frequency of their access, including coursework submission
information and interim assessments. This data will be analysed and correlations and
statistical dependencies will be studied (using SPSS). Additionally, I will apply data
mining techniques to formulate a prediction model of successful performance, consider-
ing these variables as classifying features.
WP6 concerns the integration of the conditioning factors (as gathered from U-Campus)
and behavioural factors (from U-Cursos). Since the number of variables
available will increase significantly, it is essential to apply feature selection methods
to improve the model and avoid overfitting. A number of classification methods from
the data mining toolset WEKA could be used, for example Naïve Bayes, which has
also been used by Bhardwaj and Pal (2011) to predict academic performance⁷. As an outcome
of this work package, I intend to submit a research paper to the journal Computers and
Education⁸, where the evidence gathered to support or reject hypothesis H2 will be discussed.
The effort in writing this paper will count towards the task “Thesis write-up”
(shown last in Table 5.1), hence this is shown as formally starting at the same time as
WP6, though in practice the writing takes place throughout the research project. Finally,
WP7 is concerned entirely with testing hypothesis H3 (“Smartphone data can be used
to inform the prediction model”), and will incorporate data from U-Cursos mobile into
the model created as part of WP6.
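To make the Naïve Bayes idea concrete, the following is a minimal plain-Python sketch of a categorical Naïve Bayes classifier with Laplace smoothing. It is not the WEKA implementation and omits feature selection; the two features (parental education, forum activity), their values and the pass/fail labels are made up purely for illustration.

```python
import math
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(feature index, class)][value] -> count
    value_counts = defaultdict(Counter)
    values_seen = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen


def predict(model, row):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(row):
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(values_seen[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training data: (parental education, forum activity) -> outcome.
rows = [("higher", "high"), ("higher", "low"), ("secondary", "high"),
        ("secondary", "low"), ("primary", "low"), ("higher", "high")]
labels = ["pass", "pass", "pass", "fail", "fail", "pass"]
model = train(rows, labels)
prediction = predict(model, ("primary", "low"))
```

The classifier treats features as conditionally independent given the class, which is the simplifying assumption that gives the method its name; conditioning and behavioural variables would simply be additional columns in each row.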
7
Bhardwaj and Pal (2011) only used conditioning variables such as those to be studied in WP4.
8
Some of the journal Computers and Education impact metrics are: Impact per Publication (IPP)
of 3.720 and Impact Factor (IF) of 2.775. As reported at http://www.journals.elsevier.com/computers-and-education/
(last accessed on the 4th July 2014).
Table 5.2: University Selection Test (Prueba de Selección Universitaria, PSU) data
fields. Data from fields marked in bold will be used to validate H1, complemented
with other fields of interest (marked X). Note that fields marked † will require some
preprocessing for anonymisation. (Based on http://www.demre.cl/instr_incrip_p2014.htm.
Last accessed: 3rd July 2014.)
| Mark | Field | Comments |
Personal data
|   | Full name | prefilled on login |
| † | National identification number | prefilled on login |
| X | Country of nationality | |
| X | Gender | prefilled on login |
| † | Date of Birth | prefilled on login |
| X | Occupation | two choices: Student or blank field |
School data
| X | Type of applicant | either from current or previous years |
| X | Educational Institution | prefilled |
|   | Educational Branch | institutions may have several |
| X | Year of graduation from High School | prefilled |
| X | Average high-school marks | prefilled if from previous years |
|   | Geographical Area | prefilled |
Test choices data
|   | Test choices | Social and/or pure sciences (but just one amongst Biology, Physics and Chemistry) |
Admissions office
|   | Test venue | dropdown menu |
Personal contacts
|   | Home address: street, number | |
| X | Home: city, region and province | dropdown menus |
|   | Phone numbers | |
|   | E-mail address | |
Socio-economic data
| X | Marital status | dropdown menu |
| X | Work status | dropdown menu |
| X | Working hours | dropdown menu |
| X | Number of working hours a week | |
| X | Term time type of accommodation | dropdown menu |
| X | Household size | |
| X | Number of people in the household in employment | |
| X | Who is the head of the household? | dropdown menu |
| X | Are your parents alive? | |
| X | How many people study in your household | broken down by educational stage |
| X | Have you studied in a Higher Education Institution | Yes/No |
| X | If so, type of institution | dropdown menu |
|   | Name of institution | |
About each parent
| X | Occupation | multiple choice |
| X | Industry | multiple choice |
Funding and payment
| X | Are you a beneficiary of a JUNAEB scholarship? | dropdown menu |
5.4 Contingency research plan
The research plan described above is predicated on acquiring specific data from a substantially
large group of students, in particular from U-Campus, U-Cursos and U-Cursos
mobile. Although I have successfully established the appropriate contacts at the University
of Chile (in the ADI group and with NIC Labs), and substantial progress has
already been made towards accessing U-Cursos and U-Campus data, a contingency plan
is in place for the event of failure to secure suitable data.
My contacts from the University of Chile have been forthcoming in answering my
questions as I become familiar with the platform and the organisation itself. My contribution
to this collaboration is that my findings will be used to inform the evolution
of the platform, and further extensions are likely to incorporate “nudges” for a future
digital behavioural intervention seeking to improve retention and shorten the length
of time students need to graduate. Our close collaboration is already fruitful: during
my research visit last March, we were able to prepare a research paper together in which
U-Cursos is well described (Cádiz et al., 2014, as in Section 4.3). However, despite these
strong assurances of their willingness to share the relevant data with me,
there are some practical issues to be resolved which may affect the feasibility of securing
the data as planned. In particular, the data architecture seems to have followed an
ad-hoc design and there are many redundancies and inefficiencies of which I have only just
begun to become aware. As the data is distributed across a number of tables, often on
separate sites, it is not a matter of simply being granted access to a centralised repository.
In addition, our requirement for anonymisation of the data adds another level of
uncertainty (which is hard to quantify) as this clearly will require time and effort by my
Chilean colleagues.
Should it be the case that the contingency research plan is carried out, hypotheses H1
and H2 may alternatively be tested on data from the University of Southampton Massive
Open Online Courses (MOOCs)⁹, which are run by the University of Southampton via
FutureLearn.
Data regarding several conditioning factors to test hypothesis H1 are also harvested
during enrolment in these courses as part of a “pre-course” questionnaire. These include
socio-economic indicators (e.g. age, country, gender, employment status and reported
disabilities if any), and other conditioning factors such as course expectations, reported
learning preferences, subject areas of interest, and prior education (both in formal education
and in other MOOCs). Given this data, a study similar to that planned for WP4
can still be undertaken.
9
As an example, the MOOC “How the Web is Changing the World” has had two intakes since
2012 (and is running for the third time this October). Further details at http://www.soton.ac.uk/moocs/webscience.shtml
(last accessed on the 26th June 2014).
With regards to the testing of H2, there are a number of datasets available, for which
there is implicit consent from participants for their use in research. These datasets are
files in Comma Separated Value (CSV) format, the most relevant being:
• the End of Course dataset – contains metrics such as the proportion of those who
enrolled in the course (“joiners”) who have abandoned it (“leavers”). Other characterisations
include: “learners” (those who have viewed at least one step of the course),
“active learners” (those who have marked at least one step as complete), “returning
learners” (those who completed steps in more than one week), “social learners”
(those who have left at least one comment), and “fully participated learners” (sic),
those who have completed a majority of the steps including all tests¹⁰.
• the Step Completion dataset – note that each course has a number of “steps” that
need to be completed to succeed (typically watching a video, reading a text, or
completing an assessment). Each step can have a number of comments associated with it.
• the Quiz data – which would constitute a proxy for “marks” in the traditional
sense; and
• the Comments dataset – Table 5.3 is a detailed example of the structure of this
dataset.
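The learner characterisations above can be expressed as simple rules over per-learner activity records. The sketch below is one possible reading of those definitions: the record layout (sets of viewed steps, completed steps mapped to a week number, a comment count) is hypothetical, and “a majority of the steps” is assumed to mean more than half.

```python
def categorise(learner, total_steps, test_steps):
    """Return the set of labels that apply to one enrolled learner.

    `learner` is a dict with hypothetical keys:
      viewed    - set of step ids viewed
      completed - dict mapping completed step id -> week number
      comments  - number of comments posted
    `test_steps` is the set of step ids that are tests.
    """
    labels = {"joiner"}
    if learner["viewed"]:
        labels.add("learner")
    if learner["completed"]:
        labels.add("active learner")
    if len(set(learner["completed"].values())) > 1:
        labels.add("returning learner")
    if learner["comments"] >= 1:
        labels.add("social learner")
    done = set(learner["completed"])
    # Assumption: "a majority" means strictly more than half of all steps.
    if len(done) > total_steps / 2 and test_steps <= done:
        labels.add("fully participated learner")
    return labels


# Made-up example: a four-step course with two test steps.
example = {
    "viewed": {"1.1", "1.4", "2.1", "2.2"},
    "completed": {"1.1": 1, "1.4": 1, "2.2": 2},
    "comments": 0,
}
example_labels = categorise(example, total_steps=4, test_steps={"1.4", "2.2"})
```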
A “post-course” questionnaire, though mainly intended as a course evaluation exercise
(and therefore including questions where the student rates the course in several
ways), also helps in gathering other indicators of the learning behaviour, such as point
of entry (whether from the start of the course or later on), reasons for attrition (if the
course was abandoned) and specific learning behaviours adopted, covering dedication
in time and effort, reported frequency of access, reflection, collaboration (through social
media as well as via comments in a step within the course) and connectivity (devices
used to access the course and typical study places), as well as their use of prior learning.
Combined, these datasets record all the interactions between participants through
the platform and hold a complete record of achievement and progress as the students
take on the various tasks and assessments in the course.
Admittedly, hypothesis H3 cannot be tested using MOOCs data, but alternatively
we would formulate a domain-specific hypothesis applicable to online-only courses, as
opposed to face-to-face instruction supported by an LMS, which is the case of interest
in the current plan. Also in this case, a shift in focus will be necessary, an the literature
review presented in Section 2.2.3.
10
Thanks to Kate Dickens from the Centre for Innovation in Technologies and Education (CITE) for
facilitating this information.
Table 5.3: FutureLearn Platform Data Exports. Adapted from https://www.
futurelearn.com/courses/course-slug/). (Last accessed: 4th
July 2014, by Kate
Dickens (Project Leader for the Web Science MOOC).
Comments
  id                        [integer]    a unique id assigned to each comment
  author id                 [string]     the unique, anonymous id assigned to the author user
  parent id                 [integer]    the unique id of the parent comment (i.e. the comment this comment replies to)
  step                      [string]     the human readable step number (e.g. 1.13)
  text                      [string]     the comment text
  timestamp                 [timestamp]  when the comment was posted
  moderated                 [timestamp]  the time at which a comment was moderated, if at all
  likes                     [integer]    the number of likes attributed to the comment

Peer Review - Assignments
  id                        [integer]    a unique id assigned to each assignment submission (referenced by reviews)
  step                      [string]     the human readable step number (e.g. 1.13)
  author id                 [string]     the unique, anonymous id assigned to the author user
  text                      [string]     the comment text
  first viewed at           [timestamp]  when the assignment step was first viewed
  created at                [timestamp]  when the assignment was submitted
  moderated                 [timestamp]  the time at which a comment was moderated, if at all
  review count              [integer]    how many reviews are associated with the assignment

Peer Review - Reviews
  id                        [integer]    a unique id assigned to each assignment review
  step                      [string]     the human readable step number (e.g. 1.13)
  author id                 [string]     the unique, anonymous id assigned to the author user
  assignment id             [integer]    the id identifying the assignment reviewed
  guideline one feedback    [string]     text submitted for the first guideline
  guideline two feedback    [string]     text submitted for the second guideline
  guideline three feedback  [string]     text submitted for the third guideline
  created at                [timestamp]  when the review was submitted
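As an illustration of how the Comments export in Table 5.3 might feed behavioural indicators, the following is a minimal sketch only, not part of the planned work packages. It assumes the export is available as CSV with a header row, and that the fields appear as underscore-separated column names (author_id, parent_id); both are assumptions, as are the sample rows and the function name summarise_comments, which are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the shape of the "Comments" export in Table 5.3.
# The underscore column names (author_id, parent_id) are an assumption;
# the table lists these fields as "author id" and "parent id".
SAMPLE_COMMENTS_CSV = """id,author_id,parent_id,step,text,timestamp,moderated,likes
1,a1,,1.13,First thoughts,2014-07-01 10:00:00,,2
2,b2,1,1.13,I agree,2014-07-01 10:05:00,,0
3,a1,,1.14,Another point,2014-07-02 09:00:00,,1
"""


def summarise_comments(csv_file):
    """Aggregate simple per-author activity counts from a comments export."""
    summary = defaultdict(lambda: {"comments": 0, "replies": 0, "likes": 0})
    for row in csv.DictReader(csv_file):
        stats = summary[row["author_id"]]
        stats["comments"] += 1
        # A non-empty parent id means this comment replies to another one.
        if row["parent_id"]:
            stats["replies"] += 1
        # Likes received by this author's comments; an empty cell counts as 0.
        stats["likes"] += int(row["likes"] or 0)
    return dict(summary)


activity = summarise_comments(io.StringIO(SAMPLE_COMMENTS_CSV))
for author, stats in sorted(activity.items()):
    print(author, stats)
```

Counts of this kind (comments posted, replies made, likes received) are only candidate behavioural indicators; whether they relate to success would still have to be tested against the hypotheses above.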
5.5 Summary
This Chapter presented the motivation behind the research question “What are the
measurable factors for the prediction of student academic success?” and outlined three
research hypotheses associated with it. Two of these hypotheses consider conditioning
and behavioural factors as predictors of academic success, whilst the last one regards
smartphone data as suitable to inform a prediction model of success. In order to test
them, a number of work packages (WP1-WP7) are planned, with deliverables at specific
points in the time remaining until the submission of the final thesis. I have also outlined
a contingency research plan should the data expected from the University of Chile prove
difficult to obtain owing to unforeseen circumstances.
The following Chapter will outline future work that has been identified yet is beyond
the scope of this research given the time and resources remaining.
Chapter 6
Conclusions
This research will explore the predictability of student success from learning analytics
on big data sets. In particular, we seek to analyse a rich “data trail” of student activities
as gathered via their interactions with a Learning Management System (LMS), such as
the University of Chile’s U-Cursos¹. This data can be combined with data captured by
the institution at first enrolment, such as socio-economic indicators (typically used in
traditional learning analytics). From this analysis, a model of academic success will be
developed, providing insight into the factors influencing academic performance amongst
other measurable proxies for success.
A primary motivation behind seeking such insight is that it would facilitate the
identification of students “at risk”, and further enable behavioural interventions so that
students can be supported in becoming successful in their studies. A greater, lasting goal
would be to influence student behaviour via persuasive technologies, so that the students
themselves are empowered to effect a significant change. This is a long-term goal beyond
the scope of the present research. Whilst the rich interconnection necessary for a digital
behavioural intervention is not yet fully supported, and the existing student data is both
incomplete and noisy for this specific purpose, we can still gain some knowledge of how
it might look by examining current student data, from both the educational and the
pervasive computing perspectives.
The central theme of this research is learning analytics, informed by relevant studies
on behavioural interventions and the application of pervasive computing to education. In
order to build on the traditional learning analytics research approaches (generally limited
to data controlled by the educational institution), I have also considered including data
that could offer an additional insight into student behaviour, by articulating descriptions
of what successful students do when they are not studying.
1 Developed by the University of Chile’s Information Technologies group (ADI, Área de Infotecnologías
in Spanish).
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
Ā 

A Mini-Thesis Submitted For Transfer From MPhil To PhD Predicting Student Success With Learning Analytics On Big Data Sets Conditioning And Behavioural Factors

  • 1. UNIVERSITY OF SOUTHAMPTON Faculty of Physical Sciences and Engineering Electronics and Computer Science A mini-thesis submitted for transfer from MPhil to PhD Supervisors: Ed Zaluska (ejz), Dave Millard (dem) Examiner: Mark Weal (mjw) Predicting Student Success with Learning Analytics on Big Data Sets: Conditioning and Behavioural Factors by Adriana Wilde July 10, 2014
  • 2. UNIVERSITY OF SOUTHAMPTON FACULTY OF PHYSICAL SCIENCES AND ENGINEERING ELECTRONICS AND COMPUTER SCIENCE Predicting Student Success with Learning Analytics on Big Data Sets: Conditioning and Behavioural Factors A mini-thesis submitted for transfer from MPhil to PhD by Adriana Wilde
ABSTRACT
Advances in computing technologies have a profound impact in many areas of human concern, especially in education. Teaching and learning are undergoing a (digital) revolution, not only by changing the media and methods of delivery but by facilitating a conceptual shift from traditional face-to-face instruction towards a learner-centered paradigm with delivery increasingly becoming tailored to student needs. Educational institutions of the immediate future have the potential to predict (and even facilitate) student success by applying learning analytics techniques on the large amount of data they hold about their learners, which include a number of indicators that measure both the conditioning (to which students are subjected) and the behavioural factors (what students do) influencing whether a given student will be successful. More than ever before, key information about successful student habits and learning context can be discovered.
Our hypothesis is that collective data can be used to construct a model of success for Higher Education students, which then can be used to identify students at risk. This is a complex issue which is receiving increased attention amongst e-learning communities (of which Massive Open Online Courses are an example), and administrators of learning management systems alike. Smartphones, as sensor-rich, ubiquitous devices, are expected to become an important source of such data in the imminent future, increasing significantly the complexity of the problem of devising an accurate predictive model of success.
This interim thesis presents the relevant issues in predicting student success using learning analytics approaches by incorporating both conditioning and behavioural factors with the ultimate goal of informing behavioural change interventions in the context of learning in Higher Education. It then discusses our work to date and concludes with a workplan to generate publishable results.
  • 3. Contents
1 Introduction
2 Background and Literature Review
    2.1 Higher education learners today
        2.1.1 A digitally-literate generation of students
        2.1.2 Mature students in HE
        2.1.3 Summary
    2.2 Computers and learning
        2.2.1 Learning Management Systems
        2.2.2 Learning analytics
        2.2.3 Massive Open Online Courses
        2.2.4 Summary
    2.3 Smart badges and smartphones
        2.3.1 Summary
    2.4 Behaviour sensing and intervention
    2.5 Final comments
3 A research question
    3.1 What are the measurable factors for the prediction of student academic success?
4 Outcomes of Work to Date
    4.1 Survey of HE English-speaking students
        4.1.1 Methodology
        4.1.2 Findings
    4.2 Survey of students from the University of Chile
        4.2.1 Methodology
        4.2.2 Findings
    4.3 U-Cursos
        4.3.1 Current status
        4.3.2 Summary
5 Research Plan for Final Thesis
    5.1 Motivation
    5.2 Research question and research hypotheses
    5.3 Work Packages
    5.4 Contingency research plan
    5.5 Summary
  • 4. Contents (continued)
6 Conclusions
References
A Beyond this thesis
    A.1 How to help students reflect on their behaviour?
B Predictability of human behaviour
C Survey questions
D A word cloud of concerns
E The U-Cursos experience
F U-Campus Screenshots
G Chilean University Selection Test
H Additional research
    H.1 Audience response systems (zappers)
        H.1.1 Own experience with zappers
    H.2 Privacy
    H.3 Internet of Things
    H.4 Activity Theory
  • 5. List of Figures
2.1 Multi-level categorisation model of conceptions of teaching
2.2 Smart badges: The Active Badge by Palo Alto Research Centre
2.3 Smart badges: The HBM (external and internal appearance)
2.4 Smart badges: The MIT wearable sociometric badge
2.5 A smartphone sensing architecture
2.6 Components of digital behaviour interventions using smartphones
4.1 Survey responses from UK students (excluding qualitative data)
4.2 Survey of University of Chile students: First screen
4.3 Survey responses from students of the University of Chile (excluding qualitative data)
4.4 U-Cursos view
4.5 Cramped look to the U-Cursos web interface from a smartphone
4.6 Access graph between 2010 and 2014 for U-Cursos
5.1 Data architecture at the University of Chile
D.1 Participants’ answers to the question “Do any of the potential applications described cause you any concern? Which ones? Why?”
F.1 U-Campus courses catalogue
F.2 U-Campus module catalogue for the Computer Science course
G.1 Chilean University Selection Test (PSU) - step one
G.2 PSU - step two
G.3 PSU - step three
G.4 PSU - step four
H.1 A commercial zapper: A TurningPoint™ response card
H.2 Zappers in action: Example exam question with student responses
H.3 Zappers in action: Appraising students’ confidence in their self-assessment before (left slide) and after (right slide) the solution was discussed in class
  • 6. List of Tables
3.1 What do students do?
4.1 U-Cursos services ranked in ascending order of popularity amongst users
5.1 Schedule of research work and thesis submission (a Gantt chart)
5.2 University Selection Tests (PSU) data fields
5.3 FutureLearn Platform Data Exports
A.1 Table of interventions
  • 7. Chapter 1 Introduction
Recent developments in mobile technologies are characterised by a high integration of information processing, connectivity and sensing capabilities into everyday objects. It is now easier than ever to collect, analyse and exchange data about our daily activities, revolutionising how humans live, work and learn. This is particularly true amongst higher education students, who already generate a rich “data trail” as they navigate their way towards successful completion of their studies. Traditional learning analytics research focuses on the use of data an educational institution holds about its students to promptly identify poor performance so that actions can be taken to encourage success. Struggling students in particular need to be directed to be able to complete their courses more successfully (Baepler and Murdoch, 2010), as the failure to do so comes at a great cost, not only to these students but to their institutions. This is a difficult issue, as measures of success are usually limited to traditional indicators such as progression and academic performance. For a student, an educational institution and the wider society, “success” would have to be defined by retention, level of engagement and contentment as well as achievement of higher marks. Against this context, Higher Education institutions have, in recent years, devoted great efforts to support students and encourage them to succeed, for example by making learning materials widely available to their students. Furthermore, the greater affordability of smartphones and the ubiquity of the Internet not only allow students to access learning materials at any time and anywhere (although students may well not see this as the primary benefit of such technologies), but also allow academics to learn more about student habits and context than ever before. In other words: what do students actually do and could this information empower them to do better?
One valid approach to understanding how students learn is to use technology to gather data about the conditioning factors for their success as well as the behaviours they adopt in their student lives. A second step would then use these indicators to
  • 8. predict student success in time to perform an intervention on those students identified as “at risk”. The technology available for collecting activity data is not only becoming more diverse and powerful but it is also becoming widely available at decreasing cost, hence increasing the potential for building “Big Data” collections on which sophisticated prediction models could be devised. Students of today have unprecedented access to a breadth of technology, and this increase in access justifies in its own right a study into how to bring pervasive computing ideas into learning analytics. Pervasive computing is a ‘post-desktop’ computing model under which greater processing power, connectivity and sensing are all available at a low cost, facilitating a widespread adoption of sensor-loaded, powerful, mobile devices. This active area of research is concerned with context-awareness, i.e. how tailored services can be offered to users via interconnected computing devices that are sensitive to the user’s context as determined by the processing of sensor data. One area of application of increasing interest is education. However, in this area much of the current interest tends to focus on the delivery of learning resources to students (Laine and Joy, 2009, and references therein) and the provision of virtual learning environments rather than identifying what students do. The application of pervasive computing in the area of education exploits both the opportunity of the ubiquity of devices and the increasing interest in new technology exhibited across the current generation of students.
Although there has been a great amount of research in this direction (Laine and Joy, 2009; Hwang and Tsai, 2011, and references therein), most of this research has been focused on the use of pervasive technologies to:
    • enrich student learning experiences indoors and/or outdoors with digital augmentation (Rogers et al., 2004, 2005);
    • assess students (Cheng et al., 2005);
    • increase access to content and annotation capabilities in support of peer-to-peer learning (Yang, 2006);
    • inform the learning activity design taking student context into account (Hwang, Tsai, and Yang, 2008);
    • increase interaction by broadening discourse in the classroom (Anderson and Serra, 2011; Griswold et al., 2004) or by playing mobile learning games (Laine et al., 2010);
    • enable ubiquitous learning in resource-limited settings, and observe the influence of new tools in the adaptation of learning activities and community rules (Pimmer et al., 2013);
  • 9.
    • “deconstruct” everyday experiences into digital environments (Owens, Millard, and Stanford-Clark, 2009; Dix, 2004).
These examples demonstrate the possibility of applying such technologies in education. However, they did not set out to use contextual information in order to predict or even understand student behaviours. To address this shortcoming, we will consider context-aware computing methods and techniques that have been applied successfully in the areas of healthcare, assisted living and social networking, and apply them to Higher Education to complement knowledge gained through traditional educational analytics. Many researchers have worked on the acquisition of context in general and on the discrimination of human activity in particular, such as dos Santos et al. (2010); Lau (2012); Bieber and Peter (2008); Huynh and Schiele (2005) and Khattak et al. (2011). Their findings could be applied in this area of research too, especially as the rapid emergence of the Internet of Things (IoT) means that the available sensor data will grow exponentially (Manyika et al., 2011). In my opinion, the application of novel techniques from pervasive computing into an investigation of student behaviour is worth exploring (Wilde, 2013; Wilde, Zaluska, and Davis, 2013c,d). Indeed, I am interested in exploring the untapped possibilities of extending learning analytics in a data-rich environment such as the one that will be prevalent in the Internet of Things, where all specific activities and general behaviour of students will leave “fingerprints of data” about them. This data trail affords specific contextual information, capable of analysis for measures of engagement, collaboration and attainment, thereby enabling the provision of adequate and timely feedback.
Within this research I have already considered certain aspects related to the study of behaviour in the population of interest, akin to those in ethnographic methods, with my specific contribution residing in the disconnect between intentions of privacy as declared by smartphone users and the actual privacy levels evident in their phone interactions (Wilde et al., 2013b), which is one of the findings from a survey described in detail later in this report. The remainder of this upgrade report is organised as follows: Chapter 2 considers the characteristics of our learners, explores the state of the art in context-aware technologies and their existing use in education as well as looking at the predictability of human behaviour and the type of data that is available in order to infer behaviour. Chapter 3 examines the research question to be addressed during this research: what are the measurable factors for the prediction of student academic success? Chapter 4 presents the research work to date, specifically the design and application of a survey of Higher Education students (in the UK and in Chile), as well as information discovery for a suitable dataset to explore these factors (on University of Chile students), which will be prepared by combining data from the platforms U-Campus and U-Cursos described here. These chapters lead into a plan for the remaining work, which is detailed in Chapter 5. Finally, the conclusions of this upgrade thesis are presented in Chapter 6.
  • 10. Chapter 2 Background and Literature Review
The general motivation for this research is assisting higher education students to achieve success. As they are the subjects of interest, they are described more precisely in Section 2.1. Then, I look into the use of digital technologies for learning (in Section 2.2), both from the educational institutions’ and their students’ viewpoints, as well as ways of using mobile and wearable technologies to learn more about students (Section 2.3). Section 2.4 reviews existing literature on the identification of human behaviour through these technologies. Finally, Section 2.5 appraises this review as a foundation for prediction of student success using a characterisation of students from measurable data about their conditioning and behavioural factors.
2.1 Higher education learners today
To learn about student behaviour, it is useful to start by identifying salient characteristics of the students in higher education today, considering those of the “typical” student, as well as those pertaining to students that do not fit into that classification. Specifically, I’ll look into two dimensions: one being the student’s level of efficacy or even engagement with digital technologies (in sub-section 2.1.1) and the other the age group to which the student belongs (sub-section 2.1.2).
2.1.1 A digitally-literate generation of students
Prensky’s term digital natives (Prensky, 2001a) is one amongst many¹ used to identify those born “typically between 1982 and 2003 (standard error of ±2 years)” (Berk, 2009, 2010).
  • 11. Members of this group, by this definition, are now 11 to 32 years old, so the majority of students in higher education today would belong to it. Furthermore, according to Prensky (2001b), many may even process and interpret information differently (allegedly due to the plasticity of the brain). These assertions would imply that what have been regarded as traditionally effective study habits and behaviours for previous generations are no longer effective and need to be reviewed to accommodate the needs of the current generation of students. Nevertheless, since only a fraction of the world population accesses digital technologies sufficiently to achieve ‘native’-like fluency in their use, the term “digital natives” is not a fitting description (Palfrey and Gasser, 2010), and for this reason (amongst others) it has become less accepted in the current educational discourse. Education, experience, breadth of use and self-efficacy are more relevant than age in explaining how people become “digital natives” (Helsper and Eynon, 2010). As a response, Kennedy et al. (2010) proposed a different classification based on a study comprising 2096 students in Australian universities: “power users (14% of sample), ordinary users (27%), irregular users (14%) and basic users (45%)”. However, rather than a discrete classification, a more useful typology is a continuum, as individuals are placed along it depending on a number of factors. Jones and Shao (2011) indicate that various demographic factors affect student responses to new technologies, such as gender, mode of study (distance or place-based) and whether the student is a home or international one. A JISC report questions the validity of certain attributed characteristics of this generation (Nicholas, Rowlands, and Huntington, 2008).
Examples are a preference for “quick information” and the need to be constantly connected to the web, now proved to be myths: these traits are not generational. Whilst Turkle (2008) notes that young people have digital devices always-on and always-on-them, becoming virtually ‘tethered’, this behaviour is not restricted to young people. For these reasons, this term has increasingly been replaced by the term digital residents and its counterpart digital visitors (White et al., 2012). In any case, we acknowledge that many of our students today are not only engaged in digital technologies on a daily basis, but in their world there have always been digital technologies in various forms. Even with the proviso that this behaviour may not be generalisable “outside of the social class currently wealthy enough to afford such things” (Turkle, 2008), it is an observable behaviour that is becoming increasingly common as digital technologies have become more affordable than ever before. This suggests that in the planning of a study involving higher education students as participants, not only those in this generation should be considered, but also those outside it, such as mature students.
¹ Terms include: Millennials, Generation Y, Echo Boomers, Trophy Kids, Net Generation, Net Geners, First Digitals, Dot.com Generation and Nexters (Berk, 2009). Other terms are: cybercitizens, netizens, homo digitalis, homo sapiens digital, technologically enhanced beings, digital youth and the “yuk/wow” generation (Hockly, 2011; Dawson, 2010).
  • 12. 2.1.2 Mature students in HE
Ascribing generational traits to today’s learners is somewhat of an overgeneralisation. As Jones and Shao (2011) point out, global empirical evidence indicates that, on the whole, students do not form a generational cohort but they are “a mixture of groups with various interests, motives, and behaviours”, not cohering into a single group or generation of students with common characteristics. In particular, research on higher education students often focuses on the standard age band of students under 21 years of age, not accounting for mature students (this term is typically used to refer to those who are over this threshold upon entrance). Even amongst this group, there are significant differences in behaviour and attainment. Studies have found that older mature students were more likely to study part-time than full-time, as family and work commitments have been acquired. In fact, 90% of part-time undergraduate students are 25 years old or over and as many as 67% are over 30 (Smith, 2008). On this note, Baxter and Hatt (1999) argued that mature students could be disaggregated according to age bands seemingly correlating with various levels of academic success. Therefore, instead of considering standard and mature students solely (under and over 21 respectively), they introduce the distinction between younger and older matures, as those over 24 were more likely to progress through into their second year, despite a longer period of time out of education. In general the younger mature learners were more at risk of leaving the course than older mature students.
However, even this division may well still be a poor generalisation about (mature) students, as besides their age there are a myriad of more relevant factors affecting their experience, such as their route into HE, their background and their motivation to study, all of which are difficult (if not pointless) to use for a classification of mature learners (Waller, 2006). An approach that acknowledges the individual characteristics of learners is to be preferred to those that conflate them into a homogeneous group, as concluded by Waller (2006), which requires educational providers to act on means to identify these characteristics in order to adopt such an approach.
2.1.3 Summary
The literature reviewed in this area validates the need for individualised support and feedback, delivered in a timely manner and directly to each student, if it is to make an impact. Another conclusion from this review is that students in higher education today have been exposed to digital technologies (of which wearable and mobile devices are an example), suggesting that these can become appropriate channels to facilitate this delivery.
  • 13. 2.2 Computers and learning
A natural consequence of the pervasiveness of digital technologies in recent years is that they are now almost universally used in teaching and learning (to various degrees). In fact, coinciding with the advent of the personal computer in the 1970s, the term Computer Assisted Learning was first coined, alongside Computer Assisted Instruction and similar terms. However, these terms are less commonly used as they are being replaced in the educational discourse by the term e-learning. The former have been used to characterise the use of computers in education, or more specifically, where digital content is used in teaching and learning. In contrast, the latter is generally used only when the content is accessed over the Internet (Derntl, 2005; Hughes, 2007; Jones, 2011; Sun et al., 2008).
2.2.1 Learning Management Systems
Learning Management Systems (LMS), also known as virtual learning environments (VLE) and course management systems, are excellent examples of the application of e-learning to support traditional face-to-face instruction. These are systems used in the context of educational institutions offering technology-enhanced learning or computer-assisted instruction; Blackboard™ and Moodle are the best-known examples. Stakeholders may have different objectives for using an LMS. For example, Romero and Ventura (2010) reviewed 304 studies indicating that students use LMS to personalise their learning, reviewing specific material and engaging in relevant discussions as they prepare for their exams. Lecturers and instructors use them to give and receive prompt feedback about their instruction, as well as to provide timely support to students (e.g. struggling students need additional attention to complete their courses more successfully (Baepler and Murdoch, 2010), as the failure to do so comes at a great cost, not only to these students but to their institutions).
Administrators use LMS to inform their allocation of institutional resources, and other decision-making processes (Romero and Ventura, 2010). These authors argue the need for the integration of educational data mining tools into the e-learning environment, which can be achieved via LMS. LMS are being increasingly offered by Higher Education institutions (HEIs), a technological trend making an impact on these institutions. Another trend is the proliferation of powerful mobile devices such as smartphones and tablets, from which on-line resources can be accessed².
² These two trends push HEIs to provide LMS access via smartphones in a visually appealing and accessible way. These are inherent requirements of the mobile experience, which is fundamentally different to the desktop one (Benson and Morgan, 2013). Benson and Morgan present their experiences migrating an existing LMS (StudySpace) to a mobile development, as a response to these pressures and the pitfalls identified on the Blackboard Mobile™ app.
  • 14. It is worth noting that the majority of these systems have a client-server architecture supporting teacher-centric models of learning (common scenarios have teachers producing the content while students ‘consume’ it) (Yang, 2006). To put this assertion in context, pedagogic conceptions of teaching and learning are usually understood in the literature as falling into one of two categories: teacher-centred (content driven) and student-centred (learning driven) (Jones, 2011, and references therein). Figure 2.1 shows these orientations as overarching the main five conceptions of teaching and learning which act as landmarks along a continuum of roles in learning. Deep learning occurs at the bottom end of the scale, as opposed to shallow learning which occurs at the top end. When student-centred, computer assisted learning can increase students’ satisfaction and therefore engagement and attainment. It is remarkable that the move towards learner-centredness in Higher Education coincides with the trends towards personalisation and user-centredness in Human-Computer Interaction and computing technologies in general.
[Figure 2.1: Multi-level categorisation model of conceptions of teaching, adapted from Kember (1997). From top to bottom, between the teacher-centred (content-driven) and student-centred (learning-oriented) orientations: imparting information; transmitting structured knowledge; student-teacher interaction / apprenticeship; facilitating understanding; conceptual change / intellectual development.]
The trend towards a widespread use of mobile devices, earlier identified, brings an increased number of opportunities of effecting the conceptual change from the categorisation above, as it has the potential of making the learning more student-centred than
  • 15. before: it would take place wherever the student goes, whenever it suits the student best³. Additional opportunities to reach students to either deliver content or to assess their learning are coupled with opportunities for other stakeholders at educational institutions to gain an insight on student achievement (typically progression and completion) via learning analytics, as presented in the next subsection.
2.2.2 Learning analytics
As well as facilitating engagement, content delivery and even assessment and feedback, digital technologies have increasingly been used for facilitating administrative tasks and decision-making at educational institutions. In particular, in recent years HE institutions have begun to use data held about their students for learning analytics (Barber and Sharkey, 2012; Sharkey, 2011; Bhardwaj and Pal, 2011; Glynn, Sauer, and Miller, 2003). Learning analytics (also known as academic analytics and educational data mining) are widely regarded as the analysis of student records held by the institution as well as course management system audits, including statistics on online participation and similar metrics, in order to inform stakeholders’ decisions in HE institutions. Academic analytics are considered as useful tools to study scholarly innovations in teaching and learning (Baepler and Murdoch, 2010). According to these authors, the term academic analytics was originally coined by the makers of the virtual learning environment (VLE) Blackboard™, and it has become widely accepted to describe the actions “that can be taken with real-time data reporting and with predictive modeling” which in turn helps to suggest likely outcomes from certain behavioural patterns (Baepler and Murdoch, 2010).
Educational data mining involves processing such data (collected from the VLE or other sources) through machine learning algorithms, enabling knowledge discovery, which is “the nontrivial extraction of implicit, previously unknown, and potentially useful information from data” (Frawley, Piatetsky-Shapiro, and Matheus, 1992). Whilst data mining does not explain causality, it can discover important correlations which might still offer interesting insights. When applied to higher education, this might enable the discovery of positive behaviours: for example, whether students posting more than a certain number of times in an online forum tend to have higher final marks, or whether attendance at lectures is a defining factor for academic success, or for any of its measures such as “retention, progression and completion” (Sarker, 2014).
³ The “anywhere, anytime” maxim driving pervasive computing is also a motivator for the development of the next generation of e-learning. Rubens, Kaplan, and Okamoto (2014) discuss the evolution of the field, aligning it to the advent of Web 2.0 and 3.0, central to this paradigm of learning.
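To make the kind of correlation mentioned above concrete (forum posts against final marks), the following is a minimal sketch only. The figures are invented for illustration and are not data from this study, and the function name `pearson` is hypothetical. It computes a Pearson correlation coefficient which, as noted above, says nothing about causality.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented example: forum posts per student and their final marks.
forum_posts = [0, 2, 5, 8, 12, 15]
final_marks = [48, 52, 55, 61, 70, 68]

r = pearson(forum_posts, final_marks)
print(f"Pearson r = {r:.2f}")
```

A coefficient close to 1 on real data would suggest that forum activity and marks rise together, which would then call for further investigation rather than a causal conclusion.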
  • 16. 2.2.3 Massive Open Online Courses
Developments in these digital learning technologies have facilitated the rise of massive open online courses (MOOCs)⁴, where the already difficult issues of assessing and providing feedback increase dramatically in complexity with classes of up to tens of thousands of learners (Hyman, 2012). Within this context, considerable interest has very recently been devoted to the use of learning analytics too, for example:
    • on social factors contributing to student attrition in MOOCs (Rosé et al., 2014; Yang et al., 2013);
    • on linguistic analysis of forum posts to predict learner motivation and cognitive engagement levels in MOOCs (Wen, Yang, and Rosé, 2014).
2.2.4 Summary
The literature reviewed in this area evidences the impact of digital technologies on the provision of support and feedback to learners and other stakeholders of educational institutions, both in terms of facilitating learning and assessment (as in MOOCs, for example, but also in e-learning in general) and in terms of characterising the learners using learning analytics. In doing so, it is possible to identify the variations amongst learners to better facilitate the learning experience. An important category of digital technologies used in education includes portable, light-weight devices, which can additionally function as sensor carriers, as presented in the following section.
2.3 Smart badges and smartphones
Until recently, cumbersome sensing equipment (often carried in backpacks) was required, as shown in a survey of early developments in sensing technologies for wearable computers (Amft and Lukowicz, 2009). This equipment has now been replaced by small, light-weight sensors which can also be embedded within badges and phones, for example. Smart badges are identity cards with embedded processors, sensors and transmitters.
The concept is not new. In fact, the first of these wearable computers, the Active Badge (Want et al., 1992; Weiser, 1999), shown in Figure 2.2, was developed two decades ago by the Olivetti Research Laboratory (Cambridge) and then further developed by Xerox PARC. More recently, smart badges have been used to study social behaviour, as with Hitachi’s Business Microscope (HBM) (Ara et al., 2011; Watanabe, Matsuda, and Yano, 2013) and with its predecessor, the MIT wearable sociometric badge (Wu et al.,
⁴ MOOCs are occasionally referred to as “Massively-Open Online Courses”.
  • 17. 2008; Pentland, 2010; Dong et al., 2012), shown in Figures 2.3 and 2.4. These badges, containing tri-axial accelerometers, are able to capture some characteristics of the motion of the wearer (e.g. being still, walking, gesturing). Thanks to additional sensors such as infrared transceivers, they are also able to capture face-to-face interaction time. Being lightweight and with a long battery life, these badges can be carried unobtrusively for several hours a day.
Figure 2.2: Smart badges: The Active Badge by Palo Alto Research Centre (Weiser, 1999)
Figure 2.3: Smart badges: Hitachi’s Business Microscope (external and internal appearance) (Ara et al., 2011)
Watanabe et al. (2012) used the HBM in an office environment, finding evidence that the level of physical activity and interaction with others during break periods (rather than during working activities) is highly correlated with the performance of their team. Watanabe et al. (2013) then applied this methodology within a learning
  • 18. environment, this time using the smart badges on primary school children, observing a strong correlation between the scholastic attainment of a class and the degree to which its members are “bodily synchronised”. In other words, classes whose members are all either physically active or resting consistently during the same periods perform better. Another correlation these authors observed involves the number of face-to-face interactions per child during break. Their findings suggest that when children in a class move in a cohesive manner, the class performs well overall, and also that the more face-to-face interactions an individual has, the better their attainment.
Figure 2.4: Smart badges: The MIT wearable sociometric badge (Dong et al., 2012)
The use of badges by all participants is easily enforced in an environment with a strict dress code, such as school uniforms. Since our population of interest is higher education students, smartphones are probably more appropriate than smart badges as sensor carriers, but it is nonetheless interesting to see how much can be learned from sensor data, especially when combined with learning analytics: as in the case of Watanabe et al. (2013), certain behaviours can be found to be related to a measure of success. Smartphones present another advantage over badges. Because they are equipped with ambient light sensors, proximity sensors, accelerometers, GPS, camera(s), microphone, compass and gyroscope, plus WiFi and Bluetooth radios, a variety of applications can be built to gather a great range of sensed data (Lane et al., 2010). Thanks to their communication and processing capabilities, smartphones could support a sensing architecture such as the one depicted in Figure 2.5. Contextual information can be inferred from the sensor data thus gathered, and the context determined as, for example, location.
However, it has long been accepted that “there is more to context than location” (Schmidt, Beigl, and Gellersen, 1999). Contextual information broadly falls into one of two types: physical environment context (such as light, pressure, humidity, temperature, etc.) and human-factor-related context, such as information about users (habits, emotional state, bio-physiological conditions, etc.), their social environment (co-location with others, social interaction, group dynamics, etc.), and their tasks (spontaneous activity, engaged tasks, goals, plans, etc.) (Schmidt et al., 1999).
  • 19. Figure 2.5: A smartphone sensing architecture (Lane et al., 2010).
Context acquisition is, however, important not just because of the possibility of offering customised services that adapt to the circumstances. Context processing can increase user awareness (Andrew et al., 2007), and thereby prompt alternative actions to better achieve a desired goal given the current context, thus modifying an intended behaviour.
2.3.1 Summary
The literature in this area indicates that sensor data has the potential to help us understand human behaviour, collectively and individually, as well as to capture the context in which it is situated. This would be a suitable foundation for a behavioural
  • 20. intervention which is aligned to the user’s goals, and the smartphone is a suitable sensing platform which could be used to understand users’ behaviour as well as to support them in achieving their higher goals, as discussed in the next Section.
2.4 Behaviour sensing and intervention
Despite its inherent complexity, researchers have shown that human behaviour is highly predictable in certain contexts. In the context of scale-free networks, the degree of predictability has been quantified at 93% (Song et al., 2010). Evidence suggests that behaviour can be “mined” and even predicted using sensors on phones or smart badges (presented in the previous Section):
    • identifying structure in routine (for location and activity) to infer organisational dynamics (Eagle and Pentland, 2006);
    • analysing behaviour based on physical activity as detected via smartphones (Bieber and Peter, 2008);
    • predicting work productivity based on face-to-face interaction metrics (Wu et al., 2008; Watanabe et al., 2012);
    • inferring friendship network structure with mobile phone data (Eagle, Pentland, and Lazer, 2009);
    • using mobile phone data to predict the next geographical location based on peers’ mobility (De Domenico, Lima, and Musolesi, 2012), even predicting when the transition will occur (Baumann, Kleiminger, and Santini, 2013);
    • classifying social interactions in contexts where a crowd disaggregates into small groups (Hung, Englebienne, and Kools, 2013);
    • predicting personality traits with mobile phones (de Montjoye et al., 2013);
    • Bahamonde et al. (2014) showed that even data from smart cards, which can be regarded as less personal than phones or identity cards, are suitable for behaviour mining. In particular, these researchers were able to deduce users’ home addresses through the data exposed by their bip! cards, which are used to pay for public transport in Santiago de Chile.
From this research we can assert that, given sufficient information, some human behaviour can be predicted (see Appendix B for more on its high predictability). Specifically relevant to behaviour sensing in the educational context is the possibility of “seeing” the learning community (Dawson, 2010) by studying the frequency and types
  • 21. of interactions amongst learners using social network analysis (SNA), as factors such as degree centrality⁵ are positive predictors of a student’s sense of community, which is measurable. Srivastava, Abdelzaher, and Szymanski (2012) acknowledge that the use of smartphones for sensing is becoming increasingly commonplace in human-centric sensing systems (whether the humans are the sensing targets, sensor operators or data sources). They identify various technical challenges to the wider adoption of these systems, one of them being the difficulty of inferring a rich context in the wild. They warn that earlier successes on inferences about mobility do not replicate with ease when making inferences about “physical, physiological, behavioural, social, environmental and other contexts” (my emphasis). In terms of behavioural change, the state of the art includes:
    • using computers as persuasive technologies⁶ (Fogg, 2003, 2009, 2003; Müller, Rivera-Pelayo, and Heuer, 2012);
    • promoting preventive health behaviours to healthy individuals through SMS, with positive behaviour change in 13 out of 14 reviewed interventions (Fjeldsoe, Marshall, and Miller, 2009);
    • health-promoting mobile applications (Halko and Kientz, 2010);
    • HCI frameworks for assessing technologies for behaviour change for health (Klasnja, Consolvo, and Pratt, 2011);
    • “soft-paternalistic” approaches to nudge users to adopt good behaviours to protect their own privacy on mobile devices (Balebako et al., 2011);
    • nonverbal behaviour approaches to identify emergent leaders in small groups (Sanchez-Cortes et al., 2012);
    • interactions of great impact and recall to facilitate behaviour change (Benford et al., 2012);
    • protocols for behaviour intervention for new university students (Epton et al., 2013);
    • using smartphones for digital behavioural interventions (Lathia et al., 2013; Weal et al., 2012);
    • guidance for planning,
implementation and assessment of behavioural interventions for health (Wallace, Brown, and Hilton, 2014).
⁵ The degree centrality of a node is defined as the number of connections it has.
⁶ Persuasive technologies are not to be confused with pervasive ones: here the emphasis is on “persuasion” rather than ubiquity.
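As an illustration of the degree centrality defined in footnote 5, the following is a minimal sketch. It assumes that learner interactions (for example, forum replies) are available as pairs of student identifiers. The names and the function `degree_centrality` are hypothetical, and this is not the method used by Dawson (2010).

```python
from collections import defaultdict


def degree_centrality(edges):
    """Count the distinct connections of each node in an undirected graph."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(adjacent) for node, adjacent in neighbours.items()}


# Hypothetical interactions between learners; repeated pairs count once.
interactions = [
    ("ana", "ben"),
    ("ana", "carla"),
    ("ana", "dev"),
    ("ben", "carla"),
    ("dev", "ana"),
]

degrees = degree_centrality(interactions)
for student, degree in sorted(degrees.items(), key=lambda kv: (-kv[1], kv[0])):
    print(student, degree)
```

Using sets of neighbours means that repeated interactions between the same two learners are counted as a single connection. A weighted variant would count each interaction instead.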
  • 22. In particular, Wallace et al. (2014) argue that interventions involve change processes “linked to psychological theories of human behaviour, cognition, beliefs and motivation” with a primary aim of improving experiences and well-being. This must be incorporated in the planning and implementation of any behavioural intervention, in particular for digital interventions. Lathia et al. (2013) identify the need to monitor and learn about the behaviour before delivering an intervention, the effects of which must continue to be monitored (Figure 2.6).
Figure 2.6: The three components of digital behaviour interventions using smartphones (Lathia et al., 2013, adapted).
    • Monitor: gather mobile sensing data; collect online social network relationships and interactions.
    • Learn: develop behaviour models; infer when to trigger intervention; adapt sensing.
    • Deliver: tailored behaviour change intervention; user feedback via the smartphone.
Furthermore, Klasnja et al. (2011) assert that the development of such technologies presupposes the need for large studies, suggesting that “a critical contribution of evaluations in this domain, even beyond efficacy, should be to deeply understand how the design of a technology for behavior change affects the technology’s use by its target audience in situ”. Translating this experience to the educational context means that it is not realistic to measure the success of the development by actual behaviour change but, instead, by the degree of understanding of its potential to influence behaviour.
2.5 Final comments
In the previous section, smartphones and badges were considered as sensing platforms for behaviour. In addition to the data that could be collected implicitly via these (i.e. without explicit intervention from the user), the possibility of incorporating user-generated data is also valuable.
Examples include life annotations (Smith, O’Hara, and Lewis, 2006) and ‘lifelogging’ (O’Hara, 2010; Smith et al., 2011). This data could potentially be used to enrich that typically studied in learning analytics by giving an insight into an additional dimension of student lives: what do they do when they are not studying?
  • 23. Through this (still ongoing) survey of the relevant literature, I have gained a greater understanding of the characteristics of Higher Education students (which may condition their levels of academic success), the devices they use in their learning (in and out of the classroom), and other devices from which their behaviour can be sensed, as behavioural factors may complement conditioning factors in determining student success. I also explored the state of the art in behavioural interventions, and what data can be used to facilitate one. This is the foundation upon which key research components have been created, which are presented in the next Chapter.
  • 24. Chapter 3 A research question
The literature review presented in the previous Chapter surveyed the type of data and techniques that can be used to understand and predict student behaviour. This Chapter formulates the research question to be addressed, in order to plan an experimental methodology and a road map for future work. The research question stated in the introduction is “What are the measurable factors for the prediction of student academic success?”. This Chapter discusses conditioning and behavioural factors affecting students’ academic success and how to gather data for measures of these factors against academic performance (a proxy for success).
3.1 What are the measurable factors for the prediction of student academic success?
Most context-aware pervasive systems use location as the most important contextual information available. Indeed, there is a wealth of research and commercial products offering location-based services, which focus on the use of readily available information relevant to users in a given location. Not yet so well exploited, although attracting significant scientific interest, is the use of physical activities as contextual information. Other sources of contextual information that can become readily available include the use of social media and learning analytics. Additionally, using sentiment analysis on social media could help capture users’ mood and general outlook over the observable period. Data mining algorithms could be applied to the collected data; however, the “ground truth” measure of what constitutes a successful student needs to be established beforehand and, as explained earlier, this is in itself a very difficult question. Proxy measures of success can be used, such as academic achievement and progression, but other aspects of student life such as level of engagement and contentedness (if somehow
  • 25. measurable) could also be taken into account for a more complete portrait of a successful student. Table 3.1 lists a range of activities that students in higher education are likely to engage in, as well as the means of gathering data which could lead to identifying a given activity (assuming participants’ consent and unrestricted access to data sources), and the practical viability of creating such a data collection based on existing research. As Table 3.1 suggests, a substantial amount of information about student behaviour can be harvested and quantified (albeit exhibiting “Big Data” challenges for any practical purposes). In other words, it is viable to investigate the behavioural factors affecting student success if, as in traditional learning analytics (based on conditioning factors¹), these are analysed against metrics of academic success, such as retention, progression and completion. This would give a more complete characterisation of a student than ever before and, as a consequence, a more powerful, accurate prediction of their success. I have now specified the research question, and will next discuss the practical work conducted to date in pursuit of answers to aspects of this question, arising from the literature review presented in Chapter 2. This is followed by the formulation of specific research hypotheses, which will qualify the scope of this research (in Chapter 5).
¹ Conditioning factors such as, for example, those highlighted in Table 5.2, page 38.
  • 26. Table 3.1: What do students do?
    Activity | What could be measured? | Possible data source | Research using “similar” data sources
    Attend lectures | Number of lectures attended during the semester, punctuality (by comparing calendar against actual arrival times) | GPS, University timetable, co-location with peer learners, wi-fi | Ara et al. (2011); Watanabe et al. (2013); Wu et al. (2008); Pentland (2010); Dong et al. (2012)
    Use a VLE | Forum participation (frequency, number of posts), number of downloads | VLE records | Barber and Sharkey (2012)
    Visit libraries | Number of items borrowed, length of the loan, medium, material type | Smartcard, Radio-Frequency Identification (RFID), library records | -
    Take exams | Academic performance measures (exam results, history of academic performance) | University records, VLE | -
    Travel | Mode of transport, distance travelled, periodicity | Accelerometer, transport smart card records, GPS | Hemminki, Nurmi, and Tarkoma (2013a); Bahamonde et al. (2014)
    Meet other students | Co-location with other learners, certain locations (labs, etc.), noise levels at location | GPS, Bluetooth, microphone, smartcard, RFID tags | Hemminki, Zhao, Ding, Rannanjärvi, Tarkoma, and Nurmi (2013b)
    Extra-curricular activities | Participation in societies, sports, games, etc. | VLE forums, Facebook | Wen et al. (2014)
    Social networking | Number and frequency of tweets and Facebook posts, number of uploaded photos | Twitter, Facebook | -
    Physical activities | Frequency, level of activity (walk, cycle, run), fidgeting? | Accelerometer, gyroscope | Hung et al. (2013); Huynh (2008)
    Play and rest | Number of hours watching TV or movies | Lifelogging, ambient light sensors, accelerometer | Smith et al. (2011)
    Other activities of daily living | Eating and drinking (regularity of meals, frequency) | Lifelogging | Smith et al. (2011)
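The first row of Table 3.1 proposes measuring attendance and punctuality by comparing the calendar against actual arrival times. The following is a minimal sketch of that comparison, not an implementation from this work. It assumes arrival times have already been derived from some sensing source (e.g. wi-fi association), and the function name `attendance_summary`, the lecture identifiers and the five-minute grace period are all hypothetical.

```python
from datetime import datetime


def attendance_summary(timetable, arrivals, grace_minutes=5):
    """Compare timetabled lecture starts against observed arrival times.

    timetable: lecture id -> scheduled start (datetime)
    arrivals:  lecture id -> observed arrival (datetime); absent if not attended
    """
    attended = 0
    punctual = 0
    for lecture_id, start in timetable.items():
        arrival = arrivals.get(lecture_id)
        if arrival is None:
            continue  # no arrival observed: counted as not attended
        attended += 1
        minutes_late = (arrival - start).total_seconds() / 60
        if minutes_late <= grace_minutes:
            punctual += 1
    return {"scheduled": len(timetable), "attended": attended, "punctual": punctual}


# Hypothetical timetable and observed arrivals for one student.
timetable = {
    "L1": datetime(2014, 3, 3, 9, 0),
    "L2": datetime(2014, 3, 4, 11, 0),
    "L3": datetime(2014, 3, 5, 14, 0),
}
arrivals = {
    "L1": datetime(2014, 3, 3, 8, 58),   # two minutes early
    "L3": datetime(2014, 3, 5, 14, 12),  # twelve minutes late
}

summary = attendance_summary(timetable, arrivals)
print(summary)
```

In practice the hard part is obtaining reliable arrival times from noisy sensor data. This sketch deliberately leaves that step out.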
  • 27. Chapter 4 Outcomes of Work to Date
In addition to the literature review presented in Chapter 2, other work to date has involved the investigation of students’ views via two surveys of Higher Education students, one in English, of students in the UK (Section 4.1), and a version in Spanish, of students at the University of Chile (Section 4.2), as well as an investigation into a platform and its dataset from which student behaviour could be inferred: the U-Cursos platform (Section 4.3).
4.1 Survey of HE English-speaking students
4.1.1 Methodology
A survey¹ of Higher Education students, including undergraduate and postgraduate students in several disciplines, was conducted between the 16th August and the 18th October 2013. This survey focused on exploring the current use of smartphones by Higher Education students as well as establishing the acceptability of a future application. It was developed iteratively, with early versions tried amongst fellow researchers before it was deployed on the survey platform iSurvey. Data collected using early versions of the survey was discarded, as their purpose was only to inform the design. The questions appearing in the final version of the survey can be seen in Appendix C. Some of the elements in the literature review informed the questionnaire design. For example, Questions 2 and 3, which explore the use of the smartphone, were intended to test the extent to which the characterisation of a virtually “tethered” student presented in Section 2.1.1 is true. Similarly, the considerations presented in Section 2.1.2 helped in determining the age groups within Question 5(b). In all, the information required fell across the following areas:
¹ Hosted at https://www.isurvey.soton.ac.uk/admin/section_list.php?surveyID=8728.
  • 28.
    • Smartphone ownership: to establish whether participants own (or intend to acquire shortly) a smartphone and, if so, which brand, to confirm whether an Android development would be suitable.
    • Current use of the smartphone: participants are asked about the frequency of their use of their phone across a range of activities.
    • Perception of whether the smartphone helps or hinders participants’ personal goals in general, and their academic success specifically.
    • Acceptability of a pervasive application that would provide behavioural “nudges”, and desired features of such an application.
    • Other information collected, including: discipline studied, level of study, modality of studies (part-time or full-time) and views on adoption of technology.
The survey was publicised on various social networks (LinkedIn, Facebook and Twitter) as well as by direct e-mail invitation to University of Southampton students². Participants were required to be students in Higher Education and over 18 years old. No compensation was offered, as no detriment arose from participation in the research other than an investment of ten minutes for the typical participant (of which participants were duly warned beforehand). Participants were not required to give sensitive information, as questions in the demographics section of the survey were not open (instead, meaningful bands were offered for selection whenever possible). Many questions could be skipped if the participant so wished³. A total of 807 students attempted this questionnaire; however, many could not complete it due to a limitation of the iSurvey platform, which hosted the survey⁴. After discarding incomplete submissions and those from participants in academic institutions outside the UK, data from 164 participants remained for analysis.
4.1.2 Findings
An analysis of the responses indicates that participants, despite actively using smartphones in their daily lives, are hesitant about allowing these devices to track their behaviour
² Via Joyce Lewis, Senior Fellow for Partnerships and Business Development.
³ Compliant with recommendations by the British Educational Research Association (BERA), outlined in “Ethical Guidelines for Educational Research”, http://www.bera.ac.uk/system/files/BERA%20Ethical%20Guidelines%202011.pdf. Also compliant with our institutional guidelines collated under https://sharepoint.soton.ac.uk/sites/fpas/governance/ethics/default.aspx (both last accessed 28th February 2014). Ethics reference number: ERGO/FoPSE/7447.
⁴ At the time, participants were required to have Flash-enabled devices to complete surveys with slider questions (as was the case here), so participants accessing via iPhones or iPads had to re-start the survey on other platforms. It is not possible to estimate how many did (given that the survey was anonymous). This problem has now been resolved (https://www.isurvey.soton.ac.uk/help/changes-to-the-slider-question-type/) but unfortunately it affected this data collection.
  • 29. and about whether such feedback is desirable. On one hand, participants report their use of a smartphone for a number of activities, as shown in the charts in Figure 4.1.
Figure 4.1: Survey responses from UK students (excluding qualitative data).
The first 18 charts refer to activities that participants report undertaking with their smartphones, which correspond to the 18 activities indicated in Question 2 of the survey. A dominance of lower numbers on the x axis corresponds to a high frequency in performing a given activity, as reported by the participants. For example, this applies to making or receiving phone calls and text messages, using social networks, and using calendars or reminders. Conversely, a dominance of higher numbers on the x axis corresponds to a low frequency, as is the case for blogging, searching for a job, and playing podcasts.
  • 30. The next two charts in Figure 4.1 show the reported purpose for which participants use their smartphone, both in term time and outside of term. Whilst there is a preference towards the use of their smartphones for personal reasons, as expected, this was much more marked for periods outside of term. With regard to the perception of their phone as a help or a barrier towards their personal goals and their academic success (the subsequent two charts), most participants leaned towards the left end of the spectrum (a help). Figure 4.1 also indicates the reported desirability of features of a future smartphone application, in charts 23 to 28. In this case, a preference towards the left indicates that the given category is very desirable, and towards the right that it is not. Participants were then asked whether they were concerned about any of these possible features⁵. With various degrees of acceptance, the majority welcomed features that provided them with information about themselves and their peers, with the exception of check-in at learning spaces, which is not desired by the majority of the participants in the survey. Out of 164 participants, as many as 95 reported no concern about the features mentioned. The remaining 69 participants had a variety of concerns, most prominently regarding feedback on their behaviour and about their peers, as well as privacy concerns regarding the capability of an application to check them in when entering learning spaces. Other privacy concerns focused on the data itself, and who would access and control it. Many commented that they would not want their smartphones to have these features, in particular those regarding physical activity tracking (terms such as “surveillance”, “big brother” and “panopticon” were mentioned), but some others would welcome some feedback on how they use their time and see the benefits of using such an application.
However, not all respondents have the same attitude towards adopting innovation⁶, as they claim identification with one of the classes in Rogers’ (1962) taxonomy: “Innovators, Early adopters, Early majority, Late majority, or Laggards”⁷.
4.2 Survey of students from the University of Chile
4.2.1 Methodology
Once it was decided to use data from University of Chile students, it became relevant to adapt the survey previously described in Section 4.1 for its application to these
⁵ See Appendix D for a word cloud based on participants’ responses.
⁶ Rogers’ taxonomy is succinctly summarised as follows. Innovators: first to adopt an innovation; Early adopters: judicious in balancing financial risks; Early majority: adopt an innovation with early adopters’ advice; Late majority: adopt innovation after the majority; Laggards: the last to adopt an innovation (Rogers, 1962).
⁷ Currently, this data is being analysed using NVivo (for the open responses), SPSS and SigmaPlot, and further conclusions will be reported in the final thesis.
  • 31. students⁸. As well as translating the content for each of the screens (see the example in Figure 4.2), a question was removed as it was not relevant within this context (the concept of part-time studying is not formalised via registration), and further options were added to the educational stage question (as graduate courses typically last a minimum of 5 years, as opposed to the UK’s three-year courses).
Figure 4.2: Survey of University of Chile students: First screen.
4.2.2 Findings
The general trend of the responses is remarkably similar to that of UK students, with only two exceptions, which are explained in the following paragraphs. Firstly, the Chilean participants seem to prefer phone calls to SMS messaging. This may be explained by the fact that each SMS text is typically charged (unlike in the UK, where most providers offer a number of free messages as part of their services). Given that Internet providers in Chile offer affordable flat-rate packages, Chilean students may prefer to send short texts via social networks (such as Twitter direct messaging or Facebook chat) or messaging apps (such as WhatsApp and Viber). A second difference worth commenting on is that whilst the UK participants perceive their smartphones as helpful towards the achievement of both their personal goals and their academic success, this is not so clear for the Chilean participants, who seem divided in their responses. Although the justification for this difference is yet to emerge from
⁸ The version of this survey in Spanish is hosted at https://www.isurvey.soton.ac.uk/admin/section_list.php?surveyID=10807 (closed at present).
  • 32. further analysis of the data, one possible explanation may lie with the stage in their studies: it is conceivable that students who have not progressed as quickly as they had expected may attribute their lack of progress to distractions related to their use of their smartphones, which is, nevertheless, comparable to that of their UK counterparts.
Figure 4.3: Survey responses from students of the University of Chile (excluding qualitative data). Note that it has one chart fewer than Figure 4.1 because there is no distinction between Full- and Part-Time at registration at the University of Chile.
  • 33. 4.3 U-Cursos
U-Cursos is a web-based platform designed to support classroom teaching. An in-house development by the University of Chile, it was first released in 1999, when the Faculty of Engineering required the automation of academic and administrative tasks. In doing so, the quality and efficiency of their processes improved, whilst supporting specific tasks such as coordination, discussion, document sharing and marks publication, amongst others. Within a decade, U-Cursos became an indispensable platform to support teaching across the University, used in all 37 faculties and other related institutions.
Figure 4.4: A typical U-Cursos view. Left: a list of current channels (courses, communities and associated institutions). Top right: services available for the selected channel. Bottom right: contents of a service. From Cádiz et al. (2014) (in Appendix E).
The success of U-Cursos is demonstrated by the high levels of use amongst students and academics, with more than 30,000 active users in 2013. U-Cursos provides over twenty services to support teaching, as well as community and institutional “channels”, which allow students to network, share interests and engage in discussion about various topics.
Figure 4.4 shows a typical view of U-Cursos. On the left, a list of the “channels” available for the current term is shown. Channels are the “courses”, “communities” and “institutions” associated with the user. Typically, courses are transient, so they are replaced with new courses (if any) at the start of the term. Communities are subscription channels which are permanent and typically refer to special interest groups, usually managed by students, with extracurricular topics. Finally, institutions
  • 34. refer to administrative figures within the organisation. The institutional channels are used to communicate official messages on the news publication service and also to allow students to interact using forums containing students from all of the programmes within each institution.
A number of services are available for each type of channel. Users can select any of the shown services and interact with it on the content area of the view. Note that the majority of the services are provided for all types of channels, but courses also offer academic services such as homework publication and hand-in, partial marks publication and electronic transcripts of the final marks. These features make course channels official points of access for the most important events in a course and have become indispensable for students.
4.3.1 Current status
The current version of U-Cursos displays well on all regular-size screens (above 9"), such as desktop computers and tablets. However, the user interaction becomes cumbersome on small displays, such as those in smartphones, as shown in Figure 4.5.
Figure 4.5: Cramped look to the U-Cursos web interface from a smartphone (Cádiz et al., 2014).
  • 35. Figure 4.6: Access graph between 2010 and 2014 for U-Cursos (Cádiz et al., 2014). [Plot of hits per month, on a scale from 300,000 to 3,000,000, with the 1st term, 2nd term and student strike periods marked.]
Another shortcoming is the lack of notification facilities, in particular those alerting users of relevant content updates. The current setting requires users to manually access the platform repeatedly to confirm that the information is still current. This behaviour can be observed in Figure 4.6, which shows access statistics of U-Cursos in the last four years. There are clear peaks during the end-of-term periods9. Additional factors may trigger an increased access rate to the service: students ask more questions, download class material for the final exams and coordinate projects, amongst others. According to the users, there is a component of uncertainty which encourages them to repeatedly access the platform during these periods. As a response, researchers from ADI designed a mobile application for the platform, currently in beta testing.
A research visit to NIC Labs (University of Chile) took place from the 9th to the 19th of March 2014, to provide access to and understanding of the historical data collected across the University and also to study the platform itself. A paper on the collaboration was written and submitted to the 28th British HCI Conference (see Appendix E).
U-Cursos offers a number of services, of which the most frequently used are shown in Table 4.1, with an indication of how popular they are amongst users, as well as a list of features students would like to see in U-Cursos (both for mobile and web). The unique advantage of using this data above any other dataset currently available is that it covers over 30,000 users (staff and students) across the past ten years, and is therefore in principle viable for longitudinal and cross-sectional analysis.
Whilst the mobile platform is still in beta testing, having access to this wide range of data would enable its analysis via educational analytics.
9 Terms run from March to July and from August to December in Chile. Some events may induce small variations on the actual dates. The university closes for summer holidays in February. Source: http://escuela.ing.uchile.cl/calendarios (in Spanish; last accessed 9th July 2014).
  • 36. Table 4.1: U-Cursos services ranked in descending order of popularity amongst users. The number in parentheses indicates the percentage of students who flagged the relevant service or feature as especially useful or desirable (Cádiz, 2013, adapted).
Current services | New mobile features | New general features
My timetable (92) | Granular push (20) | Chat (39)
E-mail (74) | Preview material (11) | Library (7)
Notifications (70) | Search for a room (10) | Multiplatform (6)
Teaching material (58) | More simplicity (9) | Tablet support (6)
Calendar (50) | Attendance log (5) | Facebook integration (4)
Partial marks (46) | People search (4) | Campus map (3)
Forum (20) | Offline access (4) | Room status (2)
Dropbox (14) | Book a lab (4) | Staff timetable (2)
Guidance notes (11) | Timeline (4) | “Read later” (2)
Coursework (7) | Certificate requests (4) | Virtual Classroom (2)
News (7) | Android widget (4) | Notes bank (1)
Access to past courses (5) | Marks calculator (4) | Health benefits (1)
Favourites (3) | Google drive (3) | Evernote integration (1)
Resolutions (2) | Printing queues (2) | Anonymous feedback (1)
Polls (2) | Institutional mail (2) | Foursquare integration (1)
Links (2) | Enrolment (2) | Group making (1)
Official transcripts (2) | Course catalogue (1) | Compare timetables (1)
Course administration (1) | Find staff offices (1) | Anonymous feedback (1)
Posters (1) | Shortcuts (1) | Reporting admin errors (1)
4.3.2 Summary
This chapter has described the practical experiences in my research, in particular those related to the application of a survey amongst two different groups of HE students, and those related to the process of securing a dataset from which a model of student behaviour could be created in answering our first research question. This foundational work informs the steps for future action, described in the next Chapter, which lays out a plan for the following months up to the final thesis submission10.
10 Further work identified yet beyond the scope of this thesis is presented in Appendix A.
  • 37. Chapter 5 Research Plan for Final Thesis
This research will explore the predictability of student success applying learning analytics on big data sets. In particular, I will analyse a rich “data trail” of student activities as gathered via their interactions with a Learning Management System (LMS), such as the University of Chile’s U-Cursos1. This data can be combined with data captured by the institution at first enrolment, such as socio-economic indicators (typically used in traditional learning analytics). From this analysis, a model of academic success will be developed, providing insight on the factors influencing academic performance amongst other measurable proxies for success.
5.1 Motivation
A primary motivation behind seeking such an insight is that it would facilitate the identification of students “at risk”, and further enable behavioural interventions so that students can be supported in becoming successful in their studies. A greater, lasting goal would be to influence student behaviour via persuasive technologies, so that the students themselves are empowered to effect a significant change in their study. However, this is a long-term goal beyond the scope of the present research. Whilst the rich interconnection necessary for a digital behavioural intervention is not yet fully supported, and the existing student data is both incomplete and noisy for this specific purpose, we can still gain a good understanding of how it might look by examining current student data, from both the educational and the pervasive computing perspectives.
A central theme of this research is learning analytics, informed by relevant studies on behavioural interventions and the application of pervasive computing to education. In order to build on the traditional learning analytics research approaches (generally limited
1 Developed by the University of Chile’s Information Technologies group (ADI, Área de Infotecnologías in Spanish).
  • 38. to data controlled by the educational institution), I have also considered including data that could offer an additional insight into student behaviour, by articulating descriptions of the activities successful students do even when they study.
5.2 Research question and research hypotheses
The general research question to be addressed is: “What are the measurable factors for the prediction of student academic success?” This is a very wide-ranging question, which includes a number of conditioning factors (e.g. what students bring with them before starting Higher Education) as well as behavioural ones (e.g. how students engage in Higher Education studies). To focus the research, a number of specific research hypotheses have been identified:
H1: Traditional learning analytics on conditioning factors are suitable predictors of success. Specifically, are socioeconomic indicators and student competences2 acquired during secondary schooling adequate predictors for student performance in Higher Education? Existing research has strongly indicated this to be true; however, the work published to date has limitations, such as:
(a) the size of the sample: for example, Bhardwaj and Pal (2011) studied data from up to 300 participants;
(b) a focus on predicting only persistence or attrition rather than measured academic performance (Glynn et al., 2003).
My investigation of H1 is designed to extend the scope of the analysis and remove some of these limitations. However, since this and other work published to date highlight some factors as good predictors of student success, I will especially look for evidence of such a correlation in the data to either support or falsify hypothesis H1. These factors are socio-economic factors, such as age and parents’ level of education, as well as academic performance in previous learning (such as high-school marks).
H2: Learning analytics data in the traditional sense can be significantly enriched by incorporating data from social media and other student-generated data. Students interacting with the LMS leave a data trail which can be quantified. Engagement in social forums within the U-Cursos platform is an additional variable that can be incorporated in the prediction model. Does the model become more accurate by doing so?
2 By student competences we refer to those measured by the University Selection Test in Chile (PSU, Prueba de Selección Universitaria in Spanish; Dinkelman and Martínez A., 2014), which is used for university admissions across the country.
  • 39. H3: Smartphone data can be used to inform the prediction model. In particular, do measures of engagement with the U-Cursos mobile platform correlate with those in the web-based version (for which there is substantial historical data available)?
To test hypothesis H1, I will work with institutional data held by the University of Chile via the platform U-Campus3, which holds databases on administrative data related to each student, e.g. status, courses in which they are enrolled, enrolment, progression, withdrawal and completion, as well as the reported socio-economic indicators at the time the PSU test (Prueba de Selección Universitaria in Spanish) was taken. U-Campus offers a number of services to five4 faculties across the university: those services related to curriculum management (e.g. enrolments, course programmes, prospectuses, accreditation), and to administration and personal management (e.g. repository of University Council minutes, accreditation statistics). U-Campus is of interest for this research since the student data held (as outlined above) could well be used to predict success if H1 is true. In particular, and following on from previous research (Sarker, 2014; Bhardwaj and Pal, 2011; Glynn et al., 2003), I expect to find a correlation between academic performance and socioeconomic indicators such as the education level and occupation of the parents.
To test hypothesis H2, I will include in the analysis log data from U-Cursos indicating the time and frequency of interactions with the LMS, including not only the instances in which students upload content (e.g. submitting coursework) but also the instances in which they retrieve information of interest (e.g. assessment results and course information).
In testing hypothesis H3, I will follow closely the development of the mobile extension of U-Cursos, which aims firstly at improving accessibility and usability, and secondly at exploiting smartphone capabilities, such as nudges via granular pushes for delivery of information and the possibility of incorporating location data into the timestamp of an interaction. Rather than investigating the effectiveness of these additions, I am interested in proposing a framework so that mobile data can be incorporated into the learning analytics.
There are certain limitations regarding the mobile data which will be available in the coming months. In particular, this development is still in progress: beta testing is expected to finish by the end of July 2014 and therefore there is no historical data available. Additionally, the number of users is currently limited to just 50 (as opposed to the current 30,000 users of the web-based version of the platform). Despite this limitation, it is worth exploring whether the prediction model applied using the mobile
3 Access-restricted portal: https://www.u-campus.cl. See Appendix F for screenshots.
4 The University of Chile faculties currently using U-Campus are: Mathematical and Physical Sciences, Medicine, Architecture and Landscaping, Social Sciences, and Philosophy.
  • 40. data is reasonably aligned with the prediction results achieved when using the web-based platform.
5.3 Work Packages
In order to test the hypotheses presented in the previous section, a number of activities have been planned. The timescales for the proposed future work are given in the Gantt chart in Table 5.1, and detailed in the following work packages:
WP1: Enhanced literature review, with a focus on learning analytics as applied to the three research hypotheses.
WP2: Additional data analysis on surveys conducted in Chile and the UK.
WP3: Data acquisition and the collation of a complete dataset (a subset of U-Campus and U-Cursos).
WP4: Analysis of historical data from the PSU admission test of University of Chile students, for indicators associated with completion (available via U-Campus).
WP5: Analysis of U-Cursos data, for factors associated with high marks.
WP6: Integrating WP4 with WP5 findings for a predictive model of academic success.
WP7: Incorporating the additional variables gathered via U-Cursos mobile into the predictive model from WP6.
I am currently working on the first three work packages (WP1 to WP3). WP1 is necessary to complement my existing literature review, and will continue for the next 12 months, to ensure awareness of state-of-the-art research. In WP2, I will finalise the quantitative and qualitative analysis of the survey data that was described in Chapter 4. WP3 also completes ongoing work, this time regarding the datasets needed for this research. Work for this package started during my research visit to the University of Chile from the 9th to the 19th of March 2014, when an improved understanding of the data architecture of both U-Cursos and U-Campus was achieved (beyond the general concept presented by Cádiz (2013)). During this trip the collaboration with ADI and NIC Labs became formally established.
Figure 5.1 provides an outline of the processes and the kind of data stored, as well as the domains of responsibility for each.
WP4 will undertake a full analysis and evaluation of the PSU test data of students who have enrolled in the University of Chile since 2003, when the test was first introduced. More specifically, I will study correlations and statistical dependencies (using
  • 41. Table 5.1: Schedule of research work and thesis submission. Gantt chart covering July 2014 to October 2015, with the following rows, in order:
Mini-thesis viva
H1: conditioning factors
WP1: Extending literature review
WP2: Additional data analysis on surveys data
WP3: Securing U-Campus and U-Cursos data
WP4: Analysis of U-Campus data (with PSU data)
Second research visit to Chile
H2: behavioural factors
WP5: Analysis of U-Cursos data (SPSS and WEKA)
WP6: Integration for a predictive model
Submit WP6 results to Computers and Education
H3: smartphone data
WP7: Incorporating mobile data
Working with visiting researcher from Chile
Thesis write-up
Thesis submission
  • 42. SPSS) between “conditioning” factors and the academic performance to date as measured by the PSU test. Table 5.2 shows the data fields available for this test5, with marks (X) next to those which are of interest for this analysis, in particular socio-economic indicators and the average high-school marks, since they are generally accepted as reliable predictors of academic performance in the literature. Additional factors, such as gender, age and nationality, have been identified in the global literature as influential, therefore I will also incorporate this data. Specifically for the Chilean case, it has been reported that the PSU test is widely regarded as being biased towards school-leavers of private schools and towards the metropolitan area. Therefore, I will also study the impact of the educational institution of origin and the home city on the academic performance prior to the test (in this work package) and then later in Higher Education (in WP5).
Figure 5.1: Data architecture at the University of Chile: U-Campus and U-Cursos, with processes and entities responsible for their management. ADI is the University of Chile’s Information Technologies group (Área de Infotecnologías in Spanish) and STI is the University of Chile’s Division of IT and Communications (Dirección de Servicios de Tecnologías de Información y Comunicaciones). Diagram elements: U-Campus; U-Cursos; monthly forum “dump”; PSU; ADI; manual enrolments at Faculty level; students’ automated enrolments; digitalisation (some); digitalisation; institutional information; student RUT, name, address, socioeconomic data, age, etc.; course data (e.g. syllabus, resources, coursework specs, timetable, news, student polls); student data (e.g. RUT, names, email addresses, avatars, courseworks, partial marks, timetables, final marks or fail status (R/E/I)); U-Cursos Mobile; lecturer/instructor data (e.g. roles, courses, permissions); STI.
Finally, after certain pre-processing6, other fields (marked with †) are also
5 See Appendix G for further details, including screenshots of a sample student application.
6 In order to guarantee anonymity, it is necessary to avoid sensitive data, such as the name, phone numbers, email, exact home address (street and house number), and exact date of birth (month and year will suffice).
  • 43. necessary. In particular, I will require the national identification number (hashed or otherwise protected), since this will act as a unique key which could be used to link the data from the PSU test (“conditioning data”) to the measures of academic performance available via U-Cursos in WP5. At this point, I will have sufficient evidence to either support or reject hypothesis H1 (“traditional learning analytics on conditioning factors are suitable predictors of success”), as indicated in the Gantt chart (Table 5.1).
My findings will be discussed with researchers in ADI and NIC Labs during my second research visit (for two weeks, exact dates TBA), where I will complete the analysis and commence work on WP5. The visit will also be used to agree with these researchers on measurable behavioural factors that are feasible to study via the smartphone extension of U-Cursos, which will be required for WP7.
For WP5, data from U-Cursos will offer some information on measures of academic performance and “behavioural factors”, limited to how students interact with the platform, in terms of type and frequency of their access, including coursework submission information and interim assessments. This data will be analysed and correlations and statistical dependencies will be studied (using SPSS). Additionally, I will apply data mining techniques to formulate a prediction model of successful performance, considering these variables as classifying features.
WP6 concerns the integration of the conditioning factors (as gathered from U-Campus) and behavioural factors (from U-Cursos). Since the number of variables available will increase significantly, it is essential to apply feature selection methods to improve the model and avoid overfitting.
A number of classification methods from the data mining toolset WEKA could be used, for example Naïve Bayes, which has also been used by Bhardwaj and Pal (2011) to predict academic performance7. As an outcome of this work package, I intend to submit a research paper to the journal Computers and Education8, where the evidence gathered to prove or disprove hypothesis H2 will be discussed. The effort in writing this paper will count towards the task “Thesis write-up” (shown last in Table 5.1), hence this is shown as formally starting at the same time as WP6, though in practice the writing takes place throughout the research project. Finally, WP7 is concerned entirely with testing hypothesis H3 (“Smartphone data can be used to inform the prediction model”), and will incorporate data from U-Cursos mobile into the model created as part of WP6.
7 Bhardwaj and Pal (2011) only used conditioning variables such as those to be studied in WP4.
8 Some of the journal Computers and Education impact metrics are: Impact per Publication (IPP) of 3.720 and Impact Factor (IF) of 2.775. As reported at http://www.journals.elsevier.com/computers-and-education/ (last accessed on the 4th July 2014).
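To make the classification step concrete, the following is a minimal sketch of a categorical Naïve Bayes classifier written with the Python standard library only. It is not WEKA's implementation, and it is not the model planned in WP6: the feature names (`school_type`, `marks_band`), the labels and the five training rows are invented toy values, and feature selection is not shown.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, alpha=1.0):
    """Fit a categorical Naive Bayes model with Laplace smoothing `alpha`."""
    class_counts = Counter(labels)
    # counts[feature][label][value] = number of training rows with that combination
    counts = defaultdict(lambda: defaultdict(Counter))
    values = defaultdict(set)  # distinct values seen for each feature
    for row, label in zip(rows, labels):
        for feature, value in row.items():
            counts[feature][label][value] += 1
            values[feature].add(value)
    return {"class_counts": class_counts, "counts": counts,
            "values": values, "alpha": alpha}


def predict(model, row):
    """Return the label with the highest log posterior for one student row."""
    total = sum(model["class_counts"].values())
    alpha = model["alpha"]
    scores = {}
    for label, n_label in model["class_counts"].items():
        score = math.log(n_label / total)  # log prior of the class
        for feature, value in row.items():
            if feature not in model["values"]:
                continue  # ignore features never seen in training
            n_values = len(model["values"][feature])
            seen = model["counts"][feature][label][value]
            # Smoothed conditional probability P(value | label)
            score += math.log((seen + alpha) / (n_label + alpha * n_values))
        scores[label] = score
    return max(scores, key=scores.get)


# Invented toy data: hypothetical conditioning factors and outcomes.
toy_rows = [
    {"school_type": "private", "marks_band": "high"},
    {"school_type": "private", "marks_band": "high"},
    {"school_type": "public", "marks_band": "low"},
    {"school_type": "public", "marks_band": "low"},
    {"school_type": "public", "marks_band": "high"},
]
toy_labels = ["pass", "pass", "fail", "fail", "pass"]
model = train_naive_bayes(toy_rows, toy_labels)
```

Calling `predict(model, {"school_type": "public", "marks_band": "low"})` then returns the most probable label under the independence assumption. With the many variables expected in WP6, a feature selection pass would normally precede training.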
  • 44. Table 5.2: University Selection Test (Prueba de Selección Universitaria, PSU) data fields. Data from fields marked in bold will be used to validate H1, complemented with other fields of interest (marked X). Note that fields marked † will require some preprocessing for anonymisation. (Based on http://www.demre.cl/instr_incrip_p2014.htm. Last accessed: 3rd July 2014).
Mark | Field | Comments
Personal data
  | Full name | prefilled on login
† | National identification number | prefilled on login
X | Country of nationality |
X | Gender | prefilled on login
† | Date of birth | prefilled on login
X | Occupation | two choices: Student or blank field
School data
X | Type of applicant | either from current or previous years
X | Educational institution | prefilled
  | Educational branch | institutions may have several ones
X | Year of graduation from high school | prefilled
X | Average high-school marks | prefilled if from previous years
  | Geographical area | prefilled
Test choices data
  | Test choices | Social and/or pure sciences (but just one amongst Biology, Physics and Chemistry)
  | Admissions office |
  | Test venue | dropdown menu
Personal contacts
  | Home address: street, number |
X | Home: city, region and province | dropdown menus
  | Phone numbers |
  | E-mail address |
Socio-economic data
X | Marital status | dropdown menu
X | Work status | dropdown menu
X | Working hours | dropdown menu
X | Number of working hours a week |
X | Term-time type of accommodation | dropdown menu
X | Household size |
X | Number of people in the household in employment |
X | Who is the head of the household? | dropdown menu
X | Are your parents alive? |
X | How many people study in your household | discriminated by educational stage
X | Have you studied in a Higher Education Institution | Yes/No
X | If so, type of institution | dropdown menu
  | Name of institution |
About each parent
X | Occupation | multiple choice
X | Industry | multiple choice
Funding and payment
X | Are you a beneficiary of a JUNAEB scholarship? | dropdown menu
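The anonymisation preprocessing required for the fields marked † could look roughly like the sketch below, which replaces the national identification number with a keyed hash (so that it can still act as a linking key), coarsens the date of birth to year and month, and drops the directly identifying contact fields. All field names, the key and the sample record are hypothetical; the actual procedure would have to be agreed with the data owners.

```python
import hashlib
import hmac

# Hypothetical secret held by the data owner; never shipped with the dataset.
SECRET_KEY = b"replace-with-a-key-held-by-the-data-owner"

# Hypothetical names for the sensitive fields that must not be exported.
DROPPED_FIELDS = {"full_name", "phone_numbers", "email", "home_street_and_number"}


def pseudonymise_id(national_id, key=SECRET_KEY):
    """Map a national identification number to a stable, keyed SHA-256 digest."""
    normalised = national_id.replace(".", "").replace("-", "").upper()
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def coarsen_birth_date(iso_date):
    """Keep only year and month of an ISO date string (YYYY-MM-DD)."""
    return iso_date[:7]


def anonymise_record(record):
    """Return a copy of one applicant record that is safe to link and analyse."""
    cleaned = {
        field: value
        for field, value in record.items()
        if field not in DROPPED_FIELDS
        and field not in ("national_id", "date_of_birth")
    }
    cleaned["student_key"] = pseudonymise_id(record["national_id"])
    cleaned["birth_year_month"] = coarsen_birth_date(record["date_of_birth"])
    return cleaned


# Invented sample record, for illustration only.
sample = {
    "full_name": "Example Student",
    "national_id": "11.111.111-1",
    "date_of_birth": "1995-03-17",
    "email": "student@example.org",
    "home_city": "Santiago",
    "average_marks": 6.1,
}
safe = anonymise_record(sample)
```

Because the same key yields the same digest for the same identifier, records exported separately from the PSU data and from U-Cursos can still be joined on `student_key` without exposing the identifier itself. A keyed hash is used rather than a plain hash because national identification numbers are short enough to be recovered by exhaustive search.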
  • 45. 5.4 Contingency research plan
The research plan described above is predicated on acquiring specific data from a substantially large group of students, in particular from U-Campus, U-Cursos and U-Cursos mobile. Although I have successfully established the appropriate contacts at the University of Chile (in the ADI group and with NIC Labs), and substantial progress has already been made towards accessing U-Cursos and U-Campus data, a contingency plan is in place for the event of failure to secure suitable data.
My contacts from the University of Chile have been forthcoming in answering my questions as I become familiar with the platform and the organisation itself. My contribution in this collaboration is that my findings will be used to inform the evolution of the platform, and further extensions are likely to incorporate “nudges” for a future digital behavioural intervention seeking to improve retention and shorten the length of time students need to graduate. Our close collaboration is already fruitful, as during my research visit last March we were able to prepare a research paper together where U-Cursos is well described (Cádiz et al., 2014, as in Section 4.3).
However, despite these strong assurances evidencing their willingness to share the relevant data with me, there are some practical issues to be resolved which may affect the feasibility of securing the data as planned. In particular, the data architecture seems to have followed an ad-hoc design and there are many redundancies and inefficiencies of which I have just begun to become aware. As the data is distributed across a number of tables, often on separate sites, it is not a matter of simply being granted access to a centralised repository. In addition, our requirement for anonymisation of the data adds another level of uncertainty (which is hard to quantify), as this clearly will require time and effort by my Chilean colleagues.
Should the contingency research plan need to be carried out, hypotheses H1 and H2 may alternatively be tested on data from the University of Southampton Massive Open Online Courses (MOOCs)9, which are run by the University of Southampton via FutureLearn. Data regarding several conditioning factors to test hypothesis H1 are also harvested during enrolment in these courses as part of a “pre-course” questionnaire. These include socio-economic indicators (e.g. age, country, gender, employment status and reported disabilities, if any), and other conditioning factors such as course expectations, reported learning preferences, subject areas of interest, and prior education (both in formal education and in other MOOCs). A study similar to that planned for WP4 can therefore still be undertaken using this data instead.
9 As an example, the MOOC “How the Web is Changing the World” has had two intakes since 2012 (and is running for the third time this October). Further details at http://www.soton.ac.uk/moocs/webscience.shtml (last accessed on the 26th June 2014).
  • 46. With regards to the testing of H2, there are a number of datasets available, for which there is implicit consent from participants for their use in research. These datasets are files in Comma Separated Value (CSV) format, the most relevant being:
• the End of Course dataset: contains metrics such as the proportion of those who enrolled in the course (“joiners”) who have abandoned it (“leavers”). Other characterisations include: “learners” (those who have viewed at least one step of the course), “active learners” (those who have marked at least one step as complete), “returning learners” (those who completed steps in more than one week), “social learners” (those who have left at least one comment), and “fully participated learners” (sic), those who have completed a majority of the steps including all tests10.
• the Step Completion dataset: note that each course has a number of “steps” that need to be completed to succeed (typically watching a video, reading a text, or completing an assessment). Each step can have a number of comments associated.
• the Quiz data: which would constitute a proxy for “marks” in the traditional sense; and
• the Comments dataset: Table 5.3 gives a detailed example of the structure of these datasets, including the Comments dataset.
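The learner characterisations listed above can be expressed as simple set operations, as in the sketch below. The function and its parameters are hypothetical, not part of the FutureLearn exports. Two assumptions are made explicit: the number before the dot in a step identifier (e.g. the 1 in 1.13) is taken to be the week, and “a majority of the steps” is read as more than half.

```python
def categorise_learner(viewed_steps, completed_steps, comment_count,
                       all_steps, test_steps):
    """Return the set of categories that apply to one enrolled participant.

    Steps are strings such as "1.13". The part before the dot is assumed
    to identify the week in which the step runs.
    """
    categories = {"joiner"}  # everyone who enrolled
    if viewed_steps:
        categories.add("learner")
    if completed_steps:
        categories.add("active learner")
    weeks_with_completions = {step.split(".")[0] for step in completed_steps}
    if len(weeks_with_completions) > 1:
        categories.add("returning learner")
    if comment_count >= 1:
        categories.add("social learner")
    # "Majority" is interpreted here as strictly more than half of all steps.
    has_majority = len(completed_steps) > len(all_steps) / 2
    if has_majority and test_steps <= completed_steps:
        categories.add("fully participated learner")
    return categories


# Invented two-week course with four steps, one of which is a test.
course_steps = {"1.1", "1.2", "2.1", "2.2"}
course_tests = {"2.2"}

browser = categorise_learner({"1.1"}, set(), 0, course_steps, course_tests)
finisher = categorise_learner(course_steps, {"1.1", "2.1", "2.2"}, 2,
                              course_steps, course_tests)
```

Here `browser` is only a joiner and a learner, whereas `finisher` falls into every category. “Leavers” are not derived, since that status would come from enrolment records rather than from step activity.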
A “post-course” questionnaire, though mainly intended as a course evaluation exercise (and therefore including questions where the student rates the course in several ways), also helps in gathering other indicators of learning behaviour. These include the point of entry (whether from the start of the course or later on), reasons for attrition (if the course was abandoned) and the specific learning behaviours adopted: dedication in time and effort, reported frequency of access, reflection, collaboration (through social media as well as via comments in a step within the course), connectivity (devices used to access the course and typical study places) and use of prior learning. Combined, these datasets record all the interactions between participants through the platform and hold a complete record of achievement and progress as the students take on the various tasks and assessments in the course.
Admittedly, hypothesis H3 cannot be tested using MOOCs data, but alternatively we would formulate a domain-specific hypothesis applicable to online-only courses, as opposed to face-to-face instruction supported by an LMS, which is the case of interest in the current plan. Also in this case, a shift in focus will be necessary, including in the literature review presented in Section 2.2.3.
10 Thanks to Kate Dickens from the Centre for Innovation in Technologies and Education (CITE) for facilitating this information.
  • 47. Table 5.3: FutureLearn Platform Data Exports. Adapted from https://www.futurelearn.com/courses/course-slug/ (last accessed: 4th July 2014, by Kate Dickens, Project Leader for the Web Science MOOC).
Field | Type | Description
Comments
id | integer | a unique id assigned to each comment
author id | string | the unique, anonymous id assigned to the author user
parent id | integer | the unique id of the parent comment (i.e. the comment this comment replies to)
step | string | the human readable step number (e.g. 1.13)
text | string | the comment text
timestamp | timestamp | when the comment was posted
moderated | timestamp | the time at which a comment was moderated, if at all
likes | integer | the number of likes attributed to the comment
Peer Review - Assignments
id | integer | a unique id assigned to each assignment submission (referenced by reviews)
step | string | the human readable step number (e.g. 1.13)
author id | string | the unique, anonymous id assigned to the author user
text | string | the comment text
first viewed at | timestamp | when the assignment step was first viewed
created at | timestamp | when the assignment was submitted
moderated | timestamp | the time at which a comment was moderated, if at all
review count | integer | how many reviews are associated with the assignment
Peer Review - Reviews
id | integer | a unique id assigned to each assignment review
step | string | the human readable step number (e.g. 1.13)
author id | string | the unique, anonymous id assigned to the author user
assignment id | integer | the id identifying the assignment reviewed
guideline one feedback | string | text submitted for the first guideline
guideline two feedback | string | text submitted for the second guideline
guideline three feedback | string | text submitted for the third guideline
created at | timestamp | when the review was submitted
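As an illustration of how the Comments export could feed a measure of social engagement, the sketch below reads a CSV with the fields of Table 5.3 and counts comments, replies and likes per anonymous author. The underscore column names (`author_id`, `parent_id`), the timestamp format and the three sample rows are assumptions made for this example, not the confirmed layout of the real files.

```python
import csv
import io
from collections import Counter

# Invented sample in the assumed column layout of the Comments export.
SAMPLE_COMMENTS = """id,author_id,parent_id,step,text,timestamp,moderated,likes
1,a1,,1.1,Hello everyone,2014-02-03 10:00:00,,2
2,b2,1,1.1,Welcome!,2014-02-03 10:05:00,,0
3,a1,,1.2,Interesting video,2014-02-04 09:30:00,,1
"""


def summarise_comments(csv_file):
    """Count comments, replies and likes received for each anonymous author."""
    comments = Counter()
    replies = Counter()
    likes = Counter()
    for row in csv.DictReader(csv_file):
        author = row["author_id"]
        comments[author] += 1
        if row["parent_id"]:  # a non-empty parent id marks a reply
            replies[author] += 1
        likes[author] += int(row["likes"] or 0)
    return {
        author: {
            "comments": comments[author],
            "replies": replies[author],
            "likes_received": likes[author],
        }
        for author in comments
    }


summary = summarise_comments(io.StringIO(SAMPLE_COMMENTS))
```

For a real export, `io.StringIO(SAMPLE_COMMENTS)` would be replaced by a file opened with `open(path, newline="")`. The per-author counts could then serve as candidate behavioural features alongside the step completion and quiz data.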
5.5 Summary
This Chapter presented the motivation behind the research question “What are the measurable factors for the prediction of student academic success?” and outlined three research hypotheses associated with it. Two of these hypotheses consider conditioning and behavioural factors as predictors of academic success, whilst the last one regards smartphone data as suitable to inform a prediction model of success. In order to test them, a number of work packages (WP1-WP7) are planned, with deliverables at specific points in the time remaining until the submission of the final thesis. I have also outlined a contingency research plan should the data expected from the University of Chile prove difficult to obtain due to unforeseen circumstances. The following Chapter will outline future work that has been identified yet is beyond the scope of this research given the time and resources remaining.
Chapter 6 Conclusions
This research will explore the predictability of student success from learning analytics on big data sets. In particular, we seek to analyse a rich “data trail” of student activities as gathered via their interactions with a Learning Management System (LMS), such as the University of Chile’s U-Cursos¹. This data can be combined with data captured by the institution at first enrolment, such as socio-economic indicators (typically used in traditional learning analytics). From this analysis, a model of academic success will be developed, providing insight into the factors influencing academic performance amongst other measurable proxies for success.
A primary motivation behind seeking such an insight is that it would facilitate the identification of students “at risk”, and further enable behavioural interventions so that students can be supported in becoming successful in their studies. A greater, lasting goal would be to influence student behaviour via persuasive technologies, so that the students themselves are empowered to effect a significant change. This is a long-term goal beyond the scope of the present research.
Whilst the rich interconnection necessary for a digital behavioural intervention is not yet fully supported, and the existing student data is both incomplete and noisy for this specific purpose, we can still gain some knowledge of how it might look by examining current student data, from both the educational and the pervasive computing perspectives. The central theme of this research is learning analytics, informed by relevant studies on behavioural interventions and the application of pervasive computing to education.
In order to build on the traditional learning analytics research approaches (generally limited to data controlled by the educational institution), I have also considered including data that could offer an additional insight into student behaviour, by articulating descriptions of what successful students do when they are not studying.
¹ Developed by the University of Chile’s Information Technologies group (ADI, Área de Infotecnologías in Spanish).