Multimodal Learning
Analytics
Xavier Ochoa
Escuela Superior Politécnica del Litoral
http://www.slideshare.net/xaoch
Download this presentation
(Multimodal)
Learning Analytics
Learning analytics is the measurement,
collection, analysis and reporting of data
about learners and their contexts, for
purposes of understanding and
optimising learning and the
environments in which it occurs.
Examining engagement: analysing
learner subpopulations in massive
open online courses (MOOCs)
Using transaction-level data to
diagnose knowledge gaps and
misconceptions
Likelihood analysis of student
enrollment outcomes using learning
environment variables: a case study
approach
Tracking student progress in a game-
like learning environment with a
Monte Carlo Bayesian knowledge
tracing model
Strong focus on online data
Based on the papers it should be called
Online-Learning Analytics
Streetlight effect
Where learning is
happening?
Why
Multimodal Learning Analytics?
We should be looking where it is useful to look,
not where it is easy
There is learning outside the
LMS
But it is very messy!
Who is learning?
Who is learning?
Who is learning?
Who is learning? – Traditional way
But there are better ways to
assess learning
At least theoretically
Who is learning? – Educational Research
How can we approach the
problem from a
Learning Analytics perspective
Measure, collect, analyze and report
to understand and optimize
We need to capture learning
traces from the real world
Look ma, no log files!
In the real world, humans
communicate (and leave traces)
in several modalities
What you say is as important as
how you say it
We need to analyze the traces
with variable degrees of
sophistication
And we have to do it automatically as
humans are not scalable
We need to provide
feedback in the real world
Often in a multimodal way too
But…
Which modes are important to
understand the learning
process?
We do not know yet…
Possibilities
• What we see
• What we hear
• How we move
• How we write
• How we blink
• Our pulse
• Brain activity?
• Our hormones?
What are the relevant
features of those signals
We do not know yet…
Our current analysis tools are
good enough?
We do not know yet…
How to present the information
(and uncertainty)
in a way that is actually useful?
We do not know yet…
It is an open
(but very dark) field
One feels like an explorer
This particular flavor of Learning
Analytics is what we called
Multimodal Learning Analytics
Multimodal Learning Analytics is related to:
• Behaviorism
• Cognitive Science
• Multimodal Interaction (HCI)
• Educational Research (old school one)
• Computer Vision
• Natural Language Processing
• Biosignals Processing
• And as many fields as modes you can think of...
Examples
Math Data Corpus
How to (easily) obtain
multimodal features?
What is already there?
Three Approaches
• Literature-based features
• Common-sense-based features
• “Why not?”-based features
All approaches proved useful
Proof that we are in an early stage
Video: Calculator Use (NTCU)
et s of
s was
t , on
t hat
eived
so in-
n and
lving
ed by
udent
e was
score
given
diffi-
ssion,
n ex-
Mat h
at t et
direct ion in which it was point ing at using t he rigid t rans-
format ions capabilit ies provided by OpenCV . W hile t here
were some frames in which t his mat ching was not possible
due t o object occlusions or changes in t he illuminat ion of
t he calculat or, in general t he described det ect ion t echnique
was robust and provided useful posit ion and direct ion dat a.
F igur e 1: D et er m in at ion of w hich st udent is usin g
Video: Calculator Use (NTCU)
• Idea:
• Calculator user is the one solving the problem
• Procedure:
• Obtain a picture of the calculator
• Track the position and angle of the image in the video
using SURF + FLANN + Rigid Object Transformation
(OpenCV)
• Determine to which student the calculator is pointing in
each frame
Video: Total Movement (TM)
fined as t he number of whit e pixels cont ained in t he binary
image out put by t he Codebook algorit hm. T his magnit ude,
when comput ed for t he ent ire problem solving session, re-
sult s from summing up it s individual values obt ained from
each frame t hat compose a problem recording.
(a) Original frame (b) Difference frame
F igur e 2: R esult s of t he C odeb ook algor it hm .
Video: Total Movement (TM)
• Idea:
• Most active student is the leader/expert?
• Procedure:
• Subtract current frame from previous frame
• Codebook algorithm to eliminate noise-movement
• Add the number of remaining pixels
Video: Distance from center table
(DHT)
he de-
oblem
ed t o
calcu-
par-
a spe-
ed as
e pre-
Code-
ficant
small
ondi-
where
ained
s de-
inary
t ude,
n, re-
from
each head cent roid t o t he cent er of t he t able is calculat ed
and t hen, t he average of t hese dist ances is obt ained by prob-
lem (see Figure 3). Addit ionally, t he variance of t he average
dist ance head t o t able (SD-DHT ), was calculat ed t o det er-
mine if a part icipant remains most ly st at ic or varies his or
her dist ance t o t he t able.
F igur e 3: C alculat ion of t he dist ance of t he st udent ’s
Video: Distance from center table
(DHT)
• Idea:
• If the head is near the table (over paper) the student is
working on the problem
• Procedure:
• Identify image of heads
• Use TLD algorithm to track heads
• Determine the distance from head to center of table
Audio: Processing
Audio: Features
• Number of Interventions (NOI)
• Total Speech Duration (TSD)
• Times Numbers were Mentioned (TNM)
• Times Math Terms were Mentioned (TMTM)
• Times Commands were Pronounced (TCP)
Digital Pen: Basic Features
Digital Pen: Basic Features
• Total Number of Strokes (TNS)
• Average Number of Points (ANP)
• Average Stroke Path Length (ASPL)
• Average Stroke Displacement (ASD)
• Average Stroke Pressure (ASP)
Digital Pen: Shape Recognition
Stronium – Sketch Recognition Libraries
Digital Pen: Shape Recognition
• Number of Lines (NOL)
• Number of Rectangles (NOR)
• Number of Circles (NOC)
• Number of Ellipses (NOE)
• Number of Arrows (NOA)
• Number of Figures (NOF)
Analysis at Problem level
Solving Problem Correctly
• Logistic Regression to model Student Solving Problem
Correctly
• Resulting model was significantly reliable
• 60,9% of the problem solving student was identified
• 71,8% of incorrectly solved problems were identified
Analysis at problem level
effi cient s of t he L ogist ic M odel Predict ing Odds for a St udent Solving Cor rect ly
Predictor Variable B W ald df p value exp(B )
Number of Interventions (N OI ) 0.068 24.019 1 0.001 0.934
Times numbers were mentioned (TN M ) 0.175 23.816 1 0.001 1.192
Times commands were pronounced (TCP) 0.329 4.956 1 0.026 1.390
Proportion of Calculator Usage (PCU) 1.287 25.622 1 0.001 3.622
Fastest Student in the Group (F W ) 0.924 18.889 1 0.001 2.519
Constant 1.654 53.462 1 0.001 0.191
mber of Point s (AN P): Represents, in
mber of points that compose each stroke
sub-sets. Classification Trees, provided by
in the R statistical software [21] for Mac, w
second part of the analysis.
Analysis at Group Level
Expertise Estimation
• Features were feed to a Classification Tree algorithm
• Several variables had a high discrimination power between
expert and non-experts
• Best discrimination result in 80% expert prediction and 90%
non-expert prediction
Analysis at Group Level
Expertise Estimation
1)
he
le
M
s-
in
ed
0).
es
or
x-
es
he
le
n-
as
0
AN P L P Lowest value
ASD M D Highest value
AST L F W Lowest value
ASP M P Highest value
T able 4: C l assifi cat ion t r ee split s w it h nor m alized
and non-nor m al ized feat ur es
Variable Value for Expert s Discriminat ion Power
F W > 0.5 6.53
L P > 34.74 6.53
P CU > 38.05 4.44
M N > 0.13 4.03
P N M > 6.25 3.19
classificat ion is maint ained and plat eau at t he final value
around t he 12t h problem.
5. DISCUSSION
Expert Estimation over Problems
Plateau reached after
just 4 problems
Main conclusion: Simple
features could identify expertise
Faster Writer (Digital Pen)
Percentage of Calculator Use (Video)
Times Numbers were Mentioned (Audio)
Oral Presentation Quality Corpus
Video Features
• 66 facial features were
extracted using Luxand
software including both eyes
and nose tip to estimate the
presenter’s gaze.
Kinect features
• Identify Common postures
Kinect features
• Identify Common postures
Kinect Features
Laban’s theory helps to describe human movement using non-verbal
characteristics:
Spatial aspects of movement
Temporal aspects of movement
Fluency, smoothness, impulsivity
Energy and power
Overall activity
EVALUATION METHODOLOGY
Extracted
Features
Human coded
Criterion
Video
Kinect
Body and Posture
Language
Eye Contact
Results: less than 50%
accuracy
What we were measuring was not was
humans were measuring
What is next in MLA?
Mode integration framework
for MLA
Currently pioneered by Marcelo Worsley
Developing
Multimodal Measuring Devices
Our Fitbits
Record different learning
settings
And share them with the community
Conclusions
Multimodal Learning Analytics is
not a subset of Learning
Analytics
Current Learning Analytics is a subset of MLA

Some problems are easy,
some hard
But we do not know until we try to solve them
There is a lot of
exploring to do
And we need explorers
Gracias / Thank you
Questions?
Xavier Ochoa
xavier@cti.espol.edu.ec
http://ariadne.cti.espol.edu.ec/xavier
Twitter: @xaoch
http://www.sigmla.org

Multimodal Learning Analytics

  • 1.
    Multimodal Learning Analytics Xavier Ochoa EscuelaSuperior Politécnica del Litoral
  • 2.
  • 3.
  • 4.
    Learning analytics isthe measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimising learning and the environments in which it occurs.
  • 5.
    Examining engagement: analysing learnersubpopulations in massive open online courses (MOOCs) Using transaction-level data to diagnose knowledge gaps and misconceptions Likelihood analysis of student enrollment outcomes using learning environment variables: a case study approach Tracking student progress in a game- like learning environment with a Monte Carlo Bayesian knowledge tracing model
  • 6.
    Strong focus ononline data Based on the papers it should be called Online-Learning Analytics
  • 7.
  • 8.
  • 15.
    Why Multimodal Learning Analytics? Weshould be looking where it is useful to look, not where it is easy
  • 16.
    There is learningoutside the LMS But it is very messy!
  • 17.
  • 18.
  • 19.
  • 20.
    Who is learning?– Traditional way
  • 21.
    But there arebetter ways to assess learning At least theoretically
  • 22.
    Who is learning?– Educational Research
  • 24.
    How can weapproach the problem from a Learning Analytics perspective Measure, collect, analyze and report to understand and optimize
  • 25.
    We need tocapture learning traces from the real world Look ma, no log files!
  • 26.
    In the realworld, humans communicate (and leave traces) in several modalities What you say is as important as how you say it
  • 27.
    We need toanalyze the traces with variable degrees of sophistication And we have to do it automatically as humans are not scalable
  • 28.
    We need toprovide feedback in the real world Often in a multimodal way too
  • 29.
  • 30.
    Which modes areimportant to understand the learning process? We do not know yet…
  • 31.
    Possibilities • What wesee • What we hear • How we move • How we write • How we blink • Our pulse • Brain activity? • Our hormones?
  • 32.
    What are therelevant features of those signals We do not know yet…
  • 33.
    Our current analysistools are good enough? We do not know yet…
  • 34.
    How to presentthe information (and uncertainty) in a way that is actually useful? We do not know yet…
  • 35.
    It is anopen (but very dark) field One feels like an explorer
  • 36.
    This particular flavorof Learning Analytics is what we called Multimodal Learning Analytics
  • 37.
    Multimodal Learning Analyticsis related to: • Behaviorism • Cognitive Science • Multimodal Interaction (HCI) • Educational Research (old school one) • Computer Vision • Natural Language Processing • Biosignals Processing • And as many fields as modes you can think of...
  • 38.
  • 40.
  • 41.
    How to (easily)obtain multimodal features? What is already there?
  • 42.
    Three Approaches • Literature-basedfeatures • Common-sense-based features • “Why not?”-based features
  • 43.
    All approaches proveduseful Proof that we are in an early stage
  • 44.
    Video: Calculator Use(NTCU) et s of s was t , on t hat eived so in- n and lving ed by udent e was score given diffi- ssion, n ex- Mat h at t et direct ion in which it was point ing at using t he rigid t rans- format ions capabilit ies provided by OpenCV . W hile t here were some frames in which t his mat ching was not possible due t o object occlusions or changes in t he illuminat ion of t he calculat or, in general t he described det ect ion t echnique was robust and provided useful posit ion and direct ion dat a. F igur e 1: D et er m in at ion of w hich st udent is usin g
  • 45.
    Video: Calculator Use(NTCU) • Idea: • Calculator user is the one solving the problem • Procedure: • Obtain a picture of the calculator • Track the position and angle of the image in the video using SURF + FLANN + Rigid Object Transformation (OpenCV) • Determine to which student the calculator is pointing in each frame
  • 46.
    Video: Total Movement(TM) fined as t he number of whit e pixels cont ained in t he binary image out put by t he Codebook algorit hm. T his magnit ude, when comput ed for t he ent ire problem solving session, re- sult s from summing up it s individual values obt ained from each frame t hat compose a problem recording. (a) Original frame (b) Difference frame F igur e 2: R esult s of t he C odeb ook algor it hm .
  • 47.
    Video: Total Movement(TM) • Idea: • Most active student is the leader/expert? • Procedure: • Subtract current frame from previous frame • Codebook algorithm to eliminate noise-movement • Add the number of remaining pixels
  • 48.
    Video: Distance fromcenter table (DHT) he de- oblem ed t o calcu- par- a spe- ed as e pre- Code- ficant small ondi- where ained s de- inary t ude, n, re- from each head cent roid t o t he cent er of t he t able is calculat ed and t hen, t he average of t hese dist ances is obt ained by prob- lem (see Figure 3). Addit ionally, t he variance of t he average dist ance head t o t able (SD-DHT ), was calculat ed t o det er- mine if a part icipant remains most ly st at ic or varies his or her dist ance t o t he t able. F igur e 3: C alculat ion of t he dist ance of t he st udent ’s
  • 49.
    Video: Distance fromcenter table (DHT) • Idea: • If the head is near the table (over paper) the student is working on the problem • Procedure: • Identify image of heads • Use TLD algorithm to track heads • Determine the distance from head to center of table
  • 50.
  • 51.
    Audio: Features • Numberof Interventions (NOI) • Total Speech Duration (TSD) • Times Numbers were Mentioned (TNM) • Times Math Terms were Mentioned (TMTM) • Times Commands were Pronounced (TCP)
  • 52.
  • 53.
    Digital Pen: BasicFeatures • Total Number of Strokes (TNS) • Average Number of Points (ANP) • Average Stroke Path Length (ASPL) • Average Stroke Displacement (ASD) • Average Stroke Pressure (ASP)
  • 54.
    Digital Pen: ShapeRecognition Stronium – Sketch Recognition Libraries
  • 55.
    Digital Pen: ShapeRecognition • Number of Lines (NOL) • Number of Rectangles (NOR) • Number of Circles (NOC) • Number of Ellipses (NOE) • Number of Arrows (NOA) • Number of Figures (NOF)
  • 56.
    Analysis at Problemlevel Solving Problem Correctly • Logistic Regression to model Student Solving Problem Correctly • Resulting model was significantly reliable • 60,9% of the problem solving student was identified • 71,8% of incorrectly solved problems were identified
  • 57.
    Analysis at problemlevel effi cient s of t he L ogist ic M odel Predict ing Odds for a St udent Solving Cor rect ly Predictor Variable B W ald df p value exp(B ) Number of Interventions (N OI ) 0.068 24.019 1 0.001 0.934 Times numbers were mentioned (TN M ) 0.175 23.816 1 0.001 1.192 Times commands were pronounced (TCP) 0.329 4.956 1 0.026 1.390 Proportion of Calculator Usage (PCU) 1.287 25.622 1 0.001 3.622 Fastest Student in the Group (F W ) 0.924 18.889 1 0.001 2.519 Constant 1.654 53.462 1 0.001 0.191 mber of Point s (AN P): Represents, in mber of points that compose each stroke sub-sets. Classification Trees, provided by in the R statistical software [21] for Mac, w second part of the analysis.
  • 58.
    Analysis at GroupLevel Expertise Estimation • Features were feed to a Classification Tree algorithm • Several variables had a high discrimination power between expert and non-experts • Best discrimination result in 80% expert prediction and 90% non-expert prediction
  • 59.
    Analysis at GroupLevel Expertise Estimation 1) he le M s- in ed 0). es or x- es he le n- as 0 AN P L P Lowest value ASD M D Highest value AST L F W Lowest value ASP M P Highest value T able 4: C l assifi cat ion t r ee split s w it h nor m alized and non-nor m al ized feat ur es Variable Value for Expert s Discriminat ion Power F W > 0.5 6.53 L P > 34.74 6.53 P CU > 38.05 4.44 M N > 0.13 4.03 P N M > 6.25 3.19 classificat ion is maint ained and plat eau at t he final value around t he 12t h problem. 5. DISCUSSION
  • 60.
    Expert Estimation overProblems Plateau reached after just 4 problems
  • 61.
    Main conclusion: Simple featurescould identify expertise Faster Writer (Digital Pen) Percentage of Calculator Use (Video) Times Numbers were Mentioned (Audio)
  • 63.
  • 64.
    Video Features • 66facial features were extracted using Luxand software including both eyes and nose tip to estimate the presenter’s gaze.
  • 65.
  • 66.
  • 67.
    Kinect Features Laban’s theoryhelps to describe human movement using non-verbal characteristics: Spatial aspects of movement Temporal aspects of movement Fluency, smoothness, impulsivity Energy and power Overall activity
  • 68.
  • 69.
    Results: less than50% accuracy What we were measuring was not was humans were measuring
  • 70.
    What is nextin MLA?
  • 71.
    Mode integration framework forMLA Currently pioneered by Marcelo Worsley
  • 72.
  • 74.
    Record different learning settings Andshare them with the community
  • 75.
  • 76.
    Multimodal Learning Analyticsis not a subset of Learning Analytics Current Learning Analytics is a subset of MLA 
  • 77.
    Some problems areeasy, some hard But we do not know until we try to solve them
  • 78.
    There is alot of exploring to do And we need explorers
  • 79.
    Gracias / Thankyou Questions? Xavier Ochoa xavier@cti.espol.edu.ec http://ariadne.cti.espol.edu.ec/xavier Twitter: @xaoch http://www.sigmla.org