Social learning analytics are concerned with the process of knowledge construction as learners build knowledge together in their social and cultural environments. One of the most important tools employed during this process is language. In this presentation we take exploratory dialogue, a joint form of co-reasoning, to be an external indicator that learning is taking place. Using techniques developed within the field of computational linguistics, we build on previous work using cue phrases to identify exploratory dialogue within online discussion.
Capitol Tech U Doctoral Presentation - April 2024.pptx
Learning analytics to identify exploratory dialogue in online discussions
1. An Evaluation of Learning Analytics To Identify Exploratory Dialogue in Online Discussions
Rebecca Ferguson, The Open University, UK
Zhongyu Wei, The Chinese University of Hong Kong
Yulan He, Aston University, UK
Simon Buckingham Shum, The Open University, UK
2. Discourse analytics
The ways in which learners engage in dialogue indicate how they engage with the ideas of others, how they relate those ideas to their understanding, and how they explain their own point of view.
• Disputational dialogue
• Cumulative dialogue
• Exploratory dialogue
3. Exploratory dialogue
Category                               Indicator
Challenge                              But if, have to respond, my view
Critique                               However, I'm not sure, maybe
Discussion of resources                Have you read, more links
Evaluation                             Good example, good point
Explanation                            Means that, our goals
Explicit reasoning                     Next step, relates to, that's why
Justification                          I mean, we learned, we observed
Reflections on perspectives of others  Agree, here is another, makes the point, take your point, your view
4. Pilot study: LAK 2011
Time Contribution
2:42 PM  I hate talking. :-P My question was whether "gadgets" were just basically widgets and we could embed them in various web sites, like Netvibes, Google Desktop, etc.
2:42 PM  Thanks, that's great! I am sure I understood everything, but looks inspiring!
2:43 PM  Yes why OU tools not generic tools?
2:43 PM  Issues of interoperability
2:43 PM  The "new" SocialLearn site looks a lot like a corkboard where you can add various widgets, similar to those existing web start pages.
2:43 PM  What if we end up with as many apps/gadgets as we have social networks and then we need a recommender for the apps!
2:43 PM  My question was on the definition of the crowd in the wisdom of crowds we acsess in the service model?
2:43 PM  there are various different flavours of widget e.g. Google gadgets, W3C widgets etc. SocialLearn has gone for Google gadgets
5. Computational linguistics
Interdisciplinary field that deals with statistical and rule-based modelling of natural language from a computational perspective
[Photos: Zhongyu Wei, Yulan He]
6. Three challenges
1. The annotated dataset is limited
2. Text classification problems are typically topic driven – this is not
3. Nevertheless, both dialogue features and topical features need to be taken into account
7. Self-training from labelled instances – a problem
[Diagram: a chain of instances given the pseudo-label 'exploratory'. Most are labelled correctly (✓), but one is mislabelled (✗) – including this instance would degrade the classifier]
8. Self-training from labelled features
• For each turn in the dialogue, consider each unigram
(word), bigram (2 words) and trigram (3 words)
• Exploratory or non-exploratory?
• Take into account word-association probabilities
averaged over many pseudo-labelled examples
[Diagram: a turn's bigrams lead to the pseudo-label 'non-exploratory' ✓]
Focusing on features gives a more reliable classification. To improve labelling, take into account the classification of a number (k) of the nearest neighbours of that turn in the dialogue.
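The feature extraction described above can be sketched as follows. This is a minimal illustration: the tokenisation (lowercased whitespace splitting) is an assumption, not the paper's documented preprocessing.

```python
def ngram_features(turn, max_n=3):
    """Break one dialogue turn into unigrams, bigrams and trigrams."""
    tokens = turn.lower().split()  # assumed tokenisation
    feats = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(tuple(tokens[i:i + n]))
    return feats

# A five-word turn yields 5 unigrams, 4 bigrams and 3 trigrams
feats = ngram_features("have you read the paper")
```

Each turn is then pseudo-labelled according to how strongly its features are associated with exploratory versus non-exploratory dialogue.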
9. Taking context into account
[Diagram, with k = 3 (look at 3 nearest neighbours): an unlabelled turn in the dialogue, p1, receives a pseudo-label l1 ('non-exploratory') with confidence value c1 = 0.272727271. Its nearest neighbours pni1, pni2 and pni3 each carry their own pseudo-labels lni1, lni2, lni3 and confidence levels cni1, cni2, cni3]
10. Checking against context
A pseudo-label based on features is considered correct if the support value (s) based on context is high enough.
The support value is calculated by taking into account the pseudo-labels and confidence values of the k nearest neighbours.
[Diagram: a turn pseudo-labelled 'non-exploratory', marked '?']
11. Checking the pseudo-labels
Worked example, with k = 3 and R = 0.5. The turn being checked carries the pseudo-label 'non-exploratory'.
• Nearest neighbour 1: pseudo-label 'non-exploratory', confidence level 0.333
• Nearest neighbour 2: pseudo-label 'exploratory', confidence level 0.999
• Nearest neighbour 3: pseudo-label 'non-exploratory', confidence level 0.666
s = (0.333 + 0 + 0.666) / 3 = 0.333
If the support value for this pseudo-label is greater than R, this turn in the dialogue can be labelled 'non-exploratory'. Because s < R, this turn in the dialogue should not be labelled 'non-exploratory'.
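The support-value check can be written as a short sketch (the function and variable names are illustrative; the rule itself – average the confidences of matching-label neighbours, count non-matching neighbours as 0 – follows the slides):

```python
def support_value(pseudo_label, neighbours):
    """Support s for a turn's pseudo-label, from its k nearest neighbours.

    Each neighbour is a (label, confidence) pair. A neighbour contributes
    its confidence if its pseudo-label matches, and 0 otherwise; the sum
    is averaged over k.
    """
    k = len(neighbours)
    total = sum(conf for label, conf in neighbours if label == pseudo_label)
    return total / k

# Worked example from the slide: k = 3, R = 0.5
neighbours = [("non-exploratory", 0.333),
              ("exploratory", 0.999),
              ("non-exploratory", 0.666)]
s = support_value("non-exploratory", neighbours)  # (0.333 + 0 + 0.666) / 3
R = 0.5
keep_label = s > R  # False: the pseudo-label is not sufficiently supported
```

With these numbers s = 0.333, below the cut-off R = 0.5, so the turn is not labelled 'non-exploratory'.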
12. Cue phrases from pilot
94 cue phrases, including: Agree, Also, Although, Alternative, Any research, Are we, Because, But if, Challenge, Claim, Debate, Definitely, Depends, Difficult, Discussion, Do we have, Do you, Does that mean, Does this suggest, Draft, Evidence, Example, Except, Good example, Good point, Good thing about, Have we, Have you looked at, Have you read, Here is another, How are, How can, Misunderstanding, [...], Why, Your view
• Precise but low recall
• Used to improve accuracy when classifying unannotated dataset
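A cue-phrase matcher of this kind is simple to sketch. The phrase list below is only a small sample of the 94 phrases, and the matching strategy (case-insensitive substring search) is an assumption for illustration:

```python
# Sample of the pilot study's cue phrases (not the full list of 94)
CUE_PHRASES = ["but if", "good example", "good point", "have you read",
               "here is another", "your view", "i'm not sure"]

def contains_cue_phrase(turn):
    """High precision: a match almost certainly signals exploratory dialogue.
    Low recall: many exploratory turns contain no cue phrase at all."""
    text = turn.lower()
    return any(phrase in text for phrase in CUE_PHRASES)

contains_cue_phrase("Have you read the LAK 2011 paper?")  # True
contains_cue_phrase("Hello everyone")                     # False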
13. Dataset
Annotated
• Elluminate text chat
• Two-day conference
• 2,636 dialogue turns
• Mean word tokens per turn: 10.14

Unannotated
• Elluminate text chat
• Three MOOCs
• 10,568 dialogue turns
• Mean word tokens per turn: 9.24

Example turns:
Time     Contribution
2:43 PM  Issues of interoperability
2:43 PM  The "new" SocialLearn site looks a lot like a corkboard where you can add various widgets, similar to those existing web start pages.
2:43 PM  What if we end up with as many apps/gadgets as we have social networks and then we need a recommender for the apps!
2:43 PM  My question was on the definition of the crowd in the wisdom of crowds we acsess in the service model?
14. Manual coding of data subset
Challenge: A challenge identifies something that may be wrong and in need of correction. Examples include calling into question, contradicting, proposing revision.
Evaluation: An evaluation has a descriptive quality. Examples include appraising, assessing, judging.
Extension: An extension builds on, or provides resources that support, discussion. Examples include applying an idea to a new area, increasing the range of an idea, providing related resources.
Reasoning: Reasoning is the process of thinking an idea through. Examples include explaining, justifying your position, reaching a conclusion.
15. Combining methods
• Train initial classifier on annotated dataset
• Apply trained classifier to un-annotated data
• Use self-learned features to find exploratory dialogue
• Use cue-phrase matching to improve accuracy
• Take context into account using k-nearest neighbours
• Add selected instances to the training dataset
• Repeat for five iterations, or until less than 0.5% of labels are changed
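The steps above can be sketched as a loop. The classifier, cue-phrase check and context check are passed in as plain functions here purely for illustration; none of these names come from the paper, and the stand-ins in the usage example are trivial stubs:

```python
def self_train(annotated, unannotated, predict, cue_match, context_ok,
               max_iters=5, stop_frac=0.005):
    """Grow the training set with pseudo-labelled turns over several iterations."""
    prev = {}
    for _ in range(max_iters):
        labels = {}
        for turn in unannotated:
            label = predict(turn, annotated)   # classify via self-learned features
            if cue_match(turn):                # cue phrases: highly precise signal
                label = "exploratory"
            if context_ok(turn, label):        # k-nearest-neighbour support check
                labels[turn] = label
        changed = sum(1 for t in labels if prev.get(t) != labels[t])
        annotated = annotated + list(labels.items())  # add selected instances
        if prev and changed <= stop_frac * max(len(labels), 1):
            break                              # fewer than 0.5% of labels changed
        prev = labels
    return annotated

# Trivial stand-in components, purely to show the call shape
result = self_train(
    annotated=[("seed turn", "exploratory")],
    unannotated=["Good point about widgets", "hi everyone"],
    predict=lambda turn, ann: "non-exploratory",
    cue_match=lambda turn: "good point" in turn.lower(),
    context_ok=lambda turn, label: True,
)
```

The stopping rule mirrors the slide: at most five iterations, ending early once the pseudo-labels have stabilised.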
16. Evaluation criteria
On a scale of 0 to 1…
Accuracy
How many decisions were correct?
Pilot 0.5389 SF+CP+KNN = 0.7924
Precision
How many ‘exploratory’ turns were actually exploratory?
Pilot 0.9523 SF+CP+KNN = 0.8083
Recall
How many exploratory turns were classified as exploratory?
Pilot 0.4241 SF+CP+KNN = 0.8688
F1
Harmonic mean of precision and recall
Pilot 0.5865 SF+CP+KNN = 0.8331
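All four measures follow from the counts of true/false positives and negatives. A minimal sketch (F1 computed as the harmonic mean of precision and recall; small differences from the reported figures can arise from averaging over experimental runs):

```python
def metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary decision counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)      # 'exploratory' calls that were right
    recall = tp / (tp + fn)         # exploratory turns that were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1

# The pilot's F1 follows (approximately) from its precision and recall:
pilot_f1 = 2 * 0.9523 * 0.4241 / (0.9523 + 0.4241)  # ~0.587
```

This makes the pilot's profile easy to read: very high precision but low recall drags the harmonic mean down, which is why the cue-phrase-only approach scores a low F1 despite rarely mislabelling a turn as exploratory.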
17. Varying the value of k
k Accuracy Precision Recall F1
1 0.7868 0.8007 0.8666 0.8282
3 0.7924 0.8083 0.8688 0.8331
5 0.7881 0.8005 0.8685 0.8292
7 0.7586 0.7505 0.8640 0.8001
Looking at three nearest neighbours gives best results
18. Making use of the classifier
Each colour block represents 10 turns in the dialogue
Red blocks are primarily exploratory, blue blocks primarily non-exploratory
19. Making use of the classifier
[Scatter plot: exploratory turns in the dialogue (y-axis) against total turns in the dialogue (x-axis)]
The line here is set to highlight anyone who had more than 5/6 of their turns classified as exploratory. Analytics like these could be used to provide focused support to learners.
20. Issues
Visual literacy: How can we share the maximum amount of information while making these analytics easy to use?
Assessment for learning: How can we use these analytics to motivate and guide, rather than to discourage?
Participatory design: How can we involve learners and teachers in learning discussions around these analytics?
22. Conclusion
• We proposed and tested a self-training framework
• Found it out-performs alternative methods of detecting exploratory dialogue
• Developed an annotated corpus for the development of automatic exploratory dialogue detection
• Identified areas for future research
• Identified ways of applying this work to support learners and educators
23. SoLAR Storm webinar
bit.ly/YSEVHG
Yulan He
Senior Lecturer at the School of Engineering
and Applied Science, Aston University, UK
Editor's Notes
Who we are. Where we are from. What this presentation is about.
Why is learning dialogue important? From a sociocultural perspective, language and dialogue are crucial tools in the development of knowledge. The work of Neil Mercer and his colleagues has shown that effective dialogue can be taught, and can significantly improve results.
The three modes of dialogue in learning situations: a brief summary of disputational dialogue and cumulative dialogue, then exploratory dialogue – and phrases that might signal its presence.
Example of coding from the pilot study reported at LAK 2011. The analysis pointed to differences between conference participants, and picked out people who seemed likely to be actively engaged in knowledge building. The analysis also discriminated between conference sessions. This work gave us some key phrases, but it required a lot of manual cleansing of data, and use of a Find/Replace program based on cue phrases.
So we joined up with computational linguists – our co-authors, who cannot be here today: Yulan He, now at Aston University in the UK, and Zhongyu Wei, who joined us as an intern from the Chinese University of Hong Kong.
They identified three challenges from a computational linguistics perspective. First, computational linguists would normally make use of a much more extensive annotated dataset. Second, when carrying out text classification, they would normally be looking for subject-focused words and phrases; here, though, we are looking for a way of talking that can be used in many contexts. Third, we nevertheless need to take topical features into account. If students are supposed to be talking about learning analytics, we are not interested if they veer off to engage in exploratory dialogue about the microphones they are using to talk to each other, or about a novel they have just read.
Our limited annotated dataset meant it would be necessary to develop some sort of self-training system that could learn from the annotated dataset and begin to do its own annotations. A straightforward way of doing this would be to use labelled instances and to tell the classifier: here are examples that we have labelled as 'exploratory dialogue'; go and find ones like them, and label those as exploratory. The problem is that if the classifier makes mistakes, they begin to multiply, because the instances that are given pseudo-labels by the classifier are added to the dataset and used to help to classify other turns in the dialogue.
So, instead of focusing on labelled instances, we focused on labelled features. Each turn in the dialogue – every contribution to the Elluminate discussion – was broken down into unigrams, bigrams and trigrams. If enough of these were associated with exploratory dialogue, that turn in the dialogue would be given an 'exploratory' pseudo-label (a pseudo-label is a temporary label, assigned by the classifier). On the other hand, if most features were associated with non-exploratory dialogue, then it would be given a 'non-exploratory' pseudo-label. This was a more detailed series of checks than labelling by instance. We wanted to go further, and check against context. We did that by taking 'nearest neighbours' into account.
Checking against nearest neighbours allowed us to take context into account. In this case, 'nearest neighbour' doesn't mean nearest in time; it means the turns in the dialogue that were most similar in terms of features – those sharing lots of common words. So the classifier compared the pseudo-label of a turn in the dialogue with that of its three nearest neighbours, and it took into account how confident it had been about assigning those labels.
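The "most similar in terms of features" idea could be sketched with a simple overlap measure. Jaccard similarity over feature sets is an assumption for illustration; the paper's actual similarity measure may differ:

```python
def jaccard(features_a, features_b):
    """Overlap between two turns' feature sets (shared vs total features)."""
    a, b = set(features_a), set(features_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def k_nearest(turn_feats, others, k=3):
    """Indices of the k turns whose features overlap most with the given turn."""
    sims = [(jaccard(turn_feats, f), i) for i, f in enumerate(others)]
    return [i for _, i in sorted(sims, reverse=True)[:k]]

others = [["a", "b"], ["x", "y"], ["a", "b", "c"]]
nearest = k_nearest(["a", "b"], others, k=2)  # turns sharing the most features
```

Turns sharing many common words score close to 1; turns with no shared features score 0.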
The classifier used this information to calculate a support value for the pseudo label – based on its decisions about k nearest neighbours. It did this using this formula – first checking whether labels were the same, then checking confidence values, and then averaging them out. Let’s look at an example with some straightforward numbers plugged into it.
So we are checking the label on the left. This turn in the dialogue has tentatively been labelled 'non-exploratory' on the basis of its features. The next stage is to check it against some of its nearest neighbours. The classifier checks it against k nearest neighbours, and in this case I'm taking k to be 3. The classifier uses information about those 3 nearest neighbours to calculate a support value (s) for that label. It then compares the support value with a cut-off level we have chosen. The cut-off value is R and, in this case, we have set R to be 0.5. So if the support value is greater than 0.5, the classifier can label this turn in the dialogue non-exploratory. It looks good to start with: two out of three nearest neighbours have the same label. Let's work through in more detail. Nearest neighbour 1 has the same label as the turn we are checking, so we take its confidence level. Nearest neighbour 2 has a different label, so that's a 0. Nearest neighbour 3 has the same label as the turn we are checking, so we take its confidence level. We add up those figures, and then divide by the number of nearest neighbours that we have taken into account – in this case, three. The result is 0.333. That is below our cut-off figure of 0.5, so the classifier cannot be confident that the pseudo-label is accurate.
In order to improve our classifier further, we returned to the manually coded data from our pilot study. The 94 cue phrases identified there had proved to be precise – if they were present, that turn in the dialogue was almost certainly exploratory. However, they had low recall – they were not present in a lot of exploratory turns in the dialogue. We incorporated these phrases into our classifier to improve accuracy.
Having developed our method, we needed a dataset to test it on. We were able to use the annotated dataset from our pilot study. This was the Elluminate text chat from a two-day online conference. We extended the dataset by adding text chat from three MOOCs. Each contribution was taken as a turn in the dialogue – so a turn might be a few words, or just an emoticon, or it might be several sentences. The use of non-standard spelling, grammar and punctuation in chat like this makes it difficult to classify using methods developed for formally structured text.
We asked two postgraduate researchers to classify a portion of the unannotated data – picking out any turns that they considered to be exploratory. This is a cut-down version of the coding instructions we gave them. At this stage, we were hoping to be able to do a more fine-grained coding – breaking exploratory dialogue down into challenge, evaluation, extension and reasoning. Those sub-categories did not prove to be reliable. There were several reasons for this. It was partly because they overlap – after all, if you explain something, you also extend it. It was partly because the nature of synchronous chat means it is not always clear who is replying to whom. And it was partly because turns in the dialogue changed their state as the dialogue developed. What was originally a throwaway remark might be picked up by someone else and incorporated within their line of reasoning. So the sub-categories did not prove useful. But there was sufficient agreement on what was definitely exploratory and what was definitely non-exploratory for us to use these to extend our annotated dataset. To increase reliability and validity, we only built our annotated dataset from those turns in the dialogue on which both coders had agreed.
We then combined methods in order to classify the unannotated data. We experimented with different approaches – ranging from the simple cue-phrase labelling of our pilot, to a combination of all the methods, as detailed above. In each run of the experiment, one of the four sections of the annotated dataset OUC2010 was used as a test set. All or part of the remaining annotated dataset was used to train the classifier. The un-annotated dataset was used for self-training. In order to evaluate performance, all possible training/testing combinations were tested, and the results of these runs were averaged.
The full results are included in our paper. For the sake of clarity, I am just reporting two approaches here – the original, pilot approach that used cue phrases, and our proposed method using features, cue phrases and nearest neighbours. The cue phrases identified in the pilot, as we expected, proved to be the most precise. However, we knew this was not enough in itself, because that method missed lots of exploratory dialogue and had very low recall. Overall, our proposed framework was the best, and it performed significantly better than the cue-phrase-only method used in our pilot.
We experimented with different values for k, deciding how many nearest neighbours it was best to take into account. Our results showed that it was best to take three nearest neighbours into account.
So, once we have a method of detecting exploratory dialogue in synchronous text chat, what can we do with it? One possibility is to use it to provide a way of navigating long sections of dialogue. For example, an online conference session could last for hours – this would provide a way of focusing on particularly interesting sections. It could also be used to highlight patterns of dialogue to learners or to educators. This slide shows a visualisation of a conference session that lasts about 150 minutes. Rather than represent every turn in the dialogue, we have given an average for each block of 10 contributions. You'll see that people are very chatty at the end of the session – 20 turns in the dialogue in a couple of minutes – but they're not exploratory. Same at the beginning, when they are all saying 'Hello'. However, between 11.20am and noon, they engage in an extended period of exploratory dialogue that builds to a peak at about 10 to 12, and then dies away as the session comes to an end.
Another approach is to look in more detail at what individuals are doing. Each diamond here represents an individual, and the graph plots total number of turns against number of exploratory turns. In this case we have added a line that highlights those whose dialogue is most consistently exploratory. Individuals on that line are engaging in a lot of exploratory dialogue. This could be used to provide focused support to users. However, this would need to be done carefully. The individual who has contributed 15 times without being exploratory might need support in developing their learning dialogue – or they may be complaining persistently that the sound is down and they can't hear anything, or they may have taken on the role of welcoming people to the conference.
This takes us on to some associated issues. The diagrams we have shown here take some explaining – they probably weren't clear straight away to anyone in the room. So we have to think about ways of presenting information clearly. We have developed these analytics to support learners – but there are issues of presentation here. Our aim is not to make the person ringed in green feel smug, and the person ringed in red feel discouraged. So we need to develop use cases and consider how these analytics can best be used. And we see the potential for participatory design. Involving educators and learners in considering the data and what should be presented could help them with their learning and teaching. So there is plenty of scope for working with new groups of people to develop this work.
In the meantime, a brief reflection on working in the middle space. We were linking several very different traditions here – right at the point where learning meets analytics. It has brought us together across disciplines, and it has brought Zhongyu / Joey halfway round the world to work with us. It has only been through presenting and explaining our ideas to each other that we have come to be clear about what we are doing, how and why. I wanted to share here a sense of what this presentation looked like only a few days ago – full of questions and responses. I've found it encouraging to find what we can achieve by being cross-disciplinary, and I think it isn't just about doing the research; it's also about finding and developing a shared language for doing the research, and it's about understanding that things that seem basic and obvious in one discipline seem complex and difficult in another. We also need to be able to present to an audience from different disciplines.
In conclusion – this is a summary of what we have done.
The four of us worked together on this. If you would like to see a presentation on this work with more of an emphasis on computational linguistics, on the equations and the methods, I encourage you to watch the webinar that Zhongyu recorded as part of the SoLAR Storm doctoral series. Simon and I are both here and happy to talk about this work – if you have technical questions that we can't answer, then Zhongyu and Yulan are the people to contact.