Detection of Online Learning Activity
Scopes
Syeda Sana E Zainab
Mathieu D’Aquin
Overview
• Introduction
• Related Work
• Motivation
• Methodology
 Enrichment
 Clustering
 Cluster Selection and Labeling
• Application in two case studies
• Conclusion and Future Work
This work has received funding from the European Union's Horizon 2020 research and innovation programme as part of the AFEL
(Analytics for Everyday Learning) project under grant agreement No 687916, and supported by the Insight Centre for Data Analytics,
funded by Science Foundation Ireland.
2
Introduction: Online Learning
Online learning, often called “eLearning", is a new class of learning that has been
flourishing during the last ten years, following the growing adoption of eLearning in
universities, schools, and companies.
3
Social Media
Collaboration
Blogs/Comm
unities
Webinars Online
Discussions/For
ums
Blackboards
Introduction: Online Learning Activities
Identify learning
activities and
recognize how
they participate
in the learner's
progress.
Set of online
activities carried
out by a learner, as
compared to what
is achieved on
specific online
platforms of
learning.
Learning activities
which are
contributing to the
learner's progress
are not well
detected and are
outside of these
systems and
applications.
• http://afel-project.eu
• https://didactalia.net
4
Challenges
• The analysis of online learning activities often requires systems and applications
using learning analytics and data mining approaches.
In the AFEL* (Analytics for Everyday Learning),
we aim to address this challenge by defining a
process for detecting online learning activities
enriched with specific “topic-based” learning
trajectories and visualize them in an
application that enables potential learners to
reflect on their activities and ultimately
improve the way they focus their learning.
Target
Testbed
Related Work
• Track user progress
on certain areas
(games, music,
document writing etc)
or communities
(Players, Moodles
etc).
• Sense and collect user’s
activities data such as
eating or exercise and
apply machine learning
algorithms to identify
activities progress.
• Data Integration
• Practical
educational
benefits for
teaching and
learning
• Detecting
learning aspects
in online
activities.
• Social
interactions
underlying the
social learning
processes
Social
Media
Learning
Activities
Analysis
Semantic
web
technologie
s with
Learning
Activities
Applications
and tools
for Learning
Activities
Analysis
Health
care
Learning
Activities
Analysis
5
Motivation: The AFEL Project
6
Jane
Motivation: The AFEL Project
7
Jane
Motivation: The AFEL Project
8
Jane
Overview
9
Method used for Identifying Learning scopes
Enrichment
10
Method used for Identifying Learning scopes
Enrichment: Named Entity Recognition
• The goal of the enrichment phase is to extract from the resources used in
learning activities information about their content, in the case such information
is not already available.
• Using named entity recognition is a common approach to this problem.
• By extracting from the textual content of the resource key entities and concepts
that are being mentioned, we can build a profile of the resources that can
potentially be used to replace the tags available in existing platforms.
11
Enrichment: Named Entity Recognition
• We use DBpedia Spotlight as an
off-the-shelf named entity recognition tool.
Advantages:
• It is open source and can be deployed locally, removing the need to rely on an
external service.
• Since it is based on Wikipedia,
in addition, the vocabulary of entities
being covered is very wide and
domain-independent, with millions
of entities being recognizable on all sorts of topics.
12
Enrichment: Generalization
• Augment the profile created by named entity recognition by adding the
categories that relate to the entities found through named entity recognition to
the profile of each resource.
• To make this efficient, it relies on an index of the “direct” categories of each
entity and on an index of the DBpedia category hierarchy.
• Each entity is then iteratively assigned categories at a higher level than its
direct ones by climbing up this hierarchy until a given level.
13
Clustering
14
Method used for Identifying Learning scopes
Clustering
There are many ways to cluster based on the kind of resource profiles we have
constructed in the previous step, which very much represent vectors of term
frequencies. However, considering the nature of the scenario in which we are
operating, we need to consider three key requirements:
15
1. The clustering cannot be static
2. The clustering needs to be incremental
3. The number of clusters cannot be fixed in advance
Clustering: Hierarchical Clustering
• Hierarchical clustering is a method by which clusters are formed by
progressively grouping items, creating a hierarchy of larger and larger
clusters.
16
Function to add a new item to an already
existing cluster
function add(item, clusterset)
if clusterset is empty then
c = new cluster([item])
insert c in cluster set
else
find c the cluster in clusterset most similar
to item
nc1 = new cluster(concat(c, item))
nc2 = new cluster(item)
p = parent(c)
remove c from children of p
add nc1 as child of p
add c as child of nc1
add nc2 as child of nc1
• We use an Euclidean distance on the
frequency vectors of terms (entities and
categories).
• The creation of new a cluster computes a
term frequency vector aggregating the
vectors of all included items using their
average. Therefore, the similarity is applied
between those aggregated vectors and the
original vector of the item to be added.
Cluster Selection and Labeling
17
Method used for Identifying Learning scopes
Cluster Selection and Labeling
• Ranking the clusters in the clusterset according to a score.
• The ranked clusterset, sorted in descending order of the score considered.
• The selection process from there can be reduced to selecting non-overlapping
clusters that are highly ranked.
• The final results of this phase are a set of selected clusters grouping similar
activities that have on average a good contribution to the coverage of the topic
of the cluster, and characterized by a vector of term frequencies from the entities
and categories in DBpedia.
• We take the simple approach to choose as label the entity or category in the
vector with the greatest weight as label for a given cluster.
18
Application: Didactalia Platform
19
Application: Learner’s Browsing History
20
Conclusion and Future Work
• We described a method that can be used to dynamically and efficiently detect
the main topics a user is learning about (the learning scopes).
• This method is currently deployed in real-case scenarios within the AFEL
project, and an evaluation is being carried out of the benefits they can generate
with respect to the learning process of the user, beyond the basic validation
presented through the two case studies described.
• The named entity recognition method employed currently, as well as the
methods to compute the learning indicators used in the applications other than
the one related to coverage, are currently designed only considering the
English language.
21
Thank you
Questions?

Detection of Online Learning Activity Scopes

  • 1.
    Detection of OnlineLearning Activity Scopes Syeda Sana E Zainab Mathieu D’Aquin
  • 2.
    Overview • Introduction • RelatedWork • Motivation • Methodology  Enrichment  Clustering  Cluster Selection and Labeling • Application in two case studies • Conclusion and Future Work This work has received funding from the European Union's Horizon 2020 research and innovation programme as part of the AFEL (Analytics for Everyday Learning) project under grant agreement No 687916, and supported by the Insight Centre for Data Analytics, funded by Science Foundation Ireland. 2
  • 3.
    Introduction: Online Learning Onlinelearning, often called “eLearning", is a new class of learning that has been flourishing during the last ten years, following the growing adoption of eLearning in universities, schools, and companies. 3 Social Media Collaboration Blogs/Comm unities Webinars Online Discussions/For ums Blackboards
  • 4.
    Introduction: Online LearningActivities Identify learning activities and recognize how they participate in the learner's progress. Set of online activities carried out by a learner, as compared to what is achieved on specific online platforms of learning. Learning activities which are contributing to the learner's progress are not well detected and are outside of these systems and applications. • http://afel-project.eu • https://didactalia.net 4 Challenges • The analysis of online learning activities often requires systems and applications using learning analytics and data mining approaches. In the AFEL* (Analytics for Everyday Learning), we aim to address this challenge by defining a process for detecting online learning activities enriched with specific “topic-based” learning trajectories and visualize them in an application that enables potential learners to reflect on their activities and ultimately improve the way they focus their learning. Target Testbed
  • 5.
    Related Work • Trackuser progress on certain areas (games, music, document writing etc) or communities (Players, Moodles etc). • Sense and collect user’s activities data such as eating or exercise and apply machine learning algorithms to identify activities progress. • Data Integration • Practical educational benefits for teaching and learning • Detecting learning aspects in online activities. • Social interactions underlying the social learning processes Social Media Learning Activities Analysis Semantic web technologie s with Learning Activities Applications and tools for Learning Activities Analysis Health care Learning Activities Analysis 5
  • 6.
    Motivation: The AFELProject 6 Jane
  • 7.
    Motivation: The AFELProject 7 Jane
  • 8.
    Motivation: The AFELProject 8 Jane
  • 9.
    Overview 9 Method used forIdentifying Learning scopes
  • 10.
    Enrichment 10 Method used forIdentifying Learning scopes
  • 11.
    Enrichment: Named EntityRecognition • The goal of the enrichment phase is to extract from the resources used in learning activities information about their content, in the case such information is not already available. • Using named entity recognition is a common approach to this problem. • By extracting from the textual content of the resource key entities and concepts that are being mentioned, we can build a profile of the resources that can potentially be used to replace the tags available in existing platforms. 11
  • 12.
    Enrichment: Named EntityRecognition • We use DBpedia Spotlight as an off-the-shelf named entity recognition tool. Advantages: • It is open source and can be deployed locally, removing the need to rely on an external service. • Since it is based on Wikipedia, in addition, the vocabulary of entities being covered is very wide and domain-independent, with millions of entities being recognizable on all sorts of topics. 12
  • 13.
    Enrichment: Generalization • Augmentthe profile created by named entity recognition by adding the categories that relate to the entities found through named entity recognition to the profile of each resource. • To make this efficient, it relies on an index of the “direct” categories of each entity and on an index of the DBpedia category hierarchy. • Each entity is then iteratively assigned categories at a higher level than its direct ones by climbing up this hierarchy until a given level. 13
  • 14.
    Clustering 14 Method used forIdentifying Learning scopes
  • 15.
    Clustering There are manyways to cluster based on the kind of resource profiles we have constructed in the previous step, which very much represent vectors of term frequencies. However, considering the nature of the scenario in which we are operating, we need to consider three key requirements: 15 1. The clustering cannot be static 2. The clustering needs to be incremental 3. The number of clusters cannot be fixed in advance
  • 16.
    Clustering: Hierarchical Clustering •Hierarchical clustering is a method by which clusters are formed by progressively grouping items, creating a hierarchy of larger and larger clusters. 16 Function to add a new item to an already existing cluster function add(item, clusterset) if clusterset is empty then c = new cluster([item]) insert c in cluster set else find c the cluster in clusterset most similar to item nc1 = new cluster(concat(c, item)) nc2 = new cluster(item) p = parent(c) remove c from children of p add nc1 as child of p add c as child of nc1 add nc2 as child of nc1 • We use an Euclidean distance on the frequency vectors of terms (entities and categories). • The creation of new a cluster computes a term frequency vector aggregating the vectors of all included items using their average. Therefore, the similarity is applied between those aggregated vectors and the original vector of the item to be added.
  • 17.
    Cluster Selection andLabeling 17 Method used for Identifying Learning scopes
  • 18.
    Cluster Selection andLabeling • Ranking the clusters in the clusterset according to a score. • The ranked clusterset, sorted in descending order of the score considered. • The selection process from there can be reduced to selecting non-overlapping clusters that are highly ranked. • The final results of this phase are a set of selected clusters grouping similar activities that have on average a good contribution to the coverage of the topic of the cluster, and characterized by a vector of term frequencies from the entities and categories in DBpedia. • We take the simple approach to choose as label the entity or category in the vector with the greatest weight as label for a given cluster. 18
  • 19.
  • 20.
  • 21.
    Conclusion and FutureWork • We described a method that can be used to dynamically and efficiently detect the main topics a user is learning about (the learning scopes). • This method is currently deployed in real-case scenarios within the AFEL project, and an evaluation is being carried out of the benefits they can generate with respect to the learning process of the user, beyond the basic validation presented through the two case studies described. • The named entity recognition method employed currently, as well as the methods to compute the learning indicators used in the applications other than the one related to coverage, are currently designed only considering the English language. 21
  • 22.