1) The document proposes an approach to assist course creators in generating or restructuring courses by exploiting text mining techniques, semantic information from DBpedia, and linking educational resources.
2) The approach was implemented as a prototype that retrieves online courses, identifies key elements from text, formulates queries to other courses, and returns related courses to help creators generate mashups.
3) Preliminary tests on 265 computer science courses showed promising results, though future work is needed to improve similarity measures and generate concept maps between related courses.
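A minimal sketch of the key-element extraction and query-formulation steps, assuming a simple frequency-based term ranker and a keyword query format; the prototype's actual similarity measures and query syntax are not specified in the summary, so everything here is illustrative:

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "on", "with", "such"}

def extract_key_terms(course_text, top_n=5):
    """Rank candidate key terms of a course description by raw frequency
    (a stand-in for the prototype's text mining step)."""
    tokens = re.findall(r"[a-z]+", course_text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return [term for term, _ in counts.most_common(top_n)]

def formulate_query(key_terms):
    """Combine key terms into a keyword query against other course indexes."""
    return " AND ".join(key_terms)

terms = extract_key_terms(
    "This course covers graph algorithms: shortest paths, spanning trees, "
    "and graph traversal algorithms such as breadth-first search."
)
print(formulate_query(terms))
```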
MULTI-LEARNING SPECIAL SESSION / EDUCON 2018 / EMADRID TEAM (eMadrid network)
- The document presents the Computer Science Curricula Ontology (LOD-CS2013), which was created based on the Body of Knowledge of the IEEE and ACM Computer Science Curricula 2013.
- LOD-CS2013 aims to improve the usability and interoperability of computer science curricula. It defines the concepts, properties, and relationships between topics in the curricula.
- A general lifecycle for developing the ontology included specifying the concepts based on the curricula, formalizing the ontology, evaluating it, and allowing for continuous improvement.
Ontology-based Semantic Approach for Learning Object Recommendation (IDES Editor)
The main focus of this paper is to apply an ontology-based approach to semantic learning object recommendation for personalized e-learning systems. Ontologies for the learner model and the learning objects, together with semantic mapping rules, are proposed. The recommender can provide individualized learning objects by taking into account learner preferences and styles, which are used to adjust or fine-tune the recommendation process. In the proposed framework, the authors demonstrate how ontologies enable machines to interpret and process learning resources in a recommendation system. The recommendation consists of four steps: semantic mapping between the learner and learning objects, preference score calculation, learning object ranking, and recommending the learning object. As a result, the most suitable, personalized learning object is recommended to the learner.
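The four-step pipeline could be sketched as follows; the attribute names and score weights are illustrative assumptions, standing in for the ontology-backed semantic mapping rules the paper actually proposes:

```python
def preference_score(learner, lo):
    """Score a learning object against learner preferences (step 2).
    Profiles are plain dicts here; an ontology-backed system would
    resolve these attributes through semantic mapping rules (step 1)."""
    score = 0.0
    if lo["media"] == learner["preferred_media"]:
        score += 0.6  # weight chosen for illustration only
    if lo["difficulty"] == learner["level"]:
        score += 0.4
    return score

def recommend(learner, learning_objects):
    """Rank learning objects (step 3) and return the best match (step 4)."""
    ranked = sorted(learning_objects,
                    key=lambda lo: preference_score(learner, lo),
                    reverse=True)
    return ranked[0]

learner = {"preferred_media": "video", "level": "beginner"}
objects = [
    {"id": "LO-1", "media": "text", "difficulty": "beginner"},
    {"id": "LO-2", "media": "video", "difficulty": "beginner"},
    {"id": "LO-3", "media": "video", "difficulty": "advanced"},
]
print(recommend(learner, objects)["id"])  # LO-2 matches both preferences
```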
The document proposes a layered model for authoring educational content with different levels of abstraction.
The lowest level is content, which can be learning objects. Above this is a structure level that provides a table of contents. Next is a task level that defines learning processes and activities. The highest level is conceptualization, which models the knowledge domain using ontologies and instructional templates.
This layered model separates concerns of content, structure, tasks and conceptualization/knowledge modeling. It aims to make authoring tools more flexible and the content more interoperable and adaptable to different contexts.
Constructing a Learner Centric Semantic Syllabus for Automatic Text Book Gen... (Aliabbas Petiwala)
The document discusses developing a semantic syllabus ontology to guide automatic textbook generation. It outlines key aspects of a learner-centric semantic syllabus such as collaborative active learning environments. The proposed ontology would represent syllabus topics and relationships to facilitate data integration and textbook customization for different learners. Future work is needed to specify content granularity and develop a book authoring tool integrated with an active learning community.
This document describes a proposed method for subontology-assisted web-based e-learning for resource management. Key points include:
1. Semantic mapping is used to integrate heterogeneous e-learning databases by mapping relational schemas to a global ontology.
2. Subontologies (SubOs) are context-specific portions of the full ontology that are evolved over time based on locality of resource reuse.
3. A SubO-based approach is used to achieve adaptive and efficient resource management and reuse by matching user requests to SubOs.
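One way to picture SubO extraction, under the simplifying assumption that the ontology can be reduced to a plain adjacency structure of concept links (the paper's SubOs also evolve over time, which this sketch omits):

```python
def extract_subo(ontology, seed_concepts, depth=1):
    """Extract a context-specific subontology (SubO): the seed concepts
    plus everything reachable within `depth` concept links. The ontology
    is a plain adjacency dict here, a simplification of a full graph."""
    subo = set(seed_concepts)
    frontier = set(seed_concepts)
    for _ in range(depth):
        frontier = {nbr for c in frontier for nbr in ontology.get(c, [])} - subo
        subo |= frontier
    return subo

ontology = {
    "database": ["sql", "index"],
    "sql": ["query_language"],
    "networks": ["tcp"],
}
print(sorted(extract_subo(ontology, {"database"}, depth=2)))
```

Matching a user request to a SubO would then amount to comparing the request's concepts against each stored SubO's concept set.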
An effective method for semi automatic construction of domain module from ele... (eSAT Journals)
The document describes a method for semi-automatically generating a domain module from electronic textbooks. It uses techniques like natural language processing, ontologies, and heuristic reasoning. The domain module captures knowledge at two levels: a Learning Domain Ontology that represents topics and relationships between them, and a set of Learning Objects containing educational resources. The method involves preprocessing the textbook, analyzing its outline to generate an initial LDO, analyzing the full text to expand the LDO, and extracting Learning Objects. It was tested on an electronic textbook and the automatically generated knowledge was compared to a manually created domain module.
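The outline-analysis step that produces the initial LDO can be sketched as an indentation parser over a textbook's table of contents; the relation name `isPartOf` is an assumption for illustration, not necessarily the paper's vocabulary:

```python
def outline_to_ldo(outline_lines):
    """Turn an indented textbook outline into (subtopic, 'isPartOf', topic)
    relations, the kind of initial Learning Domain Ontology the method
    derives before expanding it from the full text."""
    triples, stack = [], []  # stack holds (indent, title) of open sections
    for line in outline_lines:
        indent = len(line) - len(line.lstrip())
        title = line.strip()
        while stack and stack[-1][0] >= indent:
            stack.pop()  # close sections at the same or deeper level
        if stack:
            triples.append((title, "isPartOf", stack[-1][1]))
        stack.append((indent, title))
    return triples

outline = [
    "Operating Systems",
    "  Processes",
    "    Scheduling",
    "  Memory Management",
]
for t in outline_to_ldo(outline):
    print(t)
```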
This presentation discusses the current dilemma with respect to Open Educational Resources (OER) search. It introduces existing OER search methodologies and highlights their weaknesses. The Desirability framework for parametrically measuring the usefulness of an OER is also discussed. The desirability framework uses the D-index to measure the openness, accessibility and relevance of an OER. OERScout, a text mining based faceted search engine is introduced for improved OER search. It uses autonomously identified domain specific keywords, the D-index and faceted search to allow focused OER search.
This document discusses reusable learning objects (RLOs), which are small interactive e-learning modules designed to teach standalone learning objectives. RLOs can take various forms like animations, simulations, puzzles or quizzes. The document outlines factors to consider when designing RLOs like the subject matter, delivery method, and archiving approach. It also compares technologies for building RLOs such as Flash, JavaScript, Java applets, .NET, and third-party e-learning software. Java applets and .NET are highlighted as options that allow building complex interactive content that can be archived and delivered online or through other means.
A PROPOSED MULTI-DOMAIN APPROACH FOR AUTOMATIC CLASSIFICATION OF TEXT DOCUMENTS (ijsc)
Classification is an important technique in information retrieval. Supervised classification suffers from limitations concerning the collection and labeling of the training dataset, and multi-domain classification requires multiple training datasets and classifiers, which is relatively difficult. This paper proposes an unsupervised classification system that can also handle the multi-domain problem. Each domain is represented by an ontology, and a document is mapped onto each ontology based on the weights of the tokens they share, with the help of fuzzy sets, yielding a mapping degree between the document and each domain. An experiment showed satisfactory classification results, with an improvement over Apache Lucene in the evaluation.
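The mapping degree could look roughly like this, assuming per-token ontology weights normalised by the ontology's total weight; the paper's exact fuzzy membership functions may differ:

```python
def mapping_degree(doc_tokens, ontology_weights):
    """Fuzzy mapping degree of a document onto one domain ontology:
    the summed weights of tokens shared between document and ontology,
    normalised to [0, 1] by the ontology's total weight."""
    shared = set(doc_tokens) & set(ontology_weights)
    total = sum(ontology_weights.values())
    return sum(ontology_weights[t] for t in shared) / total if total else 0.0

ontologies = {
    "medicine": {"patient": 0.9, "dose": 0.7, "trial": 0.4},
    "finance": {"market": 0.9, "bond": 0.8, "trial": 0.2},
}
doc = ["the", "patient", "received", "a", "dose", "in", "the", "trial"]
degrees = {d: mapping_degree(doc, w) for d, w in ontologies.items()}
print(max(degrees, key=degrees.get))  # medicine
```

The document is then assigned to (or ranked against) the domain with the highest degree, which is what makes the approach multi-domain without any labeled training data.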
The document discusses research at UNED on enhancing authoring, modeling, and collaboration in e-learning environments. It outlines tools like ENLACE, PELICAN, and CARDS that allow for modeling and aggregation of educational content and learning objects from external tools. CARDS provides a metamodel for integrating outcomes from tools like concept mapping tools. PELICAN is an instructional design and collaboration platform that integrates external tools and supports adaptive contexts based on student performance. The research aims to automatically extract and classify learning objects from the web to enrich authored content and help students find relevant resources.
Using patterns to design technology enhanced learning scenarios (eLearning Papers)
This document discusses using design patterns to represent technology-enhanced learning scenarios. It reviews different mechanisms used to represent learning design issues, such as hypermedia models, ontologies, and educational modeling languages. The author proposes an approach using design patterns as they combine narrative representation with visualization and controlled vocabularies. Patterns are prepared by classifying them into categories like content managers, activity facilitators, and assessment producers. They are then applied to represent a specific learning scenario based on digital ink technologies.
Ontology learning techniques and applications computer science thesis writing... (Tutors India)
Workshop on Learning Technology Standards for Agriculture and Rural Development (AgroLT 2008)
September 19, 2008, Athens, Greece
In conjunction with
4th International Conference on Information and Communication Technologies in Bio and Earth Sciences (HAICTA 2008)
A Survey on Text Mining: techniques and applications (Ryota Eisaki)
This document summarizes text mining techniques and applications. It discusses text mining processes like document gathering, pre-processing, transformation, feature selection, and pattern selection. It also describes text mining techniques including categorization, clustering, information extraction, information visualization, and natural language processing. Finally, it outlines applications of text mining in various domains such as business intelligence, bioinformatics, security, human resources, and web search enhancement.
ONTOLOGY VISUALIZATION PROTÉGÉ TOOLS – A REVIEW (ijait)
The document discusses ontology visualization tools in Protégé. It reviews four main visualization methods used in Protégé tools: indented list, node-link and tree, zoomable, and focus+context. It then examines specific Protégé tools that use each method, including their key features and limitations. The tools assessed are Protégé Class Browser (indented list), Protégé OntoViz and OntoSphere (node-link and tree), Jambalaya (zoomable), and Protégé TGVizTab (focus+context). The document concludes by summarizing and comparing the visualization characteristics of these Protégé tools.
Portable and Synchronized Distributed Learning Management System in Severe Ne... (Fajar Purnama)
Limited master's thesis defense of Fajar Purnama, Graduate School of Science and Technology, Human Interface and Cyber Communication Laboratory.
Video: https://youtu.be/i8AERku88u8
Masters Thesis: https://www.publish0x.com/fajar-purnama-academics/portable-and-synchronized-distributed-learning-management-sy-xyvdwoz?a=4oeEw0Yb0B&tid=slideshare
A New Concept Extraction Method for Ontology Construction From Arabic Text (CSCJournals)
Ontology is one of the most popular representation models for knowledge representation, sharing, and reuse. The Arabic language has complex morphological, grammatical, and semantic aspects, which make automatic Arabic terminology extraction difficult. Concept extraction from Arabic documents is a particularly challenging research area because, as opposed to term extraction, concept extraction is more domain-related and more selective. This paper presents a new concept extraction method for Arabic ontology construction, part of the authors' ontology construction framework. The method extracts domain-relevant single- and multi-word concepts by combining linguistic information, statistical information, and domain knowledge. It first uses linguistic patterns based on POS tags to extract concept candidates, then applies a stop-word filter to remove unwanted strings. To determine the relevance of these candidates within the domain, several statistical measures and a new domain relevance measure are implemented for the first time for Arabic. Domain knowledge is integrated into the module to further improve extraction, and concept scores are calculated from both statistical and domain-knowledge values. Precision scores computed in the evaluation show the high effectiveness of the proposed approach for extracting concepts for Arabic ontology construction.
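The linguistic-pattern step can be illustrated on English with universal POS tags; the paper targets Arabic, so this sketch conveys only the general idea of pattern-based candidate extraction followed by stop-word filtering:

```python
def extract_candidates(tagged_tokens, stopwords):
    """Collect contiguous adjective/noun runs ending in a noun as single-
    and multi-word concept candidates (the linguistic-pattern step),
    splitting runs at stop words (the stop-word filter)."""
    candidates, run = [], []
    for word, tag in tagged_tokens + [("", "END")]:
        if tag in ("ADJ", "NOUN") and word.lower() not in stopwords:
            run.append((word, tag))
        else:
            # a valid candidate must end in a noun
            while run and run[-1][1] != "NOUN":
                run.pop()
            if run:
                candidates.append(" ".join(w for w, _ in run))
            run = []
    return candidates

tagged = [("automatic", "ADJ"), ("term", "NOUN"), ("extraction", "NOUN"),
          ("is", "VERB"), ("a", "DET"), ("hard", "ADJ"), ("problem", "NOUN")]
print(extract_candidates(tagged, {"a", "is"}))
# ['automatic term extraction', 'hard problem']
```

In the paper's pipeline these candidates would then be scored with the statistical and domain-relevance measures before being admitted to the ontology.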
This document summarizes a workshop on data integration using ontologies. It discusses how data integration is challenging due to differences in schemas, semantics, measurements, units and labels across data sources. It proposes that ontologies can help with data integration by providing definitions for schemas and entities referred to in the data. Core challenges discussed include dealing with multiple synonyms for entities and relationships between biological entities that depend on context. The document advocates for shared community ontologies that can be extended and integrated to facilitate flexible and responsive data integration across multiple sources.
This summarizes an academic paper that proposes an automatic ontology creation method for classifying research papers. It uses text mining techniques like classification and clustering algorithms. It first builds a research ontology by extracting keywords and patterns from previous papers. It then uses a decision tree algorithm to classify new papers into disciplines defined in the ontology. The classified papers are then clustered based on similarities to group them. The method was tested on a dataset of 100 papers and achieved average precision of 85.7% for term-based and 89.3% for pattern-based keyword extraction.
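A simplified stand-in for the classification step, replacing the decision-tree algorithm with plain keyword overlap against the research ontology's disciplines (the discipline names and keywords are invented for illustration):

```python
def classify(paper_keywords, ontology):
    """Assign a paper to the discipline whose ontology keywords it
    overlaps most; a deliberately simplified stand-in for the paper's
    decision-tree classifier over extracted keywords and patterns."""
    return max(ontology, key=lambda d: len(ontology[d] & set(paper_keywords)))

ontology = {
    "databases": {"sql", "query", "index", "transaction"},
    "networks": {"router", "packet", "latency", "protocol"},
}
print(classify(["query", "index", "optimizer"], ontology))  # databases
```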
Possibility of interdisciplinary research software engineering and natural lan... (Nakul Sharma)
This document discusses the possibility of interdisciplinary research between software engineering and natural language processing. It provides a literature review of research papers from 2003 to 2014 related to applying tools and techniques from one field to the other. Some key areas discussed include generating UML diagrams from natural language text, developing ontologies to clarify meanings, and potential issues with joint research like determining complexity of sentences. The document proposes a flowchart for how artifacts could be analyzed using tasks from either field to enable interdisciplinary research.
Class Diagram Extraction from Textual Requirements Using NLP Techniques (iosrjce)
Mining Users Rare Sequential Topic Patterns from Tweets based on Topic Extrac... (IRJET Journal)
This paper proposes a method to mine rare sequential topic patterns (URSTPs) from tweet data. It involves preprocessing tweets to extract topics, identifying user sessions, generating sequential topic pattern (STP) candidates, and selecting URSTPs based on rarity analysis. Experiments show the approach can identify special users and interpretable URSTPs, indicating users' characteristics. The paper aims to capture personalized and abnormal user behaviors through sequential relationships between extracted topics from successive tweets.
The increasing potential of ontologies to reduce human interference has a wide range of applications. This paper identifies the requirements for an ontology development platform to support an artificially intelligent web. To facilitate this process, RDF and OWL have been developed as standard formats for sharing and integrating data and knowledge, where knowledge takes the form of rich conceptual schemas called ontologies. Based on this framework, an architectural paradigm is put forward for ontology engineering and the development of ontology applications, together with a development portal designed to support ontology engineering, content authoring, and application development, with a view to maximal scalability in the size and complexity of semantic knowledge and flexible reuse of ontology models and ontology application processes in a distributed, collaborative engineering environment.
Ontology Based Approach for Semantic Information Retrieval System (IJTET Journal)
Abstract—Information retrieval plays an important role in current search engines, which perform keyword-based searches that return an enormous amount of data from which the user cannot pick out the essential, most important information. This limitation may be overcome by the semantic web, whose conceptual (semantic) search technique replaces keyword-based search. Natural language processing is typically used in a QA system to interpret a user's question, which is converted through several steps into a query that retrieves an exact answer. In conceptual search, the engine interprets the meaning of the user's query and the relations among the concepts a document contains with respect to a particular domain, producing specific answers instead of lists of results. This paper proposes an ontology-based semantic information retrieval system built on the Jena semantic web framework: the user's input query is parsed by the Stanford Parser, a triplet extraction algorithm is applied, and a SPARQL query is formed and fired against the knowledge base (ontology), which finds the appropriate RDF triples and retrieves the relevant information through Jena.
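Forming the SPARQL query from an extracted triplet might look like this; the `ex:` prefix and property names are placeholders, not the system's actual knowledge-base vocabulary:

```python
def triple_to_sparql(subject, predicate):
    """Build a SPARQL query that retrieves the object of one extracted
    (subject, predicate, ?) triplet. The prefix and naming scheme are
    illustrative assumptions."""
    return (
        "PREFIX ex: <http://example.org/kb#>\n"
        "SELECT ?answer WHERE {\n"
        f"  ex:{subject} ex:{predicate} ?answer .\n"
        "}"
    )

print(triple_to_sparql("Python", "createdBy"))
```

In a Jena-based system, a query string like this would be executed against the ontology model to pull back the matching RDF triples.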
In tech application-of_data_mining_technology_on_e_learning_material_recommen... (Enhmandah Hemeelee)
The document describes a recommendation system that applies data mining techniques to recommend e-learning materials. It proposes using LDAP for fast searching of materials across systems, JAXB for parsing content, and association rule mining (the Apriori algorithm) plus collaborative filtering for recommendations. A web spider collects content indexes from learning management systems and stores the data in an LDAP directory. Users can search for related materials, and the system mines log data to associate frequently searched terms and recommend additional resources.
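An association-rule pass over search-session logs could be sketched as follows; the single support-counting pass shown is a simplification of full Apriori-style mining, and the session data is invented:

```python
from itertools import combinations
from collections import Counter

def frequent_pairs(sessions, min_support):
    """One pass of support counting over search sessions: count term
    pairs and keep those meeting the minimum support, the basis for
    'users who searched X also searched Y' recommendations."""
    counts = Counter()
    for session in sessions:
        for pair in combinations(sorted(set(session)), 2):
            counts[pair] += 1
    return {pair: c for pair, c in counts.items() if c >= min_support}

sessions = [
    ["sql", "normalization", "index"],
    ["sql", "index"],
    ["sql", "normalization"],
]
print(frequent_pairs(sessions, min_support=2))
```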
Comparative evaluation of four multi label classification algorithms in class... (csandit)
The classification of learning objects (LOs) enables users to search for, access, and reuse them as needed, making e-learning as effective and efficient as possible. This article presents a multi-label learning approach for classifying and ranking multi-labelled LOs, where each LO may be associated with multiple labels as opposed to a single label. It gives a comprehensive overview of the common fundamental multi-label classification algorithms and metrics, creates a new multi-labelled LO dataset extracted from the ARIADNE Learning Object Repository, experimentally trains four effective multi-label classifiers on it, and assesses their performance against 16 evaluation metrics, answering the question: what is the best multi-label classification algorithm for classifying multi-labelled LOs?
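One of the standard multi-label metrics such a comparison would likely include is Hamming loss, sketched here on toy label sets (the labels are invented, not from the ARIADNE dataset):

```python
def hamming_loss(true_labels, predicted_labels, all_labels):
    """Hamming loss for multi-label classification: the fraction of
    label assignments, over every object and every label, on which the
    prediction disagrees with the ground truth."""
    errors = 0
    for t, p in zip(true_labels, predicted_labels):
        errors += len(set(t) ^ set(p))  # symmetric difference = disagreements
    return errors / (len(true_labels) * len(all_labels))

labels = ["math", "physics", "cs"]
y_true = [{"math", "cs"}, {"physics"}]
y_pred = [{"math"}, {"physics", "cs"}]
print(hamming_loss(y_true, y_pred, labels))  # 2 mismatches / 6 = 0.333...
```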
This presentation discusses the current dilemma with respect to Open Educational Resources (OER) search. It introduces existing OER search methodologies and highlights their weaknesses. The Desirability framework for parametrically measuring the usefulness of an OER is also discussed. The desirability framework uses the D-index to measure the openness, accessibility and relevance of an OER. OERScout, a text mining based faceted search engine is introduced for improved OER search. It uses autonomously identified domain specific keywords, the D-index and faceted search to allow focused OER search.
This document discusses reusable learning objects (RLOs), which are small interactive e-learning modules designed to teach standalone learning objectives. RLOs can take various forms like animations, simulations, puzzles or quizzes. The document outlines factors to consider when designing RLOs like the subject matter, delivery method, and archiving approach. It also compares technologies for building RLOs such as Flash, JavaScript, Java applets, .NET, and third-party e-learning software. Java applets and .NET are highlighted as options that allow building complex interactive content that can be archived and delivered online or through other means.
A PROPOSED MULTI-DOMAIN APPROACH FOR AUTOMATIC CLASSIFICATION OF TEXT DOCUMENTSijsc
Classification is an important technique used in information retrieval. Supervised classification suffers
from certain limitations concerning the collection and labeling of the training dataset. When facing Multi-
Domain classification, multiple training datasets and classifiers are needed which is relatively difficult. In
this paper an unsupervised classification system is proposed that can manage the Multi-Domain
classification problem as well. It is a multi-domain system where each domain represented by an ontology.
A document is mapped on each ontology based on the weights of the mutual tokens between them with the
help of fuzzy sets, resulting in a mapping degree of the document with each domain. An experiment carried
out showing satisfying classification results with an improvement in the evaluation results of the proposed
system compared to Apache Lucene.
The document discusses research at UNED on enhancing authoring, modeling, and collaboration in e-learning environments. It outlines tools like ENLACE, PELICAN, and CARDS that allow for modeling and aggregation of educational content and learning objects from external tools. CARDS provides a metamodel for integrating outcomes from tools like concept mapping tools. PELICAN is an instructional design and collaboration platform that integrates external tools and supports adaptive contexts based on student performance. The research aims to automatically extract and classify learning objects from the web to enrich authored content and help students find relevant resources.
Using patterns to design technology enhanced learning scenarioseLearning Papers
This document discusses using design patterns to represent technology-enhanced learning scenarios. It reviews different mechanisms used to represent learning design issues, such as hypermedia models, ontologies, and educational modeling languages. The author proposes an approach using design patterns as they combine narrative representation with visualization and controlled vocabularies. Patterns are prepared by classifying them into categories like content managers, activity facilitators, and assessment producers. They are then applied to represent a specific learning scenario based on digital ink technologies.
Ontology learning techniques and applications computer science thesis writing...Tutors India
At Tutors India, we offer Computer science and Information Technology Research Guidance services – We deliver exceptional work where your dissertation will deserve publication without significant reworking or alternation.
For #Enquiry
https://www.tutorsindia.com
info@tutorsindia.com
(Whatsapp): +91-8754446690
(UK): +44-1143520021
Workshop on Learning Technology Standards for Agriculture and Rural Development (AgroLT 2008)
September 19, 2008, Athens, Greece
In conjunction with
4th International Conference on Information and Communication Technologies in Bio and Earth Sciences (HAICTA 2008)
A Survey on Text Mining-techniques and applicationRyota Eisaki
This document summarizes text mining techniques and applications. It discusses text mining processes like document gathering, pre-processing, transformation, feature selection, and pattern selection. It also describes text mining techniques including categorization, clustering, information extraction, information visualization, and natural language processing. Finally, it outlines applications of text mining in various domains such as business intelligence, bioinformatics, security, human resources, and web search enhancement.
ONTOLOGY VISUALIZATION PROTÉGÉ TOOLS – A REVIEW ijait
The document discusses ontology visualization tools in Protégé. It reviews four main visualization methods used in Protégé tools: indented list, node-link and tree, zoomable, and focus+context. It then examines specific Protégé tools that use each method, including their key features and limitations. The tools assessed are Protégé Class Browser (indented list), Protégé OntoViz and OntoSphere (node-link and tree), Jambalaya (zoomable), and Protégé TGVizTab (focus+context). The document concludes by summarizing and comparing the visualization characteristics of these Protégé tools.
Portable and Synchronized Distributed Learning Management System in Severe Ne...Fajar Purnama
Master thesis defense limited of Fajar Purnama, Graduate School of Science and Technology, Human Interface and Cyber Communication Laboratory.
Video: https://youtu.be/i8AERku88u8
Masters Thesis: https://www.publish0x.com/fajar-purnama-academics/portable-and-synchronized-distributed-learning-management-sy-xyvdwoz?a=4oeEw0Yb0B&tid=slideshare
A New Concept Extraction Method for Ontology Construction From Arabic TextCSCJournals
Ontology is one of the most popular representation models used for knowledge representation, sharing, and reuse. The Arabic language has complex morphological, grammatical, and semantic aspects, and due to this complexity, automatic Arabic terminology extraction is difficult. In addition, concept extraction from Arabic documents has been a challenging research area because, as opposed to term extraction, concept extraction is more domain-related and more selective. In this paper, we present a new concept extraction method for Arabic ontology construction, which is part of our ontology construction framework. A new method to extract domain-relevant single- and multi-word concepts has been proposed, implemented, and evaluated. Our method combines linguistic information, statistical information, and domain knowledge. It first uses linguistic patterns based on POS tags to extract concept candidates, and then applies a stop-word filter to remove unwanted strings. To determine the relevance of these candidates within the domain, several statistical measures and a new domain-relevance measure are implemented for the first time for the Arabic language. To enhance the performance of concept extraction, domain knowledge is integrated into the module. The concept scores are calculated according to their statistical values and domain-knowledge values. To evaluate the performance of the method, precision scores were calculated. The results show the high effectiveness of the proposed approach for extracting concepts for Arabic ontology construction.
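The statistical-relevance step described here can be sketched under simplifying assumptions: an English toy corpus stands in for Arabic, plain stop-word filtering stands in for the POS patterns, and the relevance measure below is a generic domain-to-general frequency ratio, not the paper's exact measure:

```python
from collections import Counter

STOP_WORDS = {"the", "of", "and", "a", "in", "uses"}  # toy stop list, illustration only

def candidates(tokens):
    """Candidate step: keep tokens that survive the stop-word filter.
    (The paper's method uses POS-tag patterns; plain filtering stands in here.)"""
    return [t for t in tokens if t not in STOP_WORDS]

def domain_relevance(term, domain_counts, general_counts):
    """Relevance step: a term is domain-relevant if it is much more frequent
    in the domain corpus than in a general reference corpus (add-one smoothing)."""
    return (domain_counts[term] + 1) / (general_counts[term] + 1)

domain = Counter(candidates("ontology construction uses ontology concepts".split()))
general = Counter(candidates("the cat sat in the garden".split()))
score = domain_relevance("ontology", domain, general)
print(score)
```

A real implementation would combine several such measures with the domain-knowledge score before ranking candidates.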
This document summarizes a workshop on data integration using ontologies. It discusses how data integration is challenging due to differences in schemas, semantics, measurements, units and labels across data sources. It proposes that ontologies can help with data integration by providing definitions for schemas and entities referred to in the data. Core challenges discussed include dealing with multiple synonyms for entities and relationships between biological entities that depend on context. The document advocates for shared community ontologies that can be extended and integrated to facilitate flexible and responsive data integration across multiple sources.
This summarizes an academic paper that proposes an automatic ontology creation method for classifying research papers. It uses text mining techniques like classification and clustering algorithms. It first builds a research ontology by extracting keywords and patterns from previous papers. It then uses a decision tree algorithm to classify new papers into disciplines defined in the ontology. The classified papers are then clustered based on similarities to group them. The method was tested on a dataset of 100 papers and achieved average precision of 85.7% for term-based and 89.3% for pattern-based keyword extraction.
Possibility of interdisciplinary research software engineering andnatural lan...Nakul Sharma
This document discusses the possibility of interdisciplinary research between software engineering and natural language processing. It provides a literature review of research papers from 2003 to 2014 related to applying tools and techniques from one field to the other. Some key areas discussed include generating UML diagrams from natural language text, developing ontologies to clarify meanings, and potential issues with joint research like determining complexity of sentences. The document proposes a flowchart for how artifacts could be analyzed using tasks from either field to enable interdisciplinary research.
Class Diagram Extraction from Textual Requirements Using NLP Techniquesiosrjce
IOSR Journal of Computer Engineering (IOSR-JCE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
Mining Users Rare Sequential Topic Patterns from Tweets based on Topic Extrac...IRJET Journal
This paper proposes a method to mine rare sequential topic patterns (URSTPs) from tweet data. It involves preprocessing tweets to extract topics, identifying user sessions, generating sequential topic pattern (STP) candidates, and selecting URSTPs based on rarity analysis. Experiments show the approach can identify special users and interpretable URSTPs, indicating users' characteristics. The paper aims to capture personalized and abnormal user behaviors through sequential relationships between extracted topics from successive tweets.
The potential of ontologies to reduce human intervention has a wide range of applications. This paper identifies requirements for an ontology development platform to support an artificially intelligent web. To facilitate this process, RDF and OWL have been developed as standard formats for sharing and integrating data and knowledge, where knowledge takes the form of rich conceptual schemas called ontologies. Based on this framework, an architectural paradigm is put forward for ontology engineering and the development of ontology applications, together with a development portal designed to support ontology engineering, content authoring, and application development. The portal aims at maximal scalability in the size and complexity of semantic knowledge and at flexible reuse of ontology models and ontology application processes in a distributed and collaborative engineering environment.
Ontology Based Approach for Semantic Information Retrieval SystemIJTET Journal
Abstract—The Information retrieval system is taking an important role in current search engine which performs searching operation based on keywords which results in an enormous amount of data available to the user, from which user cannot figure out the essential and most important information. This limitation may be overcome by a new web architecture known as the semantic web which overcome the limitation of the keyword based search technique called the conceptual or the semantic search technique. Natural language processing technique is mostly implemented in a QA system for asking user’s questions and several steps are also followed for conversion of questions to the query form for retrieving an exact answer. In conceptual search, search engine interprets the meaning of the user’s query and the relation among the concepts that document contains with respect to a particular domain that produces specific answers instead of showing lists of answers. In this paper, we proposed the ontology based semantic information retrieval system and the Jena semantic web framework in which, the user enters an input query which is parsed by Standford Parser then the triplet extraction algorithm is used. For all input queries, the SPARQL query is formed and further, it is fired on the knowledge base (Ontology) which finds appropriate RDF triples in knowledge base and retrieve the relevant information using the Jena framework.
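The triplet-to-SPARQL step this abstract describes can be sketched minimally; the `ex:` prefix, the property names, and the exact query shape are illustrative assumptions, not taken from the paper:

```python
def triple_to_sparql(subject: str, predicate: str, obj: str) -> str:
    """Build a SPARQL SELECT query from an extracted (subject, predicate,
    object) triple; slots marked '?' become query variables."""
    s = subject if subject != "?" else "?s"
    p = predicate if predicate != "?" else "?p"
    o = obj if obj != "?" else "?o"
    variables = " ".join(v for v in (s, p, o) if v.startswith("?")) or "*"
    return f"SELECT {variables} WHERE {{ {s} {p} {o} . }}"

# A question like "Who teaches Databases?" parses to a triple with an
# unknown subject; the names below are hypothetical.
query = triple_to_sparql("?", "ex:teaches", "ex:Databases")
print(query)
```

In the paper's pipeline, a framework such as Jena would then execute this query against the ontology knowledge base.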
In tech application-of_data_mining_technology_on_e_learning_material_recommen...Enhmandah Hemeelee
The document describes a recommendation system that applies data mining techniques to recommend e-learning materials. It proposes using LDAP for fast searching of materials across systems, JAXB for parsing content, and association rule mining and collaborative filtering for recommendations. A web spider collects content indexes from learning management systems and stores data in an LDAP directory. Users can search for related materials, and the system mines log data to associate frequently searched terms and recommend additional resources.
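The association-rule step can be illustrated with the first Apriori-style pass: counting co-occurring search terms across user sessions and keeping pairs that meet a minimum support. The session data below is invented for illustration, and this sketch covers only the pair-counting pass, not the full algorithm:

```python
from itertools import combinations
from collections import Counter

def frequent_pairs(sessions, min_support):
    """Count term pairs that co-occur within a session and keep those
    meeting the minimum support threshold (first Apriori pass)."""
    counts = Counter()
    for session in sessions:
        for pair in combinations(sorted(set(session)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

# Hypothetical search sessions mined from system logs.
sessions = [
    ["sql", "normalization"],
    ["sql", "normalization", "indexing"],
    ["sql", "indexing"],
]
rules = frequent_pairs(sessions, min_support=2)
print(rules)
```

Frequent pairs like these are what the system would use to recommend additional materials when a user searches for one of the associated terms.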
Comparative evaluation of four multi label classification algorithms in class...csandit
The classification of learning objects (LOs) enables users to search for, access, and reuse them as needed, making e-learning as effective and efficient as possible. In this article the multi-label learning approach is presented for classifying and ranking multi-labelled LOs, where each LO may be associated with multiple labels, as opposed to a single-label approach. A comprehensive overview of the common fundamental multi-label classification algorithms and metrics is given. A new multi-labelled LOs dataset is created and extracted from the ARIADNE Learning Object Repository. We experimentally train four effective multi-label classifiers on the created LOs dataset and then assess their performance based on the results of 16 evaluation metrics. The result of this article answers the question: what is the best multi-label classification algorithm for classifying multi-labelled LOs?
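One metric that multi-label comparisons like this typically include is Hamming loss, the fraction of per-label decisions that are wrong. A minimal sketch follows; the label set and predictions are toy values, not from the ARIADNE dataset:

```python
def hamming_loss(true_labels, predicted_labels, label_space):
    """Hamming loss for multi-label classification: the fraction of
    (instance, label) decisions that are wrong, over all instances and
    all labels in the label space."""
    errors = 0
    for true, pred in zip(true_labels, predicted_labels):
        # Symmetric difference counts both missed and spurious labels.
        errors += len(set(true) ^ set(pred))
    return errors / (len(true_labels) * len(label_space))

labels = ["math", "cs", "physics"]
y_true = [{"math", "cs"}, {"physics"}]
y_pred = [{"math"}, {"physics", "cs"}]
loss = hamming_loss(y_true, y_pred, labels)
print(loss)
```

Lower is better: a perfect classifier scores 0, and here one missed label plus one spurious label over six decisions gives 1/3.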
Although the use of semantic web technologies in the learning-development field is a new research area, several authors have already proposed ideas of how such systems could operate. From an analysis of the literature in the field, we have identified three types of existing applications that employ these technologies to support learning. These applications aim at: enhancing the reusability of learning objects by linking them to an ontological description of the domain (or, more generally, describing relevant dimensions of the learning process in an ontology); providing a comprehensive authoring system to retrieve and organize web material into a learning course; and constructing advanced strategies to present annotated resources to the user, in the form of browsing facilities, narrative generation, and final rendering of a course. In contrast to the approaches cited above, we propose an approach modeled on narrative studies and on their transposition to the digital world. In the rest of the paper, we present the theoretical basis that inspires this approach and show some examples that are guiding our implementation and testing of these ideas within e-learning. Ontologies are recognized as the most important component in achieving semantic interoperability of e-learning resources, and the benefits of their use have already been acknowledged in the learning-technology community. In order to better define the different aspects of ontology applications in e-learning, researchers have proposed several classifications of ontologies. We refer to a general one that differentiates between three dimensions ontologies can describe: content, context, and structure. Most present research has been dedicated to the first group of ontologies.
A well-known example of such an ontology is based on the ACM Computing Classification System (ACM CCS) and defined in Resource Description Framework Schema (RDFS). It is used in Moodle to classify learning objects with the goal of improving search. The chapter covers the semantic web, e-learning system design and management in e-learning (Moodle), studies on e-learning and the semantic web, the tools used in this paper, and finally the expected contribution. Special attention is paid to the topics above.
This document discusses applying semantic web technologies to enhance the services of e-learning systems. It proposes developing a semantic learning management system (S-LMS) based on technologies like XML, RDF, OWL and SPARQL to automate and make more accurate the search for information on e-learning systems like Moodle. The S-LMS would add semantic capabilities to allow students to search for learning resources based on semantics and provide personalized, customized content tailored to individual needs. It applies ontologies and metadata to Moodle in order to define domains and describe learning content in a way that improves search, interoperability and reusability of educational resources.
This work presents a data architecture based on semantic web technologies that supports the inclusion of open materials in massive online courses. The framework provides transparent access to RDF data sources for Open Educational Resources stored in OpenCourseWare repositories.
Speaker(s): Nelson Piedra and Edmundo Tovar
Profile-based Dataset Recommendation for RDF Data Linking Mohamed BEN ELLEFI
This document summarizes Mohamed Ben Ellefi's PhD thesis defense on profile-based dataset recommendation for RDF data linking. The thesis proposes two approaches: a topic profile-based approach and an intensional profile-based approach. The topic profile-based approach models datasets as topics and recommends target datasets based on similarity between source and target topic profiles, achieving an average recall of 81% and reducing the search space by 86%. The approach shows better performance than baselines but needs improvement on precision.
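The recall figure quoted in this summary can be computed in the standard way; a minimal sketch with invented dataset identifiers:

```python
def recall(recommended, relevant):
    """Recall of a dataset recommendation list: the share of truly
    relevant target datasets that appear among the recommendations."""
    if not relevant:
        return 0.0
    return len(set(recommended) & set(relevant)) / len(relevant)

# Hypothetical example: three recommended datasets, two actually relevant.
r = recall(["d1", "d2", "d3"], ["d1", "d4"])
print(r)
```

Precision would be computed symmetrically (dividing by the number of recommendations), which is where the thesis reports room for improvement.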
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...IJwest
This document describes a semantic-based approach for knowledge discovery and information extraction from multiple web pages using ontologies. It presents a model for storing web content in an organized, structured RDF format. Information extraction techniques and developed ontologies can then discover new knowledge with minimal time compared to manual efforts. The paper details two experiments applying this approach. Experiment 1 extracts staff profiles from web pages into RDF, discovering related research colleagues. Experiment 2 extracts student data from HTML tables into XML/RDF, enabling faster querying and analysis versus manual parsing. The approach effectively organizes unstructured web data for knowledge inference and acquisition.
CNI fall 2009 enhanced publications john_doove-SURFfoundationJohn Doove
- SURF is an organization in the Netherlands that works to improve ICT infrastructure for higher education and research.
- SURF is working on projects to develop "enhanced publications" which combine traditional publications like text with additional materials like data, maps, images and annotations.
- Several projects have been funded to create enhanced publications in fields like archaeology and psychology. Challenges include presentation, identification, long-term preservation and developing tools and infrastructure to support enhanced publications.
- Moving forward, SURF will work on developing repository infrastructure to store and share enhanced publications, creating guidelines and incentivizing their creation through things like legal reports and reward systems.
This document presents a new method for extracting class diagrams from textual requirements using natural language processing (NLP) techniques. It proposes the Requirements Analysis and Class diagram Extraction (RACE) system, which uses tools like the OpenNLP parser, a stemming algorithm, and WordNet to extract concepts and identify classes, attributes and relationships. The RACE system applies heuristic rules and a domain ontology to the output of the NLP tools to refine and finalize the extracted class diagram. The paper concludes that the RACE system demonstrates the effective use of NLP techniques to automate the extraction of class diagrams from informal natural language requirements specifications.
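The concept-extraction step can be approximated with a toy heuristic; the real RACE system uses the OpenNLP parser, stemming, WordNet, and heuristic rules rather than this regex, and the determiner list below is an assumption for illustration:

```python
import re

DETERMINERS = {"A", "An", "The", "Each", "Every"}  # toy filter list

def candidate_classes(requirement: str) -> list[str]:
    """Toy stand-in for the concept-extraction step: treat capitalized
    words (minus determiners) as candidate class names, preserving order
    and removing duplicates."""
    words = re.findall(r"\b[A-Z][a-z]+\b", requirement)
    seen, classes = set(), []
    for w in words:
        if w not in seen and w not in DETERMINERS:
            seen.add(w)
            classes.append(w)
    return classes

text = "A Customer places an Order. Each Order contains Items."
print(candidate_classes(text))
```

A system like RACE would then apply its ontology and heuristic rules to decide which candidates become classes, attributes, or relationships.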
This presentation provides some thoughts on OER search. It includes a brief background on the metadata recommendation for OCW Consortium members from 2006, as well as some background on educational metadata and resource repositories. This presentation was prepared for an OER Search meeting hosted by Google on December 1, 2010. (This presentation replaces the draft presentation that had received 120 views.)
This document proposes using Word2Vec and decision trees to extract keywords from textual documents and classify the documents. It reviews related work on keyword extraction and text classification techniques. The proposed approach involves preprocessing text, representing words as vectors with Word2Vec, calculating frequently occurring keywords for each category, and using decision trees to classify documents based on keyword similarity. Experiments using different preprocessing and Word2Vec settings achieved an F-score of up to 82% for document classification.
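The keyword-similarity step in this approach rests on cosine similarity between word vectors; a minimal sketch with toy 3-dimensional vectors (real Word2Vec embeddings typically have 100+ dimensions, and these values are invented):

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two word vectors, the measure typically
    used to compare Word2Vec embeddings: 1.0 for identical directions,
    0.0 for orthogonal vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

v_keyword = [1.0, 0.0, 1.0]   # hypothetical vector for a category keyword
v_doc_term = [1.0, 1.0, 1.0]  # hypothetical vector for a document term
sim = cosine_similarity(v_keyword, v_doc_term)
print(sim)
```

The decision-tree step would then use similarities like this one as features when assigning a document to a category.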
Novel Database-Centric Framework for Incremental Information Extractionijsrd.com
Information extraction (IE) has been an active research area that seeks techniques to uncover information from large collections of text. IE is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents; in most cases this involves processing human-language texts by means of natural language processing (NLP). Recent activities in document processing, such as automatic annotation and content extraction, can be seen as information extraction. Many applications call for methods that enable automatic extraction of structured information from unstructured natural-language text. Due to the inherent challenges of natural language processing, most existing methods for information extraction from text tend to be domain-specific. This project presents a new paradigm for information extraction: in this extraction framework, the intermediate output of each text-processing component is stored, so that only an improved component has to be re-deployed over the entire corpus. Extraction is then performed on both the previously processed data from the unchanged components and the updated data generated by the improved component. Such incremental extraction can yield a tremendous reduction in processing time. A mechanism is also provided to generate extraction queries from both labeled and unlabeled data; query generation is critical so that casual users can specify their information needs without learning the query language.
Architecture of an ontology based domain-specific natural language question a...IJwest
The document summarizes the architecture of an ontology-based domain-specific natural language question answering system. The proposed architecture defines four main modules: 1) question processing which analyzes and classifies questions and reformulates queries, 2) document retrieval which retrieves relevant documents, 3) document processing which processes retrieved documents, and 4) answer extraction which extracts and generates responses. Natural language processing techniques and ontologies are used to analyze questions and documents and extract relationships and answers. The system aims to generate concise, specific answers to natural language questions in a given domain and achieved 94% accuracy in testing.
A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo...María Poveda Villalón
The document proposes a lightweight methodology called LOT (Linked Open Terms) for developing Linked Data ontologies and vocabularies in a reusable way. The methodology is data-driven and focuses on ontology search, selection, integration, completion and evaluation activities. It provides guidelines for reusing existing terms and linking them according to Linked Data principles while keeping the processes lightweight. The methodology is intended to help domain experts create ontologies and vocabularies for publishing data on the semantic web in an interoperable way without requiring extensive knowledge engineering expertise. Future work involves providing more detailed guidelines, examples, and connecting existing tools to support each step of the methodology.
French machine reading for question answeringAli Kabbadj
This paper proposes to unlock the main barrier to machine reading and comprehension of French natural-language texts. This opens the way for machines to find a precise answer to a question buried in a mass of unstructured French text, or to create a universal French chatbot. Deep learning has produced extremely promising results for various tasks in natural language understanding, particularly topic classification, sentiment analysis, question answering, and language translation. But to be effective, deep learning methods need very large training datasets. Until now these techniques could not actually be used for French text question answering (Q&A) applications, since there was no large Q&A training dataset. We produced a large (100,000+) French Q&A training dataset by translating and adapting the English SQuAD v1.1 dataset, along with GloVe French word and character embedding vectors from the French Wikipedia dump. We trained and evaluated three different Q&A neural-network architectures in French and obtained French Q&A models with an F1 score of around 70%.
Similar to MULTI-LEARNING SPECIAL SESSION / EDUCON 2018 / EMADRID TEAM (20)
Recognizing Lifelong Learning Competences: A Report of Two Cases - Edmundo TovareMadrid network
The document discusses two European projects - STEMSOFT and TEASPILS - that aim to develop micro-credentials for lifelong learning. STEMSOFT focuses on developing soft skills for STEM professionals through open online courses, while TEASPILS uses IoT planters to teach environmental awareness. Both projects map learning outcomes to competencies frameworks and plan to pilot short courses to certify skills acquisition through micro-credentials. The document also outlines the European policy context around micro-credentials and lifelong learning, and how the projects aim to address skills gaps through flexible, targeted training opportunities.
Recognition of learning: Status, experiences and challenges - Carlos Delgado ...eMadrid network
1. The document discusses recognition of learning, experiences, and challenges. It describes an eMadrid Network special session on this topic with presentations from various universities.
2. The UC3M presentation focuses on recognizing the value of recognition in education. It discusses formats for recognition like badges and credentials and uses cases at UC3M involving competitions, gamification, and digital credentials.
3. Recognizing learning is important for motivation and signaling achievement. Recognition elements should be integrated into instructional design similar to activities and assessments.
Bootstrapping serious games to assess learning through analytics - Baltasar F...eMadrid network
This document summarizes research on using serious games to assess learning through analytics. It discusses how games can be validated using pre-post tests to ensure effectiveness and provide training data for machine learning models. Interaction data from validated games can then train models to predict learning from gameplay without exams. The researchers developed tools like a game analytics tracker, validation tool, and analysis tool to facilitate collecting interaction data, validating games, and analyzing results. Their authoring tool integrates these analytics capabilities. Future work will integrate machine learning models into the validation tool to directly provide assessment scores based on interaction data. The goal is to close the assessment loop for serious games.
Meta-review of recognition of learning in LMS and MOOCs - Ruth CoboseMadrid network
The meta-review examines 10 studies that provide overviews of recognition of learning techniques in learning management systems (LMSs) and massive open online courses (MOOCs). The studies were published between 2017-2021 and included reviews, experiences, and challenges. Most focused on MOOCs and used badges for recognition. Results showed techniques like gamification and badges positively impact motivation and engagement. Limitations included short study periods and small samples. Future work could study applications over longer periods, combine data types, and consider diverse stakeholders and environments.
The document announces that Abdallah Yusuf Al-Zoubi, Manuel Castro, Fadi Shahroury, and Elio Sancristobal received the Best Paper Award in the Innovation in Engineering Education category at the IEEE Global Engineering Education Conference, held May 11-14, 2023 at the American University of Kuwait in Salmiya, Kuwait, for their paper "Impact of Remote Labs in Preparing Students for Work 4.0: The Story at Princess Sumaya University for Technology."
Seminario eMadrid_Curso MOOC_Antonio de Nebrija_Apología del saber.pptx.pdfeMadrid network
This document summarizes a MOOC on Antonio de Nebrija designed by Universidad Nebrija. The course commemorates the 500th anniversary of Nebrija and aims to publicize his figure and legacy. It consists of 6 modules and 30 hours of content on Nebrija's life and work, as well as topics such as pedagogy, Spanish grammar, and literature. The course saw strong participation, with 1,540 enrollees of 62 nationalities, and received good ratings.
This document discusses digital education initiatives at Politehnica University of Timisoara. It describes creating open educational resources (OERs) through collaboration between students and faculty. Students research topics and use multimedia tools to create OERs that are peer-reviewed and published with Creative Commons licenses for reuse. The document also outlines virtual mobility programs that improve students' digital skills through international collaboration projects using virtual reality tools and blogging.
The document discusses challenges in establishing digital credentials for learning achievements that were investigated by the DiBiHo research project. It identifies key challenges such as technical interoperability, credential revocation, and privacy-enhancing cryptography. A proof of concept was created to test proposed solutions for these challenges. The presentation will discuss the identified challenges, proposed approaches, and remaining open questions regarding digital credentials.
The document discusses the evolution of MOOC certification and credentialing from stage 1 of course certificates to stage 4 of accredited learning pathways. It outlines Federica's experience with early partnerships providing certification via Coursera and edX courses. Federica has since developed an in-house system awarding certificates and badges for its own courses. The document also covers recent European trends in microcredentials and Federica's key partnerships in Italy providing certification for public sector training.
The document discusses European Digital Credentials for Learning, which aims to empower citizens to own credentials that can be easily shared across Europe. The initiative seeks to reduce market fragmentation, create an EU skills data space, and remove barriers to credential recognition. The infrastructure will include standards, services, and software to allow credentials to be issued, stored, verified, and shared digitally. This framework aims to capture all types of learning and be interoperable, multilingual, and applicable across one's career. It is a central part of the EU's agenda to support lifelong learning and labor mobility.
2022_12_16 «“La informática en la educación escolar en Europa”, informe Euryd...eMadrid network
The report summarizes the teaching of informatics in European education systems, including when the subject is introduced, how it is distributed across educational stages, the content covered, and teacher preparation. Key findings are that most countries begin informatics in primary or secondary school, cover mainly algorithms and programming, and face a shortage of teachers specialized in the subject.
2022_12_16 «Informatics – A Fundamental Discipline for the 21st Century»eMadrid network
The document provides an overview of efforts to establish informatics as a fundamental discipline in school education across Europe. It discusses the Informatics for All coalition which developed an Informatics Reference Framework for School to advocate for including informatics in curriculums. The framework defines 11 core topics and was informed by broad consultation. The status of informatics in schools across Europe is then analyzed according to this framework, finding most systems integrate it into other subjects rather than as a standalone discipline. Informatics is positioned as a new fundamental competence and language for all students akin to mathematics and languages.
2022_11_11 «AI and ML methods for Multimodal Learning Analytics»eMadrid network
This document discusses using multimodal data and machine learning methods for analyzing learning across multiple contexts. It describes several studies that collected eye tracking, physiological, video, and other data from participants in contexts like playing Pacman, self-assessment tests, debugging programs, educational games, and collaborative concept mapping. Machine learning models were developed to predict outcomes like test scores, effort, and performance using features from the multimodal data. The document discusses the value of collecting multimodal data, developing explainable AI pipelines, and generalizing models across different learning contexts and tasks. It concludes by considering opportunities for using online learning system logs and designing more similar learning contexts.
2022_11_11 «The promise and challenges of Multimodal Learning Analytics»eMadrid network
1. The document discusses three conceptualizations of multimodal learning analytics (MMLA): MMLA to automate human tasks, augment teaching and learning practices, and as a research methodology.
2. It examines what modalities of data are used in MMLA, including video/audio data, eye tracking data, physiological sensors, and location sensing. Machine learning has been applied to MMLA tasks like classifying collaboration.
3. Challenges of MMLA include connecting findings to learning theory, addressing ethics concerns like privacy and surveillance, and determining what behaviors are considered good or bad in education. Students have mixed reactions to being analyzed by MMLA.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications.He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing discussed on the talk. ICT and testing must carry their part of global responsibility to help with the climat warming. We can minimize the carbon footprint but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be added with sustainability, and then measured continuously. Test environments can be used less, and in smaller scale and on demand. Test techniques can be used in optimizing or minimizing number of tests. Test automation can be used to speed up testing.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
MULTI-LEARNING SPECIAL SESSION / EDUCON 2018 / EMADRID TEAM
1. Using Text Mining and Linked Open Data to assist the Mashup of Educational Resources
Santa Vallejo-Figueroa
Miguel Rodriguez-Artacho
Manuel Castro-Gil
Elio San Cristobal
IEEE Global Engineering Education Conference
April 2018, Santa Cruz de Tenerife
Universidad Nacional de Educación a Distancia (UNED)
2. Context (1)
Creating OERs is a hard and time-consuming task
No standard technologies exist for this purpose
The need for OERs is increasing
Several institutions promote the generation and use of OERs
The integration of learning contents is a niche for reusing such contents
We refer to an online course as an OER
4. Method
Premises for the method:
1) An OER repository containing online courses exists
2) Courses have a basic structure, at minimum a textual description
3) Instructional aspects of contents are not taken into account
4) The human creator gets suggested resources to be related and integrated according to the creator's final decisions
Aim 1: to exploit existing LOD information for integrating contents into OERs
Aim 2: to assist in the creation of an OER (course)
7. Method - Text processing (1)
Named Entities (NEs) and relevant words are detected in the retrieved courses
A Named Entity is a universally known term with a unique meaning, such as a person, location, or organization
Relevant words are keywords or concepts that "describe" a text
A concept is a semantic class (group) of terms sharing a similar idea
A keyword is a term that occurs repeatedly across sentences
A concept can group words and keywords
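The keyword notion above can be sketched with a tiny sentence-frequency count. The prototype uses the KeyGraph tool for this, so the Python function below, its stopword list, and the sample description are purely illustrative stand-ins for the idea of treating terms that recur across sentences as keywords:

```python
import re
from collections import Counter

def extract_keywords(text, min_count=2):
    """Treat terms that appear in at least `min_count` sentences as keywords
    (a simplified stand-in for the KeyGraph tool used in the prototype)."""
    sentences = re.split(r"[.!?]+", text.lower())
    counts = Counter()
    for sentence in sentences:
        # Count each term at most once per sentence.
        for term in set(re.findall(r"[a-z]+", sentence)):
            counts[term] += 1
    stopwords = {"a", "an", "the", "of", "in", "is", "and", "to", "are", "with"}
    return {t: c for t, c in counts.items() if c >= min_count and t not in stopwords}

description = ("Databases store structured data. A database is queried with SQL. "
               "Relational databases organize data in tables.")
keywords = extract_keywords(description)
```

Here `databases` and `data` each occur in two sentences, so both survive as keywords while one-off terms are dropped.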
8. Method - Text processing (2)
Semantic information from the DBpedia knowledge base is used
An OER is represented by:
a syntactic layer (textual description), and
a semantic layer (sets of NEs and relevant words)
The semantic layer points to resources in DBpedia
Each course is indexed using a text search engine (Lucene)
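The two-layer representation and the fielded index can be illustrated with a minimal in-memory sketch. The prototype indexes with Lucene; the dictionary-based index, the course id, and the DBpedia URIs below are illustrative stand-ins:

```python
from collections import defaultdict

# Syntactic layer: the raw textual description.
# Semantic layer: NEs and relevant words, each pointing to a DBpedia resource.
course = {
    "id": "course-1",  # hypothetical identifier
    "text": "Algorithms and data structures, with examples by Alan Turing ...",
    "NEs": {"Alan Turing": "http://dbpedia.org/resource/Alan_Turing"},
    "concepts": {"Algorithm": "http://dbpedia.org/resource/Algorithm"},
    "keywords": ["data", "structures"],
}

# Toy inverted index over the fields [NEs], [concepts], [keywords].
index = defaultdict(set)  # (field, term) -> ids of courses containing it
for field in ("NEs", "concepts", "keywords"):
    for term in course[field]:
        index[(field, term.lower())].add(course["id"])
```

A real Lucene index would store each field in a per-course document, but the lookup structure is analogous.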
9. Method - Query Generation
A course is retrieved using a text fragment or set of words
NEs and/or relevant words are identified in the input text
These are used to formulate a simple text query over the text search engine
No SPARQL queries are required
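Query formulation can be sketched as assembling a fielded text query from the identified NEs and relevant words. The field names and the Lucene-like query syntax below are illustrative assumptions, not the prototype's exact format:

```python
def build_query(nes, concepts, keywords):
    # Each identified element is searched in its own field; no SPARQL needed.
    parts = [f'NEs:"{ne}"' for ne in nes]
    parts += [f'concepts:"{c}"' for c in concepts]
    parts += [f'keywords:{k}' for k in keywords]
    return " OR ".join(parts)

query = build_query(["SQL"], ["Database"], ["table"])
```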
10. Method - Query Processing (1)
The most relevant courses are retrieved from the semantic index
For syntactic search, the NEs, concepts, and keywords from the query are searched in the fields ([NEs], [concepts], [keywords])
Only NEs, concepts, and keywords are used from the query
The list of retrieved courses is ranked according to its similarity with respect to the query
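The syntactic ranking can be sketched as a weighted field overlap: per the accompanying notes, matches on NEs count most, then concepts, then keywords. The weight values and sample courses below are illustrative:

```python
WEIGHTS = {"NEs": 3.0, "concepts": 2.0, "keywords": 1.0}  # illustrative weights

def syntactic_score(query, course):
    # Sum the weighted number of matching terms per field.
    score = 0.0
    for field, w in WEIGHTS.items():
        score += w * len(set(query[field]) & set(course[field]))
    return score

query = {"NEs": ["SQL"], "concepts": ["Database"], "keywords": ["table", "index"]}
course_a = {"NEs": ["SQL"], "concepts": ["Database"], "keywords": ["table"]}
course_b = {"NEs": [], "concepts": ["Database"], "keywords": ["table", "index"]}
ranked = sorted([("A", course_a), ("B", course_b)],
                key=lambda pair: syntactic_score(query, pair[1]), reverse=True)
```

Course A outranks course B despite matching fewer keywords, because its NE match carries more weight.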
11. Method - Query Processing (2)
For semantic search, the relationships between NEs/concepts of the query are matched against the NEs/concepts of each retrieved course
The relations (graph) that each NE/concept has in DBpedia are explored
The set of graphs of the query and the set of graphs of each retrieved course are obtained
The courses with the greatest match to the query are ranked highest in the result
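The semantic matching can be sketched as the overlap between sets of RDF triples (subject, predicate, object) taken from each NE/concept's DBpedia subgraph. The triples and predicate names below are made up for illustration, not fetched from DBpedia:

```python
def subgraph_overlap(query_triples, course_triples):
    # Courses whose subgraphs share more triples with the query rank higher.
    return len(query_triples & course_triples)

# Illustrative subgraphs; real ones come from DBpedia exploration.
query_triples = {
    ("Database", "subClassOf", "Software"),
    ("SQL", "usedFor", "Database"),
}
course_triples = {
    ("Database", "subClassOf", "Software"),
    ("Operating_system", "subClassOf", "Software"),
}
score = subgraph_overlap(query_triples, course_triples)
```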
12. Method - Results Processing
The relationships between the NEs/concepts of retrieved courses are given to the creator
For each retrieved course, its NEs/concepts are connected by means of a concept map
The main idea is to represent how NEs/concepts from retrieved courses are related
The creator can generate a big picture of the mashup of OERs and DBpedia resources
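Turning the matched relations into a concept map can be sketched as collecting nodes and labelled edges from a retrieved course's triples. The triples are illustrative, and the actual map generation is still future work per the slides:

```python
def to_concept_map(triples):
    # NEs/concepts become nodes; each DBpedia relation becomes a labelled edge.
    nodes, edges = set(), []
    for subject, relation, obj in triples:
        nodes.update([subject, obj])
        edges.append((subject, relation, obj))
    return nodes, edges

triples = [("SQL", "usedFor", "Database"), ("Database", "subClassOf", "Software")]
nodes, edges = to_concept_map(triples)
```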
13. Preliminary results (1)
An implementation was developed on an 8 GB RAM Linux machine using Java (web application), DBpedia Spotlight, KeyGraph, Lucene, and MySQL
A total of 265 courses were retrieved from the UK Open University
The application was tested with queries about Parallel Computing, Database, Software, Computer Aided Software Engineering, Data Structures, and Operating Systems
16. Preliminary results (4)
At this moment, only the syntactic and semantic searches have been implemented
We are working on a similarity measure for distinguishing OERs with similar names
The concept maps are not generated yet
The implementation shows the feasibility of this approach
17. Conclusions (1)
This work proposes an approach to assist the human creator in the generation or restructuring of courses
A course is an educational resource
The approach exploits:
Text mining techniques to identify key elements from text
Semantically linked information from the DBpedia knowledge base
Stages of the method:
Getting OERs
Text processing
Query generation
Query processing
Results processing
18. Conclusions (2)
The approach was implemented as a prototype showing promising results
Real experimentation: 265 online courses related to Computer Science were retrieved from the UK Open University
Future work: enhance the semantic similarity measure and generate the concept maps
19. Using Text Mining and Linked Open Data to assist the Mashup of Educational Resources
Santa Vallejo-Figueroa
Miguel Rodriguez-Artacho
Manuel Castro-Gil
Elio San Cristobal
IEEE Global Engineering Education Conference
April 2018, Santa Cruz de Tenerife
Universidad Nacional de Educación a Distancia (UNED)
Thanks!
Editor's Notes
This is a work developed in collaboration between Santa Vallejo-Figueroa, Miguel Rodriguez-Artacho, Manuel Castro-Gil, and me
It is about the intersection of Educational Resources, Text Mining, and the Semantic Web (Linked Open Data)
- As we know, Open Educational Resources are very useful means for facilitating teaching and learning tasks
- But their creation poses many challenges; beyond the instructional point of view, the integration of tools and standards for their creation is a very hard and time-consuming task.
- Although, by definition, Open Educational Resources must be open, accessible, and reusable, no standard technologies exist for this purpose.
However, according to this philosophy, more and more OERs are required in many areas of teaching and learning.
Many researchers and institutions are interested in the generation, distribution, and use of OERs. International initiatives are evidence of this interest: Open Universities, Coursera, Udacity, Stanford University, MIT, etc.
Regardless of the instructional requirements, the integration of learning contents is a trend and a challenge that promotes the reuse of existing "base" learning materials.
These "base" materials can come from diverse sources of information: other kinds of repositories, knowledge bases, dictionaries, etc.
In this work, we refer to an online course as an OER, which is created by a human creator
One of the most widely applied approaches for publishing and reusing information in several domains is the Semantic Web, through the Linked Open Data initiative.
The advantage of this approach is that information is well structured and well defined, so that it can be consumed by humans and computer applications. It represents information based on semantics.
However, applying Linked Open Data to the OER domain is not easy, because there must be a correspondence between the information to be represented and the way it is organized (taking the semantics into account).
The general premise is that well-represented and well-organized information about OERs facilitates its search and, consequently, its reuse.
The proposed work exploits the organization and contents of a Linked Open Data knowledge base (DBpedia) to suggest core components (from LOD) to human creators of OERs.
Our intention is to extract and integrate information from LOD to assist human creators of OERs.
For the proposed approach, the following premises are considered
- A repository of courses exists (not necessarily OERs). The elements of this repository are used to populate the initial knowledge base. Information from the courses feeds the search for LOD resources.
- The courses in the repository have, at minimum, a textual description, which is used to extract information from the courses.
- No instructional elements are taken into account in the integration of information. The human creator is responsible for these instructional tasks.
- The result of the approach is a set of LOD resources that the human creator can integrate into a new course.
This is the general architecture of our method
Each component of this architecture is described next
- In the first stage of the approach, and only once, online courses (OERs) are retrieved from a repository. This can be an online repository, such as the UK Open University or MIT OpenCourseWare.
- The courses can be retrieved using SPARQL queries if it is a LOD repository, SQL queries if it is a relational repository, or text queries if it is a web-based repository.
- We use SPARQL queries for retrieving courses from the UK Open University
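Such a retrieval query might look like the sketch below. The class and property names are hypothetical placeholders, since the slides do not give the actual Open University vocabulary or endpoint:

```python
# Hypothetical SPARQL query string for fetching course titles/descriptions;
# ex:Course and the placeholder vocabulary are assumptions for illustration.
sparql = """
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX ex: <http://example.org/vocab#>
SELECT ?course ?title ?description WHERE {
  ?course a ex:Course ;
          dcterms:title ?title ;
          dcterms:description ?description .
}
"""
```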
- The key elements in the core of this approach are Named Entities and relevant words.
- Named Entities and relevant words are extracted from the textual description of each course. For this the text is processed to detect them.
- A Named Entity is a universally well-known word whose meaning is unique: persons, locations, acronyms, etc.
- Relevant words can be concepts (general words -semantic classes- relating specific words) or keywords (repeated words)
- A concept may include a group of keywords
- For the processing of text we use the DBpedia knowledge base, exploiting its semantic information.
- DBpedia is the largest knowledge base from the Linked Open Data initiative; it contains well-structured information from Wikipedia. Such information is semantically annotated.
- An OER (course) is represented by means of two layers: syntactic (textual description) and semantic (sets of NEs and relevant words)
- For the semantic layer, each OER contains references (URLs) to resources from DBpedia. In that way each resource is accessible from the course.
- Based on both layers, each course is indexed by using a text search engine (in our case Lucene)
- Note that RDF is not employed in the representation, only syntactic and semantic contents
- After each course is indexed, the system is ready for querying it
- The idea of this component is that the human creator can search for related courses of interest based on an input text
- The creator can introduce a fragment of text (article, news, webpage), or one or more sentences
- From the input text, NEs and relevant words are identified by the same Text Processing component
- Based on the NEs or relevant words, a simple text query is executed over the search engine
- Note that here no SPARQL queries are required
- The most relevant courses are retrieved from the index by using NEs and relevant words
- From these, two types of searches are executed: syntactic and semantic
- For the syntactic search, each NE, concept, and keyword identified in the input text is searched in its corresponding field of the index.
- As a result, a list of retrieved courses is ranked according to their similarity to the query
- For the moment, the similarity considers matches in the following order of importance: first NEs, then concepts, and finally keywords
- For the semantic search the approach makes a matching between the NEs and concepts of the query and the NEs and concepts of each retrieved course.
- Note that in this search keywords are not used because only NEs and concepts have semantic meaning in DBpedia
- The approach takes advantage that DBpedia is organized by means of a graph of NEs and concepts
- Thus, an NE or concept has connections in the form of a subgraph in DBpedia, where NEs and concepts are nodes, and an edge is a relation to another NE/concept. These relations take the form of an RDF triple (node, relation, node), that is, Subject, Predicate, Object
- Two sets of subgraphs are retrieved from DBpedia, one for the query of creator, and the other for each retrieved course
- The subgraph of the query is compared with the subgraph of each retrieved course. Those courses with greater matching are ranked as a result
- Once the courses are retrieved and ranked, the results must be presented to the creator so that they can be used
- The idea of this module is to exploit the associations between DBpedia resources (NEs and concepts) to give the creator a better understanding of the results.
- For this, for each retrieved course its corresponding graph is represented by means of a concept map
- A concept map is an ideal means for transmitting the semantic meaning of a course (NEs and concepts)
- Thus, from the constructivist learning approach, the creator can generate or re-structure a course following the key elements from existing courses.
- This will represent a big picture about the mashup of OERs and DBpedia resources.
- The approach has been implemented by means of a Java application integrating DBpedia (knowledge base), Spotlight (tool for identifying NEs and concepts), KeyGraph (tool for identifying keywords), Lucene (text search engine), and MySQL (database engine for storing the original information of the online courses)
- 265 courses related to Computer Science were retrieved from the repository of the UK Open University
- Several queries were executed on the implemented application: Parallel Computing, Database, Software, Computer Aided Software Engineering, Data Structures, and Operating Systems
Some intermediate results are presented here for each stage
First, SPARQL queries are performed to get OERs from an existing repository
Then, the textual description of the course is processed to identify Named Entities and concepts (in bold)
In Query Generation, the input text is processed to identify Named Entities and concepts (in bold)
Relevant words (Named Entities (NE) and concepts (CO)) are searched for (by appearance) in the syntactic search
In the semantic search, related resources from DBpedia are identified for the Named Entities and concepts
This is the result for the query “Computer Aided Software Engineering”
As we can see, the implementation must be enhanced for a better ranking of resources. Here, two different courses with the same name, “Software Engineering”, appear. We are working on this.
- The entire approach has not been implemented yet
- Concept maps are not yet generated
- At this moment we are working on improving the similarity measure for a better ranking of results
- For the moment, the obtained results are promising and demonstrate the feasibility of the proposed approach
- As conclusions we can summarize the following
- The main idea of the proposed approach is to assist the human creator in generating courses from scratch or restructuring existing ones
- The approach takes advantage of:
a) Text mining techniques for identifying key elements from text (NEs and relevant words: concepts and keywords)
b) Semantically linked information from the DBpedia knowledge base, the largest knowledge base from the Linked Open Data initiative; it contains well-structured semantic information, which is used for linking information from the courses
For technical purposes, we denote an educational resource as a course.
We do not take into account educational aspects of such courses.
We use the textual description of the course, considering such a description valid and correct
The stages of the method are: Getting OERs, Text processing, Query generation, Query processing, and Results processing
The approach was implemented as a Java web application by using open source libraries
Although the entire approach has not been implemented, the obtained results are promising
The prototype was tested on a real scenario working on 265 online courses related to Computer Science from the UK Open University
At this moment the prototype does not implement the entire method
We are working on the last two stages: Query processing, and Results processing
In Query processing we want to enhance the ranking of resulting courses by adapting the semantic similarity measure
The Results processing stage, which generates the concept maps, is not yet implemented