Ph.D. Defense

LiDom Builder: Automatising the Construction of
Multilingual Domain Modules
Ángel Conde Manjón
GaLan Research Group – LSI Department
University of the Basque Country (UPV/EHU)
Supervisors:
Dr. Mikel Larrañaga Olagaray & Dr. Ana Arruarte Lasa
UPV/EHU
25 February 2016

• Technology Supported Learning Systems (TSLS)
• Learning Management Systems:
• Massive Open Online Courses:
• Intelligent Tutoring Systems: SQL-Tutor
• …
• Bilingual and multilingual contexts are a reality (UNESCO, 2003)
• Acquiring the Domain Module is a costly and labour-intensive task
Introduction
Acquisition of
Multilingual
Terminology
Identification of
Pedagogical
Relationships
Gathering Learning
Objects
Conclusions and
Future Work
LiDom Builder
Context

Main Goal
Automatising the construction of MULTILINGUAL DOMAIN MODULES

DOM-Sortze (Larrañaga, 2012): a framework for building DOMAIN MODULES from electronic textbooks.
Previous Work: DOM-Sortze

Figure: DOM-Sortze pipeline. (1) Preprocess turns the Electronic Textbook into the Document Body Internal Representation and the Document Outline Internal Representation; (2) LDO Gathering builds the Learning Domain Ontology; (3) LOs Gathering completes the Domain Module.
Previous Work: DOM-Sortze

Figure: example Domain Module. The topics Planetary System, Solar System, Planet Earth, Satellite and Moon are linked by partOf, isA and prerequisite relationships; the learning object LO1 ("The Moon is Earth's only natural satellite") is attached through hasDR.
DOM-Sortze: Domain Module Representation Formalism
Learning Domain Ontology (LDO)
Topics and pedagogical relationships
Learning Objects (LO)
• Definitions
• Examples
• Problem Statements
• …

Limitations of DOM-Sortze:
1. Developed for a single language: Basque.
2. Its formalism is not able to represent Multilingual Domain
Modules.
DOM-Sortze: Limitations

1. Can the formalism used in DOM-Sortze be enhanced for
Multilingual Domain Modules?
– Extend the formalism to deal with Multilingual Domain Modules.
2. Which enhancements are required to deal with various languages?
– Develop a method for extracting Multilingual Terminology.
– Improve the Relationship Acquisition.
– Provide a method for acquiring Multilingual Learning Objects.
Automatising the construction of MULTILINGUAL DOMAIN MODULES
Goals

I. Introduction: Motivations and Goals
II. LiDom Builder: Building Multilingual Domain
Modules
III. Acquisition of Multilingual Terminology
IV. Identification of Pedagogical Relationships
V. Gathering Multilingual Learning Objects
VI. Conclusions and Future Work
Outline


LiDom Builder Overview
LiDom Builder: a framework for automatising the acquisition of Multilingual Domain Modules.
Figure: Textbook → Multilingual Terminology Extraction → Pedagogical Relationship Extraction → Multilingual Learning Object Generation → Domain Module.

Figure: the example LDO (topics Planetary System, Solar System, Planet Earth, Satellite and Moon; relationships partOf, isA, prerequisite and pedagogicallyClose) extended for several languages. Each topic carries language-tagged labels (Moon: “ilargi”@eu, “moon”@en, “luna”@es), and the learning objects LO1 and LO2, attached via hasDR, are linked as equivalents (Equiv. “en”, Equiv. “es”).
Multilingual Domain Module Formalism
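A minimal sketch of how such a formalism could be held in memory, assuming language-tagged labels per topic and learning objects that point to their equivalents in other languages. The class names, field names and the Spanish sentence are illustrative assumptions, not the representation used by LiDom Builder.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LearningObject:
    lo_id: str
    language: str                      # language code such as "en", "es", "eu"
    text: str
    equivalents: set[str] = field(default_factory=set)   # ids of equivalent LOs


@dataclass
class Topic:
    name: str
    labels: dict[str, str] = field(default_factory=dict)  # language -> label
    resources: list[str] = field(default_factory=list)    # hasDR links (LO ids)


@dataclass
class DomainModule:
    topics: dict[str, Topic] = field(default_factory=dict)
    relationships: list[tuple[str, str, str]] = field(default_factory=list)
    learning_objects: dict[str, LearningObject] = field(default_factory=dict)

    def label(self, topic: str, language: str) -> Optional[str]:
        """Return the label of a topic in the requested language, if any."""
        return self.topics[topic].labels.get(language)

    def resources_for(self, topic: str, language: str) -> list[LearningObject]:
        """Return the learning objects of a topic written in one language."""
        los = (self.learning_objects[i] for i in self.topics[topic].resources)
        return [lo for lo in los if lo.language == language]


module = DomainModule()
module.topics["Moon"] = Topic(
    "Moon", {"eu": "ilargi", "en": "moon", "es": "luna"}, ["LO1", "LO2"]
)
module.topics["Satellite"] = Topic("Satellite", {"en": "satellite"})
# Illustrative relationship; the exact edges of the slide figure are assumed.
module.relationships.append(("Moon", "isA", "Satellite"))
module.learning_objects["LO1"] = LearningObject(
    "LO1", "en", "The Moon is Earth's only natural satellite", {"LO2"}
)
# The Spanish text is an invented equivalent, used only for illustration.
module.learning_objects["LO2"] = LearningObject(
    "LO2", "es", "La Luna es el único satélite natural de la Tierra", {"LO1"}
)
```

Keeping the labels on the topic and the language on each learning object lets one ontology serve every language, which is the point of the multilingual extension.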

Proposed Enhancements
Figure: the pipeline Electronic Textbook → 1. Preprocess → Document Internal Representation → 2. LDO Gathering → 3. LOs Gathering → Domain Module, annotated with the proposed enhancements:
• Preprocess: Language Identification; NLP Parsers (Illinois Chunker, Illinois POS tagger, FreeLing, IXA-Pipes).
• LDO Gathering: Topic Extraction (LiTeWi); Relationship Extraction (LiReWi), with a Set of Heuristics and a Grammar.
• LOs Gathering: Multilingual LOs (LiLoWi), with a Grammar and Discourse Markers.

Proposed Enhancements
Figure: Electronic Textbook → 1. Preprocess → Document Internal Representation → 2. LDO Gathering → 3. LOs Gathering → Domain Module, supported by external Knowledge Resources.

• Two phases
  • Tuning up
    • Set the thresholds and default confidence values.
  • Evaluation
    • Gold Standard (Recall, Precision, F1-Score).
    • Expert validation.
• Use of three textbooks
1. Programming: Introduction to Object Oriented Programming (Wong, S., 2010).
2. Astronomy: Introduction to Astronomy (Morison, 2008).
3. Biology: Introduction to Molecular Biology (Raineri, 2010).
General Evaluation Methodology
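The Gold Standard measures named above follow their standard definitions. The sketch below computes them for a set of extracted items against a gold set; the function name and the toy term sets are assumptions made for illustration.

```python
def evaluate(extracted: set[str], gold: set[str]) -> dict[str, float]:
    """Precision, recall and F1-score (in %) of extracted items vs. a gold set."""
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": round(100 * precision, 2),
        "recall": round(100 * recall, 2),
        "f1": round(100 * f1, 2),
    }


# Toy data, invented for the example.
scores = evaluate(
    extracted={"planet", "star", "moon", "windows 98"},
    gold={"planet", "star", "galaxy"},
)
```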


In DOM-Sortze, terminology was extracted with ErauzTerm (Alegria et al., 2004).
A new tool, LiTeWi, has been developed.
Acquisition of Multilingual Terminology

LiTeWi
Figure: LiTeWi pipeline. Candidate Extraction from the Electronic Textbook (TF-IDF with a Generic Corpus, KP-Miner, CValue, and Shallow Parsing with a Grammar), followed by Candidate Selection (Mapping, Disambiguation, Filtering, Mapping to other languages) and Combination.

Shallow Parsing Algorithm
• Uses a grammar derived from (Larrañaga, 2012).
Figure: a Shallow Parser applies a Constraint Grammar to the POS tags of the Textbook sentences; Extraction Rules then select the Chunks that may contain topics.
• Grammar rule (example): Topic + [*]+ part of + [det] +Topic
• Sentences that may contain topics: "This is called an Array List"; "A Stack is used to model systems that exhibit LIFO…"
• Chunks: an Array List, A Stack, …
• Topics: Array List, Stack, …


Mapping
• Terms mapped to their corresponding Wikipedia articles.
• Search procedure to match Wikipedia article titles and their labels.


Disambiguation
• Method based on global disambiguation (Milne et al., 2008).
• Domain knowledge step added to improve the results.
• The important domain terms are used as the disambiguation context.
• Gold Term List: important domain terms with only one sense, i.e. the monosemic terms with the highest CValue scores.

Disambiguation
Figure: each sense of a term to disambiguate is compared with the Gold Term List through the Wikiminer Compare Service, and the averages of the relatedness scores are compared.
• Term List (to disambiguate): Java, Inheritance, Property
• Gold Term List: Class, Programming Language, Array List
Sense of "Java"   Class   Prog. Lang.   Array List   Average
Prog. Language    0.90    0.85          0.64         0.89
Island            0.7     0.77          0.53         0.70
City              0.56    0.75          0.6          0.63
• Disambiguated Term: Java (programming Language)
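The disambiguation step can be sketched as follows, assuming the chosen sense is the one with the highest average relatedness to the Gold Term List. The pairwise scores are the ones shown on the slide, hard-coded here; in the real system they would come from the Wikiminer Compare Service, which this sketch does not call. The function name and sense labels are illustrative.

```python
from statistics import mean

# Relatedness of each candidate sense of "Java" to every gold term
# (values copied from the slide, not computed here).
java_senses = {
    "Java (programming language)": {
        "Class": 0.90, "Programming Language": 0.85, "Array List": 0.64,
    },
    "Java (island)": {
        "Class": 0.70, "Programming Language": 0.77, "Array List": 0.53,
    },
    "Java (city)": {
        "Class": 0.56, "Programming Language": 0.75, "Array List": 0.60,
    },
}


def disambiguate(senses: dict[str, dict[str, float]]) -> str:
    """Pick the sense whose average relatedness to the gold terms is highest."""
    return max(senses, key=lambda sense: mean(senses[sense].values()))


best_sense = disambiguate(java_senses)
```

Using monosemic gold terms as context means the context itself never needs disambiguating, so one pass over the candidate senses is enough.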


Filtering Unwanted Terms
Figure: each candidate term is compared with the Gold Term List through the Wikiminer Compare Service. A gold term counts as related when its Relatedness Score meets the Threshold (>= 0.6), and a candidate is kept as a Domain Related Term when its Number of Related Gold Terms is N > 1.
• Gold Term List: Solar System, Black Hole, Solar Mass
• Term List (to filter): Universal Studios, Planet, Windows 98
• Relatedness scores shown for rejected candidates: Solar System (0.34), Black Hole (0.53), Solar Mass (0.47); Solar System (0.23), Black Hole (0.68), Solar Mass (0.50)
• Domain Related Term: Planet
• Filtered out: Universal Studios, Windows 98
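A sketch of the filtering rule, assuming a candidate is kept only when more than one gold term reaches the 0.6 relatedness threshold. Assigning the two score triples from the slide to Universal Studios and Windows 98 is an interpretation of the figure, and the scores for Planet are invented for the example.

```python
RELATEDNESS_THRESHOLD = 0.6   # Threshold (>= 0.6) from the slide
MIN_RELATED_GOLD_TERMS = 2    # N > 1 from the slide


def is_domain_related(scores: dict[str, float]) -> bool:
    """Keep a term when enough gold terms are related to it."""
    related = [g for g, s in scores.items() if s >= RELATEDNESS_THRESHOLD]
    return len(related) >= MIN_RELATED_GOLD_TERMS


def filter_terms(candidates: dict[str, dict[str, float]]) -> list[str]:
    """Return the candidate terms that pass the filter, in input order."""
    return [term for term, scores in candidates.items() if is_domain_related(scores)]


candidates = {
    # Assumed attribution of the slide's scores to the rejected terms.
    "Universal Studios": {"Solar System": 0.34, "Black Hole": 0.53, "Solar Mass": 0.47},
    # Hypothetical scores: the slide gives none for Planet.
    "Planet": {"Solar System": 0.92, "Black Hole": 0.61, "Solar Mass": 0.70},
    "Windows 98": {"Solar System": 0.23, "Black Hole": 0.68, "Solar Mass": 0.50},
}

domain_terms = filter_terms(candidates)
```

Requiring more than one related gold term rejects a candidate such as Windows 98, which happens to score high against a single gold term.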

LiTeWi: Mapping to other languages
Topic   EN     ES     EU
Moon    Moon   Luna   Ilargia

Evaluation
Tuning up
• Introduction to Object Oriented Programming textbook.
Evaluation
• Gold Standard and Expert Validation.
• Gold Standard based on the terms appearing in the index of each textbook.
• Evaluated on Introduction to Astronomy and Introduction to Molecular Biology.

Results
• Wikifier (Cheng, 2013)
               Gold Standard                                Expert Validation
               Precision (%)   Recall (%)   F1 Score (%)    Correctness (%)
Astronomy      3.55            62.96        6.72            18.55
Mol. Biology   2.24            10.21        3.67            49.27
• LiTeWi
               Gold Standard                                Expert Validation
               Precision (%)   Recall (%)   F1 Score (%)    Correctness (%)
Astronomy      17.96           72.55        28.79           78.77
Mol. Biology   27.09           50.53        87.70           71.65


Introduction
In DOM-Sortze, relationships were acquired for Basque using Shallow Parsing.
The heuristic-based analysis of the outline has been adapted and extended.
A new tool, LiReWi, has been developed.

Heuristic-based analysis of the outline
Document Outlines
• Reflect the organisation chosen by the author.
• The structure of the outline implies pedagogical relationships.
• Low-cost process (the outline is a summarised view of the document).
DOM-Sortze
• Each outline item is considered a domain topic.
• By default, a partOf relation is gathered between an item and its subitems.
• Heuristics detect isA relations.
LiDom Builder
• Adaptation to English of the heuristics from (Larrañaga et al., 2004).
• Improved isA identification using WikiTaxonomy (Ponzetto et al., 2007).

Wikipedia Enhanced Process
1. Identify groups of sibling nodes.
2. Select the groups of leaf nodes in which the partOf relationship has been identified.
3. Link and disambiguate each node to a Wikipedia article using Wikiminer (Milne et al., 2012).
4. Process every group using the (Ponzetto et al., 2007) taxonomy.
5. Infer an isA relationship in those groups that share a common ancestor.
Example outline:
………..
4.- Structure of polymers / Macromolecules
4.1.- Polymer chemistry
4.2.- Molecular weight
4.3.- Form, structure and molecular configuration
4.3.- Supramolecular arrangement
4.4.- Crystalline and amorphous polymers
4.5.- Families of polymeric materials
4.5.1.- Thermosettings
4.5.2.- Thermoplastics
4.5.3.- Elastomers
5.- Phase diagrams / Definitions
5.1.- Solid solutions
5.2.- Phases rule of Gibbs
5.3.- Types of phase diagram
Example for the leaf group under 4.5:
Thermosettings → Thermosetting polymer (Article id = 321827)
Thermoplastics → Thermoplastic (Article id = 182444)
Elastomers → Elastomer (Article id = 842224)
Taxonomy categories shown in the figure: Materials science, Elastomers, Polymer physics, Polymer chemistry
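Steps 4 and 5 can be sketched as a common-ancestor test over a taxonomy. The parent links below are a hypothetical toy taxonomy loosely based on the categories named on the slide, not actual WikiTaxonomy data, and the sketch assumes that the inferred isA links each sibling to its parent outline item.

```python
def ancestors(node: str, parents: dict[str, list[str]]) -> set[str]:
    """All taxonomy nodes reachable by following parent links from a node."""
    seen: set[str] = set()
    stack = list(parents.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def common_ancestors(group: list[str], parents: dict[str, list[str]]) -> set[str]:
    """Taxonomy nodes that are ancestors of every article in the group."""
    return set.intersection(*(ancestors(article, parents) for article in group))


def infer_isa(parent_topic: str, group: list[str],
              parents: dict[str, list[str]]) -> list[tuple[str, str, str]]:
    """Propose isA(sibling, parent_topic) when the siblings share an ancestor."""
    if not group or not common_ancestors(group, parents):
        return []
    return [(article, "isA", parent_topic) for article in group]


# Hypothetical parent links, for illustration only.
toy_taxonomy = {
    "Thermosetting polymer": ["Polymer chemistry"],
    "Thermoplastic": ["Polymer physics"],
    "Elastomer": ["Elastomers"],
    "Elastomers": ["Polymer physics"],
    "Polymer chemistry": ["Materials science"],
    "Polymer physics": ["Materials science"],
}
siblings = ["Thermosetting polymer", "Thermoplastic", "Elastomer"]

shared = common_ancestors(siblings, toy_taxonomy)
relations = infer_isa("Families of polymeric materials", siblings, toy_taxonomy)
```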

Evaluation
Gold Standard
• 57 document outlines in English from different
domains.
• Human instructors defined the optimal output (LDOs).
• Each LDO restricted to the topics of the outline.

Results
• Heuristic Analysis
                partOf   isA     Total
Precision (%)   84.12    78.95   83.85
Recall (%)      98.66    21.20   83.85
• Heuristic Analysis + Wikipedia Enhanced Process
                partOf   isA     Total
Precision (%)   89.19    77.30   87.70
Recall (%)      96.49    50.53   87.70

Identification of Pedagogical Relationships: LiReWi
Figure: LiReWi pipeline. Inputs: the Electronic Textbook, the Topics and the Knowledge Bases. Stages: Mapping, Candidate Relationship Extraction, Combination & Filtering.

Mapping
Figure: mapping a topic to WordNet. The topic "Syntax" (Wikipedia id = 3206060, WordNet id = ?) is looked up in two Wikipedia-to-WordNet mapping tables, Fernando's Mappings and BabelNet Mappings, which return different WordNet ids for it (8436203 and 6176322). A Comparer detects the mismatch (!=), and a PageRank-based Disambiguation step uses the Disambiguation Context (Java, Programming, …) to select the final id: WordNet id = 6176322.


Candidate Relationship Extraction
Figure: six extractors produce the Candidate Relationships (isA, partOf, prerequisite, pedagogicallyClose): the WordNet Extractor, the WiBi Extractor, the WikiRelations Extractor, the WikiTaxonomy Extractor, the Shallow Parsing Grammar Extractor (working on NLP data) and the Sequential Extractor.

Path Based Extractors:
Figure: Mars isA Rocky planet isA Planet. A relationship found over a single edge (path length = 1) gets confidence = 1; a relationship found over two edges (path length = 2) gets confidence = 0.9.
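A sketch of a path-based extractor over a hypernym graph. The linear decay (0.1 per extra edge) is an assumption chosen only to reproduce the two values on the slide (length 1 → 1, length 2 → 0.9); the toy graph, the maximum path length and the function names are likewise illustrative.

```python
from collections import deque
from typing import Optional


def path_length(hypernyms: dict[str, list[str]], source: str, target: str,
                max_length: int = 3) -> Optional[int]:
    """Length of the shortest hypernym path from source to target, if any."""
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        node, distance = queue.popleft()
        if node == target and distance > 0:
            return distance
        if distance == max_length:
            continue
        for parent in hypernyms.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, distance + 1))
    return None


def confidence(length: int) -> float:
    """Assumed decay: direct links get 1.0, each extra edge costs 0.1."""
    return round(1.0 - 0.1 * (length - 1), 2)


def extract_isa(hypernyms: dict[str, list[str]], source: str,
                target: str) -> Optional[tuple[str, str, str, float]]:
    """Propose isA(source, target) with a confidence based on path length."""
    length = path_length(hypernyms, source, target)
    if length is None:
        return None
    return (source, "isA", target, confidence(length))


toy_hypernyms = {"Mars": ["Rocky planet"], "Rocky planet": ["Planet"]}
direct = extract_isa(toy_hypernyms, "Mars", "Rocky planet")
indirect = extract_isa(toy_hypernyms, "Mars", "Planet")
```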

• WikiRelations: set of tuples that state the relationships between Wikipedia categories.
WikiRelations Tuples:
T Tauri star, Star, isA
Radiation, Radio waves, partOf
Light, Electromagnetic radiation, partOf
007 license to kill, video games, isA
…………
Figure: each topic is matched to its Wikipedia categories (Topic: Light, Cat1: Light, Cat2: …; Topic: Electromagnetic radiation, Cat1: Electromagnetic radiation) and the category pairs are looked up in the tuples, yielding Light partOf Electromagnetic radiation (Confidence = 0.7).

• Extractor based on the rules defined in (Larrañaga, 2012).
Figure: Shallow Parsing Grammar Extractor. Find Mentions locates the sentences of the Textbook that mention the Topics (Solar System, Earth, Planet, Mars); a Constraint Grammar applied to POS tags then extracts the Relationships.
• Grammar rule (example): Topic + [*]+ part of + [det] +Topic
• Sentence with mentions: "Earth is part of the Solar System."
• Relationship: Earth partOf Solar System
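The example rule, Topic + [*]+ part of + [det] +Topic, can be approximated with a regular expression over raw text. This is a simplification swapped in for illustration: the extractor described on the slide applies a Constraint Grammar to POS tags, which the sketch does not do. The determiner list and function names are assumptions.

```python
import re


def build_part_of_pattern(topics: list[str]) -> re.Pattern:
    """Regex for: Topic, any words, 'part of', optional determiner, Topic."""
    # Longest topics first so multi-word topics win over their substrings.
    alternatives = "|".join(
        re.escape(t) for t in sorted(topics, key=len, reverse=True)
    )
    return re.compile(
        rf"\b(?P<part>{alternatives})\b[^.]*?\bpart of\b\s+"
        rf"(?:(?:the|a|an)\s+)?(?P<whole>{alternatives})\b",
        re.IGNORECASE,
    )


def extract_part_of(sentence: str, topics: list[str]) -> list[tuple[str, str, str]]:
    """Return (part, 'partOf', whole) triples found in one sentence."""
    canonical = {t.lower(): t for t in topics}
    pattern = build_part_of_pattern(topics)
    return [
        (canonical[m.group("part").lower()], "partOf",
         canonical[m.group("whole").lower()])
        for m in pattern.finditer(sentence)
    ]


topics = ["Solar System", "Earth", "Planet", "Mars"]
found = extract_part_of("Earth is part of the Solar System.", topics)
```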

Figure: Find Mentions locates sentences of the Textbook in which the Topics (Wavelength, Emission spectrum, Planet, Solar System) appear together, e.g.
"...leading to different radiated wavelengths, make up an emission spectrum."
"... the emission spectrum of a particular star, the wavelength of …"
Possible candidates: Wavelength, Emission Spectrum (2 times).
The links in / links out of the candidates on Wikipedia are then looked up (Emission spectrum (link out) Wavelength; Wavelength (link out) Emission spectrum) and, when Relatedness > threshold, a Reasoner proposes the relation:
Emission spectrum pedagogicallyClose Wavelength

Figure: Topic1 to Topic4 with the number of mentions (links) between them.
• Topic1 is pedagogicallyClose to Topic2 (mentions: Topic2, 3 mentions; Topic1, 4 mentions).
• Topic3 is a prerequisite of Topic4 (mentions: Topic3, 4 mentions; Topic4, 1 mention).

Figure: LiReWi pipeline (Mapping, Candidate Relationship Extraction, Combination & Filtering) over the Electronic Textbook, the Topics and the Knowledge Bases; its output is the Learning Domain Ontology.

Combination & Filtering Relationships
Figure: Relationships → Confidence Combiner → Conflict Resolver → Filter → Final Relationships.
Candidate relationships:
-Earth isA Planet (WordNet Ex) (Conf=1)
-Earth isA Planet (WikiRelations Ex) (Conf=0.8)
-Planet isA Earth (WikiTax Ex) (Conf=0.7)
-Earth partOf Solar System (WordNet Ex) (Conf=1)
-Earth isA Terrestrial Planet (WikiTax Ex) (Conf=0.5)
Relationships combined:
-Earth isA Planet (WordNet Ex, WikiRelations Ex) (Conf=1)
Conflict Resolution: contradictory candidates are resolved.
Filter: relationships below the confidence threshold are discarded.
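A sketch of the three steps on the slide's candidate list. Several choices here are assumptions, since the slide does not state them: combined confidence is the maximum over extractors, a conflict is a pair of inverse relationships resolved in favour of the higher confidence, and the filter threshold is 0.6.

```python
CONFIDENCE_THRESHOLD = 0.6   # assumed value; the slide gives no number

Key = tuple[str, str, str]                 # (source, relation, target)
Combined = dict[Key, tuple[list[str], float]]


def combine(candidates: list[tuple[str, str, str, str, float]]) -> Combined:
    """Merge identical relationships, keeping all extractors and the max confidence."""
    combined: Combined = {}
    for source, relation, target, extractor, conf in candidates:
        extractors, best = combined.get((source, relation, target), ([], 0.0))
        combined[(source, relation, target)] = (extractors + [extractor], max(best, conf))
    return combined


def resolve_conflicts(combined: Combined) -> Combined:
    """Drop a relationship when its inverse has a strictly higher confidence."""
    resolved = dict(combined)
    for (source, relation, target), (_, conf) in combined.items():
        inverse = (target, relation, source)
        if inverse in combined and combined[inverse][1] > conf:
            del resolved[(source, relation, target)]
    return resolved


def filter_low_confidence(combined: Combined) -> Combined:
    """Keep only relationships that reach the confidence threshold."""
    return {k: v for k, v in combined.items() if v[1] >= CONFIDENCE_THRESHOLD}


candidates = [
    ("Earth", "isA", "Planet", "WordNet Ex", 1.0),
    ("Earth", "isA", "Planet", "WikiRelations Ex", 0.8),
    ("Planet", "isA", "Earth", "WikiTax Ex", 0.7),
    ("Earth", "partOf", "Solar System", "WordNet Ex", 1.0),
    ("Earth", "isA", "Terrestrial Planet", "WikiTax Ex", 0.5),
]

final = filter_low_confidence(resolve_conflicts(combine(candidates)))
```

With equal confidences both directions would survive this resolver, so a real implementation would need an explicit tie-breaking rule.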

Evaluation
Tuning up
• Introduction to Object Oriented Programming textbook.
Evaluation
• Gold Standard and Expert Validation.
• Introduction to Astronomy textbook.
• Gold Standard: four experts stated the set of relationships.
• A subset of the main domain topics was used, selected according to the score given by LiTeWi.

Results
             Precision (%)   Recall (%)   F1-Score (%)   Expert Validation (%)
LiReWi       36.21           50.57        42.42          43.98
DOM-Sortze   63.27           20.74        31.24          N.A.


Introduction
In DOM-Sortze, LOs were acquired for Basque using Shallow Parsing.
The approach has been validated for English.
LiLoWi has been developed to move towards the elicitation of
Multilingual LOs.

Adapting Learning Object elicitation to English
• Pattern (Basque): adibidez, @topic
• Pattern (English): for instance, @topic
• Example (Basque): Uretan, adibidez hidrogeno eta oxigeno atomoak daude.
• Example (English): For instance, there are hydrogen and oxygen atoms in water.
Figure: Find Mentions locates the Topics (Wavelength, Emission spectrum, Earth, Solar System) in the Textbook, and the Grammar turns the matching sentences (e.g. "Earth is a planet.") into Learning Objects (e.g. "The Moon is Earth's only natural satellite").
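A sketch of how the English pattern "for instance, @topic" could be applied: a sentence becomes an Example learning object when it contains the discourse marker and mentions at least one topic. This is a plain-text simplification of the grammar-based approach, and the topic list, dictionary keys and function name are assumptions.

```python
import re

EXAMPLE_MARKER = re.compile(r"\bfor instance\b", re.IGNORECASE)


def gather_examples(sentences: list[str], topics: list[str]) -> list[dict]:
    """Return Example learning objects for sentences that match the pattern."""
    learning_objects = []
    for sentence in sentences:
        if not EXAMPLE_MARKER.search(sentence):
            continue
        mentioned = [
            topic for topic in topics
            if re.search(rf"\b{re.escape(topic)}\b", sentence, re.IGNORECASE)
        ]
        if mentioned:
            learning_objects.append(
                {"type": "Example", "topics": mentioned, "text": sentence}
            )
    return learning_objects


sentences = [
    "For instance, there are hydrogen and oxygen atoms in water.",
    "Earth is a planet.",
]
examples = gather_examples(sentences, topics=["Water", "Earth"])
```

Supporting another language then amounts to swapping the marker (for Basque, "adibidez") while the rest of the procedure stays the same.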

Evaluation
Gold Standard and Expert Validation:
• Evaluated on Introduction to Object Oriented Programming.
• Gold Standard built by some experts.
Two Aspects
• Grammar.
• Learning Objects.

Evaluation
• Grammar
                Definitions   Examples   Prob. Stat.   Princ. Stat.   Total
Found           164           1          12            49             226
Correct         138           1          7             35             181
Precision (%)   84.15         100        58.33         71.43          80.09
• Learning Objects
             Recall (%)   Expert Validation (%)
DOM-Sortze   70.31        91.88
LiDom        75.93        86.79

LiLoWi
Figure: for each topic (Solar System, Emission spectrum, Earth, …), LiLoWi gathers Multilingual LOs from WordNet/Wikipedia; a Metadata Generator annotates them and links the Equivalents across languages (LO1en, LO2en, LO2es).

Evaluation
• Evaluated on the Principles of Object-Oriented Programming textbook.
• Used the same LDO described in the previous experiment.
• Expert Validation.
Two Aspects
• How LiLoWi enhanced the LO coverage for the LDO topics.
• How many multilingual LOs are extracted.

Results
• Grammar-based approach
                      Total   Definitions
Number of topics      21      19
Topics coverage (%)   25.61   19.51
• Grammar + Wikipedia/WordNet
Column headers: Definitions, References; English, Spanish, Basque, French
Number of topics (Topic coverage %), column by column: 46 (56.10), 36 (43.90), 9 (10.97), 36 (43.90), 12 (14.63)


1. A suitable formalism to represent Multilingual Domain Modules has been provided.
2. A method for the elicitation of multilingual terminology has been developed.
– To our knowledge, the first term extractor based on searching patterns for educational content.
3. Relationship Acquisition has been improved.
– Extension of the outline processor to English + enhancement with Wikipedia.
– Development of LiReWi, a module for the elicitation of pedagogical relationships for Educational Ontologies.
– Development of a state-of-the-art mapper from Wikipedia to WordNet.
4. A method for multilingual LO generation has been developed.
– Extension of DOM-Sortze to English.
– Development of LiLoWi, a module for the elicitation of multilingual LOs using different knowledge bases.
Goal Achievement

Future Work
• Automatising the inclusion of new languages.
• Multilingual Learning Object generation from similarity and machine translation techniques.
• Concept Map-Based Learning Object Generation.
• Improvements on each module of LiDom Builder.

Software Released
Software
• LiTeWi, released with Spanish/English support: https://github.com/Neuw84/LiTe
• Wikipedia/WordNet mapper: https://github.com/Neuw84/Wikipedia2WordNet
• Spanish stemmer: https://github.com/Neuw84/SpanishInflectorStemmer
• Training Data for Wikiminer: https://github.com/Neuw84/Wikipedia353Spanish
• LiReWi: coming soon….
Web Demo
• LiDom Builder: http://galan.ehu.es/lidom/

Publications
A Combined Approach for Eliciting Relationships for Educational Ontologies Using Several
Knowledge Bases.
Ángel Conde, Mikel Larrañaga, Ana Arruarte, Jon A. Elorriaga.
Journal of Knowledge-Based Systems. Submitted.
LiteWi: A Combined Term Extraction Method for Eliciting Educational Ontologies from Textbooks.
Ángel Conde, Mikel Larrañaga, Ana Arruarte, Jon A. Elorriaga, Dan Roth.
Journal of the Association for Information Science and Technology, 67(2), pp. 380–399, 2016.
Testing Language Independence in the Semiautomatic Construction of Educational Ontologies.
Ángel Conde, Mikel Larrañaga, Ana Arruarte, Jon A. Elorriaga.
12th International Conference on Intelligent Tutoring Systems ITS 2014, Springer, Vol. 8474, pp.
545-550, 2014.
Automatic Generation of the Domain Module from Electronic Textbooks. Method and Validation.
Mikel Larrañaga, Ángel Conde, Iñaki Calvo, Jon A. Elorriaga, Ana Arruarte
IEEE Transactions on Knowledge and Data Engineering, 26(1), pp. 69-82, 2014.
Automating the Authoring of Learning Material in Computer Engineering Education.
Ángel Conde, Mikel Larrañaga, Iñaki Calvo, Jon A. Elorriaga, Ana Arruarte.
42nd Frontiers in Education Conference, pp. 1376-1381, 2012.

