Application of formal ontology and semantic techniques to improve the coherence and usability of lexical resources

Nervo Verdezoto
University of Trento
nervo.verdezoto@studenti.unitn.it

Prof. Laure Vieu and Prof. Alessandro Oltramari
Tutors

Master HLTI 2009-2010

Outline

• Objectives and Tasks
– Data
– Ontological Principles
– Experiments
– Results
• Manual Analysis and discussion
• Summary


Objectives

• Get familiar with Ontology-driven
Conceptual Modeling
• Develop semi-automatic methods to
spot semantic/ontological problems in
WordNet at lower levels
• Get familiar with scientific reporting


Tasks


• Study WordNet semantic relations to spot ontological problems
• Applications:
– RTE (Recognizing Textual Entailment)
– Automatic detection of part-whole relations, e.g. (atmospheric phenomenon, communication), (shape, artifact), (shape, physical phenomenon)


The Data

WordNet: 82,115 synsets were examined to collect the initial data; 22,187 of them were involved in meronymy and holonymy relations (50% meronyms, 50% holonyms).

SemEval 2007: 89 relation pairs were extracted.

Additionally, we eliminated the redundant pairs from the initial data.
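
The redundancy-elimination step can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: it assumes the redundancy comes from each relation being recorded twice (once as a meronym link, once as the inverse holonym link), and the triples and the names `normalize` and `unique_pairs` are made up for the example.

```python
# Toy (relation, source, target) triples; the identifiers are illustrative only.
raw_relations = [
    ("meronym", "wheel", "car"),    # wheel is a part of car
    ("holonym", "car", "wheel"),    # the same fact, seen from the whole
    ("meronym", "tree", "forest"),
    ("meronym", "wheel", "car"),    # exact repeat
]


def normalize(relation, source, target):
    """Return the pair as (part, whole), whatever the direction of the link."""
    if relation == "meronym":
        return (source, target)
    if relation == "holonym":
        return (target, source)
    raise ValueError(f"unknown relation: {relation}")


def unique_pairs(relations):
    """Keep one (part, whole) pair per fact, preserving first-seen order."""
    seen = set()
    result = []
    for relation, source, target in relations:
        pair = normalize(relation, source, target)
        if pair not in seen:
            seen.add(pair)
            result.append(pair)
    return result


pairs = unique_pairs(raw_relations)
print(pairs)
```

Normalizing every link to a (part, whole) orientation before deduplicating means a meronym link and its inverse holonym link collapse into a single pair.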

[Figure: bar chart of the number of meronym pairs (# PAIRS – MERONYMS) by type: MEMBER, PART, SUBSTANCE; y-axis from 0 to 14,000]


Ontological Principles

• Constraint: a part and its whole should be of a similar nature.
• DOLCE ontological distinctions between:
– endurants (ED) or physical entities (like a dog, a table, a cave, etc.)
– perdurants (PD) or eventualities (like a lecture, a sleep, a rainfall, etc.)
– abstracts (AB), entities like a number, the content of a text, etc.
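
The similar-nature constraint can be illustrated with a small sketch. It assumes a toy single-inheritance hypernym table (real WordNet allows multiple hypernyms and has many more levels) and approximates the three DOLCE categories by top-level nodes: process, event and state for PD, the rest of physical entity for ED, and the rest of abstraction for AB. The table and the names `dolce_category` and `same_nature` are hypothetical.

```python
# Toy child -> parent hypernym table; heavily simplified and illustrative only.
TOY_HYPERNYM = {
    "dog": "animal",
    "animal": "physical entity",
    "rain": "process",
    "process": "physical entity",
    "lecture": "event",
    "event": "abstraction",
    "sleep": "state",
    "state": "abstraction",
    "number": "abstraction",
}

PERDURANT_ROOTS = {"process", "event", "state"}


def dolce_category(synset, hypernym=TOY_HYPERNYM):
    """Walk up the hypernym chain and return 'ED', 'PD', 'AB' or None."""
    node = synset
    while node is not None:
        # Perdurant roots are checked first: they sit below the two top nodes.
        if node in PERDURANT_ROOTS:
            return "PD"
        if node == "physical entity":
            return "ED"
        if node == "abstraction":
            return "AB"
        node = hypernym.get(node)
    return None


def same_nature(part, whole, hypernym=TOY_HYPERNYM):
    """True when part and whole fall in the same category."""
    return dolce_category(part, hypernym) == dolce_category(whole, hypernym)


for word in ("dog", "lecture", "rain", "number"):
    print(word, dolce_category(word))
```

Because the walk goes from the synset upwards, a synset under process is labelled PD before physical entity is reached, which mirrors the "but not process" exclusion in the category definitions.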


Experiments – Tests
[defining queries]

• Semantic Constraints
– Test 0: Individual – Class pairs:
• (great_divide%1:15:00,continental_divide%1:15:00)
– Test 4: Meronymy – Member and Member–Collection
pairs:
• (coronal%1:06:00, rose%1:20:00)

• Ontological Constraints
– Test 1: ED–AB (test 1.1) or AB–ED (test 1.2)
• Test 1.1: physical entity 1:03:00 (but not process 1:03:00) / abstraction 1:03:00 (but not event 1:03:00 + state 1:03:00), e.g. ⟨head%1:06:04::, coin%1:21:02::⟩
– Test 2: ED–PD (test 2.1) or PD–ED (test 2.2)
• Test 2.1: physical entity 1:03:00 (but not process 1:03:00) / process 1:03:00 + event 1:03:00 + state 1:03:00, e.g. ⟨air%1:27:00, wind%1:19:00⟩
– Test 3: PD–AB (test 3.1) or AB–PD (test 3.2)
• Test 3.1: abstraction 1:03:00 (but not event 1:03:00 + state 1:03:00; first with all, then without group) / event 1:03:00 + state 1:03:00 + process 1:03:00, e.g. ⟨regulation_time%1:28:00, athletic_game%1:04:00⟩
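
As a rough sketch of how the ontological tests sort pairs, the snippet below assigns each (part, whole) pair to Test 1, 2 or 3 according to the two categories involved, ignoring the direction that distinguishes the sub-tests (x.1 vs x.2). It assumes the ED/PD/AB label of each synset is already known; the labels given to the three example pairs follow the tests they illustrate on the slide, and the names `TEST_BY_CATEGORIES` and `flag_pair` are hypothetical.

```python
from collections import Counter

# Unordered category pair -> test that flags it.
TEST_BY_CATEGORIES = {
    frozenset({"ED", "AB"}): "Test 1",
    frozenset({"ED", "PD"}): "Test 2",
    frozenset({"PD", "AB"}): "Test 3",
}


def flag_pair(part_category, whole_category):
    """Return the test that flags this category mismatch, or None if consistent."""
    return TEST_BY_CATEGORIES.get(frozenset({part_category, whole_category}))


# (part, part category, whole, whole category); labels are assumptions.
labelled_pairs = [
    ("head%1:06:04::", "ED", "coin%1:21:02::", "AB"),
    ("air%1:27:00", "ED", "wind%1:19:00", "PD"),
    ("regulation_time%1:28:00", "AB", "athletic_game%1:04:00", "PD"),
    ("wheel", "ED", "car", "ED"),  # made-up consistent pair, not flagged
]

counts = Counter()
for part, part_cat, whole, whole_cat in labelled_pairs:
    test = flag_pair(part_cat, whole_cat)
    if test is not None:
        counts[test] += 1

print(dict(counts))
```

Counting flagged pairs per test in this way yields the kind of per-test totals that the results chart reports.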

Results
[Figure: bar chart "Ontological Problems", number of problematic pairs found by Test 1, Test 2 and Test 3 in WordNet and SemEval; bar labels: 163, 108, 45, 2, 2]


Manual Analysis and discussion

General Errors
• a synset is considered as a class but should be an individual
– Confusion between class and an instance of this class for which the term is used with a specific
sense e.g., ⟨great_divide%1:15:00,continental_divide%1:15:00⟩
– Confusion between class and group e.g., new_testament%1:10:00
• a synset is not attached to the right place in the taxonomy
– Confusion between a property and a physical entity having that property (shape, quantity or
measure, location) or between a relation and a physical entity being an argument in that relation
e.g., coin%1:21:02, hay_mow%1:23:00 - calyx%1:20:00, mothball%1:06:00
• a synset mixes two senses, and the missing sense should be attached elsewhere in the
taxonomy or this missing sense is an individual, not a class
– Confusion between 2 senses of a word, amounting to a missing sense e.g.
⟨ethiopian%1:18:00, ethiopia%1:15:00⟩
• the meronymy relation is wrong
– Confusion between meronymy and other relations (location, participation, etc.):
• “is located in” - ⟨balkan_wars%1:04:00, balkan_peninsula%1:15:00⟩
• “participates in” - ⟨feminist%1:18:00,feminist_movement%1:04:00⟩


Summary and future work

• An automatic query system based on ontological principles and semantic constraints is effective for building semi-automatic methods to spot errors in WordNet
• Increase the number and types of experiments
• Exploit the results of this study to:
– Develop a semi-automatic tool for "cleaning up" WordNet
– Design and develop guidelines to help lexicographers (Christiane Fellbaum from the Princeton WordNet Group) prevent classical ontological mistakes
– Carry out an evaluation for NLP applications


THANK YOU

Nervo Verdezoto D.
