Extracting Sense-Disambiguated Example Sentences From Parallel Corpora

Gerard de Melo
Gerard de MeloAssistant Professor at Rutgers University
Introduction
Example Extraction
Example Selection
Results
Conclusion
Extracting
Sense-Disambiguated Example Sentences
From Parallel Corpora
Gerard de Melo and Gerhard Weikum
Max Planck Institute for Informatics
Saarbr¨ucken, Germany
2009-09-18
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Outline
1 Introduction
2 Example Extraction
3 Example Selection
4 Results
5 Conclusion
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Outline
1 Introduction
2 Example Extraction
3 Example Selection
4 Results
5 Conclusion
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Introduction
a thin, sharp-pointed metal pin with a raised spiral
thread running around it and a slotted head, used to join
things together by being rotated in under pressure
(Concise OED)
Use a crosshead screwdriver to tighten the screws inserted
in the pre-fixed connectors.
(Web)
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Introduction
a thin, sharp-pointed metal pin with a raised spiral
thread running around it and a slotted head, used to join
things together by being rotated in under pressure
(Concise OED)
Use a crosshead screwdriver to tighten the screws inserted
in the pre-fixed connectors.
(Web)
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Introduction
Example Sentences
any sentence that contains a word (being used in a specific
sense)
allow the user to grasp a word’s meaning
see circumstances a word would typically be used in
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Introduction
Example Sentences
any sentence that contains a word (being used in a specific
sense)
allow the user to grasp a word’s meaning
see circumstances a word would typically be used in
traditional intensional word definitions may be too confusing
humans used to deriving meaning from context
users can verify whether they have understood definition correctly
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Introduction
Example Sentences
any sentence that contains a word (being used in a specific
sense)
allow the user to grasp a word’s meaning
see circumstances a word would typically be used in
possible contexts, e.g. child vs. youngster
typical collocations, e.g. to give birth or birth rate
but not *to give nascence or *nascence rate
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Introduction
Example Sentences
=⇒ most modern dictionaries include example sentences
number, length often limited
digital dictionaries: tight space constraints of print media no
longer apply!
larger number of example sentences can be presented to the
user on demand
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Introduction
Example Sentences
=⇒ most modern dictionaries include example sentences
number, length often limited
digital dictionaries: tight space constraints of print media no
longer apply!
larger number of example sentences can be presented to the
user on demand
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Introduction
Example Sentences
=⇒ most modern dictionaries include example sentences
number, length often limited
digital dictionaries: tight space constraints of print media no
longer apply!
larger number of example sentences can be presented to the
user on demand
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Introduction
Example Sentences
=⇒ most modern dictionaries include example sentences
number, length often limited
digital dictionaries: tight space constraints of print media no
longer apply!
larger number of example sentences can be presented to the
user on demand
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Introduction
Goals
automatically obtain example sentences for a specific sense of
a word
choose set of representative sentences to present to user
distinguish senses:
There were many bats flying out of the cave.
vs.
In professional baseball, only wooden bats
are permitted.
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Introduction
Goals
automatically obtain example sentences for a specific sense of
a word
choose set of representative sentences to present to user
not all examples are equally useful
screen space may be limited (can still show more only on demand)
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Outline
1 Introduction
2 Example Extraction
3 Example Selection
4 Results
5 Conclusion
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Extraction
Sense Index (Dictionary)
English: Princeton WordNet 3.0
Spanish: Spanish WordNet
σ(t): get senses for term t
σ(t, s): is sense s a sense of term t?
behind the scenes: morphological analysis, multi-word
expression detection
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Extraction
Sense Index (Dictionary)
English: Princeton WordNet 3.0
Spanish: Spanish WordNet
σ(t): get senses for term t
σ(t, s): is sense s a sense of term t?
behind the scenes: morphological analysis, multi-word
expression detection
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Extraction
Sense Index (Dictionary)
English: Princeton WordNet 3.0
Spanish: Spanish WordNet
σ(t): get senses for term t
σ(t, s): is sense s a sense of term t?
behind the scenes: morphological analysis, multi-word
expression detection
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Extraction
Sense Disambiguation
disambiguate word occurrences, then use sentence as example
for those word senses
fine-grained WSD not reliable enough (cf. SemEval results)
idea: use parallel corpora, jointly look at both versions of text
=⇒ greater accuracy can be achieved
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Extraction
Sense Disambiguation
disambiguate word occurrences, then use sentence as example
for those word senses
fine-grained WSD not reliable enough (cf. SemEval results)
idea: use parallel corpora, jointly look at both versions of text
=⇒ greater accuracy can be achieved
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Extraction
Sense Disambiguation
disambiguate word occurrences, then use sentence as example
for those word senses
fine-grained WSD not reliable enough (cf. SemEval results)
idea: use parallel corpora, jointly look at both versions of text
=⇒ greater accuracy can be achieved
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Extraction
Sense Disambiguation
disambiguate word occurrences, then use sentence as example
for those word senses
fine-grained WSD not reliable enough (cf. SemEval results)
idea: use parallel corpora, jointly look at both versions of text
=⇒ greater accuracy can be achieved
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Introduction
Sense Disambiguation
There were many bats flying out of the cave.
Aus der H¨ohle flogen viele Flederm¨ause.
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Introduction
Sense Disambiguation
There were many bats flying out of the cave.
Aus der H¨ohle flogen viele Flederm¨ause.
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Extraction
Parallel Disambiguation
req.: text available in two languages a, b
good news: more and more parallel corpora available
word alignment
compute score
wsd(sa|ta, tb) = wsd(sa|ta) σ(ta,sa)csim(tb,sa)
s ∈σ(ta)
σ(ta,s )csim(tb,s )
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Extraction
Parallel Disambiguation
req.: text available in two languages a, b
good news: more and more parallel corpora available
word alignment
compute score
wsd(sa|ta, tb) = wsd(sa|ta) σ(ta,sa)csim(tb,sa)
s ∈σ(ta)
σ(ta,s )csim(tb,s )
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Extraction
Parallel Disambiguation
req.: text available in two languages a, b
good news: more and more parallel corpora available
word alignment
compute score
wsd(sa|ta, tb) = wsd(sa|ta) σ(ta,sa)csim(tb,sa)
s ∈σ(ta)
σ(ta,s )csim(tb,s )
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Extraction
Components
csim(tb, sa): cross-lingual similarity
monolingual WSD: compare bag-of-words vector for term
context with sense context (constructed from definitions, etc.)
Semantic Similarity sim(s1, s2)
identify only near-identical senses (e.g. of house and home),
not arbitrary associations (e.g. house and door)
csim(tb, sa) =
sb∈σ(tb)
sim(sa, sb) wsd(sb|tb)
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Extraction
Components
csim(tb, sa): cross-lingual similarity
monolingual WSD: compare bag-of-words vector for term
context with sense context (constructed from definitions, etc.)
Semantic Similarity sim(s1, s2)
identify only near-identical senses (e.g. of house and home),
not arbitrary associations (e.g. house and door)
wsd(s|t) = σ(t, s) α + v(s)T
v(t)
||v(s)|| ||v(t)||
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Extraction
Components
csim(tb, sa): cross-lingual similarity
monolingual WSD: compare bag-of-words vector for term
context with sense context (constructed from definitions, etc.)
Semantic Similarity sim(s1, s2)
identify only near-identical senses (e.g. of house and home),
not arbitrary associations (e.g. house and door)
sim(s1, s2) =



1 s1 = s2
1 s1, s2 in near-synonymy relationship
1 s1, s2 in hypernymy/hyponymy relationship
0 otherwise
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Extraction
Extraction
Use as example sentence iff score sufficiently high
wsd(sa|ta, tb) = wsd(ta, sa) σ(ta,sa)csim(tb,sa)
s ∈σ(ta)
σ(ta,s )csim(tb,s )
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Outline
1 Introduction
2 Example Extraction
3 Example Selection
4 Results
5 Conclusion
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Selection
Task Motivation
for computational applications: more data =⇒ better data
for human users: a good limited selection should be provided
at first
assumption: space constraint k - number of sentences
goal: choose a good set of k example sentences
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Selection
Task Motivation
for computational applications: more data =⇒ better data
for human users: a good limited selection should be provided
at first
assumption: space constraint k - number of sentences
goal: choose a good set of k example sentences
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Selection
Task Motivation
for computational applications: more data =⇒ better data
for human users: a good limited selection should be provided
at first
assumption: space constraint k - number of sentences
goal: choose a good set of k example sentences
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Selection
What is a good example sentence?
one that showcases how a word is used in context
(typical prepositions, collocations, etc.)
one that helps in grasping the meaning
related work by Rychly et al. (2008):
one that is intelligible to learners
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Selection
What is a good example sentence?
one that showcases how a word is used in context
(typical prepositions, collocations, etc.)
one that helps in grasping the meaning
related work by Rychly et al. (2008):
one that is intelligible to learners
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Selection
What is a good example sentence?
one that showcases how a word is used in context
(typical prepositions, collocations, etc.)
one that helps in grasping the meaning
related work by Rychly et al. (2008):
one that is intelligible to learners
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Selection
Assets
example sentences can be thought of as having certain assets
6 different classes of n-gram assets
1 class of assets for entire original sentence
e.g. for account:
containing frequent bigram bank account can be an asset
containing frequent trigram to account for can be an asset
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Selection
Assets
example sentences can be thought of as having certain assets
6 different classes of n-gram assets
1 class of assets for entire original sentence
1: original unigram
2-5: bigram with preceding/following, trigram, etc.
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Selection
Assets
example sentences can be thought of as having certain assets
6 different classes of n-gram assets
1 class of assets for entire original sentence
weights w(a) = f (x,a)
n
i=1 f (xi ,a)
, etc.
=⇒ higher weight for open an account than for Peter’s chequing
account
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Selection
Assets
example sentences can be thought of as having certain assets
6 different classes of n-gram assets
1 class of assets for entire original sentence
weight: cosine similarity with sense definition
=⇒ bias towards explanatory sentences
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Selection
What is a good set of example sentences?
Select sentences with greatest total asset weights?
Instead: maximize
a∈
x∈C
A(x)
w(a)
Problem is NP-hard (proof via Vertex Cover reduction)
−→ use greedy heuristic
No: Most frequent expressions would dominate the result set.
Need diversity!
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Selection
What is a good set of example sentences?
Select sentences with greatest total asset weights?
Instead: maximize
a∈
x∈C
A(x)
w(a)
Problem is NP-hard (proof via Vertex Cover reduction)
−→ use greedy heuristic
each asset in result set only counted once
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Selection
What is a good set of example sentences?
Select sentences with greatest total asset weights?
Instead: maximize
a∈
x∈C
A(x)
w(a)
Problem is NP-hard (proof via Vertex Cover reduction)
−→ use greedy heuristic
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Selection
What is a good set of example sentences?
Select sentences with greatest total asset weights?
Instead: maximize
a∈
x∈C
A(x)
w(a)
Problem is NP-hard (proof via Vertex Cover reduction)
−→ use greedy heuristic
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Selection
Greedy Heuristic to select set C
Greedily select highest-weighted sentence x:
x ← argmax
x∈XC a∈A(x)
w(a)
Then set the weights w(a) of all assets a ∈ A(x) to zero
=⇒ ranked list obtained (useful!)
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Selection
Greedy Heuristic to select set C
Greedily select highest-weighted sentence x:
x ← argmax
x∈XC a∈A(x)
w(a)
Then set the weights w(a) of all assets a ∈ A(x) to zero
=⇒ ranked list obtained (useful!)
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Example Selection
Greedy Heuristic to select set C
Greedily select highest-weighted sentence x:
x ← argmax
x∈XC a∈A(x)
w(a)
Then set the weights w(a) of all assets a ∈ A(x) to zero
=⇒ ranked list obtained (useful!)
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Outline
1 Introduction
2 Example Extraction
3 Example Selection
4 Results
5 Conclusion
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Results
Resources
OPUS corpora (OpenSubtitles, OpenOffice.org collections)
and Reuters RCV1 for non-disambiguated sentence selection
GIZA++/UPlug for lexical alignment
Princeton WordNet 3.0 and Spanish WordNet
(+ sense mappings for compatibility)
TreeTagger for morphological analysis
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Results
Resources
OPUS corpora (OpenSubtitles, OpenOffice.org collections)
and Reuters RCV1 for non-disambiguated sentence selection
GIZA++/UPlug for lexical alignment
Princeton WordNet 3.0 and Spanish WordNet
(+ sense mappings for compatibility)
TreeTagger for morphological analysis
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Results
Resources
OPUS corpora (OpenSubtitles, OpenOffice.org collections)
and Reuters RCV1 for non-disambiguated sentence selection
GIZA++/UPlug for lexical alignment
Princeton WordNet 3.0 and Spanish WordNet
(+ sense mappings for compatibility)
TreeTagger for morphological analysis
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Results
Resources
OPUS corpora (OpenSubtitles, OpenOffice.org collections)
and Reuters RCV1 for non-disambiguated sentence selection
GIZA++/UPlug for lexical alignment
Princeton WordNet 3.0 and Spanish WordNet
(+ sense mappings for compatibility)
TreeTagger for morphological analysis
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Results
Examples from OpenSubtitles Corpus
line (something, as a
cord or rope, that is long
and thin and flexible)
I got some fishing line if you want
me to stitch that.
Von Sefelt, get the stern line.
line (the descendants of
one individual)
What line of kings do you descend
from?
My line has ended.
catch (catch up with
and possibly overtake)
He’s got 100 laps to catch Beau
Brandenburg if he wants to become
world champion.
They won’t catch up.
catch (grasp with the I didn’t catch your name.
mind or develop an
understanding of)
Sorry, I didn’t catch it.
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Results
Examples from OpenSubtitles Corpus
talk (exchange thoughts,
talk with)
Why don’t we have a seat and talk it
over.
Okay, I’ll talk to you but one
condition...
talk (use language) But we’ll be listening from the kitchen
so talk loud.
You spit when you talk.
opening (a ceremony
accompanying the start
of some enterprise)
We don’t have much time until the
opening day of Exhibition.
What a disaster tomorrow is the
opening ceremony!
opening (the first
performance, as of a
theatrical production)
It will be rehearsed in the morning
ready for the opening tomorrow night.
You ready for our big opening night?
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Results
Sense-disambiguated Example Sentences
Corpus Covered
Senses
Example
Sentences
Accuracy
(Wilson
interval)
OpenSubtitles en-es 13,559 117,078 0.815 ± 0.081
OpenSubtitles es-en 8,833 113,018 0.798 ± 0.090
OpenOffice.org en-es 1,341 13,295 0.803 ± 0.081
OpenOffice.org es-en 932 11,181 0.793 ± 0.087
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Results
error analysis:
(1) incorrect lexical alignments unlikely to lead to incorrect
disambiguation
(2) incomplete sense inventories can lead to mistakes
(3) also, on a few occasions, the morphological analyser led to
wrong results
implications for future work:
(1) better monolingual WSD
(2) additional languages, especially phylogenetically unrelated
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Results
error analysis:
(1) incorrect lexical alignments unlikely to lead to incorrect
disambiguation
(2) incomplete sense inventories can lead to mistakes
(3) also, on a few occasions, the morphological analyser led to
wrong results
implications for future work:
(1) better monolingual WSD
(2) additional languages, especially phylogenetically unrelated
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Results
Rankings for OpenSubtitles Corpus
being or located on or
directed toward the
1. In America we drive on the right
side of the road.
side of the body to the
east when facing north
2. I’ll tie down your right arm so
you can learn to throw a left.
3. If we wait from the right side, we
have an advantage there.
put up with something 1. You can’t stand it, can you?
or somebody
unpleasant
2. You really think I can tolerate
such an act?
3. No one can stand that harmonica
all day long.
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Results
Rankings for OpenSubtitles Corpus
using or providing or 1. Not the electric chair.
producing or
transmitting or
2. Some electrical current
circulating through my body.
operated by electricity 3. Near as I can tell it’s an
electrical impulse.
take something or
somebody with
1. And they were kind enough to
take me in here.
oneself somewhere 2. It conveys such a great feeling.
3. We interrupt this program to
bring you a special news bulletin.
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Results
Rankings for long in RCV1 Corpus
1. In the long term interest rate market, the yield of the key
182nd 10 year Japanese government bond (JGB) fell to
2.060 percent early on Tuesday, a record low for any
benchmark 10-year JGB.
2. “The government and opposition have gambled away the last
chance for a long time to prove they recognise the country’s
problems, and that they put the national good above their
own power interests”, news weekly Der Spiegel said.
3. As long as the index keeps hovering between 957 and 995,
we will maintain our short term neutral recommendation.
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Results
Rankings for purchase in RCV1 Corpus
1. Romania’s State Ownership Fund (FPS), the country’s main
privatisation body, said on Wednesday it had accepted five
bids for the purchase of a 50.98 percent stake in the largest
local cement maker Romcim.
2. Grand Hotel Group said on Wednesday it has agreed to
procure an option to purchase the remaining 50 percent of
the Grand Hyatt complex in Melbourne from hotel developer
and investor Lustig & Moar.
3. The purchase price for the business, which had 1996
calendar year sales of about $25 million, was not disclosed.
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Results
Rankings for colonial in RCV1 Corpus
1. Hong Kong came to the end of 156 years of British colonial
rule on June 30 and is now an autonomous capitalist region
of China, running all its own affairs except defence and
diplomacy.
2. The letter was sent in error to the embassy of Portugal – the
former colonial power in East Timor – and was neither
returned nor forwarded to the Indonesian embassy.
3. Sino-British relations hit a snag when former Governor Chris
Patten launched electoral reforms in the twilight years of
colonial rule despite fierce opposition by Beijing.
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Outline
1 Introduction
2 Example Extraction
3 Example Selection
4 Results
5 Conclusion
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Conclusion
Framework for:
extracting sense-disambiguated example sentences from
parallel corpora
selecting limited numbers of sentences given space constraints
Future Work:
better disambiguation, e.g. additional languages, better
techniques
additional input for selection: sentence length, definition
extraction techniques
integrated user interface for UWN (Universal Wordnet)
Contact: demelo@mpi-inf.mpg.de
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Conclusion
Framework for:
extracting sense-disambiguated example sentences from
parallel corpora
selecting limited numbers of sentences given space constraints
Future Work:
better disambiguation, e.g. additional languages, better
techniques
additional input for selection: sentence length, definition
extraction techniques
integrated user interface for UWN (Universal Wordnet)
Contact: demelo@mpi-inf.mpg.de
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
Introduction
Example Extraction
Example Selection
Results
Conclusion
Conclusion
Framework for:
extracting sense-disambiguated example sentences from
parallel corpora
selecting limited numbers of sentences given space constraints
Future Work:
better disambiguation, e.g. additional languages, better
techniques
additional input for selection: sentence length, definition
extraction techniques
integrated user interface for UWN (Universal Wordnet)
Contact: demelo@mpi-inf.mpg.de
G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
1 of 68

Recommended

is2013 by
is2013is2013
is2013Jind?ich Matou?ek
219 views29 slides
ANChor: A powerful approach to scientific communication by
ANChor: A powerful approach to scientific communicationANChor: A powerful approach to scientific communication
ANChor: A powerful approach to scientific communicationJosh Inouye
1.3K views42 slides
Primary research and conclusion by
Primary research and conclusionPrimary research and conclusion
Primary research and conclusionrpatel35
2.3K views13 slides
UBC research paper-format by
UBC research paper-formatUBC research paper-format
UBC research paper-formatYender McLee
1.1K views11 slides
Presentation on Memorandum of Association by
Presentation on Memorandum of AssociationPresentation on Memorandum of Association
Presentation on Memorandum of AssociationNaveen Chopra
44.5K views15 slides
Schaefer framing theory and methods overview and open questions by
Schaefer   framing theory and methods overview and open questionsSchaefer   framing theory and methods overview and open questions
Schaefer framing theory and methods overview and open questionsMike Schäfer
4.2K views31 slides

More Related Content

Similar to Extracting Sense-Disambiguated Example Sentences From Parallel Corpora

Academic writing: pointers for G-cube PhD students 23-01.12 by
Academic writing: pointers for G-cube PhD students 23-01.12Academic writing: pointers for G-cube PhD students 23-01.12
Academic writing: pointers for G-cube PhD students 23-01.12Lawrie Hunter
10 views46 slides
Neural machine translation of rare words with subword units by
Neural machine translation of rare words with subword unitsNeural machine translation of rare words with subword units
Neural machine translation of rare words with subword unitsTae Hwan Jung
153 views47 slides
More on Indexing Text Operations (1).pptx by
More on Indexing  Text Operations (1).pptxMore on Indexing  Text Operations (1).pptx
More on Indexing Text Operations (1).pptxMahsadelavari
4 views25 slides
Summary distributed representations_words_phrases by
Summary distributed representations_words_phrasesSummary distributed representations_words_phrases
Summary distributed representations_words_phrasesYue Xiangnan
141 views29 slides
Oxford English for Careers_ Technology 1 Student's Book ( PDFDrive ).pdf by
Oxford English for Careers_ Technology 1 Student's Book   ( PDFDrive ).pdfOxford English for Careers_ Technology 1 Student's Book   ( PDFDrive ).pdf
Oxford English for Careers_ Technology 1 Student's Book ( PDFDrive ).pdfbeatrix15
1.1K views137 slides
Word embeddings by
Word embeddingsWord embeddings
Word embeddingsShruti kar
157 views15 slides

Similar to Extracting Sense-Disambiguated Example Sentences From Parallel Corpora(20)

Academic writing: pointers for G-cube PhD students 23-01.12 by Lawrie Hunter
Academic writing: pointers for G-cube PhD students 23-01.12Academic writing: pointers for G-cube PhD students 23-01.12
Academic writing: pointers for G-cube PhD students 23-01.12
Lawrie Hunter10 views
Neural machine translation of rare words with subword units by Tae Hwan Jung
Neural machine translation of rare words with subword unitsNeural machine translation of rare words with subword units
Neural machine translation of rare words with subword units
Tae Hwan Jung153 views
More on Indexing Text Operations (1).pptx by Mahsadelavari
More on Indexing  Text Operations (1).pptxMore on Indexing  Text Operations (1).pptx
More on Indexing Text Operations (1).pptx
Mahsadelavari4 views
Summary distributed representations_words_phrases by Yue Xiangnan
Summary distributed representations_words_phrasesSummary distributed representations_words_phrases
Summary distributed representations_words_phrases
Yue Xiangnan141 views
Oxford English for Careers_ Technology 1 Student's Book ( PDFDrive ).pdf by beatrix15
Oxford English for Careers_ Technology 1 Student's Book   ( PDFDrive ).pdfOxford English for Careers_ Technology 1 Student's Book   ( PDFDrive ).pdf
Oxford English for Careers_ Technology 1 Student's Book ( PDFDrive ).pdf
beatrix151.1K views
Word embeddings by Shruti kar
Word embeddingsWord embeddings
Word embeddings
Shruti kar157 views
Towards a Universal Wordnet by Learning from Combined Evidence by Gerard de Melo
Towards a Universal Wordnet by Learning from Combined EvidenceTowards a Universal Wordnet by Learning from Combined Evidence
Towards a Universal Wordnet by Learning from Combined Evidence
Gerard de Melo1.8K views
Tutorial - Speech Synthesis System by IJERA Editor
Tutorial - Speech Synthesis SystemTutorial - Speech Synthesis System
Tutorial - Speech Synthesis System
IJERA Editor42 views
CMSC 723: Computational Linguistics I by butest
CMSC 723: Computational Linguistics ICMSC 723: Computational Linguistics I
CMSC 723: Computational Linguistics I
butest1K views
Past, Present, and Future: Machine Translation & Natural Language Processing ... by John Tinsley
Past, Present, and Future: Machine Translation & Natural Language Processing ...Past, Present, and Future: Machine Translation & Natural Language Processing ...
Past, Present, and Future: Machine Translation & Natural Language Processing ...
John Tinsley83 views
Past, Present, and Future: Machine Translation & Natural Language Processing ... by Iconic Translation Machines
Past, Present, and Future: Machine Translation & Natural Language Processing ...Past, Present, and Future: Machine Translation & Natural Language Processing ...
Past, Present, and Future: Machine Translation & Natural Language Processing ...
Dual Learning for Machine Translation (NIPS 2016) by Toru Fujino
Dual Learning for Machine Translation (NIPS 2016)Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)
Toru Fujino3K views
A NEW STEMMER TO IMPROVE INFORMATION RETRIEVAL by IJNSA Journal
A NEW STEMMER TO IMPROVE INFORMATION RETRIEVALA NEW STEMMER TO IMPROVE INFORMATION RETRIEVAL
A NEW STEMMER TO IMPROVE INFORMATION RETRIEVAL
IJNSA Journal434 views
A NEW STEMMER TO IMPROVE INFORMATION RETRIEVAL by IJNSA Journal
A NEW STEMMER TO IMPROVE INFORMATION RETRIEVALA NEW STEMMER TO IMPROVE INFORMATION RETRIEVAL
A NEW STEMMER TO IMPROVE INFORMATION RETRIEVAL
IJNSA Journal4 views
Using ontology based context in the by ijaia
Using ontology based context in theUsing ontology based context in the
Using ontology based context in the
ijaia232 views

More from Gerard de Melo

SEMAC Graph Node Embeddings for Link Prediction by
SEMAC Graph Node Embeddings for Link PredictionSEMAC Graph Node Embeddings for Link Prediction
SEMAC Graph Node Embeddings for Link PredictionGerard de Melo
932 views39 slides
How to Manage your Research by
How to Manage your ResearchHow to Manage your Research
How to Manage your ResearchGerard de Melo
2.3K views142 slides
Knowlywood: Mining Activity Knowledge from Hollywood Narratives by
Knowlywood: Mining Activity Knowledge from Hollywood NarrativesKnowlywood: Mining Activity Knowledge from Hollywood Narratives
Knowlywood: Mining Activity Knowledge from Hollywood NarrativesGerard de Melo
848 views28 slides
Learning Multilingual Semantics from Big Data on the Web by
Learning Multilingual Semantics from Big Data on the WebLearning Multilingual Semantics from Big Data on the Web
Learning Multilingual Semantics from Big Data on the WebGerard de Melo
1.2K views156 slides
From Big Data to Valuable Knowledge by
From Big Data to Valuable KnowledgeFrom Big Data to Valuable Knowledge
From Big Data to Valuable KnowledgeGerard de Melo
1K views44 slides
Scalable Learning Technologies for Big Data Mining by
Scalable Learning Technologies for Big Data MiningScalable Learning Technologies for Big Data Mining
Scalable Learning Technologies for Big Data MiningGerard de Melo
1.7K views152 slides

More from Gerard de Melo(13)

SEMAC Graph Node Embeddings for Link Prediction by Gerard de Melo
SEMAC Graph Node Embeddings for Link PredictionSEMAC Graph Node Embeddings for Link Prediction
SEMAC Graph Node Embeddings for Link Prediction
Gerard de Melo932 views
How to Manage your Research by Gerard de Melo
How to Manage your ResearchHow to Manage your Research
How to Manage your Research
Gerard de Melo2.3K views
Knowlywood: Mining Activity Knowledge from Hollywood Narratives by Gerard de Melo
Knowlywood: Mining Activity Knowledge from Hollywood NarrativesKnowlywood: Mining Activity Knowledge from Hollywood Narratives
Knowlywood: Mining Activity Knowledge from Hollywood Narratives
Gerard de Melo848 views
Learning Multilingual Semantics from Big Data on the Web by Gerard de Melo
Learning Multilingual Semantics from Big Data on the WebLearning Multilingual Semantics from Big Data on the Web
Learning Multilingual Semantics from Big Data on the Web
Gerard de Melo1.2K views
From Big Data to Valuable Knowledge by Gerard de Melo
From Big Data to Valuable KnowledgeFrom Big Data to Valuable Knowledge
From Big Data to Valuable Knowledge
Gerard de Melo1K views
Scalable Learning Technologies for Big Data Mining by Gerard de Melo
Scalable Learning Technologies for Big Data MiningScalable Learning Technologies for Big Data Mining
Scalable Learning Technologies for Big Data Mining
Gerard de Melo1.7K views
Searching the Web of Data (Tutorial) by Gerard de Melo
Searching the Web of Data (Tutorial)Searching the Web of Data (Tutorial)
Searching the Web of Data (Tutorial)
Gerard de Melo1.9K views
From Linked Data to Tightly Integrated Data by Gerard de Melo
From Linked Data to Tightly Integrated DataFrom Linked Data to Tightly Integrated Data
From Linked Data to Tightly Integrated Data
Gerard de Melo1.6K views
Information Extraction from Web-Scale N-Gram Data by Gerard de Melo
Information Extraction from Web-Scale N-Gram DataInformation Extraction from Web-Scale N-Gram Data
Information Extraction from Web-Scale N-Gram Data
Gerard de Melo1.8K views
UWN: A Large Multilingual Lexical Knowledge Base by Gerard de Melo
UWN: A Large Multilingual Lexical Knowledge BaseUWN: A Large Multilingual Lexical Knowledge Base
UWN: A Large Multilingual Lexical Knowledge Base
Gerard de Melo1.1K views
Not Quite the Same: Identity Constraints for the Web of Linked Data by Gerard de Melo
Not Quite the Same: Identity Constraints for the Web of Linked DataNot Quite the Same: Identity Constraints for the Web of Linked Data
Not Quite the Same: Identity Constraints for the Web of Linked Data
Gerard de Melo989 views
Good, Great, Excellent: Global Inference of Semantic Intensities by Gerard de Melo
Good, Great, Excellent: Global Inference of Semantic IntensitiesGood, Great, Excellent: Global Inference of Semantic Intensities
Good, Great, Excellent: Global Inference of Semantic Intensities
Gerard de Melo2K views
YAGO-SUMO: Integrating YAGO into the Suggested Upper Merged Ontology by Gerard de Melo
YAGO-SUMO: Integrating YAGO into the Suggested Upper Merged OntologyYAGO-SUMO: Integrating YAGO into the Suggested Upper Merged Ontology
YAGO-SUMO: Integrating YAGO into the Suggested Upper Merged Ontology
Gerard de Melo2.2K views

Recently uploaded

Data Journeys Hard Talk workshop final.pptx by
Data Journeys Hard Talk workshop final.pptxData Journeys Hard Talk workshop final.pptx
Data Journeys Hard Talk workshop final.pptxinfo828217
10 views18 slides
[DSC Europe 23] Milos Grubjesic Empowering Business with Pepsico s Advanced M... by
[DSC Europe 23] Milos Grubjesic Empowering Business with Pepsico s Advanced M...[DSC Europe 23] Milos Grubjesic Empowering Business with Pepsico s Advanced M...
[DSC Europe 23] Milos Grubjesic Empowering Business with Pepsico s Advanced M...DataScienceConferenc1
6 views11 slides
SAP-TCodes.pdf by
SAP-TCodes.pdfSAP-TCodes.pdf
SAP-TCodes.pdfmustafaghulam8181
10 views285 slides
CRIJ4385_Death Penalty_F23.pptx by
CRIJ4385_Death Penalty_F23.pptxCRIJ4385_Death Penalty_F23.pptx
CRIJ4385_Death Penalty_F23.pptxyvettemm100
6 views24 slides
CRM stick or twist.pptx by
CRM stick or twist.pptxCRM stick or twist.pptx
CRM stick or twist.pptxinfo828217
11 views16 slides
[DSC Europe 23] Zsolt Feleki - Machine Translation should we trust it.pptx by
[DSC Europe 23] Zsolt Feleki - Machine Translation should we trust it.pptx[DSC Europe 23] Zsolt Feleki - Machine Translation should we trust it.pptx
[DSC Europe 23] Zsolt Feleki - Machine Translation should we trust it.pptxDataScienceConferenc1
5 views12 slides

Recently uploaded(20)

Data Journeys Hard Talk workshop final.pptx by info828217
Data Journeys Hard Talk workshop final.pptxData Journeys Hard Talk workshop final.pptx
Data Journeys Hard Talk workshop final.pptx
info82821710 views
[DSC Europe 23] Milos Grubjesic Empowering Business with Pepsico s Advanced M... by DataScienceConferenc1
[DSC Europe 23] Milos Grubjesic Empowering Business with Pepsico s Advanced M...[DSC Europe 23] Milos Grubjesic Empowering Business with Pepsico s Advanced M...
[DSC Europe 23] Milos Grubjesic Empowering Business with Pepsico s Advanced M...
CRIJ4385_Death Penalty_F23.pptx by yvettemm100
CRIJ4385_Death Penalty_F23.pptxCRIJ4385_Death Penalty_F23.pptx
CRIJ4385_Death Penalty_F23.pptx
yvettemm1006 views
CRM stick or twist.pptx by info828217
CRM stick or twist.pptxCRM stick or twist.pptx
CRM stick or twist.pptx
info82821711 views
[DSC Europe 23] Zsolt Feleki - Machine Translation should we trust it.pptx by DataScienceConferenc1
[DSC Europe 23] Zsolt Feleki - Machine Translation should we trust it.pptx[DSC Europe 23] Zsolt Feleki - Machine Translation should we trust it.pptx
[DSC Europe 23] Zsolt Feleki - Machine Translation should we trust it.pptx
Survey on Factuality in LLM's.pptx by NeethaSherra1
Survey on Factuality in LLM's.pptxSurvey on Factuality in LLM's.pptx
Survey on Factuality in LLM's.pptx
NeethaSherra17 views
OECD-Persol Holdings Workshop on Advancing Employee Well-being in Business an... by StatsCommunications
OECD-Persol Holdings Workshop on Advancing Employee Well-being in Business an...OECD-Persol Holdings Workshop on Advancing Employee Well-being in Business an...
OECD-Persol Holdings Workshop on Advancing Employee Well-being in Business an...
[DSC Europe 23] Stefan Mrsic_Goran Savic - Evolving Technology Excellence.pptx by DataScienceConferenc1
[DSC Europe 23] Stefan Mrsic_Goran Savic - Evolving Technology Excellence.pptx[DSC Europe 23] Stefan Mrsic_Goran Savic - Evolving Technology Excellence.pptx
[DSC Europe 23] Stefan Mrsic_Goran Savic - Evolving Technology Excellence.pptx
[DSC Europe 23][Cryptica] Martin_Summer_Digital_central_bank_money_Ideas_init... by DataScienceConferenc1
[DSC Europe 23][Cryptica] Martin_Summer_Digital_central_bank_money_Ideas_init...[DSC Europe 23][Cryptica] Martin_Summer_Digital_central_bank_money_Ideas_init...
[DSC Europe 23][Cryptica] Martin_Summer_Digital_central_bank_money_Ideas_init...
[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation by DataScienceConferenc1
[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation
[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation
Data about the sector workshop by info828217
Data about the sector workshopData about the sector workshop
Data about the sector workshop
info82821712 views
Short Story Assignment by Kelly Nguyen by kellynguyen01
Short Story Assignment by Kelly NguyenShort Story Assignment by Kelly Nguyen
Short Story Assignment by Kelly Nguyen
kellynguyen0119 views
Advanced_Recommendation_Systems_Presentation.pptx by neeharikasingh29
Advanced_Recommendation_Systems_Presentation.pptxAdvanced_Recommendation_Systems_Presentation.pptx
Advanced_Recommendation_Systems_Presentation.pptx
UNEP FI CRS Climate Risk Results.pptx by pekka28
UNEP FI CRS Climate Risk Results.pptxUNEP FI CRS Climate Risk Results.pptx
UNEP FI CRS Climate Risk Results.pptx
pekka2811 views
Organic Shopping in Google Analytics 4.pdf by GA4 Tutorials
Organic Shopping in Google Analytics 4.pdfOrganic Shopping in Google Analytics 4.pdf
Organic Shopping in Google Analytics 4.pdf
GA4 Tutorials16 views
SUPER STORE SQL PROJECT.pptx by khan888620
SUPER STORE SQL PROJECT.pptxSUPER STORE SQL PROJECT.pptx
SUPER STORE SQL PROJECT.pptx
khan88862013 views

Extracting Sense-Disambiguated Example Sentences From Parallel Corpora

  • 1. Introduction Example Extraction Example Selection Results Conclusion Extracting Sense-Disambiguated Example Sentences From Parallel Corpora Gerard de Melo and Gerhard Weikum Max Planck Institute for Informatics Saarbr¨ucken, Germany 2009-09-18 G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 2. Introduction Example Extraction Example Selection Results Conclusion Outline 1 Introduction 2 Example Extraction 3 Example Selection 4 Results 5 Conclusion G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 3. Introduction Example Extraction Example Selection Results Conclusion Outline 1 Introduction 2 Example Extraction 3 Example Selection 4 Results 5 Conclusion G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 4. Introduction Example Extraction Example Selection Results Conclusion Introduction a thin, sharp-pointed metal pin with a raised spiral thread running around it and a slotted head, used to join things together by being rotated in under pressure (Concise OED) Use a crosshead screwdriver to tighten the screws inserted in the pre-fixed connectors. (Web) G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 5. Introduction Example Extraction Example Selection Results Conclusion Introduction a thin, sharp-pointed metal pin with a raised spiral thread running around it and a slotted head, used to join things together by being rotated in under pressure (Concise OED) Use a crosshead screwdriver to tighten the screws inserted in the pre-fixed connectors. (Web) G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 6. Introduction Example Extraction Example Selection Results Conclusion Introduction Example Sentences any sentence that contains a word (being used in a specific sense) allow the user to grasp a word’s meaning see circumstances a word would typically be used in G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 7. Introduction Example Extraction Example Selection Results Conclusion Introduction Example Sentences any sentence that contains a word (being used in a specific sense) allow the user to grasp a word’s meaning see circumstances a word would typically be used in traditional intensional word definitions may be too confusing humans used to deriving meaning from context users can verify whether they have understood definition correctly G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 8. Introduction Example Extraction Example Selection Results Conclusion Introduction Example Sentences any sentence that contains a word (being used in a specific sense) allow the user to grasp a word’s meaning see circumstances a word would typically be used in possible contexts, e.g. child vs. youngster typical collocations, e.g. to give birth or birth rate but not *to give nascence or *nascence rate G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 9. Introduction Example Extraction Example Selection Results Conclusion Introduction Example Sentences =⇒ most modern dictionaries include example sentences number, length often limited digital dictionaries: tight space constraints of print media no longer apply! larger number of example sentences can be presented to the user on demand G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 10. Introduction Example Extraction Example Selection Results Conclusion Introduction Example Sentences =⇒ most modern dictionaries include example sentences number, length often limited digital dictionaries: tight space constraints of print media no longer apply! larger number of example sentences can be presented to the user on demand G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 11. Introduction Example Extraction Example Selection Results Conclusion Introduction Example Sentences =⇒ most modern dictionaries include example sentences number, length often limited digital dictionaries: tight space constraints of print media no longer apply! larger number of example sentences can be presented to the user on demand G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 12. Introduction Example Extraction Example Selection Results Conclusion Introduction Example Sentences =⇒ most modern dictionaries include example sentences number, length often limited digital dictionaries: tight space constraints of print media no longer apply! larger number of example sentences can be presented to the user on demand G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 13. Introduction Example Extraction Example Selection Results Conclusion Introduction Goals automatically obtain example sentences for a specific sense of a word choose set of representative sentences to present to user distinguish senses: There were many bats flying out of the cave. vs. In professional baseball, only wooden bats are permitted. G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 14. Introduction Example Extraction Example Selection Results Conclusion Introduction Goals automatically obtain example sentences for a specific sense of a word choose set of representative sentences to present to user not all examples are equally useful screen space may be limited (can still show more only on demand) G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 15. Introduction Example Extraction Example Selection Results Conclusion Outline 1 Introduction 2 Example Extraction 3 Example Selection 4 Results 5 Conclusion G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 16. Introduction Example Extraction Example Selection Results Conclusion Example Extraction Sense Index (Dictionary) English: Princeton WordNet 3.0 Spanish: Spanish WordNet σ(t): get senses for term t σ(t, s): is sense s a sense of term t? behind the scenes: morphological analysis, multi-word expression detection G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 17. Introduction Example Extraction Example Selection Results Conclusion Example Extraction Sense Index (Dictionary) English: Princeton WordNet 3.0 Spanish: Spanish WordNet σ(t): get senses for term t σ(t, s): is sense s a sense of term t? behind the scenes: morphological analysis, multi-word expression detection G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 18. Introduction Example Extraction Example Selection Results Conclusion Example Extraction Sense Index (Dictionary) English: Princeton WordNet 3.0 Spanish: Spanish WordNet σ(t): get senses for term t σ(t, s): is sense s a sense of term t? behind the scenes: morphological analysis, multi-word expression detection G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 19. Introduction Example Extraction Example Selection Results Conclusion Example Extraction Sense Disambiguation disambiguate word occurrences, then use sentence as example for those word senses fine-grained WSD not reliable enough (cf. SemEval results) idea: use parallel corpora, jointly look at both versions of text =⇒ greater accuracy can be achieved G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 20. Introduction Example Extraction Example Selection Results Conclusion Example Extraction Sense Disambiguation disambiguate word occurrences, then use sentence as example for those word senses fine-grained WSD not reliable enough (cf. SemEval results) idea: use parallel corpora, jointly look at both versions of text =⇒ greater accuracy can be achieved G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 21. Introduction Example Extraction Example Selection Results Conclusion Example Extraction Sense Disambiguation disambiguate word occurrences, then use sentence as example for those word senses fine-grained WSD not reliable enough (cf. SemEval results) idea: use parallel corpora, jointly look at both versions of text =⇒ greater accuracy can be achieved G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 22. Introduction Example Extraction Example Selection Results Conclusion Example Extraction Sense Disambiguation disambiguate word occurrences, then use sentence as example for those word senses fine-grained WSD not reliable enough (cf. SemEval results) idea: use parallel corpora, jointly look at both versions of text =⇒ greater accuracy can be achieved G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 23. Introduction Example Extraction Example Selection Results Conclusion Introduction Sense Disambiguation There were many bats flying out of the cave. Aus der H¨ohle flogen viele Flederm¨ause. G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 24. Introduction Example Extraction Example Selection Results Conclusion Introduction Sense Disambiguation There were many bats flying out of the cave. Aus der H¨ohle flogen viele Flederm¨ause. G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 25. Introduction Example Extraction Example Selection Results Conclusion Example Extraction Parallel Disambiguation req.: text available in two languages a, b good news: more and more parallel corpora available word alignment compute score wsd(sa|ta, tb) = wsd(sa|ta) σ(ta,sa)csim(tb,sa) s ∈σ(ta) σ(ta,s )csim(tb,s ) G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 26. Introduction Example Extraction Example Selection Results Conclusion Example Extraction Parallel Disambiguation req.: text available in two languages a, b good news: more and more parallel corpora available word alignment compute score wsd(sa|ta, tb) = wsd(sa|ta) σ(ta,sa)csim(tb,sa) s ∈σ(ta) σ(ta,s )csim(tb,s ) G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 27. Introduction Example Extraction Example Selection Results Conclusion Example Extraction Parallel Disambiguation req.: text available in two languages a, b good news: more and more parallel corpora available word alignment compute score wsd(sa|ta, tb) = wsd(sa|ta) σ(ta,sa)csim(tb,sa) s ∈σ(ta) σ(ta,s )csim(tb,s ) G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 28. Introduction Example Extraction Example Selection Results Conclusion Example Extraction Components csim(tb, sa): cross-lingual similarity monolingual WSD: compare bag-of-words vector for term context with sense context (constructed from definitions, etc.) Semantic Similarity sim(s1, s2) identify only near-identical senses (e.g. of house and home), not arbitrary associations (e.g. house and door) csim(tb, sa) = sb∈σ(tb) sim(sa, sb) wsd(sb|tb) G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 29. Introduction Example Extraction Example Selection Results Conclusion Example Extraction Components csim(tb, sa): cross-lingual similarity monolingual WSD: compare bag-of-words vector for term context with sense context (constructed from definitions, etc.) Semantic Similarity sim(s1, s2) identify only near-identical senses (e.g. of house and home), not arbitrary associations (e.g. house and door) wsd(s|t) = σ(t, s) α + v(s)T v(t) ||v(s)|| ||v(t)|| G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 30. Introduction Example Extraction Example Selection Results Conclusion Example Extraction Components csim(tb, sa): cross-lingual similarity monolingual WSD: compare bag-of-words vector for term context with sense context (constructed from definitions, etc.) Semantic Similarity sim(s1, s2) identify only near-identical senses (e.g. of house and home), not arbitrary associations (e.g. house and door) sim(s1, s2) =    1 s1 = s2 1 s1, s2 in near-synonymy relationship 1 s1, s2 in hypernymy/hyponymy relationship 0 otherwise G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 31. Introduction Example Extraction Example Selection Results Conclusion Example Extraction Extraction Use as example sentence iff score sufficiently high wsd(sa|ta, tb) = wsd(ta, sa) σ(ta,sa)csim(tb,sa) s ∈σ(ta) σ(ta,s )csim(tb,s ) G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 32. Introduction Example Extraction Example Selection Results Conclusion Outline 1 Introduction 2 Example Extraction 3 Example Selection 4 Results 5 Conclusion G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 33. Introduction Example Extraction Example Selection Results Conclusion Example Selection Task Motivation for computational applications: more data =⇒ better data for human users: a good limited selection should be provided at first assumption: space constraint k - number of sentences goal: choose a good set of k example sentences G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 34. Introduction Example Extraction Example Selection Results Conclusion Example Selection Task Motivation for computational applications: more data =⇒ better data for human users: a good limited selection should be provided at first assumption: space constraint k - number of sentences goal: choose a good set of k example sentences G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 35. Introduction Example Extraction Example Selection Results Conclusion Example Selection Task Motivation for computational applications: more data =⇒ better data for human users: a good limited selection should be provided at first assumption: space constraint k - number of sentences goal: choose a good set of k example sentences G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 36. Introduction Example Extraction Example Selection Results Conclusion Example Selection What is a good example sentence? one that showcases how a word is used in context (typical prepositions, collocations, etc.) one that helps in grasping the meaning related work by Rychly et al. (2008): one that is intelligible to learners G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 37. Introduction Example Extraction Example Selection Results Conclusion Example Selection What is a good example sentence? one that showcases how a word is used in context (typical prepositions, collocations, etc.) one that helps in grasping the meaning related work by Rychly et al. (2008): one that is intelligible to learners G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 38. Introduction Example Extraction Example Selection Results Conclusion Example Selection What is a good example sentence? one that showcases how a word is used in context (typical prepositions, collocations, etc.) one that helps in grasping the meaning related work by Rychly et al. (2008): one that is intelligible to learners G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 39. Introduction Example Extraction Example Selection Results Conclusion Example Selection Assets example sentences can be thought of as having certain assets 6 different classes of n-gram assets 1 class of assets for entire original sentence e.g. for account: containing frequent bigram bank account can be an asset containing frequent trigram to account for can be an asset G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 40. Introduction Example Extraction Example Selection Results Conclusion Example Selection Assets example sentences can be thought of as having certain assets 6 different classes of n-gram assets 1 class of assets for entire original sentence 1: original unigram 2-5: bigram with preceding/following, trigram, etc. G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 41. Introduction Example Extraction Example Selection Results Conclusion Example Selection Assets example sentences can be thought of as having certain assets 6 different classes of n-gram assets 1 class of assets for entire original sentence weights w(a) = f (x,a) n i=1 f (xi ,a) , etc. =⇒ higher weight for open an account than for Peter’s chequing account G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 42. Introduction Example Extraction Example Selection Results Conclusion Example Selection Assets example sentences can be thought of as having certain assets 6 different classes of n-gram assets 1 class of assets for entire original sentence weight: cosine similarity with sense definition =⇒ bias towards explanatory sentences G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 43. Introduction Example Extraction Example Selection Results Conclusion Example Selection What is a good set of example sentences? Select sentences with greatest total asset weights? Instead: maximize a∈ x∈C A(x) w(a) Problem is NP-hard (proof via Vertex Cover reduction) −→ use greedy heuristic No: Most frequent expressions would dominate the result set. Need diversity! G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 44. Introduction Example Extraction Example Selection Results Conclusion Example Selection What is a good set of example sentences? Select sentences with greatest total asset weights? Instead: maximize a∈ x∈C A(x) w(a) Problem is NP-hard (proof via Vertex Cover reduction) −→ use greedy heuristic each asset in result set only counted once G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 45. Introduction Example Extraction Example Selection Results Conclusion Example Selection What is a good set of example sentences? Select sentences with greatest total asset weights? Instead: maximize a∈ x∈C A(x) w(a) Problem is NP-hard (proof via Vertex Cover reduction) −→ use greedy heuristic G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 46. Introduction Example Extraction Example Selection Results Conclusion Example Selection What is a good set of example sentences? Select sentences with greatest total asset weights? Instead: maximize a∈ x∈C A(x) w(a) Problem is NP-hard (proof via Vertex Cover reduction) −→ use greedy heuristic G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 47. Introduction Example Extraction Example Selection Results Conclusion Example Selection Greedy Heuristic to select set C Greedily select highest-weighted sentence x: x ← argmax x∈XC a∈A(x) w(a) Then set the weights w(a) of all assets a ∈ A(x) to zero =⇒ ranked list obtained (useful!) G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 48. Introduction Example Extraction Example Selection Results Conclusion Example Selection Greedy Heuristic to select set C Greedily select highest-weighted sentence x: x ← argmax x∈XC a∈A(x) w(a) Then set the weights w(a) of all assets a ∈ A(x) to zero =⇒ ranked list obtained (useful!) G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 49. Introduction Example Extraction Example Selection Results Conclusion Example Selection Greedy Heuristic to select set C Greedily select highest-weighted sentence x: x ← argmax x∈XC a∈A(x) w(a) Then set the weights w(a) of all assets a ∈ A(x) to zero =⇒ ranked list obtained (useful!) G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 50. Introduction Example Extraction Example Selection Results Conclusion Outline 1 Introduction 2 Example Extraction 3 Example Selection 4 Results 5 Conclusion G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 51. Introduction Example Extraction Example Selection Results Conclusion Results Resources OPUS corpora (OpenSubtitles, OpenOffice.org collections) and Reuters RCV1 for non-disambiguated sentence selection GIZA++/UPlug for lexical alignment Princeton WordNet 3.0 and Spanish WordNet (+ sense mappings for compatibility) TreeTagger for morphological analysis G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 52. Introduction Example Extraction Example Selection Results Conclusion Results Resources OPUS corpora (OpenSubtitles, OpenOffice.org collections) and Reuters RCV1 for non-disambiguated sentence selection GIZA++/UPlug for lexical alignment Princeton WordNet 3.0 and Spanish WordNet (+ sense mappings for compatibility) TreeTagger for morphological analysis G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 53. Introduction Example Extraction Example Selection Results Conclusion Results Resources OPUS corpora (OpenSubtitles, OpenOffice.org collections) and Reuters RCV1 for non-disambiguated sentence selection GIZA++/UPlug for lexical alignment Princeton WordNet 3.0 and Spanish WordNet (+ sense mappings for compatibility) TreeTagger for morphological analysis G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 54. Introduction Example Extraction Example Selection Results Conclusion Results Resources OPUS corpora (OpenSubtitles, OpenOffice.org collections) and Reuters RCV1 for non-disambiguated sentence selection GIZA++/UPlug for lexical alignment Princeton WordNet 3.0 and Spanish WordNet (+ sense mappings for compatibility) TreeTagger for morphological analysis G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 55. Introduction Example Extraction Example Selection Results Conclusion Results Examples from OpenSubtitles Corpus line (something, as a cord or rope, that is long and thin and flexible) I got some fishing line if you want me to stitch that. Von Sefelt, get the stern line. line (the descendants of one individual) What line of kings do you descend from? My line has ended. catch (catch up with and possibly overtake) He’s got 100 laps to catch Beau Brandenburg if he wants to become world champion. They won’t catch up. catch (grasp with the I didn’t catch your name. mind or develop an understanding of) Sorry, I didn’t catch it. G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 56. Introduction Example Extraction Example Selection Results Conclusion Results Examples from OpenSubtitles Corpus talk (exchange thoughts, talk with) Why don’t we have a seat and talk it over. Okay, I’ll talk to you but one condition... talk (use language) But we’ll be listening from the kitchen so talk loud. You spit when you talk. opening (a ceremony accompanying the start of some enterprise) We don’t have much time until the opening day of Exhibition. What a disaster tomorrow is the opening ceremony! opening (the first performance, as of a theatrical production) It will be rehearsed in the morning ready for the opening tomorrow night. You ready for our big opening night? G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 57. Introduction Example Extraction Example Selection Results Conclusion Results Sense-disambiguated Example Sentences Corpus Covered Senses Example Sentences Accuracy (Wilson interval) OpenSubtitles en-es 13,559 117,078 0.815 ± 0.081 OpenSubtitles es-en 8,833 113,018 0.798 ± 0.090 OpenOffice.org en-es 1,341 13,295 0.803 ± 0.081 OpenOffice.org es-en 932 11,181 0.793 ± 0.087 G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 58. Introduction Example Extraction Example Selection Results Conclusion Results error analysis: (1) incorrect lexical alignments unlikely to lead to incorrect disambiguation (2) incomplete sense inventories can lead to mistakes (3) also, on a few occasions, the morphological analyser led to wrong results implications for future work: (1) better monolingual WSD (2) additional languages, especially phylogenetically unrelated G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 59. Introduction Example Extraction Example Selection Results Conclusion Results error analysis: (1) incorrect lexical alignments unlikely to lead to incorrect disambiguation (2) incomplete sense inventories can lead to mistakes (3) also, on a few occasions, the morphological analyser led to wrong results implications for future work: (1) better monolingual WSD (2) additional languages, especially phylogenetically unrelated G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 60. Introduction Example Extraction Example Selection Results Conclusion Results Rankings for OpenSubtitles Corpus being or located on or directed toward the 1. In America we drive on the right side of the road. side of the body to the east when facing north 2. I’ll tie down your right arm so you can learn to throw a left. 3. If we wait from the right side, we have an advantage there. put up with something 1. You can’t stand it, can you? or somebody unpleasant 2. You really think I can tolerate such an act? 3. No one can stand that harmonica all day long. G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 61. Introduction Example Extraction Example Selection Results Conclusion Results Rankings for OpenSubtitles Corpus using or providing or 1. Not the electric chair. producing or transmitting or 2. Some electrical current circulating through my body. operated by electricity 3. Near as I can tell it’s an electrical impulse. take something or somebody with 1. And they were kind enough to take me in here. oneself somewhere 2. It conveys such a great feeling. 3. We interrupt this program to bring you a special news bulletin. G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 62. Introduction Example Extraction Example Selection Results Conclusion Results Rankings for long in RCV1 Corpus 1. In the long term interest rate market, the yield of the key 182nd 10 year Japanese government bond (JGB) fell to 2.060 percent early on Tuesday, a record low for any benchmark 10-year JGB. 2. “The government and opposition have gambled away the last chance for a long time to prove they recognise the country’s problems, and that they put the national good above their own power interests”, news weekly Der Spiegel said. 3. As long as the index keeps hovering between 957 and 995, we will maintain our short term neutral recommendation. G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 63. Introduction Example Extraction Example Selection Results Conclusion Results Rankings for purchase in RCV1 Corpus 1. Romania’s State Ownership Fund (FPS), the country’s main privatisation body, said on Wednesday it had accepted five bids for the purchase of a 50.98 percent stake in the largest local cement maker Romcim. 2. Grand Hotel Group said on Wednesday it has agreed to procure an option to purchase the remaining 50 percent of the Grand Hyatt complex in Melbourne from hotel developer and investor Lustig & Moar. 3. The purchase price for the business, which had 1996 calendar year sales of about $25 million, was not disclosed. G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 64. Introduction Example Extraction Example Selection Results Conclusion Results Rankings for colonial in RCV1 Corpus 1. Hong Kong came to the end of 156 years of British colonial rule on June 30 and is now an autonomous capitalist region of China, running all its own affairs except defence and diplomacy. 2. The letter was sent in error to the embassy of Portugal – the former colonial power in East Timor – and was neither returned nor forwarded to the Indonesian embassy. 3. Sino-British relations hit a snag when former Governor Chris Patten launched electoral reforms in the twilight years of colonial rule despite fierce opposition by Beijing. G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 65. Introduction Example Extraction Example Selection Results Conclusion Outline 1 Introduction 2 Example Extraction 3 Example Selection 4 Results 5 Conclusion G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 66. Introduction Example Extraction Example Selection Results Conclusion Conclusion Framework for: extracting sense-disambiguated example sentences from parallel corpora selecting limited numbers of sentences given space constraints Future Work: better disambiguation, e.g. additional languages, better techniques additional input for selection: sentence length, definition extraction techniques integrated user interface for UWN (Universal Wordnet) Contact: demelo@mpi-inf.mpg.de G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 67. Introduction Example Extraction Example Selection Results Conclusion Conclusion Framework for: extracting sense-disambiguated example sentences from parallel corpora selecting limited numbers of sentences given space constraints Future Work: better disambiguation, e.g. additional languages, better techniques additional input for selection: sentence length, definition extraction techniques integrated user interface for UWN (Universal Wordnet) Contact: demelo@mpi-inf.mpg.de G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences
  • 68. Introduction Example Extraction Example Selection Results Conclusion Conclusion Framework for: extracting sense-disambiguated example sentences from parallel corpora selecting limited numbers of sentences given space constraints Future Work: better disambiguation, e.g. additional languages, better techniques additional input for selection: sentence length, definition extraction techniques integrated user interface for UWN (Universal Wordnet) Contact: demelo@mpi-inf.mpg.de G. de Melo / G. Weikum, Max-Planck-Institut Informatik Sense-Disambiguated Example Sentences