What do you know about an alligator when you know the company it keeps?
Katrin Erk
University of Texas at Austin
STARSEM 2017
Distributional semantics and you
• Distributional models/Embeddings: An incredible
success story in computational linguistics
• Do you make use of distributional information, too?
• Landauer & Dumais, 1997: “A solution to Plato’s problem”
• How do humans acquire such a gigantic vocabulary in such a
short time?
• Much debate in psychology; experimental support: McDonald & Ramscar, 2001; Lazaridou et al., 2016
• But how about the linguistic side of the story?
“A solution to Plato’s problem”
“Many well-read adults know that Buddha sat long
under a banyan tree (whatever that is) and Tahitian
natives lived idyllically on breadfruit and poi (whatever
those are). More or less correct usage often precedes
referential knowledge” (Landauer & Dumais, 1997)
But wait: How can you use the word “banyan” more or
less correctly when you are not aware of its reference?
When you couldn’t point out a banyan in a yard?
Learning about word meaning from textual context
• Main aim: insight
• What information is present in distributional
representations, and why?
• Assuming a learner with grounded concepts:
How can distributional information contribute?
Learning about meaning from textual context
Suppose you do not know what an alligator is. What do these sentences tell you about alligators?
• On our last evening, the boatman killed an alligator as
it crawled past our camp-fire to go hunting in the reeds
beyond.
• A study done by Edwin Colbert and his colleagues
showed that a tiny 50 gramme (1.76 oz) alligator heated up 1°C every minute and a half from the Sun[…]
• The throne was occupied by a pipe-smoking alligator.
Learning about word meaning from textual context
• Setting: adult learner
• What kind of information can you get from text?
• How does it enable you to use “alligator” more or less
correctly?
• Why can you learn anything from text?
• Textual clues are rarely 100% reliable
• “An alligator was lying at the bottom of a pool”
• Could be an animal, a pool-cleaning implement…
The story in a nutshell
• How can I successfully use the word “alligator”
when I don’t know what it refers to?
• I know some properties of alligators: they are
animals, dangerous, …
• So then I use “alligator” in animal-like textual
contexts
The story in a nutshell
• How does distributional information help?
• It lets me infer properties of words:
• Suppose I don’t know what an alligator is
• But it appears in contexts similar to those of “crocodile”
• So it must be something like a crocodile:
• That is, it must share properties with a crocodile
• So it may be an animal, it may be dangerous…
The story in a nutshell
• But distributional information can never yield
certain knowledge
• Instead uncertain, probabilistic information
• Formal semantics framework
• Probabilistic semantics:
• Probability distribution over worlds that could be the
current one
• Probability of a world influenced by distributional
information
Plan
• What can an agent learn from distributional context?
• A probabilistic information state
• Influencing a probabilistic information state with
distributional information
• A toy experiment
What is in an embedding?
• What information can be encoded in an embedding
computed from text data?
• Lots of things, given the right objective function
• But:
• What objective function can we assume a human agent
to use?
• What individual linguistic phenomena have been
shown to be encoded?
• So, restrict ourselves to simple model
What is in an embedding?
• Count-based models of textual context
• (and neural models like word2vec, see Levy & Goldberg 2015)
• Long-standing criticism in psychology, e.g. Murphy (2002): only a vague notion of “similarity”
• But in fact distributional models can distinguish between
semantic relations
• by choice of what “context” means
• through relation-specific classifiers (Fu et al., 2014; Levy et al., 2015; Shwartz et al., 2016; Roller & Erk, 2016, …)
The effect of context window size
• Peirsman 2008 (Dutch):
• Narrow context window: high ratings to “similar” words
• Particularly to co-hyponyms
• Syntactic context even more so
• Wide context window: high ratings to “related” words
• Baroni/Lenci 2011 (English):
• Narrow context window: highest ratings to co-hyponyms
• Wide context window: ratings equal across many relations
What is narrow-window similarity?
• High ratings for co-hyponyms, also synonyms, some
hypernyms, antonyms (well-known bug)
• What semantic relation is that?
• Co-hyponymy is an odd relation
• dictionary-specific
• can be incompatible (cat/dog) or compatible
(hotel/restaurant)
• Proposal: property overlap
• Alligator, crocodile have many properties in common:
animal, reptile, scaly, dangerous, …
Why does narrow-window similarity do this?
• Focus on noun targets
• Narrow window, syntactic context contain:
• Modifiers
• Verbs that take target as argument
• Selectional constraints
• Traditionally formulated in terms of taxonomic
properties
• subject of “crawl”: animate
But wait, where do the probabilities come from?
• Frequency in text is not frequency in real life
• Reporting bias: Almost no one says “Bananas are
yellow” (Bruni et al, 2012)
• Genre bias: “captive” and “westerner” are each other’s nearest neighbors in Lin 1998
• Then how can counts in text lead us to probabilities
relevant to grounded concepts?
But wait, where do the probabilities come from?
• Two tricks in this study
1. Only consider properties that apply to all members of
a category (like “being an animal”)
2. Use distributional context only indirectly: Learn the correlation between distributional context and real-world properties
• More recent work: trick 2 without trick 1
• I think we can use distributional context directly
and properly to get probabilities – more later
Learning properties from distributional data
• Concrete noun concepts
• To learn: properties of a concept
• Focus on properties applying to all members of a
category (like taxonomic properties)
• Broad definition of a property: can be expressed as an
adjective, can be a hypernym, …
Property overlap
• Percentage of properties that are shared
• Jaccard coefficient on sets
• A, B sets of properties: Jac(A, B) = |A ∩ B| / |A ∪ B|
• Degrees of property overlap
• Idea: The more properties in common, the higher the distributional similarity
• Example: 2 shared properties out of 6 total gives Jac = 2 / 6 = 0.33 (see the sketch below)
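A minimal sketch of this overlap measure in Python (the property sets here are illustrative, not taken from the McRae feature norms):

```python
def jaccard(a: set, b: set) -> float:
    """Jac(A, B) = |A intersect B| / |A union B| on property sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Illustrative property sets, not from the feature norms:
crocodile = {"animal", "reptile", "scaly", "dangerous", "aquatic"}
alligator = {"animal", "reptile", "scaly", "dangerous", "has-teeth"}

print(jaccard(crocodile, alligator))  # 4 shared of 6 total -> 0.67
```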
Plan
• What can an agent learn from distributional context?
• A probabilistic information state
• Influencing a probabilistic information state with
distributional information
• A toy experiment
Information states
• Information state of Agent: set of worlds that the agent
considers possibilities
• Agent not omniscient
• As far as Agent is concerned, any of these worlds could be
the actual world
• Update semantics: Information state updated through
communication (Veltman 1996)
• Probabilistic information state: probability distribution
over worlds (van Benthem et al. 2009, Zeevat 2013)
• Not all worlds equally likely to be the actual world
Probabilistic logics
• Uncertainty about the world we are in
• Probability distribution over worlds
• Nilsson 1986
• Probability that a sentence is true depends on the
probabilities of the worlds in which it is true
P(φ) = Σ_{w : ⟦φ⟧_w = t} P(w)
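With finitely many worlds, this is a one-line sum; a toy illustration (the worlds and their probabilities are made up):

```python
# Each world carries a probability and a truth value for the sentence.
worlds = [
    {"p": 0.5, "alligator_is_animal": True},
    {"p": 0.3, "alligator_is_animal": True},
    {"p": 0.2, "alligator_is_animal": False},
]

def prob(holds, worlds):
    """P(phi) = sum of P(w) over the worlds w in which phi is true."""
    return sum(w["p"] for w in worlds if holds(w))

print(prob(lambda w: w["alligator_is_animal"], worlds))  # 0.8
```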
Generating a probability distribution over worlds
• Text understanding as a generative process
• Agent mentally simulates (i.e., probabilistically
generates) the situation described in the text
• Goodman et al, 2015; Goodman and Lassiter, 2016
• To generate a person:
• draw gender: flip a fair coin
• draw height from the normal distribution of heights for
that gender.
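The same generative story as a runnable sketch (the height means and standard deviations are assumptions for illustration, not values from the cited work):

```python
import random

HEIGHT_CM = {"female": (162.0, 7.0), "male": (176.0, 7.5)}  # (mean, sd); assumed

def generate_person():
    gender = random.choice(["female", "male"])   # flip a fair coin
    mean, sd = HEIGHT_CM[gender]
    height = random.gauss(mean, sd)              # gender-specific normal
    return {"gender": gender, "height_cm": height}

print(generate_person())
```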
Properties in a probabilistic information state
• Property applies in a particular world: extension of
predicate included in extension of property in that
world
• Focus here: properties that the agent is certain about, i.e., that apply in all worlds with non-zero probability
Plan
• What can an agent learn from distributional context?
• A probabilistic information state
• Influencing a probabilistic information state with
distributional information
• A toy experiment
Bayesian update on the probability distribution over worlds
• Prior distribution over worlds P0
• Then we see distributional evidence Edist
• e.g.: Distributional similarity of “crocodile” and
“alligator” is 0.93
• Posterior distribution P1 given Edist
• How do we determine the likelihood?
P1(w) = P(w | Edist) = P(Edist | w) · P0(w) / P(Edist)
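With finitely many worlds, the update can be written out by enumeration; a minimal sketch with made-up priors and likelihoods:

```python
prior = {"w1": 0.5, "w2": 0.3, "w3": 0.2}       # P0(w), assumed
likelihood = {"w1": 0.9, "w2": 0.4, "w3": 0.1}  # P(Edist | w), assumed

unnormalized = {w: likelihood[w] * prior[w] for w in prior}
evidence = sum(unnormalized.values())            # P(Edist)
posterior = {w: v / evidence for w, v in unnormalized.items()}  # P1(w)

print(posterior)  # mass shifts toward worlds that make Edist likely
```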
Interpreting distributional data
• Speaker observes words with known properties, and their distributional similarity
Property overlap from the McRae feature norms (McRae et al., 2005). Similarities from a narrow-context model computed on UKWaC + Wikipedia + BNC
word 1      word 2      prop. overlap   dist. sim
peacock     raven       0.29            0.70
mixer       toaster     0.19            0.72
crocodile   frog        0.17            0.86
bagpipe     banjo       0.10            0.72
scissors    typewriter  0.04            0.62
crocodile   lime        0.03            0.33
coconut     porcupine   0.03            0.42
Observing regularities: high property overlap
goes with high distributional similarity
[Scatterplot: property overlap versus distributional similarity (artificial data); x-axis: property overlap, y-axis: dist. sim.]
In the simplest case: linear regression.
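The regression itself can be fit directly on the (overlap, similarity) pairs from the table above; a sketch using the standard library (statistics.linear_regression requires Python 3.10+):

```python
from statistics import linear_regression

overlap = [0.29, 0.19, 0.17, 0.10, 0.04, 0.03, 0.03]
sim     = [0.70, 0.72, 0.86, 0.72, 0.62, 0.33, 0.42]

slope, intercept = linear_regression(overlap, sim)

def f(o):
    """Predicted similarity f(o) = beta_0 + beta_1 * o."""
    return intercept + slope * o

print(f(0.1))  # predicted similarity at a property overlap of 0.1
```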
Given the regularities I observed, and the distributional evidence, what do I now think of world w?
• World w: property overlap of crocodile and alligator is o = 0.1
• Predicted similarity: β₀ + β₁ · o = 0.53
• Distributional evidence: sim(crocodile, alligator) = 0.93
• How likely are we to observe a distributional similarity of 0.93 if the predicted similarity is 0.53?
• Standard move in hypothesis testing: How likely are we to see an observed value this high or higher, given the predicted distribution?
Likelihood of the distributional evidence in this world
• What distribution?
• Equivalent view of linear regression: Observed similarity = predicted similarity + normally distributed error
• Normal distribution with mean f(o) = β₀ + β₁ · o
[Plot: probability density over the dist. rating, a normal curve centered at f(o)]
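The resulting likelihood is an upper-tail probability of that normal; a sketch using the survival function via math.erfc (the residual standard deviation sigma is a free parameter here, not a value from the talk):

```python
import math

def upper_tail(s, mu, sigma):
    """P(S >= s) for S ~ Normal(mu, sigma)."""
    return 0.5 * math.erfc((s - mu) / (sigma * math.sqrt(2)))

# Observed similarity 0.93 against a predicted f(o) = 0.53:
print(upper_tail(0.93, 0.53, 0.15))  # small: 0.93 sits far in the upper tail
```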
Likelihood of the distributional evidence in this world
• Distributional similarity s = sim(crocodile, alligator)
• Hypothesis testing: How likely are we to see a similarity value as high as s or higher, given property overlap o?
[Plot: normal density centered at f(o); the observed similarity s marked in the upper tail]
Computing posterior probabilities in a probabilistic generative framework
• Probabilistically generate worlds:
• “To generate a person, flip a fair coin to determine their
gender…”
• Approximately determine probability distribution
over worlds: Sample n probabilistically generated
worlds
• Sample from posterior:
• Rejection sampling
• Formulate likelihood as a sampling condition
Computing posterior probabilities in a probabilistic generative framework
• Property overlap o between crocodiles and alligators in world w
• Distributional similarity s = sim(crocodile, alligator)
• Keep w if a similarity as high as s or higher is likely given o
• Sample s′ from the normal distribution with mean f(o)
• Keep world w if s′ ≥ s (see the sketch below)
[Plot: normal density centered at f(o); the world is kept when the sampled s′ reaches the observed s]
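Putting the two slides together, the rejection-sampling condition is a few lines; sample_world and f below stand in for the generative prior and the fitted regression, and sigma is again an assumed parameter:

```python
import random

def keep_world(world, s_observed, f, sigma=0.15):
    o = world["overlap"]                  # property overlap in this world
    s_prime = random.gauss(f(o), sigma)   # simulated similarity at overlap o
    return s_prime >= s_observed          # the sampling condition

def posterior_samples(sample_world, f, s_observed, n=10_000):
    """Approximate the posterior by keeping only the worlds that pass."""
    return [w for w in (sample_world() for _ in range(n))
            if keep_world(w, s_observed, f)]
```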
Plan
• What can an agent learn from distributional context?
• A probabilistic information state
• Influencing a probabilistic information state with
distributional information
• A toy experiment
Toy experiments
• Property collection: McRae et al., 2005
• Human-generated definitional features for concrete noun
properties
• Distributional model: narrow context, UKWaC + Wikipedia +
BNC
• Hold out alligator as unknown word
• Given distributional evidence, how likely are we to believe…
1. All alligators are dangerous
2. All alligators are edible
3. All alligators are animals
Toy experiments
• All alligators are dangerous:
• Known word: crocodile. sim(alligator, crocodile) = 0.93
• Crocodiles are animals, dangerous, scaly, and crocodiles
• All alligators are edible:
• Known word: trout. sim(alligator, trout) = 0.68
• Trouts are animals, aquatic, edible, and trouts
• Probability should be lower because similarity is lower
• All alligators are animals:
• Known words: crocodile, trout.
• Can evidence accumulate with multiple similarity ratings?
Generative story for the prior probability
• Fix domain size to 10
• For each entity in the domain:
• Flip a fair coin to determine if it is a crocodile. Likewise for
alligator.
• For each entity in the domain:
• If it is a crocodile, it is also an animal, dangerous, and scaly.
• Otherwise, flip a fair coin to see if it is an animal (dangerous,
scaly).
Implemented in Church.
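The original model was written in Church; re-sketched in Python, the prior for “all alligators are dangerous” can be estimated by Monte Carlo and comes out near the 0.26 reported on the next slide:

```python
import random

PROPERTIES = ["animal", "dangerous", "scaly"]

def sample_world(domain_size=10):
    world = []
    for _ in range(domain_size):
        entity = {"crocodile": random.random() < 0.5,   # fair coin
                  "alligator": random.random() < 0.5}   # fair coin
        for p in PROPERTIES:
            # Crocodiles have all three properties; otherwise a fair coin.
            entity[p] = True if entity["crocodile"] else random.random() < 0.5
        world.append(entity)
    return world

def all_alligators_dangerous(world):
    alligators = [e for e in world if e["alligator"]]
    return all(e["dangerous"] for e in alligators)  # vacuously true if none

n = 20_000
prior = sum(all_alligators_dangerous(sample_world()) for _ in range(n)) / n
print(prior)  # analytically (0.5 + 0.5 * 0.75)**10, roughly 0.26
```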
Results: All alligators are…
Sentence        words                  sim    prior   posterior
…dangerous      alligator, crocodile   0.93   0.26    0.47
…edible         alligator, trout       0.68   0.26    0.38
• Aim: Significant increase in probability
• Absolute probabilities depend on domain size,
problem formulation
• Higher similarities lead to significantly more confident inferences
• “Crocodile” much more similar to “alligator” than “trout”:
Agent more confidently ascribes crocodile properties to alligators
Probability of property overlap: prior versus posterior
[Histograms: property overlap of ‘alligator’ and ‘crocodile’ (left) and of ‘alligator’ and ‘trout’ (right); x-axis: prop. overlap (0 to 1), y-axis: num. worlds (0 to 800); each panel compares the prior (no dist. evidence) with the posterior (with dist. evidence)]
Accumulating evidence: “All alligators are animals”
sim of alligator to…            prior   posterior
crocodile: 0.93                 0.53    0.68
trout: 0.68                     0.53    0.63
crocodile: 0.93, trout: 0.68    0.53    0.80
• Does distributional evidence accumulate?
• Both crocodiles and trouts are known to be animals
• Posterior significantly higher
when two pieces of evidence present
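One natural way to accumulate evidence in this framework is to conjoin the sampling conditions; a sketch, with the conjunction as an assumption about the formulation rather than a detail stated on the slide:

```python
import random

def passes(overlap, s_observed, f, sigma=0.15):
    """One rejection test: the simulated similarity must reach s_observed."""
    return random.gauss(f(overlap), sigma) >= s_observed

def keep_with_all_evidence(world, observations, f, sigma=0.15):
    # world maps each known word to its property overlap with "alligator"
    # in this world, e.g. {"crocodile": 0.4, "trout": 0.2}; observations is
    # a list like [("crocodile", 0.93), ("trout", 0.68)].
    return all(passes(world[w], s, f, sigma) for w, s in observations)
```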
Summary
• How can people use a word whose reference they don’t
know?
• Suppose we don’t know what an alligator is; can we still infer from context clues that it’s an animal?
• Proposal:
• (Narrow-window) distributional evidence is property overlap
evidence
• Distributional evidence affects probabilistic information state
• Can be described in probabilistic generative framework
Next questions
• Learning from a single sentence only
• On our last evening, the boatman killed an alligator as it
crawled past our camp-fire to go hunting in the reeds beyond.
• Distributional one-shot learning
• Doable: same setup, learn McRae et al. definitional features
using selectional constraints of neighboring predicates
• Properties that do not apply to all members of a category
• Some but not all crocodiles are dangerous
• Learn probability of generating a property for “alligator”
Next questions
• Here: Learn from context only indirectly,
from correlation with grounded properties
• Can we learn from what is said in the text?
• On our last evening, the boatman killed an alligator as it
crawled past our camp-fire to go hunting in the reeds beyond.
• Alligators are entities that generally crawl, hunt, and are
found in reeds
• P(q is a generic property of alligators that would be
mentioned by people)
• Relevant to “human experience of alligators”
(Thill/Padó/Ziemke 2014)
Thanks
Gemma Boleda, Louise McNally, Judith Tonhauser
(best editor on earth!), Nicholas Asher, Marco Baroni,
David Beaver, John Beavers, Ann Copestake, Ido
Dagan, Aurélie Herbelot, Hans Kamp, Alexander
Koller, Alessandro Lenci, Sebastian Löbner, Julian
Michael, Ray Mooney, Sebastian Padó, Manfred
Pinkal, Stephen Roller, Hinrich Schütze, Jan van Eijck,
Leah Velleman, Steve Wechsler, Roberto Zamparelli,
and the Foundations of Semantic Spaces reading group