Compositional Distributional Models of Meaning
Dimitri Kartsaklis
Department of Computer Science
University of Oxford
Hilary Term, 2015
In a nutshell
Compositional distributional models of meaning (CDMs) aim
to unify two orthogonal semantic paradigms:
The type-logical compositional approach of formal semantics
The quantitative perspective of vector space models of
meaning
The goal is to represent sentences as points in some
high-dimensional metric space
Useful in every NLP task: sentence similarity, paraphrase
detection, sentiment analysis, machine translation etc.
We review three generic classes of CDMs: vector mixtures,
tensor-based models and deep-learning models
Outline
1 Introduction
2 Vector mixture models
3 Tensor-based models
4 Deep learning models
5 Afterword
Compositional semantics
The principle of compositionality
The meaning of a complex expression is determined by the
meanings of its parts and the rules used for combining them.
A lexicon:
(1) a. every Dt : λP.λQ.∀x[P(x) → Q(x)]
b. man N : λy.man(y)
c. walks VI : λz.walk(z)
A parse tree, so syntax guides the semantic composition:
[Parse tree: S → NP (Dt "Every", N "man") · VIN ("walks")]
NP → Dt N : [[N]]([[Dt]])
S → NP VIN : [[VIN]]([[NP]])
Syntax-to-semantics correspondence
Logical forms of compounds are computed via β-reduction:
[Parse tree with logical forms:
  S: ∀x[man(x) → walk(x)]
    NP: λQ.∀x[man(x) → Q(x)]
      Dt "Every": λP.λQ.∀x[P(x) → Q(x)]
      N "man": λy.man(y)
    VIN "walks": λz.walk(z)]
The semantic value of a sentence can be true or false.
An approach known as Montague Grammar (1970)
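To make the composition concrete, here is a minimal Python sketch (not part of the original slides) that mimics the λ-terms above with ordinary functions over a toy domain; the domain and the extensions of man and walk are invented for illustration.

```python
# Minimal sketch: Montague-style composition with Python functions
# standing in for the lambda terms above (toy domain and extensions).
domain = {"john", "bill", "mary"}
man  = lambda y: y in {"john", "bill"}    # λy.man(y)
walk = lambda z: z in {"john"}            # λz.walk(z)
every = lambda P: (lambda Q: all(Q(x) for x in domain if P(x)))  # λP.λQ.∀x[P(x) → Q(x)]

np_meaning = every(man)        # NP: λQ.∀x[man(x) → Q(x)]
s_meaning  = np_meaning(walk)  # S:  ∀x[man(x) → walk(x)]
print(s_meaning)               # False: 'bill' is a man who does not walk
```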
Distributional models of meaning
Distributional hypothesis
The meaning of a word is determined by its context [Harris, 1958].
Words as vectors
A word is a vector of co-occurrence statistics with every other
word in the vocabulary:
[Figure: an example co-occurrence vector for "cat" over the context words milk, cute, dog, bank and money (counts 12, 8, 5, 0, 1), and a toy vector space in which dog and pet lie close to cat while account and money lie further away]
Semantic relatedness is usually based on cosine similarity:
sim(\vec{v}, \vec{u}) = \cos(\vec{v}, \vec{u}) = \frac{\vec{v} \cdot \vec{u}}{\|\vec{v}\|\,\|\vec{u}\|}    (1)
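As an illustration, a small NumPy sketch of such co-occurrence vectors and the cosine similarity of Equation (1); the counts are invented for the example.

```python
import numpy as np

# Toy co-occurrence vectors over the context words (milk, cute, dog, bank, money);
# the counts are invented for illustration, not taken from a real corpus.
cat     = np.array([12.0, 8.0, 5.0,  0.0, 1.0])
dog     = np.array([10.0, 7.0, 4.0,  0.0, 2.0])
account = np.array([ 0.0, 0.0, 1.0, 14.0, 9.0])

def cos_sim(v, u):
    """Cosine similarity, Equation (1)."""
    return (v @ u) / (np.linalg.norm(v) * np.linalg.norm(u))

print(cos_sim(cat, dog))      # high: similar contexts
print(cos_sim(cat, account))  # low: different contexts
```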
A real vector space
[Figure: a two-dimensional projection of a real vector space; words form clusters for animals (cat, dog, pet, kitten, mouse, horse, lion, elephant, tiger, parrot, eagle, pigeon, raven, seagull), finance (money, stock, bank, finance, currency, profit, credit, market, business, broker, account), computing (laptop, data, processor, technology, ram, megabyte, keyboard, monitor, intel, motherboard, microchip, cpu) and sports (game, football, score, championship, player, league, team)]
The necessity for a unified model
Montague grammar is compositional but not quantitative
Distributional models of meaning are quantitative but not
compositional
Note that the distributional hypothesis does not apply to
phrases or sentences: there is not enough data. Even with an
infinitely large corpus, what would the context of a sentence be?
The role of compositionality
Compositional distributional models
We can synthetically produce a sentence vector by composing the
vectors of the words in that sentence.
\vec{s} = f(\vec{w}_1, \vec{w}_2, \ldots, \vec{w}_n)    (2)
Three generic classes of CDMs:
Vector mixture models [Mitchell and Lapata, 2008]
Tensor-based models
[Coecke et al., 2010, Baroni and Zamparelli, 2010]
Deep learning models
[Socher et al., 2012, Kalchbrenner et al., 2014].
Outline
1 Introduction
2 Vector mixture models
3 Tensor-based models
4 Deep learning models
5 Afterword
Element-wise vector composition
We can combine two vectors by working element-wise
[Mitchell and Lapata, 2008]:
\vec{w_1 w_2} = \alpha\vec{w}_1 + \beta\vec{w}_2 = \sum_i (\alpha c_i^{w_1} + \beta c_i^{w_2})\,\vec{n}_i    (3)

\vec{w_1 w_2} = \vec{w}_1 \odot \vec{w}_2 = \sum_i c_i^{w_1} c_i^{w_2}\,\vec{n}_i    (4)
An element-wise “mixture” of the input elements:
[Figure: vector mixture (two vectors merged element-wise into a vector of the same kind) vs. tensor-based composition (a tensor applied to a vector argument)]
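A minimal NumPy sketch of the additive and multiplicative compositions of Equations (3) and (4), with arbitrary example vectors:

```python
import numpy as np

w1 = np.array([0.2, 0.5, 0.1])
w2 = np.array([0.4, 0.3, 0.6])
alpha, beta = 1.0, 1.0                    # mixing weights for the additive model

additive       = alpha * w1 + beta * w2   # Equation (3)
multiplicative = w1 * w2                  # Equation (4): element-wise product
```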
Vector mixtures: Summary
Distinguishing feature:
All words contribute equally to the final result.
PROS:
Trivial to implement
As computationally light-weight as it gets
Surprisingly effective in practice
CONS:
A bag-of-words approach
Does not distinguish between the type-logical identities of the
words
By definition, sentence space = word space
Outline
1 Introduction
2 Vector mixture models
3 Tensor-based models
4 Deep learning models
5 Afterword
Relational words as functions
In a vector mixture model, an adjective is of the same order as
the noun it modifies, and both contribute equally to the result.
One step further: Relational words are multi-linear maps
(tensors of various orders) that can be applied to one or more
arguments (vectors).
[Figure: vector mixture (two vectors merged element-wise into a vector of the same kind) vs. tensor-based composition (a tensor applied to a vector argument)]
Formalized in the context of compact closed categories by
Coecke, Sadrzadeh and Clark.
A categorical framework for composition
Categorical compositional distributional semantics
Coecke, Sadrzadeh and Clark (2010): Syntax and vector space
semantics can be structurally homomorphic.
Take CF, the free compact closed category over a pregroup
grammar, as the structure accommodating the syntax
Take FVect, the category of finite-dimensional vector spaces
and linear maps, as the semantic counterpart of CF.
CF and FVect share a compact closed structure, so a strongly
monoidal functor Q can be defined such that:
Q : CF → FVect (5)
The meaning of a sentence w1w2 . . . wn with type reduction α
is given as:
Q(\alpha)(\vec{w}_1 \otimes \vec{w}_2 \otimes \ldots \otimes \vec{w}_n)    (6)
Categorical composition: Example
[Parse tree: S → NP (Adj "happy", N "kids") · VP (V "play", N "games")]

Pregroup types: happy: n · n^l,  kids: n,  play: n^r · s · n^l,  games: n

Grammar rules follow the inequalities:

p^l · p ≤ 1 ≤ p · p^l   and   p · p^r ≤ 1 ≤ p^r · p    (7)

Derivation becomes:

(n · n^l) · n · (n^r · s · n^l) · n = n · (n^l · n) · (n^r · s · n^l) · n ≤
n · 1 · (n^r · s · n^l) · n = n · (n^r · s · n^l) · n =
(n · n^r) · s · (n^l · n) ≤ 1 · s · 1 ≤ s
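A small Python sketch (not part of the framework itself) that mechanically applies the cancellations p^l · p → 1 and p · p^r → 1 to the type assignment above; this greedy adjacent cancellation suffices for this example, though it is not a general pregroup parser.

```python
def reduce_types(types):
    """Repeatedly cancel adjacent pairs (x^l, x) and (x, x^r)."""
    types = list(types)
    changed = True
    while changed:
        changed = False
        for i in range(len(types) - 1):
            a, b = types[i], types[i + 1]
            if a == b + "_l" or b == a + "_r":   # x^l · x ≤ 1  or  x · x^r ≤ 1
                del types[i:i + 2]
                changed = True
                break
    return types

# happy: n·n^l   kids: n   play: n^r·s·n^l   games: n
print(reduce_types(["n", "n_l", "n", "n_r", "s", "n_l", "n"]))   # ['s']
```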
Categorical composition: From syntax to semantics
Each atomic grammar type is mapped to a vector space:
Q(n) = N Q(s) = S
Due to strong monoidality, complex types are mapped to
tensor product of vector spaces:
Q(n · n^l) = Q(n) ⊗ Q(n^l) = N ⊗ N
Q(n^r · s · n^l) = Q(n^r) ⊗ Q(s) ⊗ Q(n^l) = N ⊗ S ⊗ N
Finally, each morphism in CF is mapped to a linear map in
FVect.
A multi-linear model
The grammatical type of a word defines the vector space in
which the word lives.
A noun lives in a basic vector space N.
Adjectives are linear maps N → N, i.e. elements in N ⊗ N.
Intransitive verbs are maps N → S, i.e. elements in N ⊗ S.
Transitive verbs are bi-linear maps N ⊗ N → S, i.e. elements
of N ⊗ S ⊗ N.
And so on.
The composition operation is tensor contraction, based on
inner product.
[Diagram: happy kids play games, typed N N^l · N · N^r S N^l · N and contracted to a vector in S]
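A NumPy sketch of these contractions for "happy kids play games", with random tensors standing in for trained ones and toy dimensionalities for N and S:

```python
import numpy as np

dN, dS = 4, 3                        # toy dimensions for the noun and sentence spaces
kids  = np.random.rand(dN)           # noun vector in N
games = np.random.rand(dN)           # noun vector in N
happy = np.random.rand(dN, dN)       # adjective: element of N ⊗ N
play  = np.random.rand(dN, dS, dN)   # transitive verb: element of N ⊗ S ⊗ N

subject  = happy @ kids                                    # adjective applied to its noun
sentence = np.einsum("i,isj,j->s", subject, play, games)   # contract verb with subject and object
print(sentence.shape)                                      # (3,): a vector in the sentence space S
```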
Frobenius algebras in language
Kartsaklis, Sadrzadeh, Pulman and Coecke (2013): Use the
co-multiplication of a Frobenius algebra to model verb tensors.
[Diagram: the verb of a transitive sentence s v o copied via the Frobenius co-multiplication]

A combination of element-wise and categorical composition:

\vec{s\,v\,o} = (\vec{s} \times v) \odot \vec{o}

Useful e.g. in intonation:

Who does John like?   John likes Mary

(\vec{john} \times likes) \odot \vec{mary}
Sadrzadeh, Clark and Coecke (2013): Relative pronouns are
modelled in terms of Frobenius copying and deleting.
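A NumPy sketch of the combined composition above for "John likes Mary", assuming the verb is given as a matrix (random here, standing in for a trained one):

```python
import numpy as np

d = 4
john  = np.random.rand(d)
mary  = np.random.rand(d)
likes = np.random.rand(d, d)          # toy verb matrix

# s v o = (s × v) ⊙ o : matrix application followed by element-wise merging
sentence = (john @ likes) * mary
```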
Composition and lexical ambiguity
Kartsaklis and Sadrzadeh (2013): Explicitly handling lexical
ambiguity improves the performance of tensor-based models
How can we model ambiguity in categorical composition?
Words as mixed states
Piedeleu, Kartsaklis, Coecke and Sadrzadeh (2015): Ambiguous
words are represented as mixed states in CPM(FHilb).
\rho(w_t) = \sum_i p_i\, |w_t^i\rangle\langle w_t^i|    (8)
Von Neumann entropy shows how ambiguity evolves from
words to compounds
Disambiguation = purification: Entropy of ‘vessel’ is 0.25,
but entropy of ‘vessel that sails’ is 0.01.
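A NumPy sketch of Equation (8) and the von Neumann entropy, with two invented sense vectors standing in for the senses of an ambiguous word:

```python
import numpy as np

def density(senses, probs):
    """rho(w) = sum_i p_i |w_i><w_i|, Equation (8), with normalised sense vectors."""
    dim = len(senses[0])
    rho = np.zeros((dim, dim))
    for v, p in zip(senses, probs):
        v = v / np.linalg.norm(v)
        rho += p * np.outer(v, v)
    return rho

def von_neumann_entropy(rho):
    eigs = np.linalg.eigvalsh(rho)
    eigs = eigs[eigs > 1e-12]
    return float(-np.sum(eigs * np.log2(eigs)))

# Two toy senses of an ambiguous word, mixed with equal probability
senses = [np.array([1.0, 0.2, 0.0]), np.array([0.0, 0.3, 1.0])]
print(von_neumann_entropy(density(senses, [0.5, 0.5])))   # > 0: the word is ambiguous
```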
Tensor-based models: Summary
Distinguishing feature
Relational words are functions acting on arguments.
PROS:
Highly justified from a linguistic perspective
More powerful and robust than vector mixtures
A glass box approach that allows theoretical reasoning
CONS:
Every logical and functional word must be assigned an
appropriate tensor representation (e.g. how do you represent
quantifiers?)
Space complexity problems for functions of higher arity (e.g. a
ditransitive verb is a tensor of order 4)
Limited to linearity
Outline
1 Introduction
2 Vector mixture models
3 Tensor-based models
4 Deep learning models
5 Afterword
A simple neural net
A feed-forward neural network with one hidden layer:
h_1 = f(w_{11}x_1 + w_{21}x_2 + w_{31}x_3 + w_{41}x_4 + w_{51}x_5 + b_1)
h_2 = f(w_{12}x_1 + w_{22}x_2 + w_{32}x_3 + w_{42}x_4 + w_{52}x_5 + b_2)
h_3 = f(w_{13}x_1 + w_{23}x_2 + w_{33}x_3 + w_{43}x_4 + w_{53}x_5 + b_3)

or, in matrix form, \vec{h} = f(W^{(1)}\vec{x} + \vec{b}^{(1)})

Similarly: \vec{y} = f(W^{(2)}\vec{h} + \vec{b}^{(2)})

Note that W^{(1)} ∈ R^{3×5} and W^{(2)} ∈ R^{2×3}
f is a non-linear function such as tanh or sigmoid
(take f = Id and you have a tensor-based model)
Weights are optimized against some objective function
A universal approximator.
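The same network as a NumPy sketch, with randomly initialised weights in place of trained ones:

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.tanh                                                    # the non-linearity

x = rng.standard_normal(5)                                     # input vector
W1, b1 = rng.standard_normal((3, 5)), rng.standard_normal(3)   # W(1) ∈ R^{3×5}
W2, b2 = rng.standard_normal((2, 3)), rng.standard_normal(2)   # W(2) ∈ R^{2×3}

h = f(W1 @ x + b1)                                             # hidden layer
y = f(W2 @ h + b2)                                             # output layer
```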
Recursive neural networks
Pollack (1990), Socher et al. (2012): Recursive neural nets
for composition in natural language.
f is an element-wise non-linear function such as tanh or
sigmoid.
A supervised approach.
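A sketch of the basic recursive composition step, assuming the standard formulation p = f(W[c1; c2] + b) with shared weights applied bottom-up along a given binary parse tree; the weights here are random placeholders, whereas in practice they are learned against a supervised objective:

```python
import numpy as np

d = 4
W = np.random.rand(d, 2 * d)    # shared composition matrix
b = np.random.rand(d)

def compose(c1, c2):
    """Parent vector p = tanh(W [c1; c2] + b)."""
    return np.tanh(W @ np.concatenate([c1, c2]) + b)

happy, kids, play, games = (np.random.rand(d) for _ in range(4))
sentence = compose(compose(happy, kids), compose(play, games))
```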
Unsupervised learning with NNs
How can we train a NN in an unsupervised manner?
Create an auto-encoder: Train the network to reproduce its
input via an expansion layer.
Use the output of the hidden layer as a compressed version of
the input [Socher et al., 2011]
Convolutional NNs
Originated in pattern recognition [Fukushima, 1980]
Small filters are applied at every position of the input vector:
Capable of extracting fine-grained local features independently
of their exact position in the input
Features become increasingly global as more layers are stacked
Each convolutional layer is usually followed by a pooling layer
Top layer is fully connected, usually a soft-max classifier
Application to language: Collobert and Weston (2008)
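A minimal NumPy sketch of a single narrow convolution over a sentence matrix, followed by max-pooling; the filter is random, standing in for a learned one:

```python
import numpy as np

d, n, width = 4, 6, 3                 # embedding size, sentence length, filter width
sentence = np.random.rand(d, n)       # one column per word
filt = np.random.rand(d, width)       # one convolutional filter (random placeholder)

# Narrow convolution: apply the filter to every window of `width` consecutive words
feature_map = np.array([np.sum(filt * sentence[:, i:i + width])
                        for i in range(n - width + 1)])
pooled = feature_map.max()            # max-pooling over the feature map
```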
DCNNs for modelling sentences
Kalchbrenner, Grefenstette
and Blunsom (2014): A deep
architecture using dynamic
k-max pooling
Syntactic structure is
induced automatically:
(Figures reused with permission)
Beyond sentence level
Denil, Demiraj, Kalchbrenner, Blunsom, and De Freitas (2014): An
additional convolutional layer can provide document vectors:
(Figure reused with permission)
Deep learning models: Summary
Distinguishing feature
Drastic transformation of the sentence space.
PROS:
Non-linearity and layered approach allow the simulation of a
very wide range of functions
Word vectors are parameters of the model, optimized during
training
State-of-the-art results in a number of NLP tasks
CONS:
Requires expensive training
Difficult to discover the right configuration
A black-box approach: not easy to correlate inner workings
with output
Outline
1 Introduction
2 Vector mixture models
3 Tensor-based models
4 Deep learning models
5 Afterword
A hierarchy of CDMs
Putting everything together:
Note that “power” is only theoretical; actual performance
depends on task, underlying assumptions, and configuration
Open issues and future work
No convincing solution for logical connectives, negation,
quantifiers and so on.
Functional words, such as prepositions and relative pronouns,
are also a problem.
Sentence space is usually identified with word space. This is
convenient, but obviously a simplification
Solutions depend on the specific CDM class—e.g. not much
to do in a vector mixture setting
Important: How can we make NNs more linguistically aware?
[Hermann and Blunsom, 2013]
Conclusion
CDMs provide quantitative semantic representations for
sentences (or even documents)
Element-wise operations on word vectors constitute an easy
and reasonably effective way to get sentence vectors
Categorical compositional distributional models allow
reasoning on a theoretical level—a glass box approach
Deep learning models are extremely powerful and effective;
still a black-box approach, where it is not easy to explain why a
specific configuration works and another does not.
Convolutional networks seem to constitute a very promising
solution for sentential/discourse semantics
References I
Baroni, M. and Zamparelli, R. (2010).
Nouns are Vectors, Adjectives are Matrices.
In Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP).
Coecke, B., Sadrzadeh, M., and Clark, S. (2010).
Mathematical Foundations for a Compositional Distributional Model of Meaning. Lambek Festschrift.
Linguistic Analysis, 36:345–384.
Collobert, R. and Weston, J. (2008).
A unified architecture for natural language processing: Deep neural networks with multitask learning.
In Proceedings of the 25th international conference on Machine learning, pages 160–167. ACM.
Denil, M., Demiraj, A., Kalchbrenner, N., Blunsom, P., and de Freitas, N. (2014).
Modelling, visualising and summarising documents with a single convolutional neural network.
arXiv preprint arXiv:1406.3830.
Kalchbrenner, N., Grefenstette, E., and Blunsom, P. (2014).
A convolutional neural network for modelling sentences.
In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1:
Long Papers), pages 655–665, Baltimore, Maryland. Association for Computational Linguistics.
Kartsaklis, D. and Sadrzadeh, M. (2013).
Prior disambiguation of word tensors for constructing sentence vectors.
In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages
1590–1601, Seattle, Washington, USA. Association for Computational Linguistics.
Kartsaklis, D., Sadrzadeh, M., Pulman, S., and Coecke, B. (2014).
Reasoning about meaning in natural language with compact closed categories and Frobenius algebras.
arXiv preprint arXiv:1401.5980.
References II
Lambek, J. (2008).
From Word to Sentence.
Polimetrica, Milan.
Mitchell, J. and Lapata, M. (2008).
Vector-based Models of Semantic Composition.
In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics, pages 236–244.
Montague, R. (1970a).
English as a formal language.
In Linguaggi nella Società e nella Tecnica, pages 189–224. Edizioni di Comunità, Milan.
Montague, R. (1970b).
Universal grammar.
Theoria, 36:373–398.
Piedeleu, R., Kartsaklis, D., Coecke, B., and Sadrzadeh, M. (2015).
Open System Categorical Quantum Semantics in Natural Language Processing.
arXiv preprint arXiv:1502.00831.
Pollack, J. B. (1990).
Recursive distributed representations.
Artificial Intelligence, 46(1):77–105.
Sadrzadeh, M., Clark, S., and Coecke, B. (2013).
The Frobenius anatomy of word meanings I: subject and object relative pronouns.
Journal of Logic and Computation, Advance Access.
References III
Socher, R., Huang, E., Pennington, J., Ng, A., and Manning, C. (2011).
Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection.
Advances in Neural Information Processing Systems, 24.
Socher, R., Huval, B., Manning, C., and Ng, A. (2012).
Semantic compositionality through recursive matrix-vector spaces.
In Conference on Empirical Methods in Natural Language Processing 2012.
