Slide Notes

  • The effect of syntactic constituency on composition is partially addressed by Mitchell and Lapata’s weighted additive model, where the vectors are multiplied by different scalar values before summing.
  • F is the matrix encoding the function f as a linear transformation, a is the vector denoting the argument, and b = Fa is the vector output by the composition process.
  • Table 3 contains a 2×2 matrix with the same labels for rows and columns (this is not necessary: it happens here because adjectives, as we have already stated, map nouns onto the same nominal space), and where the first cell, for example, weights the mapping from the runs-labeled component of the input vector onto the runs-labeled component of the output vector.
  • In the Mitchell and Lapata models, all words and larger constituents live in the same space, so everything is directly comparable with everything else.
  • Phrase structure grammars (as opposed to dependency grammars). Equivalent in generative capacity to context-free grammars. Based on function application rules. Only a small number of (mostly language-independent) rules are employed, and all other syntactic phenomena derive from the lexical entries of specific words. First assign interpretation types to all the basic categories, then associate all the derived categories with appropriate function types.
  • cat plays the double role of being the subject of the main clause and the object of the relative clause

Transcript

  • 1. www.insight-centre.org An Introduction to Compositional Models in Distributional Semantics André Freitas Supervisor: Edward Curry Reading Group Friday (22/11/2013)
  • 2. www.insight-centre.org Based on: Baroni et al. (2012) Frege in Space: A Program for Compositional Distributional Semantics
  • 3. The Paper www.insight-centre.org • Comprehensive (107 pages) introduction and overview of compositional distributional models. 3
  • 4. Semantics for a Complex World www.insight-centre.org • Most semantic models have dealt with particular types of constructions, and have been carried out under very simplifying assumptions, in true lab conditions. • If these idealizations are removed it is not clear at all that modern semantics can give a full account of all but the simplest sentences. Sahlgren, 2013 4
  • 5. Goal behind Compositional Distributional Models www.insight-centre.org • Principled and effective semantic models for coping with real-world semantic conditions. • Focus on semantic approximation. • Applications: – Semantic search. – Approximate semantic inference. – Paraphrase detection. – Semantic anomaly detection. – ... 5
  • 6. Paraphrase Detection www.insight-centre.org • I find it rather odd that people are already trying to tie the Commission's hands in relation to the proposal for a directive, while at the same calling on it to present a Green Paper on the current situation with regard to optional and supplementary health insurance schemes. =? • I find it a little strange to now obliging the Commission to a motion for a resolution and to ask him at the same time to draw up a Green Paper on the current state of voluntary insurance and supplementary sickness insurance. 6
  • 7. Solving the Problem: The Data-driven Way www.insight-centre.org • Distributional – Use vast corpora to extract the meaning of content words. – Provide a principled representation of distributional meaning. • Compositional – These representations should be objects that compose together to form more complex meanings. – Content words should be able to combine with grammatical roles, in ways that account for the importance of structure in sentence meaning. 7
  • 8. www.insight-centre.org Distributional Semantics 8
  • 9. Distributional Semantics www.insight-centre.org • “Words occurring in similar (linguistic) contexts are semantically similar.” • Practical way to automatically harvest word “meanings” on a large-scale. • meaning = linguistic context. • This can then be used as a surrogate of its semantic representation. 9
  • 10. Vector Space Model www.insight-centre.org [Figure: the words husband, spouse and child plotted as vectors over context dimensions c1 ... cn; each coordinate is a function of the number of times the word occurs in that context (e.g. 0.7 and 0.5 on c1).] 10
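A minimal sketch of how such context-count vectors can be harvested, assuming a toy corpus and a context window of 2 (both illustrative choices, not taken from the slides):

```python
from collections import Counter

# Toy corpus standing in for the vast corpora mentioned on slide 7.
corpus = "the husband loves the spouse and the spouse loves the child".split()

def context_vector(target, tokens, window=2):
    """Count how often each word occurs within `window` tokens of `target`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            counts.update(t for j, t in enumerate(tokens[lo:hi], lo) if j != i)
    return counts

print(context_vector("spouse", corpus))
# Counter({'the': 4, 'loves': 2, 'and': 2}): the distributional profile of "spouse"
```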
  • 11. Semantic Similarity/Relatedness www.insight-centre.org [Figure: the same word vectors over c1 ... cn, with the angle θ between two of them (e.g. husband and spouse) marking their similarity.] 11
  • 12. Similarity www.insight-centre.org • Distributional vectors allow a precise quantification of similarity. • Measured by the distance of the corresponding vectors on the Cartesian plane. 12
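A minimal sketch of this similarity computation using the cosine of the angle θ between vectors; the 2-dimensional vectors and their values are illustrative assumptions:

```python
import numpy as np

def cosine(u, v):
    """Cosine of the angle between two distributional vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical context counts over two dimensions (c1, c2).
husband = np.array([0.7, 0.9])
spouse = np.array([0.5, 0.8])
child = np.array([0.9, 0.1])

print(cosine(husband, spouse))  # ~0.99: very similar contexts
print(cosine(husband, child))   # ~0.70: less similar contexts
```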
  • 13. Semantic Approximation (Video) www.insight-centre.org
  • 14. www.insight-centre.org Compositional Model
  • 15. Compositional Semantics www.insight-centre.org • Can we extend DS to account for the meaning of phrases and sentences? 15
  • 16. Compositionality www.insight-centre.org • The meaning of a complex expression is a function of the meaning of its constituent parts. [Example constituents: “digest slowly”, “carnivorous plants”] 16
  • 17. Compositionality Principles www.insight-centre.org • Words that act as functions transforming the distributional profile of other words (e.g., verbs, adjectives, ...). • Words in which the meaning is directly determined by their distributional behaviour (e.g., nouns). 17
  • 18. Compositionality Principles www.insight-centre.org • Take the syntactic structure to constitute the backbone guiding the assembly of the semantic representations of phrases. • A correspondence between syntactic categories and distributional objects. 18
  • 19. Mixture-based Models www.insight-centre.org • Mitchell and Lapata (2010) • Proposed two broad classes of composition models. – Additive. – Multiplicative. 19
  • 20. Additive Model www.insight-centre.org 20
  • 21. Additive Model www.insight-centre.org • Limitations with the additive model: – The input vectors contribute to the composed expression in the same way. – Linguistic intuition would suggest that the composition operation is asymmetric (head of the phrase should have greater weight). 21
  • 22. Multiplicative Model www.insight-centre.org 22
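A sketch of the two Mitchell and Lapata composition operations, plus the weighted additive variant mentioned in the notes above; the 3-dimensional vectors and the weights are made up for illustration:

```python
import numpy as np

u = np.array([0.7, 0.9, 0.1])  # e.g. "carnivorous" (illustrative values)
v = np.array([0.5, 0.8, 0.6])  # e.g. "plants"

additive = u + v              # p = u + v
weighted = 0.3 * u + 0.7 * v  # p = a*u + b*v: the head of the phrase gets more weight
multiplicative = u * v        # p = u (*) v, component-wise product
```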
  • 23. Analysis www.insight-centre.org • Multiplicative models perform quite well in the task of predicting human similarity judgments about adjective-noun, noun-noun, verb-noun and noun-verb phrases. 23
  • 24. Criticism of Mixture Models www.insight-centre.org • Some words have an intrinsic functional behaviour: “lice on dogs”, “lice and dogs”. • Lack of recursion. • To address these limitations, function-based models were introduced. 24
  • 25. Mixture vs Function www.insight-centre.org 25
  • 26. Distributional Functions www.insight-centre.org • Composition as function application. • Nouns are still represented as vectors. • Adjectives, verbs, determiners, prepositions, conjunctions and so forth are all modelled by distributional functions. (ON(dogs))(lice) AND(lice, dogs) 26
  • 27. Distributional functions as linear transformations www.insight-centre.org • Distributional functions are linear transformations on semantic vector/tensor spaces. • Matrix: first-order, one-argument distributional functions. • Used to represent adjectives and intransitive verbs. 27
  • 28. Example: Adjective + Noun www.insight-centre.org • Adjective = a function from nouns to nouns. 28
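A minimal sketch of this function application as matrix-vector multiplication (b = Fa); the 2-dimensional noun space and all values are illustrative assumptions:

```python
import numpy as np

# OLD as a linear map on a toy noun space with components ("runs", "sleeps").
OLD = np.array([[0.3, 0.0],   # dampens the "runs" component (cf. slide 29)
                [0.1, 1.0]])
dog = np.array([0.9, 0.5])

old_dog = OLD @ dog  # function application: b = F a
print(old_dog)       # [0.27, 0.59]: "old dog" runs much less than "dog"
```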
  • 29. Measuring similarity of tensors www.insight-centre.org • Two matrices (or tensors) are similar when they have a similar weight distribution, i.e., they perform similar input-to-output component mappings. • DECREPIT, OLD might dampen the “runs” component of a noun. 29
  • 30. Inducing distributional functions from corpus data www.insight-centre.org • Distributional functions are induced from input-to-output transformation examples, using regression techniques commonly used in machine learning. 30
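A sketch of this induction step, using plain least squares from numpy as a stand-in for the (typically ridge) regression used in this literature; all training vectors are fabricated for illustration:

```python
import numpy as np

# Corpus-harvested vectors for nouns (X) and for the observed
# "old + noun" phrases (Y); the numbers here are made up.
X = np.array([[0.9, 0.5],   # dog
              [0.8, 0.2],   # car
              [0.1, 0.7]])  # idea
Y = np.array([[0.3, 0.6],   # old dog
              [0.2, 0.3],   # old car
              [0.0, 0.8]])  # old idea

# Least-squares estimate of a matrix W with X @ W ~ Y, so that
# OLD = W.T satisfies OLD @ noun ~ phrase for each training pair.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)
OLD = W.T
print(OLD @ X[0])  # predicted vector for "old dog"
```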
  • 31. www.insight-centre.org 31
  • 32. Socher, 2012 www.insight-centre.org • Recursive neural network (RNN) model that learns compositional vector representations for phrases and sentences. • State-of-the-art performance on three different experiments, including sentiment analysis and cause-effect semantic relations. 32
  • 33. Main Challenges www.insight-centre.org • Challenge I: Lack of sufficient examples of distributional functions’ inputs and outputs. – Possible solution: extend the training sets by exploiting similarities between linguistic expressions to ‘share’ training examples across distributional functions. • Challenge II: Computational power and space (Grefenstette et al., 2013). – Nouns live in 300-dimensional spaces; a transitive verb is then a (300 × 300) × 300 tensor, i.e. 300³ = 27 million components. – A relative pronoun is a (300 × 300) × (300 × 300) tensor, i.e. 300⁴ = 8.1 billion components. 33
  • 34. Categorial Grammar www.insight-centre.org • Provides the syntax-semantics interface. • Tight connection between syntax and semantics. • Motivated by the principle of compositionality. • View that syntactic constituents should generally combine as functions or according to a function-argument relationship. 34
  • 35. Categorial Grammar www.insight-centre.org [Figure: inference rules are applied until the string is recognised as a sentence: ((the (bad boy)) (made (that mess)))] 35
  • 36. Local compositions www.insight-centre.org [Figure: BARK × dogs: the matrix BARK is applied to the vector dogs.] 36
  • 37. Local compositions www.insight-centre.org [Figure: (CHASE × cats) × dogs: CHASE is a 3rd-order tensor; contracting it with the vector cats yields the matrix (CHASE × cats), which applied to the vector dogs yields a vector for the sentence.] 37
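A sketch of this contraction sequence with numpy's einsum; the dimensionality, the random values, and the choice of which tensor axis binds the object are toy assumptions:

```python
import numpy as np

d = 2  # toy noun-space dimensionality (300 in Grefenstette et al., 2013)
rng = np.random.default_rng(0)

CHASE = rng.random((d, d, d))  # 3rd-order tensor for the transitive verb
cats = rng.random(d)
dogs = rng.random(d)

chase_cats = np.einsum('ijk,k->ij', CHASE, cats)  # (CHASE x cats): a matrix
sentence = chase_cats @ dogs                      # x dogs: the sentence vector
print(sentence)
```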
  • 38. Syntax-Semantics interface for an English fragment 38 www.insight-centre.org
  • 39. Other Compositional Models www.insight-centre.org • Coecke et al. (2010): Category theory and Lambek calculus. • Grefenstette et al. (2013): Simulating Logical Calculi with Tensors. • Novacek et al. ISWC (2011), Freitas et al. ICSC (2011): Semantic Web & Distributional Semantics. 39
  • 40. Conclusion www.insight-centre.org • Distributional semantics brings a promising approach for building computational models that work in the real world. • Semantic approximation as a built-in construct. • Compositionality is still an open problem but classical (formal) works have been leveraged and adapted to DSMs. • Exciting time to be around! 40