
Towards a lingua universalis

The main thesis here is this: (i) the data-driven approach to NLU is utterly fallacious; (ii) logical semantics has been seriously misguided; and (iii) logical semantics can be rectified. Here we suggest how this can be done and how to go forward, again.


  1. 1. An Utterly Fallacious Data-Driven Approach, a Misguided Logical Semantics, and How to Go Forward, Again
  2. 2. Work in Progress
  3. 3. Language is not Learnable: PART I • natural language is an infinite object • infinite objects are recursively defined • recursive definitions are rules • rules are not learnable from examples ⇒ natural language is not learnable
  4. 4. How many valid programs is a Python compiler ready to interpret? Chomsky’s Infinity
  5. 5. How many valid programs is a Python compiler ready to interpret? ⇒ ∞ Chomsky’s Infinity
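To make the point of the two slides above concrete, here is a minimal sketch (my own illustration, not part of the deck): a single recursive rule generates ever-deeper valid Python programs, and `compile()` accepts every one of them. The rule is finite; the set of programs it licenses is not, which is the sense in which the compiler is "ready" for infinitely many programs.

```python
# A finite rule ("an expression may be a parenthesized sum of expressions")
# licenses an unbounded family of valid Python programs.
def nested_expr(depth: int) -> str:
    """Build a syntactically valid expression of arbitrary nesting depth."""
    return "1" if depth == 0 else f"(1 + {nested_expr(depth - 1)})"

for depth in (1, 10, 50):
    src = nested_expr(depth)
    compile(src, "<generated>", "eval")   # accepted at every depth
    print(f"depth={depth}: {len(src)} characters, still a valid program")
```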
  6. 6. Chomsky’s Infinity How many sentences are people ready to understand (once they have attained linguistic competency)? I’m sorry but I don’t have your last sentence in my dictionary.
  7. 7. Chomsky’s Infinity ⇒ ∞ How many sentences are people ready to understand (once they have attained linguistic competency)? I’m sorry but I don’t have your last sentence in my dictionary.
  8. 8. We have the capacity to express (and interpret) an infinite number of thoughts Infinite ⇒ we can only ever be exposed to a tiny fraction of examples, a fraction that is, in the end, statistically insignificant
  9. 9. Noam Chomsky the notion of the probability of a sentence is an entirely useless one, under any known interpretation of this term.
  10. 10. External Syntax of Infinite Objects
  11. 11. External Syntax of Infinite Objects
  12. 12. Recursion is the tool by which we can have a finite representation of infinite objects. But recursive definitions are rules; and rules are not susceptible to individual experiences; and thus infinite objects cannot be learned from observation (experience)
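As an illustration of the slide above (a sketch of mine, not from the deck), a handful of recursive rewrite rules is a finite representation of an infinite fragment of English: the toy grammar below generates unboundedly many sentences, yet no finite set of example sentences determines the rules.

```python
import random

# A toy recursive grammar: finitely many rules, infinitely many sentences.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["the", "N"], ["the", "N", "that", "VP"]],   # recursion via relative clause
    "VP": [["V", "NP"], ["V"]],
    "N":  [["dog"], ["cat"], ["reporter"]],
    "V":  [["saw"], ["chased"], ["slept"]],
}

def generate(symbol: str = "S") -> list[str]:
    """Expand a symbol recursively until only terminals remain."""
    if symbol not in GRAMMAR:                 # terminal word
        return [symbol]
    expansion = random.choice(GRAMMAR[symbol])
    return [word for part in expansion for word in generate(part)]

for _ in range(3):
    print(" ".join(generate()))
```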
  13. 13. I reject the contention that an important theoretical difference exists between formal and natural languages RICHARD MONTAGUE
  14. 14. IMMANUEL KANT Every thing in nature, in the inanimate as well as in the animate world, happens according to some rules, though we do not always know them I reject the contention that an important theoretical difference exists between formal and natural languages RICHARD MONTAGUE
  15. 15. the challenge in language understanding is related to uncovering all the missing text that is never explicitly stated, but is often implicitly assumed as shared background knowledge Language is not Learnable: PART II
  16. 16. The MissingText Phenomenon (MTP) quantifier scope BBC has a reporter in every country ⇒ BBC has a different reporter in every country
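A hedged sketch (mine, not the deck's) of why the quantifier-scope example matters: the sentence has a weak reading (for every country there is some reporter) and a strong reading (one reporter for all countries), and the missing text "a different" signals the weak one. The toy model below makes the two readings come apart.

```python
# Toy model: which reading of "BBC has a reporter in every country" holds?
countries = {"France", "Japan", "Brazil"}
reporter_in = {            # reporter -> countries they cover
    "Ana":  {"France"},
    "Ben":  {"Japan"},
    "Caro": {"Brazil"},
}

# Weak reading: for every country there is some (possibly different) reporter.
weak = all(any(c in covered for covered in reporter_in.values()) for c in countries)

# Strong reading: a single reporter covers every country.
strong = any(countries <= covered for covered in reporter_in.values())

print(weak, strong)   # True False: the two readings are genuinely different
```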
  17. 17. prepositional phrase attachments John had pizza with his kids ⇒ John had pizza along/together with his kids John had pizza with pineapple ⇒ John had pizza with a pineapple topping The MissingText Phenomenon (MTP)
  18. 18. metonymy The corner table wants another beer ⇒ The person sitting at the corner table wants another beer The MissingText Phenomenon (MTP)
  19. 19. metaphor Don’t worry about Simon, he’s a rock ⇒ Don’t worry about Simon, he’s solid like a rock The MissingText Phenomenon (MTP)
  20. 20. compound nominals John works in a car factory ⇒ John works in a car-producing factory The MissingText Phenomenon (MTP)
  21. 21. lexical ambiguity John likes to play bridge ⇒ John likes to play the game bridge The MissingText Phenomenon (MTP)
  22. 22. The MissingText Phenomenon (MTP)
  23. 23. HECTOR LEVESQUE You need to have background knowledge that is not expressed in the words of the sentence to be able to sort out what is going on … and it is precisely bringing this background knowledge to bear that we informally call thinking. The MissingText Phenomenon (MTP)
  24. 24. 4 technical reasons why natural language is not Learnable from Data
  25. 25. Unlike formal languages (e.g., Java), in ordinary spoken languages (e.g., English, Spanish, etc.) we leave out implicitly assumed information by relying on our “common” background knowledge. Ordinary spoken language is thus highly (in fact, optimally) compressed
  26. 26. 4 technical reasons why languages are not learnable MISSING TEXT PHENOMENON (MTP) 1. because of MTP, NL is highly (optimally) compressed 2. learning from data requires redundancy (compressibility) 3. NL is not compressible (it is already highly compressed), from 1 4. NL does not have redundancies and thus is not learnable, from 2 & 3 QED
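A rough, purely illustrative sketch of the compressibility point (not from the deck): learning from data exploits redundancy, and a standard compressor finds far more redundancy in repetitive data than in ordinary text that already leaves most of its content implicit. The numbers below only illustrate the direction of the argument.

```python
import zlib

def ratio(text: str) -> float:
    """Compressed size / original size: lower means more redundancy was found."""
    raw = text.encode("utf-8")
    return len(zlib.compress(raw, 9)) / len(raw)

redundant = "the cat sat on the mat. " * 10          # highly redundant data
ordinary  = ("The corner table wants another beer, BBC has a reporter in every "
             "country, John had pizza with his kids, and don't worry about Simon, "
             "he's a rock. Julie is articulate, and John likes to play bridge.")

print(f"repetitive text : {ratio(redundant):.2f}")   # much below 1.0
print(f"ordinary text   : {ratio(ordinary):.2f}")    # close to 1.0
```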
  27. 27. 4 technical reasons why languages are not learnable FUNCTION WORDS quantifiers every some all most modals must could should can prepositions with on for to at connectives not and or if relative pronouns that which In ML/data-driven approaches, function words are treated as stopwords and are typically ignored, since their probabilities are roughly equal in all contexts (they are statistically insignificant) and leaving them in would disrupt the entire statistical model.
  28. 28. 4 technical reasons why languages are not learnable FUNCTION WORDS But ignoring function words is problematic: these words are what, in the end, determine (‘glue together’) the final meaning. Thus, ML/data-driven models, while they can approximate text similarity, cannot account for true meaning.
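A small sketch of the stopword problem (my illustration, not the deck's): once function words are stripped, sentences with very different meanings collapse into the same bag of content words, so any model that discards them has already lost the distinction.

```python
# A typical stopword list contains exactly the function words listed above.
STOPWORDS = {"every", "some", "a", "the", "not", "and", "or", "if",
             "with", "on", "for", "to", "at", "that", "which"}

def content_words(sentence: str) -> set[str]:
    """What a bag-of-words model keeps after stopword removal."""
    return {w for w in sentence.lower().split() if w not in STOPWORDS}

s1 = "every student read some book"
s2 = "some student read every book"

print(content_words(s1))                       # {'student', 'read', 'book'}
print(content_words(s1) == content_words(s2))  # True, yet the meanings differ
```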
  29. 29. 4 technical reasons why languages are not learnable STATISTICAL INSIGNIFICANCE Besides function words, statistical insignificance can occur in situations where the distinguishing information is not even in the data. Antonyms/opposites (e.g., big/small, writing/reading) are known to occur in similar contexts with equal probabilities, and thus in a Winograd-style sentence such as (1) ‘The trophy did not fit in the suitcase because it was too big/small’, statistical analysis is useless, since the only clue to the preferred referent is a function of the antonyms
  30. 30. 4 technical reasons why languages are not learnable STATISTICAL INSIGNIFICANCE Clearly, it is neither psychologically nor computationally plausible that we need to see 40,000,000 examples just to learn how to resolve a reference such as ‘it’ in (1). (it would take a child a lifetime to learn how to resolve such references!)
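To illustrate the statistical-insignificance point (a sketch of mine, assuming the Winograd-style sentence given as (1) above): the two variants below differ only in an antonym, everything a bag-of-words or co-occurrence model sees is identical, yet the referent of 'it' flips.

```python
from collections import Counter

s_big   = "the trophy did not fit in the suitcase because it was too big"
s_small = "the trophy did not fit in the suitcase because it was too small"

def profile(sentence: str) -> Counter:
    """The distributional context around the antonym, as a word-count profile."""
    return Counter(w for w in sentence.split() if w not in {"big", "small"})

# Identical statistics...
print(profile(s_big) == profile(s_small))   # True
# ...yet 'it' refers to the trophy in the first sentence
# and to the suitcase in the second.
```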
  31. 31. 4 technical reasons why languages are not learnable ACCOUNTING FOR INTENSIONS Note that ‘2 * (4 + 3)’ may be equal to ‘14’ and to ‘7 + 7’ (by value only), but the three expressions are not the same objects (besides their value, they differ in many other attributes!). Intension (with an ‘s’) is a complex and very involved subject, but for now we consider the simple notion of intension that precludes data-driven/quantitative approaches from being relevant to NLU, since these models deal with extensions only and cannot account for intensions. Basically, data-driven/quantitative systems can deal with equality, but not sameness: the latter implies the former, but the former is much weaker!
  32. 32. 4 technical reasons why languages are not learnable ACCOUNTING FOR INTENSIONS
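A minimal sketch (not from the deck) of equality versus sameness in Python terms: the three expressions evaluate to the same value, but as objects (their abstract syntax trees) they are plainly not the same, which is all the notion of intension needed here.

```python
import ast

exprs = ["2 * (4 + 3)", "7 + 7", "14"]

values = [eval(e) for e in exprs]                               # extensions
trees  = [ast.dump(ast.parse(e, mode="eval")) for e in exprs]   # intensions

print(values)            # [14, 14, 14]
print(len(set(values)))  # 1 -> all equal by value
print(len(set(trees)))   # 3 -> three different objects
```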
  33. 33. PART III if data-driven (quantitative) NLU is not a viable approach, and if logical semantics failed (thus far), then what to do?
  34. 34. One can assume a theory of the world that is isomorphic to the way we talk about it… in this case, semantics becomes very nearly trivial JERRY HOBBS Use language as a tool for uncovering the semiotic ontology of commonsense since ordinary language is the best-known theory we have of everyday knowledge JOHN A. BATEMAN we should investigate how our language functions and then answer the metaphysical questions MICHAEL DUMMETT
  35. 35. We know any object only through predicates that we can say or think of it. Any object has a set of predicates that can ‘sensibly’ be applied to it. Being (and having) a concept is being locked to the property that the concept expresses. Only where the word for the thing has been found is the thing a thing... the word alone gives being to the thing. How can we uncover all the implicitly assumed information that is never explicitly stated? KANT HEIDEGGER SOMMERS FODOR
  36. 36. PART III All of the above can be summarized as follows: 1. There’s a formal system that underlies all natural languages 2. In our linguistic communication, there seems to be an innate ontological structure that we safely assume is common to all humans 3. We need to discover the nature of that ontological structure 4. We can use (reverse-engineer) language itself to discover the nature of that ontological structure that underlies all natural languages
  37. 37. Unfortunately, the dominant logic that won the day is ‘logic as a calculus’, an abstract symbol-manipulation system devoid of any content, and not ‘logic as a language’: a logic that has ontological content, the logic that was to be the lingua universalis NINO B. COCCHIARELLA
  38. 38. But mistakes made in logical semantics can be corrected, leading to a computationally formal system that underlies all natural languages. The main mishap in logical semantics was confusing predicates and types: predication was wrongly used to represent types in a strongly-typed ontology; types that correspond to all that we talk about in NL
  39. 39. Types vs. Predicates How can we explain that (1) and (2) convey, roughly, the same cognitive content? (1) Julie is an articulate person ⇒ articulate(julie) ∧ person(julie) (2) Julie is articulate ⇒ articulate(julie)
  40. 40. Types vs. Predicates How can we explain that (1) and (2) convey, roughly, the same cognitive content? (1) Julie is an articulate person ⇒ articulate(julie) ∧ person(julie) (2) Julie is articulate ⇒ articulate(julie) Since (p ∧ q) ≡ p holds only when q is true, person(julie) in (1) must be assumed to be true a priori. As we suggest below, distinguishing between logical and ontological concepts results in (1)/(2) ⇒ (∃julie :: person)(articulate(julie))
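A hedged sketch of the types-versus-predicates distinction using ordinary Python machinery (the class hierarchy and names below are my own illustration, not the deck's formalism): 'person' is treated as the type an object is quantified over, while 'articulate' remains a predicate applied to it, so (1) and (2) naturally receive the same representation.

```python
from dataclasses import dataclass

# Ontological types as a small class hierarchy.
class Entity: ...
class Human(Entity): ...
class Person(Human): ...

@dataclass
class Obj:
    name: str
    type_: type          # the ontological type the object is "locked to"

def articulate(x: Obj) -> bool:
    """A predicate; it only makes sense for objects typed as Person."""
    assert issubclass(x.type_, Person), "predicate applied outside its type"
    return True          # stipulated for the example

julie = Obj("julie", Person)

# (1) "Julie is an articulate person" and (2) "Julie is articulate" both come out
# as: there is an object julie of type Person such that articulate(julie).
print(articulate(julie))
```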
  41. 41. Types vs. Predicates
  42. 42. Types vs. Predicates We can later discuss how this hierarchy of ontological types (roughly, what Fred Sommers calls ‘The Language Tree’) might be discovered
  43. 43. Type Unification: Simple Case
  44. 44. Adjective-Ordering Restrictions Why is (a) more natural to say than (b)? (a) Jon bought a beautiful red car (b) Jon bought a red beautiful car
  45. 45. Because we can always cast-up (generalize). Casting down, however, is undecidable Adjective-Ordering Restrictions Why is (a) more natural to say than (b)? (a) Jon bought a beautiful red car (b) Jon bought a red beautiful car
  46. 46. An Innate Ontological Structure? Because we can always cast-up (generalize). Casting down, however, is undecidable Why is (a) more natural to say than (b)? (a) Jon bought a beautiful red car (b) Jon bought a red beautiful car
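A sketch of the casting-up intuition behind the three slides above (the toy hierarchy is my own, not the deck's): generalizing an object to a supertype is always a single, decidable walk up the hierarchy, whereas casting down would require choosing among arbitrarily many subtypes, so adjective orderings that only ever cast up read more naturally.

```python
# A toy ontology as parent links: casting up is just following the links.
PARENT = {
    "RedCar": "Car",
    "Car": "PhysicalObject",
    "PhysicalObject": "Entity",
}

def cast_up(t: str, target: str) -> bool:
    """Decidable: walk up the finite chain of supertypes."""
    while t is not None:
        if t == target:
            return True
        t = PARENT.get(t)
    return False

# 'beautiful [red car]': cast RedCar up to Entity, where 'beautiful' applies.
print(cast_up("RedCar", "Entity"))   # True: generalizing is always possible
# 'red [beautiful car]' would need to cast the general type back *down*
# to something colourable, and there is no unique way to do that.
```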
  47. 47. Type Unification: Ambiguity in Nominal Modification
  48. 48. The objects a and Olga are associated with more than one type in the same scope: type unification is required Type Unification: Ambiguity in Nominal Modification
  49. 49. Type Unification: Ambiguity in Nominal Modification
  50. 50. Type Unification: Ambiguity in Nominal Modification Check why the nominal modification in “Olga is an experienced dancer” is not ambiguous
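A minimal sketch of type unification in nominal modification (the tiny hierarchy and applicability table are my assumptions, not the deck's): 'beautiful' can apply to a Human or to an Activity, so 'beautiful dancer' unifies in two ways and is ambiguous, while 'experienced' applies only to a Human, so 'experienced dancer' unifies in exactly one way.

```python
# Types a noun can contribute, and types an adjective can apply to.
NOUN_TYPES = {"Dancer": {"Human", "Activity"}}   # the person or the dancing
APPLIES_TO = {"beautiful": {"Human", "Activity"},
              "experienced": {"Human"}}

def unifications(noun: str, adjective: str) -> set[str]:
    """Types under which the adjective and the noun can be unified."""
    return NOUN_TYPES[noun] & APPLIES_TO[adjective]

print(unifications("Dancer", "beautiful"))     # two readings  -> ambiguous
print(unifications("Dancer", "experienced"))   # one reading   -> unambiguous
```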
  51. 51. Discovering the MissingText: Metonymy
  52. 52. Discovering the MissingText: Metonymy The unification of b is easy. The unification of oml, however, will introduce a salient relationship between oml and another object (b :: (Beer • Thing)) → (b :: Beer) (oml :: (Omelet • Human)) → R(Omelet, Human) eat(x :: Human, y :: Food) is the most salient relationship between Human and Food and Omelet ⊑ Food
  53. 53. Discovering the MissingText: Metonymy The person eating the omelet wants a beer
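A sketch of the metonymy resolution on the slides above (the salience table and helper names are my own illustration): unifying Omelet with Human fails directly, so the most salient relation between a Human and a Food, eat, is introduced, yielding "the person eating the omelet wants a beer".

```python
SUBTYPE = {"Omelet": "Food", "Beer": "Thing", "Food": "Thing"}

def is_a(t: str, super_t: str) -> bool:
    """Walk the subtype chain upward."""
    while t is not None:
        if t == super_t:
            return True
        t = SUBTYPE.get(t)
    return False

# Most salient relation between two types (a tiny stand-in for real knowledge).
SALIENT = {("Human", "Food"): "eat"}

def unify(t: str, expected: str) -> str:
    """Either the types unify directly, or a salient relation bridges them."""
    if is_a(t, expected) or is_a(expected, t):
        return t
    for (a, b), rel in SALIENT.items():
        if expected == a and is_a(t, b):        # e.g. an Omelet is a Food
            return f"the {a} who {rel}s the {t}"
    raise TypeError(f"cannot unify {t} with {expected}")

print(unify("Beer", "Thing"))     # direct: a beer is a thing
print(unify("Omelet", "Human"))   # bridged: the Human who eats the Omelet
```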
  54. 54. Type Unification and Uncovering the MissingText Activities can be wise?
  55. 55. Type Unification and Uncovering the MissingText Activities can be wise?
  56. 56. Type Unification and Uncovering the MissingText Activities can be wise?
  57. 57. Type Unification and Uncovering the MissingText [any person engaged in the activity of] exercising is wise Activities can be wise?
  58. 58. What ‘Paradox of the Ravens’? The story goes like this: H1 (all ravens are black) and H2 (all non-black things are non-ravens) are logically equivalent, and thus whatever confirms H1 must (equally) confirm H2, and vice versa. But now seeing a red ball, or a pink elephant, or a white table, etc. will confirm H1, since all of these confirm the logically equivalent hypothesis H2 – which is clearly counterintuitive (not sure that it’s paradoxical, though!)
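To make the setup concrete (a sketch of mine, using the standard statement of the hypotheses): under instance confirmation a white table confirms H2, and since H1 and H2 are logically equivalent it is taken, counterintuitively, to confirm H1 as well.

```python
# Each observation: a label plus (is_raven, is_black).
observations = [("a black raven", True, True),
                ("a white table", False, False),
                ("a red ball",    False, False)]

def confirms_h1(is_raven: bool, is_black: bool) -> bool:
    """Instance confirmation of H1: all ravens are black."""
    return is_raven and is_black

def confirms_h2(is_raven: bool, is_black: bool) -> bool:
    """Instance confirmation of H2: all non-black things are non-ravens."""
    return (not is_black) and (not is_raven)

for label, r, b in observations:
    # H1 and H2 are logically equivalent, so whatever confirms one is taken
    # to confirm the other: this is the source of the 'paradox'.
    print(label, "confirms the pair:", confirms_h1(r, b) or confirms_h2(r, b))
```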
  59. 59. What if we distinguish between types and predicates?
  60. 60. What if we distinguish between types and predicates?
  61. 61. What if we distinguish between types and predicates?
  62. 62. What if we distinguish between types and predicates?
  63. 63. What if we distinguish between types and predicates? Now both equivalent hypotheses are equally confirmed and disconfirmed by the same observations, and the ‘Paradox of the Ravens’ is no more!
  64. 64. Lexical Disambiguation ‘party’ is still ambiguous because one can promote a political party as well as promote an event
  65. 65. Lexical Disambiguation ‘party’ is still ambiguous because one can promote a political party as well as promote an event ‘party’ here is not ambiguous because the object of a cancellation can only be an Event
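A sketch of the 'party' example with verb type signatures (the signatures are my illustrative assumptions): 'promote' accepts either a PoliticalParty or an Event, so 'party' stays ambiguous there, while 'cancel' accepts only an Event, so type unification resolves 'party' to the event reading.

```python
# Verb signatures: the types the verb's object may have.
SIGNATURE = {"promote": {"PoliticalParty", "Event"},
             "cancel":  {"Event"}}

# Possible types (senses) of the ambiguous noun.
SENSES = {"party": {"PoliticalParty", "Event"}}

def resolve(verb: str, noun: str) -> set[str]:
    """Type unification of the verb's object slot with the noun's senses."""
    return SIGNATURE[verb] & SENSES[noun]

print(resolve("promote", "party"))   # two surviving types -> still ambiguous
print(resolve("cancel", "party"))    # one surviving type  -> disambiguated
```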
  66. 66. Co-Predication
  67. 67. Co-Predication Type unifications will result in interpreting the above as: John bought a physical Book and he studied its Content
  68. 68. Co-Predication Type unifications will result in interpreting the above as: John bought a physical Book and he studied its Content
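A sketch of the co-predication example in a dot-object style (the aspect names are mine, for illustration): 'book' carries both a PhysicalObject and a Content aspect; 'buy' selects the physical aspect and 'study' the informational one, giving "John bought a physical Book and studied its Content".

```python
# 'book' as a dot object: two aspects that different verbs select.
ASPECTS = {"book": {"PhysicalObject", "Content"}}

SELECTS = {"buy": "PhysicalObject",     # one buys a physical thing
           "study": "Content"}          # one studies informational content

def copredicate(noun: str, *verbs: str) -> dict[str, str]:
    """For each verb, the aspect of the noun it is predicated of."""
    return {v: SELECTS[v] for v in verbs if SELECTS[v] in ASPECTS[noun]}

# "John bought a book and studied it"
print(copredicate("book", "buy", "study"))
# {'buy': 'PhysicalObject', 'study': 'Content'}
```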
  69. 69. SUMMARY 1. Most of the challenges in the semantics of NL are about discovering the missing text – text that is implicitly assumed as shared background knowledge 2. By embedding ontological types in our predicates and performing various type operations we can discover all the implicitly assumed information 3. Logical semantics can be salvaged in a Logic as a Language – that is, a logic with ontological content
