Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Universal Dependencies For Inference

178 views

Published on

Talk at Nuance nuSVN seminar in Feb 2017

Published in: Education
  • Be the first to comment

  • Be the first to like this

Universal Dependencies For Inference

  1. 1. © 2017 Nuance Communications, Inc. All rights reserved. Universal Dependencies for Inference Valeria de Paiva AI & NLU Lab, Sunnyvale (joint work with A. Rademaker and F. Chalub, IBM Research, Brazil) February, 2017
  2. 2. © 2017 Nuance Communications, Inc. All rights reserved. 2 Outline •  The corpus SICK •  Universal Dependencies •  Processing SICK with FreeLing & Parsey •  Analysis •  What’s Next?
  3. 3. © 2017 Nuance Communications, Inc. All rights reserved. 3 •  Stands for Sentences Involved in Compositional Knowledge, •  Result of 5-year European project COMPOSES(2011-2016) •  SICK data set consists of about 10,000 English sentence pairs, from captions of images •  original sentences were normalized to remove unwanted linguistic phenomena; sentences were expanded to obtain up to three new suitable for evaluation; all the sentences were paired to obtain the final data set. •  Each sentence pair was annotated for (relatedness and) entailment by means of crowdsourcing techniques. •  entailment annotation led to 5595 neutral pairs, 1424 contradiction pairs, and 2821 entailment pairs •  All available from http://clic.cimec.unitn.it/composes/ sick.html The Corpus SICK
  4. 4. © 2017 Nuance Communications, Inc. All rights reserved. 4 –  Simple, common sense like sentences: People are walking or One man in a big city is holding up a sign and begging for money. –  After de-duplication of sentences we have 6076 sentences, with 512 unique verb lemmas, 269 unique adjectives, 148 unique adverbs and 1076 unique nouns. –  Created from captions of pictures, SICK is about daily activities and entities; ones you can see in pictures taken by real people and which require common sense concepts: people, cats, dogs, running, climbing, eating, etc. The SICK corpus
  5. 5. © 2017 Nuance Communications, Inc. All rights reserved. 5 –  Full details, code and data included, can be found in our GitHub repository https://github.com/own-pt/rte-sick –  We run the sentences through FreeLing and use its tokenization to match the UD dependencies, obtained using Parsey McFace. –  From the CoNLL representation of each sentence we obtain its bag-of- concepts using the Princeton WordNet (PWN) mappings provided by FreeLing. –  We use FreeLing’s Word Sense Disambiguation (WSD) -- based on Aguirre’s UKB state-of-the-art Personalized PageRank algo -- to pick a favorite sense. –  We use PWN to SUMO mappings to obtain logical concepts. Processing the sentences Parsey McFace, FreeLing
  6. 6. © 2017 Nuance Communications, Inc. All rights reserved. 6 Available from http://nlp.cs.upc.edu/freeling/ FreeLing –  Established project, led by Lluís Padró, UPC (Universitat Politècnica de Catalunya), since 2004. latest release FreeLing 4.0, May 2016 –  FreeLing is a C++ library providing language analysis functionalities: –  morphological analysis, named entity detection, PoS-tagging, parsing, Word Sense Disambiguation, Semantic Role Labelling, etc. –  for a variety of languages: English, Spanish, Portuguese, Italian, French, German, Russian, Catalan, Galician, Croatian, Slovene, etc. –  We have been using FreeLing since 2014 A multilingual, open source language analysis tool suite
  7. 7. © 2017 Nuance Communications, Inc. All rights reserved. 7 From https://research.googleblog.com/2016/05/announcing-syntaxnet-worlds- most.html SyntaxNet –  Project, led by Slav Petrov, Google Research, first release May 2016 –  SyntaxNet, is an open-source neural network framework implemented in TensorFlow that provides a foundation for Natural Language Understanding (NLU) systems. –  Parsey McParseface is an English parser trained by SyntaxNet and used to analyze English text, producing Stanford dependencies. Given a sentence as input, it tags each word with a part-of-speech (POS) tag that describes the word's syntactic function, and it determines the syntactic relationships between words in the sentence, represented in the dependency parse tree. –  SyntaxNet applies neural networks to the ambiguity problem using beam search in the model. A multilingual, open source language analysis tool suite, based on NNs
  8. 8. © 2017 Nuance Communications, Inc. All rights reserved. 8 Globally Normalized Transition-Based Neural Networks is their paper Parsey McParseface –  Given some data from the Google supported Universal Dependencies project, you can train a parsing model on your own machine, they say. –  On a standard benchmark (the 20 year old Penn Treebank), Parsey McParseface achieves 94% accuracy, beating their own previous state-of- the-art results. On Web text over 90% parse accuracy. (linguists trained for this task agree in 96-97% of the cases) –  Coupled this with Manning’s famous 97% accuracy on pos-tagging… –  But: Alice drove down the street in her car The English face of SyntaxNet
  9. 9. © 2017 Nuance Communications, Inc. All rights reserved. 9 “our work is still cut out for us: we would like to develop methods that can learn world knowledge and enable equal understanding of natural language across all languages and contexts” Slav Petrov Research Engineer, Google Research
  10. 10. © 2017 Nuance Communications, Inc. All rights reserved. 10 Processing of SICK –  Want to do inference with SICK sentences –  Want sentences into 3 buckets: –  Prop, Prop&Prop, notProp –  Not so easy to bucket them.. –  Need to find all negations, need to find all conjunctions of propositions –  Examples: A woman is rock climbing , pausing and calculating the route –  vs –  A man and a woman are walking through the woods 6076 sentences, but only 1814 unique lemmas, yay!
  11. 11. © 2017 Nuance Communications, Inc. All rights reserved. 11 Processing of SICK What the CoNLL representations look like? Similar to this 6076 sentences, but only 1814 unique lemmas, yay!
  12. 12. © 2017 Nuance Communications, Inc. All rights reserved. 12 Processing of SICK –  SICK sentences were “expanded” from a core set of sentences in visual corpora, via human constructed transformations such as adding passive or active voice, adding negations, adding adjectival modifiers, etc. –  Idea was to simplify the linguistic structure, and to create comparisons of different linguistic phenomena. –  They say: [caption corpora] contain sentences (as opposed to paragraphs) that describe the same picture or video and are thus near paraphrases, are lean on named entities and rich in generic terms. –  They call this process “normalization”, they provide us with 11 normalization rules. –  Examples next slide How did they construct their corpus?
  13. 13. © 2017 Nuance Communications, Inc. All rights reserved. 13 Processing of SICK –  SICK sentences were “expanded” from a core set of sentences in visual corpora, via human constructed transformations such as adding passive or active voice, adding negations, adding adjectival modifiers, etc. –  Idea was to simplify the linguistic structure, and to create comparisons of different linguistic phenomena. –  They say: [caption corpora] contain sentences (as opposed to paragraphs) that describe the same picture or video and are thus near paraphrases, are lean on named entities and rich in generic terms. –  They call this process “normalization”, they provide us with 11 normalization rules. –  Examples next slide How did they construct their corpus? (their LREC2014 paper)
  14. 14. © 2017 Nuance Communications, Inc. All rights reserved. 14 Processing of SICK –  SICK sentences were “expanded” from a core set of sentences in visual corpora, via human constructed transformations such as adding passive or active voice, adding negations, adding adjectival modifiers, etc. –  Idea was to simplify the linguistic structure, and to create comparisons of different linguistic phenomena. –  They say: [caption corpora] contain sentences (as opposed to paragraphs) that describe the same picture or video and are thus near paraphrases, are lean on named entities and rich in generic terms. –  They call this process “normalization”, they provide us with 11 normalization rules. –  Examples next slide How did they construct their corpus? More “normalization” rules
  15. 15. © 2017 Nuance Communications, Inc. All rights reserved. 15 Processing of SICK 1.  There are still non-sentences like “A couple standing on the curb”, rule #10. 2.  There are 41 possessive dependencies (genitives like An animal is biting somebody 's hand) and 341 poss ones (for possessive pronouns) as in A man is holding a mask in *his* raised hand, rule #1. 3. There are still a few NE’s in the corpus (e.g “Seadoo”, a kind of jet ski) and ATVs (all terrain vehicles) rule #2 4. There are a few non “present continuous sentences” like (The speaker likes parrots) rule #3 5. Rule #4 ``Replace complex verb constructions into simpler ones” is difficult to judge. But examples similar to the one given (A man is attempting to surf down a hill made of sand è A man is surfing down a hill made of sand) happen in the corpus several times. We have 4 of ``trying to catch” and 37 xcomp dependency relations all together. 6. Rule #5 (Simplify verb phrases with modals and auxiliaries) is hard to check as modals are not marked in the dependencies, as far as I can see. 7. Rule #6 (Replace phrasal verbs with a synonym if verb and preposition are not adjacent) only helps a little. There are 324 particles, most of them corresponding to particle verbs that change meaning (e.g. A toddler is standing up ) gets one concept for standing and one for up. 8. Rule #7(Remove multiword expressions), we have 1148 noun-noun dependency relations and many are mwes. 9. Rule #8 (Remove dates and numbers) worked fairly well, we only have one temporal modifier (a sunny day) but 748 number dependencies and a need to tie this up to real numbers 10. Rule #9 (Turn subordinates into coordinates) also seems to have worked ok. 11. Rule #11(Remove indirect interrogative and parenthetical phrases). Well there are at least 4 real appositives, e.g Four middle eastern children , three girls and one boy , are climbing on the grotto with a pink interior. The construction rules didn’t work too well…
  16. 16. © 2017 Nuance Communications, Inc. All rights reserved. 16 Results of SICK –  The curators of the corpus also made an effort to reduce the amount of “encyclopedic knowledge” about the world that is needed. –  They say “To ensure the quality of the data set, all the sentences were checked for grammatical or lexical mistakes and disfluencies by a native English speaker.” –  Reasonably well, we expect? –  But the negations tell a different story. All (or almost all) the expletives (539 of them) are negative, but negation is not marked. This wouldn’t be a problem, if these sentences made sense. Many do not. How well did they do the job of simplifying the language?
  17. 17. © 2017 Nuance Communications, Inc. All rights reserved. 17 Processing of SICK Meaning-preserving Transformations (from LREC2014 paper again)
  18. 18. © 2017 Nuance Communications, Inc. All rights reserved. 18 Processing of SICK Meaning-altering Transformations
  19. 19. © 2017 Nuance Communications, Inc. All rights reserved. 19 Results of SICK According to stats we have 436 neg dependency relations. They really seem to correspond to negation using the adverb “not”. Clearly we are missing many negations (most if not all the 539 expletives are negative), but the determiners ‘no’ are not marked as negative. This we can correct semi-automatically. But there are many atrocious sentences that do not make sense at all: A person is ignoring the motocross bike that is lying on its side and there is no one is racing by; (maybe the third ``is” is a typo?) or There is no man wearing clothes that are covered with paint or is sitting outside in a busy area writing something. How can we measure how many bad sentences there are? We decided to investigate a reduced sub-corpus. Negation in SICK-PARSEY
  20. 20. © 2017 Nuance Communications, Inc. All rights reserved. 20 The image cannot be displayed. Your computer may not have enough memory to open the image, or the image may have been corrupted. Restart your computer, and then open the file again. If the red x still appears, you may have to delete the image and then insert it again. 3-4-5: a hand-checked sub-corpus –  Only one conjunction: Paper and scissors both cut –  13 copulas (4 wrong of them are wrong: The man is training, The man is rock climbing, A woman is grating carrots, The woman is dicing garlic ) –  19 negations marked, but 26 neg expletives, 10 nobodies, 1 no person è should be 56 negations, needs work, we know. –  Compounds? Have 20 nns, some are real compounds: –  ping pong, golden retriever, sumo wrestlers, tiger cub, sumo ringers, baby pandas, cartoon airplane –  Some are processing mistakes, gerund for the noun, lots (14 in 390) hamster singing, lion walking, panda climbing, etc. –  Also 9 cases of particle verbs (need to update PWN?) 390 sentences (385 nsubj+ 5 nsubjpass) Problems pale in comparison to WSD
  21. 21. © 2017 Nuance Communications, Inc. All rights reserved. 21 –  Mapping has many issues –  QAD (quick and dirty, Tetrault&Lee) –  Rewriting of CoNLL is necessary for most sentences, future work… Princeton WN to SUMO mapping –  SUMO is intended as a foundation ontology for a variety of information processing systems. –  First released in December 2000, it uses a version of the language SUO-KIF (first-order logic with bells&whistles) in a LISP-like syntax. –  A mapping from WordNet synsets to SUMO which makes it special. –  Many domain ontologies to extend it –  “the largest formal public ontology in existence today.” Why SUMO? Language is important SUMO is open source, upper ontology
  22. 22. © 2017 Nuance Communications, Inc. All rights reserved. 22 Princeton WN-SUMO mapping text = People are walking 1 People people NOUN NNS_ 3 nsubj _NNS|07942152-n GroupOfPeople= 2 are be VERB VBP _ 3 aux _ VBP|02604760-v|Entity+ 3 walking walk VERB VBG_ 0 ROOT _ VBG|01904930-v|Walking= Kinds of mapping: 1.  Equality: the synset corresponds exactly to the concept (as in Walking= above) 2.  Containment: Concept+, the synset is a subconcept of the SUMO concept 3.  Negation: Concept[ (the negation of the concept) silent= not communicating 4.  A-box: shouldn’t be happening here, but it does. ok for Asian, German? Why SUMO? SUMO’s mappings: the theory
  23. 23. © 2017 Nuance Communications, Inc. All rights reserved. 23 Examples text = Men are sawing 1 Men man NOUN NNS_ 3 nsubj _ NNS|02472293-n|Hominid= 2 are be VERB VBP _ 3 aux _ VBP|02604760-v|Entity+ 3  sawing saw VERB VBG_ 0 ROOT _ VBG|01559590-v|Cutting+ “saw” is a very specific verb and has only one synset in PWN: 01559590-v saw | (cut with a saw; "saw wood for the fireplace") But SUMO has no exact match for this concept, so it maps to Cutting+, a kind of Cutting, the one done with a “saw”. Much worse is the situation with “Man” that occurs in 33 PWN synsets, the WSD alg decided for Hominid=, but Man= (equivalent mapping) would be much better. Containment
  24. 24. © 2017 Nuance Communications, Inc. All rights reserved. 24 Examples text = Some people are silent 1 Some some DET DT _ 2 det _ DT|?|? 2 people people NOUN NNS_ 4 nsubj _NNS|07942152- nGroupOfPeople= 3 are be VERB VBP _ 4 cop _ VBP|02604760-v|Entity+ 4 silent silent ADJ JJ _ 0 ROOT _ JJ|00942163-a| LinguisticCommunication[ “silent” is an adjective in 6 synsets, mapped to LinguisticCommunication[, which is to be read as non-LinguisticCommunication. This seems the wrong WSD, but the correct one would give us Communication[ (negated subsuming mapping). Negation
  25. 25. © 2017 Nuance Communications, Inc. All rights reserved. 25 Examples 1 An a DET DT _ 3 det _ DT|?|? 2 American american ADJ JJ _ 3 amod_ JJ|02927303-a|LandArea@ 3 footballer footballer NOUN NN _ 5 nsubj _ NN|10101634-n| ProfessionalAthlete+ 4 is be VERB VBZ _ 5 aux _ VBZ|02604760-v|Entity+ 5 wearing wear VERB VBG _ 0 ROOT _ VBG|00075021-v| RecreationOrExercise+ 6 the the DET DT _ 10 det _ DT|?|? 7 red red ADJ JJ _ 10 amod_ JJ|00381097-a|Red= 8 and and CONJ CC _ 7 cc _ CC|?|? 9 white white ADJ JJ _ 7 conj _ JJ|00393105-a|White= 10 strip strip NOUN NN _ 5 dobj _ NN|09449510-n|SelfConnectedObject+ The mapping of ‘American’ to related to the continent America is a reasonable hook for future work A-box
  26. 26. © 2017 Nuance Communications, Inc. All rights reserved. 26 Analysis The numbers Total SICK mappings: 38671 ? Most need to be discarded 29606 + most need to be made specific 16477 = 171 @ 67 [ Need to deal with negated, a-box concepts Need more bodily process (sleeping..), Need more baby-zoology (tiger, kitten not the same) But the killer is WSD
  27. 27. © 2017 Nuance Communications, Inc. All rights reserved. 27 Conclusion QAD with FreeLing WSD does NOT work Could repeat Petrov’s words. Could make a more careful corpus, checking its processing. Could rethink WSD. Have done so and included all the PWN senses, for the time being. It’s less wrong!! Considering how to make sure that ontologies pay for their upkeeping, without throwing away our Neural Networks gains.
  28. 28. © 2017 Nuance Communications, Inc. All rights reserved. 28 Future Work Enhancing dependencies with friends Want to use what we’ve learnt during the summer with the presentations of Schuster, Lewis, Reddy and Stanovsky.They all had rules that applied to dependencies to make them more `semantic’. We also have the AKR rules and the matcher rules to use and improve. At the moment, working on Stanovsky’s (and Fauke’s rules, they work for EN and DE) with Prateek Jain. Also working on the Entailment pairs of SICK. (note that simply concepts can give useful info2)
  29. 29. © 2017 Nuance Communications, Inc. All rights reserved. 29 CODA We are the UD-Portuguese now! There were two totally automatic treebanks in Portuguese in UDs. Zeman’s version started from Linguateca’s Bosque and did conversions. Google’s UD-BR used web data and no manual input. Started re-annotating Bosque in Oct 2016. We’re now the official UD-PT, as we automatically converted and manually checked (or are checking) Bosque. Paper submitted.
  30. 30. © 2017 Nuance Communications, Inc. All rights reserved. Thank you
  31. 31. © 2017 Nuance Communications, Inc. All rights reserved. 31

×