
Weakly Supervised Machine Reading

Presentation of work that will be published at EMNLP 2016.

Ben Eisner, Tim Rocktäschel, Isabelle Augenstein, Matko Bošnjak, Sebastian Riedel. emoji2vec: Learning Emoji Representations from their Description. SocialNLP at EMNLP 2016. https://arxiv.org/abs/1609.08359

Georgios Spithourakis, Isabelle Augenstein, Sebastian Riedel. Numerically Grounded Language Models for Semantic Error Correction. EMNLP 2016. https://arxiv.org/abs/1608.04147

Isabelle Augenstein, Tim Rocktäschel, Andreas Vlachos, Kalina Bontcheva. Stance Detection with Bidirectional Conditional Encoding. EMNLP 2016. https://arxiv.org/abs/1606.05464


Weakly Supervised Machine Reading

  1. Weakly Supervised Machine Reading. Isabelle Augenstein, University College London, October 2016
  2. What is Machine Reading? • Automatic reading (i.e. encoding of text) • Automatic understanding of text • Useful ingredients for machine reading: representation learning, structured prediction, generating training data
  3. Machine Reading [Diagram: the question "What is a good method for machine reading?" is mapped to the query method_for(MR, XXX) and answered against the supporting text "RNNs are a popular method for machine reading"; labelled components: r(s), u(q), g(x)]
  4. Machine Reading Tasks • Word Representation Learning • Output: a vector for each word • Learn relations between words, and learn to distinguish words from one another • Unsupervised objective: word embeddings • Sequence Representation Learning • Output: a vector for each sentence / paragraph • Learn how likely a sequence is given a corpus, and what the most likely next word is given a sequence of words • Unsupervised objective: unconditional language models, natural language generation • Supervised objective: sequence classification tasks
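The unsupervised sequence objective mentioned here, an unconditional language model, amounts to predicting the next word at every position. A minimal sketch in PyTorch, with illustrative layer sizes and a toy vocabulary (an assumption for exposition, not the architecture of any of the papers below):

```python
import torch
import torch.nn as nn

class WordLM(nn.Module):
    """Minimal RNN language model: embed words, run an LSTM,
    and predict the next word at every position."""
    def __init__(self, vocab_size, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids):                 # (batch, seq_len)
        states, _ = self.lstm(self.embed(token_ids))
        return self.out(states)                   # logits over next words

# Toy usage: predict tokens 1..n from tokens 0..n-1.
model = WordLM(vocab_size=1000)
tokens = torch.randint(0, 1000, (2, 10))
logits = model(tokens[:, :-1])
loss = nn.functional.cross_entropy(
    logits.reshape(-1, 1000), tokens[:, 1:].reshape(-1))
```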
  5. Machine Reading Tasks • Pairwise Sequence Representation Learning • Output: a vector for pairs of sentences / paragraphs • Learn how likely a sequence is given another sequence and a corpus • Pairs of sequences can be encoded independently or encoded conditioned on one another • Unsupervised objective: conditional language models • Supervised objective: stance detection, knowledge base slot filling, question answering
  6. Talk Outline • Learning emoji2vec Embeddings from their Description – Word representation learning, generating training data • Numerically Grounded and KB Conditioned Language Models – (Conditional) sequence representation learning • Stance Detection with Bidirectional Conditional Encoding – Conditional sequence representation learning, generating training data
  7. Machine Reading: Word Representation Learning [Diagram repeated from slide 3: question "What is a good method for machine reading?", query method_for(MR, XXX), supporting text "RNNs are a popular method for machine reading"]
  8. emoji2vec • Emoji use has increased • Emoji carry sentiment, which could be useful for e.g. sentiment analysis
  9. emoji2vec
  10. emoji2vec • Task: learn representations for emoji • Problem: many emoji are used infrequently, and typical word representation learning methods (e.g. word2vec) require a word to be seen several times • Solution: learn emoji representations from their descriptions
  11. emoji2vec • Method: the emoji embedding is the sum of the word embeddings of the words in its description
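A minimal sketch of that composition step, assuming pre-trained word vectors are available as a Python dict; the tiny lookup table and the example description are purely illustrative assumptions (the released emoji2vec model additionally trains dedicated emoji vectors against such composed description representations):

```python
import numpy as np

# Illustrative pre-trained word vectors; in practice these would come from
# e.g. word2vec trained on a large corpus.
word_vectors = {
    "face":  np.array([0.1, 0.3, -0.2]),
    "with":  np.array([0.0, 0.1,  0.0]),
    "tears": np.array([0.4, -0.1, 0.2]),
    "of":    np.array([0.0, 0.0,  0.1]),
    "joy":   np.array([0.5, 0.2,  0.3]),
}

def describe_emoji(description, vectors):
    """Sum the word vectors of the description words (the composition
    described on this slide); unknown words are simply skipped."""
    dim = len(next(iter(vectors.values())))
    total = np.zeros(dim)
    for word in description.lower().split():
        if word in vectors:
            total += vectors[word]
    return total

emoji_vec = describe_emoji("face with tears of joy", word_vectors)
print(emoji_vec)
```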
  12. emoji2vec • Results – Emoji vectors are useful in addition to GoogleNews vectors for a sentiment analysis task – The analogy task also works for emoji
  13. emoji2vec • Conclusions – Descriptions are a very useful alternative source for learning representations, especially for rare words
  14. Machine Reading: Sequence Representation Learning (Unsupervised) [Diagram repeated from slide 3: question, query method_for(MR, XXX), supporting text]
  15. Numerically Grounded + KB Conditioned Language Models • Semantic Error Correction with Language Models
  16. Numerically Grounded + KB Conditioned Language Models • Problem: clinical data contains many numbers, many of which are unseen at test time • Solution: concatenate the RNN input embeddings with numerical representations • Problem: in addition to the report, the clinical data contains an incomplete and inconsistent KB entry for each patient; how can it be used? • Solution: lexicalise the KB and condition on it
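A minimal sketch of the numerical grounding idea, concatenating each token embedding with a scalar numeric feature before the RNN; the choice of a single scalar feature and all sizes are illustrative assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class NumericallyGroundedLM(nn.Module):
    """Language model whose inputs are [word embedding ; numeric feature],
    so number tokens unseen in training still carry a usable signal."""
    def __init__(self, vocab_size, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim + 1, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids, numeric_values):
        # numeric_values: (batch, seq_len) floats, e.g. the value of a
        # number token (0.0 for non-numeric tokens).
        x = torch.cat([self.embed(token_ids),
                       numeric_values.unsqueeze(-1)], dim=-1)
        states, _ = self.lstm(x)
        return self.out(states)

model = NumericallyGroundedLM(vocab_size=500)
tokens = torch.randint(0, 500, (2, 8))
values = torch.zeros(2, 8)
values[0, 3] = 37.5   # e.g. a temperature mentioned in a clinical report
logits = model(tokens, values)
```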
  17. Numerically Grounded + KB Conditioned Language Models
  18. Numerically Grounded + KB Conditioned Language Models • Semantic Error Correction Results:

      | Model    | MAP   | P     | R     | F1    |
      |----------|-------|-------|-------|-------|
      | Random   | 27.75 |  5.73 | 10.29 |  7.36 |
      | Base LM  | 64.37 | 39.54 | 64.66 | 49.07 |
      | Cond     | 62.76 | 37.46 | 62.20 | 46.76 |
      | Num      | 68.21 | 44.25 | 71.19 | 54.58 |
      | Cond+Num | 69.14 | 45.36 | 71.43 | 55.48 |
  19. Numerically Grounded + KB Conditioned Language Models • Conclusions – Accounting for out-of-vocabulary tokens at test time increases performance – The duplicate information obtained by lexicalising the KB can help further
  20. Machine Reading: Pairwise Sequence Representation Learning (Supervised) [Diagram repeated from slide 3: question, query method_for(MR, XXX), supporting text, and the encodings u(q), r(s), g(x)]
  21. Stance Detection with Conditional Encoding • Example tweet: "@realDonaldTrump is the only honest voice of the @GOP" • Task: classify the attitude of a text towards a given target as "positive", "negative", or "neutral" • The example tweet is positive towards Donald Trump, but (implicitly) negative towards Hillary Clinton
  22. Stance Detection with Conditional Encoding • Challenges – Learn a model that interprets the tweet's stance towards a target that might not be mentioned in the tweet itself – Learn a model without labelled training data for the target with respect to which we are predicting the stance
  23. Stance Detection with Conditional Encoding • Challenges – Learn a model that interprets the tweet's stance towards a target that might not be mentioned in the tweet itself • Solution: a bidirectional conditional model – Learn a model without labelled training data for the target with respect to which we are predicting the stance • Solution 1: use training data labelled for other targets (domain adaptation setting) • Solution 2: automatically label training data for the target, using a small set of manually defined hashtags (weakly labelled setting)
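A minimal sketch of conditional encoding in one direction: the target is read first and its final state initialises the LSTM that reads the tweet. The bidirectional model in the paper does this both left-to-right and right-to-left and combines the results; all sizes here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ConditionalEncoder(nn.Module):
    """Encode the target, then encode the tweet conditioned on it by
    initialising the tweet LSTM with the target LSTM's final state."""
    def __init__(self, vocab_size, emb_dim=64, hidden_dim=128, num_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.target_lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.tweet_lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.classify = nn.Linear(hidden_dim, num_classes)

    def forward(self, target_ids, tweet_ids):
        _, target_state = self.target_lstm(self.embed(target_ids))
        # Condition the tweet encoder on the target's final (h, c) state.
        _, (h, _) = self.tweet_lstm(self.embed(tweet_ids), target_state)
        return self.classify(h[-1])   # logits for FAVOR / AGAINST / NONE

model = ConditionalEncoder(vocab_size=2000)
target = torch.randint(0, 2000, (1, 3))    # e.g. the tokenised target phrase
tweet = torch.randint(0, 2000, (1, 20))    # the tokenised tweet
stance_logits = model(target, tweet)
```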
  24. Stance Detection with Conditional Encoding
  25. Stance Detection with Conditional Encoding • Domain Adaptation Setting – Train on Legalization of Abortion, Atheism, Feminist Movement, Climate Change is a Real Concern and Hillary Clinton; evaluate on Donald Trump tweets

      | Model  | Stance  | P      | R      | F1     |
      |--------|---------|--------|--------|--------|
      | Concat | FAVOR   | 0.3145 | 0.5270 | 0.3939 |
      | Concat | AGAINST | 0.4452 | 0.4348 | 0.4399 |
      | Concat | Macro   |        |        | 0.4169 |
      | BiCond | FAVOR   | 0.3033 | 0.5470 | 0.3902 |
      | BiCond | AGAINST | 0.6788 | 0.5216 | 0.5899 |
      | BiCond | Macro   |        |        | 0.4901 |
  26. Stance Detection with Conditional Encoding • Weakly Supervised Setting – Weakly label Donald Trump tweets using hashtags; evaluate on Donald Trump tweets

      | Model  | Stance  | P      | R      | F1     |
      |--------|---------|--------|--------|--------|
      | Concat | FAVOR   | 0.5506 | 0.5878 | 0.5686 |
      | Concat | AGAINST | 0.5794 | 0.4883 | 0.5299 |
      | Concat | Macro   |        |        | 0.5493 |
      | BiCond | FAVOR   | 0.6268 | 0.6014 | 0.6138 |
      | BiCond | AGAINST | 0.6057 | 0.4983 | 0.5468 |
      | BiCond | Macro   |        |        | 0.5803 |
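A minimal sketch of the hashtag-based weak labelling; the specific hashtag lists below are hypothetical stand-ins, not the manually defined set used in the paper:

```python
# Hypothetical stance-bearing hashtag lists (illustrative only).
FAVOR_TAGS = {"#makeamericagreatagain", "#trump2016"}
AGAINST_TAGS = {"#dumptrump", "#nevertrump"}

def weak_label(tweet):
    """Label a tweet FAVOR/AGAINST if it contains an unambiguous
    stance-bearing hashtag, otherwise leave it unlabelled (None)."""
    tags = {tok for tok in tweet.lower().split() if tok.startswith("#")}
    if tags & FAVOR_TAGS and not tags & AGAINST_TAGS:
        return "FAVOR"
    if tags & AGAINST_TAGS and not tags & FAVOR_TAGS:
        return "AGAINST"
    return None

print(weak_label("He will win again #Trump2016"))   # FAVOR
```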
  27. Stance Detection with Conditional Encoding • Other findings – Pre-training word embeddings on a large in-domain corpus with an unsupervised objective and continuing to optimise them towards the supervised objective works well • Better than pre-training without further optimisation, random initialisation, or Google News embeddings – LSTM encoding of tweets and targets works better than a sum-of-word-embeddings baseline, despite the small training set (7k – 14k instances) – Almost all instances in which the target is mentioned in the tweet have a non-neutral stance
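The first finding corresponds to loading a pre-trained embedding matrix and leaving it trainable for the supervised objective. A small PyTorch sketch, with a random matrix standing in for the pre-trained in-domain vectors:

```python
import torch
import torch.nn as nn

# Stand-in for an embedding matrix pre-trained on in-domain tweets.
pretrained = torch.randn(2000, 64)

# Pre-train, then keep optimising towards the supervised objective
# (the best-performing option on this slide):
emb_tuned = nn.Embedding.from_pretrained(pretrained, freeze=False)

# Pre-train without further optimisation (weaker):
emb_frozen = nn.Embedding.from_pretrained(pretrained, freeze=True)

# Random initialisation (weaker still):
emb_random = nn.Embedding(2000, 64)
```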
  28. Stance Detection with Conditional Encoding • Conclusions – Modelling the sentence pair relationship is important – Automatic labelling of in-domain tweets is even more important – Learning sequence representations is also a good approach for small datasets
  29. Thank you! isabelleaugenstein.github.io • i.augenstein@ucl.ac.uk • @IAugenstein • github.com/isabelleaugenstein
  30. References • Ben Eisner, Tim Rocktäschel, Isabelle Augenstein, Matko Bošnjak, Sebastian Riedel. emoji2vec: Learning Emoji Representations from their Description. SocialNLP at EMNLP 2016. https://arxiv.org/abs/1609.08359 • Georgios Spithourakis, Isabelle Augenstein, Sebastian Riedel. Numerically Grounded Language Models for Semantic Error Correction. EMNLP 2016. https://arxiv.org/abs/1608.04147 • Isabelle Augenstein, Tim Rocktäschel, Andreas Vlachos, Kalina Bontcheva. Stance Detection with Bidirectional Conditional Encoding. EMNLP 2016. https://arxiv.org/abs/1606.05464
  31. Collaborators • Kalina Bontcheva, University of Sheffield • Andreas Vlachos, University of Sheffield • George Spithourakis, UCL • Matko Bošnjak, UCL • Sebastian Riedel, UCL • Tim Rocktäschel, UCL • Ben Eisner, Princeton
