
Sentence representations and question answering (YerevaNN)


Michael Manukyan and Hrayr Harutyunyan gave a talk on sentence representations in the context of deep learning at the Armenian NLP Meetup. They also reviewed a recent paper on machine comprehension (Wang & Jiang, 2016).


  1. Sentence Representations (Michael Manukyan and Hrayr Harutyunyan, YerevaNN)
  2. Where is it used? • Machine Translation • Text classification • Text clustering • Machine Comprehension
  3. Unsupervised solutions • Bag of Words (multiset of words) • Based on word embeddings (word2vec, GloVe): • sum of word vectors • weighted sum • positional encoding • max-pooling • Recurrent Neural Networks (RNN)
  4. Sum of the word vectors
  5. Weighted sum of the word vectors (* color shades indicate weights)
  6. Positional encoding (Sukhbaatar et al., 2015) (* color shades indicate weights)
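The positional-encoding slide follows Sukhbaatar et al. (2015), where each word vector is reweighted element-wise by a position-dependent matrix before summing. A minimal numpy sketch of that weighting (`positional_encoding` and `encode_sentence` are illustrative names, not from the slides):

```python
import numpy as np

def positional_encoding(J, d):
    """Position-dependent weights l (J words x d dims) from Sukhbaatar et
    al. (2015): l_kj = (1 - j/J) - (k/d)(1 - 2j/J), with 1-based j, k."""
    l = np.zeros((J, d))
    for j in range(1, J + 1):
        for k in range(1, d + 1):
            l[j - 1, k - 1] = (1 - j / J) - (k / d) * (1 - 2 * j / J)
    return l

def encode_sentence(word_vectors):
    """Sentence representation: sum over words of l_j * x_j (element-wise)."""
    J, d = word_vectors.shape
    return (positional_encoding(J, d) * word_vectors).sum(axis=0)
```

Unlike a plain sum, this makes the representation sensitive to word order while staying parameter-free.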
  7. Max-pooling (* color shades indicate values)
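The three pooling baselines on these slides (sum, weighted sum, max-pooling) reduce a words-by-dimensions matrix to a single vector. A toy sketch with made-up 4-dimensional embeddings and arbitrary weights:

```python
import numpy as np

# Toy word-vector matrix: one row per word (hypothetical 4-dim embeddings).
X = np.array([[ 0.1,  0.5, -0.2, 0.0],
              [ 0.3, -0.1,  0.4, 0.2],
              [-0.2,  0.2,  0.1, 0.6]])

sum_repr = X.sum(axis=0)                      # sum of word vectors
w = np.array([0.2, 0.5, 0.3])                 # e.g. idf-style word weights
weighted_repr = (w[:, None] * X).sum(axis=0)  # weighted sum
max_repr = X.max(axis=0)                      # max-pooling per dimension
```

All three ignore word order; only the positional-encoding variant above reintroduces it.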
  8. RNN: encoder-decoder (J. Li et al., 2015) (* figure marks the sentence representation)
  9. Supervised (task-dependent) solutions
  10. Recursive NN (* figure marks the learnable parameters and the sentence representation)
  11. Convolutional NN (* figure marks the learnable parameters and the sentence representation)
  12. Machine Comprehension: Question Answering
  13. Datasets • A set of triples (𝑃, 𝑄, 𝐴) • 𝑃 - the passage (the text the computer should read and comprehend) • 𝑄 - a question asked about that passage • 𝐴 - the answer to the question
  14. Facebook bAbI • Passage: 1. Mary moved to the bathroom. 2. John went to the hallway. • Question: Where is Mary? • Answer: bathroom • 20 tasks, 10k examples per task
  15. CNN/Daily Mail: 10M examples
  16. Children’s Book Test: 700k examples
  17. MCTest: James the Turtle was always getting in trouble. Sometimes he'd reach into the freezer and empty out all the food. Other times he'd sled on the deck and get a splinter. His aunt Jane tried as hard as she could to keep him out of trouble, but he was sneaky and got into lots of trouble behind her back. One day, James thought he would go into town and see what kind of trouble he could get into. He went to the grocery store and pulled all the pudding off the shelves and ate two jars. Then he walked to the fast food restaurant and ordered 15 bags of fries. He didn't pay, and instead headed home. His aunt was waiting for him in his room. She told James that she loved him, but he would have to start acting like a well-behaved turtle. After about a month, and after getting into lots of trouble, James finally made up his mind to be a better turtle. What is the name of the trouble making turtle? A) Fries B) Pudding C) James D) Jane (600 examples)
  18. SQuAD • The Stanford Question Answering Dataset • questions on a set of Wikipedia articles • the answer to every question is a segment of text, or span, from the corresponding reading passage • 100,000+ question-answer pairs on 500+ articles
  19. SQuAD
  20. SQuAD scoring • Exact match: the percentage of predictions that exactly match one of the ground-truth answers • F1 score: F1 over the word tokens shared between the predicted answer and the ground truth
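The token-level F1 described on this slide can be sketched in a few lines. This is a simplified stand-in for the official SQuAD metric (it skips the official script's lowercasing, punctuation and article stripping, and the max over multiple ground truths):

```python
from collections import Counter

def token_f1(prediction, ground_truth):
    """F1 over word tokens shared by a predicted answer and one gold answer."""
    pred, gold = prediction.split(), ground_truth.split()
    common = Counter(pred) & Counter(gold)   # multiset intersection
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)
```

For example, predicting "the black cat" against gold "the cat" overlaps on two of three predicted tokens and all gold tokens, giving F1 = 0.8, while exact match would score it 0.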
  21. SQuAD Leaderboard
  22. Best published model for SQuAD so far: Match-LSTM with Answer Pointer (boundary model), Singapore Management University (Wang & Jiang, 2016)
  23. Model • LSTM preprocessing • Match-LSTM • Answer module
  24. LSTM preprocessing • incorporates contextual information into the representation of each token in the passage and in the question
  25. Match-LSTM • originally used to predict whether a premise entails a hypothesis • in this model the question is treated as the premise and the passage as the hypothesis • for each passage word we get one vector that combines its word vector with a question representation that depends on that word • a bidirectional LSTM is applied to those vectors to encode the sequence
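The per-word step above (build a question representation that depends on each passage word, then concatenate) can be sketched with a parameter-free dot-product attention. This is an illustrative simplification, not the learned attention of Wang & Jiang (2016), and `match_inputs` is a hypothetical name:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def match_inputs(H_p, H_q):
    """For each passage position i, attend over the question and build
    z_i = [h_p_i ; sum_j alpha_ij * h_q_j].
    H_p: (n, d) passage encodings, H_q: (m, d) question encodings."""
    outputs = []
    for h_p in H_p:
        alpha = softmax(H_q @ h_p)   # attention weights over question words
        q_summary = alpha @ H_q      # question repr. specific to this word
        outputs.append(np.concatenate([h_p, q_summary]))
    return np.stack(outputs)         # (n, 2d): inputs to the Match-LSTM
```

In the full model these (n, 2d) vectors are what the bidirectional Match-LSTM consumes; the attention itself is parameterized and conditioned on the LSTM state rather than a raw dot product.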
  26. Answer module • the vocabulary is huge, but the answer is always present in the passage (it is a substring of it) • models: • Sequence: predict the answer word by word and decide where to stop • Boundary: predict two indices marking the beginning and the end of the answer
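Decoding with the boundary model amounts to picking the span (start, end) with end >= start that maximizes the product of the two pointer distributions. A minimal sketch, assuming the per-position scores are already given (in the full model they come from the Pointer Network over the Match-LSTM states):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def boundary_decode(start_scores, end_scores):
    """Return (start, end) with end >= start maximizing p_start * p_end."""
    p_s, p_e = softmax(start_scores), softmax(end_scores)
    best, span = -1.0, (0, 0)
    for s in range(len(p_s)):
        for e in range(s, len(p_e)):       # enforce end >= start
            if p_s[s] * p_e[e] > best:
                best, span = p_s[s] * p_e[e], (s, e)
    return span
```

The boundary variant only has to get two indices right, which is why it outperforms the word-by-word sequence model on SQuAD, where answers are always contiguous spans.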
  27. Results
  28. Results
  29. Results
  30. Attention
  31. Thanks