Michael Manukyan and Hrayr Harutyunyan gave a talk on sentence representations in the context of deep learning at the Armenian NLP Meetup. They also reviewed a recent paper on machine comprehension (Wang & Jiang, 2016).
2. Where is it used?
• Machine Translation
• Text classification
• Text clustering
• Machine Comprehension
3. Unsupervised solutions
• Bag of Words (multiset of words)
• Based on Word Embeddings (word2vec, GloVe):
• sum of word vectors
• weighted sum
• positional encoding
• max-pooling
• Recurrent Neural Networks (RNN)
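The pooling-based methods above can be sketched in a few lines. This is a toy illustration with made-up three-dimensional embeddings; in practice the vectors would come from word2vec or GloVe:

```python
import numpy as np

# Toy word embeddings (in practice: pretrained word2vec or GloVe vectors).
# The dimensions and values are made up for illustration.
emb = {
    "the": np.array([0.1, 0.3, -0.2]),
    "cat": np.array([0.5, -0.1, 0.4]),
    "sat": np.array([-0.3, 0.2, 0.1]),
}

def sum_of_vectors(words):
    """Bag-of-vectors: sum the embeddings of all words."""
    return np.sum([emb[w] for w in words], axis=0)

def weighted_sum(words, weights):
    """Weighted sum, e.g. with TF-IDF-style weights per word."""
    return np.sum([w * emb[t] for t, w in zip(words, weights)], axis=0)

def max_pooling(words):
    """Element-wise maximum over the word vectors."""
    return np.max([emb[w] for w in words], axis=0)

sentence = ["the", "cat", "sat"]
print(sum_of_vectors(sentence))  # one fixed-size vector per sentence
print(max_pooling(sentence))
```

Each method maps a variable-length sentence to a single fixed-size vector, which is what the downstream tasks above require.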
13. Datasets
• Set of triples (𝑃, 𝑄, 𝐴)
• 𝑃 - passage (the text the computer should
read and comprehend)
• 𝑄 - question asked about that passage
• 𝐴 - answer to the question
14. Facebook bAbI
• Passage:
1. Mary moved to the bathroom.
2. John went to the hallway.
• Question: Where is Mary?
• Answer: bathroom
20 tasks, 10k examples per task
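The bAbI example above can be written down directly as one (𝑃, 𝑄, 𝐴) triple; a minimal sketch using a named tuple (the field names are my own choice, not part of the dataset):

```python
from collections import namedtuple

# One machine-comprehension example as the (P, Q, A) triple defined above.
Example = namedtuple("Example", ["passage", "question", "answer"])

babi_example = Example(
    passage=["Mary moved to the bathroom.", "John went to the hallway."],
    question="Where is Mary?",
    answer="bathroom",
)

print(babi_example.answer)  # -> bathroom
```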
17. MCTest
James the Turtle was always getting in trouble. Sometimes he'd reach into the
freezer and empty out all the food. Other times he'd sled on the deck and get a splinter.
His aunt Jane tried as hard as she could to keep him out of trouble, but he was sneaky
and got into lots of trouble behind her back.
One day, James thought he would go into town and see what kind of trouble he could
get into. He went to the grocery store and pulled all the pudding off the shelves and ate
two jars. Then he walked to the fast food restaurant and ordered 15 bags of fries. He
didn't pay, and instead headed home.
His aunt was waiting for him in his room. She told James that she loved him, but he
would have to start acting like a well-behaved turtle. After about a month, and after
getting into lots of trouble, James finally made up his mind to be a better turtle.
What is the name of the trouble making turtle?
A) Fries
B) Pudding
C) James
D) Jane
600 examples
18. SQuAD
• The Stanford Question Answering Dataset
• questions on a set of Wikipedia articles
• the answer to every question is a segment of
text, or span, from the corresponding reading
passage
• 100,000+ question-answer pairs on 500+ articles
20. SQuAD scoring
• Exact match
• the percentage of predictions that exactly
match one of the ground truth answers
• F1 score
• F1 score over common word tokens between
the predicted answer and the ground truth
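Both metrics can be sketched in a few lines. Note the official SQuAD evaluation script also normalizes answers (lowercasing, stripping punctuation and articles); that normalization is omitted here for brevity:

```python
from collections import Counter

def exact_match(prediction, ground_truths):
    """1 if the prediction exactly matches any ground-truth answer."""
    return int(any(prediction == gt for gt in ground_truths))

def f1_score(prediction, ground_truth):
    """Token-level F1 between predicted and ground-truth answers."""
    pred_tokens = prediction.split()
    gt_tokens = ground_truth.split()
    # Multiset intersection counts each shared token at most
    # as often as it appears in both answers.
    common = Counter(pred_tokens) & Counter(gt_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gt_tokens)
    return 2 * precision * recall / (precision + recall)

print(f1_score("the bathroom", "bathroom"))  # 2*(0.5*1)/1.5 = 0.666...
```

F1 gives partial credit: predicting "the bathroom" against the gold answer "bathroom" scores 0 on exact match but about 0.67 on F1.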
25. Match-LSTM
• It was originally used to predict whether a premise
entails a hypothesis
• In this model the question is treated as the premise
and the passage as the hypothesis
• For each passage word we get one vector that combines
its word vector with a question representation that
depends on that word
• A bidirectional LSTM is applied to those vectors to
encode the sequence
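The per-word question representation is computed with attention. Below is a simplified sketch for a single passage word: the full Match-LSTM also conditions the attention on the previous match-LSTM hidden state, and all parameter names here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                            # toy hidden size
H_q = rng.normal(size=(3, d))    # encodings of 3 question words
h_p = rng.normal(size=d)         # encoding of one passage word

# Hypothetical attention parameters (learned in the real model).
W_q = rng.normal(size=(d, d))
W_p = rng.normal(size=(d, d))
w = rng.normal(size=d)

# Attention weights of this passage word over all question words.
scores = np.tanh(H_q @ W_q + h_p @ W_p) @ w
alpha = np.exp(scores) / np.exp(scores).sum()  # softmax

# Question summary tailored to this passage word.
q_summary = alpha @ H_q

# Vector fed into the (bidirectional) LSTM for this word:
# the passage-word encoding concatenated with its question summary.
z = np.concatenate([h_p, q_summary])
print(z.shape)  # (8,)
```

Repeating this for every passage word yields the sequence of vectors that the bidirectional LSTM then encodes.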
26. Answer Module
• The vocabulary is huge, but the answer is always
present in the passage (it is a substring of it)
• Models
• Sequence: predict each word one by one and
guess where to stop
• Boundary: predict two indices indicating the
beginning and the end of the answer
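At prediction time, the boundary model reduces to picking the (start, end) pair with the best combined score. A minimal sketch with made-up scores over a hypothetical six-word passage:

```python
import numpy as np

# Hypothetical start/end scores over a 6-word passage, as a
# boundary-model answer module might produce them.
start_logits = np.array([0.1, 2.0, 0.3, 0.2, 0.1, 0.0])
end_logits = np.array([0.0, 0.1, 0.2, 1.5, 0.3, 0.1])

def best_span(start_logits, end_logits, max_len=10):
    """Pick (start, end) maximizing start+end score, with start <= end."""
    best, span = -np.inf, (0, 0)
    for s in range(len(start_logits)):
        for e in range(s, min(s + max_len, len(end_logits))):
            score = start_logits[s] + end_logits[e]
            if score > best:
                best, span = score, (s, e)
    return span

passage = "James went to the grocery store".split()
s, e = best_span(start_logits, end_logits)
print(passage[s:e + 1])  # the predicted answer span
```

The constraint start <= end (and an optional maximum span length) is what makes the boundary model simpler and more robust than predicting the answer word by word.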