
Sentence representations and question answering (YerevaNN)

Michael Manukyan and Hrayr Harutyunyan gave a talk on sentence representations in the context of deep learning at the Armenian NLP Meetup. They also reviewed a recent paper on machine comprehension (Wang & Jiang, 2016).


Sentence representations and question answering (YerevaNN)

  1. Sentence Representations (Michael Manukyan and Hrayr Harutyunyan, YerevaNN)
  2. Where is it used? • Machine Translation • Text classification • Text clustering • Machine Comprehension
  3. Unsupervised solutions • Bag of Words (multiset of words) • Based on word embeddings (word2vec, GloVe): sum of word vectors, weighted sum, positional encoding, max-pooling • Recurrent Neural Networks (RNN)
  4. Sum of the word vectors
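A minimal sketch of this idea in Python, assuming pre-trained word vectors are available as a simple lookup table (the toy `embeddings` dict and its values are illustrative, not from the slides):

```python
import numpy as np

# toy embedding table: word -> 4-dimensional vector (illustrative values only)
embeddings = {
    "the":    np.array([0.1, 0.0, 0.2, 0.1]),
    "cat":    np.array([0.7, 0.3, 0.1, 0.9]),
    "sleeps": np.array([0.2, 0.8, 0.5, 0.4]),
}

def sentence_vector_sum(tokens, embeddings):
    """Sentence representation as the plain sum of its word vectors."""
    return np.sum([embeddings[t] for t in tokens], axis=0)

print(sentence_vector_sum(["the", "cat", "sleeps"], embeddings))  # -> [1.0 1.1 0.8 1.4]
```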
  5. Weighted sum of the word vectors (color shades on the slide indicate weights)
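The weighted variant, sketched under the assumption that per-token weights (for example TF-IDF-style scores) are supplied by the caller; the slide does not say how the weights are obtained:

```python
import numpy as np

rng = np.random.default_rng(0)
# toy embeddings: word -> 4-dimensional random vector (for illustration only)
embeddings = {w: rng.normal(size=4) for w in ["the", "cat", "sleeps"]}

def sentence_vector_weighted(tokens, weights, embeddings):
    """Weighted sum: each word vector is scaled by its per-token weight before summing."""
    return np.sum([w * embeddings[t] for t, w in zip(tokens, weights)], axis=0)

# e.g. down-weight the stop word "the"
print(sentence_vector_weighted(["the", "cat", "sleeps"], [0.1, 1.0, 1.0], embeddings))
```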
  6. Positional encoding (Sukhbaatar et al., 2015) (color shades on the slide indicate weights)
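The slide cites the position-encoding scheme of Sukhbaatar et al. (2015), where word j of J contributes to the sum with an element-wise weight l[j, k] = (1 - j/J) - (k/d)(1 - 2j/J) for embedding dimension k of d. A sketch:

```python
import numpy as np

def position_encoding(J, d):
    """Weight matrix of shape (J, d) with l[j, k] = (1 - j/J) - (k/d) * (1 - 2j/J),
    positions j and dimensions k counted from 1 as in Sukhbaatar et al. (2015)."""
    j = np.arange(1, J + 1)[:, None]   # word positions 1..J
    k = np.arange(1, d + 1)[None, :]   # embedding dimensions 1..d
    return (1 - j / J) - (k / d) * (1 - 2 * j / J)

def sentence_vector_pe(word_vectors):
    """Sum of word vectors, each weighted element-wise by its position encoding."""
    X = np.asarray(word_vectors)                      # shape (J, d)
    return (position_encoding(*X.shape) * X).sum(axis=0)
```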
  7. Max-pooling (color shades on the slide indicate values)
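Max-pooling keeps, for every dimension, the largest value seen across the words of the sentence; a minimal sketch:

```python
import numpy as np

def sentence_vector_maxpool(word_vectors):
    """Dimension-wise maximum over the word vectors of the sentence."""
    return np.max(np.asarray(word_vectors), axis=0)

# three 4-dimensional word vectors -> one 4-dimensional sentence vector
print(sentence_vector_maxpool([[0.1, 0.9, 0.0, 0.3],
                               [0.5, 0.2, 0.4, 0.1],
                               [0.3, 0.1, 0.8, 0.2]]))   # -> [0.5 0.9 0.8 0.3]
```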
  8. RNN: encoder-decoder (J. Li et al., 2015); the slide's diagram marks where the sentence representation is taken
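A rough sketch of the encoder side in PyTorch (the slides do not name a framework, and the layer sizes below are illustrative): the final hidden state of the recurrent layer is used as the sentence representation.

```python
import torch
import torch.nn as nn

vocab_size, emb_dim, hid_dim = 1000, 50, 100          # toy sizes

embed = nn.Embedding(vocab_size, emb_dim)
encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)

token_ids = torch.tensor([[4, 17, 230, 8]])           # one sentence of 4 token ids
_, (h_n, _) = encoder(embed(token_ids))               # h_n: (num_layers, batch, hid_dim)
sentence_representation = h_n[-1]                     # final hidden state as sentence vector
print(sentence_representation.shape)                  # torch.Size([1, 100])
```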
  9. Supervised (task-dependent) solutions
  10. Recursive NN; the slide's diagram marks the learnable parameters and the sentence representation
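A small sketch of the recursive composition over a binary parse tree, with a single composition matrix W as the learnable parameter; the example tree, the word list, and the tanh non-linearity are assumptions for illustration, and training is omitted:

```python
import numpy as np

d = 4
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(d, 2 * d))            # learnable composition matrix
embeddings = {w: rng.normal(size=d) for w in ["the", "cat", "sleeps"]}

def compose(node):
    """Leaf: return its word vector. Internal node: tanh(W @ [left; right])."""
    if isinstance(node, str):
        return embeddings[node]
    left, right = node
    return np.tanh(W @ np.concatenate([compose(left), compose(right)]))

# binary parse tree ((the cat) sleeps); the root vector is the sentence representation
print(compose((("the", "cat"), "sleeps")))
```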
  11. Convolutional NN; the slide's diagram marks the learnable parameters and the sentence representation
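A sketch of the convolutional encoder in PyTorch, assuming a Kim-style setup (1-D convolution over word windows followed by max-over-time pooling); the filter count and window size are illustrative:

```python
import torch
import torch.nn as nn

emb_dim, n_filters, window = 50, 100, 3
conv = nn.Conv1d(emb_dim, n_filters, kernel_size=window)    # learnable filters

word_vectors = torch.randn(1, 7, emb_dim)                   # batch of 1 sentence, 7 words
features = torch.relu(conv(word_vectors.transpose(1, 2)))   # (1, n_filters, 5)
sentence_representation = features.max(dim=2).values        # max-over-time pooling
print(sentence_representation.shape)                        # torch.Size([1, 100])
```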
  12. Machine Comprehension: Question Answering
  13. Datasets • Set of triples (𝑃, 𝑄, 𝐴) • 𝑃 - passage (the text the computer should read and comprehend) • 𝑄 - question asked about that passage • 𝐴 - answer to the question
  14. Facebook bAbI • Passage: 1. Mary moved to the bathroom. 2. John went to the hallway. • Question: Where is Mary? • Answer: bathroom • 20 tasks, 10k examples per task
  15. CNN/Daily Mail (10M examples)
  16. Children’s Book Test (700k examples)
  17. MCTest (600 examples) • Passage: James the Turtle was always getting in trouble. Sometimes he'd reach into the freezer and empty out all the food. Other times he'd sled on the deck and get a splinter. His aunt Jane tried as hard as she could to keep him out of trouble, but he was sneaky and got into lots of trouble behind her back. One day, James thought he would go into town and see what kind of trouble he could get into. He went to the grocery store and pulled all the pudding off the shelves and ate two jars. Then he walked to the fast food restaurant and ordered 15 bags of fries. He didn't pay, and instead headed home. His aunt was waiting for him in his room. She told James that she loved him, but he would have to start acting like a well-behaved turtle. After about a month, and after getting into lots of trouble, James finally made up his mind to be a better turtle. • Question: What is the name of the trouble making turtle? A) Fries B) Pudding C) James D) Jane
  18. SQuAD • The Stanford Question Answering Dataset • questions posed on a set of Wikipedia articles • the answer to every question is a segment of text, or span, from the corresponding reading passage • 100,000+ question-answer pairs on 500+ articles
  19. SQuAD
  20. SQuAD scoring • Exact match: the percentage of predictions that exactly match one of the ground truth answers • F1 score: F1 over common word tokens between the predicted answer and the ground truth
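A sketch of the token-overlap F1; note that the official SQuAD evaluation script additionally lowercases and strips punctuation and articles before comparing, which is omitted here:

```python
from collections import Counter

def token_f1(prediction, ground_truth):
    """F1 over common word tokens between a predicted and a gold answer string."""
    pred_tokens = prediction.split()
    gold_tokens = ground_truth.split()
    num_same = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(token_f1("the bathroom", "bathroom"))   # 0.666...
```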
  21. SQuAD Leaderboard
  22. Best published model for SQuAD so far: Match-LSTM with Answer Pointer (Boundary), Singapore Management University (Wang & Jiang, 2016)
  23. Model • LSTM preprocessing • Match-LSTM • Answer module
  24. LSTM Preprocessing • incorporates contextual information into the representation of each token in the passage and the question
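A rough PyTorch sketch of this preprocessing step: a one-directional LSTM is run over the word vectors of the passage and of the question to produce contextual token representations. Sharing one LSTM between the two and the sizes used here are assumptions for illustration:

```python
import torch
import torch.nn as nn

emb_dim, hid_dim = 50, 100
preproc_lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)   # assumed shared LSTM

passage_emb  = torch.randn(1, 120, emb_dim)   # word vectors of the passage (toy data)
question_emb = torch.randn(1, 12, emb_dim)    # word vectors of the question (toy data)

H_p, _ = preproc_lstm(passage_emb)            # contextual passage tokens  (1, 120, hid_dim)
H_q, _ = preproc_lstm(question_emb)           # contextual question tokens (1, 12, hid_dim)
```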
  25. Match-LSTM • Originally used to predict whether a premise entails a hypothesis (textual entailment) • In this model the question is treated as the premise and the passage as the hypothesis • For each passage word we get one vector that combines its word vector with a question representation that depends on that word • A bidirectional LSTM is applied over those vectors to encode the sequence
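A NumPy sketch of the attention step for a single passage position i, to make the "one vector per word" bullet concrete: the passage token attends over the question, and its representation is concatenated with the attention-weighted question representation. The weight names loosely follow Wang & Jiang (2016), but biases and the LSTM state update are omitted, so this only shows the data flow:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

d, Q = 100, 12                                   # hidden size, question length (toy)
rng = np.random.default_rng(0)
H_q = rng.normal(size=(Q, d))                    # contextual question tokens
h_p_i = rng.normal(size=d)                       # contextual passage token i
h_r_prev = np.zeros(d)                           # previous Match-LSTM hidden state
W_q, W_p, W_r = (rng.normal(scale=0.1, size=(d, d)) for _ in range(3))
w = rng.normal(scale=0.1, size=d)

G = np.tanh(H_q @ W_q.T + (W_p @ h_p_i + W_r @ h_r_prev))   # (Q, d)
alpha = softmax(G @ w)                                       # attention over question words
z_i = np.concatenate([h_p_i, alpha @ H_q])                   # input to the (bi)LSTM at step i
```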
  26. Answer Module • The vocabulary is huge, but the answer is always present in the passage (it is a substring of it) • Two models: • Sequence: predict the answer words one by one and decide where to stop • Boundary: predict two indices indicating the beginning and the end of the answer
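A sketch of how a boundary prediction can be decoded once the model has produced probabilities over passage positions for the answer start and end: pick the span (s, e) with s ≤ e that maximizes the product of the two probabilities. The pointer network that produces the distributions is outside this sketch, and the `max_len` cap is an assumption, not something stated on the slide:

```python
import numpy as np

def best_span(p_start, p_end, max_len=15):
    """Return (start, end) maximizing p_start[s] * p_end[e] with s <= e < s + max_len."""
    best, best_score = (0, 0), -1.0
    for s in range(len(p_start)):
        for e in range(s, min(s + max_len, len(p_end))):
            score = p_start[s] * p_end[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best

# toy distributions over a 6-token passage
p_start = np.array([0.05, 0.60, 0.10, 0.10, 0.10, 0.05])
p_end   = np.array([0.05, 0.10, 0.55, 0.10, 0.10, 0.10])
print(best_span(p_start, p_end))   # (1, 2): the answer spans tokens 1..2
```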
  27. Results
  28. Results
  29. Results
  30. Attention
  31. Thanks
