This document gives an overview of machine reading comprehension and question answering: the SQuAD dataset, and the state-of-the-art methods BERT and QANet. BERT introduced a novel pre-training technique that helped advance NLP; QANet combines local convolution with global self-attention to achieve top results on SQuAD. The document also outlines a Keras implementation of QANet for question answering.
1. Machine Can Do Reading Comprehension Test for You
Nguyen Phuoc Tat Dat
AI Research Engineer
BizReach, Inc.
Pham Quang Khang
AI Solution Architect
Consulting firm
Dec 16, 2018
2. Outline
• Part 1: Introduction
• Question Answering & Reading Comprehension task
• SQuAD dataset
• Part 2: State-of-the-art methods
• BERT – Open the new era of NLP
• QANet
• Implementation of QANet
• Summary
4. Question Answering & Machine Reading Comprehension task
Given a passage, generate answers to the questions:
The Apollo program, also known as Project Apollo, was
the third United States human spaceflight program carried
out by the National Aeronautics and Space Administration
(NASA), which accomplished landing the first humans on
the Moon from 1969 to 1972. First conceived during
Dwight D. Eisenhower's administration as a three-man
spacecraft to follow the one-man Project Mercury which
put the first Americans in space, Apollo was later
dedicated to President John F. Kennedy's national goal of
"landing a man on the Moon and returning him safely to
the Earth" by the end of the 1960s, which he proposed in
a May 25, 1961, address to Congress. Project Mercury was
followed by the two-man Project Gemini (1962–66). The
first manned flight of Apollo was in 1968.
What project put the first Americans into space?
Project Mercury
7. Question Answering & Machine Reading Comprehension task
The Apollo program, also known as Project Apollo, was
the third United States human spaceflight program carried
out by the National Aeronautics and Space Administration
(NASA), which accomplished landing the first humans on the
Moon from 1969 to 1972. First conceived during Dwight D.
Eisenhower's administration as a three-man spacecraft to
follow the one-man Project Mercury which put the first
Americans in space, Apollo was later dedicated to President
John F. Kennedy's national goal of "landing a man on the
Moon and returning him safely to the Earth" by the end of
the 1960s, which he proposed in a May 25, 1961, address to
Congress. Project Mercury was followed by the two-man
Project Gemini (1962–66). The first manned flight of Apollo
was in 1968.
What project put the first Americans into space?
Project Mercury
What program was created to carry out these projects
and missions?
National Aeronautics and Space Administration (NASA)
What year did the first manned Apollo flight occur?
1968
• Challenges:
• Understanding the natural language
• Knowledge of the domain
8. Outline
• Part 1: Introduction
• Question Answering & Reading Comprehension task
• SQuAD dataset
• Part 2: State-of-the-art methods
• BERT – Open the new era of NLP
• QANet
• Implementation of QANet
• Summary
9. SQuAD – Stanford Question Answering Dataset
• A large, high-quality dataset for the machine reading comprehension task
• https://rajpurkar.github.io/SQuAD-explorer/
• Answer is a span of text from the input passage
• 107,785 question-answer pairs on 23,215 paragraphs from 536 Wikipedia
articles
• Randomly split:
• Training set (80%)
• Dev set (10%)
• Test set (10%)
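Since every answer is a span of the passage, the dataset is easy to walk programmatically. A minimal sketch of iterating over SQuAD v1.1's JSON layout (`iter_squad_examples` is an illustrative name, not part of any official tooling):

```python
import json  # e.g. squad = json.load(open("train-v1.1.json"))

def iter_squad_examples(squad_json):
    """Yield (context, question, answer_text, answer_start) tuples
    from a parsed SQuAD v1.1 JSON object."""
    for article in squad_json["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                for ans in qa["answers"]:
                    yield context, qa["question"], ans["text"], ans["answer_start"]

# Tiny hand-made sample in the SQuAD v1.1 layout:
sample = {"version": "1.1", "data": [{"title": "Apollo_program", "paragraphs": [{
    "context": "Project Mercury put the first Americans in space.",
    "qas": [{"id": "q1",
             "question": "What project put the first Americans into space?",
             "answers": [{"text": "Project Mercury", "answer_start": 0}]}]}]}]}

for ctx, q, a, start in iter_squad_examples(sample):
    assert ctx[start:start + len(a)] == a  # the answer is a span of the context
```

Note the invariant checked in the loop: `answer_start` is a character offset, so slicing the context at that offset recovers the answer text exactly.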
10. Examples from dev set
https://rajpurkar.github.io/SQuAD-explorer/explore/1.1/dev/Teacher.html
11. Analysis - Diversity of answer types
• Separate numerical & non-numerical answers
• Parse non-numerical answers, classify entities using NER
Rajpurkar et al., 2016
12. Analysis - Reasoning required to answer questions
• Sampled 4 questions from each of 48 articles in the dev set
• Manually labeled the examples by reasoning category
Rajpurkar et al., 2016
14. Evaluation
Two different metrics are used:
1. Exact match
• The prediction matches any one of the ground truth answers exactly
2. Macro-averaged F1-score
• Measures the average overlap between the prediction & ground truth answer
• Treat prediction and ground truth answer as bags of tokens, then compute their F1
• Take the maximum F1 over all of the ground truth answers, then average over all questions
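The two metrics above can be sketched as follows. This is a simplified rendering of the logic in the official SQuAD evaluation script (answer normalization: lowercase, strip punctuation, articles, and extra whitespace); function names are mine:

```python
import re
import string
from collections import Counter

def normalize(s):
    """Lowercase, remove punctuation and articles, collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(prediction, ground_truths):
    """True if the prediction matches any ground truth after normalization."""
    return max(normalize(prediction) == normalize(gt) for gt in ground_truths)

def f1(prediction, ground_truths):
    """Bag-of-tokens F1, maximized over all ground truth answers."""
    def f1_single(pred, gt):
        pred_toks, gt_toks = normalize(pred).split(), normalize(gt).split()
        common = Counter(pred_toks) & Counter(gt_toks)  # multiset intersection
        overlap = sum(common.values())
        if overlap == 0:
            return 0.0
        precision = overlap / len(pred_toks)
        recall = overlap / len(gt_toks)
        return 2 * precision * recall / (precision + recall)
    return max(f1_single(prediction, gt) for gt in ground_truths)
```

For example, the prediction "Project Mercury program" against the ground truth "Project Mercury" scores EM = 0 but F1 = 0.8 (precision 2/3, recall 1).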
15. Outline
• Part 1: Introduction
• Question Answering & Reading Comprehension task
• SQuAD dataset
• Part 2: State-of-the-art methods
• BERT – Open the new era of NLP
• QANet
• Implementation of QANet
• Summary
16. Concept
1. A pre-trained language representation built on the Transformer architecture, trained on 2 tasks:
a. Randomly mask some words within a sequence and let the model predict those masked words ("masked language model")
b. Predict whether a pair of sequences actually appears next to the other in a larger context: "next sentence prediction"
2. Can be used for transfer learning (similar to pre-training on ImageNet in computer vision):
a. Pre-train on a large corpus as unsupervised learning to learn the language representation
b. Fine-tune the model for specific tasks: text classification, named entity recognition, SQuAD
Devlin et al., 2018
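Task (a) can be sketched as a data-preparation step. This is a deliberately simplified version: real BERT masks ~15% of token positions but then replaces only 80% of those with [MASK], keeps 10% unchanged, and swaps 10% for random tokens, which is omitted here:

```python
import random

def mask_tokens(tokens, mask_token="[MASK]", mask_prob=0.15, seed=0):
    """Pick ~15% of positions and replace them with [MASK];
    the model must recover the original token at each picked position."""
    rng = random.Random(seed)
    masked, targets = list(tokens), {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets[i] = tok        # positions the model must predict
            masked[i] = mask_token
    return masked, targets
```

The returned `targets` dict is exactly the supervision signal: the loss is computed only at the masked positions, not over the whole sequence.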
17. Architecture
1. Encoder: the encoder stack from the Transformer
a. Base model: N = 12, hidden dim = 768, heads = 12
b. Large model: N = 24, hidden dim = 1024, heads = 16
2. Embedding: the input representation is the sum of token, segment, and position embeddings
Devlin et al., 2018
19. Fine-tuning on SQuAD
• Use the output hidden states to predict the start and end of the answer span
• Apply one linear layer (output dim = 2) to the output hidden state vectors T'_i
• The outputs are predictions of the start and end positions of the answer within the input paragraph
• The objective function is the log-likelihood of the correct start and end positions
Devlin et al., 2018
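The span-prediction head above is small enough to sketch directly. A minimal NumPy rendering (not the authors' TensorFlow code; shapes and names are mine): one shared linear layer maps each token's final hidden vector to a start logit and an end logit, and the loss is the negative log-likelihood of the correct positions under per-position softmaxes.

```python
import numpy as np

def span_logits(hidden, W, b):
    """hidden: (T, H) final-layer vectors T'_i; W: (H, 2); b: (2,).
    One shared linear layer yields a start logit and an end logit per token."""
    logits = hidden @ W + b          # (T, 2)
    return logits[:, 0], logits[:, 1]

def span_loss(start_logits, end_logits, start_pos, end_pos):
    """Negative log-likelihood of the correct start and end positions."""
    def log_softmax(x):
        x = x - x.max()              # numerical stability
        return x - np.log(np.exp(x).sum())
    return -(log_softmax(start_logits)[start_pos]
             + log_softmax(end_logits)[end_pos])
```

With uniform logits over T = 4 positions, the loss is 2·log 4, i.e. the model has learned nothing beyond chance; training pushes it below that.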
20. Result on SQuAD
https://rajpurkar.github.io/SQuAD-explorer/
Devlin et al., 2018
21. Outline
• Part 1: Introduction
• Question Answering & Reading Comprehension task
• SQuAD dataset
• Part 2: State-of-the-art methods
• BERT – Open the new era of NLP
• QANet
• Implementation of QANet
• Summary
22. QANet - On top of SQuAD's leaderboard in Mar & July 2018
https://rajpurkar.github.io/SQuAD-explorer/
23. Background: BiDAF - one of the most important SoTA architectures
Seo et al., 2017
24. Key ideas
• Combine local dependencies (convolution) & global dependencies (attention)
• Data augmentation using NMT back-translation
25. Key idea 1: Combine local dependencies (convolution) & global dependencies (attention)
26. QANet architecture
• Five layers: input embedding, embedding encoder, context-query attention, model encoder, and output layer
• Encoder block: position encoding, depthwise separable convolutions, self-attention, and a feed-forward layer, each inside a residual connection with layer normalization
• Context-query attention: compute similarities between each pair of context and query words, then derive context-to-query and query-to-context attention (as in BiDAF)
• Output layer: predict, for each position in the context, the probability of being the start or the end of the answer span
• Key idea 2 - Data augmentation: paraphrase the training data by back-translation with NMT models (e.g., English → French → English)
Yu et al., 2018
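The convolutions in QANet's encoder blocks are depthwise separable: a per-channel filter followed by a 1×1 pointwise mix. A NumPy sketch of the operation (the 'same' padding keeps the sequence length, as the encoder blocks require; the function name is illustrative):

```python
import numpy as np

def depthwise_separable_conv1d(x, depthwise, pointwise):
    """x: (T, C_in); depthwise: (k, C_in) -- one length-k filter per channel;
    pointwise: (C_in, C_out) -- a 1x1 conv mixing channels.
    'Same' padding preserves the sequence length T."""
    T, C = x.shape
    k = depthwise.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    # Depthwise step: each channel is convolved with its own filter only.
    dw = np.stack([sum(depthwise[j] * xp[t + j] for j in range(k))
                   for t in range(T)])       # (T, C_in)
    # Pointwise step: mix channels position-by-position.
    return dw @ pointwise                    # (T, C_out)
```

The appeal is parameter count: a standard conv needs k·C_in·C_out weights, the separable form only k·C_in + C_in·C_out, which is why QANet can afford many conv layers per block.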
33. Outline
• Part 1: Introduction
• Question Answering & Reading Comprehension task
• SQuAD dataset
• Part 2: State-of-the-art methods
• BERT – Open the new era of NLP
• QANet
• Implementation of QANet
• Summary
35. What have I done for this paper implementation?
1. Understand the paper
2. Implement the paper
3. Improve the implementation so that training converges
36. Problem 1 with Keras: Nested model with multiple inputs

def build_encoder_block_model(input_dim, output_dim, num_convs, ...):
    inp = x = KL.Input(shape=[None, input_dim], name=name + '_inp')
    mask = KL.Input(shape=[None], name=name + '_mask', dtype=tf.bool)
    enc = PositionEncoding(name=name + '_pe')(x)
    ...
    return KM.Model([inp, mask], [enc], name=name + '_model')

encoder_model = build_encoder_block_model(emb_dim, d, 4, 1, ...)
context_enc = encoder_model([context, c_mask])   # shared encoder applied to the context
query_enc = encoder_model([query, q_mask])       # ... and to the query
...
model = KM.Model(inputs=[contextw_inp, queryw_inp, contextc_inp, queryc_inp],
                 outputs=[p1, p2, start, end])
BUT: a nested model with a single input causes no problem
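One way around the nested multi-input model is to not nest a Model at all: create the layer objects once and apply them through a plain Python function, so the weights are shared without wrapping them in an inner `KM.Model`. The pattern can be sketched without Keras, using a NumPy weight matrix as a stand-in for a layer (`build_shared_encoder` is an illustrative name, not from the deck):

```python
import numpy as np

def build_shared_encoder(dim, seed=0):
    """Create the weights once, return a function that applies them.
    Every call to the returned function reuses the same weight matrix,
    which is exactly the sharing a nested Keras model was meant to give."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((dim, dim))
    def apply(x):
        return x @ W
    return apply

encode = build_shared_encoder(4)
context_enc = encode(np.ones((2, 4)))  # shared weights, no nested model
query_enc = encode(np.ones((3, 4)))    # same W applied to the query
```

In actual Keras code the same idea holds: instantiate the `KL.*` layers once in an outer scope and call them on both the context and query tensors inside one flat functional graph.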
40. Outline
• Part 1: Introduction
• Question Answering & Reading Comprehension task
• SQuAD dataset
• Part 2: State-of-the-art methods
• BERT – Open the new era of NLP
• QANet
• Implementation of QANet
• Summary
41. Summary
42. References
1. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint, 2018. URL https://arxiv.org/abs/1810.04805.
2. Adams Wei Yu, David Dohan, Minh-Thang Luong, Rui Zhao, Kai Chen, Mohammad Norouzi, and Quoc V. Le. QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension. ICLR 2018.
3. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention Is All You Need. NIPS 2017.
4. Min Joon Seo, Aniruddha Kembhavi, Ali Farhadi, and Hannaneh Hajishirzi. Bi-Directional Attention Flow for Machine Comprehension. ICLR 2017.
5. Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. SQuAD: 100,000+ Questions for Machine Comprehension of Text. EMNLP 2016.