Neural Question Generation
CSS 801
Seminar Presentation
ARIJIT MUKHERJEE
17305T0021
● Sequence Processing With Neural Network
● Question Answering Tasks
● Memory Networks
● Match-LSTM + Pointer Networks
● R-NET S-NET
● Neural Question Generation
● Question-Generation Question-Answering Duality
● Natural Language Generation with GANs
● Way-Forward
2
Sequence Processing with Neural Networks
● An n-length sequence can be encoded into an n×d matrix and passed to a feed-forward network whose input size is n×d, but doing so we lose the temporal structure of the sequence. What do we do?
● We share the same feed-forward network across all time steps, recurrently: at each time step, instead of taking only the d-sized input, we also pass the previous time step's output (hidden state) as input (see the sketch below).
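A minimal sketch of this recurrence (NumPy, with illustrative sizes and weight names not taken from the slides): the same weights are reused at every time step, and the previous hidden state is fed back alongside the current input.

    import numpy as np

    d, h = 8, 16                          # input size and hidden size (illustrative)
    W_xh = np.random.randn(h, d) * 0.1    # input-to-hidden weights, shared across time
    W_hh = np.random.randn(h, h) * 0.1    # hidden-to-hidden weights, shared across time
    b = np.zeros(h)

    def rnn_step(x_t, h_prev):
        # One recurrent step: combine the current input with the previous hidden state.
        return np.tanh(W_xh @ x_t + W_hh @ h_prev + b)

    sequence = [np.random.randn(d) for _ in range(5)]   # an n-length sequence
    h_t = np.zeros(h)
    for x_t in sequence:
        h_t = rnn_step(x_t, h_t)          # h_t now summarizes the sequence seen so far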
3
[Diagram: unrolled RNN — inputs Input0 … Inputn feed hidden states h0 … hn through the shared network.]
Recurrent Neural Network Training Problem
4
[Diagram: RNN unrolled through time, with a loss term Loss(y, o) at every time step.]
Vanishing or exploding gradients!
Image source : François Deloche (CC BY-SA)
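Why this happens can be seen from backpropagation through time: the gradient of the loss at step t with respect to an earlier hidden state contains a product of Jacobians, and that product shrinks or blows up geometrically with the gap t − k (a standard derivation, not taken from these slides).

    \frac{\partial \mathcal{L}_t}{\partial h_k}
      = \frac{\partial \mathcal{L}_t}{\partial h_t}
        \prod_{i=k+1}^{t} \frac{\partial h_i}{\partial h_{i-1}},
    \qquad
    \Big\lVert \prod_{i=k+1}^{t} \frac{\partial h_i}{\partial h_{i-1}} \Big\rVert
    \to 0 \ \text{(vanishing)} \ \ \text{or} \ \ \to \infty \ \text{(exploding) as } t-k \text{ grows.}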
Long Short-Term Memory
5
Image source : Understanding LSTMs
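For reference, the standard LSTM cell equations (the formulation used in the "Understanding LSTMs" post the figure is taken from); the gated, additive cell-state update is what eases the gradient problem from the previous slide:

    f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)              (forget gate)
    i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)              (input gate)
    \tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)       (candidate cell state)
    c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t     (cell state update)
    o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)              (output gate)
    h_t = o_t \odot \tanh(c_t)                          (hidden state)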
[Diagram: encoder–decoder (seq2seq). The encoder reads "My name is Arijit" into a context vector; the decoder, starting from <SOS> and fed its previous outputs "What is your …", generates "What is your name".]
6
Encoder–Decoder with Attention
7
[Diagram: same example as before, but at each decoding step the encoder hidden states are weighted and summed (attention), conditioned on the decoder hidden state from step t-1, instead of relying on a single context vector.]
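A minimal sketch of the attention step in the diagram, assuming simple dot-product scoring (the slides do not specify the scoring function): the decoder state from step t-1 scores every encoder hidden state, and the softmax-weighted sum of encoder states becomes the context for the next decoding step.

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def attention_context(encoder_states, decoder_state_prev):
        # encoder_states: (n, h) matrix of encoder hidden states
        # decoder_state_prev: (h,) decoder hidden state from step t-1
        scores = encoder_states @ decoder_state_prev   # one score per source position
        weights = softmax(scores)                      # attention distribution
        return weights @ encoder_states, weights       # context vector and weights

    enc = np.random.randn(4, 16)      # e.g. states for "My name is Arijit"
    dec_prev = np.random.randn(16)
    context, attn = attention_context(enc, dec_prev)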
● Sequence Processing With Neural Network
● Question Answering Tasks
● Memory Networks
● Match-LSTM + Pointer Networks
● R-NET S-NET
● Neural Question Generation
● Question-Generation Question-Answering Duality
● Natural Language Generation with GANs
● Way-Forward
8
Question Answering Task
The (20) QA bAbI tasks
● Available in Hindi and English
● The stories are generated by a simulator
● It comes in two variants, with 1k and 10k samples per task
● There are 20 tasks at different difficulty levels
9
Stanford Question Answering Dataset
● 536 Wikipedia Articles.
● 23,215 total paragraphs.
● 100,000+ question-answer pairs.
● Evaluation metrics (sketched below)
○ Exact Match Score
○ F1 Score
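A simplified sketch of the two SQuAD metrics; the official evaluation script additionally lowercases, strips punctuation and articles, and takes the maximum over multiple reference answers, which is omitted here.

    from collections import Counter

    def exact_match(prediction, reference):
        # 1 if the predicted answer string equals the reference, else 0.
        return int(prediction.strip() == reference.strip())

    def f1_score(prediction, reference):
        # Token-level overlap between predicted and reference answers.
        pred_tokens, ref_tokens = prediction.split(), reference.split()
        common = Counter(pred_tokens) & Counter(ref_tokens)
        overlap = sum(common.values())
        if overlap == 0:
            return 0.0
        precision = overlap / len(pred_tokens)
        recall = overlap / len(ref_tokens)
        return 2 * precision * recall / (precision + recall)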
● Sequence Processing With Neural Network
● Question Answering Tasks
● Memory Networks
● Match-LSTM + Pointer Networks
● R-NET S-NET
● Neural Question Generation
● Question-Generation Question-Answering Duality
● Natural Language Generation with GANs
● Way-Forward
10
Memory Network
11
[Diagram: Memory Network components — I (input feature map, producing I(X) from input X), G (generalization: update memory M given I(X)), O (output feature map, computing O(M, X)), and R (response: generate the final answer). The story populates the memory; the question is the input X.]
Memory Neural Network (MemNN)
12
[Diagram: the same I/G/O/R pipeline instantiated as MemNN — I embeds the input, G copies the embedded input xi into the next memory slot, O selects supporting memories via O(M, X), and R generates the response.]
End-to-End Memory Network (MemN2N): Single Layer
13
Image source : End to End Memory Network
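A minimal NumPy sketch of the single-layer read step shown in the figure (shapes and names are illustrative): the question is embedded, matched against the embedded memories with a softmax, and the weighted sum of separately embedded memories is combined with the question embedding to predict the answer.

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def memn2n_single_layer(story_bows, question_bow, A, C, B, W):
        # story_bows: (num_sentences, vocab) bag-of-words per memory sentence
        # question_bow: (vocab,) bag-of-words of the question
        # A, C, B: (vocab, d) embedding matrices; W: (answer_vocab, d) output matrix
        m = story_bows @ A             # input memory representations
        c = story_bows @ C             # output memory representations
        u = question_bow @ B           # question embedding
        p = softmax(m @ u)             # attention over memories
        o = p @ c                      # weighted sum of output memories
        return softmax(W @ (o + u))    # distribution over answer vocabulary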
End-to-End Memory Network (MemN2N): Multi Layer
14
Image source : End to End Memory Network
15
Image source : End to End Memory Network
● Sequence Processing With Neural Network
● Question Answering Tasks
● Memory Networks
● Match-LSTM + Pointer Networks
● R-NET S-NET
● Neural Question Generation
● Question-Generation Question-Answering Duality
● Natural Language Generation with GANs
● Way-Forward
16
17
Natural Language
Inference
P: Premise, H: Hypothesis
Y: Label ∈ {Entailment, Contradiction, Neutral}
18
Match LSTM
Match LSTM Performance
19
Image source : Wang et al.
Pointer
Network
20
[Diagram: pointer network — the encoder reads inputs X1 … X4; at each decoding step, the decoder hidden state from step t-1 attends over the encoder hidden states, and the attention distribution itself selects an input position C1 … C4 as the output.]
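A minimal sketch of the pointing step, again assuming dot-product scoring (the original paper uses an additive score): unlike ordinary attention, the attention distribution itself is the output, so the decoder emits a position in the input rather than a token from a fixed vocabulary.

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def pointer_step(encoder_states, decoder_state_prev):
        # encoder_states: (n, h); decoder_state_prev: (h,)
        scores = encoder_states @ decoder_state_prev
        p = softmax(scores)            # distribution over input positions
        return int(np.argmax(p)), p    # C_t is an index into the input sequence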
Match LSTM + Pointer Network
21
[Diagram: an LSTM preprocessing layer encodes the question Q1 … Qn into H^q and the passage P1 … Pm; the Match-LSTM layer attends over H^q with weights a1 … am at each passage position, producing matched hidden states h^r_1 … h^r_m.]
Match-LSTM + Pointer Network: Answer Pointer Layer
[Diagram: the answer pointer layer consumes h^r_1 … h^r_m in two variants — the sequence model points to every answer token in turn, while the boundary model predicts only the start and end positions a_s and a_e.]
23
Image source : Wang & Jiang.
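A minimal sketch of turning the boundary model's two pointer distributions into an answer span, assuming only the constraint that the end must not precede the start (the paper's search and length limits are omitted):

    import numpy as np

    def best_span(p_start, p_end):
        # p_start, p_end: (m,) probabilities over passage positions for a_s and a_e
        best, best_score = (0, 0), -1.0
        for s in range(len(p_start)):
            for e in range(s, len(p_end)):       # enforce a_s <= a_e
                score = p_start[s] * p_end[e]
                if score > best_score:
                    best, best_score = (s, e), score
        return best

    # illustrative usage with random distributions over an 8-token passage
    p_s = np.random.dirichlet(np.ones(8))
    p_e = np.random.dirichlet(np.ones(8))
    a_s, a_e = best_span(p_s, p_e)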
● Sequence Processing With Neural Network
● Question Answering Tasks
● Memory Networks
● Match-LSTM + Pointer Networks
● R-NET S-NET
● Neural Question Generation
● Question-Generation Question-Answering Duality
● Natural Language Generation with GANs
● Way-Forward
24
Changes From Wang & Jiang
● Instead of LSTMs, use bidirectional GRUs; GRUs have fewer trainable parameters than LSTMs.
● In the final pointer layer the initial state h^a_0 plays a vital role, so it needs a proper initialization; here attention pooling over the question is used to initialize it.
● The Match-LSTM layer matches the question against the paragraph; we can add another layer that matches the paragraph against itself.
● Scrap the sequence model (keep only the boundary model).
● Add gates to the intermediate hidden states passed between layers (see the sketch below).
25
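A minimal sketch of such a gate, following the gated-attention idea in R-Net (matrix names and sizes are illustrative): the concatenation of the current word representation and its attention context is scaled element-wise by a learned sigmoid gate before entering the recurrent layer.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def gated_input(u_t, c_t, W_g):
        # u_t: word representation, c_t: its attention context, W_g: (2h, 2h) gate weights
        v = np.concatenate([u_t, c_t])
        g = sigmoid(W_g @ v)           # element-wise gate in (0, 1)
        return g * v                   # gated input fed to the recurrent cell

    h = 16
    u, c = np.random.randn(h), np.random.randn(h)
    W_g = np.random.randn(2 * h, 2 * h) * 0.1
    x_gated = gated_input(u, c, W_g)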
26
Image source : R-Net
Experiments R-Net
27
Image source : R-Net
S-Net
28
● R-Net only predicts the start and end index of a continuous answer span, while S-Net tries to synthesize the answer.
● S-Net has two main components:
○ Evidence Extraction
○ Answer Synthesis
● Evidence extraction uses R-Net without passage matching and adds passage ranking as another output.
● The answer synthesis model generates the answer given the paragraph and the predicted start and end tokens.
● The two components are trained separately.
Evidence
Extraction
29
Image source : S-Net
Answer Synthesis
30
Image source : S-Net
● Sequence Processing With Neural Network
● Question Answering Tasks
● Memory Networks
● Match-LSTM + Pointer Networks
● R-NET S-NET
● Neural Question Generation
● Question-Generation Question-Answering Duality
● Natural Language Generation with GANs
● Way-Forward
31
32
Learning to Ask: Neural Question Generation
● This paper proposes a question generation system powered by the seq2seq architecture and an attention mechanism.
● It encodes the passage and the answer to generate a question.
● Evaluation uses machine translation metrics such as BLEU, METEOR, and ROUGE-L (see the BLEU sketch below).
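A small example of one of these metrics, corpus-level BLEU via NLTK (assuming NLTK is available; METEOR and ROUGE-L need separate tooling and are not shown):

    from nltk.translate.bleu_score import corpus_bleu

    references = [[["what", "is", "your", "name"]]]   # one list of references per hypothesis
    hypotheses = [["what", "is", "the", "name"]]
    bleu2 = corpus_bleu(references, hypotheses, weights=(0.5, 0.5))   # BLEU-2
    print(round(bleu2, 3))   # 0.5 for this toy pair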
Neural Question Generation
33
[Diagram: seq2seq question generation — sentence tokens S1 … S4 are encoded into sentence hidden states, and the paragraph tokens P1 … P4 yield a paragraph-level final hidden state; the decoder, fed <SOS> and its previous outputs Q1 … Q3, combines attention over the sentence hidden states with the decoder hidden state from step t-1 to generate the question Q1 … Q4.]
Experiments
34
Image source : Learning to Ask
● Sequence Processing With Neural Network
● Question Answering Tasks
● Memory Networks
● Match-LSTM + Pointer Networks
● R-NET S-NET
● Neural Question Generation
● Question-Generation Question-Answering Duality
● Natural Language Generation with GANs
● Way-Forward
35
Question-Answering Question-Generation Duality
36
[Diagram: the question generation model and the question answering model are trained jointly; each has its own loss (QG loss and QA loss), plus a regularization term that connects the two tasks.]
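The connecting term exploits the probabilistic duality P(q) P(a|q) = P(a) P(q|a). A common form of the regularizer (stated here from memory of the dual QA-QG formulation, so treat it as a sketch) penalizes disagreement between the two factorizations:

    l_{dual} = \big[ \log P(a) + \log P(q \mid a;\ \theta_{qg})
                     - \log P(q) - \log P(a \mid q;\ \theta_{qa}) \big]^2

where P(q) and P(a) are marginal (language-model) estimates, and θ_qg, θ_qa are the QG and QA model parameters; l_dual is added to both task losses.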
Experiments
● The QA model consists of two GRUs encoding the question and the answer.
● Their representations are combined into a single vector, which is passed to a linear layer followed by a sigmoid.
● The QG model is an encoder-decoder with attention.
37
Image source : QA QG dual task
● Sequence Processing With Neural Network
● Question Answering Tasks
● Memory Networks
● Match-LSTM + Pointer Networks
● R-NET S-NET
● Neural Question Generation
● Question-Generation Question-Answering Duality
● Natural Language Generation with GANs
● Way-Forward
38
Adversarial Generation of Language with GANs
39
Image source : KDnuggets blog, From GAN to WGAN
Wasserstein GAN
40
Image source : WGAN Paper
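For reference, the WGAN objective replaces the standard GAN loss with an estimate of the Wasserstein-1 distance computed by a critic f constrained to be 1-Lipschitz (enforced by weight clipping in the original paper):

    \min_{G} \ \max_{f:\ \lVert f \rVert_L \le 1} \
      \mathbb{E}_{x \sim P_r} [f(x)] \ - \ \mathbb{E}_{z \sim p(z)} [f(G(z))]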
Adversarial Generation of Natural Language
41
Image source : Learning to Ask Paper
Experiments: Language Generation Tasks
Task: CFG
● 248 production rules
● Two datasets are generated: one with samples of length 5 and another with samples of length 11.
● Each contains 100,000 samples.
● Set 1 has a vocabulary of 36 tokens, while set 2 has 45 tokens.
42
Image source : Learning to Ask Paper
Chinese Poetry
● Each line is a training example, with lines of length 5 (Poem-5) and length 7 (Poem-7).
● Corpus-level BLEU-2 and BLEU-3 scores are used for evaluation.
Language Generation
● CMU-SE, PTB English language modeling, Google One Billion Word corpus.
Conditional Language Generation
● Generate sentences conditioned on a wh-word and on sentiment.
● Sequence Processing With Neural Network
● Question Answering Tasks
● Memory Networks
● Match-LSTM + Pointer Networks
● R-NET S-NET
● Neural Question Generation
● Question-Generation Question-Answering Duality
● Natural Language Generation with GANs
● Way-Forward
43
Way Forward
● Exploring Memory Networks for Question-Generation
● Incorporating Structured knowledge-bases in question-generation framework.
● Exploring Conditional GANs for Sequence to Sequence tasks.
● Improving the Learning to Ask baseline and adding Match-LSTMs to it.
● Conditional generation of questions with specific WH words and scores.
● Applying question generation to Hindi textbooks.
44
THANKS
45
