Improving Question Answering by
Bridging Linguistic Structures with
Statistical Learning
Tomasz Jurczyk
Advisor: Jinho D. Choi
Emory University
11/02/2017
PhD Dissertation Defense
6
“Questions vs. Queries in Informational Search Tasks”, Ryen W. White et al., WWW 2015
http://www.internetlivestats.com/google-search-statistics/
Research
Goal
Improve various aspects of question
answering by combining linguistic
structures with statistical learning and
constructing abstract text representations,
and address the challenges of applying
them to cross-genre tasks.
11
Research Contributions
19
Sentence-based Factoid Question Answering (2016)
◎ A multi-stage annotation scheme: for sentence-based factoid question answering using a crowdsourcing technique
◎ Exploration of neural architectures for FQA: convolutional neural networks for sentence-based FQA
◎ A subtree matching mechanism: for measuring contextual similarity between two sentences
◎ Combining multiple QA corpora: improving the performance of QA systems by cross-using multiple sets
Non-factoid Question Answering (2015-2016)
◎ A semantics-based graph: an abstract representation applied to arithmetic question answering
◎ Multi-field structural decomposition: for event-based question answering
Applications to Cross-genre Tasks (2016-2017)
◎ Document retrieval for cross-genre texts: structure matching for conversational and formal writings
◎ A multi-gram attention CNN: for the passage completion task on conversational dialog texts
1.
Sentence-based
Factoid Question
Answering
20
Answering questions about
concise, well-known facts
What is Sentence-based Question
Answering?
Given a question and a list of sentences,
rank or classify them by how likely each
one answers, or supports the answer to,
the question.
21
Example question and its candidates
22
Tasks in sentence-based question answering

Answer Sentence Selection: a ranking problem. Rerank the sentences by how likely they support the question. Metrics: MAP (mean average precision) and MRR (mean reciprocal rank: the multiplicative inverse of the rank of the first correct answer).

Answer Triggering: a classification/ranking problem. Decide whether the answer is present among the sentence candidates at all. Metrics: precision and recall (F1 score).
23
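The two ranking metrics above can be computed directly from per-question relevance labels; a minimal sketch (illustrative helper names, not code from the dissertation):

```python
def mean_reciprocal_rank(rankings):
    """MRR: average of 1/rank of the first relevant candidate per question."""
    total = 0.0
    for labels in rankings:  # labels: 0/1 relevance in ranked order
        for rank, relevant in enumerate(labels, start=1):
            if relevant:
                total += 1.0 / rank
                break
    return total / len(rankings)

def mean_average_precision(rankings):
    """MAP: mean over questions of the average precision at each relevant hit."""
    total = 0.0
    for labels in rankings:
        hits, precisions = 0, []
        for rank, relevant in enumerate(labels, start=1):
            if relevant:
                hits += 1
                precisions.append(hits / rank)
        total += sum(precisions) / max(hits, 1)
    return total / len(rankings)
```

With a single relevant candidate per question, as in answer sentence selection, MAP and MRR coincide.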
24
How to build
a scalable,
diverse, and
challenging
dataset?
Building a sentence-based factoid question
answering corpus
25
Diverse and deliberately
challenging datasets
are needed to train
statistical models.
However, access to real
search-engine user
queries (Google, Bing,
etc.) is almost
impossible.
SelQA - a dataset built using a multi-stage
crowdsourcing annotation scheme
26
Crowdsourced
Crowdsourcing
techniques used to
build the dataset
Scalable size
Can be used on a big
scale
Low-cost
Cost-effective due to
quality control
Quality control
Poor-quality
annotations are
rejected
Diverse
Data sources come
from multiple
domains
Challenging
Semantically difficult
due to paraphrase
step
SelQA: A New Benchmark for Selection-Based Question Answering, Jurczyk
et al., ICTAI’2016
The annotation process
27
1. Sample a data collection (e.g., articles from Wikipedia).
2. Preprocess the collection (sentence segmentation, etc.).
3. Run four annotation tasks on MTurk plus one task using Elasticsearch.
More detailed look at the process
28
An example annotation
29
Annotation summary
30
| Task | Qs | Qm | Qs+m | Ωq | Ωa | Ωf | Time | Credit |
| Task 1 | 1,824 | 154 | 1,978 | 44.99 | 23.65 | 28.88 | 71 sec. | $0.10 |
| Task 2 | 1,828 | 148 | 1,976 | 44.64 | 23.20 | 28.62 | 64 sec. | $0.10 |
| Task 3 | 3,637 | 313 | 3,950 | 38.03 | 19.99 | 24.41 | 41 sec. | $0.08 |
| Task 4 | 682 | 55 | 737 | 31.09 | 19.41 | 21.88 | 54 sec. | $0.08 |
| SelQA | 7,289 | 615 | 7,904 | 40.54 | 21.51 | 26.18 | - | - |
| WikiQA | 1,068 | 174 | 1,242 | 39.31 | 9.82 | 15.03 | - | - |
7,904 questions annotated with their contexts
9% more overlapping words compared to WikiQA,
on average
15% drop in the ratio of overlapping words due to the
paraphrasing step
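The Ω columns in the summary table report word-overlap statistics between questions and their contexts; one plausible minimal sketch of such a measure (the exact definitions in the dissertation may differ):

```python
def overlap_ratio(question, context):
    """Share of question word types that also appear in the context."""
    q_words = set(question.lower().split())
    c_words = set(context.lower().split())
    return len(q_words & c_words) / len(q_words)

# A paraphrased context shares fewer words with the question,
# which is exactly what makes the dataset harder.
literal = overlap_ratio("who led the polish army",
                        "the polish army was led by General Czuma")
paraphrased = overlap_ratio("who led the polish army",
                            "General Czuma commanded the garrison")
```

Here `literal` is 0.8 while `paraphrased` drops to 0.2, mirroring the 15% drop after the paraphrasing step.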
31
32
Context
Matching
Using
Syntactic
Structures
How to
match
contexts?
Even advanced word
matching does not work
for challenging questions
and text collections
33
An example: one sentence supports the question
34
Question: who lead the polish army in the siege of warsaw?
Sentences:
...
1) Despite German radio broadcasts claiming to have captured
Warsaw, the initial enemy attack was repelled and soon
afterwards Warsaw was placed under siege.
2) The siege lasted until September 28, when the Polish garrison,
commanded under General Walerian Czuma, officially
capitulated.
3) The following day approximately 140,000 Polish soldiers and
troops left the city and were taken as prisoners of war.
...
Subtree matching for
contextual semantic
similarity
Dependency grammar
can be used to match
the syntax of two
sentences (a question
and a candidate
sentence) and to
calculate their
semantic similarity
35
Subtree matching example
36
Question: Who lead the polish army in the Siege of Warsaw?
Sentence: The siege lasted until September 28, when the Polish
garrison, commanded under General Walerian Czuma, officially
capitulated.
SelQA: A New Benchmark for Selection-Based Question Answering, Jurczyk
et al., ICTAI’2016
How to use the subtree
matching features?
The subtree
matching features
are combined with a
convolutional neural
network to
demonstrate their
effectiveness in capturing
semantic similarity
37
Implemented architecture
38
Answer Sentence Selection on SelQA
39
| Model | MAP (dev) | MRR (dev) | MAP (eval) | MRR (eval) |
| CNN0: baseline | 84.62 | 85.65 | 83.20 | 84.20 |
| CNN2: avg + emb | 85.70 | 86.67 | 84.66 | 85.68 |
| Santos et al. (2017) | - | - | 87.58 | 88.12 |
| Shen et al. (2017) | - | - | 89.14 | 89.93 |
Answer sentence selection on WikiQA
40
| Model | MAP (dev) | MRR (dev) | MAP (eval) | MRR (eval) |
| CNN0: baseline | 69.93 | 70.66 | 65.62 | 66.46 |
| CNN2: avg + emb (2016) | 69.22 | 70.18 | 68.78 | 70.82 |
| Yang et al. (2015) | - | - | 65.20 | 66.52 |
| Santos et al. (2016) | - | - | 68.86 | 69.57 |
| Miao et al. (2016) | - | - | 68.86 | 70.69 |
| Yin et al. (2016) | - | - | 69.21 | 71.08 |
| Wang et al. (2016) | - | - | 70.58 | 72.26 |
| Wang et al. (2017) | - | - | 73.41 | 74.18 |
Answer triggering on SelQA
41
| Model | P (dev) | R (dev) | F1 (dev) | P (eval) | R (eval) | F1 (eval) |
| CNN0: baseline | 50.63 | 40.60 | 45.07 | 52.10 | 40.34 | 45.47 |
| CNN2: max + emb | 49.32 | 48.99 | 49.16 | 53.69 | 48.38 | 50.89 |
Answer triggering on WikiQA
42
| Model | P (dev) | R (dev) | F1 (dev) | P (eval) | R (eval) | F1 (eval) |
| CNN0: baseline | 41.86 | 42.86 | 42.35 | 29.70 | 37.45 | 32.73 |
| CNN3: max + emb+ | 44.44 | 44.44 | 44.44 | 29.43 | 48.56 | 36.65 |
| Yang et al. (2015) | - | - | - | 27.96 | 37.86 | 32.17 |
Yang et al. (2015) - - - 27.96 37.86 32.17
6.5% improvement over the state of the art for the
WikiQA dataset in answer sentence selection
F1 = 36.65: new state of the art for answer triggering
on WikiQA
12% improvement over the state of the art for the
SelQA dataset in answer triggering
43
How to
combine
multiple
question
answering
corpora?
Taking advantage of
multiple QA corpora
Researchers have
independently released
several QA corpora.
The performance of QA
systems could be
improved by combining
them.
45
46
WikiQA SelQA SQuAD InfoboxQA
Source:
Bing
search
queries
Crowdsourced Crowdsourced Crowdsourced
Answer Sentence
Selection:
YES YES YES YES
Answer Triggering: YES YES NO NO
Questions: 1,242 7,904 98,202 15,271
Candidates: 12,153 95,250 496,167 271,038
Candidates/question: 9.79 12.05 5.05 17.75
Analysis of Wikipedia-based Corpora for Question Answering, Jurczyk et al.,
arXiv
How these datasets compare to each other
47
The results on cross-testing the corpora
48
| Trained on | WikiQA MAP/MRR/F1 | SelQA MAP/MRR/F1 | SQuAD MAP/MRR/F1 |
| WikiQA | 65.54 / 67.41 / 13.33 | 53.47 / 54.12 / 8.68 | 73.16 / 73.72 / 11.26 |
| SelQA | 49.05 / 49.64 / 24.30 | 82.72 / 83.70 / 48.66 | 77.22 / 78.04 / 44.70 |
| SQuAD | 58.17 / 58.53 / 19.35 | 81.15 / 82.27 / 42.88 | 88.84 / 89.69 / 44.93 |
| W+S+Q | 56.40 / 56.51 / - | 83.19 / 84.25 / - | 88.78 / 89.65 / - |
| ALL | 60.19 / 60.68 / - | 82.88 / 83.97 / - | 88.92 / 89.79 / - |
Higher accuracy of answer triggering on WikiQA
when trained on SelQA
Higher accuracy on SQuAD when trained on
combined datasets
Faster convergence for SQuAD when trained on
SelQA, with almost identical performance
49
2.
Non-factoid
Question
Answering
50
An umbrella term for questions
that are not factoid
What is non-factoid question
answering?
As an umbrella term, it covers
a wide spectrum of tasks
such as recommendation,
arithmetic, visual, and
community-based question
answering.
It often requires more
complex and customized
approaches.
51
Solving
Elementary-
School Level
Arithmetic
Questions
How do we solve the following problems?
53
Question
A restaurant served 9
pizzas during lunch and 6
during dinner today. How
many pizzas were served
today?
Sara has 31 red and 15
green balloons. Sandy has
24 red balloons. How many
red balloons do they have
in total?
… most likely, we construct an equation and
solve it
54
| Question | Equation | Answer |
| A restaurant served 9 pizzas during lunch and 6 during dinner today. How many pizzas were served today? | x = 9 + 6 | x = 15 |
| Sara has 31 red and 15 green balloons. Sandy has 24 red balloons. How many red balloons do they have in total? | x = 31 + 24 | x = 55 |
Application to arithmetic questions
55
Sequence classification
This task can be seen as a
sequence classification of verb
polarities
Three verb classes
Each verb can be either
positive (+), negative (-) or
neutral (0)
Linear equation formed
Once all polarities are
classified, the equation is
formed
Semantics-based graph
Used to extract syntactic
and semantic features for verb
classification
Natural language processing tasks
56
Semantics-based Graph Approach to Complex Question Answering,
Jurczyk et al., NAACL-SRW’2015
Natural language processing tasks are used to
build a graph
57
The flow in the system
58
… as an example
59
The results on the AllenAI dataset
60
| Model | Accuracy |
| This work (2015) | 71.75% |
| Roy et al. (2014) | 64.00% |
| Hosseini et al. (2015) | 77.70% |
| Roy et al. (2016) | 78.00% |
~6% lower than the previous approach
Successfully applied the semantics-based graph to
non-factoid question answering
But does not require extra annotation of verb
polarities
61
62
Structure
Decomposition
for Story-based
Question
Answering
How does event-based question answering look?
63
| Sentence ID | Text | Support |
| 1 | Fred picked up the football there. | |
| 2 | Fred gave the football to Jeff. | |
| 3 | What did Fred give to Jeff? | 2 |
| 4 | Bill went back to the bathroom. | |
| 5 | Jeff grabbed the milk there. | |
| 6 | Who gave the football to Jeff? | 2 |
Hybrid system for event-based question
answering
64
NLP/IR solution
A good mix of natural
language processing and
information retrieval
Three groups of fields
Lexical, syntactic, and
semantic representations of
text are extracted
Lucene-based engine
A Lucene-based search engine
is used to index the extracted
fields
Event-based QA eval.
The approach is evaluated on
a non-factoid question
answering task
Multi-Field Structural Decomposition for Question Answering, Jurczyk et al.
arXiv
… as the flow of execution
65
Example decomposition for incoming
document
66
Results on the bAbI dataset
67
| Type | Lexical (λ=1) MAP/MRR | Lexical (λ learned) MAP/MRR | +Syntax (λ=1) MAP/MRR | +Syntax (λ learned) MAP/MRR | +Semantics (λ=1) MAP/MRR | +Semantics (λ learned) MAP/MRR |
| task 1 | 39.62 / 61.73 | 39.62 / 61.73 | 29.90 / 48.05 | 40.50 / 61.47 | 72.60 / 85.07 | 100.0 / 100.0 |
| task 5 | 37.10 / 54.00 | 38.20 / 54.70 | 48.00 / 62.15 | 48.40 / 62.25 | 72.60 / 82.65 | 94.20 / 96.33 |
| task i | … | … | … | … | … | … |
| Avg. | 44.45 / 61.25 | 44.63 / 61.37 | 45.16 / 60.34 | 48.41 / 63.76 | 59.60 / 73.70 | 85.16 / 90.47 |
3.
Applications for
Cross-genre
Tasks
68
Document retrieval and passage
completion cross-genre tasks
69
Document
Retrieval for
Conversational
and Formal
Writings
What is the cross-genre
document retrieval task?
Given a description (query) and a list of
conversational scripts (documents),
retrieve the scripts that are relevant to
(support) this description
70
Documents: ‘Friends’ scripts
71
◎ 10 seasons of Friends
◎ 1 season = ~24 episodes
◎ 1 episode = ~14 scenes
◎ 1 scene = ~20 utterances
◎ 1 utterance = speaker + utterance text
A slice from a scene
Rachel How does going to a strip club help him better?
Ross Because there are naked ladies there.
Joey
Which helps him get to Phase Three, picturing yourself
with other women.
Ross There are naked ladies there too.
Joey Yeah.
72
Descriptions: episode summaries & plots
73
◎ Summary: a one-paragraph episode summary
◎ Plot: a more detailed episode description
◎ ~5,000 sentence descriptions in total
Description examples
74
Dialogue Summary + Plot
Joey
One woman? That’s like saying there’s
only one flavor of ice cream for you.
Lemme tell you something, Ross. There’s
lots of flavors out there.
Joey compares
women to ice cream
Ross
You know you probably didn’t know this,
but back in the high school, I had, a, um,
major crush on you.
Ross reveals his high
school crush on
Rachel
Rachel I knew.
Chandler
Alright, one of you give me your
underpants.
Chandler asks Joey
for his underwear,
but Joey can’t help
him out as he’s not
wearing any
Joey Can’t help you, I’m not wearing any.
Elasticsearch - first results
75
| k | R@k (dev) | MRR (dev) | R@k (eval) | MRR (eval) |
| 1 | 46.00 | 46.00 | 47.64 | 47.64 |
| 2 | 65.80 | 53.80 | 69.26 | 55.79 |
| 5 | 72.60 | 54.71 | 74.66 | 56.53 |
| 10 | 78.80 | 55.13 | 79.73 | 56.91 |
| 20 | 83.80 | 55.31 | 84.80 | 57.08 |
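R@k in the table above counts how often the relevant script appears among the top k retrieved; a minimal sketch (illustrative names, not the evaluation code itself):

```python
def recall_at_k(ranked_ids, gold_id, k):
    """R@k for one query: 1 if the relevant document is in the top k, else 0."""
    return int(gold_id in ranked_ids[:k])

def mean_recall_at_k(results, k):
    """Average R@k over (ranked_ids, gold_id) query results."""
    return sum(recall_at_k(ids, gold, k) for ids, gold in results) / len(results)

# Two toy queries: the first gold episode is ranked only at position 2
results = [(["e07", "e01", "e05"], "e01"),
           (["e02", "e09", "e04"], "e02")]
```

Averaged over all queries and multiplied by 100, this gives the percentages reported in the table.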
Structure extraction for conversational and
formal writings
76
“Chandler: Alright, one of you give me your underpants”
Cross-genre Document Retrieval: Matching between Conversational and
Formal Writings, Jurczyk et al.,
BLGNLP 2017 (during EMNLP)
How does the structure matching work
77
Description → Extract structures → Match against the indexed scripts’ structures → Retrieve
Experimental results
78
| Model | R@1 (dev) | MRR (dev) | R@1 (eval) | MRR (eval) |
| Elastic1 | 46.00 | 54.71 | 47.64 | 56.53 |
| Struct_w | 34.80 | 45.75 | 35.47 | 47.40 |
| Struct_l | 35.60 | 46.86 | 39.53 | 50.84 |
| Struct_m | 33.80 | 45.10 | 35.98 | 47.76 |
but...
The structure matching is
capable of locating ~15% of the
descriptions that Elasticsearch
cannot
79
Two-stage classification retrieval
80
Experimental results
81
| Model | R@1 (dev) | MRR (dev) | R@1 (eval) | MRR (eval) |
| Elastic10 | 46.00 | 54.71 | 47.64 | 56.53 |
| Struct_l | 35.60 | 46.86 | 39.53 | 50.84 |
| Rerank1 | 48.20 | 56.02 | 51.86 | 59.46 |
| Rerank_λ | 51.20 | 57.74 | 52.03 | 59.84 |
47.64% initial R@1 achieved by Elasticsearch
But structure matching is based on a single
utterance; this will be improved
9.2% improvement when the structure extraction
features were used
82
83
Passage
Completion
for
Cross-genre
Texts
Passage completion is reading
comprehension
84
Cross-genre
It is a cross-genre task on the
conversational data
Reading Comprehension
It benchmarks the ability to
read and comprehend
natural language
Entity-based
It is based on the entity
prediction given a query and a
passage
Towards QA
As a reading comprehension
task, it is a crucial step toward
future question answering
An existing PC task: the CNN/Daily Mail dataset
85
Passage completion in conversational dialogs
86
Approach: a multi-gram convolutional
neural network with attention
87
Experimental results
89
| Model | Accuracy |
| Baseline1 (entity majority) | 27.30 |
| Baseline2 (word-distance) | 27.26 |
| Linguistic approach (L2R, 2016) | 51.16 |
| Bi-LSTM + attention (2017) | 69.26 (test: 62.52%) |
| Multi-gram CNN | 57.43 |
| Multi-gram CNN + attn_dot | 63.58 |
Latest approach: stacked-multi-gram
90
Experimental results
91
| Model | Accuracy |
| Baseline1 (entity majority) | 27.52 |
| Baseline2 (word-distance) | 27.67 |
| Linguistic approach (L2R) | 47.36 |
| Bi-LSTM + attention | 69.26 |
| Multi-gram CNN | 57.43 |
| Multi-gram CNN + attn_dot | 63.58 |
| Utterance-based multi-gram CNN + attn | 66.59 |
66.59% best score so far
But more robust when CNN + Bi-LSTM is used
(accuracy stays more stable as the number of
utterances increases)
~5% lower than Bi-LSTM
92
Thanks to the
committee members
and audience!
93
Thanks to Jinho for
mentoring me and
believing in me!
94
Thanks to Emory NLP Lab!
95
Thanks to friends!
96
Thanks to friends!
97
Thanks to friends!
98
Thanks to my family!
99
The process of the scheme
101
1. ~500 articles are uniformly sampled from Wikipedia from
the following topics: Arts, Country, Food, Historical Events,
Movies, Music, Science, Sports, Travel, TV.
2. Sections that have more than 2 and fewer than 26 sentences
are selected.
3. These sections are used in the annotation scheme and are
sent to annotators.
4. Four tasks are performed on Mechanical Turk; the fifth task
is performed using Elasticsearch.
The process of subtree matching
102
1. For a question-sentence pair, extract the list of
overlapping words
2. For each overlapping pair, extract their tree slices
3. Perform the matching step on three levels: parents,
siblings, and children
4. Calculate their semantic similarity scores
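The four steps can be sketched on toy head-map dependency trees (a real system would use a dependency parser; the helper names and the exact scoring below are illustrative, not the dissertation's formula):

```python
def tree_slice(tree, word):
    """Parent, siblings, and children of `word` in a head-map tree
    (a dict mapping each word to its head; the root maps to None)."""
    head = tree.get(word)
    return {
        "parent": head,
        "siblings": {w for w, h in tree.items() if h == head and w != word},
        "children": {w for w, h in tree.items() if h == word},
    }

def subtree_match_score(q_tree, s_tree):
    """Average, over overlapping words, of how many of the three
    levels (parent, siblings, children) also match across trees.
    Two roots (parent None) count as a parent match."""
    overlap = set(q_tree) & set(s_tree)
    if not overlap:
        return 0.0
    total = 0.0
    for word in overlap:
        q, s = tree_slice(q_tree, word), tree_slice(s_tree, word)
        matched = (q["parent"] == s["parent"]) \
            + bool(q["siblings"] & s["siblings"]) \
            + bool(q["children"] & s["children"])
        total += matched / 3
    return total / len(overlap)
```

For instance, two sentences that share the words "siege" and "lasted" with partially matching slices would get a score between 0 and 1 rather than a binary overlap count.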
The experimentation setup
103
◎ One of the state-of-the-art convolutional neural
networks introduced by Yu et al. (2014) is used to
evaluate QA corpora.
◎ The model consists of a single convolutional layer, a
max pooling, and then the sigmoid function.
◎ 40 words of question and 40 words of answer are used
as the input to the model.
◎ Original splits provided by the creators are used.
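The described pipeline (one convolutional layer, max pooling, then a sigmoid) can be sketched in pure Python with toy dimensions and random weights; the real model of Yu et al. (2014) learns its filters and a final dense layer rather than summing the pooled values:

```python
import math
import random

random.seed(0)

def conv_maxpool_sigmoid(tokens, filters):
    """One 1-D convolutional layer over token embeddings,
    max pooling over window positions, then a sigmoid score."""
    pooled = []
    for f in filters:                      # f: `width` rows of emb_dim weights
        width = len(f)
        acts = [
            sum(w * x
                for f_row, t_row in zip(f, tokens[i:i + width])
                for w, x in zip(f_row, t_row))
            for i in range(len(tokens) - width + 1)
        ]
        pooled.append(max(acts))           # max pooling over positions
    logit = sum(pooled)                    # stand-in for the final dense layer
    return 1.0 / (1.0 + math.exp(-logit))  # relevance score in (0, 1)

# 40 question words + 40 answer words, toy 8-dimensional embeddings
tokens = [[random.gauss(0, 1) for _ in range(8)] for _ in range(80)]
filters = [[[random.gauss(0, 0.1) for _ in range(8)] for _ in range(3)]
           for _ in range(4)]
score = conv_maxpool_sigmoid(tokens, filters)
```

The score is interpreted as the probability that the candidate sentence answers the question, which is what the MAP/MRR rankings and the triggering threshold operate on.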
Natural language processing tasks are used to
build a graph
104
| Term | Definition | Example |
| Document | a single document of text | a microblog note or a Wikipedia article |
| Entity | a set of instances referring to the same instance in a given context | “John went to Emory University. He majored in CompSci.” |
| Instance | the atomic-level object in the graph | “John went to Emory University with Jessica.” |
| Predicate-Argument | an argument that completes the meaning of another instance (the predicate) | “This car was sold to Michael two days ago.” |
| Attribute | a modifier of an instance | “Alicia has a black cat.” |
Document retrieval in the cross-genre texts
105
◎ Source text is Friends TV show (scripts), while target
text (queries) are episode descriptions.
◎ For a list of source texts (episodes) and a query
(description), retrieve the source that matches the
description.
◎ Structure extraction is presented that improves the
retrieval performance.
◎ This task is a preliminary step to perform a question
answering on conversational scripts.
Target texts - Show’s descriptions
106
◎ Episode summaries and plots have been crawled from
fan sites.
◎ Summaries are one-paragraph texts, usually of 5-6
sentences that provide a high overview of an episode.
◎ Plots are multi-paragraph texts, usually giving a more
detailed description of an episode.
◎ Each summary and plot was sentence-segmented,
tokenized, and represented as a single query.
◎ The set of over 5,000 queries is used in the
experimentation.
Improving R@1 given R@10
107
◎ Extracted relations are now used to improve R@1
when 10 (k=10) most relevant documents are given.
◎ Two-stage classification setup is presented that uses
the extracted relations.
◎ A feed-forward neural network is trained to decide
whether the top-ranked episode (k=1) should be
returned as the correct one.
◎ If not, the initial ranking from Elasticsearch is paired
with the structure matching scores, and a new
prediction is made.
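A minimal sketch of this two-stage decision, with a stand-in callable for the feed-forward network and a hypothetical interpolation weight `alpha`:

```python
def two_stage_rerank(candidates, accept_top1, alpha=0.5):
    """candidates: (episode_id, es_score, struct_score), in Elasticsearch order.

    Stage 1: `accept_top1` (standing in for the trained feed-forward
    network) decides whether the ES top hit is already correct.
    Stage 2: otherwise, rerank by interpolating ES and structure scores.
    """
    if accept_top1(candidates[0]):
        return candidates[0][0]
    rescored = [(episode, alpha * es + (1 - alpha) * struct)
                for episode, es, struct in candidates]
    return max(rescored, key=lambda pair: pair[1])[0]

candidates = [("ep_1", 0.9, 0.1), ("ep_2", 0.7, 0.9)]
kept = two_stage_rerank(candidates, accept_top1=lambda c: True)
reranked = two_stage_rerank(candidates, accept_top1=lambda c: False)
```

When the classifier rejects the top hit, the structure scores can promote a lower-ranked episode, which is how the Rerank rows improve on Elastic10 in the table.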
Passage completion for cross-genre texts
108
◎ It is very difficult to extract actual answers from the
dialogue data, where an answer might be contained
within a single or many utterances.
◎ The machine must first understand the logic of the
human dialogue; it cannot just “extract” the answer
from the text, it must infer it.
◎ As a proxy task, we first want to tackle a passage
completion task.
◎ A query that consists of one or more entities is tested
against a document text (a news article).
Tackling complexity of arithmetic questions
109
◎ It is relatively easy to develop a question answering
system for a single type of questions.
◎ Arithmetic questions seen on the previous slide
require reasoning on the abstract level.
◎ A semantics-based graph approach is presented that
builds an abstract representation of the text.
Application to arithmetic questions
110
◎ This task is represented as a sequence classification
of verb polarities.
◎ Each verb can be either positive (+), negative (-) or
neutral (0).
◎ Positive/negative verb yields an addition/subtraction
operation with its associated chain node. Neutral is
omitted from the linear equation.
◎ Prediction is made for all recognized verbs in a
sentence, then a linear equation is formed and solved.
◎ Presented graph is applied to extract abstract
features based on text.
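A minimal sketch of the final step, assuming the verbs and their associated quantities have already been classified (the classification itself is what the semantics-based graph features feed):

```python
def solve_arithmetic(classified):
    """Form and solve x = sum(polarity * quantity) from classified verbs.

    classified: (verb, polarity, quantity) triples, where polarity is
    '+', '-', or '0'; neutral verbs drop out of the linear equation.
    """
    sign = {"+": 1, "-": -1, "0": 0}
    return sum(sign[polarity] * qty for _, polarity, qty in classified)

# "A restaurant served 9 pizzas during lunch and 6 during dinner today."
pizzas = solve_arithmetic([("served", "+", 9), ("served", "+", 6)])   # x = 9 + 6

# "Sara has 31 red ... Sandy has 24 red balloons. How many red ... in total?"
balloons = solve_arithmetic([("has", "+", 31), ("has", "+", 24)])     # x = 31 + 24
```

Subtraction works the same way: a verb classified as negative contributes its quantity with a minus sign, and neutral verbs are omitted from the equation entirely.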
Hybrid system for event-based question
answering
111
◎ A solution that is a good mix of natural language
processing and information retrieval is presented.
◎ NLP structures are used to extract lexical, syntactic
and semantic representations of text.
◎ A Lucene-based search engine is used to index the
extracted fields and then to perform document
retrieval over these fields.
◎ The approach is evaluated on the publicly available
bAbI dataset, which consists of 20 tasks, each
representing a different kind of question answering
challenge.
Multi-Field Structural Decomposition for Question Answering, Jurczyk et al., arXiv
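The λ = 1 vs. λ-learned columns in the bAbI results can be read as an interpolation of per-field retrieval scores; a minimal sketch with hypothetical field names and weights:

```python
def interpolated_score(field_scores, lambdas=None):
    """score(d) = sum over fields f of lambda_f * score_f(d);
    lambda_f defaults to 1 (the λ = 1 setting in the results table)."""
    lambdas = lambdas or {}
    return sum(lambdas.get(field, 1.0) * s for field, s in field_scores.items())

# Hypothetical per-field scores for one document
doc = {"lexical": 0.4, "syntax": 0.3, "semantics": 0.8}
uniform = interpolated_score(doc)                              # λ = 1
learned = interpolated_score(doc, {"lexical": 0.2,
                                   "syntax": 0.3,
                                   "semantics": 1.5})          # λ learned
```

Learning the λ weights lets the system emphasize the field group that discriminates best for a given task, which is consistent with the large MAP/MRR gains in the λ-learned columns.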

Adaptation of Multilingual Transformer Encoder for Robust Enhanced Universal ...
Jinho Choi
 
Analysis of Hierarchical Multi-Content Text Classification Model on B-SHARP D...
Analysis of Hierarchical Multi-Content Text Classification Model on B-SHARP D...Analysis of Hierarchical Multi-Content Text Classification Model on B-SHARP D...
Analysis of Hierarchical Multi-Content Text Classification Model on B-SHARP D...
Jinho Choi
 
Competence-Level Prediction and Resume & Job Description Matching Using Conte...
Competence-Level Prediction and Resume & Job Description Matching Using Conte...Competence-Level Prediction and Resume & Job Description Matching Using Conte...
Competence-Level Prediction and Resume & Job Description Matching Using Conte...
Jinho Choi
 
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-b...
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-b...Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-b...
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-b...
Jinho Choi
 
The Myth of Higher-Order Inference in Coreference Resolution
The Myth of Higher-Order Inference in Coreference ResolutionThe Myth of Higher-Order Inference in Coreference Resolution
The Myth of Higher-Order Inference in Coreference Resolution
Jinho Choi
 
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...
Jinho Choi
 
Abstract Meaning Representation
Abstract Meaning RepresentationAbstract Meaning Representation
Abstract Meaning Representation
Jinho Choi
 
Semantic Role Labeling
Semantic Role LabelingSemantic Role Labeling
Semantic Role Labeling
Jinho Choi
 
CKY Parsing
CKY ParsingCKY Parsing
CKY Parsing
Jinho Choi
 
CS329 - WordNet Similarities
CS329 - WordNet SimilaritiesCS329 - WordNet Similarities
CS329 - WordNet Similarities
Jinho Choi
 
CS329 - Lexical Relations
CS329 - Lexical RelationsCS329 - Lexical Relations
CS329 - Lexical Relations
Jinho Choi
 
Automatic Knowledge Base Expansion for Dialogue Management
Automatic Knowledge Base Expansion for Dialogue ManagementAutomatic Knowledge Base Expansion for Dialogue Management
Automatic Knowledge Base Expansion for Dialogue Management
Jinho Choi
 
Attention is All You Need for AMR Parsing
Attention is All You Need for AMR ParsingAttention is All You Need for AMR Parsing
Attention is All You Need for AMR Parsing
Jinho Choi
 
Graph-to-Text Generation and its Applications to Dialogue
Graph-to-Text Generation and its Applications to DialogueGraph-to-Text Generation and its Applications to Dialogue
Graph-to-Text Generation and its Applications to Dialogue
Jinho Choi
 
Real-time Coreference Resolution for Dialogue Understanding
Real-time Coreference Resolution for Dialogue UnderstandingReal-time Coreference Resolution for Dialogue Understanding
Real-time Coreference Resolution for Dialogue Understanding
Jinho Choi
 
Topological Sort
Topological SortTopological Sort
Topological Sort
Jinho Choi
 
Tries - Put
Tries - PutTries - Put
Tries - Put
Jinho Choi
 
Multi-modal Embedding Learning for Early Detection of Alzheimer's Disease
Multi-modal Embedding Learning for Early Detection of Alzheimer's DiseaseMulti-modal Embedding Learning for Early Detection of Alzheimer's Disease
Multi-modal Embedding Learning for Early Detection of Alzheimer's Disease
Jinho Choi
 
Building Widely-Interpretable Semantic Networks for Dialogue Contexts
Building Widely-Interpretable Semantic Networks for Dialogue ContextsBuilding Widely-Interpretable Semantic Networks for Dialogue Contexts
Building Widely-Interpretable Semantic Networks for Dialogue Contexts
Jinho Choi
 
How to make Emora talk about Sports Intelligently
How to make Emora talk about Sports IntelligentlyHow to make Emora talk about Sports Intelligently
How to make Emora talk about Sports Intelligently
Jinho Choi
 

More from Jinho Choi (20)

Adaptation of Multilingual Transformer Encoder for Robust Enhanced Universal ...
Adaptation of Multilingual Transformer Encoder for Robust Enhanced Universal ...Adaptation of Multilingual Transformer Encoder for Robust Enhanced Universal ...
Adaptation of Multilingual Transformer Encoder for Robust Enhanced Universal ...
 
Analysis of Hierarchical Multi-Content Text Classification Model on B-SHARP D...
Analysis of Hierarchical Multi-Content Text Classification Model on B-SHARP D...Analysis of Hierarchical Multi-Content Text Classification Model on B-SHARP D...
Analysis of Hierarchical Multi-Content Text Classification Model on B-SHARP D...
 
Competence-Level Prediction and Resume & Job Description Matching Using Conte...
Competence-Level Prediction and Resume & Job Description Matching Using Conte...Competence-Level Prediction and Resume & Job Description Matching Using Conte...
Competence-Level Prediction and Resume & Job Description Matching Using Conte...
 
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-b...
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-b...Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-b...
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-b...
 
The Myth of Higher-Order Inference in Coreference Resolution
The Myth of Higher-Order Inference in Coreference ResolutionThe Myth of Higher-Order Inference in Coreference Resolution
The Myth of Higher-Order Inference in Coreference Resolution
 
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...
 
Abstract Meaning Representation
Abstract Meaning RepresentationAbstract Meaning Representation
Abstract Meaning Representation
 
Semantic Role Labeling
Semantic Role LabelingSemantic Role Labeling
Semantic Role Labeling
 
CKY Parsing
CKY ParsingCKY Parsing
CKY Parsing
 
CS329 - WordNet Similarities
CS329 - WordNet SimilaritiesCS329 - WordNet Similarities
CS329 - WordNet Similarities
 
CS329 - Lexical Relations
CS329 - Lexical RelationsCS329 - Lexical Relations
CS329 - Lexical Relations
 
Automatic Knowledge Base Expansion for Dialogue Management
Automatic Knowledge Base Expansion for Dialogue ManagementAutomatic Knowledge Base Expansion for Dialogue Management
Automatic Knowledge Base Expansion for Dialogue Management
 
Attention is All You Need for AMR Parsing
Attention is All You Need for AMR ParsingAttention is All You Need for AMR Parsing
Attention is All You Need for AMR Parsing
 
Graph-to-Text Generation and its Applications to Dialogue
Graph-to-Text Generation and its Applications to DialogueGraph-to-Text Generation and its Applications to Dialogue
Graph-to-Text Generation and its Applications to Dialogue
 
Real-time Coreference Resolution for Dialogue Understanding
Real-time Coreference Resolution for Dialogue UnderstandingReal-time Coreference Resolution for Dialogue Understanding
Real-time Coreference Resolution for Dialogue Understanding
 
Topological Sort
Topological SortTopological Sort
Topological Sort
 
Tries - Put
Tries - PutTries - Put
Tries - Put
 
Multi-modal Embedding Learning for Early Detection of Alzheimer's Disease
Multi-modal Embedding Learning for Early Detection of Alzheimer's DiseaseMulti-modal Embedding Learning for Early Detection of Alzheimer's Disease
Multi-modal Embedding Learning for Early Detection of Alzheimer's Disease
 
Building Widely-Interpretable Semantic Networks for Dialogue Contexts
Building Widely-Interpretable Semantic Networks for Dialogue ContextsBuilding Widely-Interpretable Semantic Networks for Dialogue Contexts
Building Widely-Interpretable Semantic Networks for Dialogue Contexts
 
How to make Emora talk about Sports Intelligently
How to make Emora talk about Sports IntelligentlyHow to make Emora talk about Sports Intelligently
How to make Emora talk about Sports Intelligently
 

Recently uploaded

Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
UiPathCommunity
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 

Recently uploaded (20)

Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 

Improving Question Answering by Bridging Linguistic Structures with Statistical Learning

  • 1. Improving Question Answering by Bridging Linguistic Structures with Statistical Learning Tomasz Jurczyk Advisor: Jinho D. Choi Emory University 11/02/2017 PhD Dissertation Defense
  • 2. Want big impact? Use big image. 2 Image: https://www.psychologicalscience.org/news/minds-business/asking-questions-increases-likability.html
  • 3. Want big impact? Use big image. 3 Image: https://www.psychologicalscience.org/news/minds-business/asking-questions-increases-likability.html Image: http://www.kindynews.com/blog/kids-ask-how-many-questions-per-day
  • 4. Want big impact? Use big image. 4 Image: https://www.psychologicalscience.org/news/minds-business/asking-questions-increases-likability.html Image: https://www.shutterstock.com/video/clip-11021852-stock-footage-hispanic-woman-reading-laying-on-the-floor-of-the-library-k.html
  • 5. Want big impact? Use big image. 5 Image: https://www.psychologicalscience.org/news/minds-business/asking-questions-increases-likability.html Image: https://autoshopsolutions.com/top-10-questions-ask-web-design-internet-marketing-company/
  • 6. 6 “Questions vs. Queries in Informational Search Tasks”, Ryen W. White et al., WWW 2015 http://www.internetlivestats.com/google-search-statistics/
  • 7. 7
  • 8. 8
  • 9. 9
  • 10. 10
  • 11. Research Goal Improve various aspects of question answering by combining linguistic structures with statistical learning and by constructing abstract text representations. Address the challenges of applying these methods to cross-genre tasks. 11
  • 12. Research Contributions 12 A multi-stage annotation scheme For sentence-based factoid question answering using crowdsourcing technique
  • 13. Research Contributions 13 A multi-stage annotation scheme For sentence-based factoid question answering using crowdsourcing technique Exploration of neural architectures for FQA Convolutional neural networks for sentence-based FQA
  • 14. Research Contributions 14 A multi-stage annotation scheme For sentence-based factoid question answering using crowdsourcing technique A subtree matching mechanism For measuring contextual similarity between two sentences Exploration of neural architectures for FQA Convolutional neural networks for sentence-based FQA
  • 15. Research Contributions 15 A multi-stage annotation scheme For sentence-based factoid question answering using crowdsourcing technique A subtree matching mechanism For measuring contextual similarity between two sentences Exploration of neural architectures for FQA Convolutional neural networks for sentence-based FQA Combining multiple QA corpora Improving the performance of QA systems by cross-using multiple sets
  • 16. Research Contributions 16 A multi-stage annotation scheme For sentence-based factoid question answering using crowdsourcing technique A subtree matching mechanism For measuring contextual similarity between two sentences Exploration of neural architectures for FQA Convolutional neural networks for sentence-based FQA Combining multiple QA corpora Improving the performance of QA systems by cross-using multiple sets A semantics-based graph Abstract representation applied on arithmetic question answering Sentence-based Factoid Question Answering (2016)
  • 17. Research Contributions 17 A multi-stage annotation scheme For sentence-based factoid question answering using crowdsourcing technique A subtree matching mechanism For measuring contextual similarity between two sentences Exploration of neural architectures for FQA Convolutional neural networks for sentence-based FQA Combining multiple QA corpora Improving the performance of QA systems by cross-using multiple sets A semantics-based graph Abstract representation applied on arithmetic question answering Multi-field structural decomposition For event-based question answering Sentence-based Factoid Question Answering (2016)
  • 18. Research Contributions 18 A multi-stage annotation scheme For sentence-based factoid question answering using crowdsourcing technique A subtree matching mechanism For measuring contextual similarity between two sentences Exploration of neural architectures for FQA Convolutional neural networks for sentence-based FQA Combining multiple QA corpora Improving the performance of QA systems by cross-using multiple sets A semantics-based graph Abstract representation applied on arithmetic question answering Multi-field structural decomposition For event-based question answering Document retrieval task for Cross-genre Structure Matching for conversation and formal writings Non-factoid Question Answering (2015-2016) Sentence-based Factoid Question Answering (2016)
  • 19. Research Contributions 19 A multi-stage annotation scheme For sentence-based factoid question answering using crowdsourcing technique A subtree matching mechanism For measuring contextual similarity between two sentences Exploration of neural architectures for FQA Convolutional neural networks for sentence-based FQA Combining multiple QA corpora Improving the performance of QA systems by cross-using multiple sets A semantics-based graph Abstract representation applied on arithmetic question answering Multi-field structural decomposition For event-based question answering A multi-gram attention CNN For passage completion task for conversational dialog texts Document retrieval task for Cross-genre Structure Matching for conversation and formal writings Non-factoid Question Answering Applications to Cross-genre Tasks (2016-2017) Sentence-based Factoid Question Answering (2016) (2015-2016)
  • 21. What is Sentence-based Question Answering? Given a question and a list of sentences, reorder or classify the sentences with respect to how likely each one answers, or supports the answer to, the question. 21
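The ranking formulation above can be illustrated with a toy baseline. This is a hypothetical bag-of-words overlap scorer, not the model used in this work; real systems replace the overlap count with a learned relevance model.

```python
def rank_candidates(question, sentences):
    """Rank candidate sentences by lexical overlap with the question.

    Deliberately naive: scores each candidate by the number of question
    words it shares, then sorts candidates by descending score.
    """
    q_words = set(question.lower().split())
    scored = [(len(q_words & set(s.lower().split())), s) for s in sentences]
    # Sort by descending overlap; ties keep their original order.
    return [s for _, s in sorted(scored, key=lambda t: -t[0])]

ranked = rank_candidates(
    "who commanded the polish garrison",
    ["The siege lasted two weeks.",
     "The Polish garrison was commanded by General Czuma."])
# ranked[0] is the supporting sentence
```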
  • 22. Example question and its candidates 22
  • 23. Tasks in sentence-based question answering Answer Sentence Selection Answer Triggering A ranking problem A classification/ranking problem Rerank sentences with respect to how likely they support the question Decide whether the answer is among the sentence candidates MRR - Mean Reciprocal Rank (multiplicative inverse of the rank of the first correct answer) MAP - Mean Average Precision Precision and Recall (F1 score) 23
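The evaluation metrics named on this slide can be computed directly. A minimal sketch, where each question's candidates are represented as a score-sorted list of 0/1 relevance labels:

```python
def mean_reciprocal_rank(ranked_labels):
    """MRR: mean over questions of 1/rank of the first correct candidate."""
    total = 0.0
    for labels in ranked_labels:  # labels sorted by model score, 1 = correct
        for rank, label in enumerate(labels, start=1):
            if label == 1:
                total += 1.0 / rank
                break
    return total / len(ranked_labels)

def mean_average_precision(ranked_labels):
    """MAP: mean over questions of precision averaged at each correct candidate."""
    total = 0.0
    for labels in ranked_labels:
        hits, precisions = 0, []
        for rank, label in enumerate(labels, start=1):
            if label == 1:
                hits += 1
                precisions.append(hits / rank)
        if precisions:
            total += sum(precisions) / len(precisions)
    return total / len(ranked_labels)

# Two questions: first correct answers at ranks 2 and 1 -> (1/2 + 1) / 2
mrr = mean_reciprocal_rank([[0, 1, 0], [1, 0, 0]])  # 0.75
```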
  • 24. 24 How to build scalable, diverse, and challenging datasets Image: https://www.meraevents.com/event/How-To-Write-and-Publish-a-Book-
  • 25. Building a sentence-based factoid question answering corpus 25 Diverse and artificially challenging datasets are needed to train statistical models. However, access to real search engine user queries (Google, Bing, etc.) is almost impossible.
  • 26. SelQA - a dataset built using a multi-stage crowdsourcing annotation scheme 26 Crowdsourced Crowdsourcing techniques used to build the dataset Scalable size Can be used at a large scale Low-cost Cost-effective due to quality control Quality control Poor-quality annotations are rejected Diverse Data sources come from multiple domains Challenging Semantically difficult due to the paraphrasing step SelQA: A New Benchmark for Selection-Based Question Answering, Jurczyk et al., ICTAI’2016
  • 27. The annotation process 27 Sample data collection (e.g., articles from Wikipedia) → Preprocess the collection (sentence segmentation, etc.) → 4 annotation tasks on MTurk + 1 task using Elasticsearch
  • 28. More detailed look at the process 28
  • 30. Annotation summary 30
          Qs     Qm   Qs+m   Ωq     Ωa     Ωf     Time     Credit
  Task 1  1,824  154  1,978  44.99  23.65  28.88  71 sec.  $0.10
  Task 2  1,828  148  1,976  44.64  23.20  28.62  64 sec.  $0.10
  Task 3  3,637  313  3,950  38.03  19.99  24.41  41 sec.  $0.08
  Task 4  682    55   737    31.09  19.41  21.88  54 sec.  $0.08
  SelQA   7,289  615  7,904  40.54  21.51  26.18  -        -
  WikiQA  1,068  174  1,242  39.31  9.82   15.03  -        -
  • 31. 7,904 questions annotated with their contexts. 9% more overlapping words compared to WikiQA, on average. 15% drop in the ratio of overlapping words due to the paraphrasing step. 31
  • 33. How to match contexts? Even advanced word matching does not work for complex questions and text collections. 33
  • 34. An example: one sentence supports the question example 34 Question: who lead the polish army in the siege of warsaw? Sentences: ... 1) Despite German radio broadcasts claiming to have captured Warsaw, the initial enemy attack was repelled and soon afterwards Warsaw was placed under siege. 2) The siege lasted until September 28, when the Polish garrison, commanded under General Walerian Czuma, officially capitulated. 3) The following day approximately 140,000 Polish soldiers and troops left the city and were taken as prisoners of war. ...
  • 35. Subtree matching for contextual semantic similarity The dependency grammar can be used to match the syntax of two sentences (a question and a sentence candidate) and to calculate their semantic similarity 35
  • 36. Subtree matching example 36 Question: Who lead the polish army in the Siege of Warsaw? Sentence: The siege lasted until September 28, when the Polish garrison, commanded under General Walerian Czuma, officially capitulated. SelQA: A New Benchmark for Selection-Based Question Answering, Jurczyk et al., ICTAI’2016
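The matching intuition can be sketched as follows. The edge triples and the scoring function below are hypothetical simplifications, shown only to make the idea concrete; the dissertation's actual subtree matching operates on full dependency subtrees.

```python
# Each dependency tree is represented as an edge list of
# (head_lemma, relation, dependent_lemma) triples.

def subtree_match_score(question_edges, sentence_edges):
    """Fraction of question edges whose (relation, dependent lemma)
    pair also occurs in the candidate sentence's dependency tree."""
    sent_pairs = {(rel, dep) for _, rel, dep in sentence_edges}
    if not question_edges:
        return 0.0
    hits = sum((rel, dep) in sent_pairs for _, rel, dep in question_edges)
    return hits / len(question_edges)

# Hand-written (illustrative) edges for the slide's example pair.
question = [("lead", "nsubj", "who"), ("lead", "dobj", "army"),
            ("army", "amod", "polish"), ("lead", "nmod", "siege")]
sentence = [("capitulate", "nsubj", "garrison"), ("garrison", "amod", "polish"),
            ("command", "nmod", "general"), ("last", "nsubj", "siege")]
score = subtree_match_score(question, sentence)  # only ("amod", "polish") matches
```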
  • 37. How to use the subtree matching features? The subtree matching features are combined with a convolutional neural network to demonstrate their effectiveness in capturing semantic similarity. 37
  • 39. Answer Sentence Selection on SelQA 39
  Model                 Dev MAP  Dev MRR  Eval MAP  Eval MRR
  CNN0: baseline        84.62    85.65    83.20     84.20
  CNN2: avg + emb       85.70    86.67    84.66     85.68
  Santos et al. (2017)  -        -        87.58     88.12
  Shen et al. (2017)    -        -        89.14     89.93
  • 40. Answer sentence selection on WikiQA 40
  Model                   Dev MAP  Dev MRR  Eval MAP  Eval MRR
  CNN0: baseline          69.93    70.66    65.62     66.46
  CNN2: avg + emb (2016)  69.22    70.18    68.78     70.82
  Yang et al. (2015)      -        -        65.20     66.52
  Santos et al. (2016)    -        -        68.86     69.57
  Miao et al. (2016)      -        -        68.86     70.69
  Yin et al. (2016)       -        -        69.21     71.08
  Wang et al. (2016)      -        -        70.58     72.26
  Wang et al. (2017)      -        -        73.41     74.18
  • 41. Answer triggering on SelQA 41
  Model            Dev P  Dev R  Dev F1  Eval P  Eval R  Eval F1
  CNN0: baseline   50.63  40.60  45.07   52.10   40.34   45.47
  CNN2: max + emb  49.32  48.99  49.16   53.69   48.38   50.89
  • 42. Answer triggering on WikiQA 42
  Model               Dev P  Dev R  Dev F1  Eval P  Eval R  Eval F1
  CNN0: baseline      41.86  42.86  42.35   29.70   37.45   32.73
  CNN3: max + emb+    44.44  44.44  44.44   29.43   48.56   36.65
  Yang et al. (2015)  -      -      -       27.96   37.86   32.17
  • 43. 6.5% improvement over the state of the art for the WikiQA dataset in answer sentence selection. F1@36.65: new state of the art for answer triggering on WikiQA. 12% improvement over the state of the art for the SelQA dataset in answer triggering. 43
  • 45. Taking advantage of multiple QA corpora Researchers have independently released several QA corpora. The performance of QA systems could be improved by combining them. 45
  • 46. 46
                             WikiQA               SelQA         SQuAD         InfoboxQA
  Source                     Bing search queries  Crowdsourced  Crowdsourced  Crowdsourced
  Answer Sentence Selection  YES                  YES           YES           YES
  Answer Triggering          YES                  YES           NO            NO
  Questions                  1,242                7,904         98,202        15,271
  Candidates                 12,153               95,250        496,167       271,038
  Candidates/question        9.79                 12.05         5.05          17.75
  Analysis of Wikipedia-based Corpora for Question Answering, Jurczyk et al., arXiv
  • 47. How do these datasets compare to each other?
  • 48. The results on cross-testing the corpora (MAP / MRR / F1)

  Trained on | Evaluated on WikiQA   | Evaluated on SelQA    | Evaluated on SQuAD
  WikiQA     | 65.54 / 67.41 / 13.33 | 53.47 / 54.12 / 8.68  | 73.16 / 73.72 / 11.26
  SelQA      | 49.05 / 49.64 / 24.30 | 82.72 / 83.70 / 48.66 | 77.22 / 78.04 / 44.70
  SQuAD      | 58.17 / 58.53 / 19.35 | 81.15 / 82.27 / 42.88 | 88.84 / 89.69 / 44.93
  W+S+Q      | 56.40 / 56.51 / -     | 83.19 / 84.25 / -     | 88.78 / 89.65 / -
  ALL        | 60.19 / 60.68 / -     | 82.88 / 83.97 / -     | 88.92 / 89.79 / -
  • 49.
  Higher: accuracy of answer triggering on WikiQA when trained on SelQA
  Higher: accuracy on SQuAD when trained on combined datasets
  Faster: convergence for SQuAD when trained on SelQA, with almost identical performance
  • 50. 2. Non-factoid Question Answering 50 An umbrella term for questions that are not factoid
  • 51. What is non-factoid question answering? As an umbrella term, it covers a wide spectrum of tasks such as recommendation, arithmetic, visual, and community-based question answering. It often requires more complex and customized approaches.
  • 53. How do we solve the following problems?

  Question: A restaurant served 9 pizzas during lunch and 6 during dinner today. How many pizzas were served today?
  Question: Sara has 31 red and 15 green balloons. Sandy has 24 red balloons. How many red balloons do they have in total?
  • 54. Most likely, we construct an equation and solve it.

  Question: A restaurant served 9 pizzas during lunch and 6 during dinner today. How many pizzas were served today?
  Equation: x = 9 + 6; Answer: x = 15

  Question: Sara has 31 red and 15 green balloons. Sandy has 24 red balloons. How many red balloons do they have in total?
  Equation: x = 31 + 24; Answer: x = 55
  • 55. Application to arithmetic questions

  Sequence classification: this task can be seen as a sequence classification of verb polarities.
  Three verb classes: each verb can be either positive (+), negative (-), or neutral (0).
  Linear equation formed: once all polarities are classified, the equation is formed.
  Semantics-based graph: used to extract syntactic and semantic features for verb classification.
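The pipeline above can be sketched in a few lines. The polarity labels here are hard-coded stand-ins for the classifier's output, and the equation builder only handles the simple single-unknown case shown on the previous slide:

```python
# Sketch of forming a linear equation from classified verb polarities.
# Quantities and polarities are assumed inputs; a real system would get
# the polarities from the trained sequence classifier.

def form_equation(quantities, polarities):
    """Combine quantities into x = q1 (+/-) q2 ... based on verb polarity."""
    total, terms = 0, []
    for qty, pol in zip(quantities, polarities):
        if pol == "+":
            total += qty
            terms.append(f"+ {qty}")
        elif pol == "-":
            total -= qty
            terms.append(f"- {qty}")
        # neutral ("0") verbs are omitted from the linear equation
    equation = "x = " + " ".join(terms).lstrip("+ ")
    return equation, total

# "A restaurant served 9 pizzas during lunch and 6 during dinner."
# Both occurrences of "served" are classified positive (+).
eq, x = form_equation([9, 6], ["+", "+"])
print(eq, "->", x)   # x = 9 + 6 -> 15
```

For the balloon question, the 15 green balloons receive a neutral label and drop out of the equation, giving x = 31 + 24 = 55 as on the previous slide.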
  • 56. Natural language processing tasks 56 Semantics-based Graph Approach to Complex Question Answering, Jurczyk et al., NAACL-SRW’2015
  • 57. Natural language processing tasks are used to build a graph 57
  • 58. The flow in the system 58
  • 59. … as an example 59
  • 60. The results on the AllenAI dataset

  Model                  | Accuracy
  This work (2015)       | 71.75%
  Roy et al. (2014)      | 64.00%
  Hosseini et al. (2015) | 77.70%
  Roy et al. (2016)      | 78.00%
  • 61.
  ~6%: lower than the previous approach
  Successfully: applied the semantics-based graph to non-factoid question answering
  But: does not require extra annotation of verb polarities
  • 63. How does event-based question answering look?

  ID | Text                                | Support
  1  | Fred picked up the football there.  |
  2  | Fred gave the football to Jeff.     |
  3  | What did Fred give to Jeff?         | 2
  4  | Bill went back to the bathroom.     |
  5  | Jeff grabbed the milk there.        |
  6  | Who gave the football to Jeff?      | 2
  • 64. Hybrid system for event-based question answering

  NLP/IR solution: a good mix of natural language processing and information retrieval.
  Three groups of fields: lexical, syntactic, and semantic representations of text are extracted.
  Lucene-based engine: a Lucene-based search engine is used to index the extracted fields.
  Event-based QA evaluation: the approach is evaluated on a non-factoid question answering task.

  Multi-Field Structural Decomposition for Question Answering, Jurczyk et al., arXiv
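The field-based scoring idea can be sketched as follows. The field contents, the token-overlap scorer, and the weights are illustrative stand-ins for the actual Lucene fields and ranking function; only the overall shape (one score per field, combined with per-field weights) reflects the slide:

```python
# Toy multi-field scorer: each document and query is decomposed into
# lexical/syntactic/semantic token sets, and the final score is a
# lambda-weighted sum of per-field overlaps (lambda = 1 matches the
# unweighted setting in the results table).

def field_score(query, doc, weights):
    """Weighted token overlap across the three field groups."""
    total = 0.0
    for field, lam in weights.items():
        q, d = query.get(field, set()), doc.get(field, set())
        if q:
            total += lam * len(q & d) / len(q)
    return total

# Hand-written stand-ins for extracted fields of one indexed sentence.
doc = {
    "lexical":   {"fred", "gave", "football", "jeff"},
    "syntactic": {"nsubj:fred", "dobj:football"},
    "semantic":  {"arg0:fred", "arg1:football", "arg2:jeff"},
}
query = {
    "lexical":   {"who", "gave", "football", "jeff"},
    "syntactic": {"dobj:football"},
    "semantic":  {"arg1:football", "arg2:jeff"},
}
weights = {"lexical": 1.0, "syntactic": 1.0, "semantic": 1.0}  # lambda = 1
print(round(field_score(query, doc, weights), 2))   # 2.75
```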
  • 65. … as the flow of execution 65
  • 66. Example decomposition for incoming document 66
  • 67. Results on the bAbI dataset (MAP / MRR)

  Type  | Lexical, λ=1  | Lexical, λ learned | +Syntax, λ=1  | +Syntax, λ learned | +Semantics, λ=1 | +Semantics, λ learned
  task1 | 39.62 / 61.73 | 39.62 / 61.73      | 29.90 / 48.05 | 40.50 / 61.47      | 72.60 / 85.07   | 100.0 / 100.0
  task5 | 37.10 / 54.00 | 38.20 / 54.70      | 48.00 / 62.15 | 48.40 / 62.25      | 72.60 / 82.65   | 94.20 / 96.33
  …
  Avg.  | 44.45 / 61.25 | 44.63 / 61.37      | 45.16 / 60.34 | 48.41 / 63.76      | 59.60 / 73.70   | 85.16 / 90.47
  • 68. 3. Applications for Cross-genre Tasks 68 Document retrieval and passage completion cross-genre tasks
  • 70. What is the cross-genre document retrieval task? Given a description (query) and a list of conversational scripts (documents), retrieve the scripts that are relevant to (support) this description.
  • 71. Documents: ‘Friends’ scripts

  10 seasons of Friends
  1 season = ~24 episodes
  1 episode = ~14 scenes
  1 scene = ~20 utterances
  1 utterance = speaker + utterance
  • 72. A slice from a scene

  Rachel: How does going to a strip club help him better?
  Ross: Because there are naked ladies there.
  Joey: Which helps him get to Phase Three, picturing yourself with other women.
  Ross: There are naked ladies there too.
  Joey: Yeah.
  • 73. Descriptions: episode summaries & plots

  Summary: one-paragraph episode summary
  Plot: more detailed episode description
  ~5,000 sentence descriptions
  • 74. Description examples

  Dialogue: Joey: “One woman? That’s like saying there’s only one flavor of ice cream for you. Lemme tell you something, Ross. There’s lots of flavors out there.”
  Summary + Plot: Joey compares women to ice cream

  Dialogue: Ross: “You know you probably didn’t know this, but back in the high school, I had, a, um, major crush on you.” Rachel: “I knew.”
  Summary + Plot: Ross reveals his high school crush on Rachel

  Dialogue: Chandler: “Alright, one of you give me your underpants.” Joey: “Can’t help you, I’m not wearing any.”
  Summary + Plot: Chandler asks Joey for his underwear, but Joey can’t help him out as he’s not wearing any
  • 75. Elasticsearch: first results

  k  | Dev R@k | Dev MRR | Eval R@k | Eval MRR
  1  | 46.00   | 46.00   | 47.64    | 47.64
  2  | 65.80   | 53.80   | 69.26    | 55.79
  5  | 72.60   | 54.71   | 74.66    | 56.53
  10 | 78.80   | 55.13   | 79.73    | 56.91
  20 | 83.80   | 55.31   | 84.80    | 57.08
  • 76. Structure extraction for conversational and formal writings 76 “Chandler: Alright, one of you give me your underpants” Cross-genre Document Retrieval: Matching between Conversational and Formal Writings, Jurczyk et al., BLGNLP 2017 (during EMNLP)
  • 77. How does the structure matching work? Structures are extracted from the description, matched against the indexed scripts’ structures, and the best-matching script is retrieved.
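A minimal sketch of the extract-match-retrieve loop, assuming structures are reduced to (subject, verb, object) triples. The triples and the episode index are hand-written stand-ins for parser output, and the overlap count stands in for the real matching score:

```python
# Toy structure-matching retrieval: scripts are pre-indexed as sets of
# (subject, verb, object) triples; a description is reduced to the same
# form and scored by triple overlap.

indexed = {
    "episode_1": [("chandler", "give", "underpants"), ("joey", "wear", "any")],
    "episode_2": [("ross", "reveal", "crush")],
}

def retrieve(description_triples, index):
    """Return the episode whose indexed structures best match the query."""
    scores = {ep: len(set(description_triples) & set(triples))
              for ep, triples in index.items()}
    return max(scores, key=scores.get)

# Structure extracted from "Chandler asks Joey for his underwear ..."
query = [("chandler", "give", "underpants")]
print(retrieve(query, indexed))   # episode_1
```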
  • 78. Experimental results

  Model    | Dev R@1 | Dev MRR | Eval R@1 | Eval MRR
  Elastic1 | 46.00   | 54.71   | 47.64    | 56.53
  Structw  | 34.80   | 45.75   | 35.47    | 47.40
  Structl  | 35.60   | 46.86   | 39.53    | 50.84
  Structm  | 33.80   | 45.10   | 35.98    | 47.76
  • 79. But... structure matching is capable of locating ~15% of the descriptions that Elasticsearch cannot.
  • 81. Experimental results

  Model     | Dev R@1 | Dev MRR | Eval R@1 | Eval MRR
  Elastic10 | 46.00   | 54.71   | 47.64    | 56.53
  Structl   | 35.60   | 46.86   | 39.53    | 50.84
  Rerank1   | 48.20   | 56.02   | 51.86    | 59.46
  Rerankλ   | 51.20   | 57.74   | 52.03    | 59.84
  • 82.
  47.64%: initial R@1 achieved by Elasticsearch
  But: structure matching is based on a single utterance; this will be improved
  9.2%: improvement when the structure extraction features were used
  • 84. Passage completion is reading comprehension

  Cross-genre: it is a cross-genre task on conversational data.
  Reading comprehension: it benchmarks the ability to read and comprehend natural language.
  Entity-based: it is based on entity prediction given a query and a passage.
  Towards QA: as reading comprehension, it will be crucial for future question answering.
  • 85. Already existing PC task: CNN/Daily News 85
  • 86. Passage completion in conversational dialogs 86
  • 87. Approach: a multi-gram convolutional neural network with attention 87
  • 88. Approach: a multi-gram convolutional neural network with attention 88
  • 89. Experimental results

  Model                           | Accuracy
  Baseline1 (entity majority)     | 27.30
  Baseline2 (word-distance)       | 27.26
  Linguistic approach (L2R, 2016) | 51.16
  Bi-LSTM + attention (2017)      | 69.26 (test: 62.52%)
  Multi-gram CNN                  | 57.43
  Multi-gram CNN + attn_dot       | 63.58
  • 91. Experimental results

  Model                                 | Accuracy
  Baseline1 (entity majority)           | 27.52
  Baseline2 (word-distance)             | 27.67
  Linguistic approach (L2R)             | 47.36
  Bi-LSTM + attention                   | 69.26
  Multi-gram CNN                        | 57.43
  Multi-gram CNN + attn_dot             | 63.58
  Utterance-based multi-gram CNN + attn | 66.59
  • 92.
  66.59%: best score so far
  But: more robust when CNN + Bi-LSTM is used (accuracy stays more stable as the number of utterances increases)
  ~5%: lower than Bi-LSTM
  • 93. Thanks to the committee members and audience! 93
  • 94. Thanks to Jinho for mentoring me and believing in me! 94
  • 95. Thanks to Emory NLP Lab! 95
  • 99. Thanks to my family! 99
  • 100. Research Contributions

  Sentence-based Factoid Question Answering (2016):
  - A multi-stage annotation scheme: for sentence-based factoid question answering using a crowdsourcing technique
  - Exploration of neural architectures for FQA: convolutional neural networks for sentence-based FQA
  - A subtree matching mechanism: for measuring contextual similarity between two sentences
  - Combining multiple QA corpora: improving the performance of QA systems by cross-using multiple sets

  Non-factoid Question Answering (2015-2016):
  - A semantics-based graph: abstract representation applied to arithmetic question answering
  - Multi-field structural decomposition: for event-based question answering

  Applications to Cross-genre Tasks (2016-2017):
  - Structure matching: cross-genre document retrieval between conversational and formal writings
  - A multi-gram attention CNN: for the passage completion task on conversational dialog texts
  • 101. The process of the scheme

  1. ~500 articles are uniformly sampled from Wikipedia from the following topics: Arts, Country, Food, Historical Events, Movies, Music, Science, Sports, Travel, TV.
  2. Sections that have more than 2 and fewer than 26 sentences are selected.
  3. These sections are used in the annotation scheme and are sent to annotators.
  4. Four tasks are performed on Mechanical Turk; the fifth task is performed using Elasticsearch.
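Step 2 of the scheme can be sketched as a simple length filter. The naive period-based splitter below is a stand-in for the real sentence segmenter:

```python
# Toy filter for step 2: keep only sections whose sentence count falls
# strictly between 2 and 26. Splitting on "." stands in for a proper
# sentence segmenter.

def eligible_sections(sections):
    keep = []
    for text in sections:
        n = len([s for s in text.split(".") if s.strip()])
        if 2 < n < 26:
            keep.append(text)
    return keep

sections = ["One. Two.", "One. Two. Three.", "x. " * 30]
print(len(eligible_sections(sections)))   # only the 3-sentence section passes
```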
  • 102. The process of subtree matching

  1. For a question-sentence pair, extract a list of overlapping words.
  2. For each overlapping pair, extract their tree slices.
  3. Perform the matching step on three levels: parents, siblings, and children.
  4. Calculate their semantic similarity scores.
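The four steps can be sketched as follows. The toy trees and the overlap-based score stand in for real dependency trees and the embedding-based similarity used in the dissertation:

```python
# Simplified subtree matching: for an overlapping word, take its tree
# slice (parent, siblings, children) from both the question tree and the
# sentence tree, then compare the slices on all three levels.

def tree_slice(tree, word):
    """Step 2: extract the parent, siblings, and children of a word."""
    node = tree[word]
    return {"parent": node["parent"],
            "siblings": node["siblings"],
            "children": node["children"]}

def match_score(slice_q, slice_s):
    """Steps 3-4: toy similarity = parent match + shared siblings/children."""
    score = 0.0
    score += slice_q["parent"] == slice_s["parent"]
    score += len(set(slice_q["siblings"]) & set(slice_s["siblings"]))
    score += len(set(slice_q["children"]) & set(slice_s["children"]))
    return score

# Overlapping word "gave" in a question tree and a candidate-sentence tree.
q_tree = {"gave": {"parent": None, "siblings": [], "children": ["fred", "football"]}}
s_tree = {"gave": {"parent": None, "siblings": [], "children": ["fred", "football", "jeff"]}}
print(match_score(tree_slice(q_tree, "gave"), tree_slice(s_tree, "gave")))  # 3.0
```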
  • 103. The experimentation setup 103 ◎ One of the state-of-the-art convolutional neural networks introduced by Yu et al. (2014) is used to evaluate QA corpora. ◎ The model consists of a single convolutional layer, a max pooling, and then the sigmoid function. ◎ 40 words of question and 40 words of answer are used as the input to the model. ◎ Original splits provided by the creators are used.
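For illustration, the layer stack described above (one convolutional layer, max pooling, a sigmoid output) can be sketched with numpy for a single 40-token input. All dimensions, the random weights, and the single-input simplification are made up; this is not the trained model of Yu et al. (2014):

```python
# Toy forward pass: convolution over token embeddings -> max pooling
# over positions -> sigmoid score. Weights are random, for shape only.
import numpy as np

rng = np.random.default_rng(0)
seq_len, emb_dim, n_filters, width = 40, 50, 100, 3   # illustrative sizes

def conv_max_sigmoid(x, filters, w_out, b_out):
    # x: (seq_len, emb_dim); filters: (n_filters, width, emb_dim)
    feats = []
    for f in filters:
        acts = [np.sum(x[i:i + width] * f) for i in range(seq_len - width + 1)]
        feats.append(max(acts))                      # max pooling over positions
    feats = np.array(feats)
    return 1.0 / (1.0 + np.exp(-(feats @ w_out + b_out)))  # sigmoid score

x = rng.standard_normal((seq_len, emb_dim))           # 40 padded tokens
filters = rng.standard_normal((n_filters, width, emb_dim)) * 0.01
score = conv_max_sigmoid(x, filters, rng.standard_normal(n_filters) * 0.01, 0.0)
print(0.0 < score < 1.0)   # True: a probability-like relevance score
```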
  • 104. Natural language processing tasks are used to build a graph

  Term               | Definition                                                                          | Example
  Document           | a single document of text                                                           | microblog note or Wikipedia article
  Entity             | a set of instances referring to the same instance in a given context                | “John went to Emory University. He majored in CompSci.”
  Instance           | the atomic-level object in the graph, usually represented by a modifier of an instance | “John went to Emory University with Jessica.”
  Predicate-Argument | an argument that completes the meaning of another instance (predicate)             | “This car was sold to Michael two days ago.”
  Attribute          | a modifier of an instance                                                           | “Alicia has a black cat.”
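A rough sketch of how these definitions translate into a graph, with instances as nodes and predicate-argument or attribute relations as labeled edges. The relation labels and the dict-based representation are illustrative, not the dissertation's data structure:

```python
# Toy semantics-based graph: instances are nodes, relations are labeled
# edges (head, relation, dependent).

graph = {"nodes": set(), "edges": []}

def add_edge(head, relation, dependent):
    graph["nodes"].update({head, dependent})
    graph["edges"].append((head, relation, dependent))

# "Alicia has a black cat." -> attribute edge on the instance "cat"
add_edge("cat", "attribute", "black")
# "This car was sold to Michael." -> predicate-argument edges on "sold"
add_edge("sold", "argument", "car")
add_edge("sold", "argument", "Michael")

print(sorted(graph["nodes"]))   # ['Michael', 'black', 'car', 'cat', 'sold']
```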
  • 105. Document retrieval in the cross-genre texts
  ◎ The source texts are Friends TV show scripts, while the target texts (queries) are episode descriptions.
  ◎ For a list of source texts (episodes) and a query (description), retrieve the source that matches the description.
  ◎ A structure extraction method is presented that improves retrieval performance.
  ◎ This task is a preliminary step towards question answering on conversational scripts.
  • 106. Target texts: the show’s descriptions
  ◎ Episode summaries and plots have been crawled from fan sites.
  ◎ Summaries are one-paragraph texts, usually of 5-6 sentences, that provide a high-level overview of an episode.
  ◎ Plots are multi-paragraph texts, usually giving a more detailed description of an episode.
  ◎ Each summary and plot was sentence-segmented, tokenized, and represented as a single query.
  ◎ The set of over 5,000 queries is used in the experimentation.
  • 107. Improving R@1 given R@10
  ◎ The extracted relations are now used to improve R@1 when the 10 (k=10) most relevant documents are given.
  ◎ A two-stage classification setup is presented that uses the extracted relations.
  ◎ A feed-forward neural network is trained to decide whether the top-ranked episode (k=1) should be returned as the correct one.
  ◎ If not, the initial ranking from Elasticsearch is paired with structure matching scores, and a new prediction is made.
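The two-stage decision can be sketched as follows. The `accept_top` stub stands in for the feed-forward classifier, and the candidate scores and interpolation weight are made up for illustration:

```python
# Toy two-stage re-ranking: stage 1 asks a classifier whether the
# Elasticsearch top hit should be kept; stage 2 interpolates the
# Elasticsearch and structure-matching scores and re-ranks.

def rerank(candidates, accept_top, lam=0.5):
    # candidates: list of (doc_id, es_score, struct_score), ES-ranked
    if accept_top(candidates[0]):
        return candidates[0][0]          # stage 1: keep the top-1 hit
    combined = [(doc, (1 - lam) * es + lam * st)
                for doc, es, st in candidates]
    return max(combined, key=lambda p: p[1])[0]   # stage 2: re-rank

cands = [("ep3", 0.90, 0.10), ("ep7", 0.85, 0.95), ("ep1", 0.40, 0.20)]
never_accept = lambda top: False   # stub for the feed-forward classifier
print(rerank(cands, never_accept))   # ep7 wins after combining scores
```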
  • 108. Passage completion for cross-genre texts
  ◎ It is very difficult to extract actual answers from dialogue data, where an answer might be contained within a single utterance or spread across many.
  ◎ The machine must first understand the logic of the human dialogue; it cannot just “extract” the answer from the text, it must infer it.
  ◎ As a proxy task, we first want to tackle passage completion.
  ◎ A query that consists of one or more entities is tested against a document text (news article).
  • 109. Tackling the complexity of arithmetic questions
  ◎ It is relatively easy to develop a question answering system for a single type of question.
  ◎ Arithmetic questions, as seen on the previous slide, require reasoning on an abstract level.
  ◎ A semantics-based graph approach is presented that builds an abstract representation of the text.
  • 110. Application to arithmetic questions
  ◎ This task is represented as a sequence classification of verb polarities.
  ◎ Each verb can be either positive (+), negative (-), or neutral (0).
  ◎ A positive/negative verb yields an addition/subtraction operation with its associated chain node; neutral verbs are omitted from the linear equation.
  ◎ A prediction is made for all recognized verbs in a sentence; then a linear equation is formed and solved.
  ◎ The presented graph is applied to extract abstract features from the text.
  • 111. Hybrid system for event-based question answering
  ◎ A solution that is a good mix of natural language processing and information retrieval is presented.
  ◎ NLP structures are used to extract lexical, syntactic, and semantic representations of text.
  ◎ A Lucene-based search engine is used to index the extracted fields and then to perform document retrieval on these fields.
  ◎ The approach is evaluated using the publicly available bAbI dataset, which consists of 20 tasks, where each task represents a different kind of question answering challenge. Multi-Field Structural Decomposition for Question Answering, Jurczyk et al., arXiv