SlideShare a Scribd company logo
Talking Geckos QA System
Jie Cao
Boya Song
Overview
• Rule based system [Based on Ellen’s Quarc Paper]
• Stanford core nlp lib: POS, sentence split, NER,
dependency relations
• Classify questions according to keyword: what, where,
when, why, how, etc.
• Word Match function
• Rules that award points to each sentence
• Narrow down answer according to key word type
Work Process
Input
wordMatch(Common Component)
(Question,StorySentences)->ScoredSentences
reverseWordMatch(Make Assumption)
correctAnswer->correctSentence
narrowDownAnswer(Why)
correctSentence->correctAnswer
Google: Fast is better than slow
narrowDownAnswer(What)
correctSentence->correctAnswer
narrowDownAnswer(…)
correctSentence->correctAnswer
narrowDownAnswer(How)
correctSentence->correctAnswer
BestSentence(Why)
ScoredSentence->correctSentence
BestSentence(What)
ScoredSentence->correctSentence
BestSentence(…)
ScoredSentence->correctSentence
BestSentence(How)
ScoredSentence->correctSentence
Rule Isolation
Rule Isolation
Developing Task Parallelism
Reports and Testing
Fast Testing
Classify Question Type Rules
WordMatch Rules Object Persistence, Reload
Test all data in less than 1 minute
Pre-annotated Story and Question
Annotated Test Report
Test By Question Type List
Annotated and Scored Report For
Every Question Type
151 Stories,1200 Questions
Precision Improvement
• Used reports and searched for general patterns to narrow down
answer within sentence
• Find continuous named entity tags in answer
• Location. time, and preposition keywords
• Why: substring after “so” or “because”—simple but very effective
• What: substring after core sub, core verb, root verb, or strike of
word contained in question
• Who, Where type of questions are more complicated
Sample Report
QuestionID: 1999-W38-1-9

Question: How many peacekeeping troops does Canada now have in 22 nations around the world?

Answer: about 3,900 | 3,900

SentenceSize: 1

CorrectSentence: [8.5]Right now [+word_0.5] , Canada [+word_0.5] has [+word_0.5+auxverb_0.5] about 3,900 peacekeepers
on 22 [+word_0.5] missions around [+word_0.5] the world [+strike_1+word_0.5] . [HOW_NUMBER_NER:4.0 ]

MyAnswer: 3,900 22

MyAnswerSentence: [8.5]Right now [+word_0.5] , Canada [+word_0.5] has [+word_0.5+auxverb_0.5] about 3,900 peacekeepers
on 22 [+word_0.5] missions around [+word_0.5] the world [+strike_1+word_0.5] . [HOW_NUMBER_NER:4.0 ]

MyAnswerScore: AnswerScore{recall=1.0, precise=0.5, fmeasure=0.6666666666666666, myCorrect=1, correctTotal=1,
myTotal=2, matchKey=' 3,900'}

Difficulty: easy

Included: Yes





QuestionID: 1999-W37-5-3

Question: How much longer did Meiorin take to run 2.5 kilometres than she was supposed to?

Answer: 49 seconds

SentenceSize: 1

CorrectSentence: [15.5]She [+word_0.5+disword_2.0] took [+word_0.5+rootverb_6] 49 seconds longer
[+word_0.5+disword_2.0] . [HOW_NUMBER_NER:4.0 ]

MyAnswer: 49 seconds

MyAnswerSentence: [15.5]She [+word_0.5+disword_2.0] took [+word_0.5+rootverb_6] 49 seconds longer
[+word_0.5+disword_2.0] . [HOW_NUMBER_NER:4.0 ]

MyAnswerScore: AnswerScore{recall=1.0, precise=1.0, fmeasure=1.0, myCorrect=2, correctTotal=2, myTotal=2, matchKey='49
seconds'}

Difficulty: moderate

Included: Yes
rightSentence=94, length = 175, avgRecall = 0.47672550966159993, avgPrecision = 0.3014595835896379, avgFmeasure =
0.32152025726198186
WordMatch: TokenScore&RuleScore
Question: Why is the Sheldon Kennedy Foundation abandoning its dream of building a
ranch for sexually abused children?
MyAnswerSentence: [47.5]Troubled by poor business decisions , the Sheldon
[+strike_1+word_0.5+disword_1.0] Kennedy [+strike_1+word_0.5+disword_1.0]
Foundation [+strike_1+word_0.5+disword_2.0+subj_1+secverb_3] has abandoned
[+word_0.5+rootverb_6] its [+strike_1+word_0.5+disword_1.0] dream
[+strike_1+word_0.5+disword_2.0+dobj_1+secverb_3] of [+strike_1] building
[+strike_1+word_0.5+coreverb_4] a [+strike_1] ranch [+strike_1+word_0.5+disword_0.5]
for [+strike_1] sexually [+strike_1+word_0.5+disword_0.125] abused
[+strike_1+word_0.5+disword_0.125] children [+strike_1+word_0.5+disword_0.25] and will
hand its [+word_0.5+disword_0.5] donations over to the Canadian Red Cross .
[WHY_CURRENT_GOOD:2.0 ]
Stopword
Total Score
=Token Score+Rule Score
Strike_1:
Continuous Match
Word Match:0.5
closer
dis_word_1.0
Math.pow(2, 2 -
dobj_1
root verb_6
coreverb_4:
verb but not root
further dis_word_0.125
Math.pow(2, 2 - distance)
Incomplete
Lucene Stopword
Why RuleScore
WordMatch: TokenScore
1. Generate
1. VerbsInQuestion,OthersInQuestion,rootVerb,subjInQuestion,dObjInQuestion
2. General Word Match Score
1. WORD: “word_0.5”
2. DIS_WORD: “disword_{pow(2,2-distance)}” //unimportant “Mod”, longer distance from the root
3. Continuous Word Match Bonus Score
1. STRIKE: “strike_1” // for every continuous word(not include stopword)
4. Verb Match
1. Verb stopwords no score. //lucene stopwords
2. VerbsInQuestion.contains(“verb”)
• AUX_VERB: 0.5
• ROOT_VERB: 6
• COM_VERB: 0.5
• CORE_VERB: 4
5. Noun Match
1. ROOT_VERB: 6 // Root Verb in noun form
2. subjInQuestion != null && matched
• SUBJ: 1 // match subj in question.
• CORE_SUBJ: 3 // match core_subj in question
• SEC_VERB: 3 // verb of this matched noun is also in Question
3. dObjInQuestion != null && matched
• DOBJ: 1 // match dobj in question
• SEC_VERB: 3 // verb of this matched noun is also in Question
Time Limited
Just Extract Features, Manual tuning weights
Future Work:
1. Learning weights for features
2. dcoref
3. Thesaurus
BestSentence: Rule Score
WEAK_CLUE=1
CLUE=2
GOOD_CLUE =4
CONFIDENT=6
SLAM_DUNK=20
Only TokenScore
in all dataset
Average:
CorrectSentence/All
TokenScore+RuleScore
in all dataset
Average:
CorrectSentence/All
Where GOOD_CLUE: WHERE_LOCATION_PREP
CLUE: WHERE_LOCATION_NER
86/155=0.5548 93/155=0.6000
When
GOOD_CLUE: WHEN_TIME_NER
SLAM_DUNK: WHEN_ORDER_TOKEN
SLAM_DUNK: WHEN_BEGIN_TOKEN
117/173=0.6763 119/173=0.6879
Why
GOOD_CLUE: WHY_REASON_TOKEN
CLUE: WHY_CURRENT_GOOD
CLUE: WHY_POST_GOOD
CLUE: WHY_PRE_GOOD
WEAK_CLUE: WHY_LEAD_TOKEN
WEAK_CLUE: WHY_THINK_TOKEN
WEAK_CLUE: WHY_WANT_TOKEN
70/115=0.6087 71/115=0.6174
Who/
Whose
GOOD_CLUE:WHO_PERSON_NER
CLUE:WHO_ORGANIZATION_MISC_NER
CLUE:WHOSE_PRP_POS
116/187=0.6203 118/187=0.6310
What
GOOD_CLUE: WHAT_KIND_TOKEN
CLUE: WHAT_DATE_NER
SLAM_DUNK: WHAT_NAME_TOKEN
202/311=0.6325 202/311=0.6325
How HOW_SECKEY_K: distance from k to root
GOOD_CLUE: HOW_DISTANCE_TEXT
141/232=0.6078 143/232=0.6164
testset1 52/84 What, 35/57 How, 27/44 Where, 25/41 Who, 23/32 When, 16/28 Why
…
212/313=0.6773 212/313=0.6773
testset2 57/88 What, 41/50 How, 28/44 Who, 27/40 When, 24/38 Where, 21/31 Why … 215/315=0.6825 221/315=0.7016
Summary
• Reporting system worked very well, helped to
improve precision
• System is fairly simple and straight forward
• Unfortunately due to Stanford core nlp lib bug, we
could not incorporate coreference resolution into
our system
• Short on time: Tuning weights manually(ML?) Recall
and precision still have room for improvement
Thanks
Q&A

More Related Content

Similar to Talking Geckos (Question and Answering)

Elasticsearch at Dailymotion
Elasticsearch at DailymotionElasticsearch at Dailymotion
Elasticsearch at Dailymotion
Cédric Hourcade
 
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"
epamspb
 
Teaching Constraint Programming, Patrick Prosser
Teaching Constraint Programming,  Patrick ProsserTeaching Constraint Programming,  Patrick Prosser
Teaching Constraint Programming, Patrick Prosser
Pierre Schaus
 
[INSIGHT OUT 2011] A21 why why is probably the right answer(tom kyte)
[INSIGHT OUT 2011] A21 why why is probably the right answer(tom kyte)[INSIGHT OUT 2011] A21 why why is probably the right answer(tom kyte)
[INSIGHT OUT 2011] A21 why why is probably the right answer(tom kyte)
Insight Technology, Inc.
 
Fluent Refactoring (Lone Star Ruby Conf 2013)
Fluent Refactoring (Lone Star Ruby Conf 2013)Fluent Refactoring (Lone Star Ruby Conf 2013)
Fluent Refactoring (Lone Star Ruby Conf 2013)
Sam Livingston-Gray
 
QuestionPro Integrates with TryMyUI to Launch the Survey Respondent Score
QuestionPro Integrates with TryMyUI to Launch the Survey Respondent ScoreQuestionPro Integrates with TryMyUI to Launch the Survey Respondent Score
QuestionPro Integrates with TryMyUI to Launch the Survey Respondent Score
James Wirth
 
Evolutionary Algorithms: Perfecting the Art of "Good Enough"
Evolutionary Algorithms: Perfecting the Art of "Good Enough"Evolutionary Algorithms: Perfecting the Art of "Good Enough"
Evolutionary Algorithms: Perfecting the Art of "Good Enough"
PyData
 
What I learned from Seven Languages in Seven Weeks (IPRUG)
What I learned from Seven Languages in Seven Weeks (IPRUG)What I learned from Seven Languages in Seven Weeks (IPRUG)
What I learned from Seven Languages in Seven Weeks (IPRUG)
Kerry Buckley
 
Creating AnswerBot with Keras and TensorFlow (TensorBeat)
Creating AnswerBot with Keras and TensorFlow (TensorBeat)Creating AnswerBot with Keras and TensorFlow (TensorBeat)
Creating AnswerBot with Keras and TensorFlow (TensorBeat)
Avkash Chauhan
 
Test First Teaching
Test First TeachingTest First Teaching
Test First Teaching
Sarah Allen
 
Varga ha
Varga haVarga ha
Varga ha
Andrea Varga
 
Lazy man's learning: How To Build Your Own Text Summarizer
Lazy man's learning: How To Build Your Own Text SummarizerLazy man's learning: How To Build Your Own Text Summarizer
Lazy man's learning: How To Build Your Own Text Summarizer
Sho Fola Soboyejo
 
A3 sec -_regular_expressions
A3 sec -_regular_expressionsA3 sec -_regular_expressions
A3 sec -_regular_expressions
a3sec
 

Similar to Talking Geckos (Question and Answering) (13)

Elasticsearch at Dailymotion
Elasticsearch at DailymotionElasticsearch at Dailymotion
Elasticsearch at Dailymotion
 
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"
 
Teaching Constraint Programming, Patrick Prosser
Teaching Constraint Programming,  Patrick ProsserTeaching Constraint Programming,  Patrick Prosser
Teaching Constraint Programming, Patrick Prosser
 
[INSIGHT OUT 2011] A21 why why is probably the right answer(tom kyte)
[INSIGHT OUT 2011] A21 why why is probably the right answer(tom kyte)[INSIGHT OUT 2011] A21 why why is probably the right answer(tom kyte)
[INSIGHT OUT 2011] A21 why why is probably the right answer(tom kyte)
 
Fluent Refactoring (Lone Star Ruby Conf 2013)
Fluent Refactoring (Lone Star Ruby Conf 2013)Fluent Refactoring (Lone Star Ruby Conf 2013)
Fluent Refactoring (Lone Star Ruby Conf 2013)
 
QuestionPro Integrates with TryMyUI to Launch the Survey Respondent Score
QuestionPro Integrates with TryMyUI to Launch the Survey Respondent ScoreQuestionPro Integrates with TryMyUI to Launch the Survey Respondent Score
QuestionPro Integrates with TryMyUI to Launch the Survey Respondent Score
 
Evolutionary Algorithms: Perfecting the Art of "Good Enough"
Evolutionary Algorithms: Perfecting the Art of "Good Enough"Evolutionary Algorithms: Perfecting the Art of "Good Enough"
Evolutionary Algorithms: Perfecting the Art of "Good Enough"
 
What I learned from Seven Languages in Seven Weeks (IPRUG)
What I learned from Seven Languages in Seven Weeks (IPRUG)What I learned from Seven Languages in Seven Weeks (IPRUG)
What I learned from Seven Languages in Seven Weeks (IPRUG)
 
Creating AnswerBot with Keras and TensorFlow (TensorBeat)
Creating AnswerBot with Keras and TensorFlow (TensorBeat)Creating AnswerBot with Keras and TensorFlow (TensorBeat)
Creating AnswerBot with Keras and TensorFlow (TensorBeat)
 
Test First Teaching
Test First TeachingTest First Teaching
Test First Teaching
 
Varga ha
Varga haVarga ha
Varga ha
 
Lazy man's learning: How To Build Your Own Text Summarizer
Lazy man's learning: How To Build Your Own Text SummarizerLazy man's learning: How To Build Your Own Text Summarizer
Lazy man's learning: How To Build Your Own Text Summarizer
 
A3 sec -_regular_expressions
A3 sec -_regular_expressionsA3 sec -_regular_expressions
A3 sec -_regular_expressions
 

More from jie cao

Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral CodesObserving Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
jie cao
 
A Comparative Study on Schema-guided Dialog State Tracking
A Comparative Study on Schema-guided Dialog State TrackingA Comparative Study on Schema-guided Dialog State Tracking
A Comparative Study on Schema-guided Dialog State Tracking
jie cao
 
Task-oriented Conversational semantic parsing
Task-oriented Conversational semantic parsingTask-oriented Conversational semantic parsing
Task-oriented Conversational semantic parsing
jie cao
 
CCG
CCGCCG
CCG
jie cao
 
Spark调研串讲
Spark调研串讲Spark调研串讲
Spark调研串讲
jie cao
 
Parsing Natural Scenes and Natural Language with Recursive Neural Networks
Parsing Natural Scenes and Natural Language with Recursive Neural NetworksParsing Natural Scenes and Natural Language with Recursive Neural Networks
Parsing Natural Scenes and Natural Language with Recursive Neural Networks
jie cao
 
Challenges on Distributed Machine Learning
Challenges on Distributed Machine LearningChallenges on Distributed Machine Learning
Challenges on Distributed Machine Learning
jie cao
 

More from jie cao (7)

Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral CodesObserving Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
 
A Comparative Study on Schema-guided Dialog State Tracking
A Comparative Study on Schema-guided Dialog State TrackingA Comparative Study on Schema-guided Dialog State Tracking
A Comparative Study on Schema-guided Dialog State Tracking
 
Task-oriented Conversational semantic parsing
Task-oriented Conversational semantic parsingTask-oriented Conversational semantic parsing
Task-oriented Conversational semantic parsing
 
CCG
CCGCCG
CCG
 
Spark调研串讲
Spark调研串讲Spark调研串讲
Spark调研串讲
 
Parsing Natural Scenes and Natural Language with Recursive Neural Networks
Parsing Natural Scenes and Natural Language with Recursive Neural NetworksParsing Natural Scenes and Natural Language with Recursive Neural Networks
Parsing Natural Scenes and Natural Language with Recursive Neural Networks
 
Challenges on Distributed Machine Learning
Challenges on Distributed Machine LearningChallenges on Distributed Machine Learning
Challenges on Distributed Machine Learning
 

Recently uploaded

End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
Lars Albertsson
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Aggregage
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
nyfuhyz
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
roli9797
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
rwarrenll
 
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
Social Samosa
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
javier ramirez
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
g4dpvqap0
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
bopyb
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
Bill641377
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
sameer shah
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
aqzctr7x
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
vikram sood
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
kuntobimo2016
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
Walaa Eldin Moustafa
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Kiwi Creative
 
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
74nqk8xf
 
Experts live - Improving user adoption with AI
Experts live - Improving user adoption with AIExperts live - Improving user adoption with AI
Experts live - Improving user adoption with AI
jitskeb
 
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
zsjl4mimo
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
Roger Valdez
 

Recently uploaded (20)

End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
 
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
 
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
 
Experts live - Improving user adoption with AI
Experts live - Improving user adoption with AIExperts live - Improving user adoption with AI
Experts live - Improving user adoption with AI
 
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 

Talking Geckos (Question and Answering)

  • 1. Talking Geckos QA System Jie Cao Boya Song
  • 2. Overview • Rule based system [Based on Ellen’s Quarc Paper] • Stanford core nlp lib: POS, sentence split, NER, dependency relations • Classify questions according to keyword: what, where, when, why, how, etc. • Word Match function • Rules that award points to each sentence • Narrow down answer according to key word type
  • 3. Work Process Input wordMatch(Common Component) (Question,StorySentences)->ScoredSentences reverseWordMatch(Make Assumption) correctAnswer->correctSentence narrowDownAnswer(Why) correctSentence->correctAnswer Google: Fast is better than slow narrowDownAnswer(What) correctSentence->correctAnswer narrowDownAnswer(…) correctSentence->correctAnswer narrowDownAnswer(How) correctSentence->correctAnswer BestSentence(Why) ScoredSentence->correctSentence BestSentence(What) ScoredSentence->correctSentence BestSentence(…) ScoredSentence->correctSentence BestSentence(How) ScoredSentence->correctSentence Rule Isolation Rule Isolation Developing Task Parallelism
  • 4. Reports and Testing Fast Testing Classify Question Type Rules WordMatch Rules Object Persistence, Reload Test all data in less than 1 minute Pre-annotated Story and Question Annotated Test Report Test By Question Type List Annotated and Scored Report For Every Question Type 151 Stories,1200 Questions
  • 5. Precision Improvement • Used reports and searched for general patterns to narrow down answer within sentence • Find continuous named entity tags in answer • Location. time, and preposition keywords • Why: substring after “so” or “because”—simple but very effective • What: substring after core sub, core verb, root verb, or strike of word contained in question • Who, Where type of questions are more complicated
  • 6. Sample Report QuestionID: 1999-W38-1-9
 Question: How many peacekeeping troops does Canada now have in 22 nations around the world?
 Answer: about 3,900 | 3,900
 SentenceSize: 1
 CorrectSentence: [8.5]Right now [+word_0.5] , Canada [+word_0.5] has [+word_0.5+auxverb_0.5] about 3,900 peacekeepers on 22 [+word_0.5] missions around [+word_0.5] the world [+strike_1+word_0.5] . [HOW_NUMBER_NER:4.0 ]
 MyAnswer: 3,900 22
 MyAnswerSentence: [8.5]Right now [+word_0.5] , Canada [+word_0.5] has [+word_0.5+auxverb_0.5] about 3,900 peacekeepers on 22 [+word_0.5] missions around [+word_0.5] the world [+strike_1+word_0.5] . [HOW_NUMBER_NER:4.0 ]
 MyAnswerScore: AnswerScore{recall=1.0, precise=0.5, fmeasure=0.6666666666666666, myCorrect=1, correctTotal=1, myTotal=2, matchKey=' 3,900'}
 Difficulty: easy
 Included: Yes
 
 
 QuestionID: 1999-W37-5-3
 Question: How much longer did Meiorin take to run 2.5 kilometres than she was supposed to?
 Answer: 49 seconds
 SentenceSize: 1
 CorrectSentence: [15.5]She [+word_0.5+disword_2.0] took [+word_0.5+rootverb_6] 49 seconds longer [+word_0.5+disword_2.0] . [HOW_NUMBER_NER:4.0 ]
 MyAnswer: 49 seconds
 MyAnswerSentence: [15.5]She [+word_0.5+disword_2.0] took [+word_0.5+rootverb_6] 49 seconds longer [+word_0.5+disword_2.0] . [HOW_NUMBER_NER:4.0 ]
 MyAnswerScore: AnswerScore{recall=1.0, precise=1.0, fmeasure=1.0, myCorrect=2, correctTotal=2, myTotal=2, matchKey='49 seconds'}
 Difficulty: moderate
 Included: Yes rightSentence=94, length = 175, avgRecall = 0.47672550966159993, avgPrecision = 0.3014595835896379, avgFmeasure = 0.32152025726198186
  • 7. WordMatch: TokenScore&RuleScore Question: Why is the Sheldon Kennedy Foundation abandoning its dream of building a ranch for sexually abused children? MyAnswerSentence: [47.5]Troubled by poor business decisions , the Sheldon [+strike_1+word_0.5+disword_1.0] Kennedy [+strike_1+word_0.5+disword_1.0] Foundation [+strike_1+word_0.5+disword_2.0+subj_1+secverb_3] has abandoned [+word_0.5+rootverb_6] its [+strike_1+word_0.5+disword_1.0] dream [+strike_1+word_0.5+disword_2.0+dobj_1+secverb_3] of [+strike_1] building [+strike_1+word_0.5+coreverb_4] a [+strike_1] ranch [+strike_1+word_0.5+disword_0.5] for [+strike_1] sexually [+strike_1+word_0.5+disword_0.125] abused [+strike_1+word_0.5+disword_0.125] children [+strike_1+word_0.5+disword_0.25] and will hand its [+word_0.5+disword_0.5] donations over to the Canadian Red Cross . [WHY_CURRENT_GOOD:2.0 ] Stopword Total Score =Token Score+Rule Score Strike_1: Continuous Match Word Match:0.5 closer dis_word_1.0 Math.pow(2, 2 - dobj_1 root verb_6 coreverb_4: verb but not root further dis_word_0.125 Math.pow(2, 2 - distance) Incomplete Lucene Stopword Why RuleScore
  • 8. WordMatch: TokenScore 1. Generate 1. VerbsInQuestion,OthersInQuestion,rootVerb,subjInQuestion,dObjInQuestion 2. General Word Match Score 1. WORD: “word_0.5” 2. DIS_WORD: “disword_{pow(2,2-distance)}” //unimportant “Mod”, longer distance from the root 3. Continuous Word Match Bonus Score 1. STRIKE: “strike_1” // for every continuous word(not include stopword) 4. Verb Match 1. Verb stopwords no score. //lucene stopwords 2. VerbsInQuestion.contains(“verb”) • AUX_VERB: 0.5 • ROOT_VERB: 6 • COM_VERB: 0.5 • CORE_VERB: 4 5. Noun Match 1. ROOT_VERB: 6 // Root Verb in noun form 2. subjInQuestion != null && matched • SUBJ: 1 // match subj in question. • CORE_SUBJ: 3 // match core_subj in question • SEC_VERB: 3 // verb of this matched noun is also in Question 3. dObjInQuestion != null && matched • DOBJ: 1 // match dobj in question • SEC_VERB: 3 // verb of this matched noun is also in Question Time Limited Just Extract Features, Manual tuning weights Future Work: 1. Learning weights for features 2. dcoref 3. Thesaurus
  • 9. BestSentence: Rule Score WEAK_CLUE=1 CLUE=2 GOOD_CLUE =4 CONFIDENT=6 SLAM_DUNK=20 Only TokenScore in all dataset Average: CorrectSentence/All TokenScore+RuleScore in all dataset Average: CorrectSentence/All Where GOOD_CLUE: WHERE_LOCATION_PREP CLUE: WHERE_LOCATION_NER 86/155=0.5548 93/155=0.6000 When GOOD_CLUE: WHEN_TIME_NER SLAM_DUNK: WHEN_ORDER_TOKEN SLAM_DUNK: WHEN_BEGIN_TOKEN 117/173=0.6763 119/173=0.6879 Why GOOD_CLUE: WHY_REASON_TOKEN CLUE: WHY_CURRENT_GOOD CLUE: WHY_POST_GOOD CLUE: WHY_PRE_GOOD WEAK_CLUE: WHY_LEAD_TOKEN WEAK_CLUE: WHY_THINK_TOKEN WEAK_CLUE: WHY_WANT_TOKEN 70/115=0.6087 71/115=0.6174 Who/ Whose GOOD_CLUE:WHO_PERSON_NER CLUE:WHO_ORGANIZATION_MISC_NER CLUE:WHOSE_PRP_POS 116/187=0.6203 118/187=0.6310 What GOOD_CLUE: WHAT_KIND_TOKEN CLUE: WHAT_DATE_NER SLAM_DUNK: WHAT_NAME_TOKEN 202/311=0.6325 202/311=0.6325 How HOW_SECKEY_K: distance from k to root GOOD_CLUE: HOW_DISTANCE_TEXT 141/232=0.6078 143/232=0.6164 testset1 52/84 What, 35/57 How, 27/44 Where, 25/41 Who, 23/32 When, 16/28 Why … 212/313=0.6773 212/313=0.6773 testset2 57/88 What, 41/50 How, 28/44 Who, 27/40 When, 24/38 Where, 21/31 Why … 215/315=0.6825 221/315=0.7016
  • 10. Summary • Reporting system worked very well, helped to improve precision • System is fairly simple and straight forward • Unfortunately due to Stanford core nlp lib bug, we could not incorporate coreference resolution into our system • Short on time: Tuning weights manually(ML?) Recall and precision still have room for improvement