SlideShare a Scribd company logo
1 of 11
Talking Geckos QA System
Jie Cao
Boya Song
Overview
• Rule based system [Based on Ellen’s Quarc Paper]
• Stanford core nlp lib: POS, sentence split, NER,
dependency relations
• Classify questions according to keyword: what, where,
when, why, how, etc.
• Word Match function
• Rules that award points to each sentence
• Narrow down answer according to key word type
Work Process
Input
wordMatch(Common Component)
(Question,StorySentences)->ScoredSentences
reverseWordMatch(Make Assumption)
correctAnswer->correctSentence
narrowDownAnswer(Why)
correctSentence->correctAnswer
Google: Fast is better than slow
narrowDownAnswer(What)
correctSentence->correctAnswer
narrowDownAnswer(…)
correctSentence->correctAnswer
narrowDownAnswer(How)
correctSentence->correctAnswer
BestSentence(Why)
ScoredSentence->correctSentence
BestSentence(What)
ScoredSentence->correctSentence
BestSentence(…)
ScoredSentence->correctSentence
BestSentence(How)
ScoredSentence->correctSentence
Rule Isolation
Rule Isolation
Developing Task Parallelism
Reports and Testing
Fast Testing
Classify Question Type Rules
WordMatch Rules Object Persistence, Reload
Test all data in less than 1 minute
Pre-annotated Story and Question
Annotated Test Report
Test By Question Type List
Annotated and Scored Report For
Every Question Type
151 Stories,1200 Questions
Precision Improvement
• Used reports and searched for general patterns to narrow down
answer within sentence
• Find continuous named entity tags in answer
• Location. time, and preposition keywords
• Why: substring after “so” or “because”—simple but very effective
• What: substring after core sub, core verb, root verb, or strike of
word contained in question
• Who, Where type of questions are more complicated
Sample Report
QuestionID: 1999-W38-1-9

Question: How many peacekeeping troops does Canada now have in 22 nations around the world?

Answer: about 3,900 | 3,900

SentenceSize: 1

CorrectSentence: [8.5]Right now [+word_0.5] , Canada [+word_0.5] has [+word_0.5+auxverb_0.5] about 3,900 peacekeepers
on 22 [+word_0.5] missions around [+word_0.5] the world [+strike_1+word_0.5] . [HOW_NUMBER_NER:4.0 ]

MyAnswer: 3,900 22

MyAnswerSentence: [8.5]Right now [+word_0.5] , Canada [+word_0.5] has [+word_0.5+auxverb_0.5] about 3,900 peacekeepers
on 22 [+word_0.5] missions around [+word_0.5] the world [+strike_1+word_0.5] . [HOW_NUMBER_NER:4.0 ]

MyAnswerScore: AnswerScore{recall=1.0, precise=0.5, fmeasure=0.6666666666666666, myCorrect=1, correctTotal=1,
myTotal=2, matchKey=' 3,900'}

Difficulty: easy

Included: Yes





QuestionID: 1999-W37-5-3

Question: How much longer did Meiorin take to run 2.5 kilometres than she was supposed to?

Answer: 49 seconds

SentenceSize: 1

CorrectSentence: [15.5]She [+word_0.5+disword_2.0] took [+word_0.5+rootverb_6] 49 seconds longer
[+word_0.5+disword_2.0] . [HOW_NUMBER_NER:4.0 ]

MyAnswer: 49 seconds

MyAnswerSentence: [15.5]She [+word_0.5+disword_2.0] took [+word_0.5+rootverb_6] 49 seconds longer
[+word_0.5+disword_2.0] . [HOW_NUMBER_NER:4.0 ]

MyAnswerScore: AnswerScore{recall=1.0, precise=1.0, fmeasure=1.0, myCorrect=2, correctTotal=2, myTotal=2, matchKey='49
seconds'}

Difficulty: moderate

Included: Yes
rightSentence=94, length = 175, avgRecall = 0.47672550966159993, avgPrecision = 0.3014595835896379, avgFmeasure =
0.32152025726198186
WordMatch: TokenScore&RuleScore
Question: Why is the Sheldon Kennedy Foundation abandoning its dream of building a
ranch for sexually abused children?
MyAnswerSentence: [47.5]Troubled by poor business decisions , the Sheldon
[+strike_1+word_0.5+disword_1.0] Kennedy [+strike_1+word_0.5+disword_1.0]
Foundation [+strike_1+word_0.5+disword_2.0+subj_1+secverb_3] has abandoned
[+word_0.5+rootverb_6] its [+strike_1+word_0.5+disword_1.0] dream
[+strike_1+word_0.5+disword_2.0+dobj_1+secverb_3] of [+strike_1] building
[+strike_1+word_0.5+coreverb_4] a [+strike_1] ranch [+strike_1+word_0.5+disword_0.5]
for [+strike_1] sexually [+strike_1+word_0.5+disword_0.125] abused
[+strike_1+word_0.5+disword_0.125] children [+strike_1+word_0.5+disword_0.25] and will
hand its [+word_0.5+disword_0.5] donations over to the Canadian Red Cross .
[WHY_CURRENT_GOOD:2.0 ]
Stopword
Total Score
=Token Score+Rule Score
Strike_1:
Continuous Match
Word Match:0.5
closer
dis_word_1.0
Math.pow(2, 2 -
dobj_1
root verb_6
coreverb_4:
verb but not root
further dis_word_0.125
Math.pow(2, 2 - distance)
Incomplete
Lucene Stopword
Why RuleScore
WordMatch: TokenScore
1. Generate
1. VerbsInQuestion,OthersInQuestion,rootVerb,subjInQuestion,dObjInQuestion
2. General Word Match Score
1. WORD: “word_0.5”
2. DIS_WORD: “disword_{pow(2,2-distance)}” //unimportant “Mod”, longer distance from the root
3. Continuous Word Match Bonus Score
1. STRIKE: “strike_1” // for every continuous word(not include stopword)
4. Verb Match
1. Verb stopwords no score. //lucene stopwords
2. VerbsInQuestion.contains(“verb”)
• AUX_VERB: 0.5
• ROOT_VERB: 6
• COM_VERB: 0.5
• CORE_VERB: 4
5. Noun Match
1. ROOT_VERB: 6 // Root Verb in noun form
2. subjInQuestion != null && matched
• SUBJ: 1 // match subj in question.
• CORE_SUBJ: 3 // match core_subj in question
• SEC_VERB: 3 // verb of this matched noun is also in Question
3. dObjInQuestion != null && matched
• DOBJ: 1 // match dobj in question
• SEC_VERB: 3 // verb of this matched noun is also in Question
Time Limited
Just Extract Features, Manual tuning weights
Future Work:
1. Learning weights for features
2. dcoref
3. Thesaurus
BestSentence: Rule Score
WEAK_CLUE=1
CLUE=2
GOOD_CLUE =4
CONFIDENT=6
SLAM_DUNK=20
Only TokenScore
in all dataset
Average:
CorrectSentence/All
TokenScore+RuleScore
in all dataset
Average:
CorrectSentence/All
Where GOOD_CLUE: WHERE_LOCATION_PREP
CLUE: WHERE_LOCATION_NER
86/155=0.5548 93/155=0.6000
When
GOOD_CLUE: WHEN_TIME_NER
SLAM_DUNK: WHEN_ORDER_TOKEN
SLAM_DUNK: WHEN_BEGIN_TOKEN
117/173=0.6763 119/173=0.6879
Why
GOOD_CLUE: WHY_REASON_TOKEN
CLUE: WHY_CURRENT_GOOD
CLUE: WHY_POST_GOOD
CLUE: WHY_PRE_GOOD
WEAK_CLUE: WHY_LEAD_TOKEN
WEAK_CLUE: WHY_THINK_TOKEN
WEAK_CLUE: WHY_WANT_TOKEN
70/115=0.6087 71/115=0.6174
Who/
Whose
GOOD_CLUE:WHO_PERSON_NER
CLUE:WHO_ORGANIZATION_MISC_NER
CLUE:WHOSE_PRP_POS
116/187=0.6203 118/187=0.6310
What
GOOD_CLUE: WHAT_KIND_TOKEN
CLUE: WHAT_DATE_NER
SLAM_DUNK: WHAT_NAME_TOKEN
202/311=0.6325 202/311=0.6325
How HOW_SECKEY_K: distance from k to root
GOOD_CLUE: HOW_DISTANCE_TEXT
141/232=0.6078 143/232=0.6164
testset1 52/84 What, 35/57 How, 27/44 Where, 25/41 Who, 23/32 When, 16/28 Why
…
212/313=0.6773 212/313=0.6773
testset2 57/88 What, 41/50 How, 28/44 Who, 27/40 When, 24/38 Where, 21/31 Why … 215/315=0.6825 221/315=0.7016
Summary
• Reporting system worked very well, helped to
improve precision
• System is fairly simple and straight forward
• Unfortunately due to Stanford core nlp lib bug, we
could not incorporate coreference resolution into
our system
• Short on time: Tuning weights manually(ML?) Recall
and precision still have room for improvement
Thanks
Q&A

More Related Content

Similar to Talking Geckos QA system overview and performance

Elasticsearch at Dailymotion
Elasticsearch at DailymotionElasticsearch at Dailymotion
Elasticsearch at DailymotionCédric Hourcade
 
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"epamspb
 
Teaching Constraint Programming, Patrick Prosser
Teaching Constraint Programming,  Patrick ProsserTeaching Constraint Programming,  Patrick Prosser
Teaching Constraint Programming, Patrick ProsserPierre Schaus
 
[INSIGHT OUT 2011] A21 why why is probably the right answer(tom kyte)
[INSIGHT OUT 2011] A21 why why is probably the right answer(tom kyte)[INSIGHT OUT 2011] A21 why why is probably the right answer(tom kyte)
[INSIGHT OUT 2011] A21 why why is probably the right answer(tom kyte)Insight Technology, Inc.
 
Fluent Refactoring (Lone Star Ruby Conf 2013)
Fluent Refactoring (Lone Star Ruby Conf 2013)Fluent Refactoring (Lone Star Ruby Conf 2013)
Fluent Refactoring (Lone Star Ruby Conf 2013)Sam Livingston-Gray
 
QuestionPro Integrates with TryMyUI to Launch the Survey Respondent Score
QuestionPro Integrates with TryMyUI to Launch the Survey Respondent ScoreQuestionPro Integrates with TryMyUI to Launch the Survey Respondent Score
QuestionPro Integrates with TryMyUI to Launch the Survey Respondent ScoreJames Wirth
 
Evolutionary Algorithms: Perfecting the Art of "Good Enough"
Evolutionary Algorithms: Perfecting the Art of "Good Enough"Evolutionary Algorithms: Perfecting the Art of "Good Enough"
Evolutionary Algorithms: Perfecting the Art of "Good Enough"PyData
 
What I learned from Seven Languages in Seven Weeks (IPRUG)
What I learned from Seven Languages in Seven Weeks (IPRUG)What I learned from Seven Languages in Seven Weeks (IPRUG)
What I learned from Seven Languages in Seven Weeks (IPRUG)Kerry Buckley
 
Creating AnswerBot with Keras and TensorFlow (TensorBeat)
Creating AnswerBot with Keras and TensorFlow (TensorBeat)Creating AnswerBot with Keras and TensorFlow (TensorBeat)
Creating AnswerBot with Keras and TensorFlow (TensorBeat)Avkash Chauhan
 
Test First Teaching
Test First TeachingTest First Teaching
Test First TeachingSarah Allen
 
Lazy man's learning: How To Build Your Own Text Summarizer
Lazy man's learning: How To Build Your Own Text SummarizerLazy man's learning: How To Build Your Own Text Summarizer
Lazy man's learning: How To Build Your Own Text SummarizerSho Fola Soboyejo
 
A3 sec -_regular_expressions
A3 sec -_regular_expressionsA3 sec -_regular_expressions
A3 sec -_regular_expressionsa3sec
 

Similar to Talking Geckos QA system overview and performance (13)

Elasticsearch at Dailymotion
Elasticsearch at DailymotionElasticsearch at Dailymotion
Elasticsearch at Dailymotion
 
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"
#ITsubbotnik Spring 2017: Mikhail Khludnev "Search like %SQL%"
 
Teaching Constraint Programming, Patrick Prosser
Teaching Constraint Programming,  Patrick ProsserTeaching Constraint Programming,  Patrick Prosser
Teaching Constraint Programming, Patrick Prosser
 
[INSIGHT OUT 2011] A21 why why is probably the right answer(tom kyte)
[INSIGHT OUT 2011] A21 why why is probably the right answer(tom kyte)[INSIGHT OUT 2011] A21 why why is probably the right answer(tom kyte)
[INSIGHT OUT 2011] A21 why why is probably the right answer(tom kyte)
 
Fluent Refactoring (Lone Star Ruby Conf 2013)
Fluent Refactoring (Lone Star Ruby Conf 2013)Fluent Refactoring (Lone Star Ruby Conf 2013)
Fluent Refactoring (Lone Star Ruby Conf 2013)
 
QuestionPro Integrates with TryMyUI to Launch the Survey Respondent Score
QuestionPro Integrates with TryMyUI to Launch the Survey Respondent ScoreQuestionPro Integrates with TryMyUI to Launch the Survey Respondent Score
QuestionPro Integrates with TryMyUI to Launch the Survey Respondent Score
 
Evolutionary Algorithms: Perfecting the Art of "Good Enough"
Evolutionary Algorithms: Perfecting the Art of "Good Enough"Evolutionary Algorithms: Perfecting the Art of "Good Enough"
Evolutionary Algorithms: Perfecting the Art of "Good Enough"
 
What I learned from Seven Languages in Seven Weeks (IPRUG)
What I learned from Seven Languages in Seven Weeks (IPRUG)What I learned from Seven Languages in Seven Weeks (IPRUG)
What I learned from Seven Languages in Seven Weeks (IPRUG)
 
Creating AnswerBot with Keras and TensorFlow (TensorBeat)
Creating AnswerBot with Keras and TensorFlow (TensorBeat)Creating AnswerBot with Keras and TensorFlow (TensorBeat)
Creating AnswerBot with Keras and TensorFlow (TensorBeat)
 
Test First Teaching
Test First TeachingTest First Teaching
Test First Teaching
 
Varga ha
Varga haVarga ha
Varga ha
 
Lazy man's learning: How To Build Your Own Text Summarizer
Lazy man's learning: How To Build Your Own Text SummarizerLazy man's learning: How To Build Your Own Text Summarizer
Lazy man's learning: How To Build Your Own Text Summarizer
 
A3 sec -_regular_expressions
A3 sec -_regular_expressionsA3 sec -_regular_expressions
A3 sec -_regular_expressions
 

More from jie cao

Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral CodesObserving Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codesjie cao
 
A Comparative Study on Schema-guided Dialog State Tracking
A Comparative Study on Schema-guided Dialog State TrackingA Comparative Study on Schema-guided Dialog State Tracking
A Comparative Study on Schema-guided Dialog State Trackingjie cao
 
Task-oriented Conversational semantic parsing
Task-oriented Conversational semantic parsingTask-oriented Conversational semantic parsing
Task-oriented Conversational semantic parsingjie cao
 
Spark调研串讲
Spark调研串讲Spark调研串讲
Spark调研串讲jie cao
 
Parsing Natural Scenes and Natural Language with Recursive Neural Networks
Parsing Natural Scenes and Natural Language with Recursive Neural NetworksParsing Natural Scenes and Natural Language with Recursive Neural Networks
Parsing Natural Scenes and Natural Language with Recursive Neural Networksjie cao
 
Challenges on Distributed Machine Learning
Challenges on Distributed Machine LearningChallenges on Distributed Machine Learning
Challenges on Distributed Machine Learningjie cao
 

More from jie cao (7)

Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral CodesObserving Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
 
A Comparative Study on Schema-guided Dialog State Tracking
A Comparative Study on Schema-guided Dialog State TrackingA Comparative Study on Schema-guided Dialog State Tracking
A Comparative Study on Schema-guided Dialog State Tracking
 
Task-oriented Conversational semantic parsing
Task-oriented Conversational semantic parsingTask-oriented Conversational semantic parsing
Task-oriented Conversational semantic parsing
 
CCG
CCGCCG
CCG
 
Spark调研串讲
Spark调研串讲Spark调研串讲
Spark调研串讲
 
Parsing Natural Scenes and Natural Language with Recursive Neural Networks
Parsing Natural Scenes and Natural Language with Recursive Neural NetworksParsing Natural Scenes and Natural Language with Recursive Neural Networks
Parsing Natural Scenes and Natural Language with Recursive Neural Networks
 
Challenges on Distributed Machine Learning
Challenges on Distributed Machine LearningChallenges on Distributed Machine Learning
Challenges on Distributed Machine Learning
 

Recently uploaded

Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxdolaknnilon
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 

Recently uploaded (20)

Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 

Talking Geckos QA system overview and performance

  • 1. Talking Geckos QA System Jie Cao Boya Song
  • 2. Overview • Rule based system [Based on Ellen’s Quarc Paper] • Stanford core nlp lib: POS, sentence split, NER, dependency relations • Classify questions according to keyword: what, where, when, why, how, etc. • Word Match function • Rules that award points to each sentence • Narrow down answer according to key word type
  • 3. Work Process Input wordMatch(Common Component) (Question,StorySentences)->ScoredSentences reverseWordMatch(Make Assumption) correctAnswer->correctSentence narrowDownAnswer(Why) correctSentence->correctAnswer Google: Fast is better than slow narrowDownAnswer(What) correctSentence->correctAnswer narrowDownAnswer(…) correctSentence->correctAnswer narrowDownAnswer(How) correctSentence->correctAnswer BestSentence(Why) ScoredSentence->correctSentence BestSentence(What) ScoredSentence->correctSentence BestSentence(…) ScoredSentence->correctSentence BestSentence(How) ScoredSentence->correctSentence Rule Isolation Rule Isolation Developing Task Parallelism
  • 4. Reports and Testing Fast Testing Classify Question Type Rules WordMatch Rules Object Persistence, Reload Test all data in less than 1 minute Pre-annotated Story and Question Annotated Test Report Test By Question Type List Annotated and Scored Report For Every Question Type 151 Stories,1200 Questions
  • 5. Precision Improvement • Used reports and searched for general patterns to narrow down answer within sentence • Find continuous named entity tags in answer • Location. time, and preposition keywords • Why: substring after “so” or “because”—simple but very effective • What: substring after core sub, core verb, root verb, or strike of word contained in question • Who, Where type of questions are more complicated
  • 6. Sample Report QuestionID: 1999-W38-1-9
 Question: How many peacekeeping troops does Canada now have in 22 nations around the world?
 Answer: about 3,900 | 3,900
 SentenceSize: 1
 CorrectSentence: [8.5]Right now [+word_0.5] , Canada [+word_0.5] has [+word_0.5+auxverb_0.5] about 3,900 peacekeepers on 22 [+word_0.5] missions around [+word_0.5] the world [+strike_1+word_0.5] . [HOW_NUMBER_NER:4.0 ]
 MyAnswer: 3,900 22
 MyAnswerSentence: [8.5]Right now [+word_0.5] , Canada [+word_0.5] has [+word_0.5+auxverb_0.5] about 3,900 peacekeepers on 22 [+word_0.5] missions around [+word_0.5] the world [+strike_1+word_0.5] . [HOW_NUMBER_NER:4.0 ]
 MyAnswerScore: AnswerScore{recall=1.0, precise=0.5, fmeasure=0.6666666666666666, myCorrect=1, correctTotal=1, myTotal=2, matchKey=' 3,900'}
 Difficulty: easy
 Included: Yes
 
 
 QuestionID: 1999-W37-5-3
 Question: How much longer did Meiorin take to run 2.5 kilometres than she was supposed to?
 Answer: 49 seconds
 SentenceSize: 1
 CorrectSentence: [15.5]She [+word_0.5+disword_2.0] took [+word_0.5+rootverb_6] 49 seconds longer [+word_0.5+disword_2.0] . [HOW_NUMBER_NER:4.0 ]
 MyAnswer: 49 seconds
 MyAnswerSentence: [15.5]She [+word_0.5+disword_2.0] took [+word_0.5+rootverb_6] 49 seconds longer [+word_0.5+disword_2.0] . [HOW_NUMBER_NER:4.0 ]
 MyAnswerScore: AnswerScore{recall=1.0, precise=1.0, fmeasure=1.0, myCorrect=2, correctTotal=2, myTotal=2, matchKey='49 seconds'}
 Difficulty: moderate
 Included: Yes rightSentence=94, length = 175, avgRecall = 0.47672550966159993, avgPrecision = 0.3014595835896379, avgFmeasure = 0.32152025726198186
  • 7. WordMatch: TokenScore&RuleScore Question: Why is the Sheldon Kennedy Foundation abandoning its dream of building a ranch for sexually abused children? MyAnswerSentence: [47.5]Troubled by poor business decisions , the Sheldon [+strike_1+word_0.5+disword_1.0] Kennedy [+strike_1+word_0.5+disword_1.0] Foundation [+strike_1+word_0.5+disword_2.0+subj_1+secverb_3] has abandoned [+word_0.5+rootverb_6] its [+strike_1+word_0.5+disword_1.0] dream [+strike_1+word_0.5+disword_2.0+dobj_1+secverb_3] of [+strike_1] building [+strike_1+word_0.5+coreverb_4] a [+strike_1] ranch [+strike_1+word_0.5+disword_0.5] for [+strike_1] sexually [+strike_1+word_0.5+disword_0.125] abused [+strike_1+word_0.5+disword_0.125] children [+strike_1+word_0.5+disword_0.25] and will hand its [+word_0.5+disword_0.5] donations over to the Canadian Red Cross . [WHY_CURRENT_GOOD:2.0 ] Stopword Total Score =Token Score+Rule Score Strike_1: Continuous Match Word Match:0.5 closer dis_word_1.0 Math.pow(2, 2 - dobj_1 root verb_6 coreverb_4: verb but not root further dis_word_0.125 Math.pow(2, 2 - distance) Incomplete Lucene Stopword Why RuleScore
  • 8. WordMatch: TokenScore 1. Generate 1. VerbsInQuestion,OthersInQuestion,rootVerb,subjInQuestion,dObjInQuestion 2. General Word Match Score 1. WORD: “word_0.5” 2. DIS_WORD: “disword_{pow(2,2-distance)}” //unimportant “Mod”, longer distance from the root 3. Continuous Word Match Bonus Score 1. STRIKE: “strike_1” // for every continuous word(not include stopword) 4. Verb Match 1. Verb stopwords no score. //lucene stopwords 2. VerbsInQuestion.contains(“verb”) • AUX_VERB: 0.5 • ROOT_VERB: 6 • COM_VERB: 0.5 • CORE_VERB: 4 5. Noun Match 1. ROOT_VERB: 6 // Root Verb in noun form 2. subjInQuestion != null && matched • SUBJ: 1 // match subj in question. • CORE_SUBJ: 3 // match core_subj in question • SEC_VERB: 3 // verb of this matched noun is also in Question 3. dObjInQuestion != null && matched • DOBJ: 1 // match dobj in question • SEC_VERB: 3 // verb of this matched noun is also in Question Time Limited Just Extract Features, Manual tuning weights Future Work: 1. Learning weights for features 2. dcoref 3. Thesaurus
  • 9. BestSentence: Rule Score WEAK_CLUE=1 CLUE=2 GOOD_CLUE =4 CONFIDENT=6 SLAM_DUNK=20 Only TokenScore in all dataset Average: CorrectSentence/All TokenScore+RuleScore in all dataset Average: CorrectSentence/All Where GOOD_CLUE: WHERE_LOCATION_PREP CLUE: WHERE_LOCATION_NER 86/155=0.5548 93/155=0.6000 When GOOD_CLUE: WHEN_TIME_NER SLAM_DUNK: WHEN_ORDER_TOKEN SLAM_DUNK: WHEN_BEGIN_TOKEN 117/173=0.6763 119/173=0.6879 Why GOOD_CLUE: WHY_REASON_TOKEN CLUE: WHY_CURRENT_GOOD CLUE: WHY_POST_GOOD CLUE: WHY_PRE_GOOD WEAK_CLUE: WHY_LEAD_TOKEN WEAK_CLUE: WHY_THINK_TOKEN WEAK_CLUE: WHY_WANT_TOKEN 70/115=0.6087 71/115=0.6174 Who/ Whose GOOD_CLUE:WHO_PERSON_NER CLUE:WHO_ORGANIZATION_MISC_NER CLUE:WHOSE_PRP_POS 116/187=0.6203 118/187=0.6310 What GOOD_CLUE: WHAT_KIND_TOKEN CLUE: WHAT_DATE_NER SLAM_DUNK: WHAT_NAME_TOKEN 202/311=0.6325 202/311=0.6325 How HOW_SECKEY_K: distance from k to root GOOD_CLUE: HOW_DISTANCE_TEXT 141/232=0.6078 143/232=0.6164 testset1 52/84 What, 35/57 How, 27/44 Where, 25/41 Who, 23/32 When, 16/28 Why … 212/313=0.6773 212/313=0.6773 testset2 57/88 What, 41/50 How, 28/44 Who, 27/40 When, 24/38 Where, 21/31 Why … 215/315=0.6825 221/315=0.7016
  • 10. Summary • Reporting system worked very well, helped to improve precision • System is fairly simple and straight forward • Unfortunately due to Stanford core nlp lib bug, we could not incorporate coreference resolution into our system • Short on time: Tuning weights manually(ML?) Recall and precision still have room for improvement