Instant Question Answering

Dhwaj Raj
What is Instant Question Answering?
● A user asks a question in text form, and the InstantQA system
automatically retrieves or formulates an answer and presents it
back to the user instantly.
Why Instant Question Answering?
● In spite of the continuous progress of search engines, many user
needs still remain unanswered.
● Community Question Answering (e.g. the AnA platform) can feature
factoid questions, but its primary goal is to satisfy needs such as
opinion seeking, recommendation, open-ended questions and problem
solving.
● In community question answering, users have to wait for the answers
they seek, even if the question is very simple and asks for a mere fact.
● Better user experience: why browse through search result listings
or related questions when the information can be served upfront?
Why Instant Question Answering?
● CASE: SHIKSHA.COM
    ● Top domains being searched, based on both query logs and data
    availability with listings: fees, duration, seats, application
    date, application URL, affiliation, approval, entrance exams,
    placement companies and job salaries.
    ● A high number of fact-type questions, which can be targeted.
    We are not targeting opinion-based or open-ended questions.
    ● 23% of the questions in a random sample of 1.15L (1.15 lakh)
    belong to these 10 domains.
Is it something similar to the AnA platform?
● Our organization has a discussion forum called the AnA (Ask and
Answer) platform.
● As of now, InstantQA has no relation whatsoever to the current AnA
forum content, and no direct use case with it.
What kind of questions do we target?
● What is the price of X?
● When is the last date of Y?
● How much is the fee for W?
● What is the fee for W?
● Which companies hire from campus Q?
● How is the placement at Z?
● Is Z college in Delhi? (transform to where)
● What is the meaning of life, the universe and everything?
● I do not feel like studying, what to do?
● Will I get admission in Z?
● How to improve my career?
● Should I invest in Noida?
● I have purchased X project, should I sell it now or hold?
● Is it beneficial to buy a 2BHK for 30 lacs?
What kind of questions do we target?

FACTOIDS
● What is the price of X?
● When is the last date of Y?
● How much is the fee for W?
● What is the fee for W?
● Which companies hire from campus Q?
● How is the placement at Z?
● Is Z college in Delhi? (transform to where)

Open ended. Not definite.
● What is the meaning of life, the universe and everything?
● I do not feel like studying, what to do?
● Will I get admission in Z?
● How to improve my career?
● Should I invest in Noida?
● I have purchased X project, should I sell it now or hold?
● Is it beneficial to buy a 2BHK for 30 lacs?
What is the very basic approach to
instant question answering?
● General architecture

question (e.g. What is Calvados?)
→ Question Classification and Analysis
    /Q is /A, where /Q = “Calvados”
→ Information Retrieval
    Query = “Calvados is”
    Text retrieval = “…Calvados is often used in cooking…
    Calvados is a dry apple brandy made in…”
→ Answer Extraction
    /A is: a dry apple brandy
→ answer
    Answer: /Q is /A: “Calvados” is “a dry apple brandy”
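The three boxes above can be sketched in a few lines of Python. This is only a minimal illustration of the /Q is /A idea, not the system described in these slides: the two-sentence `COLLECTION`, the function names and the rule that a definition answer starts with "a", "an" or "the" are all assumptions made for the example.

```python
import re

# Hypothetical two-sentence collection, shortened from the retrieved text on the slide.
COLLECTION = [
    "Calvados is often used in cooking.",
    "Calvados is a dry apple brandy.",
]


def analyse(question):
    """Question classification and analysis: map 'What is X?' to the focus /Q."""
    m = re.fullmatch(r"what is (.+?)\??", question.strip(), flags=re.IGNORECASE)
    return m.group(1) if m else None


def retrieve(focus, collection):
    """Information retrieval: keep sentences containing the query '<focus> is'."""
    query = f"{focus} is".lower()
    return [s for s in collection if query in s.lower()]


def extract(focus, sentences):
    """Answer extraction: fill /A in '/Q is /A'.

    Assumption: a definition answer starts with a determiner (a, an, the).
    """
    pattern = re.compile(
        rf"{re.escape(focus)} is ((?:a|an|the) [^.]+)", re.IGNORECASE
    )
    for sentence in sentences:
        m = pattern.search(sentence)
        if m:
            return m.group(1).strip()
    return None


def answer(question, collection=COLLECTION):
    """Run the three stages and render the answer as '/Q is /A'."""
    focus = analyse(question)
    if focus is None:
        return None  # not a question type this sketch handles
    found = extract(focus, retrieve(focus, collection))
    return None if found is None else f"{focus} is {found}"
```

Calling `answer("What is Calvados?")` walks the same path as the diagram. An open-ended question such as "Should I invest in Noida?" falls through at the analysis stage and returns `None`.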
If it is so simple, why haven't you
done it already?
There are challenges in QA!
● Quality of text data.
● Language variability (paraphrase).
● Knowledge base domain: the answer has to be supported by the
collection, not by the current state of the world.
● How to locate the information given the question keywords.
● It is unlikely that a system will have all necessary resources
pre-computed.
● The task requires some deduction or extra-linguistic knowledge.
● How does a reasoning system find relevant pieces of information?
Do we have any prior research to
tackle these challenges?
QA research
● Well established over two decades
● TREC (Text REtrieval Conference)
    ● funded by NIST/DARPA since 1992
    ● QA track 1999 – 2007, directed at ‘Factoids’
● CLEF (Cross Language Evaluation Forum)
    ● 2001 – current
    ● Information Retrieval, language resources
● NTCIR (NII Test Collection for IR Systems)
    ● 1997 – current
    ● IR, question answering, summarization, extraction

Our Literature Survey can be accessed at:
http://svn.infoedge.com:8080/Common_Engineering_Projects_Trac/wiki/instant_question_answering#LiteratureSurvey
OK, the investigation is done.
But how do we actually do it?
Knowledge base generation (PHASE 1)
Knowledge base generation: Example (PHASE 1)
● Fact sentence: The fees for Btech course in IIT D is 24000 INR.
● Tagged sentence: The <<fees>> for <<Btech>> course in <<IIT D>> is
<<24000 INR>>.
● Extracted fact: Fees, Btech, IIT D, 24000
● Generated questions:
    – What is the fees of Btech course at IIT Delhi?
    – How much is the fees for Btech course from IIT Delhi?
    – How many INR is the fees of Btech from IIT Delhi?
    – What ….........
● Index: Btech, iit d, fees, 24000, INR
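As a rough sketch of the step from the tagged sentence to the index entry, the snippet below pulls the `<<...>>` spans out of the example and turns them into a record. The field names and the assumed tag order (attribute, course, institute, value) are hypothetical; the slides do not specify the index schema.

```python
import re

TAG = re.compile(r"<<(.+?)>>")


def extract_fact(tagged_sentence):
    """Return the <<...>> spans of a tagged fact sentence, in order."""
    return TAG.findall(tagged_sentence)


def index_record(tagged_sentence):
    """Build an index record from a tagged sentence.

    Assumption: the four tags appear in the order
    attribute, course, institute, value, as in the slide example.
    """
    attribute, course, institute, value = extract_fact(tagged_sentence)
    amount, unit = value.split()
    return {
        "course": course.lower(),
        "institute": institute.lower(),
        "attribute": attribute.lower(),
        "value": amount,
        "unit": unit,
    }


tagged = "The <<fees>> for <<Btech>> course in <<IIT D>> is <<24000 INR>>."
record = index_record(tagged)
```

Here `record` holds the same five items as the index line on the slide (btech, iit d, fees, 24000, INR), keyed by the hypothetical field names.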
Answer Retrieval
Answer Retrieval: Example
● Already indexed knowledge base. Trained once at startup.
● Question: How much will I pay for btech from IIT D?
● Tagged question: How much will I <<pay for>> <<btech>> from
<<IIT D>>?
● Analysis:
    – Focus: How much
    – Object: Pay
    – Class: quantity to pay, fees
● Rank and prune the best answer based on collective match.
● Consistency checks
● Candidate answers:
    – You should pay 24000 INR for Btech from IIT D.
    – The fees for Btech from IITD is 24000 INR.
    – 24000 INR should be paid for Btech from IIT D.
So many boxes!!
Let us check out the major components in
brief.
A.1. Fact phrase generator from
structured listings
● Structured listing to factoid text.
● No need to rely only on user-generated sentences.
● Use basic language model techniques to create sentences from
templates.
<doc>
…..
<college_name>iit</college_name>
<college_id>13213</college_id>
<fee>54000 inr annual</fee>
<location>delhi</location>
…....
</doc>

Language Model

Fee of iit delhi is 54000 inr annual.
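A minimal sketch of this step, assuming a plain template fill stands in for the "Language Model" box: the listing is parsed with the standard library XML parser and each known attribute is rendered through a sentence template. The `TEMPLATES` table and the function name are illustrative, and the listing is the slide's example with the elided lines left out.

```python
import xml.etree.ElementTree as ET

# The slide's listing, without the elided fields.
LISTING = """<doc>
  <college_name>iit</college_name>
  <college_id>13213</college_id>
  <fee>54000 inr annual</fee>
  <location>delhi</location>
</doc>"""

# Hypothetical template table: one sentence template per listing attribute.
TEMPLATES = {
    "fee": "Fee of {college_name} {location} is {fee}.",
}


def fact_phrases(xml_text, templates=TEMPLATES):
    """Render one factoid sentence per attribute that has a template and a value."""
    doc = ET.fromstring(xml_text)
    fields = {child.tag: (child.text or "").strip() for child in doc}
    phrases = []
    for attribute, template in templates.items():
        if fields.get(attribute):
            # Unused fields (e.g. college_id) are simply ignored by format().
            phrases.append(template.format(**fields))
    return phrases
```

With the listing above, `fact_phrases(LISTING)` yields the sentence shown on the slide. Adding a new attribute such as duration or seats would only need a new entry in `TEMPLATES`.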
A.2. Template Generator
● Start with identifying:
    – Answer type
    – Entities in focus
    – Part of Speech tags
● With these tags and language grammar rules, a factoid sentence can
be converted into all possible question forms (the Question
Generation, QG, task).

Example: Fee of iit delhi is 54000 inr annually.
● Analysis:
    – Answer type: quantity
    – Focus: fee
    – Entity: iit + delhi
    – POS tags etc.
● Templates:
    – Fee of <II> <LL> is <$$>.
    – Fees of <II> <LL> is <$$>.
    – Cost of <II> <LL> is <$$>.
● Generated questions:
    – What is the fee of iit delhi annually?
    – What is the fee of iit delhi
    – How much is the fee of iit delhi?
    – Is fee of iit delhi 54000 inr?
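The question generation step can be illustrated with a small table of question forms per answer type. This is a sketch under the assumption that the answer type, focus, entity and value have already been identified; the `QUESTION_FORMS` table is hypothetical and much smaller than what grammar rules would produce.

```python
# Hypothetical question forms, keyed by answer type.
QUESTION_FORMS = {
    "quantity": [
        "What is the {focus} of {entity}?",
        "How much is the {focus} of {entity}?",
        "Is {focus} of {entity} {value}?",
    ],
}


def generate_questions(answer_type, focus, entity, value):
    """Expand one analysed factoid into the question forms for its answer type."""
    forms = QUESTION_FORMS.get(answer_type, [])
    return [form.format(focus=focus, entity=entity, value=value) for form in forms]


questions = generate_questions("quantity", "fee", "iit delhi", "54000 inr")
```

For the slide's example this produces three of the listed questions. Synonym variants (fee, fees, cost) could be covered by calling the function once per focus word.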
B.1. Text Preprocessing
● Short forms
    – i’m, im, i m → i am
    – can’t, cant, can t → can not
● Spelling correction
● Repeated punctuation (!!!, ???, …)
● Smilies
● Salutations (Hi all, Hiya, etc.)
● Names, signature, course codes
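Several of these clean-up steps can be sketched with regular expressions. The patterns below are assumptions chosen to cover the examples on the slide and are deliberately small; spelling correction and the removal of names, signatures and course codes are left out because they need dictionaries or domain data.

```python
import re

# Short forms from the slide, with straight or curly apostrophes.
SHORT_FORMS = [
    (re.compile(r"\bi(?:['’]|\s)?m\b", re.IGNORECASE), "i am"),
    (re.compile(r"\bcan(?:['’]|\s)?t\b", re.IGNORECASE), "can not"),
]
SALUTATION = re.compile(r"^\s*(?:hi all|hiya|hi|hello)\b[\s,!.]*", re.IGNORECASE)
SMILEY = re.compile(r"[:;]-?[)(DP]")        # e.g. :) ;-) :D (assumed inventory)
REPEATED_PUNCT = re.compile(r"([!?.])\1+")  # !!! -> !, ??? -> ?, ... -> .


def normalise(text):
    """Apply the preprocessing steps in a fixed order and tidy whitespace."""
    text = SALUTATION.sub("", text)
    text = SMILEY.sub("", text)
    text = REPEATED_PUNCT.sub(r"\1", text)
    for pattern, replacement in SHORT_FORMS:
        text = pattern.sub(replacement, text)
    return re.sub(r"\s+", " ", text).strip()


cleaned = normalise("Hi all, im looking for mba fees!!! cant find it :)")
```

The smiley pattern is intentionally naive and would also hit text such as "fees:Delhi", so a production version would need a more careful rule.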
B.2. Entity and POS Tagger
● QER (query entity recognition)
    – Names, locations etc.
● Part of Speech Tagger using word sequence patterns
    – Sequence (noun, verbs, auxiliaries, modifiers)
● Phrase Chunker
● Dependency parsing: validate tag relationships
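For the entity recognition part, a dictionary lookup is the simplest possible stand-in. The sketch below is not the tagger described here; it assumes a hand-made `GAZETTEER` of known entities and marks them with the `<<...>>` notation used in the earlier examples. POS tagging, chunking and dependency parsing are not covered.

```python
import re

# Hypothetical gazetteer: surface form (lower case) -> entity type.
GAZETTEER = {
    "iit d": "institute",
    "btech": "course",
    "fees": "attribute",
}


def tag_entities(text, gazetteer=GAZETTEER):
    """Mark known entities with <<...>> and return them with their types."""
    # Longest names first, so multi-word entities win over shorter ones.
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\b", re.IGNORECASE
    )
    entities = [
        (m.group(1), gazetteer[m.group(1).lower()]) for m in pattern.finditer(text)
    ]
    marked = pattern.sub(lambda m: f"<<{m.group(1)}>>", text)
    return marked, entities


marked, entities = tag_entities("The fees for Btech course in IIT D is 24000 INR.")
```

The value "24000 INR" stays untagged in this sketch; recognising quantities would need an extra numeric pattern.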
B.3. Question Analysis
● Create features to be used during answer extraction
● Identify keywords to be matched in document sentences
● Identify the answer type to match answer candidates. We can create
an inventory of questions and expected answer types, and so train a
classifier
    – Quantity?
    – Dates?
    – Definition?
● Select a list of useful patterns from a pattern repository
● Identify question relations which may be used for sentence
analysis, etc.
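The answer type and keyword steps might look like the sketch below. Hand-written rules are used here in place of the trained classifier mentioned above, purely to keep the example self-contained; the rule list and the stopword set are assumptions.

```python
import re

# Ordered rules: the first matching rule decides the answer type.
ANSWER_TYPE_RULES = [
    ("quantity", re.compile(r"^(how much|how many)\b|\b(fee|fees|price|cost)\b")),
    ("date", re.compile(r"^when\b|\blast date\b")),
    ("definition", re.compile(r"^what is\b")),
]
STOPWORDS = {
    "a", "an", "the", "is", "of", "for", "from", "at", "in",
    "what", "when", "how", "much", "many", "i", "will",
}


def analyse_question(question):
    """Return the expected answer type and the keywords to match in sentences."""
    q = question.lower().strip()
    answer_type = next(
        (label for label, rule in ANSWER_TYPE_RULES if rule.search(q)), None
    )
    keywords = [w for w in re.findall(r"[a-z0-9]+", q) if w not in STOPWORDS]
    return {"answer_type": answer_type, "keywords": keywords}
```

Questions that match no rule get `answer_type` set to `None`, which is one simple way to route open-ended questions away from the factoid pipeline.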
B.4. Query Formulation
● The question needs to be transformed into a query to the document
retrieval system
● Each IR system has its own query language, so we need to perform
this mapping
● Identify useful keywords; use the type of answer sought, entities
to boost, etc.
● Query creation: ordered terms, combined terms, weighted terms.
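As an illustration of the mapping, the sketch below builds a query string in Lucene-style syntax, where a quoted phrase keeps term order and `^` adds a boost. The choice of target syntax, the `qtype` field name and the default boost are assumptions; the slides do not name the IR system.

```python
def build_query(keywords, entities=(), answer_type=None, entity_boost=2.0):
    """Combine boosted entity phrases, plain keywords and an answer-type filter."""
    parts = []
    for entity in entities:
        # Ordered, weighted term: an exact phrase with a boost.
        parts.append(f'"{entity}"^{entity_boost:g}')
    entity_words = {word for entity in entities for word in entity.lower().split()}
    for keyword in keywords:
        # Skip keywords already covered by an entity phrase.
        if keyword.lower() not in entity_words:
            parts.append(keyword)
    if answer_type:
        parts.append(f"qtype:{answer_type}")  # hypothetical index field
    return " ".join(parts)


query = build_query(
    ["fee", "btech", "iit", "d"], entities=["iit d", "btech"], answer_type="quantity"
)
```

Keeping the builder separate from question analysis means only this function changes if the retrieval system, and with it the query language, is swapped.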
B.5. Answer Candidate Searcher
● Index the <question, qtypes, entities, answer template> tuples in
a training corpus
● Retrieve a set of n <question, qtypes, entities, answer template>
tuples given a new question
● Decide on the best answer to the new question based on the scores
of the answers returned
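A toy version of the candidate searcher is sketched below. It assumes a tiny in-memory `INDEX`, scores candidates by token overlap plus a bonus for a matching question type, and returns an answer only above a threshold. The scoring formula, the bonus and the threshold are illustrative choices, and the entities field is left out for brevity.

```python
import re

# Hypothetical index entries built from generated questions.
INDEX = [
    {
        "question": "what is the fees of btech course at iit delhi",
        "qtype": "quantity",
        "answer": "The fees for Btech course in IIT D is 24000 INR.",
    },
    {
        "question": "when is the last date of y",
        "qtype": "date",
        "answer": "The last date of Y is <date>.",
    },
]


def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(new_question, qtype, record):
    """Jaccard overlap of question tokens, plus 0.5 if the question types agree."""
    a, b = tokens(new_question), tokens(record["question"])
    overlap = len(a & b) / len(a | b) if a | b else 0.0
    return overlap + (0.5 if qtype == record["qtype"] else 0.0)


def search(new_question, qtype, index=INDEX, n=3):
    """Retrieve the n best-scoring records for a new question."""
    ranked = sorted(index, key=lambda r: score(new_question, qtype, r), reverse=True)
    return ranked[:n]


def best_answer(new_question, qtype, index=INDEX, threshold=0.6):
    """Return the top answer only if its score clears the threshold."""
    top = search(new_question, qtype, index, n=1)
    if top and score(new_question, qtype, top[0]) >= threshold:
        return top[0]["answer"]
    return None
```

The threshold is what lets the system stay silent on questions it cannot support from the collection, rather than returning the nearest unrelated fact.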
Phew....!
Where do we need Natural
Language Processing?
● Tokenisation (words, numbers, punctuation, whitespace)
● Sentence detection
● Part of speech tagging (verbs, nouns, pronouns, etc.)
● Query entity recognition
● Chunking/Parsing (noun/verb phrases and relationships)
● Statistical modelling tools
● Dictionaries, word-lists, WordNet, VerbNet
● Template generation using grammar rules.
So you are telling me there
are ready-made NLP tools?
NLP tools problems
● Training data issues
    ● Training domains are completely different.
    ● Local English language: slang, spelling, localisation
● Sentence detection failures:
    ● Bad style (capitalisation, punctuation)
    ● Ellipsis (i tried... it failed... error message...)
● Tokenisation failures:
    ● Multiple punctuation ???, !!! (student emphasis)
    ● Abbreviations (im, m.b.a, cant, doesnt, etc.)
● POS errors
    ● Spelling, grammar

We need to experiment, modify code and train
on our domain data!
What are the use cases of instant QA?
How does it fit in our system?
Interaction
● If users are not writing good English, then try to minimize their
writing. We can focus on capturing user intent with the least amount
of typed text.
    ✔ Auto complete
    ✔ Guidance
    ✔ Spell check
    ✔ Auto correct
    ✔ Manual feedback on conflicts
    ✔ Make them write good queries
● This not only helps the user experience but also increases the
accuracy of language-based statistical systems.
Shiksha: main search
& cafe search
Shiksha: Integration with main
search auto-suggestor

We will already generate
good quality questions.
These could be integrated here.
99acres
● Similar use cases to Shiksha.
● The real estate domain has more open-ended opinion questions and
far fewer factoid questions.
● If a single text box search is introduced in future:
    – SRP can cater not only listings but also question answers
    – Instant QA would be really helpful for the user experience.
And many more use cases …

Plus, some components of this system will be used separately to
improve other existing systems.
Thank you.

Instant Question Answering System

  • 2. What is Instant Question Answering? User asks a question in text format and the instantQA system automatically retrieves or formulates an answer and presents it back to the user, instantly. ●
  • 3. Why Instant Question Answering? ● ● ● ● In spite of the continuous progress of search engines, many of users’ needs still remain unanswered. While Community Question Answering (e.g. AnA platform) can feature factoid questions but their primary goal is to satisfy needs such as: Opinion seeking, Recommendation, Open-ended questions, Problem solving. In community question answering user has to wait for answers which he seeks, even if question is very simple and a mere fact. Better User Experience : Why browse through search result listings or related questions when information can be catered upfront.
  • 4. Why Instant Question Answering? ● CASE : SHIKSHA.COM ● ● ● Top domains being searched based on Both query logs and data availability with listings: fees, duration, seats, application date, application url, affiliation, approval, entrance exams, placement companies and job salaries. High number of Fact type questions, which can be targeted, although we are not targeting opinion based or open ended questions. 23% of questions belong to these 10 domains out of 1.15L random sample.
  • 5. Is it something similar to AnA platform? ● ● Our organization have a discussion forum called as AnA(Ask and Answer) platform. InstantQA has no relation what so ever and no direct usecase with the current AnA forum contents, as of now.
  • 6. What kind of questions we target? ● What is the price of X? ● When is the last date of Y? ● How much is the fee for W? ● What is the fee for W? ● ● What is meaning of life, universe and everything? I do not feel like studying, what to do? ● Which company hire from campus Q? Will I get admission in Z? ● How to improve my career? ● ● ● Should I invest in noida? How is the placement at Z? ● ● Is Z college in Delhi? (transform to where) ● I have purchased X project, should I sell it now or hold? Is it beneficial to buy 2bhk in 30 lacs?
  • 7. What kind of questions we target? ● When is the last date of Y? ID How much is the fee for W? ● What is the fee for W? TO ● Which company hire from campus Q? FA C ● ● ● How is the placement at Z? Is Z college in Delhi? (transform to where) ● What is meaning of life, universe and everything? O N pe ot n de en fin de ite d. What is the price of X? S ● ● I do not feel like studying, what to do? ● Will I get admission in Z? ● How to improve my career? ● Should I invest in noida? ● ● I have purchased X project, should I sell it now or hold? Is it beneficial to buy 2bhk in 30 lacs?
  • 8.
  • 9. What is the very basic approach to instant question answering? ● General architecture: Question (e.g. “What is Calvados?”) → Question Classification and Analysis (pattern “/Q is /A”, where /Q = “Calvados”) → Information Retrieval (query = “Calvados is”; retrieved text = “…Calvados is often used in cooking… Calvados is a dry apple brandy made in…”) → Answer Extraction (/A is: “a dry apple brandy”) → Answer: /Q is /A: “Calvados” is “a dry apple brandy”
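The Calvados walk-through on this slide can be sketched in a few lines of Python. This is only a minimal illustration of the "/Q is /A" pattern, not the system's actual implementation: the function names, the in-memory passage list, and the `BOUNDARY` word list used to cut off the answer phrase are all assumptions made for the example.

```python
import re

# Hypothetical list of words that end the answer noun phrase (an assumption).
BOUNDARY = r"(?:made|that|which|from|in|of)"


def parse_question(question):
    """Question analysis: map 'What is X?' to the focus /Q = X."""
    m = re.match(r"\s*what\s+is\s+(.+?)\s*\?*\s*$", question, re.IGNORECASE)
    return m.group(1) if m else None


def retrieve(q, passages):
    """Information retrieval: keep passages containing the query '<Q> is'."""
    query = f"{q} is".lower()
    return [p for p in passages if query in p.lower()]


def extract_answer(q, passages):
    """Answer extraction: '<Q> is <article ...>' up to a boundary word."""
    pattern = re.compile(
        re.escape(q)
        + r"\s+is\s+((?:a|an|the)\b.*?)(?=\s+" + BOUNDARY + r"\b|[.,;…]|$)",
        re.IGNORECASE,
    )
    for passage in passages:
        m = pattern.search(passage)
        if m:
            return m.group(1).strip()
    return None


def answer_question(question, passages):
    q = parse_question(question)
    if q is None:
        return None
    return extract_answer(q, retrieve(q, passages))


PASSAGES = [
    "…Calvados is often used in cooking…",
    "Calvados is a dry apple brandy made in…",
]
print(answer_question("What is Calvados?", PASSAGES))
```

Requiring the answer to start with an article is what filters out "is often used in cooking"; a real system would rely on chunking or parsing rather than a fixed boundary list.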
  • 10. If it is so simple, why haven't you done it already?
  • 11. There are challenges in QA! ● Quality of text data. ● Language variability (paraphrase). ● Knowledge base domain: the answer has to be supported by the collection, not by the current state of the world. ● How to locate the information given the question keywords. ● It is unlikely that a system will have all necessary resources pre-computed. ● The task requires some deduction or extra linguistic knowledge. ● How does a reasoning system find relevant pieces of information?
  • 12. Do we have any prior research to tackle these challenges?
  • 13. QA research ● Well established over two decades ● TREC (Text REtrieval Conference) – funded by NIST/DARPA since 1992 – QA track 1999 – 2007, directed at ‘Factoids’ ● CLEF (Cross Language Evaluation Forum) – 2001 – current – Information Retrieval, language resources ● NTCIR (NII Test Collection for IR Systems) – 1997 – current – IR, question answering, summarization, extraction ● Our Literature Survey can be accessed at : http://svn.infoedge.com:8080/Common_Engineering_Projects_Trac/wiki/instant_question_answering#LiteratureSurvey
  • 14. Ok investigation is done. But how to do it actually?
  • 17. Knowledge base generation: Example (PHASE 1) ● Fact sentence: The fees for Btech course in IIT D is 24000 INR. ● Tagged: The <<fees>> for <<Btech>> course in <<IIT D>> is <<24000 INR>>. ● Extracted: Fees, Btech, IIT D, 24000 ● Generated questions: What is the fees of Btech course at IIT Delhi? How much is the fees for Btech course from IIT Delhi? How many INR is the fees of btech from iit delhi? What …......... ● Index: Btech, iit d, fees, 24000, INR
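As a rough sketch of this indexing step, the snippet below pulls the `<<...>>` spans out of a tagged fact sentence and stores them in a small inverted index. The function names and the dictionary-based index are assumptions made for illustration; the slides do not say which indexing engine is used.

```python
import re
from collections import defaultdict

TAG = re.compile(r"<<(.+?)>>")


def extract_fact(tagged_sentence):
    """Return the tagged spans of one sentence as a lower-cased tuple."""
    return tuple(span.strip().lower() for span in TAG.findall(tagged_sentence))


def build_index(tagged_sentences):
    """Map every tagged term to the ids of the facts that contain it."""
    facts = []
    index = defaultdict(set)
    for sentence in tagged_sentences:
        fact = extract_fact(sentence)
        fact_id = len(facts)
        facts.append(fact)
        for term in fact:
            index[term].add(fact_id)
    return facts, index


facts, index = build_index(
    ["The <<fees>> for <<Btech>> course in <<IIT D>> is <<24000 INR>>."]
)
print(facts[0])
print(sorted(index))
```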
  • 19. Answer Retrieval: Example ● Already indexed knowledge base. Trained once at startup. ● Question: How much will I pay for btech from IIT D? ● Tagged: How much will I <<pay for>> <<btech>> from <<IIT D>>? ● Focus: How much. Object: Pay. Class: quantity to pay, fees ● Rank and prune best answer based on collective match. ● Consistency checks ● Answers: You should pay 24000 INR for Btech from IIT D. The fees for Btech from IIT D is 24000 INR. 24000 INR should be paid for Btech from IIT D.
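One possible reading of "rank and prune best answer based on collective match" is sketched below: each indexed fact is scored by how many tagged question terms it matches, and an answer is returned only when a single fact clearly wins. The `CLASS_MAP` table, the fact schema, and the second fact with its placeholder value are hypothetical and exist only to make the example runnable.

```python
import re

TAG = re.compile(r"<<(.+?)>>")

# Hypothetical mapping from question objects to indexed classes (an assumption).
CLASS_MAP = {"pay for": "fees", "cost": "fees", "fee": "fees"}

# Illustrative knowledge base; the second entry is a made-up placeholder.
FACTS = [
    {"attribute": "fees", "course": "btech", "institute": "iit d", "value": "24000 INR"},
    {"attribute": "fees", "course": "mba", "institute": "iit d", "value": "<placeholder>"},
]


def analyse(tagged_question):
    """Collect tagged terms and normalise them to indexed classes."""
    terms = [t.strip().lower() for t in TAG.findall(tagged_question)]
    return {CLASS_MAP.get(t, t) for t in terms}


def rank(terms, facts):
    """Score each fact by the number of question terms it matches."""
    scored = []
    for fact in facts:
        fields = {fact["attribute"], fact["course"], fact["institute"]}
        scored.append((len(terms & fields), fact))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


def best_answer(tagged_question, facts):
    scored = rank(analyse(tagged_question), facts)
    if not scored or scored[0][0] == 0:
        return None
    top_score, top = scored[0]
    # Simple consistency check: refuse to answer when the top score is tied.
    if len(scored) > 1 and scored[1][0] == top_score:
        return None
    return (
        f"The {top['attribute']} for {top['course']} "
        f"from {top['institute']} is {top['value']}."
    )


print(best_answer("How much will I <<pay for>> <<btech>> from <<IIT D>>?", FACTS))
```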
  • 20. So many boxes! Let us check out the major components in brief.
  • 21. A.1. Fact phrase generator from structured listings ● Structured listing to factoid text. ● No need to rely only on user generated sentences. ● Use basic language model techniques to create sentences from templates. ● Example: <doc> ….. <college_name>iit</college_name> <college_id>13213</college_id> <fee>54000 inr annual</fee> <location>delhi</location> ….... </doc> → Language Model → Fee of iit delhi is 54000 inr annual.
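A minimal sketch of turning a structured listing into a factoid sentence is shown below. It swaps the slide's "Language Model" box for plain string templates, which is a simplification; the `TEMPLATES` table and the function name are assumptions, and the listing omits the elided fields of the original document.

```python
import xml.etree.ElementTree as ET

# Listing fields copied from the slide; elided fields are left out.
LISTING = """<doc>
  <college_name>iit</college_name>
  <college_id>13213</college_id>
  <fee>54000 inr annual</fee>
  <location>delhi</location>
</doc>"""

# Hypothetical sentence templates, keyed by the listing field they describe.
TEMPLATES = {"fee": "Fee of {college_name} {location} is {fee}."}


def generate_facts(xml_text, templates=TEMPLATES):
    """Fill one sentence template per attribute present in the listing."""
    root = ET.fromstring(xml_text)
    fields = {child.tag: (child.text or "").strip() for child in root}
    facts = []
    for attribute, template in templates.items():
        if not fields.get(attribute):
            continue  # attribute missing or empty in this listing
        try:
            facts.append(template.format(**fields))
        except KeyError:
            continue  # template needs a field this listing does not have
    return facts


print(generate_facts(LISTING))
```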
  • 22. A.2. Template Generator ● Start with identifying: – Answer type – Entities in focus – Part of speech tags ● With these tags and language grammar rules, a factoid/sentence can be converted into all possible question forms (Question Generation, QG, task). ● Example: Fee of iit delhi is 54000 inr annually. – Answer type: quantity; focus: fee; entity: iit + delhi; POS tags etc. – Templates: Fee of <II> <LL> is <$$>. Fees of <II> <LL> is <$$>. Cost of <II> <LL> is <$$>. – Questions: What is the fee of iit delhi annually? What is the fee of iit delhi? How much is the fee of iit delhi? Is fee of iit delhi 54000 inr?
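The question generation step can be illustrated with a small template-based sketch. It assumes the fact follows the "Fee of <II> <LL> is <$$>." shape from the slide and that each entity slot is a single token; the regular expression and the question templates are illustrative, not the grammar rules the deck refers to.

```python
import re

# Matches "Fee|Fees|Cost of <II> <LL> is <$$>." with single-token entity slots.
FACT_PATTERN = re.compile(
    r"^(?P<focus>fees?|cost) of (?P<II>\S+) (?P<LL>\S+) is (?P<value>.+?)\.?$",
    re.IGNORECASE,
)

# Hypothetical question templates for a quantity-type answer.
QUESTION_TEMPLATES = [
    "What is the {focus} of {II} {LL}?",
    "How much is the {focus} of {II} {LL}?",
    "Is {focus} of {II} {LL} {value}?",
]


def generate_questions(fact):
    """Convert one factoid sentence into its question forms."""
    m = FACT_PATTERN.match(fact.strip())
    if not m:
        return []
    slots = {name: text.lower() for name, text in m.groupdict().items()}
    return [template.format(**slots) for template in QUESTION_TEMPLATES]


for question in generate_questions("Fee of iit delhi is 54000 inr annually."):
    print(question)
```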
  • 23. B.1. Text Preprocessing ● Short-forms – i’m, im, i m → i am – can’t, cant, can t → can not ● Spelling correction ● Repeated punctuation (!!!, ???, …) ● Smilies ● Salutations (Hi all, Hiya, etc.) ● Names, signature, course codes
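Several of these preprocessing steps can be sketched with regular expressions. The patterns below are deliberately small and are assumptions for illustration (only the two short-forms from the slide, a few salutations, and simple ASCII smilies); spelling correction and name/signature removal are not attempted.

```python
import re

SHORT_FORMS = [
    (re.compile(r"\b(?:i['’]m|im|i m)\b", re.IGNORECASE), "i am"),
    (re.compile(r"\b(?:can['’]t|cant|can t)\b", re.IGNORECASE), "can not"),
]
SALUTATION = re.compile(r"^\s*(?:hi all|hiya|hello|hi)\b[\s,!.]*", re.IGNORECASE)
SMILEY = re.compile(r"(?<!\S)[:;]-?[()DPp](?!\S)")
REPEATED_PUNCT = re.compile(r"([!?.])\1+")


def preprocess(text):
    """Normalise a raw user question before tagging and analysis."""
    text = SALUTATION.sub("", text)          # drop a leading salutation
    text = SMILEY.sub("", text)              # drop stand-alone smilies
    text = REPEATED_PUNCT.sub(r"\1", text)   # "!!!" -> "!"
    for pattern, expansion in SHORT_FORMS:   # "cant" -> "can not"
        text = pattern.sub(expansion, text)
    return " ".join(text.split())            # collapse whitespace


print(preprocess("Hi all, im confused!!! I cant find the fee :)"))
```

The steps run in a fixed order so that punctuation is collapsed before short-forms are expanded; a production pipeline would need a much larger short-form dictionary trained on domain data.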
  • 24. B.2. Entity and POS Tagger ● QER (query entity recognition) – Names, locations etc. ● Part of Speech Tagger using word sequence patterns – Sequence (noun, verbs, auxiliaries, modifiers) ● Phrase Chunker ● Dependency parsing: validate tag relationships
  • 25. B.3. Question Analysis ● Create features to be used during answer extraction ● Identify keywords to be matched in document sentences ● Identify answer type to match answer candidates. We can create an inventory of questions and expected answer types and so we can train a classifier – Quantity? – Dates? – Definition? ● Select a list of useful patterns from a pattern repository ● Identify question relations which may be used for sentence analysis, etc.
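The sketch below illustrates the two simplest outputs of question analysis: the keywords to match and the expected answer type. The slide proposes training a classifier on an inventory of questions; here a handful of hand-written rules stands in for that classifier, and both the rules and the stop-word list are assumptions.

```python
import re

# Hand-written rules standing in for a trained answer-type classifier.
ANSWER_TYPE_RULES = [
    ("quantity", re.compile(r"^\s*how (?:much|many)\b|\b(?:fees?|price|cost|salary)\b", re.IGNORECASE)),
    ("date", re.compile(r"^\s*when\b|\blast date\b", re.IGNORECASE)),
    ("definition", re.compile(r"^\s*what is (?:a |an |the )?\w+\s*\?*$", re.IGNORECASE)),
]

# Small illustrative stop-word list (an assumption).
STOPWORDS = {"what", "is", "the", "of", "for", "how", "much", "many",
             "when", "a", "an", "in", "at"}


def answer_type(question):
    """Return the first matching answer type, or 'other'."""
    for label, rule in ANSWER_TYPE_RULES:
        if rule.search(question):
            return label
    return "other"


def keywords(question):
    """Keep the content words to be matched in document sentences."""
    return [t for t in re.findall(r"\w+", question.lower()) if t not in STOPWORDS]


def analyse(question):
    return {"answer_type": answer_type(question), "keywords": keywords(question)}


print(analyse("When is the last date of Y?"))
```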
  • 26. B.4. Query Formulation ● The question needs to be transformed into a query for the document retrieval system ● Each IR system has its own query language, so we need to perform this mapping ● Identify useful keywords; use type of answer sought, entities to boost etc. ● Query creation: ordered terms, combined terms, weighted terms.
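As an illustration of mapping an analysed question to a weighted query, the sketch below emits a Lucene-style query string, using quoted phrases for entities and `^` boosts for entities and answer-type terms. The choice of Lucene-style syntax, the function name, and the boost values are assumptions; the slides do not name the retrieval system.

```python
def formulate_query(keywords, entities, answer_type_terms,
                    entity_boost=3.0, type_boost=2.0):
    """Build a weighted query: boosted entity phrases, boosted answer-type
    terms, then the remaining keywords unboosted."""
    clauses = []
    for entity in entities:
        # Entities are combined terms, so they are sent as quoted phrases.
        clauses.append(f'"{entity}"^{entity_boost:g}')
    for term in answer_type_terms:
        clauses.append(f"{term}^{type_boost:g}")
    already_used = set(entities) | set(answer_type_terms)
    for keyword in keywords:
        if keyword not in already_used:
            clauses.append(keyword)
    return " ".join(clauses)


print(formulate_query(["fee", "btech", "course"], ["iit delhi"], ["fee"]))
```

Special characters inside entities would need escaping before being sent to a real query parser; that step is left out here.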
  • 27. B.5. Answer Candidate Searcher ● Index the <question, qtypes, entities, answer template> in a training corpus ● Retrieve set of n <question, qtypes, entities, answer template> given a new question ● Decide based on the scores of answers returned the best answer to the new question
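A toy version of this searcher is sketched below. It stores <question, qtypes, entities, answer template> entries in a list, retrieves the n entries whose questions share the most tokens with the new question (Jaccard overlap), and returns the best one only if it clears a threshold. The similarity measure, the threshold, and the second corpus entry with its `<DATE>` placeholder are assumptions for illustration.

```python
import re

# Illustrative training corpus; the second entry is a hypothetical placeholder.
CORPUS = [
    {
        "question": "What is the fees of Btech course at IIT Delhi?",
        "qtypes": ["quantity"],
        "entities": ["btech", "iit delhi"],
        "answer_template": "The fees for Btech course in IIT D is 24000 INR.",
    },
    {
        "question": "When is the last date of Y?",
        "qtypes": ["date"],
        "entities": ["y"],
        "answer_template": "The last date of Y is <DATE>.",
    },
]


def tokens(text):
    return set(re.findall(r"\w+", text.lower()))


def search(question, corpus, n=3):
    """Return the n entries most similar to the question (Jaccard overlap)."""
    q = tokens(question)
    scored = []
    for entry in corpus:
        c = tokens(entry["question"])
        union = q | c
        score = len(q & c) / len(union) if union else 0.0
        scored.append((score, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]


def best_answer(question, corpus, threshold=0.3):
    """Pick the top-scoring candidate, or None if nothing is close enough."""
    hits = search(question, corpus)
    if not hits or hits[0][0] < threshold:
        return None
    return hits[0][1]["answer_template"]


print(best_answer("How much is the fees for Btech course from IIT Delhi?", CORPUS))
```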
  • 29. Where do we need Natural Language Processing? ● Tokenisation (words, numbers, punctuation, whitespace) ● Sentence detection ● Part of speech tagging (verbs, nouns, pronouns, etc.) ● Query entity recognition ● Chunking/Parsing (noun/verb phrases and relationships) ● Statistical modelling tools ● Dictionaries, word-lists, WordNet, VerbNet ● Template generation using grammar rules.
  • 30. So you are telling me there are ready-made NLP tools?
  • 31. NLP tools problems ● Training data issues – Training domains are completely different. – Local English language: slang, spelling, localisation ● Sentence detection failures: – Bad style (capitalisation, punctuation) – Ellipsis (i tried... it failed... error message...) ● Tokenisation failures: – Multiple punctuation ???, !!! (student emphasis) – Abbreviations (im, m.b.a, cant, doesnt, etc.) ● POS errors – Spelling, grammar ● We need to experiment, modify code and train on our domain data!
  • 32. What are the use cases of instant QA ? How does it fit in our system?
  • 33. Interaction ● If users are not writing good English, then try to minimize their writing. We can focus on capturing user intent with the least amount of typed text. ✔ Auto complete ✔ Guidance ✔ Spell check ✔ Auto correct ✔ Manual feedback on conflicts ✔ Make them write good queries ● This not only helps user experience but also increases the accuracy of language-based statistical systems.
  • 34. Shiksha : main search & cafe search
  • 35. Shiksha : Integration with main search auto-suggestor ● We will already generate good quality questions. These could be integrated here.
  • 36. 99acres ● Similar use cases to Shiksha. ● The real estate domain has more open-ended opinion questions and very few factoid questions. ● If a single text box search is introduced in future – SRP can cater not only listings but also question answers – Instant QA would be really helpful for user experience.
  • 37. And many more use cases …... Plus, some components of this system will be utilized separately to improve other existing systems.