SlideShare a Scribd company logo
1 of 24
1
KNOWLEDGE GRAPH AND CORPUS DRIVEN
SEGMENTATION AND ANSWER INFERENCE
FOR TELEGRAPHIC ENTITY-SEEKING QUERIES
EMNLP 2014
MANDAR JOSHI UMA SAWANT SOUMEN CHAKRABARTI
IBM RESEARCH IIT BOMBAY, YAHOO LABS IIT BOMBAY
mandarj90@in.ibm.com uma@cse.iitb.ac.in soumen@cse.iitb.ac.in
2
ENTITY-SEEKING TELEGRAPHIC QUERIES
• Short
• Unstructured (like natural language questions)
• Expect entities as answers
3
• No reliable syntax clues
 Free word order
 No or rare capitalization, quoted phrases
• Ambiguous
 Multiple interpretations
 aamir khan films
• Aamir Khan - the Indian actor or British boxer
• Films - appeared in, directed by, or about
• Previous QA work
 Convert to structured query
 Execute on knowledge graph (KG)
CHALLENGES
4
• KG is high precision but incomplete
 Work in progress
 Triples can not represent all information
 Structured – unstructured gap
• Corpus provides recall
• fastest odi century batsman
WHY DO WE NEED THE CORPUS?
… Corey Anderson hits fastest ODI century. This was the first time
two batsmen have hit hundreds in under 50 balls in the same ODI.
5
ANNOTATED WEB WITH KNOWLEDGE GRAPH
Entity: Cricketer
Type: /people/profession
instanceOf
Annotated document
Entity: Corey_Anderson
/people/person/profession
Type: /cricket/cricket_player
instanceOf
mentionOf
… Corey Anderson hits fastest ODI century in mismatch ... was the first
time two batsmen have hit hundreds in under 50 balls in the same ODI.
6
INTERPRETATION VIA SEGMENTATION
7
• Queries seek answer entities (e2)
• Contain (query) entities (e1) , target types (t2),
relations (r), and selectors (s).
SIGNALS FROM THE QUERY
query e1 r t2 s
washington
first
governor
washington governor governor first
washington - governor first
spider
automobile
company
spider - automobile
company
-
automobile company company spider
Assignment of tokens to columns for illustration only; not necessarily optimal
8
• Interpretation = Segmentation + Annotation
• Segmentation of query tokens into 3 partitions
 Query entity (E1)
 Relation and Type (T2/R)
 Selectors (S)
• Multiple ways to annotate each partition
SEGMENTATION AND INTERPRETATION
r: governorOf t2: us_state_governor
r: null t2: us_state_governor
1. Washington (State)
2. Washington_D.C. (City)
washington first governor
E1 partition T2/R partition
S partition
9
ΨT2, E2
Entity Type
Compatibility
COMBINING KG AND CORPUS EVIDENCE
Target
type
Relation
Query
entity
Selectors
Segmentation
Candidate
entity
ΨT2
Type
language
model
ΨR
Relation
language
model
ΨE1
Entity
language
model
ΨE1, R, E2
KG-assisted
relation
evidence
potential
ΨE1, R, E2, S
Corpus-assisted
entity-relation
evidence potential
Washington | first | governor
washington first | governor
Z
T2
R E1 S
E2
governorOf
null
us_state_governor
governor_general
Washington (State)
null
first
washington first
Elisha Peyre
Ferry
10
• Generate interpretations
• Retrieve snippets for each interpretation
• Construct candidate answer entities (e2) set
 Top k from corpus based on snippet frequency
 By KG links that are in interpretations set
• Inference
FROM QUERY TO ANSWER ENTITY
query – signals compatibility
e2-t2 compatibility
evidence from KG and corpus
11
• Objective: To map relation (or type) mentions in
query to Freebase relation (or types)
• Relation Language Model (ΨR)
 Use annotated ClueWeb09 + Freebase triples
 Locate Freebase relation endpoints in corpus
 Extract dependency path words between entities
 Maintain co-occurrence counts of <words, rel>
 Assumption: Co-occurrence implies relation
• Type Language Model (ΨT2)
 Smoothed Dirichlet language model using
Freebase type names
RELATION AND TYPE MODELS
12
• Estimates support to e1-r-e2-s in corpus
• Snippet retrieval and scoring
• Snippets scored using RankSVM
• Partial list of features
 #snippets with distance(e2 , e1 ) < k (k = 5, 10)
 #snippets with distance(e2 , r) < k (k = 3, 6)
 #snippets with relation r = ⊥
 #snippets with relation phrases as prepositions
 #snippets covering fraction of query IDF > k (k =
0.2, 0.4, 0.6, 0.8)
CORPUS POTENTIAL
13
LATENT VARIABLE DISCRIMINATIVE TRAINING (LVDT)
• q, e2 are observed; e1, t2, r and z are latent
• Non-convex formulation
• Constraints are formulated using the best scoring
interpretation
• Training
• Inference
14
EXPERIMENTS
15
TEST BED
• Freebase entity, type and relation knowledge graph
 ~29 million entities
 14000 types
 2000 selected relation
• Annotated corpus
 Clueweb09B Web corpus having 50 million pages
 Google (FACC1), ~ 13 annotations per page
 Text and Entity Index
16
TEST BED
• Query sets
 TREC-INEX: 700 entity search queries
 WQT: Subset of ~800 queries from WebQuestions (WQ)
natural language query set [1], manually converted to
telegraphic form
 Available at http://bit.ly/Spva49
TREC-INEX WQT
Has type and/or relation hints Has mostly relation hints
Answers from KG and corpus
collected by volunteers
Answers from KG only collected
by turkers.
Answer evidence from corpus (+
KG)
Answer evidence from KG
[1] Jonathan Berant, Andrew Chou, Roy Frostig, and Percy Liang. 2013. Semantic parsing on Freebase from question-
answer pairs. In Empirical Methods in Natural Language Processing (EMNLP).
17
0
0.1
0.2
0.3
0.4
0.5
MAP
KG only Corpus only Both
TREC-INEX WQT
Unoptimized LVDT LVDT
Unoptimized
Corpus and knowledge graph help each other to deliver better
performance
SYNERGY BETWEEN KG AND CORPUS
18
0
0.1
0.2
0.3
0.4
0.5
TREC-INEX WQT
MAP
No Interpretation Type + Selector [2] Unoptimized LVDT
QUERY TEMPLATE COMPARISON
Entity-relation-type-selector template provides yields better
accuracy than type-selector template
[2] Uma Sawant and Soumen Chakrabarti. 2013. Learning joint query interpretation and response ranking. In WWW
Conference, Brazil.
19
COMPARISON WITH SEMANTIC PARSERS
0
0.1
0.2
0.3
0.4
0.5
TREC-INEX WQT
MAP
Jacana Sempre Unoptimized LVDT
20
• Benefits of collective inference
 automobile company makes spider
 Entity model fails to identify e1 (Alfa Romeo Spider)
 Recovery: automobile company makes spider
• Limitations
 Sparse corpus annotations
 south africa political system
 Few corpus annotations for e2: Constitutional Republic
 Can’t find appropriate t2 (/../form_of_government) and
r (/location/country/form_of_government)
QUALITATIVE COMPARISON
e1: Automobile t2: /../organization r : /business/industry/companies
21
SUMMARY
• Query interpretation is rewarding, but non-trivial
• Segmentation based models work well for
telegraphic queries
• Entity-relation-type-selector template better than
type-selector template
• Knowledge graph and corpus provide
complementary benefits
22
• S&C: Uma Sawant and Soumen Chakrabarti.
2013. Learning joint query interpretation and
response ranking. In WWW Conference, Brazil.
• Sempre: Jonathan Berant, Andrew Chou, Roy
Frostig, and Percy Liang. 2013. Semantic parsing
on Freebase from question-answer pairs. In
Empirical Methods in Natural Language
Processing (EMNLP).
• Jacana: Xuchen Yao and Benjamin Van Durme.
2014. Information extraction over structured
data: Question answering with Freebase. In ACL
Conference. ACL.
REFERENCES
23
• TREC-INEX and WQT
 Short URL http://bit.ly/Spva49
 Long URL
https://docs.google.com/spreadsheets/d/1AbKBd
FOIXum_NwXeWub0SdeG-
y8Ub4_ub8qTjAw4Qug/edit#gid=0
• Project page
 http://www.cse.iitb.ac.in/~soumen/doc/CSAW/
DATA
24
THANK YOU!
QUESTIONS?

More Related Content

Similar to emnlp14v6.pptx

Srinivas Muddana Resume
Srinivas Muddana ResumeSrinivas Muddana Resume
Srinivas Muddana Resumemuddanas
 
Srinivas Muddana Resume
Srinivas Muddana ResumeSrinivas Muddana Resume
Srinivas Muddana Resumemuddanas
 
Srinivas Muddana Resume
Srinivas Muddana ResumeSrinivas Muddana Resume
Srinivas Muddana Resumemuddanas
 
Semantic similarity for faster Knowledge Graph delivery at scale
Semantic similarity for faster Knowledge Graph delivery at scaleSemantic similarity for faster Knowledge Graph delivery at scale
Semantic similarity for faster Knowledge Graph delivery at scaleConnected Data World
 
Exploiting Entity Linking in Queries For Entity Retrieval
Exploiting Entity Linking in Queries For Entity RetrievalExploiting Entity Linking in Queries For Entity Retrieval
Exploiting Entity Linking in Queries For Entity RetrievalFaegheh Hasibi
 
Structure-Compiler-phases information about basics of compiler. Pdfpdf
Structure-Compiler-phases information  about basics of compiler. PdfpdfStructure-Compiler-phases information  about basics of compiler. Pdfpdf
Structure-Compiler-phases information about basics of compiler. Pdfpdfovidlivi91
 
2 Years of Real World FP at REA
2 Years of Real World FP at REA2 Years of Real World FP at REA
2 Years of Real World FP at REAkenbot
 
SQL For PHP Programmers
SQL For PHP ProgrammersSQL For PHP Programmers
SQL For PHP ProgrammersDave Stokes
 
Understanding Queries through Entities
Understanding Queries through EntitiesUnderstanding Queries through Entities
Understanding Queries through EntitiesPeter Mika
 
Lecture 13
Lecture 13Lecture 13
Lecture 13Shani729
 
Designing A Syntax Based Retrieval System03
Designing A Syntax Based Retrieval System03Designing A Syntax Based Retrieval System03
Designing A Syntax Based Retrieval System03Avelin Huo
 
Dare to build vertical design with relational data (Entity-Attribute-Value)
Dare to build vertical design with relational data (Entity-Attribute-Value)Dare to build vertical design with relational data (Entity-Attribute-Value)
Dare to build vertical design with relational data (Entity-Attribute-Value)Ivo Andreev
 
C4l2008charper
C4l2008charperC4l2008charper
C4l2008charpercharper
 

Similar to emnlp14v6.pptx (20)

Resume
ResumeResume
Resume
 
Srinivas Muddana Resume
Srinivas Muddana ResumeSrinivas Muddana Resume
Srinivas Muddana Resume
 
Srinivas Muddana Resume
Srinivas Muddana ResumeSrinivas Muddana Resume
Srinivas Muddana Resume
 
Srinivas Muddana Resume
Srinivas Muddana ResumeSrinivas Muddana Resume
Srinivas Muddana Resume
 
How We Use Functional Programming to Find the Bad Guys
How We Use Functional Programming to Find the Bad GuysHow We Use Functional Programming to Find the Bad Guys
How We Use Functional Programming to Find the Bad Guys
 
FinalReport
FinalReportFinalReport
FinalReport
 
Semantic similarity for faster Knowledge Graph delivery at scale
Semantic similarity for faster Knowledge Graph delivery at scaleSemantic similarity for faster Knowledge Graph delivery at scale
Semantic similarity for faster Knowledge Graph delivery at scale
 
Exploiting Entity Linking in Queries For Entity Retrieval
Exploiting Entity Linking in Queries For Entity RetrievalExploiting Entity Linking in Queries For Entity Retrieval
Exploiting Entity Linking in Queries For Entity Retrieval
 
Structure-Compiler-phases information about basics of compiler. Pdfpdf
Structure-Compiler-phases information  about basics of compiler. PdfpdfStructure-Compiler-phases information  about basics of compiler. Pdfpdf
Structure-Compiler-phases information about basics of compiler. Pdfpdf
 
2 Years of Real World FP at REA
2 Years of Real World FP at REA2 Years of Real World FP at REA
2 Years of Real World FP at REA
 
Lexicalanalyzer
LexicalanalyzerLexicalanalyzer
Lexicalanalyzer
 
Lexicalanalyzer
LexicalanalyzerLexicalanalyzer
Lexicalanalyzer
 
SQL For PHP Programmers
SQL For PHP ProgrammersSQL For PHP Programmers
SQL For PHP Programmers
 
I20191007
I20191007I20191007
I20191007
 
Understanding Queries through Entities
Understanding Queries through EntitiesUnderstanding Queries through Entities
Understanding Queries through Entities
 
Lecture 13
Lecture 13Lecture 13
Lecture 13
 
Designing A Syntax Based Retrieval System03
Designing A Syntax Based Retrieval System03Designing A Syntax Based Retrieval System03
Designing A Syntax Based Retrieval System03
 
Dare to build vertical design with relational data (Entity-Attribute-Value)
Dare to build vertical design with relational data (Entity-Attribute-Value)Dare to build vertical design with relational data (Entity-Attribute-Value)
Dare to build vertical design with relational data (Entity-Attribute-Value)
 
Unit 04 dbms
Unit 04 dbmsUnit 04 dbms
Unit 04 dbms
 
C4l2008charper
C4l2008charperC4l2008charper
C4l2008charper
 

Recently uploaded

(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSRajkumarAkumalla
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...RajaP95
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 

Recently uploaded (20)

(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 

emnlp14v6.pptx

  • 1. 1 KNOWLEDGE GRAPH AND CORPUS DRIVEN SEGMENTATION AND ANSWER INFERENCE FOR TELEGRAPHIC ENTITY-SEEKING QUERIES EMNLP 2014 MANDAR JOSHI UMA SAWANT SOUMEN CHAKRABARTI IBM RESEARCH IIT BOMBAY, YAHOO LABS IIT BOMBAY mandarj90@in.ibm.com uma@cse.iitb.ac.in soumen@cse.iitb.ac.in
  • 2. 2 ENTITY-SEEKING TELEGRAPHIC QUERIES • Short • Unstructured (like natural language questions) • Expect entities as answers
  • 3. 3 • No reliable syntax clues  Free word order  No or rare capitalization, quoted phrases • Ambiguous  Multiple interpretations  aamir khan films • Aamir Khan - the Indian actor or British boxer • Films - appeared in, directed by, or about • Previous QA work  Convert to structured query  Execute on knowledge graph (KG) CHALLENGES
  • 4. 4 • KG is high precision but incomplete  Work in progress  Triples can not represent all information  Structured – unstructured gap • Corpus provides recall • fastest odi century batsman WHY DO WE NEED THE CORPUS? … Corey Anderson hits fastest ODI century. This was the first time two batsmen have hit hundreds in under 50 balls in the same ODI.
  • 5. 5 ANNOTATED WEB WITH KNOWLEDGE GRAPH Entity: Cricketer Type: /people/profession instanceOf Annotated document Entity: Corey_Anderson /people/person/profession Type: /cricket/cricket_player instanceOf mentionOf … Corey Anderson hits fastest ODI century in mismatch ... was the first time two batsmen have hit hundreds in under 50 balls in the same ODI.
  • 7. 7 • Queries seek answer entities (e2) • Contain (query) entities (e1) , target types (t2), relations (r), and selectors (s). SIGNALS FROM THE QUERY query e1 r t2 s washington first governor washington governor governor first washington - governor first spider automobile company spider - automobile company - automobile company company spider Assignment of tokens to columns for illustration only; not necessarily optimal
  • 8. 8 • Interpretation = Segmentation + Annotation • Segmentation of query tokens into 3 partitions  Query entity (E1)  Relation and Type (T2/R)  Selectors (S) • Multiple ways to annotate each partition SEGMENTATION AND INTERPRETATION r: governorOf t2: us_state_governor r: null t2: us_state_governor 1. Washington (State) 2. Washington_D.C. (City) washington first governor E1 partition T2/R partition S partition
  • 9. 9 ΨT2, E2 Entity Type Compatibility COMBINING KG AND CORPUS EVIDENCE Target type Relation Query entity Selectors Segmentation Candidate entity ΨT2 Type language model ΨR Relation language model ΨE1 Entity language model ΨE1, R, E2 KG-assisted relation evidence potential ΨE1, R, E2, S Corpus-assisted entity-relation evidence potential Washington | first | governor washington first | governor Z T2 R E1 S E2 governorOf null us_state_governor governor_general Washington (State) null first washington first Elisha Peyre Ferry
  • 10. 10 • Generate interpretations • Retrieve snippets for each interpretation • Construct candidate answer entities (e2) set  Top k from corpus based on snippet frequency  By KG links that are in interpretations set • Inference FROM QUERY TO ANSWER ENTITY query – signals compatibility e2-t2 compatibility evidence from KG and corpus
  • 11. 11 • Objective: To map relation (or type) mentions in query to Freebase relation (or types) • Relation Language Model (ΨR)  Use annotated ClueWeb09 + Freebase triples  Locate Freebase relation endpoints in corpus  Extract dependency path words between entities  Maintain co-occurrence counts of <words, rel>  Assumption: Co-occurrence implies relation • Type Language Model (ΨT2)  Smoothed Dirichlet language model using Freebase type names RELATION AND TYPE MODELS
  • 12. 12 • Estimates support to e1-r-e2-s in corpus • Snippet retrieval and scoring • Snippets scored using RankSVM • Partial list of features  #snippets with distance(e2 , e1 ) < k (k = 5, 10)  #snippets with distance(e2 , r) < k (k = 3, 6)  #snippets with relation r = ⊥  #snippets with relation phrases as prepositions  #snippets covering fraction of query IDF > k (k = 0.2, 0.4, 0.6, 0.8) CORPUS POTENTIAL
  • 13. 13 LATENT VARIABLE DISCRIMINATIVE TRAINING (LVDT) • q, e2 are observed; e1, t2, r and z are latent • Non-convex formulation • Constraints are formulated using the best scoring interpretation • Training • Inference
  • 15. 15 TEST BED • Freebase entity, type and relation knowledge graph  ~29 million entities  14000 types  2000 selected relation • Annotated corpus  Clueweb09B Web corpus having 50 million pages  Google (FACC1), ~ 13 annotations per page  Text and Entity Index
  • 16. 16 TEST BED • Query sets  TREC-INEX: 700 entity search queries  WQT: Subset of ~800 queries from WebQuestions (WQ) natural language query set [1], manually converted to telegraphic form  Available at http://bit.ly/Spva49 TREC-INEX WQT Has type and/or relation hints Has mostly relation hints Answers from KG and corpus collected by volunteers Answers from KG only collected by turkers. Answer evidence from corpus (+ KG) Answer evidence from KG [1] Jonathan Berant, Andrew Chou, Roy Frostig, and Percy Liang. 2013. Semantic parsing on Freebase from question- answer pairs. In Empirical Methods in Natural Language Processing (EMNLP).
  • 17. 17 0 0.1 0.2 0.3 0.4 0.5 MAP KG only Corpus only Both TREC-INEX WQT Unoptimized LVDT LVDT Unoptimized Corpus and knowledge graph help each other to deliver better performance SYNERGY BETWEEN KG AND CORPUS
  • 18. 18 0 0.1 0.2 0.3 0.4 0.5 TREC-INEX WQT MAP No Interpretation Type + Selector [2] Unoptimized LVDT QUERY TEMPLATE COMPARISON Entity-relation-type-selector template provides yields better accuracy than type-selector template [2] Uma Sawant and Soumen Chakrabarti. 2013. Learning joint query interpretation and response ranking. In WWW Conference, Brazil.
  • 19. 19 COMPARISON WITH SEMANTIC PARSERS 0 0.1 0.2 0.3 0.4 0.5 TREC-INEX WQT MAP Jacana Sempre Unoptimized LVDT
  • 20. 20 • Benefits of collective inference  automobile company makes spider  Entity model fails to identify e1 (Alfa Romeo Spider)  Recovery: automobile company makes spider • Limitations  Sparse corpus annotations  south africa political system  Few corpus annotations for e2: Constitutional Republic  Can’t find appropriate t2 (/../form_of_government) and r (/location/country/form_of_government) QUALITATIVE COMPARISON e1: Automobile t2: /../organization r : /business/industry/companies
  • 21. 21 SUMMARY • Query interpretation is rewarding, but non-trivial • Segmentation based models work well for telegraphic queries • Entity-relation-type-selector template better than type-selector template • Knowledge graph and corpus provide complementary benefits
  • 22. 22 • S&C: Uma Sawant and Soumen Chakrabarti. 2013. Learning joint query interpretation and response ranking. In WWW Conference, Brazil. • Sempre: Jonathan Berant, Andrew Chou, Roy Frostig, and Percy Liang. 2013. Semantic parsing on Freebase from question-answer pairs. In Empirical Methods in Natural Language Processing (EMNLP). • Jacana: Xuchen Yao and Benjamin Van Durme. 2014. Information extraction over structured data: Question answering with Freebase. In ACL Conference. ACL. REFERENCES
  • 23. 23 • TREC-INEX and WQT  Short URL http://bit.ly/Spva49  Long URL https://docs.google.com/spreadsheets/d/1AbKBd FOIXum_NwXeWub0SdeG- y8Ub4_ub8qTjAw4Qug/edit#gid=0 • Project page  http://www.cse.iitb.ac.in/~soumen/doc/CSAW/ DATA