Automated Focus Extraction for Question Answering over Topic Maps
 

This paper describes the first stage of question analysis in Question Answering over Topic Maps. It introduces the concepts of asking point and expected answer type as variations of the question focus. We identify the question focus in questions asked to a Question Answering system over Topic Maps. We use known machine learning techniques for expected answer type extraction and implement a novel approach to the asking point extraction. We also provide a mathematical model to predict the performance of the system.

    Presentation Transcript

    • Automated Focus Extraction for Question Answering over Topic Maps. Rani Pinchuk, Alexander Mikhailian and Tiphaine Dalmas. TMRA’09, Leipzig
    • 2 Context: domain-portable Question Answering over Topic Maps
      • Partly funded by the Flemish government as part of the ITEA2 project LINDO (ITEA2-06011).
      • The research towards domain-portable question answering over Topic Maps is done within the Belgian part of the LINDO project.
    • 3 Why Topic Maps?
      • The space industry needs a solution to the knowledge retention problem.
      • More structured than mind maps, less formal than RDF/OWL.
      • They allow information to be organized in an ontological view.
      • An ISO standard.
    • 4 Why Topic Maps? Example: “Who is the composer of La Bohème?” → Puccini
    • 5 LINDO-BE General Architecture [Diagram; components shown: Question, Focus Extractor, Anchorer, Time Exp. Extractor, Graph Reducer, Answer Extractor, Topic Map Engine, Answer]
    • 6 LINDO-BE General Architecture [Same diagram as slide 5]
    • 7 Question Focus: the focus is the type of the answer, expressed in the terminology of the question. Example: “Who is the composer of La Bohème?” → Puccini
    • 8 Focus
      • Asking Point (AP), explicit: “Who is the librettist of La Tilda?”
      • Expected Answer Type (EAT), implicit: HUMAN: “Who wrote the libretto for La Tilda?”
      • EAT classes: TIME, NUMERIC, DEFINITION, LOCATION, HUMAN, OTHER
    • 9 Is it difficult to find the focus?
      • Where was Puccini born?
      • What is Puccini's place of birth?
      • What is Puccini's birthplace?
      • What is the birth place of Puccini?
      • What city was Puccini born in?
      • What place was Puccini born in?
      • Where is Puccini from?
      [Diagram: Puccini (person) is born in Lucca (place); Lucca is a City]
    • 10 Why should AP take precedence over EAT? “Who is the librettist of La Tilda?” EAT = HUMAN (Person); AP = Librettist
    • 11 Precision and Recall
      $$P = \frac{|\{\text{relevant}\} \cap \{\text{retrieved}\}|}{|\{\text{retrieved}\}|}, \qquad R = \frac{|\{\text{relevant}\} \cap \{\text{retrieved}\}|}{|\{\text{relevant}\}|}$$
    • 12 Why should AP take precedence over EAT? “Who is the librettist of La Tilda?” EAT = HUMAN (Person); AP = Librettist. $P_{\mathrm{AP}} = 57/57 = 1$; $P_{\mathrm{EAT}} = 57/1165 \approx 0.049$
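
The precision figures on slides 11 and 12 can be reproduced with a few lines of Python. This is only a sketch: it assumes the slide's numbers mean that 57 topics are relevant (librettists) and that the EAT retrieves 1165 topics (all persons, librettists included). The topic identifiers and the function names `precision` and `recall` are made up for illustration.

```python
def precision(relevant: set, retrieved: set) -> float:
    """|relevant ∩ retrieved| / |retrieved|"""
    return len(relevant & retrieved) / len(retrieved)


def recall(relevant: set, retrieved: set) -> float:
    """|relevant ∩ retrieved| / |relevant|"""
    return len(relevant & retrieved) / len(relevant)


# Hypothetical topic identifiers: 57 librettists among 1165 persons (assumed reading of the slide).
librettists = {f"librettist-{i}" for i in range(57)}
persons = librettists | {f"person-{i}" for i in range(1165 - 57)}

# The asking point "librettist" retrieves exactly the relevant topics.
p_ap = precision(librettists, librettists)
# The expected answer type HUMAN retrieves every person.
p_eat = precision(librettists, persons)

print(p_ap, round(p_eat, 3))
```

Under this reading both foci reach full recall, so the difference between them shows up only in precision.
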
    • 13 Why should AP take precedence over EAT? Results over 100 annotated questions:
        Name   Precision   Recall
        AP     0.311       0.30
        EAT    0.089       0.21
    • 14 Focus Branching
    • 15 Focus Extractor Architecture
      • Supervised machine learning based on the principle of maximum entropy (Maxent).
      • 2100 questions have been annotated:
        • 1500 from the Li & Roth corpus
        • 500 from TREC-10
        • 100 asked over the Italian Opera topic map
      • The corpus was split into 80% training and 20% test data. The evaluation was repeated 10 times, each time shuffling the training and test data.
      • Pipeline: Question → Tokenizer → POS Tagger → Syntactic Parser → Lexical Analysis → Focus Extractor → Focus
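
The evaluation protocol on slide 15 (an 80/20 split, repeated 10 times with reshuffling) might look roughly like the sketch below. The function name `shuffled_splits`, the fixed seed and the integer stand-ins for questions are assumptions made for illustration; the slides do not show the authors' code.

```python
import random


def shuffled_splits(items, runs=10, train_fraction=0.8, seed=0):
    """Yield (train, test) pairs, reshuffling the corpus before every run."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    for _ in range(runs):
        shuffled = list(items)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        yield shuffled[:cut], shuffled[cut:]


# Stand-in corpus: 2100 question ids (1500 Li & Roth + 500 TREC-10 + 100 Italian Opera).
corpus = list(range(1500 + 500 + 100))
splits = list(shuffled_splits(corpus))
sizes = [(len(train), len(test)) for train, test in splits]
print(sizes[0])
```

In a real run each `(train, test)` pair would be used to train and score the classifier, and the ten scores would then be averaged.
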
    • 16 Questions Annotation
      • Asking Point (input to the AP classifier), one label per token:
          O: What
          AP: opera
          O: did
          O: Puccini
          O: write
          O: ?
      • Expected Answer Type (input to the EAT classifier), one label per question:
          HUMAN: Who is Puccini?
          DEFINITION: What is Tosca?
          LOCATION: Where did Dante die?
          TIME: When did Puccini die?
          NUMERIC: How many characters have been killed by poisoning?
          OTHER: What did Heinrich Heine write?
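
The token-level annotation on slide 16 can be pictured as a list of (token, label) pairs, from which the asking point is read off by collecting the tokens labelled `AP`. The data layout and the helper `asking_point` are hypothetical; only the labels and the example question come from the slide.

```python
# Annotation of "What opera did Puccini write?" as shown on the slide.
annotated = [
    ("What", "O"),
    ("opera", "AP"),
    ("did", "O"),
    ("Puccini", "O"),
    ("write", "O"),
    ("?", "O"),
]


def asking_point(tagged_tokens):
    """Return the tokens labelled AP joined by spaces, or None if there are none."""
    ap_tokens = [token for token, label in tagged_tokens if label == "AP"]
    return " ".join(ap_tokens) or None


print(asking_point(annotated))
```
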
    • 17 AP Results
        Class         Precision   Recall   F-Score
        AskingPoint   0.854       0.734    0.789
        Other         0.973       0.987    0.980
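
The F-Score column of the AP results is consistent with the usual harmonic mean of precision and recall, F = 2PR / (P + R). The slides do not state the formula, so treating the column as a balanced F1 is an assumption. A short check against the two AP rows:

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score) taken from the AP results slide.
ap_rows = {
    "AskingPoint": (0.854, 0.734, 0.789),
    "Other": (0.973, 0.987, 0.980),
}

for name, (p, r, reported) in ap_rows.items():
    print(name, f_score(p, r), reported)
```
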
    • 18 EAT Results
        Class        Precision   Recall   F-Score
        DEFINITION   0.887       0.800    0.841
        LOCATION     0.834       0.812    0.821
        HUMAN        0.904       0.753    0.820
        TIME         0.880       0.802    0.838
        NUMERIC      0.943       0.782    0.854
        OTHER        0.746       0.893    0.812
    • 19 Overall Results. The overall results are provided as the accuracy of the classifier: accuracy = correct instances / overall instances.
                         Value   Std dev   Std err
        Focus (AP+EAT)   0.827   0.020     0.006
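
The accuracy definition on slide 19 and the relation between the two error columns can be sketched as follows. The standard error is assumed here to be the standard deviation divided by the square root of the number of runs (10, per slide 15); the slides do not say how it was computed. The toy labels are invented for illustration.

```python
import math


def accuracy(predicted, gold):
    """correct instances / overall instances"""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


def standard_error(std_dev: float, runs: int) -> float:
    """Standard error of the mean over `runs` repeated evaluations (assumed formula)."""
    return std_dev / math.sqrt(runs)


# Toy example with made-up labels: 3 of 4 predictions are correct.
toy_accuracy = accuracy(
    ["HUMAN", "TIME", "OTHER", "HUMAN"],
    ["HUMAN", "TIME", "LOCATION", "HUMAN"],
)
std_err = standard_error(0.020, 10)
print(toy_accuracy, round(std_err, 3))
```

With the reported standard deviation of 0.020 and 10 runs, this assumption gives a value that rounds to the reported 0.006.
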
    • 20 Prediction of Accuracy
    • 21 Conclusions
      • We achieved 82.7% accuracy for focus extraction.
      • The specificity of the focus degrades gracefully (we first try to extract the AP, and fall back to the EAT).
      • The focus is identified dynamically instead of relying on a static taxonomy of question types.
      • Machine learning techniques were used throughout the application stack.
      • The results could be improved with more training data.
      • The whole setting is domain independent.
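
The graceful degradation mentioned in the conclusions (try the AP first, fall back to the EAT) amounts to a simple preference rule. A minimal sketch, with a hypothetical function name and signature:

```python
from typing import Optional


def choose_focus(asking_point: Optional[str], expected_answer_type: str) -> str:
    """Prefer the more specific asking point; fall back to the expected answer type."""
    return asking_point if asking_point else expected_answer_type


# "Who is the librettist of La Tilda?": an explicit asking point is available.
print(choose_focus("librettist", "HUMAN"))
# "Who wrote the libretto for La Tilda?": no asking point, so the EAT is used.
print(choose_focus(None, "HUMAN"))
```
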
    • 22 Questions? Thank you