Identification of temporal expressions in the domain of tourism

This paper presents how an existing temporal processor was adapted for use by the English Question Answering system developed as part of the EU-funded project QALL-ME. Experiments applying the existing temporal processor to questions from the domain of tourism revealed that it tackles far more types of temporal expression than the questions contain, which makes it slower than necessary. In light of this, a simplified temporal processor which identifies only the temporal expressions present in user questions was implemented. The two temporal annotators are evaluated on 1,118 randomly selected user questions and an error analysis is presented. #kept2009 #qallme

Presentation Transcript

  • Context: the QALL-ME project Classification of temporal expressions The temporal annotator in QALL-ME The QALL-ME benchmark Evaluation Conclusions Identification of temporal expressions in the domain of tourism Andrea Varga, Georgiana Puscasu, Constantin Orasan Research Group in Computational Linguistics, University of Wolverhampton 2-4 July 2009 / KEPT 2009 Andrea Varga, Georgiana Puscasu, Constantin Orasan Identification of TE in the domain of tourism
  • Outline
    1. Context: the QALL-ME project
    2. Classification of temporal expressions
    3. The temporal annotator in QALL-ME
    4. The QALL-ME benchmark
    5. Evaluation
    6. Conclusions
  • The QALL-ME project
    QALL-ME: Question Answering Learning technologies in a multiLingual and Multimodal Environment
    EU-funded project (FP6 IST-033860)
    shared infrastructure for multilingual and multimodal question answering in the domain of tourism
    answers questions about local events: movie showtimes, directions to sites (cinemas, hotels), etc.
    a large number of user questions contain temporal constraints
  • The QALL-ME project: http://qallme.fbk.eu/
  • Temporal expressions
    TEs denote:
      1. position in time
      2. duration
      3. time frequency
    To obtain data for training and testing automatic TE identification modules, TEs need to be annotated, so annotation standards are required.
    Annotation standards for TEs:
      1. TIMEX2
      2. TIMEX3 (part of TimeML)
    TIMEX2 has been adopted for temporal expression annotation in QALL-ME.
  • TIMEX2 annotation standard
  • TE classes and subclasses (1)
    1. TEs indicating time position
       1. Precise TEs
          1. Calendar dates
          2. Times of day
          3. Week references
       2. Fuzzy TEs
          1. Generic references to the past, present or future
          2. Seasons, parts of the year (quarters and halves)
          3. Weekends
          4. Fuzzy day parts
       3. Non-specific TEs referring to time position
    2. TEs capturing durations
       1. Precise durations
       2. Fuzzy durations
       3. Non-specific durations
    3. Set-denoting TEs
       1. Precise frequency TEs
       2. Non-specific frequencies
  • TE classes and subclasses (2)
    1. TEs indicating time position
       1. Precise TEs
          1. Calendar dates:
             <TIMEX2 VAL="2009">2009</TIMEX2>
             <TIMEX2 VAL="2009-07">July</TIMEX2>
             <TIMEX2 VAL="2009-07-02">2nd of July</TIMEX2>
             <TIMEX2 VAL="197">70s</TIMEX2>
             <TIMEX2 VAL="10">11th century</TIMEX2>
             <TIMEX2 VAL="2">this millennium</TIMEX2>
          2. Times of day:
             <TIMEX2 VAL="2009-07-02T21:36:42.85">21:36:42.85</TIMEX2>
             <TIMEX2 VAL="2009-07-02T10:00">10 o’clock</TIMEX2>
             <TIMEX2 VAL="2009-07-02T07:53">7:53 am.</TIMEX2>
             <TIMEX2 VAL="2009-07-02T11:15Z">11:15 GMT</TIMEX2>
          3. Week references:
             <TIMEX2 VAL="2009-W27">next week</TIMEX2>
  • TE classes and subclasses (3)
    1. TEs indicating time position
       2. Fuzzy TEs
          1. Generic references to the past, present or future:
             <TIMEX2 VAL="PAST_REF" ANCHOR_DIR="BEFORE" ANCHOR_VAL="2009-07-02">recently</TIMEX2>
             <TIMEX2 VAL="PRESENT_REF" ANCHOR_DIR="AS_OF" ANCHOR_VAL="2009-07-02">now</TIMEX2>
             <TIMEX2 VAL="FUTURE_REF" ANCHOR_DIR="AFTER" ANCHOR_VAL="2009-07-02">future</TIMEX2>
          2. Seasons, parts of the year (quarters and halves):
             <TIMEX2 VAL="2009-SU">this summer</TIMEX2>
             <TIMEX2 VAL="2009-Q4">the 4th quarter of 2009</TIMEX2>
             <TIMEX2 VAL="2009-H2">2nd half of 2009</TIMEX2>
          3. Weekends:
             <TIMEX2 VAL="2009-W26-WE">this weekend</TIMEX2>
          4. Fuzzy day parts:
             <TIMEX2 VAL="2009-07-02TAF">afternoon</TIMEX2>
       3. Non-specific TEs referring to time position:
          <TIMEX2 VAL="XXXX-SU">summers</TIMEX2>
  • TE classes and subclasses (4)
    2. TEs capturing durations
       1. Precise durations:
          <TIMEX2 VAL="PT24H" ANCHOR_DIR="WITHIN" ANCHOR_VAL="2009-07-02">24 hours</TIMEX2>
       2. Fuzzy durations:
          <TIMEX2 VAL="PXW" ANCHOR_DIR="BEFORE" ANCHOR_VAL="2009-W26">preceding weeks</TIMEX2>
       3. Non-specific durations:
          <TIMEX2 VAL="P1D">all day</TIMEX2>
    3. Set-denoting TEs
       1. Precise frequency TEs:
          <TIMEX2 VAL="XXXX-XX-XX" SET="YES">every day</TIMEX2>
       2. Non-specific frequencies:
          <TIMEX2 VAL="TNI" SET="YES">some nights</TIMEX2>
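    The inline TIMEX2 examples on the slides above can be read back programmatically. The following is a minimal sketch, not part of the presentation: the helper name parse_timex2 and the sample question are invented for illustration, and the regular expressions assume non-nested TIMEX2 elements with double-quoted attribute values.

```python
import re

# One non-nested TIMEX2 element. Matching is case-insensitive because
# annotated text may mix upper- and lower-case tag or attribute names.
TIMEX2_RE = re.compile(r"<TIMEX2\b([^>]*)>(.*?)</TIMEX2>", re.IGNORECASE | re.DOTALL)
# One attribute of the form NAME="value".
ATTR_RE = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def parse_timex2(annotated):
    """Return a list of (text, attributes) pairs, one per TIMEX2 element."""
    results = []
    for match in TIMEX2_RE.finditer(annotated):
        # Upper-case attribute names so VAL= and val= are treated alike.
        attrs = {name.upper(): value for name, value in ATTR_RE.findall(match.group(1))}
        results.append((match.group(2).strip(), attrs))
    return results


# Hypothetical user question carrying two annotations from the slides.
sample = (
    'Is the pool open <TIMEX2 VAL="XXXX-XX-XX" SET="YES">every day</TIMEX2> '
    'or only <TIMEX2 val="2009-W26-WE">this weekend</TIMEX2>?'
)
for text, attrs in parse_timex2(sample):
    print(text, attrs)
```

    A regular expression is enough here only because the examples are flat; nested TIMEX2 elements would need a proper XML parser.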
  • The temporal annotator in QALL-ME
    The QALL-ME QA system required a module that adds temporal expression annotations to a user question.
    The TIMEX2 standard was adopted as the temporal annotation scheme for the following reasons:
      a common annotation schema shared among all QALL-ME partners
      usability of the QALL-ME benchmark outside the QALL-ME project
      re-use of existing annotation tools capable of annotating according to the TIMEX2 standard
    A system that performs TIMEX2 tagging involves two stages:
      TE identification: detecting the textual extent of the TEs present in a text
      TE normalisation: determining the final values of the attributes attached to every temporal expression
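    The two stages can be illustrated with a toy annotator. This is a sketch under stated assumptions and not the QALL-ME implementation: the patterns cover only expressions shaped like "2nd of July" and "10 pm", the function names identify, normalise and annotate are hypothetical, and the anchor date stands in for the moment at which the question is asked.

```python
import re
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}

# Illustrative patterns only: "2nd of July" and "10 pm" / "7:53 am".
DATE_RE = re.compile(
    r"\b(\d{1,2})(?:st|nd|rd|th)? of (" + "|".join(MONTHS) + r")\b", re.IGNORECASE)
TIME_RE = re.compile(r"\b(\d{1,2})(?::(\d{2}))? ?(am|pm)\b", re.IGNORECASE)


def identify(question):
    """Stage 1 (identification): locate the textual extent of each TE."""
    found = [(m.start(), m.end(), "date", m) for m in DATE_RE.finditer(question)]
    found += [(m.start(), m.end(), "time", m) for m in TIME_RE.finditer(question)]
    return sorted(found, key=lambda item: item[0])


def normalise(kind, match, anchor):
    """Stage 2 (normalisation): compute the VAL attribute from the anchor date."""
    if kind == "date":
        month = MONTHS[match.group(2).lower()]
        return f"{anchor.year:04d}-{month:02d}-{int(match.group(1)):02d}"
    # 12-hour clock to 24-hour clock: 12 am -> 00, 12 pm -> 12.
    hour = int(match.group(1)) % 12 + (12 if match.group(3).lower() == "pm" else 0)
    minute = int(match.group(2) or 0)
    return f"{anchor.isoformat()}T{hour:02d}:{minute:02d}"


def annotate(question, anchor):
    """Wrap every identified expression in a TIMEX2 element."""
    out, last = [], 0
    for start, end, kind, match in identify(question):
        out.append(question[last:start])
        value = normalise(kind, match, anchor)
        out.append(f'<TIMEX2 VAL="{value}">{question[start:end]}</TIMEX2>')
        last = end
    out.append(question[last:])
    return "".join(out)


print(annotate("Is the museum open on the 2nd of July?", date(2009, 7, 2)))
print(annotate("Which films start after 10 pm?", date(2009, 7, 2)))
```

    Keeping identification and normalisation in separate functions mirrors the two-stage description: the first only decides where an expression starts and ends, the second only decides what value it receives.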
  • TE identification (1)
  • TE identification (2)
    Why is it easy? - in general
    Calendar dates ("2009-07-02") represent more than 50% of all TEs present in user questions
  • TE normalisation
  • Simplifying an existing TE annotator for QALL-ME
    A complex TE annotator is available at the University of Wolverhampton (the initial TE annotator):
      is able to annotate all types of existing TEs according to TIMEX2
      has high performance
    The simplified TE annotator:
      covers only a few TE classes (the most frequent types of TE present in user questions)
      follows the design and methodology employed in the initial TE annotator
  • The QALL-ME benchmark
    A collection of several thousand spoken questions in 4 languages:
      1. Italian
      2. English
      3. Spanish
      4. German
    Purposes:
      1. to allow development of applications based on machine learning for QA
      2. to enable testing their performance in a controlled laboratory setting
    To date the benchmark contains 15,479 questions related to cultural events and tourism, e.g. accommodation, gastro, cinemas, movies, exhibitions, etc.
    From the 4,501 questions included in the English part of the QALL-ME benchmark, we selected 1,118 questions for our experiments.
  • Distribution of TEs
    1,118 randomly selected user questions have been manually annotated according to TIMEX2.
    The QALL-ME user questions do not cover the full range of possible TEs.
    Class         Subclass                    Time Position  Duration  Frequency
    Precise       Calendar dates                         86        19         13
                  Times of day                           81
                  Week                                   12
    Fuzzy         Past, Present, Future                   0         1        N/A
                  Seasons and parts of year               1
                  Weekends                               16
                  Day parts                              20
    Non-specific                                         43        10          0
    Table: Distribution of TEs in the QALL-ME user questions
  • Coverage of the simplified TE annotator
    Columns: Time Position, Duration, Frequency
    Rows: Precise (Calendar dates, Times of day, Week); Fuzzy (Past, Present, Future; Seasons and parts of year; Weekends; Day parts; Frequency N/A); Non-specific
    Table: TE classes partially covered by QALL-ME temporal annotator
  • Evaluation results
    The gold standard consists of 1,118 user questions manually annotated according to the TIMEX2 guidelines.
    Measure (annotator)                Complete and partial matches  Complete matches
    Precision (initial annotator)                             96.0%             89.0%
    Recall (initial annotator)                                95.0%             88.1%
    F-measure (initial annotator)                             95.5%             88.5%
    Precision (simplified annotator)                          96.6%             82.4%
    Recall (simplified annotator)                             76.2%             64.9%
    F-measure (simplified annotator)                          85.2%             72.6%
    Table: initial TE identifier and the simplified version evaluated against the gold standard
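    The F-measure rows in the evaluation table are consistent with the balanced F-measure, the harmonic mean of precision and recall, F = 2PR / (P + R). The short check below recomputes it from the reported precision and recall; the slides do not state which F-measure variant was used, so the balanced form is an assumption, and the dictionary keys are labels chosen for this sketch.

```python
def f_measure(precision, recall):
    """Balanced F-measure (harmonic mean of precision and recall)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) in percent, copied from the table.
reported = {
    "initial, complete and partial": (96.0, 95.0, 95.5),
    "initial, complete only": (89.0, 88.1, 88.5),
    "simplified, complete and partial": (96.6, 76.2, 85.2),
    "simplified, complete only": (82.4, 64.9, 72.6),
}

for name, (precision, recall, reported_f) in reported.items():
    computed = f_measure(precision, recall)
    print(f"{name}: computed {computed:.1f}, reported {reported_f}")
```

    The gap between the two annotators comes almost entirely from recall: the simplified annotator is as precise as the initial one on lenient matching but finds fewer of the gold-standard expressions.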
  • Error analysis
    Some expressions are not captured by the simplified annotator: a check-out time, the check-in time, the minimum age, weekday daytime
    Some expressions are partially captured by the simplified annotator:
    Simplified annotator         Gold standard
    <8-years>-old                8-years-old
    weekday <evening>            weekday evening
    <10 pm> <tomorrow>           10 pm tomorrow
    1 hundred and <23 minutes>   1 hundred and 23 minutes
    Table: partial matches examples
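    The difference between complete and partial matches can be made concrete with character spans. The sketch below is one plausible scoring scheme, not necessarily the one used in the paper: a system span is a complete match if it is identical to a gold span and a partial match if it merely overlaps one. The function names and the (start, end) span representation are assumptions made for this example.

```python
def overlaps(a, b):
    """True if half-open character spans a and b share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def ratio(numerator, denominator):
    return numerator / denominator if denominator else 0.0


def evaluate(system, gold):
    """Return (precision, recall) under strict and lenient span matching.

    system, gold: lists of (start, end) character spans, end exclusive.
    """
    strict_hits = [s for s in system if s in gold]
    lenient_system = [s for s in system if any(overlaps(s, g) for g in gold)]
    lenient_gold = [g for g in gold if any(overlaps(g, s) for s in system)]
    return {
        "strict": (ratio(len(strict_hits), len(system)),
                   ratio(len(strict_hits), len(gold))),
        "lenient": (ratio(len(lenient_system), len(system)),
                    ratio(len(lenient_gold), len(gold))),
    }


# "weekday evening": gold covers the whole phrase, the system only "evening".
print(evaluate(system=[(8, 15)], gold=[(0, 15)]))
# "10 pm tomorrow": gold is one expression, the system produced two.
print(evaluate(system=[(0, 5), (6, 14)], gold=[(0, 14)]))
```

    Under this scheme both examples from the partial-match table score zero on strict matching and full marks on lenient matching, which is the pattern behind the gap between the two columns of the evaluation table.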
  • Conclusions
    The paper presented a simplified temporal processor used by an English QA system:
      implemented for a specific task
      obtained acceptable performance
      proved to be enough for a practical application