Temporal Relations with Signals: the case of Italian Temporal Prepositions

Presentation for the IEEE sponsored conference, held at Brixen in July 2009. http://www.inf.unibz.it/krdb/events/time-2009/


  1. 1. Temporal Relations with Signals: the Case of Italian Temporal Prepositions. Tommaso Caselli, Felice dell’Orletta and Irina Prodanof {firstname.lastname@ilc.cnr.it}, ILC-CNR, Pisa. 16th International Symposium on Temporal Representation and Reasoning, TIME 2009, Bressanone/Brixen, July 24, 2009
  2. 2. Introduction <ul><li>Different approach </li></ul><ul><li>Application oriented </li></ul><ul><li>NLP techniques </li></ul><ul><li>Focus on: intuitions, knowledge and strategies people use in order to </li></ul><ul><ul><ul><li>place events in time </li></ul></ul></ul><ul><ul><ul><li>order events (encoding and decoding) </li></ul></ul></ul><ul><li>Query texts (corpora) and NOT structured knowledge </li></ul>
  3. 3. Outline: <ul><li>Motivations </li></ul><ul><li>Temporal Signals in Italian: </li></ul><ul><ul><li>Theoretical background </li></ul></ul><ul><ul><li>Methodology </li></ul></ul><ul><ul><li>Corpus Study </li></ul></ul><ul><li>A Maximum Entropy Model </li></ul><ul><ul><li>Feature Identification </li></ul></ul><ul><ul><li>Evaluation and Results </li></ul></ul><ul><li>Conclusion and Future Work </li></ul>
  4. 4. Motivations <ul><li>Recovering temporal relations in text/discourse is essential to improve the performance of many NLP systems (O.D-Q.A., Text Mining, Summarization, Reasoning) </li></ul><ul><li>Most temporal information in text/discourse is only IMPLICITLY stated </li></ul><ul><li>Need to develop procedures to maximize the role of the various sources of information </li></ul>Temporal prepositions are a partially explicit source of information. Determining their meaning is part of a strategy to improve the extraction of temporal information.
  5. 5. Motivations (2)
  6. 6. Theoretical Background <ul><li>SIGNAL = cover term for a homogeneous class of words which express relations between textual entities </li></ul><ul><li>EXPLICIT = self-evident and stable meaning; Rel (X, Y) </li></ul><ul><li>IMPLICIT = abstract meaning which gets specialized in the co-text; Rel ( λ (X), λ (Y)) </li></ul><ul><li>Temporal signals express temporal relations. </li></ul><ul><li>Temporal signals can occur in 3 types of constructions involving temporal entities: </li></ul><ul><li>temporal expression – temporal expression </li></ul><ul><li>eventuality – temporal expression </li></ul><ul><li>eventuality - eventuality </li></ul>
  7. 7. Corpus Study: Data <ul><li>To identify a large set of temporal signals realized by prepositions we have conducted a corpus study: </li></ul><ul><li>5 million shallow parsed word corpus (from the PAROLE corpus) </li></ul><ul><li>all PP chunks with their left and right contexts have been automatically extracted and imported into a database structure </li></ul><ul><li>automatically generated DB augmented with ontological information from the SIMPLE/CLIPS Ontology, by associating the head noun of each PP chunk to its ontological type </li></ul><ul><li>extraction of the noun head corresponding to type TIME + postprocessing to exclude false positives (e.g. incubation , school …) </li></ul>
  8. 8. Corpus Study (2) <ul><li>Temporal relations coded by implicit signals: </li></ul><ul><li>annotation of temporal relations by means of paraphrase tests </li></ul><ul><li>e.g. [sono stato sposato] per [4 anni] ( I’ve been married for four years ) </li></ul><ul><li>The state of “being married” EQUALS four years </li></ul><ul><li>499 occurrences of constructions of the type “eventuality + signal + temporal expression” </li></ul><ul><li>9 temporal relations (compliant with TimeML and ISO-TimeML): overlap, simultaneous, before, after, no tlink, begin, end, before_ending, equals </li></ul><ul><li>the most frequent temporal relation for an implicit signal is assumed to be the prototypical meaning of that signal </li></ul>
  9. 9. Feature Identification <ul><li>The corpus study, together with theoretical statements, has led to the identification of 16 features: </li></ul><ul><li>PREP: the signal lemma </li></ul><ul><li>3 sets of co-textual features: </li></ul><ul><ul><li>information about the temporal expression </li></ul></ul><ul><ul><li>information about the eventuality </li></ul></ul><ul><ul><li>local contextual information </li></ul></ul>
  10. 10. Feature Identification – Temporal Expressions <ul><li>Temporal expression features: </li></ul><ul><li>Ontological status: INSTANT, INTERVAL </li></ul><ul><li>Type of temporal expression (TIMEX): </li></ul><ul><ul><li>DATE: August 3; 1968; 01/12/1980 … </li></ul></ul><ul><ul><li>DURATION: 3 hours; the last quarter … </li></ul></ul><ul><ul><li>SET: once every year … </li></ul></ul><ul><ul><li>TIME: 3 o’clock; (in) the morning … </li></ul></ul><ul><li>Presence of a quantifier: QUANTIFIER </li></ul>
  11. 11. Feature Identification – Eventuality <ul><li>Eventuality features: </li></ul><ul><li>Lemma (POTGOV_head) </li></ul><ul><li>POS of the eventuality: VERB, NOUN </li></ul><ul><li>Presence of negation (NEGATION) </li></ul><ul><li>Verb diathesis (DIATESIS) </li></ul><ul><li>Tense: PRESENT, IMPERFECT, FUTURE, PAST, INFINITIVE </li></ul><ul><li>(Viewpoint) Aspect: IMPERFECTIVE, PERFECTIVE, PROGRESSIVE, NONE </li></ul><ul><li>Lexical Aspect (AKTIONSAART): TRANSITION, PROCESS, STATE </li></ul>
  12. 12. Feature Identification – Local context <ul><li>Local context features: features which account for the presence of further signals in the local context that influence the identification of the Rel value of the signal under analysis </li></ul><ul><li>FOLLOWED_SIGNAL+TIMEX </li></ul><ul><li>PRECEED_SIGNAL+TIMEX </li></ul><ul><li>FOLLOWED_SIGNAL+EVENT </li></ul>
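The last point of this slide, taking the most frequent relation as the prototypical meaning of a signal, can be sketched in a few lines of Python. This is only an illustration: the function name `prototypical_relations` and the toy annotations are assumptions and are not taken from the corpus study.

```python
from collections import Counter, defaultdict


def prototypical_relations(annotations):
    """Map each signal to its most frequently annotated temporal relation."""
    counts = defaultdict(Counter)
    for signal, relation in annotations:
        counts[signal][relation] += 1
    return {signal: c.most_common(1)[0][0] for signal, c in counts.items()}


# Toy, invented (signal, relation) pairs; not the 499 corpus occurrences.
toy = [
    ("per", "equals"), ("per", "equals"), ("per", "before"),
    ("in", "overlap"), ("in", "overlap"), ("in", "equals"),
]
protos = prototypical_relations(toy)
```

With real annotations, ties between relations would need an explicit policy; the sketch simply keeps whichever relation `Counter.most_common` lists first.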
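One possible way to hand such annotated instances to a classifier is to flatten them into `NAME=value` strings. The sketch below assumes this encoding; the slides do not say how features were represented, and the helper `encode_instance` and the example values are hypothetical.

```python
def encode_instance(attributes):
    """Turn annotated attributes into sorted 'NAME=value' feature strings.

    Attributes set to None (not annotated) are skipped.
    """
    return sorted(
        f"{name}={value}"
        for name, value in attributes.items()
        if value is not None
    )


# Illustrative values only; they are not taken from the annotated data set.
example = {
    "PREP": "per",
    "TIMEX": "DURATION",
    "INTERVAL": True,
    "INSTANT": False,
    "QUANTIFIER": True,
    "NEGATION": False,
    "AKTIONSAART": "STATE",
    "FOLLOWED_SIGNAL+TIMEX": None,  # not annotated in this toy instance
}
features = encode_instance(example)
```

Sorting keeps the encoding deterministic, which makes instances easy to compare and to cache.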
  13. 13. Building a M.E. Model <ul><li>Feature annotation: conducted manually by one annotator and one of the authors. </li></ul><ul><li>1000 instances of constructions of the type “eventuality + signal + timex” </li></ul><ul><ul><li>two interlinked criteria: semantic transparency of the signal + relative frequency of the signal in the 5 million word shallow parsed corpus </li></ul></ul>Assigning the right temporal relation is (in essence) a tagging task. Maximum Entropy algorithm: it provides a suitable solution to identify the set of possible values for each signal on the basis of the conditional probability distribution. No a priori constraints must be met other than those related to a set of features f_i(a, c) of a context C, whose distribution is derived from the training data.
  14. 14. Evaluation The data set has been split into test (100) and training (900) data. 8 different models have been created to discover the most salient features. 10-fold cross-validation per model. All models outperform the baseline, which indicates the relevance of the features.
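For reference, the conditional distribution of a Maximum Entropy model is usually written in the standard form below. The slide does not show a formula, so this is the textbook formulation rather than the authors' own notation: a is a candidate temporal relation, c the context of the signal, f_i(a, c) the features named on the slide, and the weights λ_i and the normalizer Z(c) are symbols introduced here.

```latex
p(a \mid c) = \frac{1}{Z(c)} \exp\Big( \sum_{i} \lambda_i \, f_i(a, c) \Big),
\qquad
Z(c) = \sum_{a'} \exp\Big( \sum_{i} \lambda_i \, f_i(a', c) \Big)
```

The weights λ_i are estimated from the training data, and the relation assigned to a signal is the a with the highest p(a | c).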
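The split and the cross-validation described on this slide could be set up roughly as follows. This is a sketch under assumptions: the slide does not say whether the 10 folds were drawn from the 900 training instances or from the full set, nor how instances were shuffled; here the folds come from the training portion, and `train_test_split` and `k_folds` are hypothetical helper names.

```python
import random


def train_test_split(items, n_test, seed=0):
    """Shuffle a copy of items and return (training, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


def k_folds(items, k=10):
    """Yield (train, held_out) pairs; each item is held out exactly once."""
    folds = [items[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, held_out


instances = list(range(1000))  # stand-ins for the 1000 annotated instances
train, test = train_test_split(instances, n_test=100)
folds = list(k_folds(train, k=10))
```

Each of the 8 models would then be trained and scored once per fold, with the per-fold accuracies averaged.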
  15. 15. Evaluation (2) <ul><li>PREP </li></ul><ul><li>INTERVAL </li></ul><ul><li>INSTANT </li></ul><ul><li>POTGOV_head </li></ul><ul><li>VERB </li></ul><ul><li>NOUN </li></ul><ul><li>DIATESIS </li></ul><ul><li>NEGATION </li></ul><ul><li>AKTIONSAART </li></ul><ul><li>FOLLOWED_SIGNAL+TIMEX </li></ul><ul><li>PRECEED_SIGNAL+TIMEX </li></ul><ul><li>FOLLOWED_SIGNAL+EVENT </li></ul><ul><li>TENSE </li></ul><ul><li>ASPECT </li></ul><ul><li>TIMEX </li></ul><ul><li>QUANTIFIER </li></ul>10 Feature Model Performance = 90% <ul><li>surface-based features </li></ul><ul><li>good performance without the AKTIONSAART feature </li></ul>
  16. 16. Evaluation (3) <ul><li>8 features, performance 89.8%: PREP, INTERVAL, INSTANT, FOLLOWED_SIGNAL+TIMEX, PRECEED_SIGNAL+TIMEX, FOLLOWED_SIGNAL+EVENT, TIMEX, QUANTIFIER </li></ul><ul><li>9 features, performance 89.8%: PREP, INTERVAL, INSTANT, AKTIONSAART, FOLLOWED_SIGNAL+TIMEX, PRECEED_SIGNAL+TIMEX, FOLLOWED_SIGNAL+EVENT, TIMEX, QUANTIFIER </li></ul>
  17. 17. Evaluation (3) <ul><li>5 features, performance 87.6%: PREP, INTERVAL, INSTANT, TIMEX, QUANTIFIER </li></ul><ul><li>7 features, performance 86.8%: PREP, INTERVAL, INSTANT, FOLLOWED_SIGNAL+TIMEX, PRECEED_SIGNAL+TIMEX, FOLLOWED_SIGNAL+EVENT, TIMEX, QUANTIFIER </li></ul><ul><li>8 features (No QUANTIFIER), performance 85%: PREP, INTERVAL, INSTANT, AKTIONSAART, FOLLOWED_SIGNAL+TIMEX, PRECEED_SIGNAL+TIMEX, FOLLOWED_SIGNAL+EVENT, TIMEX </li></ul>
  18. 18. Conclusion & Future Work Mismatch between linguistic theory and feature salience <ul><li>Observations on the features: </li></ul><ul><li>5 core features: PREP, INSTANT, INTERVAL, TIMEX, QUANTIFIER (5 feature model) </li></ul><ul><li>The influence of AKTIONSAART in this task is almost null. It could be replaced by a set of more surface-based features, e.g. presence of D.O., definiteness, cardinality, type of subject… </li></ul><ul><li>the remaining features could be activated in particular linguistic contexts and with particular signals; e.g. TENSE, ASPECT and AKTIONSAART (or its substitutes) with the signal IN; the local context features with the signals DA, A and TRA. </li></ul>Integration of the M.E. Model into a complete automatic system for temporal processing of text/discourse
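Comparing models that differ only in their feature set amounts to projecting each instance onto a subset of feature names before training. A minimal sketch, assuming instances are stored as dictionaries keyed by feature name (the slides do not describe the data format); `MODELS`, `project` and the example values are hypothetical, while the feature lists are the ones reported for the 5- and 9-feature models.

```python
# Feature subsets as listed on the evaluation slides.
MODELS = {
    "5 features": ["PREP", "INTERVAL", "INSTANT", "TIMEX", "QUANTIFIER"],
    "9 features": [
        "PREP", "INTERVAL", "INSTANT", "AKTIONSAART",
        "FOLLOWED_SIGNAL+TIMEX", "PRECEED_SIGNAL+TIMEX",
        "FOLLOWED_SIGNAL+EVENT", "TIMEX", "QUANTIFIER",
    ],
}


def project(instance, feature_names):
    """Keep only the features that a given model is allowed to see."""
    return {name: instance[name] for name in feature_names if name in instance}


# Invented instance, for illustration only.
instance = {
    "PREP": "per", "INTERVAL": True, "INSTANT": False,
    "TIMEX": "DURATION", "QUANTIFIER": True,
    "AKTIONSAART": "STATE", "TENSE": "PAST",
}
core_view = project(instance, MODELS["5 features"])
```

Training the same learner on each projected view isolates the contribution of the features that were added or dropped.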
  19. 19. <ul><li>Thanks </li></ul>
