Determining the Types of Temporal Relations in Discourse

  1. Determining the Types of Temporal Relations in Discourse
     Leon Derczynski, University of Sheffield. 5 March, 2013.
     Outline: Introduction; Concepts and tools; Relation Extraction; Temporal Signals; Modelling Tense; Conclusion
  2. The Role of Time
     Why is time important in language processing?
       • World state changes constantly
       • Every empirical assertion has temporal bounds: “The sky is blue”, but it was not always
       • Without it, naïve knowledge extraction will fail (given an Almanac of Presidents, who is President?)
       • By understanding temporal information, you will do better knowledge extraction
     Overall goal: How do we automatically understand temporal information in natural languages?
  3. Temporal Information Extraction
     Existing state of the art. How can we categorise types of temporal information?
       • Events: e.g. occurrences, states
       • Temporal expressions (timexes): e.g. dates, durations
       • Links: relations between pairs of events or times
       • Supporting texts: e.g. action cardinality, event ordering
     We develop and use ISO-TimeML to annotate these entities.
     Main dataset: TimeBank (about 180 annotated documents)
  4. TimeML
     Organizers <EVENT eid="e2120" class="REPORTING">state</EVENT> the <TIMEX3 tid="t29" type="DURATION" value="P2D" temporalFunction="false" functionInDocument="NONE">two days</TIMEX3> of music, dancing, and speeches is <EVENT eid="e2123" class="I_STATE">expected</EVENT> to <EVENT eid="e13" class="OCCURRENCE">draw</EVENT> some two million people.
     <TLINK eventID="e2123" relatedToTime="t29" relType="BEFORE"/>
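To make the annotation layers concrete, the fragment above can be read with a standard XML parser. This is a minimal sketch, not part of the talk: the wrapping `<TimeML>` root element and the function name `parse_timeml` are assumptions added for illustration, and the attribute names are copied from the slide as-is.

```python
import xml.etree.ElementTree as ET

# The slide's fragment, wrapped in an assumed <TimeML> root so it is well-formed XML.
FRAGMENT = (
    '<TimeML>Organizers <EVENT eid="e2120" class="REPORTING">state</EVENT> the '
    '<TIMEX3 tid="t29" type="DURATION" value="P2D" temporalFunction="false" '
    'functionInDocument="NONE">two days</TIMEX3> of music, dancing, and speeches is '
    '<EVENT eid="e2123" class="I_STATE">expected</EVENT> to '
    '<EVENT eid="e13" class="OCCURRENCE">draw</EVENT> some two million people. '
    '<TLINK eventID="e2123" relatedToTime="t29" relType="BEFORE"/></TimeML>'
)


def parse_timeml(xml_text):
    """Collect events, timexes and event-to-time links from a TimeML-style string."""
    root = ET.fromstring(xml_text)
    # Map each event id to its class and surface text.
    events = {e.get("eid"): (e.get("class"), e.text) for e in root.iter("EVENT")}
    # Map each timex id to its type, normalised value and surface text.
    timexes = {
        t.get("tid"): (t.get("type"), t.get("value"), t.text)
        for t in root.iter("TIMEX3")
    }
    # Each link becomes a (source, relation type, target) triple.
    links = [
        (l.get("eventID"), l.get("relType"), l.get("relatedToTime"))
        for l in root.iter("TLINK")
    ]
    return events, timexes, links


events, timexes, links = parse_timeml(FRAGMENT)
```

Here `links` holds one triple per TLINK, which is the unit the relation typing task has to label.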
  5. Times and Events
     What are temporal expressions?
       • They refer to a time
       • Subtasks: recognition and interpretation; SotA recognition is 0.86 F1
     What do we consider as events?
       • Verbal, nominal
       • State of the art: 0.90 F1 for recognition
       • Doesn’t cover complex structure; e.g. a music festival
       • Events are not very useful unless related to other temporal entities
     How can we describe this structural complexity?
       • Start by modeling the document as a graph
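One way to picture "the document as a graph" is with events and timexes as nodes and labelled temporal relations as edges. The sketch below is an illustrative assumption, not the talk's implementation: the class name `TemporalGraph`, the small `INVERSE` table and the choice to store both directions of every link are all hypothetical.

```python
from collections import defaultdict

# Inverses for a small, illustrative subset of interval relation labels.
INVERSE = {
    "BEFORE": "AFTER",
    "AFTER": "BEFORE",
    "INCLUDES": "IS_INCLUDED",
    "IS_INCLUDED": "INCLUDES",
    "SIMULTANEOUS": "SIMULTANEOUS",
}


class TemporalGraph:
    """Nodes are event/timex ids; each edge carries a relation label."""

    def __init__(self):
        self.edges = defaultdict(dict)

    def add_link(self, source, rel, target):
        # Store the link and its inverse so lookups work in both directions.
        self.edges[source][target] = rel
        self.edges[target][source] = INVERSE[rel]

    def relation(self, a, b):
        # Returns None when the pair is unlinked (a partial ordering).
        return self.edges.get(a, {}).get(b)


graph = TemporalGraph()
graph.add_link("e2123", "BEFORE", "t29")
```

Unlinked pairs stay unlabelled, which matches the point that the graph may capture only a partial ordering.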
  6. Temporal relations
     What are temporal relations?
       • They describe the links between times and events
       • Can capture both complex and partial orderings
     What kinds of temporal relation are there?
       1. Interval (before, after, included by, simultaneous)
       2. Subordinate (reported speech, modal, conditional)
       3. Aspectual (start, culmination; see Vendler, Comrie)
     This work is concerned with the coarsest-grained information: the first category
  7. Problem Definition
     How are these relations represented?
       • Temporal interval algebra (Allen 1984): a set of 13 relations between a pair of intervals
       • TimeML defines a set of relation types and also types of interval
     What is our problem?
       • Assume discourse with perfect event and timex annotations
       • In fact, assume we know which intervals to link!
     “Given an ordered pair of intervals (arg1, arg2), which relation in the set R_allen describes them?”
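For reference, when the endpoints of two intervals are known, the interval relation between them follows mechanically. The sketch below shows that lookup; it assumes numeric endpoints and proper intervals (start strictly before end), and the function name and lowercase labels are illustrative. In discourse the endpoints are usually not known, which is why the task in the talk is a classification problem and not this computation.

```python
def allen_relation(a, b):
    """Return the interval relation holding between a=(start, end) and b=(start, end)."""
    (a1, a2), (b1, b2) = a, b
    if a1 >= a2 or b1 >= b2:
        raise ValueError("intervals must have start < end")
    # Disjoint intervals.
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    # Intervals that touch at exactly one endpoint.
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    # Shared endpoints.
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    # Strict containment.
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    # Only partial overlap remains.
    return "overlaps" if a1 < b1 else "overlapped-by"
```

For example, `allen_relation((1, 4), (2, 6))` gives `"overlaps"`, and swapping the arguments gives the inverse, `"overlapped-by"`.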
  8. Relation Extraction
     How can relations be labelled?
       • Machine learning
       • Using TimeML attributes: some success
       • Using syntactic relations: matches SotA in tree kernels
     What’s the state of the art?
       • 2007: Mani et al.: baseline 56%, system has 61% accuracy
       • 2008: Bethard, Chambers: many sophisticated improvements (ILP, timex-timex ordering). Improved on Mani et al. by 1.5%.
       • 2010: TempEval-2: baseline 58%, best was 65% accuracy
     Why do we find this performance ceiling?
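The accuracy figures above are easier to read next to a baseline. A common choice in this kind of task is to always predict the most frequent relation type; the sketch below assumes that definition (the slide does not say how its baselines were computed) and uses made-up toy labels, not TimeBank data.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Predict the most frequent training label for every test link."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    correct = sum(1 for label in test_labels if label == most_common_label)
    return most_common_label, correct / len(test_labels)


# Toy, invented label sets purely for illustration.
toy_train = ["BEFORE"] * 5 + ["AFTER"] * 3 + ["SIMULTANEOUS"] * 2
toy_test = ["BEFORE", "AFTER", "BEFORE", "BEFORE"]
baseline_label, baseline_accuracy = majority_baseline_accuracy(toy_train, toy_test)
```

A system that scores only a few points above such a baseline has learned little beyond the label distribution, which is what makes the reported ceiling notable.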
  9. Sources of Temporal Relation Information
     What are we missing? There is a heterogeneous set of temporal information types, including:
       • Explicit signals: subsequently, as soon as
       • Linguistic theory offers some models
     What is the evidence these two types will help?
       • Conducted failure analysis: TempEval-2010 [1]
       • Multiple diverse approaches, same dataset
       • Find the set of difficult links
       • Characterise information supporting these links
     [1] Verhagen et al., 2010: Semeval Task 13 - TempEval-2
  10. Figure: TempEval-2 relation labelling tasks, showing proportions of relations according to the number of systems that gave correct labels.
      Panels: Task C (event-timex intra-sentence relations), Task D (event-DCT relations), Task E (main event inter-sentence relations), Task F (event-subordinate intra-sentence relations).
      Each panel is divided into: all systems correct; 1 to 5 systems fail (1 to 4 for Tasks D and F); all systems fail.
  11. Figure: Proportion of links within a task that are difficult (y-axis: % difficult, 0 to 40; x-axis: Task C, D, E, F).
      The problem is difficult, and there is a consistently-difficult set of links. Perhaps we are ignoring some critical information.
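The failure analysis can be sketched as a count, per link, of how many systems mislabelled it. The code below is an illustration under stated assumptions: the data layout (dictionaries keyed by link id), the function names, and the threshold used to call a link "difficult" are all hypothetical, since the slides do not give the exact criterion. The toy labels are invented.

```python
from collections import Counter


def failure_counts(gold, system_outputs):
    """For each gold link, count the systems whose label differs from the gold label."""
    return {
        link: sum(1 for output in system_outputs.values() if output.get(link) != label)
        for link, label in gold.items()
    }


def failure_histogram(gold, system_outputs):
    """Number of links failed by 0, 1, 2, ... systems (the breakdown in the figure)."""
    return Counter(failure_counts(gold, system_outputs).values())


def difficult_links(gold, system_outputs, min_failing_fraction=0.5):
    """Links failed by at least the given fraction of systems (threshold is an assumption)."""
    n_systems = len(system_outputs)
    counts = failure_counts(gold, system_outputs)
    return sorted(
        link for link, fails in counts.items()
        if fails / n_systems >= min_failing_fraction
    )


# Toy, invented data: three links, three systems.
toy_gold = {"l1": "BEFORE", "l2": "AFTER", "l3": "OVERLAP"}
toy_systems = {
    "sysA": {"l1": "BEFORE", "l2": "BEFORE", "l3": "BEFORE"},
    "sysB": {"l1": "BEFORE", "l2": "AFTER", "l3": "AFTER"},
    "sysC": {"l1": "BEFORE", "l2": "BEFORE", "l3": "BEFORE"},
}
```

Running this over several diverse systems on the same dataset separates links that one system happens to miss from links that nearly every approach gets wrong.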
  12. New sources of ordering information
      Next step: manually characterise each “difficult” link. Attempt to identify what kind of information could be used to label it.
      Sources to investigate:
        • Explicit text (signals): “After you pull the pin, throw the grenade”
        • Tensed relations: “Having eaten, I left”
  13. Temporal Signals
      What are these?
        • In TimeML, they are text annotated as being helpful to a temporal relation
        • Used by 12.2% of TimeBank’s relations
      Are temporal signals useful?
        • A resounding yes! 61% → 83% accuracy with simple features [2]
        • This level of performance on event-event links is above general state-of-the-art
        • Existing corpora are under-annotated
      [2] Derczynski and Gaizauskas, 2010: Using signals for temporal relation classification
  14. Temporal Signal Annotation
      How can we automatically annotate temporal signals?
        • Define signals formally [3]
        • Define a closed class of signals
        • Re-annotate TimeBank
        • Train discrimination and association
      We included dependency information and function tagging.
      [3] Derczynski and Gaizauskas, 2011: A corpus based study of temporal signals
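A closed class of signal expressions makes candidate spotting a simple lookup; the harder steps are discrimination (is this occurrence really temporal?) and association (which two intervals does it link?). The sketch below covers only the lookup. The word list is illustrative: "after", "subsequently" and "as soon as" appear in the talk, while "before" and "when" are added assumptions, and the function name is hypothetical.

```python
import re

# Illustrative closed class; not the list used in the talk.
CANDIDATE_SIGNALS = ["as soon as", "subsequently", "after", "before", "when"]


def find_candidate_signals(text, signals=CANDIDATE_SIGNALS):
    """Return (character offset, signal) pairs for every closed-class match in text."""
    # Try longer expressions first so multi-word signals are matched whole.
    ordered = sorted(signals, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(s) for s in ordered) + r")\b",
        re.IGNORECASE,
    )
    return [(m.start(), m.group(1).lower()) for m in pattern.finditer(text)]
```

Each candidate found this way would then go to a trained discriminator, since words like "after" also have non-temporal uses.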
  15. Results
      How well did our approach perform?
        1. Discrimination: 92% accuracy, 75% accuracy on positives (0.77 IAA)
        2. Association: 99% accuracy / 80% error reduction
        3. Inductive bias towards independence assumption was harmful (MaxEnt, NBayes)
      Results: 16% of links have signals (31% improvement) and can now be labelled at high accuracy.
      What remains to be done?
        • How can we remedy under-annotation at the source?
        • Clear links to spatial signal annotation (e.g. -LOC tags)
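Two of these figures can be cross-checked with simple arithmetic. The sketch assumes that the 31% improvement is the relative growth from the earlier 12.2% of signalled relations to 16%, and that "error reduction" is measured relative to a baseline error rate; neither definition is spelled out on the slide, and the function names are illustrative.

```python
def relative_improvement(old, new):
    """Relative change from old to new, as a fraction of old."""
    return (new - old) / old


def implied_baseline_accuracy(system_accuracy, error_reduction):
    """Baseline accuracy implied by a system accuracy and a relative error reduction."""
    system_error = 1.0 - system_accuracy
    baseline_error = system_error / (1.0 - error_reduction)
    return 1.0 - baseline_error


signal_coverage_gain = relative_improvement(12.2, 16.0)      # roughly 0.31
association_baseline = implied_baseline_accuracy(0.99, 0.80)  # roughly 0.95
```

Under those assumptions, 99% accuracy with an 80% error reduction corresponds to a baseline of about 95%, which suggests association is the easier of the two subtasks.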
  16. Reichenbach’s Model of Verbs
      How can we model tense in language?
        • Each verb happens at event time, E
        • The verb is uttered at speech time, S
        • Past tense: E < S (“John ran.”)
        • Present tense: E = S (“I’m free!”)
      What differentiates simple past from past perfect?
        • “John ran.” is not the same as “John had run.”
        • Introduce abstract reference time, R
        • “John had run.”: E < R < S
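The three-point description lends itself to a small lookup table. The sketch below encodes four tenses as orderings of E, R and S and answers how any two points relate. The past perfect entry comes from the slide; the placement of R in the other three tenses follows the usual reading of Reichenbach's scheme and is an assumption here, as are the string encoding and function names.

```python
# Each tense is a chain of the points E, R, S joined by "<" (precedes) or "=" (coincides).
TENSES = {
    "simple past": "E=R<S",      # "John ran."
    "past perfect": "E<R<S",     # "John had run."
    "simple present": "E=R=S",   # "I'm free!"
    "present perfect": "E<R=S",  # "John has run."
}


def point_ranks(spec):
    """Turn a chain such as 'E<R<S' into integer ranks, e.g. {'E': 0, 'R': 1, 'S': 2}."""
    ranks = {spec[0]: 0}
    rank = 0
    for operator, point in zip(spec[1::2], spec[2::2]):
        if operator == "<":
            rank += 1
        ranks[point] = rank
    return ranks


def order(tense, a, b):
    """Return '<', '=' or '>' for points a and b (each one of 'E', 'R', 'S')."""
    ranks = point_ranks(TENSES[tense])
    if ranks[a] < ranks[b]:
        return "<"
    if ranks[a] > ranks[b]:
        return ">"
    return "="
```

With this encoding, simple past and past perfect agree on E < S and differ only in where R sits, which is the distinction the reference point is introduced to capture.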
  17. Reasoning about tense
      How is Reichenbach’s model helpful?
        • We can describe all verbal events as three points linked by either equality or precedence
        • Automatic and quick inference for relating intervals
      Does it work?
        • Conducted first corpus-driven validation of the framework
        • For reporting-type links, we used features based on pairwise event-time relations
        • Add one feature representing the Reichenbachian ordering
        • Classifier reached 59% accuracy (48% MCC baseline) on 9% of all temporal relations (above SotA)
  18. Extending the model
      How else can we use the model? Positional use:
        • Timexes relate to reference points
        • Only consider cases where the event and time are linguistically connected
        • Identify these using dependency parses
        • Add a feature hinting at the ordering
        • We reach 75% accuracy from a 67% baseline (above SotA)
        • Also useful for timex standard transduction [4]
      [4] Derczynski, Llorens and Saquete 2012: Massively increasing TIMEX3 resources
  19. Contributions
      A large part of the difficult relation set (roughly 60%) is catered for by these new information sources.
        • Difficult task, with notable impact
        • Focus on automatic annotation of temporal relations
        • Pushed beyond SotA understanding of the problem
        • Creation of and contribution to language resources, e.g. ISO-TimeML, RTMML, CAVaT (among others)
      ... where could we go next?
  20. Future: Forensic analysis
      How can we build a consistent event model from multiple semi-reliable accounts of an event?
      Challenges:
        • Multi-document event and actor co-reference
        • Story conflict resolution [5]
        • Spatial and temporal IE from colloquial text
        • Building and resolving accurate co-constraining models from unreliable data (belief networks)
      [5] Regneri, Koller and Pinkal 2010: Learning Script Knowledge with Web Experiments
  21. Future: Assertion bounding
      All assertions have temporal bounds. How can we determine these?
      Challenges:
        • Accurate extraction of document temporal structure
        • Automated reasoning
        • High-precision timex normalisation
        • Doing temporal IE & IR at gigaword scale
  22. Future: Temporal dataset construction
      Many current systems index whole documents by date, but information is more nuanced than that.
      Challenges:
        • Mapping events to temporal data points
        • Storing and extracting events
        • Anchoring events with uncertain bounds (“last year’s fighting” vs. “the fighting on April 23, 2011”)
        • Mining complex super-events; e.g. the Fukushima disaster; what happened when?
  23. Recap
        • Temporality is ubiquitous, in the world around us and in the language we use to describe our world
        • Processing it automatically is difficult
        • Doing high-performance temporal IE opens exciting research avenues
      Thank you for your time. Are there any questions?
  24. Labellings as probability distributions
      Automated methods (e.g. classifiers) may have varying degrees of confidence about a link’s label. We could assign a set of labels and probabilities to each link. Consistency constraints allow us to find the most-likely possible graph.
        • A:B → before: 0.9; after: 0.1
        • B:C → before: 0.5; simultaneous: 0.5
        • A:C → before: 1.0
      Very time-consuming to compute; optimisations welcome!
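The search for the most likely consistent graph can be sketched by brute force: enumerate every combination of labels, discard the combinations that break transitivity, and rank the rest by joint probability. This is a minimal illustration, not the talk's method. It assumes only the three labels used in the example, treats link probabilities as independent, and checks only triangles of the form X:Y, Y:Z, X:Z; the function names are hypothetical.

```python
from itertools import product

ALL_LABELS = {"before", "after", "simultaneous"}

# The example distributions from the slide.
EXAMPLE = {
    ("A", "B"): {"before": 0.9, "after": 0.1},
    ("B", "C"): {"before": 0.5, "simultaneous": 0.5},
    ("A", "C"): {"before": 1.0},
}


def compose(r1, r2):
    """Labels possible for X:Z given X:Y = r1 and Y:Z = r2."""
    if r1 == "simultaneous":
        return {r2}
    if r2 == "simultaneous":
        return {r1}
    if r1 == r2:
        return {r1}
    return ALL_LABELS  # before then after (or the reverse) constrains nothing


def is_consistent(assignment):
    """Check every triangle X:Y, Y:Z, X:Z present in the assignment."""
    for (x, y), r_xy in assignment.items():
        for (y2, z), r_yz in assignment.items():
            if y2 == y and (x, z) in assignment:
                if assignment[(x, z)] not in compose(r_xy, r_yz):
                    return False
    return True


def consistent_labellings(distributions):
    """All consistent label assignments, most probable first."""
    links = list(distributions)
    results = []
    for labels in product(*(distributions[link] for link in links)):
        assignment = dict(zip(links, labels))
        if is_consistent(assignment):
            probability = 1.0
            for link, label in assignment.items():
                probability *= distributions[link][label]
            results.append((probability, assignment))
    results.sort(key=lambda item: item[0], reverse=True)
    return results


ranked = consistent_labellings(EXAMPLE)
```

On the slide's example, the combination "A after B, B simultaneous with C" is ruled out because it would force A after C. The number of combinations grows exponentially with the number of links, which is consistent with the remark that this is very time-consuming to compute.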
  25. Unuttered temporal orderings
      Event/Time distance: “When I was brushing my teeth” → this event happens at least twice daily; assume this instance is 0-16 hours away
      Complex events: “When we were putting up the tents for the festival” → near the beginning of / just before the “festival” event
