Annotating Causality in the TempEval-3 Corpus

In Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL)

While there is wide consensus in the NLP community on the modeling of temporal relations between events, mainly based on Allen's temporal logic, the question of how to annotate other types of event relations, in particular causal ones, is still open. In this work, we present some annotation guidelines to capture causality between event pairs, partly inspired by TimeML. We then implement a rule-based algorithm to automatically identify explicit causal relations in the TempEval-3 corpus. Based on this annotation, we report some statistics on the behavior of causal cues in text and perform a preliminary investigation of the interaction between causal and temporal relations.

Published in: Science

  1. Annotating Causality in the TempEval-3 Corpus
     Paramita Mirza, Rachele Sprugnoli, Sara Tonelli, Manuela Speranza
     paramita@fbk.eu, sprugnoli@fbk.eu, satonelli@fbk.eu, manspera@fbk.eu
     CAtoCL Workshop, EACL, April 2014
  2. TempEval-3 Corpus
     • TimeML annotation → a markup language for events and temporal expressions
     • Include causal information in the TempEval-3 corpus
     Example: Hewlett-Packard acquired 730,070 common shares from Octel as a result of a stock purchase agreement signed on Aug. 10, 1988.
     [Figure: the example sentence shown twice, with EVENT, TIMEX and SIGNAL annotations, TLINKs labeled IS_INCLUDED and BEFORE, and an added CAUSE link]
  3. What will be covered…
     • Annotation guidelines for causality
     • Automatic annotation of explicit causality between events
     • Qualitative and quantitative evaluation
  4. What will be covered…
     • Annotation guidelines for causality
     • Automatic annotation of explicit causality between events
     • Qualitative and quantitative evaluation
  5. C-SIGNAL and CLINK
     TimeML annotation:
       - EVENT
       - TIMEX3
       - SIGNAL
       - TLINK
     + Causality:
       - C-SIGNAL
       - CLINK
     • C-SIGNAL → textual elements indicating the presence of causal relations
       – Prepositions: because of, as a result of, due to, …
       – Conjunctions: because, since, so that, …
       – Adverbial connectors: as a result, so, therefore, …
       – Clause-integrated expressions: the result is, that's why, …
     • CLINK → a directional one-to-one relation where
       – source = causing event
       – target = caused event
       – (optional) c-signalID = ID of related C-SIGNAL
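
The two new tags on slide 5 can be pictured as small records. The following is only a minimal sketch: the class names, field names and the IDs `e1`, `e2`, `c1` are hypothetical and are not taken from the guidelines; the example sentence is the "Iraq … invaded … because of disputes" sentence used later in the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CSignal:
    """A textual element signalling a causal relation (hypothetical record)."""
    id: str
    text: str


@dataclass(frozen=True)
class CLink:
    """A directional one-to-one causal link between two events."""
    source: str                        # ID of the causing event
    target: str                        # ID of the caused event
    c_signal_id: Optional[str] = None  # optional ID of the related C-SIGNAL


# "Iraq said it invaded Kuwait because of disputes over oil and money"
# Hypothetical IDs: e1 = "invaded", e2 = "disputes", c1 = "because of".
signal = CSignal(id="c1", text="because of")
link = CLink(source="e2", target="e1", c_signal_id=signal.id)
```

The link runs from the cause ("disputes") to the effect ("invaded"), regardless of the order in which the two events appear in the sentence.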
  6. Causal Concepts
     Dynamics Model based on Force Dynamics Theory (Talmy, 1988)
     • Captures the concept of causality, along with its related concepts, in terms of three dimensions:
       – the patient tendency for the result
       – the presence of concordance between the affector and the patient
       – the occurrence of the result
     • Able to distinguish the concept of CAUSE from ENABLE, which is not available in the counterfactual model
     • Was tested by linking it with natural language
     • The causality concepts can be lexicalized as verbs (Wolff and Song, 2003):
       – CAUSE-type: cause, influence, persuade, prompt, …
       – ENABLE-type: aid, allow, enable, let, …
       – PREVENT-type: block, constrain, prevent, restrain, …
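
As an illustration of how the verb classes on slide 6 could be used as a lookup, here is a small sketch. It contains only the verbs listed on the slide (the slide's lists end in "…", so they are not complete), and the names `VERB_CLASSES` and `causal_class` are hypothetical.

```python
# Verb classes as listed on slide 6 (Wolff and Song, 2003).
# The slide's lists are open-ended; only the verbs shown there are included.
VERB_CLASSES = {
    "CAUSE": ["cause", "influence", "persuade", "prompt"],
    "ENABLE": ["aid", "allow", "enable", "let"],
    "PREVENT": ["block", "constrain", "prevent", "restrain"],
}

# Invert the table so that a verb lemma maps to its causal class.
LEMMA_TO_CLASS = {
    lemma: cls for cls, lemmas in VERB_CLASSES.items() for lemma in lemmas
}


def causal_class(lemma):
    """Return 'CAUSE', 'ENABLE' or 'PREVENT' for a known lemma, else None."""
    return LEMMA_TO_CLASS.get(lemma.lower())
```

The function expects a lemma, so an inflected form such as "prompts" would have to be lemmatized first.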
  7. CLINK: explicit causal constructions linking two events (source to target)
     (In the examples, [S] marks the source event and [T] the target event.)
     • Basic construction
       – The purchase[S] caused the creation[T] of the current building
       – The purchase[S] enabled the diversification[T] of their business
       – The purchase[S] prevented a future transfer[T]
     • Expressions with affect verbs (affect, influence, determine, change)
       – Ogun CAN crisis[S] affects the launch[T] of the All Progressives Congress
     • Expressions with link verbs (link, lead, depend (on))
       – An earthquake[T] in North America was linked to a tsunami[S] in Japan
     • Periphrastic causatives
       – The blast[S] prompts the boat to heel[T] violently
       – The oxygen[S] lets the fire get[T] bigger
       – The pole[S] restrains the tent from collapsing[T]
     • Expressions with C-SIGNALs
       – Iraq said it invaded[T] Kuwait because of disputes[S] over oil and money
  8. Polarity of CLINK
     • Polarity of events can help determine the polarity of CLINKs
       – Serotonin deficiency[S] does not cause depression[T]
  9. What will be covered…
     • Annotation guidelines for causality
     • Automatic annotation of explicit causality between events
     • Qualitative and quantitative evaluation
  10. Rule-based Annotation
      • Dataset:
        – TBAQ-cleaned corpus from TempEval-3, with gold annotated events
      • Algorithm:
        – The dataset is PoS-tagged and parsed with the Stanford dependency parser
        – The dataset is further analyzed with the addDiscourse tool
        – Look for specific dependency constructions where a causal verb/signal is connected to two events
        – If such a dependency is found:
          • establish a CLINK
          • identify the source and target events
        – If a causal connector is an event, use the polarity of the event to assign the polarity of the CLINK
      • Limitations:
        – Only look for CLINKs between events within the same sentence
        – Only consider a finite set of causal verbs/signals
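
The "look for a dependency construction connecting a causal verb to two events" step can be sketched as follows for the basic construction only. This is a toy illustration, not the authors' implementation: the dependency edges are written by hand rather than produced by a parser, and the `Dep` record, the `nsubj`/`dobj` pattern and the function name `find_basic_clinks` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dep:
    """One dependency edge (hypothetical, hand-written for the example)."""
    head: int   # index of the governing token
    dep: int    # index of the dependent token
    label: str  # dependency label, e.g. "nsubj" or "dobj"


def find_basic_clinks(lemmas, events, deps, causal_verbs):
    """Return (source, target, verb_class) triples for causal verbs whose
    subject and direct object are both annotated events."""
    clinks = []
    for i, lemma in enumerate(lemmas):
        if lemma not in causal_verbs:
            continue
        subjects = [d.dep for d in deps
                    if d.head == i and d.label == "nsubj" and d.dep in events]
        objects = [d.dep for d in deps
                   if d.head == i and d.label == "dobj" and d.dep in events]
        for s in subjects:
            for t in objects:
                clinks.append((s, t, causal_verbs[lemma]))
    return clinks


# "The purchase caused the creation": tokens 1 and 4 are gold events.
lemmas = ["the", "purchase", "cause", "the", "creation"]
events = {1, 4}
deps = [Dep(2, 1, "nsubj"), Dep(2, 4, "dobj"), Dep(1, 0, "det"), Dep(4, 3, "det")]
result = find_basic_clinks(lemmas, events, deps, {"cause": "CAUSE"})
```

Here `result` holds one link from token 1 ("purchase") to token 4 ("creation"). Like the algorithm on the slide, the sketch only looks inside a single sentence and only at a fixed set of causal verbs.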
  11. Statistics of Automatic Annotation

      Explicit causality         CLINKs
      Basic construction             17
      Affect verbs                    0
      Link verbs                      4
      Periphrastic causatives        41
      Causal signals                111
      Total                         173

      • Remarks
        – ENABLE-type verbs never appear in the basic construction
        – 36 affect verb occurrences
        – 50 link verb occurrences
        – Of around 1K causative verb occurrences, only 14% are in periphrastic constructions
        – Of around 1.2K potential causal connectors, only 194 are recognized as causal signals (after disambiguation)
        – Only 2 CLINKs found with negative polarity
  12. Statistics of Automatic Annotation (3)
      • CLINKs vs TLINKs
        – 173 CLINKs vs 5.2K TLINKs
        – 33% of CLINKs have underlying TLINKs, most are signaled by C-SIGNALs
          • Iraq said it invaded[T] Kuwait because of disputes[S] over oil and money → BEFORE
        – For CLINKs with causative verbs, BEFORE is the only type (with one exception of SIMULTANEOUS)
        – For CLINKs with causal signals, BEFORE is also the majority type, with some exceptions:
          • But some analysts questioned[T] how much of an impact the retirement will have, because few jobs will end[S] up being eliminated → AFTER
          • The 486 is the descendant of a long series of Intel chips that began[T] dominating the market ever since IBM picked[S] the 16-bit 8088 chip for its first personal computer → BEGINS
  13. What will be covered…
      • Annotation guidelines for causality
      • Automatic annotation of explicit causality between events
      • Qualitative and quantitative evaluation
  14. Qualitative Evaluation
      • Two main types of errors:
        – Wrong identification of involved events, due to dependency parser mistakes
          • StatesWest Airlines said it withdrew its offer to acquire[S] Mesa Airlines because the Farmington carrier did not respond[T] to its offer.
        – Annotation of sentences not containing causal relations, due to the ambiguous nature of verbs, prepositions and conjunctions
          • Since then, 427 fugitives have been taken into custody or located.
  15. Qualitative Evaluation (2)

      Connector types           Extracted   Correct   Precision
      Basic          CAUSE              5         3        0.60
                     PREVENT           12         3        0.25
                     ENABLE             0         -           -
      Affect verbs                      0         -           -
      Link verbs                        4         3        0.75
      Periphrastic   CAUSE             11         8        0.73
                     PREVENT            6         1        0.17
                     ENABLE            24        17        0.71
      C-SIGNALs                       111        70        0.63
      Total                           173       105        0.61
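
The totals and the overall precision in the slide 15 table can be recomputed from its rows. The counts below are copied from the table; the only assumption is that a "-" in the Correct column is entered as 0, and the names `counts` and `precision` are illustrative.

```python
# (extracted, correct) per connector type, copied from the slide 15 table.
# Rows whose "Correct" cell is "-" are entered as 0 (assumption).
counts = {
    "Basic CAUSE": (5, 3),
    "Basic PREVENT": (12, 3),
    "Basic ENABLE": (0, 0),
    "Affect verbs": (0, 0),
    "Link verbs": (4, 3),
    "Periphrastic CAUSE": (11, 8),
    "Periphrastic PREVENT": (6, 1),
    "Periphrastic ENABLE": (24, 17),
    "C-SIGNALs": (111, 70),
}


def precision(extracted, correct):
    """Return correct / extracted, or None when nothing was extracted."""
    return correct / extracted if extracted else None


total_extracted = sum(e for e, _ in counts.values())
total_correct = sum(c for _, c in counts.values())
overall = precision(total_extracted, total_correct)
```

The rows sum to 173 extracted and 105 correct CLINKs, and 105 / 173 rounds to the 0.61 precision reported in the Total row.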
  16. Quantitative Evaluation
      • Manual annotation:
        – Dataset: 100 documents from TimeBank corpus
        – Inter-annotator agreement (on 5 documents):
          • 0.844 Dice's coefficient on C-SIGNAL
          • 0.73 Dice's coefficient on CLINK

      Automatic    Precision   Recall   F1-score
      C-SIGNAL          0.64     0.49       0.55
      CLINK             0.42     0.23       0.30

      Annotation               EVENT   C-SIGNAL   CLINK
      Manual                    3933         78     144
      Manual-w/o new events     3872         78      95
      Automatic                 3872         59      52
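
For reference, the two measures used on slide 16 can be written out as below. This is a generic sketch of the standard definitions, not the evaluation script behind the slide: the function names are illustrative and the two annotator sets are invented placeholders. Recomputing F1 from the rounded precision and recall on the slide may differ from the reported score in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def dice(a, b):
    """Dice's coefficient between two sets: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# CLINK row of the slide: precision 0.42, recall 0.23.
clink_f1 = f1_score(0.42, 0.23)

# Hypothetical annotations as (document, start, end) spans, for illustration only.
annotator_1 = {("d1", 3, 5), ("d1", 10, 11), ("d2", 7, 8)}
annotator_2 = {("d1", 3, 5), ("d2", 7, 8), ("d2", 20, 21)}
agreement = dice(annotator_1, annotator_2)
```

With these placeholder sets the annotators share two of three items each, so the coefficient is 2·2 / (3 + 3).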
  17. Conclusions
      • Annotation guidelines for causality between events… presented
      • Rule-based algorithm for automatic annotation:
        – Manual evaluation: 0.61 precision
        – Compared with manual annotation: 0.55 F1-score for C-SIGNAL and 0.30 F1-score for CLINK
        – Mistakes are introduced by the tools used for parsing and disambiguating causal signals
        – Not all events involved in causal relations are annotated
      • Recognizing CLINKs based on causal signals is more straightforward
  18. Conclusions (2)
      • Polarity of CLINK can be easily identified, though negative polarity is not so frequent
      • There are only a few overlaps between CLINKs and TLINKs, with BEFORE as the majority underlying temporal relation type
      • Future work…
        – Factuality and certainty annotation of events
        – Complete the manual annotation of the TempEval-3 corpus, and make it available
        – Another approach for automatic causal relation extraction
        – Integration of the proposed guidelines [1] with GAF (Fokkens et al., 2013)
      [1] available at http://www.newsreader-project.eu/publications/technical-reports/ (NWR-2014-2)
  19. Thank you!
      Paramita closes the presentation so the question-answering session may start.
      [Figure: the sentence above annotated with a CAUSE link and a BEFORE link]