Using SVMs with the Command Relation Feature
to Identify Negated Events in Biomedical
Literature

Farzaneh Sarafraz
Goran ...
  1. Using SVMs with the Command Relation Feature to Identify Negated Events in Biomedical Literature
     Farzaneh Sarafraz, Goran Nenadic
     School of Computer Science, University of Manchester
     sarafraf@cs.man.ac.uk, g.nenadic@manchester.ac.uk
  2. Outline
     • Motivation & aim
     • Molecular events
     • Data & experiments
     • Methods
     • Discussion
     • Summary
  3. Motivation & aim
     • Biomedical literature
        • 2000 papers published every day
     • Biomedical information extraction needed
     • Improve IE by negation information
        • Negative results are interesting and reported
        • “The IKK complex, but not p90 (rsk), is responsible for the in vivo phosphorylation of I-kappa-B-alpha.”
     • Resources
        • Shared tasks, data
        • Linguistic tools (syntactic parsers)
  4. Problem statement
     • Given
        • PubMed abstracts
        • Protein/gene mentions annotated
        • Molecular events annotated
     • Wanted for every event
        • Negated or not
     • Classification problem
  5. Molecular events
     • “We further show that Nmi interacts with all STATs except Stat2.” (annotated on the slide with one trigger and two participants)
     • Event
        • Trigger
        • Event type: {binding, transcription, regulation, expression}
     • Participant
        • Participation type: {theme, cause}
        • Participant type: {gene/protein, event}
  6. Molecular events – class I
     • One theme (gene/protein)
     • “The effect of this synergism was perceptible at the level of induction of the IL-2 gene.”
        • Trigger: induction
        • Type: gene expression
        • Theme: IL-2
     • Types: transcription, gene expression, phosphorylation, protein catabolism, localization
  7. Molecular events – class II
     • One or more themes (gene/protein)
     • “We further show that Nmi interacts with all STATs except Stat2.”
        • Trigger: interacts
        • Type: binding
        • Themes: Nmi, Stat2
        • Negated
     • Type: binding
  8. Molecular events – class III
     • 1 theme, 0 or 1 cause
        • may be gene/protein or other events
     • “Overexpression of full-length ALG-4 induced transcription of FasL and, consequently, apoptosis.”

     Event     Trigger            Type              Theme     Cause
     Event 1   “transcription”    Transcription     FasL
     Event 2   “Overexpression”   Gene expression   ALG-4
     Event 3   “Overexpression”   Regulation        Event 2
     Event 4   “induced”          Regulation        Event 1   Event 3

     • Types: regulation types
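The nested structure in the class III example, where a theme or cause can itself be an event, can be sketched as a small recursive data type. This is only an illustration: the `Event` class, its field names, and the `nesting_depth` helper are hypothetical and not part of the BioNLP'09 format or the authors' system.

```python
from dataclasses import dataclass
from typing import Optional, Union

# A participant is either a gene/protein name or another event.
Participant = Union[str, "Event"]


@dataclass
class Event:
    """Hypothetical in-memory form of one annotated molecular event."""
    trigger: str
    event_type: str
    theme: Participant
    cause: Optional[Participant] = None


def nesting_depth(event: Event) -> int:
    """Return 1 for an event over proteins only, more if it wraps events."""
    inner = [p for p in (event.theme, event.cause) if isinstance(p, Event)]
    return 1 + max((nesting_depth(p) for p in inner), default=0)


# The four events from the ALG-4 / FasL sentence on the slide.
e1 = Event("transcription", "Transcription", theme="FasL")
e2 = Event("Overexpression", "Gene expression", theme="ALG-4")
e3 = Event("Overexpression", "Regulation", theme=e2)
e4 = Event("induced", "Regulation", theme=e1, cause=e3)
```

Here `e4` takes `e1` as its theme and `e3` as its cause, which is what distinguishes class III events from the flat class I and class II events.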
  9. Data: BioNLP’09
     • Training: 800 abstracts
     • Test: 260 abstracts
     • Gold annotations
        • Event trigger, type, participants, negation
        • Negation cue not annotated

     Event class   Training data        Development data     Test data
                   total     negated    total     negated
     Class I       2,858     131        559       26
     Class II      887       44         249       15
     Class III     4,870     440        987       66
     Total         9,685     615        1,795     107
  10. Methodologies
      • Rule-based
         • The command relation
      • Classification
         • SVM on event representation
            • Lexical features: negation cue, POS
            • Syntactic features: command
            • Semantic features: event types
      • Baseline
         • NegEx: event triggers as “terms”
  11. Evaluation measures
      $\mathrm{Precision} = \frac{TP}{TP + FP}$
      $\mathrm{Recall} = \mathrm{Sensitivity} = \frac{TP}{TP + FN}$
      $F_1 = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$
      $\mathrm{Specificity} = \frac{TN}{TN + FP}$
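The four evaluation measures can be computed directly from the confusion counts. The sketch below is a minimal illustration; the function names and the example counts are made up, and an undefined ratio (zero denominator) is returned as `None`, which corresponds to the "-" entries in a results table.

```python
from typing import Dict, Optional


def safe_div(num: float, den: float) -> Optional[float]:
    """Return num / den, or None when the ratio is undefined."""
    return num / den if den else None


def evaluate(tp: int, fp: int, fn: int, tn: int) -> Dict[str, Optional[float]]:
    """Precision, recall (sensitivity), F1 and specificity from raw counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    if precision is None or recall is None or precision + recall == 0:
        f1 = None
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": safe_div(tn, tn + fp),
    }


# Hypothetical counts, chosen only to exercise the formulas.
scores = evaluate(tp=40, fp=10, fn=60, tn=890)
```

With negated events as the positive class, specificity measures how many non-negated events are correctly left alone, which matters here because negated events are a small minority of all events.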
  12. Baseline results

      Approach                    P      R      F1     Spec.
      No negation detection       -      0%     -      94%
      any negation cue present    20%    78%    32%    81%
      NegEx                       36%    37%    36%    93%
  13. The command relation
      • If a and b are nodes in the constituency parse tree of a sentence, then a X-commands b iff the lowest ancestor of a with label X is also an ancestor of b.
      • Ronald Langacker, On Pronominalization and the Chain of Command, in D. Reibel and S. Schane (eds.) Modern Studies in English, Prentice-Hall, Englewood Cliffs, NJ. 160-186. 1969.
  14. Example of the command relation
      • Tree (from the slide figure): [S a [S b]], i.e. a root S whose children are a and an embedded S that contains b
      • a S-commands b.
      • b does not S-command a.
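The X-command definition can be checked mechanically on a small tree. The sketch below is an illustration under stated assumptions: the `Node` class and function names are hypothetical, "ancestor" is read as proper ancestor, and the tree is the two-S example in which a sits directly under the root S and b sits under an embedded S.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)
class Node:
    """Hypothetical constituency-tree node with a parent pointer."""
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        """Attach child under this node and return the child."""
        child.parent = self
        self.children.append(child)
        return child


def ancestors(node: Node) -> Iterator[Node]:
    """Yield proper ancestors of node, from the lowest up to the root."""
    current = node.parent
    while current is not None:
        yield current
        current = current.parent


def x_commands(a: Node, b: Node, label: str = "S") -> bool:
    """True iff the lowest ancestor of a labelled `label` is an ancestor of b."""
    for anc in ancestors(a):
        if anc.label == label:
            return any(anc is other for other in ancestors(b))
    return False  # a has no ancestor with that label


# Example tree: root S -> (a, embedded S -> b)
root = Node("S")
a = root.add(Node("a"))
inner = root.add(Node("S"))
b = inner.add(Node("b"))
```

The relation is asymmetric: the lowest S above a is the root, which also dominates b, while the lowest S above b is the embedded S, which does not dominate a.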
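  15. X-command in action
      • Sentence: “We now show that a mutant motif that exchanges the terminal 3' C for a G fails to bind the p50 homodimer.”
      • Parse tree (from the slide figure): [S We now [VP show that [S a mutant motif that exchanges the terminal 3' C for a G [VP fails to bind the p50 homodimer]]]]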
  16. Rule-based method
      • An event is negated if
         • Negation cue exists;
        and
         • Negation cue S-commands any participant
         • Negation cue S-commands trigger
         • Negation cue S-commands both
         • Negation cue VP-commands both
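The rule variants can be sketched as one function with two switches: the phrase label (S or VP) and the target of the command check. Everything below is a hedged illustration rather than the authors' implementation: the parent-map tree encoding, the node names, the `TOK` and `NP` labels, and the reading of "both" as "the trigger and at least one participant" are all assumptions. The tree follows the “fails to bind the p50 homodimer” example, with token-level leaves added for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# node id -> (label, parent id); the root has parent None
Tree = Dict[str, Tuple[str, Optional[str]]]


def ancestors(tree: Tree, node: str) -> List[str]:
    """Proper ancestors of node, lowest first."""
    out = []
    parent = tree[node][1]
    while parent is not None:
        out.append(parent)
        parent = tree[parent][1]
    return out


def x_commands(tree: Tree, a: str, b: str, label: str) -> bool:
    """True iff the lowest ancestor of a labelled `label` also dominates b."""
    for anc in ancestors(tree, a):
        if tree[anc][0] == label:
            return anc in ancestors(tree, b)
    return False


def is_negated(tree: Tree, cue: Optional[str], trigger: str,
               participants: Sequence[str], label: str = "S",
               target: str = "both") -> bool:
    """Apply one rule variant; an event without a cue is never negated."""
    if cue is None:
        return False
    on_trigger = x_commands(tree, cue, trigger, label)
    on_participant = any(x_commands(tree, cue, p, label) for p in participants)
    if target == "participant":
        return on_participant
    if target == "trigger":
        return on_trigger
    if target == "both":
        return on_trigger and on_participant
    raise ValueError(f"unknown target: {target}")


# Simplified tree for "We now show that a mutant motif ... fails to bind the p50 homodimer."
TREE: Tree = {
    "S1": ("S", None),
    "VP1": ("VP", "S1"),      # show that ...
    "S2": ("S", "VP1"),
    "NP": ("NP", "S2"),       # a mutant motif that exchanges ...
    "motif": ("TOK", "NP"),
    "VP2": ("VP", "S2"),      # fails to bind the p50 homodimer
    "fails": ("TOK", "VP2"),
    "bind": ("TOK", "VP2"),
    "p50": ("TOK", "VP2"),
}
```

In this tree the cue “fails” S-commands every token of the embedded clause, including “motif”, but VP-commands only the material inside its own verb phrase, so the VP variant is the stricter test.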
  17. Results of rule-based method

      Approach                                   P      R      F1     Spec.
      negation cue S-commands any participant    23%    76%    35%    84%
      negation cue S-commands trigger            23%    68%    34%    85%
      negation cue S-commands both               23%    68%    35%    86%
      negation cue VP-commands both              42%
  18. SVM features
      • Semantic features
         • Event type
      • Lexical features
         • Sentence contains negation cue
         • Negation cue
      • Syntactic features
         • POS of neg cue
         • POS of event trigger
         • POS of the participants
         • Parse tree distance between trigger & cue
         • Type of smallest phrase containing trigger & cue
         • Cue S-commands any participant
         • Cue S-commands trigger
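Before an SVM can use these features, each event has to be turned into a numeric vector. The sketch below shows one common way to do that, one-hot encoding for categorical values and direct numeric values for booleans and counts. It is an assumption about the encoding, not a description of the authors' pipeline; the feature names, the POS tags, and the two example events are invented, and the SVM training step itself is left out.

```python
from typing import Dict, List, Union

Value = Union[str, bool, int]
Features = Dict[str, Value]


def build_vocabulary(rows: List[Features]) -> Dict[str, int]:
    """Assign a column index to every numeric feature and every name=value pair."""
    vocab: Dict[str, int] = {}
    for row in rows:
        for name, value in row.items():
            key = name if isinstance(value, (bool, int)) else f"{name}={value}"
            vocab.setdefault(key, len(vocab))
    return vocab


def vectorise(row: Features, vocab: Dict[str, int]) -> List[float]:
    """One-hot encode strings; store booleans and integers as numbers."""
    vec = [0.0] * len(vocab)
    for name, value in row.items():
        if isinstance(value, (bool, int)):
            key, weight = name, float(value)
        else:
            key, weight = f"{name}={value}", 1.0
        if key in vocab:  # unseen categories are silently dropped
            vec[vocab[key]] = weight
    return vec


# Two invented events: one with a cue ("except") and one without any cue.
rows: List[Features] = [
    {"event_type": "Binding", "has_cue": True, "cue": "except",
     "cue_pos": "IN", "trigger_pos": "VBZ", "participant_pos": "NNP",
     "phrase_type": "VP", "tree_distance": 4,
     "cue_s_commands_participant": True, "cue_s_commands_trigger": True},
    {"event_type": "Gene_expression", "has_cue": False, "cue": "NONE",
     "cue_pos": "NONE", "trigger_pos": "NN", "participant_pos": "NNP",
     "phrase_type": "NONE", "tree_distance": 0,
     "cue_s_commands_participant": False, "cue_s_commands_trigger": False},
]
vocab = build_vocabulary(rows)
vectors = [vectorise(row, vocab) for row in rows]
```

The resulting vectors, paired with the gold negated / not negated labels, are the kind of input a standard SVM implementation expects.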
  19. Results of single SVM, incremental feature sets

      Feature set      P      R      F1     Spec.
      Features 1-7     43%    8%     14%    99.2%
      Features 1-8     73%    19%    30%    99.3%
      Features 1-9     71%    38%    49%    99.2%
      Features 1-10    76%    38%    51%    99.2%
  23. Results of single SVM, incremental feature sets
      Features:
      1. Event type
      2. Sentence contains neg cue
      3. Neg cue
      4. POS of neg cue
      5. POS of event trigger
      6. POS of the participants
      7. Type of smallest phrase containing trigger & cue
      8. Cue S-commands any participant
      9. Cue S-commands trigger
      10. Parse tree distance between trigger & cue

      Feature set      P      R      F1     Spec.
      Features 1-7     43%    8%     14%    99.2%
      Features 1-8     73%    19%    30%    99.3%
      Features 1-9     71%    38%    49%    99.2%
      Features 1-10    76%    38%    51%    99.2%
  24. Results of separate SVMs for each class

      Event class                     P       R      F1     Spec.
      Class I (559 events)            94%     65%    77%    99.8%
      Class II (249 events)           100%    33%    50%    100%
      Class III (987 events)          81%     44%    57%    99.2%
      Micro-average (1,795 events)    88%     49%    63%    99.4%
      Macro-average (3 classes)       92%     47%    62%    99.7%
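The two averages in the table differ in what they weight: a macro-average is the plain mean of the per-class scores, while a micro-average pools the counts of all events before computing the score. The sketch below illustrates both. The per-class counts are hypothetical, since the slides report percentages only; the last lines take the per-class precision and recall percentages from the table and average them.

```python
from typing import Dict, Tuple

Counts = Tuple[int, int, int]  # (TP, FP, FN) for one event class


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision and recall, with 0.0 for an undefined ratio."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def macro_average(per_class: Dict[str, Counts]) -> Tuple[float, float]:
    """Mean of the per-class scores: every class counts equally."""
    scores = [precision_recall(*c) for c in per_class.values()]
    n = len(scores)
    return sum(p for p, _ in scores) / n, sum(r for _, r in scores) / n


def micro_average(per_class: Dict[str, Counts]) -> Tuple[float, float]:
    """Score of the pooled counts: every event counts equally."""
    tp = sum(c[0] for c in per_class.values())
    fp = sum(c[1] for c in per_class.values())
    fn = sum(c[2] for c in per_class.values())
    return precision_recall(tp, fp, fn)


# Hypothetical counts for two classes of different size.
counts = {"A": (8, 2, 2), "B": (1, 0, 19)}
macro_p, macro_r = macro_average(counts)
micro_p, micro_r = micro_average(counts)

# Macro-averaging the per-class percentages reported in the table.
table_macro_p = (94 + 100 + 81) / 3
table_macro_r = (65 + 33 + 44) / 3
```

Averaging the table's per-class values this way gives roughly 92% precision and 47% recall, in line with the macro-average row.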
  25. Future work
      • Use class-specific features
      • Study other variants of command
      • Combine negation detection with automatic event detection instead of using ‘gold’ events
      • Use negation detection on a larger scale dataset (MEDLINE) to find contradictions & contrasts in the biomedical literature
  26. Conclusions
      • SVM for extracting negated events
         • >99% specificity
         • 63% F-measure (micro average)
      • Different classes of events behave differently
      • To detect negated molecular events
         • Event trigger & surface distances not enough
         • Semantic & command features useful
         • Event participants as important as triggers
      • Apply on large scale data – MEDLINE
  27. Acknowledgements
      • Organisers of BioNLP’09
      • GN TEAM
      • Casey Bergman’s lab – Faculty of Life Sciences, University of Manchester
      • James Eales – University of Manchester
      • Jonathan Caruana – University College London
      • Web service soon available at http://gnode1.mib.man.ac.uk/negmole
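  28. X-command in action
      • Sentence: “We now show that a mutant motif that exchanges the terminal 3' C for a G fails to bind the p50 homodimer that is upregulated in LPS tolerant human Mono Mac 6 cells.”
      • Parse tree (from the slide figure): [S We now [VP show that [S a mutant motif that exchanges the terminal 3' C for a G [VP fails to bind the p50 homodimer that [S is upregulated in LPS tolerant human Mono Mac 6 cells]]]]]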