Extending the Espresso Method for Greater Recall


An in-depth experimental evaluation of the extended Espresso system, compared against prior work (Chu, Ganapathi).

Published in: Technology, Education

  1. Relationship Extraction from Text: Extending the Espresso Method for Greater Recall. Derek Springer, UCLA Computer Science Department. November 19, 2009.
  2. Related Works
     • Ganapathi, Swathi. "Relationship Extraction from Text: Comparison and Experimental Evaluation of the State-of-the-Art." UCLA comp exam. March 2009.
     • Chu, A., Sakurai, S., Cárdenas, A. F. "Automatic Detection of Treatment Relationships in Patent Retrieval." 2008 CIKM Patent Information Retrieval Workshop. October 2008.
  3. Related Works, cont'd
     • Girju, R. "Automatic Detection of Causal Relations for Question Answering." In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL 2003), Workshop on "Multilingual Summarization and Question Answering - Machine Learning and Beyond". 2003.
     • Pantel, Patrick and Pennacchiotti, Marco. "Espresso: Leveraging Generic Patterns for Automatically Harvesting Semantic Relations." In Proceedings of Conference on Computational Linguistics / Association for Computational Linguistics (COLING/ACL-06). pp. 113-120. Sydney, Australia. 2006.
  4. Relationship Extraction
     • The task of recognizing the assertion of a particular relationship between two or more entities in text.
     • Can aid in the development of standalone, intelligent, automated and adaptable user-specific content retrieval systems.
     • We focus on extracting treatment relationships → A (subject) used to treat B (object).
  5. Goals and Contributions
     • Extended the state-of-the-art Espresso relationship extraction system originally implemented by Ganapathi.
     • Conducted an in-depth experimental evaluation of the developed system, comparing it to prior work (Chu, Ganapathi).
     • Future goal: use the system developed here as a plug-in relationship feature extractor for iScore.
  6. Integration Into iScore
     • iScore presents additional articles based on an aggregate score of “interestingness.”
     • We believe filtering articles based on relationships can improve the results of iScore.
     • We hypothesize that extending the Espresso system implemented by Swathi Ganapathi will improve the ability of a system such as iScore to utilize relationship extraction as a feature.
  7. Comparison Criteria
     • Performance: the system should have high precision and recall.
     • Minimal Supervision: the system should require little to no human supervision.
     • Breadth: the system should extract relations from corpora of varying sizes, domains and formats.
     • Generality: the system should extract a wide variety of relation types without losing its edge in any of the above criteria.
  8. The Espresso Algorithm
     • General-purpose algorithm that can be used to extract a wide variety of binary relations.
     • Requires minimal supervision: the only input is a small seed set of known relations.
     • Detects relationships by looking at individual sentences, so it works well on all kinds of corpora.
     • In tests conducted by the creators of the algorithm, Espresso generated balanced precision and recall.
  9. The Espresso Method
  10. Extending Espresso
      • Ganapathi's implementation: 37.8%
      • Extension: 91.2%
  11. Ganapathi's Implementation
      • Ganapathi's approach uses lexico-syntactic patterns of the form NP1 VP NP2 (Verb category in Table 1).
      • The VP contains the treatment verb or pattern, and the two NPs contain the subject and object.
      • This structure is very common, accounting for 37.8% of all relationships.
  12. Extension
      • A large number of relationships remain outside the NP1 VP NP2 form and may provide fruitful results.
      • The implementation is expanded to include these relationship types:
        - Noun+Prep, e.g. "X settlement with Y"
        - Verb+Prep, e.g. "X moved to Y"
        - Infinitive, e.g. "X plans to acquire Y"
        - Modifier, e.g. "X is Y winner"
      • Retrieves 91.2% of common relationships.
  13. Test Corpora
      • Patent Corpus: developed by Shige
        o 50,000 drug patent documents from 2008, drawn from Classes 424 and 514 of the U.S. Patent Classification (“drug, bio-affecting and body treating compositions”) and their subclasses.
        o Patents were pre-filtered to only those containing the keywords “diabetes”, “metastatic”, “cancer”, “tuberculosis”, “lung”, “bronchitis”, “coronary artery”.
        o All sentences from each document were added to a sentence table in the schema.
      • PubMed Corpus: developed by Gustavo
        o Comprised of medical abstracts from PubMed.
        o Each abstract was parsed, and all of its sentences were stored as individual tuples in the sentence table.
  14. Performance Measures
  15. Seed Treatment Relationships
      • (Xanax, Anxiety)
      • (Ambien, Insomnia)
      • (Effexor, Depression)
      • (Paxil, Depression)
      • (Lexapro, Depression)
      • (Caffeine, Depression)
      • (Zoloft, Depression)
      • (Imipramine, Depression)
      • (Glycoside, Depression)
      • (Ibuprofen, Arthritis)
      • (Ibuprofen, Headache)
      • (Tylenol, Fever)
      • (Tylenol, Headache)
      • (Antibody, Inflammation)
      • (Ibuprofen, Inflammation)
      • (Surgery, Glaucoma)
  16. Procedure
      1. Re-tag the original data set to incorporate the extended relationship types.
      2. Re-run Ganapathi's baseline Espresso implementation to compare against the updated data set.
      3. Run the extended Espresso implementation to compare against the updated data set.
  17. Experiment #1: Extraction on Drug Patent Corpus
      • Drug Patent corpus used.
      • The algorithm was run with the seed relations, and 12 verbs were extracted as relevant (verbs with rπ greater than 0.2).
      • These treatment verbs were used to create a test set of 120 sentences, i.e. 10 sentences for each relevant treatment verb.
      • 358 candidate relations were extracted, and an ri score was calculated for each.
      • 208 relations had an ri score greater than the threshold, of which 126 were actually correct (determined through manual tagging).
      • Manual tagging determined that 213 of the original 358 relations were correct treatment relations.
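The counts reported for Experiment #1 are enough to work out precision, recall and F-score by hand. The following is a minimal sketch, assuming the standard definitions of these measures (the content of the Performance Measures slide is not available in this transcript); the function name `precision_recall_f1` is made up for illustration.

```python
def precision_recall_f1(extracted: int, correct_extracted: int, total_correct: int):
    """Standard precision, recall and balanced F-score from raw counts."""
    # Precision: share of the relations kept by the system that are correct.
    precision = correct_extracted / extracted
    # Recall: share of all correct relations that the system kept.
    recall = correct_extracted / total_correct
    # F-score: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts from the Experiment #1 slide: 208 relations above the threshold,
# 126 of them correct, 213 correct relations among all 358 candidates.
p, r, f = precision_recall_f1(extracted=208, correct_extracted=126, total_correct=213)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

With these counts the sketch gives a precision of about 0.606, a recall of about 0.592 and an F-score of about 0.599.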
  18. Experiment #1 Results
  19. Experiment #2: Number of Relationships and Performance
      • Drug Patent corpus used.
      • Tests the performance of the system under smaller and larger data loads.
      • Started with the initial set of 120 sentences obtained from the Drug Patent corpus (10 sentences for each of the 12 verbs, as in Experiment #1).
      • Increased the number of sentences per verb by 10 each time, giving further sentence sets of 240 and 360 sentences.
  20. Experiment #2 Results
  21. Experiment #2 Analysis
      • Performance of the system and the number of relationships are inversely related.
      • ri scores are inversely affected by the max pmi across all relationship instances, so having more relationship instances in a set may lower the ri of all those relationships.
      • More relationships => chance of a greater max pmi => lowered ri for all relationship instances.
      • Not a concern → articles likely won't have 200 relations of the same type.
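The dependence of ri on the max pmi can be illustrated with a small numeric sketch. It assumes the instance-reliability form described in the cited Espresso paper (Pantel and Pennacchiotti, 2006), in which each pattern's pmi with the instance is divided by the maximum pmi and weighted by that pattern's reliability rπ, then averaged over the patterns. The function name, the pattern names and all pmi and rπ values below are hypothetical and are not taken from the experiments.

```python
def instance_reliability(pmi_by_pattern, pattern_reliability, max_pmi):
    """Average of (pmi / max_pmi) * r_pi over all patterns (assumed form)."""
    total = sum(
        (pmi_by_pattern.get(pattern, 0.0) / max_pmi) * r_pi
        for pattern, r_pi in pattern_reliability.items()
    )
    return total / len(pattern_reliability)


# Hypothetical pattern reliabilities and pmi values for one relation instance.
r_pi = {"treats": 0.8, "cures": 0.5}
pmi = {"treats": 1.5, "cures": 1.0}

# Same instance, scored against a small set and against a larger set whose
# maximum pmi happens to be twice as high.
ri_small_set = instance_reliability(pmi, r_pi, max_pmi=2.0)
ri_large_set = instance_reliability(pmi, r_pi, max_pmi=4.0)
print(ri_small_set, ri_large_set)
```

Under this assumed form, doubling the max pmi halves the ri of an otherwise unchanged instance, which matches the slide's argument that a larger set can push more instances below a fixed threshold.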
  22. Experiment #3: Extraction on PubMed Corpus
      • PubMed corpus used.
      • Tests the performance of the system on a corpus of a different type and size.
      • The algorithm was run with the input seed relations on this corpus, and the 10 verbs with the highest rπ values were extracted.
      • We constructed a test set of 80 sentences (8 sentences for every relevant verb).
      • We then extracted a total of 162 relations from this test set and calculated their ri scores.
      • The average ri score was used as the threshold value.
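Using the average ri score as the threshold can be sketched in a few lines. This is only an illustration: the function name `filter_by_mean_threshold` and the scores are hypothetical, and keeping relations strictly above the mean is an assumption modeled on the "greater than the threshold" wording of Experiment #1.

```python
from statistics import mean


def filter_by_mean_threshold(scored_relations):
    """Keep relations whose ri score is strictly above the mean score."""
    threshold = mean(scored_relations.values())
    kept = {
        relation: score
        for relation, score in scored_relations.items()
        if score > threshold
    }
    return threshold, kept


# Hypothetical (subject, object) candidates with made-up ri scores.
candidates = {
    ("Ibuprofen", "Headache"): 0.6,
    ("Surgery", "Glaucoma"): 0.2,
    ("Caffeine", "Insomnia"): 0.1,
}
threshold, kept = filter_by_mean_threshold(candidates)
print(threshold, sorted(kept))
```

A mean-based threshold adapts to the score range of each test set, so no fixed cutoff has to be chosen per corpus.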
  23. Experiment #3 Results
  24. Comparison Over Both Corpora
  25. Experiment #3 Analysis
      • Performance is worse on the PubMed corpus.
      • The Patent corpus dealt with drugs and cures for diseases, so it contained an abundance of treatment-type relations.
      • PubMed had more general medical data and only contained abstracts => less info.
      • As a result, there were fewer treatment relations in PubMed, which affected performance.
  26. Comparison with Previous Work (* signifies our contribution)
  27. Analysis
      • F-score of Ganapathi's version of Espresso fell nearly 10% on the re-tagged data → due to lower recall, as predicted.
      • Results of the extension over the re-tagged data are on par with Ganapathi's original results.
      • Given that Ganapathi's system dropped nearly 10%, this seems to indicate that the extension is more general-purpose than the original version.
  28. Success
      • Recall of the system is more important than precision, especially when it comes to using relationships as a feature in iScore.
      • The method is almost completely automated.
      • Easily expanded to extract other relationship types by changing the input seed relations.
      • Initial results seem insignificant, but analysis indicates that the extended system has the potential to be a general-purpose relationship extraction feature.
  29. Future Work
      • Development of a relationship feature extractor for iScore.
      • Relations will have to be syntactically and semantically compared with relations present in other articles, and the best article matches will be returned as “interesting” choices for a user.
      • Optimizations: algorithm design improvements, database connection optimizations and parallelization.