
Rule-based Information Extraction is Dead! Long Live Rule-based Information Extraction Systems!


Poster for our ACL 2013 short paper "Rule-based Information Extraction is Dead! Long Live Rule-based Information Extraction Systems!"



  1. Rule-based Information Extraction is Dead! Long Live Rule-based Information Extraction Systems!
     Laura Chiticariu, Yunyao Li, Frederick Reiss
     IBM Research - Almaden

     THE DISCONNECT: ACADEMIC vs. INDUSTRY

     [Figure: "Implementations of Entity Extraction". Bars for NLP Papers (2003-2012) and Commercial Products (2013; All Vendors and Large Vendors), each split into Rule-Based, Hybrid, and Machine Learning Based. Percentage labels as extracted, in order: 3.5%, 21%, 100%, 45%, 50%, 22%, 75%, 17%, 17%, 33%, 0%, 67%.]

     [Figure: "Entity Extraction Papers by Year". Axes: Year of Publication vs. Fraction of NLP Papers.]

     THE EXPLANATIONS

     Rule-based IE
       PROs
         • Declarative
         • Easy to comprehend
         • Easy to maintain
         • Easy to incorporate domain knowledge
         • Easy to debug
       CONs
         • Heuristic
         • Requires tedious manual labor

     ML-based IE
       PROs
         • Trainable
         • Adaptable
         • Reduces manual effort
       CONs
         • Requires labeled data
         • Requires retraining for domain adaptation
         • Requires ML expertise to use or maintain
         • Opaque

     Evaluating Benefits of IE
       • Academia: evaluating IE on its own (Precision and Recall)
       • Industry: evaluating IE as part of a larger process, using ill-defined metrics that are subject to change

     Evaluating Costs of IE
       • Academia: labor cost of writing rules
       • Industry: labor cost, hardware cost, business risk, others

     What's the research in Rule-based IE?

     BRIDGING THE GAP

     Where is the research in rule-based IE? Making it more principled, effective, and efficient.

     Define standard IE rule language and data model.
       • What is the right data model to capture text, annotations over text, and their properties?
       • Can we establish a standard declarative extensible rule language to solve most IE tasks encountered so far?

     Systems research based on standard IE rule language.
       • Data representation
       • Automatic performance optimization
       • Exploring modern hardware
       • …

     ML research based on standard IE rule language.
       • How to learn basic primitives such as regular expressions and dictionaries?
       • How to automatically generate rules that are understandable and maintainable?
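To make the poster's vocabulary concrete, here is a minimal Python sketch of the two basic rule primitives it names (regular expressions and dictionaries) producing annotations over text, followed by the precision and recall measures used when IE is evaluated on its own. The `Span` type, the phone pattern, the organization dictionary, and the function names are all hypothetical and chosen only for illustration; they are not taken from the paper or from any IBM system.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """An annotation over text: character offsets plus a label (hypothetical data model)."""
    begin: int
    end: int
    label: str


# Illustrative primitives, assumed for this sketch only.
PHONE_RE = re.compile(r"\b\d{3}-\d{4}\b")   # regular-expression primitive
ORG_DICT = {"IBM Research", "ACL"}          # dictionary primitive


def extract(text):
    """Apply the regex rule and the dictionary rule, returning a set of spans."""
    spans = set()
    for m in PHONE_RE.finditer(text):
        spans.add(Span(m.start(), m.end(), "Phone"))
    for entry in ORG_DICT:
        # Dictionary lookup as exact substring matching.
        for m in re.finditer(re.escape(entry), text):
            spans.add(Span(m.start(), m.end(), "Org"))
    return spans


def precision_recall(predicted, gold):
    """Exact-match precision and recall over sets of spans."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    text = "Call IBM Research at 555-1234."
    for span in sorted(extract(text), key=lambda s: s.begin):
        print(span.label, repr(text[span.begin:span.end]))
```

Each rule here is a single readable line, which is the "easy to comprehend" and "easy to debug" property the poster attributes to rule-based IE. Costs such as hardware or business risk, which the poster lists on the industry side, are outside what a sketch like this can measure.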
