-
Be the first to like this
Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our User Agreement and Privacy Policy.
Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our Privacy Policy and User Agreement for details.
Information extraction (IE) systems serve as the front end and core stage in different natural language programming tasks. As IE has proved its efficiency in domain-specific tasks, this project focused on one domain: disease outbreak reports. Several reports from the World Health Organization were carefully examined to formulate the extraction tasks: named-entities, such as disease name, date and location; the location of the reporting authority; and the outbreak incident. Extraction rules were then designed, based on a study of the textual expressions and elements found in the text that appeared before and after the target text.
The experiment resulted in very high performance scores for all the tasks in general. The training corpora and the testing corpora were tested separately. The system performed with higher accuracy with entities and events extraction than with relationship extraction.
It can be concluded that the rule-based approach has been proven capable of delivering reliable IE, with extremely high accuracy and coverage results. However, this approach requires an extensive, time-consuming, manual study of word classes and phrases.
Be the first to like this
Login to see the comments