Event templates for Question answering


Understanding narrative text is more than simple information extraction on a sentence-by-sentence basis. To comprehend the true meaning of a narrative requires determining the connections between the sentences and the effect of one event on other events. This story understanding process can be greatly enhanced by the use of event descriptor templates that begin with the basic journalistic questions of who, what, when, where, why, and how but that go beyond these simple basics to address more complex relationships: role playing, context, impact, causality, and interests. Previously, representing story narratives as knowledge representations has required intensive manual effort on the part of trained knowledge engineers to correctly encode the contents of stories into a knowledge base (KB). For large volumes of text, this becomes impractical, limiting the usefulness of KB-based systems in question-answering. This paper describes a means of automating the narrative representation process by using event descriptor templates to elicit critical narrative information to be encoded in a knowledge based system.


Event Templates for Improved Narrative Understanding in Question Answering Systems

Maureen Caudill
SAIC Enterprise Information Systems Division
10260 Campus Point Drive, MS C-2
San Diego, CA 92121

Barbara Starr
SAIC Enterprise Information Systems Division
10260 Campus Point Drive, MS C-2
San Diego, CA 92121

ABSTRACT

Understanding narrative text is more than simple information extraction on a sentence-by-sentence basis. To comprehend the true meaning of a narrative requires determining the connections between the sentences and the effect of one event on other events. This story understanding process can be greatly enhanced by the use of event descriptor templates that begin with the basic journalistic questions of who, what, when, where, why, and how but that go beyond these simple basics to address more complex relationships: role playing, context, impact, causality, and interests. Previously, representing story narratives as knowledge representations has required intensive manual effort on the part of trained knowledge engineers to correctly encode the contents of stories into a knowledge base (KB). For large volumes of text, this becomes impractical, limiting the usefulness of KB-based systems in question-answering. This paper describes a means of automating the narrative representation process by using event descriptor templates to elicit critical narrative information to be encoded in a knowledge based system.

Keywords: Automating Knowledge Generation, Knowledge Engineering Bottleneck, Knowledge-Based System Development, Question-Answering Systems, Event extraction, Event understanding, Automatic Event Extraction.

1. BACKGROUND

A question-understanding system that can answer questions about historical or current events must have an understanding of those events to do more than respond to simple look-up questions. If the system is capable of answering complex queries that require reasoning about the situation, its comprehension of events in the real world must substantially exceed that of a simple information extraction system.

Information extraction systems typically deal with narrative texts on a sentence-by-sentence basis. That is, they rarely are capable of understanding the relationship between sentences. This limited knowledge makes them into systems that do little more than key word matches when trying to answer questions. In a query about houses, a typical information extraction system looks for the word “house” in sentences. A slightly more sophisticated version might also include synonym searches, and include searches for “apartment,” “condominium,” “condo,” “dwelling,” “abode,” etc. However, none of these techniques can identify sentences that refer to houses indirectly such as the one below:

    I lived on Main Street when I was a child, in a huge old barn. It had twelve rooms, five fireplaces, and many odd nooks and corners where a child could disappear for hours at a time.

In this example, the word “barn” likely would not be identified as a metaphor for “house.” Even if that relationship were identified, the second sentence, referring to the dwelling place only as “It,” would certainly be missed by all but a very few current systems. Thus, if the question is “How many rooms were in the house where you lived as a child?” an information extraction system could not answer it, even though the
information requested is explicitly included in the sentence. For information that is not explicit, but available only through inferencing, typical information extraction systems have no hope of generating an answer.

In knowledge-based question-answering systems, however, the details of the house could be elicited during the knowledge representation process. In previous development of knowledge-based systems, the process of producing these knowledge representations from the content of a narrative has been entirely manual. Trained knowledge engineers and/or experts in a particular domain have been required to manually write the knowledge representations that reflect the contents of documents. This is sometimes called the “knowledge engineering bottleneck.”

In the current paper, we describe a method of resolving both these problems. The use of event descriptor templates as models for representing narrative text provides both completeness of coverage and a method of automating the knowledge extraction and representation processes.

The following sections describe, first, the types of templates that we have found useful in describing narratives, and then a system design that automatically fills in these event templates from large volumes of text documents without requiring knowledge engineers or domain experts to generate the knowledge representations.

2. BASIC JOURNALISTIC RELATIONS

Much of the information needed to correctly respond to user queries is found in the contents of story narratives. These text items range from historical event summaries to news stories. Consequently, narrative text constitutes an important set of data items to be able to automatically process and convert into knowledge representations.

Story narratives can be encoded in many ways, going back to the concept of scripts, in which a standardized set of roles and activities occur in a known general environment. The classic example of scripts is eating at a restaurant. There, a person who understands the basic “restaurant script” knows how to interpret such activities as being shown to a table and handed a menu, ordering from a waiter or waitress, having the meal delivered, receiving a check, and paying for the meal. By encoding this and other stereotypical scenarios in a standard knowledge system, then comparing a textual narrative of a specific “eating at a restaurant” incident to the restaurant script, a knowledge-based system can respond to many complex, non-obvious questions about that event.

The concept of prefabricated understanding of specific types of events can, however, be extended to more general event types using an event descriptor template. The initial version of this template first seeks to answer the basic journalistic queries of who, what, when, where, how, and why.

When viewing the output from a basic information extraction system, some of these journalistic queries appear to be quite simple to determine. For example, the question of what happened is frequently identified by the verb in the sentence (or in the clause). However, a single sentence can encode more than one event, and multiple sentences can encode the same event. Thus, any automated system to perform story understanding must be able to view all sentences in the story and determine which portions of each refer to which specific events. For example, consider the sentence:

    [Manila, Philippines] About 36 US Special Forces troops started a month of anti-terrorism training with their Filipino counterparts at a northern army base Tuesday, a Philippine army officer said. (taken from an article on MSNBC.com)

The obvious verb here is “started,” as in, the troops started anti-terrorism training. However, the actual main verb of the sentence is “said.” The verb “started” is merely describing what was “said.”

Sentences such as this are very common in real-world story understanding. In such cases, both verbs identify events, one of which is an act of starting something (i.e., “starting to train”) and the other is an act of communicating (i.e., “said”). This sentence thus encodes two separate events, the starting event and the saying event. This sentence can thus begin to be encoded by generating the simplest knowledge relations:

(defobject Event-E1
  (instance-of Event-E1 saying))
(defobject Event-E2
  (instance-of Event-E2 starting))

Thus, the verb(s) defines the type of action that corresponds to that event.

The next simple journalistic question is who performed the event. Typically, this is identified as the subject of the verb for each verb in the sentence. (Care must, of course, be taken for passive voice and similar non-typical sentence constructs.) Thus, a parse of the sample sentence above can identify the agents involved by identifying the subjects of each of the two verbs. This enhances the definitions of the two events by adding more relations:
(defobject Event-E1
  (instance-of Event-E1 saying)
  (performed-by Event-E1 Philippine-army-officer))
(defobject Event-E2
  (instance-of Event-E2 starting)
  (performed-by Event-E2 us-special-forces-troops)
  (performed-by Event-E2 filippino-troops))

To get these encodings, however, additional definitions are required to identify the specific agents, since it is unlikely that any general ontology would have the concepts of Philippine-army-officer or us-special-forces-troops already present.

Thus, additional prior definitions are added to the knowledge base:

(defobject Philippine-army-officer
  (instance-of Philippine-army-officer soldier)
  (citizen-of Philippine-army-officer Philippines))

(defobject us-special-forces-troop
  (instance-of us-special-forces-troop soldier)
  (citizen-of us-special-forces-troop united-states))

(defobject us-special-forces-troops
  (group-of us-special-forces-troops us-special-forces-troop)
  (count-of us-special-forces-troops 36))

The next journalistic query is where does this event take place. This becomes somewhat more difficult to ascertain in many cases. In the case of the starting event, the location is straightforward, and can be added to the event definition in a simple manner:

(defobject Event-E2
  (instance-of Event-E2 starting)
  (performed-by Event-E2 us-special-forces-troops)
  (performed-by Event-E2 filippino-troops)
  (location-of Event-E2 northern-army-base))

Obviously, this too spawns a definition of the term northern-army-base:

(defobject northern-army-base
  (instance-of northern-army-base army-outpost)
  (location-of northern-army-base (location-function (direction-fn north) Philippines)))

The location of the saying event is far more obscure, however. In this case, with a newswire article that (presumably) has a dateline attached, the assumption would be made that the location of the saying event is the location specified in the dateline. Thus, the location for that event would be:

(defobject Event-E1
  (instance-of Event-E1 saying)
  (performed-by Event-E1 Philippine-army-officer)
  (location-of Event-E1 city-of-manila))

Should this location not be in the base ontology, it would of course also have to be defined.

The next journalistic question is when did the event take place. This is again easier to determine for the starting event than the saying event:

(defobject Event-E2
  (instance-of Event-E2 starting)
  (performed-by Event-E2 us-special-forces-troops)
  (performed-by Event-E2 filippino-troops)
  (location-of Event-E2 northern-army-base)
  (took-place-when Event-E2 Tuesday))

Presumably, a newswire article would also have a date attached to it, so “Tuesday” could be more specifically defined as the most recent Tuesday prior to the date of the article.

The question of when the saying event took place is again an inference, not something that is clearly defined; however, it can be presumed to be not later than the date of the newswire article itself:

(defobject Event-E1
  (instance-of Event-E1 saying)
  (performed-by Event-E1 Philippine-army-officer)
  (location-of Event-E1 city-of-manila)
  (took-place-when Event-E1 before(date-of-article)))

The two remaining journalistic questions, how and why, are not answered by the current sentence; they remain, for the moment, empty relations. And, in fact, the question of why an event occurs is nearly always a more complex relation to implement.

3. COMPLEX RELATIONS

A careful reading of the original sentence shows that while the descriptors developed above are accurate, they are sorely incomplete. For example, the “starting” event does not explain what exactly is being started. Furthermore, there’s no information regarding why this event is happening, what impacts it has on other events, or any of a number of other issues that a human reader can be expected to infer easily from the text. Thus, the journalistic event relations are only the starting point for a far more complex set of relationships.

In the case of the example sentence, the first issue is to identify what exactly is being started. The object of the sentence (either direct object or indirect object,
depending on the verb) frequently gives insight into what is being used to perform an action or what is being referred to in an event description. The object of the starting clause is the phrase “a month of anti-terrorism training.” Thus, there must be a way to encode the concept of what is being started.

(defobject Event-E2
  (instance-of Event-E2 starting)
  (performed-by Event-E2 us-special-forces-troops)
  (performed-by Event-E2 filippino-troops)
  (location-of Event-E2 northern-army-base)
  (took-place-when Event-E2 Tuesday)
  (action-performed-on Event-E2 anti-terrorism-training))

This in turn spawns the definition:

(defobject anti-terrorism-training
  (instance-of anti-terrorism-training training)
  (opposes anti-terrorism-training terrorist-group)
  (duration anti-terrorism-training (days 30)))

Note that the training definition includes the approximate duration of the training. Notice also that because the prefix “anti” means “against,” the relation “opposes” is used. If there were no such prefix, and the phrase was “terrorism training,” the relation “supports” would be used, on the assumption that training generally supports the activity it concerns.

The saying event also must be more carefully defined. What exactly did the Philippine army officer say? Basically, the officer described the training event. Thus, a reasonable encoding of the object of the saying event is:

(defobject Event-E1
  (instance-of Event-E1 saying)
  (performed-by Event-E1 Philippine-army-officer)
  (location-of Event-E1 city-of-manila)
  (took-place-when Event-E1 before(date-of-article))
  (action-performed-on Event-E1 Event-E2))

This single sentence does not yield other relations. But there are many other relations that should be considered before abandoning the analysis of the sentence (and the article). These relations can be grouped into categories as shown below with a few examples for each category.

Role Relations
  • Is-aggressor
  • Is-defender
  • Is-protector
  • Is-mediator
  • Is-peacekeeper
  • Is-victim
  • Is-innocent-bystander
  • Is-financial-backer
  • Is-betrayer, etc.

Causality Relations
  • Event-causes-event
  • Interest-causes-event
  • Contributing-factor, etc.

Impact Relations
  • Event-opposes-agent <a specific agent>
  • Event-supports-agent <a specific agent>
  • Event-opposes-interest <a specific interest>
  • Event-supports-interest <a specific interest>
  • Supports-interests-of <an agent and its interest>
  • Opposes-interests-of <an agent and its interest>
  • Action-performed-on <an object or an agent>
  • Action-performed-with <an object>, etc.

The relations that form the event descriptor thus provide a rich encoding of the content of a narrative text description. The benefit of this approach to knowledge representation is two-fold. First, it ensures that each event is scrutinized to derive the maximum possible information from the narrative text. Second, it allows multi-sentence descriptions. For example, suppose the next sentence in the text is:

    The training will provide experience in fighting bio-terrorist attacks in a jungle environment.

This sentence can easily be determined to refer to the same event as the “starting” event. The subject of the sentence is “the training.” Since it immediately follows a sentence that refers to starting a training event, it can be inferred that the training referred to is one and the same training. Thus, it is a simple matter to amend the earlier description of the training to assert:

(defobject anti-terrorism-training
  (instance-of anti-terrorism-training training)
  (opposes anti-terrorism-training terrorist-group)
  (duration anti-terrorism-training (days 30))
  (location-of anti-terrorism-training jungle)
  (opposes anti-terrorism-training bio-terrorist-attack))

Filling in such detailed event descriptors amounts to pre-answering a number of queries about the narrative. Any question that asks about these events, even in elliptical terms, will quickly generate the correct answer. This removes a great deal of the processing and inferencing burden from the question-answering system and enables it to provide answers accurately, quickly, and efficiently.

4. AUTOMATING THE PROCESS

The remaining question is how can constructing such event descriptors be automated. In essence, this requires
constructing a pre-answering system that infers the answer to obvious questions about a narrative and stores those answers in the knowledge base for rapid retrieval.

The approach we have taken to this is a modular one for the sake of easy construction, extension, and maintenance. In effect, each relation or logical set of relations has its own miniature knowledge-based reasoner (a knowledge base partition), which knows only how to inspect a parsed sentence and construct a specific relation from that sentence, if and only if that sentence contains the necessary information to construct the relation.

These knowledge-based reasoners thus are kept within a small scale, they can be called or not as appropriate for a given sentence, and they construct an appropriate relation (or small number of similar relations) based on the contents of that sentence. Figure 1 illustrates how the architecture operates in the case of the example sentence.

The KB Controller is the coordinator for the event descriptor construction. It does not generate event relations itself, but calls whatever other modules are appropriate in a given sentence to determine the relations. It also deals with cross-sentence descriptors by determining that the current sentence probably refers to an event that is already described; in this case it tells the various KB partitions it calls not to define a new event but rather to modify a specific existing event.

Each of the KB partitions has access to the global ontology and lexicon to identify appropriate concepts and lexical terms and to define new ones as needed. These new terms comprise a mini-ontology that is relevant primarily to the current document; however, as appropriate, new terms can be added to the global lexicon, thus enabling cross-document processing of events. Thus, a sequence of events described in multiple documents is an easy extension of this concept.

Furthermore, since each relation can be documented with the specific text that generated that relation, it is similarly easy to identify which document events and mini-ontologies correspond to a particular term. For example, if a query asks about US Special Forces Troops in the Philippines, it is very easy to determine exactly which document(s) have information about these troops, since they are mentioned in those documents that include that ontological term. A separate database (not shown in the figure) keeps track of which ontological terms are defined in which document files and mini-ontologies. Thus, if a query arrives asking about a specific concept (such as the U.S. Special Forces Troops in the Philippines), only those knowledge bases that include that concept are added to the answer reasoner’s knowledge base to serve as the basis for determining the answer. This provides a mechanism for automatic partitioning of very large scale knowledge bases and increases the efficiency of the answering system.

The modular architecture enables each KB partition to focus on only a single problem rather than trying to generate the entire event descriptor. This makes the entire system easy to construct, easy to extend, and easy to maintain.

[Figure 1. Architecture for the Event Descriptor system. A parsed sentence enters the KB Controller, which has access to the KB Ontology & Lexicon and calls KB engines as needed. The KB partitions shown are Instance-of, Performed-by, Location-of, When, Roles, Causality, etc. Example outputs: (instance-of Event-E2 starting); (performed-by Event-E2 us-special-forces-troops), including definitions for “us-special-forces-troops”; (location-of Event-E2 northern-army-base), including definition of terms; (took-place-when Event-E2 Tuesday).]
5. CONCLUSIONS

This paper has outlined a workable methodology for constructing a narrative understanding tool for text narratives that can automatically comprehend stories and answer questions about those stories. The system is under construction, and preliminary results indicate that it is flexible, powerful, and highly extensible.

This work was supported by the Advanced Research and Development Activity (ARDA) as part of its AQUAINT Program. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the U.S. Government.
