Understanding narrative text is more than simple information extraction on a sentence-by-sentence basis. To comprehend the true meaning of a narrative requires determining the connections between the sentences and the effect of one event on other events. This story understanding process can be greatly enhanced by the use of event descriptor templates that begin with the basic journalistic questions of who, what, when, where, why, and how but that go beyond these simple basics to address more complex relationships: role playing, context, impact, causality, and interests. Previously, representing story narratives as knowledge representations has required intensive manual effort on the part of trained knowledge engineers to correctly encode the contents of stories into a knowledge base (KB). For large volumes of text, this becomes impractical, limiting the usefulness of KB-based systems in question-answering. This paper describes a means of automating the narrative representation process by using event descriptor templates to elicit critical narrative information to be encoded in a knowledge based system.



Event Templates for Improved Narrative Understanding in Question Answering Systems

Maureen Caudill
SAIC Enterprise Information Systems Division
10260 Campus Point Drive, MS C-2
San Diego, CA 92121

and

Barbara Starr
SAIC Enterprise Information Systems Division
10260 Campus Point Drive, MS C-2
San Diego, CA 92121

ABSTRACT

Understanding narrative text is more than simple information extraction on a sentence-by-sentence basis. To comprehend the true meaning of a narrative requires determining the connections between the sentences and the effect of one event on other events. This story understanding process can be greatly enhanced by the use of event descriptor templates that begin with the basic journalistic questions of who, what, when, where, why, and how but that go beyond these simple basics to address more complex relationships: role playing, context, impact, causality, and interests. Previously, representing story narratives as knowledge representations has required intensive manual effort on the part of trained knowledge engineers to correctly encode the contents of stories into a knowledge base (KB). For large volumes of text, this becomes impractical, limiting the usefulness of KB-based systems in question-answering. This paper describes a means of automating the narrative representation process by using event descriptor templates to elicit critical narrative information to be encoded in a knowledge based system.

Keywords: Automating Knowledge Generation, Knowledge Engineering Bottleneck, Knowledge-Based System Development, Question-Answering Systems, Event extraction, Event understanding, Automatic Event Extraction.

1. BACKGROUND

A question-understanding system that can answer questions about historical or current events must have an understanding of those events to do more than respond to simple look-up questions. If the system is capable of answering complex queries that require reasoning about the situation, its comprehension of events in the real world must substantially exceed that of a simple information extraction system.

Information extraction systems typically deal with narrative texts on a sentence-by-sentence basis. That is, they rarely are capable of understanding the relationship between sentences. This limited knowledge makes them into systems that do little more than key word matches when trying to answer questions. In a query about houses, a typical information extraction system looks for the word “house” in sentences. A slightly more sophisticated version might also include synonym searches, and include searches for “apartment,” “condominium,” “condo,” “dwelling,” “abode,” etc. However, none of these techniques can identify sentences that refer to houses indirectly such as the one below:

    I lived on Main Street when I was a child, in a huge old barn. It had twelve rooms, five fireplaces, and many odd nooks and corners where a child could disappear for hours at a time.

In this example, the word “barn” likely would not be identified as a metaphor for “house.” Even if that relationship were identified, the second sentence, referring to the dwelling place only as “It,” would certainly be missed by all but a very few current systems. Thus, if the question is “How many rooms were in the
house where you lived as a child?” an information extraction system could not answer it, even though the information requested is explicitly included in the sentence. For information that is not explicit, but available only through inferencing, typical information extraction systems have no hope of generating an answer.

In knowledge-based question-answering systems, however, the details of the house could be elicited during the knowledge representation process. In previous development of knowledge-based systems, the process of producing these knowledge representations from the content of a narrative has been entirely manual. Trained knowledge engineers and/or experts in a particular domain have been required to manually write the knowledge representations that reflect the contents of documents. This is sometimes called the “knowledge engineering bottleneck.”

In the current paper, we describe a method of resolving both of these problems. The use of event descriptor templates as models for representing narrative text provides both completeness of coverage and a method of automating the knowledge extraction and representation processes.

The following sections describe, first, the types of templates that we have found useful in describing narratives, and then a system design that automatically fills in these event templates from large volumes of text documents without requiring knowledge engineers or domain experts to generate the knowledge representations.

2. BASIC JOURNALISTIC RELATIONS

Much of the information needed to correctly respond to user queries is found in the contents of story narratives. These text items range from historical event summaries to news stories. Consequently, narrative text constitutes an important set of data items to be able to automatically process and convert into knowledge representations.

Story narratives can be encoded in many ways, going back to the concept of scripts, in which a standardized set of roles and activities occur in a known general environment. The classic example of scripts is eating at a restaurant. There, a person who understands the basic “restaurant script” knows how to interpret such activities as being shown to a table and handed a menu, ordering from a waiter or waitress, having the meal delivered, receiving a check, and paying for the meal. By encoding this and other stereotypical scenarios in a standard knowledge system, then comparing a textual narrative of a specific “eating at a restaurant” incident to the restaurant script, a knowledge-based system can respond to many complex, non-obvious questions about that event.

The concept of prefabricated understanding of specific types of events can, however, be extended to more general event types using an event descriptor template. The initial version of this template first seeks to answer the basic journalistic queries of who, what, when, where, how, and why.

When viewing the output from a basic information extraction system, some of these journalistic queries appear to be quite simple to determine. For example, the question of what happened is frequently identified by the verb in the sentence (or in the clause). However, a single sentence can encode more than one event, and multiple sentences can encode the same event. Thus, any automated system to perform story understanding must be able to view all sentences in the story and determine which portions of each refer to which specific events.

For example, consider the sentence:

    [Manila, Philippines] About 36 US Special Forces troops started a month of anti-terrorism training with their Filipino counterparts at a northern army base Tuesday, a Philippine army officer said. (taken from an article on MSNBC.com)

The obvious verb here is “started,” as in, the troops started anti-terrorism training. However, the actual main verb of the sentence is “said.” The verb “started” is merely describing what was “said.”

Sentences such as this are very common in real-world story understanding. In such cases, both verbs identify events, one of which is an act of starting something (i.e., “starting to train”) and the other is an act of communicating (i.e., “said”). This sentence thus encodes two separate events, the starting event and the saying event, and its encoding can begin with the simplest knowledge relations:

    (defobject Event-E1
      (instance-of Event-E1 saying))
    (defobject Event-E2
      (instance-of Event-E2 starting))

Thus, each verb defines the type of action that corresponds to its event.

The next simple journalistic question is who performed the event. Typically, this is identified as the subject of the verb for each verb in the sentence. (Care must, of course, be taken for passive voice and similar non-typical sentence constructs.) Thus, a parse of the sample sentence above can identify the agents involved by identifying the subjects of each of the two verbs. This
enhances the definitions of the two events by adding more relations:

    (defobject Event-E1
      (instance-of Event-E1 saying)
      (performed-by Event-E1 Philippine-army-officer))
    (defobject Event-E2
      (instance-of Event-E2 starting)
      (performed-by Event-E2 us-special-forces-troops)
      (performed-by Event-E2 filippino-troops))

To get these encodings, however, additional definitions are required to identify the specific agents, since it is unlikely that any general ontology would have the concepts of Philippine-army-officer or us-special-forces-troops already present.

Thus, additional prior definitions are added to the knowledge base:

    (defobject Philippine-army-officer
      (instance-of Philippine-army-officer soldier)
      (citizen-of Philippine-army-officer Philippines))

    (defobject us-special-forces-troop
      (instance-of us-special-forces-troop soldier)
      (citizen-of us-special-forces-troop united-states))

    (defobject us-special-forces-troops
      (group-of us-special-forces-troops us-special-forces-troop)
      (count-of us-special-forces-troops 36))

The next journalistic query is where the event takes place. This is often somewhat more difficult to ascertain. In the case of the starting event, the location is straightforward, and can be added to the event definition in a simple manner:

    (defobject Event-E2
      (instance-of Event-E2 starting)
      (performed-by Event-E2 us-special-forces-troops)
      (performed-by Event-E2 filippino-troops)
      (location-of Event-E2 northern-army-base))

Obviously, this too spawns a definition of the term northern-army-base:

    (defobject northern-army-base
      (instance-of northern-army-base army-outpost)
      (location-of northern-army-base
        (location-function (direction-fn north) Philippines)))

The location of the saying event is far more obscure, however. In this case, with a newswire article that (presumably) has a dateline attached, the assumption would be made that the location of the saying event is the location specified in the dateline. Thus, the location for that event would be:

    (defobject Event-E1
      (instance-of Event-E1 saying)
      (performed-by Event-E1 Philippine-army-officer)
      (location-of Event-E1 city-of-manila))

Should this location not be in the base ontology, it would of course also have to be defined.

The next journalistic question is when the event took place. This is again easier to determine for the starting event than for the saying event:

    (defobject Event-E2
      (instance-of Event-E2 starting)
      (performed-by Event-E2 us-special-forces-troops)
      (performed-by Event-E2 filippino-troops)
      (location-of Event-E2 northern-army-base)
      (took-place-when Event-E2 Tuesday))

Presumably, a newswire article would also have a date attached to it, so “Tuesday” could be more specifically defined as the most recent Tuesday prior to the date of the article.

The question of when the saying event took place is again an inference, not something that is clearly defined; however, it can be presumed to be not later than the date of the newswire article itself:

    (defobject Event-E1
      (instance-of Event-E1 saying)
      (performed-by Event-E1 Philippine-army-officer)
      (location-of Event-E1 city-of-manila)
      (took-place-when Event-E1 before(date-of-article)))

The two remaining journalistic questions, how and why, are not answered by the current sentence; they remain, for the moment, empty relations. In fact, the question of why an event occurs is nearly always a more complex relation to implement.

3. COMPLEX RELATIONS

A careful reading of the original sentence shows that while the descriptors developed above are accurate, they are sorely incomplete. For example, the “starting” event does not explain what exactly is being started. Furthermore, there is no information regarding why this event is happening, what impacts it has on other events, or any of a number of other issues that a human reader can be expected to infer easily from the text. Thus, the journalistic event relations are only the starting point for a far more complex set of relationships.
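As a compact recap of the journalistic relations built up for the starting event in Section 2, the following Python sketch stores them as relation/value pairs, reports which of the six questions are still unanswered, and resolves “Tuesday” against an article date. It is only an illustration under stated assumptions: the class and function names are hypothetical, the article date is invented for the example (the source does not give one), “prior to” is read as strictly before, and the relation names for “how” and “why” are placeholders because the text leaves them unnamed.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Dict, List

# Question -> relation name. The first four relation names come from the text;
# "how" and "why" are placeholders for relations the text leaves unnamed.
JOURNALISTIC = {
    "what": "instance-of",
    "who": "performed-by",
    "where": "location-of",
    "when": "took-place-when",
    "how": "how",
    "why": "why",
}
TUESDAY = 1  # datetime.date.weekday() counts Monday as 0


@dataclass
class EventDescriptor:
    """Relations recorded for one event, keyed by relation name."""
    name: str
    relations: Dict[str, List[str]] = field(default_factory=dict)

    def add(self, relation: str, value: str) -> None:
        self.relations.setdefault(relation, []).append(value)

    def unanswered(self) -> List[str]:
        """Journalistic questions that have no relation recorded yet."""
        return [q for q, rel in JOURNALISTIC.items() if rel not in self.relations]

    def to_defobject(self) -> str:
        """Render the descriptor in the defobject notation used in the text."""
        lines = [f"(defobject {self.name}"]
        for relation, values in self.relations.items():
            lines += [f"  ({relation} {self.name} {v})" for v in values]
        return "\n".join(lines) + ")"


def most_recent_weekday(article_date: date, weekday: int) -> date:
    """Latest date strictly before article_date that falls on weekday."""
    days_back = (article_date.weekday() - weekday - 1) % 7 + 1
    return article_date - timedelta(days=days_back)


ARTICLE_DATE = date(2002, 1, 31)  # hypothetical article date, for illustration only

e2 = EventDescriptor("Event-E2")
e2.add("instance-of", "starting")
e2.add("performed-by", "us-special-forces-troops")
e2.add("performed-by", "filippino-troops")
e2.add("location-of", "northern-army-base")
e2.add("took-place-when", most_recent_weekday(ARTICLE_DATE, TUESDAY).isoformat())

print(e2.to_defobject())
print("still unanswered:", e2.unanswered())
```

Keeping the relations in a plain mapping means a new relation type can be recorded without changing the descriptor class, which is in the spirit of the open-ended relation set discussed here.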
In the case of the example sentence, the first issue is to identify what exactly is being started. The object of the sentence (either direct object or indirect object, depending on the verb) frequently gives insight into what is being used to perform an action or what is being referred to in an event description. The object of the starting clause is the phrase “a month of anti-terrorism training.” Thus, there must be a way to encode the concept of what is being started.

    (defobject Event-E2
      (instance-of Event-E2 starting)
      (performed-by Event-E2 us-special-forces-troops)
      (performed-by Event-E2 filippino-troops)
      (location-of Event-E2 northern-army-base)
      (took-place-when Event-E2 Tuesday)
      (action-performed-on Event-E2 anti-terrorism-training))

This in turn spawns the definition:

    (defobject anti-terrorism-training
      (instance-of anti-terrorism-training training)
      (opposes anti-terrorism-training terrorist-group)
      (duration anti-terrorism-training (days 30)))

Note that the training definition includes the approximate duration of the training. Notice also that because the prefix “anti” means “against,” the relation “opposes” is used. If there were no such prefix, and the phrase was “terrorism training,” the relation “supports” would be used, on the assumption that training generally supports the activity it concerns.

The saying event also must be more carefully defined. What exactly did the Philippine army officer say? Basically, the officer described the training event. Thus, a reasonable encoding of the object of the saying event is:

    (defobject Event-E1
      (instance-of Event-E1 saying)
      (performed-by Event-E1 Philippine-army-officer)
      (location-of Event-E1 city-of-manila)
      (took-place-when Event-E1 before(date-of-article))
      (action-performed-on Event-E1 Event-E2))

This single sentence does not yield other relations. But there are many other relations that should be considered before abandoning the analysis of the sentence (and the article). These relations can be grouped into categories as shown below with a few examples for each category.

Role Relations
  • Is-aggressor
  • Is-defender
  • Is-protector
  • Is-mediator
  • Is-peacekeeper
  • Is-victim
  • Is-innocent-bystander
  • Is-financial-backer
  • Is-betrayer, etc.

Causality Relations
  • Event-causes-event
  • Interest-causes-event
  • Contributing-factor, etc.

Impact Relations
  • Event-opposes-agent <a specific agent>
  • Event-supports-agent <a specific agent>
  • Event-opposes-interest <a specific interest>
  • Event-supports-interest <a specific interest>
  • Supports-interests-of <an agent and its interest>
  • Opposes-interests-of <an agent and its interest>
  • Action-performed-on <an object or an agent>
  • Action-performed-with <an object>, etc.

The relations that form the event descriptor thus provide a rich encoding of the content of a narrative text description. The benefit of this approach to knowledge representation is two-fold. First, it ensures that each event is scrutinized to derive the maximum possible information from the narrative text. Second, it allows multi-sentence descriptions. For example, suppose the next sentence in the text is:

    The training will provide experience in fighting bio-terrorist attacks in a jungle environment.

This sentence can easily be determined to refer to the same event as the “starting” event. The subject of the sentence is “the training.” Since it immediately follows a sentence that refers to starting a training event, it can be inferred that the training referred to is one and the same training. Thus, it is a simple matter to amend the earlier description of the training to assert:

    (defobject anti-terrorism-training
      (instance-of anti-terrorism-training training)
      (opposes anti-terrorism-training terrorist-group)
      (duration anti-terrorism-training (days 30))
      (location-of anti-terrorism-training jungle)
      (opposes anti-terrorism-training bio-terrorist-attack))

Filling in such detailed event descriptors amounts to pre-answering a number of queries about the narrative. Any question that asks about these events, even in elliptical terms, will quickly generate the correct answer. This removes a great deal of the processing and inferencing burden from the question-answering system and enables it to provide answers accurately, quickly, and efficiently.

4. AUTOMATING THE PROCESS
The remaining question is how the construction of such event descriptors can be automated. In essence, this requires constructing a pre-answering system that infers the answers to obvious questions about a narrative and stores those answers in the knowledge base for rapid retrieval.

The approach we have taken is a modular one, for the sake of easy construction, extension, and maintenance. In effect, each relation or logical set of relations has its own miniature knowledge-based reasoner (a knowledge base partition), which knows only how to inspect a parsed sentence and construct a specific relation from that sentence, if and only if that sentence contains the necessary information to construct the relation.

These knowledge-based reasoners are thus kept small in scale, they can be called or not as appropriate for a given sentence, and each constructs an appropriate relation (or a small number of similar relations) based on the contents of that sentence. Figure 1 illustrates how the architecture operates in the case of the example sentence.

[Figure 1. Architecture for the Event Descriptor system. A parsed sentence enters the KB Controller, which has access to the KB Ontology & Lexicon and calls KB engines as needed. The engines are separate KB partitions for Instance-of, Performed-by, Location-of, When, Roles, Causality, etc. Example outputs for Event-E2: (instance-of Event-E2 starting); (performed-by Event-E2 us-special-forces-troops), including definitions for “us-special-forces-troops”; (location-of Event-E2 northern-army-base), including definition of terms; (took-place-when Event-E2 Tuesday).]

The KB Controller is the coordinator for the event descriptor construction. It does not generate event relations itself, but calls whatever other modules are appropriate for a given sentence to determine the relations. It also deals with cross-sentence descriptors by determining that the current sentence probably refers to an event that is already described; in this case it tells the various KB partitions it calls not to define a new event but rather to modify a specific existing event.

Each of the KB partitions has access to the global ontology and lexicon to identify appropriate concepts and lexical terms and to define new ones as needed. These new terms comprise a mini-ontology that is relevant primarily to the current document; however, as appropriate, new terms can be added to the global lexicon, thus enabling cross-document processing of events. Thus, a sequence of events described in multiple documents is an easy extension of this concept.

Furthermore, since each relation can be documented with the specific text that generated that relation, it is similarly easy to identify which document events and mini-ontologies correspond to a particular term. For example, if a query asks about US Special Forces Troops in the Philippines, it is very easy to determine exactly which document(s) have information about these troops, since they are mentioned in those documents that include that ontological term. A separate database (not shown in the figure) keeps track of which ontological terms are defined in which document files and mini-ontologies. Thus, if a query arrives asking about a specific concept (such as the U.S. Special Forces Troops in the Philippines), only those knowledge bases that include that concept are added to the answer reasoner’s knowledge base to serve as the basis for determining the answer. This provides a mechanism for automatic partitioning of very large scale knowledge bases and also increases the efficiency of the answering system.

The modular architecture enables each KB partition to focus on only a single problem rather than trying to
generate the entire event descriptor. This makes the entire system easy to construct, easy to extend, and easy to maintain.

5. CONCLUSIONS

This paper has outlined a workable methodology for constructing a narrative understanding tool for text narratives that can automatically comprehend stories and answer questions about those stories. The system is under construction and preliminary results indicate that it is flexible, powerful, and highly extensible.

This work was supported by the Advanced Research and Development Activity (ARDA) as part of its AQUAINT Program. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the U.S. Government.
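The modular design of Section 4, a controller that builds no relations itself and calls small single-purpose partitions, can be pictured with a short Python sketch. This is a minimal illustration under stated assumptions and not the system described in the paper: the `Clause` structure stands in for the output of a parser the paper does not specify, the partition functions and `KBController` class are hypothetical names, the decision that a clause refers to an existing event is left to the caller, and the provenance labels are placeholders.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Relation = Tuple[str, str, str]  # (relation name, event or object id, value)


@dataclass
class Clause:
    """Hypothetical parser output for one verb clause of a sentence."""
    verb_class: Optional[str] = None
    subjects: List[str] = field(default_factory=list)
    location: Optional[str] = None
    time: Optional[str] = None
    source_text: str = ""


# Each partition builds one kind of relation, and only when the clause
# actually carries the information that relation needs.
def instance_of(event: str, c: Clause) -> List[Relation]:
    return [("instance-of", event, c.verb_class)] if c.verb_class else []


def performed_by(event: str, c: Clause) -> List[Relation]:
    return [("performed-by", event, s) for s in c.subjects]


def location_of(event: str, c: Clause) -> List[Relation]:
    return [("location-of", event, c.location)] if c.location else []


def took_place_when(event: str, c: Clause) -> List[Relation]:
    return [("took-place-when", event, c.time)] if c.time else []


Partition = Callable[[str, Clause], List[Relation]]


class KBController:
    """Coordinates the partitions; it builds no relations itself."""

    def __init__(self, partitions: List[Partition]) -> None:
        self.partitions = partitions
        self.events: Dict[str, List[Relation]] = {}
        self.provenance: Dict[Relation, str] = {}  # relation -> generating text
        self._count = 0

    def process(self, clause: Clause, existing_event: Optional[str] = None) -> str:
        # Mint a new event id unless the caller has decided that the clause
        # refers to something that is already described.
        if existing_event is None:
            self._count += 1
            event = f"Event-E{self._count}"
        else:
            event = existing_event
        relations = self.events.setdefault(event, [])
        for partition in self.partitions:
            for relation in partition(event, clause):
                if relation not in relations:
                    relations.append(relation)
                    self.provenance[relation] = clause.source_text
        return event


controller = KBController([instance_of, performed_by, location_of, took_place_when])

e1 = controller.process(Clause(verb_class="saying",
                               subjects=["Philippine-army-officer"],
                               source_text="sentence 1"))
e2 = controller.process(Clause(verb_class="starting",
                               subjects=["us-special-forces-troops", "filippino-troops"],
                               location="northern-army-base",
                               time="Tuesday",
                               source_text="sentence 1"))
# The follow-up sentence about the jungle environment amends the training
# description instead of defining a new event.
controller.process(Clause(location="jungle", source_text="sentence 2"),
                   existing_event="anti-terrorism-training")

for name, relations in controller.events.items():
    print(name, relations)
```

Because every partition shares one signature, supporting a further relation type (a role or causality relation, for instance) only means appending one more function to the list passed to the controller.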