Building Conversational Agents In Oz


Published in: Technology

  1. 1. The CURRENT Platform: Building Conversational Agents in Oz. Torbjörn Lager, Fredrik Kronlid, Department of Linguistics, University of Göteborg
  2. 2. Conversational Software Agents <ul><li>Software we can ‘talk’ with, in natural language </li></ul><ul><li>Examples </li></ul><ul><ul><li>Eliza (Weizenbaum 1966) </li></ul></ul><ul><ul><li>Question-answering systems (e.g. on the web) </li></ul></ul><ul><ul><li>Spoken dialog systems (for reserving train tickets, etc.) </li></ul></ul><ul><ul><li>Intelligent homes / cars / appliances </li></ul></ul><ul><ul><li>HAL (in the movie 2001) </li></ul></ul>
  3. 3. We want to build agents that are… <ul><li>able to ‘perceive’, ‘think’ and ‘act’ at the same time </li></ul><ul><li>capable of incremental processing of natural language </li></ul><ul><li>sensitive to the presence of other conversational software agents </li></ul><ul><li>making use of sophisticated, state-of-the-art methods for NLP </li></ul><ul><li>Question: Can Oz be of any help here? </li></ul>
  4. 4. The CURRENT Platform <ul><li>An abstract characterization of what it means to be a conversational agent. </li></ul><ul><li>A ‘visual’ IDE supporting the building of conversational agents by constructing a graphical representation of a network of NLP components on a canvas. </li></ul><ul><li>A number of ready-made NLP components such as lexica, taggers and parsers for a variety of languages, and dialogue managers for a variety of tasks. </li></ul><ul><li>A number of libraries supporting the implementation of other such NLP components. </li></ul>
  5. 5. An Abstract Characterization of Conversational Agents <ul><li>Simplifying considerably, a conversational agent is an interactive and incremental transducer of an input stream of words into an output stream of words, accompanied by an evolving internal state. </li></ul><ul><li>Clarifications </li></ul><ul><ul><li>Interactive means that the transducer is able both to accept external input and to emit (intermediate) responses during the course of a transduction </li></ul></ul><ul><ul><li>One agent – one transduction </li></ul></ul><ul><li>Inspiration </li></ul><ul><ul><li>Peter Wegner’s work on Turing machines vs. Interaction machines </li></ul></ul>
  6. 6. Very abstract indeed… <ul><li>… but can be directly implemented in Oz, since </li></ul><ul><ul><li>Streams are provided as basic building blocks, </li></ul></ul><ul><ul><li>transducers working on streams are no harder to implement than transducers working on lists or strings, </li></ul></ul><ul><ul><li>and if transducers are run in their own thread(s), incrementality, and thus interactivity, comes for free! </li></ul></ul><ul><li>Declarative concurrency – “… gives the same results as a sequential program but can give them incrementally… ” (van Roy and Haridi, Chapter 4) </li></ul>
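The point about threads and incrementality can be illustrated with a minimal sketch. It is written in Python rather than Oz, so a `queue.Queue` stands in for an Oz stream, and the `END` marker, `transducer` and `run_demo` names are assumptions made for this illustration, not part of the CURRENT platform. The transducer runs in its own thread and emits a result for each word as soon as that word arrives, before the rest of the input exists.

```python
import queue
import threading

END = object()  # hypothetical end-of-stream marker (an Oz stream would end in nil)


def transducer(fn, inbox, outbox):
    """Apply fn to each item as soon as it arrives and forward the result."""
    while True:
        item = inbox.get()          # blocks until the next item is available
        if item is END:
            outbox.put(END)
            return
        outbox.put(fn(item))


def run_demo(words):
    """Feed words one at a time, reading each output before sending the next word."""
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=transducer, args=(str.upper, inbox, outbox))
    worker.start()
    results = []
    for word in words:
        inbox.put(word)
        # The output for this word is available although the input is unfinished.
        results.append(outbox.get())
    inbox.put(END)
    worker.join()
    return results


if __name__ == "__main__":
    print(run_demo(["hello", "oz"]))
```

Because the caller reads each result before supplying the next word, the loop would block forever if the transducer needed the whole input first; that it terminates is what makes the processing incremental.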
  7. 7. CURRENT Agent = Network of Components <ul><li>An agent is a network of components – sources, sinks, transducers, splitters and mergers – connected by streams. </li></ul><ul><li>Each component runs in its own thread(s) </li></ul><ul><li>Since each component is incremental, so is the whole network. </li></ul>
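A network of the five component kinds named above might look roughly like the following sketch. It is again Python with queues in place of Oz streams; every function name, the `END` marker and the wiring in `run_network` are illustrative assumptions. Each component runs in its own thread, and the merger interleaves its inputs in arrival order, which is the nondeterministic step that Oz would express with a port.

```python
import queue
import threading

END = object()  # hypothetical end-of-stream marker


def source(items, out):
    for item in items:
        out.put(item)
    out.put(END)


def transducer(fn, inp, out):
    while (item := inp.get()) is not END:
        out.put(fn(item))
    out.put(END)


def splitter(inp, outs):
    # Copy every incoming item to all outgoing streams.
    while (item := inp.get()) is not END:
        for out in outs:
            out.put(item)
    for out in outs:
        out.put(END)


def merger(inps, out):
    # One forwarding thread per input; items are interleaved in arrival order.
    def forward(inp):
        while (item := inp.get()) is not END:
            out.put(item)

    threads = [threading.Thread(target=forward, args=(inp,)) for inp in inps]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    out.put(END)


def sink(inp, collected):
    while (item := inp.get()) is not END:
        collected.append(item)


def run_network(words):
    """source -> splitter -> (upper-caser, length counter) -> merger -> sink"""
    a, b1, b2, c1, c2, d = (queue.Queue() for _ in range(6))
    collected = []
    parts = [
        (source, (words, a)),
        (splitter, (a, [b1, b2])),
        (transducer, (str.upper, b1, c1)),
        (transducer, (len, b2, c2)),
        (merger, ([c1, c2], d)),
        (sink, (d, collected)),
    ]
    threads = [threading.Thread(target=f, args=args) for f, args in parts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return collected


if __name__ == "__main__":
    print(run_network(["from", "oslo"]))
```

The order within each branch is preserved, but how the two branches interleave at the merger can differ between runs.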
  8. 8. Component Technologies <ul><li>Form-based dialogue management </li></ul><ul><li>Transformation-based tagging </li></ul><ul><li>Pattern-matching over streams of records </li></ul><ul><li>Deep parsing + compositional logical semantics </li></ul><ul><li>The information-state approach to dialogue management (Larsson, 2002) </li></ul><ul><li>… </li></ul>
  9. 9. Form-Based Dialogue Management <ul><li>A script specifies forms consisting of form items such as input fields and blocks of executable content </li></ul><ul><li>Input fields may be associated with grammars that specify what the user can say </li></ul><ul><li>Prompts specify the system’s response. Prompts may be tapered. </li></ul><ul><li>When a script is interpreted (by the Form Interpretation Algorithm, FIA) the user is prompted for values of fields in order. But mixed initiative dialogue, as well as task switching, is also supported. </li></ul><ul><li>The FIA raises events, e.g. in case of no input, or input that does not match the grammar associated with the field. </li></ul><ul><li>Events may be handled, e.g. by jumping to another form that initiates (say) a clarification dialogue. (‘Jumping’ in the ‘state machine’ sense.) </li></ul><ul><li>The assignment of values to fields may trigger actions, e.g. a database lookup. </li></ul>
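The interpretation loop described above can be sketched in a few lines. This is a heavily simplified illustration in Python, not the VoiceXML FIA and not the CURRENT implementation: grammars are replaced by regular expressions with named groups, events are only recorded in a transcript rather than dispatched to handlers, and the names `interpret_form`, `FORM` and `FIELDS` are assumptions. It shows fields being prompted in order, tapered prompts after a `nomatch`, and mixed initiative (one utterance filling several fields).

```python
import re


def interpret_form(fields, form_pattern, user_inputs):
    """Prompt for each unfilled field in order; return (values, transcript)."""
    values, transcript = {}, []
    inputs = iter(user_inputs)
    for field in fields:
        attempts = 0
        while field["name"] not in values:
            # Tapered prompts: later attempts use later prompts in the list.
            prompts = field["prompts"]
            transcript.append(prompts[min(attempts, len(prompts) - 1)])
            utterance = next(inputs, None)
            if utterance is None:
                transcript.append("event: noinput")
                return values, transcript
            # Mixed initiative: the form-level pattern may fill several fields.
            match = (form_pattern.fullmatch(utterance)
                     or field["pattern"].fullmatch(utterance))
            if match:
                values.update({k: v for k, v in match.groupdict().items() if v})
            else:
                transcript.append("event: nomatch")
                attempts += 1
    return values, transcript


# Hypothetical form, loosely modelled on the driving-directions example.
FORM = re.compile(r"(?:from )?(?P<from_city>\w+) to (?P<to_city>\w+)")
FIELDS = [
    {"name": "from_city",
     "prompts": ["From which city are you leaving?",
                 "Please name the departure city."],
     "pattern": re.compile(r"(?P<from_city>\w+)")},
    {"name": "to_city",
     "prompts": ["Which city are you going to?"],
     "pattern": re.compile(r"(?P<to_city>\w+)")},
]

if __name__ == "__main__":
    print(interpret_form(FIELDS, FORM, ["um, what?", "Oslo", "Kiel"]))
```

When the user answers the first prompt with "from Oslo to Kiel", both fields are filled at once and the second field is never prompted for.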
  10. 10. Example VoiceXML Script <ul><li><form id=&quot;get_from_and_to_cities&quot;> </li></ul><ul><li><grammar src=&quot;; </li></ul><ul><li>type=&quot;application/srgs+xml&quot;/> </li></ul><ul><li><block> </li></ul><ul><li>Welcome to the Driving Directions By Phone. </li></ul><ul><li></block> </li></ul><ul><li><initial name=&quot;bypass_init&quot;> </li></ul><ul><li><prompt> </li></ul><ul><li>Where do you want to drive from and to? </li></ul><ul><li></prompt> </li></ul><ul><li><nomatch count=&quot;1&quot;> </li></ul><ul><li>Please say something like &quot;from Atlanta Georgia to Toledo Ohio&quot;. </li></ul><ul><li></nomatch> </li></ul><ul><li><nomatch count=&quot;2&quot;> </li></ul><ul><li>I'm sorry, I still don't understand. </li></ul><ul><li>I'll ask you for information one piece at a time. </li></ul><ul><li><assign name=&quot;bypass_init&quot; expr=&quot;true&quot;/> </li></ul><ul><li><reprompt/> </li></ul><ul><li></nomatch> </li></ul><ul><li></initial> </li></ul><ul><li><field name=&quot;from_city&quot;> </li></ul><ul><li><grammar src=&quot;; </li></ul><ul><li>type=&quot;application/srgs+xml&quot;/> </li></ul><ul><li><prompt>From which city are you leaving?</prompt> </li></ul><ul><li></field> </li></ul><ul><li><field name=&quot;to_city&quot;> </li></ul><ul><li><grammar src=&quot;; </li></ul><ul><li>type=&quot;application/srgs+xml&quot;/> </li></ul><ul><li><prompt>Which city are you going to?</prompt> </li></ul><ul><li></field> </li></ul><ul><li></form> </li></ul>
  11. 11. Corresponding CURRENT FIA Script <ul><li>class $ from DM.form </li></ul><ul><li>feat </li></ul><ul><li>id: getFromAndToCities </li></ul><ul><li>test: match(con([opt(sym('from')) sym(nil label:fromCity) sym(to) sym(nil label:toCity)])) </li></ul><ul><li>items: [class $ from DM.block </li></ul><ul><li>feat </li></ul><ul><li>todo: prompt('Welcome to the Driving Distance Service.') </li></ul><ul><li>end </li></ul><ul><li>class $ from DM.initial </li></ul><ul><li>feat </li></ul><ul><li>name: bypassInit </li></ul><ul><li>todo: prompt('Where do you want to drive from and to?') </li></ul><ul><li>meth nomatch() </li></ul><ul><li>if {self nomatchCount($)}==1 then </li></ul><ul><li>{self prompt('Please say something like &quot;from Oslo to Kiel&quot;.')} </li></ul><ul><li>else </li></ul><ul><li>{self prompt('I\'m sorry, I still don\'t understand. '# </li></ul><ul><li>'I\'ll ask you for info one piece at a time.')} </li></ul><ul><li>{self assign(name:bypassInit expr:true)} </li></ul><ul><li>{self reprompt()} </li></ul><ul><li>end </li></ul><ul><li>end </li></ul><ul><li>end </li></ul><ul><li>class $ from DM.field </li></ul><ul><li>feat </li></ul><ul><li>name: fromCity </li></ul><ul><li>todo: prompt('From which city are you leaving? ') </li></ul><ul><li>test: match(sym(nil label:fromCity)) </li></ul><ul><li>end </li></ul><ul><li>class $ from DM.field </li></ul><ul><li>feat </li></ul><ul><li>name: toCity </li></ul><ul><li>todo: prompt('Which city are you going to? ') </li></ul><ul><li>test: match(sym(nil label:toCity)) </li></ul><ul><li>end </li></ul><ul><li>] </li></ul><ul><li>end </li></ul>
  12. 12. Demo
  13. 13. Transformation-Based Tagging <ul><li>Rule-based disambiguation method invented by Eric Brill (1995) </li></ul><ul><li>Rules may be machine learned </li></ul><ul><li>For part-of-speech tagging, unknown-word guessing, phrase chunking, word sense disambiguation, dialogue act recognition, etc. </li></ul><ul><li>Useful also in a dialogue system setting? Yes, but only if it can be made to process input in an incremental fashion </li></ul>
  14. 14. Incremental Part-of-Speech Tagging <ul><li>The part-of-speech tagging problem </li></ul><ul><ul><li>The light went off, so I will light a candle. Pass me the light bag, please. No, the light blue one. </li></ul></ul><ul><li>Incremental part-of-speech tagging </li></ul><ul><ul><li>At t1: The/DET … </li></ul></ul><ul><ul><li>At t2: The/DET light/? … </li></ul></ul><ul><ul><li>At t3: The/DET light/ADJ bag/NOUN … </li></ul></ul><ul><li>Word-by-word incremental POS tagging is not possible, but nearly word-by-word incremental POS tagging is, and that’s sufficient. </li></ul>
  15. 15. Implementation <ul><li>Each rule is compiled into a stream transducer, running in a separate thread </li></ul><ul><li>Incremental but with accuracy as if run in batch </li></ul><ul><li>Several hundred rules → several hundred streams and several hundred threads... Performance still OK. </li></ul>
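To make the rule-as-stream-transducer idea concrete, here is a minimal Python sketch under several assumptions: lazy generators stand in for Oz threads and streams, only one rule template is shown ("change tag A to B if the next tag is C"), and the lexicon, default tag and function names are invented for the illustration. Each rule holds back a single token until its right-hand context is known, which is why tagging is nearly, but not exactly, word-by-word.

```python
def initial_tags(words, lexicon, default="NOUN"):
    """Assign each word its lexicon tag (hypothetical lexicon), lazily."""
    for word in words:
        yield (word, lexicon.get(word.lower(), default))


def next_tag_rule(old, new, next_tag):
    """Build a transducer for: change `old` to `new` if the following tag is `next_tag`."""
    def transduce(stream):
        held = None                      # the one token whose right context is unknown
        for token in stream:
            if held is not None:
                word, tag = held
                if tag == old and token[1] == next_tag:
                    tag = new
                yield (word, tag)        # emitted as soon as its context has arrived
            held = token
        if held is not None:
            yield held                   # end of input: flush the last token unchanged
    return transduce


def tagger(words, lexicon, rules):
    """Chain the rule transducers; each one consumes the previous one's output."""
    stream = initial_tags(words, lexicon)
    for rule in rules:
        stream = rule(stream)
    return stream


LEXICON = {"the": "DET", "light": "NOUN", "bag": "NOUN"}   # illustrative only
RULES = [next_tag_rule("NOUN", "ADJ", "NOUN")]

if __name__ == "__main__":
    print(list(tagger(["The", "light", "bag"], LEXICON, RULES)))
```

In this sketch the first tagged word comes out after only two words have been read, and the final result equals what the same rule would produce over the complete sentence.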
  16. 16. Pattern Matching over Streams of Records <ul><li>Condition–action rules </li></ul><ul><ul><li>Conditions: Labelled regular expression pattern matching over the input stream of records </li></ul></ul><ul><ul><li>Actions: Update out-stream with records </li></ul></ul><ul><li>Method inspired by Appelt’s CPSL and by JAPE (part of the GATE NLP platform - Cunningham et al.). </li></ul><ul><ul><li>Rules are interpreted rather than compiled into FSMs </li></ul></ul><ul><ul><li>Works on streams and in an incremental fashion </li></ul></ul><ul><li>For keyword spotting, phrase chunking, named entity recognition, ‘semantic parsing’ </li></ul>
  17. 17. Implementation <ul><li>Regular expression matcher implemented LP-style, using the choice construct in combination with a search engine </li></ul><ul><li>Roughly 200 lines of code </li></ul><ul><li>Slow, but sufficient for dialogue processing </li></ul>
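The logic-programming style of the matcher can be approximated in Python, where generators that yield alternative results play the role of Oz's choice construct and the consuming loop plays the role of the search engine. The sketch below is an assumption-laden illustration: the combinator names `sym`, `opt` and `con` are borrowed from the FIA script shown earlier, a missing value is assumed to match any word, and matching runs over a finished list of records rather than a live stream.

```python
def sym(value=None, label=None):
    """Match one record; value=None is assumed to match any word."""
    def match(records, i, bindings):
        if i < len(records) and (value is None or records[i]["word"] == value):
            new = dict(bindings)
            if label:
                new[label] = records[i]["word"]   # labelled sub-match
            yield i + 1, new
    return match


def opt(pattern):
    """Optional sub-pattern: two alternatives, explored in order."""
    def match(records, i, bindings):
        yield from pattern(records, i, bindings)  # alternative 1: consume it
        yield i, bindings                         # alternative 2: skip it
    return match


def con(patterns):
    """Concatenation: match the patterns one after another, backtracking on failure."""
    def match(records, i, bindings):
        if not patterns:
            yield i, bindings
            return
        first, rest = patterns[0], con(patterns[1:])
        for j, b in first(records, i, bindings):
            yield from rest(records, j, b)
    return match


def first_match(pattern, records):
    """Return the label bindings of the first match covering all records, else None."""
    for end, bindings in pattern(records, 0, {}):
        if end == len(records):
            return bindings
    return None


PATTERN = con([opt(sym("from")), sym(label="from_city"),
               sym("to"), sym(label="to_city")])

if __name__ == "__main__":
    records = [{"word": w} for w in "from Oslo to Kiel".split()]
    print(first_match(PATTERN, records))
```

A condition-action rule would pair such a pattern with an action that writes new records, built from the bindings, to the output stream.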
  18. 18. Demo
  19. 19. Summary: Important Oz Features <ul><li>Concurrency, for building agents that are able to ‘perceive’, ‘think’ and ‘act’ at the same time </li></ul><ul><li>Concurrency + streams, for building agents that are capable of incremental processing of natural language </li></ul><ul><li>Concurrency + streams + ports, for allowing us to specify the ‘top-level’ transducer as a network of components, e.g. for building multi-modal conversational agents (modality fusion/fission) </li></ul><ul><li>All kinds of Oz features, for building agents using sophisticated, state-of-the-art methods for NLP </li></ul><ul><li>Network-transparent distribution, for building multi-party dialogue systems </li></ul>