Agent Based Multimedia Interaction

Agent Based Multimedia Interaction: Presentation Transcript

  • IUI99, International Conference on Intelligent User Interfaces, Los Angeles, January 6th, 1999. Agent-based Multimedia Interaction for Virtual Web Pages. Wolfgang Wahlster, German Research Center for Artificial Intelligence, DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbruecken, Germany. phone: (+49 681) 302-5252/4162, fax: (+49 681) 302-5341
  • Outline
      • What are Virtual Web Pages?
      • What are Agent-Based User Interfaces?
      • Using Life-like Characters for the Personalization of User Interfaces
      • Plan-based Media Transformation and Coordination
      • The Combination of Retrieved and Generated Media Objects for the Generation of Virtual Web Pages
      • Commercial Applications in Advanced WebCommerce
      • SmartKom: a Transportable and Transmutable Multimodal Interface
      • Our Research Agenda for Agent-based User Interfaces
      • Conclusion
  • Three Generations of Web Sites
      • First Generation: Static Web Sites (fossils cast in HTML)
      • Second Generation: Interactive Web Sites (JavaScripts and Applets); Dynamic Web Sites (Database Access and Template-based Generation)
      • Third Generation: Virtual Web Sites (Netbots, Information Extraction, Presentation Planners); Adaptive Web Sites (User Modeling, Machine Learning, Online Layout)
  • What is a Virtual Web Page? A Virtual Web Page
      • is generated on the fly as a combination of various media objects from multiple web sites or as a transformation of a real web page.
      • looks like a real web page, but is not persistently stored.
      • integrates generated and retrieved material in a coordinated way.
      • can be tailored to a particular user profile and adapted to a particular interaction context.
      • has an underlying representation of the presentation context so that an Interface Agent can comment on, point to and explain its components.
    Compare: Virtual Memory, Virtual Relation, Virtual Reality...
  • Virtual Webpage Retrieved from 5 Different Servers
  • Virtual Webpage Augmented by Persona
  • What are Agent-based User Interfaces? INTERFACE AGENTS are
      • PROACTIVE: anticipate the user's needs; adopt the user's goals; provide unsolicited comments
      • ACTIVE: appear as life-like characters; plan interactive behavior autonomously; can initiate interaction
      • REACTIVE: respond immediately to interruptions; criticism and clarification questions; direct manipulation
  • Intelligent Web Services. Consumer buys: Information, Goods, Services. Provider sells: Information, Goods, Services. Web Sites. Knowledge about: Usage Patterns, User Models, Consumer Profiles. Netbot: Intelligent Parallel Retrieval; Information Extraction and Summarization; Personalized Presentation; Matchmaking; Teleshopping Assistance; Telemarketing Assistance; Translation Services; Data Mining Services
  • Netbots as Personal Assistants for WWW Users. Netbot := Intelligent Agent that uses Internet tools on a person’s behalf. Diagram labels: Netbots; Indices, Directories, Search Engines; WWW; Mass Services; Personal Assistants; e.g. Ahoy!, Jango, AiA. Traveller’s Netbot: tries to achieve the traveller’s goals (finding and executing plans)
      • checks availability
      • finds best price
      • uses personal preferences (e.g. frequent flyer programme, seating preferences)
      • lets the traveller know when seats become available (active help)
  • A Netbot for Portrait Photos: The Personal Picture Finder. Portrait Photo Netbot (Personal Picture Finder): Parallel Meta-Search of Webpages for <Name>; Parallel Search in Picture Archives; Home Pages; Extraction of Images; Filtering of Logos, Graphics, ... WWW; Knowledge Sources. Applications: Journalism, Contact Preparation, Tracing Criminals
  • The Personal Picture Finder
  • Netbots versus Push Technologies
      • Push Technologies (Provider sends Information to Customer 1, Customer 2, ..., Customer N): + no effort for customer, - minimal adaptation
      • Interactive Pull (Customer sends a Query to Provider 1, 2, 3 in turn and receives Information): + good adaptation, - major effort for customer
      • Netbots with Parallel Pull (Customer sends a Query to the Netbot, which queries Provider 1, Provider 2, ..., Provider N and returns Information): + good adaptation, + minimal effort for customer
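
The "Netbots with Parallel Pull" pattern can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the AiA or PAN implementation: the provider functions, the offer fields (`provider`, `price`) and the name `parallel_pull` are hypothetical, and real providers would be web sites queried over the network rather than in-memory functions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional

Offer = Dict[str, object]
Provider = Callable[[str], List[Offer]]


def parallel_pull(query: str, providers: List[Provider],
                  max_price: Optional[float] = None) -> List[Offer]:
    """Send one query to all providers at once and merge the answers."""
    if not providers:
        return []
    with ThreadPoolExecutor(max_workers=len(providers)) as pool:
        # Each provider is queried in its own thread (the "parallel pull").
        result_lists = list(pool.map(lambda provider: provider(query), providers))
    offers = [offer for results in result_lists for offer in results]
    if max_price is not None:
        # Adaptation to the customer: drop offers above the price limit.
        offers = [offer for offer in offers if offer["price"] <= max_price]
    # Cheapest offer first, so the customer has minimal effort.
    return sorted(offers, key=lambda offer: offer["price"])


# Hypothetical stand-ins for Provider 1 .. Provider N.
def provider_1(query: str) -> List[Offer]:
    return [{"provider": "Provider 1", "item": query, "price": 420.0}]


def provider_2(query: str) -> List[Offer]:
    return [{"provider": "Provider 2", "item": query, "price": 390.0}]


def provider_3(query: str) -> List[Offer]:
    return []  # this provider has nothing to offer


if __name__ == "__main__":
    for offer in parallel_pull("flight to Los Angeles",
                               [provider_1, provider_2, provider_3]):
        print(offer["provider"], offer["price"])
```

The customer issues a single query and gets one merged, ranked list back, which is the "good adaptation, minimal effort" combination the slide attributes to netbots.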
  • Enhancement of User Interfaces through Personalization: The system is able to flexibly tailor presentations to the individual user and the current situation. An animated character serves as “Alter Ego” of the presentation system. Personalized Presenters at DFKI
  • Personalized Package Presentation by an Animated Agent
      • Personalization = adaptation of system behavior according to a user model
      • Personalization = the “agent” appears as an animated character which presents, explains and comments on an offer, and to which the user can talk.
  • Video Character Presents the Interior of a Boeing 757
  • Reactive Behavior of the Persona Agent
  • PPP’s Persona Server implements a generic Presentation Agent that can be easily adapted to various applications. Persona Server: Behaviors (Presentation Gestures, Reactive Behaviors, Idle-time actions, Navigation actions); Auditory Characteristics (Sound effects, auditory icons; Voice: male, female); Visual Appearances (Hand-drawn Cartoon Bitmaps, Generated Bitmaps from 3D-Models, Video Bitmaps)
  • The frames of the visual appearance of Persona can be cartoon-like images or video frames which show real persons. More than 200 cartoon frames were drawn by a professional artist. A real person was filmed with a video camera and the pictures were digitized with a frame grabber.
  • The Persona Editor
  • Context-Sensitive Decomposition of Persona Actions
      • High-Level Persona Actions: take-position (t1 t2), point-to (t3 t4)
      • Context-Sensitive Expansion (including Navigation Actions): move-to (t1 t2), r-stick-pointing (t3 t4)
      • Decomposition into Uninterruptable Basic Postures: r-turn (t1 t21), r-step (t21 t22), f-turn (t22 t2), r-hand-lift (t3 t31), r-stick-expose (t31 t4)
      • Bitmaps: ...
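
A minimal Python sketch of the two-stage decomposition on this slide. The action and posture names are taken from the slide; everything else is an assumption made for illustration: the `persona_at_target` flag as the "context", the function names, and the equal split of each time interval among the basic postures (the slide only shows intermediate time points such as t21 and t22, not how they are chosen).

```python
from typing import Dict, List, Tuple

TimedAction = Tuple[str, float, float]  # (name, start, end)

# Posture sequences copied from the slide's example.
BASIC_POSTURES: Dict[str, List[str]] = {
    "move-to": ["r-turn", "r-step", "f-turn"],
    "r-stick-pointing": ["r-hand-lift", "r-stick-expose"],
}


def expand(action: str, start: float, end: float,
           persona_at_target: bool) -> List[TimedAction]:
    """Context-sensitive expansion of a high-level persona action."""
    if action == "take-position":
        # Assumption: navigation is only needed if the persona is elsewhere.
        return [] if persona_at_target else [("move-to", start, end)]
    if action == "point-to":
        return [("r-stick-pointing", start, end)]
    raise ValueError(f"unknown high-level action: {action}")


def decompose(action: str, start: float, end: float) -> List[TimedAction]:
    """Split an expanded action into basic postures over equal sub-intervals."""
    postures = BASIC_POSTURES[action]
    step = (end - start) / len(postures)
    # Interval boundaries; the last one is pinned to `end` to avoid drift.
    bounds = [start + i * step for i in range(len(postures))] + [end]
    return [(name, bounds[i], bounds[i + 1]) for i, name in enumerate(postures)]


def plan(actions: List[TimedAction], persona_at_target: bool = False) -> List[TimedAction]:
    """Expand and decompose a list of high-level actions."""
    result: List[TimedAction] = []
    for name, start, end in actions:
        for expanded, s, e in expand(name, start, end, persona_at_target):
            result.extend(decompose(expanded, s, e))
    return result


if __name__ == "__main__":
    for posture in plan([("take-position", 0.0, 3.0), ("point-to", 3.0, 5.0)]):
        print(posture)
```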
  • PPP System Architecture Multimedia Generation Text Generator Graphics Generator Gesture Generator Animation Generator Presentation Planner (PREPLAN) KR & Reasoning (RAT) Production Acts Generated Material Presentation Acts Signals + Events Multimedia Server Layout Manager Persona Server Music Generator
  • Task of the Presentation Planner: Plan multimedia material as well as presentation acts and their temporal coordination. Presentation Acts: Persona Acts, Display Acts. Example: "This is the transformer"
  • Persona explains a modem
  • Extensions of the Representation Formalism: Distinction between production and presentation acts (i.e. Persona or display acts). Explicit representation of qualitative and quantitative constraints. Qualitative constraints: Create-Graphics meets S-Show, ... Metric constraints: 1 <= Duration S-Wait <= 1, ... Example plan (Production Acts and Presentation Acts): Introduce; Create-Graphics; S-Show; S-Wait; S-Position; Elaborate-Parts; S-Create-Window; S-Depict; Label; Label; S-Point; S-Speak; S-Speak; S-Point
  • Presentation Strategies in PPP contain qualitative and metric constraints (cf. [André/Rist 97]):
      (define-plan-operator
        :HEADER (A0 (INTRODUCE P A ?object ?window))
        :INFERIORS ((A1 (CREATE-GRAPHICS P A ?window ?object))
                    (A2 (S-SHOW P A ?window ?object))
                    (A3 (S-POSITION P A ?window))
                    (A4 (S-WAIT P A))
                    (A5 (ELABORATE-PARTS P A ?object ?window)))
        :QUALITATIVE ((A1 (m) A2) (A3 (s) A2) (A3 (m) A5) (A5 (m) A4) (A4 (f) A2))
        :METRIC ((10 <= DUR A2) (2 <= DUR A4 <= 2))
        :START A1
        :FINISH A2)
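
To make the constraints in this plan operator concrete, the Python sketch below checks a candidate schedule against them. It assumes that `m`, `s` and `f` stand for the interval relations meets, starts and finishes (the previous slide confirms "meets"; the other two readings are an assumption), and the example schedule and the function name `violations` are made up for illustration. It is not the PPP planner, which derives schedules rather than merely checking them.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def meets(a: Interval, b: Interval) -> bool:
    return a[1] == b[0]


def starts(a: Interval, b: Interval) -> bool:
    return a[0] == b[0] and a[1] < b[1]


def finishes(a: Interval, b: Interval) -> bool:
    return a[1] == b[1] and a[0] > b[0]


RELATIONS = {"m": meets, "s": starts, "f": finishes}

# Constraints transcribed from the INTRODUCE plan operator.
QUALITATIVE = [("A1", "m", "A2"), ("A3", "s", "A2"), ("A3", "m", "A5"),
               ("A5", "m", "A4"), ("A4", "f", "A2")]
# (act, minimum duration, maximum duration); None means unbounded.
METRIC: List[Tuple[str, Optional[float], Optional[float]]] = [
    ("A2", 10, None), ("A4", 2, 2)]


def violations(schedule: Dict[str, Interval]) -> List[str]:
    """Return the constraints that the given schedule does not satisfy."""
    problems = []
    for left, rel, right in QUALITATIVE:
        if not RELATIONS[rel](schedule[left], schedule[right]):
            problems.append(f"{left} ({rel}) {right}")
    for act, lo, hi in METRIC:
        duration = schedule[act][1] - schedule[act][0]
        if (lo is not None and duration < lo) or (hi is not None and duration > hi):
            problems.append(f"DUR {act}")
    return problems


# Hypothetical schedule: create graphics, show the window for 12 s,
# position the persona, elaborate the parts, then wait 2 s.
SCHEDULE: Dict[str, Interval] = {
    "A1": (0, 3), "A2": (3, 15), "A3": (3, 5), "A5": (5, 13), "A4": (13, 15)}

if __name__ == "__main__":
    print(violations(SCHEDULE))
```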
  • PPP first builds up a preliminary schedule at design time. (Figure: preliminary schedule)
  • The preliminary schedule is continuously updated at presentation time. (Figure: updated schedule)
  • Temporal Reasoning in the Presentation Planner for Dynamic Multimedia Coordination: Temporal Consistency Checker and Propagator; Plan Scheduler; Plan Nodes with Links to Local Temporal Constraint Networks; Plan Operators with Metric and Qualitative Temporal Constraints
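
One standard way to build a consistency checker and propagator for metric temporal constraints is a simple temporal network tightened with the Floyd-Warshall all-pairs shortest path algorithm. The Python sketch below uses that technique purely as an illustration; the slides do not say which algorithm PPP uses, and the function name `propagate` and the three time points in the example are assumptions.

```python
import math
from typing import List, Optional, Tuple

# (i, j, lo, hi) means: lo <= t_j - t_i <= hi
Constraint = Tuple[int, int, float, float]


def propagate(num_points: int,
              constraints: List[Constraint]) -> Optional[List[List[float]]]:
    """Tighten all bounds; return None if the constraints are inconsistent.

    In the returned matrix, d[i][j] is the tightest upper bound on t_j - t_i.
    """
    d = [[0 if i == j else math.inf for j in range(num_points)]
         for i in range(num_points)]
    for i, j, lo, hi in constraints:
        d[i][j] = min(d[i][j], hi)    # t_j - t_i <= hi
        d[j][i] = min(d[j][i], -lo)   # t_i - t_j <= -lo
    # Floyd-Warshall: propagate bounds through every intermediate point k.
    for k in range(num_points):
        for i in range(num_points):
            for j in range(num_points):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    # A negative cycle means no schedule can satisfy all constraints.
    if any(d[i][i] < 0 for i in range(num_points)):
        return None
    return d


if __name__ == "__main__":
    # Time points: 0 = start of S-Show, 1 = start of S-Wait, 2 = common end.
    network = [
        (0, 2, 10, math.inf),  # S-Show lasts at least 10 time units
        (1, 2, 2, 2),          # S-Wait lasts exactly 2 time units
    ]
    bounds = propagate(3, network)
    # Derived bound: S-Wait cannot start earlier than 8 units after S-Show.
    print("earliest S-Wait start after S-Show start:", -bounds[1][0])
```

Re-running the propagation whenever an actual start or end time becomes known is one simple way to keep a preliminary schedule up to date at presentation time.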
  • Persona Presents an Automatically Designed Business Chart
  • The Combination of Retrieved and Generated Media Objects for Virtual Webpages: Multi-Domain Problem Specs; NETBOT; Distributed Information, Multiple Data Sources. Retrieved Results: Information Structures (Relations, Lists; KR Terms) and Media Objects (Texts, Sounds, Videos; Pictures, Maps, Animations)
  • The Combination of Retrieved and Generated Media Objects for Virtual Webpages. Retrieved Results: Information Structures (Relations, Lists; KR Terms) and Media Objects (Texts, Sounds, Videos; Pictures, Maps, Animations). Select & Design: Select Canned Media Objects, Design New Media Objects: Graphics, Animation; Text, Speech, Mimic; Icons, Clip Art; Frames, Sounds. Reuse & Transform: Coordinate Media Objects (Temporal Synchronization; Spatial Layout), Transform Media Objects (Clip, Convert, Abstract; Zoom, Pan, Transition Effects)
  • Virtual Webpage with Animation Effects Based on a Single GIF Image
  • Transition Effects in a Series of Retrieved Pictures
  • Persona as a Personal Travel Consultant
  • The Generation of Virtual Webpages with PAN and AiA Netbot PAN Trip Data Pictures and Graphics Pieces of Text Coordinates for Pointing Gestures Input for Speech Synthesis Icons for Hyperlinks Hotel Agent Map Agent Address Weather Agent Train & Flight Scheduling Agent Major Event Agent Virtual Web Presentation Constraint- based Online Layout Presentation Planner Persona Server Components of virtual Webpages AiA
  • Persona as a Personal Travel Consultant
  • Dynamic Node Expansion for the Conditional Generation of Virtual Web Pages S-Include-Photo S-Include-Text S-Include-Link Introduce S-Speak Illustrate Design-Intro-Page Emphasize Location Elaborate S-Speak S-Point S-Include-Map Label Location Link Selected Default Time Over/ Up This hotel has a nice swimming pool. Your hotel is located here.
  • Use of a Life-like Character for Electronic Commerce
  • Use of a Life-like Character for Electronic Commerce
  • Use of a Life-like Character for Electronic Commerce
  • Jennifer James as a Virtual Sales Agent © extempo systems inc.
  • Sending Interface Agents to Clients: Plug-Ins or Applets?
      • Plug-Ins: Add features (character players) to browser; Download triggered by user; Requires disk space on client; Unrestricted access to client; Less appropriate for WebCommerce, Guides; Agents integrated in 3D environments; Appropriate for Entertainment. Examples: Extempo's Jennifer James (Hayes-Roth et al. 98); PFMagic's virtual petz
      • Applets: Java animation code sent over the net; Automatic loading; Requires no disk space on client; Restricted access to client; Appropriate for WebCommerce, Guides; Agents integrated in 2D environments; Less appropriate for Entertainment. Examples: DFKI's Web Persona (Müller et al. 98); ISI's Adele (Johnson et al. 98)
      • New in AiA/PAN: Balanced combination of Applets and Servlets. Efficient distribution of client-side Java and server-side Java for driving the Interface Agent
  • Alternative Business Models for Shopbots: Shopbots pass Queries and Transactions to Internet Shop 1 ... Internet Shop n.
      • 1. Banner fee: Provider pays fee for banner advertisement
      • 2. Usage fee: Provider pays usage fee
      • 3. Transaction fee (credit card model): Provider pays fee after successful commercial transaction
  • Intelligent Interface Technology is a Prerequisite for Advanced WebCommerce Advanced WebCommerce Virtual Web Pages One-to-One Marketing Intuitive, Multilingual Access Dialogue with Virtual Sales Agents Shopbots for Automated Comparison Shopping Text Analysis and Generation User Modeling and Language Generation Coordinated Text & Graphics Planning Robust Dialogue Understanding Advanced Speech Synthesis Information Extraction from HTML/XML Documents Machine Translation Multimodal Interfaces Multimedia Presentation Planning
  • SmartKom: A Transportable and Transmutable Interface Agent
      • SmartKom-Home/Office: A Versatile Agent-based Interface
      • SmartKom-Public: A Multimodal Communication Booth
      • SmartKom-Mobile: A Handheld Communication Assistant
      • Kernel of SmartKom Interface Agent: Media Analysis, Interaction Management, Application Management, Media Design
  • SmartKom: Intuitive Multimodal Interaction. The SmartKom Consortium: Main Contractor, Project Management, Testbed, Software Integration: DFKI Saarbrücken. Partners and sites: MediaInterface, Univ. of Stuttgart, Univ. of Erlangen, Univ. of Munich, European Media Lab, DaimlerChrysler; Saarbrücken, Aachen, Dresden, Berkeley, Stuttgart, Munich, Heidelberg, Ulm. Project Budget: $ 34 M. Project Duration: 4 years
  • SmartKom-Public: A Multimodal Communication Booth: Smartcard/Credit Card for authentication and billing; Docking station for PDA/Notebook/Camcorder; high speed and broad bandwidth Internet connectivity; High-resolution scanner; Loudspeaker; Room microphone; Face-tracking camera; Virtual touchscreen protected against vandalism; Multipoint video conferencing
  • SmartKom-Mobile: A Handheld Communication Assistant: Camera; GPS; Microphone; Loudspeaker; Stylus-Activated Sketch Pad; Wearable Compute Server; Docking Station for Car PC; Biosensor for Authentication & Emotional Feedback; GSM for Telephone, Fax, Internet Connectivity
  • SmartKom-Home/Office: A Versatile Agent-based Interface SpeechMike Virtual Touchscreen Natural Gesture Recognition
  • The Architecture of the SmartKom Agent (cf. Maybury/Wahlster 1998) User(s) Media Analysis Design Media Fusion Output Rendering Representation and Inference User Model Discourse Model Domain Model Task Model Media Models Interaction Management Media Analysis Input Processing Information Applications People Intention Recognition Media Design Application Interface Discourse Modeling User Modeling Presentation Design Language Graphics Gesture Biometrics Language Graphics Gesture Animated Presentation Agent
  • Our Research Agenda for Agent-based Interfaces (Wahlster, André, Rist, Müller, Graf etc.)
      • 1. Personalized Presentation Agents (limited user interaction), WIP: 1989-1993
      • 2. Personalized Interface Agents (full user interaction), PPP: 1994-1996
      • 3. Multiple Interface Agents (agent-agent and user interaction), AiA: 1997-2000: Multiple Presentation Agents in one scene (e.g. pros and cons); Multiple Role-Taking (e.g. Travel Assistance vs. Comparison Shopper); Multiple Interface Agents (e.g. human-computer, human-human interaction)
  • Multiple Agents Discussing Pros and Cons of a Mercedes Model I recommend you this SLX limousine.
  • Research on Intelligent Web Services brings disparate subfields in the area of intelligent systems together: User Modeling, Planning, Natural Language Understanding, Knowledge Representation, Image Understanding, Machine Learning, Plan Recognition, Information Retrieval, Multimodal User Interfaces
  • Conclusion: ECommerce projects of DFKI have shown that research on agent-based multimodal interfaces can be transferred to real-world applications:
      • Dekra (largest European organization of used car dealers): FairCar as an ECommerce platform with NL access and a comparison shopping agent for used cars
      • DaimlerChrysler: IKP for online user modelling in a one-to-one marketing system for Mercedes cars
      • Otto/Shopping24/Eddie Bauer (largest European mail order company): virtual sales agents for one-to-one marketing of fashion and computer hardware
      • Lufthansa/Condor: direct marketing of charter flights
  • Conclusion: Two Research Challenges:
      • Making the interface agents sensitive to temporary limitations of the user's time and working memory capacity (cf. our READY project, Jameson et al., pp. 79-85 in IUI99 Proceedings)
      • Making the agents instructible, so that they can learn from the user in a dialog how to extract information in a new domain (cf. our PAN project, Bauer/Dengler, pp. 153-156 in IUI99 Proceedings)
    The generation of virtual web pages by agent-based multimodal interfaces leads to innovative applications in Electronic Commerce, Electronic TV Guides (EPG), Telelearning environments, Call Centers and Help Desks.