IUI99, International Conference on Intelligent User Interfaces, Los Angeles, January 6th, 1999. Agent-based Multimedia Interaction for Virtual Web Pages. Wolfgang Wahlster, German Research Center for Artificial Intelligence, DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbruecken, Germany. phone: (+49 681) 302-5252/4162 fax: (+49 681) 302-5341 e-mail: email@example.com WWW: http://www.dfki.de/~wahlster
Outline: What are Virtual Web Pages?; What are Agent-Based User Interfaces?; Using Life-like Characters for the Personalization of User Interfaces; Plan-based Media Transformation and Coordination; The Combination of Retrieved and Generated Media Objects for the Generation of Virtual Web Pages; Commercial Applications in Advanced WebCommerce; SmartKom: a Transportable and Transmutable Multimodal Interface; Our Research Agenda for Agent-based User Interfaces; Conclusion
What is a Virtual Web Page? A Virtual Web Page: is generated on the fly as a combination of various media objects from multiple web sites or as a transformation of a real web page; looks like a real web page, but is not persistently stored; integrates generated and retrieved material in a coordinated way; can be tailored to a particular user profile and adapted to a particular interaction context; has an underlying representation of the presentation context so that an Interface Agent can comment on, point to and explain its components. Compare: Virtual Memory, Virtual Relation, Virtual Reality...
Virtual Webpage Retrieved from 5 Different Servers
What are Agent-based User Interfaces? INTERFACE AGENTS are: PROACTIVE (anticipate the user's needs; adopt the user's goals; provide unsolicited comments), ACTIVE (appear as life-like characters; plan interactive behavior autonomously; can initiate interaction), REACTIVE (respond immediately to interruptions, to criticism and clarification questions, and to direct manipulation)
Intelligent Web Services Consumer Provider sells Information Goods Services buys Information Goods Services Web Sites Knowledge about: Usage Patterns User Models Consumer Profiles Netbot Intelligent Parallel Retrieval Information Extraction and Summarization Personalized Presentation Matchmaking Teleshopping Assistance Telemarketing Assistance Translation Services Data Mining Services
Netbots as Personal Assistants for WWW Users Netbots Indices, Directories, Search Engines WWW Netbot := Intelligent Agent that uses Internet tools on a person’s behalf Traveller’s Netbot: Tries to achieve traveller’s goals (finding and executing plans); checks availability; finds best price; uses personal preferences (e.g. frequent flyer programme, seating preferences); lets the traveller know when seats become available (active help) Mass Services Personal Assistants e.g. Ahoy!, Jango, AiA
A Netbot for Portrait Photos: The Personal Picture Finder Portrait Photo Netbot: Personal Picture Finder Parallel Meta-Search of Webpages for <Name> <Name> Parallel Search in Picture Archives Home Pages Extraction of Images Filtering of Logos, Graphics, ... WWW Knowledge Sources Applications: Journalism, Contact Preparation, Tracing Criminals http://finder.dfki.de:7000
Netbots versus Push Technologies. Push Technologies: the provider sends information to Customer 1, Customer 2, ..., Customer N (+ no effort for customer, - minimal adaptation). Interactive Pull: the customer sends a query to the provider and receives information (+ good adaptation, - major effort for customer). Netbots with Parallel Pull: the customer's netbot sends the query to Provider 1, Provider 2, ..., Provider N and returns the information (+ good adaptation, + minimal effort for customer).
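The "Netbots with Parallel Pull" pattern can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the three provider functions, the route string, the prices, and the `max_price` preference are hypothetical stand-ins, not part of any DFKI netbot.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical providers: each maps a query to a list of (offer, price) pairs.
def provider_a(query):
    return [(f"{query} via A", 420.0)]

def provider_b(query):
    return [(f"{query} via B", 385.0), (f"{query} via B (flex)", 510.0)]

def provider_c(query):
    return []  # a provider may have nothing to offer

def parallel_pull(query, providers, max_price=None):
    """Send one query to all providers concurrently and merge the answers."""
    with ThreadPoolExecutor(max_workers=len(providers)) as pool:
        result_lists = list(pool.map(lambda provider: provider(query), providers))
    offers = [offer for results in result_lists for offer in results]
    # Adaptation step: apply the customer's (hypothetical) price preference.
    if max_price is not None:
        offers = [offer for offer in offers if offer[1] <= max_price]
    return sorted(offers, key=lambda offer: offer[1])

best = parallel_pull("SCN-LAX", [provider_a, provider_b, provider_c], max_price=500.0)
```

The customer issues a single query, and the filtering and ranking happen on the netbot side, which is where the "good adaptation, minimal effort" combination comes from.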
Enhancement of User Interfaces through Personalization System is able to flexibly tailor presentations to the individual user and the current situation. An animated character serves as “Alter Ego” of the presentation system. Personalized Presenters at DFKI
Personalized Package Presentation by an Animated Agent Personalization = adaptation of system behavior according to a user model Personalization = the “agent” appears as an animated character which presents, explains and comments on an offer, and to which the user can talk.
Video Character Presents the Interior of a Boeing 757
PPP’s Persona Server implements a generic Presentation Agent that can be easily adapted to various applications Persona Server Behaviors Presentation Gestures Reactive Behaviors Idle-time actions Navigation actions Auditory Characteristics Sound effects, auditory icons Voice: male, female Visual Appearances Hand-drawn Cartoon Bitmaps Generated Bitmaps from 3D-Models Video Bitmaps
The frames of the visual appearance of persona can be cartoon-like images or video-frames which show real persons More than 200 cartoon frames were drawn by a professional artist. A real persona was filmed with a video-camera and the pictures were digitized with a frame grabber.
Context-Sensitive Decomposition of Persona Actions. High-Level Persona Actions: take-position (t1 t2), point-to (t3 t4). Context-Sensitive Expansion (including Navigation Actions): move-to (t1 t2), r-stick-pointing (t3 t4). Decomposition into Uninterruptible Basic Postures: r-turn (t1 t21), r-step (t21 t22), f-turn (t22 t2); r-hand-lift (t3 t31), r-stick-expose (t31 t4). Bitmaps ...
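The decomposition on this slide could be sketched in Python as follows. This is only an illustration under stated assumptions: the `context` dictionary, the left-hand postures `l-turn` and `l-step`, and the even split of the time interval are hypothetical and do not come from the Persona Server.

```python
def decompose(action, start, end, context):
    """Expand one high-level Persona action into timed basic postures.

    context["position"] and context["target"] are hypothetical screen
    x-coordinates; they are not part of the original slides.
    """
    if action == "take-position":
        if context["position"] == context["target"]:
            return []  # already in place, so no navigation is needed
        # Walk right or left depending on where the target lies.
        side = "r" if context["target"] > context["position"] else "l"
        postures = [f"{side}-turn", f"{side}-step", "f-turn"]
    elif action == "point-to":
        postures = ["r-hand-lift", "r-stick-expose"]
    else:
        raise ValueError(f"unknown Persona action: {action}")
    # Assumption: the interval is split evenly among the postures.
    width = (end - start) / len(postures)
    return [(name, start + i * width, start + (i + 1) * width)
            for i, name in enumerate(postures)]

walk = decompose("take-position", 0.0, 3.0, {"position": 0, "target": 5})
point = decompose("point-to", 3.0, 4.0, {"position": 5, "target": 5})
stay = decompose("take-position", 0.0, 3.0, {"position": 5, "target": 5})
```

The same high-level action yields different posture sequences (or none at all) depending on the context, which is the sense of "context-sensitive" illustrated here.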
PPP System Architecture Multimedia Generation Text Generator Graphics Generator Gesture Generator Animation Generator Presentation Planner (PREPLAN) KR & Reasoning (RAT) Production Acts Generated Material Presentation Acts Signals + Events Multimedia Server Layout Manager Persona Server Music Generator
Task of the Presentation Planner Plan multimedia material as well as presentation acts and their temporal coordination Presentation Acts Persona Acts Display Acts This is the transformer
Extensions of the Representation Formalism Production Act Presentation Act Introduce Create-Graphics S-Show S-Wait S-Position Elaborate-Parts S-Create-Window S-Depict Label Label S-Point S-Speak S-Speak S-Point Qualitative constraints: Create-Graphics meets S-Show, ... Metric constraints: 1 <= Duration S-Wait <= 1, ... Distinction between production and presentation acts (i.e. Persona or display acts) Explicit representation of qualitative and quantitative constraints
Presentation Strategies in PPP contain qualitative and metric constraints
(define-plan-operator
  :HEADER (A0 (INTRODUCE P A ?object ?window))
  :INFERIORS ((A1 (CREATE-GRAPHICS P A ?window ?object))
              (A2 (S-SHOW P A ?window ?object))
              (A3 (S-POSITION P A ?window))
              (A4 (S-WAIT P A))
              (A5 (ELABORATE-PARTS P A ?object ?window)))
  :QUALITATIVE ((A1 (m) A2) (A3 (s) A2) (A3 (m) A5) (A5 (m) A4) (A4 (f) A2))
  :METRIC ((10 <= DUR A2) (2 <= DUR A4 <= 2))
  :START A1
  :FINISH A2)
(cf. [André/Rist 97])
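One way to check that the qualitative and metric constraints of such an operator are jointly satisfiable is to translate them into difference constraints over interval endpoints and look for negative cycles. The Python sketch below does this with a simple temporal network and Floyd-Warshall closure. It is a generic technique shown under assumptions, not PPP's actual temporal reasoner: the class and method names are hypothetical, time is counted in integer units, and the relations (m), (s) and (f) are read as Allen's meets, starts and finishes, with strict inequalities approximated by a gap of at least one unit.

```python
import itertools

INF = float("inf")

class TemporalNetwork:
    """Simple temporal network over interval endpoints (difference constraints)."""

    def __init__(self):
        self.index = {}   # endpoint name -> matrix index
        self.edges = []   # (i, j, w) encodes  t_j - t_i <= w

    def _var(self, name):
        return self.index.setdefault(name, len(self.index))

    def bound(self, a, b, lo=0, hi=INF):
        """Require lo <= t_b - t_a <= hi."""
        i, j = self._var(a), self._var(b)
        if hi != INF:
            self.edges.append((i, j, hi))
        self.edges.append((j, i, -lo))

    def interval(self, act, min_dur=1, max_dur=INF):
        self.bound(f"{act}.start", f"{act}.end", min_dur, max_dur)

    def meets(self, x, y):      # (m): x ends exactly when y starts
        self.bound(f"{x}.end", f"{y}.start", 0, 0)

    def starts(self, x, y):     # (s): same start, x ends strictly earlier
        self.bound(f"{x}.start", f"{y}.start", 0, 0)
        self.bound(f"{x}.end", f"{y}.end", 1)

    def finishes(self, x, y):   # (f): same end, x starts strictly later
        self.bound(f"{x}.end", f"{y}.end", 0, 0)
        self.bound(f"{y}.start", f"{x}.start", 1)

    def closure(self):
        """All-pairs tightest bounds via Floyd-Warshall."""
        n = len(self.index)
        d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
        for i, j, w in self.edges:
            d[i][j] = min(d[i][j], w)
        for k, i, j in itertools.product(range(n), repeat=3):
            d[i][j] = min(d[i][j], d[i][k] + d[k][j])
        return d

    def consistent(self):
        # A negative diagonal entry means a negative cycle, i.e. a contradiction.
        return all(row[i] >= 0 for i, row in enumerate(self.closure()))

    def min_gap(self, a, b):
        """Smallest feasible value of t_b - t_a."""
        d = self.closure()
        return -d[self.index[b]][self.index[a]]

def introduce_operator(a2_max=INF):
    """Encode the INTRODUCE operator; a2_max is a hypothetical extra limit."""
    net = TemporalNetwork()
    for act in ("A1", "A3", "A5"):
        net.interval(act)
    net.interval("A2", min_dur=10, max_dur=a2_max)   # (10 <= DUR A2)
    net.interval("A4", min_dur=2, max_dur=2)         # (2 <= DUR A4 <= 2)
    net.meets("A1", "A2")
    net.starts("A3", "A2")
    net.meets("A3", "A5")
    net.meets("A5", "A4")
    net.finishes("A4", "A2")
    return net

net = introduce_operator()
ok = net.consistent()
earliest_wait = net.min_gap("A2.start", "A4.start")
too_short = introduce_operator(a2_max=3).consistent()
```

Under these assumptions the operator is consistent, S-WAIT (A4) cannot begin earlier than 8 time units after S-SHOW (A2) starts, and capping A2 at 3 units makes the network inconsistent.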
PPP first builds up a preliminary schedule at design time PRELIMINARY SCHEDULE
The preliminary schedule is continuously updated at presentation time UPDATED SCHEDULE
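A deliberately simplified Python sketch of such an update: when an act finishes later (or earlier) than planned, every act after it is shifted by the same amount. This is an assumption-laden illustration only. The schedule format, the function name, and the purely sequential shifting rule are hypothetical; the slides describe the actual planner as using temporal constraint propagation instead.

```python
def update_schedule(schedule, act, actual_end):
    """Return a new schedule after `act` really ended at `actual_end`.

    schedule: ordered list of (name, start, end) tuples, assumed sequential.
    """
    updated = []
    shift = 0
    for name, start, end in schedule:
        if name == act:
            shift = actual_end - end            # delay (or gain) to propagate
            updated.append((name, start, actual_end))
        else:
            updated.append((name, start + shift, end + shift))
    return updated

preliminary = [("S-Show", 0, 10), ("S-Speak", 10, 14), ("S-Point", 14, 16)]
revised = update_schedule(preliminary, "S-Speak", 17)
```

Acts before the delayed one are left untouched because the shift is still zero when they are visited.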
Temporal Reasoning in the Presentation Planner for Dynamic Multimedia Coordination: Temporal Consistency Checker and Propagator; Plan Scheduler; Plan Nodes with Links to Local Temporal Constraint Networks; Plan Operators with Metric and Qualitative Temporal Constraints
Persona Presents an Automatically Designed Business Chart
The Combination of Retrieved and Generated Media Objects for Virtual Webpages Multi-Domain Problem Specs NETBOT Retrieved Results Information Structures Relations, Lists KR Terms Media Objects Texts, Sounds, Videos Pictures, Maps, Animations Distributed Information Multiple Data Sources
The Combination of Retrieved and Generated Media Objects for Virtual Webpages Retrieved Results Select & Design Select Canned Media Objects Design New Media Objects Information Structures Relations, Lists KR Terms Graphics, Animation Text, Speech, Mimic Icons, Clip Art Frames, Sounds Reuse & Transform Coordinate Media Objects Transform Media Objects Temporal Synchronization Spatial Layout Clip, Convert, Abstract Zoom, Pan, Transition Effects Media Objects Texts, Sounds, Videos Pictures, Maps, Animations
Virtual Webpage with Animation Effects Based on a Single GIF Image
Transition Effects in a Series of Retrieved Pictures
The Generation of Virtual Webpages with PAN and AiA Netbot PAN Trip Data Pictures and Graphics Pieces of Text Coordinates for Pointing Gestures Input for Speech Synthesis Icons for Hyperlinks Hotel Agent Map Agent Address Weather Agent Train & Flight Scheduling Agent Major Event Agent Virtual Web Presentation Constraint-based Online Layout Presentation Planner Persona Server Components of virtual Webpages AiA
Dynamic Node Expansion for the Conditional Generation of Virtual Web Pages S-Include-Photo S-Include-Text S-Include-Link Introduce S-Speak Illustrate Design-Intro-Page Emphasize Location Elaborate S-Speak S-Point S-Include-Map Label Location Link Selected Default Time Over/ Up This hotel has a nice swimming pool. Your hotel is located here.
Use of a Life-like Character for Electronic Commerce
Sending Interface Agents to Clients: Plug-Ins or Applets?
Plug-Ins: add features (character players) to browser; download triggered by user; requires disk space on client; unrestricted access to client; less appropriate for WebCommerce, Guides; agents integrated in 3D environments; appropriate for Entertainment. Examples: Extempo's Jennifer James (Hayes-Roth et al. 98), PFMagic's virtual petz.
Applets: Java animation code sent over the net; automatic loading; requires no disk space on client; restricted access to client; appropriate for WebCommerce, Guides; agents integrated in 2D environments; less appropriate for Entertainment. Examples: DFKI's Web Persona (Müller et al. 98), ISI's Adele (Johnson et al. 98).
New in AiA/PAN: balanced combination of Applets and Servlets; efficient distribution of client-side Java and server-side Java for driving the Interface Agent.
Alternative Business Models for Shopbots. Shopbots pass queries and transactions to Internet Shop 1, ..., Internet Shop n. (1) Banner fee: provider pays fee for banner advertisement. (2) Usage fee: provider pays usage fee. (3) Transaction fee (credit card model): provider pays fee after successful commercial transaction.
Intelligent Interface Technology is a Prerequisite for Advanced WebCommerce Advanced WebCommerce Virtual Web Pages One-to-One Marketing Intuitive, Multilingual Access Dialogue with Virtual Sales Agents Shopbots for Automated Comparison Shopping Text Analysis and Generation User Modeling and Language Generation Coordinated Text & Graphics Planning Robust Dialogue Understanding Advanced Speech Synthesis Information Extraction from HTML/XML Documents Machine Translation Multimodal Interfaces Multimedia Presentation Planning
SmartKom: A Transportable and Transmutable Interface Agent SmartKom-Home/Office: A Versatile Agent-based Interface SmartKom-Public: A Multimodal Communication Booth SmartKom-Mobile: A Handheld Communication Assistant Kernel of SmartKom Interface Agent: Media Analysis, Interaction Management, Application Management, Media Design
SmartKom: Intuitive Multimodal Interaction MediaInterface Univ. of Stuttgart Saarbrücken Aachen Dresden Berkeley Stuttgart Munich Univ. of Erlangen Heidelberg Main Contractor Project Management Testbed Software Integration DFKI Saarbrücken The SmartKom Consortium: Project Budget: $ 34 M Project Duration: 4 years Ulm European Media Lab Univ. of Munich DaimlerChrysler
SmartKom-Public: A Multimodal Communication Booth Smartcard/ Credit Card for authentication and billing Docking station for PDA/Notebook/ Camcorder high speed and broad bandwidth Internet connectivity High-resolution scanner Loudspeaker Room microphone Face-tracking camera Virtual touchscreen protected against vandalism Multipoint video conferencing
SmartKom-Mobile: A Handheld Communication Assistant MOBILE Camera GPS Microphone Loudspeaker Stylus-Activated Sketch Pad Wearable Compute Server Docking Station for Car PC Biosensor for Authentication & Emotional Feedback GSM for Telephone, Fax, Internet Connectivity
The Architecture of the SmartKom Agent (cf. Maybury/Wahlster 1998) User(s) Media Analysis Design Media Fusion Output Rendering Representation and Inference User Model Discourse Model Domain Model Task Model Media Models Interaction Management Media Analysis Input Processing Information Applications People Intention Recognition Media Design Application Interface Discourse Modeling User Modeling Presentation Design Language Graphics Gesture Biometrics Language Graphics Gesture Animated Presentation Agent
Our Research Agenda for Agent-based Interfaces (Wahlster, André, Rist, Müller, Graf etc.; www.dfki.de/imedia) 1. Personalized Presentation Agents (limited user interaction), WIP: 1989-1993. 2. Personalized Interface Agents (full user interaction), PPP: 1994-1996. 3. Multiple Interface Agents (agent-agent and user interaction), AiA: 1997-2000: Multiple Presentation Agents in one scene (e.g. pros and cons); Multiple Role-Taking (e.g. Travel Assistance vs. Comparison Shopper); Multiple Interface Agents (e.g. human-computer, human-human interaction).
Multiple Agents Discussing Pros and Cons of a Mercedes Model I recommend you this SLX limousine.
Research on Intelligent Web Services brings together disparate subfields in the area of intelligent systems: User Modeling, Planning, Natural Language Understanding, Knowledge Representation, Image Understanding, Machine Learning, Plan Recognition, Information Retrieval, Multimodal User Interfaces
Conclusion ECommerce projects of DFKI have shown that research on agent-based multimodal interfaces can be transferred to real-world applications: Dekra (largest European organization of used car dealers): FairCar as an ECommerce platform with NL access and a comparison shopping agent for used cars; DaimlerChrysler: IKP for online user modelling in a one-to-one marketing system for Mercedes cars; Otto/Shopping24/Eddie Bauer (largest European mail order company): virtual sales agents for one-to-one marketing of fashion and computer hardware; Lufthansa/Condor: direct marketing of charter flights
Conclusion Two Research Challenges: (1) Making the interface agents sensitive to temporary limitations of the user's time and working memory capacity (cf. our READY project, Jameson et al., p. 79-85 in IUI99 Proceedings). (2) Making the agents instructible, so that they can learn from the user in a dialog how to extract information in a new domain (cf. our PAN project, Bauer/Dengler, p. 153-156 in IUI99 Proceedings). The generation of virtual web pages by agent-based multimodal interfaces leads to innovative applications in: Electronic Commerce, Electronic TV Guides (EPG), Telelearning environments, Call Centers and Help Desks