Fusion Engines for Input Multimodal Interfaces: a Survey


Fusion engines are fundamental components of multimodal interactive systems: they interpret temporal combinations of deterministic as well as non-deterministic inputs whose meaning can vary according to the context, user and task. While various surveys have already been published on multimodal interactive systems, this paper focuses on the design, specification, construction and evaluation of fusion engines. The article first introduces the adopted terminology and the major challenges that fusion engines address. A history of the work achieved in the field is then presented according to the main phases of the BRETAM model, followed by a classification of existing approaches along three dimensions: types of applications, fusion principles and temporal aspects. Finally, unsolved challenges, such as software frameworks, quantitative evaluation, machine learning and adaptation, are identified as future work in the field of fusion engines.


Transcript

  • 1. Special session on Multimodal Fusion
    • A survey: Fusion Engines for Multimodal Input
    • 5 papers
    D. Lalanne (Switzerland), L. Nigay (France), P. Palanque (France), P. Robinson (UK), J. Vanderdonckt (Belgium)
  • 2. Multimodal fusion
    • Multimodal fusion for
      • Perception
      • Interaction
    • Focus on multimodal interaction
      • 4 papers on multimodal interaction
      • 1 paper on multimodal perception (first one)
  • 3. Input Multimodal Interaction
  • 4. Input Fusion Engines
    • Multimodal fusion
      • Combining and interpreting data from multiple input modalities
      • Usage of input modalities
    Usage over time vs. combination of meaning:
                    Sequential   Parallel
      Combined      Alternate    Synergistic
      Independent   Exclusive    Concurrent
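The two axes of this design space (sequential vs. parallel usage, combined vs. independent meaning) can be sketched as a lookup table. This is an illustrative helper, not code from any fusion engine named in the survey:

```python
from enum import Enum

class Use(Enum):        # temporal usage of the modalities
    SEQUENTIAL = "sequential"
    PARALLEL = "parallel"

class Meaning(Enum):    # whether modality meanings are fused
    COMBINED = "combined"
    INDEPENDENT = "independent"

# Map the two axes to the four classical usage classes.
USAGE_CLASS = {
    (Use.SEQUENTIAL, Meaning.COMBINED): "alternate",
    (Use.SEQUENTIAL, Meaning.INDEPENDENT): "exclusive",
    (Use.PARALLEL, Meaning.COMBINED): "synergistic",
    (Use.PARALLEL, Meaning.INDEPENDENT): "concurrent",
}

# e.g. speech and pointing issued together and interpreted jointly:
print(USAGE_CLASS[(Use.PARALLEL, Meaning.COMBINED)])  # synergistic
```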
  • 5. Input Fusion Engines
    • Why a combined usage (sequential, parallel) of modalities?
    • Human interaction is multimodal by nature.
    • The combination of input modalities increases the bandwidth of the human-computer interaction.
  • 6. Fusion engines
    • A very dynamic domain
    • ~15 years of contributions: 1993-2008
  • 7. Input Fusion engines
    • Some key features
      • Multiple and temporal combinations
        • Types of data and time synchronization
      • Probabilistic inputs
        • Non-deterministic inputs
      • Robustness
      • Error handling
      • Adaptation to context
        • Context = (user, environment, platform)
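The key features above (temporal combination, probabilistic inputs, error handling) can be illustrated with a minimal time-windowed fusion sketch. All names are hypothetical; this is not the implementation of any engine in the survey, just a toy frame of the idea:

```python
from dataclasses import dataclass

@dataclass
class Event:
    modality: str      # e.g. "speech" or "gesture"
    value: str         # recognized token or target id
    confidence: float  # non-deterministic recognizer score
    timestamp: float   # seconds

def fuse(events, window=1.5, threshold=0.5):
    """Pair speech and gesture events falling within a time window,
    multiplying recognizer confidences; unpaired or low-confidence
    combinations are rejected (crude error handling)."""
    speech = [e for e in events if e.modality == "speech"]
    gesture = [e for e in events if e.modality == "gesture"]
    commands = []
    for s in speech:
        for g in gesture:
            if abs(s.timestamp - g.timestamp) <= window:
                score = s.confidence * g.confidence
                if score >= threshold:
                    commands.append((s.value, g.value, score))
    return commands

events = [
    Event("speech", "delete", 0.9, 10.0),
    Event("gesture", "icon-42", 0.8, 10.4),  # within window: fused
    Event("gesture", "icon-7", 0.95, 20.0),  # no nearby speech: dropped
]
commands = fuse(events)  # one fused command: ("delete", "icon-42", 0.72)
```

A real engine would of course handle more than pairwise combination (n-best lists, context-dependent resolution), but the time window and confidence product capture the two core ingredients listed above.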
  • 8. Classification: Fusion engines. 1980: R. Bolt, “Put that there”
  • 9. Classification: Fusion engines. 1980: R. Bolt, “Put that there”; Cubricon (1989); CARE (1995); Quickset (1997); ICARE (2004); Petshop (2004); FAME (2006)
  • 10. Classification: Fusion engines. From R. Bolt’s “Put that there” (1980) to the multiple (up to 255) input API in Windows 7 and the Microsoft MultiPoint SDK (“Zoom in here”). UX beats Usability: a gap.
  • 11. Theories and Contributions over Time
  • 12. Classification of fusion engines (part 1). Row groups follow the BRETAM phases: B = Breakthrough, R = Replication, E = Empiricism.

    | Phase | Reference | Tool / program | Fusion notation | Fusion type | Level | Input devices | Ambiguity resolution | Time (quant.) | Time (qual.) | Application type |
    |---|---|---|---|---|---|---|---|---|---|---|
    | B | Bolt [4] | “Put that there” system | None | None | Dialog | Speech, gesture | ? | N | ? | Map manipulation |
    | R | Wahlster | XTRA | None | Unification | Dialog | Keyboard, mouse | | N | Y | Map manipulation |
    | R | Neal [26] | Cubricon | Generalized Augmented Transition Network | Procedural | Dialog | Speech, mouse, keyboard | Proximity-based | N | Y | Map manipulation |
    | E | Koons [19] | (no name) | Parse tree | Frame-based | Dialog | Speech, eye gaze, gesture | First solution | Y | Y | 3D world |
    | E | Nigay [28] | Pac-Amodeus | Melting pot | Frame-based | Dialog + low level | Speech, keyboard, mouse | Context-based resolution | Y | N | Flight scheduling |
    | E | Cohen [9] | Quickset | Feature Structure | Unification | Dialog | Pen, voice | S/G, G/S, N-best | Y | N | Simulation system training |
    | E | Bellik [3] | MEDITOR | None | Frame-based | Dialog + low level | Speech, mouse | History buffer | Y | Y | Text editor |
    | E | Martin [22] | TYCOON | Guided Propagation Networks (set of processes) | Procedural | Dialog | Speech, keyboard, mouse | Probability-based resolution | Y | Y | Editing of graphical user interfaces |
    | E | Johnston [18] | FST | Finite State Automata | Procedural | Dialog | Speech, pen | Possible (N-best) | Y | Y | Corporate directory |
  • 13. Classification of fusion engines (part 2). Phase T & A = Theory & Automation.

    | Phase | Reference | Tool / program | Fusion notation | Fusion type | Level | Input devices | Ambiguity resolution | Time (quant.) | Time (qual.) | Application type |
    |---|---|---|---|---|---|---|---|---|---|---|
    | T & A | Krahnstoever [20] | iMap | Stream stamped | Frame-based | Dialog | Speech, gesture | Not given | Y | N | Crisis management |
    | T & A | Dumas [12] | HephaisTK | XML typed (SMUIML) | Frame-based | Dialog | Speech, mouse, Phidgets | First one | Y | Y | Meeting assistants |
    | T & A | Holzapfel [17] | (no name) | Typed Feature Structure | Unification | Dialog | Speech, gesture | N-best list | Y | N | Humanoid robot |
    | T & A | Pfleger [33] | PATE | XML typed | Unification | Dialog | Speech, pen | N-best list | Y | Y | Bathroom design tool |
    | T & A | Milota [25] | (no name) | Multimodal parse tree | Unification | Dialog | Speech, mouse, keyboard, touchscreen | S/G, G/S | Y | N | Graphic design |
    | T & A | Melichar [24] | WCI | Multimodal Generic Dialog Node | Unification | Dialog | Speech, mouse, keyboard | First one | ? | ? | Multimedia DB |
    | T & A | Sun [37] | PUMPP | Matrix | Unification | Dialog | Speech, gesture | S/G | N | Y | Traffic control |
    | T & A | Bourguet [7] | Mengine | Finite state machine | Procedural | Low level | Speech, mouse | Not given | N | Y | (no example) |
    | T & A | Latoschik [21] | (no name) | Temporal Augmented Transition Network | Procedural | Dialog | Speech, gesture | Fuzzy constraints | Y | Y | Virtual reality |
    | T & A | Bouchet [5][6], Mansoux [23] | ICARE (input/output) | Melting pot | Frame-based | Dialog + low level | Speech, helmet visor, HOTAS, tactile surface, GPS localization, magnetometer, mouse, keyboard | Context-based resolution | Y | N | Aircraft cockpit, authentication, mobile augmented reality systems (game, Post-it), augmented surgery |
    | T & A | Navarre [30] | Petshop | Petri nets | Procedural | Dialog + low level | Speech, mouse, keyboard, touchscreen | *** | Y | Y | Aircraft cockpit |
    | T & A | Flippo [14] | (no name) | Semantic tree | Hybrid | Dialog | Speech, mouse, gaze, gesture | Feedback for missing data | Y | N | Collaborative map |
    | T & A | Portillo [34] | MIMUS | Feature Value Structure (DTAC) | Hybrid | Dialog | Speech, mouse | Knowledgeable agent | Y | N | |
    | T & A | Duarte [11] | FAME | Behavioral matrix | Hybrid | Dialog | Speech, mouse, keyboard | Not given | ? | ? | Digital talking book |
  • 14. Special session Multimodal Fusion
    • Content
      • A survey
      • 5 papers
    • Schedule
      • 10 min: introduction and survey outlook
      • 15 min per paper + 5 min for questions
      • 10 min for questions on the session
    D. Lalanne (Switzerland), L. Nigay (France), P. Palanque (France), P. Robinson (UK), J. Vanderdonckt (Belgium)
  • 15. Special session Multimodal Fusion
    • H. Mendonça: Agent-based fusion
    • B. Dumas: An evaluation framework to benchmark fusion engines
    • L. Nigay: CARE-based fusion
    • J. Ladry & P. Palanque: Petri net based formal description and execution of fusion engines
    • M. Sezgin: Fusion of speech and facial expression recognition
  • 16. QUESTIONS?
  • 17. Fusion engines: research agenda
    • Performance evaluation
      • Testbeds, metrics
      • Identification of interpretation errors
      • Formal predictive evaluation
    • Adaptation to context
      • Dynamic aspect of adaptation
      • Reconfigurations
    • Engineering aspects
      • Difficult to develop (toolkit from manufacturers required)
      • Fusion engine tuning (tuning is key for interaction techniques, e.g. drag-and-drop)
  • 18. Fusion Principles
    • Notation: Petri-net based (ICOs)
    • Type: Procedural only
    • Level: Dialogue and low level
    • Input Devices: Speech, mice, keyboard, touch screen
    • Ambiguity resolution: inside models
    • Time representation (Quantitative – Qualitative): Both
    • Application Type : Safety Critical, Aeronautics and Space
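The procedural, Petri-net style of fusion described on this slide can be illustrated with a toy net (not the actual ICO notation or the Petshop tool): tokens arriving from input devices mark places, and a fusion transition fires only once all of its input places are marked.

```python
# Minimal Petri-net-style fusion: places hold tokens, and the "fuse"
# transition fires only when every one of its input places is marked.
class PetriNet:
    def __init__(self):
        self.places = {"speech_in": [], "touch_in": [], "fused": []}

    def add_token(self, place, token):
        self.places[place].append(token)
        self._try_fire()

    def _try_fire(self):
        # Transition "fuse" consumes one speech and one touch token
        # and produces a fused command token.
        while self.places["speech_in"] and self.places["touch_in"]:
            s = self.places["speech_in"].pop(0)
            t = self.places["touch_in"].pop(0)
            self.places["fused"].append((s, t))

net = PetriNet()
net.add_token("speech_in", "zoom in")   # nothing fires yet
net.add_token("touch_in", (120, 80))    # both places marked: "fuse" fires
```

Timed arcs and inhibitor arcs, which real ICO models use for quantitative and qualitative time, are omitted here; the point is only the firing rule that makes fusion procedural.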