
Fusion Engines for Input Multimodal Interfaces: a Survey

Fusion engines are fundamental components of multimodal interactive systems, to interpret temporal combinations of deterministic as well as non-deterministic inputs whose meaning can vary according to the context, user and task. While various surveys have already been released on the topic of multimodal interactive systems, the current paper focuses on the design, specification, construction and evaluation of fusion engines. The article first introduces the adopted terminology and the major challenges that fusion engines propose to solve. Further, a history of the work achieved in the field of fusion engines is presented according to the main phases of the BRETAM model. A classification of existing approaches for fusion engines is then presented. The classification dimensions include the types of applications, the fusion principles and the temporal aspects. Finally, unsolved challenges, such as software frameworks, quantitative evaluation, machine learning and adaptation, sketch future work in the field of fusion engines.

Transcript of "Fusion Engines for Input Multimodal Interfaces: a Survey"

  1. Special session on Multimodal Fusion • A survey: Fusion Engines for Multimodal Input • 5 papers • D. Lalanne (Switzerland), L. Nigay (France), P. Palanque (France), P. Robinson (UK), J. Vanderdonckt (Belgium)
  2. Multimodal fusion • Multimodal fusion for: Perception, Interaction • Focus on multimodal interaction • 4 papers on multimodal interaction • 1 paper on multimodal perception (first one)
  3. Input Multimodal Interaction
  4. Input Fusion Engines • Multimodal fusion: combining and interpreting data from multiple input modalities • Usage of input modalities: Sequential + Combined = Alternate; Sequential + Independent = Exclusive; Parallel + Combined = Synergistic; Parallel + Independent = Concurrent
  5. Input Fusion Engines • Why combined usage (sequential, parallel)? • Natural interaction is multimodal by nature. • The combination of input modalities increases the bandwidth of human-computer interaction.
  6. Fusion engines • A very dynamic domain • ~15 years of contributions: 1993-2008
  7. Input Fusion engines • Some key features • Multiple and temporal combinations • Types of data and time synchronization • Probabilistic inputs • Non-deterministic inputs • Robustness • Error handling • Adaptation to context • Context = (user, environment, platform)
  8. Classification: Fusion engines • 1980: R. Bolt, “Put that there”
  9. Classification: Fusion engines • 1980: R. Bolt, “Put that there” • 1989: Cubricon • 1995: CARE • 1997: Quickset • 2004: ICARE • 2004: Petshop • 2006: FAME
  10. Classification: Fusion engines • 1980: R. Bolt, “Put that there” • Multiple (up to 255) Input API in Windows 7 • Microsoft MultiPoint SDK • “Zoom in here” • UX beats Usability • A gap
  11. Theories and Contributions over Time
  12. Table: classification of fusion engines

      | BRETAM phase | Reference | Tool / language / program | Fusion notation | Fusion type | Fusion level | Input devices | Ambiguity resolution | Time: quantitative | Time: qualitative | Application types |
      |---|---|---|---|---|---|---|---|---|---|---|
      | B | Bolt [4] | Put that there system | None | None | Dialog | Speech, gesture | ? | N | ? | Map manipulation |
      | R | Wahlster | XTRA | None | Unification | Dialog | Keyboard, Mouse | | N | Y | Map manipulation |
      | | Neal [26] | Cubricon | Generalized Augmented Transition Network | Procedural | Dialog | Speech, Mouse, Keyboard | Proximity-based | N | Y | Map manipulation |
      | E | Koons [19] | No name | Parse tree | Frame-based | Dialog | Speech, Eye gaze, Gesture | First solution | Y | Y | 3D World |
      | | Nigay [28] | Pac-Amodeus | Melting Pot | Frame-based | Dialog + low level | Speech, Keyboard, Mouse | Context-based resolution | Y | N | Flight Scheduling |
      | | Cohen [9] | Quickset | Feature Structure | Unification | Dialog | Pen, Voice | S / G & G / S & N best | Y | N | Simulation System training |
      | | Bellik [3] | MEDITOR | None | Frame-based | Dialog + low level | Speech, Mouse | History Buffer | Y | Y | Text Editor |
      | | Martin [22] | TYCOON | Set of processes – Guided Propagation Networks | Procedural | Dialog | Speech, Keyboard, Mouse | Probability-based resolution | Y | Y | Edition of graphical user interfaces |
      | | Johnston [18] | FST | Finite State Automata | Procedural | Dialog | Speech, pen | Possible (N best) | Y | Y | Corporate Directory |
      | T & A | Krahnstoever [20] | iMap | Stream Stamped | Frame-based | Dialog | Speech, gesture | Not given | Y | N | Crisis Management |
      | | Dumas [12] | HephaisTK | XML Typed (SMUIML) | Frame-based | Dialog | Speech, Mouse, Phidgets | First one | Y | Y | Meeting assistants |
      | | Holzapfel [17] | No Name | Typed Feature Structure | Unification | Dialog | Speech, gesture | N Best list | Y | N | Humanoid Robot |
      | | Pfleger [33] | PATE | XML Typed | Unification | Dialog | Speech, pen | N Best list | Y | Y | Bathroom design Tool |
      | | Milota [25] | No Name | Multimodal Parse Tree | Unification | Dialog | Speech, Mouse, keyboard, Touchscreen | S / G & G / S | Y | N | Graphic Design |
      | | Melichar [24] | WCI | Multimodal Generic Dialog Node | Unification | Dialog | Speech, Mouse, Keyboard | First One | ? | ? | Multimedia DB |
      | | Sun [37] | PUMPP | Matrix | Unification | Dialog | Speech, gesture | S / G | N | Y | Traffic Control |
      | | Bourguet [7] | Mengine | Finite State machine | Procedural | Low level | Speech, Mouse | Not given | N | Y | No example |
      | | Latoschik [21] | No Name | Temporal Augmented Transition Network | Procedural | Dialog | Speech, gesture | Fuzzy constraints | Y | Y | Virtual reality |
      | | Bouchet [5] [6], Mansoux [23] | ICARE (Input/Output) | Melting pot | Frame-based | Dialog + low level | Speech, Helmet visor HOTAS, Tactile surface, GPS localization, Magnetometer, Mouse, Keyboard | Context-based resolution | Y | N | Aircraft Cockpit, Authentication, Mobile Augmented Reality systems (Game, Post-it), Augmented Surgery |
      | | Navarre [30] | Petshop | Petri nets | Procedural | Dialog + low level | Speech, mouse, Keyboard, Touchscreen | *** | Y | Y | Aircraft Cockpit |
      | | Flippo [14] | No Name | Semantic tree | Hybrid | Dialog | Speech, Mouse, Gaze, gesture | Feedback for missing data | Y | N | Collaborative Map |
      | | Portillo [34] | MIMUS | Feature Value Structure (DTAC) | Hybrid | Dialog | Speech, Mouse | Knowledgeable agent | Y | N | |
      | | Duarte [11] | FAME | Behavioral Matrix | Hybrid | Dialog | Speech, Mouse, Keyboard | Not given | ? | ? | Digital talking Book |
  13. Same table as slide 12, shown from Bolt [4] through the ICARE row (Bouchet [5] [6], Mansoux [23]).
  14. Special session Multimodal Fusion • Content • A survey • 5 papers • Schedule • 10 min introduction and survey outlook • 15 min per paper + 5 min questions • 10 min for questions on the session • D. Lalanne (Switzerland), L. Nigay (France), P. Palanque (France), P. Robinson (UK), J. Vanderdonckt (Belgium)
  15. Special session Multimodal Fusion • H. Mendonça: Agent-based fusion • B. Dumas: An evaluation framework to benchmark fusion engines • L. Nigay: CARE-based fusion • J. Ladry & P. Palanque: Petri net based formal description and execution of fusion engines • M. Sezgin: Fusion of speech and facial expression recognition
  16. QUESTIONS?
  17. Fusion engines: research agenda • Performance evaluation • Testbeds, metrics • Identification of interpretation errors • Formal predictive evaluation • Adaptation to context • Dynamic aspect of adaptation • Reconfigurations • Engineering aspects • Difficult to develop (toolkit from manufacturers required) • Fusion engine tuning (tuning is the key for …
  18. Fusion Principles • Notation: Petri nets based (ICOs) • Type: Procedural only • Level: Dialogue and low level • Input Devices: Speech, mice, keyboard, touch screen • Ambiguity resolution: inside models • Time representation (Quantitative – Qualitative): Both • Application Type: Safety Critical, Aeronautics and Space
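
The usage grid on slide 4 and the "Put that there" example can be illustrated with a small sketch. It is not taken from any of the surveyed engines: the `InputEvent` type, the function names and the 1-second pairing window are all assumptions made for illustration. `usage_type` classifies a pair of inputs as alternate, exclusive, synergistic or concurrent from their time intervals, and `resolve_deictics` pairs each spoken "that" or "there" with the closest pointing gesture in time.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InputEvent:
    """Hypothetical time-stamped input; not an API of any surveyed engine."""
    modality: str   # "speech" or "gesture"
    value: str      # recognized word, or the referent of a pointing gesture
    start: float    # seconds
    end: float      # seconds


def usage_type(a: InputEvent, b: InputEvent, combined: bool) -> str:
    """Classify the use of two inputs following the slide 4 grid."""
    parallel = a.start < b.end and b.start < a.end  # time intervals overlap
    if combined:
        return "synergistic" if parallel else "alternate"
    return "concurrent" if parallel else "exclusive"


def resolve_deictics(events: List[InputEvent],
                     window: float = 1.0) -> List[Optional[str]]:
    """Replace each spoken deictic by the nearest unused gesture in time.

    A deictic with no gesture starting within `window` seconds of it is
    returned as None, so the caller can treat it as an interpretation error.
    """
    gestures = [e for e in events if e.modality == "gesture"]
    used = set()
    result: List[Optional[str]] = []
    speech = sorted((e for e in events if e.modality == "speech"),
                    key=lambda e: e.start)
    for word in speech:
        if word.value not in ("that", "there"):
            result.append(word.value)
            continue
        candidates = [(abs(g.start - word.start), i)
                      for i, g in enumerate(gestures)
                      if i not in used and abs(g.start - word.start) <= window]
        if not candidates:
            result.append(None)  # unresolved reference
            continue
        _, best = min(candidates)
        used.add(best)
        result.append(gestures[best].value)
    return result


events = [
    InputEvent("speech", "put", 0.0, 0.3),
    InputEvent("speech", "that", 0.4, 0.6),
    InputEvent("gesture", "blue_square", 0.5, 0.7),
    InputEvent("speech", "there", 1.2, 1.4),
    InputEvent("gesture", "(40, 12)", 1.3, 1.5),
]
print(resolve_deictics(events))
print(usage_type(events[1], events[2], combined=True))
```

Picking the nearest gesture in time is only one possible ambiguity-resolution policy. The engines in the table on slide 12 use others, such as N-best lists or context-based resolution.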