A Fusion Framework for Multimodal Interactive Applications

This research aims to propose a multimodal fusion framework for high-level data fusion between two or more modalities. It takes as input low-level features extracted from different system devices, then analyses them and identifies intrinsic meanings in these data. Extracted meanings are mutually compared to identify complementarities, ambiguities, and inconsistencies, in order to better understand the user's intention when interacting with the system. The whole fusion life cycle is described and evaluated in an OCE environment scenario, where two co-workers interact by voice and movements that might reveal their intentions. The fusion in this case focuses on combining modalities to capture context and enhance the user experience.



  1. A Fusion Framework for Multimodal Interactive Applications. Presented by: Hildeberto Mendonça, Jean-Yves Lionel Lawson, Olga Vybornova, Benoit Macq, Jean Vanderdonckt. ICMI-MLMI 2009 – Cambridge MA, USA, November 2-6, 2009. Special Session: Fusion Engines for Multimodal Interfaces, November 3, 2009.
  2. Motivations
     • How can multimodal fusion be supported in a way that maximizes reuse and minimizes complexity?
     • If there is complexity in multimodal fusion, it should be about the fusion itself.
     • What already exists should be reused with minimal adaptation.
     • A general life cycle can guarantee a standard treatment for each modality.
  3. Research Goal
     To define and develop a multipurpose framework for high-level data fusion in multimodal interactive applications.
  4. Fusion Principles
     • Type: parallel + combined = synergistic.
     • Each modality is endowed with meanings.
     • Level: feature (i.e. pattern extraction) + decision (i.e. recognized task).
     • Input devices: multiple.
     • Notation: defined by the developer.
     • Ambiguity resolution: defined by the developer.
     • Time representation (quantitative – qualitative): both (see the sketch below).
     • Application type: the domain is defined using ontologies.
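One way to read the "both" in the time-representation principle: each recognized meaning carries a quantitative interval, while qualitative relations (before, overlaps, ...) are derived from those intervals. The Python sketch below is a minimal illustration; the class, its fields, and the Allen-style `before` helper are hypothetical names, not the paper's actual notation.

    from dataclasses import dataclass

    @dataclass
    class ModalityEvent:
        """A meaning extracted from one modality, stamped with an interval."""
        modality: str   # e.g. "speech" or "movement"
        concept: str    # domain concept from the ontology, e.g. "SearchBook"
        start: float    # quantitative time: interval start (seconds)
        end: float      # quantitative time: interval end (seconds)

        def before(self, other: "ModalityEvent") -> bool:
            """Qualitative time: Allen-style 'before' relation."""
            return self.end < other.start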
  5. Process
     • Recognition: identification of patterns in input signals.
     • Segmentation: delimitation of the identified areas.
     • Meaning extraction: deeper analysis to identify meanings and correlations between segments according to specific domains.
     • Annotation: formal description of segments through domain concepts.
  6. Process
     • The flow is fixed, but it can start at any point as long as the sequence is respected.
     • The framework is not tied to any particular method; the method is "plugged" in (sketched below).
     • The focus is on a good level of analysis, not on real-time processing.
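A minimal sketch of this life cycle as a pluggable pipeline, using hypothetical names throughout (the actual framework is built on the OpenInterface platform shown on the following slides): each stage delegates to an interchangeable method, and processing may enter at any stage as long as the fixed order is respected.

    from typing import Any, Callable, Dict

    Stage = Callable[[Any], Any]
    PIPELINE_ORDER = ["recognition", "segmentation", "meaning_extraction", "annotation"]

    def run_pipeline(stages: Dict[str, Stage], data: Any,
                     start_at: str = "recognition") -> Any:
        """Run the fixed flow, optionally entering at a later stage
        (e.g. when a device already delivers segmented input)."""
        started = False
        for name in PIPELINE_ORDER:
            started = started or name == start_at
            if started:
                data = stages[name](data)
        return data

    # Plugging in trivial stand-in methods for one modality:
    stages = {
        "recognition":        lambda signal: ["pattern"],                 # pattern extraction
        "segmentation":       lambda patterns: [("seg", p) for p in patterns],
        "meaning_extraction": lambda segs: [{"concept": c} for _, c in segs],
        "annotation":         lambda meanings: meanings,                  # attach ontology concepts
    }
    print(run_pipeline(stages, data=None))   # -> [{'concept': 'pattern'}]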
  7.–9. OpenInterface (figure-only slides).
  10. Fusion Mechanism
     • Define a process for each modality and run these processes in parallel.
     • Data from each stage are buffered and processed together for the purpose of fusion.
     • Agent-oriented: the problem is solved in a distributed fashion (see the sketch after slide 12).
  11.–12. Fusion Mechanism (figure-only slides).
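One way to picture the buffering and agent-oriented mechanism of slide 10, as a sketch under assumed names rather than the OpenInterface implementation: each modality pipeline fills its own buffer, and a fusion step pairs whatever events from different modalities fall within a shared time window.

    import queue

    # One buffer per modality; each modality's pipeline (running in
    # parallel, e.g. in its own thread or agent) pushes annotated events here.
    buffers = {"speech": queue.Queue(), "movement": queue.Queue()}

    def drain(buf: "queue.Queue") -> list:
        """Collect everything currently buffered for one modality."""
        events = []
        while not buf.empty():
            events.append(buf.get())
        return events

    def fuse_once(window: float = 2.0) -> list:
        """Pair speech and movement events whose time stamps fall
        within `window` seconds of each other."""
        speech, movement = drain(buffers["speech"]), drain(buffers["movement"])
        return [(s, m) for s in speech for m in movement
                if abs(s["t"] - m["t"]) <= window]

    # Events arriving from the two parallel pipelines:
    buffers["speech"].put({"t": 10.2, "concept": "SearchBook"})
    buffers["movement"].put({"t": 11.0, "concept": "MoveToShelves"})
    print(fuse_once())   # -> one fused (speech, movement) pair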
  13. Fusion Mechanism – OpenInterface: OI Modeling Tool.
  14. Fusion Mechanism – Instance.
  15. Scenario: a spoken utterance ("Maybe I can find a book about it in the library") combined with an observed movement (Ronald is moving towards the bookshelves).
  16. Results
     • Managed spatial relationships based on the fixed objects in the room.
     • Performed semantic fusion of events that do not coincide in time (illustrated below).
     • Achieved good results in speaker identification, with synchronization between image and speech identification.
     • Created an open framework to manage fusion between two (in our case) or more modalities (in future work).
     • Designed the system so that each component can run on a separate machine, thanks to a distribution mechanism interchanging data over a TCP/IP network.
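The fusion of events that do not coincide in time can be illustrated with the slide 15 scenario. This is a hypothetical sketch: the gap threshold, field names, and concept labels are invented for illustration. The utterance and the movement occupy disjoint intervals, yet a qualitative "before, within a tolerated gap" relation still lets them be fused into one inferred intention.

    def fusable_in_sequence(e1: dict, e2: dict, max_gap: float = 5.0) -> bool:
        """Fuse two events that do NOT overlap in time, provided one
        qualitatively precedes the other within a tolerated gap."""
        first, second = sorted((e1, e2), key=lambda e: e["end"])
        gap = second["start"] - first["end"]
        return 0 <= gap <= max_gap            # 'before', but close enough

    speech   = {"start": 3.0, "end": 4.5, "concept": "SearchBookInLibrary"}
    movement = {"start": 6.0, "end": 8.0, "concept": "MoveTowardsShelves"}

    if fusable_in_sequence(speech, movement):
        # complementary meanings -> one inferred user intention
        print("intention:", speech["concept"], "+", movement["concept"])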
  17. Next Steps
     • Implement the segmentation and annotation of 3D content.
     • Migrate the framework to a real-time implementation.
     • Evaluate other methods under the rules of the framework.
     • Continuously extend the framework to support other fusion concepts and implementation methods.
  18. Thank you for your attention!
