Multimodal Man-Machine Interaction


Seminar Presentation on Human Computer Interaction (HCI)/ Multimodal Man-Machine Interaction

Published in: Technology

  1. 1. Multimodal Man-Machine Interaction. By: Rajesh P. Barnwal, School of Information Technology, Indian Institute of Technology, Kharagpur. (Figure labels: Sociology, Language, Design, Engineering, Ethnography, Psychology, Human Factors, Computer Science)
  2. 2. Outline <ul><li>Introduction </li></ul><ul><li>What is Man-Machine Interaction </li></ul><ul><li>Interaction Modalities </li></ul><ul><li>Unimodal vs Multimodal HCI </li></ul><ul><li>Limitations of Unimodal HCI </li></ul><ul><li>Multimodal Interaction </li></ul><ul><li>Various HCI Modalities </li></ul><ul><li>Architecture of Multimodal HCI </li></ul><ul><li>Issues and Challenges </li></ul><ul><li>Application Areas </li></ul><ul><li>Case Studies </li></ul><ul><li>Future Scope </li></ul><ul><li>Conclusion </li></ul><ul><li>References </li></ul>
  3. 3. Introduction <ul><li>Humans interact with </li></ul><ul><ul><li>Humans (Human-Human Interaction) </li></ul></ul><ul><ul><li>Machines (Man-Machine Interaction) </li></ul></ul><ul><li>Human-Human Interaction (natural by default) </li></ul><ul><li>Human-Machine Interaction (better if natural) </li></ul>
  4. 4.
  5. 5. What is Man-Machine Interaction? <ul><li>A process of information transfer from </li></ul><ul><ul><li>User to Machine </li></ul></ul><ul><ul><li>Machine to User </li></ul></ul><ul><li>Man-Machine Interaction is also referred to as </li></ul><ul><ul><li>Human-Computer Interaction (HCI) </li></ul></ul><ul><ul><li>Computer-Human Interaction (CHI) </li></ul></ul><ul><li>As per ACM SIGCHI: </li></ul><ul><ul><li>HCI is “a discipline concerned with the design, evaluation and implementation of interactive computing systems for human use and with the study of major phenomena surrounding them” </li></ul></ul>
  6. 6. Human Interaction Modalities <ul><li>Human interaction with the outside world uses </li></ul><ul><ul><li>Sensory organs as input: Sight, Touch, Hearing, Taste, Smell </li></ul></ul><ul><ul><li>Effectors for output: Limbs, Eyes, Finger, Head, Vocal </li></ul></ul>Picture Courtesy: Google
  7. 7. Computer Interaction Modalities <ul><li>A computer interacts with the outside world using </li></ul><ul><ul><li>Input media </li></ul></ul><ul><ul><li>Output media </li></ul></ul>Picture Courtesy: Google
  8. 8. HCI System Architecture <ul><li>The architecture of any HCI system is characterized by: </li></ul><ul><ul><li>The number of inputs and outputs in the system </li></ul></ul><ul><ul><li>The diversity of inputs and outputs in terms of modality </li></ul></ul><ul><ul><li>How these diverse inputs and outputs work together for interaction </li></ul></ul><ul><li>Based on the configuration and design of the interface, HCI systems can be divided into: </li></ul><ul><ul><li>Unimodal HCI Systems </li></ul></ul><ul><ul><li>Multimodal HCI Systems </li></ul></ul>
  9. 9. Unimodal vs Multimodal <ul><li>Unimodal HCI System </li></ul><ul><ul><li>A system based on a single channel of input </li></ul></ul><ul><ul><li>Restricted to only one human-computer interaction modality </li></ul></ul><ul><ul><li>Examples are text-based user interfaces, graphical user interfaces, pointer-based interfaces, touch-based interfaces, etc. </li></ul></ul><ul><li>Multimodal HCI System </li></ul><ul><ul><li>A system that combines multiple interaction modalities through the simultaneous use of different channels </li></ul></ul><ul><ul><li>Motivated by the natural way humans interact </li></ul></ul>
  10. 10. Limitations of Unimodal Interaction <ul><li>Not a natural way of human interaction </li></ul><ul><li>Usually designed for the ‘average’ user </li></ul><ul><li>Fails to cater to the needs of diverse categories of people </li></ul><ul><li>Difficult to use for disabled, illiterate and untrained people </li></ul><ul><li>Cannot provide a universal interface </li></ul><ul><li>More error-prone </li></ul>
  11. 11. Multimodal Interaction <ul><li>Addresses the limitations of unimodal interaction </li></ul><ul><li>Based on two views: </li></ul><ul><ul><li>Human-centered: multiple and simultaneous use of human input/output channels for perception and control. </li></ul></ul><ul><ul><li>System-centered: multiple input/output modalities for better accuracy, naturalness, redundancy and efficiency. </li></ul></ul>
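The system-centered view (several modalities for better accuracy and redundancy) can be illustrated with a small late-fusion sketch. This is only a minimal illustration under stated assumptions, not a method taken from the slides: the function name `fuse_hypotheses`, the weighted-sum rule, and the confidence values are all hypothetical.

```python
from collections import defaultdict


def fuse_hypotheses(per_modality, weights=None):
    """Combine per-modality confidence scores with a weighted sum (late fusion).

    per_modality: dict mapping modality name -> {command: confidence}.
    weights: optional dict mapping modality name -> weight (default 1.0).
    Returns (best_command, fused_scores).
    """
    weights = weights or {}
    fused = defaultdict(float)
    for modality, scores in per_modality.items():
        w = weights.get(modality, 1.0)
        for command, confidence in scores.items():
            fused[command] += w * confidence
    best = max(fused, key=fused.get)
    return best, dict(fused)


# Hypothetical recognizer outputs: speech alone slightly favors "close",
# but the gesture channel clearly indicates "open".
observations = {
    "speech": {"open": 0.48, "close": 0.52},
    "gesture": {"open": 0.70, "close": 0.10},
}
best_command, fused_scores = fuse_hypotheses(observations)
print(best_command)
```

With these made-up numbers, the speech channel on its own would pick "close", while the fused result is "open": the second modality supplies the redundancy that resolves the ambiguity.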
  12. 12. Human-Computer Interaction Modalities <ul><li>Sensor-based </li></ul><ul><ul><li>Mouse, Keyboard, Joystick </li></ul></ul><ul><ul><li>Pen-based sensors </li></ul></ul><ul><ul><li>Motion tracking sensors </li></ul></ul><ul><ul><li>Haptic/Touch Sensors </li></ul></ul><ul><ul><li>Pressure Sensors </li></ul></ul><ul><ul><li>Smell/Taste Sensors </li></ul></ul><ul><li>Visual-based </li></ul><ul><ul><li>Facial Expression Analysis </li></ul></ul><ul><ul><li>Body Movement Tracking (Large-scale) </li></ul></ul><ul><ul><li>Gesture Recognition </li></ul></ul><ul><ul><li>Gaze Detection (Eye Movement Tracking) </li></ul></ul>
  13. 13. Human-Computer Interaction Modalities <ul><li>Audio-based </li></ul><ul><ul><li>Speech Recognition </li></ul></ul><ul><ul><li>Speaker Recognition </li></ul></ul><ul><ul><li>Auditory Emotion Analysis </li></ul></ul><ul><ul><li>Human Noise/Sign Detection (Gasp, Sigh, Laugh, Cry, etc.) </li></ul></ul>
  14. 14. Architecture of Multimodal User Interface <ul><li>Inputs </li></ul><ul><ul><li>text, speech, vision, motor, … </li></ul></ul><ul><li>Media Analysis </li></ul><ul><ul><li>language, gesture, gaze, … </li></ul></ul><ul><li>Outputs </li></ul><ul><ul><li>graphics, animation, speech, sound, … </li></ul></ul><ul><li>Media Design </li></ul><ul><ul><li>language, gesture, … </li></ul></ul><ul><li>Interaction Management </li></ul><ul><ul><li>media fusion, discourse management, plan recognition and generation, user modeling, presentation design </li></ul></ul>(Picture from: Maybury and Wahlster 1998)
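The three processing blocks named in this architecture (Media Analysis, Interaction Management, Media Design) can be read as stages of a pipeline between inputs and outputs. The sketch below is one possible toy reading, assuming that ordering; every class and function name in it is hypothetical, and the "fusion" is reduced to concatenating tokens.

```python
from dataclasses import dataclass


@dataclass
class InputEvent:
    modality: str  # e.g. "speech", "text", "vision"
    payload: str   # already-recognized content for that modality


@dataclass
class UserModel:
    prefers_speech_output: bool = False


def analyze_media(events):
    """Media analysis: turn each event into (modality, token list)."""
    return [(e.modality, e.payload.lower().split()) for e in events]


def manage_interaction(analyzed, user):
    """Interaction management: fuse tokens and choose an output channel."""
    tokens = [t for _, toks in analyzed for t in toks]
    channel = "speech" if user.prefers_speech_output else "graphics"
    return {"intent": " ".join(tokens), "channel": channel}


def design_media(plan):
    """Media design: render the plan for the chosen output channel."""
    if plan["channel"] == "speech":
        return f"[spoken] {plan['intent']}"
    return f"[on screen] {plan['intent']}"


def run_pipeline(events, user):
    return design_media(manage_interaction(analyze_media(events), user))


events = [InputEvent("speech", "Show map"), InputEvent("vision", "of Kharagpur")]
result = run_pipeline(events, UserModel(prefers_speech_output=True))
print(result)
```

Here the user model only selects the output channel; a real system would also use it during fusion and presentation design, which this sketch leaves out.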
  15. 15. Need for Multimodal HCI System <ul><li>To enhance error avoidance and ease of error resolution. </li></ul><ul><li>To accommodate a wider range of users, tasks, and environmental situations. </li></ul><ul><li>To cater to the needs of individuals with differences, such as permanent or temporary handicaps. </li></ul><ul><li>To prevent overuse of any individual mode during extended computer usage. </li></ul><ul><li>To permit the flexible and improved use of input modes, including alternation and integrated use. </li></ul>
  16. 16. Issues and Challenges <ul><li>Perfection of technology </li></ul><ul><li>Lack of a universal model for interface design </li></ul><ul><li>Simultaneous tracking of modes </li></ul><ul><li>Unambiguous interpretation </li></ul><ul><li>Multimodal information fusion </li></ul><ul><li>Realizing a Natural User Interface </li></ul><ul><li>Cost of hardware </li></ul>
  17. 17. Application Areas <ul><li>Computing devices and applications for physically handicapped people </li></ul><ul><li>Universal interface design for: </li></ul><ul><ul><li>The elderly </li></ul></ul><ul><ul><li>Children </li></ul></ul><ul><ul><li>Novices </li></ul></ul><ul><li>Robotic Interaction </li></ul><ul><li>Gaming Industry </li></ul><ul><li>Medical Industry </li></ul><ul><li>Smart Surveillance </li></ul>
  18. 18. Bharati – A Multimodal Web Interface <ul><li>An IIT-Kharagpur Initiative </li></ul><ul><li>The objective of the project </li></ul><ul><ul><li>An internet user interface for people who are both language-illiterate and computer-illiterate, based on text, speech and icons </li></ul></ul>
  19. 19. Bharati – Chitra <ul><li>Iconic module for people unable to read or write in their mother tongue </li></ul>
  20. 20. Bharati – Dhwani <ul><li>Speech-based module for those who can speak but cannot read or write in their mother tongue </li></ul>
  21. 21. Bharati – Akshar <ul><li>Text-based module for users unable to use English </li></ul>
  22. 22. Multimodal Framework of Bharati <ul><li>Components: Text Analyzer, Speech Recognizer, Visual Language Manager, Speech to Text Converter, Hindi/Bengali to English Language Translator, Keywords Extractor, Information Visualization, Text to Speech Converter, English to Hindi/Bengali Language Translator, Content Receiver, Search Engine, Internet </li></ul><ul><li>Data flow labels: Strings of text, Query Strings, HTML page, Text, Speech, WIMP </li></ul><ul><li>WIMP: Window, Icon, Menu, Pointing Device </li></ul>(Picture adapted from NID (IITKgp) Website)
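One plausible reading of the text path in this framework is: local-language text is translated to English, keywords are extracted, and a query string is sent to the search engine. The sketch below illustrates only that reading. It is not Bharati's implementation: the three-word lexicon, the stop-word list, the `+` separator and all function names are hypothetical stand-ins for the translator, keyword extractor and query builder.

```python
# Toy lexicon standing in for the Hindi-to-English translator (assumption).
HINDI_TO_ENGLISH = {"aaj": "today", "ka": "of", "mausam": "weather"}

# Minimal stop-word list standing in for the keyword extractor (assumption).
STOP_WORDS = {"of", "the", "a"}


def translate_to_english(text):
    """Word-by-word lookup; unknown words pass through unchanged."""
    return " ".join(HINDI_TO_ENGLISH.get(w, w) for w in text.lower().split())


def extract_keywords(english_text):
    """Drop stop words and keep the remaining words in order."""
    return [w for w in english_text.split() if w not in STOP_WORDS]


def build_query(keywords):
    """Join keywords into a query string for a search engine."""
    return "+".join(keywords)


def handle_text_input(user_text):
    """Text path: translate, extract keywords, build the query string."""
    english = translate_to_english(user_text)
    return build_query(extract_keywords(english))


query = handle_text_input("aaj ka mausam")
print(query)
```

The return path in the diagram (content receiver, English to Hindi/Bengali translation, text-to-speech) would mirror these steps in the opposite direction and is omitted here.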
  23. 23. ITR – A Multimodal Lab Project <ul><li>An Initiative of Beckman Institute, University of Illinois </li></ul>Multimodal Interaction Scenario (Adapted from ITR Website)
  24. 24. ITR – Multimodal Framework (Picture adapted from ITR Website)
  25. 25. ITR Project – Demo Video (Video from ITR Website)
  26. 26. Future of Human-Computer Interaction <ul><li>Future HCI will focus more on realizing the Natural User Interface </li></ul><ul><li>Interaction can use a combination of </li></ul><ul><ul><li>Gesture, Speech, Facial Expression, Vision, Gaze, Touch, Brain waves </li></ul></ul><ul><li>As predicted in Microsoft’s HCI Vision 2020 report, in the future </li></ul><ul><ul><li>Physical objects such as walls, floors and furniture will be able to interact with human beings in a distributed and very natural way </li></ul></ul><ul><li>According to an article published on 3rd September 2010 in the online magazine “Network World” </li></ul><ul><ul><li>Future computers may be operated using brainwaves in combination with other modalities such as gesture, and may be accessible anywhere, even just through movements of body parts in the air (Pranav Mistry, MIT Media Lab). </li></ul></ul>
  27. 27. Conclusion <ul><li>Multimodal HCI (MHCI) provides many advantages over unimodal interfaces </li></ul><ul><li>MHCI is capable of providing a natural user interface to human beings </li></ul><ul><li>MHCI can pave the way for universal design across diverse applications and people </li></ul><ul><li>MHCI has great potential in terms of application areas and thus needs extensive interdisciplinary research to address its issues and challenges </li></ul>
  28. 28. References <ul><li>ACM SIGCHI Curricula for Human-Computer Interaction. [Online accessed on 28th September 2010]. </li></ul><ul><li>Alan Dix, Janet Finlay, Gregory D. Abowd, Russell Beale (2004). Human Computer Interaction. Pearson Education (Singapore) Pte. Ltd., ISBN 81-297-0409-9. </li></ul><ul><li>Being Human: Human Computer Interaction in the Year 2020. en-us/um/cambridge/projects/hci2020/downloads/Being Human_A3.pdf [Online accessed on 22nd September 2010]. </li></ul><ul><li>Bharati: Internet for all. [Online accessed on 26th September 2010]. </li></ul><ul><li>Fakhreddine Karray, Milad Alemzadeh, Jamil Abou Saleh and Mo Nours Arab. Human-Computer Interaction: Overview on State of the Art, International Journal on Smart Sensing and Intelligent Systems, Vol. 1, No. 1, March 2008, pp. 137-159. </li></ul><ul><li>Human–computer interaction - Wikipedia, the free encyclopedia. Human–computer_interaction [Online accessed on 1st September 2010]. </li></ul>
  29. 29. References <ul><li>Jiagen Jin, Wenfeng Li. A Survey of The Information Fusion in MMHCI, 2010 International Conference on Machine Vision and Human-machine Interface, April 24-25, 2010, pp. 509-513. </li></ul><ul><li>Jon Brodkin, The future of human-computer interaction. Article appeared in Network World Online Magazine on September 03, 2010, /2010/090210-human-computer-interaction.html [Online accessed on 29th September 2010]. </li></ul><ul><li>Mark T. Maybury and Wolfgang Wahlster (Eds.), Readings in Intelligent User Interfaces. Morgan Kaufmann Publishers, 1998 [Ref. in Multimodal Human-Computer Interaction: a constructive and empirical study. Academic Dissertation by Roope Raisamo, University of Tampere, 1999]. </li></ul><ul><li>Multimodal Human Computer Interaction: Toward a Proactive Computer. [Online accessed on 22nd September 2010]. </li></ul><ul><li>S.K. Card, T.P. Moran and A. Newell (1983). The Psychology of Human-Computer Interaction. Lawrence Erlbaum Associates, ISBN 0-89859-859-1. </li></ul>
  30. 30. Thank You! Questions, if any?
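  32. 32. Typical Information Flow in a Basic Multimodal Interaction <ul><li>Input devices: Camera, Glove, Laser, Touch, Microphone </li></ul><ul><li>Recognition and understanding: Gesture Recognition, Gesture Understanding, Speech Recognition, Natural Language Processing </li></ul><ul><li>Data exchanged: Feature/Frame Structures </li></ul><ul><li>Integration and dialogue: Context Management, Multimodal Integration, Dialogue Manager </li></ul><ul><li>Application side: Application Invocation and Coordination, Response Planning, App1, App2, App3 </li></ul><ul><li>Output: Graphics, VR, TTS </li></ul>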
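The Multimodal Integration step in this flow merges feature/frame structures coming from the speech and gesture branches. The sketch below shows one simple way this could be done, assuming a time-window rule: a deictic slot in the speech frame (for example the "that" in "delete that") is filled by the temporally closest pointing gesture. The `Frame` class, the `"<deictic>"` placeholder and the one-second window are hypothetical choices for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    modality: str
    start: float    # seconds
    end: float      # seconds
    features: dict


def integrate(speech, gestures, max_gap=1.0):
    """Fill a deictic slot in the speech frame from a nearby gesture frame.

    A gesture is a candidate if it overlaps the speech interval extended
    by max_gap seconds on both sides; the one starting closest to the
    start of the utterance is used.
    """
    merged = dict(speech.features)
    if merged.get("object") == "<deictic>":
        candidates = [
            g for g in gestures
            if g.start <= speech.end + max_gap and g.end >= speech.start - max_gap
        ]
        if candidates:
            nearest = min(candidates, key=lambda g: abs(g.start - speech.start))
            merged["object"] = nearest.features["target"]
    return merged


# Hypothetical frames: the user says "delete that" while pointing at file_3;
# a later, unrelated gesture at 5 s falls outside the window.
speech_frame = Frame("speech", 0.0, 1.2, {"action": "delete", "object": "<deictic>"})
gesture_frames = [
    Frame("gesture", 0.4, 0.9, {"target": "file_3"}),
    Frame("gesture", 5.0, 5.5, {"target": "file_7"}),
]
command = integrate(speech_frame, gesture_frames)
print(command)
```

If no gesture falls inside the window, the slot stays unresolved, which a dialogue manager could handle by asking the user for clarification.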
  33. 33. Human Computer Interaction (Classical Model) (Picture adapted from ACM SIGCHI)
  34. 34. Why Man-Machine Interaction? <ul><li>Fast-paced, information-centered life </li></ul><ul><li>Increasing tendency toward a self-service lifestyle </li></ul><ul><li>To get work done with the help of machines </li></ul><ul><li>To make life easier and better with the help of machines </li></ul><ul><li>Requires interaction that is as natural as possible </li></ul>