Gestures and Lip Shape Integration for Cued Speech Recognition

  1. Gestures and Lip Shape Integration for Cued Speech Recognition. Seminar by: Mohammed Musfir, ECE-B, 08104131. Seminar Coordinator: Mr. Rino P. C., Assistant Professor, ECE. Seminar Guide: Mr. Edet Bijoy K., Assistant Professor, ECE.
  2. 02/12/2011
  5. Overview of Presentation: Objective; Introduction; ASR Techniques; Lip Reading – AVSR; Cued Speech; Integrated Recognition; Conclusion
  6. Objective: Developments in ASR techniques; AVSR accessibility solution; Lip detection; Cued Speech detection; Integration of both
  7. INTRODUCTION
  8. Briefing ASR: First successful system in 1970; Consists of two systems, ASR (transcribe) and SU (understand the transcription); Knowledge intensive
  9. ASR TECHNIQUES
  10. ASR Industry: Industry pioneers – Nuance, NTT Labs, AT&T Labs; MIT and GPL – VoxForge, Gvoice; Desktop dictation – 1990; Types of ASR: DVI (word or phrase spotting) and LVCSR (several thousand words)
  11. Techniques: Sequence of sounds; ASR involves acquisition (recording), feature extraction (spectral analysis), and pattern matching and decoding
  12. Techniques
  13. Approaches: Template based; Knowledge based; Statistical; Learning based; Artificial intelligence
  14. LIP READING
  15. Lip Reading – AVSR: Front end; Lips detection
  16. Localisation and Tracking: ROI determination – Sobel edge filtering; Tracking – Kalman filter; Feature coefficients – principal component analysis (PCA); Audio feature – MFCC
  17. CUED SPEECH
  18. Overview of Cued Speech
  19. INTEGRATION
  20. Steps: Lip feature extraction; Audio synchronization with the image; Multistream HMM fusion – state-synchronous decision; Automatic image processing to record the cues; Lip width, aperture, area, upper pinch and lower pinch; Modeling – 8 lip parameters and 10 hand parameters
  21. Fusion: Feature fusion – concatenation: $O_t^{LH} = [O_t^{(L)\top}, O_t^{(H)\top}]^\top \in \mathbb{R}^D$, where $O_t^{LH}$ is the lip-hand feature vector, $O_t^{(L)}$ the lip shape feature vector, $O_t^{(H)}$ the hand feature vector, and $D$ the dimensionality
  22. Conclusion: Cued Speech recognition – 80% accuracy; Outperforms ASR in a normal environment; Visual mode – education of the hearing impaired; Phoneme recognition successful; A potential product beyond Siri
  24. THANK YOU
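
Slide 16 names Sobel edge filtering as the way the lip region of interest is determined. The following is a minimal sketch of that single step, assuming a grayscale frame stored as a list of rows. The function name `sobel_magnitude` and the toy frame are illustrative and do not come from the slides.

```python
import math

# Sobel kernels for horizontal (GX) and vertical (GY) intensity changes.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the Sobel gradient magnitude of a 2-D grayscale image.

    `image` is a list of equal-length rows of numbers. Border pixels are
    left at 0.0 because the 3x3 kernel does not fit there.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0.0
            for i in range(3):
                for j in range(3):
                    pixel = image[r + i - 1][c + j - 1]
                    gx += GX[i][j] * pixel
                    gy += GY[i][j] * pixel
            out[r][c] = math.sqrt(gx * gx + gy * gy)
    return out


# Toy 5x5 frame (assumption): dark left part, bright right part.
frame = [[0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_magnitude(frame)
```

On this frame the response is largest in the columns where dark meets bright and zero inside the uniform bright area. Thresholding such a map is one plausible way to outline a lip region, though the slides do not say how the ROI is derived from the edges.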

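Slide 20 lists multistream HMM fusion with a state-synchronous decision. In the usual formulation of that idea, each state's emission log-likelihood is a weighted sum of the per-stream log-likelihoods, with weights that sum to 1. The sketch below assumes that formulation. The function name, the weight value and the scores are hypothetical, since the slides give none of them.

```python
def fuse_stream_loglik(lip_loglik, hand_loglik, lip_weight=0.5):
    """Combine per-state log-likelihoods of the lip and hand streams.

    Each argument holds one log-likelihood per HMM state for the same
    frame. The hand weight is 1 - lip_weight, so the weights sum to 1.
    """
    if not 0.0 <= lip_weight <= 1.0:
        raise ValueError("lip_weight must lie in [0, 1]")
    if len(lip_loglik) != len(hand_loglik):
        raise ValueError("both streams must score the same states")
    hand_weight = 1.0 - lip_weight
    return [lip_weight * lip + hand_weight * hand
            for lip, hand in zip(lip_loglik, hand_loglik)]


# Hypothetical scores for three states at a single frame.
lip_scores = [-2.0, -1.0, -4.0]
hand_scores = [-3.0, -5.0, -1.0]
fused_scores = fuse_stream_loglik(lip_scores, hand_scores, lip_weight=0.5)
```

Setting `lip_weight` to 1.0 reduces the model to lip-only scoring, which makes the weight a convenient knob for comparing the two streams.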
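
Slide 21 describes feature fusion as the concatenation of the lip shape and hand feature vectors, and slide 20 mentions 8 lip parameters and 10 hand parameters. A minimal sketch of the concatenation for one frame follows. The numeric values are placeholders, and treating the fused dimensionality as 8 + 10 = 18 is an assumption drawn from those counts, not a figure stated on the slides.

```python
def fuse_features(lip_features, hand_features):
    """Concatenate one frame's lip and hand feature vectors."""
    return list(lip_features) + list(hand_features)


# Placeholder values; only the sizes (8 lip, 10 hand) come from the slides.
lip = [0.1 * k for k in range(8)]
hand = [1.0 + 0.1 * k for k in range(10)]
fused_vector = fuse_features(lip, hand)
```

Unlike the multistream approach, which weights the streams separately, this feeds a single joint vector to one model.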