Visual-speech to text conversion applicable to telephone communication for deaf individuals
30TH APRIL 2013
INTRODUCTION 
 In lip-reading, speech is understood by interpreting movements of the lips, face and tongue.
 The correspondence between visible articulation and phonemes is not one-to-one.
 It is therefore impossible to distinguish all phonemes using visual information alone.
 The Cued Speech system
 Developed by Cornett
 Contains two components: the hand shape and the hand position relative to the face.
 Hand shapes code consonant phonemes.
 Hand positions code vowel phonemes.
 Improves speech perception to a large extent
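The way a hand cue resolves consonants that look alike on the lips can be sketched as a set intersection. This is a toy illustration only: the lip classes, the hand shape numbers and the phoneme groups below are hypothetical and are not the actual Cued Speech chart.

```python
# Toy illustration only: these groups are NOT the real Cued Speech chart.
# Consonants that look alike on the lips share a lip class.
LIP_CLASS = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
}

# Hypothetical hand shapes; each one codes consonants that differ on the lips.
HAND_SHAPE = {
    1: {"p", "v"},
    2: {"b", "f"},
    3: {"m"},
}


def candidates_from_lips(lip_class):
    """All consonants compatible with the observed lip class."""
    return {c for c, cls in LIP_CLASS.items() if cls == lip_class}


def decode_consonant(lip_class, hand_shape):
    """Intersect the lip candidates with the consonants coded by the hand shape."""
    return candidates_from_lips(lip_class) & HAND_SHAPE[hand_shape]


lips_only = candidates_from_lips("bilabial")   # three candidates remain
with_hand = decode_consonant("bilabial", 2)    # the hand shape leaves one
```

With lips alone the three bilabial consonants stay ambiguous; adding the hand shape narrows the set to a single consonant in this toy table.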
the Cued Speech system
AIM OF NEW SYSTEM 
 To investigate the design of a system able to automatically recognize Cued Speech and convert it to text.
 Makes it possible for deaf or speech-impaired individuals to communicate with each other and also with normal-hearing persons.
 Communication relies on gestures captured by devices equipped with a camera.
METHODS 
 Corpus, feature extraction, and statistical modeling
 The speakers’ lips were painted blue, and color marks were placed on the speakers’ fingers.
 The data were derived from a video recording of the cuers pronouncing and coding in Cued Speech.
 Landmarks with different colors were placed on the fingers.
 These markings allow a faster and more accurate image processing stage.
 The audio part of the video recording was synchronized with the image.
 An automatic image processing method was applied to the video to extract:
 lip width (A),
 lip aperture (B),
 lip area (S),
 pinching of the upper lip (Bsup),
 pinching of the lower lip (Binf).
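As a rough sketch of how the first three lip parameters could be computed, assume the image processing stage already yields an ordered list of lip contour points. The function name `lip_parameters` and the contour input are assumptions made for illustration; the slides do not describe the actual extraction method, and the pinching parameters (Bsup, Binf) are left out because their definition is not given.

```python
def lip_parameters(contour):
    """Return lip width A, aperture B and area S from lip contour points.

    contour: list of (x, y) points ordered around the contour (assumed input).
    """
    xs = [x for x, _ in contour]
    ys = [y for _, y in contour]
    width = max(xs) - min(xs)      # A: horizontal extent of the contour
    aperture = max(ys) - min(ys)   # B: vertical extent of the contour

    # S: polygon area by the shoelace formula.
    twice_area = 0.0
    n = len(contour)
    for i in range(n):
        x1, y1 = contour[i]
        x2, y2 = contour[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1

    return {"A": width, "B": aperture, "S": abs(twice_area) / 2.0}


# A diamond-shaped toy contour: 4 units wide and 2 units high.
params = lip_parameters([(0, 0), (2, 1), (4, 0), (2, -1)])
```

For the toy diamond the function gives a width of 4, an aperture of 2 and an area of 4.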
 Concatenative feature fusion
 Tracks and extracts the xy coordinates of the finger landmarks in each time frame,
 uses those values as features in the HMM modeling.
 Uses the concatenation of the synchronous lip shape and hand features as the joint feature vector.
Terms of the joint feature vector expression:
 joint lip-hand feature vector,
 lip shape feature vector,
 hand feature vector,
 dimensionality of the joint feature vector.
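One way to write the concatenation with these four terms is shown below. The symbols are assumed notation, not taken from the slides: $O_t^{LH}$ for the joint lip-hand vector at time frame $t$, $O_t^{(L)}$ and $O_t^{(H)}$ for the lip shape and hand vectors, $D$ for the joint dimensionality, and $D_L$, $D_H$ as hypothetical names for the dimensionalities of the two parts.

```latex
O_t^{LH} = \left[ {O_t^{(L)}}^{\top},\; {O_t^{(H)}}^{\top} \right]^{\top} \in \mathbb{R}^{D},
\qquad D = D_L + D_H
```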
 Parameters used for lip 
shape modeling.
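Concatenative feature fusion as described above can be sketched in a few lines. The function names and the sample values are hypothetical; the only thing taken from the slides is that synchronous lip and hand features are joined into one vector per time frame.

```python
def fuse_frame(lip_features, hand_features):
    """Concatenate synchronous lip and hand features into one joint vector."""
    return list(lip_features) + list(hand_features)


def fuse_sequence(lip_frames, hand_frames):
    """Fuse two frame-synchronous sequences, one joint vector per time frame."""
    if len(lip_frames) != len(hand_frames):
        raise ValueError("lip and hand streams must have the same number of frames")
    return [fuse_frame(lip, hand) for lip, hand in zip(lip_frames, hand_frames)]


# Hypothetical values: 5 lip parameters (A, B, S, Bsup, Binf) and the
# xy coordinates of two hand landmarks for each of two time frames.
lip_frames = [[4.0, 2.0, 4.0, 0.3, 0.2], [4.2, 1.5, 3.6, 0.3, 0.1]]
hand_frames = [[10.0, 20.0, 12.0, 25.0], [11.0, 21.0, 13.0, 26.0]]

joint = fuse_sequence(lip_frames, hand_frames)
D = len(joint[0])  # dimensionality of the joint vector: 5 + 4 = 9
```

The joint vectors would then serve as the observation sequence for HMM training and decoding. The frame-count check reflects the requirement that the two streams be synchronous.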
RESULTS 
 Isolated word recognition 
1. Recognition in normal-hearing subject
2. Recognition in deaf subject
3. Multi-speaker isolated word recognition: 
 Investigates whether it is possible to train speaker-independent HMMs for Cued Speech recognition.
 The training data consisted of 750 words from the normal-hearing subject and 750 words from the deaf subject.
 For testing, 700 words from the normal-hearing subject and 700 words from the deaf subject were used.
 Each state was modeled with a mixture of 4 Gaussian distributions.
 For lip shape and hand shape integration, concatenative feature fusion was used.
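The emission likelihood of one HMM state modeled by a 4-component Gaussian mixture can be sketched as follows. Diagonal covariances are an assumption (the slides do not state the covariance type), and all parameter values are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    total = 0.0
    for xi, mi, vi in zip(x, mean, var):
        total += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return total


def log_gmm(x, weights, means, variances):
    """Log emission likelihood of x under a Gaussian mixture (log-sum-exp)."""
    logs = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(logs)
    return peak + math.log(sum(math.exp(v - peak) for v in logs))


# Hypothetical parameters for one state over a 2-dimensional feature vector.
weights = [0.25, 0.25, 0.25, 0.25]
means = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0], [0.0, 2.0]]
variances = [[1.0, 1.0]] * 4

frame_loglik = log_gmm([0.5, 0.5], weights, means, variances)
```

Working in the log domain with the log-sum-exp trick avoids numerical underflow when per-frame likelihoods are accumulated over a whole utterance.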
4. Continuous phoneme recognition 
 Phoneme correct for continuous phoneme recognition in the case of a normal-hearing subject.
Phoneme correct for continuous phoneme recognition in the case of a deaf subject.
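The slides do not define "phoneme correct" or "accuracy". A minimal sketch, assuming the definitions commonly used in HMM toolkits (percent correct = hits / N, accuracy = (hits - insertions) / N, with N the number of reference labels), is given below. The function names and the sample sequences are hypothetical.

```python
def align_counts(reference, hypothesis):
    """Count hits, substitutions, deletions and insertions via edit-distance alignment."""
    n, m = len(reference), len(hypothesis)
    # cost[i][j]: minimum number of edits turning reference[:i] into hypothesis[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (reference[i - 1] != hypothesis[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back one optimal path and classify each step.
    hits = subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        mismatch = i > 0 and j > 0 and reference[i - 1] != hypothesis[j - 1]
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + mismatch:
            if mismatch:
                subs += 1
            else:
                hits += 1
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return hits, subs, dels, ins


def percent_correct(reference, hypothesis):
    """Share of reference labels recognized correctly; insertions are ignored."""
    hits, _, _, _ = align_counts(reference, hypothesis)
    return 100.0 * hits / len(reference)


def percent_accuracy(reference, hypothesis):
    """Like percent correct, but insertions are also penalized."""
    hits, _, _, ins = align_counts(reference, hypothesis)
    return 100.0 * (hits - ins) / len(reference)


# Hypothetical phoneme strings: one substitution (n -> m) and one insertion (a).
ref = ["b", "o", "n", "j", "u", "r"]
hyp = ["b", "o", "m", "j", "u", "r", "a"]
counts = align_counts(ref, hyp)
```

Under these definitions accuracy can never exceed percent correct, which is why the two figures should not be compared directly across experiments.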
CONCLUSION 
 Hand shapes and lip shapes were integrated using concatenative feature fusion, and HMM-based automatic recognition was conducted.
 For continuous phoneme recognition, 86% phoneme correct was achieved for the normal-hearing cuer and 82.7% for the deaf cuer.
 Isolated word recognition experiments with the normal-hearing and deaf subjects were also conducted, obtaining 94.9% and 89% accuracy, respectively.
CONCLUSION 
 A multi-speaker experiment using data from both the normal-hearing and the deaf subject showed 89.6% word accuracy on average.
 This result indicates that training speaker-independent HMMs for Cued Speech using a large number of subjects should not face particular difficulties.
REFERENCES 
 G. Potamianos, C. Neti, G. Gravier, A. Garg, and A. W. Senior, “Recent advances in the automatic recognition of audiovisual speech,” Proceedings of the IEEE, vol. 91, issue 9, pp. 1306–1326, 2003.
 S. Nakamura, K. Kumatani, and S. Tamura, “Multi-modal temporal asynchronicity modeling by product HMMs for robust audio-visual speech recognition,” in Proceedings of the Fourth IEEE International Conference on Multimodal Interfaces (ICMI’02), p. 305, 2002.
 R. O. Cornett, “Cued speech,” American Annals of the Deaf, vol. 112, pp. 3–13, 1967.
 J. Leybaert, “Phonology acquired through the eyes and spelling in deaf children,” Journal of Experimental Child Psychology, vol. 75, pp. 291–318, 2000.
Thank you!
ANY QUESTIONS?

Kamal Acharya
 

Recently uploaded (20)

Democratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek AryaDemocratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek Arya
 
Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234
 
road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
 
ethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.pptethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.ppt
 
Courier management system project report.pdf
Courier management system project report.pdfCourier management system project report.pdf
Courier management system project report.pdf
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
 
Forklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella PartsForklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella Parts
 
addressing modes in computer architecture
addressing modes  in computer architectureaddressing modes  in computer architecture
addressing modes in computer architecture
 
Vaccine management system project report documentation..pdf
Vaccine management system project report documentation..pdfVaccine management system project report documentation..pdf
Vaccine management system project report documentation..pdf
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docx
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
 
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
 
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdfCOLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
 
weather web application report.pdf
weather web application report.pdfweather web application report.pdf
weather web application report.pdf
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdfHybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
 
Event Management System Vb Net Project Report.pdf
Event Management System Vb Net  Project Report.pdfEvent Management System Vb Net  Project Report.pdf
Event Management System Vb Net Project Report.pdf
 

Visual speech to text conversion applicable to telephone communication

  • 1. Visual-speech to text conversion applicable to telephone communication for deaf individuals 30TH APRIL 2013
  • 2. INTRODUCTION
      • Lip-reading: speech can be understood by interpreting the movements of the lips, face and tongue.
      • The mapping between visible mouth movements and phonemes is not one-to-one.
      • Phonemes therefore cannot all be distinguished using visual information alone.
  • 3. The Cued Speech system
      • Developed by Cornett.
      • Contains two components: the hand shape and the hand position relative to the face.
      • Hand shapes code consonant phonemes.
      • Hand positions code vowel phonemes.
      • Improves speech perception to a large extent.
  • 4. The Cued Speech system
  • 5. AIM OF NEW SYSTEM
      • To investigate the design of a system able to automatically recognize Cued Speech and convert it to text.
      • This would make it possible for deaf or speech-impaired individuals to communicate with each other and with normal-hearing persons.
      • Communication uses gestures captured by devices equipped with a camera.
  • 6. METHODS
      • Corpus, feature extraction, and statistical modeling.
      • The speakers' lips were painted blue, and color marks were placed on the speakers' fingers.
      • The data were derived from a video recording of the cuers pronouncing and coding in Cued Speech.
      • Landmarks with different colors were placed on the fingers.
  • 7. METHODS (continued)
      • The color marks allow a faster and more accurate image processing stage.
      • The audio part of the video recording was synchronized with the image.
      • An automatic image processing method was applied to the video to extract the lip parameters:
          • lip width (A),
          • lip aperture (B),
          • lip area (S),
          • pinching of the upper lip (Bsup),
          • pinching of the lower lip (Binf).
  • 8. Concatenative feature fusion
      • Tracks and extracts the xy coordinates of the finger landmarks at each time frame.
      • Uses those values as features in the HMM modeling.
      • Uses the concatenation of the synchronous lip shape and hand features as the joint feature vector, given by the expression whose terms are listed on the next slide.
  • 9. Terms of the joint feature vector expression: joint lip-hand feature vector, lip shape feature vector, hand feature vector, dimensionality of the joint feature vector.
      • Parameters used for lip shape modeling.
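The expression for the joint feature vector is not reproduced in this transcript. One common way to write concatenative fusion that is consistent with the terms listed on slide 9 is O_t^{LH} = [O_t^{L}, O_t^{H}], with dimensionality D = D_L + D_H; these symbols are assumptions, not necessarily the authors' notation. The Python sketch below shows the idea for synchronous streams. The function names, the ordering of the lip parameters, and the sample values are all hypothetical.

```python
from typing import Dict, List, Sequence

# Assumed ordering of the lip-shape parameters named on slide 7.
LIP_KEYS = ("A", "B", "S", "Bsup", "Binf")


def fuse_frame(lip: Dict[str, float], hand_xy: Sequence[float]) -> List[float]:
    """Concatenate one frame's lip parameters with its hand coordinates."""
    lip_vec = [float(lip[k]) for k in LIP_KEYS]
    return lip_vec + [float(v) for v in hand_xy]


def fuse_sequence(lip_frames, hand_frames) -> List[List[float]]:
    """Fuse two synchronous streams frame by frame."""
    if len(lip_frames) != len(hand_frames):
        raise ValueError("lip and hand streams must have the same number of frames")
    return [fuse_frame(lip, hand) for lip, hand in zip(lip_frames, hand_frames)]


# Made-up values for two video frames.
lip_frames = [
    {"A": 4.1, "B": 1.2, "S": 3.9, "Bsup": 0.3, "Binf": 0.4},
    {"A": 3.8, "B": 0.6, "S": 2.0, "Bsup": 0.2, "Binf": 0.5},
]
hand_frames = [(120.0, 88.0), (118.5, 90.0)]  # xy of one tracked landmark

joint = fuse_sequence(lip_frames, hand_frames)
print(len(joint), len(joint[0]))  # number of frames, joint dimensionality
```

With five lip parameters and one tracked xy pair, each joint vector has 5 + 2 = 7 dimensions. Tracking more landmarks would simply lengthen the hand part of the vector.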
  • 10. RESULTS: isolated word recognition. (1) Recognition in the normal-hearing subject.
  • 11. (2) Recognition in the deaf subject.
  • 12. (3) Multi-speaker isolated word recognition
      • Investigates whether it is possible to train speaker-independent HMMs for Cued Speech recognition.
      • The training data consisted of 750 words from the normal-hearing subject and 750 words from the deaf subject.
      • For testing, 700 words from the normal-hearing subject and 700 words from the deaf subject were used.
      • Each state was modeled with a mixture of 4 Gaussian distributions.
      • For lip shape and hand shape integration, concatenative feature fusion was used.
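To make the modeling step concrete, here is a minimal Python sketch of how an isolated word could be scored against HMMs whose states emit from a mixture of 4 diagonal-covariance Gaussians. It is an illustration under assumptions, not the authors' setup: the left-to-right topology, the toy parameters, and the word names are invented, and no training is shown (only the forward-algorithm scoring used at recognition time).

```python
import math
from typing import Dict, List, Sequence, Tuple

Gaussian = Tuple[float, List[float], List[float]]  # (weight, mean, diagonal variance)


def ln(p: float) -> float:
    """Logarithm that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def log_sum_exp(values: Sequence[float]) -> float:
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_gauss(x, mean, var) -> float:
    """Log density of a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - mu) ** 2 / v)
               for xi, mu, v in zip(x, mean, var))


def log_emission(x, mixture: List[Gaussian]) -> float:
    """Log likelihood of one frame under a Gaussian mixture."""
    return log_sum_exp([ln(w) + log_gauss(x, mu, var) for w, mu, var in mixture])


def forward_loglik(obs, log_start, log_trans, mixtures) -> float:
    """Forward algorithm in the log domain: log P(obs | model)."""
    n = len(log_start)
    alpha = [log_start[s] + log_emission(obs[0], mixtures[s]) for s in range(n)]
    for x in obs[1:]:
        alpha = [log_sum_exp([alpha[p] + log_trans[p][s] for p in range(n)])
                 + log_emission(x, mixtures[s]) for s in range(n)]
    return log_sum_exp(alpha)


def toy_mixture(center: List[float]) -> List[Gaussian]:
    """Four equally weighted components around a centre (toy values)."""
    offsets = (-0.3, -0.1, 0.1, 0.3)
    return [(0.25, [c + d for c in center], [1.0] * len(center)) for d in offsets]


def toy_word_model(centers: List[List[float]]):
    """Left-to-right HMM: each state either stays or advances (assumed topology)."""
    n = len(centers)
    log_start = [ln(1.0)] + [ln(0.0)] * (n - 1)
    log_trans = [[ln(0.5) if j in (i, i + 1) else ln(0.0) for j in range(n)]
                 for i in range(n)]
    log_trans[-1][-1] = ln(1.0)  # final state only loops on itself
    return log_start, log_trans, [toy_mixture(c) for c in centers]


def recognize(obs, models: Dict[str, tuple]) -> str:
    """Pick the word whose HMM gives the observation the highest likelihood."""
    return max(models, key=lambda w: forward_loglik(obs, *models[w]))


models = {
    "word_a": toy_word_model([[0.0, 0.0], [5.0, 5.0]]),
    "word_b": toy_word_model([[5.0, 5.0], [0.0, 0.0]]),
}
obs = [[0.1, -0.2], [0.3, 0.1], [4.8, 5.2], [5.1, 4.9]]
print(recognize(obs, models))
```

In a multi-speaker setting such as the one described on this slide, the same scoring applies; the difference is that each word model's parameters would be estimated from the pooled training data of both speakers.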
  • 14. (4) Continuous phoneme recognition
      • Phoneme correct for continuous phoneme recognition in the case of the normal-hearing subject.
  • 15. Phoneme correct for continuous phoneme recognition in the case of the deaf subject.
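The slides do not define "phoneme correct". A widely used convention, assumed here, aligns the recognized phoneme string with the reference and reports correct = (N - S - D) / N and accuracy = (N - S - D - I) / N, where N is the number of reference phonemes and S, D, I are substitutions, deletions, and insertions. The Python sketch below computes both from a plain minimum-edit-distance alignment with unit costs. The example phoneme strings are made up.

```python
from typing import Sequence, Tuple


def align_counts(ref: Sequence[str], hyp: Sequence[str]) -> Tuple[int, int, int]:
    """Return (substitutions, deletions, insertions) of a minimum-cost alignment."""
    # Each cell holds (total_cost, subs, dels, ins); ties are broken by tuple order.
    prev = [(j, 0, 0, j) for j in range(len(hyp) + 1)]
    for i, r in enumerate(ref, start=1):
        cur = [(i, 0, i, 0)]
        for j, h in enumerate(hyp, start=1):
            c, s, d, a = prev[j - 1]
            diag = (c, s, d, a) if r == h else (c + 1, s + 1, d, a)
            c, s, d, a = prev[j]
            deletion = (c + 1, s, d + 1, a)
            c, s, d, a = cur[j - 1]
            insertion = (c + 1, s, d, a + 1)
            cur.append(min(diag, deletion, insertion))
        prev = cur
    _, subs, dels, ins = prev[-1]
    return subs, dels, ins


def phoneme_scores(ref: Sequence[str], hyp: Sequence[str]) -> Tuple[float, float]:
    """Percent correct and percent accuracy under the assumed definitions."""
    n = len(ref)
    subs, dels, ins = align_counts(ref, hyp)
    correct = 100.0 * (n - subs - dels) / n
    accuracy = 100.0 * (n - subs - dels - ins) / n
    return correct, accuracy


ref = ["b", "o", "n", "j", "u", "r"]
hyp = ["b", "o", "m", "j", "u", "r", "a"]  # one substitution, one insertion
correct, accuracy = phoneme_scores(ref, hyp)
print(f"correct={correct:.1f}% accuracy={accuracy:.1f}%")
```

Under these definitions, "correct" ignores insertions, so it is never lower than "accuracy" for the same output.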
  • 16. CONCLUSION
      • Hand shapes and lip shapes were integrated using concatenative feature fusion, and HMM-based automatic recognition was conducted.
      • For continuous phoneme recognition, 86% phoneme correct was achieved for the normal-hearing cuer and 82.7% for the deaf cuer.
      • Isolated word recognition in both the normal-hearing and the deaf subject was also conducted, obtaining 94.9% and 89% accuracy, respectively.
  • 17. CONCLUSION
      • A multi-speaker experiment using data from both the normal-hearing and the deaf subject showed 89.6% word accuracy on average.
      • This result indicates that training speaker-independent HMMs for Cued Speech using a large number of subjects should not face particular difficulties.
  • 18. REFERENCES
      • G. Potamianos, C. Neti, G. Gravier, A. Garg, and A. W. Senior, “Recent advances in the automatic recognition of audiovisual speech,” Proceedings of the IEEE, vol. 91, no. 9, pp. 1306–1326, 2003.
      • S. Nakamura, K. Kumatani, and S. Tamura, “Multi-modal temporal asynchronicity modeling by product HMMs for robust audio-visual speech recognition,” in Proceedings of the Fourth IEEE International Conference on Multimodal Interfaces (ICMI’02), p. 305, 2002.
      • R. O. Cornett, “Cued speech,” American Annals of the Deaf, vol. 112, pp. 3–13, 1967.
      • J. Leybaert, “Phonology acquired through the eyes and spelling in deaf children,” Journal of Experimental Child Psychology, vol. 75, pp. 291–318, 2000.
  • 20. ANY QUESTIONS?