SlideShare a Scribd company logo
1 of 18
1
SRI SIDDHARTHA ACADEMY OF HIGHER EDUCATION
(DEEMED TO BE UNIVERSITY, Accredited A+ Grade by NAAC)
Sri Siddhartha Institute of Technology-Tumakuru
(A CONSTITUENT COLLEGE OF SSAHE)
Department of Electronics & Communication Engineering
(ACCREDITED BY NBA)
MAJOR PROJECT (EC7PW1) - Problem Definition Seminar
On
“LIP READING USING DEEP LEARNING”
Presented By:
Dongala Gokul Chandra Reddy (20EC023)
Manoj Kumar M O (20EC040)
Manu G (20EC041)
Sagar Sonale T V (20EC052)
Under the Guidance of:
Divyaprabha
Associate Professor
Dept. of ECE, SSIT
02/01/2024
2
OUTLINE
 INTRODUCTION
 LITERATURE SURVEY
 PROBLEM STATEMENT
 SYSTEM DESIGN AND IMPLEMETATION
 HARDWARE AND SOFTWARE REQUIREMENTS
 CONCLUSION
 REFERENCES
3
 Communication is fundamental to the existence and survival of humans to
express their ideas and feelings to reach common understand among the
people.
 Lip reading plays a supreme role in grasping human speech particularly for
listeners with hearing impairments. This serves as a hearing aid for the
hearing impaired particularly interacting with people with no knowledge of
sign language.
 Lip reading using deep learning is a technique that uses artificial intelligence
to interpret and understand spoken words by analyzing the movements of a
person's lips.
 By analyzing the visual patterns of lip movements, these algorithms can
accurately recognize and transcribe spoken words.
INTRODUCTION
4
MOTIVATION
Lip reading using deep learning is an
incredible application that has the
potential to revolutionize
communication for individuals with
hearing impairments. By using
advanced algorithms and analyzing the
movements of the lips, this technology
can bridge the gap and enable better
understanding of spoken words. It's
truly inspiring to see how artificial
intelligence can make a positive impact
on people's lives.
5
OBJECTIVES
 Develop an automatic feature extraction technique that is able to
extract lip geometry information from the mouth region.
 Design a state-of-art audio-visual speech recognition system using
dynamic geometry features from the lip shape.
 Create powerful tool capable of detecting the face, lips and describe
the events of video.
 Design a model that has much higher accuracy compared to other
6
LITERATURE SURVEY
Sl. No. TITLE of the Paper
AUTHORs,
YEAR & PUBLICATION
OBSERVATIONS
1. Lip Reading
Sentences Using
Deep Learning with
only Visual Cues
Souheil Fenghour,
London South Bank
University, London
2020
In this Paper, A neural network-based that is Visemes
classification, lip reading system has been developed to
predict sentences covering a wide range of vocabulary in
silent videos from people speaking, they have used BBC LRS2
data set and achieved accuracy of 65%.
2. Analyzing lower half
facial gestures for lip
reading applications:
Survey on vision
techniques
Niranjana Krupa B.
Department of ECE, PES
University, Bangalore,
560085, India
2022
Lip reading applications discussed in the survey can be of a
great helping hand to society by providing security to
systems, voice assistants for devices, a hearing aid for deaf
people, for generating video text transcriptions, for
pronunciation correction or evaluation, aiding forensics with
spy cameras, synthesizing voice for unable to talk patients,
attempting speech inpainting during noisy video
conferencing,
Dream High
LITERATURE SURVEY
Sl. No. TITLE of the Paper
AUTHORs,
YEAR & PUBLICATION
OBSERVATIONS
3. Text Extraction through
Video Lip Reading Using
Deep Learning
S. M. M. H. Chowdhury,
M. Rahman, M. T. Oyshi
and M. A. Hasan.
Moradabad, India, 2021
In this research, a method of converting video data to text
data through lip reading has been proposed. The proposed
method includes test dataset, image frame analysis and
having text output from identified words. They have used
HMM Architecture, and used Grid data set the accuracy has
been around 76%.
4. Deep Learning based
Lip-Reading Techniques
S. Pujari, S. Sneha, R.
Vinusha, P.
Bhuvaneshwari and
C.Yashaswini.
Tirunelveli, India, 2021
In this Paper, convolution Neural Network and Bi-LSTM are
used to design Lip reading Model and they have used Ou lu
VS2 dataset for training the model they have achieved
82.3% of accuracy.
5. Lipreading Using
Temporal Convolutional
Networks
B. Martinez, P. Ma, S.
Petridis and M. Pantic
Barcelona, Spain, 2020
Firstly, they use Temporal Convolutional Networks (TCN)
and used LRW1000 data set. Secondly simplify the training
procedure, which allows to train the model in one single
stage. Thirdly, to show that the current state-of-the-art
methodology produces models that do not generalize well
to variations on the sequence length, and addresses this
issue by proposing a variable-length augmentation. The
Dream High
LITERATURE SURVEY
Sl. No. TITLE of the Paper
AUTHORs,
YEAR & PUBLICATION
OBSERVATIONS
6. Convolutional Neural
Network Based Lip
Reading System for
Hearing Impaired
People
Fathima S, C. Jayanthi,
N.Sripriya
2020.
Each frame passes through trained CNN architecture and
frames are then divided into visemes. Produced visemes go
through a thick layer of Long Short Term Memory (LSTM).
The result of the LSTM layer turns into the contribution to
the following thick layer. Finally, they receive sequence of
visemes; classified visemes are labeled by LSTM softmax
activation function. Feature extraction of visemes are
judged using classifier schema known as visemes to
phoneme mapping. Considering the mapping procedure,
possible Word is detected using word detector. The
accuracy achieved is 83%.
7. Lip Reading
Experiments for
Multiple Databases
using Conventional
Method
T. Shirakata and T.
Saitoh.
Hiroshima, Japan, 2019
In this paper, not the latest deep learning-based method
but the standard recognition method by hidden Markov
model (HMM) which mainly used conventionally is applied,
and analyzes trends in recognition accuracy. Based-on
recognition experiments, it was found that the recognition
accuracy was correlated with the number of frames,which
is around 91%.
Dream High
LITERATURE SURVEY
Sl. No. TITLE of the Paper
AUTHORs,
YEAR & PUBLICATION
OBSERVATIONS
8. Vision based Lip
Reading System Using
Deep Learning
Fathima S, C. Jayanthi,
N.Sripriya
2020.
This paper presents the method for Vision based Lip
Reading System that uses convolutional neural network
with attention-based Long Shot-Term Memory. The data
sets includes video clips pronouncing single digits.
They used two pre-trained models namely VGG19 and
ResNet50, The system provides 85% accuracy.
10
PROBLEM STATEMENT
• Speech recognition may not work well If the user has a loud voice, a strong
accent, or a speech condition.
• when the user in the public location, a quite library, or a private conference,
voice recognition may not be possible or desired.
• Traditional machine learning models may struggle to handle large amounts of
data and exploit its potential due to limitations in scalability and computational
efficiency
• Speakers exhibit unique lip movements, making it challenging for traditional
machine learning models to adapt to speaker-specific variations.
11
SOLUTION
• with deep learning, to learn speaker-specific features, are better suited to handle
variability.
• Deep learning models often demonstrate better robustness to noisy environments
due to their ability to learn hierarchical representations.
• Use of deep learning in Lip reading often benefits from incorporating additional
modalities, such as audio information.
• Lip reading involves understanding not only lip movements but also contextual
and linguistic factors
12
BLOCK DIAGRAM/SYSTEM DESIGN
Fig: Block diagram representation
HARDWARE INTERFACES
13
SOFTWARE REQUIREMENTS
• Anaconda
• Spyder
• Keras
• Tensorflow
• Processor : Intel CORE i5 processor with minimum 3.2 GHz
• RAM : Minimum 4GB.
• Hard Disk : Minimum 500 GB
14
CONCLUSION
In conclusion, based on the literature survey the lip reading using deep learning can
be achieved through various deep learning techniques. We have choosen LipNet
due to its spatiotemporal visual features which has the possibility of increasing the
accuracy in our proposed work.
.
15
REFERENCES
1. S. Fenghour, D. Chen, K. Guo and P. Xiao, "Lip Reading Sentences Using Deep
Learning With Only Visual Cues," in IEEE Access, vol. 8, pp. 215516-215530, 2020,
doi:10.1109/ACCESS.2020.3040906.
2. 2. S. Fenghour, D. Chen, K. Guo, B. Li and P. Xiao, "Deep Learning-Based Automated
Lip-Reading: A Survey," in IEEE Access, vol. 9, pp. 121184-121205, 2021, doi:
10.1109/ACCESS.2021.3107946.
3. 3. N. Deshmukh, A. Ahire, S. H. Bhandari, A. Mali and K. Warkari, "Vision based Lip
Reading System using Deep Learning," 2021 International Conference on
Computing, Communication and Green Engineering (CCGE), Pune, India, 2021, pp.
1-6, doi: 10.1109/CCGE50943.2021.9776430.
4. 4. K. Neeraja, K. Srinivas Rao and G. Praneeth, "Deep Learning based Lip Movement
Technique for Mute," 2021 6th International Conference on Communication and
Electronics Systems (ICCES), Coimbatre, India, 2021, pp. 1446-1450, doi:
10.1109/ICCES51350.2021.9489122.
.
16
5. S. M. M. H. Chowdhury, M. Rahman, M. T. Oyshi and M. A. Hasan, "Text Extraction
through Video Lip Reading Using Deep Learning," 2019 8th International Conference
System Modeling and Advancement in Research Trends (SMART), Moradabad, India,
2019, pp. 240-243, doi: 10.1109/SMART46866.2019.9117224.
6. S. Pujari, S. Sneha, R. Vinusha, P. Bhuvaneshwari and C. Yashaswini, "A Survey on
Deep Learning based Lip-Reading Techniques," 2021 Third International Conference
on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV), T
irunelveli, India, 2021, pp. 1286-1293, doi: 10.1109/ICICV50876.2021.9388569.
7. T. Shirakata and T. Saitoh, "Lip Reading Experiments for Multiple Databases using
Conventional Method," 2019 58th Annual Conference of the Society of Instrument
and Control Engineers of Japan (SICE), Hiroshima, Japan, 2019, pp. 409-414, doi:
10.23919/SICE.2019.8859932
.
17
8. F. S, C. J. S and N. Sripriya, "Convolutional Neural Network Based Lip Reading System
for Hearing Impaired People," 2022 8th International Conference on Smart Structures
and Systems (ICSSS), Chennai, India, 2022, pp. 1-5, doi:
10.1109/ICSSS54381.2022.9782208
9. M. Varshney, R. Yadav, V. P. Namboodiri and R. M. Hegde, "Learning Speaker-specific
Lip-to-Speech Generation," 2022 26th International Conference on Pattern
Recognition (ICPR), Montreal, QC, Canada, 2022, pp. 491-498, doi:
10.1109/ICPR56361.2022.9956600
10.B. Martinez, P. Ma, S. Petridis and M. Pantic, "Lipreading Using Temporal
Convolutional Networks," ICASSP 2020 - 2020 IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 6319-
6323, doi: 10.1109/ICASSP40776.2020.9053841.
Thank You

More Related Content

Similar to lip reading using deep learning presentation

GLOVE BASED GESTURE RECOGNITION USING IR SENSOR
GLOVE BASED GESTURE RECOGNITION USING IR SENSORGLOVE BASED GESTURE RECOGNITION USING IR SENSOR
GLOVE BASED GESTURE RECOGNITION USING IR SENSORIRJET Journal
 
IRJET - Gesture based Communication Recognition System
IRJET -  	  Gesture based Communication Recognition SystemIRJET -  	  Gesture based Communication Recognition System
IRJET - Gesture based Communication Recognition SystemIRJET Journal
 
01 8445 speech enhancement
01 8445 speech enhancement01 8445 speech enhancement
01 8445 speech enhancementIAESIJEECS
 
Paper id 23201490
Paper id 23201490Paper id 23201490
Paper id 23201490IJRAT
 
Adopting progressed CNN for understanding hand gestures to native languages b...
Adopting progressed CNN for understanding hand gestures to native languages b...Adopting progressed CNN for understanding hand gestures to native languages b...
Adopting progressed CNN for understanding hand gestures to native languages b...IRJET Journal
 
CS8611-Mini Project - PPT Template-4 (1).ppt
CS8611-Mini Project - PPT Template-4 (1).pptCS8611-Mini Project - PPT Template-4 (1).ppt
CS8611-Mini Project - PPT Template-4 (1).pptJANARTHANANP013
 
A Review Paper on Speech Based Emotion Detection Using Deep Learning
A Review Paper on Speech Based Emotion Detection Using Deep LearningA Review Paper on Speech Based Emotion Detection Using Deep Learning
A Review Paper on Speech Based Emotion Detection Using Deep LearningIRJET Journal
 
Enhancing speaker verification accuracy with deep ensemble learning and inclu...
Enhancing speaker verification accuracy with deep ensemble learning and inclu...Enhancing speaker verification accuracy with deep ensemble learning and inclu...
Enhancing speaker verification accuracy with deep ensemble learning and inclu...IJECEIAES
 
Deep convolutional neural networks-based features for Indonesian large vocabu...
Deep convolutional neural networks-based features for Indonesian large vocabu...Deep convolutional neural networks-based features for Indonesian large vocabu...
Deep convolutional neural networks-based features for Indonesian large vocabu...IAESIJAI
 
IRJET - Automatic Lip Reading: Classification of Words and Phrases using Conv...
IRJET - Automatic Lip Reading: Classification of Words and Phrases using Conv...IRJET - Automatic Lip Reading: Classification of Words and Phrases using Conv...
IRJET - Automatic Lip Reading: Classification of Words and Phrases using Conv...IRJET Journal
 
Speech Recognition Application for the Speech Impaired using the Android-base...
Speech Recognition Application for the Speech Impaired using the Android-base...Speech Recognition Application for the Speech Impaired using the Android-base...
Speech Recognition Application for the Speech Impaired using the Android-base...TELKOMNIKA JOURNAL
 
IRJET- Design and Development of Tesseract-OCR Based Assistive System to Conv...
IRJET- Design and Development of Tesseract-OCR Based Assistive System to Conv...IRJET- Design and Development of Tesseract-OCR Based Assistive System to Conv...
IRJET- Design and Development of Tesseract-OCR Based Assistive System to Conv...IRJET Journal
 
Sign Language Recognition based on Hands symbols Classification
Sign Language Recognition based on Hands symbols ClassificationSign Language Recognition based on Hands symbols Classification
Sign Language Recognition based on Hands symbols ClassificationTriloki Gupta
 
E slate cost effective learning tool
E slate  cost effective learning toolE slate  cost effective learning tool
E slate cost effective learning tooleSAT Journals
 
A review of factors that impact the design of a glove based wearable devices
A review of factors that impact the design of a glove based wearable devicesA review of factors that impact the design of a glove based wearable devices
A review of factors that impact the design of a glove based wearable devicesIAESIJAI
 
Sentiment analysis by deep learning approaches
Sentiment analysis by deep learning approachesSentiment analysis by deep learning approaches
Sentiment analysis by deep learning approachesTELKOMNIKA JOURNAL
 
IRJET - Sign Language Converter
IRJET -  	  Sign Language ConverterIRJET -  	  Sign Language Converter
IRJET - Sign Language ConverterIRJET Journal
 
Speech Recognition: Transcription and transformation of human speech
Speech Recognition: Transcription and transformation of human speechSpeech Recognition: Transcription and transformation of human speech
Speech Recognition: Transcription and transformation of human speechSubmissionResearchpa
 
Project_Phase1_-_Literature_Review-1[1].pptx
Project_Phase1_-_Literature_Review-1[1].pptxProject_Phase1_-_Literature_Review-1[1].pptx
Project_Phase1_-_Literature_Review-1[1].pptxASHWIN808488
 
Speech Automated Examination for Visually Impaired Students
Speech Automated Examination for Visually Impaired StudentsSpeech Automated Examination for Visually Impaired Students
Speech Automated Examination for Visually Impaired Studentsvivatechijri
 

Similar to lip reading using deep learning presentation (20)

GLOVE BASED GESTURE RECOGNITION USING IR SENSOR
GLOVE BASED GESTURE RECOGNITION USING IR SENSORGLOVE BASED GESTURE RECOGNITION USING IR SENSOR
GLOVE BASED GESTURE RECOGNITION USING IR SENSOR
 
IRJET - Gesture based Communication Recognition System
IRJET -  	  Gesture based Communication Recognition SystemIRJET -  	  Gesture based Communication Recognition System
IRJET - Gesture based Communication Recognition System
 
01 8445 speech enhancement
01 8445 speech enhancement01 8445 speech enhancement
01 8445 speech enhancement
 
Paper id 23201490
Paper id 23201490Paper id 23201490
Paper id 23201490
 
Adopting progressed CNN for understanding hand gestures to native languages b...
Adopting progressed CNN for understanding hand gestures to native languages b...Adopting progressed CNN for understanding hand gestures to native languages b...
Adopting progressed CNN for understanding hand gestures to native languages b...
 
CS8611-Mini Project - PPT Template-4 (1).ppt
CS8611-Mini Project - PPT Template-4 (1).pptCS8611-Mini Project - PPT Template-4 (1).ppt
CS8611-Mini Project - PPT Template-4 (1).ppt
 
A Review Paper on Speech Based Emotion Detection Using Deep Learning
A Review Paper on Speech Based Emotion Detection Using Deep LearningA Review Paper on Speech Based Emotion Detection Using Deep Learning
A Review Paper on Speech Based Emotion Detection Using Deep Learning
 
Enhancing speaker verification accuracy with deep ensemble learning and inclu...
Enhancing speaker verification accuracy with deep ensemble learning and inclu...Enhancing speaker verification accuracy with deep ensemble learning and inclu...
Enhancing speaker verification accuracy with deep ensemble learning and inclu...
 
Deep convolutional neural networks-based features for Indonesian large vocabu...
Deep convolutional neural networks-based features for Indonesian large vocabu...Deep convolutional neural networks-based features for Indonesian large vocabu...
Deep convolutional neural networks-based features for Indonesian large vocabu...
 
IRJET - Automatic Lip Reading: Classification of Words and Phrases using Conv...
IRJET - Automatic Lip Reading: Classification of Words and Phrases using Conv...IRJET - Automatic Lip Reading: Classification of Words and Phrases using Conv...
IRJET - Automatic Lip Reading: Classification of Words and Phrases using Conv...
 
Speech Recognition Application for the Speech Impaired using the Android-base...
Speech Recognition Application for the Speech Impaired using the Android-base...Speech Recognition Application for the Speech Impaired using the Android-base...
Speech Recognition Application for the Speech Impaired using the Android-base...
 
IRJET- Design and Development of Tesseract-OCR Based Assistive System to Conv...
IRJET- Design and Development of Tesseract-OCR Based Assistive System to Conv...IRJET- Design and Development of Tesseract-OCR Based Assistive System to Conv...
IRJET- Design and Development of Tesseract-OCR Based Assistive System to Conv...
 
Sign Language Recognition based on Hands symbols Classification
Sign Language Recognition based on Hands symbols ClassificationSign Language Recognition based on Hands symbols Classification
Sign Language Recognition based on Hands symbols Classification
 
E slate cost effective learning tool
E slate  cost effective learning toolE slate  cost effective learning tool
E slate cost effective learning tool
 
A review of factors that impact the design of a glove based wearable devices
A review of factors that impact the design of a glove based wearable devicesA review of factors that impact the design of a glove based wearable devices
A review of factors that impact the design of a glove based wearable devices
 
Sentiment analysis by deep learning approaches
Sentiment analysis by deep learning approachesSentiment analysis by deep learning approaches
Sentiment analysis by deep learning approaches
 
IRJET - Sign Language Converter
IRJET -  	  Sign Language ConverterIRJET -  	  Sign Language Converter
IRJET - Sign Language Converter
 
Speech Recognition: Transcription and transformation of human speech
Speech Recognition: Transcription and transformation of human speechSpeech Recognition: Transcription and transformation of human speech
Speech Recognition: Transcription and transformation of human speech
 
Project_Phase1_-_Literature_Review-1[1].pptx
Project_Phase1_-_Literature_Review-1[1].pptxProject_Phase1_-_Literature_Review-1[1].pptx
Project_Phase1_-_Literature_Review-1[1].pptx
 
Speech Automated Examination for Visually Impaired Students
Speech Automated Examination for Visually Impaired StudentsSpeech Automated Examination for Visually Impaired Students
Speech Automated Examination for Visually Impaired Students
 

Recently uploaded

Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendArshad QA
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 

Recently uploaded (20)

Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and Backend
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 

lip reading using deep learning presentation

  • 1. 1 SRI SIDDHARTHA ACADEMY OF HIGHER EDUCATION (DEEMED TO BE UNIVERSITY, Accredited A+ Grade by NAAC) Sri Siddhartha Institute of Technology-Tumakuru (A CONSTITUENT COLLEGE OF SSAHE) Department of Electronics & Communication Engineering (ACCREDITED BY NBA) MAJOR PROJECT (EC7PW1) - Problem Definition Seminar On “LIP READING USING DEEP LEARNING” Presented By: Dongala Gokul Chandra Reddy (20EC023) Manoj Kumar M O (20EC040) Manu G (20EC041) Sagar Sonale T V (20EC052) Under the Guidance of: Divyaprabha Associate Professor Dept. of ECE, SSIT 02/01/2024
  • 2. 2 OUTLINE  INTRODUCTION  LITERATURE SURVEY  PROBLEM STATEMENT  SYSTEM DESIGN AND IMPLEMETATION  HARDWARE AND SOFTWARE REQUIREMENTS  CONCLUSION  REFERENCES
  • 3. 3  Communication is fundamental to the existence and survival of humans to express their ideas and feelings to reach common understand among the people.  Lip reading plays a supreme role in grasping human speech particularly for listeners with hearing impairments. This serves as a hearing aid for the hearing impaired particularly interacting with people with no knowledge of sign language.  Lip reading using deep learning is a technique that uses artificial intelligence to interpret and understand spoken words by analyzing the movements of a person's lips.  By analyzing the visual patterns of lip movements, these algorithms can accurately recognize and transcribe spoken words. INTRODUCTION
  • 4. 4 MOTIVATION Lip reading using deep learning is an incredible application that has the potential to revolutionize communication for individuals with hearing impairments. By using advanced algorithms and analyzing the movements of the lips, this technology can bridge the gap and enable better understanding of spoken words. It's truly inspiring to see how artificial intelligence can make a positive impact on people's lives.
  • 5. 5 OBJECTIVES  Develop an automatic feature extraction technique that is able to extract lip geometry information from the mouth region.  Design a state-of-art audio-visual speech recognition system using dynamic geometry features from the lip shape.  Create powerful tool capable of detecting the face, lips and describe the events of video.  Design a model that has much higher accuracy compared to other
  • 6. 6 LITERATURE SURVEY Sl. No. TITLE of the Paper AUTHORs, YEAR & PUBLICATION OBSERVATIONS 1. Lip Reading Sentences Using Deep Learning with only Visual Cues Souheil Fenghour, London South Bank University, London 2020 In this Paper, A neural network-based that is Visemes classification, lip reading system has been developed to predict sentences covering a wide range of vocabulary in silent videos from people speaking, they have used BBC LRS2 data set and achieved accuracy of 65%. 2. Analyzing lower half facial gestures for lip reading applications: Survey on vision techniques Niranjana Krupa B. Department of ECE, PES University, Bangalore, 560085, India 2022 Lip reading applications discussed in the survey can be of a great helping hand to society by providing security to systems, voice assistants for devices, a hearing aid for deaf people, for generating video text transcriptions, for pronunciation correction or evaluation, aiding forensics with spy cameras, synthesizing voice for unable to talk patients, attempting speech inpainting during noisy video conferencing,
  • 7. Dream High LITERATURE SURVEY Sl. No. TITLE of the Paper AUTHORs, YEAR & PUBLICATION OBSERVATIONS 3. Text Extraction through Video Lip Reading Using Deep Learning S. M. M. H. Chowdhury, M. Rahman, M. T. Oyshi and M. A. Hasan. Moradabad, India, 2021 In this research, a method of converting video data to text data through lip reading has been proposed. The proposed method includes test dataset, image frame analysis and having text output from identified words. They have used HMM Architecture, and used Grid data set the accuracy has been around 76%. 4. Deep Learning based Lip-Reading Techniques S. Pujari, S. Sneha, R. Vinusha, P. Bhuvaneshwari and C.Yashaswini. Tirunelveli, India, 2021 In this Paper, convolution Neural Network and Bi-LSTM are used to design Lip reading Model and they have used Ou lu VS2 dataset for training the model they have achieved 82.3% of accuracy. 5. Lipreading Using Temporal Convolutional Networks B. Martinez, P. Ma, S. Petridis and M. Pantic Barcelona, Spain, 2020 Firstly, they use Temporal Convolutional Networks (TCN) and used LRW1000 data set. Secondly simplify the training procedure, which allows to train the model in one single stage. Thirdly, to show that the current state-of-the-art methodology produces models that do not generalize well to variations on the sequence length, and addresses this issue by proposing a variable-length augmentation. The
  • 8. Dream High LITERATURE SURVEY Sl. No. TITLE of the Paper AUTHORs, YEAR & PUBLICATION OBSERVATIONS 6. Convolutional Neural Network Based Lip Reading System for Hearing Impaired People Fathima S, C. Jayanthi, N.Sripriya 2020. Each frame passes through trained CNN architecture and frames are then divided into visemes. Produced visemes go through a thick layer of Long Short Term Memory (LSTM). The result of the LSTM layer turns into the contribution to the following thick layer. Finally, they receive sequence of visemes; classified visemes are labeled by LSTM softmax activation function. Feature extraction of visemes are judged using classifier schema known as visemes to phoneme mapping. Considering the mapping procedure, possible Word is detected using word detector. The accuracy achieved is 83%. 7. Lip Reading Experiments for Multiple Databases using Conventional Method T. Shirakata and T. Saitoh. Hiroshima, Japan, 2019 In this paper, not the latest deep learning-based method but the standard recognition method by hidden Markov model (HMM) which mainly used conventionally is applied, and analyzes trends in recognition accuracy. Based-on recognition experiments, it was found that the recognition accuracy was correlated with the number of frames,which is around 91%.
  • 9. Dream High LITERATURE SURVEY Sl. No. TITLE of the Paper AUTHORs, YEAR & PUBLICATION OBSERVATIONS 8. Vision based Lip Reading System Using Deep Learning Fathima S, C. Jayanthi, N.Sripriya 2020. This paper presents the method for Vision based Lip Reading System that uses convolutional neural network with attention-based Long Shot-Term Memory. The data sets includes video clips pronouncing single digits. They used two pre-trained models namely VGG19 and ResNet50, The system provides 85% accuracy.
  • 10. 10 PROBLEM STATEMENT • Speech recognition may not work well If the user has a loud voice, a strong accent, or a speech condition. • when the user in the public location, a quite library, or a private conference, voice recognition may not be possible or desired. • Traditional machine learning models may struggle to handle large amounts of data and exploit its potential due to limitations in scalability and computational efficiency • Speakers exhibit unique lip movements, making it challenging for traditional machine learning models to adapt to speaker-specific variations.
  • 11. 11 SOLUTION • with deep learning, to learn speaker-specific features, are better suited to handle variability. • Deep learning models often demonstrate better robustness to noisy environments due to their ability to learn hierarchical representations. • Use of deep learning in Lip reading often benefits from incorporating additional modalities, such as audio information. • Lip reading involves understanding not only lip movements but also contextual and linguistic factors
  • 12. 12 BLOCK DIAGRAM/SYSTEM DESIGN Fig: Block diagram representation
  • 13. HARDWARE INTERFACES 13 SOFTWARE REQUIREMENTS • Anaconda • Spyder • Keras • Tensorflow • Processor : Intel CORE i5 processor with minimum 3.2 GHz • RAM : Minimum 4GB. • Hard Disk : Minimum 500 GB
  • 14. 14 CONCLUSION In conclusion, based on the literature survey the lip reading using deep learning can be achieved through various deep learning techniques. We have choosen LipNet due to its spatiotemporal visual features which has the possibility of increasing the accuracy in our proposed work.
  • 15. . 15 REFERENCES 1. S. Fenghour, D. Chen, K. Guo and P. Xiao, "Lip Reading Sentences Using Deep Learning With Only Visual Cues," in IEEE Access, vol. 8, pp. 215516-215530, 2020, doi:10.1109/ACCESS.2020.3040906. 2. 2. S. Fenghour, D. Chen, K. Guo, B. Li and P. Xiao, "Deep Learning-Based Automated Lip-Reading: A Survey," in IEEE Access, vol. 9, pp. 121184-121205, 2021, doi: 10.1109/ACCESS.2021.3107946. 3. 3. N. Deshmukh, A. Ahire, S. H. Bhandari, A. Mali and K. Warkari, "Vision based Lip Reading System using Deep Learning," 2021 International Conference on Computing, Communication and Green Engineering (CCGE), Pune, India, 2021, pp. 1-6, doi: 10.1109/CCGE50943.2021.9776430. 4. 4. K. Neeraja, K. Srinivas Rao and G. Praneeth, "Deep Learning based Lip Movement Technique for Mute," 2021 6th International Conference on Communication and Electronics Systems (ICCES), Coimbatre, India, 2021, pp. 1446-1450, doi: 10.1109/ICCES51350.2021.9489122.
  • 16. . 16 5. S. M. M. H. Chowdhury, M. Rahman, M. T. Oyshi and M. A. Hasan, "Text Extraction through Video Lip Reading Using Deep Learning," 2019 8th International Conference System Modeling and Advancement in Research Trends (SMART), Moradabad, India, 2019, pp. 240-243, doi: 10.1109/SMART46866.2019.9117224. 6. S. Pujari, S. Sneha, R. Vinusha, P. Bhuvaneshwari and C. Yashaswini, "A Survey on Deep Learning based Lip-Reading Techniques," 2021 Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV), T irunelveli, India, 2021, pp. 1286-1293, doi: 10.1109/ICICV50876.2021.9388569. 7. T. Shirakata and T. Saitoh, "Lip Reading Experiments for Multiple Databases using Conventional Method," 2019 58th Annual Conference of the Society of Instrument and Control Engineers of Japan (SICE), Hiroshima, Japan, 2019, pp. 409-414, doi: 10.23919/SICE.2019.8859932
  • 17. . 17 8. F. S, C. J. S and N. Sripriya, "Convolutional Neural Network Based Lip Reading System for Hearing Impaired People," 2022 8th International Conference on Smart Structures and Systems (ICSSS), Chennai, India, 2022, pp. 1-5, doi: 10.1109/ICSSS54381.2022.9782208 9. M. Varshney, R. Yadav, V. P. Namboodiri and R. M. Hegde, "Learning Speaker-specific Lip-to-Speech Generation," 2022 26th International Conference on Pattern Recognition (ICPR), Montreal, QC, Canada, 2022, pp. 491-498, doi: 10.1109/ICPR56361.2022.9956600 10.B. Martinez, P. Ma, S. Petridis and M. Pantic, "Lipreading Using Temporal Convolutional Networks," ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 6319- 6323, doi: 10.1109/ICASSP40776.2020.9053841.