Project_Phase1_-_Literature_Review-1[1].pptx

Sri Raghavendra Educational Institutions Society (R)
(Approved by AICTE, Accredited by NAAC, Affiliated to VTU, Karnataka)
Sri Krishna Institute of Technology
www.skit.org.in
Project Phase 1 – Presentation on
Literature Review
Assistive Communication Between
The Blind Dumb and Deaf
Ashwin G S 1KT20IS001
Hemanth Kumar N 1KT20IS006
Spoorthi L 1KT20IS017
Under the Guidance of
Ms. Ragini Krishna
Asst. Professor
Department of Information Science and Engineering

/skit.org.in
Dept. of ISE
Project Title / 18CSP77
Contents
• Abstract
• Introduction
• Literature Survey
• Objectives
• Problem Statement
• Expected Outcomes
• Bibliography

Introduction
In today's interconnected world, communication is the cornerstone of human interaction, enabling us to express thoughts, share ideas, and build relationships. However, not everyone has the same access to communication.
• Blind, deaf, and dumb people face distinct challenges in expressing themselves and connecting with others through traditional means. Advances in technology have opened up new possibilities.
• Assistive communication software with intuitive Graphical User Interfaces (GUIs) is breaking down these barriers, enabling these individuals to communicate effectively and independently.
• These applications are designed to let blind, deaf, and dumb individuals communicate seamlessly with others.

Challenges Faced by Sensory-Impaired People
Blind, deaf, and dumb individuals encounter distinct communication barriers that can be isolating and frustrating.
▪ Blindness
▪ Deafness
▪ Dumbness

Key Features of the Assistive System for Sensory-Impaired People
• The proposed system consists of:
INPUT – a microphone to record voice, a camera to capture images, and a keyboard to type messages
OUTPUT – a speaker and the device screen, which displays text and images
▪ Voice Recognition and Synthesis
▪ Screen Readers and Braille Compatibility
▪ High-Contrast and Tactile Interfaces
▪ Multimodal Input and Output
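The multimodal input and output described above can be sketched as a small routing step. This is only an illustration under stated assumptions: the slides list the devices but not how they are connected, so the names `INPUT_CONVERTERS`, `Recipient`, and `plan_route`, and the Braille fallback, are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical mapping from input device to the conversion it needs.
# The devices come from the slide; the wiring is an assumption.
INPUT_CONVERTERS = {
    "microphone": "speech-to-text",
    "camera": "image-to-text",
    "keyboard": "none",  # typed messages are already text
}


@dataclass
class Recipient:
    can_see: bool
    can_hear: bool


def plan_route(input_device: str, recipient: Recipient) -> dict:
    """Choose the conversion step and output devices for one message."""
    if input_device not in INPUT_CONVERTERS:
        raise ValueError(f"unknown input device: {input_device}")

    outputs = []
    if recipient.can_see:
        outputs.append("screen")
    if recipient.can_hear:
        outputs.append("speaker")  # would require text-to-speech
    if not outputs:
        # Assumption: a Braille display is available as a last resort.
        outputs.append("braille")

    return {"convert_input": INPUT_CONVERTERS[input_device], "outputs": outputs}


if __name__ == "__main__":
    # A spoken message sent to a deaf recipient is converted to on-screen text.
    print(plan_route("microphone", Recipient(can_see=True, can_hear=False)))
```

Normalising every input to text first keeps each converter independent, so a new input or output device could be added without touching the others.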

Literature Survey
A literature survey is a systematic method for identifying, evaluating, and interpreting the work produced by researchers, scholars, and practitioners. It is undertaken to establish what has already been done, to decide how to move forward with the research idea, and to identify the avenues of future work that remain to be investigated.

TEXT TO SPEECH
Columns: Method, Advantages, Drawbacks, Results & Impact
▪ Concatenative TTS [1]
  • Advantages: Natural-sounding speech; good intonation and prosody
  • Drawbacks: Limited flexibility; large storage requirements
  • Results & Impact: Natural, human-like speech; audiobooks, voiceovers, limited voice variation
▪ Formant Synthesis [1] / Articulatory Synthesis
  • Advantages: Control over speech parameters; lower storage requirements
  • Drawbacks: Less natural sounding; robotic or artificial-sounding voice
  • Results & Impact: Articulatory modeling for speech synthesis; useful in speech therapy and artificial voice generation
▪ Statistical Parametric Synthesis [1]: Hidden Markov Models (HMM)
  • Advantages: Statistical modeling for natural speech; flexibility in voice modification
  • Drawbacks: Initial training data requirements; computationally intensive
  • Results & Impact: Diverse voice options, adaptable to new languages; Siri, Google Assistant, adaptable synthesized voices
▪ Deep Learning and Neural Networks: Recurrent Neural Networks (RNN) [1]; WaveNet and Tacotron [4]
  • Advantages: Improved naturalness and intonation; learns patterns for better speech synthesis
  • Drawbacks: Resource-intensive training; overfitting and training challenges
  • Results & Impact: WaveNet, Tacotron, more natural and expressive synthesized speech; high-quality TTS, expressive speech generation
▪ Prosody Modeling [4]
  • Advantages: Enhanced expressiveness in speech; better emotional rendition in synthesized speech
  • Drawbacks: Natural intonation is complex to model; requires detailed linguistic analysis
  • Results & Impact: Speech with emotional inflection, more expressive synthesis; conversational agents, emotional speech synthesis

SPEECH TO TEXT
Columns: Method, Advantages, Drawbacks, Results/Impact
▪ Automatic Speech Recognition (ASR) [1]
  • Advantages: Enables hands-free operation, enhancing convenience and multitasking. Provides accessibility for individuals with disabilities. Improves productivity by allowing faster transcription compared to typing.
  • Drawbacks: Accuracy is affected by background noise and accents. May struggle with complex or rare vocabulary. Initial training and setup can be resource-intensive.
  • Results/Impact: Facilitates accessibility for individuals with disabilities and offers a hands-free experience for users, thereby enhancing productivity and convenience.
▪ Deep Learning Models
  • Advantages: Better accuracy in transcribing spoken language into text compared to traditional models. Allows for continuous improvement through learning from data.
  • Drawbacks: Require significant computational resources for training. Accuracy is highly dependent on the quantity and quality of training data.
  • Results/Impact: Enhances accuracy and efficiency in transcribing speech, continuously improving through the learning process, aiding in better transcription outcomes.
▪ Natural Language Processing (NLP) [2]
  • Advantages: Enhances accuracy by considering context, grammar, and semantics of speech. Supports various languages and can be used for translation tasks.
  • Drawbacks: Interpretation errors in understanding context and sarcasm. Limited performance in certain languages or dialects. Difficulty in disambiguating between homophones.
  • Results/Impact: Enables better context-based accuracy in transcribing speech and facilitates multilingual support for translation tasks.
▪ Hidden Markov Models (HMMs) [1][4]
  • Advantages: Historical significance in developing statistical models for speech recognition. Initially useful in capturing basic speech patterns.
  • Drawbacks: Limited in capturing long-term dependencies in speech signals. Not as effective in handling variability in spoken language.
  • Results/Impact: Played a significant role historically in the development of statistical models for speech recognition, capturing fundamental speech patterns.
▪ Neural Networks [1]
  • Advantages: Offer enhanced accuracy by capturing intricate speech patterns and context. Facilitate the development of end-to-end systems for direct speech-to-text mapping.
  • Drawbacks: Complex architecture may require extensive computational resources for training. Overfitting issues might reduce generalization performance. Interpretability can be challenging.
  • Results/Impact: Provide higher accuracy by capturing intricate speech patterns, paving the way for more direct end-to-end speech-to-text systems.
▪ Post-Processing and Error Correction [1]
  • Advantages: Refines transcriptions and corrects errors, improving overall accuracy. Enhances the quality of transcribed text through language models and context correction.
  • Drawbacks: May introduce new errors or misinterpretations during the correction process. Correction algorithms might not catch all mistakes, leading to residual errors.
  • Results/Impact: Refines transcriptions, improving accuracy by rectifying errors and enhancing the overall quality of transcribed text.
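The post-processing and error-correction entry can be illustrated with a very small vocabulary-based corrector. This is a sketch under assumptions, not the method of the surveyed papers: it uses Python's standard `difflib.get_close_matches` in place of a language model, and the function name `correct_transcript`, the sample vocabulary, and the `cutoff` value are chosen only for illustration.

```python
import difflib


def correct_transcript(words, vocabulary, cutoff=0.8):
    """Replace each out-of-vocabulary word with its closest vocabulary entry.

    Words with no sufficiently similar entry are left unchanged, which is
    one source of the residual errors noted in the table.
    """
    corrected = []
    for word in words:
        if word in vocabulary:
            corrected.append(word)
            continue
        # Returns at most one candidate whose similarity ratio is >= cutoff.
        matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else word)
    return corrected


if __name__ == "__main__":
    vocabulary = ["please", "open", "the", "door", "window"]  # illustrative only
    raw = ["plese", "open", "the", "dor"]
    print(" ".join(correct_transcript(raw, vocabulary)))
```

Lowering `cutoff` corrects more words but also raises the chance of replacing a correct word with a wrong one, which matches the drawback listed for this method.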

IMAGE TO TEXT
Columns: Technology, Advantages, Drawbacks, Results and Impact
▪ CNN (Convolutional Neural Networks) [5]
  • Advantages: Excellent at feature extraction from images; effective at recognizing patterns in images
  • Drawbacks: High computational resources needed for training; typically not used for direct text extraction
  • Results and Impact: Key in image preprocessing and object recognition; widely applied in image analysis tasks
▪ Tesseract [7]
  • Advantages: Highly accurate in printed text extraction; open-source with multilingual support
  • Drawbacks: May struggle with handwritten or complex text; requires well-segmented images for best results
  • Results and Impact: Widely used for OCR and document digitization; significant impact on text digitization tasks
▪ TTS (Text-to-Speech) [5]
  • Advantages: Converts extracted text to audible speech; enhances accessibility for the visually impaired
  • Drawbacks: May lack natural intonation and human-like qualities; requires additional computational resources
  • Results and Impact: Enhances accessibility, audiobook creation; used in screen readers and navigation systems
▪ OCR (Optical Character Recognition) [6]
  • Advantages: Recognizes printed or handwritten text; widely applicable across various domains
  • Drawbacks: Accuracy varies with image quality and complexity; may require pre- and post-processing
  • Results and Impact: Revolutionizes data entry and document digitization; automation in administrative and archival tasks
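The Tesseract and OCR entries note that accuracy depends on image quality and that pre-processing may be required. One common pre-processing step is binarization, separating dark text from a light background. The sketch below uses Otsu's method, a technique chosen here for illustration and not named in the slides; it assumes an 8-bit grayscale image already flattened into a list of integers in the range 0 to 255, and the function names are hypothetical.

```python
def otsu_threshold(pixels):
    """Return the grey level t that maximises the between-class variance.

    Pixels <= t form one class (dark), pixels > t the other (light).
    Assumes every pixel is an integer in 0..255.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0    # number of pixels in the dark class so far
    sum_dark = 0  # sum of grey levels in the dark class so far
    for t in range(256):
        w_dark += hist[t]
        if w_dark == 0:
            continue  # no dark pixels yet
        w_light = total - w_dark
        if w_light == 0:
            break  # every pixel is already in the dark class
        sum_dark += t * hist[t]
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        var_between = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to 0 (dark) or 255 (light) using the threshold."""
    return [255 if p > threshold else 0 for p in pixels]


if __name__ == "__main__":
    image = [10, 12, 11, 200, 210, 205]  # toy data: three dark, three light pixels
    t = otsu_threshold(image)
    print(t, binarize(image, t))
```

The binarized image would then be passed to the OCR engine; real documents usually need further steps such as deskewing and noise removal.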

GESTURE RECOGNITION
Columns: Method, Advantages, Drawbacks, Results and Impact
▪ CNN (Convolutional Neural Networks) [9][10]
  • Advantages: Effective for spatial feature extraction; automatic feature learning
  • Drawbacks: Requires a large labeled dataset; computationally intensive
  • Results and Impact: State-of-the-art in image-based gesture recognition; widely used in real-time applications
▪ SVM (Support Vector Machines) [8]
  • Advantages: Effective for various data types; handles nonlinearity with kernels
  • Drawbacks: Requires feature engineering; may not perform well with imbalanced data
  • Results and Impact: Used in feature-based gesture recognition; applied in sign language recognition
▪ OpenCV (Open Source Computer Vision Library) [7]
  • Advantages: Wide range of computer vision tools; real-time image and video processing
  • Drawbacks: Limited to computer vision tasks; may require manual tuning
  • Results and Impact: Basic gesture recognition, hand tracking; simple gesture detection
▪ TensorFlow [10]
  • Advantages: Flexible deep learning platform; support for image and sensor data
  • Drawbacks: Requires deep learning knowledge; resource-intensive training
  • Results and Impact: Widely used in research and custom model development; custom models for specific gesture recognition tasks
▪ HMM (Hidden Markov Models) [11]
  • Advantages: Suitable for sequential and temporal data; captures transitions in gesture sequences
  • Drawbacks: Less effective for image-based recognition; parameter selection required
  • Results and Impact: Used in dynamic gesture recognition, e.g., sign language
▪ ANN (Artificial Neural Networks) [12]
  • Advantages: Flexible and capable with various data types; automatic feature learning
  • Drawbacks: Requires large labeled data; computational intensity (deep nets)
  • Results and Impact: Commonly used in hand gesture recognition and facial expression analysis
▪ KNN (K-Nearest Neighbors) [8]
  • Advantages: Simple to implement and understand
  • Drawbacks: Computationally intensive testing
  • Results and Impact: Suitable for simple and small-scale gesture recognition tasks
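The KNN entry describes a method that is simple to implement but expensive at test time. A minimal sketch of why: every prediction compares the query against all stored training samples. The two-dimensional feature vectors and gesture labels below are made-up toy data, not taken from the surveyed papers, and `knn_predict` is a hypothetical name.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_vector, label) pairs. Each call scans the
    whole training set, which is the testing cost noted in the table.
    """
    nearest = sorted(train, key=lambda sample: math.dist(sample[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy features, e.g. two normalised hand-shape measurements (assumed).
    train = [
        ((0.10, 0.20), "fist"),
        ((0.20, 0.10), "fist"),
        ((0.15, 0.15), "fist"),
        ((0.90, 0.80), "open_palm"),
        ((0.80, 0.90), "open_palm"),
        ((0.85, 0.85), "open_palm"),
    ]
    print(knn_predict(train, (0.2, 0.2)))  # fist
    print(knn_predict(train, (0.8, 0.8)))  # open_palm
```

In a real system the feature vectors would come from an image-processing stage such as hand-landmark extraction, which this sketch does not cover.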

GAPS IDENTIFIED
In the development of a communication system for individuals with multiple sensory impairments,
several gaps and challenges may be identified:
• Lack of Comprehensive Solutions
• Singular Sensory Focus
• Limited Integration of Technologies
• Accessibility and Affordability

Objectives
The objectives of developing a communication system for individuals with multiple sensory impairments could
include:
• Independence and Empowerment
• Enhanced Communication
• Accessibility
• Integration of Technologies
• Facilitating Inclusivity

Problem Statement
To develop a comprehensive communication system that effectively addresses the complex needs of individuals with combined sensory impairments, such as those who are deaf, blind, and dumb, by integrating Speech-to-Text, Text-to-Speech, Image-to-Speech, and Gesture Recognition technologies.

Expected Outcomes
• Improved Communication Abilities
• User Empowerment
• Increased Access to Information
• Enhanced Independence
• Social Inclusion and Engagement
• Refinement and Advancement

Bibliography
[1] Nagdewani, Shivangi, and Ashika Jain. "A review on methods for speech-to-text and text-to-speech conversion." International Research Journal of Engineering and Technology (IRJET) 7.05 (2020).
[2] Trivedi, Ayushi, et al. "Speech to text and text to speech recognition systems: A review." IOSR J. Comput. Eng 20.2 (2018): 36-43.
[3] Buvaneswari, B., et al. "Communication among blind, deaf and dumb people." International Journal of Advanced Engineering, Management and Science 6.4 (2020).
[4] "Speech to text using machine learning." International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN 2349-5162, 9.5 (May 2022): j615-j620.
[5] Priya, Teja, Pulla Chanukah, and D. Vanusha. "Hand gesture to audio based communication system for blind people." (2019).

[6] Gore, Sayali, Namrata Salvi, and Swati Singh. "Conversion of sign language into text using machine learning technique." International Journal of Research in Engineering, Science and Management 4.5 (2021): 126-128.
[7] HV, Mr Nithin, et al. "Hand talk interpretation system for deaf and dumb using machine learning."
[8] Prajapati, Rupesh, et al. "Hand gesture recognition and voice conversion for deaf and dumb." International Research Journal of Engineering and Technology (IRJET) 5.4 (2018): 1373-1376.
[9] Paul, Pias, et al. "A modern approach for sign language interpretation using convolutional neural network." Pacific Rim International Conference on Artificial Intelligence. Cham: Springer International Publishing, 2019.
[10] Kumari, Sangeeta, et al. "Hand gesture-based recognition for interactive human computer using tenser-flow." International Journal of Advanced Science and Technology 29.7 (2020): 14186-14197.
[11] Yang, Jie, and Yangsheng Xu. Hidden Markov model for gesture recognition. Carnegie Mellon University, The Robotics Institute.
[12] Aslam, Shabnam Mohamed, and Shirina Samreen. "Gesture recognition algorithm for visually blind touch interaction optimization using crow search method." IEEE Access 8 (2020): 127560-127568.

Discussion