SlideShare a Scribd company logo
1 of 3
Download to read offline
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 01 | Jan 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 606
Real Time Direct Speech-to-Speech Translation
Sanchit Chaudhari1, Aniket Shukla2, Tanvi Gaware3
1-3Student, Dhole Patil College of Engineering, Pune, Maharashtra, India
----------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Speech-to-Speech translation has been developed
to ease and bridge the communication gap between people
who speak differentlanguages. Speech-to-Speechtechnologyis
effective because it allows speakers of languages from around
the world to communicate with each other by minimizing the
language gap in global commerce and cross-cultural
communication. Recent studies have recognized speech-to-
speech translation as one of the top technologies that will
transform and reform the current language barriers in our
world. Using Python and Spyder libraries development of the
voice-to-voice translation system is possible. Use of Google’s
Translate API comes in handy during the whole development
process.
Key Words: Speech-to-Speech, Language Barrier,
Translation, Python, Spyder, Voice-to-Voice, API
1. INTRODUCTION
The number of countries exchanging information continues
to rise. International visitors for tourism, emigration, or
overseas study are growing more diverse, necessitating the
development of a system that allows people who speak
different languages to communicate effectively. Automatic
spoken-to-speech translation (S2ST) considerably reduces
language barriers and bridges cross-cultural divides by
allowing people to converse in theirownlanguages.Overthe
last several decades, many researchers have been working
on constructing an S2ST system.
Traditional S2ST approaches necessitate time and effort to
build various components, such as automatic speech
recognition (ASR), machine translation (MT), and text-to-
speech (TTS) synthesis, all of which must be trained and
calibrated separately. ASR processes and transforms voice
into text in the source language, MT converts the source
language text into comparable text in the target language,
and TTS generates speech from the target language text.
1.1 Problem Statement
We propose a technique for training speech-to-speech
translation tasks without theuseoftranscriptionorlinguistic
supervision in the suggested system. To identify and localize
language, we utilized a machine learning technique and a
speech-to-speech algorithm.
1.2 Motivation
The necessity for a speech-based translation system
originates from the fact that standard Text-To-Text
translation systems disregard paralinguistic factors such as
prosody, voice, and emotion of the voice speaker. Our drive
is focused on developing a system that can best embody the
aforementioned characteristics while translating from one
language to another. Furthermore, the cascaded three-stage
machine translation systems have the potential to
exacerbate the errors that occur with each successivephase.
Speech-To-Speech systems, ontheotherhand,havea benefit
over cascaded TT systems since they use a single step
method that requires less computer power and has better
inference latency.
1.3 Objective
The main objectives of the system are: To create a login
system for the users to logon into the program, to have a
translator to translate the languages as per need ofenduser,
to have the output of the translation in form of both voice as
well as text.
2. PROPOSED FRAMEWORK
The proposedrealtimespeech-to-speechlanguagetranslator
consists of 4 modules i.e. Login, Input, Translation, Output.
Input Module: This module consistsoftakinganinputinform
of voice and sending it into the system.
Translation Module: This module does the hard work of
translating the given voice input into the desired output by
using the speech-to-speech translation technology by means
of Google’s Translate API.
Output Module: This module provides the result of work
performed by the system in the form of either text or voice.
3. SYSTEM OVERVIEW
Operating System: Windows 10+
Python: It is a high level general purpose programming
language which is significantly easytounderstandandlearn,
especially for beginners.
Spyder: It is a free to use development environment written
in python and for python.
Anaconda Navigator: Anaconda is a package and
environment manager used for python. Anaconda Navigator
is a GUI which comes pre-packed in Anaconda which allows
us to easily launch applications and manage various
packages
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 01 | Jan 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 607
OpenCV: The OpenCV provides a real time Computer Vision
libraries and functions.
Pillow: Pillow Library addsimageprocessingfunctionalityto
our python interpreter.
4. LITERATURE SURVEY
Neural Machine Translation: Dzmitry Bahdanau suggested
that the neural translation machines aims towards
constructing a single neural network which will be jointly
tuned to increasetheperformancebetweentranslation. With
this method we can gain a rapid translation which can be
compared to the existing state of art phrase based system.
Tacotron: Yuxuan Wang proposed that a text to speech
translation system usually consistsofmultiplemodulessuch
as text analysis, acoustic model, audio synthesis model, etc.
but building this requires expertise knowledge and domain
expenses. Using Tacotron we display an end to end text to
speech model that translates speech directly from any given
text.
Speech Translation without Transcription: Long Doung
suggested that there are many low resource languages do
not have any kind of orthography for transcription. Even if
we try Phonetic transcription, the expense for it is high. By
performing set of experiments using phone-to-word
alignment there was upto 24% possibility of success over
several transcription base lines.
Speech-to-SpeechTranslationforNon-transcribedLanguages:
Andros Tjandra proposed a method for developing speech-
to-speech translation without any kind of linguistic
supervision. It consists of 2 steps – 1. Monitor and generate
representation with unsupervised discovery with discrete
auto encoder. 2. Execute a sequence-to-sequencemodel that
directly matches the core language .
Language Identification for Multilingual Translation: Arun
Babhulgaonkar proposed a n-gram and machine learning
based language identifiers to identify 3 specific Indian
languages specifically Hindi, Marathi and Sanskrit which
were present in a document for translation
Multilingual Corpus for Speech Translation: Javier Iranzo-
Sanchez describedthecorpus creationprocessanddisplayed
a series of automatic speech recognition, machine
translation and language translation.
5. WORKFLOW
The central objective of Real Time Direct Speech-to-Speech
Translation System is to provide an easy to use and accurate
translation system which can present output in real time
which previously was an hard task to beachievedduetolack
of resources and transcriptions. The entire system consists
of three main layers namely The User Interface, The
Application and The Backend. Users will interact with the
system by means of the user interface. The application acts
as a bridge between the user interface and the backend
database. Backend is the database that contains all the
transcriptions of the languages that are supported in the
translation.
5.1 Users:
The two key intended users are the Admin and the end
consumer.
1. Admin has the ability to edit and update the whole
dataset that contains the resources used for
translation process. Admin can also manage the
registered users in the system.
2. End users can login to the system and use theDirect
Speech-to-Speech Application for translating any
language into their desired choice.
5.2 Admin Workflow:
The admin generally manages the whole system’s dataflow
and registers the users who are going to implement the
translation application. The admin can also manage the
whole dataset in the backend database to edit or add new
language dataset to further improve the functionality of the
application. The workflow of an admin is as following steps.
1. Start
2. Register/Login
3. Monitor and check the status of the system
4. Add/Edit Language datasets
5. Logout
6. Stop
5.3 End Consumer Workflow:
The end consumer does an initial registration by providing
their details such as their Email and a Password for security
purposes. After successfully registering into the application
the end user can then login to the system and can make use
of the Direct Speech-to-Speech translation to translate any
kind of language into their desired one. The workflow of an
end user compromises of the following steps.
1. Start
2. Register/login
3. Make use of the Speech-to-Speech Translator
4. Give input
5. Receive output
6. Logout
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 01 | Jan 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 608
7. Stop
Figure -1: System Architecture for Speech-to-Speech
Translation.
6. ALGORITHM:
Speech-to-Speech: Speech recognition is a computer science
and computational linguistics multidisciplinary topic that
develops approaches and technology that allow computers
to recognise and translate spoken language into text.
Automatic speech recognition (ASR), computer voice
recognition, and speech to text are some of the other names
for it (STT). It includes computer science, linguistics, and
computer engineering expertise and research. Some voice
recognition systems necessitate "training" (also known as
"enrolment"), in which a singlespeaker readstextorisolated
vocabulary into the system. "Speaker dependent" systems
are those that rely on training. The term "voice recognition"
or "speaker identification" refers to recognising the speaker
rather than the content of their speech. Recognizing the
speaker can help systems that have been trained on a
specific person's voice translates speech more quickly, or it
can be used to authenticate or verify the speaker's identity
as part of a security process. Speech recognition has a long
history in terms of technology, with multiple waves of key
advancements. Advances in deep learning and big data have
recently improved the field. The advancements are proven
not only by the increasing number of academic articles
published in the subject, but alsobythewidespreadindustry
adoption of a range of deep learning approaches in the
design and deployment of voice recognition systemsaround
the world.
7. FUTURE WORK
Since the technology has been under constant updates
addition of Artificial Intelligence will boost the performance
and make the flow of the system easier. Addition of multiple
outputs will be a key factor in upcoming works. Maximizing
the time of process and output can be a helpful upgrade.
8. CONCLUSION
We propose a new approach for training a speech-to-speech
translation between two languages without the need of
transcription in this system. To begin, a discrete quantized
auto encoder was trained to build a discrete representation
from the target voice features. Second, given the source
speech representation, we trained a sequence-to-sequence
model to predict the codebook sequence. Because the target
speech representations are learned and generated
unsupervised, this method can be applied to any sort of
language, with or without a written form. Our model can
perform a direct speech-to-speech translation.
REFERENCES
[1] Jan Chorowski, Dzmitry Bahdanau, Dmitriy Serdyuk,
Kyunghyun Cho, and Yoshua Bengio, “Attention-based
models for speech recognition,” in Advances in Neural
Information Processing Systems 28: Annual Conference
on Neural Information Processing Systems 2015,
December 7-12, 2015, Montreal, Quebec, Canada, 2015,
pp. 577–585.
[2] Dzmitry Bahdanau, KyunghyunCho,andYoshua Bengio,
“Neural machine translation by jointly learning to align
and translate,” CoRR, vol. abs/1409.0473, 2014.
[3] Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui
Wu, Ron J Weiss, Navdeep Jaitly, Zongheng Yang, Ying
Xiao, Zhifeng Chen, Samy Bengio, et al., “Tacotron:
Towards endto-end speech synthesis,” arXiv preprint
arXiv:1703.10135, 2017.
[4] Long Duong, Antonios Anastasopoulos, David Chiang,
Steven Bird, and Trevor Cohn, “An attentional model for
speech translation without transcription,” in NAACL
HLT 2016, The 2016 Conference of the North American
Chapter of the Association for Computational
Linguistics: Human Language Technologies, San Diego
California, USA, June 12-17, 2016, 2016, pp. 949–959.
[5] Alexandre Berard, Olivier Pietquin, Christophe Servan,
and ´ Laurent Besacier, “Listen and translate: A proof of
concept for end-to-end speech-to-text translation,”
CoRR, vol. abs/1612.01744, 2016.

More Related Content

Similar to Real Time Direct Speech-to-Speech Translation

IRJET- Communication System for Blind, Deaf and Dumb People using Internet of...
IRJET- Communication System for Blind, Deaf and Dumb People using Internet of...IRJET- Communication System for Blind, Deaf and Dumb People using Internet of...
IRJET- Communication System for Blind, Deaf and Dumb People using Internet of...IRJET Journal
 
Control mouse and computer system using voice commands
Control mouse and computer system using voice commandsControl mouse and computer system using voice commands
Control mouse and computer system using voice commandseSAT Journals
 
Towards building a message retrieval facility via telephone
Towards building a message retrieval facility via telephoneTowards building a message retrieval facility via telephone
Towards building a message retrieval facility via telephoneeSAT Journals
 
IRJET - Optical Character Recognition and Translation
IRJET -  	  Optical Character Recognition and TranslationIRJET -  	  Optical Character Recognition and Translation
IRJET - Optical Character Recognition and TranslationIRJET Journal
 
AI SMART BUDDY “MANAV”
AI SMART BUDDY “MANAV”AI SMART BUDDY “MANAV”
AI SMART BUDDY “MANAV”IRJET Journal
 
IRJET - Speech to Speech Translation using Encoder Decoder Architecture
IRJET -  	  Speech to Speech Translation using Encoder Decoder ArchitectureIRJET -  	  Speech to Speech Translation using Encoder Decoder Architecture
IRJET - Speech to Speech Translation using Encoder Decoder ArchitectureIRJET Journal
 
A study on the techniques for speech to speech translation
A study on the techniques for speech to speech translationA study on the techniques for speech to speech translation
A study on the techniques for speech to speech translationIRJET Journal
 
IRJET- Transcription of Conferences
IRJET- Transcription of ConferencesIRJET- Transcription of Conferences
IRJET- Transcription of ConferencesIRJET Journal
 
HAND GESTURE RECOGNITION SYSTEM FOR DUMB AND DEAF PEOPLE
HAND GESTURE RECOGNITION SYSTEM FOR DUMB AND DEAF PEOPLEHAND GESTURE RECOGNITION SYSTEM FOR DUMB AND DEAF PEOPLE
HAND GESTURE RECOGNITION SYSTEM FOR DUMB AND DEAF PEOPLEIRJET Journal
 
IRJET - Voice based Email for Visually Challenged People
IRJET - Voice based Email for Visually Challenged PeopleIRJET - Voice based Email for Visually Challenged People
IRJET - Voice based Email for Visually Challenged PeopleIRJET Journal
 
IRJET - V-IDE: Voice Controlled IDE using Natural Language Processing and...
IRJET -  	  V-IDE: Voice Controlled IDE using Natural Language Processing and...IRJET -  	  V-IDE: Voice Controlled IDE using Natural Language Processing and...
IRJET - V-IDE: Voice Controlled IDE using Natural Language Processing and...IRJET Journal
 
Speech To Speech Translation
Speech To Speech TranslationSpeech To Speech Translation
Speech To Speech TranslationIRJET Journal
 
IRJET - Speech Recognition using Android
IRJET -  	  Speech Recognition using AndroidIRJET -  	  Speech Recognition using Android
IRJET - Speech Recognition using AndroidIRJET Journal
 
Voice Based Email System For Visual Impaired Using AI
Voice Based Email System For Visual Impaired Using AIVoice Based Email System For Visual Impaired Using AI
Voice Based Email System For Visual Impaired Using AIIRJET Journal
 
IRJET- Voice based Billing System
IRJET-  	  Voice based Billing SystemIRJET-  	  Voice based Billing System
IRJET- Voice based Billing SystemIRJET Journal
 
IRJET - Hand Gestures Recognition using Deep Learning
IRJET -  	  Hand Gestures Recognition using Deep LearningIRJET -  	  Hand Gestures Recognition using Deep Learning
IRJET - Hand Gestures Recognition using Deep LearningIRJET Journal
 
IRJET- Online Compiler for Computer Languages with Security Editor
IRJET-  	  Online Compiler for Computer Languages with Security EditorIRJET-  	  Online Compiler for Computer Languages with Security Editor
IRJET- Online Compiler for Computer Languages with Security EditorIRJET Journal
 
IRJET-Raspberry Pi based Reader for Blind People
IRJET-Raspberry Pi based Reader for Blind PeopleIRJET-Raspberry Pi based Reader for Blind People
IRJET-Raspberry Pi based Reader for Blind PeopleIRJET Journal
 
LANGUAGE TRANSLATOR APP
LANGUAGE TRANSLATOR APPLANGUAGE TRANSLATOR APP
LANGUAGE TRANSLATOR APPIRJET Journal
 
SURVEY ON SMART VIRTUAL VOICE ASSISTANT
SURVEY ON SMART VIRTUAL VOICE ASSISTANTSURVEY ON SMART VIRTUAL VOICE ASSISTANT
SURVEY ON SMART VIRTUAL VOICE ASSISTANTIRJET Journal
 

Similar to Real Time Direct Speech-to-Speech Translation (20)

IRJET- Communication System for Blind, Deaf and Dumb People using Internet of...
IRJET- Communication System for Blind, Deaf and Dumb People using Internet of...IRJET- Communication System for Blind, Deaf and Dumb People using Internet of...
IRJET- Communication System for Blind, Deaf and Dumb People using Internet of...
 
Control mouse and computer system using voice commands
Control mouse and computer system using voice commandsControl mouse and computer system using voice commands
Control mouse and computer system using voice commands
 
Towards building a message retrieval facility via telephone
Towards building a message retrieval facility via telephoneTowards building a message retrieval facility via telephone
Towards building a message retrieval facility via telephone
 
IRJET - Optical Character Recognition and Translation
IRJET -  	  Optical Character Recognition and TranslationIRJET -  	  Optical Character Recognition and Translation
IRJET - Optical Character Recognition and Translation
 
AI SMART BUDDY “MANAV”
AI SMART BUDDY “MANAV”AI SMART BUDDY “MANAV”
AI SMART BUDDY “MANAV”
 
IRJET - Speech to Speech Translation using Encoder Decoder Architecture
IRJET -  	  Speech to Speech Translation using Encoder Decoder ArchitectureIRJET -  	  Speech to Speech Translation using Encoder Decoder Architecture
IRJET - Speech to Speech Translation using Encoder Decoder Architecture
 
A study on the techniques for speech to speech translation
A study on the techniques for speech to speech translationA study on the techniques for speech to speech translation
A study on the techniques for speech to speech translation
 
IRJET- Transcription of Conferences
IRJET- Transcription of ConferencesIRJET- Transcription of Conferences
IRJET- Transcription of Conferences
 
HAND GESTURE RECOGNITION SYSTEM FOR DUMB AND DEAF PEOPLE
HAND GESTURE RECOGNITION SYSTEM FOR DUMB AND DEAF PEOPLEHAND GESTURE RECOGNITION SYSTEM FOR DUMB AND DEAF PEOPLE
HAND GESTURE RECOGNITION SYSTEM FOR DUMB AND DEAF PEOPLE
 
IRJET - Voice based Email for Visually Challenged People
IRJET - Voice based Email for Visually Challenged PeopleIRJET - Voice based Email for Visually Challenged People
IRJET - Voice based Email for Visually Challenged People
 
IRJET - V-IDE: Voice Controlled IDE using Natural Language Processing and...
IRJET -  	  V-IDE: Voice Controlled IDE using Natural Language Processing and...IRJET -  	  V-IDE: Voice Controlled IDE using Natural Language Processing and...
IRJET - V-IDE: Voice Controlled IDE using Natural Language Processing and...
 
Speech To Speech Translation
Speech To Speech TranslationSpeech To Speech Translation
Speech To Speech Translation
 
IRJET - Speech Recognition using Android
IRJET -  	  Speech Recognition using AndroidIRJET -  	  Speech Recognition using Android
IRJET - Speech Recognition using Android
 
Voice Based Email System For Visual Impaired Using AI
Voice Based Email System For Visual Impaired Using AIVoice Based Email System For Visual Impaired Using AI
Voice Based Email System For Visual Impaired Using AI
 
IRJET- Voice based Billing System
IRJET-  	  Voice based Billing SystemIRJET-  	  Voice based Billing System
IRJET- Voice based Billing System
 
IRJET - Hand Gestures Recognition using Deep Learning
IRJET -  	  Hand Gestures Recognition using Deep LearningIRJET -  	  Hand Gestures Recognition using Deep Learning
IRJET - Hand Gestures Recognition using Deep Learning
 
IRJET- Online Compiler for Computer Languages with Security Editor
IRJET-  	  Online Compiler for Computer Languages with Security EditorIRJET-  	  Online Compiler for Computer Languages with Security Editor
IRJET- Online Compiler for Computer Languages with Security Editor
 
IRJET-Raspberry Pi based Reader for Blind People
IRJET-Raspberry Pi based Reader for Blind PeopleIRJET-Raspberry Pi based Reader for Blind People
IRJET-Raspberry Pi based Reader for Blind People
 
LANGUAGE TRANSLATOR APP
LANGUAGE TRANSLATOR APPLANGUAGE TRANSLATOR APP
LANGUAGE TRANSLATOR APP
 
SURVEY ON SMART VIRTUAL VOICE ASSISTANT
SURVEY ON SMART VIRTUAL VOICE ASSISTANTSURVEY ON SMART VIRTUAL VOICE ASSISTANT
SURVEY ON SMART VIRTUAL VOICE ASSISTANT
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSRajkumarAkumalla
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 

Recently uploaded (20)

UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 

Real Time Direct Speech-to-Speech Translation

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 01 | Jan 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 606 Real Time Direct Speech-to-Speech Translation Sanchit Chaudhari1, Aniket Shukla2, Tanvi Gaware3 1-3Student, Dhole Patil College of Engineering, Pune, Maharashtra, India ----------------------------------------------------------------------***--------------------------------------------------------------------- Abstract - Speech-to-Speech translation has been developed to ease and bridge the communication gap between people who speak differentlanguages. Speech-to-Speechtechnologyis effective because it allows speakers of languages from around the world to communicate with each other by minimizing the language gap in global commerce and cross-cultural communication. Recent studies have recognized speech-to- speech translation as one of the top technologies that will transform and reform the current language barriers in our world. Using Python and Spyder libraries development of the voice-to-voice translation system is possible. Use of Google’s Translate API comes in handy during the whole development process. Key Words: Speech-to-Speech, Language Barrier, Translation, Python, Spyder, Voice-to-Voice, API 1. INTRODUCTION The number of countries exchanging information continues to rise. International visitors for tourism, emigration, or overseas study are growing more diverse, necessitating the development of a system that allows people who speak different languages to communicate effectively. Automatic spoken-to-speech translation (S2ST) considerably reduces language barriers and bridges cross-cultural divides by allowing people to converse in theirownlanguages.Overthe last several decades, many researchers have been working on constructing an S2ST system. Traditional S2ST approaches necessitate time and effort to build various components, such as automatic speech recognition (ASR), machine translation (MT), and text-to- speech (TTS) synthesis, all of which must be trained and calibrated separately. ASR processes and transforms voice into text in the source language, MT converts the source language text into comparable text in the target language, and TTS generates speech from the target language text. 1.1 Problem Statement We propose a technique for training speech-to-speech translation tasks without theuseoftranscriptionorlinguistic supervision in the suggested system. To identify and localize language, we utilized a machine learning technique and a speech-to-speech algorithm. 1.2 Motivation The necessity for a speech-based translation system originates from the fact that standard Text-To-Text translation systems disregard paralinguistic factors such as prosody, voice, and emotion of the voice speaker. Our drive is focused on developing a system that can best embody the aforementioned characteristics while translating from one language to another. Furthermore, the cascaded three-stage machine translation systems have the potential to exacerbate the errors that occur with each successivephase. Speech-To-Speech systems, ontheotherhand,havea benefit over cascaded TT systems since they use a single step method that requires less computer power and has better inference latency. 1.3 Objective The main objectives of the system are: To create a login system for the users to logon into the program, to have a translator to translate the languages as per need ofenduser, to have the output of the translation in form of both voice as well as text. 2. PROPOSED FRAMEWORK The proposedrealtimespeech-to-speechlanguagetranslator consists of 4 modules i.e. Login, Input, Translation, Output. Input Module: This module consistsoftakinganinputinform of voice and sending it into the system. Translation Module: This module does the hard work of translating the given voice input into the desired output by using the speech-to-speech translation technology by means of Google’s Translate API. Output Module: This module provides the result of work performed by the system in the form of either text or voice. 3. SYSTEM OVERVIEW Operating System: Windows 10+ Python: It is a high level general purpose programming language which is significantly easytounderstandandlearn, especially for beginners. Spyder: It is a free to use development environment written in python and for python. Anaconda Navigator: Anaconda is a package and environment manager used for python. Anaconda Navigator is a GUI which comes pre-packed in Anaconda which allows us to easily launch applications and manage various packages
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 01 | Jan 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 607 OpenCV: The OpenCV provides a real time Computer Vision libraries and functions. Pillow: Pillow Library addsimageprocessingfunctionalityto our python interpreter. 4. LITERATURE SURVEY Neural Machine Translation: Dzmitry Bahdanau suggested that the neural translation machines aims towards constructing a single neural network which will be jointly tuned to increasetheperformancebetweentranslation. With this method we can gain a rapid translation which can be compared to the existing state of art phrase based system. Tacotron: Yuxuan Wang proposed that a text to speech translation system usually consistsofmultiplemodulessuch as text analysis, acoustic model, audio synthesis model, etc. but building this requires expertise knowledge and domain expenses. Using Tacotron we display an end to end text to speech model that translates speech directly from any given text. Speech Translation without Transcription: Long Doung suggested that there are many low resource languages do not have any kind of orthography for transcription. Even if we try Phonetic transcription, the expense for it is high. By performing set of experiments using phone-to-word alignment there was upto 24% possibility of success over several transcription base lines. Speech-to-SpeechTranslationforNon-transcribedLanguages: Andros Tjandra proposed a method for developing speech- to-speech translation without any kind of linguistic supervision. It consists of 2 steps – 1. Monitor and generate representation with unsupervised discovery with discrete auto encoder. 2. Execute a sequence-to-sequencemodel that directly matches the core language . Language Identification for Multilingual Translation: Arun Babhulgaonkar proposed a n-gram and machine learning based language identifiers to identify 3 specific Indian languages specifically Hindi, Marathi and Sanskrit which were present in a document for translation Multilingual Corpus for Speech Translation: Javier Iranzo- Sanchez describedthecorpus creationprocessanddisplayed a series of automatic speech recognition, machine translation and language translation. 5. WORKFLOW The central objective of Real Time Direct Speech-to-Speech Translation System is to provide an easy to use and accurate translation system which can present output in real time which previously was an hard task to beachievedduetolack of resources and transcriptions. The entire system consists of three main layers namely The User Interface, The Application and The Backend. Users will interact with the system by means of the user interface. The application acts as a bridge between the user interface and the backend database. Backend is the database that contains all the transcriptions of the languages that are supported in the translation. 5.1 Users: The two key intended users are the Admin and the end consumer. 1. Admin has the ability to edit and update the whole dataset that contains the resources used for translation process. Admin can also manage the registered users in the system. 2. End users can login to the system and use theDirect Speech-to-Speech Application for translating any language into their desired choice. 5.2 Admin Workflow: The admin generally manages the whole system’s dataflow and registers the users who are going to implement the translation application. The admin can also manage the whole dataset in the backend database to edit or add new language dataset to further improve the functionality of the application. The workflow of an admin is as following steps. 1. Start 2. Register/Login 3. Monitor and check the status of the system 4. Add/Edit Language datasets 5. Logout 6. Stop 5.3 End Consumer Workflow: The end consumer does an initial registration by providing their details such as their Email and a Password for security purposes. After successfully registering into the application the end user can then login to the system and can make use of the Direct Speech-to-Speech translation to translate any kind of language into their desired one. The workflow of an end user compromises of the following steps. 1. Start 2. Register/login 3. Make use of the Speech-to-Speech Translator 4. Give input 5. Receive output 6. Logout
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 01 | Jan 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 608 7. Stop Figure -1: System Architecture for Speech-to-Speech Translation. 6. ALGORITHM: Speech-to-Speech: Speech recognition is a computer science and computational linguistics multidisciplinary topic that develops approaches and technology that allow computers to recognise and translate spoken language into text. Automatic speech recognition (ASR), computer voice recognition, and speech to text are some of the other names for it (STT). It includes computer science, linguistics, and computer engineering expertise and research. Some voice recognition systems necessitate "training" (also known as "enrolment"), in which a singlespeaker readstextorisolated vocabulary into the system. "Speaker dependent" systems are those that rely on training. The term "voice recognition" or "speaker identification" refers to recognising the speaker rather than the content of their speech. Recognizing the speaker can help systems that have been trained on a specific person's voice translates speech more quickly, or it can be used to authenticate or verify the speaker's identity as part of a security process. Speech recognition has a long history in terms of technology, with multiple waves of key advancements. Advances in deep learning and big data have recently improved the field. The advancements are proven not only by the increasing number of academic articles published in the subject, but alsobythewidespreadindustry adoption of a range of deep learning approaches in the design and deployment of voice recognition systemsaround the world. 7. FUTURE WORK Since the technology has been under constant updates addition of Artificial Intelligence will boost the performance and make the flow of the system easier. Addition of multiple outputs will be a key factor in upcoming works. Maximizing the time of process and output can be a helpful upgrade. 8. CONCLUSION We propose a new approach for training a speech-to-speech translation between two languages without the need of transcription in this system. To begin, a discrete quantized auto encoder was trained to build a discrete representation from the target voice features. Second, given the source speech representation, we trained a sequence-to-sequence model to predict the codebook sequence. Because the target speech representations are learned and generated unsupervised, this method can be applied to any sort of language, with or without a written form. Our model can perform a direct speech-to-speech translation. REFERENCES [1] Jan Chorowski, Dzmitry Bahdanau, Dmitriy Serdyuk, Kyunghyun Cho, and Yoshua Bengio, “Attention-based models for speech recognition,” in Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montreal, Quebec, Canada, 2015, pp. 577–585. [2] Dzmitry Bahdanau, KyunghyunCho,andYoshua Bengio, “Neural machine translation by jointly learning to align and translate,” CoRR, vol. abs/1409.0473, 2014. [3] Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron J Weiss, Navdeep Jaitly, Zongheng Yang, Ying Xiao, Zhifeng Chen, Samy Bengio, et al., “Tacotron: Towards endto-end speech synthesis,” arXiv preprint arXiv:1703.10135, 2017. [4] Long Duong, Antonios Anastasopoulos, David Chiang, Steven Bird, and Trevor Cohn, “An attentional model for speech translation without transcription,” in NAACL HLT 2016, The 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego California, USA, June 12-17, 2016, 2016, pp. 949–959. [5] Alexandre Berard, Olivier Pietquin, Christophe Servan, and ´ Laurent Besacier, “Listen and translate: A proof of concept for end-to-end speech-to-text translation,” CoRR, vol. abs/1612.01744, 2016.