Google Voice-to-text

Trần Hữu Tuấn

Google API

Why this seminar?
- Speech recognition is one of the fastest-growing engineering
technologies.
- Nearly 20% of the world's people live with some form of
disability; many of them are blind or unable to use their
hands effectively. They can share information with others by
operating a computer through voice input.
- The application presented in this seminar can recognize speech
and convert the input audio into text; it also lets a user perform
operations by voice, such as opening Calculator, WordPad, or
Notepad, or logging off the computer.
- Speech recognition also has powerful applications in entertainment.

Applications
● In-car systems
● Health care
● Military
● Training air traffic controllers
● Telephony and other domains
● Usage in education and daily life
● Entertainment

Performance
The performance of speech recognition systems is usually evaluated in terms of
accuracy and speed. Accuracy is usually rated with word error rate (WER), whereas
speed is measured with the real-time factor (RTF). Other measures of accuracy include
Single Word Error Rate (SWER) and Command Success Rate (CSR).
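
As a rough illustration of the two main measures, the sketch below computes WER as the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words, and the real-time factor as processing time divided by audio duration. The function names and the sample sentences are made up for this example; real evaluations usually also normalize case and punctuation before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF below 1.0 means the recognizer runs faster than real time."""
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    print(word_error_rate("open the calculator", "open calculator"))
    print(real_time_factor(processing_seconds=2.0, audio_seconds=10.0))
```

A lower WER means higher accuracy, and the value can exceed 1.0 when the hypothesis contains many inserted words.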

Accuracy
The accuracy of speech recognition varies with the following:
● Vocabulary size and confusability
● Speaker dependence vs. independence
● Isolated, discontinuous, or continuous speech
● Task and language constraints
● Read vs. spontaneous speech

Acoustic Model
An acoustic model is created by taking audio recordings of speech, and their text transcriptions, and using
software to create statistical representations of the sounds that make up each word. It is used by a speech
recognition engine to recognize speech.

Language Model
A language model is a file containing the probabilities of sequences of words. Language models are used
for dictation applications, whereas grammars are used in desktop command and control or telephony
interactive voice response (IVR) type applications.
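
To make "probabilities of sequences of words" concrete, here is a minimal sketch of a bigram language model estimated by simple counting. The three-sentence corpus, the `<s>` and `</s>` boundary markers, and the function names are assumptions for illustration only; production language models are trained on far larger corpora and use smoothing so that unseen word pairs do not get zero probability.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Estimate P(next_word | word) by maximum likelihood from raw counts."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        context_counts.update(words[:-1])          # every word that has a successor
        bigram_counts.update(zip(words, words[1:]))
    return {
        pair: count / context_counts[pair[0]]
        for pair, count in bigram_counts.items()
    }


def sentence_probability(model, sentence):
    """Multiply the bigram probabilities along the sentence (0.0 if a pair is unseen)."""
    words = ["<s>"] + sentence.lower().split() + ["</s>"]
    probability = 1.0
    for pair in zip(words, words[1:]):
        probability *= model.get(pair, 0.0)
    return probability


if __name__ == "__main__":
    corpus = ["open the calculator", "open the notepad", "close the notepad"]
    model = train_bigram_model(corpus)
    print(sentence_probability(model, "open the notepad"))
    print(sentence_probability(model, "the open"))
```

A recognizer can use such scores to prefer a likely word sequence over an acoustically similar but improbable one.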

Speech Engine
A speech engine is software that gives your computer the ability to play back text in a spoken voice
(referred to as text-to-speech or TTS).

Powerful Speech Recognition with Google Cloud
Google Cloud Speech API enables developers to convert audio to
text by applying powerful neural network models in an easy-to-use
API. The API recognizes over 110 languages and variants, to
support your global user base. You can transcribe the speech of
users dictating to an application's microphone, enable
command-and-control through voice, or transcribe audio files, among
many other use cases. You can recognize audio uploaded in the request
or integrate with your audio storage on Google Cloud Storage,
using the same technology Google uses to power its own products.
https://cloud.google.com/speech/
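
The two input options described above (audio uploaded in the request, or audio stored on Google Cloud Storage) can be sketched as a JSON request body. This example only builds the body and does not send it. The endpoint and field names follow the v1 REST interface as I understand it and should be checked against the current documentation; the helper name, the default sample rate, the LINEAR16 encoding, and the bucket path are assumptions for illustration.

```python
import base64
import json

# Assumed v1 REST endpoint; an authenticated POST of the body below would be needed.
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_bytes=None, gcs_uri=None,
                            language_code="en-US", sample_rate_hertz=16000):
    """Build a recognize request body from inline audio or a Cloud Storage URI."""
    if (audio_bytes is None) == (gcs_uri is None):
        raise ValueError("provide exactly one of audio_bytes or gcs_uri")

    if gcs_uri is not None:
        audio = {"uri": gcs_uri}  # audio already stored in a bucket
    else:
        # Inline audio is sent as base64 text inside the JSON body.
        audio = {"content": base64.b64encode(audio_bytes).decode("ascii")}

    return {
        "config": {
            "encoding": "LINEAR16",  # assumes uncompressed 16-bit PCM audio
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        "audio": audio,
    }


if __name__ == "__main__":
    # Hypothetical bucket and file name, used only to show the shape of the body.
    body = build_recognize_request(gcs_uri="gs://example-bucket/seminar.wav",
                                   language_code="vi-VN")
    print(json.dumps(body, indent=2))
```

Changing `language_code` is how a request would select one of the supported languages and variants.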

Demo and Q&A
Thank you <3
Reference application: autosub, https://github.com/agermanidis/autosub

1. Google Voice-to-text November 13, 2017

6. System block diagram

11. Apply the API to create subtitles for video
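
One way to picture the subtitle step: once a recognizer has returned text for timed chunks of a video's audio track, the chunks can be written out in the SubRip (SRT) subtitle format. The sketch below is a generic illustration under that assumption, not a description of how any particular tool works; the segment tuples and function names are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Turn (start_seconds, end_seconds, text) tuples into SRT subtitle text."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical recognition results for two audio chunks.
    recognized = [(0.0, 1.5, "Hello"), (1.5, 3.0, "world")]
    print(to_srt(recognized))
```

The resulting text can be saved with an `.srt` extension next to the video file so that a media player can load it.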
