Speech Processing Lab
Lab 1: Introduction to Speech Recognition
By Mohamed Essam
Introduction to Speech recognition
 “Hey Google. What’s the weather like today?”
This will sound familiar to anyone who has owned a smartphone in the
last decade.
 I can’t remember the last time I took the time to type out the entire
query on Google Search. I simply ask the question – and Google lays
out the entire weather pattern for me.
 It saves me a ton of time and I can quickly glance at my screen and
get back to work. A win-win for everyone! But how does Google
understand what I’m saying? And how does Google’s system convert
my query into text on my phone’s screen?
 This is where the beauty of speech-to-text models comes in. Google
uses a mix of deep learning and Natural Language Processing (NLP)
techniques to parse through our query, retrieve the answer and
present it in the form of both audio and text.
 The same speech-to-text concept is used in all the other popular
speech recognition technologies out there, such as Amazon’s Alexa,
Apple’s Siri, and so on. The implementation details might vary from
company to company, but the overall idea remains the same.
A Brief History of Speech Recognition
through the Decades
The exploration of speech recognition goes all the way back to the 1950s.
 The first speech recognition system, Audrey, was developed back
in 1952 by three Bell Labs researchers. Audrey was designed to
recognize only digits.
 Just 10 years later, IBM introduced its first speech recognition
system, the IBM Shoebox, which was capable of recognizing 16 words
including digits. It could identify commands like “Five plus three plus
eight plus six plus four minus nine, total,” and would print out the
correct answer, i.e., 17.
 The Defense Advanced Research Projects Agency (DARPA)
contributed a lot to speech recognition technology during the 1970s.
From 1971 to 1976, DARPA funded a program called Speech
Understanding Research, which led to the development of Harpy, a
system able to recognize 1011 words. It was quite a big achievement
at that time.
 In the 1980s, the Hidden Markov Model (HMM) was applied to
speech recognition systems. An HMM is a statistical model used for
problems that involve sequential information. It has a
pretty good track record in many real-world applications, including
speech recognition.
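To make the idea of modelling sequential information concrete, here is a minimal sketch of the forward algorithm, which computes how likely an observation sequence is under an HMM. It is not taken from the slides: the two hidden states, the two observation symbols, and every probability below are made-up values chosen only for illustration.

```python
# Toy HMM: all states, symbols, and probabilities are hypothetical.
STATES = ("A", "B")
START_P = {"A": 0.6, "B": 0.4}                    # P(first state)
TRANS_P = {"A": {"A": 0.7, "B": 0.3},             # P(next state | current state)
           "B": {"A": 0.4, "B": 0.6}}
EMIT_P = {"A": {"x": 0.5, "y": 0.5},              # P(observation | state)
          "B": {"x": 0.1, "y": 0.9}}

def forward(observations, states=STATES, start_p=START_P,
            trans_p=TRANS_P, emit_p=EMIT_P):
    """Return P(observations) by summing over all hidden state paths."""
    # alpha[s] = probability of the observations so far, ending in state s
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: sum(alpha[prev] * trans_p[prev][s] for prev in states)
               * emit_p[s][obs]
            for s in states
        }
    return sum(alpha.values())

likelihood = forward(["x", "y"])
print(likelihood)
```

In a speech recognizer the hidden states would correspond to units such as phonemes and the observations to acoustic features, but that mapping is beyond what this toy example shows.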
 In 2001, Google introduced the Voice Search application, which
allowed users to search for queries by speaking to the
machine. It was the first voice-enabled application to become
very popular, and it made conversation between
people and machines a lot easier.
 In 2011, Apple launched Siri, which offered a real-time, faster, and
easier way to interact with Apple devices by just using your
voice. As of now, Amazon’s Alexa and Google Home are among the
most popular voice-command-based virtual assistants and are
widely used by consumers across the globe.
Speech Processing Lab
Lab 2: Introduction to Signal Processing
By Mohamed Essam
“Before diving into speech recognition, let us first
understand some common terms and parameters of a
signal.”
What is an Audio Signal?
This is pretty intuitive – any object that vibrates produces sound waves.
 Have you ever thought about how we are able to hear someone’s
voice? It is due to sound waves. Let’s quickly understand the
process behind it.
 When an object vibrates, the air molecules oscillate to and fro about
their rest position and transmit their energy to neighboring molecules.
This transfer of energy from one molecule to the next
is what produces a sound wave.
Parameters of an audio signal
Amplitude: Amplitude refers to the maximum displacement of the air molecules
from the rest position.
Crest and Trough: The crest is the highest point in the wave, whereas the trough is
the lowest point.
Wavelength: The distance between two successive crests or troughs is known as the
wavelength.
Cycle: Every audio signal travels in the form of cycles. One complete upward
movement and downward movement of the signal forms a cycle.
Frequency: Frequency refers to how fast a signal is changing over time, that is,
the number of cycles it completes per second, measured in hertz (Hz).
[GIF: a high-frequency signal compared with a low-frequency signal]
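The parameters above can be illustrated with a pure sine tone. The following is a minimal sketch rather than anything from the original slides: the function names, the 100 Hz and 1000 Hz example tones, the amplitude of 0.5, and the speed of sound of 343 m/s are all assumptions made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed value)

def sine_value(amplitude, frequency_hz, t):
    """Displacement of a pure tone at time t, in seconds."""
    return amplitude * math.sin(2 * math.pi * frequency_hz * t)

def period(frequency_hz):
    """Duration of one complete cycle, in seconds."""
    return 1.0 / frequency_hz

def wavelength(frequency_hz, speed=SPEED_OF_SOUND):
    """Distance between two successive crests, in metres."""
    return speed / frequency_hz

for freq in (100.0, 1000.0):            # a low and a high frequency tone
    crest_t = period(freq) / 4          # crest: a quarter of the way into a cycle
    trough_t = 3 * period(freq) / 4     # trough: three quarters of the way in
    print(freq, period(freq), wavelength(freq),
          sine_value(0.5, freq, crest_t), sine_value(0.5, freq, trough_t))
```

The crest and trough values equal plus and minus the amplitude, and the higher-frequency tone has both a shorter cycle and a shorter wavelength.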
Different types of signals
Broadly, we come across two different types of signals in our
day-to-day life: digital and analog.
 A digital signal is a discrete representation of a signal over a period
of time. Here, a finite number of samples exists between any two
points in time.
 An analog signal is a continuous representation of a signal over a
period of time. In an analog signal, an infinite number of samples
exists between any two points in time.
What is sampling the signal and
why is it required?
 An audio signal is a continuous representation of amplitude as it
varies with time.
Here, time can be resolved as finely as picoseconds, which is why an audio signal is an
analog signal.
 Analog signals are memory hungry, since they have an infinite
number of samples, and processing them is computationally
demanding. Therefore, we need a technique to convert analog
signals to digital signals so that we can work with them easily.
 Sampling the signal is the process of converting an analog signal to
a digital signal by selecting a certain number of samples per second
(the sampling rate) from the analog signal. Can you see what we are doing here? We
are converting a continuous audio signal to a discrete signal through sampling
so that it can be stored and processed efficiently in memory.
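Sampling can be sketched in a few lines. This example is an illustration added here, not part of the slides: `analog_signal` is a hypothetical stand-in for a continuous wave (a 5 Hz sine), and the sampling rates and the 16-bit sample size are assumed values.

```python
import math

def analog_signal(t):
    """Hypothetical continuous signal: a 5 Hz sine wave, defined for any t."""
    return math.sin(2 * math.pi * 5.0 * t)

def sample(signal, sample_rate_hz, duration_s):
    """Read the signal at evenly spaced instants, sample_rate_hz times per second."""
    n_samples = int(sample_rate_hz * duration_s)
    return [signal(n / sample_rate_hz) for n in range(n_samples)]

# One second of the continuous wave becomes a finite list of numbers.
samples = sample(analog_signal, sample_rate_hz=100, duration_s=1.0)
print(len(samples))

# Storage needed for one second of mono audio, assuming 16 000 samples
# per second and 2 bytes (16 bits) per sample.
bytes_per_second = 16_000 * 2
print(bytes_per_second)
```

A higher sampling rate follows the original wave more closely but needs proportionally more memory, which is the trade-off the sampling rate controls.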
Any Questions?
Mohamed Essam
THANKS!
Contacts
Mhmd96.essam@gmail.com
Speech Analysis

  • 1. Speech Processing Lab Lab1:Introduction Speech Recognition By. Mohamed Essam
  • 2. Introduction to Speech recognition  “Hey Google. What’s the weather like today?” This will sound familiar to anyone who has owned a smartphone in the last decade.  I can’t remember the last time I took the time to type out the entire query on Google Search. I simply ask the question – and Google lays out the entire weather pattern for me.
  • 3. Introduction to Speech recognition  It saves me a ton of time and I can quickly glance at my screen and get back to work. A win-win for everyone! But how does Google understand what I’m saying? And how does Google’s system convert my query into text on my phone’s screen?  This is where the beauty of speech-to-text models comes in. Google uses a mix of deep learning and Natural Language Processing (NLP) techniques to parse through our query, retrieve the answer and present it in the form of both audio and text.
  • 4. Introduction to Speech recognition  The same speech-to-text concept is used in all the other popular speech recognition technologies out there, such as Amazon’s Alexa, Apple’s Siri, and so on. The semantics might vary from company to company, but the overall idea remains the same.
  • 5. A Brief History of Speech Recognition through the Decades The exploration of speech recognition goes way back to the 1950s?
  • 6. A Brief History of Speech Recognition through the Decades. The first speech recognition system, Audrey, was developed in 1952 by three Bell Labs researchers. Audrey was designed to recognize only digits. About ten years later, IBM introduced its first speech recognition system, the IBM Shoebox, which could recognize 16 words, including digits. It could follow commands like “Five plus three plus eight plus six plus four minus nine, total,” and would print out the correct answer, 17.
  • 7. A Brief History of Speech Recognition through the Decades. The Defense Advanced Research Projects Agency (DARPA) contributed a great deal to speech recognition technology during the 1970s. From 1971 to 1976 it funded a program called Speech Understanding Research, which led to Harpy, a system able to recognize 1011 words. That was a major achievement at the time. In the 1980s, the Hidden Markov Model (HMM) was applied to speech recognition. An HMM is a statistical model for problems that involve sequential information, and it has a strong track record in many real-world applications, including speech recognition.
  • 8. A Brief History of Speech Recognition through the Decades. In 2001, Google introduced the Voice Search application, which allowed users to search by speaking to the machine. It was the first voice-enabled application to become very popular, and it made interaction between people and machines much easier. In 2011, Apple launched Siri, which offered a real-time, faster, and easier way to interact with Apple devices using just your voice. Today, Amazon’s Alexa and Google Home are among the most popular voice-command virtual assistants, widely used by consumers across the globe.
  • 9. Speech Processing Lab. Lab 2: Introduction to Signal Processing. By Mohamed Essam
  • 10. “Before diving into speech recognition, let us first understand some common terms and parameters of a signal.”
  • 11. What is an Audio Signal? This is pretty intuitive: any object that vibrates produces sound waves. Have you ever thought about how we are able to hear someone’s voice? It is due to audio waves. Let’s quickly understand the process behind it. When an object vibrates, the surrounding air molecules oscillate to and fro about their rest position and pass their energy on to neighboring molecules. This transfer of energy from one molecule to the next is what produces a sound wave.
  • 12. What is an Audio Signal?
  • 13. Parameters of an audio signal. Amplitude: the maximum displacement of the air molecules from the rest position. Crest and trough: the crest is the highest point in the wave, whereas the trough is the lowest point. Wavelength: the distance between two successive crests or two successive troughs.
  • 14. Parameters of an audio signal. Cycle: every audio signal travels in the form of cycles. One complete upward movement and one complete downward movement of the signal form a cycle. Frequency: the number of cycles the signal completes per second, measured in hertz (Hz); it describes how fast the signal is changing over time. (GIF: comparison of a high-frequency and a low-frequency signal.)
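The parameters on slides 13 and 14 can be made concrete with a pure tone. The sketch below is only an illustration, not part of the original lab: the function names, the 440 Hz tone, the amplitude of 0.5, and the speed of sound of 343 m/s (air at roughly 20 °C) are all assumptions chosen for the example.

```python
import math

# Assumed speed of sound in air at about 20 degrees Celsius, in m/s.
SPEED_OF_SOUND = 343.0


def sine_wave(amplitude, frequency_hz, t):
    """Displacement of a pure tone at time t (in seconds)."""
    return amplitude * math.sin(2 * math.pi * frequency_hz * t)


def period(frequency_hz):
    """Duration of one cycle, in seconds."""
    return 1.0 / frequency_hz


def wavelength(frequency_hz, speed=SPEED_OF_SOUND):
    """Distance between two successive crests, in meters."""
    return speed / frequency_hz


# Hypothetical tone used only for illustration.
amplitude = 0.5
frequency_hz = 440.0

# For a sine starting at zero, the crest sits a quarter of a cycle in
# and the trough three quarters of a cycle in.
crest = sine_wave(amplitude, frequency_hz, 0.25 * period(frequency_hz))
trough = sine_wave(amplitude, frequency_hz, 0.75 * period(frequency_hz))

print(f"period     = {period(frequency_hz):.6f} s")
print(f"wavelength = {wavelength(frequency_hz):.3f} m")
print(f"crest      = {crest:+.3f}, trough = {trough:+.3f}")
```

The crest and trough come out at plus and minus the amplitude, and a higher frequency gives both a shorter period and a shorter wavelength.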
  • 15. Different types of signals. We come across broadly two different types of signals in our day-to-day life: digital and analog. A digital signal is a discrete representation of a signal over a period of time. Only a finite number of samples exists between any two points in time. An analog signal is a continuous representation of a signal over a period of time. An infinite number of samples exists between any two points in time.
  • 16. What is sampling the signal and why is it required? An audio signal is a continuous representation of amplitude as it varies with time. Time can be resolved arbitrarily finely, even down to picoseconds, which is why an audio signal is an analog signal. Analog signals are memory hungry because they contain an infinite number of samples, and processing them is computationally very demanding. Therefore, we need a technique to convert analog signals to digital signals so that we can work with them easily.
  • 17. What is sampling the signal and why is it required? Sampling is the process of converting an analog signal to a digital signal by selecting a certain number of samples per second from the analog signal. Can you see what we are doing here? We are converting an audio signal to a discrete signal through sampling so that it can be stored and processed efficiently in memory.
  • 18. What is sampling the signal and why is it required?
  • 19. What is sampling the signal and why is it required?
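The sampling idea from slides 16 and 17 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the lab's own code: the 5 Hz sine standing in for the analog signal, the 40 samples-per-second rate, the 2 bytes per sample, and the helper names `sample` and `storage_bytes` are all hypothetical.

```python
import math


def analog_signal(t):
    """Stand-in for a continuous signal: a 5 Hz sine defined for any real t."""
    return math.sin(2 * math.pi * 5.0 * t)


def sample(signal, sample_rate_hz, duration_s):
    """Read `signal` at evenly spaced instants, sample_rate_hz times per second."""
    n_samples = int(round(sample_rate_hz * duration_s))
    return [signal(n / sample_rate_hz) for n in range(n_samples)]


def storage_bytes(sample_rate_hz, duration_s, bytes_per_sample=2):
    """Memory needed for the sampled signal (assumes one channel, uncompressed)."""
    return int(round(sample_rate_hz * duration_s)) * bytes_per_sample


# One second of the signal, kept as a finite list of 40 numbers.
samples = sample(analog_signal, sample_rate_hz=40, duration_s=1.0)
print(len(samples))  # prints 40

# One second at 16,000 samples per second, 2 bytes each.
print(storage_bytes(16000, 1.0))  # prints 32000
```

The continuous function can be evaluated at infinitely many instants, while the sampled version is a finite list whose size is fixed by the sampling rate and the duration. That is what makes it practical to store and process.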
  • 21. THANKS! Contact: Mhmd96.essam@gmail.com. CREDITS: This presentation template was created by Slidesgo, including icons by Flaticon, and infographics & images by Freepik.