Speech Recognition
1. “One of the most fascinating characteristics of humans is their capability to
communicate ideas by means of speech.”
2. An Advanced Method for Speech Recognition
Prepared by: Salma Subh Mohmmed & Mahmoud Abd _Elmotelb Ibrhaim Mohammed
3. Production of Speech
•voiced excitation
•unvoiced excitation
•transient excitation
Characteristics of the Speech
•The bandwidth of the signal is 4 kHz
•The signal is periodic with a fundamental frequency between 80 Hz and 350 Hz
•There are peaks in the spectral distribution of energy at
(2n − 1) × 500 Hz, n = 1, 2, 3, ... (1.1)
•The envelope of the power spectrum of the signal decreases with increasing frequency (−6 dB per octave)
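As a quick worked example of formula (1.1), the sketch below lists the spectral peaks that fall inside the 4 kHz bandwidth stated above. The function name `spectral_peaks` and its parameters are illustrative and do not come from the slides.

```python
def spectral_peaks(bandwidth_hz=4000, spacing_hz=500):
    """Return the peak frequencies (2n - 1) * spacing_hz that lie within the bandwidth."""
    peaks = []
    n = 1
    while (2 * n - 1) * spacing_hz <= bandwidth_hz:
        peaks.append((2 * n - 1) * spacing_hz)
        n += 1
    return peaks

print(spectral_peaks())  # [500, 1500, 2500, 3500]
```

With these values, four peaks lie inside the band; the fifth (4500 Hz) already falls outside it.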
5. Speech Recognition
•Speech recognition is the process by which a computer (or other type of machine) identifies spoken
words. Basically, it means talking to your computer and having it correctly
recognize what you are saying.
14. Windowing
•This stage reduces the error that may result from splitting the speech waveform into frames.
•The most common window in speech analysis is the Hamming window.
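The Hamming window is commonly defined as w(n) = 0.54 − 0.46 cos(2πn / (N − 1)) for 0 ≤ n ≤ N − 1, where N is the frame length. The following is a minimal sketch of applying it to one frame, assuming the frame is a plain list of samples. The helper names `hamming_window` and `apply_window` are hypothetical.

```python
import math

def hamming_window(length):
    """Hamming coefficients w(n) = 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]

def apply_window(frame):
    """Multiply one frame of samples by the Hamming window, sample by sample."""
    window = hamming_window(len(frame))
    return [s * w for s, w in zip(frame, window)]

frame = [1.0] * 5            # a toy frame of constant samples
print(apply_window(frame))   # edges shrink to about 0.08, the centre stays near 1.0
```

Tapering the frame edges in this way softens the abrupt cut that framing introduces at each frame boundary.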
15. We can now assemble a set of band-pass filters to analyse speech. Together, the
filters need to cover the whole band, that is, every frequency is covered by a filter, so no
information is lost.
is a popular speech coding analysis
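The covering requirement for the band-pass filters can be sketched as follows. This assumes equally wide, contiguous pass bands over 0 to 4 kHz, which is a simplification chosen for illustration; the slides do not say how the bands are spaced. The names `band_edges` and `covering_filter` are hypothetical.

```python
def band_edges(num_filters, low_hz=0.0, high_hz=4000.0):
    """Split [low_hz, high_hz] into contiguous, equally wide pass bands."""
    width = (high_hz - low_hz) / num_filters
    return [(low_hz + i * width, low_hz + (i + 1) * width)
            for i in range(num_filters)]

def covering_filter(freq_hz, bands):
    """Return the index of the first band that contains freq_hz, or None."""
    for i, (lo, hi) in enumerate(bands):
        if lo <= freq_hz <= hi:
            return i
    return None

bands = band_edges(8)                   # eight bands, each 500 Hz wide
print(bands[0], bands[-1])              # (0.0, 500.0) (3500.0, 4000.0)
print(covering_filter(1234.0, bands))   # 2
```

Because neighbouring bands share their edge frequency, no frequency between 0 and 4 kHz is left without a filter.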
17. Concept
•Isolated word recognition (IWR): used to recognize separate words that are isolated from one another. It is the easiest type of recognition because it avoids the co-articulation problem, in which the sound at the end of one word merges with the sound at the start of the next word and makes recognition harder.
•Connected word recognition (CWR): used to recognize a group of words separated by pauses, by placing stops between the words. It resembles the previous type but is harder to recognize.
•Continuous speech recognition (CSR): used to recognize continuous speech.
•Speech understanding (SU): the processes of understanding speech by means of special interpreters; the speech can be converted to text after it has been recognized.
•Speaker identification and speaker verification (SI, SV)
•Word spotting: used to search for specific words.
18. Generally, there are three usual methods in speech recognition:
•Dynamic Time Warping (DTW): measures similarity between two time series; can determine if two waveforms represent the same spoken phrase
•Hidden Markov Model (HMM): a model having a given number of states
•Artificial Neural Networks (ANNs): parallel distributed processing; faster
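To make the DTW idea concrete, here is a minimal sketch of the classic dynamic-programming formulation, assuming one-dimensional numeric sequences and absolute difference as the local cost. Real recognizers compare sequences of feature vectors; the toy sequences below are invented for illustration.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

slow = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]   # same shape, spoken slowly
fast = [0, 1, 2, 1, 0]                  # same shape, spoken quickly
other = [2, 2, 2, 2, 2]                 # a different shape
print(dtw_distance(slow, fast))    # 0.0
print(dtw_distance(slow, other))
```

The slow and fast versions align perfectly despite their different lengths, while the unrelated sequence gets a larger cost.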
20. A hidden Markov model (HMM) is a statistical Markov model in
which the system being modeled is assumed to be a Markov process
with unobserved (hidden) states.
An HMM can be considered as the simplest dynamic Bayesian
network.
In a regular Markov model, the state is directly visible to the
observer, and therefore the state transition probabilities are the only
parameters.
In a hidden Markov model, the state is not directly visible, but
the output, which depends on the state, is visible.
Each state has a probability distribution over the possible output
tokens. Therefore the sequence of tokens generated by an HMM
gives some information about the sequence of states.
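The distinction between hidden states and visible output tokens can be sketched with a toy model. The two states, three tokens, and all probabilities below are hypothetical values chosen only for illustration.

```python
import random

# Hypothetical toy model: two hidden states and three output tokens.
START = {"A": 0.6, "B": 0.4}
TRANS = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
EMIT = {"A": {"x": 0.5, "y": 0.4, "z": 0.1}, "B": {"x": 0.1, "y": 0.3, "z": 0.6}}

def draw(dist, rng):
    """Draw one key from a {key: probability} dictionary."""
    keys = list(dist)
    return rng.choices(keys, weights=[dist[k] for k in keys])[0]

def sample_hmm(length, rng):
    """Generate (hidden_states, visible_tokens) of the given length."""
    states, tokens = [], []
    state = draw(START, rng)
    for _ in range(length):
        states.append(state)
        tokens.append(draw(EMIT[state], rng))   # output depends on the state
        state = draw(TRANS[state], rng)         # next state depends on the current one
    return states, tokens

states, tokens = sample_hmm(8, random.Random(0))
print("hidden :", states)   # not available to an observer
print("visible:", tokens)   # the only thing an observer sees
```

An observer who sees many `z` tokens can guess that the model was probably in state `B`, which is the sense in which the tokens carry information about the states.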
21. Note that the adjective 'hidden' refers to the state sequence
through which the model passes, not to the parameters of the
model; even if the model parameters are known exactly, the
model is still 'hidden'.
Hidden Markov models are especially known for their
application in temporal pattern recognition such as speech,
handwriting, gesture recognition, part-of-speech tagging,
musical score following, partial discharges and bioinformatics.
A hidden Markov model can be considered a generalization of a
mixture model where the hidden variables (or latent variables),
which control the mixture component to be selected for each
observation, are related through a Markov process rather than
independent of each other.
23. •This directed graph with values is called a Markov model. Its simplicity may be surprising, but it is very effective when applied to a problem such as speech recognition.
•In speech recognition, the program must handle thousands of words, each of which may be pronounced in more than one way. A word-by-word brute-force search is not feasible at all and also consumes a lot of time and memory. Using a Markov model, however, lets us represent the words and also choose the appropriate pronunciation. The following example for the pronunciation of the word tomato illustrates this.
t ow m aa t ow - British English
t ah m ey t ow - American English
t ah m ey t a - Possible pronunciation when speaking quickly
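One way to picture how a Markov model can represent these pronunciations is sketched below. It assumes each pronunciation is a sequence of space-separated phones (with `m` and `ey` as separate phones), uses the phones seen so far as the state, and weights the three variants equally. The equal weighting and the helper names are assumptions made for illustration; the slides give no probabilities.

```python
from collections import defaultdict
from fractions import Fraction

# The three pronunciations listed above, as phone sequences.
PRONUNCIATIONS = [
    "t ow m aa t ow".split(),   # British English
    "t ah m ey t ow".split(),   # American English
    "t ah m ey t a".split(),    # possible fast-speech variant
]

def build_model(pronunciations):
    """Count phone transitions, using the phones seen so far as the state."""
    counts = defaultdict(lambda: defaultdict(int))
    for phones in pronunciations:
        for i, phone in enumerate(phones):
            counts[tuple(phones[:i])][phone] += 1
    # Normalise the counts into transition probabilities.
    return {state: {p: Fraction(c, sum(nxt.values())) for p, c in nxt.items()}
            for state, nxt in counts.items()}

def path_probability(model, phones):
    """Multiply the transition probabilities along one phone sequence."""
    prob = Fraction(1)
    for i, phone in enumerate(phones):
        prob *= model.get(tuple(phones[:i]), {}).get(phone, Fraction(0))
    return prob

model = build_model(PRONUNCIATIONS)
print(model[("t",)])   # after "t": "ow" with probability 1/3, "ah" with 2/3
print(path_probability(model, "t ah m ey t ow".split()))   # 1/3
```

Shared prefixes such as `t ah m ey t` are stored once, so the model only branches where the pronunciations actually differ, and a phone sequence that matches none of them gets probability zero.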
24. There are three main algorithms associated with hidden Markov models:
•The forward algorithm, useful for isolated word recognition
•The Viterbi algorithm, useful for continuous speech recognition
•The forward-backward algorithm, useful for training an HMM
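The first two algorithms can be sketched on a toy model as follows. The states, tokens, and probabilities are hypothetical, and the forward-backward training procedure is left out to keep the example short.

```python
# Hypothetical toy HMM; the numbers are illustrative only.
STATES = ["A", "B"]
START = {"A": 0.6, "B": 0.4}
TRANS = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
EMIT = {"A": {"x": 0.5, "y": 0.4, "z": 0.1}, "B": {"x": 0.1, "y": 0.3, "z": 0.6}}

def forward(obs):
    """Total probability of the observations, summed over all state paths."""
    alpha = {s: START[s] * EMIT[s][obs[0]] for s in STATES}
    for o in obs[1:]:
        alpha = {s: EMIT[s][o] * sum(alpha[r] * TRANS[r][s] for r in STATES)
                 for s in STATES}
    return sum(alpha.values())

def viterbi(obs):
    """Probability and state sequence of the single most probable path."""
    best = {s: (START[s] * EMIT[s][obs[0]], [s]) for s in STATES}
    for o in obs[1:]:
        new = {}
        for s in STATES:
            # Pick the best predecessor state for reaching s.
            prob, path = max(((best[r][0] * TRANS[r][s], best[r][1]) for r in STATES),
                             key=lambda t: t[0])
            new[s] = (prob * EMIT[s][o], path + [s])
        best = new
    return max(best.values(), key=lambda t: t[0])

obs = ["x", "y", "z"]
print(forward(obs))        # likelihood of the whole observation sequence
prob, path = viterbi(obs)
print(path)                # ['A', 'A', 'B']
```

In isolated word recognition, one would typically run `forward` once per word model and pick the word with the highest likelihood, whereas `viterbi` recovers the most likely state sequence, which is what continuous recognition needs.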