An Introduction to HMM and its Uses
Muhammad Gulraj
BS GIKI, Pakistan; MS UET Peshawar, Pakistan
Muhammad.gulraj@yahoo.com
Pattern Recognition
An Introduction to HMM and its Uses
A Hidden Markov Model (HMM) is a statistical/probabilistic model in which a sequence/set of observable variables X is generated by a sequence of hidden states Y. In simple terms, a Hidden Markov Model consists of hidden states, and the output is a sequence/set of observations. In a simple Markov model the state is directly observable, while in a Hidden Markov Model the states are not directly observable. The Hidden Markov Model is a very reliable model for probabilistic estimation. Hidden Markov Models (HMM) have applications in pattern recognition tasks such as gesture and handwriting recognition, computational bioinformatics, speech recognition, etc.
Suppose there is a man in a room with three coins to flip. The room is locked and no one can see what is happening inside. There is a display screen outside the room which shows the result of the coin flips. The result can be any sequence of heads and tails, e.g. THTHHHTHHTTTTHT. We can get any sequence of heads and tails, and it is impossible to predict the specific sequence that will occur. This sequence of unpredictable outcomes can be termed the 'observation sequence'.
1. Suppose the 3rd coin produces more heads than tails. The resulting sequence will obviously have more heads than tails in this case.
2. Now suppose that the chance of flipping the 3rd coin after the 1st and 2nd coins is nearly zero. In this case transitions from the 1st and 2nd coins to the 3rd coin will be very rare, and as a result we will get very few heads if the man starts flipping with the 1st or 2nd coin.
3. Assume that each coin has some probability associated with it that the man will start the flipping process from that particular coin.
The first supposition is called the 'emission probability' bj(O), the second is called the 'transition probability' aij, and the third is called the 'initial probability' πi. In this whole example the heads/tails sequences are observation sequences and the coins are states.
Formally, the HMM can be specified as:
A set of hidden states S1, S2, S3 … Sn
A set of observations O1, O2, O3 … Om
The initial state probabilities πi
The emission/output probabilities B: P(Ok | qi), where Ok is an observation and qi is a state
The transition probabilities A
HMM λ = {Π, A, B}
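To make the definition concrete, here is a minimal Python sketch of the three-coin example. All numerical values are invented assumptions for illustration (the text does not specify them); `pi`, `A`, and `B` stand for Π, the transition matrix A, and the emission matrix B.

```python
import numpy as np

# Illustrative three-coin HMM. All numbers are invented assumptions;
# the text does not specify them.
pi = np.array([0.5, 0.4, 0.1])        # initial probability: which coin is flipped first
A = np.array([[0.60, 0.39, 0.01],     # transition probabilities a_ij: chance of
              [0.30, 0.69, 0.01],     # moving from coin i to coin j (rows sum to 1;
              [0.20, 0.20, 0.60]])    # reaching the 3rd coin is made nearly impossible)
B = np.array([[0.5, 0.5],             # emission probabilities b_j(O): P(heads), P(tails)
              [0.4, 0.6],             # for each coin; the 3rd coin strongly
              [0.9, 0.1]])            # favours heads

def sample(T, rng=np.random.default_rng(0)):
    """Generate an observation sequence of length T; the coins stay hidden."""
    out, state = [], rng.choice(3, p=pi)
    for _ in range(T):
        out.append("HT"[rng.choice(2, p=B[state])])
        state = rng.choice(3, p=A[state])
    return "".join(out)

print(sample(15))   # prints a heads/tails string like the one on the display screen
```

Running the sampler a few times makes the 'observation sequence' idea tangible: the printed heads/tails string is all an outside observer ever sees, while the coin identities remain hidden.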
Problems of HMM and their explanations
There are three well-known problems of the Hidden Markov Model (HMM).
1. Computing the probability P(O | Δ) of a particular observation sequence O, given the model parameters Δ. It is the summation of the probabilities of the observation O over all state sequences S. For a fixed state sequence S, the probability of the observation O is
P(O | S, Δ) = b_S1(O1) · b_S2(O2) · … · b_ST(OT)
We can find the total probability by summing over all state sequences:
P(O | Δ) = Σ_S P(O | S, Δ) · P(S | Δ)
It looks quite simple, but computationally it is very expensive (see the brute-force sketch after this list). This problem is called the evaluation problem and can be solved using the Forward-Backward algorithm.
2. Computing the most likely sequence of (hidden) states S1, S2, S3 … ST that can generate a given output sequence (observation). This sequence should maximize the joint probability of the state sequence and the observation, P(O, S | Δ). This problem is called the 'decoding problem'. The decoding problem can be solved using the Viterbi algorithm or posterior decoding. To optimize the probability of an individual state we can use
γt(j) = P(St = j | O, Δ)
This is the probability that the state is j at some time t, given the observation O and the model Δ. The most likely sequence can then be found by simply combining the individually most likely states:
St* = argmax_j γt(j)
3. The third problem of the Hidden Markov Model (HMM) is finding the set of state transition probabilities and output probabilities that best explains the observations. The given parameters are adjusted so that the probability P(O | Δ) is maximized. This is called the training problem. No analytical solution exists for this problem; it can be solved using the Baum-Welch re-estimation algorithm.
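As a rough illustration of why the evaluation problem is expensive, the following sketch computes P(O | Δ) by brute force, enumerating every state sequence. It assumes the `pi`, `A`, `B` arrays from the earlier snippet and observations encoded as integer symbol indices (0 = heads, 1 = tails); with N states and T observations it touches N^T sequences, which is exactly what the Forward-Backward algorithm avoids.

```python
from itertools import product

import numpy as np

def evaluate_naive(obs, pi, A, B):
    """Brute-force P(O | model): sum P(O, S | model) over all N**T
    state sequences S. Exponential in T; for illustration only."""
    N, T = len(pi), len(obs)
    total = 0.0
    for seq in product(range(N), repeat=T):       # every state sequence
        p = pi[seq[0]] * B[seq[0], obs[0]]        # start state and first emission
        for t in range(1, T):
            p *= A[seq[t - 1], seq[t]] * B[seq[t], obs[t]]
        total += p
    return total
```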
Relation of HMM to Prior, Posterior and Evidence
As discussed in the introductory example, there are basically three types of probabilities associated with the Hidden Markov Model (HMM).
1. Initial probability πi
2. Emission/output probability B: P(Ok | qi), where Ok is an observation and qi is a state
3. Transition probability A
From the example we know that the initial probability is the probability known before the experiment is performed; the prior probability has this same property. The posterior probability corresponds to the emission/output probability P(Ok | qi); the posterior probability is used in the forward-backward algorithm.
The evidence corresponds to the transition probability A, which is the probability that the next state is qj given that the current state is qi.
Solutions to the problems of HMM and their algorithms
As discussed earlier, there are three problems of the Hidden Markov Model (HMM).
1. The evaluation problem, which can be solved using the Forward-Backward algorithm.
2. The decoding problem, which can be solved using the Viterbi algorithm or posterior decoding.
3. The training problem, which can be solved using the Baum-Welch re-estimation algorithm.
Forward-Backward algorithm
The forward-backward algorithm combines the forward and backward passes to find the probability of every hidden state at some specific time t; repeating this for every time step t gives us the most likely state at each time t. It cannot be guaranteed that the resulting sequence is a valid state sequence, as it considers every time step individually.
The forward algorithm can be stated in three steps:
1. Initialization: α1(i) = πi · bi(O1)
2. Induction: αt+1(j) = [ Σi αt(i) · aij ] · bj(Ot+1)
3. Termination: P(O | Δ) = Σi αT(i)
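A minimal vectorized sketch of the forward pass, under the same assumptions as the earlier snippets (`pi`, `A`, `B` as NumPy arrays, observations as integer symbol indices):

```python
import numpy as np

def forward(obs, pi, A, B):
    """Forward pass: alpha[t, i] = P(O_1..O_t, S_t = i | model)."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                  # step 1: initialization
    for t in range(1, T):                         # step 2: induction
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha, alpha[-1].sum()                 # step 3: termination, P(O | model)
```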
Similarly, the backward algorithm runs the same recursion in reverse:
1. Initialization: βT(i) = 1
2. Induction: βt(i) = Σj aij · bj(Ot+1) · βt+1(j)
3. Termination: P(O | Δ) = Σi πi · bi(O1) · β1(i)
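And a matching sketch of the backward pass, under the same assumptions:

```python
import numpy as np

def backward(obs, pi, A, B):
    """Backward pass: beta[t, i] = P(O_{t+1}..O_T | S_t = i, model)."""
    T, N = len(obs), len(pi)
    beta = np.ones((T, N))                        # initialization: beta_T(i) = 1
    for t in range(T - 2, -1, -1):                # induction, backwards in time
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta
```

Multiplying the two passes elementwise, αt(i) · βt(i), and normalizing by P(O | Δ) gives the posterior probability of state i at time t, which is the per-step quantity the forward-backward algorithm reports.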
Viterbi algorithm
The Viterbi algorithm is used to find the most likely sequence of hidden states that results in a given sequence of observed events. [Figure omitted: the relationship between observations and states.]
In the first step the Viterbi algorithm initializes the variables:
δ1(i) = πi · bi(O1), ψ1(i) = 0
In the second step the process is iterated for every time step:
δt(j) = maxi [ δt-1(i) · aij ] · bj(Ot), ψt(j) = argmaxi [ δt-1(i) · aij ]
In the third step the iteration ends:
P* = maxi δT(i), ST* = argmaxi δT(i)
In the fourth step we backtrack the best path:
St* = ψt+1(St+1*), for t = T-1, …, 1
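The four steps translate directly into code. This sketch follows the same conventions as the snippets above and works in log space to avoid numerical underflow on long sequences (it assumes all probabilities are strictly positive so every logarithm is finite):

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely hidden state path for an observation sequence."""
    T, N = len(obs), len(pi)
    logA, logB = np.log(A), np.log(B)
    delta = np.log(pi) + logB[:, obs[0]]          # step 1: initialization
    psi = np.zeros((T, N), dtype=int)
    for t in range(1, T):                         # step 2: recursion
        scores = delta[:, None] + logA            # scores[i, j] = delta_{t-1}(i) + log a_ij
        psi[t] = scores.argmax(axis=0)            # best predecessor for each state j
        delta = scores.max(axis=0) + logB[:, obs[t]]
    path = [int(delta.argmax())]                  # step 3: termination
    for t in range(T - 1, 0, -1):                 # step 4: backtracking
        path.append(int(psi[t][path[-1]]))
    return path[::-1]
```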
Baum-Welch re-estimation algorithm
The Baum-Welch re-estimation algorithm is used to compute the unknown parameters of a Hidden Markov Model (HMM). It can best be described using the following example.
Assume we collect eggs from a chicken every day. Whether the chicken has laid an egg or not depends on unknown factors. For simplicity, assume that there are only two states (S1 and S2) that determine whether the chicken lays an egg. Initially we do not know the states, the transition probabilities, or the probability that the chicken will lay an egg given a specific state. To guess the initial probabilities, take all the sequences starting with S1 and find the maximum probability, and then repeat for S2. Repeat these steps until the resulting probabilities converge. Mathematically, one re-estimation step can be written using the quantities from the forward-backward algorithm:
ξt(i, j) = P(St = i, St+1 = j | O, Δ) = αt(i) · aij · bj(Ot+1) · βt+1(j) / P(O | Δ)
γt(i) = Σj ξt(i, j)
π̄i = γ1(i)
āij = Σt ξt(i, j) / Σt γt(i)
b̄j(k) = Σt:Ot=k γt(j) / Σt γt(j)
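A sketch of a single re-estimation step, reusing the `forward()` and `backward()` functions defined above (in practice the step is repeated until the parameters converge):

```python
import numpy as np

def baum_welch_step(obs, pi, A, B):
    """One Baum-Welch re-estimation step, reusing forward() and
    backward() from the sketches above."""
    obs = np.asarray(obs)
    alpha, prob = forward(obs, pi, A, B)
    beta = backward(obs, pi, A, B)
    # xi[t, i, j] = P(S_t = i, S_{t+1} = j | O, model)
    xi = (alpha[:-1, :, None] * A[None, :, :]
          * (B[:, obs[1:]].T * beta[1:])[:, None, :]) / prob
    gamma = alpha * beta / prob                   # gamma[t, i] = P(S_t = i | O, model)
    new_pi = gamma[0]                             # pi_i = gamma_1(i)
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.stack([gamma[obs == k].sum(axis=0)  # count emissions of symbol k
                      for k in range(B.shape[1])], axis=1)
    new_B /= gamma.sum(axis=0)[:, None]
    return new_pi, new_A, new_B
```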