Markov chain and Hidden
Markov Models
Nasir and Rajab
Dr. Pan
Spring 2014
MARKOV CHAINS:
A stochastic process leads to a Markov chain under one key assumption, the Markovian property:
A stochastic process {Xt} is said to have the Markovian property if the state of the system at time t+1
depends only on the state of the system at time t:
P[X_{t+1} = x_{t+1} | X_t = x_t, X_{t-1} = x_{t-1}, ..., X_1 = x_1, X_0 = x_0] = P[X_{t+1} = x_{t+1} | X_t = x_t]
The n-step transition matrix: its (i, j) entry is the probability of moving from state i to state j in exactly n steps, and it equals the one-step transition matrix raised to the nth power.
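As a minimal base-R sketch (the two-state matrix P below is hypothetical, not taken from these slides), the n-step matrix is just the one-step matrix multiplied by itself n times:

# n-step transition matrix: P^n, computed by repeated matrix multiplication.
# P is a hypothetical 2-state transition matrix used only for illustration.
P <- matrix(c(0.9, 0.1,
              0.5, 0.5), nrow = 2, byrow = TRUE,
            dimnames = list(c("s1", "s2"), c("s1", "s2")))

n_step <- function(P, n) {
  Pn <- diag(nrow(P))               # start from the identity (0-step matrix)
  for (k in seq_len(n)) Pn <- Pn %*% P
  Pn
}

n_step(P, 3)   # probability of going from state i to state j in exactly 3 steps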
Irreducible Markov chain:
A Markov chain is irreducible if the corresponding graph is strongly connected.
Recurrent and transient states:
[State diagram omitted.] In the diagram, A and B are transient states, and C and D are recurrent states.
Once the process moves from B to D, it will never come back.
The period of a state
A Markov chain is periodic if all of its states have a period k > 1.
It is aperiodic otherwise.
Ergodic
A Markov chain is ergodic if:
1. the corresponding graph is strongly connected;
2. it is not periodic.
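As a sketch of how these properties can be checked in code: the markovchain R package (used later in these slides for plotting) also documents helpers such as is.irreducible() and period(); assuming those functions and an illustrative three-state chain:

library(markovchain)

# A small illustrative chain (not from these slides).
toyStates <- c("s1", "s2", "s3")
toyMatrix <- matrix(c(0.5, 0.5, 0.0,
                      0.2, 0.3, 0.5,
                      0.0, 0.4, 0.6),
                    byrow = TRUE, nrow = 3,
                    dimnames = list(toyStates, toyStates))
mcToy <- new("markovchain", states = toyStates, byrow = TRUE,
             transitionMatrix = toyMatrix, name = "toy")

is.irreducible(mcToy)   # TRUE: the transition graph is strongly connected
period(mcToy)           # 1 means aperiodic; irreducible + aperiodic => ergodic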
Markov Chain Example
• Based on the weather today, what will it be tomorrow?
• Assuming only four possible weather states
Sunny
Cloudy
Rainy
Snowing
Markov Chain Structure
• Each state is an observable event
• At each time interval the state changes to another state or stays the same (qt ∈ {S1, S2, S3, S4})
[Diagram: State S1 (Sunny), State S2 (Cloudy), State S3 (Rainy), State S4 (Snowing), with transitions between them.]
Markov Chain Structure
[Diagram: the four weather states Sunny, Cloudy, Rainy, Snowing connected by transition arrows.]
Markov Chain Transition Probabilities
• Transition probability matrix:
Rows: state at time t; columns: state at time t + 1.
State    S1    S2    S3    S4   Total
S1       a11   a12   a13   a14    1
S2       a21   a22   a23   a24    1
S3       a31   a32   a33   a34    1
S4       a41   a42   a43   a44    1
Markov Chain Transition Probabilities
• Probabilities for tomorrow’s weather based on today’s weather
Rows: today (time t); columns: tomorrow (time t + 1).
State      Sunny   Cloudy   Rainy   Snowing
Sunny       0.6     0.3      0.1     0.0
Cloudy      0.2     0.4      0.3     0.1
Rainy       0.1     0.2      0.5     0.2
Snowing     0.0     0.3      0.2     0.5
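In the style of the markovchain R code at the end of these slides, the weather chain can be written down and queried; the object names below are illustrative, not from the slides:

library(markovchain)

weatherStates <- c("Sunny", "Cloudy", "Rainy", "Snowing")
weatherMatrix <- matrix(c(0.6, 0.3, 0.1, 0.0,
                          0.2, 0.4, 0.3, 0.1,
                          0.1, 0.2, 0.5, 0.2,
                          0.0, 0.3, 0.2, 0.5),
                        byrow = TRUE, nrow = 4,
                        dimnames = list(weatherStates, weatherStates))
mcWeather <- new("markovchain", states = weatherStates, byrow = TRUE,
                 transitionMatrix = weatherMatrix, name = "Weather")

weatherMatrix["Sunny", ]          # tomorrow's distribution given Sunny today
weatherMatrix %*% weatherMatrix   # 2-step transition matrix (the day after tomorrow)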
[State transition diagram for the four weather states Sunny, Cloudy, Rainy, Snowing; the arrows are labeled with the probabilities from the table above.]
Markov Chain Models
A Markov Chain Model for DNA
[State diagram: a begin state with transitions into and among the four nucleotide states A (Adenine), C (Cytosine), G (Guanine), T (Thymine).]
Example transition probabilities out of state g:
Pr(X_i = a | X_{i-1} = g) = 0.1
Pr(X_i = c | X_{i-1} = g) = 0.1
Pr(X_i = g | X_{i-1} = g) = 0.4
Pr(X_i = t | X_{i-1} = g) = 0.1
The Probability of a Sequence for a Given Markov Chain Model
[State diagram: begin and end states plus the four nucleotide states A, C, G, T.]
Pr(cggt) = Pr(c | begin) · Pr(g | c) · Pr(g | g) · Pr(t | g) · Pr(end | t)
Markov Chain Notation
• The transition parameters can be denoted by a_{x_{i-1} x_i}, where
  a_{x_{i-1} x_i} = Pr(X_i = x_i | X_{i-1} = x_{i-1})
• Similarly, we can denote the probability of a sequence x as
  Pr(x) = a_{B x_1} ∏_{i=2}^{M} a_{x_{i-1} x_i} = Pr(x_1) ∏_{i=2}^{M} Pr(x_i | x_{i-1})
  where a_{B x_1} represents the transition from the begin state
• This gives a probability distribution over sequences of length M
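A minimal base-R sketch of this formula: the probability of a sequence is the begin transition times the product of the successive transition parameters. The transition matrix used here is the CpG (+) matrix from the later discrimination slides, and the uniform begin distribution is an assumption for illustration only:

# Pr(x) = a_{B x1} * prod_{i=2}^{M} a_{x_{i-1} x_i}
# transMat: transition matrix with rows/columns named a, c, g, t
# beginProbs: distribution over the first symbol (the begin-state transitions)
seq_prob <- function(x, transMat, beginProbs) {
  p <- beginProbs[x[1]]
  for (i in seq_along(x)[-1]) p <- p * transMat[x[i - 1], x[i]]
  unname(p)
}

bases  <- c("a", "c", "g", "t")
cpgMat <- matrix(c(0.18, 0.27, 0.43, 0.12,
                   0.17, 0.37, 0.27, 0.19,
                   0.16, 0.34, 0.38, 0.12,
                   0.08, 0.36, 0.38, 0.18),
                 byrow = TRUE, nrow = 4, dimnames = list(bases, bases))
begin  <- setNames(rep(0.25, 4), bases)   # assumed uniform begin distribution

seq_prob(c("c", "g", "g", "t"), cpgMat, begin)   # Pr(cggt) under this model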
Estimating the Model Parameters
Given some data (e.g. a set of sequences from CpG islands), how can we
determine the probability parameters of our model?
* One approach: maximum likelihood estimation
* A Bayesian approach
The "p" in CpG indicates that the C and the G are next to each other in
sequence, regardless of being single- or double- stranded. In a CpG site, both C
and G are found on the same strand of DNA or RNA and are connected by a
phosphodiester bond. This is a covalent bond between atoms.
Maximum Likelihood Estimation
• Let’s use a very simple sequence model
Every position is independent of the others
Every position is generated from the same multinomial distribution
We want to estimate the parameters Pr(a), Pr(c), Pr(g), Pr(t)
and we’re given the sequences
accgcgctta
gcttagtgac
tagccgttac
then the maximum likelihood estimates are the observed
frequencies of the bases
Pr(a) = 6/30 = 0.2        Pr(c) = 9/30 = 0.3
Pr(g) = 7/30 ≈ 0.233      Pr(t) = 8/30 ≈ 0.267

In general, Pr(a) = n_a / Σ_i n_i
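A quick base-R check of these estimates (variable names are illustrative):

seqs  <- c("accgcgctta", "gcttagtgac", "tagccgttac")
bases <- unlist(strsplit(seqs, ""))        # the 30 individual bases
table(bases) / length(bases)               # observed frequencies = ML estimates
# a = 6/30 = 0.2, c = 9/30 = 0.3, g = 7/30 ~ 0.233, t = 8/30 ~ 0.267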
Maximum Likelihood Estimation
• Suppose instead we saw the following sequences
gccgcgcttg
gcttggtggc
tggccgttgc
• Then the maximum likelihood estimates are
Pr(a) = 0/30 = 0          Pr(c) = 9/30 = 0.3
Pr(g) = 13/30 ≈ 0.433     Pr(t) = 8/30 ≈ 0.267

Note that Pr(a) = 0: maximum likelihood assigns zero probability to any base that does not appear in the training data, which motivates the Bayesian approach on the next slide.
A Bayesian Approach
• A more general form: m-estimates
  Pr(a) = (n_a + m · p_a) / (Σ_i n_i + m)
  where m is the number of “virtual” instances and p_a is the prior probability of a
• With m = 8, uniform priors (p_a = 0.25 for each base), and the sequences
  gccgcgcttg
  gcttggtggc
  tggccgttgc
  Pr(c) = (9 + 0.25 · 8) / (30 + 8) = 11/38 ≈ 0.289
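A base-R sketch of the m-estimate using the counts from these sequences; the helper name m_estimate is illustrative:

# m-estimate: Pr(a) = (n_a + m * p_a) / (sum_i n_i + m)
m_estimate <- function(count, total, m, prior) (count + m * prior) / (total + m)

seqs   <- c("gccgcgcttg", "gcttggtggc", "tggccgttgc")
counts <- table(factor(unlist(strsplit(seqs, "")), levels = c("a", "c", "g", "t")))
total  <- sum(counts)

m_estimate(counts["c"], total, m = 8, prior = 0.25)   # (9 + 2) / 38 ~ 0.289
m_estimate(counts["a"], total, m = 8, prior = 0.25)   # no longer zero: 2/38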
Estimation for 1st Order Probabilities
To estimate a 1st order parameter, such as Pr(c|g), we count
the number of times that c follows the history g in our given
sequences
using Laplace estimates with the sequences
gccgcgcttg
gcttggtggc
tggccgttgc
Pr(a | g) = (0 + 1) / (12 + 4) = 1/16
Pr(c | g) = (7 + 1) / (12 + 4) = 8/16
Pr(g | g) = (3 + 1) / (12 + 4) = 4/16
Pr(t | g) = (2 + 1) / (12 + 4) = 3/16
…
Pr(a | c) = (0 + 1) / (7 + 4) = 1/11
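A base-R sketch that reproduces these Laplace estimates by counting dinucleotides in the three sequences; variable names are illustrative:

seqs  <- c("gccgcgcttg", "gcttggtggc", "tggccgttgc")
bases <- c("a", "c", "g", "t")

# Count how often each base follows each other base (within a sequence).
counts <- matrix(0, 4, 4, dimnames = list(bases, bases))
for (s in seqs) {
  ch <- strsplit(s, "")[[1]]
  for (i in 2:length(ch)) counts[ch[i - 1], ch[i]] <- counts[ch[i - 1], ch[i]] + 1
}

# Laplace estimate: add 1 to every count, then normalize each row.
laplace <- sweep(counts + 1, 1, rowSums(counts) + 4, "/")
laplace["g", ]      # Pr(a|g)=1/16, Pr(c|g)=8/16, Pr(g|g)=4/16, Pr(t|g)=3/16
laplace["c", "a"]   # Pr(a|c) = 1/11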
Example Application
Markov Chains for Discrimination
• Suppose we want to distinguish CpG islands from other sequence
regions
• Given sequences from CpG islands and sequences from other regions, we can construct:
• A model to represent CpG islands.
• A null model to represent the other regions.
Markov Chains for Discrimination
CpG model (+):
     a    c    g    t
a   .18  .27  .43  .12
c   .17  .37  .27  .19
g   .16  .34  .38  .12
t   .08  .36  .38  .18

Null model (−):
     a    c    g    t
a   .30  .21  .28  .21
c   .32  .30  .08  .30
g   .25  .24  .30  .21
t   .18  .24  .29  .29
• Parameters estimated from human sequences containing 48 CpG islands, about 60,000 nucleotides in total. For example, the row-a, column-c entry of each table is Pr(c | a) under the CpG (+) and null (−) models respectively.
[Figures: graph plots of the CpG matrix and the null matrix; see the R code at the end.]
Hidden Markov Models (HMM)
“A doubly stochastic process with an underlying stochastic process that is not observable (it is hidden), but can only be observed through another set of stochastic processes that produce the sequence of observed symbols.”
— Rabiner & Juang, 1986
Difference between Markov chains and HMMs
• In a Markov chain, the state is directly visible to the observer, so the state transition probabilities are the only parameters. In a hidden Markov model, the state is not directly visible; only the output, which depends on the state, is visible. Each state has a probability distribution over the possible output tokens, so the sequence of tokens generated by an HMM gives some information about the sequence of states.
HMM Example
• Suppose we want to determine the average annual temperature at a
particular location on earth over a series of years.
• We consider two annual temperature states, “hot” (H) and “cold” (C).
State transition matrix (rows: time t, columns: time t + 1):
       H     C
H     0.7   0.3
C     0.4   0.6
The state is the average annual temperature.
The transition from one state to the next is a Markov process.
Since we cannot directly observe the state (the past temperatures), we instead observe the size of tree rings.
Now suppose that current research indicates a correlation between the size of tree growth rings and temperature. We consider only three different tree ring sizes: small (S), medium (M), and large (L).
The probabilistic relationship between annual temperature and tree ring size is given by the observation matrix (rows: state, columns: observation):
       S     M     L
H     0.1   0.4   0.5
C     0.7   0.2   0.1
State sequence   Probability   Normalized
HHHH 0.000412 0.042787
HHHC 0.000035 0.003635
HHCH 0.000706 0.073320
HHCC 0.000212 0.022017
HCHH 0.000050 0.005193
HCHC 0.000004 0.000415
HCCH 0.000302 0.031364
HCCC 0.000091 0.009451
CHHH 0.001098 0.114031
CHHC 0.000094 0.009762
CHCH 0.001882 0.195451
CHCC 0.000564 0.058573
CCHH 0.000470 0.048811
CCHC 0.000040 0.004154
CCCH 0.002822 0.293073
CCCC 0.000847 0.087963
Table 1: State sequence probabilities
To find the optimal state sequence in the dynamic programming (DP) sense, we simply choose the sequence with the highest probability, namely CCCH.
Position   0          1          2          3
P(H)       0.188182   0.519576   0.228788   0.804029
P(C)       0.811818   0.480424   0.771212   0.195971
Table 2: HMM probabilities
From Table 2 we find that the optimal sequence in the HMM sense is CHCH. The optimal DP sequence therefore differs from the optimal HMM sequence, even though in this case all of its state transitions are valid.
Note that the DP solution and the HMM solution are not necessarily the same. For example, the DP solution must consist of valid state transitions, while this is not necessarily the case for the HMM (per-position) solution.
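Tables 1 and 2 can be reproduced by brute-force enumeration in base R. Two inputs are not shown on these slides and are taken here as assumptions consistent with Table 1 (they match the cited Stamp tutorial): an initial state distribution of (0.6, 0.4) for (H, C) and the observation sequence small, medium, small, large:

A <- matrix(c(0.7, 0.3,
              0.4, 0.6), byrow = TRUE, nrow = 2,
            dimnames = list(c("H", "C"), c("H", "C")))        # state transitions
B <- matrix(c(0.1, 0.4, 0.5,
              0.7, 0.2, 0.1), byrow = TRUE, nrow = 2,
            dimnames = list(c("H", "C"), c("S", "M", "L")))   # observation probabilities
pi0 <- c(H = 0.6, C = 0.4)      # assumed initial distribution (Stamp tutorial)
obs <- c("S", "M", "S", "L")    # assumed observation sequence (Stamp tutorial)

# Joint probability of a full state sequence and the observations (Table 1).
joint <- function(states) {
  p <- pi0[states[1]] * B[states[1], obs[1]]
  for (t in 2:4) p <- p * A[states[t - 1], states[t]] * B[states[t], obs[t]]
  unname(p)
}

stateSeqs <- expand.grid(rep(list(c("H", "C")), 4), stringsAsFactors = FALSE)
probs <- apply(stateSeqs, 1, joint)
names(probs) <- apply(stateSeqs, 1, paste, collapse = "")
round(probs["CCCH"], 6)        # 0.002822, the DP-optimal sequence
round(probs / sum(probs), 6)   # normalized column of Table 1

# Per-position posteriors P(state at time t | observations); columns match
# positions 0-3 of Table 2, so the per-position optimum is CHCH.
sapply(1:4, function(t) {
  h <- sum(probs[substr(names(probs), t, t) == "H"])
  c(H = h, C = sum(probs) - h) / sum(probs)
})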
R code for the CpG matrix:
library(markovchain)
DNAStates <- c("A", "C", "G", "T")
byRow <- TRUE
# CpG (+) transition matrix, filled row by row (each row sums to 1)
DNAMatrix <- matrix(data = c(0.18, 0.27, 0.43, 0.12,
                             0.17, 0.37, 0.27, 0.19,
                             0.16, 0.34, 0.38, 0.12,
                             0.08, 0.36, 0.38, 0.18),
                    byrow = byRow, nrow = 4,
                    dimnames = list(DNAStates, DNAStates))
mcDNA <- new("markovchain", states = DNAStates, byrow = byRow,
             transitionMatrix = DNAMatrix, name = "DNA")
plot(mcDNA)

R code for the null matrix:
# Null (-) transition matrix, filled row by row (each row sums to 1)
NullMatrix <- matrix(data = c(0.30, 0.21, 0.28, 0.21,
                              0.32, 0.30, 0.08, 0.30,
                              0.25, 0.24, 0.30, 0.21,
                              0.18, 0.24, 0.29, 0.29),
                     byrow = byRow, nrow = 4,
                     dimnames = list(DNAStates, DNAStates))
mcDNAnull <- new("markovchain", states = DNAStates, byrow = byRow,
                 transitionMatrix = NullMatrix, name = "DNA null")
plot(mcDNAnull)
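As a follow-up sketch of the discrimination idea from the earlier slides: a sequence can be scored by the log-odds of the CpG model against the null model. The snippet is self-contained (it redefines the two matrices rather than reusing mcDNA and mcDNAnull) and ignores begin/end transitions; the function name log_odds is illustrative:

bases <- c("A", "C", "G", "T")
cpgMatrix <- matrix(c(0.18, 0.27, 0.43, 0.12,
                      0.17, 0.37, 0.27, 0.19,
                      0.16, 0.34, 0.38, 0.12,
                      0.08, 0.36, 0.38, 0.18),
                    byrow = TRUE, nrow = 4, dimnames = list(bases, bases))
nullMatrix <- matrix(c(0.30, 0.21, 0.28, 0.21,
                       0.32, 0.30, 0.08, 0.30,
                       0.25, 0.24, 0.30, 0.21,
                       0.18, 0.24, 0.29, 0.29),
                     byrow = TRUE, nrow = 4, dimnames = list(bases, bases))

# Log-odds score: sum over transitions of log( Pr(transition | CpG) / Pr(transition | null) ).
# Positive scores favour the CpG model, negative scores favour the null model.
log_odds <- function(x) {
  ch <- strsplit(toupper(x), "")[[1]]
  s <- 0
  for (i in 2:length(ch))
    s <- s + log(cpgMatrix[ch[i - 1], ch[i]] / nullMatrix[ch[i - 1], ch[i]])
  s
}

log_odds("CGCGCGCG")   # strongly positive: looks like a CpG island
log_odds("ATATTTAA")   # negative: looks like background sequence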
References:
http://www.scs.leeds.ac.uk/scs-only/teaching-materials/HiddenMarkovModels/html_dev/main.html
Mark Stamp, A Revealing Introduction to Hidden Markov Models, Department of Computer Science, San Jose State University, September 28, 2012.