George F Luger
ARTIFICIAL INTELLIGENCE 6th edition
Structures and Strategies for Complex Problem Solving
Machine Learning: Probabilistic
Luger: Artificial Intelligence, 6th edition. © Pearson Education Limited, 2009
13.0 Stochastic and Dynamic Models of Learning
13.1 Hidden Markov Models (HMMs)
13.2 Dynamic Bayesian Networks and Learning
13.3 Stochastic Extensions to Reinforcement Learning
13.4 Epilogue and References
13.5 Exercises
D E F I N I T I O N
HIDDEN MARKOV MODEL
A graphical model is called a hidden Markov model (HMM) if it is a
Markov model whose states are not directly observable but are hidden
by a further stochastic system interpreting their output. More formally,
given a set of states S = {s1, s2, ..., sn}, and given a set of state transition
probabilities A = {a11, a12, ..., a1n, a21, a22, ..., ann}, there is a set of
observation likelihoods, O = pi(ot), each expressing the probability of
an observation ot (at time t) being generated by a state st.
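A minimal Python sketch of this definition, assuming discrete observations. The class name HMM, the field names, and the initial distribution `start` are illustrative assumptions (the definition above does not mention a start distribution). The `likelihood` method uses the standard forward algorithm to sum over all hidden state paths.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HMM:
    states: List[str]                    # S
    start: Dict[str, float]              # assumed initial state distribution
    trans: Dict[str, Dict[str, float]]   # A: trans[si][sj] = a_ij
    emit: Dict[str, Dict[str, float]]    # O: emit[si][o] = p_i(o)

    def likelihood(self, observations: List[str]) -> float:
        """Forward algorithm: p(observations), summed over hidden paths."""
        if not observations:
            return 1.0
        first = observations[0]
        alpha = {s: self.start[s] * self.emit[s][first] for s in self.states}
        for o in observations[1:]:
            alpha = {
                sj: self.emit[sj][o]
                * sum(alpha[si] * self.trans[si][sj] for si in self.states)
                for sj in self.states
            }
        return sum(alpha.values())


# Hypothetical two-state example with made-up numbers.
hmm = HMM(
    states=["S1", "S2"],
    start={"S1": 0.5, "S2": 0.5},
    trans={"S1": {"S1": 0.8, "S2": 0.2}, "S2": {"S1": 0.3, "S2": 0.7}},
    emit={"S1": {"H": 0.9, "T": 0.1}, "S2": {"H": 0.2, "T": 0.8}},
)
p_single_head = hmm.likelihood(["H"])
```

The hidden states never appear in the result: only the observation sequence is scored, which is what makes the model "hidden".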
Figure 13.1 A state transition diagram of a hidden Markov model of two
states designed for the coin flipping problem. The aij are
determined by the elements of the 2 x 2 transition matrix, A.
[Diagram: state S1 with p(H) = b1 and p(T) = 1 - b1; state S2 with
p(H) = b2 and p(T) = 1 - b2; transitions a11, a12 = 1 - a11,
a21 = 1 - a22, and a22.]
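The two-coin model of Figure 13.1 can be simulated in a few lines. This is a sketch under stated assumptions: the function name, the uniform choice of starting coin, and the seeded random generator are illustrative and not part of the figure. Only a11, a22, b1, and b2 are needed, since a12 = 1 - a11 and a21 = 1 - a22.

```python
import random


def sample_two_coin_hmm(n_flips, a11, a22, b1, b2, seed=0):
    """Return (hidden states, observed flips) for the two-coin HMM."""
    rng = random.Random(seed)
    stay = {1: a11, 2: a22}      # probability of keeping the same coin
    bias = {1: b1, 2: b2}        # p(H) for each coin
    state = rng.choice([1, 2])   # assumed uniform start
    states, flips = [], []
    for _ in range(n_flips):
        states.append(state)
        flips.append("H" if rng.random() < bias[state] else "T")
        if rng.random() >= stay[state]:
            state = 3 - state    # switch to the other coin
    return states, flips


states, flips = sample_two_coin_hmm(20, a11=0.9, a22=0.8, b1=0.7, b2=0.2)
```

An observer sees only `flips`; `states` is returned here purely so the hidden sequence can be inspected.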
Figure 13.2 The state transition diagram for a three state hidden Markov
model of coin flipping. Each coin/state, Si, has its bias, bi.
[Diagram: states S1, S2, and S3, connected by transitions a11, a12, a13,
a21, a22, a23, a31, a32, and a33; each state Si emits heads with
p(H) = bi and tails with p(T) = 1 - bi.]
Figure 13.3. a. The hidden, S, and observable, O, states of the AR-HMM where
p(Ot | St, Ot-1).
b. The values of the hidden St of the example AR-HMM: safe, unsafe,
and faulty.
Figure 13.4 A selection of the real-time data across multiple time
periods, with one time slice expanded, for the AR-HMM.
Figure 13.5 The time-series data of Figure 13.4 processed by a fast
Fourier transform on the frequency domain. This was
the data submitted to the AR-HMM for each time period.
Figure 13.6 An autoregressive factorial HMM, where the observable state Ot
at time t is dependent on multiple (St) subprocesses, S^i_t, and Ot-1.
Figure 13.7 A PFSM representing a set of phonetically related English
words. The probability of each word occurring is below that
word. Adapted from Jurafsky and Martin (2008).
[Diagram: four word paths from Start (#) to End (#) through the phone
states n, iy, t, d, and um. Arc probabilities shown: .48, .52, .89, .11,
.36, .64. Word probabilities: neat .00013, need .00056, new .001,
knee .000024.]
Figure 13.8 A trace of the Viterbi algorithm on several of the paths of Figure 13.7. Rows
report the maximum value for Viterbi on each word for each input value (top row).
Adapted from Jurafsky and Martin (2008).

Start = 1.0               #     n                         iy                        #                         end
neat (.00013, 2 paths)    1.0   1.0 x .00013 = .00013     .00013 x 1.0 = .00013     .00013 x .52 = .000067
need (.00056, 2 paths)    1.0   1.0 x .00056 = .00056     .00056 x 1.0 = .00056     .00056 x .11 = .000062
new (.001, 2 paths)       1.0   1.0 x .001 = .001         .001 x .36 = .00036       .00036 x 1.0 = .00036
knee (.000024, 1 path)    1.0   1.0 x .000024 = .000024   .000024 x 1.0 = .000024   .000024 x 1.0 = .000024
Total best                                                                                                    .00036
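The arithmetic of this trace can be checked with a short Python sketch. The factor triples below are read directly off the rows of the trace (word probability, the factor applied at iy, the factor applied at the final #); the dictionary layout and function name are illustrative assumptions.

```python
# word -> (word probability, factor at "iy", factor at final "#")
factors = {
    "neat": (0.00013, 1.0, 0.52),
    "need": (0.00056, 1.0, 0.11),
    "new": (0.001, 0.36, 1.0),
    "knee": (0.000024, 1.0, 1.0),
}


def trace_row(word_p, iy_p, end_p):
    """Running products across the columns #, n, iy, # of the trace."""
    score = 1.0
    row = [score]
    for factor in (word_p, iy_p, end_p):
        score *= factor
        row.append(score)
    return row


rows = {word: trace_row(*f) for word, f in factors.items()}
best_word = max(rows, key=lambda w: rows[w][-1])
best_score = rows[best_word][-1]
```

Under these numbers the highest final score belongs to "new", matching the "Total best" entry of the trace.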
function Viterbi(Observations of length T, Probabilistic FSM)
begin
    N := number of states in FSM;
    create probability matrix viterbi[R = N + 2, C = T + 2];
    viterbi[0, 0] := 1.0;
    for each time step (observation) t from 0 to T do
        for each state si from i = 0 to N do
            for each transition from si to sj in the Probabilistic FSM do
            begin
                new-count := viterbi[si, t] x path[si, sj] x p(sj | si);
                if ((viterbi[sj, t + 1] = 0) or (new-count > viterbi[sj, t + 1]))
                then
                begin
                    viterbi[sj, t + 1] := new-count;
                    append back-pointer [sj, t + 1] to back-pointer list
                end
            end;
    return viterbi[R, C] and back-pointer list
end.
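For comparison, here is a runnable Python sketch of the Viterbi idea. It is not a line-by-line translation of the pseudocode above: it uses the common HMM formulation with separate start, transition, and emission tables, and all names and the example numbers are assumptions made for illustration.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden state sequence."""
    # best[t][s]: probability of the best path that ends in s after obs[:t+1]
    best = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r] * trans_p[r][s]) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score * emit_p[s][obs[t]]
            back[t][s] = prev  # back-pointer to the best predecessor
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path


# Hypothetical two-coin example with made-up numbers.
states = ["S1", "S2"]
start_p = {"S1": 0.5, "S2": 0.5}
trans_p = {"S1": {"S1": 0.8, "S2": 0.2}, "S2": {"S1": 0.2, "S2": 0.8}}
emit_p = {"S1": {"H": 0.9, "T": 0.1}, "S2": {"H": 0.1, "T": 0.9}}
prob, path = viterbi(["H", "H", "T", "T"], states, start_p, trans_p, emit_p)
```

As in the pseudocode, each cell keeps only the maximum over incoming transitions, and the back-pointers recover the winning path once the last observation has been processed.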
Figure 13.9 A DBN example of two time slices. The set Q of random
variables is hidden, the set O is observed, and t indicates time.
Figure 13.10 A Bayesian belief net for the burglar alarm, earthquake,
burglary example.
Figure 13.11 A Markov random field reflecting the potential functions of the
random variables in the BBN of Figure 13.10, together with the
two observations about the system.
Figure 13.12. A learnable node L is added to the Markov random field of Figure
13.11. The Markov random field iterates across three time periods.
For simplicity, the EM iteration is only indicated at time 1.
D E F I N I T I O N
A MARKOV DECISION PROCESS, or MDP
A Markov Decision Process is a tuple <S, A, P, R> where:
S is a set of states, and
A is a set of actions.
pa(st, st+1) = p(st+1 | st, at = a) is the probability that if the agent executes
action a ∈ A from state st at time t, it results in state st+1 at time t+1. Since
the probability, pa ∈ P, is defined over the entire state-space of actions, it is
often represented with a transition matrix.
R(s) is the reward received by the agent when in state s.
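The tuple <S, A, P, R> maps naturally onto a small data structure. The following Python sketch is one possible encoding, not a prescribed one: the class and method names are assumptions, and the toy states, action, and rewards are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MDP:
    states: List[str]                            # S
    actions: List[str]                           # A
    P: Dict[Tuple[str, str], Dict[str, float]]   # P[(s, a)][s_next] = pa(s, s_next)
    R: Dict[str, float]                          # R[s]: reward when in state s

    def check(self) -> None:
        """Each row of the transition table must be a probability distribution."""
        for key, row in self.P.items():
            if abs(sum(row.values()) - 1.0) > 1e-9:
                raise ValueError(f"probabilities for {key} do not sum to 1")

    def step(self, s: str, a: str, rng: random.Random) -> Tuple[str, float]:
        """Sample the next state and return it with the reward of that state."""
        row = self.P[(s, a)]
        s_next = rng.choices(list(row), weights=list(row.values()))[0]
        return s_next, self.R[s_next]


# Hypothetical two-state example.
mdp = MDP(
    states=["s0", "s1"],
    actions=["go"],
    P={("s0", "go"): {"s1": 1.0}, ("s1", "go"): {"s0": 0.5, "s1": 0.5}},
    R={"s0": 0.0, "s1": 1.0},
)
mdp.check()
next_state, reward = mdp.step("s0", "go", random.Random(0))
```

Storing P as one dictionary row per (state, action) pair is equivalent to keeping one transition matrix per action, as the definition suggests.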
D E F I N I T I O N
A PARTIALLY OBSERVABLE MARKOV DECISION PROCESS, or POMDP
A Partially Observable Markov Decision Process is a tuple <S, A, O, P, R> where:
S is a set of states, and
A is a set of actions.
O is the set of observations denoting what the agent can see about its world.
Since the agent cannot directly observe its current state, the observations are
probabilistically related to the underlying actual state of the world.
pa(st, o, st+1) = p(st+1, ot = o | st, at = a) is the probability that when the agent
executes action a from state st at time t, it results in an observation o that
leads to an underlying state st+1 at time t+1.
R(st, a, st+1) is the reward received by the agent when it executes action a in
state st and transitions to state st+1.
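Because the agent cannot observe its state, a common way to act in a POMDP is to maintain a belief, a probability distribution over S, and revise it after each action and observation. The Python sketch below shows the standard Bayesian update using the joint probability pa(st, o, st+1) defined above. The function name, the table layout, and the example states, action, and observations are all hypothetical.

```python
def belief_update(belief, action, observation, joint):
    """Return the new belief after taking `action` and seeing `observation`.

    joint[(s, action)][(s_next, o)] = p(s_next, o | s, action)
    """
    unnormalized = {}
    for s, b in belief.items():
        for (s_next, o), p in joint[(s, action)].items():
            if o == observation:
                unnormalized[s_next] = unnormalized.get(s_next, 0.0) + b * p
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: v / total for s, v in unnormalized.items()}


# Hypothetical two-state example with made-up numbers.
joint = {
    ("ok", "wait"): {
        ("ok", "green"): 0.72, ("ok", "red"): 0.08,
        ("bad", "green"): 0.04, ("bad", "red"): 0.16,
    },
    ("bad", "wait"): {("bad", "green"): 0.2, ("bad", "red"): 0.8},
}
new_belief = belief_update({"ok": 0.5, "bad": 0.5}, "wait", "red", joint)
```

Seeing "red" shifts most of the probability mass onto "bad", since that observation is far more likely when the underlying state is "bad".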
Table 13.1. Transition probabilities and expected rewards for the finite MDP of
the recycling robot example. The table contains all possible combinations
of the current state, st, next state, st+1, and the actions, at, and
rewards possible from the current state.

st      st+1    at          pa(st, st+1)    Ra(st, st+1)
high    high    search      a               R_search
high    low     search      1 - a           R_search
low     high    search      1 - b           -3
low     low     search      b               R_search
high    high    wait        1               R_wait
high    low     wait        0               R_wait
low     high    wait        0               R_wait
low     low     wait        1               R_wait
low     high    recharge    1               0
low     low     recharge    0               0
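The rows of Table 13.1 can be encoded directly and checked for consistency. In this Python sketch the numeric values chosen for a, b, R_search, and R_wait are assumptions made only so the code runs; the table itself leaves them symbolic. The helper names are likewise illustrative.

```python
# Assumed values for illustration only; Table 13.1 keeps these symbolic.
a, b = 0.75, 0.5
R_search, R_wait = 2.0, 1.0

# (state, action) -> list of (next state, probability, reward), one per table row
table = {
    ("high", "search"): [("high", a, R_search), ("low", 1 - a, R_search)],
    ("low", "search"): [("high", 1 - b, -3.0), ("low", b, R_search)],
    ("high", "wait"): [("high", 1.0, R_wait), ("low", 0.0, R_wait)],
    ("low", "wait"): [("high", 0.0, R_wait), ("low", 1.0, R_wait)],
    ("low", "recharge"): [("high", 1.0, 0.0), ("low", 0.0, 0.0)],
}


def rows_are_distributions(tbl):
    """True if the next-state probabilities of every (state, action) sum to 1."""
    return all(
        abs(sum(p for _, p, _ in rows) - 1.0) < 1e-9 for rows in tbl.values()
    )


def expected_reward(state, action):
    """Expected immediate reward of taking `action` in `state`."""
    return sum(p * r for _, p, r in table[(state, action)])


consistent = rows_are_distributions(table)
low_search = expected_reward("low", "search")
```

With these assumed numbers, searching from the low state has a negative expected reward because of the -3 penalty row, which is the trade-off the recycling robot example is built around.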
Figure 13.13. The transition graph for the recycling robot. The state nodes are
the large circles and the action nodes are the small black states.
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Artificial Intelligence

  • 1. George F Luger, ARTIFICIAL INTELLIGENCE, 6th edition: Structures and Strategies for Complex Problem Solving. Machine Learning: Probabilistic. Luger: Artificial Intelligence, 6th edition. © Pearson Education Limited, 2009. Contents: 13.0 Stochastic and Dynamic Models of Learning; 13.1 Hidden Markov Models (HMMs); 13.2 Dynamic Bayesian Networks and Learning; 13.3 Stochastic Extensions to Reinforcement Learning; 13.4 Epilogue and References; 13.5 Exercises.
  • 2. DEFINITION: HIDDEN MARKOV MODEL. A graphical model is called a hidden Markov model (HMM) if it is a Markov model whose states are not directly observable but are hidden by a further stochastic system interpreting their output. More formally, given a set of states $S = s_1, s_2, \ldots, s_n$, and given a set of state transition probabilities $A = a_{11}, a_{12}, \ldots, a_{1n}, a_{21}, a_{22}, \ldots, a_{nn}$, there is a set of observation likelihoods, $O = p_i(o_t)$, each expressing the probability of an observation $o_t$ (at time $t$) being generated by a state $s_t$.
  • 3. Figure 13.1 A state transition diagram of a hidden Markov model of two states designed for the coin flipping problem. The $a_{ij}$ are determined by the elements of the 2 x 2 transition matrix, A. Diagram labels: state $S_1$ with $p(H) = b_1$ and $p(T) = 1 - b_1$; state $S_2$ with $p(H) = b_2$ and $p(T) = 1 - b_2$; transitions $a_{11}$, $a_{12} = 1 - a_{11}$, $a_{21} = 1 - a_{22}$, $a_{22}$.
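As a rough illustration of how the two-coin model assigns a probability to a sequence of flips, here is a minimal Python sketch of the forward algorithm. The slide leaves $a_{11}$, $a_{22}$, $b_1$, and $b_2$ symbolic, so the numeric values below are assumptions chosen only for illustration, as is the uniform starting distribution, which the figure does not show.

```python
from itertools import product

# Assumed, illustrative parameters; the slide leaves these symbolic.
A11, A22 = 0.9, 0.8      # self-transition probabilities a11 and a22
B1, B2 = 0.5, 0.9        # p(H) for coin/state S1 and coin/state S2
START = [0.5, 0.5]       # assumed initial state distribution (not on the slide)

# Each row sums to 1 because a12 = 1 - a11 and a21 = 1 - a22.
TRANS = [[A11, 1 - A11],
         [1 - A22, A22]]
EMIT = [{"H": B1, "T": 1 - B1},
        {"H": B2, "T": 1 - B2}]


def sequence_probability(observations):
    """Forward algorithm: p(observations), summed over all hidden state paths."""
    if not observations:
        return 1.0
    # alpha[s] = p(observations so far, current hidden state = s)
    alpha = [START[s] * EMIT[s][observations[0]] for s in range(2)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[prev] * TRANS[prev][s] for prev in range(2)) * EMIT[s][obs]
            for s in range(2)
        ]
    return sum(alpha)


def total_probability(length):
    """Sum p(sequence) over every H/T sequence of the given length."""
    return sum(
        sequence_probability("".join(flips))
        for flips in product("HT", repeat=length)
    )


if __name__ == "__main__":
    print(sequence_probability("HHT"))
    print(total_probability(3))
```

The second function is only a sanity check: the probabilities of all sequences of a fixed length should add up to 1 for any valid choice of parameters.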
  • 4. Figure 13.2 The state transition diagram for a three state hidden Markov model of coin flipping. Each coin/state, $S_i$, has its bias, $b_i$. Diagram labels: states $S_1$, $S_2$, $S_3$, where each $S_i$ has $p(H) = b_i$ and $p(T) = 1 - b_i$; transitions $a_{11}$, $a_{12}$, $a_{13}$, $a_{21}$, $a_{22}$, $a_{23}$, $a_{31}$, $a_{32}$, $a_{33}$.
  • 5. Figure 13.3. a. The hidden, S, and observable, O, states of the AR-HMM where $p(O_t \mid S_t, O_{t-1})$. b. The values of the hidden $S_t$ of the example AR-HMM: safe, unsafe, and faulty.
  • 6. Figure 13.4 A selection of the real-time data across multiple time periods, with one time slice expanded, for the AR-HMM.
  • 7. Figure 13.5 The time-series data of Figure 13.4 processed by a fast Fourier transform on the frequency domain. This was the data submitted to the AR-HMM for each time period.
  • 8. Figure 13.6 An auto-regressive factorial HMM, where the observable state $O_t$, at time $t$, is dependent on multiple ($S_t$) subprocesses, $S^i_t$, and $O_{t-1}$.
  • 9. Figure 13.7 A PFSM representing a set of phonetically related English words. The probability of each word occurring is below that word. Adapted from Jurafsky and Martin (2008). Diagram labels: nodes Start, #, n, iy, t, d, um, End; arc probabilities .48, .52, .89, .11, .36, .64. Word probabilities: neat .00013, need .00056, new .001, knee .000024.
  • 10. Figure 13.8 A trace of the Viterbi algorithm on several of the paths of Figure 13.7. Rows report the maximum value for Viterbi on each word for each input value (top row). Adapted from Jurafsky and Martin (2008).

    Start = 1.0

    | Word | Word probability | Paths | # | n | iy | # (end) |
    |---|---|---|---|---|---|---|
    | neat | .00013 | 2 paths | 1.0 | 1.0 x .00013 = .00013 | .00013 x 1.0 = .00013 | .00013 x .52 = .000067 |
    | need | .00056 | 2 paths | 1.0 | 1.0 x .00056 = .00056 | .00056 x 1.0 = .00056 | .00056 x .11 = .000062 |
    | new | .001 | 2 paths | 1.0 | 1.0 x .001 = .001 | .001 x .36 = .00036 | .00036 x 1.0 = .00036 |
    | knee | .000024 | 1 path | 1.0 | 1.0 x .000024 = .000024 | .000024 x 1.0 = .000024 | .000024 x 1.0 = .000024 |

    Total best: .00036
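The arithmetic in the trace can be re-checked with a few lines of Python. This is only a sketch of the products shown in Figure 13.8, not the Viterbi algorithm itself: each word is reduced to its word probability and the single non-unit arc probability on its path, both read off the figure. The names `TRACE` and `path_score` are hypothetical. Note that the exact product for neat is .0000676, which the figure reports as .000067.

```python
# (word probability, the one non-unit arc probability used in the trace),
# both taken from Figure 13.8. "knee" has a single path, so its factor is 1.0.
TRACE = {
    "neat": (0.00013, 0.52),
    "need": (0.00056, 0.11),
    "new": (0.001, 0.36),
    "knee": (0.000024, 1.0),
}


def path_score(word_probability, arc_probability, start=1.0):
    """Multiply the start value by the probabilities met along the path."""
    return start * word_probability * arc_probability


scores = {word: path_score(*factors) for word, factors in TRACE.items()}
best_word = max(scores, key=scores.get)

if __name__ == "__main__":
    for word, score in scores.items():
        print(f"{word}: {score:.7f}")
    print("best:", best_word)
```

The largest product belongs to new, which agrees with the "Total best" entry of .00036 in the figure.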
  • 11. Viterbi pseudocode:

    function Viterbi(Observations of length T, Probabilistic FSM)
    begin
      number := number of states in FSM
      create probability matrix viterbi[R = N + 2, C = T + 2];
      viterbi[0, 0] := 1.0;
      for each time step (observation) t from 0 to T do
        for each state si from i = 0 to number do
          for each transition from si to sj in the Probabilistic FSM do
          begin
            new-count := viterbi[si, t] x path[si, sj] x p(sj | si);
            if ((viterbi[sj, t + 1] = 0) or (new-count > viterbi[sj, t + 1]))
            then
            begin
              viterbi[sj, t + 1] := new-count
              append back-pointer [sj, t + 1] to back-pointer list
            end
          end;
      return viterbi[R, C];
      return back-pointer list
    end.
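For comparison, here is a minimal runnable sketch of Viterbi decoding in Python. It is the standard formulation over an HMM with explicit start, transition, and emission probabilities, not a line-by-line transcription of the slide's PFSM pseudocode. The two-coin model and all of its numbers are assumptions made up for the example.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely state path, its probability) for a discrete HMM."""
    if not observations:
        return [], 1.0
    # best[t][s]: probability of the best path that ends in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    # back[t][s]: predecessor of s on that best path
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] * trans_p[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score * emit_p[s][observations[t]]
            back[t][s] = prev
    # Follow the back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical two-coin model; every number here is an assumption.
STATES = ("S1", "S2")
START_P = {"S1": 0.5, "S2": 0.5}
TRANS_P = {"S1": {"S1": 0.9, "S2": 0.1},
           "S2": {"S1": 0.2, "S2": 0.8}}
EMIT_P = {"S1": {"H": 0.5, "T": 0.5},
          "S2": {"H": 0.9, "T": 0.1}}

if __name__ == "__main__":
    print(viterbi("HHT", STATES, START_P, TRANS_P, EMIT_P))
```

Storing one back-pointer per state and time step, rather than appending to a single list, makes it straightforward to recover the best path at the end.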
  • 12. Figure 13.9 A DBN example of two time slices. The set Q of random variables are hidden, the set O observed, t indicates time.
  • 13. Figure 13.10 A Bayesian belief net for the burglar alarm, earthquake, burglary example.
  • 14. Figure 13.11 A Markov random field reflecting the potential functions of the random variables in the BBN of Figure 13.10, together with the two observations about the system.
  • 15. Figure 13.12. A learnable node L is added to the Markov random field of Figure 13.11. The Markov random field iterates across three time periods. For simplicity, the EM iteration is only indicated at time 1.
  • 16. DEFINITION: A MARKOV DECISION PROCESS, or MDP. A Markov Decision Process is a tuple $\langle S, A, P, R \rangle$ where: S is a set of states, and A is a set of actions. $p_a(s_t, s_{t+1}) = p(s_{t+1} \mid s_t, a_t = a)$ is the probability that if the agent executes action $a \in A$ from state $s_t$ at time $t$, it results in state $s_{t+1}$ at time $t+1$. Since the probability, $p_a \in P$, is defined over the entire state-space of actions, it is often represented with a transition matrix. $R(s)$ is the reward received by the agent when in state $s$.
  • 17. DEFINITION: A PARTIALLY OBSERVABLE MARKOV DECISION PROCESS, or POMDP. A Partially Observable Markov Decision Process is a tuple $\langle S, A, O, P, R \rangle$ where: S is a set of states, and A is a set of actions. O is the set of observations denoting what the agent can see about its world. Since the agent cannot directly observe its current state, the observations are probabilistically related to the underlying actual state of the world. $p_a(s_t, o, s_{t+1}) = p(s_{t+1}, o_t = o \mid s_t, a_t = a)$ is the probability that when the agent executes action $a$ from state $s_t$ at time $t$, it results in an observation $o$ that leads to an underlying state $s_{t+1}$ at time $t+1$. $R(s_t, a, s_{t+1})$ is the reward received by the agent when it executes action $a$ in state $s_t$ and transitions to state $s_{t+1}$.
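Because the agent cannot observe its state directly, a common way to use this definition is to keep a probability distribution over states (a belief) and revise it after each action and observation. The slide does not spell this step out. The sketch below applies the usual Bayes-rule update to the joint probability $p(s_{t+1}, o \mid s_t, a)$ from the definition. The function name, the dictionary layout, and the two-state sensor model are all hypothetical.

```python
def belief_update(belief, action, observation, joint):
    """Bayes update of a belief over states.

    joint[(s, action)][(s_next, o)] holds p(s_next, o | s, action);
    missing entries are treated as probability 0.
    """
    states = list(belief)
    unnormalized = {
        s_next: sum(
            belief[s] * joint[(s, action)].get((s_next, observation), 0.0)
            for s in states
        )
        for s_next in states
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: value / total for s, value in unnormalized.items()}


# Hypothetical model: waiting never changes the state, and a noisy sensor
# reports "ok" or "warn" with state-dependent probabilities.
JOINT = {
    ("high", "wait"): {("high", "ok"): 0.8, ("high", "warn"): 0.2},
    ("low", "wait"): {("low", "ok"): 0.3, ("low", "warn"): 0.7},
}

if __name__ == "__main__":
    prior = {"high": 0.5, "low": 0.5}
    print(belief_update(prior, "wait", "warn", JOINT))
```

In this made-up example a "warn" reading shifts the belief toward the low state, since that observation is more likely there.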
  • 18. Table 13.1. Transition probabilities and expected rewards for the finite MDP of the recycling robot example. The table contains all possible combinations of the current state, $s_t$, next state, $s_{t+1}$, the actions, $a_t$, and rewards possible from the current state.

    | $s_t$ | $s_{t+1}$ | $a_t$ | $p_a(s_t, s_{t+1})$ | $R_a(s_t, s_{t+1})$ |
    |---|---|---|---|---|
    | high | high | search | $\alpha$ | R_search |
    | high | low | search | $1 - \alpha$ | R_search |
    | low | high | search | $1 - \beta$ | -3 |
    | low | low | search | $\beta$ | R_search |
    | high | high | wait | 1 | R_wait |
    | high | low | wait | 0 | R_wait |
    | low | high | wait | 0 | R_wait |
    | low | low | wait | 1 | R_wait |
    | low | high | recharge | 1 | 0 |
    | low | low | recharge | 0 | 0 |
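To make the table concrete, the sketch below encodes its ten rows as a Python dictionary and runs value iteration over them. Value iteration is a standard dynamic-programming method brought in here for illustration; the table itself does not prescribe it. The table leaves its two search probabilities and the rewards R_search and R_wait symbolic, so the values of `ALPHA`, `BETA`, `R_SEARCH`, `R_WAIT`, and the discount `GAMMA` are assumptions.

```python
# Assumed numeric values; the table keeps all of these symbolic.
ALPHA, BETA = 0.8, 0.6          # probability of staying high / staying low while searching
R_SEARCH, R_WAIT = 2.0, 1.0     # rewards for searching and for waiting
GAMMA = 0.9                     # discount factor (not part of the table)

# (state, action) -> [(next_state, probability, reward)], one entry per table row.
MDP = {
    ("high", "search"): [("high", ALPHA, R_SEARCH), ("low", 1 - ALPHA, R_SEARCH)],
    ("low", "search"): [("high", 1 - BETA, -3.0), ("low", BETA, R_SEARCH)],
    ("high", "wait"): [("high", 1.0, R_WAIT), ("low", 0.0, R_WAIT)],
    ("low", "wait"): [("high", 0.0, R_WAIT), ("low", 1.0, R_WAIT)],
    ("low", "recharge"): [("high", 1.0, 0.0), ("low", 0.0, 0.0)],
}
STATES = ("high", "low")


def actions(state):
    """Actions that the table lists for the given state."""
    return [a for (s, a) in MDP if s == state]


def q_value(state, action, values):
    """Expected reward plus discounted value of the next state."""
    return sum(p * (r + GAMMA * values[nxt]) for nxt, p, r in MDP[(state, action)])


def value_iteration(tolerance=1e-10):
    """Repeat Bellman backups until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    while True:
        new_values = {
            s: max(q_value(s, a, values) for a in actions(s)) for s in STATES
        }
        if max(abs(new_values[s] - values[s]) for s in STATES) < tolerance:
            return new_values
        values = new_values


def greedy_policy(values):
    """Pick, for each state, the action with the highest backed-up value."""
    return {s: max(actions(s), key=lambda a: q_value(s, a, values)) for s in STATES}


if __name__ == "__main__":
    v = value_iteration()
    print(v)
    print(greedy_policy(v))
```

Different assumed values for the probabilities and rewards can change which action is preferred in the low state, so the printed policy should be read as a property of these particular numbers.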
  • 19. Figure 13.13. The transition graph for the recycling robot. The state nodes are the large circles and the action nodes are the small black states.