Dynamic Programming:
basics and case studies
Houston Machine Learning Meetup
11/16/2019
Dynamic Programming: name and story
• Richard Bellman coined the term “Dynamic Programming”
Bellman autobiography
“The face of Wilson (the Secretary of Defense) would turn red, and he would get
violent if people used the term RESEARCH in his presence. You can imagine how he
felt, then, about the term MATHEMATICAL …. I had to do something to shield Wilson
and the Air Force from the fact that I was really doing MATHEMATICS inside the
RAND Corporation…. I decided therefore to use the word “PROGRAMMING". I
wanted to get across the idea that this was DYNAMIC, this was multistage, this was
time-varying…. I thought dynamic programming was a good name. It was something
not even a Congressman could object to..."
Fibonacci sequence
• Definition:
• F(0) = 0
• F(1) = 1
• F(n) = F(n – 1) + F(n – 2)
Fibonacci sequence
• Definition:
• F(0) = 0
• F(1) = 1
• F(n) = F(n – 1) + F(n – 2)
• Solved by recursion
public int fib(int N) {
    if (N == 0 || N == 1) { return N; }
    return fib(N - 1) + fib(N - 2);
}
Time complexity: O(2^N) (a memoized variant is sketched below)
Recursion tree of Fibonacci sequence
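The exponential cost comes from recomputing the same subproblems over and over, as the recursion tree shows. A minimal memoized (top-down) sketch in the same Java style as the snippet above; the cache array and method name are mine, not from the slides:

public long fibMemo(int n, long[] cache) {
    // cache[i] holds F(i) once computed; 0 means "not computed yet" for i >= 2
    if (n == 0 || n == 1) { return n; }
    if (cache[n] != 0) { return cache[n]; }
    cache[n] = fibMemo(n - 1, cache) + fibMemo(n - 2, cache);
    return cache[n];
}
// Usage: fibMemo(10, new long[11]) returns 55; each F(i) is computed once, so the time is O(N).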
Fibonacci sequence
• Definition:
• F(0) = 0
• F(1) = 1
• F(n) = F(n – 1) + F(n – 2)
• Solved by DP
Time complexity: O(N)
Index 0 1 2 3 4 5 …..
F(N) 0 1
Fibonacci sequence
• Definition:
• F(0) = 0
• F(1) = 1
• F(n) = F(n – 1) + F(n – 2)
• Solved by DP
Time complexity: O(N)
Index 0 1 2 3 4 5 …..
F(N) 0 1 1
Fibonacci sequence
• Definition:
• F(0) = 0
• F(1) = 1
• F(n) = F(n – 1) + F(n – 2)
• Solved by DP
Time complexity: O(N)
Index 0 1 2 3 4 5 …..
F(N) 0 1 1 2
Fibonacci sequence
• Definition:
• F(0) = 0
• F(1) = 1
• F(n) = F(n – 1) + F(n – 2)
• Solved by DP
Time complexity: O(N)
Index 0 1 2 3 4 5 …..
F(N) 0 1 1 2 3
Fibonacci sequence
• Definition:
• F(0) = 0
• F(1) = 1
• F(n) = F(n – 1) + F(n – 2)
• Solved by DP
Time complexity: O(N)
Index 0 1 2 3 4 5 …..
F(N) 0 1 1 2 3 5
Fibonacci sequence
• Recursion:
• F(n) = F(n – 1) + F(n – 2)
• Starts from n
• When computing F(n), F(n-1) and F(n-2) are not known yet
• DP:
• F(n) = F(n – 1) + F(n – 2)
• Starts from 0 and 1
• When computing F(n), F(n-1) and F(n-2) have already been stored in the array
• Dynamic programming: partial results are stored to save time (see the sketch below)
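A minimal bottom-up sketch of the table filling shown above, kept in Java to match the recursive version; the array name is mine:

public int fib(int n) {
    if (n == 0 || n == 1) { return n; }
    int[] f = new int[n + 1];       // f[i] stores the partial result F(i)
    f[0] = 0;
    f[1] = 1;
    for (int i = 2; i <= n; i++) {
        f[i] = f[i - 1] + f[i - 2]; // induction rule: reuse the stored values
    }
    return f[n];                    // O(N) time, O(N) space
}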
Longest common subsequence
• To find the longest subsequence common to two or more sequences
• String1: “AGCAT”
• String2: “GAC”
• Common subsequences: “A”, “C”, “G”, “AC”, “GC”, “GA”, …
• LCS: “AC”, “GC”, or “GA” (all of length 2)
• Use a table to find the LCS:
• First column: string1(“AGCAT”)
• First row: string2(“GAC”)
• Table[i, j]: LCS of string1.substring(0, i) and string2.substring(0, j)
Longest common subsequence
(The next slides fill in the LCS table for “AGCAT” and “GAC” cell by cell; the figures are not reproduced here. A code sketch of the same table fill follows.)
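A minimal sketch of that table fill, assuming Java as elsewhere in the deck; the method and variable names are mine. table[i][j] holds the LCS length of the first i characters of string1 and the first j characters of string2:

public int longestCommonSubsequence(String s1, String s2) {
    int m = s1.length(), n = s2.length();
    int[][] table = new int[m + 1][n + 1];  // table[i][j]: LCS length of s1[0..i) and s2[0..j)
    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                table[i][j] = table[i - 1][j - 1] + 1;  // extend the diagonal match
            } else {
                table[i][j] = Math.max(table[i - 1][j], table[i][j - 1]);
            }
        }
    }
    return table[m][n];  // longestCommonSubsequence("AGCAT", "GAC") returns 2
}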
Wildcard matching
• Linux command-line:
user@bash: ls b*
barry.txt blan.txt bob.txt
• Complicated example:
string = "adcab"
pattern = "*a*b"
• DP solution:
• Definition: table[i][j] = whether the first i characters of string match the first j characters of pattern
• Base case:
table[0][0] = true
first row: table[0][j + 1] = table[0][j] if pattern[j] = '*', otherwise false
• Induction rule:
(1) if string[i] equals pattern[j] or pattern[j] equals '?':
table[i + 1][j + 1] = table[i][j]
(2) if pattern[j] equals '*':
table[i + 1][j + 1] = table[i + 1][j] or table[i][j + 1]
- * a * b
- T T F F F
a
d
c
a
b
Wildcard matching
- * a * b
- T T F F F
a F T T T F
d F T F T F
c F T F T F
a F T T
b
• Linux command-line:
user@bash: ls b*
barry.txt blan.txt bob.txt
• Complicated example:
string = "adcab"
pattern = "*a*b"
• DP solution:
• Definition: table[i][j] = whether the first i characters of string match the first j characters of pattern
• Base case:
table[0][0] = true
first row: table[0][j + 1] = table[0][j] if pattern[j] = '*', otherwise false
• Induction rule:
(1) if string[i] equals pattern[j] or pattern[j] equals '?':
table[i + 1][j + 1] = table[i][j]
(2) if pattern[j] equals '*':
table[i + 1][j + 1] = table[i + 1][j] or table[i][j + 1]
Wildcard matching
- * a * b
- T T F F F
a F T T T F
d F T F T F
c F T F T F
a F T T T F
b F T F T
• Linux command-line:
user@bash: ls b*
barry.txt blan.txt bob.txt
• Complicated example:
string = "adcab"
pattern = "*a*b"
• DP solution:
• Definition: table[i][j] = whether the first i characters of string match the first j characters of pattern
• Base case:
table[0][0] = true
first row: table[0][j + 1] = table[0][j] if pattern[j] = '*', otherwise false
• Induction rule:
(1) if string[i] equals pattern[j] or pattern[j] equals '?':
table[i + 1][j + 1] = table[i][j]
(2) if pattern[j] equals '*':
table[i + 1][j + 1] = table[i + 1][j] or table[i][j + 1]
Wildcard matching
- * a * b
- T T F F F
a F T T T F
d F T F T F
c F T F T F
a F T T T F
b F T F T T
• Linux command-line:
user@bash: ls b*
barry.txt blan.txt bob.txt
• Complicated example:
string = "adcab"
pattern = "*a*b"
• DP solution:
• Definition: table[i][j] = whether the first i characters of string match the first j characters of pattern
• Base case:
table[0][0] = true
first row: table[0][j + 1] = table[0][j] if pattern[j] = '*', otherwise false
• Induction rule:
(1) if string[i] equals pattern[j] or pattern[j] equals '?':
table[i + 1][j + 1] = table[i][j]
(2) if pattern[j] equals '*':
table[i + 1][j + 1] = table[i + 1][j] or table[i][j + 1]
(These rules are implemented in the sketch below.)
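A minimal Java sketch of the base case and induction rule above; the method name and loop structure are mine, and the answer is the bottom-right cell of the table:

public boolean isMatch(String s, String p) {
    int m = s.length(), n = p.length();
    boolean[][] table = new boolean[m + 1][n + 1];  // table[i][j]: s[0..i) matches p[0..j)
    table[0][0] = true;                             // base case: empty matches empty
    for (int j = 0; j < n; j++) {                   // first row: only leading '*' can match ""
        if (p.charAt(j) == '*') { table[0][j + 1] = table[0][j]; }
    }
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            if (p.charAt(j) == s.charAt(i) || p.charAt(j) == '?') {
                table[i + 1][j + 1] = table[i][j];
            } else if (p.charAt(j) == '*') {
                // '*' matches the empty string (drop the '*') or one more character of s
                table[i + 1][j + 1] = table[i + 1][j] || table[i][j + 1];
            }
        }
    }
    return table[m][n];  // isMatch("adcab", "*a*b") returns true, matching the table above
}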
Longest common subsequence and wildcard
matching
• DP runs from the initial condition to the end of the string:
• From left to right within each row
• From top to bottom within each column
• State transition from table[i - 1][j - 1], table[i][j - 1], table[i - 1][j] to
table[i][j]
• Each time: move forward by one step
• The state at each step is the global optimum for that step
• A table (or diagram) is the best tool to simulate the process
Matrix chain multiplication
• Multiply two matrices: A (10 x 100) and B (100 x 5)
• OUT[p][r] += A[p][q] * B[q][r] (see the sketch below)
• Computation = 10 x 100 x 5 scalar multiplications
• Multiply three matrices: A1 (10 x 100), A2 (100 x 5), and A3 (5 x 50)
• ((A1 A2) A3): 10 x 100 x 5 (for A1 A2) + 10 x 5 x 50 = 7500
• (A1 (A2 A3)): 100 x 5 x 50 (for A2 A3) + 10 x 100 x 50 = 75000
• ((A1 A2) A3) is 10 times cheaper than (A1 (A2 A3)) in terms of scalar
multiplications
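A small sketch of where the 10 x 100 x 5 count comes from, written to mirror the OUT[p][r] += A[p][q] * B[q][r] line above; the method is mine and works for any two conforming matrices:

// A is rows x inner, B is inner x cols (e.g., 10 x 100 and 100 x 5)
double[][] multiply(double[][] A, double[][] B) {
    int rows = A.length, inner = B.length, cols = B[0].length;
    double[][] out = new double[rows][cols];
    for (int p = 0; p < rows; p++)
        for (int q = 0; q < inner; q++)
            for (int r = 0; r < cols; r++)
                out[p][r] += A[p][q] * B[q][r];  // one scalar multiplication per iteration
    return out;  // rows * inner * cols multiplications in total, e.g. 10 * 100 * 5 = 5000
}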
Matrix chain multiplication
• How to optimize the chain multiplication of matrices (A1, A2, A3, …, An)?
• DP induction rule: M[i, j] = min over i <= k < j of { M[i, k] + M[k + 1, j] + p(i - 1) * p(k) * p(j) }, where Ai has dimensions p(i - 1) x p(i)
Matrix chain multiplication: DP solution
• Multiplication of six matrices:
• State:
• M[i, j]: the minimum number of scalar multiplications needed to multiply matrices Ai through Aj
• S[i, j]: the last-layer break point for M[i, j]
Matrix chain multiplication: DP solution
• Six-matrix example: the M and S tables are filled in order of increasing chain length; the step-by-step figures are not reproduced here
• Resulting parenthesization: (A1 (A2 A3)) ((A4 A5) A6)
Matrix chain multiplication: DP solution
• The state is harder to define:
• M[i, j]
• S[i, j]
• The state transition is more complicated:
• Filling by row and column does not work
• Move from previous states to the current state by chain length (induction rule); a code sketch follows below
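A minimal Java sketch of the M/S tables and the length-based induction rule; the names are mine, and dims[i - 1] x dims[i] are assumed to be the dimensions of Ai:

// dims has length n + 1; matrix Ai is dims[i - 1] x dims[i]
public int matrixChainOrder(int[] dims) {
    int n = dims.length - 1;
    int[][] M = new int[n + 1][n + 1];  // M[i][j]: min scalar multiplications for Ai..Aj
    int[][] S = new int[n + 1][n + 1];  // S[i][j]: best last-layer break point k
    for (int len = 2; len <= n; len++) {            // the induction runs over chain length
        for (int i = 1; i + len - 1 <= n; i++) {
            int j = i + len - 1;
            M[i][j] = Integer.MAX_VALUE;
            for (int k = i; k < j; k++) {           // split as (Ai..Ak)(Ak+1..Aj)
                int cost = M[i][k] + M[k + 1][j] + dims[i - 1] * dims[k] * dims[j];
                if (cost < M[i][j]) { M[i][j] = cost; S[i][j] = k; }
            }
        }
    }
    return M[1][n];  // e.g., dims = {10, 100, 5, 50} gives 7500, i.e. ((A1 A2) A3)
}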
Framework of dynamic programming
• Three key components of dynamic programming algorithm:
• Definition of state
• Initial condition (base)
• Induction rule (state transition)
• Induction rule: difficult to find
• 1D/2D table for the thinking process
What is part of speech tagging?
• Identify parts of speech (syntactic categories):
This is a simple sentence
DET VB DET ADJ NOUN
• POS tagging is a first step towards syntactic (and semantic) analysis
• Faster than full parsing
• Text classification and word disambiguation
• How to decide the correct label:
• Word to be labeled: chair is probably a noun
• Labels of surrounding words: if the preceding word is a modal verb (e.g., will) then this
word is more likely to be a verb
• Hidden Markov models can be used to work on this problem
Why is POS tagging hard?
• Ambiguity
glass of water/NOUN vs. water/VERB the plants
lie/VERB down vs. tell a lie/NOUN
wind/VERB down vs. a mighty wind/NOUN (homographs)
How about time flies like an arrow?
• Sparse data:
• Words we haven’t seen before
• Word-Tag pairs we haven’t seen before
Example transition probabilities
• Probabilities estimated from tagged WSJ corpus:
• Proper nouns (NNP) often begin sentences: P(NNP|<s>) = 0.28
• Modal verbs (MD) nearly always followed by bare verbs (VB).
• Adjectives (JJ) are often followed by nouns (NN).
Example output probabilities
• Probabilities estimated from tagged WSJ corpus:
• 0.0032% of proper nouns are Janet: P(Janet|NNP) = 0.000032
• About half of determiners (DT) are the.
• the can also be a proper noun.
Hidden Markov Model
• A set of states (tags)
• An output alphabet (words)
• Initial state (beginning of sentence)
• State transition probabilities ( P(ti|ti-1) )
• Symbol emission probabilities ( P(wi|ti) )
Hidden Markov Model
• Model the tagging process:
• Sentence: W = (w1, w2, … wn)
• Tags T = (t1, t2, …, tn)
• Joint probability: P(W, T) = [ Π i=1..n P(ti | ti-1) P(wi | ti) ] × P(</s> | tn)
• Example:
• This/DET is/VB a/DET simple/JJ sentence/NN
• Add beginning-of-sentence (<s>) and end-of-sentence (</s>) markers:
P(W, T) = P(DET|<s>) P(VB|DET) P(DET|VB) P(JJ|DET) P(NN|JJ) P(</s>|NN)
× P(This|DET) P(is|VB) P(a|DET) P(simple|JJ) P(sentence|NN)
Computational cost of POS tagging
• Suppose we have C possible tags for each of the n words in the
sentence
• There are C^n possible tag sequences: the number grows
exponentially in the length n
• Viterbi algorithm: use dynamic programming to solve it
Viterbi algorithm
• Target: argmax over T of P(T|W)
• Intuition: the best path of length i ending in state t must extend a best path of
length i - 1 ending in some previous state
• Use a table to store the partial results:
• T x N table; v(t, i) is the probability of the best state sequence for w1 … wi ending in
state t
• Fill in the columns from left to right; the max is over each possible previous tag t'
v(t, i) = max over t' { v(t', i - 1) P(t|t') P(wi|t) } (see the sketch below)
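A minimal Java sketch of this table fill; the names and indexing are mine, the transition, emission, and initial probabilities would come from the tagged corpus, and the P(</s>|tn) end factor is omitted for brevity:

// trans[tp][t] = P(t | tp), emit[t][w] = P(word w | tag t), init[t] = P(t | <s>)
// obs[i] is the index of the i-th word; returns the probability of the best tag sequence
public double viterbi(double[] init, double[][] trans, double[][] emit, int[] obs) {
    int T = init.length, N = obs.length;
    double[][] v = new double[T][N];           // v[t][i]: best prob for w1..wi ending in tag t
    for (int t = 0; t < T; t++) {
        v[t][0] = init[t] * emit[t][obs[0]];   // initial condition
    }
    for (int i = 1; i < N; i++) {              // fill the columns from left to right
        for (int t = 0; t < T; t++) {
            double best = 0.0;
            for (int tp = 0; tp < T; tp++) {   // max over each possible previous tag t'
                best = Math.max(best, v[tp][i - 1] * trans[tp][t]);
            }
            v[t][i] = best * emit[t][obs[i]];  // induction rule
        }
    }
    double answer = 0.0;
    for (int t = 0; t < T; t++) { answer = Math.max(answer, v[t][N - 1]); }
    return answer;  // backpointers (not shown) would recover the argmax tag sequence
}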
Viterbi algorithm: case study
• W = the doctor is in.
(The trellis for this sentence is filled in column by column over several slides; the figures are not reproduced here.)
Viterbi algorithm: all tagged
Dynamic programming: take-home message
• Why it is fast: memory is used to store partial results
• DP algorithm components: state definition, initial condition, and
induction rule
• Solve DP problems with a table
Top ten DP problems
• Longest common subsequence
• Shortest common supersequence
• Longest increasing subsequence
• Edit distance
• Matrix chain multiplication
• 0-1 knapsack problem
• Partition problem
• Rod cutting
• Coin change problem
• Word break problem
Reference
• http://people.cs.georgetown.edu/nschneid/cosc572/f16/12_viterbi_slides.pdf
• https://en.wikipedia.org/wiki/Dynamic_programming
• https://medium.com/@codingfreak/top-10-dynamic-programming-problems-5da486eeb360
• https://leetcode.com/problems/wildcard-matching/description/
• https://en.wikipedia.org/wiki/Longest_common_subsequence_problem