Dynamic Programming:
basics and case studies
Houston Machine Learning Meetup
11/16/2019
Dynamic Programming: name and story
• Richard Bellman coined the term “Dynamic Programming”
Bellman autobiography
“The face of Wilson (the Secretary of Defense) would turn red, and he would get
violent if people used the term RESEARCH in his presence. You can imagine how he
felt, then, about the term MATHEMATICAL …. I had to do something to shield Wilson
and the Air Force from the fact that I was really doing MATHEMATICS inside the
RAND Corporation…. I decided therefore to use the word “PROGRAMMING”. I
wanted to get across the idea that this was DYNAMIC, this was multistage, this was
time-varying…. I thought dynamic programming was a good name. It was something
not even a Congressman could object to...”
Fibonacci sequence
• Definition:
• F(0) = 0
• F(1) = 1
• F(n) = F(n – 1) + F(n – 2)
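• Solved by recursion
public int fib(int n) {
    // Base cases: F(0) = 0 and F(1) = 1
    if (n == 0 || n == 1) { return n; }
    // Each call spawns two more calls, so the same subproblems are recomputed many times
    return fib(n - 1) + fib(n - 2);
}
Time complexity: O(2^N)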
Recursion tree of Fibonacci sequence
• Solved by DP
Time complexity: O(N)
Index 0 1 2 3 4 5 …..
F(N) 0 1 1 2 3 5
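The table-filling steps above can be sketched in Java as follows. This is a minimal illustration, not code from the slides; the class name `FibonacciTable` and the use of `long` are assumptions made for the example.

```java
class FibonacciTable {
    // Bottom-up DP: fill the table from index 0 upward,
    // mirroring the Index / F(N) rows shown on the slide.
    static long fib(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        if (n < 2) {
            return n; // base cases F(0) = 0 and F(1) = 1
        }
        long[] table = new long[n + 1];
        table[0] = 0;
        table[1] = 1;
        for (int i = 2; i <= n; i++) {
            // Both operands are already stored, so each entry costs O(1).
            table[i] = table[i - 1] + table[i - 2];
        }
        return table[n];
    }

    public static void main(String[] args) {
        for (int i = 0; i <= 5; i++) {
            System.out.print(fib(i) + " ");
        }
    }
}
```

The loop runs N - 1 times, which is where the O(N) bound comes from. Note that a `long` overflows for n greater than 92, so larger inputs would need `java.math.BigInteger`.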
Fibonacci sequence
• Recursion:
• F(n) = F(n – 1) + F(n – 2)
• Starts from n
• When computing F(n), F(n-1) and F(n-2) are not known yet
• DP:
• F(n) = F(n – 1) + F(n – 2)
• Starts from 0 and 1
• When computing F(n), F(n-1) and F(n-2) have already been stored in the array
• Dynamic programming: partial results are stored to save time
Longest common subsequence
• To find the longest subsequence common to two or more sequences
• String1: “AGCAT”
• String2: “GAC”
• Common subsequences: “A”, “C”, “G”, “AC”, “GA”, “GC”
• LCS: “AC”, “GA”, or “GC”
• To use a table to find LCS:
• First column: string1(“AGCAT”)
• First row: string2(“GAC”)
• Table[i, j]: LCS of string1.substring(0, i) and string2.substring(0, j)
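The table described above can be sketched in Java. This is an illustrative version under one assumption: each cell stores the length of the LCS of the two prefixes (rather than the subsequences themselves), and one LCS is recovered by walking back through the table. The class and method names are hypothetical.

```java
class LcsTable {
    // t[i][j] = length of the LCS of a.substring(0, i) and b.substring(0, j).
    static int[][] fill(String a, String b) {
        int[][] t = new int[a.length() + 1][b.length() + 1]; // row 0 and column 0 stay 0
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    t[i][j] = t[i - 1][j - 1] + 1;               // extend the diagonal
                } else {
                    t[i][j] = Math.max(t[i - 1][j], t[i][j - 1]); // best of up / left
                }
            }
        }
        return t;
    }

    // Walk back from the bottom-right cell to recover one LCS.
    static String lcs(String a, String b) {
        int[][] t = fill(a, b);
        StringBuilder sb = new StringBuilder();
        int i = a.length();
        int j = b.length();
        while (i > 0 && j > 0) {
            if (a.charAt(i - 1) == b.charAt(j - 1)) {
                sb.append(a.charAt(i - 1));
                i--;
                j--;
            } else if (t[i - 1][j] >= t[i][j - 1]) {
                i--;
            } else {
                j--;
            }
        }
        return sb.reverse().toString();
    }
}
```

For “AGCAT” and “GAC” the bottom-right cell holds 2. Which of the equally long answers is returned depends on how ties are broken during the walk back; with the tie-break used here it is “AC”.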
Wildcard matching
• Linux command-line:
user@bash: ls b*
barry.txt blan.txt bob.txt
• Complicated example:
string = "adcab"
pattern = "*a*b"
• DP solution:
• Definition: table[i][j] is true if the first i characters of string match the first j characters of pattern
• Base case:
table[0][0] = true
first row: table[0][j + 1] = table[0][j] if pattern[j] equals *, otherwise false
• Induction rule:
(1) if string[i] equals pattern[j] or pattern[j] equals ?
table[i + 1][j + 1] = table[i][j]
(2) if pattern[j] equals *
table[i + 1][j + 1] = table[i + 1][j] or table[i][j + 1]
(3) otherwise table[i + 1][j + 1] = false
     -  *  a  *  b
-    T  T  F  F  F
a    F  T  T  T  F
d    F  T  F  T  F
c    F  T  F  T  F
a    F  T  T  T  F
b    F  T  F  T  T
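The base case and induction rule for wildcard matching can be sketched in Java as below. This is a minimal illustration that assumes the usual glob semantics (`?` matches exactly one character, `*` matches any sequence including the empty one); the class name `WildcardMatch` is hypothetical.

```java
class WildcardMatch {
    // table[i][j] is true if the first i characters of s match the first j characters of p.
    static boolean matches(String s, String p) {
        boolean[][] table = new boolean[s.length() + 1][p.length() + 1];
        table[0][0] = true; // empty string matches empty pattern

        // First row: the empty string only matches a prefix made of '*' characters.
        for (int j = 0; j < p.length(); j++) {
            if (p.charAt(j) == '*') {
                table[0][j + 1] = table[0][j];
            }
        }

        for (int i = 0; i < s.length(); i++) {
            for (int j = 0; j < p.length(); j++) {
                char c = p.charAt(j);
                if (c == '*') {
                    // '*' matches nothing (left cell) or one more character (cell above).
                    table[i + 1][j + 1] = table[i + 1][j] || table[i][j + 1];
                } else if (c == '?' || c == s.charAt(i)) {
                    // Single-character match: inherit the diagonal cell.
                    table[i + 1][j + 1] = table[i][j];
                }
                // Otherwise the cell keeps its default value, false.
            }
        }
        return table[s.length()][p.length()];
    }
}
```

Calling `WildcardMatch.matches("adcab", "*a*b")` fills the same table as shown on the slide and returns the bottom-right cell, which is true.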
Longest common subsequence and wildcard matching
• DP starts from the initial condition and proceeds to the end of the string:
• From left to right in each row
• From top to bottom in each column
• State transition from table[i - 1][j - 1], table[i][j - 1], and table[i - 1][j] to table[i][j]
• Each time: move forward by one step
• The state in each cell is the optimal answer for that subproblem (the two prefixes ending at that cell)
• A table (or diagram) is the best tool for simulating the process
Matrix chain multiplication
• Multiply two matrices: A(10 x 100) and B(100 x 5)
• OUT[p][r] += A[p][q] * B[q][r]
• Computation = 10 x 100 x 5 = 5,000 scalar multiplications
• Multiply three matrices: A1(10 x 100), A2(100 x 5), and A3(5 x 50)
• ((A1 A2) A3): 10 x 100 x 5 (for A1 A2) + 10 x 5 x 50 = 5,000 + 2,500 = 7,500
• (A1 (A2 A3)): 100 x 5 x 50 (for A2 A3) + 10 x 100 x 50 = 25,000 + 50,000 = 75,000
• ((A1 A2) A3) needs 10 times fewer scalar multiplications than (A1 (A2 A3))
Matrix chain multiplication
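• How to optimize the chain multiplication of matrices (A1, A2, A3, …, An)?
• DP induction rule (standard recurrence, where M[i, j] is the minimum number of scalar multiplications for Ai … Aj and matrix Ai has size p(i-1) x p(i)):
M[i, i] = 0
M[i, j] = min over i <= k < j of ( M[i, k] + M[k + 1, j] + p(i-1) x p(k) x p(j) )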
Matrix chain multiplication: DP solution
• Six matrices multiplication:
• State:
• M[i, j]: the minimum number of scalar multiplications needed to multiply matrices i through j
• S[i, j]: the break point of the last (outermost) multiplication for M[i, j]
• Optimal order: ((A1 (A2 A3)) ((A4 A5) A6))
Matrix chain multiplication: DP solution
• State is hard to define:
• M[i, j]
• S[i, j]
• State transition is complicated:
• Filling the table row by row or column by column does not work
• Move from previous states to the current state in order of chain length (induction rule)
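Filling M and S in order of chain length can be sketched in Java as follows. This is an illustration using the standard recurrence M[i, j] = min over k of M[i, k] + M[k + 1, j] + p(i-1) x p(k) x p(j); the class name `MatrixChainOrder` and the `dims` array (matrix Ai has size `dims[i - 1]` x `dims[i]`) are assumptions of the sketch. The six-matrix dimensions on the original slides did not survive, so the usage note borrows the dimensions of a widely used textbook example instead.

```java
class MatrixChainOrder {
    final long[][] m; // m[i][j]: min scalar multiplications for A_i..A_j (1-based)
    final int[][] s;  // s[i][j]: break point k of the last multiplication

    // dims has n + 1 entries for n matrices; A_i is dims[i - 1] x dims[i].
    MatrixChainOrder(int[] dims) {
        if (dims.length < 2) {
            throw new IllegalArgumentException("need at least one matrix");
        }
        int n = dims.length - 1;
        m = new long[n + 1][n + 1]; // m[i][i] = 0 by default
        s = new int[n + 1][n + 1];
        // Iterate by chain length, not by row or column.
        for (int len = 2; len <= n; len++) {
            for (int i = 1; i + len - 1 <= n; i++) {
                int j = i + len - 1;
                m[i][j] = Long.MAX_VALUE;
                for (int k = i; k < j; k++) {
                    long cost = m[i][k] + m[k + 1][j]
                            + (long) dims[i - 1] * dims[k] * dims[j];
                    if (cost < m[i][j]) {
                        m[i][j] = cost;
                        s[i][j] = k;
                    }
                }
            }
        }
    }

    long minCost() {
        return m[1][m.length - 1];
    }

    // Rebuild the parenthesization for A_i..A_j from the break points.
    String order(int i, int j) {
        if (i == j) {
            return "A" + i;
        }
        return "(" + order(i, s[i][j]) + " " + order(s[i][j] + 1, j) + ")";
    }
}
```

With `dims = {10, 100, 5, 50}` this gives a cost of 7500 and the order `((A1 A2) A3)`. With the assumed textbook dimensions `{30, 35, 15, 5, 10, 20, 25}` it gives a cost of 15125 and the order `((A1 (A2 A3)) ((A4 A5) A6))`.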
Framework of dynamic programming
• Three key components of dynamic programming algorithm:
• Definition of state
• Initial condition (base)
• Induction rule (state transition)
• Induction rule: difficult to find
• 1D/2D table for the thinking process
What is part of speech tagging?
• Identify parts of speech (syntactic categories):
This  is  a    simple  sentence
DET   VB  DET  ADJ     NOUN
• POS tagging is a first step towards syntactic analysis (and, in turn, semantic analysis)
• Faster than full parsing
• Useful for text classification and word disambiguation
• How to decide the correct label:
• The word to be labeled: chair is probably a noun
• Labels of surrounding words: if the preceding word is a modal verb (e.g., will), then this word is more likely to be a verb
• Hidden Markov models can be used to work on this problem
Why is POS tagging hard?
• Ambiguity
glass of water/NOUN vs. water/VERB the plants
lie/VERB down vs. tell a lie/NOUN
wind/VERB down vs. a mighty wind/NOUN (homographs)
How about “time flies like an arrow”?
• Sparse data:
• Words we haven’t seen before
• Word-Tag pairs we haven’t seen before
Example transition probabilities
• Probabilities estimated from tagged WSJ corpus:
• Proper nouns (NNP) often begin sentences: P(NNP|<s>) = 0.28
• Modal verbs (MD) nearly always followed by bare verbs (VB).
• Adjectives (JJ) are often followed by nouns (NN).
Example output probabilities
• Probabilities estimated from tagged WSJ corpus:
• 0.0032% of proper nouns are Janet: P(Janet|NNP) = 0.000032
• About half of determiners (DT) are the.
• the can also be a proper noun.
Hidden Markov Chain
• A set of states (tags)
• An output alphabet (words)
• Initial state (beginning of sentence)
• State transition probabilities: $P(t_i \mid t_{i-1})$
• Symbol emission probabilities: $P(w_i \mid t_i)$
Hidden Markov Chain
• Model the tagging process:
• Sentence: W = (w1, w2, … wn)
• Tags T = (t1, t2, …, tn)
• Joint probability: $P(W, T) = \left[ \prod_{i=1}^{n} P(t_i \mid t_{i-1}) \, P(w_i \mid t_i) \right] P(\text{</s>} \mid t_n)$, where $t_0 = \text{<s>}$
• Example:
• This/DET is/VB a/DET simple/JJ sentence/NN
• Add begin-of-sentence (<s>) and end-of-sentence (</s>) markers:
$P(W, T) = \left[ \prod_{i=1}^{n} P(t_i \mid t_{i-1}) \, P(w_i \mid t_i) \right] P(\text{</s>} \mid t_n)$
= P(DET|<s>) P(VB|DET) P(DET|VB) P(JJ|DET) P(NN|JJ) P(</s>|NN)
x P(This|DET) P(is|VB) P(a|DET) P(simple|JJ) P(sentence|NN)
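The joint probability of a tagged sentence can be computed with a short loop, sketched here in Java. The class name `HmmJoint` and the string-keyed maps (keys of the form `"tag|previousTag"` and `"word|tag"`) are assumptions made for illustration, and any probabilities passed in are whatever the caller has estimated; none come from the slides.

```java
import java.util.Map;

class HmmJoint {
    // P(W, T) = [prod_i P(t_i | t_{i-1}) P(w_i | t_i)] * P(</s> | t_n), with t_0 = <s>.
    // trans maps "tag|previousTag" to a probability; emit maps "word|tag" to a probability.
    // Pairs missing from a map are treated as probability 0 (no smoothing).
    static double joint(String[] words, String[] tags,
                        Map<String, Double> trans, Map<String, Double> emit) {
        if (words.length != tags.length) {
            throw new IllegalArgumentException("one tag per word is required");
        }
        double p = 1.0;
        String prev = "<s>"; // begin-of-sentence marker
        for (int i = 0; i < words.length; i++) {
            p *= trans.getOrDefault(tags[i] + "|" + prev, 0.0); // transition
            p *= emit.getOrDefault(words[i] + "|" + tags[i], 0.0); // emission
            prev = tags[i];
        }
        // Final transition into the end-of-sentence marker.
        return p * trans.getOrDefault("</s>|" + prev, 0.0);
    }
}
```

For real sentences the product of many small probabilities can underflow, so implementations commonly sum log-probabilities instead of multiplying.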
Computation estimation of POS
• Suppose we have C possible tags for each of the n words in the
sentence
• There are C^n possible tag sequences: the number grows
exponentially in the length n
• Viterbi algorithm: use dynamic programming to solve it
Viterbi algorithm:
• Target: $\arg\max_T P(T \mid W)$
• Intuition: the best path of length i ending in state t must include the best path of length i-1 to the previous state
• Use a table to store the partial results:
• T x N table (T tags, N words): v(t, i) is the probability of the best state sequence for w1 … wi that ends in state t
• Fill in the columns from left to right; the max is over each possible previous tag t’
$v(t, i) = \max_{t'} \, v(t', i-1) \, P(t \mid t') \, P(w_i \mid t)$
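The column-by-column fill can be sketched in Java as below. This is a simplified illustration, not the implementation behind the slides: tags and words are represented by integer indices, the arrays `init`, `trans`, and `emit` are hypothetical inputs supplied by the caller, and the final end-of-sentence transition is left out to keep the sketch short.

```java
class ViterbiSketch {
    // init[t]     = P(t | <s>)
    // trans[p][t] = P(t | p), previous tag p to current tag t
    // emit[t][w]  = P(w | t)
    // obs[i]      = index of the i-th word of the sentence
    // Returns the most probable tag index for each word.
    static int[] decode(double[] init, double[][] trans, double[][] emit, int[] obs) {
        int numTags = init.length;
        int n = obs.length;
        if (n == 0) {
            return new int[0];
        }
        double[][] v = new double[numTags][n]; // v[t][i]: best path probability ending in t
        int[][] back = new int[numTags][n];    // back[t][i]: best previous tag

        for (int t = 0; t < numTags; t++) {
            v[t][0] = init[t] * emit[t][obs[0]];
        }
        // Fill columns left to right; each cell maximizes over the previous tag.
        for (int i = 1; i < n; i++) {
            for (int t = 0; t < numTags; t++) {
                for (int p = 0; p < numTags; p++) {
                    double cand = v[p][i - 1] * trans[p][t] * emit[t][obs[i]];
                    if (cand > v[t][i]) {
                        v[t][i] = cand;
                        back[t][i] = p;
                    }
                }
            }
        }
        // Pick the best final tag, then follow the back-pointers.
        int best = 0;
        for (int t = 1; t < numTags; t++) {
            if (v[t][n - 1] > v[best][n - 1]) {
                best = t;
            }
        }
        int[] path = new int[n];
        path[n - 1] = best;
        for (int i = n - 1; i > 0; i--) {
            path[i - 1] = back[path[i]][i];
        }
        return path;
    }
}
```

The three nested loops give a running time proportional to N x C^2 for N words and C tags, instead of enumerating all C^N tag sequences.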
Viterbi algorithm: case study
• W = the doctor is in.
Viterbi algorithm: all tagged
Dynamic programming: take-home message
• Why fast: use memory to store partial result
• DP algorithm component: state definition, initial condition, and
induction rule
• Solve DP problem with a table
Top ten DP problems
• Longest common subsequence
• Shortest common supersequence
• Longest increasing subsequence
• Edit distance
• Matrix chain multiplication
• 0-1 knapsack problem
• Partition problem
• Rod cutting
• Coin change problem
• Word break problem
Reference
• http://people.cs.georgetown.edu/nschneid/cosc572/f16/12_viterbi_slides.pdf
• https://en.wikipedia.org/wiki/Dynamic_programming
• https://medium.com/@codingfreak/top-10-dynamic-programming-problems-5da486eeb360
• https://leetcode.com/problems/wildcard-matching/description/
• https://en.wikipedia.org/wiki/Longest_common_subsequence_problem

FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 

Recently uploaded (20)

National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 

Basics of Dynamic programming

  • 1. Dynamic Programming: basics and case studies Houston Machine Learning Meetup 11/16/2019
  • 2. Dynamic Programming: name and story • Richard Bellman coined the term “Dynamic Programming” • From Bellman's autobiography: “The face of Wilson (the Secretary of Defense) would turn red, and he would get violent if people used the term RESEARCH in his presence. You can imagine how he felt, then, about the term MATHEMATICAL …. I had to do something to shield Wilson and the Air Force from the fact that I was really doing MATHEMATICS inside the RAND Corporation…. I decided therefore to use the word “PROGRAMMING”. I wanted to get across the idea that this was DYNAMIC, this was multistage, this was time-varying…. I thought dynamic programming was a good name. It was something not even a Congressman could object to...”
  • 3. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2)
  • 4. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by recursion: public int fib(int n) { if (n == 0 || n == 1) { return n; } return fib(n - 1) + fib(n - 2); } • Time complexity: O(2^N) • Recursion tree of Fibonacci sequence
  • 5. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by DP • Time complexity: O(N) • Index: 0 1 2 3 4 5 … | F(N): 0 1
  • 6. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by DP • Time complexity: O(N) • Index: 0 1 2 3 4 5 … | F(N): 0 1 1
  • 7. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by DP • Time complexity: O(N) • Index: 0 1 2 3 4 5 … | F(N): 0 1 1 2
  • 8. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by DP • Time complexity: O(N) • Index: 0 1 2 3 4 5 … | F(N): 0 1 1 2 3
  • 9. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by DP • Time complexity: O(N) • Index: 0 1 2 3 4 5 … | F(N): 0 1 1 2 3 5
  • 10. Fibonacci sequence • Recursion: • F(n) = F(n – 1) + F(n – 2) • Starts from n • When computing F(n), F(n-1) and F(n-2) are not known yet • DP: • F(n) = F(n – 1) + F(n – 2) • Starts from 0 and 1 • When computing F(n), F(n-1) and F(n-2) have already been stored in an array • Dynamic programming: partial results are stored to save time
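
The bottom-up idea on this slide can be sketched in Java as follows. This is a minimal illustration, not the speaker's code: the class name `FibDp` and the use of `long` are assumptions made here for the example.

```java
// Hypothetical class name, chosen for this sketch only.
class FibDp {
    // Bottom-up DP: table[i] holds F(i), and each entry is computed exactly once.
    static long fib(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        if (n <= 1) {
            return n; // base cases F(0) = 0 and F(1) = 1
        }
        long[] table = new long[n + 1];
        table[0] = 0;
        table[1] = 1;
        for (int i = 2; i <= n; i++) {
            // Both operands were stored in earlier iterations.
            table[i] = table[i - 1] + table[i - 2];
        }
        return table[n];
    }

    public static void main(String[] args) {
        System.out.println(fib(5)); // prints 5, matching the last table on the slides
    }
}
```

The loop runs N - 1 times, which is where the O(N) running time comes from. Since only the last two entries are ever read, the array could be replaced by two variables to bring the memory down to O(1).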
  • 11. Longest common subsequence • Find the longest subsequence common to two or more sequences • String1: “AGCAT” • String2: “GAC” • Common subsequences: “A”, “C”, “G”, “AC”, “GA” • LCS: “AC” or “GA” • Use a table to find the LCS: • First column: string1 (“AGCAT”) • First row: string2 (“GAC”) • Table[i, j]: LCS of string1.substring(0, i) and string2.substring(0, j)
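
A sketch of the table-filling approach for this example is given below. It is a simplification: each cell stores only the length of the LCS of the two prefixes rather than the subsequences themselves, and the class and method names (`Lcs`, `length`) are hypothetical.

```java
// Hypothetical class name, chosen for this sketch only.
class Lcs {
    // table[i][j] = length of the LCS of a.substring(0, i) and b.substring(0, j).
    static int length(String a, String b) {
        int[][] table = new int[a.length() + 1][b.length() + 1];
        // Row 0 and column 0 stay 0: the LCS with an empty prefix is empty.
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    // Matching characters extend the LCS of the shorter prefixes.
                    table[i][j] = table[i - 1][j - 1] + 1;
                } else {
                    // Otherwise drop one character from either string.
                    table[i][j] = Math.max(table[i - 1][j], table[i][j - 1]);
                }
            }
        }
        return table[a.length()][b.length()];
    }

    public static void main(String[] args) {
        System.out.println(length("AGCAT", "GAC")); // prints 2 ("AC" or "GA")
    }
}
```

Recovering an actual subsequence would require walking back through the table from the bottom-right cell, which is omitted here.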
  • 16. Wildcard matching • Linux command line: user@bash: ls b* (output: barry.txt blan.txt bob.txt) • Complicated example: string = "adcab", pattern = "*a*b" • DP solution: • Definition: table[i][j] is true if the first i characters of string match the first j characters of pattern • Base case: table[0][0] = true; first row: table[0][i + 1] = table[0][i] if pattern[i] equals * • Induction rule: (1) if string[i] equals pattern[j] or pattern[j] equals ?: table[i + 1][j + 1] = table[i][j] (2) if pattern[j] equals *: table[i + 1][j + 1] = table[i + 1][j] or table[i][j + 1] • Table (rows: string, columns: pattern), first row filled:
         -  *  a  *  b
      -  T  T  F  F  F
      a
      d
      c
      a
      b
  • 17. Wildcard matching • Same example, definition, base case, and induction rule • Table filled partway through the second "a" row:
         -  *  a  *  b
      -  T  T  F  F  F
      a  F  T  T  T  F
      d  F  T  F  T  F
      c  F  T  F  T  F
      a  F  T  T
      b
  • 18. Wildcard matching • Same example, definition, base case, and induction rule • Table filled up to the last cell:
         -  *  a  *  b
      -  T  T  F  F  F
      a  F  T  T  T  F
      d  F  T  F  T  F
      c  F  T  F  T  F
      a  F  T  T  T  F
      b  F  T  F  T
  • 19. Wildcard matching • Same example, definition, base case, and induction rule • Completed table (the bottom-right cell is T, so "adcab" matches "*a*b"):
         -  *  a  *  b
      -  T  T  F  F  F
      a  F  T  T  T  F
      d  F  T  F  T  F
      c  F  T  F  T  F
      a  F  T  T  T  F
      b  F  T  F  T  T
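
The base case and the two induction rules from these slides translate fairly directly into code. The following is a minimal sketch under the assumption that `?` matches exactly one character and `*` matches any sequence, including the empty one; the class name `WildcardMatch` is hypothetical.

```java
// Hypothetical class name, chosen for this sketch only.
class WildcardMatch {
    // table[i][j] is true when the first i characters of s match the first j of p.
    static boolean matches(String s, String p) {
        boolean[][] table = new boolean[s.length() + 1][p.length() + 1];
        table[0][0] = true; // empty string matches empty pattern
        // First row: a leading run of '*' can match the empty string.
        for (int j = 0; j < p.length(); j++) {
            if (p.charAt(j) == '*') {
                table[0][j + 1] = table[0][j];
            }
        }
        for (int i = 0; i < s.length(); i++) {
            for (int j = 0; j < p.length(); j++) {
                char pc = p.charAt(j);
                if (pc == '*') {
                    // Rule (2): '*' matches nothing (left cell) or one more character (cell above).
                    table[i + 1][j + 1] = table[i + 1][j] || table[i][j + 1];
                } else if (pc == '?' || pc == s.charAt(i)) {
                    // Rule (1): a single-character match carries over the diagonal cell.
                    table[i + 1][j + 1] = table[i][j];
                }
                // Any other case leaves the cell at its default value, false.
            }
        }
        return table[s.length()][p.length()];
    }

    public static void main(String[] args) {
        System.out.println(matches("adcab", "*a*b")); // prints true, as in the completed table
    }
}
```

The table has (|string| + 1) x (|pattern| + 1) cells and each is filled in constant time, so the running time is proportional to the product of the two lengths.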
  • 20. Longest common subsequence and wildcard matching • DP proceeds from the initial condition to the end of the string: • From left to right in each row • From top to bottom in each column • State transition from table[i - 1][j - 1], table[i][j - 1], and table[i - 1][j] to table[i][j] • Each time: move forward by one step • The state at each cell is the global optimum for that step • A table (or diagram) is the best tool for simulating the process
  • 21. Matrix chain multiplication • Multiply two matrices: A (10 x 100) and B (100 x 5) • OUT[p][r] += A[p][q] * B[q][r] • Computation = 10 x 100 x 5 = 5000 scalar multiplications • Multiply three matrices: A1 (10 x 100), A2 (100 x 5), and A3 (5 x 50) • ((A1 A2) A3): 10 x 100 x 5 (A1 A2) + 10 x 5 x 50 = 7500 • (A1 (A2 A3)): 100 x 5 x 50 (A2 A3) + 10 x 100 x 50 = 75000 • ((A1 A2) A3) needs 10 times fewer scalar multiplications than (A1 (A2 A3))
  • 22. Matrix chain multiplication • How to optimize the chain multiplication of matrices (A1, A2, A3, …, An) • DP induction rule:
  • 23. Matrix chain multiplication: DP solution • Multiplication of six matrices: • State: • M[i, j]: the minimum number of scalar multiplications needed to multiply matrices i through j • S[i, j]: the outermost split point that achieves M[i, j]
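
For reference, the usual textbook form of the induction rule can be written with the M and S tables defined above. The dimension symbols are an assumption introduced here: matrix A_i is taken to have p_{i-1} rows and p_i columns.

```latex
M[i, i] = 0
\qquad
M[i, j] = \min_{i \le k < j} \bigl( M[i, k] + M[k+1, j] + p_{i-1}\, p_k\, p_j \bigr)
\qquad
S[i, j] = \operatorname*{arg\,min}_{i \le k < j} \bigl( M[i, k] + M[k+1, j] + p_{i-1}\, p_k\, p_j \bigr)
```

With the three-matrix example from slide 21, p = (10, 100, 5, 50), splitting at k = 2 gives M[1, 2] + M[3, 3] + 10 x 5 x 50 = 5000 + 0 + 2500 = 7500, which agrees with the cost of ((A1 A2) A3).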
  • 24. Matrix chain multiplication: DP solution • Six matrices multiplication:
  • 30. Matrix chain multiplication: DP solution • Six matrices multiplication: (A1 (A2 A3)) ((A4 A5) A6)
  • 31. Matrix chain multiplication: DP solution • The state is hard to define: • M[i, j] • S[i, j] • The state transition is complicated: • Filling the table by row or by column does not work • Move from previous states to the current state in order of chain length (induction rule)
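
The fill order described above, by chain length rather than by row or column, can be sketched as follows. This is an illustrative implementation of the standard matrix-chain recurrence, not code from the talk; the class name `MatrixChain` and the `dims` array convention (matrix A_i is `dims[i - 1]` x `dims[i]`) are assumptions of this sketch.

```java
// Hypothetical class name, chosen for this sketch only.
class MatrixChain {
    private final long[][] m; // m[i][j]: min scalar multiplications for A_i..A_j
    private final int[][] s;  // s[i][j]: outermost split point achieving m[i][j]
    private final int n;      // number of matrices

    // dims has n + 1 entries; matrix A_i (1-based) is dims[i - 1] x dims[i].
    MatrixChain(int[] dims) {
        if (dims.length < 2) {
            throw new IllegalArgumentException("need at least one matrix");
        }
        n = dims.length - 1;
        m = new long[n + 1][n + 1];
        s = new int[n + 1][n + 1];
        // Chains of length 1 cost 0; longer chains are built from shorter ones.
        for (int len = 2; len <= n; len++) {
            for (int i = 1; i + len - 1 <= n; i++) {
                int j = i + len - 1;
                m[i][j] = Long.MAX_VALUE;
                for (int k = i; k < j; k++) {
                    long cost = m[i][k] + m[k + 1][j]
                            + (long) dims[i - 1] * dims[k] * dims[j];
                    if (cost < m[i][j]) {
                        m[i][j] = cost;
                        s[i][j] = k;
                    }
                }
            }
        }
    }

    long minCost() {
        return m[1][n];
    }

    // Rebuilds the parenthesization by following the split points.
    String order() {
        return build(1, n);
    }

    private String build(int i, int j) {
        if (i == j) {
            return "A" + i;
        }
        return "(" + build(i, s[i][j]) + " " + build(s[i][j] + 1, j) + ")";
    }

    public static void main(String[] args) {
        MatrixChain chain = new MatrixChain(new int[] {10, 100, 5, 50});
        System.out.println(chain.minCost()); // prints 7500
        System.out.println(chain.order());   // prints ((A1 A2) A3)
    }
}
```

The example in `main` reuses the three matrices from slide 21. The three nested loops give O(n^3) time and the two tables O(n^2) space.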
  • 32. Framework of dynamic programming • Three key components of a dynamic programming algorithm: • Definition of state • Initial condition (base case) • Induction rule (state transition) • The induction rule is the difficult part to find • Use a 1D/2D table to support the thinking process
  • 33. What is part of speech tagging? • Identify parts of speech (syntactic categories): This/DET is/VB a/DET simple/ADJ sentence/NOUN • POS tagging is a first step towards syntactic analysis (semantic analysis) • Faster than full parsing • Text classification and word disambiguation • How to decide the correct label: • The word to be labeled: chair is probably a noun • Labels of surrounding words: if the preceding word is a modal verb (e.g., will), then this word is more likely to be a verb • Hidden Markov models can be used to work on this problem
  • 34. Why is POS tagging hard? • Ambiguity: • glass of water/NOUN vs. water/VERB the plants • lie/VERB down vs. tell a lie/NOUN • wind/VERB down vs. a mighty wind/NOUN (homographs) • How about "time flies like an arrow"? • Sparse data: • Words we haven’t seen before • Word-tag pairs we haven’t seen before
  • 35. Example transition probabilities • Probabilities estimated from tagged WSJ corpus: • Proper nouns (NNP) often begin sentences: P(NNP|<s>) = 0.28 • Modal verbs (MD) are nearly always followed by bare verbs (VB). • Adjectives (JJ) are often followed by nouns (NN).
  • 36. Example output probabilities • Probabilities estimated from tagged WSJ corpus: • 0.0032% of proper nouns are Janet: P(Janet|NNP) = 0.000032 • About half of determiners (DT) are the. • the can also be a proper noun.
  • 37. Hidden Markov Chain • A set of states (tags) • An output alphabet (words) • Initial state (beginning of sentence) • State transition probabilities ( P(ti|ti-1) ) • Symbol emission probabilities ( P(wi|ti) )
  • 38. Hidden Markov Chain • Model the tagging process: • Sentence: W = (w1, w2, …, wn) • Tags: T = (t1, t2, …, tn) • Joint probability: $P(W, T) = \left[ \prod_{i=1}^{n} P(t_i \mid t_{i-1})\, P(w_i \mid t_i) \right] P(\text{</s>} \mid t_n)$, with $t_0 = \text{<s>}$ • Example: • This/DET is/VB a/DET simple/JJ sentence/NN • Add begin (<s>) and end-of-sentence (</s>) markers: P(W, T) = P(DET|<s>) P(VB|DET) P(DET|VB) P(JJ|DET) P(NN|JJ) P(</s>|NN) x P(This|DET) P(is|VB) P(a|DET) P(simple|JJ) P(sentence|NN)
  • 39. Computation estimation of POS • Suppose we have C possible tags for each of the n words in the sentence • There are C^n possible tag sequences: the number grows exponentially in the length n • Viterbi algorithm: use dynamic programming to solve it
  • 40. Viterbi algorithm: • Target: argmax_T P(T|W) • Intuition: the best path of length i ending in state t must include the best path of length i-1 to the previous state • Use a table to store the partial results: • T x N table, v(t, i) is the probability of the best state sequence for w1 … wi ending in state t • Fill in columns from left to right; the max is over each possible previous state t’: v(t, i) = max_t’ { v(t’, i – 1) P(t|t’) P(wi|t) }
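
The recurrence above can be sketched as a small Java routine. Everything concrete in it is an assumption made for illustration: tags and words are represented by integer indices, the parameter names (`start`, `trans`, `emit`, `end`) are hypothetical, and the two-tag, two-word probabilities in `main` are made-up toy numbers, not estimates from any corpus.

```java
import java.util.Arrays;

// Hypothetical class name, chosen for this sketch only.
class Viterbi {
    // start[t]   = P(t | <s>)        trans[a][b] = P(b | a)
    // emit[t][w] = P(w | t)          end[t]      = P(</s> | t)
    // Returns the most probable tag index for each word index in words.
    static int[] decode(double[] start, double[][] trans, double[][] emit,
                        double[] end, int[] words) {
        int numTags = start.length;
        int n = words.length;
        if (n == 0) {
            return new int[0];
        }
        double[][] v = new double[numTags][n]; // v[t][i]: best path probability ending in t at i
        int[][] back = new int[numTags][n];    // back[t][i]: previous tag on that best path
        for (int t = 0; t < numTags; t++) {
            v[t][0] = start[t] * emit[t][words[0]];
        }
        // Fill columns left to right, maximizing over the previous tag.
        for (int i = 1; i < n; i++) {
            for (int t = 0; t < numTags; t++) {
                int bestPrev = 0;
                double bestScore = -1.0;
                for (int prev = 0; prev < numTags; prev++) {
                    double score = v[prev][i - 1] * trans[prev][t];
                    if (score > bestScore) {
                        bestScore = score;
                        bestPrev = prev;
                    }
                }
                v[t][i] = bestScore * emit[t][words[i]];
                back[t][i] = bestPrev;
            }
        }
        // Pick the best final tag, including the end-of-sentence transition.
        int last = 0;
        double bestFinal = -1.0;
        for (int t = 0; t < numTags; t++) {
            double score = v[t][n - 1] * end[t];
            if (score > bestFinal) {
                bestFinal = score;
                last = t;
            }
        }
        // Follow the back-pointers to recover the tag sequence.
        int[] tags = new int[n];
        tags[n - 1] = last;
        for (int i = n - 1; i > 0; i--) {
            tags[i - 1] = back[tags[i]][i];
        }
        return tags;
    }

    public static void main(String[] args) {
        // Toy model (made-up numbers): tags 0 = DET, 1 = NOUN; words 0 = "the", 1 = "doctor".
        double[] start = {0.8, 0.2};
        double[][] trans = {{0.1, 0.9}, {0.5, 0.5}};
        double[][] emit = {{0.9, 0.1}, {0.1, 0.9}};
        double[] end = {0.1, 0.9};
        int[] tags = decode(start, trans, emit, end, new int[] {0, 1});
        System.out.println(Arrays.toString(tags)); // prints [0, 1], i.e. DET NOUN
    }
}
```

The table has T x N cells and each takes O(T) work, so decoding costs O(T^2 N) instead of enumerating all T^N tag sequences. On long sentences the products of probabilities can underflow, so practical implementations commonly work with sums of log probabilities instead.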
  • 42. Viterbi algorithm: case study • W = the doctor is in.
  • 50. Dynamic programming: take-home message • Why it is fast: memory is used to store partial results • DP algorithm components: state definition, initial condition, and induction rule • Solve DP problems with a table
  • 51. Top ten DP problems • Longest common subsequence • Shortest common supersequence • Longest increasing subsequence • Edit distance • Matrix chain multiplication • 0-1 knapsack problem • Partition problem • Rod cutting • Coin change problem • Word break problem
  • 52. Reference • http://people.cs.georgetown.edu/nschneid/cosc572/f16/12_viterbi_slides.pdf • https://en.wikipedia.org/wiki/Dynamic_programming • https://medium.com/@codingfreak/top-10-dynamic-programming-problems-5da486eeb360 • https://leetcode.com/problems/wildcard-matching/description/ • https://en.wikipedia.org/wiki/Longest_common_subsequence_problem