Longest Common Subsequence (LCS)
An Application of the Dynamic Programming Approach
Tania Sahito
Subsequence
• X is a subsequence of Y if X can be obtained by dropping elements of Y
without changing the order of the elements that remain
• Note: string and sequence are used interchangeably because
“string is also a sequence of characters”.
• Example
• “ricm” is a subsequence of “Curriculum”
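The definition can be checked mechanically. The following is a minimal sketch; the function name `is_subsequence` is illustrative and not part of the slides. It scans Y once from left to right and consumes characters until each element of X is found in order.

```python
def is_subsequence(x: str, y: str) -> bool:
    """Return True if x can be obtained from y by dropping characters."""
    remaining = iter(y)
    # `ch in remaining` advances the iterator until it finds ch,
    # so the matches are forced to occur in left-to-right order.
    return all(ch in remaining for ch in x)


print(is_subsequence("ricm", "Curriculum"))  # True
print(is_subsequence("mcir", "Curriculum"))  # False: order is not preserved
```

The check is case-sensitive and runs in time linear in the length of Y.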
• Common subsequence
• Given two sequences X and Y, we say Z is a common subsequence of X and Y if
Z is a subsequence of both X and Y.
• Longest Common Subsequence
• A common subsequence of maximal length among all the common
subsequences of X and Y (there may be more than one LCS)
• Remember: the length of LCS(A, B) measures the similarity between A and B
• The greater Length(LCS(A, B)), the more similar A and B are
Motivation for LCS
• Bioinformatics: LCS is used in molecular biology to compare DNA
sequences
• Unix file comparison (the program named “diff”)
• Screen redisplay (in the Emacs text editor)
• Matching misspelt words
We already know…
• Dynamic Programming involves the following steps
• Characterize the optimal substructure
• Recursively define the value of an optimal solution
• Compute the value bottom up
• If needed, construct an optimal solution
Step#01: Characterizing the LCS
LCS by Brute Force Algorithm Design
• Two sequences
• X = x1, x2, ..., xm
• Y = y1, y2, y3, ..., yn
Steps:
Generate all possible subsequences of X
Check which are also subsequences of Y
Return the longest
Time complexity: O(n · 2^m)
• Number of subsequences of X = 2^m
• Cost of each comparison = O(n)
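The brute-force steps above can be sketched as follows. This is an illustration only; the names `lcs_brute_force` and `is_subsequence` are assumptions, not taken from the slides. Candidates are tried from longest to shortest, so the first hit is a longest common subsequence.

```python
from itertools import combinations


def is_subsequence(x: str, y: str) -> bool:
    """Return True if x can be obtained from y by dropping characters."""
    remaining = iter(y)
    return all(ch in remaining for ch in x)


def lcs_brute_force(x: str, y: str) -> str:
    """Enumerate subsequences of x (up to 2^m of them) and test each against y."""
    for length in range(len(x), 0, -1):
        # combinations() yields index tuples in increasing order,
        # so every candidate keeps the original order of x.
        for idx in combinations(range(len(x)), length):
            candidate = "".join(x[i] for i in idx)
            if is_subsequence(candidate, y):  # O(n) comparison
                return candidate
    return ""  # no common element at all


print(lcs_brute_force("spanking", "amputation"))  # pain
```

This is only practical for short inputs, since the number of candidates doubles with every extra element of the first sequence.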
Step#01: Characterize the LCS(DP)
• The following cases may arise
Step#02: Recursive Relation
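The formula for this step is not present in the text. For reference, the standard textbook recurrence is given below, under the assumption that c[i, j] denotes the length of an LCS of the prefixes x1..xi and y1..yj (this notation is assumed here, chosen to match the table C mentioned later in the slides).

```latex
c[i,j] =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0,\\
c[i-1,\,j-1] + 1 & \text{if } i, j > 0 \text{ and } x_i = y_j,\\
\max\bigl(c[i-1,\,j],\; c[i,\,j-1]\bigr) & \text{if } i, j > 0 \text{ and } x_i \neq y_j.
\end{cases}
```

The second case extends an LCS of the shorter prefixes by the matching element; the third drops the last element of one sequence or the other and keeps the better result.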
Step#03: Computing the length of LCS
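A bottom-up computation of the length can be sketched as below. It assumes the usual (m+1) × (n+1) table of prefix LCS lengths; the names `lcs_table` and `lcs_length` are illustrative, not taken from the slides.

```python
def lcs_table(x: str, y: str) -> list[list[int]]:
    """c[i][j] = length of an LCS of x[:i] and y[:j]."""
    m, n = len(x), len(y)
    # Row 0 and column 0 stay 0: an empty prefix has an empty LCS.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c


def lcs_length(x: str, y: str) -> int:
    return lcs_table(x, y)[len(x)][len(y)]


print(lcs_length("ABCBDAB", "BDCABA"))  # 4
```

Each cell is filled once from at most three neighbours, so the sketch takes O(mn) time and O(mn) space.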
Step#04: Constructing an LCS
Example: What do "spanking" and "amputation" have in common?
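One way to answer the question is to fill the length table and then walk back from its bottom-right corner, collecting matched characters. The sketch below is an assumption about how Step#04 might be implemented, not the slides' own code; the name `lcs` is illustrative. It backtracks through the length table alone, without a separate direction table.

```python
def lcs(x: str, y: str) -> str:
    """Return one longest common subsequence of x and y."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])

    # Walk back from c[m][n]; characters are collected in reverse order.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i -= 1
            j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


print(lcs("spanking", "amputation"))  # pain
```

When several LCSs exist, the tie-breaking rule in the `elif` decides which one is returned; for this pair the answer "pain" is the only LCS.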
Modifications in LCS Algorithm
Some Questions:
• Do we need the table B?
• Do we need the whole table C (for finding only the length of the LCS)?
• As we can see
• Use a linked list instead of a table
• Time complexity = O(MN · max(M, N))
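On the question of whether the whole table C is needed for the length alone: each row depends only on the row before it, so two rows suffice. The sketch below shows that idea; it is a different technique from the linked-list variant mentioned above, and the name `lcs_length_two_rows` is illustrative.

```python
def lcs_length_two_rows(x: str, y: str) -> int:
    """LCS length using only the previous and the current table row."""
    if len(y) > len(x):
        x, y = y, x  # keep the shorter sequence along the row
    prev = [0] * (len(y) + 1)
    for xi in x:
        curr = [0]  # column 0: empty prefix of y
        for j, yj in enumerate(y, start=1):
            if xi == yj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


print(lcs_length_two_rows("spanking", "amputation"))  # 4
```

Time stays O(mn) while space drops to O(min(m, n)). The trade-off is that the full table is no longer available for reconstructing the subsequence itself.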