Longest Common Subsequence

Finding the longest subsequence that occurs in both of two given strings.

We are given two strings: string X of length n, and string Y of length m. Our goal is to produce their longest common subsequence: the longest sequence of characters that appear left-to-right (but not necessarily in a contiguous block) in both strings.

Applications:

  • Data mining
  • Modeling market traders
  • Plagiarism detection

For example, consider:

X=BACDB

Y=BDCB
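To make the definition concrete, here is a minimal Python sketch that checks whether one string is a subsequence of another. The helper name is_subsequence is our own choice for illustration and does not come from the article.

```python
def is_subsequence(candidate: str, text: str) -> bool:
    """Return True if candidate's characters appear in text in order (gaps allowed)."""
    remaining = iter(text)
    # "ch in remaining" consumes the iterator up to the first match,
    # so each later character must be found after the previous one.
    return all(ch in remaining for ch in candidate)


X, Y = "BACDB", "BDCB"
for candidate in ("BCB", "BDB", "BAB"):
    print(candidate, is_subsequence(candidate, X), is_subsequence(candidate, Y))
# BCB True True
# BDB True True
# BAB True False
```

A string is a common subsequence only when both checks succeed, so BAB is ruled out because Y contains no A.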

Let’s now solve the LCS problem using Dynamic Programming. As subproblems we will look at the LCS of a prefix of X and a prefix of Y, running over all pairs of prefixes. For simplicity, let’s worry first about finding the length of the LCS and then we can modify the algorithm to produce the actual sequence itself.

Formula:

a[i,j] =

0, if i = 0 or j = 0

1 + a[i-1,j-1], if i, j > 0 and Xi = Yj

max(a[i-1,j], a[i,j-1]), if i, j > 0 and Xi ≠ Yj

Case 1: if either prefix is empty (i = 0 or j = 0, that is, the top row and the left column of the table), the current cell is zero.

Case 2: if the two characters are equal, go to the diagonal (upper-left) cell and add one to the value stored there.

Case 3: if the two characters are not equal, compare the value of the cell above with the value of the cell to the left and store the larger of the two.
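The three cases can be sketched in Python as follows. This is a minimal illustration, assuming 0-based Python strings (so the i-th character of X is x[i - 1]); the function name lcs_length_table is hypothetical.

```python
def lcs_length_table(x: str, y: str) -> list[list[int]]:
    """Build the (len(x)+1) x (len(y)+1) table of LCS lengths for all prefix pairs."""
    n, m = len(x), len(y)
    # Case 1: row 0 and column 0 stay zero (one of the prefixes is empty).
    a = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                # Case 2: characters match, extend the diagonal value by one.
                a[i][j] = 1 + a[i - 1][j - 1]
            else:
                # Case 3: take the larger of the cell above and the cell to the left.
                a[i][j] = max(a[i - 1][j], a[i][j - 1])
    return a


table = lcs_length_table("BACDB", "BDCB")
for row in table:
    print(row)
print("LCS length:", table[-1][-1])  # LCS length: 3
```

The bottom-right cell holds the length of the LCS of the two full strings. The table takes O(n·m) time and space to fill.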

How to calculate a longest common subsequence:

In this example, we look for the longest common subsequence of the two strings X = BACDB and Y = BDCB.

Following the algorithm LCS-Length-Table-Formulation (the recurrence stated above), we have calculated table C, which holds the lengths a[i,j] (shown on the left-hand side), and table B, which records which neighbouring cell each value came from (shown on the right-hand side).

In table B, instead of ‘D’, ‘L’ and ‘U’, we use a diagonal arrow, a left arrow and an up arrow, respectively. After generating table B, the LCS is determined by the function LCS-Print. The result is BCB. (BDB is another common subsequence of the same length; which one is reported depends on how ties are broken.)
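The two tables and the printing step might look like the sketch below. It is not the article's original LCS-Length-Table-Formulation or LCS-Print code, which is not reproduced here; the names lcs_tables and lcs_print are hypothetical, and the tie-breaking rule (prefer "U" over "L" when the two neighbours are equal) is an assumption borrowed from the usual textbook presentation.

```python
def lcs_tables(x: str, y: str):
    """Return (c, b): c holds LCS lengths, b holds 'D', 'U' or 'L' per cell."""
    n, m = len(x), len(y)
    c = [[0] * (m + 1) for _ in range(n + 1)]
    b = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "D"  # diagonal arrow
            elif c[i - 1][j] >= c[i][j - 1]:  # assumed tie-break: prefer up
                c[i][j] = c[i - 1][j]
                b[i][j] = "U"  # up arrow
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "L"  # left arrow
    return c, b


def lcs_print(b, x: str, i: int, j: int) -> str:
    """Follow the arrows in b from cell (i, j) and return the LCS."""
    out = []
    while i > 0 and j > 0:
        if b[i][j] == "D":
            out.append(x[i - 1])  # a diagonal step contributes one character
            i, j = i - 1, j - 1
        elif b[i][j] == "U":
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


X, Y = "BACDB", "BDCB"
c, b = lcs_tables(X, Y)
print(lcs_print(b, X, len(X), len(Y)))  # BCB
```

With this tie-break the sketch reproduces the article's result BCB; preferring "L" instead would yield BDB.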

Figure: process of finding a longest common subsequence

How can we now find the sequence? We walk backwards through the matrix, starting at the lower-right corner. If either the cell directly above or the cell directly to the left contains a value equal to the value in the current cell, move to that cell (if both do, choose either one). If both of those cells have values strictly less than the value in the current cell, move diagonally up-left (this corresponds to applying Case 2) and output the associated character. This outputs the characters of the LCS in reverse order.
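The backward walk can be sketched using only the table of lengths, with no separate arrow table. This is an illustrative sketch: the function name lcs_by_walking_back is hypothetical, and checking the cell above before the cell to the left is an arbitrary choice among the permitted moves.

```python
def lcs_by_walking_back(x: str, y: str) -> str:
    """Fill the length table, then walk back from the lower-right corner."""
    n, m = len(x), len(y)
    a = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                a[i][j] = 1 + a[i - 1][j - 1]
            else:
                a[i][j] = max(a[i - 1][j], a[i][j - 1])

    i, j = n, m
    out = []  # characters are collected in reverse order
    while i > 0 and j > 0:
        if a[i - 1][j] == a[i][j]:
            i -= 1  # cell above has the same value: move up
        elif a[i][j - 1] == a[i][j]:
            j -= 1  # cell to the left has the same value: move left
        else:
            # Both neighbours are strictly smaller, so Case 2 applied here.
            out.append(x[i - 1])
            i, j = i - 1, j - 1
    return "".join(reversed(out))


print(lcs_by_walking_back("BACDB", "BDCB"))  # BCB
```

Reversing the collected characters at the end restores left-to-right order. Checking the left cell first would be equally valid and gives BDB for this input.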
