//
you're reading...
Uncategorized

Dynamic Programming Code in Python for Longest Palindromic Subsequence

In this post we will develop dynamic programming code in python for processing strings to compute the Longest Palindromic Subsequence of a string and the related Snip Number of a string.

A palindromic subsequence is substring of a given string, obtained by deleting characters, that is a palindrome. We find the longest such subsequence using dynamic programming over intervals within the string. Let us define the value Opt[i,j] for each interval [i,j] as the length of the longest palindromic subsequence within the corresponding substring interval.

def longest_pal(str):
   n=len(str)

   #initialize Opt Table
   Opt=[[0 for i in range(n)] for j in range(n) ]

    #Opt of single char is 1
    for i in range(n):
       Opt[i][i] = 1

    #Opt for adjacent chars is 2 if same, 1 otherwise
    for i in range(n-1):
       if str[i]==str[i+1]:
          Opt[i][i+1]=2
       else:  Opt[i][i+1]=1

# we now define sil as (s)substring (i)interval (l) length of the
# interval [i,j] --- sil=(j-i +1) and j = i+sil-1

# we compute Opt table entry for each sil length and
# starting index i

   for sil in range(2, n+1):
      for i in range(n-sil+1):
          j = i+sil-1
          if (str[i] == str[j] ):
             Opt[i][j] = Opt[i+1][j-1] + 2;
          else:
             Opt[i][j] = max(Opt[i][j-1], Opt[i+1][j])

      return Opt[0][n-1]

str1="a man two cats a crazy plan aibohphobia and a canal in panama"
print str1, "has longest palindrome subsequence length"
print longest_pal(str1)

The answer returned is 37,  the length of  longest palindrome subsequence.

We can add trace pointers to the table to reconstruct the optimal solution.

def longest_pal(str):
    n=len(str)

   #initialize Opt Table
    Opt=[[0 for i in range(n)] for j in range(n) ]
    ptr = [['?' for i in range(n)] for j in range(n) ]

    for i in range(n):
       Opt[i][i] = 1

    for i in range(n):
       ptr[i][i] = 'i'

# define sil as length of substring interval [i,j] sil=(j-i +1)
# compute Opt table entry for each starting index i and each sil

    for sil in range(2, n+1):
        for i in range(n-sil+1):
          j = i+sil-1
          if (str[i] == str[j] and sil == 2):
             Opt[i][j] = 2
             ptr[i][j] = 'i'
          elif (str[i] == str[j] ):
             Opt[i][j] = Opt[i+1][j-1] + 2
             ptr[i][j] = 'diag'
          else:
             Opt[i][j] = max(Opt[i][j-1], Opt[i+1][j])
             if Opt[i][j] == Opt [i][j-1]:
                 ptr[i][j] = "<-"
             else:
                 ptr[i][j] = "->"

    return Opt,ptr

str1="a man two cats a crazy plan aibohphobia and a canal in panama"
print str1, "has longest palindrome subsequence length"

opt,ptr=longest_pal(str1)

#Trace back and construct optimal palindrome subsequence
pal = ""
p = len(str1)-1
row = 0
while p >= row:
    if ptr[row][p]=='diag':
        pal = pal[:len(pal)/2] + str1[row] + str1[row] + pal[len(pal)/2 :]
        row = row + 1
        p = p-1
    elif ptr[row][p] == "->":
        row = row +1
    else:
        p = p -1
    if p == row:
        pal = pal[:len(pal)/2] + str1[row] + pal[len(pal)/2 :]

print pal

Output  #aman  aa a a aibohphobia a a aa nama#

Computing the Snip number of a string

Suppose you wanted to cut a string using the fewest number of snips, so that each remaining piece was itself a palindrome. The number of such cuts we will call the Snip Number of a string. That is the snip number is always equal to one less than the smallest number of palindromes  within a given string. Every string has snip number at most  n-1, and each palindrome has snip number 0.

def snip_number(str):
    n=len(str)

 #initialize Opt Table
 # Opt[i,j] = min number of snips in the substring str[i...j]

    Opt=[[0 for i in range(n)] for j in range(n) ]

 #Opt of single char is 0
    for i in range(n):
     Opt[i][i] = 0

 #Opt for adjacent chars is 1 if different, 0 otherwise
    for i in range(n-1):
     Opt[i][i+1]= 1 if str[i]!=str[i+1] else 0

# we now define sil as (s)substring (i)interval (l) length of the
# interval [i,j] --- sil=(j-i +1) and j = i+sil-1

# we compute Opt table entry for each sil length and
# starting index i

    for sil in range(3, n+1):
     for i in range(n-sil+1):
       j = i+sil-1
       if (str[i] == str[j] and Opt[i+1][j-1]==0):
         Opt[i][j] = 0
       else:
         snip= min( [(Opt[i][t]+ Opt[t+1][j] + 1 ) for t in range(i,j-1)])
         Opt[i][j] = snip

    return Opt[0][len(str)-1]
#end function snip_number()
mystr=[for i in range(4)]
mystr[0]="abc"
mystr[1]="ohiho"
mystr[2]="cabacdbabdc"
mystr[3]="amanaplanacanalpanama aibohphobia"

for i in range(4):
     print mystr[i], "has snip number:", snip_number(mystr[i])

# abc has snip number: 2
# ohiho has snip number: 0
# cabacdbabdc has snip number: 2
# amanaplanacanalpanama aibohphobia  has snip number: 1

Over all 2^20 binary strings of length-20, the expected snip number of the string is bit more than 3. The average snip_number of a binary string appears to grow linearly with the length of the string with a slope of about 0.15 as the results of the following experiment shows.


for j in range(1,21):
  count = 0
  for i in range (2**j,2**(j+1)):
    n = snip_number(str(bin(i))[3:])
    count += n

  print "over all length-",j, "strings the avg snip number is:", (count+0.0)/(2**j)

snip

Advertisements

Discussion

No comments yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: