In this post we will develop dynamic programming code in python for processing strings to compute the *Longest Palindromic Subsequence* of a string and the related *Snip Number* of a string.

A palindromic subsequence is substring of a given string, obtained by deleting characters, that is a palindrome. We find the longest such subsequence using dynamic programming over intervals within the string. Let us define the value Opt[i,j] for each interval [i,j] as the length of the longest palindromic subsequence within the corresponding substring interval.

def longest_pal(str): n=len(str) #initialize Opt Table Opt=[[0 for i in range(n)] for j in range(n) ] #Opt of single char is 1 for i in range(n): Opt[i][i] = 1 #Opt for adjacent chars is 2 if same, 1 otherwise for i in range(n-1): if str[i]==str[i+1]: Opt[i][i+1]=2 else: Opt[i][i+1]=1 # we now define sil as (s)substring (i)interval (l) length of the # interval [i,j] --- sil=(j-i +1) and j = i+sil-1 # we compute Opt table entry for each sil length and # starting index i for sil in range(2, n+1): for i in range(n-sil+1): j = i+sil-1 if (str[i] == str[j] ): Opt[i][j] = Opt[i+1][j-1] + 2; else: Opt[i][j] = max(Opt[i][j-1], Opt[i+1][j]) return Opt[0][n-1] str1="a man two cats a crazy plan aibohphobia and a canal in panama" print str1, "has longest palindrome subsequence length" print longest_pal(str1)

The answer returned is 37, the length of longest palindrome subsequence.

We can add trace pointers to the table to reconstruct the optimal solution.

def longest_pal(str): n=len(str) #initialize Opt Table Opt=[[0 for i in range(n)] for j in range(n) ] ptr = [['?' for i in range(n)] for j in range(n) ] for i in range(n): Opt[i][i] = 1 for i in range(n): ptr[i][i] = 'i' # define sil as length of substring interval [i,j] sil=(j-i +1) # compute Opt table entry for each starting index i and each sil for sil in range(2, n+1): for i in range(n-sil+1): j = i+sil-1 if (str[i] == str[j] and sil == 2): Opt[i][j] = 2 ptr[i][j] = 'i' elif (str[i] == str[j] ): Opt[i][j] = Opt[i+1][j-1] + 2 ptr[i][j] = 'diag' else: Opt[i][j] = max(Opt[i][j-1], Opt[i+1][j]) if Opt[i][j] == Opt [i][j-1]: ptr[i][j] = "<-" else: ptr[i][j] = "->" return Opt,ptr str1="a man two cats a crazy plan aibohphobia and a canal in panama" print str1, "has longest palindrome subsequence length" opt,ptr=longest_pal(str1) #Trace back and construct optimal palindrome subsequence pal = "" p = len(str1)-1 row = 0 while p >= row: if ptr[row][p]=='diag': pal = pal[:len(pal)/2] + str1[row] + str1[row] + pal[len(pal)/2 :] row = row + 1 p = p-1 elif ptr[row][p] == "->": row = row +1 else: p = p -1 if p == row: pal = pal[:len(pal)/2] + str1[row] + pal[len(pal)/2 :] print pal Output #aman aa a a aibohphobia a a aa nama#

**Computing the Snip number of a string**

Suppose you wanted to cut a string using the fewest number of snips, so that each remaining piece was itself a palindrome. The number of such cuts we will call the *Snip Number* of a string. That is the snip number is always equal to one less than the smallest number of palindromes within a given string. Every string has snip number at most n-1, and each palindrome has snip number 0.

def snip_number(str): n=len(str) #initialize Opt Table # Opt[i,j] = min number of snips in the substring str[i...j] Opt=[[0 for i in range(n)] for j in range(n) ] #Opt of single char is 0 for i in range(n): Opt[i][i] = 0 #Opt for adjacent chars is 1 if different, 0 otherwise for i in range(n-1): Opt[i][i+1]= 1 if str[i]!=str[i+1] else 0 # we now define sil as (s)substring (i)interval (l) length of the # interval [i,j] --- sil=(j-i +1) and j = i+sil-1 # we compute Opt table entry for each sil length and # starting index i for sil in range(3, n+1): for i in range(n-sil+1): j = i+sil-1 if (str[i] == str[j] and Opt[i+1][j-1]==0): Opt[i][j] = 0 else: snip= min( [(Opt[i][t]+ Opt[t+1][j] + 1 ) for t in range(i,j-1)]) Opt[i][j] = snip return Opt[0][len(str)-1] #end function snip_number() mystr=[for i in range(4)] mystr[0]="abc" mystr[1]="ohiho" mystr[2]="cabacdbabdc" mystr[3]="amanaplanacanalpanama aibohphobia" for i in range(4): print mystr[i], "has snip number:", snip_number(mystr[i]) # abc has snip number: 2 # ohiho has snip number: 0 # cabacdbabdc has snip number: 2 # amanaplanacanalpanama aibohphobia has snip number: 1

Over all 2^20 binary strings of length-20, the expected snip number of the string is bit more than 3. The average snip_number of a binary string appears to grow linearly with the length of the string with a slope of about 0.15 as the results of the following experiment shows.

for j in range(1,21): count = 0 for i in range (2**j,2**(j+1)): n = snip_number(str(bin(i))[3:]) count += n print "over all length-",j, "strings the avg snip number is:", (count+0.0)/(2**j)

## Discussion

## No comments yet.