Unit2_2: Dynamic Programming (DP)


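Contents

1. Longest Common Subsequence
- Analysis
- Filling the table
- Pseudocode
- Walkthrough
- Time complexity
2. Longest Common Substring
- Analysis
- Walkthrough
- Time complexity
Minimum Edit Distance
- Background
- Analysis
- State transition equation
- Filling the table
- Pseudocode
- Example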

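1. Longest Common Subsequence

Subsequence: a set of elements picked from the original sequence that keep their relative order; the elements need not be contiguous.
[Figure: two sequences X and Y and their longest common subsequence Z]
Z is a longest common subsequence of X and Y.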

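Analysis

Let $Z_k=(z_1,...,z_k)$ be a longest common subsequence (LCS) of $X[1..i]$ and $Y[1..j]$, and let $d_{i,j}$ denote its length $k$.

If $x_i = y_j$, then $z_k = x_i = y_j$, and $Z_{k-1}$ is an LCS of $X[1..i-1]$ and $Y[1..j-1]$.

If $x_i \neq y_j$, then the LCS cannot end with both $x_i$ and $y_j$, so at least one of them is not used as $z_k$. $Z_k$ is therefore either an LCS of $X[1..i-1]$ and $Y[1..j]$ or an LCS of $X[1..i]$ and $Y[1..j-1]$, and we keep whichever of the two is longer.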

This gives the recurrence, with $d_{i,0} = d_{0,j} = 0$ as base cases:

$d_{i,j}=\left\{ \begin{array}{ll} d_{i-1,j-1} + 1 & \text{if } x_i = y_j \\ \max(d_{i-1,j}, d_{i,j-1}) & \text{if } x_i \neq y_j \end{array} \right.$
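
The recurrence can be turned into code almost line for line. The sketch below is a minimal memoized version in Python; the function name `lcs_length` and the inner helper `d` are illustrative choices, not part of the original material, and the recursion is only suitable for short inputs because of Python's recursion limit.

```python
from functools import lru_cache


def lcs_length(x: str, y: str) -> int:
    """Length of a longest common subsequence, computed from the recurrence."""

    @lru_cache(maxsize=None)
    def d(i: int, j: int) -> int:
        # d(i, j) is the LCS length of the prefixes x[:i] and y[:j].
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:  # x_i = y_j (Python strings are 0-indexed)
            return d(i - 1, j - 1) + 1
        return max(d(i - 1, j), d(i, j - 1))

    return d(len(x), len(y))


print(lcs_length("ABCBDAB", "BDCABA"))  # 4
```

Memoization ensures each pair $(i, j)$ is evaluated once, so the work matches the table-based method described next.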

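Filling the table

Alongside $d$, we create another $m \times n$ matrix $p[i,j]$, for $1 \leq i \leq m$ and $1 \leq j \leq n$, that stores an arrow pointing to the entry used to compute $d_{i,j}$. These arrows let us reconstruct the elements of the LCS afterwards.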

Pseudocode

Longest-Common-Subsequence(X,Y)
//Initialization
for i ← 0 to m do
    d[i,0] ← 0
end
for j ← 0 to n do
    d[0,j] ← 0
end
//Dynamic Programming
for i ← 1 to m do
    for j ← 1 to n do
        if xi = yj then
            d[i,j] ← d[i-1,j-1]+1
            p[i,j] ← "LU"    //"LU" indicates left up arrow
        end
        else if d[i-1,j] >= d[i,j-1] then
            d[i,j] ← d[i-1,j]
            p[i,j] ← "U"    //"U" indicates up arrow
        end
        else
            d[i,j] ← d[i,j-1]
            p[i,j] ← "L"    //"L" indicates left arrow
        end
    end
end
return d,p

Print-LCS(p,X,i,j)
if i is equal to 0 or j is equal to 0 then
    return NULL;
end
if p[i,j] is equal to "LU" then
    Print-LCS(p,X,i-1,j-1);
    print xi;
end
else if p[i,j] is equal to "U" then
    Print-LCS(p,X,i-1,j);
end
else
    Print-LCS(p,X,i,j-1);
end
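
For readers who want to run the two procedures, here is a rough Python sketch of them. The names `lcs_table` and `read_lcs` are hypothetical, and the traceback is written as a loop rather than the recursive Print-LCS, which changes the style but not the path followed through $p$.

```python
def lcs_table(x: str, y: str):
    """Fill the length table d and the arrow table p for LCS."""
    m, n = len(x), len(y)
    # Row 0 and column 0 stay at 0: an empty prefix has an empty LCS.
    d = [[0] * (n + 1) for _ in range(m + 1)]
    p = [[""] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                d[i][j] = d[i - 1][j - 1] + 1
                p[i][j] = "LU"  # left-up arrow
            elif d[i - 1][j] >= d[i][j - 1]:
                d[i][j] = d[i - 1][j]
                p[i][j] = "U"   # up arrow
            else:
                d[i][j] = d[i][j - 1]
                p[i][j] = "L"   # left arrow
    return d, p


def read_lcs(p, x: str, i: int, j: int) -> str:
    """Follow the arrows from p[i][j] back to the border and collect the LCS."""
    out = []
    while i > 0 and j > 0:
        if p[i][j] == "LU":
            out.append(x[i - 1])
            i, j = i - 1, j - 1
        elif p[i][j] == "U":
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


d, p = lcs_table("ABCBDAB", "BDCABA")
print(d[7][6], read_lcs(p, "ABCBDAB", 7, 6))
```

Several different subsequences can share the maximum length; which one is returned depends on the tie-break between the "U" and "L" branches.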

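Walkthrough

Fill $d$ and $p$ row by row, then follow the arrows in $p$ back from $p[m,n]$ to read off the longest common subsequence.

Time complexity

The two nested loops give a running time of $T(n) = O(nm)$.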

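2. Longest Common Substring

Substring: a set of elements picked from the original sequence that keep their relative order and, unlike a subsequence, must be contiguous.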

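Analysis

This problem differs from the LCS problem above: we cannot simply let $Z_k=(z_1,...,z_k)$ be the longest common substring of $X[1..i]$ and $Y[1..j]$ with $d_{i,j}$ as its length. Even when $x_i$ and $y_j$ are equal, $d_{i,j}$ does not have to equal $d_{i-1,j-1}+1$, because the longest common substring of the two prefixes may sit in the middle rather than at their ends, so the recurrence breaks down.

When a DP formulation gets stuck like this, add a constraint to the state. Here it is enough to define $Z_k=(z_1,...,z_k)$ as the longest common substring of $X[1..i]$ and $Y[1..j]$ that ends at both $x_i$ and $y_j$, with $d_{i,j}$ as its length. With this definition the recurrence goes through.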
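
A small counterexample makes the failure of the unconstrained definition concrete. The brute-force helper below (a hypothetical name, used only for checking) computes the longest common substring length directly. With the made-up inputs "abcd" and "abed", the last characters match, yet the answer for the full strings is not one more than the answer for the prefixes.

```python
def longest_common_substring_len(x: str, y: str) -> int:
    """Brute force: try every substring of x and test whether it occurs in y."""
    best = 0
    for start in range(len(x)):
        for end in range(start + 1, len(x) + 1):
            if x[start:end] in y:
                best = max(best, end - start)
    return best


x, y = "abcd", "abed"
prefix_best = longest_common_substring_len(x[:3], y[:3])  # "abc" vs "abe"
full_best = longest_common_substring_len(x, y)
print(prefix_best, full_best)  # 2 2
```

Both values are 2 because the best common substring "ab" lies at the start, so the matching final "d" cannot extend it.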

If $x_i = y_j$, then $z_k = x_i = y_j$, and $Z_{k-1}$ is the longest common substring ending at $x_{i-1}$ and $y_{j-1}$.

If $x_i \neq y_j$, then no common substring ends at both $x_i$ and $y_j$, so $d_{i,j} = 0$.

$d_{i,j}=\left\{ \begin{array}{ll} d_{i-1,j-1} + 1 & \text{if } x_i = y_j \\ 0 & \text{if } x_i \neq y_j \end{array} \right.$

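Finally, the longest common substring is found by taking the maximum over all possible end positions $i$ and $j$:
$LCS(X,Y)=\max_{i,j} d_{i,j}$
The table is filled in the same way as before.
Tracking the position is much simpler here, though. Because a substring is contiguous, it is enough to record its end position and its length: $l_{max}$ and $p_{max}$ store the maximum length of a common substring and its end position $i$ (or $j$). The elements can then be reconstructed from $X$ (or $Y$).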


Longest-Common-Substring(X,Y)
//Initialization
lmax ← 0
pmax ← 0
for i ← 0 to m do
    d[i,0] ← 0
end
for j ← 0 to n do
    d[0,j] ← 0
end
//Dynamic Programming
for i ← 1 to m do
    for j ← 1 to n do
        if xi != yj then
            d[i,j] ← 0
        end
        else
            d[i,j] ← d[i-1,j-1]+1
            if d[i,j] > lmax then
                lmax ← d[i,j]
                pmax ← i
            end
        end
    end
end
return lmax,pmax

Print-LCS(X,lmax,pmax)
if lmax is equal to 0 then
    return NULL;
end
for i ← (pmax-lmax+1) to pmax do
    print xi
end
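
The same procedure can be sketched in Python as follows. The function name `longest_common_substring` is an illustrative choice, and the example strings are made up for the demonstration. `p_max` is kept 1-based as in the pseudocode, so the slice `x[p_max - l_max:p_max]` covers positions `pmax-lmax+1` through `pmax`.

```python
def longest_common_substring(x: str, y: str) -> str:
    """Return one longest common substring of x and y (empty if none)."""
    m, n = len(x), len(y)
    # d[i][j]: length of the longest common substring ending at x_i and y_j.
    d = [[0] * (n + 1) for _ in range(m + 1)]
    l_max = 0  # best length seen so far
    p_max = 0  # 1-based end position in x of that best substring
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                d[i][j] = d[i - 1][j - 1] + 1
                if d[i][j] > l_max:
                    l_max, p_max = d[i][j], i
            # otherwise d[i][j] stays 0
    return x[p_max - l_max:p_max]


print(longest_common_substring("abcdxyz", "xyzabcd"))  # abcd
```

Only the previous row of `d` is ever read, so the full table is kept here purely to mirror the pseudocode.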

Walkthrough

[Figure: worked example of filling the table]

Time complexity

The two nested loops give a running time of $T(n) = O(nm)$.

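Minimum Edit Distance

Background

When a word such as "algorithm" is mistyped, a system can automatically search for the closest candidate word to correct it. Edit distance is also used in machine translation, information extraction, and speech recognition.

Given two arrays $X=(x_1,x_2,...,x_m)$ and $Y=(y_1,y_2,...,y_n)$, the edit distance is the minimum number of edit operations needed to transform $X$ into $Y$.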

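Analysis

There are three edit operations:
    insert a character
    delete a character
    substitute one character for another
In practice each operation may carry its own cost; to keep the problem simple, we set the cost of every operation to 1.

Define $D[i,j]$ as the minimum edit distance between the prefixes $X[1..i]$ and $Y[1..j]$.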

There are three ways to turn $X[1..i]$ into $Y[1..j]$:
   Turn $X[1..i-1]$ into $Y[1..j]$ and delete $X[i]$
       $MED(cxy \to dab) = MED(cx \to dab) + 1$
   Turn $X[1..i]$ into $Y[1..j-1]$ and insert $Y[j]$
       $MED(cxy \to dab) = MED(cxy \to da) + 1$
   Turn $X[1..i-1]$ into $Y[1..j-1]$ and, if $X[i] \neq Y[j]$, substitute $Y[j]$ for $X[i]$ (when $X[i] = Y[j]$ no substitution is needed and this step costs nothing)
       $MED(cxy \to dab) = MED(cx \to da) + 1$
Each equation shows the cost of one option; $D[i,j]$ is the smallest of the three.

State transition equation

$D[i,j]=min\left\{ \begin{array}{ll} D[i-1,j] +1\\ D[i,j-1] +1\\ D[i-1,j-1]+\left\{ \begin{array}{ll} 0 & if \space X[i]=Y[j]\\ 1 & if \space X[i] \neq Y[j] \nonumber \end{array} \right.\nonumber \end{array} \right.$
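
As a quick check of the equation, the sketch below evaluates it top-down with memoization. The function name `edit_distance` is illustrative. The base cases $D[i,0]=i$ and $D[0,j]=j$ are assumed here: they correspond to deleting or inserting every remaining character. The recursion is meant for short inputs only.

```python
from functools import lru_cache


def edit_distance(x: str, y: str) -> int:
    """Minimum number of insertions, deletions, and substitutions (unit cost)."""

    @lru_cache(maxsize=None)
    def D(i: int, j: int) -> int:
        if i == 0:
            return j  # insert the first j characters of y
        if j == 0:
            return i  # delete the first i characters of x
        sub = 0 if x[i - 1] == y[j - 1] else 1
        return min(
            D(i - 1, j) + 1,        # delete x_i
            D(i, j - 1) + 1,        # insert y_j
            D(i - 1, j - 1) + sub,  # match or substitute
        )

    return D(len(x), len(y))


print(edit_distance("cxy", "dab"))         # 3
print(edit_distance("kitten", "sitting"))  # 3
```

For "cxy" and "dab" no characters are shared, so three substitutions are the cheapest option.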

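We also create another matrix $p[i,j]$ $(1 \leq i \leq m, 1 \leq j \leq n)$ that stores an arrow pointing to the entry used in the computation, so that an optimal sequence of operations can be recovered.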

$p[i,j]=\left\{ \begin{array}{ll} Left& if \space Insertion\\ Up& if \space Deletion\\ Left Up& if \space Substitution \nonumber \end{array} \right.$

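Filling the table

Initially, we fill the first row $D[0,j]$ with $j$ and the first column $D[i,0]$ with $i$.

Turning the empty string into a string of length $j$, or a string of length $i$ into the empty string, is done most cheaply by inserting or deleting the characters one at a time.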

Pseudocode

Minimum-Edit-Distance(X,Y)
//Initialization
for i ← 0 to m do
    d[i,0] ← i
    p[i,0] ← "U"
end
for j ← 0 to n do
    d[0,j] ← j
    p[0,j] ← "L"
end
//Dynamic Programming
for i ← 1 to m do
    for j ← 1 to n do
        if xi != yj then
            c ← 1
        end
        else
            c ← 0
        end
        if d[i-1,j-1]+c <= d[i-1,j]+1 and d[i-1,j-1]+c <= d[i,j-1]+1 then
            d[i,j] ← d[i-1,j-1]+c
            p[i,j] ← "LU"
        end
        else if d[i-1,j]+1 <= d[i-1,j-1]+c and d[i-1,j]+1 <= d[i,j-1]+1 then
            d[i,j] ← d[i-1,j]+1
            p[i,j] ← "U"
        end
        else
            d[i,j] ← d[i,j-1]+1
            p[i,j] ← "L"
        end
    end
end
return d,p

Print-MED(p,X,Y,i,j)
if i is equal to 0 and j is equal to 0 then
    return NULL;
end
if p[i,j] is equal to "LU" then
    Print-MED(p,X,Y,i-1,j-1);
    if xi = yj then
        do nothing
    end
    else
        print Substitute xi with yj
    end
end
else if p[i,j] is equal to "U" then
    Print-MED(p,X,Y,i-1,j);
    print Delete xi
end
else
    Print-MED(p,X,Y,i,j-1);
    print Insert yj
end
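
A runnable Python sketch of both procedures follows. The names `med_table` and `edit_script` are hypothetical, and the traceback is a loop that collects the operations in a list instead of printing them recursively. The loop continues until both indices reach 0, so the arrows stored in the first row and first column also contribute their insertions and deletions.

```python
def med_table(x: str, y: str):
    """Fill the distance table d and the arrow table p for edit distance."""
    m, n = len(x), len(y)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    p = [[""] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0], p[i][0] = i, "U"  # delete every character of x[:i]
    for j in range(1, n + 1):
        d[0][j], p[0][j] = j, "L"  # insert every character of y[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            c = 0 if x[i - 1] == y[j - 1] else 1
            best = min(d[i - 1][j - 1] + c, d[i - 1][j] + 1, d[i][j - 1] + 1)
            d[i][j] = best
            # Same tie-break order as the pseudocode: LU, then U, then L.
            if best == d[i - 1][j - 1] + c:
                p[i][j] = "LU"
            elif best == d[i - 1][j] + 1:
                p[i][j] = "U"
            else:
                p[i][j] = "L"
    return d, p


def edit_script(p, x: str, y: str):
    """Follow the arrows back from p[m][n] and list the edits in order."""
    ops = []
    i, j = len(x), len(y)
    while i > 0 or j > 0:
        if p[i][j] == "LU":
            if x[i - 1] != y[j - 1]:
                ops.append(f"substitute {x[i - 1]} with {y[j - 1]}")
            i, j = i - 1, j - 1
        elif p[i][j] == "U":
            ops.append(f"delete {x[i - 1]}")
            i -= 1
        else:
            ops.append(f"insert {y[j - 1]}")
            j -= 1
    ops.reverse()
    return ops


d, p = med_table("kitten", "sitting")
print(d[6][7])
for op in edit_script(p, "kitten", "sitting"):
    print(op)
```

The number of listed operations equals `d[m][n]`, since every step on the path other than a free match costs exactly 1.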