Problem
Implement regular expression matching with support for '.' and '*'.
'.' Matches any single character.
'*' Matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
The function prototype should be:
bool isMatch(const char *s, const char *p)
Examples
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "a*") → true
isMatch("aa", ".*") → true
isMatch("ab", ".*") → true
isMatch("aab", "c*a*b") → true
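As a sanity check, the examples above can be compared against std::regex_match from the C++ standard library, which also requires the whole string to match. This is only a sketch: std::regex accepts far more syntax than '.' and '*', so it agrees with this problem only for patterns restricted to those two operators. The helper names referenceMatch and checkExamples are made up for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical reference oracle: full-string match using the standard library.
// Assumption: p uses only literal characters, '.' and '*'.
bool referenceMatch(const std::string& s, const std::string& p) {
    return std::regex_match(s, std::regex(p));
}

// Replays the seven examples listed above against the oracle.
void checkExamples() {
    assert(!referenceMatch("aa", "a"));
    assert(referenceMatch("aa", "aa"));
    assert(!referenceMatch("aaa", "aa"));
    assert(referenceMatch("aa", "a*"));
    assert(referenceMatch("aa", ".*"));
    assert(referenceMatch("ab", ".*"));
    assert(referenceMatch("aab", "c*a*b"));
}
```

Calling checkExamples() completes silently if every example behaves as listed.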
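Analysis
First, pin down the rules of this regular-expression dialect:
- '.' matches any single character; it cannot match the empty string;
- '*' means the preceding character is repeated n times, n >= 0.
By definition, a '*' must always be preceded by a character.
This suggests dynamic programming:
State table
dp[i][j] indicates whether s[0, i - 1] matches p[0, j - 1], that is, whether the first i characters of s match the first j characters of p.
Initial state
dp[0][0] = true    s is empty and p is empty, so they match;
dp[i][0] = false, i >= 1    s is non-empty and p is empty, so they cannot match;
dp[0][j] = j > 1 && p[j - 1] == '*' && dp[0][j - 2], j >= 1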
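The first row of the table can be computed on its own, which makes it easy to see which pattern prefixes match the empty string. The following is a minimal sketch, assuming the pattern is well-formed (it does not start with '*'); emptyStringRow is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns dp[0][0..n], i.e. whether the empty string
// matches each prefix p[0, j - 1] of the pattern.
std::vector<bool> emptyStringRow(const std::string& p) {
    // Assumption: a '*' always has a preceding character.
    assert(p.empty() || p[0] != '*');
    const int n = static_cast<int>(p.size());
    std::vector<bool> row(n + 1, false);
    row[0] = true;  // empty pattern matches the empty string
    for (int j = 1; j <= n; ++j) {
        // Only a trailing "x*" can vanish, leaving the prefix of length j - 2.
        row[j] = j > 1 && p[j - 1] == '*' && row[j - 2];
    }
    return row;
}
```

For p = "c*a*b" the row is true, false, true, false, true, false: only the prefixes "", "c*" and "c*a*" match the empty string.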
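Note on dp[0][j]: here s is empty. When the pattern prefix has length 1 (j = 1), it can never match, whether that character is '*', '.', or anything else. When j > 1, s matches p[0, j - 1] if and only if p[j - 1] == '*' and dp[0][j - 2] is true. For example, a pattern p of length n can be written as p[0, j - 3] a* p[j, n - 1], where p[j - 2] = 'a' and p[j - 1] = '*'. Since a* must then repeat zero times, p[0, j - 3]a* matches the empty s only if p[0, j - 3] matches s, that is, only if dp[0][j - 2] is true.
State transition equations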
dp[i][j] = dp[i - 1][j - 1] && (s[i - 1] == p[j - 1] || p[j - 1] == '.'); if p[j - 1] != '*'
dp[i][j] = dp[i][j - 2] || ((s[i - 1] == p[j - 2] || p[j - 2] == '.') && dp[i - 1][j]); if p[j - 1] == '*'
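Note: the term dp[i][j - 2] covers '*' repeating zero times, so that p[j - 2, j - 1] acts as the empty string. The term (s[i - 1] == p[j - 2] || p[j - 2] == '.') && dp[i - 1][j] covers '*' repeating n (n >= 1) times: the character comparison accounts for one repetition, consuming s[i - 1], and dp[i - 1][j] looks up the remaining n - 1 repetitions in the state table.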
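To see both transitions at work, the following sketch fills the whole table and returns it so that it can be inspected. buildTable and the Table alias are hypothetical names, and the code assumes a well-formed pattern (no leading '*').

```cpp
#include <cassert>
#include <string>
#include <vector>

using Table = std::vector<std::vector<bool>>;

// Hypothetical helper: returns the full (m + 1) x (n + 1) state table.
Table buildTable(const std::string& s, const std::string& p) {
    assert(p.empty() || p[0] != '*');  // assumption: '*' follows a character
    const int m = static_cast<int>(s.size());
    const int n = static_cast<int>(p.size());
    Table dp(m + 1, std::vector<bool>(n + 1, false));
    dp[0][0] = true;
    for (int j = 2; j <= n; ++j) {
        dp[0][j] = p[j - 1] == '*' && dp[0][j - 2];
    }
    for (int i = 1; i <= m; ++i) {
        for (int j = 1; j <= n; ++j) {
            if (p[j - 1] != '*') {
                // Literal or '.': both sides consume one character.
                dp[i][j] = dp[i - 1][j - 1] &&
                           (s[i - 1] == p[j - 1] || p[j - 1] == '.');
            } else {
                // '*' repeats zero times: drop "x*" from the pattern.
                const bool zeroTimes = dp[i][j - 2];
                // '*' repeats once more: consume s[i - 1], keep "x*".
                const bool oneMore =
                    (s[i - 1] == p[j - 2] || p[j - 2] == '.') && dp[i - 1][j];
                dp[i][j] = zeroTimes || oneMore;
            }
        }
    }
    return dp;
}
```

For s = "aab" and p = "c*a*b" the rows i = 0..3 are:
i = 0: 1 0 1 0 1 0
i = 1: 0 0 0 1 1 0
i = 2: 0 0 0 0 1 0
i = 3: 0 0 0 0 0 1
The bottom-right entry dp[3][5] is true, consistent with the example isMatch("aab", "c*a*b") → true.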
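Key points
Dynamic programming over a two-dimensional state table (dp).
Time complexity
O(mn), where m and n are the lengths of s and p
Space complexity
O(mn)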
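Each entry of the table depends only on its own row and the row directly above it, so the O(mn) space could plausibly be reduced to O(n) by keeping two rows. The following is a variant sketch, not the solution given in this post; isMatchTwoRows is a hypothetical name, and a well-formed pattern is assumed.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical two-row variant of the same recurrence.
bool isMatchTwoRows(const std::string& s, const std::string& p) {
    const int m = static_cast<int>(s.size());
    const int n = static_cast<int>(p.size());
    std::vector<bool> prev(n + 1, false), cur(n + 1, false);
    prev[0] = true;  // row i = 0: matching against the empty string
    for (int j = 1; j <= n; ++j) {
        prev[j] = j > 1 && p[j - 1] == '*' && prev[j - 2];
    }
    for (int i = 1; i <= m; ++i) {
        cur[0] = false;  // non-empty string never matches an empty pattern
        for (int j = 1; j <= n; ++j) {
            if (p[j - 1] != '*') {
                cur[j] = prev[j - 1] &&
                         (s[i - 1] == p[j - 1] || p[j - 1] == '.');
            } else {
                assert(j > 1);  // assumption: '*' follows a character
                cur[j] = cur[j - 2] ||
                         ((s[i - 1] == p[j - 2] || p[j - 2] == '.') && prev[j]);
            }
        }
        std::swap(prev, cur);  // the row just filled becomes "previous"
    }
    return prev[n];
}
```

The time complexity stays O(mn); only the storage shrinks, at the cost of no longer having the full table available for inspection.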
Code
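#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    // Assumes p is well-formed: every '*' is preceded by a character.
    bool isMatch(string s, string p) {
        int m = s.size(), n = p.size();
        // dp[i][j]: does s[0, i - 1] match p[0, j - 1]?
        vector<vector<bool>> dp(m + 1, vector<bool>(n + 1, false));
        dp[0][0] = true;
        // An empty pattern never matches a non-empty string.
        for (int i = 1; i <= m; i++)
            dp[i][0] = false;
        // The empty string matches a prefix only if it ends in "x*"
        // and the prefix before that "x*" also matches.
        for (int j = 1; j <= n; j++)
            dp[0][j] = j > 1 && p[j - 1] == '*' && dp[0][j - 2];
        for (int i = 1; i <= m; i++)
            for (int j = 1; j <= n; j++)
                if (p[j - 1] != '*')
                    // Literal or '.': consume one character on each side.
                    dp[i][j] = dp[i - 1][j - 1] && (s[i - 1] == p[j - 1] || p[j - 1] == '.');
                else
                    // '*': zero repetitions, or one more repetition of p[j - 2].
                    dp[i][j] = dp[i][j - 2] || ((s[i - 1] == p[j - 2] || p[j - 2] == '.') && dp[i - 1][j]);
        return dp[m][n];
    }
};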