Regular Expression Matching

September 1, 2011 in backtracking, string

Implement regular expression matching with support for ‘.’ and ‘*’.

‘.’ Matches any single character.
‘*’ Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch(“aa”,”a”) → false
isMatch(“aa”,”aa”) → true
isMatch(“aaa”,”aa”) → false
isMatch(“aa”, “a*”) → true
isMatch(“aa”, “.*”) → true
isMatch(“ab”, “.*”) → true
isMatch(“aab”, “c*a*b”) → true

Online Judge
This problem is available at Online Judge. Head over there and it will judge your solution. Currently only able to compile C++/Java code. If you are using other languages, you can still verify your solution by looking at the judge’s test cases and its expected output.

Background:
It might seem deceptively easy even you know the general idea, but programming it correctly with all the details require careful thought.

Edit:
It seems that some readers are confused about why the regex pattern “.*” matches the string “ab”. “.*” means repeat the preceding element 0 or more times. Here, the “preceding” element is the dot character in the pattern, which can match any characters. Therefore, the regex pattern “.*” allows the dot to be repeated any number of times, which matches any string (even an empty string).

Hints:
Think carefully how you would do matching of ‘*’. Please note that ‘*’ in regular expression is different from wildcard matching, as we match the previous character 0 or more times. But, how many times? If you are stuck, recursion is your friend.

This problem is a tricky one. Due to the huge number of edge cases, many people would write lengthy code and have numerous bugs on their first try. Try your best getting your code correct first, then refactor mercilessly to as clean and concise as possible!


A sample diagram of a deterministic finite state automata (DFA). DFAs are useful for doing lexical analysis and pattern matching. An example is UNIX’s grep tool. Please note that this post does not attempt to descibe a solution using DFA.

Solution:
This looks just like a straight forward string matching, isn’t it? Couldn’t we just match the pattern and the input string character by character? The question is, how to match a ‘*’?

A natural way is to use a greedy approach; that is, we attempt to match the previous character as many as we can. Does this work? Let us look at some examples.

s = “abbbc”, p = “ab*c”
Assume we have matched the first ‘a’ on both s and p. When we see “b*” in p, we skip all b’s in s. Since the last ‘c’ matches on both side, they both match.

s = “ac”, p = “ab*c”
After the first ‘a’, we see that there is no b’s to skip for “b*”. We match the last ‘c’ on both side and conclude that they both match.

It seems that being greedy is good. But how about this case?

s = “abbc”, p = “ab*bbc”
When we see “b*” in p, we would have skip all b’s in s. They both should match, but we have no more b’s to match. Therefore, the greedy approach fails in the above case.

One might be tempted to think of a quick workaround. How about counting the number of consecutive b’s in s? If it is smaller or equal to the number of consecutive b’s after “b*” in p, we conclude they both match and continue from there. For the opposite, we conclude there is not a match.

This seem to solve the above problem, but how about this case:
s = “abcbcd”, p = “a.*c.*d”

Here, “.*” in p means repeat ‘.’ 0 or more times. Since ‘.’ can match any character, it is not clear how many times ‘.’ should be repeated. Should the ‘c’ in p matches the first or second ‘c’ in s? Unfortunately, there is no way to tell without using some kind of exhaustive search.

We need some kind of backtracking mechanism such that when a matching fails, we return to the last successful matching state and attempt to match more characters in s with ‘*’. This approach leads naturally to recursion.

The recursion mainly breaks down elegantly to the following two cases:

  1. If the next character of p is NOT ‘*’, then it must match the current character of s. Continue pattern matching with the next character of both s and p.
  2. If the next character of p is ‘*’, then we do a brute force exhaustive matching of 0, 1, or more repeats of current character of p… Until we could not match any more characters.

You would need to consider the base case carefully too. That would be left as an exercise to the reader. :)

Below is the extremely concise code (Excluding comments and asserts, it’s about 10 lines of code).

Further Thoughts:
Some extra exercises to this problem:

  1. If you think carefully, you can exploit some cases that the above code runs in exponential complexity. Could you think of some examples? How would you make the above code more efficient?
  2. Try to implement partial matching instead of full matching. In addition, add ‘^’ and ‘$’ to the rule. ‘^’ matches the starting position within the string, while ‘$’ matches the ending position of the string.
  3. Try to implement wildcard matching where ‘*’ means any sequence of zero or more characters.

For the interested reader, real world regular expression matching (such as the grep tool) are usually implemented by applying formal language theory. To understand more about it, you may read this article.

VN:F [1.9.22_1171]
Rating: 4.8/5 (51 votes cast)
Regular Expression Matching, 4.8 out of 5 based on 51 ratings

95 responses to Regular Expression Matching

  1. why isMatch(“aab”, “c*a*b”) -> true?
    obviously, the second string contains one extra ‘c’ compared to the first string. Am i correct?
    Thanks, great website:)

    VA:F [1.9.22_1171]
    +1
    • isMatch(“aab”, “c*a*b”) returns true because c* means 0 or more ‘c’ to match. Since it doesn’t start with a ‘c’, so c* matches 0 ‘c’. Similarly, a* matches the two ‘a’s, and b matches the last ‘b’.

      VN:F [1.9.22_1171]
      +2
      • ic,lol

        VA:F [1.9.22_1171]
        0
      • That’s obviously wrong!

        VA:F [1.9.22_1171]
        -4
        • I think “c*” should be clarified in the question. People may think ‘* ‘in “c*” means it can be 0 or more previous char, which is ‘c’. But ‘c’ in “c*” already exists. So “c*” means at least one ‘c’.

          I guess here ‘*’ can cancel the previous char, and thus “c*” means 0 or more ‘c’. But, I think this is not common sense.

          VN:F [1.9.22_1171]
          0
    • because ‘*’ can match current pattern 0 time.

      VN:F [1.9.22_1171]
      0
    • I think it is wrong, or if the second has a * , the strings are match,

      VA:F [1.9.22_1171]
      0
  2. Could you please tell me what’s the running time of this algorithm? O(n^2)?

    VA:F [1.9.22_1171]
    0
    • In case s=”a…..ab” p=”a*a*….a*”, it takes exponential time and O(n) space. Using memorizing, I think it takes O(n^3) time and O(n^2) space.

      VA:F [1.9.22_1171]
      +2
  3. The +1 button on my side has some problem

    VA:F [1.9.22_1171]
    -8
  4. Could you tell why isMatch(“ab”, “.*”) → true? .can match “a”, so “.*” is a or aa or aa… How could it be “ab”?

    VA:F [1.9.22_1171]
    +3
  5. how could isMatch(“ab”,”.*”) -> true ?? . matches a then preceding * can only match more a’s , how did this b came then ….!!

    VA:F [1.9.22_1171]
    0
    • It means 0 or multiple single character, which definitely match “ab”.

      VN:F [1.9.22_1171]
      -1
      • It should not be true. * means 0 or multiple “preceding” character. In this case it is ‘a’. So how did ‘b’ come from?

        VA:F [1.9.22_1171]
        0
        • The preceding character is ‘.’, which can stand for any character.
          In a another word, “.*’ equals anything as long as there is no tail behind, like “.*c”. If there is a tail, in s we must have corresponding ‘c’

          VN:F [1.9.22_1171]
          0
    • @rockstar and @ds:
      I mentioned this in my post, “.*” means repeat the preceding element of the pattern itself, which is the single dot character. Since dot can match virtually any character, the pattern “.*’ will match the string “ab”.

      If you do not believe me, I’ll quote this from here, http://www.regular-expressions.info/dot.html :
      “Suppose you want to match a double-quoted string. Sounds easy. We can have any number of any character between the double quotes, so “.*” seems to do the trick just fine. The dot matches any character, and the star allows the dot to be repeated any number of times, including zero.”

      VN:F [1.9.22_1171]
      +1
  6. dumb test case isMatch(“aa”, “*a”) will trigger assert, which should not.

    VN:F [1.9.22_1171]
    -1
    • from my understanding, the ‘*’ must have a preceding element, so your use case maybe wrong regarding to this scenario.

      VN:F [1.9.22_1171]
      +2
  7. in other word, a wrapper function is needed.

    VN:F [1.9.22_1171]
    0
  8. Please find the java version of the solution [Naive method]

    VN:F [1.9.22_1171]
    +2
    • Looks like your code is not working.

      This returns false, but patten actually matches:

      System.out.println(pm.regexMatch("abbbbc".toCharArray(), "ab*bbbbc".toCharArray()));

      VN:F [1.9.22_1171]
      0
    • Here is my version based on author posted solution:


      public boolean matches(String text, String pattern) {
      // if pattern is empty - whole string should be empty to match
      if (pattern.length() == 0) return text.length() == 0;

      // retrieve 2nd character of the pattern
      String nextChar = (pattern.length() > 1) ? String.valueOf(pattern.charAt(1)) : "";
      char t = (text.length() > 0) ? text.charAt(0) : 0;
      char p = pattern.charAt(0);

      // if 2nd char is not '*' - it means that we are checking if next char in text matches with next char in pattern
      // and recursively run the code starting from +1 char in both pattern/text
      if (!"*".equals(nextChar)) {
      return ((t == p) || (p == '.' && t != 0)) &&
      matches(text.substring(1), pattern.substring(1));
      } else {
      while ((t == p) || (p == '.' && t != 0)) {
      // check if current text matches tail part of the pattern (+2 symbols)
      if (matches(text, pattern.substring(2))) return true;
      // cut first symbol in original text and repeat the check in the loop
      text = text.substring(1);
      t = (text.length() > 0) ? text.charAt(0) : 0;
      }
      return matches(text, pattern.substring(2));
      }
      }

      VN:F [1.9.22_1171]
      0
  9. Thanks for the tips and article

    VN:F [1.9.22_1171]
    0
  10. I come up with a dynamic programming solution. It takes O(n^2) in time and O(n) in space.

    Could you please check if it is correct.

    Here is the solution:

    d[i,j] is true iff a[0..i] matches r[0..j]

    d[i,j] is true if one of these cases is true:

    (1) a[i] == r[j] and d[i-1,j-1] is true (match a character with a character)
    (2) r[j] == ‘.’ and d[i-1,j-1] is true
    (3) r[j] is * and it can match a[i] and d[i-1,j-1] is true (* consumes character as its first one)
    (4) r[j] is * and it can match a[i] and d[i-1,j] is true (* consumes 1 more character)
    (5) r[j] is * and it can match a[i] and d[i,j-1] is true (* matches an empty string)

    The solution is called “Elastic Matching” used in Handwriting recognition.

    Please see the full solution here: http://tanin.nanakorn.com/blog/!/id25

    VA:F [1.9.22_1171]
    0
    • Detail is a little off and wrong. Sorry about that. But I guess you can get the concept of how it works

      VA:F [1.9.22_1171]
      0
  11. Could anyone give me a O(N) solution ?

    VN:F [1.9.22_1171]
    0
  12. Nice code!

    I have a minor question.

    IsMatch(“a”, “ab*a”) should be “ture”, but this code returns “false”.

    Please correct me if I miss some point

    VN:F [1.9.22_1171]
    0
    • IsMatch(“a”, “ab*a”) should return “false”.
      This is because the pattern “ab*a” means the string begin with an a, following by 0 or more b’s, then an another a at the end.
      This imply that it must minimally contain two characters which has two a’s, ie. “aa”.

      VN:F [1.9.22_1171]
      0
    • Oh, I miss the whole match condition.

      If the regular expression string should be equal to the entire input string, then IsMatch(“a”, “ab*a”)–>false

      VN:F [1.9.22_1171]
      0
      • Yup, that’s correct. It’s mentioned in the problem statement, Just bolded the word “entire” to make the statement clearer.

        VN:F [1.9.22_1171]
        0
  13. The code is amazing.

    I just don’t understand why we need:

    assert(*p != ‘*’), which is after if (*(p+1) != ‘*’)

    Thanks,

    VN:F [1.9.22_1171]
    0
    • This is to assume that multiple *’s are not in the input pattern and before a * there must be a letter to match, or else the code above will not be correct. In the case where multiple *’s can be in the input pattern, you can just use a loop to skip those, as those are redundant and won’t affect the correctness.

      VN:F [1.9.22_1171]
      0
  14. There is a bug.

    char* exp = “b*bb”, char* text = “bb”

    This should be a “Not match”, but your code will answer “match”.

    You should change your last code to

    VA:F [1.9.22_1171]
    -1
    • Thanks for your comment.

      exp=”b*bb” and text=”bb” should be a match. This is because “b*” could match 0 or more b’s. In this case it would be 0 b.

      VN:F [1.9.22_1171]
      0
  15. I think my code is same with answer but it shows the backtracking aspect of problem more.

    int match(char *s, char *p)
    {
    int char_matched = 0;

    if (*s == ”)
    return (*p == ”);

    if (*s == *p || *p == ‘.’)
    char_matched = 1;

    if (*(p+1) == ‘*’)
    return char_matched? (match(s+1, p) || match(s+1, p+2)) : match(s+1, p+2);
    else
    return char_matched? match(s+1, p+1): 0;
    }

    VA:F [1.9.22_1171]
    0
    • I just saw “bb”, “b*bb” case above. Yeah.. this should match. Added the case.

      int match(char *s, char *p)
      {
      int char_matched = 0;

      if (*s == ”)
      return (*p == ”);

      if (*s == *p || *p == ‘.’)
      char_matched = 1;

      if (*(p+1) == ‘*’)
      return char_matched? (match(s+1, p) || match(s+1, p+2) || match(s, p+2)) : match(s+1, p+2);
      else
      return char_matched? match(s+1, p+1): 0;
      }

      VA:F [1.9.22_1171]
      0
      • This code is buggy. Here are some examples where it would fail:
        – it would return false on empty string for regex “.*” while it should have returned true.
        – it would fail in the case such as text=”ab”and regex=”ab.*”.

        VA:F [1.9.22_1171]
        0
  16. the code doesn’t handle when the length of p is 1

    VN:F [1.9.22_1171]
    0
  17. // next char is ‘*’
    while ((*p == *s) || (*p == ‘.’ && *s != ”))

    Hello, in this piece of code, *p == ‘.’ would never be true since there is no any code inside while loop operates p.

    VA:F [1.9.22_1171]
    0
  18. Cannot understand the following code:
    // next char is ‘*’
    while ((*p == *s) || (*p == ‘.’ && *s != ”)) {
    if (isMatch(s, p+2)) return true;
    s++;
    }

    why use “if (isMatch(s, p+2)) return true;”? This will stop comparing s to the characters before ‘*’ in p. Is this correct?

    VN:F [1.9.22_1171]
    0
  19. This code is breaking for abdabcabc & .*abc

    VA:F [1.9.22_1171]
    0
    • “the regex pattern “.*” allows the dot to be repeated any number of times, which matches any string (even an empty string)”.

      s = abdabcabc; p = .*abc.
      (1) .* will match the first six characters of s, i.e., abdabc (you could consider .* as “……” in this case);
      (2) the last three characters of s and p match.

      Therefore, .*abc matches abdabcabc.

      VN:F [1.9.22_1171]
      0
  20. Python? DP can be applied (not in the following code).

    Run:
    ./reg.py “abcbcd” “a.*c.*d”

    VN:F [1.9.22_1171]
    0
  21. I am not sure about the time complexity of solution.
    Why do not solve this problem via DP? it would be O(n ^ 2) ~

    VA:F [1.9.22_1171]
    0
  22. here’s my code:
    class Solution {
    public:

    bool canMatch(char a, char b)
    {
    return (a == b || b == ‘.’);
    }
    bool isMatch(const char *s, const char *p) {
    // Start typing your C/C++ solution below
    // DO NOT write int main() function
    int lS = strlen(s);
    int lP = strlen(p);
    vector<vector > F(lS + 1, vector(lP + 1));
    F[0][0] = true;
    for (int i = 0; i <= lS; i++)
    {
    for (int j = 1; j 0)
    if (F[i - 1][j - 1] && canMatch(s[i - 1], p[j - 1]))
    {
    F[i][j] = true;
    continue;
    }
    if (i > 0 && j > 1)
    if (F[i - 1][j] && canMatch(s[i - 1], p[j - 2]) && p[j - 1] == ‘*’)
    {
    F[i][j] = true;
    continue;
    }
    if (j > 1)
    if (F[i][j - 2] && p[j - 1] == ‘*’)
    {
    F[i][j] = true;
    continue;
    }
    }
    }
    return F[lS][lP];
    }
    };

    it can pass all test cases.

    VA:F [1.9.22_1171]
    0
  23. mark

    VN:F [1.9.22_1171]
    0
  24. Further thoughts:
    Problem 1:
    I came up one case that takes a long time to get the result while the length of string s and p are not huge.

    s[] = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
    p[] = "a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*b"

    VN:F [1.9.22_1171]
    0
  25. I don’t know is this the case of exponentially complexity or not. Can anyone help me check it? Thanks.

    VN:F [1.9.22_1171]
    0
  26. here, definition of ‘*’ is confusing, a*b could equal to following strings?
    b–>cancel a
    ab–>0 a
    aab–>1 a
    aaab–>2 a
    aaaab–>3 a

    VN:F [1.9.22_1171]
    0
  27. Knuth morris pratt modified, * is greedy. All we have to do is build patterns of all characters upto *

    for .e.g ab*bc*cd, we build 3 prefix that are also suffixes pattern for ab , bc and cd respectively.

    for the actual string, we first start with ab using its prefix array try to match the string, once a match occurs we recursively try to match from that matched position for bc and finally for cd. we backtrack when we complete the entire string for any recursion level, seems like O(n*number of stars) to me. I am in the process of writing code for it.

    VA:F [1.9.22_1171]
    0
    • My previous solution was for wildcard match but it lends intuitively into regex match also. Cancel kmp prefix generation for a character whose next char is a *, then the same algo holds but this time we need to move to the next matching block once the current * char is matched and backtrack here only when the next matching block fails.

      VA:F [1.9.22_1171]
      0
  28. nice post, but has problem as isMatch(“aa”, “*a”) triggers assert which leads to termination of program.

    this line is incorrect in the code, as it can lead to segmentation fault… how can we directly access p+2 element, we know for sure that p is not ”, but p+1 element can be ” , therefore leading to p+2 to be undefined.

    VA:F [1.9.22_1171]
    0
    • This branch will be executed while ” // next char is ‘*’ “. This means *(p + 1) = ‘*’.
      Then *(p + 2) can be ”, the terminal of string.

      VN:F [1.9.22_1171]
      0
  29. I think there is a bug for your code.

    For the case where p = “a” and s = “b*a”, your code will return FALSE. However, I believe it should return TRUE.

    What do you think? Did I miss anything?

    VN:F [1.9.22_1171]
    0
  30. The following is the DP solution which passes all the online cases:

    class Solution {
    public:
    bool isMatch(const char *s, const char *p) {
    vector row_char, column_char;
    vector column_st;
    while(*s != ”)
    row_char.push_back(*(s++));

    while(*p != ”)
    {
    if(*p != ‘*’)
    {
    column_char.push_back(*p);
    if(*(p + 1) == ‘*’)
    column_st.push_back(true);
    else
    column_st.push_back(false);
    }
    p++;
    }
    int row_sz = row_char.size(), column_sz = column_char.size();
    if(row_sz == 0 && column_sz == 0)
    return true;
    if(row_sz == 0)
    {
    for(int i = 0; i < column_sz; i++)
    if(column_st[i] == false)
    return false;
    return true;
    }
    if(column_sz == 0)
    return false;

    bool state[2][column_sz];
    int flag = false;
    int i = 0;
    for(; i < column_sz; i++)
    {
    if(column_char[i] == '.' || column_char[i] == row_char[0])
    flag = true;
    if(column_st[i] == true)
    state[0][i] = true;
    else
    break;
    }
    if(i == column_sz && flag == false)
    return false;
    if(i < column_sz)
    {
    if(column_char[i] == '.' || column_char[i] == row_char[0])
    state[0][i] = true;
    else
    state[0][i] = false;
    i++;
    }
    for(; i < column_sz; i++)
    {
    if(column_st[i] == true && state[0][i - 1] == true)
    state[0][i] = true;
    else
    state[0][i] = false;
    }

    int cur = 1, pre = 0;

    for(i = 1; i < row_sz; i++)
    {
    if(column_st[0] == true && column_char[0] == '.')
    state[cur][0] = true;
    else if(column_st[0] == true && state[pre][0] == true && row_char[i - 1] == column_char[0] && row_char[i] == column_char[0])
    state[cur][0] = true;
    else
    state[cur][0] = false;
    for(int j = 1; j < column_sz; j++)
    {
    if(column_st[j] == true && state[cur][j - 1] == true)
    state[cur][j] = true;
    else if(column_char[j] == '.' || row_char[i] == column_char[j])
    {
    if(state[pre][j - 1] == true)
    state[cur][j] = true;
    else if(column_st[j] == true && state[pre][j] == true)
    state[cur][j] = true;
    else
    state[cur][j] = false;
    }
    else
    state[cur][j] = false;
    }

    swap(cur, pre);
    }
    return state[pre][column_sz - 1];
    }
    };

    VN:F [1.9.22_1171]
    0
  31. Here is the ultimate in conciseness (short of using a regex library): 6 lines of Prolog!

    Note that 42 is the ASCII for ‘*’ and 46 the ASCII for ‘.’. A sample query is this:

    VA:F [1.9.22_1171]
    0
  32. VN:F [1.9.22_1171]
    +1
    • This seems neat and passes all OJ tests.

      But it has bugs and fails on this simple example:

      s* = “a”
      p* = “b*a”

      VN:F [1.9.22_1171]
      0
      • sorry I think I misunderstood what * means, it could cancel the preceding character.

        VN:F [1.9.22_1171]
        +1
  33. isMatch(s, p+2) maybe out of boundary, in which case will store unexpected characters. Better check before go into.

    *(p+1) != ” && isMatch(s, p+2)

    VA:F [1.9.22_1171]
    0
  34. your code is giving not match for isMatch(“.”,”a”) but it should be a match.

    VA:F [1.9.22_1171]
    0
  35. Use DP, Time Complexity is O(N^2), and I only store the last and current row of the table, so the Space Complexity is O(N). Here is the working code.

    VN:F [1.9.22_1171]
    0
  36. input abaaabc
    pattern .*bc

    VN:F [1.9.22_1171]
    0
  37. isMatch(“ab”, “.*”) → true
    but why “ab”, “.*c” expect false?

    VN:F [1.9.22_1171]
    0
  38. I think the posted sloution will fail for the following test case:

    s=”xabbb” p=”x.*”

    for .* first it will check for a and it will match…. but the rest of three b’s will also match with the same .*. so it will return TURE but it should return FALSE.
    ifff s=”xaaaa” p=”x.*” then it should return TRUE. plz lemmy know if im right or wrong.

    VA:F [1.9.22_1171]
    0
  39. Here is my solution,

    int Findstr(char *string, char *str, int len)
    {
    assert(string != NULL && str !=NULL && len>0);
    for(int i=0; i<strlen(string); i++)
    {
    int k = i, j;
    for(j=0; j<len && (str[j] == '?' || string[k] == str[j]); j++,k++);
    if(j==len) return i;
    }
    return -1;
    }

    int SplitString(char *&str, char c)
    {
    assert(str!=NULL); //?? temp!=NULL
    int cnt=0;
    int len = strlen(str);
    for(int i=0; i<len; i++)
    {
    if(str[i]!=c && (i+1== len || str[i+1] == c)) //str[i] ==c && (i+1 == len || str[i+1] != c)
    {
    str = str+i-cnt;
    return cnt+1;
    }
    else if(str[i]!=c)
    cnt++;
    }
    return -1;
    }

    bool matchpattern(char *text, char *pattern)
    {
    assert(text!=NULL && pattern!=NULL);
    char *strP = pattern;
    char *strT = text;

    int len,index=0;
    while((len = SplitString(strP, '*')) != -1
    && (index = Findstr(strT, strP, len)) != -1)
    {
    strP += len;
    strT += index+len;
    }
    if((len == -1) && (pattern[strlen(pattern)-1] == '*' || *strT == NULL)) return true;
    return false;
    }

    VN:F [1.9.22_1171]
    0
  40. DP solution, time complexity n^2

    public boolean isMatch(String s, String p) {
    if (s == null || p == null) return false;
    int sl = s.length();
    int pl = p.length();
    boolean a[][] = new boolean[pl+1][sl+1];
    a[0][0] = true;
    for(int i=1; i<=sl;i++){
    a[0][i] = false;
    }
    for(int i=1; i 1) {
    a[i][0] = a[i-2][0];
    } else {
    a[i][0] = false;
    }
    }
    for(int i=1; i <= pl; i++) {
    for(int j=1; j <= sl; j++) {
    if(p.charAt(i-1) == '*' && i != 1) {
    a[i][j] = a[i-2][j] || a[i-1][j] ||
    (a[i][j-1] && (p.charAt(i-2) == '.' || p.charAt(i-2) == s.charAt(j-1)));
    } else {
    a[i][j] = a[i-1][j-1]
    && (p.charAt(i-1) == '.' || p.charAt(i-1) == s.charAt(j-1));

    }

    }
    }
    return a[pl][sl];
    }

    VN:F [1.9.22_1171]
    0
  41. DP solution, time complexity n^2

    VN:F [1.9.22_1171]
    0
  42. For this case: “”, “.*”, even I wrote: (in the first line, it shows false)
    if(s==”"&&p==”.*”)
    return true;

    Why???????????????????????????????????

    VN:F [1.9.22_1171]
    0
  43. I once extended the classical KMP algorithm to make it able to handle wild card in pattern matching. The complexity stays O(N) time and O(N) space. This is the code, but it is extremely hard to understand: 10 times harder than the classical KMP. I don’t want to put any explanation here because a detailed explanation may cost a paper. If someone is really interested in it, email me: the.easiest.name.to.remember@gmail.com

    vector next_ext(const string &str)
    {
    int leng = str.size();
    if (leng < 0 || str.empty())
    return vector();
    vector next(leng + 1);
    int i = -1, j, off; //treated like str[-1] is a ‘*’
    off = i, j = off, i ++, next[i] = off;
    while (i < leng)
    {
    if (str[i] == '*' ) //consistent with the above comment.
    off = i, j = off, i ++, next[i] = off;
    else if (j == off || str[i] == str[j] || str[j] == '.')
    ++i, ++j, next[i] = j;
    else
    j = next[j];
    }
    return next;
    }

    //if matched, return the last position of the first matched substring of the string, else return -1.
    int kmp_ext(string str, string pattern)
    {
    int str_len = str.size(), pattern_len = pattern.size();
    vector next = next_ext(pattern);
    int i = 0, j = -1, off; // treated like pattern[-1] is a ‘*’
    off = j, j ++;
    while(i < str_len && j < pattern_len)
    {
    if (j != off && pattern[j] == '*' )
    off = j, j ++; //consistent with the above comment
    else if( j == off || str[i] == pattern[j] || pattern[j] == '.')
    j++, i++;
    else
    j = next[j];
    }
    if(j == pattern_len)
    return i;
    else
    return -1;
    }

    int main()
    {
    string s = "abdcxdefabc";

    cout<<kmp_ext(s, "ab..cx")<<endl;
    cout<<kmp_ext(s, "ab*x.*fa*c")<<endl;
    cout<<kmp_ext(s, "cxd*")<<endl;
    cout<<kmp_ext(s, "*b*x.*fa*c")<<endl;
    cout<<kmp_ext(".ab", ".a.")<<endl;
    cout<<kmp_ext("a*b", "a*b")<<endl;
    }

    VN:F [1.9.22_1171]
    0
  44. VN:F [1.9.22_1171]
    0
  45. Such an amazing code! I even spent half a day to understand the code!

    VN:F [1.9.22_1171]
    0
  46. if (*(p+1) != ‘*’) may overflow

    VA:F [1.9.22_1171]
    0
  47. If ab matches “.*”
    Does that mean that .* will match everything after a .* appeared in input?

    VA:F [1.9.22_1171]
    +1
  48. Hi,

    Thanks for sharing code, I found hard on understanding following lines.

    while ((*p == *s) || (*p == ‘.’ && *s != ”)) {
    if (isMatch(s, p+2)) return true;
    s++;
    }

    If I have S = ‘abbbc’, P = ‘ab*ba’, when in the second loop, the line “if (isMatch(s, p+2)) return true;: will be true and return True to the main loop. Then the result will be true. Is my understanding correct?

    Thanks

    VA:F [1.9.22_1171]
    0
  49. The implication staying created right here is that when you shop for these types of shoes via the net Nike Tn Requin , you will be exposed to a range of positive aspects. For starters, you will get to appear at the a variety of makes of these irresistible sneakers. This way, you will get an overall notion of which footwear you want.This implies that previous to you make the order, you will have noticed several pairs of shoes.

    VN:F [1.9.22_1171]
    -1
  50. A brief discussion on some of the top branded tennis shoes as discussed as follows to help you choose your tennis shoes.Nike:Nike, one of the most popular names in tennis shoe market, has dominated the market since the 80`s nike tn . Popular sports stars from different fields like Tiger Woods, Michael Jordan and Lebron James endorse for this brand of shoes. The superstar studded endorsement has catapulted the status of Nike brand of shoes.Latest technology has been adopted in the Nike brand of shoes. Nike tennis shoes now are designed in such a way so as to bring in air to the soles. This latest feature makes the tennis shoe very comfortable.

    VN:F [1.9.22_1171]
    -1
  51. WHY?! glibc POSIX regex hold the view that “aa” matchs “a” https://github.com/xiangzhai/leetcode/blob/master/src/regex/regex.c#L51

    VN:F [1.9.22_1171]
    0
  52. It’s actually a great and helpful piece of information. I am glad that you simply shafed this useful
    information wit us. Please sty us up to date like this.

    Thank you for sharing.

    VA:F [1.9.22_1171]
    0
  53. Stadium held in Jinjiang Zuchang “China’s first sports city Sports shoes, fashion release event”, T-type station which serves a lively fashion show non-competitive class in 2009-2010 fall and winter sports shoes fashion trends Nike Tn Requin . To participate in the Eleventh China International Footwear Expo part of the manufacturers, local media took part in the publishing activities. To enhance the “Chinese shoes” Jinjiang leading role in the fashion industry, the Eleventh China International Footwear Fair and the China Leather Association, China Fashion Color Association held a “China’s first sports city Sports Shoe Fashion publish activities “, activities, organizations, design consultants and industry experts on the 2009-2010 fall and winter, when the non-competitive class sports shoes fashion trends to carry out research.

    VN:F [1.9.22_1171]
    0
  54. Appreciate the recommendation. Wiill try it out.

    VA:F [1.9.22_1171]
    0

Leave a reply

Your email address will not be published. Required fields are marked *

You may use the <code> tag to embed your code.