Topic: 簡介 KMP string matching Algorithm
          @CopyLeft by tsaiwn@csie.nctu.edu.tw
  KMP 是很重要的 string matching algorithm
是 Knuth-Morris-Pratt 三個人共同發表的
如果被找字串長度 n, 要找字串長度 m,
則此方法之 time complexity 為 O(m+n)   (注意土法煉鋼法是 O( m * n )  )
           ^^^^^^^^^^^^^^^^^^^^^^^^^                     ^^^^^^^^^^^^^
           (會考 :-)
它主要是靠一個 prefix overlap function來避開一些不必要的字串比較
這個function 主要是把要找的 pattern 每個字編號, 每次在出現新字時就歸 0
例如若要找的 pattern 是 abcaacaxybcbcbcdabcd 則編號如下:
      abcaacaxybcbcbcdabcd
      00012340012345601234
這樣比了不符合就可以不只 往右邊移位一字, 略掉不必要的比較
  KMP適用的場合在於text和 pattern 所用到的字符數不多的情況下
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
           (會考 :-)
例如 字串只由 {0,1}, {a,b,c}這種set組成的時候,
如果text和pattern所用的到文字符號數非常的多,
那麼要在text與pattern的相同字串中其最大的suffix往往是空集合
這樣KMP的string matching就沒法發揮好處(跟每次移動一字的比法相同?)
此時應該改用 Boyer-Moore algorithm
  這個 Boyer-Moore是利用bad-character heuristic和good-suffix heuristic
兩種機制來快速的找到相同的字串
關於 KMP algorithm 可以看:
   http://www1.ics.uci.edu/~eppstein/161/960227.html
有很詳細的解說
土法煉鋼法(Naive string matching):   在字串 T 中找 pattern P
    for (i=0; T[i] != '\0'; i++)
    {
    for (j=0; T[i+j] != '\0' && P[j] != '\0' && T[i+j]==P[j]; j++) ;
    if (P[j] == '\0') found a match
    }
此法要 O( m * n )
但是 KMP algorithm:
    j = 0;
    for (i = 0; i < n; i++)
    for (;;) {      // loop until break
        if (T[i] == P[j]) { // matches?
        j++;        // yes, move on to next state
        if (j == m) {   // maybe that was the last state
            found a match;
            j = overlap[j];
        }
        break;
        } else if (j == 0) break;   // no match in state j=0, give up
        else j = overlap[j];    // try shorter partial match
    }
其中的 overlap[] 是這樣算出來的:
    overlap[0] = -1;
    for (int i = 0; pattern[i] != '\0'; i++) {
    overlap[i + 1] = overlap[i] + 1;
    while (overlap[i + 1] > 0 &&
           pattern[i] != pattern[overlap[i + 1] - 1])
        overlap[i + 1] = overlap[overlap[i + 1] - 1] + 1;
    }
    return overlap;
這個 overlap function 本來應該是一直 recursive 做出來, 
但是因為用了 dynamic programming 的觀念所以變成用 loop 就可以了
記住我說過:
    dynamic programming 的觀念就是利用 table 記住本來應該要繼續
    recursive 才能算出之答案以節省時間的方法!

 關於 KMP 更詳細請看 http://www1.ics.uci.edu/~eppstein/161/960227.html