我使用以下字母生成了一个字符串. {A,C,G,T}.而且我的字符串包含超过10000个字符.我正在其中搜索以下模式.
I have generated an string using the following alphabet. {A,C,G,T}. And my string contains more than 10000 characters. I'm searching the following patterns in it.
- ATGGA
- TGGAC
- CCGT
我已要求使用运行时间为O(m+n)的字符串匹配算法.
I have asked to use a string matching algorithm which has O(m+n) running time.
m = pattern length n = text length两个KMP and Rabin-Karp algorithms都有这个运行时间.在这种情况下(在Rabin-Carp和KMP之间)最合适的算法是什么?
Both KMP and Rabin-Karp algorithms have this running time. What is the most suitable algorithm (between Rabin-Carp and KMP) in this situation?
推荐答案当您要搜索多种模式时,通常正确的选择是使用 Aho-Corasick ,这是 KMP .现在,在您的情况下,您仅搜索3种模式,因此KMP可能不会慢很多(最多3次),但这是常规方法.
When you want to search for multiple patterns, typically the correct choice is to use Aho-Corasick, which is somewhat a generalization of KMP. Now in your case you are only searching for 3 patterns so it may be the case that KMP is not that much slower(at most three times), but this is the general approach.
Rabin-Karp 更容易实现,冲突永远不会发生,但是如果您遇到的问题是典型的字符串搜索,那么无论您输入什么内容,KMP都会更加稳定.但是,Rabin-Karp还有许多其他应用程序,而KMP则不行.
Rabin-Karp is easier to implement if we assume that a collision will never happen, but if the problem you have is a typical string searching KMP will be more stable no matter what input you have. However, Rabin-Karp has many other applications, where KMP is not an option.
更多推荐
何时使用Rabin
发布评论