Rabin Karp算法

编程入门 行业动态 更新时间:2024-10-21 09:52:11
本文介绍了Rabin Karp算法-给定输入的最坏情况O(m * n)如何?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧! 问题描述

在RK算法的顶级编码器代码中:

In the Top Coder's code of RK algorithm:

// correctly calculates a mod b even if a < 0 function int_mod(int a, int b) { return (a % b + b) % b; } function Rabin_Karp(text[], pattern[]) { // let n be the size of the text, m the size of the // pattern, B - the base of the numeral system, // and M - a big enough prime number if(n < m) return; // no match is possible // calculate the hash value of the pattern hp = 0; for(i = 0; i < m; i++) hp = int_mod(hp * B + pattern[i], M); // calculate the hash value of the first segment // of the text of length m ht = 0; for(i = 0; i < m; i++) ht = int_mod(ht * B + text[i], M); if(ht == hp) check character by character if the first segment of the text matches the pattern; // start the "rolling hash" - for every next character in // the text calculate the hash value of the new segment // of length m; E = (Bm-1) modulo M for(i = m; i < n; i++) { ht = int_mod(ht - int_mod(text[i - m] * E, M), M); ht = int_mod(ht * B, M); ht = int_mod(ht + text[i], M); if(ht == hp) check character by character if the current segment of the text matches the pattern; } }

据信

不幸的是,在某些情况下,我们仍然必须为文本中的每个起始位置运行天真"方法的整个内部循环–例如,当在字符串"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"中搜索模式"aaa"时—因此,在最坏的情况下,我们仍然需要(n * m)次迭代.

Unfortunately, there are still cases when we will have to run the entire inner loop of the "naive" method for every starting position in the text – for example, when searching for the pattern "aaa" in the string "aaaaaaaaaaaaaaaaaaaaaaaaa" — so in the worst case we will still need (n * m) iterations.

但是算法不会在第一次迭代时停止吗-就像它会看到前三个字母是'a'并与针相匹配吗?

But won't the algorithm stop at the first iteration itself - as when it will see that first three alphabets are 'a' which matches the needle ?

推荐答案

Rabin-Karp算法始终计算大小为M的text的所有子字符串的哈希值,并将其与pattern.现在,可以有多个具有相同哈希值的子字符串.

Rabin-Karp algorithm keeps computing hash values of all the substring of text of size M and matches it with that of the hash value of the pattern. Now, there can be multiple substrings having a same hash value.

因此,当pattern的哈希值和text的某些子字符串匹配时,我们需要再次对其进行迭代,以确保它们实际上是相同的.

So when the hash values of the pattern and some substring of the text match, we need to iterate over them again just to make sure if they are actually same.

对于pattern = "AAA"和text = "AAAAAAAAAAAAA",存在与pattern的哈希值匹配的O(n)子字符串.对于每次比赛,我们都需要迭代以确认在O(m)时间;因此是最坏情况下的复杂度O(n*m).

In case of pattern = "AAA" and text = "AAAAAAAAAAAAA", there are O(n) substrings matching the hash value of the pattern. And for every match, we need to iterate over to confirm in O(m) time; hence the worst-case complexity O(n*m).

更多推荐

Rabin Karp算法

本文发布于:2023-11-29 21:46:07,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1647713.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:算法   Rabin   Karp

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!