优化想法

编程入门 行业动态 更新时间:2024-10-26 10:36:08
本文介绍了优化想法 - 最长的公共子字符串的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧! 问题描述

我有这个程序,它应该找到多个字符串的最长的公共子字符串。它的作用,但如果字符串非常长(即> 8000个字符长),它工作缓慢(1.5秒)。 有任何优化方法吗?

I have this program which is supposed to find the Longest Common Substring of a number of strings. Which it does, but if the strings are very long (i.e. >8000 characters long), it works slowly (1.5 seconds). Is there any way to optimise that?

程序是这样:

//#include "stdafx.h" #include <iostream> #include <string> #include <vector> #include <cassert> using namespace std; const unsigned short MAX_STRINGS = 10; const unsigned int MAX_SIZE=10000; vector<string> strings; unsigned int len; string GetLongestCommonSubstring( string string1, string string2 ); inline void readNumberSubstrings(); inline const string getMaxSubstring(); void readNumberSubstrings() { cin >> len; assert(len > 1 && len <=MAX_STRINGS); strings.resize(len); for(register unsigned int i=0; i<len;i++) strings[i]=string(MAX_SIZE,0); for(register unsigned int i=0; i<len; i++) cin>>strings[i]; } const string getMaxSubstring() { string maxSubstring=strings[0]; for(register unsigned int i=1; i < len; i++) maxSubstring=GetLongestCommonSubstring(maxSubstring, strings[i]); return maxSubstring; } string GetLongestCommonSubstring( string string1, string string2 ) { const int solution_size = string2.length()+ 1; int *x=new int[solution_size](); int *y= new int[solution_size](); int **previous = &x; int **current = &y; int max_length = 0; int result_index = 0; int j; int length; int M=string2.length() - 1; for(register int i = string1.length() - 1; i >= 0; i--) { for(register int j = M; j >= 0; j--) { if(string1[i] != string2[j]) (*current)[j] = 0; else { length = 1 + (*previous)[j + 1]; if (length > max_length) { max_length = length; result_index = i; } (*current)[j] = length; } } swap(previous, current); } string1[max_length+result_index]='\0'; return &(string1[result_index]); } int main() { readNumberSubstrings(); cout << getMaxSubstring() << endl; return 0; }

注意:不写代码,可以解决这个问题的后缀树(他们很大)。

Note: there is a reason why I didn't write code that would solve this problem with suffix trees (they're large).

推荐答案

通常,当涉及到优化时,不同的方法可能是你唯一的真正选择,而不是试图逐步改进当前实现。这是我的想法:

Often when it comes to optimization, a different approach might be your only true option rather than trying to incrementally improve the current implementation. Here's my idea:

  • 创建可能出现在最长公共子字符串中的有效字符 。

将每个字符串分隔为多个字符串,这些字符串包含只有有效字符

separate each string into multiple strings containing only valid characters

,才能创建每个可能的子字符串并将其添加到列表

for every such string, create every possible substring and add it to the list as well

过滤(与字符一样)所有字符串。

filter (as with the characters) all strings, that don't show up in all lists.

这种复杂性很大程度上取决于无效字符的数量。如果它是零,这种方法根本没有帮助。

The complexity of this obviously depends largely on the number of invalid characters. if it's zero, this approach doesn't help at all.

您的代码的一些评论:不要试图过于聪明。编译器会优化这么多,真的没有必要在你的代码中放入 register 。第二,你分配字符串,然后覆盖它们(在 readNumberSubstrings ),这是完全不必要的。第三,通过const引用如果可以。第四,不要使用原始指针,特别是如果你从不 delete [] 你的 new [] d对象。使用 std :: vector s,它的行为与异常(你可能会遇到,你使用字符串很多!)。

Some remarks on your code: Don't try to be overly clever. The compiler will optimize so much, there's really no need for you to put register in your code. Second, your allocating strings and then overwrite them (in readNumberSubstrings), that's totally unnecessary. Third, pass by const reference if you can. Fourth, don't use raw pointers, especially if you never delete [] your new []d objects. Use std::vectors instead, it behaves well with exceptions (which you might encounter, you're using strings a lot!).

更多推荐

优化想法

本文发布于:2023-11-30 01:07:29,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1648208.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:想法

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!