Finding duplicate text

Updated: 2024-10-12 03:28:09

Problem description

My main problem is finding a suitable way to automatically turn a run like this, for example:

d+c+d+f+d+c+d+f+d+c+d+f+d+c+d+f+

into this:

[d+c+d+f+]4

That is, I want to find duplicates that sit next to each other and make a shorter "loop" out of them. So far I have found no suitable solution, and I look forward to a response.

P.S. To avoid confusion, the sample above is not the only thing that needs "looping"; it differs from file to file. This is intended for a C++ or C# program; either is fine, though I'm open to other suggestions as well. The main idea is that the program does all the work itself, with no user input other than the file.

Here is the full file, for reference (my apologies for the stretched page):

#0 @16 v225 y10 w250 t76

l16 $ED $EF $A9 p20,20 >ecegb>d<bgbgecgec<g >d+<b>d+f+a+>c+<a+f+a+f+d+<b>f+d+<bf+ >c<a>cegbgegec<a>ec<ae > d+c+d+f+d+c+d+f+d+c+d+f+d+c+d+f+ r1^1

/ l8 r1r1r1r1 f+<a+>f+g+cg+r4 a+c+a+g+cg+r4f+<a+>f+g+cg+r4 a+c+a+g+cg+r4f+<a+>f+g+cg+r4 a+c+a+g+cg+r4 f+<a+>f+g+cg+r4 a+c+a+g+r4g+16f16c+ a+2^g+f+g+4 f+ff+4fd+f4 d+c+d+4c+c<a+2^4 >c4d+ <g+2^4r4^ a+>c+d+4g+4a+4 r1^2^4^a+2^g+f+g+4 f+ff+4fd+f4 d+c+d+4c+c<a+2^4 >c4d+ <g+2^4r4^ a+>c+d+4g+4a+4 r1^2^4^ r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1

#4 @22 v250 y10

l8 o3 rg+rg+rg+rg+rg+rg+rg+rg+rg+rg+rg+rg+rg+rg+rg+rg+rg+rg+rg+rg+rg+rg+rg+rg+ / r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1

#2 @4 v155 y10

l8 $ED $F8 $8F o4 r1r1r1 d+4f4f+4g+4 a+4r1^4^2 / d+4^fr2 f+4^fr2d+4^fr2 f+4^fr2d+4^fr2 f+4^fr2d+4^fr2 f+4^fr2 > d+4^fr2 f+4^fr2d+4^fr2 f+4^fr2 < f+4^g+r2 f+4^fr2f+4^g+r2 f+4^fr2f+4^g+r2 f+4^fr2f+4^g+r2 f+4^fr2f+4^g+r2 f+4^fr2f+4^g+r2 f+4^fr2f+4^g+r2 f+4^fr2f+4^g+r2 f+4^fr2 > a+4^g+r2 f+1a+4^g+r2 f+1 f+4^fr2 d+1 f+4^fr2 d+2^d+4^ r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1

#3 @10 v210 y10

r1^1 o3 c8r8d8r8 c8r8c8r8c8r8c8r8c8r8c8r8c8r8c8r8c8r8c8r8c8r8 c8 @10d16d16@21 c8 @10d16d16@21 c8 @10d16d16@21 / c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8 c4@10d8@21c8<b8> @10d16d16d16d16d16r16 c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8c4@10d8@21c8<b8>c8@10d8@21c8 c4@10d8@21c8 @10b16b16>c16c16<b16b16a16a16

#7 @16 v230 y10

l16 $ED $EF $A9 cceeggbbggeeccee <bb>d+d+f+f+a+a+f+f+d+d+<bb>d+d+ <aa>cceeggeecc<aa>cc <g+g+bb>d+d+ffd+d+<bbg+g+bb / r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1

#5 @4 v155 y10

l8 $ED $F8 $8F o4 r1r1r1r1 d+4r1^2^4 / <a+4^>cr2 c+4^cr2<a+4^>cr2 c+4^cr2<a+4^>cr2 c+4^cr2<a+4^>cr2 c+4^cr2 a+4^>cr2 c+4^cr2 <a+4^>cr2 c+4^c r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1 r2 f+4^fr2 d+1f+4^fr2 d+1 c+4^cr2 <a+1 >c+4^cr2 <a+2^a+4^ r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1r1

Solution

Not sure if this is what you are looking for.

I took the string "testtesttesttest4notaduped+c+d+f+d+c+d+f+d+c+d+f+d+c+d+f+testtesttest" and converted it to "[test]4 4notadupe[d+c+d+f+]4 [test]3 "
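One way to check a conversion like this is to expand the loops again and compare the result with the input. The following is a minimal C++ sketch of such a check, not part of the original answer. The function name `expandLoops` is hypothetical, and it assumes that loops are never nested and that the single space after each count was added by the encoder and can be dropped.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: expands every "[unit]N " back into N copies of unit.
// Assumes no nested loops and one encoder-added space after the count.
std::string expandLoops(const std::string& text) {
    std::string out;
    std::size_t i = 0;
    while (i < text.size()) {
        std::size_t close = std::string::npos;
        if (text[i] == '[') close = text.find(']', i + 1);

        // Read the decimal count that follows the closing bracket, if any.
        std::size_t pos = (close == std::string::npos) ? i : close + 1;
        std::size_t firstDigit = pos;
        int count = 0;
        while (close != std::string::npos && pos < text.size() &&
               std::isdigit(static_cast<unsigned char>(text[pos]))) {
            count = count * 10 + (text[pos] - '0');
            ++pos;
        }

        if (close == std::string::npos || pos == firstDigit) {
            out += text[i++];  // not a well-formed loop: copy the character
            continue;
        }

        const std::string unit = text.substr(i + 1, close - i - 1);
        for (int n = 0; n < count; ++n) out += unit;

        i = pos;
        if (i < text.size() && text[i] == ' ') ++i;  // drop the separator space
    }
    return out;
}
```

Feeding it the converted string from above should give back the original input, which makes it usable as a round-trip test on a whole file.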

I'm sure someone will come up with a better, more efficient solution, as this one is a bit slow when processing your full file. I look forward to other answers.
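Part of the slowness probably comes from building a new substring for every candidate length at every position. As a hedged illustration (an assumption about where the time goes, not a measurement), the repeat count can be computed by comparing ranges of the same string in place. The helper name `countRepeats` is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns how many times the k-character unit starting
// at s[i] occurs back to back. A result of 1 means there is no adjacent repeat.
// Ranges are compared in place, so no temporary strings are allocated.
std::size_t countRepeats(const std::string& s, std::size_t i, std::size_t k) {
    if (k == 0 || i + k > s.size()) return 0;  // no valid unit at this position
    std::size_t count = 1;
    std::size_t next = i + k;
    while (next + k <= s.size() && s.compare(i, k, s, next, k) == 0) {
        ++count;
        next += k;
    }
    return count;
}
```

For the sample run, `countRepeats("d+c+d+f+d+c+d+f+d+c+d+f+d+c+d+f+", 0, 8)` yields 4, while a unit length of 4 at the same position yields 1 because "d+c+" is followed by "d+f+".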

string stringValue = "testtesttesttest4notaduped+c+d+f+d+c+d+f+d+c+d+f+d+c+d+f+testtesttest";

for (int i = 0; i < stringValue.Length; i++)
{
    for (int k = 1; (k * 2) + i <= stringValue.Length; k++)
    {
        int count = 1;
        string compare1 = stringValue.Substring(i, k);
        string compare2 = stringValue.Substring(i + k, k);

        // Count whether, and how many times, the unit repeats
        while (compare1 == compare2)
        {
            count++;
            k += compare1.Length;
            if (i + k + compare1.Length > stringValue.Length)
                break;
            compare2 = stringValue.Substring(i + k, compare1.Length);
        }

        if (count > 1)
        {
            // A space is added to the end so that a loop followed by a digit
            // is not misread as a larger count, e.g. [test]4 then 4 as [test]44.
            string addString = "[" + compare1 + "]" + count + " ";

            // Only replace if we are saving space
            if (addString.Length < compare1.Length * count)
            {
                stringValue = stringValue.Remove(i, count * compare1.Length);
                stringValue = stringValue.Insert(i, addString);
                i = i + addString.Length - 1;
            }
            break;
        }
    }
}
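Since the question allows C++ as well, here is a hedged sketch of the same greedy pass in C++. It is an illustrative port rather than the answerer's code: the name `collapseRepeats` is hypothetical, it writes into a separate output buffer instead of calling Remove and Insert on the input, and it compares ranges in place. Like the C# version, it stops at the first unit length that repeats, even when that replacement would not save space.

```cpp
#include <cstddef>
#include <string>

// Hypothetical C++ port of the greedy pass: at each position, try unit
// lengths from shortest to longest and stop at the first one that repeats.
std::string collapseRepeats(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        bool replaced = false;
        for (std::size_t k = 1; i + 2 * k <= in.size(); ++k) {
            // Count back-to-back copies of the k-character unit at position i.
            std::size_t count = 1;
            while (i + (count + 1) * k <= in.size() &&
                   in.compare(i, k, in, i + count * k, k) == 0) {
                ++count;
            }
            if (count > 1) {
                // Trailing space keeps the count apart from a following digit.
                const std::string loop =
                    "[" + in.substr(i, k) + "]" + std::to_string(count) + " ";
                if (loop.size() < k * count) {  // only replace if it saves space
                    out += loop;
                    i += k * count;
                    replaced = true;
                }
                break;  // same as the C# code: do not try longer units here
            }
        }
        if (!replaced) out += in[i++];  // no usable loop: copy one character
    }
    return out;
}
```

Because the matching is done on characters rather than on whole notes, a loop can start in the middle of a token such as `f+`; whether that matters depends on how the target format parses its input.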

Published: 2023-11-25 11:09:23
Link: https://www.elefans.com/category/jswz/34/1629505.html