A better way to replace many strings

Updated: 2024-10-28 08:20:44
This article covers a better way to replace many strings, in the context of obfuscation in C#.

Question

I'm trying to obfuscate a large amount of data. I've created a list of words (tokens) which I want to replace and I am replacing the words one by one using the StringBuilder class, like so:

    var sb = new StringBuilder(one_MB_string);
    foreach (var token in tokens)
    {
        sb.Replace(token, "new string");
    }

It's pretty slow! Are there any simple things that I can do to speed it up?

tokens is a list of about one thousand strings, each 5 to 15 characters in length.

Recommended answer

Instead of doing replacements in a huge string (which means that you move around a lot of data), work through the string and replace a token at a time.

Make a list containing the next index for each token, locate the token that comes first, then copy the text up to that token to the result, followed by the token's replacement. Then look up where the next occurrence of that token is in the string, to keep the list up to date. Repeat until no more tokens are found, then copy the remaining text to the result.

I made a simple test, and this method did 125,000 replacements on a 1,000,000-character string in 208 milliseconds.
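The test itself is not shown, so the following is only a sketch of how such a measurement might be set up, not the benchmark that produced the figure above. The names ReplaceTiming, BuildInput, Measure and NaiveReplace are hypothetical, the input layout (every 8th word is a token) is an assumption, and the method timed here is the one-token-at-a-time StringBuilder loop from the question, used as a baseline.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class ReplaceTiming
{
    // Hypothetical input generator: every 8th word is a token, the rest is filler.
    public static string BuildInput(int wordCount, string[] tokens)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < wordCount; i++)
        {
            sb.Append(i % 8 == 0 ? tokens[(i / 8) % tokens.Length] : "filler");
            sb.Append(' ');
        }
        return sb.ToString();
    }

    // Baseline from the question: one StringBuilder.Replace pass per token.
    public static string NaiveReplace(string input, string[] tokens, string replacement)
    {
        var sb = new StringBuilder(input);
        foreach (var token in tokens)
        {
            sb.Replace(token, replacement);
        }
        return sb.ToString();
    }

    // Times a single call of any string -> string replacement method.
    public static TimeSpan Measure(Func<string, string> replace, string input, out string output)
    {
        var stopwatch = Stopwatch.StartNew();
        output = replace(input);
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        string[] tokens = { "alpha", "bravo", "charlie" };
        string input = BuildInput(100000, tokens);

        TimeSpan elapsed = Measure(s => NaiveReplace(s, tokens, "new string"), input, out string output);
        Console.WriteLine($"{input.Length} chars -> {output.Length} chars in {elapsed.TotalMilliseconds:F1} ms");
    }
}
```

Because Measure accepts any Func<string, string>, the same harness can time another replacement method on the same input for a side-by-side comparison. A single run is noisy, so repeating the measurement a few times gives a fairer picture.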

Token and TokenList classes:

    using System.Collections.Generic;
    using System.Text;

    public class Token
    {
        public string Text { get; private set; }
        public string Replacement { get; private set; }

        // Index of the next occurrence of Text in the input, or -1 if there is none.
        public int Index { get; set; }

        public Token(string text, string replacement)
        {
            Text = text;
            Replacement = replacement;
        }
    }

    public class TokenList : List<Token>
    {
        public void Add(string text, string replacement)
        {
            Add(new Token(text, replacement));
        }

        // Returns the token whose next occurrence comes first, or null if none is left.
        private Token GetFirstToken()
        {
            Token result = null;
            int index = int.MaxValue;
            foreach (Token token in this)
            {
                if (token.Index != -1 && token.Index < index)
                {
                    index = token.Index;
                    result = token;
                }
            }
            return result;
        }

        public string Replace(string text)
        {
            StringBuilder result = new StringBuilder();

            // Find the first occurrence of every token.
            foreach (Token token in this)
            {
                token.Index = text.IndexOf(token.Text);
            }

            int index = 0;
            Token next;
            while ((next = GetFirstToken()) != null)
            {
                // Copy the unchanged text that precedes the token.
                if (index < next.Index)
                {
                    result.Append(text, index, next.Index - index);
                    index = next.Index;
                }

                // Emit the replacement and skip past the token.
                result.Append(next.Replacement);
                index += next.Text.Length;

                // Look up the next occurrence of this token.
                next.Index = text.IndexOf(next.Text, index);
            }

            // Copy whatever remains after the last token.
            if (index < text.Length)
            {
                result.Append(text, index, text.Length - index);
            }
            return result.ToString();
        }
    }

Example of usage:

    string text = "This is a text with some words that will be replaced by tokens.";

    var tokens = new TokenList();
    tokens.Add("text", "TXT");
    tokens.Add("words", "WRD");
    tokens.Add("replaced", "RPL");

    string result = tokens.Replace(text);
    Console.WriteLine(result);

Output:

This is a TXT with some WRD that will be RPL by tokens.

Note: This code does not handle overlapping tokens. If, for example, you have the tokens "pineapple" and "apple", the code doesn't work properly.

To make the code work with overlapping tokens, replace this line:

next.Index = text.IndexOf(next.Text, index);

with this code:

    foreach (Token token in this)
    {
        if (token.Index != -1 && token.Index < index)
        {
            token.Index = text.IndexOf(token.Text, index);
        }
    }
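To see the overlap fix in isolation, here is a compact, self-contained sketch of the same single-pass idea with the refresh loop built in. It is a rewrite for illustration, not the answer's TokenList class: the OverlapSafeReplacer class and ReplaceAll method are hypothetical names, tokens are passed as key/value pairs, and the use of ordinal comparison and the skipping of empty tokens are assumptions added here.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class OverlapSafeReplacer
{
    // Hypothetical helper: replaces all tokens (Key -> Value) in a single pass over text.
    public static string ReplaceAll(string text, IReadOnlyList<KeyValuePair<string, string>> tokens)
    {
        // next[i] is the index of the next occurrence of tokens[i], or -1 if none remain.
        var next = new int[tokens.Count];
        for (int i = 0; i < tokens.Count; i++)
        {
            // Assumption: empty tokens are ignored, since they would never advance the scan.
            next[i] = tokens[i].Key.Length == 0
                ? -1
                : text.IndexOf(tokens[i].Key, StringComparison.Ordinal);
        }

        var result = new StringBuilder(text.Length);
        int index = 0;
        while (true)
        {
            // Pick the token whose next occurrence comes first.
            int first = -1;
            for (int i = 0; i < next.Length; i++)
            {
                if (next[i] != -1 && (first == -1 || next[i] < next[first]))
                {
                    first = i;
                }
            }
            if (first == -1)
            {
                break;
            }

            // Copy the text before the token, then the replacement.
            result.Append(text, index, next[first] - index);
            result.Append(tokens[first].Value);
            index = next[first] + tokens[first].Key.Length;

            // Refresh every token whose recorded match starts inside the consumed text.
            for (int i = 0; i < next.Length; i++)
            {
                if (next[i] != -1 && next[i] < index)
                {
                    next[i] = text.IndexOf(tokens[i].Key, index, StringComparison.Ordinal);
                }
            }
        }

        // Copy the remaining text after the last replacement.
        result.Append(text, index, text.Length - index);
        return result.ToString();
    }

    public static void Main()
    {
        var tokens = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("pineapple", "PINE"),
            new KeyValuePair<string, string>("apple", "APL"),
        };

        // Prints "PINE and APL"
        Console.WriteLine(ReplaceAll("pineapple and apple", tokens));
    }
}
```

When two tokens start at the same position, this sketch picks the one listed first, so listing longer tokens before their prefixes gives the longer match priority.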

Published: 2023-11-02 18:10:15
Article link: https://www.elefans.com/category/jswz/34/1553058.html