I'm trying to find the strings that contain a given substring in a large list of about 50,000 strings. This approach works fine:
    var results = myList.FindAll(delegate (string s) { return s.Contains(myString); });

but it also matches substrings that are only part of a word. For example, if I search for "you do", it also finds "you dont", because that string contains "you do".
So the answer to my previous question should supposedly do what I need, but I'm not sure how to get a list of strings from the regex matches in this code:
    foreach (string phrase in matchWordsList)
    {
        foreach (string str in bigList)
        {
            string[] stringsToTest = new[] { phrase };
            var escapedStrings = stringsToTest.Select(s => Regex.Escape(s));
            var regex = new Regex("\\b(" + string.Join("|", escapedStrings) + ")\\b");
            var matches = regex.Matches(str);

            foreach (string result in matches) // Incorrect: System.InvalidCastException
            {
                resultsList.Add(result);
            }
        }
    }

Adding the matches directly to the list throws an exception:
    An unhandled exception of type 'System.InvalidCastException' occurred in test.exe
    Additional information: Unable to cast object of type 'System.Text.RegularExpressions.Match' to type 'System.String'.
So I'm trying to figure out how to convert var matches = regex.Matches(str); to a list of strings.
Recommended answer

You can do this with LINQ. However, you need to call Cast<Match>() first and then Select:
    var resultsList = regex.Matches(str)
                           .Cast<Match>()
                           .Select(m => m.Value)
                           .ToList();

or

    someList.AddRange(
        regex.Matches(str)
             .Cast<Match>()
             .Select(m => m.Value));
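Putting this back into the loop from the question, a minimal sketch could look like the following. The class and method names (WholeWordSearch, FindMatchedText, FindContainingStrings) are hypothetical and only there for illustration. The sketch also assumes that the regex can be built once per phrase instead of once per string, and that you may actually want the strings that contain the phrase rather than the matched text itself, which is what the second method returns.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original question or answer.
public static class WholeWordSearch
{
    // Builds a regex that matches the phrase only at word boundaries.
    private static Regex BuildWholeWordRegex(string phrase)
    {
        return new Regex("\\b(" + Regex.Escape(phrase) + ")\\b");
    }

    // Returns the text of every whole-word match, using Cast<Match>() + Select
    // instead of casting each Match to string.
    public static List<string> FindMatchedText(IList<string> matchWordsList, IList<string> bigList)
    {
        var resultsList = new List<string>();
        foreach (string phrase in matchWordsList)
        {
            // Assumption: the pattern depends only on the phrase, so build it once here.
            Regex regex = BuildWholeWordRegex(phrase);
            foreach (string str in bigList)
            {
                resultsList.AddRange(
                    regex.Matches(str)
                         .Cast<Match>()
                         .Select(m => m.Value));
            }
        }
        return resultsList;
    }

    // Returns the strings from bigList that contain a phrase as whole words.
    // A string is added once per phrase that matches it.
    public static List<string> FindContainingStrings(IList<string> matchWordsList, IList<string> bigList)
    {
        var resultsList = new List<string>();
        foreach (string phrase in matchWordsList)
        {
            Regex regex = BuildWholeWordRegex(phrase);
            resultsList.AddRange(bigList.Where(str => regex.IsMatch(str)));
        }
        return resultsList;
    }
}
```

For example, with the phrase "you do" and the strings "what you do matters", "you dont know" and "you do", FindMatchedText returns "you do" twice, while FindContainingStrings returns "what you do matters" and "you do". In both cases "you dont know" is skipped, because there is no word boundary between "do" and "nt".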