How can we efficiently check whether a list of strings contains a word from another list of strings?
Suppose I have a list of curse words:

    curseword = ['fuxx', 'die', 'damn']

and I am iterating through a list of sentences (each sentence is a list of strings) to check whether the sentence contains a curse word:

    text = [ ['i','am','a','boy'] , [....] , [....] ]

I tried something like:

    for i in curseword:
        for t in text:
            if i in t:
                pass  # a curse word exists in this sentence

but it seems wrong and inefficient.

How can I do this efficiently?
Best answer
Convert your curseword list to a set, then use set.intersection to check whether the words in each sentence overlap with curseword.

    In [10]: curseword = {'fuxx', 'die', 'damn'}
    In [11]: text = [ ['i','am','a','boy'], ['die']]
    In [21]: new_text = [int(bool(curseword.intersection(sent))) for sent in text]
    In [22]: new_text
    Out[22]: [0, 1]
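The same idea can be wrapped in a small reusable function. This is only a sketch: the helper name `flag_sentences` is hypothetical and not part of the answer above, and it swaps in `set.isdisjoint` for `intersection`, which avoids building an intermediate set and can stop at the first match. Membership tests against a set take O(1) time on average, so each sentence costs roughly one pass over its words instead of one pass per curse word.

```python
def flag_sentences(sentences, curse_words):
    """Return one bool per sentence: True if it contains any curse word.

    `flag_sentences` is a hypothetical helper name used for illustration.
    """
    curse_set = set(curse_words)  # build once; average O(1) membership tests
    # isdisjoint() is True when there is no overlap, so negate it
    return [not curse_set.isdisjoint(sentence) for sentence in sentences]


curseword = ['fuxx', 'die', 'damn']
text = [['i', 'am', 'a', 'boy'], ['die']]
flags = flag_sentences(text, curseword)
print(flags)  # [False, True]
```

Matching here is exact and case-sensitive, as in the session above, so 'Damn' would not match 'damn' unless the tokens are lowercased first.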