The following code is supposed to create a new (modified) version of a frequency distribution (nltk.FreqDist). Both variables should then be the same length.
It works fine when a single instance of WebText is created. But when multiple WebText instances are created, then the new variable seems to be shared by all the objects.
For example:

    import nltk
    from operator import itemgetter

    class WebText:
        freq_dist_weighted = {}

        def __init__(self, text):
            tokens = nltk.wordpunct_tokenize(text)  # tokenize
            word_count = len(tokens)
            freq_dist = nltk.FreqDist(tokens)

            for word, frequency in freq_dist.iteritems():
                self.freq_dist_weighted[word] = frequency/word_count*frequency
            print len(freq_dist), len(self.freq_dist_weighted)

    text1 = WebText("this is a test")
    text2 = WebText("this is another test")
    text3 = WebText("a final sentence")

results in:

    4 4
    4 5
    3 7
Which is incorrect. Since I am just transposing and modifying values, there should be the same numbers in each column. If I reset the freq_dist_weighted just before the loop, it works fine:
    import nltk
    from operator import itemgetter

    class WebText:
        freq_dist_weighted = {}

        def __init__(self, text):
            tokens = nltk.wordpunct_tokenize(text)  # tokenize
            word_count = len(tokens)
            freq_dist = nltk.FreqDist(tokens)

            self.freq_dist_weighted = {}
            for word, frequency in freq_dist.iteritems():
                self.freq_dist_weighted[word] = frequency/word_count*frequency
            print len(freq_dist), len(self.freq_dist_weighted)

    text1 = WebText("this is a test")
    text2 = WebText("this is another test")
    text3 = WebText("a final sentence")

results in (correctly):

    4 4
    4 4
    3 3
This doesn't make sense to me.
I don't see why I would have to reset it, since it's isolated within the objects. Am I doing something wrong?
Recommended answer
Your comment is blatantly wrong. Objects in a class scope are only initialized when the class is created; if you want a different object per instance then you need to move it into the initializer.
    class WebText:
        def __init__(self, text):
            self.freq_dist_weighted = {}  #### RESET the dictionary HERE ####
            ...
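To make the difference concrete, here is a minimal sketch in Python 3 that runs without nltk. It assumes `collections.Counter` and `str.split` as rough stand-ins for `nltk.FreqDist` and `nltk.wordpunct_tokenize`, and the class names `SharedWebText` and `IsolatedWebText` are hypothetical, introduced only for this illustration.

```python
from collections import Counter


class SharedWebText:
    # Class attribute: this single dict is created once, when the class body
    # runs, and every instance finds it through normal attribute lookup.
    freq_dist_weighted = {}

    def __init__(self, text):
        tokens = text.split()        # stand-in for nltk.wordpunct_tokenize
        freq_dist = Counter(tokens)  # stand-in for nltk.FreqDist
        for word, frequency in freq_dist.items():
            # Item assignment mutates the one class-level dict; it does not
            # create an attribute on the instance.
            self.freq_dist_weighted[word] = frequency / len(tokens) * frequency


class IsolatedWebText:
    def __init__(self, text):
        tokens = text.split()
        freq_dist = Counter(tokens)
        # Attribute assignment in the initializer gives each object its own dict.
        self.freq_dist_weighted = {}
        for word, frequency in freq_dist.items():
            self.freq_dist_weighted[word] = frequency / len(tokens) * frequency


texts = ["this is a test", "this is another test", "a final sentence"]
shared = [SharedWebText(t) for t in texts]
isolated = [IsolatedWebText(t) for t in texts]

shared_sizes = [len(obj.freq_dist_weighted) for obj in shared]
isolated_sizes = [len(obj.freq_dist_weighted) for obj in isolated]

print(shared_sizes)    # [7, 7, 7]
print(isolated_sizes)  # [4, 4, 3]
# Every SharedWebText instance refers to the very same dict object.
print(shared[0].freq_dist_weighted is shared[2].freq_dist_weighted)  # True
```

The shared version reports 7 for every object because all three constructors wrote into the same dictionary, which is also why the question's second column grew as 4, 5, 7. In the question's second listing, `self.freq_dist_weighted = {}` binds a new instance attribute that shadows the class attribute, so the class-level `freq_dist_weighted = {}` line is no longer used and can be dropped.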