Having different results every run with GMM Classifier

I'm currently working on a speech recognition and machine learning related project. I have two classes, and I create one GMM classifier per class, for the labels 'happy' and 'sad'.

I want to train GMM classifiers with MFCC vectors.

I am using one GMM classifier per label, two in total (previously it was one GMM per file):

But every time I run the script I get different results. What might be the cause of that, given the same test and train samples?

In the outputs below, please note that I have 10 test samples; each line corresponds to the result for one test sample, in order.

Code:

classifiers = {'happy': [], 'sad': []}
probability = {'happy': 0, 'sad': 0}

def createGMMClassifiers():
    for name, data in training.iteritems():  # For every class: in our case it is two, happy and sad
        classifier = mixture.GMM(n_components=n_classes, n_iter=50)  # two classifiers.
        for mfcc in data:
            classifier.fit(mfcc)
        addClassifier(name, classifier)
    for testData in testing['happy']:
        classify(testData)

def addClassifier(name, classifier):
    classifiers[name] = classifier

def classify(testMFCC):
    for name, classifier in classifiers.iteritems():
        prediction = classifier.predict_proba(testMFCC)
        for f, s in prediction:
            probability[name] += f
    print 'happy ', probability['happy'], 'sad ', probability['sad']

Sample Output 1:

happy 154.300420496 sad 152.808941585 happy
happy 321.17737915 sad 318.621788517 happy
happy 465.294473363 sad 461.609246112 happy
happy 647.771003768 sad 640.451097035 happy
happy 792.420461416 sad 778.709674995 happy
happy 976.09526992 sad 961.337361541 happy
happy 1137.83592093 sad 1121.34722203 happy
happy 1297.14692405 sad 1278.51011583 happy
happy 1447.26926553 sad 1425.74595666 happy
happy 1593.00403707 sad 1569.85670672 happy

Sample Output 2:

happy 51.699579504 sad 152.808941585 sad
happy 81.8226208497 sad 318.621788517 sad
happy 134.705526637 sad 461.609246112 sad
happy 167.228996232 sad 640.451097035 sad
happy 219.579538584 sad 778.709674995 sad
happy 248.90473008 sad 961.337361541 sad
happy 301.164079068 sad 1121.34722203 sad
happy 334.853075952 sad 1278.51011583 sad
happy 378.730734469 sad 1425.74595666 sad
happy 443.995962929 sad 1569.85670672 sad

Best Answer

But every time I run the script I get different results. What might be the cause of that, given the same test and train samples?

scikit-learn uses a random initializer. If you want reproducible results, you can set the random_state argument:

random_state: RandomState or an int seed (None by default)
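
For example, a minimal sketch of seeding the classifier, assuming the same (now deprecated) sklearn.mixture.GMM API that the question uses:

from sklearn import mixture  # mixture.GMM is the old pre-0.18 API, as in the question

n_classes = 2  # placeholder for the question's n_classes variable

# With a fixed random_state the initialization (and therefore the fitted model)
# is deterministic, so repeated runs on the same data give the same result.
classifier = mixture.GMM(n_components=n_classes, n_iter=50, random_state=42)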

for name, data in training.iteritems():

This is not correct: inside this loop you call classifier.fit(mfcc) once per file, and each call re-estimates the model from scratch, so effectively you train only on the last sample. You need to concatenate the features for each label into a single array before you run fit. You can use np.concatenate for that.
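
Putting both points together, a sketch of a corrected createGMMClassifiers, reusing the question's training dict, addClassifier helper and n_classes (the per-file MFCC arrays are assumed to have shape n_frames x n_coefficients):

import numpy as np
from sklearn import mixture

def createGMMClassifiers():
    for name, data in training.iteritems():  # one class at a time: 'happy', 'sad'
        # Stack the per-file MFCC matrices into one array so a single fit()
        # call sees every frame of this label instead of only the last file.
        features = np.concatenate(data)
        classifier = mixture.GMM(n_components=n_classes, n_iter=50, random_state=0)
        classifier.fit(features)
        addClassifier(name, classifier)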
