I am new to scikit-learn, but it did what I was hoping for. Now, maddeningly, the only remaining issue is that I cannot find out how to print (or, even better, write to a small text file) all the coefficients it estimated and all the features it selected. How can I do this?
The same goes for SGDClassifier, and I assume the answer is the same for any base estimator that can be fit, with or without cross-validation. Full script below.
```python
import scipy as sp
import numpy as np
import pandas as pd
import multiprocessing as mp
from sklearn import grid_search
from sklearn import cross_validation
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDClassifier

def main():
    print("Started.")
    # n = 10**6
    # notreatadapter = iopro.text_adapter('S:/data/controls/notreat.csv', parser='csv')
    # X = notreatadapter[1:][0:n]
    # y = notreatadapter[0][0:n]
    notreatdata = pd.read_stata('S:/data/controls/notreat.dta')
    notreatdata = notreatdata.iloc[:10000,:]
    X = notreatdata.iloc[:,1:]
    y = notreatdata.iloc[:,0]
    n = y.shape[0]

    print("Data loaded.")
    X_train, X_test, y_train, y_test = cross_validation.train_test_split(X, y, test_size=0.4, random_state=0)

    print("Data split.")
    scaler = StandardScaler()
    scaler.fit(X_train)  # Don't cheat - fit only on training data
    X_train = scaler.transform(X_train)
    X_test = scaler.transform(X_test)  # apply same transformation to test data

    print("Data scaled.")
    # build a model
    model = SGDClassifier(penalty='elasticnet',n_iter = np.ceil(10**6 / n),shuffle=True)
    #model.fit(X,y)

    print("CV starts.")
    # run grid search
    param_grid = [{'alpha' : 10.0**-np.arange(1,7),'l1_ratio':[.05, .15, .5, .7, .9, .95, .99, 1]}]
    gs = grid_search.GridSearchCV(model,param_grid,n_jobs=8,verbose=1)
    gs.fit(X_train, y_train)

    print("Scores for alphas:")
    print(gs.grid_scores_)
    print("Best estimator:")
    print(gs.best_estimator_)
    print("Best score:")
    print(gs.best_score_)
    print("Best parameters:")
    print(gs.best_params_)

if __name__=='__main__':
    mp.freeze_support()
    main()
```

Recommended answer
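The SGDClassifier instance fitted with the best hyperparameters is stored in gs.best_estimator_ (available when refit=True, which is the default). Its coef_ and intercept_ attributes hold the fitted parameters of that best model, so gs.best_estimator_.coef_ and gs.best_estimator_.intercept_ are the values to print or write to a file.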
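As a rough illustration of the answer above, here is a minimal sketch that reads coef_ and intercept_ from gs.best_estimator_ and writes them to a small text file. It makes several assumptions that are not in the question: it uses the newer sklearn.model_selection import path rather than the old grid_search module, it substitutes synthetic data from make_classification for the .dta file, and the feature names, the smaller parameter grid, and the output file name are all hypothetical. Treating nonzero coefficients as the "selected" features is also an interpretation, which seems reasonable for an elastic net penalty.

```python
import os
import tempfile

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the asker's data (assumption: binary target).
X, y = make_classification(n_samples=500, n_features=10,
                           n_informative=4, random_state=0)
feature_names = [f"x{i}" for i in range(X.shape[1])]  # hypothetical names

# Smaller grid than in the question, just to keep the sketch fast.
model = SGDClassifier(penalty="elasticnet", max_iter=1000, tol=1e-3,
                      random_state=0)
param_grid = {"alpha": 10.0 ** -np.arange(1, 4),
              "l1_ratio": [0.15, 0.5, 0.9]}
gs = GridSearchCV(model, param_grid, cv=3)
gs.fit(X, y)

# The refitted best model carries the estimated parameters.
best = gs.best_estimator_
coef = best.coef_.ravel()        # coef_ has shape (1, n_features) for a binary target
intercept = best.intercept_[0]

# Interpretation: features with a nonzero coefficient count as "selected".
selected = [name for name, c in zip(feature_names, coef) if c != 0.0]

# Hypothetical output location; any writable path would do.
out_path = os.path.join(tempfile.gettempdir(), "coefficients.txt")
with open(out_path, "w") as fh:
    fh.write(f"intercept\t{intercept:.6g}\n")
    for name, c in zip(feature_names, coef):
        fh.write(f"{name}\t{c:.6g}\n")

print("Best parameters:", gs.best_params_)
print("Selected features:", selected)
print("Coefficients written to", out_path)
```

For a multiclass target, coef_ would have one row per class, so the ravel() call above would need to be replaced by a loop over rows.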