This article shows how to use grid search in scikit-learn to find the best hyperparameters for a model.
1. Grid search is used to find the best parameters for a model
First, import the dependencies
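from sklearn.ensemble import GradientBoostingClassifier
# GridSearchCV lives in sklearn.model_selection; the old sklearn.grid_search module was removed in scikit-learn 0.20
from sklearn.model_selection import GridSearchCV
from sklearn import metrics
import numpy as np
import pandas as pd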
2. Set the parameters to search
params = {
    'learning_rate': np.linspace(0.05, 0.25, 5),   # 0.05, 0.10, 0.15, 0.20, 0.25
    'max_depth': list(range(1, 8)),                # 1 to 7
    'min_samples_leaf': list(range(1, 5)),         # 1 to 4
    'n_estimators': list(range(50, 100, 10)),      # 50, 60, 70, 80, 90
}
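Before starting the search, it helps to know how many models the grid implies. The sketch below counts the candidates with `ParameterGrid` from `sklearn.model_selection`; the variable names `n_candidates` and `n_fits` are only illustrative, and `cv = 10` is assumed to match the 10-fold setting used in the next step.

```python
import numpy as np
from sklearn.model_selection import ParameterGrid

# Same grid as in step 2
params = {
    "learning_rate": np.linspace(0.05, 0.25, 5),
    "max_depth": list(range(1, 8)),
    "min_samples_leaf": list(range(1, 5)),
    "n_estimators": list(range(50, 100, 10)),
}

# Every combination of the listed values is one candidate: 5 * 7 * 4 * 5
n_candidates = len(ParameterGrid(params))

# With k-fold cross-validation each candidate is trained k times
cv = 10  # assumed to match the cv value passed to GridSearchCV
n_fits = n_candidates * cv

print(n_candidates, n_fits)
```

With this grid the search covers 700 parameter combinations, so 10-fold cross-validation trains 7000 models, plus one final refit on the full data. If that is too slow, shrinking the grid or lowering `cv` is the simplest remedy.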
3. Set up the model and the evaluation metric, then train the model with each parameter combination
clf = GradientBoostingClassifier()
grid = GridSearchCV(clf, params, cv=10, scoring="f1")
grid.fit(X, y)
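The snippet above assumes `X` and `y` already exist. As a minimal end-to-end sketch, the example below generates a synthetic binary classification dataset with `make_classification` and runs the search on a deliberately small grid so that it finishes quickly. The dataset, the `small_params` grid, and `cv=3` are assumptions chosen for illustration, not settings from the original.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic binary classification data (illustrative stand-in for X, y)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# A reduced grid so the example runs in a few seconds
small_params = {
    "learning_rate": [0.05, 0.1],
    "max_depth": [1, 2],
    "n_estimators": [20],
}

clf = GradientBoostingClassifier(random_state=0)
grid = GridSearchCV(clf, small_params, cv=3, scoring="f1")
grid.fit(X, y)  # trains 4 candidates x 3 folds, then refits the best one on all data

print(grid.best_params_)
print(grid.best_score_)
```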
Commonly used predefined values for the scoring parameter are listed below:
- Classification
scoring | function | comment |
---|---|---|
accuracy | metrics.accuracy_score | |
average_precision | metrics.average_precision_score | |
f1 | metrics.f1_score | for binary targets |
f1_micro | metrics.f1_score | micro-averaged |
f1_macro | metrics.f1_score | macro-averaged |
f1_weighted | metrics.f1_score | weighted average |
f1_samples | metrics.f1_score | by multilabel sample |
neg_log_loss | metrics.log_loss | requires predict_proba support |
precision etc. | metrics.precision_score | suffixes apply as with "f1" |
recall etc. | metrics.recall_score | suffixes apply as with "f1" |
roc_auc | metrics.roc_auc_score | |
- Clustering
scoring | function | comment |
---|---|---|
adjusted_rand_score | metrics.adjusted_rand_score | |
- Regression
scoring | function | comment |
---|---|---|
neg_mean_absolute_error | metrics.mean_absolute_error | |
neg_mean_squared_error | metrics.mean_squared_error | |
neg_median_absolute_error | metrics.median_absolute_error | |
r2 | metrics.r2_score | |
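The `neg_` prefix on the error metrics exists because grid search always treats a higher score as better, so error values are negated. The sketch below illustrates this by looking up a scorer by name with `metrics.get_scorer` and comparing it with the plain metric function. The regression data and the `LinearRegression` model are assumptions used only for the demonstration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import get_scorer, mean_squared_error

# Synthetic regression data and a simple fitted model (illustrative only)
X_reg, y_reg = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)
model = LinearRegression().fit(X_reg, y_reg)

# Look up the scorer by the same string that would be passed as scoring=...
scorer = get_scorer("neg_mean_squared_error")
score = scorer(model, X_reg, y_reg)  # scorers are called as scorer(estimator, X, y)

# The plain metric function takes (y_true, y_pred) and returns the positive error
mse = mean_squared_error(y_reg, model.predict(X_reg))

print(score, mse)  # score is the negated mse
```

This is why `best_score_` is negative when one of the `neg_*` metrics is used: a value closer to zero means a smaller error.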
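4. View the best score and the best parameters
grid.best_score_ # mean cross-validated score of the best parameter combination (F1 here)
grid.best_params_ # parameter combination that achieved the best score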
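Besides the single best result, the fitted search object keeps the scores of every candidate in `cv_results_`, which loads directly into a pandas DataFrame. The sketch below is self-contained and therefore refits a small search first; the synthetic data, the `demo_params` grid, and the `_demo` variable names are assumptions for illustration.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Small illustrative search (stand-in for the grid fitted in step 3)
X_demo, y_demo = make_classification(n_samples=200, n_features=10, random_state=0)
demo_params = {"learning_rate": [0.05, 0.1], "max_depth": [1, 2]}
grid_demo = GridSearchCV(
    GradientBoostingClassifier(n_estimators=20, random_state=0),
    demo_params,
    cv=3,
    scoring="f1",
)
grid_demo.fit(X_demo, y_demo)

# One row per parameter combination, sorted so the best candidate comes first
results = pd.DataFrame(grid_demo.cv_results_)
columns = ["params", "mean_test_score", "std_test_score", "rank_test_score"]
ranked = results[columns].sort_values("rank_test_score")

print(ranked)
```

Looking at `std_test_score` next to `mean_test_score` shows whether the top candidates really differ or are within fold-to-fold noise of each other.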
5. Get the best model
grid.best_estimator_
6. Use the best model to make predictions
best_model = grid.best_estimator_
predict_y = best_model.predict(Test_X)
metrics.f1_score(Test_y, predict_y) # Test_y (assumed name) holds the true labels for Test_X
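The prediction step is only meaningful when the test data was kept out of the search. The sketch below shows one way to arrange that with `train_test_split`: the search sees only the training portion, and the F1 score is computed against the held-out labels. The synthetic data, the small grid, and all variable names here are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data, split into a training part and a held-out test part
X_all, y_all = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X_all, y_all, test_size=0.25, stratify=y_all, random_state=0
)

# Search on the training part only
grid_demo = GridSearchCV(
    GradientBoostingClassifier(n_estimators=20, random_state=0),
    {"max_depth": [1, 2, 3]},
    cv=3,
    scoring="f1",
)
grid_demo.fit(X_train, y_train)

# Predict with the refitted best model and score against the matching test labels
best_model = grid_demo.best_estimator_
y_pred = best_model.predict(X_test)
test_f1 = f1_score(y_test, y_pred)

print(test_f1)
```

With the default `refit=True`, calling `grid_demo.predict(X_test)` gives the same predictions, because the search object delegates to `best_estimator_`.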