eXtreme Gradient Boosting
XGBoost
Author: Tianqi Chen
Paper: XGBoost: A Scalable Tree Boosting System
Examples
XGBoost vs GBDT
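| | GBDT | XGBoost |
| --- | --- | --- |
| Base learner | CART | Also supports linear classifiers |
| Optimization | Gradient descent | Newton's method |
| Regularization term | None | Yes |
| Learning rate | | Shrinkage |
| Column subsampling | | Supported |
| Missing values | | Automatically learns the split direction for missing values |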
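Several rows of the table (Newton's method, the regularization term, shrinkage, and the learned split direction for missing values) come together in how a single split is scored. The sketch below is a minimal pure-Python illustration of the formulas in the XGBoost paper, assuming logistic loss. The function names (`leaf_weight`, `split_gain`, `best_default_direction`) and the toy data are hypothetical; this is not XGBoost's API or its actual implementation.

```python
import math


def grad_hess(y, score):
    """First and second derivatives of logistic loss w.r.t. the raw score."""
    p = 1.0 / (1.0 + math.exp(-score))
    return p - y, p * (1.0 - p)


def leaf_weight(G, H, lam):
    """Newton step for a leaf: uses the summed gradient G and hessian H.
    The L2 term lam shrinks the weight toward zero."""
    return -G / (H + lam)


def leaf_score(G, H, lam):
    return G * G / (H + lam)


def split_gain(GL, HL, GR, HR, lam, gamma):
    """Loss reduction of a split; gamma penalizes adding a leaf."""
    return 0.5 * (leaf_score(GL, HL, lam) + leaf_score(GR, HR, lam)
                  - leaf_score(GL + GR, HL + HR, lam)) - gamma


def best_default_direction(x, y, scores, threshold, lam=1.0, gamma=0.0):
    """Send missing values (None) left, then right; keep the better gain."""
    GL = HL = GR = HR = Gm = Hm = 0.0
    for xi, yi, si in zip(x, y, scores):
        g, h = grad_hess(yi, si)
        if xi is None:
            Gm += g
            Hm += h
        elif xi < threshold:
            GL += g
            HL += h
        else:
            GR += g
            HR += h
    gain_left = split_gain(GL + Gm, HL + Hm, GR, HR, lam, gamma)
    gain_right = split_gain(GL, HL, GR + Gm, HR + Hm, lam, gamma)
    if gain_left >= gain_right:
        return "left", gain_left
    return "right", gain_right


# Toy data: the two rows with a missing feature value both have label 0.
x = [0.1, 0.2, None, 0.8, 0.9, None]
y = [0, 0, 0, 1, 1, 0]
scores = [0.0] * len(x)  # raw scores before the first tree

direction, gain = best_default_direction(x, y, scores, threshold=0.5)
print(direction, gain)

# Shrinkage: only a fraction eta of the Newton step is added to the score.
eta = 0.3
print(eta * leaf_weight(2.0, 1.0, lam=1.0))
```

Setting `lam=0.0` in `leaf_weight` gives the unregularized Newton step, which makes the effect of the regularization term easy to compare.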
Python
Feature importance
```python
import pandas as pd
from IPython.display import display
from xgboost import XGBClassifier, plot_importance

# `df` is assumed to be an already-loaded pandas DataFrame.
cols_x = [ ]  # feature column names
cols_y = [ ]  # target column name(s)

X = df[cols_x]
y = df[cols_y]

# Standardize each feature column (z-score with the sample standard deviation).
arr_mean = X.mean()
arr_std = X.std(ddof=1)
newX = (X - arr_mean) / arr_std
X = newX

print("XGBoost Start!")

model = XGBClassifier()
model.fit(X, y)

# Collect the feature importances into a table, sorted in descending order.
df_ip = pd.DataFrame(columns=['Feature', 'Importance'])
df_ip['Feature'] = X.columns
df_ip['Importance'] = model.feature_importances_
df_ip = df_ip.sort_values(by=['Importance'], ascending=False)
df_ip = df_ip.reset_index(drop=True)
df_ip.to_csv('xxxxx_xgboost_feature_importance.csv', index=False, encoding='utf_8_sig')
display(df_ip)

print("XGBoost Done!\n")
```