Credit risk is one of the main causes of bank failure. Traditional credit risk scoring models based on expert rules are easy to interpret in business terms, but they demand considerable business experience and theoretical knowledge from the modeler, and they cannot capture complex relationships among variables, so they fall short of fully data-driven modeling. This paper applies the Gradient Boosting algorithm to our bank's small-enterprise credit customer data and compares it side by side with logistic regression and an expert-rule model. Experimental results show that, with recall on default samples and ROC as the evaluation metrics, Gradient Boosting is significantly better than the other two models in both accuracy and stability. In addition, the two machine-learning models, Gradient Boosting and logistic regression, clearly outperform the expert-rule model.
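The two evaluation metrics named in the abstract, recall on default samples and ROC, can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the paper's code: the function names `default_recall` and `roc_auc`, the label convention (1 = default, 0 = non-default), the 0.5 decision threshold, the toy data, and the reading of "ROC" as the area under the ROC curve.

```python
from typing import Sequence


def default_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of actual defaults (label 1) that the model flags as defaults."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(1 for t in y_true if t == 1)
    if actual_pos == 0:
        raise ValueError("y_true contains no default samples")
    return true_pos / actual_pos


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen default receives a higher score than a randomly chosen
    non-default (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("y_true must contain both classes")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and model-predicted default probabilities.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8, 0.3, 0.05]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"default recall: {default_recall(y_true, y_pred):.3f}")
print(f"ROC AUC:        {roc_auc(y_true, scores):.3f}")
```

Recall depends on the chosen threshold, while the ROC-based measure summarizes ranking quality across all thresholds, which is one plausible reason to report both when defaults are rare.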