This paper proposes ESPLS, an ensemble calibration algorithm based on subspace regression. Diversity among the member models is introduced by combining bootstrap resampling of the training set with variable selection based on mutual information. When a member model is built, variables whose mutual information falls below a specified threshold are first discarded, so that modeling proceeds in a subspace of the original variables, which effectively avoids many of the problems caused by multicollinearity. Experiments on a near-infrared spectral dataset compare ESPLS with two single-model algorithms: full-spectrum partial least squares (PLS) and partial least squares with mutual-information variable selection (SPLS). The results show that, without increasing model complexity, the algorithm improves the prediction accuracy, stability, and resistance to overfitting of the calibration model.
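As a rough illustration of the procedure described above, the following Python sketch builds each member model on a bootstrap sample, screens variables by mutual information, and fits a single-response PLS model (PLS1, via NIPALS) on the surviving subspace. Several details are assumptions that the abstract does not specify: mutual information is taken between each variable and the response and is estimated with a simple equal-width histogram; member predictions are combined by plain averaging; and all function names and the parameters `mi_threshold`, `bins`, `n_models`, and `n_components` are hypothetical. It is a minimal sketch, not the authors' implementation.

```python
import math
import random


def mutual_information(x, y, bins=8):
    """Histogram estimate of I(x; y) in nats (assumed estimator)."""
    def digitize(v):
        lo, hi = min(v), max(v)
        width = (hi - lo) / bins or 1.0  # constant column: everything in bin 0
        return [min(int((a - lo) / width), bins - 1) for a in v]

    bx, by, n = digitize(x), digitize(y), len(x)
    joint, px, py = {}, {}, {}
    for i, j in zip(bx, by):
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] = px.get(i, 0) + 1
        py[j] = py.get(j, 0) + 1
    # sum over occupied cells of p_xy * log(p_xy / (p_x * p_y))
    return sum(c / n * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())


def fit_pls1(X, y, n_components):
    """NIPALS PLS1 on mean-centred data; returns (x_mean, y_mean, components)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    f = [v - y_mean for v in y]
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # no covariance left between residual X and residual y
        w = [v / norm for v in w]                                   # weights
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # deflate X and y before extracting the next component
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def predict_pls1(model, x):
    """Predict one sample by replaying the deflation steps on it."""
    x_mean, y_mean, components = model
    e = [a - m for a, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat


def fit_espls(X, y, n_models=10, mi_threshold=0.1, n_components=2, seed=0):
    """Ensemble of PLS1 members, each on its own bootstrap sample and subspace."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    members = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        mi = [mutual_information([row[j] for row in Xb], yb) for j in range(p)]
        keep = [j for j in range(p) if mi[j] >= mi_threshold]
        if not keep:  # assumed fallback: keep the single most informative variable
            keep = [max(range(p), key=mi.__getitem__)]
        sub = [[row[j] for j in keep] for row in Xb]
        model = fit_pls1(sub, yb, min(n_components, len(keep)))
        members.append((keep, model))
    return members


def predict_espls(members, x):
    """Average the member predictions (assumed combination rule)."""
    preds = [predict_pls1(m, [x[j] for j in keep]) for keep, m in members]
    return sum(preds) / len(preds)
```

Calling `fit_espls(X, y)` with `X` as a list of spectra (one list of variables per sample) and `y` as the reference values returns the member list, which `predict_espls(members, x)` then applies to a new spectrum `x`. Because a histogram estimate of mutual information is biased upward on small samples, `mi_threshold` would need to be tuned for the data at hand, for example by cross-validation.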