Graph-based semi-supervised classification has been one of the active research topics in machine learning and data mining in recent years. Methods of this kind typically construct a graph to uncover the intrinsic structure of the data, and then use the structural information of the graph to help classify unlabeled samples. In general, the performance of a graph-based semi-supervised classification method depends heavily on the graph it constructs. This paper proposes a graph construction method based on sparse representation in affine subspaces. While minimizing the reconstruction error of the input signal, the sparse coding method imposes three constraints: (1) the input signal can be approximated by an affine combination of the columns of the dictionary matrix; (2) the linear representation coefficients are nonnegative; (3) the linear representation coefficients are sparse. From these three constraints, we formulate a constrained optimization problem for sparse coding based on the l0-norm, propose a corresponding approximate solution method, and then construct the l0-graph of the data. Finally, within the framework of regularized learning theory, we propose a new semi-supervised learning method by introducing a regularization term that measures the structure-preserving error on the l0-graph. The method has an explicit multi-class classification function and also inherits the strong discriminative information contained in the l0-graph obtained from the sparse coding of the data, so it can classify out-of-sample points quickly and accurately. Experimental results on a series of synthetic and real-world datasets verify the effectiveness of the proposed semi-supervised classification method.
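To make the three coding constraints concrete, the following is a minimal sketch of how a graph of this kind might be built. It is not the paper's algorithm: the abstract does not describe its approximate solver, so this sketch substitutes a simple heuristic (projected gradient on the probability simplex, followed by keeping the k largest coefficients and re-fitting on that support). The function names, the sparsity level `k`, the iteration count, and the choice of using all other samples as the dictionary are assumptions made for illustration only.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1}."""
    n = v.size
    u = np.sort(v)[::-1]                      # sort in descending order
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, n + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def simplex_least_squares(D, x, n_iter=500):
    """Projected gradient for min 0.5*||x - D a||^2 s.t. a >= 0, sum(a) = 1.

    The sum-to-one constraint gives the affine combination (constraint 1),
    and a >= 0 gives nonnegativity (constraint 2).
    """
    m = D.shape[1]
    a = np.full(m, 1.0 / m)
    # Step size 1/L, where L = ||D||_2^2 bounds the gradient's Lipschitz constant.
    L = np.linalg.norm(D, 2) ** 2 + 1e-12
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = project_to_simplex(a - grad / L)
    return a


def sparse_affine_code(D, x, k=3):
    """Heuristic stand-in for the l0 constraint (constraint 3):
    keep the k largest coefficients, then re-fit on that support."""
    a = simplex_least_squares(D, x)
    support = np.argsort(a)[-k:]
    a_sparse = np.zeros_like(a)
    a_sparse[support] = simplex_least_squares(D[:, support], x)
    return a_sparse


def build_l0_graph(X, k=3):
    """X has shape (n_samples, n_features).

    Row i of W holds the code of sample i over all other samples
    (assumed dictionary), so W[i, j] is the weight of edge i -> j.
    """
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        others = np.delete(np.arange(n), i)
        W[i, others] = sparse_affine_code(X[others].T, X[i], k)
    return W
```

For example, `W = build_l0_graph(X, k=3)` on a small sample matrix `X` yields a weight matrix whose rows are nonnegative, sum to one, and have at most three nonzero entries. The regularized semi-supervised classifier that the abstract builds on top of the graph is not sketched here, since the abstract gives no details of its objective.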