Existing constraint-based semi-supervised clustering algorithms often produce clustering results of insufficient quality. To address this problem, an active learning clustering algorithm based on a Gaussian kernel map and locally linear reconstruction is proposed. First, the Gaussian kernel map and locally linear embedding are used for manifold learning; samples that contribute too little to the local linear reconstruction, as well as samples lying outside flat patches, are treated as unimportant. Then, a new criterion for query selection is introduced that takes into account the number of queries each sample requires. Finally, must-link constraints are created and used to pass the information in the flat patches to the semi-supervised clustering algorithm. Experimental results show that, on both small-scale and large-scale data sets, the pairwise constraints learned by the proposed algorithm yield better clustering results.
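The first step can be illustrated with a minimal sketch. It assumes the standard locally linear embedding weight computation carried out in the feature space induced by a Gaussian kernel; the abstract does not give the authors' exact formulation, so the function names (`gaussian_kernel`, `kernel_lle_weights`) and the parameters `sigma`, `n_neighbors`, and `reg` are hypothetical choices made here for illustration.

```python
import numpy as np


def gaussian_kernel(X, sigma=1.0):
    """Gaussian (RBF) kernel matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative values from round-off
    return np.exp(-d2 / (2.0 * sigma ** 2))


def kernel_lle_weights(X, n_neighbors=5, sigma=1.0, reg=1e-3):
    """Reconstruction weights of each sample from its neighbours in kernel space.

    Assumption: this is the textbook LLE weight step expressed through kernel
    values, not necessarily the procedure used in the paper.
    Requires n_neighbors < number of samples.
    """
    K = gaussian_kernel(X, sigma)
    n = K.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        # Largest kernel value = smallest distance in the kernel feature space.
        order = np.argsort(-K[i])
        nbrs = order[order != i][:n_neighbors]
        k = len(nbrs)
        # Local Gram matrix of the differences phi(x_i) - phi(x_j), j in nbrs.
        G = (K[i, i]
             - K[i, nbrs][:, None]
             - K[i, nbrs][None, :]
             + K[np.ix_(nbrs, nbrs)])
        # Trace-scaled regularisation keeps the linear system well conditioned.
        G = G + (reg * np.trace(G) + 1e-12) * np.eye(k)
        w = np.linalg.solve(G, np.ones(k))
        W[i, nbrs] = w / w.sum()  # weights of each sample sum to one
    return W
```

Row `i` of the returned matrix holds the weights with which the neighbours of sample `i` reconstruct it. How the paper turns such weights into an importance score or a flat-patch test is not stated in the abstract, so that part is left out of the sketch.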