To improve text classification performance, this paper proposes a semi-supervised learning algorithm based on label propagation within a restricted constraint range. First, a probability transition matrix is computed from the similarity matrix, and the restricted constraint range is derived from that transition matrix. Then, within the constraint range, the label propagation algorithm of the semi-supervised learning framework is used to compute path-based similarity, which determines the important paths for label propagation. Because only a few important propagation paths are used, the algorithm does not need to compute the similarity of every path, which greatly reduces the computational complexity. As a result, labels propagate between labeled and unlabeled data along these few important paths. Experiments demonstrate the effectiveness of the algorithm.
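The pipeline described above (similarity matrix, probability transition matrix, restricted range, label propagation) can be illustrated with a minimal sketch. The abstract does not define the constraint range or the path-based similarity, so this sketch makes assumptions that are not taken from the paper: the constraint range is approximated by keeping the `k` largest transition probabilities per node, and standard clamped label propagation stands in for the path-based similarity step. The function names `transition_matrix`, `restrict_top_k`, and `propagate_labels` are hypothetical.

```python
import numpy as np


def transition_matrix(S):
    """Row-normalize a nonnegative similarity matrix into a transition matrix."""
    S = np.asarray(S, dtype=float)
    row_sums = S.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # leave all-zero rows as zeros
    return S / row_sums


def restrict_top_k(P, k):
    """Assumed stand-in for the 'constraint range': keep the k largest
    transition probabilities in each row, zero the rest, renormalize."""
    P = np.asarray(P, dtype=float)
    R = np.zeros_like(P)
    idx = np.argsort(-P, axis=1)[:, :k]      # column indices of the top-k entries
    rows = np.arange(P.shape[0])[:, None]
    R[rows, idx] = P[rows, idx]
    return transition_matrix(R)


def propagate_labels(P, labels, n_classes, n_iter=100):
    """Clamped label propagation; labels uses -1 for unlabeled points."""
    labels = np.asarray(labels)
    labeled = labels >= 0
    Y = np.zeros((len(labels), n_classes))
    Y[labeled, labels[labeled]] = 1.0        # one-hot rows for labeled points
    F = Y.copy()
    for _ in range(n_iter):
        F = P @ F                            # spread label mass along kept edges
        F[labeled] = Y[labeled]              # clamp the known labels
    return F.argmax(axis=1)


# Toy data: two groups of three documents, one labeled document per group.
S = np.full((6, 6), 0.1)
S[:3, :3] = 0.9
S[3:, 3:] = 0.9
np.fill_diagonal(S, 0.0)

P = restrict_top_k(transition_matrix(S), k=2)
pred = propagate_labels(P, [0, -1, -1, 1, -1, -1], n_classes=2)
```

Repeatedly multiplying by the restricted matrix only moves label mass along the retained edges, which is one simple way to confine propagation to a few strong paths; the paper's actual path selection may differ.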