To address the tendency of centroid-based classifiers to suffer from inductive bias or model misfit, this paper proposes an iteratively corrected centroid classification algorithm based on support vectors. The method constructs the centroid vectors using only the support vectors selected by support vector machines (SVMs), and then iteratively corrects these initial centroids using the misclassified samples of the training set. Experiments on eight commonly used text classification datasets confirm the effectiveness of the algorithm: compared with other classification algorithms, it achieves good macro-averaged and micro-averaged F1 scores, particularly on imbalanced text corpora.
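The two stages described in the abstract can be illustrated with a minimal sketch. The abstract does not give the similarity measure or the correction formula, so everything specific here is an assumption: cosine similarity for classification, a correction step that moves the true class's centroid toward a misclassified sample and the wrongly predicted class's centroid away from it, and the hypothetical parameters `lr` and `max_iter`. The support-vector indices `sv_idx` are assumed to come from an SVM trained separately (for example, the `support_` attribute of scikit-learn's `SVC`), and every class is assumed to have at least one support vector.

```python
import math


def normalize(v):
    """Scale v to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def build_centroids(X, y, sv_idx):
    """Initial centroids: normalized sum of each class's support vectors only."""
    sums = {}
    for i in sv_idx:
        acc = sums.setdefault(y[i], [0.0] * len(X[i]))
        for j, value in enumerate(normalize(X[i])):
            acc[j] += value
    return {label: normalize(vec) for label, vec in sums.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid has the highest cosine similarity."""
    x = normalize(x)
    return max(centroids, key=lambda label: dot(normalize(centroids[label]), x))


def refine_centroids(centroids, X, y, lr=0.1, max_iter=20):
    """Iteratively correct the centroids with misclassified training samples.

    Assumed update rule (not taken from the paper): pull the true class's
    centroid toward the sample and push the predicted class's centroid away.
    """
    centroids = {label: list(vec) for label, vec in centroids.items()}
    for _ in range(max_iter):
        errors = 0
        for xi, yi in zip(X, y):
            pred = predict(centroids, xi)
            if pred != yi:
                errors += 1
                xn = normalize(xi)
                centroids[yi] = [c + lr * v for c, v in zip(centroids[yi], xn)]
                centroids[pred] = [c - lr * v for c, v in zip(centroids[pred], xn)]
        if errors == 0:  # stop once the training set is classified correctly
            break
    return centroids


# Toy, imbalanced two-class data; the support-vector indices are hard-coded
# here purely for illustration instead of being produced by a real SVM.
X = [[1.0, 0.0], [0.8, 0.2], [0.55, 0.45], [0.0, 1.0], [0.45, 0.55]]
y = [0, 0, 0, 1, 1]
sv_idx = [0, 4]

initial = build_centroids(X, y, sv_idx)
refined = refine_centroids(initial, X, y)
print([predict(initial, x) for x in X])
print([predict(refined, x) for x in X])
```

Comparing the two printed prediction lists against `y` shows how the correction loop changes the decisions made by the initial support-vector centroids on the training set.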