The classical k-nearest neighbor (kNN) algorithm has two drawbacks: the parameter k is difficult to determine, and classification is inefficient. Model-based kNN algorithms overcome these drawbacks by constructing a classification model of the training samples from a set of representative points, but at a high computational time cost. This paper proposes an efficient multi-representative-point learning algorithm for nearest neighbor classification. Structural risk minimization theory is used to analyze the factors that affect the expected risk of the classification model. On this basis, an unsupervised local clustering algorithm is used to learn an optimized set of representative points. Experimental results on real-world application data sets show that the algorithm classifies data with complex class structure effectively and substantially improves classification efficiency.
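The general idea of classifying with several representative points per class can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper: the abstract does not specify its local clustering procedure, so plain k-means run separately within each class is swapped in here as a stand-in, and the function names (`kmeans`, `fit_representatives`, `predict`) and the `per_class` parameter are hypothetical, chosen for illustration only.

```python
import numpy as np


def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means (a stand-in for the paper's local clustering)."""
    rng = np.random.default_rng(seed)
    # Initialise the centers with k distinct training samples.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Distance from every sample to every center, shape (n, k).
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            members = X[assign == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers


def fit_representatives(X, y, per_class=2, seed=0):
    """Learn `per_class` representative points for each class label."""
    reps, rep_labels = [], []
    for c in np.unique(y):
        Xc = X[y == c]
        k = min(per_class, len(Xc))  # never ask for more clusters than samples
        reps.append(kmeans(Xc, k, seed=seed))
        rep_labels.extend([c] * k)
    return np.vstack(reps), np.array(rep_labels)


def predict(reps, rep_labels, X):
    """Assign each query the label of its nearest representative point."""
    dist = np.linalg.norm(X[:, None, :] - reps[None, :, :], axis=2)
    return rep_labels[dist.argmin(axis=1)]
```

In this sketch a query is compared only against the representative points rather than against every training sample, which is where the efficiency gain over classical kNN comes from. Allowing more than one representative per class is what lets a class made up of several separate clusters be modeled.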