Recasting traditional learning algorithms with kernel methods has been an active research topic in machine learning in recent years. This paper proposes a new idea: applying the kernel method to perform clustering in the original input space. The idea is then extended to traditional clustering algorithms, yielding the fuzzy kernel C-means algorithm and the possibilistic kernel C-means algorithm. In essence, these algorithms adopt a new distance measure in the criterion function, namely a kernel-induced non-Euclidean distance. According to Huber's robust statistics, they are inherently robust and are suited to clustering incomplete or missing data, noisy data, and outliers. Finally, the performance of these algorithms is verified on artificial and benchmark datasets.
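To make the notion of a kernel-induced distance in the criterion function concrete, the following is a minimal sketch of fuzzy C-means with prototypes kept in the input space. It rests on assumptions the abstract does not state: a Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2), for which the induced squared distance is 2(1 - K(x, v)), and the alternating update rules commonly used with that kernel. The function name `kernel_fcm` and the parameters `sigma` and `init` are hypothetical, chosen only for illustration.

```python
import numpy as np

def kernel_fcm(X, c, m=2.0, sigma=1.0, n_iter=100, tol=1e-6, init=None, seed=0):
    """Sketch of fuzzy C-means with a Gaussian-kernel-induced distance.

    Prototypes V live in the original input space. Returns (V, U), where
    U[i, k] is the membership of point k in cluster i.
    """
    X = np.asarray(X, dtype=float)
    if init is None:
        # Assumption: start from c distinct, randomly chosen data points.
        rng = np.random.default_rng(seed)
        V = X[rng.choice(len(X), size=c, replace=False)].copy()
    else:
        V = np.asarray(init, dtype=float).copy()

    for _ in range(n_iter):
        # Gaussian kernel between every prototype and every point, shape (c, n).
        sq = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        K = np.exp(-sq / sigma**2)
        # Kernel-induced squared distance ||phi(x) - phi(v)||^2 = 2 (1 - K(x, v)),
        # clamped away from zero so the power below stays finite.
        d = np.maximum(2.0 * (1.0 - K), 1e-12)
        # Membership update: same form as standard FCM, with d as the distance.
        inv = d ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)
        # Prototype update: each point is weighted by u^m * K, so points far
        # from a prototype (K close to 0) contribute almost nothing to it.
        W = (U ** m) * K
        V_new = (W @ X) / W.sum(axis=1, keepdims=True)
        shift = np.abs(V_new - V).max()
        V = V_new
        if shift < tol:
            break
    return V, U
```

In this sketch the extra factor K in the prototype weights is what bounds the influence of distant points, which is one way to read the robustness claim. The result depends on `sigma` and on the initial prototypes, so both would need tuning on real data.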