To address the shortcomings of cluster analysis on data sets with arbitrary shapes, arbitrary densities, and certain structural features, a discrete topological manifold is first established in the data space. On this structure, two relative measures are defined: neighborhood density similarity and neighborhood density change smoothness. Reachability is then used to define sample structure similarity and class structure, and the class structure relation is proved to be an equivalence relation. Next, structure similarity is treated as an attractive force, and a clustering method based on a compression transformation is designed. The method handles arbitrary shapes and arbitrary densities and offers good interpretability, among other advantages. Finally, comparative experiments on artificial and standard data sets show that the method clearly outperforms other clustering algorithms in both clustering efficiency and effectiveness.
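The abstract gives no formulas, so the following is only a minimal sketch of one idea it names: when a similarity relation between neighbors is closed under reachability, the resulting classes are equivalence classes and can be computed as connected components. Everything concrete here is an assumption made for illustration and is not taken from the paper: the k-nearest-neighbor graph standing in for the discrete manifold, the inverse mean neighbor distance used as density, the min/max density ratio used as relative similarity, and the names `structure_clusters`, `k`, and `min_similarity`. The compression transformation itself is not implemented.

```python
import math


def knn(points, k):
    """For each point, return its k nearest neighbours as (distance, index) pairs."""
    result = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        result.append(dists[:k])
    return result


def structure_clusters(points, k=3, min_similarity=0.5):
    """Group points whose neighbourhood densities are relatively similar.

    Hypothetical illustration, not the paper's algorithm.
    """
    neigh = knn(points, k)
    # Assumed density estimate: inverse of the mean distance to the k neighbours.
    dens = [len(nb) / (sum(d for d, _ in nb) + 1e-12) for nb in neigh]

    # Union-find: merging similar neighbours yields the transitive closure,
    # so the final groups are equivalence classes of the reachability relation.
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, nb in enumerate(neigh):
        for _, j in nb:
            # Relative measure in (0, 1]: independent of the absolute density scale.
            sim = min(dens[i], dens[j]) / max(dens[i], dens[j])
            if sim >= min_similarity:
                parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]


if __name__ == "__main__":
    dense = [(0, 0), (0, 1), (1, 0), (1, 1)]
    sparse = [(100, 100), (100, 110), (110, 100), (110, 110)]
    print(structure_clusters(dense + sparse))
```

Because the similarity is a ratio rather than an absolute threshold, the dense and the sparse group in the usage example are each recovered as one cluster even though their densities differ by a factor of ten, which is the property the abstract attributes to relative measures.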