Optimization of covariance distance measurement algorithm for multidimensional clustering analysis
Affiliation:

Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P. R. China

Author biography:

LIU Yun (born 1973), male, associate professor. Main research interests: data mining and analysis, artificial intelligence. E-mail: liuyun@kmust.edu.cn.

Fund Project:

Supported by the National Natural Science Foundation of China (61761025) and the Major Science and Technology Project of Yunnan Province (202002AD080002).

    Abstract:

    To characterize the proximity of data objects in multidimensional clustering analysis with an effective distance measure, a covariance distance measurement (CDM) algorithm is proposed. First, fuzzy C-means (FCM) is used to assign weights to the data objects, yielding the membership degree of each sample point with respect to each category feature; the difference degree of each sample is then computed from these membership degrees. Second, to maximize the separation between categories, the covariance distance between each sample point and its associated category replaces the Euclidean distance used in fuzzy clustering and serves as the first optimization criterion, which draws similar data objects closer together. Finally, the covariance distance between sample points serves as the second optimization criterion, which pushes dissimilar data objects apart. The optimal solution is computed iteratively by alternately fixing variables, so that the clustering index and the distance metric learning parameters are optimized simultaneously, giving better clustering results. Experiments on different data sets show that, compared with the FCM-Sig and UNCA algorithms, the CDM algorithm performs better in both clustering accuracy and convergence.
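As a rough illustration of the first step described in the abstract, the sketch below runs the standard FCM updates with a covariance-based (Mahalanobis-type) distance in place of the Euclidean one. It is not the CDM algorithm itself: the abstract does not give the objective function, so the covariance matrix here is simply the fixed, regularized sample covariance of the data, and neither the second optimization criterion nor the distance metric learning is implemented. The function names, the fuzzifier `m = 2`, and the regularization constant are assumptions made for this example.

```python
import numpy as np


def fcm_memberships(dist_sq, m=2.0, eps=1e-12):
    """Standard FCM membership update from squared distances.

    dist_sq has shape (n_samples, n_clusters); each returned row sums to 1.
    """
    d = np.maximum(dist_sq, eps)          # avoid division by zero
    inv = d ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def covariance_dist_sq(X, centers, cov_inv):
    """Squared Mahalanobis-type distance from every sample to every center."""
    diff = X[:, None, :] - centers[None, :, :]          # (n, c, d)
    return np.einsum("ncd,de,nce->nc", diff, cov_inv, diff)


def fcm_covariance(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """FCM with a fixed covariance-based distance (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Assumption: one global, regularized sample covariance, kept fixed.
    cov = np.atleast_2d(np.cov(X, rowvar=False)) + 1e-6 * np.eye(d)
    cov_inv = np.linalg.inv(cov)
    # Initialize centers from randomly chosen samples.
    centers = X[rng.choice(n, n_clusters, replace=False)]
    U = fcm_memberships(covariance_dist_sq(X, centers, cov_inv), m)
    for _ in range(n_iter):
        W = U ** m
        # Alternate: fix memberships and update centers ...
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # ... then fix centers and update memberships.
        U_new = fcm_memberships(covariance_dist_sq(X, centers, cov_inv), m)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers
```

Calling `fcm_covariance(X, n_clusters=2)` on an array `X` of shape `(n_samples, n_features)` returns the membership matrix and the cluster centers; hard labels can be read off with `U.argmax(axis=1)`. Because the covariance is fixed, this is equivalent to ordinary FCM on whitened data, which is the main respect in which it falls short of a learned distance measure.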

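Cite this article

LIU Yun, ZHANG Yi, ZHENG Wenfeng. Optimization of covariance distance measurement algorithm for multidimensional clustering analysis[J]. Journal of Chongqing University, 2023, 46(5): 102-110.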

History
  • Received: 2022-06-09
  • Published online: 2023-05-31