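K-means Clustering Algorithm Based on Self-Adaptively Selected Density Radius
Cite this article: YANG Xin-hua, YU Kuan. K-means Clustering Algorithm Based on Self-Adaptively Selected Density Radius[J]. Journal of Dalian Jiaotong University, 2007, 28(1): 41-44.
Authors: YANG Xin-hua  YU Kuan
Affiliations: 1. School of Mechanical Engineering, Dalian Jiaotong University, Dalian 116028, Liaoning, China
2. School of Software, Dalian Jiaotong University, Dalian 116028, Liaoning, China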
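Abstract: K-means clusters quickly, is easy to implement, and depends little on the data, so it is widely used in text clustering. However, because the initial cluster centers are chosen at random, the results of traditional K-means and its variants fluctuate considerably. This paper improves K-means by adaptively selecting the best density radius and using it to optimize the choice of initial cluster centers, yielding an improved algorithm suited to cluster analysis of text data. Experiments show that the algorithm produces clustering results of high quality with little fluctuation.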

Keywords: text clustering  K-means  density radius  self-adaptive
Article ID: 1673-9590(2007)01-0041-05
Received: 2006-04-17
Revised: 2006-04-17

K-means Clustering Algorithm Based on Self-Adaptively Selected Density Radius
YANG Xin-hua, YU Kuan. K-means Clustering Algorithm Based on Self-Adaptively Selected Density Radius[J]. Journal of Dalian Jiaotong University, 2007, 28(1): 41-44.
Authors: YANG Xin-hua  YU Kuan
Institution: 1. School of Mechanical Engineering, Dalian Jiaotong University, Dalian 116028, China; 2. School of Software, Dalian Jiaotong University, Dalian 116028, China
Abstract: K-means is widely used for text clustering because it is fast, simple, and highly scalable. However, because the initial centers are selected at random, traditional K-means and its variants often produce unstable results. This paper proposes a technique that optimizes the initial cluster centers by self-adaptively selecting the density radius. Experimental results show that K-means with the proposed technique produces clusters with high purity and good stability.
Keywords: text clustering  K-means  density radius  self-adaptive
This article is indexed in VIP, Wanfang Data, and other databases.
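
The abstract describes the idea only in outline: seed K-means from dense regions, with the density radius chosen automatically. The Python sketch below is one possible reading of that idea, not the authors' published procedure. Everything specific in it is an assumption made for illustration: the function names, the Euclidean distance, the candidate radii taken as fractions of the mean pairwise distance, and the use of the within-cluster sum of squared errors (SSE) to pick the radius.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance between every pair of rows in X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def density_initial_centers(X, k, radius):
    """Pick k seeds from dense regions that lie more than `radius` apart.

    Assumed rule: density = number of points within `radius` of a point.
    Returns None if fewer than k such seeds exist for this radius.
    """
    dist = pairwise_distances(X)
    density = (dist <= radius).sum(axis=1)
    available = np.ones(len(X), dtype=bool)
    centers = []
    for _ in range(k):
        if not available.any():
            return None
        candidates = np.where(available)[0]
        best = candidates[np.argmax(density[candidates])]
        centers.append(X[best])
        # Exclude the whole neighbourhood of the seed just chosen.
        available &= dist[best] > radius
    return np.array(centers, dtype=float)


def kmeans(X, centers, n_iter=100):
    """Plain Lloyd iterations started from the given centers."""
    centers = centers.copy()
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment and SSE with respect to the final centers.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    sse = float((d[np.arange(len(X)), labels] ** 2).sum())
    return labels, centers, sse


def adaptive_radius_kmeans(X, k, fractions=(0.1, 0.2, 0.3, 0.4, 0.5)):
    """Try several radii (assumed grid) and keep the lowest-SSE clustering."""
    mean_dist = pairwise_distances(X).mean()
    best = None
    for f in fractions:
        radius = f * mean_dist
        seeds = density_initial_centers(X, k, radius)
        if seeds is None:
            continue  # radius too large to yield k separated seeds
        labels, centers, sse = kmeans(X, seeds)
        if best is None or sse < best[3]:
            best = (radius, labels, centers, sse)
    return best  # (radius, labels, centers, sse), or None if no radius worked
```

Because no step draws random numbers, repeated runs on the same data give the same clustering, which is the kind of stability the abstract attributes to density-based initialization. For text data, a cosine-based distance on TF-IDF vectors would probably be a more natural choice than the Euclidean distance assumed here.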