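A Method for Improving the Classification Accuracy of PSVM on Unbalanced Datasets
Cite this article: ZENG Fan-zi, QIU Zheng-ding. A Method for Improving the Classification Accuracy of PSVM on Unbalanced Datasets[J]. Journal of the China Railway Society, 2004, 26(2): 124-127
Authors: ZENG Fan-zi, QIU Zheng-ding
Affiliation: Institute of Information Science, Beijing Jiaotong University, Beijing 100044, China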
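Abstract: The proximal support vector machine (PSVM) is a relatively fast classifier. When it is applied to an unbalanced sample set, however, PSVM overfits the class with more sample points and underestimates the misclassification error of the class with fewer sample points, which lowers its overall classification performance. To address this, an improved algorithm is proposed. When solving for the separating plane, the algorithm considers only the error caused by misclassified samples, and it adaptively penalizes or rewards the misclassification error according to the number of misclassified samples in each class, thereby eliminating the effect of the difference in class sizes on overall classification performance. Experimental results verify the effectiveness of the proposed algorithm.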
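For orientation, the standard linear PSVM that the abstract takes as its starting point can be written as follows. This is the commonly cited formulation, not text from the article; the symbols $A$ (sample matrix, one row per point), $D$ (diagonal matrix of $\pm 1$ labels), $e$ (vector of ones), $\nu$ (error weight), $w$, $\gamma$ and $\xi$ are notation assumed here.

$$
\min_{w,\gamma,\xi}\ \frac{\nu}{2}\,\lVert \xi \rVert^2 + \frac{1}{2}\left(w^\top w + \gamma^2\right)
\quad \text{s.t.} \quad D(Aw - e\gamma) + \xi = e
$$

$$
\begin{bmatrix} w \\ \gamma \end{bmatrix} = \left(\frac{I}{\nu} + E^\top E\right)^{-1} E^\top D e,
\qquad E = \begin{bmatrix} A & -e \end{bmatrix}
$$

Every sample contributes a squared error $\xi_i^2$ with the same weight $\nu$, so the larger class dominates the sum. The abstract does not state the modified objective. One possible reading, offered only as an assumption, restricts the error term to the currently misclassified samples $M_+$ and $M_-$ of each class and gives them separate weights $c_+$ and $c_-$ that are adapted to the misclassification counts:

$$
\min_{w,\gamma}\ \frac{1}{2}\left(w^\top w + \gamma^2\right)
+ \frac{c_+}{2} \sum_{i \in M_+} \xi_i^2
+ \frac{c_-}{2} \sum_{i \in M_-} \xi_i^2
$$

Because $M_+$ and $M_-$ depend on $(w, \gamma)$, an objective of this kind has no closed-form solution, which would be consistent with the quasi-Newton algorithm named in the keywords.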

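Keywords: proximal support vector machine; quasi-Newton algorithm; unbalanced dataset classification; misclassified samples
Article ID: 1001-8360(2004)02-0124-04
Revised: 2003-07-14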

DFP-PSVM Classifier: A Method of Improving the Accuracy of PSVM Classifier on the Unbalanced Datasets
ZENG Fan-zi, QIU Zheng-ding. DFP-PSVM Classifier: A Method of Improving the Accuracy of PSVM Classifier on the Unbalanced Datasets[J]. Journal of the China Railway Society, 2004, 26(2): 124-127
Authors: ZENG Fan-zi, QIU Zheng-ding
Abstract: The proximal support vector machine (PSVM) is a very fast classifier compared with the standard support vector machine. When it is applied to a two-class problem on an unbalanced dataset, however, the large difference in the cardinality of the two classes causes it to fit the class with more data points better and to underestimate the overall error of the class with fewer data points, which leads to poor classification performance. To overcome this difficulty, we propose an improved PSVM algorithm that adaptively penalizes or rewards the misclassification error according to the number of misclassified samples, and that yields a better classifier. We conduct experiments on sample points drawn from two-dimensional normal distributions, and the experimental results demonstrate the validity of the algorithm.
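The baseline behavior described in the abstract can be explored with a minimal sketch of the standard linear PSVM, which solves $(I/\nu + E^\top E)\,z = E^\top D e$ with $E = [A, -e]$ and $z = (w, \gamma)$. This is not the DFP-PSVM method of the article: the function names (`solve_linear`, `fit_psvm`, `predict`, `per_class_accuracy`), the class sizes, and the Gaussian means below are all assumptions chosen for illustration.

```python
import random


def solve_linear(m, b):
    """Solve m x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(m, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def fit_psvm(points, labels, nu=1.0):
    """Standard linear PSVM: solve (I/nu + E^T E) z = E^T D e, E = [A, -e].

    Returns (w, gamma); the classifier is sign(x . w - gamma).
    """
    dim = len(points[0])
    size = dim + 1
    rows = [list(p) + [-1.0] for p in points]  # rows of E
    m = [[(1.0 / nu if i == j else 0.0) for j in range(size)]
         for i in range(size)]
    rhs = [0.0] * size
    for row, d in zip(rows, labels):
        for i in range(size):
            rhs[i] += d * row[i]           # accumulates E^T D e
            for j in range(size):
                m[i][j] += row[i] * row[j]  # accumulates E^T E
    z = solve_linear(m, rhs)
    return z[:dim], z[dim]


def predict(w, gamma, point):
    """Return +1 or -1 according to the sign of x . w - gamma."""
    score = sum(wi * xi for wi, xi in zip(w, point)) - gamma
    return 1 if score >= 0 else -1


def per_class_accuracy(w, gamma, points, labels):
    """Fraction of correctly classified points, reported per class."""
    acc = {}
    for cls in (1, -1):
        members = [p for p, d in zip(points, labels) if d == cls]
        hits = sum(predict(w, gamma, p) == cls for p in members)
        acc[cls] = hits / len(members)
    return acc


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical unbalanced sample: 200 majority points, 20 minority points.
    majority = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200)]
    minority = [(rng.gauss(2.0, 1.0), rng.gauss(2.0, 1.0)) for _ in range(20)]
    points = majority + minority
    labels = [-1] * 200 + [1] * 20
    w, gamma = fit_psvm(points, labels)
    print(per_class_accuracy(w, gamma, points, labels))
```

Running the script prints the training accuracy of each class separately, so the majority and minority classes can be compared directly; varying the class sizes shows how the equal-weight squared error responds to imbalance.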
Keywords: proximal support vector machine; DFP algorithm; unbalanced dataset classification; misclassification
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.