Similar literature
1.
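A new semi-supervised SVM-KNN classification method is proposed. When few training samples are available, an SVM alone cannot learn an accurate classification boundary. This paper adopts a semi-supervised learning strategy that extracts boundary vectors from a large pool of unlabeled samples to improve the SVM. Introducing the KNN classifier not only enlarges the SVM training set but also improves the label quality of the training samples during iteration, so the SVM classification boundary can be repaired continually. Experimental results show that the proposed method improves the classification accuracy of the SVM algorithm, that tuning its parameters yields better classification results, and that it reduces the cost of labeling large numbers of unlabeled samples.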
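The abstract does not spell out how the KNN step assigns labels, so the following is only a minimal sketch of one plausible ingredient: labeling an unlabeled candidate by majority vote among its nearest labeled neighbours. The function name `knn_label`, the Euclidean distance, and `k=3` are assumptions made for illustration, not details from the paper.

```python
import math
from collections import Counter

def knn_label(point, labeled, k=3):
    """Label `point` by majority vote among its k nearest labeled samples.

    `labeled` is a list of (coordinates, label) pairs. Euclidean distance
    and k=3 are illustrative assumptions, not taken from the abstract.
    """
    nearest = sorted(labeled, key=lambda item: math.dist(point, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labeled samples for two classes.
labeled = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(knn_label((0.2, 0.2), labeled))  # prints a
```

In a semi-supervised loop, points labeled this way could be added to the SVM training set before retraining; that loop is not shown here.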

2.
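The multifunction vehicle bus (MVB) carries critical train operation control commands and monitoring information, and accurate diagnosis of MVB network faults is the basis of intelligent train operation and maintenance. To this end, an MVB network fault diagnosis method that combines active learning with deep neural networks is proposed. The method uses a stacked denoising autoencoder to automatically extract features from the physical waveform of the MVB signal, and uses these features to train a deep neural network that classifies MVB network fault modes. An efficient active learning method based on uncertainty and credibility addresses the shortage of labeled samples and the high cost of manual labeling in practice, so that a high-performance deep neural network model can be obtained from a small number of labeled training samples. Experimental results show that, to reach classification accuracy above 90%, the proposed method needs only 600 labeled training samples, fewer than the 2,800 required by random sampling; with the same number of labeled training samples, the proposed method outperforms traditional methods on all three performance metrics.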
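The abstract names uncertainty as one of the two selection criteria but gives no formula. As a rough sketch of the uncertainty half only, the snippet below ranks unlabeled samples by the Shannon entropy of the model's predicted class probabilities and picks the most uncertain ones for labeling. The entropy criterion and the names `entropy` and `select_most_uncertain` are assumptions; the paper's credibility term is not modeled.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_most_uncertain(pool_probs, k):
    """Return indices of the k pool samples with the highest predictive entropy."""
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:k]

# Hypothetical predicted class probabilities for three unlabeled samples.
pool = [[0.98, 0.01, 0.01], [0.34, 0.33, 0.33], [0.6, 0.3, 0.1]]
print(select_most_uncertain(pool, 2))  # prints [1, 2]
```

The selected indices would be sent to a human annotator and the model retrained, which is where the saving in labeled samples comes from.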

3.
Software requirements are still mainly analyzed manually, which has many drawbacks, such as heavy labor consumption, inefficiency, and even inaccurate results. The problems are worse in domain analysis scenarios because a large number of requirements from many users need to be analyzed. Automatic analysis of software requirements can therefore benefit software companies. For this purpose, we propose an approach to automatically analyze software requirement specifications (SRSs) and extract their semantic information. The approach uses a semantic role labeling (SRL) method based on machine learning and ontology. First, common verbs were collected from SRS documents in the E-commerce domain, and semantic frames were designed for those verbs. Based on the frames, sentences from the SRSs were selected and labeled manually, and the labeled sentences were used as training examples in the machine learning stage. Besides the training examples labeled with semantic roles, external ontology knowledge was used to relieve the data sparsity problem and obtain reliable results. Based on the SemCor and WordNet corpora, the senses of nouns and verbs were identified sequentially with the K-nearest neighbor approach. The senses of the verbs were then used to identify the frame types. After that, we trained the SRL classifier with the maximum entropy method, adding new features based on word sense, such as the hypernyms and hyponyms of the word senses in the ontology. Experimental results show that this approach to automatic functional requirements analysis is effective.

4.
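AdaBoost is an effective ensemble learning method that markedly improves the classification accuracy of unstable learning algorithms, but it brings little improvement to the stable naive Bayesian classifier. To address this, multiple feature evaluation functions are used to build different feature views, producing several diverse weighted naive Bayes (WNB) base classifiers. Several ways of embedding sample weights into the parameters of the WNB base classifiers are also tried, which perturbs WNB and further increases the instability of the base classifiers. Experimental results show that, compared with AdaBoost, the proposed algorithm BoostMV-WNB markedly improves the performance of the WNB text classifier.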

5.
To improve the accuracy of text categorization and reduce the dimension of the feature space, this paper proposes a two-stage feature selection method based on a novel category correlation degree (CCD) method and latent semantic indexing (LSI). In the first stage, the CCD method selects the most effective features for text classification and is more effective than the traditional feature selection method. In the second stage, document representation requires a high-dimensional feature space and does not take into account the semantic relations between features, which leads to poor categorization accuracy. The LSI method is therefore used to solve these problems: it replaces individual terms with statistically derived conceptual indices, which can discover the important correlations between features and reduce the dimension of the feature space. First, each feature is ranked by its importance for classification using the CCD method. Second, a new semantic space among the features is constructed with the LSI method. The experimental results show that the method effectively reduces the dimension of the text vector and improves the performance of text categorization.

6.
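This paper describes a study of automatic crop classification for 西里卡 County, New York State, using Landsat TM data from 4 July and 5 August 1985. The crops are mainly corn and wheat. A supervised maximum-likelihood digital image classification method is used. Classification accuracy: 72% to 91% correct for corn and 82% to 88% correct for wheat, with very small additional classification error. The paper also studies how to select training data to improve the accuracy and reliability of the classification.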

7.
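Detection method for typical surface defects of ferrite magnetic tiles   Total citations: 2, self-citations: 0, citations by others: 2
To address the unstable quality of manual surface-defect inspection of magnetic tiles, a method for automatically detecting surface defects of magnetic tiles is proposed. First, geometric features such as the tile contour length and area, together with the contour-matching similarity, are used as the feature vector, and a support vector machine performs the initial classification. Then, analysis of convex-concave defects yields the number and area of defects as the feature vector, and a minimum mean-square-error classifier performs a second classification. Finally, the results of the two steps are combined with a logical AND to give the final judgment. Experiments show that the method achieves a correct recognition rate of about 91.80%, a false acceptance rate of about 0.75%, and a correct rejection rate of about 14.00%.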

8.
This paper proposes a novel feature selection method, LUIFS (latent utility of irrelevant feature selection), that not only selects the relevant features but also aims to discover latently useful irrelevant attributes by measuring their supportive importance to other attributes. The method minimizes information loss while maximizing the final classification accuracy. The classification error rates of LUIFS on 16 real-life datasets from the UCI machine learning repository were evaluated using the ID3, Naive Bayes, and IB (instance-based classifier) learning algorithms, and compared with those of the same algorithms with no feature selection (NoFS), feature subset selection (FSS), and correlation-based feature selection (CFS). The empirical results demonstrate that LUIFS can improve the performance of learning algorithms by taking the latent relevance of irrelevant attributes into consideration, and hence including those potentially important attributes in the optimal feature subset for classification.

9.
With the rapid development of E-commerce, a large amount of data, including reviews of different types of products, can be accessed within a short time. Opinion mining, and fine-grained opinion mining in particular, is therefore becoming increasingly effective for extracting valuable information for product design, improvement, and brand marketing. However, because opinions are expressed in an unstructured and casual way, valuable information cannot be extracted conveniently. In this paper, we propose an integrated strategy to automatically extract feature-based information, with which one can easily acquire detailed opinions about particular products. To adapt to the characteristics of reviews, the strategy consists of a multi-label classification (MLC) for reviews, a binary classification (BC) for sentences, and a sentence-level sequence labelling step that uses a deep learning method. In experiments, our approach achieves 82% accuracy in the final sequence labelling task under 20-fold cross validation. In addition, the strategy can readily be applied to other reviews as long as a corresponding amount of labelled data is available for startup.

10.
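A semi-supervised learning adaptive resonance theory system is proposed on the basis of adaptive resonance theory. The system removes the requirement, common in semi-supervised learning algorithms, that the probability distribution of the data be known in advance. Exploiting the stability and plasticity of adaptive resonance theory, it has a very strong ability to learn new patterns and correct errors. To improve the adaptivity of the system, the vigilance parameter is set to vary dynamically. Experimental results show that the system outperforms the discriminant CEM algorithm, with a more pronounced advantage on data containing noise and new patterns.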

11.
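An incident detection method is proposed that uses multiple SVM classifiers to effectively fuse complex traffic information on expressways. First, the initial training set is partitioned into non-overlapping subsets, and a classifier is trained for each subset. Given an input vector, the classifiers determine its class label and compute the vector's degree of membership in a particular cluster. Second, a probabilistic method fuses the outputs of the multiple SVM classifiers to obtain the final classification result. Next, traffic parameters such as traffic volume, vehicle speed, road occupancy, and the differences in traffic volume, speed, and road occupancy between adjacent monitoring stations are represented as feature vectors and fed to the trained SVM classifiers; the fused classification result serves as the basis for identifying incidents. Finally, an experimental dataset is built from traffic data collected on 5 representative expressway sections. Experimental results show that, compared with a single SVM and LS-SVM, the proposed method based on multi-SVM classifier fusion effectively improves the accuracy and reliability of expressway incident detection and overcomes the shortcomings of detecting incidents with only a single traffic parameter.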

12.
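The TF-IDF algorithm uses term frequency and inverse document frequency to judge the importance of words in a document, but it discriminates between categories poorly. To improve classification, the TFIDF-MP algorithm is proposed. Documents in the corpus are first annotated by paragraph, then segmented and part-of-speech tagged with the jieba word segmentation tool. The number of occurrences of a feature word in a single document is then compared with its average number of occurrences across all documents in the corpus, and an improved Sigmoid function adjusts the feature-word weight. Different position weights are also assigned according to the importance of the paragraph position in the document. After the feature words are sorted by weight, a naive Bayes classifier classifies the documents. Experimental results show that, when applied to news classification, TF-IDF-MP achieves better precision, recall, and F1 than TF-IDF and related improved algorithms.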
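For reference, the baseline that this abstract sets out to improve can be sketched in a few lines. The snippet below computes classic TF-IDF weights only, using one common variant (relative term frequency times log(N / df)). It does not implement the Sigmoid adjustment or the paragraph-position weights, whose exact form the abstract does not give; the function name `tf_idf` is chosen for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute classic TF-IDF weights for a list of tokenized documents.

    Assumed variant: tf = count / document length, idf = log(N / df).
    Returns one {term: weight} dict per document.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document

    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical pre-tokenized documents (jieba would produce the tokens for Chinese text).
docs = [["stock", "market", "stock"], ["football", "market"]]
for doc_weights in tf_idf(docs):
    print(doc_weights)
```

A term that appears in every document, such as "market" here, gets weight zero under this variant, which illustrates why plain TF-IDF carries no category information by itself.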

13.
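Accurately identifying the traffic state at an intersection is a prerequisite for effective traffic control strategies. Traditional methods design indicators from statistics such as occupancy and queue length, and so characterize intersection traffic demand from only a single perspective. To address this, an intersection traffic-state identification method based on a semi-supervised hashing algorithm is proposed. Starting from the rich features of the raw data, an image model of the effective detection area of the intersection is constructed. Traffic-state identification is then cast as an image search problem, and the hashing algorithm performs image search using partial label information to obtain the intersection's traffic state. Finally, the method is validated by simulation. The results show that the proposed method is feasible and effective in both identification accuracy and speed.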

14.
For the task of vision-based automatic product image classification for e-commerce, this paper constructs a set of support vector machine (SVM) classifiers with different model representations. Each base SVM classifier is trained with either a different type of feature or a different spatial level. The probability outputs of these SVM classifiers are concatenated into feature vectors for training another SVM classifier with a Gaussian radial basis function (RBF) kernel. This scheme achieves a state-of-the-art average accuracy of 86.9% for product image classification on the public product dataset PI 100.
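The two building blocks named in this abstract, concatenating base-classifier probability outputs and the Gaussian RBF kernel, can be illustrated without any SVM library. This is a sketch under assumptions: the function names and the `gamma` parameterization k(x, y) = exp(-gamma * ||x - y||^2) are chosen here for illustration, and no SVM training is performed.

```python
import math

def stack_features(prob_outputs):
    """Concatenate per-classifier probability vectors into one meta-feature vector."""
    return [p for probs in prob_outputs for p in probs]

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Hypothetical probability outputs of two base classifiers for two images.
image_a = stack_features([[0.7, 0.3], [0.2, 0.8]])
image_b = stack_features([[0.6, 0.4], [0.3, 0.7]])
print(image_a)                       # prints [0.7, 0.3, 0.2, 0.8]
print(rbf_kernel(image_a, image_b))  # similarity seen by the second-level SVM
```

The second-level classifier then works in a space whose dimension is the number of base classifiers times the number of classes, rather than in the raw image-feature space.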

15.
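Most existing selective classification algorithms are designed for complete data, whereas real-world data are usually incomplete and contain many redundant or irrelevant attributes. Building on prior work, this paper uses the information gain ratio to construct GBSD, a hybrid selective Bayesian classifier for incomplete data. Experiments on 12 standard incomplete datasets show that GBSD not only reduces the number of attributes substantially but also improves classification accuracy and efficiency more effectively than prior work.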

16.
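To improve the classification accuracy of hyperspectral images (HSI), a new hierarchical ensemble learning framework for HSI classification is proposed, based on ensemble learning. Two ensemble strategies are used: external and internal. In the external ensemble stage, multiple spectral and spatial features of the hyperspectral image are constructed so that the external ensemble is highly diverse, which helps improve classification accuracy. In the internal ensemble stage, the AdaBoost algorithm is applied to the individuals in the associated multi-feature set to improve their classification performance. Experiments on two hyperspectral datasets show that the method achieves better overall accuracy than the original AdaBoost and a single classifier.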

17.
Logistic regression is a fast classifier and can achieve higher accuracy on small training data. Moreover, it can work on both discrete and continuous attributes with nonlinear patterns. Based on these properties of logistic regression, this paper proposes an algorithm, called the evolutionary logistical regression classifier (ELRClass), for classifying evolving data streams. The algorithm applies logistic regression repeatedly to a sliding window of samples in order to update the existing classifier, to keep the classifier if its performance deteriorates because of bursty noise, or to construct a new classifier if a major concept drift is detected. Extensive experimental results demonstrate the effectiveness of the algorithm.
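The abstract lists three possible actions (update, keep, rebuild) but not the rule that chooses between them. The sketch below shows one hypothetical way to drive such a decision from the error rate over a sliding window of recent predictions. The class name, both thresholds, and the rule itself are assumptions made for illustration and are not taken from ELRClass.

```python
from collections import deque

class SlidingWindowMonitor:
    """Track recent prediction errors and suggest an action for the classifier.

    The thresholds and the decision rule are hypothetical, not from the paper.
    """

    def __init__(self, window_size, noise_threshold=0.3, drift_threshold=0.6):
        self.errors = deque(maxlen=window_size)  # oldest entries fall out automatically
        self.noise_threshold = noise_threshold
        self.drift_threshold = drift_threshold

    def record(self, correct):
        """Store 0 for a correct prediction and 1 for a wrong one."""
        self.errors.append(0 if correct else 1)

    def error_rate(self):
        return sum(self.errors) / len(self.errors) if self.errors else 0.0

    def decide(self):
        rate = self.error_rate()
        if rate >= self.drift_threshold:
            return "rebuild"  # sustained errors: treat as major concept drift
        if rate >= self.noise_threshold:
            return "keep"     # moderate errors: treat as bursty noise
        return "update"       # stable stream: refit on the current window

monitor = SlidingWindowMonitor(window_size=4)
for outcome in [True, True, False, False]:
    monitor.record(outcome)
print(monitor.decide())  # prints keep
```

A real detector would likely use a statistical test rather than fixed thresholds; the point here is only the window mechanics and the three-way decision.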

18.
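Traditional deep learning diagnosis models generalize poorly when the distribution of train-bearing vibration data differs across variable operating conditions. To address this, a deep transfer learning model with multi-scale convolution and intra-class adaptation is proposed. The model uses an improved ResNet-50 network to analyze the spectrum of the vibration data and obtain intermediate-level features, and builds a multi-scale feature extractor that processes the intermediate-level features at different scales to obtain high-level features. The high-level features are fed to the classifier, and pseudo-labels are computed to shorten the conditional-distribution distance between vibration signals collected under different operating conditions, achieving intra-class matching. To verify the generality and superiority of the model, it was tested and analyzed on multiple operating conditions of a train wheelset bearing dataset and the Case Western Reserve dataset. The results show that, by aligning the high-level features of same-class samples from different domains as the classifier input, the proposed model achieves better fault diagnosis accuracy. In 6 variable-condition diagnosis cases on train bearings, the average diagnosis accuracy is 90.75%, about 10% higher on average than traditional deep learning models, with a recall of 0.927. In 12 variable-condition diagnosis cases on the Case Western Reserve dataset, the average diagnosis accuracy reaches 99.97%, about 10% higher than traditional models. Pseudo-labels thus reduce the conditional-distribution discrepancy between domains and handle the mismatch between source-domain and target-domain data distributions well. The multi-scale feature extractor aligns high-level sample features at different scales and strengthens the generalization and robustness of the model, making it an effective model for fault diagnosis of train bearings under variable operating conditions.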

19.
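To address the accuracy, robustness, and real-time performance of unstructured road detection for autonomous mobile robots, a self-supervised online correction algorithm built around a region of interest (ROI) and a multi-layer perceptron (MLP) is proposed. First, the ROI algorithm defines the effective computation region of the image being processed. Second, the multi-layer perceptron is trained on sample data and classifies the region of interest by the corresponding features; morphological processing and feature extraction are applied to the classified regions to select the valid drivable area. Finally, the self-supervised online correction algorithm replaces erroneous results, further ensuring the accuracy of road classification. Experimental results show that the improved algorithm accurately identifies road regions in the environment and offers good real-time performance and reliability.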

