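Outlier Mining Algorithm for Time Series Data Based on MapReduce
Cite this article: LIU Feng, YAN Wanmei, LI Hongren. Outlier mining algorithm for time series data based on MapReduce[J]. Railway Computer Application (铁路计算机应用), 2015, 24(4): 1-5.
Authors: LIU Feng, YAN Wanmei, LI Hongren
Affiliation: 1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
Abstract: To mine outliers in massive data, grid clustering is combined with the MapReduce programming model to rule out grids that cannot contain outliers; the LOF algorithm is then applied to the data in the remaining grids to detect outliers. To improve the detection accuracy of grid clustering, this paper proposes an improved algorithm based on the clustering radius. Experiments demonstrate the effectiveness of the algorithm and also analyze the time taken by grid clustering with different numbers of nodes, showing that MapReduce-based grid clustering is suitable for processing massive time series data.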

Keywords: massive time series data; grid clustering; MapReduce; LOF; clustering radius
Received: 2014-09-23

Outlier Mining Algorithm for Time Series Data Based on MapReduce
Institution: 1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China; 2. Yuanping Branch, Shuo Huang Railway Development Company Limited, Xinzhou 034100, China
Abstract: To mine outliers in massive time series data, this paper combines grid clustering with the MapReduce programming model to exclude grids that cannot contain outliers, and then uses the LOF algorithm to detect outliers in the remaining grids. To improve the detection accuracy of grid clustering, the paper proposes an improved algorithm based on the clustering radius. Experimental results show the effectiveness of the improvement. The experiments also analyze the execution time of grid clustering with different numbers of nodes, showing that grid clustering combined with MapReduce is suitable for handling massive time series data.
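
The two-stage idea in the abstract (prune grids, then run LOF on what is left) can be illustrated with a small single-process sketch. This is not the authors' implementation: the cell width, the `dense_count` threshold, the rule "a cell is pruned when it holds at least `dense_count` points", and the function names `cell_of`, `candidate_points` and `lof` are all assumptions made for illustration. The map and reduce steps are simulated with a `Counter` rather than a real MapReduce job, the clustering-radius improvement is not modelled, and the LOF here is a brute-force version that takes exactly k neighbours (ties are not expanded) and assumes no duplicate points.

```python
from collections import Counter
from math import dist, floor


def cell_of(point, width):
    """Map step (simulated): key a point by the grid cell containing it."""
    return tuple(floor(x / width) for x in point)


def candidate_points(points, width, dense_count):
    """Reduce step (simulated) plus pruning.

    Per-cell counts are summed, then points in cells holding at least
    `dense_count` points are dropped. This pruning rule is an assumption.
    """
    counts = Counter(cell_of(p, width) for p in points)
    return [p for p in points if counts[cell_of(p, width)] < dense_count]


def lof(p, data, k):
    """Brute-force local outlier factor of p with respect to data.

    p must be an element of data; duplicate points are not handled.
    """
    def knn(q):
        others = (o for o in data if o is not q)
        return sorted(others, key=lambda o: dist(q, o))[:k]

    def k_dist(q):
        return dist(q, knn(q)[-1])

    def lrd(q):
        # Local reachability density: inverse of the mean reachability distance.
        reach = [max(k_dist(o), dist(q, o)) for o in knn(q)]
        return len(reach) / sum(reach)

    neighbours = knn(p)
    return sum(lrd(o) for o in neighbours) / (len(neighbours) * lrd(p))


# Toy data: a tight 5 x 5 cluster near the origin plus one distant point.
points = [(0.1 * i, 0.1 * j) for i in range(5) for j in range(5)] + [(5.0, 5.0)]

# Stage 1: grid pruning leaves only points from sparse cells.
candidates = candidate_points(points, width=1.0, dense_count=5)

# Stage 2: LOF is computed only for the surviving candidates.
scores = {p: lof(p, points, k=3) for p in candidates}
```

In this toy run the 25 clustered points share one cell and are pruned, so LOF is evaluated for a single candidate instead of 26 points. A score well above 1 marks a point whose neighbourhood is much sparser than that of its neighbours.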
This article is indexed in Wanfang Data (万方数据) and other databases.