Application of Agent-based Q-learning in the Traffic Flow Control of Single Intersection

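Cite this article:	CHEN Yang-zhou, ZHANG Hui, YANG Yu-zhen, HU Quan-lian. Application of Agent-based Q-learning in the Traffic Flow Control of Single Intersection [J]. Journal of Highway and Transportation Research and Development, 2007, 24(5): 117-120.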

Authors:	CHEN Yang-zhou, ZHANG Hui, YANG Yu-zhen, HU Quan-lian

Affiliations:	1. Beijing University of Technology, Beijing 100022; 2. Jiangxi Normal University, Nanchang, Jiangxi 330027

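Funding:	Supported by the Science and Technology Development Program Fund of the Beijing Municipal Education Commission (TM2004100051); the Beijing Natural Science Foundation (4042006); the Doctoral Research Start-up Fund of Beijing University of Technology (52002011200402)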

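Abstract:	Agent technology is combined with the Q-learning algorithm and applied to urban traffic control in order to study the control of traffic flow at a single intersection. The structural model of the intersection Agent and the implementation of its Q-learning-based learning mechanism are described, and a reward function suited to traffic control is proposed. When the saturation of the red-light phase exceeds that of the green-light phase, the relative alert level (相对警界度) of the red-light phase dominates the reward function, and in most cases the Agent is penalized; in later decisions, when it faces a similar traffic state, the Agent therefore tends to choose the control action that switches the right of way to the next phase. Otherwise, the Agent tends to keep the right of way with the current phase until the next decision time. The control algorithm is evaluated with the microscopic traffic simulation software Paramics; the simulation results show that the method outperforms fixed-time control and confirm the effectiveness of the reward function.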
Keywords:	traffic engineering; single intersection; Q-learning; reward function; traffic flow
Article ID:	1002-0268(2007)05-0117-04
Revised:	2005-11-23
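
For reference, the standard one-step Q-learning update that an intersection Agent of this kind would rely on can be written as below. The abstract does not give the paper's exact formulation, so the symbols here are ordinary textbook notation and should be read as assumptions: state $s_t$, action $a_t$, reward $r_{t+1}$, learning rate $\alpha$, and discount factor $\gamma$.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
```

In this setting the action set would hold the two control behaviors the abstract mentions: keeping the right of way with the current phase, or switching it to the next phase. A negative reward lowers the value of the action just taken in that traffic state, which is why a penalized Agent later leans toward the other action.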
Application of Agent-based Q-learning in the Traffic Flow Control of Single Intersection

CHEN Yang-zhou, ZHANG Hui, YANG Yu-zhen, HU Quan-lian. Application of Agent-based Q-learning in the Traffic Flow Control of Single Intersection [J]. Journal of Highway and Transportation Research and Development, 2007, 24(5): 117-120.

Authors:	CHEN Yang-zhou, ZHANG Hui, YANG Yu-zhen, HU Quan-lian

Institution:	1. Beijing University of Technology, Beijing 100022, China; 2. Jiangxi Normal University, Nanchang, Jiangxi 330027, China

Abstract:	Agent technology is combined with Q-learning and applied to urban traffic control in order to study the control of a single intersection. The model of the intersection Agent and the implementation of its Q-learning-based learning function are introduced, and a reward function suited to traffic control is put forward. When the saturation of the red phase exceeds that of the green phase, the relative security of the red phase dominates the reward function and the Agent is, in most cases, punished. As a result, when it later faces a similar traffic condition, the Agent tends to choose the control behavior that switches the right of way to the next phase. Otherwise, the Agent tends to keep the right of way with the current phase until the next decision point. The simulation results indicate that the approach outperforms fixed-time control and validate the effectiveness of the reward function.

Keywords:	traffic engineering; single intersection; Q-learning; reward function; traffic flow
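
The learning behavior described in the abstract can be illustrated with a minimal tabular Q-learning sketch. Everything specific here is an assumption made for illustration: the names `QAgent`, `toy_reward` and `discretize` are hypothetical, the state is a pair of discretized saturation levels, and `toy_reward` is a simple saturation difference standing in for the paper's reward function, whose actual form is not given in the abstract.

```python
import random
from collections import defaultdict

KEEP, SWITCH = 0, 1  # the two control behaviors named in the abstract


def toy_reward(green_sat, red_sat, action):
    """Hypothetical stand-in, NOT the paper's reward function.

    Keeping the current phase is penalized when the red phase is more
    saturated than the green one; switching is penalized in the opposite case.
    """
    diff = green_sat - red_sat
    return diff if action == KEEP else -diff


def discretize(green_sat, red_sat, bins=5):
    """Assumed state encoding: map each saturation in [0, 1] to a level."""
    def level(sat):
        return min(int(sat * bins), bins - 1)
    return (level(green_sat), level(red_sat))


class QAgent:
    """Tabular Q-learning agent with epsilon-greedy action selection."""

    def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(lambda: [0.0, 0.0])  # state -> [Q(KEEP), Q(SWITCH)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def greedy(self, state):
        keep_q, switch_q = self.q[state]
        return KEEP if keep_q >= switch_q else SWITCH  # ties favor KEEP

    def act(self, state):
        # Explore with probability epsilon, otherwise exploit the Q-table.
        if self.rng.random() < self.epsilon:
            return self.rng.choice((KEEP, SWITCH))
        return self.greedy(state)

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


# One hand-fed transition: the red phase is far more saturated than the green one.
agent = QAgent()
state = discretize(0.3, 0.9)
next_state = discretize(0.3, 0.7)
reward = toy_reward(0.3, 0.9, KEEP)  # negative, so the Agent is penalized for keeping
agent.update(state, KEEP, reward, next_state)
print(agent.greedy(state) == SWITCH)
```

After the penalized update, the value of keeping the phase in that state drops below the value of switching, so the greedy choice for a similar traffic state becomes a switch. A real controller would replace the hand-fed transition with measurements from a simulator such as Paramics and would use the paper's own reward definition.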
This article is indexed in CNKI, VIP, Wanfang Data and other databases.
