This paper analyzes the role of variable speed limit control on the expressway mainline and reviews existing speed limit methods. Treating mainline variable speed limit control as a discrete-time Markov decision process, it proposes a variable speed limit control model based on reinforcement learning and a finite-stage Markov decision process; the model is adjusted dynamically by learning from interaction with the traffic environment. The model is solved with a finite-stage backward recursive iteration algorithm, and the full length of the Chang-Ji Expressway is simulated with the Paramics simulation software. The simulation results show that, with the average speed limit 6.25% below the design speed, the average flow does not decrease but instead increases by 3.20%. The model can therefore effectively increase traffic flow and improve traffic conditions on the expressway mainline.
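To make the solution method concrete, the following is a minimal sketch of backward recursion for a finite-stage Markov decision process. It is not the paper's model: the two traffic states (free flow, congested), the two actions (keep the limit, lower the limit), the transition probabilities, and the flow rewards are all made-up values chosen only for illustration, and the function name `backward_induction` is hypothetical.

```python
from typing import List, Tuple

# Hypothetical toy problem (all numbers are illustrative assumptions).
# States:  0 = free flow, 1 = congested
# Actions: 0 = keep the current speed limit, 1 = lower the speed limit
# P[a][s][s2] is the probability of moving from state s to s2 under action a.
P = [
    [[0.7, 0.3], [0.2, 0.8]],  # action 0
    [[0.9, 0.1], [0.5, 0.5]],  # action 1
]
# R[a][s] is the immediate reward (e.g. flow in arbitrary units).
R = [
    [10.0, 4.0],  # action 0
    [8.0, 5.0],   # action 1
]


def backward_induction(
    P: List[List[List[float]]], R: List[List[float]], horizon: int
) -> Tuple[List[List[float]], List[List[int]]]:
    """Solve a finite-stage MDP by recursing backward from the last stage.

    Returns (V, policy), where V[t][s] is the best expected total reward
    from stage t onward and policy[t][s] is the action that attains it.
    """
    n_actions = len(P)
    n_states = len(P[0])
    # Terminal values V[horizon][s] are assumed to be zero.
    V = [[0.0] * n_states for _ in range(horizon + 1)]
    policy = [[0] * n_states for _ in range(horizon)]

    for t in range(horizon - 1, -1, -1):  # last stage first
        for s in range(n_states):
            best_a, best_q = 0, float("-inf")
            for a in range(n_actions):
                # Immediate reward plus expected value of the next stage.
                q = R[a][s] + sum(
                    P[a][s][s2] * V[t + 1][s2] for s2 in range(n_states)
                )
                if q > best_q:
                    best_a, best_q = a, q
            V[t][s] = best_q
            policy[t][s] = best_a
    return V, policy


if __name__ == "__main__":
    values, plan = backward_induction(P, R, horizon=2)
    print(values[0], plan)
```

In this toy setting the recursion keeps the limit in free flow and lowers it under congestion at both stages. A reinforcement learning variant would estimate the transition probabilities and rewards from interaction with the traffic environment rather than take them as given.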