论文部分内容阅读
This paper presents an extended Dyna-Q algorithm to improve efficiency of the standard Dyna-Q algorithm.In the first episodes of the standard Dyna-Q algorithm,the agent travels blindly to find a goal position.To overcome this weakness,our approach is to u