论文部分内容阅读
We discuss the solution of complex multistage decision problems using methods that are based on the idea of policy iteration(PI),i.e.,start from some base policy and generate an improved policy.Rollout is the simplest method of this type,where just one im