http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/control.pdf
Generalised Policy Iteration With Monte-Carlo Evaluation
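As a rough illustration of the title, the sketch below alternates Monte-Carlo evaluation of Q with epsilon-greedy policy improvement. Everything in it is an assumption made for the example, not something taken from the linked slides: the five-state corridor environment, the names `step`, `epsilon_greedy` and `mc_control`, the constant epsilon of 0.1, and the every-visit incremental-mean update.

```python
import random
from collections import defaultdict

N_STATES = 5        # hypothetical corridor: states 0..4, state 4 is terminal
ACTIONS = (-1, 1)   # step left or step right


def step(state, action):
    """Toy deterministic environment (an assumption): reward -1 per step."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    return next_state, -1.0, next_state == N_STATES - 1


def epsilon_greedy(Q, state, epsilon, rng):
    """Policy improvement: greedy w.r.t. Q, exploring with probability epsilon."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(Q[(state, a)] for a in ACTIONS)
    # Break ties at random so the untrained agent does not always drift left.
    return rng.choice([a for a in ACTIONS if Q[(state, a)] == best])


def mc_control(episodes=300, epsilon=0.1, gamma=1.0, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)   # action-value estimates, initialised to 0
    N = defaultdict(int)     # visit counts per (state, action)
    for _ in range(episodes):
        # Generate one complete episode with the current epsilon-greedy policy.
        state, done, episode = 0, False, []
        while not done:
            action = epsilon_greedy(Q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            episode.append((state, action, reward))
            state = next_state
        # Policy evaluation: every-visit Monte-Carlo update toward the return G.
        G = 0.0
        for s, a, r in reversed(episode):
            G = gamma * G + r
            N[(s, a)] += 1
            Q[(s, a)] += (G - Q[(s, a)]) / N[(s, a)]
    # Read off the greedy policy for the non-terminal states.
    policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
    return Q, policy


if __name__ == "__main__":
    Q, policy = mc_control()
    print(policy)
```

Evaluation here works on Q rather than V so that the improvement step needs no model of the environment. The constant epsilon is kept only for simplicity; a decaying schedule could be substituted in `mc_control` without changing the rest of the loop.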
Original: http://www.cnblogs.com/yuanjiangw/p/7615952.html