My paper
Student-t policy in reinforcement learning to acquire global optimum of robot control
has been accepted and published in Applied Intelligence! Download from here
This paper proposes a Student-t policy for reinforcement learning. In continuous action spaces, a policy is usually parameterized as a normal distribution, which is sensitive to outliers and explores poorly. The Student-t policy is more robust to outliers and explores well: its heavy tails occasionally produce large actions, giving exploration behavior that resembles a Lévy walk.
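
To make the contrast concrete, here is a minimal sketch of sampling one-dimensional actions from a normal policy and from a Student-t policy, along with their log-densities. It is not the paper's implementation: the function names, the fixed `mean`, `scale`, and `dof` (degrees of freedom) values, and the use of Python's standard library instead of a learned network are all assumptions made for illustration.

```python
import math
import random


def sample_normal_action(rng, mean, scale):
    """Draw one action from a normal policy N(mean, scale^2)."""
    return rng.gauss(mean, scale)


def sample_student_t_action(rng, mean, scale, dof):
    """Draw one action from a location-scale Student-t policy.

    Uses the standard construction t = z / sqrt(chi2 / dof), where z is
    standard normal and chi2 is chi-squared with `dof` degrees of freedom
    (a gamma variate with shape dof/2 and scale 2).
    """
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(dof / 2.0, 2.0)
    return mean + scale * z / math.sqrt(chi2 / dof)


def normal_log_prob(action, mean, scale):
    """Log-density of the normal policy at `action`."""
    z = (action - mean) / scale
    return -0.5 * z * z - math.log(scale) - 0.5 * math.log(2.0 * math.pi)


def student_t_log_prob(action, mean, scale, dof):
    """Log-density of the location-scale Student-t policy at `action`."""
    z = (action - mean) / scale
    return (
        math.lgamma((dof + 1.0) / 2.0)
        - math.lgamma(dof / 2.0)
        - 0.5 * math.log(dof * math.pi)
        - math.log(scale)
        - (dof + 1.0) / 2.0 * math.log1p(z * z / dof)
    )


def count_large_actions(samples, threshold):
    """Count samples whose magnitude exceeds `threshold`."""
    return sum(1 for a in samples if abs(a) > threshold)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10_000
    normal_actions = [sample_normal_action(rng, 0.0, 1.0) for _ in range(n)]
    t_actions = [sample_student_t_action(rng, 0.0, 1.0, 2.0) for _ in range(n)]
    print("normal, |a| > 4:", count_large_actions(normal_actions, 4.0))
    print("student-t, |a| > 4:", count_large_actions(t_actions, 4.0))
    print("log-prob of a = 10 (normal):", normal_log_prob(10.0, 0.0, 1.0))
    print("log-prob of a = 10 (student-t):", student_t_log_prob(10.0, 0.0, 1.0, 3.0))
```

With a small `dof`, the Student-t sampler produces far more large-magnitude actions than the normal sampler, which is the kind of occasional long jump associated with a Lévy walk. The log-density of a distant action also falls off much more slowly under the Student-t, so a single outlying action has less influence on a likelihood-based update.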