Journal of the Japanese Society for Artificial Intelligence
Online ISSN : 2435-8614
Print ISSN : 2188-2266
Print ISSN: 0912-8085 (until 2013)
Reinforcement Learning for Crawling Robot Motion Using Stochastic Gradient Ascent
Hajime KIMURA, Shigenobu KOBAYASHI

1999 Volume 14 Issue 1 Pages 122-130

Abstract

Many previous works in reinforcement learning (RL) are limited to Markov decision processes (MDPs), yet a great many real-world applications do not satisfy this assumption. Real-world RL tasks are characterized by two difficulties: function approximation and hidden state. For large or continuous state or action spaces, the agent must incorporate some form of generalization; one way to do so is to use general function approximators to represent value functions or control policies. Hidden state problems, which can be represented by partially observable MDPs (POMDPs), arise when the RL agent cannot observe the state of the environment perfectly, owing to noisy or insufficient sensors, partial information, etc. We have presented an RL algorithm for POMDPs based on stochastic gradient ascent; it uses a function approximator to represent a stochastic policy and updates the policy parameters directly. We apply the algorithm to a robot control problem and compare its behavior with Q-learning and Jaakkola's method. The results show that the algorithm is very robust even when the agent is restricted to very limited computational resources.

© 1999 The Japanese Society for Artificial Intelligence