Autonomous robots cannot be programmed in advance for every task they may face, so the ability to learn and achieve given goals by trial and error is indispensable. Reinforcement learning, which has attracted much attention in recent years, is a methodology for learning the optimal policy that maximizes the sum of rewards obtained through interaction between an agent (i.e., the autonomous robot) and its environment. Among reinforcement learning frameworks, our focus is not on the recently dominant approach of accumulating experience data and replaying it, but on a framework that continues to learn from online experience.
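As a concrete form of this objective (the notation below is standard but assumed here, not taken from the original text), the agent seeks a policy $\pi$ that maximizes the expected discounted return:
\[
J(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} r_{t+1}\right], \qquad 0 \le \gamma < 1,
\]
where $r_{t+1}$ is the reward received after the action at time $t$ and $\gamma$ is the discount factor. In the online setting emphasized here, the policy is updated from each new transition as it is experienced rather than from a stored replay buffer.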
Reservoir Computing (RC), typified by the Liquid State Machine (LSM) and the Echo State Network (ESN), was devised as an information processing structure that imitates the cerebellum for motor control. RC consists of three layers: an input layer, a reservoir layer, which is a kind of recurrent neural network with an internal state, and an output layer. The most important feature of RC is that the only weights to be learned are those of the readout connecting the reservoir layer to the output layer.
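To make this structure concrete, the following is a minimal Echo State Network sketch, assuming a standard leaky-integrator reservoir and ridge-regression readout training; all names and hyperparameters here are illustrative assumptions, not the implementation used in this work.

```python
import numpy as np


class EchoStateNetwork:
    """Minimal ESN sketch: fixed input/recurrent weights, trainable readout only."""

    def __init__(self, n_in, n_res, n_out, spectral_radius=0.9, leak=0.3, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.uniform(-1.0, 1.0, (n_res, n_in))      # fixed input weights
        W = rng.uniform(-0.5, 0.5, (n_res, n_res))              # fixed recurrent weights
        W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))  # scale for echo state property
        self.W = W
        self.W_out = np.zeros((n_out, n_res))                   # readout: the only learned weights
        self.leak = leak
        self.x = np.zeros(n_res)                                # reservoir internal state

    def step(self, u):
        # Leaky-integrator reservoir update; W_in and W stay fixed during learning.
        pre = self.W_in @ u + self.W @ self.x
        self.x = (1.0 - self.leak) * self.x + self.leak * np.tanh(pre)
        return self.W_out @ self.x                              # output via the readout

    def fit_readout(self, states, targets, ridge=1e-6):
        # Train only the readout weights by ridge regression on collected reservoir states.
        X, Y = np.asarray(states), np.asarray(targets)
        self.W_out = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y).T
```

Because only `W_out` is adjusted, learning reduces to a linear problem over the reservoir states, which is what makes RC attractive for online adaptation.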
A multi-locomotion robot is a concept for adapting a robot to the situations it faces, such as flat and vast terrain, narrow and rough terrain, or trees, by using multiple types of locomotion such as bipedal walking, quadrupedal walking, and brachiation. This concept is inspired by animals: taking humans as an example, they usually walk, but run when in a hurry, climb ladders, and use trekking poles on mountain paths to reduce the burden on their legs (which can be regarded as quadrupedal walking).