When autonomous robots learn tasks in real time from data collected in a real environment, the effects of noise and outliers in the data cannot be ignored. In reinforcement learning in particular, these effects can be significant because no supervised signals are available, and various methods for stabilizing learning have been proposed in recent years.
In this research, we propose optimization methods that are robust to the noise and outliers that destabilize learning:
- A robust stochastic gradient descent method that addresses the noise vulnerability of the first moment
- A robust target network that constrains the divergence between the main and target networks
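To illustrate the first idea, the sketch below shows one plausible way a first moment (momentum) can be made robust: incoming gradients that deviate strongly from the running momentum are down-weighted with a Student-t-like influence weight, so a single outlier gradient barely corrupts the update direction. This is a hypothetical illustration under assumed names (`robust_momentum_step`, `outlier_weight`) and an assumed weighting rule, not the exact method proposed here.

```python
import numpy as np

def outlier_weight(grad, m, v, nu=3.0, eps=1e-8):
    """Weight in (0, 1] for a gradient, given running first moment m
    and second moment v. Gradients far from m (relative to the scale
    given by v) receive a small weight. Hypothetical rule, for
    illustration only; nu controls the heaviness of the tails.
    """
    d2 = np.sum((grad - m) ** 2 / (v + eps))  # Mahalanobis-like distance
    return min((nu + grad.size) / (nu + d2), 1.0)

def robust_momentum_step(param, grad, state, lr=0.01, beta=0.9):
    """One momentum-SGD step whose first moment resists outliers.

    An outlier gradient gets weight w << 1, which makes the effective
    smoothing factor close to 1, so the momentum (and hence the
    parameter) moves only slightly.
    """
    m, v = state["m"], state["v"]
    w = outlier_weight(grad, m, v)
    beta_eff = 1.0 - (1.0 - beta) * w        # outliers barely update m
    state["m"] = beta_eff * m + (1.0 - beta_eff) * grad
    state["v"] = beta * v + (1.0 - beta) * grad ** 2
    return param - lr * state["m"]
```

With a vanilla momentum update, a one-off gradient of magnitude 1000 would shift the parameter by `lr * (1 - beta) * 1000 = 1.0`; under this weighting the shift is several orders of magnitude smaller.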
Both methods can be applied to any learning method based on neural networks, so a wide range of applications can be expected.
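For the second idea, one way to constrain divergence between the main and target networks is to scale the usual Polyak soft-update rate down whenever the online parameters have moved far from the target, so sudden large changes (e.g. caused by outlier gradients) leak only slowly into the target. The function name, the divergence measure, and the scale `sigma2` below are assumptions for illustration, not the exact update proposed here.

```python
import numpy as np

def robust_soft_update(target, online, tau=0.01, nu=1.0, sigma2=1.0):
    """Divergence-aware soft update of a target network (sketch).

    Standard Polyak averaging uses a fixed rate tau:
        theta_target <- theta_target + tau * (theta_online - theta_target)
    Here the per-layer rate is scaled by a weight w in (0, 1]:
    when the networks are close, w ~ 1 and the update reduces to the
    standard one; when the online network has diverged far, w << 1
    and the target tracks it only slowly.

    `target` and `online` are dicts mapping layer names to arrays;
    sigma2 is an assumed reference scale for "normal" divergence.
    """
    for key in target:
        diff = online[key] - target[key]
        d2 = np.mean(diff ** 2)                      # mean squared divergence
        w = min((nu + 1.0) / (nu + d2 / sigma2), 1.0)
        target[key] = target[key] + tau * w * diff
    return target
```

For a small divergence of 0.01 per weight, the target moves by the standard `tau * 0.01`; for a divergence of 100, a plain Polyak update would move it by 1.0, whereas here the movement stays well below 0.01.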