Combining Local and Global Direct Derivative-Free Optimization for Reinforcement Learning

Open Access | Mar 2013

Abstract

We consider the problem of optimization in policy space for reinforcement learning. While a plethora of methods have been applied to this problem, only a narrow category of them has proved feasible in robotics. We consider the peculiar characteristics of reinforcement learning in robotics, and devise a combination of two algorithms from the derivative-free optimization literature. The proposed combination is well suited for robotics, as it involves both off-line learning in simulation and on-line learning in the real environment. We demonstrate our approach on a real-world task, where an Autonomous Underwater Vehicle has to survey a target area under potentially unknown environmental conditions. We start from a given controller, which can perform the task under foreseeable conditions, and make it adaptive to the actual environment.

DOI: https://doi.org/10.2478/cait-2012-0021 | Journal eISSN: 1314-4081 | Journal ISSN: 1311-9702
Language: English
Page range: 53 - 65
Published on: Mar 22, 2013
Published by: Bulgarian Academy of Sciences, Institute of Information and Communication Technologies
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2013 Matteo Leonetti, Petar Kormushev, Simone Sagratella, published by Bulgarian Academy of Sciences, Institute of Information and Communication Technologies
This work is licensed under the Creative Commons License.