Deep reinforcement and imitation learning on a GPU

Iuri Frosio - NVIDIA
Date and time
Tuesday, July 17, 2018 at 11:30 - Sala verde
External contact
Publication date
June 13, 2018


In this talk, I describe some of our efforts towards the development of computationally efficient learning procedures on a GPU, with particular attention to the Reinforcement Learning (RL) and robotics domains. I first describe a hybrid CPU/GPU version of the recently introduced Asynchronous Advantage Actor-Critic (A3C) algorithm. I analyze its computational footprint and highlight the aspects that are critical to running it efficiently on a GPU: a system of queues and a dynamic scheduling strategy, both potentially helpful for other asynchronous algorithms. Our hybrid CPU/GPU version of A3C achieves a significant speed-up compared to its CPU implementation and is publicly available. In the second part of the talk, I introduce CuLE (CUDA Learning Environment), an experimental deep RL companion library developed to overcome the limitations highlighted in the first part: RL training is in fact dominated by data generation on the CPU. CuLE provides a GPU implementation of ALE (the Arcade Learning Environment), a challenging RL benchmark for discrete episodic tasks. CuLE can easily simulate thousands of environments in parallel, whereas traditional deep RL implementations use a limited number of agents with replay memory to achieve training efficiency. CuLE supports new training scenarios with an extremely large number of agents while minimizing expensive data movement operations. I conclude the talk with an example of deep learning on a GPU in the robotics context. I illustrate the advantages of training in a simulator and compare reinforcement learning and imitation learning algorithms. Then I show how separating the control and vision modules simplifies and speeds up the learning procedure in simulation, although the learned controller hardly generalizes to the real-world environment. Finally, I demonstrate how to use domain transfer to deploy the DNN controller trained in simulation in real-world applications.

Contact person: Umberto Castellani

© 2002 - 2021  Università degli studi di Verona
Via dell'Artigliere 8, 37129 Verona  |  P. I.V.A. 01541040232  |  C. FISCALE 93009870234