Stimulus sampling as an exploration mechanism for fast reinforcement learning

Vladimirskiy, Boris; Vasilaki, Eleni; Urbanczik, Robert; Senn, Walter

In: Biological Cybernetics, 2009, vol. 100, no. 4, p. 319-330

Title

Stimulus sampling as an exploration mechanism for fast reinforcement learning

Authors

Vladimirskiy, Boris. Department of Physiology, University of Bern, Bühlplatz 5, 3012, Bern, Switzerland
Vasilaki, Eleni. Laboratory of Computational Neuroscience (IC/LCN), Ecole Polytechnique Fédérale de Lausanne, Station 15, 1015, Lausanne, Switzerland
Urbanczik, Robert. Department of Physiology, University of Bern, Bühlplatz 5, 3012, Bern, Switzerland
Senn, Walter. Department of Physiology, University of Bern, Bühlplatz 5, 3012, Bern, Switzerland

Document type

Postprint

Language

English

Published in

Biological Cybernetics, 2009, vol. 100, no. 4, p. 319-330. Springer-Verlag

Other electronic version

Publisher's version: https://doi.org/10.1007/s00422-009-0305-x

Classification

Health

Keywords

Online learning ; Hebbian learning ; Association task ; Noise ; Reward ; Punishment ; Reward attenuation ; Hippocampus ; Medial temporal lobe ; Striatum

OAI-PMH identifier

oai:doc.rero.ch:309948

Summary

Reinforcement learning in neural networks requires a mechanism for exploring new network states in response to a single, nonspecific reward signal. Existing models have introduced synaptic or neuronal noise to drive this exploration. However, those types of noise tend to almost average out, precluding or significantly hindering learning, when coding in neuronal populations or by mean firing rates is considered. Furthermore, careful tuning is required to find the elusive balance between the often conflicting demands of speed and reliability of learning. Here we show that there is in fact no need to rely on intrinsic noise. Instead, ongoing synaptic plasticity triggered by the naturally occurring online sampling of a stimulus out of an entire stimulus set produces enough fluctuations in the synaptic efficacies for successful learning. By combining stimulus sampling with reward attenuation, we demonstrate that a simple Hebbian-like learning rule yields performance very close to that of primates on visuomotor association tasks. In contrast, learning rules based on intrinsic noise (node and weight perturbation) are markedly slower. Furthermore, the performance advantage of our approach persists for more complex tasks and network architectures. We suggest that stimulus sampling and reward attenuation are two key components of a framework by which any single-cell supervised learning rule can be converted into a reinforcement learning rule for networks without requiring any intrinsic noise source.
