Exploring the limits of learning: Segregation of information integration and response selection is required for learning a serial reversal task

Open Access

27 October 2017

journal article
research article
Published by Public Library of Science (PLoS) in PLOS ONE

Vol. 12 (10), e0186959
https://doi.org/10.1371/journal.pone.0186959

Abstract

Animals are proposed to learn the latent rules governing their environment in order to maximize their chances of survival. However, rules may change without notice, forcing animals to keep a memory of which one is currently at work. Rule switching can lead to situations in which the same stimulus/response pairing is positively and negatively rewarded in the long run, depending on variables that are not accessible to the animal. This fact raises the question of how neural systems are capable of reinforcement learning in environments where the reinforcement is inconsistent. Here we address this issue by asking which aspects of connectivity, neural excitability and synaptic plasticity are key for a very general, stochastic spiking neural network model to solve a task in which rules change without being cued, taking the serial reversal task (SRT) as a paradigm. Contrary to what could be expected, we found strong limitations for biologically plausible networks to solve the SRT. In particular, we proved that no network of neurons can learn an SRT if a single neural population both integrates stimulus information and is responsible for choosing the behavioural response. This limitation is independent of the number of neurons, neuronal dynamics or plasticity rules, and arises from the fact that plasticity is locally computed at each synapse, and that synaptic changes and neuronal activity are mutually dependent processes. We propose and characterize a spiking neural network model that solves the SRT, which relies on separating the functions of stimulus integration and response selection. The model suggests that experimental efforts to understand neural function should focus on the characterization of neural circuits according to their connectivity, neural dynamics, and the degree of modulation of synaptic plasticity with reward.
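
To make the task structure concrete, the following is a minimal sketch of an uncued reversal schedule. It is not the article's spiking network model. The two-choice setting, the fixed block length, the +1/-1 reward values, and the names SerialReversalTask and win_stay_lose_shift are all assumptions introduced here for illustration. The sketch shows why a fixed stimulus/response pairing earns nothing in the long run, while an agent that remembers only its last outcome can follow the hidden rule.

```python
import random


class SerialReversalTask:
    """Toy two-choice serial reversal task (illustrative; not the article's model)."""

    def __init__(self, block_length=20):
        # Hypothetical parameter: number of trials between uncued rule reversals.
        self.block_length = block_length
        # Hidden rule: the response that is currently rewarded (0 or 1).
        self.rewarded_response = 0
        self.trial = 0

    def step(self, response):
        """Return +1 if the response matches the hidden rule, otherwise -1."""
        reward = 1 if response == self.rewarded_response else -1
        self.trial += 1
        # Reverse the rule at the end of each block, without any cue to the agent.
        if self.trial % self.block_length == 0:
            self.rewarded_response = 1 - self.rewarded_response
        return reward


def win_stay_lose_shift(task, n_trials, seed=0):
    """Agent with one-trial memory: repeat after reward, switch after punishment."""
    rng = random.Random(seed)
    response = rng.randint(0, 1)
    rewards = []
    for _ in range(n_trials):
        reward = task.step(response)
        rewards.append(reward)
        if reward < 0:
            response = 1 - response
    return rewards


if __name__ == "__main__":
    # A fixed response is rewarded in half of the blocks and punished in the rest.
    task = SerialReversalTask(block_length=20)
    fixed = [task.step(0) for _ in range(80)]
    print("net reward of fixed response:", sum(fixed))

    # The one-trial-memory agent errs only around each reversal.
    rewards = win_stay_lose_shift(SerialReversalTask(block_length=20), 100)
    print("errors of win-stay/lose-shift agent:", rewards.count(-1))
```

Over an even number of blocks the fixed response nets zero reward, which is the inconsistency in reinforcement that the abstract describes. The win-stay/lose-shift rule is only a behavioural baseline under these assumptions and says nothing about how a neural network could implement such a memory.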

Funding Information

Agencia Nacional de Promoción Científica y Tecnológica (AR) (PICT 1519)
Consejo Nacional de Investigaciones Científicas y Técnicas (AR) (PIP 112 201101 01054)
