Exploring the limits of learning: Segregation of information integration and response selection is required for learning a serial reversal task
Open Access
- 27 October 2017
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 12 (10), e0186959
- https://doi.org/10.1371/journal.pone.0186959
Abstract
Animals are proposed to learn the latent rules governing their environment in order to maximize their chances of survival. However, rules may change without notice, forcing animals to keep a memory of which one is currently at work. Rule switching can lead to situations in which the same stimulus/response pairing is both positively and negatively rewarded in the long run, depending on variables that are not accessible to the animal. This fact raises questions about how neural systems are capable of reinforcement learning in environments where the reinforcement is inconsistent. Here we address this issue by asking which aspects of connectivity, neural excitability and synaptic plasticity are key for a very general, stochastic spiking neural network model to solve a task in which rules change without being cued, taking the serial reversal task (SRT) as a paradigm. Contrary to what could be expected, we found strong limitations for biologically plausible networks to solve the SRT. In particular, we proved that no network of neurons can learn an SRT if it is a single neural population that integrates stimulus information and at the same time is responsible for choosing the behavioural response. This limitation is independent of the number of neurons, neuronal dynamics or plasticity rules, and arises from the fact that plasticity is locally computed at each synapse, and that synaptic changes and neuronal activity are mutually dependent processes. We propose and characterize a spiking neural network model that solves the SRT, which relies on separating the functions of stimulus integration and response selection. The model suggests that experimental efforts to understand neural function should focus on the characterization of neural circuits according to their connectivity, neural dynamics, and the degree of modulation of synaptic plasticity by reward.
Funding Information
- Agencia Nacional de Promoción Científica y Tecnológica (AR) (PICT 1519)
- Consejo Nacional de Investigaciones Científicas y Técnicas (AR) (PIP 112 201101 01054)