A temporal difference account of avoidance learning

Abstract
Aversive processing plays a central role in human phobic fears and may also be important in some symptoms of psychosis. We developed a temporal-difference model of the conditioned avoidance response, an important experimental model of aversive learning that is also a central pharmacological model of psychosis. In the model, dopamine neurons report outcomes that are better than the learner expected, typically arising from reaching safety states, and thereby control the acquisition of a suitable policy. The model accounts for normal conditioned avoidance learning, the persistence of responding in extinction, and critical effects of dopamine blockade. Notably, subjects that experience shocks under dopamine blockade, and hence fail to avoid them, nevertheless develop avoidance responses when both shocks and dopamine blockade are subsequently removed. These postulated roles of dopamine in aversive learning can thus account for many of the effects of dopaminergic modulation seen in laboratory models of psychopathological processes.
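
The core idea, that reaching safety from a negatively valued warning state produces a better-than-expected outcome, can be illustrated with a toy one-step actor-critic. This is a minimal sketch and not the model reported here: the two actions ("stay", "shuttle"), the single warning state, the shock magnitude of -1, the learning rates, and the function name `simulate_avoidance` are all assumptions chosen for illustration, and dopamine blockade is not modelled.

```python
import math
import random


def simulate_avoidance(n_trials=500, alpha_critic=0.1, alpha_actor=0.5,
                       gamma=1.0, seed=0):
    """Toy one-step actor-critic: warning state -> shock or safety.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    v_warning = 0.0                        # critic: value of the warning state
    pref = {"stay": 0.0, "shuttle": 0.0}   # actor: action preferences

    def p_shuttle():
        # Two-action softmax written as a logistic function.
        return 1.0 / (1.0 + math.exp(pref["stay"] - pref["shuttle"]))

    for _ in range(n_trials):
        action = "shuttle" if rng.random() < p_shuttle() else "stay"
        # Assumed outcomes: shuttling reaches safety (0), staying is shocked (-1).
        reward = 0.0 if action == "shuttle" else -1.0
        v_next = 0.0                       # the trial ends after one step
        # Temporal-difference error: outcome relative to expectation.
        delta = reward + gamma * v_next - v_warning
        v_warning += alpha_critic * delta  # critic update
        pref[action] += alpha_actor * delta  # actor update on the chosen action

    return v_warning, p_shuttle()


if __name__ == "__main__":
    value, prob = simulate_avoidance()
    print(f"V(warning) = {value:.3f}, P(shuttle) = {prob:.3f}")
```

In this sketch the warning state first acquires a negative value through shocks. Once it has, shuttling to safety yields a positive temporal-difference error, which is the kind of signal the model attributes to dopamine, and that error strengthens the avoidance response.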