Counterfactual Choice and Learning in a Neural Network Centered on Human Lateral Frontopolar Cortex

Abstract
Decision making and learning in a real-world context require organisms to track not only the choices they make and the outcomes that follow but also other untaken, or counterfactual, choices and their outcomes. Although the neural system responsible for tracking the value of choices actually taken is increasingly well understood, whether a neural system tracks counterfactual information is currently unclear. Using a three-alternative decision-making task, a Bayesian reinforcement-learning algorithm, and fMRI, we investigated the coding of counterfactual choices and prediction errors in the human brain. Rather than representing evidence favoring multiple counterfactual choices, lateral frontopolar cortex (lFPC), dorsomedial frontal cortex (DMFC), and posteromedial cortex (PMC) encode the reward-based evidence favoring the best counterfactual option at future decisions. In addition to encoding counterfactual reward expectations, the network carries a signal for learning about counterfactual options when feedback is available: a counterfactual prediction error. Unlike other brain regions that have been associated with the processing of counterfactual outcomes, counterfactual prediction errors within the identified network cannot be related to regret theory. Furthermore, individual variation in counterfactual choice-related activity and prediction error-related activity, respectively, predicts variation in the propensity to switch to profitable choices in the future and the ability to learn from hypothetical feedback. Taken together, these data provide both neural and behavioral evidence to support the existence of a previously unidentified neural system responsible for tracking both counterfactual choice options and their outcomes.

Author Summary
Reinforcement learning (RL) models, which formally describe how we learn from direct experience, can explain a diverse array of animal behavior. Considering alternative outcomes that could have been obtained but were not falls outside the purview of traditional RL models. However, such counterfactual thinking can considerably accelerate learning in real-world contexts, ranging from foraging in the wild to investing in financial markets. In this study, we show that three brain regions in humans (frontopolar, dorsomedial frontal, and posteromedial cortex) play a special role in tracking “what might have been” and whether it is worth choosing such foregone options in the future. These regions encode the net benefit of choosing the next-best alternative in the future, suggesting that the next-best alternative may be privileged over inferior alternatives in the human brain. When people subsequently witness feedback indicating what would have happened had they made a different choice, these same regions encode a key learning signal: a prediction error that signals the discrepancy between what would have happened and what people believed could have happened. Further analysis indicates these brain regions exploit counterfactual information to guide future changes in behavior. Such functions may be compromised in addiction and psychiatric conditions characterized by an inability to alter maladaptive behavior.
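
To make the notion of a counterfactual prediction error concrete, the following is a minimal illustrative sketch and not the model used in the study. The study used a Bayesian reinforcement-learning algorithm; this sketch swaps in a simple delta-rule update purely for illustration. The option names, reward values, learning rate, and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    expected_reward: float  # learner's current estimate (hypothetical units)


def prediction_error(observed: float, expected: float) -> float:
    """Discrepancy between what happened (or would have happened) and what was expected."""
    return observed - expected


def delta_update(expected: float, observed: float, learning_rate: float = 0.2) -> float:
    """Assumed delta-rule update; a stand-in for the study's Bayesian learner."""
    return expected + learning_rate * prediction_error(observed, expected)


def best_counterfactual(options: list[Option], chosen_name: str) -> Option:
    """Return the unchosen option with the highest expected reward."""
    unchosen = [o for o in options if o.name != chosen_name]
    return max(unchosen, key=lambda o: o.expected_reward)


# Three-alternative example with made-up expectations.
options = [Option("A", 0.6), Option("B", 0.5), Option("C", 0.2)]
chosen = "A"
best_cf = best_counterfactual(options, chosen)

# Hypothetical feedback revealed for every option, chosen or not.
feedback = {"A": 1.0, "B": 0.0, "C": 1.0}

# Counterfactual prediction error for the best unchosen option.
cf_error = prediction_error(feedback[best_cf.name], best_cf.expected_reward)

# Learn from both experienced and counterfactual feedback.
updated = {o.name: delta_update(o.expected_reward, feedback[o.name]) for o in options}
```

In this toy setting, the learner updates its estimates for the unchosen options from feedback it never directly experienced, which is the sense in which counterfactual information can accelerate learning.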