Real-World Scene Representations in High-Level Visual Cortex: It's the Spaces More Than the Places

Abstract
Real-world scenes are incredibly complex and heterogeneous, yet we are able to identify and categorize them effortlessly. In humans, the ventral temporal parahippocampal place area (PPA) has been implicated in scene processing, but scene information is contained in many visual areas, leaving their specific contributions unclear. Although early theories of PPA emphasized its role in spatial processing, more recent reports of its function have emphasized semantic or contextual processing. Here, using functional imaging, we reconstructed the organization of scene representations across human ventral visual cortex by analyzing the distributed response to 96 diverse real-world scenes. We found that, although individual scenes could be decoded in both PPA and early visual cortex (EVC), the structure of representations in these regions was vastly different. In both regions, spatial rather than semantic factors defined the structure of representations. However, in PPA, representations were defined primarily by the spatial factor of expanse (open, closed) and in EVC primarily by distance (near, far). Furthermore, independent behavioral ratings of expanse and distance correlated strongly with representations in PPA and peripheral EVC, respectively. In neither region was content (manmade, natural) a major contributor to the overall organization. Furthermore, the response of PPA could not be used to decode the high-level semantic category of scenes even when spatial factors were held constant, nor could category be decoded across different distances. These findings demonstrate, contrary to recent reports, that the response PPA primarily reflects spatial, not categorical or contextual, aspects of real-world scenes.