A Statistical Framework for Ecological and Aggregate Studies

Abstract
Summary: Inference from studies that use data at the level of the area, rather than at the level of the individual, is more difficult for a variety of reasons. Some of these difficulties arise because exposures (including confounders) frequently vary within areas. In the most basic form of ecological study, the outcome measure is regressed against a simple area-level summary of exposure. In the aggregate data approach, a survey of exposures and confounders is taken within each area. An alternative approach is to assume a parametric form for the within-area exposure distribution. We provide a framework within which ecological and aggregate data studies may be viewed, and we review some approaches to inference in such studies, clarifying the assumptions on which they are based. We provide general strategies for analysis, including an estimator based on Monte Carlo integration that allows inference under a general risk–exposure model. We also consider the implications of introducing random effects, and of confounding and errors in variables.
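As a rough illustration of the Monte Carlo integration idea mentioned in the summary, the Python sketch below averages an individual-level risk function over draws from an assumed within-area exposure distribution. The logistic risk model, the normal exposure distribution, the parameter values, and the function names are all hypothetical choices made for illustration; this is not the estimator developed in the paper.

```python
import numpy as np


def individual_risk(x, b0, b1):
    """Individual-level risk at exposure x (logistic form: an assumption)."""
    return 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))


def area_risk_mc(mu, sigma, b0, b1, n_draws=100_000, seed=0):
    """Approximate the area-level risk E[risk(X)] by Monte Carlo.

    X is assumed to follow a Normal(mu, sigma^2) distribution within the
    area; any other parametric form could be substituted here.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, size=n_draws)  # simulated within-area exposures
    return individual_risk(x, b0, b1).mean()  # average risk over the draws


# Hypothetical parameter values, for illustration only.
b0, b1 = -2.0, 1.0
mu, sigma = 0.5, 1.0

averaged = area_risk_mc(mu, sigma, b0, b1)  # risk averaged over exposures
plug_in = individual_risk(mu, b0, b1)       # risk evaluated at the area mean

print(f"Monte Carlo area-level risk: {averaged:.4f}")
print(f"Risk at mean exposure:       {plug_in:.4f}")
```

Under these assumptions the two quantities differ because the risk function is nonlinear in exposure, which is one way of seeing why a regression on the area mean alone can mislead when exposures vary within areas. Setting `sigma` to zero removes the within-area variation, and the two values then coincide.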