Jon Wakefield

The Ecological Fallacy in Spatial Regression

 

In an ecological study, outcome and exposure/confounder data are available on groups of individuals rather than on the individuals themselves. Such studies are logistically appealing, since they can make use of routinely available data and offer increased power and exposure contrasts, but they suffer from a number of problems arising from within-group variability in exposures and confounders; the umbrella term for these problems is ecological bias. Ecological bias can appear in very simple situations. For example, the Scottish lip cancer data have been analyzed by numerous authors, all of whom used an ecological mean model that has very limited interpretation when viewed from an individual-level perspective. Much of the methodological development for spatial count data has concentrated on proposing models for residual spatial dependence, but here we emphasize the futility of this exercise in a regression setting if the mean function is incorrectly specified. Hyperprior specification will also be discussed. The use of hierarchical models per se cannot correct for ecological bias; the only solution is to supplement ecological data with individual-level samples. This talk will first detail the different sources of ecological bias and then describe and compare various study designs and estimation methods. In particular, the ecological embedded case-control study of Haneuse and Wakefield (2005) will be outlined. Simulated data will be used to compare various study designs. A running theme of the talk will be the need to think about where to concentrate modeling efforts: in the ecological regression context, this is in the mean model and in careful prior specification.
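
The point that a misspecified mean function causes bias even in very simple situations can be illustrated with a small, noise-free sketch. It is not taken from the talk. It assumes a hypothetical binary exposure and a log-linear individual-level risk model, with made-up values for `beta0`, `beta1`, and the area-level exposure proportions `xbar` (these are not the Scottish lip cancer data). Under these assumptions the exact area-level risk is linear in the exposure proportion, so a naive log-linear ecological regression on the proportion does not recover the individual-level coefficient, while a mean model obtained by aggregating the individual-level model does.

```python
import math


def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# ASSUMPTION: individual-level model risk = exp(beta0 + beta1 * x), x in {0, 1}.
# The parameter values are hypothetical and chosen only for illustration.
beta0 = math.log(0.01)  # baseline risk of 1%
beta1 = math.log(2.0)   # exposed individuals have twice the risk

# ASSUMPTION: hypothetical areas whose exposed proportion runs from 0 to 0.24.
xbar = [i / 100 for i in range(25)]

# Exact area-level risk: average the individual risks over the exposed and
# unexposed individuals in each area (no sampling noise, no confounding).
area_risk = [math.exp(beta0) * ((1 - p) + p * math.exp(beta1)) for p in xbar]

# Naive ecological mean model: log(area risk) = a + b * xbar.
_, naive_slope = ols(xbar, [math.log(r) for r in area_risk])

# Aggregated mean model: area risk = exp(beta0) * (1 + xbar * (exp(beta1) - 1)),
# which is linear in xbar, so a straight-line fit recovers the relative risk.
intercept, slope = ols(xbar, area_risk)
recovered_beta1 = math.log(1 + slope / intercept)

print(f"individual-level beta1:     {beta1:.3f}")
print(f"naive ecological slope:     {naive_slope:.3f}")
print(f"aggregated-model estimate:  {recovered_beta1:.3f}")
```

The sketch isolates only the bias that comes from within-area variability in a single exposure. It leaves out confounding, sampling variability, and residual spatial dependence, all of which matter in a real analysis.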