A Statistical Problem in Space and Time: Do Leukemia Cases Come in Clusters?

Abstract
A method is presented for analyzing disease incidence data, specifically childhood leukemia, for evidence of temporal or spatial clustering. The available data relate to cases of childhood leukemia over a 15-year period in Connecticut towns. To minimize the effects of trend and changing population, the 15-year period is subdivided into three 5-year periods, each of which is treated as 5 successive one-year cells. For each such 5-year period for a town, the statistic of interest is m1, the maximum number of cases in any one year, or, more generally, mj, the maximum number of cases in j successive years. Conditional on r, the total number of cases in the 5-year period for the town, the distribution of mj and its expectation and variance can be obtained by classical occupancy procedures under the null hypothesis of homogeneity over the 5-year period. Statistical power is obtained by considering Σ mj, the sum of all the empirical clusters of duration j, where summation is across all the 5-year periods for all the towns. Asymptotic normality is assumed, the overall test statistic being a continuity-corrected chi-square with one degree of freedom. The summary chi-square procedure failed to detect any 1- or 2-year clusters in the childhood leukemia data. Application of this method to Connecticut poliomyelitis and hepatitis data yielded highly significant chi-square values.
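The occupancy calculation described in the abstract can be illustrated with a minimal Python sketch. It assumes that, under the null hypothesis, each of the r cases falls independently and with equal probability into any of the 5 yearly cells, and it obtains the distribution of mj by brute-force enumeration rather than by the paper's own tables. The function names (scan_distribution, mean_var, summary_chi_square) are hypothetical, and the form of the continuity correction, (|Σ mj − Σ E(mj)| − 1/2)² / Σ Var(mj), is an assumption about what the abstract means rather than a formula quoted from it.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def scan_distribution(r, cells=5, j=1):
    """Exact null distribution of m_j given r cases in `cells` equally likely years.

    Enumerates every allocation of r cases to the cells and weights it by its
    multinomial probability (assumed null model: uniform, independent cases).
    """
    dist = {}
    total = Fraction(cells) ** r
    for counts in product(range(r + 1), repeat=cells):
        if sum(counts) != r:
            continue
        # Multinomial coefficient r! / (c1! c2! ... ck!)
        ways = factorial(r)
        for c in counts:
            ways //= factorial(c)
        # m_j: largest number of cases in any j successive years
        m = max(sum(counts[i:i + j]) for i in range(cells - j + 1))
        dist[m] = dist.get(m, Fraction(0)) + Fraction(ways) / total
    return dist


def mean_var(dist):
    """Expectation and variance of a distribution given as {value: probability}."""
    mean = sum(m * p for m, p in dist.items())
    second = sum(m * m * p for m, p in dist.items())
    return mean, second - mean * mean


def summary_chi_square(observed, j=1, cells=5):
    """Assumed summary statistic over town-periods.

    `observed` is a list of (r, m_j) pairs, one per 5-year period per town.
    """
    s_obs = s_exp = s_var = Fraction(0)
    for r, m in observed:
        mean, var = mean_var(scan_distribution(r, cells, j))
        s_obs += m
        s_exp += mean
        s_var += var
    if s_var == 0:
        return 0.0  # no town-period carries any information (all r <= 1)
    dev = max(abs(s_obs - s_exp) - Fraction(1, 2), Fraction(0))
    return float(dev * dev / s_var)
```

For example, with r = 2 and j = 1 the two cases share a year with probability 1/5, so m1 has expectation 6/5 and variance 4/25. The enumeration is only practical for the small values of r typical of a single town-period; the observed (r, mj) pairs passed to summary_chi_square are illustrative inputs, not data from the study.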
