Constructing Reproductive Histories by Linking Vital Records

Abstract
Certificates of 1,449,287 live births and fetal deaths filed in Georgia from 1980 through 1992 were linked to create chronologies that, excluding induced abortions and ectopic pregnancies, constituted the reproductive experience of individual women. The authors initially used a deterministic method (whereby linking rules were not based on probability theory) to link as many records as possible, knowing that some of the linkages would be incorrect. They subsequently used a probabilistic method (whereby evaluation of linkages was developed from probability theory) to evaluate each linkage, and they broke those that were judged to be incorrect. Of the 1.4 million records, 38% did not link to another record. From the remaining records, 369,686 chains of two or more events were constructed. The longest chain included 12 events. Of the chains, 69% included two events; 22% included three events. Longer chains tended to have lower scores for probable validity. The probability-based evaluation of chains affected 3.0% of the records that had been in chains at the end of the deterministic linkage. A greater percentage of records in longer chains were affected by the evaluation. Unfortunately, the small subset of records that were the most difficult to link tended to overrepresent groups with the greatest risk of adverse pregnancy outcomes. Researchers contemplating a similar linkage can anticipate that, for the majority of records, linkage can be accomplished with a relatively straightforward, deterministic approach. Am J Epidemiol 1997; 145: 339–48.
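The two-stage approach described in the abstract can be illustrated with a minimal sketch: first link liberally on an exact key, then score each link and break those that score poorly. Everything specific here is an assumption made for illustration and is not taken from the study: the field names (`mother_dob`, `mother_first`, `maiden_name`, `race`), the m/u probabilities, and the threshold of 0 are hypothetical. The scoring uses Fellegi-Sunter style log-likelihood weights as a stand-in for whatever probabilistic model the authors actually applied.

```python
import math
from collections import defaultdict

# Hypothetical m/u probabilities (illustrative only, not from the study):
# m = P(field agrees | same mother), u = P(field agrees | different mothers).
MU = {
    "maiden_name": (0.95, 0.01),
    "race": (0.98, 0.40),
}

def deterministic_chains(records):
    """Stage 1: group records on an exact key and order each group by event date."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["mother_dob"], r["mother_first"])].append(r)
    # Keep only groups with two or more events; singletons are unlinked records.
    return [sorted(g, key=lambda r: r["event_date"])
            for g in groups.values() if len(g) > 1]

def link_weight(a, b):
    """Sum of log2 agreement or disagreement weights over the comparison fields."""
    total = 0.0
    for field, (m, u) in MU.items():
        if a[field] == b[field]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total

def evaluate_chain(chain, threshold=0.0):
    """Stage 2: break the chain at each consecutive link scoring below threshold."""
    pieces, current = [], [chain[0]]
    for prev, nxt in zip(chain, chain[1:]):
        if link_weight(prev, nxt) >= threshold:
            current.append(nxt)
        else:
            pieces.append(current)
            current = [nxt]
    pieces.append(current)
    return pieces

# Made-up records: all three share the deterministic key, but the third
# disagrees with the others on both comparison fields.
records = [
    {"id": 1, "mother_dob": "1960-05-01", "mother_first": "ANN",
     "maiden_name": "SMITH", "race": "W", "event_date": "1982-03-01"},
    {"id": 2, "mother_dob": "1960-05-01", "mother_first": "ANN",
     "maiden_name": "SMITH", "race": "W", "event_date": "1985-07-15"},
    {"id": 3, "mother_dob": "1960-05-01", "mother_first": "ANN",
     "maiden_name": "JONES", "race": "B", "event_date": "1988-01-10"},
]

chains = deterministic_chains(records)
evaluated = [evaluate_chain(c) for c in chains]
for pieces in evaluated:
    print([[r["id"] for r in piece] for piece in pieces])
```

In this toy data the deterministic pass produces one chain of three events, and the evaluation pass splits off the third record because its link to the second scores below the threshold. Evaluating only consecutive links is a simplification chosen for brevity; a fuller treatment might compare every pair within a chain.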