Abstract
There is a growing need to link records across data sets so that data about the same individual, organization, or event can be connected and analyzed further. At the same time, we must do a better job of protecting the privacy of the individuals those records identify. Ideally, linkage would be performed not on the actual data but on an anonymized form of it, without diminishing the ability to link records whose identifiers are only “close” to each other, rather than equal, because of typical recording errors. This paper reviews existing proposals for how such anonymized string comparisons might be accomplished and demonstrates that the existing methods have various operational deficiencies. It therefore argues that new, more capable methods are needed.
