ℓ-diversity

Abstract
Publishing data about individuals without revealing sensitive information about them is an important problem. In recent years, a new definition of privacy called k-anonymity has gained popularity. In a k-anonymized dataset, each record is indistinguishable from at least k − 1 other records with respect to certain "identifying" attributes. In this paper we show with two simple attacks that a k-anonymized dataset has some subtle but severe privacy problems. First, we show that an attacker can discover the values of sensitive attributes when there is little diversity in those sensitive attributes. Second, attackers often have background knowledge, and we show that k-anonymity does not guarantee privacy against attackers using background knowledge. We give a detailed analysis of these two attacks and we propose a novel and powerful privacy definition called ℓ-diversity. In addition to building a formal foundation for ℓ-diversity, we show in an experimental evaluation that ℓ-diversity is practical and can be implemented efficiently.
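
The first attack described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the table, the attribute names (`zip`, `age`, `condition`), and the helper functions are all hypothetical. The abstract does not state the formal definition of ℓ-diversity, so the last check assumes the simplest reading, in which every group of indistinguishable records must contain at least ℓ distinct sensitive values.

```python
from collections import defaultdict


def group_by_quasi_identifiers(records, quasi_identifiers):
    """Group records that share the same values on the identifying attributes."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[attr] for attr in quasi_identifiers)
        groups[key].append(record)
    return groups


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every record shares its identifying values with at least k - 1 others."""
    groups = group_by_quasi_identifiers(records, quasi_identifiers)
    return all(len(group) >= k for group in groups.values())


def homogeneous_groups(records, quasi_identifiers, sensitive):
    """Return the groups whose records all carry one and the same sensitive value."""
    groups = group_by_quasi_identifiers(records, quasi_identifiers)
    return {
        key: group[0][sensitive]
        for key, group in groups.items()
        if len({r[sensitive] for r in group}) == 1
    }


def is_distinct_l_diverse(records, quasi_identifiers, sensitive, l):
    """Assumed simple reading: each group holds at least l distinct sensitive values."""
    groups = group_by_quasi_identifiers(records, quasi_identifiers)
    return all(len({r[sensitive] for r in group}) >= l for group in groups.values())


# Hypothetical, made-up table with generalized identifying attributes.
records = [
    {"zip": "130**", "age": "<30", "condition": "heart disease"},
    {"zip": "130**", "age": "<30", "condition": "heart disease"},
    {"zip": "130**", "age": "<30", "condition": "heart disease"},
    {"zip": "148**", "age": ">=40", "condition": "flu"},
    {"zip": "148**", "age": ">=40", "condition": "cancer"},
    {"zip": "148**", "age": ">=40", "condition": "heart disease"},
]
QI = ["zip", "age"]

k_anonymous = is_k_anonymous(records, QI, 3)
leaks = homogeneous_groups(records, QI, "condition")
l_diverse = is_distinct_l_diverse(records, QI, "condition", 2)
```

In this made-up table the data is 3-anonymous, yet anyone known to fall in the first group can be linked to a single sensitive value, which is the lack-of-diversity problem the abstract describes. The same table fails the assumed diversity check for ℓ = 2.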
