An algorithm for k-anonymous microaggregation and clustering inspired by the design of distortion-optimized quantizers

作者:

Highlights:

摘要

We present a multidisciplinary solution to the problems of anonymous microaggregation and clustering, illustrated with two applications, namely privacy protection in databases, and private retrieval of location-based information. Our solution is perturbative, is based on the same privacy criterion used in microdata k-anonymization, and provides anonymity through a substantial modification of the Lloyd algorithm, a celebrated quantization design algorithm, endowed with numerical optimization techniques.Our algorithm is particularly suited to the important problem of k-anonymous microaggregation of databases, with a small integer k representing the number of individual respondents indistinguishable from each other in the published database. Our algorithm also exhibits excellent performance in the problem of clustering or macroaggregation, where k may take on arbitrarily large values. We illustrate its applicability in this second, somewhat less common case, by means of an example of location-based services. Specifically, location-aware devices entrust a third party with accurate location information. This party then uses our algorithm to create distortion-optimized, size-constrained clusters, where k nearby devices share a common centroid location, which may be regarded as a distorted version of the original one. The centroid location is sent back to the devices, which use it when contacting untrusted location-based information providers, in lieu of the exact home location, to enforce k-anonymity.We compare the performance of our novel algorithm to the state-of-the-art microaggregation algorithm MDAV, on both synthetic and standardized real data, which encompass the cases of small and large values of k. The most promising aspect of our proposed algorithm is its capability to maintain the same k-anonymity constraint, while outperforming MDAV by a significant reduction in data distortion, in all the cases considered.

论文关键词:k-Anonymity,Privacy,Anonymous microaggregation,MDAV,Location-based services,Distortion-optimized quantizer design,Lloyd algorithm,k-Means method

论文评审过程:Received 27 April 2010, Revised 21 June 2011, Accepted 21 June 2011, Available online 2 July 2011.

论文官网地址:https://doi.org/10.1016/j.datak.2011.06.005