Thoughts on k-anonymization

Authors:

Abstract

k-Anonymity is a method for protecting privacy by ensuring that released data cannot be traced to an individual. In a k-anonymous dataset, any identifying information that appears occurs in at least k tuples. Many algorithms have recently been proposed to achieve optimal and practical k-anonymity, each with its own assumptions and restrictions, and each evaluated with different quality metrics. This paper evaluates a family of clustering-based algorithms that are more flexible and that even attempt to improve precision by ignoring the restrictions of user-defined Domain Generalization Hierarchies. The evaluation of the new approaches with respect to cost metrics shows that a metric may behave differently under different algorithms and may not correlate with the accuracy that some applications achieve on the output data.
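
The k-anonymity condition stated in the abstract can be illustrated with a minimal sketch. This is not one of the clustering-based algorithms the paper evaluates; it only checks whether a table that has already been generalized satisfies the definition. The function names, the `quasi_identifiers` parameter, and the toy records are hypothetical and chosen for illustration only.

```python
from collections import Counter


def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of identifying values."""
    return Counter(
        tuple(record[attr] for attr in quasi_identifiers) for record in records
    )


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of identifying values occurs in >= k records."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return all(count >= k for count in sizes.values())


# Hypothetical toy table, already generalized (age ranges, truncated zip codes).
records = [
    {"age": "20-29", "zip": "130**", "diagnosis": "flu"},
    {"age": "20-29", "zip": "130**", "diagnosis": "asthma"},
    {"age": "30-39", "zip": "148**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "148**", "diagnosis": "diabetes"},
]

print(is_k_anonymous(records, ["age", "zip"], k=2))
print(is_k_anonymous(records, ["age", "zip"], k=3))
```

In this toy table each (age, zip) combination is shared by exactly two records, so the check passes for k = 2 and fails for k = 3. The check says nothing about how much information the generalization destroyed, which is what the cost metrics discussed in the abstract are meant to measure.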

Keywords: Privacy, k-Anonymity, Algorithms, Cost metrics

Review history: Received 10 January 2007, Revised 10 January 2007, Accepted 15 March 2007, Available online 2 April 2007.

Official paper URL: https://doi.org/10.1016/j.datak.2007.03.009