Identity disclosure protection: A data reconstruction approach for privacy-preserving data mining

Abstract

Identity disclosure is one of the most serious privacy concerns in today's information age. A well-known method for protecting against identity disclosure is k-anonymity. A dataset provides k-anonymity protection if the information for each individual in the dataset cannot be distinguished from that of at least k − 1 other individuals whose information also appears in the dataset. However, k-anonymity has a flaw that can still allow an intruder to discern the confidential information of individuals in the anonymized data. To overcome this problem, we propose a data reconstruction approach to achieve k-anonymity protection in predictive data mining. In this approach, the potentially identifying attributes are first masked using aggregation (for numeric data) and swapping (for nominal data). A genetic algorithm technique is then applied to the masked data to find a good subset of it. This subset is then replicated to form the released dataset that satisfies the k-anonymity constraint.
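
To make the k-anonymity condition and the final replication step concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names `is_k_anonymous` and `replicate`, the attribute names, and the sample records are all hypothetical, and the masking and genetic algorithm steps are omitted entirely. It assumes that replication means repeating each record of the chosen subset k times, which the abstract does not spell out.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    appears in at least k records."""
    counts = Counter(
        tuple(record[attr] for attr in quasi_identifiers) for record in records
    )
    return all(count >= k for count in counts.values())


def replicate(subset, k):
    """Repeat each record k times (assumed reading of the replication step)."""
    return [dict(record) for record in subset for _ in range(k)]


# Hypothetical, already-masked subset; values are invented for illustration.
masked_subset = [
    {"age": 30, "zip": "537**", "diagnosis": "flu"},
    {"age": 45, "zip": "538**", "diagnosis": "asthma"},
]

released = replicate(masked_subset, 3)
```

Under these assumptions, `masked_subset` alone is not 3-anonymous on `("age", "zip")`, while `released` is, because each combination of potentially identifying values now occurs three times.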

Keywords: Privacy, Identity disclosure, k-Anonymity, Data mining, Genetic algorithm

Article history: Received 1 July 2008, Revised 15 March 2009, Accepted 7 July 2009, Available online 15 July 2009.

Official paper URL: https://doi.org/10.1016/j.dss.2009.07.003