The applicability of the perturbation based privacy preserving data mining for real-world data

Authors:

Highlights:

Abstract

The perturbation method has been studied extensively for privacy-preserving data mining. In this method, random noise drawn from a known distribution is added to the privacy-sensitive data before the data is sent to the data miner. The data miner then reconstructs an approximation of the original data distribution from the perturbed data and uses the reconstructed distribution for data mining. Because noise is added, perturbation-based approaches always trade loss of information against preservation of privacy. The question is to what extent users are willing to compromise their privacy. The answer varies from individual to individual, since attitudes towards privacy differ with customs and cultures. Unfortunately, current perturbation-based privacy-preserving data mining techniques do not allow individuals to choose their desired privacy levels, which is a drawback because privacy is a personal choice. In this paper, we propose an individually adaptable perturbation model that enables individuals to choose their own privacy levels. The effectiveness of the new approach is demonstrated by experiments on both synthetic and real-world data sets. Based on these experiments, we suggest a simple yet effective and efficient technique for building data mining models from perturbed data.
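To make the idea of additive perturbation with individually chosen privacy levels concrete, here is a minimal Python sketch. It is not the model proposed in the paper, whose details the abstract does not give. It assumes zero-mean Gaussian noise, three hypothetical noise levels that individuals pick from, and a simple moment-based estimate (mean and variance only) standing in for full distribution reconstruction. The function names `perturb` and `estimate_original_moments` are illustrative.

```python
import random
import statistics


def perturb(values, noise_sds, rng):
    """Add zero-mean Gaussian noise to each value.

    noise_sds[i] is the noise standard deviation chosen by individual i
    (assumption: a larger value means a higher privacy level).
    """
    return [v + rng.gauss(0.0, sd) for v, sd in zip(values, noise_sds)]


def estimate_original_moments(perturbed, noise_sds):
    """Estimate the mean and variance of the original data.

    The noise is zero-mean and independent of the data, so the perturbed
    mean estimates the original mean, and the perturbed variance is
    roughly the original variance plus the average noise variance.
    """
    mean = statistics.fmean(perturbed)
    avg_noise_var = statistics.fmean([sd * sd for sd in noise_sds])
    var = max(statistics.pvariance(perturbed) - avg_noise_var, 0.0)
    return mean, var


rng = random.Random(0)
n = 20000

# Synthetic "sensitive" attribute, for illustration only.
original = [rng.gauss(50.0, 10.0) for _ in range(n)]

# Hypothetical privacy levels: each individual picks low, medium, or high noise.
noise_sds = [rng.choice([2.0, 10.0, 20.0]) for _ in range(n)]

perturbed = perturb(original, noise_sds, rng)
est_mean, est_var = estimate_original_moments(perturbed, noise_sds)

print(f"estimated mean: {est_mean:.2f}, estimated variance: {est_var:.2f}")
```

The data miner in this sketch sees only `perturbed` and the noise levels, never `original`. Individuals who choose a larger noise standard deviation reveal less about their own value, while the aggregate statistics remain estimable because the noise distribution is known.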

Keywords: Data mining, Privacy, Security

Article history: Available online 18 July 2007.

Official paper URL: https://doi.org/10.1016/j.datak.2007.06.011