Optimal tag suppression for privacy protection in the semantic Web

Authors:

Highlights:

Abstract

Leveraging the principle of data minimization, we propose tag suppression, a privacy-enhancing technique for the semantic Web. In our approach, users tag resources on the Web, thereby revealing their personal preferences. However, to prevent privacy attackers from profiling them based on their interests, users may wish to refrain from tagging certain resources. Consequently, tag suppression protects user privacy to a certain extent, but at the cost of the semantic loss incurred by suppressing tags. In a nutshell, our technique poses a trade-off between privacy and suppression. In this paper, we investigate this trade-off in a mathematically systematic fashion and provide an extensive theoretical analysis. We measure user privacy as the entropy of the user's tag distribution after the suppression of some tags. Equipped with a quantitative measure of both privacy and utility, we find a closed-form solution to the problem of optimal tag suppression. Experimental results on a real-world tagging application show how our approach may contribute to privacy protection.
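To make the privacy measure concrete, the following is a minimal sketch of computing the Shannon entropy of a user's tag distribution before and after some tags are suppressed. It is an illustration only, not the paper's closed-form optimal strategy: the tag categories, the counts, the choice of which tags to suppress, and the helper names `shannon_entropy` and `suppress` are all assumptions made for this example.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Entropy (in bits) of the relative-frequency distribution of tag counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(
        (c / total) * math.log2(c / total)
        for c in counts.values()
        if c > 0
    )


def suppress(counts, suppressed):
    """Return the tag counts that remain after withholding `suppressed` tags."""
    remaining = Counter(counts)
    remaining.subtract(suppressed)
    if any(v < 0 for v in remaining.values()):
        raise ValueError("cannot suppress more tags than the user has")
    return +remaining  # unary plus drops categories whose count fell to zero


# Hypothetical user profile: tag counts per interest category.
profile = Counter({"music": 8, "travel": 1, "sports": 1})

# Hypothetical suppression choice: withhold 6 of the 8 "music" tags.
withheld = Counter({"music": 6})
apparent = suppress(profile, withheld)

# Fraction of the user's tags that were withheld (the cost side of the trade-off).
suppression_rate = sum(withheld.values()) / sum(profile.values())

entropy_before = shannon_entropy(profile)
entropy_after = shannon_entropy(apparent)

print(f"entropy before: {entropy_before:.3f} bits")
print(f"entropy after:  {entropy_after:.3f} bits")
print(f"suppression rate: {suppression_rate:.2f}")
```

In this toy profile, withholding tags from the dominant category flattens the apparent distribution and so raises its entropy, at the price of a higher suppression rate. Which tags to withhold for a given suppression rate is the optimization problem the paper addresses; this sketch does not attempt to solve it.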

Keywords: Information privacy, Privacy-enhancing technology, Shannon entropy, Privacy–suppression trade-off, Semantic Web, Tagging systems

Review history: Received 10 December 2010, Revised 24 July 2012, Accepted 28 July 2012, Available online 24 August 2012.

Official URL: https://doi.org/10.1016/j.datak.2012.07.004