A general framework for privacy-preserving of data publication based on randomized response techniques

Authors:

Highlights:

• A general framework for data publication based on randomized response techniques is proposed. By using matrix decomposition and properties of the Kronecker product, it reduces the computational complexity of reconstructing unbiased estimates from exponential to linear.

• A general approach is proposed for constructing a recovery matrix from an arbitrary perturbation matrix; the resulting matrix can minimize the variance of the unbiased estimates.

• Perturbation and reconstruction algorithms are developed for Boolean and categorical attributes; they can be extended to numerical attributes.

• Both theoretical analysis and experimental results are provided to validate the proposed approach.
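
The role of the Kronecker product in the highlights above can be illustrated with a small sketch. It is not the paper's algorithm: it assumes two Boolean attributes perturbed independently, uses the plain matrix inverse as a stand-in recovery matrix (the variance-minimizing construction is not reproduced here), and all function names (`rr_matrix`, `recover_factored`, and so on) are hypothetical. It relies only on the standard identity that the inverse of a Kronecker product is the Kronecker product of the inverses, so recovery can be applied one attribute at a time without building the full joint matrix.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def inv2(m):
    """Inverse of a 2x2 matrix (assumes a nonzero determinant)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def rr_matrix(p):
    """Hypothetical perturbation matrix: report the true bit with
    probability p, the flipped bit otherwise (p must differ from 0.5)."""
    return [[p, 1 - p], [1 - p, p]]

def recover_factored(recovery, dist):
    """Apply recovery[0] (x) recovery[1] (x) ... to dist one attribute
    at a time, without materializing the joint 2^n x 2^n matrix."""
    n = len(recovery)
    out = list(dist)
    for k, r in enumerate(recovery):
        stride = 2 ** (n - 1 - k)  # attribute 0 is the most significant bit
        new = out[:]
        for idx in range(len(out)):
            if (idx // stride) % 2 == 0:
                i0, i1 = idx, idx + stride
                new[i0] = r[0][0] * out[i0] + r[0][1] * out[i1]
                new[i1] = r[1][0] * out[i0] + r[1][1] * out[i1]
        out = new
    return out

# Assumed joint distribution over two Boolean attributes, ordered 00, 01, 10, 11.
true_dist = [0.4, 0.1, 0.2, 0.3]
p1, p2 = rr_matrix(0.8), rr_matrix(0.7)

# Expected distribution of the perturbed (published) data.
observed = matvec(kron(p1, p2), true_dist)

# Recovery with the full joint matrix versus attribute-by-attribute.
est_full = matvec(kron(inv2(p1), inv2(p2)), observed)
est_factored = recover_factored([inv2(p1), inv2(p2)], observed)
```

Because `observed` here is the exact expected distribution, both estimates reproduce `true_dist` up to floating-point error. With empirical frequencies from real perturbed records, the same computation would give an estimate that is unbiased but noisy.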

Keywords: Privacy preserving, Randomized response, Data publishing

Review history: Received 20 July 2019, Revised 27 July 2020, Accepted 28 September 2020, Available online 29 September 2020, Version of Record 7 October 2020.

Official paper URL: https://doi.org/10.1016/j.is.2020.101648