Privacy-preserving collaborative fuzzy clustering

作者:

Highlights:

摘要

The proliferation of Internet of Things devices has contributed to the emergence of participatory sensing (PS), where multiple individuals collect and report their data to a third-party data mining cloud service for analysis. The need for the participants to collaborate with each other for this analysis gives rise to the concept of collaborative learning. However, the possibility of the cloud service being semi-honest poses a key challenge: preserving the participants' privacy.In this paper, we address this challenge with a two-stage scheme called RG+RP: in the first stage, each participant perturbs his/her data by passing the data through a nonlinear function called repeated Gompertz (RG); in the second stage, he/she then projects his/her perturbed data to a lower dimension in an (almost) distance-preserving manner, using a specific random projection (RP) matrix. The nonlinear RG function is designed to mitigate maximum a posteriori (MAP) estimation attacks, while random projection resists independent component analysis (ICA) attacks and ensures clustering accuracy. The proposed two-stage randomisation scheme is assessed in terms of its recovery resistance to MAP estimation attacks. Preliminary theoretical analysis as well as experimental results on synthetic and real-world datasets indicate that RG+RP has better recovery resistance to MAP estimation attacks than most state-of-the-art techniques. For clustering, fuzzy c-means (FCM) is used. Results using seven cluster validity indices, root mean squared error (RMSE) and accuracy ratio show that clustering results based on two-stage-perturbed data are comparable to the clustering results based on raw data — this confirms the utility of our privacy-preserving scheme when used with either FCM or HCM.

论文关键词:Participatory sensing,Collaborative learning,Clustering,Privacy-preserving,Randomisation

论文评审过程:Received 17 November 2016, Revised 18 November 2017, Accepted 5 May 2018, Available online 12 May 2018, Version of Record 27 July 2018.

论文官网地址:https://doi.org/10.1016/j.datak.2018.05.002