Privacy-preserving SOM-based recommendations on horizontally distributed data

Abstract

Collaborative filtering algorithms need sufficient data to produce accurate predictions. Because of the nature of online shopping and the growing number of online vendors, different customers’ preferences about the same products may be distributed among various companies, even competing ones. Companies that hold data on too few users might therefore decide to combine their data so that they can offer accurate predictions with acceptable online performance. However, they do not want to divulge their data, which are considered confidential and valuable. Moreover, disclosing users’ preferences is not legal; if privacy is protected, however, the companies can still collaborate to produce accurate predictions. We propose a privacy-preserving scheme for providing recommendations on data that are horizontally partitioned among multiple parties. To improve online performance, the parties cluster their distributed data off-line without greatly jeopardizing their privacy. They then estimate predictions using a k-nearest neighbor approach while preserving their privacy. We show that the proposed method preserves data owners’ privacy and produces predictions efficiently. Through several experiments on real data sets, we analyze the accuracy of our scheme. Our empirical results show that, with horizontally distributed data, it is still possible to estimate accurate predictions efficiently while maintaining data owners’ confidentiality.