Distributed semi-supervised learning algorithms for random vector functional-link networks with distributed data splitting across samples and features

Abstract

In this paper, we propose two manifold regularization (MR) based distributed semi-supervised learning (DSSL) algorithms that combine the random vector functional-link (RVFL) network with the alternating direction method of multipliers (ADMM) strategy. In DSSL problems, the training data, consisting of labeled and unlabeled samples, are often large-scale or high-dimensional and are split across samples or features. These distributed data are stored separately over a communication network in which each node has access only to its own data and can communicate only with its neighboring nodes. In many such scenarios, centralized algorithms cannot be applied to solve DSSL problems. In our previous work, we proposed an MR based DSSL algorithm, denoted as the D-LapWNN algorithm, to solve DSSL problems with distributed samples. It has been proved that the D-LapWNN algorithm, which combines the wavelet neural network (WNN) with the zero-gradient-sum (ZGS) strategy, is an efficient DSSL algorithm for distributed samples, that is, horizontally partitioned data. The drawback of the D-LapWNN algorithm is that the loss function of each node (agent) in the communication network must be twice continuously differentiable. To extend our previous work and remove this drawback, we propose a horizontally DSSL (HDSSL) algorithm to solve DSSL problems with distributed samples. We then propose a novel vertically DSSL (VDSSL) algorithm to solve DSSL problems with distributed features, that is, vertically partitioned data. To the best of our knowledge, the VDSSL algorithm is the first work focusing on DSSL problems with distributed features. During the learning process of the proposed algorithms, nodes in the communication network exchange only coefficients rather than raw data, so the proposed algorithms are privacy-preserving. Finally, simulations are given to demonstrate the efficiency of the proposed algorithms.
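To make the coefficient-exchange idea concrete, the following is a minimal sketch of consensus ADMM for an RVFL output layer with horizontally partitioned data. It is not the HDSSL algorithm itself: it assumes a purely supervised ridge-type loss (the manifold regularization term is omitted), a random hidden layer shared by all nodes, and an averaging step over all nodes rather than neighbor-only communication. The function names (`rvfl_features`, `consensus_admm`, `centralized_solution`) and the parameters `lam` and `rho` are hypothetical choices made for illustration.

```python
import numpy as np

def rvfl_features(X, W, b):
    """RVFL feature map: direct input links plus fixed random hidden units."""
    return np.hstack([X, np.tanh(X @ W + b)])

def consensus_admm(H_parts, y_parts, lam=1.0, rho=1.0, n_iter=500):
    """Consensus ADMM (scaled form). Node i minimizes
    (1/(2 n_i)) ||H_i b - y_i||^2 + (lam/(2N)) ||b||^2 subject to b = z."""
    N = len(H_parts)
    d = H_parts[0].shape[1]
    z = np.zeros(d)         # consensus coefficients
    B = np.zeros((N, d))    # local coefficients, one row per node
    U = np.zeros((N, d))    # scaled dual variables, one row per node
    # Each node precomputes its own system from its own data only.
    lhs = [H.T @ H / len(H) + (lam / N + rho) * np.eye(d) for H in H_parts]
    rhs = [H.T @ y / len(H) for H, y in zip(H_parts, y_parts)]
    for _ in range(n_iter):
        for i in range(N):
            # Local update: uses local data plus the shared coefficients z.
            B[i] = np.linalg.solve(lhs[i], rhs[i] + rho * (z - U[i]))
        # Only coefficient vectors are combined here, never raw samples.
        z = (B + U).mean(axis=0)
        U += B - z
    return z

def centralized_solution(H_parts, y_parts, lam=1.0):
    """Reference minimizer of the summed objective, for comparison only."""
    d = H_parts[0].shape[1]
    A = lam * np.eye(d) + sum(H.T @ H / len(H) for H in H_parts)
    c = sum(H.T @ y / len(H) for H, y in zip(H_parts, y_parts))
    return np.linalg.solve(A, c)

# Toy data (assumption): 200 samples, 2 inputs, split evenly over 4 nodes.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2
W = rng.normal(size=(2, 8))   # random hidden weights, fixed and shared
b = rng.normal(size=8)        # random hidden biases, fixed and shared
H = rvfl_features(X, W, b)
H_parts = np.array_split(H, 4)
y_parts = np.array_split(y, 4)

beta_admm = consensus_admm(H_parts, y_parts)
beta_ref = centralized_solution(H_parts, y_parts)
```

In this sketch, `beta_admm` can be compared against `beta_ref` to check that the nodes reach the centralized solution while sharing only coefficient vectors. A neighbor-only variant would replace the global average with exchanges along the edges of the communication graph.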