Contrastive semantic disentanglement in latent space for generalized zero-shot learning

Abstract:

The goal of generalized zero-shot learning (GZSL) is to train a model that can classify samples from both seen and unseen categories when labeled samples are available only for the seen categories. In this paper, we propose a GZSL approach based on conditional generative models that adopts a contrastive disentanglement learning framework to disentangle visual information in the latent space. Specifically, our model encodes original and generated visual features into a latent space in which these features are disentangled into semantic-related and semantic-unrelated representations. The proposed contrastive learning framework leverages both class-level and instance-level supervision: it formulates a contrastive loss over the semantic-related information at the instance level, and it exploits the semantic-unrelated representations together with the corresponding semantic information to form negative sample pairs at the class level, further facilitating disentanglement. GZSL classification is then performed by training a supervised model (e.g., a softmax classifier) on the semantic-related representations alone. Experimental results show that our model achieves state-of-the-art performance on several benchmark datasets, especially for unseen categories. The source code of the proposed model is available at: https://github.com/fwt-team/GZSL.
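To make the instance-level objective described above concrete, the following is a minimal NumPy sketch of an InfoNCE-style contrastive loss, where each sample's semantic-related latent code is pulled toward the code of its own generated counterpart and pushed away from those of other samples in the batch. The function names, the temperature value, and the pairing scheme are illustrative assumptions, not the authors' actual implementation:

```python
import numpy as np

def cosine_sim(a, b):
    # Pairwise cosine similarity between rows of a and rows of b.
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def instance_contrastive_loss(z_real, z_gen, tau=0.1):
    """InfoNCE-style loss (illustrative): row i of z_real (semantic-related
    code of a real feature) is a positive pair with row i of z_gen (code of
    its generated counterpart); all other rows act as in-batch negatives."""
    sim = cosine_sim(z_real, z_gen) / tau          # (B, B) similarity matrix
    sim = sim - sim.max(axis=1, keepdims=True)     # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    # Positives sit on the diagonal; minimize their negative log-likelihood.
    return -np.mean(np.diag(log_prob))
```

With well-aligned pairs the diagonal dominates and the loss approaches zero; mismatched pairs drive it up, which is the pressure that concentrates class-discriminative information in the semantic-related part of the latent code.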

Keywords: Generalized zero-shot learning, Feature disentanglement, Contrastive learning, Generative model, Wasserstein GAN

Article history: Received 9 May 2022, Revised 24 September 2022, Accepted 25 September 2022, Available online 30 September 2022, Version of Record 14 October 2022.

DOI: https://doi.org/10.1016/j.knosys.2022.109949