Modality adaptation in multimodal data

Authors:

Highlights:

Abstract

Multimodal data has recently received much attention. Classical machine learning assumes that all data comes from a single modality, whereas in multimodal machine learning the information comes from different modalities. Transferring or fusing knowledge across modalities is an important step in multimodal machine learning, and in this step the different marginal distributions of the modalities must be taken into account. However, modality adaptation has received little attention in recent years. This work is motivated by the need for modality adaptation that effectively encodes the shared common or complementary knowledge in multimodal data. To reduce the modality shift, we present a new perspective on modality adaptation. Simply applying existing domain adaptation techniques to reduce the modality shift is problematic because those techniques are insufficiently capable of preserving complementary knowledge. Our proposed modality adaptation simultaneously considers both the shared and complementary knowledge of each modality while preserving the discriminative ability of each modality in the label space. To evaluate the proposed approach, we apply it to two multimodal applications: multi-view object detection and RGBD image semantic segmentation. Our results show that the proposed modality adaptation technique successfully transfers and fuses knowledge.
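To make the abstract's three design goals concrete, the following is a minimal PyTorch-style sketch, not the authors' implementation: it pairs (i) a distribution-alignment term on shared features to reduce the modality shift (here a simple mean-matching surrogate for MMD), (ii) an orthogonality penalty that keeps each modality's private features complementary to the shared ones, and (iii) a classification loss that preserves discriminative ability in the label space. All module names, dimensions, and loss weights below are illustrative assumptions.

```python
# Hedged sketch of a modality-adaptation objective; the paper's actual
# formulation may differ in architecture, alignment measure, and weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityBranch(nn.Module):
    """Splits one modality's input into shared and private (complementary) features."""
    def __init__(self, in_dim: int, feat_dim: int):
        super().__init__()
        self.shared = nn.Linear(in_dim, feat_dim)   # to be aligned across modalities
        self.private = nn.Linear(in_dim, feat_dim)  # modality-specific knowledge

    def forward(self, x):
        return self.shared(x), self.private(x)

def mean_alignment(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Squared distance between batch means: a crude surrogate for the
    marginal-distribution (modality) shift between shared features."""
    return (a.mean(dim=0) - b.mean(dim=0)).pow(2).sum()

def orthogonality(shared: torch.Tensor, private: torch.Tensor) -> torch.Tensor:
    """Penalize overlap so private features stay complementary, not redundant."""
    return (shared.t() @ private).pow(2).mean()

# Hypothetical RGB and depth branches with illustrative dimensions.
rgb_branch, depth_branch = ModalityBranch(128, 64), ModalityBranch(96, 64)
classifier = nn.Linear(64 * 3, 10)  # fuses shared + both private feature sets

x_rgb, x_depth = torch.randn(32, 128), torch.randn(32, 96)
labels = torch.randint(0, 10, (32,))

s_rgb, p_rgb = rgb_branch(x_rgb)
s_depth, p_depth = depth_branch(x_depth)

fused = torch.cat([(s_rgb + s_depth) / 2, p_rgb, p_depth], dim=1)
loss = (
    F.cross_entropy(classifier(fused), labels)            # discriminative ability
    + 0.1 * mean_alignment(s_rgb, s_depth)                # reduce modality shift
    + 0.01 * (orthogonality(s_rgb, p_rgb)
              + orthogonality(s_depth, p_depth))          # preserve complementarity
)
loss.backward()
```

The design choice to align only the shared features, while leaving the private features free but orthogonal, is one plausible way to encode "shared and complementary knowledge" at the same time; a pure domain-adaptation loss would instead align everything and risk erasing the complementary part.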

Keywords: Modality adaptation, Multimodal learning, Knowledge transferring, Knowledge fusion

Article history: Received 12 December 2020, Revised 17 April 2021, Accepted 25 April 2021, Available online 28 April 2021, Version of Record 5 May 2021.

Paper link: https://doi.org/10.1016/j.eswa.2021.115126