Noise compensation in a person verification system using face and multiple speech features

Abstract

In this paper, we demonstrate that the use of a recently proposed feature set, termed Maximum Auto-Correlation Values, which utilizes information from the source part of the speech signal, significantly improves the robustness of a text-independent identity verification system. We also propose an adaptive fusion technique for integrating audio and visual information in a multi-modal verification system. The proposed technique explicitly measures the quality of the speech signal and adjusts the contribution of the speech modality to the final verification decision accordingly. Results on the VidTIMIT database indicate that the proposed approach outperforms existing adaptive and non-adaptive fusion techniques. Over a wide range of audio SNRs, the multi-modal system utilizing the proposed technique always performs better than the face modality alone.
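
The idea of quality-dependent fusion can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not specify the quality measure, the mapping from quality to weight, or the fusion rule. The sketch assumes a weighted sum of two per-modality scores, an SNR estimate in dB as the quality measure, and a sigmoid mapping. The function names (`speech_weight`, `fuse`, `verify`) and all parameter values are hypothetical.

```python
import math


def speech_weight(snr_db, midpoint_db=10.0, slope=0.3, max_weight=0.5):
    """Map an estimated audio SNR (dB) to a speech weight in (0, max_weight).

    Assumption: a sigmoid mapping with illustrative parameters. Low SNR
    drives the weight towards 0, so the decision relies mostly on the face.
    """
    return max_weight / (1.0 + math.exp(-slope * (snr_db - midpoint_db)))


def fuse(face_score, speech_score, snr_db):
    """Weighted-sum fusion of two modality scores (assumed fusion rule)."""
    w = speech_weight(snr_db)
    return (1.0 - w) * face_score + w * speech_score


def verify(face_score, speech_score, snr_db, threshold=0.0):
    """Accept the identity claim if the fused score reaches the threshold."""
    return fuse(face_score, speech_score, snr_db) >= threshold
```

With very noisy audio the fused score approaches the face score, which is one way a multi-modal system could avoid falling below the face-only performance; with clean audio both modalities contribute.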

Keywords: Multi-modal, Audio-visual, Identity verification, Adaptive fusion, Source features

Article history: Received 21 December 2001, Available online 8 March 2002.

Official paper URL: https://doi.org/10.1016/S0031-3203(02)00031-6