The complementarity of a diverse range of deep learning features extracted from video content for video recommendation

Authors:

Highlights:

• Multimodal video content is fused to alleviate the item cold-start problem.

• The performance of various CNN features (object, scene, audio and action) is evaluated.

• CNN features outperform hand-crafted iDT, MoSIFT and MFCC features, and metadata.

• Metadata, CNN and hand-crafted features are combined to enhance recommendations.

• A hybrid model is improved with matrix scaling to boost the overall performance.

Keywords: Video recommendation, Deep learning features, Item cold-start, Item warm-start, Multimodal feature fusion, Beyond-accuracy metrics

Review history: Received 1 June 2021, Revised 26 September 2021, Accepted 26 November 2021, Available online 22 December 2021, Version of Record 31 December 2021.

Official paper URL: https://doi.org/10.1016/j.eswa.2021.116335