Modeling low- and high-order feature interactions with FM and self-attention network

Authors: Cairong Yan, Yizhou Chen, Yongquan Wan, Pengwei Wang

Abstract

Click-through rate (CTR) prediction is a long-standing and popular research topic. In many online applications, such as online advertising and product recommendation, a small increase in CTR can bring great returns. However, CTR prediction faces several challenges. The large number of users and items, together with the differing feature-space sizes of different data types, leads to high-dimensional and sparse input, and designing high-order feature interactions by hand relies heavily on expert knowledge and is very time-consuming. In this paper, we build a novel model called the multi-order interactive features aware factorization machine (MoFM) for CTR prediction. To effectively capture both low-order and high-order interactive features, three different types of prediction models are integrated: logistic regression (LR) and a factorization machine (FM) model the original features and the 2-order interactive features respectively, and a multi-head self-attention network with residual connections is used to automatically identify high-value high-order feature combinations. The model also contains an embedding layer that provides unified embedding of different data types, avoiding the diversification, sparsity, and high dimensionality of features. Since feature engineering is not required, the model can be learned end to end. Experiments on three public datasets show the superiority of the proposed model over state-of-the-art models, and the flexibility and scalability of the model structure have also been verified.
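To make the three-part structure described in the abstract concrete, the following is a minimal NumPy sketch of a forward pass that combines an LR term, an FM 2-order term, and one multi-head self-attention layer with a residual connection over shared field embeddings. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the three outputs are combined, so summing their logits before a sigmoid is an assumption, as are the single attention layer, the ReLU after the residual, and all names (`init_params`, `predict_ctr`, `w_out`, and so on).

```python
import numpy as np

def fm_second_order(emb):
    """Sum of pairwise dot products over field embeddings, shape (F, d).

    Uses the standard FM identity
    sum_{i<j} <v_i, v_j> = 0.5 * (||sum_i v_i||^2 - sum_i ||v_i||^2).
    """
    s = emb.sum(axis=0)
    return 0.5 * float((s * s).sum() - (emb * emb).sum())

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_residual(emb, Wq, Wk, Wv, Wres):
    """One multi-head self-attention layer over fields, with a residual path.

    emb: (F, d); Wq, Wk, Wv: (H, d, dh); Wres: (d, H * dh).
    Returns an array of shape (F, H * dh).
    """
    q = np.einsum("fd,hde->hfe", emb, Wq)
    k = np.einsum("fd,hde->hfe", emb, Wk)
    v = np.einsum("fd,hde->hfe", emb, Wv)
    # Scaled dot-product attention between every pair of fields, per head.
    weights = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1]), axis=-1)
    heads = weights @ v                                    # (H, F, dh)
    concat = heads.transpose(1, 0, 2).reshape(emb.shape[0], -1)
    # Residual connection: project the input and add it back (assumed ReLU).
    return np.maximum(concat + emb @ Wres, 0.0)

def init_params(rng, vocab_size, num_fields, dim, heads, head_dim, scale=0.1):
    """Hypothetical parameter container with small random initial values."""
    return {
        "bias": 0.0,
        "linear": rng.normal(scale=scale, size=vocab_size),        # LR weights
        "embedding": rng.normal(scale=scale, size=(vocab_size, dim)),
        "Wq": rng.normal(scale=scale, size=(heads, dim, head_dim)),
        "Wk": rng.normal(scale=scale, size=(heads, dim, head_dim)),
        "Wv": rng.normal(scale=scale, size=(heads, dim, head_dim)),
        "Wres": rng.normal(scale=scale, size=(dim, heads * head_dim)),
        "w_out": rng.normal(scale=scale, size=num_fields * heads * head_dim),
    }

def predict_ctr(feature_ids, params):
    """feature_ids: one global feature index per field, shape (F,)."""
    emb = params["embedding"][feature_ids]                 # unified embedding
    lr_logit = params["bias"] + params["linear"][feature_ids].sum()
    fm_logit = fm_second_order(emb)
    att = self_attention_residual(
        emb, params["Wq"], params["Wk"], params["Wv"], params["Wres"]
    )
    high_logit = float(att.reshape(-1) @ params["w_out"])
    # Assumption: the three parts are combined by summing their logits.
    logit = lr_logit + fm_logit + high_logit
    return 1.0 / (1.0 + np.exp(-logit))
```

For example, `predict_ctr(np.array([0, 3, 5, 9]), init_params(np.random.default_rng(0), 10, 4, 3, 2, 5))` returns a probability for one sample with four fields. Here every field value is assumed to have been mapped to a global index in advance, which is one simple way to give categorical and discretized numerical features the same embedding treatment.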

Keywords: Factorization machine, Feature interaction, Multi-head self-attention, CTR prediction

Review process:

Official paper URL: https://doi.org/10.1007/s10489-020-01951-6