A mixed heterogeneous factorization model for non-overlapping cross-domain recommendation

Highlights:

• We focus on a cross-domain collaborative filtering (CDCF) scenario in which related domains share no users or items.

• We deal with domain heterogeneity comprehensively from three different aspects.

• A modified PARAFAC approach is used to capture shared knowledge with heterogeneity.

• We differentiate roles of auxiliary domains based on a subspace alignment technique.

• Our model has superior performance in rating prediction and top-N recommendation tasks.

Abstract

Cross-domain collaborative filtering has attracted a great deal of attention for its ability to alleviate data sparsity by transferring knowledge from auxiliary domains to assist recommendation in a target domain. Existing research has mainly focused on the scenario in which auxiliary domains share users or items with the target domain. However, such auxiliary data are rare in real-world applications, or their acquisition is restricted by privacy concerns. We investigate a more realistic scenario in which auxiliary domains share neither users nor items with the target domain, which makes auxiliary data easier to collect. To extract and transfer the knowledge shared between the auxiliary domains and the target domain, we assume that the user-item feedback in each domain consists of two parts: domain-transferable information, which contains the shared knowledge, and domain-reserved information, which reflects domain-specific characteristics. Correspondingly, we propose a mixed heterogeneous factorization model that captures the shared knowledge with an adapted tensor factorization and the domain-specific characteristics with a biased matrix factorization, and then combines the two additively. The model also accounts for three types of domain heterogeneity: preference heterogeneity, characteristics heterogeneity, and rating bias. In experiments on four publicly available datasets from different domains, our model outperforms state-of-the-art methods in rating prediction and top-N recommendation tasks.
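The additive combination described in the abstract can be illustrated with a minimal sketch. It is not the model's actual formulation: it assumes a plain PARAFAC-style three-way product for the domain-transferable part and a standard biased matrix factorization for the domain-reserved part. All function names, parameter names, and numeric values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def transferable_score(user_vec: Sequence[float],
                       item_vec: Sequence[float],
                       domain_vec: Sequence[float]) -> float:
    """PARAFAC-style three-way interaction: sum_k u_k * v_k * d_k.

    Assumption: the shared knowledge is modeled as a rank-K trilinear
    product over user, item, and domain factors.
    """
    return sum(u * v * d for u, v, d in zip(user_vec, item_vec, domain_vec))


def reserved_score(global_mean: float,
                   user_bias: float,
                   item_bias: float,
                   user_factors: Sequence[float],
                   item_factors: Sequence[float]) -> float:
    """Biased matrix factorization: mu + b_u + b_i + p_u . q_i.

    Assumption: the domain-specific part uses the standard biased MF form.
    """
    interaction = sum(p * q for p, q in zip(user_factors, item_factors))
    return global_mean + user_bias + item_bias + interaction


def predict(transferable: float, reserved: float) -> float:
    """Combine the two parts additively, as the abstract describes."""
    return transferable + reserved


# Hypothetical factors for one (user, item) pair in one domain.
shared = transferable_score([1.0, 0.5], [2.0, 2.0], [0.5, 1.0])
specific = reserved_score(3.0, 0.25, -0.5, [1.0, 0.0], [0.5, 4.0])
rating = predict(shared, specific)
```

Here the domain factor vector rescales each latent dimension per domain, which is one simple way a shared factorization could express preference differences across domains. The biased term absorbs per-domain rating offsets.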