Effect of class imbalance in heterogeneous network embedding: An empirical study

Authors:

Highlights:

• The effect of class imbalance is studied on bibliometrics tasks in the DBLP database.

• A novel metric to estimate class imbalance in heterogeneous networks is proposed.

• Decreasing the class imbalance does not always correlate with better performance.

• Network schema and meta-paths are susceptible to the underlying bibliometrics task.

• Class imbalance is an inherent property of heterogeneous bibliographic networks.

• Node selection is critical when mining a heterogeneous bibliographic network.

Keywords: Heterogeneous information network, Network embedding, Meta-path, Class imbalance

Review history: Received 2 April 2019, Revised 13 December 2019, Accepted 9 January 2020, Available online 7 February 2020, Version of Record 7 February 2020.

Official paper URL: https://doi.org/10.1016/j.joi.2020.101009