Learning entity-centric document representations using an entity facet topic model

Authors:

Highlights:

• We propose the task of entity-centric document representation learning.

• We propose a novel Entity Facet Topic Model (EFTM) to learn entity-centric document representations.

• We confirm our hypothesis that an entity has multiple facets by analysing the learned entity facets qualitatively and quantitatively, and identify an effective number of facets per entity.

• We demonstrate the effectiveness of EFTM in downstream applications using a multilabel classification task.

Keywords: Document representation, Topic models, Entity aspects, Text classification

Review history: Received 30 September 2018, Revised 23 January 2020, Accepted 26 January 2020, Available online 12 February 2020, Version of Record 12 February 2020.

Paper URL: https://doi.org/10.1016/j.ipm.2020.102216