Learning to classify short text from scientific documents using topic models with various types of knowledge

Authors:

Highlights:

• An efficient framework to classify short text from scientific documents is proposed.

• Topic models built from various types of knowledge are used to enhance document features.

• Two methods were presented to optimize external features that enhance relatedness in documents.

• Performance was evaluated using real-world scientific documents from an online publisher.

• Proposed methods are shown to outperform related work.

Keywords: Data sparseness, Information retrieval, Latent Dirichlet Allocation, Short text classification, Topic model

Review history: Available online 28 September 2014.

Official paper URL: https://doi.org/10.1016/j.eswa.2014.09.031