Hierarchy construction and text classification based on the relaxation strategy and least information model

Authors:

Highlights:

• Hierarchical classification is an effective approach to categorize large-scale text data.

• The relaxation strategy effectively alleviates the impact of the ‘blocking’ problem.

• A new term weighting approach based on the Least Information Theory is proposed.

• It offers a new model for quantifying information based on different probability distributions.

Keywords: Hierarchy classification, Relaxation strategy, Least Information Theory, Term weighting

Review history: Received 28 November 2017, Revised 20 January 2018, Accepted 1 February 2018, Available online 16 February 2018, Version of Record 16 February 2018.

Official paper URL: https://doi.org/10.1016/j.eswa.2018.02.003