Supervised Hebb rule based feature selection for text classification

Authors:

Highlights:

Abstract

Text documents usually contain high-dimensional, non-discriminative (irrelevant and noisy) terms, which lead to steep computational costs and poor learning performance in text classification. One effective solution to this problem is feature selection, which aims to identify discriminative terms in text data. This paper proposes a method termed “Hebb rule based feature selection (HRFS)”. HRFS is based on the supervised Hebb rule: it treats terms and classes as neurons and selects terms under the assumption that a term is discriminative if it keeps “exciting” the corresponding classes. Following the original Hebb postulate, this assumption can be read as “a term is highly correlated with a class if it is able to keep ‘exciting’ the class”. Six benchmark datasets are used to compare HRFS with seven other feature selection methods. Experimental results indicate that HRFS achieves better performance than the compared methods. HRFS identifies discriminative terms from the viewpoint of synapses between neurons. Moreover, HRFS is also efficient, because it can be expressed as matrix operations, which reduces the complexity of feature selection.
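
To make the "terms and classes as neurons" idea concrete, here is a minimal sketch in plain Python. It is not the paper's HRFS formulation, which the abstract does not spell out. It assumes binary term-presence vectors, one-hot class labels, and the textbook supervised Hebb update W = YᵀX, in which a term-class weight grows every time the term and the class are active in the same document. The function names (`hebb_weights`, `select_terms`) and the scoring rule (largest minus smallest per-class activation rate) are hypothetical choices made only for illustration.

```python
def hebb_weights(X, Y):
    """Accumulate supervised Hebb weights W = Y^T X (classes x terms).

    X: list of documents, each a list of 0/1 term indicators.
    Y: list of one-hot class labels, aligned with X.
    """
    n_classes, n_terms = len(Y[0]), len(X[0])
    W = [[0.0] * n_terms for _ in range(n_classes)]
    for x, y in zip(X, Y):
        for c in range(n_classes):
            if y[c]:
                for t in range(n_terms):
                    # Hebb update: strengthen the term-class "synapse"
                    # when both are active in the same document.
                    W[c][t] += y[c] * x[t]
    return W


def select_terms(X, Y, k):
    """Rank terms by an illustrative (assumed) discriminativeness score."""
    W = hebb_weights(X, Y)
    class_sizes = [sum(y[c] for y in Y) for c in range(len(W))]
    # Normalise by class size so that large classes do not dominate.
    rates = [[w / max(n, 1) for w in row] for row, n in zip(W, class_sizes)]
    # Assumed score: gap between the most and least "excited" class.
    scores = [max(col) - min(col) for col in zip(*rates)]
    ranked = sorted(range(len(scores)), key=lambda t: scores[t], reverse=True)
    return ranked[:k], scores


# Toy corpus with vocabulary ["goal", "vote", "the"] and classes
# [sports, politics]; "the" occurs everywhere and should score lowest.
X = [[1, 0, 1], [1, 0, 1], [0, 1, 1], [0, 1, 1]]
Y = [[1, 0], [1, 0], [0, 1], [0, 1]]
top_terms, scores = select_terms(X, Y, k=2)
```

On this toy corpus the two class-specific terms are ranked ahead of the term shared by every document. With a numerical library, the accumulation loop collapses to a single matrix product, which is the kind of matrix formulation the abstract credits for the method's efficiency.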

Keywords: Text classification, Feature selection, Hebb rule

Review history: Received 14 July 2018, Revised 8 September 2018, Accepted 12 September 2018, Available online 17 October 2018, Version of Record 17 October 2018.

Paper URL: https://doi.org/10.1016/j.ipm.2018.09.004