Relative discrimination criterion – A novel feature ranking method for text data

Authors:

Highlights:

• Discussed characteristics of text data.

• Indicated that term counts are ignored when calculating term rank.

• Proposed a new feature ranking algorithm (RDC) that considers term counts.

• Compared performance of RDC with four feature ranking metrics on four datasets.

• RDC shows the highest performance in 65% of the classification cases.

Keywords: Text classification, Feature selection, Document frequency, Term count, True positive rate, False positive rate

Review history: Available online 20 December 2014.

Official paper URL: https://doi.org/10.1016/j.eswa.2014.12.013