Feature selection via maximizing global information gain for text classification

Authors:

Highlights:

• A novel feature selection metric called global information gain (GIG) is proposed.

• An efficient algorithm called maximizing global information gain (MGIG) is developed.

• MGIG performs better than other algorithms (IG, mRMR, JMI, DISR) in most cases.

• MGIG runs significantly faster than mRMR, JMI, and DISR, and is comparable in speed with IG (the baseline sketched below).
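The page gives only highlights, so the paper's GIG metric itself cannot be reproduced here; however, the classic information gain (IG) baseline that MGIG is compared against is standard: IG(t) = H(C) − P(t)·H(C|t) − P(t̄)·H(C|t̄). Below is a minimal NumPy sketch of IG term scoring, assuming a binary term-presence document-term matrix; the function name and toy data are illustrative, not from the paper.

```python
import numpy as np

def information_gain(X, y):
    """Score terms by IG(t) = H(C) - P(t) H(C|t) - P(~t) H(C|~t).

    X: (n_docs, n_terms) binary document-term matrix (term presence).
    y: (n_docs,) array of class labels.
    Returns an array of per-term IG scores (higher = more informative).
    """
    def entropy(counts):
        p = counts / counts.sum()
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    classes = np.unique(y)
    # Class prior entropy H(C)
    h_c = entropy(np.array([(y == c).sum() for c in classes], dtype=float))

    scores = np.empty(X.shape[1])
    for t in range(X.shape[1]):
        present = X[:, t] > 0
        p_t = present.mean()
        # Conditional entropies H(C | t present) and H(C | t absent)
        h_pos = entropy(np.array([(y[present] == c).sum() for c in classes],
                                 dtype=float)) if p_t > 0 else 0.0
        h_neg = entropy(np.array([(y[~present] == c).sum() for c in classes],
                                 dtype=float)) if p_t < 1 else 0.0
        scores[t] = h_c - p_t * h_pos - (1 - p_t) * h_neg
    return scores

# Toy usage (hypothetical data): rank terms and keep the top 2 by IG.
X = np.array([[1, 0, 1], [1, 1, 0], [0, 1, 1], [0, 1, 0]])
y = np.array([0, 0, 1, 1])
top2 = np.argsort(information_gain(X, y))[::-1][:2]
```

Filter-style selectors like IG score each term independently, which is what makes them fast; per the highlights, MGIG's contribution is a global criterion that still runs at comparable speed.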

Keywords: Feature selection, Text classification, High dimensionality, Distributional clustering, Information bottleneck

Article history: Received 25 January 2013, Revised 18 September 2013, Accepted 23 September 2013, Available online 14 October 2013.

DOI: https://doi.org/10.1016/j.knosys.2013.09.019