Turning from TF-IDF to TF-IGM for term weighting in text classification

Authors:

Highlights:

• A new supervised term weighting scheme called TF-IGM is proposed.

• It adopts a new statistical model to measure a term's class distinguishing power.

• It makes full use of the fine-grained term distribution across different classes.

• It adapts to different text datasets through configurable options and parameters.

• It outperforms TF-IDF and state-of-the-art supervised term weighting schemes.
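
The highlights do not state the weighting formula itself. As a minimal sketch, assuming the definition usually given for this scheme (a term's per-class frequencies are sorted in descending order, IGM is the top frequency divided by the rank-weighted sum of all of them, and the final weight is tf · (1 + λ · igm)), the idea can be written as follows. The function names, the default λ = 7 and the square-root option are illustrative assumptions and should be checked against the paper.

```python
from math import sqrt


def igm(class_freqs):
    """Inverse gravity moment of one term, given its frequency in each class.

    Assumed definition: sort the frequencies in descending order and divide
    the largest one by the sum of (frequency * rank), with ranks from 1.
    """
    ranked = sorted(class_freqs, reverse=True)
    if not ranked or ranked[0] == 0:
        return 0.0  # term never occurs: no distinguishing power
    moment = sum(f * r for r, f in enumerate(ranked, start=1))
    return ranked[0] / moment


def tf_igm(tf, class_freqs, lam=7.0, sqrt_tf=False):
    """Weight of a term in a document: local factor times (1 + lam * igm).

    `lam` and `sqrt_tf` are hypothetical parameter names standing in for the
    "options or parameters" mentioned in the highlights.
    """
    local = sqrt(tf) if sqrt_tf else tf
    return local * (1.0 + lam * igm(class_freqs))


# A term concentrated in one class scores higher than one spread evenly.
concentrated = tf_igm(2, [10, 0, 0])
uniform = tf_igm(2, [5, 5, 5])
```

Under this assumed definition, a term that occurs in a single class reaches the maximum IGM of 1, while a term spread evenly over m classes falls to 2 / (m · (m + 1)), which is how the per-class distribution enters the weight.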

Keywords: Term weighting, Text classification, Inverse gravity moment (IGM), Class distinguishing power, Classifier

Review history: Received 26 April 2016, Revised 9 August 2016, Accepted 5 September 2016, Available online 9 September 2016, Version of Record 17 September 2016.

Official paper URL: https://doi.org/10.1016/j.eswa.2016.09.009