Semi-supervised cluster-and-label with feature based re-clustering to reduce noise in Thai document images

Authors:

Highlights:

• We propose a novel noise reduction method for document images.

• Semi-supervised learning is applied to separate noise components from character components.

• The proposed method is suitable for non-Latin scripts, e.g. Thai document images.

• We propose an enhanced labeling method for the semi-supervised cluster-and-label approach.

• The proposed methods perform significantly better than the comparison methods.
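
For readers unfamiliar with the term, the generic cluster-and-label idea named in the highlights can be sketched as follows: cluster all components without labels, then let the few labeled components in each cluster vote on that cluster's label. This is a minimal illustration only, not the paper's enhanced labeling or feature-based re-clustering, which this page does not describe. The two features (component area and aspect ratio), the sample values, and the function names `kmeans` and `cluster_and_label` are assumptions made for the example.

```python
from collections import Counter


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; the first k points serve as initial centers."""
    centers = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        assign = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def cluster_and_label(points, seed_labels, k=2, default="unknown"):
    """Label every point with the majority seed label of its cluster.

    seed_labels maps a point index to its known label; clusters that
    contain no seed fall back to `default`.
    """
    assign = kmeans(points, k)
    votes = {c: Counter() for c in range(k)}
    for idx, label in seed_labels.items():
        votes[assign[idx]][label] += 1
    cluster_label = {
        c: (v.most_common(1)[0][0] if v else default) for c, v in votes.items()
    }
    return [cluster_label[a] for a in assign]


# Hypothetical connected components described by (area, aspect ratio).
components = [(2, 1.0), (3, 1.2), (1, 0.9), (40, 1.5), (45, 1.4), (38, 1.6)]
# Only two components carry a human-provided label.
seeds = {0: "noise", 3: "character"}
labels = cluster_and_label(components, seeds)
```

With only two labeled components, all six receive a label through their cluster membership, which is what makes the approach semi-supervised.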

Keywords: Noise reduction, Document enhancement, Semi-supervised classification, Cluster-and-label, Thai document

Review history: Received 6 March 2015, Revised 19 August 2015, Accepted 28 September 2015, Available online 8 October 2015, Version of Record 8 November 2015.

Official paper URL: https://doi.org/10.1016/j.knosys.2015.09.033