Open-vocabulary recognition of machine-printed Arabic text using hidden Markov models

Authors:

Highlights:

• A novel approach to the sliding window technique for feature extraction.

• A two-step approach to mixed-font and unseen-font text recognition.

• Simple and effective features for font identification.

• A multi-font printed Arabic text database for text recognition research.

• Experiments were conducted using two separate databases of printed Arabic text.

Keywords: Optical character recognition, Mixed-font OCR, Unseen-font OCR, Hidden Markov models, Font identification, Sliding window, Arabic OCR

Article history: Received 13 September 2014, Revised 2 May 2015, Accepted 8 September 2015, Available online 28 September 2015, Version of Record 27 November 2015.

Official paper URL: https://doi.org/10.1016/j.patcog.2015.09.011