A Novel Multi-scale Deep Neural Framework for Script Invariant Text Detection

Authors: Tauseef Khan, Ayatullah Faruk Mollah

Abstract

Text detection in the wild is an active research problem in computer vision. Localizing text in multi-script, arbitrarily oriented scene images captured in unconstrained environments is one of the challenging aspects of this problem. In this paper, we present (i) a novel multi-scale deep framework for multi-script text detection, (ii) a new dataset comprising multi-lingual indoor and outdoor scene images in the Indic scenario, (iii) benchmark performance on the developed dataset, and (iv) results and comparative analysis on related standard datasets such as ICDAR 2019-MLT and ICDAR 2013 (born-image). Rigorous experiments have been carried out, and the results obtained demonstrate the effectiveness of the developed framework in localizing text across different scripts, font sizes, text orientations, etc. The experimental results also show that the developed framework surpasses existing methods and achieves state-of-the-art performance on most of the datasets considered in this work, which indicates its effectiveness and robustness in practical scenarios.

Keywords: Scene text detection, Multi-script, Multi-orientation, Multi-scale framework, CNN

Paper URL: https://doi.org/10.1007/s11063-021-10686-5