Mosaicing-by-recognition for video-based text recognition

Abstract

Recognizing text captured in multiple frames by a hand-held video camera is a challenging but worthwhile task, because it makes it possible to capture and recognize a longer line of text while improving the quality of the text image by exploiting the redundancy of the overlapping areas between frames. For this task, the video frames must be registered, i.e., mosaiced, after compensating for the distortions caused by camera shake. In this paper, a mosaicing-by-recognition technique is proposed in which video mosaicing and text recognition are formulated as a single unified optimization problem and solved simultaneously and collaboratively by a dynamic programming-based optimization algorithm. Experimental results indicate that, even when the frames undergo various distortions such as rotation, scaling, translation, and nonlinear speed fluctuation of the camera movement, the proposed technique produces a fine mosaic image through accurate distortion estimation (around 90% of perfect estimation) and achieves high character recognition accuracy (over 95%).
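As a rough illustration of the dynamic programming idea only, and not the paper's actual formulation, the sketch below aligns a sequence of frame observations to a known template while tolerating nonlinear camera speed. Everything in it is an assumption made for illustration: each frame is reduced to a single number, the template is a 1-D list, the function name `align_frames` and the parameter `max_step` are hypothetical, and rotation and scaling are ignored entirely.

```python
def align_frames(frames, template, max_step=2):
    """Toy DP alignment (illustrative assumption, not the paper's algorithm).

    frames[t] is a scalar observation from frame t; template[p] is the
    expected value at horizontal position p. The camera may advance by
    0..max_step positions between consecutive frames.
    Returns (total_cost, positions) where positions[t] is the template
    position assigned to frame t.
    """
    inf = float("inf")
    n, m = len(frames), len(template)
    cost = [[inf] * m for _ in range(n)]
    back = [[-1] * m for _ in range(n)]

    # Assumption: the first frame observes the start of the template.
    cost[0][0] = (frames[0] - template[0]) ** 2

    for t in range(1, n):
        for p in range(m):
            local = (frames[t] - template[p]) ** 2
            # Try every allowed camera advance from the previous frame.
            for step in range(max_step + 1):
                q = p - step
                if q < 0:
                    break
                candidate = cost[t - 1][q] + local
                if candidate < cost[t][p]:
                    cost[t][p] = candidate
                    back[t][p] = q

    # Assumption: the last frame observes the end of the template.
    p = m - 1
    total = cost[n - 1][p]
    if total == inf:
        raise ValueError("no feasible alignment under the step constraint")

    # Backtrack to recover the position of every frame.
    positions = [p]
    for t in range(n - 1, 0, -1):
        p = back[t][p]
        positions.append(p)
    positions.reverse()
    return total, positions
```

The recovered `positions` play the role of the registration (where each frame sits along the text line), while the matching cost against the template stands in for recognition; in this toy setting both come out of the same optimization pass.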

Keywords: Video-based text recognition, Mosaicing

Review history: Received 26 April 2006, Revised 30 July 2007, Accepted 16 August 2007, Available online 24 August 2007.

Official paper URL: https://doi.org/10.1016/j.patcog.2007.08.005