A comprehensive review of recent advances on deep vision systems

Authors: Qaisar Abbas, Mostafa E. A. Ibrahim, M. Arfan Jaffar

Abstract

Real-time video object detection, tracking, and recognition are challenging problems because of the real-time processing requirements of machine learning algorithms. In recent years, video processing has been performed with deep learning (DL) based techniques, which achieve higher accuracy but incur higher computational cost. This paper presents a survey of the state-of-the-art DL platforms and architectures used for deep vision systems. It highlights the contributions and challenges from numerous research studies. In particular, the paper first describes the architecture of various DL models such as autoencoders, deep Boltzmann machines, convolutional neural networks, recurrent neural networks, and deep residual learning. Next, studies on deep real-time video object detection, tracking, and recognition are highlighted to illustrate the key trends in terms of computational cost, number of layers, and accuracy of results. Finally, the paper discusses the challenges of applying DL to real-time video processing and outlines some directions for the future of DL algorithms.

Keywords: Computer vision, Video processing, Object detection, Object tracking, Object recognition, Deep learning, Convolutional neural network, Deep belief network, Deep residual learning

Paper URL: https://doi.org/10.1007/s10462-018-9633-3