Timed-image based deep learning for action recognition in video sequences

Highlights:

• Video library conditioning issue: because image frames remain highly compressible when 2D spatial convolution is replaced by its 1D Hilbert-based counterpart, the paper proposes converting a 2D + X data volume into a single meta-image file format, called a timed-image, before it is passed to machine learning frameworks. In this conversion, each 2D frame of the 2D + X data is reshaped into a 1D array indexed by a Hilbert space-filling curve, and the third variable X of the initial file format becomes the second variable of the meta-image format.

• Sensitive action recognition benchmark: the paper provides two datasets with 2 and 3 violence video categories, respectively. The datasets cover non-violent, moderately violent and extremely violent visual actions.

• Sensitive action recognition issue: outstanding 2-level and 3-level violence classification results are obtained from deep convolutional neural networks trained from scratch and operating on meta-image databases.

Abstract

• Image data conditioning issue: the paper first highlights that replacing 2D spatial convolution with its 1D Hilbert-based counterpart preserves information compressibility with high accuracy on the image frames of a wide class of video files.

• Video library conditioning issue: because of the above compressibility, the paper proposes converting a 2D + X data volume into a single meta-image file format, called a timed-image, before it is passed to machine learning frameworks. In this conversion, each 2D frame of the 2D + X data is reshaped into a 1D array indexed by a Hilbert space-filling curve, and the third variable X of the initial file format becomes the second variable of the meta-image format.

• Sensitive action recognition benchmark: the paper provides two datasets with 2 and 3 violence video categories, respectively. The datasets cover non-violent, moderately violent and extremely violent visual actions.

• Sensitive action recognition issue: outstanding 2-level and 3-level violence classification results are obtained from deep convolutional neural networks trained from scratch and operating on meta-image databases.
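
The timed-image conversion described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: it assumes square grayscale frames whose side is a power of two, takes X to be the frame (time) index, and uses the textbook index-to-coordinate Hilbert mapping. The function names `hilbert_d2xy`, `hilbert_order` and `to_timed_image` are hypothetical, and any resizing or color handling used in the paper is not reproduced here.

```python
import numpy as np


def hilbert_d2xy(n, d):
    """Map position d along the Hilbert curve to (x, y) on an n x n grid.

    n must be a power of two. This is the standard iterative mapping,
    used here as an assumed stand-in for the paper's indexing.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the current quadrant so sub-curves connect.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_order(n):
    """Row-major pixel indices of an n x n frame, listed in Hilbert order."""
    if n < 1 or n & (n - 1):
        raise ValueError("frame side must be a power of two")
    return np.array([y * n + x for x, y in
                     (hilbert_d2xy(n, d) for d in range(n * n))])


def to_timed_image(volume):
    """Convert a (T, n, n) volume into an (n*n, T) timed-image.

    Axis 0 of the result is the Hilbert index within a frame;
    axis 1 is the former third variable X (here, the frame index).
    """
    volume = np.asarray(volume)
    t, h, w = volume.shape
    if h != w:
        raise ValueError("frames must be square in this sketch")
    order = hilbert_order(h)
    flat = volume.reshape(t, h * w)   # each frame as a row-major 1D array
    return flat[:, order].T           # reorder pixels, then put X second


if __name__ == "__main__":
    video = np.random.rand(8, 16, 16)   # 8 synthetic frames of 16 x 16 pixels
    print(to_timed_image(video).shape)
```

Each column of the resulting array is one frame unrolled along the Hilbert curve, so pixels that are neighbors along a column are also neighbors in the original frame. The whole 2D + X volume thus becomes a single 2D array that a standard image-based convolutional network can consume.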