Towards Optimized IoT-based Context-aware Video Content Analysis Framework


Bibliographic Details
Published in:2021 IEEE 7th World Forum on Internet of Things (WF-IoT) pp. 46 - 50
Main Authors: Gad, Gad, Gad, Eyad, Mokhtar, Bassem
Format: Conference Proceeding
Language:English
Published: IEEE 14.06.2021
Description
Summary:Despite the success of convolutional neural networks (CNNs) in spatial analysis and of recurrent neural networks (RNNs) in sequence modeling and interpretation tasks, video analysis has seen only limited interest and progress. This is partially due to a focus on natural, human-like translation from video space to natural-language space to the detriment of informativeness. This paper proposes an automated context-aware video analysis framework directed by the constraints of its application. The framework incorporates an encoder-decoder neural network trained on a closed-domain video-to-text dataset. The network architecture and the standardized language model present in the dataset are optimized for speed, so that the system can run on IoT devices, and for informativeness, so that information can be extracted easily from the model output by the subsequent stages of the analysis. The proposed framework provides a practical method to integrate the power of the CNN-RNN combination in a directed way to extract the most from video content. A classroom monitoring system is discussed as an example of the capabilities and limitations of the proposed framework using NVIDIA's Jetson Nano board.
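The encoder-decoder pipeline the abstract describes (CNN spatial features fed to an RNN language decoder producing standardized text) can be sketched minimally as follows. All dimensions, the mean-pooling encoder, and the greedy vanilla-RNN decoder here are illustrative assumptions for exposition, not the paper's actual architecture or trained weights.

```python
import numpy as np

# Illustrative sizes only (not taken from the paper).
FEAT_DIM = 512   # per-frame CNN feature size
HIDDEN = 256     # RNN hidden-state size
VOCAB = 100      # small closed-domain, standardized vocabulary

rng = np.random.default_rng(0)

def encode_video(frame_features):
    """Collapse per-frame CNN features (num_frames, FEAT_DIM)
    into a single video descriptor by mean pooling."""
    return frame_features.mean(axis=0)

# Randomly initialized decoder weights; a trained model would load these.
W_h = rng.standard_normal((HIDDEN, HIDDEN)) * 0.01
W_x = rng.standard_normal((HIDDEN, FEAT_DIM)) * 0.01
W_out = rng.standard_normal((VOCAB, HIDDEN)) * 0.01

def decode_step(h, video_vec):
    """One vanilla-RNN step conditioned on the video descriptor;
    returns the new hidden state and the greedy token id."""
    h = np.tanh(W_h @ h + W_x @ video_vec)
    logits = W_out @ h
    return h, int(np.argmax(logits))

def caption(frame_features, max_len=5):
    """Greedy decoding of a fixed-length token sequence."""
    video_vec = encode_video(frame_features)
    h = np.zeros(HIDDEN)
    tokens = []
    for _ in range(max_len):
        h, tok = decode_step(h, video_vec)
        tokens.append(tok)
    return tokens

# Stand-in for per-frame CNN outputs of a 16-frame clip.
frames = rng.standard_normal((16, FEAT_DIM))
tokens = caption(frames)
```

Because the target language is standardized (closed-domain), each emitted token id can be mapped directly to a field consumed by downstream analysis stages, which is the "informativeness" objective the abstract emphasizes.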
DOI:10.1109/WF-IoT51360.2021.9595891