Improving the Accuracy of Natural Dynamic Scenes Recognition using Correlation of Feature Maps in CNNs

Document Type : Research Paper


1 Ph.D. Student of Electrical Engineering, Ferdowsi University of Mashhad (FUM), Mashhad, Iran

2 Department of Electrical Engineering, Faculty of Engineering, Ferdowsi University of Mashhad (FUM), Mashhad, Iran

3 Department of Electrical Engineering, Quchan University of Technology


Dynamic scene recognition is a fundamental research problem in machine vision. This paper proposes an effective dynamic scene recognition method based on convolutional neural networks. The method uses the correlation of feature maps from different layers of the network as a feature vector that captures video information. First, N frames of the video are selected and fed into the network to extract feature maps, and a Gram matrix representing the spatial information of the video frames is computed. Temporal information is then incorporated by temporal slicing over the selected frames and averaging the Gram matrices of these frames. After feature encoding followed by a pooling operation, a feature vector is obtained for classification. Experimental evaluations on benchmark dynamic scene datasets demonstrate the effectiveness of the proposed method compared with state-of-the-art methods in this field: recognition accuracy improves by about 9% on the Maryland dataset and by about 3% on the YUP++ dataset.
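
The Gram-matrix and frame-averaging steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `gram_matrix` and `video_descriptor`, the normalization by the number of spatial positions, and the random arrays standing in for CNN feature maps are all assumptions made for illustration. Temporal slicing, feature encoding, and pooling are not modeled.

```python
import numpy as np


def gram_matrix(feature_maps):
    """Correlate the channels of one frame's feature maps.

    feature_maps has shape (C, H, W); the result has shape (C, C).
    Dividing by H * W is an assumed normalization, not taken from the paper.
    """
    c, h, w = feature_maps.shape
    flat = feature_maps.reshape(c, h * w)  # one row per channel
    return flat @ flat.T / (h * w)


def video_descriptor(frames_feature_maps):
    """Average the per-frame Gram matrices over the selected frames.

    frames_feature_maps has shape (N, C, H, W); the result has shape (C, C).
    """
    grams = np.stack([gram_matrix(f) for f in frames_feature_maps])
    return grams.mean(axis=0)


# Hypothetical stand-in for the feature maps of N = 8 frames from one layer
# (16 channels, 7 x 7 spatial grid); a real pipeline would take these from a CNN.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 16, 7, 7))

descriptor = video_descriptor(features)
print(descriptor.shape)  # (16, 16)
```

Because each Gram matrix is symmetric, the averaged matrix is symmetric as well, so a downstream encoder would only need its upper triangle.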