Document Type: Research Paper
M.Sc. Student of Computer Engineering, Faculty of Engineering, University of Kurdistan, Sanandaj, Iran
Department of Computer Engineering, Faculty of Engineering, University of Kurdistan, Sanandaj, Iran
Department of Electrical Engineering, Faculty of Engineering, University of Kurdistan, Sanandaj, Iran
Human action recognition is an important research field with many applications, and much computer-vision research has focused on improving its recognition accuracy. In this paper, a two-stream method is introduced whose structure combines two kinds of spatial features so that each compensates for the shortcomings of the other, which leads to better overall performance. In the first stream, wavelet coefficients of the key-frames are extracted at a suitable multi-resolution level; in the second stream, deep features of the same key-frames are extracted. The features in each stream are gathered into a spatial feature map. The temporal changes in both streams are learned using a new deep network, and the classification information of the two streams is combined to obtain an accurate action label. The proposed method is evaluated on three challenging datasets of real videos, UCFYT, UCF-sport, and JHMDB, on which it achieves accuracies of 98.7%, 99.83%, and 92.86%, respectively. On average, the proposed method outperforms the best previously introduced method by about 4.6 percent.
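To make the two ingredients named in the abstract more concrete, the following minimal sketch illustrates (i) a multi-resolution wavelet decomposition of a single key-frame and (ii) a late fusion of the class scores of two streams. It is an illustration under stated assumptions, not the implementation used in the paper: the Haar wavelet, the number of levels, the flattening of sub-bands into one vector, and the weighted averaging of softmax scores are all assumed here, since the abstract does not specify them. The function names `haar_dwt2`, `wavelet_features`, and `fuse_streams` are hypothetical.

```python
import numpy as np


def haar_dwt2(frame):
    """One level of an orthonormal 2D Haar transform (assumed wavelet).

    Returns the approximation band and three detail bands (LL, LH, HL, HH).
    """
    f = np.asarray(frame, dtype=np.float64)
    # Crop to even height and width so that pixels pair up cleanly.
    f = f[: f.shape[0] // 2 * 2, : f.shape[1] // 2 * 2]
    # Low-pass / high-pass filtering across adjacent column pairs.
    lo = (f[:, 0::2] + f[:, 1::2]) / np.sqrt(2)
    hi = (f[:, 0::2] - f[:, 1::2]) / np.sqrt(2)
    # Low-pass / high-pass filtering across adjacent row pairs.
    ll = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2)
    lh = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2)
    hl = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2)
    hh = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2)
    return ll, lh, hl, hh


def wavelet_features(frame, levels=2):
    """Flatten the coefficients of a multi-level decomposition into one vector.

    The number of levels is an assumption made for illustration only.
    """
    feats = []
    approx = frame
    for _ in range(levels):
        # Decompose the current approximation band one level further.
        approx, lh, hl, hh = haar_dwt2(approx)
        feats.extend([lh.ravel(), hl.ravel(), hh.ravel()])
    feats.append(np.asarray(approx).ravel())
    return np.concatenate(feats)


def softmax(z):
    """Numerically stable softmax over a 1D score vector."""
    z = np.asarray(z, dtype=np.float64)
    e = np.exp(z - z.max())
    return e / e.sum()


def fuse_streams(logits_a, logits_b, weight=0.5):
    """Late fusion by weighted averaging of per-stream class probabilities.

    Averaging is an assumed fusion rule, not the one stated in the paper.
    """
    p = weight * softmax(logits_a) + (1.0 - weight) * softmax(logits_b)
    return int(np.argmax(p)), p


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    key_frame = rng.random((64, 64))       # stand-in for a grayscale key-frame
    features = wavelet_features(key_frame, levels=2)
    label, probs = fuse_streams([2.0, 0.0, 0.0], [0.0, 0.0, 3.0])
    print(features.shape, label)
```

Because the Haar transform used here is orthonormal, the feature vector has as many entries as the (cropped) frame has pixels and preserves its energy; in a full pipeline these per-frame vectors would be stacked over the key-frames into the spatial feature map that a temporal network consumes.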