Embedded Feature Representation in Dynamic Time Warping Space for 3D Action Recognition Using Kinect Depth Sensor

Document Type: Research Paper


1 M.Sc. Student of Electrical Engineering, Sharif University of Technology

2 Department of Electrical Engineering, Sharif University of Technology


This paper proposes a novel 3D action recognition technique that uses the skeletal information extracted from depth image sequences. First, each action is represented by a multidimensional time series in which each dimension represents the position variation of one skeleton joint over time. The time series is then mapped into a kernel Hilbert space using a metric defined by the Dynamic Time Warping (DTW) distance. Afterwards, a regularized Fisher strategy is used to remap the kernel space into a discriminative one. This incorporates the correlation-distinctiveness relationship of the sequences into the recognition process and also mitigates the curse of dimensionality in the kernel space. Unlike traditional kernel functions, the time warping used in the mapping strategy makes the kernel space robust to temporal shifts in the motion sequences. Moreover, our method eliminates the need for a complex design procedure for extracting the static and dynamic information of a motion sequence. Extensive experiments on three publicly available databases (TST, UTKinect, and UCFKinect) demonstrate the superiority of our method over a set of baseline algorithms.
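As an illustration of the DTW-based mapping described above, the following is a minimal sketch and not the implementation used in the paper. It assumes each action is a list of frames, each frame a flat list of joint coordinates, and it uses the textbook DTW recursion with a Euclidean frame distance. The function names `dtw_distance` and `dtw_embedding`, the exponential similarity, and the parameter `gamma` are hypothetical choices made for this example only; the regularized Fisher step is not shown.

```python
import math


def dtw_distance(a, b):
    """Classic DTW distance between two multidimensional time series.

    a, b: lists of frames; each frame is a list of floats of equal length
    (for example, concatenated 3D joint coordinates). Assumed layout.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of b is repeated
                cost[i][j - 1],      # frame of a is repeated
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def dtw_embedding(sequence, training_sequences, gamma=1.0):
    """Represent a sequence by its DTW-based similarity to each training
    sequence. The exponential form and gamma are illustrative assumptions,
    not the mapping defined in the paper."""
    return [
        math.exp(-gamma * dtw_distance(sequence, t))
        for t in training_sequences
    ]


if __name__ == "__main__":
    walk = [[0.0], [1.0], [2.0]]
    walk_delayed = [[0.0], [0.0], [1.0], [2.0]]  # same motion, shifted in time
    print(dtw_distance(walk, walk_delayed))
```

In this toy case the delayed sequence has DTW distance zero to the original, which illustrates the robustness to temporal shifts mentioned in the abstract. A discriminative projection, such as a regularized Fisher criterion, would then be learned on top of such embedded vectors.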