Abstract: Traditional methods for action recognition consist of several isolated processing stages and depend on carefully hand-designed features, which makes them time-consuming and difficult to optimize as a whole. In this paper, we study deep learning-based action recognition on depth sequences and construct a 3D deep convolutional neural network that automatically learns spatio-temporal features from raw depth sequences. A Softmax classifier is applied to the learned features to perform action recognition. Experimental results demonstrate that our method can learn feature representations automatically from depth sequences. The proposed method achieves results comparable to state-of-the-art methods on the MSR-Action3D dataset and performs well in comparison with baseline methods on the UTKinect-Action3D dataset. The proposed method is also simpler, because feature extraction and action recognition form a single closed-loop system that learns features automatically. We further investigate the generalization of the trained model by transferring the features learned on one dataset (MSR-Action3D) to another (UTKinect-Action3D) without retraining, and obtain very promising classification accuracy.
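As a rough illustration of the two building blocks named in the abstract, a 3D convolution over a depth clip and a Softmax over class scores, the following is a minimal sketch in plain Python. It is not the network used in this paper: the single kernel, the ReLU nonlinearity, the toy clip size, and the function names `conv3d_relu` and `softmax` are all assumptions made for illustration.

```python
import math


def conv3d_relu(clip, kernel):
    """Valid 3D cross-correlation of a T x H x W clip with a t x h x w kernel,
    followed by ReLU (the nonlinearity is an assumption, not from the paper)."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    t, h, w = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for i in range(T - t + 1):          # slide along time
        plane = []
        for j in range(H - h + 1):      # slide along height
            row = []
            for k in range(W - w + 1):  # slide along width
                s = 0.0
                for a in range(t):
                    for b in range(h):
                        for c in range(w):
                            s += clip[i + a][j + b][k + c] * kernel[a][b][c]
                row.append(max(s, 0.0))
            plane.append(row)
        out.append(plane)
    return out


def softmax(scores):
    """Numerically stable Softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy usage: a hypothetical 3-frame, 3x3 depth clip and one 2x2x2 kernel.
clip = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
kernel = [[[1.0] * 2 for _ in range(2)] for _ in range(2)]
features = conv3d_relu(clip, kernel)    # shape 2 x 2 x 2
probs = softmax([0.0, 0.0])             # two equally likely classes
```

Because the kernel spans several frames, each output value depends on both spatial and temporal neighbours, which is what makes the learned features spatio-temporal. A real model would stack many such learned kernels with pooling layers before the classifier.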