TY  - JOUR
AU  - McNally, William
AU  - Wong, Alexander
AU  - McPhee, John
PY  - 2018/12/24
Y2  - 2024/03/28
TI  - Action Recognition using Deep Convolutional Neural Networks and Compressed Spatio-Temporal Pose Encodings
JF  - Journal of Computational Vision and Imaging Systems
JA  - J. Comp. Vis. Imag. Sys.
VL  - 4
IS  - 1
SE  - Articles
DO  - 
UR  - https://openjournals.uwaterloo.ca/index.php/vsl/article/view/339
SP  - 3
AB  - Convolutional neural networks have recently shown proficiency at recognizing actions in RGB video. Existing models are generally very deep, requiring large amounts of data to train effectively. Moreover, they rely mainly on global appearance and could potentially underperform in single-environment applications, such as a sports event. To overcome these limitations, we propose to shortcut spatial learning by leveraging the activations within a human pose estimation network. The proposed framework integrates a human pose estimation network with a convolutional classifier via compressed encodings of pose activations. When evaluated on UTD-MHAD, a 27-class multimodal dataset, the pose-based RGB action recognition model achieves a classification accuracy of 98.4% in a subject-specific experiment and outperforms a baseline method that fuses depth and inertial sensor data.
ER  - 