Skeleton-based Human Action/Interaction Classification in Sparse Image Sequences

Kasprzak, Włodzimierz; Piwowarski, Paweł

doi:10.14313/jamris/3-2023/18

References

C. Coppola, S. Cosar, D. R. Faria, and N. Bellotto. “Automatic detection of human interactions from RGB-D data for social activity classification,” 2017 26th IEEE International Symposium on Robot and Human Interactive Communication “RO-MAN”, Lisbon, 2017, pp. 871–876; doi: 10.1109/ROMAN.2017.8172405.
A. M. Zanchettin, A. Casalino, L. Piroddi, and P. Rocco. “Prediction of Human Activity Patterns for Human–Robot Collaborative Assembly Tasks,” IEEE Transactions on Industrial Informatics, vol. 15 (2019), no. 7, pp. 3934–3942; doi: 10.1109/TII.2018.2882741.
Z. Zhang, G. Peng, W. Wang, Y. Chen, Y. Jia, and S. Liu. “Prediction-Based Human-Robot Collaboration in Assembly Tasks Using a Learning from Demonstration Model,” Sensors, vol. 22 (2022), no. 11, 4279; doi: 10.3390/s22114279.
M. S. Ryoo. “Human activity prediction: Early Recognition of Ongoing Activities from Streaming Videos,” 2011 International Conference on Computer Vision, Barcelona, Spain, 2011, pp. 1036–1043; doi: 10.1109/ICCV.2011.6126349.
K. Viard, M. P. Fanti, G. Faraut, and J.-J. Lesage. “Human Activity Discovery and Recognition using Probabilistic Finite-State Automata,” IEEE Transactions on Automation Science and Engineering, vol. 17 (2020), no. 4, pp. 2085–2096; doi: 10.1109/TASE.2020.2989226.
S. Zhang, Z. Wei, J. Nie, L. Huang, S. Wang, and Z. Li. “A review on human activity recognition using vision-based method,” Journal of Healthcare Engineering, Hindawi, vol. 2017, Article ID 3090343; doi: 10.1155/2017/3090343.
A. Stergiou and R. Poppe. “Analyzing human-human interactions: a survey,” Computer Vision and Image Understanding, Elsevier, vol. 188 (2019), 102799; doi: 10.1016/j.cviu.2019.102799.
A. Bevilacqua, K. MacDonald, A. Rangarej, V. Widjaya, B. Caulfield, and T. Kechadi. “Human Activity Recognition with Convolutional Neural Networks,” Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2018), LNAI vol. 11053, Springer, Cham, Switzerland, 2019, pp. 541–552; doi: 10.1007/978-3-030-10997-4_33.
M. Liu and J. Yuan. “Recognizing Human Actions as the Evolution of Pose Estimation Maps,” 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, UT, USA, June 18–22, 2018, pp. 1159–1168; doi: 10.1109/CVPR.2018.00127.
E. Cippitelli, E. Gambi, S. Spinsante, and F. Florez-Revuelta. “Evaluation of a skeleton-based method for human activity recognition on a large-scale RGB-D dataset,” 2nd IET International Conference on Technologies for Active and Assisted Living (TechAAL 2016), London, UK, 2016; doi: 10.1049/IC.2016.0063.
Z. Cao, G. Hidalgo, T. Simon, S.-E. Wei, and Y. Sheikh. “OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(1):172–186, 2021; doi: 10.1109/TPAMI.2019.2929257.
A. Toshev and C. Szegedy. “DeepPose: Human Pose Estimation via Deep Neural Networks,” 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 2014, pp. 1653–1660; doi: 10.1109/CVPR.2014.214.
E. Insafutdinov, L. Pishchulin, B. Andres, M. Andriluka, and B. Schiele. “DeeperCut: a deeper, stronger, and faster multi-person pose estimation model,” Computer Vision – ECCV 2016, LNCS vol. 9907, Springer, Cham, Switzerland, 2016, pp. 34–50; doi: 10.1007/978-3-319-46466-4_3.
[Online]. NTU RGB+D 120 Dataset. Papers With Code. Available online: https://paperswithcode.com/dataset/ntu-rgb-d-120 (accessed on 30 June 2022).
M. Perez, J. Liu, and A. C. Kot. “Interaction Relational Network for Mutual Action Recognition,” arXiv:1910.04963 [cs.CV], 2019; https://arxiv.org/abs/1910.04963 (accessed on 15.07.2022).
L.-P. Zhu, B. Wan, C.-Y. Li, G. Tian, Y. Hou, and K. Yuan. “Dyadic relational graph convolutional networks for skeleton-based human interaction recognition,” Pattern Recognition, Elsevier, vol. 115, 2021, p. 107920; doi: 10.1016/j.patcog.2021.107920.
R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton. “Adaptive mixtures of local experts,” Neural Comput., 3(1):79–87, 1991.
S. Puchała, W. Kasprzak, and P. Piwowarski. “Feature engineering techniques for skeleton-based two-person interaction classification in video,” 17th International Conference on Control, Automation, Robotics and Vision (ICARCV), Singapore, 2022, IEEE Xplore, pp. 66–71; doi: 10.1109/ICARCV57592.2022.10004329.
P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan. “Object detection with discriminatively trained part-based models,” IEEE Trans. Pattern Anal. Mach. Intell., 2010, vol. 32, no. 9, pp. 1627–1645; doi: 10.1109/TPAMI.2009.167.
A. Krizhevsky, I. Sutskever, and G. E. Hinton. “ImageNet classification with deep convolutional neural networks,” Communications of the ACM, 2017, vol. 60(6), pp. 84–90; doi: 10.1145/3065386.
K. Simonyan and A. Zisserman. “Very Deep Convolutional Networks for Large-Scale Image Recognition,” arXiv, 2015, arXiv:1409.1556; https://arxiv.org/abs/1409.1556.
K. He, X. Zhang, S. Ren, and J. Sun. “Deep Residual Learning for Image Recognition,” Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016, pp. 770–778; doi: 10.1109/CVPR.2016.90.
T. L. Munea, Y. Z. Jembre, H. T. Weldegebriel, L. Chen, C. Huang, and C. Yang. “The Progress of Human Pose Estimation: A Survey and Taxonomy of Models Applied in 2D Human Pose Estimation,” IEEE Access, 2020, vol. 8, pp. 133330–133348; doi: 10.1109/ACCESS.2020.3010248.
K. Wei and X. Zhao. “Multiple-Branches Faster RCNN for Human Parts Detection and Pose Estimation,” Computer Vision – ACCV 2016 Workshops, Lecture Notes in Computer Science, vol. 10118, Springer, Cham, 2017; doi: 10.1007/978-3-319-54526-4.
Z. Su, M. Ye, G. Zhang, L. Dai, and J. Sheng. “Cascade feature aggregation for human pose estimation,” arXiv, 2019, arXiv:1902.07837; https://arxiv.org/abs/1902.07837.
H. Meng, M. Freeman, N. Pears, and C. Bailey. “Real-time human action recognition on an embedded, reconfigurable video processing architecture,” J. Real-Time Image Proc., vol. 3, 2008, no. 3, pp. 163–176; doi: 10.1007/s11554-008-0073-1.
K. G. Manosha Chathuramali and R. Rodrigo. “Faster human activity recognition with SVM,” International Conference on Advances in ICT for Emerging Regions (ICTer2012), Colombo, Sri Lanka, 12–15 December 2012, IEEE, 2012, pp. 197–203; doi: 10.1109/icter.2012.6421415.
X. Yan and Y. Luo. “Recognizing human actions using a new descriptor based on spatial–temporal interest points and weighted-output classifier,” Neurocomputing, Elsevier, vol. 87, 15 June 2012, pp. 51–61; doi: 10.1016/j.neucom.2012.02.002.
R. Vemulapalli, F. Arrate, and R. Chellappa. “Human Action Recognition by Representing 3D Skeletons as Points in a Lie Group,” 2014 IEEE Conference on Computer Vision and Pattern Recognition, 23–28 June 2014, Columbus, OH, USA, IEEE, pp. 588–595; doi: 10.1109/cvpr.2014.82.
J. Liu, A. Shahroudy, D. Xu, and G. Wang. “Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition,” Computer Vision – ECCV 2016, Lecture Notes in Computer Science, vol. 9907, Springer, Cham, Switzerland, 2016, pp. 816–833; doi: 10.1007/978-3-319-46487-9_50.
A. Shahroudy, J. Liu, T.-T. Ng, and G. Wang. “NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis,” arXiv:1604.02808 [cs.CV], 2016; https://arxiv.org/abs/1604.02808.
C. Li, Q. Zhong, D. Xie, and S. Pu. “Skeleton-based Action Recognition with Convolutional Neural Networks,” 2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), 10–14 July 2017, Hong Kong, pp. 597–600; doi: 10.1109/ICMEW.2017.8026285.
D. Liang, G. Fan, G. Lin, W. Chen, X. Pan, and H. Zhu. “Three-Stream Convolutional Neural Network With Multi-Task and Ensemble Learning for 3D Action Recognition,” 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 16–17 June 2019, Long Beach, CA, USA, IEEE, pp. 934–940; doi: 10.1109/cvprw.2019.00123.
S. Yan, Y. Xiong, and D. Lin. “Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition,” arXiv:1801.07455 [cs.CV], 2018; https://arxiv.org/abs/1801.07455 (accessed on 15.07.2022).
M. Li, S. Chen, X. Chen, Y. Zhang, Y. Wang, and Q. Tian. “Actional-Structural Graph Convolutional Networks for Skeleton-Based Action Recognition,” 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019, pp. 3590–3598; doi: 10.1109/CVPR.2019.00371.
L. Shi, Y. Zhang, J. Cheng, and H.-Q. Lu. “Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition,” arXiv:1805.07694v3 [cs.CV], 10 July 2019; doi: 10.48550/ARXIV.1805.07694.
L. Shi, Y. Zhang, J. Cheng, and H.-Q. Lu. “Skeleton-based action recognition with multi-stream adaptive graph convolutional networks,” IEEE Transactions on Image Processing, vol. 29, October 2020, pp. 9532–9545; doi: 10.1109/TIP.2020.3028207.
H. Duan, Y. Zhao, K. Chen, D. Shao, D. Lin, and B. Dai. “Revisiting Skeleton-based Action Recognition,” arXiv, 2021, arXiv:2104.13586; https://arxiv.org/abs/2104.13586.
H. Duan, Y. Zhao, K. Chen, D. Lin, and B. Dai. “Revisiting Skeleton-based Action Recognition,” arXiv:2104.13586v2 [cs.CV], 2 Apr 2022; https://arxiv.org/abs/2104.13586v2.
J. Liu, G. Wang, P. Hu, L.-Y. Duan, and A. C. Kot. “Global Context-Aware Attention LSTM Networks for 3D Action Recognition,” 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017, pp. 3671–3680; doi: 10.1109/CVPR.2017.391.
J. Liu, G. Wang, L.-Y. Duan, K. Abdiyeva, and A. C. Kot. “Skeleton-Based Human Action Recognition with Global Context-Aware Attention LSTM Networks,” IEEE Transactions on Image Processing (TIP), 27(4):1586–1599, 2018; doi: 10.1109/TIP.2017.2785279.
J. Liu, A. Shahroudy, G. Wang, L.-Y. Duan, and A. C. Kot. “Skeleton-Based Online Action Prediction Using Scale Selection Network,” IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 42(6):1453–1467, 2019; doi: 10.1109/TPAMI.2019.2898954.
T. Yu and H. Zhu. “Hyper-Parameter Optimization: A Review of Algorithms and Applications,” arXiv:2003.05689 [cs, stat], 2020; https://arxiv.org/abs/2003.05689.
[Online]. “openpose”, CMU-Perceptual-Computing-Lab, 2021; https://github.com/CMU-Perceptual-Computing-Lab/openpose/.
[Online]. “Keras: the Python deep learning API,” https://keras.io/.
[Online]. “UTKinect-3D Database,” Available online: http://cvrc.ece.utexas.edu/KinectDatasets/HOJ3D.html (accessed on 30 June 2022).
Kiwon Yun. “Two-person Interaction Detection Using Body-Pose Features and Multiple Instance Learning,” https://www3.cs.stonybrook.edu/~kyun/research/kinect_interaction/index.html.
