Article title

An RDF-based action recognition framework with feature selection capability, considering therapy exercises utilizing depth cameras

Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Emerging cost-efficient depth sensor technologies reveal new possibilities for coping with the difficulties of action recognition. Depth information improves the quality of the skeleton detection process; hence, pose estimation can be done more efficiently. Recently, many studies have focused on temporal analysis over estimated skeleton poses to recognize actions. In this paper we present a comprehensive study of spatiotemporal kinematic features and propose an action recognition framework with feature selection capability that deals with the multitude of features by leveraging the data mining capabilities of random decision forests. We describe human motion via a rich collection of kinematic feature time-series computed from the skeletal representation of the body in motion. We discriminatively optimize a random decision forest model over this collection to identify the most effective subset of features, localized both in time and space. We then train a support vector machine classifier on the selected features. This approach improves upon the baseline performance obtained using the whole feature set while using significantly fewer features (one tenth of the original). To justify our method we test the framework on various datasets and compare it with the state of the art. On the MSRC-12 dataset [25] (12 classes), our method achieves 94% accuracy. On the WorkoutSU-10 dataset [28], collected by our group (10 physical exercise classes), the accuracy is 98%. On the MSR Action3D dataset [9] (20 classes) we obtain 87% average accuracy, and on the UTKinect-Action dataset [10] (10 classes) the accuracy is 92%. Beyond regular activities, we also apply our approach to detecting a falling person, using a dataset we recorded as an extension of our original dataset. We test how our method adapts to this different type of action and obtain promising results. We argue that our approach provides insight into the spatiotemporal dynamics of human actions and can be used as part of various applications, especially for the rehabilitation of patients.
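The pipeline summarized above (kinematic feature time-series, random-decision-forest feature ranking, then an SVM on the selected subset) can be illustrated with a minimal sketch. The sketch below is not the authors' implementation: it assumes scikit-learn and uses impurity-based feature importances as the ranking criterion; the feature matrix `X`, labels `y`, and all dimensions are hypothetical placeholders, with only the one-tenth selection ratio taken from the abstract.

```python
# Minimal sketch (not the paper's code): rank spatiotemporal kinematic
# features with a random decision forest, keep the top tenth, then train
# an SVM classifier on the reduced feature set.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_sequences, n_features = 500, 1000             # placeholder sizes
X = rng.normal(size=(n_sequences, n_features))  # stand-in for kinematic feature descriptors
y = rng.integers(0, 10, size=n_sequences)       # stand-in for 10 exercise classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Step 1: fit a random forest and rank features by impurity-based importance.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_tr, y_tr)
top = np.argsort(forest.feature_importances_)[::-1][: n_features // 10]

# Step 2: train an SVM on the selected one-tenth of the features only.
svm = SVC(kernel="rbf", gamma="scale")
svm.fit(X_tr[:, top], y_tr)
print("held-out accuracy:", svm.score(X_te[:, top], y_te))
```

The paper additionally localizes the selected features in time and space; that step is omitted here because it depends on how the time-series descriptors are indexed.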
Year
Pages
3–22
Physical description
Bibliography: 49 items, figures, tables
Contributors
author
  • Sabancı University, Istanbul, Turkey
author
  • Boğaziçi University, Istanbul, Turkey
author
  • Sabancı University, Istanbul, Turkey
author
  • Sabancı University, Istanbul, Turkey
  • ISRA Vision, Istanbul, Turkey
Bibliography
  • [1] Bishop, C.: Pattern Recognition and Machine Learning. Springer, Secaucus, NJ, USA, 2006.
  • [2] Jain, A., Duin, R., Mao, J.: Statistical pattern recognition: A review. IEEE Trans. Pattern Anal. Mach. Intell., 22(1):4–37, 2000.
  • [3] Rougier, C., Meunier, J.: Demo: Fall detection using 3D head trajectory extracted from a single camera video sequence. Journal of Telemedicine and Telecare 11(4), 2005.
  • [4] Ofli, F., Chaudhry, R., Kurillo, G., Vidal, R., Bajcsy, R.: Sequence of the most informative joints (smij): A new representation for human skeletal action recognition. In HAU3D, 2012.
  • [5] Wang, J., Liu, Z., Wu, Y., Yuan, J.: Mining actionlet ensemble for action recognition with depth cameras. In CVPR, 2012.
  • [6] Bianco, S., Tisato, F.: Karate moves recognition from skeletal motion. In 3D Image Processing (3DIP) and Applications, San Francisco, CA, USA, 2012.
  • [7] Oreifej, O., Liu, Z.: Hon4d: Histogram of oriented 4d normals for activity recognition from depth sequences. In CVPR, 2013.
  • [8] Hadfield, S., Bowden, R.: Hollywood 3D: Recognizing actions in 3D natural scenes. In Proceedings of the Conference on Computer Vision and Pattern Recognition, Portland, Oregon, June 23-28, 2013.
  • [9] Li, W. Q., Zhang, Z. Y., Liu, Z. C.: Action recognition based on a bag of 3d points. In CVPR4HB10, pp. 9–14, 2010.
  • [10] Xia, L., Chen, C., Aggarwal, J.: View invariant human action recognition using histograms of 3d joints. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2012 IEEE Computer Society Conference on, pp. 20–27. IEEE, 2012.
  • [11] Yang, X., Tian, Y.: Eigenjoints-based action recognition using naive-bayes nearest-neighbor. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2012 IEEE Computer Society Conference on, pp. 14–19. IEEE, 2012.
  • [12] Zhu, Y., Chen, W., Guo, G.: Fusing Spatiotemporal Features and Joints for 3D Action Recognition. Computer Vision and Pattern Recognition Workshops (CVPRW), 2013 IEEE Conference on. IEEE, 2013.
  • [13] Li, W., Zhang, Z., Liu, Z.: Action recognition based on a bag of 3d points. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2010 IEEE Computer Society Conference on, pp. 9–14. IEEE, 2010.
  • [14] Jiang, X., et al.: Online robust action recognition based on a hierarchical model. The Visual Computer (2014): 1-13.
  • [15] Ellis, C., Masood, S. Z., Tappen, M. F., LaViola, J. J., Sukthankar, R.: Exploring the Trade-off Between Accuracy and Observational Latency in Action Recognition. International Journal of Computer Vision. doi:10.1007/s11263-012-0550-7, 2012.
  • [16] Chatzis, S.P., Kosmopoulos, D.I., Doliotis, P.: A conditional random field-based model for joint sequence segmentation and classification. Pattern Recognit. 46(6), 1569–1578, 2013.
  • [17] Lehrmann, A. M., Gehler, P. V., Nowozin, S.: Efficient Nonlinear Markov Models for Human Motion. Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on, pp. 1314-1321, IEEE 2014.
  • [18] Çeliktutan, O., et al.: Fast Exact Hyper-graph Matching with Dynamic Programming for Spatio-temporal Data. Journal of Mathematical Imaging and Vision (2014): 1-21.
  • [19] Chaaraoui, A. A., et al.: Evolutionary joint selection to improve human action recognition with RGB-D devices. Expert Systems with Applications 41.3 (2014): 786-794.
  • [20] Venkataraman, V., Turaga, P., Lehrer, N., Baran, M., Rikakis, T., Wolf, S. L.: Attractor shape for dynamical analysis of human movement: Applications in stroke rehabilitation and action recognition. In Human Activity Understanding from 3D Data (HAU3D'13), 2013.
  • [21] Soh, H., Demiris, Y.: Iterative temporal learning and prediction with the sparse online echo state gaussian process. In The 2012 international joint conference on neural networks (IJCNN), pp. 1–8, 2012.
  • [22] Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., et al.: Real-time human pose recognition in parts from single depth images. In CVPR 2011, pp. 1297–1304. doi:10.1109/CVPR.2011.5995316, 2011.
  • [23] Microsoft Corp., Redmond, WA: Kinect for Xbox 360.
  • [24] http://www.asus.com/Multimedia/Xtion_PRO_LIVE/
  • [25] Fothergill, S., Mentis, H., Kohli, P., Nowozin, S.: Instructing people for training gestural interactive systems. Proceedings of the 2012 ACM annual conference on Human Factors in Computing Systems - CHI ’12, 1737. doi:10.1145/2207676.2208303, 2012.
  • [26] Raptis, M., Kirovski, D., Hoppe, H.: Real-time classification of dance gestures from skeleton animation. Proceedings of the 2011 ACM SIGGRAPH/Eurographics Symposium on Computer Animation - SCA ’11, 147. doi:10.1145/2019406.2019426, 2011.
  • [27] Breiman, L.: Random forests. Machine Learning, 45(1):5, 2001.
  • [28] Negin, F., Özdemir, F., Akgül, C. B., Yüksel, K. A., Erçil, A.: A decision forest based feature selection framework for action recognition from rgb-depth cameras. In ICIAR, 2013.
  • [29] Yao, A., et al.: Does human action recognition benefit from pose estimation? In Proceedings of the 22nd British machine vision conference-BMVC 2011.
  • [30] Amit, Y., Geman, D.: Shape quantization and recognition with randomized trees. Neural computation 9(7):1545-1588, 1997.
  • [31] Gavrila, D., Davis, L.: Towards 3-d model-based tracking and recognition of human movement, in: International Workshop on Face and Gesture Recognition, pp. 272–277, 1995.
  • [32] Hogg, D.: Model-based vision: a program to see a walking person, Image and Vision Computing 1 (1):5–20, 1983.
  • [33] Rohr, K.: Towards model-based recognition of human movements in image sequences, Graphical Model and Image Processing 59 (1):94–115, 1994.
  • [34] Ju, S. X., Black, M. J., Yacoob, Y.: Cardboard people: a parameterized model of articulated image motion. Automatic Face and Gesture Recognition, 1996, Proceedings of the Second International Conference on, pp. 38-44, 14-16 Oct 1996, doi:10.1109/AFGR.1996.557241, 1996.
  • [35] Kakadiaris, I. A., Metaxas, D. N.: Model-based estimation of 3D human motion, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) 22(12):1453–1459, 2000.
  • [36] Howe, N. R.: Silhouette lookup for automatic pose tracking. In: Proceedings of the Conference on Computer Vision and Pattern Recognition Workshops (CVPRW’04), Los Alamitos, CA, June 2004, p. 15, 2004.
  • [37] Ramanan, D., Forsyth, D. A.: Finding and tracking people from the bottom up. In: Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR’03), vol. 2, Madison, WI, June 2003, pp. 467–474, 2003.
  • [38] Grest, D., Woetzel, J., Koch, R.: Nonlinear body pose estimation from depth images. In Proc. DAGM, 2005.
  • [39] Müller, M., Clausen, M.: Efficient content-based retrieval of motion capture data. ACM Transactions on Graphics (TOG) 24(3):677–685, 2005.
  • [40] Healey, F., et al.: Falls in English and Welsh hospitals: a national observational study based on retrospective analysis of 12 months of patient safety incident reports. Quality and Safety in Health Care 17(6):424-430, 2008.
  • [41] Dancing oldies prove much less likely to take a tumble. Thaindian News, 18 June 2008.
  • [42] Fu, Z., et al.: Fall detection using an address-event temporal contrast vision sensor. In Circuits and Systems, ISCAS 2008. IEEE International Symposium on. IEEE, 2008.
  • [43] Foroughi, H., et al.: An eigenspace-based approach for human fall detection using integrated time motion image and neural network. In Signal Processing, 9th International Conference on. IEEE, 2008.
  • [44] Miaou, S.-G., Sung, P.-H., Huang, C.-Y.: A customized human fall detection system using omni-camera images and personal information. In Distributed Diagnosis and Home Healthcare, 2006. D2H2. 1st Transdisciplinary Conference on. IEEE, 2006.
  • [45] Microsoft Kinect documentation May 2012 SDK Release, http://msdn.microsoft.com/enus/library/hh855347.asp.
  • [46] Criminisi, A., Shotton, J., Konukoglu, E.: Decision Forests for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning, Microsoft Research, 2011.
  • [47] Gini, C.: Concentration and dependency ratios (in Italian). English translation in Rivista di Politica Economica, 87, 769-789, 1997.
  • [48] Kullback, S., Leibler, R.A.: On Information and Sufficiency. Annals of Mathematical Statistics 22 (1): 79–86, 1951.
  • [49] Koperski, M., Bilinski, P., Bremond, F.: 3D trajectories for action recognition. In ICIP - The 21st IEEE International Conference on Image Processing. IEEE, 2014.
Document type
YADDA identifier
bwmeta1.element.baztech-1edcb5b7-5301-4bab-ab51-b128b60febef