Title variants
Publication languages
Abstracts
Vision-based hand gesture recognition is highly beneficial for the interaction and communication of disabled individuals. The hands and gestures of these users are distinctive, so a deep learning vision-based system must be adapted with a dedicated dataset for each individual. To this end, the paper presents a novel approach for training gesture classifiers from few-shot samples. More specifically, the gesture classifiers are fine-tuned segments of a pre-trained deep network. The overall framework consists of two modules. The first is a base feature learner and hand detector trained on images of the hands of non-disabled people; this module yields an ad hoc hand detector model. The second is a learner sub-classifier that leverages the convolution layers of the hand detector's feature extractor to build a shallow CNN trained with few-shot samples for gesture classification. The proposed approach thus enables segments of a pre-trained feature extractor to be reused to build a new sub-classification model. Results obtained by varying the size of the training dataset demonstrate the efficiency of the method compared with those in the literature.
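The two-module idea in the abstract (a frozen, pre-trained feature extractor reused under a small classifier trained on a handful of samples) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `frozen_features` is a hypothetical stand-in for the reused convolution layers, `ShallowHead` is a plain softmax classifier rather than the shallow CNN of the paper, and the 4x4 toy images are synthetic. It is not the authors' implementation.

```python
import math
import random


def frozen_features(image):
    """Hypothetical stand-in for frozen, pre-trained convolution layers.

    Returns one value per image row (the row mean); it has no trainable
    parameters, so few-shot training never changes it.
    """
    return [sum(row) / len(row) for row in image]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ShallowHead:
    """Small trainable softmax classifier placed on top of the frozen features."""

    def __init__(self, n_features, n_classes):
        self.w = [[0.0] * n_features for _ in range(n_classes)]
        self.b = [0.0] * n_classes

    def predict_proba(self, feats):
        scores = [sum(wj * f for wj, f in zip(row, feats)) + bias
                  for row, bias in zip(self.w, self.b)]
        return softmax(scores)

    def predict(self, feats):
        probs = self.predict_proba(feats)
        return max(range(len(probs)), key=probs.__getitem__)

    def fit(self, feats_list, labels, epochs=200, lr=0.5):
        # Plain stochastic gradient descent on the cross-entropy loss.
        for _ in range(epochs):
            for feats, label in zip(feats_list, labels):
                probs = self.predict_proba(feats)
                for c, row in enumerate(self.w):
                    grad = probs[c] - (1.0 if c == label else 0.0)
                    for j, f in enumerate(feats):
                        row[j] -= lr * grad * f
                    self.b[c] -= lr * grad


def make_image(label, rng):
    """Toy 4x4 'gesture': class 0 is bright on top, class 1 on the bottom."""
    image = []
    for r in range(4):
        bright = (r < 2) == (label == 0)
        base = 0.9 if bright else 0.1
        image.append([base + rng.uniform(-0.05, 0.05) for _ in range(4)])
    return image


def run_demo(shots_per_class=3, seed=0):
    """Train only the head on a few samples per class; return test accuracy."""
    rng = random.Random(seed)
    train = [(make_image(y, rng), y) for y in (0, 1) for _ in range(shots_per_class)]
    test = [(make_image(y, rng), y) for y in (0, 1) for _ in range(20)]
    head = ShallowHead(n_features=4, n_classes=2)
    head.fit([frozen_features(img) for img, _ in train], [y for _, y in train])
    hits = sum(head.predict(frozen_features(img)) == y for img, y in test)
    return hits / len(test)


if __name__ == "__main__":
    print("test accuracy:", run_demo())
```

Only the head's weights are updated during `fit`; the feature function stays fixed, which is what makes training feasible with very few samples per user in this toy setting.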
Keywords
Journal
Year
Volume
Pages
1--23
Physical description
Bibliography: 57 items, figures, tables.
Authors
author
- Djillali Liabes University, Computer Science Department, Communication Networks, Architecture and Multimedia Laboratory, Algeria, mohamed.elbahri@univ-sba.dz
author
- Djillali Liabes University, Electronics Department, Communication Networks, Architecture and Multimedia Laboratory, Algeria, ne_taleb@univ-sba.dz
- Djillali Liabes University, IRECOM Laboratory, Algeria, elmehdi.ardjoun@univ-sba.dz
- Djillali Liabes University, Electronics Department, Communication Networks, Architecture and Multimedia Laboratory, Algeria, ne_taleb@univ-sba.dz
Bibliography
- [1] Bambach, S., Lee, S., Crandall, D. J., & Yu, C. (2015). Lending a hand: Detecting hands and recognizing activities in complex egocentric interactions. 2015 IEEE International Conference on Computer Vision (ICCV) (pp. 1949–1957). IEEE. https://doi.org/10.1109/ICCV.2015.226
- [2] Bandini, A., & Zariffa, J. (2020). Analysis of the hands in egocentric vision: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(6), 6846–6866. https://doi.org/10.1109/TPAMI.2020.2986648
- [3] Bao, P., Maqueda, A. I., del-Blanco, C. R., & García, N. (2017). Tiny hand gesture recognition without localization via a deep convolutional network. IEEE Transactions on Consumer Electronics, 63(3), 251–257. https://doi.org/10.1109/TCE.2017.014971
- [4] Barczak, A. L. C., Reyes, N. H., Abastillas, M., Piccio, A., & Susnjak, T. (2011). A new 2D static hand gesture colour image dataset for ASL gestures. Research Letters in the Information and Mathematical Sciences, 15.
- [5] El Moataz, A., Mammass, D., Mansouri, A., & Nouboud, F. (Eds.). (2020). Image and Signal Processing. 9th International Conference (ICISP 2020). Springer International Publishing. https://doi.org/10.1007/978-3-030-51935-3
- [6] Bhaumik, G., Verma, M., Govil, M. C., & Vipparthi, S. K. (2022). ExtriDeNet: An intensive feature extrication deep network for hand gesture recognition. The Visual Computer, 38(11), 3853–3866. https://doi.org/10.1007/s00371-021-02225-z
- [7] Chattoraj, S., Karan, V., & Tanmay, P. (2017). Assistive system for physically disabled people using gesture recognition. 2017 IEEE 2nd International Conference on Signal and Image Processing (ICSIP) (pp. 60–65). IEEE. https://doi.org/10.1109/SIPROCESS.2017.8124506
- [8] Damaneh, M. M., Mohanna, F., & Jafari, P. (2023). Static hand gesture recognition in sign language based on convolutional neural network with feature extraction method using ORB descriptor and Gabor filter. Expert Systems with Applications, 211, 118559. https://doi.org/10.1016/j.eswa.2022.118559
- [9] Dardas, N. H., & Georganas, N. D. (2011). Real-Time hand gesture detection and recognition using bag-of-features and support vector machine techniques. IEEE Transactions on Instrumentation and Measurement, 60(11), 3592–3607. https://doi.org/10.1109/TIM.2011.2161140
- [10] Deng, X., Zhang, Y., Yang, S., Tan, P., Chang, L., Yuan, Y., & Wang, H. (2018). Joint hand detection and rotation estimation using CNN. IEEE Transactions on Image Processing, 27(4), 1888–1900. https://doi.org/10.1109/TIP.2017.2779600
- [11] Everingham, M., Van Gool, L., Williams, C. K. I., Winn, J., & Zisserman, A. (2010). The Pascal visual object classes (VOC) challenge. International Journal of Computer Vision, 88, 303–338. https://doi.org/10.1007/s11263-009-0275-4
- [12] Fang, L., Liang, N., Kang, W., Wang, Z., & Feng, D. D. (2020). Real-time hand posture recognition using hand geometric features and Fisher Vector. Signal Processing: Image Communication, 82, 115729. https://doi.org/10.1016/j.image.2019.115729
- [13] Fathi, A., Farhadi, A., & Rehg, J. M. (2011). Understanding egocentric activities. 2011 International Conference on Computer Vision (pp. 407–414). IEEE. https://doi.org/10.1109/ICCV.2011.6126269
- [14] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 770–778). IEEE. https://doi.org/10.1109/CVPR.2016.90
- [15] Henderson, P., & Ferrari, V. (2017). End-to-End Training of Object Class Detectors for Mean Average Precision. In S.-H. Lai, V. Lepetit, K. Nishino, & Y. Sato (Eds.), Computer Vision – ACCV 2016 (pp. 198–213). Springer International Publishing. https://doi.org/10.1007/978-3-319-54193-8_13
- [16] Hsiao, Y.-S., Sanchez-Riera, J., Lim, T., Hua, K.-L., & Cheng, W.-H. (2014). LaRED: A large RGB-D extensible hand gesture dataset. 5th ACM Multimedia Systems Conference (pp. 53–58). https://doi.org/10.1145/2557642.2563669
- [17] Hu, J., Shen, L., & Sun, G. (2018). Squeeze-and-Excitation Networks. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 7132–7141). IEEE. https://doi.org/10.1109/CVPR.2018.00745
- [18] Huang, G., Liu, Z., Maaten, L. V. D., & Weinberger, K. Q. (2017). Densely connected convolutional networks. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 2261–2269). IEEE. https://doi.org/10.1109/CVPR.2017.243
- [19] Huiwei, Z., Mingqiang, Y., Zhenxing, C., & Qinghe, Z. (2017). A method for static hand gesture recognition based on non-negative matrix factorization and compressive sensing. IAENG International Journal of Computer Science, 44(1), 52–59.
- [20] Hung, C.-H., Bai, Y.-W., & Wu, H.-Y. (2015). Home appliance control by a hand gesture recognition belt in LED array lamp case. 2015 IEEE 4th Global Conference on Consumer Electronics (GCCE) (pp. 599–600). IEEE. https://doi.org/10.1109/GCCE.2015.7398611
- [21] Hung, C.-H., Bai, Y.-W., & Wu, H.-Y. (2016). Home outlet and LED array lamp controlled by a smartphone with a hand gesture recognition. 2016 IEEE International Conference on Consumer Electronics (ICCE) (pp. 5–6). IEEE. https://doi.org/10.1109/ICCE.2016.7430502
- [22] Ishiyama, H., & Kurabayashi, S. (2016). Monochrome glove: A robust real-time hand gesture recognition method by using a fabric glove with design of structured markers. 2016 IEEE Virtual Reality (VR), 187–188. https://doi.org/10.1109/VR.2016.7504716
- [23] Kapitanov, A., Kvanchiani, K., Nagaev, A., Kraynov, R., & Makhlyarchuk, A. (2022). HaGRID - HAnd Gesture recognition image dataset. ArXiv abs/2206.08219. https://doi.org/10.48550/arXiv.2206.08219
- [24] Li, Y., Ye, Z., & Rehg, J. M. (2015). Delving into egocentric actions. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 287–295). IEEE. https://doi.org/10.1109/CVPR.2015.7298625
- [25] Li, Z., Tang, H., Peng, Z., Qi, G.-J., & Tang, J. (2023). Knowledge-guided semantic transfer network for few-shot image recognition. IEEE Transactions on Neural Networks and Learning Systems, 1–15. https://doi.org/10.1109/TNNLS.2023.3240195
- [26] Liang, H., Yuan, J., & Thalmann, D. (2015). Egocentric hand pose estimation and distance recovery in a single RGB image. 2015 IEEE International Conference on Multimedia and Expo (ICME) (pp. 1–6). IEEE. https://doi.org/10.1109/ICME.2015.7177448
- [27] Likitlersuang, J., Sumitro, E. R., Cao, T., Visée, R. J., Kalsi-Ryan, S., & Zariffa, J. (2019). Egocentric video: A new tool for capturing hand use of individuals with spinal cord injury at home. Journal of NeuroEngineering and Rehabilitation, 16, 83. https://doi.org/10.1186/s12984-019-0557-1
- [28] Liu, G., Dundar, A., Shih, K. J., Wang, T.-C., Reda, F. A., Sapra, K., Yu, Z., Yang, X., Tao, A., & Catanzaro, B. (2023). Partial convolution for padding, inpainting, and image synthesis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(5), 6096–6110. https://doi.org/10.1109/TPAMI.2022.3209702
- [29] Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., & Berg, A. C. (2016). SSD: Single shot MultiBox detector. In B. Leibe, J. Matas, N. Sebe, & M. Welling (Eds.), Computer Vision – ECCV 2016 (pp. 21–37). Springer International Publishing. https://doi.org/10.1007/978-3-319-46448-0_2
- [30] Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60, 91–110. https://doi.org/10.1023/B:VISI.0000029664.99615.94
- [31] Mittal, A., Zisserman, A., & Torr, P. (2011). Hand detection using multiple proposals. Proceedings of the British Machine Vision Conference 2011 (pp. 75.1–75.11). https://doi.org/10.5244/C.25.75
- [32] Mohammed, A. A. Q., Lv, J., & Islam, M. S. (2019). A Deep Learning-Based End-to-End composite system for hand detection and gesture recognition. Sensors, 19(23), 5282. https://doi.org/10.3390/s19235282
- [33] Nuzzi, C., Pasinetti, S., Pagani, R., Coffetti, G., & Sansoni, G. (2021, March 8). HANDS: A dataset of static Hand-Gestures for Human-Robot Interaction. https://doi.org/10.17632/ndrczc35bt.1
- [34] Panwar, M. (2012). Hand gesture recognition based on shape parameters. 2012 International Conference on Computing, Communication and Applications (pp. 1–6). IEEE. https://doi.org/10.1109/ICCCA.2012.6179213
- [35] Pirsiavash, H., & Ramanan, D. (2012). Detecting activities of daily living in first-person camera views. 2012 IEEE Conference on Computer Vision and Pattern Recognition (pp. 2847–2854). IEEE. https://doi.org/10.1109/CVPR.2012.6248010
- [36] Pugeault, N., & Bowden, R. (2011). Spelling it out: Real-time ASL fingerspelling recognition. 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops) (pp. 1114–1119). https://doi.org/10.1109/ICCVW.2011.6130290
- [37] Rahim, M. A., Shin, J., & Yun, K. S. (2021). Hand gesture-based sign alphabet recognition and sentence interpretation using a convolutional neural network. Annals of Emerging Technologies in Computing, 4(4), 20–27. https://doi.org/10.33166/AETiC.2020.04.003
- [38] Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You Only Look Once: Unified, real-time object detection. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 779–788). IEEE. https://doi.org/10.1109/CVPR.2016.91
- [39] Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Advances in Neural Information Processing Systems, 28.
- [40] Sahoo, J. P., Ari, S., & Patra, S. K. (2019). Hand gesture recognition using PCA based deep CNN reduced features and SVM Classifier. 2019 IEEE International Symposium on Smart Electronic Systems (iSES) (Formerly iNiS) (pp. 221–224). IEEE. https://doi.org/10.1109/iSES47678.2019.00056
- [41] Sahoo, J. P., Sahoo, S. P., Ari, S., & Patra, S. K. (2022). RBI-2RCNN: Residual block intensity feature using a two-stage residual convolutional neural network for static hand gesture recognition. Signal, Image and Video Processing, 16(8), 2019–2027. https://doi.org/10.1007/s11760-022-02163-w
- [42] Sahoo, J. P., Sahoo, S. P., Ari, S., & Patra, S. K. (2023). DeReFNet: Dual-stream dense Residual fusion network for static hand gesture recognition. Displays, 77, 102388. https://doi.org/10.1016/j.displa.2023.102388
- [43] Sharma, A., Mittal, A., Singh, S., & Awatramani, V. (2020). Hand gesture recognition using image processing and feature extraction techniques. Procedia Computer Science, 173, 181–190. https://doi.org/10.1016/j.procs.2020.06.022
- [44] Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. ArXiv abs/1409.1556. https://doi.org/10.48550/arXiv.1409.1556
- [45] Srividya, M., Anala, M., Dushyanth, N., & Raju, D. V. S. K. (2019). Hand recognition and motion analysis using faster RCNN. 2019 4th International Conference on Computational Systems and Information Technology for Sustainable Solution (CSITSS) (pp. 1–4). IEEE. https://doi.org/10.1109/CSITSS47250.2019.9031033
- [46] Szegedy, C., Ioffe, S., Vanhoucke, V., & Alemi, A. A. (2017). Inception-v4, inception-ResNet and the impact of residual connections on learning. Thirty-First AAAI Conference on Artificial Intelligence (pp. 4278–4284). https://doi.org/10.1609/aaai.v31i1.11231
- [47] Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., & Rabinovich, A. (2015). Going deeper with convolutions. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 1–9). IEEE. https://doi.org/10.1109/CVPR.2015.7298594
- [48] Tang, H., Yuan, C., Li, Z., & Tang, J. (2022). Learning attention-guided pyramidal features for few-shot fine-grained recognition. Pattern Recognition, 130, 108792. https://doi.org/10.1016/j.patcog.2022.108792
- [49] Utaminingrum, F., Fauzi, M. A., Wihandika, R. C., Adinugroho, S., Kurniawan, T. A., Syauqy, D., Sari, Y. A., & Adikara, P. P. (2017). Development of computer vision based obstacle detection and human tracking on smart wheelchair for disabled patient. 2017 5th International Symposium on Computational and Business Intelligence (ISCBI) (pp. 1–5). IEEE. https://doi.org/10.1109/ISCBI.2017.8053533
- [50] Virender, R., Nikita, Y., & Pulkit, G. (2018). American sign language fingerspelling using hybrid discrete wavelet transform-gabor filter and convolutional neural network. Journal of Engineering Science and Technology, 13(9), 2655–2669.
- [51] Vu, A.-K. N., Nguyen, N.-D., Nguyen, K.-D., Nguyen, V.-T., Ngo, T. D., Do, T.-T., & Nguyen, T. V. (2022). Few-shot object detection via baby learning. Image and Vision Computing, 120, 104398. https://doi.org/10.1016/j.imavis.2022.104398
- [52] Weiss, K., Khoshgoftaar, T. M., & Wang, D. (2016). A survey of transfer learning. Journal of Big Data, 3, 9. https://doi.org/10.1186/s40537-016-0043-6
- [53] Xu, C., Cai, W., Li, Y., Zhou, J., & Wei, L. (2020). Accurate hand detection from single-color images by reconstructing hand appearances. Sensors, 20(1), 192. https://doi.org/10.3390/s20010192
- [54] Yang, G., Wang, S., & Yang, J. (2019). Desire-Driven Reasoning for Personal Care Robots. IEEE Access, 7, 75203–75212. https://doi.org/10.1109/ACCESS.2019.2921112
- [55] Zhang, Y., Cao, C., Cheng, J., & Lu, H. (2018). EgoGesture: A new dataset and benchmark for egocentric hand gesture recognition. IEEE Transactions on Multimedia, 20(5), 1038–1050. https://doi.org/10.1109/TMM.2018.2808769
- [56] Zhao, A., Wu, H., Chen, M., & Wang, N. (2023). A spatio-temporal siamese neural network for multimodal handwriting abnormality screening of Parkinson’s Disease. International Journal of Intelligent Systems, 2023, 9921809. https://doi.org/10.1155/2023/9921809
- [57] Zheng, Q., Yang, M., Yang, J., Zhang, Q., & Zhang, X. (2018). Improvement of generalization ability of deep CNN via implicit regularization in two-stage training process. IEEE Access, 6, 15844–15869. https://doi.org/10.1109/ACCESS.2018.2810849
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.baztech-cb4e0a18-7a52-4138-a2b3-fc9c1d5ec4ff