Article title

An efficient pedestrian attribute recognition system under challenging conditions

Publication languages
EN
Abstracts
EN
In this work, an efficient pedestrian attribute recognition system is introduced. The system is based on a novel processing pipeline that combines the best-performing attribute extraction model with an efficient attribute filtering algorithm that uses human pose keypoints. The attribute extraction models are built on several state-of-the-art deep networks, namely ResNet50, Swin Transformer, and ConvNeXt, via transfer learning: pre-trained models of these networks are fine-tuned on the Ensemble Pedestrian Attribute Recognition (EPAR) dataset. Several optimization techniques, including the Adam optimizer with Decoupled Weight Decay Regularization (AdamW), Random Erasing (RE), and weighted loss functions, are adopted to address data imbalance and challenging conditions such as partially visible and occluded bodies. Experimental evaluations are performed on EPAR, which contains 26 993 images of 1477 person IDs, most of them captured in challenging conditions. The results show that ConvNeXt-v2-B outperforms the other networks: its mean accuracy (mA) reaches 85.57%, and its other metrics are also the highest. Adding AdamW or RE improves accuracy by 1-2%. The new loss functions mitigate data imbalance, improving the accuracy of under-represented attributes by up to 14% in the best case. Most notably, when the attribute filtering algorithm is applied, mA rises to 94.85%. Combining a state-of-the-art attribute extraction model and these optimization techniques with a large-scale, diverse dataset and attribute filtering therefore appears to be a sound approach with high potential for practical applications.
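The abstract reports mean accuracy (mA) without defining it. The following is a minimal sketch, assuming the label-based definition commonly used in the pedestrian attribute recognition literature: for each attribute, average the true-positive rate and the true-negative rate, then average over all attributes. The function name `mean_accuracy` and the toy labels are illustrative assumptions and are not taken from the paper.

```python
from typing import Sequence


def mean_accuracy(y_true: Sequence[Sequence[int]],
                  y_pred: Sequence[Sequence[int]]) -> float:
    """Label-based mean accuracy over binary attributes.

    Rows are samples, columns are attributes, values are 0 or 1.
    Assumed definition: mean over attributes of (TPR + TNR) / 2.
    """
    n_attrs = len(y_true[0])
    per_attr = []
    for j in range(n_attrs):
        tp = fn = tn = fp = 0
        for t_row, p_row in zip(y_true, y_pred):
            t, p = t_row[j], p_row[j]
            if t == 1 and p == 1:
                tp += 1
            elif t == 1:
                fn += 1
            elif p == 0:
                tn += 1
            else:
                fp += 1
        # Guard against attributes with no positive or no negative samples.
        tpr = tp / (tp + fn) if (tp + fn) else 0.0
        tnr = tn / (tn + fp) if (tn + fp) else 0.0
        per_attr.append((tpr + tnr) / 2)
    return sum(per_attr) / n_attrs


# Toy example: 4 samples, 2 attributes (hypothetical labels).
y_true = [[1, 0], [1, 1], [0, 0], [0, 1]]
y_pred = [[1, 0], [0, 1], [0, 0], [0, 0]]
print(mean_accuracy(y_true, y_pred))  # 0.75
```

Because the positive and negative classes contribute equally for each attribute, this metric is less dominated by frequent labels than plain accuracy, which is relevant to the data imbalance the abstract describes.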
Pages
3-18
Physical description
Bibliography: 38 items, figures, tables, charts.
Authors
author
  • Research Group Intelligent Robots, Hanoi University of Science and Technology, 1 Dai Co Viet, Hanoi, Vietnam
  • CMC Applied Technology Institute, CMC Corporation, 11 Duy Tan, Hanoi, Vietnam
  • CMC Applied Technology Institute, CMC Corporation, 11 Duy Tan, Hanoi, Vietnam
author
  • School of Applied Mathematics and Informatics, Hanoi University of Science and Technology, 1 Dai Co Viet, Hanoi, Vietnam
  • CMC Applied Technology Institute, CMC Corporation, 11 Duy Tan, Hanoi, Vietnam
author
  • CMC Applied Technology Institute, CMC Corporation, 11 Duy Tan, Hanoi, Vietnam
  • CMC University, CMC Corporation, 11 Duy Tan, Hanoi, Vietnam
  • Posts and Telecommunication Institute of Technology, KM 10 Nguyen Trai, Ha Dong, Hanoi, Vietnam
Bibliography
  • [1] L. Bourdev, S. Maji, and J. Malik. Describing people: A poselet-based approach to attribute classification. In Proc. 2011 Int. Conf. Computer Vision (ICCV), pages 1543-1550, Barcelona, Spain, 6-13 Nov 2011. IEEE. doi:10.1109/ICCV.2011.6126413.
  • [2] W.-C. Chen, X.-Y. Yu, and L.-L. Ou. Pedestrian attribute recognition in video surveillance scenarios based on view-attribute attention localization. Machine Intelligence Research, 19(2):153-168, 2022. doi:10.1007/s11633-022-1321-8.
  • [3] X. Cheng, M. Jia, Q. Wang, and J. Zhang. A simple visual-textual baseline for pedestrian attribute recognition. IEEE Transactions on Circuits and Systems for Video Technology, 32(10):6994-7004, 2022. doi:10.1109/TCSVT.2022.3178144.
  • [4] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. Imagenet: A large-scale hierarchical image database. In Proc. 2009 IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pages 248-255, Miami, FL, USA, 20-25 Jun 2009. doi:10.1109/CVPR.2009.5206848.
  • [5] Y. Deng, P. Luo, C. C. Loy, and X. Tang. Pedestrian attribute recognition at far distance. In Proc. 22nd ACM Int. Conf. Multimedia (MM’14), ACM Conferences, pages 789-792, Orlando, FL, USA, 3-7 Nov 2014. doi:10.1145/2647868.2654966.
  • [6] A. Diba, A. M. Pazandeh, H. Pirsiavash, and L. Van Gool. Deepcamp: Deep convolutional action & attribute mid-level patterns. In Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pages 3557-3565, Las Vegas, NV, USA, 27-30 Jun 2016. doi:10.1109/CVPR.2016.387.
  • [7] H. Galiyawala, M. S. Raval, and M. Patel. Person retrieval in surveillance videos using attribute recognition. Journal of Ambient Intelligence and Humanized Computing, pages 1-13, 2022. doi:10.1007/s12652-022-03891-0.
  • [8] G. Gkioxari, R. Girshick, and J. Malik. Actions and attributes from wholes and parts. In Proc. IEEE Int. Conf. Computer Vision (ICCV), pages 2470-2478, Santiago, Chile, 13-16 Dec 2015. doi:10.1109/ICCV.2015.284.
  • [9] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proc. 2016 IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pages 770-778, Las Vegas, NV, USA, 27-30 Jun 2016. doi:10.1109/CVPR.2016.90.
  • [10] J. Jia, H. Huang, X. Chen, and K. Huang. Rethinking of pedestrian attribute recognition: A reliable evaluation under zero-shot pedestrian identity setting. arXiv, 2021. arXiv:2107.03576. doi:10.48550/arXiv.2107.03576.
  • [11] J. Joo, S. Wang, and S.-C. Zhu. Human attribute recognition by rich appearance dictionary. In Proc. IEEE Int. Conf. Computer Vision (ICCV), pages 721-728, Sydney, Australia, 1-8 Dec 2013. doi:10.1109/ICCV.2013.95.
  • [12] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv, 2014. arXiv:1412.6980. doi:10.48550/arXiv.1412.6980.
  • [13] D.-H. Lee. Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks. In Proc. Workshop on Challenges in Representation Learning (WREPL), part of Int. Conf. Machine Learning (ICML), page 896. Atlanta, GA, USA, 16-21 Jun 2013.
  • [14] D. Li, X. Chen, and K. Huang. Multi-attribute learning for pedestrian attribute recognition in surveillance scenarios. In Proc. 2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR), pages 111-115, Kuala Lumpur, Malaysia, 3-6 Nov 2015. IEEE. doi:10.1109/ACPR.2015.7486476.
  • [15] D. Li, X. Chen, Z. Zhang, and K. Huang. Pose guided deep model for pedestrian attribute recognition in surveillance scenarios. In Proc. 2018 IEEE Int. Conf. Multimedia and Expo (ICME), pages 1-6, San Diego, CA, USA, 23-27 Jul 2018. doi:10.1109/ICME.2018.8486604.
  • [16] D. Li, Z. Zhang, X. Chen, and K. Huang. A richly annotated pedestrian dataset for person retrieval in real surveillance scenarios. IEEE Transactions on Image Processing, 28(4):1575-1590, 2018. doi:10.1109/TIP.2018.2878349.
  • [17] Y. Li, C. Huang, C. C. Loy, and X. Tang. Human attribute recognition by deep hierarchical contexts. In Computer Vision, Proc. 14th European Conf. Computer Vision (ECCV 2016), volume 9910 Part VI of Lecture Notes in Computer Science, pages 684-700, Amsterdam, The Netherlands, 11-14 Oct 2016. Springer. doi:10.1007/978-3-319-46466-4_41.
  • [18] Y. Lin, L. Zheng, Z. Zheng, Y. Wu, Z. Hu, C. Yan, and Y. Yang. Improving person re-identification by attribute and identity learning. Pattern Recognition, 95:151-161, 2019. doi:10.1016/j.patcog.2019.06.006.
  • [19] P. Liu, X. Liu, J. Yan, and J. Shao. Localization guided learning for pedestrian attribute recognition. In Proc. British Machine Vision Conference (BMVC 2018), Northumbria, UK, 3-6 Sep 2018. BMVA Press. Accessible also as arXiv:1808.09102. https://bmva-archive.org.uk/bmvc/2018/contents/papers/0573.pdf.
  • [20] X. Liu, H. Zhao, M. Tian, L. Sheng, J. Shao, S. Yi, J. Yan, and X. Wang. Hydraplus-net: Attentive deep features for pedestrian analysis. In Proc. IEEE Int. Conf. Computer Vision (ICCV), pages 350-359, Venice, Italy, 22-29 Oct 2017. doi:10.1109/ICCV.2017.46.
  • [21] Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo. Swin transformer: Hierarchical vision transformer using shifted windows. In Proc. IEEE/CVF Int. Conf. Computer Vision (ICCV), pages 10012-10022, Montreal, QC, Canada, 10-17 Oct 2021. doi:10.1109/ICCV48922.2021.00986.
  • [22] Z. Liu, H. Mao, C.-Y. Wu, C. Feichtenhofer, T. Darrell, and S. Xie. A ConvNet for the 2020s. In Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), pages 11976-11986, New Orleans, LA, USA, 18-24 Jun 2022. doi:10.1109/CVPR52688.2022.01167.
  • [23] I. Loshchilov and F. Hutter. Decoupled weight decay regularization. In Proc. 7th Int. Conf. Learning Representations (ICLR), New Orleans, LA, USA, 6-9 May 2019. https://openreview.net/forum?id=Bkg6RiCqY7.
  • [24] D. Maji, S. Nagori, M. Mathew, and D. Poddar. YOLO-Pose: Enhancing YOLO for multi person pose estimation using object keypoint similarity loss. In Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition Workshops (CVPRW), pages 2636-2645, New Orleans, LA, USA, 19-20 Jun 2022. doi:10.1109/CVPRW56347.2022.00297.
  • [25] OpenCV Team. OpenCV, 2022. https://opencv.org. [Accessed 15 Jan 2022].
  • [26] H. X. Nguyen, D. N. Hoang, T. V. Nguyen, T. M. Dang, A. D. Pham, and D.-T. Nguyen. Person re-identification from multiple surveillance cameras combining face and body feature matching. Modern Physics Letters B, 37(19):2340031, 2023. doi:10.1142/S0217984923400316.
  • [27] S. Sakib, K. Deb, P. K. Dhar, and O.-J. Kwon. A framework for pedestrian attribute recognition using deep learning. Applied Sciences, 12(2):622, 2022. doi:10.3390/app12020622.
  • [28] A. Specker, M. Cormier, and J. Beyerer. UPAR: Unified Pedestrian Attribute Recognition and person retrieval. In Proc. 2023 IEEE/CVF Winter Conf. Applications of Computer Vision (WACV), pages 981-990, Los Alamitos, CA, USA, 3-7 Jan 2023. doi:10.1109/WACV56688.2023.00104.
  • [29] Z. Tan, Y. Yang, J. Wan, G. Guo, and S. Z. Li. Relation-aware pedestrian attribute recognition with graph convolutional networks. In Proc. AAAI Conf. Artificial Intelligence, volume 34 of AAAI-20 Technical Tracks 7, pages 12055-12062, New York, NY, USA, 7-12 Feb 2020. AAAI Press. doi:10.1609/aaai.v34i07.6883.
  • [30] C. Y. Wang, A. Bochkovskiy, and H. Y. M. Liao. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), pages 7464-7475, Vancouver, Canada, 18-22 Jun 2023. Accessible also as arXiv:2207.02696. https://openaccess.thecvf.com/content/CVPR2023/html/Wang_YOLOv7_Trainable_Bag-of-Freebies_Sets_New_State-of-the-Art_for_Real-Time_Object_Detectors_CVPR_2023_paper.html.
  • [31] X. Wang, S. Zheng, R. Yang, A. Zheng, Z. Chen, J. Tang, and B. Luo. Pedestrian attribute recognition: A survey. Pattern Recognition, 121:108220, 2022. doi:10.1016/j.patcog.2021.108220.
  • [32] L. Wei, S. Zhang, W. Gao, and Q. Tian. Person transfer GAN to bridge domain gap for person re-identification. In Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pages 79-88, Salt Lake City, UT, USA, 18-23 Jun 2018. doi:10.1109/CVPR.2018.00016.
  • [33] S. Woo, S. Debnath, R. Hu, X. Chen, Z. Liu, I. S. Kweon, and S. Xie. ConvNeXt V2: Co-designing and scaling ConvNets with masked autoencoders. In Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, 18-22 Jun 2023. Accessible also as arXiv:2301.00808. https://openaccess.thecvf.com/content/CVPR2023/html/Woo_ConvNeXt_V2_Co-Designing_and_Scaling_ConvNets_With_Masked_Autoencoders_CVPR_2023_paper.html.
  • [34] L. Yang, L. Zhu, Y. Wei, S. Liang, and P. Tan. Attribute recognition from adaptive parts. arXiv, 2016. arXiv:1607.01437. doi:10.48550/arXiv.1607.01437.
  • [35] N. Zhang, M. Paluri, M’A. Ranzato, T. Darrell, and L. Bourdev. PANDA: Pose Aligned Networks for Deep Attribute modeling. In Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pages 1637-1644, Columbus, OH, USA, 23-28 Jun 2014. doi:10.1109/CVPR.2014.212.
  • [36] S. Zhang, Z. Li, S. Yan, X. He, and J. Sun. Distribution alignment: A unified framework for long-tail visual recognition. In Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), pages 2361-2370, Nashville, TN, USA, 20-25 Jun 2021. doi:10.1109/CVPR46437.2021.00239.
  • [37] Z. Zhong, L. Zheng, G. Kang, S. Li, and Y. Yang. Random erasing data augmentation. In Proc. AAAI Conf. Artificial Intelligence, volume 34 of AAAI-20 Technical Tracks 7, pages 13001-13008, New York, NY, USA, 7-12 Feb 2020. AAAI Press. doi:10.1609/aaai.v34i07.7000.
  • [38] J. Zhu, S. Liao, D. Yi, Z. Lei, and S. Z. Li. Multi-label CNN based pedestrian attribute learning for soft biometrics. In Proc. 2015 Int. Conf. Biometrics (ICB), pages 535-540, Phuket, Thailand, 19-22 May 2015. IEEE. doi:10.1109/ICB.2015.7139070.
Notes
Record prepared with funds from the Polish Ministry of Science and Higher Education (MNiSW), agreement no. SONP/SP/546092/2022, under the programme "Social Responsibility of Science" - module: Popularisation of science and promotion of sport (2024).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-856ad6be-3a98-44d2-bb51-655095dd9dd3