Article title

Optimizing pedestrian tracking for robust perception with YOLOv8 and deep SORT

Full text / Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Multi-object tracking is a crucial aspect of perception in computer vision, widely used in autonomous driving, behavior recognition, and other areas. The complex and dynamic nature of real-world environments, the ever-changing visual appearance of pedestrians, and frequent occlusions all limit the efficacy of existing pedestrian tracking algorithms, resulting in suboptimal tracking precision and stability. As a solution, this article proposes an integrated detector-tracker framework for pedestrian tracking. The framework builds its pedestrian detector on the YOLOv8 network, a recent state-of-the-art detector, which provides a strong detection base for addressing these limitations. By combining YOLOv8 with the Deep SORT tracking algorithm, we improve the ability to track pedestrians in dynamic scenarios. Experiments on the publicly available MOT17 and MOT20 datasets demonstrate a clear improvement in accuracy and consistency, with MOTA scores of 63.82 and 58.95 and HOTA scores of 43.15 and 41.36, respectively. Our research highlights the importance of optimizing object detection to unlock the potential of tracking in critical applications such as autonomous driving.
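The framework described in the abstract follows the standard tracking-by-detection pattern: YOLOv8 supplies per-frame pedestrian detections, and Deep SORT associates them across frames using motion (a Kalman filter) and appearance cues. The following is a minimal sketch of that loop, assuming the open-source ultralytics and deep-sort-realtime Python packages; it illustrates the general detect-then-track pattern only, not the authors' exact configuration, and the video path and tracker parameters are placeholders.

import cv2
from ultralytics import YOLO
from deep_sort_realtime.deepsort_tracker import DeepSort

detector = YOLO("yolov8n.pt")      # any YOLOv8 checkpoint; nano chosen here for speed
tracker = DeepSort(max_age=30)     # keep lost tracks alive for 30 frames (assumed value)

cap = cv2.VideoCapture("pedestrians.mp4")  # hypothetical input video
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # Detect only the COCO "person" class (index 0).
    result = detector(frame, classes=[0], verbose=False)[0]
    detections = []
    for box, conf in zip(result.boxes.xyxy.tolist(), result.boxes.conf.tolist()):
        x1, y1, x2, y2 = box
        # Deep SORT expects ([left, top, width, height], confidence, class).
        detections.append(([x1, y1, x2 - x1, y2 - y1], conf, "person"))
    # Associate detections with existing tracks via Kalman motion
    # prediction plus an appearance embedding.
    tracks = tracker.update_tracks(detections, frame=frame)
    for t in tracks:
        if not t.is_confirmed():
            continue  # skip tentative tracks that lack enough evidence
        x1, y1, x2, y2 = map(int, t.to_ltrb())
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(frame, f"ID {t.track_id}", (x1, y1 - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
cap.release()

For reference, the MOTA scores reported above follow the CLEAR MOT definition, MOTA = 1 - (FN + FP + IDSW) / GT, while HOTA [20] jointly balances detection and association accuracy.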
Year
Pages
72-84
Physical description
Bibliography: 39 items, figures, tables
Authors
  • University of Mostefa Ben Boulaid, Department of Pharmacy, Algeria
  • University of Kasdi Merbah, Department of Electrical Engineering, Algeria
  • University of Kasdi Merbah, Department of Electrical Engineering, Algeria
  • University of Kasdi Merbah, Department of Electrical Engineering, Algeria
Bibliography
  • [1] Abbas, S. M., & Singh, S. (2018). Region-based object detection and classification using faster R-CNN. 4th International Conference on Computational Intelligence & Communication Technology (CICT) (pp. 1-6). IEEE. https://doi.org/10.1109/ciact.2018.8480413
  • [2] Behrendt, K., Novak, L., & Botros, R. (2017). A deep learning approach to traffic lights: Detection, tracking, and classification. IEEE International Conference on Robotics and Automation (ICRA) (pp. 1370-1377). IEEE. https://doi.org/10.1109/ICRA.2017.7989163
  • [3] Bergmann, P., Meinhardt, T., & Leal-Taixé, L. (2019). Tracking without bells and whistles. ArXiv, abs/1903.05625. https://doi.org/10.48550/arXiv.1903.05625
  • [4] Bewley, A., Ge, Z., Ott, L., Ramos, F., & Upcroft, B. (2016). Simple online and realtime tracking. IEEE International Conference on Image Processing (ICIP) (pp. 3464-3468). IEEE. https://doi.org/10.1109/ICIP.2016.7533003
  • [5] Bochinski, E., Eiselein, V., & Sikora, T. (2017). High-speed tracking-by-detection without using image information. 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS) (pp. 1-6). IEEE. https://doi.org/10.1109/AVSS.2017.8078516
  • [6] Chen, L., Ai, H., Zhuang, Z., & Shang, C. (2018). Real-time multiple people tracking with deeply learned candidate selection and person re-identification. IEEE International Conference on Multimedia and Expo (ICME) (pp. 1-6). IEEE. https://doi.org/10.1109/ICME.2018.8486597
  • [7] Cheng, Y. (1995). Mean shift, mode seeking, and clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(8), 790-799. https://doi.org/10.1109/34.400568
  • [8] Ciaparrone, G., Sánchez, F. L., Tabik, S., Troiano, L., Tagliaferri, R., & Herrera, F. (2020). Deep learning in video multi-object tracking: A survey. Neurocomputing, 381, 61–88. https://doi.org/10.1016/j.neucom.2019.11.023
  • [9] De Rosa, G. H., & Papa, J. P. (2022). Learning to weight similarity measures with Siamese networks: A case study on optimum-path forest. In Optimum-Path Forest (pp. 155–173). Elsevier. https://doi.org/10.1016/B978-0-12-822688-9.00015-3
  • [10] Ess, A., Schindler, K., Leibe, B., & Van Gool, L. (2010). Object detection and tracking for autonomous navigation in dynamic environments. The International Journal of Robotics Research, 29(14), 1707-1725. https://doi.org/10.1177/0278364910365417
  • [11] Feng, W., Bai, L., Yao, Y., Gan, W., Wu, W., & Ouyang, W. (2023). Similarity- and quality-guided relation learning for joint detection and tracking. IEEE Transactions on Multimedia, 26, 1267-1280. https://doi.org/10.1109/tmm.2023.3279670
  • [12] Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. ArXiv, abs/1311.2524. https://doi.org/10.48550/arXiv.1311.2524
  • [13] Jocher, G., Chaurasia, A., & Qiu, J. (2023). YOLO by Ultralytics. Retrieved February 2, 2024, from https://github.com/ultralytics/ultralytics
  • [14] Kalman, R. E. (1960). A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82(1), 35-45. https://doi.org/10.1115/1.3662552
  • [15] Kamal, R., Chemmanam, A. J., Jose, B., Mathews, S., & Varghese, E. (2020). Construction safety surveillance using machine learning. International Symposium on Networks, Computers and Communications (ISNCC) (pp. 1-6). IEEE. https://doi.org/10.1109/ISNCC49221.2020.9297198
  • [16] Kasturi, R., Goldgof, D., Soundararajan, P., Manohar, V., Garofolo, J., Bowers, R., Boonstra, M., Korzhova, V., & Zhang, J. (2009). Framework for performance evaluation of face, text, and vehicle detection and tracking in video: Data, metrics, and protocol. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(2), 319-336. https://doi.org/10.1109/TPAMI.2008.57
  • [17] Korepanova, A. A., Oliseenko, V. D., & Abramov, M. V. (2020). Applicability of similarity coefficients in social circle matching. XXIII International Conference on Soft Computing and Measurements (SCM) (pp. 41-43). IEEE. https://doi.org/10.1109/SCM50615.2020.9198782
  • [18] Kuhn, H. W. (1955). The Hungarian method for the assignment problem. Naval Research Logistics Quarterly, 2(1-2), 83-97. https://doi.org/10.1002/nav.3800020109
  • [19] Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C. (2016). SSD: Single shot MultiBox detector. In B. Leibe, J. Matas, N. Sebe, & M. Welling (Eds.), Computer Vision – ECCV 2016 (Vol. 9905, pp. 21–37). Springer International Publishing. https://doi.org/10.1007/978-3-319-46448-0_2
  • [20] Luiten, J., Osep, A., Dendorfer, P., Torr, P., Geiger, A., Leal-Taixé, L., & Leibe, B. (2021). HOTA: A higher order metric for evaluating multi-object tracking. International Journal of Computer Vision, 129, 548-578. https://doi.org/10.1007/s11263-020-01416-9
  • [21] Luo, W., Xing, J., Milan, A., Zhang, X., Liu, W., & Kim, T. K. (2021). Multiple object tracking: A literature review. Artificial Intelligence, 293, 103448. https://doi.org/10.1016/j.artint.2020.103448
  • [22] Mao, Q. C., Sun, H. M., Liu, Y. B., & Jia, R. S. (2019). Mini-YOLOv3: Real-time object detector for embedded applications. IEEE Access, 7, 133529–133538. https://doi.org/10.1109/ACCESS.2019.2941547
  • [23] Munjal, B., Aftab, A. R., Amin, S., Brandlmaier, M. D., Tombari, F., & Galasso, F. (2020). Joint detection and tracking in videos with identification features. Image and Vision Computing, 100, 103932. https://doi.org/10.1016/j.imavis.2020.103932
  • [24] Okuma, K., Taleghani, A., De Freitas, N., Little, J. J., & Lowe, D. G. (2004). A boosted particle filter: Multitarget detection and tracking. In T. Pajdla & J. Matas (Eds.), Computer Vision—ECCV 2004 (Vol. 3021, pp. 28–39). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-540-24670-1_3
  • [25] Pang, B., Li, Y., Zhang, Y., Li, M., & Lu, C. (2020). TubeTK: Adopting tubes to track multi-object in a one-step training model. ArXiv, abs/2006.05683. https://doi.org/10.48550/arXiv.2006.05683
  • [26] Peng, J., Wang, C., Wan, F., Wu, Y., Wang, Y., Tai, Y., Wang, C., Li, J., Huang, F., & Fu, Y. (2020). Chained-tracker: Chaining paired attentive regression results for end-to-end joint multiple-object detection and tracking. In A. Vedaldi, H. Bischof, T. Brox, & J.-M. Frahm (Eds.), Computer Vision – ECCV 2020 (Vol. 12349, pp. 145–161). Springer International Publishing. https://doi.org/10.1007/978-3-030-58548-8_9
  • [27] Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-time object detection. ArXiv, abs/1506.02640. https://doi.org/10.48550/arXiv.1506.02640
  • [28] Ristani, E., Solera, F., Zou, R., Cucchiara, R., & Tomasi, C. (2016). Performance measures and a data set for multi-target, multi-camera tracking. ArXiv, abs/1609.01775. https://doi.org/10.48550/arXiv.1609.01775
  • [29] Solawetz, J., & Francesco. (2023, January 11). What is YOLOv8? The ultimate guide. https://blog.roboflow.com/whats-new-in-yolov8/
  • [30] Sun, Z., Chen, J., Chao, L., Ruan, W., & Mukherjee, M. (2021). A survey of multiple pedestrian tracking based on tracking-by-detection framework. IEEE Transactions on Circuits and Systems for Video Technology, 31(5), 1819-1833. https://doi.org/10.1109/TCSVT.2020.3009717
  • [31] Terven, J., Córdova-Esparza, D. M., & Romero-González, J. A. (2023). A comprehensive review of YOLO architectures in computer vision: From YOLOv1 to YOLOv8 and YOLO-NAS. Machine Learning and Knowledge Extraction, 5(4), 1680-1716. https://doi.org/10.3390/make5040083
  • [32] Vijaymeena, M., & Kavitha, K. (2016). A survey on similarity measures in text mining. Machine Learning Applications: An International Journal, 3(1), 19-28.
  • [33] Wang, Y., Kitani, K., & Weng, X. (2021). Joint object detection and multi-object tracking with graph neural networks. IEEE International Conference on Robotics and Automation (ICRA) (pp. 13708-13715). IEEE. https://doi.org/10.1109/icra48506.2021.9561110
  • [34] Wang, Z., Zheng, L., Liu, Y., Li, Y., & Wang, S. (2020). Towards real-time multi-object tracking. In A. Vedaldi, H. Bischof, T. Brox, & J.-M. Frahm (Eds.), Computer Vision – ECCV 2020 (Vol. 12356, pp. 107–122). Springer International Publishing. https://doi.org/10.1007/978-3-030-58621-8_7
  • [35] Wojke, N., Bewley, A., & Paulus, D. (2017). Simple online and realtime tracking with a deep association metric. IEEE International Conference on Image Processing (ICIP) (pp. 3645-3649). IEEE. https://doi.org/10.1109/ICIP.2017.8296962
  • [36] Xu, Y., Ošep, A., Ban, Y., Horaud, R., Leal-Taixé, L., & Alameda-Pineda, X. (2019). How to train your deep multi-object tracker. ArXiv, abs/1906.06618. https://doi.org/10.48550/arXiv.1906.06618
  • [37] Yu, F., Li, W., Li, Q., Liu, Y., Shi, X., & Yan, J. (2016). POI: Multiple object tracking with high performance detection and appearance feature. ArXiv, abs/1610.06136. https://doi.org/10.48550/arXiv.1610.06136
  • [38] Zeng, F., Dong, B., Zhang, Y., Wang, T., Zhang, X., & Wei, Y. (2022). MOTR: End-to-end multiple object tracking with transformer. ArXiv, abs/2105.03247. https://doi.org/10.48550/arXiv.2105.03247
  • [39] Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., & Wang, X. (2022). ByteTrack: Multi-object tracking by associating every detection box. ArXiv, abs/2110.06864. https://doi.org/10.48550/arXiv.2110.06864
Document type
YADDA identifier
bwmeta1.element.baztech-d335fc3e-f166-4917-8dd0-b363247b5743