Identifiers
Title variants
Publication languages
Abstracts
The development of surveillance-video vehicle detection technology in modern intelligent transportation systems is closely tied to the operation and safety of highways and urban road networks. However, current object detection networks are structurally complex and require large numbers of parameters and computations, so this paper proposes a lightweight network based on YOLOv5. It can be deployed on video surveillance equipment with limited computing power while still providing real-time, accurate vehicle detection. A modified MobileNetV2 is used as the backbone feature extraction network of YOLOv5, and depthwise separable convolution (DSC) replaces the standard convolution in the bottleneck layer structure. The lightweight YOLOv5 is evaluated on the UA-DETRAC and BDD100K datasets. Experimental results show that the method reduces the number of parameters by 95% compared with the original YOLOv5s and achieves a good trade-off between precision and speed.
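To make the core idea concrete, the sketch below shows a depthwise separable convolution (DSC) block of the kind the abstract describes: a depthwise 3×3 convolution followed by a pointwise 1×1 convolution in place of a standard 3×3 convolution. This is a minimal PyTorch illustration only; the class name, channel sizes, and the BatchNorm + SiLU choice (as in stock YOLOv5) are assumptions, not details taken from the paper.

```python
# Minimal sketch of a depthwise separable convolution (DSC) block.
# Assumptions: PyTorch, BatchNorm + SiLU activation, illustrative channel sizes.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise 3x3 convolution followed by a pointwise 1x1 convolution."""
    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups = in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        # Pointwise: 1x1 convolution mixes channels and sets the output width.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.SiLU()  # SiLU as in YOLOv5; an assumption here.

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# Usage: a DSC block with 64 input and 128 output channels.
# Its weight count is roughly 3*3*64 + 64*128, versus 3*3*64*128 for a
# standard 3x3 convolution, which is where the parameter savings come from.
x = torch.randn(1, 64, 80, 80)
y = DepthwiseSeparableConv(64, 128)(x)
print(y.shape)  # torch.Size([1, 128, 80, 80])
```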
Year
Volume
Pages
art. no. e143644
Physical description
Bibliography: 35 items, figures, tables
Authors
author
- Shanghai University of Engineering Science, School of Mechanical and Automotive Engineering, Shanghai, China
author
- Shanghai University of Engineering Science, School of Mechanical and Automotive Engineering, Shanghai, China
author
- Shanghai University of Engineering Science, School of Mechanical and Automotive Engineering, Shanghai, China
Bibliography
- [1] L. Qiu et al., “Deep learning-based algorithm for vehicle detection in intelligent transportation systems,” J. Supercomput., vol. 77, no. 10, pp. 11083–11098, 2021, doi: 10.1007/s11227-021-03712-9.
- [2] J. Zhao et al., “Improved vision-based vehicle detection and classification by optimized YOLOv4,” IEEE Access, vol. 10, pp. 8590–8603, 2022, doi: 10.1109/ACCESS.2022.3143365.
- [3] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in Proc. 3rd Int. Conf. Learn. Represent. (ICLR), 2015, doi: 10.48550/ARXIV.1409.1556.
- [4] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “MobileNetV2: Inverted residuals and linear bottlenecks,” 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 4510–4520, doi: 10.1109/CVPR.2018.00474.
- [5] P. Viola and M. Jones, “Rapid object detection using a boosted cascade of simple features,” Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001, 2001, pp. I–I, doi: 10.1109/CVPR.2001.990517.
- [6] N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), 2005, vol. 1, pp. 886–893, doi: 10.1109/CVPR.2005.177.
- [7] P.F. Felzenszwalb, R.B. Girshick, D. McAllester, and D. Ramanan, “Object detection with discriminatively trained part-based models,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 9, pp. 1627–1645, Sept. 2010, doi: 10.1109/TPAMI.2009.167.
- [8] H. Fuhrmann, A. Boyko, M.H. Abdelpakey, and M.S. Shehata, “DETECTren: Vehicle object detection using self-supervised learning based on light-weight network for low-power devices,” 2021 IEEE 7th World Forum on Internet of Things (WF-IoT), 2021, pp. 807–811, doi: 10.1109/WF-IoT51360.2021.9594927.
- [9] L. Jiao et al., “A survey of deep learning-based object detection,” in IEEE Access, vol. 7, pp. 128837–128868, 2019, doi: 10.1109/ACCESS.2019.2939201.
- [10] R. Girshick, J. Donahue, T. Darrell and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 580–587, doi: 10.1109/CVPR.2014.81.
- [11] K. He, X. Zhang, S. Ren, and J. Sun, “Spatial pyramid pooling in deep convolutional networks for visual recognition,” in IEEE Trans. Pattern Anal. Mach. Intell., vol. 37, no. 9, pp. 1904–1916, 2015, doi: 10.1109/TPAMI.2015.2389824.
- [12] R. Girshick, “Fast R-CNN,” in Proc. IEEE International Conference on Computer Vision (ICCV), 2015, pp. 1440–1448, doi: 10.48550/arXiv.1504.08083.
- [13] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, “You only look once: Unified, real-time object detection,” 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 779–788, doi: 10.1109/CVPR.2016.91.
- [14] J. Redmon and A. Farhadi, “YOLO9000: Better, faster, stronger,” 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 6517–6525, doi: 10.1109/CVPR.2017.690.
- [15] J. Redmon and A. Farhadi, “YOLOv3: An incremental improvement,” arXiv, 2018, doi: 10.48550/ARXIV.1804.02767.
- [16] A. Bochkovskiy, Ch.-Y. Wang, and H.-Y.M. Liao, “YOLOv4: Optimal speed and accuracy of object detection,” arXiv, 2020, doi: 10.48550/ARXIV.2004.10934.
- [17] Ch.-Y. Wang, I-H. Yeh, and H.-Y.M. Liao, “You only learn one representation: Unified network for multiple tasks,” arXiv, 2021, doi: 10.48550/ARXIV.2105.04206.
- [18] Z. Ge, S. Liu, F. Wang, Z. Li, J. Sun, “YOLOX: Exceeding yolo series in 2021,” arXiv, 2021, doi: 10.48550/ARXIV.2107.08430.
- [19] Ch.-Y. Wang, A. Bochkovskiy and H.-Y.M. Liao, “Scaled-YOLOv4: Scaling cross stage partial network,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2021, pp. 13029–13038, doi: 10.48550/arXiv.2011.08036.
- [20] N. Carion et al., “End-to-end object detection with transformers,” in Proc. 17th European Conference on Computer Vision (ECCV), 2020, pp. 213–229, doi: 10.1007/978-3-030-58452-8_13.
- [21] X. Zhu et al., “Deformable DETR: Deformable transformers for end-to-end object detection,” arXiv, 2020, doi: 10.48550/ARXIV.2010.04159.
- [22] M. Zheng et al., “End-to-end object detection with adaptive clustering transformer,” arXiv, 2020, doi: 10.48550/ARXIV.2011.09315.
- [23] F.N. Iandola et al., “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size,” arXiv, 2016, doi: 10.48550/ARXIV.1602.07360.
- [24] A. Gholami et al., “SqueezeNext: Hardware-aware neural network design,” 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018, pp. 1719–171909, doi: 10.1109/CVPRW.2018.00215.
- [25] X. Zhang, X. Zhou, M. Lin, and J. Sun, “ShuffleNet: An extremely efficient convolutional neural network for mobile devices,” 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 6848–6856, doi: 10.1109/CVPR.2018.00716.
- [26] N. Ma, X. Zhang, H.T. Zheng, and J. Sun, “Shufflenet v2: Practical guidelines for efficient CNN architecture design,” in Proc. 16th European Conference on Computer Vision (ECCV), 2018, pp. 122–138, doi: 10.1007/978-3-030-01264-9_8.
- [27] A.G. Howard et al., “Mobilenets: Efficient convolutional neural networks for mobile vision applications,” arXiv, 2017, doi: 10.48550/arXiv.1704.04861.
- [28] M. Sandler et al., “MobileNetV2: Inverted residuals and linear bottlenecks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2018, pp. 4510–4520, doi: 10.1109/CVPR.2018.00474.
- [29] A. Howard et al., “Searching for MobileNetV3,” in Proc. 17th IEEE Int. Conf. Comput. Vis. (ICCV), 2019, pp. 1314–1324, doi: 10.48550/arXiv.1905.02244.
- [30] K. Han, Y. Wang, Q. Tian, J. Guo, C. Xu, and C. Xu, “GhostNet: More features from cheap operations,” 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 1577–1586, doi: 10.1109/CVPR42600.2020.00165.
- [31] M. Tan and Q.V. Le, “EfficientNet: Rethinking model scaling for convolutional neural networks,” in Proc. 36th Int. Conf. Mach. Learn. (ICML), 2019, pp. 6105–6114, doi: 10.48550/arXiv.1905.11946.
- [32] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778, doi: 10.1109/CVPR.2016.90.
- [33] F. Chollet, “Xception: deep learning with depthwise separable convolutions,” 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1800–1807, doi: 10.1109/CVPR.2017.195.
- [34] L. Wen et al., “UA-DETRAC: A new benchmark and protocol for multi-object detection and tracking,” Comput. Vis. Image Underst., vol. 193, p. 102907, 2020, doi: 10.1016/j.cviu.2020.102907.
- [35] F. Yu et al., “BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning,” 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 2633–2642, doi: 10.1109/CVPR42600.2020.00271.
Notes
Record developed with funds from the Ministry of Education and Science (MEiN), agreement no. SONP/SP/546092/2022, under the "Social Responsibility of Science" programme, module: Popularisation of Science and Promotion of Sport (2022-2023).
Document type
YADDA identifier
bwmeta1.element.baztech-96494063-5841-4dd3-b763-57ea6fd3fe53