Article title

Detecting objects using Rolling Convolution and Recurrent Neural Network

Publication languages
EN
Abstracts
EN
Most existing object detection algorithms search for targets in an image by means of region proposals. The most effective region proposal methods usually require thousands of candidate regions to achieve a high recall rate, which lowers detection efficiency. Although recent region proposal network approaches have yielded good results with only hundreds of proposals, they still struggle with small objects and precise localization, mainly because they rely on coarse features. We therefore propose a new method that extracts more effective global and multi-scale features to improve detection performance. Because feature maps produced by successive convolutions lose the resolution required to detect small objects as they acquire deeper semantic information, we use rolling convolution (RC) to maintain the high resolution of low-level feature maps and explore objects in greater detail, even without a structure dedicated to combining the features of multiple convolutional layers. Furthermore, we place a recurrent neural network composed of multiple gated recurrent units (GRUs) on top of the convolutional layers to highlight useful global context locations that assist object detection. On benchmark datasets, the proposed method achieves 78.2% mAP on PASCAL VOC 2007 and 72.3% mAP on PASCAL VOC 2012. Extensive experiments verify that the method reaches an advanced level of detection performance.
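The abstract's point about resolution loss can be illustrated with simple arithmetic. The sketch below is not the authors' architecture. It assumes a hypothetical backbone with five stride-2 downsampling stages and a hypothetical 512-pixel input, and shows how few feature-map cells remain to describe a 32-pixel object in the deepest layer.

```python
def feature_map_sizes(input_size, num_stages, stride=2):
    """Return the spatial size after each downsampling stage (floor division)."""
    sizes = []
    size = input_size
    for _ in range(num_stages):
        size = size // stride
        sizes.append(size)
    return sizes


def object_extent(object_pixels, num_stages, stride=2):
    """Approximate width, in feature-map cells, of an object after downsampling."""
    return object_pixels / (stride ** num_stages)


# Hypothetical values chosen for illustration only.
sizes = feature_map_sizes(input_size=512, num_stages=5)
extent = object_extent(object_pixels=32, num_stages=5)

print("feature map sizes per stage:", sizes)
print("cells covering a 32-pixel object at the last stage:", extent)
```

Under these assumptions the object spans a single cell at the deepest stage but eight cells after only two stages, which is why keeping high-resolution low-level feature maps helps with small objects.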
Year
Pages
293-301
Physical description
Bibliography: 30 items, figures, tables, photographs.
Authors
  • School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou, China
  • School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou, China
author
  • School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou
Notes
1. Record prepared under agreement 509/P-DUN/2018 with funds from the Ministry of Science and Higher Education (MNiSW) allocated to science dissemination activities (2019).
2. This work was supported by the Natural Science Foundation of Zhejiang Province (LZ15F020004), the National Natural Science Foundation of China (61272311), and the 521 Project of Zhejiang Sci-Tech University.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-a71f6b6a-e140-4c0c-8ba2-18a54c673f9b