Article title

Fast multispectral deep fusion networks

Content
Identifiers
Title variants
Languages of publication
EN
Abstracts
EN
Most current state-of-the-art computer vision algorithms take as input images captured by cameras operating in the visible spectral range. Thus, image recognition systems built on top of those algorithms cannot provide acceptable recognition quality in poor lighting conditions, e.g. during nighttime. Another significant limitation of such systems is their high demand for computational resources, which makes them impossible to use on low-powered embedded systems without GPU support. This work attempts to create a pattern recognition algorithm that consolidates data from the visible and infrared spectral ranges and allows near real-time performance on embedded systems equipped with infrared and visible sensors. First, we analyze existing methods of combining data from different spectral ranges for the object detection task. Based on this analysis, an architecture of a deep convolutional neural network is proposed for the fusion of multispectral data. The architecture is based on the single-shot multibox detection (SSD) algorithm. A comparative analysis of the proposed architecture with previously published solutions for the multispectral object detection task shows comparable or better detection accuracy and a significant improvement in running time on embedded systems. This study was conducted in collaboration with Philips Lighting Research Lab, and solutions based on the proposed architecture will be used in image recognition systems for the next generation of intelligent lighting systems. Thus, the main scientific outcomes of this work include an algorithm for multispectral pattern recognition based on convolutional neural networks, as well as a modification of detection algorithms for running on embedded systems.
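The kind of fusion described in the abstract can be illustrated with a minimal sketch: two lightweight convolutional streams, one for the visible image and one for the infrared image, are merged at the feature-map level and fed into SSD-style classification and box-regression heads. The backbone depth, channel sizes, layer names and the use of PyTorch below are illustrative assumptions only, not the architecture from the paper.

# Hypothetical sketch of mid-level multispectral fusion feeding SSD-style heads.
# Channel sizes, layer depths and names are assumptions for illustration.
import torch
import torch.nn as nn

def small_backbone(in_ch):
    # A few stride-2 conv blocks standing in for a lightweight feature extractor.
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
    )

class FusionSSDHead(nn.Module):
    def __init__(self, num_classes=2, num_anchors=4):
        super().__init__()
        self.rgb_stream = small_backbone(3)   # visible-range input
        self.ir_stream = small_backbone(1)    # infrared-range input
        # 1x1 conv merges the concatenated streams into one feature map.
        self.fuse = nn.Conv2d(128 + 128, 128, kernel_size=1)
        # SSD-style heads: per-anchor class scores and box offsets.
        self.cls_head = nn.Conv2d(128, num_anchors * num_classes, 3, padding=1)
        self.box_head = nn.Conv2d(128, num_anchors * 4, 3, padding=1)

    def forward(self, rgb, ir):
        f = torch.cat([self.rgb_stream(rgb), self.ir_stream(ir)], dim=1)
        f = torch.relu(self.fuse(f))
        return self.cls_head(f), self.box_head(f)

# Example: a 300x300 RGB frame paired with an aligned single-channel IR frame.
model = FusionSSDHead()
cls_logits, box_deltas = model(torch.randn(1, 3, 300, 300), torch.randn(1, 1, 300, 300))
print(cls_logits.shape, box_deltas.shape)

With aligned 300x300 visible and infrared frames, both heads operate on a shared 38x38 fused feature map, returning per-location class scores and box offsets.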
Year
Pages
875‒889
Physical description
Bibliography: 39 items, figures, charts, tables
Authors
author
  • Skolkovo Institute of Science and Technology, 3 Nobel Street, 143026 Moscow, Russia
  • Philips Lighting, High Tech Campus 48, 5656 AE Eindhoven, Netherlands
author
  • Skolkovo Institute of Science and Technology, 3 Nobel Street, 143026 Moscow, Russia
author
  • Skolkovo Institute of Science and Technology, 3 Nobel Street, 143026 Moscow, Russia
Bibliography
  • [1] R. Gade and T.B. Moeslund, “Thermal cameras and applications: a survey”, Machine vision and applications 25 (1), 245‒262 (2014).
  • [2] J.R.R. Uijlings et al., “Selective search for object recognition”, Int. J. of Computer Vision 104 (2), 154‒171 (2013).
  • [3] R. Girshick et al., “Rich feature hierarchies for accurate object detection and semantic segmentation”, CVPR (2014).
  • [4] R. Girshick, “Fast R-CNN”, ICCV (2015).
  • [5] J. Redmon et al., “You only look once: Unified, real-time object detection”, CVPR (2016).
  • [6] S. Ren et al., “Faster R-CNN: Towards real-time object detection with region proposal networks”, NIPS (2015).
  • [7] W. Liu et al., “SSD: Single shot multibox detector”, ECCV, Springer, 21‒37 (2016).
  • [8] J. Huang et al., “Speed/accuracy trade-offs for modern convolutional object detectors”, arXiv:1611.10012 (2016).
  • [9] S. Hwang et al., “Multispectral pedestrian detection: Benchmark dataset and baseline”, CVPR (2015).
  • [10] P. Dollár et al., “Fast feature pyramids for object detection”, IEEE Trans. on Pattern Analysis and Machine Intelligence 36 (8), 1532‒1545 (2014).
  • [11] J. Wagner et al., “Multispectral pedestrian detection using deep fusion convolutional neural networks”, ESANN (2016).
  • [12] A. Krizhevsky, I. Sutskever, and G.E. Hinton, “Imagenet classification with deep convolutional neural networks”, NIPS (2012).
  • [13] J. Liu et al., “Multispectral deep neural networks for pedestrian detection”, arXiv:1611.02644 (2016).
  • [14] O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation”, Int. Conf. on Medical Image Computing and Computer-Assisted Intervention, Springer (2015).
  • [15] A.G. Howard et al., “Mobilenets: Efficient convolutional neural networks for mobile vision applications”, arXiv:1704.04861 (2017).
  • [16] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition”, arXiv:1409.1556 (2014).
  • [17] F.N. Iandola et al., “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and < 0.5 MB model size”, arXiv:1602.07360 (2016).
  • [18] K. He et al., “Deep residual learning for image recognition”, CVPR (2016).
  • [19] D.P. Kingma and J. Ba, “Adam: A method for stochastic optimization”, arXiv:1412.6980 (2014).
  • [20] X. Glorot and Y. Bengio, “Understanding the difficulty of training deep feedforward neural networks”, AISTATS, 9 (2010).
  • [21] W. Liu, A. Rabinovich, and A.C. Berg, “Parsenet: Looking wider to see better”, arXiv:1506.04579 (2015).
  • [22] O. Russakovsky et al., “Imagenet large scale visual recognition challenge”, Int. J. of Computer Vision 115 (3), 211‒252 (2015).
  • [23] A. Eitel et al., “Multimodal deep learning for robust RGB-D object recognition”, 2015 IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), IEEE (2015).
  • [24] M. Everingham et al., “The pascal visual object classes challenge: A retrospective”, Int. J. of Computer Vision 111 (1), 98‒136 (2015).
  • [25] Kaggle: DSTL Satellite Imagery Feature Detection Dataset. https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection/data
  • [26] S. Grihon, E. Burnaev, M. Belyaev, and P. Prikhodko, “Surrogate Modeling of Stability Constraints for Optimization of Composite Structures”, Surrogate-Based Modeling and Optimization. Engineering applications, Eds. by S. Koziel, L. Leifsson. Springer, 359‒391 (2013).
  • [27] M. Belyaev, E. Burnaev, E. Kapushev, M. Panov, P. Prikhodko, D. Vetrov, and D. Yarotsky, “GTApprox: Surrogate modeling for industrial design”, Advances in Engineering Software 102, 29‒39 (2016).
  • [28] E. Burnaev and A. Zaytsev, “Minimax approach to variable fidelity data interpolation”, PMLR 54: Artificial Intelligence and Statistics, 652‒661 (2017).
  • [29] E. Burnaev and A. Zaytsev, “Large Scale Variable Fidelity Surrogate Modeling”, Ann Math Artif Intell, 1‒20 (2017).
  • [30] E. Burnaev and A. Zaytsev, “Surrogate modeling of multifidelity data for large samples”, J. of Communications Technology and Electronics 60 (12), 1348‒1355 (2015).
  • [31] DMLC: MXNet for Deep Learning. https://github.com/dmlc/mxnet
  • [32] Y. Chen et al., “Sharing Residual Units Through Collective Tensor Factorization in Deep Neural Networks”, arXiv:1703.02180 (2017).
  • [33] J. Hosang, R. Benenson, and B. Schiele, “A Convnet for Non-maximum Suppression”, German Conf. on Pattern Recognition, Springer (2016).
  • [34] A. Notchenko, E. Kapushev, and E. Burnaev, “Large Scale Shape Retrieval with Sparse 3D Convolutional Neural Networks”, Proc. of 6th Int. Conf. on Analysis of Images, Social Networks and Texts (AIST-2017), LNCS 10716, 236‒245 (2018).
  • [35] E. Burnaev and P. Prikhod’ko, “On a method for constructing ensembles of regression models”, Automation and Remote Control, 74 (10), 1630‒1644 (2013).
  • [36] E. Burnaev and P. Erofeev, “The Influence of Parameter Initialization on the Training Time and Accuracy of a Nonlinear Regression Model”, J. of Communications Technology and Electronics, 61 (6), 646‒660 (2016).
  • [37] E. Burnaev and D. Smolyakov, “One-Class SVM with Privileged Information and Its Application to Malware Detection”, ICDMW, 273‒280 (2016).
  • [38] WorldView-3 Satellite Sensor Specifications. http://www.satimagingcorp.com/satellite-sensors/worldview-3/
  • [39] S.W. Smith, The scientist and engineer’s guide to digital signal processing, California Technical Pub (1997).
Notes
PL
Record compiled under agreement 509/P-DUN/2018 from the funds of the Ministry of Science and Higher Education (MNiSW) allocated to science dissemination activities (2019).
Document type
YADDA identifier
bwmeta1.element.baztech-f0c426ae-ffe8-4029-9bd4-f3c603d9b6f3