Geometric transformations embedded into convolutional neural networks

Tarasiuk, P.; Pryczek, M.

Selected full texts from this journal

https://eczasopisma.p.lodz.pl/JACS/issue/archive

Abstracts

This paper presents a novel extension to convolutional neural networks (CNNs). While CNNs are known for their invariance to object translation, other changes to the input, such as rotation and scaling, can make image recognition tasks difficult. Some improvement in this area can be achieved by embedding geometric transformations inside the CNN. To provide a practical solution that allows fast propagation and learning in the modified networks, “fast geometric transformations” are introduced.

Keywords

artificial intelligence; machine learning; deep learning; convolutional neural networks; image processing; image recognition; geometric transformation

Publisher

Wydawnictwo Politechniki Łódzkiej

Journal

Journal of Applied Computer Science

Year

2016

Volume

Vol. 24, No. 3

Pages

33-48

Physical description

Bibliography: 18 items.

Contributors

Author

Tarasiuk P.

pawel.tarasiuk@p.lodz.pl

Lodz University of Technology, Institute of Information Technology, ul. Wolczanska 215, 90-924 Lodz, Poland

Author

Pryczek M.

michal.pryczek@p.lodz.pl

Lodz University of Technology, Institute of Information Technology, ul. Wolczanska 215, 90-924 Lodz, Poland

Bibliography

[1] Krizhevsky, A., Sutskever, I., and Hinton, G. E., ImageNet Classification with Deep Convolutional Neural Networks, In: Advances in Neural Information Processing Systems 25, edited by F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, Curran Associates, Inc., 2012, pp. 1097-1105.
[2] Zeiler, M. D. and Fergus, R., Visualizing and Understanding Convolutional Networks, CoRR, Vol. abs/1311.2901, 2013.
[3] Nguyen, T. V., Lu, C., Sepulveda, J., and Yan, S., Adaptive Nonparametric Image Parsing, CoRR, Vol. abs/1505.01560, 2015.
[4] Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L., ImageNet: A Large-Scale Hierarchical Image Database, In: CVPR09, 2009.
[5] Fukushima, K., Neocognitron: A Self-Organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position, Biological Cybernetics, Vol. 36, 1980, pp. 193-202.
[6] Hubel, D. H. and Wiesel, T. N., Receptive Fields and Functional Architecture in Two Nonstriate Visual Areas (18 and 19) of the Cat, Journal of Neurophysiology, Vol. 28, 1965, pp. 229-289.
[7] LeCun, Y. and Bengio, Y., Convolutional Networks for Images, Speech, and Time-Series, In: The Handbook of Brain Theory and Neural Networks, edited by M. A. Arbib, MIT Press, 1995.
[8] Cheng, G., Zhou, P., and Han, J., Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images, IEEE Transactions on Geoscience and Remote Sensing, Vol. 54, No. 12, Dec 2016, pp. 7405-7415.
[9] Gonzalez, D. M., Volpi, M., and Tuia, D., Learning rotation invariant convolutional filters for texture classification, CoRR, Vol. abs/1604.06720, 2016.
[10] Laptev, D., Savinov, N., Buhmann, J. M., and Pollefeys, M., TI-POOLING: transformation-invariant pooling for feature learning in Convolutional Neural Networks, CoRR, Vol. abs/1604.06318, 2016.
[11] Vialatte, J., Gripon, V., and Mercier, G., Generalizing the Convolution Operator to Extend CNNs to Irregular Domains, CoRR, Vol. abs/1606.01166, 2016.
[12] Weiman, C. F. R. and Chaikin, G., Logarithmic Spiral Grids for Image Processing and Display, Computer Graphics and Image Processing, Vol. 11, No. 3, November 1979, pp. 197-226.
[13] Tomczyk, A., Szczepaniak, P. S., and Lis, B., Generalized Multi-layer Kohonen Network and Its Application to Texture Recognition, In: Proceedings of the Lecture Notes in Artificial Intelligence, No. 3070, 2004, pp. 760-767.
[14] Python Software Foundation, The Python Language Reference, 1990-2016.
[15] Ascher, D., Dubois, P. F., Hinsen, K., Hugunin, J., and Oliphant, T., Numerical Python, Lawrence Livermore National Laboratory, Livermore, CA, UCRL-MA-128569 ed., 1999.
[16] PythonWare, Python Imaging Library (PIL), 2009-2016.
[17] Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R., Dropout: A Simple Way to Prevent Neural Networks from Overfitting, J. Mach. Learn. Res., Vol. 15, No. 1, Jan. 2014, pp. 1929-1958.
[18] Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., and Ng, A. Y., Reading Digits in Natural Images with Unsupervised Feature Learning, In: NIPS Workshop on Deep Learning and Unsupervised Feature Learning 2011, 2011.

Notes

Prepared with funds from the Polish Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for science dissemination activities (tasks for 2017).

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-a36913c0-f5df-4ed5-9441-7cfb5f4343ad