

Article title

Selected technical issues of deep neural networks for image classification purposes

Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
In recent years, deep learning, and especially deep neural networks (DNN), has achieved remarkable performance on a variety of problems, particularly in classification and pattern recognition. Among the many kinds of DNNs, convolutional neural networks (CNN) are the most commonly used. However, due to their complexity, many problems arise, related but not limited to optimizing network parameters, avoiding overfitting, and ensuring good generalization abilities. Researchers have therefore proposed a number of methods to deal with these problems. In this paper, we present the results of applying different, recently developed methods to improve the training and operation of deep neural networks. We focus on the most popular CNN structures, namely VGG-based neural networks: VGG16, VGG11, and our proposed VGG8. The tests were conducted on a real and very important problem: skin cancer detection. A publicly available dataset of skin lesions was used as a benchmark. We analyzed the influence of applying dropout, batch normalization, model ensembling, and transfer learning. Moreover, the influence of the type of activation function was checked. To increase the objectivity of the results, each tested model was trained six times and the results were averaged. In addition, to mitigate the impact of the selection of training, test, and validation sets, k-fold validation was applied.
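The techniques listed in the abstract are standard building blocks in modern deep learning frameworks. Below is a minimal sketch, in Keras (the framework cited in [58]), of how transfer learning from an ImageNet-pretrained VGG16, batch normalization, dropout, a leaky-ReLU activation, and k-fold validation could be combined for a binary skin-lesion classifier. This is an illustrative assumption, not the authors' actual code: all layer sizes, hyperparameters, and the prepared arrays X and y are hypothetical.

# A minimal sketch (not the authors' code): transfer learning from an
# ImageNet-pretrained VGG16 with batch normalization, dropout and a
# LeakyReLU activation in the classification head, evaluated with
# k-fold validation. Layer sizes and hyperparameters are assumptions.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from tensorflow.keras import Model, layers
from tensorflow.keras.applications import VGG16

def build_model(input_shape=(224, 224, 3)):
    # Transfer learning: reuse the VGG16 convolutional features learned
    # on ImageNet and train only a small classification head.
    base = VGG16(weights="imagenet", include_top=False, input_shape=input_shape)
    base.trainable = False
    x = layers.Flatten()(base.output)
    x = layers.Dense(256)(x)
    x = layers.BatchNormalization()(x)  # normalize pre-activations
    x = layers.LeakyReLU(alpha=0.1)(x)  # one of the tested activation types
    x = layers.Dropout(0.5)(x)          # regularization against overfitting
    out = layers.Dense(1, activation="sigmoid")(x)  # benign vs. malignant
    model = Model(base.input, out)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

def kfold_evaluate(X, y, k=5, epochs=10):
    # k-fold validation mitigates the impact of a particular
    # training/validation split; X and y are assumed to be prepared
    # NumPy arrays of images and binary labels.
    scores = []
    for train_idx, val_idx in StratifiedKFold(n_splits=k, shuffle=True).split(X, y):
        model = build_model()
        model.fit(X[train_idx], y[train_idx], epochs=epochs,
                  batch_size=32, verbose=0)
        _, acc = model.evaluate(X[val_idx], y[val_idx], verbose=0)
        scores.append(acc)
    return float(np.mean(scores))

The k models trained during cross-validation could also serve as a simple ensemble by averaging their predictions, which is one way the model ensembling mentioned in the abstract can be realized.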
Year
Pages
363–376
Physical description
Bibliography: 61 items; figures, charts, tables.
Authors
  • Gdańsk University of Technology, Faculty of Electrical and Control Engineering, 11/12 Narutowicza St., 80-223 Gdańsk, Poland
  • Gdańsk University of Technology, Faculty of Electrical and Control Engineering, 11/12 Narutowicza St., 80-223 Gdańsk, Poland
  • Gdańsk University of Technology, Faculty of Electrical and Control Engineering, 11/12 Narutowicza St., 80-223 Gdańsk, Poland
Bibliography
  • [1] P. Viola and M. Jones, “Rapid object detection using a boosted cascade of simple features,” in Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), 2001, vol. 1, pp. I-511–I-518.
  • [2] P. Mukhopadhyay and B. B. Chaudhuri, “A survey of Hough Transform,” Pattern Recognition, vol. 48, no. 3, pp. 993–1010, Mar. 2015.
  • [3] N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), 2005, vol. 1, pp. 886–893.
  • [4] D.G. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.
  • [5] C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
  • [6] M. Grochowski, A. Mikołajczyk, and A. Kwasigroch, “Diagnosis of malignant melanoma by neural network ensemble-based system utilising hand-crafted skin lesion features,” Metrology and Measurement Systems, vol. 26, no. 1, 2019.
  • [7] T. Markiewicz, M. Dziekiewicz, S. Osowski, M. Maruszynski, W. Kozlowski, et al., “Thresholding techniques for segmentation of atherosclerotic plaque and lumen areas in vascular arteries,” Bull. Pol. Ac.: Tech., vol. 63, no. 1, pp. 269–280, 2015.
  • [8] A. Czajka, W. Kasprzak, and A. Wilkowski, “Verification of iris image authenticity using fragile watermarking,” Bull. Pol. Ac.: Tech., vol. 64, no. 4, pp. 807–819, 2016.
  • [9] K. Fukushima, S. Miyake, and T. Ito, “Neocognitron: A Neural Network Model for a Mechanism of Visual Pattern Recognition,” IEEE Transactions on Systems, Man and Cybernetics, vol. SMC-13, no. 5, pp. 826–834, 1983.
  • [10] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
  • [11] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R.E. Howard, et al., “Backpropagation applied to handwritten zip code recognition,” Neural computation, vol. 1, no. 4, pp. 541–551, 1989.
  • [12] N.P. Jouppi, A. Borchers, R. Boyle, P.-L. Cantin, C. Chao, et al., “In-Datacenter Performance Analysis of a Tensor Processing Unit,” ACM SIGARCH Computer Architecture News, vol. 45, no. 2, pp. 1–12, 2017.
  • [13] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, et al., “TensorFlow: A System for Large-Scale Machine Learning,” in OSDI, 2016, vol. 16, pp. 265–283.
  • [14] R. Al-Rfou, G. Alain, A. Almahairi, C. Angermueller, D. Bahdanau, et al., “Theano: A Python framework for fast computation of mathematical expressions,” arXiv preprint arXiv:1605.02688, vol. 472, p. 473, 2016.
  • [15] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, et al., “Caffe: Convolutional architecture for fast feature embedding,” in Proceedings of the 22nd ACM international conference on Multimedia, 2014, pp. 675–678.
  • [16] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, et al., “Automatic differentiation in PyTorch,” 2017.
  • [17] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, et al., “Imagenet large scale visual recognition challenge,” International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252, 2015.
  • [18] A. Krizhevsky, I. Sutskever, and G.E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” Advances in Neural Information Processing Systems 25 (NIPS 2012), pp. 1–9, 2012.
  • [19] M. Johnson, M. Schuster, Q.V. Le, M. Krikun, Y. Wu, et al., “Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation,” 2016.
  • [20] I. Lopez-Moreno, J. Gonzalez-Dominguez, O. Plchot, D. Martinez, J. Gonzalez-Rodriguez, et al., “Automatic language identification using deep neural networks,” 2014, pp. 5337–5341.
  • [21] D. Amodei, S. Ananthanarayanan, R. Anubhai, J. Bai, E. Battenberg, et al., “Deep Speech 2: End-to-End Speech Recognition in English and Mandarin,” pp. 173–182, 2016.
  • [22] S.Ö. Arık, G. Diamos, A. Gibiansky, J. Miller, K. Peng, et al., “Deep Voice 2: Multi-Speaker Neural Text-to-Speech.”
  • [23] I.J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, et al., “Generative Adversarial Networks,” 2014.
  • [24] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, et al., “Playing Atari with Deep Reinforcement Learning.”
  • [25] K. Simonyan and A. Zisserman, “Very Deep Convolutional Networks for Large-Scale Image Recognition,” International Conference on Learning Representations (ICLR), pp. 1–14, 2015.
  • [26] S. Ioffe and C. Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift,” 2015.
  • [27] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: A Simple Way to Prevent Neural Networks from Overfitting,” Journal of Machine Learning Research, vol. 15, pp. 1929–1958, 2014.
  • [28] S.J. Pan and Q. Yang, “A Survey on Transfer Learning,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 10, pp. 1345–1359, 2010.
  • [29] A.L. Maas, A.Y. Hannun, and A.Y. Ng, “Rectifier Nonlinearities Improve Neural Network Acoustic Models,” Proceedings of the 30th International Conference on Machine Learning, vol. 28, p. 6, 2013.
  • [30] A. Kwasigroch, A. Mikołajczyk, and M. Grochowski, “Deep neural networks approach to skin lesions classification – a comparative analysis,” in 2017 22nd International Conference on Methods and Models in Automation and Robotics (MMAR), 2017, pp. 1069–1074.
  • [31] Y. LeCun, K. Kavukcuoglu, and C. Farabet, “Convolutional networks and applications in vision,” in Proceedings of 2010 IEEE International Symposium on Circuits and Systems, 2010, pp. 253–256.
  • [32] K. He, X. Zhang, S. Ren, and J. Sun, “Identity Mappings in Deep Residual Networks.”
  • [33] S. Zagoruyko and N. Komodakis, “Wide residual networks,” arXiv preprint arXiv:1605.07146, 2016.
  • [34] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the Inception Architecture for Computer Vision.”
  • [35] S. Wu, S. Zhong, and Y. Liu, “Deep residual learning for image steganalysis,” Multimedia Tools and Applications, pp. 1–17, 2017.
  • [36] G. Huang, Z. Liu, L. van der Maaten, and K.Q. Weinberger, “Densely Connected Convolutional Networks,” Aug. 2016.
  • [37] X. Glorot, A. Bordes, and Y. Bengio, “Deep sparse rectifier neural networks,” AISTATS ’11: Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, vol. 15, pp. 315–323, 2011.
  • [38] B. Xu, N. Wang, T. Chen, and M. Li, “Empirical evaluation of rectified activations in convolutional network,” arXiv preprint arXiv:1505.00853, 2015.
  • [39] M. Anthimopoulos, S. Christodoulidis, L. Ebner, A. Christe, and S. Mougiakakou, “Lung pattern classification for interstitial lung diseases using a deep convolutional neural network,” IEEE transactions on medical imaging, vol. 35, no. 5, pp. 1207–1216, 2016.
  • [40] B.Q. Huynh, H. Li, and M.L. Giger, “Digital mammographic tumor classification using transfer learning from deep convolutional neural networks,” Journal of Medical Imaging, vol. 3, no. 3, p. 034501, 2016.
  • [41] B. van Ginneken, A.A. Setio, C. Jacobs, and F. Ciompi, “Off-the-shelf convolutional neural network features for pulmonary nodule detection in computed tomography scans,” in Biomedical Imaging (ISBI), 2015 IEEE 12th International Symposium on, 2015, pp. 286–289.
  • [42] Y. Bar, I. Diamant, L. Wolf, S. Lieberman, E. Konen, et al., “Chest pathology detection using deep learning with non-medical training,” in Biomedical Imaging (ISBI), 2015 IEEE 12th International Symposium on, 2015, pp. 294–297.
  • [43] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 3431–3440.
  • [44] G. Huang, Y. Li, G. Pleiss, Z. Liu, J. E. Hopcroft, et al., “Snapshot ensembles: Train 1, get M for free,” arXiv preprint arXiv:1704.00109, 2017.
  • [45] T. Garipov, P. Izmailov, D. Podoprikhin, D.P. Vetrov, and A.G. Wilson, “Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs,” arXiv preprint arXiv:1802.10026, 2018.
  • [46] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, et al., “Going deeper with convolutions,” 2015.
  • [47] I. Goodfellow, Y. Bengio, A. Courville, and Y. Bengio, Deep learning, vol. 1. MIT press Cambridge, 2016.
  • [48] C.M. Bishop, Pattern Recognition and Machine Learning. New York: Springer, 2011.
  • [49] A. Esteva, B. Kuprel, R.A. Novoa, J. Ko, S.M. Swetter, et al., “Dermatologist-level classification of skin cancer with deep neural networks,” Nature, vol. 542, no. 7639, p. 115, 2017.
  • [50] F. Nachbar, W. Stolz, T. Merkle, A.B. Cognetta, T. Vogt, et al., “The ABCD rule of dermatoscopy: high prospective value in the diagnosis of doubtful melanocytic skin lesions,” Journal of the American Academy of Dermatology, vol. 30, no. 4, pp. 551–559, 1994.
  • [51] R.H. Johr, “Dermoscopy: alternative melanocytic algorithms – the ABCD rule of dermatoscopy, menzies scoring method, and 7-point checklist,” Clinics in dermatology, vol. 20, no. 3, pp. 240–247, 2002.
  • [52] J.S. Henning, S.W. Dusza, S.Q. Wang, A.A. Marghoob, H.S. Rabinovitz, et al., “The CASH (color, architecture, symmetry, and homogeneity) algorithm for dermoscopy,” Journal of the American Academy of Dermatology, vol. 56, no. 1, pp. 45–52, 2007.
  • [53] “ISIC Archive.” [Online]. Available: https://isic-archive.com/. [Accessed: 10-Jan-2018].
  • [54] “Edinburgh Innovations: Dermofit Image Library,” Edinburgh Innovations, online licensing portal. [Online]. Available: https://licensing.eri.ed.ac.uk/i/software/dermofit-image-library.html. [Accessed: 26-Apr-2018].
  • [55] G. Argenziano, H.P. Soyer, V. De Giorgi, D. Piccolo, P. Carli, et al., “Dermoscopy: a tutorial,” EDRA, Medical Publishing & New Media, vol. 16, 2002.
  • [56] T. Mendonça, P.M. Ferreira, J.S. Marques, A.R. Marcal, and J. Rozeira, “PH2 – A dermoscopic image database for research and benchmarking,” in Engineering in Medicine and Biology Society (EMBC), 2013 35th Annual International Conference of the IEEE, 2013, pp. 5437–5440.
  • [57] A. Mikołajczyk, A. Kwasigroch, and M. Grochowski, “Intelligent system supporting diagnosis of malignant melanoma,” in Polish Control Conference, 2017, pp. 828–837.
  • [58] F. Chollet et al., “Keras,” 2015.
  • [59] A. Galdran, A. Alvarez-Gila, M.I. Meyer, C.L. Saratxaga, T. Araújo, et al., “Data-Driven Color Augmentation Techniques for Deep Skin Image Analysis,” arXiv preprint arXiv:1703.03702, 2017.
  • [60] M. Grochowski, M. Wąsowicz, A. Mikołajczyk, M. Ficek, M. Kulka, et al., “Machine Learning System For Automated Blood Smear Analysis,” Metrology and Measurement Systems, vol. 26, no. 1, 2019.
  • [61] A. Mikołajczyk and M. Grochowski, “Data augmentation for improving deep learning in image classification problem,” in 2018 International Interdisciplinary PhD Workshop (IIPhDW), 2018, pp. 117–122.
Notes
This research was funded by the Polish Ministry of Science and Higher Education in the years 2017–2021 under Diamond Grant No. DI2016020746. The authors wish to express their thanks for this support.
The record was compiled under agreement 509/P-DUN/2018 with funds of the Ministry of Science and Higher Education allocated to science-dissemination activities (2019).
Document type
YADDA identifier
bwmeta1.element.baztech-b37dc386-543b-4827-baac-98d68ebfaae8