Article title

Choice of the p-norm for high level classification features pruning in modern convolutional neural networks with local sensitivity analysis

Publication languages
EN
Abstracts
EN
Transfer learning has surfaced as a compelling technique in machine learning, enabling the transfer of knowledge across networks. This study evaluates the efficacy of ImageNet-pretrained state-of-the-art networks, including DenseNet, ResNet, and VGG, in implementing transfer learning for pre-pruned models on compact datasets such as FashionMNIST, CIFAR10, and CIFAR100. The primary objective is to reduce the number of neurons while preserving high-level features. To this end, local sensitivity analysis is employed alongside p-norms and various reduction levels. The investigation finds that VGG16, a network rich in parameters, is resilient to high-level feature pruning, whereas the ResNet architectures exhibit markedly higher volatility. These observations help identify an optimal combination of the norm and the reduction level for each network architecture, offering valuable directions for model-specific optimization. This study marks a significant advance in understanding and implementing effective pruning strategies across diverse network architectures, paving the way for future research and applications.
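The abstract outlines the core mechanism: score the neurons of a high-level (classification) layer with a local sensitivity measure, aggregate the per-sample scores with a p-norm, and remove the lowest-ranked fraction. The sketch below illustrates that idea in Keras/TensorFlow. It is a minimal illustration, not the authors' exact procedure: the gradient-based sensitivity stand-in, the layer name, and the parameters `p` and `reduction` are assumptions chosen for the example.

```python
import numpy as np
import tensorflow as tf

def prune_dense_layer(model, layer_name, x_batch, p=2.0, reduction=0.5):
    """Illustrative sketch: rank the neurons of one dense layer by a
    gradient-based local sensitivity score aggregated with a p-norm,
    then zero out the `reduction` fraction with the lowest scores."""
    layer = model.get_layer(layer_name)
    # Probe model exposing both the layer's activations and the output.
    probe = tf.keras.Model(model.input, [layer.output, model.output])

    with tf.GradientTape() as tape:
        acts, preds = probe(x_batch)            # acts: (batch, units)
        target = tf.reduce_max(preds, axis=-1)  # top-class score per sample
    # Local sensitivity: derivative of the top-class score with respect
    # to each neuron's activation, evaluated at the given samples.
    grads = tape.gradient(target, acts)

    # Aggregate each neuron's per-sample sensitivities with the p-norm.
    scores = tf.norm(grads, ord=p, axis=0).numpy()

    # Prune (zero out) the least sensitive neurons.
    n_prune = int(reduction * scores.size)
    prune_idx = np.argsort(scores)[:n_prune]
    kernel, bias = layer.get_weights()
    kernel[:, prune_idx] = 0.0
    bias[prune_idx] = 0.0
    layer.set_weights([kernel, bias])
    return prune_idx

# Example (hypothetical setup): prune half of VGG16's 'fc2' neurons with
# the 2-norm; x_batch must be a preprocessed (224, 224, 3) image batch.
# model = tf.keras.applications.VGG16(weights="imagenet")
# pruned = prune_dense_layer(model, "fc2", x_batch, p=2.0, reduction=0.5)
```

Zeroing weights rather than rebuilding the layer keeps the sketch short; an actual pruning pipeline would typically rebuild the layer with fewer units and copy over the surviving weights.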
Year
Pages
663-672
Physical description
Bibliography: 38 items, figures, tables, charts.
Authors
  • Doctoral School, AGH University of Krakow, al. A. Mickiewicza 30, 30-059 Krakow, Poland
  • Faculty of Physics and Applied Computer Science, AGH University of Krakow, al. A. Mickiewicza 30, 30-059 Krakow, Poland
  • Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447 Warsaw, Poland
Bibliography
  • [1] Antwarg, L., Miller, R.M., Shapira, B. and Rokach, L. (2021). Explaining anomalies detected by autoencoders using Shapley additive explanations, Expert Systems with Applications 186: 115736.
  • [2] Chollet, F. (2017). Xception: Deep learning with depthwise separable convolutions, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, USA, pp. 1800-1807.
  • [3] Corrales, D.C., Ledezma, A. and Corrales, J.C. (2018). From theory to practice: A data quality framework for classification tasks, Symmetry 10(7): 248.
  • [4] Cuzzocrea, A. (2020). Uncertainty and imprecision in big data management: Models, issues, paradigms, and future research directions, Proceedings of the 2020 4th International Conference on Cloud and Big Data Computing, pp. 6-9, (virtual).
  • [5] Fock, E. (2014). Global sensitivity analysis approach for input selection and system identification purposes-A new framework for feedforward neural networks, IEEE Transactions on Neural Networks and Learning Systems 25(8): 1484-1495.
  • [6] Gui, J., Sun, Z., Wen, Y., Tao, D. and Ye, J. (2021). A review on generative adversarial networks: Algorithms, theory, and applications, IEEE Transactions on Knowledge and Data Engineering 35(4): 3313-3332.
  • [7] He, K., Zhang, X., Ren, S. and Sun, J. (2016a). Deep residual learning for image recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, USA, pp. 770-778.
  • [8] He, K., Zhang, X., Ren, S. and Sun, J. (2016b). Identity mappings in deep residual networks, in B. Leibe et al. (Eds.), Computer Vision-ECCV 2016, Springer, Cham, pp. 630-645.
  • [9] Hidaka, A. and Kurita, T. (2017). Consecutive dimensionality reduction by canonical correlation analysis for visualization of convolutional neural networks, Proceedings of the ISCIE International Symposium on Stochastic Systems Theory and Its Applications, Fukuoka, Japan, Vol. 2017, pp. 160-167.
  • [10] Huang, G., Liu, Z., Van Der Maaten, L. and Weinberger, K.Q. (2017). Densely connected convolutional networks, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, USA, pp. 2261-2269.
  • [11] Huang, Y.-C., Tung, Y.-S., Chen, J.-C., Wang, S.-W. and Wu, J.-L. (2005). An adaptive edge detection based colorization algorithm and its applications, MULTIMEDIA '05: Proceedings of the 13th Annual ACM International Conference on Multimedia, Singapore, DOI: 10.1145/1101149.1101223.
  • [12] Jeczmionek, E. and Kowalski, P.A. (2021). Flattening layer pruning in convolutional neural networks, Symmetry 13(7): 1147.
  • [13] Kahani, N., Bagherzadeh, M., Cordy, J.R., Dingel, J. and Varró, D. (2019). Survey and classification of model transformation tools, Software & Systems Modeling 18(4): 2361-2397.
  • [14] Keras (2023). Keras Applications, https://keras.io/api/applications/.
  • [15] Kowal, M., Skobel, M. and Nowicki, N. (2018). The feature selection problem in computer-assisted cytology, International Journal of Applied Mathematics and Computer Science 28(4): 759-770, DOI: 10.2478/amcs-2018-0058.
  • [16] Kowalski, P.A. and Kusy, M. (2017). Sensitivity analysis for probabilistic neural network structure reduction, IEEE Transactions on Neural Networks and Learning Systems 29(5): 1919-1932.
  • [17] Kowalski, P.A. and Kusy, M. (2018). Determining significance of input neurons for probabilistic neural network by sensitivity analysis procedure, Computational Intelligence 34(3): 895-916.
  • [18] Krizhevsky, A. (2009). Learning multiple layers of features from tiny images, https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf.
  • [19] Kusy, M. and Kowalski, P.A. (2018). Weighted probabilistic neural network, Information Sciences 430: 65-76.
  • [20] Kusy, M. and Zajdel, R. (2021). A weighted wrapper approach to feature selection, International Journal of Applied Mathematics and Computer Science 31(4): 685-696, DOI: 10.34768/amcs-2021-0047.
  • [21] Lundberg, S.M. and Lee, S.-I. (2017). A unified approach to interpreting model predictions, Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 4768-4777.
  • [22] Mishkin, D., Sergievskiy, N. and Matas, J. (2017). Systematic evaluation of convolution neural network advances on the ImageNet, Computer Vision and Image Understanding 161: 11-19.
  • [23] Riesz, F. (1910). Untersuchungen über Systeme integrierbarer Funktionen, Mathematische Annalen 69(4): 449-497.
  • [24] Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. and Fei-Fei, L. (2015). ImageNet large scale visual recognition challenge, International Journal of Computer Vision 115(3): 211-252.
  • [25] Saltelli, A., Annoni, P., Azzini, I., Campolongo, F., Ratto, M. and Tarantola, S. (2010). Variance based sensitivity analysis of model output. Design and estimator for the total sensitivity index, Computer Physics Communications 181(2): 259-270.
  • [26] Saltelli, A., Ratto, M., Andres, T., Campolongo, F., Cariboni, J., Gatelli, D., Saisana, M. and Tarantola, S. (2008). Global Sensitivity Analysis: The Primer, John Wiley & Sons, New York.
  • [27] Saltelli, A., Tarantola, S., Campolongo, F. and Ratto, M. (2004). Sensitivity Analysis in Practice: A Guide to Assessing Scientific Models, Wiley, New York.
  • [28] Saltelli, A., Tarantola, S. and Chan, K. (2012). A quantitative model-independent method for global sensitivity analysis of model output, Technometrics 41(1): 39-56.
  • [29] Saura, J.R. (2021). Using data sciences in digital marketing: Framework, methods, and performance metrics, Journal of Innovation & Knowledge 6(2): 92-102.
  • [30] Shi, D., Yeung, D.S. and Gao, J. (2005). Sensitivity analysis applied to the construction of radial basis function networks, Neural Networks 18(7): 951-957.
  • [31] Simonyan, K. and Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition, 3rd International Conference on Learning Representations, ICLR 2015, San Diego, USA.
  • [32] Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J. and Wojna, Z. (2016). Rethinking the inception architecture for computer vision, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, pp. 2818-2826.
  • [33] Tan, M. and Le, Q. (2019). EfficientNet: Rethinking model scaling for convolutional neural networks, International Conference on Machine Learning, Long Beach, USA, pp. 6105-6114.
  • [34] Tzeng, F.-Y. and Ma, K.-L. (2005). Opening the black box-data driven visualization of neural networks, VIS 05: IEEE Visualization, 2005, Minneapolis, USA, pp. 383-390.
  • [35] Xiao, H., Rasul, K. and Vollgraf, R. (2017). Fashion-MNIST: A novel image dataset for benchmarking machine learning algorithms, arXiv: 1708.07747.
  • [36] Xie, S., Girshick, R., Dollar, P., Tu, Z. and He, K. (2017). Aggregated residual transformations for deep neural networks, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, USA, pp. 5987-5995.
  • [37] Yamashita, R., Nishio, M., Do, R.K.G. and Togashi, K. (2018). Convolutional neural networks: An overview and application in radiology, Insights into Imaging 9(4): 611-629.
  • [38] Zurada, J.M., Malinowski, A. and Usui, S. (1997). Perturbation method for deleting redundant inputs of perceptron networks, Neurocomputing 14(2): 177-193.
Document type
YADDA identifier
bwmeta1.element.baztech-8b58cd74-9155-41aa-81e0-3c8845a9127a