Article title

Implications of Pooling Strategies in Convolutional Neural Networks: A Deep Insight

Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Convolutional neural networks (CNNs) are a contemporary technique for computer vision applications, and pooling is an integral part of a deep CNN. Pooling provides the ability to learn invariant features and also acts as a regularizer that further reduces overfitting. Additionally, pooling techniques significantly reduce the computational cost and training time of networks, both of which are equally important to consider. Here, the performance of pooling strategies on different datasets is analyzed and discussed qualitatively. This study presents a detailed review of conventional and recent strategies, which should help apprise readers of the upsides and downsides of each strategy. We have also identified four fundamental factors, namely network architecture, activation function, overlapping, and regularization approaches, which strongly affect the performance of pooling operations. It is believed that this work will help extend the understanding of the significance of CNNs, along with pooling regimes, for solving computer vision problems.
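As a rough illustration of the two conventional strategies the abstract refers to, the sketch below applies max pooling and average pooling to a small feature map, first with non-overlapping windows and then with overlapping ones. It is not taken from the article: the helper names (`pool2d`, `max_pool`, `avg_pool`), the window sizes, and the sample values are assumptions chosen only for illustration.

```python
from typing import Callable, List, Sequence

Matrix = List[List[float]]


def pool2d(feature_map: Sequence[Sequence[float]], size: int, stride: int,
           reduce: Callable[[List[float]], float]) -> Matrix:
    """Slide a size x size window over feature_map and reduce each window."""
    rows, cols = len(feature_map), len(feature_map[0])
    out: Matrix = []
    for r in range(0, rows - size + 1, stride):
        out_row = []
        for c in range(0, cols - size + 1, stride):
            # Collect every value covered by the current window.
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            out_row.append(reduce(window))
        out.append(out_row)
    return out


def max_pool(fm, size=2, stride=2):
    # Keeps the strongest activation in each window.
    return pool2d(fm, size, stride, max)


def avg_pool(fm, size=2, stride=2):
    # Keeps the mean activation in each window.
    return pool2d(fm, size, stride, lambda w: sum(w) / len(w))


# Hypothetical 4x4 feature map (illustrative values only).
fmap = [
    [1.0, 3.0, 2.0, 0.0],
    [4.0, 6.0, 1.0, 1.0],
    [0.0, 2.0, 5.0, 7.0],
    [1.0, 1.0, 3.0, 9.0],
]

print(max_pool(fmap))                    # [[6.0, 2.0], [2.0, 9.0]]
print(avg_pool(fmap))                    # [[3.5, 1.0], [1.0, 6.0]]
# Overlapping pooling: the stride is smaller than the window size.
print(max_pool(fmap, size=3, stride=1))  # [[6.0, 7.0], [6.0, 9.0]]
```

With a 2x2 window and stride 2, the 4x4 map shrinks to 2x2, which is the source of the reduction in computation that the abstract mentions. The overlapping variant shares values between neighbouring windows.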
Year
Pages
303--330
Physical description
Bibliography: 57 items, figures, tables.
Authors
  • National Institute of Technical Teachers’ Training and Research (NITTTR), Chandigarh-160019, India
author
Bibliography
  • [1] Reddit, Machine Learning. Available: https://www.reddit.com/r/MachineLearning/comments/2lmo0l/ama_geoffrey_hinton/clyj4iv/.
  • [2] Achille A., Soatto S., Information dropout: Learning optimal representations through noisy computation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 2897-2905.
  • [3] Boureau Y.-L., Le Roux N., Bach F., Ponce J., LeCun Y., Ask the locals: multiway local pooling for image recognition, in Computer Vision (ICCV), 2011 IEEE International Conference on, 2011, 2651-2658.
  • [4] Cai M., Shi Y., Liu J., Stochastic pooling maxout networks for low-resource speech recognition, in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, 2014, 3266-3270.
  • [5] Cheng Y., Zhao X., Cai R., Li Z., Huang K., Rui Y., Semi-Supervised Multimodal Deep Learning for RGB-D Object Recognition, in IJCAI, 2016, 3345-3351.
  • [6] Dalal N., Triggs B., Histograms of oriented gradients for human detection, in Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, 2005, 886-893.
  • [7] DeVries T., Taylor G.W., Improved regularization of convolutional neural networks with cutout, arXiv preprint arXiv:1708.04552, 2017, 1-8.
  • [8] Donahue J., Jia Y., Vinyals O., Hoffman J., Zhang N., Tzeng E., Darrell T., Decaf: A deep convolutional activation feature for generic visual recognition, in International conference on machine learning, 2014, 647-655.
  • [9] Dumpala S.H., Chakraborty R., Kopparapu S.K., k-FFNN: A priori knowledge infused Feed-forward Neural Networks, arXiv preprint arXiv:1704.07055, 2017, 1-9.
  • [10] Ellacott S., An analysis of the delta rule, in International Neural Network Conference, 1990, 956-959.
  • [11] Everingham M., Van Gool L., Williams C.K., Winn J., Zisserman A., The PASCAL visual object classes (VOC) challenge, International Journal of Computer Vision, 88, 2, 2010, 303-338.
  • [12] Fei-Fei L., Fergus R., Perona P., Learning generative visual models from few training examples: An incremental bayesian approach tested on 101 object categories, Computer vision and Image understanding, 106, 1, 2007, 59-70.
  • [13] Girshick R., Donahue J., Darrell T., Malik J., Rich feature hierarchies for accurate object detection and semantic segmentation, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2014, 580-587.
  • [14] Goodfellow I., Bengio Y., Courville A., Deep learning, MIT Press, Cambridge, 2016.
  • [15] Goodfellow I.J., Warde-Farley D., Mirza M., Courville A., Bengio Y., Maxout networks, arXiv preprint arXiv:1302.4389, 2013, 1-9.
  • [16] Graham B., Fractional max-pooling, arXiv preprint arXiv:1412.6071, 2014, 1-10.
  • [17] Grauman K., Darrell T., The pyramid match kernel: Discriminative classification with sets of image features, in Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on, 2005, 1458-1465.
  • [18] He K., Zhang X., Ren S., Sun J., Spatial pyramid pooling in deep convolutional networks for visual recognition, in European conference on computer vision, 2014, 346-361.
  • [19] He K., Zhang X., Ren S., Sun J., Spatial pyramid pooling in deep convolutional networks for visual recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, 37, 9, 2015, 1904-1916.
  • [20] Hebb D., The organization of behavior: a neuropsychological theory, L. Erlbaum Associates, Mahwah, NJ, 1949.
  • [21] Hinton G.E., Srivastava N., Krizhevsky A., Sutskever I., Salakhutdinov R.R., Improving neural networks by preventing co-adaptation of feature detectors, arXiv preprint arXiv:1207.0580, 2012, 1-18.
  • [22] Khan Z.H., Alin T.S., Hussain M.A., Price prediction of share market using artificial neural network (ANN), International Journal of Computer Applications, 22, 2, 2011, 42-47.
  • [23] Krizhevsky A., Sutskever I., Hinton G.E., Imagenet classification with deep convolutional neural networks, in Advances in neural information processing systems, 2012, 1097-1105.
  • [24] Lang K.J., Hinton G.E., Dimensionality reduction and prior knowledge in e-set recognition, in Advances in neural information processing systems, 1990, 178-185.
  • [25] Lazebnik S., Schmid C., Ponce J., Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories, in Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, 2006, 2169-2178.
  • [26] LeCun Y., Generalization and network design strategies, Connectionism in perspective, 1989, 143-155.
  • [27] LeCun Y., Bengio Y., Hinton G., Deep learning, Nature, 521, 2015, 436-444.
  • [28] LeCun Y., Boser B., Denker J.S., Henderson D., Howard R.E., Hubbard W., Jackel L.D., Backpropagation applied to handwritten zip code recognition, Neural computation, 1, 4, 1989, 541-551.
  • [29] LeCun Y., Bottou L., Bengio Y., Haffner P., Gradient-based learning applied to document recognition, Proceedings of the IEEE, 86, 11, 1998, 2278-2324.
  • [30] Lee C.-Y., Gallagher P.W., Tu Z., Generalizing pooling functions in convolutional neural networks: Mixed, gated, and tree, in Artificial Intelligence and Statistics, 2016, 464-472.
  • [31] Lemley J., Bazrafkan S., Corcoran P., Smart Augmentation Learning an Optimal Data Augmentation Strategy, IEEE Access, 5, 2017, 5858-5869.
  • [32] Lowe D.G., Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision, 60, 2, 2004, 91-110.
  • [33] McCulloch W.S., Pitts W., A logical calculus of the ideas immanent in nervous activity, The bulletin of mathematical biophysics, 5, 4, 1943, 115-133.
  • [34] Mehdipour Ghazi M., Kemal Ekenel H., A comprehensive analysis of deep learning based representation for face recognition, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2016, 34-41.
  • [35] Nagpal S., Singh M., Vatsa M., Singh R., Regularizing deep learning architecture for face recognition with weight variations, in Biometrics Theory, Applications and Systems (BTAS), 2015 IEEE 7th International Conference on, 2015, 1-6.
  • [36] Nowlan S.J., Hinton G.E., Simplifying neural networks by soft weight-sharing, Neural computation, 4, 4, 1992, 473-493.
  • [37] Piccinini G., The first computational theory of mind and brain: a close look at McCulloch and Pitts's “logical calculus of ideas immanent in nervous activity”, Synthese, 141, 2, 2004, 175-215.
  • [38] Plaut D.C., Experiments on Learning by Back Propagation, 1986, 1-49.
  • [39] Rumelhart D.E., Hinton G.E., Williams R.J., Learning representations by back-propagating errors, Nature, 323, 6088, 1986, 533-536.
  • [40] Rumelhart D.E., McClelland J.L., Parallel distributed processing: explorations in the microstructure of cognition, Volume 1: Foundations, 1986.
  • [41] Scherer D., Muller A., Behnke S., Evaluation of pooling operations in convolutional architectures for object recognition, Springer, 2010, 92-101.
  • [42] Shallu, Mehra R., Automatic Magnification Independent Classification of Breast Cancer Tissue in Histological Images Using Deep Convolutional Neural Network, Singapore, 2019, 772-781.
  • [43] Shallu, Mehra R., Kumar S., An insight into the convolutional neural network for the analysis of medical images, presented at the Nanotechnology for Instrumentation and Measurement Workshop, 2017.
  • [44] Sharma S., Mehra R., Breast cancer histology images classification: Training from scratch or transfer learning?, ICT Express, 4, 4, 2018, 247-254.
  • [45] Shi Z., Ye Y., Wu Y., Rank-based pooling for deep convolutional neural networks, Neural Networks, 83, 2016, 21-31.
  • [46] Springenberg J.T., Dosovitskiy A., Brox T., Riedmiller M., Striving for simplicity: The all convolutional net, arXiv preprint arXiv:1412.6806, 2014.
  • [47] Srivastava N., Hinton G., Krizhevsky A., Sutskever I., Salakhutdinov R., Dropout: a simple way to prevent neural networks from overfitting, The Journal of Machine Learning Research, 15, 1, 2014, 1929-1958.
  • [48] Szegedy C., Liu W., Jia Y., Sermanet P., Reed S., Anguelov D., Erhan D., Vanhoucke V., Rabinovich A., Going deeper with convolutions, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, 1-9.
  • [49] Szegedy C., Toshev A., Erhan D., Deep neural networks for object detection, in Advances in neural information processing systems, 2013, 2553-2561.
  • [50] Wu H., Gu X., Max-pooling dropout for regularization of convolutional neural networks, in International Conference on Neural Information Processing, 2015, 46-54.
  • [51] Xu B., Wang N., Chen T., Li M., Empirical evaluation of rectified activations in convolutional network, arXiv preprint arXiv:1505.00853, 2015, 1-5.
  • [52] Yadav N., Yadav A., Kumar M., Preliminaries of Neural Networks, An introduction to neural network methods for differential equations, 2015, 17-42.
  • [53] Yu D., Wang H., Chen P., Wei Z., Mixed pooling for convolutional neural networks, in International Conference on Rough Sets and Knowledge Technology, 2014, 364-375.
  • [54] Zeiler M.D., Fergus R., Stochastic pooling for regularization of deep convolutional neural networks, arXiv preprint arXiv:1301.3557, 2013, 1-9.
  • [55] Zeiler M.D., Fergus R., Visualizing and understanding convolutional networks, in European conference on computer vision, 2014, 818-833.
  • [56] Zhai S., Wu H., Kumar A., Cheng Y., Lu Y., Zhang Z., Feris R.S., S3Pool: Pooling with Stochastic Spatial Sampling, in CVPR, 2017, 4003-4011.
  • [57] Zhou B., Khosla A., Lapedriza A., Oliva A., Torralba A., Object detectors emerge in deep scene CNNs, arXiv preprint arXiv:1412.6856, 2014, 1-12.
Notes
Record prepared under agreement 509/P-DUN/2018, with funds from the Ministry of Science and Higher Education (MNiSW) allocated to science dissemination activities (2019).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-8f4a0ea8-03a9-43a6-922a-b892c3e3281e