Article title

On Loss Functions for Deep Neural Networks in Classification

Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Deep neural networks are currently among the most commonly used classifiers. Apart from easily achieving very good performance, one of the strongest selling points of these models is their modular design: one can conveniently adapt their architecture to specific needs, change connectivity patterns, attach specialised layers, and experiment with a wide range of activation functions, normalisation schemes and many other options. Yet while an impressively broad spectrum of configurations can be found for almost every aspect of deep nets, one element is, in the authors' opinion, underrepresented: when solving classification problems, the vast majority of papers and applications simply use log loss. In this paper we investigate how particular choices of loss function affect deep models and their learning dynamics, as well as the robustness of the resulting classifiers to various effects. We perform experiments on classical datasets and also provide additional theoretical insights into the problem. In particular, we show that, quite surprisingly, the L1 and L2 losses are justified classification objectives for deep nets, by giving them a probabilistic interpretation in terms of expected misclassification. We also introduce two losses that are not typically used as deep-net objectives and show that they are viable alternatives to the existing ones.
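As a rough illustration of the objectives compared in the abstract, a minimal NumPy sketch of the log, L1, and L2 losses applied to softmax outputs might look as follows. This is not the authors' code; the function names and the toy data are illustrative assumptions only.

import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def log_loss(probs, y_onehot, eps=1e-12):
    # Cross-entropy between one-hot targets and predicted probabilities.
    return -np.mean(np.sum(y_onehot * np.log(probs + eps), axis=-1))

def l1_loss(probs, y_onehot):
    # Mean absolute difference between predicted and target distributions.
    return np.mean(np.sum(np.abs(probs - y_onehot), axis=-1))

def l2_loss(probs, y_onehot):
    # Mean squared difference between predicted and target distributions.
    return np.mean(np.sum((probs - y_onehot) ** 2, axis=-1))

# Hypothetical example: three samples, four classes.
logits = np.array([[2.0, 0.5, -1.0, 0.1],
                   [0.2, 1.5, 0.3, -0.5],
                   [-0.3, 0.1, 0.0, 2.2]])
targets = np.eye(4)[[0, 1, 3]]
p = softmax(logits)
print(log_loss(p, targets), l1_loss(p, targets), l2_loss(p, targets))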
Keywords
Year
Volume
Pages
49–59
Physical description
Bibliography: 13 items, figures
Contributors
author
  • Faculty of Mathematics and Computer Science, Jagiellonian University, Kraków, Poland
  • Faculty of Mathematics and Computer Science, Jagiellonian University, Kraków, Poland
  • DeepMind, London, UK
Bibliography
  • [1] Larochelle H., Bengio Y., Louradour J., Lamblin P., Exploring strategies for training deep neural networks. Journal of Machine Learning Research, 2009, 10 (Jan), pp. 1–40.
  • [2] Krizhevsky A., Sutskever I., Hinton G.E., ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
  • [3] Oord A.v.d., Dieleman S., Zen H., Simonyan K., Vinyals O., Graves A., Kalchbrenner N., Senior A., Kavukcuoglu K., WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499, 2016.
  • [4] Silver D., Huang A., Maddison C.J., Guez A., Sifre L., van den Driessche G., Schrittwieser J., Antonoglou I., Panneershelvam V., Lanctot M., et al., Mastering the game of Go with deep neural networks and tree search. Nature, 2016, 529 (7587), pp. 484–489.
  • [5] Clevert D.A., Unterthiner T., Hochreiter S., Fast and accurate deep network learning by exponential linear units (ELUs). arXiv preprint arXiv:1511.07289, 2015.
  • [6] Kingma D., Ba J., Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  • [7] Tang Y., Deep learning using linear support vector machines. arXiv preprint arXiv:1306.0239, 2013.
  • [8] Lee C.Y., Xie S., Gallagher P., Zhang Z., Tu Z., Deeply-supervised nets. In: AISTATS, vol. 2, 2015, p. 6.
  • [9] Choromanska A., Henaff M., Mathieu M., Arous G.B., LeCun Y., The loss surfaces of multilayer networks. In: AISTATS, 2015.
  • [10] Czarnecki W.M., Jozefowicz R., Tabor J., Maximum entropy linear manifold for learning discriminative low-dimensional representation. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Springer, 2015, pp. 52–67.
  • [11] LeCun Y., Cortes C., Burges C.J., The MNIST database of handwritten digits, 1998.
  • [12] Srivastava N., Hinton G.E., Krizhevsky A., Sutskever I., Salakhutdinov R., Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 2014, 15 (1), pp. 1929–1958.
  • [13] Principe J.C., Xu D., Fisher J., Information theoretic learning. Unsupervised adaptive filtering, 2000, 1, pp. 265–319.
Notes
EN
Prepared with funds from the Polish Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for activities promoting science (2017 tasks).
Document type
YADDA identifier
bwmeta1.element.baztech-30227f10-ea1b-4095-9e5c-46d8c36d2ea6