Article title

A comparison of regularization techniques in the classification of handwritten digits

Publication languages
EN
Abstracts
EN
If a dataset is relatively small (e.g. the number of samples is less than the number of features) or the samples are distorted by noise, regularized models built on that dataset often give better results than unregularized ones. When the problem is ill-conditioned, regularization is necessary in order to find a solution. For data in which neighbouring values are correlated (as in images or time series), not only the individual weights but also the differences between them may be penalized in the model. This paper presents the results of an experiment in which several types of regularization (l2, l1, penalized differences) and their combinations were used to fit a logistic regression model (trained with a one-vs.-rest strategy), in order to find which of them works best for various training-set sizes. The data used in the experiment came from the publicly available MNIST dataset.
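Below is a minimal sketch (not the authors' code) of the kind of comparison the abstract describes, assuming scikit-learn and using its small 8x8 digits dataset as a stand-in for MNIST; the training-set size and the regularization strength C are illustrative choices, not values from the paper. The penalized-differences (fused-lasso-style) variant is not built into scikit-learn and typically requires a specialized solver such as split Bregman [8, 9], so only the l2 and l1 penalties are shown.

```python
# Sketch: l2- vs. l1-regularized logistic regression trained one-vs.-rest.
# Uses scikit-learn's 8x8 digits dataset as a stand-in for MNIST; the
# train_size and C values below are illustrative, not from the paper.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

X, y = load_digits(return_X_y=True)
# A deliberately small training set: the regime where regularization
# matters most (few samples relative to the number of features).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=200, random_state=0, stratify=y)

for penalty, solver in [("l2", "lbfgs"), ("l1", "saga")]:
    # One binary classifier per digit class (one-vs.-rest strategy).
    clf = OneVsRestClassifier(
        LogisticRegression(penalty=penalty, C=0.1, solver=solver,
                           max_iter=5000))
    clf.fit(X_train, y_train)
    print(f"{penalty}: test accuracy = {clf.score(X_test, y_test):.3f}")
```

Repeating the comparison over several train_size values reproduces the shape of the question the paper investigates: which penalty works best at which training-set size.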
Pages
3–7
Physical description
Bibliography: 10 items, figures, tables, formulas
Contributors
  • Faculty of Computer Science and Information Technology, West Pomeranian University of Technology, Szczecin, Poland
Bibliography
  • [1] Tikhonov, A. N., Arsenin, V. Y.: Solutions of Ill-posed problems. W.H. Winston, 1977.
  • [2] Williams, P. M.: Bayesian Regularisation and Pruning using a Laplace Prior. Neural Computation, 7, pp. 117–143, 1994.
  • [3] Tibshirani, R.: Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society (Series B), 58, pp. 267–288, 1996.
  • [4] Zou, H., Hastie, T.: Regularization and variable selection via the Elastic Net. Journal of the Royal Statistical Society, Series B, 67, pp. 301–320, 2005.
  • [5] Eilers, P. H. C.: A Perfect Smoother. Analytical Chemistry, 2003.
  • [6] Rudin, L. I., Osher, S., Fatemi, E.: Nonlinear Total Variation Based Noise Removal Algorithms. Phys. D, 60(1-4), pp. 259–268, 1992. ISSN 0167-2789.
  • [7] Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., Knight, K.: Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society Series B, 67(1), pp. 91–108, 2005.
  • [8] Goldstein, T., Osher, S.: The Split Bregman Method for L1-Regularized Problems. SIAM J. Img. Sci., 2(2), pp. 323–343, 2009. ISSN 1936-4954.
  • [9] Ye, G.-B., Xie, X.: Split Bregman method for large scale fused Lasso. Computational Statistics & Data Analysis, 55(4), pp. 1552–1569, 2011.
  • [10] LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-Based Learning Applied to Document Recognition. Proceedings of the IEEE, 86(11), pp. 2278–2324, 1998.
YADDA identifier
bwmeta1.element.baztech-73a9156a-8209-40bd-beb2-d32c40ef04c9