Article title

A survey of old and new results for the test error estimation of a classifier

Content
Identifiers
Title variants
Languages of publication
EN
Abstracts
EN
The estimation of the generalization error of a trained classifier by means of a test set is one of the oldest problems in pattern recognition and machine learning. Although this problem has been addressed for several decades, the last word has apparently not been written yet, because new proposals continue to appear in the literature. Our objective is to survey and compare old and new techniques in terms of quality of estimation, ease of use, and rigor of approach, so as to understand whether the new proposals represent an effective improvement over the old ones.
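Among the techniques reviewed in this area are classical confidence intervals for a binomial proportion, such as the Wilson [20] and Clopper–Pearson [21] intervals, and distribution-free concentration inequalities such as Hoeffding's bound [29]. As a purely illustrative sketch (not taken from the paper; the function names and the sample counts below are assumptions), the following Python snippet computes a Wilson interval and a one-sided Hoeffding upper bound for the true error rate from the number of misclassifications observed on a held-out test set.

```python
import math

def wilson_interval(errors, n, z=1.959964):
    """Wilson score interval [20] for the true error rate, given `errors`
    misclassifications on `n` i.i.d. test samples; z is the normal quantile
    for the desired two-sided confidence level (1.96 ~ 95%)."""
    p_hat = errors / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return max(0.0, centre - half), min(1.0, centre + half)

def hoeffding_upper_bound(errors, n, delta=0.05):
    """One-sided Hoeffding bound [29]: with probability at least 1 - delta,
    the true error rate is at most the empirical rate plus sqrt(ln(1/delta) / (2n))."""
    return errors / n + math.sqrt(math.log(1.0 / delta) / (2.0 * n))

# Hypothetical test-set outcome: 23 errors observed on 500 test samples.
print(wilson_interval(23, 500))        # roughly (0.031, 0.068)
print(hoeffding_upper_bound(23, 500))  # roughly 0.101
```

For small test sets the two results can differ noticeably: Hoeffding's inequality ignores the observed variance, which is one of the issues motivating the sharper alternatives cited below (e.g. the empirical Bernstein bounds of [26, 27]).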
Keywords
Year
Pages
229–242
Physical description
Bibliography: 32 items, figures.
Authors
author
  • Department of Biophysical and Electronic Engineering, University of Genova, Via Opera Pia 11A, 16145 Genova, Italy
  • Department of Biophysical and Electronic Engineering, University of Genova, Via Opera Pia 11A, 16145 Genova, Italy
author
  • Department of Biophysical and Electronic Engineering, University of Genova, Via Opera Pia 11A, 16145 Genova, Italy
author
  • Department of Biophysical and Electronic Engineering, University of Genova, Via Opera Pia 11A, 16145 Genova, Italy
Bibliography
  • [1] C.M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, 1995.
  • [2] C. Cortes, V. Vapnik, “Support–Vector Networks”, Machine Learning, Vol. 20, 1995, pp. 273–297.
  • [3] L. Roberge, S. B. Long, D. B. Burnham, “Data Warehouses and Data Mining tools for the legal profession: using information technology to raise the standard of practice”, Syracuse Law Review, vol. 52, 2002, pp. 1281–1292.
  • [4] P. Bartlett, S. Boucheron, G. Lugosi, “Model Selection and Error Estimation”, Machine Learning, Vol. 48, 2002, pp. 85–113.
  • [5] L. Devroye, L. Gyorfi, G. Lugosi, A Probabilistic Theory of Pattern Recognition, Springer Verlag, 1996.
  • [6] V.N. Vapnik, A.Y. Chervonenkis, “On the uniform convergence of relative frequencies of events to their probabilities”, Theory of Probability and its Applications, Vol. 16, 1971, pp. 264-280.
  • [7] O. Gascuel, G. Caraux, “Distribution-free performance bounds with the resubstitution error estimate”, Pattern Recognition Letters, Vol. 13, 1992, pp. 757–764.
  • [8] K. Duan, S.S. Keerthi, A. Poo, “Evaluation of simple performance measures for tuning SVM parameters”, Neurocomputing, Vol. 51, 2003, pp. 41–59.
  • [9] D. Anguita, A. Boni, S. Ridella, F. Rivieccio, D. Sterpi, “Theoretical and practical model selection methods for Support Vector classifiers”, In: “Support Vector Machines: Theory and Applications”, L. Wang (Ed.), Springer, 2005.
  • [10] D. Anguita, A. Ghio, S. Ridella, “Maximal Discrepancy for Support Vector Machines”, Neurocomputing, vol. 74, 2011, pp. 1436–1443.
  • [11] C.J.C. Burges, “A tutorial on Support Vector Machines for classification”, Data Mining and Knowledge Discovery, vol. 2, 1998, pp. 121-167.
  • [12] C.W. Hsu, C.C. Chang, C.J. Lin, “A practical guide to Support Vector classification”, Technical report, Dept. of Computer Science, National Taiwan University, 2003.
  • [13] D. Anguita, L. Ghelardoni, A. Ghio, S. Ridella, “Test error bounds for classifiers: A survey of old and new results”, In: Proc. of the 2011 IEEE Symposium on Foundations of Computational Intelligence (FOCI), Paris, France, 2011, pp. 80-87.
  • [14] D. Anguita, A. Ghio, S. Ridella, D. Sterpi, “K–Fold Cross Validation for Error Rate Estimate in Support Vector Machines”, In: Proc. of the Int. Conf. on Data Mining (DMIN’09), 2009, pp. 291–297.
  • [15] M. Anthony, S. B. Holden, “Cross–Validation for binary classification by real–valued functions: theoretical analysis”, In: Proc. of the 11th Conf. on Computational Learning Theory, 1998, pp. 218–229.
  • [16] V. Vapnik, The Nature of Statistical Learning Theory, Springer Verlag, 2000.
  • [17] J.C. Platt, “Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods”, Advances in Large Margin Classifiers, 1999, pp. 61–74.
  • [18] P. Bartlett, A. Tewari, “Sparseness vs Estimating Conditional Probabilities: Some Asymptotic Results”, Journal of Machine Learning Research, Vol. 8, 2007, pp. 775–790.
  • [19] L.D. Brown, T.T. Cai, A. DasGupta, “Interval estimation for a binomial proportion”, Statistical Science, Vol. 16, 2001, pp. 101–133.
  • [20] E.B. Wilson, “Probable inference, the law of succession, and statistical inference”, Journal of the American Statistical Association, Vol. 22, 1927, pp. 209–212.
  • [21] C.J. Clopper, E.S. Pearson, “The use of confidence intervals for fiducial limits illustrated in the case of the binomial”, Biometrika, Vol. 26, 1934, pp. 404-413.
  • [22] J. Langford, “Tutorial on practical prediction theory for classification”, Journal of Machine Learning Research, Vol. 6, 2005, pp. 273-306.
  • [23] A. Papoulis, Probability, Random Variables, and Stochastic Processes, 3rd ed., McGraw-Hill, 1991.
  • [24] L. Guttman, “A distribution-free confidence interval for the mean”, The Annals of Mathematical Statistics, Vol. 19, 1948, pp. 410–413.
  • [25] S.N. Bernstein, Sobranie sochinenie, Tom IV. Teoriya veroyatnostei, Matematicheskaya statistika (1911-1946) (Collected works, Vol. IV. The theory of probability, Mathematical statistics (1911-1946)), Izdat. Nauka, 1964.
  • [26] J.Y. Audibert, R. Munos, C. Szepesvari, “Exploration-exploitation trade-off using variance estimates in multi-armed bandits”, Theoretical Computer Science, Vol. 410, 2009, pp. 1876–1902.
  • [27] A. Maurer, M. Pontil, “Empirical Bernstein Bounds and Sample Variance Penalization”, In: Proc. of the Int. Conference on Learning Theory (COLT), 2009.
  • [28] Y. Mansour, D. McAllester, “Generalization Bounds for Decision Trees”, In: Proc. of the Thirteenth Annual Conference on Computational Learning Theory, 2000, pp. 69–74.
  • [29] W. Hoeffding, “Probability inequalities for sums of bounded random variables”, Journal of the American Statistical Association, Vol. 58, 1963, pp. 13–30.
  • [30] V. Bentkus, “An Inequality for Large Deviation Probabilities of Sums of Bounded i.i.d. Random Variables”, Lithuanian Mathematical Journal, Vol. 41, 2001, pp. 112–119.
  • [31] V. Bentkus, “On Hoeffding's inequalities”, Annals of Probability, Vol. 32, 2004, pp. 1650–1673.
  • [32] A. Blum, A. Kalai, J. Langford, “Beating the Hold-Out: Bounds for K-fold and Progressive Cross-Validation”, In: Proc. of the Conference on Computational Learning Theory, 1999, pp. 203–208.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-3b10829d-6894-458d-882c-f3a42bc25feb