Designing Model Based Classifiers by Emphasizing Soft Targets

El Jelali, S.; Lyhyaoui, A.; Figueiras-Vidal, A.R.

Artykuł - szczegóły

Tytuł artykułu

Designing Model Based Classifiers by Emphasizing Soft Targets

Autorzy

El Jelali S. , Lyhyaoui A. , Figueiras-Vidal A.R.

Wybrane pełne teksty z tego czasopisma

https://fi.episciences.org/

Identyfikatory

Warianty tytułu

Języki publikacji

Abstrakty

When training machine classifiers, to replace hard classification targets by emphasized soft versions of them helps to reduce the negative effects of using standard cost functions as approximations to misclassification rates. This emphasis has the same kind of effect as sample editing methods, that have proved to be effective for improving classifiers performance. In this paper, we explore the effectiveness of using emphasized soft targets with generative models, such as Gaussian MixtureModels (GMM), and Gaussian Processes (GP). The interest of using GMMis that they offer advantages such as an easy interpretation and straightforward possibilities to deal with missing values. With respect to GP, if we use soft targets, we do not need to resort to any complex approximation to get a Gaussian Process classifier and, simultaneously, we can obtain the advantages provided by the use of an emphasis. Simulation results support the usefulness of the proposed approach to get better performance and show a low sensitivity to design parameters selection.

Słowa kluczowe

smoothing target sample selection classification Gaussian mixture models GMM Gaussian Processes GP

Wydawca

Polskie Towarzystwo Matematyczne

Czasopismo

Fundamenta Informaticae

Rocznik

2009

Tom

Vol. 96, nr 4

Strony

419--433

Opis fizyczny

Bibliogr. 47 poz., tab., wykr.

Twórcy

autor

El Jelali S.

autor

Lyhyaoui A.

autor

Figueiras-Vidal A.R.

Lab. 4.2.C.01 Dcha., Dept. of Signal Processing and Communications, Univ. Carlos III de Madrid, Av. de la Universidad 30, Legan´es, Madrid, 28911, Spain., arfv@tsc.uc3m.es

Bibliografia

[1] Rosenblatt, F.: The Perceptron: A probabilistic model for information storage and organization in the brain, Psychological Review, 65, 1958, 386-408.
[2] Kung, S. Y., Taur, J. S.: Decision-based neural networks with signal/image classification applications, IEEE Trans. Neural Networks, 6, 1995, 170-181.
[3] Fisher, R. A.: The use of multiple measurements in taxonomic problems, Annals of Eugenics, 7, Pt. II, 1936, 179-188.
[4] Telfer, B. A., Szu, H. H.: Energy functions for minimizing misclassification error with minimum-complexity networks, Neural Networks, 7, 1994, 809-818.
[5] Boser, B. E., Guyon, I., Vapnik, V.: A training algorithm for optimal margin classifiers, Proc. 5th Annual Workshop Comp. Learning Theory (D. Hassler, Ed.), ACM Press, Pittsburgh, PA, 1992, 144-152.
[6] Cortes, C., Vapnik, V.: Support Vector networks, Machine Learning, 20, 1995, 273-297.
[7] M¨uller, K. R., Mika, S., R¨atsch, G., Tsuda, K., Sch¨olkopf, B.: An introduction to kernel-based learning algorithm, IEEE Trans. Neural Networks, 12, 2001, 181-201.
[8] Hart, P. E.: The condensed nearest neighbor rule, IEEE Trans. Information Theory, 14, 1968, 515-516.
[9] Sklansky, J., Michelotti, L.: Locally trained piecewise linear classifiers, IEEE Trans. Pattern Anal. Machine Intelligence, 2, 1980, 101-111.
[10] Plutowski, M., White, H.: Selecting concise training sets from clean data, IEEE Trans. Neural Networks, 4, 1993, 305-318.
[11] Choi, S. H., Rockett, P.: The training of neural classifiers with condensed datasets, IEEE Trans. Sys., Man, and Cybernetics, Pt. B, 32, 2002, 202-206.
[12] Munro, P. W.: Repeat until bored: A pattern selection strategy, Adv. in Neural Inf. Proc. Sys., vol. 4, (J. E. Moody et al., Eds.), Morgan Kaufmann, San Mateo, CA, 1992, 1001-1008.
[13] Cachin, C.: Pedagogical pattern selection strategies, Neural Networks, 7, 1994, 171-181.
[14] Lyhyaoui, A., Martinez-Ramón,M., Mora-Jiménez, I., Vázquez Castro, M., Sancho Goméz, J. L., Figueiras-Vidal, A. R.: Sample selection via clustering to construct Support Vector-like classifiers, IEEE Trans. Neural Networks, 10, 1999, 1474-1481.
[15] Freund, Y., Schapire, R. E.: Experiments with a new boosting algorithm, Proc. 13th Intl. Conf. Machine Learning, Bari, Italy, 1996, 148-156.
[16] Freund, Y., Schapire, R. E.: Game theory, on-line prediction, and boosting, Proc. 9th Annual Conf. on Comput. Learning Theory, Desenzano di Garda, Italy, 1996, 325-332.
[17] Schapire, R. E., Singer, Y.: Improved boosting algorithms using confidence-rated predictions, Machine Learning, 37, 1999, 297-336.
[18] Gómez-Verdejo, V., Ortega-Moral, M., Arenas-Garc´ıa, J., Figueiras-Vidal, A. R.: Boosting by weighting critical and erroneous samples, Neurocomputing, 69, 2006, 679-685.
[19] Gómez-Verdejo, V., Arenas-Garcıa, J., Figueiras-Vidal, A. R.: A dynamically adjusted mixed emphasis method for building boosting ensembles, IEEE Trans. Neural Networks, 19, 2008, 3-17.
[20] Franco, L., Cannas, S. A.: Generalization and selection of examples in feed-forward neural networks, Neural Computation, 12, 2000, 2405-2426.
[21] Reed, R., Oh, S., Marks, II, R. J.: Similarities of error regularization, sigmoid gain scaling, target smoothing, and training with jitter, IEEE Trans. Neural Networks, 6, 1995, 529-538.
[22] Gorse, D., Shepperd, A. J., Taylor, J. G.: The new ERA in supervised learning, Neural Networks, 10, 1997, 343-352.
[23] El Jelali, S., Lyhyaoui, A., Figueiras-Vidal, A. R.: An emphasized target smoothing procedure to improve MLP classifiers performance, Proc. 16th European Symp. Artificial Neural Networks, Bruges, Belgium, 2008, 499-504.
[24] El Jelali, S., Lyhyaoui, A., Figueiras-Vidal, A. R.: Applying emphasized soft target for Gaussian mixture model based classification, Proc. Intl. Multiconf. on Computer Science and Information Technology, 3rd Intl. Symp. Advances in Artificial Intelligence and Applications, vol. 3, Wisla, Poland, 2008, 131-136.
[25] Bishop, C. M.: Pattern Recognition and Machine Learning, Springer, New York, NY, 2006.
[26] Rasmussen, C. E.,Williams, C. K. I.: Gaussian Processes forMachine Learning, TheMIT Press, Cambridge, MA, 2006.
[27] Breiman, L.: Combining predictors, in Combining Artificial Neural Nets: Ensemble and Modular Multi-Net Systems (A. J. C. Sharkey, Ed.), Springer, London, UK, 1999, 31-50.
[28] Jacobs, R. A., Jordan, M. I.: A competitive modular connectionist architecture, in Advances in Neural Info. Proc. Sys., vol. 5, (D. Touretzky, Ed.), Morgan Kaufmann, San Mateo, CA, 1991, 767-773.
[29] Jordan, M. I., Jacobs, R. A.: Hierarchical Mixtures of Experts and the EM algorithm, Neural Computation, 6, 1994, 181-214.
[30] Xu, L., Jordan, M. I., Hinton, G. E.: An alternative model for Mixtures of Experts, in Advances in Neural Information Processing Systems, 7, MIT Press, 1995, 633-640.
[31] Neal, R. M.: Probabilistic inference using Markov chain Monte Carlo methods, Technical Report CRG-TR-93-1, Department of Computer Science, University of Toronto, 1993.
[32] Kass, R. E., Carlin, B. P., Gelman, A., Neal, R. M.: Markov Chain Monte Carlo in Practice: A Roundtable Discussion, The American Statistician, 52, 93-100, 1998.
[33] Minka, T. P.: A Family of Algorithms for Approximate Bayesian Inference, PhD thesis,Massachusetts Institute of Technology, January 2001.
[34] Minka, T. P.: Expectation Propagation for Approximate Bayesian Inference, Uncertainty in Artificial Intelligence, 17, 2001, 362-369.
[35] Williams, C. K. I., Barber, D.: Bayesian Classification with Gaussian Processes, IEEE Trans. on Pattern Analysis and Machine Intelligence, 20(12), 1998, 1342-1351.
[36] Kwok, J. T.: Moderating the output of Support Vector classifiers, IEEE Trans. Neural Networks, 10, 1999, 1018-1031.
[37] Ripley, B. D.: Neural networks and related methods for classification (with discussion), J. Royal Statistical Soc. Series B, 56, 1994, 409-456.
[38] Ripley, B. D.: Pattern Recognition and Neural Networks: http://www.stats.ox.ac.uk/pub/PRNN
[39] Blake, C. L., Merty, C. J.: UCI Repository of Machine Learning Databases: www.ics.uci.edu/~mlearn
[40] Ruiz, A., Lopez-de-Teruel, P. E.: Nonlinear kernel-based statistical pattern analysis, IEEE Trans. Neural Networks, 12, 2001, 16-32.
[41] Archambeau, C., Delannay, N., Verleysen, M.: Mixtures of robust probabilistic principal component analyzers, Neurocomputing, 71(7-9), 2008, 1274-1282.
[42] Archambeau, C., Verleysen,M.: Robust Bayesian clustering, Neural Networks, 20(1), 2007, 129-138.
[43] Rasmussen, C. E., Williams, C. K. I.: Gaussian Processes for Machine Learning: www.GaussianProcess.org/gpml
[44] Kim, H.-C., Ghahramani, Z.: Bayesian Gaussian Process Classification with the EM-EP algorithm, IEEE Trans. Pattern Analysis and Machine Intelligence, 28, 2006, 1948-1959.
[45] Snelson, E., Ghahramani, Z.: Sparse Gaussian Processes using Pseudo-inputs, in Advances Neural Information Processing Systems, 18, MIT Press, 2006, 1257-1264.
[46] Pérez-Cruz, F.: IRWLS Matlab toolbox to solve the SVM for pattern recognition and regression estimation, 2002. Available: www.tsc.uc3m.es/~fernando/
[47] Kuss, M., Rasmussen, C. E.: Assessing approximate inference for binary Gaussian process classification, Journal of Machine Learning Research, 6, 2005, 1679-1704.

Typ dokumentu

Bibliografia

Identyfikator YADDA

bwmeta1.element.baztech-article-BUS8-0008-0058