Article title

Comparison of prototype selection algorithms used in construction of neural networks learned by SVD

Authors
Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Radial basis function networks (RBFNs) and extreme learning machines (ELMs) can be seen as linear combinations of kernel functions (hidden neurons). Kernels can be constructed in a random process as in ELMs, their positions can be initialized by a random subset of training vectors, or they can be constructed in a (sub-)learning process (for example, by k-means). We found that kernels constructed using prototype selection algorithms provide very accurate and stable solutions. In addition, prototype selection algorithms automatically choose not only the placement of prototypes but also their number, so it is no longer necessary to estimate the number of kernels with time-consuming multiple train-test procedures. The best learning results can be obtained by pseudo-inverse learning with a singular value decomposition (SVD) algorithm. The article presents a comparison of several prototype selection algorithms combined with SVD-based learning. The comparison clearly shows that the combination of prototype selection and SVD learning of a neural network is significantly better than random kernel selection for the RBFN or the ELM, as well as the support vector machine or the kNN. Moreover, the presented learning scheme requires no parameters except for the width of the Gaussian kernel.
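The scheme described above maps directly onto a few lines of linear algebra. The sketch below is a minimal illustration, not the authors' implementation: it assumes Gaussian kernels, lets the selected prototypes act as hidden neurons, computes the kernel activation matrix H on the training set, and obtains the linear output weights with the Moore-Penrose pseudo-inverse, which NumPy evaluates via SVD. The random subset used for the prototypes is only a stand-in for any of the compared prototype selection algorithms.

```python
import numpy as np

def gaussian_kernel_matrix(X, prototypes, sigma):
    """Hidden-layer activations: one Gaussian kernel centred on each prototype."""
    # Squared Euclidean distance between every sample and every prototype.
    d2 = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_output_weights(H, Y):
    """Linear output layer solved with the Moore-Penrose pseudo-inverse,
    which NumPy computes via singular value decomposition (SVD)."""
    return np.linalg.pinv(H) @ Y

# --- usage sketch on synthetic data ---
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 4))
y_train = (X_train[:, 0] > 0).astype(int)
Y = np.eye(2)[y_train]                       # one-hot class targets

# Placeholder for a prototype selection algorithm (CNN, ENN, DROP, ...):
# here a plain random subset stands in for the selected prototypes.
prototypes = X_train[rng.choice(len(X_train), size=10, replace=False)]

sigma = 1.0                                  # Gaussian kernel width
H = gaussian_kernel_matrix(X_train, prototypes, sigma)
W = fit_output_weights(H, Y)
predictions = gaussian_kernel_matrix(X_train, prototypes, sigma) @ W
print("training accuracy:", (predictions.argmax(axis=1) == y_train).mean())
```

With prototypes delivered by a selection algorithm, both their number and their placement are fixed automatically, and the Gaussian width sigma remains the only free parameter, as the abstract points out.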
Year
Pages
719–733
Physical description
Bibliography: 38 items, tables.
Contributors
author
  • Department of Informatics, Nicolaus Copernicus University, ul. Grudziądzka 5/7, 87-100 Toruń, Poland
Bibliography
  • [1] Aha, D.W., Kibler, D. and Albert, M.K. (1991). Instance-based learning algorithms, Machine Learning 6(1): 37–66.
  • [2] Angiulli, F. (2007). Fast nearest neighbor condensation for large data sets classification, IEEE Transactions on Knowledge and Data Engineering 19(11): 1450–1464.
  • [3] Barandela, R., Ferri, F. and Sanchez, J. (2005). Decision boundary preserving prototype selection for nearest neighbor classification, International Journal of Pattern Recognition and Artificial Intelligence 19(6): 787–806.
  • [4] Boser, B.E., Guyon, I.M. and Vapnik, V. (1992). A training algorithm for optimal margin classifiers, in D. Haussler (Ed.), Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, ACM Press, New York, NY, pp. 144–152.
  • [5] Brighton, H. and Mellish, C. (2002). Advances in instance selection for instance-based learning algorithms, Data Mining and Knowledge Discovery 6(2): 153–172.
  • [6] Brodley, C. (1995). Recursive automatic bias selection for classifier construction, Machine Learning 20(1/2): 63–94.
  • [7] Broomhead, D.S. and Lowe, D. (1988). Multivariable functional interpolation and adaptive networks, Complex Systems 2(3): 321–355.
  • [8] Cameron-Jones, R.M. (1995). Instance selection by encoding length heuristic with random mutation hill climbing, Proceedings of the 8th Australian Joint Conference on Artificial Intelligence, Canberra, Australia, pp. 99–106.
  • [9] Cano, J.-R., Herrera, F. and Lozano, M. (2003). Using evolutionary algorithms as instance selection for data reduction in KDD: An experimental study, IEEE Transactions on Evolutionary Computation 7(6): 561–575.
  • [10] Kasun, L.L.C., Zhou, H. and Huang, G.-B. (2013). Representational learning with ELMs for big data, IEEE Intelligent Systems 28(6): 31–34.
  • [11] Devi, V. and Murty, M. (2002). An incremental prototype set building technique, Pattern Recognition 35(2): 505–513.
  • [12] Garcia, S., Cano, J. and Herrera, F. (2008). A memetic algorithm for evolutionary prototype selection: A scaling up approach, Pattern Recognition 41(8): 2693–2709.
  • [13] Garcia, S., Derrac, J., Cano, J. and Herrera, F. (2012). Prototype selection for nearest neighbor classification: Taxonomy and empirical study, IEEE Transactions on Pattern Analysis and Machine Intelligence 34(3): 417–435.
  • [14] Gates, G. (1972). The reduced nearest neighbor rule, IEEE Transactions on Information Theory 18(3): 431–433.
  • [15] Górecki, T. and Łuczak, M. (2013). Linear discriminant analysis with a generalization of the Moore–Penrose pseudoinverse, International Journal of Applied Mathematics and Computer Science 23(2): 463–471, DOI: 10.2478/amcs-2013-0035.
  • [16] Grochowski, M. and Jankowski, N. (2004). Comparison of instances selection algorithms. I: Results and comments, in L. Rutkowski et al. (Eds.), Artificial Intelligence and Soft Computing, Lecture Notes in Computer Science, Vol. 3070, Springer-Verlag, Berlin/Heidelberg, pp. 580–585.
  • [17] Hart, P.E. (1968). The condensed nearest neighbor rule, IEEE Transactions on Information Theory 14(3): 515–516.
  • [18] Hattori, K. and Takahashi, M. (2000). A new edited k-nearest neighbor rule in the pattern classification problem, Pattern Recognition 33(3): 521–528.
  • [19] Huang, G.-B., Zhu, Q.-Y. and Siew, C.-K. (2004). Extreme learning machine: A new learning scheme of feedforward neural networks, International Joint Conference on Neural Networks, Budapest, Hungary, pp. 985–990.
  • [20] Huang, G.-B., Zhu, Q.-Y. and Siew, C.-K. (2006). Extreme learning machine: Theory and applications, Neurocomputing 70(1–3): 489–501.
  • [21] Jankowski, N. and Grochowski, M. (2004). Comparison of instances selection algorithms. II: Algorithms survey, in L. Rutkowski et al. (Eds.), Artificial Intelligence and Soft Computing, Lecture Notes in Computer Science, Vol. 3070, Springer-Verlag, Berlin/Heidelberg, pp. 598–603.
  • [22] Kuncheva, L. (1995). Editing for the k-nearest neighbors rule by a genetic algorithm, Pattern Recognition Letters 16(8): 809–814.
  • [23] Lozano, M., Sanchez, J. and Pla, F. (2003). Using the geometrical distribution of prototypes for training set condensing, Conference of the Spanish Association for Artificial Intelligence, San Sebastian, Spain, pp. 618–627.
  • [24] Marchiori, E. (2008). Hit miss networks with applications to instance selection, Journal of Machine Learning Research 9: 997–1017.
  • [25] Marchiori, E. (2010). Class conditional nearest neighbor for large margin instance selection, IEEE Transactions on Pattern Analysis and Machine Intelligence 32(2): 364–370.
  • [26] Merz, C.J. and Murphy, P.M. (1998). UCI Repository of Machine Learning Databases, https://archive.ics.uci.edu/ml/index.php.
  • [27] Riquelme, J., Aguilar-Ruiz, J. and Toro, M. (2003). Finding representative patterns with ordered projections, Pattern Recognition 36(4): 1009–1018.
  • [28] Sanchez, J., Pla, F. and Ferri, F. (1997). Prototype selection for the nearest neighbor rule through proximity graphs, Pattern Recognition Letters 18(6): 507–513.
  • [29] Schölkopf, B., Sung, K., Burges, C., Girosi, F., Niyogi, P., Poggio, T. and Vapnik, V. (1996). Comparing support vector machines with Gaussian kernels to radial basis function classifiers, Technical Report, AI Memo No. 1599, CBCL Paper No. 142, MIT, Cambridge, MA.
  • [30] Schölkopf, B., Sung, K.-K. and Burges, C. (1997). Comparing support vector machines with Gaussian kernels to radial basis function classifiers, IEEE Transactions on Signal Processing 45(11).
  • [31] Schwenker, F., Kestler, H.A. and Palm, G. (2001). Three learning phases for radial-basis-function networks, Neural Networks 14(4–5): 439–458.
  • [32] Skalak, D.B. (1994). Prototype and feature selection by sampling and random mutation hill climbing algorithms, International Conference on Machine Learning, New Brunswick, NJ, USA, pp. 293–301.
  • [33] Vapnik, V. (1995). The Nature of Statistical Learning Theory, Springer-Verlag, New York, NY.
  • [34] Wilson, D. (1972). Asymptotic properties of nearest neighbor rules using edited data, IEEE Transactions on Systems, Man, and Cybernetics 2(3): 408–421.
  • [35] Wilson, D.R. and Martinez, T.R. (2000). Reduction techniques for instance-based learning algorithms, Machine Learning 38(3): 257–286.
  • [36] Woźniak, M. and Krawczyk, B. (2012). Combined classifier based on feature space partitioning, International Journal of Applied Mathematics and Computer Science 22(4): 855–866, DOI: 10.2478/v10006-012-0063-0.
  • [37] Yousef, R. and el Hindi, K. (2005). Training radial basis function networks using reduced sets as center points, International Journal of Information Technology 2(1): 21–35.
  • [38] Zhao, K., Zhou, S., Guan, J. and Zhou, A. (2003). C-pruner: An improved instance pruning algorithm, 2nd International Conference on Machine Learning and Cybernetics, Xi’an, China, pp. 94–99.
Remarks
Record prepared under agreement 509/P-DUN/2018 from the funds of the Ministry of Science and Higher Education (MNiSW) allocated to science dissemination activities (2018).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-65ae73ab-e3c5-4a06-97bc-92d4e9fea283