Article title

Linear discriminant analysis with a generalization of the Moore–Penrose pseudoinverse

Publication languages
EN
Abstracts
EN
The Linear Discriminant Analysis (LDA) technique is an important and well-developed area of classification, and to date many linear (and also nonlinear) discrimination methods have been put forward. A complication in applying LDA to real data occurs when the number of features exceeds the number of observations. In this case, the covariance estimates do not have full rank and therefore cannot be inverted. There are a number of ways to deal with this problem. In this paper, we propose improving LDA in this area, and we present a new approach which uses a generalization of the Moore–Penrose pseudoinverse to remove this weakness. Our new approach, in addition to managing the problem of inverting the covariance matrix, significantly improves the quality of classification, even on data sets where the covariance matrix can be inverted. Experimental results on various data sets demonstrate that our improvements to LDA are efficient and that our approach outperforms LDA.
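The following is a minimal sketch (in Python with NumPy) of the idea described in the abstract: when the pooled within-class covariance estimate is rank-deficient because the number of features exceeds the number of observations, the inverse in the LDA discriminant score is replaced by a pseudoinverse. This sketch uses only the standard Moore–Penrose pseudoinverse (np.linalg.pinv), not the generalization proposed in the paper, and the function names and parameters below are hypothetical.

    import numpy as np

    def fit_pseudoinverse_lda(X, y):
        # Estimate class means, priors, and the pseudoinverse of the
        # pooled within-class covariance matrix.
        classes = np.unique(y)
        n, p = X.shape
        means = {c: X[y == c].mean(axis=0) for c in classes}
        # Pooled within-class covariance; rank-deficient whenever p > n - number of classes.
        S = sum((np.sum(y == c) - 1) * np.cov(X[y == c], rowvar=False)
                for c in classes) / (n - len(classes))
        S_pinv = np.linalg.pinv(S)  # Moore-Penrose pseudoinverse instead of the plain inverse
        priors = {c: np.mean(y == c) for c in classes}
        return means, S_pinv, priors

    def predict_lda(X, means, S_pinv, priors):
        # Assign each observation to the class with the largest linear discriminant score.
        scores = np.column_stack([
            X @ S_pinv @ m - 0.5 * m @ S_pinv @ m + np.log(priors[c])
            for c, m in means.items()
        ])
        return np.array(list(means.keys()))[np.argmax(scores, axis=1)]

    # Example usage on a "more features than observations" problem (p = 50 > n = 20):
    # X = np.random.randn(20, 50); y = np.repeat([0, 1], 10)
    # means, S_pinv, priors = fit_pseudoinverse_lda(X, y)
    # labels = predict_lda(X, means, S_pinv, priors)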
Pages
463–471
Physical description
Bibliography: 39 items, tables, charts
Authors
author
  • Faculty of Mathematics and Computer Science, Adam Mickiewicz University, Umultowska 87, 61-614 Poznań, Poland
author
  • Faculty of Civil Engineering, Environmental and Geodetic Sciences, Koszalin University of Technology, Śniadeckich 2, 75-453 Koszalin, Poland
YADDA identifier
bwmeta1.element.baztech-e9c5a9ee-8842-4068-8c3f-831e18fb56be