Article title

Fitting a Gaussian mixture model through the Gini index

Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
A Gaussian mixture model is a linear combination of Gaussian components. It is widely used in data mining and pattern recognition. In this paper, we propose a method for estimating the parameters of the density function given by a Gaussian mixture model. Our proposal is based on the Gini index, a measure of the degree of inequality between two probability distributions, and consists of minimizing the Gini index between an empirical distribution of the data and a Gaussian mixture model. We present several examples with simulated and real data and observe some properties of the proposed method.
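As a minimal sketch of the model named in the abstract, the following evaluates the density of a univariate Gaussian mixture as a weighted sum of Gaussian components. The function names and the two-component parameter values are hypothetical and chosen only for illustration. The record does not give the formula of the Gini-index objective, so the fitting step is not reproduced here.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def gmm_pdf(x, weights, means, stds):
    """Mixture density: sum over k of weights[k] * N(x; means[k], stds[k]^2)."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds))


# Hypothetical two-component mixture (illustrative values, not from the paper).
weights = [0.3, 0.7]
means = [-2.0, 1.5]
stds = [0.5, 1.0]

density_at_zero = gmm_pdf(0.0, weights, means, stds)

# Riemann-sum check on [-10, 10] that the mixture density integrates to about 1.
step = 0.01
total_mass = sum(
    gmm_pdf(-10.0 + i * step, weights, means, stds) * step for i in range(2001)
)
```

The weights are required to be non-negative and to sum to one so that the combination is itself a probability density; an estimation method such as the one proposed in the paper would search over the weights, means, and standard deviations.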
Year
Pages
487--500
Physical description
Bibliography: 22 items, tables, charts.
Authors
  • Faculty of Mathematics, University of Veracruz, Circuito Gonzalo Aguirre Beltrán S/N, Zona Universitaria, Xalapa, Veracruz, Mexico
  • Faculty of Mathematics, University of Veracruz, Circuito Gonzalo Aguirre Beltrán S/N, Zona Universitaria, Xalapa, Veracruz, Mexico
Bibliography
  • [1] Bassetti, F., Bodini, A. and Regazzini, E. (2006). On minimum Kantorovich distance estimators, Statistics and Probability Letters 76(12): 1298–1302.
  • [2] Bishop, C.M. (2006). Pattern Recognition and Machine Learning, Springer, New York.
  • [3] Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977). Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society: Series B (Methodological) 39(1): 1–22.
  • [4] Elkan, C. (1997). Boosting and naive Bayesian learning, Proceedings of the International Conference on Knowledge Discovery and Data Mining, Newport Beach, USA.
  • [5] Flach, P.A. and Lachiche, N. (2004). Naive Bayesian classification of structured data, Machine Learning 57(3): 233–269.
  • [6] Giorgi, G.M. and Gigliarano, C. (2017). The Gini concentration index: A review of the inference literature, Journal of Economic Surveys 31(4): 1130–1148.
  • [7] Greenspan, H., Ruf, A. and Goldberger, J. (2006). Constrained Gaussian mixture model framework for automatic segmentation of MR brain images, IEEE Transactions on Medical Imaging 25(9): 1233–1245.
  • [8] Kłopotek, R., Kłopotek, M. and Wierzchoń, S. (2020). A feasible k-means kernel trick under non-Euclidean feature space, International Journal of Applied Mathematics and Computer Science 30(4): 703–715, DOI: 10.34768/amcs-2020-0052.
  • [9] Kulczycki, P. (2018). Kernel estimators for data analysis, in M. Ram and J.P. Davim (Eds), Advanced Mathematical Techniques in Engineering Sciences, CRC/Taylor & Francis, Boca Raton, pp. 177–202.
  • [10] López-Lobato, A.L. and Avendaño-Garrido, M.L. (2020). Using the Gini index for a Gaussian mixture model, in L. Martínez-Villaseñor et al. (Eds), Advances in Computational Intelligence. MICAI 2020, Lecture Notes in Computer Science, Vol. 12469, Springer, Cham, pp. 403–418.
  • [11] Mao, C., Lu, L. and Hu, B. (2020). Local probabilistic model for Bayesian classification: A generalized local classification model, Applied Soft Computing 93: 106379.
  • [12] Meng, X.-L. and Rubin, D.B. (1994). On the global and componentwise rates of convergence of the EM algorithm, Linear Algebra and its Applications 199(Supp. 1): 413–425.
  • [13] Povey, D., Burget, L., Agarwal, M., Akyazi, P., Kai, F., Ghoshal, A., Glembek, O., Goel, N., Karafiát, M., Rastrow, A., Rose, R., Schwarz, P. and Thomas, S. (2011). The subspace Gaussian mixture model: A structured model for speech recognition, Computer Speech & Language 25(2): 404–439.
  • [14] Rachev, S., Klebanov, L., Stoyanov, S. and Fabozzi, F. (2013). The Methods of Distances in the Theory of Probability and Statistics, Springer, New York.
  • [15] Reynolds, D.A. (2009). Gaussian mixture models, in S.Z. Li (Ed.), Encyclopedia of Biometrics, Springer, New York, pp. 659–663.
  • [16] Rubner, Y., Tomasi, C. and Guibas, L.J. (2000). The Earth mover’s distance as a metric for image retrieval, International Journal of Computer Vision 40(2): 99–121.
  • [17] Singh, R., Pal, B.C. and Jabr, R.A. (2009). Statistical representation of distribution system loads using Gaussian mixture model, IEEE Transactions on Power Systems 25(1): 29–37.
  • [18] Torres-Carrasquillo, P.A., Reynolds, D.A. and Deller, J.R. (2002). Language identification using Gaussian mixture model tokenization, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, USA, pp. 1–757.
  • [19] Ultsch, A. and Lötsch, J. (2017). A data science based standardized Gini index as a Lorenz dominance preserving measure of the inequality of distributions, PloS One 12(8): e0181572.
  • [20] Vaida, F. (2005). Parameter convergence for EM and MM algorithms, Statistica Sinica 15(2005): 831–840.
  • [21] Villani, C. (2003). Topics in Optimal Transportation, American Mathematical Society, Providence.
  • [22] Xu, L. and Jordan, M.I. (1996). On convergence properties of the EM algorithm for Gaussian mixtures, Neural Computation 8(1): 129–151.
Notes
Record prepared with funds from the Ministry of Science and Higher Education (MNiSW), agreement No. 461252, under the programme "Social Responsibility of Science", module: Popularisation of science and promotion of sport (2021).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-aa116b7a-8158-401f-9164-d3be5927bde3