Article title

New approach to clustering random attributes

Publication languages
EN
Abstracts
EN
This paper proposes a new method for similarity analysis and, consequently, a new algorithm for clustering different types of random attributes, both numerical and nominal. However, in order for nominal attributes to be clustered, their values must be properly encoded. In the encoding process, nominal attributes obtain a new representation in numerical form. Only numeric attributes can be subjected to factor analysis, which allows them to be clustered in terms of their similarity to factors. The proposed method was tested on several sample datasets. It was found that the proposed method is universal. On the one hand, the method allows clustering of numerical attributes. On the other hand, it provides the ability to cluster nominal attributes. It also allows simultaneous clustering of numerical attributes and numerically encoded nominal attributes.
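The idea of clustering attributes by their similarity to factors can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes purely numerical attributes, uses unrotated principal-component loadings of the correlation matrix as a stand-in for factor loadings, and assigns each attribute to the factor on which its absolute loading is largest. The function name `cluster_attributes`, the parameter `n_factors`, and the synthetic data are hypothetical; nominal attributes would first have to be encoded numerically, which is not shown here.

```python
import numpy as np


def cluster_attributes(X, n_factors):
    """Group the columns of X by the factor each one loads on most strongly.

    Assumption: unrotated principal-component loadings stand in for
    factor loadings; this is an illustration, not the paper's procedure.
    """
    # Correlation matrix of the attributes (columns of X).
    R = np.corrcoef(X, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(R)
    # Indices of the n_factors largest eigenvalues, largest first.
    order = np.argsort(eigvals)[::-1][:n_factors]
    # Loadings: eigenvectors scaled by the square roots of their eigenvalues.
    loadings = eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0.0, None))
    # Cluster label of an attribute = factor with the largest absolute loading.
    labels = np.argmax(np.abs(loadings), axis=1)
    return labels, loadings


# Synthetic data (hypothetical): three attributes driven by one latent
# variable and two attributes driven by another, independent one.
rng = np.random.default_rng(0)
f1 = rng.normal(size=500)
f2 = rng.normal(size=500)
X = np.column_stack(
    [f1 + 0.3 * rng.normal(size=500) for _ in range(3)]
    + [f2 + 0.3 * rng.normal(size=500) for _ in range(2)]
)

labels, loadings = cluster_attributes(X, n_factors=2)
print(labels)
```

In this toy setting the first three attributes should fall into one cluster and the last two into another. With real data, a rotation of the loadings (for example varimax) is commonly applied before the assignment step; it is omitted here to keep the sketch short.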
Year
Volume
Pages
41-90
Physical description
Bibliography: 24 items, figures, tables, charts.
Authors
  • Warsaw School of Computer Science
Bibliography
  • [1] S. S. Stevens, “On the theory of scales of measurement,” Science, vol. 103, no. 2684, pp. 677-680, 1946. [Online]. http://expsylab.psych.uoa.gr/fileadmin/expsylab.psych.uoa.gr/uploads/papers/Stevens_1946.pdf
  • [2] H. M. Blalock, Social Statistics. McGraw-Hill, 1960. [Online]. https://gwern.net/doc/statistics/1960-blalock-socialstatistics.pdf
  • [3] Z. Gniazdowski and M. Grabowski, “Numerical Coding of Nominal Data,” Zeszyty Naukowe WWSI, vol. 9, no. 12, pp. 53-61, 2015. [Online]. https://www.doi.org/10.26348/znwwsi.12.53
  • [4] K. J. Berry, J. E. Johnston, and P. W. Mielke Jr, Measurement of Association. Springer, 2018. [Online]. https://link.springer.com/content/pdf/10.1007/978-3-319-98926-6.pdf
  • [5] T. Hastie, R. Tibshirani, and J. H. Friedman, The elements of statistical learning: data mining, inference, and prediction. Springer, 2017. [Online]. https://link.springer.com/content/pdf/10.1007/978-0-387-84858-7.pdf
  • [6] J. Han, J. Pei, and H. Tong, Data mining: concepts and techniques. Morgan Kaufmann, 2022.
  • [7] Z. Gniazdowski, “On the Analysis of Correlation Between Nominal Data and Numerical Data,” Zeszyty Naukowe WWSI, vol. 16, no. 27, pp. 57-82, 2022. [Online]. https://www.doi.org/10.26348/znwwsi.27.57
  • [8] Z. Gniazdowski, “Geometric interpretation of a correlation,” Zeszyty Naukowe WWSI, vol. 7, no. 9, pp. 27-35, 2013. [Online]. https://www.doi.org/10.26348/znwwsi.9.27
  • [9] F. Wilcoxon, Individual Comparisons by Ranking Methods. New York, NY: Springer New York, 1992, pp. 196-202. [Online]. https://doi.org/10.1007/978-1-4612-4380-9_16
  • [10] Z. Gniazdowski, “Principal Component Analysis versus Factor Analysis,” Zeszyty Naukowe WWSI, vol. 15, no. 24, pp. 35-88, 2021. [Online]. https://www.doi.org/10.26348/znwwsi.24.35
  • [11] G. P. H. Styan, “Hadamard products and multivariate statistical analysis,” Linear Algebra and its Applications, vol. 6, pp. 217-240, 1973. [Online]. https://doi.org/10.1016/0024-3795(73)90023-2
  • [12] H. F. Kaiser, “The varimax criterion for analytic rotation in factor analysis,” Psychometrika, vol. 23, no. 3, pp. 187-200, 1958. [Online]. https://doi.org/10.1007/BF02289233
  • [13] “Majority,” 2024. [Online]. https://en.wikipedia.org/wiki/Majority
  • [14] “Majority rule,” 2024. [Online]. https://en.wikipedia.org/wiki/Majority_rule
  • [15] “Plurality (voting),” 2024. [Online]. https://en.wikipedia.org/wiki/Plurality_(voting)
  • [16] J. Pogonowski, “Własności relacji.” [Online]. http://logic.amu.edu.pl/images/b/bb/Dygdwa.pdf
  • [17] J. Pogonowski, Przestrzenie Podobieństwa i Opozycji. Wydział Filozofii i Socjologii Uniwersytetu Warszawskiego, 1997, pp. 83-95. [Online]. https://logic.amu.edu.pl/images/8/85/Wolniew.pdf
  • [18] P. Cerda, G. Varoquaux, and B. Kégl, “Similarity encoding for learning with dirty categorical variables,” Machine Learning, vol. 107, no. 8, pp. 1477-1494, 2018. [Online]. https://doi.org/10.1007/s10994-018-5724-2
  • [19] J. T. Hancock and T. M. Khoshgoftaar, “Survey on categorical data for neural networks,” Journal of big data, vol. 7, no. 1, p. 28, 2020. [Online]. https://doi.org/10.1186/s40537-020-00305-w
  • [20] “One hot encoding.” [Online]. https://deepai.org/machine-learning-glossary-and-terms/one-hot-encoding
  • [21] D. Bhat, “Simple weather forecast,” 2022. [Online]. https://www.kaggle.com/datasets/dheemanthbhat/simple-weather-forecast
  • [22] “Mushroom,” UCI Machine Learning Repository, 1981. [Online]. https://doi.org/10.24432/C5959T
  • [23] J. Schlimmer, “Automobile,” UCI Machine Learning Repository, 1985. [Online]. https://doi.org/10.24432/C5B01C
  • [24] M. Zwitter and M. Soklic, “Breast Cancer,” UCI Machine Learning Repository, 1988. [Online]. https://doi.org/10.24432/C51P4M
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-b88e584a-0fd6-4b95-8692-73ac73deafbb