Measuring similarity of complex and heterogeneous data in clustering of large data sets

Bacelar-Nicolau, H.; Nicolau, F. F.; Sousa, A.; Bacelar-Nicolau, L.

Abstract

Cluster analysis, or classification, usually refers to a set of exploratory multivariate data analysis methods and techniques for finding a clustering structure in a dataset. The clusters may be groups of statistical data units or groups of variables. In this work we deal with a generalization of this paradigm: the clustering of complex data described by three different types of variables, which frequently occur in a three-way context. We obtain compatible versions of the same affinity coefficient for measuring similarity between statistical data units described by those three types of variables. A global generalized similarity coefficient is analyzed for this kind of mixed data, which often arises in data mining or knowledge mining.
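
The abstract does not state the formula for the affinity coefficient, so the following is only a minimal sketch under stated assumptions, not the authors' definition. It assumes the classical Matusita-type affinity between two non-negative profiles, that is, the sum of the square roots of the products of their normalized entries, and it assumes that a global coefficient can be formed as a weighted mean of the per-variable affinities. The function names `affinity` and `global_affinity` and the `weights` parameter are hypothetical.

```python
import math


def affinity(u, v):
    """Matusita-type affinity between two non-negative profiles.

    Each profile is normalized to sum to 1, then the square roots of the
    element-wise products are summed. The result lies in [0, 1].
    """
    if len(u) != len(v) or not u:
        raise ValueError("profiles must be non-empty and of equal length")
    if min(u) < 0 or min(v) < 0:
        raise ValueError("profiles must be non-negative")
    total_u, total_v = sum(u), sum(v)
    if total_u <= 0 or total_v <= 0:
        raise ValueError("each profile needs a positive total")
    return sum(math.sqrt((a / total_u) * (b / total_v)) for a, b in zip(u, v))


def global_affinity(unit_a, unit_b, weights=None):
    """Weighted mean of per-variable affinities (an assumed combination rule).

    unit_a and unit_b are lists of profiles, one profile per variable.
    weights must sum to 1; equal weights are used when none are given.
    """
    if len(unit_a) != len(unit_b) or not unit_a:
        raise ValueError("units must describe the same non-empty set of variables")
    n = len(unit_a)
    if weights is None:
        weights = [1.0 / n] * n
    if len(weights) != n or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must match the variables and sum to 1")
    return sum(w * affinity(p, q) for w, p, q in zip(weights, unit_a, unit_b))
```

Under these assumptions, two profiles with the same shape have affinity 1 regardless of scale, and profiles with disjoint support have affinity 0, which is the behavior expected of a similarity coefficient on frequency-type data.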

Keywords

cluster analysis; different type variables; similarity coefficient; three-way data

laboratory tests; cluster analysis; similarity coefficient

Publisher

Nałęcz Institute of Biocybernetics and Biomedical Engineering of the Polish Academy of Sciences
Elsevier

Journal

Biocybernetics and Biomedical Engineering

Year

2009

Volume

Vol. 29, no. 2

Pages

9-18

Physical description

Bibliography: 13 items, figures, tables.

Authors

author

Bacelar-Nicolau H.

author

Nicolau F. F.

author

Sousa A.

author

Bacelar-Nicolau L.

Universidade de Lisboa, FPCE, Lisboa, Portugal

Bibliography

1. Bock H. H., Diday E. [Eds.]: Analysis of Symbolic Data: Exploratory Methods for Extracting Statistical Information from Complex Data, Springer, 2000.
2. Bacelar-Nicolau H.: On the generalized affinity coefficient for complex data. Biocybernetics and Biomedical Engineering 2002, 22, 1, 31-42.
3. Nicolau F. C., Bacelar-Nicolau H. et al.: Probabilistic models in three way cluster analysis. In: Proceedings of the 56th Session of the International Statistical Institute, Lisbon 2007 (in press), published on the CD Proceedings of ISI 2007.
4. Matusita K.: On the theory of statistical decision functions. Ann. Instit. Stat. Math. 1951, III, 1-30.
5. Bacelar-Nicolau H.: Two probabilistic models for classification of variables in frequency tables. In: H. H. Bock [Ed.], Classification and Related Methods of Data Analysis, Elsevier Sciences Publishers B.V., North Holland, 1988, 181-186.
6. Nicolau F. C., Bacelar-Nicolau H.: Some trends in the classification of variables. In: Hayashi et al. [Eds.], Data Science, Classification and Related Methods, Springer, 1998, 89-98.
7. Sousa A.: Contribuições à Metodologia VL e índices de validação para Dados de Natureza Complexa. PhD Thesis, Univ. Azores 2005.
8. Ichino M.: General metrics for mixed features - The Cartesian Space Theory for Pattern Recognition. IEEE Transactions on Systems, Man and Cybernetics 1988.
9. Ichino M., Yaguchi H.: Generalized Minkowski Metrics for Mixed Feature Type Data Analysis. IEEE Transactions on Systems, Man and Cybernetics, 1994, 24, 4, 698-708.
10. Bacelar-Nicolau H.: On the distribution equivalence in cluster analysis. Proceedings of the NATO ASI on Pattern Recognition Theory and Applications, Springer-Verlag, New York 1987, 73-79.
11. Lerman I. C.: Étude distributionnelle de statistiques de proximité entre structures algébriques finies du même type - Application à la Classification Automatique. Cahiers du B.U.R.O., Paris 1972, 19.
12. Lerman I. C.: Classification et Analyse Ordinale des Données. Dunod, Paris 1981.
13. Bacelar-Nicolau L.: Caracterização dos Sistemas de Informação das Organizações com base no modelo de Nolan. Aplicação de modelos de classificação hierárquica aos organismos da Administração Pública. Master Thesis, Univ. Nova de Lisboa 2002.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BPZ3-0030-0018