Clustering in fuzzy subspaces

Simiński, K.

Content

Full texts:

httptai_czasopisma_pan_plimagesdatataiwydaniano4201204clusteringinfuzzysubspaces.pdf

Title variants

Grupowanie danych w rozmytych podprzestrzeniach

Abstracts

Some data sets contain clusters not in all dimensions, but only in subspaces. Known algorithms select attributes and identify clusters in subspaces. The paper presents a novel algorithm for subspace fuzzy clustering. Each data example has a fuzzy membership to a cluster. Each cluster is defined in a certain subspace, but the membership of the cluster's descriptors to that subspace (called the descriptor weight) is fuzzy (from the interval [0, 1]): the descriptors of a cluster can have partial membership to the subspace in which the cluster is defined. Thus the clusters are fuzzily defined in their subspaces. Each cluster is defined by its centre, its fuzziness and the weights of its descriptors. The clustering algorithm is based on minimizing a criterion function. The paper is accompanied by experimental clustering results. The approach can be used to partition the input domain when extracting a rule base for neuro-fuzzy systems.

Some data sets contain groups of data not in all dimensions, but in certain subspaces of the domain. The paper presents an algorithm for clustering data in fuzzy subspaces. Each data example has a fuzzy membership to a group (cluster). Each cluster, in turn, is spanned in a certain subspace of the input domain. Clusters may be spanned in different subspaces. The clustering algorithm is based on the minimization of a criterion function. It yields the locations of the clusters, their fuzziness and the weights of their descriptors. Results of clustering experiments on synthetic and real-life data are also presented.
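The abstracts describe the ingredients of the method (fuzzy memberships, per-cluster descriptor weights, minimization of a criterion function) but not the criterion itself. The sketch below is therefore not the paper's algorithm. It is a generic attribute-weighted fuzzy c-means, shown only to illustrate how memberships, centres and descriptor weights can be updated in alternation. The criterion J, the constraint that each cluster's weights sum to 1, the exponents m and f, the function name fuzzy_subspace_cmeans and the toy data are all assumptions made for this illustration.

```python
def fuzzy_subspace_cmeans(data, init_centres, m=2.0, f=2.0, n_iter=50, eps=1e-9):
    """Illustrative attribute-weighted fuzzy c-means (NOT the paper's criterion).

    Assumed criterion:
        J = sum_c sum_i u[c][i]**m * sum_d w[c][d]**f * (x[i][d] - v[c][d])**2
    with sum_c u[c][i] = 1 for every example i
    and  sum_d w[c][d] = 1 for every cluster c.
    """
    n_clusters, n, dims = len(init_centres), len(data), len(data[0])
    centres = [list(v) for v in init_centres]
    weights = [[1.0 / dims] * dims for _ in range(n_clusters)]
    memberships = [[0.0] * n for _ in range(n_clusters)]

    for _ in range(n_iter):
        # Step 1: memberships from weighted squared distances to each centre.
        for i, x in enumerate(data):
            dist = [
                sum(weights[c][d] ** f * (x[d] - centres[c][d]) ** 2
                    for d in range(dims)) + eps
                for c in range(n_clusters)
            ]
            inv = [dc ** (-1.0 / (m - 1.0)) for dc in dist]
            total = sum(inv)
            for c in range(n_clusters):
                memberships[c][i] = inv[c] / total

        for c in range(n_clusters):
            um = [u ** m for u in memberships[c]]
            norm = sum(um)
            # Step 2: centres as membership-weighted means.
            for d in range(dims):
                centres[c][d] = sum(um[i] * data[i][d] for i in range(n)) / norm
            # Step 3: descriptor weights; a large spread gives a small weight.
            spread = [
                sum(um[i] * (data[i][d] - centres[c][d]) ** 2
                    for i in range(n)) + eps
                for d in range(dims)
            ]
            inv = [s ** (-1.0 / (f - 1.0)) for s in spread]
            total = sum(inv)
            weights[c] = [v / total for v in inv]

    return centres, weights, memberships


# Toy data (hypothetical): two groups that are compact in descriptors 0 and 1,
# while descriptor 2 is spread evenly and carries no cluster information.
cluster_a = [(0.1 * (k % 3 - 1), 0.1 * ((k // 3) % 3 - 1), 0.5 * k)
             for k in range(20)]
cluster_b = [(10 + x, 10 + y, z) for x, y, z in cluster_a]
data = cluster_a + cluster_b

# One example from each group serves as the initial centre.
centres, weights, memberships = fuzzy_subspace_cmeans(data, [data[0], data[-1]])
for c, w in enumerate(weights):
    print("cluster", c, "descriptor weights", [round(v, 3) for v in w])
```

In this sketch a descriptor along which a cluster is widely spread receives a low weight, which is one way to read the abstract's "partial membership of a descriptor to the subspace". Initial centres are passed in explicitly because alternating minimization of this kind only finds a local minimum and is sensitive to the starting point.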

Keywords

subspace clustering; weighted attributes; fuzzy clustering

Publisher

Instytut Informatyki Teoretycznej i Stosowanej Polskiej Akademii Nauk

Journal

Theoretical and Applied Informatics

Year

2012

Volume

Vol. 24, No. 4

Pages

313–326

Physical description

Bibliography: 15 items, figures.

Contributors

author

Simiński K.

Institute of Informatics, Silesian University of Technology, ul. Akademicka 16, 44-100 Gliwice, Poland, Krzysztof.Siminski@polsl.pl

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BUJ8-0026-0009