Linguistically defined clustering of data

Leski, J. M.; Kotas, M. P.

doi:10.2478/amcs-2018-0042

Article details

Content

Full texts:

11_leski_kotas_linguistically_defined_clustering_2018_3.pdf

Identifiers

DOI

10.2478/amcs-2018-0042

Abstract

This paper introduces a method of data clustering based on linguistically specified rules, similar to those a human applies when performing the task visually. The method endeavors to follow these remarkable capabilities of intelligent beings. Even for the most complicated data patterns, a human can accomplish the clustering process using relatively simple rules. The human way of clustering is a sequential search for new structures in the data and new prototypes, guided by the following linguistic rule: search for prototypes in regions of extremely high data density that lie immensely far from the previously found ones. Once this search has been completed, each datum has to be assigned to one of the clusters whose nuclei (prototypes) have been found. Here a human again uses a simple linguistic rule: data from regions with similar densities, which are located exceedingly close to each other, should belong to the same cluster. The goal of this work is to show experimentally that such simple linguistic rules can yield a clustering method that is competitive with the most effective methods known from the literature on the subject. A linguistic formulation of a validity index for determining the number of clusters is also presented. Finally, an extensive experimental analysis on benchmark datasets is performed to demonstrate the validity of the clustering approach introduced and its competitiveness with state-of-the-art solutions.
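
The two rules quoted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the paper formulates the rules with possibility theory, whereas this sketch swaps in a Gaussian kernel density estimate, a greedy "density times distance to the nearest existing prototype" score, and nearest-neighbour label propagation. The similar-density part of the second rule is omitted for brevity. The function names, the `bandwidth` parameter, and the scoring formula are all assumptions made for illustration.

```python
import numpy as np

def find_prototypes(X, n_clusters, bandwidth=1.0):
    """Greedy prototype search (illustrative stand-in for rule 1):
    prefer points of high density that lie far from earlier prototypes."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))            # pairwise distances
    density = np.exp(-(dist / bandwidth) ** 2).sum(axis=1)
    prototypes = [int(np.argmax(density))]              # densest point first
    while len(prototypes) < n_clusters:
        far = dist[:, prototypes].min(axis=1)           # distance to nearest prototype
        prototypes.append(int(np.argmax(density * far)))
    return prototypes, dist

def assign_by_proximity(dist, prototypes):
    """Label propagation (illustrative stand-in for rule 2): the unlabeled
    point closest to any labeled point inherits that point's label."""
    n = dist.shape[0]
    labels = np.full(n, -1)
    labels[prototypes] = np.arange(len(prototypes))
    # best[i]: distance from i to its nearest labeled point; src[i]: that point
    best = dist[:, prototypes].min(axis=1)
    src = np.asarray(prototypes)[dist[:, prototypes].argmin(axis=1)]
    unlabeled = labels == -1
    while unlabeled.any():
        cand = np.where(unlabeled)[0]
        i = cand[np.argmin(best[cand])]
        labels[i] = labels[src[i]]
        unlabeled[i] = False
        closer = unlabeled & (dist[:, i] < best)        # points now nearer to i
        best[closer] = dist[closer, i]
        src[closer] = i
    return labels

# Usage on two synthetic, well-separated blobs (hypothetical data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (30, 2)), rng.normal(5.0, 0.3, (30, 2))])
prototypes, dist = find_prototypes(X, n_clusters=2)
labels = assign_by_proximity(dist, prototypes)
```

Propagating labels from neighbour to neighbour, rather than assigning each point to its nearest prototype, lets a cluster follow elongated or curved shapes, which is the kind of pattern the abstract says a human handles with simple rules. The number of clusters is passed in here; the paper instead proposes a validity index for choosing it.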

Keywords

data clustering, possibility theory, linguistic rules, data analysis

Publisher

Oficyna Wydawnicza Uniwersytetu Zielonogórskiego

Journal

International Journal of Applied Mathematics and Computer Science

Year

2018

Volume

Vol. 28, no. 3

Pages

545–557

Physical description

Bibliography: 28 items, tables, charts.

Authors

author

Leski J. M.

jacek.leski@polsl.pl

Institute of Medical Technology & Equipment ITAM, Roosevelta 118, 41-800 Zabrze, Poland

author

Kotas M. P.

Institute of Electronics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland

References

[1] Chang, H. and Yeung, D. (2008). Robust path-based spectral clustering, Pattern Recognition 41(1): 191–203.
[2] Comaniciu, D. and Meer, P. (2002). Mean shift: A robust approach toward feature space analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence 24(5): 603–619.
[3] Duda, R., Hart, P. and Stork, D. (2001). Pattern Classification, Wiley, New York, NY.
[4] Ester, M., Kriegel, H.-P., Sander, J. and Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise, Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, KDDM 1996, Portland, OR, USA, pp. 226–231.
[5] Everitt, B., Landau, S., Leese, M. and Stahl, D. (2011). Cluster Analysis, 5th Edn., Wiley, London.
[6] Fu, L. and Medico, E. (2007). Flame, a novel fuzzy clustering method for the analysis of DNA microarray data, BMC Bioinformatics 8(3): 1–15.
[7] Fukunaga, K. and Hostetler, L. (1975). The estimation of the gradient of a density function, with application to pattern recognition, IEEE Transactions on Information Theory 21(1): 32–40.
[8] Gionis, A., Mannila, H. and Tsaparas, P. (2007). Clustering aggregation, ACM Transactions on Knowledge Discovery From Data 1(1): 1–30.
[9] Jain, A. and Law, M. (2005). Data clustering: A user’s dilemma, in S.K. Pal et al. (Eds.), Pattern Recognition and Machine Intelligence, Lecture Notes in Computer Science, Vol. 3776, Springer, New York, NY, pp. 1–10.
[10] Jain, A., Murty, M. and Flynn, P. (1999). Data clustering: A review, ACM Computing Surveys 31(3): 264–323.
[11] Király, A., Vathy-Fogarassy, A. and Abonyi, J. (2016). Geodesic distance based fuzzy c-medoid clustering searching for central points in graphs and high dimensional data, Fuzzy Sets and Systems 286(1): 157–172.
[12] Leski, J. (2015). Fuzzy (c + p)-means clustering and its application to a fuzzy rule-based classifier: Towards good generalization and good interpretability, IEEE Transactions on Fuzzy Systems 23(4): 802–812.
[13] Leski, J. (2016). Fuzzy c-ordered-means clustering, Fuzzy Sets and Systems 286(1): 114–133.
[14] Leski, J. and Kotas, M. (2015). On robust fuzzy c-regression models, Fuzzy Sets and Systems 279(1): 112–129.
[15] Nguyen, S. and Choi, S.-B. (2015). Design of a new adaptive neuro-fuzzy inference system based on a solution for clustering in a data potential field, Fuzzy Sets and Systems 279(1): 64–86.
[16] Pancerz, K., Lewicki, A. and Tadeusiewicz, R. (2015). Ant-based extraction of rules in simple decision systems over ontological graphs, International Journal of Applied Mathematics and Computer Science 25(2): 377–387, DOI: 10.1515/amcs-2015-0029.
[17] Pedrycz, W., Al-Hmouz, R., Balamash, A. and Morfeq, A. (2015). Hierarchical granular clustering: An emergence of information granules of higher type and higher order, IEEE Transactions on Fuzzy Systems 23(6): 2270–2283.
[18] Ripley, B. (1996). Pattern Recognition and Neural Networks, Cambridge University Press, Cambridge.
[19] Rodriguez, A. and Laio, A. (2014). Clustering by fast search and find of density peaks, Science 344(6191): 1492–1496.
[20] Tou, J. and Gonzalez, R. (1974). Pattern Recognition Principles, Addison-Wesley, London.
[21] Veenman, C., Reinders, M. and Backer, E. (2002). A maximum variance cluster algorithm, IEEE Transactions on Pattern Analysis and Machine Intelligence 24(9): 1273–1280.
[22] Webb, A. (1999). Statistical Pattern Recognition, Arnold, London.
[23] Yager, R. and Filev, D. (1999). Induced ordered weighted averaging operators, IEEE Transactions on Systems, Man and Cybernetics: Cybernetics 29(2): 141–150.
[24] Zadeh, L. (1965). Fuzzy sets, Information and Control 8(1): 338–353.
[25] Zadeh, L. (1978). Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets and Systems 1(1): 2–28.
[26] Zahn, C. (1971). Graph-theoretical methods for detecting and describing gestalt clusters, IEEE Transactions on Computers 1(1): 68–86.
[27] Zaki, M. and Meira, W. (2014). Data Mining and Analysis: Fundamental Concepts and Algorithms, Cambridge University Press, New York, NY.
[28] Zok, T., Antczak, M., Riedel, M., Nebel, D., Villmann, T., Lukasiak, P., Blazewicz, J. and Szachniuk, M. (2015). Building the library of RNA 3D nucleotide conformations using the clustering approach, International Journal of Applied Mathematics and Computer Science 25(3): 689–700, DOI: 10.1515/amcs-2015-0050.

Notes

Record prepared under agreement 509/P-DUN/2018, financed by the Ministry of Science and Higher Education (MNiSW) from funds allocated to science dissemination activities (2018).

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-cb42811f-51cf-42be-aa9f-88f3227986de