Knowledge-based clustering as a conceptual and algorithmic environment of biomedical data analysis

Pedrycz, W.; Gacek, A.

Abstract

While the abundance of biomedical data available nowadays is a genuine blessing, it also poses many challenges. The two fundamental and commonly encountered directions in data analysis are supervised and unsupervised learning. Our conjecture is that in biomedical data processing and understanding, where we encounter a genuine diversity of patterns, problem descriptions and design objectives, this dichotomy is neither ideal nor the most productive. In particular, the limitations of such a taxonomy become profoundly evident in the context of unsupervised learning. Clustering (usually regarded as a synonym of unsupervised data analysis) aims to determine a structure in a data set by optimizing a given partition criterion. In this sense, a structure emerges (is formed) without direct intervention of the user. While the underlying concept looks appealing, there are numerous sources of domain knowledge that could be effectively incorporated into clustering mechanisms and subsequently help navigate large data spaces. In unsupervised learning, this unified treatment of data and domain knowledge leads to the general concept of what could be termed knowledge-based clustering. In this study, we discuss the underlying principles of this paradigm and present its various methodological and algorithmic facets. In particular, we elaborate on the main issues of incorporating domain knowledge into the clustering environment, such as (a) partial labelling, (b) referential labelling (including proximity and entropy constraints), (c) the use of conditional (navigational) variables, and (d) the exploitation of external structure. Also presented are concepts of stepwise clustering, in which the structure of the data is revealed via a series of refinements of existing granular domain information.
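
The abstract describes clustering as determining structure by optimizing a partition criterion. As a rough illustration of that unsupervised baseline only, the sketch below implements standard fuzzy c-means in plain Python. It is not the authors' knowledge-based method and implements none of the guidance mechanisms (a)-(d). The function name `fuzzy_c_means`, its parameters, and the toy data are assumptions made for this example.

```python
import random


def fuzzy_c_means(data, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means sketch; returns (prototypes, memberships).

    memberships[k][i] is the degree to which point k belongs to cluster i.
    """
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])

    # Random initial partition matrix; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([value / total for value in row])

    prototypes = []
    for _ in range(iters):
        # Prototype update: means of the data weighted by u^m.
        prototypes = []
        for i in range(c):
            weights = [u[k][i] ** m for k in range(n)]
            wsum = sum(weights)
            prototypes.append([
                sum(w * x[j] for w, x in zip(weights, data)) / wsum
                for j in range(dim)
            ])

        # Membership update from squared Euclidean distances.
        shift = 0.0
        for k, x in enumerate(data):
            d2 = [sum((a - b) ** 2 for a, b in zip(x, v)) for v in prototypes]
            zeros = [i for i, d in enumerate(d2) if d == 0.0]
            if zeros:
                # Point coincides with one or more prototypes: share membership.
                new_row = [1.0 / len(zeros) if i in zeros else 0.0
                           for i in range(c)]
            else:
                inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
                total = sum(inv)
                new_row = [value / total for value in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[k])))
            u[k] = new_row

        # Stop once the partition matrix has stabilized.
        if shift < tol:
            break
    return prototypes, u


# Toy data (hypothetical): two well-separated groups of three points each.
data = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
        [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
prototypes, memberships = fuzzy_c_means(data, c=2)
```

In this setting, partial labelling would add a term to the objective that penalizes memberships deviating from known labels; that extension is left out here because the record does not give its exact form.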

Keywords

knowledge and data; fuzzy clustering; guidance mechanisms; proximity; inclusion; partial supervision; uncertainty; entropy

Publisher

University of Silesia, Institute of Informatics, Computer Systems Department

Journal

Journal of Medical Informatics & Technologies

Year

2004

Volume

Vol. 7

Pages

KB13-22

Physical description

Bibliography: 16 items, figures, tables.

Contributors

author

Pedrycz W.

Institute of Medical Technology and Equipment (ITAM), 118 Roosevelt St., Zabrze 41-800, Poland

author

Gacek A.

Institute of Medical Technology and Equipment (ITAM), 118 Roosevelt St., Zabrze 41-800, Poland

Bibliography

1. M.R. ANDERBERG, Cluster Analysis for Applications, Academic Press, New York, NY, 1973.
2. J.C. BEZDEK, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, New York, NY, 1981.
3. A. BARGIELA, W. PEDRYCZ, Granular Computing: An Introduction, Kluwer Academic Publishers, Dordrecht, 2002.
4. D. BOLEY et al., Partitioning-based clustering for Web document categorization, Decision Support Systems, 27, 1999, 329-341.
5. D. GUILLAUME, F. MURTAGH, Clustering of XML documents, Computer Physics Communications, 127, 2000, 215-227.
6. R.J. HATHAWAY, J.W. DAVENPORT, J.C. BEZDEK, Relational dual of the C-means clustering algorithms, Pattern Recognition, 22, no. 2, 1989, 205-212.
7. R.J. HATHAWAY, J.C. BEZDEK, NERF-c means: non-Euclidean relational fuzzy clustering, Pattern Recognition, 27, 1994, 429-437.
8. R.J. HATHAWAY, J.C. BEZDEK, J.W. DAVENPORT, On relational data versions of c-means algorithms, Pattern Recognition Letters, 17, 1996, 607-612.
9. R.J. HATHAWAY, J.C. BEZDEK, Y. HU, Generalized Fuzzy C-Means clustering strategies using Lp norm distances, IEEE Trans. on Fuzzy Systems, 8, no. 5, 2000, 576-582.
10. F. HOPPNER, F. KLAWONN, R. KRUSE, T. RUNKLER, Fuzzy Cluster Analysis – Methods for Image Recognition, J. Wiley, New York, 1999.
11. A.K. JAIN, R.C. DUBES, Algorithms for Clustering Data, Prentice-Hall, Englewood Cliffs, NJ, 1988.
12. G.J. KLIR, T.A. FOLGER, Fuzzy Sets, Uncertainty, and Information, Prentice-Hall, Englewood Cliffs, NJ, 1988.
13. W. PEDRYCZ, J. WALETZKY, Fuzzy clustering with partial supervision, IEEE Trans. on Systems, Man, and Cybernetics, 5, 1997, 787-795.
14. W. PEDRYCZ, J. WALETZKY, Fuzzy clustering in software reusability, Software: Practice & Experience, 27, 1997, 245-270.
15. W. PEDRYCZ, V. LOIA, S. SENATORE, P-FCM: A proximity-based clustering, Fuzzy Sets & Systems, to appear.
16. T.A. RUNKLER, J.C. BEZDEK, Alternating cluster estimation: a new tool for clustering and function approximation, IEEE Trans. on Fuzzy Systems, 7, no. 4, 1999, 377-393.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-PWA4-0013-0004