Article title

Robust Rough-Fuzzy C-Means Algorithm: Design and Applications in Coding and Non-coding RNA Expression Data Clustering

Publication languages
EN
Abstracts
EN
Cluster analysis is a technique that divides a given data set into a set of clusters in such a way that objects from the same cluster are as similar as possible and objects from different clusters are as dissimilar as possible. Against this background, different rough-fuzzy clustering algorithms have been shown to be successful in finding overlapping and vaguely defined clusters. However, the crisp lower approximation of a cluster in existing rough-fuzzy clustering algorithms is usually assumed to be spherical in shape, which prevents these algorithms from finding clusters of arbitrary shape. In this regard, this paper presents a new rough-fuzzy clustering algorithm, termed robust rough-fuzzy c-means. Each cluster in the proposed clustering algorithm is represented by a set of three parameters, namely, a cluster prototype, a possibilistic fuzzy lower approximation, and a probabilistic fuzzy boundary. The possibilistic lower approximation helps in discovering clusters of various shapes. The cluster prototype depends on the weighted average of the possibilistic lower approximation and the probabilistic boundary. The proposed algorithm is robust in the sense that it can find overlapping and vaguely defined clusters with arbitrary shapes in noisy environments. An efficient method, based on Pearson's correlation coefficient, is presented to select the initial prototypes of different clusters. A method based on a cluster validity index is also introduced to identify optimum values of the parameters of the initialization method and of the proposed clustering algorithm. The effectiveness of the proposed algorithm, along with a comparison with other clustering algorithms, is demonstrated on synthetic as well as coding and non-coding RNA expression data sets using several cluster validity indices.
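The abstract states only that the cluster prototype depends on a weighted average of the possibilistic lower approximation and the probabilistic boundary; it does not give the update formula. The following is a minimal sketch of one way such an update could look, not the paper's actual method. The function name `update_prototype`, the weight `w`, the fuzzifiers `m1` and `m2`, and the membership lists `nu` (possibilistic) and `mu` (probabilistic) are all hypothetical names chosen for illustration.

```python
def update_prototype(points, lower_idx, boundary_idx, nu, mu,
                     w=0.7, m1=2.0, m2=2.0):
    """Illustrative prototype update (assumed form, not taken from the paper).

    points       : list of feature vectors (lists of floats)
    lower_idx    : indices of objects in the cluster's lower approximation
    boundary_idx : indices of objects in the cluster's boundary region
    nu, mu       : possibilistic / probabilistic memberships, one per object
    w            : assumed relative weight of the lower approximation
    m1, m2       : assumed fuzzifiers for boundary / lower approximation
    """
    dim = len(points[0])

    def weighted_mean(indices, weights, exponent):
        # Membership-weighted centroid of the selected objects.
        total = 0.0
        acc = [0.0] * dim
        for j in indices:
            wj = weights[j] ** exponent
            total += wj
            for d in range(dim):
                acc[d] += wj * points[j][d]
        if total == 0.0:
            return None  # empty region or all-zero memberships
        return [a / total for a in acc]

    core = weighted_mean(lower_idx, nu, m2)       # lower-approximation part
    fringe = weighted_mean(boundary_idx, mu, m1)  # boundary part

    if core is None and fringe is None:
        raise ValueError("cluster has no contributing objects")
    if fringe is None:
        return core
    if core is None:
        return fringe
    # Weighted average of the two contributions.
    return [w * c + (1.0 - w) * f for c, f in zip(core, fringe)]
```

When one of the two regions is empty, the sketch falls back to the centroid of the other region, which is a design choice made here for robustness rather than something stated in the abstract.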
Pages
153--174
Physical description
Bibliography: 35 items, charts.
Contributors
author
  • Machine Intelligence Unit, Indian Statistical Institute, 203 B. T. Road, Kolkata, 700 108, India
author
  • Machine Intelligence Unit, Indian Statistical Institute, 203 B. T. Road, Kolkata, 700 108, India
Bibliography
  • [1] Ambros, V.: Control of Developmental Timing in Caenorhabditis Elegans, Current Opinion in Genetics and Development, 10(4), 2000, 428-433.
  • [2] Asharaf, S., Shevade, S. K., Murty, M. N.: Rough Support Vector Clustering, Pattern Recognition, 38, 2005, 1779-1783.
  • [3] Barni, M., Cappellini, V., Mecocci, A.: Comments on A Possibilistic Approach to Clustering, IEEE Transactions on Fuzzy Systems, 4(3), 1996, 393-396.
  • [4] Bellman, R. E., Kalaba, R. E., Zadeh, L. A.: Abstraction and Pattern Classification, Journal of Mathematical Analysis and Applications, 13, 1966, 1-7.
  • [5] Bezdek, J. C.: Pattern Recognition with Fuzzy Objective Function Algorithms, New York: Plenum, 1981.
  • [6] Bezdek, J. C., Pal, N. R.: Some New Indexes of Cluster Validity, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 28, 1998, 301-315.
  • [7] De, S. K.: A Rough Set Theoretic Approach to Clustering, Fundamenta Informaticae, 62(3-4), 2004, 409-417.
  • [8] Domany, E.: Cluster Analysis of Gene Expression Data, Journal of Statistical Physics, 110, 2003, 1117-1139.
  • [9] Dunn, J. C.: A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact, Well-Separated Clusters, Journal of Cybernetics, 3, 1974, 32-57.
  • [10] Enerly, E., Steinfeld, I., Kleivi, K., Leivonen, S. K., Aure, M. R., Russnes, H. G., Ronneberg, J. A., Johnsen, H., Navon, R., Rodland, E., Makela, R., Naume, B., Perala, M., Kallioniemi, O., Kristensen, V. N., Yakhini, Z., Dale, A. L. B.: miRNA-mRNA Integrated Analysis Reveals Roles for miRNAs in Primary Breast Tumors, PLoS One, 6(2), 2011.
  • [11] Grad, Y., Aach, J., Hayes, G. D., Reinhart, B. J., Church, G. M., Ruvkun, G., Kim, J.: Computational and Experimental Identification of C. elegans microRNAs, Molecular Cell, 11(5), 2003, 1253-1263.
  • [12] Hirano, S., Tsumoto, S.: An Indiscernibility-Based Clustering Method with Iterative Refinement of Equivalence Relations: Rough Clustering, Journal of Advanced Computational Intelligence and Intelligent Informatics, 7(2), 2003, 169-177.
  • [13] Jain, A. K., Dubes, R. C.: Algorithms for Clustering Data, Englewood Cliffs, N.J.: Prentice Hall, 1988.
  • [14] Jain, A. K., Murty, M. N., Flynn, P. J.: Data Clustering: A Review, ACM Computing Surveys, 31(3), 1999, 264-323.
  • [15] John, B., Enright, A. J., Aravin, A., Tuschl, T., Sander, C., Marks, D. S.: Human MicroRNA Targets, PLoS Biology, 2(11), 2004.
  • [16] Krishnapuram, R., Keller, J. M.: A Possibilistic Approach to Clustering, IEEE Transactions on Fuzzy Systems, 1(2), 1993, 98-110.
  • [17] Krishnapuram, R., Keller, J. M.: The Possibilistic C-Means Algorithm: Insights and Recommendations, IEEE Transactions on Fuzzy Systems, 4(3), 1996, 385-393.
  • [18] Lingras, P., West, C.: Interval Set Clustering of Web Users with Rough K-Means, Journal of Intelligent Information Systems, 23(1), 2004, 5-16.
  • [19] Liu, C. G., Calin, G. A., Volinia, S., Croce, C. M.: MicroRNA Expression Profiling Using Microarrays, Nature Protocols, 3(4), 2008, 563-578.
  • [20] Lu, J., Getz, G., Miska, E. A., Saavedra, E. A., Lamb, J., Peck, D., Cordero, A. S., Ebert, B. L., Mak, R. H., Ferrando, A. A., Downing, J. R., Jacks, T., Horvitz, H. R., Golub, T. R.: MicroRNA Expression Profiles Classify Human Cancers, Nature Letters, 435(9), 2005, 834-838.
  • [21] Maji, P., Pal, S. K.: RFCM: A Hybrid Clustering Algorithm Using Rough and Fuzzy Sets, Fundamenta Informaticae, 80(4), 2007, 475-496.
  • [22] Maji, P., Pal, S. K.: Rough Set Based Generalized Fuzzy C-Means Algorithm and Quantitative Indices, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 37(6), 2007, 1529-1540.
  • [23] Masulli, F., Rovetta, S.: Soft Transition from Probabilistic to Possibilistic Fuzzy Clustering, IEEE Transactions on Fuzzy Systems, 14(4), 2006, 516-527.
  • [24] MacQueen, J.: Some Methods for Classification and Analysis of Multivariate Observations, Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, 1967, 281-297.
  • [25] Mitra, S., Banka, H., Pedrycz, W.: Rough-Fuzzy Collaborative Clustering, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 36, 2006, 795-805.
  • [26] Pal, N. R., Pal, K., Keller, J. M., Bezdek, J. C.: A Possibilistic Fuzzy C-Means Clustering Algorithm, IEEE Transactions on Fuzzy Systems, 13(4), 2005, 517-530.
  • [27] Pal, S. K., Ghosh, A., Sankar, B. U.: Segmentation of Remotely Sensed Images with Fuzzy Thresholding, and Quantitative Evaluation, International Journal of Remote Sensing, 21(11), 2000, 2269-2300.
  • [28] Pal, S. K., Gupta, B. D., Mitra, P.: Rough Self Organizing Map, Applied Intelligence, 21(3), 2004, 289-299.
  • [29] Pawlak, Z.: Rough Sets: Theoretical Aspects of Reasoning About Data, Dordrecht, The Netherlands: Kluwer, 1991.
  • [30] Pevsner, J.: Bioinformatics and Functional Genomics, Wiley-Blackwell, 2009.
  • [31] Rousseeuw, P. J.: Silhouettes: A Graphical Aid to the Interpretation and Validation of Cluster Analysis, Journal of Computational and Applied Mathematics, 20, 1987, 53-65.
  • [32] Ruspini, E. H.: Numerical Methods for Fuzzy Clustering, Information Sciences, 2, 1970, 319-350.
  • [33] Shamir, R., Sharan, R.: CLICK: A Clustering Algorithm for Gene Expression Analysis, Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology, 2000.
  • [34] Timm, H., Borgelt, C., Doring, C., Kruse, R.: An Extension to Possibilistic Fuzzy Cluster Analysis, Fuzzy Sets and Systems, 147, 2004, 3-16.
  • [35] Wang, C., Yang, S., Sun, G., Tang, X., Lu, S., Neyrolles, O., Gao, Q.: Comparative miRNA Expression Profiles in Individuals with Latent and Active Tuberculosis, PLoS One, 6(10), 2011.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-48c73492-b4c5-456d-ab4c-cb3c8dc23ef8