Article title

Unsupervised and Supervised Learning Approaches Together for Microarray Analysis

Publication languages
EN
Abstracts
EN
This article introduces an approach that combines unsupervised and supervised learning. For the unsupervised stage, fuzzy clustering of microarray data is posed as a multiobjective optimization problem in which two internal fuzzy cluster validity indices are optimized simultaneously, yielding a set of Pareto-optimal clustering solutions. To this end, a new multiobjective differential evolution based fuzzy clustering technique is proposed. For the supervised stage, a fuzzy majority voting scheme together with a support vector machine is used to integrate the clustering information from all solutions in the resulting Pareto-optimal set. The performance of the proposed technique is demonstrated on five publicly available benchmark microarray data sets. A detailed comparison is carried out with multiobjective genetic algorithm based fuzzy clustering, multiobjective differential evolution based fuzzy clustering, single-objective versions of differential evolution and genetic algorithm based fuzzy clustering, and the well-known fuzzy c-means algorithm. For the support vector machine, a comparative study of four different kernel functions is also reported. A statistical significance test is performed to establish the statistical superiority of the proposed multiobjective clustering approach. Finally, a biological significance test is carried out using a web-based gene annotation tool to show that the proposed integrated technique produces biologically relevant clusters of coexpressed genes.
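The notion of a Pareto-optimal set of clusterings can be illustrated with a minimal sketch. It assumes that each candidate clustering has already been scored by two validity indices and that both are to be minimized; the abstract does not name the indices or their direction, so the score values and the function names `dominates` and `pareto_front` are hypothetical and are not taken from the article.

```python
def dominates(a, b):
    """Return True if score tuple a dominates b under minimization:
    a is no worse than b on every objective and strictly better on at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(scores):
    """Return the indices of the non-dominated score tuples, in input order."""
    return [
        i for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (index_1, index_2) values for five candidate clusterings.
candidate_scores = [
    (0.30, 0.90),
    (0.25, 1.10),
    (0.40, 0.70),
    (0.35, 0.95),  # worse than the first candidate on both indices
    (0.50, 0.60),
]

front = pareto_front(candidate_scores)
print(front)  # indices of the clusterings that would be kept
```

In this toy input the fourth candidate is dominated by the first and is dropped; the remaining clusterings are the kind of solution set that the supervised stage would then combine.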
Pages
45--73
Physical description
Bibliography: 37 items, tables, charts.
Authors
author
author
  • Interdisciplinary Centre for Mathematical and Computational Modeling, University of Warsaw, 02-106 Warsaw, Poland, indra@icm.edu.pl
Bibliography
  • [1] R. Sharan et al., "Click and expander: A system for clustering and visualizing gene expression data," Bioinformatics, vol. 19, pp. 1787-1799, 2003.
  • [2] A. A. Alizadeh, M. B. Eisen, R. Davis, C. Ma, I. Lossos, A. Rosenwald, J. Boldrick, R. Warnke, R. Levy, W. Wilson, M. Grever, J. Byrd, D. Botstein, P. O. Brown, and L. M. Staudt, "Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling," Nature, vol. 403, pp. 503-511, 2000.
  • [3] M. B. Eisen, P. T. Spellman, P. O. Brown, and D. Botstein, "Cluster analysis and display of genome-wide expression patterns," In Proc. Nat. Academy of Sciences, pp. 14863-14868, 1998.
  • [4] S. Bandyopadhyay, U. Maulik, and J. T. Wang, Analysis of Biological Data: A Soft Computing Approach, World Scientific, 2007.
  • [5] D. J. Lockhart and E. A. Winzeler, "Genomics, gene expression and DNA arrays," Nature, vol. 405, pp. 827-836, 2000.
  • [6] J. A. Hartigan, Clustering Algorithms, Wiley, 1975.
  • [7] A. K. Jain and R. C. Dubes, Algorithms for Clustering Data, Prentice-Hall, Englewood Cliffs, NJ, 1988.
  • [8] B. S. Everitt, Cluster Analysis, Halsted Press, Third edition, 1993.
  • [9] U. Maulik and S. Bandyopadhyay, "Performance evaluation of some clustering algorithms and validity indices," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 12, pp. 1650-1654, 2002.
  • [10] R. Storn and K. Price, "Differential evolution - A simple and efficient adaptive scheme for global optimization over continuous spaces," Technical Report TR-95-012, International Computer Science Institute, Berkeley, 1995.
  • [11] R. Storn and K. Price, "Differential evolution - A simple and efficient heuristic strategy for global optimization over continuous spaces," Journal of Global Optimization, vol. 11, pp. 341-359, 1997.
  • [12] K. Price, R. Storn, and J. Lampinen, Differential Evolution - A Practical Approach to Global Optimization, Springer, Berlin, 2005.
  • [13] U. Maulik and I. Saha, "Modified differential evolution based fuzzy clustering for pixel classification in remote sensing imagery," Pattern Recognition, vol. 42, no. 9, pp. 2135-2149, 2009.
  • [14] U. Maulik, S. Bandyopadhyay, and I. Saha, "Integrating clustering and supervised learning for categorical data analysis," IEEE Transactions on Systems, Man and Cybernetics Part-A, vol. 40, no. 4, pp. 664-675, 2010.
  • [15] U. Maulik and I. Saha, "Automatic fuzzy clustering using modified differential evolution for image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 48, no. 9, pp. 3503-3510, 2010.
  • [16] X. L. Xie and G. Beni, "A validity measure for fuzzy clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, pp. 841-847, 1991.
  • [17] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum, New York, 1981.
  • [18] V. Vapnik, Statistical Learning Theory, Wiley, New York, USA, 1998.
  • [19] C. J. C. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining and Knowledge Discovery, vol. 2, pp. 121-167, 1998.
  • [20] K. Deb, S. Agrawal, A. Pratap, and T. Meyarivan, "A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II," IEEE Transactions on Evolutionary Computation, vol. 6, pp. 182-197, 2002.
  • [21] K. Deb, Multi-objective Optimization Using Evolutionary Algorithms, John Wiley and Sons, Ltd, England, 2001.
  • [22] S. Bandyopadhyay, U. Maulik, and A. Mukhopadhyay, "Multiobjective genetic clustering for pixel classification in remote sensing imagery," IEEE Transactions on Geoscience and Remote Sensing, vol. 45, no. 5, pp. 1506-1511, 2007.
  • [23] U. Maulik and S. Bandyopadhyay, "Genetic algorithm based clustering technique," Pattern Recognition, vol. 33, pp. 1455-1465, 2000.
  • [24] P. J. Rousseeuw, "Silhouettes: a graphical aid to the interpretation and validation of cluster analysis," J. Comput. Appl. Math., vol. 20, pp. 53-65, 1987.
  • [25] C. A. Coello Coello, "A comprehensive survey of evolutionary-based multiobjective optimization techniques," Knowledge and Information Systems, vol. 1, no. 3, pp. 129-156, 1999.
  • [26] N. R. Pal and J. C. Bezdek, "On cluster validity for the Fuzzy C-Means model," IEEE Transactions on Fuzzy Systems, vol. 3, pp. 370-379, 1995.
  • [27] S. Chu, J. DeRisi, M. Eisen, J. Mulholland, D. Botstein, P. O. Brown, and I. Herskowitz, "The transcriptional program of sporulation in budding yeast," Science, vol. 282, pp. 699-705, October 1998.
  • [28] R. J. Cho, M. J. Campbell, E. A. Winzeler, L. Steinmetz, A. Conway, L. Wodicka, T. G. Wolfsberg, et al., "A genome-wide transcriptional analysis of the mitotic cell cycle," Mol. Cell, vol. 2, pp. 65-73, 1998.
  • [29] P. Reymond, H. Weber, M. Damond, and E. E. Farmer, "Differential gene expression in response to mechanical wounding and insect feeding in Arabidopsis," Plant Cell, vol. 12, pp. 707-720, 2000.
  • [30] V. R. Iyer, M. B. Eisen, D. T. Ross, G. Schuler, T. Moore, J. Lee, J. M. Trent, L. M. Staudt, J. J. Hudson, M. S. Boguski, D. Lashkari, D. Shalon, D. Botstein, and P. O. Brown, "The transcriptional program in the response of human fibroblasts to serum," Science, vol. 283, pp. 83-87, 1999.
  • [31] X. Wen, S. Fuhrman, G. S. Michaels, D. B. Carr, S. Smith, J. L. Barker, and R. Somogyi, "Large-scale temporal gene expression mapping of central nervous system development," In Proc. Nat. Academy of Sciences, vol. 95, pp. 334-339, 1998.
  • [32] Y. Xu, V. Olman, and D. Xu, "Minimum spanning trees for gene expression data clustering," Genome Informatics, vol. 12, pp. 24-33, 2001.
  • [33] M. Hollander and D. A. Wolfe, Nonparametric Statistical Methods, 2nd ed., 1999.
  • [34] K. Crammer and Y. Singer, "On the algorithmic implementation of multiclass kernel-based vector machines," J. Machine Learning Research, vol. 2, pp. 265-292, 2001.
  • [35] C. W. Hsu and C. J. Lin, "A comparison of methods for multi-class support vector machines," IEEE Transactions on Neural Networks, vol. 13, no. 2, pp. 415-425, 2002.
  • [36] S. Tavazoie, J. Hughes, M. Campbell, R. Cho, and G. Church, "Systematic determination of genetic network architecture," Nature Genet, vol. 22, pp. 281-285, 1999.
  • [37] S. Bandyopadhyay, S. Saha, U. Maulik, and K. Deb, "A simulated annealing based multi-objective optimization algorithm: AMOSA," IEEE Transactions on Evolutionary Computation, vol. 12, no. 3, pp. 269-283, 2008.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BUS8-0011-0057