Article title
Identifiers
Title variants
Publication languages
Abstracts
Microarrays are a new technique for measuring gene expression that has attracted a great deal of research interest in recent years. It has been suggested that gene expression data from microarrays (biochips) can be utilized in many biomedical areas, for example in cancer classification. Whereas several classification methods, both new and existing, have been tested, the selection of a proper (optimal) set of genes whose expression is used for classification is still an open problem. In this paper we propose a heuristic method of choosing a suboptimal set of genes by using support vector machines (SVMs). The obtained set of genes optimizes the leave-one-out cross-validation error. The method is tested on microarray gene expression data from samples of two cancer types: acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL). The results show that the quality of classification for the selected set of genes is much better than for sets obtained using other methods of feature selection.
Year
Volume
Pages
MI9--MI17
Physical description
Bibliography: 16 items, charts.
Authors
author
- Silesian Technical University, Institute of Automatic Control, Akademicka 16, 44-101 Gliwice, Poland
author
- Department of Statistics, Rice University, P.O. Box 1892, Houston, TX 77251, USA
author
- Department of Experimental and Clinical Radiobiology, Institute of Oncology, 44-101 Gliwice, Poland
author
- Silesian Technical University, Institute of Automatic Control, Akademicka 16, 44-101 Gliwice, Poland
Bibliography
- [1] ALIZADEH A. A. et al., Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling, Nature, Vol. 403, pp. 503-511, 2000.
- [2] BOSER B. E., I. M. GUYON, V. VAPNIK, A training algorithm for optimal margin classifiers, Fifth Annual Workshop on Computational Learning Theory, Pittsburgh, 1992.
- [3] BROWN M. P. S. et al., Knowledge based analysis of microarray gene expression data by using support vector machines, Proc. of the National Academy of Sciences, Vol. 97, no. 1, pp. 262-267, Jan. 2000.
- [4] CRISTIANINI N., J. SHAWE-TAYLOR, An introduction to support vector machines and other kernel-based learning methods, Cambridge Univ. Press, 2000.
- [5] EISEN M., P. SPELLMAN, P. BROWN, D. BOTSTEIN, Cluster analysis and display of genome-wide expression patterns, Proc. of the National Academy of Sciences, Vol. 95, pp. 14863-14867, 1998.
- [6] FUJAREWICZ K., J. RZESZOWSKA-WOLNY, Cancer classification based on gene expression data, Journal of Medical Informatics and Technologies, Vol. 5, pp. BI23-BI27, Nov. 2000.
- [7] FUJAREWICZ K., J. RZESZOWSKA-WOLNY, Neural network approach to cancer classification based on gene expression levels, Proc. IASTED Int. Conf. Modelling Identification and Control, pp. 564-568, Innsbruck, 2001.
- [8] GOLUB T. R. et al., Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, Science, Vol. 286, pp. 531-537, Oct. 1999.
- [9] HAYKIN S., Neural networks - a comprehensive foundation, Prentice-Hall Int. Inc., 1999.
- [10] KOHONEN T., Self-organizing maps, Springer, New York, 1997.
- [11] LOCKHART D. et al., Expression monitoring by hybridization to high-density oligonucleotide arrays, Nature Biotechnology, Vol. 14, pp. 1678-1680, 1996.
- [12] SLONIM D. K. et al., Class prediction and discovery using gene expression levels - report, Whitehead/MIT Center for Genome Research, Cambridge, Sep. 1999.
- [13] SEBESTYEN G. S., Decision making processes in pattern recognition, Macmillan, New York, 1962.
- [14] SOBCZAK W., MALINA W., Metody selekcji informacji, WNT, Warszawa 1978.
- [15] VAPNIK V., The nature of statistical learning theory, Springer-Verlag, New York, 1995.
- [16] WODICKA L. et al., Genome-wide expression monitoring in Saccharomyces cerevisiae, Nature Biotechnology, Vol. 15, pp. 1359-1367, 1997.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-c5da8b8c-7c07-4805-8ad3-8c8dfb3be183