Abstract
One important application of gene expression microarray data is the classification of samples into categories, such as tumor type. This article describes a classifier based on Multiclass SVM [4] (Support Vector Machines). The classifier uses Multivariate Partial Least Squares (MPLS) for dimension reduction when classifying into more than two classes. We also use two methods built from binary classifiers: One-Against-All [5] and One-Against-One [6]. The three methods were tested on a data set of 125 human thyroid DNA microarray samples: 66 papillary thyroid carcinomas, 32 follicular thyroid carcinomas, and 27 normal tissues. The main goal is to find a small number of genes that discriminate between these three classes with good accuracy. The best genes can then be selected for Q-PCR validation. Molecular markers that differentiate thyroid cancer from normal tissue can support clinical diagnostics and the choice of therapy. For error estimation we use the .632 bootstrap technique [8]. A major issue with bootstrap estimators is their high computational cost, which is why we run the computation in parallel on an OpenMosix cluster with MPI (Message Passing Interface).
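The .632 bootstrap estimate mentioned in the abstract combines the resubstitution error with the error measured on samples left out of each bootstrap replicate. The following is a minimal sketch of that idea, not the authors' implementation: the nearest-centroid classifier is a hypothetical stand-in for the SVM variants, the toy data are synthetic, the out-of-bag error is pooled over replicates as a simplification, and all function names are assumptions.

```python
import random

def fit_centroids(X, y):
    """Fit a nearest-centroid classifier (stand-in for the SVMs in the abstract)."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def predict(centroids, row):
    """Return the label whose centroid is closest to the sample."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda label: sqdist(centroids[label]))

def error_rate(centroids, X, y):
    wrong = sum(predict(centroids, r) != t for r, t in zip(X, y))
    return wrong / len(y)

def bootstrap_632(X, y, n_boot=200, seed=0):
    """0.368 * resubstitution error + 0.632 * pooled out-of-bag error."""
    rng = random.Random(seed)
    n = len(y)
    resub = error_rate(fit_centroids(X, y), X, y)
    oob_wrong = oob_total = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # sample n items with replacement
        in_bag = set(idx)
        oob = [i for i in range(n) if i not in in_bag]
        if not oob:
            continue                                  # no held-out samples this round
        model = fit_centroids([X[i] for i in idx], [y[i] for i in idx])
        oob_wrong += sum(predict(model, X[i]) != y[i] for i in oob)
        oob_total += len(oob)
    if oob_total == 0:
        return resub
    return 0.368 * resub + 0.632 * (oob_wrong / oob_total)

# Synthetic three-class toy data (not the thyroid data set).
_rng = random.Random(1)
centers = {"A": (0.0, 0.0), "B": (10.0, 10.0), "C": (20.0, 0.0)}
X, y = [], []
for label, (cx, cy) in centers.items():
    for _ in range(10):
        X.append([cx + _rng.gauss(0, 1), cy + _rng.gauss(0, 1)])
        y.append(label)

estimate = bootstrap_632(X, y, n_boot=100, seed=0)
print(f".632 bootstrap error estimate: {estimate:.3f}")
```

Each replicate retrains the classifier from scratch, which explains the computational cost noted in the abstract. The replicates are independent of one another, so they can be distributed across cluster nodes.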
Pages
197–204
Physical description
Bibliography: 10 items, figures, tables.
Authors
author
- Silesian University of Technology, Automatic Control, Electronics and Computer Science; Automatic Institute; Akademicka 16, 44-100 Gliwice, Poland
References
- [1] BOSER B. E., GUYON I. M., VAPNIK V., A training algorithm for optimal margin classifiers. Fifth Annual Workshop on Computational Learning Theory, Pittsburgh, 1992.
- [2] GUYON I., WESTON J., BARNHILL S., VAPNIK V., Gene Selection for Cancer Classification using Support Vector Machines. Machine Learning, Vol. 46, pp. 389–422, 2002.
- [3] BROWN M. P. S., GRUNDY W. N., LIN D., CRISTIANINI N., SUGNET C. W., FUREY T. S., ARES M. Jr, HAUSSLER D., Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc. of the National Academy of Sciences, Vol. 97, No. 1, pp. 262–267, 2000.
- [4] WESTON J., WATKINS C., Multi-class Support Vector Machines. In: M. Verleysen (ed.), Proceedings of ESANN99, Brussels. D-Facto Press, 1999.
- [5] BOTTOU L., CORTES C., DENKER J., DRUCKER H., GUYON I., JACKEL L., LECUN Y., MULLER U., SACKINGER E., SIMARD P., VAPNIK V., Comparison of classifier methods: a case study in handwritten digit recognition. Proc. Int. Conf. Pattern Recognition, pp. 77–87, 1994.
- [6] FRIEDMAN J., Another Approach to Polychotomous Classification. Dept. Statist., Stanford Univ., Stanford, CA, 1996.
- [7] HÖSKULDSSON A., PLS regression methods. J. Chemometrics, 2(3), pp. 211–228, 1988.
- [8] EFRON B., Estimating the error rate of a prediction rule: improvement on cross-validation. JASA, Vol. 78, pp. 316–331, 1983.
- [9] BRAGA-NETO U., DOUGHERTY E. R., Is cross-validation valid for small-sample microarray classification? Bioinformatics, 20(3), pp. 374–380, 2004.
- [10] NGUYEN D. V., ROCKE D. M., Tumor classification by partial least squares using microarray gene expression data. Bioinformatics, 18(1), pp. 39–50, 2002.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-PWA4-0007-0020