Using machine learning approach for protein fold recognition

Stąpor, K.

Article details

Article title

Using machine learning approach for protein fold recognition

Authors

Stąpor K.

Title variants

Uczenie maszynowe w rozpoznawaniu klasy ufałdowania białka

Abstracts

Protein fold recognition using machine learning methods is crucial in protein structure discovery, especially when traditional sequence comparison methods fail because structurally similar proteins share little sequence homology. Using selected machine learning classification methods, we explain the methodology for building classifiers that can be applied to the protein fold recognition problem.

Recognizing the protein fold type with machine learning methods is of key importance in protein structure prediction, particularly in cases where the traditional approach based on sequence similarity cannot be applied because that similarity is negligible. Based on selected machine learning classification algorithms, the article presents a methodology for automatic recognition of the protein fold type.

Keywords

machine learning; classifier; protein fold; protein; protein structure

uczenie maszynowe; klasyfikator; ufałdowanie białka; białko; struktura białka

Publisher

Wydawnictwo Politechniki Śląskiej

Journal

Studia Informatica

Year

2011

Volume

Vol. 32, no. 4A

Pages

27-41

Physical description

Bibliography: 33 items.

Contributors

author

Stąpor K.

Politechnika Śląska, Instytut Informatyki, Gliwice, Akademicka 16, room 313, katarzyna.stapor@polsl.pl

Bibliography

1. Anfinsen C. B.: Principles that govern the folding of protein chains. Science 181, 1973, p. 223-230.
2. Apweiler R., Bairoch A., Wu C. H. et al.: UniProt: the universal protein knowledgebase. Nucleic Acids Res. 32 (Database issue), D115-9, 2004.
3. Baldi P., Brunak S., Chauvin Y., Andersen C., Nielsen H.: Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics 16, 2000, p. 412-424.
4. Berman H. M. et al.: The Protein DataBank. Nucleic Acids Res., No. 28, 2000, p. 235-242.
5. Bishop Ch. M.: Pattern recognition and machine learning. Springer, New York 2006.
6. Chan H. S., Dill K.: The protein folding problem. Physics Today (Feb.), 1993, p. 24-32.
7. Chen K. C. et al.: Using pseudo amino acid composition and support vector machine to predict protein structural class. Journal of Theoretical Biology, No. 243, 2006, p. 444-448.
8. Chinnasamy A., Sung W. K., Mittal A.: Protein structure and fold prediction using tree-augmented naive Bayesian classifier. Proc. PSB, Stanford CA, 2004, World Scientific Press.
9. Chmielnicki W., Stąpor K.: Protein fold recognition with combined RDA-SVM classifier. Lecture Notes on Artificial Intelligence, LNAI 6076, 2010, p. 162-169.
10. Chothia C.: One thousand families for the molecular biologist. Nature, No. 357, 1992, p. 543-544.
11. Craven M. W., Mural R. J., Hauser L. J., Uberbacher E. C.: Predicting protein folding classes without overly relying on homology. Proc. of Intelligent Systems in Molecular Biology (ISMB) No. 3, 1995, p. 98-106.
12. Cymerman I. A. et al.: Computational methods for protein structure prediction and fold-recognition. In: Nucleic Acids and Molecular Biology series, "Practical Bioinformatics", 2004, Editor: Bujnicki J. M. Springer-Verlag.
13. Ding C. H., Dubchak I.: Multi-class protein fold recognition using support vector machines and neural networks. Bioinformatics, No. 17, 2001, p. 349-358.
14. Dubchak I., Muchnik I., Holbrook S. R., Kim S. H.: Prediction of protein folding class using global description of amino acid sequence. Proc. Natl. Acad. Sci. USA, No. 92, 1995, p. 8700-8704.
15. Friedman J. H.: Regularized Discriminant Analysis. Journal of the American Statistical Association, No. 405, 1989, p. 165-175.
16. Ghanty P., Pal N. R.: Prediction of protein folds: extraction of new features, dimensionality reduction, and fusion of heterogeneous classifiers. IEEE Trans. on Nanobioscience 8, 2009, p. 100-110.
17. Huang C. D., Lin C. T., Pal N. R.: Hierarchical learning architecture with automatic feature selection for multiclass protein fold classification. IEEE Trans. on Nanobioscience, No. 2, 2003, p. 221-232.
18. Jones D. T. et al.: A new approach to protein fold recognition. Nature, No. 358, 1992, p. 86-89.
19. Lampros Ch. et al.: Sequence-based protein structure prediction using a reduced state-space hidden Markov model. Computers in Biology and Medicine, No. 37, 2007, p. 1211-1224.
20. Lo Conte L., Ailey B., Hubbard T. J. P., Brenner S. E., Murzin A. G., Chothia C.: SCOP: a structural classification of proteins database. Nucleic Acids Res., No. 28, 2000, p. 257-259.
21. Nanni L.: A novel ensemble of classifiers for protein fold recognition. Neurocomputing, No. 69, 2006, p. 2434-2437.
22. Nanni L., Lumini A.: Mpps: an ensemble of support vector machine based on multiple physicochemical properties of amino acids. Neurocomputing, No. 69, 2006, p. 1688-1690.
23. Okun O.: Protein fold recognition with k-local hyperplane distance nearest neighbor algorithm. In: Proceedings of the Second European Workshop on Data Mining and Text Mining in Bioinformatics, No. 2, Pisa, Italy, 2004, p. 51-57.
24. Pal N. R., Chakraborty D.: Some new features for protein fold prediction. Proc. Int. Conf. On Artificial Neural Networks and Neural Information Processing, 2003, p. 1176-1183.
25. Roterman I., Bryliński M., Konieczny L., Jurkowski W.: Early-stage protein folding - in silico model. In: Recent Advances in Structural Biology. A. G. de Brevern (ed.), Research Signpost, Trivandrum, Kerala, India, 2007.
26. Roterman I., Konieczny L., Bryliński M.: Late-stage folding intermediate in silico model. In: Structure-function relation in proteins. Roterman I. (ed.). Transworld Research Network T.C. 37/661(2), Fort P.O. Trivandrum, Kerala, India, 2009.
27. Konieczny L., Roterman I., Spólnik P.: System Biology. The strategy of functioning of living organism (in Polish). Scientific Publishing House PWN, Warsaw 2010.
28. Shen H. B., Chou K. C.: Ensemble classifier for protein fold pattern recognition. Bioinformatics, No. 22, 2006, p. 1717-1722.
29. Shi J. et al.: FUGUE: sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties. J. Mol. Biol., No. 310, 2001, p. 243-257.
30. Stąpor K.: Classification methods in computer vision (in Polish). Scientific Publishing House PWN, Warsaw 2011.
31. Vapnik V.: The Nature of Statistical Learning Theory. Springer, New York 1995.
32. Ying X., Dong X., Jie L.: Computational methods for protein structure prediction and modelling. Vol. 2: Structure prediction, Springer, New York 2007.
33. Ying Y., Huang K., Campbell C.: Enhance protein fold recognition through novel data integration approach. BMC Bioinformatics, No. 10, 2009, p. 267-287.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BSL1-0019-0007