Sequence similarity based method for protein function prediction

Blazewicz, J.; Lukasiak, P.; Milostan, M.; Krasnogor, N.; Jackowiak, P.

Abstract

Motivation: Proteins are the main building blocks of life. They catalyze biological processes in living cells to sustain life and improve metabolism. They also act as biological scaffolds and are the cell's workhorses. Knowing their function is therefore one of the most important milestones for understanding life. The function depends on the tertiary structure of the protein, but the structure is known for only a fraction of the amino acid sequences gathered in databases. Thus, creating efficient and accurate methods that predict function from sequence, based on already known function-sequence assignments, is a fundamental challenge in computational biology.

Results: First, we present a detailed analysis of the usability of similarity search engines in the context of function prediction. Then we propose a simple and effective method for assigning function to sequences based on the results of similarity searches and information gathered from gene ontology annotation graphs.

Availability: All data used for the analysis presented in this paper, as well as the raw results, are available at http://bio.cs.put.poznan.pl/funcpred/data/

Supplementary Material: Supplementary materials with additional charts are available at http://bio.cs.put.poznan.pl/funcpred/suplement/

Contact: protbio@cs.put.poznan.pl

Keywords

protein function prediction; function annotation; gene ontology; sequence analysis; sequence similarity

Publisher

Wydawnictwo Politechniki Poznańskiej

Journal

Foundations of Computing and Decision Sciences

Year

2009

Volume

Vol. 34, No. 3

Pages

173-191

Physical description

Bibliography: 23 items

Contributors

author

Blazewicz J.

author

Lukasiak P.

author

Milostan M.

author

Krasnogor N.

author

Jackowiak P.

Institute of Computing Science, Poznan University of Technology, Poznan

References

[1] S. Wu, Y. Zhang, A comprehensive assessment of sequence-based and template-based methods for protein contact prediction, Bioinformatics 24 (7) (2008) 924-931.
[2] D. Barthel, J. Hirst, J. Blazewicz, E. Burke, N. Krasnogor, Procksi: A decision support system for protein (structure) comparison, knowledge, similarity and information, BMC Bioinformatics 8 (416).
[3] J. Blazewicz, P. Lukasiak, M. Milostan, Some operations research methods for analyzing protein sequences and structures, 4OR: A Quarterly Journal of Operations Research 4, no. 2 (2006) 91-123.
[4] H. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. Bhat, H. Weissig, I. Shindyalov, P. Bourne, The protein data bank, Nucleic Acids Research 28 (2000) 235-242.
[5] W. Jaskowski, J. Blazewicz, P. Lukasiak, M. Milostan, N. Krasnogor, 3d-judge - a metaserver approach to protein structure prediction, Foundations of Computing and Decision Sciences 32 (1).
[6] A. Kryshtafovych, A. Prlic, Z. Dmytriv, P. Daniluk, M. Milostan, V. Eyrich, T. Hubbard, K. Fidelis, New tools and expanded data analysis capabilities at the protein structure prediction center, Proteins 69 (S8) (2007) 19-26.
[7] M. Ashburner, C. A. Ball, J. A. Blake, D. Botstein, H. Butler, J. M. Cherry, A. P. Davis, K. Dolinski, S. S. Dwight, J. T. Eppig, M. A. Harris, D. P. Hill, L. Issel-Tarver, A. Kasarskis, S. Lewis, J. C. Matese, J. E. Richardson, M. Ringwald, G. M. Rubin, G. Sherlock, Gene ontology: tool for the unification of biology, Nature Genet. 25 (2000) 25-29.
[8] E. Camon, M. Magrane, D. Barrell, V. Lee, E. Dimmer, J. Maslen, D. Binns, N. Harte, R. Lopez, R. Apweiler, The gene ontology annotation (goa) database: sharing knowledge in uniprot with gene ontology, Nucleic Acids Research 32 (1) (2004) D262-D266.
[9] C. Nguyen, M. Manino, K. Gardiner, K. Cios, Clusfcm: an algorithm for predicting protein functions using homologies and protein interactions, Journal of Bioinformatics and Computational Biology 6 (2008) 203-222.
[10] T. M. Cover, P. E. Hart, Nearest neighbour pattern recognition, IEEE Trans. on Information Theory 13 (1967) 21-27.
[11] B. Boeckmann, A. Bairoch, R. Apweiler, M. Blatter, A. Estreicher, E. Gasteiger, M. Martin, K. Michoud, C. O'Donovan, I. Phan, S. Pilbout, M. Schneider, The swiss-prot protein knowledgebase and its supplement trembl in 2003, Nucleic Acids Research 31 (2003) 365-370.
[12] S. Altschul, T. Madden, A. Schaffer, J. Zhang, Z. Zhang, W. Miller, D. Lipman, Gapped blast and psi-blast: A new generation of protein database search programs, Nucleic Acids Research 25 (1997) 3389-3402.
[13] M. Dayhoff, R. Schwartz, B. Orcutt, A model of evolutionary change in proteins, in: M. Dayhoff (Ed.), Atlas of Protein Sequence and Structure, Vol. 5, suppl. 3, Natl. Biomed. Res. Found., Washington, DC., 1978, pp. 345-352.
[14] R. Schwartz, M. Dayhoff, Matrices for detecting distant relationships, in: M. Dayhoff (Ed.), Atlas of Protein Sequence and Structure, Vol. 5, suppl. 3, Natl. Biomed. Res. Found., Washington, DC., 1978, pp. 353-358.
[15] W. J. Wilbur, On the pam matrix model of protein evolution, Mol. Biol. Evol. 2 (1985) 434-447.
[16] S. R. Eddy, Where did the blosum62 alignment score matrix come from?, Nature Biotechnology 22 (2004) 1035-1036.
[17] J. Henikoff, Amino acid substitution matrices from protein blocks, Proc. Natl. Acad. Sci. USA 89 (1992) 10915-10919.
[18] J. Whisstock, A. Lesk, Prediction of protein function from protein sequence and structure, Quarterly Reviews of Biophysics 36(3) (2003) 307-340.
[19] S. Karlin, S. Altschul, Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes, Proc. Natl. Acad. Sci. USA 87 (1990) 2264-2268.
[20] W. Pearson, Comparison of methods for searching protein sequence databases, Prot. Sci. 4 (1995) 1145-1160.
[21] J.-M. Claverie, D. States, Information enhancement methods for large-scale sequence-analysis, Comput. Chem. 17 (1993) 191-201.
[22] J. Wootton, S. Federhen, Statistics of local complexity in amino acid sequences and sequence databases, Comput. Chem. 17 (1993) 149-163.
[23] S. Altschul, M. Boguski, W. Gish, J. Wootton, Issues in searching molecular sequence databases, Nature Genet. 6 (1994) 119-129.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BPP2-0014-0049