Article title
Identifiers
Title variants
Publication languages
Abstracts
Motivation: Proteins are the main building blocks of life. They catalyze biological processes in living cells to sustain life and improve metabolism. They also act as biological scaffolds and are the cell's workhorses. Knowing their function is therefore one of the most important milestones in understanding life. The function depends on the tertiary structure of the protein, but the structure is known for only a fraction of the amino acid sequences gathered in databases. Thus, creating efficient and accurate methods that predict function from sequence, based on already known function-sequence assignments, is a fundamental challenge in computational biology.
Results: First, we present a detailed analysis of the usability of similarity search engines in the context of function prediction. Then we propose a simple and effective method for assigning function to sequences based on the results of similarity searches and on information gathered from Gene Ontology annotation graphs.
Availability: All data used for the analysis presented in this paper, as well as the raw results, are available at: http://bio.cs.put.poznan.pl/funcpred/data/
Supplementary Material: Supplementary materials with additional charts are available at: http://bio.es.put.poznan.pl/funcpred/suplement/
Contact: protbio@cs.put.poznan.pl
Year
Volume
Pages
173--191
Physical description
Bibliography: 23 items.
Authors
author
author
author
author
author
- Institute of Computing Science, Poznan University of Technology, Poznan
Bibliography
- [1] S. Wu, Y. Zhang, A comprehensive assessment of sequence-based and template-based methods for protein contact prediction, Bioinformatics 24 (7) (2008) 924-31.
- [2] D. Barthel, J. Hirst, J. Blazewicz, E. Burke, N. Krasnogor, Procksi: A decision support system for protein (structure) comparison, knowledge, similarity and information, BMC Bioinformatics 8 (416).
- [3] J. Blazewicz, P. Lukasiak, M. Milostan, Some operations research methods for analyzing protein sequences and structures, 4OR: A Quarterly Journal of Operations Research 4, no. 2 (2006) 91-123.
- [4] H. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. Bhat, H. Weissig, I. Shindyalov, P. Bourne, The protein data bank, Nucleic Acids Research 28 (2000) 235-242.
- [5] W. Jaskowski, J. Blazewicz, P. Lukasiak, M. Milostan, N. Krasnogor, 3d-judge - a metaserver approach to protein structure prediction, Foundations of Computing and Decision Sciences 32 (1).
- [6] A. Kryshtafovych, A. Prlic, Z. Dmytriv, P. Daniluk, M. Milostan, V. Eyrich, T. Hubbard, K. Fidelis, New tools and expanded data analysis capabilities at the protein structure prediction center, Proteins 69 (S8) (2007) 19-26.
- [7] M. Ashburner, C. A. Ball, J. A. Blake, D. Botstein, H. Butler, J. M. Cherry, A. P. Davis, K. Dolinski, S. S. Dwight, J. T. Eppig, M. A. Harris, D. P. Hill, L. Issel-Tarver, A. Kasarskis, S. Lewis, J. C. Matese, J. E. Richardson, M. Ringwald, G. M. Rubin, G. Sherlock, Gene ontology: tool for the unification of biology, Nature Genet. 25 (2000) 25-29.
- [8] E. Camon, M. Magrane, D. Barrell, V. Lee, E. Dimmer, J. Maslen, D. Binns, N. Harte, R. Lopez, R. Apweiler, The gene ontology annotation (goa) database: sharing knowledge in uniprot with gene ontology, Nucleic Acids Research 32 (1) (2004) D262-D266.
- [9] C. Nguyen, M. Manino, K. Gardiner, K. Cios, Clusfcm: an algorithm for predicting protein functions using homologies and protein interactions, Journal of Bioinformatics and Computational Biology 6 (2008) 203-222.
- [10] T. M. Cover, P. E. Hart, Nearest neighbour pattern recognition, IEEE Trans. on Information Theory 13 (1967) 21-27.
- [11] B. Boeckmann, A. Bairoch, R. Apweiler, M. Blatter, A. Estreicher, E. Gasteiger, M. Martin, K. Michoud, C. O'Donovan, I. Phan, S. Pilbout, M. Schneider, The swiss-prot protein knowledgebase and its supplement trembl in 2003, Nucleic Acids Research 31 (2003) 365-370.
- [12] S. Altschul, T. Madden, A. Schaffer, J. Zhang, Z. Zhang, W. Miller, D. Lipman, Gapped blast and psi-blast: A new generation of protein database search programs, Nucleic Acids Research 25 (1997) 3389-3402.
- [13] M. Dayhoff, R. Schwartz, B. Orcutt, A model of evolutionary change in proteins, in: M. Dayhoff (Ed.), Atlas of Protein Sequence and Structure, Vol. 5, suppl. 3, Natl. Biomed. Res. Found., Washington, DC., 1978, pp. 345-352.
- [14] R. Schwartz, M. Dayhoff, Matrices for detecting distant relationships, in: M. Dayhoff (Ed.), Atlas of Protein Sequence and Structure, Vol. 5, suppl. 3, Natl. Biomed. Res. Found., Washington, DC., 1978, pp. 353-358.
- [15] W. J. Wilbur, On the pam matrix model of protein evolution, Mol. Biol. Evol. 2 (1985) 434-447.
- [16] S. R. Eddy, Where did the blosum62 alignment score matrix come from?, Nature Biotechnology 22 (2004) 1035-1036.
- [17] J. Henikoff, Amino acid substitution matrices from protein blocks, Proc. Natl. Acad. Sci. USA 89 (1992) 10915-10919.
- [18] J. Whisstock, A. Lesk, Prediction of protein function from protein sequence and structure, Quarterly Reviews of Biophysics 36(3) (2003) 307-340.
- [19] S. Karlin, S. Altschul, Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes, Proc. Natl. Acad. Sci. USA 87 (1990) 2264-2268.
- [20] W. Pearson, Comparison of methods for searching protein sequence databases, Prot. Sci. 4 (1995) 1145-1160.
- [21] J.-M. Claverie, D. States, Information enhancement methods for large-scale sequence-analysis, Comput. Chem. 17 (1993) 191-201.
- [22] J. Wootton, S. Federhen, Statistics of local complexity in amino acid sequences and sequence databases, Comput. Chem. 17 (1993) 149-163.
- [23] S. Altschul, M. Boguski, W. Gish, J. Wootton, Issues in searching molecular sequence databases, Nature Genet. 6 (1994) 119-129.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BPP2-0014-0049