Backbone dihedral angles prediction servers for protein early-stage structure prediction

Smolarczyk, Tomasz; Stapor, Katarzyna; Roterman-Konieczna, Irena

doi:10.1515/bams-2019-0034

Abstract

Three-dimensional protein structure prediction is an important task at the intersection of biology, chemistry, and informatics, and it is crucial for determining protein function. Within the two-stage protein folding model, which is based on early- and late-stage intermediates, we propose using state-of-the-art secondary structure prediction servers to predict backbone dihedral angles and to derive an early-stage structure from them. Early-stage structures serve as the starting point for protein folding simulations, so any errors at this stage affect the final predictions. We show that modern secondary structure prediction servers can increase the accuracy of early-stage predictions compared to previously reported models.

Keywords

early stage; MUFOLD-SS; protein folding; secondary structure; SPIDER3

Publisher

Uniwersytet Jagielloński - Collegium Medicum
De Gruyter

Journal

Bio-Algorithms and Med-Systems

Year

2019

Volume

Vol. 15, no. 4

Pages

art. no. 20190034

Physical description

35 references, figures, tables

Authors

author

Smolarczyk Tomasz

tomasz.smolarczyk@gmail.com

Institute of Informatics, Silesian University of Technology, Akademicka 16, Gliwice, Poland

author

Stapor Katarzyna

Institute of Informatics, Silesian University of Technology, Akademicka 16, Gliwice, Poland

author

Roterman-Konieczna Irena

Department of Bioinformatics and Telemedicine, Jagiellonian University Medical College, Kraków, Poland

References

[1] Anfinsen CB. Principles that govern the folding of protein chains. Science 1973;181:223-30.
[2] Rost B, Sander C, Schneider R. Redefining the goals of protein secondary structure prediction. J Mol Biol 1994;235:13-26.
[3] Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, et al. The Protein Data Bank. Nucleic Acids Res 2000;28:235-42.
[4] The UniProt Consortium. UniProt: the universal protein knowledgebase. Nucleic Acids Res 2018;46:2699.
[5] The UniProt Consortium. UniProt: the universal protein knowledgebase. Nucleic Acids Res 2017;45:D158-69.
[6] Yang Y, Gao J, Wang J, Heffernan R, Hanson J, Paliwal K, et al. Sixty-five years of the long march in protein secondary structure prediction: the final stretch? Brief Bioinf 2018;19:482-94.
[7] Shortle D. Prediction of protein structure. Curr Biol 2000;10:49-51.
[8] Rost B. Rising accuracy of protein secondary structure prediction. In: Chasman D, editor. Protein structure determination, analysis, and modeling for drug discovery. New York: Dekker, 2003:207-49.
[9] Rost B. Review: protein secondary structure prediction continues to rise. J Struct Biol 2001;134:204-18.
[10] Brylinski M, Konieczny L, Czerwonko P, Jurkowski W, Roterman I. Early-stage folding in proteins (in silico) sequence-to-structure relation. J Biomed Biotechnol 2005;2:65-79.
[11] Gadzała M, Dułak D, Kalinowska B, Baster Z, Bryliński M, Konieczny L, et al. The aqueous environment as an active participant in the protein folding process. J Mol Graph Modell 2019;87:227-39.
[12] Heffernan R, Yang Y, Paliwal KK, Zhou Y. Capturing non-local interactions by long short-term memory bidirectional recurrent neural networks for improving prediction of protein secondary structure, backbone angles, contact numbers and solvent accessibility. Bioinformatics 2017;33:2842-9.
[13] Fang C, Shang Y, Xu D. MUFOLD-SS: new deep inception-inside-inception networks for protein secondary structure prediction. Proteins 2018;86:592-8.
[14] Kalinowska B, Alejster P, Sałapa K, Baster Z, Roterman I. Hypothetical in silico model of the early-stage intermediate in protein folding. J Mol Model 2013;19:4259-69.
[15] Roterman I. Modelling the optimal simulation path in the peptide chain folding-studies based on geometry of alanine heptapeptide. J Theor Biol 1995;177:283-8.
[16] Jurkowski W, Brylinski M, Konieczny L, Wiśniowski Z, Roterman I. Conformational subspace in simulation of early-stage protein folding. Proteins 2004;55:115-27.
[17] Kalinowska B, Fabian P, Stąpor K, Roterman I. Statistical dictionaries for hypothetical in silico model of the early-stage intermediate in protein folding. J Comput Aided Mol Des 2015;29:609-18.
[18] Rose AS, Bradley AR, Valasatava Y, Duarte JM, Prlić A, Rose PW. NGL viewer: web-based molecular graphics for large complexes. Bioinformatics 2018;34:3755-8.
[19] Heffernan R, Paliwal K, Lyons J, Dehzangi A, Sharma A, Wang J, et al. Improving prediction of secondary structure, local backbone angles, and solvent accessible surface area of proteins by iterative deep learning. Sci Rep 2015;5:11476.
[20] Fang C, Shang Y, Xu D. Prediction of protein backbone torsion angles using deep residual inception neural networks. IEEE/ACM Trans Comput Biol Bioinf 2018;16:1020-8.
[21] Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997;25:3389-402.
[22] Fauchère J-L, Charton M, Kier LB, Verloop A, Pliska V. Amino acid side chain parameters for correlation studies in biology and pharmacology. Int J Peptide Protein Res 1988;32:269-78.
[23] Remmert M, Biegert A, Hauser A, Söding J. HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat Methods 2012;9:173-5.
[24] Jiang Q, Jin X, Lee S-J, Yao S. Protein secondary structure prediction: a survey of the state of the art. J Mol Graph Modell 2017;76:379-402.
[25] Wang S, Peng J, Ma J, Xu J. Protein secondary structure prediction using deep convolutional neural fields. Sci Rep 2016;6:18962.
[26] Lee J. Measures for the assessment of fuzzy predictions of protein secondary structure. Proteins 2006;65:453-62.
[27] Brylinski M, Konieczny L, Roterman I. SPI – structure predictability index for protein sequences. In Silico Biol 2005;5:227-37.
[28] Fawcett T. An introduction to ROC analysis. Pattern Recogn Lett 2006;27:861-74.
[29] Li W, Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 2006;22:1658-9.
[30] Huang Y, Niu B, Gao Y, Fu L, Li W. CD-HIT Suite: a web server for clustering and comparing biological sequences. Bioinformatics 2010;26:680-2.
[31] Hollingsworth SA, Karplus PA. A fresh look at the Ramachandran plot and the occurrence of standard structures in proteins. BioMol Concepts 2010;1:271-83.
[32] Fabian P, Stąpor K. Developing a new SVM classifier for the extended ES protein structure prediction. In: 2017 Federated Conference on Computer Science and Information Systems (FedCSIS), Prague, 2017.
[33] Smolarczyk T, Stapor K. Random forest classifier for early-stage protein structure prediction. Studia Inf 2018;39:37-54.
[34] Kalinowska B, Fabian P, Stąpor K, Roterman-Konieczna I. Statistical dictionaries for hypothetical in silico model of the early-stage intermediate in protein folding. J Comput Aided Mol Des 2015;29:609-18.
[35] Dietterich TG. Ensemble methods in machine learning. In: Multiple classifier systems. Berlin/Heidelberg: Springer Berlin Heidelberg, 2000:1-15.

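Notes

Record prepared with funds from MNiSW (the Polish Ministry of Science and Higher Education), agreement No. 461252, under the programme "Społeczna odpowiedzialność nauki" (Social Responsibility of Science), module: Popularization of science and promotion of sport (2020).

Document type

Bibliography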

YADDA identifier

bwmeta1.element.baztech-d18739e6-ba6b-4a9d-a0d2-e4fa3e46109c