Article title

Is the growth rate of Protein Data Bank sufficient to solve the protein structure prediction problem using template-based modeling?

Authors
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
The Protein Data Bank (PDB) undergoes an exponential expansion in terms of the number of macromolecular structures deposited every year. A pivotal question is how this rapid growth of structural information improves the quality of three-dimensional models constructed by contemporary bioinformatics approaches. To address this problem, we performed a retrospective analysis of the structural coverage of a representative set of proteins using remote homology detected by COMPASS and HHpred. We show that the number of proteins whose structures can be confidently predicted increased during a 9-year period between 2005 and 2014 on account of the PDB growth alone. Nevertheless, this encouraging trend slowed down noticeably around the year 2008 and has yielded insignificant improvements ever since. At the current pace, it is unlikely that the protein structure prediction problem will be solved in the near future using existing template-based modeling techniques. Therefore, further advances in experimental structure determination, qualitatively better approaches in fold recognition, and more accurate template-free structure prediction methods are desperately needed.
Year
Pages
1–7
Physical description
Bibliography: 56 items; charts.
Contributors
  • Department of Biological Sciences, 202 Life Sciences Bldg., Louisiana State University, Baton Rouge, LA 70803, USA
  • Center for Computation and Technology, 2054 Digital Media Center, Louisiana State University, Baton Rouge, LA 70803, USA
Bibliography
  • 1. Pauling L. Modern structural chemistry. Nobel Lecture: December 11, 1954.
  • 2. Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, et al. RefSeq: an update on mammalian reference sequences. Nucleic Acids Res 2014;42:D756–63.
  • 3. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, et al. The Protein Data Bank. Nucleic Acids Res 2000;28:235–42.
  • 4. Guo JT, Ellrott K, Xu Y. A historical perspective of template-based protein structure prediction. Methods Mol Biol 2008;413:3–42.
  • 5. Dorn M, E Silva MB, Buriol LS, Lamb LC. Three-dimensional protein structure prediction: methods and computational strategies. Comput Biol Chem 2014;53PB:251–76.
  • 6. Honig B. Protein folding: from the Levinthal paradox to structure prediction. J Mol Biol 1999;293:283–93.
  • 7. Onuchic JN, Wolynes PG. Theory of protein folding. Curr Opin Struct Biol 2004;14:70–5.
  • 8. Zhang J, Li W, Wang J, Qin M, Wu L, Yan Z, et al. Protein folding simulations: from coarse-grained model to all-atom model. IUBMB Life 2009;61:627–43.
  • 9. Kryshtafovych A, Fidelis K, Moult J. CASP10 results compared to those of previous CASP experiments. Proteins 2014;82:Suppl 2:164–74.
  • 10. Ben-David M, Noivirt-Brik O, Paz A, Prilusky J, Sussman JL, Levy Y. Assessment of CASP8 structure predictions for template free targets. Proteins 2009;77:Suppl 9:50–65.
  • 11. Kinch L, Yong Shi S, Cong Q, Cheng H, Liao Y, Grishin NV. CASP9 assessment of free modeling target predictions. Proteins 2011;79:Suppl 10:59–73.
  • 12. Tai CH, Bai H, Taylor TJ, Lee B. Assessment of template-free modeling in CASP10 and ROLL. Proteins 2014;82:Suppl 2:57–83.
  • 13. Cozzetto D, Kryshtafovych A, Fidelis K, Moult J, Rost B, Tramontano A. Evaluation of template-based models in CASP8 with standard measures. Proteins 2009;77:Suppl 9:18–28.
  • 14. Huang YJ, Mao B, Aramini JM, Montelione GT. Assessment of template-based protein structure predictions in CASP10. Proteins 2014;82:Suppl 2:43–56.
  • 15. Mariani V, Kiefer F, Schmidt T, Haas J, Schwede T. Assessment of template based protein structure predictions in CASP9. Proteins 2011;79:Suppl 10:37–58.
  • 16. Ginalski K. Comparative modeling for protein structure prediction. Curr Opin Struct Biol 2006;16:172–7.
  • 17. Lushington GH. Comparative modeling of proteins. Methods Mol Biol 2015;1215:309–30.
  • 18. Qu X, Swanson R, Day R, Tsai J. A guide to template based structure prediction. Curr Protein Pept Sci 2009;10:270–85.
  • 19. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol 1990;215:403–10.
  • 20. Boratyn GM, Schaffer AA, Agarwala R, Altschul SF, Lipman DJ, Madden TL. Domain enhanced lookup time accelerated BLAST. Biol Direct 2012;7:12.
  • 21. Biegert A, Soding J. Sequence context-specific profiles for homology searching. Proc Natl Acad Sci USA 2009;106:3770–5.
  • 22. Pearson WR, Lipman DJ. Improved tools for biological sequence comparison. Proc Natl Acad Sci USA 1988;85:2444–8.
  • 23. Rost B. Twilight zone of protein sequence alignments. Protein Eng 1999;12:85–94.
  • 24. Jones DT, Taylor WR, Thornton JM. A new approach to protein fold recognition. Nature 1992;358:86–9.
  • 25. Joseph AP, de Brevern AG. From local structure to a global framework: recognition of protein folds. J R Soc Interface 2014;11:20131147.
  • 26. Koonin EV, Wolf YI, Aravind L. Protein fold recognition using sequence profiles and its application in structural genomics. Adv Protein Chem 2000;54:245–75.
  • 27. Bennett-Lovsey RM, Herbert AD, Sternberg MJ, Kelley LA. Exploring the extremes of sequence/structure space with ensemble fold recognition in the program Phyre. Proteins 2008;70:611–25.
  • 28. Peng J, Xu J. Low-homology protein threading. Bioinformatics 2010;26:i294–300.
  • 29. Wu S, Zhang Y. MUSTER: improving protein sequence profile-profile alignments by using multiple sources of structure information. Proteins 2008;72:547–56.
  • 30. Xu J, Li M, Kim D, Xu Y. RAPTOR: optimal protein threading by linear programming. J Bioinform Comput Biol 2003;1:95–117.
  • 31. Yang Y, Faraggi E, Zhao H, Zhou Y. Improving protein fold recognition and template-based modeling by employing probabilistic-based matching between predicted one-dimensional structural properties of query and corresponding native properties of templates. Bioinformatics 2011;27:2076–82.
  • 32. Brylinski M, Lingam D. eThread: a highly optimized machine learning-based approach to meta-threading and the modeling of protein tertiary structures. PLoS One 2012;7:e50200.
  • 33. Wu S, Zhang Y. LOMETS: a local meta-threading-server for protein structure prediction. Nucleic Acids Res 2007;35:3375–82.
  • 34. Hillisch A, Pineda LF, Hilgenfeld R. Utility of homology models in the drug discovery process. Drug Discov Today 2004;9:659–69.
  • 35. Liu T, Tang GW, Capriotti E. Comparative modeling: the state of the art and protein drug target structure prediction. Comb Chem High Throughput Screen 2011;14:532–47.
  • 36. Takeda-Shitaka M, Takaya D, Chiba C, Tanaka H, Umeyama H. Protein structure prediction in structure based drug design. Curr Med Chem 2004;11:551–8.
  • 37. Zhang Y. Protein structure prediction: when is it useful? Curr Opin Struct Biol 2009;19:145–55.
  • 38. Brylinski M. Nonlinear scoring functions for similarity-based ligand docking and binding affinity prediction. J Chem Inf Model 2013;53:3097–112.
  • 39. Brylinski M. eMatchSite: sequence order-independent structure alignments of ligand binding pockets in protein models. PLoS Comput Biol 2014;10:e1003829.
  • 40. Skolnick J, Zhou H, Brylinski M. Further evidence for the likely completeness of the library of solved single domain protein structures. J Phys Chem B 2012;116:6654–64.
  • 41. Zhang Y, Hubner IA, Arakaki AK, Shakhnovich E, Skolnick J. On the origin and highly likely completeness of single-domain protein structures. Proc Natl Acad Sci USA 2006;103:2605–10.
  • 42. Zhang Y, Skolnick J. The protein structure prediction problem could be solved using the current PDB library. Proc Natl Acad Sci USA 2005;102:1029–34.
  • 43. O’Donovan C, Martin MJ, Gattiker A, Gasteiger E, Bairoch A, Apweiler R. High-quality protein knowledge resource: SWISS-PROT and TrEMBL. Brief Bioinform 2002;3:275–84.
  • 44. Vitkup D, Melamud E, Moult J, Sander C. Completeness in structural genomics. Nat Struct Biol 2001;8:559–66.
  • 45. Yan Y, Moult J. Protein family clustering for structural genomics. J Mol Biol 2005;353:744–59.
  • 46. Grabowski M, Joachimiak A, Otwinowski Z, Minor W. Structural genomics: keeping up with expanding knowledge of the protein universe. Curr Opin Struct Biol 2007;17:347–53.
  • 47. Sadreyev R, Grishin N. COMPASS: a tool for comparison of multiple protein alignments with assessment of statistical significance. J Mol Biol 2003;326:317–36.
  • 48. Soding J. Protein homology detection by HMM-HMM comparison. Bioinformatics 2005;21:951–60.
  • 49. Berman HM, Battistuz T, Bhat TN, Bluhm WF, Bourne PE, Burkhardt K, et al. The Protein Data Bank. Acta Crystallogr D Biol Crystallogr 2002;58:899–907.
  • 50. Berman HM, Kleywegt GJ, Nakamura H, Markley JL. How community has shaped the Protein Data Bank. Structure 2013;21:1485–91.
  • 51. Campbell ID. Timeline: the march of structural biology. Nat Rev Mol Cell Biol 2002;3:377–81.
  • 52. Li W, Godzik A. CD-HIT: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 2006;22:1658–9.
  • 53. Pandit SB, Skolnick J. Fr-TM-align: a new protein structural alignment method based on fragment alignments and the TM-score. BMC Bioinformatics 2008;9:531.
  • 54. Zhang Y, Skolnick J. Scoring function for automated assessment of protein structure template quality. Proteins 2004;57:702–10.
  • 55. Cormen TH, Leiserson CE, Rivest RL, Stein C. Greedy algorithms. Introduction to algorithms. MIT Press, 1990:414.
  • 56. Xu J, Zhang Y. How significant is a protein structure similarity with TM-score = 0.5? Bioinformatics 2010;26:889–95.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-5ba28e41-40a0-47fa-9ea6-b016dffed91c