Article title
Authors
Identifiers
Title variants
Publication languages
Abstracts
The prospect of identifying contacts in protein structures purely from aligned protein sequences has lured researchers for a long time, but progress was modest until recently. Here, we reviewed the most successful methods for identifying structural contacts from sequence and how these methods differ, and made an initial assessment of the overlap between the contacts predicted by alternative approaches. We then discussed the limitations of these methods and highlighted recent applications of predicted contacts: tertiary structure prediction, identification of residues at the interfaces of protein-protein interactions, and disentangling alternative conformational states. Finally, we identified the current challenges in the field of contact prediction, concentrating on the limitations imposed by available data, dependencies on the sequence alignments, and possible future developments.
Journal
Year
Volume
Pages
243--254
Physical description
Bibliography: 78 items, figures, tables.
Contributors
author
- Department of Computer Science, University College London, London, UK
author
- Department of Computer Science, University College London, London, UK
author
- Department of Computer Science, University College London, London, UK
Bibliography
- 1. Anfinsen CB. Principles that govern the folding of protein chains. Science 1973;181:223-30.
- 2. Gromiha MM, Selvaraj S. Inter-residue interactions in protein folding and stability. Prog Biophys Mol Biol 2004;86:235-77.
- 3. Chothia C, Lesk AM. The relation between the divergence of sequence and structure in proteins. EMBO J 1986;5:823.
- 4. Rost B. Twilight zone of protein sequence alignments. Protein Eng 1999;12:85-94.
- 5. Vendruscolo M, Paci E, Dobson CM, Karplus M. Three key residues form a critical contact network in a protein folding transition state. Nature 2001;409:641-5.
- 6. Williams SG, Lovell SC. The effect of sequence evolution on protein structural divergence. Mol Biol Evol 2009;26:1055-65.
- 7. Poon A, Chao L. The rate of compensatory mutation in the DNA bacteriophage φX174. Genetics 2005;170:989-99.
- 8. Goh C-S, Bogan AA, Joachimiak M, Walther D, Cohen FE. Co-evolution of proteins with their interaction partners. J Mol Biol 2000;299:283-93.
- 9. Altschuh D, Lesk A, Bloomer A, Klug A. Correlation of co-ordinated amino acid substitutions with function in viruses related to tobacco mosaic virus. J Mol Biol 1987;193:693-707.
- 10. Göbel U, Sander C, Schneider R, Valencia A. Correlated mutations and residue contacts in proteins. Prot Struct Funct Bioinf 1994;18:309-17.
- 11. Vernet T, Tessier DC, Khouri HE, Altschuh D. Correlation of co-ordinated amino acid changes at the two-domain interface of cysteine proteases with protein stability. J Mol Biol 1992;224:501-9.
- 12. Neher E. How frequent are correlated changes in families of protein sequences? Proc Natl Acad Sci 1994;91:98-102.
- 13. Jones DT, Buchan DW, Cozzetto D, Pontil M. PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics 2012;28:184-90.
- 14. Lapedes AS, Giraud BG, Liu L, Stormo GD. Correlated mutations in models of protein sequences: phylogenetic and structural effects. Lecture Notes Monogr Ser 1999:236-56.
- 15. Pollock D, Taylor W. Effectiveness of correlation analysis in identifying protein residues undergoing correlated evolution. Protein Eng 1997;10:647-57.
- 16. Dunn SD, Wahl LM, Gloor GB. Mutual information without the influence of phylogeny or entropy dramatically improves residue contact prediction. Bioinformatics 2008;24:333-40.
- 17. Little DY, Chen L. Identification of coevolving residues and coevolution potentials emphasizing structure, bond formation and catalytic coordination in protein evolution. PLoS One 2009;4:e4762.
- 18. Gloor GB, Tyagi G, Abrassart DM, Kingston AJ, Fernandes AD, Dunn SD, et al. Functionally compensating coevolving positions are neither homoplasic nor conserved in clades. Mol Biol Evol 2010;27:1181-91.
- 19. Giraud B, Heumann JM, Lapedes AS. Superadditive correlation. Phys Rev E 1999;59:4983.
- 20. de Juan D, Pazos F, Valencia A. Emerging methods in protein co-evolution. Nat Rev Genet 2013;14:249-61.
- 21. Taylor WR, Hamilton RS, Sadowski MI. Prediction of contacts from correlated sequence substitutions. Curr Opin Struct Biol 2013;23:473-9.
- 22. Korber B, Farber RM, Wolpert DH, Lapedes AS. Covariation of mutations in the V3 loop of human immunodeficiency virus type 1 envelope protein: an information theoretic analysis. Proc Natl Acad Sci 1993;90:7176-80.
- 23. Shindyalov I, Kolchanov N, Sander C. Can three-dimensional contacts in protein structures be predicted by analysis of correlated mutations? Protein Eng 1994;7:349-58.
- 24. Taylor WR, Hatrick K. Compensating changes in protein multiple sequence alignments. Protein Eng 1994;7:341-8.
- 25. Benner SA, Gerloff D. Patterns of divergence in homologous proteins as indicators of secondary and tertiary structure: a prediction of the structure of the catalytic domain of protein kinases. Adv Enzyme Regul 1991;31:121-81.
- 26. Weigt M, White RA, Szurmant H, Hoch JA, Hwa T. Identification of direct residue contacts in protein-protein interaction by message passing. Proc Natl Acad Sci 2009;106:67-72.
- 27. Marks DS, Colwell LJ, Sheridan R, Hopf TA, Pagnani A, Zecchina R, et al. Protein 3D structure computed from evolutionary sequence variation. PLoS One 2011;6:e28766.
- 28. Morcos F, Pagnani A, Lunt B, Bertolino A, Marks DS, Sander C, et al. Direct-coupling analysis of residue coevolution captures native contacts across many protein families. Proc Natl Acad Sci 2011;108:E1293-301.
- 29. Feinauer C, Skwark MJ, Pagnani A, Aurell E. Improving contact prediction along three dimensions. PLoS Comput Biol 2014;10:e1003847.
- 30. Ekeberg M, Hartonen T, Aurell E. Fast pseudolikelihood maximization for direct-coupling analysis of protein structure from many homologous amino-acid sequences. J Comput Phys 2014;276:341-56.
- 31. Ekeberg M, Lovkvist C, Lan Y, Weigt M, Aurell E. Improved contact prediction in proteins: using pseudolikelihoods to infer Potts models. Phys Rev E 2013;87:012707.
- 32. Kamisetty H, Ovchinnikov S, Baker D. Assessing the utility of coevolution-based residue-residue contact predictions in a sequence-and structure-rich era. Proc Natl Acad Sci 2013;110:15674-9.
- 33. Andreatta M, Laplagne S, Li SC, Smale S. Prediction of residue-residue contacts from protein families using similarity kernels and least squares regularization. arXiv preprint arXiv:1311.1301, 2014.
- 34. Clark GW, Ackerman SH, Tillier ER, Gatti DL. Multidimensional mutual information methods for the analysis of covariation in multiple sequence alignments. BMC Bioinformatics 2014;15:157.
- 35. Baldassi C, Zamparo M, Feinauer C, Procaccini A, Zecchina R, Weigt M, et al. Fast and accurate multivariate Gaussian modeling of protein families: predicting residue contacts and protein-interaction partners. PLoS One 2014;9:e92721.
- 36. Friedman J, Hastie T, Tibshirani R. Sparse inverse covariance estimation with the graphical LASSO. Biostatistics 2008;9:432-41.
- 37. Banerjee O, El Ghaoui L, d'Aspremont A. Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. J Mach Learn Res 2008;9:485-516.
- 38. Kajan L, Hopf TA, Marks DS, Rost B. FreeContact: fast and free software for protein contact prediction from residue co-evolution. BMC Bioinformatics 2014;15:85.
- 39. Skwark MJ, Abdel-Rehim A, Elofsson A. PconsC: combination of direct information methods and alignments improves contact prediction. Bioinformatics 2013;29:1815-6.
- 40. Hopf TA, Colwell LJ, Sheridan R, Rost B, Sander C, Marks DS. Three-dimensional structures of membrane proteins from genomic sequencing. Cell 2012;149:1607-21.
- 41. Sulkowska JI, Morcos F, Weigt M, Hwa T, Onuchic JN. Genomics-aided structure prediction. Proc Natl Acad Sci 2012;109:10340-5.
- 42. Kosciolek T, Jones DT. De novo structure prediction of globular proteins aided by sequence variation-derived contacts. PLoS One 2014;9:e92197.
- 43. Jones DT. Predicting novel protein folds by using FRAGFOLD. Prot Struct Funct Bioinf 2001;45:127-32.
- 44. Brunger AT, Adams PD, Clore GM, DeLano WL, Gros P, Grosse-Kunstleve RW, et al. Crystallography & NMR system: a new software suite for macromolecular structure determination. Acta Crystallogr Sect D Biol Crystallogr 1998;54:905-21.
- 45. Brunger AT. Version 1.2 of the crystallography and NMR system. Nat Prot 2007;2:2728-33.
- 46. Vendruscolo M, Kussell E, Domany E. Recovery of protein structure from contact maps. Folding Des 1997;2:295-306.
- 47. Duarte JM, Sathyapriya R, Stehr H, Filippis I, Lappe M. Optimal contact definition for reconstruction of contact maps. BMC Bioinformatics 2010;11:283.
- 48. Kim DE, DiMaio F, Yu-Ruei Wang R, Song Y, Baker D. One contact for every twelve residues allows robust and accurate topology-level protein structure modeling. Prot Struct Funct Bioinf 2014;82:208-18.
- 49. Konopka BM, Ciombor M, Kurczynska M, Kotulska M. Automated procedure for contact-map-based protein structure reconstruction. J Membr Biol 2014;247:409-20.
- 50. Taylor TJ, Bai H, Tai CH, Lee B. Assessment of CASP10 contact-assisted predictions. Prot Struct Funct Bioinf 2014;82:84-97.
- 51. Tress ML, Valencia A. Predicted residue-residue contacts can help the scoring of 3D models. Prot Struct Funct Bioinf 2010;78:1980-91.
- 52. Taylor WR, Jones DT, Sadowski MI. Protein topology from predicted residue contacts. Prot Sci 2012;21:299-305.
- 53. Savojardo C, Fariselli P, Martelli PL, Casadio R. BCov: a method for predicting β-sheet topology using sparse inverse covariance estimation and integer programming. Bioinformatics 2013:btt555.
- 54. Sadowski MI. Prediction of protein domain boundaries from inverse covariances. Prot Struct Funct Bioinf 2013;81:253-60.
- 55. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat T, Weissig H, et al. The protein data bank. Nucleic Acids Res 2000;28:235-42.
- 56. Janin J, Bahadur RP, Chakrabarti P. Protein-protein interaction and quaternary structure. Q Rev Biophys 2008;41:133-80.
- 57. Hopf TA, Scharfe CP, Rodrigues JP, Green AG, Sander C, Bonvin AM, et al. Sequence co-evolution gives 3D contacts and structures of protein complexes. eLife 2014.
- 58. Pazos F, Helmer-Citterich M, Ausiello G, Valencia A. Correlated mutations contain information about protein-protein interaction. J Mol Biol 1997;271:511-23.
- 59. Shih ES, Hwang MJ. On the use of distance constraints in protein-protein docking computations. Prot Struct Funct Bioinf 2012;80:194-205.
- 60. Stock AM, Robinson VL, Goudreau PN. Two-component signal transduction. Annu Rev Biochem 2000;69:183-215.
- 61. Cheng RR, Morcos F, Levine H, Onuchic JN. Toward rationally redesigning bacterial two-component signaling systems using coevolutionary information. Proc Natl Acad Sci 2014;111:E563-71.
- 62. Ovchinnikov S, Kamisetty H, Baker D. Robust and accurate prediction of residue-residue interactions across protein interfaces using evolutionary information. eLife 2014;3.
- 63. Butland G, Peregrin-Alvarez JM, Li J, Yang W, Yang X, Canadien V, et al. Interaction network containing conserved and essential protein complexes in Escherichia coli. Nature 2005;433:531-7.
- 64. Jeon J, Nam H-J, Choi YS, Yang J-S, Hwang J, Kim S. Molecular evolution of protein conformational changes revealed by a network of evolutionarily coupled residues. Mol Biol Evol 2011;28:2675-85.
- 65. Jana B, Morcos F, Onuchic JN. From structure to function: the convergence of structure based models and co-evolutionary information. Phys Chem Chem Phys 2014;16:6496-507.
- 66. Morcos F, Jana B, Hwa T, Onuchic JN. Coevolutionary signals across protein lineages help capture multiple protein conformations. Proc Natl Acad Sci 2013;110:20533-8.
- 67. Martin L, Gloor GB, Dunn S, Wahl LM. Using information theory to search for co-evolving residues in proteins. Bioinformatics 2005;21:4116-24.
- 68. Henikoff S, Henikoff JG. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci 1992;89:10915-9.
- 69. Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, et al. Pfam: the protein families database. Nucleic Acids Res 2013:gkt1223.
- 70. Remmert M, Biegert A, Hauser A, Söding J. HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat Methods 2012;9:173-5.
- 71. Eddy SR. Profile hidden Markov models. Bioinformatics 1998;14:755-63.
- 72. Ma J, Wang S, Wang Z, Xu J. MRFalign: protein homology detection through alignment of Markov random fields. PLoS Comput Biol 2014;10:e1003500.
- 73. Gonzalez MW, Pearson WR. Homologous over-extension: a challenge for iterative similarity searches. Nucleic Acids Res 2010;38:2177-89.
- 74. Seemayer S, Gruber M, Söding J. CCMpred - fast and precise prediction of protein residue-residue contacts from correlated mutations. Bioinformatics 2014;30:3128-30.
- 75. Nugent T, Jones DT. Accurate de novo structure prediction of large transmembrane protein domains using fragment-assembly and correlated mutation analysis. Proc Natl Acad Sci 2012;109:E1540-7.
- 76. Michel M, Hayat S, Skwark MJ, Sander C, Marks DS, Elofsson A. PconsFold: improved contact predictions improve protein models. Bioinformatics 2014;30:i482-8.
- 77. Maynard Smith J. Natural selection and the concept of a protein space. Nature 1970;225:563-4.
- 78. Romero PA, Arnold FH. Exploring protein fitness landscapes by directed evolution. Nat Rev Mol Cell Biol 2009;10:866-76.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-b23f0a68-cf03-40f0-98b8-6b5f4e5a6407