Article title

Towards Efficient Searching on the Secondary Structure of Protein Sequences

Authors
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Approximate searching on the primary structure (i.e., amino acid arrangement) of protein sequences is an essential part of predicting the functions and evolutionary histories of proteins. However, because evolutionarily distant proteins do not conserve their amino acid residue arrangements, approximate searching on proteins' secondary structure is important for detecting distant homology. In this paper, we propose an indexing scheme for efficient approximate searching on the secondary structure of protein sequences that can be easily implemented in an RDBMS. Exploiting the concepts of clustering and lookahead, the proposed indexing scheme processes three types of secondary structure queries (exact match, range match, and wildcard match) very quickly. To evaluate the performance of the proposed method, we conducted extensive experiments using a set of actual protein sequences. According to the experimental results, the proposed method is faster than existing indexing methods by up to 6.3 times in exact match, 3.3 times in range match, and 1.5 times in wildcard match.
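The three query types named in the abstract can be illustrated with a small sketch. It is not the paper's indexing scheme (no clustering, lookahead, or RDBMS index is involved); it is a brute-force scan under assumptions that this record does not state: a secondary structure is written as a string over the letters `h`, `e`, and `l`, it is run-length encoded into segments, and a query is a list of hypothetical `(type, min_len, max_len)` predicates in which the type `?` stands for any segment type. The names `to_segments` and `matches` are made up for this illustration, and the wildcard is simplified to match exactly one segment.

```python
from itertools import groupby

def to_segments(ss):
    """Run-length encode a structure string, e.g. 'hhhee' -> [('h', 3), ('e', 2)]."""
    return [(t, len(list(run))) for t, run in groupby(ss)]

def matches(segments, query):
    """Return the start indices at which consecutive segments satisfy every predicate.

    Each predicate is (type, min_len, max_len); type '?' accepts any segment type
    (an assumed, simplified wildcard that covers exactly one segment).
    """
    hits = []
    for start in range(len(segments) - len(query) + 1):
        window = segments[start:start + len(query)]
        if all((qt == "?" or qt == t) and lo <= n <= hi
               for (t, n), (qt, lo, hi) in zip(window, query)):
            hits.append(start)
    return hits

segments = to_segments("hhhheeelllhhhhhh")
exact = matches(segments, [("h", 4, 4), ("e", 3, 3)])                    # fixed lengths
ranged = matches(segments, [("h", 3, 8)])                                # length interval
wildcard = matches(segments, [("e", 3, 3), ("?", 1, 5), ("h", 5, 10)])   # any type in the middle
```

An exact match here is simply a range match whose bounds coincide, which is why one predicate form can cover all three query types in this sketch.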
Publisher
Year
Pages
525--542
Physical description
Bibliography: 20 items, charts.
Contributors
author
author
author
  • Department of Computer Science, Yonsei University, 134 Sinchon-dong, Seodaemun-gu, Seoul 120-749, Korea, mkseo@cs.yonsei.ac.kr
Bibliography
  • [1] B. Alberts, D. Bray, J. Lewis, M. Raff, K. Roberts, and J. D. Watson (3rd), Molecular Biology of the Cell (Garland Publishing Inc., 1994).
  • [2] S. F. Altschul, T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman, Gapped BLAST and PSI-BLAST: A New Generation of Protein Database Search Programs, Nucleic Acids Research 25(17) (1997) 3389-3402.
  • [3] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, Basic Local Alignment Search Tool, Journal of Molecular Biology (1990) 403-410.
  • [4] Z. Aung, W. Fu, and K.-L. Tan, An Efficient Index-based Protein Structure Database Searching Method, Proc. IEEE DASFAA Conf. (2003) 311-318.
  • [5] O. Camoglu, T. Kahveci, and A. K. Singh, Towards Index-based Similarity Search for Protein Structure Databases, Proc. IEEE Computer Society Bioinformatics Conf. (2003) 148-158.
  • [6] I. Eidhammer and I. Jonassen, Protein Structure Comparison and Structure Patterns - An Algorithmic Approach, ISMB tutorial (2001).
  • [7] C. Fondrat and P. Dessen, A Rapid Access Motif Database (RAMdb) with a Searching Algorithm for the Retrieval Patterns in Nucleic Acids or Protein Databanks, Computer Applications in the Biosciences 11(3) (1995) 273-279.
  • [8] D. Frishman and P. Argos, Seventy-five Percent Accuracy in Protein Secondary Structure Prediction, Proteins 27(3) (1997) 329-335.
  • [9] D. Frishman and P. Argos, Incorporation of Long-Distance Interactions into a Secondary Structure Prediction Algorithm, Protein Engineering 9(2) (1996) 133-142.
  • [10] J. F. Gibrat, T. Madej, and S. H. Bryant, Surprising Similarities in Structure Comparison, Current Opinion in Structural Biology 6(3) (1996) 377-385.
  • [11] L. Hammel and J. M. Patel, Searching on the Secondary Structure of Protein Sequences, Proc. VLDB Conf. (2002) 634-645.
  • [12] L. Holm and C. Sander, Protein Structure Comparison by Alignment of Distance Matrices, Journal of Molecular Biology 233(1) (1993) 123-138.
  • [13] E. Hunt, M. P. Atkinson, and R. W. Irving, Database Indexing for Large DNA and Protein Sequence Collections, The VLDB Journal 11(3) (2002) 256-271.
  • [14] P. Koehl, Protein Structure Similarities, Current Opinion in Structural Biology 11(3) (2001) 348-353.
  • [15] D. W. Mount, Bioinformatics (Cold Spring Harbor Laboratory Press, 2000).
  • [16] A. Pastore and A. Lesk, Comparison of Globins and Phycocyanins: Evidence for Evolutionary Relationship, Proteins: Struct., Func., Gen. 8(2) (1990) 133-155.
  • [17] G. A. Stephen, String Searching Algorithms (World Scientific Publishing, 1994).
  • [18] H. Wang, C.-S. Perng, W. Fan, S. Park, and P. S. Yu, Indexing Weighted Sequences in Large Databases, Proc. IEEE ICDE Conf. (2003) 63-74.
  • [19] H. E. Williams, Genomic Information Retrieval, Proc. Australasian Database Conf. (2003) 27-35.
  • [20] C. H. Wu, L.-S. L. Yeh, H. Huang, L. Arminski, J. Castro-Alvear, Y. Chen, Z. Hu, P. Kourtesis, R. S. Ledley, B. E. Suzek, C. R. Vinayaka, J. Zhang, and W. C. Barker, The Protein Information Resource, Nucleic Acids Research 31(1) (2003) 345-347.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BUS5-0010-0042