A declarative query language for protein secondary structures

Wieczorek, D.; Małysiak-Mrozek, B.; Kozielski, S.; Mrozek, D.

Abstract

Searching proteins by their secondary structures provides a fast, approximate way to identify molecules with a similar fold. Because existing database management systems offer no integrated methods for querying protein structures, structural similarity searches are usually performed with external tools. This often lengthens processing time and requires additional steps, such as adapting input and output data formats. In this paper, we present an extension of the SQL language that allows a database to be searched for proteins whose secondary structures are similar to a structural pattern specified by the user. The presented query language is integrated with the relational database management system and simplifies the manipulation of biological data.

Keywords

proteins; secondary structure; protein similarity; query language

Publisher

University of Silesia, Institute of Informatics, Computer Systems Department

Journal

Journal of Medical Informatics & Technologies

Year

2010

Volume

Vol. 16

Pages

139–148

Physical description

Bibliography: 21 items, figures.

Authors

author

Wieczorek D.

Institute of Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland

author

Małysiak-Mrozek B.

author

Kozielski S.

author

Mrozek D.

Bibliography

[1] EIDHAMMER I., JONASSEN I., TAYLOR W.R., Protein Bioinformatics: An algorithmic approach to sequence and structure analysis, John Wiley & Sons, 2004.
[2] ALLEN J.P., Biophysical chemistry, Wiley-Blackwell, 2008.
[3] BRANDEN C., TOOZE J., Introduction to protein structure, Garland, 1991.
[4] DICKERSON R.E., GEIS I., The structure and action of proteins, 2nd ed. Benjamin/Cummings, Redwood City, Calif. Concise, 1981.
[5] CREIGHTON T.E., Proteins: Structures and molecular properties, 2nd ed. Freeman, San Francisco, 1993.
[6] GIBRAT J.F., MADEJ T., BRYANT S.H., Surprising similarities in structure comparison, Curr Opin Struct Biol, Vol. 6(3), 1996, pp. 377–385.
[7] SHAPIRO J., BRUTLAG D., FoldMiner and LOCK 2: protein structure comparison and motif discovery on the web, Nucleic Acids Res., Vol. 32, 2004, pp. 536–41.
[8] CAN T., WANG Y.F., CTSS: a robust and efficient method for protein structure alignment based on local geometrical and biological features, Proc. 2003 IEEE Bioinformatics Conf., 2003, pp. 169–179.
[9] YANG J., Comprehensive description of protein structures using protein folding shape code, Proteins, Vol. 71(3), 2008, pp. 1497–518.
[10] BERMAN H.M., WESTBROOK J., FENG Z., GILLILAND G., BHAT T.N., WEISSIG H., et al., The Protein Data Bank, Nucleic Acids Res., Vol. 28, 2000, pp. 235–242.
[11] DATE C.J., Introduction to database systems, (8th Edition), Addison Wesley, 2003.
[12] STEPHENS S., CHEN J.Y., THOMAS S., ODM BLAST: Sequence homology search in the RDBMS, Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 2004.
[13] HAMMEL L., PATEL J.M., Searching on the secondary structure of protein sequences, Proc. 28th Int. Conf. on Very Large Data Bases, Hong Kong, China, 2002, pp. 634–645.
[14] TATA S., PATEL J.M., FRIEDMAN J.S., SWAROOP A., Declarative querying for biological sequences, Proc. 22nd Int. Conf. on Data Engineering, IEEE Computer Society, 2006, pp. 87–98.
[15] WANG Y., SUNDERRAMAN R., TIAN H., A domain specific data management architecture for protein structure data, Proc. 28th IEEE EMBS Annual Int. Conf., New York City, USA, 2006, pp. 5751–5754.
[16] MURZIN A.G., BRENNER S.E., HUBBARD T., CHOTHIA C., SCOP: A structural classification of proteins database for the investigation of sequences and structures, J. Mol. Biol. Vol. 247, 1995, pp. 536–540.
[17] ORENGO C.A., MICHIE A.D., JONES S., et al., CATH – A hierarchic classification of protein domain structures, Structure, Vol. 5. No 8., 1997, pp. 1093–1108.
[18] SMITH T.F., WATERMAN M.S., Identification of common molecular subsequences, J Mol Biol, Vol. 147, 1981, pp. 195–197.
[19] APWEILER R., BAIROCH A., WU C.H., BARKER W.C., et al., UniProt: the Universal Protein knowledgebase, Nucleic Acids Res. Vol. 32 (Database issue), 2004, pp. 115–119.
[20] FRISHMAN D., ARGOS P., Incorporation of non-local interactions in protein secondary structure prediction from the amino acid sequence, Protein Eng, Vol. 9(2), 1996, pp. 133–142.
[21] WIECZOREK D., MAŁYSIAK-MROZEK B., KOZIELSKI S., MROZEK D., A method for matching sequences of protein secondary structures, Journal of Medical Informatics & Technologies, October 2010 (to be published).

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-PWA4-0018-0018