The growing number of known RNA 3D structures contributes to the recognition of RNA families and the identification of their features. These tasks rely on the analysis of RNA conformations at different levels of detail. In addition, knowledge of native nucleotide conformations is crucial for structure prediction and for understanding RNA folding. However, this knowledge is scattered across structural databases. Therefore, only automated methods for sampling the space of RNA structures can reveal plausible conformational representatives useful for further analysis. Here, we present a machine learning-based approach to inspect the dataset of RNA three-dimensional structures and to create a library of nucleotide conformers. A median neural gas algorithm is applied to cluster nucleotide structures based on their trigonometric description. The clustering procedure has two stages: (i) backbone-driven and (ii) ribose-driven. We present the resulting library, which contains RNA nucleotide representatives over the entire dataset, and we evaluate its quality by computing normal distribution measures and the average RMSD between the data points and the prototype within each cluster.
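The clustering idea in this abstract can be illustrated with a minimal sketch of batch median neural gas, in which every prototype is restricted to be one of the data points. The details here are assumptions rather than the authors' implementation: torsion angles are embedded as (cos, sin) pairs as one possible "trigonometric description", prototypes are seeded by a farthest-point rule to keep the run deterministic, and the names `embed`, `median_neural_gas` and `assign` are hypothetical. The two-stage backbone and ribose procedure is not reproduced.

```python
import math


def embed(angles_deg):
    """Map torsion angles (degrees) to (cos, sin) pairs so 359 and 1 end up close."""
    vec = []
    for a in angles_deg:
        r = math.radians(a)
        vec.extend((math.cos(r), math.sin(r)))
    return vec


def dist(u, v):
    """Squared Euclidean distance between two embedded points."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def median_neural_gas(points, n_prototypes, epochs=30):
    """Return indices of the data points chosen as prototypes."""
    n = len(points)
    d = [[dist(p, q) for q in points] for p in points]
    # Assumption: farthest-point seeding instead of random initialisation.
    protos = [0]
    while len(protos) < n_prototypes:
        protos.append(max(range(n), key=lambda l: min(d[l][p] for p in protos)))
    lam0, lam_end = n_prototypes / 2.0, 0.01
    for t in range(epochs):
        # Neighbourhood range shrinks exponentially from lam0 to lam_end.
        lam = lam0 * (lam_end / lam0) ** (t / max(1, epochs - 1))
        # weights[i][j] = exp(-rank / lam), rank of prototype i for point j.
        weights = [[0.0] * n for _ in protos]
        for j in range(n):
            order = sorted(range(n_prototypes), key=lambda i: d[protos[i]][j])
            for rank, i in enumerate(order):
                weights[i][j] = math.exp(-rank / lam)
        # Each prototype moves to the data point with minimal weighted distance.
        protos = [
            min(range(n), key=lambda l: sum(w[j] * d[l][j] for j in range(n)))
            for w in weights
        ]
    return protos


def assign(points, protos):
    """Label every point with the position of its nearest prototype."""
    return [
        min(range(len(protos)), key=lambda i: dist(p, points[protos[i]]))
        for p in points
    ]
```

Because prototypes are always actual data points, each one can be read directly as a representative conformer. Per-cluster quality measures such as the average distance to the prototype can then be computed from the labels returned by `assign`.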
2
Determination of the native folded structure of a particular protein is a milestone towards understanding its function and, in most cases, can be done experimentally. However, the ability to predict protein structure and related features in silico would represent a fundamental breakthrough in structural biology. Predicting domains in proteins is among the most important tasks for effective functional classification, homology-based structure prediction, and structural genomics, as it makes function prediction easier. In this paper, we present DomAnS, a protein domain prediction approach based on pattern alignment. DomAnS allows rapid screening for potential domain regions and can recognize the most promising regions where domains might exist. Combining the DomAnS algorithm with specialized databases that contain all known domains allows us to find domain regions without solving the 3D structure. Our approach has been tested on CASP7 data, and for 28 targets it gave the best overall score.
3
Motivation: Proteins are the main building blocks of life. They catalyze biological processes in living cells to sustain life and improve metabolism. They also act as biological scaffolds and are the cell's workhorses. Knowing their function is therefore one of the most important milestones for understanding life. The function depends on the tertiary structure of the protein, but the structure is known for only a fraction of the amino acid sequences gathered in databases. Thus, the creation of efficient and accurate methods that predict function from sequences, based on already known function-sequence assignments, is a fundamental challenge in computational biology. Results: First, we present a detailed analysis of the usability of similarity search engines in the context of function prediction. Then we propose a simple and effective method for assigning function to sequences based on the results of similarity searches and information gathered from gene ontology annotation graphs. Availability: All data used for the analysis presented in this paper, as well as the raw results, are available at: http://bio.cs.put.poznan.pl/funcpred/data/ Supplementary Material: Supplementary materials with additional charts are available at: http://bio.es.put.poznan.pl/funcpred/suplement/ Contact: protbio@cs.put.poznan.pl
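The abstract does not spell out how similarity hits and the gene ontology graph are combined, so the following is only an illustrative sketch of one plausible scheme, not the authors' method. It assumes each hit carries a positive similarity score, propagates every hit's annotations to all ancestor terms in the ontology graph, and keeps terms whose share of the total score reaches a threshold. The names `ancestors` and `predict_terms`, the threshold value, and the term and protein identifiers are all hypothetical.

```python
from collections import defaultdict


def ancestors(term, parents):
    """Return the term together with all its ancestors in a child -> parents DAG."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, ()))
    return seen


def predict_terms(hits, annotations, parents, threshold=0.5):
    """Score ontology terms by the share of hit score that supports them.

    hits: list of (subject_id, score) pairs from a similarity search.
    annotations: subject_id -> iterable of directly annotated terms.
    parents: term -> iterable of parent terms.
    """
    total = sum(score for _, score in hits)
    if total <= 0:
        return {}
    support = defaultdict(float)
    for subject, score in hits:
        # Collect each term once per hit, including all ancestors.
        terms = set()
        for t in annotations.get(subject, ()):
            terms |= ancestors(t, parents)
        for t in terms:
            support[t] += score / total
    return {t: s for t, s in support.items() if s >= threshold}


# Hypothetical toy ontology: two specific terms under one root.
parents = {"T:kinase": ["T:root"], "T:transport": ["T:root"]}
annotations = {"P1": ["T:kinase"], "P2": ["T:transport"], "P3": ["T:kinase"]}
hits = [("P1", 50.0), ("P2", 30.0), ("P3", 20.0)]
predicted = predict_terms(hits, annotations, parents)
```

Propagating to ancestors means a general term can be predicted even when the hits disagree on the specific one, which is one way the graph structure can add information beyond the raw similarity search.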