A method is proposed for decomposing a symbolic sequence into a set of consecutive, distinct, non-overlapping strings (words) of various lengths. Representing a sequence as a set of words allows set-theoretic notions to be applied to it. The main result is a new definition of similarity between any two sequences over a given alphabet; no prior sequence alignment is necessary. Two applications of the set of words are described. In the first, the similarity measure is used to prepare centroids for the K-means algorithm, which yields a high-performance grouping method for long DNA sequences. The second concerns the statistical analysis of word attributes. It is shown that the similarity, complexity, and correlation function of word attributes across sequences of digits of the fractional parts of some irrational numbers support the suggestion that these sequences are instances of a random sequence of decimal digits.
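The abstract does not give the decomposition rule or the similarity formula, so the following is only a minimal sketch under two stated assumptions: the words are produced by an LZ78-style incremental parse (each new word is the shortest prefix of the remaining sequence not yet in the set), and similarity is taken to be the Jaccard index of the two word sets. The names `decompose` and `word_similarity` are hypothetical, and the paper's actual definitions may differ.

```python
def decompose(seq):
    """Split seq into consecutive, distinct, non-overlapping words.

    Assumption: LZ78-style parsing. Returns (words, leftover), where
    leftover is a trailing fragment that repeats an earlier word and
    therefore cannot be added as a new distinct word.
    """
    words = []
    seen = set()
    current = ""
    for symbol in seq:
        current += symbol
        if current not in seen:
            # Shortest unseen prefix found: emit it as a new word.
            seen.add(current)
            words.append(current)
            current = ""
    return words, current


def word_similarity(seq_a, seq_b):
    """Jaccard index of the two word sets (an assumed similarity measure)."""
    set_a = set(decompose(seq_a)[0])
    set_b = set(decompose(seq_b)[0])
    union = set_a | set_b
    if not union:
        return 1.0  # two empty sequences are treated as identical
    return len(set_a & set_b) / len(union)


if __name__ == "__main__":
    print(decompose("AACGACGT"))
    print(word_similarity("AACGACGT", "ACGACGT"))
```

Because the comparison operates on word sets rather than on positions, no alignment of the two sequences is needed, which is the property the abstract emphasizes.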
2
In computational biology, developing algorithms that identify tandem repeats in DNA sequences is a challenging problem. Tandem repeat identification is helpful in gene annotation, forensics, and the study of human evolution. In this work, a signal processing algorithm based on the adaptive S-transform with a Kaiser window is proposed for detecting exact and approximate tandem repeats. Using the Kaiser window makes it possible to identify short as well as long tandem repeats. This more versatile algorithm thus overcomes the limitation of the earlier S-transform-based algorithm, which identified only microsatellites. The superiority of the algorithm is established through comparative simulation studies against other reported methods.
3
Artificial evolution methods are useful tools for designing drugs and proteins with desired properties. One application is the search for new proteins that have the required features. Because known artificial evolution methods, such as phage display or SELEX, have some disadvantages, a new approach to the artificial evolution experiment is being studied. A mathematical model for this approach is introduced, and interesting classes of efficient randomization patterns are defined. A corresponding algorithm for finding them is also presented. The model makes it possible to plan an artificial evolution experiment and makes the new approach efficient. It also leads to a new optimization problem, Efficient Randomization, for which an exact algorithm is described.