The parsing of a symbolic sequence into a set of short substrings called words invented by the author is used for a new definition of the distance between sequences. No sequence alignment is necessary. The most frequent among spectra of multiprotein sequences are selected and considered as a reference spectrum of the sequences. The distance between the reference spectrum and protein sequences is considered as the estimation of the evolutionary distance of the protein. As an application, amino acid sequences of the several mitochondria-encoded proteins of mammal species are ordered according to their evolutionary distance. Statistical distribution of the distances between exhibits some structures related to the evolutionary rate in the past.
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.