2013 | Vol. 19, No. 2 | 99--105
Article title

A Representative Set Method for Symbolic Sequence Clustering

Authors
Title variants
Publication languages
EN
Abstracts
EN
Decomposing a sequence into a set of consecutive, distinct subsequences is crucial for symbolic sequence analysis. It significantly reduces the reference base of the recorded sequence for later retrieval and makes possible original similarity and membership measures for sequences. These measures are the starting point for a new algorithm that clusters sequences into groups of similar individuals. Algorithms that use the concept of a representative set achieve relatively good clustering results. In contrast to the representative sets used in other applications, the one introduced here is precisely and uniquely defined.
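The abstract does not give the decomposition rule or the similarity formula, so the following is only a minimal sketch under stated assumptions: the decomposition is taken to be an LZ78-style incremental parsing (each new phrase is the shortest prefix of the remaining sequence not yet seen), and the similarity is taken to be the Dice coefficient on the resulting phrase sets, chosen because Dice [14] appears in the bibliography. The function names `decompose` and `dice_similarity` are hypothetical and are not taken from the paper.

```python
def decompose(seq):
    """Split seq into consecutive, pairwise distinct phrases.

    Assumption: LZ78-style parsing. Each phrase is the shortest prefix
    of the remaining input that has not already been emitted.
    """
    seen = set()
    phrases = []
    current = ""
    for symbol in seq:
        current += symbol
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = ""
    # A non-empty remainder here repeats an earlier phrase, so it adds
    # nothing to the set of distinct phrases and is dropped.
    return phrases


def dice_similarity(a, b):
    """Dice coefficient between the phrase sets of two sequences.

    Assumption: this stands in for the paper's similarity measure,
    whose exact definition is not given in the abstract.
    """
    set_a, set_b = set(decompose(a)), set(decompose(b))
    if not set_a and not set_b:
        return 1.0  # two empty sequences are treated as identical
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


if __name__ == "__main__":
    print(decompose("AACGACGA"))
    print(dice_similarity("AACGACGA", "AACGTTGA"))
```

A pairwise similarity of this kind could then feed any standard agglomerative clustering procedure; which linkage and stopping rule the paper actually uses is not stated in this record.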
Publisher

Year
Pages
99--105
Physical description
Bibliography: 21 items, figures.
Contributors
Bibliography
  • [1] M. Randić, S.C. Basak, Characterization of DNA Primary Sequences Based on the Average Distances between Bases, J. Chem. Inf. Comput. Sci. 41, 561-568 (2001).
  • [2] Y. Liu, The Numerical Characterization and Similarity Analysis of DNA Primary Sequences, Internet Electronic Journal of Molecular Design 1, 675-684 (2002).
  • [3] M.-S. Yang, K.-L. Wu, A Similarity-Based Robust Clustering Method, IEEE Transactions on Pattern Analysis and Machine Intelligence 26(4), 434-448 (2004).
  • [4] J. Wen, C. Li, Similarity analysis of DNA sequences based on the LZ complexity, Internet Electronic Journal of Molecular Design 6, 1-12 (2007).
  • [5] A. Kelil, S. Wang, Q. Jiang, R. Brzezinski, A general measure of similarity for categorical sequences, Knowl. Inf. Syst. 24, 197-220 (2010), (DOI 10.1007/s10115-009-0237-8).
  • [6] M.R. Ackermann, J. Blömer, D. Kuntze, C. Sohler, Analysis of Agglomerative Clustering, http://arXiv.org/abs/1012.3697 (2012).
  • [7] P. Berkhin, Survey of Clustering Data Mining Techniques, 1-56, http://citeseerx.ist.psu.edu/viewauth/summary?aid=32145.
  • [8] R. Xu, D. Wunsch, Survey of clustering algorithms, IEEE Transactions on Neural Networks 16(3), 645-678 (2005).
  • [9] P. Agrawal, M.A. Alam, R. Biswas, Analysing the agglomerative hierarchical clustering algorithm for categorical attributes, International Journal Innovation, Management and Technology 1(2), 186-190 (2010) (and references quoted therein).
  • [10] N.S. Müller, A. Gabadinho, G. Ritschard, M. Studer, Extracting knowledge from life courses: Clustering and visualization, in: DAWAK 2008, volume LNCS 5182 of Lecture Notes in Computer Science, 176-185, Berlin Heidelberg, Springer (2008).
  • [11] G.W. Milligan, M.C. Cooper, An examination of procedures for determining the number of clusters in a data set, Psychometrika 50, 159-179 (1985).
  • [12] D.-G. Ke, Q.-Y. Tong, Easily adaptable complexity measure for finite time series, Phys. Rev. E 77, 066215 (2008).
  • [13] B. Kozarzewski, A method for nucleotide sequence analysis, CMST 18(1), 5-10 (2012).
  • [14] L.R. Dice, Measures of the Amount of Ecologic Association Between Species, Ecology 26(3), 297-302 (1945).
  • [15] M. Daszykowski, B. Walczak, D.L. Massart, Representative subset selection, Analytica Chimica Acta 468(1), 91-103 (2002).
  • [16] A. Gabadinho, G. Ritschard, M. Studer, N.S. Müller, Extracting and Rendering Representative Sequences, in: Communications in Computer and Information Science, Lecture Notes in Computer Science, 94-106, Springer-Verlag Berlin Heidelberg (2011).
  • [17] C.D. Michener, R.R. Sokal, A quantitative approach to a problem of classification, Evolution 11, 490-499 (1957).
  • [18] T. Calinski, J. Harabasz, A Dendrite Method for Cluster Analysis, Communications in Statistics 3(1), 1-27 (1974).
  • [19] Q. Zhao, V. Hautamaki, P. Fränti, Knee point detection in BIC for detecting the number of clusters, in: ACIVS 2008, volume LNCS 5295 of Lecture Notes in Computer Science, 664-673, Berlin Heidelberg, Springer (2008).
  • [20] V. Granville, Identifying the number of clusters: final a solution, http://www.analyticbridge.com/profile/Vincent.Granville
  • [21] M. Cameron, Y. Bernstein, H. Williams, Clustered sequence representation for fast homology search, J. Comp. Biol. 14(5), 594-614 (2007).
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.baztech-6a5f52c4-7d37-4386-839e-918aebaf44db