Sekwencyjne i równolegle algorytmy znajdowania podciągów

Deorowicz, S.

Artykuł - szczegóły

Tytuł artykułu

Sekwencyjne i równolegle algorytmy znajdowania podciągów

Autorzy

Deorowicz S.

Identyfikatory

Warianty tytułu

Serial and parallel subsequence finding algorithms

Języki publikacji

Abstrakty

Pierwsza cześć niniejszej pracy poświęcona jest problemowi najdłuższego podciągu rosnącego (LIS) oraz jego wariantom (podciąg otrzymuje sie z ciągu przez usuniecie zera bądź większej liczby symboli). Problem ten znajduje zastosowania m.in. w bioinformatyce do uliniawiania genomów, wyszukiwania nowych genów. Pierwszym z wariantów problemu LIS rozważanym w niniejszej pracy jest problem podciągów rosnących, które są pod pewnymi względami ekstremalne. Kolejnym wariantem jest problem podciągu rosnącego o zadanym pochyleniu. Dalsze dwa warianty to problemy cyklicznych podciągów rosnących oraz podciągów rosnących w oknie ustalonego rozmiaru ciągu wejściowego. Dla tych ostatnich wariantów zaproponowano w pracy wykorzystanie reprezentacji ciągu za pomocą pokrycia zachłannego oraz opracowano wydajne algorytmy łączenia takich pokryć. Algorytmy te są kluczowe do efektywnego rozwiązywania wspomnianych problemów. Druga cześć pracy dotyczy problemu najdłuższego wspólnego podciągu i jego wariantów. Zastosowania tych problemów są bardzo liczne i dotyczą przede wszystkim porównywania ciągów w celu oceny ich podobieństwa. Dla problemu LCS niezmienniczego względem transpozycji LCTS) zaproponowano kilka algorytmów sekwencyjnych, które, jak wynika z eksperymentów praktycznych, okazały sie znacznie szybsze od algorytmów istniejących. Dla problemu ukierunkowanego LCS (CLCS) zaproponowano algorytmy sekwencyjne, również szybsze od dotychczas istniejących. Ponadto, zaproponowano dla tego problemu pierwszy algorytm równoległości bitowej. Dla problemu scalonego LCS (MerLCS) zaproponowano pierwszy algorytm równoległości bitowej, który w eksperymentach praktycznych okazał sie kilkudziesięciokrotnie szybszy od znanych algorytmów. Dla problemów LCS, LCTS, CLCS zaproponowano także algorytmy równoległe przeznaczone do wykonywania w procesorach graficznych. Dla wszystkich algorytmów proponowanych w niniejszej pracy przeprowadzono analizę złożoności czasowej i pamięciowej w przypadku pesymistycznym(dla niektórych także w przypadku średnim). Dzięki temu często można było wykazać, ze proponowane algorytmy są także najszybsze w sensie asymptotycznym.

The first part of this work is on the longest increasing subsequence problem (LIS) and its variants (a subsequence can be obtained from a sequence by removing zero or more symbols). The problem has applications in bioinformatics, e.g., in sequence alignment, searching new genes. The first variant of the LIS problem, which is considered in this work, is a problem of longest increasing subsequences that are extremal from some point of view. Next variant is a slope-constrained longest increasing subsequence problem. The last two discussed variants of the LIS problem are a longest increasing cyclic subsequence problem (LICS) and a longest increasing subsequence in a sliding window problem (LISW). The algorithms for the recent two problems use cover representation of a sequence. Original algorithms for cover merging are crucial to the proposed algorithms for the LICS and LISW problems. The second part of this work is on the longest common subsequence problem (LCS) and its variants. The applications of these problems are numerous and concentrate mainly on the sequence comparison. For the transposition-invariant LCS problem (LCTS), a few sequential algorithms were proposed. Experiments show that they are much faster than the existing algorithms. For the constrained LCS problem (CLCS), a few sequential algorithms were also proposed. They are faster than the known algorithms. Moreover, for the CLCS problem, the first bit-parallel algorithm was invented. For the merged LCS problem (MerLCS), a bit parallel algorithm, tens times faster than the existing algorithms was proposed. For the LCS, LCTS, CLCS problems also algorithms for graphical processors were invented. All the proposed algorithms were analysed and their time and space complexities in the worst case were determined. For some algorithms the average case was also analysed. Obtained time complexities allow to show that the proposed algorithms are usually faster than the existing algorithms also in an asymptotic sense.

Słowa kluczowe

algorytm równoległy podciągi algorytm sekwencyjny

subsequences parallel algorithm serial algorithm

Wydawca

Wydawnictwo Politechniki Śląskiej

Czasopismo

Studia Informatica

Rocznik

2010

Tom

Vol. 31, nr 4B

Strony

239--239

Opis fizyczny

s., Bibliogr. 225 poz.

Twórcy

autor

Deorowicz S.

Politechnika Śląska, Instytut Informatyki, Gliwice, Akademicka 16, pokój 527 tel. 32-2372829, sebastian.deorowicz@polsl.pl

Bibliografia

1. G.M., Adel'son-Vel'skii, E.M. Landis. An algorithm for the organization of information. Soviet Mathematics Doklady, 3:1259-1263, 1962.
2. A.V. Aho, D.S. Hirschberg, J.D. Ullman. Bounds on the complexity of the longest common subsequence problem. Journal of the ACM, 23(1):1-12, 1976.
3. M.H. Albert, M.D. Atkinson, D. Nussbaum, J.-R. Sack, N. Santoro. On the longest increasing subsequence of a circular list. Information Processing Letters, 101:55-59, 2007.
4. M.H. Albert, A. Golynski, A.M. Hamel, A. Lopez-Ortiz, S.S. Rao, M.A; Safari. Longest increasing subsequences in sliding windows. Theoretical Computer Science, 321:405-414, 2004.
5. D. Aldous, P. Diaconis. Longest increasing subsequences: from patience sorting to the Baik-Deift-Johansson theorem. Bulletin of the AMS, 36:413-32, 1999.
6. E. Alerstam, T. Svensson, S. Andersson-Engels. Parallel computing with graphics processing units for high-speed Monte Carlo simulation of photon migration. Journal of Biomedical Optics,13(6):060504, 2008.
7. E. Allender. Circuit complexity before the dawn of the new millennium. Proceedings of the 16th Conference on Foundations of Software Technology and Theoretical Computer Science, strony 1-18, 1996.
8. L. Allison, T.L. Dix. A bit-string longest common subsequence algorithm. Information Processing Letters, 23:305-310, 1986.
9. S.F. Altschul, W. Gish, W. Miller, E.W. Myers, D J. Lipman. Basic local alignment search tool. Journal of Molecular Biology, 215:403-410, 1990.
10. S.F. Altschul, T.L. Madden, A.A. Schaffer, J. Zhang, Z. Zhang, W. Miller, D.J. Lipman. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research, 25(17):3389-3402, 1997.
11. A. Andersson, M. Thorup. Dynamic ordered sets with exponential search trees. Journal of the ACM, 54(3):Article No. 13, 2007.
12. E. Anirre, K. Gojenola, K. Sarasola, A. Voutilainen. Towards a single proposal in spelling correction. Proceedings of the 17th international conference on Computational linguistics, strony 22-28, Montreal, Quebec, Canada, 1998.
13. H.-Y. Ann, C.-B. Yang, C.-T. Tseng, C.-Y. Hor. A fast and simple algorithm for computing the longest common subsequence of run-length encoded strings. Information Processing Letters, 108(6):360-364, 2008.
14. H.-Y. Ann, C.-B. Yang, C.-T. Tseng, C.-Y. Hor. Fast algorithms for computing the constrained les of run-length encoded strings. Proceedings of the 2009 International Conference on Bioinformatics and Computational Biology, torn 2, strony 646-649, 2009.
15. A. Apostolico. Improving the worst-case performance of Hunt-Szymanski strategy for the longest common subsequences of two strings. Information Processing Letters, 23:63-69, 1986.
16. A. Apostolico. String editing and longest common subsequences. G. Rozenberg, A. Salomaa, redaktorzy, Handbook of formal languages, tom 2: linear modeling: background and application, rozdział 8, strony 361-398. Springer-Verlag, 1997.
17. A. Apostolico. General pattern matchings. M.J. Atallah, redaktor, Handbook of Algorithms and Theory of Computation, rozdział 13. CRC-Press, 1998.
18. A. Apostolico, M.J. Atallah, L.L. Larmore, S. McFaddin. Efficient parallel algorithms for string editing and related problems. SIAM Journal of Computing, 19(5):968-998, 1990.
19. A. Apostolico, C. Guerra. The longest common subsequence problem revisited. Algorithmica, 2:315-336, 1987.
20. A. Apostolico, G.M. Landau, S. Skiena. Matching for run-length encoded strings. Journal of Complexity, 15(1):4-16, 1999.
21. V.K. Arlazarov, E.A. Dinic, M.A. Kronrod, I.A. Faradzev. On economic construction of the transitive closure of a directed graph. Soviet Mathematics Doklady, 11:1209-1210, 1975.
22. A.N. Arslan, O. Egecioglu. Algorithms for the constrained longest common subsequence problems. International Journal of Foundations of Computer Science, 16(6):1099-1109, 2005.
23. K.N. Babu, S. Saxena. Parallel algorithms for the longest common subsequence problem. Proceedings of the 4th International Conference on High-Performance Computing, strony 120-125, Washington, DC, USA, December 1997. IEEE Computer Society.
24. R.M. Baer, P. Brock. Natural sorting over permutation spaces. Mathematics of Computation, 22:385-410, 1968.
25. R.A. Baeza-Yates. Efficient text searching. Rozprawa doktorska, Department of Computer Science, University of Waterloo, Ontario, Canada, 1989.
26. R.A. Baeza-Yates, Gonnet G.H. A new approach to text searching. Communications of the ACM, 35(10):74-82, 1992.
27. J. Baik, P. Deift, K. Johansson. On the distribution of the length of the longest increasing subsequence in random permutations. Journal of American Mathematical Society, 12:1119-1178, 1999.
28. A. Banerjee, J. Ghosh. Clickstream clustering using weighted longest common subsequences. Proceedings of the Web Mining Workshop at the 1st SIAM Conference on Data Mining, strony 33-40, Chicago, 2001.
29. R. Bayer. Symmetric binary B-trees. Acta Informatica, 1:290-306, 1972.
30. T. C. Bell, J. G. Cleary, I. H. Witten. Text Compression. Prentice Hall, Englewood Cliffs, NJ, 1990.
31. S. Bereg, M. Kubica, T. Walen, B. Zhu. Rna multiple structural alignment with longest common subsequences. Journal of Combinatorial Optimization, 13(2): 179-188, 2007.
32. L. Bergroth, H. Hakonen, T. Raita. A survey of longest common subsequence algorithms. Proceedings of 7th International Symposium on String Processing Information Retrieval (SPIRE), strony 39-48, Curuna, Spain, 2000.
33. S. Bespamyatnikh, M. Segal. Enumerating longest increasing subsequences and patience sorting. Information Processing Letters, 76(l-2):7-l 1, 2000.
34. Ch. Blum, M.J. Blesa, M.L6pez-Ibanez. Beam search for the longest common subsequence problem. Computers and Operations Research, 36(12):3178-3186, 2009.
35. P. Brass. Advanced Data Structures. Cambridge University Press, 2008.
36. G.S. Brodal, K. Kaligosi, I. Katriel, M. Kutz. Faster algorithms for computing longest common increasing subsequences. Lecture Notes in Computer Science, 4009:330-341, 2006.
37. H. Bunke, J. Csirik. An improved algorithm for computing the edit distance of run-length coded strings. Information Processing Letters, 54(2):93-96, 1995.
38. M. Burrows, D.J. Wheeler. A block-sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation, California, USA, 1994.
39. W.-T. Chan, Y. Zhang, S.P.Y. Fung, D. Ye, H. Zhu. Efficient algorithms for finding a longest common increasing subsequence. Journal of Combinatorial Optimization, 13(3):277-288, 2007.
40. E. Chen, L. Yang, H. Yuan. Longest increasing subsequences in windows based on canonical antichain partition. Theoretical Computer Science, 378(3):223-236, 2007.
41. X. Chen, M. Li, B. Ma, J. Tromp. DNACompress: fast and effective DNA séquence compression. Bioinformatics, 18(12):1696-1698, 2002.
42. F.Y.L. Chin, A. De Santis, A.L. Ferrara, N.L. Ho, S .K. Kim. A simple algorithm for the constrained sequence problems. Information Processing Letters, 90:175-179, 2004.
43. F.Y.L. Chin, N.L. Ho, T.W. Lam, P.W.H. Wong. Efficient constrained multiple sequence alignment with performance guarantee. Journal of Bioinformatics and Computational Biology, 3(1): 1-18, 2005.
44. R.A. Chowdhury, V. Ramachandran. Cache-oblivious dynamic programming. Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithm, strony 591-600, New York, NY, US, 2006. ACM Press.
45. M. Chrochemore, G.M. Landau, M. Ziv-Ukelson. A sub-quadratic sequence alignment algorithm for unrestricted cost matrices. SIAM Journal of Computing, 32(6): 1654-1673, 2003.
46. M.G. Ciura, S. Deorowicz. How to squeeze a lexicon. Software-Practice and Experience, 31(11):1077-1090, 2001.
47. S.A. Cook, R.A. Reckhow. Time-bounded random access machines. Journal Computer and System Sciences, 7:354-375, 1973.
48. T.H. Cormen, C.E. Leiserson, R.L. Rivest, C. Stein. Introduction to Algorithms. McGraw-Hill, wydanie 2nd, 2003.
49. T. Crawford, C. Iliopoulos, R. Raman. String matching techniques for musical similarity and melodic recognition. Computing in Musicology, 11:71-100, 1998.
50. M. Crochemore, Ch. Hancart, T. Lecroq. Algorithms on Strings. Cambridge University Press, New York, USA, 2007.
51. M. Crochemore, C.S. Iliopoulos, Y.J. Pinzon. Speeding-up Hirschberg and Hunt-Szymanski LCS algorithms. Fundamenta Informaticae, strony 89-103, 2003.
52. M. Crochemore, C.S. Iliopoulos, Y.J. Pinzon, J.F. Reid. A fast and practical bit-vector algorithm for the longest common subsequence problem. Information Processing Letters, 80:279-285,2001.
53. M. Crochemore, G.M. Landau, M. Ziv-Ukelson. A sub-quadratic sequence alignment algorithm for unrestricted cost matrices. Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms, strony 679-688, San Francisco, California, 2002. Society for Industrial and Applied Mathematics.
54. M. Crochemore, W. Rytter. Text Algorithms. Oxford University Press, 1994.
55. M. Crochemore, W. Rytter. Jewels of Stringology. World Scientific Publishing Company, 2003.
56. F. Damerau. A technique for computer detection and correction of spelling errors. Communications oftheACM,lQ):\l\-\16, 1964.
57. A.L. Delcher, S. Kasif, R.D. Fleischmann, J. Peterson, O. White, S.L. Salzberg. Alignment of whole genomes. Nucleic Acids Research, 27(ll):2369-2376, 1999.
58. J.S. Deogun, J. Yang, F. Ma. Emagen: An efficient approach to multiple whole genome alignment. Proceedings of the Second Asia-Pacific Bioinformatics Conference, strony 113-122, 2004.
59. R.C. Deonier, S. Tavare, M.S. Waterman. Computational Genome Analysis: An Introduction. Springer, 2005.
60. S. Deorowicz. Improvements to Burrows-Wheeler compression algorithm. Software Practice and Experience, 30(13): 1465-1483, 2000.
61. S. Deorowicz. Second step algorithms in the Burrows-Wheeler compression algorithm. Software-Practice and Experience, 32(2):99-l 11, 2002.
62. S. Deorowicz. Context exhumation after the burrows-wheeler transform. Information Processing Letters, 95(l):313-320, 2005.
63. S. Deorowicz. Speeding up transposition-invariant string matching. Information Processing Letters, 100(l):14-20, 2006.
64. S. Deorowicz. Fast algorithm for constrained longest common subsequence problem. Theoretical and Applied Informatics, 19(2):91-102, 2007.
65. S. Deorowicz. An algorithm for solving the longest increasing circular subsequence problem. Information Processing Letters, 109(12):630-634, 2009.
66. S. Deorowicz. Computing the longest common transposition-invariant subsequence with gpu. K A. Cyran, S. Kozielski, Peters J.F., U. Stariczyk, A. Wakuliez-Deja, redaktorzy, Man-Machine Interactions, Series Advances in Intelligent and Soft Computing, strony 551-559. Springer-Verlag, Berlin Heidelberg, 2009.
67. S. Deorowicz. On some variants of the longest increasing subsequence problem. Theoretical and Applied Informatics, 21(3-4):135-148, 2009.
68. S. Deorowicz. An algorithm for the longest increasing subsequence in a sliding window problem. 2010. W recenzji (http://sun.aei.polsl.pl/~sdeor/pub/Deo2010d-draft .pdf).
69. S. Deorowicz. Bit-parallel algorithm for the constrained longest common subsequence problem. Fundamenta Informaticae, 99(4):409^133, 2010.
70. S. Deorowicz. Bit-parallel algorithm for the merged longest common subsequence problem. 2010.(zgloszony do publikacji).
71. S. Deorowicz. Longest common subsequence and related problem solved at graphical processing units. Software-Practice and Experience, 4O(8):673-700, 2010.
72. S. Deorowicz, M.G. Ciura. Correcting spelling errors by modelling their causes. International Journal of Applied Mathematics and Computer Science, 15(2):275-285, 2005.
73. S. Deorowicz, Sz. Grabowski. A hybrid algorithm for the longest common transposition-invariant subsequence problem. Computing and Informatics, 28(5):729-744, 2009.
74. S. Deorowicz, Sz. Grabowski. On two variants of the longest increasing subsequence problem. K.A. Cyran, S. Kozielski, Peters J.F., U. Stańczyk, A. Wakulicz-Deja, redaktorzy, Man-Machine Interactions, Series Advances in Intelligent and Soft Computing, strony 541-549. Springer-Verlag, Berlin Heidelberg, 2009.
75. S. Deorowicz, J. Obstój. Constrained longest common subsequence computing algorithms in practice. Computing and Informatics, 29(3):427-445, 2010.
76. B. Dómólki. An algorithm for syntactical analysis. Computational Linguistics, 3:29-46, 1964.
77. R. Durbin, S.R. Eddy, A. Krogh, G. Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1999.
78. T. Easton, A. Singireddy. A specialized branching and fathoming technique for the longest common subsequence problem. International Journal of Operations Research, 4(2):98-104, 2007.
79. T. Easton, A. Singireddy. A large neighborhood search heuristic for the longest common subsequence problem. Journal of Heuristics, 14(3):271-283, 2008.
80. E. Edimiston, R.A. Wagner. Parallelization of the dynamic programming algorithm for comparison of sequences. International Conference on Parallel Processing, strony 78-90, 1987.
81. D. Eppstein, Z. Galii, R. Giancarlo, G.F. Italiano. Sparse dynamic programming I: linear cost functions. Journal of the ACM, 39(3):519-545, 1992.
82. D. Eppstein, Z. Galii, R. Giancarlo, G.F. Italiano. Sparse dynamic programming II: convex and concave cost functions. Journal of the ACM, 39(3):546-567, 1992.
83. P. Erdos, G. Szekeres. A combinatorial problem in geometry. Compositio Mathematica, 2:463-470, 1935.
84. T. Faraut, S. de Givry, P. Chabrier, T. Derrien, F. Galibert, C. Hitte, T. Schiex. A comparative genome approach to marker ordering. Bioinformatics, 23(2):e50-e56, 2007.
85. R. Farber. CUDA, supercomputing for the masses: Parts 1-15. Dr. Dobb's Journal, 2008-2010. http://www.ddj.com/architect/207200659.
86. M. Farrar. Striped Smith-Waterman speeds database searches six times over other SIMD implementations. Bioinformatics, 23(2):156-161, 2007.
87. P. Fenwick. The Burrows-Wheeler transform for block sorting text compression: Principles and improvements. The Computer Journal, 39(9):731-740, 1996.
88. P. Ferragina, G. Manzini. Opportunistic data structures with applications. Proceedings of the 41 Annual Symposium on Foundations of Computer Science, strony 390-398, 2000.
89. P. Ferragina, G. Manzini. An experimental study of a compressed index. Proceedings 12th ACMSIAM Symposium on Discrete Algorithms, strony 269-278, 2001.
90. M.J. Flynn. Some computer organizations and their effectiveness. IEEE Transactions on Computers, C-21(9):948-960, 1972.
91. M.L. Fredman. On computing the length of longest increasing subsequences. Discrete Mathematics, 11:29-35, 1975.
92. K. Fredriksson, V. Mäkinen, G. Navarro. Flexible music retrieval in sublinear time. International Journal of Foundations of Computer Science, 17(6):1345-1364, 2006.
93. A. Frezzotti, G.P. Ghiroldi, L. Gibelli. Solving kinetic equations on GPUs I: Model kinetic equations. arXiv:0903.4044vl [physics.comp-ph], 2009. http://arxiv.org/abs/0903.4044.
94. W. Fulton. Young Tableaux. With Applications to Representation Theory and Geometry, torn 35 serii Longon mathematical Society Student Texts. Cambridge University Press, 1997.
95. W. Fulton, J. Harris. Representation Theory. A First Course, torn 129 serii Graduate Texts in Mathematics. Readings in Mathematics. Springer-Verlag, 1991.
96. P. Gage. A new algorithm for data compression. The 'C Users Journal, 12(2), 1994.
97. A. Ghias, J. Logan, D. Chamberlin, B.C. Smith. Query by humming-musical information retrieval in an audio database. Proceedings of the third ACM international conference on Multimedia, strony 231-236, San Francisco, CA, 1995.
98. L. Golab, HJ. Karloff, F. Korn, A. Saha, D. Srivastava. Sequential dependencies. Proceedings of the Very Large Database Conference, torn 2, strony 574-585, 2009.
99. S. Golomb. Runlength encodings. IEEE Transactions on Information Theory, IT-12(3):399-401, July 1966.
100. M.C. Golumbic. Algorithmic Graph Theory and Perfect Graphs, tom 57 serii Annals of Discrete Mathematics. Academic Press, wydanie 2nd, 2004.
101. O. Gotoh, S. Yamada, T. Yada. Multiple sequence alignment. S. Aluru, redaktor, Handbook of Computational Molecular Biology, rozdział 3. Chapman & Hall/CRC, 2006.
102. Sz. Grabowski. New Algorithms for Exact and Approximate Text Matching, torn 1051 serii Zeszyty Naukowe Politechniki Łódzkiej. Wydawnictwo Politechniki Łódzkiej, Łódź, 2009.
103. Sz. Grabowski, S. Deorowicz. Nice to be a chimera: A hybrid algorithm for the longest common transposition-invariant subsequence problem. Proceedings of 9th International Conference on Modern Problems of Radio Engineering, Telecommunications and Computer Science (TCSET'2008), strony 50-54, 2008.
104. Sz. Grabowski, G. Navarro. O(mnloga) time transposition invariant LCS computation. Technical Report TR/DCC-2004-6, University of Chile, Department of Computer Science, September 2004. f tp ://ftp.dec.uchile.cl/pub/users/gnavarro/transpszymon.ps.gz.
105. L.J. Guibas, R. Sedgewick. A dichromatic framework for balanced trees. Proceedings of the 19th Annual Symposium on Foundations of Computer Science, strony 8-21. IEEE Computer Society, 1978.
106. D. Gusfield. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, 1997.
107. J.M. Hammersley. A few seedlings of research. Proceedings of the Sixth Berkley Symposium on Mathematical Statistics and Probability, torn 1, strony 345-394. University of California Press, 1972.
108. R.W.Hamming. Error detecting and error correcting codes. The Bell System Technical Journal, 26(2): 147-160, 1950.
109. W. Haque, A. Aravind, B. Reddy. Pairwise sequence alignment algorithms: a survey. Proceedings of the 2009 conference on Information Science, Technology and Applications, strony 96-103, Kuwait, Kuwait, 2009.
110. D. He, A.N. Arslan. A space-efficient algorithm for the constrained pairwise sequence alignment problem. Genome Informatics, 16(2):237-246, 2005.
111. D. Hermelin, G.M. Landau, S. Landau, O. Weimann. A unified algorithm for accelerating editdistance computation via text-compression. Proceedings of the 26th International Symposium on Theoretical Aspects of Computer Science (STACS), strony 529-540, 2009.
112. D.S. Hirschberg. A linear space algorithm for computing maximal common subsequence. Communications of the ACM, 18(6):341-343, 1975.
113. D.S. Hirschberg. Algorithms for the longest common subsequence problem. Journal of the ACM, 24:664-675, 1977.
114. D.S. Hirschberg. An information-theoretic lower bound for the longest common subsequence problem. Information Processing Letters, 7(1):40-41, 1978.
115. K.-S. Huang. Some Common Subsequence Problems of Multiple Sequences and Their Applications. Rozprawa doktorska, National Sun Yat-sen University, Taiwan, 2006.
116. K.-S. Huang, C.-B. Yang, K.-T. Tseng, H.-Y. Ann, Y.-H. Peng. Efficient algorithm for finding interleaving relationship between sequences. Information Processing Letters, 105(5): 188-193, 2008.
117. K.-S. Huang, C.-B. Yang, K.-T. Tseng, Y.-H. Peng, H.-Y. Ann. Dynamic programming algorithms for the mosaic longest common subsequence problem. Information Processing Letters, 102(2-3):99-103, 2007.
118. J.W. Hunt, M.D. McJJroy. An algorithm for differential file comparison. Computing Science Technical Report 41, Bell Laboratories, 1976. h t t p : //www. e s . dartmouth. edu/~doug/dif f. ps.
119. J.W. Hunt, T.G. Szymanski. A fast algorithm for computing longest common subsequences. Communications of the ACM, 20(5):350-353, 1977.
120. H. Hyyrö. Bit-parallel LCS-length computation revisited. Proceedings of the 15th Australian Workshop on Combinatorial Algorithms (AWOCA), strony 16-27, Australia, 2004. University of Sydney.
121. C.S. Iliopoulos, Y.J. Pinzon. Recovering an LCS in 0(2/w) time and space. Proceedings 12th Australasian Workshop on Combinatorial Algorithms, strony 106-117, 2001.
122. C.S. Iliopoulos, M.S. Rahman. New efficient algorithms for LCS and constrained LCS problem. Information Processing Letters, 106(1): 13-18, 2008.
123. G. Jacobson, K.-P. Vo. Heaviest increasing/common subsequence problems. Lecture Notes in Computer Science, 644:52-66, 1992.
124. N.C. Jones, P.A. Pevzner. An Introduction to Bioinformatics Algorithms. Massachusetts Institute of Technology, 2004.
125. M. Kellis, B.W. Birren, E.S. Lander. Proof and evolutionary analysis of ancient genome duplication in the yeast saccharomyces cerevisiae. Nature, 428(6983):617-624, 2004.
126. Khronos OpenCL Working Group. The OpenCL specification, http://www.khronos.org/ r e g i s t r y / c l / s p e c s / o p e n c l - 1 . 0 . 4 3 . pdf, May, 16 2009. Version 1.0, Document Revision 43.
127. D.B. Kirk, W. m.W. Hwu. Programming Massively Parallel Processors: A Hands-on Approach. Morgan Kaufmann, 2010.
128. D.G. Kirkpatrick, S. Reisch. Upper bounds for sorting integers on random access machines. Theoretical Computer Science, 28:263-276, 1984.
129. J. Kloetzli, B. Strege, J. Decker, M. Olano. Parallel longest common subsequence using graphics hardware. J. Favre, K.L. Ma, D. Weiskopf, redaktorzy, Proceedings of the Eurographics Symposium on Parallel Graphics and Visualization, strony 14-15, 2008. http://www.cs.umbc.edu/olano/papers/cudaLCS.pdf.
130. D.E. Knuth. The Art of Computer Programming, torn 3: Sorting and Searching. Addison Wesley Longman, wydanie 2nd, 1998.
131. D.E. Knuth. The Art of Computer Programming, torn 4, zeszyt 1: Bitwise Tricks & Techniques; Binary Decision. Addison Wesley Professional, 2009.
132. I. Koren. Computer Arithmetic Algorithms. A K Peters, Ltd, 2002.
133. P. Krusche, A. Tiskin. Efficient longest common subsequence computation using bulk-synchronous parallelism. Lecture Notes in Computer Science, 3984:165-174, 2006.
134. K. Kukich. Techniques for automatically correcting words in text. ACM Computing Surveys, 24(4):377-439, 1992.
135. S. Kuo, G.R. Cross. An improved algorithm to find the length of the longest common subsequence of two strings. ACM SIGIR Forum, 23(3-4):89-99, 1989.
136. S.Kurtz. Reducing the space requirement of suffix trees. Software-Practice and Experience, 29:1149-1171, 1999.
137. S. Kurtz, A. Phillippy, A.L. Delcher, M. Smoot, M. Shumway, C. Antonescu, S.L. Salzberg. Versatile and open software for comparing large genomes. Genome Biology, 5(2):R12.1-R12.9, 2004.
138. T.W. Lam, W.K. Sung, S.L. Tam, C.K. Wong, S.M. Yiu. Compressed indexing and local alignment of DNA. Bioinformatics, 24(6):791-797, 2009.
139. K. Lemstrom, G. Navarro, Y. Pinzon. Practical algorithms for transposition-invariant stringmatching. Journal of Discrete Algorithms, 3(2-4):267-292, 2005.
140. K. Lemstrom, J. Tarhio. Searching monophonic patterns within polyphonic sources. Proceedings of Content-Based Multimedia Information Access Conference (RIAO), strony 1261-1279, 2000.
141. K. Lemstrom, E. Ukkonen. Including interval encoding into edit distance based music comparison and retrieval. Proceedings of AISB'2000 Symposium on Creative & Cultural Aspects and Applications of Al & Cognitive Science, strony 53-60, 2000.
142. V. Levenshtein. Binary codes capable of correcting deletions, insertions, and reversals. Problems in Information Transmission, 1:8-17, 1965.
143. L. Ligowski, W. Rudnicki. An efficient implementation of Smith Waterman algorithm on GPU using CUDA, for massively parallel scanning of sequence databases. Proceedings of the IEEE International Symposium on Parallel & Distributed Processing, strony 1-8, 2009.
144. F.-M. Lin, H.-D. Huang, Y.-C. Chang, A.-P. Tsou, P.-L. Chan, L.-C. Wu, M.-F. Tsai, J.-T. Horng. Database to dynamically aid probe design for virus identification. IEEE Transactions on Information Technology in Biomedicine, 10(4):705-713, 2006.
145. R.J. Lipton, D. Lopresti. A systolic array for rapid string comparison. Proceedings of the Chapel Hill Conference on VLSI, strony 363-376, 1985.
146. J.J. Liu, Y.L. Wang, R.C.T. Lee. Finsign a longest common subsequence between a run-lengthencoded string and an uncompressed string. Journal of Complexity, 24(2): 173-187, 2008.
147. W. Liu, B. Schmidt, G. Voss, A. Schroder, W. Muller-Wittig. Bio-sequence database scanning on a GPU. Proceedings of the 20nd International Parallel and Distributed Processing Symposium, strony 274-281, 2006.
148. C.L. Lu, Y.P. Huang. A memory-efficient algorithm for multiple sequence alignment with constraints. Bioinformatics, 21(l):20-30, 2005.
149. D. Maier. The complexity of some problems on subsequences and supersequences. Journal of the ACM, 25(2):322-336, 1978.
150. V. Mäkinen, G. Navarro, E. Ukkonen. Transposition invariant string matching. Journal of Algorithms, 56(2): 124-153, 2005.
151. S.A. Manavski, G. Valle. CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment. BMC Bioinformatics, 9(Suppl 2):S10, 2008. h t t p : //www.biomedcentral.eom/147i-2:105/9/S2/Si0.
152. W.J. Masek, M.S. Paterson. A faster algorithm computing string edit distances. Journal of Computer System Science, 20(1):18-31, 1980.
153. E. M. McCreight. A space-economical suffix tree construction algorithm. Journal of the ACM, 23(2):262-272, 1976.
154. R.J. McNab, L.A. Smith, D. Bainbridge, LH. Witten. The New Zealand digital library MELody inDEX. D-Lib Magazine, May 1997.
155. K. Mehlhorn. Data structures and algorithms 1: sorting and searching. Springer-Verlag, 1984.
156. K. Mehlhorn, P. Sanders. Algorithms and Data Structures. The Basic Toolbox. Springer-Verlag, Berlin Heidelberg, 2008.
157. R. Mitton. Spellchecking by computer. Journal of Simplified Spelling Society, 20(1):4-11, 1996.
158. R. Mitton. Ordering the suggestions of a spellchecker without using context. Natural Language Engineering, 15(2): 173-192, 2009.
159. D.W. Mount. Bioinformatics: Sequence and Genome Analysis. Cold Spring Harbor Laboratory Press, wydanie 2nd, 2004.
160. E.W. Myers, W. Miller. Optimal alignments in linear space. Computer Applications in the Biosciences, 4(1): 11-17', 1988.
161. G. Navarro. A guided tour to approximate string matching. ACM Computing Surveys, 33(1):31-88, 2001.
162. G. Navarro. Dynamic dictionaries in constant worst-case time. Technical Report TR/DCC-2007-11, University of Chile, Department of Computer Science, October 2007. f t p : / / f t p . dec. u c h i l e .cl/pub/users/gnavarro/consthash.ps.gz.
163. G. Navarro, Sz. Grabowski, V. Mäkinen, S. Deorowicz. Improved time and space complexities for transposition invariant string matching. Technical Report TR/DCC-2005-4, University of Chile, Department of Computer Science, 2005. ftp: / / f t p . dec. u c h i l e . cl/pub/users/gnavarro/ mnl ogl ogs. p s . gz.
164. S.B. Needleman, CD. Wunsch. A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 48(3):443-453, 1970.
165. F. Nicolas, E. Rivals. Longest common subsequence problem for unoriented and cyclic strings. Theoretical Computer Science, 370(1-18), 2007.
166. NVidia Corporation. CUDA zone-the resource for CUDA developers, http://www.nvidia. com/object/cuda_home.html, February, 2 2010.
167. NVidia Corporation. NVidia CUDA(tm)programming guide, version 3.0. http://www.nvidia. com/object/cuda_get.html, February, 2 2010.
168. N. Orio. Music retrieval: a tutorial and review. Foundations and Trends in Information Retrieval, 1(1): 1-96, 2006.
169. C.H. Papadimitriou. Computational Complexity. Addison Wesley, 1993.
170. E. Parvinnia, M. Taheri, K. Ziarati. An improved longest common subsequence algorithm for reducing memory complexity in global alignment of DNA sequences. Proceedings of the International Conference on BioMedical Engineering and Informatics, torn 1, strony 57-61. IEEE Computer Society, 2008.
171. S. Pemmarąju, S. Skiena. Computational Discrete Mathematics. Combinatorics and Graph Theory with Mathematica. Cambridge University Press, 2003.
172. Ch.-L. Peng. An approach for solving the constrained longest common subsequence problem. Praca magisterska, Department of Computer Science and Engineering, National Sun Yat-sen University, Taiwan, 2003. http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/getfile?URN=etd-0828103-125439&filename=etd-0828103-125439.pdf.
173. Y.-H. Peng, C.-B. Yang, K.-S. Huang, C.-T. Tseng, C.-Y. Hor. Efficient sparse dynamic programming for the merged LCS problem with block constraints. International Journal of Innovative Computing, Information and Control, 6(4): 1935-1947, 2010.
174. Z.S. Peng, H.F. Ting. Time and space efficient algorithms for constrained sequence alignment. Proceedings of the Implementation and Application of Automata, 9th International Conference (CIAA), strony 237-246, 2004.
175. P. Pevzner, M. Waterman. Matrix longest common subsequence problem. Lecture Notes in Computer Science, 644:79-89, 1992.
176. A. Polański, M. Kimmel. Bioinformatics. Springer, 2007.
177. M. Patrascu, M. Thorup. Predecessor search. Ming-Yang Kao, redaktor, Encyclopedia of Algorithms, strony 661-664. SpringerScience+BuisinessMedia, LLC, 2008.
178. S. Rajko, S. Aluru. Space and time optimal parallel sequence alignments. IEEE Transactions on Parallel and Distributed Systems, 15(12):1070-1081, 2004.
179. S. Ranka, S. Sahni. String editing on an SIMD hypercube multicomputer. Journal of Parallel and Distributed Computing, 9:411-418, 1990.
180. C. Rick. Efficient computation of all longest common subsequences. Proceedings of the 7th Scandinavian Workshop on Algorithm Theory (SWAT'00), strony 407-418, 2000.
181. J. Rothstein. Midi: A Comprehensive Introduction. A-R Editions, wydanie 2, 1995.
182. W. Rytter. Application of Lempel-Ziv factorization to the approximation of grammar-based compression. Theoretical Computer Science, 302(1-3):211-222, 2003.
183. N. Saitou, M. Nei. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Molecular Biology and Evolution, 4(4):406-425, 1987.
184. Y. Sakai. A linear space algorithm for computing a longest common increasing subsequence. Information Processing Letters, 99:203-207, 2006.
185. D. Salomon. Data Compression: The Complete Reference. Springer, wydanie 4th, 2006.
186. D. Sankoff, J. Kruskal, redaktorzy. Time Warps, String Edits, and Macromolecules. The Theory and Practice of Sequence Comparison. Addison-Wesley, 1983.
187. W.J. Savitch, M.J. Stimson. Time bounded random access machines with parallel processing. Journal of the ACM, 26(1): 103-118, 1979.
188. C. Schensted. Longest increasing and decreasing subsequences. Canadian Journal of Mathematics, 13:179-191, 1961.
189. H.-Y. Schive, C.-H. Chien, S.-K. Wong, Y.-C. Tsai, T. Chiueh. Graphic-card cluster for astrophysics (GraCCA) -performance tests. arXiv:0707.2991v2 [astro-ph], 2008. http://arxiv.org/abs/0707.2991.
190. R. Sedgewick. Left-leaning red-black trees. Talk at Combinatorics and Probability seminar at University of Pennsylvania, October 2008. h t t p : //www. c s . princeton. edu/~rs/talks/LLRB/LLRB.pdf.
191. A. Shiraki, N. Takada, M. Niwa, Y. Ichihashi, T. Shimobaba, N. Masuda, T. Ito. Simplified electroholographic color reconstruction system using graphics processing unit and liquid crystal display projector. Optics Express, 17(8): 16038-16045, 2009.
192. S.J. Shyu, C.-Y. Tsai. Finding the longest common subsequence for multiple biological sequences by ant colony optimization. Computers and Operations Research, 36(1):73-91, 2009.
193. S. Skiena. Longest increasing subsequences. Implementing Discrete Mathematics: Combinatorics and Graph Theory with Mathematica, rozdzial 2.3.6, strony 73-75. Reading, MA: Addison-Wesley, 1990.
194. D.S. Sleator, R.E. Tarjan. A data structure for dynamic trees. Journal of Computer and System Sciences, 26(3):362-391, 1983.
195. T.F. Smith, M.S. Waterman. Identification of common molecular subsequences. Journal of Molecular Biology, 147:195-197, 1981.
196. J. Storer, T. G. Szymanski. Data compression via textual substitution. Journal of the ACM, 29:928-951, 1982.
197. C.Y. Tank, C.L. Lu, M.D.-T. Chang, Y.-T. Tsai, Y.-J. Sun, K.-M. Chao, J.-M. Chang, Y.-H. Chiou,C.-M. Wu, H.-T. Chang, W.-I. Chou. Constrained multiple sequence alignment tool development and its application to RNase family alignment. Proceedings of the first IEEE Computer Society Bioinformatics Conference (CSB 2002), strony 127-137, 2002.
198. R.E. Tarjan. Data Structures and Network Algorithms. Society for Industrial and Applied Mathematics, 1983.
199. A. Tiskin. Semi-local string comparison: Algorithmic techniques and applications. Mathematics in Computer Science, l(4):571-603, 2008.
200. A. Tiskin. Semi-local string comparison: Algorithmic techniques and applications. arXiv, strony 1-115, 2010. arXiv:0707.3619vl5.
201. TOP500.Org. Top 500 supercomputer sites, http://www.top50(h org/, June 2010.
202. N. Trevett. OpenCL; The open standard for heterogeneous parallel programming, http://www. khronos.org/developers/library/overview/opencl_overview.pdf, June, 18 2009.
203. Y.-T. Tsai. The constrained common subsequence problem. Information Processing Letters, 88:173-176, 2003.
204. C.-T. Tseng, C.-B. Yang, H.-Y. Ann. Minimal height and sequence constrained longest increasing subsequence. Journal of Internet Technology, 10(2): 173-178, 2009.
205. A. Uitdenbogerd, J. Zobel. Melodic matching techniques for large music databases. Proceedings of the seventh ACM international conference on Multimedia (Part 1), strony 57-66, Orlando, USA, 1999.
206. E. Ukkonen. On-line construction of suffix tress. Algorithmica, 14(3):249-260, 1995.
207. L.G. Valiant. A bridging model for parallel computation. Communications of the ACM, 33(8): 103-111, 1990.
208. P. van Emde Boas. Machine models and simulations. J. van Leeuwen, redaktor, Handbook of Theoretical Computer Science. Algorithms and Complexity, rozdzial 1, strony 1-66. Elsevier Science Publishers B.V., 1990.
209. P. van Emde Boas, R. Kaas, E. Zijlstra. Design and implementation of an efficient priority queue. Mathematical Systems Theory, 10:99-127, 1977.
210. P. van Emde Boas, R. Kaas, E. Zijlstra. Preserving order in a forest in less than logarithmic time and linear space. Information Processing Letters, 6(3):80-82, 1977.
211. A.M. Vershik, Kerov S.V. Asymptotics of the plancherel measure of the symmetric group and the limiting form of Young tables. Soviet Mathematics Doklady, 18:527-531, 1977.
212. R.A. Wagner, M.J. Fischer. The string-to-string correction problem. Journal of the ACM, 21(1): 168-173, 1974.
213. Q. Wang, D. Korkin, Y. Shang. Efficient dominant point algorithms for the multiple longest common subsequence (MLCS) problem. H. Kitano, redaktor, Proceedings of the 21st International Joint Conference on Artificial Intelligence, strony 1494-1499, Pasadena, California, USA, 2009.
214. W.-L. Wang. Longest common subsequence with constraints. Praca magisterska. Department of Computer Science and Information Engineering National Chi-NanUniversity, R. O. C, June 2006. http://alg.csie.ncnu.edu.tw/or/WLWang2006.pdf.
215. H.S.Warren. Hacker's Delight. Addison-Wesley Professional, 2002.
216. O. Weimann. Accelerating Dynamic Programming. Rozprawa doktorska, Massachusetts Institute of Technology, June 2009. h t t p : //erikdemaine. org/theses/oweimami. pdf.
217. P. Weiner. Linear pattern matching algorithms. Proceedings of the \Ath IEEE Annual Symposium on Switching and Automata Theory, The University of Iowa, strony 1-11, 1973.
218. T.A.Welch. A technique for high-performance data compression. Computer, 17(6):8-19, 1984.
219. C.K. Wong, A.K. Chandra. Bounds for the string editing problem. Journal of the ACM, 23(1): 13-16,1976.
220. I.-H. Yang, Y.-C. Chen. Fast algorithms for the constrained longest increasing subsequence problems. Proceedings of the 25th Workshop on Combinatorial Mathematics and Computation Theory, strony 226-231, Chung Hua University, Hsinchu Hsien, Taiwan, April 25-26 2008.
221. I-H. Yang, C.-P. Huang, K.-M. Chao. A fast algorithm for computing a longest common increasing subsequence. Information Processing Letters, 93:249-253, 2005.
222. A. Young. On quantitative substitutional analysis. Proceedings of the London Mathematical Society, s2-28(l):255-292, 1928.
223. H. Zhang. Alignment of BLAST high-scoring segment pairs based on the longest increasing subsequence algorithm. Bioinformatics, 19(11):1391-1396, 2003.
224. J. Ziv, A. Lempel. A universal algorithm for sequential data compression. IEEE Transactions on Information Theory, IT-23:337-343, 1977.
225. J. Ziv, A. Lempel. Compression of individual sequences via variable-rate coding. IEEE Transactions on Information Theory, IT-24:530-536, 1978.

Typ dokumentu

Bibliografia

Identyfikator YADDA

bwmeta1.element.baztech-article-BSL7-0050-0023