Warianty tytułu
Języki publikacji
Abstrakty
In this paper we present external memory algorithms for some string problems. External memory algorithms have been developed in many research areas, as the speed gap between fast internal memory and slow external memory continues to grow. The goal of external memory algorithms is to minimize the number of input/output operations between internal memory and external memory. These years the sizes of strings such as DNA sequences are rapidly increasing. However, external memory algorithms have been developed for only a few string problems. In this paper we consider five string problems and present external memory algorithms for them. They are the problems of finding the maximum suffix, string matching, period finding, Lyndon decomposition, and finding the minimum of a circular string. Every algorithm that we present here runs in a linear number of I/Os in the external memory model with one disk, and they run in an optimal number of disk I/Os in the external memory model with multiple disks.
Czasopismo
Rocznik
Tom
Strony
17-32
Opis fizyczny
bibliogr. 39 poz., wykr.
Twórcy
autor
autor
autor
autor
- School of Computer Science and Engineering, Seoul National University, Seoul 151-744, Korea, khroh@theory.snu.ac.kr
Bibliografia
- [1] Aggarwal, A., Vitter, J. S.: The input/output complexity of sorting and related problems, Communications of the ACM, 31(9), 1988, 1116-1127, ISSN 0001-0782.
- [2] Apostolico, A., Crochemore, M.: Fast Parallel Lyndon Factorization with Applications., Mathematical Systems Theory, 28(2), 1995, 89-108.
- [3] Arge, L., Ferragina, P., Grossi, R., Vitter, J. S.: On sorting strings in external memory (extended abstract), STOC '97: Proceedings of the twenty-ninth annual ACM symposium on Theory of computing, ACM Press, New York, NY, USA, 1997, ISBN 0-89791-888-6.
- [4] Bender, M. A., Demaine, E. D., Farach-Colton, M.: Cache-oblivious B-trees, FOCS '00: Proceedings of the 41st Annual Symposium on Foundations of Computer Science, IEEE Computer Society,Washington, DC, USA, 2000, ISBN 0-7695-0850-2.
- [5] Booth, K. S.: Lexicographically Least Circular Substrings., Inf. Process. Lett., 10(4/5), 1980, 240-242.
- [6] Boyer, R. S., Moore, J. S.: A fast string searching algorithm, Communications of the ACM, 20(10), 1977, 762-772, ISSN 0001-0782.
- [7] Chiang, Y.-J., Goodrich, M. T., Grove, E. F., Tamassia, R., Vengroff, D., Vitter, J.: External-memory graph algorithms, SODA '95: Proceedings of the sixth annual ACM-SIAM symposium on Discrete algorithms, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1995, ISBN 0-89871-349-8.
- [8] Clark, D. R., Munro, J. I.: Efficient suffix trees on secondary storage, SODA '96: Proceedings of the seventh annual ACM-SIAM symposium on Discrete algorithms, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1996, ISBN 0-89871-366-8.
- [9] Crochemore,M.: Transducers and repetitions, Theor. Comput. Sci., 45(1), 1986, 63-86, ISSN 0304-3975.
- [10] Crochemore,M.: String matching and periods, Theor. Comput. Sci., 39(1), 1989, 63-86, ISSN 0304-3975.
- [11] Crochemore, M.: String-matching on ordered alphabets, Theor. Comput. Sci., 92(1), 1992, 33-47, ISSN 0304-3975.
- [12] Crochemore,M., Perrin, D.: Two-way string-matching, J. ACM, 38(3), 1991, 650-674, ISSN 0004-5411.
- [13] Crochemore,M., Rytter, W.: Jewels of Stringology, World Scientific, 2002.
- [14] Duval, J.-P.: Factoring words over an ordered alphabet, Journal of Algorithms, 4(4), 1983, 363-381, ISSN 0304-3975.
- [15] Farach, M., Ferragina, P., Muthukrishnan, S.: Overcoming the Memory Bottleneck in Suffix Tree Construction, FOCS '98: Proceedings of the 39th Annual Symposium on Foundations of Computer Science, IEEE Computer Society,Washington, DC, USA, 1998, ISBN 0-8186-9172-7.
- [16] Ferragina, P., Grossi, R.: Fast string searching in secondary storage: Theoretical developments and experimental results, SODA '96: Proceedings of the seventh annual ACM-SIAMsymposium on Discrete algorithms, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1996, ISBN 0-89871-366-8.
- [17] Ferragina, P., Grossi, R.: The string B-tree: a new data structure for string search in external memory and its applications, J. ACM, 46(2), 1999, 236-280, ISSN 0004-5411.
- [18] Ferragina, P., Luccio, F.: On the Parallel Dynamic Dictionary Matching Problem: New Results with Applications, ESA '96: Proceedings of the Fourth Annual European Symposium on Algorithms, Springer-Verlag, London, UK, 1996, ISBN 3-540-61680-2.
- [19] Feuerstein, E., Marchetti-Spaccamela, A.: Memory Paging for Connectivity and Path Problems in Graphs, ISAAC '93: Proceedings of the 4th International Symposium on Algorithms and Computation, Springer-Verlag, London, UK, 1993, ISBN 3-540-57568-5.
- [20] Galil, Z., Seiferas, J.: Time-space-optimal string matching (Preliminary Report), STOC '81: Proceedings of the thirteenth annual ACM symposium on Theory of computing, ACM Press, New York, NY, USA, 1981.
- [21] Gasieniec, L., Kolpakov, R. M.: Real-Time String Matching in Sublinear Space., CPM, 2004.
- [22] Gonnet, G. H., Baeza-Yates, R.: Handbook of algorithms and data structures: in Pascal and C (2nd ed.), Addison-Wesley Longing Publishing Co., Inc., Boston, MA, USA, 1991, ISBN 0-201-41607-7.
- [23] Gusfield, D.: Algorithms on Strings, Trees, and Sequences, Cambridge University Press, New York, 1997.
- [24] Iliopoulos, C. S., Smyth, W. F.: Optimal algorithms for computing the canonical form of a circular string, Theor. Comput. Sci., 92(1), 1992, 87-105, ISSN 0304-3975.
- [25] Iliopoulos, C. S., Smyth, W. F.: A fast average case algorithm for Lyndon decomposition, Internat. J. Computer Math, 57, 1995, 15-31, ISSN 0004-5411.
- [26] Knuth, D. E., Morris, J. H., Pratt, V. R.: Fast Pattern Matching in Strings, SIAM J. Comp, 6(2), 1977, 323-350, ISSN 0004-5411.
- [27] Manber, U., Myers, G.: Suffix arrays: a new method for on-line string searches, SODA '90: Proceedings of the first annual ACM-SIAM symposium on Discrete algorithms, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1990, ISBN 0-89871-251-3.
- [28] McCreight, E.M.: A Space-Economical Suffix Tree Construction Algorithm, J. ACM, 23(2), 1976, 262-272, ISSN 0004-5411.
- [29] Nodine,M. H., Goodrich,M. T., Vitter, J. S.: Blocking for external graph searching, PODS '93: Proceedings of the twelfth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, ACM Press, New York, NY, USA, 1993, ISBN 0-89791-593-3.
- [30] Nodine, M. H., Vitter, J. S.: Paradigms for optimal sorting with multiple disks, In Proc. of the 26th Hawaii Int. Conf. on Systems Sciences, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1993, ISBN 0-89871-251-3.
- [31] Rajsbaum, S., Ed.: LATIN 2002: Theoretical Informatics, 5th Latin American Symposium, Cancun, Mexico, April 3-6, 2002, Proceedings, vol. 2286 of Lecture Notes in Computer Science, Springer, 2002, ISBN 3-540-43400-3.
- [32] Roh, K., Crochemore, M., Iliopoulos, C. S., Park, K.: External Memory Algorithms for String Problems., Proceedings of the 17th Australasian Workshop on Combinatorial Algorithms, 2006.
- [33] Ruemmler, C., Wilkes, J.: An introduction to disk drive modeling, Computer, 27(3), 1994, 17-28, ISSN 0018-9162.
- [34] Rytter,W.: On Maximal Suffixes and Constant-Space Linear-Time Versions of KMP Algorithm, LATIN '02: Proceedings of the 5th Latin American Symposium on Theoretical Informatics, Springer-Verlag, London, UK, 2002, ISBN 3-540-43400-3.
- [35] Shiloach, Y.: Fast canonization of circular strings, J. Algorithms, 2, 1981, 107-121, ISSN 0004-5411.
- [36] Ullman, J. D., Yannakakis,M.: The input/output complexity of transitive closure, SIGMOD '90: Proceedings of the 1990 ACM SIGMOD international conference on Management of data, ACM Press, New York, NY, USA, 1990, ISBN 0-89791-365-5.
- [37] Vitter, J. S.: External memory algorithms, PODS '98: Proceedings of the seventeenth ACM SIGACTSIGMOD-SIGART symposium on Principles of database systems, ACM Press, New York, NY, USA, 1998, ISBN 0-89791-996-3.
- [38] Vitter, J. S.: External memory algorithms and data structures: dealing with massive data, ACM Comput.Surv., 33(2), 2001, 209-271, ISSN 0360-0300.
- [39] Vitter, J. S., Shriver, E. A. M.: Algorithms for Parallel Memory I: Two-Level Memories, Technical report, Providence, RI, USA, 1992.
Typ dokumentu
Bibliografia
Identyfikatory
Identyfikator YADDA
bwmeta1.element.baztech-article-BUS5-0015-0061