PL EN


Preferencje help
Widoczny [Schowaj] Abstrakt
Liczba wyników
Tytuł artykułu

Efficient Computation of Palindromes in Sequences with Uncertainties

Wybrane pełne teksty z tego czasopisma
Identyfikatory
Warianty tytułu
Języki publikacji
EN
Abstrakty
EN
In this work, we consider a special type of uncertain sequence called weighted string. In a weighted string every position contains a subset of the alphabet and every letter of the alphabet is associated with a probability of occurrence such that the sum of probabilities at each position equals 1. Usually a cumulative weight threshold 1/z is specified, and one considers only strings that match the weighted string with probability at least 1/z. We provide an O(nz)-time and O(nz)-space off-line algorithm, where n is the length of the weighted string and 1/z is the given threshold, to compute a smallest maximal palindromic factorization of a weighted string. This factorization has applications in hairpin structure prediction in a set of closely-related DNA or RNA sequences. Along the way, we provide an O(nz)-time and O(nz)-space off-line algorithm to compute maximal palindromes in weighted strings. Finally, we provide an experiment of our proposed algorithm.
Słowa kluczowe
Wydawca
Rocznik
Strony
253--266
Opis fizyczny
Bibliogr. 40 poz., rys., wykr.
Twórcy
autor
  • Department of Informatics, King’s College London, UK
autor
  • Department of Informatics, King’s College London, UK
  • Department of Informatics, King’s College London, UK
autor
  • Department of Informatics, King’s College London, UK
Bibliografia
  • [1] Alzamel M, Gao J, Iliopoulos CS, Liu C, Pissis SP. Efficient Computation of Palindromes in Sequences with Uncertainties. In: Boracchi et al. [23], 2017 pp. 620-629.
  • [2] Almirantis Y, Charalampopoulos P, Gao J, Iliopoulos CS, Mohamed M, Pissis SP, Polychronopoulos D. On avoided words, absent words, and their application to biological sequence analysis. Algorithms for Molecular Biology, 2017. 12(1):5. doi:10.1186/s13015-017-0094-z.
  • [3] Manacher G. A New Linear-Time “On-Line” Algorithm for Finding the Smallest Initial Palindrome of a String. Journal of the ACM, 1975. 22(3):346-351. doi:10.1145/321892.321896.
  • [4] Apostolico A, Breslauer D, Galil Z. Parallel detection of all palindromes in a string. Theoretical Computer Science, 1995. 141(1):163-173. doi:10.1016/0304-3975(94)00083-U.
  • [5] Gusfield D. Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, New York, NY, USA, 1997. ISBN 0-521-58519-8.
  • [6] Alatabbi A, Iliopoulos CS, Rahman MS. Maximal Palindromic Factorization. In: PSC. 2013 pp. 70-77.
  • [7] Fici G, Gagie T, Kärkkäinen J, Kempa D. A Subquadratic Algorithm for Minimum Palindromic Factorization. Journal of Discrete Algorithms, 2014. 28:41-48. doi:10.1016/j.jda.2014.08.001.
  • [8] I T, Sugimoto S, Inenaga S, Bannai H, Takeda M. Computing Palindromic Factorizations and Palindromic Covers On-line. In: CPM, volume 8486 of LNCS. Springer International Publishing, 2014 pp. 150-161. doi:10.1007/978-3-319-07566-2_16.
  • [9] Rubinchik M, Shur AM. EERTREE: An Efficient Data Structure for Processing Palindromes in Strings. In: IWOCA, volume 9538 of LNCS, pp. 321-333. Springer International Publishing, 2016. doi:10.1016/j.ejc.2017.07.021.
  • [10] Muhire BM, Golden M, Murrell B, Lefeuvre P, Lett JM, Gray A, Poon AY, Ngandu NK, Semegni Y, Tanov EP, et al. Evidence of pervasive biologically functional secondary structures within the genomes of eukaryotic single-stranded DNA viruses. Journal of virology, 2014. 88(4):1972-1989. doi:10.1128/JVI.03031-13.
  • [11] Iliopoulos CS, Makris C, Panagis Y, Perdikuri K, Theodoridis E, Tsakalidis A. The weighted suffix tree: an efficient data structure for handling molecular weighted sequences and its applications. Fundamenta Informaticae, 2006. 71(2, 3):259-277.
  • [12] Barton C, Kociumaka T, Pissis SP, Radoszewski J. Efficient Index for Weighted Sequences. In: CPM, volume 54 of LIPIcs. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2016 pp. 4:1-4:13. doi:10.4230/LIPIcs.CPM.2016.4.
  • [13] Amir A, Gotthilf Z, Shalom BR. Weighted LCS. Journal of Discrete Algorithms, 2010. 8(3):273-281. doi:10.1016/j.jda.2010.02.001.
  • [14] Cygan M, Kubica M, Radoszewski J, Rytter W, Walen T. Polynomial-time approximation algorithms for weighted LCS problem. Discrete Applied Mathematics, 2016. 204:38-48. doi:10.1016/j.dam.2015.11.011.
  • [15] Kociumaka T, Pissis SP, Radoszewski J. Pattern Matching and Consensus Problems on Weighted Sequences and Profiles. In: ISAAC, volume 64 of LIPIcs. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2016 pp. 46:1-46:12. doi:10.1007/s00224-018-9881-2.
  • [16] Barton C, Liu C, Pissis SP. Linear-time computation of prefix table for weighted strings & applications. Theoretical Computer Science, 2016. 656:160-172. doi:10.1016/j.tcs.2016.04.029.
  • [17] Barton C, Liu C, Pissis SP. On-Line Pattern Matching on Uncertain Sequences and Applications. In: COCOA, volume 10043 of LNCS, pp. 547-562. Springer International Publishing, 2016. doi:10.1007/978-3-319-48749-6_40.
  • [18] Barton C, Iliopoulos CS, Pissis SP. Optimal Computation of all Tandem Repeats in a Weighted Sequence. Algorithms for Molecular Biology, 2014. 9(21). doi:10.1186/s13015-014-0021-5.
  • [19] Barton C, Pissis SP. Crochemore’s Partitioning on Weighted Strings and Applications. Algorithmica, 2017. doi:10.1007/s00453-016-0266-0.
  • [20] Farach M. Optimal suffix tree construction with large alphabets. In: FOCS. IEEE Computer Society, 1997 pp. 137-143.
  • [21] Bender MA, Farach-Colton M. The LCA Problem Revisited. In: LATIN, volume 1776 of LNCS. Springer-Verlag, 2000 pp. 88-94. doi:10.1007/10719839_9.
  • [22] Barton C, Kociumaka T, Liu C, Pissis SP, Radoszewski J. Indexing Weighted Sequences: Neat and Efficient. CoRR, 2017. abs/1704.07625.
  • [23] Boracchi G, Iliadis LS, Jayne C, Likas A (eds.). Engineering Applications of Neural Networks - 18th International Conference, EANN 2017, Athens, Greece, August 25-27, 2017, Proceedings, volume 744 of Communications in Computer and Information Science. Springer, 2017.
  • [24] Breslauer D, Galil Z. Finding all periods and initial palindromes of a string in parallel. Algorithmica, 1995. 14(4):355-366. doi:10.1007/BF01294132.
  • [25] Galil Z. Real-time algorithms for string-matching and palindrome recognition. In: Proceedings of the Eighth Annual ACM Symposium on Theory of Computing. ACM, 1976 pp. 161-173. doi:10.1145/800113.803644.
  • [26] Hsu PH, Chen KY, Chao KM. Finding all approximate gapped palindromes. In: International Symposium on Algorithms and Computation. Springer, 2009 pp. 1084-1093. doi:10.1007/978-3-642-10631-6_109.
  • [27] Knuth DE, Morris JH Jr, Pratt VR. Fast pattern matching in strings. SIAM journal on computing, 1977. 6(2):323-350. doi:10.1137/0206024.
  • [28] Droubay X. Palindromes in the Fibonacci word. Information Processing Letters, 1995. 55(4):217-221. doi:10.1016/0020-0190(95)00080-V.
  • [29] Droubay X, Pirillo G. Palindromes and Sturmian Words. Theor. Comput. Sci., 1999. 223(1-2):73-85. doi:10.1016/S0304-3975(97)00188-6.
  • [30] Porto AH, Barbosa VC. Finding approximate palindromes in strings. Pattern Recognition, 2002. 35(11):2581-2591. doi:10.1016/S0031-3203(01)00179-0.
  • [31] Matsubara W, Inenaga S, Ishino A, Shinohara A, Nakamura T, Hashimoto K. Efficient algorithms to compute compressed longest common substrings and compressed palindromes. Theoretical Computer Science, 2009. 410(8):900-913. doi:10.1016/j.tcs.2008.12.016.
  • [32] Gupta S, Prasad R, Yadav S. Searching Gapped Palindromes in DNA Sequences using Dynamic Suffix Array. Indian Journal of Science and Technology, 2015. 8(23):1. doi:10.17485/ijst/2015/v8i23/70645.
  • [33] Kolpakov R, Kucherov G. Searching for gapped palindromes. Theor. Comput. Sci., 2009. 410(51):5365-5373. doi:10.1016/j.tcs.2009.09.013.
  • [34] Crochemore M, Hancart C, Lecroq T. Algorithms on Strings. Cambridge University Press, 2007.
  • [35] Fujishige Y, Nakamura M, Inenaga S, Bannai H, Takeda M. Finding Gapped Palindromes Online. In: Mäkinen et al. [36], 2016 pp. 191-202. doi:978-3-319-44543-4_15.
  • [36] Mäkinen V, Puglisi SJ, Salmela L (eds.). Combinatorial Algorithms - 27th International Workshop, IWOCA 2016, Helsinki, Finland, August 17-19, 2016, Proceedings, volume 9843 of Lecture Notes in Computer Science. Springer, 2016. ISBN 978-3-319-44542-7. doi:10.1007/978-3-319-44543-4.
  • [37] Crochemore M, Rytter W. Jewels of stringology: text algorithms. World Scientific, 2003.
  • [38] Kosolobov D, Rubinchik M, Shur AM. Palk is Linear Recognizable Online. In: Proc. 41th Int. Conf. on Theory and Practice of Computer Science (SOFSEM 2015). 2015 pp. 289-301. doi: 10.1007/978-3-662-46078-8_24.
  • [39] Amir A, Chencinski E, Iliopoulos C, Kopelowitz T, Zhang H. Property matching and weighted matching. Theoretical Computer Science, 2008. 395(2-3):298-310. doi:10.1016/j.tcs.2008.01.006.
  • [40] Amir A, Iliopoulos C, Kapah O, Porat E. Approximate matching in weighted sequences. In: CPM, volume 4009 of LNCS. Springer, 2006 pp. 365-376. doi:10.1007/11780441_33.
Typ dokumentu
Bibliografia
Identyfikator YADDA
bwmeta1.element.baztech-a46e0563-7410-4ce2-b58b-9dfd3cf326cb
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.