PL EN


Preferencje help
Widoczny [Schowaj] Abstrakt
Liczba wyników
Tytuł artykułu

Spaced Seed Design Using Perfect Rulers

Autorzy
Wybrane pełne teksty z tego czasopisma
Identyfikatory
Warianty tytułu
Języki publikacji
EN
Abstrakty
EN
A widely used class of approximate pattern matching algorithms work in two stages, the first being a filtering stage that uses spaced seeds to quickly discards regions where a match is not likely to occur. The design of effective spaced seeds is known to be a hard problem. In this setting, we propose a family of lossless spaced seeds for matching with up to two errors based on mathematical objects known as perfect rulers. We analyze these seeds with respect to the tradeoff they offer between seed weight and the minimum length of the pattern to be matched. We identify a specific property of rulers, namely their skewness, which is closely related to the minimum pattern length of the derived seeds. In this context, we study in depth the specific case of Wichmann rulers and investigate the generalization of our approach to the larger class of unrestricted rulers. Although our analysis is mainly of theoretical interest, we show that for pattern lengths of practical relevance our seeds have a larger weight, hence a better filtration efficiency, than the ones known in the literature.
Wydawca
Rocznik
Strony
187--203
Opis fizyczny
Bibliogr. 20 poz., tab.
Twórcy
autor
  • DiSIT, Computer Science Institute, Università del Piemonte Orientale, Italy
autor
  • DiSIT, Computer Science Institute, Università del Piemonte Orientale, Italy
Bibliografia
  • [1] Stephen Altschul, Warren Gish, Webb Miller, Eugene Myers, and David Lipman. Basic local alignment search tool. J Mol Biol, 215:403–410, 1990.
  • [2] Stefan Burkhardt and Juha K¨arkk¨ainen. Better filtering with gapped q-grams. Fundam. Inform., 56(1-2):51–70, 2003.
  • [3] Yangho Chen, Tade Souaiaia, and Ting Chen. Perm: Efficient mapping of short sequencing reads with periodic full sensitive spaced seeds. Bioinformatics, 25(19):2514–2521, 2009.
  • [4] Lavinia Egidi and Giovanni Manzini. Better spaced seeds using quadratic residues. J. Comput. Syst. Sci., 79(7):1144–1155, 2013.
  • [5] Lavinia Egidi and Giovanni Manzini. Minimum pattern length for short spaced seeds based on linear rulers (revised). Technical Report TR-INF-2013-07-01-UNIPMN,DiSIT, Computer Science Institute, UPO, 2013. http://www.di.unipmn.it.
  • [6] P. Erdós and I. S. Gál. On the representation of 1, 2, . . . , n by differences. Indagationes Math., 10:379–382, 1948.
  • [7] Martin Farach-Colton, Gad M. Landau, Süleyman Cenk Sahinalp, and Dekel Tsur. Optimal spaced seeds for faster approximate string matching. J. Comput. Syst. Sci., 73(7):1035–1044, 2007.
  • [8] Xiaoqiu Huang and Webb Miller. A time-efficient linear-space local similarity algorithm. Adv. Appl. Math., 12:337–357, September 1991.
  • [9] Uri Keich, Ming Li, Bin Ma, and John Tromp. On spaced seeds for similarity search. Discrete Applied Mathematics, 138(3):253–263, 2004.
  • [10] Gregory Kucherov, Laurent No´e, and Mikhail A. Roytberg. Multiseed lossless filtration. IEEE/ACM Trans. Comput. Biology Bioinform., 2(1):51–61, 2005.
  • [11] J. Leech. On the representation of 1, 2, . . . , n by differences. J. London Math. Soc., 31:160–169, 1956.
  • [12] Ming Li, Bin Ma, Derek Kisman, and John Tromp. Pattern Hunter II: Highly sensitive and fast homology search. J. Bioinformatics and Computational Biology, 2(3):417–440, 2004.
  • [13] D. J. Lipman and W. R. Pearson. Rapid and sensitive protein similarity searches. Science (New York, N.Y.), 227(4693):1435–1441, 1985.
  • [14] Peter Luschny. Perfect and optimal rulers, 2003. http://www.luschny.de/math/rulers/prulers.html.
  • [15] Bin Ma and Ming Li. On the complexity of the spaced seeds. J. Comput. Syst. Sci., 73(7):1024–1034, 2007.
  • [16] Bin Ma, John Tromp, and Ming Li. Pattern Hunter: Faster and more sensitive homology search. Bioinformatics, 18(3):440–445, 2002.
  • [17] Bin Ma and Hongyi Yao. Seed optimization is no easier than optimal Golomb ruler design. In Alvis Brazma, Satoru Miyano, and Tatsuya Akutsu, editors, APBC, volume 6 of Advances in Bioinformatics and Computational Biology, pages 133–144. Imperial College Press, 2008.
  • [18] Bin Ma and Hongyi Yao. Seed optimization for i.i.d. similarities is no easier than optimal Golomb ruler design. Information Processing Letters, 109(19):1120–1124, 2009.
  • [19] Francois Nicolas and Eric Rivals. Hardness of optimal spaced seed design. J. Comput. Syst. Sci., 74(5):831–849, 2008.
  • [20] B. Wichmann. A note on restricted difference bases. J. London Math. Soc., 38:465–466, 1962.
Typ dokumentu
Bibliografia
Identyfikator YADDA
bwmeta1.element.baztech-324358ef-47a4-4e42-b288-b74a2d8a59bf
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.