Article title

G-MAPSEQ – a new method for mapping reads to a reference genome

Publication languages
EN
Abstract
EN
Mapping reads to a reference genome is one of the most essential problems in modern computational biology. The most popular algorithms for this task are based on the Burrows-Wheeler transform and the FM-index. However, these approaches struggle with highly mutated sequences, because they allow only a limited number of mutations. G-MAPSEQ is a novel hybrid algorithm that combines two methods: alignment-free sequence comparison and ultrafast sequence alignment. The former is a fast heuristic that uses k-mer characteristics of nucleotide sequences to find potential mapping positions. The latter is a very fast GPU implementation of sequence alignment that verifies whether these candidate positions are correct. The source code of G-MAPSEQ, along with other bioinformatics software, is available at: http://gpualign.cs.put.poznan.pl.
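The two-stage idea described in the abstract (k-mer based candidate search followed by alignment-based verification) can be illustrated with a minimal CPU-only sketch. This is not the G-MAPSEQ implementation: the function names, the diagonal-voting heuristic, the parameters `k`, `min_hits` and `max_errors`, and the use of plain edit distance in place of GPU sequence alignment are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer of the reference to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def candidate_positions(read, index, k, min_hits=2):
    """Stage 1 (hypothetical heuristic): each k-mer shared by the read and the
    reference votes for the reference offset at which the read would start."""
    votes = Counter()
    for j in range(len(read) - k + 1):
        for i in index.get(read[j:j + k], ()):
            votes[i - j] += 1
    return [pos for pos, hits in votes.most_common() if hits >= min_hits]


def edit_distance(a, b):
    """Levenshtein distance by dynamic programming (a CPU stand-in for the
    GPU alignment step mentioned in the abstract)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


def map_read(read, reference, k=4, max_errors=2):
    """Stage 2: verify each candidate position and keep the best one.
    Returns (position, distance), or None if no candidate passes."""
    index = build_kmer_index(reference, k)
    best = None
    for pos in candidate_positions(read, index, k):
        if pos < 0 or pos + len(read) > len(reference):
            continue  # candidate window falls outside the reference
        window = reference[pos:pos + len(read)]
        dist = edit_distance(read, window)
        if dist <= max_errors and (best is None or dist < best[1]):
            best = (pos, dist)
    return best


reference = "ACGTTGCAAGGCTTACGGATCCGATTACAGGCT"
read = "GCTTATGGATCC"  # reference[10:22] with one substitution (C -> T)
print(map_read(read, reference))  # (10, 1)
```

In this sketch the candidate window has the same length as the read, so insertions and deletions are penalised more heavily than a semi-global alignment would penalise them; a real mapper would typically pad the window and align with free end gaps.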
Pages
123–142
Physical description
Bibliography: 22 items, figures, tables.
Authors
  • Institute of Computing Science, Poznan University of Technology, Poland
  • Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poland
  • European Center for Bioinformatics and Genomics, Poland
author
  • Institute of Computing Science, Poznan University of Technology, Poland
  • European Center for Bioinformatics and Genomics, Poland
author
  • Institute of Computing Science, Poznan University of Technology, Poland
  • European Center for Bioinformatics and Genomics, Poland
  • Poznan Supercomputing and Networking Center, Poland
author
  • Institute of Computing Science, Poznan University of Technology, Poland
  • European Center for Bioinformatics and Genomics, Poland
author
  • Institute of Computing Science, Poznan University of Technology, Poland
  • Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poland
  • European Center for Bioinformatics and Genomics, Poland
Bibliography
  • [1] Blazewicz J., Frohmberg W., Kierzynka M., Pesch E., Wojciechowski P., Protein alignment algorithms with an efficient backtracking routine on multiple GPUs, BMC Bioinformatics, 12, 181, 2011.
  • [2] Blazewicz J., Frohmberg W., Kierzynka M., Wojciechowski P., G-MSA – A GPU-based, fast and accurate algorithm for multiple sequence alignment, J. Parallel. Distr. Com., 73, 1, 2013, 32–41.
  • [3] Ferragina P., Manzini G., Opportunistic Data Structures with Applications, Proceedings of the 41st Annual Symposium on Foundations of Computer Science, 2000.
  • [4] Fiannaca A., La Rosa M., Rizzo R., Urso A., A k-mer-based barcode DNA classification methodology based on spectral representation and a neural gas network, Artificial intelligence in medicine, 64, 3, 2015, 173–184.
  • [5] Fonseca N.A., Rung J., Brazma A., Marioni J.C., Tools for mapping high throughput sequencing data, Bioinformatics, 28, 24, 2012, 3169–3177.
  • [6] Frohmberg W., Kierzynka M., Blazewicz J., Gawron P., Wojciechowski P., G-DNA – a highly efficient multi-GPU/MPI tool for aligning nucleotide reads, Bulletin of the Polish Academy of Sciences Technical Sciences, 61, 4, 2013, 989–992.
  • [7] Holtgrewe M., Mason – a read simulator for second generation sequencing data, Technical Report Institut für Mathematik und Informatik, Freie Universität Berlin, TR-B-10-06, 2010.
  • [8] Holtgrewe M., Emde A.-K., Weese D., Reinert K., A Novel And Well-Defined Benchmarking Method For Second Generation Read Mapping, BMC Bioinformatics, 12, 210, 2011.
  • [9] Kierzynka M., GPU-accelerated graph construction for the whole genome assembly, PhD thesis, Poznan University of Technology, Poznan, Poland, 2014.
  • [10] Kuksa P., Pavlovic V., Efficient alignment-free DNA barcode analytics, BMC bioinformatics, 10, 14, 2009, 1–18.
  • [11] Langmead B., Salzberg S.L., Fast gapped-read alignment with Bowtie 2, Nat Methods, 9, 4, 2012, 357–359.
  • [12] Langmead B., Trapnell C., Pop M., Salzberg S.L., Ultrafast and memory efficient alignment of short DNA sequences to the human genome, Genome Biology, 10, 3, 2009, 1–10.
  • [13] Liu Y., Schröder J., Schmidt B., Musket: a multistage k-mer spectrum-based error corrector for Illumina sequence data, Bioinformatics, 29, 3, 2013, 308–315.
  • [14] Needleman S.B., Wunsch C.D., A general method applicable to the search for similarities in the amino acid sequence of two proteins, J. Mol. Biol., 48, 3, 1970, 443–453.
  • [15] Polychronopoulos D., Weitschek E., Dimitrieva S., Bucher P., Felici G., Almirantis Y., Classification of selectively constrained DNA elements using feature vectors and rule-based classifiers, Genomics, 104, 2, 2014, 79–86.
  • [16] Reinert G., Chew D., Sun F., Waterman M.S., Alignment-free sequence comparison (I): statistics and power, Journal of Computational Biology, 16, 12, 2009, 1615–1634.
  • [17] Vinga S., Almeida J., Alignment-free sequence comparison – a review, Bioinformatics, 19, 4, 2003, 513–523.
  • [18] Wan L., Reinert G., Sun F., Waterman M.S., Alignment-free sequence comparison (II): theoretical power of comparison statistics, Journal of Computational Biology, 17, 11, 2010, 1467–1490.
  • [19] Weese D., Emde A.-K., Rausch T., Döring A., Reinert K., RazerS – fast read mapping with sensitivity control, Genome Research, 19, 2009, 1646–1654.
  • [20] Weese D., Holtgrewe M., Reinert K., RazerS 3: faster, fully sensitive read mapping, Bioinformatics, 28, 20, 2012, 2592–2599.
  • [21] Weitschek E., Cunial F., Felici G., LAF: Logic Alignment Free and its application to bacterial genomes classification, BioData mining, 8, 1, 2015.
  • [22] Weitschek E., Santoni D., Fiscon G., De Cola M.C., Bertolazzi P., Felici G., Next generation sequencing reads comparison with an alignment-free distance, BMC research notes, 7, 1, 2014, 1–13.
Notes
Prepared with funds from the Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for science dissemination activities.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-31194866-63dd-467b-b605-50c2ef008e79