G-DNA – a highly efficient multi-GPU/MPI tool for aligning nucleotide reads

Frohmberg, W.; Kierzynka, M.; Blazewicz, J.; Gawron, P.; Wojciechowski, P.

Abstract

DNA/RNA sequencing has recently become a primary way for researchers to generate biological data for further analysis. Assembly algorithms are an integral part of this process, and some of them require pairwise alignment to be applied to a very large number of reads. Although several efficient alignment tools have been released over the past few years, including those taking advantage of GPUs (Graphics Processing Units), none of them directly targets high-throughput sequencing data. As a result, a need arose for software that could handle such data as effectively as possible. G-DNA (GPU-based DNA aligner) is the first highly parallel solution that has been optimized to process nucleotide reads (DNA/RNA) from modern sequencing machines. Results show that the software reaches up to 89 GCUPS (Giga Cell Updates Per Second) on a single GPU, making it the fastest tool in its class. Moreover, it scales well on multi-GPU systems, including MPI-based computational clusters, where its performance is measured in TCUPS (Tera CUPS).
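To make the GCUPS figure concrete, the following is a minimal CPU sketch of pairwise global alignment scoring and of how a cell-update rate is usually derived from it. It is not G-DNA's GPU implementation: the choice of Needleman-Wunsch with a linear gap penalty, the scoring values, and the function names `nw_score`, `cell_updates` and `gcups` are all assumptions made for illustration, since this record does not describe the tool's actual scoring scheme or kernels.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score (Needleman-Wunsch, linear gap penalty).

    The scoring values are illustrative assumptions, not G-DNA's defaults.
    Only two rows of the dynamic-programming matrix are kept in memory.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def cell_updates(pairs):
    """Number of matrix cells filled when aligning every (a, b) pair."""
    return sum(len(a) * len(b) for a, b in pairs)


def gcups(pairs, seconds):
    """Giga cell updates per second for a batch aligned in `seconds`."""
    return cell_updates(pairs) / seconds / 1e9


if __name__ == "__main__":
    reads = [("GATTACA", "GATTACA"), ("ACGT", "ACT")]
    for a, b in reads:
        print(a, b, nw_score(a, b))
    print("cells filled:", cell_updates(reads))
```

Under this definition, a rate of 89 GCUPS corresponds to filling 89 billion matrix cells per second, and 1 TCUPS equals 1000 GCUPS.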

Keywords

DNA assembly preprocessing; sequence alignment; GPU computing

Publisher

Polska Akademia Nauk, Wydział IV Nauk Technicznych

Journal

Bulletin of the Polish Academy of Sciences. Technical Sciences

Year

2013

Volume

Vol. 61, no. 4

Pages

989-992

Physical description

Bibliography: 26 items, charts.

Contributors

author

Frohmberg W.

Institute of Computing Science, Poznań University of Technology, Poznań, Poland

author

Kierzynka M.

michal.kierzynka@cs.put.poznan.pl

Institute of Computing Science, Poznań University of Technology, Poznań, Poland
Poznań Supercomputing and Networking Center, Poznań, Poland

author

Blazewicz J.

Institute of Computing Science, Poznań University of Technology, Poznań, Poland
Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznań, Poland

author

Gawron P.

Institute of Computing Science, Poznań University of Technology, Poznań, Poland

author

Wojciechowski P.

Institute of Computing Science, Poznań University of Technology, Poznań, Poland
Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Luxembourg

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-e40c9add-d0a4-404d-be52-05cde298a3e0