Analiza wydajności wybranych metod równoległej realizacji algorytmów mnożenia macierzy dla GPU

Michalski, G.; Woźniak, M.

Title variants

The efficiency analysis of chosen methods of parallel implementation of matrix multiplication algorithms for GPU

Abstract

This paper presents the architecture of nVidia 8-series graphics processors and the CUDA technology. It describes example methods of implementing selected parallel matrix multiplication algorithms (for dense and sparse matrices) on graphics processing units. It also gives a comparative analysis of the results obtained on general-purpose processors and on graphics processors.

Keywords

graphics cards, graphics processors, GPU, GPGPU, CUDA, concurrent and parallel processing

Publisher

Wydawnictwo Politechniki Częstochowskiej

Journal

Informatyka Teoretyczna i Stosowana

Year

2008

Volume

Vol. 8, no. 13

Pages

93-101

Physical description

Bibliography: 9 items, figures, tables.

Contributors

author

Michalski G.

author

Woźniak M.

Instytut Informatyki Teoretycznej i Stosowanej, Politechnika Częstochowska, ul. Dąbrowskiego 73, 42-200 Częstochowa

Bibliography

[1] Strzodka R., Doggett M. C., Kolb A., Scientific computation for simulations on programmable graphics hardware. Simulation Modelling Practice and Theory 2005, 13(8), 667-680.
[2] NVIDIA TESLA Computing Solutions.
[3] Dokken T., Hagen T. R., Hjelmervik J. M., An introduction to general-purpose computing on programmable graphics hardware. In Geometric Modelling, Numerical Simulation, and Optimization, Springer Berlin Heidelberg 2007, 123-161.
[4] Hagen T., Introduction to Stream Processing. http://www.sintef.no/upload/IKT/9011/SimOslo/eVITA/2008/hagen.pdf.
[5] Fatahalian K., Houston M., A closer look at GPUs. Commun. ACM 2008, 51(10), 50-57.
[6] Owens J. D., Luebke D., Govindaraju N., Harris M., Krüger J., Lefohn A. E., Purcell T. J., A survey of general-purpose computation on graphics hardware. Computer Graphics Forum 2007, 26(1), 80-113.
[7] Fujimoto N., Faster matrix-vector multiplication on GeForce 8800GTX. http://www.mi.s.osakafu-u.ac.jp/~fujimoto/CUDA/fujimoto_lspp2008.pdf.
[8] Sengupta S., Harris M., Zhang Y., Owens J. D., Scan primitives for GPU computing. In GH'07, Proceedings of the 22nd ACM SIGGRAPH/EUROGRAPHICS Symposium on Graphics Hardware, Aire-la-Ville, Switzerland 2007, Eurographics Association, 97-106.
[9] Garland M., Sparse matrix computations on manycore GPU's. In DAC08, Proceedings of the 45th Annual Conference on Design Automation, New York, NY, USA 2008, ACM, 2-6.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BPC1-0001-0069