Poprawa precyzji obliczeń zmiennoprzecinkowych w GPGPU za pomocą dwóch akumulatorów

Pawłowski, P.; Dąbrowski, A.; Stankiewicz, M.; Misiorek, F.

Article - details

Article title

Poprawa precyzji obliczeń zmiennoprzecinkowych w GPGPU za pomocą dwóch akumulatorów

Authors

Pawłowski P., Dąbrowski A., Stankiewicz M., Misiorek F.

Title variants

Improvement of precision of floating-point computations in GPGPU by means of two accumulators

Abstract

This paper presents the idea of using a two-accumulator model to improve the precision of computations performed on graphics processing units (GPUs). Using two floating-point accumulators increases the precision of computations without lengthening the data words. This is particularly important where hardware limits the available precision, as is the case in graphics processors. The paper outlines the history of graphics card core architectures and the idea of general-purpose computing on GPUs (GPGPU), with particular attention to computational performance and accuracy. The proposed approach was tested on FIR filters, for which the computation results are given. The GPGPU implementation is compared with the classic solution based on a general-purpose processor.
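The effect described in the abstract can be illustrated with a minimal sketch. It is not taken from the article: it assumes one common form of the technique, in which the first accumulator holds the running sum and the second collects the rounding error of each addition (an error-free TwoSum step). The function names `f32`, `fir_single` and `fir_two_acc` and the test data are hypothetical, and single precision is emulated on a CPU in Python rather than run in CUDA. Rounding errors of the products themselves are not compensated here.

```python
import math
import struct


def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def fir_single(coeffs, window):
    """One FIR output sample computed with a single binary32 accumulator."""
    acc = 0.0
    for h, x in zip(coeffs, window):
        acc = f32(acc + f32(h * x))
    return acc


def fir_two_acc(coeffs, window):
    """One FIR output sample computed with two binary32 accumulators.

    hi holds the running sum; lo collects the rounding error of every
    addition into hi (TwoSum). Assumed scheme, for illustration only.
    """
    hi = 0.0
    lo = 0.0
    for h, x in zip(coeffs, window):
        p = f32(h * x)
        s = f32(hi + p)
        # Recover what was lost when hi + p was rounded to s.
        bp = f32(s - hi)
        err = f32(f32(hi - f32(s - bp)) + f32(p - bp))
        hi = s
        lo = f32(lo + err)
    return f32(hi + lo)


# Hypothetical data: one large tap followed by many tiny ones, all binary32.
coeffs = [f32(1.0)] + [f32(1e-8)] * 1000
window = [f32(1.0)] * 1001

single = fir_single(coeffs, window)
paired = fir_two_acc(coeffs, window)
# Reference computed in double precision from the same binary32 inputs.
ref = math.fsum(h * x for h, x in zip(coeffs, window))

print("single accumulator error:", abs(single - ref))
print("two accumulators error:  ", abs(paired - ref))
```

With this input every small tap is absorbed when added to the single accumulator, whereas the second accumulator retains the lost parts, so the word length stays at 32 bits while the result moves closer to the double-precision reference.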

Keywords

floating-point numbers; general-purpose computing on GPU (GPGPU); CUDA; parallel processing; double accumulator; precision of computations; FIR filters

Publisher

Wydawnictwo SIGMA-NOT

Journal

Elektronika : konstrukcje, technologie, zastosowania

Year

2011

Volume

Vol. 52, no. 5

Pages

43-48

Physical description

Bibliography: 18 items, tables, charts.

Contributors

author

Pawłowski P.

author

Dąbrowski A.

author

Stankiewicz M.

author

Misiorek F.

Politechnika Poznańska, Wydział Informatyki, Katedra Sterowania i Inżynierii Systemów

Bibliography

[1] Bohlender G., Walter W., Kornerup P., Matula D.: Semantics for Exact Floating Point Operations, IEEE, 1991.
[2] Boite R., Xian-Liang H., Renard J. P.: A comparison of fixed-point and floating-point realization of digital filter, Conference Proceedings on Area Communication, EUROCON 88., 8th European Conference on Electrical Eng., 1988, pp. 142-145.
[3] Fettweis A.: On the properties of floating-point roundoff noise. IEEE Trans. Acoustics, Speech and Signal Processing, ASSP-22, 1974, pp. 149-151.
[4] Dąbrowski A.: Odzysk pseudomocy użytecznej w wieloszybkościowym przetwarzaniu sygnałów - Rozprawy nr 198, Wydawnictwo Politechniki Poznańskiej, Poznań 1988.
[5] Han T. D., Abdelrahman T. S.: hiCUDA: High-Level GPGPU Programming. IEEE Transactions on Parallel and Distributed Systems, Volume: 22, Issue: 1, 2011, pp. 78-90.
[6] Harris M., Coombe G., Scheuermann T., Lastra A.: Physically-Based Visual Simulation on Graphics Hardware. SIGGRAPH/Eurographics Workshop on Graphics Hardware, 2002.
[7] IEEE Standard Board, IEEE Standard for Floating-Point Arithmetic, IEEE Std 754-2008.
[8] Kramer W.: A priori worst case error bounds for floating-point computations. IEEE Trans. Comp., 47, July 1998, pp. 750-756.
[9] Kulisch U., Miranker W.: Computer Arithmetic in Theory and Practice. Academic Press, 1980.
[10] NVIDIA, Technical Brief: NVIDIA GeForce® GTX 200 GPU Architectural Overview - NVIDIA 2008 http://www.nvidia.com/docs/IO/55506/GeForce_GTX_200_GPU_Technical_Brief.pdf
[11] NVIDIA, CUDA Programming Guide, 2011, http://www.nvidia.com
[12] NVIDIA, Fermi Compute Architecture Whitepaper – NVIDIA 2009 http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf
[13] NVIDIA, CUDA GPUs, NVIDIA 2011, http://www.nvidia.com/object/cuda_gpus.html
[14] Ogorzałek M., Hasler M., Boite R.: Chaos in digital recursive filters. Signal Processing VI: Theory and Applications, Elsevier, Holland, 1992, pp. 191-194.
[15] Pawłowski P., Dąbrowski A.: Metoda zwiększania precyzji obliczeń za pomocą dwóch zmiennoprzecinkowych akumulatorów w procesorach sygnałowych VLIW. Elektronika 11/2006, pp. 5-9.
[16] Rao B. D.: Floating point arithmetic and digital filters. European Signal Processing Conference, 40(1), 1992, pp. 85-95.
[17] Rosenband D. L., Rosenband T.: A design case study: CPU vs. GPGPU vs. FPGA. 7th IEEE/ACM International Conference on Formal Methods and Models for Co-Design (MEMOCODE '09), 2009, pp. 69-72.
[18] Rumpf M., Strzodka R.: Using graphics cards for quantized FEM computations. Proceedings VIIP'01, 2001, pp. 193-202.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BWAK-0024-0004