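Pipelined implementation of the multiply-and-add operation for double-precision floating-point arguments (Potokowa realizacja operacji pomnóż i dodaj dla argumentów zmiennoprzecinkowych podwójnej precyzji)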

Russek, P.; Wiatr, K.

Title variants

Pipeline implementation of multiply and accumulate double precision floating point operation

Abstract

The multiply-and-accumulate (MAC) operation is a foundation of numerical computation in contemporary science and engineering, and the ability to execute it quickly is crucial to the performance of a computing system. Besides parallel execution, pipelining is an important and widely used acceleration technique: it increases the throughput of computational modules at the cost of longer latency. For the MAC operator, however, pipelining runs into difficulties because of the feedback loop in the datapath. This paper presents a pipelined implementation of the MAC operation and the results of implementing it in an FPGA for double-precision floating-point arguments.
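
The abstract does not describe the proposed architecture, so the following is only a minimal sketch of the general difficulty it mentions, not the authors' design. It assumes a hypothetical adder pipeline of `adder_latency` stages that accepts one product per cycle. Because a result leaves the adder only `adder_latency` cycles after it entered, the value fed back to the adder input belongs to an older product, and the accumulator effectively splits into `adder_latency` interleaved partial sums that must be combined at the end. The function name and parameters are illustrative.

```python
from collections import deque


def pipelined_mac(a, b, adder_latency=4):
    """Cycle-level model of a MAC with a pipelined adder (illustrative only).

    One product a[i] * b[i] is issued per cycle. The adder output that is fed
    back in a given cycle was started adder_latency cycles earlier, so the
    pipeline ends up holding adder_latency independent partial sums.
    """
    # Adder pipeline registers; zeros stand for empty partial sums.
    pipe = deque([0.0] * adder_latency)
    for x, y in zip(a, b):
        feedback = pipe.popleft()      # result leaving the adder this cycle
        pipe.append(feedback + x * y)  # new addition enters the pipeline
    # Final reduction of the interleaved partial sums.
    return sum(pipe)


if __name__ == "__main__":
    print(pipelined_mac([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

With `adder_latency=1` the model reduces to an ordinary sequential accumulator. For larger latencies the additions are performed in a different order, so with general floating-point inputs the rounded result may differ slightly from a sequential sum.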

Keywords

FPGA devices; high-complexity computation; dedicated architectures

FPGA; supercomputing; custom computing machines

Publisher

Wydawnictwo PAK

Journal

Pomiary Automatyka Kontrola

Year

2007

Volume

Vol. 53, no. 7

Pages

36-38

Physical description

Bibliography: 8 items; figures; tables.

Contributors

author

Russek P.

author

Wiatr K.

ACK-CYFRONET, Department of Electronics, AGH University of Science and Technology (Akademia Górniczo-Hutnicza), Kraków, russek@agh.edu.pl

Bibliography

[1] Yong Dou, S. Vassiliadis, G. K. Kuzmanov, G. N. Gaydadjiev, “64-bit floating-point FPGA matrix multiplication”, Proceedings of the 2005 ACM/SIGDA 13th International Symposium on Field-Programmable Gate Arrays.
[2] Ling Zhuo, Viktor K. Prasanna, “High Performance Linear Algebra Operations on Reconfigurable Systems”, Proceedings of the 2005 ACM/IEEE Conference on Supercomputing, SC'05.
[3] K. Fatahalian, J. Sugerman, P. Hanrahan, “Computation: Understanding the efficiency of GPU algorithms for matrix-matrix multiplication”, Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware, HWWS'04, August 2004.
[4] J. J. Dongarra, Jeremy Du Croz, Sven Hammarling, I. S. Duff, “A set of level 3 basic linear algebra subprograms”, ACM Transactions on Mathematical Software (TOMS), Volume 16, Issue 1, March 1990, ACM Press.
[5] Viktor K. Prasanna, “Energy-Efficient Computations on FPGAs”, The Journal of Supercomputing, Volume 32, Issue 2, May 2005, Kluwer Academic Publishers.
[6] Silicon Graphics, “SGI® RASC™ RC100 Blade, Dramatic Application Speed-up with Next Generation Reconfigurable Compute Technology”, www.sgi.com
[7] Xilinx, “Virtex-4 User Guide”, www.xilinx.com
[8] J. A. Kahle, M. N. Day, H. P. Hofstee, C. R. Johns, T. R. Maeurer, D. Shippy, “Introduction to the Cell multiprocessor”, IBM Journal of Research and Development, published online September 7, 2005.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BSW4-0039-0012