A parallel hardware-oriented algorithm for constant matrix-vector multiplication with reduced multiplicative complexity

Cariow, A.; Cariow, G.

Title variants

Równoległy sprzętowo zorientowany algorytm mnożenia macierzy stałych przez wektor ze zredukowaną złożonością multiplikatywną

Abstracts

This paper presents the algorithmic aspects of organizing a low-complexity, fully parallel processor unit for computing constant matrix-vector products. To reduce the hardware complexity (the number of two-operand multipliers), we exploit Winograd's inner product calculation approach. We show that, with this approach, the computation of the constant matrix-vector product can be structured so that it requires fewer multipliers than a direct implementation of matrix-vector multiplication.
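For orientation, the identity usually attributed to Winograd's 1968 inner product algorithm can be written as below. The notation (a_j for the constant coefficients, x_j for the input samples, N even) is an assumption made here for illustration; this record does not reproduce the paper's own notation.

```latex
\sum_{j=1}^{N} a_j x_j
  = \sum_{k=1}^{N/2} \bigl(a_{2k-1} + x_{2k}\bigr)\bigl(a_{2k} + x_{2k-1}\bigr)
  - \sum_{k=1}^{N/2} a_{2k-1}\, a_{2k}
  - \sum_{k=1}^{N/2} x_{2k-1}\, x_{2k}
```

Expanding one product gives a_{2k-1}a_{2k} + a_{2k-1}x_{2k-1} + a_{2k}x_{2k} + x_{2k-1}x_{2k}, so subtracting the two correction sums leaves exactly the inner product. When the a_j are constants, the first correction sum can be computed in advance, and the second depends only on the input vector.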

This paper presents a hardware-oriented algorithm for computing the product of a vector and a matrix of constants. In contrast to an implementation of the naive parallelization of the computation, which requires N² multipliers, the proposed parallel structure requires only N(M+1)/2 multipliers. Because a multiplier consumes far more hardware resources of the implementation platform than an adder, minimizing the number of multipliers is a primary concern when designing dedicated computing units. The synthesis of the algorithm is based on using S. Winograd's method to compute the partial inner products. The presented algorithm can be used to accelerate computation in digital data processing subsystems implemented on FPGA platforms, and it can also be implemented in any other hardware environment, for example as an ASIC. In the latter case, an undoubted advantage of the presented solution is that a circuit designed this way will consume less energy and dissipate less heat.
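The multiplier count quoted in the abstract can be checked with a small software model. The sketch below is not the authors' hardware structure; it is a minimal illustration that assumes an M-by-N constant matrix with N even, and the function name `winograd_const_matvec` and its multiplication counter are hypothetical. It applies Winograd's inner product identity to each row and counts only the multiplications that would have to be performed at run time.

```python
def winograd_const_matvec(A, x):
    """Compute y = A x with Winograd's inner-product identity.

    A is a constant M-by-N matrix (list of rows), x has length N, N even.
    Returns (y, mults), where mults counts run-time multiplications only.
    """
    N = len(x)
    if N % 2 != 0 or any(len(row) != N for row in A):
        raise ValueError("this sketch assumes N even and rows of length N")
    half = N // 2

    # Row corrections depend only on the constants in A, so they are
    # assumed to be precomputed offline and are not counted.
    xi = [sum(row[2 * k] * row[2 * k + 1] for k in range(half)) for row in A]

    mults = 0

    # Input correction: N/2 multiplications, shared by all rows.
    eta = 0
    for k in range(half):
        eta += x[2 * k] * x[2 * k + 1]
        mults += 1

    # Each row needs N/2 multiplications of pre-added operands.
    y = []
    for i, row in enumerate(A):
        acc = 0
        for k in range(half):
            acc += (row[2 * k] + x[2 * k + 1]) * (row[2 * k + 1] + x[2 * k])
            mults += 1
        y.append(acc - xi[i] - eta)
    return y, mults


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]  # M = 3, N = 4
x = [1, -2, 3, -4]
y, mults = winograd_const_matvec(A, x)
print(y, mults)  # direct evaluation would need M * N = 12 multiplications
```

In this model the count is M·N/2 for the rows plus N/2 for the shared input term, which equals N(M+1)/2 and agrees with the figure given in the abstract under the stated assumptions. The saving in multipliers is paid for with additional adders, which the abstract argues are much cheaper in hardware.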

Keywords

constant coefficient matrix-vector multiplier; hardware complexity reduction; FPGA implementation

matrix multiplier circuit; hardware complexity reduction; FPGA implementation

Publisher

Wydawnictwo PAK

Journal

Pomiary Automatyka Kontrola

Year

2014

Volume

Vol. 60, no. 7

Pages

510-512

Physical description

Bibliography: 11 items, figures.

Authors

author

Cariow A.

atariov@wi.zut.edu.pl

West Pomeranian University of Technology, Szczecin, Żołnierska St. 49, 71-210 Szczecin

author

Cariow G.

gtariova@wi.zut.edu.pl

West Pomeranian University of Technology, Szczecin, Żołnierska St. 49, 71-210 Szczecin

Bibliography

[1] Amira A., Bouridane A. and Milligan P.: Accelerating matrix product on reconfigurable hardware for signal processing, in Proc. 11th Int. Conf. Field-Programmable Logic Appl. (FPL), 2001, pp. 101–111.
[2] Oscar Gustafsson, Henrik Ohlsson, and Lars Wanhammar: Low-Complexity Constant Coefficient Matrix Multiplication Using a Minimum Spanning Tree Approach, Proceedings of the 6th Nordic Signal Processing Symposium - NORSIG 2004, June 9 - 11, 2004, Espoo, Finland, pp. 141-144.
[3] Jang J., Choi S. and Prasanna V. K.: Energy-efficient matrix multiplication on FPGAs, in Proc. Int. Conf. Field Programmable Logic Appl., 2002, pp. 534–544.
[4] Jang J. W., Choi S. and Prasanna V. K.: Area and time efficient implementations of matrix multiplication on FPGAs, in Proc. IEEE Int. Conf. Field Programmable Technol., 2002, pp. 93–100.
[5] Syed M. Qasim, Ahmed A. Telba and Abdulhameed Y. AlMazroo: FPGA Design and Implementation of Matrix Multiplier Architectures for Image and Signal Processing Applications, IJCSNS International Journal of Computer Science and Network Security, 2010, vol. 10, no. 2, pp. 168-176.
[6] Nicolas Boullis and Arnaud Tisserand: Some Optimizations of Hardware Multiplication by Constant Matrices, IEEE Transactions on Computers, 2005, vol. 54, no. 10, pp. 1271-1282.
[7] Kinane Andrew, Muresan Valentin and O'Connor Noel E.: Optimization of constant matrix multiplication operation hardware using a genetic algorithm. In: EvoHOT 2006 - 3rd European Workshop on Evolutionary Computation in Hardware Optimization, 10-12 April 2006, Budapest, Hungary, pp. 296-307.
[8] Boullis N. and Tisserand A.: Some Optimizations of Hardware Multiplication by Constant Matrices, IEEE Trans. Comput., 2005, vol. 54, no. 10, pp. 1271-1282.
[9] Andrew Kinane, Valentin Muresan and Noel O'Connor: Towards an Optimised VLSI Design Algorithm for the Constant Matrix Multiplication Problem, ISCAS 2006, pp. 5111-5114.
[10] Winograd S.: A New Algorithm for Inner Product, IEEE Transactions on Computers, 1968, vol. C-17, no. 7, pp. 693-694.
[11] Willi-Hans Steeb, Yorick Hardy: Matrix Calculus and Kronecker Product: A Practical Approach to Linear and Multilinear Algebra, World Scientific Publishing Company, 2nd edition (March 24, 2011), 324 pages.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-63192aab-6f4b-4d62-a669-34692f675fe1