Automatyczne zrównoleglanie kodu aplikacji systemów wbudowanych

Pałkowski, M.

Article details

Article title

Automatyczne zrównoleglanie kodu aplikacji systemów wbudowanych

Authors

Pałkowski M.

Title variants

Automatic parallelization of application code in embedded systems

Abstracts

Even in a fairly conservative class of solutions such as industrial computers, increasing miniaturization of processing units is becoming noticeable. Size and power consumption matter, but so does processing efficiency. Installing multi-core processors in embedded systems makes it possible to execute parallel code written with the OpenMP standard. Multi-core programming speeds up calculations; for example, in test and measurement-processing systems it increases the amount of measurement data that can be processed. This requires techniques for transforming program code into a parallel form. Loop parallelization transformations are particularly important because the vast majority of calculations take place in loops. Many loop parallelization techniques exist, such as unimodular and affine transformations. However, these techniques extract parallelism only for a specific class of loops and, because of their limitations, fail to find all the parallelism available in a loop. This paper presents the Iteration Space Slicing Framework, which was designed to extract parallelism from loops automatically and to overcome the limitations of well-known techniques. The result of the transformation is parallel code containing OpenMP pragmas. The speedup, efficiency, and locality of the code are examined, and future continuation of the work is considered.

The article presents a technique for automatically parallelizing application code so that the computing power of multi-core processors in embedded systems is used efficiently. The technique is based on analyzing data dependences in program loops, partitioning their iteration space, and determining independent code fragments. The transformation produces parallel code that conforms to the OpenMP standard and is equivalent to its sequential counterpart, which makes it possible to speed up computations on an industrial computer.
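
As a rough illustration of what partitioning an iteration space into independent fragments can look like, the C sketch below takes a loop in which iteration i depends only on iteration i - STRIDE. The iterations then fall into STRIDE separate dependence chains that never exchange data, so each chain can run on its own thread without synchronization. The loop itself, the constant STRIDE, and the function names update_sequential and update_sliced are assumptions made for this example and do not come from the article. The chains are identified by hand here, whereas the article describes deriving them automatically.

```c
#include <assert.h>
#include <stddef.h>

#define STRIDE 4 /* hypothetical dependence distance, chosen for illustration */

/* Sequential reference: iteration i reads the value written by iteration i - STRIDE. */
void update_sequential(int *a, size_t n) {
    assert(a != NULL);
    for (size_t i = STRIDE; i < n; i++)
        a[i] = a[i - STRIDE] + 1;
}

/* Hand-sliced version: iterations STRIDE .. 2*STRIDE-1 depend on no other
 * iteration, so each of them starts one independent chain
 * s, s + STRIDE, s + 2*STRIDE, ...
 * Chains touch disjoint array elements, so the outer loop is parallel;
 * the order inside each chain is kept sequential. */
void update_sliced(int *a, size_t n) {
    assert(a != NULL);
    #pragma omp parallel for
    for (int s = STRIDE; s < 2 * STRIDE; s++) {
        for (size_t i = (size_t)s; i < n; i += STRIDE)
            a[i] = a[i - STRIDE] + 1;
    }
}
```

Compiled without OpenMP support, the pragma is ignored and update_sliced simply runs the chains one after another, which gives the same array contents as update_sequential.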

Keywords

embedded system; multicore processors; OpenMP; loop parallelization; synchronization-free slices; multi-threading applications

embedded systems; multi-core processors; code parallelization; code fragment; OpenMP

Publisher

Wydawnictwo PAK

Journal

Pomiary Automatyka Kontrola

Year

2010

Volume

Vol. 56, no. 7

Pages

656-658

Physical description

Bibliography: 18 items, figures

Contributors

author

Pałkowski M.

Zachodniopomorski Uniwersytet Technologiczny, Wydział Informatyki, Katedra Inżynierii Oprogramowania, ul. Żołnierska 49, 71-210 Szczecin, mpalkowski@wi.zut.edu.pl

Bibliography

[1] Intel multi-core processors for embedded systems. http://www.intel.com/products/embedded/processors.htm?iid=embnav1+network_proc
[2] OpenMP standard. http://www.openmp.org
[3] Oh J., Kim S. W., Chulwoo K.: OpenMP and Compilation Issue in Embedded Applications, Lecture Notes in Computer Science, Volume 2716/2003, Springer Berlin / Heidelberg 2003, pp. 109-121.
[4] Hanawa T. et al.: Evaluation of Multicore Processors for Embedded Systems by Parallel Benchmark Program Using OpenMP, Lecture Notes in Computer Science, Vol. 5568, Berlin / Heidelberg 2009, pp. 15-27.
[5] Lim A. W., Lam M., Cheong G.: An affine partitioning algorithm to maximize parallelism and minimize communication. In ICS’99, pp. 228-237. ACM Press, 1999.
[6] Feautrier P.: Some efficient solutions to the affine scheduling problem, part I, II, one dimensional time, International Journal of Parallel Programming 21 (1992), pp. 313-348, 389-420.
[7] Banerjee U.: Unimodular transformations of double loops. Proceedings of the Third Workshop on Languages and Compilers for Parallel Computing. 1990, pp. 192-219.
[8] Beletska A., Bielecki W., Cohen A., Palkowski M.: Synchronization-free automatic parallelization: Beyond affine iteration-space slicing. In Languages and Compilers for Parallel Computing (LCPC’09), LNCS. Springer-Verlag, 2009.
[9] Pugh W., Wonnacott D.: An exact method for analysis of value-based array data dependences. In Sixth Annual Workshop on Programming Languages and Compilers for Parallel Computing. Springer-Verlag, 1993.
[10] The Omega project. http://www.cs.umd.edu/projects/omega
[11] Kelly W., Maslov V., Pugh W., Rosser E., Shpeisman T., Wonnacott D.: The omega library interface guide. Technical report, USA, 1995.
[12] Beletska A., Bielecki W., Siedlecki K., San Pietro P.: Finding synchronization-free slices of operations in arbitrarily nested loops. In ICCSA (2), volume 5073 of Lecture Notes in Computer Science, pp. 871-886. Springer, 2008.
[13] Bielecki W., Beletska A., Palkowski M., San Pietro P.: Extracting synchronization-free trees composed of non-uniform loop operations, Algorithms and Architectures for Parallel Processing, Lecture Notes in Computer Science, Volume 5022/2008, Springer Berlin / Heidelberg, 2008, pp. 185-195.
[14] Bielecki W., Pałkowski M.: Using message passing for developing coarse-grained applications in OpenMP, Proceedings of Third International Conference on Software and Data - ICSOFT 2008, Porto, Portugal 2008, pp. 145-153.
[15] Bielecki W., Klimek T., Trifunović K.: Calculating Exact Transitive Closure for a Normalized Affine Integer Tuple Relation, Electronic Notes in Discrete Mathematics 33 (2009), pp. 7-14.
[16] Iteration Space Slicing Framework project page. http://sfs.zut.edu.pl
[17] Bastoul C.: Code generation in the polyhedral model is easier than you think. In PACT’2004, pp. 7-16, Juan-les-Pins, September 2004.
[18] Pugh W., Rosser E.: Iteration Space Slicing and Its Application to Communication Optimization. Proceedings of the International Conference on Supercomputing. 1997, pp. 221-228.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BSW4-0083-0002