Parallel Code Generation for Mobile Devices

Palkowski, M.

Article details

Article title

Parallel Code Generation for Mobile Devices

Authors

Palkowski M.

Selected full texts from this journal

http://pe.org.pl/

Title variants

Generowanie kodu równoległego dla urządzeń przenośnych

Abstracts

Mobile computing is driven by the pursuit of ever-increasing performance, and multicore processing is recognized as a key component of continued performance improvements. This paper presents the Iteration Space Slicing (ISS) framework for automatic parallelization of code for Mobile Internet Devices (MID). ISS algorithms extract the coarse-grained parallelism available in arbitrarily nested parameterized loops. The loops are parallelized and transformed into a multi-threaded application for the Android OS. Experiments are carried out with the UTDSP and NPB benchmark suites on a dual-core ARM processor. Related parallelization techniques, in particular those for embedded systems, are discussed, and future work is outlined.

Computing on mobile devices comes with a growing demand for processor performance. The article presents the use of the ISS tool (iteration space slicing of program loops) to derive parallel code dedicated to mobile devices (MID). The algorithms extract coarse-grained parallelism from arbitrarily nested loops and generate multi-threaded code for the Android system. Experimental results for the NAS and UTDSP sets of test loops were obtained on a dual-core ARM processor. Related work and future tasks are presented at the end of the article.
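The abstracts do not reproduce any generated code, so the following is only an illustrative sketch of what synchronization-free, coarse-grained parallelism can look like when mapped onto two threads, as on a dual-core device. The class name `SliceDemo`, the loop, and the array size are hypothetical and are not taken from the paper; plain Java threads stand in for whatever threading layer the generated Android code actually uses. In the loop `a[i] = a[i-2] + 1`, each iteration depends only on the iteration two steps earlier, so the even and odd iterations form two independent dependence chains (slices) that can run without any synchronization between them.

```java
import java.util.Arrays;

public class SliceDemo {
    // Sequential reference loop: iteration i depends only on iteration i-2.
    static void sequential(int[] a) {
        for (int i = 2; i < a.length; i++) {
            a[i] = a[i - 2] + 1;
        }
    }

    // Hypothetical hand-written equivalent of slice-based parallel code:
    // slice 0 covers i = 2, 4, 6, ... and slice 1 covers i = 3, 5, 7, ...
    // The slices touch disjoint array elements, so the threads never
    // need to synchronize with each other while they run.
    static void parallel(int[] a) throws InterruptedException {
        Thread[] workers = new Thread[2];
        for (int s = 0; s < 2; s++) {
            final int start = 2 + s;
            workers[s] = new Thread(() -> {
                for (int i = start; i < a.length; i += 2) {
                    a[i] = a[i - 2] + 1;
                }
            });
            workers[s].start();
        }
        // join() makes the workers' writes visible to the calling thread.
        for (Thread w : workers) {
            w.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int[] x = new int[16];
        int[] y = new int[16];
        sequential(x);
        parallel(y);
        // Both versions should produce the same array contents.
        System.out.println(Arrays.equals(x, y));
    }
}
```

The number of slices here happens to equal the number of cores; with more slices than cores, several slices would typically be assigned to each thread.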

Keywords

automatic parallelization algorithm; synchronization-free parallelism; code generation; mobile computing; multicore processor; DSP applications

algorithm for automatic parallelism extraction; code generation; mobile computing; coarse-grained parallelism; multicore programming; DSP signal processing

Publisher

Wydawnictwo SIGMA-NOT

Journal

Przegląd Elektrotechniczny

Year

2015

Volume

Vol. 91, no. 2

Pages

133-136

Physical description

Bibliography: 24 items, figures, tables, charts.

Contributors

author

Palkowski M.

mpalkowski@wi.zut.edu.pl

Zachodniopomorski Uniwersytet Technologiczny, Katedra Inżynierii Oprogramowania, ul. Żołnierska 49, 71-210 Szczecin

References

[1] Domeika M., Software Development for Embedded Multi-Core Systems, A practical guide for using Intel embedded systems, Newnes (2008).
[2] Beletska, A., Bielecki, W., Cohen, A., Palkowski, M., Siedlecki, K.: Coarse-grained loop parallelization: Iteration space slicing vs affine transformations. Parallel Computing, 37, 479-497, (2011).
[3] Pugh, W., Rosser, E.: Iteration space slicing and its application to communication optimization. In International Conference on Supercomputing: 221-228, (1997).
[4] Weiser, M.: Program slicing. In IEEE Transactions on Software Engineering: 352-357, (1984).
[5] Lim, A., Lam, M., Cheong, G. : An affine partitioning algorithm to maximize parallelism and minimize communication. In ICS'99, ACM Press, 228-237, (1999).
[6] Feautrier, P. : Some efficient solutions to the affine scheduling problem, part I and II, one and multidimensional time, International Journal of Parallel Programming 21, 313-348 and 389-420, (1992).
[7] Kelly, W., Pugh, W., Maslov, V., Rosser, E., Shpeisman, T., Wonnacott, D. : New User Interface for Petit and Other Extensions. User Guide, (1996).
[8] Android Developers Guide - Processes and Threads: http://developer.android.com/guide/components/processes-and-threads.html, (2012).
[9] PLUTO - An automatic parallelizer and locality optimizer for multicores, http://pluto-compiler.sourceforge.net, (2011).
[10] Bondhugula, U., Baskaran, M., et al. : Affine transformations for communication minimal parallelization and locality optimization of arbitrarily-nested loop sequences, Lecture Notes in Computer Science, Volume 4959/2008, 132-146, (2008).
[11] Bondhugula, U., Hartono, A., Ramanujan, J., Sadayappan, P. : A practical automatic polyhedral parallelizer and locality optimizer. In ACM SIGPLAN Programming Languages Design and Implementation (PLDI '08), (2008).
[12] Kelly, W., Maslov, V., Pugh, W., Rosser, E., Shpeisman, T., Wonnacott, D. : The omega library interface guide. Technical report, College Park, MD, USA, (1995).
[13] Verdoolaege, S. : Barvinok: User Guide v. 035, www.kotnet.org/~skimo/barvinok/barvinok.pdf, (2011).
[14] Moldovan, D. : Parallel Processing: From Applications to Systems, Morgan Kaufmann Publishers, Inc, (1993).
[15] Pugh, W., Wonnacott, D.: An exact method for analysis of value-based array data dependences. In Sixth Annual Workshop on Programming Languages and Compilers for Parallel Computing. Springer-Verlag, (1993).
[16] The NAS benchmark suite, http://www.nas.nasa.gov, (2012).
[17] Lee, C.G. : The UTDSP Benchmark Suite, http://www.eecg.toronto.edu/~corinna, (2002).
[18] Peng, S.H. : UTDSP: A VLIW Programmable DSP Processor, Graduate Department of Electrical and Computer Engineering, University of Toronto, (1999).
[19] Wonnacott, D.: A Retrospective of the Omega Project, Haverford College Computer Science Tech Report (2010).
[20] Garcia, S., et al.: The Kremlin Oracle for Sequential Code Parallelization, Micro, IEEE, Volume: 32, Issue: 4, pp. 42-53, (2012).
[21] Amini, M., Ancourt, C., et al.: PIPS Documentation, http://pips4u.org/doc, (2012).
[22] Amini, M., et al.: PIPS Is not (just) Polyhedral Software. In First International Workshop on Polyhedral Compilation Techniques (IMPACT 2011). Chamonix, France, 4/2011, (2011).
[23] Cordes, D., Marwedel, P., Malik, A.: Automatic parallelization of embedded software using hierarchical task graphs and integer linear programming, CODES/ISSS '10: Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis, pp. 267-276, (2010).
[24] Campanoni, S., et al.: HELIX: automatic parallelization of irregular programs for chip multiprocessing, CGO '12: Proceedings of the Tenth International Symposium on Code Generation and Optimization, pp. 84-93, (2012).

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-0515a9d7-43dc-4367-8503-bb4c8f39583d