Increasing data locality of parallel programs executed in embedded systems

Bielecki, W.; Kraska, K.

Abstract

Increasing data locality in a program is necessary to improve the performance of the software parts of embedded systems, to decrease power consumption, and to reduce on-chip memory size. The possibility of applying a method of quantifying data locality to a novel method of extracting synchronization-free threads is introduced. It can be used to agglomerate the extracted synchronization-free threads in order to adapt a parallel program to the target architecture of an embedded system under various loop schedule options (space-time mapping) and under the influence of well-known techniques for improving data locality. Choosing the best combination of loop transformation techniques with respect to data locality makes it possible to improve program performance. An approach to analyzing data locality is presented. Experimental results are presented and discussed. Conclusions and future research are outlined.

Keywords

data locality; compilers; parallel processing; embedded system

Publisher

Komisja Informatyki Polskiej Akademii Nauk, Oddział w Gdańsku

Journal

Metody Informatyki Stosowanej

Year

2008

Volume

No. 4 (Vol. 17)

Pages

5-13

Physical description

Bibliography: 12 items, figures, tables.

Contributors

author

Bielecki W.

Szczecin University of Technology, Faculty of Computer Science and Information Technology

author

Kraska K.

Szczecin University of Technology, Faculty of Computer Science and Information Technology

References

[1] Bielecki W., Siedlecki K. Extracting synchronization-free slices in perfectly nested uniform and non-uniform loops. Electronic Modeling, 2007.
[2] Bielecki W., Kraska K., Siedlecki K. Increasing Program Locality by Extracting Synchronization-Free Slices in Arbitrarily Nested Loops. Proceedings of the Fourteenth International Multi-Conference on Advanced Computer Systems ACS2007, 2007.
[3] Wolfe M. High Performance Compilers for Parallel Computing. Addison-Wesley, 1996.
[4] Richardson S. MPOC. A Chip Multiprocessor for Embedded Systems. [online] http://www.hpl.hp.com/techreports/2002/HPL-2002-186.pdf, HP Laboratories, 2002.
[5] Netlib Repository at UTK and ORNL [online]. http://www.netlib.org/benchmark/livermorec.
[6] Aho A. V., Lam M. S., Sethi R., Ullman J. D. Compilers: Principles, Techniques and Tools, 2nd Edition. Addison-Wesley, 2006.
[7] IBM PowerPC Multi-Core Instruction Set Simulator. User’s Guide, IBM Corporation, 2008.
[8] IBM RISCWatch Debugger. User’s Manual, IBM Corporation, 2008.
[9] Stasiak A. Klasyfikacja Systemów Wspomagających Proces Przetwarzania i Sterowania. II Konferencja Naukowa KNWS'05, 2005.
[10] Griebl M. Habilitation. Automatic Parallelization of Loop Programs for Distributed Memory Architectures. Universität Passau, 2004.
[11] Kelly W., Maslov V., Pugh W., Rosser E., Shpeisman T., Wonnacott D. The omega library interface guide. Technical Report CS-TR-3445, University of Maryland, 1995.
[12] Chandra R., Dagum L., Kohr D., Maydan D., McDonald J., Menon R. Parallel Programming in OpenMP. Morgan Kaufmann, 2001.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-5ad665ec-7302-4a91-8ac8-6c12293de746