Title variants
Publication languages
Abstracts
Increasing data locality in a program is a necessary factor in improving the performance of the software parts of embedded systems, decreasing power consumption, and reducing on-chip memory size. The paper introduces the possibility of applying a method of quantifying data locality to a novel method of extracting synchronization-free threads. The method can be used to agglomerate the extracted synchronization-free threads so as to adapt a parallel program to the target architecture of an embedded system, under various loop schedule options (space-time mapping) and under the influence of well-known techniques for improving data locality. Choosing the best combination of loop transformation techniques with regard to data locality makes it possible to improve program performance. An approach to analyzing data locality is presented. Experimental results are presented and discussed. Conclusions and directions for future research are outlined.
Keywords
Journal
Year
Volume
Pages
5--13
Physical description
Bibliography: 12 items, figures, tables.
Authors
author
- Szczecin University of Technology, Faculty of Computer Science and Information Technology
author
- Szczecin University of Technology, Faculty of Computer Science and Information Technology
Bibliography
- [1] Bielecki W., Siedlecki K. Extracting synchronization-free slices in perfectly nested uniform and non-uniform loops. Electronic Modeling, 2007.
- [2] Bielecki W., Kraska K., Siedlecki K. Increasing Program Locality by Extracting Synchronization-Free Slices in Arbitrarily Nested Loops. Proceedings of the Fourteenth International Multi-Conference on Advanced Computer Systems ACS2007, 2007.
- [3] Wolfe M. High Performance Compilers for Parallel Computing. Addison-Wesley, 1996.
- [4] Richardson S. MPOC. A Chip Multiprocessor for Embedded Systems. [online] http://www.hpl.hp.com/techreports/2002/HPL-2002-186.pdf, HP Laboratories, 2002.
- [5] Netlib Repository at UTK and ORNL [online]. http://www.netlib.org/benchmark/livermorec.
- [6] Aho A. V., Lam M. S., Sethi R., Ullman J. D. Compilers: Principles, Techniques and Tools, 2nd Edition. Addison-Wesley, 2006.
- [7] IBM PowerPC Multi-Core Instruction Set Simulator. User’s Guide, IBM Corporation, 2008.
- [8] IBM RISCWatch Debugger. User’s Manual, IBM Corporation, 2008.
- [9] Stasiak A. Klasyfikacja Systemów Wspomagających Proces Przetwarzania i Sterowania. II Konferencja Naukowa KNWS'05, 2005.
- [10] Griebl M. Habilitation. Automatic Parallelization of Loop Programs for Distributed Memory Architectures. Universität Passau, 2004.
- [11] Kelly W., Maslov V., Pugh W., Rosser E., Shpeisman T., Wonnacott D. The omega library interface guide. Technical Report CS-TR-3445, University of Maryland, 1995.
- [12] Chandra R., Dagum L., Kohr D., Maydan D., McDonald J., Menon R. Parallel Programming in OpenMP. Morgan Kaufmann, 2001.
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.baztech-5ad665ec-7302-4a91-8ac8-6c12293de746