ab-Stream - A Framework for programming Many-core

Gan, X.; Wang, Z.; Shen, L.; Zhu, Q.

Selected full texts from this journal

http://pe.org.pl/

Title variants

ab-Stream – struktura do programowania procesorów wielordzeniowych (Polish title; in English: "ab-Stream: a framework for programming multi-core processors")

Abstracts

The common approach to programming many-core processors is to write processor-specific code against low-level APIs. This can achieve good performance but causes serious portability problems, because programmers must write a separate version of the code for each target architecture. We therefore present ab-Stream, an extensible framework for programming many-threaded processors, based on the SUIF intermediate representation. ab-Stream abstracts many-core, many-threaded processors into a unified architecture, and an ab-Stream program is an OpenMP-like program with different directives for many-core processors. A prototype of ab-Stream was implemented to map ab-Stream programs onto many-core GPUs. Experiments show that the implementation executes the transformed code correctly and efficiently on CUDA-enabled GPUs. Moreover, the code produced by the prototype from ab-Stream programs can outperform the original GPU code and comes close to hand-optimized GPU code.

The ab-Stream framework for programming many-core processors is presented. The system is based on the SUIF format (Stanford University Intermediate Format).

Keywords

many-core; ab-Stream; intermediate representation; GPU

many-core processors; programming

Publisher

Wydawnictwo SIGMA-NOT

Journal

Przegląd Elektrotechniczny

Year

2012

Volume

Vol. 88, No. 7b

Pages

341-344

Physical description

Bibliography: 19 items; diagrams, tables, charts.

Contributors

author

Gan X.

xinbiaogan@163.com

State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha 410073, China

author

Wang Z.

School of Computer, National University of Defense Technology, Changsha 410073, China

author

Shen L.

School of Computer, National University of Defense Technology, Changsha 410073, China

author

Zhu Q.

State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha 410073, China
School of Computer, National University of Defense Technology, Changsha 410073, China

Bibliography

[1] Bo Zhang, Zheng-Hui Xue, Ren Wu, et al., “Acceleration of FDTD algorithm based on GPU computing”, Chinese Journal of Radio Science, Vol. 26, No. 1, pp. 108-112, 2011.
[2] K.V. Kalgin, “Implementation of algorithms with a fine-grained parallelism on GPUs”, Numerical Analysis and Applications, Vol. 4, No. 1, pp. 46-55, 2011.
[3] F. Dehne, G. Hickey, A. Rau-Chaplin, et al., “Parallel catastrophe modeling on a Cell/BE”, International Journal of Parallel, Emergent and Distributed Systems, Vol. 25, No. 5, pp. 401-410, 2010.
[4] H. Vandierendonck, S. Rul, M. Questier, et al., “Experiences with parallelizing a bio-informatics program on the Cell BE”, Proceedings of the Third International Conference on High Performance Embedded Architectures and Compilers (HiPEAC'08), pp. 161-175, 2008.
[5] Hai-yan Li, Chun-yuan Zhang, Li Li, et al., “Transform coding on programmable stream processors”, Journal of Supercomputing, Vol. 45, No. 1, pp. 66-87, 2008.
[6] Xue-Jun Yang, Li-Fang Zeng, Yu Deng, et al., “Optimization approaches of organizing streams on imagine processor”, Chinese Journal of Computers, Vol. 31, No. 7, pp. 1092-1100, 2008.
[7] Khronos Group, OpenCL specification.
[8] Chuntao Hong, Dehao Chen, Wenguang Chen, et al., “MapCG: Writing parallel program portable between CPU and GPU”, Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques (PACT'10), pp. 217-226, 2010.
[9] P. H. Wang, J. D. Collins, G. N. Chinya, et al., “EXOCHI: Architecture and programming environment for a heterogeneous multi-core multithreaded system”, Proceedings of International Conference on Programming Language Design and Implementation (PLDI'07), pp. 156-166, 2007.
[10] Pieter Bellens, Josep M Perez, Rosa M Badia, et al., “CellSs: a programming model for the CELL BE Architecture”, Proceedings of ACM/IEEE Conference on Supercomputing (SC'06), pp. 86-98, 2006.
[11] Ian Buck, Tim Foley, Daniel Reiter Horn, et al., “Brook for GPUs: stream computing on graphics hardware”, ACM Transactions on Graphics, Vol. 23, No. 3, pp. 777-786, 2004.
[12] NVIDIA, “CUDA programming guide”, 2008.
[13] Qiming Hou, Kun Zhou, Baining Guo, “BSGP: Bulk-synchronous GPU programming”, ACM Transactions on Graphics, Vol. 27, No. 3, 9, 2008.
[14] Peter Mattson, et al., Imagine Programming Systems User’s Guide, 2002.
[15] Michael D. Linderman, Jamison D. Collins, Hong Wang, et al., “Merge: A programming model for heterogeneous multicore systems”, ACM SIGPLAN Notices (ASPLOS'08), Vol. 43, No. 3, pp. 287-296, 2008.
[16] David L Moore, “The SUIF Programmer Guide”, 1999.
[17] Xinbiao Gan, Zhiying Wang, Li Shen, et al., “Data Layout Pruning on GPU”, Applied Mathematics & Information Sciences, Vol. 5, No. 2, pp. 129S-168S, 2011.
[18] CUDA benchmark suite.
[19] http://www.crhc.uiuc.edu/impact/cudabench.html. 2010.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-5e5c77ad-1816-4380-984d-bb2447e98d3c