Implementation of numerical integration to high-order elements on the GPUs

Krużel, Filip; Banaś, Krzysztof; Nytko, Mateusz

doi:10.24423/cames.264

Article details

Article title

Implementation of numerical integration to high-order elements on the GPUs

Authors

Krużel Filip, Banaś Krzysztof, Nytko Mateusz

Selected full texts from this journal

http://cames.ippt.gov.pl/

Identifiers

DOI

10.24423/cames.264

Title variants

Publication languages

Abstracts

This article presents ways to implement a resource-intensive algorithm on hardware with a limited amount of memory, namely the GPU. Numerical integration for higher-order finite element approximation was chosen as the example algorithm. To perform computational tests, we use a non-linear geometric element and solve the convection-diffusion-reaction problem. The calculations were run on a Tesla K20m graphics card based on the Kepler architecture and a Radeon R9 280X based on the Tahiti XT architecture. The results of the computational experiments were compared with the theoretical performance of both GPUs, which allowed an assessment of the performance actually achieved. Our research offers suggestions for choosing the optimal design of algorithms as well as the right hardware for such a resource-demanding task.

Keywords

GPU, numerical integration, finite element method, OpenCL, CUDA

Publisher

Instytut Podstawowych Problemów Techniki PAN

Journal

Computer Assisted Methods in Engineering and Science

Year

2020

Volume

Vol. 27, no. 1

Pages

3–26

Physical description

Bibliography: 22 items, figures, tables.

Contributors

author

Krużel Filip

fkruzel@pk.edu.pl

Cracow University of Technology, Department of Computer Science, Warszawska 24, 31-155 Kraków, Poland

author

Banaś Krzysztof

AGH Science and Technology University, Department of Applied Computer Science and Modelling, Adama Mickiewicza 30, 30-059 Kraków, Poland

author

Nytko Mateusz

Cracow University of Technology, Department of Computer Science, Warszawska 24, 31-155 Kraków, Poland

Bibliography

1. AMD. White paper: AMD Graphics Cores Next (GCN) Architecture, Advanced Micro Devices Inc., Sunnyvale, CA, 2012.
2. K. Banaś, F. Krużel, OpenCL performance portability for Xeon Phi coprocessor and NVIDIA GPUs: A case study of finite element numerical integration, [in:] Euro-Par 2014: Parallel Processing Workshops, vol. 8806 of Lecture Notes in Computer Science, Springer International Publishing, pp. 158–169, 2014.
3. K. Banaś, F. Krużel, J. Bielański, Optimal kernel design for finite element numerical integration on GPUs, Computing in Science and Engineering, 2019 [in print].
4. K. Banaś, F. Krużel, J. Bielański, K. Chłoń, A comparison of performance tuning process for different generations of NVIDIA GPUs and an example scientific computing algorithm, [in:] Parallel Processing and Applied Mathematics, R. Wyrzykowski, J. Dongarra, E. Deelman, K. Karczewski [Eds], Springer International Publishing, pp. 232–242, 2018.
5. E. Becker, G. Carey, J. Oden, Finite Elements. An Introduction, Prentice Hall, 1981.
6. L. Buatois, G. Caumon, B. Levy, Concurrent number cruncher: A GPU implementation of a general sparse linear solver, International Journal of Parallel, Emergent and Distributed Systems, 24 (3): 205–223, 2009.
7. P. Ciarlet, The finite element method for elliptic problems, North-Holland, Amsterdam, 1978.
8. P.K. Das, G.C. Deka, History and evolution of GPU architecture, Emerging Research Surrounding Power Consumption and Performance Issues in Utility Computing, pp. 109–135, 2016.
9. M. Geveler, D. Ribbrock, D. Göddeke, P. Zajac, S. Turek, Towards a complete FEM-based simulation toolkit on GPUs: Unstructured grid finite element geometric multigrid solvers with strong smoothers based on sparse approximate inverses, Computers & Fluids, 80: 327–332, 2013 (Part of Special Issue: Selected contributions of the 23rd International Conference on Parallel Fluid Dynamics ParCFD2011).
10. D. Göddeke, H. Wobker, R. Strzodka, J. Mohd-Yusof, P. McCormick, S. Turek, Co-processor acceleration of an unmodified parallel solid mechanics code with FEASTGPU, International Journal of Computational Science and Engineering, 4 (4): 254–269, 2009.
11. C. Johnson, Numerical solution of partial differential equations by the finite element method, Cambridge University Press, 1987.
12. F. Krużel, K. Banaś, Finite element numerical integration on PowerXCell processors, [in:] PPAM’09: Proceedings of the 8th International Conference on Parallel Processing and Applied Mathematics, Springer-Verlag, pp. 517–524, 2010.
13. F. Krużel, K. Banaś, Vectorized OpenCL implementation of numerical integration for higher order finite elements, Computers and Mathematics with Applications, 66 (10): 2030–2044, 2013.
14. F. Krużel, K. Banaś, Finite element numerical integration on Xeon Phi coprocessor, [in:] Proceedings of the 2014 Federated Conference on Computer Science and Information Systems, Warsaw, Poland, M.P.M. Ganzha, L. Maciaszek [Eds], vol. 2 of Annals of Computer Science and Information Systems, IEEE, pp. 603–612, 2014.
15. F. Krużel, K. Banaś, AMD APU systems as a platform for scientific computing, Computer Methods in Materials Science, 15 (2): 362–369, 2015.
16. F. Krużel, Vectorized implementation of the FEM numerical integration algorithm on a modern CPU, [in:] Proceedings of the 33rd International ECMS Conference on Modelling and Simulation: ECMS 2019, 11–14 June 2019, Caserta, Italy, 33 (1): 414–420, 2019.
17. J. Mamza, P. Makyla, A. Dziekoński, A. Lamecki, M. Mrozowski, Multi-core and multi-processor implementation of numerical integration in Finite Element Method, [in:] 2012 19th International Conference on Microwave Radar and Wireless Communications, vol. 2, pp. 457–461, 2012.
18. NVIDIA Corporation, NVIDIA's Next Generation CUDA Compute Architecture: Kepler GK110, Whitepaper, 2012.
19. NVIDIA Corporation, Profiler User’s Guide, 2015.
20. R. Smith, AMD Radeon HD 7970 Review: 28nm and Graphics Core Next, Together As One, AnandTech, 2011, retrieved from https://www.anandtech.com/show/5261/amd-radeon-hd-7970-review on 12.09.2019.
21. P. Šolín, K. Segeth, I. Doležel, Higher-order finite element methods, Chapman & Hall/CRC, 2004.
22. S. Williams, A. Waterman, D. Patterson, Roofline: An insightful visual performance model for multicore architectures, Communications of the ACM, 52 (4): 65–76, 2009.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-c66f49fe-067e-4df8-b7d3-5383631f793f