Finite element core calculations and stream processing

Banaś, K.; Bielański, J.; Chłoń, K.

Article - details

Article title

Finite element core calculations and stream processing

Authors

Banaś K., Bielański J., Chłoń K.

Selected full texts from this journal

http://www.cmms.agh.edu.pl/

Title variants

Podstawowe obliczenia w metodzie elementów skończonych i przetwarzanie strumieniowe

Abstracts

We present an execution model and a performance analysis for an important phase of finite element calculations: the creation of systems of linear equations. We assume that the process runs on a set of CPU cores and GPU multiprocessors, with CPU and GPU memories connected by PCIe links for data transfer. We analyse the use of linear data structures designed specifically for GPU processing. We present example calculations for the standard first-order FEM approximation on typical contemporary hardware, and draw conclusions on the feasibility of the proposed approach.

The article presents a performance-oriented execution model and a performance analysis for the procedure that creates the system of linear equations, one of the main phases of finite element calculations. The process is assumed to be executed by a set of CPU cores and a set of GPU multiprocessors. The memories used by the CPU and the GPU are connected by a PCIe interface, through which data is transferred. The developed algorithm uses a linear data structure designed specifically for processing on GPUs, and its execution is analysed. The article presents results for computational examples that use linear FEM approximation on a typical contemporary hardware configuration. The closing section gives conclusions on the practical significance of the approach.

Keywords

finite element method, solvers of linear equations, GPU, OpenGL, high performance computing, technical simulation

Publisher

Wydawnictwa AGH

Journal

Computer Methods in Materials Science

Year

2016

Volume

Vol. 16, No. 4

Pages

213-223

Physical description

Bibliography: 32 items, figures.

Contributors

author

Banaś K.

kbanas@agh.edu.pl

AGH University of Science and Technology, al. A. Mickiewicza 30, 30-059 Kraków, Poland

author

Bielański J.

AGH University of Science and Technology, al. A. Mickiewicza 30, 30-059 Kraków, Poland

author

Chłoń K.

AGH University of Science and Technology, al. A. Mickiewicza 30, 30-059 Kraków, Poland

Bibliography

Anzt, H., Tomov, S., Luszczek, P., Sawyer, W., Dongarra, J., 2015, Acceleration of GPU-based Krylov solvers via data transfer reduction, International Journal of High Performance Computing Applications, 29(3), 366-383.
Banaś, K., Krużel, F., Bielański, J., 2015, Finite element numerical integration for first order approximations on multi-core architectures, CoRR, abs/1504.01023.
Banaś, K., Michalik, K., 2010, Design and development of an adaptive mesh manipulation module for detailed FEM simulation of flows, Proceedings of the International Conference on Computational Science, ICCS 2010, eds, Peter M. A. Sloot, G. Dick van Albada, and Jack Dongarra, University of Amsterdam, The Netherlands, May 31-June 2, 2010, 1 of Procedia Computer Science, 2043-2051.
Banaś, K., Płaszewski, P., Macioł, P., 2014a, Numerical integration on GPUs for higher order finite elements, Computers and Mathematics with Applications, 67(6), 1319-1344.
Banaś, K., Chłoń, K., 2016, Design of interface modules for flexible coupling of finite element codes with solvers of linear equations, Computer Assisted Methods in Engineering and Science, 23(1), 3-17.
Banaś, K., Chłoń, K., Cybułka, P., Michalik, K., Płaszewski, P., Siwek, A., 2014b, Adaptive finite element modelling of welding processes, eScience on Distributed Computing Infrastructure - Achievements of PLGrid Plus Domain-Specific Services and Tools, eds, Bubak M., Kitowski, J., Wiatr, K., 8500 of Lecture Notes in Computer Science, Springer International Publishing, 391-406.
Banaś, K., Krużel, F., 2014, OpenCL performance portability for Xeon Phi coprocessor and NVIDIA GPUs: A case study of finite element numerical integration, Euro-Par 2014: Parallel Processing Workshops - Euro-Par 2014 International Workshops, Porto, Portugal, August 25-26, 2014, Revised Selected Papers, Part II, 8806 of Lecture Notes in Computer Science, Springer, 158-169.
Banaś, K., Krużel, F., Bielański, J., 2016, Finite element numerical integration for first order approximations on multi- and many-core architectures, Computer Methods in Applied Mechanics and Engineering, 305, 827-848.
Barrett, R., Berry, M., Chan, T.F., Demmel, J., Donato, J.M., Dongarra, J., Eijkhout, V., Pozo, R., Romine, C., van der Vorst, H., 1994, Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. SIAM, Philadelphia, PA.
Cecka, C., Lew, A. J., Darve, E., 2011, Assembly of finite element methods on graphics processors. International Journal for Numerical Methods in Engineering, 85(5), 640-669.
Choi, J. W., Singh, A., Vuduc, R. W., 2010, Model-driven autotuning of sparse matrix-vector multiply on GPUs, SIGPLAN Not., 45(5), 115-126.
Ciarlet, P.G., 1978, The Finite Element Method for Elliptic Problems, North-Holland, Amsterdam.
Demkowicz, L., Kurtz, J., Pardo, D., Paszynski, M., Rachowicz, W., Zdunek, A., 2007, Computing with hp-Adaptive Finite Elements, Frontiers: Three Dimensional Elliptic and Maxwell Problems with Applications, Chapman & Hall/CRC.
Du, P., Weber, R., Luszczek, P., Tomov, S., Peterson, G., Dongarra, J., 2012, From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming. Parallel Computing, 38(8), 391-407. (Application accelerators in HPC).
Dziekonski, A., Lamecki, A., Mrozowski, M., 2011, GPU acceleration of multilevel solvers for analysis of microwave components with finite element method. Microwave and Wireless Components Letters, IEEE, 21(1), 1-3.
Dziekonski, A., Sypek, P., Lamecki, A., Mrozowski, M., 2012, Finite element matrix generation on a GPU. Progress in Electromagnetics Research, 128, 249-265.
Dziekonski, A., Sypek, P., Lamecki, A., Mrozowski, M., 2013, Generation of large finite-element matrices on multiple graphics processors, International Journal for Numerical Methods in Engineering, 94(2), 204-220.
Geveler, M., Ribbrock, D., Göddeke, D., Zajac, P., Turek, S., Towards a complete FEM-based simulation toolkit on GPUs: Unstructured grid finite element geometric multigrid solvers with strong smoothers based on sparse approximate inverses. Computers & Fluids, 80(0), 327-332.
Karatarakis, A., Karakitsios, P., Papadrakakis, M., 2014, GPU accelerated computation of the isogeometric analysis stiffness matrix. Computer Methods in Applied Mechanics and Engineering, 269, 334-355.
Klöckner, A., Warburton, T., Bridge, J., Hesthaven, J. S., 2009, Nodal discontinuous Galerkin methods on graphics processors, J. Comput. Phys., 228, 7863-7882.
Komatitsch, D., Erlebacher, G., Göddeke, D., Michea, D., 2010, High-order finite-element seismic wave propagation modeling with MPI on a large GPU cluster, Journal of Computational Physics, 229(20), 7692-7714.
Koza, Z., Matyka, M., Szkoda, S., Mirosław, L., 2014, Compressed multirow storage format for sparse matrices on graphics processing units, SIAM Journal on Scientific Computing, 36(2), C219-C239.
Kreutzer, M., Hager, G., Wellein, G., Fehske, H., Bishop, A. R., A unified sparse matrix data format for efficient general sparse matrix-vector multiplication on modern processors with wide SIMD units, SIAM J. Scientific Computing, 36(5).
Krużel, F., Banaś, K., 2013, Vectorized OpenCL implementation of numerical integration for higher order finite elements, Computers and Mathematics with Applications, 66(10), 2030-2044.
Krużel, F., Banaś, K., 2015, AMD APU systems as a platform for scientific computing. Computer Methods in Materials Science, 15(2), 362-369.
Lipski, P., Wozniak, M., Paszynski, M., 2015, Comparison of the structure of equation systems and the GPU multi frontal solver for finite difference, collocation and finite element method, Proceedings of the International Conference on Computational Science. ICCS 2015. Computational Science at the Gates of Nature, eds, Koziel, S., Leifsson, L., Lees, M., Krzhizhanovskaya, V., Dongarra, J., Sloot, P. M. A., Reykjavik, Iceland, 1-3 June, 2015, 2014, 51, 1072-1081.
Markall, G. R., Slemmer, A., Ham, D. A., Kelly, P. H. J., Cantwell, C. D., Sherwin, S. J., 2013, Finite element assembly strategies on multi-core and many-core architectures, International Journal for Numerical Methods in Fluids, 71(1), 80-97.
Reguly, I., Giles, M., 2012, Efficient sparse matrix-vector multiplication on cache-based GPUs, Innovative Parallel Computing (InPar), 1-12.
Remacle, J.-F., Karamete, B. K., Shephard, M. S., 2000, Algorithm Oriented Mesh Database, Report 5, SCOREC.
Smith, B., Bjorstad, P., Gropp, W., 1996, Domain Decomposition. Parallel Multilevel Methods for Elliptic Partial Differential Equations, Cambridge University Press, Cambridge.
Zienkiewicz, O.C., Taylor, R.L., 2000, Finite element method, 1-3, Butterworth Heinemann, London.

Notes

Prepared with funds from the Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for science dissemination activities (2017 tasks).

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-c2e21e66-8dae-48a0-ab21-726ce2a8196d