Importance of C/C++ compiler choice for performance and energy consumption of multithreaded WZ factorization

Bylina, Beata; Piekarz, Monika; Bylina, Jarosław

doi:10.24425/bpasts.2025.153226

Abstract

The choice of C/C++ compiler significantly affects the performance and energy consumption of multithreaded numerical linear algebra algorithms. This study investigates how compiler choice and processor frequency scaling (using dynamic voltage and frequency scaling, DVFS) affect the performance and energy consumption of the multithreaded WZ factorization on three computing platforms: two with Intel Xeon processors and one with an AMD EPYC processor. The factorization is implemented both without optimization techniques and with strip-mining. Time and energy tests show that, for both implementations of the WZ factorization, each compiler responds somewhat differently to frequency changes, which affects overall performance and energy consumption. On each of the tested platforms, the Intel compilers achieved the best performance and energy savings in a multithreaded environment.
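To make the two implementation variants named in the abstract concrete, the following is a minimal C sketch of a WZ factorization (A = WZ, after Evans and Hatzopoulos [4]) with a strip-mined row loop. It is an illustration under stated assumptions, not the authors' code: the function name `wz_factor_strip`, the `IDX` macro, the row-major storage, the choice to strip-mine the row loop, and the placement of the OpenMP pragma are all hypothetical, and no pivoting is performed. Setting `strip` to 1 gives roughly the plain, unoptimized loop structure.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: row-major index into an n x n matrix. */
#define IDX(i, j, n) ((size_t)(i) * (size_t)(n) + (size_t)(j))

/*
 * WZ factorization sketch. On return, a holds Z, w holds W, and A = W * Z.
 * Returns 0 on success, -1 if a 2x2 pivot block is exactly singular
 * (no pivoting is attempted). strip is the strip length of the row loop;
 * which loop the paper strip-mines is an assumption made here.
 */
int wz_factor_strip(double *a, double *w, int n, int strip)
{
    assert(n > 0 && strip > 0);

    /* W starts as the identity matrix. */
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            w[IDX(i, j, n)] = (i == j) ? 1.0 : 0.0;

    /* Step k eliminates columns k and k2 = n-1-k in the rows between them. */
    for (int k = 0; k < (n - 1) / 2; k++) {
        const int k2 = n - 1 - k;
        const double akk = a[IDX(k, k, n)], akk2 = a[IDX(k, k2, n)];
        const double ak2k = a[IDX(k2, k, n)], ak2k2 = a[IDX(k2, k2, n)];
        const double det = akk * ak2k2 - ak2k * akk2;
        if (fabs(det) == 0.0)
            return -1;

        /* Strip-mined row loop: the outer loop walks strips, the inner loop
         * walks rows inside one strip. Rows are independent, so strips can
         * run on different threads (pragma is ignored without OpenMP). */
        #pragma omp parallel for
        for (int ii = k + 1; ii < k2; ii += strip) {
            const int iend = (ii + strip < k2) ? ii + strip : k2;
            for (int i = ii; i < iend; i++) {
                const double aik = a[IDX(i, k, n)], aik2 = a[IDX(i, k2, n)];
                /* Solve the 2x2 system for the two multipliers of row i. */
                const double wik = (ak2k2 * aik - ak2k * aik2) / det;
                const double wik2 = (akk * aik2 - akk2 * aik) / det;
                w[IDX(i, k, n)] = wik;
                w[IDX(i, k2, n)] = wik2;
                a[IDX(i, k, n)] = 0.0;
                a[IDX(i, k2, n)] = 0.0;
                /* Update the remaining entries of row i. */
                for (int j = k + 1; j < k2; j++)
                    a[IDX(i, j, n)] -= wik * a[IDX(k, j, n)]
                                     + wik2 * a[IDX(k2, j, n)];
            }
        }
    }
    return 0;
}
```

With GCC or Clang the pragma takes effect only when the code is built with `-fopenmp`; the flags and compilers used in the study are not given in this record. Strip-mining by itself does not change the arithmetic, only the loop structure that the compiler and the thread scheduler see.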

Keywords

processor frequency scaling; performance; energy; WZ factorization; compiler

Publisher

Polska Akademia Nauk, Wydział IV Nauk Technicznych

Journal

Bulletin of the Polish Academy of Sciences. Technical Sciences

Year

2025

Volume

Vol. 73, no. 2

Pages

art. no. e153226

Physical description

25 references, figures, tables, charts.

Authors

author

Bylina Beata

Maria Curie-Sklodowska University, Pl. M. Curie-Skłodowskiej 5, 20-031 Lublin, Poland

https://orcid.org/0000-0002-1327-9747

author

Piekarz Monika

monika.piekarz@mail.umcs.pl

Maria Curie-Sklodowska University, Pl. M. Curie-Skłodowskiej 5, 20-031 Lublin, Poland

author

Bylina Jarosław

Maria Curie-Sklodowska University, Pl. M. Curie-Skłodowskiej 5, 20-031 Lublin, Poland

https://orcid.org/0000-0002-0319-2525

References

[1] A. Paszkiewicz, C. Ćwikła, M. Bolanowski, M. Ganzha, M. Paprzycki, and M. Hodoň, “Multifunctional clustering based on the LEACH algorithm for edge-cloud continuum ecosystem,” Bull. Pol. Acad. Sci. Tech. Sci., vol. 72, no. 1, p. e147919, 2024, doi: 10.24425/bpasts.2023.147919.
[2] M. Łoś, M. Woźniak, and M. Paszynski, “Varying coefficients in parallel shared-memory variational splitting solvers for non-stationary Maxwell equations,” Bull. Pol. Acad. Sci. Tech. Sci., vol. 72, no. 3, p. e149179, 2024, doi: 10.24425/bpasts.2024.149179.
[3] B. Bylina, J. Bylina, and M. Piekarz, “Impact of processor frequency scaling on performance and energy consumption for WZ factorization on multicore architecture,” Ann. Comput. Sci. Inf. Syst., vol. 35, pp. 377–383, 2023, doi: 10.15439/2023F6213.
[4] D. Evans and M. Hatzopoulos, “A parallel linear system solver,” Int. J. Comput. Math., vol. 7, no. 3, pp. 227–238, 1979, doi: 10.1080/00207167908803174.
[5] H.K. Olayiwola Babarinsa, “Quadrant interlocking factorization of hourglass matrix,” AIP Conf. Proc., vol. 1974, no. 1, p. 030009, Jun. 2018, doi: 10.1063/1.5041653.
[6] H.K. Olayiwola Babarinsa and A.Z. Hailiza Kamarulhaili, “Quadrant interlocking factorization algorithm of hourglass matrix from nonsingular matrix,” Thai J. Math., vol. 19, no. 4, pp. 1461–1476, Dec. 2021.
[7] G. Meurant, Direct and Iterative Methods for Linear Systems. Gérard Meurant: Paris, France, 2023. [Online]. Available: https://gerard-meurant.fr/book_2023.pdf
[8] B. Bylina and J. Bylina, “Nested loop transformations on multi- and many-core computers with shared memory,” in Selected Topics in Applied Computer Science. Lublin: Maria Curie-Skłodowska University Press, 2021, vol. I, pp. 167–186. [Online]. Available: http://stacs.matrix.umcs.pl/v01/stacs_v01.pdf
[9] B. Bylina and J. Bylina, “The parallel tiled WZ factorization algorithm for multicore architectures,” Int. J. Appl. Math. Comput. Sci., vol. 29, pp. 407–419, 2019.
[10] J. Bylina, B. Bylina, and M. Piekarz, “Influence of loop transformations on performance and energy consumption of the multithreded WZ factorization,” in Preproc. 17th Conference on Computer Science and Intelligence Systems, 2022, pp. 479–488, doi: 10.15439/2022F251.
[11] T. Kaczorek, “Transformations of the matrices of linear systems to their canonical form with desired eigenvalues,” Bull. Pol. Acad. Sci. Tech. Sci., vol. 71, no. 6, p. e147342, 2023, doi: 10.24425/bpasts.2023.147342.
[12] “GREEN500,” https://www.top500.org/lists/green500/, 2022.
[13] J.V. Lima, I. Raïs, L. Lefevre, and T. Gautier, “Performance and energy analysis of OpenMP runtime systems with dense linear algebra algorithms,” in 2017 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW), 2017, pp. 7–12, doi: 10.1109/SBAC-PADW.2017.10.
[14] M. Mirka, G. Devic, F. Bruguier, G. Sassatelli, and A. Gamatié, “Automatic energy-efficiency monitoring of OpenMP workloads,” in 2019 14th International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC), 2019, pp. 43–50, doi: 10.1109/ReCoSoC48741.2019.9034988.
[15] M.A. Shahneous Bari, A.M. Malik, A. Qawasmeh, and B. Chapman, “Performance and energy impact of OpenMP runtime configurations on power constrained systems,” Sustain. Comput.-Informatics Syst., vol. 23, pp. 1–12, 2019.
[16] J.V.F. Lima, I. Raïs, L. Lefèvre, and T. Gautier, “Performance and energy analysis of OpenMP runtime systems with dense linear algebra algorithms,” Int. J. High Perform. Comput. Appl., vol. 33, no. 3, pp. 431–443, 2019, doi: 10.1177/1094342018792079.
[17] J. Dongarra, H. Ltaief, P. Luszczek, and V.M. Weaver, “Energy footprint of advanced dense numerical linear algebra using tile algorithms on multicore architectures,” in 2012 Second International Conference on Cloud and Green Computing, 2012, pp. 274–281, doi: 10.1109/CGC.2012.113.
[18] L. Szustak, R. Wyrzykowski, T. Olas, and V. Mele, “Correlation of performance optimizations and energy consumption for stencil-based application on Intel Xeon scalable processors,” IEEE Trans. Parallel Distrib. Syst., vol. 31, no. 11, pp. 2582–2593, 2020, doi: 10.1109/TPDS.2020.2996314.
[19] T. Jakobs and G. Rünger, “Examining energy efficiency of vectorization techniques using a Gaussian elimination,” in 2018 International Conference on High Performance Computing Simulation (HPCS), 2018, pp. 268–275, doi: 10.1109/HPCS.2018.00054.
[20] K. Halbiniak, R. Wyrzykowski, L. Szustak, A. Kulawik, N. Meyer, and P. Gepner, “Performance exploration of various C/C++ compilers for AMD EPYC processors in numerical modeling of solidification,” Adv. Eng. Softw., vol. 166, pp. 1–14, 2022, doi: 10.1016/j.advengsoft.2021.103078.
[21] GNU Compiler Collection, “GNU Compiler Collection,” https://gcc.gnu.org/, 2023.
[22] Intel Corporation, “Intel Developer Zone,” https://www.intel.com/content/www/us/en/resources-documentation/developer.html
[23] Intel Corporation, “Intel oneAPI Toolkits,” https://www.intel.com/content/www/us/en/developer/tools/oneapi/toolkits.html
[24] K. Khan, M. Hirki, T. Niemi, J. Nurminen, and Z. Ou, “RAPL in action: Experiences in using RAPL for power measurements,” ACM Trans. Modeling Perform. Eval. Comput. Syst., vol. 3, Jan. 2018, doi: 10.1145/3177754.
[25] B. Bylina, J. Potiopa, M. Klisowski, and J. Bylina, “The impact of vectorization and parallelization of the slope algorithm on performance and energy efficiency on multi-core architecture,” Ann. Comput. Sci. Inf. Syst., vol. 25, pp. 2283–2290, 2021.

Notes

Record prepared with funds from MNiSW, agreement no. POPUL/SP/0154/2024/02, under the programme "Społeczna odpowiedzialność nauki II" (Social Responsibility of Science II), module: Popularisation of Science (2025).

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-55b82d60-31b1-49d6-ac5f-93a438536b81