Journal
Article title
Title variants
Conference
Federated Conference on Computer Science and Information Systems (15th; 6-9 September 2020; Sofia, Bulgaria)
Publication languages
Abstracts
Basic Linear Algebra Subprograms (BLAS) has emerged as a de facto standard interface for libraries providing linear algebra functionality. The advent of powerful devices for Internet of Things (IoT) nodes enables the reuse of existing BLAS implementations in these systems, which calls for a careful evaluation of how these libraries behave on embedded processors. This work benchmarks and discusses the performance and memory consumption of a wide range of unmodified open-source BLAS libraries. Compared to related (but partly outdated) publications, this evaluation covers the largest set of open-source BLAS libraries, also considers memory consumption, and focuses specifically on Linux-capable embedded platforms: an ARM-based SoC that contains a SIMD accelerator and one of the first commercial embedded systems based on the emerging RISC-V architecture. Results show that, especially for matrix operations and larger problem sizes, optimized BLAS implementations allow for significant performance gains over pure C implementations. Furthermore, the ARM platform outperforms the RISC-V platform in our selection of tests.
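To make the "pure C implementation" baseline mentioned in the abstract concrete, the following is a minimal sketch of a naive matrix multiplication with a simple timing helper. It is an illustration only, not the authors' benchmark code: the function names `naive_dgemm` and `time_naive_dgemm`, the square row-major layout, and the use of `clock()` for timing are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical pure C baseline (not taken from the paper):
 * computes C = alpha * A * B + beta * C for square n x n matrices
 * stored in row-major order. */
void naive_dgemm(size_t n, double alpha, const double *A, const double *B,
                 double beta, double *C)
{
    assert(A != NULL && B != NULL && C != NULL);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double acc = 0.0;
            /* Dot product of row i of A with column j of B. */
            for (size_t k = 0; k < n; k++)
                acc += A[i * n + k] * B[k * n + j];
            C[i * n + j] = alpha * acc + beta * C[i * n + j];
        }
    }
}

/* Hypothetical timing helper: average CPU seconds per multiplication
 * over `reps` repetitions, measured with the standard clock(). */
double time_naive_dgemm(size_t n, int reps, const double *A, const double *B,
                        double *C)
{
    assert(reps > 0);
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        naive_dgemm(n, 1.0, A, B, 0.0, C);
    return (double)(clock() - start) / CLOCKS_PER_SEC / reps;
}
```

A comparison along the lines described in the abstract would time the same operation through an optimized library, for example by calling the CBLAS routine `cblas_dgemm` in place of `naive_dgemm` and linking against the library under test. Memory consumption would have to be measured separately.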
Keywords
Year
Volume
Pages
663–672
Physical description
Bibliography: 12 items, tables, charts.
Authors
author
- Dept. of Electronic Engineering, University of Applied Sciences Technikum Wien, Höchstädtpl. 6, 1200 Vienna, Austria, fibich@technikum-wien.at
author
- Dept. of Electronic Engineering, University of Applied Sciences Technikum Wien, Höchstädtpl. 6, 1200 Vienna, Austria, tauner@technikum-wien.at
author
- Dept. of Electronic Engineering, University of Applied Sciences Technikum Wien, Höchstädtpl. 6, 1200 Vienna, Austria, roessler@technikum-wien.at
author
- Dept. of Electronic Engineering, University of Applied Sciences Technikum Wien, Höchstädtpl. 6, 1200 Vienna, Austria, horauer@technikum-wien.at
Bibliography
- 1. BLAST Forum, “Basic Linear Algebra Subprograms Technical Forum Standard,” https://netlib.org/blas/blast-forum/blas-report.pdf, 2020-06-27, University of Tennessee, Knoxville, Tennessee, Tech. Rep., 2001.
- 2. M. Koehler and J. Saak, “FlexiBLAS - a flexible BLAS library with runtime exchangeable backends,” https://www.netlib.org/lapack/lawnspdf/lawn284.pdf, 2020-06-27, LAPACK Working Notes, Tech. Rep., 2013.
- 3. F. G. Van Zee and R. A. Van de Geijn, “BLIS: A framework for rapidly instantiating BLAS functionality,” ACM Transactions on Mathematical Software, vol. 41, no. 3, pp. 1–33, 2015. http://dx.doi.org/10.1145/2764454
- 4. D. G. Spampinato and M. Püschel, “A basic linear algebra compiler,” in Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization, ser. CGO ’14. New York, NY, USA: Association for Computing Machinery, 2014. doi: 10.1145/2544137.2544155 p. 23–32.
- 5. N. Kyrtatas and D. G. Spampinato, “A Basic Linear Algebra Compiler for Embedded Processors,” 2015 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1054–1059, 2015. http://dx.doi.org/10.3929/ethz-a-010144458
- 6. G. Frison, D. Kouzoupis, T. Sartor, A. Zanelli, and M. Diehl, “BLASFEO: Basic linear algebra subroutines for embedded optimization,” ACM Trans. Math. Softw., vol. 44, no. 4, pp. 42:1–42:30, Jul. 2018. doi: 10.1145/3210754
- 7. G. Frison, T. Sartor, A. Zanelli, and M. Diehl, “The BLAS API of BLASFEO: Optimizing performance for small matrices,” ACM Transactions on Mathematical Software, vol. 46, no. 2, May 2020. http://dx.doi.org/10.1145/3378671
- 8. C. Fibich, S. Tauner, P. Rössler, M. Horauer, M. Krapfenbauer, M. Linauer, M. Matschnig, and H. Taucher, “Evaluation of open-source linear algebra libraries in embedded applications,” in 2019 8th Mediterranean Conference on Embedded Computing (MECO), June 2019. http://dx.doi.org/10.1109/MECO.2019.8760041 pp. 1–6.
- 9. M. Gautschi, P. D. Schiavone, A. Traber, I. Loi, A. Pullini, D. Rossi, E. Flamand, F. K. Gürkaynak, and L. Benini, “Near-threshold RISC-V core with DSP extensions for scalable IoT endpoint devices,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 25, no. 10, pp. 2700–2713, Oct. 2017. http://dx.doi.org/10.1109/TVLSI.2017.2654506
- 10. R. C. Whaley, A. Petitet, and J. J. Dongarra, “Automated empirical optimizations of software and the ATLAS project,” Parallel Computing, vol. 27, no. 1, pp. 3–35, 2001. http://dx.doi.org/10.1016/S0167-8191(00)00087-9
- 11. K. Goto and R. A. v. d. Geijn, “Anatomy of high-performance matrix multiplication,” ACM Transactions on Mathematical Software, vol. 34, no. 3, pp. 12:1–12:25, May 2008. http://dx.doi.org/10.1145/1356052.1356053
- 12. Altera Corporation, “cv_5v4: Cyclone V Hard Processor System Technical Reference Manual,” https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/hb/cyclone-v/cv5v4.pdf, 2020-06-27, July 2018.
Notes
1. Track 5: Software and System Engineering
2. Technical Session: Joint 40th IEEE Software Engineering Workshop and 7th International Workshop on Cyber-Physical Systems
3. Record prepared with funds from MNiSW (the Polish Ministry of Science and Higher Education), agreement No. 461252, under the programme "Społeczna odpowiedzialność nauki" (Social Responsibility of Science), module: Popularisation of science and promotion of sport (2021).
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.baztech-dc5b44b6-f95c-4e92-a5c9-16a0a85e3d14