Article title

Towards an HPC cluster digital twin and scheduling framework for improved energy efficiency

Publication languages
EN
Abstracts
EN
Demand for compute resources, and with it the energy demand of HPC, is steadily increasing while the energy market shifts to renewable energy and faces significant price increases. Optimizing the energy efficiency of HPC clusters is therefore a major concern. This paper discusses several possible optimization dimensions and presents a digital twin design for analyzing and reducing the energy consumption of a real-world HPC system. The digital twin is based on the HPC cluster at PTB and receives information from multiple internal and external data sources to cover the different optimization opportunities. It also includes a scheduling simulation framework that uses data from the digital twin and real-world job traces to test the influence of different parameters on the HPC cluster.
Pages
265–268
Physical description
Bibliography: 22 items, illustrations.
Authors
  • Physikalisch-Technische Bundesanstalt, Abbestraße 2-12, 10587 Berlin, Germany
  • Freie Universität Berlin, Takustraße 9, 14195 Berlin, Germany
  • Physikalisch-Technische Bundesanstalt, Abbestraße 2-12, 10587 Berlin, Germany
  • Physikalisch-Technische Bundesanstalt, Abbestraße 2-12, 10587 Berlin, Germany
  • Freie Universität Berlin, Takustraße 9, 14195 Berlin, Germany
Bibliography
  • 1. O. Mämmelä, M. Majanen, R. Basmadjian, H. De Meer, A. Giesler, and W. Homberg, “Energy-aware job scheduler for high-performance computing,” Computer Science - Research and Development, vol. 27, no. 4, p. 265–275, 2012. [Online]. Available: https://doi.org/10.1007/s00450-011-0189-6
  • 2. Bundesministerium für Umwelt, Naturschutz und nukleare Sicherheit (BMU), “Klimaschutzplan 2050,” https://www.bmwk.de/Redaktion/DE/Publikationen/Industrie/klimaschutzplan-2050.pdf, 2019.
  • 3. Bundesministerium für Umwelt, Naturschutz und nukleare Sicherheit (BMU), “Klimaschutzprogramm 2030 der Bundesregierung zur Umsetzung des Klimaschutzplans 2050,” https://www.bundesregierung.de/resource/blob/974430/1679914/e01d6bd855f09bf05cf7498e06d0a3ff/2019-10-09-klima-massnahmen-data.pdf, Oct. 2019.
  • 4. R. UMWELT, Energieeffizienter Rechenzentrumsbetrieb DE-UZ 161, 2nd ed., https://produktinfo.blauer-engel.de/uploads/criteriafile/de/DE-UZ%20161-201502-de%20Kriterien.pdf, Fränkische Straße 7, 53229 Bonn, Feb. 2015.
  • 5. T. Ohmura, Y. Shimomura, R. Egawa, and H. Takizawa, “Toward building a digital twin of job scheduling and power management on an hpc system,” in Job Scheduling Strategies for Parallel Processing, D. Klusáček, C. Julita, and G. P. Rodrigo, Eds. Cham: Springer Nature Switzerland, 2023, p. 47–67.
  • 6. M. Ott and D. Kranzlmüller, “Best practices in energy-efficient high performance computing,” in Workshops der INFORMATIK 2018 - Architekturen, Prozesse, Sicherheit und Nachhaltigkeit. Bonn: Köllen Druck+Verlag GmbH, 2018, p. 167–176.
  • 7. A. Kammeyer, F. Burger, D. Lübbert, and K. Wolter, “Optimization of energy efficiency of an hpc cluster: On metrics, monitoring and digital twins,” in Sensor and Measurement Science International, ser. SMSI 2023. AMA Service GmbH, May 2023, p. 378–379. [Online]. Available: https://doi.org/10.5162/SMSI2023/P51
  • 8. K. Ahmed, J. Liu, and X. Wu, “An Energy Efficient Demand-Response Model for High Performance Computing Systems,” in 2017 IEEE 25th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), 2017, p. 175–186.
  • 9. A. Krzywaniak, J. Proficz, and P. Czarnul, “Analyzing Energy/Performance Trade-Offs with Power Capping for Parallel Applications On Modern Multi and Many Core Processors,” in 2018 Federated Conference on Computer Science and Information Systems (FedCSIS), 2018, p. 339–346.
  • 10. V. Avelar, D. Azevedo, A. French, and E. N. Power, “Pue: a comprehensive examination of the metric,” White paper, vol. 49, 2012.
  • 11. K. Ahmed, “Energy Demand Response for High-Performance Computing Systems,” Ph.D. dissertation, Florida International University, Miami, 2018.
  • 12. J. Eitzinger, T. Gruber, A. Afzal, T. Zeiser, and G. Wellein, “Clustercockpit — a web application for job-specific performance monitoring,” in 2019 IEEE International Conference on Cluster Computing (CLUSTER), 2019, p. 1–7.
  • 13. Bundesnetzagentur für Elektrizität, Gas, Telekommunikation, Post und Eisenbahnen (BNetzA), “SMARD - Strommarktdaten, Stromhandel und Stromerzeugung in Deutschland,” https://www.smard.de/home/marktdaten, May 2023.
  • 14. Electricity Maps ApS, “Electricity Maps,” https://www.electricitymaps.com/, May 2023.
  • 15. Deutscher Wetterdienst (DWD), “Open Data Server of the German Meteorological Service,” https://opendata.dwd.de/, May 2023.
  • 16. Bright Sky Developers, “Bright Sky JSON API for DWD’s open weather data,” https://brightsky.dev/, May 2023.
  • 17. B. Kocot, P. Czarnul, and J. Proficz, “Energy-aware scheduling for high-performance computing systems: A survey,” Energies, vol. 16, no. 2, 2023. [Online]. Available: https://www.mdpi.com/1996-1073/16/2/890
  • 18. A. B. Yoo, M. A. Jette, and M. Grondona, “Slurm: Simple linux utility for resource management,” in Job Scheduling Strategies for Parallel Processing, D. Feitelson, L. Rudolph, and U. Schwiegelshohn, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2003, p. 44–60.
  • 19. N. A. Simakov, M. D. Innus, M. D. Jones, R. L. DeLeon, J. P. White, S. M. Gallo, A. K. Patra, and T. R. Furlani, “A slurm simulator: Implementation and parametric analysis,” in High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, S. Jarvis, S. Wright, and S. Hammond, Eds. Cham: Springer International Publishing, 2018, p. 197–217.
  • 20. N. A. Simakov, R. L. Deleon, Y. Lin, P. S. Hoffmann, and W. R. Mathias, “Developing accurate slurm simulator,” in Practice and Experience in Advanced Research Computing, ser. PEARC ’22. New York, NY, USA: Association for Computing Machinery, 2022. [Online]. Available: https://doi.org/10.1145/3491418.3535178
  • 21. X. Yang, Z. Zhou, S. Wallace, Z. Lan, W. Tang, S. Coghlan, and M. Papka, “Integrating dynamic pricing of electricity into energy aware scheduling for HPC systems,” in International Conference for High Performance Computing, Networking, Storage and Analysis, SC, 2013.
  • 22. D. G. Feitelson, D. Tsafrir, and D. Krakov, “Experience with using the parallel workloads archive,” Journal of Parallel and Distributed Computing, vol. 74, no. 10, p. 2967–2982, 2014. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0743731514001154
Notes
1. Main Track Short Papers
2. Record prepared with funds from MEiN, agreement no. SONP/SP/546092/2022, under the programme "Społeczna odpowiedzialność nauki" (Social Responsibility of Science), module: Popularisation of science and promotion of sport (2024).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-127a390d-e682-48b0-b47f-53e66eb3e159