

Article title

Performance evaluation of selected ML algorithms in GC and AWS cloud environments

Publication languages
EN
Abstracts
EN
In this paper, we analyze the performance of common machine learning (ML) algorithms executed in Google Cloud and Amazon Web Services environments. The primary metrics are training time and prediction time as functions of the number of virtual machine cores. For comparison, the benchmarks also include a "bare metal" (i.e., non-cloud) environment, with results adjusted using the "Multi-thread Score" to account for architectural differences among the tested platforms. Our focus is on CPU-intensive algorithms. The test suite covers Support Vector Machines, Decision Trees, K-Nearest Neighbors, Linear Models, and Ensemble Methods. The evaluated classifiers, sourced from the scikit-learn and ThunderSVM libraries, are Extra Trees, Support Vector Machines, K-Nearest Neighbors, Random Forest, Gradient Boosting Classifier, and Stochastic Gradient Descent. GPU-accelerated deep learning models, such as large language models, are excluded because of the difficulty of establishing a common baseline across platforms. The dataset used is the widely known "Higgs dataset", which describes kinematic properties measured by particle detectors in the search for the Higgs boson. The benchmark results are best described as varied: there is no clear trend, because training and prediction times scale differently depending on both the cloud platform and the algorithm type. This paper provides practical insights and guidance for deploying and optimizing CPU-based ML workloads in cloud environments.
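The measurement described in the abstract (training and prediction time versus available parallelism, adjusted by a CPU benchmark score) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' benchmark code. The helper names `time_fit_predict`, `adjust_by_score`, and `run_example` are hypothetical. The linear scaling by score ratio is an assumption, since the record does not state the adjustment formula. The synthetic data stands in for the Higgs dataset, and scikit-learn's `n_jobs` parameter stands in for changing the number of virtual machine cores.

```python
import time
from dataclasses import dataclass


@dataclass
class Timing:
    train_seconds: float
    predict_seconds: float


def time_fit_predict(model, X_train, y_train, X_test):
    """Measure wall-clock time of model.fit and model.predict separately."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    train_seconds = time.perf_counter() - start

    start = time.perf_counter()
    model.predict(X_test)
    predict_seconds = time.perf_counter() - start
    return Timing(train_seconds, predict_seconds)


def adjust_by_score(seconds, platform_score, reference_score):
    """Rescale a measured time to a reference platform.

    ASSUMPTION: time scales linearly with the ratio of benchmark scores.
    A platform with a higher score is treated as proportionally faster, so
    its measured time is scaled up to what the reference would need.
    """
    return seconds * platform_score / reference_score


def run_example():
    # scikit-learn is imported here so the helpers above stay dependency-free.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for the Higgs dataset; sizes are chosen arbitrarily.
    X, y = make_classification(n_samples=20_000, n_features=20, random_state=0)
    X_train, X_test, y_train, _ = train_test_split(X, y, random_state=0)

    # n_jobs approximates "number of cores" on a single machine.
    for n_jobs in (1, 2, 4):
        model = RandomForestClassifier(
            n_estimators=100, n_jobs=n_jobs, random_state=0
        )
        timing = time_fit_predict(model, X_train, y_train, X_test)
        print(
            f"n_jobs={n_jobs}: train {timing.train_seconds:.2f}s, "
            f"predict {timing.predict_seconds:.3f}s"
        )
```

Calling `run_example()` prints one line of timings per `n_jobs` value. For a real comparison each configuration would be run several times and the results aggregated, because single wall-clock measurements on shared cloud hardware are noisy.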
Authors
  • Warsaw University of Technology, Warszawa, Poland
  • Warsaw University of Technology, Warszawa, Poland
Bibliography
  • [1] P. Borra, “Comparison and analysis of leading cloud service providers (AWS, Azure and GCP)”, in International Journal of Advanced Research in Engineering and Technology (IJARET), 15, 266-278, 2024. https://doi.org/10.17605/OSF.IO/T2DHW
  • [2] M. Armbrust, A. Fox, et al., “A view of cloud computing”, in Communications of the ACM, 53(4), 50-58, 2010. https://doi.org/10.1145/1721654.1721672
  • [3] A. Rashid, A. Chaturvedi, “Cloud computing characteristics and services: a brief review”, in International Journal of Computer Sciences and Engineering, 7(2), 421-426, 2019. https://doi.org/10.26438/ijcse/v7i2.421426
  • [4] Q. Zhang, L. Cheng, R. Boutaba, “Cloud computing: state-of-the-art and research challenges”, in Journal of Internet Services and Applications, 1, 7-18, 2010. https://doi.org/10.1007/s13174-010-0007-6
  • [5] Amazon Web Services, “AWS Cloud Credits for Research”, Retrieved from https://aws.amazon.com/grants/ , 2023.
  • [6] M. Goswami, “Challenges and Solutions in Integrating AI with Multi-Cloud Architectures”, in International Journal of Enhanced Research in Management & Computer Applications ISSN, 2319-747, 2021.
  • [7] P. Leitner, J. Cito, “Patterns in the chaos: a study of performance variation and predictability in public IaaS clouds”, in ACM Transactions on Internet Technology (TOIT), 16(3), 1-23, 2016. https://doi.org/10.1145/2885497
  • [8] I. Sadooghi, J. H. Martin, et al., “Understanding the performance and potential of cloud computing for scientific applications”, in IEEE Transactions on Cloud Computing, 5(2), 358-371, 2015. https://doi.ieeecomputersociety.org/10.1109/TCC.2015.2404821
  • [9] J. Ericson, M. Mohammadian, F. Santana, “Analysis of performance variability in public cloud computing”, in 2017 IEEE International Conference on Information Reuse and Integration (IRI) (pp. 308-314). IEEE, 2017. https://doi.org/10.1109/IRI.2017.47
  • [10] OpenML, "Higgs Boson dataset description," [Online]. Available: https://www.openml.org/search?type=data&sort=runs&id=44129&status=active, [Accessed 20.01.2025]
  • [11] F. Pedregosa, G. Varoquaux, et al., “Scikit-learn: Machine learning in Python”, in Journal of Machine Learning Research, 12, 2825-2830, 2011.
  • [12] Amazon AWS, "AWS EC2 Instance Types," [Online]. Available: https://aws.amazon.com/ec2/instance-types/c6a/ . [Accessed 25.03.2025].
  • [13] Amazon AWS, "AWS EC2 CPU Options," [Online]. Available: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/cpu-options-supported-instances-values.html#cpu-options-compute-optimized [Accessed 25.03.2025].
  • [14] Amazon AWS, "AWS Amazon Linux 2 - Product Landing Page," [Online]. Available: https://aws.amazon.com/amazon-linux-2 [Accessed 25.03.2025].
  • [15] Google Inc., "Google Compute Optimized Machines," [Online]. Available: https://cloud.google.com/compute/docs/compute-optimized-machines#c2_machine_types [Accessed 25.03.2025].
  • [16] The CentOS Project, "Official CentOS Project Website," [Online]. Available: https://www.centos.org/ [Accessed 01.04.2025].
  • [17] PassMark, "CPU Benchmark Comparison," [Online]. Available: https://www.cpubenchmark.net/compare/3896vs4539vs6160/Intel-i7-11700K-vs-Intel-Xeon-Gold-6253CL-vs-AMD-EPYC-7R13-64-Core [Accessed 25.03.2025].
  • [18] D. M. Powers, “Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation”, arXiv preprint arXiv:2010.16061, 2020.
Notes
This work was supported by the Statutory Grant of the Polish Ministry of Science and Higher Education.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-ef62f2e9-9ad4-4e96-bf28-fb1021d1ed09