Article title

A novel adaptive checkpointing method based on information obtained from workflow structure

Full text / Contents
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Scientific workflows are data- and compute-intensive; thus, they may run for days or even weeks on parallel and distributed infrastructures such as grids, supercomputers, and clouds. In these high-performance computing infrastructures, the number of failures that can arise during scientific-workflow enactment can be high, so the use of fault-tolerance techniques is unavoidable. The most frequently used fault-tolerance technique is taking checkpoints from time to time; when a failure is detected, the last consistent state is restored. One of the most critical factors affecting the effectiveness of a checkpointing method is the checkpointing interval. In this work, we propose a Static (Wsb) and an Adaptive (AWsb) Workflow Structure Based checkpointing algorithm. Our results showed that, compared to the optimal checkpointing strategy, the static algorithm may decrease the checkpointing overhead by as much as 33% without affecting the total processing time of workflow execution. The adaptive algorithm may further decrease this overhead while keeping the overall processing time at its necessary minimum.
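The abstract does not say which strategy serves as its "optimal" baseline, and it does not describe the Wsb or AWsb algorithms themselves. To illustrate why the checkpointing interval matters, the sketch below assumes a classic baseline, Young's first-order approximation, where the interval is sqrt(2 * C * MTBF) for checkpoint cost C and mean time between failures MTBF. The function names and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def young_interval(checkpoint_cost: float, mtbf: float) -> float:
    """Young's first-order approximation of the checkpoint interval.

    checkpoint_cost: time to write one checkpoint (seconds).
    mtbf: mean time between failures (seconds).
    Assumption: used here only as an illustrative baseline.
    """
    if checkpoint_cost <= 0 or mtbf <= 0:
        raise ValueError("checkpoint_cost and mtbf must be positive")
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


def checkpoint_overhead(run_time: float, interval: float,
                        checkpoint_cost: float) -> float:
    """Time spent writing checkpoints during a failure-free run.

    One checkpoint is counted for every full interval that elapses.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    return math.floor(run_time / interval) * checkpoint_cost


if __name__ == "__main__":
    # Hypothetical values: 60 s per checkpoint, one failure per day on average.
    interval = young_interval(checkpoint_cost=60.0, mtbf=86_400.0)
    overhead = checkpoint_overhead(run_time=7 * 86_400.0,
                                   interval=interval,
                                   checkpoint_cost=60.0)
    print(f"interval: {interval:.0f} s, overhead over one week: {overhead:.0f} s")
```

A longer interval lowers the overhead computed by `checkpoint_overhead` but increases the work lost when a failure occurs; any approach that skips checkpoints deemed unnecessary would reduce this overhead term.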
Keywords
Publisher
Journal
Year
Pages
387–406
Physical description
Bibliography: 11 items, figures, charts, tables.
Authors
author
  • Obuda University, John von Neumann Faculty of Informatics, 1034 Bécsi str. 96/b., Budapest, Hungary
author
  • University of Westminster, 115 New Cavendish Street, London, United Kingdom
  • MTA SZTAKI, 1518 Budapest, Hungary
  • Obuda University, John von Neumann Faculty of Informatics, Biotech Lab, 1034 Bécsi str. 96/b., Budapest, Hungary
  • MTA SZTAKI, 1518 Budapest, Hungary
Notes
PL
Prepared with funds from the Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for science dissemination activities.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-3dc05a0d-2841-4b8e-a9ab-9f27bb7f27fa