Article title

A hybrid scheduler for many task computing in big data systems

Publication languages
EN
Abstracts
EN
With the rapid evolution of distributed computing in the last few years, the amount of data created and processed has quickly grown to the petabyte or even exabyte scale. Such huge data sets call for data-intensive computing applications and impose performance requirements on the infrastructures that support them, such as high scalability, storage and fault tolerance, as well as efficient scheduling algorithms. This paper presents a hybrid scheduling algorithm for many-task computing that targets big data environments with few penalties, takes deadlines into consideration and satisfies a data-dependent task model. The hybrid solution combines several heuristics and algorithms (min-min, min-max and earliest deadline first) into a scheduling algorithm that matches this problem. The experiments are conducted by simulation and show that the proposed hybrid algorithm behaves very well in terms of meeting deadlines.
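The abstract names its building blocks but does not say how they are combined. The following is only a minimal sketch of two of them in their common textbook form, min-min and earliest deadline first, and not the authors' hybrid algorithm. The function names `min_min` and `edf_order`, the `exec_time` matrix layout and the sample values are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def min_min(exec_time: List[List[float]]) -> Tuple[Dict[int, int], List[float]]:
    """Textbook min-min heuristic (illustrative, not the paper's algorithm).

    exec_time[t][m] is the assumed run time of task t on machine m.
    Returns the task -> machine assignment and each machine's finish time.
    """
    n_machines = len(exec_time[0])
    ready = [0.0] * n_machines            # time at which each machine is free
    unscheduled = set(range(len(exec_time)))
    assignment: Dict[int, int] = {}

    while unscheduled:
        best = None                       # (completion time, task, machine)
        # Find the (task, machine) pair with the smallest completion time,
        # i.e. the minimum over tasks of each task's minimum over machines.
        for t in sorted(unscheduled):
            for m in range(n_machines):
                completion = ready[m] + exec_time[t][m]
                if best is None or completion < best[0]:
                    best = (completion, t, m)
        completion, t, m = best
        assignment[t] = m
        ready[m] = completion
        unscheduled.remove(t)

    return assignment, ready


def edf_order(deadlines: List[float]) -> List[int]:
    """Earliest deadline first: task indices sorted by increasing deadline."""
    return sorted(range(len(deadlines)), key=lambda t: deadlines[t])


if __name__ == "__main__":
    # Hypothetical data: 3 tasks, 2 machines.
    times = [[3.0, 5.0], [2.0, 4.0], [6.0, 1.0]]
    print(min_min(times))
    print(edf_order([10.0, 3.0, 7.0]))
```

Min-min favours short tasks and tends to keep the makespan low, while earliest deadline first orders work purely by urgency. A hybrid scheduler has to trade these two goals off against each other, and how the paper does so is not stated in this record.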
Year
Pages
385–399
Physical description
Bibliography: 41 items, tables, charts.
Authors
author
  • Computer Science Department, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, 313, Splaiul Independentei, 060042 Bucharest, Romania
author
  • Computer Science Department, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, 313, Splaiul Independentei, 060042 Bucharest, Romania; National Institute for Research and Development in Informatics (ICI) 8–10, Mareşal Averescu, 011455 Bucharest, Romania
author
  • Computer Science Department, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, 313, Splaiul Independentei, 060042 Bucharest, Romania
author
  • Computer Science Department, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, 313, Splaiul Independentei, 060042 Bucharest, Romania
author
  • Computer Science Department, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, 313, Splaiul Independentei, 060042 Bucharest, Romania
author
  • Institute of Computer Science, Cracow University of Technology, ul. Warszawska 24, 31-155 Cracow, Poland
References
  • [1] Aamodt, K., Quintana, A.A., Achenbach, R., Acounis, S., Adler, C., Aggarwal, M., Agnese, F., Rinella, G.A., Ahammed, Z. and Ahmad, A. (2008). The Alice experiment at the CERN LHC, Journal of Instrumentation 3(08): S08002.
  • [2] Benziani, Y., Kacem, I., Laroche, P. and Nagih, A. (2014). Exact and heuristic methods for minimizing the total completion time in job-shops, Studies in Informatics and Control 23(1): 31–40.
  • [3] Bessis, N., Sotiriadis, S., Cristea, V. and Pop, F. (2011). Modelling requirements for enabling meta-scheduling in inter-clouds and inter-enterprises, 2011 3rd International Conference on Intelligent Networking and Collaborative Systems (INCoS), Fukuoka, Japan, pp. 149–156.
  • [4] Bourdena, A., Mavromoustakis, C. X., Kormentzas, G., Pallis, E., Mastorakis, G. and Yassein, M.B. (2014). A resource intensive traffic-aware scheme using energy-aware routing in cognitive radio networks, Future Generation Computer Systems 39: 16–28.
  • [5] Cabrera, G., Niklander, S., Cabrera, E. and Johnson, F. (2016). Solving a distribution network design problem by means of evolutionary algorithms, Studies in Informatics and Control 25(1): 21–28.
  • [6] Chmaj, G., Walkowiak, K., Tarnawski, M. and Kucharzak, M. (2012). Heuristic algorithms for optimization of task allocation and result distribution in peer-to-peer computing systems, International Journal of Applied Mathematics and Computer Science 22(3): 733–748, DOI: 10.2478/v10006-012-0055-0.
  • [7] Delen, D. and Demirkan, H. (2013). Data, information and analytics as services, Decision Support Systems 55(1): 359–363.
  • [8] Dimitriou, C.D., Mavromoustakis, C.X., Mastorakis, G. and Pallis, E. (2013). On the performance response of delay-bounded energy-aware bandwidth allocation scheme in wireless networks, 2013 IEEE International Conference on Communications Workshops (ICC), Budapest, Hungary, pp. 631–636.
  • [9] Esposito, C., Cotroneo, D. and Russo, S. (2013). On reliability in publish/subscribe services, Computer Networks 57(5): 1318–1343.
  • [10] Esposito, C., Ficco, M., Palmieri, F. and Castiglione, A. (2015). A knowledge-based platform for big data analytics based on publish/subscribe services and stream processing, Knowledge-Based Systems 79: 3–17.
  • [11] Esposito, C., Platania, M. and Beraldi, R. (2014). Reliable and timely event notification for publish/subscribe services over the internet, IEEE/ACM Transactions on Networking 22(1): 230–243.
  • [12] Gąsior, J. and Seredyński, F. (2015). Decentralized job scheduling in the cloud based on a spatially generalized Prisoner’s Dilemma game, International Journal of Applied Mathematics and Computer Science 25(4): 737–751, DOI: 10.1515/amcs-2015-0053.
  • [13] Gheith, A., Rajamony, R., Bohrer, P., Agarwal, K., Kistler, M., Eagle, B.W., Hambridge, C., Carter, J. and Kaplinger, T. (2016). IBM BlueMix mobile cloud services, IBM Journal of Research and Development 60(2–3): 7–1.
  • [14] He, C., Li, J., Liao, Z. and Zhang, C. (2016). MPS: A multipath publish/subscribe model in information-centric network, International Journal of Wireless and Mobile Computing 10(2): 130–137.
  • [15] Hepburn, A. (2011). Facebook statistics, stats & facts for 2011, www.digitalbuzzblog.com.
  • [16] Izakian, H., Abraham, A. and Snášel, V. (2009). Performance comparison of six efficient pure heuristics for scheduling meta-tasks on heterogeneous distributed environments, Neural Network World 19(6): 695–710.
  • [17] Janiak, A., Kwiatkowski, T. and Lichtenstein, M. (2013). Scheduling problems with a common due window assignment: A survey, International Journal of Applied Mathematics and Computer Science 23(1): 231–241, DOI: 10.2478/amcs-2013-0018.
  • [18] Jaskóła, P., Arabas, P. and Karbowski, A. (2016). Simultaneous routing and flow rate optimization in energy-aware computer networks, International Journal of Applied Mathematics and Computer Science 26(1): 231–243, DOI: 10.1515/amcs-2016-0016.
  • [19] Karpowicz, M.P., Arabas, P. and Niewiadomska-Szynkiewicz, E. (2015). Energy-aware multilevel control system for a network of Linux software routers: Design and implementation, IEEE Systems Journal PP(99): 1–12.
  • [20] Kobylinski, K., Bennett, J., Seto, N., Lo, G. and Tucci, F. (2014). Enterprise application development in the cloud with IBM BlueMix, Proceedings of the 24th Annual International Conference on Computer Science and Software Engineering, Markham, Ontario, Canada, pp. 276–279.
  • [21] Kołodziej, J. and Xhafa, F. (2011). Modern approaches to modeling user requirements on resource and task allocation in hierarchical computational grids, International Journal of Applied Mathematics and Computer Science 21(2): 243–257, DOI: 10.2478/v10006-011-0018-x.
  • [22] Litvinski, O. and Gherbi, A. (2013). OpenStack scheduler evaluation using design of experiment approach, 16th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing (ISORC 2013), Paderborn, Germany, pp. 1–7.
  • [23] Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C. and Byers, A.H. (2014). Big data: The next frontier for innovation, competition, and productivity, 2011, McKinsey Global Institute 5(33): 222.
  • [24] Mavromoustakis, C.X., Mastorakis, G., Bourdena, A., Pallis, E., Stratakis, D., Perakakis, E., Kopanakis, I., Papadakis, S., Zaharis, Z.D. and Skeberis, C. (2015). A social-oriented mobile cloud scheme for optimal energy conservation, in G. Mastorakis et al. (Eds.), Resource Management of Mobile Cloud Computing Networks and Environments, IGI Global, Hershey, PA, pp. 97–121.
  • [25] Negru, C., Mocanu, M. and Cristea, V. (2015). Impact of virtual machines heterogeneity on data center power consumption in data-intensive applications, ACM Symposium on Principles of Distributed Computing: PODC 2015, Donostia-San Sebastián, Spain, pp. 91–102.
  • [26] Negru, C., Mocanu, M., Cristea, V., Sotiriadis, S. and Bessis, N. (2016). Analysis of power consumption in heterogeneous virtual machine environments, Soft Computing: 1–12, DOI: 10.1007/s00500-016-2129-7.
  • [27] Negru, C., Pop, F., Cristea, V., Bessis, N. and Li, J. (2013). Energy efficient cloud storage service: Key issues and challenges, 2013 4th International Conference on Emerging Intelligent Data and Web Technologies (EIDWT), Xi’an, China, pp. 763–766.
  • [28] Nicolae, A.A., Negru, C., Pop, F., Mocanu, M. and Cristea, V. (2014). Resource-aware hybrid scheduling algorithm in heterogeneous distributed computing, International Conference on Network-Based Information Systems, Salerno, Italy, pp. 221–229.
  • [29] Niewiadomska-Szynkiewicz, E., Sikora, A., Arabas, P., Kamola, M., Mincer, M. and Kołodziej, J. (2014). Dynamic power management in energy-aware computer networks and data intensive computing systems, Future Generation Computer Systems 37: 284–296.
  • [30] Normandeau, K. (2013). Beyond volume, variety and velocity is the issue of big data veracity, Inside Big Data, HP Newsletter: 12 September 2013.
  • [31] Raicu, I., Foster, I.T. and Zhao, Y. (2008). Many-task computing for grids and supercomputers, Workshop on Many-Task Computing on Grids and Supercomputers, MTAGS 2008, Austin, TX, USA, pp. 1–11.
  • [32] Reed, D.A. and Dongarra, J. (2015). Exascale computing and big data, Communications of the ACM 58(7): 56–68.
  • [33] Różycki, R., Waligóra, G. and Węglarz, J. (2016). Scheduling preemptable jobs on identical processors under varying availability of an additional continuous resource, International Journal of Applied Mathematics and Computer Science 26(3): 693–706, DOI: 10.1515/amcs-2016-0048.
  • [34] Russom, P. (2011). Big data analytics, TDWI Best Practices Report, Fourth Quarter.
  • [35] Sefraoui, O., Aissaoui, M. and Eleuldj, M. (2012). Openstack: Toward an open-source solution for cloud computing, International Journal of Computer Applications 55(3): 38–42.
  • [36] Sfrent, A. and Pop, F. (2015). Asymptotic scheduling for many task computing in big data platforms, Information Sciences 319(C): 71–91.
  • [37] Sharma, G. and Ganpati, A. (2015). Performance evaluation of fair and capacity scheduling in Hadoop YARN, 2015 International Conference on Green Computing and Internet of Things (ICGCIoT), Greater Noida, India, pp. 904–906.
  • [38] Shvachko, K., Kuang, H., Radia, S. and Chansler, R. (2010). The hadoop distributed file system, 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), Incline Village, NV, USA, pp. 1–10.
  • [39] Vavilapalli, V.K., Murthy, A.C., Douglas, C., Agarwal, S., Konar, M., Evans, R., Graves, T., Lowe, J., Shah, H., Seth, S. et al. (2013). Apache Hadoop YARN: Yet another resource negotiator, Proceedings of the 4th Annual Symposium on Cloud Computing, Santa Clara, CA, USA, p. 5.
  • [40] Waller, M.A. and Fawcett, S.E. (2013). Data science, predictive analytics, and big data: A revolution that will transform supply chain design and management, Journal of Business Logistics 34(2): 77–84.
  • [41] Zikopoulos, P. and Eaton, C. (2011). Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data, 1st Edn., McGraw-Hill, New York, NY.
Notes
Prepared with funds from the Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for science dissemination activities (2017 tasks).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-c2a5ded3-9403-47fb-ae05-b8a46b14bd2e