Article title

Turbine: A Distributed-memory Dataflow Engine for High Performance Many-task Applications

Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Efficiently utilizing the rapidly increasing concurrency of multi-petaflop computing systems is a significant programming challenge. One approach is to structure applications with an upper layer of many loosely coupled coarse-grained tasks, each comprising a tightly-coupled parallel function or program. “Many-task” programming models such as functional parallel dataflow may be used at the upper layer to generate massive numbers of tasks, each of which generates significant tightly coupled parallelism at the lower level through multithreading, message passing, and/or partitioned global address spaces. At large scales, however, the management of task distribution, data dependencies, and intertask data movement is a significant performance challenge. In this work, we describe Turbine, a new highly scalable and distributed many-task dataflow engine. Turbine executes a generalized many-task intermediate representation with automated self-distribution and is scalable to multi-petaflop infrastructures. We present here the architecture of Turbine and its performance on highly concurrent systems.
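The two-level "many-task" model the abstract describes can be illustrated with a toy sketch (this is an illustration of the general idea, not Turbine's actual API): an upper layer declares many coarse-grained tasks linked by single-assignment data variables, and each task becomes eligible to run only once all of its inputs are ready, which is what lets an engine distribute ready tasks across workers.

```python
# Toy sketch of many-task dataflow (hypothetical names; not Turbine's API):
# tasks declare input/output variables; a task runs once its inputs resolve.
from concurrent.futures import ThreadPoolExecutor, Future

class ToyDataflow:
    def __init__(self, workers=4):
        self.pool = ThreadPoolExecutor(max_workers=workers)
        self.vars = {}  # name -> Future holding the single-assignment value

    def var(self, name):
        # Create the variable's future on first reference.
        return self.vars.setdefault(name, Future())

    def task(self, fn, inputs, output):
        # Submit a coordination job that waits on input futures,
        # runs fn, and writes the single-assignment output variable.
        in_futs = [self.var(n) for n in inputs]
        out = self.var(output)
        def run():
            args = [f.result() for f in in_futs]  # block on dependencies
            out.set_result(fn(*args))
        self.pool.submit(run)
        return out

engine = ToyDataflow()
engine.task(lambda: 2, [], "a")                   # leaf task, no inputs
engine.task(lambda: 3, [], "b")
engine.task(lambda x, y: x * y, ["a", "b"], "c")  # runs after a and b
print(engine.var("c").result())  # -> 6
```

In a real engine such as Turbine, the coarse tasks would each launch a tightly coupled parallel program rather than a Python function, and the dependency tracking and data movement would themselves be distributed rather than centralized in one process.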
Publisher
Year
Pages
337–366
Physical description
Bibliography: 44 items; figures, tables, charts.
Authors
  • Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, USA
  • Computer Science Department, University of Chicago, Chicago, IL, USA
  • Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, USA
  • Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, USA
  • Computation Institute, University of Chicago & Argonne National Laboratory, Chicago, IL, USA
  • Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, USA
  • Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, USA
Document type
YADDA identifier
bwmeta1.element.baztech-38ff4a36-3bc5-41d6-b118-9bf48f057e2f