Article title

Evaluation Scheme for NoC-based CMP with Integrated Processor Management System

Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Alongside the opportunities and benefits offered by Chip Multiprocessors (CMPs), many challenges must be addressed in order to exploit the full potential of CMPs. Aspects such as parallel programs, interconnection design, cache arrangement and on-chip core allocation become limiting factors. To ensure the validity of approaches and research, we propose an evaluation system for CMPs with a Network-on-Chip (NoC) and a processor management system integrated on one die. The suggested experimentation system is described in detail. The system used for the tests and the results of the experiments are presented and discussed. As decision-making criteria, we consider the energy efficiency of the Processor Allocator (PA) and the NoC, as well as the NoC traffic characteristics (load balance). To improve understanding of the system, a brief overview of the most important NoC and PA architectures is also presented. The analyzed results reveal that a CMP with a PA controlled by the IFF allocation algorithm for mesh systems and a torus-based NoC driven by DORLB routing with express-virtual-channel flow control achieved the best traffic balance and energy characteristics.
Authors
author
author
author
  • Department of Electrical and Computer Engineering, University of Nevada, Las Vegas, USA
Bibliography
  • [1] I. Ababneh, “An efficient free-list submesh allocation scheme for two-dimensional mesh-connected multicomputers,” Journal of Systems and Software, vol. 79, no. 8, pp. 1168–1179, 2006.
  • [2] J. Balfour and W. J. Dally, “Design tradeoffs for tiled CMP on-chip networks,” in Proceedings of 20th International Conference on Supercomputing, 2006, pp. 187–198.
  • [3] T. Bjerregaard, “The mango clockless network-on-chip: Concepts and implementation,” Ph.D. dissertation, Technical University of Denmark, 2005.
  • [4] Y. M. Boura and C. R. Das, “Efficient fully adaptive wormhole routing in n-dimensional meshes,” in Proceedings of the 14th International Conference on Distributed Computing Systems, 1994, pp. 589–596.
  • [5] S. Bourduas, “Modeling, evaluation, and implementation of ring-based interconnects for network-on-chip,” Ph.D. dissertation, McGill University, 2008.
  • [6] G. Chmaj, D. Zydek, and L. Koszalka, “Comparison of task allocation algorithms for mesh-structured systems,” in Computer Systems Engineering Theory & Applications, 4th PBW. IEE Control and Automation Professional Network, 2004, pp. 39–50.
  • [7] W. J. Dally, “Performance analysis of k-ary n-cube interconnection networks,” IEEE Transactions on Computers, vol. 39, no. 6, pp. 775–785, 1990.
  • [8] W. J. Dally, “Virtual-channel flow control,” IEEE Transactions on Parallel and Distributed Systems, vol. 3, no. 2, pp. 194–205, 1992.
  • [9] W. J. Dally and C. L. Seitz, “The torus routing chip,” Journal of Distributed Computing, vol. 1, no. 4, pp. 187–196, 1986.
  • [10] W. J. Dally and C. L. Seitz, “Deadlock-free message routing in multiprocessor interconnection networks,” IEEE Transactions on Computers, vol. 36, no. 5, pp. 547–553, 1987.
  • [11] W. J. Dally and B. Towles, “Route packets, not wires: On-chip interconnection networks,” in Proceedings of the 38th annual Design Automation Conference, 2001, pp. 684–689.
  • [12] W. J. Dally and B. Towles, Principles and Practices of Interconnection Networks. San Francisco: Morgan Kaufmann, 2004.
  • [13] J. Duato, S. Yalamanchili, and L. Ni, Interconnection Networks. San Francisco: Morgan Kaufmann, 2003.
  • [14] D. N. Jayasimha, B. Zafar, and Y. Hoskote, “On-chip interconnection networks: Why they are different and how to compare them,” Intel, Tech. Rep., 2006.
  • [15] N. K. Kavaldjiev, “A run-time reconfigurable network-on-chip for streaming DSP applications,” Ph.D. dissertation, University of Twente, 2007.
  • [16] P. Krueger, T. H. Lai, and V. A. Dixit-Radiya, “Job scheduling is more important than processor allocation for hypercube computers,” IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 5, pp. 488–497, 1994.
  • [17] A. Kumar, L.-S. Peh, P. Kundu, and N. K. Jha, “Express virtual channels: Towards the ideal interconnection fabric,” ACM SIGARCH Computer Architecture News, vol. 35, no. 2, pp. 150–161, 2007.
  • [18] P. Mohapatra, C. Yu, C. R. Das, and J. Kim, “A lazy scheduling for improving hypercube performance,” in Proceedings of the 1993 International Conference on Parallel Processing (ICPP ’93), vol. 1, 1993, pp. 110–117.
  • [19] C. A. F. D. Rose, H. U. Heiss, and B. Linnert, “Distributed dynamic processor allocation for multicomputers,” Parallel Computing, vol. 33, no. 3, pp. 145–158, 2007.
  • [20] E. Salminen, A. Kulmala, and T. D. Hamalainen, “Survey of network-on-chip proposals,” in White Paper, OCP-IP, 2008, pp. 1–13.
  • [21] C. C. Su and K. G. Shin, “Adaptive deadlock-free routing in multicomputers using only one extra virtual channel,” in Proceedings of the 1993 International Conference on Parallel Processing, vol. 1, 1993, pp. 227–231.
  • [22] M. B. Taylor et al., “The raw microprocessor: A computational fabric for software circuits and general-purpose programs,” IEEE Micro, vol. 22, no. 2, pp. 25–35, 2002.
  • [23] J. Upadhyay, V. Varavithya, and P. Mohapatra, “A traffic-balanced adaptive wormhole routing scheme for two-dimensional meshes,” IEEE Transactions on Computers, vol. 46, no. 2, pp. 190–197, 1997.
  • [24] L. G. Valiant and G. J. Brebner, “Universal schemes for parallel communication,” in Proceedings of the 13th Annual ACM Symposium on Theory of Computing, 1981, pp. 263–277.
  • [25] D. Wiklund, “Development and performance evaluation of networks on chip,” Ph.D. dissertation, Linköping University, 2005.
  • [26] B. S. Yoo and C. R. Das, “A fast and efficient processor allocation scheme for mesh-connected multicomputers,” IEEE Transactions on Computers, vol. 51, no. 1, pp. 46–60, 2002.
  • [27] Y. Zhu, “Efficient processor allocation strategies for mesh-connected parallel computers,” Journal of Parallel and Distributed Computing, vol. 16, no. 4, pp. 328–337, 1992.
  • [28] D. Zydek and H. Selvaraj, “Fast and efficient processor allocation algorithm for torus-based chip multiprocessors,” Journal of Computers & Electrical Engineering, 2009, submitted for publication.
  • [29] D. Zydek and H. Selvaraj, “Processor allocation problem for NoC-based chip multiprocessors,” in Proceedings of 6th International Conference on Information Technology: New Generations (ITNG 2009), 2009, pp. 96–101.
  • [30] D. Zydek and H. Selvaraj, “Hardware implementation of processor allocation schemes for mesh-based chip multiprocessors,” Journal of Microprocessors and Microsystems, vol. 34, no. 1, pp. 39–48, 2010.
  • [31] D. Zydek, H. Selvaraj, G. Borowik, and T. Luba, “Energy characteristic of processor allocator and network-on-chip,” Journal of Applied Mathematics and Computer Science, 2010, submitted for publication.
  • [32] D. Zydek, H. Selvaraj, and L. Gewali, “Synthesis of processor allocator for torus-based chip multiprocessors,” in Proceedings of 7th International Conference on Information Technology: New Generations (ITNG 2010), 2010, pp. 13–18.
  • [33] D. Zydek, N. Shlayan, E. Regentova, and H. Selvaraj, “Review of packet switching technologies for future NoC,” in Proceedings 19th International Conference on Systems Engineering (ICSEng 2008), 2008, pp. 306–311.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BWA0-0045-0025