Article title

Exploring Processor Parallelism: Estimation Methods and Optimization Strategies

Publication languages
EN
Abstracts
EN
Automatic optimization of application-specific instruction-set processor (ASIP) architectures mostly focuses on the design of the internal memory hierarchy or on the extension of reduced instruction-set architectures with complex custom operations. This paper focuses on very long instruction word (VLIW) architectures and, more specifically, on automating the selection of an application-specific VLIW issue-width. The issue-width selection strongly influences all the important processor properties (e.g. processing speed, silicon area, and power consumption). Accurate and efficient issue-width estimation and optimization are therefore among the most important aspects of VLIW ASIP design. In this paper, we first compare different methods for estimating the required issue-width, and subsequently introduce a new force-based parallelism estimation method that estimates the required issue-width with only 3% error on average. Furthermore, we present and compare two techniques for estimating the required issue-width of software-pipelined loop kernels and show that a simple utilization-based measure provides an error margin of less than 1% on average.
Authors
author
  • Electronic Systems group at the Faculty of Electrical Engineering, Eindhoven University of Technology, The Netherlands
author
  • Electronic Systems group at the Faculty of Electrical Engineering, Eindhoven University of Technology, The Netherlands
author
  • Electronic Systems group at the Faculty of Electrical Engineering, Eindhoven University of Technology, The Netherlands
author
  • Electronic Systems group at the Faculty of Electrical Engineering, Eindhoven University of Technology, The Netherlands
Bibliography
  • [1] ASAM, “Project website.” [Online]. Available: http://www.asam-project.org
  • [2] Synopsys, “Synopsys Processor Designer.” [Online]. Available: http://www.synopsys.com
  • [3] Target, “Target Compiler Technologies: IP Designer.” [Online]. Available: http://www.retarget.com/
  • [4] FlexASP project, “TTA-based co-design environment.” [Online]. Available: http://tce.cs.tut.fi/
  • [5] S. Aditya, B. Rau, and V. Kathail, “Automatic architecture synthesis and compiler retargeting for VLIW and EPIC processors,” in ISSS 1999 - 12th International Symposium on System Synthesis. IEEE, November 1999, pp. 107–113.
  • [6] V. Kathail, S. Aditya, R. Schreiber, B. Ramakrishna Rau, D. Cronquist, and M. Sivaraman, “PICO: automatically designing custom computers,” IEEE Computer, vol. 35, no. 9, pp. 39–47, September 2002.
  • [7] H. Corporaal and J. Hoogerbrugge, “Cosynthesis with the MOVE framework,” in CESA 1996 - Multiconference on Computational Engineering in Systems Applications - Symposium on Modeling, Analysis, and Simulation. IEEE, July 1996, pp. 184–189.
  • [8] L. Pozzi, K. Atasu, and P. Ienne, “Exact and approximate algorithms for the extension of embedded processor instruction sets,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 25, no. 7, pp. 1209–1229, July 2006.
  • [9] C. Wolinski and K. Kuchcinski, “Automatic selection of application-specific reconfigurable processor extensions,” in DATE 2008 - Design, Automation & Test in Europe Conference & Exhibition. IEEE, March 2008, pp. 1214–1219.
  • [10] J. Matai, J. Oberg, A. Irturk, T. Kim, and R. Kastner, “Trimmed VLIW: Moving application specific processors towards high level synthesis,” in ESLsyn 2012 - The Electronic System Level Synthesis Conference. IEEE, June 2012, pp. 11–16.
  • [11] A. Irturk, J. Matai, J. Oberg, J. Su, and R. Kastner, “Simulate and eliminate: A top-to-bottom design methodology for automatic generation of application specific architectures,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 8, pp. 1173–1183, August 2011.
  • [12] P. Qiao, “Design and optimization of digital hearing aid system based on Silicon Hive technology,” Master’s thesis, Eindhoven University of Technology, Eindhoven, The Netherlands, August 2010. [Online]. Available: http://alexandria.tue.nl/repository/books/709025.pdf
  • [13] P. Qiao, H. Corporaal, and M. Lindwer, “A 0.964 mW digital hearing aid system,” in DATE 2011 - Design, Automation & Test in Europe Conference & Exhibition. IEEE, March 2011, pp. 1–4.
  • [14] Y. Okmen, “SIMD floating point processor and efficient implementation of ray tracing algorithm,” Master’s thesis, TU Delft, Delft, The Netherlands, October 2011. [Online]. Available: http://repository.tudelft.nl/assets/uuid:b0a8ae03-18b9-4a0e-9761-64ffd2851074/YunusOkmenMScThesis.pdf
  • [15] E. Diken, R. Jordans, R. Corvino, and L. Jóźwiak, “Application analysis driven ASIP-based system synthesis for ECG,” in Embedded World Conference, February 2012, pp. 1–8.
  • [16] G. S. Tjaden and M. J. Flynn, “Detection and parallel execution of independent instructions,” IEEE Transactions on Computers, vol. 19, no. 10, pp. 889–895, October 1970.
  • [17] D. W. Wall, “Limits of instruction-level parallelism,” in ASPLOS 1991 - 4th International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, April 1991, pp. 176–188.
  • [18] T. M. Austin and G. S. Sohi, “Dynamic dependency analysis of ordinary programs,” in ISCA 1992 - 19th Annual International Symposium on Computer Architecture. ACM, May 1992, pp. 342–351.
  • [19] K. B. Theobald, G. R. Gao, and L. J. Hendren, “On the limits of program parallelism and its smoothability,” in MICRO 1992 - 25th Annual International Symposium on Microarchitecture. ACM, December 1992, pp. 10–19.
  • [20] V. C. Cabezas and P. Stanley-Marbell, “Parallelism and data movement characterization of contemporary application classes,” in SPAA 2011 - 23rd Symposium on Parallelism in Algorithms and Architectures. ACM, June 2011, pp. 95–104.
  • [21] R. Jordans, R. Corvino, and L. Jóźwiak, “Algorithm parallelism estimation for constraining instruction-set synthesis for VLIW processors,” in DSD 2012 - 15th Euromicro Conference on Digital System Design. IEEE, September 2012, pp. 152–155.
  • [22] R. Jordans, R. Corvino, L. Jóźwiak, and H. Corporaal, “Exploring processor parallelism: Estimation methods and optimization strategies,” in DDECS 2013 - 16th Symposium on Design and Diagnostics of Electronic Circuits and Systems. IEEE, April 2013, pp. 18–23.
  • [23] M. Lam, “Software pipelining: An effective scheduling technique for VLIW machines,” ACM SIGPLAN Notices, vol. 23, no. 7, pp. 318–328, July 1988.
  • [24] B. R. Rau, “Iterative modulo scheduling: An algorithm for software pipelining loops,” in MICRO 1994 - 27th Annual International Symposium on Microarchitecture. ACM, December 1994, pp. 63–74.
  • [25] S. Carr, C. Ding, and P. Sweany, “Improving software pipelining with unroll-and-jam,” in HICSS 1996 - 29th Hawaii International Conference on System Sciences. IEEE, January 1996, pp. 183–192.
  • [26] E. M. Riseman and C. C. Foster, “The inhibition of potential parallelism by conditional jumps,” IEEE Transactions on Computers, vol. 21, no. 12, pp. 1405–1411, December 1972.
  • [27] P. Paulin and J. Knight, “Force-directed scheduling for the behavioral synthesis of ASICs,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 8, no. 6, pp. 661–679, June 1989.
  • [28] LLVM, “Project website.” [Online]. Available: http://www.llvm.org
  • [29] The R project for statistical computing, “Project website.” [Online]. Available: http://www.r-project.org/
  • [30] M. Smotherman, S. Krishnamurthy, P. S. Aravind, and D. Hunnicutt, “Efficient DAG construction and heuristic calculation for instruction scheduling,” in MICRO 1991 - 24th Annual International Symposium on Microarchitecture. ACM, November 1991, pp. 93–102.
  • [31] A. M. Malik, J. McInnes, and P. van Beek, “Optimal basic block instruction scheduling for multiple-issue processors using constraint programming,” International Journal on Artificial Intelligence Tools, vol. 17, no. 1, pp. 37–54, February 2008.
  • [32] L.-N. Pouchet, “Polybench/C 3.2,” 2013. [Online]. Available: http://www.cse.ohio-state.edu/pouchet/software/polybench/
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-76aa3d11-68cd-40cf-8c64-19289bef1b63