We present optimized parallel tiled code for computing McCaskill's partition functions, a CPU- and memory-intensive dynamic programming task in computational biology. To optimize the code, we use our own source-to-source compiler, TRACO, and compare the performance of the code it produces with that of code generated by the state-of-the-art PluTo compiler, which is based on the affine transformations framework (ATF). For the considered task, PluTo generates only serial code: it is highly cache-efficient but exposes no parallelism. The TRACO tiling and parallelization strategy uses the transitive closure of the dependence graph and so avoids computing affine functions. First, rectangular tiles are formed for each statement of the loop nest. Then, if necessary, those tiles are corrected by applying transitive closure so that they are valid under lexicographical order. Tile validity guarantees that the inter-tile dependence graph is acyclic, so a valid schedule for the target tiles can be derived and applied to generate parallel tiled code. The ISL scheduler is used for this purpose. An experimental study carried out on a multi-core computer demonstrates considerable speed-up of the generated code for larger numbers of threads. The generated parallel tiled code outperforms the code generated with the PluTo compiler.
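The link between an acyclic inter-tile dependence graph and a parallel schedule can be illustrated with a minimal Python sketch. It is not TRACO output and does not use ISL. The recurrence, the function names (`deps`, `tile_graph`, `wavefronts`, `run_tiled`, `run_serial`) and the tile size are assumptions chosen for illustration. The toy recurrence only mimics the triangular dependence pattern typical of RNA folding dynamic programming, and its rectangular tiles happen to be valid already, so the transitive-closure correction step is not exercised here.

```python
from collections import defaultdict


def deps(i, j):
    """Cells read by (i, j) in a toy triangular recurrence (assumed, not McCaskill's)."""
    return [(i, k) for k in range(i, j)] + [(k + 1, j) for k in range(i, j)]


def tile_of(cell, b):
    """Map a cell to the rectangular b x b tile that contains it."""
    i, j = cell
    return (i // b, j // b)


def tile_graph(n, b):
    """Build the inter-tile dependence graph as a map: tile -> set of predecessor tiles."""
    preds = defaultdict(set)
    tiles = set()
    for i in range(n):
        for j in range(i, n):
            dst = tile_of((i, j), b)
            tiles.add(dst)
            for cell in deps(i, j):
                src = tile_of(cell, b)
                if src != dst:
                    preds[dst].add(src)
    return tiles, preds


def wavefronts(tiles, preds):
    """Group tiles into fronts; every tile's predecessors lie in earlier fronts."""
    level = {}
    remaining = set(tiles)
    fronts = []
    while remaining:
        ready = sorted(t for t in remaining if all(p in level for p in preds[t]))
        if not ready:
            raise ValueError("inter-tile dependence graph is cyclic: tiles are invalid")
        for t in ready:
            level[t] = len(fronts)
        fronts.append(ready)
        remaining -= set(ready)
    return fronts


def run_tiled(n, b):
    """Evaluate the toy recurrence tile by tile, front by front."""
    tiles, preds = tile_graph(n, b)
    q = {}
    for front in wavefronts(tiles, preds):
        for ti, tj in front:  # tiles within one front could run in parallel
            for i in range(min(n, (ti + 1) * b) - 1, ti * b - 1, -1):
                for j in range(max(i, tj * b), min(n, (tj + 1) * b)):
                    q[i, j] = 1 + sum(q[i, k] * q[k + 1, j] for k in range(i, j))
    return q


def run_serial(n):
    """Reference evaluation in the original (untiled) loop order."""
    q = {}
    for i in range(n - 1, -1, -1):
        for j in range(i, n):
            q[i, j] = 1 + sum(q[i, k] * q[k + 1, j] for k in range(i, j))
    return q
```

Calling `run_tiled(12, 4)` yields the same table as `run_serial(12)`. In this sketch the tiles of a front are visited sequentially; in generated code, each front would correspond to a parallel loop over tiles, with a synchronization point between fronts.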