Current trends in high-performance computing have produced two distinct families of chips. The first, the complex-core family, consists of a few architecturally sophisticated cores. The second consists of many simple cores that lack the advanced features of the complex ones. Both camps have representatives in the current market: the Intel Core Duo family and start-up efforts such as the Tilera 64 chip are the respective vanguards. Currently, complex cores have an advantage over simple ones because most system software and applications are written for sequential machines. Moreover, several compiler techniques have stagnated because of their sequential focus. The rise of both complex and simple cores is disturbing the compiler research field and has brought back problems that have been ignored for more than three decades.
The major performance targets for optimizing compilers have been, and still are, loops. Among the best-known and most researched loop scheduling techniques is software pipelining. With the rise of simple cores, many of the hardware features that supported more advanced software pipelining techniques have been sacrificed in the battle for more cores. As a result, we have to rely on the original software pipelining techniques, which were developed over two decades ago.
The software pipelining framework described in this thesis does not rely on any special hardware support. It was implemented in PathScale’s EKOPath compiler for the SiCortex Multiprocessor architecture. The experimental results show a maximum speedup of 15%. The framework will be part of a production-quality compiler and will be open-sourced to the community.
The main contributions of this thesis are: (1) an implementation of a fast lifetime-sensitive modulo scheduler with limited backtracking, (2) a modulo variable expansion technique to compensate for a missing rotating register file, (3) a register allocator for modulo-variable-expanded kernels, (4) a new code generator that compensates for missing hardware support, and (5) the creation of an experimental testbed to analyse the performance of the software pipelining framework.
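To make contribution (2) concrete, the following is a minimal sketch of the textbook form of modulo variable expansion, not the thesis implementation. It assumes each loop-variant value has a known lifetime in cycles and that the loop has already been scheduled at some initiation interval `ii`. The function name, the value names, and the numbers are hypothetical and chosen only for illustration. A value that stays live for L cycles overlaps ceil(L / II) iterations, so without a rotating register file it needs that many distinct names, and the kernel is unrolled enough times for the names to repeat consistently.

```python
def modulo_variable_expansion(lifetimes, ii):
    """Return (unroll_factor, copies) for a software-pipelined kernel.

    lifetimes: dict mapping a value name to its lifetime in cycles (assumed input)
    ii:        initiation interval of the modulo schedule, in cycles
    """
    if ii <= 0:
        raise ValueError("initiation interval must be positive")

    # A value live for L cycles overlaps ceil(L / ii) consecutive iterations.
    needed = {name: max(1, -(-life // ii)) for name, life in lifetimes.items()}

    # Unroll the kernel enough for the value with the longest lifetime.
    unroll = max(needed.values(), default=1)

    # Round each count up to a divisor of the unroll factor so that the
    # renaming pattern lines up again when the unrolled kernel repeats.
    copies = {
        name: next(d for d in range(q, unroll + 1) if unroll % d == 0)
        for name, q in needed.items()
    }
    return unroll, copies


# Hypothetical lifetimes for three values in a loop scheduled at ii = 2.
unroll, copies = modulo_variable_expansion({"load": 5, "mul": 3, "add": 1}, ii=2)
```

In this sketch, copy k of the unrolled kernel would refer to a value as `name_{k % copies[name]}`. Rounding up to a divisor of the unroll factor trades a few extra registers for a smaller kernel; unrolling by the least common multiple of the per-value counts would instead minimise registers at the cost of code size.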
|Advisor:|Gao, Guang R.|
|School:|University of Delaware|
|Department:|Department of Electrical and Computer Engineering|
|School Location:|United States -- Delaware|
|Source:|MAI 48/01M, Masters Abstracts International|
|Subjects:|Electrical engineering, Computer science|
|Keywords:|Code generation, Modulo scheduling, Open64, Register allocation, Software pipelining|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved