In the not-so-distant past, parallel programming was mostly the concern of programmers specializing in high-performance computing. Nowadays, on the other hand, many of today's desktop and laptop computers come equipped with a species of shared-memory multiprocessor called a multicore processor, making parallel programming a concern for a much broader range of programmers. High-level parallel languages, such as Parallel ML (PML) and Haskell, seek to reduce the complexity of programming multicore processors by giving programmers abstract execution models, such as implicit threading, where programmers annotate their programs to suggest the parallel decomposition. Implicitly-threaded programs, however, do not specify the actual decomposition of computations or mapping from computations to processors. The annotations act simply as hints that can be ignored and safely replaced with sequential counterparts. The parallel decomposition itself is the responsibility of the language implementation and, more specifically, of the scheduling system.
Threads can take arbitrarily different amounts of time to execute, and these times are difficult to predict. Implicit threading encourages the programmer to divide the program into threads that are as small as possible because doing so increases the flexibility the scheduler in its duty to distribute work evenly across processors. The downside of such fine-grain parallelism is that if the total scheduling cost is too large, then parallelism is not worthwhile. This problem is the focus of this dissertation.
The starting point of this dissertation is work stealing, a scheduling policy well known for its scalable parallel performance, and the work-first principle, which serves as a guide for building efficient implementations of work stealing. In this dissertation, I present two techniques, Lazy Promotion and Lazy Tree Splitting, for implementing work stealing. Both techniques derive their efficiency from adhering to the work-first principle. Lazy Promotion is a strategy that improves the performance, in terms of execution time, of a work-stealing scheduler by reducing the amount of load the scheduler places on the garbage collector. Lazy Tree Splitting is a technique for automatically scheduling the execution of parallel operations over trees to yield scalable performance and eliminate the need for per-application tuning. I use Manticore, PML's compiler and runtime system, and a sixteen-core NUMA machine as a testbed for these techniques.
In addition, I present two empirical studies. In the first study, I evaluate Lazy Promotion over six PML benchmarks. The results demonstrate that Lazy Promotion either outperforms or performs the same as an alternative scheme based on Eager Promotion. This study also evaluates the design of the Manticore runtime system, in particular, the split-heap memory manager, by comparing the system to an alternative system based on a unified-heap memory manager, and showing that the unified version has limited scalability due to poor locality. In the second study, I evaluate Lazy Tree Splitting over seven PML benchmarks by comparing Lazy Tree Splitting to its alternative, Eager Tree Splitting. The results show that, although the two techniques offer similar scalability, only Lazy Tree Splitting is suitable for building an effective language implementation.
|Advisor:||Reppy, John H.|
|Commitee:||Fluet, Matthew, Rogers, Anne|
|School:||The University of Chicago|
|School Location:||United States -- Illinois|
|Source:||DAI-B 71/10, Dissertation Abstracts International|
|Subjects:||Computer Engineering, Computer science|
|Keywords:||Functional programming, Parallelism, Programming languages, Scheduling|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved
The supplemental file or files you are about to download were provided to ProQuest by the author as part of a
dissertation or thesis. The supplemental files are provided "AS IS" without warranty. ProQuest is not responsible for the
content, format or impact on the supplemental file(s) on our system. in some cases, the file type may be unknown or
may be a .exe file. We recommend caution as you open such files.
Copyright of the original materials contained in the supplemental file is retained by the author and your access to the
supplemental files is subject to the ProQuest Terms and Conditions of use.
Depending on the size of the file(s) you are downloading, the system may take some time to download them. Please be