Over the last decade, graphics processing units (GPUs) have seen their use broaden from purely graphical tasks to general purpose computation. The increased programmability required by demanding graphics applications has proven useful for a number of non-graphical problems as well. GPUs' high memory bandwidth and floating point performance make them attractive for general computation workloads, yet these benefits come at the cost of added complexity. One particular problem is the fact that GPUs and their associated high performance memory typically lie on discrete cards that are separated from the host CPU} by the PCI-Express bus. This requires programmers to carefully manage the transfer of data between the CPU and GPU memory so that the right data is in the right place at the right time. Programmers must design data structures with serialization in mind in order to efficiently move data across the PCI bus. In practice, this leads to programmers working with only simple data structures such as one or two-dimensional arrays and the applications that can be easily expressed in terms of these structures. CPU programmers have long had access to richer data structures, such as trees or first class procedures, which enable new and simpler approaches to solving certain problems.
This thesis explores the use of RBMM to overcome these data movement challenges. RBMM is a technique in which data is assigned to regions and these regions can then be operated on as a unit. One of the first uses of regions was to amortize the cost of deallocation. Many small objects would be allocated in a single region and the region could be deallocated as a single operation independent of the number of items in the region. In this thesis, regions are used as the unit of data movement between the CPU and GPU. Data structures are assigned to a region and thus the runtime system does not have to be aware of the internal layout of a data structure. The runtime system can simply move the entire region from one device to another, keeping the internal layout intact and allowing code running on either device to operate on the data in the same way.
These ideas are explored through a new programming language called Harlan. Harlan is designed to simplify programming GPUs and other data parallel processors. It provides kernel expressions as its fundamental mechanism for parallelism. Kernels function similarly to a parallel map or zipWith operation from other functional programming languages. For example, the expression
Besides providing the design and description of the implementation of Harlan, this thesis includes a type safety proof for a small model of Harlan's region system as well as a number of small application case studies. The type safety proof provides formal support that Harlan ensures programs will have the right data in the right place at the right time. The application case studies show that Harlan and the ideas embodied within it are useful both for a number of traditional applications as well as problems that are problematic for previous GPU programming languages. The design and implementation of Harlan, its proof of type safety and the set of application case studies together show that region-based memory management is an effective way of enabling high level features in languages targeting CPU/GPU systems and other machines with disjoint memories.
|Commitee:||Chauhan, Arun, Newton, Ryan, Sabry, Amr|
|School Location:||United States -- Indiana|
|Source:||DAI-B 77/12(E), Dissertation Abstracts International|
|Keywords:||GPU, Harlan, Parallelism, Programming languages, Region based memory management|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved
The supplemental file or files you are about to download were provided to ProQuest by the author as part of a
dissertation or thesis. The supplemental files are provided "AS IS" without warranty. ProQuest is not responsible for the
content, format or impact on the supplemental file(s) on our system. in some cases, the file type may be unknown or
may be a .exe file. We recommend caution as you open such files.
Copyright of the original materials contained in the supplemental file is retained by the author and your access to the
supplemental files is subject to the ProQuest Terms and Conditions of use.
Depending on the size of the file(s) you are downloading, the system may take some time to download them. Please be