Multi-processor systems on a chip (MPSoCs) provide high performance and power efficiency. They are widely used in real-time embedded applications such as automotive electronics, industrial automation, and avionics. Most of these applications must satisfy deterministic or probabilistic timing constraints. However, due to CMOS technology scaling, MPSoCs exhibit increasingly high power density and temperature, which reduce system lifetime reliability. Meanwhile, shrinking transistor feature sizes and lower supply voltages and frequencies make MPSoCs more vulnerable to soft errors, degrading soft-error reliability. Maintaining quality of service, improving lifetime reliability and soft-error reliability, and satisfying real-time requirements have become major concerns for MPSoCs, especially those deployed in harsh environments.
This thesis develops techniques to improve reliability for MPSoCs with different architectural features. Two methods are proposed for homogeneous MPSoCs, one improving lifetime reliability and the other improving soft-error reliability. The first method improves lifetime reliability by dynamically reducing operating temperature. The second method maximizes soft-error reliability by dynamically determining which of the tasks that fail due to soft errors should be recovered.
This thesis further studies reliability improvements for “big–little” type MPSoCs, which are widely used in many real-time applications. For such systems, both soft-error reliability and lifetime reliability are key concerns. After exploring the power characteristics of the “big” and “little” cores, a framework is developed to improve soft-error reliability while satisfying a lifetime reliability constraint. Based on the cores’ run-time frequencies and utilizations, this framework dynamically increases core frequencies and selects the most power-efficient cores to execute tasks, thereby improving soft-error reliability.
Finally, this thesis addresses MPSoCs with an integrated CPU and GPU. Thanks to the massively parallel computing capability of the GPU and the low-power design of the CPU, this type of MPSoC has been widely used in real-time applications such as autonomous vehicles and robots. For such MPSoCs, the lifetime reliability and soft-error reliability of both the CPU and the GPU need to be considered. Through detailed execution profiling, we reveal that for a task that relies on GPU resources to complete, its execution time on the CPU increases significantly when it runs on a different CPU core from the operating system. Based on this observation, a framework is introduced to map tasks and manage the core frequencies of both the CPU and the GPU in order to improve soft-error reliability while satisfying a lifetime reliability constraint.
School: University of Notre Dame
School Location: United States -- Indiana
Source: DAI-B 81/9(E), Dissertation Abstracts International
Keywords: Multi-processor systems on a chip, CMOS, System lifetime reliability