Dissertation/Thesis Abstract

Improving Reliability of Real-time Embedded Systems
by Ma, Yue, Ph.D., University of Notre Dame, 2019, 157; 27930191
Abstract (Summary)

Multi-processor systems on a chip (MPSoCs) provide high performance and power efficiency, and they are widely used in real-time embedded applications such as automotive electronics, industrial automation, and avionics. Most of these applications must satisfy deterministic or probabilistic timing constraints. However, CMOS technology scaling raises the power density and operating temperature of MPSoCs, which reduces system lifetime reliability. Meanwhile, shrinking transistor feature sizes, together with low supply voltages and frequencies, make MPSoCs more vulnerable to soft errors and thus degrade soft-error reliability. Maintaining quality of service, improving both lifetime reliability and soft-error reliability, and satisfying real-time requirements have become major concerns for MPSoCs, especially those deployed in harsh environments.

This thesis develops techniques to improve the reliability of MPSoCs with different architectural features. For homogeneous MPSoCs, two methods are proposed, one targeting lifetime reliability and the other soft-error reliability. The first method improves lifetime reliability by dynamically reducing operating temperature. The second method maximizes soft-error reliability by dynamically determining which of the tasks that fail because of soft errors should be recovered.

This thesis further studies reliability improvements for “big–little” type MPSoCs, which are widely used in real-time applications. For such systems, both soft-error reliability and lifetime reliability are key concerns. After exploring the power characteristics of the “big” and “little” cores, a framework is developed to improve soft-error reliability while satisfying a lifetime reliability constraint. Based on the cores’ run-time frequencies and utilizations, this framework dynamically increases core frequencies and selects the most power-efficient cores to execute tasks, thereby improving soft-error reliability.
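
As an illustration of why raising a core's frequency can improve soft-error reliability, the following minimal sketch uses an exponential fault-rate model that is common in the literature on voltage and frequency scaling. It is an assumption made here for illustration; the abstract does not state which model the thesis uses. The function names and the parameter values (lam0, d, f_min) are hypothetical.

```python
import math


def fault_rate(f, lam0=1e-6, d=3.0, f_min=0.4):
    """Average soft-error rate at normalized frequency f (f_min <= f <= 1).

    Assumed model: the rate grows exponentially as the frequency (and with
    it the supply voltage) is scaled down from the maximum, f = 1.
    lam0, d, and f_min are illustrative values, not taken from the thesis.
    """
    return lam0 * 10 ** (d * (1.0 - f) / (1.0 - f_min))


def task_reliability(f, wcet_at_fmax, **model_params):
    """Probability that a task finishes without a soft error.

    wcet_at_fmax is the worst-case execution time at f = 1; running at a
    lower frequency stretches the execution time to wcet_at_fmax / f.
    """
    exec_time = wcet_at_fmax / f
    return math.exp(-fault_rate(f, **model_params) * exec_time)


if __name__ == "__main__":
    # Compare the same task at three normalized frequencies.
    for f in (0.4, 0.7, 1.0):
        print(f, task_reliability(f, wcet_at_fmax=10.0))
```

Under this assumed model, a lower frequency hurts twice: the fault rate rises and the task is exposed to faults for longer. That is the intuition behind increasing core frequencies, within the limits set by the lifetime reliability constraint.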

Finally, this thesis addresses MPSoCs with integrated CPU and GPU. Thanks to the massively parallel computing ability of the GPU and the low-power design of the CPU, this type of MPSoC is widely used in real-time applications such as autonomous vehicles and robots. For such MPSoCs, the lifetime reliability and soft-error reliability of both the CPU and the GPU need to be considered. Through detailed execution profiling, we reveal that when a task relies on GPU resources to complete, its execution time on the CPU increases significantly if it runs on a different CPU core from the operating system. Based on this observation, a framework is introduced to map tasks and manage the core frequencies of both the CPU and the GPU in order to improve soft-error reliability while satisfying a lifetime reliability constraint.

Indexing (document details)
School: University of Notre Dame
School Location: United States -- Indiana
Source: DAI-B 81/9(E), Dissertation Abstracts International
Subjects: Computer Engineering
Keywords: Multi-processor systems on a chip, CMOS, System lifetime reliability
Publication Number: 27930191
ISBN: 9781658474795
Copyright © 2021 ProQuest LLC. All rights reserved.