Aggressive technology scaling has magnified reliability challenges: accelerated aging, increased device variation, and significantly reduced noise margins all raise the number of permanent and transient faults. In this thesis, we address the key yield and reliability challenges of NoC-based SoCs, which include cores, on-chip communications, and on-chip memories. Our yield and cost analysis shows that adding a limited number of spare cores and wires, which replace defective cores and wires either before shipment or in the field, can significantly improve the effective yield, in-field availability, and overall cost of the system, and can eliminate the burn-in process. We also propose a quality metric for on-chip communication that can be used along with frequency binning to price the chip in the market. We demonstrate that the overall quality of a mesh-based NoC depends more on the reliability of the inner links, and hence that a non-uniform distribution of spare wires is more effective than a uniform one.
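To illustrate why a few spare cores can raise effective yield, the following is a minimal sketch and not the yield model used in the thesis. It assumes that core defects are independent, that every core has the same per-core yield, and that any spare can replace any defective core. The function name `effective_yield` and its parameters are hypothetical.

```python
from math import comb


def effective_yield(n_required: int, n_spares: int, core_yield: float) -> float:
    """Probability that at least n_required cores are defect-free.

    Assumes independent, identically distributed core defects
    (a simple binomial model, chosen here only for illustration).
    """
    total = n_required + n_spares
    return sum(
        comb(total, good) * core_yield**good * (1 - core_yield) ** (total - good)
        for good in range(n_required, total + 1)
    )


if __name__ == "__main__":
    # Compare a 16-core chip with zero, one, and two spare cores.
    for spares in (0, 1, 2):
        print(spares, round(effective_yield(16, spares, 0.95), 4))
```

Under these assumptions, the chip without spares is good only if all required cores are good, while each added spare tolerates one more defective core. A realistic analysis would also account for the area and cost of the spares, which this sketch ignores.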
For the reliability of the on-chip memories, we propose error-locality-aware codes to correct single-bit or multi-bit upsets as well as physical defects in SRAM cells. With the same cost as Golay and BCH codes, our proposed codes provide better reliability against multi-bit upsets. We propose an interleaved error-locality-aware code to be used for end-to-end error correction in on-chip communications. In order to maintain the error correction capability of the code for transient and intermittent errors, we further propose an end-to-end data gathering and online diagnosis approach that locates the defective wires and replaces them with the spare wires embedded in the network.
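The idea behind interleaving can be sketched with generic block interleaving, which is a standard technique and not the thesis's error-locality-aware code. Bits of several codewords are transmitted column by column, so a burst of adjacent errors is spread across codewords and each codeword sees at most one error when the burst is no longer than the interleaving depth. The helper names `interleave` and `deinterleave` are hypothetical.

```python
def interleave(codewords: list[list[int]]) -> list[int]:
    """Merge equal-length codewords column by column into one bit stream."""
    return [bit for column in zip(*codewords) for bit in column]


def deinterleave(stream: list[int], depth: int) -> list[list[int]]:
    """Recover the `depth` codewords from an interleaved bit stream."""
    return [stream[i::depth] for i in range(depth)]


if __name__ == "__main__":
    depth = 4
    codewords = [[0] * 8 for _ in range(depth)]  # all-zero codewords for clarity
    stream = interleave(codewords)

    # Flip a burst of `depth` adjacent bits in the transmitted stream.
    for position in range(5, 5 + depth):
        stream[position] ^= 1

    received = deinterleave(stream, depth)
    # Number of flipped bits that landed in each codeword.
    print([sum(word) for word in received])
```

With this arrangement, a code that corrects a single error per codeword would be enough to recover from the burst in the example.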
|Advisor:||Cheng, Kwang-Ting Tim|
|Committee:||Marek-Sadowska, Malgorzata Margaret; Parhami, Behrooz; Yue, Patrick|
|School:||University of California, Santa Barbara|
|Department:||Electrical & Computer Engineering|
|School Location:||United States -- California|
|Source:||DAI-B 72/12, Dissertation Abstracts International|
|Keywords:||Cost modeling, Error correction codes, Multi-core, Network on chip testing, Spare cores and wires, System on chip reliability, System-on-chips, Yield modeling|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved