Reliability is critical to the safety of electronic systems built for the automotive, industrial, and medical sectors. In these systems, embedded memory is especially sensitive because of the large number of minimum-sized devices in the cell arrays. Memory failures that occur after the manufacture-time burn-in testing phase are particularly difficult to address: redundancy allocation is no longer available, and the fault detection schemes currently used in industry generally focus on the cell array while leaving the peripheral logic vulnerable to faults. Even in the cell array, conventional error control coding (ECC) has been limited in its ability to detect and correct failures of more than a few bits, due to the high latency or area overhead of correction. Consequently, improvements to conventional memory resilience techniques are of great importance for continued reliable operation and for countering the raw bit error rate of memory arrays in future technologies at economically feasible design points [11, 36, 37, 53, 56, 70].
In this thesis we examine the landscape of design techniques for reliability, and introduce two novel contributions for improving reliability with low overhead.
To address failures occurring in the cell array, we have implemented an erasure-based ECC scheme (EB-ECC) that extends the conventional ECC already used in memory to detect and correct multiple erroneous bits with low overhead. An important component of this scheme is the method for detecting erasures at runtime; we propose a novel ternary-output sense amplifier design that can reduce the risk of undetected read latency failures in small-swing bitline designs.
While most studies have focused on the static random access memory (SRAM) cell array, for high-reliability products it is important to examine the effects of failures on the peripheral logic as well. We have designed a wordline assertion comparator (WLAC) to detect address decoder failures; in large cache designs, it has lower area overhead than competing techniques in the literature.
B. H. Calhoun, Yu Cao, Xin Li, Ken Mai, L. T. Pileggi, R. A. Rutenbar, and K. L. Shepard. Digital Circuit Design Challenges and Opportunities in the Era of Nanoscale CMOS. Proc. IEEE, 96(2):343–365, 2008. doi: 10.1109/JPROC.2007.911072.
Jangwoo Kim, Mark McCartney, Ken Mai, and Babak Falsafi. Modeling SRAM Failure Rates to Enable Fast, Dense, Low-Power Caches. IEEE Workshop on Silicon Errors in Logic - System Effects, 24–25 March 2009.
Jangwoo Kim, Hyunggyun Yang, Mark P. McCartney, Mudit Bhargava, Ken Mai, and Babak Falsafi. Building fast, dense, low-power caches using erasure-based inline multibit ECC. In Dependable Computing (PRDC), 2013 IEEE 19th Pacific Rim International Symposium on, pages 98–107, 2013. doi: 10.1109/PRDC.2013.19. URL http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6820845.
S. Lin and D. J. Costello. Error Control Coding: Fundamentals and Applications. Prentice Hall, 1983.
R. Naseer and J. Draper. Parallel double error correcting code design to mitigate multibit upsets in SRAMs. In Proceedings of 34th European Solid-State Circuits Conference ESSCIRC 2008, pages 222–225, 2008. doi: 10.1109/ESSCIRC.2008.4681832.
S. Ozdemir, D. Sinha, G. Memik, J. Adams, and H. Zhou. Yield-aware cache architectures. In IEEE/ACM International Symposium on Microarchitecture, pages 15–25, December 2006. doi: 10.1109/MICRO.2006.52.
M. Spica and T. M. Mak. Do we need anything more than single bit error correction (ECC)? In Proceedings of Records of the 2004 International Workshop on Memory Technology, Design and Testing, pages 111–116, 9–10 August 2004. doi: 10.1109/MTDT.2004.1327993.
|Committee:||Hoe, James; Hoefler, Alexander; Strojwas, Andrzej|
|School:||Carnegie Mellon University|
|Department:||Electrical and Computer Engineering|
|School Location:||United States -- Pennsylvania|
|Source:||DAI-B 77/04(E), Dissertation Abstracts International|
|Subjects:||Computer Engineering, Electrical Engineering|
|Keywords:||Cyber physical systems, ECC, Erasure, Memory, Reliability, SRAM|