COMING SOON! PQDT Open is getting a new home!

ProQuest Open Access Dissertations & Theses will remain freely available as part of a new and enhanced search experience at

Questions? Please refer to this FAQ.

Dissertation/Thesis Abstract

Coded Computing Systems Decoded: Dealing with Unreliability and Elasticity in Modern Computing
by Yang, Yaoqing, Ph.D., Carnegie Mellon University, 2019, 271; 13884004
Abstract (Summary)

Robustness is a fundamental and timeless issue, and it remains vital to all aspects of computation systems, regardless of specific computation platforms, architectures, and algorithm design. The issue is also timely: modern computing systems are increasingly built on unreliable substrates. This thesis designs reliable computing techniques for distributed systems, circuits and networks.

We primarily study techniques inspired from coding theory to address the robustness issues such as system elasticity, stragglers (slow workers), machine failures and soft errors, by carefully weaving redundancy into the data and the design of the algorithm. We primarily focus on three aspects of coding-based computation techniques. The first aspect is to design adaptive computing techniques in largescale systems that are elastic, i.e., the number of machines can change with time. The second is to make the coding-based computation schemes cater to the specific needs of iterative and multi-stage algorithms, such as power iterations and gradient descent that are essential for today’s data analytics. The third aspect is to study the fundamental limits of information propagation in computation networks, provide information-theoretic outer bounds on error accumulation, and design in-network computing schemes to compensate for this error accumulation, e.g., in a circuit network where each computation component can be unreliable.

This thesis presents theoretical results that are fundamental to the understanding of computation systems, and opportunities for radical improvements, e.g. in computation load, communication overhead, and storage cost. We also present real-system implementations and real-data experiments and show our progress on taking these theoretical results closer to practice. The academic results presented in this thesis advance on classical results in the historical field of reliable distributed computing, while simultaneously addressing timely issues arising in today’s computing systems.

Indexing (document details)
Advisor: Grover, Pulkit, Kar, Soummya
Commitee: Grover, Pulkit, Hassibi, Babak, Interlandi, Matteo, Kar, Soummya, Kovacevic, Jelena, Moura, Jose
School: Carnegie Mellon University
Department: Electrical and Computer Engineering
School Location: United States -- Pennsylvania
Source: DAI-B 80/11(E), Dissertation Abstracts International
Subjects: Electrical engineering
Keywords: Coded computing, Distributed system, Elastic computing, Iterative algorithm
Publication Number: 13884004
ISBN: 978-1-392-18233-8
Copyright © 2021 ProQuest LLC. All rights reserved. Terms and Conditions Privacy Policy Cookie Policy