Dissertation/Thesis Abstract

Design for Competitive Automated Layout (DCAL) of Superscalar Processors
by Ku, Sungkwan, Ph.D., North Carolina State University, 2017, 106; 10970028
Abstract (Summary)

The goal of this thesis is to provide strategies and perspectives for easily and quickly generating layouts of wide superscalar processors, relying almost entirely on automated synthesis and place-and-route, such that the generated layouts are competitive with manual physical design.

Our approach is to focus on physical design aspects at the early microarchitecture design stage so that the subsequent automated synthesis and place-and-route flow can take advantage of them. That is, we design microarchitectures for which the quality gap between automated and manual physical design flows is lessened, so that automation actually delivers a high-quality physical layout. This thesis introduces this new design paradigm, called Design for Competitive Automated Layout (DCAL) of Superscalar Processors. In this work, we explore design strategies at multiple levels where automated synthesis and place-and-route flows traditionally perform poorly.

At present, DCAL targets two key design levels: the circuit level and the microarchitecture level. In circuit-level DCAL, (1) we improve the automated layout quality of highly-ported memories in superscalar processors. These memories are pervasive in a superscalar microarchitecture and account for many of the cycle-critical and energy-critical paths, as well as much of the core area. Our research makes a case for standard-cell based SRAMs (flip-flops, muxes, clock buffers) as the solution to the problem of highly-ported deep-submicron memories. We explore the costs of adding ports in two different multi-ported SRAM implementation styles: full-custom (6T SRAM, scaled up for multiple ports) versus fully synthesized (D flip-flop based).

In microarchitecture-level DCAL, (2) we explore a clustered microarchitecture that can improve power, performance, and area (PPA) metrics of layouts generated by automated synthesis and place-and-route. We implemented an RTL design of a clustered microarchitecture and took that design through physical layout to accurately model wire delay and power. We propose a modular design approach to clustered microarchitectures where modules can be reused to build a large number of clusters. Our focus is on improving the efficiency of the physical implementation of clustered architectures.

Additionally, (3) we perform a design space evaluation of the Trace Processor, an execution paradigm whose goal is to address the concerns of poor performance, power, and frequency scalability of superscalar processors. We adapted the modern RISC-V ISA for the implementation and used SPEC2006 benchmarks for exploration. We optimize the Trace Processor microarchitecture to reduce the size of critical structures such as the Trace Cache, the Outstanding Trace Buffer, and the write ports of the GRFs, thereby demonstrating the DCAL characteristic.

Indexing (document details)
School: North Carolina State University
Department: Computer Engineering
School Location: United States -- North Carolina
Source: DAI-B 80/01(E), Dissertation Abstracts International
Subjects: Computer Engineering
Keywords: Automated, Competitive, DCAL, Layout, Processors, Superscalar
Publication Number: 10970028
ISBN: 9780438284852
Copyright © 2019 ProQuest LLC. All rights reserved.