Design and evaluation of an optimistic CPU: the warp engine

Authors

Littin, Richard H.

Files

thesis.pdf (43.74 MB)

Permanent Link

https://hdl.handle.net/10289/14898

Abstract

Instruction pipelining, out-of-order execution, and branch prediction are techniques that improve performance in processors by manipulating the flow of instructions. These control flow manipulations alone are not adequate to allow large numbers of instructions to execute in parallel because performance is limited by accesses to the relatively slow memory system. Performance can be improved by speculating on the outcomes of control decisions and on the values of data in memory, so that results are returned early. This thesis investigates the requirements of an architecture that speculates on control flow decisions and data values to improve performance through instruction-level parallelism. A new architecture, the WarpEngine, that speculates on control flow decisions and data values is presented. This architecture is shown to have the potential to extract performance through parallelism an order of magnitude larger than that obtained by contemporary microprocessors. Control speculation is achieved using a novel tree-based mechanism that produces multiple flows of control. This scalable mechanism is shown to generate a large group of instructions that can execute in parallel. Also, it is essential that memory accesses are allowed to occur out of programmed order. This form of data speculation is shown to break false data dependencies, improving performance. The use of state-saving resources is examined and the limitations of in-order retirement schemes are shown. These results indicate that the management of these resources is critical to obtaining good performance. Virtual ordered simulation is introduced as a new simulation methodology for modelling out-of-order and speculative architectures. This novel simulation technique is unique because each instruction is only inspected and processed once and, unlike other simulation methodologies, unlimited resources can be modelled.
Individual components can be constrained in isolation so that their effect on performance can be examined in detail. Investigations performed assuming unbounded resources provide new insight into the limits imposed by individual processor components. The architecture presented shows potential for performance well beyond that of contemporary and research architectures. The insights into the limitations of processor components apply to many computer architectures.

Type

Thesis

Date

2000

Publisher

The University of Waikato

Degree

Doctor of Philosophy (PhD)

Supervisor

Pearson, Murray W.
Cleary, John G.
