Clustered speculative multithreaded processors

Abstract
In this paper we present a processor microarchitecture that can simultaneously execute multiple threads and has a clustered design for scalability. A main feature of the proposed microarchitecture is its capability to spawn speculative threads from a single-threaded application at run time. These speculative threads use otherwise idle resources of the machine. Spawning a speculative thread involves predicting its control flow as well as its dependences on other threads and the values that flow through them. In this way, threads that are not independent can be executed in parallel. Control-flow, data value, and data dependence predictors designed specifically for this type of microarchitecture are presented. Results show the potential of the microarchitecture to exploit speculative parallelism in programs that are hard to parallelize at compile time, such as those in SpecInt95. For a configuration with 4 thread units, some programs, such as ijpeg and li, can exploit an average degree of parallelism of more than 2 threads per cycle. The average degree of parallelism for the whole SpecInt95 suite is 1.6 threads per cycle. This speculative parallelism results in significant speedups for all the SpecInt95 programs when compared with single-threaded execution.
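
The abstract reports parallelism in "threads per cycle" without defining the metric. As a minimal sketch, assuming it means the mean number of simultaneously active threads over all simulated cycles, it could be computed as follows. The function name, the trace format, and the sample trace are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def average_active_threads(active_per_cycle: Sequence[int], thread_units: int = 4) -> float:
    """Mean number of active threads per cycle.

    Assumption: active_per_cycle[i] is the number of thread units doing
    work in cycle i, as a simulator might record it.
    """
    if not active_per_cycle:
        raise ValueError("trace must contain at least one cycle")
    for count in active_per_cycle:
        # A cycle cannot have more active threads than there are thread units.
        if not 0 <= count <= thread_units:
            raise ValueError(f"active thread count {count} outside 0..{thread_units}")
    return sum(active_per_cycle) / len(active_per_cycle)


if __name__ == "__main__":
    # Hypothetical 10-cycle trace for a configuration with 4 thread units.
    trace = [1, 2, 2, 1, 2, 2, 1, 2, 1, 2]
    print(average_active_threads(trace))  # 16 active thread-cycles / 10 cycles
```

Under this reading, a value of 1.6 means that on average 1.6 of the 4 thread units are busy in any given cycle. It says nothing by itself about how much of that work is later squashed because of misspeculation.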
