Predictor-directed stream buffers

1 December 2000

conference paper
Published by Association for Computing Machinery (ACM)

p. 42-53
https://doi.org/10.1145/360128.360135

Abstract

An effective method for reducing the effect of load latency in modern processors is data prefetching. One form of data prefetching, stream buffers, has been shown to be particularly effective due to its ability to detect data streams and run ahead of them, prefetching as it goes. Unfortunately, in the past, the applicability of streaming was limited to stride-intensive code. In this paper we propose Predictor-Directed Stream Buffers (PSB), a scheme in which the stream buffer follows an address prediction stream instead of a fixed stride. In addition, we examine using confidence techniques to guide the allocation and prioritization of stream buffers and their prefetch requests. Our results show that, for pointer-based applications, PSB provides a 30% speedup on average over no prefetching, and an average 10% speedup over previously proposed stride-based stream buffers.
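
The central idea of the abstract, a stream buffer that runs ahead along predicted addresses rather than a fixed stride, can be illustrated with a small software model. This is only a sketch under stated assumptions: the class names, the buffer depth, the single-buffer setup, and the first-order Markov table standing in for the address predictor are all hypothetical choices made for illustration, not the hardware design evaluated in the paper. The confidence-guided allocation and prioritization mentioned in the abstract are not modeled.

```python
class StridePredictor:
    """Baseline: the next address is always the current one plus a fixed stride."""

    def __init__(self, stride):
        self.stride = stride

    def train(self, prev_addr, addr):
        pass  # a fixed-stride stream needs no training in this sketch

    def predict(self, addr):
        return addr + self.stride


class MarkovPredictor:
    """Assumed stand-in predictor: remembers the last observed successor of each address."""

    def __init__(self):
        self.next_of = {}

    def train(self, prev_addr, addr):
        self.next_of[prev_addr] = addr

    def predict(self, addr):
        return self.next_of.get(addr)  # None when no prediction is available


class StreamBuffer:
    """A single stream buffer whose run-ahead addresses come from a predictor."""

    def __init__(self, predictor, depth=4):
        self.predictor = predictor
        self.depth = depth
        self.entries = []  # addresses prefetched ahead of the demand stream

    def _fill(self, last):
        # Follow the prediction stream until the buffer is full or it ends.
        while len(self.entries) < self.depth:
            last = self.predictor.predict(last)
            if last is None:
                break
            self.entries.append(last)

    def allocate(self, miss_addr):
        # Restart the stream at the missing address.
        self.entries = []
        self._fill(miss_addr)

    def lookup(self, addr):
        if addr not in self.entries:
            return False
        # Hit: discard entries up to the hit and keep running ahead.
        self.entries = self.entries[self.entries.index(addr) + 1:]
        self._fill(self.entries[-1] if self.entries else addr)
        return True


def run(trace, predictor, depth=4):
    """Return how many accesses in the trace hit in the stream buffer."""
    buf = StreamBuffer(predictor, depth)
    hits, prev = 0, None
    for addr in trace:
        if buf.lookup(addr):
            hits += 1
        else:
            buf.allocate(addr)
        if prev is not None:
            predictor.train(prev, addr)
        prev = addr
    return hits


# A pointer chase over irregularly placed nodes, traversed twice.
nodes = [0x1000, 0x8040, 0x2300, 0x9AB0, 0x4120, 0x7700]
pointer_trace = nodes * 2
# A regular strided walk, for contrast.
strided_trace = list(range(0, 64 * 12, 64))

print("pointer chase, stride-directed:   ", run(pointer_trace, StridePredictor(64)))
print("pointer chase, predictor-directed:", run(pointer_trace, MarkovPredictor()))
print("strided walk, stride-directed:    ", run(strided_trace, StridePredictor(64)))
```

In this toy model the stride-directed buffer never hits on the pointer chase, while the predictor-directed buffer starts hitting on the second traversal, once the table has learned the successor of each node. The hit counts are properties of this sketch only and say nothing about the speedups reported in the paper.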

