Increasing the Cache Efficiency by Eliminating Noise

Abstract
Caches are utilized inefficiently because not all of the excess data fetched into the cache to exploit spatial locality is actually used. We define cache utilization as the percentage of data brought into the cache that is actually used. Our experiments showed that the Level 1 data cache has a utilization of only about 57%. In this paper, we show that the useless data in a cache block (cache noise) is highly predictable. This predictability can be used to bring only the to-be-referenced data into the cache on a cache miss, reducing the energy, cache space, and bandwidth wasted on useless data. Cache noise prediction is based on the last words usage history of each cache block. Our experiments showed that a code-context predictor is the best-performing predictor, with a predictability of about 95%. In a code-context predictor, each cache block belongs to a code context determined by the upper-order PC bits of the instructions that fetched the cache block. When applying cache noise prediction to the L1 data cache, we observed about 37% improvement in cache utilization, and about 23% and 28% reduction in cache energy consumption and bandwidth requirement, respectively. Cache noise mispredictions increased the miss rate by only 0.1% and had almost no impact on the instructions per cycle (IPC) count. Compared to a sub-blocked cache, fetching only the to-be-referenced data resulted in 97% and 44% improvement in miss rate and cache utilization, respectively; the sub-blocked cache had a bandwidth requirement about 35% of the cache noise prediction based approach.

Sub-blocking is used to mitigate the limitations of larger cache blocks. In sub-blocked caches, sub-blocks (portions of the larger cache block) are fetched on demand, at the expense of larger tag overheads and lower spatial locality exploitation. However, sub-blocking can significantly increase the cache miss rate and cache noise if words are accessed from different sub-blocks.
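The code-context predictor described above can be sketched in a few lines. This is a minimal illustrative model, not the paper's hardware design: the table organization, the number of PC bits dropped (CONTEXT_SHIFT), and the block size are assumptions chosen for the example.

```python
# Sketch of a code-context cache-noise predictor (illustrative parameters).
WORDS_PER_BLOCK = 8   # assumed: 8 words per cache block
CONTEXT_SHIFT = 8     # assumed: drop low-order PC bits; upper bits form the context

class CodeContextPredictor:
    def __init__(self):
        # Maps a code context (upper-order PC bits) to the last observed
        # word-usage bitmask of blocks fetched by instructions in that context.
        self.table = {}

    def context(self, pc):
        return pc >> CONTEXT_SHIFT

    def predict(self, pc):
        # On a cache miss, fetch only the words predicted to be referenced.
        # Fall back to fetching the whole block for an unseen context.
        full_block = (1 << WORDS_PER_BLOCK) - 1
        return self.table.get(self.context(pc), full_block)

    def update(self, pc, used_mask):
        # On eviction, record which words of the block were actually used,
        # so the next miss in this context fetches only those words.
        self.table[self.context(pc)] = used_mask
```

A miss from an instruction in a context whose previous block used only its first four words would then fetch only those four words on the next miss from the same context.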
Other alternatives (18, 20) dynamically adapt the block size to the spatial locality exhibited by the application, but still bring in contiguous words. In this paper, we investigate prediction techniques (called cache noise prediction) that fetch only the to-be-referenced words in a cache block, which may lie in non-contiguous locations. This technique is an attractive alternative to sub-blocking for larger cache blocks because it reduces the bandwidth requirement while avoiding the miss rate and cache noise impact of sub-blocking.
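The advantage over sub-blocking for non-contiguous accesses can be illustrated with a toy calculation. The block and sub-block sizes below are made-up parameters, and the example assumes a perfect cache-noise prediction for simplicity.

```python
# Words fetched for one block whose used words straddle both sub-blocks:
# sub-blocked cache vs. (perfectly) predicted selective word fetch.
WORDS_PER_BLOCK = 8   # assumed block size in words
SUB_BLOCKS = 2        # assumed: block split into two sub-blocks
WORDS_PER_SUB = WORDS_PER_BLOCK // SUB_BLOCKS

def words_fetched_subblocked(used_mask):
    # A sub-blocked cache fetches every sub-block containing a used word.
    fetched = 0
    for s in range(SUB_BLOCKS):
        sub_mask = ((1 << WORDS_PER_SUB) - 1) << (s * WORDS_PER_SUB)
        if used_mask & sub_mask:
            fetched += WORDS_PER_SUB
    return fetched

def words_fetched_predicted(used_mask):
    # With perfect cache-noise prediction, only the used words are fetched,
    # even when they are non-contiguous.
    return bin(used_mask).count("1")

# Two used words, one in each half of the block: sub-blocking fetches all
# 8 words (25% utilization), prediction fetches 2 (100% utilization).
straddling = 0b10000001
```

When the used words fall within one sub-block, the two schemes fetch similar amounts; the gap opens precisely when accesses hit different sub-blocks, which is the case the text identifies as costly for sub-blocking.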
