Cache bursts: A new approach for eliminating dead blocks and increasing cache efficiency

1 November 2008

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 222-233
https://doi.org/10.1109/micro.2008.4771793

Abstract

Data caches in general-purpose microprocessors often contain mostly dead blocks and are thus used inefficiently. To improve cache efficiency, dead blocks should be identified and evicted early. Prior schemes predict the death of a block immediately after it is accessed; however, these schemes yield lower prediction accuracy and coverage. Instead, we find that predicting the death of a block when it just moves out of the MRU position gives the best tradeoff between timeliness and prediction accuracy/coverage. Furthermore, the individual reference history of a block in the L1 cache can be irregular because of data/control dependence. This paper proposes a new class of dead-block predictors that predict dead blocks based on bursts of accesses to a cache block. A cache burst begins when a block becomes MRU and ends when it becomes non-MRU. Cache bursts are more predictable than individual references because they hide the irregularity of individual references. When used at the L1 cache, the best burst-based predictor can identify 96% of the dead blocks with a 96% accuracy. With the improved dead-block predictors, we evaluate three ways to increase cache efficiency by eliminating dead blocks early: replacement optimization, bypassing, and prefetching. The most effective approach, prefetching into dead blocks, increases the average L1 efficiency from 8% to 17% and the L2 efficiency from 17% to 27%. This increased cache efficiency translates into higher overall performance: prefetching into dead blocks outperforms the same prefetch scheme without dead-block prediction by 12% at the L1 and by 13% at the L2.
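The burst definition in the abstract (a burst begins when a block becomes MRU and ends when it becomes non-MRU) can be illustrated with a small simulation of one LRU cache set. This is only a sketch of the definition, not the paper's predictor: the function name `count_bursts`, the `ways` parameter, and the tuple layout of the result are assumptions made for illustration.

```python
from collections import OrderedDict


def count_bursts(accesses, ways=2):
    """Simulate one LRU cache set and count bursts per block generation.

    Hypothetical helper: returns (block, references, bursts) tuples,
    evicted generations first (in eviction order), then resident blocks
    from LRU to MRU.
    """
    lru = OrderedDict()  # block -> [references, bursts]; last key is MRU
    finished = []
    for block in accesses:
        mru = next(reversed(lru)) if lru else None
        if block in lru:
            lru[block][0] += 1
            if block != mru:
                lru[block][1] += 1  # block re-enters MRU: a new burst begins
            lru.move_to_end(block)
        else:
            if len(lru) == ways:
                # Evict the LRU block; its generation is complete.
                victim, (refs, bursts) = lru.popitem(last=False)
                finished.append((victim, refs, bursts))
            lru[block] = [1, 1]  # the fill makes the block MRU: first burst
    finished.extend((b, r, n) for b, (r, n) in lru.items())
    return finished


trace = ["A", "A", "A", "B", "A", "A", "C"]
print(count_bursts(trace))
# [('B', 1, 1), ('A', 5, 2), ('C', 1, 1)]
```

In this trace, block A receives five references but only two bursts, which shows how counting bursts can hide variation in the number of back-to-back references to a block.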

This publication has 23 references indexed in Scilit:

IATAC: a smart predictor to turn-off L2 cache lines
ACM Transactions on Architecture and Code Optimization, 2005
Memory coherence activity prediction in commercial workloads
Published by Association for Computing Machinery (ACM), 2004
Dynamic self-invalidation: reducing coherence overhead in shared-memory multiprocessors
Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
Cache decay: exploiting generational behavior to reduce cache leakage power
Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
Basic block distribution analysis to find periodic behavior and simulation points in applications
Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
Timekeeping in the memory system
ACM SIGARCH Computer Architecture News, 2002
Dead-block prediction & dead-block correlating prefetchers
ACM SIGARCH Computer Architecture News, 2001
Selective, accurate, and timely self-invalidation using last-touch prediction
ACM SIGARCH Computer Architecture News, 2000
Run-time cache bypassing
IEEE Transactions on Computers, 1999
Modeling live and dead lines in cache memory systems
IEEE Transactions on Computers, 1993

Cited by 142 articles