Cache bursts: A new approach for eliminating dead blocks and increasing cache efficiency
- 1 November 2008
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Data caches in general-purpose microprocessors often contain mostly dead blocks and are thus used inefficiently. To improve cache efficiency, dead blocks should be identified and evicted early. Prior schemes predict the death of a block immediately after it is accessed; however, these schemes yield lower prediction accuracy and coverage. Instead, we find that predicting the death of a block when it just moves out of the MRU position gives the best tradeoff between timeliness and prediction accuracy/coverage. Furthermore, the individual reference history of a block in the L1 cache can be irregular because of data/control dependence. This paper proposes a new class of dead-block predictors that predict dead blocks based on bursts of accesses to a cache block. A cache burst begins when a block becomes MRU and ends when it becomes non-MRU. Cache bursts are more predictable than individual references because they hide the irregularity of individual references. When used at the L1 cache, the best burst-based predictor can identify 96% of the dead blocks with a 96% accuracy. With the improved dead-block predictors, we evaluate three ways to increase cache efficiency by eliminating dead blocks early: replacement optimization, bypassing, and prefetching. The most effective approach, prefetching into dead blocks, increases the average L1 efficiency from 8% to 17% and the L2 efficiency from 17% to 27%. This increased cache efficiency translates into higher overall performance: prefetching into dead blocks outperforms the same prefetch scheme without dead-block prediction by 12% at the L1 and by 13% at the L2.
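
The burst definition in the abstract (a burst begins when a block becomes MRU and ends when it becomes non-MRU) can be illustrated with a small sketch. This is not the paper's predictor: it only counts references and bursts per block in a single LRU-managed cache set, and the function name `count_bursts`, the `ways` parameter, and the access trace are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


def count_bursts(accesses, ways=2):
    """Count references and cache bursts per block in one LRU-managed set.

    Assumption: a single set with `ways` blocks and true LRU replacement.
    A burst starts whenever a block becomes MRU and continues for as long
    as consecutive accesses keep hitting that same MRU block.
    """
    lru = OrderedDict()  # keys ordered from LRU (first) to MRU (last)
    refs = {}            # individual references per block
    bursts = {}          # bursts per block

    for block in accesses:
        refs[block] = refs.get(block, 0) + 1

        # The current MRU block is the last key, if the set is non-empty.
        mru = next(reversed(lru)) if lru else None
        if block != mru:
            # The accessed block becomes MRU: its new burst begins here,
            # and the burst of the previous MRU block (if any) ends.
            bursts[block] = bursts.get(block, 0) + 1

        if block in lru:
            lru.move_to_end(block)       # hit: promote to MRU
        else:
            if len(lru) >= ways:
                lru.popitem(last=False)  # miss on a full set: evict LRU
            lru[block] = True            # fill as MRU

    return refs, bursts


# Hypothetical trace: A is referenced five times but in only two bursts.
refs, bursts = count_bursts(["A", "A", "A", "B", "A", "A", "C", "C"], ways=2)
print(refs)    # {'A': 5, 'B': 1, 'C': 2}
print(bursts)  # {'A': 2, 'B': 1, 'C': 1}
```

In this trace the five individual references to block A collapse into two bursts, which is the sense in which bursts hide the irregularity of individual references.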