ATCache
- 24 August 2014
- conference paper
- Published by Association for Computing Machinery (ACM)
Abstract
3D-stacking technology has enabled the option of embedding a large DRAM onto the processor, and prior works have proposed to use this as a DRAM cache. Because of its large size (a DRAM cache can be on the order of hundreds of megabytes), the total size of its associated tags can also be quite large (on the order of tens of megabytes). This creates a dilemma: should we maintain the tags in DRAM and pay for a slow tag access on the critical path, or should we maintain them in faster SRAM and pay the area cost of a large SRAM for this purpose? Prior works have primarily chosen the former and proposed a variety of techniques for reducing the cost of a DRAM tag access. In this paper, we first establish, with the help of a study, that maintaining the tags in SRAM, because of its smaller access latency, leads to overall better performance. Motivated by this study, we ask whether it is possible to maintain tags in SRAM without incurring a high area overhead. Our key idea is simple: we propose to cache the tags in a small SRAM tag cache, and we show that there is enough spatial and temporal locality amongst tag accesses to merit this idea. We call this structure the ATCache. Like a conventional cache, the ATCache caches recently accessed tags to exploit temporal locality; it exploits spatial locality by prefetching tags from nearby cache sets. To avoid the high miss latency and cache pollution caused by excessive prefetching, we use a simple technique to throttle the number of sets prefetched. Our proposed ATCache, which consumes only 0.4% of the overall tag size, can satisfy over 60% of DRAM cache tag accesses on average.
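The mechanism the abstract describes can be sketched in a few lines: a small structure holds recently used DRAM-cache tag entries, and on a miss it reads the needed set's tags from DRAM while prefetching a throttled number of neighboring sets. The class, parameter names, and replacement/prefetch policies below are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch of a tag cache in the spirit of the ATCache:
# temporal locality via an LRU-managed store of per-set tag arrays,
# spatial locality via a throttled prefetch of adjacent sets.
from collections import OrderedDict

class TagCache:
    def __init__(self, capacity_sets=64, prefetch_degree=2):
        self.capacity = capacity_sets          # how many sets' tags fit in SRAM
        self.prefetch_degree = prefetch_degree # throttle: neighbor sets fetched per miss
        self.store = OrderedDict()             # set_index -> list of tags (LRU order)
        self.hits = 0
        self.misses = 0

    def _install(self, set_idx, dram_tag_array):
        """Bring one set's tags into the tag cache, evicting the LRU set if full."""
        if set_idx in self.store:
            self.store.move_to_end(set_idx)
            return
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)     # evict least-recently-used set
        self.store[set_idx] = dram_tag_array.get(set_idx, [])

    def lookup(self, set_idx, tag, dram_tag_array):
        """Return True iff `tag` is present in DRAM-cache set `set_idx`."""
        if set_idx in self.store:
            self.hits += 1                     # tag check served from SRAM
            self.store.move_to_end(set_idx)
        else:
            self.misses += 1                   # must read tags from DRAM
            self._install(set_idx, dram_tag_array)
            # Throttled spatial prefetch: also bring in a few adjacent sets.
            for d in range(1, self.prefetch_degree + 1):
                self._install(set_idx + d, dram_tag_array)
        return tag in self.store[set_idx]
```

With `prefetch_degree=2`, a miss on set 0 also installs the tags for sets 1 and 2, so a subsequent streaming access to set 1 is a tag-cache hit; raising the degree trades SRAM pollution for more such hits, which is the trade-off the throttling is meant to balance.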
Funding Information
- UK-India Education and Research Initiative (UKUTP201100256)
- Engineering and Physical Sciences Research Council (EP/G03-6136/1)