Exploring Dynamic Redundancy to Resuscitate Faulty PCM Blocks

Abstract
DRAM technology challenges have increased the need to adapt to emerging memory technologies such as Phase-Change Memory (PCM or PRAM). While such technologies offer benefits such as storage density, nonvolatility, and low energy consumption, they are constrained by limited write endurance, a problem that becomes more pronounced with process variation. In this article, we explore a novel PRAM-based main memory system that resuscitates groups of faulty pages in a cost-effective manner, significantly extending the lifetime of PCM main memory while minimizing the performance impact. In particular, we explore three different dimensions of dynamic redundancy levels and group sizes, and design low-cost hardware and software support for our proposed schemes. We aim for minimal hardware modifications, incurring less than 1% on-chip and off-chip area overhead. Our schemes can improve PRAM lifetime by up to 105× over a chip with no error correction capabilities, and outperform prior schemes such as DRM and ECP at a small fraction of the hardware cost. The performance overhead of our schemes is less than 8% on average across 21 applications from the SPEC2006, Splash-2, and PARSEC benchmark suites.
Funding Information
  • Division of Computing and Communication Foundations
  • Office of Cyberinfrastructure
