PreFAM: Understanding the Impact of Prefetching in Fabric-Attached Memory Architectures

28 September 2020

conference paper
Published by Association for Computing Machinery (ACM) in The International Symposium on Memory Systems

https://doi.org/10.1145/3422575.3422804

Abstract

With many recent advances in interconnect technologies and memory interfaces, disaggregated memory systems are approaching industrial adoption. For instance, the Gen-Z consortium focuses on a new memory-semantic protocol that enables fabric-attached memories (FAMs), where memory and compute units can be attached directly to fabric interconnects. Decoupling memory from compute units becomes feasible as data transfer rates increase with the emergence of novel interconnect technologies, such as silicon photonic interconnects. Disaggregated memories not only enable more efficient use of capacity by minimizing under-utilization, they also allow easy integration of evolving technologies. Additionally, they simplify the programming model while allowing efficient sharing of data. However, the latency of accessing data in FAMs depends on the latency imposed by the fabric interfaces. To reduce memory access latency and improve the performance of FAM systems, in this paper we explore techniques to prefetch data from FAMs to the local memory present in the node (PreFAM). We observe that, because memory access latency is high in FAMs, prefetching a single cache block (64 bytes) from FAM can be inefficient: there is a high chance that a demand request to the same FAM location is issued before the prefetch request completes. Hence, we explore predicting and prefetching FAM blocks at a distance, that is, prefetching blocks that will be accessed in the future but not immediately. We show that, with prefetching, the performance of FAM architectures increases by 38.84% and memory access latency improves by 39.6%, with only a 17.65% increase in the number of accesses to the FAM, on average. Further, by prefetching at a distance, we show a performance improvement of 72.23%.
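
The timeliness argument in the abstract can be illustrated with a toy model. The sketch below is not the paper's mechanism or evaluation setup. It assumes a purely sequential block stream, one demand access per time step, and a fixed FAM latency measured in time steps. The function name `simulate` and its parameters are hypothetical. It only shows why a prefetch issued too close to its use arrives late, while one issued far enough ahead arrives in time.

```python
def simulate(accesses, distance, fam_latency):
    """Classify each demand access in a toy prefetch-at-a-distance model.

    Assumptions (illustrative only, not taken from the paper):
      - one demand access is issued per time step and never stalls the clock;
      - every fetch from FAM takes exactly `fam_latency` time steps;
      - on each access to block b, block b + distance is prefetched.
    """
    arrival = {}  # block -> time step at which it is present in local memory
    stats = {"timely": 0, "late": 0, "miss": 0}
    for now, block in enumerate(accesses):
        if block not in arrival:
            # Never requested before: demand fetch from FAM.
            stats["miss"] += 1
            arrival[block] = now + fam_latency
        elif arrival[block] <= now:
            # Already requested and already in local memory.
            stats["timely"] += 1
        else:
            # Requested earlier (prefetch or demand) but still in flight.
            stats["late"] += 1
        target = block + distance
        if target not in arrival:
            arrival[target] = now + fam_latency  # issue the prefetch
    return stats


if __name__ == "__main__":
    stream = list(range(100))  # sequential block numbers
    for d in (1, 4, 8):
        print(d, simulate(stream, distance=d, fam_latency=8))
```

In this model, a distance shorter than the FAM latency leaves nearly every prefetch still in flight when the demand request arrives, which is the inefficiency the abstract describes for next-block prefetching. A distance equal to the latency makes the prefetches timely after a short warm-up of misses.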
