Gretch
- 9 February 2021
- journal article
- research article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Architecture and Code Optimization
- Vol. 18 (2), 1-25
- https://doi.org/10.1145/3439803
Abstract
Data-dependent memory accesses (DDAs) pose an important challenge for high-performance graph analytics (GA) because such accesses exhibit insufficient temporal and spatial locality, which results in poor cache performance. Prior efforts to improve the performance of DDAs for GA are not applicable across GA frameworks because (1) they focus on only one particular graph representation, and (2) they require workload changes to communicate specific information to the hardware. In this work, we propose a hardware-only solution that improves the performance of DDAs for GA across multiple GA frameworks. We present Gretch, a hardware prefetcher for GA that addresses the above limitations. An important observation we make is that identifying certain DDAs without hardware-software communication is sensitive to instruction scheduling. A key contribution of this work is a hardware mechanism that activates Gretch to identify DDAs under either in-order or out-of-order instruction scheduling. Our evaluation shows that, across different GA workloads and frameworks, Gretch provides an average speedup of 38% over no prefetching and 25% over a conventional stride prefetcher, and outperforms prior DDA prefetchers by 22%, with only a 1% increase in power consumption.
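To make the notion of a data-dependent access concrete, the following is a minimal sketch and not Gretch's mechanism. It assumes a compressed sparse row (CSR) graph layout, which the abstract does not specify, and the names `offsets`, `neighbors`, `prop`, and `neighbor_sums` are hypothetical. The inner loop reads `neighbors` sequentially, a pattern a stride prefetcher can follow, but the index into `prop` is the value just loaded from `neighbors`, so the resulting address stream has no fixed stride.

```python
def neighbor_sums(offsets, neighbors, prop):
    """Sum the property values of each vertex's neighbors (assumed CSR layout).

    Returns the per-vertex sums and the sequence of indices touched in `prop`.
    """
    sums = []
    trace = []  # indices into `prop`, in access order
    for v in range(len(offsets) - 1):
        total = 0.0
        for e in range(offsets[v], offsets[v + 1]):
            u = neighbors[e]   # streaming access: e advances by 1 each step
            total += prop[u]   # data-dependent access: index is the loaded value u
            trace.append(u)
        sums.append(total)
    return sums, trace


# Hypothetical 4-vertex graph, chosen only for illustration.
offsets = [0, 2, 4, 5, 6]
neighbors = [3, 1, 0, 2, 3, 0]
prop = [1.0, 2.0, 3.0, 4.0]

sums, trace = neighbor_sums(offsets, neighbors, prop)
print(sums)   # per-vertex neighbor sums
print(trace)  # irregular index stream into `prop`
```

On a real graph, the indices recorded in `trace` are spread across a property array far larger than the cache, which is the locality problem the abstract describes.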