Access order and effective bandwidth for streams on a Direct Rambus memory

1 January 1999

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 80-89
https://doi.org/10.1109/hpca.1999.744337

Abstract

Processor speeds are increasing rapidly, and memory speeds are not keeping up. Streaming computations (such as multimedia or scientific applications) are among those whose performance is most limited by the memory bottleneck. Rambus hopes to bridge the processor/memory performance gap with a recently introduced DRAM that can deliver up to 1.6 Gbytes/sec. We analyze the performance of these interesting new memory devices on the inner loops of streaming computations, both for traditional memory controllers that treat all DRAM transactions as random cacheline accesses, and for controllers augmented with streaming hardware. For our benchmarks, we find that accessing unit-stride streams in cacheline bursts in the natural order of the computation exploits from 44% to 76% of the peak bandwidth of a memory system composed of a single Direct RDRAM device, and that accessing streams via a streaming mechanism with a simple access ordering scheme can improve performance by factors of 1.18 to 2.25.
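
To put the percentages in the abstract into absolute terms, the following is a minimal sketch of the arithmetic, not anything taken from the paper itself. The function names are hypothetical. The headroom bound assumes that performance is limited purely by memory bandwidth, which the abstract does not state for every benchmark, and the sketch makes no attempt to match a particular benchmark to a particular improvement factor.

```python
# Peak transfer rate quoted in the abstract for Direct RDRAM.
PEAK_GBYTES_PER_SEC = 1.6


def effective_bandwidth(fraction_of_peak, peak=PEAK_GBYTES_PER_SEC):
    """Bandwidth delivered when only a fraction of the peak is exploited."""
    if not 0.0 < fraction_of_peak <= 1.0:
        raise ValueError("fraction_of_peak must be in (0, 1]")
    return fraction_of_peak * peak


def max_speedup(fraction_of_peak):
    """Upper bound on bandwidth improvement: going from this fraction to 100% of peak.

    Assumption (not from the paper): runtime is purely bandwidth-bound.
    """
    if not 0.0 < fraction_of_peak <= 1.0:
        raise ValueError("fraction_of_peak must be in (0, 1]")
    return 1.0 / fraction_of_peak


if __name__ == "__main__":
    # The two ends of the natural-order range reported in the abstract.
    for frac in (0.44, 0.76):
        print(
            f"{frac:.0%} of peak = {effective_bandwidth(frac):.3f} Gbytes/sec, "
            f"headroom at most {max_speedup(frac):.2f}x"
        )
```

Under these assumptions, the low end of the reported range corresponds to roughly 0.7 Gbytes/sec of delivered bandwidth and the high end to roughly 1.2 Gbytes/sec.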

Cited by 38 articles