Characterizing the latency hiding ability of GPUs

Abstract
This paper demonstrates a latency profiling approach to characterize and evaluate the latency-hiding capability of modern GPU architectures. We find that the fast context-switching, massively multithreaded architecture can effectively hide much of the latency by swapping warps in and out. However, for certain GPGPU applications, such as bfs, performance is limited by other factors. In future work, we plan to use the latency profiling approach to further investigate the limits of GPUs and to look for opportunities to improve performance.
