LogCA
- 24 June 2017
- journal article
- conference paper
- Published by Association for Computing Machinery (ACM) in ACM SIGARCH Computer Architecture News
- Vol. 45 (2), 375-388
- https://doi.org/10.1145/3140659.3080216
Abstract
With the end of Dennard scaling, architects have increasingly turned to special-purpose hardware accelerators to improve the performance and energy efficiency of some applications. Unfortunately, accelerators don't always live up to expectations and may under-perform in some situations. Understanding the factors that affect the performance of an accelerator is crucial for both architects and programmers early in the design stage. Detailed models can be highly accurate, but often require low-level details that are not available until late in the design cycle. In contrast, simple analytical models can provide useful insights by abstracting away low-level system details. In this paper, we propose LogCA, a high-level performance model for hardware accelerators. LogCA helps both programmers and architects identify performance bounds and design bottlenecks early in the design cycle, and provides insight into which optimizations may alleviate these bottlenecks. We validate our model across a variety of kernels, ranging from sub-linear to super-linear complexities, on both on-chip and off-chip accelerators. We also describe the utility of LogCA using two retrospective case studies. First, we discuss the evolution of interface design in SUN/Oracle's encryption accelerators. Second, we discuss the evolution of memory interface design in three different GPU architectures. In both cases, we show that the adopted design optimizations for these machines are similar to LogCA's suggested optimizations. We argue that architects and programmers can use insights from these retrospective studies for improving future designs.
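The abstract gives no equations, so the following is only a minimal sketch of the kind of simple analytical model it describes, not the paper's own formulation. The functional form (a fixed offload overhead, a transfer cost linear in the offloaded data size, and a power-law compute cost) and every name and number below are assumptions chosen for illustration.

```python
# Illustrative accelerator speedup model. All names and the functional
# form are assumptions for this sketch, not definitions taken from the paper.

def host_time(g, c, beta):
    """Time to process g bytes on the host: c * g**beta (assumed power law)."""
    return c * g ** beta


def accel_time(g, c, beta, overhead, latency_per_byte, accel):
    """Time when offloading: fixed setup overhead, plus a transfer cost
    linear in g, plus the host compute time divided by the acceleration."""
    return overhead + latency_per_byte * g + host_time(g, c, beta) / accel


def speedup(g, c, beta, overhead, latency_per_byte, accel):
    """Ratio of unaccelerated to accelerated time for an offload of g bytes."""
    return host_time(g, c, beta) / accel_time(
        g, c, beta, overhead, latency_per_byte, accel
    )


def break_even_granularity(c, beta, overhead, latency_per_byte, accel,
                           max_g=2 ** 40):
    """Smallest power-of-two g at which offloading beats the host,
    or None if no such g exists up to max_g."""
    g = 1
    while g <= max_g:
        if speedup(g, c, beta, overhead, latency_per_byte, accel) > 1.0:
            return g
        g *= 2
    return None


if __name__ == "__main__":
    # Hypothetical parameters for a linear-complexity kernel (beta = 1).
    params = dict(c=10.0, beta=1.0, overhead=1000.0,
                  latency_per_byte=1.0, accel=10.0)
    for g in (64, 128, 1024, 2 ** 20):
        print(g, round(speedup(g, **params), 3))
    print("break-even:", break_even_granularity(**params))
```

With these made-up parameters, small offloads lose to the host because the fixed overhead dominates, and for a linear kernel the speedup stays below c / (latency_per_byte + c / accel) = 5 however large the offload, even though the accelerator itself is 10x faster. This is the sort of bound and bottleneck that a high-level model can expose before low-level design details are known.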
Funding Information
- National Science Foundation (CNS-1302260, CCF-1438992, CCF-1533885, CCF-1617824)