An Application-oblivious Memory Scheduling System for DNN Accelerators
Open Access
- 16 September 2022
- Research article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Architecture and Code Optimization
- Vol. 19 (4), 1-26
- https://doi.org/10.1145/3535355
Abstract
Deep Neural Networks (DNNs) tend to grow deeper and wider, which poses a significant challenge for DNN training due to the limited memory capacity of DNN accelerators. Existing solutions for memory-efficient DNN training are tightly coupled with application-level features of DNN workloads; e.g., they require the layer structures or computational graphs of the DNNs. This weakens their versatility for DNNs with sophisticated layer structures or complicated computational graphs: such schemes usually need to be re-implemented or re-adapted to handle new layer structures or unusual operators in the computational graph. In this paper, we revisit the memory pressure of DNN training from the perspective of runtime systems and model the memory access behaviors of DNN workloads. We identify the iterativeness, regularity, and extremalization of memory access patterns in DNN workloads. Based on these observations, we propose AppObMem, an application-oblivious memory scheduling system. AppObMem automatically traces the memory behaviors of DNN workloads and schedules memory swapping to relieve the memory pressure of device accelerators, without any knowledge of high-level information such as layer structures or computational graphs. Evaluations on a variety of DNN models show that AppObMem obtains 40%-60% memory savings with acceptable performance loss. AppObMem is also competitive with other open-sourced SOTA schemes.
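The abstract's key observation is that DNN training is iterative and its memory access pattern is regular, so a trace recorded in one iteration predicts accesses in later iterations and can drive swap decisions without any layer or graph information. The paper's actual scheduling algorithm is not reproduced here; the following toy sketch (all names hypothetical, not AppObMem's API) only illustrates the idea with a Belady-style eviction over a recorded access trace:

```python
class SwapScheduler:
    """Toy sketch: record the tensor-access order of one profiling
    iteration, then reuse it to pick swap-out victims when the device
    is over capacity. Hypothetical illustration, not AppObMem itself."""

    def __init__(self, capacity):
        self.capacity = capacity   # max tensors resident on the device
        self.trace = []            # access order from the profiling iteration
        self.resident = set()      # tensors currently on the device

    def profile(self, access_stream):
        # Iterativeness/regularity assumption: this order repeats each iteration.
        self.trace = list(access_stream)

    def next_use(self, tensor, pos):
        # Position of the tensor's next access in the recorded trace.
        for i in range(pos + 1, len(self.trace)):
            if self.trace[i] == tensor:
                return i
        return float("inf")        # never reused: ideal swap-out candidate

    def run_iteration(self):
        # Replay the trace; when capacity is exceeded, swap out the
        # resident tensor whose next use lies furthest in the future.
        swaps = []
        for pos, t in enumerate(self.trace):
            if t not in self.resident:
                if len(self.resident) >= self.capacity:
                    victim = max(self.resident,
                                 key=lambda r: self.next_use(r, pos))
                    self.resident.discard(victim)
                    swaps.append(victim)
                self.resident.add(t)
        return swaps


sched = SwapScheduler(capacity=2)
# Forward pass touches weights/activations; backward pass revisits them.
sched.profile(["w1", "a1", "w2", "a2", "w1", "a1"])
swapped = sched.run_iteration()
print(swapped)  # first two victims: "a1" then "w2"
```

A real system would additionally overlap host-device transfers with computation (prefetching a swapped-out tensor before its recorded next use), which is what makes the performance loss acceptable.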