MCDNN
- 20 June 2016
- conference paper
- Published by Association for Computing Machinery (ACM)
- p. 123-136
- https://doi.org/10.1145/2906388.2906396
Abstract
We consider applying computer vision to video on cloud-backed mobile devices using Deep Neural Networks (DNNs). The computational demands of DNNs are high enough that, without careful resource management, such applications strain device battery, wireless data, and cloud cost budgets. We pose the corresponding resource management problem, which we call Approximate Model Scheduling, as one of serving a stream of heterogeneous (i.e., solving multiple classification problems) requests under resource constraints. We present the design and implementation of an optimizing compiler and runtime scheduler to address this problem. Going beyond traditional resource allocators, we allow each request to be served approximately, by systematically trading off DNN classification accuracy for resource use, and remotely, by reasoning about on-device/cloud execution trade-offs. To inform the resource allocator, we characterize how several common DNNs, when subjected to state-of-the-art optimizations, trade off accuracy for resource use such as memory, computation, and energy. The heterogeneous streaming setting is a novel one for DNN execution, and we introduce two new and powerful DNN optimizations that exploit it. Using the challenging continuous mobile vision domain as a case study, we show that our techniques yield significant reductions in resource usage and perform effectively over a broad range of operating conditions.
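The trade-off described in the abstract, serving each request with a cheaper and less accurate model when budgets are tight, and choosing between on-device and cloud execution, can be illustrated with a minimal sketch. This is not the paper's scheduler: the `Variant` type, the `pick_variant` helper, and every number below are hypothetical, and the selection rule (most accurate variant that fits the remaining budgets) is a simple greedy stand-in chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Variant:
    """One hypothetical model variant for a classification task."""
    name: str
    accuracy: float    # assumed accuracy in [0, 1]
    energy_j: float    # assumed device energy per request (joules)
    cloud_cost: float  # assumed cloud cost per request (dollars); 0 if on-device


def pick_variant(
    variants: Sequence[Variant],
    energy_left_j: float,
    cloud_budget_left: float,
) -> Optional[Variant]:
    """Return the most accurate variant that fits both remaining budgets.

    Returns None when no variant is affordable.
    """
    feasible = [
        v for v in variants
        if v.energy_j <= energy_left_j and v.cloud_cost <= cloud_budget_left
    ]
    return max(feasible, key=lambda v: v.accuracy, default=None)


# Made-up catalog: two on-device variants and one offloaded variant.
VARIANTS = [
    Variant("small-local", accuracy=0.70, energy_j=0.5, cloud_cost=0.0),
    Variant("large-local", accuracy=0.85, energy_j=2.0, cloud_cost=0.0),
    Variant("large-cloud", accuracy=0.90, energy_j=0.3, cloud_cost=0.001),
]

choice = pick_variant(VARIANTS, energy_left_j=1.0, cloud_budget_left=0.01)
print(choice.name if choice else "no affordable variant")
```

With the cloud budget exhausted, the same call falls back to the smaller on-device variant, which is the kind of accuracy-for-resources degradation the abstract refers to.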
Funding Information
- National Science Foundation (CNS-1318396, CNS-1420703)
This publication has 33 references indexed in Scilit:
- DeepEar. Published by Association for Computing Machinery (ACM), 2015
- Starfish. Published by Association for Computing Machinery (ACM), 2015
- Rio. Published by Association for Computing Machinery (ACM), 2014
- Towards wearable cognitive assistance. Published by Association for Computing Machinery (ACM), 2014
- DianNao. ACM SIGPLAN Notices, 2014
- Speeding up Convolutional Neural Networks with Low Rank Expansions. Published by British Machine Vision Association and Society for Pattern Recognition, 2014
- A Primal-Dual Randomized Algorithm for Weighted Paging. Journal of the ACM, 2012
- A close examination of performance and power characteristics of 4G LTE networks. Published by Association for Computing Machinery (ACM), 2012
- Online Primal-Dual Algorithms for Covering and Packing. Mathematics of Operations Research, 2009
- A Polynomial Time Approximation Scheme for the Multiple Knapsack Problem. SIAM Journal on Computing, 2005