Deep Reinforcement Learning Based VNF Management in Geo-distributed Edge Computing

Abstract
Edge computing is an effective approach for provisioning resources at the network edge to host virtualized network functions (VNFs). Given the cost diversity in edge computing, it is important from a service provider's perspective to orchestrate VNFs and schedule traffic flows for network utility maximization (NUM), since maximal utility implies maximal revenue. However, traditional model-based optimization methods typically rest on restrictive assumptions. In this paper, inspired by the success of deep reinforcement learning in solving complicated control problems, we propose an algorithm based on deep deterministic policy gradients (DDPG). We first formulate the NUM problem, accounting for end-to-end delays and various operation costs, as a non-convex optimization problem and prove it to be NP-hard. We then customize DDPG by redesigning its exploration method and devising a dual replay buffer structure, and we use our formulation to guide the replay buffer updates. Through extensive trace-driven experiments, we show that our customized DDPG-based algorithm is highly efficient, significantly outperforming both model-based methods and a traditional non-customized DDPG-based algorithm.
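The abstract does not specify how the dual replay buffer is organized. Below is a minimal, hypothetical sketch of one plausible reading: transitions are routed into one of two buffers based on their observed reward (the network utility signal from the NUM formulation), and each mini-batch mixes samples from both so that high-utility experience is replayed more often. The routing threshold, mixing ratio, and class name are illustrative assumptions, not the paper's exact design.

```python
import random
from collections import deque

class DualReplayBuffer:
    """Hypothetical dual replay buffer sketch: high-reward transitions go
    into a 'good' buffer, the rest into a 'regular' buffer; sampling mixes
    both. The threshold and mixing ratio are assumptions for illustration."""

    def __init__(self, capacity=10000, utility_threshold=0.0, good_ratio=0.5):
        self.good = deque(maxlen=capacity)      # high-utility transitions
        self.regular = deque(maxlen=capacity)   # all other transitions
        self.utility_threshold = utility_threshold
        self.good_ratio = good_ratio            # fraction of batch from 'good'

    def add(self, state, action, reward, next_state, done):
        # Route by reward, standing in for the NUM-guided update rule.
        buf = self.good if reward > self.utility_threshold else self.regular
        buf.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Draw a mixed mini-batch, falling back gracefully if a buffer is small.
        n_good = min(int(batch_size * self.good_ratio), len(self.good))
        n_reg = min(batch_size - n_good, len(self.regular))
        batch = random.sample(list(self.good), n_good) + \
                random.sample(list(self.regular), n_reg)
        random.shuffle(batch)
        return batch
```

In a DDPG training loop, `sample()` would feed critic and actor updates in place of a single uniform buffer; the bias toward high-utility transitions is one common motivation for splitting the replay memory.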
