DRONE: An Efficient Distributed Subgraph-Centric Framework for Processing Large-Scale Power-law Graphs

Abstract
Nowadays, the ever-increasing volume of graph-structured data such as social networks, graph databases and knowledge graphs requires to be processed efficiently and scalably. These natural graphs commonly found in the real world have highly skewed power-law degree distribution and are called power-law graphs. The subgraph-centric programming model is a promising approach applied in many state-of-the-art distributed graph computing frameworks. However, the performance of subgraph-centric frameworks is limited when processing large-scale power-law graphs. When deployed to the subgraph-centric framework, existing graph partitioning algorithms are not suitable for power-law graphs. In this paper, we present a novel distributed graph computing framework, DRONE (Distributed gRaph cOmputiNg Engine), which leverages the subgraph-centric model and the vertex-cut graph partitioning strategy. DRONE also supports the fault tolerance mechanism to accommodate the increasing scale of machines with negligible overhead (6.48% on average). We further study the execution workflow of DRONE and propose an efficient and balanced graph partition algorithm (EBV) for DRONE. Experiments show that DRONE reduces the running time on real-world graphs by 25.6%, on average, compared to the state-of-the-art distributed graph computing frameworks. In addition, the EBV graph partition algorithm reduces the replication factor by at least 21.8% than other self-based partition algorithms. Our results indicate that DRONE has excellent potential in processing large-scale power-law graphs.
Funding Information
  • National Natural Science Foundation of China (41930110, 61872272)

This publication has 37 references indexed in Scilit: