DRONE: An Efficient Distributed Subgraph-Centric Framework for Processing Large-Scale Power-law Graphs
- 18 November 2022
- journal article
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Parallel and Distributed Systems
- Vol. 34 (2), 463-474
- https://doi.org/10.1109/tpds.2022.3223068
Abstract
Nowadays, the ever-increasing volume of graph-structured data such as social networks, graph databases and knowledge graphs requires to be processed efficiently and scalably. These natural graphs commonly found in the real world have highly skewed power-law degree distribution and are called power-law graphs. The subgraph-centric programming model is a promising approach applied in many state-of-the-art distributed graph computing frameworks. However, the performance of subgraph-centric frameworks is limited when processing large-scale power-law graphs. When deployed to the subgraph-centric framework, existing graph partitioning algorithms are not suitable for power-law graphs. In this paper, we present a novel distributed graph computing framework, DRONE (Distributed gRaph cOmputiNg Engine), which leverages the subgraph-centric model and the vertex-cut graph partitioning strategy. DRONE also supports the fault tolerance mechanism to accommodate the increasing scale of machines with negligible overhead (6.48% on average). We further study the execution workflow of DRONE and propose an efficient and balanced graph partition algorithm (EBV) for DRONE. Experiments show that DRONE reduces the running time on real-world graphs by 25.6%, on average, compared to the state-of-the-art distributed graph computing frameworks. In addition, the EBV graph partition algorithm reduces the replication factor by at least 21.8% than other self-based partition algorithms. Our results indicate that DRONE has excellent potential in processing large-scale power-law graphs.Keywords
Funding Information
- National Natural Science Foundation of China (41930110, 61872272)
This publication has 37 references indexed in Scilit:
- Distributed GraphLabProceedings of the VLDB Endowment, 2012
- 9. Kronecker GraphsPublished by Society for Industrial & Applied Mathematics (SIAM) ,2011
- Statistical mechanics of complex networksReviews of Modern Physics, 2002
- The anatomy of a large-scale hypertextual Web search engineComputer Networks and ISDN Systems, 1998
- A Fast and High Quality Multilevel Scheme for Partitioning Irregular GraphsSIAM Journal on Scientific Computing, 1998
- Finding good approximate vertex and edge partitions is NP-hardInformation Processing Letters, 1992
- A bridging model for parallel computationCommunications of the ACM, 1990
- Fibonacci heaps and their uses in improved network optimization algorithmsJournal of the ACM, 1987
- Distributed snapshotsACM Transactions on Computer Systems, 1985
- Connected Component Labeling Using QuadtreesJournal of the ACM, 1981