Falcon
Open Access
- 22 December 2015
- journal article
- research article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Architecture and Code Optimization
- Vol. 12 (4), 1-27
- https://doi.org/10.1145/2842618
Abstract
Graph algorithms have been shown to possess enough parallelism to keep several computing resources busy—even hundreds of cores on a GPU. Unfortunately, tuning their implementation for efficient execution on a particular hardware configuration of heterogeneous systems consisting of multicore CPUs and GPUs is challenging, time consuming, and error prone. To address these issues, we propose a domain-specific language (DSL), Falcon, for implementing graph algorithms that (i) abstracts the hardware, (ii) provides constructs to write explicitly parallel programs at a higher level, and (iii) can work with general algorithms that may change the graph structure (morph algorithms). We illustrate the usage of our DSL to implement local computation algorithms (that do not change the graph structure) and morph algorithms such as Delaunay mesh refinement, survey propagation, and dynamic SSSP on GPU and multicore CPUs. Using a set of benchmark graphs, we illustrate that the generated code performs close to the state-of-the-art hand-tuned implementations.Keywords
Funding Information
- IMPECS project of DST
- MPI-SWS
This publication has 41 references indexed in Scilit:
- LigraACM SIGPLAN Notices, 2013
- iGPUACM SIGARCH Computer Architecture News, 2012
- Accelerating CUDA graph algorithms at maximum warpACM SIGPLAN Notices, 2011
- EigenCFAACM SIGPLAN Notices, 2011
- OpenMP to GPGPUACM SIGPLAN Notices, 2009
- Scalable Parallel Programming with CUDAQueue, 2008
- Survey propagation: An algorithm for satisfiabilityRandom Structures & Algorithms, 2005
- Experimental analysis of dynamic algorithms for the single source shortest paths problemACM Journal of Experimental Algorithmics, 1998
- On the computational complexity of dynamic graph problemsTheoretical Computer Science, 1996
- A bridging model for parallel computationCommunications of the ACM, 1990