Co-processing SPMD computation on CPUs and GPUs cluster

1 September 2013

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE) in 2013 IEEE International Conference on Cluster Computing (CLUSTER)

No. 15525244, p. 1-10
https://doi.org/10.1109/cluster.2013.6702632

Abstract

Heterogeneous parallel systems with multiple processors and accelerators are becoming ubiquitous due to better cost-performance and energy efficiency. These heterogeneous processor architectures have different instruction sets and are optimized for either task latency or throughput. Running SPMD tasks on heterogeneous devices raises challenges in both programmability and performance. To meet these challenges, we implemented a parallel runtime system that co-processes SPMD computation on clusters of CPUs and GPUs. Furthermore, we propose an analytic model to automatically schedule SPMD tasks on heterogeneous clusters. Because our analytic model is derived from the roofline model, it can be applied to a wide range of SPMD applications and hardware devices. Experimental results for C-means, GMM, and GEMV show good speedup in practical heterogeneous cluster environments.
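The abstract does not state the analytic model itself, so the following is only a minimal sketch of how a roofline-style bound could drive a CPU/GPU workload split. It assumes the standard roofline formula (attainable throughput is the smaller of peak compute and arithmetic intensity times memory bandwidth) and a simple proportional split. The function names and all hardware figures are hypothetical and are not taken from the paper.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Roofline bound: min(peak compute, arithmetic intensity * memory bandwidth).

    intensity is in FLOPs per byte, bandwidth in GB/s, peak in GFLOP/s.
    """
    return min(peak_gflops, intensity * bandwidth_gbs)


def cpu_fraction(cpu, gpu, intensity):
    """Share of the SPMD workload to assign to the CPU (assumed proportional split).

    Each device gets work in proportion to its roofline-bounded throughput,
    so that both would finish at roughly the same time under this simple model.
    """
    cpu_rate = attainable_gflops(cpu["peak_gflops"], cpu["bandwidth_gbs"], intensity)
    gpu_rate = attainable_gflops(gpu["peak_gflops"], gpu["bandwidth_gbs"], intensity)
    return cpu_rate / (cpu_rate + gpu_rate)


# Hypothetical device parameters, chosen only for illustration.
CPU = {"peak_gflops": 100.0, "bandwidth_gbs": 50.0}
GPU = {"peak_gflops": 1000.0, "bandwidth_gbs": 150.0}

# Low arithmetic intensity (memory-bound, GEMV-like): bandwidth decides the split.
low = cpu_fraction(CPU, GPU, intensity=0.25)

# High arithmetic intensity (compute-bound): peak FLOP rates decide the split.
high = cpu_fraction(CPU, GPU, intensity=100.0)

print(f"CPU share at low intensity:  {low:.3f}")
print(f"CPU share at high intensity: {high:.3f}")
```

Under these assumed figures the CPU receives a larger share of memory-bound work than of compute-bound work, because the bandwidth gap between the two devices is smaller than the gap in peak compute. The sketch ignores host-to-device transfer costs, which a practical scheduler would likely need to account for.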

Cited by 1 article