Implementation and optimization of a thermal Lattice Boltzmann algorithm on a multi-GPU cluster

1 May 2012

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE) in 2012 Innovative Parallel Computing (InPar)

Abstract

Lattice Boltzmann (LB) methods are widely used today to describe the dynamics of fluids. Key advantages of this approach are the relative ease with which complex physical behavior, e.g. associated with multi-phase flows or irregular boundary conditions, can be modeled and, from a computational perspective, the large degree of available parallelism, which can be easily exploited on massively parallel systems. The advent of multi-core and many-core processors, including General Purpose Graphics Processing Units (GP-GPUs), has pushed the quest for parallelization to the intra-processor level as well. From this point of view, LB methods may strongly benefit from these new architectures. In this paper we describe the implementation and optimization of a recently proposed thermal LB model, the so-called D2Q37 model, on multi-GPU systems. We describe in detail the optimization techniques that we have used at both the intra-processor and inter-processor levels, present performance and scaling figures, and analyze bottlenecks associated with this implementation.

This publication has 3 references indexed in Scilit:

Optimization of Multi-Phase Compressible Lattice Boltzmann Codes on Massively Parallel Multi-Core Systems. Procedia Computer Science, 2011.
Second-order closure in stratified turbulence: Simulations and modeling of bulk and entrainment regions. Physical Review E, 2011.
Reactive Rayleigh-Taylor systems: Front propagation and non-stationarity. Europhysics Letters, 2011.

Cited by 4 articles