An optimized D2Q37 Lattice Boltzmann code on GP-GPUs

Abstract

We describe the implementation of a thermal compressible Lattice Boltzmann algorithm on an NVIDIA Tesla C2050 system based on the Fermi GP-GPU. We consider two versions of the algorithm, with and without reactive effects. We describe the overall organization of the algorithm and give details of its implementation. Efficiency ranges from 25% to 31% of the double-precision peak performance of the GP-GPU. We compare our results with a different implementation of the same algorithm, developed and optimized for many-core Intel Westmere CPUs. (C) 2012 Elsevier Ltd. All rights reserved.

Year

2013

IAC Authors

FEDERICO TOSCHI

ANDREA SCAGLIARINI

Publication type

Journal article

Other Authors

Biferale, Luca and Mantovani, Filippo and Pivanti, Marcello and Pozzati, Fabio and Sbragaglia, Mauro and Scagliarini, Andrea and Schifano, Sebastiano Fabio and Toschi, Federico and Tripiccione, Raffaele

DOI

https://dx.doi.org/10.1016/j.compfluid.2012.06.003

Publisher

Pergamon Press.

Journal

Computers & Fluids