Parallel Branch-and-Bound in multi-core multi-CPU multi-GPU heterogeneous environments


Bibliographic details
Published in: Future Generation Computer Systems, Vol. 56, pp. 95-109
Main authors: Vu, Trong-Tuan; Derbel, Bilel
Format: Journal Article
Language: English
Publication details: Elsevier B.V., 01.03.2016
Elsevier
Subjects:
ISSN: 0167-739X, 1872-7115
Description
Summary: We investigate the design of parallel B&B in large-scale heterogeneous compute environments where processing units can be composed of a mixture of multiple shared-memory cores, multiple distributed CPUs, and multiple GPU devices. We describe two approaches addressing the critical issue of how to map the B&B workload onto the different levels of parallelism exposed by the target compute platform. We also contribute a thorough large-scale experimental study which allows us to derive a comprehensive and fair analysis of the proposed approaches under different system configurations using up to 16 GPUs and up to 512 distributed cores. Our results shed more light on the main challenges one has to face when tackling B&B algorithms, while describing efficient techniques to address them. In particular, we are able to obtain linear speed-ups at moderate scales, where adaptive load balancing among the heterogeneous compute resources is shown to have a significant impact on performance. At the largest scales, intra-node parallelism and hybrid decentralized load balancing are shown to be crucial in order to alleviate locking issues among shared-memory threads and to scale the distributed resources while optimizing communication costs and minimizing idle times.
• Key challenges in parallelizing Branch-and-Bound for large-scale systems.
• Heterogeneous load balancing for parallel Branch-and-Bound.
• Shared and distributed memory hybrid work stealing.
• CPU–GPU distributed B&B protocol with near-optimal speedup.
DOI: 10.1016/j.future.2015.10.009