Tuning and hybrid parallelization of a genetic-based multi-point statistics simulation code
| Published in: | Parallel Computing, Vol. 40, No. 5-6, pp. 144-158 |
|---|---|
| Main authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Elsevier B.V., 01.05.2014 |
| Subjects: | |
| ISSN: | 0167-8191, 1872-7336 |
| Summary: | • This work is part of an effort to accelerate geostatistical simulation codes. • We apply acceleration techniques to a genetic-based MPS simulation code. • The acceleration techniques are code optimization and parallelization. • The performance improvements accelerate execution of the code by up to 100×. • Samples of simulated results are shown. One of the main difficulties in using multi-point statistical (MPS) simulation based on annealing techniques or genetic algorithms is the excessive amount of time and memory required to achieve convergence. In this work we propose code optimizations and parallelization schemes for a genetic-based MPS code with the aim of reducing execution time. The code optimizations involve reducing cache misses in array accesses, avoiding branch instructions, and increasing the locality of accessed data. The hybrid parallelization scheme combines fine-grain parallelization of loops under a shared-memory programming model (OpenMP) with coarse-grain distribution of load among several computational nodes under a distributed-memory programming model (MPI). Convergence, execution time, and speed-up results are presented using 2D training images of sizes 100×100×1 and 1000×1000×1 on a distributed-shared-memory supercomputing facility. (Illustrative sketches of the tuning and hybrid parallelization techniques follow this record.) |
|---|---|
| DOI: | 10.1016/j.parco.2014.04.005 |
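
The abstract names three serial tunings: fewer cache misses, no branches in hot loops, and better data locality. The paper's actual code is not reproduced in this record, so the following is only a minimal C sketch of those generic techniques; the names (`mismatch_baseline`, `mismatch_tuned`, `NX`, `NY`) and the mismatch-count kernel are hypothetical stand-ins, not the authors' routines.

```c
/* Hypothetical sketch of the loop tunings named in the abstract:
 * row-major traversal (locality, fewer cache misses) and branchless
 * accumulation (no data-dependent branch in the hot loop). */
#include <stdio.h>
#include <stdlib.h>

#define NX 1000
#define NY 1000

/* Baseline: column-major walk over a row-major array (inner index
 * strides by NX bytes) plus an unpredictable branch per cell. */
static long mismatch_baseline(const unsigned char *a, const unsigned char *b) {
    long count = 0;
    for (int x = 0; x < NX; ++x)
        for (int y = 0; y < NY; ++y)
            if (a[y * NX + x] != b[y * NX + x])  /* branchy */
                ++count;
    return count;
}

/* Tuned: row-major walk matches the memory layout, so accesses are
 * sequential; the comparison is accumulated arithmetically, branch-free. */
static long mismatch_tuned(const unsigned char *a, const unsigned char *b) {
    long count = 0;
    for (int y = 0; y < NY; ++y)
        for (int x = 0; x < NX; ++x)
            count += (a[y * NX + x] != b[y * NX + x]);
    return count;
}

int main(void) {
    unsigned char *a = malloc((size_t)NX * NY);
    unsigned char *b = malloc((size_t)NX * NY);
    if (!a || !b) return 1;
    for (long i = 0; i < (long)NX * NY; ++i) {
        a[i] = (unsigned char)(rand() & 1);
        b[i] = (unsigned char)(rand() & 1);
    }
    printf("baseline: %ld  tuned: %ld\n",
           mismatch_baseline(a, b), mismatch_tuned(a, b));
    free(a);
    free(b);
    return 0;
}
```

Both functions return the same count; only the access pattern and control flow differ, which is what the tuning targets.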
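The hybrid scheme pairs coarse-grain MPI distribution across nodes with fine-grain OpenMP loops inside each node. The sketch below assumes, purely for illustration, that the coarse grain splits a population of candidate images across ranks and the fine grain shares one fitness loop among threads; the paper may partition the work differently, and `POP`, `CELLS`, and `fitness` are hypothetical.

```c
/* Hypothetical hybrid MPI+OpenMP sketch; not the authors' code.
 * Build with e.g.: mpicc -fopenmp hybrid.c -o hybrid */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define POP   64            /* candidate images (hypothetical) */
#define CELLS (1000L*1000L) /* grid cells per candidate         */

/* Fine grain: OpenMP threads share the cell loop of one evaluation. */
static double fitness(const unsigned char *cand, const unsigned char *ti) {
    long mism = 0;
    #pragma omp parallel for reduction(+:mism)
    for (long c = 0; c < CELLS; ++c)
        mism += (cand[c] != ti[c]);   /* branchless mismatch count */
    return (double)mism;
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    unsigned char *ti = malloc(CELLS);   /* training image (random stub) */
    unsigned char *cand = malloc(CELLS);
    if (!ti || !cand) { MPI_Abort(MPI_COMM_WORLD, 1); }
    srand(1234u + (unsigned)rank);
    for (long c = 0; c < CELLS; ++c) ti[c] = (unsigned char)(rand() & 1);

    /* Coarse grain: each MPI rank evaluates its slice of the population. */
    double best = 1e300;
    for (int i = rank; i < POP; i += size) {
        for (long c = 0; c < CELLS; ++c)  /* stub candidate image */
            cand[c] = (unsigned char)(rand() & 1);
        double f = fitness(cand, ti);
        if (f < best) best = f;
    }

    /* Combine the per-node results into a global best fitness. */
    double global_best;
    MPI_Allreduce(&best, &global_best, 1, MPI_DOUBLE, MPI_MIN, MPI_COMM_WORLD);
    if (rank == 0)
        printf("best mismatch over %d candidates: %.0f\n", POP, global_best);

    free(ti);
    free(cand);
    MPI_Finalize();
    return 0;
}
```

The design point the abstract makes survives even in this toy form: MPI carries the cheap, infrequent communication between nodes, while OpenMP exploits the shared cache hierarchy within a node for the tight per-cell loop.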