Tuning and hybrid parallelization of a genetic-based multi-point statistics simulation code

Bibliographic Details
Published in: Parallel Computing, Vol. 40, Nos. 5-6, pp. 144-158
Main Authors: Peredo, Oscar, Ortiz, Julián M., Herrero, José R., Samaniego, Cristóbal
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.05.2014
ISSN: 0167-8191, 1872-7336
Description
Summary:
•This work is part of an effort to accelerate geostatistical simulation codes.
•We apply acceleration techniques to a genetic-based MPS simulation code.
•The acceleration techniques are code optimization and parallelization.
•Performance improvements allow us to accelerate the execution of the code by up to 100×.
•Samples of simulated results are shown.

One of the main difficulties of multi-point statistical (MPS) simulation based on annealing techniques or genetic algorithms is the excessive amount of time and memory required to achieve convergence. In this work we propose code optimizations and parallelization schemes for a genetic-based MPS code with the aim of speeding up the execution time. The code optimizations reduce cache misses in array accesses, avoid branching instructions, and increase the locality of the accessed data. The hybrid parallelization scheme combines a fine-grain parallelization of loops using a shared-memory programming model (OpenMP) with a coarse-grain distribution of load among several computational nodes using a distributed-memory programming model (MPI). Convergence, execution time and speed-up results are presented using 2D training images of sizes 100×100×1 and 1000×1000×1 on a distributed-shared memory supercomputing facility.
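
To illustrate the hybrid scheme described in the summary, the following minimal C sketch combines a coarse-grain MPI distribution of candidate solutions across nodes with a fine-grain OpenMP parallelization of the evaluation loop inside each node. It is a generic example written for this record, not the authors' code; the population size, the block decomposition, and the placeholder objective function evaluate_individual are all hypothetical stand-ins for the genetic-based MPS evaluation.

/* Hypothetical sketch of the hybrid MPI + OpenMP pattern: each MPI rank
 * evaluates a block of the candidate solutions, and OpenMP threads share
 * the evaluation loop inside the node. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

#define POP_SIZE 1024   /* hypothetical population size */

/* Placeholder cost function standing in for the MPS pattern mismatch. */
static double evaluate_individual(int id)
{
    double cost = 0.0;
    for (int k = 0; k < 1000; ++k)
        cost += (double)((id + k) % 7);
    return cost;
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Coarse-grain distribution: each rank owns a contiguous block of
     * individuals (simple block decomposition). */
    int chunk = (POP_SIZE + size - 1) / size;
    int begin = rank * chunk;
    int end   = begin + chunk;
    if (end > POP_SIZE) end = POP_SIZE;

    double local_best = 1.0e300;

    /* Fine-grain parallelism: OpenMP threads share the loop over the
     * rank-local individuals. */
    #pragma omp parallel for reduction(min : local_best) schedule(static)
    for (int i = begin; i < end; ++i) {
        double cost = evaluate_individual(i);
        if (cost < local_best)
            local_best = cost;
    }

    /* Combine the per-rank results to obtain the global best cost. */
    double global_best;
    MPI_Allreduce(&local_best, &global_best, 1, MPI_DOUBLE, MPI_MIN,
                  MPI_COMM_WORLD);

    if (rank == 0)
        printf("best cost = %f\n", global_best);

    MPI_Finalize();
    return 0;
}

In this pattern each rank touches only its own block of individuals, the OpenMP threads split that loop among the cores of the node, and a single MPI_Allreduce combines the per-rank best costs; the same structure can be extended with additional communication steps to exchange individuals between generations.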
DOI: 10.1016/j.parco.2014.04.005