A multi-GPU parallel computing method for 3D random vibration of train-track-soil dynamic interaction.

Bibliographic Details
Title:	A multi-GPU parallel computing method for 3D random vibration of train-track-soil dynamic interaction.
Alternate Title:	列车-轨道-地基土耦合系统三维随机振动的多GPU并行计算方法. (Chinese)
Authors:	Zhu, Zhi-hui, Yang, Xiao, Li, Hao, Xu, Hai-kun, Zou, You
Source:	Journal of Central South University; May 2023, Vol. 30, Issue 5, pp. 1722-1736 (15 pages)
Abstract (English):	This paper proposes an efficient computation method based on a multi-GPU parallel algorithm to overcome the low efficiency of random vibration calculations for the train-track-soil coupled system (TTSCS). First, because solving the multiple independent TTSCS equations at different frequency points one after another makes serial random vibration analysis slow, a multi-GPU parallel algorithm is proposed and implemented using OpenMP-CUDA hybrid programming. The tasks of solving the multiple linear equations for random vibration analysis of the TTSCS are distributed to different GPUs for parallel execution. On each GPU, the large sparse linear equations of the TTSCS are solved by the CUDA-based parallel preconditioned conjugate gradient (PCG) method, and the large sparse matrix is stored in the compressed sparse row (CSR) format to save memory. The parallel computing program is then implemented on a MATLAB-CUDA hybrid platform. Finally, numerical examples show that, for solving large sparse linear equations, the multi-GPU parallel algorithm on a 4-GPU node and the GPU-accelerated PCG algorithm on a single-GPU personal computer are 22.59 times and 3.75 times as efficient, respectively, as the multi-point synchronization algorithm (MPSA). [ABSTRACT FROM AUTHOR]
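Abstract (translated from Chinese):	To address the low efficiency of random computation for the train-track-soil coupled system, this paper proposes an efficient multi-GPU parallel method for solving the random vibration equations of the train-track-soil system. Using OpenMP-CUDA hybrid programming, the tasks of solving the multiple systems of linear equations that arise at different frequency points in the pseudo-excitation method are distributed to multiple GPUs for parallel execution. On each GPU, the symmetric positive definite equivalent static equilibrium equations are solved in parallel by the CUDA-based preconditioned conjugate gradient (PCG) method. Because the equivalent stiffness matrix of the coupled system is sparse, the large sparse matrix is stored in compressed sparse row (CSR) format to save memory. Finally, a parallel computing program is developed on a MATLAB-CUDA hybrid platform, which resolves the low efficiency of solving multiple systems of linear equations serially in random vibration analysis. Numerical examples show that the multi-GPU parallel algorithm on a four-GPU node and the single-GPU accelerated PCG algorithm are 22.59 times and 3.75 times as efficient, respectively, as the serial multi-point synchronization algorithm (MPSA). [ABSTRACT FROM AUTHOR]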
	Copyright of Journal of Central South University is the property of Springer Nature and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database:	Complementary Index

ISSN:	20952899
DOI:	10.1007/s11771-023-5331-7
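
Illustrative note: the abstract names three building blocks: CSR storage of a large sparse matrix, a preconditioned conjugate gradient (PCG) solve, and the distribution of independent linear systems (one per frequency point) to separate GPUs. The following is a minimal CPU-only sketch of how those pieces fit together, not the authors' OpenMP-CUDA or MATLAB-CUDA implementation. Several things here are assumptions: the abstract does not say which preconditioner is used, so a Jacobi (diagonal) preconditioner is swapped in; Python threads stand in for GPUs; and all function names (`to_csr`, `csr_matvec`, `pcg`, `solve_all`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def to_csr(dense):
    """Convert a dense row-major matrix to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, a in enumerate(row):
            if a != 0.0:
                values.append(a)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row's nonzeros
    return values, col_idx, row_ptr


def csr_matvec(csr, x):
    """Compute y = A @ x, visiting only the stored nonzeros."""
    values, col_idx, row_ptr = csr
    return [
        sum(values[k] * x[col_idx[k]] for k in range(row_ptr[i], row_ptr[i + 1]))
        for i in range(len(row_ptr) - 1)
    ]


def csr_diagonal(csr):
    """Extract the main diagonal, used by the assumed Jacobi preconditioner."""
    values, col_idx, row_ptr = csr
    diag = [0.0] * (len(row_ptr) - 1)
    for i in range(len(diag)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            if col_idx[k] == i:
                diag[i] = values[k]
    return diag


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg(csr, b, tol=1e-10, max_iter=1000):
    """Jacobi-preconditioned CG for a symmetric positive definite CSR matrix."""
    inv_diag = [1.0 / d for d in csr_diagonal(csr)]
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the initial guess x = 0
    z = [m * ri for m, ri in zip(inv_diag, r)]
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:  # relative residual test
            break
        Ap = csr_matvec(csr, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [m * ri for m, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


def solve_all(systems, n_workers=4):
    """Solve independent (csr, b) systems concurrently; results keep input order.

    Each worker thread stands in for one GPU in this sketch.
    """
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(lambda s: pcg(*s), systems))
```

In this sketch each `(csr, b)` pair plays the role of one frequency point's linear system. Because the systems are independent, they can be handed out to workers without any communication between solves, which is the property the abstract relies on when it assigns them to different GPUs.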