An Extensible Thread Throttling Method for Multiple OpenMP Parallel Programs
| Title: | An Extensible Thread Throttling Method for Multiple OpenMP Parallel Programs |
|---|---|
| Authors: | Xiaoxuan Luo, Weiwei Lin, Jiachun Li, Fan Chen, Haocheng Zhong, Keqin Li |
| Source: | ACM Transactions on Embedded Computing Systems. |
| Publisher Information: | Association for Computing Machinery (ACM), 2025. |
| Publication Year: | 2025 |
| Description: | OpenMP is one of the most popular parallel frameworks in the HPC area. Many researchers have proposed OpenMP thread throttling techniques for searching the optimal configuration of parallelism to improve computational efficiency. However, existing research mainly focuses on the optimal solution and ignores the average performance of the program during the search process. In addition, there are various types of workloads in HPC production environments. The OpenMP configuration needs to be adjusted according to the real-time running status of programs. Otherwise, it may lead to a deviation of the actual improvement in the real-time environment from the theory. In this paper, we propose an OpenMP thread throttling method. The method uses the search results of historical workloads to train the performance vertex prediction model, quickly identifies the approximate range of the optimal number of threads for unknown workloads, and searches in a small range with a neighborhood-sampling-based bidirectional hill-climbing search algorithm. The method improves real-time optimization efficiency in HPC systems with multiple unknown loads. Through experiments, we demonstrate the advantages of our method compared to a variety of commonly used thread throttling methods. With minor differences in the optimal solutions, the average performance and convergence speed of our method during the search can be improved by up to 10.6% and 22.7% compared to the best method. |
| Document Type: | Article |
| Language: | English |
| ISSN: | 1558-3465 1539-9087 |
| DOI: | 10.1145/3769679 |
| Accession Number: | edsair.doi...........00f755029eb855cd2abeaacacb9940b4 |
| Database: | OpenAIRE |
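
The description above outlines a two-stage idea: a model predicts roughly where the best thread count lies, and a bidirectional hill-climbing search then refines it over a small neighborhood. The record does not give the authors' algorithm, so the following is only a minimal Python sketch of that general idea under stated assumptions. The function names (`bidirectional_hill_climb`, `toy_runtime`), the `radius` parameter, and the synthetic runtime model are all hypothetical and are not taken from the paper; in practice `measure` would time a real OpenMP region at a given thread count.

```python
from typing import Callable, Dict


def bidirectional_hill_climb(
    measure: Callable[[int], float],
    predicted: int,
    max_threads: int,
    radius: int = 2,
) -> int:
    """Refine a predicted thread count by sampling neighbors on both sides.

    Hypothetical sketch: `measure(n)` returns a cost (e.g. runtime) for n
    threads, and `predicted` stands in for the output of a prediction model.
    """
    cache: Dict[int, float] = {}

    def cost(n: int) -> float:
        # Measure each thread count at most once; real measurements are costly.
        if n not in cache:
            cache[n] = measure(n)
        return cache[n]

    # Clamp the prediction into the valid range [1, max_threads].
    current = min(max(predicted, 1), max_threads)
    while True:
        lo = max(1, current - radius)
        hi = min(max_threads, current + radius)
        # Look at neighbors below and above the current point (bidirectional).
        best = min(range(lo, hi + 1), key=cost)
        if cost(best) >= cost(current):
            return current  # no strictly better neighbor: local optimum
        current = best


def toy_runtime(n: int) -> float:
    # Assumed synthetic model: parallel work shrinks with n, overhead grows.
    return 64.0 / n + 1.0 * n


if __name__ == "__main__":
    print(bidirectional_hill_climb(toy_runtime, predicted=5, max_threads=16))
```

With the synthetic model, the search converges to the same thread count whether the starting prediction is too low or too high, and a prediction closer to the optimum needs fewer measurements. That is the intuition behind narrowing the search range before climbing.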