Performance enhancement of high degree Charlier polynomials using multithreaded algorithm
| Published in: | Ain Shams Engineering Journal Vol. 15; no. 5; p. 102657 |
|---|---|
| Main Authors: | , , , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V, 01.05.2024 |
| Subjects: | |
| ISSN: | 2090-4479 |
| Summary: | Discrete orthogonal polynomials (DOPs) have gained significant research attention owing to their crucial role in digital signal processing applications such as computer vision, pattern recognition, and compression. However, computing DOP coefficients often incurs a substantial computational burden, which grows for higher-degree moments along with the resulting numerical errors. To address this challenge, this paper exploits the inherent parallelism in Charlier polynomial coefficient calculations to improve computation performance. Independent calculations are distributed among threads, making efficient use of the available processing resources. Two algorithms are presented. The first distributes the rows evenly among threads in sequential order (the straightforward approach). To achieve a more equitable distribution of coefficient calculations, the paper also proposes alternative distribution approaches aimed at balancing the processing load among threads. Extensive comparative experiments confirm that the proposed approaches achieve high performance across different polynomial degrees (1540 to 7370) and thread counts (2 to 256). The results show that multithreaded processing is up to 9.1 times faster than the unthreaded case. Furthermore, as the number of threads increases from 2 to 256, the most significant improvement occurs in the range of 32 to 128 threads, which the authors take as confirmation of the robustness of the proposed algorithm. |
| DOI: | 10.1016/j.asej.2024.102657 |
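
The summary contrasts a straightforward sequential split of rows among threads with distributions that balance the load. The Python sketch below illustrates that contrast only; it is not the paper's algorithm. The function names, the round-robin scheme, and the cost model (row i costs i + 1 units) are all assumptions made for illustration, since the record does not describe the actual distributions or per-row costs.

```python
from concurrent.futures import ThreadPoolExecutor


def block_rows(n_rows, n_threads):
    """Contiguous split: each thread gets one consecutive run of rows."""
    base, extra = divmod(n_rows, n_threads)
    parts, start = [], 0
    for t in range(n_threads):
        size = base + (1 if t < extra else 0)
        parts.append(list(range(start, start + size)))
        start += size
    return parts


def interleaved_rows(n_rows, n_threads):
    """Round-robin split: thread t gets rows t, t + n_threads, ..."""
    return [list(range(t, n_rows, n_threads)) for t in range(n_threads)]


def row_cost(i):
    # Hypothetical cost model (an assumption, not from the paper):
    # row i needs i + 1 units of work.
    return i + 1


def load_per_thread(parts):
    """Total assumed cost assigned to each thread."""
    return [sum(row_cost(i) for i in rows) for rows in parts]


def run_rows(parts, compute_row):
    """Evaluate compute_row(i) for every row, one worker per partition."""
    def worker(rows):
        return [(i, compute_row(i)) for i in rows]

    results = {}
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        for chunk in pool.map(worker, parts):
            results.update(chunk)
    # Reassemble in row order, regardless of which thread computed what.
    return [results[i] for i in range(len(results))]
```

Under this assumed cost model, `load_per_thread(block_rows(8, 4))` gives `[3, 7, 11, 15]`, while `load_per_thread(interleaved_rows(8, 4))` gives `[6, 8, 10, 12]`, so the round-robin split narrows the gap between the lightest and heaviest thread. In CPython, the global interpreter lock limits the speed-up of pure-Python CPU-bound work, so the sketch shows the distribution logic rather than the reported timings.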