Parallel Algorithm for SWFFT Using 3D Data Structure
| Published in: | Circuits, systems, and signal processing, Vol. 31, No. 2, pp. 711-726 |
|---|---|
| Main authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: | Boston: SP Birkhäuser Verlag Boston, 01.04.2012; Springer Nature B.V |
| Subjects: | |
| ISSN: | 0278-081X, 1531-5878 |
| Summary: | The Sliding-Window Fast Fourier Transform (SWFFT) is an important and widely used time-frequency representation of a signal. This paper focuses on how to implement the non-recursive SWFFT in parallel. To avoid repeated calculations, non-recursive SWFFT algorithms save intermediate results and reuse them in later calculations. The current calculation therefore depends on results from earlier ones, and this dependency is the main obstacle to parallelizing the SWFFT. Assuming that all the data are available at the outset, the paper proposes a new parallel algorithm for the non-recursive SWFFT. The algorithm uses a 3D data structure to represent the SWFFT computation, and the calculations proceed along the level axis instead of the time axis. Because the current calculations no longer depend on results from earlier calculations, the algorithm is easy to implement in parallel. Finally, the algorithm is implemented in C++ with the OpenMP API and evaluated on a 4-processor desktop computer. Up to four threads are created, each executed by one processor core. A shared-memory parallel programming model handles data access and the exchange of results between threads. Compared with a single thread, four threads reduce the computation time for a signal of length $L = 10^{16}$ (window length $N = 10^{10}$) to 32% of the single-thread time. |
| DOI: | 10.1007/s00034-011-9342-5 |
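
The summary's idea of computing along the level axis can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a radix-2 decimation-in-time decomposition, a power-of-two window length, and a fully available real-valued signal, and the names `swfft_by_level` and `cd` are hypothetical. At each level, every window start `t` reads only results from the previous level, so the loop over `t` carries no dependency and can be shared among OpenMP threads. The 3D structure (level, time, frequency) is held here as two consecutive level slices rather than in full.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using cd = std::complex<double>;  // hypothetical alias for this sketch

// Returns out[t][k] = sum_{n=0}^{N-1} x[t+n] * exp(-2*pi*i*n*k/N)
// for every window start t in [0, L-N], where L = x.size().
// Assumption: N is a power of two and N <= L.
std::vector<std::vector<cd>> swfft_by_level(const std::vector<double>& x,
                                            std::size_t N) {
    const std::size_t L = x.size();
    assert(N >= 1 && (N & (N - 1)) == 0 && N <= L);
    const double pi = std::acos(-1.0);

    // Level 0: one-point DFTs, i.e. the samples themselves.
    std::vector<std::vector<cd>> prev(L, std::vector<cd>(1));
    for (std::size_t t = 0; t < L; ++t) prev[t][0] = cd(x[t], 0.0);

    // Each level doubles the DFT size and halves the sample stride.
    for (std::size_t size = 2; size <= N; size *= 2) {
        const std::size_t half = size / 2;
        const std::size_t stride = N / size;
        // Number of start positions whose strided samples all fit in x.
        const std::size_t count = L - (size - 1) * stride;

        // Twiddle factors exp(-2*pi*i*k/size) for this level.
        std::vector<cd> w(half);
        for (std::size_t k = 0; k < half; ++k) {
            w[k] = std::polar(1.0, -2.0 * pi * static_cast<double>(k) /
                                       static_cast<double>(size));
        }

        std::vector<std::vector<cd>> cur(count, std::vector<cd>(size));

        // Iterations over t are independent: each reads only prev and
        // writes only its own row of cur. Without -fopenmp the pragma
        // is ignored and the loop runs serially.
        #pragma omp parallel for
        for (std::ptrdiff_t ti = 0; ti < static_cast<std::ptrdiff_t>(count); ++ti) {
            const std::size_t t = static_cast<std::size_t>(ti);
            for (std::size_t k = 0; k < half; ++k) {
                const cd a = prev[t][k];                  // even-indexed half
                const cd b = w[k] * prev[t + stride][k];  // odd-indexed half
                cur[t][k] = a + b;
                cur[t][k + half] = a - b;
            }
        }
        prev = std::move(cur);
    }
    return prev;
}
```

Calling `swfft_by_level(x, 4)` on a 10-sample signal returns 7 spectra of 4 bins each, one per window position. Keeping only two level slices limits memory to roughly `2 * L * N` complex values; a full 3D array would also work but grows with the number of levels. Compile with an OpenMP flag such as `-fopenmp` to enable the threaded loop.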