Parallel Algorithm for SWFFT Using 3D Data Structure

Bibliographic details
Published in: Circuits, Systems, and Signal Processing, Vol. 31, No. 2, pp. 711-726
Main authors: Wang, Jian-Ming; Eddy, William F.
Format: Journal Article
Language: English
Published: Boston, SP Birkhäuser Verlag Boston, 1 April 2012
Springer Nature B.V.
ISSN: 0278-081X, 1531-5878
Description
Summary: The Sliding-Window Fast Fourier Transform (SWFFT) is an important and widely used time-frequency representation of a signal. This paper focuses on how to implement the non-recursive SWFFT as a parallel program. To avoid repeated calculations, non-recursive SWFFT algorithms save intermediate results and reuse them in later calculations. The current calculation therefore depends on the results of earlier ones, and this dependency is the main obstacle to parallelizing the SWFFT. Assuming that all the data are available at the outset, the paper proposes a new parallel algorithm for the non-recursive SWFFT. The algorithm uses a 3D data structure to represent the SWFFT computation and carries out the calculations along the level axis instead of the time axis. Because the current calculations no longer depend on the results of earlier calculations, the algorithm is easy to parallelize. The algorithm is implemented in C++ with the OpenMP API and evaluated on a 4-processor desktop computer. Up to four threads are created, each executed by one processor core. A shared-memory parallel programming model handles data access and the exchange of results between threads. For a signal of length L = 10^16 (window length N = 10^10), using four threads reduces the computation time to 32% of the single-thread time.
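
The level-by-level idea in the summary can be illustrated with a short sketch. This is not the authors' code: it assumes a standard radix-2 decimation-in-time recurrence in which the node at time t on a given level combines the previous-level nodes at times t and t - stride, so every time index on a level can be processed independently. The names sliding_fft_by_level and direct_window_dft are hypothetical, the window length is assumed to be a power of two, and samples before the start of the signal are treated as zero.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using cd = std::complex<double>;

// Hypothetical helper (not from the paper): DFT of every length-N window,
// built one level at a time. out[t][f] is the DFT of x[t-N+1 .. t];
// samples before index 0 are taken as zero, so only t >= N-1 are full windows.
std::vector<std::vector<cd>> sliding_fft_by_level(const std::vector<double>& x,
                                                  std::size_t N) {
    assert(N >= 1 && (N & (N - 1)) == 0);  // assumption: N is a power of two
    const long long n_samples = static_cast<long long>(x.size());
    const double pi = std::acos(-1.0);

    // Level 0: a length-1 DFT at each time is just the sample itself.
    std::vector<std::vector<cd>> prev(x.size()), cur(x.size());
    for (std::size_t t = 0; t < x.size(); ++t) prev[t] = {cd(x[t], 0.0)};

    // Each pass doubles the DFT length and halves the sample stride.
    for (std::size_t len = 2, stride = N / 2; len <= N; len *= 2, stride /= 2) {
        const std::size_t half = len / 2;
        const long long s = static_cast<long long>(stride);
        std::vector<cd> w(len);  // twiddle factors exp(-2*pi*i*j/len)
        for (std::size_t j = 0; j < len; ++j)
            w[j] = std::polar(1.0, -2.0 * pi * static_cast<double>(j) /
                                       static_cast<double>(len));

        // Within a level, every time index reads only the previous level,
        // so the loop over t has no dependence between iterations.
#ifdef _OPENMP
#pragma omp parallel for
#endif
        for (long long t = 0; t < n_samples; ++t) {
            std::vector<cd> node(len);
            for (std::size_t j = 0; j < len; ++j) {
                const cd even = (t >= s)
                    ? prev[static_cast<std::size_t>(t - s)][j % half]
                    : cd(0.0, 0.0);
                const cd odd = prev[static_cast<std::size_t>(t)][j % half];
                node[j] = even + w[j] * odd;
            }
            cur[static_cast<std::size_t>(t)] = std::move(node);
        }
        prev.swap(cur);
    }
    return prev;
}

// Reference for checking: direct O(N^2) DFT of the window ending at t.
// Requires t >= N-1.
std::vector<cd> direct_window_dft(const std::vector<double>& x,
                                  std::size_t t, std::size_t N) {
    const double pi = std::acos(-1.0);
    std::vector<cd> out(N);
    for (std::size_t f = 0; f < N; ++f)
        for (std::size_t n = 0; n < N; ++n)
            out[f] += x[t - N + 1 + n] *
                      std::polar(1.0, -2.0 * pi * static_cast<double>(f * n) /
                                          static_cast<double>(N));
    return out;
}
```

The sketch stores one vector per time index on each level, which mirrors a time by level by frequency layout but keeps only two levels in memory at once. Compiled with OpenMP enabled (for example with -fopenmp on GCC or Clang), the loop over t is shared among threads; without it the same code runs serially.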
DOI: 10.1007/s00034-011-9342-5