A Work Efficient Parallel Algorithm for Exact Euclidean Distance Transform

A fully-parallelized work-time optimal algorithm is presented for computing the exact Euclidean Distance Transform (EDT) of a 2D binary image with the size of n × n. Unlike existing PRAM (Parallel Random Access Machine) and other algorithms, this algorithm is suitable for implementation on modern SI...

Celý popis

Uloženo v:
Podrobná bibliografie
Vydáno v:IEEE transactions on image processing Ročník 28; číslo 11; s. 5322 - 5335
Hlavní autoři: Manduhu, Manduhu, Jones, Mark W.
Médium: Journal Article
Jazyk:angličtina
Vydáno: United States IEEE 01.11.2019
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Témata:
ISSN:1057-7149, 1941-0042, 1941-0042
On-line přístup:Získat plný text
Tagy: Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
Popis
Shrnutí:A fully-parallelized work-time optimal algorithm is presented for computing the exact Euclidean Distance Transform (EDT) of a 2D binary image with the size of n × n. Unlike existing PRAM (Parallel Random Access Machine) and other algorithms, this algorithm is suitable for implementation on modern SIMD (Single Instruction Multiple Data) architectures such as GPUs. As a fundamental operation of 2D EDT, 1D EDT is efficiently parallelized first. Specifically, the GPU algorithm for the 1D EDT, which uses CUDA (Compute Unified Device Architecture) binary functions, such as ballotO, ffs(), clzO, and shflO, runs in O(log 32 n) time and performs O(n) work. Using the 1D EDT as a fundamental operation, the fully-parallelized work-time optimal 2D EDT algorithm is designed. This algorithm consists of three steps. Step 1 of the algorithm runs in O(log 32 n) time and performs O(N) (N= n2 ) of total work on GPU. Step 2 performs O(N) of total work and has an expected time complexity of O(logn) on GPU. Step 3 runs in O(log 32 n) time and performs O(N) of total work on GPU. As far as we know, this algorithm is the first fully-parallelized and realized work-time optimal algorithm for GPUs. The experimental results show that this algorithm outperforms the prior state-of-the-art GPU algorithms.
Bibliografie:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:1057-7149
1941-0042
1941-0042
DOI:10.1109/TIP.2019.2916741