On the parallelization of Hirschberg's algorithm for multi‐core and many‐core systems

Bibliographic details
Published in: Concurrency and computation, Volume 31, Issue 18
Main authors: João, Mario; Sena, Alexandre C.; Rebello, Vinod E. F.
Format: Journal Article
Language: English
Publication details: Hoboken: Wiley Subscription Services, Inc, 25.09.2019
ISSN: 1532-0626, 1532-0634
Description
Summary: Finding the longest common subsequence between two strings in acceptable time frames is crucial to solving various problems in different fields of study. To ensure the optimal solution is found, algorithms based on dynamic programming are employed almost exclusively. While the most commonly adopted algorithm, proposed by Needleman and Wunsch, has quadratic time and space complexity, the linear space complexity of Hirschberg's algorithm favors the comparison of longer sequences. However, it too has quadratic time complexity, and therefore the effective exploitation of parallelism has become essential. This paper focuses on improving the execution efficiency of Hirschberg's algorithm on multi‐core and many‐core systems. To achieve this goal, first, enhancements to the sequential version are proposed to take advantage of SIMD instructions available on modern processors. Second, the impact of different parallelization strategies on performance is investigated and evaluated. Results show that combining these two aspects can greatly improve the performance of Hirschberg's algorithm on these architectures. In relation to the original version, speedups of over 46 were achieved on a dual 18‐core server for sequences of 1.6 million characters. Furthermore, experiments with a 68‐core Intel Xeon Phi (many‐core) system obtained speedups of up to 105 for the same sequence size.
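For readers unfamiliar with the linear-space idea the summary refers to, the following is a minimal sequential sketch of Hirschberg's divide-and-conquer scheme applied to the longest common subsequence. It is not the paper's implementation: it includes neither the SIMD enhancements nor any of the parallelization strategies, and the function names `lcs_last_row` and `hirschberg_lcs` are hypothetical, chosen only for this illustration.

```python
def lcs_last_row(a, b):
    """Last row of the LCS-length table for a vs. b, keeping only two rows."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev


def hirschberg_lcs(a, b):
    """Return one longest common subsequence of strings a and b."""
    if not a or not b:
        return ""
    if len(a) == 1:
        return a if a in b else ""
    mid = len(a) // 2
    # Forward scores for the first half of a, backward scores for the second.
    left = lcs_last_row(a[:mid], b)
    right = lcs_last_row(a[mid:][::-1], b[::-1])
    # Choose the split point of b that maximizes the combined LCS length.
    n = len(b)
    split = max(range(n + 1), key=lambda j: left[j] + right[n - j])
    # Solve the two independent subproblems and concatenate the results.
    # (Slicing is used for clarity; an index-based version avoids the copies.)
    return hirschberg_lcs(a[:mid], b[:split]) + hirschberg_lcs(a[mid:], b[split:])
```

For example, `hirschberg_lcs("AGGTAB", "GXTXAYB")` returns `"GTAB"`. The two recursive calls are independent of each other, which is one natural place where parallelism could be introduced; whether the paper exploits it in this form is not stated in the summary.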
DOI: 10.1002/cpe.5174