A coarse-grained multicomputer parallel algorithm for the sequential substring constrained longest common subsequence problem
| Published in: | Parallel Computing, Vol. 111, p. 102927 |
|---|---|
| Main authors: | , , , , |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Elsevier B.V, 01.07.2022 |
| Subject: | |
| ISSN: | 0167-8191, 1872-7336 |
| Summary: | In this paper, we study the sequential substring constrained longest common subsequence (SSCLCS) problem, which is widely used in bioinformatics. Given two strings $X$ and $Y$ of respective lengths $m$ and $n$ over an alphabet $\Sigma$, and a constraint sequence $C$ formed by ordered strings $(c_1, c_2, \ldots, c_l)$ of total length $r$, the SSCLCS problem is to find the longest common subsequence $D$ of $X$ and $Y$ such that $D$ contains $c_1, c_2, \ldots, c_l$ in order. To solve this problem, Tseng et al. proposed a dynamic-programming algorithm that runs in $O(mnr + (m+n)\lvert\Sigma\rvert)$ time. We build on this work to propose a parallel algorithm for the SSCLCS problem on the Coarse-Grained Multicomputer (CGM) model. We design a three-dimensional partitioning technique for the corresponding dependency graph that reduces processor latency by ensuring that, at each step, the subproblems assigned to the processors are small. It also minimizes the number of communications between processors. Our solution requires $O\!\left(\frac{nmr + (m+n)\lvert\Sigma\rvert}{p}\right)$ execution time with $O(p)$ communication rounds on $p$ processors. The experimental results show that our solution achieves a speedup of up to 59.7 on 64 processors, which is better than the CGM-based parallel techniques previously used to solve similar problems. • Describing a task graph that follows Tseng et al.'s recursive formula • Describing a three-dimensional partitioning strategy and a distribution scheme • Experimental study on a real parallel machine with existing DNA data sets • Comparing the empirical results of the proposed strategy with those of previous strategies |
|---|---|
| DOI: | 10.1016/j.parco.2022.102927 |
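
To make the problem statement in the summary concrete, the following is a minimal sequential sketch in Python that computes only the length of an SSCLCS. It is not the recurrence of Tseng et al. and not the CGM parallel algorithm of the paper. It is a plain three-dimensional dynamic program written under one reading of the definition: each constraint string must appear as a contiguous substring of D, in the given order and without overlap. The function name `ssclcs_length` and all variable names are hypothetical.

```python
def ssclcs_length(x, y, constraints):
    """Length of a longest common subsequence D of x and y that contains
    every string in `constraints` as a contiguous, non-overlapping
    substring of D, in the given order. Returns None if no such D exists.

    Illustrative sketch only (assumed reading of the SSCLCS definition).
    """
    flat = "".join(constraints)      # all constraint characters, in order
    r, m, n = len(flat), len(x), len(y)

    # Offsets in `flat` where a constraint string ends (plus offset 0).
    # Only at these offsets may D contain characters outside the constraints.
    boundaries = {0}
    pos = 0
    for c in constraints:
        pos += len(c)
        boundaries.add(pos)

    neg = float("-inf")              # marks an infeasible state
    # dp[k][i][j]: best length using x[:i] and y[:j] once the first k
    # characters of `flat` have been placed in D.
    dp = [[[neg] * (n + 1) for _ in range(m + 1)] for _ in range(r + 1)]

    for k in range(r + 1):
        for i in range(m + 1):
            for j in range(n + 1):
                if i == 0 or j == 0:
                    dp[k][i][j] = 0 if k == 0 else neg
                    continue
                # Skip the last character of x or of y.
                best = max(dp[k][i - 1][j], dp[k][i][j - 1])
                if x[i - 1] == y[j - 1]:
                    if k in boundaries:
                        # Free match between two constraint strings.
                        best = max(best, dp[k][i - 1][j - 1] + 1)
                    if k > 0 and x[i - 1] == flat[k - 1]:
                        # Match that consumes the k-th constraint character.
                        best = max(best, dp[k - 1][i - 1][j - 1] + 1)
                dp[k][i][j] = best

    result = dp[r][m][n]
    return None if result == neg else result
```

With an empty constraint list the function reduces to the ordinary LCS length. For `x = y = "abc"` and the single constraint `"ac"`, the only admissible D is `"ac"` itself, because keeping `b` would separate `a` from `c`. The table has (r+1)(m+1)(n+1) cells, which corresponds to the mnr term in the sequential bound quoted in the summary; the summary describes the paper as partitioning a three-dimensional dependency graph of this general kind across processors.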