Longest common substrings with k mismatches

The longest common substring with k-mismatches problem is to find, given two strings S1 and S2, a longest substring A1 of S1 and A2 of S2 such that the Hamming distance between A1 and A2 is ≤k. We introduce a practical O(nm) time and O(1) space solution for this problem, where n and m are the length...

Bibliographic details
Published in: Information Processing Letters, Vol. 115, No. 6-8, pp. 643-647
Main authors: Flouri, Tomas; Giaquinta, Emanuele; Kobert, Kassian; Ukkonen, Esko
Format: Journal Article
Language: English
Published: Amsterdam: Elsevier B.V., 1 June 2015
Elsevier Sequoia S.A.
ISSN:0020-0190, 1872-6119
Online access: Full text
Description
Abstract: The longest common substring with k mismatches problem is to find, given two strings S1 and S2, a longest substring A1 of S1 and a longest substring A2 of S2 such that the Hamming distance between A1 and A2 is ≤ k. We introduce a practical O(nm) time and O(1) space solution for this problem, where n and m are the lengths of S1 and S2, respectively. This algorithm can also be used to compute the matching statistics with k mismatches of S1 and S2 in O(nm) time and O(m) space. Moreover, we present a theoretical solution for the k = 1 case which runs in O(n log m) time, assuming m ≤ n, and uses O(m) space, improving over the existing O(nm) time and O(m) space bound of Babenko and Starikovskaya [1].
Highlights:
• Two new algorithms for the longest common substring with k mismatches problem.
• A practical solution for arbitrary k which uses constant space.
• A theoretical solution for one mismatch which runs in quasilinear time.
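Illustration: the abstract states the O(nm) time and O(1) space bounds but not the procedure itself. The sketch below is one way such bounds can be reached and is not confirmed to be the authors' algorithm: it slides a window along every diagonal (fixed alignment offset) of S1 against S2 and shrinks the window from the left whenever it holds more than k mismatches. The function name `lcs_k_mismatches` and the returned tuple layout are assumptions made for this example.

```python
def lcs_k_mismatches(s1: str, s2: str, k: int) -> tuple[int, int, int]:
    """Return (length, start_in_s1, start_in_s2) for a longest pair of
    equal-length substrings whose Hamming distance is at most k.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n, m = len(s1), len(s2)
    best = (0, 0, 0)
    # Diagonal d pairs s1[i] with s2[i - d]; d ranges over -(m-1) .. n-1.
    for d in range(-(m - 1), n):
        i0 = max(d, 0)                 # first s1 index on this diagonal
        j0 = i0 - d                    # matching first s2 index
        length = min(n - i0, m - j0)   # number of aligned character pairs
        left = 0                       # window start, as offset on the diagonal
        mismatches = 0                 # mismatches inside the current window
        for right in range(length):
            if s1[i0 + right] != s2[j0 + right]:
                mismatches += 1
            # Shrink from the left until at most k mismatches remain.
            # Mismatches are re-detected by comparing characters again,
            # so no queue of mismatch positions is stored.
            while mismatches > k:
                if s1[i0 + left] != s2[j0 + left]:
                    mismatches -= 1
                left += 1
            width = right - left + 1
            if width > best[0]:
                best = (width, i0 + left, j0 + left)
    return best
```

For example, `lcs_k_mismatches("abcde", "xbcdx", 1)` returns `(4, 0, 0)`, the pair "abcd" / "xbcd" with one mismatch. Each of the n + m - 1 diagonals is scanned once with two pointers, so the total work is proportional to nm, and only a constant number of counters is kept beyond the input strings.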
DOI:10.1016/j.ipl.2015.03.006