Bit-parallel (δ,γ)-matching and suffix automata


Bibliographic Details
Published in: Journal of discrete algorithms (Amsterdam, Netherlands), Vol. 3, no. 2, pp. 198-214
Main Authors: Crochemore, Maxime; Iliopoulos, Costas S.; Navarro, Gonzalo; Pinzon, Yoan J.; Salinger, Alejandro
Format: Journal Article
Language: English
Published: Elsevier B.V., 2005
ISSN: 1570-8667, 1570-8675
Description
Summary: (δ,γ)-matching is a string matching problem with applications to music retrieval. Given a pattern $P_{1 \ldots m}$ and a text $T_{1 \ldots n}$ over an alphabet of integers, the goal is to find the occurrences $P'$ of the pattern in the text such that (i) $|P_i - P'_i| \le \delta$ for all $1 \le i \le m$, and (ii) $\sum_{1 \le i \le m} |P_i - P'_i| \le \gamma$. The problem makes sense for $\delta \le \gamma \le \delta m$. Several techniques for (δ,γ)-matching have been proposed, based on bit-parallelism or on skipping characters. We first present a bit-parallel algorithm with $O(mn \log(\gamma)/w)$ worst-case time and $O(n)$ average-case time, where $w$ is the number of bits in the computer word. It improves on the previous $O(mn \log(\delta m)/w)$ worst-case time algorithm of the same type. Second, we combine our bit-parallel algorithm with suffix automata to obtain the first algorithm that skips characters using both δ and γ. This algorithm examines fewer characters than any previous approach, as the others perform only δ-matching and check the γ-condition on the candidates. We implemented our algorithms and obtained experimental results on real music, showing that our algorithms are superior to current alternatives for high values of δ.
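To make the problem definition concrete, the following is a minimal Python sketch, not the algorithm of the paper. The first function applies conditions (i) and (ii) directly to every text window. The second illustrates the "δ-matching, then check the γ-condition on the candidates" strategy that the summary attributes to earlier approaches, using a standard Shift-And bit-parallel filter. The function names, the 0-based positions, and the sample data are assumptions made for illustration only.

```python
def delta_gamma_naive(pattern, text, delta, gamma):
    """Return 0-based start positions of all (delta, gamma)-occurrences.

    Direct application of the definition: O(m * n) time.
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return []
    hits = []
    for start in range(n - m + 1):
        diffs = [abs(p - t) for p, t in zip(pattern, text[start:start + m])]
        # Condition (i): every position differs by at most delta.
        # Condition (ii): the differences sum to at most gamma.
        if all(d <= delta for d in diffs) and sum(diffs) <= gamma:
            hits.append(start)
    return hits


def delta_gamma_filter(pattern, text, delta, gamma):
    """Shift-And delta-matching filter followed by gamma verification.

    Illustrative only: this is the filter-and-verify strategy, not the
    paper's algorithm, which uses gamma during the search itself.
    """
    m = len(pattern)
    if m == 0:
        return []
    accept = 1 << (m - 1)   # bit m-1 set means a full delta-match ends here
    masks = {}              # text symbol -> bitmask of delta-compatible pattern positions
    state = 0
    hits = []
    for end, c in enumerate(text):
        mask = masks.get(c)
        if mask is None:
            mask = 0
            for i, p in enumerate(pattern):
                if abs(p - c) <= delta:
                    mask |= 1 << i
            masks[c] = mask
        # Extend every active prefix by one symbol and start a new one.
        state = ((state << 1) | 1) & mask
        if state & accept:
            start = end - m + 1
            window = text[start:end + 1]
            # Candidate passed condition (i); verify condition (ii).
            if sum(abs(p - t) for p, t in zip(pattern, window)) <= gamma:
                hits.append(start)
    return hits
```

For example, with the hypothetical pitch sequences `pattern = [60, 62, 64]` and `text = [60, 61, 65, 60, 62, 64, 59, 63, 63]`, both functions return `[0, 3]` for δ = 1, γ = 2. The window starting at position 6 satisfies the δ-condition but has total difference 3, so it is reported only when γ is raised to 3.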
DOI: 10.1016/j.jda.2004.08.005