An algorithm for dynamic processing of dawg's

Bibliographic details
Published in: International journal of computer mathematics, Vol. 54, No. 3-4, pp. 155-173
Main authors: Park, Ki-Hong; Aoe, Jun-Ichi; Morimoto, Katsushi; Shishibori, Masami
Format: Journal Article
Language: English
Published: Abingdon, Gordon and Breach Science Publishers S.A, 01.01.1994
Taylor and Francis
ISSN:0020-7160, 1029-0265
Description
Summary: SCOPE: Algorithms, Information storage and retrieval. A trie is a search tree obtained by merging the common prefixes of the key set. It has the advantage that all keys that are prefixes of an input string can be retrieved at high speed. As the key set grows, however, the number of transitions increases, and with it the storage required. This paper proposes an algorithm that dynamically constructs DAWGs (Directed Acyclic Word Graphs) to handle dynamic key sets, which also addresses the growing number of transitions in the trie structure. The proposed method constructs a DAWG by locally separating common suffixes in order to update a key and, once the update is finished, locally merging the transitions of common suffixes. The proposed algorithm is evaluated theoretically, and the data structure for its implementation is discussed. Experimental results show that the number of transitions in the DAWG is reduced by approximately 50 to 70% compared with that of the trie for key sets of several thousand to fifty thousand elements, and that keys can be updated in a practical time for sets of fewer than ten thousand keys.
DOI:10.1080/00207169408804348
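
Illustration: the summary contrasts the number of transitions in a trie with that in a DAWG built over the same key set. The Python sketch below is one way to see that difference on a toy key set. It is not the paper's dynamic algorithm (local separation and local merging of common suffixes during updates). Instead it builds an ordinary trie and then applies a simple static, bottom-up merge of states with identical suffix behaviour. All names (`Node`, `build_trie`, `merge_suffixes`, `count_transitions`, `prefixes_of`) and the key set are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Tuple


class Node:
    """One state; `edges` maps a character to the next state."""

    def __init__(self) -> None:
        self.edges: Dict[str, "Node"] = {}
        self.final = False  # True if some key ends at this state


def build_trie(keys: List[str]) -> Node:
    """Build a trie: keys that share a prefix share the corresponding path."""
    root = Node()
    for key in keys:
        node = root
        for ch in key:
            node = node.edges.setdefault(ch, Node())
        node.final = True
    return root


def merge_suffixes(root: Node) -> Node:
    """Static bottom-up merge (an assumption, not the paper's method):
    states with the same finality and the same outgoing transitions to
    already-merged states are replaced by a single shared state."""
    registry: Dict[Tuple, Node] = {}

    def canon(node: Node) -> Node:
        for ch in list(node.edges):
            node.edges[ch] = canon(node.edges[ch])
        signature = (
            node.final,
            tuple(sorted((ch, id(child)) for ch, child in node.edges.items())),
        )
        # Return the first state registered with this signature.
        return registry.setdefault(signature, node)

    return canon(root)


def count_transitions(root: Node) -> int:
    """Count transitions, visiting each shared state only once."""
    seen = set()
    stack = [root]
    total = 0
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        total += len(node.edges)
        stack.extend(node.edges.values())
    return total


def prefixes_of(root: Node, text: str) -> List[str]:
    """Return every stored key that is a prefix of `text`."""
    found = []
    node = root
    for i, ch in enumerate(text):
        node = node.edges.get(ch)
        if node is None:
            break
        if node.final:
            found.append(text[: i + 1])
    return found


keys = ["tap", "taps", "top", "tops"]
trie_size = count_transitions(build_trie(keys))
dawg = merge_suffixes(build_trie(keys))  # merging mutates the trie it is given
dawg_size = count_transitions(dawg)
print(trie_size, dawg_size)          # 7 5
print(prefixes_of(dawg, "topside"))  # ['top', 'tops']
```

On this toy set the shared endings "p" and "ps" are stored once, so the merged graph needs 5 transitions instead of 7, while prefix retrieval works unchanged. The reduction reported in the summary refers to the authors' own experiments and data structure, not to this sketch.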