An algorithm for dynamic processing of DAWG's

Bibliographic details
Published in: International journal of computer mathematics, Vol. 54, No. 3-4, pp. 155-173
Main authors: Park, Ki-Hong; Aoe, Jun-Ichi; Morimoto, Katsushi; Shishibori, Masami
Format: Journal Article
Language: English
Publication details: Abingdon: Gordon and Breach Science Publishers S.A., 01.01.1994
Taylor and Francis
ISSN: 0020-7160, 1029-0265
Description
Summary: SCOPE: Algorithms, Information storage and retrieval. A trie is a search tree obtained by merging the common prefixes of the key set. Its advantage is that all keys that are prefixes of an input string can be retrieved at high speed. As the key set grows, however, a problem arises: the number of transitions increases, and with it the storage required. This paper proposes an algorithm that dynamically constructs DAWGs (Directed Acyclic Word Graphs) to handle dynamic key sets, which also addresses the growing number of transitions in the trie structure. The proposed method constructs a DAWG by locally separating common suffixes while a key is being updated and, once the update is finished, locally merging the transitions of common suffixes again. The proposed algorithm is evaluated theoretically, and the data structure for its implementation is discussed. Experimental results show that the number of transitions in the DAWG is reduced by approximately 50 to 70% compared with that of the trie for key sets of several thousand to fifty thousand elements, and that key updates can be executed in a practical time for sets of fewer than ten thousand keys.
DOI:10.1080/00207169408804348
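
Illustrative sketch: the summary contrasts the number of transitions in a trie with that in a DAWG. The Python below is a minimal sketch of that contrast only, under stated assumptions. It builds a trie and then merges equivalent subtrees in a single static pass; it is not the dynamic algorithm of the paper, which separates and re-merges common suffixes locally as keys are updated. All names (Node, build_trie, merge_suffixes, count_transitions, prefix_keys) and the sample key set are hypothetical.

```python
class Node:
    def __init__(self):
        self.edges = {}      # symbol -> child Node
        self.final = False   # True if a key ends at this node


def build_trie(keys):
    """Build a plain trie: keys share nodes only along common prefixes."""
    root = Node()
    for key in keys:
        node = root
        for ch in key:
            node = node.edges.setdefault(ch, Node())
        node.final = True
    return root


def merge_suffixes(node, registry):
    """Return a canonical node for `node`, merging equivalent subtrees.

    Static, whole-structure merge (an assumption for illustration, not the
    paper's local update procedure). Canonical nodes stay alive in
    `registry`, so their id() values are stable while it exists.
    """
    for ch, child in list(node.edges.items()):
        node.edges[ch] = merge_suffixes(child, registry)
    signature = (
        node.final,
        tuple(sorted((ch, id(child)) for ch, child in node.edges.items())),
    )
    return registry.setdefault(signature, node)


def count_transitions(root):
    """Count edges reachable from root, visiting each shared node once."""
    seen, stack, total = set(), [root], 0
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        total += len(node.edges)
        stack.extend(node.edges.values())
    return total


def prefix_keys(root, text):
    """Return every stored key that is a prefix of `text`."""
    found, node = [], root
    for i, ch in enumerate(text):
        node = node.edges.get(ch)
        if node is None:
            break
        if node.final:
            found.append(text[:i + 1])
    return found


keys = ["tap", "taps", "top", "tops", "cap", "caps"]  # hypothetical key set
trie = build_trie(keys)
print(count_transitions(trie))        # 11 transitions in the trie
dawg = merge_suffixes(trie, {})       # merges nodes in place
print(count_transitions(dawg))        # 7 transitions after merging
print(prefix_keys(dawg, "tapsx"))     # ['tap', 'taps']
```

On this toy key set the merge reduces 11 transitions to 7. The reduction figures quoted in the summary come from the paper's own experiments on much larger key sets and are not reproduced by this sketch.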