A method for improving full text search using signature files


Bibliographic details
Published in: International journal of computer mathematics, Volume 77, Issue 1, pp. 73-88
Main authors: Yamakawa, Yoshihiro; Fuketa, Masao; Morita, Kazuhiro; Aoe, Jun-ichi
Format: Journal Article
Language: English
Publication details: Abingdon, Gordon and Breach Science Publishers, 01.01.2001
Taylor and Francis
ISSN: 0020-7160, 1029-0265
Description
Summary: The efficiency of full-text retrieval using signatures depends on the amount of filtering and on the reduction of the original text, but there has been no discussion of how to construct a signature that maintains the worst-case filtering ratio. To address this problem, we present a technique for constructing signatures using the appearance probability of strings in textual data. It enables any keyword to be retrieved within the expected worst-case search time. A partial appearance probability is proposed because computing the overall probability for the whole text makes building signatures very time-consuming. The simulation results show that the worst-case filtering ratio of the presented method keeps to the expected ratio, while that of the traditional method degrades to zero.
DOI:10.1080/00207160108805051