Uniform Hashing in Constant Time and Optimal Space

Bibliographic Details
Published in:	SIAM Journal on Computing, Vol. 38, No. 1, pp. 85-96
Main authors:	Pagh, Anna; Pagh, Rasmus
Format:	Journal Article
Language:	English
Published:	Philadelphia, PA: Society for Industrial and Applied Mathematics, 1 January 2008
Subjects:	Algorithmics. Computability. Computer arithmetics Algorithms Applied sciences Computer science; control theory; systems Data processing. List processing. Character string processing Exact sciences and technology Memory organisation. Data processing Miscellaneous Software Theoretical computing Lower bound Ideal Probability Entropy Space time Probability distribution Algorithm Input hash function Hashing randomized algorithms; 68P5 Random function Data structure uniform hashing 68P10 68P20
ISSN:	0097-5397, 1095-7111

Description
Abstract:	Many algorithms and data structures employing hashing have been analyzed under the uniform hashing assumption, i.e., the assumption that hash functions behave like truly random functions. Starting with the discovery of universal hash functions, many researchers have studied to what extent this theoretical ideal can be realized by hash functions that do not take up too much space and can be evaluated quickly. In this paper we present an almost ideal solution to this problem: a hash function $h: U\rightarrow V$ that, on any set of $n$ inputs, behaves like a truly random function with high probability, can be evaluated in constant time on a RAM, and can be stored in $(1+\epsilon)n\log |V| + O(n+\log\log |U|)$ bits. Here $\epsilon$ can be chosen to be any positive constant, so this essentially matches the entropy lower bound. For many hashing schemes this is the first hash function that makes their uniform hashing analysis come true, with high probability, without incurring overhead in time or space.
DOI:	10.1137/060658400
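
The space bound quoted in the abstract can be made concrete with a little arithmetic. The sketch below is an illustration only, not code from the paper: it assumes logarithms are base 2 (so results are in bits), picks hypothetical values for $n$, $|V|$ and $\epsilon$, and computes only the leading term $(1+\epsilon)n\log |V|$, because the constant hidden in the $O(n+\log\log |U|)$ term is not given in this record. The entropy lower bound used for comparison is $n\log |V|$, the number of bits needed to describe an arbitrary function from $n$ fixed inputs to $V$, since there are $|V|^n$ such functions.

```python
import math


def entropy_lower_bound_bits(n: int, range_size: int) -> float:
    """Bits needed to specify one of the range_size**n functions on n fixed inputs."""
    return n * math.log2(range_size)


def leading_term_bits(n: int, range_size: int, eps: float) -> float:
    """Leading term (1 + eps) * n * log2|V| of the stated space bound.

    The additive O(n + log log |U|) term is deliberately left out,
    because its constant factor is not stated in the abstract.
    """
    return (1 + eps) * entropy_lower_bound_bits(n, range_size)


# Hypothetical parameters chosen only for illustration.
n = 1_000_000          # number of inputs
range_size = 2 ** 32   # |V|, size of the hash range
eps = 0.05             # any positive constant is allowed

lower = entropy_lower_bound_bits(n, range_size)
lead = leading_term_bits(n, range_size, eps)

print(f"entropy lower bound: {lower:.0f} bits")
print(f"leading term of the space bound: {lead:.0f} bits")
print(f"ratio: {lead / lower:.2f}")
```

With these assumed values the leading term exceeds the lower bound by the factor $1+\epsilon$, which is the sense in which the abstract says the bound "essentially matches" the entropy lower bound.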