Faculty of Science

Statistical Behavior of Fast Hashing of Variable-Length Text Strings

Savoy, Jacques

In: SIGIR Forum (Special Interest Group on Information Retrieval), 1990, vol. 24, no. 3, p. 62-71

    Summary
    In information retrieval, we often have to store a particular record in, and search for it within, a large amount of information. For example, during a document indexing process, or when a program is checking the spelling of a text, a dictionary has to be used efficiently. One solution to that problem is to use a hash table. However, although many algorithms are known for manipulating or accessing hash tables [Knuth 73], [Standish 80], [Wiederhold 87], the main problem is to define a "good" hash function for a variable-length string. To address that problem, our main goals are to present some concrete algorithms and to study their statistical behavior.
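
As a rough illustration of the kind of question raised in the summary, the sketch below hashes variable-length strings and measures how evenly the keys spread over the table. It is not taken from the paper: the function names `poly_hash` and `chi_square`, the multiplier 31, and the choice of a chi-square statistic are all assumptions made for this example.

```python
from collections import Counter


def poly_hash(key: str, table_size: int, multiplier: int = 31) -> int:
    """Polynomial hash of a variable-length string (illustrative, not the paper's)."""
    h = 0
    for ch in key:
        # Fold each character in, reducing modulo the table size at every step.
        h = (h * multiplier + ord(ch)) % table_size
    return h


def chi_square(keys, table_size: int, hash_fn=poly_hash) -> float:
    """Chi-square statistic of bucket counts against a uniform distribution."""
    counts = Counter(hash_fn(k, table_size) for k in keys)
    expected = len(keys) / table_size
    # Sum over every bucket, including empty ones.
    return sum(
        (counts.get(b, 0) - expected) ** 2 / expected
        for b in range(table_size)
    )
```

A value of `chi_square` close to the number of buckets minus one would be consistent with a roughly uniform spread of keys; a much larger value suggests clustering for that particular key set and table size.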